Practice Test 6 70 Questions Udemy


Question 1:

Which role in Snowflake allows a user to administer users and manage all database
objects?

ACCOUNTADMIN

(Correct)

SYSADMIN

SECURITYADMIN

ROOT
Explanation
The account administrator (ACCOUNTADMIN) role is the most powerful role in the
system. This role alone is responsible for configuring parameters at the account level.
Users with the ACCOUNTADMIN role can view and operate on all objects in the
account, can view and manage Snowflake billing and credit data, and can stop any
running SQL statements.

In the default access control hierarchy, both of the other administrator roles are
owned by this role:

The security administrator (SECURITYADMIN) role includes the privileges to create and manage users and roles.

The system administrator (SYSADMIN) role includes the privileges to create warehouses, databases, and all database objects (schemas, tables, etc.).

https://docs.snowflake.com/en/user-guide/security-access-control-considerations.html#using-the-accountadmin-role

Question 2:
Skipped
Which transformations are available when using the COPY INTO command to load
data files into Snowflake from a stage? (select all that apply)

Column data type conversion

(Correct)

Column concatenation

(Correct)

Filters

Aggregates
Explanation
Filtering the results of a FROM clause using a WHERE clause is not supported.

The VALIDATION_MODE parameter does not support COPY statements that transform data during a load.

https://docs.snowflake.com/en/user-guide/data-load-transform.html#supported-functions
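
For illustration, here is a minimal sketch of a COPY INTO load that reorders columns, casts a data type, and concatenates two columns from a staged CSV file; the table, stage, and file format names are assumptions, not part of the question:

create or replace table home_sales (location varchar, zip varchar, sale_date date, price number(10,2));

copy into home_sales (location, zip, sale_date, price)
from (select concat(t.$1, '-', t.$2), t.$3, t.$4::date, t.$5::number(10,2) from @my_csv_stage t)
file_format = (format_name = 'my_csv_format');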

Question 3:
Skipped
Snowflake offers tools to extract data from source systems

TRUE

FALSE
(Correct)

Question 4:
Skipped
Select the layers which are part of Snowflake (select 3)

STORAGE

(Correct)

DATA CATALOG

VIRTUAL WAREHOUSE

(Correct)

CLOUD SERVICES

(Correct)

Explanation
Snowflake’s novel design physically separates but logically integrates storage,
compute and services like security and metadata; we call it multi-cluster, shared data
and it consists of 3 components:

1. Storage: the persistent storage layer for data stored in Snowflake

2. Compute: a collection of independent compute resources that execute data processing tasks required for queries

3. Services: a collection of system services that handle infrastructure, security, metadata, and optimization across the entire Snowflake system

https://www.snowflake.com/wp-content/uploads/2014/10/A-Detailed-View-Inside-Snowflake.pdf

Question 5:
Skipped
When data is staged to a Snowflake internal staging area using the PUT command,
the data is encrypted on the client’s machine

TRUE

(Correct)

FALSE
Explanation
Uploaded files are automatically encrypted with 128-bit or 256-bit keys. The CLIENT_ENCRYPTION_KEY_SIZE parameter specifies the key size used to encrypt the files.

https://docs.snowflake.com/en/sql-reference/sql/put.html#usage-notes
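
As a quick illustration, a hypothetical PUT run from SnowSQL that uploads a local file to the table stage of a SALES table (the file path and table name are assumptions); the file is encrypted on the client machine before it is uploaded:

put file:///tmp/data/sales_2020.csv @%sales auto_compress=true;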

Question 6:
Skipped
Which Snowflake features are available for enabling continuous data pipelines?

Continuous data loading

(Correct)

Change data tracking

(Correct)

Recurring tasks

(Correct)


Table Pipes
Explanation
Snowflake provides the following features to enable continuous data pipelines:

Continuous data loading

Options for continuous data loading include the following:

Snowpipe

Snowflake Connector for Kafka

Third-party data integration tools

Change data tracking

A stream object records the delta of change data capture (CDC) information for a
table (such as a staging table), including inserts and other data manipulation
language (DML) changes. A stream allows querying and consuming a set of changes
to a table, at the row level, between two transactional points of time.

In a continuous data pipeline, table streams record when staging tables and any
downstream tables are populated with data from business applications using
continuous data loading and are ready for further processing using SQL statements.

For more information, see Change Tracking Using Table Streams.

Recurring tasks

A task object defines a recurring schedule for executing a SQL statement, including
statements that call stored procedures. Tasks can be chained together for successive
execution to support more complex periodic processing.

Tasks may optionally use table streams to provide a convenient way to continuously
process new or changed data. A task can transform new or changed rows that a
stream surfaces. Each time a task is scheduled to run, it can verify whether a stream
contains change data for a table (using SYSTEM$STREAM_HAS_DATA) and either
consume the change data or skip the current run if no change data exists.

Users can define a simple tree-like structure of tasks that executes consecutive SQL
statements to process data and move it to various destination tables.
https://docs.snowflake.com/en/user-guide/data-pipelines-intro.html#introduction-to-data-pipelines
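
To make the stream and task relationship concrete, here is a minimal hypothetical sketch (the table, stream, task, and warehouse names are all assumptions) of a task that runs only when its stream has new change data:

create or replace stream raw_orders_stream on table raw_orders;

create or replace task process_orders_task
warehouse = etl_wh
schedule = '5 minute'
when system$stream_has_data('RAW_ORDERS_STREAM')
as
insert into orders_final select order_id, amount from raw_orders_stream where metadata$action = 'INSERT';

Remember that a newly created task is suspended; it must be resumed with ALTER TASK process_orders_task RESUME; (see Question 35).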

Question 7:
Skipped
Select the statements which are true for an external table

External tables are read-only

(Correct)

External tables can be used for query and join operations

(Correct)

Views can be created against external tables

(Correct)

Data can be updated in external tables


Explanation
In a typical table, the data is stored in the database; however, in an external table, the
data is stored in files in an external stage. External tables store file-level metadata
about the data files, such as the filename, a version identifier and related properties.
This enables querying data stored in files in an external stage as if it were inside a
database. External tables can access data stored in any format supported by COPY
INTO <table> statements.

External tables are read-only; therefore, no DML operations can be performed on them. However, external tables can be used for query and join operations. Views can be created against external tables.

Querying data stored external to the database is likely to be slower than querying native database tables; however, materialized views based on external tables can improve query performance.

https://docs.snowflake.com/en/user-guide/tables-external-intro.html
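
A minimal hypothetical example (the stage, path, and column expressions are assumptions) of creating and then querying an external table over staged Parquet files:

create or replace external table sales_ext (
  sale_date date as (value:sale_date::date),
  amount number(10,2) as (value:amount::number(10,2))
)
location = @my_s3_stage/sales/
file_format = (type = parquet);

select sale_date, sum(amount) from sales_ext group by sale_date;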

Question 8:
Skipped
What are the two data loading approaches in Snowflake?

BULK LOADING

(Correct)

CONTINUOUS LOADING

(Correct)

INGEST LOADING
Explanation
BULK LOADING and CONTINUOUS LOADING are the two approaches. For bulk loading you can use the COPY command and select a relevant warehouse to perform the COPY; in this case, compute is user-managed.

For continuous loading you can use Snowpipe, which is a serverless way of loading data in micro-batches. Compute is managed by Snowflake.

Question 9:
Skipped
Which of the below are considered best practices while loading data into Snowflake?

Isolate data loading workload into its own virtual warehouse

(Correct)

Split large files into smaller files

(Correct)

Compress the source files

(Correct)

If format is in CSV, convert them into ORC


Explanation
Having a dedicated warehouse for the load workload ensures that it will not be interrupted by any other workloads. Splitting large files is a good practice because it enables Snowflake to parallelize the load operation. Since the files are sent over the wire, it is always better to compress them first.
Question 10:
Skipped
Load performance in Snowflake is fastest for which file format?

CSV

(Correct)

ORC

AVRO

PARQUET
Question 11:
Skipped
COPY and INSERT operations in Snowflake are non-blocking

TRUE

(Correct)

FALSE
Explanation
COPY and INSERT do not block any other operations on the table
Question 12:
Skipped
Organizing input data by granular path can improve load performance

TRUE

(Correct)

FALSE
Question 13:
Skipped
Which are the key concepts that need to be considered while loading data into Snowflake?

STAGE OBJECT

(Correct)

FILE FORMAT

(Correct)

TRANSFORMATION AND ERROR VALIDATION

(Correct)

FILE SIZE
Explanation
Copying the file to a stage object is recommended while loading data into Snowflake. A file format is used to identify the data format (CSV, JSON, etc.) of the source file. Minor transformations and validations can be done as part of loading the data.
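
Putting these concepts together, a hypothetical end-to-end bulk load sketch (the file format, stage, table, and file names are all assumptions):

create or replace file format my_csv_format type = csv skip_header = 1;

create or replace stage my_load_stage file_format = (format_name = 'my_csv_format');

put file:///tmp/data/orders_*.csv @my_load_stage;

copy into orders from @my_load_stage on_error = 'skip_file';
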
Question 14:
Skipped
Which approach would result in improved performance through linear scaling of data
ingestion workload?

Resize virtual warehouse

Consider practice of organizing data by granular path

Consider practice of splitting input file batch within the recommended size of 10 MB to 100 MB

All of the above

(Correct)

Question 15:
Skipped
Which are the two VARIANT columns available in a Snowflake table loaded by the Kafka connector?

RECORD_CONTENT

(Correct)

RECORD_METADATA

(Correct)

RECORD_KEY
Explanation
Every Snowflake table loaded by the Kafka connector has a schema consisting of two
VARIANT columns:

RECORD_CONTENT. This contains the Kafka message.

RECORD_METADATA. This contains metadata about the message, for example, the
topic from which the message was read.

If Snowflake creates the table, then the table contains only these two columns. If the
user creates the table for the Kafka Connector to add rows to, then the table can
contain more than these two columns (any additional columns must allow NULL
values because data from the connector does not include values for those columns).

https://docs.snowflake.com/en/user-guide/kafka-connector-overview.html#schema-of-topics-for-kafka-topics

Question 16:
Skipped
The RECORD_METADATA column contains which information?

Topic

(Correct)

Partition

(Correct)

Key

(Correct)


CreateTime / LogAppendTime

(Correct)

Value
Explanation
The RECORD_METADATA column contains the following information by default:

Topic

Partition

Offset

CreateTime / LogAppendTime

key

schema_id

headers

The value of the message is in RECORD_CONTENT

https://docs.snowflake.com/en/user-guide/kafka-connector-overview.html#schema-of-topics-for-kafka-topics
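
A hypothetical query over a Kafka-loaded table that pulls fields out of both VARIANT columns with path notation; the metadata field names follow the documentation, while the table name and the customer_id payload field are assumptions:

select
  record_metadata:topic::string as topic,
  record_metadata:partition::int as partition,
  record_metadata:CreateTime::number as create_time,
  record_content:customer_id::string as customer_id
from kafka_orders_table;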

Question 17:
Skipped
If multiple instances of the Kafka connector are started on the same topic or partitions, duplicate records may flow into the Snowflake table

TRUE

(Correct)

FALSE
Explanation
Instances of the Kafka connector do not communicate with each other. If you start
multiple instances of the connector on the same topics or partitions, then multiple
copies of the same row might be inserted into the table. This is not recommended;
each topic should be processed by only one instance of the connector.
Question 18:
Skipped
Select the ones that are true for the Snowflake Kafka connector

Kafka connector guarantees exactly-once delivery

(Correct)

Kafka connector guarantees that rows are inserted in the order

Kafka connector guarantees reprocessing of messages


Explanation
Although the Kafka connector guarantees exactly-once delivery, it
does not guarantee that rows are inserted in the order that they were originally
published.
Question 19:
Skipped
Which are true with respect to SMTs (Single Message Transformations) when neither key.converter nor value.converter is set?

All SMTs are supported

Most SMTs are supported

(Correct)

regex.router is not supported


(Correct)

Explanation
Single Message Transformations (SMTs) are applied to messages as they flow
through Kafka Connect. When you configure the Kafka Configuration Properties, if
you set either key.converter or value.converter to one of the following values,
then SMTs are not supported on the corresponding key or value:

com.snowflake.kafka.connector.records.SnowflakeJsonConverter

com.snowflake.kafka.connector.records.SnowflakeAvroConverter

com.snowflake.kafka.connector.records.SnowflakeAvroConverterWithoutSchemaRegistry

When neither key.converter nor value.converter is set, then most SMTs are supported, with the current exception of regex.router.

https://docs.snowflake.com/en/user-guide/kafka-connector-overview.html#kafka-connector-limitations

Question 20:
Skipped
Which are the supported values for a JSON name/value pair?

A number (integer or floating point)

(Correct)

A string (in double quotes)

(Correct)

A Boolean (true or false)

(Correct)

An array (in square brackets)

(Correct)

An object (in curly braces)

(Correct)

Null

(Correct)

complex datatype
Explanation
A value in a name/value pair can be:

1. A number (integer or floating point)

2. A string (in double quotes)

3. A Boolean (true or false)

4. An array (in square brackets)

5. An object (in curly braces)

6. Null

https://docs.snowflake.com/en/user-guide/semistructured-data-formats.html#supported-data-types
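
A single hypothetical PARSE_JSON call that shows every supported value type (the field names are made up for illustration):

select parse_json('{"name": "banana", "price": 1.25, "organic": true, "tags": ["fruit", "yellow"], "supplier": {"id": 42}, "discontinued": null}') as v;

Here name is a string, price a number, organic a Boolean, tags an array, supplier an object, and discontinued a null.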

Question 21:
Skipped
Which of the below are binary formats?


AVRO

(Correct)

JSON

ORC

(Correct)

PARQUET

(Correct)

Explanation
What is ORC?

Used to store Hive data, the ORC (Optimized Row Columnar) file format was
designed for efficient compression and improved performance for reading, writing,
and processing data over earlier Hive file formats. For more information about ORC,
see https://orc.apache.org/.

Snowflake reads ORC data into a single VARIANT column. You can query the data in
a VARIANT column just as you would JSON data, using similar commands and
functions.

Alternatively, you can extract select columns from a staged ORC file into separate
table columns using a CREATE TABLE AS SELECT statement.

ORC is a binary format.

What is parquet?

Parquet is a compressed, efficient columnar data representation designed for projects in the Hadoop ecosystem. The file format supports complex nested data structures and uses Dremel record shredding and assembly algorithms. For more information, see parquet.apache.org/documentation/latest/.
Snowflake reads Parquet data into a single VARIANT column. You can query the data
in a VARIANT column just as you would JSON data, using similar commands and
functions.

Alternatively, you can extract select columns from a staged Parquet file into separate
table columns using a CREATE TABLE AS SELECT statement.

Parquet is a binary format.

What is AVRO?

Avro is an open-source data serialization and RPC framework originally developed for use with Apache Hadoop. It utilizes schemas defined in JSON to produce serialized data in a compact binary format. The serialized data can be sent to any destination (i.e. application or program) and can be easily deserialized at the destination because the schema is included in the data.

An Avro schema consists of a JSON string, object, or array that defines the type of
schema and the data attributes (field names, data types, etc.) for the schema type.
The attributes differ depending on the schema type. Complex data types such as
arrays and maps are supported.

Snowflake reads Avro data into a single VARIANT column. You can query the data in
a VARIANT column just as you would JSON data, using similar commands and
functions.

https://docs.snowflake.com/en/user-guide/semistructured-intro.html#what-is-parquet

https://docs.snowflake.com/en/user-guide/semistructured-intro.html#what-is-orc

https://docs.snowflake.com/en/user-guide/semistructured-intro.html#what-is-avro
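
To illustrate the single-VARIANT pattern described above, a hypothetical load and query of staged Parquet files (the stage, path, and field names are assumptions):

create or replace table raw_events (v variant);

copy into raw_events
from @my_parquet_stage/events/
file_format = (type = parquet);

select v:event_id::string, v:payload.amount::number(10,2) from raw_events;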

Question 22:
Skipped
Which constraints are enforced in Snowflake?

Referential integrity constraints

NOT NULL constraint


(Correct)

UNIQUE Constraint
Explanation
Referential integrity constraints in Snowflake are informational and, with the
exception of NOT NULL, not enforced. Constraints other than NOT NULL are created
as disabled.

However, constraints provide valuable metadata. The primary keys and foreign keys
enable members of your project team to orient themselves to the schema design and
familiarize themselves with how the tables relate with one another.

https://docs.snowflake.com/en/user-guide/table-considerations.html#referential-integrity-constraints

Question 23:
Skipped
You want to get the DDL statement of a Snowflake table. What is the command that you will use?


1. select get_ddl('table', 'mydb.public.salesorders');

(Correct)


1. show table 'mydb.public.salesorders';


1. show table like 'mydb.public.salesorders';
Explanation
Query the GET_DDL function to retrieve a DDL statement that could be executed to
recreate the specified table. The statement includes the constraints currently set on a
table.
Question 24:
Skipped
You have a small table in Snowflake which has only 10,000 rows. Specifying a clustering key will further improve the queries that run on this table

FALSE

(Correct)

TRUE
Explanation
Specifying a clustering key is not necessary for most tables. Snowflake performs
automatic tuning via the optimization engine and micro-partitioning. In many cases,
data is loaded and organized into micro-partitions by date or timestamp, and is
queried along the same dimension.

When should you specify a clustering key for a table? First, note that clustering a
small table typically doesn’t improve query performance significantly.

For larger data sets, you might consider specifying a clustering key for a table when:

The order in which the data is loaded does not match the dimension by which it is
most commonly queried (e.g. the data is loaded by date, but reports filter the data
by ID). If your existing scripts or reports query the data by both date and ID (and
potentially a third or fourth column), you may see some performance improvement
by creating a multi-column clustering key.

Query Profile indicates that a significant percentage of the total duration time for
typical queries against the table is spent scanning. This applies to queries that filter
on one or more specific columns.

Note that reclustering rewrites existing data with a different order. The previous
ordering is stored for 7 days to provide Fail-safe protection. Reclustering a table
incurs compute costs that correlate to the size of the data that is reordered.

https://docs.snowflake.com/en/user-guide/table-considerations.html#when-to-set-a-clustering-key

Question 25:
Skipped
There is no query performance difference between a column with a maximum length declaration (e.g. VARCHAR(16777216)) and one with a smaller precision. Still, it is recommended to define an appropriate column length because of the below reasons

Data loading operations are more likely to detect issues such as columns
loaded out of order, e.g. a 50-character string loaded erroneously into a
VARCHAR(10) column. Such issues produce errors
(Correct)

When the column length is unspecified, some third-party tools may anticipate consuming the maximum size value, which can translate into increased client-side memory usage or unusual behavior

(Correct)

Data unloading will be performant if appropriate column lengths are defined
Explanation
Snowflake compresses column data effectively; therefore, creating columns larger
than necessary has minimal impact on the size of data tables. Likewise, there is no
query performance difference between a column with a maximum length declaration
(e.g. VARCHAR(16777216) ), and a smaller precision.

However, when the size of your column data is predictable, we do recommend defining an appropriate column length, for the following reasons:

Data loading operations are more likely to detect issues such as columns loaded out
of order, e.g. a 50-character string loaded erroneously into a VARCHAR(10) column.
Such issues produce errors.

When the column length is unspecified, some third-party tools may anticipate
consuming the maximum size value, which can translate into increased client-side
memory usage or unusual behavior.

https://docs.snowflake.com/en/user-guide/table-considerations.html#when-to-specify-column-lengths

Question 26:
Skipped
You want to convert an existing permanent table to a transient table (or vice versa)
while preserving data and other characteristics such as column defaults and granted
privileges. What is the best way to do it?

Run an ALTER TABLE command to convert the tables


Unload the data from the existing table into a CSV file. Create the new
table and then load the data back in

Create a new table and use the COPY GRANTS clause

(Correct)

Explanation
Currently, it isn’t possible to change a permanent table to a transient table using
the ALTER TABLE command. The TRANSIENT property is set at table creation and
cannot be modified.

Similarly, it isn’t possible to directly change a transient table to a permanent table.

To convert an existing permanent table to a transient table (or vice versa) while
preserving data and other characteristics such as column defaults and granted
privileges, you can create a new table and use the COPY GRANTS clause, then copy the
data:

1. create transient table my_new_table like my_old_table copy grants;


2. insert into my_new_table select * from my_old_table;

https://docs.snowflake.com/en/user-guide/table-considerations.html#converting-a-permanent-table-to-a-transient-table-or-vice-versa

Question 27:
Skipped
A transient table can be cloned to a permanent table

TRUE

FALSE

(Correct)

Explanation
You can’t clone a transient table to a permanent table.
Question 28:
Skipped
If you clone a permanent table (bar) into a transient table (foo) using the below command
1. create transient table foo clone bar copy grants;
What will happen to the partitions?

Old partitions will not be affected, but new partitions added to the clone will follow the transient lifecycle

(Correct)

All the partitions will be affected

Only the old partitions will be affected


Explanation
Another way to make a copy of a table (but change the lifecycle from permanent to
transient) is to CLONE the table, for example:
1. create transient table foo clone bar copy grants;
Old partitions will not be affected (i.e. won’t become transient), but new partitions
added to the clone will follow the transient lifecycle.
Question 29:
Skipped
You want to identify the potential performance bottlenecks and improvement
opportunities of a query. What will you do?

Use Query Profile

(Correct)

Use Explain plan


Call snowflake support
Explanation
Query Profile is a powerful tool for understanding the mechanics of queries. It can be
used whenever you want or need to know more about the performance or behavior
of a particular query. It is designed to help you spot typical mistakes in SQL query
expressions to identify potential performance bottlenecks and improvement
opportunities.

https://docs.snowflake.com/en/user-guide/ui-query-profile.html#when-to-use-query-profile

Question 30:
Skipped
What does Snowflake use for monitoring network traffic and user activity?

Lacework

(Correct)

Sumo logic

Threat Stack
Explanation
Snowflake uses Lacework for behavioral monitoring of production infrastructure, which includes network traffic and user activity. It uses Sumo Logic and Threat Stack to monitor failed logins, file integrity, and unauthorized system modifications.
Question 31:
Skipped
In which scenarios would you consider using materialized views?

The query results contain a small number of rows and/or columns relative to the base table

(Correct)

Query results contain results that require significant processing

(Correct)

Query is on an external table

(Correct)

View's base table does not change frequently

(Correct)

None of the above


Explanation
Materialized views are particularly useful when:

1. Query results contain a small number of rows and/or columns relative to the base table (the table on which the view is defined).

2. Query results contain results that require significant processing, including analysis of semi-structured data and aggregates that take a long time to calculate.

3. The query is on an external table (i.e. data sets stored in files in an external stage), which might have slower performance compared to querying native database tables.

4. The view's base table does not change frequently.

https://docs.snowflake.com/en/user-guide/views-materialized.html#when-to-use-materialized-views

Question 32:
Skipped
What will the below query return

SELECT TOP 10 GRADES FROM STUDENT;

The top 10 highest grades

The 10 lowest grades

Non-deterministic list of 10 grades

(Correct)

Explanation
An ORDER BY clause is not required; however, without an ORDER BY clause,
the results are non-deterministic because results within a result set are not
necessarily in any particular order. To control the results returned, use an ORDER
BY clause.

n must be a non-negative integer constant.

https://docs.snowflake.com/en/sql-reference/constructs/top_n.html#usage-notes
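
For a deterministic result, add an ORDER BY; for example, to get the ten highest grades from the table in the question:

SELECT TOP 10 GRADES FROM STUDENT ORDER BY GRADES DESC;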

Question 33:
Skipped
Loading data using the Snowpipe REST API is supported for external stages only

TRUE

FALSE

(Correct)

Explanation
Snowpipe supports loading from the following stage types:
1. Named internal (Snowflake) or external (Amazon S3, Google Cloud Storage, or
Microsoft Azure) stages

2. Table stages

https://docs.snowflake.com/en/user-guide/data-load-snowpipe-rest-gs.html#step-1-create-a-stage-if-needed

Question 34:
Skipped
With default settings, how long will a query run on Snowflake?

Snowflake will cancel the query if it runs more than 48 hours

(Correct)

Snowflake will cancel the query if it runs more than 24 hours

Snowflake will cancel the query if the warehouse runs out of memory

Snowflake will cancel the query if the warehouse runs out of memory and
hard disk storage
Explanation
STATEMENT_TIMEOUT_IN_SECONDS

This parameter tells Snowflake how long a SQL statement can run before the system cancels it. The default value is 172800 seconds (48 hours).

This is both a session and object type parameter. As a session type, it can be applied
to the account, a user or a session. As an object type, it can be applied to
warehouses. If set at both levels, the lowest value is used.
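
A hypothetical sketch of lowering the timeout at the warehouse level and at the session level (the warehouse name and values are illustrative):

alter warehouse etl_wh set statement_timeout_in_seconds = 7200;

alter session set statement_timeout_in_seconds = 3600;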

Question 35:
Skipped
You have created a TASK in Snowflake. How will you resume it?

No need to resume, the creation operation automatically enables the task

ALTER TASK mytask1 RESUME;

(Correct)

ALTER TASK mytask1 START;


Explanation
It is important to remember that a Task that has just been created will be suspended
by default. It is necessary to manually enable this task by “altering” the task as
follows:

ALTER TASK mytask1 RESUME;

Question 36:
Skipped
What technique does Snowflake use to limit the number of micro-partitions scanned by each query?

Indexing

Pruning

(Correct)

Map Reduce

B-tree
Explanation
The micro-partition metadata maintained by Snowflake enables precise pruning of columns
in micro-partitions at query run-time, including columns containing semi-structured data. In
other words, a query that specifies a filter predicate on a range of values that accesses 10% of
the values in the range should ideally only scan 10% of the micro-partitions.

For example, assume a large table contains one year of historical data with date and hour
columns. Assuming uniform distribution of the data, a query targeting a particular hour
would ideally scan 1/8760th of the micro-partitions in the table and then only scan the portion
of the micro-partitions that contain the data for the hour column; Snowflake uses columnar
scanning of partitions so that an entire partition is not scanned if a query only filters by one
column.

In other words, the closer the ratio of scanned micro-partitions and columnar data is to the
ratio of actual data selected, the more efficient is the pruning performed on the table.

For time-series data, this level of pruning enables potentially sub-second response times for
queries within ranges (i.e. “slices”) as fine-grained as one hour or even less.

Not all predicate expressions can be used to prune. For example, Snowflake does not prune
micro-partitions based on a predicate with a subquery, even if the subquery results in a
constant.

https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions.html#query-pruning

Question 37:
Skipped
Which two statements are true about the VARIANT data type in Snowflake?

Optimized storage based on repeated elements

(Correct)

Stored in a separate file format from structured data

Can be queried using json path notation

(Correct)


Requires a custom mapping for each record type
Explanation
When Snowflake loads semi-structured data, it optimizes how it stores that data internally by
automatically discovering the attributes and structure that exist in the data, and using that
knowledge to optimize how the data is stored. Snowflake also looks for repeated attributes
across records, organizing and storing those repeated attributes separately. This enables better
compression and faster access, similar to the way that a columnar database optimizes storage
of columns of data.
Question 38:
Skipped
What is the recommended approach for making a VARIANT column accessible in a BI tool?

A pre-defined mapping

A view

(Correct)

Leveraging a json parser

BI tool cannot access json


Question 39:
Skipped
You can map Snowflake to any S3 bucket and query the data directly as long as the data is in Parquet or ORC format

TRUE

FALSE

(Correct)

Question 40:
Skipped
The following factors affect data load rates

Physical location of the stage

(Correct)

Virtual warehouse RAM

Gzip compression efficiency

(Correct)

Thread size
Question 41:
Skipped
What are the two mechanisms for detecting when a new staged file is available to Snowpipe?

Automating Snowpipe using cloud messaging

(Correct)

Calling Snowpipe REST endpoints

(Correct)

Calling the custom APIs exposed through AWS EKS


Explanation
https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html#how-does-snowpipe-work

Snowpipe loads data from files as soon as they are available in a stage. The data is loaded according to the COPY statement defined in a referenced pipe.
A pipe is a named, first-class Snowflake object that contains a COPY statement used by
Snowpipe. The COPY statement identifies the source location of the data files (i.e., a stage)
and a target table. All data types are supported, including semi-structured data types such as
JSON and Avro.

Different mechanisms for detecting the staged files are available:

Automating Snowpipe using cloud messaging

Automated data loads leverage event notifications for cloud storage to inform Snowpipe of
the arrival of new data files to load. Snowpipe copies the files into a queue, from which they
are loaded into the target table in a continuous, serverless fashion based on parameters
defined in a specified pipe object.

Snowflake currently supports the following storage account types:

Amazon Web Services (AWS): Amazon S3

Microsoft Azure: Blob storage, Data Lake Storage Gen2 (supported as a preview feature), and General-purpose v2

For more information, see Automating Continuous Data Loading Using Cloud Messaging.

Calling Snowpipe REST endpoints

Your client application calls a public REST endpoint with the name of a pipe object and a list
of data filenames. If new data files matching the list are discovered in the stage referenced by
the pipe object, they are queued for loading. Snowflake-provided compute resources load
data from the queue into a Snowflake table based on parameters defined in the pipe.

Amazon Web Services (AWS): Amazon S3

Google Cloud Platform: Cloud Storage

Microsoft Azure: Blob storage, Data Lake Storage Gen2 (supported as a preview feature), and General-purpose v2

For more information, see Calling Snowpipe REST Endpoints to Load Data.

Question 42:
Skipped
When you load data using Snowpipe, loads are always performed in a single transaction

FALSE

(Correct)

TRUE
Explanation
It is important to know the differences between the two techniques, so do not stop at this transaction-related question; please also go through the other differences here:

https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html#how-is-snowpipe-different-from-bulk-data-loading

Transactions:

Bulk data load: Loads are always performed in a single transaction. Data is inserted into the table alongside any other SQL statements submitted manually by users.

Snowpipe: Loads are combined or split into a single or multiple transactions based on the number and size of the rows in each data file. Rows of partially loaded files (based on the ON_ERROR copy option setting) can also be combined or split into one or more transactions.

Question 43:
Skipped
Snowpipe does not guarantee loading of files in the order that they are staged

FALSE

TRUE

(Correct)

Explanation
Very important to remember this

Load Order of Data Files

For each pipe object, Snowflake establishes a single queue to sequence data files awaiting
loading. As new data files are discovered in a stage, Snowpipe appends them to the queue.
However, multiple processes pull files from the queue; and so, while Snowpipe generally
loads older files first, there is no guarantee that files are loaded in the same order they are
staged.

Question 44:
Skipped
What command will you run to pause a pipe?


1. alter pipe <pipe name> set pipe_execution_paused = true;

(Correct)


1. alter pipe <pipe name> set pipe_execution_paused = stop;


1. alter pipe <pipe name> set pipe_execution_paused = halt;
Explanation
Pause the mypipe pipe:
1. alter pipe mypipe set pipe_execution_paused = true;
https://docs.snowflake.com/en/sql-reference/sql/alter-pipe.html#examples
Question 45:
Skipped
All of the below are valid executionState values of a Snowpipe pipe except:

RUNNING

STOPPED_FEATURE_DISABLED

STALLED_EXECUTION_ERROR

PAUSED

STOPPED

(Correct)

Explanation
executionState

Current execution state of the pipe; could be any one of the following:

RUNNING (i.e. everything is normal; Snowflake may or may not be actively processing files
for this pipe)

STOPPED_FEATURE_DISABLED

STOPPED_STAGE_DROPPED

STOPPED_FILE_FORMAT_DROPPED

STOPPED_MISSING_PIPE

STOPPED_MISSING_TABLE

STALLED_COMPILATION_ERROR

STALLED_INITIALIZATION_ERROR

STALLED_EXECUTION_ERROR

STALLED_INTERNAL_ERROR

PAUSED

PAUSED_BY_SNOWFLAKE_ADMIN

PAUSED_BY_ACCOUNT_ADMIN

Question 46:
Skipped
How do you set a return value in a task?


create task set_return_value
warehouse=return_task_wh
schedule='1 minute' as
call system$set_return_value('The quick brown fox jumps over the lazy dog');

(Correct)


create task set_return_value
warehouse=return_task_wh
schedule='1 minute' as
call system$set_return_code('The quick brown fox jumps over the lazy dog');


create task set_return_value
warehouse=return_task_wh
schedule='1 minute' as
call set_return_value('The quick brown fox jumps over the lazy dog');
Explanation
SYSTEM$SET_RETURN_VALUE

Explicitly sets the return value for a task.

In a tree of tasks, a task can call this function to set a return value. Another task that identifies
this task as the predecessor task (using the AFTER keyword in the task definition) can retrieve
the return value set by the predecessor task.

https://docs.snowflake.com/en/sql-reference/functions/system_set_return_value.html#examples
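
The value can then be read by a child task with SYSTEM$GET_PREDECESSOR_RETURN_VALUE; a hypothetical sketch (the target table return_values and the warehouse name are assumptions):

create task get_return_value
warehouse=return_task_wh
after set_return_value
as
insert into return_values values (system$get_predecessor_return_value());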

Question 47:
Skipped
Query load is calculated by dividing the execution time (in seconds) of all queries in an
interval by the total time (in seconds) for the interval.

TRUE

(Correct)


FALSE
Explanation
https://docs.snowflake.com/en/user-guide/warehouses-load-monitoring.html#how-query-load-is-calculated
Question 48:
Skipped
Resource monitors can be used to control credit usage for the Snowflake-provided warehouses, including the Snowpipe warehouse

TRUE

FALSE

(Correct)

Explanation
Resource monitors provide control over virtual warehouse credit usage; however, you
cannot use them to control credit usage for the Snowflake-provided warehouses, including
the SNOWPIPE warehouse.
Question 49:
Skipped
Which Snowpipe REST API fetches a report about ingested files whose contents have been added to the table?

loadHistoryScan and insertReport

(Correct)

insertPipeReport

insertFiles
Explanation
Endpoint: loadHistoryScan

Fetches a report about ingested files whose contents have been added to table. Note that for
large files, this may only be part of the file. This endpoint differs from insertReport in that
it views the history between two points in time. There is a maximum of 10,000 items
returned, but multiple calls can be issued to cover the desired time range.

Additional explanation

Please note there was a mistake in this question earlier. The earlier question's answer selected
only loadHistoryScan but for this question both loadHistoryScan and insertReport are
correct. Please see the explanation above to see the difference
between loadHistoryScan and insertReport

Thanks to Rupa who caught this mistake

Question 50:
Skipped
To help avoid exceeding the rate limit (error code 429), Snowflake recommends relying more heavily on insertReport than loadHistoryScan

TRUE

(Correct)

FALSE
Explanation
loadHistoryScan endpoint is rate limited to avoid excessive calls. To help avoid exceeding
the rate limit (error code 429), we recommend relying more heavily
on insertReport than loadHistoryScan . When calling loadHistoryScan , specify the
most narrow time range that includes a set of data loads. For example, reading the last 10
minutes of history every 8 minutes would work well. Trying to read the last 24 hours of
history every minute will result in 429 errors indicating a rate limit has been reached. The
rate limits are designed to allow each history record to be read a handful of times.

https://docs.snowflake.com/en/user-guide/data-load-snowpipe-rest-apis.html#endpoint-loadhistoryscan

Question 51:
Skipped
When working with a cloned table, you can use the below SQL statements


SELECT

DROP

SHOW

ALL OF THE ABOVE

(Correct)

Explanation
A cloned table is just another table; the only difference is that it initially shares micro-partitions with the table it has been cloned from.
Question 52:
Skipped
Select the options that differentiate a Partner Connect partner from a regular partner

Connect with snowflake through a wizard

(Correct)

Includes a partner trial account signup

(Correct)

Can be connected from the WEB UI

(Correct)

Includes automated role, user and staging database setup


(Correct)

None of the above


Explanation
Partner Connect lets you easily create trial accounts with selected Snowflake business
partners and integrate these accounts with Snowflake. This feature provides a convenient
option for trying additional tools and services, and then adopting the ones that best meet your
business needs.

https://docs.snowflake.com/en/user-guide/ecosystem-partner-connect.html#snowflake-partner-connect

Question 53:
Skipped
To improve query performance, which of the below techniques can be used in Snowflake?

Indexes

Distribution keys

Query hints

Cluster keys/Reclustering

(Correct)

Question 54:
Skipped
Which of the following best describes Snowflake's processing engine

EMR(Elastic Map Reduce)


Spark Engine

Presto

Native SQL Database engine

(Correct)

Explanation
Snowflake’s data warehouse is not built on an existing database or “big data” software
platform such as Hadoop. The Snowflake data warehouse uses a new SQL database engine
with a unique architecture designed for the cloud. To the user, Snowflake has many
similarities to other enterprise data warehouses, but also has additional functionality and
unique capabilities

https://docs.snowflake.com/en/user-guide/intro-key-concepts.html#key-concepts-architecture

Question 55:
Skipped
Which of the below are automatically provided by Snowflake compared to other databases?

Installation and Hardware Configurations

(Correct)

Patch releases

(Correct)

Physical Security

(Correct)

Metadata and Collection statistics documentation


Question 56:
Skipped
A client has ODBC or JDBC available in their system but does not have the Snowflake drivers. Will the client still be able to connect to Snowflake?

TRUE

FALSE

(Correct)

Explanation
You will need the Snowflake ODBC or JDBC driver to connect to Snowflake; just having JDBC and ODBC will not solve the problem.

JDBC

https://docs.snowflake.com/en/user-guide/jdbc.html#jdbc-driver

Snowflake provides a JDBC type 4 driver that supports core JDBC functionality. The JDBC
driver must be installed in a 64-bit environment and requires Java 1.8 (or higher).

ODBC

https://docs.snowflake.com/en/user-guide/odbc.html#odbc-driver

Snowflake provides a driver for connecting to Snowflake using ODBC-based client applications.

The ODBC driver has different prerequisites depending on the platform where it is installed.
For details, see the individual installation and configuration instructions for each platform.

In addition, different versions of the ODBC driver support the GET and PUT commands,
depending on the cloud service that hosts your Snowflake account:

Amazon Web Services: Version 2.17.5 (and higher)

Google Cloud Platform: Version 2.21.5 (and higher)

Microsoft Azure: Version 2.20.2 (and higher)

Question 57:
Skipped
Auto clustering can be switched off at an account level

FALSE

(Correct)

TRUE
Explanation
Auto clustering cannot be switched off at the database or account level; it needs to be done at the table level.
Question 58:
Skipped
A user can be defaulted to a role which the user does not have access to

TRUE

FALSE

(Correct)

Explanation
You will be able to create a user with a default role to which the user does not have access. However, the user will not be able to log on to Snowflake if he/she does not have access to the default role. Hence a user cannot be defaulted to a role which he/she does not have access to.
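
A hypothetical sketch of setting a default role and granting it so the user can actually assume it at login (the user and role names are assumptions):

create user analyst_user default_role = analyst_role;

grant role analyst_role to user analyst_user;
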
Question 59:
Skipped
With respect to Snowflake UI, which of the following is true?

A single session can be shared between multiple worksheets

Every worksheet can have a different role, warehouse and a database

(Correct)


Worksheets cannot have different role, warehouse and database

Every worksheet has its own session

(Correct)

Question 60:
Skipped
When a network policy includes values in both the allowed and blocked IP address lists, Snowflake applies the blocked IP address list first.

TRUE

(Correct)

FALSE
Explanation
When a network policy includes values in both the allowed and blocked IP address lists,
Snowflake applies the blocked IP address list first.

Do not add 0.0.0.0/0 to the blocked IP address list. 0.0.0.0/0 is interpreted to be “all
IPv4 addresses on the local machine”. Because Snowflake resolves this list first, this would
block your own access. Also, note that it is not necessary to include this IP address in the
allowed IP address list.

https://docs.snowflake.com/en/user-guide/network-policies.html#managing-account-level-network-policies
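
A hypothetical network policy sketch (the policy name and IP ranges are illustrative); the single blocked address wins over the broader allowed range:

create network policy corp_policy
allowed_ip_list = ('192.168.1.0/24')
blocked_ip_list = ('192.168.1.99');

alter account set network_policy = corp_policy;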

Question 61:
Skipped
How are virtual warehouse credits charged?

per minute

per second

per-second, with a 60-second (i.e. 1-minute) minimum

(Correct)

per hour
Question 62:
Skipped
The Tri-Secret Secure option is available in which Snowflake edition?

Business critical or higher

(Correct)

Enterprise Edition

All editions
Question 63:
Skipped
A warehouse can be assigned to a single resource monitor only

TRUE

(Correct)

FALSE
Explanation
Assignment of Resource Monitors

A single monitor can be set at the account level to control credit usage for all warehouses in
your account.
In addition, a monitor can be assigned to one or more warehouses, thereby controlling the
credit usage for each assigned warehouse. Note, however, that a warehouse can be assigned
to only a single resource monitor.

https://docs.snowflake.com/en/user-guide/resource-monitors.html#assignment-of-resource-monitors
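
For illustration, a hypothetical monitor assigned to one warehouse (the monitor name, warehouse name, and quota are assumptions):

create resource monitor monthly_limit with credit_quota = 100
triggers on 90 percent do suspend
on 100 percent do suspend_immediate;

alter warehouse etl_wh set resource_monitor = monthly_limit;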

Question 64:
Skipped
Select the two true statements related to streams

Stream itself does not contain any table data

(Correct)

A stream only stores the offset for the source table

(Correct)

The hidden columns used by a stream does not consume any storage
Explanation
Note that a stream itself does not contain any table data. A stream only stores the offset for
the source table and returns CDC records by leveraging the versioning history for the source
table. When the first stream for a table is created, a pair of hidden columns are added to the
source table and begin storing change tracking metadata. These columns consume a small amount of storage. The CDC records returned when querying a stream rely on a
combination of the offset stored in the stream and the change tracking metadata stored in the
table.
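
A hypothetical sketch of creating and reading a stream (the table name is an assumption):

create or replace stream orders_stream on table orders;

select * from orders_stream where metadata$action = 'INSERT';

The result includes the table columns plus the METADATA$ACTION, METADATA$ISUPDATE, and METADATA$ROW_ID change-tracking columns; consuming the stream in a DML statement advances its offset.
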
Question 65:
Skipped
You have a Snowflake table which is defined as below

CREATE OR REPLACE TABLE FRUITS(FRUIT_NUMBER NUMBER, FRUIT_DESCRIPTION VARCHAR, AVAILABILITY VARCHAR);

If you would like to convert the fruit_number column to be a decimal with a certain precision
and scale, which command will you run?


SELECT FRUIT_NUMBER::DECIMAL(10,5) FROM FRUITS;

(Correct)

SELECT FRUIT_NUMBER(DECIMAL(10,5)) FROM FRUITS;

SELECT FRUIT_NUMBER AS DECIMAL(10,5) FROM FRUITS;

SELECT FRUIT_NUMBER.DECIMAL(10,5) FROM FRUITS;


Explanation
This question on CAST may come in may forms, but just remember that you can CAST to a
specific datatype using '::' or by using the CAST command

CAST , ::

Converts a value of one data type into another data type. The semantics of CAST are the
same as the semantics of the corresponding TO_ datatype conversion functions. If the cast
is not possible, an error is raised. For more details, see the individual
TO_ datatype conversion functions.

The :: operator provides alternative syntax for CAST.

I could have also written the query as below

1. SELECT CAST(FRUIT_NUMBER AS DECIMAL(10,5)) FROM FRUITS;

Question 66:
Skipped
You ran a query in Snowflake and went to the Query History tab. The query history shows you the below columns:

1. QueryID

2. SQL TEXT

3. WAREHOUSE NAME

4. WAREHOUSE SIZE

5. SESSION ID
6. START TIME

7. END TIME

Which of the above columns will indicate if a compute cost was incurred to run the query?

WAREHOUSE NAME

CREDIT

WAREHOUSE SIZE

(Correct)

SESSION ID
Explanation
Anytime a query incurs a compute cost, you will see the warehouse size populated for that query in the query history.

Just do an experiment as below:

1. Run the query SHOW TABLES

2. Go to query history

Do you see the warehouse size? No, because SHOW TABLES is a metadata query and does not incur any compute cost.

Question 67:
Skipped
The FLATTEN command in Snowflake has two versions. One version uses a join and the other uses an object keyword. Please select the two words that represent the options used with the command.


OBJECT_CONSTRUCT

TABLE

(Correct)

TRY_CAST

LATERAL

(Correct)

Explanation
An example of the command is as below
1. select * from table(flatten(input => parse_json('[1, ,77]'))) f;
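
The other form uses LATERAL against a source table; a hypothetical sketch (the products table and its VARIANT column tags are assumptions):

select p.id, f.value::string as tag
from products p,
lateral flatten(input => p.tags) f;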

Question 68:
Skipped
You have two types of named stages: one is an external stage and the other is an internal stage. An external stage will always require a cloud storage provider

TRUE

(Correct)

FALSE
Explanation
OK, this is an easy question. But when you are working for a customer, what will you suggest? When should we use an external stage? Please read below.

It is preferred to use an internal stage because Snowflake automatically encrypts the data in an internal stage; Snowflake is responsible for that encryption. If you use an external stage, it will be your responsibility to encrypt the data. But does that mean you will never use an external stage? Well, no; there may be use cases where the data volume is small, it is coming from an external stakeholder, and you do not want to store that data in your Snowflake tables. In such cases, go ahead and use an external stage. But always analyze the use case in hand.
Question 69:
Skipped
When cloning a table, if the COPY GRANTS keywords are not included in the
CREATE <object> statement, then the new object does not inherit any explicit access
privileges granted on the original table but does inherit any future grants defined for the
object type in the schema

TRUE

(Correct)

FALSE
Explanation
General Usage Notes

A clone is writable and is independent of its source (i.e. changes made to the source or clone
are not reflected in the other object).

To create a clone, your current role must have the following privilege(s) on the source object:

Tables

SELECT

Pipes, Streams, Tasks

OWNERSHIP

Other objects

USAGE

In addition, to clone a schema or an object within a schema, your current role must have
required privileges on the container object(s) for both the source and the clone.

For tables, Snowflake only supports cloning permanent and transient tables; temporary tables
cannot be cloned.

For databases and schemas, cloning is recursive:


Cloning a database clones all the schemas and other objects in the database.

Cloning a schema clones all the contained objects in the schema.

However, the following object types are not cloned:

External tables

Internal (Snowflake) stages

For databases, schemas, and tables, a clone does not contribute to the overall data storage for
the object until operations are performed on the clone that modify existing data or add new
data, such as:

Adding, deleting, or modifying rows in a cloned table.

Creating a new, populated table in a cloned schema.

Cloning a table replicates the structure, data, and certain other properties (e.g. STAGE FILE
FORMAT ) of the source table. A cloned table does not include the load history of the source
table. Data files that were loaded into a source table can be loaded again into its clones.

When cloning tables, the CREATE <object> command syntax includes the COPY GRANTS
keywords:

1. If the COPY GRANTS keywords are not included in the CREATE <object> statement, then the new object does not inherit any explicit access privileges granted on the original table but does inherit any future grants defined for the object type in the schema (using the GRANT <privileges> … TO ROLE … ON FUTURE syntax).

2. If the COPY GRANTS option is specified in the CREATE <object> statement, then the new object inherits any explicit access privileges granted on the original table but does not inherit any future grants defined for the object type in the schema.
Question 70:
Skipped
Scaling down a virtual warehouse (e.g. from a large warehouse to a small one) is an automated process.

TRUE

FALSE

(Correct)

Explanation
Resizing a warehouse is a manual operation (for example, ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'SMALL'). To use a different warehouse, you will need to use the below command:

USE WAREHOUSE <WAREHOUSE_NAME>
