Snowflake Interview Questions - Part II
By Janardhan Bandi
26. What is Snowpipe? Write the syntax for creating a Snowpipe.
Ans: Snowpipe is Snowflake’s continuous data ingestion service. It loads
data within minutes after files are added to a stage and submitted for ingestion.
→ Snowpipe uses a serverless compute model, so no user-managed virtual warehouse is needed.
→ Snowpipe provides a “pipeline” for loading fresh data in micro-batches as soon
as it is available.
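A minimal sketch of the pipe syntax, assuming an external stage and a target table already exist. The names orders_pipe, orders_stage and orders_raw, and the CSV file format, are hypothetical and only for illustration.

```sql
-- Hypothetical objects: orders_stage (external stage), orders_raw (target table).
CREATE OR REPLACE PIPE orders_pipe
  AUTO_INGEST = TRUE              -- load when cloud storage event notifications arrive
  AS
  COPY INTO orders_raw
  FROM @orders_stage
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);

-- Check whether the pipe is running and how many files are pending.
SELECT SYSTEM$PIPE_STATUS('orders_pipe');
```

The pipe is essentially a named COPY INTO statement; with AUTO_INGEST = TRUE, Snowflake runs it when the cloud provider notifies it of new files in the stage location.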
Ans: Roles are the entities to which privileges on Snowflake objects can be
granted and revoked.
→ Roles are assigned to users to allow them to perform actions required for
business functions in their organization.
→ A user can be assigned multiple roles.
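As a rough illustration of granting and revoking privileges through a role, the sketch below uses hypothetical names (role analyst, database sales_db, user jdoe) that are not taken from the source.

```sql
-- Hypothetical role, database and user names, for illustration only.
CREATE ROLE IF NOT EXISTS analyst;

-- Privileges are granted to the role, not directly to the user.
GRANT USAGE ON DATABASE sales_db TO ROLE analyst;
GRANT USAGE ON SCHEMA sales_db.public TO ROLE analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA sales_db.public TO ROLE analyst;

-- The role is then assigned to a user (a user can hold several roles).
GRANT ROLE analyst TO USER jdoe;

-- Privileges can be revoked from the role later.
REVOKE SELECT ON ALL TABLES IN SCHEMA sales_db.public FROM ROLE analyst;
```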
Ans:
https://docs.snowflake.com/en/user-guide/security-access-control-overview.html
28. What are the editions of Snowflake, and which one are you using in
your project?
Ans: There are 4 editions of Snowflake.
1. Standard
2. Enterprise
3. Business Critical
4. Virtual Private Snowflake (VPS)
→ These editions differ in multi-cluster warehouse support, Time Travel retention, security and many
other features.
→ Cost per credit depends on the edition we choose (rates also vary by cloud provider and region), for example:
Standard - $2.7/Credit
Enterprise - $4/Credit
Business Critical - $5.4/Credit
Ans:
→ Choose the warehouse size based on the volume of data you are dealing with and the
complexity of your queries. Sizes range from XS to 6XL.
→ The virtual warehouse (VW) size and the number of clusters also depend on the environment (Dev, Test,
Prod, etc.), based on the data and jobs you run in that environment.
Ans:
Vertical Scaling (Scale up): Increasing the virtual warehouse size to reduce
processing time and make your queries run faster.
Horizontal Scaling (Scale out): Increasing the number of clusters to keep queries
from queuing. You need to scale out when your customer base grows or when the number of
parallel queries/jobs increases.
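The two scaling directions can be sketched as follows. The warehouse name demo_wh and the chosen sizes and cluster counts are assumptions for illustration; multi-cluster warehouses require Enterprise Edition or higher.

```sql
-- Hypothetical warehouse; start small with auto suspend/resume to control cost.
CREATE WAREHOUSE IF NOT EXISTS demo_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND   = 60      -- suspend after 60 seconds of inactivity
  AUTO_RESUME    = TRUE;

-- Scale up: a larger size for heavier, more complex queries.
ALTER WAREHOUSE demo_wh SET WAREHOUSE_SIZE = 'LARGE';

-- Scale out: allow up to 3 clusters so concurrent queries do not queue.
ALTER WAREHOUSE demo_wh SET
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY    = 'STANDARD';
```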
31. What are the different table types in Snowflake?
3. Temporary Tables: Exist only within the session that created them and are
dropped when the session ends. 0-1 day retention period and no Fail-safe.
Can be used in stored procedures for intermediate data storage.
Note: If you create any Database/Schema as Transient then all the tables under
that Database/Schema are Transient by default.
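A short sketch of how each table type is created. All table and schema names here are hypothetical.

```sql
-- Hypothetical table and schema names, for illustration only.
CREATE TABLE orders_perm (id INT, amount NUMBER(10,2));            -- permanent (the default)
CREATE TRANSIENT TABLE orders_stg (id INT, amount NUMBER(10,2));   -- transient
CREATE TEMPORARY TABLE orders_tmp (id INT, amount NUMBER(10,2));   -- temporary, session-scoped

-- Tables created inside a transient schema are transient by default.
CREATE TRANSIENT SCHEMA staging;
CREATE TABLE staging.work_orders (id INT);

-- The "kind" column of the output shows the table type.
SHOW TABLES IN SCHEMA staging;
```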
32. What is the use of transient tables and temporary tables?
Ans:
Transient tables are specifically designed for transitory data that needs
to be maintained beyond each session, but does not need the same
level of data protection and recovery provided by permanent tables. You
can create staging tables and intermediate work tables as transient.
Ans:
Permanent tables: 0-90 days retention period (adjustable) and a 7-day Fail-safe
period that begins once the retention period ends.
Transient tables: 0-1 day retention period and no Fail-safe.
Temporary tables: 0-1 day retention period and no Fail-safe.
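The retention period can be adjusted per table, roughly as below. The table names are hypothetical, and orders_perm is assumed to already exist as a permanent table. Retention above 1 day on permanent tables requires Enterprise Edition or higher.

```sql
-- Hypothetical permanent table: keep 30 days of Time Travel history.
ALTER TABLE orders_perm SET DATA_RETENTION_TIME_IN_DAYS = 30;

-- Transient table with Time Travel switched off to save storage.
CREATE TRANSIENT TABLE scratch_events (id INT)
  DATA_RETENTION_TIME_IN_DAYS = 0;

-- Within the retention period, earlier data can be queried,
-- for example the table as it was 5 minutes ago.
SELECT * FROM orders_perm AT(OFFSET => -60*5);
```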
Ans: Yes, we can create one, and all queries in that session will then fetch data from
the temporary table.
Ans:
After creation, transient tables cannot be converted to any other table type.
After creation, temporary tables cannot be converted to any other table type.
36. What are the different caches available in Snowflake?
Ans: A stream object records change data capture (CDC) information for a table,
such as a staging table: the delta produced by inserts and other data manipulation
language (DML) changes like updates and deletes.
The stream records the changes made to the table over time.
38. How does a stream track the changes occurring on a table?
Ans: Every stream exposes metadata columns alongside the table columns. Based on the
values of these 2 fields, we can identify whether a record is an insert, update or delete.
METADATA$ACTION – INSERT/DELETE
METADATA$ISUPDATE – TRUE/FALSE
Note: A stream records an update as a pair of rows, a DELETE (old record) and an
INSERT (updated record), both with METADATA$ISUPDATE = TRUE.
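The sketch below shows one way to read these two fields. The table orders_staging, its order_id column and the stream name are hypothetical.

```sql
-- Hypothetical staging table and stream names.
CREATE OR REPLACE STREAM orders_stream ON TABLE orders_staging;

-- After some DML on orders_staging, classify each change row.
SELECT
  order_id,
  METADATA$ACTION,
  METADATA$ISUPDATE,
  CASE
    WHEN METADATA$ACTION = 'INSERT' AND METADATA$ISUPDATE = TRUE THEN 'UPDATE (new values)'
    WHEN METADATA$ACTION = 'DELETE' AND METADATA$ISUPDATE = TRUE THEN 'UPDATE (old values)'
    WHEN METADATA$ACTION = 'INSERT' THEN 'INSERT'
    ELSE 'DELETE'
  END AS change_type
FROM orders_stream;
```

Selecting from a stream does not consume it. The stream offset advances only when the stream is used in a DML statement that commits.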
Ans:
1. Choose appropriate table type
2. Define cluster keys on large tables only and choose proper cluster keys
3. Reduce default retention period
4. Enable auto suspend and auto resume
5. Choose appropriate warehouse size, use scale-up and scale-out effectively
6. Understand storage and compute costs
44. What is the difference between Where and Having?
Ans:
The WHERE clause filters individual rows, while HAVING filters the summary
data after applying grouping functions like COUNT, SUM, AVG, etc.
Can you use both WHERE and HAVING in a single SQL statement?
Yes, we can. One example is a query to fetch the list of departments that
contain more than 10 active employees.
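A possible form of the department query described above, assuming the EMP table has a DEPT_ID column and a hypothetical STATUS column that holds 'Active' for active employees:

```sql
-- STATUS is an assumed column name; adjust to the actual schema.
SELECT DEPT_ID, COUNT(*) AS ACTIVE_EMP_COUNT
FROM EMP
WHERE STATUS = 'Active'        -- row-level filter, applied before grouping
GROUP BY DEPT_ID
HAVING COUNT(*) > 10;          -- group-level filter, applied after aggregation
```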
If your database doesn’t support ROWID, use the rank approach or the temporary
table approach below.
1. Create a temp table with unique records of the actual table
2. Delete the data from actual table and insert data from temp table to actual table
3. Drop temp table
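The three steps above can be sketched as follows, assuming the table is named EMP and that duplicates are rows identical in every column. The temporary table name emp_dedup is hypothetical.

```sql
-- 1. Temp table with the unique records of the actual table.
CREATE TEMPORARY TABLE emp_dedup AS
SELECT DISTINCT * FROM EMP;

-- 2. Empty the actual table and reload it from the temp table.
DELETE FROM EMP;
INSERT INTO EMP SELECT * FROM emp_dedup;

-- 3. Drop the temp table.
DROP TABLE emp_dedup;
```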
46. What is the difference between UNION and UNION ALL?
Ans:
UNION eliminates duplicate records after combining all records.
UNION ALL does not eliminate duplicate records.
Note: To perform UNION and UNION ALL, the number of columns and the order of
columns should be the same in all statements or tables, with compatible data types.
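A small self-contained illustration using literal values, so that no tables are needed:

```sql
-- UNION removes duplicates: returns 2 rows (1 and 2).
SELECT 1 AS id
UNION
SELECT 1
UNION
SELECT 2;

-- UNION ALL keeps duplicates: returns 3 rows (1, 1 and 2).
SELECT 1 AS id
UNION ALL
SELECT 1
UNION ALL
SELECT 2;
```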
47. Below are two tables with one column in each
What is the number of records after each type of join?
1 1 A inner join B - 4
null null
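As an illustration, assume A and B each hold the values 1, 1 and NULL. This is an assumption chosen to match the inner join count of 4 and may not be the exact data of the original tables. Under that assumption, the counts for each join type can be checked with:

```sql
-- Assumed data: A = (1, 1, NULL), B = (1, 1, NULL).
WITH a AS (SELECT 1 AS id UNION ALL SELECT 1 UNION ALL SELECT NULL),
     b AS (SELECT 1 AS id UNION ALL SELECT 1 UNION ALL SELECT NULL)
SELECT 'inner' AS join_type, COUNT(*) AS row_count
FROM a INNER JOIN b ON a.id = b.id            -- 4: each 1 in A matches both 1s in B
UNION ALL
SELECT 'left', COUNT(*)
FROM a LEFT JOIN b ON a.id = b.id             -- 5: the 4 matches plus A's NULL row
UNION ALL
SELECT 'right', COUNT(*)
FROM a RIGHT JOIN b ON a.id = b.id            -- 5: the 4 matches plus B's NULL row
UNION ALL
SELECT 'full outer', COUNT(*)
FROM a FULL OUTER JOIN b ON a.id = b.id;      -- 6: the 4 matches plus both NULL rows
```

The key point is that NULL never equals NULL in a join condition, so the NULL rows appear only on the outer side of outer joins.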
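48. Query to fetch department-wise 3rd highest salary?
Ans:
-- DENSE_RANK ranks tied salaries equally, so rank 3 is the 3rd highest distinct salary
SELECT DISTINCT DEPT_ID, SALARY FROM EMP
QUALIFY DENSE_RANK() OVER(PARTITION BY DEPT_ID ORDER BY SALARY DESC) = 3;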
Ans:
To extract Date from Timestamp:
select to_date('2022-05-22 00:00:00'); -- 2022-05-22
Ans:
SELECT TRIM(REGEXP_REPLACE(string, '[^[:digit:]]', '')) AS Numeric_value
FROM
( SELECT ' Area code for employee ID 12345 is 6789.' AS string ) a;
53. Tell me some performance tuning techniques of SQL queries.
Ans:
1. Define proper Indexes
2. Define proper partitioning keys. (Cluster keys in Snowflake).
3. Select only required fields, don’t put SELECT * blindly.
4. Replace OR with UNION if possible.
5. Use UNION ALL instead of UNION if you are sure there are no duplicates; UNION ALL skips the duplicate-elimination step.
6. Use CTEs instead of subqueries if you want to use same result set at multiple places in
the query.
7. Use Inner Join instead of putting Cross Join + Where clause.
8. Collecting missing stats on table will help in Teradata.
9. Use materialized views for faster data retrieval and continuous data availability.
10. Look at the query execution plan or Query Profile to understand where the bottleneck is.
Some other frequently asked SQL questions:
1. What is the difference between NVL and NVL2?
2. What are the DML commands in SQL?
3. What is the difference between varchar and nvarchar?
4. What is an index in a database?
5. What is the difference between CASE and DECODE?
6. What are the types of slowly changing dimensions?
7. How is a data warehouse different from a database?
8. What is the difference between rank() and dense_rank()?
9. Write a query to find the nth highest salary.
10. Write a query to fetch the details of all employees with an even emp_id.
Thank You!
https://www.youtube.com/channel/UCNTGAQaxJMxZLS0GR1VHOKg