Snowflake Training

This document provides an introduction to a Snowflake fundamentals webinar. It includes sections for introductions where attendees can share their name, company, role and experience with Snowflake. It also contains information about remote attendee controls in Zoom, highlighting the participants and chat functions.

Uploaded by Lionel Messi

Snowflake Fundamentals

© 2020 Snowflake Computing Inc. All Rights Reserved

INTRODUCTIONS

● Name
● Company & Role
● Experience with Snowflake
○ What do you already know?
○ What do you want to learn?
● Tell us something about yourself


REMOTE ATTENDEE CONTROLS

● Hover over your Zoom interface to bring up the control panel at the bottom
● Click Participants to bring up the participants panel, with access to non-verbal feedback
● Click Chat to bring up the chat interface


PARTICIPANTS PANEL

● Click an icon in the non-verbal feedback controls to set it by your name

[Screenshot: participants panel showing the non-verbal feedback controls]


CHAT INTERFACE

● Chat to everyone, or send a personal message to the instructor or another participant


COURSE OBJECTIVES

By the end of this course you will be able to set up and administer Snowflake for your users. You will also be able to effectively use Snowflake to get business value out of your data. You will be able to:

● Configure and manage account parameters
● Configure users and roles with appropriate privileges
● Load and transform data in Snowflake
● Query data, and size virtual warehouses for different use cases
● Work with semi-structured data in Snowflake
● Understand Snowflake’s micro-partitioning and how it contributes to query optimization
● Use Time Travel and Fail-safe storage to recover data and provide continuous data protection
● Use zero-copy cloning to give groups with different use cases access to the same data
● Use Data Sharing to send your data in real time to customers, partners or other company accounts


COURSE AGENDA

● Overview & Architecture
● Clients, Connectors, & Ecosystems
● Snowflake Caching
● SQL Support in Snowflake
● Data Movement
● Snowflake Functions
● Managing Security
● Access Control and User Management
● Semi-Structured Data
● Continuous Data Protection
● Data Sharing
● Performance & Concurrency
● Account & Resource Management
● Preview Features


LAB EXERCISE

10 minutes

Introduction - Prepare for Class

● Log in to Snowflake Training Portal
○ https://training.snowflake.com/login.html
● Instructor will provide you access to a Snowflake account
○ https://xxYY1234.snowflakecomputing.com
○ User assigned for the course (30 days)
● Access student materials
○ Available for 90 days
○ Need access to workbook for labs
● Download lab files
○ SQL ready to go (/users/ directory)


OVERVIEW AND ARCHITECTURE


MODULE AGENDA

● The Snowflake Difference
● Snowflake Structure
● Cloud Services Layer
● Data Storage Layer
● Compute Layer


THE SNOWFLAKE DIFFERENCE


SNOWFLAKE

A FULL DATA WAREHOUSE BUILT FOR THE CLOUD

Our vision
Allow our customers to access all their data in one place to make actionable decisions anytime, anywhere, with any number of users.

Our solution
Next-generation data warehouse built from the ground up for the cloud to address today’s data and analytics challenges.

● SQL Data Warehouse
● Built for the cloud
● Delivered as a service


SNOWFLAKE CLOUD DATA PLATFORM

● Complete SQL Database
● Self-Tuning
● All Your Data
● All Your Users
● Pay Only for What You Use
● Live Data Sharing


TRADITIONAL DATA ARCHITECTURE

COMPLEX, COSTLY, CONSTRAINED

[Diagram: data flows from Data Sources (OLTP Databases, Enterprise Applications, Third-Party, Web/Log Data, IoT) through Data Integration, Data Transformation, Normalization & Aggregation, and Data Analytics stages (File Sharing, ETL, ELT, CDC, Streaming, Data Warehouses, Data Marts, Cubes, Backups, Data Lake, Data Science) to Data Consumers (Operational Reporting, Ad Hoc Analysis, Real-time Analytics)]


MODERN DATA ARCHITECTURE
WITH SNOWFLAKE

[Diagram: Data Sources (OLTP Databases, Enterprise Applications, Third-Party, Web/Log Data, IoT) feed Snowflake through ETL and streaming. One platform covers Data Warehouse, Data Lake, Data Engineering, Data Exchange, Data Applications, Data Science, and Data Monetization, with logical datamarts, and serves Data Consumers (Operational Reporting, Ad Hoc Analysis, Real-time Analytics)]


REQUIREMENTS OF A CLOUD DATA PLATFORM

● One Platform, One Copy of Data, Many Workloads
● Unlimited Performance and Scale
● Secure & Governed Access to All Data
● Near-zero Maintenance, as a Service


A NEW ARCHITECTURE

MULTI-CLUSTER, SHARED DATA, IN THE CLOUD

Traditional Architectures
● Shared-disk: shared storage, single cluster
● Shared-nothing: decentralized, local storage, single cluster

Snowflake
● Multi-cluster, shared data: centralized, scale-out storage with multiple, independent compute clusters


SNOWFLAKE ARCHITECTURE

● Scale Out Services
● Multi-Cluster Compute
● Centralized Storage
● Cloud Agnostic Layer


MULTI-CLUSTER SHARED DATA ARCHITECTURE

● Storage decoupled from compute
● All data in one place
● Dynamically combine storage and compute

[Diagram: clients connect over JDBC/ODBC through a VPC/VNet to three layers. Cloud services: Authentication & Access Control, Infrastructure manager, Optimizer, Metadata manager, Security. Compute (Virtual Warehouses): independent clusters, each with its own cache. Database storage: shared data files.]


FUNCTIONAL ARCHITECTURE

[Diagram sequence: independently sized virtual warehouses (XS, S, M, L, XL) run separate workloads against the same data: Data Load (structured and semi-structured), Data Transformation, Marketing, Finance, Analytics / Reporting / BI, Data Science, and App. Across the frames, individual warehouses are resized and added. Later frames add a Test/Dev clone and Secure Sharing & Collaboration across your business ecosystem: your employees, your private data exchange, and the public data exchange.]


SNOWFLAKE EDITIONS



GLOBAL SNOWFLAKE

CURRENTLY SUPPORTED REGIONS


SNOWFLAKE STRUCTURE


LOGICAL DATA
ORGANIZATION

● Databases and schemas logically organize data within a Snowflake account
● A database is a logical grouping of schemas
○ Each database belongs to a single account
● A schema is a logical grouping of database objects, such as tables and views

[Diagram: an Account contains Databases; each Database contains Schemas]


SNOWFLAKE OBJECTS

● All Snowflake objects reside within logical containers, with the top-level container being the Snowflake account
● All objects are individually securable
● Users perform operations on objects through “privileges” that are “granted” to roles
● Sample privileges:
○ Create a virtual warehouse
○ List tables in a schema
○ Insert data into a table
○ Select data from a table

[Diagram: objects within an Account: User, Role, Virtual Warehouse, Resource Monitor, Database, Schema, Table, View, Stored Procedure, UDF, Stage, File Format, Pipe, Sequence]
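
The privilege model above can be sketched in a few lines of Python. This is a toy illustration, not Snowflake's implementation: the `Role` class, the role name ANALYST, and the table name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """Hypothetical model: a role holds a set of (privilege, object) grants."""
    name: str
    grants: set = field(default_factory=set)

    def grant(self, privilege: str, obj: str) -> None:
        # Privileges are granted to roles, and users act through a role.
        self.grants.add((privilege.upper(), obj.upper()))

    def can(self, privilege: str, obj: str) -> bool:
        # An operation is allowed only if the matching grant exists.
        return (privilege.upper(), obj.upper()) in self.grants


analyst = Role("ANALYST")
analyst.grant("SELECT", "sales.public.orders")
analyst.grant("INSERT", "sales.public.orders")

print(analyst.can("select", "SALES.PUBLIC.ORDERS"))  # True
print(analyst.can("DELETE", "SALES.PUBLIC.ORDERS"))  # False
```

Each object is checked individually, which mirrors the point that all objects are individually securable.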


TABLE TYPES

PERMANENT
● Persist until dropped
● Designed for data that requires the highest level of data protection and recovery
● Default table type
● Time Travel: up to 90 days with Enterprise
● Fail-safe: yes

TEMPORARY
● Persist for, and are tied to, a session (think single user)
● Used for transitory data (for example, ETL/ELT)
● Time Travel: 0 or 1 days
● Fail-safe: no

TRANSIENT*
● Persist until dropped
● Available to multiple users
● Used for data that needs to persist, but does not need the same level of data retention as a permanent table
● Time Travel: 0 or 1 days
● Fail-safe: no

EXTERNAL
● Persist until removed
● Snowflake “over” an external data lake
● Data accessed via an external stage
● Read-only
● Time Travel: no
● Fail-safe: no

*Transient applicable to Database, Schema and Table

VIEW TYPES

Standard View
● Default view type
● Named definition of a query (a SELECT statement)
● Executes as owning role
● Underlying DDL available to any role with access to the view

Secure View
● Definition and details only visible to authorized users
● Executes as owning role
● Snowflake query optimizer bypasses optimizations used for regular views

Materialized View
● Behaves more like a table
● Results of underlying query are stored
● Auto-refreshed
● Secure Materialized View is also supported


LAB EXERCISE

DEMO: Using the Snowflake Web UI

15 minutes

● Overview
● Worksheet Context
● Tips and Tricks


LAB EXERCISE

Take a Quick Test Drive

25 minutes

Tasks:
● Log in to your Snowflake training account
● Create a database and a table for later use
● Run queries on sample data

Note:
● In this lab, you will define your user's default role, namespace, and warehouse so they will automatically be set in every worksheet you open.
● Internet Explorer is not supported!


CLOUD SERVICES LAYER


SNOWFLAKE
ARCHITECTURE

3 architectural layers:

● Storage Layer
● Compute (Virtual Warehouse) Layer
● Cloud Services Layer


SNOWFLAKE ARCHITECTURE: CLOUD SERVICES

● "Brains" of the service
● Coordinates activities across Snowflake
● Runs on compute instances provisioned by Snowflake

Services: Management, Optimization, Security, Availability, Transactions, Metadata


MANAGEMENT

● Centralized management for all storage
● Manages the compute that works with the storage
● Transparent, online updates and patches


OPTIMIZER SERVICE

● SQL Optimizer
○ Cost-based optimization (CBO)
● Automatic JOIN order optimization
○ No user input or tuning required
● Automatic statistics gathering
● Pruning using metadata about micro-partitions


SECURITY

● Authentication
● Access control for users and roles
● Access control for shares
● Encryption and key management


METADATA MANAGEMENT

● Stores metadata as data is loaded into the system
● Handles queries that can be processed completely from metadata
● Used for Time Travel and Cloning
● Every aspect of Snowflake architecture leverages metadata


DATA STORAGE LAYER


STORAGE LAYER

● Hybrid columnar
● Automatic micro-partitioning
● Natural data clustering and optimization
● Native semi-structured data support and optimization
● All storage within Snowflake is billable (compressed)

[Diagram: columnar data files in cloud storage]


COLUMNAR COMPRESSION

● Ingestion automatically analyzes and compresses data into the table on load
● Finds the optimal compression scheme for each data type
● Storing values of the same data type together enables efficient compression
● Columns grow and shrink independently
● Significant performance benefit from reduced I/O and storage

[Diagram: a partition stored in columnar form]
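
A small Python sketch can illustrate why grouping values by column helps compression. Run-length encoding is used here purely as an illustrative scheme; the slides do not say which schemes Snowflake actually applies, and the sample rows are made up.

```python
from itertools import groupby

# Hypothetical sample rows: (country, quantity)
rows = [
    ("US", 10), ("US", 10), ("US", 12),
    ("DE", 12), ("DE", 12), ("DE", 12),
]


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Columnar layout: each column's values are stored together.
country_column = [country for country, _ in rows]
quantity_column = [quantity for _, quantity in rows]

# Row layout: values of different columns are interleaved.
row_major = [value for row in rows for value in row]

print(run_length_encode(country_column))   # [('US', 3), ('DE', 3)]
print(run_length_encode(quantity_column))  # [(10, 2), (12, 4)]
print(len(run_length_encode(row_major)))   # 12 runs, nothing collapses
```

In the columnar layout each column shrinks to two runs, while the interleaved row layout has no repeated neighbors to collapse.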


MICRO-PARTITIONS

● Contiguous units of storage that hold table data
○ 50 - 500 MB of uncompressed data
○ Generally at most 16 MB compressed
● Many micro-partitions per table
● Immutable
● Services layer stores metadata about every micro-partition
○ MIN/MAX (range of values in each column)
○ Number of distinct values


MICRO-PARTITIONING

● Physical data files that comprise Snowflake's logical tables
● Automatically created contiguous units of storage, partitioned based on ingestion order
● Attempts to preserve natural data co-location
● Immutable - updates create new micro-partition versions

Example: table written to micro-partitions

ID   Name
1    John
2    Scott
3    Mary
4    Jane
5    Jack
6    Claire

Partition   ID        Name
P1          1, 2, 3   John, Scott, Mary
P2          4, 5, 6   Jane, Jack, Claire


METADATA

● Snowflake automatically collects and maintains metadata about tables and their underlying micro-partitions, including:
○ Table level:
■ Row count
■ Table size (in bytes)
■ File references and table versions
○ Micro-partition column level:
■ Range of values
■ Number of distinct values
■ MIN and MAX values
■ NULL count

Example: micro-partition files and their partition metadata

ID        Name                 Partition metadata
1, 2, 3   John, Scott, Mary    ID: 1-3, Name: J - S
4, 5, 6   Jane, Jack, Claire   ID: 4-6, Name: C - J
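
To make the role of MIN/MAX metadata concrete, here is a minimal Python sketch of partition pruning using the two example partitions, P1 with IDs 1-3 and P2 with IDs 4-6. The `PartitionMeta` class and `partitions_to_scan` function are hypothetical names for illustration; the real optimizer is far more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionMeta:
    """Hypothetical per-partition metadata for the ID column."""
    name: str
    id_min: int
    id_max: int


partitions = [PartitionMeta("P1", 1, 3), PartitionMeta("P2", 4, 6)]


def partitions_to_scan(metadata, low, high):
    """Keep only partitions whose [id_min, id_max] overlaps [low, high]."""
    return [p.name for p in metadata if p.id_max >= low and p.id_min <= high]


print(partitions_to_scan(partitions, 5, 5))  # ['P2']
print(partitions_to_scan(partitions, 3, 4))  # ['P1', 'P2']
print(partitions_to_scan(partitions, 7, 9))  # []
```

A filter such as ID = 5 touches only P2, and a filter outside every range touches no partition at all, which is how metadata alone can answer or narrow a query.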


DATA STORAGE BILLING

● Billed for actual storage use
○ Daily average terabytes per month
● On-demand pricing
○ Billed in arrears for storage used
○ $40/terabyte/month
○ Minimum monthly charge of $25
● Pre-purchased capacity
○ Billed up front with commitment to a certain capacity
○ Price varies by amount and cloud platform
○ Customer is notified at 70% of capacity
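
The on-demand figures above can be turned into a rough estimate. The sketch below assumes the $25 minimum acts as a simple floor on the monthly charge and that usage is sampled once per day; both are assumptions for illustration, and `monthly_storage_charge` is a hypothetical helper.

```python
def monthly_storage_charge(daily_terabytes, rate_per_tb=40.0, minimum=25.0):
    """Estimate an on-demand storage bill from daily usage samples (in TB)."""
    # Storage is billed on the daily average over the month.
    average_tb = sum(daily_terabytes) / len(daily_terabytes)
    # Assumption: the minimum monthly charge is applied as a floor.
    return max(average_tb * rate_per_tb, minimum)


# 10 days at 1 TB, then 20 days at 4 TB: the average is 3 TB.
usage = [1.0] * 10 + [4.0] * 20
print(monthly_storage_charge(usage))         # 120.0

# Very small usage falls back to the minimum charge.
print(monthly_storage_charge([0.1] * 30))    # 25.0
```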


COMPUTE LAYER


VIRTUAL WAREHOUSES

● A named wrapper around a cluster of servers with CPU, memory, and disk
● The larger the warehouse, the more servers in the cluster
● Extra Small has one server per cluster
○ Each size up doubles the number of servers
● Jobs requiring compute run on virtual warehouses
● While running, a virtual warehouse consumes Snowflake credits
○ You are charged for compute


WORKLOAD SEGMENTATION

● Should reflect units of workload management
○ ETL
○ BI / Dashboards
○ Data Science

[Diagram: a Prod DB served by a separate virtual warehouse per workload: Continuous Loading from S3 (4TB/day), Interactive Dashboard (<5min SLA), Reporting (Segmented), and ETL & Maintenance, on warehouses sized Medium, X-Large x 5 (auto scale), 2X-Large, and Large]


VIRTUAL WAREHOUSE TYPES

Standard
● Will only ever have a single compute cluster
● Cannot “scale out”

Multi-Cluster Warehouse (MCW)
● Can spawn additional compute clusters (scale out) to manage changes in user and concurrency needs
● Enterprise Edition feature


VIRTUAL WAREHOUSE SIZING

● Warehouses are sized in “t-shirt” sizes
● Size determines the number of servers that comprise each cluster in a warehouse
● Each larger size is double the preceding one, in both VMs in the cluster and Snowflake credits consumed


COMPUTE CREDITS

● Compute credits are how you are billed for compute usage
○ You may have a set number of credits
○ You may be billed monthly for your credits
● Compute credits are charged based on the number of virtual warehouses you use, their size, and how long you use them
● Warehouse usage (or compute) is charged per second, with a one-minute minimum


VIRTUAL WAREHOUSE CREDITS

Warehouse Size   Servers   Credits / Hour   Credits / Second
X-Small          1         1                0.0003
Small            2         2                0.0006
Medium           4         4                0.0011
Large            8         8                0.0022
X-Large          16        16               0.0044
2X-Large         32        32               0.0089
3X-Large         64        64               0.0178
4X-Large         128       128              0.0356
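
The credit table and the per-second billing rule with a one-minute minimum can be combined into a short calculation. This is a sketch for a single continuous run of one warehouse; `credits_used` is a hypothetical helper, and the hourly rates are taken from the table.

```python
# Credits per hour by warehouse size, as listed in the table.
CREDITS_PER_HOUR = {
    "X-Small": 1, "Small": 2, "Medium": 4, "Large": 8,
    "X-Large": 16, "2X-Large": 32, "3X-Large": 64, "4X-Large": 128,
}


def credits_used(size, seconds_running):
    """Credits for one continuous run, billed per second with a 60 s minimum."""
    billed_seconds = max(seconds_running, 60)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


print(credits_used("Large", 1800))                    # 4.0 (half an hour at 8/hour)
print(credits_used("Medium", 30) == credits_used("Medium", 60))  # True
print(round(CREDITS_PER_HOUR["X-Small"] / 3600, 4))   # 0.0003
```

The per-second column in the table is simply the hourly rate divided by 3600 and rounded to four decimals.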


CLOUD SERVICES COMPUTE BILLING

● Some compute occurs in the cloud services layer
● Customers are charged for cloud services compute that exceeds 10% of total compute costs for the account
● Use the WAREHOUSE_METERING_HISTORY view to see how much cloud services compute your account is using
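
The 10% rule can be expressed as a small calculation. The sketch below assumes the allowance is measured against warehouse compute credits over the same period; the slide does not spell out the exact basis or period, so treat both as assumptions. `billable_cloud_services` is a hypothetical helper.

```python
def billable_cloud_services(warehouse_credits, cloud_services_credits,
                            threshold=0.10):
    """Cloud services credits charged after the 10% allowance is subtracted."""
    # Assumption: the allowance is 10% of warehouse compute credits.
    allowance = warehouse_credits * threshold
    return max(cloud_services_credits - allowance, 0.0)


print(billable_cloud_services(100, 6))    # 0.0, under the allowance
print(billable_cloud_services(100, 14))   # about 4 credits over the allowance
```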


RESIZING A WAREHOUSE

● Can be completed at any time, even when running
● Completed via ALTER WAREHOUSE statement or the UI
● Effects of resizing:
○ Suspended warehouse: will start at new size upon next resume
○ Running warehouse: immediate impact; running queries complete at current size, while queued queries run at new size
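
The effect of resizing a running warehouse can be modeled in a few lines. This is a toy model of the rule stated above, not how Snowflake schedules work internally; the query names and the `resize` helper are hypothetical.

```python
def resize(running, queued, current_size, new_size):
    """Map each query to the warehouse size it executes at after a resize."""
    # Queries already running finish at the size they started with.
    plan = {query: current_size for query in running}
    # Queued queries start only once the new size is in effect.
    plan.update({query: new_size for query in queued})
    return plan


# A statement such as
#   ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE';
# (my_wh is a hypothetical name) would trigger this behavior.
print(resize(["q1", "q2"], ["q3"], "Medium", "Large"))
# {'q1': 'Medium', 'q2': 'Medium', 'q3': 'Large'}
```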


SCALE UP FOR PERFORMANCE

Elastic Processing Power (CPU, RAM, SSD)
● Raw performance boost for complex queries or ingesting large data sets
● More complex queries on larger datasets require larger warehouses
● Not intended for handling concurrency issues (more users/queries)

Warehouse Sizing Guidelines
● Snowflake uses per-second billing: use larger warehouses for more complex workloads and (auto) suspend when not in use
● Keep queries of similar size and complexity on the same warehouse to simplify compute resource sizing

[Diagram: a Medium warehouse resized to Large]

SCALE OUT FOR CONCURRENCY

General Functionality and Considerations
● Single Virtual Warehouse with multiple compute clusters
● Delivers consistent SLA, automatically adding and removing compute clusters based on concurrent usage
● Scale out during peak times and scale back during slow times
● Queries load balanced across the clusters in a Virtual Warehouse
● Deployed across availability zones for high availability

Guidelines
● MIN_CLUSTER_COUNT: 1-10 (default 1)
● MAX_CLUSTER_COUNT: 1-10 (default 1), must be >= MIN_CLUSTER_COUNT
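
The two cluster-count guidelines can be checked before a statement is issued. The sketch below validates the ranges given above and builds an ALTER WAREHOUSE statement as text; the helper names and the warehouse name reporting_wh are hypothetical, and nothing here connects to Snowflake.

```python
def multi_cluster_settings(min_clusters=1, max_clusters=1):
    """Validate cluster counts against the 1-10 range and ordering rule."""
    for label, value in (("MIN_CLUSTER_COUNT", min_clusters),
                         ("MAX_CLUSTER_COUNT", max_clusters)):
        if not 1 <= value <= 10:
            raise ValueError(f"{label} must be between 1 and 10, got {value}")
    if max_clusters < min_clusters:
        raise ValueError("MAX_CLUSTER_COUNT must be >= MIN_CLUSTER_COUNT")
    return {"MIN_CLUSTER_COUNT": min_clusters,
            "MAX_CLUSTER_COUNT": max_clusters}


def alter_statement(warehouse, settings):
    """Render the settings as an ALTER WAREHOUSE ... SET statement."""
    assignments = " ".join(f"{key} = {value}" for key, value in settings.items())
    return f"ALTER WAREHOUSE {warehouse} SET {assignments};"


print(alter_statement("reporting_wh", multi_cluster_settings(1, 4)))
# ALTER WAREHOUSE reporting_wh SET MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 4;
```

With both counts left at 1, the warehouse behaves like a standard single-cluster warehouse.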


SCALING UP VS. OUT EXAMPLE

CREDITS PER HOUR


LAB EXERCISE

Work with Storage and Compute

25 minutes

Tasks:
● Review the TRAINING_DB database
● Create and organize objects
● Review storage usage
● Run commands without compute
● Work with virtual warehouses


CLIENTS, CONNECTORS, AND ECOSYSTEMS


MODULE AGENDA

● Clients & Interfaces
● SnowSQL
● Ecosystem Overview
● Connectors


CLIENTS & INTERFACES


WAYS TO CONNECT & USE SNOWFLAKE

● Web-based user interface - access all aspects of using and managing Snowflake
● Command-line clients - access all aspects of using and managing Snowflake
○ For example, SnowSQL
● ODBC and JDBC drivers - used by other applications to connect to Snowflake
○ For example, Tableau
● Native connectors - used to develop applications for connecting to Snowflake
○ For example, Python
● Third-party solutions - leverage native connectors to connect to Snowflake
○ For example, ETL and BI tools


SNOWFLAKE WEB INTERFACE URLS

Location                   Hostname
AWS US West                <account>.snowflakecomputing.com
All other regions on AWS   <account>.<region>.snowflakecomputing.com
Other providers            <account>.<region>.<provider>.snowflakecomputing.com

[Diagram callouts mark the account name and the hostname within each URL]
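The three hostname patterns above can be sketched as a small helper. This is only an illustration: the function name and the sample account, region, and provider values are hypothetical, and the real segments for an account should come from the account's own URL.

```python
from typing import Optional


def snowflake_hostname(account: str,
                       region: Optional[str] = None,
                       provider: Optional[str] = None) -> str:
    """Join the hostname segments in the order shown on the slide."""
    if provider and not region:
        # The slide only shows <provider> following a <region> segment.
        raise ValueError("a provider segment requires a region segment")
    parts = [account]
    if region:
        parts.append(region)
    if provider:
        parts.append(provider)
    parts.append("snowflakecomputing.com")
    return ".".join(parts)


# Hypothetical values, one per row of the table.
print(snowflake_hostname("myaccount"))
print(snowflake_hostname("myaccount", "eu-west-1"))
print(snowflake_hostname("myaccount", "east-us-2", "azure"))
```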


SNOWFLAKE WEB UI

● Account Management
● Virtual Warehouse Management
● Database Management
● Simple Data Loading
● Query Execution, Monitoring, & Profiling
● Password Management
● Preference Management


SNOWFLAKE WEB UI: WORKSHEETS

Visual interface for creating and submitting SQL queries


SNOWFLAKE WEB UI: WORKSHEETS

Worksheet Context

Worksheet context defines the default namespace, role, and warehouse to be used for commands issued in the worksheet


WORKSHEET CONTEXT

● Can be changed with the drop-down menu, or with SQL commands
  ○ USE ROLE <role>
  ○ USE WAREHOUSE <name>
  ○ USE <database.schema>


SNOWFLAKE WEB UI: WORKSHEETS

Object Browser

Shows objects available to the current role (set in the worksheet context)


SNOWFLAKE WEB UI: WORKSHEETS

Query Pane

Pane for submitting SQL queries


SNOWFLAKE WEB UI: WORKSHEETS

Result Pane

Displays results for SQL queries; links to query profile


SNOWFLAKE SAMPLE DATA

● Industry-standard TPC-DS and TPC-H benchmarks
● Shared via SNOWFLAKE_SAMPLE_DATA database (read only)
● Do not incur storage charges, but do require a virtual warehouse to run queries


SNOWFLAKE_SAMPLE_DATA

● TPC-H data
  ○ Order and supplier information for a retailer
● TPC-DS data
  ○ Sales and support information for a catalog company
● Weather data from OpenWeatherMap
  ○ JSON format
  ○ Daily, hourly, and recent weather for 200,000+ cities
  ○ July 2016 through present


TUTORIALS

● Available through the UI
● Guided walk-throughs and exercises atop the SNOWFLAKE_SAMPLE_DATA database
● Additional tutorials are detailed in the documentation


SNOWSQL

SNOWSQL

● Command-line client for connecting to Snowflake
● Developed using the Snowflake connector for Python
● Versions for Windows, macOS, Linux


SNOWSQL

● Available for download from the UI: Help -> Downloads
● Run as an interactive shell, or in batch mode
● Can leverage a configuration file to pass parameters like account, user, role, database, schema, etc.


SNOWSQL CONFIGURATION FILE

● Set defaults to be used in SnowSQL
● Can set up multiple connections
● Locations:
  ○ Linux/Mac: ~/.snowsql/config
  ○ Windows: %USERPROFILE%\.snowsql\config

CAUTION: Password, if specified, is stored in clear text


SNOWSQL CONFIGURATION FILE

● Configure multiple connections
● Connect with -c option
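As a rough sketch of what a configuration file with a default and a named connection might contain, the snippet below builds one with Python's standard configparser and prints it. The connection name (training), the account, user, role, database, and warehouse values are all hypothetical, and the key names (accountname, username, rolename, dbname, schemaname, warehousename) are given as I understand the SnowSQL format; check them against the SnowSQL documentation before relying on them. The password is left out on purpose, since the file stores it in clear text.

```python
import configparser
import io

config = configparser.ConfigParser()

# Default connection, used when no -c option is given (hypothetical values).
config["connections"] = {
    "accountname": "myaccount",
    "username": "jdoe",
}

# A named connection; the name after the dot is what -c would refer to.
config["connections.training"] = {
    "accountname": "myaccount",
    "username": "jdoe",
    "rolename": "TRAINING_ROLE",
    "dbname": "TRAINING_DB",
    "schemaname": "PUBLIC",
    "warehousename": "TRAINING_WH",
}

# Render the INI text that would be saved as ~/.snowsql/config.
buffer = io.StringIO()
config.write(buffer)
config_text = buffer.getvalue()
print(config_text)

# Read it back to confirm the file parses as expected.
parsed = configparser.ConfigParser()
parsed.read_string(config_text)
```

With a file like this in place, the named connection would be selected with the -c option, for example snowsql -c training.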


CHECK CURRENT ROLE IN SNOWSQL

[Screenshot: checking the current role in SnowSQL]


LAB EXERCISE

Install and Work with SnowSQL

35 minutes

Note:

● This lab requires the ability to install software on your laptop

Tasks:

● Install SnowSQL
● Run SQL commands using SnowSQL
● Create & use a configuration file
● Run a script using SnowSQL


CONNECTORS

SUPPORTED CONNECTORS

JDBC            Java
ODBC            C/C++
Python          Python
Go              Go
Node.js         Node.js
Spark           Spark DataFrame (Scala, Python)
R               Java, C, C++ (RJDBC, RODBC)
.NET            Visual Basic, C#, F#, C++
SnowSQL         CLI client for SQL (Python)
Web Interface   UI for Admin, SQL, and more


BI TOOL USAGE (YTD July 2019)

[Chart: usage by BI tool: Tableau, Power BI, Looker, Mode, Periscope, MSFT Excel]


ETL TOOL USAGE (YTD July 2019)

[Chart: usage by ETL tool: Fivetran, Matillion, Stitch, Talend, Segment, Informatica, Alooma]


ANALYTICS TOOLS USAGE (YTD July 2019)

[Chart: usage by analytics tool: Apache Spark, Databricks, R, Alteryx, SAS]


WORKLOAD BY CONNECTING CLIENT (YTD July 2019)

[Chart: workload by client: Snowflake UI, JDBC, Python Connector, ODBC, SnowSQL]


ECOSYSTEM OVERVIEW

PARTNER CONNECT

● Create trial accounts with selected Snowflake partners
● Use is limited to the ACCOUNTADMIN role
● During connection, partner creates:
  ○ PC_%partnername_USER
  ○ PC_%partnername_ROLE
  ○ PC_%partnername_DB
  ○ PC_%partnername_WH


LAB EXERCISE

DEMO: Partner Connect

5 minutes


SNOWFLAKE CACHING


MODULE AGENDA

● Overview
● Metadata Cache
● Query Result Cache
● Data Cache in Compute Cluster


OVERVIEW

CACHING IN SNOWFLAKE

[Diagram: the Cloud Services layer holds the METADATA CACHE and the QUERY RESULT CACHE; each Virtual Warehouse holds its own WAREHOUSE DATA CACHE; Data Storage sits beneath the warehouses]


METADATA CACHE


METADATA CACHE

● Metadata stored in cloud services layer
● Partition-level metadata
  ○ Row count
● Partition-column-level metadata
  ○ MIN/MAX values
  ○ Number of DISTINCT values
  ○ Number of NULL values
● Table versions and references to physical files (.fdn)


MICRO-PARTITION METADATA CACHE

● Used by SQL optimizer to speed up query compilation
● Used for queries that can be answered completely by metadata
  ○ SHOW commands
  ○ MIN, MAX, COUNT
● Fast
● No virtual warehouse used
  ○ NOTE: Cloud Services charges may still apply!


METADATA CACHE - QUERY PROFILE

[Screenshot: query profile for a query answered from the metadata cache]


QUERY RESULT CACHE


QUERY RESULT CACHE

Persisted Query Results

● Query results are stored and managed by the Cloud Services layer
● Used if the identical query is run, and base tables have not changed
● Available to other users:
  ○ SHOW: any user in the same role
  ○ SELECT: any user with SELECT permissions on all tables in query
● Remains in the query result (QR) cache for 24 hours


HOW IT WORKS

● Result sets are cached for 24 hours; the counter resets each time a matching query re-uses the result
● Result reuse is controlled by the USE_CACHED_RESULT parameter at account/user/session level
● Eligibility requirements for a query to use the result set cache:
  ○ Exact same SQL query (except possibly whitespace)
  ○ Result must be deterministic (no random function)
  ○ Changes CAN be made to source table(s), but only if they do not affect any micro-partitions relevant to the query
  ○ Must have the right permissions to use it

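The 24-hour retention with a counter that resets on reuse can be pictured with a toy model. This is a simplified sketch, not Snowflake's implementation: the ResultCache class and its methods are hypothetical, timestamps are plain seconds, and the permission and table-change checks listed above are left out.

```python
from typing import Dict, Optional, Tuple

RETENTION_SECONDS = 24 * 60 * 60  # 24 hours, as stated on the slide


class ResultCache:
    """Toy result cache keyed by the exact SQL text."""

    def __init__(self) -> None:
        # sql text -> (result, expiry time in seconds)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def store(self, sql: str, result: str, now: float) -> None:
        self._entries[sql] = (result, now + RETENTION_SECONDS)

    def lookup(self, sql: str, now: float) -> Optional[str]:
        entry = self._entries.get(sql)
        if entry is None:
            return None  # never cached, or the text does not match exactly
        result, expires_at = entry
        if now >= expires_at:
            del self._entries[sql]  # retention period has run out
            return None
        # Reuse resets the 24-hour counter.
        self._entries[sql] = (result, now + RETENTION_SECONDS)
        return result


HOUR = 3600
cache = ResultCache()
cache.store("SELECT 1", "1", now=0)
hit_at_20h = cache.lookup("SELECT 1", now=20 * HOUR)   # within 24h of storing
hit_at_40h = cache.lookup("SELECT 1", now=40 * HOUR)   # within 24h of the last reuse
miss_at_70h = cache.lookup("SELECT 1", now=70 * HOUR)  # 30h since the last reuse
print(hit_at_20h, hit_at_40h, miss_at_70h)
```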
QUERY RESULT CACHE USE CASES

● Static dashboards
● Queries with significant computing
  ○ Semi-structured data analysis
  ○ Aggregates
● Queries that are run frequently or are complex
● Refine the output of another query
  ○ Use TABLE function RESULT_SCAN(<query id>);

Benefits

● Fast
● Will never give stale results
● No virtual warehouse used


QUERY RESULT CACHE EXAMPLE

Query profile confirms re-use of query result cache

[Screenshot: query profile showing query result reuse]


DATA CACHE

DATA CACHE

● Stores file headers and column data from queries
  ○ Stores the data, not the result
● Stored to SSD in virtual warehouse
● When a similar query is run, Snowflake will use as much data from the data cache as possible
● Available for all queries run on the same virtual warehouse


DATA CACHE

Query Example:

SELECT network FROM rating WHERE (data_stream = 'Live');

What is in the Data Cache?

● Only the partitions satisfying predicates: data_stream = 'Live'
● Only the selected column: in this case, network

Best Practice:

● Group and execute similar queries on the same virtual warehouse to maximize data cache reuse, for performance and cost optimization


HOW DATA CACHE WORKS

● When a query is run, file headers and column data retrieved are stored on SSD
● Virtual warehouse will first read any locally available data, then read remainder from remote cloud storage
● Data is flushed out in a Least Recently Used (LRU) fashion when cache fills
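The read-local-first, evict-least-recently-used behavior can be sketched with a small simulation. It is an illustration only: the DataCache class, the partition names, and the capacity of two entries are hypothetical, and a real warehouse cache works on file headers and column data rather than whole named partitions.

```python
from collections import OrderedDict
from typing import List


class DataCache:
    """Toy LRU cache standing in for a warehouse's local SSD."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._partitions: "OrderedDict[str, bool]" = OrderedDict()

    def read(self, partition: str) -> str:
        if partition in self._partitions:
            # Locally available: mark as most recently used.
            self._partitions.move_to_end(partition)
            return "local"
        # Not cached: fetch from remote storage and keep a local copy.
        self._partitions[partition] = True
        if len(self._partitions) > self.capacity:
            self._partitions.popitem(last=False)  # drop least recently used
        return "remote"

    def cached(self) -> List[str]:
        return list(self._partitions)


cache = DataCache(capacity=2)
sources = [cache.read(p) for p in ["p1", "p2", "p1", "p3", "p2"]]
print(sources)
print(cache.cached())
```

In this run the third read is served locally, while the last read of p2 goes back to remote storage because p2 was evicted when p3 arrived.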


DATA CACHE: COLD EXAMPLE

[Screenshot: query profile, cold data cache]

● Total Execution Time (31.57s)
● 95% of the query cost was spent on remote IO
● 0% of data was scanned from cache


DATA CACHE: WARM EXAMPLE

[Screenshot: query profile, warm data cache]

● Total Execution Time (7.787s)
● Remote IO drops to 19% from previous 95%
● Most query data read from local cache


SUMMARY OF CACHE OPTIONS

● Where is it stored?
  ○ Metadata Cache: Cloud services layer
  ○ Query Result Cache: Cloud services layer
  ○ Data Cache: Warehouse SSD
● What does it store?
  ○ Metadata Cache: Metadata and statistics for micro-partitions and tables
  ○ Query Result Cache: Result set for each query
  ○ Data Cache: Data used in query
● When is it used?
  ○ Metadata Cache: For commands like MIN, MAX, COUNT
  ○ Query Result Cache: Identical query is run again; base table data has not changed
  ○ Data Cache: Query using some or all of the same data is run; data has not changed
● How long does it last?
  ○ Metadata Cache: Continuously updated
  ○ Query Result Cache: 24 hours; clock reset after every run
  ○ Data Cache: While warehouse exists; rotated out on an LRU basis
● Who can use it?
  ○ Metadata Cache: Everyone
  ○ Query Result Cache: For SHOW, user in the same role; for SELECT, user with SELECT permissions on all tables
  ○ Data Cache: Anyone who uses the same warehouse


SUMMARY: LIFE CYCLE OF A QUERY

[Flowchart, built up over several slides]

1. Query from client
2. Result in query result cache?
   ○ YES: Has underlying data changed?
     ■ NO: Return result from query result cache
     ■ YES: continue to step 3
   ○ NO: continue to step 3
3. Metadata result?
   ○ YES: Return result from metadata cache
   ○ NO: continue to step 4
4. Prune and filter
5. Start warehouse
6. Fetch new data not in data cache
7. Execute
8. Store result in query cache
9. Return result
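The decision flow in this summary can be written as a short function. This is a reading of the flowchart, not Snowflake code: the function name and the three boolean inputs are hypothetical stand-ins for checks the service performs internally.

```python
from typing import List


def query_path(in_result_cache: bool,
               data_changed: bool,
               metadata_only: bool) -> List[str]:
    """Return the steps a query would take through the caches."""
    steps = ["query from client"]
    # A cached result is only usable if the underlying data is unchanged.
    if in_result_cache and not data_changed:
        steps.append("return result from query result cache")
        return steps
    # Queries answerable from metadata alone never start a warehouse.
    if metadata_only:
        steps.append("return result from metadata cache")
        return steps
    steps += [
        "prune and filter",
        "start warehouse",
        "fetch new data not in data cache",
        "execute",
        "store result in query cache",
        "return result",
    ]
    return steps


print(query_path(in_result_cache=True, data_changed=False, metadata_only=False))
print(query_path(in_result_cache=True, data_changed=True, metadata_only=True))
print(query_path(in_result_cache=False, data_changed=False, metadata_only=False))
```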


LAB EXERCISE

Explore Snowflake Caching

30 minutes

Tasks:

● Explore Metadata cache
● Explore Data cache
● Explore Query result cache


SQL SUPPORT IN SNOWFLAKE


MODULE AGENDA

● Querying and Filtering
● Collations
● Subqueries
● Query Profile


QUERYING AND FILTERING


QUERY FORMULATION

Component   What it does
SELECT      Specifies which columns/aggregates/scalar transforms to return
FROM        Defines the data set (table or query) to work with
DISTINCT    Returns only unique values
WHERE       Filters values returned by the FROM clause
LIMIT       Limits the number of records in the returned result set
GROUP BY    Defines how data should be grouped in the results
HAVING      Specifies conditions related to the grouped data
ORDER BY    Specifies how the rows should be ordered
JOIN        Joins multiple tables based on common columns
PIVOT       Rotates a table (turns unique values in a column into columns)


SELECT

Specifies which columns, aggregates, or scalar transforms to return

Samples:

SELECT * FROM mytable;

SELECT col3, col2, SUM(col1) FROM mytable GROUP BY col3, col2;

SELECT col1 AS price, CAST(col3 AS DECIMAL(10,2)) FROM mytable;

SELECT CURRENT_ROLE();

SELECT (1+1);


FROM

Table, view, or table function to use in a SELECT statement

Samples:

SELECT * FROM mytable;

SELECT * FROM myview;

SELECT a.col1, b.col2
FROM lefttable a JOIN righttable b ON a.id = b.empid;

SELECT *
FROM TABLE(INFORMATION_SCHEMA.login_history_by_user());


DISTINCT

Returns only unique values

Samples:

SELECT DISTINCT col5 FROM mytable;

SELECT COUNT(DISTINCT col5) FROM mytable;


WHERE

Filters the result of the FROM clause, using a predicate

Samples:

SELECT * FROM invoices WHERE invoice_date < '2018-01-01';

SELECT * FROM invoices
WHERE amount < (SELECT AVG(amount) FROM invoices);

SELECT * FROM invoices
WHERE invoice_date < DATEADD('DAYS', -30, CURRENT_DATE())
AND paid=FALSE;


LIMIT (FETCH)

● Limits the number of rows returned to the value specified
● Does not necessarily return the first n rows (unless the query uses ORDER BY)
● Both LIMIT and FETCH are supported, and return the same results

Table employees:

ID      NAME     SALARY
1       Joe      53000
2       Rajesh   67000
3       Saira    125000
4       Jenn     98750
...     ...      ...
12390   Anis     45890

SELECT * FROM employees
LIMIT 1;

ID      NAME     SALARY
287     June     69340


GROUP BY

● Defines how data should be grouped in the results
● Groups rows that have the same values in a specified column
● Computes aggregate functions for the resulting group if needed

Table inventory:

ITEM_ID   ITEM_TYPE   QTY
1         Pen         57
2         Chair       3
3         Chair       7
4         Table       5
5         Pen         245

SELECT item_type, SUM(qty)
FROM inventory
GROUP BY item_type;

ITEM_TYPE   QTY
Chair       10
Pen         302
Table       5
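The GROUP BY result on this slide can be reproduced locally. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for Snowflake (an assumption made so the example runs without an account); the table and rows mirror the slide, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for a Snowflake table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (item_id INTEGER, item_type TEXT, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [(1, "Pen", 57), (2, "Chair", 3), (3, "Chair", 7),
     (4, "Table", 5), (5, "Pen", 245)],
)

# One output row per distinct item_type, with qty summed within each group.
totals = conn.execute(
    "SELECT item_type, SUM(qty) FROM inventory "
    "GROUP BY item_type ORDER BY item_type"
).fetchall()
print(totals)
```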


HAVING

Specifies conditions related to grouped data

Samples:

SELECT dept FROM employees
GROUP BY dept HAVING COUNT(*) < 10;

SELECT item, SUM(qty) FROM orders
GROUP BY item HAVING SUM(qty) < 100;


ORDER BY

● Orders values in ascending or descending order on a specified column
● Can have multi-level ORDER BY
● By default, sorts in ascending (ASC) order
  ○ Use DESC to sort in descending order

Table inventory:

MFG          ITEM_TYPE   QTY
OfficeMax    Pen         57
OfficeGuys   Chair       3
Staples      Chair       7
Bob's        Table       5
Staples      Pen         245

SELECT item_type, mfg, qty
FROM inventory ORDER BY item_type;

ITEM_TYPE   MFG          QTY
Chair       OfficeGuys   3
Chair       Staples      7
Pen         OfficeMax    57
Pen         Staples      245
Table       Bob's        5


JOIN

Combines rows from two tables/views/functions based on common columns

Samples:

SELECT lefttable.col1, righttable.col2
FROM lefttable JOIN righttable
ON lefttable.last = righttable.last_name;

SELECT * FROM lefttable a JOIN righttable b ON a.key = b.key;

SELECT t1.*, t2.*, t3.* FROM t1
LEFT OUTER JOIN t2 ON (t1.key = t2.key)
RIGHT OUTER JOIN t3 ON (t3.id = t2.empid);


PIVOT

Turns unique values in a column into multiple columns

Table monthly_sales:

EmpID   Sales   Month
1       10000   JAN
1       8000    JAN
2       12000   JAN
2       4000    JAN
1       6000    FEB
1       13000   FEB
2       15000   FEB
1       10000   MAR
2       3000    MAR
1       25000   APR
2       31000   APR

SELECT * FROM monthly_sales
PIVOT (SUM(sales) FOR month
IN ('JAN', 'FEB', 'MAR', 'APR'))
ORDER BY empid;

EmpID   JAN     FEB     MAR     APR
1       18000   19000   10000   25000
2       16000   15000   3000    31000


PIVOT

Turns unique values in a column into multiple columns

SELECT * FROM monthly_sales
PIVOT (SUM(sales)                                  -- cell values
       FOR month IN ('JAN', 'FEB', 'MAR', 'APR'))  -- columns
ORDER BY empid;
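To make the rotation concrete, here is a rough pure-Python equivalent of the PIVOT on the monthly_sales rows shown earlier. It is a sketch of the idea rather than of how Snowflake evaluates PIVOT: the pivot_sum helper is hypothetical, and it fills missing employee/month combinations with 0, which is a simplification.

```python
from collections import defaultdict

# (empid, sales, month) rows copied from the monthly_sales slide.
monthly_sales = [
    (1, 10000, "JAN"), (1, 8000, "JAN"), (2, 12000, "JAN"), (2, 4000, "JAN"),
    (1, 6000, "FEB"), (1, 13000, "FEB"), (2, 15000, "FEB"),
    (1, 10000, "MAR"), (2, 3000, "MAR"),
    (1, 25000, "APR"), (2, 31000, "APR"),
]
months = ["JAN", "FEB", "MAR", "APR"]


def pivot_sum(rows, columns):
    """Sum sales per employee, spreading the month values out as columns."""
    totals = defaultdict(lambda: dict.fromkeys(columns, 0))
    for emp_id, sales, month in rows:
        if month in columns:  # months outside the IN list are ignored
            totals[emp_id][month] += sales
    # One output row per employee, ordered by employee id.
    return [(emp_id, *[totals[emp_id][m] for m in columns])
            for emp_id in sorted(totals)]


pivoted = pivot_sum(monthly_sales, months)
for row in pivoted:
    print(row)
```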


COLLATIONS

COLLATION

● Collate: to collect and combine in proper order
  ○ Proper order is open to interpretation
● With Snowflake, strings are stored internally in UTF-8
  ○ By default, strings are sorted by the numeric (UTF-8) representation of their characters
  ○ Upper-case letters have a lower numeric representation than lower-case letters
  ○ Example:
      Anne
      Bob
      Jerry
      charles
      deborah
      klaus
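The default ordering in the example above can be checked with a few lines of Python, whose plain string sort also compares characters by their numeric code points. This only illustrates the principle; it is not Snowflake code, and the case-insensitive variant is shown just for contrast.

```python
names = ["klaus", "Bob", "deborah", "Anne", "charles", "Jerry"]

# Upper-case letters have lower code points than lower-case letters.
print(ord("A"), ord("a"))

# Plain sort: compares code points, so every upper-case name comes first.
by_code_point = sorted(names)
print(by_code_point)

# For contrast: ignoring case gives the order most readers would expect.
case_insensitive = sorted(names, key=str.casefold)
print(case_insensitive)
```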


SUPPORTED COLLATIONS

SPECIFY HOW TEXT SHOULD BE SORTED/COMPARED

● Language locale: language- and country-specific rules to apply
  ○ Examples: en, en_US, fr, fr_CA, etc.
● Case sensitivity: whether to consider case when comparing values
  ○ Examples: cs or ci
● Accent sensitivity: whether to consider accented characters equal to, or different from, their base characters
  ○ Examples: as or ai
● Punctuation sensitivity: whether non-letter characters matter
  ○ Examples: ps or pi
● First letter preference: whether to sort upper-case or lower-case letters first
  ○ Examples: fl or fu
● Case conversion: convert to upper or lower case before comparisons
  ○ Examples: upper or lower
● Space trimming: remove leading spaces, trailing spaces, or both
  ○ Examples: trim, ltrim, rtrim


SPECIFYING COLLATIONS ON COLUMNS

CREATE TABLE test (col1 VARCHAR, col2 VARCHAR collate 'fr', col3 VARCHAR collate 'es');

INSERT INTO test VALUES
  ('Apple', 'Apple', 'Apple'),
  ('apple', 'apple', 'apple'),
  ('AB CD', 'AB CD', 'AB CD'),
  ('ABC', 'ABC', 'ABC'),
  ('pino', 'pino', 'pino'),
  ('piñata', 'piñata', 'piñata'),
  ('ab\'cd', 'ab\'cd', 'ab\'cd'),
  ('abc', 'abc', 'abc');

SELECT col1 FROM test;                 -- unsorted
SELECT col1 FROM test ORDER BY col1;   -- sorted by utf8
SELECT col1 FROM test ORDER BY col2;   -- sorted by french
SELECT col1 FROM test ORDER BY col3;   -- sorted by spanish

unsorted   sorted by utf8   sorted by french   sorted by spanish
COL1       COL1             COL1               COL1
Apple      AB CD            AB CD              AB CD
apple      ABC              ab'cd              ab'cd
AB CD      Apple            abc                abc
ABC        ab'cd            ABC                ABC
pino       abc              apple              apple
piñata     apple            Apple              Apple
ab'cd      pino             piñata             pino
abc        piñata           pino               piñata


SPECIFYING COLLATIONS IN COMPARISONS

CREATE TABLE test (c1 VARCHAR, c2 VARCHAR);

INSERT INTO test VALUES
  ('its', 'it\'s'),
  ('log in', 'login'),
  ('MyTable', 'mytable');

C1        C2
its       it's
log in    login
MyTable   mytable

SELECT * FROM test
WHERE c1=c2;

C1        C2
(no rows)

SELECT * FROM test
WHERE c1=c2 collate 'en-ci';

C1        C2
MyTable   mytable

SELECT * FROM test
WHERE c1=c2 collate 'en-ci-pi';

C1        C2
its       it's
log in    login
MyTable   mytable

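The effect of the ci and pi specifiers in these comparisons can be approximated in plain Python. This is a loose emulation, not Snowflake's collation engine: the normalize and matching_rows helpers are hypothetical, and treating whitespace as punctuation is an assumption chosen so that the output matches the results shown on the slide.

```python
import string

rows = [("its", "it's"), ("log in", "login"), ("MyTable", "mytable")]


def normalize(value: str,
              case_insensitive: bool = False,
              punctuation_insensitive: bool = False) -> str:
    """Reduce a string to the form that is actually compared."""
    if punctuation_insensitive:
        # Assumption: spaces are ignored along with punctuation marks.
        ignored = set(string.punctuation) | set(string.whitespace)
        value = "".join(ch for ch in value if ch not in ignored)
    if case_insensitive:
        value = value.casefold()
    return value


def matching_rows(**options):
    """Rows where c1 equals c2 under the given comparison options."""
    return [(c1, c2) for c1, c2 in rows
            if normalize(c1, **options) == normalize(c2, **options)]


exact = matching_rows()
ci = matching_rows(case_insensitive=True)
ci_pi = matching_rows(case_insensitive=True, punctuation_insensitive=True)
print(exact)
print(ci)
print(ci_pi)
```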
FUNCTIONS THAT SUPPORT COLLATION

● [NOT] BETWEEN
● CASE
● COALESCE
● CONCAT, ||
● CONTAINS
● DECODE
● ENDSWITH
● EQUAL_NULL
● GET_DDL
● GREATEST
● IFF
● IFNULL
● LEAST
● LEFT
● LISTAGG
● LPAD
● MIN / MAX
● NULLIF
● NVL
● NVL2
● RIGHT
● RPAD
● STARTSWITH


SUBQUERIES

SUBQUERY OVERVIEW

● A subquery is a query within another query

Outer/main query            SELECT <columns>
(generally executed 2nd)    FROM <table>
                            WHERE column_name <expression> <operator>
Subquery                        ( SELECT <columns>
(generally executed 1st)          FROM <table>
                                  WHERE …
                                );


ADDITIONAL SUBQUERY INFO

● Can be in WHERE, HAVING, and FROM clauses
● Can be used with SELECT, UPDATE, INSERT, DELETE along with an expression operation
● Subqueries are on the right side of the comparison operator:

SELECT rep, total_sales
FROM sales_results
WHERE total_sales >
  (SELECT AVG(total_sales)
   FROM sales_results);
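The subquery above is standard SQL, so it can be tried without a Snowflake account. The sketch below runs the same statement against an in-memory SQLite database from Python's standard library. The sample rows are invented for illustration, and SQLite is used only as a stand-in engine.

```python
import sqlite3

# Hypothetical sample data: (rep, total_sales).
SAMPLE = [("Ann", 100.0), ("Bob", 200.0), ("Cy", 300.0)]


def reps_above_average(rows):
    """Return reps whose total_sales exceeds the average of all reps."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales_results (rep TEXT, total_sales REAL)")
    con.executemany("INSERT INTO sales_results VALUES (?, ?)", rows)
    result = con.execute(
        "SELECT rep, total_sales FROM sales_results "
        "WHERE total_sales > (SELECT AVG(total_sales) FROM sales_results) "
        "ORDER BY rep"
    ).fetchall()
    con.close()
    return result


print(reps_above_average(SAMPLE))
```

The inner SELECT yields a single value (the average), which the outer query then uses as the right-hand side of the comparison.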


USE TEMP TABLE FOR REPETITIVE SUBQUERIES

If you're using the same subquery multiple times, use a temporary table to materialize the subquery results:

● Improves performance by reducing repeated I/O
● Saves cost by reducing repeated computation
● Temporary table exists only within the session
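A minimal sketch of the pattern, again using SQLite from the Python standard library as a stand-in engine: the average is computed once into a temporary table and then reused by two separate queries. Table and column names (avg_sales, avg_total) are hypothetical.

```python
import sqlite3


def split_by_average(rows):
    """Materialize the average once, then reuse it in two queries."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales_results (rep TEXT, total_sales REAL)")
    con.executemany("INSERT INTO sales_results VALUES (?, ?)", rows)
    # The repeated subquery is evaluated a single time here.
    con.execute(
        "CREATE TEMPORARY TABLE avg_sales AS "
        "SELECT AVG(total_sales) AS avg_total FROM sales_results"
    )
    above = con.execute(
        "SELECT rep FROM sales_results, avg_sales "
        "WHERE total_sales > avg_total ORDER BY rep"
    ).fetchall()
    below = con.execute(
        "SELECT rep FROM sales_results, avg_sales "
        "WHERE total_sales < avg_total ORDER BY rep"
    ).fetchall()
    con.close()  # the temporary table disappears with the connection
    return [r[0] for r in above], [r[0] for r in below]


print(split_by_average([("Ann", 100.0), ("Bob", 200.0), ("Cy", 300.0)]))
```

As in Snowflake, the temporary table here lives only as long as the session (connection) that created it.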


QUERY PROFILE


QUERY STATISTICS - HISTORY TAB

● Provides a tabular, high-level view of each query
● Includes basic performance metrics (duration, bytes scanned, rows)
● Color-coded to provide quick insights


● Hover over a numeric value to see its breakdown

[Figure: History tab duration breakdown into Compilation, Execution, Local I/O, and Remote I/O]


ACCESS THE QUERY PROFILE

● Multiple ways to access the SQL Profiler:
  ○ History tab
  ○ Query results pane in worksheet


QUERY PROFILE - DETAILS TAB

● Clicking on a Query ID brings up this view:

[Screenshot: query details view]


QUERY PROFILE - PROFILE TAB


QUERY PROFILE - MORE DETAILS


MULTI-STEP QUERIES


READING A QUERY PROFILE

● Each node in the query profile represents a step in the execution of the query
● Click to filter the statistics on the right


NODE EXECUTION TIME

● Initialization = setup activities prior to processing
● Processing = CPU data processing
● Local Disk I/O = blocked on local SSD on node
● Remote Disk I/O = blocked on remote cloud storage
● Network Communication = blocked on network data transfer
● Synchronization = sync activities between processes


STATISTICS

● Scan progress = % of data scanned for table thus far
● Bytes scanned = local + remote I/O
● Percentage scanned from cache = local / (local + remote)
● External bytes scanned = from external object (stage)
● Bytes sent over network = peer-to-peer data exchange
● Partitions scanned = number of micro-partitions read
● Partitions total = number of micro-partitions in table
● Bytes spilled to local storage = written to local SSD on node (insufficient memory)
● Bytes spilled to remote storage = written to remote cloud storage (insufficient local SSD)
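Two of these statistics are simple ratios, and working them through by hand can help when reading a profile. The sketch below assumes you have copied the byte and partition counts out of the profile yourself. The function names are hypothetical and are not part of any Snowflake API.

```python
def percentage_scanned_from_cache(local_bytes, remote_bytes):
    """local / (local + remote), expressed as a percentage."""
    total = local_bytes + remote_bytes
    return 0.0 if total == 0 else 100.0 * local_bytes / total


def partition_scan_ratio(partitions_scanned, partitions_total):
    """Fraction of the table's micro-partitions that were actually read."""
    if partitions_total == 0:
        return 0.0
    return partitions_scanned / partitions_total


# Example: 750 bytes served from local cache, 250 bytes from remote storage.
print(percentage_scanned_from_cache(750, 250))
# Example: 20 of 200 micro-partitions read (a low ratio suggests good pruning).
print(partition_scan_ratio(20, 200))
```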


DATA LOADING


MODULE AGENDA

● Concepts
● Bulk Loading Overview
● Data Loading Recommendations
● Data Loading Transformations and Validations
● Continuous Data Loading
● Data Unloading


CONCEPTS


DATA LOADING STEPS

Output Data to Files → Stage Files to Cloud Storage → Load Data from Cloud Storage Into Snowflake


DATA LOADING IN SNOWFLAKE

Input data:
● Structured data (CSV, TSV, …)
● Semi-structured data (JSON, Avro, XML, Parquet, …)

Metadata created (cloud services layer: infrastructure manager, optimizer, metadata manager, security):
● Statistics for databases, tables, columns, & files calculated & stored in cloud services layer

Database data stored:
● Actual data in Snowflake databases & tables
● Stored in Snowflake-managed cloud storage (shown as S3 in the diagram)
● Optimized, proprietary file format
● Automatically compressed & encrypted


DATA LOADING TERMINOLOGY

● File Format
● Stage
● Database / Schema / Table
● Pipe


FILE FORMAT

● Named object that stores information to parse files during load/unload
  ○ File type (CSV, JSON, etc.)
  ○ Type-specific formatting options
● Specify FILE FORMAT object:
  ○ As part of a table
  ○ As part of a stage
  ○ In the COPY INTO command

Example file:

Col1,Col2,Col3,Col4
123,ABC,987,FED
234,BCD,876,EDC
345,CDE,765,DCB

CREATE FILE FORMAT DEMO_FF
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  RECORD_DELIMITER = '\n'
  SKIP_HEADER = 1;
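To make the three options in DEMO_FF concrete, the sketch below parses the slide's sample file with Python's csv module, using parameters named after the file format options. This only illustrates what the options mean. The parse_file helper is hypothetical and does not reproduce Snowflake's parser.

```python
import csv

SAMPLE = (
    "Col1,Col2,Col3,Col4\n"
    "123,ABC,987,FED\n"
    "234,BCD,876,EDC\n"
    "345,CDE,765,DCB\n"
)


def parse_file(text, field_delimiter=",", record_delimiter="\n", skip_header=1):
    """Split into records, drop header lines, then split records into fields."""
    records = [r for r in text.split(record_delimiter) if r]
    reader = csv.reader(records[skip_header:], delimiter=field_delimiter)
    return list(reader)


for row in parse_file(SAMPLE):
    print(row)
```

With SKIP_HEADER = 1 the Col1..Col4 line is discarded, so only the three data records remain.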


STAGE

● Cloud file repository
● Simplifies and streamlines bulk loading and unloading
● Can be internal or external
  ○ Internal: stored internally in Snowflake
  ○ External: stored in an external location
● Best Practice: Create stage object to manage ingestion workload


TYPES OF STAGES

Table Stage (@%[TABLE_NAME]) and User Stage (@~):
● Created automatically
● Internal
● Do not support setting file formats

Named Stage (@[STAGE_NAME]):
● Created manually
● Internal or External
● Can specify file format


WORKFLOW

CREATE STAGE → CREATE FILE FORMAT → CREATE TABLE → COPY INTO TABLE


BULK LOADING

[Diagram labels: Local Files or Cloud Storage; External Stages; Internal Stages]


LOADING FROM LOCAL FILE SYSTEM

[Diagram: bulk loading from local file system]


EXAMPLE FROM LOCAL STORAGE

Named internal stage:

CREATE STAGE my_stage
  FILE_FORMAT = my_csv_format;

PUT data to stage:

PUT file:///data/data.csv @my_stage;

COPY data to table:

COPY INTO my_table FROM @my_stage;

Note: Compression during the PUT uses local resources. The local host needs sufficient memory, and space in /tmp, or the PUT will fail.


LOAD FROM CLOUD STORAGE



EXAMPLE FROM CLOUD STORAGE

Create external stage (pointer to S3 bucket):

CREATE STAGE my_s3_stage
  URL = 's3://mybucket/encrypted_files/'
  CREDENTIALS = (aws_key_id = '******'
                 aws_secret_key = '******')
  ENCRYPTION = (master_key = '********')
  FILE_FORMAT = my_csv_format;

Load data using COPY:

COPY INTO my_table
  FROM @my_s3_stage
  PATTERN = '.*sales.*.csv';
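PATTERN takes a regular expression, not a shell glob, which is easy to overlook. The sketch below uses Python's re module to show which hypothetical file paths the slide's pattern would select, assuming the pattern has to match the whole path. The file names are invented for illustration.

```python
import re

# Hypothetical staged file paths.
PATHS = [
    "2020/sales_jan.csv",
    "2020/sales_feb.csv",
    "2020/returns_jan.csv",
    "2020/sales_notes_csv",
]


def files_matching(pattern, paths):
    """Keep paths that the regex matches in full (an assumption of this sketch)."""
    regex = re.compile(pattern)
    return [p for p in paths if regex.fullmatch(p)]


# The slide's pattern: the unescaped '.' before 'csv' matches any character.
print(files_matching(".*sales.*.csv", PATHS))
# Escaping the dot restricts the match to a literal '.csv' suffix.
print(files_matching(r".*sales.*\.csv", PATHS))
```

Because '.' matches any character in a regex, the first pattern also accepts a name ending in "_csv". Escaping the dot avoids that.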


COPY COMMAND PARAMETERS

Optional items are shown in square brackets (blue on the original slide)

COPY INTO [<namespace>.]<table name>
  FROM { <stage> | <external location> }
  [ FILES = ( '<file name>' [ , '<file name>' ] [ , ... ] ) ]
  [ PATTERN = '<regex pattern>' ]
  [ FILE_FORMAT = ( { FORMAT_NAME = '[<namespace>.]<format name>' |
                      TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ <format type options> ] } ) ]
  [ <copy options> ]
  [ VALIDATION_MODE = RETURN_<n>_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS ]


COPY COMMAND OPTIONS

The COPY command has several options for handling data and errors (defaults are shown in blue on the original slide)

ON_ERROR = { CONTINUE | SKIP_FILE | SKIP_FILE_<num> | SKIP_FILE_<num>% | ABORT_STATEMENT }
SIZE_LIMIT = <num bytes>
PURGE = TRUE | FALSE
RETURN_FAILED_ONLY = TRUE | FALSE
ENFORCE_LENGTH = TRUE | FALSE
TRUNCATECOLUMNS = TRUE | FALSE
FORCE = TRUE | FALSE
LOAD_UNCERTAIN_FILES = TRUE | FALSE


DATA LOADING RECOMMENDATIONS

● File size
● Location path
● File type and format
● Other considerations (concurrency, compression, etc.)


FILE SIZE AND NUMBER

● File size and number of files are crucial to optimizing load performance
● Split large files before loading into Snowflake
● Recommended: 10MB to 100MB (compressed)

Warehouse Size    # files in parallel
XS                8
S                 16
M                 32
L                 64
XL                128
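The table and the size recommendation combine into a quick back-of-the-envelope plan. The sketch below is a hypothetical helper, not a Snowflake tool: it estimates how many files a data set should be split into, and how many "waves" of parallel loads a warehouse of a given size would need, assuming the parallelism figures from the slide.

```python
import math

# Files loaded in parallel per warehouse size, as listed on the slide.
PARALLEL_FILES = {"XS": 8, "S": 16, "M": 32, "L": 64, "XL": 128}


def plan_split(total_compressed_mb, target_mb=100):
    """Number of files needed so that each is about target_mb compressed."""
    if not 10 <= target_mb <= 100:
        raise ValueError("target_mb should stay within the recommended 10-100 MB")
    return max(1, math.ceil(total_compressed_mb / target_mb))


def load_waves(file_count, warehouse_size):
    """How many rounds of parallel loading the warehouse needs."""
    return math.ceil(file_count / PARALLEL_FILES[warehouse_size])


files = plan_split(6400)          # 6,400 MB of compressed data
print(files, load_waves(files, "XS"), load_waves(files, "L"))
```

A single very large file cannot be spread across these parallel slots, which is why splitting before loading matters.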


SERIAL COPY VS PARALLEL COPY

[Diagram labels: 200 TB data files; virtual warehouse XLARGE at 2% usage vs. XLARGE at 95% usage; Snowflake tables]

Note: Single COPY Command

FILE ORGANIZATION

● Organize data in logical paths (e.g., subject area and create date)

/system/market/daily/2018/09/05/

● Use wildcards late in the file path definition, to reduce scanning:

COPY INTO table FROM /system/market/daily/2018/*


FILE LOCATIONS

● Single Location with Many Files
  ○ Slower: must scan more files to find needed files
● Many Locations with Fewer Files
  ○ Faster: targeted directories allow for scanning fewer files


LAB EXERCISE

Load Structured Data Using the UI

25 minutes

Note: This lab uses an external stage that has already been created for you:

@DBHOL.SCHOL.AWS_LOAD1 (@<database>.<schema>.<stage>)

Tasks:
● Create tables and file formats
● Load the REGION table from your external stage


DATA LOADING TRANSFORMATIONS & VALIDATIONS


TRANSFORMING DATA DURING LOAD

● The COPY command supports column reordering, column omission, and CAST using a SELECT statement
  ○ NOT SUPPORTED: Joins, filters, aggregations
  ○ Can include SEQUENCE columns, current_timestamp(), or other column functions during data load
● The VALIDATION_MODE parameter does not support transformations in COPY statements


TRANSFORMING DATA DURING LOAD EXAMPLES

COPY INTO home_sales (city, zip, sale_date, price)
  FROM (SELECT SUBSTR(t.$2,4), t.$1, t.$5, t.$4 FROM @my_stage t)
  FILE_FORMAT = (FORMAT_NAME = MYCSVFORMAT);

COPY INTO casttb (col1, col2, col3)
  FROM (SELECT TO_BINARY(t.$1, 'utf-8'),
               TO_DECIMAL(t.$2, '99.9', 9, 5),
               TO_TIMESTAMP_NTZ(t.$3)
        FROM @~/datafile.csv.gz t)
  FILE_FORMAT = (TYPE = CSV);
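The first COPY statement reorders and trims positional columns ($1, $2, ...) on the way in. The sketch below mimics that mapping in plain Python for a single CSV line, to make the 1-based positions and SUBSTR(x, 4) explicit. The sample record is invented, and the helper names are hypothetical.

```python
import csv
import io

# Hypothetical staged record: zip, "ST-City", beds, price, sale date.
SAMPLE = "95814,CA-Sacramento,3,59222,2008-05-21\n"


def transform(fields):
    """Map positional columns to home_sales(city, zip, sale_date, price)."""
    def col(n):
        return fields[n - 1]          # $n is 1-based

    return {
        "city": col(2)[3:],           # SUBSTR($2, 4): from the 4th character on
        "zip": col(1),
        "sale_date": col(5),
        "price": col(4),
    }


def load(text):
    return [transform(row) for row in csv.reader(io.StringIO(text))]


print(load(SAMPLE))
```

Column $3 is simply not selected, which is what column omission means here.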


COPY INTO VS INSERT

● Snowflake is optimized for bulk load and batched DML using the COPY INTO command
● Use COPY INTO to load data rather than INSERT with SELECT
● Use INSERT only if needed for transformations not supported by COPY INTO
● Batch INSERT statements
  ○ INSERT w/ SELECT
  ○ CREATE TABLE AS SELECT (CTAS)
  ○ Minimize frequent single-row DMLs


QUERYING STAGE DATA

[Screenshot: query against staged data, with callouts "Several functions can be used", "File format", and "Add filters"]


DATA LOAD MANAGEMENT

● Error handling
● Validating before and after load
● Monitoring


ERROR HANDLING

Use the ON_ERROR option to control how errors in the data load are handled


VALIDATE BEFORE LOAD

Execute the COPY command in validation mode using VALIDATION_MODE

COPY INTO my_table
  FROM @my_stage/mylife.csv.gz
  VALIDATION_MODE = return_all_errors;

SET qid = LAST_QUERY_ID();

SELECT rejected_record FROM TABLE(result_scan($qid));
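The idea behind a validation pass is to scan the whole file and report every problem without loading anything. The sketch below illustrates that idea locally with a hypothetical validate helper that checks field counts and numeric columns. It is not how Snowflake validates files, and the sample data is invented.

```python
import csv
import io

SAMPLE = "1,apple,0.50\n2,pear\n3,plum,cheap\n"


def validate(text, expected_columns, numeric_columns=()):
    """Return (line number, message) for every problem found; load nothing."""
    errors = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) != expected_columns:
            errors.append(
                (line_no, f"expected {expected_columns} fields, found {len(row)}")
            )
            continue
        for idx in numeric_columns:
            try:
                float(row[idx])
            except ValueError:
                errors.append(
                    (line_no, f"column {idx + 1} is not numeric: {row[idx]!r}")
                )
    return errors


for problem in validate(SAMPLE, expected_columns=3, numeric_columns=(2,)):
    print(problem)
```

Collecting all errors in one pass is the behaviour that RETURN_ALL_ERRORS names, as opposed to stopping at the first bad record.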


VALIDATE AFTER LOAD

Validates the files loaded in a past execution of COPY INTO and returns all errors encountered during the load, rather than just the first error

SELECT * FROM TABLE(VALIDATE(mytable, job_id => '<query_id>'));


MONITORING

● Monitor the status of each COPY command run on the History tab page of the Snowflake UI
● Use the LOAD_HISTORY Information Schema view to retrieve the history of data loaded into tables using the COPY command

SELECT * FROM information_schema.load_history
  WHERE SCHEMA_NAME = current_schema() AND
        TABLE_NAME = 'my_table' AND
        LAST_LOAD_TIME > 'Fri, 01 APR 2016 16:00:00 -0800';


LAB EXERCISE

Use VALIDATION_MODE and ON_ERROR

10 minutes

Tasks:
● Detect file format problems with VALIDATION_MODE
● Load data with various ON_ERROR settings


CONTINUOUS DATA LOADING


DATA LOADING APPROACHES

● BULK (using COPY)
● CONTINUOUS (using Snowpipe)


USE CASE

COPY command (BATCH):
● Migration from traditional data sources
● Transaction boundary control
  ○ BEGIN / START TRANSACTION / COMMIT / ROLLBACK
● Independently scale compute resources for different ingestion workloads

Snowpipe (CONTINUOUS):
● Ingestion from modern data sources
● Continuously generated data is available for analysis in seconds
● No scheduling (with auto-ingest)
● Serverless model needs no user-managed virtual warehouse

Batch → Micro-batch → Continuous


PIPE

● Source application loads stage
● Files moved from stage to ingestion queue
● Pipe contains a COPY INTO statement
  ○ Source stage for data files
  ○ Target table
● Loads data into tables continuously from an ingestion queue
● Can be paused/resumed, return status

COPY INTO <table>
  FROM <stage>
  FILE_FORMAT = <format>

[Diagram: Source Application → Stage → Ingestion Queue → Pipe → TABLE]


SNOWPIPE REST API

[Diagram labels: Application; REST Call {file names}; Snowpipe Service (REST Endpoint, Server-less Loader); Cloud Provider Storage; Snowflake Database]

CREATE PIPE IF NOT EXISTS mypipe AS COPY INTO mytable FROM @mystage;


AUTO-INGEST

[Diagram labels: External Cloud Storage; notification; File data; Snowpipe Service (Server-less Loader); Snowflake Database]

CREATE PIPE Snowpipe_db.public.mypipe AUTO_INGEST=TRUE AS
  COPY INTO Snowpipe_db.public.mytable
  FROM @Snowpipe_db.public.mystage
  FILE_FORMAT = (TYPE = 'JSON');


FILE LOAD ORDER

● Staged files are moved into an ingest queue
● Files that appear in the stage later are appended to the queue
● Multiple processes pull from the queue
● Older files are generally loaded first, but files are not guaranteed to be loaded in the same order they are staged

[Diagram: Stage → Ingestion Queue]
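Why order is not guaranteed follows directly from having several processes pull from one queue: a small file taken later can finish before a large file taken earlier. The sketch below is a toy simulation under stated assumptions (a fixed number of workers, a made-up load time per file). It does not model Snowpipe's actual scheduler.

```python
import heapq


def completion_order(files, workers=2):
    """files: list of (name, load_seconds) in staging order.

    Each file goes to whichever worker becomes free first; the result lists
    file names in the order their loads finish.
    """
    free_at = [(0.0, w) for w in range(workers)]   # (time worker is free, id)
    heapq.heapify(free_at)
    finished = []
    for name, seconds in files:
        start, worker = heapq.heappop(free_at)
        end = start + seconds
        finished.append((end, name))
        heapq.heappush(free_at, (end, worker))
    return [name for _, name in sorted(finished)]


# f1 is staged first but is large; f2 and f3 are small.
print(completion_order([("f1", 10), ("f2", 1), ("f3", 1)]))
```

With a single worker the staging order is preserved; with two or more it generally is not.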


SNOWPIPE RECOMMENDATIONS

● File sizes 10-100MB compressed
● Stage files no more than once per minute
  ○ Overhead to manage files increases in relation to the number of files queued
● Currently supported for Amazon AWS and Microsoft Azure


SNOWPIPE BILLING

● Serverless model: does not require a virtual warehouse
● Snowflake provides and manages compute resources
  ○ Capacity grows and shrinks depending on load
● Accounts charged based on actual compute usage
  ○ Don't need to worry about suspending a warehouse
  ○ Charged per-second, per-core
● Utilization cost of 0.06 credits per 1000 files notified via REST calls or auto-ingest
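The per-file charge is easy to estimate from the rate quoted on the slide. The sketch below covers only that file-notification overhead. Compute usage is billed separately and is not modelled, and the function name is hypothetical.

```python
# Rate quoted on the slide: 0.06 credits per 1000 files notified.
CREDITS_PER_1000_FILES = 0.06


def file_overhead_credits(files_notified):
    """Credits charged for file notifications only (compute is extra)."""
    return files_notified / 1000 * CREDITS_PER_1000_FILES


# Example: 50,000 small files versus the same data in 500 larger files.
print(file_overhead_credits(50_000))
print(file_overhead_credits(500))
```

The comparison shows one reason the file-size recommendation matters for Snowpipe: many tiny files raise the overhead charge.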


DATA UNLOADING


UNLOAD TO LOCAL FILE SYSTEM


UNLOAD TO CLOUD

● Use COPY INTO
● Recommend using external named stage
● Can also unload directly by specifying the URI and any necessary credentials


DATA UNLOADING CONSIDERATIONS

● File formats
● Empty strings vs NULL values
● Unloading relational table to semi-structured format
● File splitting
● Compression


FILE FORMATS

Unload data into:

● Any flat, delimited plain text format (CSV, TSV, etc.)
● JSON
  ○ Data must be unloaded from a column of VARIANT data type
● Parquet
  ○ Use a SELECT statement to unload a table to Parquet as multiple columns


EMPTY STRINGS VS NULL VALUES

Handle NULLs and empty strings appropriately on unload

● Enclose strings in double or single quotes:

FIELD_OPTIONALLY_ENCLOSED_BY = 'character' | NONE

● Determine how empty fields are handled:

EMPTY_FIELD_AS_NULL = TRUE | FALSE

● Convert SQL NULL values:

NULL_IF = ( 'string1' [ , 'string2' ... ] )


FILE SPLITTING

● Use the option SINGLE to create a single file
  ○ May still need to create multiple files, depending on the table size
  ○ The file size limit for single-file mode is 5GB (AWS and GCP), or 256 MB (Azure)

COPY INTO @mystage/myfile.csv.gz FROM mytable
  FILE_FORMAT = (TYPE=csv COMPRESSION='gzip')
  SINGLE=TRUE;

● Possibly combine with MAX_FILE_SIZE


MAXIMUM FILE SIZE

● Set MAX_FILE_SIZE to create files larger or smaller than the default (about 16MB)
  ○ Depending on cloud limits (5GB for AWS and GCP, 256 MB for Azure)

COPY INTO @mystage/myfile.csv.gz FROM mytable
  FILE_FORMAT = (TYPE=csv COMPRESSION='gzip')
  SINGLE=TRUE MAX_FILE_SIZE=2000000000;


COMPRESSION

By default, unloaded files are compressed using gzip

● Type of compression controlled with the COMPRESSION option
  ○ GZIP
  ○ BZ2
  ○ BROTLI
  ○ ZSTD
  ○ DEFLATE
  ○ RAW_DEFLATE
  ○ NONE


DATA UNLOADING ADDITIONAL FUNCTIONALITY

● Using queries in unload
● Listing files in stage
● Removing files from stage


UNLOAD WITH SELECT

● SELECT queries in COPY statements support the full syntax of Snowflake SQL queries

COPY INTO @my_stage
  FROM (SELECT column1, column2 FROM my_table)
  FILE_FORMAT = (FORMAT_NAME = 'my_format');

● JOIN clauses enable unloading data from multiple tables

COPY INTO @my_stage
  FROM (SELECT name, id1
        FROM my_table1
        JOIN my_table2 ON id1 = id2)
  FILE_FORMAT = (FORMAT_NAME = 'my_format');


LIST COMMAND

List the files stored in a stage object

LIST @%monthly_sales_agg;


REMOVE COMMAND

Removes files that have been stored in an internal stage

REMOVE @%monthly_sales_agg PATTERN='.*data*.';

LIST @%monthly_sales_agg;


LAB EXERCISE

Unload Structured Data

30 minutes

Tasks:
● Unload a pipe-delimited file
● Unload part of a table
● JOIN and unload


FUNCTIONS


MODULE AGENDA

● Snowflake Functions Overview
● High-Performing Functions
● User-Defined Functions
● Stored Procedures


OVERVIEW


SUPPORTED FUNCTION TYPES

Type        Description
Scalar      Takes a single value, returns a single value.
Aggregate   Performs a calculation on a set of values, and returns a single value.
Window      Aggregate functions that operate on a subset of rows within the input rows.
Table       Produces a collection of rows (either a nested table or a varray) that can be queried like a physical database table. Use in the FROM clause of a query.
System      Used to execute actions in the system, or return information about the system.


SCALAR FUNCTIONS

Performs a calculation on a single value, and returns a single value

Input (mytable)    Result
Value              Value
1                  1.00
8                  8.00
12                 7.00
7

SELECT CAST(value AS DECIMAL(5,2))
FROM mytable
WHERE value < 10;


AGGREGATE FUNCTIONS

● Performs a calculation on a set of values, and returns a single value

Input (mytable)    Result
Value              Value
1                  28
8
12
7

SELECT SUM(value)
FROM mytable;

● Often used with the GROUP BY clause of the SELECT statement
● Except for COUNT(*), aggregate functions ignore NULL values.
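How aggregates treat NULL is easiest to see by running them. The sketch below uses SQLite from the Python standard library as a stand-in engine and compares SUM, COUNT on a column, and COUNT(*) over the slide's four values, with and without an extra NULL row. The behaviour shown is standard SQL, but this has been run against SQLite, not Snowflake.

```python
import sqlite3


def aggregate_demo(values):
    """Return (SUM(value), COUNT(value), COUNT(*)) for the given values."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (value INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?)", [(v,) for v in values])
    row = con.execute(
        "SELECT SUM(value), COUNT(value), COUNT(*) FROM mytable"
    ).fetchone()
    con.close()
    return row


print(aggregate_demo([1, 8, 12, 7]))
print(aggregate_demo([1, 8, None, 12, 7]))   # None is inserted as SQL NULL
```

Only COUNT(*) counts the NULL row; SUM and COUNT(value) skip it.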


WINDOW FUNCTIONS

Aggregate functions that work on a subset (window) of the input rows

Input (aug_sales):

DATE          SALES
2019-08-01    8.73
2019-08-02    129.95
2019-08-03    13.75
2019-08-04    21.75
2019-08-05    115.87

SELECT date, sales, SUM(sales)
  OVER(ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
  AS mtd FROM aug_sales;

Result:

DATE          SALES     MTD
2019-08-01    8.73      8.73
2019-08-02    129.95    138.68
2019-08-03    13.75     152.43
2019-08-04    21.75     174.18
2019-08-05    115.87    290.05
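The window frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW amounts to a running total over the rows in date order. The sketch below recomputes the month-to-date column in plain Python with itertools.accumulate, using Decimal so the cents add up exactly. The month_to_date helper is hypothetical.

```python
from decimal import Decimal
from itertools import accumulate

# The slide's aug_sales rows, as (date, sales).
AUG_SALES = [
    ("2019-08-01", "8.73"),
    ("2019-08-02", "129.95"),
    ("2019-08-03", "13.75"),
    ("2019-08-04", "21.75"),
    ("2019-08-05", "115.87"),
]


def month_to_date(rows):
    """Return (date, sales, running total) ordered by date."""
    ordered = sorted(rows)                                   # ORDER BY date
    totals = accumulate(Decimal(sales) for _, sales in ordered)
    return [(d, Decimal(s), t) for (d, s), t in zip(ordered, totals)]


for date, sales, mtd in month_to_date(AUG_SALES):
    print(date, sales, mtd)
```

Each output row keeps its own date and sales value, which is the difference from a plain aggregate that would collapse the five rows into one.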


TABLE FUNCTIONS

● Take scalar expressions as input
● Return a set of rows instead of a single scalar value
● Appear in the FROM clause

SELECT event_timestamp AS time, user_name
FROM TABLE (login_history_by_user())
ORDER BY event_timestamp;

TIME                             USER_NAME
2019-08-12 08:11:00.166 -0700    DEBORAH
2019-08-12 09:14:08.128 -0600    DAVE
2019-08-12 08:09:17.004 -0800    JOON


SYSTEM FUNCTIONS

CONTROL:
● ABORT_SESSION
● ABORT_TRANSACTION
● CANCEL_QUERY
● CANCEL_ALL_QUERIES
● PIPE_FORCE_RESUME
● TASK_DEPENDENTS_ENABLE
● WAIT

INFORMATION:
● CLUSTERING_DEPTH
● CLUSTERING_INFORMATION
● CURRENT_USER_TASK_NAME
● PIPE_STATUS
● STREAM_HAS_DATA
● WHITELIST


HIGH-PERFORMING FUNCTIONS


APPROXIMATION

● Snowflake uses HyperLogLog for a set of aggregate functions to estimate the approximate number of distinct values in a data set.
● HyperLogLog is a state-of-the-art cardinality estimation algorithm, capable of estimating distinct cardinalities of trillions of rows with an average relative error of a few percent.

SELECT COUNT(cust_orders),
       COUNT(DISTINCT cust_orders),
       APPROX_COUNT_DISTINCT(cust_orders)
FROM customers;

Count (cust_orders)    Count (distinct cust_orders)    HLL (cust_orders)
2205                   2205                            2220
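For readers curious how HyperLogLog gets away with so little memory, here is a minimal textbook-style sketch in Python. It is an illustration of the algorithm, not Snowflake's implementation: the choice of SHA-256 as the hash, the precision p=12 (4096 registers), and the small-range correction are all assumptions of this sketch.

```python
import hashlib
import math


def hll_estimate(values, p=12):
    """Estimate the number of distinct values using 2**p small registers."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        digest = hashlib.sha256(str(v).encode()).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash
        index = h >> (64 - p)                          # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)               # remaining 64-p bits
        rank = (64 - p) - rest.bit_length() + 1        # position of leftmost 1-bit
        registers[index] = max(registers[index], rank)
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)                 # small-range correction
    return raw


print(round(hll_estimate(range(100_000))))
```

Only the 4096 registers are kept, however many rows are scanned, and repeated values never change a register. That is why the estimate tracks distinct values and not row counts.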


COUNTING LARGE DISTINCT DATASETS

COUNT (DISTINCT col1)
● Slower
● Scalable
● Exact

SELECT COUNT(DISTINCT o_orderkey)
FROM orders;

APPROX_COUNT_DISTINCT (col1)
● Much faster
● Approximate, with an average relative error of ~1.62%
● Less memory intensive for columns with a large number of distinct values

SELECT APPROX_COUNT_DISTINCT(o_orderkey)
FROM orders;
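The ~1.62% figure is consistent with the textbook HyperLogLog error estimate of roughly 1.04 / sqrt(m) for m registers. A minimal sketch of that arithmetic, assuming 12 bits of precision (m = 4096 registers); the register count is an assumption for illustration and is not stated on the slide:

```javascript
// Textbook HyperLogLog relative-error estimate: about 1.04 / sqrt(m),
// where m is the number of registers (sub-streams).
// ASSUMPTION: 12 bits of precision, i.e. m = 2^12 = 4096 registers.
function hllRelativeError(precisionBits) {
  const m = 2 ** precisionBits;
  return 1.04 / Math.sqrt(m);
}

const hllError = hllRelativeError(12);
console.log("approximate relative error:", hllError); // roughly 0.016, i.e. about 1.6%
```

More registers lower the error but cost more memory per aggregation state, which is the trade-off behind the "less memory intensive" bullet.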


ESTIMATION VS EXACT

MEDIAN (col1)
● Slower
● Scalable
● Exact

APPROX_PERCENTILE (col1, 0.5)
● Much faster than MEDIAN
● Uses a constant amount of space regardless of the size of the input
● Passing 0.5 as the percentile estimates the median


FREQUENCY ESTIMATION

APPROX_TOP_K ( <expr> [ , <k> [ , <counters> ] ] )
Returns the most frequent values together with their estimated frequencies

APPROX_TOP_K_ACCUMULATE( <expr> , <counters> )
Skips the final estimation step and returns the Space-Saving state at the end of an aggregation

APPROX_TOP_K_COMBINE( <state> [ , <counters> ] )
Combines (i.e. merges) input states into a single output state

APPROX_TOP_K_ESTIMATE( <state> [ , <k> ] )
Computes the frequency estimate from a Space-Saving state produced by APPROX_TOP_K_ACCUMULATE and APPROX_TOP_K_COMBINE
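The accumulate / combine / estimate split can be illustrated with a small Space-Saving sketch. This is a simplified illustration of the general technique, not Snowflake's implementation; the function names, the Map-based state, and the merge rule (sum the counts, keep the largest) are all assumptions made for the example:

```javascript
// Space-Saving: track at most `counters` values. When the table is full,
// evict the value with the smallest count and let the newcomer inherit
// that count + 1 (so counts may overestimate, never underestimate).
function accumulate(values, counters) {
  const state = new Map(); // value -> estimated count
  for (const v of values) {
    if (state.has(v)) {
      state.set(v, state.get(v) + 1);
    } else if (state.size < counters) {
      state.set(v, 1);
    } else {
      let minKey = null;
      let minCount = Infinity;
      for (const [key, count] of state) {
        if (count < minCount) { minKey = key; minCount = count; }
      }
      state.delete(minKey);
      state.set(v, minCount + 1);
    }
  }
  return state;
}

// Return the k entries with the largest counts as [value, count] pairs.
function topK(state, k) {
  return [...state.entries()].sort((x, y) => y[1] - x[1]).slice(0, k);
}

// Simplified merge: sum the counts, then keep only the largest `counters`.
function combine(a, b, counters) {
  const merged = new Map(a);
  for (const [key, count] of b) {
    merged.set(key, (merged.get(key) || 0) + count);
  }
  return new Map(topK(merged, counters));
}

// Final estimation step: the k most frequent values and their counts.
function estimate(state, k) {
  return topK(state, k);
}

const part1 = accumulate(["a", "b", "a", "c", "a", "b"], 10);
const part2 = accumulate(["a", "d"], 10);
console.log(estimate(combine(part1, part2, 10), 2)); // top two: ["a", 4] and ["b", 2]
```

Keeping the intermediate state is what allows partial aggregates (for example, one per day) to be merged later without rescanning the raw rows.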


SIMILARITY ESTIMATION

MINHASH( <k> , [ DISTINCT ] expr+ )
Returns a MinHash state (the minimum of each of k hash functions over the input), used to estimate the approximate similarity between two or more data sets.

MINHASH_COMBINE( [ DISTINCT ] <state> )
Combines input MinHash states into a single MinHash output state. This MinHash state can then be input to the APPROXIMATE_SIMILARITY function to estimate the similarity with other MinHash states.

APPROXIMATE_SIMILARITY ( [ DISTINCT ] <expr> [ , ... ] )
Returns an estimation of the similarity (Jaccard index) of inputs based on their MinHash states.
Returns a value between 0.0 (no similarity) and 1.0 (identical).
Equivalent to APPROXIMATE_JACCARD_INDEX.
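A minimal sketch of the MinHash idea behind these functions. The hash function (a seeded FNV-1a variant), the array-based state, and all function names below are assumptions chosen for illustration; they are not the hash family or state format Snowflake uses:

```javascript
// Deterministic 32-bit string hash (FNV-1a style), varied by `seed`.
// ASSUMPTION: a stand-in hash family for illustration only.
function hash(str, seed) {
  let h = (2166136261 ^ seed) >>> 0;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h;
}

// MinHash state: for each of k hash functions, keep the minimum hash value.
function minhash(k, values) {
  const state = new Array(k).fill(0xffffffff);
  for (const v of values) {
    for (let i = 0; i < k; i++) {
      const h = hash(String(v), i);
      if (h < state[i]) state[i] = h;
    }
  }
  return state;
}

// Element-wise minimum of two states; equals the state of the union.
function minhashCombine(a, b) {
  return a.map((x, i) => Math.min(x, b[i]));
}

// Estimated Jaccard index: fraction of positions where the minima agree.
function approximateSimilarity(a, b) {
  let same = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] === b[i]) same++;
  }
  return same / a.length;
}

const stateA = minhash(64, ["red", "green", "blue", "cyan"]);
const stateB = minhash(64, ["red", "green", "blue", "magenta"]);
console.log("estimated similarity:", approximateSimilarity(stateA, stateB));
```

A larger k gives a tighter estimate at the cost of a larger state; identical inputs always produce identical states and therefore a similarity of 1.0.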


USER-DEFINED FUNCTIONS


USER-DEFINED FUNCTIONS

● Perform custom operations that are not available through the built-in functions
● SQL and JavaScript supported
● No DDL/DML support
● Can be secure or non-secure
● Return a single scalar value or, if defined as a table function, a set of rows


SQL UDF EXAMPLE

1. Create the table and data to use

number_sold    wholesale_price    retail_price
3              10.00              20.00
5              100.00             200.00

2. Create the UDF

CREATE FUNCTION profit() RETURNS NUMERIC(11, 2) AS
$$
  SELECT SUM((retail_price - wholesale_price) * number_sold)
  FROM purchases
$$;

3. Call the UDF

SELECT profit();

PROFIT()
530.00
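The result can be checked by hand from the two rows on the slide: (20 - 10) * 3 + (200 - 100) * 5 = 30 + 500 = 530. The same arithmetic as a short sketch (the object and property names are illustrative only):

```javascript
// Rows of the purchases table shown on the slide.
const purchases = [
  { numberSold: 3, wholesalePrice: 10.0, retailPrice: 20.0 },
  { numberSold: 5, wholesalePrice: 100.0, retailPrice: 200.0 },
];

// Same expression as the UDF body:
// SUM((retail_price - wholesale_price) * number_sold)
const profit = purchases.reduce(
  (sum, p) => sum + (p.retailPrice - p.wholesalePrice) * p.numberSold,
  0
);

console.log(profit.toFixed(2)); // prints "530.00"
```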


JAVASCRIPT UDF

create function variant_nulls(v variant)
returns variant
language javascript
as '
  // The argument name is upper-cased inside the JavaScript body: v is read as V.
  if (V === undefined) {
    return "input SQL null";
  } else if (V === null) {
    return "input variant null";
  } else if (V === 1) {
    // nothing is returned here, so the result is SQL NULL
  } else if (V === 2) {
    // nothing is returned here, so the result is SQL NULL
  } else if (V === 3) {
    return {
      key1 : undefined,
      key2 : null
    };
  } else {
    return V;
  }
';
STORED PROCEDURES


STORED PROCEDURES

● Allow procedural logic and error handling that straight SQL does not support
● Implemented through JavaScript and, optionally (commonly), SQL
● JavaScript provides the control structures
● SQL is executed within the JavaScript by calling functions in an API
● Argument names are case-insensitive in the SQL portion of stored procedure code, but case-sensitive in the JavaScript portion
COMPONENTS OF STORED PROCEDURE

CREATE PROCEDURE stproc1(float_param1 FLOAT)   -- create stored procedure
  RETURNS STRING                               -- data type to return
  LANGUAGE JAVASCRIPT                          -- language (currently only JavaScript supported)
  STRICT
  AS
  $$
  // The $$ delimiters mark the beginning and end of the code
  try {
    // SQL statements to execute
    snowflake.execute(
      {sqlText: "INSERT INTO stproc_test_table1 (num_col1) VALUES (" + FLOAT_PARAM1 + ")"}
    );
    return "Succeeded.";       // Return status
  } catch (err) {
    return "Failed: " + err;   // Return status
  }
  $$;

-- Invoke the stored procedure
CALL stproc1(5.14::FLOAT);


HOW IT WORKS

[Diagram: Stored Procedure | JavaScript APIs | Database]

● Snowflake-provided JavaScript APIs
  ○ JavaScript objects and methods
  ○ Error handling and procedural logic
● SQL and database access/operations
  ○ Embed SQL code in the JavaScript
  ○ The SQL code executes database operations
  ○ Migrate other databases' stored procedure (SQL) code by embedding the SQL in JavaScript


STORED PROCEDURE SQL EXECUTED THROUGH API OBJECTS

Object Class        Description
Snowflake Object    Contains the methods in the stored procedure API. Accessible by default to JavaScript code (you do not need to create the object).
Statement Object    Provides the methods for executing a query statement, and accessing metadata (for example, information about columns and rows) about the statement.
ResultSet Object    Contains the results returned by a query (zero or more rows, each with one or more columns). You iterate through a result set by repeatedly calling next() and taking some action.
SfDate Object       JavaScript does not have a native data type that corresponds to Snowflake SQL TIMESTAMP data types. Used when you want to retrieve a TIMESTAMP and store it in a variable.
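The objects in the table are typically used together: create a Statement, execute it to obtain a ResultSet, then loop with next(). The sketch below shows that pattern. Because the built-in `snowflake` object exists only inside a stored procedure, the example passes in a hypothetical in-memory stand-in (`makeFakeSnowflake`, invented here) that returns canned rows; `sumColumn` and the table name are likewise illustrative.

```javascript
// HYPOTHETICAL stand-in for the built-in `snowflake` object, so the sketch
// runs outside Snowflake. It ignores the SQL text and returns canned rows.
function makeFakeSnowflake(rows) {
  return {
    createStatement({ sqlText }) {
      return {
        execute() {
          let i = -1;
          return {
            next: () => ++i < rows.length,             // advance to the next row
            getColumnValue: (col) => rows[i][col - 1], // column indexes are 1-based
          };
        },
      };
    },
  };
}

// Roughly what a stored procedure body could look like: Statement ->
// ResultSet -> iterate with next() and read values with getColumnValue().
function sumColumn(snowflake, tableName) {
  const stmt = snowflake.createStatement({
    sqlText: "SELECT num_col1 FROM " + tableName,
  });
  const rs = stmt.execute();
  let total = 0;
  while (rs.next()) {
    total += rs.getColumnValue(1);
  }
  return total;
}

const fakeSnowflake = makeFakeSnowflake([[1.5], [2.5], [3]]);
console.log(sumColumn(fakeSnowflake, "stproc_test_table1")); // prints 7
```

Inside a real stored procedure the `snowflake` object is already available, so only the body of `sumColumn` would be needed.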


STORED PROCEDURES VS UDFS

STORED PROCEDURE
● MAY return a value
● CANNOT return a set of rows
● Can access database objects and issue SQL statements
● Can run as procedure owner or caller
  ○ Owner is default
  ○ Specified at creation time

USER-DEFINED FUNCTION
● MUST return a value
● CAN return a set of rows (table)
● DDL and DML operations not permitted
● Runs as the function owner


LAB EXERCISE

Create User-Defined Functions and Stored Procedures
30 minutes

Tasks:
● Create a JavaScript User-Defined Function
● Create a SQL User-Defined Function
● Create a Stored Procedure


ACCESS CONTROL AND USER MANAGEMENT


MODULE AGENDA

● Concepts
● System Roles
● Custom Roles and Inheritance
● Ownership
● Configure and Manage Access
● Recap


CONCEPTS
ACCESS CONTROL

What can you do?

● Access Control is part of Snowflake's authorization model
● Access Control defines:
  ○ Who can use which roles
  ○ Who can access which objects
  ○ What operations can be performed on those objects
  ○ Who can create or alter access control policies


USERS AND ROLES

USERS: SREEDHAR, ANNA, EVAN, JENN

ROLES (member users shown by initial):
PROD       J, A
TESTDEV    S, E, A
ANALYST    J, S


PRIVILEGES AND GRANTS

● A privilege is the right to do something on or with a Snowflake securable object
● A grant gives specific privileges to a role, or gives a user the right to use a role

[Diagram: role PROD with users J and A]

GRANT ROLE PROD TO USER JENN;
GRANT ROLE PROD TO USER ANNA;
GRANT USAGE ON WAREHOUSE prod_wh TO ROLE PROD;
GRANT CREATE SCHEMA ON DATABASE prod_db TO ROLE PROD;

In the last statement, CREATE SCHEMA is the privilege; the statement as a whole is the grant.
GRANT ROLES TO USERS

● Grant roles to a user, to enable the user to access (use) those roles

GRANT ROLE <name> TO USER <name>;

● Examples (each statement grants one role to one user):

GRANT ROLE dba TO USER barney;
GRANT ROLE analyst TO USER gail;
GRANT ROLE analyst TO USER henry;
GRANT ROLE dba TO USER gail;
GRANT ROLE dba TO USER henry;


GRANT PRIVILEGES TO ROLES

● Grant privileges on securable objects to roles

GRANT <privilege(s)> ON <securable object(s)> TO ROLE <role>;

● Examples (each statement names a single grantee role):

GRANT USAGE ON WAREHOUSE my_wh TO ROLE developers;

GRANT SELECT, INSERT, DELETE ON ALL TABLES IN SCHEMA my_schema
  TO ROLE analyst;
GRANT SELECT, INSERT, DELETE ON ALL TABLES IN SCHEMA my_schema
  TO ROLE dba;


SECURABLE OBJECTS

● Securable objects are constructs within Snowflake
● You grant privileges on securable objects
● Available privileges depend on the object type

[Diagram: securable object hierarchy]
Account
  User, Role, Warehouse, Resource Monitor, Database
    Schema
      Table, View, Sequence, Stage, File Format, Pipe, UDF


AVAILABLE PRIVILEGES DEPEND ON THE OBJECT

CREATE ROLE | USER | WAREHOUSE | DATABASE, MANAGE GRANTS, MONITOR USAGE...ON ACCOUNT
GRANT USAGE | MONITOR | MODIFY | OPERATE ON WAREHOUSE
GRANT MODIFY | MONITOR ON RESOURCE MONITOR
GRANT MODIFY | MONITOR | USAGE | CREATE SCHEMA ON DATABASE
GRANT MODIFY | MONITOR | USAGE | CREATE TABLE | CREATE VIEW...ON SCHEMA
GRANT SELECT | INSERT | UPDATE | DELETE | TRUNCATE...ON TABLE


ABOUT PRIVILEGES

● Warehouse privileges USAGE and OPERATE are separate and specific
  ○ USAGE: use a warehouse to run queries
  ○ OPERATE: start, stop, suspend, or resume a virtual warehouse (changing its settings, such as size, requires MODIFY)
● Acting on an object requires USAGE on its logical container(s)
  ○ Example: to use SELECT privileges on a table, you must have USAGE on the schema and database
● CREATE, SELECT, INSERT, UPDATE, and DELETE are all separate privileges
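The container rule can be pictured as a chain of checks: SELECT on the table only helps if the role also holds USAGE on the enclosing schema and database. A toy model, assuming a flat set of "privilege:type:name" strings; the encoding, the object names, and `canSelect` are invented for illustration and do not reflect how Snowflake stores grants:

```javascript
// Privileges held by one role, encoded as "PRIVILEGE:TYPE:name" strings.
// ASSUMPTION: illustrative encoding and hypothetical object names.
const roleGrants = new Set([
  "USAGE:DATABASE:db1",
  "USAGE:SCHEMA:db1.schema1",
  "SELECT:TABLE:db1.schema1.t1",
  "SELECT:TABLE:db2.schema9.t9", // SELECT without USAGE on its containers
]);

// A role can read a table only with SELECT on the table AND USAGE on
// both of its containers (schema and database).
function canSelect(grants, db, schema, table) {
  return (
    grants.has(`USAGE:DATABASE:${db}`) &&
    grants.has(`USAGE:SCHEMA:${db}.${schema}`) &&
    grants.has(`SELECT:TABLE:${db}.${schema}.${table}`)
  );
}

console.log(canSelect(roleGrants, "db1", "schema1", "t1")); // prints true
console.log(canSelect(roleGrants, "db2", "schema9", "t9")); // prints false
```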


USERS, ROLES, AND GRANTS

● Roles are granted permissions (privileges) to access objects
● Users are granted roles which they can use
● Users USE a role, and get the role's privileges while using the role
● Users can be assigned to multiple roles, but only use one (current role) at a time

User         Role       Privilege    Object
John_Wick    DBA        Modify       Database1
John_Wick    Analyst    Usage        Database1

User is granted Role(s); the role is assigned Privileges.

John has privileges to Modify the database while using the DBA role
John has privileges to USE the database while using the Analyst role


SYSTEM ROLES


SYSTEM ROLES

● ACCOUNTADMIN
  ○ Has all privileges
● SECURITYADMIN
  ○ Create, monitor, manage users and roles
  ○ Grant or revoke privileges to users and roles
● SYSADMIN
  ○ Create warehouses and databases
  ○ Create other objects
● PUBLIC
  ○ No privileges
  ○ Default role for all users, unless otherwise specified

[Diagram: ACCOUNTADMIN above SECURITYADMIN and SYSADMIN, with PUBLIC at the bottom]
DIVISION OF RESPONSIBILITY
ACCOUNTADMIN

● ACCOUNTADMIN has ALL privileges, either granted or inherited. Grant this role to a limited number of users.

DEFAULT GRANTS (by default, only ACCOUNTADMIN can do these things):
● CREATE SHARE
● IMPORT SHARE
● MONITOR USAGE


BEST PRACTICES
ACCOUNTADMIN

● Assign to a limited number of users
● Require MFA on all users with this role
● Do not make it the default role for any user
● Never execute scripts using ACCOUNTADMIN


DIVISION OF RESPONSIBILITY
SECURITYADMIN

SECURITYADMIN should create all users and roles, and GRANT the roles to SYSADMIN (or somewhere in the SYSADMIN hierarchy)

DEFAULT GRANTS:
○ CREATE USER ON ACCOUNT
○ CREATE ROLE ON ACCOUNT
○ MANAGE GRANTS ON ACCOUNT


DIVISION OF RESPONSIBILITY
SYSADMIN

SYSADMIN should grant PRIVILEGES to ROLES

DEFAULT GRANTS:
○ CREATE DATABASE ON ACCOUNT
○ CREATE WAREHOUSE ON ACCOUNT


DIVISION OF RESPONSIBILITY
PUBLIC

PUBLIC can't do anything but log in. It's the default role for new users, unless changed.

DEFAULT GRANTS:
○ Zero
○ Zip
○ Zilch
○ Nada


CUSTOM ROLES AND INHERITANCE


ROLE INHERITANCE

● A role inherits all the privileges of its underlying roles (those "lower" in the hierarchy)
● ACCOUNTADMIN inherits privileges from SECURITYADMIN, SYSADMIN, and PUBLIC
● SECURITYADMIN and SYSADMIN inherit privileges from PUBLIC
● PUBLIC inherits nothing

[Diagram: ACCOUNTADMIN above SECURITYADMIN and SYSADMIN, with PUBLIC at the bottom]


ROLE HIERARCHY

● Roles fit into the hierarchy based on which role(s) they are granted to
● Custom roles also inherit privileges from the roles "beneath" them
● Here:
  ○ Role DBA has been granted to role SYSADMIN
  ○ Role SYSADMIN inherits privileges from role DBA
  ○ Role DBA inherits privileges from role PUBLIC

[Diagram: ACCOUNTADMIN above SECURITYADMIN and SYSADMIN; DBA below SYSADMIN; PUBLIC at the bottom]


CREATING CUSTOM ROLES

USE ROLE securityadmin;

CREATE ROLE dba;

GRANT ROLE dba TO ROLE sysadmin;

GRANT ROLE dba TO USER jenna;
GRANT ROLE dba TO USER tom;


ROLE HIERARCHY

● Grants:
  ○ Role DBA has been granted to role SYSADMIN
  ○ Role ANALYST has been granted to role DBA
  ○ Role DEV has been granted to role DBA
  ○ Role JR_ANALYST has been granted to role ANALYST
● Inheritance:
  ○ SYSADMIN inherits from DBA and roles below it
  ○ DBA inherits from DEV, ANALYST, and roles below them
  ○ ANALYST inherits from JR_ANALYST

[Diagram: ACCOUNTADMIN above SECURITYADMIN and SYSADMIN; DBA below SYSADMIN; ANALYST and DEV below DBA; JR_ANALYST below ANALYST; PUBLIC at the bottom]
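Inheritance follows the grant edges downward: a role inherits from every role granted to it, and from everything those roles inherit in turn. A small sketch that walks the four grants listed on this slide (the pair-list representation and the function name are illustrative; PUBLIC and the system-role edges are left out for brevity):

```javascript
// Each pair reads: "role <child> has been granted to role <parent>".
const roleToRoleGrants = [
  ["DBA", "SYSADMIN"],
  ["ANALYST", "DBA"],
  ["DEV", "DBA"],
  ["JR_ANALYST", "ANALYST"],
];

// Return every role whose privileges `role` inherits (all roles below it).
function inheritedRoles(role, grants) {
  const result = new Set();
  const stack = [role];
  while (stack.length > 0) {
    const current = stack.pop();
    for (const [child, parent] of grants) {
      if (parent === current && !result.has(child)) {
        result.add(child);
        stack.push(child);
      }
    }
  }
  return [...result].sort();
}

console.log(inheritedRoles("SYSADMIN", roleToRoleGrants)); // ANALYST, DBA, DEV, JR_ANALYST
console.log(inheritedRoles("DBA", roleToRoleGrants));      // ANALYST, DEV, JR_ANALYST
console.log(inheritedRoles("ANALYST", roleToRoleGrants));  // JR_ANALYST
```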


ROLE HIERARCHY VISUALIZATION

CREATE ROLE dba;
GRANT ROLE dba TO ROLE sysadmin;

CREATE ROLE analyst;
GRANT ROLE analyst TO ROLE dba;

CREATE ROLE dev;
GRANT ROLE dev TO ROLE dba;

CREATE ROLE jr_analyst;
GRANT ROLE jr_analyst TO ROLE analyst;

Resulting hierarchy:
AccountAdmin     Inherits: All Roles
SecurityAdmin
SysAdmin         Inherits: DBA, Analyst, Dev, Jr_Analyst
DBA              Inherits: Analyst, Dev, Jr_Analyst
Analyst          Inherits: Jr_Analyst
Dev
Jr_Analyst


ROLES AND INHERITANCE

User         Role       Privilege                           Object
John_Wick    Public     (default role)
John_Wick    Analyst    Usage, Select                       TABLEA
John_Wick    DBA        Insert, Update, Delete, Truncate    TABLEB
             Dev        Select, Usage                       TABLEB

Upon connecting to Snowflake, a user takes on their default role.

User John_Wick has been granted the roles Analyst and DBA.

The role Dev has been granted to the role DBA.

USE ROLE analyst;
When using the Analyst role, John can SELECT from TABLEA and USE a warehouse.

USE ROLE dba;
With the DBA role, he can INSERT, UPDATE, DELETE, and TRUNCATE TABLEB. He also inherits the right to SELECT from TABLEB, and USE a warehouse.

USE ROLE dev;
He can also USE the Dev role, since it was granted to a role he has the right to use. As Dev, he can SELECT from TABLEB and USE a warehouse.
OWNERSHIP
OWNERSHIP

● When you create an object, your current role becomes the owner
● Every role or object is owned by a single role
  ○ All users in that role share ownership, when they are using that role
● The owner can do anything with the object
● The owner can grant privileges to itself or other roles
● Ownership can be transferred by the owner

GRANT OWNERSHIP ON WAREHOUSE dev_ws TO ROLE dba;


OWNERSHIP

● Warning: Object ownership can cause confusion.
  ○ Be aware of which role you're using when you create an object.
  ○ When you change roles, your object may no longer be accessible (because the object belongs to the role you were using when you created it, not to you).
  ○ If an object goes "missing," check with the role owner.
● Best Practice: Have SECURITYADMIN own all roles
● Example:

GRANT CREATE DATABASE ON ACCOUNT TO ROLE dba;
USE ROLE dba;
CREATE DATABASE my_db1;
CREATE SCHEMA my_db1.myschema; -- privilege acquired through ownership


ROLE OWNERSHIP VERSUS GRANT

[Diagram: role hierarchy; each link between roles is a Grant, and SecurityAdmin has an ownership link to each custom role]
AccountAdmin     Inherits: All Roles
SecurityAdmin
SysAdmin         Inherits: DBA, Analyst, Dev, Jr_Analyst
DBA              Inherits: Analyst, Dev, Jr_Analyst
Analyst          Inherits: Jr_Analyst
Dev
Jr_Analyst

The SECURITYADMIN role OWNS all the roles, but has NO data access.


OWNERSHIP AND GRANTS

● Granted privileges are different from ownership
● GRANTs give a role the right to do something with an object, or to use a role
● The OWNER has the right to GRANT privileges on the objects they own
● The OWNER also has all rights to the objects they own


CONFIGURE AND MANAGE ACCESS


CREATE USERS AND ROLES

● Users with SECURITYADMIN role or above can manage users and roles using SQL or the web UI
● When creating a user, strongly recommend forcing a password change on first login
● Can show users, describe a given user, or alter/drop a user
● Once created, assign users to roles

CREATE USER johnsmith
  PASSWORD = 'rose-bud'
  DEFAULT_ROLE = developer
  MUST_CHANGE_PASSWORD = TRUE;

DESC USER johnsmith;

CREATE ROLE dba;

GRANT ROLE dba TO USER johnsmith;


MANAGE USERS

● Use ALTER USER to change the password for another user
● After 5 login failures, user is locked out for 15 minutes
● RESET PASSWORD generates a URL to share with the user
  ○ Old password valid until changed by user
  ○ URL expires in 4 hours
● Disabling a user terminates user sessions/locks them out immediately

ALTER USER johnsmith
  SET PASSWORD = 'rose-bud-II'
  MUST_CHANGE_PASSWORD = TRUE;

ALTER USER johnsmith RESET PASSWORD;

ALTER USER johnsmith SET DISABLED = TRUE;

ALTER USER johnsmith SET MINS_TO_UNLOCK = 0;

ALTER USER johnsmith SET PASSWORD = 'XXX';


ASSIGN A DEFAULT ROLE

● If no default role is set, the default role is PUBLIC
● Set DEFAULT_ROLE for a user to change the default
  ○ ALTER USER <name> SET DEFAULT_ROLE = <role>
● Set the role in the connection string from the client (JDBC, ODBC, SnowSQL…)
  ○ A role set in the connection string overrides the configured default role

    Configuration                   Role used by the session
    DEFAULT_ROLE not set            PUBLIC
    DEFAULT_ROLE set to ROLE_A      ROLE_A
    Connect with role ROLE_B        ROLE_B
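
The three cases on this slide amount to a simple precedence rule. The function below is a hypothetical helper written only to illustrate that rule; it ignores the real-world requirement that the chosen role must actually be granted to the user.

```python
from typing import Optional


def session_role(default_role: Optional[str] = None,
                 connection_role: Optional[str] = None) -> str:
    """Return the role a new session starts with, following the slide's three cases."""
    if connection_role:      # a role named in the connection string wins
        return connection_role
    if default_role:         # otherwise the user's DEFAULT_ROLE applies
        return default_role
    return "PUBLIC"          # nothing configured: fall back to PUBLIC


print(session_role())
print(session_role(default_role="ROLE_A"))
print(session_role(default_role="ROLE_A", connection_role="ROLE_B"))
```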


VIEW CURRENT PRIVILEGES

● View the current roles granted to a user:

    SHOW GRANTS TO USER <user_name>;

● View the current set of privileges granted to a role:

    SHOW GRANTS TO ROLE <role_name>;

● View which users a role has been granted to:

    SHOW GRANTS OF ROLE <role_name>;


GRANT PRIVILEGES

    GRANT ROLE warehouse_manager TO USER kelly;

    GRANT ROLE dba TO USER allison;

    GRANT ROLE analyst TO USER marie;

    GRANT OPERATE ON WAREHOUSE wh1 TO ROLE warehouse_manager;

    GRANT USAGE ON WAREHOUSE wh1 TO ROLE dba;
    GRANT USAGE ON WAREHOUSE wh1 TO ROLE analyst;

    GRANT CREATE SCHEMA ON DATABASE db1 TO ROLE dba;

    GRANT SELECT ON ALL TABLES IN SCHEMA schema1 TO ROLE analyst;


FUTURE GRANTS

● Grants, by default, apply only to existing objects

    GRANT SELECT ON ALL TABLES IN SCHEMA mydb.myschema TO ROLE role1;

● Use ON FUTURE to grant privileges to objects that will be created in the future
  ○ See documentation for details on future grants

    GRANT SELECT ON FUTURE TABLES IN SCHEMA mydb.myschema TO ROLE role1;
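
The difference between ON ALL and ON FUTURE is easiest to see by tracking which tables exist when each grant runs. The class below is a toy model with hypothetical names (Schema, grant_on_all, grant_on_future); it only mimics the behaviour described on the slide and is not how Snowflake stores grants.

```python
class Schema:
    """Toy schema that records which (privilege, role) pairs each table carries."""

    def __init__(self):
        self.tables = {}      # table name -> set of (privilege, role)
        self.future = set()   # grants stamped onto every table created later

    def create_table(self, name):
        # A new table starts with whatever future grants are registered.
        self.tables[name] = set(self.future)

    def grant_on_all(self, privilege, role):
        # Touches only the tables that exist right now.
        for grants in self.tables.values():
            grants.add((privilege, role))

    def grant_on_future(self, privilege, role):
        # Touches no existing table; applies to tables created afterwards.
        self.future.add((privilege, role))

    def can(self, role, privilege, table):
        return (privilege, role) in self.tables[table]


s = Schema()
s.create_table("t1")
s.grant_on_all("SELECT", "role1")
s.create_table("t2")                   # created after the ON ALL grant
s.grant_on_future("SELECT", "role1")
s.create_table("t3")                   # created after the ON FUTURE grant
print({t: s.can("role1", "SELECT", t) for t in ("t1", "t2", "t3")})
```

In this sketch t2 ends up uncovered by either grant, which is why the two forms are often issued together.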


RECAP

SCENARIO
FUN WITH ROLES AND PRIVILEGES

● Nancy is new to CompanyA, which uses Snowflake
● Their previous Snowflake administrator (Roberto) quit to open a gelato shop in Antarctica, and cannot be reached
● Before leaving, Roberto assigned Nancy the roles ACCOUNTADMIN, SECURITYADMIN, and SYSADMIN, and left her a coupon for free gelato
● Management has asked Nancy to create a place where new DBA interns can play with their data warehouse without doing any real harm


SCENARIO
FUN WITH ROLES AND PRIVILEGES

● Create an intern role
● Create an intern database
● Give interns the ability to create schemas in the intern database


COMPANYA CURRENT CONFIGURATION

[Diagram: role hierarchy, top to bottom]
    ACCOUNTADMIN
    SECURITYADMIN    SYSADMIN
    DBA    Developer
    PUBLIC

What will Nancy do?


COMPANYA SCENARIO

User: Nancy    Role: SECURITYADMIN

    CREATE ROLE intern;
        Statement executed successfully

    GRANT ROLE intern TO ROLE dba;
        Statement executed successfully

[Diagram: role hierarchy after the grant, top to bottom]
    ACCOUNTADMIN
    SECURITYADMIN    SYSADMIN
    DBA    Developer
    INTERN
    PUBLIC

    USE ROLE intern;
        SQL access control error: Requested role 'INTERN' is not assigned to the
        executing user. Specify another role to activate.

Before a user can use a role, the role must be granted to that user.
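
GRANT ROLE intern TO ROLE dba places INTERN beneath DBA, so DBA picks up whatever privileges INTERN holds. The sketch below is a toy model of that inheritance with hypothetical names (RoleGraph and its methods); it covers only privilege inheritance between roles, not which roles a user may activate.

```python
from collections import defaultdict


class RoleGraph:
    """Toy role hierarchy: a role inherits the privileges of roles granted to it."""

    def __init__(self):
        self.privileges = defaultdict(set)   # role -> privileges granted directly
        self.granted = defaultdict(set)      # role -> roles granted to that role

    def grant_privilege(self, privilege, role):
        self.privileges[role].add(privilege)

    def grant_role_to_role(self, child, parent):
        self.granted[parent].add(child)

    def effective_privileges(self, role):
        # Walk down the hierarchy and collect every privilege on the way.
        seen, stack, result = set(), [role], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            result |= self.privileges[current]
            stack.extend(self.granted[current])
        return result


g = RoleGraph()
g.grant_role_to_role("INTERN", "DBA")
g.grant_privilege("USAGE ON DATABASE intern", "INTERN")
print(g.effective_privileges("DBA"))      # includes the INTERN privilege
print(g.effective_privileges("INTERN"))   # INTERN gains nothing from DBA
```

Inheritance runs in one direction only: privileges granted to DBA later would not become available to INTERN.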


COMPANYA SCENARIO

User: Nancy    Role: SECURITYADMIN

    GRANT ROLE intern TO USER nancy;
        Statement executed successfully

    USE ROLE intern;


COMPANYA SCENARIO

User: Nancy    Role: INTERN

    CREATE DATABASE intern;
        SQL access control error: Insufficient privileges
        to operate on account 'companya'

    USE ROLE securityadmin;


COMPANYA SCENARIO

User: Nancy    Role: SECURITYADMIN

    CREATE DATABASE intern;
        SQL access control error: Insufficient privileges
        to operate on account 'companya'

The role SECURITYADMIN does not have the ability to create objects - only users and roles.

    USE ROLE sysadmin;


COMPANYA SCENARIO

User: Nancy    Role: SYSADMIN

    CREATE DATABASE intern;
        Statement executed successfully

    GRANT CREATE SCHEMA ON DATABASE intern TO ROLE intern;
        Statement executed successfully

    GRANT USAGE ON DATABASE intern TO ROLE intern;

Don't forget! To use an object (such as a schema), you need to have USAGE privileges on its containing object (in this case, the database).


COMPANYA SCENARIO

User: Nancy    Role: SYSADMIN

While she's at it, Nancy creates a warehouse for the interns, and grants them USAGE on it.

    CREATE WAREHOUSE intern_wh;
        Statement executed successfully

    GRANT USAGE ON WAREHOUSE intern_wh TO ROLE intern;
        Statement executed successfully

    USE ROLE intern;

INTERN permissions:
    • USAGE on DATABASE intern
    • CREATE SCHEMA on DATABASE intern
    • USAGE on WAREHOUSE intern_wh


COMPANYA SCENARIO

User: Nancy    Role: INTERN

    CREATE SCHEMA test_schema;
        Statement executed successfully


COMPANYA SCENARIO

User: Nancy    Role: INTERN

    USE ROLE securityadmin;
        Statement executed successfully

This is not an intern becoming a SECURITYADMIN - this is Nancy using a role she has been granted. Other users in the INTERN role would not be able to do this.


COMPANYA SCENARIO

User: Nancy    Role: SECURITYADMIN

    CREATE USER intern1
      PASSWORD = 'Test12345'
      DEFAULT_ROLE = intern;
        Statement executed successfully

    GRANT ROLE intern TO USER intern1;
        Statement executed successfully

Nancy logs out of the account, and logs back in as intern1.


COMPANYA SCENARIO

User: Intern1    Role: INTERN

    USE WAREHOUSE intern_wh;
        Statement executed successfully

    USE SCHEMA intern.test_schema;
        Statement executed successfully


COMPANYA SCENARIO

User: Intern1    Role: INTERN

    CREATE TABLE my_table
      (col1 INT);
        Statement executed successfully

The role INTERN has been granted privileges to:
    • Use the database INTERN
    • Use the warehouse INTERN_WH
    • Create schemas on database INTERN

Through ownership, the role INTERN can:
    • Create objects inside the created schemas (which makes INTERN the owner of those objects)
    • Do anything with the created objects


LAB EXERCISE
Access Control and User Management
40 minutes

Exercise: Manage Roles

Tasks:
● Creating Parent & Child Roles
● Granting Privileges on Roles
● Dropping Roles


SEMI-STRUCTURED DATA


MODULE AGENDA

● Overview
● Query Semi-Structured Data
● Load and Unload Semi-Structured Data


OVERVIEW

NATIVE SUPPORT

● Native support for all your data
  ○ Data Lakes with machine-generated data
  ○ Sensor IoT type data
  ○ Semi-Structured origin formats (JSON, Avro, ORC, Parquet, or XML)
● First level data types
  ○ Statistics are gathered and maintained, including subfields for quicker query access and effective partition pruning for performance
● Support for all SQL operations (joins, group by, order by, …)
  ○ Native syntax for accessing data even in semi-structured fields


SEMI-STRUCTURED DATA TYPES

● VARIANT – holds values of standard SQL types, arrays, and objects
  ○ Non-native values (e.g., dates and timestamps) are stored as strings in a VARIANT column
  ○ Operations could be slower than when stored in a relational column as the corresponding data type
● OBJECT – a collection of key-value pairs
  ○ Each value is a VARIANT
● ARRAY – arrays of varying sizes
  ○ Each element is a VARIANT
● Snowflake collects metadata on these columns for use by the optimizer for pruning


VARIANT DATA TYPE EXAMPLE

[Figure: SAMPLE WEATHER DATA SET IN SNOWFLAKE]


SEMI-STRUCTURED FILE FORMATS

JSON File

    [ {"table1": {"dim": {"H": 30, "W": 38, "L": 65} }, "options": ["teak", "oak", "walnut"] }, {"table2":…} ]

[Annotations on the JSON]
● "table1": key whose value is an object
● "dim": key whose value is an object
● "H", "W", "L": keys (k), each with a value (v)
● "options": key whose value is an array


SEMI-STRUCTURED FILE FORMATS

Other semi-structured file formats can be loaded into a VARIANT column and queried, using similar commands and functions as JSON data.

● Avro: Open-source framework originally developed for use with Apache Hadoop
● ORC (Optimized Row Columnar): binary format used to store Hive data
● Parquet: binary format designed for projects in the Hadoop ecosystem
● XML (Extensible Markup Language): consists primarily of tags <> and elements


QUERY SEMI-STRUCTURED DATA


QUERY RICH HIERARCHICAL STRUCTURES IN SQL

    SELECT
        value:temp:min::number(10,2) AS min,
        value:temp:max::number(10,2) AS max
    FROM
        daily_14_total dt,
        LATERAL FLATTEN(input => dt.v:data)
    WHERE
        dt.t = (SELECT MAX(t) FROM daily_14_total)
    ORDER BY min
    LIMIT 10;


DATA FUNCTIONS

● Array Creation and Manipulation
● Object Creation and Manipulation
● Extraction
● Casting
● Type Checking
● JSON and XML Parsing
● Conversion


ACCESSING VALUES IN NESTED DATA

Access a value in a VARIANT column for a particular key

Colon - column:key

    SELECT value:humidity
        AS pct_humidity
    FROM
        my_json_table
    …

Brackets - column['key']

    SELECT value['humidity']
        AS pct_humidity
    FROM
        my_json_table
    …

Both forms return the same result:

    PCT_HUMIDITY
    =========
    50


CASTING DATA FROM VARIANTS

● VARIANTs are often just strings, arrays, or numbers
● Without casting, they remain VARIANT values
● Cast them to SQL data types using the :: operator

    SELECT
        value:humidity::number(10,2)
            AS pct_humidity
    FROM
        daily_14_total dt,
        LATERAL FLATTEN(input => dt.v:data)
    LIMIT 3;

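
As a rough analogy for path access followed by a cast, the sketch below walks a colon-separated path through nested Python dictionaries and then fixes the result to two decimal places, in the spirit of ::number(10,2). The helpers get_path and to_number, and the sample record, are hypothetical; this is not how Snowflake evaluates the expression.

```python
from decimal import Decimal, ROUND_HALF_UP


def get_path(variant, path):
    """Follow a colon-separated path such as 'temp:min' through nested dicts."""
    for key in path.split(":"):
        variant = variant[key]
    return variant


def to_number(value, scale=2):
    """Convert a loosely typed value into a fixed-scale decimal."""
    step = Decimal(1).scaleb(-scale)          # scale=2 gives 0.01
    return Decimal(str(value)).quantize(step, rounding=ROUND_HALF_UP)


record = {"humidity": 50, "temp": {"min": 271.456, "max": 280.1}}   # made-up sample
print(get_path(record, "humidity"))                # untyped value, as stored
print(to_number(get_path(record, "humidity")))     # typed, two decimal places
print(to_number(get_path(record, "temp:min")))
```
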
FLATTENING DATA

● VARIANTs may contain nested elements (arrays and objects that contain the data)
● FLATTEN() extracts data from nested elements
● Almost always used with a LATERAL join (to refer to columns from other tables, views, or table functions)

    LATERAL FLATTEN(input => expression [ options ] )

● input => The expression or column that will be unnested into rows. The data must be of type VARIANT, OBJECT, or ARRAY.


COLUMNS RETURNED BY FLATTEN

    SEQ     Unique sequence number associated with input record; not guaranteed to be
            gap-free or ordered any particular way
    KEY     For maps or objects, this column contains the key to the exploded value
    PATH    Path to the element within a data structure that needs to be flattened
    INDEX   Index of the element, if it is an array; otherwise NULL
    VALUE   Value of the element of the flattened array or object
    THIS    The element being flattened (useful in recursive flattening)


FLATTENING DATA

    SELECT * FROM
        TABLE (FLATTEN(input => PARSE_JSON('[1,2,3]')));

    SEQ   KEY    PATH   INDEX   VALUE   THIS
    1     NULL   [0]    0       1       [1,2,3]
    1     NULL   [1]    1       2       [1,2,3]
    1     NULL   [2]    2       3       [1,2,3]

● Columns from both the source table and the FLATTEN results can be SELECTed
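
To make the six output columns concrete, here is a minimal, non-recursive imitation of FLATTEN. The function name and the way PATH is rendered for object keys are assumptions for illustration; only the array case is checked against the output shown on the slide.

```python
import json


def flatten(value, seq=1):
    """Return one row per element of an array or per key of an object."""
    rows = []
    if isinstance(value, list):
        for index, element in enumerate(value):
            rows.append({"SEQ": seq, "KEY": None, "PATH": f"[{index}]",
                         "INDEX": index, "VALUE": element, "THIS": value})
    elif isinstance(value, dict):
        for key, element in value.items():
            rows.append({"SEQ": seq, "KEY": key, "PATH": key,
                         "INDEX": None, "VALUE": element, "THIS": value})
    return rows


for row in flatten(json.loads("[1,2,3]")):
    print(row)
```

Each row carries the whole input in THIS, which is what makes it possible to select source columns and flattened values side by side.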


LAB EXERCISE
DEMO: View, Query, and Flatten Semi-Structured Data
10 minutes

● WEATHER data in SNOWFLAKE_SAMPLE_DATA


LAB EXERCISE
Semi-Structured Data
30 minutes

Exercise: Explore Semi-Structured Data

Tasks:
● Review Weather Data
● Extract & Transform Weather Data


LOAD AND UNLOAD SEMI-STRUCTURED DATA


LOADING SEMI-STRUCTURED DATA

[Diagram: data flow]
    1. Output Data to Files
    2. Stage Files to Cloud Storage
    3. Load Data from Cloud Storage Into Snowflake

LOADING SEMI-STRUCTURED DATA

[Diagram: command sequence]
    1. CREATE FILE FORMAT
    2. CREATE TABLE
    3. COPY INTO TABLE


SEMI-STRUCTURED FILE FORMATS

● Create file formats for semi-structured data using one of the available types

    CREATE FILE FORMAT <name>
      TYPE = { JSON | AVRO | ORC | PARQUET | XML }
      [<format options>];


SOME JSON FILE FORMAT OPTIONS

    Option              Default        Details
    COMPRESSION         Load: AUTO     Supported algorithms: GZIP, BZ2, BROTLI, ZSTD, DEFLATE,
                        Unload: GZIP   RAW_DEFLATE, NONE. If BROTLI, cannot use AUTO.
    FILE_EXTENSION      .JSON          Only used for unloading. Specifies file extension to use
    ALLOW_DUPLICATE     FALSE          Only used for loading. If TRUE, allows duplicate object field
                                       names (only the last one will be preserved)
    STRIP_OUTER_ARRAY   FALSE          Only used for loading. If TRUE, JSON parser will remove outer
                                       brackets [ ]
    STRIP_NULL_VALUES   FALSE          Only used for loading. If TRUE, JSON parser will remove object
                                       fields or array elements containing NULL
    DATE_FORMAT         AUTO           Used only for loading JSON data into separate columns. Defines
                                       the format of date string values in the data files.
    TIME_FORMAT         AUTO           Used only for loading JSON data into separate columns. Defines
                                       the format of time string values in the data files.
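
Two of these options change the shape of what gets loaded, which is easier to see on a tiny input. The sketch below imitates STRIP_OUTER_ARRAY and STRIP_NULL_VALUES using Python's json module. The function names are hypothetical and the code follows only the descriptions in the table above; it is not Snowflake's parser.

```python
import json


def drop_nulls(value):
    """Remove object fields and array elements whose value is null."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value if v is not None]
    return value


def load_json(text, strip_outer_array=False, strip_null_values=False):
    """Return the list of rows a load would produce from one JSON document."""
    document = json.loads(text)
    if strip_null_values:
        document = drop_nulls(document)
    if strip_outer_array and isinstance(document, list):
        return document        # outer [ ] removed: one row per element
    return [document]          # whole document becomes a single row


sample = '[{"city": "Oslo", "temp": null}, {"city": "Lima", "temp": 19}]'
print(len(load_json(sample)))                                  # rows without the option
print(len(load_json(sample, strip_outer_array=True)))          # rows with the option
print(load_json(sample, strip_outer_array=True, strip_null_values=True))
```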


LOAD OPTIONS FOR JSON DATA

● Copy the file in with no parsing or transformation:
  1. Create a table with a single VARIANT column
  2. COPY INTO the table

    COPY INTO <table>
      FROM <file(s) in stage>
      FILE_FORMAT = (FORMAT_NAME = <json format with options>);

● Transform into columns and load
  1. Create a table with required column names and data types
  2. COPY INTO the table with parsing and transformations

    COPY INTO <table>
      FROM (SELECT <keys> FROM <file(s) in stage>)
      FILE_FORMAT = (FORMAT_NAME = <json format with options>);


SOME PARQUET FILE FORMAT OPTIONS

    Option               Default          Details
    COMPRESSION          Load: AUTO       Supported algorithms: LZO, SNAPPY, NONE
                         Unload: SNAPPY
    SNAPPY_COMPRESSION   TRUE             Only used for unloading
    BINARY_AS_TEXT       TRUE             Only used for loading. Specifies whether to interpret columns
                                          with no defined logical data type as UTF-8 TEXT. When set to
                                          FALSE, Snowflake interprets these columns as binary data.
    TRIM_SPACE           FALSE            Only used for loading Parquet data into separate columns.
                                          Specifies whether to remove leading and trailing white space
                                          from strings.
    NULL_IF              \\N              Only used for loading Parquet data into separate columns. If
                                          used, replaces specified strings in the data load source with a
                                          SQL NULL.


LOADING PARQUET DATA

● All Parquet data is stored in a single column, referenced as $1
● Copy the raw data into a single VARIANT column:

    COPY INTO <table>
      FROM <file(s) in stage>
      FILE_FORMAT = (FORMAT_NAME = <parquet format with options>);

● Transform and copy individual columns:

    COPY INTO <table>
      FROM (SELECT $1:<col name>::<type>, $1:<col name>::<type>, ... FROM <stage>)
      FILE_FORMAT = (FORMAT_NAME = <parquet format with options>);


MATCH_BY_COLUMN_NAME

● COPY INTO option
● Load semi-structured data directly into the columns of the destination table that match columns represented in the data
● Supported for JSON, Avro, ORC, and Parquet
● Example:

    COPY INTO <table> FROM <stage>
      FILE_FORMAT = (FORMAT_NAME = <file format>)
      MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
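
The matching step can be pictured as a lookup from destination column names to keys in each record. The helper below is a hypothetical illustration of case-insensitive versus case-sensitive matching; leaving unmatched columns as None (standing in for NULL) and ignoring extra keys are assumptions of this sketch.

```python
def match_by_column_name(record, columns, case_sensitive=False):
    """Map one semi-structured record onto the destination table's columns."""
    if case_sensitive:
        return {column: record.get(column) for column in columns}
    lowered = {key.lower(): value for key, value in record.items()}
    return {column: lowered.get(column.lower()) for column in columns}


record = {"City": "Lima", "TEMP": 19, "extra": "ignored"}   # made-up record
columns = ["CITY", "TEMP", "HUMIDITY"]                      # made-up table columns

print(match_by_column_name(record, columns))
print(match_by_column_name(record, columns, case_sensitive=True))
```

With case-sensitive matching only TEMP lines up in this example, which shows why the case-insensitive setting is the more forgiving choice for data whose key casing is not controlled.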


UNLOADING SEMI-STRUCTURED DATA

[Diagram: data flow]
    1. Unload Data from tables into stage
    2. Retrieve files from stage
    3. Output files


FILE FORMATS FOR UNLOAD

● JSON:
  ○ Data must be unloaded from a column of VARIANT data type
● Parquet:
  ○ Use a SELECT statement to unload a table to Parquet as multiple columns


LAB EXERCISE
Semi-Structured Data
40 minutes

Exercise: Load and Unload Semi-Structured Data

Tasks:
● Load Semi-Structured Parquet Data
● Load Semi-Structured JSON Data
● Unload Semi-Structured JSON Data


CONTINUOUS DATA PROTECTION


MODULE AGENDA

● Cloning
● Time Travel
● Fail Safe
● Database Replication


CLONING

ZERO-COPY CLONING

● Quickly take a "snapshot" of any table, schema, or database
● When the clone is created:
  ○ All micro-partitions in both tables are fully shared
  ○ Micro-partition storage is owned by the oldest table; the clone references it
● No additional storage costs until changes are made to the original or the clone
● Often used to quickly spin up Dev or Test environments
● Effective "backup" option as well

[Diagram: an original and its CLONE, diverging after SOME CHANGES]

CLONING EXAMPLE

[Diagram: three layers (Cloud Services, Compute, Storage) shown at three stages]

BEFORE CLONE
● Table A1 in the Cloud Services layer references Micro-Partitions A, B, C, and D in the Storage layer

TABLE A1 IS CLONED TO TABLE A2
● The metadata logical object is copied at the Cloud Services layer only
● Table A1 and Table A2 reference the same Micro-Partitions A, B, C, and D

TABLE A1 IS CLONED TO TABLE A2 (after DML)
● DML resulted in a micro-partition being rewritten (as Micro-Partition E) and applied to Table A2
● The pathway to the unused micro-partition is kept for Time Travel
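
The three stages of the cloning example can be imitated with tables that hold references to immutable micro-partitions. This is a toy model: the class and function names are hypothetical, and the choice of Micro-Partition D as the one that gets rewritten is an assumption, since the extracted slide text does not say which partition was affected.

```python
class Table:
    """Toy table: an ordered list of references to immutable micro-partitions."""

    def __init__(self, partitions):
        self.partitions = list(partitions)

    def clone(self):
        # Zero-copy: only the list of references is copied, never the data.
        return Table(self.partitions)

    def rewrite(self, old, new):
        # DML never edits a micro-partition in place; it points to a new one.
        self.partitions = [new if p == old else p for p in self.partitions]


def stored_partitions(*tables):
    """Distinct micro-partitions that must exist in storage for these tables."""
    return {p for table in tables for p in table.partitions}


a1 = Table(["A", "B", "C", "D"])
a2 = a1.clone()
print(sorted(stored_partitions(a1, a2)))   # still four partitions after the clone

a2.rewrite("D", "E")                       # hypothetical DML on the clone
print(a1.partitions, a2.partitions)
print(sorted(stored_partitions(a1, a2)))   # one extra partition is now stored
```

Storage grows only by the partition that was rewritten, which is the reason a fresh clone adds no storage cost.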


HOW DOES IT WORK?

    CREATE DATABASE test_db
      CLONE prod_db;

[Diagram: PROD_DB cloned to TEST_DB]


TIME TRAVEL

PROBLEM
RECOVERING FROM MISTAKES

● User errors
● System errors
● Backup itself is a time-consuming task
● Specialized skill and management overhead


SOLUTION: TIME TRAVEL

● Access historical data at any point within a defined retention period
● UNDO common mistakes
● Protect against accidental or intentional modification, removal, or corruption
  ○ Fix drops, deletes, edits
● Backup/Restore from time or ID


TIME TRAVEL SQL EXTENSIONS

● Instantly bring back deleted tables, schemas, and databases
● Restore or duplicate data from key points in the past:
  ○ Point-in-time
  ○ Prior to a specific query ID
● Set retention times at the table, schema, database, or account level

[Illustration: a user runs DROP TABLE PAYROLL; and thinks "I probably shouldn't have done that!"]

    UNDROP <object>
    CREATE <object> CLONE... AT|BEFORE
    SELECT... FROM <table> AT|BEFORE
    DATA_RETENTION_TIME_IN_DAYS


TIME TRAVEL

● Available for databases, schemas, and tables
● Configuration retention option:
  ○ DATA_RETENTION_TIME_IN_DAYS
  ○ Disable by setting retention to 0
● SQL extensions:
  ○ AT | BEFORE - querying clause
  ○ UNDROP - recovery

[Diagram: table versions V1, V2, V3 created by new data and modified data; earlier versions are queried with SELECT * FROM tbl BEFORE <queryID> and SELECT * FROM tbl AT <timestamp>]


CREATE TABLE WITH TIME TRAVEL

● Automatic with default retention period:

    CREATE TABLE my_table (c1 int);

● Customizable retention period:

    CREATE TABLE my_table (c1 int)
      DATA_RETENTION_TIME_IN_DAYS = 90;

    ALTER TABLE my_table
      SET DATA_RETENTION_TIME_IN_DAYS = 30;


QUERYING WITH TIME TRAVEL

Query clauses that support Time Travel actions

● AT a specific time

SELECT * FROM my_table1
AT(TIMESTAMP => 'Fri, 01 May 2015 16:20:00 -0700'::timestamp_tz);

● BEFORE a specific query

SELECT * FROM my_table1
BEFORE(STATEMENT => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');


DML EXAMPLES

● Cloning Historical Objects

CREATE TABLE restored_table CLONE my_table1
AT(TIMESTAMP => 'Sat, 09 May 2015 01:01:00 +0300'::timestamp_tz);

CREATE DATABASE restored_db CLONE my_db
BEFORE(STATEMENT => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');

● Restoring Objects

UNDROP TABLE | SCHEMA | DATABASE <name>;


HOW DOES IT WORK?

● Micro-partitions!
● Micro-partitions are immutable
● When data is changed, new versions of the micro-partitions are created
● Snowflake keeps the older versions for the specified retention time
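
The idea of immutable versions kept for a retention period can be sketched in a few lines of Python. This is a toy model, not Snowflake's implementation: the class names `PartitionVersion` and `VersionedTable` are hypothetical, and a whole table is treated as a single partition for simplicity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PartitionVersion:
    """One immutable snapshot of the rows (hypothetical model)."""
    rows: tuple
    created_at: datetime


@dataclass
class VersionedTable:
    retention: timedelta
    versions: list = field(default_factory=list)

    def write(self, rows, now):
        # A change never edits an old version; it appends a new one.
        self.versions.append(PartitionVersion(tuple(rows), now))

    def read_at(self, when, now):
        """Return the rows as they were at `when`, if still retained."""
        if now - when > self.retention:
            raise LookupError("requested time is outside the retention period")
        candidates = [v for v in self.versions if v.created_at <= when]
        if not candidates:
            raise LookupError("table did not exist at that time")
        return candidates[-1].rows

    def purge(self, now):
        # Keep the current version, plus superseded versions that were
        # still current at some point inside the retention window.
        cutoff = now - self.retention
        last = len(self.versions) - 1
        self.versions = [
            v for i, v in enumerate(self.versions)
            if i == last or self.versions[i + 1].created_at >= cutoff
        ]


t0 = datetime(2020, 5, 1)
tbl = VersionedTable(retention=timedelta(days=3))
tbl.write([1, 2], t0)                          # v.1
tbl.write([1, 2, 3], t0 + timedelta(days=1))   # v.2 supersedes v.1
old_rows = tbl.read_at(t0 + timedelta(hours=12), now=t0 + timedelta(days=2))
print(old_rows)
```

Reading "as of" a past time only selects which retained version to return; nothing is rewritten, which is why the lookup is cheap in this model.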


TIME TRAVEL OVERVIEW

[Figure sequence (five slides): timeline for DATA_RETENTION_TIME_IN_DAYS = 3, showing micro-partition versions v.1 and v.2 across Day 1, Day 2, and Day 3.]


FAILSAFE


FAILSAFE OVERVIEW

[Figure: timeline for DATA_RETENTION_TIME_IN_DAYS = 3, showing versions v.2 and v.1 across Day 1, Day 2, and Day 3, followed by a Failsafe (7 Days) period.]


FAILSAFE STORAGE

● Non-configurable, 7-day retention for historical data after Time Travel expiration
● Only accessible by Snowflake personnel
● Admins can view fail-safe use in the Snowflake Web UI under Account > Billing & Usage
● Not supported for temporary or transient tables
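
As a worked example of how the two periods stack, the sketch below computes when changed or dropped data leaves Time Travel and when it leaves Failsafe. The function name `protection_windows` and the `permanent` flag are illustrative assumptions; the 7-day figure and the exclusion of temporary and transient tables come from the slide above.

```python
from datetime import datetime, timedelta

FAIL_SAFE_DAYS = 7  # fixed per the slide; not configurable


def protection_windows(changed_at, retention_days, permanent=True):
    """Return (time_travel_ends, fail_safe_ends) for data changed at `changed_at`.

    Temporary and transient tables have no Failsafe period, so both
    timestamps coincide for them.
    """
    time_travel_ends = changed_at + timedelta(days=retention_days)
    fail_safe = timedelta(days=FAIL_SAFE_DAYS) if permanent else timedelta(0)
    return time_travel_ends, time_travel_ends + fail_safe


tt_end, fs_end = protection_windows(datetime(2020, 5, 1), retention_days=3)
print(tt_end.date(), fs_end.date())
```

With a 3-day retention, data changed on 1 May is queryable through Time Travel until 4 May and recoverable by Snowflake personnel until 11 May.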


DATA PROTECTION OPTIONS BY EDITION

Snowflake Edition    Time Travel (1 day)    Time Travel (90 days max)    Failsafe (7 days)
Standard             ✓                      -                            ✓
Premium              ✓                      -                            ✓
Enterprise           -                      ✓                            ✓
Business Critical    -                      ✓                            ✓
VPS                  -                      ✓                            ✓


LAB EXERCISE

Continuous Data Protection

30 minutes

Tasks:

● Clone database objects
● DROP and UNDROP a table
● Recover a table to a time before a change was made
● Object naming constraints


DATABASE REPLICATION AND FAILOVER


WHAT IS REPLICATION?

● Keeps database objects and stored data synchronized across multiple accounts (within the same organization)
● Unit of replication is a database (permanent or transient)
● Secondary database is read-only

[Figure: Primary Database (Source) replicated to Secondary Database (Destination).]


REPLICATION USE CASES

● Business Continuity & Disaster Recovery
● Secure Data Sharing Across Regions / Clouds
● Data Portability for Account Migrations


REPLICATED DATABASE OBJECTS

(Business Continuity & Disaster Recovery)

● Tables
    ○ Permanent
    ○ Transient
    ○ Temporary
    ○ Clustered
    ○ Constraints
● Sequences
● Views
    ○ Standard
    ○ Materialized
● Stored Procedures
● User-Defined Functions (UDF)


DATABASE REPLICATION AND ENCRYPTION

● Snowflake encrypts database files in transit from the source account to the target account
● If Tri-Secret Secure is enabled (for the source and target accounts), the files are encrypted using the public key for an encryption key pair that is protected by the account master key (AMK) for your target account


COMPUTE FOR DATABASE REPLICATION

● Replication operations use Snowflake-provided compute resources to copy data between accounts
● Replication utilization is shown as a special Snowflake-provided warehouse named REPLICATION
● To view replication usage, query either of the following:
    ○ REPLICATION_USAGE_HISTORY table function
    ○ REPLICATION_USAGE_HISTORY view


DATA TRANSFER FOR DATABASE REPLICATION

● Initial database replication and subsequent synchronization operations incur data transfer charges
● Rate is determined by the location of the source and target accounts, and the cloud provider
● Data transfer usage is shown under Billing & Usage > Data Transfer


DATA SHARING


MODULE AGENDA

● Overview
● Data Providers
● Data Consumers
● Reader Accounts
● Data Exchange


OVERVIEW


A BETTER WAY TO SHARE DATA

[Figure: Data Providers connected to Data Consumers.]

Live Access: data consumers immediately see all updates


INSTANT, LIVE DATA SHARING

● No data movement; nothing is copied
● Data consumers immediately see all updates
● Share with an unlimited number of consumers


DATA SHARING ACCOUNTS

● Data Providers: share data with others
● Data Consumers: access shared data with their own Snowflake account
● Reader Accounts: query data using compute from the data provider's account


WHAT IS A SHARE?

● A securable object in Snowflake
● Encapsulates all the information required to share objects
● Consists of:
    ○ Privileges that grant access to the database and schema containing the objects to share
    ○ Privileges that grant access to specific objects
    ○ The account(s) with which the objects are shared


ABOUT SHARES

● Shares are read-only
● Tables, secure views, and secure UDFs can be shared
● Access to a share can be revoked at any time
● Consumers can create new tables from a share


SHARING SECURE VIEWS WITH ACCESS CONTROLS

Data Provider tables:

partner_data
id     sales    commission
100    2,350    11.75
100    1,975    49.375
200    3,459    8.6475
200    9,156    68.67

acct_map
id     acct_name
100    ABC
200    XYZ

CREATE SECURE VIEW shared_data
AS SELECT pd.* FROM partner_data pd
JOIN acct_map am ON pd.id = am.id
AND am.acct_name = CURRENT_ACCOUNT();

GRANT SELECT ON VIEW shared_data TO SHARE shr_sales;

ALTER SHARE shr_sales ADD ACCOUNTS=ABC,XYZ;

Data Consumers, each querying shared_data through share shr_sales:

Account = ABC
id     sales    commission
100    2,350    11.75
100    1,975    49.375

Account = XYZ
id     sales    commission
200    3,459    8.6475
200    9,156    68.67
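
The row filtering performed by the view can be tried locally with SQLite as a stand-in. This is a sketch under stated assumptions: SQLite has no secure views and no CURRENT_ACCOUNT(), so a query parameter plays the role of the consumer's account name, and the helper `rows_visible_to` is hypothetical.

```python
import sqlite3


def rows_visible_to(account):
    """Return the partner_data rows that `account` would see (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE partner_data (id INTEGER, sales INTEGER, commission REAL);
        CREATE TABLE acct_map (id INTEGER, acct_name TEXT);
        INSERT INTO partner_data VALUES
            (100, 2350, 11.75), (100, 1975, 49.375),
            (200, 3459, 8.6475), (200, 9156, 68.67);
        INSERT INTO acct_map VALUES (100, 'ABC'), (200, 'XYZ');
    """)
    # The "?" parameter stands in for CURRENT_ACCOUNT() in the slide's view.
    rows = conn.execute(
        """
        SELECT pd.id, pd.sales, pd.commission
        FROM partner_data pd
        JOIN acct_map am ON pd.id = am.id
        AND am.acct_name = ?
        ORDER BY pd.sales
        """,
        (account,),
    ).fetchall()
    conn.close()
    return rows


print(rows_visible_to("ABC"))
print(rows_visible_to("XYZ"))
```

An account that has no row in acct_map matches nothing in the join, so it sees an empty result rather than an error.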


DATA PROVIDERS


DATA SHARING ACCOUNTS

[Figure: the three account types (Data Providers, Data Consumers, Reader Accounts), with Data Providers described: share data with others.]


DATA PROVIDERS

● A Snowflake account that creates shares and makes them available for others to consume
● An unlimited number of shares can be created, and an unlimited number of accounts can be added to a share
● Grants provide granular access control to selected objects, including at the row level (using filters)


CREATE A SHARE

Data Provider (ProvXyz)

USE ROLE ACCOUNTADMIN;

CREATE SHARE share1; -- empty share

GRANT USAGE ON DATABASE sales TO SHARE share1; -- add database
GRANT USAGE ON SCHEMA sales.east TO SHARE share1; -- add schema
GRANT SELECT ON TABLE sales.east.accts TO SHARE share1; -- add table

ALTER SHARE share1 ADD ACCOUNTS=Cons123; -- add account


CONSIDERATIONS FOR PROVIDERS

● Create shares with a role that has the CREATE SHARE privilege (such as the ACCOUNTADMIN role)
● Consumer accounts must be in the same region
● Only tables, secure views, and secure UDFs are supported in shares at this time
● New or modified data in a share is immediately available to all consumers
● New objects created in a shared database are not shared automatically; you must grant privileges on them to the share before consumers can see them


DATA CONSUMERS


DATA SHARING ACCOUNTS

[Figure: the three account types (Data Providers, Data Consumers, Reader Accounts), with Data Consumers described: access shared data with their own Snowflake account.]


DATA CONSUMERS

● A Snowflake account that accesses a share from another account
● Create a local database to "hold" the share
    ○ Can only be done once
● Can consume an unlimited number of shares
● Are charged for their own compute on that share


CONSUME A SHARE

Data Consumer (Cons123)

USE ROLE ACCOUNTADMIN;

CREATE DATABASE From_provider
FROM SHARE ProvXyz.share1; -- read-only shared database

USE DATABASE From_provider; -- switch to read-only database

SELECT * FROM east.accts; -- same as querying any other database in your account


CONSIDERATIONS FOR CONSUMERS

● Administer shares with a role that has the IMPORT SHARE privilege (such as the ACCOUNTADMIN role)
● A share can only be consumed once per account
● Shared databases are read-only
● Shared data can be copied into a table in the consumer account (but cannot be cloned)
● You cannot forward (re-share) shared databases
● Time Travel is not supported for shared databases


LAB EXERCISE

DEMO: Data Sharing

15 minutes


THE DATA EXCHANGE


LAB EXERCISE

DEMO: Data Exchange

15 minutes


READER ACCOUNTS


DATA SHARING ACCOUNTS

[Figure: the three account types (Data Providers, Data Consumers, Reader Accounts), with Reader Accounts described: query data using compute from the data provider's account.]


READER ACCOUNTS

● For a consumer who does not already have a Snowflake account
● Accounts are created by providers
● Compute credits are funded by the provider
● Consumer cannot write data to the account
● Allows a consumer to experiment with data sharing without a Snowflake contract


CONSIDERATIONS FOR READER ACCOUNTS

● Requires additional configuration after creating the share and the reader account
    ○ Create a database from the share
    ○ Configure Users, Roles, Security for access to data in the reader account
    ○ Create Virtual Warehouses for Reader use (paid for by the Data Provider)
    ○ Set default role and warehouses for logins
    ○ Set up a resource monitor
● Provider is responsible for support for the reader accounts


SET UP A READER ACCOUNT

Data Provider (ProvXyz)

USE ROLE ACCOUNTADMIN;

CREATE SHARE share1; -- empty share

GRANT USAGE ON DATABASE sales TO SHARE share1; -- add database
GRANT USAGE ON SCHEMA sales.east TO SHARE share1; -- add schema
GRANT SELECT ON TABLE sales.east.accts TO SHARE share1; -- add table

CREATE MANAGED ACCOUNT <name> -- create reader account
ADMIN_NAME=<username>, ADMIN_PASSWORD='<pw>',
TYPE=READER;

ALTER SHARE share1 ADD ACCOUNTS=<account>; -- add account


LAB EXERCISE

DEMO: Reader Accounts and Secure Views

20 minutes

● Create a share
● Create a reader account with secure views
● Demonstrate secure views with different accounts


PERFORMANCE & CONCURRENCY


MODULE AGENDA

● SQL Performance Tips
● Data Clustering
● Virtual Warehouse Scaling


SQL PERFORMANCE TIPS


TYPICAL DATABASE ORDER OF EXECUTION

Rows        Groups        Result
● FROM      ● GROUP BY    ● SELECT
● JOIN      ● HAVING      ● DISTINCT
● WHERE                   ● ORDER BY
                          ● LIMIT


TOP PERFORMANCE TIP

● Row operations are performed before GROUP operations
● Check row operations first
    ○ First check FROM and WHERE clauses
    ○ Then check GROUP BY and HAVING clauses
● Use appropriate filters, as early as possible


ADD LIMIT TO LARGE ORDER BY

Without LIMIT:

SELECT * FROM LINEITEM
ORDER BY L_QUANTITY;

With LIMIT:

SELECT * FROM LINEITEM
ORDER BY L_QUANTITY LIMIT 10;


JOIN ON UNIQUE KEYS

Best Practices

● Ensure keys are distinct
● Understand relationships between your tables before joining
● Avoid many-to-many join
● Avoid unintentional cross join

Troubleshooting Scenario

● Joining on non-unique keys can explode your data output (join explosion)
    ○ Each row in table1 matches multiple rows in table2
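
A join explosion is easy to reproduce on a toy dataset. The sketch below uses SQLite from Python purely for illustration; the `orders` and `customers` tables and their contents are made up, and the same effect applies to any SQL engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 10), (1, 20), (2, 30);
    -- customer_id 1 appears twice, so the join key is not unique
    INSERT INTO customers VALUES (1, 'Ann'), (1, 'Ann (duplicate)'), (2, 'Bob');
""")


def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]


# Check key uniqueness before joining: these two counts should match.
customer_rows = scalar("SELECT COUNT(*) FROM customers")
distinct_keys = scalar("SELECT COUNT(DISTINCT customer_id) FROM customers")

true_total = scalar("SELECT SUM(amount) FROM orders")
joined_rows = scalar(
    "SELECT COUNT(*) FROM orders o JOIN customers c USING (customer_id)"
)
inflated_total = scalar(
    "SELECT SUM(o.amount) FROM orders o JOIN customers c USING (customer_id)"
)

print(customer_rows, distinct_keys)  # 3 rows but only 2 distinct keys
print(joined_rows)                   # 5 joined rows from 3 orders
print(true_total, inflated_total)    # 60 versus 90
```

Comparing COUNT(*) with COUNT(DISTINCT key) on the lookup table is a cheap check to run before the join; a mismatch means aggregates computed after the join will be inflated.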


SNOWFLAKE BUILT-IN OPTIMIZATIONS

● Snowflake provides patented micro-partition pruning to optimize query performance:
    ○ Static partition pruning based on columns in WHERE clause
    ○ Dynamic partition pruning based on JOIN columns of a query
● Best practices to assist SQL optimizer:
    ○ Apply appropriate filters as early as possible in the query
    ○ For naturally clustered tables, apply appropriate predicate columns (e.g. date columns) that have a high correlation to ingestion order


PARTITION PRUNING

Effective pruning: query's filter column matches table's clustering order

SELECT <items>
FROM orders
WHERE o_orderdate >= to_date('1993-10-01')
AND o_orderdate < dateadd(month, 3, to_date('1993-10-01'));
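
Why a matching clustering order helps can be shown with a small model. The sketch assumes pruning compares the filter range with per-partition minimum and maximum values; the `MicroPartition` class and the sample date ranges are hypothetical, not Snowflake internals.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MicroPartition:
    name: str
    min_date: date  # smallest o_orderdate stored in the partition
    max_date: date  # largest o_orderdate stored in the partition


def partitions_to_scan(partitions, lo, hi):
    """Keep partitions whose [min, max] range can contain lo <= d < hi."""
    return [p.name for p in partitions if p.max_date >= lo and p.min_date < hi]


lo, hi = date(1993, 10, 1), date(1994, 1, 1)  # the three-month filter above

# Loaded in date order: each partition covers a narrow date range.
clustered = [
    MicroPartition("p1", date(1993, 7, 1), date(1993, 9, 30)),
    MicroPartition("p2", date(1993, 10, 1), date(1993, 12, 31)),
    MicroPartition("p3", date(1994, 1, 1), date(1994, 3, 31)),
]
# Loaded in random order: every partition spans nearly the whole period.
scattered = [
    MicroPartition("p1", date(1993, 7, 2), date(1994, 3, 30)),
    MicroPartition("p2", date(1993, 7, 1), date(1994, 3, 31)),
    MicroPartition("p3", date(1993, 7, 5), date(1994, 3, 29)),
]

print(partitions_to_scan(clustered, lo, hi))
print(partitions_to_scan(scattered, lo, hi))
```

In the clustered layout only one partition overlaps the filter; in the scattered layout every partition's range overlaps it, so none can be skipped.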


PREDICATES AND PERFORMANCE

● Built-in functions and UDFs are very useful, but they can impact performance when used in a query predicate

SELECT l_orderkey
FROM lineitem l, orders o
WHERE l_orderkey=o_orderkey AND
LOG(10, l_extendedprice) > 4.5 AND
LOG(10, o_totalprice - l_tax) > 4.5;

● Consider materializing the intermediate result using a temporary table


PITFALLS OF NON-PERFORMING PREDICATES

Some WHERE predicates provide no opportunity to optimize pruning

SELECT c_customer_id, c_last_name
FROM tpcds.customer
WHERE UPPER(c_last_name) LIKE '%KROLL%';

[Figure: query profile annotated "no pruning".]


UNCORRELATED VS CORRELATED SUBQUERIES

● Uncorrelated scalar subqueries (with no external column references)

SELECT p.name, p.annual_wage, p.ctry FROM pay AS p
WHERE p.annual_wage <
(SELECT per_capita_gdp FROM intl_gdp
WHERE country = 'Brazil');

● Correlated scalar subqueries in WHERE clause

SELECT p.name, p.annual_wage, p.ctry FROM pay AS p
WHERE p.annual_wage <
(SELECT MAX(per_capita_gdp) FROM intl_gdp i
WHERE p.ctry = i.country);
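
The two subquery forms return different rows because the uncorrelated one is evaluated once, while the correlated one depends on each outer row. The sketch below runs both slide queries against SQLite with made-up sample data; only the table and column names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pay (name TEXT, annual_wage INTEGER, ctry TEXT);
    CREATE TABLE intl_gdp (country TEXT, per_capita_gdp INTEGER);
    INSERT INTO pay VALUES
        ('Ana', 5000, 'Brazil'), ('Bo', 70000, 'Norway'), ('Cy', 12000, 'Brazil');
    INSERT INTO intl_gdp VALUES ('Brazil', 9000), ('Norway', 75000);
""")

# Uncorrelated: the subquery has one fixed answer (Brazil's GDP) for all rows.
uncorrelated = [row[0] for row in conn.execute("""
    SELECT p.name FROM pay AS p
    WHERE p.annual_wage <
        (SELECT per_capita_gdp FROM intl_gdp WHERE country = 'Brazil')
    ORDER BY p.name
""")]

# Correlated: the subquery references p.ctry, so its value varies per outer row.
correlated = [row[0] for row in conn.execute("""
    SELECT p.name FROM pay AS p
    WHERE p.annual_wage <
        (SELECT MAX(per_capita_gdp) FROM intl_gdp i WHERE p.ctry = i.country)
    ORDER BY p.name
""")]

print(uncorrelated)
print(correlated)
```

Bo earns more than Brazil's per-capita GDP but less than Norway's, so Bo appears only in the correlated result.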


GROUP BY WITH FEW DISTINCT VALUES

SELECT
    l_returnflag,
    l_linestatus,
    SUM(l_quantity)
FROM lineitem
GROUP BY
    l_returnflag,
    l_linestatus;


GROUP BY WITH MANY DISTINCT VALUES

● Memory-intensive
● Spilling to disk, high network data
● Sub-optimal performance


ORDER BY

● Orders values in ascending or descending order on a specified column
● Use ORDER BY in top level SELECT only
    ○ ORDER BY in subqueries does not impact the result, and slows performance

[Figure: two query examples, labeled "Wasted Compute" and "Best Practice".]


LAB EXERCISE

Performance & Concurrency

15 minutes

Exercise: Review the Query Profile

Tasks:

● Run a query
● Review the query profile


DATA CLUSTERING


NATURAL DATA CLUSTERING

● As data is loaded into tables, micro-partitions are created based on ingestion order
● Ingestion date may highly correlate with one or more columns
    ○ Tables with a sequential field
    ○ Tables with a date field

ID    NICKNAME     YOB     DATE                    ORDER_NUMBER      ITEM_TYPE
1     PRISCILLA    1962    22-Jul-2019 23:29:07    2019_072111356    Office Supplies
2     DAWG         2001    22-Jul-2019 23:32:00    2019_072111357    Office Furniture
3     JOKER        1997    22-Jul-2019 23:44:56    2019_072111358    Services
4     PRINCESS     1998    22-Jul-2019 23:59:01    2019_072111359    Office Supplies
5     SNEEZY       1983    23-Jul-2019 00:01:07    2019_072111360    Printers


WHAT DETERMINES NATURAL CLUSTERING?

● A single data load reads source data and writes it into some number of micro-partitions
● The source data organization determines what range of values are represented in the micro-partitions
● Examples:
    ○ Source data contains rows from a specific day: that day's data will be in contiguous micro-partitions
    ○ Source data contains rows for a month or year: those micro-partitions will have high/low values for dates within that month or year
    ○ Source data is ordered alphabetically on a "last name" column: contiguous micro-partitions will be in alphabetical order


CLUSTERED TABLES AND CLUSTERING KEYS

● All tables have a "natural" clustering based on ingestion time
● Natural clustering can be overridden by specifically designating clustering "keys"
    ○ Clustering keys can be columns or expressions
    ○ Use 1-3 keys maximum, in order of low to high cardinality. More keys are not "better"
● After clustering, like data (by key) is co-located in the same micro-partitions
● Snowflake keeps the clustering order updated in the background, billed on a per-second basis


WHAT TABLES SHOULD BE CLUSTERED?

● Clustering keys are not for every table!
    ○ Automatic clustering consumes credits
    ○ Reclustering also increases storage costs
    ○ Can be less expensive to add cluster key after loading the table
    ○ Automatic Clustering can be disabled
● Good candidates for clustering keys:
    ○ Tables in the multi-terabyte range experience the most benefit from clustering (particularly if DML is performed regularly on these tables)
    ○ Tables that change infrequently are less expensive to recluster
    ○ Tables whose query performance degrades noticeably over time


CLUSTERING COMMAND SAMPLES

CREATE TABLE t1 (c1 date, c2 string, c3 number)
CLUSTER BY (c1, c2);

CREATE TABLE t2 (c1 timestamp, c2 string, c3 number)
CLUSTER BY (TO_DATE(c1), SUBSTRING(c2, 0, 10));

ALTER TABLE t1
CLUSTER BY (c1, c3);

ALTER TABLE t2
CLUSTER BY (SUBSTRING(c2, 5, 10), TO_DATE(c1));


LAB EXERCISE

Explore Natural Clustering and Clustering Keys

20 minutes

Notes:

● In this lab, you will observe how the organization of rows in micro-partitions affects query performance.

Tasks:

● Query a table, review how many partitions are scanned
● Create a new table, partitioned differently than the original
● Query the new table, review performance differences and telemetry (partitions read)


VIRTUAL WAREHOUSE SCALING


SCALING UP VS. SCALING OUT

SCALING UP
Resizing a warehouse to handle complex/process-intensive queries
[Figure: one warehouse resized from X-Small to Small to Medium.]

SCALING OUT
Adding more clusters to your current Virtual Warehouse, without changing the size of the Warehouse, to handle concurrency issues.
[Figure: Small clusters added to one warehouse; MINIMUM: 1, MAXIMUM: 10.]


SERVERS PER CLUSTER

Warehouse Size    Servers    Clusters
X-Small           1          1
Small             2          1
Medium            4          1
Large             8          1
X-Large           16         1
2X-Large          32         1
3X-Large          64         1
4X-Large          128        1

● As queries are submitted, required resources are calculated and reserved
● If there are insufficient resources, that query is queued until other queries finish and release resources
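
The table follows a simple doubling rule: each size step doubles the servers in a cluster. A minimal sketch of that arithmetic is shown below; the helper names are hypothetical, and the total for a multi-cluster warehouse assumes every cluster has the warehouse's size.

```python
SIZES = ["X-Small", "Small", "Medium", "Large",
         "X-Large", "2X-Large", "3X-Large", "4X-Large"]


def servers_per_cluster(size):
    """Servers in one cluster: doubles with every step up in size."""
    return 2 ** SIZES.index(size)


def total_servers(size, clusters=1):
    """Servers across all running clusters of a (multi-cluster) warehouse."""
    return servers_per_cluster(size) * clusters


for size in SIZES:
    print(f"{size:<9} {servers_per_cluster(size):>4}")

print(total_servers("Small", clusters=10))
```

Scaling up moves along the size list, while scaling out changes only the `clusters` multiplier.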


SCALING OUT: MULTI-CLUSTER WAREHOUSES

● Multi-cluster warehouses are designed specifically for handling queuing and performance issues related to large numbers of concurrent users and/or queries.
● Multi-cluster warehouses can help automate scaling out if your number of users/queries tends to fluctuate.

[Figure: a warehouse with MINIMUM: 1 and MAXIMUM: 10 clusters, run in one of two modes: (1) MAXIMIZED or (2) AUTO-SCALING.]


SCALING OUT: MAXIMIZED

When the warehouse is started, Snowflake starts all the clusters so that maximum resources are available while the warehouse is running.

Virtual Warehouse A: MINIMUM = 10, MAXIMUM = 10
Virtual Warehouse B: MINIMUM = 6, MAXIMUM = 6


SCALING OUT: AUTO-SCALING

Allows Snowflake to start and stop clusters as needed to dynamically manage the load on the warehouse


AUTO-SCALING: SCALING POLICY

Standard
    Description: Minimizes queuing by starting additional clusters.
    Cluster starts: Immediately when either a query is queued or the system detects that there's one more query than the currently running clusters can execute.
    Cluster shuts down: After 2-3 consecutive checks (performed at 1-minute intervals) determine that the load on the least-loaded cluster could be redistributed to the others without spinning up the cluster again.

Economy
    Description: Favors keeping running clusters fully loaded rather than starting additional clusters; could result in jobs being queued rather than additional clusters being started.
    Cluster starts: Only if the system estimates there's enough query load to keep the cluster busy for at least 6 minutes.
    Cluster shuts down: After 5-6 consecutive checks (performed at 1-minute intervals) determine that the load on the least-loaded cluster could be redistributed to the others without spinning up the cluster again.
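The policy is a property of the warehouse. The sketch below assumes an existing multi-cluster warehouse, hypothetically named `demo_autoscale_wh`, and switches it between the two policies.

```sql
-- Prefer conserving credits over starting clusters quickly.
ALTER WAREHOUSE demo_autoscale_wh SET SCALING_POLICY = 'ECONOMY';

-- Check the current policy, the cluster range, and how many clusters are started.
SHOW WAREHOUSES LIKE 'demo_autoscale_wh';

-- Return to the default behavior, which prefers starting clusters over queuing.
ALTER WAREHOUSE demo_autoscale_wh SET SCALING_POLICY = 'STANDARD';
```

The policy only matters in auto-scale mode. In maximized mode every cluster is already running, so there is nothing to start or stop.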


SCALING OUT POLICY: STANDARD

[Diagram, three builds: (1) an X-Small warehouse with Cluster A (Server A) processing queries; (2) additional queries are queued; (3) Cluster B (Server B) starts and begins processing the queued queries]


SCALING OUT POLICY: ECONOMY

[Diagram, five builds: an X-Small warehouse with Cluster A processing while queries are queued. Question: "Will the current query load keep this cluster/server busy for longer than 6 minutes?" NO: do nothing. YES: spin up another cluster (Cluster B, Server B).]


SCALING BACK POLICY: STANDARD

[Diagram, four builds: an X-Small warehouse running Clusters A, B, and C (Servers A, B, C). At T0, T0 + 1 minute, and T0 + 2 minutes the same check is made: "Could the load on the least-busy cluster be redistributed to the other clusters?" Each check branches to NO or YES; after YES at all three checks: Scale Back!]
SCALING BACK POLICY: ECONOMY

[Diagram, two builds: an X-Small warehouse running Clusters A, B, and C (Servers A, B, C). At T0, T0 + 1 minute, T0 + 2 minutes, and so on through T0 + 5 minutes the same check is made: "Could the load on the least-busy cluster be redistributed to the other clusters?" Each check branches to NO or YES; after YES at every check: Scale Back!]
ARE THESE WAREHOUSES EQUIVALENT?

Auto-Scale, Multi-Cluster X-Small Warehouse
● Minimum of 1
● Maximum of 4

Medium Warehouse
● One cluster of 4 servers


ALTERING QUERY BEHAVIOR

STATEMENT_QUEUED_TIMEOUT_IN_SECONDS
● How long a queued query will wait before being cancelled by the system
● Can be set for Account/User/Session or Virtual Warehouse
● Default: 0 (no timeout)

STATEMENT_TIMEOUT_IN_SECONDS
● How long a query can run before being cancelled by the system
● Can be set for Account/User/Session or Virtual Warehouse
● Default: 2 days (172800 seconds)
    ○ Deployment best practice: lower it at the account level to your preference
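A short sketch of how these parameters might be set at different levels. The warehouse name `demo_wh` and the timeout values are illustrative assumptions, not recommendations from the course.

```sql
-- Lower the account-wide run-time limit from the 2-day default to 1 hour.
-- Changing account parameters requires an account administrator role.
ALTER ACCOUNT SET STATEMENT_TIMEOUT_IN_SECONDS = 3600;

-- On one warehouse, cancel queries that wait in the queue longer than 10 minutes.
ALTER WAREHOUSE demo_wh SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600;

-- Tighten the run-time limit for the current session only.
ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = 900;

-- Inspect the values in effect on the warehouse.
SHOW PARAMETERS LIKE 'STATEMENT%TIMEOUT%' IN WAREHOUSE demo_wh;
```

When a timeout is set on both the warehouse and the session, Snowflake enforces the lowest non-zero value.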


LAB EXERCISE

Determine Appropriate Warehouse Sizes

30 minutes

Tasks:
● JOIN using an XSmall warehouse
● JOIN using a Medium warehouse


LAB RECAP

● Below are the results from running the same query against various warehouse sizes.
● Which warehouse size would you recommend, and why?

Warehouse Size   Query Time   Credits
Small            24m 30s      .882
Medium           10m 00s      .660
Large            5m 00s       .660
XLarge           2m 40s       .704
XXLarge          1m 40s       .890
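The Credits column can be reproduced by multiplying each run time in seconds by a per-second credit rate. The sketch below assumes the rounded per-second rates from this deck's warehouse billing table (0.0006 for Small up to 0.0089 for 2X-Large). The inline `VALUES` list simply restates the lab results.

```sql
-- credits = run time in seconds * credits per second (rounded rates)
SELECT warehouse_size,
       run_seconds,
       credits_per_second,
       run_seconds * credits_per_second AS credits
FROM (VALUES
        ('Small',   1470, 0.0006),   -- 24m 30s
        ('Medium',   600, 0.0011),   -- 10m 00s
        ('Large',    300, 0.0022),   --  5m 00s
        ('XLarge',   160, 0.0044),   --  2m 40s
        ('XXLarge',  100, 0.0089)    --  1m 40s
     ) AS t (warehouse_size, run_seconds, credits_per_second)
ORDER BY credits, run_seconds;
```

Medium and Large cost the same number of credits, but Large finishes in half the time. Beyond Large the query no longer speeds up in proportion to the added servers, so the cost rises.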


ACCOUNT & RESOURCE MANAGEMENT


MODULE AGENDA

● Controlling Costs
● Resource Monitors
● INFORMATION_SCHEMA
● SNOWFLAKE Database


CONTROLLING COSTS


PERFORMANCE AND USAGE CONSIDERATIONS

COST
● Storage
● Virtual warehouse compute
● Cloud services compute
● Data transfer/egress charges

PERFORMANCE
● Query workload
● Query complexity
● Warehouse size
● SQL efficiency


STORAGE USAGE

● Billed monthly
    ○ Calculated from the bytes of all data stored in the account each day
● Cost varies based on type of account, cloud provider, and region
● Storage costs include:
    ○ Active storage – tables and objects
    ○ Time Travel storage
    ○ Fail-safe storage
    ○ Internal stages

[Diagram: database tables (Active, Time Travel, Fail-safe) and internal stages (User, Table, Named)]


SAVE ON STORAGE COSTS

● Remove transient tables/schemas/databases when they are no longer needed
● Remove files in internal stages
    ○ Or use the PURGE option with COPY INTO
● Use appropriate Time Travel settings
    ○ Set at the account, database, schema, or table level
● Capacity pricing is less expensive than pay-as-you-use
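The statements below sketch each of these savings. All object names (`staging_events`, `my_int_stage`, `scratch_results`) are hypothetical, and the one-day retention is an illustrative value.

```sql
-- Shorten Time Travel retention on a table that does not need long history.
ALTER TABLE staging_events SET DATA_RETENTION_TIME_IN_DAYS = 1;

-- Load staged files and delete them from the stage after a successful load.
COPY INTO staging_events
  FROM @my_int_stage
  FILE_FORMAT = (TYPE = 'CSV')
  PURGE = TRUE;

-- Or remove already-loaded files from the internal stage explicitly.
REMOVE @my_int_stage PATTERN = '.*[.]csv[.]gz';

-- Drop a transient working table once it is no longer needed.
DROP TABLE IF EXISTS scratch_results;
```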


USER-MANAGED COMPUTE

● Only billed when running
● Billed per-second, with a 60-second minimum
● Can be limited and controlled via resource monitors
● Usage can be viewed via the UI or Information Schema/Account Usage

Warehouse Size   Servers / Cluster   Credits / Hour   Credits / Second
X-Small          1                   1                0.0003
Small            2                   2                0.0006
Medium           4                   4                0.0011
Large            8                   8                0.0022
X-Large          16                  16               0.0044
2X-Large         32                  32               0.0089
3X-Large         64                  64               0.0178
4X-Large         128                 128              0.0356
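A simplified model of per-second billing with the 60-second minimum, for a single-cluster warehouse that is started, runs for a while, and is then suspended. The run lengths are made-up inputs; the hourly rates come from the table above.

```sql
-- billed seconds = max(run seconds, 60); credits = billed seconds * hourly rate / 3600
SELECT warehouse_size,
       credits_per_hour,
       run_seconds,
       GREATEST(run_seconds, 60)                           AS billed_seconds,
       GREATEST(run_seconds, 60) * credits_per_hour / 3600 AS credits_billed
FROM (VALUES
        ('X-Small', 1,  20),   -- shorter than the minimum: billed as 60 seconds
        ('X-Small', 1,  90),   -- past the minimum: billed per second
        ('Large',   8, 300)
     ) AS t (warehouse_size, credits_per_hour, run_seconds);
```

This model ignores multi-cluster warehouses and resizes, and the minimum applies each time the warehouse starts. It is a sketch for building intuition, not a billing calculator.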


CLOUD COMPUTE

● Billed per second
● No virtual warehouses needed
● Includes:
    ○ Metadata queries
    ○ Query result cache use
    ○ Snowpipe
    ○ Materialized view maintenance
    ○ Clustered table maintenance


CONTROL COMPUTE COSTS

● Right-size your virtual warehouses
● Consider data cache when setting auto-suspend values
● Use multi-cluster warehouses and auto-scaling
● Limit who can create or manage warehouses
● Create resource monitors to track consumption
    ○ Don't forget reader accounts!
● Track and manage cloud compute costs
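A minimal sketch of a resource monitor attached to one warehouse. The monitor and warehouse names, the quota, and the thresholds are illustrative assumptions. Creating a resource monitor requires the ACCOUNTADMIN role.

```sql
USE ROLE ACCOUNTADMIN;

-- Track up to 100 credits per calendar month.
CREATE RESOURCE MONITOR demo_monitor          -- hypothetical name
  WITH CREDIT_QUOTA    = 100
       FREQUENCY       = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 75  PERCENT DO NOTIFY             -- notify administrators
           ON 100 PERCENT DO SUSPEND            -- let running queries finish, then suspend
           ON 110 PERCENT DO SUSPEND_IMMEDIATE; -- cancel running queries and suspend

-- Attach the monitor to a warehouse and set a conservative auto-suspend.
ALTER WAREHOUSE demo_wh SET RESOURCE_MONITOR = demo_monitor;
ALTER WAREHOUSE demo_wh SET AUTO_SUSPEND = 300 AUTO_RESUME = TRUE;
```

A very short auto-suspend saves credits but discards the warehouse data cache on each suspend, which is the trade-off the second bullet refers to.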


DATA TRANSFER CHARGES

● Database replication
● Varies based on source and destination
    ○ Region-to-region transfers within a cloud provider are less expensive than transfers from one cloud provider to another
● Charges for:
    ○ Initial replication
    ○ Refreshes


INFORMATION_SCHEMA


INFORMATION_SCHEMA

● Set of system-defined views and table functions that provide metadata information about objects
● Based on the SQL-92 ANSI Information Schema + additional views and functions specific to Snowflake
● Queries return only objects to which the current role has been granted access


INFORMATION_SCHEMA

● Built-in, read-only schema
● Exists for each database
● System-defined views and table functions
    ○ Views for all objects in the database
    ○ Views for account-level objects
    ○ Table functions for historical and usage data
● Provide extensive “logging” metadata information


SOME VIEWS IN INFORMATION_SCHEMA

● APPLICABLE_ROLES
● COLUMNS
● DATABASES
● ENABLED_ROLES
● FILE_FORMATS
● FUNCTIONS
● LOAD_HISTORY
● PIPES
● PROCEDURES
● SEQUENCES
● STAGES
● TABLE_PRIVILEGES
● TABLE_STORAGE_METRICS
● TABLES
● USAGE_PRIVILEGES
● VIEWS


TABLE FUNCTIONS IN INFORMATION_SCHEMA

See documentation for a full list

● AUTOMATIC_CLUSTERING_HISTORY
● DATABASE_STORAGE_USAGE_HISTORY
● LOGIN_HISTORY
● LOGIN_HISTORY_BY_USER
● MATERIALIZED_VIEW_REFRESH_HISTORY
● QUERY_HISTORY
● WAREHOUSE_LOAD_HISTORY
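Unlike views, table functions are called inside `TABLE(...)` and take named arguments. The sketch below lists queries that finished in the last hour. It assumes a current database is set in the session so that the unqualified `INFORMATION_SCHEMA` resolves; the one-hour window and row limit are arbitrary.

```sql
-- Queries that ended within the last hour, most recent first.
SELECT query_id,
       query_text,
       warehouse_name,
       total_elapsed_time      -- milliseconds
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(
       END_TIME_RANGE_START => DATEADD('hour', -1, CURRENT_TIMESTAMP()),
       RESULT_LIMIT         => 100))
ORDER BY start_time DESC;
```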


INFORMATION_SCHEMA EXAMPLES

List object privileges

SELECT object_type, object_name, privilege_type, grantor, grantee
FROM INFORMATION_SCHEMA.OBJECT_PRIVILEGES
ORDER BY object_type, object_name;

OBJECT_TYPE   OBJECT_NAME   PRIVILEGE_TYPE   GRANTOR         GRANTEE
DATABASE      DBHOL         OWNERSHIP        SYSADMIN        SYSADMIN
DATABASE      MY_DB         OWNERSHIP        TRAINING_ROLE   TRAINING_ROLE
DATABASE      SNOWFLAKE     USAGE            ACCOUNTADMIN    PUBLIC
SCHEMA        PUBLIC        OWNERSHIP        TRAINING_ROLE   TRAINING_ROLE
TABLE         TEST          OWNERSHIP        TRAINING_ROLE   TRAINING_ROLE


INFORMATION_SCHEMA EXAMPLES

List total table size, by table, in megabytes (MB)

SELECT table_schema, table_name, table_owner, table_type,
       ROUND(bytes/1024/1024,3) AS mb
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA NOT IN ('INFORMATION_SCHEMA');

TABLE_SCHEMA   TABLE_NAME      TABLE_OWNER     TABLE_TYPE        MB
PUBLIC         ORDERS          TRAINING_ROLE   BASE TABLE        2043.877
PUBLIC         CUSTOMER        TRAINING_ROLE   BASE TABLE        103.143
DEV            MY_TEMP_TABLE   DBA             LOCAL TEMPORARY   1048.304


SNOWFLAKE DATABASE


SNOWFLAKE DATABASE

● Shared by Snowflake as a secure view
● By default, only ACCOUNTADMIN can access the SNOWFLAKE database and schemas
● Privileges can be granted to other roles

GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE <role>;


SNOWFLAKE DATABASE

● Shared by Snowflake to all accounts
    ○ Example of secure data sharing
● Contains 3 schemas:
    ○ ACCOUNT_USAGE
    ○ INFORMATION_SCHEMA
    ○ READER_ACCOUNT_USAGE
● Contains historical usage data and logs specific to your account


ACCOUNT_USAGE

● Contains views that display object metadata and usage metrics
● Generally mirrors the corresponding views and table functions in INFORMATION_SCHEMA, with some differences


READER_ACCOUNT_USAGE

● Contains views that apply to reader accounts
● Small subset of views in ACCOUNT_USAGE


INFORMATION_SCHEMA

● Schema automatically created in all databases
● Does not serve a purpose in shared databases, and can be disregarded


ACCOUNT_USAGE VS. INFORMATION_SCHEMA

ACCOUNT_USAGE
● Includes dropped objects
● 45 minutes to 3 hours latency (varies by view)
● Data retained for 1 year

INFORMATION_SCHEMA (IN INDIVIDUAL DATABASES)
● Does not include dropped objects
● No latency
● Data retained up to 6 months (varies by view/table function)


ACCOUNT_USAGE EXAMPLES

Login History for a Specific User Over the Last Week:

SELECT event_id, event_timestamp, event_type, user_name, error_code, error_message
FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
WHERE user_name = 'ANNE_JORDAN'
AND EVENT_TIMESTAMP >= DATEADD('DAYS',-8,CURRENT_DATE())
AND EVENT_TIMESTAMP < CURRENT_DATE()
ORDER BY EVENT_TIMESTAMP;

EVENT_ID          EVENT_TIMESTAMP                 EVENT_TYPE   USER_NAME     ERROR_CODE   ERROR_MESSAGE
124077310283306   2019-09-05 10:58:39.777 -0700   LOGIN        ANNE_JORDAN   390100       INCORRECT_USE...
124077310289407   2019-09-12 08:21:43.103 -0700   LOGIN        ANNE_JORDAN   NULL         NULL


ACCOUNT_USAGE EXAMPLES

Query History for a Specific Warehouse Over the Last Month:

SELECT *
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE WAREHOUSE_NAME = 'DEMO_WH'
AND START_TIME >= DATEADD('DAYS',-31,CURRENT_DATE())
AND START_TIME < CURRENT_DATE()
ORDER BY START_TIME;

QUERY_ID        QUERY_TEXT             DATABASE_ID   DATABASE_NAME   SCHEMA_ID
018eb0f9-0...   SHOW WAREHOUSES;       NULL          NULL            NULL
018eb0fa-0...   GRANT CREATE TABL...   239           FINANCE         NULL


PREVIEW FEATURES


MODULE AGENDA

● Overview
● Current Preview Features


OVERVIEW

WHAT IS A PREVIEW FEATURE?

● Implemented and tested in Snowflake
● Full usability and corner-case handling may not be complete
● Preview features are not warranted against defects that may produce undesired results
    ○ Behavior may change between Preview and Release
● Not recommended for production use


BEST PRACTICES FOR PREVIEW FEATURES

● Do not use on production data
● Do not use on production data
● Do not use on production data
● Contact support for additional information if you really want to use it on production data


PREVIEW TYPES

● Private Preview / On Request
    ○ Typically invite-only at first
    ○ Generally in the early stages of preview
    ○ Work with your account team if access is required
● Public Preview / Open
    ○ Enabled by default for all accounts
    ○ Free to use, though they still carry some risk and are not for full production use


CURRENT PREVIEW FEATURES


CURRENT PREVIEW FEATURES

● How do I know what is in preview?
    https://docs.snowflake.com/en/release-notes/preview-features.html
● How do I know when a preview feature is released?
    https://docs.snowflake.com/en/release-notes.html


CONTINUOUS DATA LOADING WITH SNOWPIPE


USE CASE

COPY command (BATCH)
● Migration from traditional data sources
    ○ Bulk loading, Backfill processing
● Transaction boundary control
    ○ BEGIN / START TRANSACTION / COMMIT / ROLLBACK
● Highly customizable for required performance throughput
    ○ Independent scaling of compute resources for different ingestion workloads (loading, transformation)

Snowpipe (CONTINUOUS)
● Ingestion from modern data sources
● Continuously generated data is available for analysis in seconds
● No scheduling (with auto-ingest)
● Serverless model with no user-managed virtual warehouse needed

[Diagram: a spectrum from Batch through Micro-batch to Continuous]

SNOWPIPE

• For continuous data ingestion (typically < 60 seconds, 10-100 MB)
• Source: Internal or External Stage
• Uses Snowflake elastic compute resources (highly efficient; no warehouse needed)
• Recommended file size: 10-100 MB

[Diagram: files in a blob store (S3/ABS) flow through a pipe into Snowflake via Snowpipe]


SNOWPIPE (REST) vs SNOWPIPE AUTO-INGEST

Snowpipe (REST)
● Manually call Snowpipe REST API endpoint
● Pass a list of files in the stage location
● Works with INTERNAL and external stages

Snowpipe Auto-Ingest
● Receives notifications from the cloud provider that new files are in the cloud storage location
● “Wakes up” and processes the new files
● Monitors and works with external stages


SNOWPIPE LOADING FROM CLOUD STORAGE

[Diagram: a customer application writes files to Amazon S3 or Microsoft Azure Blob Storage; Snowpipe pipes load the files from each location]


SNOWPIPE LOADING FROM SNOWFLAKE STAGE

[Diagram: a customer application uses PUT to place files in Snowflake internal stages (user stage, table stage, named stage); Snowpipe pipes load the files from each stage]


SNOWPIPE REST API

[Diagram: an application makes a REST call with {file names} to the REST endpoint of the Snowpipe service; the serverless loader reads the files from S3 and loads them into the Snowflake database]


AUTO-INGEST

[Diagram: an S3 notification from the external S3 location reaches the Snowpipe service; the serverless loader reads the file data and loads it into the Snowflake database]

SECURITY CONSIDERATIONS

Role-based access configuration for the user executing Snowpipe


MANUAL vs AUTO

Manual
1. Create table to hold data ingested from pipe
2. Create pipe with auto_ingest = false
3. Load data to stage
4. Run alter pipe refresh
5. Validate that data is loaded with standard SQL

Automated
1. Create table to hold data ingested from pipe
2. Create pipe with auto_ingest = true
3. On AWS, configure S3 event notifications to send to the SQS queue associated with the pipe
4. Load data to stage
5. The SQS queue delivers a message that triggers the load
6. Validate that data is loaded with standard SQL
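The steps above can be sketched in SQL as follows. The table, stage, and pipe names are hypothetical, and the example assumes an external stage `trips_ext_stage` already exists and holds JSON files.

```sql
-- Step 1: table that receives the data loaded by the pipe.
CREATE TABLE IF NOT EXISTS trips_raw (v VARIANT);

-- Step 2: the pipe wraps a COPY INTO statement.
-- Use AUTO_INGEST = FALSE for the manual flow.
CREATE PIPE IF NOT EXISTS trips_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO trips_raw
  FROM @trips_ext_stage
  FILE_FORMAT = (TYPE = 'JSON');

-- Automated flow: the notification_channel column shows the queue to use
-- as the target of the bucket's event notifications.
SHOW PIPES LIKE 'trips_pipe';

-- Manual flow: queue files that are already staged for loading.
ALTER PIPE trips_pipe REFRESH;

-- Validate: check the pipe state, then confirm rows arrived.
SELECT SYSTEM$PIPE_STATUS('trips_pipe');
SELECT COUNT(*) FROM trips_raw;
```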


LAB EXERCISE

Continuous Data Loading

15 minutes

Exercise: Using Snowflake Pipes

Tasks:
● Setup
● Create Pipe for Citibike trips table
● Place Data into stage and load into table


STREAMS & TASKS


ROAD MAP

● DEFINITIONS AND CONCEPTS
● INTRODUCTION TO STREAMS
● CREATING AND MANAGING TASKS
● USING STREAMS & TASKS TOGETHER


THE CHANGING DATA PROBLEM

● How do I know what changed in the staging table?
● Where do I keep the business logic for my transformations?
● How can I run a process on a schedule?
● How do I make my process run reliably in the Cloud?
● How can I ensure Exactly-Once Semantics?


STREAMS
● Change Data Capture (CDC)
● Timestamp Offset
● Identify and act on changed records

[Diagram: table versions T1, T2, and T3 with rows A through F changing between versions; the data changes are captured as a stream]

TASKS
● Scheduled SQL Statement Execution
● Can be chained and executed in order
● Schedule defined via a CRON expression
● Interval set in minutes
● Or can be set to follow upstream task
● Check Streams to begin processing new data


STREAMS & TASKS

[Diagram: Snowpipe loads a staging table; a table stream on the staging table feeds a task that populates a merged table; a table stream on the merged table feeds a task that populates Target Table 1 and Target Table 2]


INTRODUCTION TO STREAMS


WORKING WITH STREAMS

● Streams contain no data
● Think of it like a bookmark
● Table Versioning
● Standard vs. Append-only
● Stream Columns:
    ○ METADATA$ACTION
    ○ METADATA$ISUPDATE
    ○ METADATA$ROW_ID

STREAMS DDL

● CREATE
● ALTER
● DROP
● DESCRIBE
● SHOW
● GRANT…TO ROLE
● REVOKE…FROM ROLE
● SHOW GRANTS
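A brief sketch of the stream DDL in use. The table and stream names are hypothetical; `APPEND_ONLY = TRUE` creates the append-only variant, which records inserts only.

```sql
-- Source table to track.
CREATE TABLE IF NOT EXISTS staging_orders (id INTEGER, status STRING);

-- Standard stream: records inserts, updates, and deletes.
CREATE OR REPLACE STREAM staging_orders_stream ON TABLE staging_orders;

-- Append-only stream: records inserts only.
CREATE OR REPLACE STREAM staging_orders_append_stream
  ON TABLE staging_orders
  APPEND_ONLY = TRUE;

-- A change made after the stream was created appears in the stream,
-- together with the METADATA$ columns.
INSERT INTO staging_orders VALUES (1, 'NEW');
SELECT * FROM staging_orders_stream;

-- Inspect the stream objects.
SHOW STREAMS LIKE 'staging_orders%';
DESCRIBE STREAM staging_orders_stream;
```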


Tables vs. Streams
Data at Rest – Data in Motion

Tables
● Tables contain data at rest
● Classic SQL
● Represent a single point in time
● Reflect the most recent version

Streams
● Streams represent data in motion
● Time-varying relations
● Represent every point in time
● Each point is known as an Offset


STREAM OFFSETS

● OFFSETS are a point in time in the transactional versioned timeline of the source table
● OFFSETS are advanced by a DML statement that uses the Stream
    ○ INSERT/UPDATE/DELETE/MERGE where (select from <stream> where METADATA$...)
    ○ Surround multiple DML statements with explicit transaction statements (BEGIN … COMMIT)
● Cannot view OFFSETS but can see the table transaction timestamp:
    ○ select system$stream_get_table_timestamp('<stream>');
● A stream becomes stale when its OFFSET is positioned at a point earlier than the data retention period for the table

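A sketch of consuming one stream from two DML statements without losing changes. It assumes a stream named `staging_orders_stream` on a table with `id` and `status` columns, plus two existing target tables; all names are hypothetical.

```sql
-- Inside an explicit transaction, every statement sees the same set of
-- changes; the offset advances only when the transaction commits.
BEGIN;

INSERT INTO orders_current (id, status)
  SELECT id, status
  FROM staging_orders_stream
  WHERE METADATA$ACTION = 'INSERT';

INSERT INTO orders_audit (id, action, is_update)
  SELECT id, METADATA$ACTION, METADATA$ISUPDATE
  FROM staging_orders_stream;

COMMIT;

-- The offset itself is not visible, but the table timestamp for the stream is.
SELECT SYSTEM$STREAM_GET_TABLE_TIMESTAMP('staging_orders_stream');
```

Without the explicit transaction, the first `INSERT` would advance the offset and the second would find the stream empty.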
STREAM OBJECT METADATA

● METADATA$ACTION will always be either INSERT or DELETE
● METADATA$ISUPDATE is a flag to determine if the INSERT or DELETE was part of an UPDATE statement
● METADATA$ROW_ID is a unique, immutable identifier for the row, which can be used to track changes to that row over time
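Because an UPDATE appears as a DELETE and INSERT pair with `METADATA$ISUPDATE` set, the three columns can be combined to label each change. This sketch assumes a stream named `staging_orders_stream` on a table with an `id` column; both names are hypothetical.

```sql
-- Label each stream row by the kind of change it represents.
SELECT id,
       CASE
         WHEN METADATA$ACTION = 'INSERT' AND METADATA$ISUPDATE THEN 'update (new values)'
         WHEN METADATA$ACTION = 'DELETE' AND METADATA$ISUPDATE THEN 'update (old values)'
         WHEN METADATA$ACTION = 'INSERT'                       THEN 'insert'
         ELSE 'delete'
       END AS change_type,
       METADATA$ROW_ID
FROM staging_orders_stream
ORDER BY METADATA$ROW_ID, METADATA$ACTION;
```

A plain `SELECT` like this does not advance the stream's offset; only DML that reads from the stream does.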


STREAMS WORKFLOW

[Diagram: Snowpipe loads a staging table; a table stream tracks changes to the staging table]


DEMO: CREATING A STREAM


LAB EXERCISE

Streams and Tasks

15 minutes

Exercise: Introduction to Streams

Tasks:
● Create Basic Table Streams
● Query Table Streams
● Exploring Delta and Append Streams


CREATING AND MANAGING TASKS


WORKING WITH TASKS

● Executes a DML statement or Stored Procedure
● May be run independently, or together with Streams
● Can be scheduled or set to run on a condition
● Use Cases:
    ○ Keep aggregates up-to-date
    ○ Generate periodic reports
    ○ Copy data into or out of Snowflake
    ○ ALTER PIPE…REFRESH


TASKS WORKFLOW

● Create a Task Administrator Role
● CREATE TASK… Statement
● ALTER TASK…RESUME
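The three workflow steps might look like the sketch below. The role, warehouse, task, and table names are hypothetical, and the CRON schedule (06:00 UTC daily) is only an illustration.

```sql
-- Step 1: a custom role that is allowed to run tasks.
USE ROLE SECURITYADMIN;
CREATE ROLE IF NOT EXISTS taskadmin;
USE ROLE ACCOUNTADMIN;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE taskadmin;

-- Step 2: a task that runs one SQL statement on a schedule.
CREATE OR REPLACE TASK refresh_daily_totals
  WAREHOUSE = demo_wh
  SCHEDULE  = 'USING CRON 0 6 * * * UTC'
AS
  INSERT INTO daily_totals (day, order_count)
    SELECT CURRENT_DATE(), COUNT(*) FROM orders;

-- Step 3: tasks are created suspended and must be resumed before they run.
ALTER TASK refresh_daily_totals RESUME;

-- Check the task state, and suspend it again when finished.
SHOW TASKS LIKE 'refresh_daily_totals';
ALTER TASK refresh_daily_totals SUSPEND;
```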


DEMO: CREATING A TASK


LAB EXERCISE

Streams and Tasks
15 minutes

Exercise: Introduction to Tasks

Tasks:
● Creating Tasks
● Starting and Stopping Tasks


USING STREAMS & TASKS TOGETHER


WORKING HAND-IN-HAND

● Streams and Tasks may be used separately or together
● Used together, they make a powerful automation tool
● Some use cases include:
○ Shred semi-structured data into multiple data warehouse tables
○ Generate Reports
○ Routine Aggregation
○ Scheduled Data Ingestion

[Graphic: Streams, Tasks]


PAIR STREAMS AND TASKS

[Diagram labels: Staging Table, Table, Stream, Task, User-Provided Compute Resources, Target Table 1, Target Table 2; Table, Stream, Task, Destination Table (Aggregates)]
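One way to pair a stream with a task is to have the task run only when the stream has captured changes. The sketch below assumes a hypothetical stream `staging_orders_stream` on a staging table, a hypothetical target table `target_orders`, and a hypothetical warehouse `my_wh`.

```sql
-- The task wakes up on its schedule, but the WHEN condition skips the run
-- (and the compute cost) if the stream holds no change records
CREATE OR REPLACE TASK load_target_from_stream
  WAREHOUSE = my_wh                          -- user-provided compute
  SCHEDULE = '5 MINUTE'
WHEN
  SYSTEM$STREAM_HAS_DATA('STAGING_ORDERS_STREAM')
AS
  INSERT INTO target_orders (id, status)
  SELECT id, status
  FROM staging_orders_stream
  WHERE METADATA$ACTION = 'INSERT';

-- Tasks are created suspended
ALTER TASK load_target_from_stream RESUME;
```

Because the INSERT reads from the stream inside a DML statement, a successful run advances the stream offset, so the next run sees only newer changes.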


STREAMS AND TASKS – CHAINING USE CASE

[Diagram labels: Time, Table, Stream, Task 1, Task 2, Task 3]

• Tasks are scheduled to follow each other upon completion of the previous task.
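Chaining can be sketched with the AFTER clause, where only the first (root) task has a schedule. The task names, warehouse `my_wh`, and stored procedures `step_one`, `step_two`, and `step_three` are hypothetical.

```sql
-- Root task: the only one with a schedule
CREATE OR REPLACE TASK task_1
  WAREHOUSE = my_wh
  SCHEDULE = '60 MINUTE'
AS
  CALL step_one();

-- Runs when task_1 completes
CREATE OR REPLACE TASK task_2
  WAREHOUSE = my_wh
  AFTER task_1
AS
  CALL step_two();

-- Runs when task_2 completes
CREATE OR REPLACE TASK task_3
  WAREHOUSE = my_wh
  AFTER task_2
AS
  CALL step_three();

-- Resume the dependent tasks first, then the root task
ALTER TASK task_3 RESUME;
ALTER TASK task_2 RESUME;
ALTER TASK task_1 RESUME;
```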


DEMO: USE CASES


LAB EXERCISE

Streams and Tasks
15 minutes

Exercise: Pairing Streams and Tasks

Tasks:
● Creating a Stream
● Creating a Stored Procedure
● Creating a Task
● Starting and Stopping a Task


CONTINUOUS DATA PROTECTION


HOW DOES TIME TRAVEL WORK?


LOADING DATA

[Diagram: "Insert into XXX ..." loads the Table into partitions a, b, c, etc., each at V1]

Variable-length micro-partitions.


MICRO-PARTITIONS VERSIONED

[Diagram: "Insert into XXX ..." on the Table; micro-partition versions a.1, b.1, c.1]

All entries at version 1


DATA CHANGES VERSIONED

[Diagram: "Update X set ..." on the Table; micro-partitions a.1, a.2, b.1, c.1. Key: Current Version, Old Version]

New version a.1 → a.2 is created


Table Versions

● Versions are snapshots in time
● From these micro-partition file versions, we can derive versions of the table
● This is how we go from DMLs to table versions, and then do Time Travel

File     ID   Name
File 1   1    John
File 1   2    Scott
File 1   3    Mary
File 2   4    Jane
File 2   5    Jack
File 2   6    Claire
File 3   7    Pierre   (new data)
File 4   5    Jack     (modified data)
File 4   6    Claire   (modified data)

DMLs:
DML 1: + File 1 + File 2
DML 2: + File 3
DML 3: - File 2 + File 4

Table versions (timeline T1, T2, T3):
Version 1 (T1): File 1 + File 2
Version 2 (T2): File 1 + File 2 + File 3
Version 3 (T3): File 1 + File 3 + File 4

> SELECT * FROM t AS OF T1
> SELECT * FROM t AS OF T2
QUERY: DEFAULT LATEST VERSION

[Diagram: Table with micro-partitions a.1, a.2, b.1, c.1]

select *
from my_table;


QUERY AS OF 60 SECONDS AGO

[Diagram: Table with micro-partitions a.1, a.2, b.1, c.1]

select *
from my_table
at(offset => -60);
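For reference, a short sketch of the common Time Travel query forms, using the slide's `my_table`. The timestamp value is illustrative and `<query_id>` is a placeholder to be replaced with a real query ID.

```sql
-- Table as it was 60 seconds ago; the offset is in seconds and negative for the past
SELECT * FROM my_table AT(OFFSET => -60);

-- Table as of a specific point in time (illustrative timestamp)
SELECT * FROM my_table AT(TIMESTAMP => '2020-05-13 09:00:00'::TIMESTAMP_LTZ);

-- Table as it was immediately before a given statement ran
SELECT * FROM my_table BEFORE(STATEMENT => '<query_id>');
```

All three forms only work within the table's Time Travel retention period.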


HOW DOES CLONING WORK?


ZERO-COPY CLONING

● Convenient way to quickly take a “snapshot” of any table, schema, or database and create a derived copy of that object which initially shares the underlying storage
● No additional storage costs are incurred until changes are made to the original or cloned object(s)
● Clones can be cloned, with no limitations on the number or iterations of clones that can be created
● At the instant the clone is created:
○ All micro-partitions in both tables are fully shared.
○ The storage associated with these micro-partitions is owned by the oldest table in the clone group and the clone references these micro-partitions
● Often used to quickly spin up Dev and Test/QA environments and/or create data backups at points in time

[Diagram labels: All Unmodified Blocks; Some Changes; Modified; Some Unmodified Blocks]
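A minimal sketch of cloning at each level. All object names (`orders`, `analytics`, `prod`, and the clone names) are hypothetical.

```sql
-- Clone a single table; no data is copied, only metadata
CREATE TABLE orders_clone CLONE orders;

-- Clone a schema or a whole database, including the objects inside it
CREATE SCHEMA analytics_dev CLONE analytics;
CREATE DATABASE prod_backup CLONE prod;

-- A clone can itself be cloned
CREATE TABLE orders_clone_2 CLONE orders_clone;
```

Storage is only consumed for a clone once DML on the source or the clone rewrites micro-partitions that were previously shared.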


ZERO-COPY CLONING EXAMPLE: SINGLE TABLE

[Diagram: Global Services: Table A1. Compute. Storage: Micro-Partition A, Micro-Partition B, Micro-Partition C, Micro-Partition D]


ZERO-COPY CLONING EXAMPLE: CLONED TABLE

[Diagram: Global Services: Table A1, CLONE..., Table A2. Compute. Storage: Micro-Partition A, Micro-Partition B, Micro-Partition C, Micro-Partition D]

Metadata logical object is copied at the Global Services layer only


ZERO-COPY CLONING EXAMPLE: POST CLONE DML

[Diagram: Global Services: Table A1, CLONE..., Table A2. Compute. Storage: Micro-Partition A, Micro-Partition B, Micro-Partition C, Micro-Partition D, Micro-Partition E]

DML has resulted in 1 micro-partition being rewritten and applied to Table A2. The pathway to the unused micro-partition is still kept so we can time-travel if required.


AGILE DATABASE DEVELOPMENT


AGILE DEVELOPMENT


AGILE DEVELOPMENT CYCLE

Clone from PROD, Apply changes, Test and Promote

[Diagram: PROD Trunk with Parallel DEVs (DEV Branch “n”)]
1. Clone (at offset = -3600)
2. Script Changes & Test
3. Deploy DB scripts & Code
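The cycle above can be sketched in SQL as follows. The database names `prod` and `dev_branch_1`, the table `orders`, and the added column are hypothetical; the offset of -3600 seconds matches the one-hour offset shown on the slide.

```sql
-- 1. Clone PROD as it was one hour ago into a development branch
CREATE DATABASE dev_branch_1 CLONE prod AT(OFFSET => -3600);

-- 2. Script changes and test them against the clone
ALTER TABLE dev_branch_1.public.orders ADD COLUMN region VARCHAR;

-- 3. After the same scripts have been deployed to PROD, drop the branch
DROP DATABASE dev_branch_1;
```

Each developer can take a separate clone, so parallel branches do not interfere with one another or with PROD.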


AGILE DEVELOPMENT & TESTING

Delta changes applied to PROD

[Diagram labels: S3; Load Deltas; ETL v1.0; Database; PROD Data v1.0, 5-Jan]


AGILE DEVELOPMENT & TESTING

PROD cloned to DEV and changes applied

[Diagram labels: S3; Load Deltas; ETL v1.0; Database; PROD Data v1.0, 5-Jan; Clone Data AT 4-Jan; Apply Change Scripts to v2.0; TEST Clone Data v2.0, 4-Jan]


AGILE DEVELOPMENT & TESTING

Deltas applied to DEV

[Diagram labels: S3; Load Deltas; ETL v1.0; Database; PROD Data v1.0, 5-Jan; ETL v2.0; Clone; Delta Data 4th to 5th; TEST Clone Data v2.0, 4-Jan]


AGILE DEVELOPMENT & TESTING

Clone PROD and compare for regression testing

[Diagram labels: S3; Load Deltas; ETL v1.0; Database; PROD Data v1.0, 1-Feb; Clone Data AT 5-Jan; PROD Clone v1.0 Data, 5-Jan; TEST Clone v2.0 Data, 5-Jan]
CONSIDERATIONS

Be aware of the following

● Virtual Warehouses: Separate PROD/TEST workloads. Auto-Suspend
● Time Travel: Zero Copy Clones use Time Travel (Default Retention 1 Day)
● Editions: Standard (0-1 Days). Enterprise+ (0-90 Days)
● Storage: Will increase Data Retention cost – rapidly changing tables
● Grants: Single Table Clones don’t transfer access grants – but Database & Schema clones retain child grants
● Stages: External Stages Cloned. Internal Not. No Data is cloned.
● DDL Impact: Avoid DDL on source objects during cloning
● DML Impact: Avoid DML during cloning with Retention Period Zero

Source: https://docs.snowflake.net/manuals/user-guide/object-clone.html


DML DURING CLONING

Retention Period = Zero

UPDATE tab set col = 'X'

[Diagram: Source Table: Partition 1 (v1), Partition 2 (v1), Partition 1 (v2). Target Clone: V1, V1]


DML DURING CLONING

Retention Period = Zero

[Diagram: Source Table: Partition 1 (v1, Marked Deleted), Partition 2 (v1), Partition 1 (v2). Target Clone: V1, Error]

programming error occurred: "000707 (02000): None: Data is not available." with query id none
LAB EXERCISE

Complete the Course Survey
10 minutes

● Please!
● Take a few minutes before we move on to the last section to fill out your course surveys


FUNDAMENTALS RECAP


QUESTION

Q: Can you have two tables with the same name in your account?

A: Yes, as long as the full path (database.schema.table) is different


QUESTION

Q: What are the elements of a worksheet context?

A: Role, Database, Schema, Warehouse


QUESTION

Q: Which element of the worksheet context is NOT displayed as part of the SnowSQL prompt?

A: Role


QUESTION

Q: In what increment are you billed for compute?

A: Per second, with a one-minute minimum


QUESTION

Q: What is a micro-partition?

A: A small file that stores part of a table, along with metadata about what it contains (such as MIN, MAX, COUNT).


QUESTION

Q: How are you billed for storage?

A: Per terabyte, per month


QUESTION

Q: In the following scenario, for how much time will you be billed for compute?

○ Your auto-resume warehouse is set to auto-suspend after 10 minutes, and is initially suspended
○ You run a query using that warehouse, and it takes 7 minutes – charge 7 minutes
○ You go to lunch and come back a few hours later – charge 10 minutes waiting for auto-suspend
  ■ No one else has used the warehouse while you were gone
○ You run a query that takes 43 seconds
○ You go home for the day – charge 10:43 (43 seconds of query time plus 10 minutes waiting for auto-suspend)
  ■ No one else uses the warehouse

A: 27 minutes, 43 seconds
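The arithmetic behind this answer can be checked with a simple query. This is only a worked calculation of the scenario's numbers, not a way to read actual billing data.

```sql
-- First run:  7-minute query + 10 minutes idle before auto-suspend = 1020 seconds
-- Second run: 43-second query + 10 minutes idle before auto-suspend = 643 seconds
SELECT
  (7 * 60 + 10 * 60) + (43 + 10 * 60)                    AS billed_seconds,     -- 1663
  FLOOR(((7 * 60 + 10 * 60) + (43 + 10 * 60)) / 60)      AS whole_minutes,      -- 27
  MOD((7 * 60 + 10 * 60) + (43 + 10 * 60), 60)           AS remaining_seconds;  -- 43
```

Both runs exceed the one-minute minimum, so the minimum charge does not change the total here.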


QUESTION

Q: What are some of the things the cloud services layer is responsible for?

A:
○ Storing metadata
○ Access control (network policies)
○ User authentication
○ Query optimization
○ Transaction control
○ Upgrades and patches to software
○ Storing query result cache
○ Account management


QUESTION

Q: How many clusters are in a standard Medium warehouse?

A: One


QUESTION

Q: How many servers are in a standard Medium warehouse?

A: Four


QUESTION

Q: How are regions and availability zones related?

A: Availability zones are inside a region; Snowflake automatically replicates data to three different availability zones in your region.


QUESTION

Q: If your query is running slowly, what should you do?

A:
1. Check the query profile
  ■ If query pruning is low, try to make filters more efficient; do you need to cluster the table?
  ■ Look for unintended cross-joins
  ■ Look for spilling to local or remote disk, and check memory-intensive clauses (GROUP BY, ORDER BY…)
2. Try with a larger warehouse
  ■ Especially if you are spilling to remote storage
3. Call Support


QUESTION

Q: TRUE or FALSE: Snowflake can automatically scale your warehouse up when needed (make your warehouse larger)

A: FALSE. Snowflake can only automatically scale your warehouse out (add clusters to a multi-cluster warehouse)


QUESTION

Q: What are the two auto-scaling policies for a multi-cluster warehouse?

A: Standard and Economy
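The difference between scaling up and scaling out can be sketched with two ALTER WAREHOUSE statements. The warehouse name `my_wh` is hypothetical, and the multi-cluster settings assume an edition that supports multi-cluster warehouses (Enterprise or higher).

```sql
-- Scale up: a manual resize to a larger warehouse size
ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE';

-- Scale out: let Snowflake add or remove clusters automatically,
-- between the minimum and maximum, according to the scaling policy
ALTER WAREHOUSE my_wh SET
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY = 'ECONOMY';   -- or 'STANDARD'
```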


QUESTION

Q: With Economy scaling, how long might queries be queued before another cluster is added to the warehouse?

A: Six minutes


QUESTION

Q: What is the difference between a temporary table and a transient table?

A: A temporary table is only available in a single session, and is dropped when the session ends. A transient table is available across sessions, and kept until it is explicitly dropped.
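A brief sketch of the two table types. The table names are hypothetical.

```sql
-- Visible only to the session that creates it; dropped when the session ends
CREATE TEMPORARY TABLE session_scratch (id INTEGER, note VARCHAR);

-- Visible to other sessions (subject to grants); persists until explicitly dropped
CREATE TRANSIENT TABLE etl_work (id INTEGER, note VARCHAR);
```

Neither table type has a Fail-safe period, which is why both are typically used for data that can be regenerated.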


QUESTION

Q: Why would you “scale up” your warehouse?

A: To provide more resources for complex queries, to improve performance


QUESTION

Q: What ROLE should create users and roles?

A: SECURITYADMIN (or USERADMIN)


QUESTION

Q: What privileges does SYSADMIN have by default?

A: CREATE DATABASE and CREATE WAREHOUSE


QUESTION

Q: How long does the query result cache last?

A: 24 hours, with the timer “reset” each time the cached result is used


QUESTION

Q: Who can use the data cache?

A: Anyone who uses the same warehouse


QUESTION

Q: Who can use the query result cache?

A: For a SHOW: anyone in the same role
For a SELECT: anyone with select permission on all the tables in the query


QUESTION

Q: For how long does a query remain in the query history tab?

A: 14 days


QUESTION

Q: Name three differences between INFORMATION_SCHEMA and ACCOUNT_USAGE

A: ACCOUNT_USAGE includes dropped objects; INFORMATION_SCHEMA does not
ACCOUNT_USAGE retains information for one year; INFORMATION_SCHEMA retains information for up to 6 months (it varies by view)
ACCOUNT_USAGE has up to 3 hours of latency; INFORMATION_SCHEMA has no latency
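The dropped-object difference can be seen by comparing the two TABLES views. The database name `my_db` and the schema filter are hypothetical.

```sql
-- INFORMATION_SCHEMA: current objects in one database, no latency
SELECT table_name
FROM my_db.INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'PUBLIC';

-- ACCOUNT_USAGE: account-wide, and dropped tables remain visible
-- with the DELETED timestamp populated
SELECT table_catalog, table_schema, table_name, deleted
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES
WHERE deleted IS NOT NULL;
```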


QUESTION

Q: Name two things that only the ACCOUNTADMIN can do, by default

A: CREATE SHARE, MONITOR USAGE


QUESTION

Q: What command is used to load or unload data?

A: COPY INTO


QUESTION

Q: What is the advantage of using an external stage, rather than copying data in directly from the cloud storage?

A: With an external stage, you can define the credentials and keys needed to access and decrypt the data. This allows you to create a stage that can be used without giving the users information on the credentials
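A sketch of this pattern, assuming an S3 bucket. The stage name, bucket URL, role `analyst`, table `orders`, and the `<key_id>` and `<secret_key>` placeholders are all hypothetical and must be replaced with real values.

```sql
-- The stage owner stores the location and credentials once
CREATE OR REPLACE STAGE my_s3_stage
  URL = 's3://my-bucket/path/'
  CREDENTIALS = (AWS_KEY_ID = '<key_id>' AWS_SECRET_KEY = '<secret_key>')
  FILE_FORMAT = (TYPE = CSV);

-- Other roles can use the stage without ever seeing the credentials
GRANT USAGE ON STAGE my_s3_stage TO ROLE analyst;

-- Load: COPY INTO a table from the stage
COPY INTO orders FROM @my_s3_stage;

-- Unload: COPY INTO a stage location from the table
COPY INTO @my_s3_stage/unload/ FROM orders;
```

The same COPY INTO command covers both directions; the target (table or stage location) determines whether it loads or unloads.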


QUESTION

Q: With enterprise edition, what is the longest time you can specify for Time Travel?

A: 90 days


QUESTION

Q: Why would you “scale out” your warehouse?

A: To spin up additional clusters, to provide greater concurrency


QUESTION

Q: What is a virtual warehouse?

A: A collection of compute resources that can be used for queries, loading, etc.


QUESTION

Q: You have Time Travel set to keep your table data for 27 days. 35 days after dropping a table you want to get it back. Can you?

A: No. Time Travel covers the first 27 days, and the 7-day fail-safe period that follows would only protect you through day 34.
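A sketch of the scenario, using a hypothetical table `orders`. A 27-day retention period assumes an edition that allows more than one day of Time Travel.

```sql
-- Keep 27 days of Time Travel history for this table
ALTER TABLE orders SET DATA_RETENTION_TIME_IN_DAYS = 27;

DROP TABLE orders;

-- Works only while the dropped table is still inside its Time Travel retention period
UNDROP TABLE orders;
```

Once the retention period has passed, UNDROP no longer works; during the following 7 days of Fail-safe, the data can only be recovered by Snowflake, and after that it is gone.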


QUESTION

Q: When you share data with a consumer account, who pays for compute time?

A: The consumer account does.


QUESTION

Q: Can you set a compute quota on a specific user?

A: No, you can set a compute quota at either the account or warehouse level
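In Snowflake, compute quotas of this kind are set with resource monitors. The sketch below assumes a hypothetical monitor name `monthly_limit`, a hypothetical warehouse `my_wh`, and an illustrative quota of 100 credits.

```sql
-- Resource monitors are created by ACCOUNTADMIN
USE ROLE ACCOUNTADMIN;

CREATE OR REPLACE RESOURCE MONITOR monthly_limit
  WITH CREDIT_QUOTA = 100
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 90 PERCENT DO NOTIFY
           ON 100 PERCENT DO SUSPEND;

-- Warehouse level: the quota applies to this warehouse
ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = monthly_limit;

-- Account level alternative: the quota applies to all warehouses in the account
-- ALTER ACCOUNT SET RESOURCE_MONITOR = monthly_limit;
```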


QUESTION

Q: Can you set a storage quota at the account level?

A: No, you cannot set storage quotas at any level


QUESTION

Q: Is there any reason you wouldn’t want to auto-suspend all your warehouses after one minute?

A: Suspending a warehouse clears the data cache. If another user runs a very similar query right after the warehouse is suspended, they will need to re-load all the data to the warehouse, which might take more time than you saved by shutting the warehouse down.
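The trade-off is controlled per warehouse. A minimal sketch, assuming a hypothetical warehouse `my_wh` and an illustrative 10-minute idle timeout:

```sql
-- AUTO_SUSPEND is in seconds; a longer timeout keeps the data cache warm
-- for follow-up queries, at the cost of billing the idle time
ALTER WAREHOUSE my_wh SET
  AUTO_SUSPEND = 600
  AUTO_RESUME = TRUE;
```

A short timeout tends to suit sporadic workloads, while a longer one tends to suit warehouses where users repeatedly query the same data.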


THANK YOU