© 2017 Snowflake Computing Inc. All Rights Reserved.
YOUR DATA, NO LIMITS
Demystifying Data Warehousing as a Service (DWaaS)
Kent Graziano (@KentGraziano)
Chief Technical Evangelist, Snowflake Computing
My Bio
• Chief Technical Evangelist, Snowflake Computing
• Oracle ACE Director (DW/BI)
• OakTable Network
• Blogger – The Data Warrior
• Certified Data Vault Master and DV 2.0 Practitioner
• Former Member: Boulder BI Brain Trust (#BBBT)
• Member: DAMA Houston & DAMA International
• Data Architecture and Data Warehouse Specialist
• 30+ years in IT
• 25+ years of Oracle-related work
• 20+ years of data warehousing experience
• Author & Co-Author of a bunch of books (Amazon)
• Past-President of ODTUG and Rocky Mountain Oracle User Group
About Snowflake
• Founded in 2012 by industry veterans with over 120 database patents
• Experienced, accomplished leadership team
• Vision: a world with no limits on data
• First data warehouse built for the cloud
• Over 1000 customers since GA
Agenda
• Data Challenges
• What is a Data Warehouse as a Service?
• What can a Cloud DWaaS do for me?
• Features of a Cloud DWaaS
• Top 10 (or so) Cool Features of Snowflake
• Becoming more Agile with DWaaS
• Where Snowflake fits
• Spark Integration
• Continuous Loading
• Conclusion
Data challenges today
Scenarios with affinity for cloud
• Connecting applications, devices, and “things”
• Reaching employees, business partners, and consumers
• Anytime, anywhere mobility
• On demand, unlimited scale
• Understanding behavior; generating, retaining, and analyzing data
Gartner 2016 prediction: By 2018, six billion connected things will be requesting support.
40 Zettabytes by 2020
Web, ERP, 3rd party apps, Enterprise apps, IoT, Mobile
It’s not the data itself, it’s how you take full advantage of the insight it provides
Web, ERP, 3rd party apps, Enterprise apps, IoT, Mobile
Most firms don’t consistently turn data into action
• 73% of firms aspire to be data-driven.
• 29% of firms are good at turning data into action.
[Diagram: all possible data vs. all possible action]
Source: Forrester
The data struggle
• Data is stale
• Hard to incorporate new data sources
• Too much time waiting for data
• Too much time spent on manual administration
• Reports are too slow whenever the system is busy
• Hard to experiment with data
Symptoms of fundamental challenges
• Data silos: data locked into separate databases, big data systems, and applications
• Inflexibility: slow, cumbersome scaling and limited support for diverse data
• Complexity: multiple systems to integrate and manage, requiring specialized skills and tools
• Performance: contention for limited resources resulting in latency and delays
• Cost: painful upfront costs and overprovisioned capacity
The evolution of data platforms
• 1980s: Relational database (Oracle, DB2, SQL Server)
• 1990s: Data warehouse appliance (Teradata)
• 2000s: Data warehouse & platform software (Vertica, Greenplum, Paraccel, Hadoop)
• 2010s: Cloud DWaaS (Snowflake)
What is a Cloud DWaaS?
DW – Data Warehouse
• Relational database
• Uses standard SQL
• Optimized for fast loads and analytic queries
aaS – As a Service
• Like SaaS (e.g. Salesforce.com)
• No infrastructure set up
• Minimal to no administration
• Managed for you by the vendor
• Pay as you go, for what you use
Goals of a Cloud DWaaS
Make your life easier
• So you can load and use your data faster
Support business
• Make data accessible to more people
• Reduce time to insights
Handle big data too!
• Schema-less ingestion
What to Expect from a Cloud DWaaS
It should support standard SQL (natively)
• It should support standard ETL, BI & data science tools
• ODBC or JDBC connectivity
It should be infinitely scalable (cloud)
• Handle huge amounts of data
• Handle large numbers of concurrent queries without performance degradation
It should handle flexible-schema data types
• No sharding or ETL required
What to Expect from a Cloud DWaaS
It should be secure
• Built in encryption?
It should be stable
• Resiliency and availability should be easy to configure and manage
It should be easy to configure and manage
It should provide a lower TCO
• Cloud scale pricing
Common customer scenarios
• Data warehouse for SaaS offerings: use Cloud DW as back-end data warehouse supporting data-driven SaaS products
• NoSQL replacement: replace use of NoSQL system (e.g. Hadoop) for transformation and SQL analytics of multi-structured data
• Data warehouse modernization: consolidate legacy datamarts and support new projects
Enabling key use cases
• Datamart & data silo consolidation: consolidate legacy datamarts to eliminate silos and support new projects
• Integrated data analytics: directly load structured + semi-structured data into Snowflake for reporting & analytics
• Data Science, Exploratory & ad hoc analytics: direct access to data for SQL analysts and data scientists to explore data, identify correlations, build & test models
Introducing Snowflake
Snowflake: 1st Data Warehouse Built for the Cloud
Data Warehousing…
• SQL relational database
• Optimized storage & processing
• Standard connectivity – BI, ETL, …
…for Everyone
• Existing SQL skills and tools
• “Load and go” ease of use
• Cloud-based elasticity to fit any scale
• Data scientists, SQL users & tools
What is Snowflake?
Our Vision
Provide all users anytime, anywhere insights so they can make actionable decisions based on data
Our Solution
Next-generation data warehouse built from the ground up for the cloud and for today’s data and analytics
• SQL Data Warehouse
• Built for the cloud
• Delivered as a service
Snowflake’s differentiating technology
Unique architecture: Multi-cluster, shared data
• Single place for data
• Instant, unlimited scalability
• Zero management
• Instant, live data sharing
The Data Warrior’s
Top 10+ Cool Things About Snowflake
(A Data Geek’s Guide to Cloud-native DW)
#10 – Persistent Result Sets
• No setup
• In Query History
• By Query ID
• 24 Hours
• No re-execution
• No Cost for Compute
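The bullets above describe results that stay retrievable by query ID for 24 hours without re-execution. A minimal Python sketch of that idea follows. It is a toy model under stated assumptions, not Snowflake's implementation: the `ResultCache` class, its method names, and the injectable clock are all hypothetical.

```python
import time

RESULT_TTL_SECONDS = 24 * 60 * 60  # the 24-hour window named on the slide


class ResultCache:
    """Toy model of persisted query results (hypothetical, for illustration)."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so the expiry can be tested
        self._results = {}       # query_id -> (stored_at, rows)

    def store(self, query_id, rows):
        """Save the rows produced by a finished query under its ID."""
        self._results[query_id] = (self._clock(), list(rows))

    def fetch(self, query_id):
        """Return saved rows without re-running the query, or None if expired or unknown."""
        entry = self._results.get(query_id)
        if entry is None:
            return None
        stored_at, rows = entry
        if self._clock() - stored_at > RESULT_TTL_SECONDS:
            del self._results[query_id]
            return None
        return rows
```

Fetching by ID only reads stored rows, which is why the slide can claim no compute cost for a repeat look at the same result.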
#9 – Connect w/JDBC & ODBC to the cloud
Interfaces: ODBC, JDBC, Web UI, Java, scripting (>_)
[Diagram: data sources and tools connecting to Snowflake: custom & packaged applications & data science; reporting & analytics; data modeling, management & transformation (SDDM)]
#8 - UNDROP
UNDROP TABLE <table name>
UNDROP SCHEMA <schema name>
UNDROP DATABASE <db name>
Part of Time Travel feature: AWESOME!
#7 Fast Clone (Zero-Copy)
Instant copy of table, schema, or database:
CREATE OR REPLACE TABLE MyTable_V2
CLONE MyTable;
With Time Travel:
CREATE SCHEMA mytestschema_clone_restore
CLONE testschema
BEFORE (TIMESTAMP =>
TO_TIMESTAMP(40*365*86400));
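The argument `40*365*86400` in the Time Travel example is easier to read once it is evaluated. The Python sketch below does the arithmetic, assuming the integer is interpreted as seconds since the Unix epoch and displayed in UTC; the variable names are illustrative only.

```python
from datetime import datetime, timezone

# 40 "years" of 365 days, each day 86400 seconds (leap days are ignored).
seconds = 40 * 365 * 86400
moment = datetime.fromtimestamp(seconds, tz=timezone.utc)

print(seconds)             # 1261440000
print(moment.isoformat())  # 2009-12-22T00:00:00+00:00
```

Because the expression skips the ten leap days between 1970 and 2010, it lands on 22 December 2009 rather than 1 January 2010. In practice a literal timestamp is usually clearer than arithmetic like this.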
[Diagram: the PROD database (schema PUBLIC with Tables A, B, and C) cloned to DEV and INT, each with the same PUBLIC schema and tables]
#6 – JSON Support with SQL
Structured data (e.g. CSV):
Apple 101.12 250 FIH-2316
Pear 56.22 202 IHO-6912
Orange 98.21 600 WHQ-6090
Semi-structured data (e.g. JSON, Avro, XML):
{ "firstName": "John",
"lastName": "Smith",
"height_cm": 167.64,
"address": {
"streetAddress": "212nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021-3100"
},
"phoneNumbers": [
{ "type": "home", "number":"212555-1234"},
{ "type": "office", "number": "646 555-4567"}
]
}
Optimized storage
Flexible schema - Native
Relational processing
select v:lastName::string as last_name
from json_demo;
All Your Data!
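The query `select v:lastName::string` reaches into a JSON document by path and casts the value. As a rough analogy only, the Python sketch below walks the same kind of path over a trimmed copy of the slide's document. The `get_path` helper is hypothetical and is not part of any Snowflake client library.

```python
import json

record = json.loads("""
{ "firstName": "John",
  "lastName": "Smith",
  "address": {"city": "New York", "state": "NY"},
  "phoneNumbers": [
    {"type": "home", "number": "212555-1234"},
    {"type": "office", "number": "646 555-4567"}
  ]
}
""")


def get_path(doc, *keys):
    """Walk nested dicts and lists; return None when any step is missing."""
    current = doc
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return None
    return current


last_name = str(get_path(record, "lastName"))           # like v:lastName::string
city = get_path(record, "address", "city")              # nested object
office = get_path(record, "phoneNumbers", 1, "number")  # array element
missing = get_path(record, "address", "country")        # absent key gives None
```

Returning None for an absent key mirrors the flexible-schema point of the slide: documents do not all need the same fields for the query to run.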
#5 – Standard SQL w/Analytic Functions
select Nation, Customer, Total
from (select
n.n_name Nation,
c.c_name Customer,
sum(o.o_totalprice) Total,
rank() over (partition by n.n_name
order by sum(o.o_totalprice) desc)
customer_rank
from orders o,
customer c,
nation n
where o.o_custkey = c.c_custkey
and c.c_nationkey = n.n_nationkey
group by 1, 2)
where customer_rank <= 3
order by 1, customer_rank
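To try the ranking pattern without a Snowflake account, the sketch below runs an equivalent query in Python's built-in `sqlite3` module on a few made-up rows. Assumptions: SQLite 3.25 or newer (for window functions), a top 2 instead of top 3 so the tiny data set shows the cutoff, and the aggregate computed in an inner query before ranking. Table contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table nation   (n_nationkey integer, n_name text);
create table customer (c_custkey integer, c_name text, c_nationkey integer);
create table orders   (o_custkey integer, o_totalprice real);
insert into nation values (1, 'FRANCE'), (2, 'JAPAN');
insert into customer values (1, 'C1', 1), (2, 'C2', 1), (3, 'C3', 1), (4, 'C4', 2);
insert into orders values (1, 100.0), (1, 50.0), (2, 120.0), (3, 10.0), (4, 70.0);
""")

top_customers = conn.execute("""
select nation, customer, total
from (
    select nation, customer, total,
           rank() over (partition by nation order by total desc) as customer_rank
    from (
        -- total order value per customer, per nation
        select n.n_name as nation, c.c_name as customer,
               sum(o.o_totalprice) as total
        from orders o
        join customer c on o.o_custkey = c.c_custkey
        join nation n on c.c_nationkey = n.n_nationkey
        group by n.n_name, c.c_name
    )
)
where customer_rank <= 2
order by nation, customer_rank
""").fetchall()

print(top_customers)
```

The window function restarts its ranking for each nation, so the outer filter keeps the best customers per nation rather than the best overall.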
SQL
Complete SQL database
• Data definition language (DDLs)
• Query (SELECT)
• Updates, inserts and deletes (DML)
• Role based security
• Multi-statement transactions
New partner post: https://sonra.io/2018/02/04/create-custom-aggregate-udaf-window-functions-snowflake/
#4 – Separation of Storage & Compute
Snowflake’s multi-cluster, shared data architecture
Centralized storage
Instant, automatic scalability & elasticity
Service
Compute
Storage
#3 – Support Multiple Workloads
Deliver faster analytics at any scale
• Accelerate the data pipeline: run loading & analytics at any time, concurrently, to get data to users faster
• Scale compute to support any workload: scale processing horsepower up and down on-the-fly, with zero downtime or disruption
• Scale concurrency without performance impact: multi-cluster “virtual warehouse” architecture scales concurrent users & workloads without contention
[Diagram: separate virtual warehouses for Loading, Marketing, and Finance]
Instant, unlimited scalability
• Elastic scaling for storage: low-cost cloud storage, fully replicated and resilient
• Elastic scaling for compute: virtual warehouses scale up & down on the fly to support workload needs
• Elastic scaling for concurrency: automatically scale concurrency using multi-cluster virtual warehouses
#2 – Secure by Design with Automatic Encryption of Data!
Authentication
• Embedded multi-factor authentication
• Federated authentication available
Access control
• Role-based access control model
• Granular privileges on all objects & actions
Data encryption
• All data encrypted, always, end-to-end
• Encryption keys managed automatically
• NEW: Tri-secret security
External validation
• Certified against enterprise-class requirements
• HIPAA Certified!
• PCI Certified!
#1 - Automatic Query Optimization
Zero Management
Fully managed with no knobs or tuning required
No indexes, distribution keys, partitioning,
vacuuming,…
Zero infrastructure costs
Zero admin costs
New #1 Fav – Data Sharing (The Data “Sharehouse”)
[Diagram: data providers sharing with data consumers]
• No data movement: share with unlimited number of consumers
• Live access: data consumers immediately see all updates
• Ready to use: consumers can immediately start querying
Data Sharing - Configuring access
Data provider:
create share s1; -- empty share
grant usage on database sales to share s1; -- add database
grant usage on schema sales.east to share s1; -- add schema
grant select on view sales.east.accts to share s1; -- add view
alter share s1 add accounts=a1, a2, a3; -- add accounts
[Diagram: data share object s1 in the provider database holds a schema list, a view list, and an account list (a1, a2, a3)]
Data consumers:
create database sales from share p1.s1;
Enabling Agile DW in the Cloud
Examples
Enabling the Agile Data Warehouse
• Agile Warehouse Scaling
  • Separation of Workloads
  • Virtual WH Scaling Techniques
• Agile Data Lifecycle
  • Cloning
Agile Warehouse Scaling
Separation of Workloads
• DWaaS - Snowflake solution
• Assign each workload its own Virtual WH
• At a minimum, two WH: ETL and Business
• ETL can run continuously if it makes sense for the
business
• ETL / Business workload contention is eliminated
• Further subdivide business workloads into
own clusters as needed
• Eliminate contention between Sales, Marketing,
Finance, Data Science workloads
• International groups can operate clusters on their
own local time schedules
• Easy to add new workloads on demand!
Separation of Workloads
[Diagram: one shared set of databases served by separate virtual warehouses for ETL & data loading and for business workloads: Finance, Test/Dev, Marketing, Sales, and Research]
• Increase cluster size – on the fly!
• More data being analyzed
• More complex queries
• Get some concurrency boost
• No need to start, stop, or reboot!
• Automatic scale out!
• Multi-cluster warehouse concurrency
• Workload queries have usual weight
• But more of them, e.g. 20 dashboard users rather than the
usual 5
• Automatically scales back after the load
drops
Other Agile Warehouse Scaling Techniques
• Anticipated surges
• Explicitly increase WH nodes (T-shirt size) when expecting more data
• Explicitly increase MCWH minimum clusters when expecting more queries
• Can do both at once with ALTER WAREHOUSE
• Use cron or other scheduling/orchestration tool
• Unanticipated surges
• Rely on MCWH maximum clusters for some extra headroom
• Maximize Business Agility!
• Responsiveness for users
• Throughput and value extracted from variable compute power
• Minimize
• Cost and administrative overhead
Agile Warehouse Scaling – Best Practices
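The best practices above rely on a multi-cluster warehouse staying between a minimum and a maximum cluster count as query load changes. The Python sketch below shows that clamping logic. It is an assumption-laden illustration, not Snowflake's scaling policy: `clusters_needed` and the `queries_per_cluster` capacity figure are hypothetical.

```python
import math


def clusters_needed(concurrent_queries, queries_per_cluster, min_clusters, max_clusters):
    """Clamp the cluster count implied by the load to the configured range."""
    if queries_per_cluster <= 0:
        raise ValueError("queries_per_cluster must be positive")
    wanted = math.ceil(concurrent_queries / queries_per_cluster)
    return max(min_clusters, min(max_clusters, wanted))


# The usual 5 dashboard users fit in one cluster; a surge to 20 scales out.
print(clusters_needed(5, 8, min_clusters=1, max_clusters=4))    # 1
print(clusters_needed(20, 8, min_clusters=1, max_clusters=4))   # 3
print(clusters_needed(100, 8, min_clusters=1, max_clusters=4))  # 4
```

Raising the minimum ahead of an anticipated surge avoids waiting for scale-out, while the maximum caps cost during an unanticipated one.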
Agile Data Lifecycle
• Separation of Workloads
• Individual virtual warehouse for each dev/test/prod functional area
• CLONE for dev/test or Data Science Sandbox – on demand!
• Full logical copy of the data, but uses no extra storage
• Test/dev operations against clone have no effect on original data
• Security
• RBAC limits dev/test access to clone and not production data
• Secure Views permit role- or user-based obfuscation / masking / projection
• Business Impact – better quality code
• Dev and test teams are working on data at scale, see true app performance
• Full range of values means fewer surprises when app encounters live data
Agile Data Lifecycle
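The slide says a clone is a full logical copy that uses no extra storage and that changes to the clone leave the original untouched. One common way to get that behavior is to share immutable data blocks and copy only the references. The Python sketch below illustrates that idea under this assumption. The deck does not describe Snowflake's internal mechanism, and the `Table` class is hypothetical.

```python
class Table:
    """Toy table: a list of references to immutable blocks of rows."""

    def __init__(self, blocks=()):
        self.blocks = list(blocks)

    def clone(self):
        # Copy only the list of references; the blocks themselves are shared.
        return Table(self.blocks)

    def append_rows(self, rows):
        # A write adds a new block to this table only.
        self.blocks.append(tuple(rows))

    def rows(self):
        return [row for block in self.blocks for row in block]


prod = Table([(("a", 1), ("b", 2))])
dev = prod.clone()
dev.append_rows([("c", 3)])

print(prod.rows())  # [('a', 1), ('b', 2)]
print(dev.rows())   # [('a', 1), ('b', 2), ('c', 3)]
```

Only the rows written after cloning take new space, which is why dev and test teams can work against production-scale data cheaply.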
Scenario 1
• Create development (DEV) and integration (INT) databases from production (PROD)
[Diagram: PROD (schema PUBLIC with Tables A and B) is cloned to INT and to DEV, each with the same PUBLIC schema and tables]
CREATE OR REPLACE DATABASE DEV CLONE PROD;
CREATE OR REPLACE DATABASE INT CLONE PROD;
Scenario 2: new development
• Create two new tables, C and D, in the development (DEV) database
[Diagram: PROD and INT each hold PUBLIC with Tables A and B; DEV holds PUBLIC with Tables A, B, C, and D]
Scenario 2: new development
• Mini-release: promote table C for integration testing
[Diagram: Table C is created in INT from DEV with CREATE TABLE LIKE; PROD holds Tables A and B; DEV holds Tables A, B, C, and D]
Scenario 2: new development
• Deploy to production: promote table C to PROD database
[Diagram: Table C is created in PROD with CREATE TABLE LIKE; PROD and INT now hold Tables A, B, and C; DEV holds Tables A, B, C, and D]
Scenario 2: new development
• Refresh dev: get latest PROD data into DEV and INT
[Diagram: PROD (Tables A, B, and C) is cloned into a new DEV2 database; Table D, which exists only in DEV, is then cloned from DEV into DEV2]
CREATE OR REPLACE DATABASE DEV2 CLONE PROD;
CREATE OR REPLACE TABLE Dev2.TableD CLONE Dev.TableD;
• CLONE for Data Scientists
• Quick and Safe sandbox for discovery and testing
• Experiment and explore, and write back!
• Test hypotheses and keep iterating – drop and reclone as needed
• No impact to production data
• Combine with own virtual warehouse for complete isolation
• Business Impact – better data science
• More fine-grained data over longer time intervals
• Deeper insights, better forecasting, more monetizable results
• CLONE for Compliance
• Monthly, quarterly, annual clones – financial reporting, auditing requirements
• Business Impact – simpler compliance
• Your "backups" are live and immediately available
Agile Data Lifecycle
Snowflake & Spark
Where Snowflake fits with complementary solutions
Data sources for analytics
• Vast majority, 80% or more, is semi-/structured (tables, CSV, JSON, Avro, Parquet, XML, flat files, etc.)
• 20% or less is unstructured (audio, video, text, etc.)
Snowflake: built for the cloud, relational data warehouse/DB & analytics delivered as a service (tables, metadata)
Use cases
• Live analytics
• Operational DW
• Real-time dashboards
• Continuous data loading
• Change data capture
• Traditional DW/BI
• Consolidation of data silos
• Relational data exploration
• Interactive queries
• Data science/statistics (Python, R)
• And more
Complementary: special data processing (Spark ML, Hadoop lakes, NoSQL front-ends, in-memory DBs, non-relational exploring, et al.)
Use cases
• Spark machine learning
• Unstructured data mining
• Streaming analytics
• Non-relational data exploration
• Non-relational data science
• And more
Modern Data Architectures
[Diagram: three stages. Data Production (web servers, databases, app metrics) feeds a Data Lake and Snowflake through data augmentation & loading. Data Consumption covers business intelligence, data science (exploration; machine learning model training), and model-driven apps: online recommendations & behavioral targeting on web pages, offline churn prediction, and other machine learning apps]
Modern Data Architectures: Data Loading & Augmentation
[Diagram: data production (web servers, app metrics, databases; customers & end users) reaches Snowflake via Kafka, Kinesis, SnowSQL COPY, and CDC (FiveTran). A Spark cluster applies transformations and augmentations (e.g. Python, Scala, SparkSQL, Spark Streaming), optionally with external data, and exchanges source/raw and augmented data with the Snowflake warehouse and a full/pruned data lake through the Snowflake Spark Connector. Transformations and augmentations can also be done using Snowflake SQL]
Modern Data Architectures: Data Science
[Diagram: data exploration and model design in a Jupyter notebook (using e.g. ML) and data exploration using Snowflake SQL; model (re-)training in a Spark app using Spark ML produces an operationalized model (code + params). The Spark cluster, optionally augmented with external data, exchanges source/raw and augmented data with the Snowflake warehouse and the data lake through the Snowflake Spark Connector]
Architecture: Data Interchange (Read)
[Diagram: a Spark cluster (e.g. in AWS EMR; master node and slave nodes) reads from a Snowflake warehouse (cloud services and virtual warehouse) in three numbered steps. Labeled flows: Query (SQL); Data Flow: Snowflake Tables; Data Flow: Data Frames; data is staged through AWS S3]
Architecture: Data Interchange (Write)
[Diagram: a Spark cluster (e.g. in AWS EMR; master node and slave nodes) writes to a Snowflake warehouse (cloud services and virtual warehouse) in three numbered steps. Labeled flows: Data Flow: Data Frames; ControlFlow (SQL); Data Flow: Snowflake Tables; data is staged through AWS S3]
Spark Cluster
Snowflake Virtual Warehouse
Super-Charge Spark Processing with Snowflake
• Spark optimizer extension
automatically identifies Spark
operations with corresponding
Snowflake implementations
• Spark connector pushes these
operations into Snowflake SQL
• Pushed operations include: project,
filter, join, aggregation, limit
• Optimized Snowflake SQL applied to
all Spark data frames backed by
Snowflake tables
• Transparent to Spark language choice:
No need to rewrite Spark (Python,
Scala, Java, …) or SparkSQL code
df1 = spark.load…('T1')
df2 = spark.load…('T2')
df1
.join(df2, 'id')
.groupBy('region')
.count()
.collect()
SELECT region, count(*)
FROM T1
JOIN T2
ON T1.id = T2.id
GROUP BY region
Snowflake SQL
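The pushdown claim is that a join, group-by, and count expressed on data frames can be answered by one SQL statement inside the warehouse with the same result. The Python sketch below checks that equivalence on invented rows. It uses the built-in `sqlite3` module as a stand-in for the SQL engine and plain Python for the client-side pipeline; no Spark or Snowflake connector is involved, and the sample data is hypothetical.

```python
import sqlite3
from collections import Counter

t1 = [(1, "east"), (2, "west"), (3, "east")]               # (id, region)
t2 = [(1, 10.0), (1, 12.5), (2, 3.0), (3, 7.0), (4, 1.0)]  # (id, amount)

# Client-side version: join on id, group by region, count joined rows.
matches_per_id = Counter(row_id for row_id, _ in t2)
client_side = Counter()
for row_id, region in t1:
    client_side[region] += matches_per_id[row_id]
client_side = {region: n for region, n in client_side.items() if n > 0}

# Pushed-down version: the same work expressed as the slide's SQL.
conn = sqlite3.connect(":memory:")
conn.execute("create table T1 (id integer, region text)")
conn.execute("create table T2 (id integer, amount real)")
conn.executemany("insert into T1 values (?, ?)", t1)
conn.executemany("insert into T2 values (?, ?)", t2)
pushed_down = dict(conn.execute(
    "SELECT region, count(*) FROM T1 JOIN T2 ON T1.id = T2.id GROUP BY region"
))

print(client_side == pushed_down)  # True
```

With pushdown, only the small grouped result leaves the warehouse instead of every row of both tables, which is where the savings come from.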
AWS
S3
Results
• Reduced complexity, better manageability
• Same data store for curated and raw data
• Same data store for data science and BI data sets
• Fully transactional data store even for raw data when needed
• Performance benefits
• Leveraging structure of the data even in semi-structured data formats such as JSON as compared to flat file
storage
• Metadata for pruning based on queries
• Easy to integrate with Spark as compute and application platform
• Overall compute capacity and cost savings due to performance benefits with
Snowflake
• Blog posts:
• https://www.snowflake.net/snowflake-and-spark-part-1-why-spark/
• https://www.snowflake.net/snowflake-spark-part-2-pushing-query-processing/
Snowflake + Spark Benefits
NEW
Continuous Data Loading with Snowpipe
Data is Being Generated Faster than Ever
• Before: infrequent. Data accumulated over time.
• Now: constant. Data is created constantly.
Sources: web data, corp data, app data
There’s an enormous opportunity to use continuously generated data in analysis.
Snowpipe is an automated service that
asynchronously listens for new data as it arrives in
Amazon Web Services’ S3 cloud storage service
and continuously loads that data into Snowflake.
Data Ingress Approaches & Snowpipe Support
Approach | Approach Definition
Batch | Data accumulates over time (hours, days) and is then loaded periodically
Micro-batch | Data accumulates over small time windows (minutes) and is then loaded
Continuous | Every data item is loaded individually as it arrives
Snowpipe:
Option 1: Point at an S3 bucket and a destination table in your warehouse, at which point new data is automatically uploaded.
Option 2: Technical resources can interface directly with the programmatic REST API along with Java and Python SDKs to enable highly customized loading use cases.
Snowpipe Scenario: Automatic Loading from S3
[Diagram: an external S3 bucket sends an S3 notification to the Snowpipe service, whose server-less loader loads the file data into the Snowflake database]
Snowpipe – Current Preview
[Diagram: an application makes a REST call with {file names} for files in S3 to the Snowpipe service's REST endpoint; the server-less loader loads those files into the Snowflake database]
Snowpipe Auto-Ingest: Fully Automatic Loading from S3
[Diagram: an external S3 bucket sends an S3 notification to the Snowpipe service, whose server-less loader loads the file data into the Snowflake database]
Use simple Snowflake DDL statements to configure a bucket for automatic load with Snowpipe
See the video here: https://www.youtube.com/embed/rwZWmcy0aBU?autoplay=1&rel=0
Blog posts:
https://www.snowflake.net/your-first-steps-with-snowpipe/
https://www.snowflake.net/snowpipe-serverless-loading-for-streaming-data-2/
Key Snowpipe Benefits
• Continuously generated data is available for analysis in seconds
• Avoid repeated manual COPY commands
• You only pay for the compute time you use to load data
• Zero management. No indexing, tuning, partitioning or vacuuming on load
• Full support for semi-structured data on load
• Server-less: No servers to manage or concurrency to worry about.
• Server-less billing model: utilization-based billing
• No warehouse to manage for load
• Per core, per second granularity
• Charged as Snowflake credits
• New line item on Snowflake bill
• Components of the Snowpipe
charges:
• Loading work – idle times are not charged
• Time spent in metadata management for file
loading
Pricing
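The billing bullets describe metering per core, per second, with loading work and file-metadata time as the two charged components and idle time free. The Python sketch below turns that into arithmetic. The deck gives no rate, so `credits_per_core_hour` is a purely hypothetical parameter, as are the function name and the sample figures.

```python
def snowpipe_credits(core_seconds_loading, core_seconds_metadata, credits_per_core_hour):
    """Credits for one period under per-core, per-second metering (illustrative only)."""
    # Idle time never appears here: only the two charged components are summed.
    billable_seconds = core_seconds_loading + core_seconds_metadata
    return billable_seconds * credits_per_core_hour / 3600


# Half a core-hour of loading at a made-up rate of 1 credit per core-hour.
print(snowpipe_credits(1800, 0, credits_per_core_hour=1.0))    # 0.5
# 3000 s of loading plus 600 s of metadata work at a made-up rate of 2.
print(snowpipe_credits(3000, 600, credits_per_core_hour=2.0))  # 2.0
```

The point of per-second granularity is that a trickle of small files costs in proportion to the work done, with no warehouse sitting idle between arrivals.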
Snowflake in Action Today
What customers are doing with Snowflake
[Diagram: pipeline from data sources through staging / data lake, data warehouse, and data marts & extracts to reporting, analytics & applications]
• Market research company consolidated data marts to reduce costs and data silos
• Gaming company replaced Hadoop + SQL database with Snowflake
• Consumer retailer modernizing DW by replacing legacy appliance with Snowflake
• Mobile analytics company shares live data with clients
Delivering compelling results
• Simpler data pipeline: replace noSQL database with Snowflake for storing & transforming JSON event data. noSQL database: 8 hours to prepare data. Snowflake: 1.5 minutes.
• Faster analytics: replace on-premises data warehouse with Snowflake for analytics workload. Data warehouse appliance: 20+ hours. Snowflake: 45 minutes.
• Significantly lower cost: improved performance while adding new workloads, at a fraction of the cost. Data warehouse appliance: $5M+ to expand. Snowflake: added 2 new workloads for $50K.
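The before-and-after figures on this slide (8 hours vs 1.5 minutes, 20+ hours vs 45 minutes, $5M+ vs $50K) can be turned into ratios with simple arithmetic. The Python sketch below does only that. The `speedup` helper is illustrative, and because "20+" and "$5M+" are lower bounds, the second and third ratios are minimums.

```python
def speedup(before_minutes, after_minutes):
    """How many times faster the new run is than the old one."""
    return before_minutes / after_minutes


prep = speedup(8 * 60, 1.5)       # noSQL database vs Snowflake data preparation
analytics = speedup(20 * 60, 45)  # appliance (20+ hours) vs Snowflake
cost_ratio = 5_000_000 / 50_000   # $5M+ expansion vs $50K for 2 new workloads

print(prep)                 # 320.0
print(round(analytics, 1))  # 26.7
print(cost_ratio)           # 100.0
```

The cost comparison sets an appliance expansion against two added workloads, so it is a rough order-of-magnitude contrast rather than a like-for-like price ratio.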
Over 1000 customers demonstrating what’s possible
• Up to 200x faster reports that enable analysts to make decisions in minutes rather than days
• Ability to load and update data in near real time by replacing legacy data warehouse + Hadoop cluster
• New applications that provide secure access to analytics to 11,000+ pharmacies
Chosen by leading enterprises
• Built advertising analytics platform on Snowflake
• Moving enterprise reporting to Snowflake
• Deploying BI solution stack on Snowflake
• Built new embedded analytics application on Snowflake
• Moved audience reporting infrastructure to Snowflake
• Replaced Hadoop + legacy data warehouse with Snowflake
Ranked #1 Cloud Data Warehouse!
“Snowflake Hits All the Marks” – Gigaom
[Chart: Gigaom Disruption Vectors scores for cloud analytic databases: Snowflake, Google Big Query, Teradata, DashDB (IBM), Vertica, Azure Data Warehouse, SAP HANA Cloud Platform, Oracle Database Exadata Cloud Service, and AWS Redshift. Scores shown range from 2.60 to 4.85. Weighted criteria: Robustness of SQL 15%, Built-in Optimization 15%, On-the-fly Elasticity 25%, Dynamic Environment Adaption 20%, Separation of Compute from Storage 15%, Support for Diverse Data 10%]
“You can tell the data warehouse pedigree from the development… With superior performance and the most hands-off model of ownership, Snowflake is the epitome of data warehouse as a service. The model, cost, features and scalability have already caused some to postpone Hadoop adoption.”
William McKnight, Gigaom
Gigaom Analyst Report: Sector Roadmap: Cloud Analytic Databases 2017
Read the full report on snowflake.net
What does a Cloud-native DWaaS Provide?
• Cost effective storage and analysis of GBs, TBs, or even PBs
• Lightning fast query performance
• Continuous data loading without impacting query performance
• Unlimited user concurrency
• Full SQL relational support of both structured and semi-structured data
• Support for the tools and languages you already use (ODBC, JDBC, Web UI, Java, scripting)
Big Data does not have to equal Big Effort
Web, ERP, 3rd party apps, Enterprise apps, IoT, Mobile
Discover the performance, concurrency, and simplicity of Snowflake
As easy as 1-2-3!
01 Visit Snowflake.net
02 Click “Try for Free”
03 Sign up & register
Sign up and receive $400 worth of free usage for 30 days!
Snowflake is the only data warehouse built for the cloud. You can automatically scale compute up, out, or down, independent of storage. Plus, you have the power of a complete SQL database, with zero management, that can grow with you to support all of your data and all of your users. With Snowflake On Demand™, pay only for what you use.
Contact Information
Kent Graziano
Snowflake Computing
Kent.graziano@snowflake.net
On Twitter @KentGraziano
More info at http://snowflake.net
Visit my blog at http://kentgraziano.com
YOUR DATA, NO LIMITS
Thank You!
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 

Demystifying Data Warehousing as a Service - DFW

  • 1. © 2017 Snowflake Computing Inc. All Rights Reserved. YOUR DATA, NO LIMITS. @KentGraziano. KENT GRAZIANO, Chief Technical Evangelist, Snowflake Computing. Demystifying Data Warehousing as a Service (DWaaS)
  • 2. My Bio • Chief Technical Evangelist, Snowflake Computing • Oracle ACE Director (DW/BI) • OakTable Network • Blogger – The Data Warrior • Certified Data Vault Master and DV 2.0 Practitioner • Former Member: Boulder BI Brain Trust (#BBBT) • Member: DAMA Houston & DAMA International • Data Architecture and Data Warehouse Specialist • 30+ years in IT • 25+ years of Oracle-related work • 20+ years of data warehousing experience • Author & Co-Author of a bunch of books (Amazon) • Past-President of ODTUG and Rocky Mountain Oracle User Group
  • 3. About Snowflake • Experienced, accomplished leadership team • Founded in 2012 by industry veterans with over 120 database patents • Vision: A world with no limits on data • First data warehouse built for the cloud • Over 1000 customers since GA
  • 4. Agenda • Data Challenges • What is a Data Warehouse as a Service? • What can a Cloud DWaaS do for me? • Features of a Cloud DWaaS • Top 10 (or so) Cool Features of Snowflake • Becoming more Agile with DWaaS • Where Snowflake fits • Spark Integration • Continuous Loading • Conclusion
  • 5. Data challenges today
  • 6. Scenarios with affinity for cloud. Gartner 2016 Predictions: By 2018, six billion connected things will be requesting support. • Connecting applications, devices, and “things” • Reaching employees, business partners, and consumers • Anytime, anywhere mobility • On demand, unlimited scale • Understanding behavior; generating, retaining, and analyzing data
  • 7. 40 Zettabytes by 2020. [Diagram labels: Web, ERP, 3rd party apps, Enterprise apps, IoT, Mobile]
  • 8. It’s not the data itself; it’s how you take full advantage of the insight it provides. [Diagram labels: Web, ERP, 3rd party apps, Enterprise apps, IoT, Mobile]
  • 9. Most firms don’t consistently turn data into action. 73% of firms aspire to be data-driven. 29% of firms are good at turning data into action. Source: Forrester. [Diagram labels: All Possible Data, All Possible Action]
  • 10. The data struggle • Data is stale • Hard to incorporate new data sources • Too much time waiting for data • Too much time spent on manual administration • Reports are too slow whenever the system is busy • Hard to experiment with data
  • 11. Symptoms of fundamental challenges • Data silos: Data locked into separate databases, big data systems, and applications • Inflexibility: Slow, cumbersome scaling and limited support for diverse data • Complexity: Multiple systems to integrate and manage, requiring specialized skills and tools • Performance: Contention for limited resources resulting in latency and delays • Cost: Painful upfront costs and overprovisioned capacity
  • 12. The evolution of data platforms • 1980s: Relational database (Oracle, DB2, SQL Server) • 1990s: Data warehouse appliance (Teradata) • 2000s: Data warehouse & platform software (Vertica, Greenplum, Paraccel, Hadoop) • 2010s: Cloud DWaaS (Snowflake)
  • 13. What is a Cloud DWaaS? DW – Data Warehouse • Relational database • Uses standard SQL • Optimized for fast loads and analytic queries. aaS – As a Service • Like SaaS (e.g. Salesforce.com) • No infrastructure set up • Minimal to no administration • Managed for you by the vendor • Pay as you go, for what you use
  • 14. Goals of a Cloud DWaaS. Make your life easier • So you can load and use your data faster. Support business • Make data accessible to more people • Reduce time to insights. Handle big data too! • Schema-less ingestion
  • 15. What to Expect from a Cloud DWaaS. It should support standard SQL (natively) • It should support standard ETL, BI & data science tools • ODBC or JDBC connectivity. It should be infinitely scalable (cloud) • Handle huge amounts of data • Handle large numbers of concurrent queries without performance degradation. It should handle flexible schema data types • No sharding or ETL required
  • 16. What to Expect from a Cloud DWaaS. It should be secure • Built-in encryption? It should be stable • Resiliency and availability should be easy to configure and manage. It should be easy to configure and manage. It should provide a lower TCO • Cloud scale pricing
  • 17. Common customer scenarios • Data warehouse for SaaS offerings: Use Cloud DW as back-end data warehouse supporting data-driven SaaS products • noSQL replacement: Replace use of noSQL system (e.g. Hadoop) for transformation and SQL analytics of multi-structured data • Data warehouse modernization: Consolidate legacy datamarts and support new projects
  • 18. Enabling key use cases • Datamart & data silo consolidation: Consolidate legacy datamarts to eliminate silos and support new projects • Integrated data analytics: Directly load structured + semi-structured data into Snowflake for reporting & analytics • Data Science, Exploratory & ad hoc analytics: Direct access to data for SQL analysts and data scientists to explore data, identify correlations, build & test models
  • 19. Introducing Snowflake
  • 20. Snowflake: 1st Data Warehouse Built for the Cloud. Data Warehousing… • SQL relational database • Optimized storage & processing • Standard connectivity – BI, ETL, … …for Everyone • Existing SQL skills and tools • “Load and go” ease of use • Cloud-based elasticity to fit any scale • Data scientists • SQL users & tools
  • 21. What is Snowflake? Built for the cloud SQL Data Warehouse. Our Vision: Provide all users anytime, anywhere insights so they can make actionable decisions based on data. Our Solution: Next-generation data warehouse built from the ground up for the cloud and for today’s data and analytics. Delivered as a service.
  • 22. Snowflake’s differentiating technology. Unique architecture: Multi-cluster, shared data • Single place for data • Instant, unlimited scalability • Zero management • Instant, live data sharing
  • 23. The Data Warrior’s Top 10+ Cool Things About Snowflake (A Data Geek’s Guide to Cloud-native DW)
  • 24. #10 – Persistent Result Sets • No setup • In Query History • By Query ID • 24 Hours • No re-execution • No Cost for Compute
  • 25. #9 – Connect w/JDBC & ODBC to the cloud. [Diagram labels: Data Sources; Custom & Packaged Applications & Data Science; Interfaces (ODBC, JDBC, Web UI, Java, Scripting); Reporting & Analytics; Data Modeling, Management & Transformation (SDDM)]
  • 26. #8 – UNDROP. UNDROP TABLE <table name>; UNDROP SCHEMA <schema name>; UNDROP DATABASE <db name>; Part of Time Travel feature: AWESOME!
  • 27. #7 – Fast Clone (Zero-Copy). Instant copy of table, schema, or database: CREATE OR REPLACE TABLE MyTable_V2 CLONE MyTable; With Time Travel: CREATE SCHEMA mytestschema_clone_restore CLONE testschema BEFORE (TIMESTAMP => TO_TIMESTAMP(40*365*86400)); [Diagram: PROD, INT, and DEV databases, each with a PUBLIC schema containing Table A, Table B, and Table C]
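Why can a clone be "instant" and take no extra storage? The slide does not describe the internals, so the following is only a toy model under one stated assumption: table data lives in immutable blocks, and a table is just a list of references to those blocks. The `Partition` and `Table` classes are hypothetical illustrations, not Snowflake APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Partition:
    """An immutable block of rows; never modified after it is written."""
    rows: tuple


@dataclass
class Table:
    """Toy table: nothing but a list of references to immutable partitions."""
    partitions: list = field(default_factory=list)

    def insert(self, *rows):
        # New data always lands in a new partition; old partitions stay untouched.
        self.partitions.append(Partition(tuple(rows)))

    def clone(self):
        # Copy the list of references only; no row data is duplicated.
        return Table(list(self.partitions))

    def rows(self):
        return [row for part in self.partitions for row in part.rows]


prod = Table()
prod.insert(("Apple", 250), ("Pear", 202))

dev = prod.clone()           # only metadata is copied
dev.insert(("Orange", 600))  # a change made in the clone...

print(prod.rows())           # ...is not visible in the original
print(dev.rows())
print(dev.partitions[0] is prod.partitions[0])  # both share the same block
```

In this model the clone costs one small list copy regardless of table size, and writes to either side only add new blocks, which is the behavior the slide attributes to zero-copy cloning.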
  • 28. #6 – JSON Support with SQL. Structured data (e.g. CSV): Apple 101.12 250 FIH-2316; Pear 56.22 202 IHO-6912; Orange 98.21 600 WHQ-6090. Semi-structured data (e.g. JSON, Avro, XML): { "firstName": "John", "lastName": "Smith", "height_cm": 167.64, "address": { "streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": "10021-3100" }, "phoneNumbers": [ { "type": "home", "number": "212 555-1234" }, { "type": "office", "number": "646 555-4567" } ] } • Optimized storage • Flexible schema – Native • Relational processing: select v:lastName::string as last_name from json_demo; All Your Data!
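To make the path expression on the slide concrete, here is a small sketch of what `v:lastName::string` does, written with Python's standard `json` module rather than Snowflake itself. The document is a shortened version of the one on the slide; the variable names are illustrative only.

```python
import json

# A shortened version of the JSON document from the slide.
v = json.loads("""
{
  "firstName": "John",
  "lastName": "Smith",
  "address": {"city": "New York", "state": "NY"},
  "phoneNumbers": [
    {"type": "home",   "number": "212 555-1234"},
    {"type": "office", "number": "646 555-4567"}
  ]
}
""")

# Rough equivalent of: select v:lastName::string as last_name from json_demo;
last_name = str(v["lastName"])

# Nested attributes are reached by walking the path one key at a time.
city = str(v["address"]["city"])

# An array becomes one relational row per element once it is flattened.
phone_rows = [(last_name, p["type"], p["number"]) for p in v["phoneNumbers"]]

print(last_name, city)
for row in phone_rows:
    print(row)
```

The point of the slide is that the same navigation happens inside ordinary SQL, so the JSON does not need to be reshaped into columns before it is loaded.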
  • 29. #5 – Standard SQL w/Analytic Functions. select Nation, Customer, Total from (select n.n_name Nation, c.c_name Customer, sum(o.o_totalprice) Total, rank() over (partition by n.n_name order by sum(o.o_totalprice) desc) customer_rank from orders o, customer c, nation n where o.o_custkey = c.c_custkey and c.c_nationkey = n.n_nationkey group by 1, 2) where customer_rank <= 3 order by 1, customer_rank; Complete SQL database • Data definition language (DDL) • Query (SELECT) • Updates, inserts and deletes (DML) • Role based security • Multi-statement transactions. New partner post: https://sonra.io/2018/02/04/create-custom-aggregate-udaf-window-functions-snowflake/
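The top-3-customers-per-nation query above can be tried without a Snowflake account. The sketch below runs an equivalent query on Python's built-in SQLite (window functions require SQLite 3.25 or later). The sample rows are made up for illustration, and the aggregation is moved into its own subquery before ranking, which is a restructuring of the slide's query, not its exact text.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table nation   (n_nationkey integer, n_name text);
create table customer (c_custkey integer, c_name text, c_nationkey integer);
create table orders   (o_custkey integer, o_totalprice real);

-- Made-up sample data: four customers in FRANCE, one in JAPAN.
insert into nation   values (1, 'FRANCE'), (2, 'JAPAN');
insert into customer values (10, 'Cust#10', 1), (11, 'Cust#11', 1),
                            (12, 'Cust#12', 1), (13, 'Cust#13', 1),
                            (20, 'Cust#20', 2);
insert into orders   values (10, 100), (10, 50), (11, 300),
                            (12, 20), (13, 200), (20, 75);
""")

query = """
select nation_name, customer_name, total, customer_rank
from (select nation_name, customer_name, total,
             rank() over (partition by nation_name
                          order by total desc) as customer_rank
      from (select n.n_name as nation_name,
                   c.c_name as customer_name,
                   sum(o.o_totalprice) as total
            from orders o
            join customer c on o.o_custkey = c.c_custkey
            join nation n on c.c_nationkey = n.n_nationkey
            group by n.n_name, c.c_name))
where customer_rank <= 3
order by nation_name, customer_rank
"""

rows = con.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data, the fourth French customer (Cust#12, total 20) is ranked 4 within its partition and is filtered out, while ranking restarts at 1 for JAPAN.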
  • 30. #4 – Separation of Storage & Compute. Snowflake’s multi-cluster, shared data architecture • Centralized storage • Instant, automatic scalability & elasticity. [Diagram layers: Service, Compute, Storage]
  • 31. #3 – Support Multiple Workloads. Deliver faster analytics at any scale • Accelerate the data pipeline: Run loading & analytics at any time, concurrently, to get data to users faster • Scale compute to support any workload: Scale processing horsepower up and down on-the-fly, with zero downtime or disruption • Scale concurrency without performance impact: Multi-cluster “virtual warehouse” architecture scales concurrent users & workloads without contention. [Diagram labels: Loading, Marketing, Finance]
  • 32. Instant, unlimited scalability • Elastic scaling for storage: Low-cost cloud storage, fully replicated and resilient • Elastic scaling for compute: Virtual warehouses scale up & down on the fly to support workload needs • Elastic scaling for concurrency: Automatically scale concurrency using multi-cluster virtual warehouses
  • 33. #2 – Secure by Design with Automatic Encryption of Data! • Authentication: Embedded multi-factor authentication; Federated authentication available • Access control: Role-based access control model; Granular privileges on all objects & actions • Data encryption: All data encrypted, always, end-to-end; Encryption keys managed automatically; NEW: Tri-secret security • External validation: Certified against enterprise-class requirements; HIPAA Certified! PCI Certified!
  • 34. #1 – Automatic Query Optimization. Zero Management • Fully managed with no knobs or tuning required • No indexes, distribution keys, partitioning, vacuuming, … • Zero infrastructure costs • Zero admin costs
  • 35. New #1 Fav – Data Sharing (The Data “Sharehouse”). Data Providers and Data Consumers • No data movement: Share with unlimited number of consumers • Live access: Data consumers immediately see all updates • Ready to use: Consumers can immediately start querying
  • 36. Data Sharing – Configuring access. Data provider: create share s1; /* empty share */ grant usage on database sales to share s1; /* add database */ grant usage on schema sales.east to share s1; /* add schema */ grant select on view sales.east.accts to share s1; /* add view */ alter share s1 add accounts=a1, a2, a3; /* add accounts */ Data consumers: create database sales from share p1.s1; [Diagram: provider database and share object s1 (schema list, view list, account list) shared with consumer accounts a1, a2, a3]
  • 37. Enabling Agile DW in the Cloud: Examples
  • 38. Enabling the Agile Data Warehouse • Agile Warehouse Scaling: Separation of Workloads; Virtual WH Scaling Techniques • Agile Data Lifecycle: Cloning
  • 39. Agile Warehouse Scaling: Separation of Workloads
  • 40. Separation of Workloads • DWaaS – Snowflake solution • Assign each workload its own Virtual WH • At a minimum, two WH: ETL and Business • ETL can run continuously if it makes sense for the business • ETL / Business workload contention is eliminated • Further subdivide business workloads into their own clusters as needed • Eliminate contention between Sales, Marketing, Finance, Data Science workloads • International groups can operate clusters on their own local time schedules • Easy to add new workloads on demand! [Diagram: shared Databases served by separate Virtual Warehouses for ETL & Data Loading, Finance, Test/Dev, Marketing, Sales, and Research]
  • 41. Other Agile Warehouse Scaling Techniques • Increase cluster size – on the fly! • More data being analyzed • More complex queries • Get some concurrency boost • No need to start, stop, or reboot! • Automatic scale out! • Multi-cluster warehouse concurrency • Workload queries have usual weight • But more of them, e.g. 20 dashboard users rather than the usual 5 • Automatically scales back after the load drops
  • 42. Agile Warehouse Scaling – Best Practices • Anticipated surges • Explicitly increase WH nodes (T-shirt size) when expecting more data • Explicitly increase MCWH minimum clusters when expecting more queries • Can do both at once with ALTER WAREHOUSE • Use cron or other scheduling/orchestration tool • Unanticipated surges • Rely on MCWH maximum clusters for some extra headroom • Maximize Business Agility! • Responsiveness for users • Throughput and value extracted from variable compute power • Minimize • Cost and administrative overhead
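For the "do both at once with ALTER WAREHOUSE" point, a scheduler job (cron or similar) only needs to issue one statement before the surge and one after it. The helper below is a hypothetical sketch that just builds the SQL text; the warehouse name `bi_wh` and the sizes are assumptions, and sending the statement to Snowflake is left to whatever client you already use.

```python
def alter_warehouse_sql(name, size=None, min_clusters=None, max_clusters=None):
    """Build one ALTER WAREHOUSE statement that changes size and cluster counts."""
    props = []
    if size is not None:
        props.append(f"WAREHOUSE_SIZE = '{size}'")
    if min_clusters is not None:
        props.append(f"MIN_CLUSTER_COUNT = {min_clusters}")
    if max_clusters is not None:
        props.append(f"MAX_CLUSTER_COUNT = {max_clusters}")
    if not props:
        raise ValueError("nothing to change")
    return f"ALTER WAREHOUSE {name} SET " + " ".join(props) + ";"


# Before an anticipated surge: bigger nodes and a higher cluster floor.
scale_up = alter_warehouse_sql("bi_wh", size="LARGE", min_clusters=2, max_clusters=4)

# After the surge: back to the normal footprint, keeping some headroom.
scale_down = alter_warehouse_sql("bi_wh", size="MEDIUM", min_clusters=1, max_clusters=4)

print(scale_up)
print(scale_down)
```

Leaving the maximum cluster count above the minimum in both statements is what provides the "extra headroom" for unanticipated surges mentioned on the slide.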
  • 43. Agile Data Lifecycle
  • 44. Agile Data Lifecycle • Separation of Workloads • Individual virtual warehouse for each dev/test/prod functional area • CLONE for dev/test or Data Science Sandbox – on demand! • Full logical copy of the data, but uses no extra storage • Test/dev operations against clone have no effect on original data • Security • RBAC limits dev/test access to clone and not production data • Secure Views permit role- or user-based obfuscation / masking / projection • Business Impact – better quality code • Dev and test teams are working on data at scale, see true app performance • Full range of values means fewer surprises when app encounters live data
  • 45. Scenario 1 • Create development (DEV) and integration (INT) databases from production (PROD): CREATE OR REPLACE DATABASE DEV CLONE PROD; CREATE OR REPLACE DATABASE INT CLONE PROD; [Diagram: PROD, INT, and DEV each contain a PUBLIC schema with Table A and Table B; INT and DEV are clones of PROD]
  • 46. Scenario 2: new development • Create two new tables, C and D, in the development (DEV) database. [Diagram: PROD and INT contain Table A and Table B; DEV contains Tables A, B, C, and D]
  • 47. Scenario 2: new development • Mini-release: promote table C for integration testing (CREATE TABLE LIKE). [Diagram: Table C is added to INT; PROD contains Table A and Table B; DEV contains Tables A, B, C, and D]
  • 48. Scenario 2: new development • Deploy to production: promote table C to PROD database (CREATE TABLE LIKE). [Diagram: Table C is added to PROD; INT contains Tables A, B, and C; DEV contains Tables A, B, C, and D]
  • 49. Scenario 2: new development • Refresh dev: get latest PROD data into DEV and INT: CREATE OR REPLACE DATABASE DEV2 CLONE PROD; CREATE OR REPLACE TABLE Dev2.TableD CLONE Dev.TableD; [Diagram: DEV2 is cloned from PROD (Tables A, B, and C), then Table D is cloned from DEV into DEV2]
  • 50. Agile Data Lifecycle • CLONE for Data Scientists • Quick and safe sandbox for discovery and testing • Experiment and explore, and write back! • Test hypotheses and keep iterating – drop and reclone as needed • No impact to production data • Combine with own virtual warehouse for complete isolation • Business Impact – better data science • More fine-grained data over longer time intervals • Deeper insights, better forecasting, more monetizable results • CLONE for Compliance • Monthly, quarterly, annual clones – financial reporting, auditing requirements • Business Impact – simpler compliance • Your "backups" are live and immediately available
  • 51. Snowflake & Spark
  • 52. Where Snowflake fits with complementary solutions. Data Sources for Analytics: the vast majority, 80% or more, is structured or semi-structured (Tables, CSV, JSON, Avro, Parquet, XML, flat files, etc.); 20% or less is unstructured (Audio, video, text, etc.). Snowflake: Built for the cloud, relational data warehouse/DB & analytics delivered as a service. Use Cases (Tables, metadata) • Live analytics • Operational DW • Real-time dashboards • Continuous data loading • Change data capture • Traditional DW/BI • Consolidation of data silos • Relational data exploration • Interactive queries • Data science/statistics (Python, R) • And more. Complementary: Special data processing (Spark ML, Hadoop lakes, NoSQL front-ends, In-memory DBs, non-relational exploring, et al.) • Spark machine learning • Unstructured data mining • Streaming analytics • Non-relational data exploration • Non-relational data science • And more
  • 53. Modern Data Architectures. [Diagram labels: Data Production (Web Servers, Data Bases, App Metrics); Data Lake; Snowflake; Data Augmentation & Loading; Data Consumption (Business Intelligence; Data Science: Exploration; Data Science: Machine Learning Model Training); Model-driven Apps (Online Recommendations & Behavioral Targeting on Web Pages; Offline Churn Prediction; Other Machine Learning Apps)]
  • 54. Modern Data Architectures: Data Loading & Augmentation. [Diagram labels: Customers & End Users; Data Production (Web Servers, App Metrics, Data Bases); External Data Augmentation; Kafka; Kinesis; SnowSQL COPY; CDC – FiveTran; Spark Cluster (Transformations and Augmentations in e.g. Python, Scala, SparkSQL, Spark Streaming etc.); Snowflake Spark Connector; Snowflake Warehouse (Transformations and Augmentations using Snowflake SQL); Full/Pruned Data Lake with Augmentation; Source/Raw Data; Augmented Data]
  • 55. Modern Data Architectures: Data Science. [Diagram labels: Jupyter Notebook using e.g. ML (Data Exploration, Model Design); Spark app using Spark ML (Model (Re-)Training); Operationalized Model (Code + Params); External Data Augmentation; Spark Cluster; Snowflake Spark Connector; Snowflake Warehouse (Data Exploration using Snowflake SQL); Data Lake with Augmentation; Source/Raw Data; Augmented Data]
  • 56. Architecture: Data Interchange (Read). [Diagram labels: Spark Cluster (e.g. in AWS EMR) with Master Node and Slave Nodes; Snowflake Warehouse with Cloud Services and Virtual Warehouse; AWS S3; numbered steps 1–3; Query (SQL); Data Flow: Snowflake Tables; Data Flow: Data Frames]
  • 57. Architecture: Data Interchange (Write). [Diagram labels: Spark Cluster (e.g. in AWS EMR) with Master Node and Slave Nodes; Snowflake Warehouse with Cloud Services and Virtual Warehouse; AWS S3; numbered steps 1–3; Control Flow (SQL); Data Flow: Data Frames; Data Flow: Snowflake Tables]
  • 58. Super-Charge Spark Processing with Snowflake • Spark optimizer extension automatically identifies Spark operations with corresponding Snowflake implementations • Spark connector pushes these operations into Snowflake SQL • Pushed operations include: project, filter, join, aggregation, limit • Optimized Snowflake SQL applied to all Spark data frames backed by Snowflake tables • Transparent to Spark language choice: No need to rewrite Spark (Python, Scala, Java, …) or SparkSQL code. Spark code: df1 = spark.load…('T1'); df2 = spark.load…('T2'); df1.join(df2, 'id').groupBy('region').count().collect() Snowflake SQL: SELECT region, count(*) FROM T1 JOIN T2 ON T1.id = T2.id GROUP BY region; [Diagram labels: Spark Cluster, Snowflake Virtual Warehouse, AWS S3, Results]
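The idea of query pushdown can be shown with a deliberately tiny model: record the data frame operations instead of executing them, then emit one SQL statement for the warehouse to run. The `Plan` class below is a hypothetical stand-in written for this illustration; it is not the Snowflake Spark connector or any Spark API, and it handles only the join, group-by, and count shown on the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """Toy logical plan: one table, an optional join, an optional group-by count."""
    left: str
    right: Optional[str] = None
    join_key: Optional[str] = None
    group_key: Optional[str] = None

    def join(self, other, key):
        # Record the join; nothing is executed yet.
        return Plan(self.left, other.left, key, self.group_key)

    def group_by_count(self, key):
        # Record the aggregation; still nothing is executed.
        return Plan(self.left, self.right, self.join_key, key)

    def to_sql(self):
        # Translate the recorded operations into a single SQL statement.
        source = f"FROM {self.left}"
        if self.right is not None:
            source += (f" JOIN {self.right} ON "
                       f"{self.left}.{self.join_key} = {self.right}.{self.join_key}")
        if self.group_key is not None:
            return f"SELECT {self.group_key}, count(*) {source} GROUP BY {self.group_key}"
        return f"SELECT * {source}"


df1, df2 = Plan("T1"), Plan("T2")
sql = df1.join(df2, "id").group_by_count("region").to_sql()
print(sql)
```

Because the join and aggregation travel to the warehouse as SQL, only the small aggregated result has to cross the network back to Spark, which is the benefit the slide describes.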
  • 59. Snowflake + Spark Benefits • Reduced complexity, better manageability • Same data store for curated and raw data • Same data store for data science and BI data sets • Fully transactional data store even for raw data when needed • Performance benefits • Leveraging structure of the data even in semi-structured data formats such as JSON as compared to flat file storage • Metadata for pruning based on queries • Easy to integrate with Spark as compute and application platform • Overall compute capacity and cost savings due to performance benefits with Snowflake • Blog posts: • https://www.snowflake.net/snowflake-and-spark-part-1-why-spark/ • https://www.snowflake.net/snowflake-spark-part-2-pushing-query-processing/
  • 60. NEW: Continuous Data Loading with Snowpipe
  • 61. Data is Being Generated Faster than Ever Before. Data accumulated over time. Data is created constantly. [Timeline labels: Infrequent, Now, Constant; data types: Web data, Corp data, App data] There’s an enormous opportunity to use continuously generated data in analysis.
  • 62. Snowpipe is an automated service that asynchronously listens for new data as it arrives in Amazon Web Services’ S3 cloud storage service and continuously loads that data into Snowflake.
  • 63. Data Ingress Approaches & Snowpipe Support. Approach and definition • Batch: Data accumulates over time (hours, days) and is then loaded periodically • Micro-batch: Data accumulates over small time windows (minutes) and is then loaded • Continuous: Every data item is loaded individually as it arrives. Snowpipe support • Option 1: Point at an S3 bucket and a destination table in your warehouse, at which point new data is automatically uploaded • Option 2: Technical resources can interface directly with the programmatic REST API along with Java and Python SDKs to enable highly customized loading use cases
  • 64. Snowpipe Scenario: Automatic Loading from S3. [Diagram labels: External S3, S3 notification, File data, Snowpipe Service, Server-less Loader, Snowflake Database]
  • 65. Snowpipe – Current Preview. [Diagram labels: S3, Application, REST Call {file names}, Snowpipe Service, REST Endpoint, Server-less Loader, Snowflake Database]
  • 66. Snowpipe Auto-Ingest: Fully Automatic Loading from S3. Use simple Snowflake DDL statements to configure bucket for automatic load with Snowpipe. [Diagram labels: External S3, S3 notification, File data, Snowpipe Service, Server-less Loader, Snowflake Database] See the video here: https://www.youtube.com/embed/rwZWmcy0aBU?autoplay=1&rel=0 Blog posts: https://www.snowflake.net/your-first-steps-with-snowpipe/ https://www.snowflake.net/snowpipe-serverless-loading-for-streaming-data-2/
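As a rough idea of what the "simple DDL" for auto-ingest looks like, the sketch below builds a `CREATE PIPE` statement as text. The pipe, table, and stage names are hypothetical, the stage is assumed to already point at the S3 bucket with event notifications configured, and the syntax follows Snowflake's general `CREATE PIPE ... AUTO_INGEST = TRUE AS COPY INTO ...` form, which may differ from the preview shown in this 2017 deck.

```python
def snowpipe_ddl(pipe, table, stage, file_type="JSON"):
    """Build a CREATE PIPE statement that wraps a COPY INTO from an external stage."""
    return (
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS\n"
        f"  COPY INTO {table}\n"
        f"  FROM @{stage}\n"
        f"  FILE_FORMAT = (TYPE = '{file_type}');"
    )


# Hypothetical names: an events table fed from a stage over the S3 bucket.
ddl = snowpipe_ddl("events_pipe", "raw_events", "events_stage")
print(ddl)
```

The pipe definition is essentially a stored COPY command: once it exists, new files announced by S3 notifications are loaded without anyone rerunning COPY by hand.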
  • 67. Key Snowpipe Benefits • You only pay for the compute time you use to load data • Avoid repeated manual COPY commands • Continuously generated data is available for analysis in seconds • Zero management: No indexing, tuning, partitioning or vacuuming on load • Full support for semi-structured data on load • Server-less: No servers to manage or concurrency to worry about
  • 68. Pricing • Server-less billing model: utilization-based billing • No warehouse to manage for load • Per core, per second granularity • Charged as Snowflake credits • New line item on Snowflake bill • Components of the Snowpipe charges: • Loading work – idle times are not charged • Time spent in metadata management for file loading
  • 69. Snowflake in Action Today
  • 70. What customers are doing with Snowflake • Market research company consolidated data marts to reduce costs and data silos • Gaming company replaced Hadoop + SQL database with Snowflake • Consumer retailer modernizing DW by replacing legacy appliance with Snowflake • Mobile analytics company shares live data with clients. [Diagram labels: DATA SOURCES; STAGING; DATA LAKE; DATA WAREHOUSE; DATA MARTS & EXTRACTS; REPORTING, ANALYTICS & APPLICATIONS]
  • 71. Delivering compelling results • Simpler data pipeline: Replace noSQL database with Snowflake for storing & transforming JSON event data (noSQL database: 8 hours to prepare data; Snowflake: 1.5 minutes) • Faster analytics: Replace on-premises data warehouse with Snowflake for analytics workload (Data warehouse appliance: 20+ hours; Snowflake: 45 minutes) • Significantly lower cost: Improved performance while adding new workloads, at a fraction of the cost (Data warehouse appliance: $5M+ to expand; Snowflake: added 2 new workloads for $50K)
  • 72. Over 1000 customers demonstrating what’s possible • Up to 200x faster reports that enable analysts to make decisions in minutes rather than days • Ability to load and update data in near real time by replacing legacy data warehouse + Hadoop cluster • New applications that provide secure access to analytics to 11,000+ pharmacies
  • 73. Chosen by leading enterprises • Built advertising analytics platform on Snowflake • Moving enterprise reporting to Snowflake • Deploying BI solution stack on Snowflake • Built new embedded analytics application on Snowflake • Moved audience reporting infrastructure to Snowflake • Replaced Hadoop + legacy data warehouse with Snowflake
  • 74. Ranked #1 Cloud Data Warehouse! “Snowflake Hits All the Marks” – Gigaom. Cloud Analytics Database Disruption Vectors. Vendors evaluated: AWS Redshift, Oracle Database Exadata Cloud Service, SAP HANA Cloud Platform, Azure Data Warehouse, Vertica, DashDB (IBM), Teradata, Google Big Query, Snowflake. Scores shown: 4.85, 4.50, 4.45, 3.75, 3.75, 3.35, 3.20, 3.15, 2.60. Weighted criteria: Robustness of SQL 15% • Built-in Optimization 15% • On-the-fly Elasticity 25% • Dynamic Environment Adaption 20% • Separation of Compute from Storage 15% • Support for Diverse Data 10%. “You can tell the data warehouse pedigree from the development… With superior performance and the most hands-off model of ownership, Snowflake is the epitome of data warehouse as a service. The model, cost, features and scalability have already caused some to postpone Hadoop adoption.” William McKnight, Gigaom Disruption Vectors. Gigaom Analyst Report: Sector Roadmap: Cloud Analytic Databases 2017. Read the full report on snowflake.net
  • 75. What does a Cloud-native DWaaS Provide? • Cost effective storage and analysis of GBs, TBs, or even PBs • Lightning fast query performance • Continuous data loading without impacting query performance • Unlimited user concurrency • Full SQL relational support of both structured and semi-structured data • Support for the tools and languages you already use (ODBC, JDBC, Web UI, Java, Scripting)
  • 76. Big Data does not have to equal Big Effort. [Diagram labels: Web, ERP, 3rd party apps, Enterprise apps, IoT, Mobile]
  • 77. Discover the performance, concurrency, and simplicity of Snowflake. As easy as 1-2-3! 01 Visit Snowflake.net • 02 Click “Try for Free” • 03 Sign up & register. Snowflake is the only data warehouse built for the cloud. You can automatically scale compute up, out, or down, independent of storage. Plus, you have the power of a complete SQL database, with zero management, that can grow with you to support all of your data and all of your users. With Snowflake On Demand™, pay only for what you use. Sign up and receive $400 worth of free usage for 30 days!
  • 78. Contact Information • Kent Graziano, Snowflake Computing • [email protected] • On Twitter @KentGraziano • More info at http://snowflake.net • Visit my blog at http://kentgraziano.com
  • 79. YOUR DATA, NO LIMITS. Thank You!