An Overview of Snowflake
 Snowflake is a cloud data warehouse delivered as SaaS, with full support for ANSI SQL and for both structured and semi-structured data (a short example follows the architecture diagram below).
 It enables users to create tables and start querying data with minimal administration.
 It offers both the traditional shared-disk and shared-nothing architectures, to get the best of both.
[Diagram: Shared-Nothing Architecture vs. Shared-Disk Architecture]
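To illustrate the semi-structured support mentioned above, here is a minimal sketch in Snowflake SQL; the table name and JSON payload are hypothetical:
-- The VARIANT type holds semi-structured data such as JSON.
create or replace table raw_events (payload variant);
-- Load a JSON document (an inline literal here for brevity) and query nested fields with path notation.
insert into raw_events
select parse_json('{"user": {"id": 42, "country": "US"}, "action": "login"}');
select payload:user.id::number as user_id,
       payload:action::string  as action
from raw_events;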
 Snowflake provides virtually unlimited storage scalability without refactoring; multiple clusters can read and write shared data, and clusters can be resized instantly with no downtime involved.
 Full ACID transactional consistency across the entire system.
 Logical assets such as servers and buckets are managed centrally.
Snowflake is supported by three different layers:
 Storage, Compute, and Cloud Services
 Snowflake processes queries using an MPP (massively parallel processing) approach: each node stores part of the data locally, while a central data repository holds data that is accessible to all compute nodes.
 The Snowflake architecture consists of three layers:
1. Data Storage
2. Query Processing
3. Cloud Services.
Database Storage Layer:
 Snowflake organizes data into many micro-partitions that are internally optimized and compressed, and stored in a columnar format. Data resides in cloud storage and behaves as a shared-disk model, which keeps data management simple.
 Compute nodes connect to the storage layer to fetch data for querying; the storage layer scales independently of compute. Because Snowflake is provisioned in the cloud, storage is elastic and is billed per TB per month.
Query Processing Layer:
 Snowflake uses virtual warehouses to run queries. A distinctive aspect of Snowflake is that the query processing layer is separated from the disk storage layer.
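As a minimal sketch of that separation (the warehouse name is hypothetical), a virtual warehouse is created, used, and resized independently of the stored data:
-- Create a small warehouse that suspends itself when idle.
create warehouse if not exists etl_wh
  warehouse_size = 'XSMALL'
  auto_suspend = 60
  auto_resume = true;
-- Resize on the fly; running queries finish on the old size, new queries use the larger one.
alter warehouse etl_wh set warehouse_size = 'LARGE';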
Cloud Service Layer:
AllActivities Such as Authentication, Security, Meta Management of loaded data and
Query optimizer that coordinates across this layer.
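Much of what this layer manages is visible through SQL itself; for example, recent query metadata can be inspected with the INFORMATION_SCHEMA.QUERY_HISTORY table function (a sketch, assuming an active session):
select query_text, warehouse_name, total_elapsed_time
from table(information_schema.query_history())
order by start_time desc
limit 10;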
Benefits of the Cloud Services layer:
 Multi-tenant, transactional, and secure
 Runs in the AWS cloud
 Handles millions of queries per day over petabytes of data
 Replicated for availability and scalability
 Focused on ease of use and service experience
 A collection of services such as access control, the query optimizer, and the transaction manager
Loading data from Oracle into Snowflake involves four steps:
1. Extract data from Oracle to CSV using SQL*Plus.
2. Data type conversion and other transformations.
3. Stage the files to S3.
4. Finally, copy the staged files into Snowflake tables.
Step 1: Code:
-- Turn on the spool
spool spoolfile.txt
select * from dba_table;
spool off
Note: The spool file is not available until spooling is turned off.
#!/usr/bin/bash
# Example 1: export the EMP table to a pipe-delimited file.
FILE="emp.csv"
sqlplus -s user_name/password@oracle_db <<EOF
SET PAGESIZE 35000
SET COLSEP "|"
SET LINESIZE 230
SET FEEDBACK OFF
SPOOL $FILE
SELECT * FROM EMP;
SPOOL OFF
EXIT
EOF

#!/usr/bin/bash
# Example 2: export the STUDENTS table to a comma-delimited file.
FILE="students.csv"
sqlplus -s scott/tiger@XE <<EOF
SET PAGESIZE 50000
SET COLSEP ","
SET LINESIZE 200
SET FEEDBACK OFF
SPOOL $FILE
SELECT * FROM STUDENTS;
SPOOL OFF
EXIT
EOF
Step 1.2:
 For an incremental load, we need to generate SQL with a condition that selects only the records that were modified after the last data pull.
Query (pseudo-SQL): select * from students where last_modified_time > last_pull_time and last_modified_time <= sys_time
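A concrete Oracle form of that filter might look like the following; the bookmark timestamp is a hypothetical value that would normally come from the pipeline's own metadata:
select *
from students
where last_modified_time > to_timestamp('2019-11-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
  and last_modified_time <= systimestamp;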
Step 2: Below are recommendations for data type conversion when moving from Oracle to Snowflake.
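The conversion table from the original slide is not reproduced here; as an illustrative (not exhaustive) sketch, commonly used mappings are NUMBER → NUMBER, VARCHAR2/CHAR → VARCHAR, DATE → DATE or TIMESTAMP_NTZ, TIMESTAMP → TIMESTAMP_NTZ, CLOB → VARCHAR, and BLOB → BINARY. For example, a hypothetical STUDENTS table could be created in Snowflake as:
create or replace table students (
  id            number(10,0),   -- Oracle NUMBER(10)    -> NUMBER(10,0)
  name          varchar(100),   -- Oracle VARCHAR2(100) -> VARCHAR(100)
  enrolled_on   date,           -- Oracle DATE          -> DATE (use TIMESTAMP_NTZ if the time portion matters)
  last_modified timestamp_ntz   -- Oracle TIMESTAMP     -> TIMESTAMP_NTZ
);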
Step 3:
 To load data into Snowflake, the data first needs to be uploaded to an S3 location (Step 1 covers extracting Oracle data to flat files).
 We need a Snowflake instance that runs on AWS, and that instance needs the ability to access the S3 files.
 This access can be either internal or external, and the process is called staging.
Create an internal stage:
create or replace stage my_oracle_stage
copy_options = (on_error='skip_file')
file_format = (type = 'CSV' field_delimiter = ',' skip_header = 1);
Use the PUT command below to stage files to the internal Snowflake stage:
PUT file://path_to_your_file/your_filename @internal_stage_name
Example: upload the file items_data.csv from the /tmp/oracle_data/data/ directory to an internal stage named oracle_stage:
put file:///tmp/oracle_data/data/items_data.csv @oracle_stage;
Ref: https://docs.snowflake.net/manuals/sql-reference/sql/put.html
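After the PUT, the staged file can be verified with the LIST command (using the stage name from the example above):
list @oracle_stage;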
Step 3 (external staging options):
 Snowflake supports any accessible Amazon S3 or Microsoft Azure location as an external staging area. You can create a stage pointing to that location and load data directly into the Snowflake table through the stage; there is no need to move the data to an internal stage first.
 To create an external stage pointing to an S3 location, IAM credentials with the proper access permissions are required.
If the data needs to be decrypted before loading into Snowflake, the proper keys must be provided.
create or replace stage oracle_ext_stage
url='s3://snowflake_oracle/data/load/files/'
credentials=(aws_key_id='1d318jnsonmb5#dgd4rrb3c'
aws_secret_key='aii998nnrcd4kx5y6z')
encryption=(master_key='eSxX0jzskjl22bNaaaDuOaO8=');
Once data is extracted from Oracle, it can be uploaded to S3 using the direct upload option or using the AWS SDK in your favourite programming language; Python's boto3 is a popular choice for this. Once the data is in S3, an external stage can be created to point to that location.
Step 4: Copy staged files to the Snowflake table
 We have extracted data from Oracle, uploaded it to an S3 location, and created an external Snowflake stage pointing to that location. The next step is to copy the data into the table using the COPY INTO command. Note: executing COPY INTO requires compute resources in a Snowflake virtual warehouse, so Snowflake credits will be consumed.
• To load from a named internal stage:
copy into oracle_table
from @oracle_stage;
• To load from the external stage (only one file is specified):
copy into my_ext_stage_table
from @oracle_ext_stage/tutorials/dataloading/items_ext.csv;
• To copy directly from an external location without creating a stage:
copy into oracle_table
from 's3://mybucket/oracle_snow/data/files'
credentials=(aws_key_id='$AWS_ACCESS_KEY_ID'
aws_secret_key='$AWS_SECRET_ACCESS_KEY')
encryption=(master_key='eSxX009jhh76jkIuLPH5r4BD09wOaO8=')
file_format = (format_name = csv_format);
Files can also be specified using patterns:
copy into oracle_pattern_table
from @oracle_stage
file_format = (type = 'TSV')
pattern='.*/.*/.*[.]csv[.]gz';
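Before running a full load, the copy can optionally be dry-run to surface parsing errors without loading any rows; a sketch using the external stage from above:
copy into oracle_table
from @oracle_ext_stage
file_format = (format_name = csv_format)
validation_mode = 'RETURN_ERRORS';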
Step 4 (continued): Update the Snowflake table
The basic idea is to load the incrementally extracted data into an intermediate or temporary table, then modify the records in the final table using the data in that intermediate table (a sketch of creating such a landing table follows). The three methods below are generally used for this.
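A minimal sketch of setting up such a landing table, reusing the hypothetical table and stage names from the examples on this page; the incremental extract is assumed to already be staged:
-- The temporary table lives only for the session that loads and applies the delta.
create temporary table landing_delta_table like oracle_target_table;
copy into landing_delta_table
from @oracle_stage
file_format = (type = 'CSV' field_delimiter = ',' skip_header = 1);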
1. Update the rows in the target table with the new data (same keys), then insert the rows from the intermediate or landing table that are not yet in the final table:
UPDATE oracle_target_table t
SET value = s.value
FROM landing_delta_table s
WHERE t.id = s.id;
INSERT INTO oracle_target_table (id, value)
SELECT id, value
FROM landing_delta_table
WHERE id NOT IN (SELECT id FROM oracle_target_table);
2. Delete the rows from the target table that are also present in the landing table, then insert all rows from the landing table into the final table. The final table will then have the latest data without duplicates:
DELETE FROM oracle_target_table WHERE id IN (SELECT id FROM landing_table);
INSERT INTO oracle_target_table (id, value)
SELECT id, value FROM landing_table;
3. MERGE statement – a standard SQL statement that combines inserts and updates, applying the changes in the landing table to the target table in one statement:
MERGE INTO oracle_target_table t1
USING landing_delta_table t2
ON t1.id = t2.id
WHEN MATCHED THEN UPDATE SET value = t2.value
WHEN NOT MATCHED THEN INSERT (id, value) VALUES (t2.id, t2.value);
This approach works when you have a comfortable project timeline and a pool of experienced engineers who can build and maintain the pipeline. However, it comes with significant coding and maintenance overhead.
Ref: https://hevodata.com/blog/oracle-to-snowflake-etl/
Q&A
https://www.analytics.today/blog/top-10-reasons-snowflake-rocks
https://www.g2.com/reports/grid-report-for-data-warehouse-fall-2019?featured=snowflake