Snowflake Architecture

The document discusses Snowflake's multi cluster shared data architecture. It has three main layers - the cloud services layer, the virtual warehouse layer, and the data storage layer. The architecture addresses issues with other approaches like shared disk and shared nothing architectures, and allows for heterogeneous workloads and hardware with membership changes and software upgrades. It utilizes caching to improve performance.

Uploaded by

vrjs27 v
Copyright
© All Rights Reserved

Snowflake fundamentals

© 2018 Slidefabric.com All rights reserved.


Snowflake architecture
Multi cluster shared data architecture
• Cloud services layer
• Virtual warehouse layer
• Data storage layer



Shared disk architecture
• Scalability is limited.
• Data consistency is hard to maintain across the cluster.
• Communication with the shared disk becomes a bottleneck.



Shared nothing architecture
• It scales storage and compute together.
• It moves data storage close to compute.



Shared nothing architecture
• Data distributed across the cluster requires shuffling between nodes.
• Performance is heavily dependent on how data is distributed across the nodes in the system.
• Compute can't be sized independently of storage.
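The dependence on data distribution can be illustrated with a minimal Python sketch. It assumes a hypothetical placement rule (a stable hash of a distribution key, modulo the node count); the function names and the key format are invented for illustration and are not taken from the slides or from any particular database.

```python
import zlib
from collections import Counter


def assign_node(key: str, num_nodes: int) -> int:
    """Hypothetical placement rule: stable hash of the key modulo node count."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def rows_per_node(keys, num_nodes):
    """Count how many rows land on each node for a list of distribution keys."""
    counts = Counter(assign_node(k, num_nodes) for k in keys)
    return [counts.get(n, 0) for n in range(num_nodes)]


def skew(loads):
    """Ratio of the busiest node to the mean load; 1.0 means perfectly even."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


# Many distinct keys: rows can spread across the nodes.
even_keys = [f"customer-{i}" for i in range(10_000)]

# One dominant key: 9,000 of 10,000 rows must land on the same node.
skewed_keys = ["customer-0"] * 9_000 + [f"customer-{i}" for i in range(1_000)]

even_loads = rows_per_node(even_keys, 4)
skewed_loads = rows_per_node(skewed_keys, 4)
print("even:", even_loads, "skew:", round(skew(even_loads), 2))
print("skewed:", skewed_loads, "skew:", round(skew(skewed_loads), 2))
```

In the skewed case one node holds at least 90% of the rows, so a scan that runs in parallel finishes only when that node does, regardless of how idle the other nodes are.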



Heterogeneous workload: bulk loading
Low compute, high I/O
Requires high I/O bandwidth and light compute.



Heterogeneous workload: data processing
Heavy compute, low I/O
Requires low I/O bandwidth and heavy compute.



Membership changes
[Figure: nodes labeled 20 GB RAM and 40 GB RAM, with 2 TB and 5 TB of storage]
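Why membership changes are costly when each node owns its slice of the data can be sketched in Python. The sketch assumes rows are placed by a hash of a key modulo the node count, which is an illustrative assumption rather than something the slides specify; `assign_node` and `moved_fraction` are hypothetical names.

```python
import zlib


def assign_node(key: str, num_nodes: int) -> int:
    """Hypothetical placement rule: stable hash of the key modulo node count."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose owning node changes when the cluster is resized."""
    moved = sum(
        1 for k in keys if assign_node(k, old_nodes) != assign_node(k, new_nodes)
    )
    return moved / len(keys)


keys = [f"row-{i}" for i in range(10_000)]
print("4 -> 4 nodes:", moved_fraction(keys, 4, 4))
print("4 -> 5 nodes:", moved_fraction(keys, 4, 5))
```

Under this placement rule and a uniform hash, a key stays put when going from 4 to 5 nodes only if its hash gives the same remainder for both counts, which happens for about 1 in 5 keys. Adding a single node would therefore be expected to move roughly 80% of the data between nodes.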



Upgrades



Shared nothing architecture
• Heterogeneous workloads on homogeneous hardware.
• Membership changes.
• Problems with software upgrades.



Multi cluster shared data architecture
• Cloud services layer
• Virtual warehouse layer
• Data storage layer
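The separation between the virtual warehouse layer and the data storage layer can be pictured with a toy Python model. Everything here is a hypothetical illustration: the class names, the `resize` method, and the sample table are assumptions made for the sketch, not Snowflake's API or internals.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorage:
    """Toy stand-in for the data storage layer: a single copy of the data."""
    tables: dict = field(default_factory=dict)


@dataclass
class VirtualWarehouse:
    """Toy stand-in for a compute cluster; it holds no table data of its own."""
    name: str
    nodes: int
    storage: SharedStorage

    def row_count(self, table: str) -> int:
        # Every warehouse reads from the same shared storage.
        return len(self.storage.tables[table])

    def resize(self, nodes: int) -> None:
        # Only compute changes; nothing in storage is moved or rewritten.
        self.nodes = nodes


storage = SharedStorage({"orders": [("o1", 10), ("o2", 25), ("o3", 7)]})
loader = VirtualWarehouse("loading", nodes=2, storage=storage)
analytics = VirtualWarehouse("analytics", nodes=8, storage=storage)

analytics.resize(16)
print(loader.row_count("orders"), analytics.row_count("orders"), analytics.nodes)
```

In this model, two differently sized clusters serve different workloads over the same data, and resizing one of them touches neither the data nor the other cluster.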



Snowflake architecture
Cost
Impact



ARCHITECTURE DEMO CACHING
• By the end of this section you will understand how data processing happens under the hood.
• You will understand how the Snowflake architecture layers interact with each other.
• You will understand how caching works in Snowflake.



Architecture



Caching in Snowflake
