5. Micro-Partitions and Clustering
AWS - S3
Azure - Azure Blob Storage
GCP - Google Cloud Storage
Data Partitioning
Partitioning is the process of breaking a table into smaller, more
manageable parts based on a common criterion, for example a date,
a geographic region, or a product category.
Each partition is treated as a separate table and can be queried
independently, allowing faster and more efficient data retrieval. Partitioning
can also help lower storage costs by placing less frequently used data in
cheaper storage.
Example:
Consider a sales database containing millions of records, organized into year and month
partitions so that data from a specific month or year can be accessed quickly.
Partitioning the data this way lets requests be processed more efficiently, because a request
for a given month or year reads only the matching partitions instead of the whole table.
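A minimal Python sketch of this idea, assuming a hypothetical in-memory list of sales records. The names `sales` and `partition_by_month` and all sample values are illustrative only. Records are grouped by (year, month), so a request for one month reads a single partition rather than every record.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (sale_date, amount). Invented sample data.
sales = [
    (date(2023, 1, 15), 120.0),
    (date(2023, 1, 28), 80.0),
    (date(2023, 2, 3), 200.0),
    (date(2024, 1, 9), 50.0),
]

def partition_by_month(records):
    """Group records into partitions keyed by (year, month)."""
    partitions = defaultdict(list)
    for sale_date, amount in records:
        partitions[(sale_date.year, sale_date.month)].append((sale_date, amount))
    return dict(partitions)

partitions = partition_by_month(sales)

# A request for January 2023 touches one partition, not all records.
january_2023 = partitions.get((2023, 1), [])
total_january_2023 = sum(amount for _, amount in january_2023)
print(len(partitions), "partitions;", len(january_2023), "rows read")
```

The partition key here is fixed up front, which is what traditional static partitioning looks like.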
Only micro-partitions 3 and 4 match this criterion, so query pruning reduces the data to be
scanned to just these two micro-partitions.
Because only the [type] and [country] fields are required in the query output, any parts of those
micro-partitions that do not contain data for these columns are also pruned.
That is, when the micro-partitions themselves are queried, only the required columns are read.
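The two levels of pruning can be sketched in Python. This is an illustration under stated assumptions, not Snowflake's implementation: the `MicroPartition` class, the `date` filter column, and every sample value are hypothetical, and only the `type` and `country` column names come from the text. Partitions whose min/max range cannot contain the filter value are skipped, and only the needed columns are read from the rest.

```python
from dataclasses import dataclass

@dataclass
class MicroPartition:
    """Hypothetical columnar micro-partition: one list of values per column."""
    columns: dict

    def min_max(self, column):
        # Stand-in for stored per-column metadata; a real system would keep
        # these values alongside the partition instead of recomputing them.
        values = self.columns[column]
        return min(values), max(values)

def query(partitions, filter_column, filter_value, output_columns):
    """Return (matching rows, ids of the partitions actually scanned)."""
    rows, scanned = [], []
    for part_id, part in partitions.items():
        low, high = part.min_max(filter_column)
        if not (low <= filter_value <= high):
            continue  # partition pruning: the value cannot be in this range
        scanned.append(part_id)
        # Column pruning: only the filter and output columns are touched.
        for i, value in enumerate(part.columns[filter_column]):
            if value == filter_value:
                rows.append(tuple(part.columns[c][i] for c in output_columns))
    return rows, scanned

# Invented sample data, arranged so that partitions 3 and 4 match the filter.
partitions = {
    1: MicroPartition({"type": [2, 4], "country": ["UK", "FR"],
                       "date": ["2023-11-01", "2023-11-01"]}),
    2: MicroPartition({"type": [3, 2], "country": ["DE", "SP"],
                       "date": ["2023-11-01", "2023-11-02"]}),
    3: MicroPartition({"type": [1, 5], "country": ["NL", "UK"],
                       "date": ["2023-11-02", "2023-11-03"]}),
    4: MicroPartition({"type": [4, 2], "country": ["FR", "DE"],
                       "date": ["2023-11-03", "2023-11-04"]}),
}

rows, scanned = query(partitions, "date", "2023-11-03", ["type", "country"])
print(scanned)  # [3, 4]
print(rows)     # [(5, 'UK'), (4, 'FR')]
```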
Benefits of Micro-Partitioning
• In contrast to traditional static partitioning, Snowflake micro-partitions are derived
automatically; they don’t need to be explicitly defined up-front or maintained by
users.
• Micro-partitions are small (50 to 500 MB of uncompressed data), which enables
extremely efficient DML and fine-grained pruning for faster queries.
• Columns are stored independently within micro-partitions, an arrangement often
referred to as columnar storage.
• This enables efficient scanning of individual columns; only the columns
referenced by a query are scanned.
• Columns are also compressed individually within micro-partitions, which
optimizes storage cost.
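To make the columnar layout concrete, here is a small Python sketch, assuming a hypothetical micro-partition represented as a dict of column names to values. Each column is serialized and compressed on its own with `zlib`, which is only a stand-in, since the text does not say which compression scheme Snowflake uses. A scan that references one column decompresses only that column.

```python
import json
import zlib

def compress_columns(columns):
    """Compress each column independently; returns {name: compressed bytes}."""
    return {
        name: zlib.compress(json.dumps(values).encode("utf-8"))
        for name, values in columns.items()
    }

def read_column(stored, name):
    """Decompress and decode a single column, leaving the others untouched."""
    return json.loads(zlib.decompress(stored[name]).decode("utf-8"))

# Invented sample data for one hypothetical micro-partition (600 rows).
columns = {
    "type": [1, 1, 2, 2, 2, 3] * 100,
    "country": ["UK", "UK", "FR", "FR", "DE", "DE"] * 100,
    "name": [f"customer-{i}" for i in range(600)],
}
stored = compress_columns(columns)

# Only the "country" column is decompressed for this scan.
countries = read_column(stored, "country")
uk_rows = sum(1 for c in countries if c == "UK")
print(uk_rows)  # 200
```

Columns with many repeated values, such as `country` here, tend to compress well when stored together, which is one reason per-column compression can reduce storage cost.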
Snowflake recommends:
• Define clustering keys on large tables, not on small tables.
• Don’t define a clustering key on more than 4 columns.
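As a rough illustration of why a clustering key can help pruning, the Python sketch below compares the same values stored in arrival order and stored sorted by key. It is a toy model under stated assumptions, not Snowflake's clustering algorithm: the partition size of 10 rows, the helper names, and the data are all hypothetical.

```python
def chunk(values, size):
    """Split values into fixed-size partitions, in the order given."""
    return [values[i:i + size] for i in range(0, len(values), size)]

def partitions_scanned(partitions, target):
    """Count partitions whose [min, max] range could contain the target."""
    return sum(1 for p in partitions if min(p) <= target <= max(p))

# Invented data: the values 0..99 in a scrambled arrival order.
arrival_order = [(i * 37) % 100 for i in range(100)]

unclustered = chunk(arrival_order, 10)        # rows stored as they arrive
clustered = chunk(sorted(arrival_order), 10)  # rows co-located by key value

unclustered_scanned = partitions_scanned(unclustered, 42)
clustered_scanned = partitions_scanned(clustered, 42)
print(unclustered_scanned, "partitions scanned unclustered vs",
      clustered_scanned, "clustered")
```

When rows are co-located by key, the min/max ranges of the partitions stop overlapping and a lookup scans far fewer of them. A small table has few partitions to prune in the first place, which is consistent with reserving clustering keys for large tables.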