technical, snowflake,

Snowpro Core preparation part 2 - storage and protection

Snowpro Core preparation part 2 -  storage and protection

I am currently preparing for the SnowPro Core certification which is the standard technical Snowflake certifications. More specialized certifications can be gained after this one is achieved. More details can be found on the Snowflake website.

Other articles in this series: Part 1 - Warehouses Part 2 - Storage Part 3 - Account and security Part 4 - Data movement Part 5 - Overview and architecture Part 6 - Performance tuning Part 7 - Semi-structured data

So, let’s jump right into the second topic, which is storage and protection:

Storage

  • Automatic micro-partitioning of table data​
  • Partitioned based on ingestion order, can be changed by defining a clustering key​
  • Each micro partition is optimally compressed based on the data type​
  • Per micro partition: max 16 MB compressed, 50-500 MB uncompressed​
  • Immutable –> update writes new version; services layer knows which version of a tables consists of which versions of the partitions –> time travel!​
  • Partition metadata is used for optimizing queries by not scanning certain partitions if there are filtering predicates

Data Protection:

  • All communication and all data at rest encrypted end-to-end​
  • Each micro partition has its own encryption key​
  • All data replicated across availability zones

Clustering:

Benefits:​

  • Improve scan efficiency in queries by skipping data that does not match the filter predicates (just certain micro partitions are read).​
  • Better compression per column​
  • After a key is defined the only modification to administer is drop or modify the key​

Considerations:​

  • Use on tables with a large number of micro partitions (Typically multiple terabytes).​
  • Query advantages (one or both):​
    • The queries are selective (only read the necessary micro partitions).​
    • The queries sort the data (ORDER BY clause)​
  • High number of queries can benefit if they use the same few columns (or at least the key columns)

Table and view types are well described in these images from Snowflake:

Snowflake table types
The different types of tables in Snowflake
Snowflake view types
The different types of views in Snowflake