data-platform-on-aws-and-snowflake-ra
Modern Data Platform using AWS and Snowflake

This architecture enables customers to build end-to-end modern data analytics platforms using AWS and Snowflake.

1. [...] software as a service (SaaS) applications, edge devices, logs, streaming data, and social media networks.
2. Based on the type of data source, AWS Database Migration Service, AWS DataSync, Amazon Kinesis, Amazon Managed Streaming for Apache Kafka, AWS IoT Core, AWS Glue, and Amazon AppFlow are used to ingest the data into the data lake in AWS.
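
The choice of ingestion service by source type in step 2 can be sketched as a simple lookup. This is only an illustration: the category keys (`relational`, `streaming`, `batch`, `saas`) and the grouping of services under them are assumptions read off the diagram layout, and `ingestion_services_for` is a hypothetical helper, not part of any AWS SDK.

```python
from __future__ import annotations

# Assumed grouping, inferred from the diagram: the step text itself only
# lists the services and says the choice depends on the data source type.
INGESTION_SERVICES: dict[str, list[str]] = {
    "relational": ["AWS Database Migration Service"],
    "streaming": [
        "Amazon Kinesis",
        "Amazon Managed Streaming for Apache Kafka",
        "AWS IoT Core",
    ],
    "batch": ["AWS DataSync", "AWS Glue"],
    "saas": ["Amazon AppFlow"],
}


def ingestion_services_for(source_type: str) -> list[str]:
    """Return candidate ingestion services for a source category (hypothetical helper)."""
    key = source_type.strip().lower()
    if key not in INGESTION_SERVICES:
        raise ValueError(f"unknown source type: {source_type!r}")
    # Return a copy so callers cannot mutate the lookup table.
    return list(INGESTION_SERVICES[key])
```

For example, `ingestion_services_for("SaaS")` returns the single entry for Amazon AppFlow, while an unrecognized category raises `ValueError` instead of silently returning nothing.
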
3. Amazon S3 is used for fully managed, highly available, and scalable data lake storage.
4. AWS Glue is used to extract, transform, and ingest data across multiple data stores. Amazon EMR provides the cloud big data platform for processing vast amounts of data using open-source analytics frameworks. AWS Lambda and Amazon EC2 provide compute capability for data enrichment needs.
5. Amazon Managed Workflows for Apache Airflow (MWAA) or AWS Step Functions is used for orchestrating end-to-end data pipelines.
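
As a rough illustration of the Step Functions option in step 5, the sketch below builds an Amazon States Language definition that runs a Glue job and then a Lambda function. The state names, the two-step shape of the pipeline, and the `build_pipeline_definition` helper are assumptions made for this example; the source does not prescribe a particular workflow. The code only produces the JSON document and does not call AWS.

```python
import json


def build_pipeline_definition(glue_job_name: str, enrich_function_name: str) -> str:
    """Return an Amazon States Language definition as JSON (illustrative only)."""
    definition = {
        "Comment": "Sketch: transform with AWS Glue, then enrich with AWS Lambda",
        "StartAt": "TransformWithGlue",
        "States": {
            "TransformWithGlue": {
                "Type": "Task",
                # Service integration that waits for the Glue job run to finish.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": glue_job_name},
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 30,
                        "MaxAttempts": 2,
                        "BackoffRate": 2.0,
                    }
                ],
                "Next": "EnrichWithLambda",
            },
            "EnrichWithLambda": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                # Pass the whole state input to the function as its payload.
                "Parameters": {"FunctionName": enrich_function_name, "Payload.$": "$"},
                "End": True,
            },
        },
    }
    return json.dumps(definition, indent=2)
```

The returned string could be supplied as the definition when creating a state machine; the job and function names passed in are placeholders for resources that would have to exist in the account.
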
6. AWS Lake Formation makes it easy to build, secure, and manage your data lake, providing a single place to enforce data classification and manage fine-grained access. AWS IAM and AWS STS provide the ability to manage access permissions and temporary credentials.
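
To make the access-permission part of step 6 more concrete, here is a minimal sketch of a read-only IAM policy document scoped to one prefix of a data lake bucket. The bucket name, prefix, statement IDs, and the `build_read_only_policy` helper are hypothetical; the source does not specify any policy. Such a policy would typically be attached to an IAM role whose temporary credentials are obtained through AWS STS.

```python
import json


def build_read_only_policy(bucket: str, prefix: str) -> str:
    """Return an IAM policy JSON granting read access to one S3 prefix (sketch)."""
    prefix = prefix.strip("/")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                # ListBucket applies to the bucket ARN, narrowed by prefix.
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                # Object-level actions apply to object ARNs under the prefix.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }
    return json.dumps(policy, indent=2)
```

Keeping the list permission and the object permissions in separate statements reflects that `s3:ListBucket` is evaluated against the bucket while `s3:GetObject` is evaluated against individual objects.
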
7. Snowflake is used as a virtual data warehouse with the ability to query Amazon S3 using external tables, and for automated and continuous data ingestion using Snowpipe.
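
The two Snowflake capabilities named in step 7 can be sketched as the SQL statements one might issue: a stage over the S3 location, an external table on that stage, and a pipe for continuous loading. All object names are hypothetical, the Parquet file format is an assumption, and the target table is assumed to have a single VARIANT column; the helper below only builds the statement text and does not connect to Snowflake.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$.]*$")


def snowflake_statements(stage, s3_url, integration, external_table, pipe, target_table):
    """Return SQL text for a stage, an external table, and a pipe (sketch)."""
    # Names are interpolated directly, so accept only plain identifiers.
    for name in (stage, integration, external_table, pipe, target_table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    if not s3_url.startswith("s3://") or "'" in s3_url:
        raise ValueError(f"unexpected S3 URL: {s3_url!r}")

    file_format = "FILE_FORMAT = (TYPE = PARQUET)"  # assumed file format
    return [
        # Stage pointing at the data lake location through a storage integration.
        f"CREATE STAGE {stage} URL = '{s3_url}' "
        f"STORAGE_INTEGRATION = {integration} {file_format}",
        # External table: query files in S3 without loading them.
        f"CREATE EXTERNAL TABLE {external_table} WITH LOCATION = @{stage} "
        f"AUTO_REFRESH = TRUE {file_format}",
        # Pipe: load new files into the target table as they arrive.
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS "
        f"COPY INTO {target_table} FROM @{stage} {file_format}",
    ]
```

Both `AUTO_REFRESH` and `AUTO_INGEST` rely on S3 event notifications being configured for the bucket, which is outside the scope of this sketch.
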
8. Amazon SageMaker can be used to build, train, and deploy machine learning (ML) models, and add intelligence to your applications. Amazon QuickSight provides ML-powered business intelligence (BI).

[Architecture diagram. Stages, left to right: Data sources, Data ingestion, Data storage, transform and govern, Data serving, Data consumption. The data lake on Amazon S3 holds raw, conformed, and modelled data; Snowflake queries the data lake without loading data using external tables, and ingests data automatically with Snowpipe.]

Reviewed for technical accuracy March 11, 2022
© 2022, Amazon Web Services, Inc. or its affiliates. All rights reserved. AWS Reference Architecture