
Architecture for Data Ingestion, Cleaning, Processing, and Visualization (younesse)

The document discusses the ingestion, processing, analysis, and visualization of IoT and other data sources using various AWS services. IoT sensor data will be ingested using Kinesis Data Streams, historical database records will be migrated to RDS using DMS with CDC, and third-party data will be fetched and processed using AWS Glue. Glue will also be used for data cleansing and structuring. EMR will perform complex analysis and QuickSight will enable interactive dashboards. CloudWatch will monitor the system and Lambda will automate processes.

Uploaded by

Yøű Ñęş
Copyright
© All Rights Reserved

Data Ingestion:

IoT Sensors Data:


I'll begin by capturing real-time data from our IoT sensors. To achieve this, I'm using Amazon Kinesis
Data Streams, which ingests streaming data continuously as the sensors generate it.
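As a rough sketch of what a producer for this stream could look like, the snippet below turns sensor readings into Kinesis records and sends them in batches. The stream name "iot-sensor-stream" and the "sensor_id" field are placeholders I've assumed for illustration, not values from our setup; the boto3 call used is put_records, which accepts up to 500 records per request.

```python
import json
from typing import Any, Dict, Iterator, List

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_kinesis_record(reading: Dict[str, Any]) -> Dict[str, Any]:
    """Turn one sensor reading into a PutRecords entry."""
    return {
        "Data": json.dumps(reading, sort_keys=True).encode("utf-8"),
        # Same sensor -> same partition key -> same shard, so per-sensor order is kept.
        # "sensor_id" is an assumed field name.
        "PartitionKey": str(reading["sensor_id"]),
    }


def batched(records: List[Dict[str, Any]],
            size: int = MAX_RECORDS_PER_CALL) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def send_readings(readings: List[Dict[str, Any]],
                  stream_name: str = "iot-sensor-stream") -> int:
    """Send readings to the stream; return how many records Kinesis rejected."""
    import boto3  # imported here so the helpers above work without the AWS SDK

    client = boto3.client("kinesis")
    failed = 0
    for batch in batched([to_kinesis_record(r) for r in readings]):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response["FailedRecordCount"]
    return failed
```

Using the sensor ID as the partition key is a design choice, not a requirement: it keeps each sensor's readings in order, at the cost of uneven shard load if a few sensors are much chattier than the rest.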

Historical Database Records:


Moving on to historical data, I've chosen to employ the AWS Database Migration Service (DMS). It
replicates data from our existing database to an Amazon RDS instance. With Change Data Capture (CDC)
enabled, the RDS copy stays in sync with ongoing changes to the source.
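To make the CDC setup a little more concrete, here is a minimal sketch of the parameters for a DMS replication task that does a full load and then keeps applying changes. The task identifier and the "include every table" selection rule are my own assumptions, and the endpoint and replication instance ARNs must come from resources created beforehand.

```python
import json
from typing import Any, Dict


def table_mappings(schema: str = "%", table: str = "%") -> str:
    """Build a DMS selection rule as a JSON string ("%" is the DMS wildcard)."""
    rules = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-tables",
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(rules)


def replication_task_params(source_arn: str, target_arn: str, instance_arn: str,
                            task_id: str = "historical-to-rds") -> Dict[str, Any]:
    """Assemble keyword arguments for create_replication_task."""
    return {
        "ReplicationTaskIdentifier": task_id,  # hypothetical name
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Copy the existing rows first, then keep replicating ongoing changes.
        "MigrationType": "full-load-and-cdc",
        "TableMappings": table_mappings(),
    }


def create_task(params: Dict[str, Any]) -> Dict[str, Any]:
    """Create the task in DMS (requires AWS credentials)."""
    import boto3  # imported lazily so the builders above run without the SDK

    return boto3.client("dms").create_replication_task(**params)
```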

Third-Party Data:
When it comes to supplementing our internally generated data, I'll be using AWS Glue, a serverless ETL
service whose Spark-based jobs work much like our familiar Hadoop-based tools. Glue fetches data from
various sources, performs transformations, and stores the processed data in our S3 storage.

Data Processing and Transformation:


Data cleanliness and structure are paramount. For this, I'll continue to use AWS Glue. It not only
transforms and cleanses data but also ensures it's stored in a logical manner in S3.
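To illustrate the kind of cleansing and layout I have in mind, here is a small plain-Python stand-in for the logic; a real Glue job would express the same steps in PySpark. The required fields, the deduplication key, and the "cleaned/source=.../year=.../" prefix scheme are all assumptions for the sake of the example.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List

REQUIRED_FIELDS = ("sensor_id", "timestamp", "value")  # hypothetical schema


def clean(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop incomplete rows, normalise IDs, and remove duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        # Skip rows where any required field is missing or empty.
        if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        sensor_id = str(record["sensor_id"]).strip()
        key = (sensor_id, record["timestamp"])
        if key in seen:  # same sensor and timestamp already kept
            continue
        seen.add(key)
        cleaned.append({**record, "sensor_id": sensor_id})
    return cleaned


def partition_prefix(source: str, epoch_seconds: float) -> str:
    """Build a Hive-style, date-partitioned S3 key prefix (UTC)."""
    day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"cleaned/source={source}/year={day:%Y}/month={day:%m}/day={day:%d}/"
```

Partitioning by source and date is one common way to keep S3 data organised, since query engines can then skip whole prefixes when filtering on those columns.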

Data Analysis and Visualization:

Data Analysis:
To perform complex analysis, I've opted for Amazon EMR (Elastic MapReduce). This aligns with my current
approach by providing a managed Hadoop framework. I can run Apache Spark and Hive jobs here, just as
I do with our existing setup.
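For example, submitting a Spark job to a running EMR cluster could look roughly like the sketch below. The step name, script location, and cluster ID are placeholders I've made up; the step runs spark-submit through EMR's command-runner.jar.

```python
from typing import Any, Dict


def spark_step(name: str, script_s3_uri: str, *script_args: str) -> Dict[str, Any]:
    """Describe an EMR step that runs a PySpark script via spark-submit."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",  # keep the cluster running if the step fails
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster",
                     script_s3_uri, *script_args],
        },
    }


def submit(cluster_id: str, step: Dict[str, Any]) -> str:
    """Add the step to an existing cluster and return its step ID."""
    import boto3  # imported lazily so spark_step can be used without the SDK

    response = boto3.client("emr").add_job_flow_steps(
        JobFlowId=cluster_id, Steps=[step]
    )
    return response["StepIds"][0]
```

A call such as submit("j-EXAMPLE", spark_step("daily-aggregation", "s3://my-bucket/jobs/aggregate.py")) would queue the job; both the cluster ID and the S3 path here are hypothetical.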

Dashboards and Visualization:


For the exciting part, visualization, I've integrated Amazon QuickSight. This cloud-native business
intelligence tool connects directly to our S3 data and to EMR for analysis, and it lets me build
interactive dashboards showcasing our insights.

Automation and Monitoring:

To ensure smooth operations and management:


• I've employed Amazon CloudWatch to keep tabs on the health and performance of our system.
• AWS Lambda functions will trigger specific actions when events occur, streamlining processes.
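One possible shape for such an automation, sketched under my own assumptions: a Lambda function subscribed to S3 "object created" events that starts a Glue job for each new file. The job name "clean-raw-data", the GLUE_JOB_NAME environment variable, and the "--input_path" job argument are illustrative names only.

```python
import os
from typing import Any, Dict, List
from urllib.parse import unquote_plus


def new_objects(event: Dict[str, Any]) -> List[str]:
    """Extract s3:// URIs from an S3 event notification payload."""
    uris = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        uris.append(f"s3://{bucket}/{key}")
    return uris


def handler(event: Dict[str, Any], context: Any) -> Dict[str, List[str]]:
    """Lambda entry point: start one Glue job run per newly arrived object."""
    import boto3  # available in the Lambda runtime; imported lazily for local testing

    glue = boto3.client("glue")
    job_name = os.environ.get("GLUE_JOB_NAME", "clean-raw-data")  # hypothetical job
    run_ids = []
    for uri in new_objects(event):
        run = glue.start_job_run(JobName=job_name, Arguments={"--input_path": uri})
        run_ids.append(run["JobRunId"])
    return {"started": run_ids}
```

Lambda writes its logs to CloudWatch Logs by default, so failures in this function would show up alongside the rest of the monitoring.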
