Data & Analytics ReInvent Recap [AWS Basel Meetup - Jan 2023]
© All rights reserved 2022
Basel AWS User Group
re:Invent re:Cap
Data & Analytics
Cloudreach Service Lines
Cloudreach
Services
Advisory
Software Platform
Data & Analytics
Platform Development
Application Modernisation
Security Transformation
Migration
Management & Resale
Our Cloud Partnerships
UK&I Consulting Partner
of the Year 2019
EMEA GSI Partner of the Year
2022
3x Golden Jackets!
900+ Certifications
15 Competency
Specializations
Data Analytics - Services Specialisation
GCP Security Partner
of the Year 2021
100+ Certifications
12 Competency
Specializations
Gold Data Analytics Competency
Gold Data Platform Competency
Microsoft UK Partner
of the Year 2020
500+ Certifications
12 Gold Certifications
15 Competency
Specializations
Marketing
Achieved!
The Big Data & Analytics
Announcements
Zero-ETL Integration - Aurora with Redshift
Eliminate the need to build and maintain complex data pipelines to perform
extract, transform, and load (ETL) operations
Auto-Copy from S3
Load files from S3 as they arrive
Dynamic data masking
Protect sensitive information stored in your data warehouse and ensure that
only the relevant data is accessible by users based on their roles
Centralised access control with AWS Lake Formation
Simplify governance of data shared from Amazon Redshift and centrally manage
granular access across all data-sharing consumers
Integration with Apache Spark
Build and run Spark applications on Amazon Redshift and Redshift Serverless,
opening up the data warehouse for a broader set of AWS analytics and machine
learning solutions
Streaming Ingestion
Natively ingest hundreds of megabytes of data per second from Amazon Kinesis
Data Streams and Amazon MSK
Redshift
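As a rough illustration of the dynamic data masking idea above, the sketch below applies a role-dependent masking function to a protected column at read time. It is plain Python rather than Redshift SQL, and the role names, the `MASKING_POLICIES` table, and the `mask_row` helper are all hypothetical; in Redshift itself, masking policies are defined and attached with SQL statements.

```python
from typing import Callable, Dict

Row = Dict[str, str]


def full_mask(value: str) -> str:
    """Hide every character."""
    return "*" * len(value)


def last_four(value: str) -> str:
    """Hide everything except the final four characters."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def identity(value: str) -> str:
    """Return the value unmasked."""
    return value


# Hypothetical policy table: column -> role -> masking function.
MASKING_POLICIES: Dict[str, Dict[str, Callable[[str], str]]] = {
    "credit_card": {"analyst": full_mask, "support": last_four, "auditor": identity},
}


def mask_row(row: Row, role: str) -> Row:
    """Return a copy of the row as the given role is allowed to see it."""
    masked: Row = {}
    for column, value in row.items():
        policies = MASKING_POLICIES.get(column)
        if policies is None:
            masked[column] = value  # column has no masking policy
        else:
            # Roles without an explicit entry fall back to the strictest mask.
            masked[column] = policies.get(role, full_mask)(value)
    return masked


row = {"customer": "Ada", "credit_card": "4111111111111111"}
for role in ("analyst", "support", "auditor"):
    print(role, mask_row(row, role))
```

The stored data never changes; only the view handed back to each role differs, which is the property the slide describes.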
Glue 4.0
New and Updated Engines, More Data Formats, and More
This version of Glue includes Python 3.10 and Apache Spark 3.3.0, plus native
support for the Cloud Shuffle Service Plugin for Spark. It also includes Pandas
support and more.
Glue Data Quality
Analyses your tables and automatically recommends a set of rules based on what it
finds
Glue for Ray
Process large datasets with Python and popular Python libraries, using a serverless
option for data integration built on Ray, a popular new open-source compute
framework that helps you scale Python workloads
Glue
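To make the "recommend a set of rules" idea behind Glue Data Quality more concrete, here is a minimal sketch that profiles a small table and proposes rules from what it observes. The rule strings are written in the style of Glue's Data Quality Definition Language (DQDL), but the profiling logic, the `recommend_rules` helper, and the sample rows are assumptions for illustration, not how the service derives its recommendations.

```python
from typing import Dict, List, Optional


def recommend_rules(rows: List[Dict[str, Optional[str]]]) -> List[str]:
    """Propose completeness and uniqueness rules from observed data."""
    if not rows:
        return []
    rules: List[str] = []
    for column in rows[0]:
        values = [row.get(column) for row in rows]
        non_null = [value for value in values if value is not None]
        if len(non_null) != len(values):
            continue  # nulls observed, so no completeness rule is proposed
        rules.append(f'IsComplete "{column}"')
        if len(set(non_null)) == len(non_null):
            rules.append(f'IsUnique "{column}"')
    return rules


sample_rows = [
    {"order_id": "1", "country": "CH", "coupon": None},
    {"order_id": "2", "country": "CH", "coupon": "X"},
    {"order_id": "3", "country": "DE", "coupon": None},
]
for rule in recommend_rules(sample_rows):
    print(rule)
```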
Built-in Data Preparation, Real-Time Collaboration, and
Notebook Automation
Improve data quality in minutes with the built-in data preparation capability, edit
the same notebooks with your teams in real time, and automatically convert
notebook code to production-ready jobs.
ML Governance Tools for Amazon SageMaker
Simplify Access Control and Enhance Transparency Over Your ML Projects
Build, Train, and Deploy ML Models Using Geospatial Data
This collection of features offers pre-trained deep neural network (DNN) models
and geospatial operators that make it easy to access and prepare large
geospatial datasets.
Support for Shadow Testing
Compare Inference Performance Between ML Model Variants
Deploying a model in shadow mode lets you conduct a more holistic test by
routing a copy of the live inference requests for a production model to the new
(shadow) model.
Real-Time and Batch Inference support in Data Wrangler
Reuse the data transformation flow which you created in SageMaker Data
Wrangler as a step in Amazon SageMaker inference pipelines.
SageMaker
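The shadow testing item above can be pictured with a small sketch: every live request is answered by the production model, while a copy goes to the shadow model purely for comparison. This is an in-process illustration only; `ShadowRouter` and the toy models are hypothetical names, and SageMaker performs the mirroring at the endpoint level rather than in application code.

```python
from typing import Callable, List, Tuple

Model = Callable[[float], float]


class ShadowRouter:
    """Serve from production while mirroring each request to a shadow model."""

    def __init__(self, production: Model, shadow: Model) -> None:
        self.production = production
        self.shadow = shadow
        # Each entry is (request, production output, shadow output).
        self.comparisons: List[Tuple[float, float, float]] = []

    def predict(self, request: float) -> float:
        live = self.production(request)
        try:
            mirrored = self.shadow(request)
            self.comparisons.append((request, live, mirrored))
        except Exception:
            # A failing shadow model must never affect the caller.
            pass
        return live  # callers only ever see the production answer

    def mean_abs_difference(self) -> float:
        if not self.comparisons:
            return 0.0
        total = sum(abs(live - shadow) for _, live, shadow in self.comparisons)
        return total / len(self.comparisons)


router = ShadowRouter(production=lambda x: 2 * x, shadow=lambda x: 2 * x + 1)
for request in (3.0, 5.0):
    router.predict(request)
print(router.mean_abs_difference())
```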
Automated Data Preparation
Automated data preparation utilises machine learning to infer semantic
information about data and adds it to datasets as metadata about the columns
(fields), making it faster for you to prepare data in order to support natural
language questions.
Paginated Reports
Create and Share Operational Reports at Scale.
This feature allows customers to create and share highly formatted,
personalized reports containing business-critical data to hundreds of thousands
of end-users — without any infrastructure setup or maintenance, up-front
licensing, or long-term commitments.
New API Capabilities to Accelerate Your BI Transformation
New QuickSight API capabilities allow programmatic creation and management
of dashboards, analyses, and templates.
QuickSight
Amazon Security Lake
A Purpose-Built Customer-Owned Data Lake Service. This new service
automatically centralises your organisation’s security data from cloud and
on-premises sources into a purpose-built data lake stored in your account.
Step Functions Distributed Map
A Serverless Solution for Large-Scale Parallel Data Processing. The new
distributed map state can launch up to ten thousand parallel workflows to
process data.
Amazon Athena for Apache Spark
Use Spark without having to manually provision the infrastructure. Run
Apache Spark workloads, use Jupyter Notebook as the interface to perform data
processing on Athena, and programmatically interact with Spark applications
using Athena APIs.
AWS Database Migration Service: Fully Managed Schema
Conversion
Streamlines database migrations by making schema assessment and conversion
available inside AWS DMS. You can now plan, assess, convert and migrate under
one central DMS service.
Other New Stuff
Amazon DataZone
Unlock data across organisational boundaries with built-in governance. Discover
and share data at scale across organisational boundaries with governance and
access controls.
Amazon Omics
Store, query, and analyse genomic, transcriptomic, and other omics data and
then generate insights from that data to improve health and advance scientific
discoveries.
AWS Clean Rooms
Securely match, analyse, and collaborate on combined datasets across
organisations without revealing the underlying data.
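For the Step Functions Distributed Map item above, the sketch below builds a state machine definition as a Python dictionary and prints it as JSON. The field names follow the Amazon States Language as I understand it for distributed maps; the bucket, prefix, state names, and the `process-object` Lambda function are placeholders, so treat this as a starting point to check against the current documentation rather than a verified definition.

```python
import json


def distributed_map_state(bucket: str, prefix: str, max_concurrency: int = 1000) -> dict:
    """Build a Map state that fans out over the objects under an S3 prefix."""
    return {
        "Type": "Map",
        # Read the list of items to process directly from S3.
        "ItemReader": {
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "Parameters": {"Bucket": bucket, "Prefix": prefix},
        },
        # DISTRIBUTED mode runs each item as its own child workflow.
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
            "StartAt": "ProcessObject",
            "States": {
                "ProcessObject": {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::lambda:invoke",
                    # "process-object" is a placeholder function name.
                    "Parameters": {"FunctionName": "process-object", "Payload.$": "$"},
                    "End": True,
                }
            },
        },
        "MaxConcurrency": max_concurrency,
        "End": True,
    }


definition = {
    "StartAt": "ProcessAllObjects",
    "States": {
        "ProcessAllObjects": distributed_map_state("example-bucket", "incoming/"),
    },
}
print(json.dumps(definition, indent=2))
```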
So What is DataZone?
Provides the technology infrastructure for pursuing
a data mesh architecture within an organisation
Key components:
Business Data Catalog
Make data assets discoverable and searchable with business
metadata, manageable by Data Stewards via metadata forms
Data Projects
Groups users and data assets within a business context where they
can collaborate and manage their output
Governance & Access Control
Includes self-service data access requests, data producer review
processes, and access control to manage where data assets are
used
Data Portal
A dedicated web portal (outside of the AWS console) for working
with DataZone
Domains
Groupings of the above which enable DataZone to map its
governance and access control structure to your organisational
structure
Basically; Data Mesh Tech Layer-as-a-Service (Coming Soon)
So What is Omics?
Basically; Genomic Research Infra-as-a-Service
Provides specialised storage and processing workflows for
omics data
Built for petabyte-scale storage and processing
Designed to be used in GDPR- and HIPAA-regulated
environments
Automatically manages underlying infrastructure for
analytics workflows on omics data
Enables utilisation of omics data via Athena, EMR,
QuickSight, and more
GA
So What is Clean Rooms?
Provides the plumbing for multiple
organisations to collaborate on their data,
without actually copying/transferring it
Can share any data in your Glue Data
Catalog / S3 Data Lake
Basically; Zero Transfer X-Org Analytics (Preview)
Key terms:
Analysis Rules
Define what types of SQL queries other collaborators can use with a data asset you associate with a clean room
E.g. limiting which SQL aggregations can be used
Output Restrictions
Define limits on the results other collaborators can receive from a data asset you associate with a clean room
E.g. setting a minimum result value for a COUNT DISTINCT query
Can restrict collaborators' access to only rows that match an inner join on their own associated data assets
Can run queries on data whilst it remains encrypted - i.e. encryption in use!
Optionally logs all queries performed against your data to CloudWatch Logs in your AWS account
Important T&C contractual limit:
You may not use AWS Clean Rooms or any information obtained from your use of AWS Clean Rooms to identify a person or associate such information with an identifiable
person, unless otherwise permitted by the applicable third party contributor of the data.
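The "Output Restrictions" term above, with its COUNT DISTINCT example, can be sketched as a filter applied after aggregation: groups whose distinct count falls below a minimum are dropped from the result. The function name, the row shape, and the threshold are assumptions for illustration; this does not reproduce how Clean Rooms enforces its rules.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def count_distinct_with_minimum(
    rows: Iterable[Tuple[str, str]], minimum: int
) -> Dict[str, int]:
    """COUNT(DISTINCT user_id) per group, suppressing groups below `minimum`."""
    users_per_group: Dict[str, Set[str]] = defaultdict(set)
    for group, user_id in rows:
        users_per_group[group].add(user_id)
    # Small groups are withheld so a result cannot single out individuals.
    return {
        group: len(users)
        for group, users in users_per_group.items()
        if len(users) >= minimum
    }


rows = [
    ("basel", "u1"),
    ("basel", "u2"),
    ("basel", "u2"),  # duplicate user, counted once
    ("basel", "u3"),
    ("zurich", "u4"),
]
print(count_distinct_with_minimum(rows, minimum=2))
```

Here the single-user group is withheld entirely, which is the kind of limit a data owner might set before letting a collaborator query their data.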
My Favourite Session:
AWS IoT ExpressLink Workshop
What is IoT ExpressLink?
Physical hardware for integration into IoT devices, built by AWS partners
Provides AWS connectivity without the need to write your own connectivity
code
Implements AWS security best practices for IoT connectivity
Effectively:
● Offload the undifferentiated heavy lifting of connectivity and
cryptography in IoT devices
● Simplify and accelerate AWS-integrated IoT device and service
development
● Rapid mass onboarding of IoT devices, e.g. via Bluetooth
Configured for Wi-Fi or 5G connections using serial AT commands, e.g. (in
Python):
send_command(uart, f"AT+CONF SSID={wifi_ssid}")
send_command(uart, f"AT+CONF Passphrase={wifi_key}")
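The slide does not show `send_command` itself; one plausible shape for such a helper is sketched below. It assumes a UART object exposing `write` and `readline` (as pyserial and MicroPython UART objects do), assumes each command is newline-terminated and answered with a line starting `OK` or `ERR`, and uses a hypothetical `FakeUart` so the sketch runs without hardware.

```python
class FakeUart:
    """Stand-in for a serial port: records writes and answers OK to everything."""

    def __init__(self) -> None:
        self.sent = []

    def write(self, data: bytes) -> int:
        self.sent.append(data)
        return len(data)

    def readline(self) -> bytes:
        return b"OK\r\n"


def send_command(uart, command: str) -> str:
    """Send one AT command and return the module's reply line."""
    uart.write((command + "\r\n").encode("ascii"))
    raw = uart.readline() or b""  # some UARTs return None on timeout
    reply = raw.decode("ascii").strip()
    if not reply.startswith("OK"):
        # The command text is left out of the error in case it holds a secret.
        raise RuntimeError(f"AT command failed: {reply!r}")
    return reply


uart = FakeUart()
wifi_ssid = "example-network"  # placeholder value
print(send_command(uart, f"AT+CONF SSID={wifi_ssid}"))
```

Raising on anything other than `OK` keeps a misconfigured SSID or passphrase from being silently ignored during onboarding.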
We’ll use a freebie from the re:Invent
workshop to look at:
● IoT ExpressLink in use
● AWS IoT Core device shadows
● Wrapping device shadows in a
serverless API with AWS Lambda
● Building an AWS Amplify app on top
of that API
The demo badge contains:
● IoT ExpressLink chip (red circle)
● Raspberry Pi Pico-esque board
● A bunch of widgets!
Hardware details and image source here
Demo Time!
IoT Core Device Shadow
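A device shadow holds a `desired` and a `reported` state, and the difference between them tells the device what it still has to change. The sketch below computes that difference locally to show the idea; the attribute names (`led`, `display`) are made up for the badge, and in practice AWS IoT Core calculates the delta itself, so this simplified version is for illustration only.

```python
def shadow_delta(desired: dict, reported: dict) -> dict:
    """Return the desired attributes that differ from what the device reported."""
    delta = {}
    for key, want in desired.items():
        have = reported.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            nested = shadow_delta(want, have)  # compare nested sections field by field
            if nested:
                delta[key] = nested
        elif want != have:
            delta[key] = want
    return delta


shadow = {
    "state": {
        "desired": {"led": "on", "display": {"text": "hi", "brightness": 5}},
        "reported": {"led": "off", "display": {"text": "hi", "brightness": 3}},
    }
}
print(shadow_delta(shadow["state"]["desired"], shadow["state"]["reported"]))
```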
Basic Serverless Device Shadow API
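A minimal sketch of what such an API could look like: a Lambda handler that reads a thing's shadow and returns its state as JSON. The event shape (an API Gateway proxy event with a `thingName` path parameter), the `make_handler` factory, and the `FakeIotData` stand-in are assumptions, not the demo repository's code. In a deployed function the client would be `boto3.client("iot-data")`, whose `get_thing_shadow` call returns the document as a readable `payload` stream.

```python
import io
import json


def make_handler(iot_data_client):
    """Build a Lambda handler around an IoT data-plane client."""

    def handler(event, context):
        thing_name = event["pathParameters"]["thingName"]
        response = iot_data_client.get_thing_shadow(thingName=thing_name)
        shadow = json.loads(response["payload"].read())
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(shadow["state"]),
        }

    return handler


class FakeIotData:
    """Hypothetical stand-in for the boto3 client so the sketch runs offline."""

    def get_thing_shadow(self, thingName):
        document = {"state": {"reported": {"led": "off"}}}
        return {"payload": io.BytesIO(json.dumps(document).encode("utf-8"))}


handler = make_handler(FakeIotData())
result = handler({"pathParameters": {"thingName": "demo-badge"}}, None)
print(result["body"])
```

Passing the client in makes the handler testable without AWS credentials, at the cost of one extra line when wiring it up for real.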
Device Shadow Web App
Questions?
Perhaps Answers
Thank you
Cloudreach Tech Blog: https://www.cloudreach.com/en/technical-blog
These Slides: https://bit.ly/3QFl7Xr
Demo Repos: https://bit.ly/3GLt3BH
https://bit.ly/3IRfXWg
Slides
