Data Mesh in Practice
How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
Max Schultze - max.schultze@zalando.de - @mcs1408
Arif Wider - awider@thoughtworks.com - @arifwider
17-11-2020
Who are we?
Max Schultze
● Lead Data Engineer
● MSc in Computer Science
● Took part in early development of Apache Flink
● Retired semi-professional Magic: the Gathering player
Arif Wider
● Software engineering professor (full) at HTW Berlin, Germany
● Fellow technology consultant with ThoughtWorks Germany (part-time)
● Former Head of AI at ThoughtWorks
● Coffee geek
TABLE OF CONTENTS
Zalando’s Data Platform
What’s this Data Mesh?
Data Mesh in Practice
Zalando’s Data Platform
[Diagram: stages Ingestion, Storage, Serving; components Web Tracking, Event Bus, DWH, Metastore, Processing Platform, Fast Query Layer, Data Catalog]
Centralization Challenges
Datasets provided by central data infrastructure team
● Lack of ownership
Data pipelines operated by central data infrastructure team
● Lack of quality
Organizational scaling
● Central team becomes the bottleneck
A Recurring Pattern
Why is that?
[Diagram labels: central data platform; checkout service; checkout events]
What is Data Mesh?
Old wine applied to new bottles…
→ Product Thinking
→ Domain-Driven Distributed Architecture
→ Infrastructure as a Platform
… creates value from Data
https://martinfowler.com/articles/data-monolith-to-mesh.html by Zhamak Dehghani
Data as a Product
● What is my market?
● What are the desires of my customers?
● What “price” is justified?
● How to do marketing?
● What’s the USP?
● Are my customers happy?
Domain-Driven Distributed Architecture… applied to Data
● Discoverable
● Addressable
● Self-describing
● Trustworthy
● Interoperable
● Secure
Domain → Aggregated Domain
...backed by domain-agnostic self-service data infrastructure
Data Infra as a Platform
It’s a mindset shift
FROM → TO
Centralized ownership → Decentralized ownership
Pipelines as first class concern → Domain Data as first class concern
Data as a by-product → Data as a Product
Siloed Data Engineering Team → Cross-functional Domain-Data Teams
Centralized Data Lake / Warehouse → Ecosystem of Data Products
Data Mesh in Practice
Recap:
● From Bottleneck to Infra Platform
● From Data Monolith to Interoperable Services
[Diagram labels: central data platform; Data Infra as a Platform]
Central Services with Global Interoperability
[Diagram layers: Processing Platform; Data Lake Storage; Governance Layer]
● Bring Your Own Bucket (BYOB)
● Simplify Data Processing
● Simplify Data Sharing
Decentralized ownership does not imply decentralized infrastructure!
Interoperability is created through the convenient solutions of a self-service platform.
● Decentralized storage, central infrastructure
● Decentralized ownership, central governance
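One way to picture decentralized storage under central governance is a central registry that records which team owns which bucket and answers access questions, while the data itself stays in the team's own bucket. The sketch below is an illustration under assumptions, not the platform described in the talk: `BucketRegistry`, its methods, and the bucket and team names are hypothetical.

```python
class BucketRegistry:
    """Hypothetical central governance layer for team-owned buckets."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}       # bucket -> owning team
        self._grants: dict[str, set[str]] = {}  # bucket -> teams with read access

    def register(self, bucket: str, owner_team: str) -> None:
        """A team brings its own bucket; the registry only records ownership."""
        if bucket in self._owners:
            raise ValueError(f"{bucket} is already registered")
        self._owners[bucket] = owner_team
        self._grants[bucket] = {owner_team}

    def grant_read(self, bucket: str, granting_team: str, reader_team: str) -> None:
        """Only the owning team may share its bucket with another team."""
        if self._owners.get(bucket) != granting_team:
            raise PermissionError(f"{granting_team} does not own {bucket}")
        self._grants[bucket].add(reader_team)

    def can_read(self, bucket: str, team: str) -> bool:
        """Central answer to 'may this team read that bucket?'."""
        return team in self._grants.get(bucket, set())


# Bucket and team names are made up for illustration.
registry = BucketRegistry()
registry.register("example-checkout-bucket", owner_team="checkout")
registry.grant_read("example-checkout-bucket", "checkout", "analytics")
print(registry.can_read("example-checkout-bucket", "analytics"))  # → True
print(registry.can_read("example-checkout-bucket", "marketing"))  # → False
```

The design point is that the sharing decision stays with the owning team, while the record of that decision lives in one central place that every consumer can query.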
Recap:
● Datasets provided through pipelines of central data infrastructure teams
How to Ensure Data Quality?
Make conscious decisions
● Opt-in instead of default storage
● Behavioral changes - data is a product
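“Opt-in instead of default storage” can be read as: nothing is archived unless the producing team has explicitly asked for it. The following is a minimal sketch of that rule, assuming a hypothetical per-event-type setting (`ArchivalSetting`) and made-up event type names. The talk does not say how the opt-in is actually recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivalSetting:
    """Hypothetical opt-in record created by the team that owns an event type."""

    event_type: str
    owner_team: str
    archive: bool = False  # default is off: storing data must be a conscious decision


def event_types_to_archive(settings: list[ArchivalSetting]) -> list[str]:
    """Return only the event types whose owners explicitly opted in."""
    return sorted(s.event_type for s in settings if s.archive)


# Event types and teams are made up for illustration.
settings = [
    ArchivalSetting("checkout.order-placed", owner_team="checkout", archive=True),
    ArchivalSetting("checkout.debug-trace", owner_team="checkout"),
    ArchivalSetting("search.query-logged", owner_team="search"),
]
print(event_types_to_archive(settings))  # → ['checkout.order-placed']
```

Making `archive` default to `False` is the whole point of the sketch: a dataset only reaches shared storage after someone has decided to own it.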
Care About Your User!
● Classification of Usage
● Dedicate resources to
○ Understand usage
○ Ensure quality
Some Numbers
● 40 teams using BYOB
● 100 teams using the processing platform
● First curated data teams
● 0 operational effort for the central team
[Diagram labels: Processing Platform; Data Products; On Data Products]
It’s a journey ;)
It’s a Journey
“Off the shelf” data tooling
● Template-driven data preparation
● Decentralized archiving
● Decentralized GDPR deletion tooling
