Practical Data Mesh
Building Decentralized Data Architectures
with Event Streams
by Adam Bellemare
Foreword by Ben Stopford
© 2022 Confluent, Inc.
Foreword
The software industry has always been susceptible to trends. Some stick,
some pass, but all vie for the limelight in one way or another. Agile, SOA,
cloud, data lakes, microservices, DevOps, and event streaming have all
had a fundamental effect on the software we build today.
Data mesh may well be the next innovation we can add to this list. I see
it as a kind of microservices for data architecture. Part technology and
part practice, this socio-technical theory aims to let large, interconnected
organizations avoid putting all their data into one single place: a pattern
that can lead to paralysis. In a data mesh, different applications, pipelines,
databases, storage layers, etc., are instead connected through self-service
data products, creating a network, or “mesh,” of data that has no central
point where teams are forced to wait in line or step on each other’s toes.
Such problems plague the monolithic application and the monolithic data
warehouse alike.
Building systems that solve socio-technical problems like these has always
been a bit of a dark art. I got this sense early in my career, working on data
infrastructure in large enterprise companies. Investment banks, in particular,
have complex data models with thousands of individual applications
reusing the same common datasets: trades, customers, books, risk metrics,
etc. Their architectures inevitably become unruly, and the natural instinct is
to put everything in one place and install tight governance. While this can
work, it stifles progress.
After moving to Silicon Valley, I was surprised to see younger companies facing the same problems, although, being younger, they felt the effects less acutely. Two things stuck with me through these experiences.
First, the approaches that worked best had some elements of being
decentralized. Second, they emphasized getting the people part right, not
just the technology.
This book also does a great job of making data mesh more tangible. Adam
uses the data mesh principles as a set of logical guardrails that help readers
understand the trade-offs they need to consider. He then dives deep into
practical and opinionated implementation. This adds meat onto the bones
of what is otherwise an abstract theoretical construct. I think this down-to-earth take will be valuable to architects and engineers as they map their own problem domain to the data mesh principles.
Of course, whether data mesh proves to be the next big trend that rocks the
software industry remains up for debate, at least at the time of writing, but
Adam’s book takes you further than anyone has along this particular journey.
Introduction
While the data mesh concept is relatively new, the problems it proposes
to solve are not. This book covers the historical issues of data access,
including why these issues remain relevant to this day. It examines how
data mesh architecture can solve these historical problems and how
event streams play into this modern data stack. In addition, it explores
your options for building and designing data products served by event
streams, and the necessary decisions you’ll need to make for building
your self-service tooling.
The principles underpinning the data mesh, while detailed and prescriptive,
leave a lot of leeway for taking different approaches to the underlying
implementation. For this reason, we have included a review of Confluent’s
opinionated data mesh prototype, explaining the design decisions and
implementation choices we made in its creation. We also cover a data mesh
implementation created by Saxo Bank, showcasing the design choices and
architectural trade-offs that they decided upon.
Before we get into those, let’s start with why data mesh is relevant. What
are the main problems that we’re still trying to solve today?
A Brief History of Data Problems
Data, as a discipline, is often treated as a separate domain from engineering.
A data team composed of data engineers, data scientists, and data analysts
extracts data from engineering systems and does “something useful”
with it for the business. “Something useful” typically includes answering
analytical questions, building reports, and structuring data from disparate
systems into queryable form. An example of this might include correlating
sales with various patterns of user behavior observed on a website. Making
real-time product suggestions based on pages a user recently browsed
would be a more contemporary example.
The data team usually functions in a highly centralized capacity. While the
data engineers remain responsible for obtaining data from other systems,
they do not own those systems or the data inside. Yet they must find a way to couple to those systems, extract the data, transform it, and load it (ETL) into a location for the data scientists to remodel. This data is typically
stored in a large cloud storage bucket.
Figure 1-1. A typical data lake architecture, including the areas of responsibility
for the various roles
Model changes, data schema changes, configuration changes, and credential
changes are only a few common failure modes. Any breakage in an ETL
job ripples downstream to the data scientists and analysts trying to use the
data, resulting in dashboard outages, delayed reports, incorrect insights, and
stoppages to machine-learning model training and deployments. Scaling
becomes even more difficult as the number of datasets grows, because each
new dataset increases the fragility and complexity of the dependency graph.
Third, the traditional data team forms a bottleneck. They are responsible
for extracting, standardizing, and cleaning data, deriving and clarifying
relationships between datasets from multiple domains, and formulating
and serving data models to a wide array of end customers. They are also
responsible for building, maintaining, and fixing ETL jobs, a highly reactive
process when combined with their lack of ownership over the source
domain models. The data team often spends significant time tracking
down failures and remediating incorrectly computed results. The lack of
official dataset ownership makes it difficult to pinpoint who is responsible
for making data available to other teams. One estimate from IDC suggests that data analysts waste about 50% of their time simply trying to identify or build reliable sources of data.
Fourth, cost is often an issue with the traditional approach. While historically you may only have had a few data sources, data-centric
companies typically have hundreds of data sources, each with its own
dedicated, yet fragile, ETL job. The competitive needs for data have only
increased, with daily jobs replaced by hourly jobs, and hourly jobs replaced
by event stream processing.
This leads to the fifth issue with the traditional approach: timeliness. Real-
time data is better than batch. Data is increasingly being generated in real
time by mobile phones, servers, sensors, and internet-of-things devices.
Our customers expect us to react to this data quickly and efficiently. To do
so, we must shed the unresponsive batch processing in favor of real-time,
stream-processing solutions.
Data warehouses follow a very similar approach to data lakes: Extract
data, transform it, and load it into storage for analytical querying. But
unlike data lakes, data warehouses require a well-defined data model and
schema at write time, which is usually defined and fulfilled by the data
team. While this strategy may resolve many data quality issues, the onus of
maintaining a full, canonical data model, and its accompanying ETL jobs,
remains with the data team, and not with those who own the production
of the upstream data. Scaling, bottlenecks, cost, and timeliness remain
an issue, along with maintaining a data model stitched together from
independently evolving, disparate datasets from across the organization.
The Principles of Data Mesh
Data mesh moves the responsibility of providing reliable and useful access to data from the centralized data team back to the data's owner. Data is no
longer treated as an application’s byproduct, but instead is promoted as a
first-class citizen on par with other products created and used within an
organization. This requires a shift in responsibilities with respect to how
data is created, modeled, and made available across an organization.
Figure 2-1. A data mesh connected across multiple business domains
Each domain can choose which data to expose to the others through
well-structured and formally supported data products. By coupling data
product ownership to its source domain, we can apply the same rigor
to data products that we do to other business products. Consumers can
simplify their dependencies and eliminate coupling to internal source
models, instead relying on data products as trustworthy mechanisms for
accessing important business data.
Let’s take a closer look at the four main principles of data mesh.
Data as a Product
Each team is responsible for publishing their data as a fully supported
product. That team must engage in product thinking about the data they’re
serving: They’re wholly responsible for its contents, modeling, cohesiveness,
quality, and service-level agreements. The resultant data products form the
fundamental building blocks for communicating important business data
across the company.
Figure 2-2. A data product contains data, code, and infrastructure components
Be careful about selecting the domain data to include in the data product
and do not share any internal private state. Determining what data ends
up in a data product requires consulting both known and prospective
consumers. It is also common to create multiple data products within a
single domain, each serving the unique needs of its consumers.
For example, if you’re producing a sales forecast for Japan, you could find
all of the data you need to drive that report—ideally in a few minutes.
You’d be able to quickly get all of the data you need, from all of the places
it lives, into a database or reporting system that you control.
Federated Governance
Governance and control of your data remain a concern when it’s made
widely available across your organization. It’s a collaborative affair,
with individuals from across the organization joining together to create
appropriate standards and policies. Shared responsibilities include defining information security policies and data-handling requirements, and ensuring compliance with right-to-be-forgotten legislation. Another part of it is balancing a data product owner's right to autonomy against the ease with which consumers can access and use the data.
In the next section, we’ll look at creating a data product from an existing
domain, illustrating the interrelationship of the data mesh principles. We’ll
cover domain boundaries, modes of data product access, and governance,
and we’ll also discuss the role of event streams in data mesh.
A data product is composed within its domain and output across its
boundary for other teams to consume and use. Figure 3-1 shows the
components for building a data product.
From left to right, data is extracted from one or more operational systems,
then transformed and remodeled according to the data product’s
requirements. Finally, it is made available in a format suitable for consumer
use cases. The extraction and transformation steps form an anti-corruption
layer that isolates the internal data model from the data published to
the external world. Consumers coupled to the data product API need not concern themselves with internal model changes. This contrasts sharply with the frequent breaking changes inherent in the responsibility model of a centralized data team.
Data products are first and foremost meant to serve analytical use cases.
This means selecting the necessary operational data and remodeling it into
facts. These facts comprise a record of precisely what happened within that
domain’s operations, without directly exposing the internal data model of
the domain.
• Facts are read-only: Only the data product owner can add new
facts to the data product.
• Facts are immutable: Data product owners can append new facts
as an addendum to previous facts, but cannot modify, overwrite,
or delete them.
• Facts are timestamped: Each fact contains a timestamp representing
when it occurred, such that time-based ordering is made possible.
The three key characteristics of facts—that they are read-only, are immutable,
and are timestamped—provide the essential property of reproducibility:
A single consumer can read and reprocess the same data for a given time range as many times as it chooses and will always obtain the same result.
Similarly, two independent consumers can access the same data range
at different points in time and obtain consistent results. Consumers can
leverage these properties to produce projections or materialized views of
the data to serve their own specific business needs.
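To make these characteristics concrete, the following is a minimal Java sketch of what a fact might look like, using a hypothetical e-commerce item-click event; the class and field names are illustrative assumptions, not part of any particular data product.

import java.time.Instant;

// A hypothetical source-aligned fact. Java records are shallowly immutable:
// once created, a fact cannot be modified, matching the append-only model above.
public record ItemClickFact(
        String clickId,    // unique identifier for this individual fact
        String itemId,     // the item the user clicked on
        String userId,     // the user who performed the click
        Instant occurredAt // when the click happened, enabling time-based ordering
) { }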
Source Aligned
Source-aligned data products reflect business facts generated by an
operational system. For example, a storefront domain may emit click facts
containing the full set of details about each product that a user clicked on.
Aggregate Aligned
An aggregate-aligned data product consists of data from one or more
upstream data products. An example would be a daily aggregate of clicks
per e-commerce item enriched with each item’s category classification.
Consumer Aligned
A consumer-centric data product is built to serve a highly specialized use
case for a given consumer.
The three alignments make different trade-offs regarding the effort involved in creating them, their scope of application, and the effort required of the consumer to apply them. At one end of the spectrum are source-aligned
data products, which are general purpose and typically require only simple
business logic to construct. Consumers can take these source-aligned facts
and apply their own business logic to remodel and transform the data to
suit their needs.
In the middle of the data product spectrum are data products built around an aggregation for their consumers, which may or may not include
enrichment and denormalization of data. Figure 3-3 shows the aggregation
of item clicks into a daily count, joined on item entity data for ease of use
by downstream consumers.
While consumer needs should drive the domain alignment of data products,
data product owners must balance needs against available resources. While
a simple consumer-aligned data product may be reasonable to implement,
another may be too complex for the domain owner to support. In this case,
it will be up to the consumer to build and support a consumer-aligned data
product to fulfill its needs.
A big-data-compatible mode is the most common way to access a data
product. An example would be a set of Parquet files written to S3 for
periodic access via an analytics job. Another would be a set of domain
events published to an event stream. Other options could include accessing
the data product through SQL queries, REST APIs, or gRPC.
A single data product can be served through many different modes. Figure 3-5 shows a single data product served via three separate mechanisms.
Using event streams as the primary mode of data product access unlocks
several powerful options, which we’ll look at in detail in the next section.
Figure 3-7. A Kafka topic with two partitions, showcasing the append-only update
of Key 100 from “Apple” to “Red Apple”
If you would like to learn more about the fundamentals of Kafka, including
partitions, topics, production, and consumption, please check out
Confluent’s Kafka 101 course.
There are several major factors that make event streams optimal for serving
data products:
• They get data where it needs to be, in real time: Event streams are
a scalable, reliable, and durable way of storing and communicating
important business data. Streams are updated as new data becomes
available, propagating the latest data to all consumers. Real-time
data enables both streaming and batch use cases, including those
that span data centers, clouds, or geographies.
Figure 3-8. Sourcing customer and order data through event streams
Event streams provide data product consumers with the ability to use the
latest events, or to rewind to a specific point in time and replay events.
New consumers can begin from the start of the event stream and build up
historical state, while existing consumers can replay events to account for
bug fixes and new functionality. Each consumer autonomously controls its
stream consumption progress.
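A minimal sketch of this replay capability might look like the following, assuming a hypothetical orders topic and consumer group; a real consumer would apply its own business logic in place of the print statement.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReplayFromBeginning {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "orders-rebuild");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            consumer.poll(Duration.ofSeconds(1));            // join the group and receive partition assignments
            consumer.seekToBeginning(consumer.assignment()); // rewind to the start of every assigned partition

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Rebuild historical state, then continue with new events as they arrive.
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}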
Consumers can debug and inspect event stream contents using precisely
the same consumer interface. For example, a producer of an event stream
may want to validate the contents of what they have written, while consumers may wish to obtain sample data to validate their expectations. Inspecting the contents of an event stream can be done using a stand-alone application like kcat or a built-in SaaS UI, such as Confluent's message viewer.
Event streams play a unique role in data mesh because they provide an
opportunity for the consumer to have access to a stream of business facts.
Source-aligned and aggregate-aligned event streams offer the opportunity for real-time reaction, and provide enterprising consumers with an easy-to-use, general-purpose data product API for building their own use-case-specific models with their own business logic.
Regardless of how the event stream data product is aligned, the need to
have a well-defined schema to enforce the data format remains, along with
metadata to describe it.
Apache Avro and Google’s Protobuf are the leading options for event stream
schemas. The value of an event must have an explicitly defined schema to simplify discovery, validation, documentation, and code generation. An
event’s key may also have a schema, or it may use a primitive type if
the key is a simple, unique identifier. Schemas are stored in the Schema
Registry and comprise part of the metadata required for discovering the
contents of the data product.
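The following is a minimal sketch of producing an Avro-encoded event with a primitive string key, registering its value schema with Schema Registry on first use; the schema fields, topic name, and URLs are illustrative assumptions.

import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroDataProductProducer {
    private static final String CLICK_SCHEMA = """
            {"type": "record", "name": "ItemClick", "fields": [
              {"name": "item_id", "type": "string"},
              {"name": "user_id", "type": "string"},
              {"name": "occurred_at", "type": "long"}
            ]}""";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // A primitive string key; the Avro value schema is registered under the
        // topic's value subject (here, "item-clicks-value") on first use.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        Schema schema = new Schema.Parser().parse(CLICK_SCHEMA);
        GenericRecord click = new GenericData.Record(schema);
        click.put("item_id", "item-42");
        click.put("user_id", "user-7");
        click.put("occurred_at", System.currentTimeMillis());

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("item-clicks", "item-42", click));
        }
    }
}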
But data mesh is multimodal, and other interfaces may better serve some
data products. Further, you can serve the same data product through both
an event stream and another interface, since using one does not preclude
you from the other.
Your data product APIs will depend heavily on your organization’s existing
tech stack, the resources you have dedicated towards data mesh, and the
needs of your data users. Your organization may be at a stage where it only
needs a daily aggregate of user analytics data, met with a nightly export to
cloud storage for batch computation. With no real-time use cases, you may
find it unnecessary to invest in event-driven data products until such use
cases arise.
Using Kafka topics as the basis for your data products relies heavily
on stream processing and stream connectors. While there are multiple
technologies you can choose to support building data products, here are
the two that we have found to be essential for creating and consuming
event stream data products:
• Kafka Connect, as a means of integrating non-streaming systems
with event streams.
• ksqlDB for consuming, transforming, joining, and using event
streams to drive business logic and create new data products.
Figure 4-1. Kafka Connect and ksqlDB make it easy to use event streaming
data products
While you can download Kafka Connect and run your own cluster, Confluent
makes self-service a reality by providing it as a service. Data product owners
specify the datasets they want to extract in the connector configuration,
including targeted table and column names, SQL queries, or other database-
specific query options. The connector is automatically scaled up and down
depending on resource needs. All you need to do is choose from the large
selection of managed connectors, provide your configurations, and start it up.
Figure 4-3. Extracting events from the transaction log into Kafka
To use it, you create a dedicated new table (the outbox) to store events
temporarily. Next, you modify your code by wrapping selected database-modifying commands within a transaction. Thus, a corresponding record
is also published to the outbox whenever you insert, update, or delete data.
The following pseudocode shows a user insertion wrapped in a transaction,
such that the outbox is also populated:
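As a stand-in for that pseudocode, here is a minimal JDBC-style sketch; the users and user_outbox tables, their columns, and the connection URL are illustrative assumptions.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CreateUserWithOutbox {
    public static void createUser(String id, String name, int age, String pin) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost:5432/users_db")) {
            conn.setAutoCommit(false);  // begin the transaction
            try (PreparedStatement insertUser = conn.prepareStatement(
                         "INSERT INTO users (id, name, age, pin) VALUES (?, ?, ?, ?)");
                 PreparedStatement insertOutbox = conn.prepareStatement(
                         "INSERT INTO user_outbox (id, name) VALUES (?, ?)")) {

                // Write the full record to the internal domain model.
                insertUser.setString(1, id);
                insertUser.setString(2, name);
                insertUser.setInt(3, age);
                insertUser.setString(4, pin);
                insertUser.executeUpdate();

                // Write only the public fields to the outbox: user.age and
                // user.pin are deliberately excluded from the data product model.
                insertOutbox.setString(1, id);
                insertOutbox.setString(2, name);
                insertOutbox.executeUpdate();

                conn.commit();   // both writes become visible together
            } catch (Exception e) {
                conn.rollback(); // neither write is applied on failure
                throw e;
            }
        }
    }
}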
The transaction ensures that any update made to the internal domain model
is reflected in the outbox table on a successful commit, and correctly rolled
back in the case of a failure. You may also have noticed that the outbox record does not include user.age or user.pin. This is a deliberate choice: the outbox
table acts as an anti-corruption layer, isolating the internal domain model
from the data product model.
Once data is in the outbox, a process must write it into the corresponding
Kafka topic, and self-service Kafka Connect is the easiest way to do this.
You can set up a CDC connector to tail the database log or a periodic query
connector to query the table for the latest results. After emitting the captured
data to the event stream, be sure that you delete it from the outbox table to
keep the disk usage low and to avoid republishing old data.
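For the periodic-query option, a minimal sketch of registering a JDBC source connector in incrementing mode against the outbox table might look like the following, using a self-managed Kafka Connect cluster's REST API; the Connect URL, database details, and table and column names are illustrative assumptions, and a fully managed connector takes an equivalent configuration through Confluent Cloud. Deleting already-published outbox rows is handled separately.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterOutboxSourceConnector {
    public static void main(String[] args) throws Exception {
        // Assumes the outbox table carries an auto-incrementing surrogate column
        // (outbox_id) that the connector can use to pick up only new rows.
        String connectorJson = """
                {
                  "name": "user-outbox-source",
                  "config": {
                    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                    "connection.url": "jdbc:postgresql://users-db:5432/users_db",
                    "connection.user": "connect",
                    "connection.password": "secret",
                    "table.whitelist": "user_outbox",
                    "mode": "incrementing",
                    "incrementing.column.name": "outbox_id",
                    "topic.prefix": "users."
                  }
                }""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://connect:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(connectorJson))
                .build();

        // Registering the connector starts the periodic extraction.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}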
One alternative is to keep only limited retention: the premise is that the event stream does not contain the entire history of
data but only a tiny recent subset. A new consumer who wants the whole
data history would simply start up their application, then push a button in
a UI to have the source domain republish the entire dataset.
For several reasons, we instead recommend infinite event stream retention: keep all data product facts in the event stream for as long
as they remain relevant to the business. Consumers simply reset their offset
to the beginning of the event stream to access the history.
In this example, the consumer has fully materialized the event stream into
its domain boundaries. In materializing (key: 100, value: “Red Apple”,
offset: 2), the consumer service simply overwrote the existing entry, evicting
(key: 100, value: “Apple”, offset: 0).
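A minimal sketch of this kind of materialization keeps only the latest value per key in an in-memory map, which stands in here for whatever store the consumer actually uses.

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecords;

public class LatestValueMaterializer {
    private final Map<String, String> table = new HashMap<>();

    // Applies a batch of polled events, keeping only the latest value per key.
    public void apply(ConsumerRecords<String, String> records) {
        records.forEach(record ->
                // A newer event for a key (key 100, "Red Apple", offset 2)
                // simply overwrites the earlier entry (key 100, "Apple", offset 0).
                table.put(record.key(), record.value()));
    }

    // A read-only view of the materialized state.
    public Map<String, String> view() {
        return Map.copyOf(table);
    }
}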
The sink connector converts the event stream into the format required by
the existing systems. This native ability to power both real-time streaming
jobs and offline batch jobs is one of the most significant advantages of
publishing data products to event streams.
Second, we can use these materialized views to join, aggregate, and enrich
other events, composing complex business logic with streaming primitives.
Updates are made in real time as the ksqlDB application receives new
events.
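As a minimal sketch of such a derivation, the following uses the ksqlDB Java client to create a continuously updated daily click count per item; the host, stream, and column names are illustrative assumptions, and the underlying item_clicks stream is assumed to already be registered over its Kafka topic.

import io.confluent.ksql.api.client.Client;
import io.confluent.ksql.api.client.ClientOptions;

public class DailyClicksDataProduct {
    public static void main(String[] args) throws Exception {
        ClientOptions options = ClientOptions.create()
                .setHost("localhost")
                .setPort(8088);
        Client client = Client.create(options);

        // A persistent query that continuously maintains a daily click count
        // per item, updating in real time as new click events arrive.
        String ctas = """
                CREATE TABLE daily_item_clicks AS
                  SELECT item_id, COUNT(*) AS click_count
                  FROM item_clicks
                  WINDOW TUMBLING (SIZE 1 DAY)
                  GROUP BY item_id
                  EMIT CHANGES;
                """;

        client.executeStatement(ctas).get();  // wait for the statement to be accepted
        client.close();
    }
}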
The key takeaway is that the overhead associated with accessing, consuming,
and storing data derived from data products needs to be managed and
simplified for all consumers. They should be able to spend their efforts
focusing on using data products instead of struggling with access and
management overhead. These self-service tools enable a clean separation
of responsibilities for the data product creator and the consumers accessing
the event stream output port.
The Confluent data mesh is built on the concept that data products are
best served as event streams. In this section, we’re going to examine the
Confluent data mesh prototype to get a better idea of what an opinionated
implementation actually looks like. We’ll investigate how the principles of
the data mesh map to the prototype, exploring alternative options and their
trade-offs. Finally, we’ll cap it off with a case study from one of Confluent’s
customers, Saxo Bank.
The Confluent data mesh prototype is built on top of the fully managed,
cloud-native data streaming platform Confluent Cloud. As the event
stream is the primary data product mode in the reference architecture, we
rely heavily on several of Confluent’s commonly used services:
• Apache Kafka acts as the central broker for hosting the event stream
topics, including handling all producer writes and consumer reads.
• The Confluent Schema Registry contains the schemas for each
event stream and manages the schema evolution rules for handling
changes over time.
• The Confluent Stream Catalog stores the metadata related to data
products for easy discoverability.
Figure 5-1. Confluent’s data mesh prototype
The prototype is written in Spring Boot Java with an Elm frontend. You
can run and fork your own copy of the prototype by following the steps
outlined in the GitHub repo. The source code is freely available for you to
use however you like.
While this book will cover some of the features of the data mesh
prototype, it will not be an exhaustive tour of all components. If you are
interested in a detailed video tour, please see the Data Mesh 101 course.
The prototype's interface is divided into three tabs, one for each role:
• Tab 1 is for Data Product Consumers, where they can explore the
available data products, including the metadata that describes
them.
• Tab 2 is for Application Developers, who can use the available data
products to build new applications.
• Tab 3 is for Data Product Owners, who can manage published data
products and their advertised metadata.
Our prototype data mesh only allows the registration of Kafka topics that
already have an associated schema. You cannot register internal topics, nor
topics that are outside of your domain. Figure 5-3 shows the full set of metadata required to register a new data product.
Figure 5-3. The required metadata gatekeeps the publishing of a data product
Data Quality and Service Level Agreement (SLA) are the two other
required metadata fields. The first indicates the data product's quality, differentiating canonical data products (Authoritative) from those with lower (Curated) or no (Raw) standards. SLAs provided by the
domain owner indicate the service level a consumer can expect when using
the data product: Tier 1 indicates that someone will get out of bed to fix
the data product in the middle of the night, whereas Tier 3 indicates you’ll
have to wait until the next business day.
As a publisher of data products, you are responsible for ensuring that all
required guarantees can be met. Figure 5-4 shows a completed metadata
record, including explicit acknowledgment of responsibility by the data
product owner.
Figure 5-4. A completed metadata record for an event stream ready for publishing
Figure 5-6. Stream Lineage showcasing the lineage of the stocktrades topic,
including ksqlDB queries and connectors
By ensuring data quality via schema support, metadata via Stream Catalog,
and lineage options via Stream Lineage, we’re able to provide users with
reliable self-service tooling to discover, understand, and trust the data they
need.
At a minimum, a consumer needs to know the Kafka topic name, the hosting
event broker URI, the Schema Registry URI, and the subject name (to obtain
the schema). The consumer must also be able to register themselves as a
consumer of the data product, using Kafka ACLs or Confluent’s RBAC.
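Put together, a minimal consumer sketch needs little more than those details; every value below (broker and Schema Registry URIs, credentials, group, and topic) is an illustrative placeholder that would come from the data product's metadata and the access grant.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DataProductConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092"); // hosting broker URI
        props.put("group.id", "sales-forecast-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "https://psrc-xxxxx.us-east-1.aws.confluent.cloud"); // Schema Registry URI
        // Credentials granted when registering as a consumer (via ACLs or RBAC);
        // Schema Registry may require its own credentials as well.
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"<api-key>\" password=\"<api-secret>\";");

        try (KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props)) {
            // The data product's topic; its value schema lives under the
            // "item-clicks-value" subject in the Schema Registry.
            consumer.subscribe(List.of("item-clicks"));
            consumer.poll(Duration.ofSeconds(1)).forEach(record ->
                    System.out.println(record.key() + " -> " + record.value()));
        }
    }
}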
Accessing data products through Kafka Connect has a very low barrier to entry, abstracting the complexity away from the consumer. However, as we saw
earlier with the Kappa Architecture, there are other ways to consume and
use event streams, and your needs will vary with your organization and
tooling. As an implementer of the data mesh, your focus needs to remain on
simplifying the barriers to entry and integrating with your organization’s
workflows and tooling.
To make this a reality, the data mesh federated governance group, together
with your microservices standards group, must focus on streamlining
microservice integration with the data mesh. One form of this is reducing the overhead of common tasks as part of facilitating self-service.
First, the data product may need to be refactored, possibly due to a shift
in domain responsibilities, a renegotiation of the public domain boundary
with consumers, or perhaps due to an oversight in the original schema.
This is referred to as a breaking change, and depending on the magnitude
of the change, it could mean as little as minor accommodations by existing
consumers, or something as large as a full migration to a new data product. Stream Lineage, in conjunction with strict read-only access controls,
helps you identify the affected services and plan migrations accordingly.
A data product owner needs to be able to communicate with all of its registered
consumers to notify them about breaking changes and the compensating
actions that will be necessary. This is a communications-intensive process
and is best resolved by getting representatives from all participating teams
to come up with a joint migration plan. This can include creating new data
products, populating them with new schemas, reprocessing old data into
the new format, and swapping existing consumers over to the new data
products. Consumers may also need to reset and rebuild their own internal
state, depending on the severity of the change.
Event streams provide a significant replication advantage over other
data product APIs thanks to their incremental, eventually consistent, and
append-only nature. As an event is published to the data product’s host
Kafka cluster, the replication tooling can immediately route a copy of the
new event to the desired destination cluster.
While the concept of replicating an event stream is simple, there are still
a host of related concerns to attend to. We still need to address syncing of
topic definitions, configurations, consumer offsets, access control lists, and
the actual data itself. The reality is that accurately synchronizing a data
product in real time and across multiple environments and cloud providers
is difficult to do well. It is further complicated by intermittent errors,
network connectivity problems, privacy rules, and the idiosyncrasies of
cloud providers.
https://cnfl.io/practical-data-mesh-video
Figure 7-1. Saxo Bank’s starting architecture, bridging analytics and operations
through batch ETL
Thus, Saxo Bank began their journey to a data mesh architecture. Rather than narrate their entire journey, which Paul Makkar covers in the video, we will focus on a few of their design decisions.
Saxo Bank is a .NET shop, and this heavily influenced their design
decisions. For one, Apache Avro isn’t supported as well in .NET as it is
in the Java world, so Saxo made the decision to go with Protobuf for their
event schemas. This decision was made by their federated governance body, which ruled out supporting multiple schema formats in order to keep tooling simple for producers and consumers alike.
Aside from the problems of data quality and ownership, Saxo Bank’s
migration to data mesh was also driven by their need to provide well-
formulated operational data to multiple consumer systems. Kafka shines in
asynchronous, event-driven architectures, and when coupled with strong domain ownership and a clear delineation of data ownership responsibilities, it provided a common data plane for all teams to access.
While the Confluent data mesh prototype uses Confluent’s Stream Catalog
to store the metadata and provide the backing for its data discovery UI,
Saxo Bank uses Acryl Data. The principles of publishing, managing, and
discovering data products remain identical: Metadata is collected as part of
the publishing process, with prospective users able to browse and discover
the data they need.
One decision that Paul highlights in the Saxo Bank migration is the
intersection of information security with the standards established by
federated governance, as mandated to all data product owners. All
personally identifiable information (PII) is encrypted end to end, with encryption requirements integrated into the schemas using Protobuf tags. They use format-
preserving encryption to ensure that while data remains encrypted, the
format of the schemas remains unchanged. This preserves functionality for
clients that do not have encryption access, eliminating the need to maintain
multiple topics of similar-yet-encrypted data.
Saxo Bank shifts the data modeling and data quality responsibilities to
the left, upstream to the teams that own the data. Data product owners
assume the responsibilities of creating consumer-friendly data models and
event schemas, working with current and prospective consumers to ensure
their needs are met. Data products are published to the operational plane,
backed by Apache Kafka, where both operational domain and analytics
domain teams and services can discover, subscribe to, and consume them
as they need.
Saxo Bank’s data mesh architecture relies upon both the operational and
analytical planes to source important business data products through
Apache Kafka. Consumers rely on asynchronously updating data products
to power their use cases, using stream processors, Kafka Connect, or basic
consumer clients to consume and remodel data for usage in their own
domains. New data products can be emitted back into Apache Kafka,
registered for usage and consumption by other teams.
https://cnfl.io/practical-data-mesh-blog
https://cnfl.io/practical-data-mesh-podcast
Conclusion
Finally, collaborative federated governance provides guidance and
standards for everyone working in the data mesh. Self-service platform
engineers obtain clarity on their user requirements, data product owners
obtain clarity on their roles and responsibilities, and would-be consumers
obtain clarity on identifying products, using self-service tooling, and
change request processes.
About the Author
Adam Bellemare is a Staff Technologist at
Confluent, and author of O’Reilly’s Building
Event-Driven Microservices.