Notification System Architecture With AWS _ By Joud W. Awad _ Medium

The document outlines the design and implementation of a serverless notification system using AWS services, focusing on scalability and support for various notification types such as push notifications, SMS, and email. It details functional and non-functional requirements, including user preferences, high availability, and cost optimization, while also discussing database options like Amazon DynamoDB and RDS Aurora for managing user data. The architecture aims to efficiently handle high volumes of notifications, ensuring low latency and reliable delivery during peak loads.


1/5/25, 6:16 PM Notification System Architecture With AWS | By Joud W.

Awad | Medium


Notification System Architecture With AWS


Joud W. Awad
30 min read · Dec 26, 2024

In this blog post, we will explore how to design and implement a fully production-
ready serverless notification system using AWS services. This architecture will
enable various types of notifications to be sent to customers in response to specific
events within a system.

Designed for scalability, the system will support large-scale applications, handling
hundreds or even thousands of notifications per second. We will leverage AWS
services to build this solution and explore different design patterns, evaluating them
based on cost, performance, and other key considerations.

Understanding the Functional Requirements

https://medium.com/@joudwawad/notification-system-architecture-with-aws-968103c2c730 1/62

Our goal is to design a system capable of handling a high volume of notifications per
second while ensuring extensibility to accommodate new notification types as the
system evolves. The key functional requirements for this system include:

1. Support for Multiple Notification Types


The system must support various notification types, including push
notifications, SMS messages, and emails.

2. Customer Opt-in/Opt-out Preferences


Users should be able to manage their notification preferences, allowing them to
select which types of notifications they wish to receive.

3. Notification Provider Extensibility


To future-proof the system, it should be designed to easily integrate new
notification types as the business requirements grow.

4. Analytics for Stakeholders


Stakeholders should have access to analytics capabilities within the notification
system. This would enable insights into metrics such as the most commonly sent
notifications over specific timeframes (e.g., past hours, days, etc.).

Non-functional Requirements
The business objectives guide us in identifying several critical non-functional
requirements for the notification system:

1. Scalability
The system must efficiently scale to handle an increasing volume of
notifications, both incoming and outgoing.

2. High Availability
To ensure uninterrupted service, the system must maintain high availability at
all times.

3. Reliability
The system should guarantee the delivery of all notifications without any loss.

4. Extensibility
The architecture should be flexible, requiring minimal changes to support new
notification types or providers in the future.

5. Cost Optimization
Given the heavy usage of the system, cost efficiency is crucial. The design
should focus on leveraging the most cost-effective AWS services and features to
minimize operational expenses.

Back-of-the-Envelope Estimation
The notification system we are designing operates as a single service within a
microservices architecture. In our ecosystem, there are approximately 150 services
running concurrently, each generating events. However, not all events are relevant
to notifications. After analyzing the services, we identified that around 20 services
are responsible for generating notification-related events.

These 20 services collectively produce approximately 2,000 events per second under
normal conditions. However, during peak usage scenarios — such as promotional
campaigns, flash sales, or system-wide updates — event generation can surge
significantly. Based on historical data and projections, we estimate peak loads could
reach up to 5,000 events per second. The system must be designed to handle these
peaks without degradation in performance.

To meet business needs, the system should aim for low latency in processing and
delivering notifications. Notifications should ideally be delivered to end customers
within 1 second of the event being generated. In extreme peak scenarios, a slight
increase in latency — up to 5 seconds — may be acceptable, but only as a fallback
during sustained surges.
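To put the per-second figures above into perspective, the short sketch below converts them into daily volumes and a peak-to-normal ratio. It assumes the stated rates are sustained for a full day, which overstates the peak case; the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

NORMAL_EVENTS_PER_SECOND = 2_000  # normal load from the estimate above
PEAK_EVENTS_PER_SECOND = 5_000    # peak load from the estimate above


def events_per_day(events_per_second: int) -> int:
    """Daily event volume if the given rate were sustained for 24 hours."""
    return events_per_second * SECONDS_PER_DAY


def peak_to_normal_ratio() -> float:
    """Headroom the system needs at peak, relative to normal load."""
    return PEAK_EVENTS_PER_SECOND / NORMAL_EVENTS_PER_SECOND


if __name__ == "__main__":
    print(events_per_day(NORMAL_EVENTS_PER_SECOND))  # 172800000
    print(peak_to_normal_ratio())                    # 2.5
```

At normal load this is roughly 173 million events per day, and the system needs about 2.5 times its normal capacity to absorb peaks.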

High-level design
This section outlines the high-level design of a notification system that supports
various notification types, including iOS push notifications, Android push
notifications, SMS messages, and email. The design is structured into the following
components:

1. High-Level System Design


An overview of the system architecture, detailing how different components
interact to process and deliver notifications efficiently.

2. Support for Multiple Notification Types


A breakdown of how the system handles various notification channels, such as
iOS and Android push notifications, SMS messages, and email, ensuring
flexibility and scalability.

3. Contact Information Gathering Flow & Data Modeling


An explanation of how user contact information is gathered, stored, and
managed, including a discussion on data modeling to support user preferences
and notification routing.

4. Service Invocation Types


Details on the service invocation patterns used, such as synchronous,
asynchronous, or event-driven mechanisms, to optimize notification processing
and delivery.

5. Notification Service Responsibilities


The notification service is the core of the notification system. This section
discusses its responsibilities.

High-Level System Design


To understand the high-level design of our notification system, we need to clarify its
core responsibilities:

1. Event Invocation
The notification service must be triggered by the relevant events generated
within the system.

2. Event Type Processing and User Preferences


For each event, the service should determine the notification type and fetch the
user’s preferences for that type of notification. This ensures that notifications
align with the user’s opt-in/opt-out choices.

3. Fan-out to Notification Providers


The service must handle the distribution (fan-out) of notifications to the
appropriate notification providers, such as push notifications, SMS gateways, or
email services.

4. Durable Storage for Notifications


Notifications should be preserved in durable storage until they are successfully
processed. If a notification fails to be processed, it must remain in storage and
be retried until successfully delivered.

With these requirements in mind, the architecture could be visualized as follows:

Different types of notifications


We start by looking at each notification type at a high level.

iOS push notification

We primarily need three components to send an iOS push notification:

Provider: A provider builds and sends notification requests to Apple Push
Notification Service (APNs). To construct a push notification, the provider
provides the following data:

Device token: This is a unique identifier used for sending push notifications.

Payload: This is a JSON dictionary that contains a notification’s payload.



{
  "aps": {
    "alert": {
      "title": "Order Submitted",
      "body": "You have a new order submitted, check the order details for mo…",
      "action-loc-key": "VIEW"
    },
    "badge": 5
  }
}

APNs: This is a remote service provided by Apple to propagate push
notifications to iOS devices.

iOS Device: It is the end client, which receives push notifications.
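As a rough illustration of the provider's job, the sketch below assembles a payload with the same shape as the JSON dictionary shown above. It covers only the fields that appear in that example; the function name and parameters are hypothetical, and sending the payload to APNs (authentication, HTTP/2 transport, device token handling) is out of scope here.

```python
import json
from typing import Optional


def build_apns_payload(title: str, body: str, badge: Optional[int] = None,
                       action_loc_key: Optional[str] = None) -> dict:
    """Build the 'aps' dictionary for an alert notification."""
    alert = {"title": title, "body": body}
    if action_loc_key is not None:
        alert["action-loc-key"] = action_loc_key
    aps = {"alert": alert}
    if badge is not None:
        aps["badge"] = badge
    return {"aps": aps}


if __name__ == "__main__":
    payload = build_apns_payload(
        "Order Submitted", "You have a new order submitted.",
        badge=5, action_loc_key="VIEW",
    )
    # The provider would send this JSON string as the request body.
    print(json.dumps(payload))
```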

Android push notification


Android adopts a similar notification flow. Instead of using APNs, Firebase Cloud
Messaging (FCM) is commonly used to send push notifications to Android devices.

SMS message
For SMS messages, third-party SMS services like Twilio, Nexmo, and many others
are commonly used. Most of them are commercial services.

Email

Although companies can set up their own email servers, many of them opt for
commercial email services. SendGrid and Mailchimp are among the most popular
email services, which offer a better delivery rate and data analytics.

To recap, the following figure summarizes the providers for each notification type.


Contact info gathering flow


To design an effective notification system using AWS services, we must choose the
right data storage solution for our application’s needs. Considering the system will
process approximately 2,000 events per second under normal conditions and up to
5,000 events per second during peak times, the database must handle a high read
workload efficiently. Each event requires reading the user’s notification
preferences, making the flow read-heavy.

For this scenario, we need a database that is:

Easy to set up and maintain.

Capable of scaling seamlessly to handle increasing read demands.

Optimized for performance and cost.

To meet these requirements, we consider two AWS database options:

1. Amazon DynamoDB

A NoSQL, scalable, and highly available database built to manage large volumes
of reads and writes.

Offers robust performance but requires thoughtful schema design to optimize
queries.

2. Amazon RDS Aurora

A SQL-based, on-demand, autoscaling configuration of Amazon Aurora (Aurora Serverless).

Automatically adjusts capacity based on application needs, simplifying database
management.


When choosing between these database types, we must evaluate the following
factors:

Data Model Compatibility

Scalability and Performance

Ecosystem Compatibility

Cost

Let us now compare these databases based on these key criteria.

1. Data Model Compatibility


In our case, the data modeling requirements are relatively simple. The user
preferences can either be stored as a single row or multiple rows per user. Let’s
explore both data modeling options:

1. Single Row Per User Preference


In this model, each row would represent a distinct user preference. This
approach allows for easy addition or removal of preferences. We can store each
preference with a user identifier and filter preferences based on the “Enabled”
field when querying. The benefits of this model include:

Easier updates and flexibility in adding new preferences.

Efficient querying based on the user ID.

Single Row Per User Preference Schema


2. One Row for All User Preferences


Here, all the user preferences are stored in a single row. This means we can query
the entire set of preferences with a single read operation. However, we need to
ensure that concurrent writes to the same row do not overwrite preferences. The
key benefits are:

Fewer rows in the database, simplifying the overall structure.

A single query can return all user preferences at once, which may improve
performance.

One Row for All User Preferences Schema
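To make the two layouts above concrete, here is a minimal sketch of how the same user's preferences might look under each model, together with the read logic each one implies. The attribute names (user_id, channel, enabled, channels) are assumptions for illustration and are not taken from the schema figures.

```python
from typing import List

# Option 1: one row per (user, preference). Attribute names are hypothetical.
ROWS_PER_PREFERENCE = [
    {"user_id": "54321", "channel": "EMAIL", "enabled": True},
    {"user_id": "54321", "channel": "SMS", "enabled": True},
    {"user_id": "54321", "channel": "PUSH", "enabled": False},
]

# Option 2: one row holding every preference for the user.
SINGLE_ROW = {
    "user_id": "54321",
    "channels": {"EMAIL": True, "SMS": True, "PUSH": False},
}


def enabled_from_rows(rows: List[dict], user_id: str) -> List[str]:
    """Option 1: filter the user's rows on the 'enabled' field."""
    return sorted(
        row["channel"] for row in rows
        if row["user_id"] == user_id and row["enabled"]
    )


def enabled_from_single_row(row: dict) -> List[str]:
    """Option 2: a single read returns every preference at once."""
    return sorted(name for name, on in row["channels"].items() if on)


if __name__ == "__main__":
    print(enabled_from_rows(ROWS_PER_PREFERENCE, "54321"))
    print(enabled_from_single_row(SINGLE_ROW))
```

Both functions return the same channels; the difference lies in how many rows have to be read and how a new preference is added.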

Both schema designs work for DynamoDB and Aurora (PostgreSQL), but there are
important considerations:

Schema Flexibility: As the system evolves, adding new fields or preferences will
be necessary. In the “Single Row Per User Preference” design, we can simply add
new rows without worrying about schema migrations. In contrast, the “One Row
for All User Preferences” design in Aurora requires a schema migration whenever
a new preference column is added.

Migration Complexity: Such migrations are unlikely to occur frequently, but they
still have to be planned and run. With DynamoDB, no migration is needed in
either design, since it is a NoSQL database whose items can take on new
attributes without a schema change.

From my perspective, there is no clear winner in terms of schema management
because both options are straightforward given the simplicity of the schema.
However, DynamoDB offers the advantage of not requiring schema migrations,
which can simplify operations. As someone who prefers not to manage schema
migrations in a simple schema design, especially when there are no complex
relationships, I would lean towards DynamoDB for this particular use case.

2. Scalability and Performance


To evaluate the scalability and performance of our system, we need to determine
whether it will be read-heavy or write-heavy.
Write Operations
In our system, write operations are relatively light. We only encounter two types of
writes:

1. Account Creation: When a user creates an account, their preferences are stored.

2. Preference Updates: When a user updates their notification preferences, the
changes are recorded.

However, account creation is not a heavy write operation. Considering the typical
growth of popular applications, which might acquire 5,000 to 10,000 new users per
day, this volume is not substantial in terms of writing records. Similarly, updates to
user preferences are infrequent. Most users update their preferences only
occasionally (think of the last time you updated your preferences on Medium.com).
Even with a large user base, the overall frequency of updates remains low, so
these writes can be easily managed by any database.

Read Operations
Our system, however, is read-heavy. Each event requires reading the user
preferences to determine the appropriate notification, so the read rate tracks the
event rate: around 2,000 reads per second under normal load and up to 5,000
reads per second at peak.

Both DynamoDB and RDS Aurora can handle this volume of reads, but each has its
unique approach:

DynamoDB:
DynamoDB distributes read requests across multiple partitions based on the
partition key (user_id). Each partition can handle up to 3,000 RCUs (Read
Capacity Units). One RCU covers two eventually consistent reads per second for
an item of up to 4 KB, so, assuming eventually consistent reads and items of
4 KB or less, the required capacity is:
- Normal load: 2,000 reads ÷ 2 reads per RCU = 1,000 RCUs.
- Peak load: 5,000 reads ÷ 2 reads per RCU = 2,500 RCUs.

In a production environment, reads will likely be distributed across many
partitions, so DynamoDB can easily scale to meet the read demand. As read traffic
increases, DynamoDB will automatically create new partitions to distribute the load.
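The RCU arithmetic above can be reproduced with a small helper. This is a sketch of the standard DynamoDB sizing rule (one read unit per 4 KB of item size, with eventually consistent reads costing half as much as strongly consistent ones); the function name and defaults are illustrative, and the result of 2 reads per RCU applies only to eventually consistent reads.

```python
import math

RCU_READ_SIZE_KB = 4  # one read unit covers an item of up to 4 KB


def required_rcus(reads_per_second: int, item_size_kb: float = 4.0,
                  strongly_consistent: bool = False) -> int:
    """Estimate the read capacity units needed for a given read rate."""
    units_per_read = math.ceil(item_size_kb / RCU_READ_SIZE_KB)
    rcus = reads_per_second * units_per_read
    if not strongly_consistent:
        rcus = rcus / 2  # eventually consistent reads cost half
    return math.ceil(rcus)


if __name__ == "__main__":
    print(required_rcus(2_000))  # normal load: 1000
    print(required_rcus(5_000))  # peak load: 2500
    print(required_rcus(5_000, strongly_consistent=True))  # 5000
```

If the notification service needed strongly consistent reads, the peak requirement would double to 5,000 RCUs, which is worth keeping in mind when comparing costs.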

DynamoDB Read Distribution With Partitions

RDS Aurora:
RDS Aurora handles scaling through CPU, memory, and networking resources.
Scaling in RDS Aurora typically involves provisioning instance sizes manually or
using Aurora Auto Scaling for read replicas to handle fluctuating read
workloads. RDS Aurora is optimized to manage high read volumes with minimal
latency, providing consistent performance for demanding applications.


RDS Aurora Reader Instances

Performance and Scalability Comparison


Both databases are highly scalable and can perform well under the expected read
load:

DynamoDB typically delivers single-digit millisecond response times for GetItem
requests.

RDS Aurora also provides low-latency performance, similar to DynamoDB,
especially with its ability to scale using reader instances.

While both databases can handle the required read operations, DynamoDB shines
in automatic partitioning and horizontal scaling. On the other hand, RDS Aurora
offers excellent scaling and automatic adjustment of resources based on demand.

There’s no clear winner when it comes to scalability, as both options can handle the
load. However, for performance, we will later discuss how adding a caching layer on
top of the database implementations can further optimize the system.

3. Ecosystem Compatibility
The compatibility of our database with other services in our architecture is a crucial
factor, particularly for features like Change Data Capture (CDC), which can play a
role in implementing a caching layer for user preferences.

DynamoDB Ecosystem Compatibility


AWS DynamoDB offers a highly flexible and integrated CDC solution through
DynamoDB Streams. This service captures every modification to items in the table
(e.g., inserts, updates, and deletes) in near real-time, making it straightforward to
integrate with other AWS services such as AWS Lambda or Amazon Kinesis for
downstream processing.
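As a sketch of what such downstream processing could look like, the Lambda-style handler below applies DynamoDB Streams records to a cache. Several things here are assumptions: the cache is a plain in-memory dict standing in for a real cache, the table's partition key is named user_id, and the stream is configured to include new item images. Only a subset of DynamoDB attribute types is converted.

```python
from typing import Any, Dict

# Stand-in for a real cache; a plain dict keeps the sketch runnable.
PREFERENCE_CACHE: Dict[str, Dict[str, Any]] = {}


def _plain(value: Dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed attribute (S, N, BOOL, M, L) to Python."""
    type_tag, raw = next(iter(value.items()))
    if type_tag in ("S", "BOOL"):
        return raw
    if type_tag == "N":
        return float(raw)
    if type_tag == "M":
        return {key: _plain(item) for key, item in raw.items()}
    if type_tag == "L":
        return [_plain(item) for item in raw]
    raise ValueError(f"unsupported attribute type: {type_tag}")


def handler(event: Dict[str, Any], context: Any = None) -> int:
    """Apply each stream record to the cache; return how many were applied."""
    applied = 0
    for record in event.get("Records", []):
        change = record["dynamodb"]
        user_id = _plain(change["Keys"]["user_id"])
        if record["eventName"] == "REMOVE":
            PREFERENCE_CACHE.pop(user_id, None)
        else:  # INSERT or MODIFY carry the new item image
            PREFERENCE_CACHE[user_id] = _plain({"M": change["NewImage"]})
        applied += 1
    return applied
```

Because the handler overwrites the cached entry with the full new image, replaying the same record is harmless, which suits the at-least-once delivery of stream-triggered functions.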

Aurora Ecosystem Compatibility

In contrast, implementing CDC with Aurora is more complex. Aurora does not
natively support CDC in the same way DynamoDB does. To achieve CDC with
Aurora, you would need to use a combination of:

1. AWS Database Migration Service (DMS): To capture and stream data changes.

2. Amazon Kinesis: To process and forward those changes to other services.

This additional setup introduces complexity compared to DynamoDB’s built-in CDC
capabilities, increasing both development effort and operational overhead.

4. Cost
When evaluating the cost of DynamoDB and RDS Aurora, we need to consider
various factors, including CDC (Change Data Capture) costs, read and write
operations, and overall database pricing. Here’s a breakdown:

DynamoDB: approximately $350/month

Aurora Serverless: approximately $380/month

While the base costs are similar, DynamoDB’s pricing for additional features such as
CDC (via DynamoDB Streams) is more cost-efficient compared to Aurora. In our
case, where CDC plays a significant role, DynamoDB provides a more economical
solution.

Contact Information Maintenance Flow


For maintaining and updating contact information, the following approach is
recommended:

1. User Sign-Up or Settings Update:


When a user signs up or updates their preferences, an HTTP request is sent to
the user preferences service hosted on the EKS cluster. This service updates the
database.


2. Database Update:
Once the database is updated, there are two options for managing subsequent
data access:

CDC-Driven Cache (Recommended): Use CDC to populate a secondary database
or cache layer. This approach enhances read performance for the notification
service.

Direct Database Access: Alternatively, the notification service can directly
access the primary database for user preferences, though this may not be as
efficient.

Contact Information Maintenance Flow Architecture

Service Invocation Types


In our high-level design (refer to the High-Level Design figure), we identified the
“Invoke” step that triggers the notification service. This invocation
step is critical, as it determines how events flow into the system. There are two
primary types of invocation methods to consider: Synchronous (Sync) and
Asynchronous (Async). Let’s explore the pros and cons of each approach.

Sync Invocation:
In synchronous communication, the service triggering the notification waits for the
notification service to process the request and return a response before proceeding.

Sync invocation Flow


Pros:

Immediate Feedback: The calling service receives an immediate
acknowledgment, ensuring that the notification was processed successfully.

Simplicity: Implementing synchronous calls, often via HTTP or gRPC, is
straightforward and aligns with traditional request-response paradigms.

Cons:

Tight Coupling: Every service that needs to send notifications must be aware of
the notification service. If the way the notification service handles requests
changes, we need to revisit every service that calls it.

Fault Tolerance Issues: Network issues or downtime in the notification service
can cause cascading failures, affecting the overall system reliability.

Async Invocation:
Asynchronous communication involves the calling service sending a notification
request to a message broker or queue, allowing the notification service to process it
independently without requiring the caller to wait for a response.

Async invocation Flow

Pros:

Loose Coupling: Services operate independently, enhancing system resilience
and allowing each to scale according to its own demands.

Improved Scalability: The system can handle a higher volume of notifications,
as requests are queued and processed as resources permit, preventing
bottlenecks.

Enhanced Fault Tolerance: Message brokers can store messages until the
notification service is ready, ensuring that notifications are not lost even if the
service experiences downtime.

Cons:

Eventual Consistency: There may be a delay between triggering the notification
and its actual processing, which could be problematic for time-sensitive
operations.

Increased Complexity: Implementing asynchronous communication requires
managing message brokers, handling message delivery guarantees, and
ensuring idempotency, adding to the system’s complexity.
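The idempotency concern mentioned above can be illustrated with a small consumer that skips events it has already handled. This is a minimal sketch under stated assumptions: an in-memory queue stands in for the message broker, an in-memory set stands in for a durable deduplication store, and events carry a unique eventId field.

```python
import queue
from typing import Callable, Set


class IdempotentConsumer:
    """Processes each eventId at most once, even if the broker redelivers it."""

    def __init__(self, handle: Callable[[dict], None]) -> None:
        self._handle = handle
        self._seen: Set[str] = set()  # would be durable storage in production

    def consume(self, event: dict) -> bool:
        event_id = event["eventId"]
        if event_id in self._seen:
            return False  # duplicate delivery, skip it
        self._handle(event)
        self._seen.add(event_id)  # recorded only after successful handling
        return True


def drain(broker: "queue.Queue[dict]", consumer: IdempotentConsumer) -> int:
    """Consume everything currently queued; return the number processed."""
    processed = 0
    while True:
        try:
            event = broker.get_nowait()
        except queue.Empty:
            return processed
        if consumer.consume(event):
            processed += 1
```

The producer only enqueues the event and moves on, which is the decoupling that asynchronous invocation is meant to provide; the consumer absorbs duplicates that at-least-once delivery can produce.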

In a microservices-based architecture, asynchronous processing aligns perfectly
with the principles of loose coupling and scalability. By choosing async processing,
we gain stronger guarantees that our events will be reliably processed, even in the
face of high loads or intermittent failures.

With asynchronous processing, the notification service can subscribe to different
streams or topics that are responsible for emitting notification-related events. This
approach decouples the notification service from the event producers, allowing
teams to work independently.

Additionally, this architecture enables the notification service team to focus on:

Integrating new notification types.

Handling changes in the system.

Managing retries, error handling, and scaling independently of other services.

Async processing ensures a robust, flexible system that can scale and adapt as new
requirements emerge, making it the optimal choice for the architecture we’ve
designed.

Finally, the notification service can be implemented in one of two ways:

Lambda workers

Custom Microservices

Notification Service Responsibilities


The notification service plays a pivotal role in a microservices architecture. It is
responsible for:

Reading events from multiple sources.

Validating these events.

Templating and enriching them.

Publishing the processed events to the Fanout service.

To design a scalable and maintainable notification service, it is advisable to follow
principles like Domain-Driven Design (DDD) and Hexagonal Architecture. These
approaches allow the service to scale seamlessly by adding new inbound or
outbound adapters.

Hexagonal Architecture For Notification Service
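A minimal sketch of the ports-and-adapters idea is shown below. The class and function names are hypothetical; the in-memory publisher stands in for a real fan-out adapter, and the core logic is reduced to a single check so that the structure stays visible.

```python
from typing import List, Protocol


class EventPublisher(Protocol):
    """Outbound port: anything that can hand a processed event to the fan-out."""

    def publish(self, event: dict) -> None: ...


class NotificationCore:
    """Domain logic; it knows nothing about queues, streams, or topics."""

    def __init__(self, publisher: EventPublisher) -> None:
        self._publisher = publisher

    def process(self, event: dict) -> None:
        if "eventType" not in event:
            raise ValueError("event is missing eventType")
        self._publisher.publish(event)


class InMemoryPublisher:
    """Outbound adapter used here in place of a real fan-out adapter."""

    def __init__(self) -> None:
        self.published: List[dict] = []

    def publish(self, event: dict) -> None:
        self.published.append(event)


def queue_adapter(core: NotificationCore, messages: List[dict]) -> None:
    """Inbound adapter: unwraps transport messages and calls the core."""
    for message in messages:
        core.process(message["body"])
```

Supporting a new event source means writing another inbound adapter next to queue_adapter, and supporting a new destination means another class that satisfies EventPublisher; NotificationCore stays unchanged in both cases.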

Notification Service Logic


The notification service may be invoked by various streams or even an EventBridge
event. To ensure its logic remains consistent regardless of the event source, a
unified approach is necessary. Below are the logical steps involved in the service’s
operation:

Notification Service Logic

Event Validation: The first step is validating the structure of incoming events.
Unrecognized or malformed events are placed in a Dead Letter Queue (DLQ).
This allows the implementation of a re-drive policy to handle dropped events,
which could result from schema changes in upstream services.
Here is an example of an event generated by an upstream service:

{
  "eventType": "orderCreated",
  "eventId": "12345-abcde-67890",
  "timestamp": "2024-12-25T10:00:00Z",
  "data": {
    "orderId": "98765",
    "userId": "54321",
    "orderTotal": 150.75,
    "currency": "USD",
    "orderItems": [
      {
        "itemId": "001",
        "productName": "Wireless Mouse",
        "quantity": 1,
        "price": 25.99
      }
    ],
    "shippingAddress": {
      "name": "John Doe",
      "street": "123 Elm Street",
      "city": "Springfield",
      "state": "IL",
      "postalCode": "62704",
      "country": "USA"
    },
    "paymentStatus": "Pending",
    "orderStatus": "Created"
  },
  "source": "order-service",
  "version": "1.0"
}
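The validation step could be sketched as follows. The list of required fields is an assumption derived from the example event above rather than a published schema, and the dead letter queue is represented by a plain list instead of a real queue.

```python
from typing import List, Tuple

# Assumed from the example event; a real service would use a shared schema.
REQUIRED_FIELDS = ("eventType", "eventId", "timestamp", "data", "source", "version")


def validate_event(event: dict) -> List[str]:
    """Return a list of problems; an empty list means the event is valid."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]
    data = event.get("data")
    if isinstance(data, dict):
        if "userId" not in data:
            errors.append("missing field: data.userId")
    elif "data" in event:
        errors.append("data must be an object")
    return errors


def split_valid(events: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Separate valid events from those destined for the dead letter queue."""
    valid, dead_letter = [], []
    for event in events:
        errors = validate_event(event)
        if errors:
            # Keep the reasons with the event so a re-drive can be diagnosed.
            dead_letter.append({"event": event, "errors": errors})
        else:
            valid.append(event)
    return valid, dead_letter
```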

Aggregate Events: Events are aggregated based on user identifiers. Aggregation
minimizes database queries by grouping events related to the same user,
improving overall performance. For instance:

[
  {
    "user_id": "USER_ID_#001"
  },
  {
    "user_id": "USER_ID_#002"
  },
  ...
]
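One way to perform this aggregation is to group the validated events by user and derive one lookup key per distinct user. The sketch assumes the user identifier lives at data.userId, as in the example event, and that the preferences table is keyed by user_id.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_user(events: List[dict]) -> Dict[str, List[dict]]:
    """Group events by the user they belong to, preserving arrival order."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for event in events:
        grouped[event["data"]["userId"]].append(event)
    return dict(grouped)


def preference_keys(grouped: Dict[str, List[dict]]) -> List[dict]:
    """One lookup key per distinct user, however many events they have."""
    return [{"user_id": user_id} for user_id in grouped]
```

Three events for two users therefore produce two preference lookups instead of three.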

Fetch User Preferences: Using DynamoDB’s BatchGetItem, user preferences for
the aggregated users are fetched in as few calls as possible (a single request
can retrieve up to 100 items). This reduces latency and improves throughput.
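A hedged sketch of the batched fetch is shown below. It expects a boto3 DynamoDB client (or any object exposing a compatible batch_get_item method) to be passed in; the table name and the user_id key are assumptions. BatchGetItem accepts at most 100 keys per request and may return some keys as unprocessed, so the sketch chunks the keys and re-requests whatever was left over.

```python
from typing import Any, Dict, Iterator, List

TABLE_NAME = "user_preferences"  # hypothetical table name
MAX_KEYS_PER_CALL = 100          # BatchGetItem limit per request


def chunked(items: List[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_preferences(client: Any, user_ids: List[str]) -> List[Dict[str, Any]]:
    """Fetch preference items for the given users, retrying unprocessed keys."""
    items: List[Dict[str, Any]] = []
    keys = [{"user_id": {"S": user_id}} for user_id in user_ids]
    for batch in chunked(keys, MAX_KEYS_PER_CALL):
        request = {TABLE_NAME: {"Keys": batch}}
        # A production version would add backoff and a retry cap here.
        while request:
            response = client.batch_get_item(RequestItems=request)
            items.extend(response.get("Responses", {}).get(TABLE_NAME, []))
            request = response.get("UnprocessedKeys") or {}
    return items
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, without AWS credentials.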

Events Enrichment: Enrichment involves adding supplementary data to the
event. This could include user preferences or information fetched from external
APIs. Enriched events contain all necessary fields for downstream consumers,
such as routing details (e.g., “Email”, “SMS”).
Example enriched event:

{
  "supportedChannels": ["EMAIL", "SMS"],
  "eventType": "orderCreated",
  "eventId": "12345-abcde-67890",
  "timestamp": "2024-12-25T10:00:00Z",
  "data": {
    "orderId": "98765",
    "userId": "54321",
    "orderTotal": 150.75,
    "currency": "USD",
    "orderItems": [
      {
        "itemId": "001",
        "productName": "Wireless Mouse",
        "quantity": 1,
        "price": 25.99
      }
    ],
    "shippingAddress": {
      "name": "John Doe",
      "street": "123 Elm Street",
      "city": "Springfield",
      "state": "IL",
      "postalCode": "62704",
      "country": "USA"
    },
    "paymentStatus": "Pending",
    "orderStatus": "Created"
  },
  "source": "order-service",
  "version": "1.0"
}
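The enrichment shown above amounts to attaching the channels a user has opted into. A minimal sketch, assuming preferences arrive as a mapping from channel name to an enabled flag (the shape is an assumption, not taken from the article's schema):

```python
from typing import Dict, List


def enrich(event: dict, preferences: Dict[str, bool]) -> dict:
    """Return a copy of the event annotated with the user's enabled channels."""
    channels: List[str] = sorted(
        name for name, enabled in preferences.items() if enabled
    )
    # The original event is left untouched; a new dict is returned.
    return {"supportedChannels": channels, **event}
```

A user with every channel disabled yields an empty supportedChannels list, which downstream consumers can treat as "do not send".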

Re-structure Event: To optimize payload size and network costs, events are
restructured. Fields irrelevant to downstream consumers are removed,
producing a streamlined event payload.
Restructured event example:

{
  "supportedChannels": ["EMAIL", "SMS"],
  "eventName": "orderCreated",
  "eventGroup": "order",
  "timestamp": "2024-12-25T10:00:00Z",
  "data": {
    "orderId": "98765",
    "userId": "54321",
    "orderTotal": 150.75,
    "currency": "USD",
    "orderItems": [
      {
        "itemId": "001",
        "productName": "Wireless Mouse",
        "quantity": 1,
        "price": 25.99
      }
    ]
  },
  "source": "order-service",
  "version": "1.0"
}
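The restructuring step can be sketched as a projection from the enriched event to the slimmer shape shown above. The list of retained data fields mirrors that example; how eventGroup is derived is not stated in the article, so the lookup table here is purely hypothetical.

```python
from typing import Dict

# Fields kept in the example above; everything else under "data" is dropped.
KEPT_DATA_FIELDS = ("orderId", "userId", "orderTotal", "currency", "orderItems")

# Hypothetical mapping from event type to group; the real rule is not given.
EVENT_GROUPS: Dict[str, str] = {"orderCreated": "order"}


def restructure(enriched: dict) -> dict:
    """Project an enriched event onto the fields downstream consumers need."""
    data = enriched["data"]
    return {
        "supportedChannels": enriched["supportedChannels"],
        "eventName": enriched["eventType"],
        "eventGroup": EVENT_GROUPS.get(enriched["eventType"], "unknown"),
        "timestamp": enriched["timestamp"],
        "data": {key: data[key] for key in KEPT_DATA_FIELDS if key in data},
        "source": enriched["source"],
        "version": enriched["version"],
    }
```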

Publish Events: Finally, events are published to the Fanout component, such as
Amazon SNS. It is essential to publish in batches to reduce network
costs and improve throughput. Note that SNS message bodies are strings, so each
event must be sent as stringified JSON.
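A sketch of batched publishing is shown below. It expects a boto3 SNS client (or a compatible stub) to be passed in and uses the PublishBatch operation, which accepts up to 10 entries per request; the topic ARN is supplied by the caller, and using the entry's position as its Id is a simplification chosen for this example.

```python
import json
from typing import Any, List

MAX_ENTRIES_PER_BATCH = 10  # SNS PublishBatch limit per request


def publish_events(client: Any, topic_arn: str, events: List[dict]) -> List[dict]:
    """Publish events in batches; return the entries SNS reported as failed."""
    failed: List[dict] = []
    for start in range(0, len(events), MAX_ENTRIES_PER_BATCH):
        batch = events[start:start + MAX_ENTRIES_PER_BATCH]
        entries = [
            # Ids only need to be unique within one batch request.
            {"Id": str(offset), "Message": json.dumps(event)}
            for offset, event in enumerate(batch)
        ]
        response = client.publish_batch(
            TopicArn=topic_arn, PublishBatchRequestEntries=entries
        )
        failed.extend(response.get("Failed", []))
    return failed
```

A batch request can succeed for some entries and fail for others, so the returned list is what a caller would retry or route to a dead letter queue.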

With these steps, the notification service efficiently processes and routes events,
ensuring seamless integration into the broader microservices ecosystem.


Design deep dive


In the high-level design, we outlined the foundational aspects of the notification
system, including the types of notifications, contact information gathering, data
modeling, and overall architecture. Now, it’s time to delve deeper into specific
components of the system to refine and optimize its design.

Improved Design for the Read Side

Building the Fanout Pattern in AWS

Workers Handling Notifications

Building an Analytics Workload for Notifications

Reliability

Improved Design for the Read Side


To enhance the read efficiency of our notification system, we need to integrate a
caching layer. This will not only reduce the read load on DynamoDB but also
significantly improve response times for fetching user preferences. At the same
time, we must ensure that our cache remains consistent with updates to the
underlying data.

Let us start by revisiting our previous design:


User Preferences Services Update Reflection To Notification Service

In our current design, when a user’s preferences are updated, a Change Data
Capture (CDC) mechanism triggers a write operation to DynamoDB, where the
notification service queries preferences. Now, to add a caching layer, we place it
between the Notification Service and DynamoDB, ensuring that most read
operations can be served from the cache.

Read Side with Cache Layer On Top

Given that we are using AWS and DynamoDB, two primary caching solutions come
to mind:

1. Amazon ElastiCache

A flexible caching service that supports engines like Memcached and Redis.

Serves a wide range of use cases beyond DynamoDB.

Requires custom implementation for cache invalidation and synchronization.



2. Amazon DynamoDB Accelerator (DAX)

A fully managed, in-memory caching service specifically designed to optimize
DynamoDB reads.

Seamlessly integrates with DynamoDB, automatically handling caching, reads,
and writes.

Significantly reduces development overhead for caching logic.

The following table gives a full comparison between the two solutions:

Full Comparison Between ElastiCache and DynamoDB DAX

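For our use case, DynamoDB DAX is the optimal choice. It offloads the complexity of
implementing cache consistency, managing invalidation, and synchronizing data.
With DAX, the notification service can transparently query the cache without
additional logic, while DynamoDB remains the source of truth. Keep in mind that DAX
is a write-through cache: it is updated immediately only by writes issued through
DAX, while items written directly to DynamoDB show up in the cache once the cached
entry’s TTL expires.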
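
To make concrete what a managed cache takes off our hands, here is a minimal sketch of the read-through logic we would otherwise write ourselves (for instance on top of ElastiCache). The class name, the injected `load` callback and the in-process dictionary are all assumptions for illustration; a real deployment would use a shared cache rather than local memory.

```python
import time


class ReadThroughCache:
    """Tiny read-through cache with a TTL and explicit invalidation."""

    def __init__(self, load, ttl_seconds=300.0, clock=time.monotonic):
        self._load = load          # called on a miss, e.g. a DynamoDB GetItem
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit
        value = self._load(key)    # miss or expired: go to the source of truth
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Must be called by whatever processes preference updates."""
        self._entries.pop(key, None)
```

The `invalidate` call is the part that tends to go wrong in practice: every writer has to remember to call it, which is the synchronization burden discussed above.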


With the addition of DAX, our updated system design would look like this:

Updated Design With A Caching Layer For Reader Side

Building the Fanout Pattern in AWS


When our Notification Service processes a batch of events, it needs to aggregate
these events and fetch the corresponding user preferences. Once preferences are
retrieved, the service must publish the notifications to a durable storage solution,
where they are held until workers process them. This process of distributing data
from a single source to multiple destinations is known as the Fanout Pattern.

In our architecture, notifications are stored in AWS SQS (Simple Queue Service).
SQS offers several advantages:

Scalability: Handles high-throughput messaging with virtually unlimited
capacity.

Flexibility: Supports both ordered (FIFO) and unordered (Standard) message
processing.

Reliability: Stores messages durably until they are deleted or the queue’s retention
period (up to 14 days) expires.


Since notifications do not require strict ordering, we can use the Standard Queue to
focus on performance and scalability.

To implement the Fanout Pattern in a serverless architecture, we use AWS SNS
(Simple Notification Service) as the intermediary between the Notification Service
and multiple SQS queues. Here’s how it works:

Fanout Pattern In Our Architecture

1. Publish Messages to SNS:


The Notification Service reads batches of events, fetches user preferences, and
publishes these messages to an SNS Topic.

2. Distribute Messages to SQS:


The SNS Topic fans out these messages to multiple SQS queues, with each queue
representing a specific type of notification (e.g., SMS, Email, Push
Notifications).

3. Filter Policies for Fine-Grained Control:


To control which messages each queue receives, we use SNS Filter Policies.
These policies evaluate message attributes and only forward messages that meet
specific criteria to the corresponding queue.
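
As an illustration of step 3, the sketch below shows a filter policy for the Email queue together with a deliberately simplified matcher. The attribute name `supportedChannels` is taken from the event examples, while the matcher only handles exact string matches and is not a reimplementation of the full SNS filter policy grammar.

```python
import json

# Filter policy attached to the Email queue's subscription: forward a message
# only if its "supportedChannels" attribute contains "EMAIL".
EMAIL_QUEUE_FILTER_POLICY = {"supportedChannels": ["EMAIL"]}


def matches(policy: dict, attributes: dict) -> bool:
    """Simplified check: every policy key needs at least one matching value."""
    for name, allowed_values in policy.items():
        attribute = attributes.get(name)
        if attribute is None:
            return False
        if attribute["DataType"] == "String.Array":
            values = json.loads(attribute["StringValue"])
        else:
            values = [attribute["StringValue"]]
        if not any(value in allowed_values for value in values):
            return False
    return True
```

With this policy, an event whose channels are `["EMAIL", "SMS"]` reaches both the Email and SMS queues, while a push-only event reaches neither.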

Workers Handling Notifications



In our serverless architecture, there are multiple AWS services available to process
queue data, such as Lambda, ECS, and EC2. However, we aim to maintain a fully
serverless design with minimal complexity. For this reason, AWS Lambda is the
preferred choice for handling notifications because:

Seamless Integration: Lambda has built-in integration with SQS, requiring
minimal configuration.

Scalability: It scales automatically with demand, processing messages as they
arrive in the queue.

Cost Efficiency: Pay only for the compute time used, without the overhead of
provisioning and managing servers.

Fanout Architecture With Workers

Let us zoom in on the “Workers” architecture to understand how it works. The
following diagram shows the flow of a single worker:


Worker Architecture Flow

Here is a step-by-step breakdown of the logic implemented by each Lambda worker:

1. Item Retrieval from the Queue

Each Lambda function fetches messages from the SQS queue dedicated to its
notification type (e.g., SMS, Email, Push Notifications).

Messages are processed in batches or individually, depending on the queue’s
configuration.

2. Template Loading

Notification templates are stored within the Lambda environment’s file system
or loaded from a centralized store such as Amazon S3.

Templates are customized dynamically based on user preferences or message
attributes.

3. Writing Notifications to DynamoDB

Processed notifications are written to a DynamoDB table.

This table serves as a source for analytical workloads, enabling the generation of
insights like delivery success rates or user engagement.


4. Sending Notifications

Notifications are dispatched to users via 3rd-party APIs or SDKs, such as Twilio
for SMS, SES for emails, or Firebase for push notifications.

Lambda ensures secure communication with these services using IAM roles or
environment variables for API keys.

5. Removing Processed Records

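Once notifications are successfully sent, they are removed from the queue. When
the worker is triggered through Lambda’s SQS event source mapping, Lambda deletes
the messages automatically after the function returns successfully; a worker that
polls the queue itself deletes them using the SQS SDK.

This step prevents reprocessing of already-handled messages.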
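
The five steps above can be sketched as a single batch handler. This is a minimal sketch under several assumptions: the SNS subscription uses raw message delivery (so the SQS body is the event JSON itself), partial batch responses are enabled on the event source mapping, and `send` and `record_notification` are hypothetical callbacks standing in for the provider SDK and the DynamoDB write.

```python
import json
from string import Template

# Hypothetical templates bundled with the function; S3 would work as well.
TEMPLATES = {
    "orderCreated": Template("Your order $orderId for $orderTotal $currency has been created."),
}
DEFAULT_TEMPLATE = Template("You have a new notification.")


def render(event: dict) -> str:
    """Step 2: pick a template and fill it from the event data."""
    template = TEMPLATES.get(event["eventName"], DEFAULT_TEMPLATE)
    return template.safe_substitute(event["data"])


def handle_batch(sqs_event: dict, send, record_notification) -> dict:
    """Process one SQS batch and report which messages failed."""
    failures = []
    for record in sqs_event["Records"]:          # step 1: items from the queue
        try:
            event = json.loads(record["body"])
            text = render(event)
            record_notification(event, text)      # step 3: write to DynamoDB
            send(event["data"]["userId"], text)   # step 4: call the provider
        except Exception:
            # Step 5 in reverse: only failed messages stay in the queue.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning `batchItemFailures` means one bad record does not force the whole batch to be retried, which reduces duplicate sends for the records that already succeeded.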

While this design represents the happy path, failures can occur at various stages:

1. Item Retrieval Failure: Handled by SQS retry policies. Messages remain in the
queue until processed successfully or moved to the Dead Letter Queue (DLQ).

2. Template Loading Failure: Mitigated by fallback mechanisms, such as default
templates stored locally or in S3.

3. Writing to DynamoDB Failure: DynamoDB errors (e.g., throttling) can cause
failures. Implementing exponential backoff retries with the AWS SDK can
resolve transient issues.

4. Notification Delivery Failure: Failures during delivery to 3rd-party services (e.g.,
network issues) result in retries by SQS.

Multiple deliveries are possible if failures occur at this stage, which may result in
redundant notifications. Workers can reduce duplicates by tracking which messages
they have already handled, but fully idempotent delivery is only possible if the
3rd-party service offers such a feature (for example, an idempotency key).
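
Two of the mitigations above, exponential backoff and worker-side deduplication, can be sketched as follows. Both helpers are illustrative assumptions: the in-memory `set` stands in for a durable store (such as a conditional write to DynamoDB), and a crash between sending and recording can still produce a duplicate.

```python
import random
import time


def retry_with_backoff(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Retry a callable, waiting longer (with jitter) after each failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let SQS redeliver the message
            # Full jitter: random wait between 0 and base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


class IdempotentSender:
    """Skip notifications whose idempotency key was already delivered."""

    def __init__(self, send, seen=None):
        self._send = send
        self._seen = seen if seen is not None else set()

    def deliver(self, idempotency_key, recipient, text) -> bool:
        if idempotency_key in self._seen:
            return False  # duplicate delivery from the queue, do nothing
        self._send(recipient, text)
        self._seen.add(idempotency_key)
        return True
```

A natural idempotency key here would be derived from the event itself (for example the event ID plus the channel), so that redeliveries of the same message map to the same key.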

Building an Analytics Workload for Notifications


To enable robust analytical capabilities for our notification system, we leverage the
“DynamoDB Data Export to Amazon S3” feature, which is a fully managed solution
designed to efficiently export data from DynamoDB tables to Amazon S3 at scale.
This approach allows us to transform operational data into actionable insights,
ensuring our system remains both efficient and data-driven.

System Architecture With Analytic Workload

The export functionality supports two modes: full export and incremental export.
Full export enables exporting a complete snapshot of the DynamoDB table from any
point within the defined Point-In-Time Recovery (PITR) window to an S3 bucket.
This is particularly useful for establishing comprehensive datasets or creating
periodic backups. Incremental export, on the other hand, focuses on changes,
allowing us to export only the data that has been updated, deleted, or added within a
specified time range. This method is optimal for continuous updates to analytical
datasets without redundant data duplication.

To use this feature, PITR must be enabled on the DynamoDB table. Once configured,
exports can be initiated through the AWS Management Console, AWS CLI, or
DynamoDB API, providing flexibility and integration into existing workflows.
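
As a sketch of the API route, the helper below assembles the parameters for DynamoDB's `ExportTableToPointInTime` operation for either export mode. The helper name and its arguments are assumptions; the table ARN, bucket and prefix are placeholders supplied by the caller.

```python
from datetime import datetime, timezone


def build_export_request(table_arn, bucket, prefix, start=None, end=None) -> dict:
    """Build kwargs for a full export, or an incremental one if a window is given."""
    request = {
        "TableArn": table_arn,
        "S3Bucket": bucket,
        "S3Prefix": prefix,
        "ExportFormat": "DYNAMODB_JSON",
    }
    if start is None and end is None:
        request["ExportType"] = "FULL_EXPORT"
        return request
    if start is None or end is None or start >= end:
        raise ValueError("incremental exports need a start that is before the end")
    request["ExportType"] = "INCREMENTAL_EXPORT"
    request["IncrementalExportSpecification"] = {
        "ExportFromTime": start,
        "ExportToTime": end,
        "ExportViewType": "NEW_AND_OLD_IMAGES",
    }
    return request
```

With boto3 installed, the result could be passed on as `boto3.client("dynamodb").export_table_to_point_in_time(**request)`; a nightly job would typically export the previous day's window incrementally.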

By exporting notification data to Amazon S3, we can unlock the ability to perform
advanced analytical queries. For example, we can track the volume of notifications
sent within a specific time frame, identify the most frequently used notification
types, or generate detailed business intelligence dashboards for stakeholders. This
architecture seamlessly integrates operational and analytical workflows, ensuring
that data-driven decision-making becomes a core component of our notification
system.
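
For small datasets, a query such as "notifications per type" can even be answered directly from the exported files. The sketch below assumes a full export in DynamoDB JSON format (one `{"Item": ...}` object per line, gzip-compressed) and a hypothetical string attribute named `channel` on each notification item.

```python
import gzip
import json
from collections import Counter


def read_export_file(path):
    """Yield the lines of one gzip-compressed export data file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from handle


def count_by_channel(lines) -> Counter:
    """Count exported notification items per channel."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        item = json.loads(line)["Item"]
        # Attribute values are typed, e.g. {"S": "EMAIL"} for a string.
        counts[item["channel"]["S"]] += 1
    return counts
```

Calling `count_by_channel(read_export_file(path))` on a downloaded data file gives the totals for that file; at larger volumes the same question is better asked through a query engine over S3.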

Reliability
Ensuring the reliability of our notification system is a critical aspect of the overall
architecture. The system must guarantee that no data is lost while maintaining
flexibility to handle delays or reordering of notifications. To achieve this, we
implement a multi-tiered approach leveraging durable storage, retry mechanisms,
and Dead Letter Queues (DLQs).

1. Event Listeners: The notification service listens for incoming events from topics
or streams, and processes them. It also publishes the processed data to an SNS
topic. During this stage, errors can occur if:

The SNS service is unavailable.

A network issue occurs when reading from DynamoDB or publishing to SNS.

Event Listeners

To mitigate these risks, the event source (stream or topic) retains unprocessed
records until an acknowledgment is received, ensuring no data is lost. Additionally,
a DLQ is configured at the stream/topic level to capture records that exceed the
maximum retry attempts, allowing for later inspection and reprocessing.

2. Fanout Component: The SNS topic acts as the fanout mechanism, delivering
notifications to multiple SQS queues. Failures here include:

Message delivery retries exceeding the configured threshold.

In such cases, SNS provides built-in retry mechanisms. For undeliverable messages,
DLQs are configured for each subscriber. These DLQs capture messages that cannot
be successfully delivered after all retries, preserving them for further analysis.

Fanout Component

3. Worker Stage: Lambda workers fetch records from SQS queues, process them,
and perform tasks such as sending notifications via third-party APIs or writing
results to the database. Failures may arise from:

Transient network issues.

Persistent errors that cause a record to fail multiple processing attempts.


To address these, SQS automatically retries failed messages. If a record cannot be
successfully processed after a configurable number of attempts, it is moved to a
DLQ attached to the queue. This ensures that problematic records are retained for
review by expert teams while maintaining system flow.
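
The DLQ wiring for the fanout and worker stages comes down to a redrive policy on each SNS subscription and on each SQS queue. The helpers below are a sketch that only builds the attribute payloads; the function names are assumptions, and the ARNs are placeholders supplied by the caller.

```python
import json


def sqs_redrive_attributes(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Queue attributes that move a message to the DLQ after N failed receives."""
    policy = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    return {"RedrivePolicy": json.dumps(policy)}


def sns_subscription_redrive(subscription_arn: str, dlq_arn: str) -> dict:
    """Arguments that attach a DLQ to one SNS subscription."""
    return {
        "SubscriptionArn": subscription_arn,
        "AttributeName": "RedrivePolicy",
        "AttributeValue": json.dumps({"deadLetterTargetArn": dlq_arn}),
    }
```

With boto3, these would be applied through `sqs.set_queue_attributes(QueueUrl=..., Attributes=sqs_redrive_attributes(...))` and `sns.set_subscription_attributes(**sns_subscription_redrive(...))`. The `max_receive_count` default of 5 is an arbitrary example value.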

Worker Stage

Incorporating these reliability mechanisms, the architecture now ensures that:

No data is lost, with unprocessed records captured at each critical failure point.

Failed messages are transparently retried within the system, reducing manual
intervention.

DLQs provide a safety net for records that exceed retry thresholds, enabling
post-mortem analysis and troubleshooting.


Updated Architecture With More Reliability

Scalability Deep Dive


In our final design, scalability plays a central role in ensuring the system can handle
varying workloads without bottlenecks. Let’s explore the scalability of each
architectural component and address its potential limitations.
Event Receiver Flow


Event Receiver Flow

The event receiver is critical for ingesting data from multiple streams or topics. Our
architecture leverages Kubernetes for deploying the notification service, providing
the ability to scale horizontally based on demand. Here’s how scalability is achieved:

Topic/Stream Scaling: Each topic or stream is divided into partitions (or shards
in some stream implementations), which can be increased dynamically as traffic
grows. This ensures that the event ingestion system can handle surges in event
throughput.

Notification Service Scaling: The notification service is designed to scale
horizontally. For every partition or shard in a stream, a new instance of the
notification service can be added to handle incoming events. This dynamic
scaling ensures that the service can keep up with the event rate.


Partition/Shards on Event Stream Service

Data Layer: DynamoDB and DAX


The DynamoDB database, coupled with DAX (DynamoDB Accelerator), ensures low-
latency reads and sufficient write throughput for the notification system.

DynamoDB Scalability: In on-demand capacity mode, or with auto scaling enabled on
provisioned capacity, DynamoDB adjusts its read and write capacity to match
workload demands. This makes it an excellent choice for handling unpredictable
traffic spikes.

DAX Accelerator: DAX provides an in-memory cache to reduce DynamoDB
query load, improving performance for frequently accessed data. AWS
recommends using at least three nodes in a DAX cluster for production to
ensure fault tolerance and scalability.
Fanout Workflow


Fanout Workflow

When the notification service processes events from a topic or stream, it enriches
them by fetching necessary details like user preferences or other relevant
information from the database. After gathering and preparing a batch of records,
the service sends these messages to Amazon SNS (Simple Notification Service),
which acts as the intermediary for distributing notifications.
Approaches for Sending Records to SNS
1. Send Each Record Separately in Parallel: Using the Publish API command, the
service can send individual messages to the SNS topic. This approach offers
simplicity and ensures each message is processed independently, but it may lead
to higher costs due to multiple API calls.

2. Send a Batch of Records (Preferred for Improved Throughput): By leveraging
the PublishBatch API command, the service can publish up to 10 messages in a
single request. Batch publishing optimizes throughput, reduces latency, and
minimizes costs, making it an ideal approach for high-traffic systems.
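
A minimal sketch of the batching approach: entries are split into groups of at most 10 and each group is sent with one `publish_batch` call. The SNS client is passed in (for example `boto3.client("sns")`), and the helper names are assumptions for illustration.

```python
SNS_MAX_BATCH_SIZE = 10  # PublishBatch accepts at most 10 entries per request


def chunked(items: list, size: int):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def publish_in_batches(sns_client, topic_arn: str, entries: list) -> list:
    """Publish all entries and return the Ids that SNS reported as failed."""
    failed_ids = []
    for batch in chunked(entries, SNS_MAX_BATCH_SIZE):
        response = sns_client.publish_batch(
            TopicArn=topic_arn,
            PublishBatchRequestEntries=batch,
        )
        # A batch call can succeed as a whole while single entries fail.
        failed_ids.extend(failure["Id"] for failure in response.get("Failed", []))
    return failed_ids
```

The returned Ids should be retried or sent to a DLQ, since a successful HTTP response does not mean every entry was accepted. The combined payload of a batch is also size-limited, so very large events may need smaller batches.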

SNS Throughput Considerations

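Standard Topics: Support up to 30,000 published messages per second in the
highest-throughput Regions, such as US East (N. Virginia). This is a soft,
Region-dependent quota, so the default is lower in other Regions.

FIFO Topics: Support up to 3,000 messages per second or 20 MB per second,
whichever limit is reached first.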

Given these capabilities, Standard topics can handle the expected workload of the
fanout workflow, ensuring reliable and efficient message distribution.

If we want to use FIFO topics, we can add a distribution layer before the topics that
is responsible for routing each event to one of multiple FIFO topics, which in turn
route the events to the queues.


Distribution Layer For SNS Routing For FIFO Topics To Handle Incremental Load

Implementing a consistent hashing layer can add a lot of complexity to your design.

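
To give a feel for what such a distribution layer involves, here is a compact consistent hash ring that maps a routing key (for example a user ID) to one of several FIFO topics. The class, the topic names and the number of virtual nodes are assumptions for this sketch.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        ring = []
        for node in nodes:
            # Each node is placed on the ring many times to even out the load.
            for index in range(replicas):
                ring.append((self._hash(f"{node}#{index}"), node))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def route(self, key: str) -> str:
        """Return the node owning the first ring point after the key's hash."""
        index = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[index % len(self._ring)][1]
```

The benefit over `hash(key) % topic_count` is that adding a topic only moves the keys that land on the new topic, so per-key ordering is disturbed for a fraction of users rather than for nearly all of them.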
Worker and Analytic Flow

Worker and Analytic Flow

1. SQS Queue Types and Lambda Scaling:


FIFO Queues: Lambda processes each MessageGroupId with at most one concurrent
invocation, ensuring ordered processing within each group. Concurrency therefore
scales with the number of active message groups, which is determined by how the
producer assigns MessageGroupId values.

Standard Queues: Lambda scales up to the configured concurrency limit for the
queue. For example, if you set a limit of 100 concurrent Lambdas for an “Email”
SQS queue, Lambda scales accordingly as messages arrive.

2. Concurrency Configuration and Third-Party Quotas:

To align with the capacity of your third-party notification provider, configure
Lambda concurrency based on their limits.

For example:
- If the provider allows sending 1,000 messages per second.
- If Lambda processes 10 messages in a single batch.
- If each invocation takes roughly one second to complete.
- The maximum concurrency setting for the queue should be 100 to avoid
exceeding provider quotas and risking throttling.
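
The arithmetic behind that number can be written down as a small helper. It is a back-of-the-envelope sketch: the function name is hypothetical, and the invocation duration is an assumption you would replace with a measured value.

```python
import math


def max_concurrency(provider_rate_per_second: float,
                    batch_size: int,
                    invocation_seconds: float = 1.0) -> int:
    """Largest worker count that stays within the provider's rate limit."""
    # One worker sends `batch_size` messages every `invocation_seconds`.
    per_worker_rate = batch_size / invocation_seconds
    return math.floor(provider_rate_per_second / per_worker_rate)
```

With the numbers from the example, `max_concurrency(1000, 10)` gives 100. If invocations finish in half a second, each worker sends twice as fast and the safe limit drops to 50.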

3. Batch Processing Optimization:

Batch processing reduces the number of API calls to SQS and improves cost
efficiency. This approach is particularly beneficial for high-throughput systems.
Analytic Flow with DynamoDB and S3
1. Data Writing: Lambda workers persist notification data in DynamoDB for both
operational needs and analytical workloads. DynamoDB’s scalability ensures it
can handle high write volumes without bottlenecks.

2. Exporting to S3: DynamoDB’s “Data Export to Amazon S3” feature enables
seamless transfer of table data to S3.
This export operates independently of your table’s read capacity, ensuring it
does not impact real-time application performance.

3. Analytical Capabilities: Once data is exported to S3, it can be processed using
services like Amazon Athena, Amazon Redshift, or other analytics tools.
This enables querying data trends, generating usage metrics, and creating
robust business intelligence dashboards.


Possible System Improvements and Extensibility


In this section, we want to highlight some extensibility features that can be
implemented in our architecture. We cover the following topics:

Scheduled System For Sending Notifications

Priority Queues For Notifications

Scheduled System For Sending Notifications


To understand what this architecture offers, let us take an example. Assume we are
running this notification service on an e-commerce site, and for “New Year” or
“Black Friday” we want to send a sale or promotion email to all users who have
email notifications enabled. For that, we want to design a scheduled system where we
can select a specific notification type (EMAIL, SMS, etc.), select a specific
schedule, and notify all users who have opted in to this notification type when the
schedule runs.

For this, we want to use a service called “EventBridge Scheduler”, which is a
serverless scheduler that allows you to create, run, and manage tasks from one
central, managed service. With EventBridge Scheduler, you can create schedules
using cron and rate expressions for recurring patterns, or configure one-time
invocations. You can set up flexible time windows for delivery, define retry limits,
and set the maximum retention time for failed API invocations.

EventBridge Scheduler

Let us look at how the scheduler flow works:

1. An admin uses a UI-based tool to specify options, such as the notification type,
the notification template, and the schedule.


2. Once the admin submits their request, an EventBridge Scheduler schedule is created
with the specified configurations using the AWS SDK. A custom payload is included,
containing the admin’s configurations:
{
"notification_target": "EMAIL",
"notification_template": "new_year_sales"
}

3. EventBridge Scheduler invokes a Lambda function with the custom payload.

4. The Lambda function reads the payload and queries DynamoDB indexes to fetch
users with the corresponding preferences.

5. The Lambda function either publishes to an SNS Topic or sends messages directly
to an SQS queue.

6. Once the data is inserted into the queue, workers can pick it up and
process/deliver the notifications.
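
Step 2 of this flow can be sketched as a helper that assembles the arguments for EventBridge Scheduler's `CreateSchedule` operation. The helper name and parameters are assumptions for illustration; the Lambda and IAM role ARNs are placeholders, and the payload keys follow the example above.

```python
import json
from datetime import datetime


def build_schedule_request(name: str, run_at: datetime, lambda_arn: str,
                           role_arn: str, target: str, template: str) -> dict:
    """Build kwargs for a one-time schedule that invokes the fan-out Lambda."""
    return {
        "Name": name,
        # One-time schedules use the at(yyyy-mm-ddThh:mm:ss) expression.
        "ScheduleExpression": f"at({run_at.strftime('%Y-%m-%dT%H:%M:%S')})",
        "ScheduleExpressionTimezone": "UTC",
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            "Arn": lambda_arn,
            "RoleArn": role_arn,  # role the scheduler assumes to invoke the target
            "Input": json.dumps({
                "notification_target": target,
                "notification_template": template,
            }),
        },
    }
```

With boto3, the result could be passed on as `boto3.client("scheduler").create_schedule(**request)`. A recurring campaign would use a `cron(...)` or `rate(...)` expression in place of `at(...)`.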

The architecture would look something similar to the following diagram:


Scheduled System For Sending Notifications Architecture

To enable efficient querying, we use DynamoDB’s Global Secondary Indexes (GSI).
Assuming a “Single Row Per User Preference” schema design, we configure:

Partition Key (PK): contact_type (e.g., EMAIL, SMS).

Sort Key (SK): user_id.


Single Row Per User Preference Schema

The Lambda function reads the notification_target field from the custom payload,
queries the GSI using the contact_type as the PK, and reads the matching users page
by page. It then publishes the results to SNS or SQS.
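
The paginated read can be sketched as a generator over the DynamoDB `Query` API. The client is passed in (for example `boto3.client("dynamodb")`), the table and index names are placeholders, and the attribute names follow the schema described above.

```python
def iter_user_ids(dynamodb_client, table_name: str, index_name: str, contact_type: str):
    """Yield the user_id of every user opted in to the given contact type."""
    kwargs = {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "contact_type = :t",
        "ExpressionAttributeValues": {":t": {"S": contact_type}},
    }
    while True:
        page = dynamodb_client.query(**kwargs)
        for item in page["Items"]:
            yield item["user_id"]["S"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # no more pages
        # Continue where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Because this is a generator, the Lambda can publish each page's users as it goes instead of holding the full audience in memory. For very large audiences, the work may need to be split across several invocations to stay within the Lambda timeout.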

This architecture simplifies scheduling notifications and ensures scalability for
high-volume systems.

Priority Queues For Notifications


To enhance how notifications are handled, we can set up a priority queue system.
For instance, a priority queue can be used for email notifications, allowing us to
assign a score to each user and determine the importance of their notifications.
Based on this score, notifications can be placed into specific priority queues.


Priority Queue Pattern

For each priority queue, we configure a specific Lambda function with concurrent
execution limits. We use SNS filter policies and add a priority field to the event
payload passed to SNS. This ensures that events are routed to the appropriate queue
based on their priority.
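
A minimal sketch of that routing: a user score is mapped to a priority tier, the tier travels as an SNS message attribute, and each priority queue subscribes with a matching filter policy. The thresholds, tier names and queue names below are hypothetical.

```python
# Hypothetical score thresholds for the priority tiers.
HIGH_THRESHOLD = 80
MEDIUM_THRESHOLD = 40


def priority_for(score: int) -> str:
    """Map a user score to a priority tier."""
    if score >= HIGH_THRESHOLD:
        return "HIGH"
    if score >= MEDIUM_THRESHOLD:
        return "MEDIUM"
    return "LOW"


def priority_attribute(score: int) -> dict:
    """Message attribute added to the event before it is published to SNS."""
    return {"priority": {"DataType": "String", "StringValue": priority_for(score)}}


# One filter policy per email priority queue (queue names are placeholders).
FILTER_POLICIES = {
    "email-high": {"supportedChannels": ["EMAIL"], "priority": ["HIGH"]},
    "email-medium": {"supportedChannels": ["EMAIL"], "priority": ["MEDIUM"]},
    "email-low": {"supportedChannels": ["EMAIL"], "priority": ["LOW"]},
}
```

Giving the high-priority queue's Lambda a larger concurrency limit than the low-priority one is what turns these separate queues into an actual priority scheme.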

The final architecture with a priority queue implementation looks something like
this:

Updated Architecture With Priority Queue

For more details on implementing priority queues, please refer to my other blog
post.

APPENDIX A (Additional Design With EventBridge Pipes)


In this appendix, I want to share another approach for building the notification
system using a recently announced feature in AWS EventBridge: EventBridge Pipes.

EventBridge Pipes are designed for point-to-point integrations. Each pipe processes
events from a single source and delivers them to a single target. Additionally, Pipes
support advanced transformations and event enrichment before delivery.


EventBridge Pipes Architecture

EventBridge Pipes can integrate with different targets or destinations, enabling the
following architectural designs:

1. Stream/Topic → EventBridge Pipe → SNS Topic

2. Stream/Topic → EventBridge Pipe → SQS

By using EventBridge Pipes, we eliminate the need to provision and manage code
for a separate “Notification Service.” Since EventBridge is fully serverless, its
features allow us to achieve our notification requirements seamlessly.

Stream/Topic → EventBridge Pipe → SNS Topic Architecture

One potential architecture utilizes an SNS Topic as the target of the Pipe. This design
enables reliable message distribution to multiple subscribers.


Alternatively, we can skip the SNS Topic and use an SQS queue directly as the
target. However, this approach has a limitation: for each queue associated with a
specific stream, you must create a separate pipe. If there are only 2–3 queues, this
isn’t a significant issue. However, as the number of queues grows, every additional
pipe is another consumer reading from the same stream, which could introduce
challenges at the stream level.

The architecture in this case would look like this:

Stream/Topic → EventBridge Pipe → SQS Architecture

Handling User Preferences with Enrichment


You may wonder how to fetch and apply user preferences before invoking the SQS
queue. This is addressed during the enrichment stage. EventBridge Pipes allow
batch reading, enabling operations on each record. Filtering can also be applied
during this stage to ensure only relevant events proceed.
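
An enrichment step of this kind could be a Lambda function that receives the batch as a list and returns the list to forward. The sketch below assumes the pipe has already shaped each record like the events used in this article, and `get_preferences` is a hypothetical lookup (for example against DynamoDB or DAX) that returns the user's enabled channels.

```python
def enrich(events: list, get_preferences) -> list:
    """Attach each user's channels and drop events nobody wants to receive."""
    enriched = []
    for event in events:
        channels = get_preferences(event["data"]["userId"])
        if not channels:
            continue  # user opted out of everything: filter the event out
        enriched.append({**event, "supportedChannels": sorted(channels)})
    return enriched
```

A Lambda entry point would then be a thin wrapper that calls `enrich(events, lookup)` and returns the result, so the pipe only delivers events that carry routing details.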
Production Considerations
I have not yet used this architecture in a production environment, so I cannot
definitively comment on its flexibility, cost efficiency, or performance. However, it
is worth keeping in mind for future use cases, especially when exploring options for
simplifying your notification system design.

Conclusion
In this comprehensive blog, we’ve delved into various aspects of system
architecture, focusing on data modeling, caching, and handling high volumes of
reads. We explored a full architecture that can be used to build your next
production-ready notification system.

Moreover, we concluded the blog with a thorough examination of scalability
strategies, emphasizing the ability to scale each component independently. By
leveraging managed services and implementing best practices, our systems are
engineered to be highly scalable and available, capable of accommodating changing
demands and ensuring seamless performance.

I encourage you to take your time to review the blog thoroughly. If you have any
questions or suggestions for further improvement, please feel free to share your
thoughts in the comments section. Your feedback is invaluable in enhancing the
content and providing greater value to our readers.

Follow Me For More Content


If you made it this far and want to receive more content like this, make sure to
follow me on Medium and on LinkedIn.


Written by Joud W. Awad



3d ago 125 4

Codingwinner

From $0 to a $32.5M Exit: How I Bootstrapped and Sold My B2B SaaS


Company
My article is open to everyone; non-member readers can click this link to read the full text

6d ago 292 7

In Stackademic by Crafting-Code

15 Linux Command Line Hacks Every Programmer Must Know


https://medium.com/@joudwawad/notification-system-architecture-with-aws-968103c2c730 61/62
1/5/25, 6:16 PM Notification System Architecture With AWS | By Joud W. Awad | Medium

Code Faster, Command Smarter

Dec 27, 2024 472 10

See more recommendations

https://medium.com/@joudwawad/notification-system-architecture-with-aws-968103c2c730 62/62

You might also like