Notification System Architecture With AWS _ By Joud W. Awad _ Medium
In this blog post, we will explore how to design and implement a fully production-
ready serverless notification system using AWS services. This architecture will
enable various types of notifications to be sent to customers in response to specific
events within a system.
Designed for scalability, the system will support large-scale applications, handling
hundreds or even thousands of notifications per second. We will leverage AWS
services to build this solution and explore different design patterns, evaluating them
based on cost, performance, and other key considerations.
https://medium.com/@joudwawad/notification-system-architecture-with-aws-968103c2c730 1/62
1/5/25, 6:16 PM Notification System Architecture With AWS | By Joud W. Awad | Medium
Our goal is to design a system capable of handling a high volume of notifications per
second while ensuring extensibility to accommodate new notification types as the
system evolves. The key functional requirements for this system include:
Non-functional Requirements
The business objectives guide us in identifying several critical non-functional
requirements for the notification system:
1. Scalability
The system must efficiently scale to handle an increasing volume of
notifications, both incoming and outgoing.
2. High Availability
To ensure uninterrupted service, the system must maintain high availability at
all times.
3. Reliability
The system should guarantee the delivery of all notifications without any loss.
4. Extensibility
The architecture should be flexible, requiring minimal changes to support new
notification types.
5. Cost Optimization
Given the heavy usage of the system, cost efficiency is crucial. The design
should focus on leveraging the most cost-effective AWS services and features to
minimize operational expenses.
Back-of-the-Envelope Estimation
The notification system we are designing operates as a single service within a
microservices architecture. In our ecosystem, there are approximately 150 services
running concurrently, each generating events. However, not all events are relevant
to notifications. After analyzing the services, we identified that around 20 services
are responsible for generating notification-related events.
These 20 services collectively produce approximately 2,000 events per second under
normal conditions. However, during peak usage scenarios — such as promotional
campaigns, flash sales, or system-wide updates — event generation can surge
significantly. Based on historical data and projections, we estimate peak loads could
reach up to 5,000 events per second. The system must be designed to handle these
peaks without degradation in performance.
To meet business needs, the system should aim for low latency in processing and
delivering notifications. Notifications should ideally be delivered to end customers
within 1 second of the event being generated. In extreme peak scenarios, a slight
increase in latency — up to 5 seconds — may be acceptable, but only as a fallback
during sustained surges.
High-level design
This section outlines the high-level design of a notification system that supports
various notification types, including iOS push notifications, Android push
notifications, SMS messages, and email. The design is structured into the following
components:
1. Event Invocation
The notification service must be triggered by the relevant events generated
within the system.
Device token: This is a unique identifier used for sending push notifications.
{
"aps":{
"alert":{
"title":"Order Submitted",
"body":"You have a new order submitted, check the order details for mo…",
"action-loc-key":"VIEW"
},
"badge":5
}
}
SMS message
For SMS messages, third-party SMS services like Twilio, Nexmo, and many others
are commonly used. Most of them are commercial services.
Email
Although companies can set up their own email servers, many of them opt for
commercial email services. Sendgrid and Mailchimp are among the most popular
email services, which offer a better delivery rate and data analytics.
To recap, the following figure summarizes all of the providers:
1. Amazon DynamoDB
A NoSQL, scalable, and highly available database built to manage large volumes
of reads and writes.
When choosing between these database types, we must evaluate the following
factors:
Ecosystem Compatibility
Cost
A single query can return all user preferences at once, which may improve
performance.
Both schema designs work for DynamoDB and Aurora (PostgreSQL), but there are
important considerations:
Schema Flexibility: As the system evolves, adding new fields or preferences will
be necessary. In the “Single Row Per User Preference” design, we can simply add
new rows without worrying about schema migrations. In contrast, the “One Row
for All User Preferences” design would require a schema migration if new fields
are added.
Migration Complexity: For the “One Row for All User Preferences” approach,
schema migrations may be needed when adding new preferences, although this
is unlikely to occur frequently. On the other hand, with DynamoDB, no
migration is needed since it’s a NoSQL database that can easily adapt to changes
in structure.
1. Account Creation: When a user creates an account, their preferences are stored.
However, account creation is not a heavy write operation. Considering the typical
growth of popular applications, which might acquire 5,000 to 10,000 new users per
day, this volume is not substantial in terms of writing records. Similarly, updates to
user preferences are infrequent. Most users update their preferences only
occasionally — think of the last time you updated your preferences on Medium.com!
Even with a large user base, the frequency of updates remains relatively low, so
these writes can be easily managed by any database.
Read Operations
Our system, however, is read-heavy. Every event requires reading the user
preferences to determine the appropriate notification, so the read rate tracks the
event rate: around 2,000 reads per second under normal load and around 5,000
reads per second at peak.
Both DynamoDB and RDS Aurora can handle this volume of reads, but each has its
unique approach:
DynamoDB:
DynamoDB distributes read requests across multiple partitions based on the
partition key (user_id). Each partition can handle up to 3,000 RCUs (Read
Capacity Units). One RCU covers two eventually consistent reads per second for
an item of up to 4 KB, so, assuming eventually consistent reads and items of
4 KB or less:
- Normal load: 2,000 reads ÷ 2 reads per RCU = 1,000 RCUs.
- Peak load: 5,000 reads ÷ 2 reads per RCU = 2,500 RCUs.
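The RCU arithmetic above can be reproduced with a small helper. This is only a sketch: the function name and parameters are illustrative, and it assumes provisioned capacity where one RCU covers one strongly consistent read (or two eventually consistent reads) per second for each 4 KB of item size.

```python
import math


def required_rcus(reads_per_second: int,
                  item_size_kb: float = 4.0,
                  strongly_consistent: bool = False) -> int:
    """Estimate the read capacity units needed for a steady read rate."""
    # Each read consumes one unit per started 4 KB block of the item.
    units_per_read = math.ceil(item_size_kb / 4.0)
    strong_rcus = reads_per_second * units_per_read
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return strong_rcus if strongly_consistent else math.ceil(strong_rcus / 2)


print(required_rcus(2_000))  # normal load
print(required_rcus(5_000))  # peak load
```

Switching to strongly consistent reads doubles both figures, which is worth keeping in mind if stale preferences are not acceptable.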
RDS Aurora:
RDS Aurora handles scaling through CPU, Memory, and networking resources.
Scaling in RDS Aurora typically involves provisioning instance sizes manually or
using Aurora Auto Scaling for read replicas to handle fluctuating read
workloads. RDS Aurora is optimized to manage high read volumes with minimal
latency, providing consistent performance for demanding applications.
While both databases can handle the required read operations, DynamoDB shines
in automatic partitioning and horizontal scaling. On the other hand, RDS Aurora
offers excellent scaling and automatic adjustment of resources based on demand.
There’s no clear winner when it comes to scalability, as both options can handle the
load. However, for performance, we will later discuss how adding a caching layer on
top of the database implementations can further optimize the system.
3. Ecosystem Compatibility
The compatibility of our database with other services in our architecture is a crucial
factor, particularly for features like Change Data Capture (CDC), which can play a
role in implementing a caching layer for user preferences.
AWS DynamoDB offers a highly flexible and integrated CDC solution through
DynamoDB Streams. This service captures every modification to items in the table
(e.g., inserts, updates, and deletes) in near real-time, making it straightforward to
integrate with other AWS services such as AWS Lambda or Amazon Kinesis for
downstream processing.
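To make the Streams integration more concrete, here is a minimal sketch of a Lambda-style handler that applies stream records to a cache. The `preferences_cache` dictionary is a hypothetical stand-in for a real cache, and the sketch assumes the stream is configured with a view type that includes the new image and that `user_id` is a string partition key.

```python
from typing import Any, Dict

# Hypothetical in-memory stand-in for the user-preferences cache.
preferences_cache: Dict[str, Dict[str, Any]] = {}


def handle_stream_event(event: Dict[str, Any]) -> int:
    """Apply DynamoDB Streams records to the cache and return how many were applied."""
    applied = 0
    for record in event.get("Records", []):
        change = record["dynamodb"]
        # Keys arrive in DynamoDB's typed JSON, e.g. {"user_id": {"S": "54321"}}.
        user_id = change["Keys"]["user_id"]["S"]
        if record["eventName"] in ("INSERT", "MODIFY"):
            # NewImage is only present when the stream view type includes it.
            preferences_cache[user_id] = change["NewImage"]
        elif record["eventName"] == "REMOVE":
            preferences_cache.pop(user_id, None)
        applied += 1
    return applied
```

The stored value stays in DynamoDB's typed format here; a real consumer would probably deserialize it before caching.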
In contrast, implementing CDC with Aurora is more complex. Aurora does not
natively support CDC in the same way DynamoDB does. To achieve CDC with
Aurora, you would need to use a combination of:
1. AWS Database Migration Service (DMS): To capture and stream data changes.
4. Cost
When evaluating the cost of DynamoDB and RDS Aurora, we need to consider
various factors, including CDC (Change Data Capture) costs, read and write
operations, and overall database pricing. Here’s a breakdown:
While the base costs are similar, DynamoDB’s pricing for additional features such as
CDC (via DynamoDB Streams) is more cost-efficient compared to Aurora. In our
case, where CDC plays a significant role, DynamoDB provides a more economical
solution.
2. Database Update:
Once the database is updated, there are two options for managing subsequent
data access:
In the high-level design, we identified the “Invoke” step that triggers the notification service. This invocation
step is critical, as it determines how events flow into the system. There are two
primary types of invocation methods to consider: Synchronous (Sync) and
Asynchronous (Async). Let’s explore the pros and cons of each approach.
Sync Invocation:
In synchronous communication, the service triggering the notification waits for the
notification service to process the request and return a response before proceeding.
Pros:
Cons:
Tight Coupling: All services that need to send notifications must be aware of
the notification service. If something changes in how the notification service
handles requests, we need to revisit every service that calls it.
Async Invocation:
Asynchronous communication involves the calling service sending a notification
request to a message broker or queue, allowing the notification service to process it
independently without requiring the caller to wait for a response.
Pros:
Enhanced Fault Tolerance: Message brokers can store messages until the
notification service is ready, ensuring that notifications are not lost even if the
service experiences downtime.
Cons:
Additionally, this architecture enables the notification service team to focus on:
Async processing ensures a robust, flexible system that can scale and adapt as new
requirements emerge, making it the optimal choice for the architecture we’ve
designed.
Finally, the notification service can be implemented in one of two ways:
Lambda workers
Custom Microservices
Event Validation: The first step is validating the structure of incoming events.
Unrecognized or malformed events are placed in a Dead Letter Queue (DLQ).
{
"eventType": "orderCreated",
"eventId": "12345-abcde-67890",
"timestamp": "2024-12-25T10:00:00Z",
"data": {
"orderId": "98765",
"userId": "54321",
"orderTotal": 150.75,
"currency": "USD",
"orderItems": [
{
"itemId": "001",
"productName": "Wireless Mouse",
"quantity": 1,
"price": 25.99
}
],
"shippingAddress": {
"name": "John Doe",
"street": "123 Elm Street",
"city": "Springfield",
"state": "IL",
"postalCode": "62704",
"country": "USA"
},
"paymentStatus": "Pending",
"orderStatus": "Created"
},
"source": "order-service",
"version": "1.0"
}
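As a rough illustration of the validation step, the sketch below checks the top-level fields of the example event above and separates valid events from those that should go to the DLQ. The set of known event types and the function names are assumptions made for the example, not part of the original design.

```python
from typing import Any, Dict, List, Tuple

# Top-level fields taken from the example event, with their expected types.
REQUIRED_FIELDS = {
    "eventType": str,
    "eventId": str,
    "timestamp": str,
    "data": dict,
    "source": str,
    "version": str,
}
# Hypothetical registry of event types the service knows how to handle.
KNOWN_EVENT_TYPES = {"orderCreated"}


def validate_event(event: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    event_type = event.get("eventType")
    if isinstance(event_type, str) and event_type not in KNOWN_EVENT_TYPES:
        errors.append(f"unrecognized event type: {event_type}")
    return errors


def partition_events(events: List[Dict[str, Any]]) -> Tuple[List[dict], List[dict]]:
    """Split events into (valid, dead_letter); dead-letter entries keep their errors."""
    valid, dead_letter = [], []
    for event in events:
        errors = validate_event(event)
        if errors:
            dead_letter.append({"event": event, "errors": errors})
        else:
            valid.append(event)
    return valid, dead_letter
```

Keeping the error list alongside each dead-lettered event makes later inspection of the DLQ easier.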
[
{
"user_id": "USER_ID_#001",
},
{
"user_id": "USER_ID_#002",
},
...
]
{
"supportedChannels": ["EMAIL", "SMS"],
"eventType": "orderCreated",
"eventId": "12345-abcde-67890",
"timestamp": "2024-12-25T10:00:00Z",
"data": {
"orderId": "98765",
"userId": "54321",
"orderTotal": 150.75,
"currency": "USD",
"orderItems": [
{
"itemId": "001",
"productName": "Wireless Mouse",
"quantity": 1,
"price": 25.99
}
],
"shippingAddress": {
"name": "John Doe",
"street": "123 Elm Street",
"city": "Springfield",
"state": "IL",
"postalCode": "62704",
"country": "USA"
},
"paymentStatus": "Pending",
"orderStatus": "Created"
},
"source": "order-service",
"version": "1.0"
}
Re-structure Event: To optimize payload size and network costs, events are
restructured. Fields irrelevant to downstream consumers are removed,
producing a streamlined event payload.
Restructured event example:
{
"supportedChannels": ["EMAIL", "SMS"],
"eventName": "orderCreated",
"eventGroup": "order",
"timestamp": "2024-12-25T10:00:00Z",
"data": {
"orderId": "98765",
"userId": "54321",
"orderTotal": 150.75,
"currency": "USD",
"orderItems": [
{
"itemId": "001",
"productName": "Wireless Mouse",
"quantity": 1,
"price": 25.99
}
]
},
"source": "order-service",
"version": "1.0"
}
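The restructuring shown in the example can be sketched as a small projection function. The list of retained `data` fields mirrors the example above, while passing `event_group` in as a parameter is an assumption, since the post does not say how the group is derived.

```python
from typing import Any, Dict

# Fields of "data" kept in the restructured example; everything else is dropped.
KEPT_DATA_FIELDS = ("orderId", "userId", "orderTotal", "currency", "orderItems")


def restructure_event(event: Dict[str, Any], event_group: str) -> Dict[str, Any]:
    """Project an enriched event onto the smaller payload sent downstream."""
    data = event["data"]
    return {
        "supportedChannels": event["supportedChannels"],
        "eventName": event["eventType"],  # renamed from eventType
        "eventGroup": event_group,        # assumption: supplied by the caller
        "timestamp": event["timestamp"],
        "data": {key: data[key] for key in KEPT_DATA_FIELDS if key in data},
        "source": event["source"],
        "version": event["version"],
    }
```

Using an explicit allow-list means new upstream fields are not forwarded by accident, which helps keep payload size predictable.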
Publish Events: Finally, events are published to the Fanout component, such as
AWS SNS. It is essential to use batch processing for publishing to reduce network
costs and improve throughput. Note that AWS SNS requires each message body to
be a string, so events are sent as stringified JSON.
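A minimal sketch of the batching step follows. It only builds the request entries and does not call AWS; the function name is illustrative. It relies on the SNS `PublishBatch` limit of 10 messages per request and does not check the total payload size limit for a batch.

```python
import json
from typing import Any, Dict, Iterator, List

SNS_MAX_BATCH_ENTRIES = 10  # PublishBatch accepts at most 10 messages per call


def build_publish_batches(events: List[Dict[str, Any]]) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of PublishBatch entries, each holding up to 10 stringified events."""
    for start in range(0, len(events), SNS_MAX_BATCH_ENTRIES):
        chunk = events[start:start + SNS_MAX_BATCH_ENTRIES]
        yield [
            # Id only has to be unique within its batch; Message must be a string.
            {"Id": str(start + offset), "Message": json.dumps(event)}
            for offset, event in enumerate(chunk)
        ]
```

With boto3, each yielded list could be passed as `PublishBatchRequestEntries` to `publish_batch` on an SNS client, together with the topic ARN.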
With these steps, the notification service efficiently processes and routes events,
ensuring seamless integration into the broader microservices ecosystem.
Reliability
In our current design, when a user’s preferences are updated, a Change Data
Capture (CDC) mechanism triggers a write operation to DynamoDB, where the
Given that we are using AWS and DynamoDB, two primary caching solutions come
to mind:
1. Amazon ElastiCache
A flexible caching service that supports engines like Memcached and Redis.
Following is a table that contains a full comparison between the two solutions
For our use case, DynamoDB DAX is the optimal choice. It offloads the complexity of
implementing cache consistency, managing invalidation, and synchronizing data.
With DAX, the notification service can transparently query the cache without
additional logic, while DynamoDB remains the source of truth.
With the addition of DAX, our updated system design would look like this:
In our architecture, notifications are stored in AWS SQS (Simple Queue Service).
SQS offers several advantages:
Since notifications do not require strict ordering, we can use the Standard Queue to
focus on performance and scalability.
In our serverless architecture, there are multiple AWS services available to process
queue data, such as Lambda, ECS, and EC2. However, we aim to maintain a fully
serverless design with minimal complexity. For this reason, AWS Lambda is the
preferred choice for handling notifications because:
Cost Efficiency: Pay only for the compute time used, without the overhead of
provisioning and managing servers.
Let us zoom in on the “Workers” architecture to understand how it works. The
following diagram shows the flow of a single worker:
Each Lambda function fetches messages from the SQS queue dedicated to its
notification type (e.g., SMS, Email, Push Notifications).
2. Template Loading
Notification templates are stored within the Lambda environment’s file system
or loaded from a centralized configuration service like AWS S3.
This table serves as a source for analytical workloads, enabling the generation of
insights like delivery success rates or user engagement.
4. Sending Notifications
Notifications are dispatched to users via 3rd-party APIs or SDKs, such as Twilio
for SMS, SES for emails, or Firebase for push notifications.
Lambda ensures secure communication with these services using IAM roles or
environment variables for API keys.
Once notifications are successfully sent, they are removed from the queue using
the SQS SDK.
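The worker steps above can be sketched as a Lambda-style handler. Several things here are assumptions made for illustration: the in-code `TEMPLATES` store, the injected `send` callable standing in for a provider SDK, SQS bodies that contain the event JSON directly (raw message delivery), and the use of Lambda's partial batch response so that only failed messages return to the queue rather than deleting messages explicitly.

```python
import json
from string import Template
from typing import Any, Callable, Dict

# Hypothetical template store; the post loads templates from the file system or S3.
TEMPLATES = {
    "orderCreated": Template("Order $orderId was submitted. Total: $orderTotal $currency."),
}


def make_handler(send: Callable[[str, str], None]) -> Callable[..., Dict[str, Any]]:
    """Build a handler that renders and sends one notification per SQS record."""

    def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
        failures = []
        for record in event["Records"]:
            try:
                notification = json.loads(record["body"])
                template = TEMPLATES[notification["eventName"]]
                text = template.substitute(notification["data"])
                send(notification["data"]["userId"], text)
            except Exception:
                # Report only this message as failed so SQS retries it alone.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler
```

For the partial batch response to take effect, the event source mapping needs `ReportBatchItemFailures` enabled; without it, a raised exception would cause the whole batch to be retried.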
While this design represents the happy path, failures can occur at various stages:
1. Item Retrieval Failure: Handled by SQS retry policies. Messages remain in the
queue until processed successfully or moved to the Dead Letter Queue (DLQ).
Multiple deliveries are possible if failures occur at this stage, which may result in
redundant notifications. We can still implement idempotency in our workers, but
only if the 3rd-party service offers such a feature.
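Where a provider does accept an idempotency key, one possible way to derive a stable key is sketched below. It assumes a stable event identifier is carried through to the worker, which the restructured payload shown earlier does not include, and it uses an in-memory set purely as a stand-in for a durable deduplication store.

```python
import hashlib
from typing import Callable, Set


def idempotency_key(event_id: str, user_id: str, channel: str) -> str:
    """Derive a deterministic key so retries of the same notification collide."""
    raw = "|".join((event_id, user_id, channel))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def send_once(key: str, send: Callable[[], None], seen: Set[str]) -> bool:
    """Call send() unless this key was already handled; return True if it was sent."""
    if key in seen:
        return False
    send()
    seen.add(key)  # recorded only after a successful send
    return True
```

The same key would be passed to the provider on every retry so that the provider, not only the worker, can reject duplicates.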
The export functionality supports two modes: full export and incremental export.
Full export enables exporting a complete snapshot of the DynamoDB table from any
point within the defined Point-In-Time Recovery (PITR) window to an S3 bucket.
This is particularly useful for establishing comprehensive datasets or creating
periodic backups. Incremental export, on the other hand, focuses on changes,
allowing us to export only the data that has been updated, deleted, or added within a
specified time range. This method is optimal for continuous updates to analytical
datasets without redundant data duplication.
To use this feature, PITR must be enabled on the DynamoDB table. Once configured,
exports can be initiated through the AWS Management Console, AWS CLI, or
DynamoDB API, providing flexibility and integration into existing workflows.
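As an illustration of the two export modes, the sketch below builds the request parameters for the DynamoDB `ExportTableToPointInTime` API without calling AWS. The helper name, the choice of `DYNAMODB_JSON` as the format, and the `NEW_AND_OLD_IMAGES` view type are assumptions for the example.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def build_export_request(table_arn: str,
                         bucket: str,
                         export_from: Optional[datetime] = None,
                         export_to: Optional[datetime] = None) -> Dict[str, Any]:
    """Build parameters for a full export, or an incremental one if a time range is given."""
    request: Dict[str, Any] = {
        "TableArn": table_arn,
        "S3Bucket": bucket,
        "ExportFormat": "DYNAMODB_JSON",
    }
    if export_from is None and export_to is None:
        request["ExportType"] = "FULL_EXPORT"
        return request
    if export_from is None or export_to is None:
        raise ValueError("incremental export needs both export_from and export_to")
    request["ExportType"] = "INCREMENTAL_EXPORT"
    request["IncrementalExportSpecification"] = {
        "ExportFromTime": export_from,
        "ExportToTime": export_to,
        "ExportViewType": "NEW_AND_OLD_IMAGES",
    }
    return request
```

With boto3, the resulting dictionary could be unpacked into `export_table_to_point_in_time` on a DynamoDB client, provided PITR is enabled on the table.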
By exporting notification data to Amazon S3, we can unlock the ability to perform
advanced analytical queries. For example, we can track the volume of notifications
sent within a specific time frame, identify the most frequently used notification
types, or generate detailed business intelligence dashboards for stakeholders. This
architecture seamlessly integrates operational and analytical workflows, ensuring
that data-driven decision-making becomes a core component of our notification
system.
Reliability
Ensuring the reliability of our notification system is a critical aspect of the overall
architecture. The system must guarantee that no data is lost while maintaining
1. Event Listeners: The notification service listens for incoming events from topics
or streams, and processes them. It also publishes the processed data to an SNS
topic. During this stage, errors can occur if:
Event Listeners
To mitigate these risks, the event source (stream or topic) retains unprocessed
records until an acknowledgment is received, ensuring no data is lost. Additionally,
a DLQ is configured at the stream/topic level to capture records that exceed the
maximum retry attempts, allowing for later inspection and reprocessing.
2. Fanout Component: The SNS topic acts as the fanout mechanism, delivering
notifications to multiple SQS queues. Failures here include:
In such cases, SNS provides built-in retry mechanisms. For undeliverable messages,
DLQs are configured for each subscriber. These DLQs capture messages that cannot
be successfully delivered after all retries, preserving them for further analysis.
Fanout Component
3. Worker Stage: Lambda workers fetch records from SQS queues, process them,
and perform tasks such as sending notifications via third-party APIs or writing
results to the database. Failures may arise from:
Worker Stage
No data is lost, with unprocessed records captured at each critical failure point.
Failed messages are transparently retried within the system, reducing manual
intervention.
DLQs provide a safety net for records that exceed retry thresholds, enabling
post-mortem analysis and troubleshooting.
The event receiver is critical for ingesting data from multiple streams or topics. Our
architecture leverages Kubernetes for deploying the notification service, providing
the ability to scale horizontally based on demand. Here’s how scalability is achieved:
Topic/Stream Scaling: Each topic or stream is divided into partitions (or shards
in some stream implementations), which can be increased dynamically as traffic
grows. This ensures that the event ingestion system can handle surges in event
throughput.
Fanout Workflow
When the notification service processes events from a topic or stream, it enriches
them by fetching necessary details like user preferences or other relevant
information from the database. After gathering and preparing a batch of records,
the service sends these messages to Amazon SNS (Simple Notification Service),
which acts as the intermediary for distributing notifications.
Approaches for Sending Records to SNS
1. Send Each Record Separately in Parallel: Using the Publish API command, the
service can send individual messages to the SNS topic. This approach offers
simplicity and ensures each message is processed independently, but it may lead
to higher costs due to multiple API calls.
Given these capabilities, Standard topics can handle the expected workload of the
fanout workflow, ensuring reliable and efficient message distribution.
If we want to use FIFO topics, we can add a distribution layer before the topics that
is responsible for routing events across multiple FIFO topics, which in turn route
the events to the queues.
Distribution Layer for SNS Routing to FIFO Topics to Handle Incremental Load
Implementing a consistent hashing layer can add a lot of complexity to your design.
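To give a sense of what such a layer involves, here is a minimal consistent-hashing sketch that maps a routing key to one of several FIFO topics. The class, the topic names, and the use of virtual nodes with SHA-256 are illustrative choices, not the author's implementation.

```python
import bisect
import hashlib
from typing import List, Tuple


class ConsistentHashRing:
    """Map keys to nodes so that adding a node only moves a fraction of the keys."""

    def __init__(self, nodes: List[str], virtual_nodes: int = 100) -> None:
        self._virtual_nodes = virtual_nodes
        self._ring: List[Tuple[int, str]] = []  # kept sorted by hash value
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        # Several virtual points per node smooth out the key distribution.
        for i in range(self._virtual_nodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First ring point clockwise from the key's hash, wrapping at the end.
        index = bisect.bisect_right(self._ring, (self._hash(key), "")) % len(self._ring)
        return self._ring[index][1]
```

Routing by a key such as the user ID keeps all of one user's events on the same FIFO topic, which is what preserves their relative order.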
Worker and Analytic Flow
Standard Queues: Lambda scales to the configured concurrency limit for the
queue. For example, if you set a limit of 100 concurrent Lambdas for an “Email”
SQS queue, Lambda scales accordingly as messages arrive.
For example:
- If the provider allows sending 1,000 messages per second,
- and each Lambda invocation processes a batch of 10 messages in roughly one second,
- then the maximum concurrency setting for the queue should be 100 to avoid
exceeding provider quotas and risking throttling.
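The concurrency arithmetic in this example can be written as a small helper. The function name is illustrative, and the batch duration parameter makes explicit the assumption that each invocation takes about one second per batch.

```python
import math


def max_lambda_concurrency(provider_rate_per_second: float,
                           batch_size: int,
                           batch_duration_seconds: float = 1.0) -> int:
    """Largest number of concurrent workers that stays within the provider's rate limit."""
    # Messages per second that a single worker pushes to the provider.
    per_worker_rate = batch_size / batch_duration_seconds
    return math.floor(provider_rate_per_second / per_worker_rate)


print(max_lambda_concurrency(1_000, 10))       # one batch per second per worker
print(max_lambda_concurrency(1_000, 10, 0.5))  # faster batches allow fewer workers
```

If invocations finish faster than one second, each worker sends more messages per second, so the safe concurrency limit drops accordingly.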
Batch processing reduces the number of API calls to SQS and improves cost
efficiency. This approach is particularly beneficial for high-throughput systems.
Analytic Flow with DynamoDB and S3
1. Data Writing: Lambda workers persist notification data in DynamoDB for both
operational needs and analytical workloads. DynamoDB’s scalability ensures it
can handle high write volumes without bottlenecks.
EventBridge Scheduler
2. Once the admin submits their request, an EventBridge Scheduler is created with
the specified configurations using AWS SDK. A custom payload is included,
containing the admin’s configurations:
{
"notification_target": "EMAIL",
"notification_template": "new_year_sales"
}
4. The Lambda function reads the payload and queries DynamoDB indexes to fetch
users with the corresponding preferences.
6. Once the data is inserted into the queue, workers can pick it up and
process/deliver the notifications.
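The schedule creation step can be sketched as a function that builds the parameters for the EventBridge Scheduler `CreateSchedule` API. It does not call AWS, and the function name, the one-time `at(...)` expression, and the ARNs supplied by the caller are assumptions for the example.

```python
import json
from datetime import datetime
from typing import Any, Dict


def build_schedule_request(name: str,
                           run_at: datetime,
                           target_lambda_arn: str,
                           role_arn: str,
                           notification_target: str,
                           notification_template: str) -> Dict[str, Any]:
    """Build a one-time schedule whose target receives the admin's configuration."""
    payload = {
        "notification_target": notification_target,
        "notification_template": notification_template,
    }
    return {
        "Name": name,
        # One-time schedules use the at(yyyy-mm-ddThh:mm:ss) expression form.
        "ScheduleExpression": f"at({run_at.strftime('%Y-%m-%dT%H:%M:%S')})",
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            "Arn": target_lambda_arn,
            "RoleArn": role_arn,
            "Input": json.dumps(payload),  # custom payload is passed as a JSON string
        },
    }
```

With boto3, the resulting dictionary could be unpacked into `create_schedule` on a `scheduler` client.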
The Lambda function reads the notification_target field from the custom payload,
queries the DynamoDB table using the contact_type as the PK, and aggregates
groups of reads. It then publishes the results to SNS or SQS.
For each priority queue, we configure a specific Lambda function with concurrent
execution limits. We use SNS filter policies and add a priority field to the event
payload passed to SNS. This ensures that events are routed to the appropriate queue
based on their priority.
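To illustrate the routing, the sketch below defines one filter policy per queue and emulates SNS exact-match string filtering on a top-level `priority` field. The queue names and priority levels are hypothetical, and the matcher covers only this simple case, not the full SNS filter policy syntax.

```python
from typing import Any, Dict, List

# Hypothetical queues and priority levels; each value is an SNS-style filter policy.
QUEUE_FILTER_POLICIES: Dict[str, Dict[str, List[str]]] = {
    "high-priority-queue": {"priority": ["high"]},
    "normal-priority-queue": {"priority": ["normal", "low"]},
}


def matches(policy: Dict[str, List[str]], message: Dict[str, Any]) -> bool:
    """True if every policy field is present in the message with an allowed value."""
    return all(message.get(field) in allowed for field, allowed in policy.items())


def target_queues(message: Dict[str, Any]) -> List[str]:
    """Return the queues whose subscription filter would accept this message."""
    return [
        queue for queue, policy in QUEUE_FILTER_POLICIES.items()
        if matches(policy, message)
    ]
```

Because the priority lives in the event payload rather than in message attributes, each subscription would also need its `FilterPolicyScope` set to `MessageBody` for SNS to evaluate the policy against the body.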
The final architecture with a priority queue implementation looks something like
this:
For more details on implementing priority queues, please refer to my other blog
post.
EventBridge Pipes are designed for point-to-point integrations. Each pipe processes
events from a single source and delivers them to a single target. Additionally, Pipes
support advanced transformations and event enrichment before delivery.
EventBridge Pipes can integrate with different targets or destinations, enabling the
following architectural designs:
By using EventBridge Pipes, we eliminate the need to provision and manage code
for a separate “Notification Service.” Since EventBridge is fully serverless, its
features allow us to achieve our notification requirements seamlessly.
One potential architecture utilizes an SNS Topic as the target of the Pipe. This design
enables reliable message distribution to multiple subscribers.
Alternatively, we can skip the SNS Topic and use an SQS queue as the direct target
of the Pipe. However, this approach has a limitation: because a pipe delivers to a
single target, you must create a separate pipe for each queue fed by a given stream.
If there are only 2–3 queues, this isn’t a significant issue. However, as the number
of queues grows, every additional pipe becomes another consumer reading from the same
stream, which can introduce challenges at the stream level.
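The difference between the two designs can be sketched by building the pipe definitions side by side. This is a hedged illustration: it assumes the source is a DynamoDB stream, and every ARN, name, and the INSERT-only filter are hypothetical placeholders. The dictionaries follow the shape of the EventBridge Pipes CreatePipe request; nothing here calls AWS.

```python
import json

# Hypothetical ARNs, for illustration only.
STREAM_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/notifications/stream/2025-01-01T00:00:00.000"
ROLE_ARN = "arn:aws:iam::123456789012:role/notification-pipe-role"
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:notifications"
QUEUE_ARNS = [
    "arn:aws:sqs:us-east-1:123456789012:email-queue",
    "arn:aws:sqs:us-east-1:123456789012:sms-queue",
    "arn:aws:sqs:us-east-1:123456789012:push-queue",
]


def pipe_definition(name: str, source_arn: str, target_arn: str, role_arn: str) -> dict:
    """Build keyword arguments shaped like a CreatePipe request."""
    return {
        "Name": name,
        "RoleArn": role_arn,
        "Source": source_arn,
        "SourceParameters": {
            "DynamoDBStreamParameters": {"StartingPosition": "LATEST", "BatchSize": 10},
            # Assumed filter: forward only newly inserted items
            "FilterCriteria": {
                "Filters": [{"Pattern": json.dumps({"eventName": ["INSERT"]})}]
            },
        },
        "Target": target_arn,
    }


# Design 1: a single pipe targets the SNS topic, which fans out to subscribers.
sns_pipes = [pipe_definition("notifications-to-sns", STREAM_ARN, TOPIC_ARN, ROLE_ARN)]

# Design 2: no topic, so each queue needs its own pipe on the same stream.
sqs_pipes = [
    pipe_definition(f"notifications-to-{arn.rsplit(':', 1)[-1]}", STREAM_ARN, arn, ROLE_ARN)
    for arn in QUEUE_ARNS
]
```

With the topic in place the number of pipes stays at one no matter how many subscribers are added, whereas the direct-to-queue design grows by one pipe (and one more stream reader) per queue.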
Conclusion
In this comprehensive blog, we’ve delved into various aspects of system
architecture, focusing on data modeling, caching, and handling high volumes of
reads. We explored a full architecture that can be used to build your next
production-ready notification system.
I encourage you to take your time to review the blog thoroughly. If you have any
questions or suggestions for further improvement, please feel free to share your
thoughts in the comments section. Your feedback is invaluable in enhancing the
content and providing greater value to our readers.
Experienced Software Engineer and Solutions Architect with 10+ years in backend/frontend development,
mobile apps, and DevOps.