
Replication in MongoDB

Replication in MongoDB involves maintaining multiple synchronized copies of data across servers. The primary server records all write operations and shares the changes with secondary servers via an oplog. This ensures high availability and data redundancy even if the primary server fails. When the primary fails, an election occurs to select a new primary from the secondary servers based on data freshness, priority, and uptime to keep the replica set functioning.

Uploaded by

dobomighie
Copyright
© © All Rights Reserved

MongoDB Replication

…to duplicate, copy, mirror, reproduce, and emulate


Replication is a process of creating and
maintaining multiple synchronised copies of
data across multiple database servers. It
ensures high availability, fault tolerance,
and data redundancy.
To avoid losing inventory data in a small
bookstore, the owner makes copies of the
record book and distributes them to employees,
ensuring that they have an up-to-date copy of
the inventory. This is similar to MongoDB
Replication, where a primary server records
changes to the database, and secondary servers
maintain copies of the primary server's data to
keep them in sync.
Purpose: The main purpose of replication is
to maintain data redundancy and improve
data availability, ensuring continuous
service during system failures, hardware
issues, or network outages.
Advantages:
a. High availability: Ensures uninterrupted
service even if one or more servers fail.
b. Data redundancy: Multiple copies of data
protect against data loss and provide
backup options.
c. Disaster recovery: Facilitates recovery
from catastrophic events, such as data
center outages or natural disasters.
d. Load balancing: Distributes read
operations across secondary servers to
improve performance.
e. Geographic distribution: Allows
spreading data across various regions to
improve latency and comply with data
locality regulations.
How Replication Works
1. Establish multiple servers: Set up
multiple database servers to host the
same dataset. One server is designated as
the primary server, responsible for
handling write operations, while the other
servers serve as secondary servers,
maintaining copies of the primary server's
data.
2. Write operations: The primary server
accepts all write operations, such as
inserting, updating, or deleting data.
3. Record changes: When data is modified
on the primary server, the changes are
recorded in a log that can be shared with
the secondary servers. This log is called
the oplog (operations log).
4. Transmit changes: The log containing the
changes is transmitted to the secondary
servers, ensuring they receive the latest
updates.
5. Apply changes to secondary servers:
The secondary servers apply the changes
received from the log to their copies of
the data, keeping their data synchronized
with the primary server.
6. Monitor primary server: Secondary
servers continuously monitor the primary
server's availability. If the primary server
is detected as unavailable, a failover
process begins.
7. Failover process: During failover, an
election takes place among the secondary
servers to choose a new primary server.
The elected server takes over the primary
server's responsibilities, including handling
write operations.
8. Update configuration: The system
configuration is updated to reflect the
change in the primary server, and the new
primary server begins accepting write
operations.
9. Resume normal operations: The
secondary servers continue to replicate
data from the new primary server,
ensuring the system remains synchronized.
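
Steps 2 to 5 above can be illustrated with a small in-memory model. This is a toy sketch in plain JavaScript, not MongoDB's implementation: the Primary and Secondary classes, the shape of an oplog entry, and the method names are all hypothetical and exist only to mirror the steps.

```javascript
// Toy model of oplog-based replication (hypothetical classes, not MongoDB internals).
class Primary {
  constructor() {
    this.data = new Map(); // current dataset: key -> value
    this.oplog = [];       // ordered log of every write
  }

  // Steps 2-3: accept a write, apply it locally, and record it in the oplog.
  write(op, key, value) {
    const entry = { seq: this.oplog.length + 1, op, key, value };
    applyEntry(this.data, entry);
    this.oplog.push(entry);
    return entry;
  }
}

class Secondary {
  constructor() {
    this.data = new Map();
    this.lastApplied = 0; // seq of the last oplog entry applied
  }

  // Steps 4-5: read entries newer than lastApplied and apply them in order.
  syncFrom(primary) {
    for (const entry of primary.oplog) {
      if (entry.seq > this.lastApplied) {
        applyEntry(this.data, entry);
        this.lastApplied = entry.seq;
      }
    }
  }
}

// Apply one logged operation to a dataset.
function applyEntry(data, entry) {
  if (entry.op === "delete") {
    data.delete(entry.key);
  } else {
    data.set(entry.key, entry.value); // "insert" and "update" both set the value
  }
}

const primary = new Primary();
const secondary = new Secondary();

primary.write("insert", "book:1", { title: "Dune", stock: 3 });
primary.write("update", "book:1", { title: "Dune", stock: 2 });
primary.write("insert", "book:2", { title: "Emma", stock: 5 });
primary.write("delete", "book:2");

secondary.syncFrom(primary);
console.log(secondary.lastApplied, [...secondary.data.keys()]);
```

Because the secondary remembers the last entry it applied, calling syncFrom again does not reapply old entries; it only picks up writes made since the previous call.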
Replica Set
In MongoDB, a replica set is a group of
servers (MongoDB instances) working
together to maintain the same dataset. A
replica set contains several data-bearing
nodes and, optionally, an arbiter node,
which votes in elections but holds no data.
Of the data-bearing nodes, one and only one
member is deemed the primary node, while
the other nodes are deemed secondary nodes.
The designated primary server is
responsible for accepting all write
operations (adding, updating, or deleting
data), while the secondary servers
maintain copies of the primary server's
data.
Heartbeat
Heartbeat: This is a periodic signal
exchanged between the members of the
replica set to monitor their availability and
health.
1. Check the availability of other
members: Heartbeat messages help the
replica set members determine if other
members are online and operational. If a
member stops receiving heartbeat
messages from another member within a
specified time period, it considers that
member unavailable.
2. Monitor primary server: The secondary
servers continuously exchange heartbeat
messages with the primary server to
monitor its availability. If the primary
server becomes unavailable, the replica
set initiates a failover process to elect a
new primary server.
3. Share replica set configuration version:
Heartbeat messages also contain
information about the replica set
configuration version. If there's a
mismatch in the configuration version
between members, they can request an
updated configuration from other members.
4. Support the election process: During
the failover process, heartbeat messages
are used to exchange votes among the
replica set members. Members vote for a
candidate to become the new primary
server based on factors such as data
freshness, priority settings, and server
uptime.
By default, MongoDB sends heartbeat messages
every 2 seconds, and the default timeout for
considering a member unavailable is 10
seconds. Both values can be changed in the
replica set configuration to suit specific
requirements.
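
The timing rule above can be sketched in a few lines of JavaScript. This is only an illustration of the timeout check, assuming each member keeps a record of when it last heard from every other member; the names findUnavailable and lastHeartbeatAt are hypothetical and not part of MongoDB.

```javascript
// Timing values stated in the text (MongoDB defaults).
const HEARTBEAT_INTERVAL_MS = 2000;
const TIMEOUT_MS = 10000;

// Hypothetical helper: given the time each member was last heard from,
// return the members considered unavailable at time `now`.
function findUnavailable(lastHeartbeatAt, now, timeoutMs = TIMEOUT_MS) {
  const unavailable = [];
  for (const [host, lastSeen] of Object.entries(lastHeartbeatAt)) {
    if (now - lastSeen > timeoutMs) {
      unavailable.push(host);
    }
  }
  return unavailable;
}

// Example: timestamps in milliseconds on one member's local clock.
const lastHeartbeatAt = {
  "localhost:27017": 1000,  // silent since t = 1 s
  "localhost:27018": 12000, // heard from at t = 12 s
};

// With the defaults, a member is given up on after about five missed heartbeats.
const missedBeats = Math.floor(TIMEOUT_MS / HEARTBEAT_INTERVAL_MS);
console.log(missedBeats);
console.log(findUnavailable(lastHeartbeatAt, 13000));
```

At t = 13 s, only localhost:27017 has been silent for longer than the timeout, so it is the only member reported.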
Asynchronous Replication
In a MongoDB replica set, the primary
server logs data changes in the oplog.
Secondary servers replicate this oplog and
apply changes to their data sets
asynchronously, meaning they process
updates at their own pace rather than in
real-time or immediately after primary
server changes.
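
A minimal sketch of what "at their own pace" means in practice: the primary keeps accepting writes without waiting, and the secondary works through the backlog in batches. The functions below are hypothetical and measure lag in oplog entries rather than in seconds.

```javascript
// Toy illustration of asynchronous replication (hypothetical model).
const oplog = [];          // entries written by the primary
let secondaryApplied = 0;  // how many entries the secondary has applied

function primaryWrite(doc) {
  oplog.push(doc);         // the primary does not wait for the secondary
}

// The secondary applies at most `batchSize` pending entries per call.
function secondaryCatchUp(batchSize) {
  const pending = oplog.length - secondaryApplied;
  secondaryApplied += Math.min(batchSize, pending);
}

// Replication lag: entries the primary has logged but the secondary has not applied.
function lag() {
  return oplog.length - secondaryApplied;
}

for (let i = 1; i <= 5; i++) primaryWrite({ n: i });
const lagBefore = lag();   // secondary has applied nothing yet
secondaryCatchUp(2);
const lagDuring = lag();   // partially caught up
secondaryCatchUp(10);
const lagAfter = lag();    // fully caught up
console.log(lagBefore, lagDuring, lagAfter);
```

The lag shrinks from 5 to 3 to 0 as the secondary catches up, which is the window during which a read from the secondary could return older data than the primary holds.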
Failover & The Process of Failover
Failover in a MongoDB replica set is the
process of automatically selecting a new
primary server when the current primary
server becomes unavailable, ensuring that
the replica set remains highly
available and fault-tolerant.
1. Detect primary server failure: The
secondary servers in the replica set
continuously monitor the primary server's
availability by exchanging heartbeat
messages. If a secondary server does not
receive a heartbeat from the primary
server within a specified time period, it
considers the primary server unavailable.
2. Initiate election: Once the primary
server is detected as unavailable, an
election process begins among the
remaining secondary servers to choose a
new primary server.
3. Nominate candidates: Eligible secondary
servers that are up-to-date with the
primary server's data and can reach a
majority of the voting members can stand
as candidates for the new primary server.
4. Vote: Each member of the replica set
casts a vote for the candidate they
consider best-suited to become the new
primary server. This decision is based on
factors such as data freshness, priority
settings, and server uptime.
Data freshness: Data freshness refers to how up-to-
date a secondary server's data is compared to the
data on the primary server. Secondary servers with
the most recent data (i.e., those that have replicated
the latest changes from the primary server) are
considered better candidates for the new primary
server. This is because promoting a secondary server
with the latest data minimizes the risk of data loss
and ensures that the replica set remains consistent.
Priority settings: In a MongoDB replica set, you can assign a priority value
to each member.
• The priority is a number between 0 and 1000.
• A higher priority value indicates that a member is more preferred
for becoming the primary server during the election process.
• By default, all members have a priority of 1.
• You can set the priority of a member to 0 to make it ineligible for
becoming the primary server, so that it always remains a secondary
server.
• Priority settings allow you to influence the election process and
ensure that the most suitable servers are promoted to primary
servers based on your specific requirements.
Server uptime: Server uptime is the amount of time
a server has been continuously online and
operational. During the election process, a secondary
server with longer uptime is considered more stable
and reliable, making it a better candidate for the
new primary server. A longer uptime may indicate
that the server has better hardware, a more stable
network connection, or fewer operational issues,
which are all factors that contribute to a reliable
and highly available replica set.
The combination of these factors—data
freshness, priority settings, and server uptime—
helps the members of a MongoDB replica set to
make an informed decision when selecting a
new primary server during the failover process.
This ensures that the replica set remains highly
available, fault-tolerant, and consistent, even in
the event of primary server failure.
5. Elect new primary: The candidate that
receives a majority of votes from the
replica set members is elected as the new
primary server. In case of a tie, a new
round of voting will take place until a
majority is reached.
6. Update configuration: Once the new
primary server is elected, the replica set
configuration is updated to reflect the
change, and the new primary server starts
accepting write operations.
7. Resume normal operations: The
secondary servers continue to replicate
data from the new primary server,
ensuring that the replica set remains in
sync.
8. Recovery of the failed primary server:
When the failed primary server becomes
available again, it will rejoin the replica
set as a secondary server. It will
synchronize its data with the current
primary server's data using the oplog to
catch up on any changes that occurred
during its downtime.
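
The voting and majority rules in the steps above can be sketched as a small function. This is a deliberate simplification and not MongoDB's actual election protocol: it models only data freshness, priority, and the majority requirement, and the function electPrimary and the fields lastApplied and reachable are hypothetical.

```javascript
// Toy failover election (a simplification, not MongoDB's election protocol).
// Each member has: host, priority (0 = never primary),
// lastApplied (higher = fresher data), and reachable (false for a failed member).
function electPrimary(members) {
  // A majority is counted against all members, including unreachable ones.
  const majority = Math.floor(members.length / 2) + 1;
  const voters = members.filter((m) => m.reachable);

  // Candidates must be reachable and have a non-zero priority.
  const candidates = voters
    .filter((m) => m.priority > 0)
    // Prefer the freshest data, then the highest priority.
    .sort((a, b) => b.lastApplied - a.lastApplied || b.priority - a.priority);

  for (const candidate of candidates) {
    // A voter only supports a candidate that is at least as fresh as itself.
    const votes = voters.filter((v) => v.lastApplied <= candidate.lastApplied).length;
    if (votes >= majority) {
      return { primary: candidate.host, votes, majority };
    }
  }
  return { primary: null, votes: 0, majority }; // no majority: no primary is elected
}

const members = [
  { host: "localhost:27017", priority: 1, lastApplied: 120, reachable: false }, // failed primary
  { host: "localhost:27018", priority: 1, lastApplied: 118, reachable: true },
  { host: "localhost:27019", priority: 2, lastApplied: 120, reachable: true },
];

console.log(electPrimary(members));
```

In this example two of three members are still reachable, so a majority of 2 is possible and localhost:27019 wins because it holds the freshest data. If only one member were reachable, the function would return no primary, which mirrors why a replica set needs a majority of its members to elect one.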
Creating a MongoDB Replica Set
1. Create data directories for each replica set member:
mkdir -p ~/mongo-replica-set/{rs1,rs2,rs3}/data
• "mkdir" creates directories.
• "-p" is an option that creates parent directories if
they don't already exist.
• "~/" is shorthand notation for the current user's
home directory and serves as the parent directory
for the entire path.
• "mongo-replica-set" is a subdirectory under the
parent directory "~/".
• "{rs1,rs2,rs3}" is a brace expansion that creates three
subdirectories named "rs1", "rs2", and "rs3"
under the "mongo-replica-set" directory. The list
must not contain spaces, or the shell will not expand it.
• "/data" is a subdirectory created under each of
the "rs1", "rs2", and "rs3" directories.
You can check that the directories were created:
ls ~/mongo-replica-set/rs1/
ls ~/mongo-replica-set/rs2/
ls ~/mongo-replica-set/rs3/
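2. After starting one mongod instance per member (each with its own
data directory and port, and the same replica set name), define the
replica set configuration in the MongoDB shell:
const config = {
  _id: "myReplicaSet",
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019" }
  ]
};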
• The code block defines a configuration object called config for a
MongoDB replica set named "myReplicaSet".
• The _id field is the unique identifier for the replica set.
• The members array contains the configuration for each replica set
member.
• Each member is identified by its _id field and specifies the host on
which the instance is running.
• There are three members specified in the config object, each with
a different _id and host value.
• The purpose of having multiple members is to ensure high
availability and redundancy of the data.
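
The properties listed above (a set name, and members whose _id and host values all differ) can be checked before the configuration is used. The following is a minimal sketch of such a check; validateConfig is a hypothetical helper written for this example, not a MongoDB function, and MongoDB performs its own validation when the configuration is applied.

```javascript
// Hypothetical pre-flight check for a replica set configuration object.
// It only verifies the properties described in the text: a set name, and
// members whose _id and host values are all different.
function validateConfig(config) {
  const errors = [];
  if (typeof config._id !== "string" || config._id.length === 0) {
    errors.push("replica set _id must be a non-empty string");
  }
  const members = Array.isArray(config.members) ? config.members : [];
  if (members.length === 0) {
    errors.push("members must be a non-empty array");
  }
  const ids = new Set();
  const hosts = new Set();
  for (const member of members) {
    if (ids.has(member._id)) errors.push(`duplicate member _id: ${member._id}`);
    if (hosts.has(member.host)) errors.push(`duplicate host: ${member.host}`);
    ids.add(member._id);
    hosts.add(member.host);
  }
  return errors;
}

const config = {
  _id: "myReplicaSet",
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019" },
  ],
};

console.log(validateConfig(config)); // an empty array means no problems were found
```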
rs.initiate(config);
• rs.initiate(config) is a command used to initialize a new replica set.
• When you execute this, MongoDB creates a new replica set based on the
configuration provided in the config object.
• The replica set will have one primary node and two secondary nodes
according to the config.
• The command runs against the MongoDB instance that the shell is
currently connected to.
• It initializes the replica set with the configuration provided in the
config object.
• The members then elect a primary node; with equal priorities, the member
on which rs.initiate() was run typically becomes the first primary.
• The primary node then begins replicating data to the secondary nodes in
the replica set.
rs.status();
• majorityVoteCount specifies the number of votes needed to
elect a new primary node, that is, a majority of the voting
members.
• writeMajorityCount specifies the number of data-bearing voting
members needed to acknowledge a write that uses the "majority"
write concern.
• votingMembersCount specifies the total number of voting
members in the replica set.
• writableVotingMembersCount specifies the number of data-bearing
voting members, that is, voting members that are not arbiters.
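
The relationship between these four counts can be sketched as follows. The sketch assumes that writableVotingMembersCount counts the data-bearing voting members (not arbiters) and that writeMajorityCount is the smaller of the voting majority and that count, as the MongoDB documentation describes them. The voteCounts function and its simplified member list are hypothetical; the real rs.status() output has a different shape.

```javascript
// Sketch of how the four counts relate (hypothetical helper).
// `members` is a simplified list using the member fields `votes` and `arbiterOnly`.
function voteCounts(members) {
  const voting = members.filter((m) => m.votes > 0);
  const writableVoting = voting.filter((m) => !m.arbiterOnly);
  const majorityVoteCount = Math.floor(voting.length / 2) + 1;
  return {
    votingMembersCount: voting.length,
    writableVotingMembersCount: writableVoting.length,
    majorityVoteCount,
    // Arbiters hold no data and cannot acknowledge writes, so the write
    // majority is capped by the number of data-bearing voting members.
    writeMajorityCount: Math.min(majorityVoteCount, writableVoting.length),
  };
}

// Three data-bearing voting members, as in the configuration used in this document.
const threeDataNodes = [
  { host: "localhost:27017", votes: 1, arbiterOnly: false },
  { host: "localhost:27018", votes: 1, arbiterOnly: false },
  { host: "localhost:27019", votes: 1, arbiterOnly: false },
];

// One data-bearing member plus two arbiters (an unusual layout, for contrast).
const oneDataTwoArbiters = [
  { host: "localhost:27017", votes: 1, arbiterOnly: false },
  { host: "localhost:27018", votes: 1, arbiterOnly: true },
  { host: "localhost:27019", votes: 1, arbiterOnly: true },
];

console.log(voteCounts(threeDataNodes));
console.log(voteCounts(oneDataTwoArbiters));
```

For three data-bearing members, both majorities are 2 of 3. In the second layout the election majority is still 2, but only one member can acknowledge writes, so the write majority drops to 1.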
Replication in MongoDB refers to the
process of synchronising data across
multiple servers, creating multiple copies
of the data to ensure high availability and
fault tolerance.