Module 4 Nosql
MODULE 4
Example Setup: In a 3-node replica-set cluster, new secondary (slave) nodes can be added to
handle increased read traffic. Each new node, such as mongo D, is added to the replica set
using rs.add("mongod:27017"). The node syncs with the existing nodes and joins the set
as a secondary that serves read requests.
Horizontal Scaling for Reads: This technique, known as read scaling, allows each
additional node to increase the read capacity of the system.
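The effect of read scaling can be sketched in plain JavaScript. This is an illustration only, not MongoDB driver code: the makeReadRouter and readsPerNode helpers and the node names are hypothetical, and the round-robin policy is an assumption made for the example.

```javascript
// Hypothetical helper: rotates read requests across the secondaries of a replica set.
function makeReadRouter(secondaries) {
  const nodes = [...secondaries]; // copy so the caller's array is not modified
  let next = 0;
  return {
    addSecondary(name) {
      nodes.push(name);
    },
    routeRead() {
      const node = nodes[next % nodes.length];
      next += 1;
      return node;
    },
  };
}

// Count how many of `total` reads each node serves.
function readsPerNode(router, total) {
  const counts = {};
  for (let i = 0; i < total; i++) {
    const node = router.routeRead();
    counts[node] = (counts[node] || 0) + 1;
  }
  return counts;
}

const router = makeReadRouter(["mongoB", "mongoC"]);
const before = readsPerNode(router, 120); // two secondaries share the reads
router.addSecondary("mongoD");
const after = readsPerNode(router, 120); // three secondaries share the reads
console.log(before, after);
```

With two secondaries each node serves half of the reads; after the third is added, each serves a third, which is the capacity gain the text describes.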
Sharding: This involves splitting data based on a specific key (e.g., firstname) and
distributing it across multiple nodes, or "shards." Each shard can also be configured as
a replica set to improve read performance within that shard.
Placement Strategy: Sharding can also be based on user location, which places data
closer to the users for faster access, e.g., data for East Coast users in East Coast
servers, and West Coast data on the West Coast.
Each shard in a sharded cluster can be set up as a replica set, combining the benefits
of sharding with replication (as seen in Figure 9.3). This setup enables improved read
and write performance, as each shard can serve both as an independent replica set and
a distributed part of the overall dataset.
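The two placement ideas above (splitting by a key and placing by user location) can be sketched as follows. This is a simplified illustration in plain JavaScript: the shard names, the region field, and the hashKey function are assumptions for the example and do not reflect how MongoDB distributes data internally.

```javascript
// Hypothetical shard names; a real cluster defines its own.
const shards = ["shardA", "shardB", "shardC"];

// Simple deterministic string hash (illustrative only, not MongoDB's hashing).
function hashKey(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value in unsigned 32-bit range
  }
  return h;
}

// Route a document to a shard based on the value of its shard key field.
function shardFor(doc, shardKey) {
  return shards[hashKey(doc[shardKey]) % shards.length];
}

// Location-based placement: a hypothetical mapping of regions to shards.
const regionToShard = { east: "eastCoastShard", west: "westCoastShard" };
function shardForRegion(doc) {
  return regionToShard[doc.region];
}

console.log(shardFor({ firstname: "Martin" }, "firstname"));
console.log(shardForRegion({ firstname: "Pramod", region: "west" }));
```

The same key value always maps to the same shard, so a query that includes the shard key can be sent to a single shard instead of all of them.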
Koustav Biswas, Dept. Of CSE, DSATM
NOSQL Database 21CS745
Document databases, like CouchDB and MongoDB, offer flexible query options that make
complex data retrieval easier compared to traditional key-value stores.
1. Views in CouchDB:
Unlike key-value stores, document databases allow querying based on fields within
documents without needing to retrieve the entire document by its key. This capability
brings them closer to RDBMS-style querying.
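JSON-based Syntax: MongoDB uses a JSON-like syntax for queries, with operators
such as $query for filtering (the equivalent of a WHERE clause), $orderby for sorting,
and $explain to display the execution plan. Current MongoDB versions express the same
operations through find() filters and the sort() and explain() cursor methods.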
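Example Queries:
o Return all documents in the collection:
MongoDB: db.order.find()
o Filter by customerId:
MongoDB: db.order.find({"customerId":"883c2c5b4e5b"})
o Filter by customerId and return only orderId and orderDate:
MongoDB: db.order.find({customerId:"883c2c5b4e5b"},{orderId:1,orderDate:1})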
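o Filter on a field nested inside embedded documents (orders containing a product whose name matches "Refactoring"):
MongoDB: db.orders.find({"items.product.name": /Refactoring/})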
This embedded structure in MongoDB enables simpler and more efficient querying for
related data within a single document.
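To make the filter and projection arguments of find() concrete, here is a minimal in-memory sketch in plain JavaScript. It is an illustration, not MongoDB's implementation: the find function below is hypothetical, supports equality matching only, and the sample field values other than customerId are made up. A real MongoDB projection also returns _id by default, which this sketch omits.

```javascript
// Sample data; values other than customerId are invented for illustration.
const orders = [
  { orderId: 99, customerId: "883c2c5b4e5b", orderDate: "2011-04-23", total: 30 },
  { orderId: 100, customerId: "aaaa", orderDate: "2011-05-01", total: 12 },
];

// Hypothetical in-memory stand-in for find(filter, projection).
function find(docs, filter = {}, projection = null) {
  // Keep documents whose fields equal every value named in the filter.
  const matched = docs.filter((doc) =>
    Object.entries(filter).every(([field, value]) => doc[field] === value)
  );
  if (!projection) return matched;
  // Copy only the fields flagged with 1 in the projection.
  return matched.map((doc) => {
    const out = {};
    for (const [field, flag] of Object.entries(projection)) {
      if (flag === 1 && field in doc) out[field] = doc[field];
    }
    return out;
  });
}

const all = find(orders);
const byCustomer = find(orders, { customerId: "883c2c5b4e5b" });
const projected = find(orders, { customerId: "883c2c5b4e5b" }, { orderId: 1, orderDate: 1 });
console.log(all.length, byCustomer.length, projected);
```

The first argument plays the role of the WHERE clause and the second plays the role of the column list in a SELECT statement.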
3 What is a Document database? Explain with an example. Explain its features briefly.
A document database is a type of NoSQL database designed to store, retrieve, and manage
document-oriented information, which is typically represented as JSON-like documents.
Unlike traditional relational databases (RDBMS), document databases allow each
"document" (similar to a row in an RDBMS) to have a unique structure. This flexibility
makes document databases well-suited for applications requiring a schema that can adapt
over time.
1. First Document:
{
  "firstname": "Martin",
  "lastcity": "Boston"
}
2. Second Document:
{
  "firstname": "Pramod",
  "citiesvisited": [ ... ],
  "addresses": [ ... ],
  "lastcity": "Chicago"
}
The second document shares some attributes with the first, such as firstname and lastcity,
but also contains citiesvisited and addresses, additional fields that the first document does not have.
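The difference between the two documents can be checked programmatically. The sketch below is plain JavaScript; the compareFields helper is hypothetical, and the empty arrays are placeholders because the array contents are not given here.

```javascript
// Two documents that could live in the same collection despite different fields.
const first = { firstname: "Martin", lastcity: "Boston" };
const second = {
  firstname: "Pramod",
  citiesvisited: [], // placeholder: contents not specified in the notes
  addresses: [], // placeholder: contents not specified in the notes
  lastcity: "Chicago",
};

// Hypothetical helper: list the fields two documents share and the extras in the second.
function compareFields(a, b) {
  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  return {
    shared: aKeys.filter((k) => bKeys.includes(k)),
    onlyInSecond: bKeys.filter((k) => !aKeys.includes(k)),
  };
}

const result = compareFields(first, second);
console.log(result);
```

No schema change is needed to store the second document; the extra fields simply exist on that document alone.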
1. Consistency
2. Transactions
3. Availability
Under the CAP theorem, a distributed system must trade off consistency, availability, and
partition tolerance. MongoDB improves availability through replica sets, which maintain
copies of the data across multiple nodes. If the primary node fails, the remaining members
elect a new primary, so data remains accessible after a brief election period. This automated
failover and data redundancy enhance MongoDB’s resilience.
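The failover behaviour can be sketched with a deliberately simplified election in plain JavaScript. The electPrimary function, the member names, and the priority values are assumptions for illustration; a real replica-set election also takes into account how up to date each member's data is.

```javascript
// Simplified stand-in for an election: among reachable members, the one with
// the highest priority becomes primary, provided a majority of the set is reachable.
function electPrimary(members) {
  const healthy = members.filter((m) => m.healthy);
  // Without a majority of members reachable, no primary can be elected.
  if (healthy.length * 2 <= members.length) return null;
  return healthy.reduce((best, m) => (m.priority > best.priority ? m : best)).name;
}

// Hypothetical 3-node replica set.
const replicaSet = [
  { name: "mongoA", priority: 3, healthy: true },
  { name: "mongoB", priority: 2, healthy: true },
  { name: "mongoC", priority: 1, healthy: true },
];

const initialPrimary = electPrimary(replicaSet);
replicaSet[0].healthy = false; // the primary fails
const newPrimary = electPrimary(replicaSet);
replicaSet[1].healthy = false; // a second node fails, so the majority is lost
const noMajority = electPrimary(replicaSet);
console.log(initialPrimary, newPrimary, noMajority);
```

With one node down the set still has a majority and promotes a secondary; with two of three nodes down no primary can be elected, so writes stop until a majority returns.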
4. Query Features
MongoDB offers a flexible JSON-based query language with operators like $query (for
filtering), $orderby (for sorting), and $explain (to view query plans). MongoDB's structure
allows querying embedded documents directly, making it straightforward to filter on nested
fields. For example, fetching documents with a certain nested field match can be done with
db.orders.find({"items.product.name": /Refactoring/}), enabling simpler queries on
aggregated data structures compared to traditional SQL joins.
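How a dotted path such as items.product.name reaches into embedded documents and arrays can be sketched in plain JavaScript. This is an illustration of the idea rather than MongoDB's matching logic: valuesAt and matchesPath are hypothetical helpers, and the sample orders are invented.

```javascript
// Collect every value reachable via a dotted path, descending into arrays on the way.
function valuesAt(value, parts) {
  if (Array.isArray(value)) return value.flatMap((v) => valuesAt(v, parts));
  if (parts.length === 0) return [value];
  if (value === null || typeof value !== "object") return [];
  return valuesAt(value[parts[0]], parts.slice(1));
}

// True when any string found at the path matches the regular expression.
function matchesPath(doc, path, regex) {
  return valuesAt(doc, path.split(".")).some(
    (v) => typeof v === "string" && regex.test(v)
  );
}

// Invented sample orders with embedded item documents.
const orders = [
  {
    orderId: 99,
    items: [
      { product: { name: "Some Other Book" } },
      { product: { name: "Refactoring" } },
    ],
  },
  { orderId: 100, items: [{ product: { name: "Yet Another Book" } }] },
];

const hits = orders.filter((o) => matchesPath(o, "items.product.name", /Refactoring/));
console.log(hits.map((o) => o.orderId));
```

Because the items are embedded in the order document, the match needs no join: the path is resolved entirely within each document.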
5. Scaling
Read Scaling: Adding more secondary nodes in a replica set allows load distribution
across these nodes, especially for read-heavy applications.
Write Scaling: MongoDB uses sharding, a form of data partitioning that distributes
data across multiple nodes based on a shard key. This enables applications to handle
higher write loads by balancing data among shards and dynamically redistributing
data as nodes are added.
Explanation: Document databases allow each document to have its own structure,
enabling flexibility and adaptability. Unlike relational databases that require a strict
schema, document databases let you store varied data within a collection without a
predefined schema. This makes them ideal for applications where data requirements
change frequently.
Benefit: This flexibility reduces the need for costly schema migrations and supports
rapid development, especially for projects with evolving data requirements.
2. Ease of Scalability
Explanation: Document databases are designed for easy horizontal scaling (adding
more servers). Scaling is often done through sharding, where data is distributed across
multiple nodes based on a shard key.
Benefit: This structure minimizes database operations needed to retrieve related data,
improving query performance and reducing latency. It’s especially useful for
applications requiring fast, complex queries, like content management systems or
personalized recommendation engines.
Explanation: Document databases can store a wide range of data types, including
arrays, nested documents, and various data structures within a single document.
JSON-like formats make them easy to read and use with many programming
languages.
Benefit: Support for rich data types allows for a more intuitive and efficient way to
represent complex data models, making document databases ideal for handling
unstructured or semi-structured data, such as user profiles, catalogs, and real-time
analytics.
These benefits make document databases particularly advantageous for applications that
demand flexibility, scalability, and performance with complex, evolving data requirements.
5 Elaborate the suitable use cases of document databases. When are document databases
not suitable? Explain.
1. Event Logging
o Explanation: Document databases are ideal for event logging since they can
store diverse types of events without requiring a rigid schema. This flexibility
is valuable for enterprise applications where logging requirements may vary
across different departments or applications.
o Example: In a CMS, you might store web pages, user-generated content, and
metadata as documents, enabling rapid adaptation to new content types or
requirements.
o Example: For a web analytics platform, documents might store data on visitor
interactions, with fields for page views, session length, and engagement
metrics that can be expanded as analytics needs grow.
4. E-Commerce Applications