Module 4 Nosql

Uploaded by

Raghu Nayak
NOSQL Database 21CS745

MODULE 4

Question Bank with Answers

1 Briefly explain the scaling features of document databases, with a neat diagram.

1. Scaling for Heavy-Read Loads:

 Approach: Adding read slaves to a replica set.

 Example Setup: In a 3-node replica-set cluster, new slave nodes can be added to
handle increased read traffic. Each new node, like mongo D, is added to the replica set
using rs.add("mongod:27017"). This node will sync with the existing nodes, joining
as a secondary node to serve read requests.

 Advantage: No downtime is required when adding new nodes. Reads can be
distributed across multiple nodes, thus balancing the load.

 Horizontal Scaling for Reads: This technique, known as read scaling, allows each
additional node to increase the read capacity of the system.
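
The read-scaling idea above can be sketched as a toy router that spreads reads round-robin across secondaries. This is only an in-memory illustration, not MongoDB driver code: the class `ReplicaSetReader` and the host names `mongob` and `mongoc` are assumptions made for the example; only `mongod:27017` comes from the text.

```python
from itertools import cycle


class ReplicaSetReader:
    """Toy model: spreads reads round-robin across secondary nodes."""

    def __init__(self, secondaries):
        self.secondaries = list(secondaries)
        self._next = cycle(self.secondaries)

    def add_secondary(self, host):
        # Mirrors the idea of rs.add(...): a node joins while reads continue.
        self.secondaries.append(host)
        self._next = cycle(self.secondaries)

    def route_read(self):
        # Return the host that should serve the next read request.
        return next(self._next)


# Hypothetical starting secondaries, then the new node from the text.
reader = ReplicaSetReader(["mongob:27017", "mongoc:27017"])
reader.add_secondary("mongod:27017")
served = [reader.route_read() for _ in range(6)]
```

After the new node is added, each of the three secondaries serves a third of the reads, which is the sense in which every added node raises read capacity.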

2. Scaling for Write Loads via Sharding:

 Sharding: This involves splitting data based on a specific key (e.g., firstname) and
distributing it across multiple nodes, or "shards." Each shard can also be configured as
a replica set to improve read performance within that shard.

 Command Example: db.runCommand({ shardCollection: "ecommerce.customer",
key: { firstname: 1 } }) distributes data across shards based on the firstname field.

 Data Distribution and Load Balancing: Shards are automatically balanced by
MongoDB to ensure an even distribution of data. New shards can be added without
application downtime, although performance may be temporarily affected while
rebalancing occurs.

 Placement Strategy: Sharding can also be based on user location, which places data
closer to the users for faster access, e.g., data for East Coast users in East Coast
servers, and West Coast data on the West Coast.
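
How a shard key decides where a document lives can be sketched with a small range-based router. This is a simplified model under stated assumptions: the split points `"H"` and `"Q"` and the shard names are hypothetical, and a real cluster chooses and moves its own ranges.

```python
from bisect import bisect_right

# Hypothetical key ranges: firstname < "H" -> shard0,
# "H" <= firstname < "Q" -> shard1, everything else -> shard2.
SPLIT_POINTS = ["H", "Q"]
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(firstname):
    """Pick the shard whose key range contains this firstname."""
    return SHARDS[bisect_right(SPLIT_POINTS, firstname)]


customers = [
    {"firstname": "Martin"},
    {"firstname": "Pramod"},
    {"firstname": "Alice"},
    {"firstname": "Zoe"},
]
placement = {c["firstname"]: shard_for(c["firstname"]) for c in customers}
```

Because every write carries the shard key, each write goes to exactly one shard, so adding shards spreads the write load.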

Replica Sets in Sharded Clusters:

 Each shard in a sharded cluster can be set up as a replica set, combining the benefits
of sharding with replication (as seen in Figure 9.3). This setup enables improved read
and write performance, as each shard can serve both as an independent replica set and
a distributed part of the overall dataset.

Koustav Biswas, Dept. Of CSE, DSATM

2 Describe some example queries to use with document databases


Query Features in Document Databases

Document databases, like CouchDB and MongoDB, offer flexible query options that make
complex data retrieval easier compared to traditional key-value stores.

1. Views in CouchDB:

 Materialized and Dynamic Views: CouchDB supports querying through views,
similar to RDBMS views, which can be materialized or dynamic. Materialized views
store precomputed query results, so when there's a high volume of requests (e.g.,
counting reviews or averaging ratings), the data does not have to be recalculated on
each request. The view updates automatically when data changes, improving
performance for frequent, complex queries.

 Map-Reduce for Aggregation: CouchDB allows implementing views using
map-reduce. For example, you can create a view to count the number of reviews and
calculate the average rating of a product.
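
The review-count and average-rating view mentioned above can be sketched as a map step and a reduce step. This is plain Python that imitates the idea, not CouchDB's view API; the sample reviews, the product name "Patterns", and the function names are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical review documents.
reviews = [
    {"product": "Refactoring", "rating": 5},
    {"product": "Refactoring", "rating": 4},
    {"product": "Patterns", "rating": 3},
]


def map_review(doc):
    # Map step: emit one (key, value) pair per review document.
    yield doc["product"], (1, doc["rating"])


def reduce_reviews(values):
    # Reduce step: combine (count, rating_sum) pairs for one key.
    count = sum(c for c, _ in values)
    total = sum(s for _, s in values)
    return count, total


def build_view(docs):
    """Group mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_review(doc):
            grouped[key].append(value)
    view = {}
    for key, values in grouped.items():
        count, total = reduce_reviews(values)
        view[key] = {"count": count, "average": total / count}
    return view


view = build_view(reviews)
```

A materialized view would store the result of `build_view` and update it when reviews change, rather than recomputing it on each request.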


2. Querying Document Content:

 Unlike key-value stores, document databases allow querying based on fields within
documents without needing to retrieve the entire document by its key. This capability
brings them closer to RDBMS-style querying.

3. MongoDB Query Language:

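 JSON-based Syntax: MongoDB expresses queries as JSON-like documents. Older versions
exposed operators such as $query for filtering (WHERE clause), $orderby for sorting, and
$explain to display the execution plan; current shells and drivers pass the filter to
find() and use the sort() and explain() cursor methods instead.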

 Example Queries:

o Retrieve All Documents:

 SQL: SELECT * FROM order

 MongoDB: db.order.find()

o Filter by customerId:

 SQL: SELECT * FROM order WHERE customerId = "883c2c5b4e5b"

 MongoDB: db.order.find({"customerId":"883c2c5b4e5b"})

o Select Specific Fields:

 SQL: SELECT orderId, orderDate FROM order WHERE customerId =
"883c2c5b4e5b"

 MongoDB:
db.order.find({customerId:"883c2c5b4e5b"},{orderId:1,orderDate:1})
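
The filter-plus-projection behaviour of the queries above can be imitated in a few lines. This is a minimal in-memory sketch, not the MongoDB driver: the `find` helper, the order dates, and the second customer id are hypothetical, and unlike MongoDB it does not return an `_id` field by default.

```python
def find(collection, query=None, projection=None):
    """Return documents whose fields equal every query value.

    If a projection (list of field names) is given, keep only those fields.
    """
    query = query or {}
    results = []
    for doc in collection:
        if all(doc.get(field) == value for field, value in query.items()):
            if projection:
                doc = {f: doc[f] for f in projection if f in doc}
            results.append(doc)
    return results


# Hypothetical order documents.
orders = [
    {"orderId": 99, "customerId": "883c2c5b4e5b", "orderDate": "2011-04-23"},
    {"orderId": 100, "customerId": "aaaa", "orderDate": "2011-05-01"},
]

all_orders = find(orders)
selected = find(orders, {"customerId": "883c2c5b4e5b"}, ["orderId", "orderDate"])
```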

4. Aggregated and Embedded Data Querying:

 MongoDB's structure allows querying embedded documents directly, which avoids the
multi-table joins that SQL databases need. For instance, to find orders where
a product with the name "Refactoring" is ordered, MongoDB allows querying child
objects within documents:

SQL (using joins):

SELECT * FROM customerOrder, orderItem, product
WHERE customerOrder.orderId = orderItem.customerOrderId
AND orderItem.productId = product.productId
AND product.name LIKE '%Refactoring%'


MongoDB:

db.orders.find({"items.product.name": /Refactoring/})

This embedded structure in MongoDB enables simpler and more efficient querying for
related data within a single document.
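
What a dotted path such as items.product.name does can be sketched as a walk through nested dictionaries and lists. This is an illustrative model only: the helpers `values_at` and `find_matching` and the sample orders are assumptions, and MongoDB's own matching rules have more cases than shown here.

```python
import re


def values_at(doc, path):
    """Yield every value reachable via a dotted path, descending into lists."""
    if not path:
        yield doc
        return
    head, _, rest = path.partition(".")
    if isinstance(doc, list):
        # Apply the same path to every element of an embedded array.
        for item in doc:
            yield from values_at(item, path)
    elif isinstance(doc, dict) and head in doc:
        yield from values_at(doc[head], rest)


def find_matching(collection, path, pattern):
    """Return documents where any value at the path matches the regex."""
    regex = re.compile(pattern)
    return [
        doc
        for doc in collection
        if any(isinstance(v, str) and regex.search(v) for v in values_at(doc, path))
    ]


# Hypothetical orders with embedded items.
orders = [
    {"orderId": 1, "items": [{"product": {"name": "Refactoring"}},
                             {"product": {"name": "NoSQL Distilled"}}]},
    {"orderId": 2, "items": [{"product": {"name": "Domain-Driven Design"}}]},
]
hits = find_matching(orders, "items.product.name", r"Refactoring")
```

No join is needed because the items and their products already live inside each order document.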

3 What is a Document database? Explain with an example. Explain its features briefly.
A document database is a type of NoSQL database designed to store, retrieve, and manage
document-oriented information, which is typically represented as JSON-like documents.
Unlike traditional relational databases (RDBMS), document databases allow each
"document" (similar to a row in an RDBMS) to have a unique structure. This flexibility
makes document databases well-suited for applications requiring a schema that can adapt
over time.

Example Documents in a Document Database

Consider the two sample documents:

1. First Document:

{
"firstname": "Martin",
"likes": ["Biking", "Photography"],
"lastcity": "Boston"
}

This document includes a firstname, a list of likes, and a lastcity attribute.

2. Second Document:

{
"firstname": "Pramod",
"citiesvisited": ["Chicago", "London", "Pune", "Bangalore"],
"addresses": [
{ "state": "AK", "city": "DILLINGHAM", "type": "R" },
{ "state": "MH", "city": "PUNE", "type": "R" }
],
"lastcity": "Chicago"
}


This document shares some attributes, such as firstname and lastcity, but also contains
citiesvisited and addresses—additional fields that the first document does not have.
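
A short sketch shows what this flexibility means for application code: both documents sit in one collection, and readers must tolerate fields that are absent. The variable names are assumptions; the documents are the two samples above.

```python
martin = {
    "firstname": "Martin",
    "likes": ["Biking", "Photography"],
    "lastcity": "Boston",
}
pramod = {
    "firstname": "Pramod",
    "citiesvisited": ["Chicago", "London", "Pune", "Bangalore"],
    "addresses": [
        {"state": "AK", "city": "DILLINGHAM", "type": "R"},
        {"state": "MH", "city": "PUNE", "type": "R"},
    ],
    "lastcity": "Chicago",
}

# One "collection" holding documents with different structures.
people = [martin, pramod]

# Fields present in every document.
shared_fields = set.intersection(*(set(p) for p in people))

# A missing field is handled with a default instead of a schema change.
cities = {p["firstname"]: p.get("citiesvisited", []) for p in people}
```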

Features of Document Databases

1. Consistency

MongoDB ensures consistency primarily through replica sets. By configuring write
operations with the desired WriteConcern level, you can control how many nodes a write
operation needs to propagate to before it’s considered successful. This is adjusted using
commands like db.runCommand({ getlasterror: 1, w: "majority" }), where the w parameter
specifies the number of nodes required to acknowledge a write. The trade-off here involves
balancing write consistency against performance.
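
The effect of the w parameter can be sketched as a simple acknowledgment count. This is a simplified model, assuming every node counts toward the majority; the function name `write_acknowledged` is hypothetical and is not part of any MongoDB driver.

```python
def write_acknowledged(acks, total_nodes, w):
    """Return True once enough nodes have acknowledged the write.

    w is either an integer node count or the string "majority".
    """
    if w == "majority":
        required = total_nodes // 2 + 1
    else:
        required = int(w)
    return acks >= required


# In a 3-node replica set, "majority" means 2 acknowledgments.
one_ack_majority = write_acknowledged(1, 3, "majority")
two_acks_majority = write_acknowledged(2, 3, "majority")
one_ack_w1 = write_acknowledged(1, 3, 1)
```

A higher w waits for more nodes, so the write is safer but slower, which is the consistency versus performance trade-off described above.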

2. Transactions

MongoDB guarantees atomicity at the single-document level: a write to one document either
applies completely or not at all. The multi-document transactions typically found in an
RDBMS were not available in early versions; MongoDB 4.0 and later add them for replica
sets. You can control the acknowledgment level of a write operation with WriteConcern,
such as WriteConcern.REPLICAS_SAFE, which ensures that writes reach multiple nodes
before being considered successful. This gives strong write guarantees to applications
that need reliable multi-node writes.

3. Availability

In CAP terms, a MongoDB replica set accepts writes only on its primary, so it is usually
classified as favoring consistency and partition tolerance; reads served by secondaries
may be slightly stale. Availability is enhanced through replica sets, which maintain data
across multiple nodes. If the primary node fails, the replica set elects a new primary, so
data remains accessible after a brief election. This automated failover and data redundancy
enhance MongoDB’s resilience.
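
Automated failover can be sketched as a much-simplified election: a new primary is chosen only if a majority of nodes is reachable, and the most up-to-date reachable node wins. Real elections also involve priorities, votes, and terms; the node names, the `optime` counter, and the function `elect_primary` are assumptions for illustration.

```python
def elect_primary(nodes):
    """Return the name of the new primary, or None without a majority.

    nodes: list of dicts with keys "name", "up", and "optime"
    (a counter standing in for how recent the node's data is).
    """
    reachable = [n for n in nodes if n["up"]]
    if len(reachable) <= len(nodes) // 2:
        return None  # No majority: the set cannot accept writes.
    return max(reachable, key=lambda n: n["optime"])["name"]


# Hypothetical 3-node set where the old primary (mongoa) has failed.
nodes = [
    {"name": "mongoa", "up": False, "optime": 10},
    {"name": "mongob", "up": True, "optime": 9},
    {"name": "mongoc", "up": True, "optime": 10},
]
new_primary = elect_primary(nodes)
```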

4. Query Features

MongoDB offers a flexible JSON-based query language. Older versions exposed operators
like $query (for filtering), $orderby (for sorting), and $explain (to view query plans);
current versions pass the filter to find() and use sort() and explain(). MongoDB's structure
allows querying embedded documents directly, making it straightforward to filter on nested
fields. For example, fetching documents with a certain nested field match can be done with
db.orders.find({"items.product.name": /Refactoring/}), enabling simpler queries on
aggregated data structures compared to traditional SQL joins.

5. Scaling

MongoDB supports horizontal scaling for both reads and writes:

 Read Scaling: Adding more secondary nodes in a replica set allows load distribution
across these nodes, especially for read-heavy applications.


 Write Scaling: MongoDB uses sharding, a form of data partitioning that distributes
data across multiple nodes based on a shard key. This enables applications to handle
higher write loads by balancing data among shards and dynamically redistributing
data as nodes are added.
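
The redistribution that happens when a shard is added can be sketched as moving chunks from the fullest shard to the emptiest one until the counts are even. This is a toy balancer, not MongoDB's actual algorithm; the shard names, chunk ids, and the function `rebalance` are assumptions.

```python
def rebalance(shards):
    """Even out chunk counts across shards; return the number of moves.

    shards: dict mapping shard name -> list of chunk ids (modified in place).
    """
    moves = 0
    while True:
        fullest = max(shards, key=lambda s: len(shards[s]))
        emptiest = min(shards, key=lambda s: len(shards[s]))
        if len(shards[fullest]) - len(shards[emptiest]) <= 1:
            return moves
        # Migrate one chunk from the fullest shard to the emptiest.
        shards[emptiest].append(shards[fullest].pop())
        moves += 1


# Two shards with six chunks each, then an empty third shard is added.
cluster = {"shard0": list(range(0, 6)), "shard1": list(range(6, 12))}
cluster["shard2"] = []
moves = rebalance(cluster)
```

Each move stands for a chunk migration, which is why performance can dip temporarily while a new shard is being filled.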

4 List and explain the benefits of document databases.


Four key benefits of document databases:

1. Flexible Schema Design

 Explanation: Document databases allow each document to have its own structure,
enabling flexibility and adaptability. Unlike relational databases that require a strict
schema, document databases let you store varied data within a collection without a
predefined schema. This makes them ideal for applications where data requirements
change frequently.

 Benefit: This flexibility reduces the need for costly schema migrations and supports
rapid development, especially for projects with evolving data requirements.

2. Ease of Scalability

 Explanation: Document databases are designed for easy horizontal scaling (adding
more servers). Scaling is often done through sharding, where data is distributed across
multiple nodes based on a shard key.

 Benefit: Scalability in document databases enables them to handle large volumes of
data and high traffic, making them well-suited for applications needing high
availability and scalability, such as e-commerce or social media platforms.

3. Efficient Data Storage and Retrieval

 Explanation: Documents in a document database store related information together,
allowing for faster access. Data is often stored in a hierarchical structure with nested
sub-documents, reducing the need for joins.

 Benefit: This structure minimizes database operations needed to retrieve related data,
improving query performance and reducing latency. It’s especially useful for
applications requiring fast, complex queries, like content management systems or
personalized recommendation engines.

4. Support for Rich Data Types

 Explanation: Document databases can store a wide range of data types, including
arrays, nested documents, and various data structures within a single document.
JSON-like formats make them easy to read and use with many programming
languages.

 Benefit: Support for rich data types allows for a more intuitive and efficient way to
represent complex data models, making document databases ideal for handling
unstructured or semi-structured data, such as user profiles, catalogs, and real-time
analytics.

These benefits make document databases particularly advantageous for applications that
demand flexibility, scalability, and performance with complex, evolving data requirements.

5 Elaborate the suitable use cases of document databases. When are document databases
not suitable? Explain.

Suitable Use Cases for Document Databases

1. Event Logging

o Explanation: Document databases are ideal for event logging since they can
store diverse types of events without requiring a rigid schema. This flexibility
is valuable for enterprise applications where logging requirements may vary
across different departments or applications.

o Example: Events could be logged by application name or event type (e.g.,
order_processed, customer_logged) to make it easy to organize and retrieve
specific events.

2. Content Management Systems (CMS) and Blogging Platforms

o Explanation: Document databases support JSON-like documents, making
them a good choice for CMSs and blogging platforms. They allow for easy
storage of user profiles, posts, comments, and other web-facing content,
without the need for predefined schemas.

o Example: In a CMS, you might store web pages, user-generated content, and
metadata as documents, enabling rapid adaptation to new content types or
requirements.

3. Web Analytics or Real-Time Analytics

o Explanation: Document databases facilitate real-time analytics by allowing
document updates for metrics like page views and unique visitors. The ability
to add new fields without schema changes is valuable for applications that
need to track evolving metrics.


o Example: For a web analytics platform, documents might store data on visitor
interactions, with fields for page views, session length, and engagement
metrics that can be expanded as analytics needs grow.

4. E-Commerce Applications

o Explanation: E-commerce applications often require flexible schemas for
product catalogs and orders. Document databases enable these applications to
evolve their data models with minimal effort, accommodating new product
attributes or order details as business requirements change.

o Example: An e-commerce store can easily store products with varied
specifications (e.g., clothing with size and color options, electronics with
model-specific features) in a document database, allowing for a dynamic
product catalog.
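
The real-time analytics use case above comes down to updating a per-page document in place. The sketch below imitates an upsert that bumps a counter and records unique visitors, loosely analogous to MongoDB's $inc and $addToSet update operators. The function `record_view`, the field names, and the sample pages are assumptions.

```python
def record_view(metrics, page, visitor):
    """Create the page document if needed, then update its counters."""
    doc = metrics.setdefault(page, {"page": page, "views": 0, "visitors": []})
    doc["views"] += 1  # Similar in spirit to $inc.
    if visitor not in doc["visitors"]:
        doc["visitors"].append(visitor)  # Similar in spirit to $addToSet.
    return doc


# Hypothetical traffic: two visitors, three page views on /home.
metrics = {}
record_view(metrics, "/home", "alice")
record_view(metrics, "/home", "bob")
record_view(metrics, "/home", "alice")
record_view(metrics, "/pricing", "bob")

# A new metric can be added to one document without a schema change.
metrics["/home"]["avg_session_seconds"] = 42
```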

When Not to Use Document Databases

1. Complex Transactions Spanning Different Operations

o Explanation: Document databases are typically not ideal for applications
requiring atomic, multi-document transactions. While some document
databases (like RavenDB) support this, relational databases are often better
suited for applications with high transactional integrity requirements.

2. Queries Against Varying Aggregate Structures

o Explanation: Document databases do not enforce a strict schema, which can
complicate ad hoc queries when data structures frequently change. If the data
model is dynamic and requires frequent changes in structure, this could lead to
inconsistencies and make querying difficult.

o Example: If the design constantly changes and querying depends on
normalized data, a relational database with defined schemas may be more
suitable for managing such evolving, structured data.

----------------------------------------END OF MODULE 4----------------------------------------------
