MODULE 3 (3)

Notes of engineering subject

Uploaded by

Bhagya K S

Q1 Properties of NoSQL Databases

1. Schema Flexibility:
○ NoSQL databases often have a schema-less or flexible schema model. This
means they can store unstructured or semi-structured data without needing to
define the structure in advance.
○ This allows data to evolve over time, making it ideal for use cases where data
requirements change frequently.
2. Horizontal Scalability:
○ NoSQL databases are designed for horizontal scaling, meaning you can add
more servers or nodes to handle increased load, rather than upgrading a
single server (vertical scaling).
○ This is crucial for handling large-scale data and provides cost-effective scaling
solutions for growing data needs.
3. Replication:
○ NoSQL systems often support data replication across multiple nodes to
ensure high availability and fault tolerance.
○ This means that even if some nodes fail, the data remains accessible from
other replicated nodes.
4. Sharding (Partitioning):
○ Sharding involves breaking down a large dataset into smaller, more
manageable pieces, called shards, which are stored across multiple servers.
○ Each shard contains a subset of the total data and operates independently,
helping distribute the data and load across the system.
5. BASE Properties:
○ Many NoSQL databases follow BASE (Basically Available, Soft state, Eventual
consistency) rather than the ACID (Atomicity, Consistency, Isolation, Durability)
properties provided by relational databases.
■ Basically Available: The system guarantees availability, but not
always consistency.
■ Soft state: The state of the system may change over time, even
without input, due to eventual consistency.
■ Eventual consistency: The system will eventually reach a consistent
state, but not immediately after a transaction.
6. CAP Theorem:
○ The CAP theorem states that a distributed database cannot guarantee all three
of the following properties at once. Because network partitions cannot be ruled
out, the practical choice during a partition is between consistency and availability:
■ Consistency: Every read receives the most recent write or an error.
■ Availability: Every request receives a response, without guarantee
that it contains the most recent write.
■ Partition Tolerance: The system continues to operate despite
network partition failures.
○ Many NoSQL databases favor availability and partition tolerance over
strict consistency, although some (HBase is a commonly cited example) favor
consistency instead.
7. Integrated Caching:
○ Many NoSQL databases come with built-in caching capabilities to improve
performance by storing frequently accessed data in memory.
8. Support for Semi-Structured Data:
○ NoSQL databases can handle semi-structured data formats, such as JSON or
XML, allowing for flexible data models that adapt to varying data types and
structures.
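
The BASE behaviour in point 5 can be illustrated with a small simulation. This is a minimal sketch, not how any particular database is implemented: the `Replica` class, the `anti_entropy` function, and the last-write-wins rule based on a numeric timestamp are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node holding a copy of the data as key -> (timestamp, value)."""
    name: str
    data: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Keep the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Exchange state so every replica ends up with the newest version of each key."""
    for source in replicas:
        for key, (timestamp, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, timestamp)


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[0].write("cart:42", ["book"], timestamp=1)  # accepted by one node only
stale_read = replicas[1].read("cart:42")              # not yet replicated
anti_entropy(replicas)                                # background synchronization
converged = [r.read("cart:42") for r in replicas]
```

Before synchronization a read from another replica misses the write (the system stays available but is not consistent); after synchronization all replicas agree, which is the "eventual" part of eventual consistency.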

Q2 Architecture of NoSQL Databases

NoSQL databases are designed with an architecture that supports scalability, flexibility, and
high availability. Their architecture is tailored to handle large-scale data operations and is
optimized for distributed computing. Below are the key aspects of NoSQL architecture:

1. Distributed System Architecture:

● Cluster-Based Systems: NoSQL databases operate on a cluster of machines or
nodes. This distributed setup allows data to be stored and processed across multiple
nodes, enabling scalability and fault tolerance.
● Shared-Nothing Architecture: Each node operates independently, with no shared
memory or storage, so nodes do not contend for a common resource. Combined with
replication, this avoids a single point of failure and supports high availability.

2. Data Distribution:

● Sharding: NoSQL databases use sharding to partition data across multiple nodes.
Each shard contains a subset of the data and operates as an independent database.
Sharding helps distribute the load and manage large datasets efficiently.
● Replication: Data replication ensures that copies of the data exist on multiple nodes.
This setup provides fault tolerance and improves data availability, as data can still be
accessed even if a node fails.
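
Sharding and replication can be sketched together in a few lines. The example below is an illustration under stated assumptions: the node names, the replication factor of 3, the MD5-modulo placement, and the "next nodes in the list hold the copies" rule are all hypothetical choices, not the scheme of a specific database.

```python
import hashlib

NODES = ["node-0", "node-1", "node-2", "node-3"]  # hypothetical cluster
REPLICATION_FACTOR = 3                             # assumed number of copies


def shard_index(key: str, shard_count: int) -> int:
    """Map a key to a shard with a stable hash (Python's hash() varies per process)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def replica_nodes(key: str) -> list[str]:
    """Return the node that owns the key plus the next nodes holding its copies."""
    first = shard_index(key, len(NODES))
    return [NODES[(first + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


def readable_nodes(key: str, failed: set[str]) -> list[str]:
    """Nodes that can still serve the key when some nodes are down."""
    return [node for node in replica_nodes(key) if node not in failed]
```

Calling `readable_nodes("user:1001", {replica_nodes("user:1001")[0]})` still returns two nodes, which is the fault tolerance that replication adds on top of sharding. Plain modulo placement moves most keys when the node count changes, which is one reason real systems tend to use more elaborate partitioning schemes.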

3. Consistency Models:

● BASE Properties: NoSQL systems often follow BASE (Basically Available, Soft
state, Eventual consistency), which means the system guarantees availability and will
eventually become consistent.
● CAP Theorem: NoSQL databases trade off Consistency, Availability, and
Partition Tolerance. According to the CAP theorem, a distributed database cannot
guarantee all three at once; when a network partition occurs, it must give up
either consistency or availability.

4. Scalability:

● Horizontal Scaling: NoSQL architectures support horizontal scaling, where more
nodes can be added to the system to handle increasing data loads, rather than
upgrading the capacity of a single node (vertical scaling).
● Elasticity: Many NoSQL systems can scale in and out seamlessly as data and
processing demands change.
5. Replication and Fault Tolerance:

● Master-Slave Architecture: Some NoSQL databases use a master-slave model
where the master node handles write operations and the slave nodes replicate the
master’s data for read operations. This can improve read performance and data
redundancy but may have limitations on write scalability.
● Peer-to-Peer Architecture: Databases like Cassandra use a peer-to-peer
architecture where each node can accept read and write requests. This approach
ensures that there is no single point of failure and allows for better load distribution.

6. Storage Models:

● Key-Value Store: Data is stored as key-value pairs, providing simple, fast access
(e.g., Redis, DynamoDB).
● Document Store: Data is stored as documents (e.g., MongoDB, CouchDB) that can
have complex nested structures, typically in JSON or BSON formats.
● Column-Family Store: Data is stored in rows whose columns are grouped into
column families (e.g., Cassandra, HBase). Rows in the same table can hold different
columns, which suits wide, sparse data and queries that read a subset of columns.
● Graph Database: Data is stored in nodes and edges to represent relationships (e.g.,
Neo4j), focusing on the connections between data points.
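
To make the four storage models concrete, the sketch below represents one small piece of data in each shape using plain Python structures. The sample names, keys, and the `followers_of` helper are invented for illustration and do not correspond to the API of any of the databases listed.

```python
# Key-value: an opaque value looked up by key.
key_value = {"user:1": b'{"name": "Asha", "city": "Pune"}'}

# Document: the store understands the nested structure of the value.
document = {"_id": 1, "name": "Asha", "address": {"city": "Pune"}}

# Column-family (wide-column): row key -> column family -> columns.
column_family = {
    "user:1": {
        "profile": {"name": "Asha"},
        "location": {"city": "Pune"},
    }
}

# Graph: nodes plus labelled edges between them.
graph_nodes = {1: {"name": "Asha"}, 2: {"name": "Ravi"}}
graph_edges = [(1, "FOLLOWS", 2)]


def followers_of(node_id):
    """Traverse edges to find who follows the given node."""
    return [src for src, label, dst in graph_edges if label == "FOLLOWS" and dst == node_id]
```

The difference lies in what the store can see: a key-value store only sees bytes, a document store can address `address.city`, a column-family store can read one family without the others, and a graph store answers relationship questions such as `followers_of(2)`.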

7. Integrated Caching:

● Many NoSQL databases come with built-in caching mechanisms to store frequently
accessed data in memory. This enhances the performance of read operations by
reducing the need to access disk storage.

Q3 MongoDB: Sharding and Replication

MongoDB is a NoSQL database that supports horizontal scaling and high availability
through its sharding and replication mechanisms. These features are crucial for managing
large data volumes and ensuring that data is available even in the event of node failures.

1. Sharding in MongoDB:

● Definition: Sharding is a method used to distribute large datasets across multiple
servers or nodes. It helps balance the load and allows MongoDB to scale
horizontally.
● How It Works:
○ Shard Key: MongoDB uses a shard key to determine how data is partitioned
across different shards. The choice of a shard key is critical because it
impacts the distribution and performance of the database.
○ Data Distribution: Each shard stores a subset of the overall dataset. When a
query is made, MongoDB routes the query to the relevant shard(s) using the
shard key.
○ Sharded Cluster Components:
■ Config Servers: Store metadata and the configuration details of the
cluster.
■ Mongos: Acts as a query router, directing operations to the
appropriate shards.
■ Shards: The data-bearing nodes in the cluster that store a portion of
the data.
● Advantages:
○ Horizontal Scalability: Allows for the addition of more shards to handle
larger datasets or increased traffic.
○ Load Balancing: Distributes data and read/write operations across
shards. How even the distribution is depends on the choice of shard key.
○ Increased Write Capacity: By partitioning data, write operations can be
distributed to different shards, reducing the load on individual servers.
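
The routing role of mongos can be sketched as a lookup over shard-key ranges. This is a simplified model, not MongoDB's implementation: the shard names, the `customer_id` shard key, and the chunk boundaries are assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical chunk map: each shard owns shard-key values from its lower
# bound (inclusive) up to the next lower bound (exclusive).
CHUNK_LOWER_BOUNDS = [0, 1000, 2000]
CHUNK_SHARDS = ["shardA", "shardB", "shardC"]


def shard_for(customer_id: int) -> str:
    """Find the shard whose range contains the given shard-key value."""
    if customer_id < CHUNK_LOWER_BOUNDS[0]:
        raise ValueError("shard key below the first chunk boundary")
    return CHUNK_SHARDS[bisect_right(CHUNK_LOWER_BOUNDS, customer_id) - 1]


def route(query: dict) -> list[str]:
    """Target one shard when the query includes the shard key, else ask all shards."""
    if "customer_id" in query:
        return [shard_for(query["customer_id"])]
    return list(CHUNK_SHARDS)
```

A query such as `route({"customer_id": 1500})` goes to a single shard, while `route({"name": "x"})` has to be sent to every shard. This is why the choice of shard key affects performance: queries that omit it cannot be targeted.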

2. Replication in MongoDB:

● Definition: Replication is the process of synchronizing data across multiple servers.
In MongoDB, replication is implemented using a replica set.
● Replica Set Components:
○ Primary Node: Handles all write operations and replicates data to secondary
nodes.
○ Secondary Nodes: Receive and replicate data from the primary node. They
can also handle read operations if enabled for read preference.
○ Arbiter: A lightweight node that does not hold data but helps in electing a
new primary during failover.
● How It Works:
○ Data Replication: The primary node receives all write operations and records
them in the oplog (operation log). The secondary nodes continuously
replicate this log and apply the operations to maintain synchronization.
○ Automatic Failover: If the primary node fails, an election takes place among
the secondary nodes, and one of them is promoted to the new primary.
● Advantages:
○ High Availability: Ensures data is available even if a node fails, improving
the system's resilience.
○ Data Redundancy: Multiple copies of data are maintained across different
nodes, enhancing fault tolerance.
○ Read Scalability: Allows reads to be distributed across secondary nodes,
reducing the load on the primary node.
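
The oplog-based replication and failover described above can be modelled in a few lines. This is a deliberately simplified sketch: the `Member` class and the "longest oplog wins" election rule are assumptions for illustration, and real MongoDB elections involve votes and priorities that are omitted here.

```python
class Member:
    """A replica set member with its own copy of the data and an operation log."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.oplog = []  # ordered list of (key, value) operations

    def apply(self, op):
        key, value = op
        self.data[key] = value
        self.oplog.append(op)


def write(primary, key, value):
    """All writes go to the primary, which records them in its oplog."""
    primary.apply((key, value))


def sync(secondary, primary):
    """A secondary copies and applies the oplog entries it has not seen yet."""
    for op in primary.oplog[len(secondary.oplog):]:
        secondary.apply(op)


def elect(secondaries):
    """Simplified election: promote the secondary with the longest oplog."""
    return max(secondaries, key=lambda member: len(member.oplog))


primary, s1, s2 = Member("p"), Member("s1"), Member("s2")
write(primary, "x", 1)
sync(s1, primary)
sync(s2, primary)
write(primary, "x", 2)
sync(s1, primary)               # s2 now lags behind by one operation
new_primary = elect([s1, s2])   # the old primary is assumed to have failed here
```

In this run `s1` is promoted because it has applied every operation, while `s2` still holds the older value. That lag is also why reads from secondaries can return stale data.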

Combining Sharding and Replication:

● Sharded Replicated Clusters: MongoDB can combine both sharding and replication
by having each shard in a sharded cluster configured as a replica set. This
configuration provides the benefits of horizontal scaling (via sharding) and high
availability (via replication).
● Operational Benefits:
○ Scalable and Fault-Tolerant: The combination allows MongoDB to scale
data across multiple nodes while ensuring that data remains available during
node failures.
○ Improved Performance: Write operations can be distributed across shards,
and read operations can be scaled out to secondary nodes.

In summary, sharding and replication in MongoDB work together to provide a scalable,
high-performance, and fault-tolerant database system suitable for handling large-scale data
and high-availability requirements.

Q4 Master-Slave Architecture

Master-Slave is a replication model used in distributed databases to ensure data
redundancy, high availability, and load balancing. In this architecture, one node acts as the
master (or primary), while the other nodes act as slaves (or replicas).

1. Overview of Master-Slave Architecture:

● Master Node:
○ Handles all the write operations in the system.
○ Updates made on the master node are propagated to the slave nodes.
○ Maintains the main copy of the data and ensures data consistency for writes.
● Slave Nodes:
○ Replicate the data from the master node and can handle read operations.
○ Provide additional copies of the data to distribute the load and improve read
performance.
○ Typically configured to be read-only to avoid write conflicts.

2. How It Works:

● Write Operations: All write operations are directed to the master node. Once the
master processes the write, it propagates the change to the slave nodes.
● Read Operations: Read operations can be performed on the master or distributed
across the slave nodes to balance the load and enhance read performance.
● Replication Process:
○ The master node logs all changes in a replication log or oplog.
○ The slave nodes pull updates from this log to maintain synchronization with
the master.
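
The read/write split described in this section might look like the following router. It is a minimal sketch: the `MasterSlaveRouter` class, the node names, and the round-robin read policy are assumptions made for the example rather than the behaviour of a specific product.

```python
from itertools import cycle


class MasterSlaveRouter:
    """Send writes to the master and spread reads over the slaves."""

    def __init__(self, master: str, slaves: list[str]):
        self.master = master
        # Fall back to the master for reads when there are no slaves.
        self._read_targets = cycle(slaves or [master])

    def target_for(self, operation: str) -> str:
        if operation in {"insert", "update", "delete"}:
            return self.master
        if operation == "read":
            return next(self._read_targets)
        raise ValueError(f"unknown operation: {operation}")


router = MasterSlaveRouter("master", ["slave-1", "slave-2"])
targets = [router.target_for(op) for op in ["insert", "read", "read", "read", "update"]]
```

Every write lands on the single master, while reads rotate over the slaves. Adding slaves therefore increases read capacity but does nothing for write capacity.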

3. Advantages:

● High Availability: The system can still function if a slave node fails, as the master
continues to operate and can direct traffic to the remaining slaves.
● Improved Read Performance: Distributing read operations to slave nodes reduces
the load on the master, enabling better performance for high read-intensive
applications.
● Data Redundancy: Ensures data backup and safety by maintaining multiple copies
of the data across slave nodes.

4. Disadvantages:

● Single Point of Failure: The master node is a single point of failure in this model. If it
goes down, write operations halt until a new master is elected or manually
configured.
● Consistency Lag: Slave nodes may experience a lag in data synchronization,
leading to temporary inconsistencies between the master and slaves.
● Limited Scalability for Writes: Since all write operations are handled by the master
node, the architecture can become a bottleneck if the write load increases
significantly.

5. Use Cases:

● Read-Heavy Applications: Ideal for scenarios where the application has more read
operations than write operations, such as data analytics or reporting.
● Backup and Failover: Can be used to create backup copies of the database, and in
the event of a failure, one of the slaves can be promoted to act as the new master.

Implementation in MongoDB:

● Older MongoDB versions supported master-slave replication. It was deprecated
in favor of replica sets, which add automatic failover, and was removed in
MongoDB 4.0.
● In a replica set, the same roles appear under different names:
○ The primary node handles all write operations.
○ The secondary nodes replicate data from the primary and handle
read operations as needed.

Master-slave replication is foundational for providing data reliability and read scalability, but it
has limitations that modern replication models, such as peer-to-peer or multi-master
architectures, aim to address.

Q5 CRUD Operations in Cassandra

Cassandra is a distributed NoSQL database known for handling large volumes of data
across multiple nodes with high availability and fault tolerance. It uses the Cassandra Query
Language (CQL) to perform CRUD (Create, Read, Update, Delete) operations. Below is an
explanation of how these operations work in Cassandra:

1. Create Operation:

● Purpose: To insert data into a table.

● Command: The INSERT statement is used to add data to a table.

Example:
INSERT INTO keyspace_name.table_name (column1, column2, column3)
VALUES ('value1', 'value2', 'value3');

● Explanation: The INSERT command specifies the keyspace, table, and column
values to be added. If a row with the same primary key already exists, the INSERT
overwrites the supplied columns, so it acts as an UPDATE.
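
The INSERT-as-UPDATE behaviour can be modelled with a dictionary. This is only an in-memory illustration: the `table` dictionary and `insert` function are hypothetical stand-ins, the sample values are invented, and details such as write timestamps are left out.

```python
table = {}  # primary key -> {column: value}; a stand-in for a Cassandra table


def insert(primary_key, **columns):
    """Upsert: create the row if missing, otherwise overwrite the given columns."""
    row = table.setdefault(primary_key, {})
    row.update(columns)


insert("u1", name="Asha", email="asha@example.com")
insert("u1", email="asha@example.org")  # same key: behaves like an UPDATE
```

After both calls the table still holds a single row for `u1`, with the original `name` and the newer `email`.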

2. Read Operation:

● Purpose: To retrieve data from a table.

● Command: The SELECT statement is used to query data.

Example:
SELECT column1, column2 FROM keyspace_name.table_name
WHERE primary_key_column = 'value';

● Explanation: The SELECT statement can retrieve specific columns or all columns
using SELECT *. The WHERE clause filters rows; in Cassandra it normally has to
restrict the partition key (or an indexed column), because filtering on arbitrary
columns requires ALLOW FILTERING and can be slow.

3. Update Operation:

● Purpose: To modify existing data in a table.

● Command: The UPDATE statement is used for changing data.

Example:
UPDATE keyspace_name.table_name
SET column1 = 'new_value'
WHERE primary_key_column = 'value';

● Explanation: The UPDATE command specifies the table, column(s) to be updated,
and the condition to identify the rows. The WHERE clause must include the primary
key to locate the specific row(s) for updating.

4. Delete Operation:

● Purpose: To remove data from a table.

● Command: The DELETE statement is used for data deletion.

Example:
DELETE column1 FROM keyspace_name.table_name
WHERE primary_key_column = 'value';

● Explanation: The DELETE command can remove specific columns or entire rows.
The WHERE clause is required to identify which row(s) should be deleted.

Important Notes on CRUD Operations in Cassandra:

● Primary Key Requirement: Most CRUD operations require the WHERE clause to
specify the primary key, ensuring efficient data access.
● Denormalization: Cassandra encourages denormalized data models to optimize for
read-heavy operations. Data is often duplicated across tables to allow for faster
reads.
● Tunable Consistency: Cassandra allows specifying consistency levels for read and
write operations, such as ONE, QUORUM, or ALL, depending on the desired balance
between availability and consistency.
● Batch Operations: Cassandra supports batches that group multiple INSERT, UPDATE,
or DELETE statements. A logged batch is atomic (either all statements eventually
take effect or none do), but it does not provide isolation across partitions.
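
The trade-off behind tunable consistency can be expressed as simple arithmetic: a read is guaranteed to overlap with the latest acknowledged write when the number of replicas read (R) plus the number written (W) exceeds the replication factor (RF). The sketch below covers only the three levels named above, and the function names are chosen for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a majority."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must respond for the given consistency level."""
    levels = {"ONE": 1, "QUORUM": quorum(replication_factor), "ALL": replication_factor}
    return levels[level]


def reads_see_latest_write(read_level: str, write_level: str, replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, QUORUM reads and QUORUM writes overlap (2 + 2 > 3), whereas ONE and ONE do not (1 + 1 ≤ 3). The second combination is faster and more available but may return stale data.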

Q6 Data Storage in MongoDB

MongoDB is a NoSQL database that stores data in a flexible, document-oriented format.

Unlike traditional relational databases that use tables and rows, MongoDB stores data in
JSON-like documents, making it suitable for applications that require scalable and flexible
data structures.

1. Document Model:

● Documents: MongoDB stores data in documents, which are represented as BSON
(Binary JSON). BSON extends the JSON format by supporting more data types, such
as dates and binary data.
● Fields: Each document contains fields and values, similar to key-value pairs, and
can have nested structures. This allows for more complex and hierarchical data
models compared to traditional relational databases.

Example Document:
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "John Doe",
  "email": "[email protected]",
  "address": {
    "street": "123 Main St",
    "city": "Anytown",
    "zipcode": "12345"
  },
  "phone_numbers": ["123-456-7890", "987-654-3210"]
}

2. Collections:

● Definition: Documents in MongoDB are stored in collections, which are analogous
to tables in relational databases.
● Dynamic Schema: Collections do not enforce a fixed schema, allowing documents
in the same collection to have different structures. This flexibility is beneficial for
applications where data models evolve frequently.
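
A dynamic schema means that code reading a collection has to tolerate missing fields. The sketch below uses plain Python dictionaries in place of a real collection; the second customer record and the `get_path` helper are hypothetical, and the dotted path only imitates the idea of addressing nested fields.

```python
customers = [
    {"name": "John Doe", "address": {"city": "Anytown"}},
    {"name": "Jane Roe", "phone_numbers": ["123-456-7890"]},  # no address field
]


def get_path(document: dict, path: str, default=None):
    """Follow a dotted path such as 'address.city' through nested dictionaries."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


cities = [get_path(doc, "address.city") for doc in customers]
```

Both documents live in the same list even though their fields differ. The lookup returns the city for the first and the default for the second instead of failing.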

3. BSON Format:

● Storage Efficiency: BSON is used as the storage format for documents. It is a
binary representation of JSON-like documents and provides support for additional
data types such as Date, Binary, and Decimal128, making it more efficient for
database storage.
● Indexing: MongoDB can create indexes on any field in a document to improve query
performance.

4. Storage Engine:

● WiredTiger: The default storage engine in MongoDB is WiredTiger, which supports
features like compression and concurrency control to enhance performance and
reduce storage space.
● MMAPv1: An older storage engine based on memory-mapped files. It was deprecated
in MongoDB 4.0 and removed in MongoDB 4.2.
● Storage Structure:
○ Collections and Indexes are stored in data files.
○ The data files are stored in a directory on disk (the dbPath). When mongod is
started without a configuration file, the default is /data/db; packaged
installations usually configure a different directory.

5. Data Partitioning and Sharding:

● Horizontal Scaling: MongoDB supports sharding, which partitions data across
multiple nodes in a cluster. Each shard stores a subset of the collection’s data,
enabling horizontal scaling to distribute the load.
● Shard Key: A shard key is chosen to determine how documents are distributed
among shards. It is important to select a shard key that ensures even data
distribution for optimal performance.

6. Replication:

● Replica Sets: MongoDB supports data replication through replica sets, where data
is replicated across multiple nodes for high availability and redundancy. One node
acts as the primary (handling writes), while other nodes act as secondaries
(replicating data from the primary).
● Automatic Failover: If the primary node fails, a secondary node is elected as the
new primary, ensuring continuous data availability.

7. Storage Mechanics:

● Journaling: MongoDB uses a journaling mechanism to provide durability. Changes
are written to an on-disk journal before being applied to the database, ensuring data
consistency in the event of a crash.
● Write Concern: MongoDB provides options to specify the level of acknowledgment
needed from replica sets for write operations, ensuring data durability and
consistency as required by the application.
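
The idea behind journaling (record the change first, apply it second, replay the record after a crash) can be sketched as follows. This is a conceptual model only: the `JournaledStore` class is hypothetical, an in-memory list stands in for the on-disk journal, and it does not reflect how WiredTiger organizes its files.

```python
class JournaledStore:
    """Record each change in a journal before applying it to the data."""

    def __init__(self, journal=None):
        self.journal = list(journal or [])  # stands in for the on-disk journal
        self.data = {}
        # Recovery: replay every journaled change to rebuild the data.
        for key, value in self.journal:
            self.data[key] = value

    def put(self, key, value):
        self.journal.append((key, value))  # 1. make the change durable
        self.data[key] = value             # 2. apply it to the data


store = JournaledStore()
store.put("a", 1)
store.put("a", 2)
surviving_journal = store.journal  # assume only the journal survives a crash
recovered = JournaledStore(surviving_journal)
```

Because every change reaches the journal before the data, replaying the journal after a crash reconstructs the last acknowledged state.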

Key Benefits of MongoDB’s Data Storage:

● Flexible Schema: Allows storing various types of data structures without enforcing a
rigid schema.
● Scalability: Supports sharding for horizontal scaling to handle large datasets
efficiently.
● High Availability: Ensures data availability through replica sets and automatic
failover.
● Rich Data Types: BSON format supports a wide range of data types, making
MongoDB versatile for different data storage needs.

CQL (Cassandra Query Language) Commands Overview

Cassandra Query Language (CQL) is used to interact with Cassandra databases. It provides
SQL-like syntax for defining, querying, and manipulating data. Below is a breakdown of
commonly used CQL commands:

1. Data Definition Language (DDL) Commands:

● CREATE: Used to create keyspaces, tables, and indexes.

Example: Creating a keyspace.

CREATE KEYSPACE my_keyspace
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

Example: Creating a table.

CREATE TABLE my_keyspace.users (
    user_id UUID PRIMARY KEY,
    name TEXT,
    email TEXT,
    age INT
);

● ALTER: Modifies an existing keyspace or table structure.

Example: Altering a table to add a new column.

ALTER TABLE my_keyspace.users ADD phone_number TEXT;

● DROP: Deletes a keyspace, table, or index.

Example: Dropping a table.

DROP TABLE my_keyspace.users;

2. Data Manipulation Language (DML) Commands:

● INSERT: Adds new data to a table.

Example: Inserting data into a table.

INSERT INTO my_keyspace.users (user_id, name, email, age)
VALUES (uuid(), 'John Doe', '[email protected]', 30);

● SELECT: Retrieves data from a table.

Example: Selecting specific columns. UUID values are written without quotes in
CQL; the UUID used in these examples is a placeholder.

SELECT name, email FROM my_keyspace.users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

Example: Selecting all columns.

SELECT * FROM my_keyspace.users;

● UPDATE: Modifies existing data in a table.

Example: Updating a record.

UPDATE my_keyspace.users
SET email = '[email protected]', age = 31
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

● DELETE: Removes data from a table.

Example: Deleting specific columns.

DELETE email FROM my_keyspace.users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

Example: Deleting a full row.

DELETE FROM my_keyspace.users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

Q7 Key-Value Store Overview

● Definition: A simple, schema-less data store where each key maps directly to a
value. The value can be any type of data, such as text, images, or multimedia files,
stored as a BLOB (Binary Large Object).
● Characteristics:
○ High performance for quick data retrieval.
○ Excellent scalability for managing very large datasets.
○ High flexibility in storing different types of data.

Key Operations in Key-Value Stores

● Get(key): Retrieves the value associated with the given key.

● Put(key, value): Stores or updates the value with the given key.
● Multi-get(key1, key2, ...): Retrieves values for a list of keys.
● Delete(key): Removes the key and its associated value from the data store.
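
The four operations map directly onto a dictionary-backed class. The `KeyValueStore` below is a minimal in-memory sketch with invented method and key names; it ignores persistence, distribution, and concurrency.

```python
class KeyValueStore:
    """In-memory store exposing the four basic key-value operations."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        """Store the value, replacing any previous value for the key."""
        self._items[key] = value

    def get(self, key, default=None):
        """Return the value for the key, or the default if it is absent."""
        return self._items.get(key, default)

    def multi_get(self, *keys):
        """Return a dict of the requested keys that exist in the store."""
        return {key: self._items[key] for key in keys if key in self._items}

    def delete(self, key):
        """Remove the key; return True if it was present."""
        if key in self._items:
            del self._items[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1", b"profile-bytes")
store.put("user:2", b"other-bytes")
```

The store never inspects the values: `b"profile-bytes"` could equally be an image or a serialized object. Lookups are fast for that reason, and the only way to find data is by its key.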

Advantages of Key-Value Stores

1. Versatile Data Storage: Can store any type of data in the value field (text, hypertext,
images, video, audio). The system retrieves and returns the data in its entirety when
requested.
2. Fast Querying: Simple queries that request values using keys result in high-speed
data retrieval.
3. Schema Flexibility: Unlike traditional relational databases, key-value stores do not
enforce a rigid schema, allowing for dynamic and varied data structures.
4. Eventual Consistency: Ensures data availability across distributed systems, with
consistency eventually achieved across replicas.
5. Hierarchical or Ordered Storage: Some implementations allow for hierarchical
storage (nested structures) or ordered key-value pairs.
6. Flexible Key Representation:
○ Keys can be generated in different formats (e.g., hash values, logical paths,
REST endpoints).
○ Auto-generated or synthetic keys help in unique identification.
7. Scalability and Reliability: Designed to handle horizontal scaling efficiently, which
supports adding more nodes to manage increased data loads. Provides portability
and incurs low operational costs.

Limitations of Key-Value Stores

1. Lack of Indexed Search: Values are opaque to the store and are not indexed, so a
pure key-value store cannot search for subsets or filter on value conditions.
2. Basic Database Capabilities: Does not natively support advanced database
features like:
○ Atomic Transactions: No built-in support for complex multi-operation
transactions.
○ Consistency Guarantees: Ensuring strong consistency across transactions
can be difficult without additional implementation.
3. Key Uniqueness Management: Maintaining unique and identifiable keys can
become challenging as the volume of data grows.
4. Limited Querying: Cannot query specific fields within the value. Unlike SQL-based
databases, key-value stores do not support complex queries (e.g., WHERE clauses or
value-based filtering).
