The document discusses the business drivers for NoSQL databases, including volume, velocity, variability, and agility, highlighting their advantages over traditional RDBMS. It explains the CAP theorem, the base properties of NoSQL, and different types of NoSQL databases, such as key-value, document, column, and graph databases. Additionally, it covers concepts like shared nothing architecture, sharding, consistent hashing, Bloom filters, and applications of data visualization.
BDA IA2 - bda

B.Tech (Pillai College of Engineering)


1) Business drivers of NoSQL


Volume:
The need to scale out (also known as horizontal scaling), rather than scale up (buying faster processors), moved organizations from serial to parallel processing, where a data problem is split into separate paths and sent to separate processors to divide and conquer the work.

Velocity:
How quickly data is generated and how quickly that data moves, which strains the ability of a single-processor system to rapidly read and write data. When a single-processor RDBMS is used as the back end to a web storefront, random bursts in web traffic slow down response for everyone, and tuning these systems can be costly when both high read and high write throughput are desired.

Variability:
The number of inconsistencies in the data. Capturing and reporting on exception data is a struggle under the rigid schema structures imposed by RDBMS systems. For example, if a business unit wants to capture a few custom fields for a particular customer, every customer row in the database must store this information even where it doesn't apply. Adding new columns to an RDBMS requires the system to be shut down and ALTER TABLE commands to be run. When a database is large, this process can impact system availability, losing time and money.

Agility:
The speed of putting data into and getting data out of the database. If your data has nested and repeated subgroups, you need to include an object-relational mapping layer whose responsibility is to generate the correct combination of INSERT, UPDATE, DELETE and SELECT SQL statements to move object data to and from the RDBMS persistence layer. This process is not simple, and it is one of the largest barriers to rapid change when developing new applications or modifying existing ones.

2) CAP theorem for NoSQL


CAP stands for Consistency, Availability, and Partition tolerance.

Consistency: All nodes in the network see the same data at the same time. A transaction cannot be executed partially; it is always all or none. If something goes wrong in the middle of a transaction, the whole transaction needs to be rolled back.

Availability: A guarantee that every request receives a response indicating whether it was successful or failed. However, it does not guarantee that a read request returns the most recent write. The more users a system can cater to, the better its availability.


Partition Tolerance is a guarantee that the system continues to operate despite arbitrary message loss or failure of part of the system.

- A single-node RDBMS provides consistency and availability but not partition tolerance (CA). HBase and Redis provide consistency and partition tolerance (CP), while MongoDB, CouchDB, Cassandra, and Dynamo favour availability and partition tolerance (AP) over strong consistency. Such databases generally settle for eventual consistency, meaning that after a while the system converges back to a consistent state.

3) BASE properties of NoSQL

Basically Available: NoSQL databases spread data across many storage systems with a high degree of replication. In the unlikely event that a failure disrupts access to a segment of data, this does not necessarily result in a complete database outage.

Soft state indicates that the state of the system may change over time, even without
input. This is because of the eventual consistency model.

Eventual consistency indicates that the system will become consistent over time.

4) NoSQL databases (aka "not only SQL") are non-tabular databases that store data differently from relational tables. NoSQL databases come in a variety of types based on their data model. The main types are document, key-value, wide-column, and graph. They provide flexible schemas and scale easily with large amounts of data and high user loads.
1. Key-value store: data is stored as simple key-value pairs.
2. Column store: columns are grouped into column families, and each column family may itself contain multiple columns, unlike the fixed columns of a traditional database.
3. Document store: key-value pairs in which the values, called documents, are structured data in a form such as text, arrays, strings, JSON, or XML.
4. Graph database: this architecture pattern deals with the storage and management of data as graphs. Graphs are structures that depict connections between two or more objects in some data. The objects or entities are called nodes and are joined together by relationships called edges. Each edge has a unique identifier, and each node serves as a point of contact for the graph. This pattern is very commonly used in social networks, where there are a large number of entities and each entity has one or many characteristics connected by edges. Whereas the relational pattern has tables that are only loosely connected, the relationships in a graph are explicit and central to the model.
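The four data models above can be sketched with plain Python dictionaries (no real database involved); the keys, names, and record contents here are invented purely for illustration.

```python
# The same user record in each of the four NoSQL data models,
# modelled with plain Python dicts. All data here is made up.

# 1. Key-value store: an opaque value looked up by a single key.
kv_store = {"user:42": '{"name": "Asha", "city": "Mumbai"}'}

# 2. Column store: a row key maps to column families, each holding columns.
column_store = {
    "42": {
        "profile": {"name": "Asha", "city": "Mumbai"},   # column family 1
        "activity": {"last_login": "2023-01-05"},        # column family 2
    }
}

# 3. Document store: the value is a structured, queryable document.
document_store = {"42": {"name": "Asha", "address": {"city": "Mumbai"}}}

# 4. Graph store: nodes joined by labelled edges.
nodes = {"42": {"name": "Asha"}, "99": {"name": "Ravi"}}
edges = [("42", "follows", "99")]

# A document store can reach inside the value; a key-value store cannot.
print(document_store["42"]["address"]["city"])  # prints Mumbai
```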


5)

6) Shared nothing
Shared Nothing Architecture (SNA) is a distributed computing architecture that
consists of multiple separated nodes that don’t share resources. The nodes are
independent and self-sufficient as they have their own disk space and memory.

In such a system, the data set/workload is split into smaller sets and distributed across different nodes of the system. Each node has its own memory, storage, and independent input/output interfaces. It communicates and synchronizes with the other nodes through a high-speed interconnect network. Such a connection ensures low latency, high bandwidth, and high availability (with a backup interconnect available in case the primary fails).

This design makes it possible to scale the distributed system horizontally and increase its transmission capacity.

SNA has no shared resources. The only thing connecting the nodes is the network layer,
which manages the system and communication among nodes.
Advantages:


Easier to Scale

Eliminates Single Points of Failure

Simplifies Upgrades and Prevents Downtime

Disadvantages:
Cost

A node consists of its individual processor, memory, and disk. Having dedicated
resources essentially means higher costs when it comes to setting up the system.
Additionally, transmitting data that requires software interaction is more expensive
compared to architectures with shared disk space and/or memory.

Decreased Performance

Scaling up your system can eventually affect overall performance if the cross-communication layer isn't set up correctly.

7) Sharding is a method for distributing a single dataset across multiple databases, which can then be stored on multiple machines. This allows a larger dataset to be split into smaller chunks and stored in multiple data nodes, increasing the total storage capacity of the system.

Similarly, by distributing the data across multiple machines, a sharded database can
handle more requests than a single machine can.

Sharding is a form of scaling known as horizontal scaling or scale-out, as additional nodes are brought on to share the load. Horizontal scaling allows for near-limitless scalability to handle big data and intense workloads.
There is overhead and complexity in setting up shards, maintaining the data on each
shard, and properly routing requests across those shards.

Why use sharding:


Sharding makes the database smaller
Sharding makes the database faster
Sharding makes the database much more easily manageable
Sharding reduces the transaction cost of the database
Sharding can, however, be a complex operation to set up at times
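The routing idea behind sharding can be sketched in a few lines of Python. This is a minimal hash-modulo sketch with a made-up shard list; real systems often use range-based or directory-based sharding instead.

```python
import hashlib

# Hypothetical shard names for illustration only.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(key: str) -> str:
    """Route a key to one shard deterministically: hash the key,
    then take the hash modulo the number of shards."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The same key always routes to the same shard, so reads find
# the data that writes stored.
assert shard_for("user:42") == shard_for("user:42")

# Note the drawback: changing len(SHARDS) remaps almost every key,
# which is exactly the rehashing problem consistent hashing addresses.
```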

8)
A diagram of a data stream management system (DSMS) is given in TechKnowledge; follow that diagram, and explain the two types of stream queries:
1) Standing queries: queries that are stored permanently in the stream processor and executed continuously, producing new output each time a stream element arrives.


2) Ad-hoc queries: one-time questions asked about the current state of a stream (or a stored summary or sample of it), answered once rather than re-executed.
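The difference between the two query types can be sketched in Python, with the stream modelled as a plain list that grows over time and invented sensor readings:

```python
# Minimal sketch: a "stream" as a growing list, with one standing query
# and one ad-hoc query over it. The readings are made-up values.

stream = []
running_max = None  # state maintained by the standing query

def on_arrival(x):
    """Standing query: registered once, re-evaluated on every arrival.
    Here it continuously maintains the maximum value seen so far."""
    global running_max
    running_max = x if running_max is None else max(running_max, x)

for reading in [3, 7, 5]:       # elements arriving over time
    stream.append(reading)
    on_arrival(reading)

# Ad-hoc query: asked once, against whatever the stream holds right now.
current_average = sum(stream) / len(stream)

print(running_max, current_average)  # prints 7 5.0
```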

9) Consistent hashing

With naive hashing (hash of the key modulo the number of servers), rehashing is a problem: if a server crashes, the server location changes for almost all keys, including keys held on servers that did not crash. This increases load on the origin, because those keys now miss the cache and must all be rehashed and refetched.

Consistent hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table by assigning each of them a position on an abstract circle, or hash ring. This allows servers and objects to be added or removed without affecting the overall system.
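The hash-ring idea can be sketched in a few lines of Python. The server names and key below are made up, and a production ring would also place multiple virtual nodes per server to balance load.

```python
import bisect
import hashlib

def _hash(item: str) -> int:
    """Map any string to a position on the ring."""
    return int(hashlib.md5(item.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers):
        # Sorted (position, server) points on the abstract circle.
        self.points = sorted((_hash(s), s) for s in servers)

    def lookup(self, key: str) -> str:
        """A key belongs to the first server clockwise from its position."""
        pos = _hash(key)
        i = bisect.bisect(self.points, (pos, "")) % len(self.points)
        return self.points[i][1]

ring = HashRing(["server-a", "server-b", "server-c"])
owner = ring.lookup("user:42")

# Remove a server the key does NOT live on: the key keeps its owner,
# unlike modulo hashing where nearly every key would be remapped.
others = [s for s in ["server-a", "server-b", "server-c"] if s != owner]
smaller = HashRing([others[0], owner])
assert smaller.lookup("user:42") == owner
```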

10) Bloom filter


A Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently,
whether an element is present in a set.

The price paid for this efficiency is that a Bloom filter is a probabilistic data structure: it
tells us that the element either definitely is not in the set or may be in the set.
The base data structure of a Bloom filter is a bit vector.

● Unlike a standard hash table, a Bloom filter of a fixed size can represent a set
with an arbitrarily large number of elements.
● Adding an element never fails. However, the false positive rate increases steadily
as elements are added until all bits in the filter are set to 1, at which point all
queries yield a positive result.
● Bloom filters never generate false negative results, i.e., they will never tell you that a username doesn’t exist when it actually does.
● Deleting elements from the filter is not possible: if we deleted a single element by clearing the bits at the indices generated by its k hash functions, we might also delete a few other elements that share those bits.
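A minimal Bloom filter sketch in Python, deriving the k hash functions by salting a single hash; the sizes m and k below are arbitrary choices for illustration, not tuned values.

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int = 256, k: int = 3):
        self.m, self.k = m, k
        self.bits = [0] * m          # the underlying bit vector

    def _positions(self, item: str):
        """k bit positions for an item, from k salted hashes."""
        for salt in range(self.k):
            h = hashlib.md5(f"{salt}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = 1       # adding never fails

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True means only *maybe* present."""
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
assert bf.might_contain("alice")     # no false negatives, ever
# A query for an element never added is *probably* False, but can be a
# false positive if its k positions happen to collide with set bits.
```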


The applications of Bloom Filter are:

● Weak password detection


● Internet Cache Protocol
● Safe browsing in Google Chrome
● Wallet synchronization in Bitcoin
● Hash-based IP Traceback
11) DGIM
a) https://medium.com/fnplus/dgim-algorithm-169af6bb3b0c

12) Data visualization in R: https://www.google.com/amp/s/www.geeksforgeeks.org/data-visualization-in-r/amp/

Different applications of data visualization


● Healthcare Industries
● Business intelligence
● Military
● Data Science
● Finance industries
● Real estate business
● Food delivery apps
● Marketing
