This document provides an overview of NoSQL databases. It explains that NoSQL databases are non-relational and do not follow the RDBMS principles. It describes some of the main types of NoSQL databases, including document stores, key-value stores, column-oriented stores, and graph databases. It also discusses how NoSQL databases are designed for massive scalability and do not guarantee ACID properties, instead following a BASE model of Basically Available, Soft state, and Eventually Consistent.
The document provides an introduction to NoSQL databases, including key definitions and characteristics. It discusses that NoSQL databases are non-relational and do not follow RDBMS principles. It also summarizes different types of NoSQL databases like document stores, key-value stores, and column-oriented stores. Examples of popular databases for each type are also provided.
An overview of various database technologies and their underlying mechanisms over time.
Presentation delivered internally at Alliander to inspire the use of, and foster interest in, new (NoSQL) technologies. 18 September 2012
This document discusses relational and non-relational databases. It begins by introducing NoSQL databases and some of their key characteristics like not requiring a fixed schema and avoiding joins. It then discusses why NoSQL databases became popular for companies dealing with huge data volumes due to limitations of scaling relational databases. The document covers different types of NoSQL databases like key-value, column-oriented, graph and document-oriented databases. It also discusses concepts like eventual consistency, ACID properties, and the CAP theorem in relation to NoSQL databases.
NoSQL databases provide an alternative to traditional relational databases that is well-suited for large datasets, high scalability needs, and flexible, changing schemas. NoSQL databases sacrifice strict consistency for greater scalability and availability. The document model is well-suited for semi-structured data and allows for embedding related data within documents. Key-value stores provide simple lookup of data by key but do not support complex queries. Graph databases effectively represent network-like connections between data elements.
NoSQL databases were developed to address the limitations of relational databases in handling massive, unstructured datasets. NoSQL databases sacrifice ACID properties like consistency in favor of scalability and availability. The CAP theorem states that only two of consistency, availability, and partition tolerance can be achieved at once. Common NoSQL database types include document stores, key-value stores, column-oriented stores, and graph databases. NoSQL is best suited for large datasets that don't require strict consistency or relational structures.
This document provides an overview of NoSQL databases, including:
- Key-value stores store data as maps or hashmaps and are efficient for data access but limited in query capabilities.
- Column-oriented stores group attributes into column families and store data efficiently but are operationally challenging.
- Document databases store loosely structured data like JSON and allow retrieving documents by keys or contents.
- Graph databases are suited for interaction networks and path finding but are less suited for tabular data.
Module 2.2 Introduction to NoSQL Databases.pptx, by NiramayKolalle
This presentation explores NoSQL databases, a modern alternative to traditional relational database management systems (RDBMS). NoSQL databases are designed to handle large-scale data storage and high-speed processing with a focus on flexibility, scalability, and performance. Unlike SQL databases, NoSQL solutions do not rely on structured tables, schemas, or joins, making them ideal for handling Big Data applications and distributed systems.
Introduction to NoSQL Databases:
NoSQL databases are built on the following core principles:
Schema-Free Structure: No predefined table structures, allowing dynamic data storage.
Horizontal Scalability: Unlike SQL databases that scale vertically (by increasing hardware power), NoSQL databases support horizontal scaling, distributing data across multiple servers.
Distributed Computing: Data is stored across multiple nodes, preventing single points of failure and ensuring high availability.
Simple APIs: NoSQL databases often use simpler query mechanisms instead of complex SQL queries.
Optimized for Performance: NoSQL databases eliminate joins and support faster read/write operations.
Key Theoretical Concepts:
CAP Theorem (Brewer’s Theorem)
The CAP theorem states that a distributed system can provide only two out of three guarantees:
Consistency (C) – Ensures that all database nodes show the same data at any given time.
Availability (A) – Guarantees that every request receives a response.
Partition Tolerance (P) – The system continues to operate even if network failures occur.
Most NoSQL databases prioritize Availability and Partition Tolerance (AP) while relaxing strict consistency constraints, unlike SQL databases that focus on Consistency and Availability (CA).
BASE vs. ACID Model
SQL databases follow the ACID (Atomicity, Consistency, Isolation, Durability) model, ensuring strict transactional integrity. NoSQL databases use the BASE model (Basically Available, Soft-state, Eventually consistent), allowing flexibility in distributed environments where eventual consistency is preferred over immediate consistency.
Types of NoSQL Databases:
Key-Value Stores – Store data as simple key-value pairs, making them highly efficient for caching, session management, and real-time analytics.
Examples: Amazon DynamoDB, Redis, Riak
Column-Family Stores – Store data in columns rather than rows, optimizing analytical queries and batch processing workloads.
Examples: Apache Cassandra, HBase, Google Bigtable
Document Stores – Use JSON, BSON, or XML documents to represent data, making them ideal for content management systems, catalogs, and flexible data models.
Examples: MongoDB, CouchDB, ArangoDB
Graph Databases – Focus on relationships between data, allowing high-performance queries for connected data such as social networks, fraud detection, and recommendation engines.
Examples: Neo4j, Oracle NoSQL Graph, Amazon Neptune
Business Drivers for NoSQL Adoption:
Volume: The ability to process large datasets effic
SQL vs NoSQL database differences explained, by Satya Pal
This document compares SQL and NoSQL databases. It outlines key differences between the two types of databases such as their data structures (tables vs documents/key-value pairs), schemas (strict vs dynamic), scalability (vertical vs horizontal), and query languages (SQL vs unstructured). Examples of popular SQL databases discussed are MySQL, MS-SQL Server, and Oracle. Examples of NoSQL databases discussed are MongoDB, CouchDB, and Redis. The document provides an overview of each example database's features and benefits.
This document provides an introduction to NoSQL databases, including the motivation behind them, where they fit, types of NoSQL databases like key-value, document, columnar, and graph databases, and an example using MongoDB. NoSQL databases are a new way of thinking about data that is non-relational, schema-less, and can be distributed and fault tolerant. They are motivated by the need to scale out applications and handle big data with flexible and modern data models.
This document provides an overview of different types of databases and how to choose the right one. It discusses relational databases like MySQL and SQL Server which use tables and relationships. NoSQL databases like Cassandra and MongoDB are discussed as being schema-less and storing data together. Evented databases like Kafka are covered as focusing on transient data through events and producers/consumers. Key factors in choosing a database are also outlined like scalability, querying capabilities, use cases and storage structure.
NoSQL is a non-relational database designed for large-scale data storage needs. It has several key features: it is non-relational, schema-free, uses simple APIs, and is distributed. The four main types of NoSQL databases are key-value, column-oriented, document-oriented, and graph-based. Key advantages of NoSQL include scalability, flexibility in data structures, and ease of development. However, NoSQL sacrifices some consistency and lacks standardization compared to SQL databases.
The document discusses the history and concepts of NoSQL databases. It notes that traditional single-processor relational database management systems (RDBMS) struggled to handle the increasing volume, velocity, variability, and agility of data due to various limitations. This led engineers to explore scaled-out solutions using multiple processors and NoSQL databases, which embrace concepts like horizontal scaling, schema flexibility, and high performance on commodity hardware. Popular NoSQL database models include key-value stores, column-oriented databases, document stores, and graph databases.
This document provides an introduction to NoSQL and MongoDB. It outlines that NoSQL databases are used to manage unstructured data and overcome limitations of relational databases. MongoDB is introduced as a popular document-oriented NoSQL database that stores data as JSON-like documents. Key features of MongoDB include high performance, scalability, rich query language, and automatic replication for high availability.
Evolution of the DBA to Data Platform Administrator/Specialist, by Tony Rogerson
DBAs used to be relational-database-centric, for instance managing Microsoft SQL Server or Oracle. In this changing world of polyglot database environments, their role has expanded not just into new platforms other than SQL but also into new legal governance, modelling techniques, architecture, etc. They need to have a base knowledge of Kimball, Inmon, Data Vault, what the CAP theorem is, LAMBDA, Big Data, Data Science, etc.
The document summarizes a meetup about NoSQL databases hosted by AWS in Sydney in 2012. It includes an agenda with presentations on Introduction to NoSQL and using EMR and DynamoDB. NoSQL is introduced as a class of databases that don't use SQL as the primary query language and are focused on scalability, availability and handling large volumes of data in real-time. Common NoSQL databases mentioned include DynamoDB, BigTable and document databases.
NoSQL databases should not be chosen just because a system is slow or to replace RDBMS. The appropriate choice depends on factors like the nature of the data, how the data scales, and whether ACID properties are needed. NoSQL databases are categorized by data model (document, column family, graph, key-value store) which affects querying. Other considerations include scalability based on the CAP theorem and operational factors like the distribution model and whether there is a single point of failure. The best choice depends on the specific requirements and risks losing data if chosen incorrectly.
The document provides an agenda for a two-day training on NoSQL and MongoDB. Day 1 covers an introduction to NoSQL concepts like distributed and decentralized databases, CAP theorem, and different types of NoSQL databases including key-value, column-oriented, and document-oriented databases. It also covers functions and indexing in MongoDB. Day 2 focuses on specific MongoDB topics like aggregation framework, sharding, queries, schema-less design, and indexing.
NoSQL, as many of you may already know, is basically a database used to manage huge sets of unstructured data, in which the data is not stored in tabular relations as in relational databases. Most existing relational databases have failed to solve some complex modern problems, such as:
• Continuously changing nature of data - structured, semi-structured, unstructured and polymorphic data.
• Applications now serve millions of users in different geo-locations and timezones, and have to be up and running all the time with data integrity maintained.
• Applications are becoming more distributed with many moving towards cloud computing.
NoSQL plays a vital role in an enterprise application which needs to access and analyze a massive set of data that is being made available on multiple virtual servers (remote based) in the cloud infrastructure and mainly when the data set is not structured. Hence, the NoSQL database is designed to overcome the Performance, Scalability, Data Modelling and Distribution limitations that are seen in the Relational Databases.
The document compares SQL and NoSQL databases. SQL databases follow ACID properties and are good for applications requiring consistency, but do not scale well. NoSQL databases sacrifice consistency for scalability and availability. Data is modeled flexibly in NoSQL as documents, collections, or key-value pairs without predefined schemas. Examples include embedding or linking in MongoDB and using tables and items in DynamoDB. The CAP theorem explains the tradeoff between consistency, availability, and partition tolerance that NoSQL databases face. The conclusion is that SQL works for consistency while NoSQL scales, and a hybrid approach can be optimized.
The rising interest in NoSQL technology over the last few years has resulted in an increasing number of evaluations and comparisons among competing NoSQL technologies. From a survey, we create a concise and up-to-date comparison of NoSQL engines, identifying their most beneficial use from the software engineer's point of view.
2. ▪ Introduction to NoSQL
▪ Different NoSQL Products
▪ Exploring MongoDB Java/Ruby/Python statements
▪ Interfacing and Interacting with NoSQL
3. NoSQL databases are currently a hot topic in some parts of computing, with over a hundred different NoSQL databases.
4. ▪ Data stored in columns and tables
▪ Relationships represented by data
▪ Data Manipulation Language
▪ Data Definition Language
▪ Transactions
▪ Abstraction from physical layer
▪ Applications specify what, not how
▪ Physical layer can change without modifying applications
▪ Create indexes to support queries
▪ In Memory databases
5. ▪ Atomic – All of the work in a transaction completes (commit) or none of it completes
▪ Consistent – A transaction transforms the database from one consistent state to another consistent state. Consistency is defined in terms of constraints.
▪ Isolated – The results of any changes made during a transaction are not visible until the transaction has committed.
▪ Durable – The results of a committed transaction survive failures
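The atomicity and consistency properties above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` and `balances` helpers, and the sample balances are assumptions made up for this illustration; they do not come from the slides. The `CHECK` constraint plays the role of the consistency rule, and the credit is deliberately applied before the debit so that a failed debit has something to roll back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; all of it happens or none of it."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if the block raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back.
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok_small = transfer(conn, "alice", "bob", 30)   # fits within alice's balance
ok_large = transfer(conn, "alice", "bob", 500)  # would make alice negative
print(ok_small, ok_large, balances(conn))
```

The second transfer violates the constraint, so neither of its two updates survives: the database moves from one consistent state to another or does not move at all.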
6. ▪ NoSQL stands for:
▪ No Relational
▪ No RDBMS
▪ Not Only SQL
▪ NoSQL is an umbrella term for all databases and data stores that don’t follow the RDBMS principles
▪ A class of products
▪ A collection of several (related) concepts about data storage and manipulation
▪ Often related to large data sets
7. ▪ Non-relational DBMSs are not new
▪ But NoSQL represents a new incarnation
▪ Due to massively scalable Internet applications
▪ Based on distributed and parallel computing
▪ Development
▪ Starts with Google
▪ First research paper published in 2003
▪ Continues also thanks to Lucene's developers/Apache (Hadoop) and Amazon (Dynamo)
▪ Then a lot of products and interest came from Facebook, Netflix, Yahoo, eBay, Hulu, IBM,
and many more
8. ▪ Three major papers were the seeds of the NoSQL movement
▪ BigTable (Google)
▪ Dynamo (Amazon)
▪ Distributed key-value data store
▪ Eventual consistency
▪ CAP Theorem
9. ▪ NoSQL comes from the Internet, so it is often related to the “big data” concept
▪ How big is “big data”?
▪ Over a few terabytes: enough to start spanning multiple storage units
▪ Challenges
▪ Efficiently storing and accessing large amounts of data is difficult, even more so when considering
fault tolerance and backups
▪ Manipulating large data sets involves running massively parallel processes
▪ Managing continuously evolving schema and metadata for semi-structured and unstructured
data is difficult
10. ▪ Explosion of social media sites (Facebook, Twitter) with large
data needs
▪ Rise of cloud-based solutions such as Amazon S3 (Simple Storage
Service)
▪ Just as with the move to dynamically typed languages (Python, Ruby,
Groovy), a shift to dynamically typed data with frequent schema
changes
▪ Open-source community
11. ▪ The context is the Internet
▪ RDBMSs assume that data are
▪ Dense
▪ Largely uniform (structured data)
▪ Data coming from the Internet are
▪ Massive and sparse
▪ Semi-structured or unstructured
▪ With massive sparse data sets, the typical storage mechanisms and access methods
get stretched
12. ▪ Large data volumes
▪ Google’s “big data”
▪ Scalable replication and
distribution
▪ Potentially thousands of
machines
▪ Potentially distributed
around the world
▪ Queries need to return
answers quickly
▪ Mostly query, few
updates
▪ Asynchronous Inserts &
Updates
▪ Schema-less
▪ ACID transaction properties
are not needed – BASE
▪ CAP Theorem
▪ Open source development
14. Discussing NoSQL databases is complicated because there are a variety of types:
▪ Sorted, ordered column stores:
▪ Optimized for queries over large datasets; they store
columns of data together instead of rows.
▪ Document databases:
▪ Pair each key with a complex data structure known as a document.
▪ Key-value stores:
▪ The simplest NoSQL databases. Every item in the database is stored
as an attribute name (or 'key') together with its value.
▪ Graph databases:
▪ Store information about networks of data, such as social connections.
15. ▪ Documents
▪ Loosely structured sets of key/value pairs in documents, e.g., XML, JSON
▪ Encapsulate and encode data in some standard formats or encodings
▪ Are addressed in the database via a unique key
▪ Documents are treated as a whole, avoiding splitting a document into its constituent
name/value pairs
▪ Allow documents to be retrieved by key or by content
▪ Notable for:
▪ MongoDB (used in Foursquare, GitHub, and more)
▪ CouchDB (used in Apple, BBC, Canonical, CERN, and more)
16. ▪ The central concept is the notion of a "document", which corresponds to a
row in an RDBMS.
▪ A document comes in some standard formats like JSON (BSON).
▪ Documents are addressed in the database via a unique key that represents
that document.
▪ The database offers an API or query language that retrieves documents
based on their contents.
▪ Documents are schema free, i.e., different documents can have structures
and schema that differ from one another. (An RDBMS requires that each row
contain the same columns.)
17. {
_id: ObjectId("51156a1e056d6f966f268f81"),
type: "Article",
author: "Derick Rethans",
title: "Introduction to Document Databases with MongoDB",
date: ISODate("2013-04-24T16:26:31.911Z"),
body: "This arti…"
},
{
_id: ObjectId("51156a1e056d6f966f268f82"),
type: "Book",
author: "Derick Rethans",
title: "php|architect's Guide to Date and Time Programming with PHP",
isbn: "978-0-9738621-5-7"
}
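The two documents above can be mimicked with plain Python dictionaries to show lookup by unique key, retrieval by content, and documents with differing fields. This is a minimal in-memory sketch, not MongoDB's API: the `find` helper is hypothetical, and plain strings stand in for the `ObjectId` values.

```python
# In-memory stand-in for a document collection: _id -> document (a dict).
# Plain strings replace ObjectId values to keep the sketch dependency-free.
collection = {
    "51156a1e056d6f966f268f81": {
        "type": "Article",
        "author": "Derick Rethans",
        "title": "Introduction to Document Databases with MongoDB",
    },
    "51156a1e056d6f966f268f82": {
        "type": "Book",
        "author": "Derick Rethans",
        "title": "php|architect's Guide to Date and Time Programming with PHP",
        "isbn": "978-0-9738621-5-7",
    },
}


def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in docs.values()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Retrieval by unique key.
book = collection["51156a1e056d6f966f268f82"]

# Retrieval by content: both documents share an author but not a schema.
by_author = find(collection, author="Derick Rethans")
books = find(collection, type="Book")
```

Only the book carries an `isbn` field; nothing forces the article to have one, which is the schema-free property described earlier.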
18. ▪ Store data in a schema-less way
▪ Store data as maps
▪ HashMaps or associative arrays
▪ Provide very efficient average-case running
time for accessing data
▪ Notable for:
▪ Couchbase (Zynga, Vimeo, NAVTEQ, ...)
▪ Redis (Craigslist, Instagram, Stack Overflow,
Flickr, ...)
▪ Amazon Dynamo (Amazon, Elsevier,
IMDb, ...)
▪ Apache Cassandra (Facebook, Digg,
Reddit, Twitter, ...)
▪ Voldemort (LinkedIn, eBay, …)
▪ Riak (GitHub, Comcast, Mochi, ...)
19. ▪ Data are stored in a column-oriented way
▪ Data efficiently stored
▪ Avoids consuming space for storing nulls
▪ Columns are grouped in column-families
▪ Data isn’t stored as a single table but is stored by column families
▪ Unit of data is a set of key/value pairs
▪ Identified by “row-key”
▪ Ordered and sorted based on row-key
▪ Notable for:
▪ Google's Bigtable (used in all
Google's services)
▪ HBase (Facebook, StumbleUpon,
Hulu, Yahoo!, ...)
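The column-family layout described above can be sketched with nested Python dictionaries: each row key maps to column families, and each family holds only the columns that actually have values. This is an illustrative model, not the Bigtable or HBase API; the `put` and `scan` helpers and the sample row keys are assumptions.

```python
# row key -> column family -> {column name: value}
table = {}


def put(row_key, family, column, value):
    """Store one cell; absent cells take no space (no nulls are stored)."""
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value


def scan(start, stop):
    """Yield (row_key, row) pairs in sorted row-key order, start <= key < stop."""
    for key in sorted(table):
        if start <= key < stop:
            yield key, table[key]


put("user#002", "profile", "name", "Ravi")
put("user#001", "profile", "name", "Asha")
put("user#001", "contact", "email", "asha@example.com")

rows = list(scan("user#001", "user#003"))
```

The scan returns rows ordered by row key regardless of insertion order, and `user#002` simply has no `contact` family rather than a row of nulls.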
20. ▪ Graph-oriented
▪ Everything is stored as an edge, a node or an attribute.
▪ Each node and edge can have any number of attributes.
▪ Both the nodes and edges can be labelled.
▪ Labels can be used to narrow searches.
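The property-graph model above (nodes, edges, attributes, labels) can be sketched in a few lines of Python. This is a toy in-memory structure for illustration only; the helper names and the sample data are hypothetical and do not correspond to any graph database's API.

```python
nodes = {}   # node id -> {"labels": set of labels, "props": attribute dict}
edges = []   # (source id, edge label, target id, attribute dict)


def add_node(node_id, labels, **props):
    nodes[node_id] = {"labels": set(labels), "props": props}


def add_edge(source, label, target, **props):
    edges.append((source, label, target, props))


def nodes_with_label(label):
    """Labels narrow a search to a subset of the nodes."""
    return sorted(nid for nid, node in nodes.items() if label in node["labels"])


def neighbours(node_id, edge_label):
    """Follow outgoing edges that carry the given label."""
    return [t for s, lbl, t, _ in edges if s == node_id and lbl == edge_label]


add_node(1, ["Person"], name="Asha")
add_node(2, ["Person", "Employee"], name="Ravi")
add_node(3, ["Book"], title="NoSQL")
add_edge(1, "KNOWS", 2, since=2020)
add_edge(2, "WROTE", 3)
```

For example, `nodes_with_label("Person")` considers only nodes 1 and 2, and `neighbours(1, "KNOWS")` follows the labelled edge from node 1 to node 2.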
21. ▪ Issues with scaling up when the dataset is just too big
▪ RDBMSs were not designed to be distributed
▪ Traditional DBMSs are designed to run best on a “single” machine
▪ Larger volumes of data/operations require upgrading the server with faster
CPUs or more memory, known as ‘scaling up’ or ‘vertical scaling’
▪ NoSQL solutions are designed to run on clusters or multi-node database
solutions
▪ Larger volumes of data/operations require adding more machines to the
cluster, known as ‘scaling out’ or ‘horizontal scaling’
▪ Different approaches include:
▪ Master-slave
▪ Sharding (partitioning)
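Sharding can be illustrated with a small hash-partitioning sketch: each key is hashed and the hash decides which node stores it. The three in-memory "nodes", the `shard_for` helper, and the key names are assumptions for illustration, not the scheme of any particular product.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shards = [dict() for _ in range(3)]   # three hypothetical nodes


def put(key, value):
    shards[shard_for(key, len(shards))][key] = value


def get(key):
    return shards[shard_for(key, len(shards))].get(key)


for user_id in ("user:1", "user:2", "user:3", "user:4"):
    put(user_id, {"id": user_id})
```

Simple modulo placement like this moves most keys when the number of nodes changes, which is why distributed stores often prefer schemes such as consistent hashing.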
22. ▪ RDBMSs are based on ACID (Atomicity, Consistency, Isolation, and Durability)
properties
▪ NoSQL
▪ Does not give importance to ACID properties
▪ In some cases completely ignores them
▪ In distributed parallel systems it is difficult/impossible to ensure ACID properties
▪ Long-running transactions don't work because keeping resources blocked for a
long time is not practical
23. ▪ Acronym contrived to be the opposite of ACID
▪ Basically Available,
▪ Soft state,
▪ Eventually Consistent
▪ Characteristics
▪ Weak consistency – stale data OK
▪ Availability first
▪ Best effort
▪ Approximate answers OK
▪ Aggressive (optimistic)
▪ Simpler and faster
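The "stale data OK" and "eventually consistent" characteristics can be sketched with a toy replicated store in which a write is acknowledged by one replica and propagated to the others later. The class and method names are hypothetical, and real systems replicate over a network with far more sophisticated protocols.

```python
class Replica:
    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Writes are acknowledged by one replica and propagated later."""

    def __init__(self, replica_count):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # queued (key, value) updates not yet replicated

    def write(self, key, value):
        self.replicas[0].data[key] = value   # acknowledged immediately
        self.pending.append((key, value))    # replicated asynchronously

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        """Apply queued updates to the other replicas, in write order."""
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica.data[key] = value
        self.pending.clear()


store = EventuallyConsistentStore(replica_count=3)
store.write("cart", ["book"])
stale = store.read("cart", replica_index=2)   # replica 2 has not seen the write
store.sync()
fresh = store.read("cart", replica_index=2)   # replicas have converged
```

Between the write and the sync, a client reading from replica 2 gets an outdated answer but is never blocked, which is the availability-first trade-off.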
24. The CAP theorem provides a coherent and logical way of assessing the problems involved in
assuring ACID-like guarantees in distributed systems.
At most two of the following three can be maximized at any one time:
▪ Consistency
▪ Each client has the same view of the
data
▪ Availability
▪ Each client can always read and write
▪ Partition tolerance
▪ System works well across distributed
physical networks
25. ▪ CAP theorem: at most two of the three properties can be
addressed
▪ The choices could be as follows:
1. Availability is compromised but consistency and partition
tolerance are preferred over it
2. The system has little or no partition tolerance. Consistency
and availability are preferred
3. Consistency is compromised but systems are always available
and can work when parts of it are partitioned
26. [Diagram: the CAP properties C, A, and P]
• Choosing between consistency and availability is not a
“binary” decision
• AP systems relax consistency in
favor of availability, but are not
inconsistent
• CP systems sacrifice availability for
consistency, but are not unavailable
• This suggests both AP and CP
systems can offer a degree of
consistency, and availability, as
well as partition tolerance
27. ▪ There is no perfect NoSQL database
▪ Every database has its advantages and disadvantages
▪ Depending on the type of tasks (and preferences) to accomplish
▪ NoSQL is a set of concepts, ideas, technologies, and software
dealing with
▪ Big data
▪ Sparse un/semi-structured data
▪ High horizontal scalability
▪ Massive parallel processing
▪ Different applications, goals, targets, approaches need different
NoSQL solutions
28. ▪ Where would I use a NoSQL database?
▪ Do you have a large set of uncontrolled,
unstructured data somewhere that you are trying to fit into an RDBMS?
▪ Log Analysis
▪ Social Networking Feeds (many firms hooked in through
Facebook or Twitter)
▪ External feeds from partners
▪ Data that is not easily analyzed in a RDBMS such as time-
based data
▪ Large data feeds that need to be massaged before entry into
an RDBMS
34. • CREATE KEYSPACE − Creates a KeySpace in Cassandra.
• USE − Connects to a created KeySpace.
• ALTER KEYSPACE − Changes the properties of a KeySpace.
• DROP KEYSPACE − Removes a KeySpace
• CREATE TABLE − Creates a table in a KeySpace.
• ALTER TABLE − Modifies the column properties of a table.
• DROP TABLE − Removes a table.
• TRUNCATE − Removes all the data from a table.
• CREATE INDEX − Defines a new index on a single column of a table.
• DROP INDEX − Deletes a named index.
35. • INSERT − Adds columns for a row in a table.
• UPDATE − Updates a column of a row.
• DELETE − Deletes data from a table.
• BATCH − Executes multiple DML statements at once.
• CQL Clauses
• SELECT − Reads data from a table.
36. ▪ describe keyspaces;
▪ or desc keyspaces;
▪ use demo;
▪ describe tables;
▪ describe table emp;
▪ describe type emp;
37. ▪ create table student ( name text, usn text, mob bigint, id int, primary key(id));
▪ desc student;
▪ select * from student;
▪ insert into student(id, usn,name,mob) values (1, '1bm20cs001','abc',2334233423);
▪ update student set name='xyz' where id=1;
▪ update student set name='xyz' where id=2; // no error: UPDATE acts as an upsert and creates a new row
38. ▪ CREATE TABLE emp( emp_id int PRIMARY KEY, emp_name text, emp_city text,
emp_sal varint, emp_phone varint );
▪ Select * from emp;
▪ INSERT INTO emp (emp_id, emp_name, emp_city,
emp_phone, emp_sal) VALUES(1,'ram', 'Hyderabad', 9848022338, 50000);
▪ UPDATE emp SET emp_city='Delhi',emp_sal=50000 WHERE emp_id=2;
▪ DELETE emp_sal FROM emp WHERE emp_id=3;
39. ▪ The uuid() function generates a universally unique identifier (UUID), which makes it
useful for supplying primary-key values on insert.
▪ Create table function4(Id uuid primary key, name text);
▪ Insert into function4 (Id, name) values (1, 'Ashish'); // fails: 1 is not a uuid
▪ Insert into function4 (Id, name) values (uuid(), 'Ashish'); // correct
41. ▪ Connecting an application to a database from a programming language follows a
common pattern.
▪ The three steps are:
▪ Create a connection (which is called a Session)
▪ Use the session to execute the query.
▪ Close the connection/session.
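The three steps can be sketched in Python. To keep the example runnable without a database server, it uses the standard-library `sqlite3` module as a stand-in; this is an assumption made for illustration, and a NoSQL driver would follow the same connect, execute, close shape with its own calls. The `emp` table and its values are hypothetical.

```python
import sqlite3

# Step 1: create a connection (the "session").
session = sqlite3.connect(":memory:")
try:
    # Step 2: use the session to execute statements and queries.
    session.execute("CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, emp_name TEXT)")
    session.execute("INSERT INTO emp (emp_id, emp_name) VALUES (?, ?)", (1, "ram"))
    rows = session.execute("SELECT emp_id, emp_name FROM emp").fetchall()
finally:
    # Step 3: close the connection/session, even if a query failed.
    session.close()
```

Wrapping step 2 in `try`/`finally` is a design choice that guarantees the session is released whether or not the query succeeds.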
61. ▪ db.emp.aggregate([{"$match":{"section":"A"}}])
▪ db.emp.aggregate([{"$match":{"$and":[{"section":"A"},{"Marks":{"$gt":70}}]}}])
▪ db.emp.aggregate([{"$project":{"name":1,"section":1}}]) shows _id, name, and
section
▪ db.emp.aggregate([{"$project":{"_id":0,"name":1,"section":1}}]) shows only name
and section; _id is hidden
▪ db.emp.aggregate([{"$match":{"section":"A"}},{"$project":{"_id":0,"name":1,
"section":1}}])
▪ db.emp.aggregate([{"$group":{"_id":{"section":"$section"},
"Totalmarks":{"$sum":"$Marks"}}}])
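What the pipeline stages above compute can be mimicked in plain Python over a list of dictionaries. This is only an emulation to show the semantics of `$match`, `$project`, and `$group` with `$sum`; it is not the MongoDB driver API, and the helper functions and sample `emp` records are invented for illustration.

```python
from collections import defaultdict

emp = [
    {"_id": 1, "name": "Asha", "section": "A", "Marks": 82},
    {"_id": 2, "name": "Ravi", "section": "A", "Marks": 65},
    {"_id": 3, "name": "Meena", "section": "B", "Marks": 74},
]


def match(docs, predicate):
    """$match: keep only documents for which the predicate holds."""
    return [d for d in docs if predicate(d)]


def project(docs, fields):
    """$project: keep only the listed fields of each document."""
    return [{f: d[f] for f in fields if f in d} for d in docs]


def group_sum(docs, key_field, sum_field, out_field):
    """$group with $sum: total sum_field per distinct key_field value."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key_field]] += d[sum_field]
    return [{"_id": {key_field: k}, out_field: v} for k, v in sorted(totals.items())]


# Stages chain left to right, each consuming the previous stage's output.
section_a = project(match(emp, lambda d: d["section"] == "A"), ["name", "section"])
high_scorers = match(emp, lambda d: d["section"] == "A" and d["Marks"] > 70)
totals = group_sum(emp, "section", "Marks", "Totalmarks")
```

Each stage receives the documents produced by the stage before it, which is why placing the match before the projection keeps the projection's input small.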
64. ▪ Redis stands for REmote DIctionary Server.
▪ Redis is a NoSQL database built on the concept of key-value
pairs.
▪ Redis is a flexible, open-source (BSD licensed), in-memory data
structure store, used as database, cache, and message broker.
▪ Redis supports various types of data structures like strings, hashes,
lists, sets, sorted sets and bitmaps.
▪ As an advanced key-value store, Redis improves performance by
serving data directly from system memory.
65. ▪ Very flexible
▪ No schemas and column names
▪ Very fast : Can perform around 110,000 SETs per second and about 81,000 GETs
per second.
▪ Rich Datatype support
▪ Caching and Disk persistence
85. ▪ <?php
▪ //Connecting to Redis server on localhost
▪ $redis = new Redis();
▪ $redis->connect('127.0.0.1',6379);
▪ echo "Connected to server successfully";
▪ //check whether server is running or not
▪ echo "Server is running: ".$redis->ping();
▪ ?>
86. ▪ <?php
▪ //Connecting to Redis server on localhost
▪ $redis = new Redis();
▪ $redis->connect('127.0.0.1', 6379);
▪ echo "Connected to server successfully";
▪ //set the data in redis string
▪ $redis->set("tutorial-name", "Redis tutorial");
▪ // Get the stored data and print it
▪ echo "Stored string in redis:: " . $redis->get("tutorial-name");
▪ ?>
87. ▪ <?php
▪ $redis = new Redis();
▪ $redis->connect('127.0.0.1', 6379); //Connecting to Redis server on localhost
▪ echo "Connected to server successfully";
▪ //store data in redis list
▪ $redis->lpush("tutorial-list", "Redis");
▪ $redis->lpush("tutorial-list", "Mongodb");
▪ $redis->lpush("tutorial-list", "Mysql");
▪ // Get the stored data and print it
▪ $arList = $redis->lrange("tutorial-list", 0 ,5);
▪ echo "Stored string in redis:: ";
▪ print_r($arList);
▪ ?>
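The order in which the list comes back may be surprising, so here is a small Python emulation of the two list commands used above. It models only the behaviour relevant here (LPUSH inserts at the head; LRANGE takes an inclusive stop index and tolerates a stop past the end) and is not a Redis client; the helper functions are assumptions.

```python
from collections import deque

tutorial_list = deque()


def lpush(lst, value):
    lst.appendleft(value)   # LPUSH inserts at the head of the list


def lrange(lst, start, stop):
    # Simplified LRANGE for non-negative indexes; stop is inclusive
    # and may exceed the list length.
    return list(lst)[start:stop + 1]


for name in ("Redis", "Mongodb", "Mysql"):
    lpush(tutorial_list, name)

result = lrange(tutorial_list, 0, 5)
```

Because every push goes to the head, the last value pushed ("Mysql") is the first one returned.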
89. ▪ Neo4j is one of the most popular graph databases; it is queried with the Cypher Query Language (CQL).
▪ A graph database is a database that models data in the form of a graph.
▪ Other graph databases include Oracle NoSQL Database, OrientDB, HyperGraphDB,
GraphBase, InfiniteGraph, and AllegroGraph.
▪ Graph databases store relationships and connections as first-class entities.
91. ▪ Flexible data model
▪ Real-time insights
▪ High availability
▪ Connected and semi structured data
▪ Easy retrieval
▪ Cypher Query Language
▪ No Joins
104. ▪ CREATE (n)
▪ MATCH (n) RETURN n
▪ CREATE (n),(m)
▪ MATCH(n) RETURN n limit 2
▪ MATCH (n) WHERE id(n)=1 RETURN n
▪ MATCH (n) WHERE id(n)<=6 RETURN n
▪ MATCH (n) WHERE id(n) IN [1,2,6] RETURN n
105. ▪ MATCH (n) WHERE id(n)=1 DELETE n
▪ MATCH (n) RETURN n
▪ MATCH (n) WHERE id(n) IN[2,3] DELETE n
▪ MATCH (n) DELETE n
106. ▪ // With labels
▪ CREATE (n:Person)
▪ MATCH (n) WHERE n:Person RETURN n
▪ CREATE (n:Person:Indian) // two labels
▪ MATCH (n) WHERE n:Person:Indian RETURN n
▪ MATCH (n) WHERE n:Person OR n:Indian RETURN n
▪ MATCH (n) REMOVE n:Person RETURN n
▪ MATCH (n) WHERE ID(n) IN[2,3] REMOVE n:Employee RETURN n
107. ▪ //Update
▪ MATCH (n) WHERE ID(n)=0 REMOVE n:manager SET n:Director RETURN n
▪ //Create Node with property
▪ CREATE (x:Book{title:"NoSQL"}) RETURN x;
▪ CREATE (x:Book{title:"NoSQL",author:"abc",publisher:"wrox"}) RETURN x;
▪ MATCH (n:Book{author:"abc"}) RETURN n;
▪ MATCH (n:Book) WHERE n.price < 1000 AND (n.author = "abc" OR n.author = "xyz")
RETURN n;