Course Code: CSE3009
Course Title: No SQL Data Bases
TPC: 3 2 4
Version No.: 1.0
Course Pre-requisites/Co-requisites: CSE2007
Anti-requisites (if any): None

Objectives: 1. This course explores the origins of NoSQL databases and
the characteristics that distinguish them from traditional
relational database management systems.
2. It covers the architectures and common features of the main
types of NoSQL databases (key-value stores, document
databases, column-family stores, graph databases).
3. Finally, it discusses the criteria that decision makers should
consider when choosing between relational and non-relational
databases, and techniques for selecting the NoSQL database
that best addresses specific use cases.
Expected Outcome: On completion of the course, students will be able to
1. Explain the detailed architecture of NoSQL databases, define objects, load data,
query data and tune performance.
2. Define NoSQL, its characteristics, history and the primary benefits
of using NoSQL databases.
3. Define the major types of NoSQL databases, including a
primary use case and the advantages and disadvantages of each type.
4. Analyze semi-structured data and choose an appropriate
storage structure.
Module No. 1 Introduction To NoSQL Concepts 6 Hours
Database revolutions: First generation, second generation, third generation, Managing
Transactions and Data Integrity, ACID and BASE for reliable database transactions, Speeding
performance by strategic use of RAM, SSD, and disk, Achieving horizontal scalability with
database sharding, Brewer’s CAP theorem.
Module No. 2 NoSQL Data Architecture Patterns 8 Hours
NoSQL Data model: Aggregate Models (Document Data Model, Key-Value Data Model,
Columnar Data Model), Graph Data Model, ways NoSQL systems handle big data problems:
moving queries to the data, not data to the query, hash rings to distribute
the data on clusters, replication to scale reads, distributing queries to data nodes.
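As an illustration of the "hash rings to distribute the data on clusters" topic, the following is a minimal sketch of a consistent-hash ring. It is not taken from the course material: the class name `HashRing`, the node names, the choice of SHA-256 and the number of virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node is placed at several positions to even out load.
        for i in range(self._vnodes):
            pos = ring_position(f"{node}#{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        """Return the node owning the first position clockwise from the key."""
        if not self._positions:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.node_for("user:42"))
```

The point of the design is that adding a node only takes over keys from existing nodes; keys never move between the nodes that were already present, which keeps data movement small when the cluster grows.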
Module No. 3 Key-Value Data Stores 8 Hours
From array to key-value databases, Essential features of key-value Databases, Properties of
keys, Characteristics of Values, Key-Value Database Data Modeling Terms, Key-Value
Architecture and implementation Terms, Designing Structured Values, Limitations of Key-Value
Databases, Design Patterns for Key-Value Databases, Case Study: Key-Value Databases for
Mobile Application Configuration
Module No. 4 Document-Oriented Database 7 Hours
Document, Collection, Naming, CRUD operations, querying, indexing, Replication, Sharding,
Consistency Implementation: Distributed consistency, Eventual Consistency, Capped Collection,
Case studies: document-oriented database: MongoDB and/or Cassandra
Module No. 5 Columnar Data Model 8 Hours
Data warehousing schemas: Comparison of columnar and row-oriented storage, Column-store
Architectures: C-Store and VectorWise, Column-store internals: Inserts/updates/deletes,
Indexing, Adaptive Indexing and Database Cracking. Advanced techniques: Vectorized
Processing, Compression, Write penalty, Joins, Group-by, Aggregation and Arithmetic Operations,
Case Studies
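To make the comparison of row-oriented and columnar storage and the compression topic concrete, here is a small sketch that pivots rows into columns and applies run-length encoding to one column. The rows, column names and function names are hypothetical and invented for illustration; real column stores use more elaborate encodings.

```python
from itertools import groupby

# Hypothetical rows (date, country, quantity), invented for illustration only.
ROWS = [
    ("2018-07-01", "IN", 3),
    ("2018-07-01", "IN", 5),
    ("2018-07-02", "IN", 2),
    ("2018-07-02", "US", 7),
    ("2018-07-03", "US", 1),
]
COLUMN_NAMES = ("date", "country", "quantity")


def to_columns(rows, names):
    """Pivot a row-oriented table into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, length in runs for _ in range(length)]


if __name__ == "__main__":
    columns = to_columns(ROWS, COLUMN_NAMES)
    # An aggregate over one attribute touches only that column.
    print(sum(columns["quantity"]))
    # Sorted, low-cardinality columns compress well with run-length encoding.
    print(rle_encode(columns["country"]))
```

Storing values of the same attribute together is what makes both steps cheap: the aggregate reads a single list, and repeated adjacent values collapse into short runs.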
Module No. 6 Data Modeling With Graph 8 Hours
Comparison of Relational and Graph Modeling, Property Graph Model. Graph Analytics: Link
analysis algorithm: Web as a graph, PageRank: Markov chain, PageRank computation,
Topic-specific PageRank (PageRank computation techniques: iterative processing, random walk
distribution). Querying Graphs: Introduction to Cypher, Case study: Building a Graph Database
Application: community detection
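The iterative PageRank computation named in this module can be sketched as a power iteration over a small link graph. This is an illustrative sketch only: the function name, the damping factor of 0.85, the convergence tolerance and the four-page example graph are assumptions, not values prescribed by the syllabus.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=100):
    """Iteratively compute PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict page -> rank; the ranks sum to 1.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical four-page web graph.
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

Each pass corresponds to one step of the random-walk Markov chain: with probability `damping` the surfer follows a link, otherwise jumps to a uniformly random page. Topic-specific PageRank would replace the uniform jump with a jump restricted to pages on the chosen topic.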
Text Books
1. Guy Harrison, “Next Generation Databases: NoSQL, NewSQL, and Big Data”, Apress, 2015
2. Ted Hills, “NoSQL and SQL Data Modeling: Bringing Together Data, Semantics, and
Software”, Technics Publications, 2016
References
1. Daniel Abadi, Peter Boncz, Stavros Harizopoulos, Stratos Idreos, Samuel Madden, “The
Design and Implementation of Modern Column-Oriented Database Systems”, Now
Publishers, 2013
Lab Exercises
1. Import the Hubway data into Neo4j and configure Neo4j. Then, answer the following
questions using the Cypher Query Language:
a) List top 10 stations with most outbound trips (Show station name and number of trips)
b) List top 10 stations with most inbound trips (Show station name and number of trips)
c) List top 5 routes with most trips (Show starting station name, ending station name and
number of trips)
d) List the hour number (for example 13 means 1pm-2pm) and number
of trips which start from the station "B.U. Central"
e) List the hour number (for example 13 means 1pm-2pm) and number of trips which end
at the station "B.U. Central"
2. The flight data can be found at http://stat-computing.org/dataexpo/2009/thedata.html
You need to download just one year, and from there you can sample a subset of at least
10000 records. You can use the data from a full year if you want, but we recommend using
a smaller dataset for simplicity. Hint: If you need to unzip the data file, you can use the
command: bzip2 -d datafile from a terminal. For example, for the year 1987, you download the
file and unzip it using: bzip2 -d 1987.csv.bz2. The airport data can be found at
http://stat-computing.org/dataexpo/2009/supplemental-data.html
(1) Download the flight dataset and airport dataset.
(2) Clean the dataset (for example: remove columns you do not need, remove records with
missing information, remove duplicate records and so on).
(3) Add a header row to the CSV files.
(4) Import the data into Neo4j.
(5) Write the queries to answer the following questions:
(5.1) List top 10 airports with most outbound flights.
(5.2) List top 10 airports with most inbound flights.
(5.3) List top 5 routes with most flights on weekdays.
(5.4) List top 5 routes with most flights on weekends.
(5.5) List the hour number (for example 13 means 1pm-2pm) and number of flights which
depart from a specific airport in your data (e.g., Boston Logan Airport).
(5.6) List the hour number (for example 13 means 1pm-2pm) and number of flights which
arrive at a specific airport in your data (e.g., Boston Logan Airport).
In your report, you should answer the following questions:
(a) List the year of the flights that you downloaded and prepared for this assignment. You
can get a sample set from one-year data. However, the number of flights cannot be smaller
than 10k.
(b) Describe how you clean the data (Which columns you remove and why? Which rows
you remove and why?). Hint: You can clean your data by writing a small program in Java,
Python, C, Matlab or any kind of programming language.
(c) Describe the header you give to the csv files.
(d) Write down the command for importing data.
(e) Write and execute the queries from step (5) above.
3. Download a zip code dataset at
http://media.mongodb.org/zips.json
Use mongoimport to import the zip code dataset into MongoDB.
After importing the data, answer the following questions by using aggregation pipelines:
(1) Find all the states that have a city called "BOSTON".
(2) Find all the states and cities whose names include the string "BOST".
(3) Each city has several zip codes. Find the city in each state with the largest number of zip
codes and rank those cities along with the states using the city populations.
(4) MongoDB can query on spatial information.
Assume we have a spatial position [-72, 42] and a range of 2 around it (a point in range can be
[-71.5, 41.5] or [-72.5, 42.5] or somewhere else); a number of zip codes may exist in that range.
Find the states in that range. You should return the total population and the number of cities of
each state in that range. Rank the states based on the number of cities.
(5) Consider a certain rectangular area, in which the vertices are [-80, 30], [-90, 30],
[-90, 40] and [-80, 40]. Find and report the top 10 largest cities (by population) in this
area.
4. Create a database that stores road cars. Each car has a manufacturer and a type, as well as a
maximum performance and a maximum torque value. Then do the following:
Test Cassandra's replication schema and consistency models.
5. Network Partition without Replication
6. Network Partition with Replication and Weak Consistency
7. Network Partition with Replication and Quorum Consistency
8. Cars have different powertrains. Each type can be described with different parameters:
Internal combustion engine: fuel type, displacement, maximum torque, maximum power.
Electric motor: maximum torque, maximum power.
Both: all of the above and the combined maximum torque and power values.
Construct the class hierarchy for the different powertrain types.
Extend the cars column family to store the powertrain of each car.
Write a query that collects the cars with an internal combustion engine or an electric motor.
9. Master Data Management using Neo4j
Manage your master data more effectively
The world of master data is changing. Data architects and application developers are
swapping their relational databases with graph databases to store their master data. This
switch enables them to use a data store optimized to discover new insights in existing data,
provide a 360-degree view of master data and answer questions about data relationships in
real time.
10. Optimization of Customer Experience with Real-time Recommendations using Neo4j
11. The operational intelligence case studies describe applications that collect machine
generated data from logging systems, application output, and other systems using
MongoDB.
12. The product data management case studies address aspects of applications required for
building product catalogs and managing inventory in e-commerce systems (use
MongoDB).
13. The content management case studies introduce basic patterns and techniques for building
content management systems using MongoDB.
14. Shopping mall case study using Cassandra, where many customers order items
from the mall and suppliers deliver the ordered items to them.
15. Key-Value Databases for Mobile Application Configuration
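
For the network partition exercises (5 to 7), the difference between weak and quorum consistency can be reasoned about with a simple rule: a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The helper names below are hypothetical, and the sketch models only this counting rule, not Cassandra itself.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that may be unreachable while a quorum is still available."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # Quorum writes and quorum reads overlap in at least one replica.
    print(q, reads_see_latest_write(rf, q, q), tolerated_failures(rf))
    # Writing and reading a single replica gives no such guarantee.
    print(reads_see_latest_write(rf, 1, 1))
```

With a replication factor of 1 there is no replica to fall back on during a partition, which is the situation exercise 5 explores.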

Projects
Projects may be given as group projects.
The following are sample projects that can be given to students to implement:
1. Analyzing and visualizing social networks like Facebook and Twitter using NoSQL
databases.
2. Using sample datasets from http://www.rdatamining.com/resources/data, UCLA
Repository, Kaggle datasets etc., and analyzing them using NoSQL databases.
3. Twitter provides a fire hose of data. Automatically filtering, aggregating and analyzing such
data makes it possible to harness the full value of the data and extract valuable information.
The idea of this project is to investigate stream processing technology that operates on social
streams.
4. Project on combining database management and a cloud storage system.
5. CarTel. In the CarTel project, we are building a system for collecting and managing data
from automobiles. There are several possible CarTel related projects:
a) One of the features of CarTel is a GUI for browsing geo-spatial data collected from cars.
It offers a primitive interface for retrieving parts of the data that are of interest, but developing a
more sophisticated interface or query language for browsing and exploring this data would
make a great project.
b) CarTel collects relatively sensitive personal information about users' location and
driving habits. Protecting this information from casual browsers, insurance companies, or
other undesired users is important. However, it is also important to be able to combine
different users' data to do things like intelligent route planning or vehicle anomaly
detection. The goal of this project would be to find a way to securely perform certain types
of aggregate queries over CarTel data without exposing personally identifiable
information.

Mode of Evaluation: Practice Tests-20%, Continuous Assessment Tests-60%, Practical Assessment-20%

Practice Tests - Cumulative for 16 Weeks: 20%
Continuous Assessment Test-1: 20%
Continuous Assessment Test-2: 20%
Continuous Assessment Test-3: 20%
Practical Assessment (Mini Project): 20%

Recommended by the Board of Studies on: 06.07.2018
Date of Approval by the Academic Council: 2nd Academic Council, 21.07.2018
