NoSql-Unit-4
SELVA KUMAR S
B.M.S COLLEGE OF ENGINEERING
▪ Working with NoSQL:
▪ Surveying Database Internals
▪ MongoDB
▪ Membase
▪ Hypertable
▪ Apache Cassandra
▪ Berkeley DB
▪ Clients speak to a MongoDB server using a simple TCP/IP-based socket
connection.
▪ The wire protocol used for the communication is a simple request-response-based socket protocol.
▪ Each message starts with a fixed header of little-endian integers; the documents in the payload are BSON encoded.
▪ OP_INSERT (code: 2002) — Insert a document.
▪ Every request and response message has a header. A standard message header has the following
properties:
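▪ messageLength: The total length of the message in bytes, including the header.
▪ responseTo: In the case of OP_QUERY and OP_GET_MORE, the response from the database carries the requestID of the original client request in this field, which lets the client match each response to its request.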
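The header layout can be sketched in a few lines of Python. This is a minimal sketch, assuming the four header fields named in these slides (messageLength, requestID, responseTo, opCode) are each a little-endian int32, as in the legacy MongoDB wire protocol; the helper names `pack_header` and `unpack_header` are hypothetical.

```python
import struct

# Four little-endian int32 fields: messageLength, requestID, responseTo, opCode.
HEADER = struct.Struct("<iiii")

OP_INSERT = 2002  # opcode value quoted earlier in these slides


def pack_header(body_length, request_id, response_to, op_code):
    """Return the 16-byte header for a message whose body is body_length bytes."""
    message_length = HEADER.size + body_length  # messageLength counts the header too
    return HEADER.pack(message_length, request_id, response_to, op_code)


def unpack_header(data):
    """Split the first 16 bytes of a message into named header fields."""
    message_length, request_id, response_to, op_code = HEADER.unpack_from(data)
    return {
        "messageLength": message_length,
        "requestID": request_id,
        "responseTo": response_to,
        "opCode": op_code,
    }


# A client request leaves responseTo at 0; only server replies fill it in.
header = pack_header(body_length=0, request_id=7, response_to=0, op_code=OP_INSERT)
print(unpack_header(header))
```

A reply to this request would carry 7 in its responseTo field, which is how a client pairs the two.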
▪ When creating and inserting a new document, a client sends an OP_INSERT operation via a request that
includes:
▪ A message header — A standard message header structure that includes messageLength, requestID,
responseTo, and opCode.
▪ An int32 value — Zero (which is simply reserved for future use).
▪ An array — This array contains one or more documents that need to be inserted into a collection.
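To make the OP_INSERT layout concrete, here is a rough sketch that assembles such a message by hand. It rests on several assumptions: the legacy wire protocol also places a NUL-terminated namespace string (for example "test.movies") between the reserved int32 and the documents, which the list above does not show; `encode_bson` is a hypothetical teaching encoder that handles only string and int32 values; and recent MongoDB servers no longer accept this legacy opcode, so the bytes are for study rather than for sending.

```python
import struct

OP_INSERT = 2002


def cstring(text):
    """Encode text as a NUL-terminated UTF-8 string."""
    return text.encode("utf-8") + b"\x00"


def encode_bson(doc):
    """Minimal BSON encoder for str and int32 values only (a teaching subset)."""
    body = b""
    for key, value in doc.items():
        if isinstance(value, str):
            raw = cstring(value)
            # type 0x02: name, int32 byte length (including the NUL), bytes
            body += b"\x02" + cstring(key) + struct.pack("<i", len(raw)) + raw
        elif isinstance(value, int):
            # type 0x10: name, int32 value
            body += b"\x10" + cstring(key) + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported value type for {key!r}")
    # int32 total length (counts itself and the trailing NUL), elements, NUL
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


def build_op_insert(request_id, namespace, docs):
    """Assemble header + reserved int32 + namespace + one or more documents."""
    body = struct.pack("<i", 0)          # reserved int32, zero
    body += cstring(namespace)           # assumed "database.collection" name
    body += b"".join(encode_bson(d) for d in docs)
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_INSERT)
    return header + body


message = build_op_insert(1, "test.movies", [{"title": "Alien"}])
print(len(message), message.hex())
```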
▪ When querying for documents in a collection, a client sends an OP_QUERY operation via a request that includes:
▪ A message header — A standard header with messageLength, requestID, responseTo, and opCode
elements in it.
▪ An int32 value — Contains flags that represent query options.
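The flags field is a bit vector, so one int32 can carry several query options at once. The sketch below assumes the bit positions documented for the legacy OP_QUERY message (bit 0 is reserved); the class and helper names are hypothetical and exist only for illustration.

```python
import struct
from enum import IntFlag


class QueryFlag(IntFlag):
    """Bit positions assumed from the legacy OP_QUERY documentation."""
    TAILABLE_CURSOR = 1 << 1
    SLAVE_OK = 1 << 2
    OPLOG_REPLAY = 1 << 3
    NO_CURSOR_TIMEOUT = 1 << 4
    AWAIT_DATA = 1 << 5
    EXHAUST = 1 << 6
    PARTIAL = 1 << 7


def encode_flags(flags):
    """Pack the combined options into the little-endian int32 sent on the wire."""
    return struct.pack("<i", int(flags))


def decode_flags(raw):
    """Recover the set of options from four wire bytes."""
    (value,) = struct.unpack("<i", raw)
    return QueryFlag(value)


wire = encode_flags(QueryFlag.TAILABLE_CURSOR | QueryFlag.AWAIT_DATA)
decoded = decode_flags(wire)
print(wire.hex(), QueryFlag.AWAIT_DATA in decoded, QueryFlag.SLAVE_OK in decoded)
```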
▪ An OP_REPLY message from the server includes:
▪ A message header — The message header in a client request and a server response is quite similar.
▪ An int32 value — Contains response flags that typically denote an error or exception situation.
▪ An int64 value — Contains the cursor id that allows a client to fetch more documents.
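Reading the fixed part of an OP_REPLY can be sketched as follows. The header, response flags, and int64 cursor id come from the list above; the two trailing int32 fields (startingFrom and numberReturned) are an assumption taken from the legacy wire protocol layout, and `parse_reply_prefix` is a hypothetical helper. The sample bytes are built locally rather than captured from a server.

```python
import struct

OP_REPLY = 1  # legacy opcode for server replies


def parse_reply_prefix(message):
    """Decode the header and the fixed fields that precede the reply documents."""
    message_length, request_id, response_to, op_code = struct.unpack_from("<iiii", message, 0)
    # int32 flags, int64 cursor id, then two int32 counters (assumed layout)
    flags, cursor_id, starting_from, number_returned = struct.unpack_from("<iqii", message, 16)
    return {
        "messageLength": message_length,
        "requestID": request_id,
        "responseTo": response_to,
        "opCode": op_code,
        "responseFlags": flags,
        "cursorID": cursor_id,
        "startingFrom": starting_from,
        "numberReturned": number_returned,
    }


# Hand-built sample: a reply to request 7 that returns no documents.
sample = struct.pack("<iiii", 36, 99, 7, OP_REPLY) + struct.pack("<iqii", 0, 123456789012, 0, 0)
reply = parse_reply_prefix(sample)
print(reply)
```

A non-zero cursorID is the value a client would send back in a later OP_GET_MORE request to fetch the next batch.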
▪ MongoDB stores database and collection data in files that reside at the path specified by the --dbpath option to the mongod server program.
▪ The default value for dbpath is /data/db.
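▪ db.movies.dataSize(); // size of the documents in the collection
▪ db.movies.storageSize(); // storage allocated for the collection's documents
▪ db.movies.totalSize(); // storage size plus total index size
▪ db.movies.totalIndexSize(); // combined size of all indexes on the collection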
▪ Parsing SQL statements
▪ Opening tables
▪ Locking tables
▪ Unlocking tables
▪ Closing tables
▪ Data is in-memory in two places: the storage engine buffer and Memcached.
▪ Replication of data between the storage engine and Memcached can leave the two copies in inconsistent states.
▪ The data is fetched into Memcached via the SQL layer, so the SQL overhead is still present, even if it is minimized.
▪ Memcached performance is superior only as long as all relevant data fits in memory. Once it does not, disk I/O overheads can be high and can make the system slow.
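The consistency problem in the list above can be reproduced with a toy model. This is a sketch under loud assumptions: two plain dictionaries stand in for the storage engine and for Memcached, and the class `CachedStore` and its method names are hypothetical, not part of any real client library.

```python
class CachedStore:
    """Toy read-through cache in front of a backing store (both plain dicts)."""

    def __init__(self):
        self.store = {}  # stands in for the storage engine
        self.cache = {}  # stands in for Memcached

    def read(self, key):
        """Serve from the cache, loading from the store on a miss."""
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]

    def write_store_only(self, key, value):
        """Update the store but forget the cache: the source of stale reads."""
        self.store[key] = value

    def write_through(self, key, value):
        """Update both copies together so later reads stay consistent."""
        self.store[key] = value
        self.cache[key] = value


db = CachedStore()
db.write_store_only("user:1", "alice")
first = db.read("user:1")            # miss: loads "alice" into the cache
db.write_store_only("user:1", "bob")
stale = db.read("user:1")            # hit: still the old cached value
db.write_through("user:1", "carol")
fresh = db.read("user:1")            # both copies now agree
print(first, stale, fresh)
```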
▪ Using Rails with NoSQL
▪ Using Django with NoSQL
▪ Using Spring Data
▪ gem install rails
▪ require 'rubygems'
▪ require 'mongo'
▪ source 'http://gemcutter.org'
▪ gem "mongo_mapper"
▪ get 'my_action/create_user'
▪ Django is to the Python community what Rails is to Ruby developers.
▪ Django is a lightweight web framework that allows for rapid prototyping and fast development.
▪ application.properties
▪ spring.data.mongodb.host=localhost
spring.data.mongodb.port=27017
spring.data.mongodb.database=springBootMongoDB
▪ Creating an entity class
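▪ import lombok.*;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

// Assumption: Lombok generates the constructors and toString() that run() relies on
@Data
@NoArgsConstructor
@RequiredArgsConstructor // constructor over the four @NonNull fields
@AllArgsConstructor      // constructor that also takes the id
@Document                // maps this class to a MongoDB collection
public class Book {
    @Id // marks this field as the document ID; MongoDB generates it when left null
    private String id;
    @NonNull
    private Integer bookId;
    @NonNull
    private String bookName;
    @NonNull
    private String bookAuthor;
    @NonNull
    private Double bookCost;
}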
▪ @Override
public void run(String... args) throws Exception {
    // Remove old data, if any
    bookRepo.deleteAll();
    // Save four books; MongoDB generates their IDs
    bookRepo.saveAll(Arrays.asList(
        new Book(501, "Core Java", "Kathy Sierra", 1065.5),
        new Book(502, "Spring in Action", "Craig Walls", 940.75),
        new Book(503, "Hibernate in Action", "Gavin King", 889.25),
        new Book(504, "Practical MongoDB", "Shakuntala Gupta", 785.0)
    ));
    System.out.println("All Data saved into MongoDB");
    // Supplying the ID manually is allowed; this ID does not exist yet, so save() inserts a new record
    bookRepo.save(new Book("ISBN10:1484240251", 504, "Practical MongoDB", "Navin Sabharwal", 785.0));
    // Print all books
    List<Book> bookList = bookRepo.findAll();
    bookList.forEach(System.out::println);
}
▪ Migrating from a structured schema to a schema-less form is not very hard.
▪ In many cases you could simply export the data from RDBMS tables and move them into NoSQL
collections.
▪ However, things get complicated when the NoSQL database is a column-family, sorted-ordered, or key/value store.
▪ The greater impedance mismatch is around ad-hoc querying and secondary indexes, which are often
difficult to support in a NoSQL environment.
▪ NoSQL looks at the data store from a query perspective and not from a generic storage viewpoint.
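The simple case mentioned above, exporting table rows and loading them as documents, can be sketched with the standard library alone. This is an illustration under stated assumptions: an in-memory SQLite database stands in for the RDBMS, the `movies` table and its columns are made up for the example, and the resulting dictionaries are only printed as JSON; handing them to a document store's insert call would be the next step.

```python
import json
import sqlite3

# In-memory SQLite database standing in for the source RDBMS (illustrative data).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows expose column names as keys
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, year INTEGER);
    INSERT INTO movies (id, title, year) VALUES (1, 'Alien', 1979), (2, 'Heat', 1995);
""")


def rows_to_documents(connection, table):
    """Turn every row of a table into a schema-less dict keyed by column name."""
    # The table name is interpolated directly: acceptable for a fixed example,
    # not for untrusted input.
    cursor = connection.execute(f"SELECT * FROM {table}")
    return [dict(row) for row in cursor]


documents = rows_to_documents(conn, "movies")
print(json.dumps(documents, indent=2))
```

Each row becomes one self-describing document. The harder cases named in the list, such as ad-hoc queries and secondary indexes, are not addressed by a plain export like this.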
▪ Sqoop facilitates importing data from an RDBMS into Hadoop for NoSQL-style manipulation.
▪ Sqoop is a command-line tool with the following capabilities:
▪ Imports individual RDBMS tables or entire databases to files in HDFS
▪ Generates Java classes to allow you to interact with your imported
data
▪ Provides the ability to import from SQL databases straight into your
Hive data warehouse