The document discusses the significance of NoSQL databases in managing Big Data, highlighting their advantages over traditional SQL databases. It explains the various types of NoSQL databases, such as key-value stores, document stores, wide column stores, and graph databases, along with their core properties and examples. The paper concludes that NoSQL databases offer greater scalability, flexibility, and performance for handling large and diverse datasets in real-time applications.
IJTRD8422
International Journal of Trend in Research and Development, Volume 4(3), ISSN: 2394-9333
www.ijtrd.com
Study of NoSQL Data Types with Big Data
Sarika Rameshwar Rathi, Computer Engineering Department, MGM's Polytechnic, Aurangabad, Maharashtra, India

Abstract: The term 'Big Data' describes the techniques and technologies used to capture, store, distribute, manage and analyze terabyte- or petabyte-scale data characterized by the 3 Vs. Data is produced by many dissimilar sources and can enter the system at various rates. Big Data applies to data sets whose volume or type is beyond the ability of traditional relational databases to capture, organize and process with low latency. Relational databases were not designed to cope with the scale and agility challenges that face contemporary applications, nor were they built to take advantage of the commodity storage and processing power available today. NoSQL provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases. These databases are increasingly used in big data and real-time web applications.

Keywords: Big Data, SQL, NoSQL, RDBMS, MongoDB, CouchDB, Redis, Apache HBase, Google's BigTable, Apache Cassandra.

I. INTRODUCTION

Big Data is a collection of very large data sets. It consists of structured data such as relational data, semi-structured data such as XML, and unstructured data such as Word documents, PDFs, text and media logs. Together these contain a huge amount of information. These large and complex data sets may be produced by many sources, such as black box data, social media, stock exchanges, search engines, climate sensors, digital pictures, traffic and software logs. Big Data was originally characterized by the 3 Vs: the first V indicates the enormous Volume of data, the second represents the Velocity at which that data must be processed, and the third denotes the broad Variety of data types.

Figure 1: Concept of the 3 Vs

Data is produced by many dissimilar sources and can enter the system at various rates. The major task is to merge data from these various systems. Databases are therefore required to store and process big data efficiently and to deliver high read and write performance, so traditional relational databases face many new challenges. In particular, in large-scale, high-concurrency applications, using a relational database to store and query dynamic user data has proved inadequate. In such cases, NoSQL databases play a very significant role in Big Data analytics. The first part of this paper explains the concept of Big Data, the second part explains the need for the NoSQL concept and NoSQL databases, and the last part explains the major differences between SQL and NoSQL.

A. Challenges for Traditional RDBMS

1. A relational database management system uses a schema, so the structure of the data must be predefined. The schema can be changed later, but large changes may be complicated. Much of the data generated today has no predefined format, so such changes cannot be managed every time the data changes.
2. SQL databases cannot readily process arbitrary, unstructured data, yet much of the data generated today is highly flexible, i.e. its structure changes frequently.
3. With relational database management systems, built-in clustering is difficult because of the ACID properties.
4. SQL databases cannot easily keep up with continuously growing database sizes and user counts.
5. As IT has developed, networked database applications have changed in their data models, distributed architectures and data storage. They also demand data extensibility and read and write speeds that SQL databases cannot deliver.

II. CONCEPT OF NOSQL

NoSQL ('Not Only SQL') refers to non-relational storage systems. NoSQL provides a method for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases. Such databases are increasingly used in big data and real-time web applications.

In a NoSQL database, data can be added anywhere, at any time; there is no need to specify a document design or schema in advance. The main features of NoSQL databases are scalability, concurrency control, consistency, availability and durability.

A. Architecture of NoSQL

Figure 2: Architecture of NoSQL Data
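The contrast drawn in Sections I.A and II between a predefined relational schema and schema-free insertion can be made concrete with a small sketch. It is only an illustration under stated assumptions: the relational side uses Python's standard sqlite3 module, and a plain list of dictionaries stands in for a document collection. The table name users, the field names and the helper functions are hypothetical and are not taken from the paper or from any particular NoSQL product.

```python
import sqlite3

# Relational side: the schema must be declared before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Asha"))


def try_insert_with_new_field():
    """Try to store a field that the declared schema does not contain."""
    try:
        conn.execute(
            "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
            (2, "Ravi", "Aurangabad"),
        )
        return True
    except sqlite3.OperationalError:
        # The unknown column is rejected; the schema would have to be
        # altered first (for example with ALTER TABLE).
        return False


# Schema-free side: a hypothetical stand-in for a document collection.
collection = []


def insert_document(doc):
    """Store a document as-is; no field list is declared in advance."""
    collection.append(dict(doc))


relational_ok = try_insert_with_new_field()
insert_document({"id": 1, "name": "Asha"})
insert_document({"id": 2, "name": "Ravi", "city": "Aurangabad"})
print(relational_ok, len(collection))
```

In this sketch the relational insert fails until the schema is changed, while the second document simply carries an extra field. A real document store adds indexing, persistence and querying on top of this idea.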
IJTRD | May-Jun 2017
These data stores are designed to offer higher scalability and availability than conventional relational databases while supporting a simplified transaction and consistency model. Some examples of NoSQL data stores are MongoDB, Apache HBase, Google's BigTable and Apache Cassandra.

The core properties of NoSQL data stores are: (1) they generally use a distributed and fault-tolerant architecture to which additional computers can easily be added, (2) data is partitioned among different computers, (3) data is replicated to provide fault tolerance, and (4) they often support a simplified transaction/consistency model.

B. NoSQL Database Types

The four core NoSQL systems are the key-value store, the document-based store, the wide-column-based store and the graph-based store.

Key-value store: In a key-value store, the key may be synthetic or auto-generated, while the value can be a string. The store is essentially a hash table in which each unique key maps unambiguously to a value, together with a pointer to a particular item of data. A bucket is a logical group of keys; buckets do not physically group the data. Identical keys can exist in different buckets. To read a value, both the key and the bucket are required, because the real key is a hash of (bucket + key). The operations supported include insert, lookup and delete. This simple model allows very fast and efficient operation, as exemplified by Amazon's Dynamo.

Document-based store: The data is a collection of key-value pairs stored as a document. This is quite similar to a key-value store, except that the stored values are referred to as documents. Arbitrary data is held in a structured format: the encoding is generally fixed, while the structure is flexible. One key difference between key-value and document stores is that a document store attaches attribute metadata to the stored content. Apache CouchDB is an example of a document store; it uses JSON to store data.

Wide-column-based store: Also known as an extensible record store. It can be seen as a key-value store with a two-dimensional key, i.e. a row key and a column key, that is able to hold very large numbers of dynamic columns. Column names and record keys are not fixed, and a record can have billions of columns. The key can be extended further by a timestamp, as in Google's BigTable, or by keyspaces or domains, so wide-column stores can have many dimensions. Accumulo, Cassandra, Druid and HBase are examples of wide-column-based stores.

Graph-based store: In this type, data is represented as a graph. These stores are typically used for data with a large and flexible number of interconnections. Graph structures with nodes, edges and properties are used, which provides index-free adjacency. This kind of database is designed for data whose relations are well represented as a graph of elements interconnected by a finite number of relations. Such data could be social relations, public transport links, road maps or network topologies. Data can easily be transformed from one model to another using such a database, and the resulting information can be applied to social relations or geographic data. AllegroGraph, ArangoDB and InfiniteGraph are examples of graph-based stores.

III. COMPARISON OF DIFFERENT NOSQL TYPES AND THEIR PERFORMANCE

Table 1: Performance evaluation of NoSQL data types

Type of NoSQL        | Performance | Scalability | Flexibility | Complexity | Functionality
Key-Value Store      | High        | High        | High        | Nil        | Variable (nil)
Column-Based Store   | High        | High        | Moderate    | Low        | Minimum
Document-Based Store | High        | Variable    | High        | Low        | Variable (low)
Graph-Based Store    | Variable    | Variable    | High        | High       | Graph theory

IV. SOME NOSQL DATABASE EXAMPLES

A. MongoDB

MongoDB is a free and open-source database. It is a non-relational database with a dynamic schema, written in C++. MongoDB is one of the most popular document-based NoSQL databases, as it stores data in JSON-like documents. It is currently used by large companies such as The New York Times, Craigslist and MTV Networks.

Just as an RDBMS consists of tables, MongoDB consists of collections. A collection is a group of MongoDB documents and exists within a single database. Collections do not enforce a schema, so documents within a collection can have different fields. Typically, all documents in a collection serve a similar or related purpose.

B. Some MongoDB Strengths

- Speed: It gives good performance, as all related data is held in a single document, which eliminates join operations.
- Scalability: It is horizontally scalable, i.e. the workload can be spread by increasing the number of servers in the resource pool instead of relying on a standalone resource.
- Manageable: It is easy to use for both developers and administrators. It also provides the ability to shard the database.
- Dynamic schema: It gives the flexibility to evolve the data schema without modifying the existing data.

C. CouchDB

CouchDB is an open-source database developed by the Apache Software Foundation and is a multi-master application. CouchDB is a document-based NoSQL database whose focus is on ease of use and on embracing the web. It uses JSON to store data (documents), JavaScript as its query language to transform the documents, and the HTTP
protocol as its API for accessing the documents. CouchDB became an Apache project in 2008. Some of CouchDB's strengths:

- Schema-less: It has a dynamic schema in the form of JSON documents, which makes it more flexible.
- HTTP query: Database documents can be accessed easily from a web browser.
- Conflict resolution: Automatic conflict detection, which is useful in a distributed database.
- Easy replication: Implementing replication is fairly straightforward.

D. Redis

Redis is an open-source, advanced key-value store. It is often referred to as a data structure server, since the values stored under its keys can be strings, hashes, lists, sets and sorted sets. Redis is written in C. It is generally used for building high-performance, scalable web applications and is valued for its lightning speed. It can replicate data to any number of replicas. Some of Redis's strengths:

- Exceptionally fast: Redis can perform about 110,000 SETs per second and about 81,000 GETs per second.
- Supports rich data types: Redis natively supports most of the data types that developers already know, which makes it easy to solve a variety of problems.
- Operations are atomic: All Redis operations are atomic, which ensures that if two clients access a key concurrently, the Redis server will receive the updated value.
- Multi-utility tool: Redis can be used in a number of use cases, such as caching and messaging queues.

E. HBase

HBase is a data store modeled on Google's BigTable and designed to provide quick random access to huge amounts of structured data. It is a distributed, column-oriented database built on top of the Hadoop file system. It is an open-source project and is horizontally scalable. It is the part of the Hadoop ecosystem that provides random, real-time read/write access to data in the Hadoop file system. Some of HBase's strengths:

- Easy operations: Basic operations on HBase are easy to perform using Java.
- Fault tolerance: It leverages the fault tolerance provided by the Hadoop Distributed File System (HDFS).
- Easy storage: Data can be stored in HDFS either directly or through HBase.
- Random access: Data consumers can read and access the data in HDFS randomly using HBase.

V. MAJOR DIFFERENCES BETWEEN SQL AND NOSQL

- SQL databases are primarily known as relational databases, whereas NoSQL databases are primarily known as non-relational or distributed databases.
- SQL databases have a predefined schema, whereas NoSQL databases have a dynamic schema for unstructured data.
- SQL databases are table-based, whereas NoSQL databases are document-based, key-value pairs, graph databases or wide-column stores. This means that SQL databases represent data in the form of tables consisting of n rows of data, whereas NoSQL databases are collections of key-value pairs, documents, graphs or wide columns that do not have a standard schema definition to which they must adhere.
- SQL databases are vertically scalable, whereas NoSQL databases are horizontally scalable.
- SQL database examples: MySQL, Oracle, SQLite, PostgreSQL and MS-SQL. NoSQL database examples: MongoDB, BigTable, Redis, RavenDB, Cassandra, HBase and CouchDB.

CONCLUSION

NoSQL databases are becoming a major element of today's database landscape, and their many advantages make them practical for real-world applications. Lower cost, easier scalability and open-source availability make them an attractive alternative for many companies looking to adopt Big Data. Most business today depends on computers, so the IT staff of a business must decide which features matter most for their database and select a database on that basis. The choice between NoSQL and SQL depends on the complex business needs of an organization and on the volume, velocity and variety of the data it possesses.
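The horizontal scalability that Section V contrasts with vertical scaling, together with the partitioning and replication listed among the core properties of NoSQL, can be sketched as a toy key-value cluster. This is an illustrative sketch under stated assumptions, not a description of how any particular product works: the Cluster class, its method names, the hash-modulo placement and the replication factor of 2 are all hypothetical choices made for brevity.

```python
import hashlib


class Cluster:
    """Toy key-value cluster (hypothetical): keys are hash-partitioned
    across nodes and each key is copied to `replicas` consecutive nodes."""

    def __init__(self, node_count, replicas=2):
        self.nodes = [{} for _ in range(node_count)]
        self.replicas = min(replicas, node_count)

    def _owners(self, key):
        # Hash the key and map it to a starting node, then take the
        # following nodes (wrapping around) as replica holders.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        first = int(digest, 16) % len(self.nodes)
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def insert(self, key, value):
        for idx in self._owners(key):
            self.nodes[idx][key] = value

    def lookup(self, key, failed=()):
        """Read from the first owner that is not listed in `failed`."""
        for idx in self._owners(key):
            if idx not in failed and key in self.nodes[idx]:
                return self.nodes[idx][key]
        return None

    def delete(self, key):
        for idx in self._owners(key):
            self.nodes[idx].pop(key, None)


cluster = Cluster(node_count=4, replicas=2)
for i in range(100):
    cluster.insert(f"user:{i}", {"n": i})

# Simulate the loss of the first owner of one key; the replica still answers.
primary = cluster._owners("user:7")[0]
print(cluster.lookup("user:7", failed={primary}))
```

Capacity grows here by constructing the cluster with more nodes rather than with a larger single machine. One simplification is worth noting: with plain modulo placement, changing the node count moves most keys, so real systems generally use more careful placement schemes such as consistent hashing.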