Bigdata Ass2
Where it is used :-
Why it is used :-
Scalability
Fault Tolerance
Simplicity
Flexibility
-----------------------------------------------------------------------------------
Que 2
S.No.  Pig                                            Hive
1.     Operates on the client side of a cluster.      Operates on the server side of a cluster.
2.     Uses the Pig Latin language.                   Uses the HiveQL language.
3.     Is a procedural data flow language.            Is a declarative, SQL-like language.
4.     Was developed by Yahoo.                        Was developed by Facebook.
5.     Is used by researchers and programmers.        Is mainly used by data analysts.
6.     Handles structured and semi-structured data.   Mainly handles structured data.
7.     Is used for programming.                       Is used for creating reports.
8.     Scripts end with the .pig extension.           All file extensions are supported.
9.     Does not support partitioning.                 Supports partitioning.
10.    Loads data quickly.                            Loads data slowly.
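
Row 3 of the comparison (procedural data flow versus declarative query) can be illustrated with a small sketch. This is plain Python and SQLite, not Pig or Hive themselves: the sample records, the table name `visits`, and both function names are hypothetical and chosen only for illustration. The first function spells out each step in order, as a Pig Latin script would; the second states the desired result in SQL, as a HiveQL query would, and leaves the execution plan to the engine.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (visitor, page) records.
VISITS = [("ann", "home"), ("bob", "home"), ("ann", "cart"), ("ann", "home")]


def count_procedural(rows):
    """Pig-style analogy: write out each step of the data flow in order."""
    filtered = [r for r in rows if r[1] == "home"]   # step 1: filter
    groups = defaultdict(list)                       # step 2: group by visitor
    for visitor, page in filtered:
        groups[visitor].append(page)
    # step 3: count the records in each group
    return {visitor: len(pages) for visitor, pages in groups.items()}


def count_declarative(rows):
    """Hive-style analogy: describe the result and let the engine plan it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (visitor TEXT, page TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    query = (
        "SELECT visitor, COUNT(*) FROM visits "
        "WHERE page = 'home' GROUP BY visitor"
    )
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result


if __name__ == "__main__":
    print(count_procedural(VISITS))
    print(count_declarative(VISITS))
```

Both functions return the same counts per visitor; only the way the computation is expressed differs.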
-----------------------------------------------------------------------------------
Que 3
NoSQL, or "Not Only SQL," is a type of database management system designed for
handling large volumes of unstructured or semi-structured data. It provides
flexible data models, horizontal scalability, and high performance, making it well-
suited for modern, data-intensive applications. NoSQL databases come in various
forms, including document, key-value, column-family, and graph databases, each
tailored to specific use cases.
Variations of NoSQL
Column-Family Stores: Data is organized into column families rather than tables.
Graph Databases: These are optimized for storing and querying graph-like data
structures.
Wide-Column Stores: These are designed for large-scale data with high write throughput.
Object Databases: These store data in the form of objects, allowing for complex
data structures.
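
The difference between these data models is easier to see with a rough sketch. The Python structures below only imitate how each kind of store shapes its data; they are not the API of any real database, and every key, field name, and the helper `followed_by` are made up for illustration.

```python
import json

# Key-value: an opaque value looked up by a single key.
key_value = {"user:1": json.dumps({"name": "Ann", "city": "Pune"})}

# Document: a nested, self-describing record that can be searched by field.
documents = [
    {"_id": 1, "name": "Ann", "address": {"city": "Pune"}, "tags": ["admin"]},
    {"_id": 2, "name": "Bob", "address": {"city": "Delhi"}},
]

# Column-family / wide-column: row key -> column family -> columns.
# Rows do not have to share the same set of columns.
column_family = {
    "user:1": {"profile": {"name": "Ann"}, "activity": {"last_login": "2024-01-01"}},
    "user:2": {"profile": {"name": "Bob", "city": "Delhi"}},
}

# Graph: nodes plus labelled edges between them.
graph = {
    "nodes": {1: "Ann", 2: "Bob", 3: "Cara"},
    "edges": [(1, "FOLLOWS", 2), (1, "FOLLOWS", 3), (2, "FOLLOWS", 3)],
}


def followed_by(g, node_id):
    """Return the names of the nodes that node_id follows."""
    return [
        g["nodes"][dst]
        for src, label, dst in g["edges"]
        if src == node_id and label == "FOLLOWS"
    ]


if __name__ == "__main__":
    print(json.loads(key_value["user:1"])["city"])
    print([d["name"] for d in documents if d["address"]["city"] == "Delhi"])
    print(sorted(column_family["user:2"]["profile"]))
    print(followed_by(graph, 1))
```

The key-value store can only fetch the whole value by key, the document store can filter on nested fields, the column-family layout lets each row carry different columns, and the graph layout makes relationship queries direct.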
-----------------------------------------------------------------------------------
Que 4
The Hadoop ecosystem is a collection of open-source software tools and frameworks
for distributed storage and processing of big data. Its key components include:
-----------------------------------------------------------------------------------
Que 5
Social network mining refers to the process of extracting insights and patterns
from social media data. It involves collecting and analyzing user-generated content
from platforms like Facebook, Twitter, and Instagram.
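
As a minimal sketch of what "analyzing user-generated content" can mean in practice, the example below counts hashtag frequency across a handful of posts. The posts are invented sample data and `top_hashtags` is a hypothetical helper; real pipelines would collect posts through each platform's own API and apply far richer analysis.

```python
import re
from collections import Counter

# Invented sample posts standing in for collected social media content.
POSTS = [
    "Loving the new phone #tech #gadgets",
    "Big sale today #shopping",
    "#Tech news roundup",
    "Weekend plans? #tech #travel",
]


def top_hashtags(posts, n=3):
    """Return the n most frequent hashtags, ignoring letter case."""
    counts = Counter()
    for post in posts:
        # \w+ matches the letters, digits, and underscores after '#'.
        counts.update(tag.lower() for tag in re.findall(r"#(\w+)", post))
    return counts.most_common(n)


if __name__ == "__main__":
    print(top_hashtags(POSTS, n=1))
```

Counting recurring tags like this is one simple way to surface trending topics from a collection of posts.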
Applications: