Big_Data_and_Hadoop_Notes
2. History of Hadoop:
3. Hadoop Ecosystem:
- Includes tools such as HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, and Oozie.
- Integrates Hadoop with IBM InfoSphere BigInsights for enterprise data management.
1. HDFS Concepts:
3. Hadoop I/O:
- Compression: Reduces data size, which saves storage space and speeds up data transfer.
- MapReduce: Splits the input data into tasks, processes them in parallel, and combines the results.
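
The split, parallel-process, combine pattern described above can be illustrated with a small word count. This is only a sketch in plain Python, not Hadoop's own API: the function names (`split_input`, `map_task`, `shuffle`, `reduce_task`, `word_count`) are hypothetical, and a thread pool stands in for the cluster nodes that would run map tasks in a real job.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_input(lines, num_splits):
    """Split the input into chunks, one per map task (hypothetical helper)."""
    return [lines[i::num_splits] for i in range(num_splits)]


def map_task(chunk):
    """Map phase: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_outputs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce phase: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines, num_splits=2):
    chunks = split_input(lines, num_splits)
    # The thread pool is an assumption standing in for parallel cluster nodes.
    with ThreadPoolExecutor(max_workers=num_splits) as pool:
        mapped = list(pool.map(map_task, chunks))
    groups = shuffle(mapped)
    return dict(reduce_task(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample = ["big data big clusters", "hadoop processes big data"]
    print(word_count(sample))
```

Each chunk is mapped independently, so the map phase can run in parallel. Only the shuffle step needs to see the output of every map task.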
3. Job Scheduling:
- Ensures efficient task execution using schedulers such as FIFO and the Fair Scheduler.
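
The difference between the two schedulers can be seen in a toy simulation. The sketch below assumes a deliberately simplified model: a cluster with a single task slot, where "fair" means rotating between jobs one task at a time. The real Fair Scheduler shares cluster resources across queues and is considerably more involved. All function names here are hypothetical.

```python
from collections import deque


def fifo_schedule(jobs):
    """Run every task of each job in submission order.

    jobs is a list of (name, num_tasks) pairs.
    Returns the job name executed at each time step.
    """
    order = []
    for name, num_tasks in jobs:
        order.extend([name] * num_tasks)
    return order


def fair_schedule(jobs):
    """Rotate between jobs, running one task per job per turn.

    This round-robin model is an assumption, not the actual Hadoop algorithm.
    """
    queue = deque((name, n) for name, n in jobs if n > 0)
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)
        if remaining > 1:
            queue.append((name, remaining - 1))
    return order


def completion_times(order):
    """Return the 1-based time step at which each job's last task finishes."""
    finish = {}
    for step, name in enumerate(order, start=1):
        finish[name] = step
    return finish


if __name__ == "__main__":
    jobs = [("long", 4), ("short", 1)]
    print("FIFO:", completion_times(fifo_schedule(jobs)))
    print("Fair:", completion_times(fair_schedule(jobs)))
```

In this model the short job waits behind the long one under FIFO, while the fair rotation lets it finish early at a small cost to the long job.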
1. Pig:
2. Hive:
3. HBase:
1. Supervised Learning:
- Uses labeled data to train models.
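
As a minimal illustration of training on labeled data, the sketch below implements a nearest-centroid classifier in plain Python. The choice of algorithm, the function names (`train`, `predict`), and the sample data are assumptions made for this example. The notes do not name a specific method.

```python
def train(samples):
    """Learn one centroid (mean feature vector) per label.

    samples is a list of (features, label) pairs, where features is a
    list of numbers. Returns a dict mapping each label to its centroid.
    """
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((c - x) ** 2 for c, x in zip(centroid, features))

    return min(model, key=lambda label: squared_distance(model[label]))


if __name__ == "__main__":
    # Made-up labeled training data: (features, label)
    training_data = [
        ([1.0, 1.0], "small"),
        ([1.5, 2.0], "small"),
        ([8.0, 9.0], "large"),
        ([9.0, 8.5], "large"),
    ]
    model = train(training_data)
    print(predict(model, [2.0, 1.0]))
    print(predict(model, [8.0, 8.0]))
```

The labels in the training data are what make this supervised: the model only learns where each class sits because every example says which class it belongs to.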
2. Unsupervised Learning:
3. Collaborative Filtering: