BDA DA1
Hadoop
Hadoop is a robust framework designed to tackle the challenges of big data
processing. It excels at storing and managing massive datasets across clusters of
computers.
At its core lies HDFS (Hadoop Distributed File System), which distributes data across
multiple nodes for efficient access and fault tolerance.
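HDFS's block-based storage can be illustrated locally with the Unix `split` command. This is only an analogy: HDFS's actual default block size is 128 MB, and the blocks (with their replicas) land on different DataNodes rather than one local disk.

```shell
# Write a small "dataset" and split it into fixed-size blocks,
# the way HDFS chops a large file into 128 MB blocks.
printf 'abcdefghijklmnopqrstuvwxyz' > bigfile.txt
split -b 10 bigfile.txt block_        # produces block_aa, block_ab, block_ac

# Reading the blocks back in order reconstructs the original file,
# just as an HDFS client reassembles blocks fetched from DataNodes.
cat block_aa block_ab block_ac
```

In real HDFS each block is additionally replicated (three copies by default), so the loss of any single node does not lose data.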
MapReduce, another key component, breaks down complex tasks into smaller,
parallel processes, enabling rapid data analysis.
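The map-shuffle-reduce pattern can be sketched with ordinary Unix pipes, a common classroom analogy: `sed` plays the mapper (emit a `word 1` pair per word), `sort` plays the shuffle (group identical keys together), and `awk` plays the reducer (sum the counts per key).

```shell
# Word count, MapReduce-style, over a tiny input:
echo "big data big insight" \
  | tr ' ' '\n' \
  | sed 's/$/ 1/' \
  | sort \
  | awk '{count[$1] += $2} END {for (w in count) print w, count[w]}' \
  | sort
```

In Hadoop the same three stages run in parallel across the cluster, with the mapper and reducer supplied as user code and the shuffle handled by the framework.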
Hadoop's ability to handle vast amounts of data has made it indispensable in various
industries. From analyzing customer purchasing patterns in retail to detecting
financial fraud and processing genomic data in healthcare, Hadoop empowers
organizations to extract valuable insights that drive informed decision-making. Its
scalability, fault tolerance, and cost-effectiveness have solidified its position as a
cornerstone in the big data ecosystem.
Procedure
After configuring the Hadoop files, set up the environment and start the Apache Hadoop daemons as follows:
1. Set up passwordless SSH:
ssh localhost
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
2. Format the file system:
hadoop-3.2.3/bin/hdfs namenode -format
6. Create a file in the local directory using the command: touch xyz.txt
7. Upload the file into the HDFS directory using the command: hadoop fs -put xyz.txt /user
8. Download a file from the HDFS directory to the local filesystem using the command:
hadoop fs -get /user/xyz.txt
9. Display the contents of a file using the command: hadoop fs -cat /user/xyz.txt
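The put/get/cat flow above needs a running cluster, but the effect of each command can be previewed on the local filesystem. Below, the hypothetical directory /tmp/hdfs_demo/user stands in for HDFS's /user directory; each `cp` mirrors what the corresponding `hadoop fs` command does between the local disk and HDFS.

```shell
mkdir -p /tmp/hdfs_demo/user                   # stand-in for the HDFS /user directory
echo "hello hdfs" > xyz.txt                    # step 6: create the local file
cp xyz.txt /tmp/hdfs_demo/user/                # step 7: hadoop fs -put copies local -> HDFS
cp /tmp/hdfs_demo/user/xyz.txt xyz_copy.txt    # step 8: hadoop fs -get copies HDFS -> local
cat /tmp/hdfs_demo/user/xyz.txt                # step 9: hadoop fs -cat prints the contents
```

The real commands differ in one important way: -put and -get move data between two separate filesystems (local and HDFS), with HDFS splitting and replicating the file's blocks behind the scenes.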
Files of all the 6 group members:
HBase
2. Open the HBase shell to create tables using the command: hbase shell
3. Create a table for each member, with 'name' and 'age' as column families, using the
command: create 'tablename', 'name', 'age'
4. Put details into every table using the commands (put takes the table, row key,
column, and value as separate arguments):
put 'tablename', 'row1', 'name:', 'XYZ'
put 'tablename', 'row1', 'age:', '21'
5. Display the details of every table using the command: scan 'tablename'
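Conceptually, an HBase table is a sorted map from (row key, column) to value, which is why scan returns cells in key order. That model can be sketched with a plain text file of one cell per line; the file name tablename.txt and the cells below just mirror the example table above.

```shell
# One cell per line: "rowkey columnfamily:qualifier value"
cat > tablename.txt <<'EOF'
row1 age: 21
row1 name: XYZ
EOF

# Fetch one cell, like: get 'tablename', 'row1', 'name:'
awk '$1 == "row1" && $2 == "name:" { print $3 }' tablename.txt

# Scan the whole table, like: scan 'tablename' (cells come back sorted by key)
sort tablename.txt
```

The real HBase additionally versions each cell with a timestamp and distributes row ranges (regions) across RegionServers, but the sorted-map view is the right mental model for get and scan.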