Unit 3 tt1
v. Alter the table to add a new column “CGPA” with float datatype:
ALTER TABLE student ADD COLUMNS (cgpa FLOAT);
Together, these commands create a transactional, partitioned table in Hive, insert values into the
table, display the count of students per branch, update the name where ID = 103, and alter the
table to add a new column "CGPA" with the float datatype.
Table B: Salary
+----+--------+
| Id | Salary |
+----+--------+
| 1  | 50000  |
| 2  | 60000  |
| 4  | 70000  |
+----+--------+
These are the HBase commands for creating a table, adding column families and columns, displaying
table contents, and checking the table's status. Adjust the row key and column values as needed for your
specific use case.
(iii) Inserting the value v1 into column c1 of cf1, and the value v2 into column c2 of cf1:
put 't1', 'row_key', 'cf1:c1', 'v1'
put 't1', 'row_key', 'cf1:c2', 'v2'
These are the HBase commands for checking the status of the server, creating a table with column
families, inserting values into the table, displaying available filters, and deleting a table from the HBase
server. Adjust the table name, column families, row key, and column values as needed for your specific
use case.
Explain the working of Zookeeper. Also state the benefits of Zookeeper. (10 Marks)
• Apache ZooKeeper is a software project of the Apache Software Foundation.
• It is an open-source technology that maintains configuration information and provides
distributed synchronization and group services. It is deployed on a Hadoop cluster to help
administer the infrastructure.
• The ZooKeeper framework was originally built at Yahoo! to simplify access to applications.
Later, ZooKeeper was used for organizing the services used by distributed frameworks such as
Hadoop and HBase, and Apache ZooKeeper became a standard.
• It was designed to be a robust service that lets application developers focus mainly
on their application logic rather than on coordination.
• ZooKeeper is a distributed coordination service that also helps to manage a large set of hosts.
• Managing and coordinating a service, especially in a distributed environment, is a complicated
process. ZooKeeper addresses this with a simple architecture and API that allow developers
to implement common coordination tasks such as electing a master server, managing group
membership, and managing metadata.
• Apache ZooKeeper is used for maintaining centralized configuration information, naming,
providing distributed synchronization, and providing group services through a simple interface,
so that we don’t have to write them from scratch.
• Apache Kafka also uses ZooKeeper to manage configuration.
• ZooKeeper allows developers to focus on the core application logic, and it implements various
protocols on the cluster so that the applications need not implement them on their own.
Features of Apache ZooKeeper
Apache ZooKeeper provides a range of useful features, such as:
• Updating the Node’s Status: Apache ZooKeeper can update the status of every node, which
lets it store up-to-date information about each node across the cluster.
• Managing the Cluster: This technology can manage the cluster in such a way that the status
of each node is maintained in real time, leaving less room for errors and ambiguity.
• Naming Service: ZooKeeper attaches a unique identifier to every node, which works much
like DNS in helping to identify it.
• Automatic Failure Recovery: Apache ZooKeeper locks the data while it is being modified, which
helps the cluster recover it automatically if a failure occurs.
Working of Apache ZooKeeper
• As soon as the ensemble (a group of ZooKeeper servers) starts, it waits for clients to
connect to the servers.
• Each client then connects to one of the nodes in the ZooKeeper ensemble. That node
can be either the leader node or a follower node.
• Once the client is connected to a particular node, the node assigns a session ID to the client
and sends an acknowledgement to that client.
• If the client does not get an acknowledgement from the node, it resends the message to
another node in the ZooKeeper ensemble and tries to connect to that node instead.
• On receiving the acknowledgement, the client makes sure that the connection is not lost by
sending heartbeats to the node at regular intervals.
• Finally, the client can perform operations such as reading or writing data as needed.
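The connection steps above can be sketched as a small in-memory simulation. This is not the ZooKeeper client API: the Server and Client classes and their methods are hypothetical names used only to illustrate how a client tries the nodes in turn until one acknowledges with a session ID, and then keeps the session alive with heartbeats.

```python
import itertools


class Server:
    """Toy stand-in for one ZooKeeper server (hypothetical, not the real API)."""

    _session_ids = itertools.count(1)  # shared counter so session IDs are unique

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.sessions = {}  # session ID -> number of heartbeats received

    def accept(self):
        """Return a new session ID as the acknowledgement, or None if down."""
        if not self.alive:
            return None
        session_id = next(Server._session_ids)
        self.sessions[session_id] = 0
        return session_id

    def heartbeat(self, session_id):
        """Record a heartbeat; return True if the session is still known."""
        if not self.alive or session_id not in self.sessions:
            return False
        self.sessions[session_id] += 1
        return True


class Client:
    """Toy client that connects to the first server that acknowledges it."""

    def __init__(self, ensemble):
        self.ensemble = ensemble
        self.server = None
        self.session_id = None

    def connect(self):
        """Try each server in turn until one acknowledges the connection."""
        for server in self.ensemble:
            session_id = server.accept()
            if session_id is not None:
                self.server, self.session_id = server, session_id
                return True
        return False

    def ping(self):
        """Send one heartbeat to keep the session alive."""
        return self.server is not None and self.server.heartbeat(self.session_id)


# zk1 is down, so the client gets no acknowledgement and moves on to zk2.
ensemble = [Server("zk1", alive=False), Server("zk2"), Server("zk3")]
client = Client(ensemble)
connected = client.connect()
alive = client.ping()
```

A real client would send heartbeats on a timer and reconnect to another server when they stop being answered. The sketch leaves that out to keep the flow of the steps visible.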
Benefits of Apache ZooKeeper
• Simplicity: Coordination is done with the help of a shared hierarchical namespace.
• Reliability: The service keeps running when individual nodes fail, as long as a majority of
the servers in the ensemble remain available.
• Order: It keeps track of updates by stamping each one with a number denoting its order.
• Speed: It is especially fast for read-dominant workloads, performing best when reads
outnumber writes by a ratio of around 10:1.
• Scalability: Performance can be improved by deploying more machines.
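The Reliability and Order points can be made concrete with a little arithmetic. ZooKeeper keeps serving requests as long as a majority of the ensemble is up, so the number of failures an ensemble can tolerate follows directly from its size. The sketch below is plain Python, not ZooKeeper code: the function names are illustrative, and UpdateLog is a simplified, hypothetical stand-in for the way each update is stamped with an increasing number.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


class UpdateLog:
    """Stamps each update with an increasing number (simplified illustration)."""

    def __init__(self):
        self.next_id = 1
        self.entries = []

    def append(self, change):
        """Record a change together with its order number and return the pair."""
        stamped = (self.next_id, change)
        self.entries.append(stamped)
        self.next_id += 1
        return stamped


log = UpdateLog()
first = log.append("set /config/db = host1")
second = log.append("set /config/db = host2")
```

With these definitions, an ensemble of 3 servers tolerates 1 failure and an ensemble of 5 tolerates 2. An ensemble of 4 still tolerates only 1, which is why odd ensemble sizes are usually chosen.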