Hive-Part-2
PART-2
HIVE
• Hive is a data warehouse system used to analyse structured data.
• Built on top of Hadoop.
• Developed by Facebook.
• Provides functionality for reading, writing, and managing large datasets residing in distributed storage.
• Runs SQL-like queries written in HQL (Hive Query Language), which are internally converted to MapReduce jobs.
• Using Hive, we can skip writing complex MapReduce programs.
• Hive supports Data Definition Language (DDL),
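As a small illustration of the HQL point above (a sketch, assuming the emplist table that is created later in these slides already exists), an aggregate query is written as a single statement. Prefixing the same query with EXPLAIN prints the execution plan Hive generates for it; when the execution engine is MapReduce, the plan lists the map and reduce stages that would otherwise be written by hand.

hive> SELECT count(*) FROM emplist;
hive> EXPLAIN SELECT count(*) FROM emplist;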
HIVE Architecture
Hive - MetaStore
Apache Hive Installation
• Check the Java installation: $ java -version
• Check the Hadoop installation: $ hadoop version
• Download the Apache Hive tar file.
• http://mirrors.estointernet.in/apache/hive/hive-1.2.2/
• Extract the downloaded tar file.
• $ tar -xvf apache-hive-1.2.2-bin.tar.gz
• Open the bashrc file: $ sudo nano ~/.bashrc
• Provide the following HIVE_HOME path.
• export HIVE_HOME=/home/user/local/apache-hive-1.2.2-bin
• export PATH=$PATH:/home/user/local/apache-hive-1.2.2-bin/bin
• Update the environment variables: $ source ~/.bashrc
• Start Hive: $ hive
Hive
• Data Types
• DDL Commands
• DML Operations
• Data Retrieval Queries
Hive Data Types
• Basic datatypes
• Numbers
• Date / Time
• Strings
• Complex datatypes
HIVE DATA TYPES
Integer Types
Type      Size                     Range
BIGINT    8-byte signed integer    -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
Decimal Types
Type      Size                     Range
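A minimal sketch of how these type families can appear together in one table definition. The table name type_demo and every column name are hypothetical, chosen only for illustration; DESCRIBE then prints the resulting schema.

-- hypothetical table: one column per type family listed above
hive> CREATE TABLE type_demo (
id BIGINT,
price DECIMAL(10,2),
rating DOUBLE,
joined DATE,
updated TIMESTAMP,
name STRING,
skills ARRAY<STRING>,
scores MAP<STRING,INT>,
address STRUCT<city:STRING,pin:INT>
);
hive> DESCRIBE type_demo;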
Hive DDL
Hive - Create Database
hive> show databases;
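The statement that creates the database is not shown above. A minimal sketch, assuming a database named student (the name used in the ALTER DATABASE examples later in these slides): listing the databases before and after makes the effect visible.

hive> SHOW DATABASES;
-- IF NOT EXISTS avoids an error when the database is already present
hive> CREATE DATABASE IF NOT EXISTS student;
hive> SHOW DATABASES;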
Hive - Alter Database
• Adds database properties or modifies existing ones.
ALTER Database Command 1
Syntax: ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name = property_value, ...);
DATABASE and SCHEMA mean the same thing here, so either keyword can be used.
SCHEMA in ALTER is available in Hive 0.14.0 and later.
hive> ALTER DATABASE student SET DBPROPERTIES ('owner' = 'IIITK-Batch2020', 'Date' = '2023-09-27');
Step 4: Let’s change the existing property to see the effect. In our
example, we are changing the owner from ‘IIITK-Batch2020’ to ‘IIITK-
Batch2020-Set1’
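The statement for this step is not shown, so the following is a sketch that reuses the same SET DBPROPERTIES form with the new owner value from the text. DESCRIBE DATABASE EXTENDED then displays the stored properties so the change can be checked.

hive> ALTER DATABASE student SET DBPROPERTIES ('owner' = 'IIITK-Batch2020-Set1');
hive> DESCRIBE DATABASE EXTENDED student;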
Step 1: Change the user name associated with the student database.
hive> DESCRIBE DATABASE EXTENDED student;
This shows the current user info.
hive> ALTER DATABASE student SET OWNER USER Ram;
This changes the database owner from dikshant to Ram.
ALTER Database Command 3
Syntax:
Step 1: Create a database first so that we can create tables inside it.
hive> CREATE DATABASE database_name;
hive> SHOW DATABASES;
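The follow-up steps can be sketched as below: switch to the new database, then create a table inside it. The table name employee_data appears later in these slides, but its column list here is an assumption made only for illustration.

hive> USE database_name;
-- the column list is hypothetical
hive> CREATE TABLE employee_data (Id int, Name string, Salary float)
row format delimited
fields terminated by ',';
hive> SHOW TABLES;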
• The internal tables are not flexible enough to share with other
tools like Pig.
• An external table is declared with two keywords:
external keyword - specifies that the table is external
location keyword - specifies the directory where the table's data is stored
hive> create external table emplist (Id int, Name string , Salary float)
row format delimited
fields terminated by ','
location '/HiveDirectory';
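To confirm how the table was registered, a short sketch: DESCRIBE FORMATTED reports the table type and its location, and the dfs command, run from inside the Hive shell, lists the files under that directory.

-- the Table Type field in the output reads EXTERNAL_TABLE for an external table
hive> DESCRIBE FORMATTED emplist;
hive> dfs -ls /HiveDirectory;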
Hive - Load data into Table
• To add more data to the current table, execute the same load query again with the new file name.
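The load query itself is not shown above. A minimal sketch, assuming the emplist table defined earlier and two hypothetical comma-separated files on the local file system: LOCAL reads from the local file system rather than HDFS, and without the OVERWRITE keyword each load appends to the rows already in the table.

-- both file paths are hypothetical
hive> LOAD DATA LOCAL INPATH '/home/user/emp_details' INTO TABLE emplist;
hive> LOAD DATA LOCAL INPATH '/home/user/emp_details_2' INTO TABLE emplist;
hive> SELECT * FROM emplist;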
Load unmatched data
• If the data in one or more columns doesn't match the data type of the corresponding table columns, Hive does not throw an exception.
• Instead, it stores NULL at the position of the unmatched value.
• Let's add one more file to the current table. This file contains the unmatched data.
• The third column of the file contains string data, but the table expects float data there, so this is an unmatched data situation.
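This behaviour can be sketched as follows, assuming the emplist table (Id int, Name string, Salary float) and a hypothetical local file whose third field is text, for example a line such as 4,Ravi,abc. The load succeeds, and the query afterwards returns NULL in the Salary column for that row.

-- hypothetical file in which the third field is not a valid float
hive> LOAD DATA LOCAL INPATH '/home/user/emp_unmatched' INTO TABLE emplist;
hive> SELECT * FROM emplist;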
Hive - Alter Table
• In Hive, we can modify an existing table, for example by changing the table name, column names, comments, and table properties.
• It provides SQL-like commands to alter the table:
Rename a Table
Adding column
Change Column
Delete or Replace Column
Rename a Table
• Changes the name of an existing table.
Figure: existing tables present in the current database
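The rename statement is not shown on these slides. A minimal sketch using the ALTER TABLE ... RENAME TO form, where the new name emp_records is hypothetical: SHOW TABLES before and after lists the tables in the current database so the change is visible.

hive> SHOW TABLES;
-- emp_records is a hypothetical new name
hive> ALTER TABLE emplist RENAME TO emp_records;
hive> SHOW TABLES;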
Adding column
• Adds one or more columns to an existing table.
Syntax: ALTER TABLE table_name ADD COLUMNS (column_name datatype);
Figure: schema of the table
Figure: data of the columns that exist in the table
hive> ALTER TABLE employee_data ADD COLUMNS (age int);
Figure: updated schema of the table
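To check the result of the ADD COLUMNS statement, a short sketch: DESCRIBE prints the updated schema, and a SELECT shows that rows loaded before the change return NULL for the new age column, because the existing data files contain no value for it.

hive> DESCRIBE employee_data;
hive> SELECT * FROM employee_data;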
References
• https://www.geeksforgeeks.org/hive-alter-database/?ref=ml_lbp - Alter Database
• https://www.geeksforgeeks.org/how-to-create-table-in-hive/?ref=ml_lbp - Table
• https://www.javatpoint.com/hive-create-table