HIVE COMMANDS IN HADOOP
What is Hive in Hadoop?
Apache Hive is data warehouse software used for querying and
managing large data sets on a distributed cluster.
Introduced by Facebook (2007).
Hive is a data warehousing framework built on top of
Hadoop.
Hive is designed to enable:
• Easy data summarization
• Ad-hoc querying
• Analysis of large volumes of data
HADOOP + SQL = HQL
Very useful for anyone who knows SQL; BI Developers
feel right at home with Hive!
Access to files on various data stores such as HDFS and
HBase
HIVE ARCHITECTURE
Hive Metastore:
• Supports features like schemas and data partitioning
• Hive keeps its metadata in a relational database
• Packaged with Derby, a lightweight embedded SQL DB
Hive Clients:
• Thrift Application
• JDBC Application
• ODBC Application
Let's start with Hive: type hive in the terminal.
CREATE TABLE :
>CREATE [EXTERNAL] TABLE Student (Id INT, Name STRING, Addr STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/hive/data/college' ;
The CREATE statement creates a table called Student with three columns:
the first is an integer and the other two are strings.
With an EXTERNAL table, dropping the Student table deletes only the schema;
the data files are not deleted. Without EXTERNAL, a file loaded from HDFS is
moved into the table's directory, and dropping the table deletes the data as well.
The remaining clauses describe how the file is formatted (fields are separated
by commas and lines are terminated by '\n', the newline character) and the
location where the table's data is stored.
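The difference between the two kinds of table can be sketched as follows. This is only an illustration: the table names student_managed and student_ext are hypothetical, and the path is the one used in the statement above.

```sql
-- Managed table: Hive owns the data under its warehouse directory.
CREATE TABLE student_managed (Id INT, Name STRING, Addr STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

-- External table: Hive records only the schema; the files stay where they are.
CREATE EXTERNAL TABLE student_ext (Id INT, Name STRING, Addr STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/hive/data/college';

-- Dropping the managed table removes both the metadata and the data files.
DROP TABLE student_managed;

-- Dropping the external table removes only the metadata;
-- the files under /hive/data/college are left in place.
DROP TABLE student_ext;
```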
LOAD DATA INTO TABLE :
>LOAD DATA [LOCAL] INPATH 'path/to_the/input/file.txt' OVERWRITE INTO
TABLE Student ;
Data can be loaded from the local file system or from HDFS.
'LOCAL' signifies that the input file is on the local file system. If 'LOCAL' is omitted,
Hive looks for the file in HDFS.
The statement loads file.txt into the table Student. The keyword 'OVERWRITE'
signifies that existing data in the table is deleted. If the 'OVERWRITE' keyword is
omitted, data files are appended to existing data sets.
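The two variants might look like this in practice. Both file paths are hypothetical and chosen only for illustration.

```sql
-- Copy a file from the local file system and replace the table's contents.
LOAD DATA LOCAL INPATH '/tmp/students.txt' OVERWRITE INTO TABLE Student;

-- Move a file that is already in HDFS and append it to the existing data.
LOAD DATA INPATH '/user/hive/staging/students.txt' INTO TABLE Student;
```

With LOCAL the source file is copied; without it, the HDFS file is moved into the table's directory, so it no longer exists at its original path afterwards.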
LIST OF TABLES IN HIVE:
>SHOW TABLES ;
The SHOW TABLES command lists all the tables present in the current Hive
database.
For example:
Student
Employee
department etc.
DESCRIBE :
>DESCRIBE Student ;
The DESCRIBE command shows the schema of the table,
i.e. Student{Id INT, Name STRING, Addr STRING}
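When more than the column list is needed, DESCRIBE FORMATTED is a useful companion. A short sketch against the Student table above:

```sql
-- Column names and types only.
DESCRIBE Student;

-- Adds storage details such as the table's location and its table type
-- (managed or external).
DESCRIBE FORMATTED Student;
```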
SELECT:
>SELECT * FROM Student ;
It returns everything present in the table, like an RDBMS SQL query.
The * means all the columns of the table.
WHERE:
>SELECT * FROM Student WHERE Addr = 'PUNE' ;
WHERE applies a condition when querying the data.
The query above selects all rows whose address is PUNE.
ALTER :
>ALTER TABLE Student RENAME TO employee ;
Hive allows you to change the definition of columns, add new columns,
or even replace all existing columns in a table with a new set.
Here ALTER renames the Student table to employee. The examples that follow
assume the table is still named Student.
>ALTER TABLE Student ADD COLUMNS (phone INT) ;
ALTER can add extra columns to the table.
>ALTER TABLE Student REPLACE COLUMNS (Id INT, First_Name STRING COMMENT
'First_Name replaces the Name column', Addr STRING);
>ALTER TABLE Student REPLACE COLUMNS (Id INT COMMENT 'only keep the first
column');
REPLACE COLUMNS replaces all existing columns and only changes the
table's schema, not the data. REPLACE COLUMNS can also be used to drop columns
from the table's schema.
Columns can also carry comments, as the COMMENT clauses in these statements show.
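To rename a single column without restating the whole column list, Hive also offers ALTER TABLE ... CHANGE. A minimal sketch, assuming a table named Student with a Name column as defined earlier. Like REPLACE COLUMNS, it changes only the schema, not the data files.

```sql
-- Rename the Name column to First_Name, keeping its STRING type.
ALTER TABLE Student CHANGE Name First_Name STRING;

-- Confirm the new schema.
DESCRIBE Student;
```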
PARTITION:
>LOAD DATA INPATH '/data/files/sample1.txt' OVERWRITE INTO TABLE Employee
PARTITION (department='HR');
>LOAD DATA INPATH '/data/files/sample2.txt' OVERWRITE INTO TABLE Employee
PARTITION (department='Testing');
To increase performance, Hive can partition data. The values of the
partition column divide a table into segments.
The two LOAD statements above load data into two different partitions of
the table Employee. Table Employee must be created as partitioned by the key
department for this to succeed.
Example query:
>SELECT a.Id FROM Employee a WHERE a.department='HR';
This selects column 'Id' from all rows of partition department=HR of
the Employee table. The results are not stored anywhere, but are displayed on the
console.
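The LOAD statements above only work if Employee was declared with department as a partition column. A sketch of such a definition follows. The columns Id and Name and the comma delimiter are assumptions, since the slides do not show the table's schema.

```sql
-- The partition column is declared in PARTITIONED BY, not in the column list.
CREATE TABLE Employee (Id INT, Name STRING)
PARTITIONED BY (department STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- After loading, list the partitions that exist.
SHOW PARTITIONS Employee;
```

Each partition is stored as its own subdirectory (for example department=HR), so a query that filters on department reads only the matching directories.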
JOINS :
• Joins in Hive are trivial
• Supports outer joins
• Can join multiple tables
– Rows are joined where the keys match
– Rows that do not have matches are not included in the result
Inner Join
Let's say we have 2 tables: posts and likes
>SELECT * FROM posts LIMIT 10;
User1   Funny Story        1343182026191
User2   Cool Deal          1343182133839
User4   Interesting Post   343182154633
User5   Yet Another Blog   1343183939434
>SELECT * FROM likes LIMIT 10;
User1   12   1343182026191
User2   7    1343182139394
User3   0    1343182154633
User4   50   1343182147364
>CREATE TABLE posts_likes (user STRING, post STRING, likes_count INT);
We want to join these 2 data-sets and produce a single table that
contains user, post and count of likes.
>INSERT OVERWRITE TABLE posts_likes
SELECT p.user, p.post, l.count
FROM posts p JOIN likes l ON (p.user = l.user);
>SELECT * FROM posts_likes LIMIT 10;
User1   Funny Story        12
User2   Cool Deal          7
User4   Interesting Post   50
The two tables are joined on the user column; three columns are selected
and stored in the posts_likes table.
Outer JOIN Examples
>SELECT p.*, l.* FROM posts p LEFT OUTER JOIN likes l ON (p.user = l.user) LIMIT 10;
>SELECT p.*, l.* FROM posts p RIGHT OUTER JOIN likes l ON (p.user = l.user) LIMIT 10;
>SELECT p.*, l.* FROM posts p FULL OUTER JOIN likes l ON (p.user = l.user) LIMIT 10;
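SUBQUERY:
>SELECT col FROM (SELECT col1+col2 AS col FROM table1) table2;
Older Hive versions support subqueries only in the FROM clause; Hive 0.13 and
later also accept IN and EXISTS subqueries in the WHERE clause.
A subquery in the FROM clause must be given a name (table2 here). The columns
in its select list are available in the outer query just like columns of a table.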
VIEW:
>CREATE VIEW Student_V AS SELECT * FROM Student WHERE Name IS
NOT NULL AND Addr = 'PUNE';
A VIEW is a sort of "virtual table" that is defined by a SELECT statement.
Views can be used to present data to users in a way that differs from how it is
actually stored on disk.
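Once defined, the view is used like a table. A short sketch against the Student_V view above:

```sql
-- Query the view like any table; Hive runs the view's SELECT underneath.
SELECT Id, Name FROM Student_V;

-- Dropping the view does not touch the Student table or its data.
DROP VIEW Student_V;
```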
IMPORTING DATA-INSERT:
So far we have used the LOAD DATA operation to import data into a Hive table (or
partition) by copying or moving files to the table's directory. You can also populate
a table with data from another Hive table using an INSERT statement.
Example of an INSERT statement:
>INSERT OVERWRITE TABLE target SELECT col1, col2
FROM source;
For partitioned tables, you can specify the partition to insert into by supplying a
PARTITION clause:
>INSERT OVERWRITE TABLE target PARTITION (dept='HR') SELECT col1,
col2 FROM source;
IMPORTING DATA-MULTITABLE INSERT:
In HiveQL, you can turn the INSERT statement around and start with the
FROM clause, for the same effect. This form also lets several INSERT clauses
share one FROM clause.
Example:
>FROM records2
INSERT OVERWRITE TABLE stations_by_year
SELECT year, COUNT(DISTINCT station)
GROUP BY year
INSERT OVERWRITE TABLE records_by_year
SELECT year, COUNT(1)
GROUP BY year
INSERT OVERWRITE TABLE good_records_by_year
SELECT year, COUNT(1)
WHERE temperature != 9999
AND (quality = 0 OR quality = 1 OR quality = 4 OR quality = 5 OR quality = 9)
GROUP BY year;
There is a single source table (records2), but three tables to hold the results from
three different queries over the source. The source table is scanned only once.
BUCKETING:
A mechanism to query and examine random samples of data.
Data is broken into a set of buckets based on a hash function of a "bucket column".
This makes it possible to execute queries on a random subset of the data.
Hive doesn't automatically enforce bucketing when data is inserted, so
enforcement is switched on before the INSERT.
>CREATE TABLE post_count (user STRING, count INT)
CLUSTERED BY (user) INTO 5 BUCKETS;
>SET hive.enforce.bucketing = true;
>INSERT OVERWRITE TABLE post_count
SELECT user, COUNT(post) FROM posts GROUP BY user;
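Once the table is bucketed, a sample can be read with TABLESAMPLE. A sketch against the post_count table above; the backticks around user are included on the assumption that a newer Hive version is in use, where user is a reserved word.

```sql
-- Read roughly one fifth of the rows: bucket 1 of the 5 buckets.
SELECT * FROM post_count TABLESAMPLE (BUCKET 1 OUT OF 5 ON `user`);
```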
INDEXES:
Indexes are used to speed up Hive queries.
>CREATE INDEX date_index ON TABLE weather(date) AS 'COMPACT' WITH
DEFERRED REBUILD ;
Ganesh L. Sanap
connectoganesh@gmail.com
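We create an index, date_index, on the date column. Because the index is built
by an explicit REBUILD, it can be rebuilt partition by partition to save rebuild
time. Note that index support was removed in Hive 3.0.
>ALTER INDEX date_index ON weather PARTITION (month='01') REBUILD ;
>SHOW INDEX ON weather;
>DROP INDEX date_index ON weather;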
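NOTE: In Hive, ORDER BY produces a total ordering using a single reducer, while
SORT BY sorts rows within each reducer, so with multiple reducers the overall
output is only partially ordered.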
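The difference can be sketched with the Student table from the earlier slides. DISTRIBUTE BY is shown as well, since it controls which reducer each row is sent to before SORT BY is applied.

```sql
-- Total order across the whole result; all rows pass through one reducer.
SELECT Id, Name FROM Student ORDER BY Id;

-- Each reducer sorts only its own rows; output is ordered per reducer.
SELECT Id, Name FROM Student SORT BY Id;

-- Send rows with the same Addr to the same reducer, then sort within each.
SELECT Id, Name, Addr FROM Student DISTRIBUTE BY Addr SORT BY Addr, Id;
```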
Ad

More Related Content

What's hot (18)

Hive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive TeamHive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive Team
Zheng Shao
 
Hive Quick Start Tutorial
Hive Quick Start TutorialHive Quick Start Tutorial
Hive Quick Start Tutorial
Carl Steinbach
 
Hadoop on osx
Hadoop on osxHadoop on osx
Hadoop on osx
Devopam Mittra
 
Apache Hive
Apache HiveApache Hive
Apache Hive
Ajit Koti
 
Hive Anatomy
Hive AnatomyHive Anatomy
Hive Anatomy
nzhang
 
Hive(ppt)
Hive(ppt)Hive(ppt)
Hive(ppt)
Abhinav Tyagi
 
Introduction to scoop and its functions
Introduction to scoop and its functionsIntroduction to scoop and its functions
Introduction to scoop and its functions
Rupak Roy
 
Introduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLab
CloudxLab
 
Perl Programming - 04 Programming Database
Perl Programming - 04 Programming DatabasePerl Programming - 04 Programming Database
Perl Programming - 04 Programming Database
Danairat Thanabodithammachari
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export
Rupak Roy
 
HiveServer2
HiveServer2HiveServer2
HiveServer2
Schubert Zhang
 
Unit 5-lecture-3
Unit 5-lecture-3Unit 5-lecture-3
Unit 5-lecture-3
vishal choudhary
 
ACADGILD:: HADOOP LESSON
ACADGILD:: HADOOP LESSON ACADGILD:: HADOOP LESSON
ACADGILD:: HADOOP LESSON
Padma shree. T
 
SQOOP PPT
SQOOP PPTSQOOP PPT
SQOOP PPT
Dushhyant Kumar
 
Hive
HiveHive
Hive
Vetri V
 
Ten tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache HiveTen tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache Hive
Will Du
 
From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other tools
Guy Harrison
 
Hbase
HbaseHbase
Hbase
Vetri V
 
Hive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive TeamHive User Meeting March 2010 - Hive Team
Hive User Meeting March 2010 - Hive Team
Zheng Shao
 
Hive Quick Start Tutorial
Hive Quick Start TutorialHive Quick Start Tutorial
Hive Quick Start Tutorial
Carl Steinbach
 
Hive Anatomy
Hive AnatomyHive Anatomy
Hive Anatomy
nzhang
 
Introduction to scoop and its functions
Introduction to scoop and its functionsIntroduction to scoop and its functions
Introduction to scoop and its functions
Rupak Roy
 
Introduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to Apache Hive | Big Data Hadoop Spark Tutorial | CloudxLab
CloudxLab
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export
Rupak Roy
 
ACADGILD:: HADOOP LESSON
ACADGILD:: HADOOP LESSON ACADGILD:: HADOOP LESSON
ACADGILD:: HADOOP LESSON
Padma shree. T
 
Ten tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache HiveTen tools for ten big data areas 04_Apache Hive
Ten tools for ten big data areas 04_Apache Hive
Will Du
 
From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other tools
Guy Harrison
 

Viewers also liked (14)

Hadoop basic commands
Hadoop basic commandsHadoop basic commands
Hadoop basic commands
bispsolutions
 
Integration of Hive and HBase
Integration of Hive and HBaseIntegration of Hive and HBase
Integration of Hive and HBase
Hortonworks
 
Pig programming is more fun: New features in Pig
Pig programming is more fun: New features in PigPig programming is more fun: New features in Pig
Pig programming is more fun: New features in Pig
daijy
 
What's new in Apache Hive
What's new in Apache HiveWhat's new in Apache Hive
What's new in Apache Hive
DataWorks Summit
 
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labsApache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Viswanath Gangavaram
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks
 
Dataiku pig - hive - cascading
Dataiku   pig - hive - cascadingDataiku   pig - hive - cascading
Dataiku pig - hive - cascading
Dataiku
 
Hive Evolution: ApacheCon NA 2010
Hive Evolution:  ApacheCon NA 2010Hive Evolution:  ApacheCon NA 2010
Hive Evolution: ApacheCon NA 2010
John Sichi
 
Hadoop File System Shell Commands,
Hadoop File System Shell Commands,Hadoop File System Shell Commands,
Hadoop File System Shell Commands,
Hadoop online training
 
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaPig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Edureka!
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
Lynn Langit
 
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on Hadoop
Zheng Shao
 
Data Modeling for Big Data
Data Modeling for Big DataData Modeling for Big Data
Data Modeling for Big Data
DATAVERSITY
 
Hadoop basic commands
Hadoop basic commandsHadoop basic commands
Hadoop basic commands
bispsolutions
 
Integration of Hive and HBase
Integration of Hive and HBaseIntegration of Hive and HBase
Integration of Hive and HBase
Hortonworks
 
Pig programming is more fun: New features in Pig
Pig programming is more fun: New features in PigPig programming is more fun: New features in Pig
Pig programming is more fun: New features in Pig
daijy
 
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labsApache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Apache pig power_tools_by_viswanath_gangavaram_r&d_dsg_i_labs
Viswanath Gangavaram
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks
 
Dataiku pig - hive - cascading
Dataiku   pig - hive - cascadingDataiku   pig - hive - cascading
Dataiku pig - hive - cascading
Dataiku
 
Hive Evolution: ApacheCon NA 2010
Hive Evolution:  ApacheCon NA 2010Hive Evolution:  ApacheCon NA 2010
Hive Evolution: ApacheCon NA 2010
John Sichi
 
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | EdurekaPig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Edureka!
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
Lynn Langit
 
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on Hadoop
Zheng Shao
 
Data Modeling for Big Data
Data Modeling for Big DataData Modeling for Big Data
Data Modeling for Big Data
DATAVERSITY
 
Ad

Similar to Hive commands (20)

Mysql cheatsheet
Mysql cheatsheetMysql cheatsheet
Mysql cheatsheet
Adolfo Nasol
 
MySQL Essential Training
MySQL Essential TrainingMySQL Essential Training
MySQL Essential Training
HudaRaghibKadhim
 
Module 3
Module 3Module 3
Module 3
cs19club
 
PPT of Common Table Expression (CTE), Window Functions, JOINS, SubQuery
PPT  of Common Table Expression (CTE), Window Functions, JOINS, SubQueryPPT  of Common Table Expression (CTE), Window Functions, JOINS, SubQuery
PPT of Common Table Expression (CTE), Window Functions, JOINS, SubQuery
Abhishek590097
 
Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...
Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...
Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...
SakkaravarthiS1
 
Sql tutorial
Sql tutorialSql tutorial
Sql tutorial
amitabros
 
Creating database using sql commands
Creating database using sql commandsCreating database using sql commands
Creating database using sql commands
Belle Wx
 
SQL
SQLSQL
SQL
Shyam Khant
 
ADVANCE ITT BY PRASAD
ADVANCE ITT BY PRASADADVANCE ITT BY PRASAD
ADVANCE ITT BY PRASAD
PADYALAMAITHILINATHA
 
Its about a sql topic for basic structured query language
Its about a sql topic for basic structured query languageIts about a sql topic for basic structured query language
Its about a sql topic for basic structured query language
IMsKanchanaI
 
SQL. It education ppt for reference sql process coding
SQL. It education ppt for reference  sql process codingSQL. It education ppt for reference  sql process coding
SQL. It education ppt for reference sql process coding
aditipandey498628
 
Les10 Creating And Managing Tables
Les10 Creating And Managing TablesLes10 Creating And Managing Tables
Les10 Creating And Managing Tables
NETsolutions Asia: NSA – Thailand, Sripatum University: SPU
 
DOODB_LAB.pptx
DOODB_LAB.pptxDOODB_LAB.pptx
DOODB_LAB.pptx
FilestreamFilestream
 
DBMS and SQL(structured query language) .pptx
DBMS and SQL(structured query language) .pptxDBMS and SQL(structured query language) .pptx
DBMS and SQL(structured query language) .pptx
jainendraKUMAR55
 
Introduction to database and sql fir beginers
Introduction to database and sql fir beginersIntroduction to database and sql fir beginers
Introduction to database and sql fir beginers
reshmi30
 
DBMS.pdf
DBMS.pdfDBMS.pdf
DBMS.pdf
Rishab Saini
 
SQl data base management and design
SQl     data base management  and designSQl     data base management  and design
SQl data base management and design
franckelsania20
 
Hive
HiveHive
Hive
GowriLatha1
 
Chapter 4 Structured Query Language
Chapter 4 Structured Query LanguageChapter 4 Structured Query Language
Chapter 4 Structured Query Language
Eddyzulham Mahluzydde
 
SQL.pptx for the begineers and good know
SQL.pptx for the begineers and good knowSQL.pptx for the begineers and good know
SQL.pptx for the begineers and good know
PavithSingh
 
PPT of Common Table Expression (CTE), Window Functions, JOINS, SubQuery
PPT  of Common Table Expression (CTE), Window Functions, JOINS, SubQueryPPT  of Common Table Expression (CTE), Window Functions, JOINS, SubQuery
PPT of Common Table Expression (CTE), Window Functions, JOINS, SubQuery
Abhishek590097
 
Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...
Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...
Unit-1 SQL fundamentals.docx SQL commands used to create table, insert values...
SakkaravarthiS1
 
Sql tutorial
Sql tutorialSql tutorial
Sql tutorial
amitabros
 
Creating database using sql commands
Creating database using sql commandsCreating database using sql commands
Creating database using sql commands
Belle Wx
 
Its about a sql topic for basic structured query language
Its about a sql topic for basic structured query languageIts about a sql topic for basic structured query language
Its about a sql topic for basic structured query language
IMsKanchanaI
 
SQL. It education ppt for reference sql process coding
SQL. It education ppt for reference  sql process codingSQL. It education ppt for reference  sql process coding
SQL. It education ppt for reference sql process coding
aditipandey498628
 
DBMS and SQL(structured query language) .pptx
DBMS and SQL(structured query language) .pptxDBMS and SQL(structured query language) .pptx
DBMS and SQL(structured query language) .pptx
jainendraKUMAR55
 
Introduction to database and sql fir beginers
Introduction to database and sql fir beginersIntroduction to database and sql fir beginers
Introduction to database and sql fir beginers
reshmi30
 
SQl data base management and design
SQl     data base management  and designSQl     data base management  and design
SQl data base management and design
franckelsania20
 
SQL.pptx for the begineers and good know
SQL.pptx for the begineers and good knowSQL.pptx for the begineers and good know
SQL.pptx for the begineers and good know
PavithSingh
 
Ad

Recently uploaded (20)

LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRYLEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
NidaFarooq10
 
Download YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full ActivatedDownload YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full Activated
saniamalik72555
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Lionel Briand
 
Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
FL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full VersionFL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full Version
tahirabibi60507
 
Exploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the FutureExploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the Future
ICS
 
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AIScaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
danshalev
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
Adobe Master Collection CC Crack Advance Version 2025
Adobe Master Collection CC Crack Advance Version 2025Adobe Master Collection CC Crack Advance Version 2025
Adobe Master Collection CC Crack Advance Version 2025
kashifyounis067
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
steaveroggers
 
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Dele Amefo
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRYLEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
NidaFarooq10
 
Download YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full ActivatedDownload YouTube By Click 2025 Free Full Activated
Download YouTube By Click 2025 Free Full Activated
saniamalik72555
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Lionel Briand
 
Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
FL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full VersionFL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full Version
tahirabibi60507
 
Exploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the FutureExploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the Future
ICS
 
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AIScaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
danshalev
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
The Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdfThe Significance of Hardware in Information Systems.pdf
The Significance of Hardware in Information Systems.pdf
drewplanas10
 
Adobe Master Collection CC Crack Advance Version 2025
Adobe Master Collection CC Crack Advance Version 2025Adobe Master Collection CC Crack Advance Version 2025
Adobe Master Collection CC Crack Advance Version 2025
kashifyounis067
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
steaveroggers
 
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Dele Amefo
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 

Hive commands

  • 2. What is HIVE in hadoop ? Apache HIVE is data warehouse software used to querying and managing large data set on distributed cluster. Introduced by Facebook(2007). Hive is a data warehousing framework built on top of Hadoop. Hive is designed to enable, Easy data summarization Ad-hoc querying Analysis of large volumes of data HADOOP + SQL = HQL Very useful for anyone who knows SQL; BI Developers feel right at home with Hive! Access to files on various data stores such as HDFS and Hbase
  • 3. HIVE ARCHITECTURE
    Hive Metastore:
      • Supports features like schema(s) and data partitioning
      • Hive keeps its metadata in a relational database
      • Packaged with Derby, a lightweight embedded SQL DB
    Hive Clients:
      • Thrift application
      • JDBC application
      • ODBC application
  • 4. Let's start with Hive: type hive into the terminal.
    CREATE TABLE:
    >CREATE [EXTERNAL] TABLE Student (Id INT, Name STRING, Addr STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    LOCATION '/hive/data/college';
    The CREATE statement creates a table called Student with three columns: the first is an integer and the other two are strings.
    With an EXTERNAL table, dropping the Student table deletes only the schema; the data file is not deleted. Without EXTERNAL, the original file is moved into the table when we load it, and dropping the table deletes the file as well.
    The remaining clauses describe how the file is formatted (fields are separated by commas and lines are terminated by '\n', the newline character) and the location where the table's data is stored.
  • 5. LOAD DATA INTO TABLE:
    >LOAD DATA [LOCAL] INPATH 'path/to_the/input/file.txt' OVERWRITE INTO TABLE Student;
    Data can be loaded from the local file system or from HDFS. 'LOCAL' signifies that the input file is on the local file system. If 'LOCAL' is omitted, Hive looks for the file in HDFS.
    The statement places file.txt in the table Student. The keyword 'OVERWRITE' signifies that existing data in the table is deleted. If 'OVERWRITE' is omitted, the data files are appended to the existing data.
    LIST OF TABLES IN HIVE:
    >SHOW TABLES;
    SHOW TABLES lists all the tables present in the Hive database, for example Student, Employee, department, etc.
  • 6. DESCRIBE:
    >DESCRIBE Student;
    DESCRIBE shows the schema of the table, i.e. Student {Id INT, Name STRING, Addr STRING}.
    SELECT:
    >SELECT * FROM Student;
    Returns everything present in the table, like an RDBMS SQL query. The * means all columns of the table.
    WHERE:
    >SELECT * FROM Student WHERE Addr = 'PUNE';
    WHERE applies a condition when querying the data. The query above selects all rows whose address is PUNE.
  • 7. ALTER:
    Hive allows you to change the definition of columns, add new columns, or even replace all existing columns in a table with a new set. Each statement below is a separate example against the Student table.
    >ALTER TABLE Student RENAME TO employee;
    ALTER renames the Student table to employee.
    >ALTER TABLE Student ADD COLUMNS (phone INT);
    ALTER can add extra columns to the table.
    >ALTER TABLE Student REPLACE COLUMNS (Id INT, First_Name STRING, Addr STRING COMMENT 'First_Name replaces the Name column');
    >ALTER TABLE Student REPLACE COLUMNS (Id INT COMMENT 'only keep the first column');
    REPLACE COLUMNS replaces all existing columns and only changes the table's schema, not the data. REPLACE COLUMNS can also be used to drop columns from the table's schema.
    Columns can also carry comments, using the COMMENT keyword as shown above.
  • 8. PARTITION:
    >LOAD DATA INPATH '/data/files/sample1.txt' OVERWRITE INTO TABLE Employee PARTITION (department='HR');
    >LOAD DATA INPATH '/data/files/sample2.txt' OVERWRITE INTO TABLE Employee PARTITION (department='Testing');
    To increase performance, Hive can partition data. The values of the partition column divide a table into segments. The two LOAD statements above load data into two different partitions of the table Employee. The table Employee must have been created as partitioned by the key department for this to succeed.
    Example query:
    >SELECT a.Id FROM Employee a WHERE a.department='HR';
    Selects the column Id from all rows of the partition department=HR of the Employee table. The results are not stored anywhere, but are displayed on the console.
  • 9. JOINS:
      • Joins in Hive are trivial
      • Supports outer joins
      • Can join multiple tables
    Inner Join:
      - Rows are joined where the keys match
      - Rows that do not have matches are not included in the result
    Let's say we have 2 tables: posts and likes.
    SELECT * FROM posts LIMIT 10;
    User1  Funny Story       1343182026191
    User2  Cool Deal         1343182133839
    User4  Interesting Post  343182154633
    User5  Yet Another Blog  1343183939434
    SELECT * FROM likes LIMIT 10;
    User1  12  1343182026191
    User2  7   1343182139394
    User3  0   1343182154633
    User4  50  1343182147364
  • 10. JOINS (continued):
    We want to join these 2 data sets and produce a single table that contains user, post and count of likes.
    >CREATE TABLE posts_likes (user STRING, post STRING, likes_count INT);
    >INSERT OVERWRITE TABLE posts_likes
    SELECT p.user, p.post, l.count
    FROM posts p JOIN likes l ON (p.user = l.user);
    SELECT * FROM posts_likes LIMIT 10;
    User1  Funny Story       12
    User2  Cool Deal         7
    User4  Interesting Post  50
    The two tables are joined on the user column; 3 columns are selected and stored in the posts_likes table.
    Outer JOIN examples:
    SELECT p.*, l.* FROM posts p LEFT OUTER JOIN likes l ON (p.user = l.user) LIMIT 10;
    SELECT p.*, l.* FROM posts p RIGHT OUTER JOIN likes l ON (p.user = l.user) LIMIT 10;
    SELECT p.*, l.* FROM posts p FULL OUTER JOIN likes l ON (p.user = l.user) LIMIT 10;
  • 11. SUBQUERY:
    >SELECT col FROM (SELECT col1+col2 AS col FROM table1) table2;
    Older Hive releases support subqueries only in the FROM clause (Hive 0.13 and later also accept some subqueries in the WHERE clause). The subquery must be given a name, here table2. The columns in the subquery's select list are available in the outer query just like the columns of a table.
    VIEW:
    >CREATE VIEW Student_V AS SELECT * FROM Student WHERE Name IS NOT NULL AND Addr = 'PUNE';
    A VIEW is a sort of "virtual table" that is defined by a SELECT statement. Views can be used to present data to users in a way that differs from how it is actually stored on disk.
  • 12. IMPORTING DATA - INSERT:
    So far we have used the LOAD DATA operation to import data into a Hive table (or partition) by copying or moving files to the table's directory. You can also populate a table with data from another Hive table using an INSERT statement.
    Example of an INSERT statement:
    >INSERT OVERWRITE TABLE target SELECT col1, col2 FROM source;
    For partitioned tables, you can specify the partition to insert into by supplying a PARTITION clause:
    >INSERT OVERWRITE TABLE target PARTITION (dept='HR') SELECT col1, col2 FROM source;
  • 13. IMPORTING DATA - MULTITABLE INSERT:
    In HiveQL, you can turn the INSERT statement around and start with the FROM clause. This form also allows several INSERT clauses to share one source.
    Example:
    FROM records2
    INSERT OVERWRITE TABLE stations_by_year
      SELECT year, COUNT(DISTINCT station) GROUP BY year
    INSERT OVERWRITE TABLE records_by_year
      SELECT year, COUNT(1) GROUP BY year
    INSERT OVERWRITE TABLE good_records_by_year
      SELECT year, COUNT(1)
      WHERE temperature != 9999
        AND (quality = 0 OR quality = 1 OR quality = 4 OR quality = 5 OR quality = 9)
      GROUP BY year;
    There is a single source table (records2), but three tables to hold the results of three different queries over the source.
  • 14. BUCKETING:
    A mechanism to query and examine random samples of data.
    Breaks data into a set of buckets based on a hash function of a "bucket column".
    Makes it possible to execute queries on a random subset of the data.
    Hive doesn't automatically enforce bucketing; it must be switched on before inserting.
    >CREATE TABLE post_count (user STRING, count INT) CLUSTERED BY (user) INTO 5 BUCKETS;
    >SET hive.enforce.bucketing = true; -- make Hive enforce bucketing on insert
    >INSERT OVERWRITE TABLE post_count SELECT user, COUNT(post) FROM posts GROUP BY user;
    INDEXES:
    Indexes are used to speed up Hive queries.
    >CREATE INDEX date_index ON TABLE weather(date) AS 'COMPACT' WITH DEFERRED REBUILD;
  • 15. Ganesh L. Sanap
    INDEXES (continued):
    The index date_index is created on the date column. It can be rebuilt partition by partition, which saves rebuild time.
    >ALTER INDEX date_index ON weather PARTITION (month='01') REBUILD;
    >SHOW INDEX ON weather;
    >DROP INDEX date_index ON weather;
    NOTE: In Hive, ORDER BY produces one totally ordered result using a single reducer, while SORT BY sorts the rows within each reducer when multiple reducers are used.
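The closing note on ORDER BY and SORT BY, and the earlier requirement that Employee be created as a partitioned table, can be illustrated together. This is a minimal sketch rather than the author's own statements: the slides only mention the columns Id and department for Employee, so the Name column and the file format clause are assumptions made for illustration.

```sql
-- Hypothetical schema: only Id and department appear in the slides; Name is assumed.
-- The partition column is declared in PARTITIONED BY, not in the column list.
CREATE TABLE IF NOT EXISTS Employee (Id INT, Name STRING)
PARTITIONED BY (department STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- ORDER BY: all rows pass through a single reducer, so the output is globally sorted.
-- LIMIT keeps the query acceptable when Hive runs in strict mode.
SELECT Id, Name
FROM Employee
WHERE department = 'HR'
ORDER BY Id
LIMIT 100;

-- SORT BY: every reducer sorts only the rows it receives, so the output is
-- sorted per reducer but not necessarily sorted overall.
SELECT Id, Name
FROM Employee
WHERE department IN ('HR', 'Testing')
SORT BY Id;

-- DISTRIBUTE BY sends all rows of one department to the same reducer;
-- SORT BY then orders the rows inside each reducer.
SELECT Id, Name, department
FROM Employee
WHERE department IN ('HR', 'Testing')
DISTRIBUTE BY department
SORT BY department, Id;
```

ORDER BY is the simpler choice for small results; SORT BY, usually combined with DISTRIBUTE BY, scales better when a single reducer would become a bottleneck and a global order is not required.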