ABP W9-W10 Big Data Analytics Lab-PIG

The document outlines the implementation of Pig Latin commands for big data analytics using Cloudera, focusing on relational and diagnostic operations on student and employee datasets. It details steps for loading, storing, and manipulating data in the Hadoop Pig framework, including operations like filtering, grouping, joining, and splitting. Additionally, it provides specific commands and examples for executing these operations in the Grunt shell.

Uploaded by

srikeshshekapuram0711

BIG DATA ANALYTICS LAB

(A7902) (VCE-R21)

Week-9 Pig Latin commands

a) Implement Relational operators (Loading and Storing) and Diagnostic operators (Dump, Describe, Illustrate and Explain) on the given database in the Hadoop Pig framework using Cloudera.
b) Develop a Pig Latin program to implement Filtering and Sorting operations on the given database.

For the given Student dataset and Employee dataset, perform Relational operations like
Loading, Storing, Diagnostic Operations (Dump, Describe, Illustrate & Explain) in the Hadoop
Pig framework using Cloudera.

Student dataset:
Student ID   First Name   Age   City        CGPA
001          Jagruthi     21    Hyderabad   9.1
002          Praneeth     22    Chennai     8.6
003          Sujith       22    Mumbai      7.8
004          Sreeja       21    Bengaluru   9.2
005          Mahesh       24    Hyderabad   8.8
006          Rohit        22    Chennai     7.8
007          Sindhu       23    Mumbai      8.3

Employee dataset:
Employee ID   Name       Age   City
001           Angelina   22    LosAngeles
002           Jackie     23    Beijing
003           Deepika    22    Mumbai
004           Pawan      24    Hyderabad
005           Rajani     21    Chennai
006           Amitabh    22    Mumbai

Step-1: Create a directory named pigdir in HDFS at the required path using mkdir (the -p
option also creates the parent directory /bdalab if it does not exist):
$ hdfs dfs -mkdir -p /bdalab/pigdir
Step-2: The input file of Pig contains one tuple/record per line, with the fields
separated by a delimiter (",").

In the local file system, create the input files student_data.txt and employee_data.txt
containing the data shown below.

A. Bhanu Prasad, Associate Professor of CSE, VCE

student_data.txt:
001,Jagruthi,21,Hyderabad,9.1
002,Praneeth,22,Chennai,8.6
003,Sujith,22,Mumbai,7.8
004,Sreeja,21,Bengaluru,9.2
005,Mahesh,24,Hyderabad,8.8
006,Rohit,22,Chennai,7.8
007,Sindhu,23,Mumbai,8.3

employee_data.txt:
001,Angelina,22,LosAngeles
002,Jackie,23,Beijing
003,Deepika,22,Mumbai
004,Pawan,24,Hyderabad
005,Rajani,21,Chennai
006,Amitabh,22,Mumbai
Step-3: Copy the files from the local file system to HDFS using the put (or copyFromLocal)
command and verify using the -cat command.
$ hdfs dfs -put /home/cloudera/pigdir/student_data.txt /bdalab/pigdir/
$ hdfs dfs -cat /bdalab/pigdir/student_data.txt
$ hdfs dfs -put /home/cloudera/pigdir/employee_data.txt /bdalab/pigdir/
$ hdfs dfs -cat /bdalab/pigdir/employee_data.txt
Step-4: Apply the Relational Operator LOAD to load the data from the files
student_data.txt and employee_data.txt into Pig by executing the following Pig Latin
statements in the Grunt shell. Pig Latin keywords such as LOAD, USING and AS are NOT
case sensitive, but relation names, field names and function names (for example
PigStorage) are.
$ pig => will direct to grunt> shell
grunt> student = LOAD '/bdalab/pigdir/student_data.txt' USING PigStorage(',')
as (id:int, name:chararray, age:int, city:chararray, cgpa:double);
grunt> employee = LOAD '/bdalab/pigdir/employee_data.txt' USING PigStorage(',')
as (id:int, name:chararray, age:int, city:chararray);
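To make the effect of the delimiter and the AS schema concrete, the following is a minimal Python sketch (not Pig code) of how one comma-separated line becomes a typed tuple. The helper name parse_line and the SCHEMA list are hypothetical and exist only for this illustration; the sketch assumes that an empty field or a failed cast yields null (None here).

```python
# Hypothetical illustration only: mimics PigStorage(',') plus an AS schema.
SCHEMA = [("id", int), ("name", str), ("age", int), ("city", str), ("cgpa", float)]

def parse_line(line, schema=SCHEMA, delimiter=","):
    """Split one text line and cast each field to the declared type."""
    fields = line.rstrip("\n").split(delimiter)
    record = []
    for i, (_, cast) in enumerate(schema):
        raw = fields[i] if i < len(fields) else None
        if raw is None or raw == "":
            record.append(None)          # missing or empty field -> null
            continue
        try:
            record.append(cast(raw))
        except ValueError:
            record.append(None)          # value does not fit the type -> null
    return tuple(record)

row = parse_line("001,Jagruthi,21,Hyderabad,9.1")
print(row)
```

Because id is declared as int, the text 001 is read as the number 1, which is why the DUMP outputs later in this lab show ids without leading zeros.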
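Step-5: Apply the Relational Operator STORE to store each relation in its own HDFS
directory under /bdalab/pigdir/pig_output/ as shown below. STORE fails if the
output directory already exists, so the two relations cannot share one directory.
grunt> STORE student INTO '/bdalab/pigdir/pig_output/student' USING PigStorage(',');
grunt> STORE employee INTO '/bdalab/pigdir/pig_output/employee' USING PigStorage(',');

Step-6: Verify the stored data as shown below.

$ hdfs dfs -ls /bdalab/pigdir/pig_output/student
$ hdfs dfs -cat /bdalab/pigdir/pig_output/student/part-m-00000
$ hdfs dfs -cat /bdalab/pigdir/pig_output/employee/part-m-00000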


Step-7: Apply the Diagnostic Operator DUMP to print the contents of a relation.
grunt> DUMP student;
grunt> DUMP employee;
Step-8: Apply the Diagnostic Operator DESCRIBE to view the schema of a relation.
grunt> DESCRIBE student;
grunt> DESCRIBE employee;
Step-9: Apply the Diagnostic Operator ILLUSTRATE to show the step-by-step execution
of a sequence of statements on a small sample of the data.
grunt> ILLUSTRATE student;
grunt> ILLUSTRATE employee;
Step-10: Apply the Diagnostic Operator EXPLAIN to display the logical, physical, and
MapReduce execution plans of a relation.
grunt> EXPLAIN student;
grunt> EXPLAIN employee;


Week-10 Pig Latin commands

a) Implement Grouping, Joining, Combining and Splitting
operations on the given database using Pig Latin statements.
b) Perform Eval Functions on the given dataset.
c) Develop a WordCount program using Pig Latin statements.
10.A) Implement Grouping, Joining, Combining and Splitting operations
on the given database using Pig Latin statements
The GROUP operator is used to group the data in one or more relations. It collects the data
having the same key.
grunt> Group_data = GROUP Relation_name BY Key;
Step-1: Group the records/tuples in the relation by age using GROUP command and
verify.
grunt> group_std = GROUP student BY age;
grunt> Dump group_std;
grunt> group_emp = GROUP employee BY city;
grunt> Dump group_emp;
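As a rough illustration of what GROUP produces, the Python sketch below (not Pig code) collects the student tuples into one bag per age. The function name group_by is hypothetical; the data is the student dataset from Week-9.

```python
from collections import defaultdict

# (id, name, age, city, cgpa) tuples from the lab's student dataset.
students = [
    (1, "Jagruthi", 21, "Hyderabad", 9.1),
    (2, "Praneeth", 22, "Chennai", 8.6),
    (3, "Sujith", 22, "Mumbai", 7.8),
    (4, "Sreeja", 21, "Bengaluru", 9.2),
    (5, "Mahesh", 24, "Hyderabad", 8.8),
    (6, "Rohit", 22, "Chennai", 7.8),
    (7, "Sindhu", 23, "Mumbai", 8.3),
]

def group_by(relation, key_index):
    """Return (key, bag) pairs, one per distinct key, like GROUP ... BY."""
    groups = defaultdict(list)
    for record in relation:
        groups[record[key_index]].append(record)
    return sorted(groups.items())

group_std = group_by(students, key_index=2)   # index 2 is the age field
for age, bag in group_std:
    print(age, [record[1] for record in bag])
```

Each result pairs the key (called group in Pig) with a bag holding the complete original tuples, which is the shape that DESCRIBE reports in the next step.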
Step-2: View the schema of each relation after grouping, using the DESCRIBE command
as shown below. The type of the group field is the type of the grouping key: int for
age, chararray for city.
grunt> Describe group_std;
group_std: {group: int,student: {(id: int,name: chararray,age: int,city: chararray,cgpa: double)}}
grunt> Describe group_emp;
group_emp: {group: chararray,employee: {(id: int,name: chararray,age: int,city: chararray)}}
Step-3: Group the relation by multiple columns (age and city) and verify the
content.
grunt> groupmultiple_std = GROUP student BY (age, city);
grunt> Dump groupmultiple_std;
grunt> groupmultiple_emp = GROUP employee BY (age, city);
grunt> Dump groupmultiple_emp;
Step-4: Group with ALL, which places every tuple of the relation into a single group,
and verify the content.
grunt> groupall_std = GROUP student ALL;
grunt> Dump groupall_std;


grunt> groupall_emp = GROUP employee ALL;
grunt> Dump groupall_emp;
Step-5: Group the records/tuples of the relations student and employee together
(COGROUP) with the key age and then verify the result.
grunt> cogroup_stdemp = COGROUP student BY age, employee BY age;
grunt> Dump cogroup_stdemp;
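The difference between GROUP and COGROUP can be sketched in plain Python (not Pig code): for every age that occurs in either relation, the result holds one bag of students and one bag of employees. The function name cogroup is hypothetical; the data comes from the two Week-9 datasets.

```python
# (id, name, age, city, cgpa) and (id, name, age, city) tuples from the lab data.
students = [
    (1, "Jagruthi", 21, "Hyderabad", 9.1), (2, "Praneeth", 22, "Chennai", 8.6),
    (3, "Sujith", 22, "Mumbai", 7.8), (4, "Sreeja", 21, "Bengaluru", 9.2),
    (5, "Mahesh", 24, "Hyderabad", 8.8), (6, "Rohit", 22, "Chennai", 7.8),
    (7, "Sindhu", 23, "Mumbai", 8.3),
]
employees = [
    (1, "Angelina", 22, "LosAngeles"), (2, "Jackie", 23, "Beijing"),
    (3, "Deepika", 22, "Mumbai"), (4, "Pawan", 24, "Hyderabad"),
    (5, "Rajani", 21, "Chennai"), (6, "Amitabh", 22, "Mumbai"),
]

def cogroup(left, right, key_index=2):
    """Return (key, left_bag, right_bag) for every key seen in either input."""
    keys = sorted({r[key_index] for r in left} | {r[key_index] for r in right})
    return [
        (key,
         [r for r in left if r[key_index] == key],
         [r for r in right if r[key_index] == key])
        for key in keys
    ]

cogroup_stdemp = cogroup(students, employees)
for age, std_bag, emp_bag in cogroup_stdemp:
    print(age, [r[1] for r in std_bag], [r[1] for r in emp_bag])
```

A key that appears in only one relation would still be listed, with an empty bag for the other relation.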
The JOIN operator is used to combine records from two or more relations. While
performing a join operation, we declare one (or a group of) tuple(s) from each relation, as
keys. When these keys match, the two tuples are matched, else the records are dropped.
Joins can be of the following types −

• SELF-Join

• INNER-Join

• OUTER-Join − LEFT Join, RIGHT Join, and FULL Join

Step-6: SELF-JOIN: to join a relation with itself, load the same data twice under
different aliases (names).
grunt> std1 = LOAD '/bdalab/pigdir/student_data.txt' USING PigStorage(',') as
(id:int, name:chararray, age:int, city:chararray, cgpa:float);
grunt> std2 = LOAD '/bdalab/pigdir/student_data.txt' USING PigStorage(',') as
(id:int, name:chararray, age:int, city:chararray, cgpa:float);
grunt> selfjoin_std_data = JOIN std1 BY id, std2 BY id;
grunt> dump selfjoin_std_data;
(1,Jagruthi,21,Hyderabad,9.1,1,Jagruthi,21,Hyderabad,9.1)
(2,Praneeth,22,Chennai,8.6,2,Praneeth,22,Chennai,8.6)
(3,Sujith,22,Mumbai,7.8,3,Sujith,22,Mumbai,7.8)
(4,Sreeja,21,Bengaluru,9.2,4,Sreeja,21,Bengaluru,9.2)
(5,Mahesh,24,Hyderabad,8.8,5,Mahesh,24,Hyderabad,8.8)
(6,Rohit,22,Chennai,7.8,6,Rohit,22,Chennai,7.8)
(7,Sindhu,23,Mumbai,8.3,7,Sindhu,23,Mumbai,8.3)
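Step-7: INNER JOIN (EQUI JOIN) creates a new relation by combining the column values
of two relations based on the join predicate. It returns a row only when there is a
match in both relations. In the following steps, std_data is the student relation
and std_att is a second relation with the fields (id, name, status, time), as the
output below shows; both must be loaded before the join.
grunt> innerjoin_data_att = JOIN std_data BY id, std_att BY id;
grunt> dump innerjoin_data_att;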
(1,Jagruthi,21,Hyderabad,9.1,1,Jagruthi,joined,9:10:10)
(4,Sreeja,21,Bengaluru,9.2,4,Sreeja,joined,9:10:24)
(6,Rohit,22,Chennai,7.8,6,Rohit,joined,9:11:15)
(7,Sindhu,23,Mumbai,8.3,7,Sindhu,joined,9:12:25)


OUTER JOIN returns all the rows from at least one of the relations. An outer join operation
is carried out in three ways −
• LEFT OUTER JOIN
• RIGHT OUTER JOIN
• FULL OUTER JOIN
Step-8: LEFT OUTER JOIN returns all rows from the left relation, even if there are
no matches in the right relation.
Note: std_data is the left relation.
grunt> outerleft_data_att = JOIN std_data BY id LEFT, std_att BY id;
grunt> DUMP outerleft_data_att;
(1,Jagruthi,21,Hyderabad,9.1,1,Jagruthi,joined,9:10:10)
(2,Praneeth,22,Chennai,8.6,,,,)
(3,Sujith,22,Mumbai,7.8,,,,)
(4,Sreeja,21,Bengaluru,9.2,4,Sreeja,joined,9:10:24)
(5,Mahesh,24,Hyderabad,8.8,,,,)
(6,Rohit,22,Chennai,7.8,6,Rohit,joined,9:11:15)
(7,Sindhu,23,Mumbai,8.3,7,Sindhu,joined,9:12:25)
Note: std_att is the left relation.
grunt> outerleft_att_data = JOIN std_att BY id LEFT, std_data BY id;
grunt> DUMP outerleft_att_data;
(1,Jagruthi,joined,9:10:10,1,Jagruthi,21,Hyderabad,9.1)
(4,Sreeja,joined,9:10:24,4,Sreeja,21,Bengaluru,9.2)
(6,Rohit,joined,9:11:15,6,Rohit,22,Chennai,7.8)
(7,Sindhu,joined,9:12:25,7,Sindhu,23,Mumbai,8.3)
(8,Sai,joined,9:14:18,,,,,)
(9,Meghana,joined,9:15:25,,,,,)

Step-9: RIGHT OUTER JOIN operation returns all rows from the right table, even if there
are no matches in the left table.
grunt> outerright_data_att = JOIN std_data BY id RIGHT, std_att BY id;
grunt> DUMP outerright_data_att;
(1,Jagruthi,21,Hyderabad,9.1,1,Jagruthi,joined,9:10:10)
(4,Sreeja,21,Bengaluru,9.2,4,Sreeja,joined,9:10:24)
(6,Rohit,22,Chennai,7.8,6,Rohit,joined,9:11:15)
(7,Sindhu,23,Mumbai,8.3,7,Sindhu,joined,9:12:25)
(,,,,,8,Sai,joined,9:14:18)
(,,,,,9,Meghana,joined,9:15:25)


Step-10: FULL OUTER JOIN returns all rows from both relations; where a row has no
match in the other relation, the missing fields are null.
grunt> outerfull_data_att = JOIN std_data BY id FULL, std_att BY id;
grunt> DUMP outerfull_data_att;
(1,Jagruthi,21,Hyderabad,9.1,1,Jagruthi,joined,9:10:10)
(2,Praneeth,22,Chennai,8.6,,,,)
(3,Sujith,22,Mumbai,7.8,,,,)
(4,Sreeja,21,Bengaluru,9.2,4,Sreeja,joined,9:10:24)
(5,Mahesh,24,Hyderabad,8.8,,,,)
(6,Rohit,22,Chennai,7.8,6,Rohit,joined,9:11:15)
(7,Sindhu,23,Mumbai,8.3,7,Sindhu,joined,9:12:25)
(,,,,,8,Sai,joined,9:14:18)
(,,,,,9,Meghana,joined,9:15:25)
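The four join types from Step-7 to Step-10 can be compared side by side with a small Python sketch (not Pig code). The function name join and its how parameter are hypothetical. The attendance tuples are reconstructed from the outputs above, with the time column left out for brevity, and None stands for a null field.

```python
students = [
    (1, "Jagruthi", 21, "Hyderabad", 9.1), (2, "Praneeth", 22, "Chennai", 8.6),
    (3, "Sujith", 22, "Mumbai", 7.8), (4, "Sreeja", 21, "Bengaluru", 9.2),
    (5, "Mahesh", 24, "Hyderabad", 8.8), (6, "Rohit", 22, "Chennai", 7.8),
    (7, "Sindhu", 23, "Mumbai", 8.3),
]
# (id, name, status): assumed shape of std_att, without the time column.
attendance = [
    (1, "Jagruthi", "joined"), (4, "Sreeja", "joined"), (6, "Rohit", "joined"),
    (7, "Sindhu", "joined"), (8, "Sai", "joined"), (9, "Meghana", "joined"),
]

def join(left, right, how="inner"):
    """Join two lists of tuples on their first field (the id)."""
    left_nulls = (None,) * len(left[0])
    right_nulls = (None,) * len(right[0])
    index = {}
    for r in right:
        index.setdefault(r[0], []).append(r)
    out, matched = [], set()
    for l in left:
        matches = index.get(l[0], [])
        if matches:
            matched.add(l[0])
            out.extend(l + r for r in matches)
        elif how in ("left", "full"):
            out.append(l + right_nulls)      # keep unmatched left row
    if how in ("right", "full"):
        # keep unmatched right rows, padded with nulls on the left side
        out.extend(left_nulls + r for r in right if r[0] not in matched)
    return out

for how in ("inner", "left", "right", "full"):
    print(how, len(join(students, attendance, how)))
```

The row counts match the outputs above: 4 rows for the inner join, 7 for the left outer join, 6 for the right outer join and 9 for the full outer join.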
Step-11: FILTER operator is used to select the required tuples from a relation based
on a condition
grunt> filter_std = FILTER std_data BY city == 'Hyderabad';
grunt> DUMP filter_std;
(1,Jagruthi,21,Hyderabad,9.1)
(5,Mahesh,24,Hyderabad,8.8)

Step-12: The SPLIT operator is used to split a relation into two or more relations.
grunt> SPLIT std_data INTO split_std1 IF age<23, split_std2 IF (age>22 AND age<25);
grunt> DUMP split_std1;
(1,Jagruthi,21,Hyderabad,9.1)
(2,Praneeth,22,Chennai,8.6)
(3,Sujith,22,Mumbai,7.8)
(4,Sreeja,21,Bengaluru,9.2)
(6,Rohit,22,Chennai,7.8)
grunt> DUMP split_std2;
(5,Mahesh,24,Hyderabad,8.8)
(7,Sindhu,23,Mumbai,8.3)
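The behaviour of SPLIT can be sketched in Python (not Pig code): every condition is tested against every tuple independently, so a tuple may land in one output, in several, or in none. The function name split is hypothetical; the conditions are the ones from Step-12.

```python
students = [
    (1, "Jagruthi", 21, "Hyderabad", 9.1), (2, "Praneeth", 22, "Chennai", 8.6),
    (3, "Sujith", 22, "Mumbai", 7.8), (4, "Sreeja", 21, "Bengaluru", 9.2),
    (5, "Mahesh", 24, "Hyderabad", 8.8), (6, "Rohit", 22, "Chennai", 7.8),
    (7, "Sindhu", 23, "Mumbai", 8.3),
]

def split(relation, **conditions):
    """Return one list per named condition; conditions are checked independently."""
    return {
        name: [record for record in relation if test(record)]
        for name, test in conditions.items()
    }

parts = split(
    students,
    split_std1=lambda r: r[2] < 23,          # age < 23
    split_std2=lambda r: 22 < r[2] < 25,     # age > 22 AND age < 25
)
print([r[0] for r in parts["split_std1"]])
print([r[0] for r in parts["split_std2"]])
```

With these two conditions every student falls into exactly one output, but that is a property of the chosen conditions rather than of the operator.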


10.B) Perform Eval Functions on the given dataset.

• Apache Pig provides a large set of built-in functions: eval, load/store, math,
string, date and time, bag and tuple functions.
• Eval functions are the first type of Pig built-in functions. The Pig eval functions
used in this lab are listed below.

1) AVG(expression):
Computes the average of the numerical values within a bag. It requires a preceding GROUP ALL
statement for global averages and a GROUP BY statement for group averages. It ignores
NULL values.
Ex: The average GPA for each name is computed.
grunt> A = LOAD 'Employee.txt' AS (name:chararray, term:chararray, gpa:float);
grunt> B = GROUP A BY name;
grunt> C = FOREACH B GENERATE A.name, AVG(A.gpa);
grunt> DUMP C;

2) CONCAT (expression, expression):

Concatenates two or more expressions. The expressions must all have the same type, and
if any of them is null, the result is also null.
Ex: The fields f1, an underscore string literal, f2 and f3 are concatenated.
grunt> X = LOAD 'data' as (f1:chararray, f2:chararray, f3:chararray);
grunt> DUMP X;
(apache,open,source)
(hadoop,map,reduce)
(pig,pig,latin)
grunt> Y = FOREACH X GENERATE CONCAT(f1, '_', f2, f3);
grunt> DUMP Y;
(apache_opensource)
(hadoop_mapreduce)
(pig_piglatin)

3) COUNT(expression):
Counts the number of elements in a bag. It requires a preceding GROUP ALL statement for global
counts and a GROUP BY statement for group counts. It ignores a tuple whose first field is null.
Ex: grunt> X = LOAD 'data' AS (f1:int,f2:int,f3:int);
grunt> DUMP X;

(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
(7,2,5)
(8,4,3)
grunt> Y = GROUP X BY f1;
grunt> DUMP Y;
(1,{(1,2,3)})
(4,{(4,2,1),(4,3,3)})
(7,{(7,2,5)})
(8,{(8,3,4),(8,4,3)})
grunt> A = FOREACH Y GENERATE COUNT(X);
grunt> DUMP A;
(1)
(2)
(1)
(2)
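The GROUP followed by COUNT pattern can be sketched in Python (not Pig code) on the same six tuples. The function name count is hypothetical; it skips tuples whose first field is null (None), which is how COUNT treats nulls. Unlike the Pig statement above, the sketch also keeps the group key next to each count.

```python
X = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3), (7, 2, 5), (8, 4, 3)]

# GROUP X BY f1: one bag of tuples per distinct value of the first field.
groups = {}
for row in X:
    groups.setdefault(row[0], []).append(row)

def count(bag):
    """Count the tuples in a bag, skipping those whose first field is null."""
    return sum(1 for t in bag if t[0] is not None)

counts = [(key, count(bag)) for key, bag in sorted(groups.items())]
print(counts)
```

The second value of each pair corresponds to one line of the DUMP A output above.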

4) IsEmpty(expression):
To check if a bag or map is empty.
Ex: grunt> Y = filter X by IsEmpty(SSN_NAME);

5) MAX(expression):
To find out the maximum of the numeric values or chararrays in a single-column bag. It requires a
preceding GROUP ALL statement for global maximums and a GROUP BY statement for group
maximums. However, it ignores the NULL values.
Ex: grunt> X = FOREACH B GENERATE group, MAX(A.gpa);

6) MIN(expression):
To get the minimum (lowest) value (numeric or chararray) for a certain column in a single-column
bag.
Ex: grunt> X = FOREACH B GENERATE group, MIN(A.gpa);


10.C) Develop a WordCount program using Pig Latin statements.

The input file of Pig contains each tuple/record in individual lines with the entities
separated by a delimiter ( “,”).
Step-1: Create a Directory in HDFS with the name pigdir in the required path using
mkdir:
$ hdfs dfs -mkdir /bdalab/pigdir
Step-2: In the local file system, create an input file wordcount_data containing the
data shown below.
Deer,Bear,River
Car,Car,River
River,Car,River
Deer,River,Bear
Step-3: Copy the file from the local file system to HDFS using the put (or copyFromLocal)
command and verify using the -cat command.
$ hdfs dfs -put /home/cloudera/pigdir/wordcount_data /bdalab/pigdir/
$ hdfs dfs -cat /bdalab/pigdir/wordcount_data
Step-4: Open Pig in Grunt shell and execute the following Pig Latin statement.
$ pig => will direct to grunt>
Convert Each line to each tuple.
Apply Relational Operator – LOAD to load the data into Relation lines from the file
wordcount_data.
grunt> lines = LOAD '/bdalab/pigdir/wordcount_data' AS (line:chararray);
grunt> DUMP lines;
(Deer,Bear,River)
(Car,Car,River)
(River,Car,River)
(Deer,River,Bear)
Step-5: Convert each line tuple into one tuple per word.
TOKENIZE splits the line into a bag of words, using the comma as the delimiter here.
FLATTEN takes the bag returned by TOKENIZE and produces a separate record for each
word, naming the single field in the record word.
The FOREACH operator is used to generate the specified data transformations based on
the column data.
grunt> words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line, ',')) AS word;
grunt> dump words;
(Deer)

(Bear)
(River)
(Car)
(Car)
(River)
(River)
(Car)
(River)
(Deer)
(River)
(Bear)
Step-6: Group all similar words into each tuple
grunt> groupword = GROUP words by word;
grunt> dump groupword;
(Car,{(Car),(Car),(Car)})
(Bear,{(Bear),(Bear)})
(Deer,{(Deer),(Deer)})
(River,{(River),(River),(River),(River),(River)})

Step-7: Count each grouped word and display

grunt> wordcount = FOREACH groupword GENERATE group, COUNT(words);
grunt> dump wordcount;
(Car,3)
(Bear,2)
(Deer,2)
(River,5)
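For comparison, the same WordCount pipeline can be written as a short Python sketch (not Pig code). Each statement mirrors one Pig step: the list of lines plays the role of LOAD, the comprehension plays the role of FLATTEN(TOKENIZE(...)), and Counter plays the role of GROUP followed by COUNT. The variable names are chosen to match the relations above.

```python
from collections import Counter

# LOAD: one string per input line of wordcount_data.
lines = ["Deer,Bear,River", "Car,Car,River", "River,Car,River", "Deer,River,Bear"]

# FLATTEN(TOKENIZE(line, ',')): one word per record.
words = [word for line in lines for word in line.split(",") if word]

# GROUP words BY word, then COUNT(words) per group.
wordcount = Counter(words)

for word, n in sorted(wordcount.items()):
    print(word, n)
```

The counts agree with the Pig output above: Car 3, Bear 2, Deer 2 and River 5.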
