Pig

The document provides instructions on installing and running Pig on Hadoop to perform data analysis tasks like sorting, grouping, joining, projecting and filtering data. It includes commands to download and extract Pig, configure environment variables, load and analyze sample data files to demonstrate Pig Latin scripts for each task. The key tasks demonstrated are loading and transforming data with Pig, running Pig in local and MapReduce modes, and using diagnostic tools to understand data flow and transformations.

Uploaded by

Stunt Stunt

EXPERIMENT – 6

6) Install and run Pig, then write Pig Latin scripts to sort, group, join,
project, and filter your data.
PROCEDURE:
• Download and extract pig-0.13.0.
Command: wget https://archive.apache.org/dist/pig/pig-0.13.0/pig-0.13.0.tar.gz
Command: tar xvf pig-0.13.0.tar.gz
Command: sudo mv pig-0.13.0 /usr/lib/pig
• Set Path for pig
Command: sudo gedit $HOME/.bashrc
export PIG_HOME=/usr/lib/pig
export PATH=$PATH:$PIG_HOME/bin
export PIG_CLASSPATH=$HADOOP_COMMON_HOME/conf
• pig.properties file
In the conf folder of Pig, we have a file named pig.properties. In the
pig.properties file, you can set various parameters. To list the supported
properties, run the command below.
Command: pig -h properties
• Verifying the Installation
Verify the installation of Apache Pig by typing the version command. If the
installation is successful, you will get the version of Apache Pig.
Command: pig -version

Local mode
Command:
$ pig -x local
15/09/28 10:13:03 INFO pig.Main: Logging error messages to: /home/Hadoop/pig_1443415383991.log
2015-09-28 10:13:04,838 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
grunt>

MapReduce mode
Command:
$ pig -x mapreduce
15/09/28 10:28:46 INFO pig.Main: Logging error messages to: /home/Hadoop/pig_1443416326123.log
2015-09-28 10:28:46,427 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
grunt>

Grouping Of Data:
• Put the dataset into Hadoop
Command: hadoop fs -put pig/input/data.txt pig_data/

• Run the Pig GROUP script on Hadoop MapReduce

grunt> student_details = LOAD
'hdfs://localhost:8020/user/pcetcse/pig_data/student_details.txt'
USING PigStorage(',') AS (id:int, firstname:chararray,
lastname:chararray, age:int, phone:chararray, city:chararray);
grunt> group_data = GROUP student_details BY age;
grunt> Dump group_data;
Output:

Joining Of Data:
• Run the Pig JOIN script on Hadoop MapReduce
grunt> customers = LOAD
'hdfs://localhost:8020/user/pcetcse/pig_data/customers.txt'
USING PigStorage(',') AS (id:int, name:chararray, age:int,
address:chararray, salary:int);
grunt> orders = LOAD
'hdfs://localhost:8020/user/pcetcse/pig_data/orders.txt'
USING PigStorage(',') AS (oid:int, date:chararray,
customer_id:int, amount:int);
grunt> customer_orders = JOIN customers BY id, orders BY customer_id;
• Verification
Verify the relation customer_orders using the DUMP operator as
shown below.
grunt> Dump customer_orders;
• Output
You will get the following output, which shows the contents of the relation
named customer_orders.
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)
Sorting of Data:
• Run the Pig sort (ORDER BY) script on Hadoop MapReduce
Assume that we have a file named student_details.txt in the HDFS
directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai
And we have loaded this file into Pig with the relation name
student_details as shown below.
grunt> student_details = LOAD
'hdfs://localhost:8020/user/pcetcse/pig_data/student_details.txt'
USING PigStorage(',') AS (id:int,
firstname:chararray, lastname:chararray, age:int,
phone:chararray, city:chararray);
Let us now sort the relation in descending order based on the
age of the student and store the result in another relation named
order_by_data using the ORDER BY operator as shown below.
grunt> order_by_data = ORDER student_details BY age DESC;
• Verification
Verify the relation order_by_data using the DUMP operator as shown
below.
grunt> Dump order_by_data;
• Output
It will produce the following output, displaying the contents of the
relation order_by_data.
(8,Bharathi,Nambiayar,24,9848022333,Chennai)
(7,Komal,Nayak,24,9848022334,trivendram)
(6,Archana,Mishra,23,9848022335,Chennai)
(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)
(3,Rajesh,Khanna,22,9848022339,Delhi)
(2,siddarth,Battacharya,22,9848022338,Kolkata)
(4,Preethi,Agarwal,21,9848022330,Pune)
(1,Rajiv,Reddy,21,9848022337,Hyderabad)
Filtering of data:
• Run the Pig FILTER script on Hadoop MapReduce
Assume that we have a file named student_details.txt in the HDFS
directory /pig_data/ as shown below.
student_details.txt
001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai
And we have loaded this file into Pig with the relation name
student_details as shown below.
grunt> student_details = LOAD
'hdfs://localhost:8020/user/pcetcse/pig_data/student_details.txt'
USING PigStorage(',') AS (id:int,
firstname:chararray, lastname:chararray, age:int,
phone:chararray, city:chararray);
Let us now use the FILTER operator to get the details of the students
who belong to the city Chennai.
grunt> filter_data = FILTER student_details BY city == 'Chennai';
• Verification
Verify the relation filter_data using the DUMP operator as shown
below.
grunt> Dump filter_data;
• Output
It will produce the following output, displaying the contents of the
relation filter_data.
(6,Archana,Mishra,23,9848022335,Chennai)
(8,Bharathi,Nambiayar,24,9848022333,Chennai)

PIG Running Modes

We can manually override the default mode with the -x or -exectype option:
$ pig -x local
$ pig -x mapreduce
Building blocks
- Field – piece of data. Ex: "abc"
- Tuple – ordered set of fields, represented with "(" and ")"
Ex: (10.3, abc, 5)
- Bag – collection of tuples, represented with "{" and "}"
Ex: { (10.3, abc, 5), (def, 12, 13.5) }
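As a rough analogy in plain Python (not Pig itself), a field can be modelled as a scalar value, a tuple as a Python tuple, and a bag as a list of tuples. The variable names are illustrative only, and a Pig bag is unordered, so the list here is just a convenient stand-in.

```python
# Illustrative analogy only: Pig's data model mapped onto Python values.
field = "abc"                                  # a field: one piece of data
tuple_ = (10.3, "abc", 5)                      # a tuple: ordered set of fields
bag = [(10.3, "abc", 5), ("def", 12, 13.5)]    # a bag: collection of tuples

# Fields inside a tuple are addressed by position, like $0, $1, $2 in Pig.
first_field = tuple_[0]
print(first_field)   # 10.3
print(len(bag))      # 2
```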

Load Command

LOAD 'data' [USING function] [AS schema];

• data – name of the directory or file – Must be in single quotes
• USING – specifies the load function to use
– By default uses PigStorage, which parses each line into fields
using a delimiter.
– Default delimiter is tab ('\t')
• AS – assign a schema to incoming data
– Assigns names to fields
– Declares types to fields
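To make the roles of USING and AS concrete, here is a minimal Python sketch of what loading one line with a delimiter and a schema amounts to. The helper `parse_line` and its schema format are hypothetical and only model the behaviour described above; this is not how Pig implements PigStorage.

```python
# Hypothetical helper: models LOAD ... USING PigStorage(delim) AS (schema)
# for a single line of text. It is a sketch, not Pig's implementation.
def parse_line(line, schema, delimiter="\t"):
    """Split a line on the delimiter and cast each piece to its declared type."""
    parts = line.rstrip("\n").split(delimiter)
    record = []
    for i, (name, cast) in enumerate(schema):
        raw = parts[i] if i < len(parts) else ""
        if raw == "":
            record.append(None)      # missing or empty field modelled as null
            continue
        try:
            record.append(cast(raw))
        except ValueError:
            record.append(None)      # value that cannot be cast modelled as null
    return tuple(record)

# AS (id:int, firstname:chararray, age:int) modelled as (name, type) pairs.
schema = [("id", int), ("firstname", str), ("age", int)]
row = parse_line("001,Rajiv,21", schema, delimiter=",")
print(row)   # (1, 'Rajiv', 21)
```

The schema gives each position a name and a type, which is why later statements can refer to fields such as age by name instead of by position.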
LOADING DATA:
• Create file in local file system
[cloudera@localhost ~]$ cat > a.txt
25
35
45
55
65
24
12
19
27
35
24
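• Copy file from local file system to hdfs
[cloudera@localhost ~]$ hadoop fs -put a.txt
• Load the file into a relation named data (the examples that follow use this relation)
grunt> data = LOAD 'a.txt' AS (age:int);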
Pig Latin – Diagnostic Tools
• Display the structure of the Bag
grunt> DESCRIBE <bag_name>;
ex: DESCRIBE data;
• Display Execution Plan
– Produces Various reports
• Logical Plan
• MapReduce Plan
grunt> EXPLAIN <bag_name>;
ex: EXPLAIN data;
• Illustrate how Pig engine transforms the data
grunt> ILLUSTRATE <bag_name>;
ex: ILLUSTRATE data;
Filter data
grunt> filter1 = filter data by age > 30;
grunt> dump filter1;
(35)
(45)
(55)
(65)
(35)
grunt> filter2 = filter data by age < 20;
grunt> dump filter2;
(12)
(19)
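The same two filters can be sketched in plain Python to check the expected rows by hand. The list below copies the ages from a.txt in the LOADING DATA step; modelling each record as a one-field tuple is an assumption made for illustration.

```python
# Ages from a.txt, one single-field tuple per line (illustrative model of the relation).
data = [(25,), (35,), (45,), (55,), (65,), (24,), (12,), (19,), (27,), (35,), (24,)]

# FILTER keeps only the tuples for which the condition is true.
filter1 = [t for t in data if t[0] > 30]
filter2 = [t for t in data if t[0] < 20]

print(filter1)  # [(35,), (45,), (55,), (65,), (35,)]
print(filter2)  # [(12,), (19,)]
```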
Sort data
Sort by Ascending order
grunt> sort1 = order data by age ASC;
grunt> dump sort1;
(12)
(19)
(24)
(24)
(25)
(27)
(35)
(35)
(45)
(55)
(65)
Sort by Descending order
grunt> sort2 = order data by age DESC;
grunt> dump sort2;
(65)
(55)
(45)
(35)
(35)
(27)
(25)
(24)
(24)
(19)
(12)
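A plain-Python sketch of the two orderings, using the ages from a.txt, reproduces the sequences shown above. It only models the result of ORDER BY, not how Pig performs the sort across a cluster.

```python
# Ages from a.txt, one single-field tuple per line (illustrative model of the relation).
data = [(25,), (35,), (45,), (55,), (65,), (24,), (12,), (19,), (27,), (35,), (24,)]

sort1 = sorted(data, key=lambda t: t[0])                 # ORDER data BY age ASC
sort2 = sorted(data, key=lambda t: t[0], reverse=True)   # ORDER data BY age DESC

print([t[0] for t in sort1])  # [12, 19, 24, 24, 25, 27, 35, 35, 45, 55, 65]
print([t[0] for t in sort2])  # [65, 55, 45, 35, 35, 27, 25, 24, 24, 19, 12]
```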
Grouping data
grunt> group1 = group data by age;
grunt> describe group1;
group1: {group: int,data: {(age: int)}}
grunt> dump group1;
(12,{(12)})
(19,{(19)})
(24,{(24),(24)})
(25,{(25)})
(27,{(27)})
(35,{(35),(35)})
(45,{(45)})
(55,{(55)})
(65,{(65)})
The data bag is grouped by 'age'; therefore the group element contains unique
values.
To see how Pig transforms the data:
grunt> ILLUSTRATE group1;
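The shape of the grouped relation, a key plus a bag of the tuples that share it, can be sketched in plain Python as follows. This models only the result of GROUP on the a.txt ages; the sorted key order is chosen here to match the dump above.

```python
from collections import defaultdict

# Ages from a.txt, one single-field tuple per line (illustrative model of the relation).
data = [(25,), (35,), (45,), (55,), (65,), (24,), (12,), (19,), (27,), (35,), (24,)]

# GROUP data BY age: every distinct age becomes one key with a bag of matching tuples.
groups = defaultdict(list)
for t in data:
    groups[t[0]].append(t)

group1 = sorted(groups.items())
for key, bag in group1:
    print((key, bag))
```

Each key appears exactly once, and the bag keeps every original tuple, so ages 24 and 35 carry two tuples each.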
FOREACH
FOREACH <bag> GENERATE <data>;
Iterates over each element in the bag and produces a result.
grunt> records = LOAD 'std.txt' USING PigStorage(',') AS (roll:int,
name:chararray);
grunt> dump records;
(501,aaa)
(502,hhh)
(507,yyy)
(204,rrr)
(510,bbb)
grunt> stdname = foreach records generate name;
grunt> dump stdname;
(aaa)
(hhh)
(yyy)
(rrr)
(bbb)
grunt> stdroll = foreach records generate roll;
grunt> dump stdroll;
(501)
(502)
(507)
(204)
(510)
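Projection with FOREACH ... GENERATE can be sketched in Python as building a new tuple from selected fields of every record. The records below are copied from the dump of records above.

```python
# Records as dumped from std.txt: (roll, name).
records = [(501, "aaa"), (502, "hhh"), (507, "yyy"), (204, "rrr"), (510, "bbb")]

# FOREACH records GENERATE name;  keeps only the name field of each tuple.
stdname = [(name,) for (roll, name) in records]
# FOREACH records GENERATE roll;  keeps only the roll field of each tuple.
stdroll = [(roll,) for (roll, name) in records]

print(stdname)  # [('aaa',), ('hhh',), ('yyy',), ('rrr',), ('bbb',)]
print(stdroll)  # [(501,), (502,), (507,), (204,), (510,)]
```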

JOIN
The JOIN operator is used to combine records from two or more relations.
While performing a join operation, we declare one field (or a group of fields) from
each relation as keys. When these keys match, the two tuples are
matched; otherwise the records are dropped. Joins can be of the following types −
• Self-join
• Inner join
• Outer join − left join, right join, and full join
Self-join
Self-join is used to join a table with itself.
Inner Join
The default join is the inner join. Rows are joined where the keys match; rows
that do not have matches are not included in the result.
Outer Join
Records which do not join with the 'other' record set are still included in the
result.
• Left Outer – Records from the first data set are included whether they
have a match or not. Fields from the unmatched (second) bag are set to null.
• Right Outer – The opposite of Left Outer Join: records from the
second data set are included no matter what. Fields from the
unmatched (first) bag are set to null.
• Full Outer – Records from both sides are included. For
unmatched records the fields from the 'other' bag are set to null.
[cloudera@localhost ~]$ cat > a.txt
1,2,3
4,2,1
8,3,4
4,3,3
7,2,5
8,4,3
[cloudera@localhost ~]$ cat>b.txt
2,4
8,9
1,3
2,7
2,9
4,6
4,9
[cloudera@localhost ~]$ hadoop fs -put a.txt
[cloudera@localhost ~]$ hadoop fs -put b.txt
Self join
Self-join is used to join a table with itself as if the table were two relations,
temporarily renaming at least one relation,
i.e. we join one table to itself rather than joining two tables.
grunt> ONE = load 'a.txt' using PigStorage(',') as (a1:int,a2:int,a3:int);
grunt> TWO = load 'a.txt' using PigStorage(',') as (a1:int,a2:int,a3:int);
grunt> SELFJ = JOIN ONE by a1, TWO BY a1;
grunt> describe SELFJ;
SELFJ: {ONE::a1: int,ONE::a2: int,ONE::a3: int,TWO::a1: int,TWO::a2: int,TWO::a3: int}
Equi-join
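Inner join is used quite frequently; it is also referred to as equijoin.
An inner join returns rows when there is a match in both tables.
Note that b.txt has only two columns, so the third declared field b3 is null in
every row; this is why each output tuple below ends with an empty field.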
grunt> A = load 'a.txt' using PigStorage(',') as (a1:int,a2:int,a3:int);
grunt> B = load 'b.txt' using PigStorage(',') as (b1:int,b2:int,b3:int);
grunt> X = Join A by a1, B by b1;
grunt> Dump X;
(1,2,3,1,3,)
(4,2,1,4,6,)
(4,2,1,4,9,)
(4,3,3,4,6,)
(4,3,3,4,9,)
(8,3,4,8,9,)
(8,4,3,8,9,)
Left outer join
A = LOAD 'a.txt' using PigStorage(',') AS (a1:int,a2:int,a3:int);
B = LOAD 'b.txt' using PigStorage(',') AS (b1:int,b2:int);
LEFTJ = JOIN A by a1 LEFT OUTER, B BY b1;
DUMP LEFTJ;
(1,2,3,1,3)
(4,3,3,4,9)
(4,3,3,4,6)
(4,2,1,4,9)
(4,2,1,4,6)
(7,2,5,,)
(8,4,3,8,9)
(8,3,4,8,9)
Right outer join
A = LOAD 'a.txt' using PigStorage(',') AS (a1:int,a2:int,a3:int);
B = LOAD 'b.txt' using PigStorage(',') AS (b1:int,b2:int);
RIGHTJ = JOIN A by a1 RIGHT OUTER, B BY b1;
DUMP RIGHTJ;
(1,2,3,1,3)
(,,,2,4)
(,,,2,7)
(,,,2,9)
(4,2,1,4,6)
(4,2,1,4,9)
(4,3,3,4,6)
(4,3,3,4,9)
(8,3,4,8,9)
(8,4,3,8,9)
Full join
A = LOAD 'a.txt' using PigStorage(',') AS (a1:int,a2:int,a3:int);
B = LOAD 'b.txt' using PigStorage(',') AS (b1:int,b2:int);
FULLJ = JOIN A by a1 FULL, B BY b1;
DUMP FULLJ;
(1,2,3,1,3)
(,,,2,4)
(,,,2,7)
(,,,2,9)
(4,2,1,4,6)
(4,2,1,4,9)
(4,3,3,4,6)
(4,3,3,4,9)
(7,2,5,,)
(8,3,4,8,9)
(8,4,3,8,9)
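The four join types can be compared side by side with a small plain-Python sketch over the same a.txt and b.txt rows. The `join` helper is hypothetical and written only to model the results; it joins on the first field of each tuple and uses None where Pig would produce null.

```python
# Rows of a.txt and b.txt as entered for the join examples.
A = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3), (7, 2, 5), (8, 4, 3)]
B = [(2, 4), (8, 9), (1, 3), (2, 7), (2, 9), (4, 6), (4, 9)]

def join(left, right, how="inner"):
    """Join on the first field; how is 'inner', 'left', 'right' or 'full'."""
    null_left = (None,) * len(left[0])
    null_right = (None,) * len(right[0])
    out = []
    for lt in left:
        matches = [rt for rt in right if rt[0] == lt[0]]
        if matches:
            out.extend(lt + rt for rt in matches)
        elif how in ("left", "full"):
            out.append(lt + null_right)      # unmatched left row, right side null
    if how in ("right", "full"):
        left_keys = {lt[0] for lt in left}
        # unmatched right rows, left side null
        out.extend(null_left + rt for rt in right if rt[0] not in left_keys)
    return out

inner = join(A, B)
leftj = join(A, B, "left")
rightj = join(A, B, "right")
fullj = join(A, B, "full")
print(len(inner), len(leftj), len(rightj), len(fullj))  # 7 8 10 11
```

The row counts match the dumps above: the left outer join adds the unmatched row with key 7, the right outer join adds the three unmatched rows with key 2, and the full join adds both.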
UNION & SPLIT
UNION combines multiple relations together, whereas SPLIT partitions a
relation into multiple ones.
grunt> cat a.txt
1,2,3
4,2,1
8,3,4
grunt> cat b.txt
4,3,3
7,2,5
8,4,3
grunt> a = load 'a.txt' using PigStorage(',') as (a1:int, a2:int, a3:int);
grunt> b = load 'b.txt' using PigStorage(',') as (b1:int, b2:int, b3:int);
grunt> dump a;
(1,2,3)
(4,2,1)
(8,3,4)
grunt> dump b;
(4,3,3)
(7,2,5)
(8,4,3)
grunt> c = UNION a, b;
grunt> dump c;
(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
(7,2,5)
(8,4,3)
grunt> SPLIT c into sp1 if $0 == 4, sp2 if $0 == 8;
The SPLIT operation on 'c' sends a tuple to sp1 if its first field ($0) is 4, and to sp2 if
it is 8.
grunt> dump sp1;
(4,3,3)
(4,2,1)
grunt> dump sp2;
(8,4,3)
(8,3,4)
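A plain-Python sketch of UNION and SPLIT on the same rows shows which tuples land in each output. It also makes visible that tuples matching neither condition, here those whose first field is 1 or 7, end up in no output relation. The name `dropped` is introduced only for illustration.

```python
# Rows of a.txt and b.txt as shown in this section.
a = [(1, 2, 3), (4, 2, 1), (8, 3, 4)]
b = [(4, 3, 3), (7, 2, 5), (8, 4, 3)]

c = a + b                                    # UNION a, b

# SPLIT c INTO sp1 IF $0 == 4, sp2 IF $0 == 8;
# Each condition is evaluated independently for every tuple.
sp1 = [t for t in c if t[0] == 4]
sp2 = [t for t in c if t[0] == 8]
dropped = [t for t in c if t[0] not in (4, 8)]   # illustrative: matches no condition

print(sorted(sp1))      # [(4, 2, 1), (4, 3, 3)]
print(sorted(sp2))      # [(8, 3, 4), (8, 4, 3)]
print(sorted(dropped))  # [(1, 2, 3), (7, 2, 5)]
```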
grunt> chars = LOAD 'char.txt' AS (c:chararray);
grunt> chargrp = GROUP chars by c;
grunt> dump chargrp;
(a,{(a),(a),(a)})
(c,{(c),(c)})
(i,{(i),(i),(i)})
(k,{(k),(k),(k),(k)})
(l,{(l),(l)})
grunt> describe chargrp;
chargrp: {group: chararray,chars: {(c: chararray)}}
FOREACH with Functions
– Pig comes with many functions including COUNT, FLATTEN,
CONCAT, etc...
– Can implement a custom function
COUNT:

grunt> counts = FOREACH chargrp GENERATE group, COUNT(chars);

(a,3)
(c,2)
(i,3)
(k,4)
(l,2)
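Counting per group can be sketched in Python with a Counter. The contents of char.txt are not shown in this write-up, so the list below is an assumption reconstructed from the group sizes in the dump of chargrp; the original line order is unknown.

```python
from collections import Counter

# Assumed contents of char.txt, rebuilt from the group sizes shown above.
chars = ["a"] * 3 + ["c"] * 2 + ["i"] * 3 + ["k"] * 4 + ["l"] * 2

# GROUP chars BY c, then FOREACH ... GENERATE group, COUNT(chars):
# one (key, number of tuples in the bag) pair per distinct character.
counts = sorted(Counter(chars).items())
print(counts)  # [('a', 3), ('c', 2), ('i', 3), ('k', 4), ('l', 2)]
```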
