Script Python

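from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("hdfs_test")
sc = SparkContext(conf=conf)
# Read the input file from HDFS as an RDD of lines.
text_file = sc.textFile("hdfs://hanameservice/testzep/prueba2.txt")
# Word count: split each line on spaces, pair each word with 1, sum per word.
counts = (
    text_file.flatMap(lambda line: line.split(" "))
    .map(lambda word: (word, 1))
    .reduceByKey(lambda a, b: a + b)
)
# saveAsTextFile writes a directory of part files at this path.
counts.saveAsTextFile("hdfs://hanameservice/testzep/count.txt")
sc.stop()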
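To see what the flatMap / map / reduceByKey chain in the script computes, here is a minimal local sketch in plain Python, with no Spark or HDFS involved. The function name `word_count` and the sample lines are hypothetical and not part of the original script. The sketch only mimics the three steps on an in-memory list; it does not reflect how Spark distributes the work across partitions.

```python
from collections import defaultdict


def word_count(lines):
    """Local, single-process analogue of flatMap -> map -> reduceByKey."""
    # flatMap: split every line on single spaces and flatten into one list.
    words = [word for line in lines for word in line.split(" ")]
    # map: pair each word with an initial count of 1.
    pairs = [(word, 1) for word in words]
    # reduceByKey: merge the values of equal keys with a + b.
    totals = defaultdict(int)
    for word, n in pairs:
        totals[word] = totals[word] + n
    return dict(totals)


if __name__ == "__main__":
    sample = ["hola mundo", "hola spark"]  # hypothetical input lines
    print(word_count(sample))
```

Note that `line.split(" ")` treats each single space as a separator, so a line with two consecutive spaces produces an empty string that is counted as a word. Calling `split()` with no argument would split on runs of whitespace instead, if that behavior is preferred.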