SlideShare a Scribd company logo
Big Data Processing With 
Scala and Spark 
Slide 1 www.edureka.Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions co/apache-spark-scala-training
Objectives of this Session 
What is Big Data? 
What is Spark? 
Why Spark? 
Spark Ecosystem 
A note about Scala 
Why Scala? 
Hello Spark! 
For Queries during the session and class recording: 
Post on Twitter @edurekaIN: #askEdureka 
Post on Facebook /edurekaIN 
Slide 2 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Big Data 
 Lots of Data (Terabytes or Petabytes) 
 Big data is the term for a collection of data sets 
so large and complex that it becomes difficult to 
process using on-hand database management 
tools or traditional data processing applications 
 The challenges include capture, curation, 
storage, search, sharing, transfer, analysis, and 
visualization 
cloud 
tools 
statistics 
No SQL 
Big Data 
compression 
support 
database 
storage 
analyze 
information 
mobile 
processing 
terabytes 
Slide 3 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
What is Spark? 
 Apache Spark is a general-purpose cluster in-memory computing system 
 Provides high-level APIs in Java, Scala and Python, and an optimized engine that supports general execution graphs 
 Provides various high level tools like Spark SQL for structured data processing, Mlib for Machine Learning and more.. 
High Level 
APIs 
High Level 
Tools 
More… 
Slide 4 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Why Spark? 
via YARN 
Cluster Manager 
 The Spark framework can be deployed through 
Apache Mesos, Apache Hadoop via Yarn, or 
Spark’s own cluster manager. 
Deployment 
Slide 5 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Why Spark? 
 Spark framework is polyglot – Can be programmed 
in several programming languages (Currently 
Scala, Java and Python supported). 
Polyglot Scala 
Slide 6 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Why Spark? 
A fully Apache Hive compatible data 
warehousing system that can run 100x 
faster than Hive. 
100x faster than for certain applications. 
Slide 7 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Why Spark? 
 Provides powerful caching and disk persistence capabilities 
 Interactive Data Analysis 
 Faster Batch 
 Iterative Algorithms 
 Real-Time Stream Processing 
 Faster Decision-Making 
Slide 8 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Spark Community is Super Active! 
Slide 9 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Spark Ecosystem 
MLLib 
(Machine 
learning) 
Spark Core Engine 
Aplha/Pre-alpha 
BlindDB 
(Approximate 
SQL) 
Shark 
(SQL) 
Spark 
Streaming 
(Streaming) 
GraphX 
(Graph 
Computation) 
SparkR 
(R on Spark) 
Slide 10 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Spark Ecosystem (Contd.) 
An approximate 
query engine. To 
run over Core 
Spark Engine. 
Used for structured 
data. Can run 
unmodified hive 
queries on existing 
Hadoop 
deployment. 
MLLib 
(Machine 
learning) 
Enables analytical 
and interactive 
apps for live 
streaming data. 
Spark Core Engine 
Aplha/Pre-alpha 
BlindDB 
(Approximate 
SQL) 
Shark 
(SQL) 
Spark 
Streaming 
(Streaming) 
Graph Computation 
engine. 
(Similar to Giraph) 
GraphX 
(Graph 
Computation) 
Package for R language 
to enable R-users to 
leverage Spark power 
from R shell. 
SparkR 
(R on Spark) 
Machine learning library being built on top of Spark. Provision for support to many 
machine learning algorithms with speeds upto 100 times faster than Map-Reduce. 
Slide 11 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
A Note on Scala 
 Scala is a general-purpose programming language designed 
to express common programming patterns in a concise, 
elegant, and type-safe way 
 Scala supports both Object Oriented Programming and 
Functional Programming 
 Scala is very much in fabric of present and Future Big Data 
frameworks like Scalding, Spark, Akka 
» All examples of Spark in class will be 
covered in Scala 
» Scala would be covered before Spark 
coverage as part of course! 
Slide 12 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Why Scala? 
 Scala is a pure object-oriented language. Conceptually, every value is an object and every operation is a 
method-call. The language supports advanced component architectures through classes and traits 
 Scala is also a functional language. Supports functions, immutable data structures and preference for 
immutability over mutation 
 Seamlessly integrated with Java 
 Being used heavily for future Big data and we developments frameworks like Spark, Akka, Scalding, Play etc 
Slide 13 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
Hello Spark! 
Hello Spark! 
Slide 14 www.Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions edureka.co/apache-spark-scala-training
Questions? 
Buy Spark Course at : www.edureka.co 
Slide 15 www.edureka.Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions co/apache-spark-scala-training
Big data Processing with Apache Spark & Scala
Ad

More Related Content

What's hot (20)

Apache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduceApache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduce
Edureka!
 
Apache spark
Apache sparkApache spark
Apache spark
Dona Mary Philip
 
Introduction to Apache Spark and MLlib
Introduction to Apache Spark and MLlibIntroduction to Apache Spark and MLlib
Introduction to Apache Spark and MLlib
pumaranikar
 
Apache spark
Apache sparkApache spark
Apache spark
TEJPAL GAUTAM
 
Intro to Apache Spark by CTO of Twingo
Intro to Apache Spark by CTO of TwingoIntro to Apache Spark by CTO of Twingo
Intro to Apache Spark by CTO of Twingo
MapR Technologies
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
BTI360
 
5 reasons why spark is in demand!
5 reasons why spark is in demand!5 reasons why spark is in demand!
5 reasons why spark is in demand!
Edureka!
 
Apache spark
Apache spark Apache spark
Apache spark
Edureka!
 
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Edureka!
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
Spark Summit
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache Spark
Dona Mary Philip
 
Apache spark linkedin
Apache spark linkedinApache spark linkedin
Apache spark linkedin
Yukti Kaura
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Rahul Jain
 
Apache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & librariesApache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & libraries
Walaa Hamdy Assy
 
Apache Spark Notes
Apache Spark NotesApache Spark Notes
Apache Spark Notes
Venkateswaran Kandasamy
 
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Simplilearn
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!
Edureka!
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
Dr. Mirko Kämpf
 
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Edureka!
 
Spark
SparkSpark
Spark
Intellipaat
 
Apache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduceApache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduce
Edureka!
 
Introduction to Apache Spark and MLlib
Introduction to Apache Spark and MLlibIntroduction to Apache Spark and MLlib
Introduction to Apache Spark and MLlib
pumaranikar
 
Intro to Apache Spark by CTO of Twingo
Intro to Apache Spark by CTO of TwingoIntro to Apache Spark by CTO of Twingo
Intro to Apache Spark by CTO of Twingo
MapR Technologies
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
BTI360
 
5 reasons why spark is in demand!
5 reasons why spark is in demand!5 reasons why spark is in demand!
5 reasons why spark is in demand!
Edureka!
 
Apache spark
Apache spark Apache spark
Apache spark
Edureka!
 
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Edureka!
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
Spark Summit
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache Spark
Dona Mary Philip
 
Apache spark linkedin
Apache spark linkedinApache spark linkedin
Apache spark linkedin
Yukti Kaura
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Rahul Jain
 
Apache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & librariesApache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & libraries
Walaa Hamdy Assy
 
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginn...
Simplilearn
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!
Edureka!
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
Dr. Mirko Kämpf
 
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Edureka!
 

Viewers also liked (20)

Scala and spark
Scala and sparkScala and spark
Scala and spark
Fabio Fumarola
 
Introduction to Big Data processing (FGRE2016)
Introduction to Big Data processing (FGRE2016)Introduction to Big Data processing (FGRE2016)
Introduction to Big Data processing (FGRE2016)
Thomas Vanhove
 
Five Ways To Do Data Analytics "The Wrong Way"
Five Ways To Do Data Analytics "The Wrong Way"Five Ways To Do Data Analytics "The Wrong Way"
Five Ways To Do Data Analytics "The Wrong Way"
Discover Pinterest
 
Introduction to Spark
Introduction to SparkIntroduction to Spark
Introduction to Spark
Li Ming Tsai
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!
Edureka!
 
Spark Will Replace Hadoop ! Know Why
Spark Will Replace Hadoop ! Know Why Spark Will Replace Hadoop ! Know Why
Spark Will Replace Hadoop ! Know Why
Edureka!
 
Scaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosScaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and Mesos
Discover Pinterest
 
Hadoop and Spark
Hadoop and SparkHadoop and Spark
Hadoop and Spark
Shravan (Sean) Pabba
 
"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016
"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016
"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016
René Pfitzner
 
Apache spark sneha challa- google pittsburgh-aug 25th
Apache spark  sneha challa- google pittsburgh-aug 25thApache spark  sneha challa- google pittsburgh-aug 25th
Apache spark sneha challa- google pittsburgh-aug 25th
Sneha Challa
 
Spark is going to replace Apache Hadoop! Know Why?
Spark is going to replace Apache Hadoop! Know Why?Spark is going to replace Apache Hadoop! Know Why?
Spark is going to replace Apache Hadoop! Know Why?
Edureka!
 
Understanding Big Data And Hadoop
Understanding Big Data And HadoopUnderstanding Big Data And Hadoop
Understanding Big Data And Hadoop
Edureka!
 
Apache Spark Usage in the Open Source Ecosystem
Apache Spark Usage in the Open Source EcosystemApache Spark Usage in the Open Source Ecosystem
Apache Spark Usage in the Open Source Ecosystem
Databricks
 
Fault Tolerance with Kafka
Fault Tolerance with KafkaFault Tolerance with Kafka
Fault Tolerance with Kafka
Edureka!
 
Taboola's experience with Apache Spark (presentation @ Reversim 2014)
Taboola's experience with Apache Spark (presentation @ Reversim 2014)Taboola's experience with Apache Spark (presentation @ Reversim 2014)
Taboola's experience with Apache Spark (presentation @ Reversim 2014)
tsliwowicz
 
Apache Spark 101
Apache Spark 101Apache Spark 101
Apache Spark 101
Abdullah Çetin ÇAVDAR
 
Introduction to Big Data & Hadoop
Introduction to Big Data & HadoopIntroduction to Big Data & Hadoop
Introduction to Big Data & Hadoop
Edureka!
 
2016 spark survey
2016 spark survey2016 spark survey
2016 spark survey
Abhishek Choudhary
 
Big Data Trend with Open Platform
Big Data Trend with Open PlatformBig Data Trend with Open Platform
Big Data Trend with Open Platform
Jongwook Woo
 
Spark For Faster Batch Processing
Spark For Faster Batch ProcessingSpark For Faster Batch Processing
Spark For Faster Batch Processing
Edureka!
 
Introduction to Big Data processing (FGRE2016)
Introduction to Big Data processing (FGRE2016)Introduction to Big Data processing (FGRE2016)
Introduction to Big Data processing (FGRE2016)
Thomas Vanhove
 
Five Ways To Do Data Analytics "The Wrong Way"
Five Ways To Do Data Analytics "The Wrong Way"Five Ways To Do Data Analytics "The Wrong Way"
Five Ways To Do Data Analytics "The Wrong Way"
Discover Pinterest
 
Introduction to Spark
Introduction to SparkIntroduction to Spark
Introduction to Spark
Li Ming Tsai
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!
Edureka!
 
Spark Will Replace Hadoop ! Know Why
Spark Will Replace Hadoop ! Know Why Spark Will Replace Hadoop ! Know Why
Spark Will Replace Hadoop ! Know Why
Edureka!
 
Scaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosScaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and Mesos
Discover Pinterest
 
"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016
"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016
"Spark Summit 2016: Trends & Insights" -- Zurich Spark Meetup, July 2016
René Pfitzner
 
Apache spark sneha challa- google pittsburgh-aug 25th
Apache spark  sneha challa- google pittsburgh-aug 25thApache spark  sneha challa- google pittsburgh-aug 25th
Apache spark sneha challa- google pittsburgh-aug 25th
Sneha Challa
 
Spark is going to replace Apache Hadoop! Know Why?
Spark is going to replace Apache Hadoop! Know Why?Spark is going to replace Apache Hadoop! Know Why?
Spark is going to replace Apache Hadoop! Know Why?
Edureka!
 
Understanding Big Data And Hadoop
Understanding Big Data And HadoopUnderstanding Big Data And Hadoop
Understanding Big Data And Hadoop
Edureka!
 
Apache Spark Usage in the Open Source Ecosystem
Apache Spark Usage in the Open Source EcosystemApache Spark Usage in the Open Source Ecosystem
Apache Spark Usage in the Open Source Ecosystem
Databricks
 
Fault Tolerance with Kafka
Fault Tolerance with KafkaFault Tolerance with Kafka
Fault Tolerance with Kafka
Edureka!
 
Taboola's experience with Apache Spark (presentation @ Reversim 2014)
Taboola's experience with Apache Spark (presentation @ Reversim 2014)Taboola's experience with Apache Spark (presentation @ Reversim 2014)
Taboola's experience with Apache Spark (presentation @ Reversim 2014)
tsliwowicz
 
Introduction to Big Data & Hadoop
Introduction to Big Data & HadoopIntroduction to Big Data & Hadoop
Introduction to Big Data & Hadoop
Edureka!
 
Big Data Trend with Open Platform
Big Data Trend with Open PlatformBig Data Trend with Open Platform
Big Data Trend with Open Platform
Jongwook Woo
 
Spark For Faster Batch Processing
Spark For Faster Batch ProcessingSpark For Faster Batch Processing
Spark For Faster Batch Processing
Edureka!
 
Ad

Similar to Big data Processing with Apache Spark & Scala (20)

Scalable Machine Learning with PySpark
Scalable Machine Learning with PySparkScalable Machine Learning with PySpark
Scalable Machine Learning with PySpark
Ladle Patel
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Edureka!
 
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
Edureka!
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive Guide
Whizlabs
 
20160512 apache-spark-for-everyone
20160512 apache-spark-for-everyone20160512 apache-spark-for-everyone
20160512 apache-spark-for-everyone
Amanda Casari
 
A Master Guide To Apache Spark Application And Versatile Uses.pdf
A Master Guide To Apache Spark Application And Versatile Uses.pdfA Master Guide To Apache Spark Application And Versatile Uses.pdf
A Master Guide To Apache Spark Application And Versatile Uses.pdf
DataSpace Academy
 
Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!
ankitbhandari32
 
Started with-apache-spark
Started with-apache-sparkStarted with-apache-spark
Started with-apache-spark
Happiest Minds Technologies
 
Introduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingIntroduction to Apache Spark Developer Training
Introduction to Apache Spark Developer Training
Cloudera, Inc.
 
Open Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learningOpen Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learning
Patrick Nicolas
 
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Edureka!
 
Scala and spark
Scala and sparkScala and spark
Scala and spark
Janu Jahnavi
 
SparkFramework
SparkFrameworkSparkFramework
SparkFramework
Sergio Viademonte.
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
Dharmjit Singh
 
Pyspark presentationfsfsfjspfsjfsfsfjsfpsfsf
Pyspark presentationfsfsfjspfsjfsfsfjsfpsfsfPyspark presentationfsfsfjspfsjfsfsfjsfpsfsf
Pyspark presentationfsfsfjspfsjfsfsfjsfpsfsf
sasuke20y4sh
 
39.-Introduction-to-Sparkspark and all-1.pdf
39.-Introduction-to-Sparkspark and all-1.pdf39.-Introduction-to-Sparkspark and all-1.pdf
39.-Introduction-to-Sparkspark and all-1.pdf
ajajkhan16
 
Scala: the unpredicted lingua franca for data science
Scala: the unpredicted lingua franca  for data scienceScala: the unpredicted lingua franca  for data science
Scala: the unpredicted lingua franca for data science
Andy Petrella
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to spark
Home
 
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Codemotion
 
Module01
 Module01 Module01
Module01
NPN Training
 
Scalable Machine Learning with PySpark
Scalable Machine Learning with PySparkScalable Machine Learning with PySpark
Scalable Machine Learning with PySpark
Ladle Patel
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Edureka!
 
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
Edureka!
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive Guide
Whizlabs
 
20160512 apache-spark-for-everyone
20160512 apache-spark-for-everyone20160512 apache-spark-for-everyone
20160512 apache-spark-for-everyone
Amanda Casari
 
A Master Guide To Apache Spark Application And Versatile Uses.pdf
A Master Guide To Apache Spark Application And Versatile Uses.pdfA Master Guide To Apache Spark Application And Versatile Uses.pdf
A Master Guide To Apache Spark Application And Versatile Uses.pdf
DataSpace Academy
 
Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!Pyspark vs Spark Let's Unravel the Bond!
Pyspark vs Spark Let's Unravel the Bond!
ankitbhandari32
 
Introduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingIntroduction to Apache Spark Developer Training
Introduction to Apache Spark Developer Training
Cloudera, Inc.
 
Open Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learningOpen Source Lambda Architecture for deep learning
Open Source Lambda Architecture for deep learning
Patrick Nicolas
 
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Edureka!
 
Pyspark presentationfsfsfjspfsjfsfsfjsfpsfsf
Pyspark presentationfsfsfjspfsjfsfsfjsfpsfsfPyspark presentationfsfsfjspfsjfsfsfjsfpsfsf
Pyspark presentationfsfsfjspfsjfsfsfjsfpsfsf
sasuke20y4sh
 
39.-Introduction-to-Sparkspark and all-1.pdf
39.-Introduction-to-Sparkspark and all-1.pdf39.-Introduction-to-Sparkspark and all-1.pdf
39.-Introduction-to-Sparkspark and all-1.pdf
ajajkhan16
 
Scala: the unpredicted lingua franca for data science
Scala: the unpredicted lingua franca  for data scienceScala: the unpredicted lingua franca  for data science
Scala: the unpredicted lingua franca for data science
Andy Petrella
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to spark
Home
 
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Codemotion
 
Ad

More from Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
Edureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
Edureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
Edureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
Edureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
Edureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
Edureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
Edureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
Edureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
Edureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
Edureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
Edureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
Edureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
Edureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
Edureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
Edureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
Edureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
Edureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
Edureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
Edureka!
 
What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
Edureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
Edureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
Edureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
Edureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
Edureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
Edureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
Edureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
Edureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
Edureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
Edureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
Edureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
Edureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
Edureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
Edureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
Edureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
Edureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
Edureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
Edureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
Edureka!
 

Recently uploaded (20)

BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
Nguyen Thanh Tu Collection
 
YSPH VMOC Special Report - Measles Outbreak Southwest US 5-3-2025.pptx
YSPH VMOC Special Report - Measles Outbreak  Southwest US 5-3-2025.pptxYSPH VMOC Special Report - Measles Outbreak  Southwest US 5-3-2025.pptx
YSPH VMOC Special Report - Measles Outbreak Southwest US 5-3-2025.pptx
Yale School of Public Health - The Virtual Medical Operations Center (VMOC)
 
THE STG QUIZ GROUP D.pptx quiz by Ridip Hazarika
THE STG QUIZ GROUP D.pptx   quiz by Ridip HazarikaTHE STG QUIZ GROUP D.pptx   quiz by Ridip Hazarika
THE STG QUIZ GROUP D.pptx quiz by Ridip Hazarika
Ridip Hazarika
 
Understanding P–N Junction Semiconductors: A Beginner’s Guide
Understanding P–N Junction Semiconductors: A Beginner’s GuideUnderstanding P–N Junction Semiconductors: A Beginner’s Guide
Understanding P–N Junction Semiconductors: A Beginner’s Guide
GS Virdi
 
Political History of Pala dynasty Pala Rulers NEP.pptx
Political History of Pala dynasty Pala Rulers NEP.pptxPolitical History of Pala dynasty Pala Rulers NEP.pptx
Political History of Pala dynasty Pala Rulers NEP.pptx
Arya Mahila P. G. College, Banaras Hindu University, Varanasi, India.
 
Kenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 CohortKenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 Cohort
EducationNC
 
How to Customize Your Financial Reports & Tax Reports With Odoo 17 Accounting
How to Customize Your Financial Reports & Tax Reports With Odoo 17 AccountingHow to Customize Your Financial Reports & Tax Reports With Odoo 17 Accounting
How to Customize Your Financial Reports & Tax Reports With Odoo 17 Accounting
Celine George
 
Introduction-to-Communication-and-Media-Studies-1736283331.pdf
Introduction-to-Communication-and-Media-Studies-1736283331.pdfIntroduction-to-Communication-and-Media-Studies-1736283331.pdf
Introduction-to-Communication-and-Media-Studies-1736283331.pdf
james5028
 
Geography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjectsGeography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjects
ProfDrShaikhImran
 
To study the nervous system of insect.pptx
To study the nervous system of insect.pptxTo study the nervous system of insect.pptx
To study the nervous system of insect.pptx
Arshad Shaikh
 
Grade 3 - English - Printable Worksheet (PDF Format)
Grade 3 - English - Printable Worksheet  (PDF Format)Grade 3 - English - Printable Worksheet  (PDF Format)
Grade 3 - English - Printable Worksheet (PDF Format)
Sritoma Majumder
 
03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.
MCH
 
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - WorksheetCBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
Sritoma Majumder
 
How to Manage Purchase Alternatives in Odoo 18
How to Manage Purchase Alternatives in Odoo 18How to Manage Purchase Alternatives in Odoo 18
How to Manage Purchase Alternatives in Odoo 18
Celine George
 
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulsepulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
sushreesangita003
 
Engage Donors Through Powerful Storytelling.pdf
Engage Donors Through Powerful Storytelling.pdfEngage Donors Through Powerful Storytelling.pdf
Engage Donors Through Powerful Storytelling.pdf
TechSoup
 
How to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POSHow to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POS
Celine George
 
"Basics of Heterocyclic Compounds and Their Naming Rules"
"Basics of Heterocyclic Compounds and Their Naming Rules""Basics of Heterocyclic Compounds and Their Naming Rules"
"Basics of Heterocyclic Compounds and Their Naming Rules"
rupalinirmalbpharm
 
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFAExercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Dr. Nasir Mustafa
 
Operations Management (Dr. Abdulfatah Salem).pdf
Operations Management (Dr. Abdulfatah Salem).pdfOperations Management (Dr. Abdulfatah Salem).pdf
Operations Management (Dr. Abdulfatah Salem).pdf
Arab Academy for Science, Technology and Maritime Transport
 
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
BỘ ĐỀ TUYỂN SINH VÀO LỚP 10 TIẾNG ANH - 25 ĐỀ THI BÁM SÁT CẤU TRÚC MỚI NHẤT, ...
Nguyen Thanh Tu Collection
 
THE STG QUIZ GROUP D.pptx quiz by Ridip Hazarika
THE STG QUIZ GROUP D.pptx   quiz by Ridip HazarikaTHE STG QUIZ GROUP D.pptx   quiz by Ridip Hazarika
THE STG QUIZ GROUP D.pptx quiz by Ridip Hazarika
Ridip Hazarika
 
Understanding P–N Junction Semiconductors: A Beginner’s Guide
Understanding P–N Junction Semiconductors: A Beginner’s GuideUnderstanding P–N Junction Semiconductors: A Beginner’s Guide
Understanding P–N Junction Semiconductors: A Beginner’s Guide
GS Virdi
 
Kenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 CohortKenan Fellows Participants, Projects 2025-26 Cohort
Kenan Fellows Participants, Projects 2025-26 Cohort
EducationNC
 
How to Customize Your Financial Reports & Tax Reports With Odoo 17 Accounting
How to Customize Your Financial Reports & Tax Reports With Odoo 17 AccountingHow to Customize Your Financial Reports & Tax Reports With Odoo 17 Accounting
How to Customize Your Financial Reports & Tax Reports With Odoo 17 Accounting
Celine George
 
Introduction-to-Communication-and-Media-Studies-1736283331.pdf
Introduction-to-Communication-and-Media-Studies-1736283331.pdfIntroduction-to-Communication-and-Media-Studies-1736283331.pdf
Introduction-to-Communication-and-Media-Studies-1736283331.pdf
james5028
 
Geography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjectsGeography Sem II Unit 1C Correlation of Geography with other school subjects
Geography Sem II Unit 1C Correlation of Geography with other school subjects
ProfDrShaikhImran
 
To study the nervous system of insect.pptx
To study the nervous system of insect.pptxTo study the nervous system of insect.pptx
To study the nervous system of insect.pptx
Arshad Shaikh
 
Grade 3 - English - Printable Worksheet (PDF Format)
Grade 3 - English - Printable Worksheet  (PDF Format)Grade 3 - English - Printable Worksheet  (PDF Format)
Grade 3 - English - Printable Worksheet (PDF Format)
Sritoma Majumder
 
03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.03#UNTAGGED. Generosity in architecture.
03#UNTAGGED. Generosity in architecture.
MCH
 
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - WorksheetCBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
Sritoma Majumder
 
How to Manage Purchase Alternatives in Odoo 18
How to Manage Purchase Alternatives in Odoo 18How to Manage Purchase Alternatives in Odoo 18
How to Manage Purchase Alternatives in Odoo 18
Celine George
 
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulsepulse  ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
pulse ppt.pptx Types of pulse , characteristics of pulse , Alteration of pulse
sushreesangita003
 
Engage Donors Through Powerful Storytelling.pdf
Engage Donors Through Powerful Storytelling.pdfEngage Donors Through Powerful Storytelling.pdf
Engage Donors Through Powerful Storytelling.pdf
TechSoup
 
How to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POSHow to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POS
Celine George
 
"Basics of Heterocyclic Compounds and Their Naming Rules"
"Basics of Heterocyclic Compounds and Their Naming Rules""Basics of Heterocyclic Compounds and Their Naming Rules"
"Basics of Heterocyclic Compounds and Their Naming Rules"
rupalinirmalbpharm
 
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFAExercise Physiology MCQS By DR. NASIR MUSTAFA
Exercise Physiology MCQS By DR. NASIR MUSTAFA
Dr. Nasir Mustafa
 

Big data Processing with Apache Spark & Scala

  • 1. Big Data Processing With Scala and Spark Slide 1 www.edureka.Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions co/apache-spark-scala-training
  • 2. Objectives of this Session What is Big Data? What is Spark? Why Spark? Spark Ecosystem A note about Scala Why Scala? Hello Spark! For Queries during the session and class recording: Post on Twitter @edurekaIN: #askEdureka Post on Facebook /edurekaIN Slide 2 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 3. Big Data  Lots of Data (Terabytes or Petabytes)  Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications  The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization cloud tools statistics No SQL Big Data compression support database storage analyze information mobile processing terabytes Slide 3 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 4. What is Spark?  Apache Spark is a general-purpose cluster in-memory computing system  Provides high-level APIs in Java, Scala and Python, and an optimized engine that supports general execution graphs  Provides various high level tools like Spark SQL for structured data processing, Mlib for Machine Learning and more.. High Level APIs High Level Tools More… Slide 4 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 5. Why Spark? via YARN Cluster Manager  The Spark framework can be deployed through Apache Mesos, Apache Hadoop via Yarn, or Spark’s own cluster manager. Deployment Slide 5 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 6. Why Spark?  Spark framework is polyglot – Can be programmed in several programming languages (Currently Scala, Java and Python supported). Polyglot Scala Slide 6 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 7. Why Spark? A fully Apache Hive compatible data warehousing system that can run 100x faster than Hive. 100x faster than for certain applications. Slide 7 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 8. Why Spark?  Provides powerful caching and disk persistence capabilities  Interactive Data Analysis  Faster Batch  Iterative Algorithms  Real-Time Stream Processing  Faster Decision-Making Slide 8 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 9. Spark Community is Super Active! Slide 9 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 10. Spark Ecosystem MLLib (Machine learning) Spark Core Engine Aplha/Pre-alpha BlindDB (Approximate SQL) Shark (SQL) Spark Streaming (Streaming) GraphX (Graph Computation) SparkR (R on Spark) Slide 10 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 11. Spark Ecosystem (Contd.) An approximate query engine. To run over Core Spark Engine. Used for structured data. Can run unmodified hive queries on existing Hadoop deployment. MLLib (Machine learning) Enables analytical and interactive apps for live streaming data. Spark Core Engine Aplha/Pre-alpha BlindDB (Approximate SQL) Shark (SQL) Spark Streaming (Streaming) Graph Computation engine. (Similar to Giraph) GraphX (Graph Computation) Package for R language to enable R-users to leverage Spark power from R shell. SparkR (R on Spark) Machine learning library being built on top of Spark. Provision for support to many machine learning algorithms with speeds upto 100 times faster than Map-Reduce. Slide 11 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 12. A Note on Scala  Scala is a general-purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way  Scala supports both Object Oriented Programming and Functional Programming  Scala is very much in fabric of present and Future Big Data frameworks like Scalding, Spark, Akka » All examples of Spark in class will be covered in Scala » Scala would be covered before Spark coverage as part of course! Slide 12 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 13. Why Scala?  Scala is a pure object-oriented language. Conceptually, every value is an object and every operation is a method-call. The language supports advanced component architectures through classes and traits  Scala is also a functional language. Supports functions, immutable data structures and preference for immutability over mutation  Seamlessly integrated with Java  Being used heavily for future Big data and we developments frameworks like Spark, Akka, Scalding, Play etc Slide 13 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions www.edureka.co/apache-spark-scala-training
  • 14. Hello Spark! Hello Spark! Slide 14 www.Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions edureka.co/apache-spark-scala-training
  • 15. Questions? Buy Spark Course at : www.edureka.co Slide 15 www.edureka.Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions co/apache-spark-scala-training