SlideShare a Scribd company logo
Apache Spark
Agenda
Hadoop vs Spark: Big ‘Big Data’ question
Spark Ecosystem
What is RDD
Operations on RDD: Actions vs
Transformations
Running in cluster
Task schedulers
Spark Streaming
Dataframes API
Let’s remember: MapReduce
Apache Hadoop MapReduce
Hadoop VS/AND Spark
Hadoop: DFS
Spark: Speed (RAM)
Spark ecosystem
Glossary
Job
RDD
Stages
Tasks
DAG
Executor
Driver
Simple Example
RDD: Resilient Distributed Dataset
Represents an immutable, partitioned collection of elements that can be
operated in parallel with failure recovery possibilities.
Example
Hadoop RDD
getPartitions = HDFS blocks
getDependencies = None
compute = load block in memory
getPrefferedLocations = HDFS block locations
partitioner = None
MapPartitions RDD
getPartitions = same as parent
getDependencies = parent RDD
compute = compute parent and apply map()
getPrefferedLocations = same as parent
partitioner = None
RDD: Resilient Distributed Dataset
RDD Example
RDD Example
RDD Operations
● Transformations
○ Apply user function to every element in a partition
○ Apply aggregation function to a whole dataset
(groupBy, sortBy)
○ Provide functionality for repartitioning (repartition,
partitionBy)
● Actions
○ Materialize computation results (collect, count,
take)
○ Store RDDs in memory or on disk (cache, persist)
RDD Dependencies
DAG: Directed Acyclic Graph
All the operators in a job
are used to construct a
DAG (Directed Acyclic
Graph). The DAG is
optimized by rearranging
and combining operators
where possible.
DAG Example
DAG Scheduler
The DAG scheduler divides
operators into stages of
tasks. A stage is comprised
of tasks based on partitions
of the input data. Pipelines
operators together.
DAG Scheduler example
RDD Persistence: persist() & cache()
When you persist an RDD, each node stores any partitions of it that it computes in memory
and reuses them in other actions on that dataset (or datasets derived from it).
Storage levels: MEMORY_ONLY (default), MEMORY_AND_DISK,
MEMORY_ONLY_SER, MEMORY_AND_DISK_SER, DISK_ONLY,
MEMORY_ONLY_2, MEMORY_AND_DISK_2, etc.
Removing data: least-recently-used (LRU) fashion or RDD.unpersist() method.
Job execution
Task Schedulers
Standalone
Default
FIFO strategy
Controls number of CPU
cores and executor
memory
YARN
Hadoop oriented
Takes all available
resources
Was designed for
stateless batch jobs
that can be restarted
easily if they fail.
Mesos
Resource oriented
Dynamic sharing or CPU
cores
Less predictive latency
Spark Driver (application)
Running in cluster
Memory usage
• Execution memory
• Storage for data needed during tasks execution
• Shuffle-related data
• Storage memory
• Cached RDDs
• Possible to borrow from execution memory
• User memory
• User data structures and internal metadata
• Safeguarding against OOM
• Reserved memory
• Memory needed for running executor itself
Spark Streaming
Spark Streaming: Basic Concept
Spark Streaming: Architecture
Spark Streaming receives live input data streams and divides the data into
batches, which are then processed by the Spark engine to generate the final
stream of results in batches.
Discretized Streams (DStreams)
Windowed computations
Spark Streaming checkpoints
• Create heavy objects in foreachRDD
• Default persistence level of DStreams keeps the data serialized in memory.
• Checkpointing (metadata and received data)
• Automatic restart (task manager)
• Max receiving rate
• Level of Parallelism
• Kryo serialization
Spark Streaming Example
Spark Dataframes
(SQL)
Apache Hive
• Hadoop product
• Stores metadata in the relational database, but data only in HDFS
• Is not suited for real time data processing
• Best used for batch jobs over large datasets of immutable data (web logs)
Is a good choice if you:
• Want to query the data
• When you’re familiar with SQL
About Spark SQL
Part of Spark core since April 2014
Works with structured data
Mixes SQL queries with Spark programs
Connect to any datasource (files, Hive
tables, external databases, RDDs)
Spark Dataframes
Spark Dataframes
Spark SQL
Spark SQL with schema
Dataframes benchmark
Q&A
Ad

More Related Content

What's hot (20)

Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
Alexey Grishchenko
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
Aakashdata
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache Spark
Databricks
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Anastasios Skarlatidis
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive Guide
Whizlabs
 
Spark
SparkSpark
Spark
Heena Madan
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
Robert Sanders
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with Python
Gokhan Atil
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
Pietro Michiardi
 
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Simplilearn
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
Sohil Jain
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
Girish Khanzode
 
Introduction to PySpark
Introduction to PySparkIntroduction to PySpark
Introduction to PySpark
Russell Jurney
 
PySpark dataframe
PySpark dataframePySpark dataframe
PySpark dataframe
Jaemun Jung
 
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLabApache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
CloudxLab
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
GauravBiswas9
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
colorant
 
Programming in Spark using PySpark
Programming in Spark using PySpark      Programming in Spark using PySpark
Programming in Spark using PySpark
Mostafa
 
PySpark in practice slides
PySpark in practice slidesPySpark in practice slides
PySpark in practice slides
Dat Tran
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
Aakashdata
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache Spark
Databricks
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive Guide
Whizlabs
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with Python
Gokhan Atil
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
Pietro Michiardi
 
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Simplilearn
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
Sohil Jain
 
Introduction to PySpark
Introduction to PySparkIntroduction to PySpark
Introduction to PySpark
Russell Jurney
 
PySpark dataframe
PySpark dataframePySpark dataframe
PySpark dataframe
Jaemun Jung
 
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLabApache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
CloudxLab
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
colorant
 
Programming in Spark using PySpark
Programming in Spark using PySpark      Programming in Spark using PySpark
Programming in Spark using PySpark
Mostafa
 
PySpark in practice slides
PySpark in practice slidesPySpark in practice slides
PySpark in practice slides
Dat Tran
 

Viewers also liked (20)

Qa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana KharchukQa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana Kharchuk
DataArt
 
Propiedad intelectual del soft ware
Propiedad intelectual del soft warePropiedad intelectual del soft ware
Propiedad intelectual del soft ware
Joel Quintana
 
The Rental Policies You Need to Know About
The Rental Policies You Need to Know AboutThe Rental Policies You Need to Know About
The Rental Policies You Need to Know About
UrbanBound
 
IR
IRIR
IR
MAK
 
Роман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слонРоман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слон
DataArt
 
Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"
DataArt
 
photos
photosphotos
photos
diakxr
 
Visiting unpleasent places
Visiting unpleasent placesVisiting unpleasent places
Visiting unpleasent places
Arpanasa
 
Mapas etiquetas
Mapas etiquetasMapas etiquetas
Mapas etiquetas
Diego Rojas
 
Estrategika nuevos productos proteccion
Estrategika nuevos productos proteccionEstrategika nuevos productos proteccion
Estrategika nuevos productos proteccion
JUAN CARLOS CALDERON
 
Bit trade labs sovereign identity fintech summit 2016
Bit trade labs sovereign identity   fintech summit 2016Bit trade labs sovereign identity   fintech summit 2016
Bit trade labs sovereign identity fintech summit 2016
Glen Frost
 
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
DataArt
 
นิทาน
นิทานนิทาน
นิทาน
ExitOfLove
 
Reader’s theater (1)
Reader’s theater (1)Reader’s theater (1)
Reader’s theater (1)
IIPCONX
 
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
DataArt
 
Uses and gratification theory
Uses and gratification theoryUses and gratification theory
Uses and gratification theory
Abbey Cotterill
 
Joint venture
Joint ventureJoint venture
Joint venture
Shlagha Nayyar
 
Bio pharma vessels & tanks
Bio pharma vessels & tanksBio pharma vessels & tanks
Bio pharma vessels & tanks
Akshar Engineering Works
 
Android wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt KharkovAndroid wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt Kharkov
DataArt
 
Qa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana KharchukQa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana Kharchuk
DataArt
 
Propiedad intelectual del soft ware
Propiedad intelectual del soft warePropiedad intelectual del soft ware
Propiedad intelectual del soft ware
Joel Quintana
 
The Rental Policies You Need to Know About
The Rental Policies You Need to Know AboutThe Rental Policies You Need to Know About
The Rental Policies You Need to Know About
UrbanBound
 
IR
IRIR
IR
MAK
 
Роман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слонРоман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слон
DataArt
 
Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"
DataArt
 
photos
photosphotos
photos
diakxr
 
Visiting unpleasent places
Visiting unpleasent placesVisiting unpleasent places
Visiting unpleasent places
Arpanasa
 
Estrategika nuevos productos proteccion
Estrategika nuevos productos proteccionEstrategika nuevos productos proteccion
Estrategika nuevos productos proteccion
JUAN CARLOS CALDERON
 
Bit trade labs sovereign identity fintech summit 2016
Bit trade labs sovereign identity   fintech summit 2016Bit trade labs sovereign identity   fintech summit 2016
Bit trade labs sovereign identity fintech summit 2016
Glen Frost
 
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
DataArt
 
นิทาน
นิทานนิทาน
นิทาน
ExitOfLove
 
Reader’s theater (1)
Reader’s theater (1)Reader’s theater (1)
Reader’s theater (1)
IIPCONX
 
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
DataArt
 
Uses and gratification theory
Uses and gratification theoryUses and gratification theory
Uses and gratification theory
Abbey Cotterill
 
Android wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt KharkovAndroid wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt Kharkov
DataArt
 
Ad

Similar to Apache Spark overview (20)

Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
Djamel Zouaoui
 
Apache Spark on HDinsight Training
Apache Spark on HDinsight TrainingApache Spark on HDinsight Training
Apache Spark on HDinsight Training
Synergetics Learning and Cloud Consulting
 
Spark architechure.pptx
Spark architechure.pptxSpark architechure.pptx
Spark architechure.pptx
SaiSriMadhuriYatam
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
clairvoyantllc
 
Geek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and ScalaGeek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and Scala
Atif Akhtar
 
Big data overview
Big data overviewBig data overview
Big data overview
beCloudReady
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
Rahul Borate
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
cdmaxime
 
Apache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.pptApache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.ppt
bhargavi804095
 
Spark from the Surface
Spark from the SurfaceSpark from the Surface
Spark from the Surface
Josi Aranda
 
Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2 Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2
Olalekan Fuad Elesin
 
Apache Spark Introduction.pdf
Apache Spark Introduction.pdfApache Spark Introduction.pdf
Apache Spark Introduction.pdf
MaheshPandit16
 
Study Notes: Apache Spark
Study Notes: Apache SparkStudy Notes: Apache Spark
Study Notes: Apache Spark
Gao Yunzhong
 
TriHUG talk on Spark and Shark
TriHUG talk on Spark and SharkTriHUG talk on Spark and Shark
TriHUG talk on Spark and Shark
trihug
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.ir
datastack
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Rahul Jain
 
Apache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup TalkApache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup Talk
Eren Avşaroğulları
 
Apache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource ManagerApache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource Manager
haridasnss
 
Bigdata processing with Spark - part II
Bigdata processing with Spark - part IIBigdata processing with Spark - part II
Bigdata processing with Spark - part II
Arjen de Vries
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark Fundamentals
Zahra Eskandari
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
Djamel Zouaoui
 
Geek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and ScalaGeek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and Scala
Atif Akhtar
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
Rahul Borate
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
cdmaxime
 
Apache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.pptApache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.ppt
bhargavi804095
 
Spark from the Surface
Spark from the SurfaceSpark from the Surface
Spark from the Surface
Josi Aranda
 
Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2 Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2
Olalekan Fuad Elesin
 
Apache Spark Introduction.pdf
Apache Spark Introduction.pdfApache Spark Introduction.pdf
Apache Spark Introduction.pdf
MaheshPandit16
 
Study Notes: Apache Spark
Study Notes: Apache SparkStudy Notes: Apache Spark
Study Notes: Apache Spark
Gao Yunzhong
 
TriHUG talk on Spark and Shark
TriHUG talk on Spark and SharkTriHUG talk on Spark and Shark
TriHUG talk on Spark and Shark
trihug
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.ir
datastack
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Rahul Jain
 
Apache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup TalkApache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup Talk
Eren Avşaroğulları
 
Apache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource ManagerApache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource Manager
haridasnss
 
Bigdata processing with Spark - part II
Bigdata processing with Spark - part IIBigdata processing with Spark - part II
Bigdata processing with Spark - part II
Arjen de Vries
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark Fundamentals
Zahra Eskandari
 
Ad

More from DataArt (20)

DataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human ApproachDataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human Approach
DataArt
 
DataArt Healthcare & Life Sciences
DataArt Healthcare & Life SciencesDataArt Healthcare & Life Sciences
DataArt Healthcare & Life Sciences
DataArt
 
DataArt Financial Services and Capital Markets
DataArt Financial Services and Capital MarketsDataArt Financial Services and Capital Markets
DataArt Financial Services and Capital Markets
DataArt
 
About DataArt HR Partners
About DataArt HR PartnersAbout DataArt HR Partners
About DataArt HR Partners
DataArt
 
Event management в IT
Event management в ITEvent management в IT
Event management в IT
DataArt
 
Digital Marketing from inside
Digital Marketing from insideDigital Marketing from inside
Digital Marketing from inside
DataArt
 
What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)
DataArt
 
DevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проектDevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проект
DataArt
 
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArtIT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
DataArt
 
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
 «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han... «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
DataArt
 
Communication in QA's life
Communication in QA's lifeCommunication in QA's life
Communication in QA's life
DataArt
 
Нельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьмиНельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьми
DataArt
 
Знакомьтесь, DevOps
Знакомьтесь, DevOpsЗнакомьтесь, DevOps
Знакомьтесь, DevOps
DataArt
 
DevOps in real life
DevOps in real lifeDevOps in real life
DevOps in real life
DataArt
 
Codeless: автоматизация тестирования
Codeless: автоматизация тестированияCodeless: автоматизация тестирования
Codeless: автоматизация тестирования
DataArt
 
Selenoid
SelenoidSelenoid
Selenoid
DataArt
 
Selenide
SelenideSelenide
Selenide
DataArt
 
A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"
DataArt
 
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
DataArt
 
IT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNGIT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNG
DataArt
 
DataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human ApproachDataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human Approach
DataArt
 
DataArt Healthcare & Life Sciences
DataArt Healthcare & Life SciencesDataArt Healthcare & Life Sciences
DataArt Healthcare & Life Sciences
DataArt
 
DataArt Financial Services and Capital Markets
DataArt Financial Services and Capital MarketsDataArt Financial Services and Capital Markets
DataArt Financial Services and Capital Markets
DataArt
 
About DataArt HR Partners
About DataArt HR PartnersAbout DataArt HR Partners
About DataArt HR Partners
DataArt
 
Event management в IT
Event management в ITEvent management в IT
Event management в IT
DataArt
 
Digital Marketing from inside
Digital Marketing from insideDigital Marketing from inside
Digital Marketing from inside
DataArt
 
What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)
DataArt
 
DevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проектDevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проект
DataArt
 
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArtIT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
DataArt
 
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
 «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han... «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
DataArt
 
Communication in QA's life
Communication in QA's lifeCommunication in QA's life
Communication in QA's life
DataArt
 
Нельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьмиНельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьми
DataArt
 
Знакомьтесь, DevOps
Знакомьтесь, DevOpsЗнакомьтесь, DevOps
Знакомьтесь, DevOps
DataArt
 
DevOps in real life
DevOps in real lifeDevOps in real life
DevOps in real life
DataArt
 
Codeless: автоматизация тестирования
Codeless: автоматизация тестированияCodeless: автоматизация тестирования
Codeless: автоматизация тестирования
DataArt
 
Selenoid
SelenoidSelenoid
Selenoid
DataArt
 
Selenide
SelenideSelenide
Selenide
DataArt
 
A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"
DataArt
 
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
DataArt
 
IT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNGIT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNG
DataArt
 

Recently uploaded (20)

How to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POSHow to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POS
Celine George
 
SPRING FESTIVITIES - UK AND USA -
SPRING FESTIVITIES - UK AND USA            -SPRING FESTIVITIES - UK AND USA            -
SPRING FESTIVITIES - UK AND USA -
Colégio Santa Teresinha
 
Operations Management (Dr. Abdulfatah Salem).pdf
Operations Management (Dr. Abdulfatah Salem).pdfOperations Management (Dr. Abdulfatah Salem).pdf
Operations Management (Dr. Abdulfatah Salem).pdf
Arab Academy for Science, Technology and Maritime Transport
 
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Library Association of Ireland
 
Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...
Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...
Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...
Library Association of Ireland
 
Sinhala_Male_Names.pdf Sinhala_Male_Name
Sinhala_Male_Names.pdf Sinhala_Male_NameSinhala_Male_Names.pdf Sinhala_Male_Name
Sinhala_Male_Names.pdf Sinhala_Male_Name
keshanf79
 
Exploring-Substances-Acidic-Basic-and-Neutral.pdf
Exploring-Substances-Acidic-Basic-and-Neutral.pdfExploring-Substances-Acidic-Basic-and-Neutral.pdf
Exploring-Substances-Acidic-Basic-and-Neutral.pdf
Sandeep Swamy
 
YSPH VMOC Special Report - Measles Outbreak Southwest US 5-3-2025.pptx
YSPH VMOC Special Report - Measles Outbreak  Southwest US 5-3-2025.pptxYSPH VMOC Special Report - Measles Outbreak  Southwest US 5-3-2025.pptx
YSPH VMOC Special Report - Measles Outbreak Southwest US 5-3-2025.pptx
Yale School of Public Health - The Virtual Medical Operations Center (VMOC)
 
Presentation on Tourism Product Development By Md Shaifullar Rabbi
Presentation on Tourism Product Development By Md Shaifullar RabbiPresentation on Tourism Product Development By Md Shaifullar Rabbi
Presentation on Tourism Product Development By Md Shaifullar Rabbi
Md Shaifullar Rabbi
 
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - WorksheetCBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
Sritoma Majumder
 
To study Digestive system of insect.pptx
To study Digestive system of insect.pptxTo study Digestive system of insect.pptx
To study Digestive system of insect.pptx
Arshad Shaikh
 
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Library Association of Ireland
 
K12 Tableau Tuesday - Algebra Equity and Access in Atlanta Public Schools
K12 Tableau Tuesday  - Algebra Equity and Access in Atlanta Public SchoolsK12 Tableau Tuesday  - Algebra Equity and Access in Atlanta Public Schools
K12 Tableau Tuesday - Algebra Equity and Access in Atlanta Public Schools
dogden2
 
apa-style-referencing-visual-guide-2025.pdf
apa-style-referencing-visual-guide-2025.pdfapa-style-referencing-visual-guide-2025.pdf
apa-style-referencing-visual-guide-2025.pdf
Ishika Ghosh
 
To study the nervous system of insect.pptx
To study the nervous system of insect.pptxTo study the nervous system of insect.pptx
To study the nervous system of insect.pptx
Arshad Shaikh
 
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptxSCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
Ronisha Das
 
Metamorphosis: Life's Transformative Journey
Metamorphosis: Life's Transformative JourneyMetamorphosis: Life's Transformative Journey
Metamorphosis: Life's Transformative Journey
Arshad Shaikh
 
YSPH VMOC Special Report - Measles Outbreak Southwest US 4-30-2025.pptx
YSPH VMOC Special Report - Measles Outbreak  Southwest US 4-30-2025.pptxYSPH VMOC Special Report - Measles Outbreak  Southwest US 4-30-2025.pptx
YSPH VMOC Special Report - Measles Outbreak Southwest US 4-30-2025.pptx
Yale School of Public Health - The Virtual Medical Operations Center (VMOC)
 
One Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learningOne Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learning
momer9505
 
How to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 WebsiteHow to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 Website
Celine George
 
How to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POSHow to Manage Opening & Closing Controls in Odoo 17 POS
How to Manage Opening & Closing Controls in Odoo 17 POS
Celine George
 
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Marie Boran Special Collections Librarian Hardiman Library, University of Gal...
Library Association of Ireland
 
Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...
Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...
Phoenix – A Collaborative Renewal of Children’s and Young People’s Services C...
Library Association of Ireland
 
Sinhala_Male_Names.pdf Sinhala_Male_Name
Sinhala_Male_Names.pdf Sinhala_Male_NameSinhala_Male_Names.pdf Sinhala_Male_Name
Sinhala_Male_Names.pdf Sinhala_Male_Name
keshanf79
 
Exploring-Substances-Acidic-Basic-and-Neutral.pdf
Exploring-Substances-Acidic-Basic-and-Neutral.pdfExploring-Substances-Acidic-Basic-and-Neutral.pdf
Exploring-Substances-Acidic-Basic-and-Neutral.pdf
Sandeep Swamy
 
Presentation on Tourism Product Development By Md Shaifullar Rabbi
Presentation on Tourism Product Development By Md Shaifullar RabbiPresentation on Tourism Product Development By Md Shaifullar Rabbi
Presentation on Tourism Product Development By Md Shaifullar Rabbi
Md Shaifullar Rabbi
 
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - WorksheetCBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
CBSE - Grade 8 - Science - Chemistry - Metals and Non Metals - Worksheet
Sritoma Majumder
 
To study Digestive system of insect.pptx
To study Digestive system of insect.pptxTo study Digestive system of insect.pptx
To study Digestive system of insect.pptx
Arshad Shaikh
 
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Michelle Rumley & Mairéad Mooney, Boole Library, University College Cork. Tra...
Library Association of Ireland
 
K12 Tableau Tuesday - Algebra Equity and Access in Atlanta Public Schools
K12 Tableau Tuesday  - Algebra Equity and Access in Atlanta Public SchoolsK12 Tableau Tuesday  - Algebra Equity and Access in Atlanta Public Schools
K12 Tableau Tuesday - Algebra Equity and Access in Atlanta Public Schools
dogden2
 
apa-style-referencing-visual-guide-2025.pdf
apa-style-referencing-visual-guide-2025.pdfapa-style-referencing-visual-guide-2025.pdf
apa-style-referencing-visual-guide-2025.pdf
Ishika Ghosh
 
To study the nervous system of insect.pptx
To study the nervous system of insect.pptxTo study the nervous system of insect.pptx
To study the nervous system of insect.pptx
Arshad Shaikh
 
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptxSCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
SCI BIZ TECH QUIZ (OPEN) PRELIMS XTASY 2025.pptx
Ronisha Das
 
Metamorphosis: Life's Transformative Journey
Metamorphosis: Life's Transformative JourneyMetamorphosis: Life's Transformative Journey
Metamorphosis: Life's Transformative Journey
Arshad Shaikh
 
One Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learningOne Hot encoding a revolution in Machine learning
One Hot encoding a revolution in Machine learning
momer9505
 
How to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 WebsiteHow to Subscribe Newsletter From Odoo 18 Website
How to Subscribe Newsletter From Odoo 18 Website
Celine George
 

Apache Spark overview

Editor's Notes

  • #4: Исполнение в кластере Параллельность Отказоустойчивость Скорость Различные форматы данных Мониторинг и распределение ресурсов
  • #21: Spark’s cache is fault-tolerant – if any partition of an RDD is lost, it will automatically be recomputed using the transformations that originally created it. Пример: вычитываем данные из файла, берем имя работника, его должность, зарплату и возраст. Фильтруем по нужным должностям. Потом хоть аггрегировать: среднюю зп по должности и по возрасту.
  • #31: window length - The duration of the window (3 in the figure). sliding interval - The interval at which the window operation is performed (2 in the figure).
  • #32: Пример с созданием коннекшена к базе