B07: Apache Spark in Big Data Analytics Tools

Apache Spark is a very useful tool in big data analytics.


║JAI SRI GURUDEV║

Sri Adichunchanagiri Shikshana Trust (R)

SJB INSTITUTE OF TECHNOLOGY
BGS Health & Education City, Kengeri, Bangalore – 60

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
ASSIGNMENT – 02
APACHE SPARK

Presented by:
Anjali N [1JB22CS400]
Deepika C [1JB22CS403]
Deeya Darshini SD [1JB22CS404]

Under the guidance of:
Mrs. Vijayalakshmi B
INTRODUCTION

Apache Spark is an open-source, distributed data processing framework designed for big data analytics and machine learning. It provides a fast, general-purpose engine for large-scale data processing, with capabilities for batch, real-time streaming, machine learning, and graph processing.
KEY FEATURES

 Speed: in-memory computation makes Spark much faster than disk-based MapReduce
 Ease of Use: high-level APIs in Python, Scala, Java, and R
 Fault Tolerance: lost partitions are recomputed automatically from lineage information
 Scalability: scales from a single machine to thousands of cluster nodes
 Rich Ecosystem: Spark SQL, Spark Streaming, MLlib, and GraphX
ARCHITECTURE

Apache Spark uses a master-slave architecture consisting of the following components:

 Driver Program → Cluster Manager → Executors (Worker Nodes)

• Driver Program: Defines tasks and sends them to executors.
• Cluster Manager: Allocates resources for tasks.
• Executors: Perform parallel computations on the worker nodes.
QUERIES IN APACHE SPARK

Using Spark SQL: Spark SQL allows you to run SQL-like queries on structured data. You can load data into a temporary table or view and execute SQL queries.

Example:

from pyspark.sql import SparkSession

# Initialize Spark Session
spark = SparkSession.builder.appName("SparkSQLExample").getOrCreate()

# Load data into a DataFrame
df = spark.read.csv("data.csv", header=True, inferSchema=True)

# Create a temporary view
df.createOrReplaceTempView("data_table")

# Run an SQL query
result = spark.sql("SELECT name, age FROM data_table WHERE age > 30")

# Show the results
result.show()
Use Cases

 Real-Time Data Processing
 Machine Learning
 Graph Analytics
 Data Science and Exploratory Data Analysis (EDA)
 Real-Time Recommendations
 Media and Entertainment

Conclusion

Apache Spark is a powerful and versatile big data processing framework that has revolutionized the way data is processed and analyzed. Its key features, such as in-memory computing, scalability, fault tolerance, and real-time data processing, make it an indispensable tool in the world of big data.
