Data Mining With Big Data
Guide: Prof. Prashant G. Ahire
Presented by: Miss Rupa Solapure
Roll no. 259
Agenda
Problem Definition
Objectives
Literature Survey
Architecture/Big Data mining algorithm
Existing System/Mathematical model
Advantages
Disadvantages/Limitations
Characteristics of Big Data
Big Data and its challenges
Big Data mining Tools
Applications of Big Data
References
Problem Definition:
Big Data consists of large-volume, complex, growing data sets with
multiple, autonomous sources. With the rapid development of networking,
data storage, and data collection capacity, Big Data is now expanding
quickly in all science and engineering domains, including the physical,
biological, and biomedical sciences. This paper presents the HACE
theorem, which characterizes the features of the Big Data revolution, and
proposes a Big Data processing model from the data mining perspective.
Objective:
Mining Big Data requires carefully designed algorithms that analyze model
correlations between distributed sites and fuse decisions from multiple sources
to obtain the best model from the data. Developing a safe and sound
information-sharing protocol is a major challenge.
To support Big Data mining, high-performance computing platforms are
required, and they call for systematic designs to unleash the full power of Big
Data. Big Data is an emerging trend, and the need for Big Data mining is rising in
all science and engineering domains.
Literature Survey
Title/Year: “Data Mining With Big Data”, Jan 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This paper presents a HACE theorem that characterizes the features of the Big Data revolution and proposes a Big Data processing model from the data mining perspective.
Author: Xindong Wu, Fellow, IEEE; Xingquan Zhu, Senior Member, IEEE; Gong-Qing Wu; and Wei Ding

Title/Year: “The Survey of Data Mining Applications And Feature Scope”, June 2012
Keywords: Data mining task, data mining life cycle, visualization of the data mining model, data mining methods, data mining applications.
Concept/Abstract: This paper presents a number of data mining applications and also focuses on the scope of data mining, which will be helpful in further research.
Author: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi

Title/Year: “Review on Data Mining with Big Data”, Dec 2014
Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations.
Concept/Abstract: This data-driven model involves demand-driven aggregation of information sources, mining and analysis, and security and privacy considerations.
Author: Savita Suryavanshi, Prof. Bharati Kale

Title/Year: “SURVEY ON BIG DATA MINING PLATFORMS, ALGORITHMS AND CHALLENGES”, Sep 2014
Keywords: Big data, big data mining platforms, big data mining algorithms, big data mining challenges, data mining.
Concept/Abstract: This paper reviews various big data mining platforms and algorithms and discusses the associated challenges.
Author: SHERIN A, Dr S UMA, SARANYA K, SARANYA VANI M
Architecture:
Fig.: Big data Memory evolution
Data Mining Algorithm
 Decision tree induction classification algorithms
 Evolutionary based classification algorithms
 Partitioning based clustering algorithms
 Hierarchical based clustering algorithms
 Model based clustering algorithms
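As one illustration of the partitioning-based family listed above, here is a minimal sketch of k-means (Lloyd's algorithm) on one-dimensional points. The function name, the sample points, and the initial centers are assumptions made for this example and do not come from the slides.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm on 1-D points with caller-supplied initial centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical data: two well-separated groups of points.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```

With these inputs the centers settle on the means of the two groups after a few iterations. Real Big Data workloads would typically need a distributed or streaming variant rather than this in-memory loop.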
Existing System:
With the rise of Big Data applications, data collection has grown tremendously
and is now beyond the ability of commonly used software tools to capture,
manage, and process within a “tolerable elapsed time.”
The most fundamental challenge for Big Data applications is to explore the large
volumes of data and extract useful information or knowledge for future actions.
In many situations, the knowledge extraction process has to be very efficient and
close to real time because storing all observed data is nearly infeasible.
The unprecedented data volumes require an effective data analysis and prediction
platform to achieve fast response and real-time classification for such Big Data.
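When storing every observation is infeasible, statistics can be maintained in a single pass over the stream. The sketch below uses Welford's method to keep a running mean and variance. The class name and the sample readings are hypothetical and serve only to illustrate the idea.

```python
class RunningStats:
    """One-pass mean and variance (Welford's method); stores no observations."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 until at least one value has been seen.
        return self.m2 / self.n if self.n else 0.0


stats = RunningStats()
for reading in [10.0, 12.0, 14.0]:  # hypothetical sensor stream
    stats.update(reading)
```

Each reading is processed once and then discarded, so memory use stays constant regardless of the stream length.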
At the model level, each site produces local patterns by mining its own local
data.
By sharing these local patterns with the other sites, a single global pattern
can be produced.
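A minimal sketch of this idea, assuming that the "local pattern" is a table of item support counts and that sites share only these counts, never raw transactions. The function names, the sample transactions, and the support threshold are assumptions for illustration.

```python
from collections import Counter


def local_pattern(transactions):
    """Count in how many local transactions each item appears."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    return counts, len(transactions)


def global_pattern(local_patterns, min_support):
    """Merge the shared (counts, n) summaries and keep items frequent overall."""
    total = Counter()
    n_total = 0
    for counts, n in local_patterns:
        total.update(counts)
        n_total += n
    return {item: c / n_total
            for item, c in total.items()
            if c / n_total >= min_support}


# Hypothetical transactions held at two separate sites.
site_a = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
site_b = [["milk", "eggs"], ["milk", "bread"]]
patterns = global_pattern(
    [local_pattern(site_a), local_pattern(site_b)], min_support=0.5
)
```

Sharing full count tables, rather than only the locally frequent items, avoids missing an item that is infrequent at each site but frequent overall.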
At the knowledge level, model correlation analysis investigates the relevance
between models generated from various data sources to determine how closely
the data sources are correlated with each other and how to form accurate
decisions based on models built from autonomous sources.
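One simple way to fuse decisions from autonomous sources is a weighted majority vote. This is an illustrative technique chosen for the sketch, not necessarily the method of the surveyed paper. The site names, labels, and weights below are hypothetical.

```python
from collections import defaultdict


def fuse_decisions(predictions, weights):
    """Weighted majority vote over the labels predicted by local models."""
    score = defaultdict(float)
    for site, label in predictions.items():
        # Sites without an explicit weight count as 1.0.
        score[label] += weights.get(site, 1.0)
    return max(score, key=score.get)


# Hypothetical labels from three local models for the same record.
predictions = {"site_a": "fraud", "site_b": "normal", "site_c": "fraud"}
# Hypothetical weights, e.g. each model's validation accuracy.
weights = {"site_a": 0.7, "site_b": 0.9, "site_c": 0.6}
decision = fuse_decisions(predictions, weights)
```

Here the two weaker models that agree outweigh the single stronger model, so the fused decision is the label they share.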
Big Data
Big Data is a comprehensive term for any collection of data sets so large and complex
that it becomes difficult to process them using conventional data processing applications.
There are two types of Big Data: structured and unstructured.
Structured data
Structured data are numbers and words that can be easily categorized and analyzed.
These data are generated by things like network sensors embedded in electronic
devices, smart phones, and global positioning system (GPS) devices. Structured data
also include things like sales figures, account balances, and transaction data.
Unstructured data
Unstructured data include more complex information, such as customer reviews
from commercial websites, photos and other multimedia, and comments on social
networking sites. These data cannot easily be separated into categories or analyzed
numerically.
Big Data Characteristics (HACE Theorem)
Fig.: The blind men and the giant elephant: the limited view
of each blind man leads to a biased conclusion.
The HACE theorem suggests that the key characteristics of
Big Data are:
A. Huge data with heterogeneous and diverse data sources
B. Autonomous sources with distributed and decentralized control
C. Complex and evolving associations
Applications of Data Mining
Marketing
 Analysis of consumer behaviour
 Advertising campaigns
 Targeted mailings
 Segmentation of customers, stores, or products
Finance
 Creditworthiness of clients
 Performance analysis of financial investments
 Fraud detection
Manufacturing
 Optimization of resources
 Optimization of manufacturing processes
 Product design based on customer requirements
Health Care
 Discovering patterns in X-ray images
 Analyzing side effects of drugs
 Effectiveness of treatments
Big Data Mining Algorithm
Big Data applications gather information from many sources.
 Mining the data in the conventional way would require gathering all
distributed data at a centralized site, but this is prohibitive because of
high data transmission costs and privacy concerns.
Model mining and correlation are the key steps for ensuring that patterns
discovered from a variety of sources can be combined.
Global data mining is done through a two-step process:
 Model level
 Knowledge level
At the data level, each local site uses its local data to calculate data
statistics and shares them with the other sites to obtain a view of the global
data distribution.
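The data-level exchange can be sketched as follows, assuming each site shares only a small summary (count, mean, and sum of squared deviations) that a coordinator merges with the standard pairwise formula for combining means and variances. The function names and the sample values are hypothetical.

```python
def local_stats(values):
    """Summarize one site's data as (count, mean, sum of squared deviations)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return n, mean, m2


def merge_stats(a, b):
    """Combine two site summaries without access to the raw values."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


# Hypothetical measurements held at two separate sites.
site_a = [2.0, 4.0, 4.0, 4.0]
site_b = [5.0, 5.0, 7.0, 9.0]
n, mean, m2 = merge_stats(local_stats(site_a), local_stats(site_b))
global_variance = m2 / n  # population variance of the combined data
```

Only three numbers per site cross the network, which addresses both the transmission cost and the privacy concern raised above, at least for these simple statistics.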
Data Mining Challenges With Big Data
Fig.: A conceptual view of the Big Data processing framework
DISADVANTAGES OF EXISTING SYSTEM
To explore Big Data, we have analysed several challenges at the
data, model, and system levels.
The challenges at Tier I focus on data accessing and arithmetic
computing procedures. Because Big Data are often stored at
different locations and data volumes may continuously grow, an
effective computing platform will have to take distributed large-
scale data storage into consideration for computing.
PROPOSED SYSTEM
We propose a HACE theorem to model Big Data characteristics. The
characteristics of HACE make it an extreme challenge to
discover useful knowledge from Big Data.
ADVANTAGES OF PROPOSED SYSTEM
Provides the most relevant and accurate social sensing feedback to
better understand our society in real time.
Characteristics of Big Data
Fig.: Five Vs of Big Data
Volume: the quantity of data
Variety: the different types and categories of data
Velocity: the speed at which data are generated or processed
Variability: inconsistency in the data
Complexity: the difficulty of managing the data
BIG Data Mining Tools
Hadoop
Apache S4
Storm
Apache Mahout
MOA
Fig.: Big Data processing
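Several of the tools above build on the map, shuffle, and reduce pattern popularized by Hadoop. The sketch below imitates that pattern in plain Python on a word-count task. It does not use the Hadoop API, and the function names and sample records are assumptions for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for each word of one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


# Hypothetical input records; in a real cluster these would be file splits.
records = ["big data mining", "data mining tools", "big data"]
pairs = chain.from_iterable(map_phase(r) for r in records)
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each record is mapped independently and each key is reduced independently, both phases can be spread across many machines.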
Conclusion:
Because the amount of data in fields such as genomics, meteorology, biology,
and environmental research keeps increasing, it becomes difficult to handle
the data, to find associations and patterns, and to analyze the large data sets.
As an organization collects more data at this scale, formalizing the process of
Big Data analysis will become paramount. The paper describes the
different algorithms used to handle such large data sets and gives an
overview of the architecture and algorithms used for them.
References
 McKinsey Global Institute, May 2011, Big Data: The Next Frontier for
Innovation, Competition, and Productivity
 Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, 2013,
Data Mining with Big Data
 Rezwan Ahmed, George Karypis, 2012,
Algorithms for Mining the Evolution of Conserved Relational States in
Dynamic Networks
 IEEE, January 2014, Data Mining with Big Data
 Oracle, June 2013, Unstructured Data Management with Oracle
Database 12c
Ad

More Related Content

What's hot (20)

Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.ppt
MutiaSari53
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
Sandip Tipayle Patil
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
Slideshare
 
Introduction To Predictive Modelling
Introduction To Predictive ModellingIntroduction To Predictive Modelling
Introduction To Predictive Modelling
Spotle.ai
 
Big Data
Big DataBig Data
Big Data
Seminar Links
 
Data mining
Data miningData mining
Data mining
Kinza Razzaq
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Umair Shafique
 
Data Warehouse
Data Warehouse Data Warehouse
Data Warehouse
MadhuriNigam1
 
Exploratory data analysis data visualization
Exploratory data analysis data visualizationExploratory data analysis data visualization
Exploratory data analysis data visualization
Dr. Hamdan Al-Sabri
 
Data Mining
Data MiningData Mining
Data Mining
ksanthosh
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data Mining
Valerii Klymchuk
 
CS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_ArchitectureCS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_Architecture
Palani Kumar
 
Big data
Big dataBig data
Big data
Pooja Shah
 
Data mining introduction
Data mining introductionData mining introduction
Data mining introduction
Basma Gamal
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
Eyad Manna
 
Mining Association Rules in Large Database
Mining Association Rules in Large DatabaseMining Association Rules in Large Database
Mining Association Rules in Large Database
Er. Nawaraj Bhandari
 
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
ssuser23e4f31
 
Dimensional Modelling
Dimensional ModellingDimensional Modelling
Dimensional Modelling
Prithwis Mukerjee
 
Data Visualization
Data VisualizationData Visualization
Data Visualization
Mithilesh Trivedi
 
Big data unit i
Big data unit iBig data unit i
Big data unit i
Navjot Kaur
 
Data Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.pptData Mining Concept & Technique-ch04.ppt
Data Mining Concept & Technique-ch04.ppt
MutiaSari53
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
Sandip Tipayle Patil
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
Slideshare
 
Introduction To Predictive Modelling
Introduction To Predictive ModellingIntroduction To Predictive Modelling
Introduction To Predictive Modelling
Spotle.ai
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Umair Shafique
 
Exploratory data analysis data visualization
Exploratory data analysis data visualizationExploratory data analysis data visualization
Exploratory data analysis data visualization
Dr. Hamdan Al-Sabri
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data Mining
Valerii Klymchuk
 
CS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_ArchitectureCS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_Architecture
Palani Kumar
 
Data mining introduction
Data mining introductionData mining introduction
Data mining introduction
Basma Gamal
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
Eyad Manna
 
Mining Association Rules in Large Database
Mining Association Rules in Large DatabaseMining Association Rules in Large Database
Mining Association Rules in Large Database
Er. Nawaraj Bhandari
 
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
Data Analytics Life Cycle [EMC² - Data Science and Big data analytics]
ssuser23e4f31
 

Viewers also liked (20)

Big Data v Data Mining
Big Data v Data MiningBig Data v Data Mining
Big Data v Data Mining
University of Hertfordshire
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
Sandip Tipayle Patil
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
kk1718
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
Nasrin Hussain
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
Bernard Marr
 
Big Data
Big DataBig Data
Big Data
NGDATA
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
smj
 
Data mining
Data miningData mining
Data mining
imran khan
 
Introduction to Data Mining and Big Data Analytics
Introduction to Data Mining and Big Data AnalyticsIntroduction to Data Mining and Big Data Analytics
Introduction to Data Mining and Big Data Analytics
Big Data Engineering, Faculty of Engineering, Dhurakij Pundit University
 
What is big data?
What is big data?What is big data?
What is big data?
David Wellman
 
Big data ppt
Big data pptBig data ppt
Big data ppt
Thirunavukkarasu Ps
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should Know
Bernard Marr
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 
Big Data & The Role Analytics Can Play In Our Organizations
Big Data & The Role Analytics Can Play In Our OrganizationsBig Data & The Role Analytics Can Play In Our Organizations
Big Data & The Role Analytics Can Play In Our Organizations
Agile Technologies
 
Data is Currency
Data is CurrencyData is Currency
Data is Currency
iMedia Connection
 
How Great Companies Think Differently
How Great Companies Think DifferentlyHow Great Companies Think Differently
How Great Companies Think Differently
Dia Lao
 
Frank henry digital rural futures conf june 2013 v3
Frank henry digital rural futures conf june  2013  v3Frank henry digital rural futures conf june  2013  v3
Frank henry digital rural futures conf june 2013 v3
Frank Henry
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
Mithlesh Sadh
 
2016 and 2017 Data Mining Projects @ TMKS Infotech
2016 and 2017 Data Mining Projects @ TMKS Infotech2016 and 2017 Data Mining Projects @ TMKS Infotech
2016 and 2017 Data Mining Projects @ TMKS Infotech
Manju Nath
 
2016 and 2017 IEEE Titles
2016 and 2017 IEEE Titles2016 and 2017 IEEE Titles
2016 and 2017 IEEE Titles
Manju Nath
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
kk1718
 
Big Data
Big DataBig Data
Big Data
NGDATA
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
smj
 
Big Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should KnowBig Data - 25 Amazing Facts Everyone Should Know
Big Data - 25 Amazing Facts Everyone Should Know
Bernard Marr
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 
Big Data & The Role Analytics Can Play In Our Organizations
Big Data & The Role Analytics Can Play In Our OrganizationsBig Data & The Role Analytics Can Play In Our Organizations
Big Data & The Role Analytics Can Play In Our Organizations
Agile Technologies
 
How Great Companies Think Differently
How Great Companies Think DifferentlyHow Great Companies Think Differently
How Great Companies Think Differently
Dia Lao
 
Frank henry digital rural futures conf june 2013 v3
Frank henry digital rural futures conf june  2013  v3Frank henry digital rural futures conf june  2013  v3
Frank henry digital rural futures conf june 2013 v3
Frank Henry
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
Mithlesh Sadh
 
2016 and 2017 Data Mining Projects @ TMKS Infotech
2016 and 2017 Data Mining Projects @ TMKS Infotech2016 and 2017 Data Mining Projects @ TMKS Infotech
2016 and 2017 Data Mining Projects @ TMKS Infotech
Manju Nath
 
2016 and 2017 IEEE Titles
2016 and 2017 IEEE Titles2016 and 2017 IEEE Titles
2016 and 2017 IEEE Titles
Manju Nath
 
Ad

Similar to Data minig with Big data analysis (20)

Big data and data mining
Big data and data miningBig data and data mining
Big data and data mining
Polash Halder
 
Issues, challenges, and solutions
Issues, challenges, and solutionsIssues, challenges, and solutions
Issues, challenges, and solutions
csandit
 
ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING
ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MININGISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING
ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING
cscpconf
 
Paradigm4 Research Report: Leaving Data on the table
Paradigm4 Research Report: Leaving Data on the tableParadigm4 Research Report: Leaving Data on the table
Paradigm4 Research Report: Leaving Data on the table
Paradigm4
 
[IJET-V1I3P10] Authors : Kalaignanam.K, Aishwarya.M, Vasantharaj.K, Kumaresan...
[IJET-V1I3P10] Authors : Kalaignanam.K, Aishwarya.M, Vasantharaj.K, Kumaresan...[IJET-V1I3P10] Authors : Kalaignanam.K, Aishwarya.M, Vasantharaj.K, Kumaresan...
[IJET-V1I3P10] Authors : Kalaignanam.K, Aishwarya.M, Vasantharaj.K, Kumaresan...
IJET - International Journal of Engineering and Techniques
 
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdfKIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
Dr. Radhey Shyam
 
KIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfKIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdf
Dr. Radhey Shyam
 
Introduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycleIntroduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycle
Dr. Radhey Shyam
 
Real World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining ToolsReal World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining Tools
ijsrd.com
 
Mining Big Data using Genetic Algorithm
Mining Big Data using Genetic AlgorithmMining Big Data using Genetic Algorithm
Mining Big Data using Genetic Algorithm
IRJET Journal
 
A SURVEY OF BIG DATA ANALYTICS
A SURVEY OF BIG DATA ANALYTICSA SURVEY OF BIG DATA ANALYTICS
A SURVEY OF BIG DATA ANALYTICS
ijistjournal
 
GROUP PROJECT REPORT_FY6055_FX7378
GROUP PROJECT REPORT_FY6055_FX7378GROUP PROJECT REPORT_FY6055_FX7378
GROUP PROJECT REPORT_FY6055_FX7378
Parag Kapile
 
1 UNIT-DSP.pptx
1 UNIT-DSP.pptx1 UNIT-DSP.pptx
1 UNIT-DSP.pptx
PothyeswariPothyes
 
Data Mining – A Perspective Approach
Data Mining – A Perspective ApproachData Mining – A Perspective Approach
Data Mining – A Perspective Approach
IRJET Journal
 
Unit-1 introduction to Big data.pdf
Unit-1 introduction to Big data.pdfUnit-1 introduction to Big data.pdf
Unit-1 introduction to Big data.pdf
Sitamarhi Institute of Technology
 
Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)
NikitaRajbhoj
 
Big Data Mining - Classification, Techniques and Issues
Big Data Mining - Classification, Techniques and IssuesBig Data Mining - Classification, Techniques and Issues
Big Data Mining - Classification, Techniques and Issues
Karan Deep Singh
 
BigData
BigDataBigData
BigData
Viveka Sharma
 
Big Data Intoduction & Hadoop ArchitectureModule1.pdf
Big Data Intoduction & Hadoop ArchitectureModule1.pdfBig Data Intoduction & Hadoop ArchitectureModule1.pdf
Big Data Intoduction & Hadoop ArchitectureModule1.pdf
SharmilaChidaravalli
 
A Novel Framework for Big Data Processing in a Data-driven Society
A Novel Framework for Big Data Processing in a Data-driven SocietyA Novel Framework for Big Data Processing in a Data-driven Society
A Novel Framework for Big Data Processing in a Data-driven Society
AnthonyOtuonye
 
Big data and data mining
Big data and data miningBig data and data mining
Big data and data mining
Polash Halder
 
Issues, challenges, and solutions
Issues, challenges, and solutionsIssues, challenges, and solutions
Issues, challenges, and solutions
csandit
 
ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING
ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MININGISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING
ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING
cscpconf
 
Paradigm4 Research Report: Leaving Data on the table
Paradigm4 Research Report: Leaving Data on the tableParadigm4 Research Report: Leaving Data on the table
Paradigm4 Research Report: Leaving Data on the table
Paradigm4
 
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdfKIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
Dr. Radhey Shyam
 
KIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdfKIT-601 Lecture Notes-UNIT-1.pdf
KIT-601 Lecture Notes-UNIT-1.pdf
Dr. Radhey Shyam
 
Introduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycleIntroduction to Data Analytics and data analytics life cycle
Introduction to Data Analytics and data analytics life cycle
Dr. Radhey Shyam
 
Real World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining ToolsReal World Application of Big Data In Data Mining Tools
Real World Application of Big Data In Data Mining Tools
ijsrd.com
 
Mining Big Data using Genetic Algorithm
Mining Big Data using Genetic AlgorithmMining Big Data using Genetic Algorithm
Mining Big Data using Genetic Algorithm
IRJET Journal
 
A SURVEY OF BIG DATA ANALYTICS
A SURVEY OF BIG DATA ANALYTICSA SURVEY OF BIG DATA ANALYTICS
A SURVEY OF BIG DATA ANALYTICS
ijistjournal
 
GROUP PROJECT REPORT_FY6055_FX7378
GROUP PROJECT REPORT_FY6055_FX7378GROUP PROJECT REPORT_FY6055_FX7378
GROUP PROJECT REPORT_FY6055_FX7378
Parag Kapile
 
Data Mining – A Perspective Approach
Data Mining – A Perspective ApproachData Mining – A Perspective Approach
Data Mining – A Perspective Approach
IRJET Journal
 
Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)Nikita rajbhoj(a 50)
Nikita rajbhoj(a 50)
NikitaRajbhoj
 
Big Data Mining - Classification, Techniques and Issues
Big Data Mining - Classification, Techniques and IssuesBig Data Mining - Classification, Techniques and Issues
Big Data Mining - Classification, Techniques and Issues
Karan Deep Singh
 
Big Data Intoduction & Hadoop ArchitectureModule1.pdf
Big Data Intoduction & Hadoop ArchitectureModule1.pdfBig Data Intoduction & Hadoop ArchitectureModule1.pdf
Big Data Intoduction & Hadoop ArchitectureModule1.pdf
SharmilaChidaravalli
 
A Novel Framework for Big Data Processing in a Data-driven Society
A Novel Framework for Big Data Processing in a Data-driven SocietyA Novel Framework for Big Data Processing in a Data-driven Society
A Novel Framework for Big Data Processing in a Data-driven Society
AnthonyOtuonye
 
Ad

Recently uploaded (20)

QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)
Vəhid Gəruslu
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
Raish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdfRaish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdf
RaishKhanji
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Journal of Soft Computing in Civil Engineering
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
Avnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights FlyerAvnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights Flyer
WillDavies22
 
Introduction to Zoomlion Earthmoving.pptx
Introduction to Zoomlion Earthmoving.pptxIntroduction to Zoomlion Earthmoving.pptx
Introduction to Zoomlion Earthmoving.pptx
AS1920
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Journal of Soft Computing in Civil Engineering
 
The Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLabThe Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLab
Journal of Soft Computing in Civil Engineering
 
Compiler Design_Lexical Analysis phase.pptx
Compiler Design_Lexical Analysis phase.pptxCompiler Design_Lexical Analysis phase.pptx
Compiler Design_Lexical Analysis phase.pptx
RushaliDeshmukh2
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
Mathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdfMathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdf
TalhaShahid49
 
ELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdfELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdf
Shiju Jacob
 
Oil-gas_Unconventional oil and gass_reseviours.pdf
Oil-gas_Unconventional oil and gass_reseviours.pdfOil-gas_Unconventional oil and gass_reseviours.pdf
Oil-gas_Unconventional oil and gass_reseviours.pdf
M7md3li2
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)

Data mining with Big Data analysis

  • 1. Data Mining With Big Data. Guide: Prof. Prashant G. Ahire. Presented by: Miss Rupa Solapure, Roll no. 259
  • 2. Agenda: Problem Definition; Objectives; Literature Survey; Architecture/Big Data mining algorithm; Existing System/Mathematical model; Advantages; Disadvantages/Limitations; Characteristics of Big Data; Big Data and its challenges; Big Data mining Tools; Applications of Big Data; References
  • 3. Problem Definition: Big Data consists of large-volume, complex, growing data sets with multiple, independent sources. With the fast development of networking, data storage, and data-gathering capacity, Big Data is now expanding quickly in all science and engineering domains, as well as in the animal, genetic, and biomedical sciences. This paper elaborates a HACE theorem that states the characteristics of the Big Data revolution and proposes a Big Data processing model from the data mining view.
  • 4. Objective: Big Data mining requires carefully designed algorithms to analyze model correlations between distributed sites and to fuse decisions from multiple sources into the best possible model of the Big Data. Developing a safe and sound information-sharing protocol is a major challenge. To support Big Data mining, high-performance computing platforms are required, which impose systematic designs to unleash the full power of Big Data. Big Data is an emerging trend, and the need for Big Data mining is rising in all science and engineering domains.
  • 5. Literature Survey (Title/Year; Keywords; Concept/Abstract; Author)
      - "Data Mining With Big Data", Jan 2014. Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations. Concept/Abstract: This paper presents a HACE theorem that characterizes the features of the Big Data revolution and proposes a processing model from the data mining perspective. Authors: Xindong Wu, Fellow, IEEE; Xingquan Zhu, Senior Member, IEEE; Gong-Qing Wu; and Wei Ding.
      - "The Survey of Data Mining Applications And Feature Scope", June 2012. Keywords: data mining task, data mining life cycle, visualization of the data mining model, data mining methods, data mining applications. Concept/Abstract: This paper presents a number of applications of data mining and also focuses on the scope of data mining, which will be helpful for further research. Authors: Neelamadhab Padhy, Dr. Pragnyaban Mishra, and Rasmita Panigrahi.
      - "Review on Data Mining with Big Data", Dec 2014. Keywords: Big Data, data mining, heterogeneity, autonomous sources, complex and evolving associations. Concept/Abstract: This data-driven model involves demand-driven aggregation of information sources, mining and analysis, and security and privacy considerations. Authors: Savita Suryavanshi, Prof. Bharati Kale.
      - "Survey on Big Data Mining Platforms, Algorithms and Challenges", Sep 2014. Keywords: big data, big data mining platforms, big data mining algorithms, big data mining challenges, data mining. Concept/Abstract: This paper reviews various big data mining platforms, algorithms, and challenges. Authors: SHERIN A, Dr S UMA, SARANYA K, SARANYA VANI M.
  • 6. Architecture: Fig.: Big data Memory evolution
  • 7. Data Mining Algorithm: Decision tree induction classification algorithms; Evolutionary based classification algorithms; Partitioning based clustering algorithms; Hierarchical based clustering algorithms; Model based clustering algorithms
  • 8. Existing System: Big Data applications are on the rise, and in them data collection has grown tremendously, beyond the ability of commonly used software tools to capture, manage, and process the data within a “tolerable elapsed time.” The most fundamental challenge for Big Data applications is to explore the large volumes of data and extract useful information or knowledge for future actions. In many situations, the knowledge extraction process has to be very efficient and close to real time because storing all observed data is nearly infeasible. The unprecedented data volumes require an effective data analysis and prediction platform to achieve fast response and real-time classification for such Big Data.
  • 9. At the model level, each site produces local patterns by mining its local data. By sharing these local patterns with the other local sites, a single global pattern can be produced. At the knowledge level, model correlation analysis investigates the relevance between models generated from various data sources to determine how closely the data sources are correlated with each other and how to form accurate decisions based on models built from autonomous sources. Continued…
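The model-level step on slide 9 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a "local pattern" is a table of per-item support counts and that the "global pattern" is the set of items whose combined support meets a threshold. The names `local_pattern`, `global_pattern`, and `min_support` are hypothetical and do not come from the slides or from the HACE paper.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Transaction = Set[str]
LocalPattern = Tuple[Counter, int]  # (item support counts, number of transactions)


def local_pattern(transactions: List[Transaction]) -> LocalPattern:
    """Mine one site's local data: count how many transactions contain each item."""
    counts: Counter = Counter()
    for transaction in transactions:
        counts.update(transaction)
    return counts, len(transactions)


def global_pattern(patterns: List[LocalPattern], min_support: float) -> Dict[str, float]:
    """Merge the shared local patterns into one global pattern.

    Only counts are exchanged; the raw transactions never leave their site.
    """
    total_counts: Counter = Counter()
    total_transactions = 0
    for counts, size in patterns:
        total_counts.update(counts)
        total_transactions += size
    if total_transactions == 0:
        return {}
    return {
        item: count / total_transactions
        for item, count in total_counts.items()
        if count / total_transactions >= min_support
    }


# Two hypothetical sites, each mining its own data.
site_1 = local_pattern([{"milk", "bread"}, {"milk"}, {"bread", "eggs"}])
site_2 = local_pattern([{"milk", "eggs"}, {"milk", "bread"}])
frequent = global_pattern([site_1, site_2], min_support=0.5)
```

In this sketch each site shares counts for every item it has seen, so the merged support values are exact. If sites shared only their locally frequent items, the global result would be an approximation.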
  • 10. Big Data: Big Data is a comprehensive term for any collection of data sets so large and complex that it becomes difficult to process them using conventional data processing applications. There are two types of Big Data: structured and unstructured. Structured data: Structured data are numbers and words that can be easily categorized and analyzed. These data are generated by things like network sensors embedded in electronic devices, smartphones, and global positioning system (GPS) devices. Structured data also include things like sales figures, account balances, and transaction data. Unstructured data: Unstructured data include more complex information, such as customer reviews from commercial websites, photos and other multimedia, and comments on social networking sites. These data cannot be separated into categories or analyzed numerically.
  • 11. Big Data Characteristics (HACE Theorem). Figure: The blind men and the enormous elephant: the restricted view of each blind man leads to a biased conclusion.
  • 12. The HACE theorem suggests that the key characteristics of Big Data are: A. Huge, with heterogeneous and diverse data sources; B. Autonomous sources with distributed and decentralized control; C. Complex and evolving associations
  • 13. Applications of Data Mining
      - Marketing: analysis of consumer behaviour; advertising campaigns; targeted mailings; segmentation of customers, stores, or products
      - Finance: creditworthiness of clients; performance analysis of finance investments; fraud detection
      - Manufacturing: optimization of resources; optimization of manufacturing processes; product design based on customer requirements
      - Health Care: discovering patterns in X-ray images; analyzing side effects of drugs; effectiveness of treatments
  • 14. Big Data Mining Algorithm: Big Data applications gather information from many sources. To mine the data centrally, all distributed data would have to be gathered at one site, but this is prohibitive because of high data transmission costs and privacy concerns. Instead, mining aims to discover correlations or patterns by combining a variety of sources. Global data mining is done through a two-step process at the model level and the knowledge level. At the data level, each local site uses its local data to calculate data statistics and shares this information so that the global data distribution can be obtained.
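The data-level exchange described on slide 14 can be sketched in Python as follows. This is only an illustration under stated assumptions: the shared "data statistics" are taken to be a count, a sum, and a sum of squares per site, from which a coordinator recovers the global mean and population variance. The names `LocalStats`, `summarize`, and `combine` are hypothetical and not from the slides.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class LocalStats:
    """Summary that a site shares instead of its raw records."""
    count: int
    total: float
    total_sq: float


def summarize(values: Iterable[float]) -> LocalStats:
    """Compute the local statistics on one site's own data."""
    data = list(values)
    return LocalStats(len(data), sum(data), sum(v * v for v in data))


def combine(stats: List[LocalStats]) -> Tuple[int, float, float]:
    """Merge per-site summaries into the global count, mean, and population variance."""
    n = sum(s.count for s in stats)
    if n == 0:
        raise ValueError("no data reported by any site")
    mean = sum(s.total for s in stats) / n
    variance = sum(s.total_sq for s in stats) / n - mean * mean
    return n, mean, variance


# Two hypothetical sites; only three numbers per site cross the network.
site_a = summarize([1.0, 2.0, 3.0])
site_b = summarize([4.0, 5.0])
n, mean, variance = combine([site_a, site_b])
```

The design choice here is that the summaries are additive, so the result equals what a centralized computation over all five values would give, without transmitting the values themselves. The sum-of-squares formula can lose precision on large or widely ranging data, so a production system would likely prefer a numerically stabler merge.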
  • 15. Data Mining Challenges With Big Data. Fig.: A conceptual view of the Big Data processing framework
  • 16. DISADVANTAGES OF EXISTING SYSTEM: To explore Big Data, we have analysed several challenges at the data, model, and system levels. The challenges at Tier I focus on data accessing and arithmetic computing procedures. Because Big Data are often stored at different locations and data volumes may grow continuously, an effective computing platform has to take distributed large-scale data storage into consideration for computing.
  • 17. PROPOSED SYSTEM: We propose a HACE theorem to model Big Data characteristics. The characteristics of HACE make discovering useful knowledge from Big Data an extreme challenge.
  • 18. ADVANTAGES OF PROPOSED SYSTEM: Provides the most relevant and accurate social sensing feedback, helping us better understand our society in real time.
  • 19. ADVANTAGES OF PROPOSED SYSTEM: Provides the most relevant and accurate social sensing feedback, helping us better understand our society in real time.
  • 20. Characteristics of Big Data Fig. Five Vs of BIG DATA
  • 21. Volume: the quantity of data. Variety: categorizing the data. Velocity: the speed of generation or processing of the data. Variability: inconsistency of the data. Complexity: managing the data. Continued…
  • 22. Big Data Mining Tools: Hadoop; Apache S4; Storm; Apache Mahout; MOA
  • 23. Fig.: Big Data processing
  • 24. Conclusion: Because the amount of data in fields such as genomics, meteorology, biology, and environmental research keeps increasing, it becomes difficult to handle the data, to find associations and patterns, and to analyze the large data sets. As an organization collects more data at this scale, formalizing the process of big data analysis will become paramount. The paper describes different algorithms used to handle such large data sets and gives an overview of the architecture and algorithms used with them.
  • 25. References
      - McKinsey Global Institute, Big Data: The next frontier for innovation, competition and productivity, May 2011
      - Xindong Wu, Xingquan Zhu, Gong-Qing Wu, Wei Ding, Data Mining with Big Data, 2013
      - Rezwan Ahmed, George Karypis, Algorithms for mining the evolution of conserved relational states in dynamic networks, 2012
      - IEEE, Data Mining with Big Data, January 2014
      - Oracle, Unstructured Data Management with Oracle Database 12c, June 2013