MACHINE LEARNING ON BIG
DATA: OPPORTUNITIES AND
CHALLENGES - FUTURE RESEARCH
DIRECTION FOR PHD SCHOLARS
An Academic presentation by
Dr. Nancy Agnes, Head, Technical Operations, Phdassistance
Group www.phdassistance.com
Email: info@phdassistance.com
TODAY'S DISCUSSION
Outline
In-brief
Introduction
Machine learning
Big data
Data preprocessing opportunities and challenges
Evaluation opportunities and challenges
Future research
Conclusion
In-Brief
Machine Learning (ML) is increasingly used in a wide variety of applications. It has risen
to prominence in recent years, owing in part to the emergence of big data. Big data makes
ML algorithms more promising than ever: it allows them to discover finer-grained patterns
and make more timely and precise predictions than before. However, it also poses
significant challenges to machine learning, such as model scalability and distributed
computing.
INTRODUCTION
In fields such as computer vision, speech recognition,
natural language comprehension, neuroscience, fitness,
and the Internet of Things, ML techniques have had
enormous societal impact.
The emergence of the big data era has stirred up interest
in machine learning. Never before have ML algorithms been
so promised, and so challenged, by big data in gaining new
insights into a variety of business applications and human
behaviours.
On the one hand, big data provides ML algorithms with unparalleled amounts of data
from which to derive underlying patterns and create predictive models; on the other
hand, conventional ML algorithms must overcome crucial challenges such as scalability
before they can fully unlock the value of big data.
As the world of big data keeps expanding, ML must develop and grow in order to turn
big data into actionable intelligence.
ML aims to answer the question of how to build a computer system that improves itself
over time.
The problem of learning from experience with respect to certain tasks and performance
metrics is referred to as an ML problem.
Users may use ML techniques to deduce underlying structure and make predictions from
large datasets.
ML thrives on strong computational environments, efficient learning techniques
(algorithms), and rich and/or large data.
As a result, ML has a lot of potential and is an essential part of big data analytics.
Fig. 1. A framework of machine learning on big data (MLBiD)
MACHINE LEARNING
Data pre-processing, learning, and evaluation are the
common stages of machine learning.
Data pre-processing transforms raw data into the
"right form" for the subsequent learning steps.
Through data cleaning, extraction, transformation, and fusion,
the pre-processing phase turns raw data into a
form that can be used as input to learning.
Using the pre-processed input data, the learning step selects learning algorithms and
tunes model parameters to produce desired outputs.
Some learning methods, especially representation learning, can themselves perform
data pre-processing.
After that, the trained models are evaluated to determine how well they perform.
Machine learning can be characterised along several dimensions: the nature of the
learning feedback, the goal of the learning task, and the timing of data availability.
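The three stages described above (pre-processing, learning, evaluation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MLBiD framework itself: the toy data, the min-max scaling, and the nearest-centroid classifier are all choices made here for brevity, and the function names are hypothetical.

```python
from statistics import mean

def preprocess(rows):
    """Cleaning + transformation: drop rows with missing values, then min-max scale."""
    clean = [(x, y) for x, y in rows if None not in x]
    cols = list(zip(*(x for x, _ in clean)))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]

    def scale(x):
        return tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(x, lo, hi))

    # For brevity the scaling statistics use all rows; a real pipeline would
    # compute them on the training split only.
    return [(scale(x), y) for x, y in clean]

def learn(train):
    """Learning: fit a nearest-centroid classifier (one mean vector per class)."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(mean(c) for c in zip(*xs)) for y, xs in by_label.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))

def evaluate(model, test):
    """Evaluation: fraction of held-out samples classified correctly."""
    return sum(predict(model, x) == y for x, y in test) / len(test)

# Toy raw data (assumed for illustration): two features on very different scales,
# one row with a missing value.
raw = [
    ((1.0, 200.0), "a"), ((1.2, 220.0), "a"), ((0.9, None), "a"),
    ((3.0, 800.0), "b"), ((3.1, 790.0), "b"), ((1.1, 210.0), "a"),
    ((2.9, 810.0), "b"),
]
data = preprocess(raw)
train, test = data[:4], data[4:]
model = learn(train)
accuracy = evaluate(model, test)
```

Each stage is a separate function, so any one of them could be swapped for a distributed or more scalable variant without touching the others.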
ML can be divided into three major categories based on the nature of the feedback
available to a learning system: supervised learning, unsupervised learning, and
reinforcement learning.
ML can also be divided into two types, representation learning and task learning,
depending on whether the learning goal is to learn the features themselves or to learn
particular tasks using input features.
Any given machine learning algorithm can therefore be classified along several of these
dimensions.
Fig. 2. A multi-dimensional taxonomy of machine learning
BIG DATA
Volume, velocity, variety, veracity, and value are the five
dimensions of big data.
Starting from the bottom, we organise the five dimensions into
a stack of big, data, and value layers.
The data layer is central to big data, and the value layer
characterises the impact of big data on real-world applications.
The lower layer is more reliant on technical advancements, while the higher layer is
more focused on applications that leverage big data's strategic strength.
Established machine learning paradigms and algorithms must be adapted to
realise the potential of big data analytics and to process big data efficiently.
We identify key opportunities and challenges in this section.
We go through them individually for each of the three phases of machine learning:
preprocessing, learning, and evaluation.
Fig. 3. Big data stack
DATA PREPROCESSING OPPORTUNITIES AND CHALLENGES
DATA REDUNDANCY
When two or more data samples represent the
same object, duplication occurs.
Data duplication or inconsistency can have a
significant impact on machine learning.
Although a variety of duplicate-detection techniques
have been developed over the last 20 years,
traditional methods such as pairwise similarity
comparison are no longer feasible for big data.
Furthermore, the conventional assumption that duplicated pairs are rarer than
non-duplicated pairs no longer holds.
In this regard, Dynamic Time Warping has been reported to be much faster than
state-of-the-art Euclidean distance algorithms.
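To make the scalability problem concrete: comparing every pair of n records costs n(n-1)/2 similarity computations. One common way to cut this down, shown below as a hedged sketch rather than a method taken from the slides, is blocking: records are grouped by a cheap key and only records within the same group are compared. The blocking key (first three characters), the Jaccard threshold, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def normalize(record):
    """Canonical form used for comparison: lowercase, collapse whitespace."""
    return " ".join(record.lower().split())

def blocking_key(record):
    """Hypothetical blocking key: first three characters of the normalized text."""
    return normalize(record)[:3]

def jaccard(a, b):
    """Token-set Jaccard similarity between two records."""
    ta, tb = set(normalize(a).split()), set(normalize(b).split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def find_duplicates(records, threshold=0.6):
    """Compare only records that share a block instead of all n*(n-1)/2 pairs."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    pairs, comparisons = [], 0
    for members in blocks.values():
        for i, j in combinations(members, 2):
            comparisons += 1
            if jaccard(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs, comparisons

# Made-up sample records for illustration.
records = [
    "Acme Corp, 12 High Street",
    "ACME Corp,  12 High Street",
    "Acme Corporation, 12 High St",
    "Zenith Ltd, 4 Park Lane",
    "Zenith Ltd, 4 Park Lane",
    "Borealis GmbH, 9 Main Road",
]
pairs, comparisons = find_duplicates(records)
all_pairs = len(records) * (len(records) - 1) // 2
```

On this toy input the blocked search makes 4 comparisons instead of the 15 an all-pairs scan would need. The trade-off is that duplicates whose blocking keys differ are never compared, so the choice of key matters.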
DATA HETEROGENEITY
Big data typically includes multi-view data drawn from a variety of repositories, in a
variety of formats, and from a variety of population samples, and is therefore highly
heterogeneous.
The value of these multi-view heterogeneous data to a learning task varies from view
to view. As a result, combining all of the features and treating them as equally
relevant is unlikely to result in optimal learning outcomes.
Big data offers the possibility of learning from different views simultaneously and
then ensembling the multiple results by learning the relevance of each feature view to
the task.
Such an approach is expected to be robust to data outliers and to address
optimization and convergence issues.
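The idea of learning from each view separately and then combining the results according to each view's relevance can be sketched as follows. This is only an illustrative assumption about how such a scheme might look: one nearest-centroid model per view, with each view weighted by its accuracy on a validation set and predictions combined by weighted vote. The data and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_centroids(xs, ys):
    """Fit one mean vector per class for a single view."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[y].append(x)
    return {y: tuple(mean(c) for c in zip(*pts)) for y, pts in groups.items()}

def predict_one(centroids, x):
    """Label of the closest centroid (squared Euclidean distance)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

def fit_multiview(views_train, y_train, views_val, y_val):
    """Train one model per view; weight each view by its validation accuracy."""
    models, weights = [], []
    for xs_tr, xs_va in zip(views_train, views_val):
        model = fit_centroids(xs_tr, y_train)
        hits = sum(predict_one(model, x) == y for x, y in zip(xs_va, y_val))
        models.append(model)
        weights.append(hits / len(y_val))
    return models, weights

def predict_multiview(models, weights, sample_views):
    """Weighted vote across views: relevant views count for more."""
    votes = defaultdict(float)
    for model, w, x in zip(models, weights, sample_views):
        votes[predict_one(model, x)] += w
    return max(votes, key=votes.get)

# Toy data (assumed): view A separates the classes well, view B is mostly noise.
y_train = [0, 0, 1, 1]
view_a_train = [(0.1,), (0.2,), (0.9,), (1.0,)]
view_b_train = [(5.0,), (1.0,), (4.0,), (2.4,)]
y_val = [0, 1, 0, 1]
view_a_val = [(0.15,), (0.95,), (0.3,), (0.8,)]
view_b_val = [(3.5,), (2.0,), (1.0,), (3.3,)]

models, weights = fit_multiview(
    [view_a_train, view_b_train], y_train, [view_a_val, view_b_val], y_val
)
prediction = predict_multiview(models, weights, [(0.85,), (1.0,)])
```

Here the informative view receives weight 1.0 and the noisy view 0.5, so when the two views disagree on the final sample the informative one wins.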
DATA DISCRETIZATION
Most current discretization methods would be ineffective when
dealing with large amounts of data.
To address this, traditional discretization approaches have been parallelized on big
data platforms; for example, a distributed variant of the entropy minimization
discretizer based on the Minimum Description Length Principle improves both
efficiency and accuracy.
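For readers unfamiliar with the entropy minimization discretizer, the sketch below shows a single-machine version: it picks the cut point that maximizes information gain and accepts it only if it passes the Minimum Description Length Principle (MDLP) stopping rule of Fayyad and Irani, then recurses on each side. The distributed variant mentioned above is not shown, and the toy values and function names are assumptions for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Best binary split as (cut, gain, left_labels, right_labels), or None."""
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    n, base = len(pairs), entropy(ys)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal values
        left, right = ys[:i], ys[i:]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if best is None or gain > best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, gain, left, right)
    return best

def mdlp_accepts(labels, gain, left, right):
    """MDLP stopping rule: accept the cut only if the gain pays for its encoding cost."""
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (k * entropy(labels) - k1 * entropy(left) - k2 * entropy(right))
    return gain > (log2(n - 1) + delta) / n

def discretize(values, labels):
    """Recursively collect the accepted cut points in ascending order."""
    found = best_cut(values, labels)
    if found is None:
        return []
    cut, gain, left, right = found
    if not mdlp_accepts(labels, gain, left, right):
        return []
    lo = [(v, y) for v, y in zip(values, labels) if v <= cut]
    hi = [(v, y) for v, y in zip(values, labels) if v > cut]
    return discretize(*zip(*lo)) + [cut] + discretize(*zip(*hi))

# Toy attribute (assumed): low values belong to one class, high values to another.
cuts = discretize([1, 2, 3, 4, 10, 11, 12, 13], ["lo"] * 4 + ["hi"] * 4)
no_cuts = discretize([1, 2, 3, 4], ["a", "b", "a", "b"])
```

The first call finds one cut between the two clusters; the second finds none, because no split of the alternating labels gains enough to pass the MDLP test. Sorting and scanning every candidate cut is the step that becomes expensive at scale, which is why parallel variants are of interest.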
DATA LABELLING
Active learning can be used as an optimization technique for labelling tasks
in crowd-sourced databases, reducing the number of questions posed to the
crowd and enabling crowd-sourced applications to scale.
Designing active learning algorithms for a crowd-sourced dataset, on the other
hand, presents a number of practical challenges, including generality, scalability,
and usability.
Another problem is that such a dataset cannot cover all user-specific contexts,
resulting in performance that is often inferior to that of user-centric training.
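One simple way active learning reduces the number of questions posed to the crowd is uncertainty sampling: only the items the current model is least sure about are sent out for labelling. The sketch below assumes a binary classifier that has already produced a positive-class probability for each unlabelled item; the item names, probabilities, and budget are made up for illustration.

```python
from math import log2

def binary_entropy(p):
    """Uncertainty of a predicted positive-class probability p (maximal at 0.5)."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def select_queries(predicted, budget):
    """Pick the `budget` unlabelled items whose predictions are most uncertain."""
    ranked = sorted(predicted, key=lambda item: binary_entropy(predicted[item]), reverse=True)
    return ranked[:budget]

# Hypothetical model outputs for five unlabelled items.
predicted = {
    "img_01": 0.97,
    "img_02": 0.52,
    "img_03": 0.10,
    "img_04": 0.45,
    "img_05": 0.80,
}
to_ask_crowd = select_queries(predicted, budget=2)
```

With a budget of two questions, the items with probabilities closest to 0.5 are selected, while confident predictions such as 0.97 are left unlabelled.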
IMBALANCED DATA
Traditional stratified random sampling approaches have been used to tackle the problem
of imbalanced data.
However, if iterations of sub-sample generation and error metrics measurement are
needed, the process can take a long time.
Furthermore, conventional sampling methods cannot efficiently support sampling
over a user-specified subset of the data, including value-based sampling.
Big data therefore requires parallel data sampling.
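As a point of reference, a single-machine stratified sampler can be sketched as below. It uses equal allocation (the same number of rows from every class), which is one of several possible allocation schemes and is an assumption made here; the transaction data is synthetic.

```python
import random
from collections import defaultdict

def stratified_sample(rows, label_of, per_class, seed=0):
    """Draw up to `per_class` rows from every class so rare classes are not swamped."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[label_of(row)].append(row)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        sample.extend(rng.sample(members, min(per_class, len(members))))
    return sample

# Synthetic imbalanced data: 10 "fraud" rows among 500 transactions.
rows = [("tx%03d" % i, "fraud" if i % 50 == 0 else "ok") for i in range(500)]
balanced = stratified_sample(rows, label_of=lambda r: r[1], per_class=10)
counts = {lab: sum(1 for r in balanced if r[1] == lab) for lab in ("fraud", "ok")}
```

Because each stratum is sampled independently of the others, the per-class loop is a natural unit of work to hand to separate workers, which is one way the parallel sampling called for above could be approached.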
FUTURE RESEARCH
This paper summarises the opportunities and
challenges of machine learning on big data.
Big data creates new opportunities to inspire transformative
and novel ML techniques that solve many of the associated
technical problems and generate real-world impact,
while also posing multiple challenges for conventional ML in
terms of scalability, adaptability, and usability.
These opportunities and challenges can be used to evaluate current research in
this field.
Following the components of the MLBiD framework, we also highlight some open
research issues in ML on big data, as shown in Table.
CONCLUSION
In conclusion, machine learning is needed to address the
challenges posed by big data and to discover hidden patterns,
information, and insights in big data, in order to transform its
potential into real value for business decision-making and
scientific exploration.
The combination of machine learning and big data points to a
bright future at a new frontier.
Contact Us
UNITED KINGDOM
+44-1143520021
INDIA
+91-4448137070
EMAIL
info@phdassistance.com
