Abdul Azam - Final Research Report

Final Project

Abdul Azam Mohammed

Big Data Analytics

Instructor: Jimmie Flores

Campbellsville University

Chapter 1

Introduction

This report concerns the creation of trustworthy frameworks for large-scale data processing:
frameworks designed to handle the volume, variety, and velocity of big data so that data
collection, storage, processing, and analysis run smoothly. Such frameworks also call for
methods that let organizations assess continuous, real-time information and base their
decisions on the most recent data. When information is used objectively across business
areas, data-driven insights support sound decisions, and the net advantage is a better
experience for the client, lower costs, and greater efficiency. Identifying unfamiliar
relationships and patterns in collected data can be a source of innovation, which in turn can
lead to higher productivity, new products and services, and more advanced business
strategies. Companies that study their data also gain a competitive edge, because they come
to know their customers, markets, and operations much better than their competitors do
(Song & Hua, 2021).

Conventional methodologies hamper the efficient exploitation of data. To overcome their
shortcomings, researchers in data mining and big data analytics face the major challenge of
developing new approaches and innovative tools. It is crucial to develop algorithms that
quickly extract knowledge from large datasets holding a wide variety of information, in order
to spot hidden correlations at scale (Song & Hua, 2021).

Problem statement

As companies scale up and increase both the number of processes and the amount of data
processed, they may run into difficulties related to data volume and processing requirements,
which can cause analysis delays and performance bottlenecks. Data is generated continuously
and can be updated at any moment, so real-time or near-real-time analysis is required to
ensure that insights are acted on before they become outdated (Song & Hua, 2021).
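
The idea of near-real-time analysis can be illustrated with a small sketch. The class name
SlidingWindowMean, the window size, and the sample readings below are all hypothetical and
are not taken from the cited study; the sketch only shows how a summary statistic can be
refreshed with each new reading instead of being recomputed over the full history.

```python
from collections import deque


class SlidingWindowMean:
    """Keep a running mean over the most recent `size` readings."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.window = deque(maxlen=size)  # oldest reading is dropped automatically
        self.total = 0.0

    def update(self, value):
        # Subtract the reading that is about to fall out of a full window.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)


# Hypothetical stream of readings arriving one at a time.
monitor = SlidingWindowMean(size=3)
means = [monitor.update(v) for v in [10, 20, 30, 40, 50]]
print(means)  # [10.0, 15.0, 20.0, 30.0, 40.0]
```

Each update costs constant time regardless of how much data has already arrived, which is
the property that keeps an insight current as the volume of data grows.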

Research Questions

1. Which technology advances, such as machine learning, artificial intelligence, and cloud
computing, are appropriate for broadening data mining and big data analytics?

2. How are big data analytics and data mining changing and developing in response to new
trends and opportunities?

Relevance and significance

Businesses rely on data to anticipate client behavior, make their processes more efficient,
and personalize marketing campaigns. In the healthcare industry, data analysis supports the
approval of new drugs, tailored therapy, and fast disease detection. In finance, data
analytics is vital for institutions that analyze risk, detect fraud, and provide personalized
financial solutions. Big data analytics also lets researchers examine complicated data, which
leads to scientific progress and novel discoveries. Data science and big data analytics are
changing procedures across many domains, and the possibilities for growth, innovation, and
education grow significantly as a result. Together, big data analytics and data mining offer a
solid way of treating data that makes its use as effective as possible (Geetha et al., 2021).

Barriers and issues

To obtain deep insights under time pressure, we need algorithms that work at scale and an
infrastructure for data collection and processing. Where the stakes are high, performance and
scalability are very important. Although traditional data analysis tools can handle some of
the complexity of big data, its sheer volume can lead to bottlenecks and processing delays,
because these tools were not built to handle such volume. Big data analytics also enables
marketing automation, for instance tagging a visitor with their sentiment about a certain
product, or grouping visitors into segments and serving them targeted ads. Additionally, the
ethical issues associated with big data and data mining are equally important, as is the
critical role that real-time analytics plays in detecting and eliminating cyber threats
(Geetha et al., 2021).

Chapter 2- Literature Review

A wide range of cutting-edge tools and technologies is available for building a successful
data mining and big data analytics project. Open-source platforms such as R, Python, and
Apache Spark offer libraries and frameworks for data analysis, machine learning, and the
handling of very large datasets. Commercial offerings such as SAS, IBM SPSS, and Microsoft
Azure bundle sets of tools for data mining, predictive analytics, and business intelligence
(Geetha et al., 2021).

Ethical and legal considerations are especially important within the confines of big data
analytics and data mining. Data privacy laws set out regulations that must be followed
strictly; examples include privacy legislation in the USA and the General Data Protection
Regulation (GDPR) in Europe. How data is collected, used, and distributed now falls within
the scope of these rules. The ACM Code of Ethics and Professional Conduct is one ethical
standard that addresses transparency, accountability, and fairness in algorithms and in the
reporting of analysis processes (Song & Hua, 2021).

The rapidly growing application of big data analytics and data mining technologies means that
very large volumes of data are now examined, with a far-reaching effect on business and
society. Many industries, particularly banking, healthcare, retail, transportation, and
entertainment, use these technologies to gain important insights from large amounts of data
and to base decisions on them. The capacity to interpret data readily and make data-based
choices gives businesses a key competitive advantage in developing innovative solutions,
optimizing operations, and improving the consumer experience (Mosa et al., 2020).

In many big data analytics and data mining operations, data security and privacy are the most
important issues to examine. The growing number of cyberattacks, data breaches, and privacy
infringements makes clear how important it is to prevent unauthorized individuals from
accessing, disclosing, or misusing sensitive data. Rigorous security standards should be
applied to data mining and analysis systems in order to protect the confidentiality and
privacy of the data they handle. Risks in data collection can be reduced through techniques
such as data anonymization, access controls, and encryption (Abdelhafez & Elmannai, 2022).
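
One of the protective techniques named above can be sketched briefly. The example below
replaces a direct identifier with a keyed hash (HMAC-SHA256) using only the Python standard
library. The record layout, the field names, and the key are hypothetical, and this is
pseudonymization rather than full anonymization: whoever holds the key can still link tokens
to identifiers, so the key would need to be stored apart from the data.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret_key):
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key for illustration only; a real key would be generated and stored securely.
SECRET_KEY = b"example-key-not-for-production"

# Hypothetical records containing a direct identifier.
records = [
    {"customer_id": "alice@example.com", "amount": 120.0},
    {"customer_id": "bob@example.com", "amount": 75.5},
    {"customer_id": "alice@example.com", "amount": 30.0},
]

# Drop the direct identifier but keep the fields needed for analysis.
safe_records = [
    {"customer_token": pseudonymize(r["customer_id"], SECRET_KEY), "amount": r["amount"]}
    for r in records
]

for row in safe_records:
    print(row["customer_token"][:12], row["amount"])
```

Because the same identifier always maps to the same token, records belonging to one customer
can still be grouped for analysis without exposing who that customer is.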

Because new technology emerges and changes quickly in this domain, data mining and big data
analytics professionals must cultivate new competencies and improve the ones they already
have. Data scientists and analysts are at an advantage if they continually update their
knowledge of current techniques, tools, and trends, so that they stay ahead of the
competition and do their work well. Industry conferences and seminars, together with
continuing education courses and certifications, give professionals a platform to learn and
become more versatile in data science and analytics (Bhatia & Jason, 2023).

Chapter 3- Research Method

Predictive analytics uses historical data to forecast future events or trends, for example in
the stock market. It combines data mining to find patterns and correlations, statistical
modeling to describe the trends in the data, and machine learning to make predictions. Many
industries employ predictive analytics in areas such as sales forecasting, risk management,
and predictive maintenance. Its purpose is to reduce risks or to capitalize on emerging
opportunities (Mosa et al., 2020).

Predictive analytics research typically relies on methods such as regression analysis and
classification, which predict continuous and categorical target variables respectively.
Classification algorithms assign data to predetermined classes or categories; support vector
machines, logistic regression, and decision trees are examples of classification techniques.
Regression analysis, by contrast, estimates a continuous target variable from one or more
input variables. Applications of these techniques include customer segmentation, fraud
detection, and demand forecasting (Mosa et al., 2020).
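
The difference between the two families can be made concrete with a minimal sketch in plain
Python. The functions fit_line and fit_stump and all of the sample figures are hypothetical
illustrations, not methods or data from the cited review: fit_line is ordinary least squares
with one input variable, and fit_stump is a depth-one decision tree, the simplest member of
the decision tree family mentioned above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_stump(values, labels):
    """Depth-one decision tree: pick the threshold with the fewest training errors."""
    candidates = sorted(set(values))
    best_threshold, best_errors = None, len(values) + 1
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        # Predict class 1 when the value lies above the threshold.
        errors = sum((v > threshold) != (label == 1) for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Regression: hypothetical demand figures that grow with advertising spend.
spend = [1, 2, 3, 4, 5]
demand = [3, 5, 7, 9, 11]
slope, intercept = fit_line(spend, demand)
forecast = slope * 6 + intercept  # continuous prediction for an unseen input

# Classification: hypothetical transaction amounts labeled 1 for fraud, 0 otherwise.
amounts = [20, 35, 50, 400, 650, 900]
is_fraud = [0, 0, 0, 1, 1, 1]
threshold = fit_stump(amounts, is_fraud)
predicted_class = 1 if 300 > threshold else 0  # categorical prediction

print(slope, intercept, forecast, threshold, predicted_class)  # 2.0 1.0 13.0 225.0 1
```

The regression returns a number on a continuous scale, while the classifier returns one of a
fixed set of labels; that distinction is what determines which technique fits a given
forecasting or detection task.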

Cluster analysis is a common data mining technique. It groups similar items or data points
into clusters on the basis of shared features. As a form of unsupervised learning, it
discovers hidden relationships between data points without predefined clusters or labels, so
that previously unseen patterns and structures can emerge. Clustering is broadly applicable
and usually serves two purposes: revealing meaningful structure in data and guiding
data-driven decisions. Its applications include market segmentation, image segmentation, and
anomaly detection (Mosa et al., 2020).
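
As an illustration of grouping points by shared features, the following is a minimal k-means
sketch in plain Python. K-means is only one of many clustering algorithms and is chosen here
as an assumption; the source does not name a specific one. The sample points are
hypothetical, and the initialization (taking the first k points as starting centroids) is a
simplification that real libraries replace with more careful schemes.

```python
import math


def k_means(points, k, max_iterations=100):
    """Assign each point to the nearest centroid, then move centroids to cluster means."""
    centroids = list(points[:k])  # naive initialization: the first k points
    labels = []
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical two-feature observations forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
labels, centroids = k_means(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

No labels are supplied to the function; the two groups emerge only from the distances
between points, which is what distinguishes clustering from classification.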

Chapter 4- Findings

Data mining methods comprise a whole range of algorithms and techniques, such as
visualization, clustering, regression, and association rules, that are used to discover new
trends and relationships in large volumes of data. The main families are regression analysis,
classification, clustering, association rule mining, and deviation detection. The choice of
technique depends on the characteristics of the source data and on the specific objectives of
the study. For instance, classification algorithms can be used to predict output labels,
whereas clustering algorithms can be used to group data points with similar attributes into
clusters (Abdelhafez & Elmannai, 2022).
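
Association rules, one of the techniques listed above, can be sketched with two standard
measures: support (the share of transactions containing both items) and confidence (how
often the consequent appears when the antecedent does). The function pair_rules, the basket
data, and the 0.3 support cutoff below are hypothetical choices made for illustration, and
the sketch only considers rules between pairs of single items.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support):
    """Return {(antecedent, consequent): (support, confidence)} for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to report
        # Confidence depends on direction, so both directions are evaluated.
        for antecedent, consequent in ((a, b), (b, a)):
            rules[(antecedent, consequent)] = (support, count / item_counts[antecedent])
    return rules


# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

rules = pair_rules(transactions, min_support=0.3)
for (antecedent, consequent), (support, confidence) in sorted(rules.items()):
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket with butter also contains bread, so the rule from butter to
bread has confidence 1.0, while the reverse rule has confidence 0.5; the asymmetry is why
rules are reported with a direction.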

While big data analytics and data mining bring undoubted benefits for businesses, a number of
significant problems are also linked with them. These barriers include, but are not limited
to, data accuracy problems, privacy concerns, regulatory compliance, scalability, and staff
shortages. For organizations to overcome these challenges, it is of utmost importance to
implement comprehensive data governance policies, invest in data quality management, and
comply with data protection regulations. Education and training initiatives must also be put
in place to develop appropriate talent (Abdelhafez & Elmannai, 2022).

Chapter 5- Conclusion

In the healthcare industry, promising applications of data mining and big data analytics
include improving patient prognosis, cutting expenditure, and advancing medical research. By
analyzing genetic data, medical imaging data, and electronic health records, medical
practitioners can gain insight into trends, patterns, and correlations that aid crucial
decisions about diagnosis, treatment, and disease management. Moreover, combining data from
wearable devices, sensors, and IoT technologies has made it possible to monitor patients'
health and identify health issues at their earliest stage (Abdelhafez & Elmannai, 2022).

Because they can address the welfare issues that society is coping with now, data mining and
analysis can create opportunities for sustainable development and have a positive social
effect. Data research can yield recommendations that provide a hard-evidence basis for
policies and operations in fields such as fighting poverty, countering climate change and
environmental degradation, and pursuing social justice. If corporations, governments, and
non-profits use data for the common good, social problems can be tackled jointly, governance
can become more resilient and sustainable, and the world can become more equitable for
everyone (Abdelhafez & Elmannai, 2022).



References

Abdelhafez, H. A., & Elmannai, H. (2022). Developing and Comparing Data Mining Algorithms
That Work Best for Predicting Student Performance. International Journal of Information and
Communication Technology Education (IJICTE), 18(1), 1–14.
https://doi.org/10.4018/IJICTE.293235

Bhatia, S., & Jason, L. A. (2023). Using Data Mining and Time Series to Investigate ME and
CFS Naming Preferences. Journal of Disability Policy Studies.
https://doi.org/10.1177/10442073231154027

Geetha, D., Kavitha, V., Manikandan, G., & Karunkuzhali, D. (2021). Enhancement and
Development of Next Generation Data Mining Photolithographic Mechanism. Journal of Physics:
Conference Series, 1964(4). https://doi.org/10.1088/1742-6596/1964/4/042092

Mosa, M., Agami, N., Elkhayat, G., & Kholief, M. (2020). A Literature Review of Data Mining
Techniques for Enhancing Digital Customer Engagement. International Journal of Enterprise
Information Systems (IJEIS), 16(4), 80–100. https://doi.org/10.4018/IJEIS.2020100105

Song, Y., & Hua, X. (2021). Implementation of Data Mining Technology in Bonded Warehouse
Inbound and Outbound Goods Trade. Journal of Organizational and End User Computing (JOEUC),
34(3), 1–18. https://doi.org/10.4018/JOEUC.291511
