Data-Centric Artificial Intelligence For Multidisciplinary Applications 1st Edition by Parikshit N. Mahalle
Edited by
Parikshit N. Mahalle, Namrata N. Wasatkar, and
Gitanjali R. Shinde
Designed cover image: ShutterStock
© 2024 selection and editorial matter, Parikshit N. Mahalle, Namrata N. Wasatkar and Gitanjali R.
Shinde; individual chapters, the contributors
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of
their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged, please write and
let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known
or hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.
com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA
01923, 978‑750‑8400. For works that are not available on CCC please contact mpkbookspermissions@
tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003461500
Typeset in Times
by codeMantra
Contents
Editors..................................................................................................................... viii
List of Contributors.....................................................................................................x
Section I Recent Developments in Data‑Centric AI
Chapter 1 Advancements in Data‑Centric AI Foundations, Ethics,
and Emerging Technology....................................................................3
Sujal Dilip Patil, Rupali Atul Mahajan, and Nitin Sakhare
Section II Data‑Centric AI in Healthcare and Agriculture
Chapter 8 Medical Image Analysis and Classification for Varicose Veins....... 114
Jyoti Yogesh Deshmukh, Vijay U. Rathod, Yogesh Kisan Mali,
and Rachna Sable
Section III Building AI with Quality Data for Multidisciplinary Domains
Index....................................................................................................................... 293
Editors
Dr. Parikshit N. Mahalle is a senior member of the IEEE and
is Professor and Head of Department of Artificial Intelligence
and Data Science at Vishwakarma Institute of Information
Technology, Pune, India. He completed his Ph.D. from Aalborg
University, Denmark and continued as Post Doc Researcher at
CMI, Copenhagen, Denmark. He has 23+ years of teaching
and research experience. He is a member of the Board of
Studies in Computer Engineering, Ex‑Chairman Information
Technology, SPPU and various Universities and autonomous
colleges across India. He has 9 patents, 200+ research publications (Google Scholar citations 2250+, h‑index 22; Scopus citations 1190+, h‑index 16), and authored/edited 42+ books with Springer, CRC Press, Cambridge University
Press, etc. He is editor‑in‑chief for IGI Global – International Journal of Rough Sets
and Data Analysis, Associate Editor for IGI Global – International Journal of
Synthetic Emotions, Inter‑science International Journal of Grid and Utility
Computing, member of Editorial Review Board for IGI Global – International
Journal of Ambient Computing and Intelligence. His research interests are machine
learning, data science, algorithms, internet of things, identity management and secu‑
rity. He is a recognized Ph.D. guide of SPPU, Pune, guiding seven Ph.D. students in
the area of IoT and machine learning. Recently, five students have successfully
defended their Ph.D. He is also the recipient of the “Best Faculty Award” by Sinhgad Institutes and Cognizant Technology Solutions. He has delivered 200 plus lectures at national and international levels.
citations 700+, h‑index 11). She is the author of 10+ books with publishers Springer and CRC Press Taylor & Francis Group, and she is also the editor of several books. Her book Data Analytics for Pandemics: A COVID‑19 Case Study was awarded Outstanding Book of the Year 2020.
Sarthak Turki
Vishwakarma Institute of Information
Technology
Pune, India
Section I
Recent Developments
in Data‑Centric AI
1 Advancements
in Data‑Centric AI
Foundations, Ethics, and
Emerging Technology
Sujal Dilip Patil, Rupali Atul Mahajan,
and Nitin Sakhare
1.1 INTRODUCTION
Data‑centric artificial intelligence (AI) denotes an approach within AI and
machine learning (ML) that places significant emphasis on the pivotal role of
meticulously curated, high‑quality data in the development and implementation of
AI models and systems [1]. Under this paradigm, data assumes the bedrock upon
which AI algorithms are constructed and honed, and its effective handling, pre‑
processing, and analysis stand as pivotal factors for achieving precise and depend‑
able AI outcomes [2]. The essence of data‑centric AI springs from the recognition
that the performance of AI models is intricately linked to the calibre and quantity
of data employed for training, validation, and testing. This methodology under‑
scores the understanding that even the most advanced AI algorithms might grap‑
ple to yield meaningful outcomes if the input data is incomplete, biased, laden
with noise, or inadequately structured [3]. The remainder of this chapter is organized as follows: key aspects of data‑centric AI are described in Section 1.1.1; applications of data‑centric AI and the main techniques used in AI, such as machine learning and deep learning, are explained in Section 1.2; the technologies that form part of data‑centric AI are covered in Sections 1.3, 1.4, 1.5, and 1.6; and the ethical implications of AI technologies are described in Section 1.7. The chapter also covers various AI governance and regulation strategies for responsible AI implementation and oversight.
Key aspects of Data‑Centric AI include:
1.2.1 Healthcare
Data‑centric AI improves the accuracy of medical image analysis, aiding in diagnosing diseases like cancer, detecting anomalies, and predicting patient outcomes [10].
AI‑driven analysis of molecular data accelerates drug discovery by predicting drug
interactions, identifying potential drug candidates, and optimizing drug designs [11].
By analysing patient data, including genetic, clinical, and lifestyle factors, AI helps
tailor treatment plans and predicts patient responses to therapies [12].
1.2.2 Finance
Data‑centric AI enhances credit risk assessment, fraud detection, and anti‑money
laundering efforts by analysing transactional and behavioural data for patterns. AI
models analyse market data to make high‑frequency trading decisions, optimizing
trading strategies, and portfolio management. AI‑driven analytics of customer data
improve marketing strategies, customer segmentation, and churn prediction [13].
1.2.4 Manufacturing
Data‑centric AI monitors and analyses sensor data from manufacturing processes to
identify defects, reduce waste, and ensure product quality. AI models analyse equip‑
ment sensor data to predict maintenance needs, minimizing downtime and opti‑
mizing maintenance schedules. AI‑driven analysis of supply chain data improves
inventory management, demand forecasting, and logistics planning.
1.2.6 Transportation
Data‑centric AI powers self‑driving cars by analysing sensor data to make real‑time
driving decisions and ensure passenger safety [15]. AI‑driven analysis of traffic data
improves traffic flow, reduces congestion, and enhances urban mobility.
1.2.7 Agriculture
Data‑centric AI analyses data from sensors, drones, and satellites to optimize crop
management, irrigation, and fertilization for increased yield and resource efficiency.
AI models analyse images of plants to detect diseases early, enabling targeted inter‑
ventions and reducing crop loss.
1.3.1 Supervised Learning
Supervised learning stands as a type of ML wherein the algorithm gains insights
from labelled training data. In this context, the dataset is composed of pairs denoting
inputs and their associated outputs. The algorithm’s objective revolves around
mastering the mapping between inputs and the anticipated outputs. The ultimate
aim is for the algorithm to extrapolate from the training dataset and generate precise
predictions or classifications for fresh, previously unseen data.
In supervised learning, the algorithm is furnished with datasets compris‑
ing labelled pairs of inputs and their corresponding outputs. The primary goal
of supervised learning is for the algorithm to discern and internalize a map‑
ping function. This function should enable the algorithm to make precise pre‑
dictions or classifications for new, previously unseen instances. Supervised
learning encompasses diverse tasks, with classification and regression standing
out. Classification entails assigning labels to inputs, while regression involves
predicting continuous values. A range of algorithms are utilized in supervised
learning, including decision trees, support vector machines, neural networks, and
linear regression. These algorithms are tailored to address various types of data
and tasks.
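To make the supervised‑learning workflow concrete, here is a minimal, hedged sketch (assuming scikit‑learn is available; the Iris dataset and the decision‑tree model are illustrative choices, not ones prescribed by the chapter):

```python
# A minimal supervised-learning sketch: learn a mapping from labelled
# (input, output) pairs and evaluate on unseen data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)          # inputs and their associated labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)   # hold out unseen data for testing

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)                # learn the input-to-output mapping

predictions = model.predict(X_test)        # predictions for previously unseen data
print("Accuracy:", accuracy_score(y_test, predictions))
```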
1.3.2 Unsupervised Learning
Unsupervised learning entails training an ML model using unlabelled data, where
explicit output labels are absent [16]. The algorithm’s objective is to uncover pat‑
terns, structures, or relationships inherent within the data. This is frequently accom‑
plished by clustering similar instances together or reducing the data’s dimensionality.
Unlabelled data is provided, and the algorithm seeks to identify underlying patterns
or groupings. The goal is to explore the inherent structure of the data, uncover hidden
patterns, or reduce its complexity. Unsupervised learning encompasses tasks such
as clustering, where similar data points are grouped together, and dimensionality
reduction, which involves condensing the number of features while preserving vital
information [17]. A variety of algorithms are employed in unsupervised learning,
including k‑means clustering, hierarchical clustering, principal component analysis,
and t‑distributed Stochastic Neighbor Embedding. These algorithms cater to differ‑
ent aspects of data exploration and pattern recognition.
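A corresponding minimal sketch for unsupervised learning (again assuming scikit‑learn; the synthetic data, the k‑means setting, and the PCA dimensionality are illustrative):

```python
# A minimal unsupervised-learning sketch: no labels are provided; the
# algorithms look for structure (clusters) and reduce dimensionality.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Unlabelled data: two loose groups of points in 4 dimensions.
data = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(5, 1, (50, 4))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print("Cluster sizes:", np.bincount(kmeans.labels_))

pca = PCA(n_components=2)                  # condense features, keep most variance
reduced = pca.fit_transform(data)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```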
1.3.3 Reinforcement Learning
Reinforcement learning (RL) stands as an ML category in which an agent learns to
formulate decisions through interactions with an environment. This agent garners
feedback in the shape of rewards or penalties contingent on its actions, striving to
acquire a policy that maximizes the cumulative reward across a span of time. RL
frequently finds application in tasks involving sequential decision‑making.
• Training Data: The agent learns through trial and error by interacting with
an environment.
• Objective: The agent aims to learn a policy that maximizes the expected
cumulative reward over a sequence of actions.
• Examples: Game playing, robotic control, and autonomous driving are
typical applications of RL.
• Algorithms: Q‑learning, Deep Q‑Networks, and policy gradient methods are common RL algorithms (a minimal Q‑learning sketch follows this list).
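As referenced in the Algorithms bullet above, here is a minimal tabular Q‑learning sketch on a toy five‑state chain environment; the environment, rewards, and hyperparameters are illustrative only:

```python
# A minimal tabular Q-learning sketch on a toy 5-state "chain" environment:
# the agent moves left/right and is rewarded only for reaching the last state.
import numpy as np

n_states, n_actions = 5, 2             # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))    # action-value table
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != n_states - 1:                        # episode ends at the goal state
        if rng.random() < epsilon:                      # explore occasionally
            action = int(rng.integers(n_actions))
        else:                                           # otherwise act greedily
            action = int(np.argmax(Q[state]))
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: move Q(s, a) toward reward + discounted best future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print("Learned greedy policy (0=left, 1=right):", np.argmax(Q, axis=1))
```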
These three fundamental learning algorithms constitute the bedrock of ML, serving as
pivotal components for constructing a diverse array of AI applications. Depending on
the nature of the challenge and the accessibility of labelled data, each learning algo‑
rithm category possesses its own merits and constraints. The selection of an algorithm
hinges upon the particular task at hand and the distinct attributes of the data involved.
Deep learning and neural networks have propelled advancements in diverse domains,
such as computer vision, NLP, and RL. Effective deep learning model development,
enabling precise predictions and discovery of intricate data patterns, demands pru‑
dent architecture design and rigorous training [22].
In transfer learning with pre‑trained networks, the lower layers capture generic features, and fine‑tuning can be performed on higher layers to adapt the model to the specific dataset.
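A hedged sketch of this kind of fine‑tuning (assuming PyTorch and torchvision are available; recent torchvision versions take a weights= argument while older ones use pretrained=, and the two‑class head is hypothetical):

```python
# A minimal transfer-learning sketch: keep lower layers as generic feature
# extractors and fine-tune only the higher layers on the new dataset.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=None)     # in practice, load pre-trained weights here

for param in model.parameters():          # freeze all lower layers
    param.requires_grad = False

# Replace the final (highest) layer so it matches the new task's classes
# (two classes here, purely for illustration); only this layer will be updated.
model.fc = nn.Linear(model.fc.in_features, 2)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print("Parameters that will be fine-tuned:", trainable)
```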
1.5.1 Applications of CNNs
CNNs have demonstrated remarkable performance in various image analysis tasks.
1.6.1 Types of RNNs
1. Simple RNN: The basic RNN architecture processes sequences step by
step, but it suffers from the vanishing gradient problem, making it difficult
to capture long‑term dependencies.
2. Long Short‑Term Memory (LSTM): LSTM networks address the vanish‑
ing gradient problem by incorporating specialized memory cells that can
store and access information over long sequences.
3. Gated Recurrent Unit (GRU): Similar to LSTM, GRU also mitigates the vanishing gradient problem using gating mechanisms. It has a simplified structure with fewer parameters (a minimal LSTM usage sketch follows this list).
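As referenced in the GRU item above, a minimal LSTM usage sketch (assuming PyTorch; the batch, sequence, and hidden sizes are illustrative):

```python
# A minimal LSTM sketch: process a batch of sequences and keep hidden and
# memory-cell states that can carry information across long time spans.
import torch
import torch.nn as nn

batch, seq_len, n_features, hidden = 4, 12, 8, 16
x = torch.randn(batch, seq_len, n_features)        # a batch of input sequences

lstm = nn.LSTM(input_size=n_features, hidden_size=hidden, batch_first=True)
outputs, (h_n, c_n) = lstm(x)

print(outputs.shape)   # (4, 12, 16): one hidden vector per time step
print(h_n.shape)       # (1, 4, 16): final hidden state per sequence
print(c_n.shape)       # (1, 4, 16): final memory-cell state per sequence
```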
1.6.2 Applications of RNNs
RNNs are widely used in various applications involving sequential data.
While RNNs are powerful for handling sequential data, they also have limitations,
such as difficulty in capturing very long‑range dependencies. More advanced archi‑
tectures like Transformers have emerged to address some of these limitations and
have gained popularity in various applications as well [26].
The above‑discussed AI technologies have some ethical and legal implica‑
tions in modern day‑to‑day life decision‑making processes. These implications are
discussed below.
Determining accountability when AI systems make errors or harmful decisions can be challenging. There may be questions about whether the responsibility lies with the developers, users, or the technology itself [29]. Clear guidelines for accountability and liability are therefore essential, especially in safety‑critical applications. As AI systems become more autonomous, questions arise about who has control over their decisions and actions. Ensuring human
oversight, intervention mechanisms, and fail‑safes are crucial to prevent unintended
consequences. AI systems can be vulnerable to adversarial attacks, where malicious
actors manipulate inputs to deceive the system. Robustness testing and security
measures are necessary to safeguard AI systems from such attacks. Complex AI
systems can exhibit behaviour that was not explicitly programmed, leading to unex‑
pected outcomes. Comprehensive testing and validation procedures are essential to
identify and mitigate unintended consequences. AI technologies have the potential
to reshape society in profound ways, impacting economies, job markets, and social
norms. Long‑term ethical considerations and societal implications should guide the
development and deployment of AI technologies [30]. Ensuring that AI technologies
benefit all of humanity, regardless of geographic location or socioeconomic status,
is a significant ethical concern. Efforts should be made to bridge the digital divide
and prevent exacerbating existing inequalities. Addressing these ethical implications
requires collaboration between policymakers, researchers, industry stakeholders,
and society as a whole. Ethical frameworks, regulations, and guidelines are being
developed to ensure that AI technologies are developed and deployed in ways that
prioritize human well‑being, fairness, and accountability.
• Prudent Data Collection: Gather only the essential data required for the
intended purpose of the AI system. Restricting data acquisition mitigates
the potential for privacy breaches and unauthorized utilization [31].
• Knowledgeable Consent: Prior to collecting and utilizing personal data,
secure informed and explicit consent from individuals. Individuals should
have a comprehensive understanding of data usage and the opportunity to
opt out [32].
• Anonymization and De‑identification: Modify or eliminate personally identifiable information from data to avert direct individual identification (a small pseudonymization sketch follows this list). Notably, it’s vital to acknowledge that anonymization might not always guarantee privacy, considering potential re‑identification attacks.
• Robust Data Security: Enforce stringent security measures to shield amassed
data from unauthorized access, breaches, and cyber threats. Encryption,
secure storage, and access controls assume paramount importance.
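As flagged in the anonymization bullet above, here is a minimal pseudonymization sketch using only the Python standard library; the field names, salt handling, and truncated digest length are illustrative, and, as noted above, hashing identifiers alone does not guarantee anonymity:

```python
# A minimal pseudonymization sketch: replace direct identifiers with salted
# hashes and drop fields that are not needed (data minimization).
import hashlib
import secrets

SALT = secrets.token_hex(16)   # in practice, manage the salt/key securely

def pseudonymize(record, id_fields=("name", "email"), drop_fields=("phone",)):
    cleaned = {}
    for key, value in record.items():
        if key in drop_fields:
            continue                                   # collect only what is needed
        if key in id_fields:
            digest = hashlib.sha256((SALT + str(value)).encode()).hexdigest()
            cleaned[key] = digest[:16]                  # pseudonym instead of the raw identifier
        else:
            cleaned[key] = value
    return cleaned

patient = {"name": "A. Sample", "email": "a@example.org", "phone": "555-0100", "age": 42}
print(pseudonymize(patient))
```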
Data collection and cleaning are essential steps in data‑centric AI [34]. Advanced algorithms are
used to clean and transform data, ensuring its quality and consistency. Data‑centric
AI in finance heavily relies on ML and AI algorithms to analyse the data, discover
patterns, and make predictions. Algorithms such as regression, classification, clus‑
tering, and deep learning models are commonly used. Financial institutions utilize
data‑centric AI to assess and manage risks associated with investments, loans, and
other financial products. By analysing historical data and market trends, AI mod‑
els can identify potential risks and provide insights to improve risk management
strategies. Data‑centric AI enables financial institutions to offer personalized ser‑
vices and recommendations to their customers. By analysing customer behaviour
and preferences, AI models can suggest tailored financial products and services that
meet individual needs. AI‑powered fraud detection systems are increasingly used
in finance to identify suspicious activities and transactions [35]. These systems can
detect anomalies in real‑time and prevent fraudulent activities, protecting both cus‑
tomers and financial institutions. Data‑centric AI plays a crucial role in ensuring
regulatory compliance for financial institutions. AI models can analyse vast amounts
of data to identify any non‑compliance issues and assist in meeting reporting require‑
ments. Data‑centric AI can analyse market trends, news, and other economic factors
to inform trading decisions. This approach is commonly used in algorithmic trading
and quantitative finance [36].
Significant developments continue to take place in AI. Key emerging technologies and trends in AI are discussed in Section 1.12.
Both Edge AI and IoT applications are evolving rapidly, and their combined poten‑
tial has significant implications for various industries, ranging from consumer elec‑
tronics to healthcare, manufacturing, transportation, and beyond. The integration of
AI capabilities at the edge with IoT devices is expected to drive further innovation
and enhance the efficiency, intelligence, and capabilities of connected devices and
systems.
Quantum computing has the potential to revolutionize AI by significantly speed‑
ing up complex computations, enabling researchers to tackle more intricate AI
problems.
REFERENCES
1. Abdi, H., and Williams, L. J. Principal component analysis. Wiley Interdisciplinary
Reviews: Computational Statistics 2, 4 (2010), 433–459.
2. Ahsan, M. M., Mahmud, M. P., Saha, P. K., Gupta, K. D., and Siddiqe, Z. Effect of data
scaling methods on machine learning algorithms and model performance. Technologies
9, 3 (2021), 52
3. Agarwal, A., Dahleh, M., & Sarkar, T. A marketplace for data: An algorithmic solution.
In Proceedings of the 2019 ACM Conference on Economics and Computation (2019)
(pp. 701–726).
4. Ali, P. J. M., Faraj, R. H., Koya, E., Ali, P. J. M., and Faraj, R. H. Data normalization and
standardization: a technical report. Machine Learning Technical Reports 1, 1 (2014), 1–6.
5. Armbrust, M., Ghodsi, A., Xin, R., & Zaharia, M. (2021, January). Lakehouse: a new generation of open platforms that unify data warehousing and advanced analytics. In Proceedings of CIDR (Vol. 8).
6. Arocena, P. C., Glavic, B., Mecca, G., Miller, R. J., Papotti, P., and Santoro, D. Benchmarking data curation systems. IEEE Data Engineering Bulletin 39, 2 (2016), 47–62.
7. Aroyo, L., Lease, M., Paritosh, P., & Schaekermann, M. (2022). Data excellence for AI:
why should you care?. Interactions, 29, 2, 66–69.
8. Azhagusundari, B., Thanamani, A. S. et al. Feature selection based on information
gain. International Journal of Innovative Technology and Exploring Engineering
(IJITEE) 2, 2 (2013), 18–21.
9. Azizzadenesheli, K., Liu, A., Yang, F., & Anandkumar, A. (2019). Regularized learning for domain adaptation under label shifts. arXiv preprint arXiv:1903.09734.
10. Anand, D., and Kumar, A. IoT‑based automated healthcare system. In: 5th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India (2021).
11. Pao, L. Y., and Frei, C. W. A comparison of parallel and sequential implementations of
a multisensor multitarget tracking algorithm. In: Proceedings of the American Control
Conference, vol. 3, Seattle, WA, June (1995), pp. 1683–1687.
12. D’Mello, S., and Graesser, A. (2012). Dynamics of affective states during complex
learning. Learning and Instruction 22(2), 145–157
13. Baik, C., Jagadish, H. V., & Li, Y. (2019, April). Bridging the semantic gap with SQL
query logs in natural language interfaces to databases. In 2019 IEEE 35th International
Conference on Data Engineering (ICDE) (pp. 374–385). IEEE.
14. Barandas, M., Folgado, D., Fernandes, L., Santos, S., Abreu, M., Bota, P., Liu, H.,
Schultz, T. and Gamboa, H., 2020. TSFEL: Time series feature extraction library.
SoftwareX, 11, p.100456.
15. Basu, A., & Blanning, R. W. (1995, January). Discovering implicit integrity constraints
in rule bases using metagraphs. In Proceedings of the Twenty‑Eighth Annual Hawaii
International Conference on System Sciences (Vol. 3, pp. 321–329). IEEE.
16. Batini, C., Cappiello, C., Francalanci, C., and Maurino, A. Methodologies for data
quality assessment and improvement. ACM Computing Surveys (CSUR) 41, 3 (2009),
1–52.
17. Baylor, D., Breck, E., Cheng, H. T., Fiedel, N., Foo, C. Y., Haque, Z., ... & Zinkevich, M.
(2017, August). Tfx: A tensorflow‑based production‑scale machine learning platform.
In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining (pp. 1387–1395)
18. Bertini, E., & Lalanne, D. (2009, June). Surveying the complementary role of auto‑
matic data analysis and visualization in knowledge discovery. In: Proceedings of the
ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating
Automated Analysis with Interactive Exploration (pp. 12–20).
19. Yin, L., Lin, S., Sun, Z., Wang, S., Li, R., & He, Y. (2024). PriMonitor: An adaptive
tuning privacy-preserving approach for multimodal emotion detection. World Wide
Web 27(2), 1–28.
20. D’Mello, S., Kory, J., and Dieterle, E. Affective computing: Challenges, techniques,
and evaluation. In: S. D’Mello, R. A. Calvo, and J. Gratch (Eds.), Handbook of Affective
Computing, Oxford University Press (2015), pp. 401–414.
21. Li, Y., Chen, Z., Zha, D., Zhou, K., Jin, H., Chen, H., and Hu, X. Automated anomaly
detection via curiosity‑guided search and self‑imitation learning. IEEE Transactions on
Neural Networks and Learning Systems 33, 6 (2021), 2365–2377.
22. Li, Y., Chen, Z., Zha, D., Zhou, K., Jin, H., Chen, H., & Hu, X. (2021, April). Autood:
Neural architecture search for outlier detection. In: 2021 IEEE 37th International
Conference on Data Engineering (ICDE) (pp. 2117–2122). IEEE.
23. Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., and Neubig, G. Pre‑train, prompt, and
predict: a systematic survey of prompting methods in natural language processing.
ACM Computing Surveys 55, 9 (2023), 1–35.
24. Li, Y., Zha, D., Venugopal, P., Zou, N., and Hu, X. PyODDS: An end‑to‑end outlier
detection system with automated machine learning. In: Companion Proceedings of the
Web Conference 2020 (2020).
25. Liu, Z., Chen, S., Zhou, K., Zha, D., Huang, X., and Hu, X. RSC: Accelerating graph
neural networks training via randomized sparse computations. In: Proceedings of the
40th International Conference on Machine Learning, PMLR 202:21951‑21968, 2023.
26. Liu, Z., Wei, P., Jiang, J., Cao, W., Bian, J., and Chang, Y. MESA: boost ensemble imbal‑
anced learning with meta‑sampler. In: Advances in Neural Information Processing
Systems 33 (NeurIPS 2020) (2020).
27. Lucic, Ana, Harrie Oosterhuis, Hinda Haned and M. de Rijke. FOCUS: Flexible
Optimizable Counterfactual Explanations for Tree Ensembles. AAAI Conference on
Artificial Intelligence (2019).
28. Luo, Y., Qin, X., Tang, N., and Li, G. DeepEye: Towards automatic data visualiza‑
tion. In: 2018 IEEE 34th International Conference on Data Engineering (ICDE), IEEE
(2018), pp. 101–112.
29. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. Towards deep learning
models resistant to adversarial attacks. In: 6th International Conference on Learning
Representations, ICLR 2018 – Conference Track Proceedings (2018).
30. Marcus, Ryan, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra,
Alfons Kemper, Thomas Neumann and Tim Kraska. Benchmarking learned indexes.
In: Proceedings of the VLDB Endowment 14 (2020): 1–13.
31. Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., and Hutter, F. Efficient and robust automated machine learning. Advances in Neural Information Processing Systems, 28 (2015).
32. Moosavi‑Dezfooli, S., Fawzi, A., & Frossard, P. DeepFool: A Simple and Accurate
Method to Fool Deep Neural Networks. 2016 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), 2574–2582. (2016).
33. Mazumder, M., Banbury, C.R., Yao, X., Karlavs, B., Rojas, W.G., Diamos, S., Diamos, G.F.,
He, L., Kiela, D., Jurado, D., Kanter, D., Mosquera, R., Ciro, J., Aroyo, L., Acun, B., Eyuboglu,
S., Ghorbani, A., Goodman, E.D., Kane, T., Kirkpatrick, C.R., Kuo, T., Mueller, J.W., Thrush,
T., Vanschoren, J., Warren, M.J., Williams, A., Yeung, S., Ardalani, N., Paritosh, P.K., Zhang,
C., Zou, J.Y., Wu, C., Coleman, C., Ng, A.Y., Mattson, P., & Reddi, V.J. DataPerf: Benchmarks
for Data‑Centric AI Development. ArXiv, abs/2207.10062. (2022).
34. Meduri, V., Popa, L., Sen, P., & Sarwat, M. (2020). A Comprehensive Benchmark
Framework for Active Learning Methods in Entity Matching. Proceedings of the 2020
ACM SIGMOD International Conference on Management of Data. (2020).
35. Mehrabi, N., Morstatter, F., Saxena, N.A., Lerman, K., & Galstyan, A.G. A Survey
on Bias and Fairness in Machine Learning. ACM Computing Surveys (CSUR), 54,
pp. 1– 35. (2019)
36. Meng, C., Trinh, L., Xu, N., Enouen, J., & Liu, Y. Interpretability and fairness evalua‑
tion of deep learning models on MIMIC‑IV dataset. Scientific Reports, 12. (2022).
37. Milutinovic, M., Schoenfeld, B., Martinez‑Garcia, D., Ray, S., Shah, S., and Yan, D. On
evaluation of autoML systems In: Computer Science (2020).
38. Mintz, M.D., Bills, S., Snow, R., & Jurafsky, D. Distant supervision for relation extrac‑
tion without labeled data. Annual Meeting of the Association for Computational
Linguistics. (2009).
39. Miotto, R., Wang, F., Wang, S., Jiang, X., and Dudley, J. T. Deep learning for healthcare:
review, opportunities and challenges. Briefings in Bioinformatics 19, 6 (2018).
40. Miranda, L. J. Towards data‑centric machine learning: a short review. Communications
of the ACM, 66, 8, pp. 84–92 10.1145/3571724 (2023)
41. Mirdita, M., Von Den Driesch, L., Galiez, C., Martin, M. J., Söding, J., and Steinegger,
M. Uniclust databases of clustered and deeply annotated protein sequences and align‑
ments. Nucleic Acids Research 45, D1 (2017), D170–D176.
42. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D. &
Riedmiller, M. Playing atari with deep reinforcement learning. arXiv preprint
arXiv:1312.5602. (2013).
43. Nanni, L., Paci, M., Brahnam, S., and Lumini, A. Comparison of different image data
augmentation approaches. Journal of Imaging 7, 12 (2021).
44. Nargesian, F., Zhu, E., Pu, K. Q., and Miller, R. J. Table union search on open
data. In: Proceedings of the VLDB Endowment 11, 7 813–825 (2018). https://doi.
org/10.14778/3192965.3192973
45. Kim, J. S., Jin, H. et al., Location‑based social network data generation based on pat‑
terns of life. In: 2020 21st IEEE International Conference on Mobile Data Management
(MDM), IEEE (2020), pp. 158–167.
46. Kumar, A., Dabas, V., and Hooda, P., 2020. Collaboration Big Data. Internet of data: a
SWOT analysis. International Journal of Information Technology 12 (2020) 1159–1169.
47. Van Aken, D., Pavlo, A., Gordon, G. J., and Zhang, B. Automatic database management
system tuning through large‑scale machine learning. In SIGMOD ‘17: Proceedings
of the 2017 ACM International Conference on Management of Data May (2017)
1009–1024 https://doi.org/10.1145/3035918.3064029 (2017).
48. Venkatasubramanian, S., and Alfano, M., 2020. The philosophical basis of algorithmic
recourse. In Proceedings of the 2020 conference on fairness, accountability, and trans‑
parency, pp. 284–293.
49. Zha, D., Lai, K.‑H., Tan, Q., Ding, S., Zou, N., and Hu, X. B. Towards automated
imbalanced learning with deep hierarchical reinforcement learning. In Proceedings of
the 31st ACM International Conference on Information & Knowledge Management.
2476–2485. (2022).
50. Pedrozo, W. G., Nievola, J. C., and Ribeiro, D. C. An adaptive approach for index tuning
with learning classifier systems on hybrid storage environments, In: Hybrid Artificial
Intelligent Systems: 13th International Conference, HAIS 2018, Proceedings, vol. 13,
Oviedo, Spain, June 20–22, 2018, Springer (2018), pp. 716–729.
51. Bodenheimer, T., Sinsky, C., and Froman, R. Improving Primary Care Access and
Continuity: A Framework and Quality Improvement Toolkit for Achieving the Institute
for Healthcare Improvement’s Triple Aim. Agency for Healthcare Research and Quality
(US) (2018).
52. Woolf, B. P. Building Intelligent Interactive Tutors: Student‑Centered Strategies for
Revolutionizing E‑Learning. Morgan Kaufmann (2010).
53. Lee, Y., Im, D., and Shim, J., 2019. Data labeling research for deep learning based fire
detection system. In: 2019 International Conference on Systems of Collaboration Big
Data. Internet of Things and Security (SysCoBIoTS), IEEE (2021), pp. 1–4.
2 Emerging Development
and Challenges in
Data‑Centric AI
Chaitali Shewale
2.1 INTRODUCTION
Data are crucial for training models, assessing performance, and generating predic‑
tions in artificial intelligence (AI). The concept that the quality and quantity of data
have a direct impact on the efficacy of AI models is at the heart of data‑centric AI
(DCAI). This chapter aims to shed light on how the AI community is developing to
adapt to the changing landscape of data by examining the new trends and difficulties
in DCAI. In this fast‑growing world, AI has rapidly automated much of human work. The core of this transformation lies in the paradigm shift toward DCAI, where the unprecedented growth in the availability of data is both a challenge and an advantage. With this huge amount of data, mechanisms are needed to organize and distribute it so that it can be used more efficiently. The concept of DCAI has a direct effect on data arrangement and data distribution [1]. We will
delve into the techniques and strategies employed in data preprocessing, augmenta‑
tion, and feature engineering that contribute to the enhancement of data quality and
subsequent efficacy of models.
Applications for DCAI are prevalent across a range of industries, highlighting
the critical role that data plays in forming and improving AI models. The use of
large medical imaging data by AI models for the accurate detection of diseases
like cancer and the analysis of electronic health records to identify patterns and
provide individualized therapies are two notable examples. By anticipating main‑
tenance requirements and minimizing downtime, predictive maintenance, based
on sensor data analysis, optimizes operations. Image processing and data analysis
help quality control find defects in manufactured goods and guarantee excellent
quality. AI models analyze transaction history to find anomalous tendencies sug‑
gestive of probable fraud in financial transactions. Energy consumption optimiza‑
tion increases the effectiveness of energy distribution and lowers costs, whereas
traffic management enhances the efficiency of transportation networks and traffic
flow through data analysis. Through the use of multilingual text data, AI models are
trained to improve the accuracy of language translation systems. Sentiment analy‑
sis uses a massive amount of text data from reviews, surveys, and social media to
evaluate public opinion on a variety of topics. Analyzing historical soil, crop, and
meteorological data helps predict crop yields, increasing farm production. Real‑time
decision‑making and obstacle identification are made possible for autonomous cars
through image and sensor processing. By making recommendations for goods or
services based on user preferences, recommender systems increase consumer happi‑
ness and revenue. In order to keep the right amount of inventory on hand, inventory
optimization analyzes sales data and demand forecasts. In finance, risk assessment is
evaluating and predicting investment risks by examining market and economic data.
These fields have undergone a revolution, thanks to DCAI, which emphasizes how
crucial it is to have high‑quality, diversified data for AI applications. These illustra‑
tions highlight how DCAI applications use a wealth of data to build models, forecast
the future, and streamline processes in a variety of industries, ultimately enhancing
productivity, accuracy, and judgment.
1. AI algorithms that comprehend data and enhance models with that knowl‑
edge. This is seen in curriculum learning, where ML models are initially
trained on “easy data”.
2. Data‑modifying AI algorithms that enhance AI models. This is exemplified
by confident learning, in which ML models are trained on a filtered dataset
with incorrectly identified data eliminated [4].
Both of the aforementioned examples use algorithms to analyze the outputs of trained
ML models to automatically estimate which data is simple or incorrectly categorized.
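As a rough illustration of the second idea (confident learning), the following sketch (assuming scikit‑learn; the threshold rule is a deliberate simplification of the full method in [4]) uses out‑of‑fold predicted probabilities to flag and filter examples whose given labels look unlikely before retraining:

```python
# A simplified sketch of confident-learning-style filtering: use out-of-fold
# predicted probabilities to flag examples whose given label looks unlikely,
# then retrain on the filtered dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=500, n_classes=2, random_state=0)
y_noisy = y.copy()
flip = np.random.default_rng(0).choice(len(y), size=25, replace=False)
y_noisy[flip] = 1 - y_noisy[flip]                      # simulate mislabeled examples

probs = cross_val_predict(LogisticRegression(max_iter=1000), X, y_noisy,
                          cv=5, method="predict_proba")

# Per-class threshold: average confidence the model assigns to a class when it
# is the given label; examples far below it are suspected label errors.
thresholds = np.array([probs[y_noisy == c, c].mean() for c in (0, 1)])
suspect = probs[np.arange(len(y_noisy)), y_noisy] < thresholds[y_noisy] - 0.25

print("Flagged as possibly mislabeled:", suspect.sum())
clean_model = LogisticRegression(max_iter=1000).fit(X[~suspect], y_noisy[~suspect])
```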
The foundation of DCAI is the collection of diverse, high‑quality data [5]. The popularity of methods like active learning and federated learning has made it possible to collect data from many sources effectively [6]. Additionally, data labeling and annotation are essential to supervised learning. The process of annotating data is being revolutionized by improvements in methodologies like weak supervision and semi‑supervised learning, which make the procedure more effective and scalable.
Cleaning, standardization, and feature scaling are examples of preprocessing techniques used to get data ready for AI models. By broadening the dataset, data augmentation methods including rotation, translation, and generative approaches improve model performance. In order for the AI model to extract reliable and generalized patterns from the data, preprocessing and augmentation are necessary.
The privacy of user data is a major concern in DCAI. Data privacy must be protected throughout the AI lifecycle due to rising data breaches and privacy laws. Emerging technologies like differential privacy, federated learning and homomorphic encryption enable AI progress while protecting individual privacy.
As a result of its use in modifying AI models for certain applications, domain‑specific data is becoming more and more important. A potent method to use domain‑specific data is transfer learning, a technique where models learned on one task are adapted for another related one. Pre‑trained models can be fine‑tuned to drastically cut down on training time and data requirements, opening up AI to more specialized sectors.
Biases found in training data can make AI models continue to be unfair and discriminatory. A persistent problem in DCAI is addressing biases in the data and assuring fairness. To reduce biases and encourage justice in AI applications, strategies like adversarial debiasing and fairness restrictions are being investigated.
Despite improvements, there are still problems in the field of DCAI. Managing biases and ensuring privacy, along with the rising demand for large‑scale, high‑quality data, provide significant hurdles. To meet the changing demands of DCAI, future initiatives include establishing standardized tools for data management, producing synthetic data for improved generalization, and encouraging interdisciplinary collaboration.
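As a small illustration of the augmentation methods mentioned above (assuming PIL and torchvision are available; the rotation and translation ranges are illustrative, not prescribed by the text):

```python
# A minimal data-augmentation sketch: broaden an image dataset by applying
# random rotations and translations, so the model sees more varied examples.
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),                       # small random rotation
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),    # random translation
    transforms.ToTensor(),
])

image = Image.new("RGB", (64, 64), color=(120, 180, 90))         # placeholder image
augmented_views = [augment(image) for _ in range(5)]             # 5 augmented variants
print(len(augmented_views), augmented_views[0].shape)            # 5, (3, 64, 64)
```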
Researchers have defined DCAI in several ways, most commonly as “the discipline of systematically engineering the data used to build an AI system” [7]. As shown in Figure 2.1, the prime approach in DCAI is to take data to center stage, or move the data to a centralized unit, making it accessible at every stage of the AI development lifecycle.
DCAI is based on the objective of providing the best dataset to feed a given ML
model, while model‑centric AI is based on the goal of producing the best model for
a given dataset [8]. One should do both in order to implement the best‑supervised
learning systems in practice. A DCAI includes the following activities:
1. Investigate the data, correct any major problems, and then change it so that
it is ML‑appropriate.
2. Utilizing the correctly prepared dataset, train a foundational ML model.
3. Make use of this model to help you enhance the dataset using the methods covered in this chapter.
4. To obtain the optimal model, experiment with various modeling strategies on the improved dataset.
2.5 MODEL‑CENTRIC AI
In AI, a “model” is something that is referred to as a mathematical or computational
representation that understands behavior within data by capturing various patterns
and relationships. Models are the core of many AI techniques such as ML and deep
learning [12].
In model‑centric AI, the main approach is experimental: iterating on the model to improve ML performance. This involves selecting, from a wide range of possibilities, the model architecture and training process best suited to a given dataset.
TABLE 2.1
Comparison between Model‑Centric and Data‑Centric AI
• System Development Process. Model‑Centric AI: progressive upgrade of the algorithm and code, with a constant volume and fixed type of data. Data‑Centric AI: enhancing data quality consistently while keeping the model and hyperparameters constant.
• Performance. Model‑Centric AI: excels primarily when dealing with substantial or extensive datasets. Data‑Centric AI: shows optimal performance when working with a modest or compact dataset.
• Robustness. Model‑Centric AI: susceptible to adversarial samples. Data‑Centric AI: higher adversarial robustness.
• Applicability. Model‑Centric AI: selecting the right hyperparameters is critical for achieving optimal model performance. Data‑Centric AI: often involves using a variety of ML algorithms and models to extract insights from data.
• Generalization. Model‑Centric AI: balanced model complexity is essential; an overly complex model may fit the data perfectly but fail to generalize to new data. Data‑Centric AI: generalization starts with the collection of diverse and representative data from the real world.
1. Focus:
In DCAI, the primary emphasis is on keeping the model architecture
constant while striving to enhance performance by improving the quality
and quantity of the data used for training.
On the other hand, model‑centric revolves around refining the model’s
design and structure to achieve better performance, with the dataset being
relatively fixed.
2. Data Work and Domain Knowledge:
DCAI demands a deep understanding of both the specific model archi‑
tecture being employed and the domain of the problem at hand. This
includes a comprehensive grasp of the underlying data. The process involves
domain‑specific data manipulation and analysis. Moreover, the develop‑
ment of techniques and tools that partially automate tasks contributes to the
creation of effective AI systems.
Model‑centric AI may not require as detailed a domain understanding,
as the primary efforts are directed toward optimizing the model’s structure.
3. Data Quality Understanding:
In DCAI, performance improvements come from modifying the underlying data. Consequently, shifts in the metrics used to measure the effectiveness of ML models also reflect the impact of data adjustments. This offers a fresh perspective on gauging data quality, approximated by changes in ML metrics.
• Data Corruption
• Data Imbalance
• Continuous data collection and updating
• Data Fusion and Integration
• Data privacy and security
• Data bias and fairness
2.7.2 Data Corruption
If data is incorrect or corrupted, it will produce wrong outputs and degrade the behavior of the system, so data correctness is essential. Correcting data at huge scale is a big challenge, because the sources from which data is obtained are not always reliable, and strong methods are needed to make the process succeed.
Data is a fundamental component of AI systems, and its quality directly impacts the performance and reliability of these models. Data corruption can occur at various stages of the AI lifecycle and can stem from several causes.
2.7.3 Data Imbalance
In real‑world scenarios, data may be unevenly distributed across classes or categories: the class distribution in a dataset is significantly skewed, with some classes having far fewer instances than others. This imbalance poses challenges when training and evaluating ML models, as the models may become biased toward the majority class and perform poorly on minority classes.
• Sensor Data Fusion: Combining data from multiple sensors to improve the accuracy and reliability of measurements. This is often used in fields like robotics, autonomous vehicles, and military applications.
• Feature‑Level Fusion: Combining different features or attributes of data from various sources, for example, combining text data with image data in natural language processing and computer vision applications (a small feature‑level fusion sketch follows this list).
• Decision‑Level Fusion: Combining the decisions or outputs of multiple algorithms or models to make a final decision. This is common in ensemble learning, where multiple models are combined to improve overall performance.
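As referenced in the feature‑level fusion bullet above, the following minimal sketch simply concatenates two feature blocks standing in for text and image embeddings; the array shapes are illustrative:

```python
# A minimal feature-level fusion sketch: concatenate feature vectors coming
# from different modalities into one representation for a downstream model.
import numpy as np

n_samples = 100
text_features = np.random.rand(n_samples, 50)    # e.g. text embeddings
image_features = np.random.rand(n_samples, 128)  # e.g. image embeddings

fused = np.concatenate([text_features, image_features], axis=1)
print(fused.shape)   # (100, 178): one fused feature vector per sample
```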
• Key Principles
• Consent: Individuals should give informed consent for the collection and use of their data.
• Data Minimization: Collect only the data that is needed and avoid accessing more data than the current purpose requires.
• Transparency: Data handling should be made clear to the user; nothing about how data is collected and used should be hidden.
• Fairness: Fairness in ML refers to the goal of ensuring that models and algorithms treat all individuals equitably and do not discriminate against any particular demographic or social group. Achieving fairness is essential to prevent the reinforcement of existing biases and to promote ethical and equitable outcomes.
• Individual Fairness: This principle aims to treat similar individuals similarly. In other words, if two individuals are similar with respect to the task at hand, the model's predictions for them should also be similar.
• Group Fairness: It focuses on ensuring that predictions are fair at a group level, particularly concerning sensitive attributes like race, gender, or ethnicity.
This involves avoiding disparate impact, where certain groups are disproportionately
affected by a model’s predictions.
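As a hedged illustration of checking group‑level fairness, the short sketch below computes positive‑prediction rates per group and their ratio (a common disparate‑impact heuristic); the prediction and group arrays are purely illustrative:

```python
# A minimal group-fairness check: compare positive-prediction rates across a
# sensitive attribute; a disparate-impact ratio far below 1 signals imbalance.
import numpy as np

predictions = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])   # model outputs (1 = positive)
group       = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])   # sensitive attribute (two groups)

rate_g0 = predictions[group == 0].mean()
rate_g1 = predictions[group == 1].mean()
disparate_impact = min(rate_g0, rate_g1) / max(rate_g0, rate_g1)

print(f"Positive rate group 0: {rate_g0:.2f}, group 1: {rate_g1:.2f}")
print(f"Disparate impact ratio: {disparate_impact:.2f}")  # the 0.8 rule is a common heuristic
```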
Data bias and fairness are ongoing concerns in the field of ML, and addressing
these issues is crucial for building trustworthy and ethical AI systems that benefit
society as a whole.
Design a scalable data architecture that matches the needs of your AI system. Use proper database systems and data storage solutions to manage and fetch data easily and efficiently. Automated data pipelines can handle cleaning, transforming, and integrating data [5] and check data accuracy before it is fed into AI models; an automated data validation process can identify inconsistencies in incoming data. Data recovery mechanisms such as backups and versioning make it possible to restore clean data if it becomes corrupted. Techniques like the Synthetic Minority Oversampling Technique (SMOTE) can balance the class distribution in training data. Real‑time data ingestion pipelines collect and process data as it arrives, ensuring that the AI system works with up‑to‑date data. Storing data on a blockchain and controlling access to it can address security concerns: the data is divided into small, structured, distributed chunks, each chunk forms a block of the chain, and only authenticated users can access it.
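As a minimal sketch of the SMOTE step mentioned above (assuming the scikit‑learn and imbalanced‑learn packages are installed; the synthetic dataset is illustrative):

```python
# A minimal SMOTE sketch: oversample the minority class so the training set
# has a balanced class distribution before model fitting.
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
print("Before:", Counter(y))           # heavily skewed toward the majority class

X_balanced, y_balanced = SMOTE(random_state=0).fit_resample(X, y)
print("After: ", Counter(y_balanced))  # minority class oversampled to parity
```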
2.8 CONCLUSION
This chapter has examined model‑centric AI and DCAI and highlighted important factors that have a major impact on the efficiency and dependability of AI models. When developing and implementing AI systems, the problems outlined in the data‑centric approach must be taken into account. The distinctions between model‑centric AI
and DCAI were emphasized, highlighting the significance of creating a scalable
and improved data architecture for successful AI implementation. Important tactics
include using appropriate database systems, applying data validation procedures, and
using methods like SMOTE for data balance. In addition, utilizing blockchain for
distributed, secure data storage is a viable answer to the problems with data security
and imbalance. To design AI systems that are dependable, precise, and unbiased and
eventually contribute to a more robust and fairer AI‑driven future, it is crucial to
acknowledge and effectively address these problems in DCAI.
REFERENCES
1. Zha, Daochen, Bhat, Zaid Pervaiz, Lai, Kwei‑Herng, Hu, Xia et al. (2023). Data‑centric
AI: Perspectives and challenges. doi:10.48550/arXiv.2301.04819.
2. Mingyang, Wan, Zha, Daochen, Liu, Ninghao, and Zou, Na. (2023). In‑processing
modeling techniques for machine learning fairness: A survey. ACM Transactions on
Knowledge Discovery from Data 17, 3, 1–27
3. Press, G. (2021). Andrew Ng launches a campaign for data‑centric AI. Forbes.
4. Northcutt, C., Jiang, L., and Chuang, I. L. (2021) Confident learning: Estimating uncer‑
tainty in dataset labels. Journal of Artificial Intelligence Research, 70, 1373–1411
5. Patel, Hima, Guttula, Shanmukha, Gupta, Nitin, Hans, Sandeep, Mittal, Ruhi, and
Lokesh, N. (2023). A data centric AI framework for automating exploratory data analysis
and data quality tasks. Journal of Data and Information Quality. doi:10.1145/3603709.
6. Zhang, Huaizheng, Huang, Yizheng, and Li, Yuanming. (2023). DataCI: A platform for
data‑centric AI on streaming data.
7. Verdecchia, Roberto, Cruz, Luís, Sallou, June, Lin, Michelle, Wickenden, James,
and Hotellier, Estelle. (2022). Data‑centric green AI: An exploratory empirical study.
doi:10.1109/ICT4S55073.2022.00015.
8. Khang P.H, Alex, Gujrati, Rashmi, Rani, Sita, Uygun, Hayri, and Gupta, Dr‑Shashi.
(2023). Designing workforce management systems for industry 4.0: Data‑centric and
AI‑enabled approaches. doi:10.1201/9781003357070.
9. Chiang, T. (2023). ChatGPT is a blurry JPEG of the Web, February, 2023, https://www.
newyorker.com
10. Lee, Youngjune, Kwon, Oh Joon, Lee, Haeju, Kim, Joonyoung, Lee, Kangwook and
Kim, Kee‑Eung. (2021). Augment, valuate: A data enhancement pipeline for data‑Cen‑
tric AI. In: NeurIPS Data‑Centric AI Workshop. arXiv preprint arXiv:2112.03837.
11. Redman, T. (2016). Bad data costs the U.S. $3 trillion per year. Harvard Business
Review, 22, 11–18.
12. Jarrahi, Mohammad, Memariani, Ali, and Guha, Shion. (2022). The principles of
data‑centric AI (DCAI). doi:10.48550/arXiv.2211.14611.
3 Unleashing the Power
of Industry 4.0
A Harmonious Blend
of Data‑Centric and
Model‑Centric AI
Manivannan Karunakaran, Batri Krishnan,
D. Shanthi, J. Benadict Raja,
and B. Sakthi Karthi Durai
3.1 INTRODUCTION
Industry 4.0, often referred to as the fourth industrial revolution, has ignited a trans‑
formative wave of automation, leading to the emergence of novel industrial appli‑
cations that have the potential to reshape how humans interact with machines [1].
In response to this revolution, businesses are progressively moving away from con‑
ventional automation approaches, which heavily depend on the expertise of human
developers and the use of application programming interfaces for service platforms
[2]. The fourth industrial revolution, also known as Industry 4.0, has brought about
a transformation in engineering systems, seamlessly integrating sensing capabilities,
computational power, control, and networking into cyber‑physical objects. These
objects are interconnected through the Internet of Things (IoTs) [3].
Industry 4.0 is distinguished by its reliance on data, which plays a prominent role
across various technologies [4]. Cloud computing facilitates the effective storage,
analysis, and processing of vast amounts of data in the cloud [5]. Edge computing, on
the other hand, minimizes latency in real‑time production operations by efficiently
analyzing data near the sensors [6]. Digital twins play a crucial role in simulating var‑
ious systems’ processes in virtual environments, utilizing data from IoT sensors and
interconnected objects [7]. Artificial intelligence (AI) and machine learning (ML)
are crucial components in handling the vast data volumes of Industry 4.0 [8]. AI
systems, mainly algorithms (code), learn prototypical features from extensive data to
solve problems across different formats like text, audio, image, and video. ML, a sub‑
set of AI, enables AI systems to detect imperceptible patterns using general‑purpose
procedures, solving problems without explicit programming [9].
Data preparation involves human experts labeling and curating data for con‑
text and interpretation, known as “data annotation.” In the context of AI and ML,
3.1.1 Contribution
This chapter significantly extends prior research [19]. We employ a comparative
analysis methodology to contrast data‑centric AI and model‑centric AI (Table 3.1).
The analysis draws from Andrew Ng’s live stream presentation on 24 March 2021 [20].
This chapter also considers the growing number of researchers who support the tran‑
sition from model‑centric AI to data‑centric AI. The main thesis of this chapter is that
while model‑centric AI may have its limitations in terms of performance, embracing a
collaborative approach that combines both data‑centric and model‑centric methodolo‑
gies would lead to more substantial advancements in current AI technology compared
to focusing solely on improving datasets, despite the critical importance of dataset
enhancement. The key contributions of this chapter can be summarized as follows:
In Section 3.2, we present a succinct review and comprehensive discussion of the
deep learning (DL) technique and its pivotal role in propelling current AI advance‑
ments. Sections 3.4 and 3.5 establish crucial connections between current AI, cyber
security, and natural language inference (NLI). We emphasize the drawbacks of
model‑centric AI, particularly regarding its algorithmic stability and robustness.
We specifically point out examples like adversarial samples and hypothesis‑only
biases to illustrate the difficulties faced in this approach. In Section 3.6, we delve
into the motivation for adopting the data‑centric AI approach, particularly empha‑
sizing the significant impact of the IoTs’ continuous expansion, supported by the
latest relevant data. Finally, in Section 3.7, we reconcile the data‑centric AI perspec‑
tive with model‑centric AI, presenting additional arguments in favor of the “both/
and” approach over the less optimal “either/or” stance. By exploring the benefits of
combining both methodologies, we shed light on the synergistic potential of this col‑
laborative approach.
and the logistic function f(z) = 1/(1 + exp(−z)) [3]. These activation functions intro‑
duce nonlinearity to the neural network, enabling it to model complex relationships
and learn from data more effectively.
In contrast to traditional ANNs, DL networks acquire data representations at vari‑
ous levels of abstraction. The achievement of modeling complex relationships in neu‑
ral networks is made possible by employing a significantly higher number of layers.
These layers consist of simple, nonlinear modules known as neurons, which play a
pivotal role in transforming the internal representation of specific input data aspects
from one layer to a higher‑level internal representation. As data passes through mul‑
tiple layers, the neural network learns to abstract and represent increasingly com‑
plex features, allowing it to capture intricate patterns and relationships in the data.
This hierarchical approach to representation learning enables deep neural networks
to tackle challenging tasks and achieve remarkable performance in various appli‑
cations. The backpropagation algorithm fine‑tunes the neural network’s weights by
working back down through the layers, adjusting each weight proportionally based
on its contribution to the overall error. DL networks are characterized by their sig‑
nificant depth, typically ranging from 5 to 20 layers, which is what earned them the
name “deep” networks. However, in modern commercial applications, neural net‑
work models frequently employ over 100 layers, emphasizing their ability to scale
too much larger depths. This substantial depth enables DL models to learn hierarchi‑
cal representations and abstract complex features from raw input data, allowing them
to handle intricate tasks effectively. In contrast to classical ML techniques, which
rely on human experts to carefully engineer features and extract relevant aspects of
input data, DL models possess the ability to autonomously learn these representa‑
tions directly from the data. This process, known as feature learning or represen‑
tation learning, allows DL models to operate without explicit human intervention
in the feature extraction step. This characteristic is particularly advantageous as it
reduces the burden of manual feature engineering and enables DL models to adapt
to a wide range of tasks and datasets, making them highly flexible and powerful
tools in various applications. DL algorithms implicitly learn features from data using
general‑purpose procedures. The exceptional capability of DL has drawn compari‑
sons to the problem‑solving approach of the human brain. Additionally, a significant
finding in the field is that ANNs, with a non‑polynomial activation function, pos‑
sess the ability to approximate any continuous function with arbitrary accuracy. This
finding, known as the universal approximation theorem, mathematically establishes
the equivalence of ANNs to universal computers. In essence, this theorem confirms
that DL models can represent and approximate a wide range of complex functions,
making them highly versatile and powerful tools for solving diverse problems. This
unique combination of human brain‑like problem‑solving and mathematical prowess
has contributed to the widespread success and adoption of DL in numerous domains
and applications.
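As a purely illustrative sketch of these ideas, the following PyTorch snippet stacks several simple nonlinear layers and performs one backpropagation step; the layer sizes and data are placeholders rather than a recommended architecture:

```python
import torch
import torch.nn as nn

# Each hidden layer applies a linear map followed by a nonlinearity,
# transforming the representation of the input into a higher-level one.
model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 2),                 # two output classes
)

x = torch.randn(16, 32)               # a mini-batch of 16 examples with 32 raw features
y = torch.randint(0, 2, (16,))        # illustrative class labels

loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()                       # backpropagation assigns each weight a gradient
                                      # proportional to its contribution to the error
torch.optim.SGD(model.parameters(), lr=0.01).step()   # weight update
```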
The evolution of DL into its current state and the fulfillment of many early aspi‑
rations of AI researchers can be attributed to several significant factors. A critical
aspect is DL’s ability to excel when presented with abundant and diverse data from
various sources in the Global Data sphere. This extensive data sphere encompasses
textual, visual, and acoustic data and comprises a wide range of sources, including
FIGURE 3.3 Projected annual data creation, consumption, and storage according to IDC.
in datasets with biases. While the model performs well on such datasets, it lacks
generalization to new data without similar biases, demonstrating its low generaliza‑
tion capacity.
Prior to the widespread recognition of “data‑centric AI,” AI researchers and ML
practitioners devoted significant efforts to curating datasets for training ML models.
While some data instances were discovered to be invalid due to mislabeling, ambi‑
guity, or irrelevance, their influence on model performance was often deemed insig‑
nificant. However, businesses and industries have now come to realize the crucial
significance of prioritizing high‑quality datasets throughout the entire development
process of AI systems.
The foremost challenge lies in the scarcity of sufficiently large and diverse datasets.
Unlike internet companies, manufacturing industries often have limited data, with
training datasets containing only thousands of relevant data points. Consequently,
ML models built on such limited data struggle to perform effectively when com‑
pared to models trained on massive datasets. In industries like manufacturing, where
a variety of products are produced, a one‑size‑fits‑all AI system for fault detection
may not be sufficient. Each product demands its own uniquely trained ML system to
ensure effective performance and accurate fault detection.
on goal‑relevant aspects while filtering out distracting and noisy information. The
success of AI models heavily relies on the quality of the data used for their train‑
ing. In the data‑centric AI paradigm, the focus shifts from merely fine‑tuning the
model’s architecture and hyperparameters to prioritizing the acquisition of sufficient
and representative data inputs. This chapter aims to explore the significance of ade‑
quate and representative data inputs for data‑centric AI and how they contribute to
achieving peak performance and robustness in AI systems. Data is the lifeblood of
AI systems, serving as the foundation upon which models learn to make decisions
and predictions. In data‑centric AI, the emphasis is on obtaining datasets that not
only contain a large volume of data but also encompass a diverse range of examples
that reflect real‑world scenarios. Adequate data inputs are crucial to ensure that AI
models have enough information to learn complex patterns and relationships within
the data, making them capable of solving the specific task at hand. The sufficiency
of data inputs plays a pivotal role in the effectiveness of data‑centric AI. Insufficient
data can lead to underfitting, where the model fails to capture the underlying patterns
in the data, resulting in poor performance. To ensure data sufficiency, researchers
and practitioners need to carefully curate datasets that encompass a broad spectrum
of instances relevant to the task. This includes not only positive examples but also
negative and ambiguous instances, providing the model with a comprehensive under‑
standing of the problem space. In addition to sufficiency, the data inputs must also
be representative of the real‑world scenarios that the AI system will encounter. A
dataset that is biased or lacks diversity can lead to a model that performs well on the
training data but fails to generalize to unseen data from different distributions. To
address this, researchers must be vigilant in avoiding bias during data collection and
ensure that the dataset accurately reflects the target population.
Curating datasets with adequate and representative data inputs can be challeng‑
ing, especially in domains where data is scarce or subject to privacy constraints.
However, there are several strategies to overcome these challenges. Collaborating
with domain experts and stakeholders can help in identifying critical data attributes
and real‑world use cases. Data augmentation techniques can also be employed to
increase the diversity of the dataset, enabling the model to learn from a wider range
of instances. Adequate and representative data inputs are fundamental to the success
of data‑centric AI. By prioritizing the quality of data during the AI model develop‑
ment process, researchers and practitioners can create more robust and effective AI
systems capable of generalizing to real‑world scenarios. As the AI field continues to
advance, the importance of data‑centric AI will only grow, making it imperative for
data scientists and AI engineers to focus on obtaining high‑quality data inputs for
their models.
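As one hedged illustration of such augmentation, the sketch below assumes an image dataset and the availability of torchvision; each transform generates plausible variations of an input and thereby widens the range of instances the model sees:

```python
from torchvision import transforms

# Illustrative augmentation pipeline; the parameter values are placeholders.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

# augmented = augment(pil_image)  # applied to a PIL image loaded elsewhere
```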
Another key step in implementing a data‑centric approach is ensuring high‑quality
data during data preparation. Research teams must be cautious of potential biases
introduced during data labeling. To address this, textual descriptions can be incor‑
porated as an intermediate step between data inputs and label assignments. These
descriptions, consisting of 3–10‑word sentences, provide contextual information
reflecting human perspectives. Although this approach may extend data creation time,
it proves beneficial for AI engineers as it ensures the collected data vividly captures
the essential concepts required for effective learning by AI systems. Consequently,
AI systems can efficiently learn from smaller datasets, a common scenario in vari‑
ous industries. The third step entails the continuous engagement of both AI‑ and
business‑domain experts. Domain experts should take charge of data engineering
as they possess in‑depth knowledge of specific business use cases, enabling them to
provide domain‑specific representations of the real world. Their involvement in the
evaluation process can enhance it significantly by designing domain‑sensitive tests
for the AI model, making AI more applicable and accessible across various indus‑
tries. The fourth step involves the implementation of MLOps (Machine Learning
Operations) platforms. By leveraging these platforms, research teams can reduce the
time and effort spent on software development, leading to a decrease in the main‑
tenance cost of AI applications. MLOps platforms offer essential software scaffold‑
ing for the production of AI systems, considerably reducing the time from proof of
concept to production, transforming the timeline from years to mere weeks. These
platforms encompass a range of MLOps tools that cater to both data‑centric and
model‑centric AI, including data labeling, data cleaning, model storage, continuous
integration, training, and deployment tools. Their adoption streamlines the develop‑
ment and deployment process, fostering efficient and scalable AI applications.
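A minimal sketch of how the intermediate textual descriptions mentioned above might be attached to each record is shown below; the field names and values are hypothetical and serve only to illustrate the idea:

```python
from dataclasses import dataclass

@dataclass
class LabeledExample:
    """One training record; all field names are illustrative."""
    input_path: str    # pointer to the raw data item (image, signal, text, ...)
    description: str   # short 3-10 word sentence giving human context
    label: str         # label assigned after reviewing the description

example = LabeledExample(
    input_path="images/pump_0042.png",
    description="visible hairline crack near the upper weld seam",
    label="defective",
)
```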
TABLE 3.1
Distinguishing Traits of Model‑Centric AI and Data‑Centric AI
Category | Model‑Centric AI | Data‑Centric AI
System development lifecycle | The iterative enhancement of a model (algorithm/code) using a fixed volume and type of data | Consistent enhancement of the data quality with unchanging model hyperparameters
Performance | Demonstrates high performance primarily with extensive datasets | Demonstrates strong performance even with smaller datasets
Robustness | Vulnerable to adversarial samples | Exhibits higher resilience against adversarial samples
Applicability | Suitable for evaluating algorithmic solutions in applications with specific and limited tasks | Especially well‑suited for real‑world scenarios and applications
Generalization | Limited ability to generalize across datasets due to a lack of contextual understanding | Likely to achieve good generalization across various datasets beyond the ones used for testing
during propagation learning runs, nor are they caused by overfitting or incomplete
model training. Instead, they demonstrate resilience to random noise and can trans‑
fer between neural network models, even when these models differ in the number of
layers, hyperparameters, and training data. This remarkable characteristic empha‑
sizes the robustness challenges faced by AI systems and underscores the impor‑
tance of considering both data‑centric and model‑centric approaches to enhance
their performance and security. This suggests that the robustness of a deep neural
network model based on backpropagation is not solely determined by the datasets
used for training. Instead, it is influenced by the structural connection between the
network and the data distribution. Therefore, a combination of both data‑centric and
model‑centric approaches is essential when striving to enhance the network’s robust‑
ness. By addressing both aspects, AI researchers and practitioners can create more
resilient and secure neural network models.
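As a hedged illustration of how such adversarial samples can be constructed, the sketch below implements the fast gradient sign method (FGSM), one common technique; the trained model, the inputs, and the perturbation size epsilon are placeholders:

```python
import torch
import torch.nn as nn

def fgsm(model, x, y, epsilon=0.03):
    """Perturb x by a small, bounded step that maximally increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x), y)
    loss.backward()
    # Step each input dimension by epsilon in the direction that raises the loss.
    return (x + epsilon * x.grad.sign()).detach()

# adversarial = fgsm(trained_model, images, labels)
# Such inputs are often misclassified despite looking unchanged to a human observer.
```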
3.8 CONCLUSIONS
The AI/ML community is witnessing a surge of interest in data‑centric AI, poten‑
tially heralding a paradigm shift in AI/ML model development. Our thorough
analysis explored the advantages and disadvantages of both data‑centric and
model‑centric approaches, culminating in a balanced perspective that harmonizes
the two approaches. While we concur with fellow researchers on the importance of
data‑centric AI, we firmly assert that this shift should not diminish the significance
of model‑centric AI. Embracing the data‑centric approach is a crucial advancement
in AI capabilities, as it places a strong emphasis on high‑quality datasets, enabling
models to better comprehend and predict real‑world scenarios.
However, it is equally vital not to overlook the significance of model‑centric
AI, which concentrates on refining and optimizing the algorithms and hyperpa‑
rameters that form the backbone of AI models. This approach has proven highly
valuable in achieving superior performance and efficiency across various appli‑
cations. Achieving a balance and recognizing the complementary nature of both
data‑centric and model‑centric AI is the key. By amalgamating the strengths of
both approaches, we can forge more resilient and potent AI systems that excel in
diverse situations. The context of Industry 4.0, promising revolutionary automa‑
tion and IoT‑driven interaction among cyber‑physical objects, accentuates the criti‑
cal interplay between data‑centric and model‑centric AI. The triumph of Industry
4.0 technologies hinges on effective data utilization and sophisticated AI models.
Merging both approaches empowers us to fully leverage AI’s potential in Industry
4.0 applications, leading to heightened efficiency and innovation across multiple
domains.
In conclusion, embracing the data‑centric AI approach represents significant
progress, but it must not overshadow the importance of model‑centric AI. Instead,
recognizing the value of both approaches and synergistically incorporating them can
usher in groundbreaking AI technologies, particularly in the context of Industry 4.0’s
ambitious goals. By striking a harmonious balance, we pave the way for a new era of
AI development, where data‑centric and model‑centric AI unite to reshape the fron‑
tiers of AI, transcending limitations and unlocking untapped potential.
REFERENCES
[1] Bender, E.M.; Gebru, T.; McMillan‑Major, A.; Shmitchell, S. On the dangers of sto‑
chastic parrots: Can language models be too big? In: Proceedings of the 2021 ACM
Conference on Fairness, Accountability, and Transparency, Toronto, ON, Canada,
3–10 March 2021
[2] Bhatt, S. The Big Fight: RPA Versus Traditional Automation. 2018. https://www.
botreetechnologies.com/blog/the‑big‑fight‑robotic‑process‑automation‑vs‑traditional‑
automation (accessed on 3 January 2023).
[3] Boubin, J.; Banerjee, A.; Yun, J.; Qi, H.; Fang, Y.; Chang, S.; Srinivasan, K.; Ramnath,
R.; Arora, A. PROWESS: An Open Testbed for Programmable Wireless Edge Systems.
Association for Computing Machinery, New York, NY, USA, 2022
[4] Chen, T.; Moreau, T.; Jiang, Z.; Zheng, L.; Yan, E.; Shen, H.; Cowan, M.; Wang, L.; Hu, Y.;
Ceze, L. et al. {TVM}: An automated {End‑to‑End} optimizing compiler for deep learn‑
ing. In: Proceedings of the 13th USENIX Symposium on Operating Systems Design and
Implementation (OSDI 18), Carlsbad, CA, USA, 8–10 October 2018, pp. 578–594.
[5] Fujita, H. AI‑based computer‑aided diagnosis (AI‑CAD): The latest review to read first.
Radiological Physics and Technology 2020, 13, 6–19.
[6] Hack, U. What Is the Real Story Behind the Explosive Growth of Data? 2021. https://
www.red‑gate.com/blog/database‑development/whats‑the‑real‑story‑behind‑the‑e
xplosive‑growth‑of‑data (accessed on 3 January 2023).
[7] Hamid, O.H.; Braun, J. Reinforcement learning and attractor neural network mod‑
els of associative learning. In: Computational Intelligence: Proceedings of the 9th
International Joint Conference, IJCCI 2017, Funchal–Madeira, Portugal, 1–3 November
2017, Revised Selected Papers, Springer, New York, NY, USA, 2019, pp. 327–349.
[8] Hamid, O.H. From model‑centric to data‑centric AI: A paradigm shift or rather a com‑
plementary approach? In: Proceedings of the 2022 8th International Conference on
Information Technology Trends (ITT), Dubai, United Arab Emirates, 25–26 May 2022,
IEEE, Piscataway, NJ, USA, 2022, pp. 196–199
[9] Tian, Q.; Yang, Y.; Lin, C.; Li, Q.; Shen, C. Improving Adversarial Robustness with
Data‑Centric Learning. 2022. https://alisec‑competition.oss‑cn‑shanghai.aliyuncs.
com/competition_papers/20211201/rank5.pdf
[10] Jiang, Y.; Zhu, Y.; Lan, C.; Yi, B.; Cui, Y.; Guo, C. A unified architecture for accelerat‑
ing distributed {DNN} training in heterogeneous {GPU/CPU} clusters. In: Proceedings
of the 14th USENIX Symposium on Operating Systems Design and Implementation
(OSDI 20), 4–6 November 2020, pp. 463–479.
[11] Kotsiopoulos, T.; Sarigiannidis, P.; Ioannidis, D.; Tzovaras, D. Machine learning and
deep learning in smart manufacturing: The smart grid paradigm. Computer Science
Review 2021, 40, 100341.
[12] Krishnan, S.; Wang, J.; Wu, E.; Franklin, M.J.; Goldberg, K. Activeclean: Interactive
data cleaning for statistical modeling. Proceedings of the VLDB Endowment 2016, 9,
948–959.
[13] Mazumder, M.; Banbury, C.; Yao, X.; Karlaš, B.; Rojas, W.G.; Diamos, S.; Diamos, G.;
He, L.; Kiela, D.; Jurado, D. et al. DataPerf: Benchmarks for data‑centric AI development. arXiv preprint arXiv:2207.10062, 2022.
[14] Miranda, L.J. Towards Data‑Centric Machine Learning: A Short Review. https://ljvmi‑
randa921.github.io/notebook/2021/07/30/data‑centric‑ml
[15] Molnar, C. Interpretable Machine Learning. 2022. https://christophm.github.io/inter‑
pretable‑ml‑book (accessed on 3 January 2023).
[16] Pareek, P.; Shankar, P.; Pathak, M.P.; Sakariya, M.N. Predicting music popularity
using machine learning algorithm and music metrics available in Spotify. Center for
Development Economics Studies 2022, 9, 10–19.
[17] Reinsel, D.; Rydning, J.; Gantz, J.F. Worldwide Global Datasphere Forecast, 2021–2025:
The World Keeps Creating More Data‑Now, What Do We Do With it All? 2021. https://
www.marketresearch.com/IDC‑v2477/Worldwide‑Global‑DataSphere‑Forecast‑
Keeps‑14315439/
[18] Renggli, C.; Karlaš, B.; Ding, B.; Liu, F.; Schawinski, K.; Wu, W.; Zhang, C. Continuous
integration of machine learning models with ease.ml/ci: Towards a rigorous yet practi‑
cal treatment. Proceedings of Machine Learning and Systems 2019, 1, 322–333.
[19] Russell, B.C.; Torralba, A.; Murphy, K.P.; Freeman, W.T. LabelMe: A database and
web‑based tool for image annotation. International Journal of Computer Vision 2008,
77, 157–173.
[20] Schlegl, T.; Stino, H.; Niederleithner, M.; Pollreisz, A.; Schmidt‑Erfurth, U.; Drexler,
W.; Leitgeb, R.A.; Schmoll, T. Data‑centric AI approach to improve optic nerve head
segmentation and localization in OCT en face images. arXiv preprint arXiv:2208.03868.
[21] van Moorselaar, D.; Slagter, H.A. Inhibition in selective attention. Annals of the New
York Academy of Sciences 2020, 1464, 204–221.
[22] Vartak, M.; Subramanyam, H.; Lee, W.E.; Viswanathan, S.; Husnoo, S.; Madden, S.;
Zaharia, M. ModelDB: A system for machine learning model management. In:
Proceedings of the Workshop on Human‑In‑the‑Loop Data Analytics, San Francisco,
CA, USA, 26 June–1 July 2016, pp. 1–3.
[23] Zhang, H.; Li, Y.; Huang, Y.; Wen, Y.; Yin, J.; Guan, K. MLmodelCI: An automatic
cloud platform for efficient MLaaS. In: Proceedings of the 28th ACM International
Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020, pp. 4453–4456.
4 Data‑Centric AI
Approaches for
Machine Translation
Chandrakant D. Kokane, Pranav Khandagale,
Mehul Ligade, Shreeyash Garde, and Vilas Deotare
4.1 INTRODUCTION
Machine translation, a fundamental component of interlingual communication in
our increasingly globalized world, has witnessed significant advancements in recent
years.1 As the demand for accurate and efficient translation grows, the integration of
machine learning techniques has emerged as a transformative force in enhancing the
quality and efficacy of machine translation systems.
The automated translation of text or speech from one language into another is
referred to as machine translation. It makes cross‑cultural communication easier
in a number of different professions, such as business, diplomacy, and academia.
Historically, rule‑based and statistical approaches have been utilized to address the
challenges of translation. These approaches could not adequately handle the intrinsic
language complexity, resulting in fewer accurate translations and poor translation
quality.
With the advent of machine learning, a revolutionary approach that focuses on the
development of algorithms capable of learning and improving from data, a paradigm
shift has occurred in the field of machine translation. Machine learning algorithms,
powered by neural networks, have demonstrated exceptional capabilities in capturing
the complex patterns and linguistic nuances present in diverse language pairs.
By leveraging large‑scale parallel corpora and sophisticated neural network archi‑
tectures, machine learning techniques enable machine translation systems to com‑
prehend and generate translations that exhibit higher fidelity to the original meaning
and context. This paradigm shift toward data‑driven, machine learning‑based trans‑
lation systems, commonly known as neural machine translation, has led to remark‑
able advancements and a significant improvement in translation quality.
The incorporation of machine learning in machine translation not only addresses
the limitations of traditional approaches but also opens up new avenues for explo‑
ration and innovation.1 The ability of machine learning models to learn from vast
amounts of training data and adapt to different language pairs has fueled the progress
in achieving state‑of‑the‑art translation performance.
The key contributions of the book chapter aim to explore the pivotal role of
machine learning techniques in advancing machine translation. By delving into
DOI: 10.1201/9781003461500-5
produce a translation in the target language from a source language input sequence
in the context of machine translation.
Contextual dependencies are captured by RNNs by propagating data from one
time step to the next. The information about the previously viewed words and their
context is encoded in the RNN’s hidden state.
This enables the model to take into account the history of the input sequence
and produce translations that are logical and appropriate for the context. Standard
RNNs, however, experience the vanishing gradient problem, which might restrict
their use.
Long Short‑Term Memory (LSTM) networks, a variant of RNNs, were developed to overcome this drawback.3 In order to more effectively capture long‑term dependencies, LSTMs contain memory cells and gating mechanisms.
While the gating mechanisms regulate the flow of information, the memory cells enable the model to selectively store and update information over time. Compared to simple RNNs, LSTMs have a better capacity to grasp long‑term dependencies and increase translation accuracy.3
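The following minimal PyTorch sketch encodes a source sentence with an LSTM; the vocabulary and dimension sizes are illustrative rather than taken from any particular system:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden_dim = 10_000, 256, 512     # illustrative sizes

embedding = nn.Embedding(vocab_size, emb_dim)
encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

src = torch.randint(0, vocab_size, (1, 7))              # one source sentence of 7 token ids
outputs, (h_n, c_n) = encoder(embedding(src))

# outputs: the hidden state at every source position (later consulted by attention)
# h_n, c_n: final hidden and memory-cell states summarizing the sentence
print(outputs.shape, h_n.shape)   # torch.Size([1, 7, 512]) torch.Size([1, 1, 512])
```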
Transformer Models: Transformer models have become a ground‑breaking
machine translation architecture. Transformers rely on self‑attention techniques to
capture contextual dependencies as opposed to RNN‑based systems, which analyze
the input stream sequentially.
The fundamental concept of transformers is self‑attention, in which each word in the input sequence attends to every other word to assess its relative importance. This enables the model to provide translations while taking the sentence's whole context into account.
Transformers perform simultaneous processing of the full input sequence, paying
attention to various sections of the sequence to acquire data and make translation
judgments. Through effective context modeling and the capturing of dependencies
over the whole sequence, this parallel processing improves translation accuracy.
Transformers include a self‑attention mechanism that enables the model to gener‑
ate the translation while weighing the significance of various terms in the original
text.
Transformers can better match the source and destination language information,
resulting in more accurate translations, by paying attention to pertinent source lan‑
guage information.
Transformers also provide positional encoding to take into consideration the input
sequence’s sequential nature. The model can capture the word order and preserve
the sentence structure during translation, thanks to positional encoding which offers
information about the placement of each word in the sentence.
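A minimal sketch of the scaled dot-product self-attention at the heart of this architecture is given below; a single attention head with random weights is shown purely for illustration:

```python
import torch
import torch.nn.functional as F

def self_attention(x, Wq, Wk, Wv):
    """x: (seq_len, d_model). Every position attends to every other position."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)  # pairwise relevance
    weights = F.softmax(scores, dim=-1)                      # attention weights
    return weights @ v                                       # context-mixed representation

d_model = 64
x = torch.randn(7, d_model)                                  # a 7-token sentence
Wq, Wk, Wv = (torch.randn(d_model, d_model) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)                   # torch.Size([7, 64])
```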
Role of Attention Mechanisms: The alignment of the information in the source
and destination languages during translation depends critically on attention pro‑
cesses. When producing translations, attention enables the model to concentrate on
particular segments of the input sequence, matching the pertinent source data with
the intended result.
The ability of the model to identify which words in the source sentence are most
pertinent to each word in the target phrase is provided by attention mechanisms in
machine translation.4
FIGURE 4.3 Attention mechanism in machine translation.
By taking into account the context and meaning of the original words, this align‑
ment aids the model in producing appropriate translations.
In order to translate one word from the source sentence into another in the target
phrase, attention mechanisms compute attention weights for each word in the source
sentence.
Figure 4.3 gives us a brief idea about the role of the attention mechanism.
Machine translation models based on transformers have shown to be especially
good at using attention processes. Transformers include a self‑attention mechanism
that enables the model to understand the relationships between every word in the
input sequence, leading to more precise alignment and context‑aware translation.
The model is more versatile and adaptive to various translation tasks because of the
attention mechanism’s ability to manage variable‑length input and output sequences.
In summary, by retaining hidden states, utilizing memory cells and gating mecha‑
nisms, and utilizing attention processes, neural network designs like RNNs, LSTMs,
and transformers capture contextual dependencies in machine translation.5
These architectures have transformed the area of machine translation by making
it possible to create models that can provide translations that are more accurate and
contextually suitable.
i. Initialization: For each model parameter, Adam initializes the first and
second moment variables (m and v) to zero.
ii. Gradient Computation: Based on the small batch of training examples,
the gradients of the model parameters are computed during the forward
propagation and backpropagation steps.
iii. Moment Updates: Adam determines the exponential moving averages of the first (m) and second (v) moments of the gradients. The decay rates used to calculate these moving averages are β1 and β2, respectively.
iv. Bias Correction: The moving average estimates may be biased toward zero because they are initialized with zeros, especially in the first training steps. Adam therefore divides the moment estimates by (1 − β1^t) and (1 − β2^t), respectively, before using them in the parameter update.
Based on the first and second moment estimations, Adam modifies the learning rate
for each parameter, causing the model to converge more quickly and handle various
gradients effectively.
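A minimal NumPy sketch of a single Adam update, mirroring the steps above with the commonly used default constants, might look as follows (purely illustrative):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameter theta given its gradient at step t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction for zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = np.array([0.5]), np.zeros(1), np.zeros(1)
theta, m, v = adam_step(theta, grad=np.array([0.2]), m=m, v=v, t=1)
```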
BLEU has some restrictions even if it is now a de facto standard. Fluency, ade‑
quacy, and word order are not taken into account by BLEU, which primarily con‑
centrates on lexical similarity. Because it is sensitive to superficial similarities,
translations that might not accurately convey the intended meaning often receive
high marks. Furthermore, BLEU is domain‑dependent, which means that its perfor‑
mance may vary across various text domains.
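As a hedged illustration, the snippet below computes a sentence-level BLEU score with NLTK (assuming the library is installed); smoothing is applied because higher-order n-gram matches are often absent for single short sentences:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]   # one or more reference token lists
candidate = ["the", "cat", "is", "on", "the", "mat"]

score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 3))   # lexical-overlap score between 0 and 1
```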
Metric for Evaluation of Translation with Explicit Ordering or METEOR:
Another popular evaluation metric, METEOR, seeks to capture many facets of trans‑
lation quality. It takes into account a variety of matching factors, such as unigrams,
stemming, synonymy, word order, and more.
Considering the word alignment between the reference translations and the
machine‑generated translation, METEOR determines precision and recall values. In
addition to a number of matching and penalty systems, the ultimate score is calcu‑
lated using a harmonic mean of recall and precision.
METEOR has the advantage of handling word‑order variations, accounting for
synonyms, and stemming.4 It can distinguish between variations in words’ surface
forms and offer more precise alignments. It has been demonstrated that METEOR
performs well across a variety of language pairs and text domains. It does, however,
have some restrictions. METEOR largely depends on the caliber of available linguis‑
tic resources, including word alignments, stemmers, and synonyms. Because of its
complicated scoring system, it is challenging to evaluate and comprehend the precise
contribution of each component.
TER (Translation Edit Rate): TER measures the edit distance between the machine‑generated translation and the reference translation. It determines how many edits – including insertions, deletions, and substitutions – are necessary to turn the machine‑generated translation into the reference translation.
TER concentrates on the surface‑level alterations and offers a direct measurement
of translation similarity.4 Since it is linguistically neutral, it can be applied to other
language combinations. A text that has been heavily edited or post‑edited, where the
emphasis is on the number of revisions done, can be evaluated well using TER.
TER does, however, have some restrictions. It does not take semantic equiva‑
lence or fluency into account and rather concentrates on the number of modifications.
It may penalize translators for legitimate paraphrases or rearranging sentences to
increase readability or flow. Additionally, because TER scores cannot fully capture
the whole meaning or appropriateness of the translation, they may not be able to
provide a comprehensive assessment of translation quality.
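To make the edit-distance idea behind TER concrete, the sketch below computes a simplified word-level variant; it counts insertions, deletions, and substitutions but omits the phrase shifts that the full TER metric also allows:

```python
def ter_like(hypothesis, reference):
    """Word-level edit distance divided by reference length (shifts omitted)."""
    h, r = hypothesis.split(), reference.split()
    d = [[0] * (len(r) + 1) for _ in range(len(h) + 1)]
    for i in range(len(h) + 1):
        d[i][0] = i
    for j in range(len(r) + 1):
        d[0][j] = j
    for i in range(1, len(h) + 1):
        for j in range(1, len(r) + 1):
            cost = 0 if h[i - 1] == r[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(h)][len(r)] / max(len(r), 1)

print(ter_like("the cat is on the mat", "the cat sat on the mat"))  # 1 edit / 6 words ≈ 0.167
```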
4.7.2.1 BLEU
a. Strengths: BLEU is widely used, simple to implement, and effective com‑
putationally. It offers an immediate evaluation of translation quality and is
simple to understand. For specific language pairs and text domains, BLEU
scores have a fair amount of agreement with human assessments.
b. Weakness: BLEU’s emphasis on lexical similarity and n‑gram accuracy
can produce high ratings for translations that may not accurately convey
the intended meaning. Fluency, sufficiency, and word order – which are
essential for assessing translation quality – are not taken into account.
BLEU is susceptible to manipulation and is also perceptive to superficial
similarities.
4.7.2.2 METEOR
a. Strengths: METEOR provides a more thorough assessment of translation quality by taking many linguistic factors like word order, stemming, and synonymy into account. It performs effectively in a variety of language and domain combinations. In comparison to BLEU, METEOR can tolerate variances in surface forms and offer superior alignments.
b. Weaknesses: The effectiveness of METEOR significantly depends on the availability and caliber of linguistic resources like stemmers, synonyms, and word alignments. Its scoring system is intricate and may be difficult to understand. When linguistic resources are scarce or poorly matched to the target language, METEOR scores may not always match human judgments exactly.
4.7.2.3 TER
a. Strengths: TER provides an accurate comparison of similarity by mea‑
suring the edit distance between translations. It is not linguistically reliant
and is less prone to fluency and adequacy problems. Since TER can detect
significant translational changes, it is helpful for assessing heavily edited or
post‑edited text.
b. Weaknesses: TER does not take semantic equivalence into account and
rather concentrates on the number of revisions. It may penalize translators
for legitimate paraphrases or rearranging sentences to increase readability
or flow. As TER scores do not fully capture the whole meaning or appro‑
priateness of the translation, they may not be able to provide a thorough
assessment of translation quality.
It’s critical to realize that while these evaluation criteria provide quantitative assess‑
ments of translation quality, they do not fully account for human perception. In order
to gain a thorough understanding of machine translation system performance, they
should be utilized in conjunction with human review and other qualitative analyses.
Additionally, depending on the language pair, text domain, and particular translation
issues, the applicability and efficacy of these indicators may change.
4.8 METHODOLOGY
Machine learning techniques play a crucial role in machine translation since they
have completely changed how we approach language translation tasks. The process
for using machine learning in machine translation is described below, and refer to
Figure 4.5 for brief information about the methodology:
4.11 CONCLUSION
The improvement in the quality, fluency, and accuracy of machine translations is
mainly due to machine learning techniques. In terms of pattern recognition, trans‑
lation models can be trained using machine learning methods to provide transla‑
tions that are very similar to human‑generated translations, understand context, and
make decisions. Here are some important functions of a machine learning strategy
in machine translation:
Pattern Recognition: Machine learning techniques allow translation models to
learn from large amounts of multilingual textual data and identify patterns that con‑
tribute to accurate translations. The lexical, syntactic, and semantic patterns in the
source language can be recognized by the models, which can then be used to gener‑
ate equivalent translations in the target language.
Contextual Understanding: Translation models can understand the context of
the source sentence more efficiently during translation through machine learning
techniques such as neural networks. To manage idioms, distinguish words or phrases,
and provide linguistically consistent and contextual translations, models can use con‑
textual information.
Learn from Data: To discover language matches between source and target lan‑
guages, machine translation models are mainly data‑driven. Models can be trained on
huge parallel corpus using supervised learning methods, where the source sentence
and its corresponding translation are synchronized. Models can learn translation pat‑
terns using this data‑driven method, generalize from versions, and improve transla‑
tion quality over time.
Generalizability and Adaptability: Machine learning techniques make it easy
to adapt translation models to other languages, domains, and styles. To support addi‑
tional language pairs or specialized domains, models that have been trained on a
language pair or domain can be adapted or extended. Machine translation systems
can accommodate many translation requirements due to their flexibility, which also
helps them perform better over time.
Machine learning algorithms are excellent at collecting complex patterns and
dealing with language differences. Due to the ambiguities, idiomatic expressions,
and variations that exist between languages, machine translation is inherently dif‑
ficult. Rule‑based systems often have difficulty successfully handling such complexi‑
ties, while machine learning models can learn from huge volumes of data and adapt
to that complexity.
Data‑driven approach: Machine learning techniques rely on data‑driven
approaches and make translation decisions by learning from large volumes of parallel data. Models can generalize from examples using this data‑centric approach, adapting
to different language and domain pairs, and improving over time. On the other hand,
rule‑based systems are less scalable and flexible because they require humans to
build and maintain rules significantly.
Generalizability and Adaptability: By adapting or extending their training
on specific data, machine learning models can easily generalize to new language,
domain, or style pairs. Machine translation systems can now adapt to a wide vari‑
ety of translation requirements without having to manually build or modify rules.
Traditional rule‑based methods have limited extensibility and generalization, as
every single translation job requires extensive rule engineering and maintenance.
Transformer models and RNNs are two machine learning models commonly used in machine translation. RNNs such as LSTMs or gated recurrent units (GRUs) are commonly used in machine translation systems built on a sequence‑to‑sequence model. On the other hand, transformer models have become a powerful alternative.
The foundational article "Attention is all you need" introduced the design of the Transformer, which uses self‑attention techniques to capture dependencies across the input sequence. Using this attention mechanism, Transformer models can pay attention to relevant source words during translation, increase translation accuracy, and handle long‑range dependencies more efficiently.
REFERENCES
[1] Brown, Peter F., John Cocke, Stephen A. Della Pietra, Vincent J. Della Pietra, Frederick
Jelinek, John Lafferty, Robert L. Mercer, and Paul S. Roossin. “A statistical approach to
machine translation.” Computational Linguistics 16, no. 2 (1990): 79–85.
[2] Zhao, Bei, and Wei Gao. “Machine Learning Based Text Classification Technology.”
In: 2022 IEEE 2nd International Conference on Mobile Networks and Wireless
Communications (ICMNWC), IEEE, 2022, pp. 1–5.
[3] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to sequence learning with
neural networks.” Advances in Neural Information Processing Systems 27 (2014).
[4] Sennrich, Rico, Barry Haddow, and Alexandra Birch. “Neural machine translation of
rare words with subword units.” (2015). arXiv preprint arXiv:1508.07909.
[5] Benkov, Lucia. “Neural machine translation as a novel approach to machine trans‑
lation.” In: DIVAI 2020 The 13th International Scientific Conference on Distance
Learning in Applied Informatics, 2020, pp. 499–509.
[6] Bkassiny, Mario, Yang Li, and Sudharman K. Jayaweera. “A survey on machine‑learning
techniques in cognitive radios.” IEEE Communications Surveys & Tutorials 15, no. 3
(2012): 1136–1159.
Section II
Data‑Centric AI in Healthcare
and Agriculture
5 Case Study Medical
Images Analysis and
Classification with
Data-Centric Approach
Namrata N. Wasatkar and Pranali G. Chavhan
DOI: 10.1201/9781003461500-7
and evaluating performance, we can select the best model and optimize its
performance for our data.
Step 6: Model Evaluation
Model evaluation is a crucial step in the data modeling process to assess
the performance and effectiveness of a trained model. Model evaluation
is an ongoing process that requires careful consideration of the evaluation
metrics, performance strategies, and the specific goals of your analysis. By
adopting a data‑centric approach, you can gain a deeper understanding of
your model’s performance, make informed decisions, and refine your mod‑
eling techniques to optimize results.
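As a small, hedged illustration of this evaluation step, the sketch below assumes scikit-learn and a held-out set of labeled medical images for which predictions have already been produced; the label names are hypothetical:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical ground-truth and predicted labels for a held-out test set.
y_true = ["pneumonia", "normal", "normal", "pneumonia", "normal", "pneumonia"]
y_pred = ["pneumonia", "normal", "pneumonia", "pneumonia", "normal", "normal"]

print(confusion_matrix(y_true, y_pred, labels=["normal", "pneumonia"]))
print(classification_report(y_true, y_pred))   # per-class precision, recall, and F1
```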
5.6 CONCLUSION
The case study emphasizes the importance of comprehensive and diverse datasets.
The availability of a wide range of medical images representing different diseases
and conditions enables the system to generalize and adapt to new cases effectively.
Additionally, the inclusion of diverse patient populations helps to ensure that the
developed algorithms are applicable across various demographics, thus reducing
potential biases.
Furthermore, the case study highlights the need for robust and scalable infrastruc‑
ture to support the analysis and classification of medical images. This includes pow‑
erful computing resources, storage capacity, and secure data management systems.
Such infrastructure enables the efficient processing of large datasets and facilitates
collaboration among healthcare professionals and researchers [23].
In conclusion, the case study underscores the transformative potential of a
data‑centric approach in the analysis and classification of medical images. By lever‑
aging advanced algorithms, comprehensive datasets, and robust infrastructure,
healthcare professionals can enhance diagnostic accuracy, improve patient outcomes,
and accelerate medical research. The findings from this case study provide valuable
insights and serve as a foundation for further advancements in the field of medical
image analysis.
REFERENCES
[1] Roth, A., Wüstefeld, K. and Weichert, F., 2021. A data‑centric augmentation approach
for disturbed sensor image segmentation. Journal of Imaging, 7(10), p. 206.
[2] Bullock, J., Cuesta‑Lázaro, C. and Quera‑Bofarull, A., 2019. XNet: A convolutional
neural network (CNN) implementation for medical x‑ray image segmentation suitable
for small datasets. In: Medical Imaging 2019: Biomedical Applications in Molecular,
Structural, and Functional Imaging (Vol. 10953, pp. 453–463). SPIE, Bellingham, WA.
[3] Zahid, A., Poulsen, J.K., Sharma, R. and Wingreen, S.C., 2021. A systematic review
of emerging information technologies for sustainable data‑centric health‑care.
International Journal of Medical Informatics, 149, p. 104420.
[4] Kalyankar, P.A., Mulani, A.O., Thigale, S.P., Chavhan, P.G. and Jadhav, M.M., 2022.
Scalable face image retrieval using AESC technique. Journal of Algebraic Statistics,
13(3), pp. 173–176.
[5] Shamshad, F., Khan, S., Zamir, S.W., Khan, M.H., Hayat, M., Khan, F.S. and Fu,
H., 2023. Transformers in medical imaging: A survey. Medical Image Analysis, 88,
p. 102802.
[6] Wang, L., Xue, W., Li, Y., Luo, M., Huang, J., Cui, W. and Huang, C., 2017. Automatic
epileptic seizure detection in EEG signals using multi‑domain feature extraction and
nonlinear analysis. Entropy, 19(6), p. 222.
[7] Waugh, S.A., Purdie, C.A., Jordan, L.B., Vinnicombe, S., Lerski, R.A., Martin, P. and
Thompson, A.M., 2016. Magnetic resonance imaging texture analysis classification of
primary breast cancer. European Radiology, 26, pp. 322–330.
[8] Maharana, K., Mondal, S. and Nemade, B., 2022. A review: Data pre‑processing and
data augmentation techniques. Global Transitions Proceedings, 3(1), pp. 91–99.
[9] Preece, S.J., Goulermas, J.Y., Kenney, L.P. and Howard, D., 2008. A comparison of fea‑
ture extraction methods for the classification of dynamic activities from accelerometer
data. IEEE Transactions on Biomedical Engineering, 56(3), pp. 871–879.
[10] Aybike, U. and Kilimci, Z.H., 2021. The prediction of chiral metamaterial resonance
using convolutional neural networks and conventional machine learning algorithms.
International Journal of Computational and Experimental Science and Engineering,
7(3), pp. 156–163.
[11] Dinh, A., Miertschin, S., Young, A. and Mohanty, S.D., 2019. A data‑driven approach
to predicting diabetes and cardiovascular disease with machine learning. BMC Medical
Informatics and Decision Making, 19(1), pp. 1–15.
[12] Bhattacharya, S., Maddikunta, P.K.R., Pham, Q.V., Gadekallu, T.R., Chowdhary, C.L.,
Alazab, M. and Piran, M.J., 2021. Deep learning and medical image processing for
coronavirus (COVID‑19) pandemic: A survey. Sustainable Cities and Society, 65,
p. 102589.
[13] Castaneda, C., Nalley, K., Mannion, C., Bhattacharyya, P., Blake, P., Pecora, A., Goy,
A. and Suh, K.S., 2015. Clinical decision support systems for improving diagnostic
accuracy and achieving precision medicine. Journal of Clinical Bioinformatics, 5(1),
pp. 1–16.
[14] Whang, S.E., Roh, Y., Song, H. and Lee, J.G., 2023. Data collection and quality chal‑
lenges in deep learning: A data‑centric ai perspective. The VLDB Journal, 32(4),
pp. 791–813.
[15] Hamid, O.H., 2023. Data‑centric and model‑centric AI: Twin drivers of compact and
robust Industry 4.0 solutions. Applied Sciences, 13(5), p. 2753.
[16] Panthong, R. and Srivihok, A., 2015. Wrapper feature subset selection for dimension
reduction based on ensemble learning algorithm. Procedia Computer Science, 72,
pp. 162–169.
[17] Taylor, K.I., Staunton, H., Lipsmeier, F., Nobbs, D. and Lindemann, M., 2020. Outcome
measures based on digital health technology sensor data: Data‑and patient‑centric
approaches. NPJ Digital Medicine, 3(1), p. 97.
[18] Taylor, K.I., Staunton, H., Lipsmeier, F., Nobbs, D. and Lindemann, M., 2020. Outcome
measures based on digital health technology sensor data: Data‑and patient‑centric
approaches. NPJ Digital Medicine, 3(1), p. 97.
[19] Kumar, A. and Gautam, S., 2022. Improving medical diagnostics with machine learn‑
ing: A study on data classification algorithms. International Journal of Advanced
Computer Research, 12(61), p. 31.
[20] Chavhan, P.G., Ratil, R.V. and Mahalle, P.N., 2023. An investigative approach of context
in internet of behaviours (IoB). In: International Conference on Emerging Trends in
Expert Applications & Security (pp. 333–343). Springer, Singapore.
[21] Daphal, P., Pokale, S., Chavhan, P., Wasatkar, N., Rathi, S., Dongre, Y. and Kolekar, V.,
2023. Human pose detection system using machine learning. International Journal of
Intelligent Systems and Applications in Engineering, 11(3), pp. 553–561.
[22] Kharate, N., Patil, S., Shelke, P., Shinde, G., Mahalle, P., Sable, N. and Chavhan,
P.G., 2023. Unveiling the resilience of image captioning models and the influence of
pre‑trained models on deep learning performance. International Journal of Intelligent
Systems and Applications in Engineering, 11(9s), pp. 1–7.
[23] Mahalle, P.N., Shinde, G.R., Ingle, Y.S. and Wasatkar, N.N. (2023). Data‑centric AI.
In: Data Centric Artificial Intelligence: A Beginner’s Guide: Data‑Intensive Research.
Springer, Singapore. doi:10.1007/978‑981‑99‑6353‑9_5
6 Comparative Analysis
of Machine Learning
Classification
Techniques for Kidney
Disease Prediction
Jayashri Bagade, Nilesh P. Sable,
and Komal M. Birare
6.1 INTRODUCTION
The kidney’s primary job is to filter the body’s blood. Kidney disease is a silent
killer because kidney failure can occur without any warning signs or symptoms. The
definition of chronic renal disease is a deterioration in kidney function over months
or years. Diabetes and high blood pressure are common contributors to kidney dam‑
age. Globally, chronic kidney disease (CKD) is a serious health issue that affects
many people. People who can’t afford therapy for chronic renal disease may suffer
catastrophic effects if they don’t receive it. The most accurate test to assess kidney
function and the severity of CKD is the glomerular filtration rate (GFR). It can be
calculated using the blood creatinine level, as well as factors like age, gender, and
other details. Most of the time, becoming sick sooner is preferable. Consequently, it
is feasible to avoid major diseases. The kidney function of people with CKD gradu‑
ally deteriorates over time. It is a huge burden on the healthcare system due to its
rising frequency and high risk of developing end‑stage renal disease, which calls for
dialysis or kidney transplantation. A major worldwide health concern, CKD also has
a terrible prognosis for morbidity and mortality. However, CKD can be significantly
slowed down and serious complications can be avoided with early detection and
treatment. In order to effectively manage and treat the condition, it is imperative to
be aware of the signs and symptoms of renal disease. By leading a healthy lifestyle,
CKD can be prevented from progressing as quickly, by modifications like eating a
balanced diet, exercising frequently, abstaining from smoking and excessive alcohol
use, and managing underlying medical conditions like diabetes and hypertension.
Regular kidney function tests (KFTs), such as urine and blood tests, can also iden‑
tify CKD early on, allowing for quick management and intervention to stop further
kidney damage. The major objective of this research is to investigate datasets, flow
DOI: 10.1201/9781003461500-8
diagrams, and block diagrams to employ different algorithms to predict the develop‑
ment of renal illness.
Early identification of renal disease can help prevent irreparable kidney dam‑
age, albeit it may not always be possible. It’s critical to have a better understanding
of kidney disease symptoms to accomplish this goal. To predict the occurrence
of renal illness, the project involves analysing patient health and disease data,
comparing them using various indices and using machine learning classification
algorithms. Random Forests, K‑nearest Neighbour, Support Vector Machine, ADA
Boost, Gradient Boosting, Cat Boost, and Stochastic Gradient Boosting are some
of the classification methods used [1]. The data is categorized by the machine clas‑
sified into various classes, labels, and categories. Doctors often perform physical
examinations, evaluate the patient’s medical history, and then perform diagnostic
tests and treatments to identify the underlying cause of symptoms to diagnose an
illness.
With a fast‑rising patient population, CKD is currently a leading cause of death,
accounting for 1.7 million fatalities per year. Although there are many different diag‑
nostic techniques, this study uses machine learning because of its high accuracy.
Today, millions of people die from CKD, a condition that is quickly spreading and for
which there is now no timely, effective treatment. Patients with chronic renal disease
typically originate from middle‑ and low‑income nations.
Exercise, drinking water, and avoiding junk food are all advised. Figure 6.1
depicts the typical signs of chronic renal disease.
disease progression prediction are essential. In 2015, Swathi Baby [2] suggested a
system based on predictive data mining for the development of an analysis and pre‑
diction tool for renal illness. Data on renal disease collected and analysed by Weka
and Orange software were used in the system. The study used a variety of machine
learning methods to anticipate and statistically analyse the likelihood of renal sick‑
ness, including Naive Bayes, J48, AD Trees, K Star, and Random Forest. Performance
indicators for each approach were computed and contrasted. The study’s findings
demonstrate that the K‑star and Random Forest (RF) algorithms outperformed the
opposition on the test dataset.
These methods have Receiver Operating Characteristic (ROC) values of 1 and 2,
respectively, and were discovered to produce models more quickly. A million people
perished in 2013 because of chronic renal disease. More people have CKD in devel‑
oping nations, where there are 387.5 million CKD patients overall, 177.4 million of
whom are men and 210.1 million of whom are women. These statistics demonstrate
that a significant portion of the population in emerging nations has chronic renal
disease and that this proportion is rising daily. For CKD to be treated at an early
stage, a lot of work has been put into early diagnosis. In this paper, we emphasize
accuracy while concentrating on machine learning prediction models for chronic
renal disease. When both kidneys are destroyed, a common type of kidney illness
known as CKD develops, and CKD patients must live with this condition for an
extended period. Any renal issue that could impair kidney function is referred to here
as kidney damage.
To forecast renal illnesses in 2015, Dayanand [3] suggested employing support
vector machines (SVMs) and artificial neural networks (ANNs). The primary objec‑
tive of this study was to compare the accuracy and execution times of these two
algorithms. According to the experimental data, ANN performed better than SVM in terms of accuracy. Naive Bayes, SVM, and decision tree (DT) algorithms were used for classification, while RF, logistic regression, and linear regression were utilized in the medical sector for regression and
prediction. Due to early‑stage diagnosis and prompt patient treatment, the death rate
can be reduced by the effective application of these algorithms. Patients with chronic
renal disease should maintain their clinical symptoms and engage in regular physi‑
cal activity. Researchers Ganapathi Raju, K Gayathri Praharshitha, and K Prasanna
Lakshmi completed a study in 2020 [4] that used different classification algorithms
on patient medical information to diagnose chronic kidney illness. The primary
objective of this study was to determine the classification algorithm that, based on
the classification report and performance indicators, would be most useful for diag‑
nosing CKD. In 2017, a team of researchers have used 14 attributes to predict CKD
and achieved 0.991 accuracy with a multiclass decision forest [8].
A team of researchers used a multiclass decision forest in 2017 to predict CKD
using 14 different attributes, and they were able to do so with an astounding accuracy
of 0.991 [6]. To increase accuracy, the researchers eliminated instances with missing
values and trained both a logistic regression model and a neural network. Overall
accuracy for these models was 0.975 and 0.960, respectively. Correlations between
the chosen attributes ranged from 0.2 to 0.8. From a medical standpoint, it’s critical
to consider the associations between characteristics and CKD. For instance, specific
gravity has a correlation of 0.73 to the class and can both cause and be caused by
CKD. Eliminating these characteristics might result in a drop in accuracy. In 2017,
Sarica et al. [5] talked about the benefits of RF while considering any possible dis‑
advantages. More study on comparisons between this method and other widely used
classification systems is also encouraged, particularly in the early detection of the
transition from mild cognitive impairment (MCI) to Alzheimer’s disease (AD). This
study recommends a procedure that involves preprocessing of the data, collaborative
filtering for handling missing values, attribute selection, and CKD status prediction
using clinical data. The additional tree and RF classifiers displayed the highest accu‑
racy, obtaining 100% accuracy, with the least amount of bias towards the attributes.
In 2021, the authors used the WEKA data mining tool to test eight machine learning models [7]. The Naive Bayes, Multi‑layer Perceptron, and J48 algorithms
performed the best with accuracy scores of 0.95, 0.99, and 0.99, respectively, and
ROC scores of 1. The multilayer perceptron algorithm scored the highest in the study
using Kappa statistics, with a score of 0.99, followed by the decision table and J48
algorithms with scores of 0.97. El‑Houssainy et al. [8] examined the outcomes of vari‑
ous machine learning models and discovered that the Multiclass Decision Forest algo‑
rithm had the highest accuracy rate of about 99% for a condensed dataset with only 14
attributes. Supriya Aktar et al. [9] concentrated on using different machine learning
classification algorithms to increase the CKD diagnosis’ accuracy and shorten the
diagnosis process. The goal of the study was to categorize various stages of CKD
according to their severity. The performance of various algorithms, including radial
basis function (RBF), RF, and Basic Propagation Neural Network, was examined by
the researchers. The analysis’s findings demonstrated that, with an accuracy of 85.3%,
the RBF algorithm performed better than the other classifiers. In a dataset of CKD,
Dilli Arasu and Thirumalaiselvi [10] have worked on missing values. Missing values in a dataset lower both the accuracy of a model and the quality of its predictions. As a solution, they performed a recalculation process on the CKD stages and used the recalculated values in place of the missing ones. With the aid of a machine
learning algorithm, Salekin and Stankovic [11] employ a novel approach to detect
CKD. They receive results based on a dataset with 400 records and 25 attributes that
indicate whether a patient has CKD or not. To obtain results, they employ neural
networks, RFs, and k‑nearest neighbours. For feature reduction, they use the wrapper method, which helps detect CKD with high accuracy. The effects of class imbalance during data training for the devel‑
opment of neural network algorithms for the treatment of chronic renal disease are
examined by Yildirim [12]. In the suggested work, comparative research was carried
out using a sampling algorithm. This study demonstrates how sampling algorithms
can help classification algorithms perform better. It also shows that the learning rate
is a crucial factor that has a big influence on multilayer perceptron. Sharma et al. [13]
tested 12 different classification algorithms on a dataset containing 400 records and
24 attributes. They have compared their estimated findings with the actual results to
ascertain the precision of their forecasts. They used evaluation standards like preci‑
sion, sensitivity, accuracy, and specificity. The DT approach offers a precision of 1,
specificity of 1, and sensitivity of 0.9720 with an accuracy of up to 98.6%.
resistance to noise in training data, though its effectiveness might depend on the
number of training examples used. The accuracy of the KNN algorithm depends
on choosing the best value for the K parameter, which establishes the number of
nearest neighbours and the distance metric to be applied. Calculations in machine
learning can take a while, particularly when determining how far apart each
instance is from all training instances. To handle categorical features of a dataset,
RF, which is essentially a collection of DTs combined, can be used. Many train‑
ing examples and high‑dimensional spaces can both be handled by this algorithm.
Certain criteria must be established to use RF. The algorithmic approach is shown
in Figure 6.3.
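A brief sketch of how the K parameter might be chosen in practice is given below, assuming scikit-learn and that the preprocessed CKD feature matrix X and labels y already exist:

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def best_k(X, y, candidates=(3, 5, 7, 9, 11)):
    """Pick the K with the highest mean 5-fold cross-validation accuracy."""
    scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
              for k in candidates}
    return max(scores, key=scores.get), scores
```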
kernel, was used in this study. The SVM algorithm's objective is to establish the optimal decision boundary or line that can divide n‑dimensional space into classes so that new data points can quickly be placed in the correct category.
Both categorical and numerical features can be handled by the gradient boosting variant known as CatBoost. To transform categorical data into numerical features, there is no need for feature encoding methods such as One-Hot Encoder or Label Encoder.
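A minimal sketch of this behaviour is shown below; the DataFrame, column names and hyperparameters are hypothetical and chosen only to show that CatBoost accepts raw categorical columns without prior encoding.

```python
# Sketch of CatBoost handling a raw categorical column directly; the tiny
# DataFrame and column names here are illustrative placeholders.
import pandas as pd
from catboost import CatBoostClassifier

df = pd.DataFrame({
    "albumin": [0, 1, 3, 0, 2, 4],
    "hypertension": ["yes", "no", "yes", "no", "yes", "yes"],  # categorical as-is
    "ckd": [1, 0, 1, 0, 1, 1],
})
X, y = df[["albumin", "hypertension"]], df["ckd"]

model = CatBoostClassifier(iterations=50, depth=3, verbose=False)
model.fit(X, y, cat_features=["hypertension"])   # tell CatBoost which columns are categorical
print(model.predict(X))
```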
6.3.5 Decision Tree
One common supervised learning method for both classification and regression tasks
is the DT algorithm. Although it can be used to solve regression issues, classification
tasks are where it is most frequently applied. The algorithm uses a tree-like classifier structure, with internal nodes denoting dataset features and branches denoting decision rules. DTs use an inquiry-based approach, in which ques‑
tions are posed to ascertain whether or not a specific attribute is present. The data
are then divided into subtrees and further examined for classification or regression
tasks using the results.
6.3.6 Random Forest
The popular machine learning algorithm RF is a part of the supervised learning
methodology. It can be applied to ML issues involving both classification and regres‑
sion. It is predicated on the idea of ensemble learning, which is the act of integrating
various classifiers to address a complicated issue and enhance the model’s perfor‑
mance. According to what its name implies, “Random Forest is a classifier that con‑
tains a number of DTs on various subsets of the given dataset and takes the average
to improve the predictive accuracy of that dataset.” Instead of depending on a single
DT, the RF uses forecasts from each tree and predicts the result based on the votes
of the majority of predictions.
A larger number of trees in the forest yields higher accuracy and prevents overfitting. Because the RF combines numerous trees to forecast the class of the dataset, some DTs may predict the correct output while others may not; when all the trees are combined, however, the majority vote produces the right result. For the dataset's feature variables to predict true outcomes rather than a guessed result, there should be some actual values in the dataset, and each tree's predictions must have very low correlations with one another.
6.4 RESULT
The effectiveness of each algorithm was assessed after a study was conducted using
a variety of algorithms. The accuracy values for each algorithm are displayed on
a graph that was made to help convey this information in an understandable and
concise manner. The graph gives a visual representation of how each algorithm per‑
forms, making it simple to compare them and determine which algorithm is the most
accurate for the given dataset.
Data analysis frequently employs this method of presenting findings through
graphs because it makes complex information easier to understand. It is simpler to
spot patterns and trends. Graphs can also aid in the simplification of complex infor‑
mation to make it more understandable to a wider audience.
TABLE 6.1
Performance Comparison of Models
Sr. No  Classification Model  Training Accuracy  Testing Accuracy  Precision  Recall  F1 Score
1 KNN 0.77 0.67 0.74 0.71 0.72
2 SVM 0.94 0.95 0.99 0.93 0.96
3 Random forest 1.0 0.975 0.96 0.94 0.98
4 Decision tree 0.97 0.94 0.93 0.97 0.95
5 ADA Boost 1.0 0.97 0.96 1.00 0.98
6 Cat Boost 1.0 0.966 0.96 0.99 0.97
In this situation, the graph showing the accuracy values for each algorithm can
assist academics and medical professionals in deciding which algorithm is best for
identifying a specific disease or condition. The development of more precise diag‑
nostic tools or an improvement in patient care for the specific condition can both
benefit from this information. Overall, the effectiveness of data analysis in healthcare and other fields is greatly improved when data is presented visually, using graphs and other aids, rather than in its raw form.
The machine learning models are trained on the KFT dataset to validate the per‑
formance of the model. The models trained on 320 data records are tested on 80 unseen data records. Performance of the models is evaluated using four evaluation
parameters namely accuracy, precision, recall, and F1 score. The results are presented
in Table 6.1. A comparison of the accuracy of the model is shown in Figure 6.4.
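The snippet below is a minimal sketch of computing the four evaluation parameters named above with scikit-learn. The label arrays are placeholders; in practice y_test would be the 80 held-out records and y_pred a trained model's output.

```python
# Sketch of the four evaluation measures used in Table 6.1: accuracy, precision,
# recall and F1 score. The arrays below are placeholder labels.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_test = [1, 0, 1, 1, 0, 1, 0, 0]      # placeholder ground truth
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]      # placeholder model predictions

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))
```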
novel decision support system that makes use of machine learning classifiers. The
classifiers were successful in predicting other diseases in addition to CKD. The
effectiveness of the different classifiers used in the study to predict the presence of CKD was compared. The findings demonstrated that, in terms of predicting CKD, the ADA Boost and RF classifiers performed better than the SVM, DT, KNN, and Cat Boost classifiers. Overall, the study has overcome earlier shortcomings and increased
the precision of CKD prediction. The study’s results are encouraging and serve as a
foundation for additional study in this area.
The current techniques for anticipating chronic renal disease are deemed suf‑
ficient, despite some drawbacks. However, as the table below illustrates, research
is still being done to increase the precision of kidney disorder prediction and iden‑
tification. A more advanced CKD prediction system is still needed despite current
initiatives. Given that this area of research is still largely unexplored, a decision sup‑
port system that can aid in the early detection of chronic renal disease is especially
necessary.
This study uses a variety of classification methods to identify CKD including
RFs, KNN, SVM, ADA Boost, DT, and Cat Boost. These classifiers’ performance
can be evaluated and contrasted with that of other classifiers. For prompt treatment
to begin and to stop the disease from worsening, early detection of CKD is essential.
Therefore, early disease detection and prompt treatment are crucial for the medical
industry. Alternative classifiers can be investigated and assessed in upcoming studies
to find better approaches to objective function problems.
REFERENCES
[1] Konstantina Kourou, Themis P. Exarchosa, Konstantinos P. Exarchos, Michalis V.
Karamouzis, Dimitrios I. Fotiadis: “Machine learning applications in cancer progno‑
sis and prediction”, Computational and Structural Biotechnology Journal, Vol. 13,
pp. 8–17 (2015).
[2] P. Swathi Baby, T. Panduranga Vital: “Statistical analysis and predicting kidney dis‑
eases using machine learning algorithms” International Journal of Engineering
Research and Technology, Vol. 4, (2015).
[3] S. Vijayarani, S. Dhayanand: “Kidney disease prediction using SVM and ANN algo‑
rithms”, International Journal of Computing and Business Research, Vol. 6, (2015).
[4] Ganapathi Raju, K. Gayathri Praharshitha, K. Prasanna Lakshmi: “Prediction of
chronic kidney disease (CKD) using data science”, International Conference on
Intelligent Computing and Control Systems (ICCS) (2019).
[5] Alessia Sarica, Antonio Cerasa, Aldo Quattrone: “Random forest algorithm for the
classification of neuroimaging data in alzheimer’s disease”, Front Aging Neurosci. 2017
Oct 6; 9: 329. doi: 10.3389/fnagi.2017.00329. PMID: 29056906; PMCID: PMC5635046
(2017).
[6] N.V. Ganapathi Raju, K. Prasanna Lakshmi, K. Gayathri Praharshitha: “Prediction
of chronic kidney disease (CKD) Using data science”, International Conference on
Intelligent Computing and Control Systems (ICCS) (2017).
[7] Gazi Mohammed Ifraz, Muhammad Hasnath Rashid, Tahia Tazin, Sami Bourouis,
Mohammad Monirujjaman Khan: “Comparative analysis for prediction of kidney dis‑
ease using intelligent machine learning methods” Comput Math Methods Med. 2021
Dec 3; 2021: 6141470. doi: 10.1155/2021/6141470. Retraction in: Comput Math Methods
Med. 2023 Nov 1; 2023: 9864519. PMID: 34899968; PMCID: PMC8664508, (2021).
[8] El-Houssainy A. Rady, Ayman S. Anwar: "Prediction of kidney disease stages using
data mining”. Informatics in Medicine Unlocked, Vol. 15 (2019).
[9] Suraiya Aktar, Abhijit Pathak, Abrar Hossain Tasin: “Chronic kidney disease (CKD)
Prediction using data mining techniques”, In book: Advances in Intelligent Systems and
Computing (pp. 976–988). Publisher: Springer, Cham. (2021).
[10] S Dilli Arasu, R Thirumalaiselvi: “Review of chronic kidney disease based on data
mining techniques”, International Journal of Applied Engineering Research ISSN
0973‑4562 Volume 12, Number 23 pp. 13498–13505 (2017).
[11] Asif Salekin, John Stankovic: “Detection of Chronic Kidney Disease and Selecting
Important Predictive Attributes”, 2016 IEEE International Conference on Healthcare
Informatics (ICHI), (2016).
[12] Pinar Yildirim: "Chronic Kidney Disease Prediction on Imbalanced Data by Multilayer Perceptron: Chronic Kidney Disease Prediction", IEEE 41st Annual Computer Software and Applications Conference (COMPSAC), (2017).
[13] Sahil Sharma, Vinod Sharma, Atul Sharma: “Performance Based Evaluation of Various
Machine Learning Classification Techniques for Chronic Kidney Disease Diagnosis”,
International Journal of Modern Computer Science (IJMCS), Vol. 4, (2016).
7 Fusion of Multi‑Modal
Lumbar Spine Scans
Using Convolutional
Neural Networks
Bhakti Palkar
DOI: 10.1201/9781003461500-9
FIGURE 7.1 Quantity of papers available in PubMed database on “medical image fusion”.
Source: PubMed Dataset.
of these images. The research area “medical image fusion” is growing vastly [3]. In
Figure 7.1, we can observe the increasing number of publications on "Medical Image Fusion" in the PubMed database from the year 2000 to June 2023.
Key Contribution: In this research work, lumbar spine diseases of 12 different
types are considered for fusion. A novel deep learning and wavelet‑based approach
is used for CT and MRI fusion of spine images to generate one image that contains
details from both MR and CT images.
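A minimal sketch of wavelet-domain CT/MR fusion is given below, assuming the two scans are already registered and loaded as equal-sized grayscale arrays. The chapter weights the wavelet coefficients with CNN (VGG-19/AlexNet) feature maps; the simple average/max-absolute rule used here is only an illustrative stand-in for that step.

```python
# Sketch of 2D-DWT fusion of registered CT and MR slices; the CNN-based weighting
# from the chapter is replaced here by a simple average/max-abs rule.
import numpy as np
import pywt

def fuse_ct_mr(ct, mr, wavelet="db1"):
    cA1, (cH1, cV1, cD1) = pywt.dwt2(ct, wavelet)   # 2D-DWT of the CT scan
    cA2, (cH2, cV2, cD2) = pywt.dwt2(mr, wavelet)   # 2D-DWT of the MR scan

    fused_A = (cA1 + cA2) / 2.0                      # average the approximation bands
    pick = lambda a, b: np.where(np.abs(a) >= np.abs(b), a, b)
    fused_details = (pick(cH1, cH2), pick(cV1, cV2), pick(cD1, cD2))

    return pywt.idwt2((fused_A, fused_details), wavelet)

ct = np.random.rand(256, 256)   # placeholders for registered CT and MR slices
mr = np.random.rand(256, 256)
fused = fuse_ct_mr(ct, mr)
print(fused.shape)
```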
7.3.3 AlexNet Network
AlexNet is also a convolutional neural network, like VGG [48]. AlexNet won the ILSVRC competition held in 2012. Figure 7.4 shows the AlexNet architecture.
It is much smaller in size than VGG‑19 with just five convolutional layers and
three fully connected layers. Each Convolutional layer has a RELU activation func‑
tion with it except the last layer. A max-pooling layer of size 3 × 3 with stride 2 is used after the first, second and fifth convolutional layers. It has three fully
connected layers at the end. Like VGG19, AlexNet is also trained over the ImageNet
dataset.
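The sketch below shows how the ImageNet-pretrained AlexNet described above could be used as a fixed feature extractor. It assumes a recent torchvision release where pretrained weights are requested through the Weights enum; the input tensor is a placeholder.

```python
# Sketch of loading ImageNet-pretrained AlexNet and extracting convolutional features.
import torch
from torchvision.models import alexnet, AlexNet_Weights

model = alexnet(weights=AlexNet_Weights.DEFAULT)
model.eval()                                   # inference only, no fine-tuning

# model.features holds the five conv layers (with ReLU and the three max-pool layers).
feature_extractor = model.features

x = torch.rand(1, 3, 224, 224)                 # placeholder image batch in AlexNet's input size
with torch.no_grad():
    fmap = feature_extractor(x)
print(fmap.shape)                              # e.g. torch.Size([1, 256, 6, 6])
```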
technique. Lumbar spine fusion using “wavelet with VGG‑19” and “Wavelet with
Alexnet” have been compared with the following conventional methods.
FIGURE 7.7 (a) Moving image (CT)‑control point selection, (b) fixed image (MR)‑control
point selection, and (c) CT registered with MR image.
A high value of spatial frequency indicates good quality. The "Wavelet+VGG19" technique showed the best values (shown in bold in every row) for all the patients.
7.5 CONCLUSION
Lumbar spine CT and MRI scans are merged into one image to observe the L1, L2, L3, L4 and L5 vertebrae, discs, spinal cord, nerves and tissues in a single image. A novel
technique based on wavelet and CNN is introduced in this research work. VGG19
FIGURE 7.8 Patient 1 fused images: (a) simple average, (b) DCT, (c) hybrid wavelet trans‑
form, (d) wavelet+alexnet, (e) SWT, (f) DCT‑LP, (g) DT‑CWT, and (h) Wavelet+VGG19.
and Alexnet are combined with 2D‑DWT to generate a fused image. Extensive
experimentation is done to observe difference in fused images generated using eight
different techniques, of which six are conventional techniques and two are novel
techniques. Registration of CT with MRI is done using a landmark‑based registration
technique, and then it is fused with MRI. Three evaluation performance metrics –
entropy, spatial frequency and standard deviation are used to compare all the meth‑
ods. It is observed that fused images generated using the Wavelet+VGG19 technique showed the highest values for all three parameters, which indicates that this fused image has the most information content and better contrast and quality than all the other images. Medical experts can look at this single image instead of examining the CT and MRI images of the spine separately. VGG19 can be replaced by other convolutional neural
networks like RESNET and GoogleNet to observe the difference in fused images.
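The sketch below shows one way the three evaluation metrics used in Tables 7.1-7.3 could be computed, assuming an 8-bit grayscale fused image; the random input is a placeholder and the exact definitions may differ slightly from the chapter's implementation.

```python
# Sketch of entropy, standard deviation and spatial frequency for a fused image.
import numpy as np

def entropy(img):
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def spatial_frequency(img):
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))   # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))   # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

fused = np.random.randint(0, 256, (256, 256)).astype(np.uint8)  # placeholder fused image
print("Entropy           :", entropy(fused))
print("Standard deviation:", np.std(fused.astype(np.float64)))
print("Spatial frequency :", spatial_frequency(fused))
```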
FIGURE 7.9 (a) CT, (b) MRI and (c) CT registered with MR image.
FIGURE 7.10 Patient 2 fused images. (a) Simple average, (b) DCT, (c) hybrid wavelet trans‑
form, (d) wavelet+alexnet, (e) SWT, (f) DCT‑LP, (g) DT‑CWT, and (h) wavelet+VGG19.
TABLE 7.1
Entropy
Patient No.  Simple Average  DCT  HWT  Wavelet+AlexNet  SWT  DCT-LP  DT-CWT  Wavelet+VGG19
1 7.30 7.30 7.30 7.31 7.52 7.34 7.28 7.74
2 7.56 7.56 7.56 7.58 7.78 7.60 7.55 7.85
3 7.27 7.27 7.27 7.28 7.44 7.29 7.29 7.80
4 7.55 7.55 7.55 7.55 7.41 7.57 7.53 7.72
5 7.06 7.06 7.06 7.08 7.37 7.09 7.07 7.43
TABLE 7.2
Standard Deviation
Patient No.  Simple Average  DCT  HWT  Wavelet+AlexNet  SWT  DCT-LP  DT-CWT  Wavelet+VGG19
1 51.16 50.52 51.16 48.42 62.03 49.58 53.04 63.53
2 47.00 63.07 47.07 47.57 62.09 48.28 46.81 64.29
3 42.30 46.39 42.35 42.35 54.12 42.38 42.56 60.63
4 58.80 59.82 58.87 58.88 72.47 59.12 59.05 73.23
5 35.59 46.36 35.66 36.01 47.08 36.28 35.53 48.99
TABLE 7.3
Spatial Frequency
Patient No.  Simple Average  DCT  HWT  Wavelet+AlexNet  SWT  DCT-LP  DT-CWT  Wavelet+VGG19
1 7.00 7.08 7.00 5.24 5.88 5.55 3.56 9.96
2 13.42 13.45 13.44 11.15 10.86 12.31 6.49 14.35
3 5.23 5.34 5.24 2.94 4.86 3.10 2.87 9.71
4 9.07 9.09 9.08 6.03 7.64 6.63 4.85 11.62
5 6.98 7.01 7.00 6.10 5.69 6.56 3.39 7.53
REFERENCES
[1] http://timesofindia.indiatimes.com/articleshow/64992298.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst
[2] https://timesofindia.indiatimes.com/city/mumbai/73‑patients‑with‑spine‑problems‑
have‑lower‑back‑complaint‑survey/articleshow/61137025.cms
[3] https://pubmed.ncbi.nlm.nih.gov/?term=medical+image+fusion
[4] Changtao He, Quanxi Liu, Hongliang Li, Haixu Wang, Multimodal medical image
fusion based on IHS and PCA, Procedia Engineering, 7, 2010, pp. 280–285.
[5] Mahesh Malviya, Sanju Kumari, Srikant Lade, Image fusion techniques based on pyra‑
mid decomposition, International Journal of Artificial Intelligence and Mechatronics,
4, 2014, pp. 127–130.
[6] P. Burt, E. Adelson, Laplacian pyramid as a compact image code, IEEE Transactions
on Communications, 31, 1983, pp. 532–540.
[7] Jianguo Sun, Qilong Han, Liang Kou, Liguo Zhang, Kejia Zhang, and Zilong Jin,
Multi‑focus image fusion algorithm based on Laplacian pyramids, Journal of the
Optical Society of America 35, 2018, pp. 480–490.
[8] H. Olkkonen, P. Pesola, Gaussian pyramid wavelet transform for multiresolution analy‑
sis of images, Graphical Models and Image Processing, 58, 1996, pp. 394–398.
[9] P. Burt, A gradient pyramid basis for pattern selective image fusion, The Society for
Information Displays (SID) International Symposium Digest of Technical Papers, 23,
1992, pp. 467–470.
[10] A. Toet, Image fusion by a ratio of low‑pass pyramid, Pattern Recognition Letters, 9,
1996, pp. 245–253.
[11] H. Anderson, A filter‑subtract‑decimate hierarchical pyramid signal analyzing and syn‑
thesizing technique, U.S. Patent 718–104, 1987.
[12] L. C. Ramac, M. K. Uner, P. K. Varshney, Morphological filters and wavelet based image
fusion for concealed weapon detection, Proceedings of SPIE, 3376, 1998, pp. 110–119.
[13] M. D. Jasiunas, D. A. Kearney, J. Hopf, Image fusion for uninhabited airborne vehicles,
In: Proceedings of IEEE International Conference on Field Programmable Technology,
2002, pp. 348–351.
[14] S. Marshall, G. Matsopoulos, Morphological data fusion in medical imaging, In: IEEE
Winter Workshop on Nonlinear Digital Signal Processing, IEEE, 1993, pp. 6–1.
[15] K. Mikoajczyk, J. Owczarczyk, W. Recko, A test‑bed for computer‑assisted fusion of
multi‑modality medical images, In: Chetverikov, D., Kropatsch, W.G. (eds) Computer
Analysis of Images and Patterns. CAIP 1993. Lecture Notes in Computer Science, vol
719, Springer, 1993, pp. 664–668.
[16] G. Matsopoulos, S. Marshall, J. Brunt, Multiresolution morphological fusion of MR
and CT images of the human brain, In: Vision, Image and Signal Processing, IEE
Proceedings, vol. 141, IET, 1994, pp. 137–142.
[17] V. P. S. Naidu, Bindu Elias, A novel image fusion technique using DCT based Laplacian
pyramid, International Journal of Inventive Engineering and Sciences (IJIES), 1, 2,
2013, pp. 1–18.
[18] S. G. Mallat, A theory for multiresolution signal decomposition, the wavelet represen‑
tation, IEEE Transaction on Pattern Analysis and Machine Intelligence, 11, 7, 1989,
pp. 674–693.
[19] Yong Yang, Dong Sun Park, Shuying Huang, Zhijun Fang, Zhengyou Wang, Wavelet
Based Approach for Fusing Computed Tomography and Magnetic Resonance Images,
IEEE, 2009, pp. 5770–5772.
[20] Yong Yang, Dong Sun Park, Shuying Huang, Nini Rao, Medical image fusion via
an effective wavelet based approach, Journal on Advances in Signal Processing,
579341 2010, 44.
[21] Guihong Qu, Dali Zhang, Pingfan Yan, Medical image fusion using two‑dimensional
discrete wavelet transform, In: Proc. SPIE 4556, Data Mining and Applications, 2001.
doi:10.1117/12.440275
[22] J. Teng, X. Wang, J. Zhang, S. Wang, P. Huo, A multimodality medical image fusion
algorithm based on wavelet transform, In: Tan, Y., Shi, Y., Tan, K.C. (eds) Advances
in Swarm Intelligence. ICSI 2010. Lecture Notes in Computer Science, 6146. 2010,
pp. 627–633.
[40] Y.‑P. Wang, J.‑W. Dang, Q. Li, S. Li, Multimodal medical image fusion using fuzzy
radial basis function neural networks, In: International Conference on Wavelet Analysis
and Pattern Recognition, ICWAPR07, vol. 2, IEEE, 2007, pp. 778–782.
[41] W. Li, X.‑F. Zhu, A new algorithm of multi‑modality medical image fusion based on
pulse‑coupled neural networks, In: Wang, L., Chen, K., Ong, Y.S. (eds) Advances in
Natural Computation, Springer, 2005, pp. 995–1001.
[42] Sharma Dileepkumar Ramlal, Jainy Sachdeva, Chirag Kamal Ahuja, Niranjan
Khandelwal, Multimodal medical image fusion using non‑subsampled shearlet trans‑
form and pulse coupled neural network incorporated with morphological gradient,
Signal, Image and Video Processing, 12, 8, 2018. pp. 1479–1487.
[43] C. T. Kavitha, C. Chellamuthu, R. Rajesh, Multimodal Medical Image Fusion Using
Discrete Ripplet Transform and Intersecting Cortical Model, Procedia Engineering 38,
2012, pp. 1409–1414.
[44] Sudeb Das, M. K. Kindu, NSCT‑based multimodal medical image fusion using
pulse coupled neural network and modified spatial frequency, Medical & Biological
Engineering & Computing, 50, 10, 2012, pp. 1105–1114, doi:10.1007/s11517‑012‑0943‑3.
[45] Y. Liu, X. Chen, R. K. Ward et al., Image fusion with convolutional sparse representa‑
tion, IEEE Signal Processing Letters, 23, 12, 2016, pp. 1882–1886.
[46] Yu Liu, Xun Chen, Juan Cheng, Hu Peng, A medical image fusion method based on
convolutional neural networks, In: 20th International Conference on Information
Fusion Xian, China, 10–13 July 2017.
[47] https://www.image‑net.org/
[48] https://www.oreilly.com/library/view/advanced‑deep‑learning/9781789956177/b2258a
a6‑2c18‑449c‑ac00‑939e812f5a4a.xhtml.
[49] http://spineweb.digitalimaginggroup.ca
[50] Dhirendra Mishra, Bhakti Palkar. Article: Image fusion techniques: Review. International
Journal of Computer Applications, 130, 9, 2015, pp. 7–13.
8 Medical Image Analysis
and Classification
for Varicose Veins
Jyoti Yogesh Deshmukh, Vijay U. Rathod,
Yogesh Kisan Mali, and Rachna Sable
8.1 INTRODUCTION
Digital imaging and medicine are getting more and more integrated into society as
science and technology advance. Recent innovations in technology, such as image
processing and virtual reality, are rapidly finding their way into the medical industry.
The field of digital medicine has advanced quickly in recent years as interdisciplin‑
ary medical research combines with digital art.
The current healthcare system understands that while clinical procedures must be effi‑
cient, patient safety is of utmost importance. These conditions are not at all exclusive of
one another. For instance, when local or regional anaesthesia is used during surgery rather
than more intensive general anaesthesia, both patient safety and economic effectiveness
are maximized. It remains a difficult and open challenge to objectively evaluate the safety of clinical procedures while taking patient comfort and financial expense into account. Many major diseases in India are caused by an unhealthy lifestyle. Between 2005 and 2015, India's proportion of overweight and obese people increased: 21.7% of women and 19.6% of men between the ages of 17 and 41 were determined to be overweight or obese. Obesity is deadly: arthritis, cancer, infertility, heart disease, back pain, diabetes, and stroke are the seven diseases most often connected to obesity, and poor mental health, respiratory issues, hormonal imbalances, and food allergies are also common. Obesity, once widespread mainly in high-income countries, is now present and worsening in low- and middle-income countries. Obesity is mostly caused by an inability to use all of the energy consumed; the excess energy is stored as fat. Fighting this epidemic requires levying higher costs on unhealthy foods, accurate labelling, and the development of environments that promote physical activity. With a few notable exceptions, current therapies for obesity have largely demonstrated only moderate success, and preventative strategies are relatively ineffective. Therefore, a fuller knowledge of the factors that contribute to severe or morbid obesity is urgently needed; this understanding could result in novel and creative intervention strategies.
Accurate analysis of medical data will benefit in the early detection of illnesses,
patient care, and community services as machine learning in biomedicine and health‑
care quickly improves. The quantity and quality of unreported and inferior medical
data, on the other hand, jeopardizes disease analysis accuracy.
In order to identify obese people, this study provides an obesity detection algorithm
that is specifically designed for the Indian population. Additionally, based on the set of
symptoms entered into the system, it predicts which diseases the person is most likely
to get. Even if a person feels healthy, it can indicate whether they are at high or low risk. The system certifies that a person's health is within acceptable bounds and provides the option to evaluate and monitor their health via email. Documents that patients (or their doctors) need to examine or retrieve as needed can also be uploaded and accessed easily.
The most prevalent peripheral vascular disease is varicose veins in the lower
limbs. There are now more than 25 million adults who have varicose veins in their
lower limbs, which affects about 23% of adults. Over 8% of Chinese people have
varicose veins. In addition to compromising aesthetics, varicose veins can induce
thrombophlebitis, venous oedema, cutaneous varicose veins, and an increased risk
of deep vein thrombosis. This could lead to disability and a decrease in employment
capacity. The direct annual cost of treating chronic venous disease in the USA is estimated to be between $2 billion and $2.5 billion. Varicose vein treatment costs in the UK make up 2% of all national health care spending. Conservative therapy and surgical therapy are the two most common conventional treatments for varicose veins; however, roughly one-third of cases recur within 10 years of surgery. Treatments with a low level of invasiveness are being developed. As a result, research into the molecular and cellular mechanisms underlying varicose vein pathogenesis will be crucial in the future for discovering novel treatment targets and creating novel therapeutic approaches.
Hemodynamic variations are directly sensed by vascular endothelial cells (VECs), the unbroken monolayer that lines blood vessels between the blood flow and the vessel wall. It is a mechanical barrier that serves a variety of important roles in the body's physiological and pathological processes, including substance transport and autocrine and paracrine signalling. In many illnesses, particularly those of the lower extremities, its structure and function are aberrant, and it is essential to the growth and spread of varicose veins. A method has been proposed to improve diagnosis accuracy in patients with varicose veins through fundamental image processing [1]; images of vascular endothelial cells can be classified using this method based on various illness situations. Ajitha [2] classified a number of noteworthy features using a range of pattern classifiers before applying an artificial neural network to images of leg veins. Shi et al.'s [3] classification of images of vascular endothelial cells using support vector machines improved early detection of this disease. Zhu et al. [1] proposed employing a grip sensation-based responsive and predictive brain control method to investigate lower extremity varicose veins. Veinidis et al. used 3D mesh sequences to construct an unsupervised human motion search technique to assess whether lower extremities have varicose veins. The aforementioned algorithms, meanwhile, have several drawbacks: vascular endothelial cell data are sparse, network training is challenging, and the adap‑
tive effect is weak. Through consecutive iterations of the training process, deep learn‑
ing uses a hierarchical feature extraction structure to abstract the input image from low level to high level, identifying the most crucial image attributes and extracting deeper and broader features. In order to extract characteristics and classify them, this article
makes use of deep learning algorithms. In this study, lower extremity varicose veins
are detected and classified using features extracted from images of vascular endothe‑
lial cell inflammation using multi‑scale deep learning algorithms. Multiple convolu‑
tional layers were used to extract multi‑scale characteristics from photos of vascular
endothelial cells. In addition, we develop a competitive strategy that can lower the
network layer parameters while extracting more compact features by using the MFM
activation function rather than the ReLU activation function. For dimensionality reduc‑
tion, this network employs a method of 3 * 2 convolution kernel and 1 * 2 convolution
kernels. This can be applied to strengthen the network’s ability to extract features and
further optimize the network’s parameters.
A clinical tissue classification system for identifying varicose veins using medical imaging is proposed in this article. Ulcers account for over 90% of all cases among the many varieties of varicose veins. The system outlined in the proposed study can be used to recognize and categorize varicose veins at any stage. The classification is divided into two sections in this article:
Texture features like homogeneity, energy, entropy, contrast, mean, dissimilar‑
ity, and variance are extracted during the feature extraction stage. To distinguish
between various twisting phases, these collected features are categorized using a
K‑nearest neighbour classifier and a support vector machine.
A significant fraction of people are affected by chronic venous insufficiency
(CVI), which cannot be treated without medical treatment. However, a lot of patients
don’t get their medical advice right away. Physicians also require methods to catego‑
rize patients based on the severity of their CVI at the same time. To help doctors and
patients, we suggest an automated categorization technique dubbed the CVI classifier. A concept classifier in this strategy first maps low-level image features to mid-level semantic elements before building a multi-scale semantic model to develop a semantically rich image representation. Second, a scene classifier is trained to estimate CVI severity using the optimized feature subsets obtained by a high-order dependency-based feature selection approach. Classification performance is measured using the F1 score, kappa coefficient, and classification accuracy.
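The short sketch below shows how those three measures could be computed with scikit-learn; the severity grades and predictions are placeholder arrays, not results from the chapter.

```python
# Sketch of the F1 score, kappa coefficient and accuracy used to evaluate the
# CVI classifier; the label arrays are placeholders.
from sklearn.metrics import f1_score, cohen_kappa_score, accuracy_score

y_true = [0, 1, 2, 1, 0, 2, 2, 1]   # placeholder severity grades
y_pred = [0, 1, 2, 0, 0, 2, 1, 1]   # placeholder predictions

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro F1:", f1_score(y_true, y_pred, average="macro"))
print("Kappa   :", cohen_kappa_score(y_true, y_pred))
```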
Routine outpatient varicose vein surgery is commonly performed under local
anaesthesia. Despite the fact that local anaesthesia is affordable and can reduce
patient risk, some patients endure discomfort during surgery. Thus, the surgical team must decide whether to use general or local anaesthesia based on a subjective assessment of the patient's anxiety and pain sensitivity, which is difficult to verify objectively. To characterize the relationship between changes in cardiovascular response and pain during varicose vein surgery, a three-dimensional polynomial surface fit of physiological data and patients' numerical pain ratings is developed. Heart rate variability data were analysed for spectral and structural complexity features as pain markers shortly before 18 varicose vein procedures. The resulting pain prediction model was validated with a kappa coefficient of 0.82 and an area under the receiver operating curve of 0.98 (near-perfect accuracy). This proof-of-concept study demonstrates the capacity to objectively assess pain sensitivity [4], allowing practitioners to prescribe the safest and least expensive anaesthetic drugs to specific individuals in an unbiased manner.
By creating a cutting‑edge system to optimize machine learning algorithms for
accurately predicting obesity and related disorders among the Indian population,
this work seeks to address the aforementioned constraints. Unhealthy manufactured foods remain readily available despite federal food restrictions. Particularly among younger generations, India's increasingly career-focused lifestyle has led to irregular daily routines that favour indoor activities over outdoor recreation. The emergence of obesity nowadays is also influenced by behavioural and socio-psychological factors such as sleep, stress, race, and hormone imbalances [5].
Capillary blood pressure (CBP) is the main driver of fluid exchange between
microvessels. Prior to apparent peripheral oedema, asymptomatic systemic venous
is varicose veins of the lower extremities, which can cause leg oedema, dis‑
torted superficial vein dilation, severe varicose vein development, localized
bleeding, or infection. Varicose veins in the lower legs have been positively associated with vascular endothelial cell inflammation. The purpose of this
article is to identify and detect leg varicose veins by concentrating on vas‑
cular endothelial cells and extracting attributes from cell images.
B. Multi‑Scale Image Technology Method
Multi-scale image technology is a strategy for representing images at different resolutions. The process of handling images at different scales is known as multi-scale segmentation.
Extracting all features at a single scale can be challenging in many visual imaging applications. In this study, we present a multi-scale technique to facilitate feature extraction and improve feature extraction performance. The key tasks when employing multi-scale image technology are expressing images in multi-scale settings and determining how the scales relate to one another. A common example of a multi-scale depiction of an image is the image pyramid.
Scale-based analysis is a method that uses the image pyramid, which precisely reflects the change in resolution across the layers of the same image. With the largest image at the base and the smallest image at the top, the image size is typically varied in a pyramidal pattern. Like the human visual system, image pyramid structures can represent an image at several scales and describe the full image at each of those scales. Pyramidal images are constructed using down-sampling; consequently, the pyramid level rises as more image detail is lost and the image resolution drops.
However, as demonstrated in Figure 8.2, photographs with lower quality
and size can capture more generic elements. For instance, following multi-scale processing of images of vascular endothelial cells, the resolution decreases with decreasing image size while the sharpness of outlines and features increases; vascular endothelial cell images become more detailed and larger in size as the resolution increases. By extracting features from multi-scale images, image pyramid structures can achieve the goal of condensing multi-scale information, improving the relevance and depth of feature representation. Below is a basic illustration of the main image pyramid types and structures.
1. Sub‑Sampling Pyramid
Image pyramids can be used to graphically portray multi‑scale struc‑
tures. Pyramiding is a common technique that involves applying a low
pass filter to smooth out an image and then down‑sampling the outcome.
Down-sampling lowers the size of the image and changes the resolution between two adjacent layers; it mostly involves discarding alternate rows and columns of pixels from the image. Scale-space theory states that the image size should be decreased using appropriate smoothing filtering; if no smoothing filter is applied, the resulting pyramid is a subsampled pyramid. An image that has been downscaled using a subsampling pyramid has a much lower resolution. For an image subsampling pyramid, alternate rows and columns are sampled to create thumbnail images that are only one-fourth the size of the original image. This procedure is repeated to build successive pyramid levels.
g(x) = (1 / (√(2π) σ)) e^(−x² / (2σ²))    (8.1)

The standard deviation σ is the variable, and its magnitude determines how strongly the signal is smoothed. After discretizing the Gaussian function, we obtain the Gaussian kernel used for convolution; for example, the 5 × 5 Gaussian kernel is

w = (1/256) [1 4 6 4 1; 4 16 24 16 4; 6 24 36 24 6; 4 16 24 16 4; 1 4 6 4 1]    (8.2)
3. The Pyramid of Gaussian Difference
The difference of two Gaussian functions with differing variances approximates the Laplacian of Gaussian, a band-pass filter. The difference-of-Gaussians filter function is

D(x, y, σ1, σ2) = (G(x, y, σ1) − G(x, y, σ2)) * I(x, y) = L(x, y, σ1) − L(x, y, σ2)    (8.3)
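The sketch below illustrates one Gaussian-pyramid step and the difference-of-Gaussians response of Eq. (8.3) with SciPy; the input array is a placeholder image and the sigma values are arbitrary.

```python
# Sketch of Gaussian smoothing, one down-sampled pyramid level, and a
# difference-of-Gaussians band-pass response (Eq. 8.3).
import numpy as np
from scipy.ndimage import gaussian_filter

img = np.random.rand(256, 256)                      # placeholder vascular cell image

# One Gaussian-pyramid step: smooth (Eqs. 8.1-8.2), then drop every other row/column.
smoothed = gaussian_filter(img, sigma=1.0)
next_level = smoothed[::2, ::2]

# Difference of Gaussians (Eq. 8.3): subtract two blurs with different sigmas.
dog = gaussian_filter(img, sigma=1.0) - gaussian_filter(img, sigma=2.0)

print(next_level.shape, dog.shape)
```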
The output of a convolutional layer with input X, kernel W, and bias b is

S(i, j) = (X * W)(i, j) + b = Σ_k (X_k * W_k)(i, j) + b    (8.4)

The sigmoid, tanh, and ReLU activation functions are, respectively,

f(x) = 1 / (1 + e^(−x))    (8.5)

f(x) = (1 − e^(−2x)) / (1 + e^(−2x))    (8.6)

f(x) = max(x, 0)    (8.7)
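A minimal NumPy sketch of the three activation functions in Eqs. (8.5)-(8.7) is shown below; the input vector is an arbitrary example.

```python
# NumPy sketch of the sigmoid, tanh and ReLU activations of Eqs. (8.5)-(8.7).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                            # Eq. (8.5)

def tanh(x):
    return (1.0 - np.exp(-2 * x)) / (1.0 + np.exp(-2 * x))     # Eq. (8.6)

def relu(x):
    return np.maximum(x, 0.0)                                  # Eq. (8.7)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x), tanh(x), relu(x), sep="\n")
```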
2. Pooling Layer
Sampling is the core of a pooling layer. In order to extract the
essential aspects of the input feature map while lowering the net‑
work parameters for improved characterization, the feature map
is somehow compressed. Decreased network parameters are also
helpful in avoiding over‑fitting.
There are two common pooling techniques: max pooling and average pooling. Figures 8.7 and 8.8 illustrate 2 × 2 max pooling with a stride of 2 pixels. In the pooling kernel, only one weight is set to 1 while the rest are set to 0; the position of the 1 marks the maximum value of each region of the feature map. In steps of two pixels, the pooling kernel moves across the feature map, and the feature map is reduced to one quarter of its original size.
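The sketch below implements 2 × 2 max pooling with stride 2 in NumPy, assuming the feature-map dimensions are even; the small input array is only an example.

```python
# Sketch of 2x2 max pooling with stride 2, which shrinks a feature map to one
# quarter of its original size.
import numpy as np

def max_pool_2x2(fmap):
    h, w = fmap.shape
    blocks = fmap.reshape(h // 2, 2, w // 2, 2)   # group pixels into 2x2 blocks
    return blocks.max(axis=(1, 3))                # keep the maximum of each block

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(fmap))    # 2x2 output holding each block's maximum
```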
The texture features listed above are computed from the normalized grey-level co-occurrence matrix P(i, j) as follows.

Homogeneity:  Σ_{i,j} P(i, j) / (1 + (i − j)²)    (8.8)

Energy:  Σ_{i,j} P(i, j)²    (8.9)

Entropy:  −Σ_{i,j} P(i, j) ln P(i, j)    (8.10)

Contrast:  Σ_i Σ_j (i − j)² P(i, j)    (8.11)

Mean:  μ = Σ_j j P(i, j) = Σ_i i P(i, j)    (8.12)

Dissimilarity:  Σ_i Σ_j P(i, j) |i − j|    (8.13)

Variance:  V = Σ_i Σ_j (i − μ)² P(i, j)    (8.14)
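The sketch below shows one way to obtain these texture features with scikit-image from a grey-level co-occurrence matrix; entropy is computed manually since graycoprops does not expose it, and the input image is a placeholder.

```python
# Sketch of GLCM texture feature extraction with scikit-image.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # placeholder ulcer image

glcm = graycomatrix(img, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)

for prop in ("homogeneity", "energy", "contrast", "dissimilarity"):
    print(prop, graycoprops(glcm, prop)[0, 0])

p = glcm[:, :, 0, 0]
p = p[p > 0]
print("entropy", -np.sum(p * np.log(p)))   # Eq. (8.10) computed directly
```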
TABLE 8.1
Confusion Matrix for the CNN Classifier’s Test Set
Test Set  True Positive  False Positive  True Negative  False Negative
T1 14 7 34 6
T2 15 6 33 5
T3 14 7 35 7
TABLE 8.2
Statistical Performance Analysis of CNN and SVM Classifier
CNN Classifier SVM Classifier
Test set Sensitivity Specificity Sensitivity Specificity
T1 0.772929 0.764768 0.896825 0.884729
T2 0.888889 0.955555 1 0.926833
T3 0.823576 0.968254 0.552632 1
FIGURE 8.10 The pictorial sensitivity-specificity analysis of the CNN and SVM classifiers.
The results of the testing phase were used to calculate the aforemen‑
tioned performance metrics. Figure 8.10 shows the image sensitivity spec‑
ificity analyses for the CNN and SVM classifiers. Table 8.3 presents the
findings of the accuracy analysis. Experiment results reveal that the SVM
classifier offers sensitivity (77.94%), specificity (86.78%), and accuracy
(78.84%) while the CNN classifier of the suggested approach offers sensi‑
tivity (72.46%), specificity (85.17%), and accuracy (80.95%).
TABLE 8.3
Accuracy Analysis of CNN and SVM Classifier
CNN Classifier SVM Classifier
Test set Accuracy Accuracy
T1 0.869542 0.845654
T2 0.936488 0.92238
T3 0.918635 0.801268
8.6 CONCLUSION
Now more than ever, early detection and treatment of varicose veins require com‑
puter‑assisted image analysis for histological classification. The suggested model
extracts textural information that is extremely helpful for classifying different
phases of wounding. This strategy also uses two classifiers; thus, comparing the efficacy of both classifiers will aid in the discovery of a more accurate classifier in subsequent research. The results of the classifier for identifying varicose vein stages will be increas‑
ingly accurate as more datasets and processing steps are added to the approach in
the future.
REFERENCES
1. Zhu, Ruizong, Huiping Niu, Ningning Yin, Tianjiao Wu, and Yapei Zhao. “Analysis
of varicose veins of lower extremities based on vascular endothelial cell inflammation
images and multi‑scale deep learning.” IEEE Access 7 (2019): 174345–174358.
2. Ajitha, K. “SVM VS KNN for classification of histopathological images of varicose
ulcer”, Advances in Engineering: an International Journal (ADEIJ) 3, no. 2 (2020):
19–28.
3. Shi, Qiang, Weiya Chen, Ye Pan, Shan Yin, Yan Fu, Jiacai Mei, and Zhidong Xue.
“An automatic classification method on chronic venous insufficiency images.” Scientific
Reports 8, no. 1 (2018): 17952.
4. Barulina, Marina, Askhat Sanbaev, Sergey Okunkov, Ivan Ulitin, and Ivan
Okoneshnikov. “Deep learning approaches to automatic chronic venous disease clas‑
sification.” Mathematics 10, no. 19 (2022): 3571.
5. Adjei, Tricia, Wilhelm Von Rosenberg, Valentin Goverdovsky, Katarzyna Powezka,
Usman Jaffer, and Danilo P. Mandic. “Pain prediction from ECG in vascular surgery.”
IEEE Journal of Translational Engineering in Health and Medicine 5 (2017): 1–10.
6. Pereira, Naomi Christianne, Jessica D’souza, Parth Rana, and Supriya Solaskar.
“Obesity related disease prediction from healthcare communities using machine learn‑
ing.” In: 2019 10th International Conference on Computing, Communication and
Networking Technologies (ICCCNT), pp. 1–7. IEEE, 2019.
7. Liu, Jing, Bryan Yan, Shih‑Chi Chen, Yuan‑Ting Zhang, Charles Sodini, and Ni Zhao.
“Non‑invasive capillary blood pressure measurement enabling early detection and clas‑
sification of venous congestion.” IEEE Journal of Biomedical and Health Informatics
25, no. 8 (2021): 2877–2886.
9 Brain Tumor Detection
Using CNN
Paras Bhat, Sarthak Turki, Vedyant Bhat,
Gitanjali R. Shinde, Parikshit N. Mahalle,
Nilesh P. Sable, Riddhi Mirajkar,
and Pranali Kshirsagar
9.1 INTRODUCTION
The human body is a combination of interrelated parts or networks where each part
is interconnected to the other, and dysfunction in one part shows the impact on the
overall body of the individual. Being such a complex system nature has provided it
with an inbuilt processor which manages all its work and responds to every stimulus
in a reasonable manner. Humans have named it the brain, the most important organ
of human beings. Our existence is immensely dependent on the proper functioning
of the brain that performs most of our tasks, be it controlling voluntary movement,
creating and managing memories, developing thoughts, etc. [1].
Nowadays, processed foods, the use of plastics in day-to-day life, the consumption of adulterated drinks, the increase in smoking among youth, and similar factors have made cancer spread like a forest fire, increasing at a very fast
pace [2]. Some of the brain cancers are depicted in Figure 9.1.
A brain tumor is such a kind of tumor in which the tissues inside the brain start to
grow abnormally creating an extra piece of mass inside the brain which takes away
the nutrients of its surrounding cells, thus resulting in brain failure. The disease is
curable if it is identified at an early stage, which is the most challenging part of this
process as it mildly shows any symptoms at its early stage and often gets skipped
away by doctors. In order to help doctors predict this problem at an early stage, this project aims to give them more time to assess and treat the disease as early as possible. We have tried to build a system that uses CNN and related algorithms to detect brain tumors at an early stage.
Our machine identifies a tumor from a picture and returns the result of whether the
tumor is positive or negative, which makes it useful in situations when we need to be
certain of the tumor status.
The primary goal of brain tumor detection is to classify various tumor forms in
addition to just detecting them. It also serves the purpose of identifying a tumor from
a picture and gives the output of whether the disease is present or not, which makes it
useful in situations when an infected person needs to be certain of the tumor’s status.
This project is focused on developing a system that can recognize tumor blocks in
MRI scans of various patients and prevent the damage it causes in a patient’s life.
while reaching an accuracy of 98.66% and also used the radial basis function and
Decision Tree classifier. The SoftMax Fully Connected Plate used for image clas‑
sification had a classification accuracy of 98.57%.
Deepak et al. [10] used GoogleNet and a deep CNN, and describe the advantage of CNN-based classifier systems: they do not require manually segmented tumor regions and provide a fully automated classifier. Demirhan et al. [11] suggested a segmentation technique for categorizing brain tumor MRIs using stationary wavelet transform and learning vector quantization; segmentation scores were on the order of 0.87 for grey matter, 0.96 for cerebrospinal fluid (CSF), 0.77 for edema, and 0.91 for white matter (WM). Aneja et al. [12] sug‑
gested a segmentation algorithm that uses fuzzy C‑means (FCM) clusters for noise fig‑
ures as well as a fuzzy clustering averaging technique. The cluster validity function, run
time, and convergence error rate of 0.537% are used to evaluate segmentation values.
Yang et al. [13] used the discrete wavelet transform (DWT) with an accuracy of 93.9% and an objective error rate of 6.9%. Badza et al. [14] proposed their own
CNN architecture for three types of brain tumor classification. The proposed model
is more straightforward than the existing pre‑trained models. They used T1W‑MRI
data for the training and testing with tenfold cross‑validation.
These approaches suggest several efficient ways to diagnose a brain tumor. Table 9.1 summarizes the literature work on brain cancer.
TABLE 9.1 (Continued)
Summary of Literature Work on Brain Cancer
Deepak et al. [10]. Methods: GoogleNet and deep CNN. Dataset: Figshare. Obtained accuracy: DCNN 91.2% and SVM classifier 0.98. Advantages: stable and efficient. Discussion: lack of accuracy in the transfer model.
Demirhan et al. [11]. Methods: neural networks, self-organizing maps and wavelets. Dataset: IBSR 2015 and BraTS 2012. Obtained accuracy: WM 90%, GM 87%, hydrops 76%, tumor 60% and CSF 95%. Advantages: enhancement of efficiency in WM, GM and edema. Discussion: cannot be applied to a newly generated dataset.
Aneja et al. [12]. Methods: fuzzy clustering mean algorithm. Dataset: NSL-KDD. Obtained accuracy: FCM 1.173, T2FCM 0.951 and IFCM 0.436. Advantages: reduction of disturbances in training set and size. Discussion: need for updating the misclassification error.
Yang et al. [13]. Methods: DWT. Dataset: GE Healthcare. Obtained accuracy: collected reliability of 93.9% and an objective error rate of 6.9%. Advantages: more work on deduction on SVM. Discussion: crises of model handling.
Badza et al. [14]. Methods: CNN. Dataset: BRATS and CBICA. Obtained accuracy: repeating the fitting procedure 10 times gives an accuracy of 95.08%. Advantages: could be used for differential datasets. Discussion: heavy run-time.
The dataset is taken from github, 253 MRI images with 155 positive instances and 98 neg‑
ative samples. The neural network could not be trained effectively because the dataset was too small. In order to address the problems of data scarcity and imbalance, data augmentation proved helpful. Data
augmentation is used to increase the dataset. The dataset now has 1085 positive examples
and 980 negative examples, for a total 2065 example photos, following data augmentation.
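A minimal sketch of this augmentation step is shown below. It assumes the original MRI slices are organized in class folders under data/yes and data/no (hypothetical paths), and the specific transforms are illustrative rather than the chapter's exact settings.

```python
# Sketch of augmenting the small MRI dataset with Keras; paths and transform
# parameters are illustrative placeholders.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,
    fill_mode="nearest",
)

# Each call to next() yields a batch of augmented copies; saved copies grow the dataset.
flow = augmenter.flow_from_directory(
    "data", target_size=(240, 240), batch_size=16, class_mode="binary",
    save_to_dir="augmented", save_format="jpeg",
)
batch_images, batch_labels = next(flow)
print(batch_images.shape, batch_labels.shape)
```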
The following pre-processing steps were applied to each image:
Images are cropped so that they contain only the brain region (the most important part of the image). Because the neural network accepts images of the same size, the image shape is kept as (240, 240, 3) = (image width, image height, number of channels). Pixel values are scaled to the range 0-1 using normalization.
The data were divided as follows:
The first step involves giving the input to the neural network. The neural network
is given an input image with a shape of (240, 240, 3) for each input image x. When
the image is given as input to the neural network, it traverses the following layers:
The first layer it traverses is the zero‑padding layer which is of the size of (2,2).
Then followed by the zero‑padding layer, there is a convolutional layer which
consists of 32 filters, with a stride of 1 and the filter with the size of (7,7). After
the convolution layer, there is a normalization layer which helps in normaliz‑
ing the pixel values in a batch to speed up computation. After the normalization
layer, there is the activation layer which consists of the activation function. The
activation function we used is the ReLU activation function. After the activa‑
tion layer, there is a max pooling layer with a filter size of 4 and a stride of 4.
Then again, there is an identical layer of Max Pooling with f = 4 and s = 4. After
the pooling layers, there is a flatten layer which converts the three‑dimensional
matrix into a one-dimensional vector. After the flatten layer, there is a fully connected dense layer with a single neuron and sigmoid activation, which produces the output.
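The sketch below assembles the layers just described with the TensorFlow/Keras API. It is a minimal reconstruction under the stated layer sizes; the optimizer and loss are reasonable assumptions rather than values given in the text.

```python
# Sketch of the described architecture: zero-padding, a 32-filter 7x7 convolution,
# batch normalization, ReLU, two 4x4 max-pooling layers, flatten, and a sigmoid unit.
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import (ZeroPadding2D, Conv2D, BatchNormalization,
                                     Activation, MaxPooling2D, Flatten, Dense)

inputs = Input(shape=(240, 240, 3))                 # (width, height, channels)
x = ZeroPadding2D(padding=(2, 2))(inputs)
x = Conv2D(32, kernel_size=(7, 7), strides=(1, 1))(x)
x = BatchNormalization()(x)                         # normalize activations per batch
x = Activation("relu")(x)
x = MaxPooling2D(pool_size=(4, 4), strides=(4, 4))(x)
x = MaxPooling2D(pool_size=(4, 4), strides=(4, 4))(x)
x = Flatten()(x)
outputs = Dense(1, activation="sigmoid")(x)         # tumor present / absent

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```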
MRI images without brain tumor and having brain tumor are shown in Figures 9.4
and 9.5, respectively.
Let’s keep in mind the ratio of positive to negative examples: Start by declaring
the variables m and n_positive for the size of the dataset and the quantity of positive
examples, respectively. Now, we can compute the total examples that are negative
i.e. n_negetive=m‑n_positive
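The tiny snippet below applies this bookkeeping to the augmented dataset sizes stated earlier (2065 total, 1085 positive).

```python
# Class-balance bookkeeping for the augmented dataset.
m = 2065                 # total examples after augmentation
n_positive = 1085        # tumor-positive examples
n_negative = m - n_positive
print(n_negative, n_positive / m, n_negative / m)   # 980 and the class proportions
```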
The graphs in Figures 9.6 and 9.7 show that, as training progresses, the model's training and validation accuracy increase continuously while the training and validation loss decreases continuously.
9.5 CONCLUSION
In this work, we reviewed the available feature‑based research in the literature. We
have implemented the CNN model with 88.7% accuracy and an F1 score of 0.88 on the test set. This model can detect brain cancer, and the results are satisfactory considering the size and balance of the dataset. This will help doctors to identify the disease early and save more lives.
REFERENCES
1. Wu, Wentao et al. (2020). An intelligent diagnosis method of brain MRI tumor segmen‑
tation using deep convolutional neural network and SVM algorithm. Computational
and Mathematical Methods in Medicine, Volume 2020, Special issue.
2. Deepa, S. A. (2016) Review of brain tumor detection from tomography. In: International
Conference on Computing for Sustainable Global Development (INDIACom).
(pp. 3997–4000). IEEE.
3. Siar, M., & Teshnehlab, M. (2019, October). Brain tumor detection using deep neural
network and machine learning algorithm. In 2019 9th international conference on com‑
puter and knowledge engineering (ICCKE) (pp. 363–368). IEEE. https://www.zeeva.in/
know‑everything‑about‑brain‑cancer/
4. Sasikala, S., Bharathi, M., & Sowmiya, B. R. (2018). Lung cancer detection and classifi‑
cation using deep CNN. International Journal of Innovative Technology and Exploring
Engineering, 8(25), 259–262.
5. Jyothi, P., & Singh, A. R. (2022). Deep learning models and traditional automated tech‑
niques for brain tumor segmentation in MRI: a review. Artificial Intelligence Review,
56(4), 2923–2969.
6. Joshi, D., & Goyal, R. (2017). Review of tumor detection in brain MRI images. In:
2019 International Conference on Innovative Trends and Advances in Engineering and
Technology (ICITAET) (pp. 206–209). IEEE.
7. Kiranmayee, B. V., Rajinikanth, T. V., & Nagini, S. (2017, September). Explorative
data analytics of brain tumour data using R. In: 2017 International Conference on
Current Trends in Computer, Electrical, Electronics and Communication (CTCEEC)
(pp. 1182–1187). IEEE.
8. Arya, M., & Sharma, R. (2016). Brain tumor detection through MR images: a review of
segmentation techniques. International Journal of Computer Applications, 975, 8887.
9. Siar, M., & Teshnehlab, M. (2019). Brain tumor detection using deep neural network
and machine learning algorithm. In: 2019 9th International Conference on Computer
and Knowledge Engineering (ICCKE) (pp. 363–368). IEEE.
10. Deepak, S., & Ameer, P. M. (2019). Brain tumor classification using deep CNN features
via transfer learning. Computers in Biology and Medicine, 111, 103345.
11. Demirhan, A., Törü, M., & Güler, I. (2014). Segmentation of tumor and edema along
with healthy tissues of brain using wavelets and neural networks. IEEE Journal of
Biomedical and Health Informatics, 19(4), 1451–1458.
12. Aneja, D., & Rawat, T. K. (2013). Fuzzy clustering algorithms for effective medical image
segmentation. International Journal of Intelligent Systems and Applications, 5(11), 55–61.
13. Yang, G., Nawaz, T., Barrick, T. R., Howe, F. A., & Slabaugh, G. (2015). Discrete wave‑
let transform‑based whole‑spectral and subspectral analysis for improved brain tumor
clustering using single voxel MR spectroscopy. IEEE Transactions on Biomedical
Engineering, 62(12), 2860–2866.
14. Badža, M. M., & Barjaktarović, M. Č. (2020). Classification of brain tumors from MRI
images using a convolutional neural network. Applied Sciences, 10(6), 1999.
10 Explainable Artificial
Intelligence in the
Healthcare
An Era of Commercialization
for AI Solutions
Prasad Raghunath Mutkule, Nilesh Popat Sable,
Parikshit N. Mahalle, and Gitanjali R. Shinde
10.1 INTRODUCTION
The significance of explainable artificial intelligence (XAI) is a term encompassing
techniques that render AI systems interpretable and comprehensible to end‑users. The
importance of XAI is underscored by its applicability in numerous fields. In areas
such as healthcare, ensuring the reliability of AI models is paramount. Furthermore,
AI explainability can lead to new insights in fields such as Physics, Mathematics,
and Chemistry. Equally important is the need for everyday users of AI to understand
its decision‑making processes. Additionally, XAI can aid neuroscience research
in the testing and explanation of hypotheses related to brain activity and learning
mechanisms. Apart from explicating the importance of XAI, this chapter delves into
two XAI methods – model‑agnostic and model‑specific. The universal applicability
and sensitivity analysis feature of model‑agnostic methods positions them as suit‑
able for any ML model. Conversely, model‑specific methods are tied to certain ML
models, employing techniques such as activation maximization and deep network
understanding and visualization. XAI solutions adhere to specific criteria, such as
trustworthiness, transferability, causality, and interactivity, that provide a framework
for assessing the quality and efficiency of XAI methods. Through this investigation
of XAI and its significance, it seeks to demystify AI systems, making them more
transparent and interpretable.
Machine learning (ML) algorithms can produce findings that are understandable
to humans with the help of XAI. An XAI model describes a prediction's expected impact and its potential biases. It assists in AI-aided decision-making by evaluating correctness, fairness, and transparency. A company's ability to explain AI is crucial when it comes to bringing AI models into production. A responsible AI development strategy can also be
10.2.1 Explanation
There is an explanation or reasoning accompanying all outputs. AI systems must
provide proof, support, and rationale in support of each output under the Explanation
principle. Essentially, this theory implies that an explanation can be provided by a
system without it necessarily being justified, appropriate, or instructive in and of
itself; the only requirement is that the system can provide a rational explanation [4,5].
In the current state of technology, this type of XAI technique is being developed and validated extensively, and a number of tools and methods are being implemented at the moment. Under this principle alone, however, the explanations are not subject to any quality criterion.
10.2.2 Meaningful
System explanations are understandable by individual users. A meaningful system is
one that is understood by the recipient of its explanations. A user meets this concept
when the explanation is understood by them and/or helpful to them in completing a
task. A one‑size‑fits‑all approach may not be the best answer based on this principle.
It may be necessary to provide different explanations to different groups of users for
the same system. Users may receive explanations tailored to their needs due to the
Meaningful principle [6–8]. There are many broad groups of users, such as developers versus end-users or lawyers versus judges, and the objectives and desires of such groups may differ. The importance of
certain factors may differ between forensic practitioners and juries. It is also possible
to personalize explanations for individuals using this concept as well. It is found that
there is sometimes a difference in perception between individuals who are observing
the output of the same AI system, for a variety of reasons.
10.2.3 Explanation Accuracy
The explanation accurately describes the process by which the system generates its output. Together, the Explanation and Meaningful principles only require systems to generate explanations that are meaningful to users; they do not require those explanations to accurately reflect how the output was produced. Providing accurate explanations is the essence of the Explanation
Accuracy principle. Correctness of explanation is different from decision accuracy.
Decision accuracy refers to the system's ability to reach the right decision. A system may make an accurate judgement about a situation while the accompanying explanation fails to describe how that result was actually reached [9]. AI researchers have established standard metrics for measuring the accuracy of algorithms and systems, but there is as yet no agreed performance metric for measuring the accuracy of explanations, even though reliable measurement methods are becoming available.
10.4.3 Integrated Gradients
Integrated Gradients is an innovative method for interpreting the predictions of a deep
learning (DL) model. The method attributes a prediction to the input features by
progressively turning on discrete input features and observing how the model's prediction
changes relative to a baseline or masked instance [11,20].
The objective of this method is to identify pivotal features and examine their influ‑
ence on the predictions made by the model. Integrated Gradients offer an optimized
approach by delivering faster computations than SHAP values, holding particular
suitability for DL models. However, this procedure requires the model to possess dif‑
ferentiability for successful implementation. Future research should aim to expand
the applicability of Integrated Gradients to a broader range of models and prediction
tasks [21].
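To make the mechanics concrete, the following is a minimal NumPy sketch of the Integrated Gradients computation, assuming access to a differentiable model through a gradient function; the function names and the toy quadratic model are illustrative assumptions, not part of the method description above.

```python
import numpy as np

def integrated_gradients(f_grad, x, baseline=None, steps=50):
    """Approximate Integrated Gradients attributions for one input.

    f_grad  : callable returning the gradient of the model output w.r.t. an input vector.
    x       : the input to explain (1-D NumPy array).
    baseline: the reference ("masked") input; defaults to all zeros.
    steps   : number of points in the Riemann-sum approximation of the path integral.
    """
    if baseline is None:
        baseline = np.zeros_like(x)
    # Interpolate between the baseline and the input along a straight path.
    alphas = np.linspace(0.0, 1.0, steps)
    grads = np.array([f_grad(baseline + a * (x - baseline)) for a in alphas])
    avg_grad = grads.mean(axis=0)          # average gradient along the path
    return (x - baseline) * avg_grad       # per-feature attribution

# Toy differentiable "model" f(x) = sum(w * x**2), with analytic gradient 2*w*x.
w = np.array([0.5, -1.0, 2.0])
f_grad = lambda x: 2.0 * w * x
print(integrated_gradients(f_grad, np.array([1.0, 2.0, 3.0])))
```

For this toy model the attributions sum to f(x) minus f(baseline), which is the completeness property that makes the method attractive for checking pivotal features.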
With the assistance of Natural Language Processing (NLP), AI technology can process and
store large quantities of information, enabling knowledge bases to be built and potentially
improving clinical decision support by facilitating examinations and patient-specific
recommendations.
10.8.4 Nurses on Call
Virtual nursing assistants backed by AI systems can direct patients to the most suitable
and effective care units. These virtual nurses can answer almost all queries around the
clock and can provide examinations and instant guidance. Many AI-powered applications
already allow patients to interact with their care providers more regularly between visits
to the doctor's office, which helps avoid unnecessary hospital visits. Care Angel, which
combines AI with voice controls, is described as the world's first virtual nurse assistant.
10.8.5 Diagnosing Accurately
With AI, doctors can diagnose diseases more accurately, predict them more effectively, and
reach a diagnosis faster. AI algorithms have proven effective and cost-effective in
diagnosing diabetic retinopathy and in detecting other diseases. PathAI, for example, is
helping pathologists make better diagnoses; the company is working to reduce errors in
cancer diagnosis and to develop methods for individualized treatment.
10.9.1 Injury/Error
There is a significant risk that AI in healthcare will at times be wrong and cause injury
or dire health consequences to the patient; for instance, it might suggest the wrong
medication or fail to locate a tumour on a radiology scan. AI errors differ from human
errors for at least two reasons. Most significantly, an error in a widely deployed AI
system may injure thousands of patients, whereas an individual medical professional's
error typically harms only the patients under that professional's care.
10.9.3 Security
Many patients are concerned that data is being collected and exchanged between health
systems and AI developers for the purpose of enabling AI systems, and such data sharing
has in some cases led to lawsuits. AI systems can also infer private information about
patients that they have never disclosed themselves, which raises a further issue regarding
their use.
10.9.5 Changing Professions
Medical professions may undergo significant changes in the future as a result of the use
of AI systems; in areas such as radiology, much of the work may become largely automated.
Heavy reliance on AI raises the concern that, as human knowledge and capacity decline over
time, humans will become increasingly unable to detect AI errors or to continue developing
medical knowledge.
clinicians and the discord between the preferences of ML experts and clinicians for
different types of explanations. Resolving these challenges effectively is advocated in
order to establish more dependable and transparent ML models in healthcare systems, thus
ensuring their broader acceptability and application.
10.11.4 Assistive Intelligence
The role and limitations of ML algorithms in critical domains such as healthcare are
explored here. While ML aims to automate decision-making processes, human supervision
remains essential, particularly in safety-critical applications. Although ML systems can
function as beneficial medical assistants, they should not be entrusted entirely with
patient care, because accurate data and human intervention are still needed. This argues
for a human-in-the-loop framework in healthcare and stresses the need to develop XAI
mechanisms that ensure transparency and accountability in ML systems used in healthcare.
10.12.3 Model Performance
Without explainability, model users lack the awareness needed to track the model's
behaviour.
10.12.4 Regulatory Standards
Users cannot verify whether the system complies with regulatory standards; as a result,
trust in the system suffers.
components correlate with other insights. In some industries, a number of long-established
techniques remain deeply embedded in the culture, yet powerful ML algorithms are finding
new applications in these fields. In the nascent field of medical ML, implementations of
existing or custom-developed interpretability techniques remain fragmented and
experimental. Despite the current focus on improving the accuracy and performance of
feature selection and extraction, interpretability research may still hold large untapped
potential.
10.14 CONCLUSION
This chapter has significantly explored the profound relevance of XAI in healthcare,
underlining its critical role in high‑stakes decision‑making processes that pervade
the medical field. In light of the potentially severe consequences of incorrect predictions
by AI models, the chapter underscores the need to develop and adopt techniques for building
AI applications that help users comprehend the model's output and predictions. It detailed
various methods, grouping them into six categories, and examined diverse tools and
techniques specific to healthcare's unique demands. It also highlighted the clear demand
for integrated explainability tools combining XAI, ML, and DL, especially for delicate
medical procedures that call for the utmost precision. Given the increasing demand among
medical professionals for trustworthy and transparent AI models, enhanced collaboration
between data scientists and medical experts is indispensable for the design and successful
development of efficient XAI systems. Successfully harnessing this alliance can lead to a
better understanding of disease causation, better measurement of medication effects, and
improved patient satisfaction overall. Finally, the chapter points to further research
sources that reinforce its arguments and provide groundwork for future work on XAI in
healthcare, including deeper analysis of areas such as ethics and user-centred design in
AI development. This ultimately paves the way for maximizing the use of AI, and
particularly XAI, in healthcare, advancing medical practice towards greater progress and
efficiency.
REFERENCES
1. Arrieta AB, Díaz‑Rodríguez N, Del Ser J, Bennetot A, Tabik S, Barbado A et al.
Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and
challenges toward responsible AI. Information Fusion. 2020;58:82–115.
2. Angelov PP, Soares EA, Jiang R, Arnold NI, Atkinson PM. Explainable artificial
intelligence: An analytical review. Wiley Interdisciplinary Reviews: Data Mining and
Knowledge Discovery. 2021;11(5):e1424.
3. Samek W, Wiegand T, Müller K‑R. Towards explainable artificial intelligence,
Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pp. 5–22.
doi:10.1007/978‑3‑030‑28954‑6_1. 2017.
4. Ribeiro MT, Singh S, Guestrin C. Why should I trust you? Explaining the predictions
of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, 2016. San Francisco, USA.
23. Das A, Rad P. Opportunities and challenges in explainable artificial intelligence (xai):
A survey (2020) pp. 229–239, vol‑11.
24. Alonso JM, Catala A, editors. Proceedings of the 1st Workshop on Interactive
Natural Language Technology for Explainable Artificial Intelligence (NL4XAI 2019).
Proceedings of the 1st Workshop on Interactive Natural Language Technology for
Explainable Artificial Intelligence (NL4XAI 2019) (2019). Turkey.
25. Almasri A, Alkhawaldeh RS, Çelebi E. Clustering‑based EMT model for predicting
student performance. Arabian Journal for Science and Engineering. 2020;45:10067–78.
26. Warman A, Warman PI, Sharma A, Parikh P, Warman R, Viswanadhan N et al.
Interpretable artificial intelligence for COVID‑19 diagnosis from chest CT reveals
specificity of ground‑glass opacities (2020). doi:10.1101/2020.05.16.20103408.
27. Zhu S, Fan W, Yang S, Pardalos PM. Scheduling operating rooms of multiple hospi‑
tals considering transportation and deterioration in mass‑casualty incidents. Annals of
Operations Research. 2023;321(1‑2):717–53.
28. Lan S, Fan W, Shao K, Yang S, Pardalos PM, editors. Medical Staff Scheduling
Problem in Chinese Mobile Cabin Hospitals During Covid‑19 Outbreak. Learning and
Intelligent Optimization: 15th International Conference, LION 15, Athens, Greece,
June 20–25, 2021, Revised Selected Papers 15, Springer, 2021.
29. Lan S, Fan W, Yang S, Pardalos PM, Mladenovic N. A survey on the applications of vari‑
able neighborhood search algorithm in healthcare management. Annals of Mathematics
and Artificial Intelligence. 2021;12:1–35.
30. Anand L, Rane, K.P., Bewoor, L.A., Bangare, J.L., Surve, J., Raghunath, M.P.,
Sankaran, K.S., and Osei, B., 2022. Development of machine learning and medical
enabled multimodal for segmentation and classification of brain tumor using MRI
images. Computational Intelligence and Neuroscience, 2022;29:1–8.
31. Fan W, Liu J, Zhu S, Pardalos PM. Investigating the impacting factors for the healthcare
professionals to adopt artificial intelligence‑based medical diagnosis support system
(AIMDSS). Annals of Operations Research. 2020;294:567– 92.
32. Abidin, S., Prasad Raghunath, M., Rajasekar, P., Kumar, A., Ghosal, D., and Ishrat,
M., 2022. Identification of disease based on symptoms by employing ML. In: 2022
International Conference on Inventive Computation Technologies, Nepal (ICICT).
33. Horáček J, Koucký V, Hladík M. Novel approach to computerized breath detection in
lung function diagnostics. Computers in Biology and Medicine. 2018;101:1–6.
34. Pardalos, P. M., Georgiev, P. G., Papajorgji, P., and Neugaard, B. (eds.). Systems analy‑
sis tools for better health care delivery. Springer Science & Business Media, 2013;
vol. 74.
35. Alves, C.J., Pardalos, P.M., and Vicente, L.N. (Eds.). Optimization in medicine. Springer
Science & Business Media, 2007; vol. 12.
36. Roshanzamir, M. et al. (2023) Quantifying uncertainty in automated detection of
Alzheimer's patients using deep neural network [Preprint]. doi:10.20944/preprints
202301.0148.v1.
11 Role of Data-Centric Artificial Intelligence in Agriculture
Rajkumar Patil, Nilesh Popat Sable,
Parikshit N. Mahalle, Gitanjali R. Shinde,
Prashant Dhotre, and Pankaj Chandre
11.1 INTRODUCTION
Indian agriculture has traversed a remarkable journey from an era where farmers
relied on age‑old techniques and manual labor to the present, characterized by the
integration of modern tools and technology. In the past, agricultural practices were
deeply rooted in traditional wisdom, with manual plowing, hand sowing, and rudi‑
mentary irrigation methods being the norm. As time progressed, the Green Revolution
introduced improved seeds, fertilizers, and mechanized equipment, catapulting pro‑
ductivity. Today, precision agriculture, satellite imagery, IoT devices, and data‑driven
insights have reshaped Indian farming, enhancing efficiency and sustainability while
bridging the gap between historical practices and cutting‑edge innovation [1].
The realm of agriculture is undergoing a transformative evolution, powered by
the synergistic integration of Data‑Centric Artificial Intelligence (AI) [2–4]. This
convergence holds the promise of addressing pressing challenges in food production,
resource allocation, and sustainability. With an ever‑growing global population and
the escalating impact of climate change, the need for innovative solutions in agricul‑
ture has never been more critical.
The advent of AI has revolutionized the way we perceive and harness data.
Agriculture, a sector deeply rooted in empirical knowledge and practice, is now
embracing the data‑driven paradigm. Data, often referred to as the new “oil,” has
become a valuable asset for decision‑makers across the agricultural spectrum. By
capturing and analyzing a wealth of information ranging from climate patterns, soil
health, crop growth, and market trends, AI empowers stakeholders with insights
that were once beyond reach [5]. The introduction of AI in agriculture is not merely
a technological shift, but a strategic shift in the very fabric of farming practices.
Traditional approaches, while valuable, often grapple with inefficiencies and uncer‑
tainties that can limit productivity. In contrast, Data‑Centric AI offers the potential
to enhance precision, optimize resource utilization, and minimize environmental
impact. Through advanced predictive models, decision support systems (DSSs), and related
data-driven tools, farming can become more efficient and sustainable while using fewer
resources to fulfill the needs of the world's rising
population. Figure 11.1 shows the potential application of AI in Agriculture.
The coronavirus pandemic and the war in Ukraine, both of which exacerbated a labor
shortage, had a negative impact on the record levels of crop production in
2022 [11]. According to the researcher’s findings in Ref. [12], the widespread adop‑
tion of AI and Precision Agriculture tools has the potential to significantly lower
operating expenses as a percentage of revenue, from 42% to 33%. A potential $67
billion market opportunity could arise from this. Figure 11.2 shows global annual
operating cost compared with the use of AI for the year 2022 for Corn, Wheat, and
Soybean. Also, when autonomous technology becomes more widely used in agricul‑
ture, businesses may be able to create recurring revenue streams with margins com‑
parable to software‑as‑a‑service business models. It’s crucial to remember that, even
FIGURE 11.2 Global annual agricultural operating cost. Source: ARK Investment
Management LLC, 2023, based on data from USDA as of July 12, 2023 [12].
though the emphasis has been on cost reduction, autonomous solutions can increase
crop yields by reducing the need for human labor, particularly at night or during cru‑
cial agronomic periods. Due to labor constraints, many farmers sometimes struggle
to procure labor‑dependent machinery for field activities. Autonomous technologies
have the potential to dramatically improve crop output prospects while addressing
this difficulty.
We hold the belief that AI and Precision Agriculture have the potential to bring
about the most significant advancements in agriculture since the introduction of the
tractor a century ago. In our perspective, these innovations have the capacity to boost
farm profitability, reduce food costs, and meet the growing global need for crops,
ultimately enhancing the efficiency and sustainability of farming on a global scale
within the next 5–10 years.
and cutting down on food waste. In the end, data‑driven agriculture promotes sus‑
tainability, boosts yields, and aids in feeding a growing world population while
minimizing negative environmental effects. In Ref. [13], the author highlights the
pivotal role of efficient data management in driving the exponential growth of Smart
Farming in modern agriculture. It emphasizes the use of data in supporting pro‑
ducers’ crucial decisions, with a focus on maximizing productivity and sustainabil‑
ity through unbiased data gathered from sensors. It has been demonstrated that the
future of sustainable agriculture will be built on the integration of AI and data‑driven
tactics with robotic solutions. The review thoroughly examines the entire range of
advanced farm management systems, from data collection in crop fields to variable
rate applications, highlighting their potential to improve resource utilization, lessen
environmental impact, and revolutionize food production to address the challenges
of impending population growth.
The primary data collected from crops must be processed efficiently to transform
numerous images into meaningful and clear-cut information. Farmers in traditional settings
rely on visual inspections and their own experience to make crop management decisions,
without the use of technology. Farms with cutting-edge technology tend to adopt a more
data-driven strategy: sensors gather factual information about the environment, the soil,
and the crops, and the data is then processed using AI algorithms and filtering techniques
to help farmers make wise decisions. This information-based management cycle for advanced
agriculture, shown in Figure 11.3, runs from data gathering to action and comes to an end
after harvest. Precision agriculture has revolutionized the way farmers maximize their
crop yields and manage resources.
transformed into a crucial instrument for modern farming, fostering innovation and
efficiency in a sector entrusted with feeding a growing global population while con‑
tending with a variety of environmental and economic concerns.
70% by the year 2050 [17]. This is particularly crucial as the world faces the chal‑
lenge of increasing global food production by 60% by 2050 to support a growing
population expected to reach over nine billion [18].
One of the primary benefits of IoT implementation is the achievement of higher
crop yields and cost reduction. For instance, studies conducted by OnFarm revealed
that, on average, farms using IoT experience a 1.75% increase in crop yields, while
energy costs decrease by $17–$32 per ha, and the water usage for irrigation dimin‑
ishes by 8% [15].
A. Sensor Technologies
Sensor technologies have played a significant role in revolutionizing agri‑
culture by enabling farmers to monitor and manage their crops, livestock,
and overall farm operations more efficiently. These sensors gather data on
many environmental aspects and offer insightful analysis to enhance deci‑
sion‑making and optimize resource allocation [19]. In agriculture, the fol‑
lowing prominent sensor technologies are employed:
1. Soil Sensors
a. Soil Moisture Sensors: Let farmers decide when and how much to
irrigate crops, therefore saving water and enhancing crop develop‑
ment. These sensors assess the moisture content in the soil.
b. Soil pH Sensors: Measure the acidity or alkalinity of the soil, which
is important for managing nutrients and choosing the right crops.
2. Weather Stations
a. Weather Stations: Combine numerous sensors, including temperature,
humidity, wind speed, and rainfall detectors, and may incorporate other
sensor types as well. They offer real-time weather data for accurate
forecasts and for planning farming activities.
3. Crop Health Sensors
a. Remote Sensing: To track crop health, find illnesses, and determine
nutrient levels, satellite‑ or drone‑based sensors collect photos and
data.
b. Hyperspectral Imaging: This cutting-edge technique examines the
light reflected from crops to detect minute changes in plant health.
4. Livestock Monitoring Sensors
a. RFID Tags and GPS: Used in the location and movement monitor‑
ing of cattle. Information on breeding and health can also be stored
on RFID tags.
b. Wearable Sensors: Sensors such as accelerometers and temperature gauges
can be fastened to an animal to monitor its health, behavior, and
well-being.
5. Environmental Sensors
a. Air Quality Sensors: Measure factors such as air temperature,
humidity, and gas concentrations to ensure optimal conditions for
livestock and crops.
11.3.1 Data Preprocessing
Preprocessing the data is an essential stage in agricultural data analysis because it
guarantees that the information that will be used for decision‑making and analysis is
correct, dependable, and consistent. Figure 11.5 illustrates some key steps and factors
involved in preprocessing of data for agricultural purposes.
Data preprocessing is an iterative process, and the choice of preprocessing steps depends
on the particular agricultural application and on the characteristics of the data.
Efficient preprocessing can result in better predictions and insights, which in turn lead
to improvements in agricultural decision-making, whether the prediction concerns crop
production, disease detection, or resource allocation.
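As an illustration of the kind of steps Figure 11.5 refers to, the following is a minimal pandas sketch of cleaning field-sensor data; the column names, values, and thresholds are hypothetical placeholders rather than a prescribed pipeline.

```python
import pandas as pd
import numpy as np

# Hypothetical field-sensor log; columns and values are illustrative only.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2023-07-01 06:00", "2023-07-01 06:00", "2023-07-01 06:30",
         "2023-07-01 07:00", "2023-07-01 07:30", "2023-07-01 08:00"]),
    "sensor_id": ["S1", "S1", "S1", "S1", "S1", "S1"],
    "soil_moisture": [34.2, 34.2, np.nan, 250.0, 33.1, 32.8],  # gap and an impossible spike
    "soil_ph": [6.4, 6.4, 6.5, 6.5, 6.6, 6.6],
})

df = df.drop_duplicates()                                      # duplicated transmissions
df.loc[~df["soil_moisture"].between(0, 100), "soil_moisture"] = np.nan  # impossible values
df = df.sort_values("timestamp")
df["soil_moisture"] = df.groupby("sensor_id")["soil_moisture"].transform(
    lambda s: s.interpolate(limit=4))                          # fill short gaps per sensor

hourly = (df.set_index("timestamp")                            # common hourly time base
            .groupby("sensor_id")
            .resample("1H")
            .mean(numeric_only=True)
            .reset_index())

num_cols = ["soil_moisture", "soil_ph"]                        # scale features to [0, 1]
hourly[num_cols] = (hourly[num_cols] - hourly[num_cols].min()) / (
    hourly[num_cols].max() - hourly[num_cols].min())
print(hourly)
```

The same pattern of de-duplication, range checking, gap filling, resampling, and scaling applies whatever the downstream model is.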
to find trends and make predictions. For instance, ML models can assist in early
disease detection, enabling prompt action, and minimizing agricultural losses by
analyzing data on crop health and environmental conditions. DL, with its neural
network architectures, has made significant strides in agriculture. Convolutional
neural networks (CNNs) are excellent in image analysis, which makes them the best
choice for identifying plant diseases. CNNs can quickly and effectively identify
illnesses by studying images of leaves or fruits, eliminating the need for manual
checks and resulting in healthier crops [26]. Figure 11.6 shows the architecture of
ML and DL in agriculture.
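As a hedged illustration of the CNN-based approach described above, the sketch below defines a small Keras image classifier for leaf images; the class count, image size, and layer sizes are illustrative assumptions, not a reference architecture from the cited studies.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 4          # illustrative: e.g. healthy plus three leaf diseases
IMG_SIZE = (128, 128)

# A small CNN for classifying leaf images; the architecture is only a sketch.
model = models.Sequential([
    layers.Input(shape=IMG_SIZE + (3,)),
    layers.Rescaling(1.0 / 255),                 # normalize pixel values
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training would use a labelled image folder (hypothetical path), e.g.:
# train_ds = tf.keras.utils.image_dataset_from_directory(
#     "leaf_images/", image_size=IMG_SIZE, batch_size=32)
# model.fit(train_ds, epochs=10)
```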
Another DL method used in agriculture for analyzing time-series data, such as weather
forecasting, is the recurrent neural network (RNN). RNNs can analyze large amounts of
historical meteorological data to predict future conditions, assisting
farmers in making well‑informed choices about when to sow, irrigate, and harvest.
This enhances resource allocation and minimizes the impact of weather‑related
risks on crop yields [27]. Generative adversarial networks (GANs) play a unique role
by generating synthetic data to simulate various environmental conditions. Without
the requirement for actual field testing, this synthetic data helps with crop plan‑
ning and scenario testing. For optimal production and resource efficiency, farmers
can utilize GANs to optimize resource allocation, irrigation plans, and crop rota‑
tion techniques [28]. Figure 11.7 illustrates the agricultural challenges addressed by
various research projects, along with the AI‑powered solutions employed for each of
these endeavors. These projects differ in their goals, approaches, and the materials
utilized [29].
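A minimal sketch of the RNN idea follows: a short LSTM model trained on sliding windows of a temperature-like series. The synthetic series and window length are assumptions for illustration; a real deployment would use historical weather records.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def make_windows(series, window=30):
    """Turn a 1-D series into (window -> next value) training pairs."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., None], y            # add a feature dimension for the LSTM

# Illustrative daily temperature-like series (synthetic stand-in for weather logs).
temps = np.sin(np.linspace(0, 20, 1000)) + np.random.normal(0, 0.1, 1000)
X, y = make_windows(temps)

model = models.Sequential([
    layers.Input(shape=(30, 1)),
    layers.LSTM(32),
    layers.Dense(1),                  # predict the next day's value
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)
```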
Due to the tremendous capabilities that ML and DL technologies have given
farmers for decision‑making and resource optimization, agriculture has undergone a
revolution. These innovations improve weather forecasting, disease detection, yield
prediction, and crop management. Farmers may contribute to sustainable agricul‑
tural practices that are essential for feeding a growing world population by utiliz‑
ing the potential of ML and DL, which can also help them enhance production and
minimize resource waste.
FIGURE 11.7 Conceptualization of the several agricultural operations and AI‑related tech‑
nologies of the European research projects described in [29].
agriculture while protecting the sensitive data that drives its development, in this
harmonious synergy of technology, legislation, and monitoring [33].
11.7 CONCLUSION
This chapter has explored the pivotal role of data‑centric AI in agriculture. The con‑
vergence of AI and agriculture is a promising frontier with enormous potential to
change how we think about modern farming. Effective use of AI technology depends
on an understanding of the significance of data in contemporary agriculture. We
explored the crucial components of data collection, highlighting the IoT’s contribu‑
tion to the collection of important data from agricultural activities. Once gathered,
this data undergoes a substantial transformation known as data preprocessing, which makes
it possible to apply ML and DL algorithms designed specifically for agriculture. DSSs have
emerged as an essential AI application in agriculture, offering farmers insightful
information and suggestions for improving their farming methods. We
also talked about the difficulties that must be overcome in order to fully realize AI’s
promise in agriculture, such as issues with data security and privacy, integration with
conventional agricultural expertise, and the requirement for scaling AI solutions for
smallholder farmers.
We also looked at real‑world examples of AI being successfully used in agri‑
culture, with Telangana serving as a shining example of how AI can improve agri‑
cultural practices. We also discussed the major advancements being made in this
industry by Australian start‑ups. The application of data‑centric AI in agriculture has
the potential to revolutionize the sector by presenting fresh approaches to long-standing
problems. To ensure sustainable and effective agricultural production in the
years to come, it will be essential to handle the difficulties and seize the potential
given by AI in agriculture.
REFERENCES
1. Rozenstein, O., Cohen, Y., Alchanatis, V., et al. "Data-driven agriculture and
sustainable farming: friends or foes?" Precision Agriculture (2023). doi:10.1007/
s11119-023-10061-5
2. Singh, P. “Systematic review of data‑centric approaches in artificial intelligence
and machine learning.” Data Science and Management 6, no. 3 (2023): 144–157.
doi:10.1016/j.dsm.2023.06.001.
3. Zha, D., Bhat, Z.P., Lai, K.H., Yang, F., Jiang, Z., Zhong, S., and Hu, X. “Data‑centric
artificial intelligence: a survey.” (2023). arXiv preprint arXiv:2303.10158.
4. Salehi, S., and Schmeink, A. “Data‑centric green artificial intelligence: a survey.” IEEE
Transactions on Artificial Intelligence. doi:10.1109/TAI.2023.3315272.
5. AlZubi, A.A., and Galyna, K. “Artificial intelligence and internet of things for
Sustainable Farming and Smart Agriculture.” IEEE Access 11 (2023): 78686–78692.
doi:10.1109/ACCESS.2023.3298215.
6. Elbasi, E. et al. “Artificial intelligence technology in the agricultural sector: a systematic
literature review.” IEEE Access 11 (2023): 171–202. doi:10.1109/ACCESS.2022.3232485.
7. Kobayashi, T., Yokogawa, T., Igawa, N., Sato, Y., Fujii, S., and Arimoto, K. “A Compact
low power AI module mounted on drone for plant monitor system.” In: 2019 8th
International Congress on Advanced Applied Informatics (IIAI‑AAI), Toyama, Japan,
2019, pp. 1081–1082. doi:10.1109/IIAI‑AAI.2019.00236.
8. Tantalaki, N., Souravlas, S., and Roumeliotis, M. “Data‑driven decision making in pre‑
cision agriculture: the rise of big data in agricultural systems.” Journal of Agricultural
& Food Information 20, no. 4 (2019): 344–380. doi:10.1080/10496505.2019.1638264
9. Wakchaure, M., Patle, B.K., and Mahindrakar, A.K. “Application of AI techniques and
robotics in agriculture: a review.” Artificial Intelligence in the Life Sciences 3 (2023):
100057. doi:10.1016/j.ailsci.2023.100057.
10. Dilmurat, K., Sagan, V., and Moose, S. “AI‑driven maize yield forecasting using
unmanned aerial vehicle‑based hyperspectral and lidar data fusion.” ISPRS Annals
of the Photogrammetry, Remote Sensing and Spatial Information Sciences 3 (2022):
193–199. doi:10.5194/isprs‑annals‑V‑3‑2022‑193‑2022.
11. U.S. Senate Committee on Agriculture, Nutrition, and Forestry. “Revisiting farm pro‑
duction expenses.” https://www.agriculture.senate.gov/newsroom/minority‑blog/revis‑
iting‑farm‑production‑expenses (accessed September 9, 2023).
12. ARK Invest. “Will the convergence between artificial intelligence and precision
agriculture lower farming costs?” ARK Invest (2023). https://ark‑invest.com/articles/
analyst‑research/will‑the‑convergence‑between‑artificial‑intelligence‑and‑preci‑
sion‑agriculture‑lower‑farming‑costs/ (accessed September 9, 2023).
13. Saiz‑Rubio, V., and Rovira‑Más, F. “From smart farming towards agriculture 5.0:
a review on crop data management.” Agronomy 10, no. 2 (2020): 207. https://doi.
org/10.3390/agronomy10020207.
14. Brown, A. “What is IoT in agriculture? Farmers aren’t quite sure despite $4bn US oppor‑
tunity‑report.” AgFunderNews (accessed September 2, 2023). https://agfundernews.
com/iot‑agriculture‑farmers‑arent‑quite‑sure‑despite‑4bn‑us‑opportunity.html.
15. Gralla, P. “Precision agriculture yields higher profits, lower risks.” HPE Insights
(June 2018). https://www.hpe.com/us/en/insights/articles/precision‑agriculture‑
yields‑higher‑profits‑lower‑risks‑1806.html (accessed September 2, 2023).
16. Tzounis, A., Katsoulas, N., Bartzanas, T., and Kittas, C. “Internet of things in agri‑
culture, recent advances and future challenges.” Biosystems Engineering 164 (2017):
31–48. doi:10.1016/j.biosystemseng.2017.09.007
17. Sarni, W., Mariani, J., and Kaji, J. “From dirt to data: the second green revolution
and IoT.” Deloitte Insights. https://www2.deloitte.com/insights/us/en/deloitte‑review/
issue‑18/second‑green‑revolution‑and‑internet‑of‑things.html (accessed September 2,
2023).
18. Mykleby, M., Doherty, P., and Makower, J. The New Grand Strategy: Restoring
America’s Prosperity, Security, and Sustainability in the 21st Century. St. Martin’s
Press (2016).
19. Shaikh, F.K., Karim, S., Zeadally, S., and Nebhen, J. “Recent trends in inter‑
net‑of‑things‑enabled sensor technologies for smart agriculture.” IEEE Internet of
Things Journal 9, no. 23 (2022): 23583–23598. doi:10.1109/JIOT.2022.3210154.
20. Shafi, U., Mumtaz, R., García‑Nieto, J., Hassan, S.A., Zaidi, S.A.R., and Iqbal, N.
“Precision agriculture techniques and practices: from considerations to applications.”
Sensors 19, no. 17 (2019): 3796. doi:10.3390/s19173796.
21. Nachankar, M. “Challenges of big data in agriculture: data collection and integra‑
tion.” International School of Advanced Management. https://isam.education/en/
challenges‑of‑big‑data‑in‑agriculture‑data‑collection‑and‑integration (accessed
September 2, 2023).
22. Wijaya, R., and Pudjoatmodjo, B. “An overview and implementation of extraction‑trans‑
formation‑loading (ETL) process in data warehouse (case study: department of agri‑
culture).” In: 2015 3rd International Conference on Information and Communication
Technology (ICoICT), 70–74. Nusa Dua, Bali, Indonesia, 2015. doi:10.1109/
ICoICT.2015.7231399.
23. Bocca, F.F., and Rodrigues, L.H.A. “The effect of tuning, feature engineering, and fea‑
ture selection in data mining applied to rainfed sugarcane yield modeling.” Computers
and Electronics in Agriculture 128 (2016): 67–76. doi:10.1016/j.compag.2016.08.015.
24. Sabarina, K., and Priya, N. “Lowering data dimensionality in big data for the benefit of
precision agriculture.” Procedia Computer Science 48 (2015): 548–554. doi:10.1016/j.
procs.2015.04.1345.
25. Yanwei, Y., Ling, X., Fuhua, J., Dafang, G., Sa, A., and Kang, N. “Experimental opti‑
mization of big data cleaning method for agricultural machinery.” Nongye Jixie Xuebao
Transactions of the Chinese Society of Agricultural Machinery 52, no. 6 (2021). https://
nyjxxb.net/index.php/journal/article/view/1190
26. Durai, S.K.S., and Shamili, M.D. “Smart farming using machine learning and deep
learning techniques.” Decision Analytics Journal 3 (2022): 100041. doi:10.1016/j.
dajour.2022.100041.
27. Saini, U., Kumar, R., Jain, V., and Krishnajith, M.U. “Univariant time series forecast‑
ing of agriculture load by using LSTM and GRU RNNs.” In: 2020 IEEE Students
Conference on Engineering & Systems (SCES), Prayagraj, India, pp. 1–6, 2020.
doi:10.1109/SCES50439.2020.9236695.
28. Lu, Y., Chen, D., Olaniyi, E., and Huang, Y. “Generative adversarial networks (GANs)
for image augmentation in agriculture: a systematic review.” Computers and Electronics
in Agriculture 200 (2022): 107208. doi:10.1016/j.compag.2022.107208.
29. Linaza, M.T. et al. 2021. “Data‑driven artificial intelligence applications for sustainable
precision agriculture.” Agronomy 11 (6): 1227. doi:10.3390/agronomy11061227.
30. Borrero, J.D., and Mariscal, J. “A case study of a digital data platform for the agricul‑
tural sector: a valuable decision support system for small farmers.” Agriculture 12, no.
6 (2022): 767. doi:10.3390/agriculture12060767.
31. Nikhil, R., Anisha, B.S., and Kumar, P.R. “Real‑time monitoring of agricultural land
with crop prediction and animal intrusion prevention using internet of things and
machine learning at edge.” In: 2020 IEEE International Conference on Electronics,
Computing and Communication Technologies (CONECCT), Bangalore, India, pp. 1–6,
2020. doi:10.1109/CONECCT50063.2020.9198508.
32. Lafont, M., Dupont, S., Cousin, P., Vallauri, A., and Dupont, C. “Back to the future:
IoT to improve aquaculture ‑ real‑time monitoring and algorithmic prediction of water
parameters for aquaculture needs.” In: 2019 Global IoT Summit (GIoTS), Aarhus,
Denmark, IEEE, 2019. doi:10.1109/GIOTS.2019.8766436.
33. Kumar, P., Gupta, G.P., and Tripathi, R. “PEFL: deep privacy‑encoding‑based feder‑
ated learning framework for smart agriculture.” IEEE Micro 42, no. 1 (2022): 33–40.
doi:10.1109/MM.2021.3112476.
34. Neo, G.H., and Rama Rao, K.T. “Telangana is the success story of Indian agri‑
tech: AI tools, soil testing, E‑commerce & more.” 2023. https://theprint.in/
economy/telangana‑is‑the‑success‑story‑of‑indian‑agritech‑ai‑tools‑soil‑testing‑
e‑commerce‑more/1630359/ (accessed September 2, 2023).
35. Ajayi, O. “AI‑powered agriculture: revolutionizing farming practices.” 2023. https://
www.linkedin.com/pulse/ai‑powered‑agriculture‑revolutionizing‑farming‑practices‑
ajayi/ (accessed September 2, 2023)
12 Detection and Classification of Mango Fruit-Based on Feature Extraction Applying Optimized Hybrid LA-FF Algorithms
Mukesh Kumar Tripathi, M. Neelakantappa,
Parikshit N. Mahalle, Shylesha V. Channapattana,
Ganesh Deshmukh, and Ghongade Prashant
12.1 INTRODUCTION
India is capable of producing a variety of horticulture products owing to its land
territory resilience. The overall horticultural output includes 90% of the fruit and
vegetables. The production of fruit and vegetables is 33% [1]. India is the leading
producer of mangoes. However, India is currently witnessing negative growth of
−0.86%. This is due to estimation loss during post‑harvest. Improper assessment,
wrong field h andling, transportation, mechanical damage during harvesting, and dis‑
ease cause quality losses in fruits. [2–5]. This is a serious issue that requires proper
attention. Mango fruits usually perish quickly, especially if stored in low tempera‑
tures of 7°C –13°C. Another cause for fruit losses is the traditional grading approach
[6,7]. This is more time‑consuming and labor intensive. This loss can be minimized
through proper framework and supply chain management with participants and other
entities. An optimal harvest time and the selection of quality are beneficial.
Customers power the fruit industry. The public trust in the fruit industry has been
diminished. Humans are more health conscious and vigilant. The definitions and
aspects of “quality” vary from the fruit class, the target audience, the requirements,
and the applicability. The assessment and grading of fruit quality is a progressively
complex task. In the past, determining fruit’s internal and external attributes was
challenging and time‑consuming. This is because of traditional evaluation methods
and a need for more research. The traditional approach uses tone to detect the quality
based on flavor. Yellow spots on the skin surface are another common form to iden‑
tify the disease. Much of the experiment also focused on external characteristics
and fruit deficiencies that could contribute to inaccuracies. Manual assessment relies
upon human activity. Research is based on the conventional technique for selecting a
high‑quality fruit in mango fruits [8]. In post‑harvest processing, this approach could
be more realistic. Demand for high‑quality fruit is growing, and non‑destructive
automated techniques with a neutrosophic machine‑learning framework are also
desirable.
In today’s era, only a few studies have also been conducted for the quality evalu‑
ation of mangoes. Therefore, the suggested method is to grade the quality of mango
fruits based on external and internal characteristics. The destructive and non‑destruc‑
tive framework for quality grading is the most suitable solution. Our proposed system
can be further investigated for mango fruit quality assessment and grading with a
machine‑learning framework. Fruits are essential parts of the diet [9]. Fruits con‑
tain vital nutrients, fiber, energy, ascorbic acid, and proteins necessary for a healthy
human body [10]. Fruits are consumed in various forms as food or supplementary to
food. This mango is widely accepted due to its high nutrient value, taste, and flavor.
Mango is consumed in raw form or ripe form. The worldwide market for mango is
55 million tonnes [11].
The demand for high-quality mangoes has increased day by day. Therefore, assessment and
grading are essential. Some internal features include soluble solids content (SSC),
total acid content (TAC), PH, physiological features, weight, dryness, firmness, mois‑
ture, and maturity [12–15]. Combining all these physical and biochemical param‑
eters defines the quality of mangoes. The grading based on external attributes could
be more efficient and accurate. Near‑infrared spectroscopy (NIRS) has excellent
potential for internal quality assessment and grading of mangoes. Further, to expand
the mango fruit market, it is necessary to develop an alternative framework to grade
the quality of mangoes [16–18]. This paper investigated the grading for mango fruit
quality with a neuromorphic approach‑based intelligent system. Then, hyperspectral
imaging is employed to estimate the internal attributes of the mangoes with machine
learning techniques.
Maturity is one of the critical aspects of the quality of fruit. During the ripening
process, dry matter (DM) content is utilized to show the maturity level of mango
fruit. A robust and practical approach to recognize the spectral image and assess the
maturity level of mango fruit [21] is explored. They have implemented classification
and regression modeling to detect the mango and evaluate the quantity of DM. The
results report an R² of 0.580, comparing a Partial Least Squares Regression (PLSR) model
with a CNN model. However, the system is unstable under natural light.
A random forest (RF)‑based model to evaluate the internal attributes of mango
[22] is investigated. The two categories of mango, namely “Nam Dakoi” and
“Irwin” at different temperatures are studied. L* a* b* color space is employed
to identify the color of the mango peel. The internal features such as Total soluble
solid (TSS) and ascorbic acid are calculated through destructive techniques. An RF-based
model is then used to grade the internal quality of mango; however, the approach still
relies on destructive measurements.
NIRS techniques have gained attention as they are non‑destructive, fast, and
cost‑effective. NIRS‑based wavelength selection for calibration model [23] is studied.
This wavelength selection method has high consistency. Two different databases are
used to predict the effectiveness of the prediction model. The experiment combined
the wavelength selection method with standard sample calibration transfer methods.
The proposed method is applicable only in a high range of wavelengths of spectra.
A framework for the total acidity prediction of mango [24] by NIRS is presented.
This method utilized three regression approaches: Partial least squares regression
(PLSR), Support Vector Machine (SVM), and artificial neural networks (ANNs).
Spectra acquired in a wavelength range from 1000 to 2500 nm. Further, total acidity
is predicted. The calibration and prediction models achieved more than 90% accu‑
racy with the ANN model. However, handling spectra dimension is complex.
A hyperspectral imaging system to estimate moisture content [25] is studied.
Visible‑near infrared is applied for spectra with a wavelength range between 400
and 1000 nm, and second, NIR is employed with a wavelength range between 880
and 1720 nm. PLS is utilized for calibration and prediction models. Results show that
for mango samples in the spectral wavelength range between 400 and 1000 nm, the
accuracy rate is 43.70%, whereas an 87.15% accuracy rate has been achieved with
NIR‑based spectra wavelength range between 880 and 1720 nm.
A framework for detecting the firmness attribute of mango [25] based on NIRS is also
explored. In this framework, both destructive and non-destructive methodologies are
implemented to evaluate the internal quality of mangoes. Vis‑NIR extracts the spec‑
tra with a wavelength range of 400–1050 nm. Subsequently, the PLS model builds the
relationship between spectra and internal parameters. Further, genetic algorithms are
used to estimate the firmness level of mango fruit. Mango fruit image is a combina‑
tion of different internal biochemicals, and sometimes, it is difficult to estimate the
parameters based on a regression model. These challenges can be avoided by utiliz‑
ing NIRS with machine learning techniques.
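To illustrate how NIRS spectra can be linked to an internal attribute with a standard chemometric model, the following sketch fits a partial least squares (PLS) regression on synthetic spectra using scikit-learn; the data, number of latent components, and the target attribute are illustrative assumptions, not results from the cited studies.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for NIR spectra: rows = fruit samples, columns = wavelengths.
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 200, 300
spectra = rng.normal(size=(n_samples, n_wavelengths))
# Hypothetical internal attribute (e.g. dry matter) driven by a few wavelengths.
dry_matter = (spectra[:, 50] * 2.0 + spectra[:, 120] - spectra[:, 200]
              + rng.normal(0, 0.1, n_samples))

X_tr, X_te, y_tr, y_te = train_test_split(spectra, dry_matter, random_state=0)

pls = PLSRegression(n_components=10)   # latent components would be tuned by validation
pls.fit(X_tr, y_tr)
print("R^2 on held-out samples:", pls.score(X_te, y_te))
```

PLS is attractive here because it handles the many collinear wavelength columns that make ordinary regression on raw spectra unstable.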
The internal attributes of fruit are evaluated by applying destructive and
non‑destructive methods. Destructive approaches are time‑consuming and costly
in deciding the quality of the features, such as carotenoid materials, chlorophyll,
phenolic acids, and sugars. Research has therefore concentrated on developing and
applying non‑destructive techniques to quality assessment and evaluation of mature
stages in real time [26]. The benefit of non‑destructive techniques is that we can
track the same fruit for an extended period, refine calculations, and gain a more pro‑
found image of the actual properties of the fruit. These non‑destructive techniques
have been shown as efficient and proposed for evaluating quality grading in the fruit
industry [27].
NIRS has the advantage of extracting important internal attributes of fruits.
It has successfully tracked the production of disorders such as early detection
injury in the mango [28]. One limitation of the NIRS application is that it is
costly. However, NIRS is practical and usable for quality grading of fruits based
on internal features [29,30]. The critical problem of NIRS is the robustness of the
calibration mode [31] for fruit quality evaluation. Furthermore, the robustness
of NIRS models [32,33] often depends on fruit cultivation and harvest season.
The non‑destructive process with infrared spectroscopy is used to estimate the
internal attributes of the quality of the mango. The development of an automated
assessment and grading system is a complex task. In this framework, extracting
the features from image data is difficult, followed by a training and classification process.
TABLE 12.1
Application of Machine Vision for Fruit Grading

Application | Preprocessing | Feature Extraction | Data Analysis | Accuracy (%) | References
Sorting of mango | Otsu threshold techniques | Color and size | Fuzzy rule | 94.97 | [23]
Grading of date fruit | Binary threshold techniques | Shape, size, intensity | BPNN | 80 | [24]
Grading of mango | Binary threshold techniques | Mass | Statistical analysis | 97 | [25]
Sorting of mango | Gamma curve fitting | Color | Coefficient of determination | 98 | [26]
Sorting of mango | Convolution filter | Color, volume | ANN | 80 | [27]
Grading of mango | Threshold techniques | Size | Caliber model | 89.5 | [28]
Grading of mango | HSI | Texture | Neural network | 93.33 | [29]
Grading of mango | Binary adaptive threshold | Color, shape | PCA | 92 | [30]
Classification of mango | Otsu threshold techniques | Region | Bayes classifier | 90.01 | [31]
Sorting of mango | Threshold techniques | Mass and volume | Regression model | 91.76 | [32]
With the above points, developing an accurate and efficient quality grading model-based
approach is necessary.
After reviewing the data in Table 12.1, we observe that an accurate and efficient
quality-grading approach based on machine learning techniques is needed. Consolidating
physicochemical and biochemical data with machine vision is a step towards a coordinated
framework for the agriculture industry. These goals not only address the issues above but
also provide genuine insight into the internal and external quality parameters within a
machine vision framework.
Before extracting the features, the input image is first converted to a gray-scale image.
The color features, such as the histogram, mean, median, standard deviation, maximum color
frequency, and minimum color frequency, are then extracted. Prior to this feature
extraction, the RGB image is converted to an L*a*b* (LAB) image.
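A possible OpenCV/NumPy rendering of this feature-extraction step is sketched below; the exact statistics kept, and the reading of "maximum/minimum color frequency" as the most and least frequent intensity bins, are assumptions for illustration.

```python
import cv2
import numpy as np

def color_features(image_path):
    """Extract the simple color statistics described above from one fruit image."""
    bgr = cv2.imread(image_path)                     # OpenCV loads images as BGR
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)     # gray-scale version
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)       # L*a*b* color space

    feats = {"gray_mean": float(gray.mean())}
    for name, channel in zip(("L", "a", "b"), cv2.split(lab)):
        hist = cv2.calcHist([channel], [0], None, [256], [0, 256]).ravel()
        feats[f"{name}_mean"] = float(channel.mean())
        feats[f"{name}_median"] = float(np.median(channel))
        feats[f"{name}_std"] = float(channel.std())
        feats[f"{name}_max_freq"] = int(hist.argmax())   # most frequent intensity bin
        feats[f"{name}_min_freq"] = int(hist.argmin())   # least frequent intensity bin
    return feats

# feats = color_features("mango_sample.jpg")   # hypothetical image path
```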
TABLE 12.2
Comparison Performance Measure of All Categories of Mango with Respect to All Method

Test Cases | Measures | CNN + All Features | CNN + FIS Features | Auto Encoder + All Features | RNN + All Features | Optimized CNN + Optimal Features
HD | FOR | 0.085 | 0.123 | 0.24 | 0 | 0.942
HD | FPR | 0.901 | 0 | 0 | 0.5117 | 0
HD | FNR | 0.2 | 1 | 0.8 | 0 | 0.9
HD | FDR | 0.871 | 0 | 0 | 0.587 | 0
RU | FOR | 0.087 | 0 | 0.24 | 0.57 | 0.0574
RU | FPR | 0.257 | 1 | 0 | 0 | 0.547
RU | FNR | 0.052 | 0 | 0.548 | 0.39 | 0.031
RU | FDR | 0.125 | 0.578 | 0 | 0 | 0.134
BMV | FOR | 0 | 0.1475 | 0.185 | 0.139 | 0.15
BMV | FPR | 0.85 | 0 | 0 | 0 | 0
BMV | FNR | 0 | 1 | 0.758 | 0.3433 | 0.7
BMV | FDR | 0.758 | 0 | 0 | 0 | 0
12.5 CONCLUSION
An automated grading system is designed to speed up the process of classifying the
mango images and facilitate quality evaluation in the industrial sector. A new hybrid
optimization algorithm, LA‑FF, is introduced to overcome the slow convergence.
The grading is evaluated on the healthy–diseased (HD), ripe–unripe (RU), and
big–medium–very big (BMV) categories, and in all test cases the proposed methodology
achieves higher accuracy than conventional methods. Similarly, the proposed Optimized CNN
attains lower false omission rate (FOR), false positive rate (FPR), false negative rate
(FNR), and false discovery rate (FDR) values than the traditional methods.
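For reference, these rates are derived from the confusion matrix as sketched below; the counts in the example are arbitrary illustrative numbers.

```python
def error_rates(tp, fp, tn, fn):
    """Confusion-matrix error rates used to compare the grading models."""
    return {
        "FPR": fp / (fp + tn),   # false positive rate
        "FNR": fn / (fn + tp),   # false negative rate
        "FDR": fp / (fp + tp),   # false discovery rate
        "FOR": fn / (fn + tn),   # false omission rate
    }

print(error_rates(tp=90, fp=5, tn=95, fn=10))   # illustrative counts
```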
REFERENCES
[1] Nandi, C.S., Tudu, B., and Koley, C.: A machine vision technique for grading of har‑
vested mangoes based on maturity and quality, IEEE Sensors Journal 2016, 16(16),
6387–6396.
[2] Nambi, V.E., Thangavel, K., Jesudas, D.M.: Scientific classification of ripening period
and development of colour grade chart for Indian mangoes (Mangifera indica L.) using
multivariate cluster analysis, Scientia Horticulturae 2015, 193, 90–98.
[3] Tripathi, M.K., and Dhananjay D.M.: A role of computer vision in fruits and vegeta‑
bles among various horticulture products of agriculture fields: A survey, Information
Processing in Agriculture 2020, 7(2), 183–203.
[4] Anurekha, D., and Sankaran, R.A.: Efficient classification and grading of MANGOES
with GANFIS for improved performance, Multimedia Tools and Applications 2019, 79,
1–16.
[5] Wang, F., Zheng, J., Tian, X., Wang, J., and Feng, W.: An automatic sorting system for
fresh white button mushrooms based on image processing, Computers and Electronics
in Agriculture 2018, 151, 416–425.
[6] Tripathi, M.K., and Maktedar, D.D.: Recent machine learning based approaches for
disease detection and classification of agricultural products, In: 2016 International
Conference on Computing Communication Control and Automation (ICCUBEA),
IEEE, 2016, pp. 1–6.
[7] Bhatt, A.K., and Pant, D.: Automatic apple grading model development based on back
propagation neural network and machine vision, and its performance evaluation, AI &
Society 2015, 30(1), 45–56.
[8] Chiranjeevi, K., Tripathi, M.K., and Maktedar, D.D.: Block chain technology in agricul‑
ture product supply chain, In: 2021 International Conference on Artificial Intelligence
and Smart Systems (ICAIS), IEEE, 2021, pp. 1325–1329.
[9] Mohammadi, V., Kheiralipour, K., and Varnamkhasti, M.G.: Detecting maturity of per‑
simmon fruit based on image processing technique, Scientia Horticulturae 2015, 184,
123–128.
[10] Mohapatra, A., Shanmugasundaram, S., and Malmathanraj, R.: Grading of ripening
stages of red banana using dielectric properties changes and image processing approach,
Computers and Electronics in Agriculture 2017, 143, 100–110.
[11] Tripathi, M.K., and Dhananjay, D.M.: Optimized deep learning model for mango
grading: Hybridizing lion plus firefly algorithm, IET Image Processing 2021, 15(9),
1940–1956.
[12] Chen, X., Li, Z., Wang, Y., and Liu, J.: Effect of fruit and hand characteristics on
thumb‑index finger power‑grasp stability during manual fruit sorting, Computers and
Electronics in Agriculture 2019, 157, 479–487.
[13] Tripathi, M.K., and Dhananjay, D.M.: A framework with OTSUS thresholding method
for fruits and vegetables image segmentation, International Journal of Computer
Applications 2018, 975, 8887.
[14] Zhang, Y., Lee, W.S., Li, M., Zheng, L., Ritenour, M.A.: Non‑destructive recognition
and classification of citrus fruit blemishes based on ant colony optimized spectral infor‑
mation, Postharvest Biology and Technology 2018, 143, 119–128.
[15] Tripathi, M.K., and Maktedar, D.D.: Detection of various categories of fruits and veg‑
etables through various descriptors using machine learning techniques, International
Journal of Computational Intelligence Studies 2021, 10(1), 36–73.
[16] Zhang, Y., Wang, S., Ji, G., Phillips, P.: Fruit classification using computer vision and
feedforward neural network, Journal of Food Engineering 2014, 143, 167–177.
[17] Taghipour, A., and Frayret, J.‑M.: Coordination of operations planning in supply chains:
a review, International Journal of Business Performance and Supply Chain Modelling
2013, 5(3), 272–307.
[18] Channapattana, S.V., Srinidhi, C., Madhusudhan, A., Notla, S., Arkerimath, R., and
Tripathi, M.K.: Energy analysis of DI‑CI engine with nickel oxide nanoparticle added
Azadirachta indica biofuel at different static injection timing based on exergy. Energy
2023, 267, 126622.
[19] Alavi, N.: Quality determination of Mozafati dates using Mamdani fuzzy inference
system, Journal of the Saudi Society of Agricultural Sciences 2013, 12(2), 137–142.
[20] Gandomi, A.H., Yang, X.‑S., Talatahari, S and Alavi, A.H.: Firefly algorithm with
chaos, Commun Nonlinear Sci Numer Simulat 2013, 18, 89–98.
[21] Utai, K., Nagle, M., Hämmerle, S., Spreer, W., Mahayothee, B., Müller, J.: Mass estima‑
tion of mango fruits (Mangifera indica L., cv. Nam Dokmai) by linking image process‑
ing and artificial neural network, Engineering in Agriculture, Environment and Food
2019, 12(1), 103–110.
[22] Shivendra, K.C., and Tripathi, M.K.: Detection of fruits image applying decision tree
classifier techniques. In: Computational Intelligence and Data Analytics: Proceedings
of ICCIDA 2022, Springer Nature Singapore, Singapore, 2022, pp. 127–139.
[23] Saad, F.S.A., Ibrahim, M.F., Shakaff, A.Y.M., Zakaria, A., Abdullah, M.Z.: Shape
and weight grading of mangoes using visible imaging, Computers and Electronics in
Agriculture 2015, 115, 51–56.
[24] Schulze, K., Nagle, M., Spreer, W., Mahayothee, B., Müller, J.: Development and
assessment of different modeling approaches for size‑mass estimation of mango fruits
(Mangifera indica L., cv. Nam Dokmai), Computers and Electronics in Agriculture
2015, 114, 269–276.
[25] Tripathi, M.K., and Maktedar, D.D.: Internal quality assessment of mango fruit: an
automated grading system with ensemble classifier, The Imaging Science Journal 2022,
70(4), 253–272.
[26] Mizushima, A., Lu, R.: An image segmentation method for apple sorting and grad‑
ing using support vector machine and Otsus method, Computers and Electronics in
Agriculture 2013, 94, 29–37.
[27] Gurubelli, Y., Ramanathan, M., Ponnusamy, P.: Fractional fuzzy 2DLDA approach
for pomegranate fruit grade classification, Computers and Electronics in Agriculture,
2019, 162, 95–105.
[28] Nyalala, I., Okinda, C., Nyalala, L., Makange, N., Chao, Q., Chao, L., Yousaf, K., Chen,
K.: Tomato volume and mass estimation using computer vision and machine learning
algorithms: Cherry tomato model, Journal of Food Engineering 2019, 263, 288–298.
[29] Tripathi, M.K., Maktedar, D., Vasundhara, D.N., Moorthy, C., and Patil, P.: Residual
life assessment (RLA) analysis of apple disease based on multimodal deep learning
model, International Journal of Intelligent Systems and Applications in Engineering
2023, 11(3), 1042–1050.
[30] Luo, F., Zhang, L., Du, B. and Zhang, L.: 2020. Dimensionality reduction with
enhanced hybrid‑graph discriminant learning for hyperspectral image classification.
IEEE Transactions on Geoscience and Remote Sensing, 58(8), 5336–5353.
[31] Tripathi, M.K., Neelakantapp, M., Nagesh Kaulage, A., Nabilal, K.V., Patil, S.N., and
Bamane, K.D.: Breast cancer image analysis and classification framework by apply‑
ing machine learning techniques, International Journal of Intelligent Systems and
Applications in Engineering 2023, 11(3), 930–941.
[32] LeCun, Y., Kavukvuoglu, K., and Farabet, C.: Convolutional networks and applications
in vision, In: Proceedings of 2010 IEEE international symposium on circuits and sys‑
tems, 2010, pp. 253–256.
[33] Boothalingam, R.: Optimization using lion algorithm: a biological inspiration from
lions social behavior, Evolutionary Intelligence 2018, 11(1‑2), 31–52.
Section III
Building AI with Quality Data
for Multidisciplinary Domains
13 Solving Student Admission Woes: Guiding Your Way
Snehal Rathi, Shekhar Chaugule,
Manisha Mali, Gitanjali R. Shinde,
and Swati Patil
13.1 INTRODUCTION
Our research paper focuses on addressing the challenges encountered by students
during the admission process for engineering colleges. The project, “Guiding Your
Way,” aims to alleviate the difficulties and provide a streamlined solution for stu‑
dents seeking admission to these institutions. We understand the complexities and
obstacles that students face during this critical phase and have developed this project
with the intention of making the process smoother and more efficient. The admis‑
sion process for engineering colleges entails enrolling on a dedicated website where
the percentage obtained in the diploma exams plays a crucial role. Once all students
have completed the enrollment, a comprehensive list is generated of the percentage
and rank of each student among all Maharashtra State Board of Technical Education
(MSBTE) candidates. The rank obtained becomes instrumental in determining the
cutoff list for various colleges. Our motivation for developing this project stemmed
from personal experiences, where we encountered similar challenges during our own
admissions. While I had a rank within the range of 700–750, allowing me to easily
identify the top ten colleges I could potentially be allotted to, some of my friends
faced a more daunting situation with ranks like 15,000 or 18,000. They were unsure
which colleges would consider their rank for admission. One significant hurdle they
faced was the cumbersome process of scrolling through a lengthy PDF document
containing hundreds of college listings. It became impractical and time‑consuming
to manually identify the colleges suitable for their rank. This inspired us to create a
solution that could provide a comprehensive list of colleges with a single click.
Our proposed solution involves the development of an application [1] that allows
students to input their preferred department, category, and admission criteria (rank or
percentage). By specifying a minimum and maximum rank, the application generates
a list of colleges with cutoffs falling within the given rank range. This feature not only
saves students from the anxiety and effort of manually searching through extensive
PDF documents but also significantly reduces the time required to obtain a suitable list
of colleges. Our project aims to alleviate the stress and uncertainty faced by students
during the admission process. By providing a user‑friendly platform that streamlines
the search for eligible colleges, we hope to empower students with the information
they need to make informed decisions about their educational future [2,3].
Diploma holder students, in particular, face challenges in finding suitable colleges based on their rank and caste category, as their admission process differs from that of other students.
To address this gap, our project, Guiding Your Way, focuses on incorporating this
essential feature. We have developed an application that allows Diploma holder stu‑
dents to easily search for colleges based on their rank range and caste category. This
feature streamlines the college selection process for Diploma students and provides
them with relevant and personalized options [11].
One key feature that is missing in IndiaCollegesHub.com is the ability to find
colleges based on rank and caste, especially for Diploma holder students. While the
website offers general information about colleges, it does not cater to the unique
requirements of Diploma students when it comes to college selection. Diploma hold‑
ers face challenges in finding suitable colleges based on their rank range and caste
category, which is different from the admission process for other students. Through
our literature survey, we have identified the limitation of IndiaCollegesHub.com in
meeting the specific needs of Diploma holder students. Our project aims to bridge
this gap and provide a comprehensive solution that allows Diploma students to find
suitable colleges based on their rank and caste category. By incorporating this fea‑
ture, Guiding Your Way offers a unique tool to facilitate the admission process for
Diploma holder students [12,13].
IndCareer.com primarily focuses on providing information about the hospital‑
ity of colleges, scholarships, and event details, along with general education sys‑
tem information [14]. However, our analysis revealed a significant gap in terms of
a specific feature that is vital for Diploma holder students. One crucial feature that
is absent in IndCareer.com is the ability to find colleges based on rank and castes
specifically for Diploma holder students. While the website offers valuable infor‑
mation about the hospitality sector and scholarships, it does not cater to the unique
needs of Diploma students when it comes to college selection. Diploma holders face
challenges in finding suitable colleges based on their rank range and caste category,
which is distinct from other students’ admission criteria [15].
A version control system was used to manage the source code, facilitating collaboration and efficient tracking of code
changes. These technologies collectively enabled the development of a robust and
efficient web‑based application that streamlined the admission process for engineer‑
ing colleges.
13.4 METHODOLOGY
The college selection and admission process can be overwhelming for students due
to the vast amount of information available and the complexity of decision‑making.
To address these challenges, this research paper presents a proposed system called
“Guiding Your Way.” This innovative web‑based application aims to revolutionize the
college selection and admission process by providing students with accurate informa‑
tion, personalized recommendations, and efficient tools for decision‑making [8].
Comprehensive College Comparison: “Guiding Your Way” provides a compre‑
hensive college comparison feature that allows students to evaluate multiple colleges
based on various parameters such as infrastructure, faculty, placement records, and
academic programs. This enables students to make well‑informed decisions by con‑
sidering their specific preferences and priorities.
• Data Collection: The data required for the project was collected from the
result website, which provides information about students’ ranks and the
cutoff lists of various engineering colleges. The data collection process
involved scraping the website to extract the necessary information.
• Data Conversion: The collected data was initially in a website format. To
analyze and manipulate the data effectively, it was converted into an Excel
spreadsheet format. Python programming language was utilized to develop
a web scraping script using libraries such as BeautifulSoup and Selenium to
extract the data from the website and store it in the Excel file (Figure 13.1).
• Data Transformation: Once the data was in the Excel format, a Python
script using the openpyxl library was developed to convert the Excel
data into JSON format. This step was crucial as it allowed for easier data
manipulation and integration with the “Guiding Your Way” application
(Figure 13.2).
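A minimal sketch of this Excel‑to‑JSON step is given below. It assumes a hypothetical spreadsheet whose first row holds column names such as college, branch, category, and cutoff_rank; the actual layout and script used by the authors are not reproduced here.

import json
from openpyxl import load_workbook

def excel_to_json(xlsx_path, json_path):
    # Read the cutoff spreadsheet and dump its rows as a list of JSON records.
    wb = load_workbook(xlsx_path, read_only=True)
    rows = wb.active.iter_rows(values_only=True)
    header = [str(h) for h in next(rows)]              # first row is assumed to hold column names
    records = [dict(zip(header, row)) for row in rows]
    with open(json_path, "w") as f:
        json.dump(records, f, indent=2, default=str)
    return records

# Hypothetical usage: excel_to_json("cutoff_list.xlsx", "cutoff_list.json")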
In the above system, work begins with data collection from the internet in the form
of PDF files. These files contain valuable information about colleges, including
rankings, courses, and other relevant details. To facilitate easy data manipulation,
a PDF‑to‑Excel converter is employed to convert the PDF files into an Excel for‑
mat. Once in Excel format, the system creates a dataset by organizing the data into
columns and rows. This dataset includes crucial information such as college names,
rankings, branches, and percentages. To enable further processing, an Excel to JSON
transformation is performed using a Python automation script. This script parses the
Excel file and converts the data into a JSON file, which offers a structured represen‑
tation suitable for filtering and querying.
When a user interacts with the system, they input their desired criteria, such as
rank, branch, and percentage. The system then applies filtering mechanisms to the
JSON data, narrowing down the options to colleges that meet the user’s specified
requirements.
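As a rough illustration of this filtering step, assuming each JSON record carries branch, category, and cutoff_rank fields (hypothetical names), the query could be expressed as follows:

import json

def find_colleges(json_path, branch, category, min_rank, max_rank):
    # Return records whose cutoff rank lies inside the user's rank window.
    with open(json_path) as f:
        records = json.load(f)
    return [
        r for r in records
        if r.get("branch") == branch
        and r.get("category") == category
        and min_rank <= int(r.get("cutoff_rank", 0)) <= max_rank
    ]

# Hypothetical usage: find_colleges("cutoff_list.json", "Computer Engineering", "OPEN", 7000, 8000)

Because the lookup is a single pass over the structured records, the application can answer a query in one click instead of requiring a manual scan through the PDF listings.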
FIGURE 13.2 Block diagram for the proposed system.
Lack of Career Guidance: The current education websites often overlook the cru‑
cial aspect of career guidance. The proposed system will integrate career guidance
resources, providing students with valuable insights into various career options asso‑
ciated with different courses and colleges.
• The Guiding Your Way project successfully retrieved study content for the syllabus of the Diploma course from reliable sources.
Finding Best Colleges Based on Percentage Range:
• The Guiding Your Way system effectively processed the percentage range provided by students and generated a list of colleges matching the criteria.
• The algorithm compared the student’s percentage with the cutoff lists of various
colleges and identified the ones where admission was possible (Figure 13.4).
We aimed to address the challenges faced by students during the admission process
for engineering colleges. We developed this project with the intention of solving
these problems and ensuring that no student encounters the same difficulties we
experienced. The admission process for engineering colleges involves enrolling
on a website and considering the aggregate percentage obtained in the final year
exams of the Diploma course. Once all students have enrolled, a comprehensive
list is generated, which includes the percentage and rank of each student among all
MSBTE students.
We observed that the use of rank is crucial in this process. By referring to the
previous year’s cutoff list, students can compare their rank and estimate the col‑
leges they may be allotted. However, we encountered a problem where students with
higher ranks, such as 15,000 or 18,000, struggled to identify the colleges they were
eligible for due to the large number of options listed in a PDF document.
To address this issue, we developed an application called Guiding Your Way.
This application allows students to select a particular department and their admis‑
sion category and choose between rank and percentage as the criteria. By entering a
minimum rank of 7000 and a maximum rank of 8000, for example, the application
generates a list of colleges with a cutoff between the specified rank range. This func‑
tionality saves students time and effort by providing a comprehensive list with just a
single click (Figures 13.5–13.7).
The system eliminates the need for students to manually sift through lengthy PDF documents. This not only saves time but also reduces the chances of errors. Guiding Your Way offers personalized guidance to students based on their rank and preferences. The system suggests a list of colleges that the student is eligible for, based on their rank and desired course. This helps students make informed decisions and increases their chances of getting admitted to their preferred college. Guiding Your Way provides a user‑friendly interface that is easy to navigate, even for those with limited technical knowledge. The system is designed to be intuitive and user‑centric, with clear instructions and helpful tips. Guiding Your Way
is a comprehensive solution that addresses all the major challenges faced by students
during the admission process. From providing information on colleges to helping
students make choices based on their preferences, the system offers end‑to‑end guid‑
ance and support.
13.11 LIMITATIONS
However, we acknowledge several limitations in our current research. First, while
our project offers a useful application for students seeking admission to engineering
colleges, it does not introduce any novel methodologies. The lack of novel techniques
may limit the originality and potential impact of the work. Future research should
explore opportunities to incorporate innovative methodologies to further enhance the
project’s effectiveness and relevance. Secondly, “Guiding Your Way” currently relies
on data provided by the MSBTE enrollment website. Any discrepancies or inac‑
curacies in the data from the source may affect the application’s output. Therefore,
continuous efforts to validate and update the data sources are essential to ensure the
reliability of our application.
REFERENCES
[1] KularbPhettong, K. and Limphoemsuk, N. 2017. The effective of learning by augmented
reality on Android platform. In: E-Learning, E-Education, and Online Training,
Springer, Cham, pp. 111–118.
[2] Sahin, D. and Yilmaz, R.M. 2020. The effect of Augmented Reality Technology on mid‑
dle school students’ achievements and attitudes towards science education. Computers
& Education 144, 103710.
[3] Friendsickness in the Transition to College: Precollege Predictors and College
Adjustment Correlates. https://www.proquest.com/openview/8f3984886702e757f2d29bcaa7276a9e/1?pq-origsite=gscholar&cbl=18750
[4] Websites providing details about colleges and the counseling system: https://collegedunia.com/ and https://www.careers360.com/
[5] Von Ah D, Ebert S, Ngamvitroj A, Park N, Kang DH. 2005. Predictors of health behav‑
iours in college students. Journal of Advanced Nursing 48(5),463-474. doi: 10.1111/
j.1365-2648.2004.03229.x. PMID: 15533084.
[6] Osadchyi, V.V., Valko, N.V. and Kuzmich, L.V., 2021, March. Using augmented reality
technologies for STEM education organizations. Journal of Physics: Conference Series
1840(1), 012027.
[7] Fan, M., Antle, A.N. and Warren, J.L., 2020. Augmented reality for early language
learning: A systematic review of augmented reality application design, instructional
strategies, and evaluation outcomes. Journal of Educational Computing Research
58(6), 1059–1100.
[8] Ghulamani, S. and Zareen, S., 2018, March. Educating students in remote areas using
augmented reality. In: 2018 International Conference on Computing, Mathematics and
Engineering Technologies (iCoMET), IEEE, pp. 1–6.
[9] Rathi, S., Deshpande, Y., Nagaral, S., Narkhede, A., Sajwani, R. and Takalikar, V. 2021.
Analysis of user’s learning styles and academic emotions through web usage mining.
In: 2021 International Conference on Emerging Smart Computing and Informatics
(ESCI), Pune, India, pp. 159–164. doi:10.1109/ESCI50559.2021.9397037.
[10] Adrianto, D., Hidajat, M. and Yesmaya, V., 2016, December. Augmented reality using
Vuforia for marketing residence. In: 2016 1st International Conference on Game,
Game Art, and Gamification (ICGGAG), IEEE, pp. 1–5.
[11] Radosavljevic, S., Radosavljevic, V. and Grgurovic, B. 2020. The potential of imple‑
menting augmented reality into vocational higher education through mobile learning.
Interactive Learning Environments 28(4), 404–418.
[12] Aneena Aley Abraham (2023). Ranking System of Indian Universities and Colleges.
Retrieved February 12, 2024, from https://www.shiksha.com/science/ranking/
top-universities-colleges-in-india/121-2-0-0-0
[13] Gomes, C., Chanchal, S., Desai, T. and Jadhav, D., 2020. Class student management
system using facial recognition. ITM Web of Conferences 32, 02001.
[14] Raju, K.C., Yugandhar, K., Bharathi, D.V.N. and Vegesna, N., 2018, November.
Third based modern education system using augmented reality. In: 2018 IEEE 6th
International Conference on MOOCs, Innovation and Technology in Education
(MITE), IEEE, pp. 37–42.
[15] Živčić-Bećirević, I., Smojver-Ažić, S., & Dorčić, T. M. 2017. Predictors of university
students’ academic achievement: A prospective study. Društvena Istraživanja, 26(4),
457–476.
14 Melodic Pattern
Recognition for
Ornamentation Features
in Music Computing
Makarand Ramesh Velankar,
Sneha Kiran Thombre, and
Harshad Suryakant Wadkar
14.1 INTRODUCTION
Technology has made inroads in creative art domains such as music, drawing, etc.
Automatic pattern recognition in music is a challenging problem due to the complex
nature of music. Modeling and recognition of prosodic components such as ornamen‑
tation in music makes it more challenging. Examples of prosodic features used in
speech are tone, stressed word, or voice modulation. They are experienced in audio,
but cannot be represented in the corresponding text matter. Traditionally, different
pattern recognition (PR) paradigms for music pattern analysis include statistical and
structural approaches for a specific predefined task. A statistical approach based on
probabilistic models with efficient use of machine learning algorithms for different
applications in music information retrieval (MIR) is common among researchers.
The structural or symbolic approach based on formal grammar helps model melodic
or rhythmic structures in the music. The use of neural networks for PR now extends
to deep neural networks for efficient prediction. The boundaries between different
paradigms are fuzzy and fading. Combined approaches are also gaining popularity
as they share the same goals.
The statistical approach attempts to extract numerical values from the data as a
source for classification. This technique has gained more acceptance and popularity
in the research community due to different machine learning and classification algo‑
rithms applied to the numerical data extracted from the digital objects under study.
Self‑repetitive pattern identification is a topic of interest from music summary or
content‑based MIR. A comparative study of Chroma features, Constant Q transforms
features, and MFCC features was performed. Results were compared with ground
truth obtained from human expert annotation for identifying repetitive patterns [1].
The ground truth used in most of the systems is input from human experts.
It is not easy to get the ground truth for large datasets with duration in hours
for audio files. The challenge is to generate and evaluate ground truth for massive
musical data, which can evaluate different machine learning algorithms. The vector
space model was used for melodic pattern extraction of the raga in Indian art music,
and the results were assessed for diverse classification strategies [2]. The statistical
PR approach using vector representation is suitable for machine learning algorithms.
They typically require input data in feature values. Modeling appropriate features
of the music, which we interpret as structures, is the challenge in music computing.
The structural approach thus becomes necessary for melodic or rhythmic patterns,
perceived on a timeline as a sequence.
Structural or syntactic PR is generally applied for melodic or rhythmic PR. In
melodic patterns, the note sequence pattern is usually represented as an ordered list
of notes with string‑type data structure. A monophonic musical pattern is described
by using notes as the fundamental unit for the representation. The hierarchical tree
structure can be used to represent the pattern at different granularity levels. The
directed graph structure is another representation of the notes’ transition in melody.
Induction and matching of sequential melodic patterns pose several computational
challenges but are helpful for musically interesting retrieval tasks [3]. Time series
symbolic pattern representation of music is a challenge, as the data are multidimen‑
sional and real valued, with patterns rarely repeating precisely. Data margins and
fuzziness are used considering the perception of patterns. Pattern segmentation for
processing can be done using perceptually grouped melodic phrases. Pitch transcrip‑
tion, rhythmic meter, or tempo information may not be the only helpful information
for structural pattern analysis.
The use of timbre information with string‑matching techniques can be more ben‑
eficial for polyphonic music [4]. The music structural pattern representation plays a
significant role in the automatic conversion from a sheet printed music for the perfor‑
mance (interpretation) or vice versa (transcription). Evaluation of accuracies related
to optical music recognition systems used for structural musical interpretation is a
challenge [5]. A graphical structure can represent music scores. The terminal nodes
directly describe the content of the music, the internal nodes represent its incremen‑
tal summary, and the arc represents its relationship.
The similarity between two melodies can be calculated by analyzing the struc‑
ture of the graph and finding the shortest path between corresponding nodes in the
graph [6]. Due to the effective modeling of sequential patterns for the time‑variant
music and human perception of musical patterns, a structural approach is suitable for
melodic patterns. Therefore, more robust music knowledge representation systems
are possible with this paradigm. Human brains process musical patterns for different
interpretations, and a neural network‑based approach attempts to model the same.
The artificial neural network (NN) is a computing model composed of intercon‑
nections of artificial neuron units to simulate the human brain. Different problems
in music have been successfully addressed with the help of different variants of NN.
For example, the use of dynamic programming and recurrent NN with hidden states
or memory units for chord recognition is found better than the hidden Markov model
[7]. Convolutional Neural Network (CNN) trained on a mel‑scaled spectrogram
produced promising results for automatic boundary detection for musical structures
[8]. Research indicates that NN can be trained to identify statistical inconsistencies
across audio features to predict valence/arousal values for emotion classification [9].
Deep neural network models for solving PR problems are becoming increasingly popular. They can learn complex non‑linear input–output relationships, depend very little on domain‑specific knowledge, and benefit from the availability of practical, ready‑to‑use models. For the effective implementation of this strategy, however, a vast training dataset is required. Datasets provide the foundation for machine learning algorithms.
Indian music is represented using either Western or Indian notations. For example, on a piano or a harmonium, each key corresponds to a note, which illustrates the similarity and the mapping between the two notation systems. The mapping is as shown in Figure 14.1.
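Taking C as the tonic (Sa), this mapping can be captured in a simple lookup table. The sketch below covers only the natural (shuddha) notes and uses the abbreviations that appear later in the chapter; it is an illustration, not code from the original work.

# Indian (sargam) to Western note mapping, assuming C is chosen as Sa.
SARGAM_TO_WESTERN = {
    "Sa": "C", "Re": "D", "Ga": "E", "Ma": "F",
    "Pa": "G", "Dh": "A", "Ni": "B",
}

def to_western(sargam_phrase):
    # "Sa Re Ga" -> "C D E"
    return " ".join(SARGAM_TO_WESTERN[s] for s in sargam_phrase.split())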
The melodic patterns are described using a sequence of notes such as C D E in
Western form with equivalent Indian notations as Sa Re Ga. The notation representa‑
tion of songs is helpful for composers or performers to play songs during recording or
live shows. Traditionally, Indian classical music (ICM) uses various music ornamen‑
tation forms to convey specific rasa or mood during raga performance. Music orna‑
mentation is integrated into melodic and rhythmic patterns to add esthetic appeal
during the performance. Various ornamentation forms include kan swar, meend,
khatka, murkhi, Andolan, and gamaka. This ornamentation is used in Indian film
songs as a natural extension due to the influence of ICM. As per the inputs from
the performers [10], kan swar or grace note (tiny duration note) is used prominently
in Indian music for conveying emotional appeal. kan swar is sung or played before
or after the main note during the melody. A grace note is a similar notion used in
Western music ornamentation.
During the computational study of ornaments in Hindustani music [11], kan swar
or grace note was observed as a subtle change (delicate and difficult to analyze).
Other ornaments such as meend (glide from one note to another or glissando) or
krintan (kan swar followed by a meend) represent a variety of music ornaments used
in Indian music. From a computational standpoint, the fundamental nature of ornamentation is characterized by the rate of change in frequency and amplitude.
The chapter is organized in the following manner. A detailed survey related to
data‑driven melodic PR and specific to music ornamentation feature is covered in
Section 2. Research gaps are presented in Section 3. Methodology along with an
exploratory learning algorithm used is explained in Section 4. Section 5 covers
results and discussions with future directions.
One such expressive feature dimension, grace notes or kan swar, was prominent in conveying emotions [10]. Further, such
feature dimensions may be associated with different emotion classes. Therefore,
successfully capturing the expressive features of ornamentation and its association
with emotion can improve the accuracy of Music Emotion recognition classification.
Performers use ornamentation to provide a different experience to listeners during
the performances.
14.2.3 Music Ornamentation
Musical ornamentation is the way the performers or music arrangers add value to
the original musical composition to convey meaning [11]. Music ornamentation plays
a significant role in conveying the intended meaning and experience to the listener.
Different genres of music forms have varied ornamentation specific to the genre
with the cultural impact. Indian music consists of various musical ornaments such
as grace notes, vibrato, glides, and variations [32]. Many ornamentations use subtle
acoustic parameters such as timbre, intensity, and pitch. They are challenging to cap‑
ture, considering the nature of change and relativeness. In the case of Western music,
a majority of the ornaments are well‑documented with sheet music notations [33].
Western music performance guided by sheet music has a fixed format and relatively
less scope for improvisations. Jazz music has a similarity with Indian music consid‑
ering the improvisation aspect in both. In speech communication, pauses, stress on
different words, voice modulation, and repetition of specific words or phrases convey
content effectively. Similarly, in musical performance, these clues are used along
with musical embodiment, referred to as music prosody.
Music prosody is the non‑verbal clues used to convey musical meaning in the case
of vocal performances. In the instrumental version, the clues used are specific to the
instrument by utilizing the features and capabilities associated with the instrument.
Every performer attempts to convey the message in their style and interpretation.
One can notice that the performance by different artists for the same musical piece
or same artist at different times does have variations, and modeling these aspects is
a challenging task considering its subtle nature.
Indian music has various forms such as classical, regional, traditional folk, and popular music. The ornamentation and prosodic elements used differ across these forms. Indian music is traditionally an orally transmitted teaching‑learning system, and Pandit Bhatkhande and Pandit Paluskar introduced formal notations at the beginning of the twentieth century. It was probably the first sincere attempt at
music documentation in the form of notations which many musicians and composers
then adopted. These notations attempt to capture some of the ornamentation, such
as glides and vibrato which guides the performer. Thus, the performers have more
liberty within the musical framework of Indian music.
The gaps identified from the survey and discussions with researchers are as
follows:
1. As per the literature survey related to Indian music, it was observed that
very little work has been done so far to model and identify the music orna‑
mental features.
2. The annotated dataset for ornamental features is not readily available.
These gaps motivated the application of PR algorithms to ornamentation. For data‑driven approaches such as deep learning, the annotated dataset required is quite large. Thus, the initial approach used was exploratory learning with a small annotated dataset, in order to identify the challenges that data‑driven methods would subsequently need to address.
14.4 METHODOLOGY
In our first attempt to explore the music embodiment of ornaments as prosodic fea‑
tures, we have developed an algorithm to capture this ornamentation feature using an
exploratory learning approach. The exploratory learning algorithm is developed to
identify the “grace note”, which performers describe as a source of emotional appeal. The experi‑
mentation was carried out with the help of annotations provided by domain experts.
Due to no fixed rules and boundaries documented for ornamentation with flexibility
for interpretations, the annotations provided by experts for the same musical piece
had some variations. A grace note is an ornamentation, defined as a short‑duration
note presented before or after a steady note. The duration of a grace note and steady
note is not fixed and is relative as per the perception or interpretations of domain
experts. Manual annotations are a time‑consuming process and have limitations due
to time availability by domain experts.
The annotated audio sample with waveform and annotated notes marked before
the main note is as shown in Figure 14.2. The audio sample has eight grace notes
sung before the main notes. The grace notes are of very small duration visible in
the figure as blue line spikes. The main notes are represented in Indian notation, with their Western counterparts shown in brackets, for example, Sa (C4) and Re (D4). That is, the note written as Sa in Indian notation appears as C4 in Western notation in the bracket. It can be noticed from the vertical bars in blue that the main note
durations are relatively long compared to a small duration grace note appearing
before them.
Grace notes (in bracket) annotated before the main notes in the Sargam as per
Indian notations are
(Re) Sa, (Ga)Re, (Ma)Ga, (Pa)Ma, (Dh)Pa, (Ni)Dh, (Sa’) Ni, (Re’) Sa’
The equivalent representation using western notations is
(D4) C4, (E4) D4, (F4) E4, (G4) F4, (A4) G4, (A#4) A4, (C5) B4, (D5) C5.
The following example demonstrates the grace notes identified using the developed algorithm. The output lists each possible grace note together with its duration. The marker “gn” indicates an identified grace note and appears immediately before it. Notes are followed by their relative durations; grace notes have small relative durations, such as 1, 2, or 3, whereas the main notes have longer relative durations, such as 13, 15, and 16.
[‘gn’, ‘ D4’, 2, ‘C4’, 13, ‘gn’, ‘E4’, 2, ‘D4’, 13, ‘gn’, ‘F4’, 3, ‘E4’, 16, ‘gn’, ‘G4’, 2,
‘F4’, 15, ‘gn’, ‘A4’, 2, ‘G4’, 15, ‘gn’, ‘A#4’, 2, ‘A4’, 16, ‘gn’, ‘C5’, 1, ‘B4’, 15, ‘gn’, ‘D5’,
2, ‘C5’, 16]
The result shows that eight grace notes are correctly identified out of a total of
eight grace notes annotated for the sample shown. The algorithm was further tested
for other samples. The results obtained are used to fine‑tune the algorithm to cor‑
rectly identify grace notes for different test samples. The results from 20 annotated
short samples were obtained using the developed algorithm and were discussed with
expert annotators for further improvements in the algorithm.
Algorithm for Grace note identification
// Initialization
i = 1                              // sample number initialized to the first sample
// Time instance of sample i:  T[i]
// Intensity of sample i:      A[i]
// Pitch of sample i:          P[i]

// Identify the maximum and minimum intensity
MaxI = A[0]                        // maximum-intensity initialization
i = 1
while (not end of samples)         // begin loop 1
{
    if (MaxI < A[i])
        MaxI = A[i]
    i = i + 1
}                                  // end loop 1
MinI = MaxI - 30                   // defines the audible intensity range
// Intensity below MinI is considered silence

i = 0
// Convert pitch information into a notes array N[]
while (not end of samples)         // begin loop 2
{
    read P[i]
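Since the listing above covers only the initialization and the intensity/pitch scan, a compact Python sketch of the grace-note heuristic it works toward — flagging a very short note that immediately precedes a much longer note — is given below. The duration thresholds and the input format (a list of note/relative-duration pairs) are assumptions made for illustration and are not part of the published algorithm.

def find_grace_notes(note_durations, max_grace_dur=3, min_main_dur=10):
    # note_durations: list of (note, relative_duration) pairs, e.g. [("D4", 2), ("C4", 13)].
    result = []
    i = 0
    while i < len(note_durations) - 1:
        note, dur = note_durations[i]
        next_note, next_dur = note_durations[i + 1]
        if dur <= max_grace_dur and next_dur >= min_main_dur:
            # Emit in the chapter's output convention: 'gn', grace note, its duration,
            # then the main note and its duration.
            result += ["gn", note, dur, next_note, next_dur]
            i += 2
        else:
            result += [note, dur]
            i += 1
    if i == len(note_durations) - 1:
        result += list(note_durations[-1])
    return result

# Example: find_grace_notes([("D4", 2), ("C4", 13)]) returns ['gn', 'D4', 2, 'C4', 13]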
The samples used for training and testing the algorithm were containing grace notes
as the only ornamentation feature. Further, it was advised to test the algorithm for
audio samples with the presence of other ornamentation patterns such as glides. It
was observed that the algorithm provided the wrong output on a few occasions due to
overlap. Overlap of grace notes and confusion with glides was one of the significant
issues observed, as shown in Figure 14.3. Grace notes here are predicted as (A#4)
A4 and (C4) B4, which are observed as false positives compared with human annota‑
tions. Although the grace note is visible, it is referred to as glide by domain experts.
Thus, the algorithm misjudged the glide with grace note as it was not modeled for
other ornamentation features such as glide.
The results showed reasonably good accuracy for samples; however, the algorithm
misinterpreted some instances. It is due to the overlap of ornamentation patterns such
as glides and grace notes.
It was observed that although the algorithm developed has provided reasonably
good accuracy, it is still far away from practical use in applications. The rule‑based
approach has limitations for ornamentation patterns due to the complex nature of
the data. Experts did not annotate the grace notes identified by algorithms in some
cases. The algorithm failed to capture some grace notes, annotated by experts.
Non‑standardization of duration leads to some discrepancies among expert opinions.
Annotation and agreement among the experts is a challenging issue. A need was felt to standardize the duration of grace notes through experimentation and the opinions of domain experts. More experimentation and a much larger annotated dataset are needed for enhanc‑
ing the algorithm and utilizing the ornamentation features for any application. Music
recommendation is a prominent application in music computing considering online
music consumption.
REFERENCES
[1] Lu, L., Wang, M., and Zhang, H.‑J. (2004). Repeating pattern discovery and struc‑
ture analysis from acoustic music data. In: Proceedings of the 6th ACM SIGMM
international workshop on Multimedia information retrieval, New York, NY, USA,
pp. 275–282.
[2] Gulati, S., Serra, J., Ishwar, V., Senturk, S., and Serra, X. (2016). Phrase‑based raga
recognition using vector space modeling. In: 2016 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 66–70.
[3] Klapuri, A. (2010). Pattern induction and matching in music signals. In: Mitsuko
Aramaki, Mathieu Barthet, Richard Kronland‑Martinet, Sølvi Ystad (eds.)
International Symposium on Computer Music Modeling and Retrieval, Springer, New
York, pp. 188–204.
[4] Aucouturier, J.‑J., and Sandler, M. (2002). Finding repeating patterns in acoustic musi‑
cal signals: Applications for audio thumb nailing. In: Jyri Huopaniemi (ed.) Audio
Engineering Society Conference: 22nd International Conference: Virtual, Synthetic,
and Entertainment Audio, Audio Engineering Society.
[5] Bainbridge, D., and Bell, T. (2001). The challenge of optical music recognition.
Computers and the Humanities, 35(2), 95–121.
[6] Orio, N., and Roda, A. (2009). A measure of melodic similarity based on a graph repre‑
sentation of the music structure. In: ISMIR, Kobe, Japan, pp. 543–548.
[7] Boulanger‑Lewandowski, N., Bengio, Y., and Vincent, P. (2013). Audio chord recogni‑
tion with recurrent neural networks. In: ISMIR, Citeseer, pp. 335–340.
[8] Ullrich, K., Schlüter, J., and Grill, T. (2014). Boundary detection in music struc‑
ture analysis using convolutional neural networks. In: ISMIR, Taipei, Taiwan.
pp. 417–422.
[9] Vempala, N. N., and Russo. F. A. (2012). Predicting emotion from music audio fea‑
tures using neural networks. In: Proceedings of the 9th International Symposium on
Computer Music Modeling and Retrieval (CMMR), Lecture Notes in Computer Science,
London, UK, pp. 336–343.
[10] Personal Discussions (2012–2013). Personal discussions with renowned vocalist Veena
Saharabuddhe, Pandit Sanjeev Abhyankar and flute player Pandit Keshav Ginde.
[11] Narayan, A. A. (2018). Computational study of ornaments in Hindustani music. PhD
thesis, International Institute of Information Technology, Hyderabad.
[12] Bai, X., Wang, X., Liu, X., Liu, Q., Song, J., Sebe, N., and Kim, B. (2021). Explainable
deep learning for efficient and robust pattern recognition: A survey of recent develop‑
ments. Pattern Recognition, 120, 108102.
[13] Solomatine, D. P., and Ostfeld, A. (2008). Data‑driven modeling: some past experiences
and new approaches. Journal of Hydro Informatics, 10(1), 3–22.
[14] Chen, Y. W., and Jain, L. C. (2020). Deep Learning in Healthcare: Paradigms and
Applications, Springer, Heidelberg.
[15] Zhang, Q., Yang, L. T., Chen, Z., and Li, P. (2018). A survey on deep learning for big
data. Information Fusion, 42, 146–157.
[16] Busia, A., Dahl, G. E., Fannjiang, C., Alexander, D. H., Dorfman, E., Poplin, R., and
DePristo, M. (2018). A deep learning approach to pattern recognition for short DNA
sequences. BioRxiv, 353474.
[17] Sarker, I. H. (2021). Deep cybersecurity: a comprehensive overview from neural net‑
work and deep learning perspective. SN Computer Science, 2(3), 154.
[18] Calvo‑Zaragoza, J., Toselli, A. H., and Vidal, E. (2019). Handwritten music recogni‑
tion for mensural notation with convolutional recurrent neural networks. Pattern
Recognition Letters, 128, 115–121.
[19] Velankar, M., and Kulkarni, P. (2022). Melodic pattern recognition and similarity mod‑
eling: a systematic survey in music computing. Journal of Trends in Computer Science
and Smart Technology, 4(4), 272–290.
[20] Elbir, A., and Aydin, N. (2020). Music genre classification and music recommendation
by using deep learning. Electronics Letters, 56(12), 627–629.
[21] Van den Oord, A., Dieleman, S., and Schrauwen, B. (2013). Deep content‑based music
recommendation. In: C.J. Burges and L. Bottou and M. Welling and Z. Ghahramani and
K.Q. Weinberger (eds.) Advances in Neural Information Processing Systems, NeurIPS
Proceedings, 26.
[22] Benetos, E., Dixon, S., Duan, Z., and Ewert, S. (2018). Automatic music transcription:
An overview. IEEE Signal Processing Magazine, 36(1), 20–30.
[23] Briot, J. P., and Pachet, F. (2020). Deep learning for music generation: challenges and
directions. Neural Computing and Applications, 32(4), 981–993.
[24] Micchi, G., Kosta, K., Medeot, G., and Chanquion, P. (2021). A deep learning method
for enforcing coherence in automatic chord recognition. In: ISMIR, pp. 443–451.
[25] Bittner, R. M., McFee, B., Salamon, J., Li, P., and Bello, J. P. (2017, October). Deep
salience representations for F0 estimation in polyphonic music. In: ISMIR, Suzhou
(China) pp. 63–70.
[26] Su, L. (2018, April). Vocal melody extraction using patch‑based CNN. In: 2018 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP),
IEEE, Calgary, AB, Canada. pp. 371–375.
[27] Sharma, A. K., Aggarwal, G., Bhardwaj, S., Chakrabarti, P., Chakrabarti, T., Abawajy,
J. H., and Mahdin, H. (2021). Classification of Indian classical music with time‑series
matching deep learning approach. IEEE Access, 9, 102041–102052.
[28] Shah, D. P., Jagtap, N. M., Talekar, P. T., and Gawande, K. (2021). Raga recognition in
Indian classical music using deep learning. In: Artificial Intelligence in Music, Sound, Art
and Design: 10th International Conference, EvoMUSART 2021, Held as Part of EvoStar
2021, Proceedings 10, Virtual Event April 7–9, 2021, Springer, New York, pp. 248–263.
[29] Nag, S., Basu, M., Sanyal, S., Banerjee, A., and Ghosh, D. (2022). On the application of
deep learning and multi‑fractal techniques to classify emotions and instruments using
Indian Classical Music. Physica A: Statistical Mechanics and its Applications, 597,
127261.
[30] Singh, Y., and Biswas, A. (2022). Robustness of musical features on deep learning mod‑
els for music genre classification. Expert Systems with Applications, 199, 116879.
[31] Velankar, M., Thombre, S., and Wadkar, H. (2022). Evaluating deep learning models
for music emotion recognition. International Journal of Engineering Applied Sciences
and Technology, 7(6), 252–259.
[32] Pudaruth, S. K. (2016). A reflection on the aesthetics of Indian music, with special refer‑
ence to Hindustani raga‑Sangita. Sage Open, 6(4), 2158244016674512
[33] Cont, A. (2010). A coupled duration‑focused architecture for real‑time music‑to score
alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6),
974–987.
[34] Windsor, Luke, Aarts, Rinus, Desain, Peter, Heijink, Hank, and Timmers, Renee.
(2001). The timing of grace notes in skilled musical performance at different tempi: A
preliminary case study. Psychology of Music, 29(2), 149–169.
[35] Jeong, Dasaem, Kwon, Taegyun, Kim, Yoojin, and Nam, Juhan. (2019). Score and per‑
formance features for rendering expressive music performances. In: Music Encoding
Conference, Music Encoding Initiative Vienna, Austria, pp. 1–6.
15 Content Analysis
Framework for Skill
Assessment
Abhishek Kabade, Harshad Jagadale,
Anurag Bharde, and S. P. Sonavane
15.1 INTRODUCTION
The education sector has undergone significant changes due to advancements in tech‑
nology, resulting in a need for innovative learning and skill assessment methods.
This book chapter delves into a cutting‑edge system that harnesses the power of the
Internet of Things (IoT), machine learning, artificial intelligence (AI), blockchain,
and RFID technology to address these challenges.
In today’s fast‑paced world, educational institutions must teach students theo‑
retical knowledge and practical skills that are relevant to the industry. Traditional
assessment methods may often fail to capture students’ abilities. Therefore, there is a
growing demand for a robust system that can accurately assess skills while fostering
collaboration between academia and industry. The system presented in this chapter
revolves around an AI‑based learning environment, incorporating an academic chat‑
bot and a central database to facilitate seamless communication between students
and industry professionals. The effectiveness of this system is demonstrated through
a case study conducted at an academic institution, Walchand College of Engineering
(WCE) in Sangli (MS), India.
At the core of the system lies the integration of RFID technology. Each student is
assigned a unique identity through an RFID card that stores comprehensive information,
including personal details, academic performance, and skill indices. These skill indices
represent the students’ abilities, encompassing both relative and absolute skill levels.
Stored in separate blocks within a blockchain network, these indices ensure data integ‑
rity and security while granting industry professionals access to relevant information.
The system also employs machine learning algorithms along with natural language pro‑
cessing techniques to extract meaningful content from diverse sources such as text and
audio materials. This content analysis enables the system to provide intelligent responses
to student queries and offer personalized recommendations for skill improvement.
The book chapter is divided into three parts to provide a comprehensive under‑
standing of the system.
Part I focuses on the infrastructure of the WCE campus network, highlighting
the academic chatbot, central database, and wireless RFID connections that form the
foundation of the system.
Part II delves into the intent behind industry queries, referencing query process‑
ing from Part I and introduces the content analyzer. The contents are selected from
documents, and for experimentation, the student resume document is taken as input
to extract skill‑indicating tokens resulting in Resume Analyzer [1]. This component
calculates skill indices based on the data stored on RFID cards. This section demon‑
strates how the system offers a comprehensive overview of students’ skills, assisting
both academia and industry in identifying suitable talent.
Part III explores the mechanism of recommending study materials and educa‑
tional resources tailored to each student’s specific skill improvement needs.
By integrating IoT, machine learning, AI, blockchain, and RFID technology, this
system offers a holistic approach to content learning and analysis for skill assess‑
ment. It promotes effective collaboration between academia and industry, enhances
skill evaluation, and provides personalized opportunities for skill development. The
subsequent sections of the chapter delve into the intricate details of the system, high‑
lighting its potential to revolutionize the educational landscape and meet the evolv‑
ing demands of the modern world.
i. Developed framework
ii. Connected with the cloud
iii. Integrated RFID
iv. AI/ML Block
v. Result Analysis
To address these challenges, a chatbot can be developed. The chatbot aims to facilitate interaction with users and can be accessed anytime and anywhere.
By integrating the chatbot seamlessly into the university or college website through
simple language conversions, it becomes readily available to provide a wide range of
information related to the institution and student‑specific queries. The chatbot serves as
a valuable resource for anyone accessing the university’s website, allowing users to ask
questions pertaining to the university and receive corresponding responses generated by
an algorithm after processing the input message. Job search [5] from the online portal is
one of the most upcoming and efficient job search methods for both the job seeker and
the job provider. The solutions for these new‑age technologies are still the traditional
and time‑taking methods are still the same. The answers are driven by manual rules
like searching and reading the complete resume which takes a huge mental power and
time also hinging a bit with the effectiveness and frustration. Job finding is a type of
recommender system. The Recommender system was first introduced by Resnick and
Varian [6] who pointed out that in a typical recommender system, people provide rec‑
ommendations as inputs, which the system then aggregates and directs to appropriate
recipients. For job matching, many research works have been conducted to invent differ‑
ent recommender systems for job recruiting [7]. Among all of them, Malinowski et al.
[8] proposed bilateral matching recommendation systems for bringing people together
with jobs using an Expectation Maximization algorithm, while Golec and Kahya [9]
present a fuzzy model for competency‑based employee evaluation and selection with
fuzzy rules. Paparrizos et al. [10] used Decision Table/Naive Bayes as a hybrid classifier.
FIGURE 15.1 Institutional level content learning and analysis for skill assessment.
15.4.1 NLP Implementation
The implementation of the chatbot involves the use of Natural Language Toolkit
(NLTK), a widely adopted open‑source library in Python for NLP tasks. Figure 15.2 indicates the flow of implementation. The following steps outline the implementa‑
tion process:
1. Importing Corpus: This step entails accessing and uploading the data files
required to train and evaluate the NLP models. The system retrieves the
necessary data files, which serve as the dataset for training the chatbot. The
corpus may consist of various textual data such as documents, conversa‑
tions, or any relevant content.
2. Preprocessing Data: Data preprocessing is a critical step that involves
cleaning and modifying the raw text to make it suitable for analysis. It
includes operations such as removing punctuation, converting all letters to
lowercase, eliminating stop words (commonly used words like “a,” “the,” and
“is” that carry little meaning), and handling special characters or symbols.
3. Test Case Processing: Test case processing involves preparing and manag‑
ing test cases to evaluate the chatbot’s functionality. Test cases comprise
input patterns and their corresponding expected responses, used to validate
the accuracy and validity of the chatbot’s replies. These test cases cover dif‑
ferent scenarios and user queries to ensure the chatbot performs as intended.
By following these detailed steps, including data import, preprocessing, test case
processing, tokenization, stemming, BoW, and one‑hot coding, the NLP‑based chat‑
bot can effectively process and analyze textual data, generating precise and relevant
responses to user inquiries, depicted in Figure 15.2.
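A minimal NLTK preprocessing sketch covering the corpus import, lowercasing, punctuation and stop-word removal, tokenization, and stemming steps referred to above is shown here; the sample sentence is a placeholder, and the exact pipeline used for the WCE chatbot may differ.

import string
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)       # tokenizer models (newer NLTK versions may also need "punkt_tab")
nltk.download("stopwords", quiet=True)   # stop-word list

def preprocess(text):
    # Lowercase, strip punctuation, tokenize, drop stop words, and stem.
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = word_tokenize(text)
    stop_words = set(stopwords.words("english"))
    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in tokens if t not in stop_words]

# Example: preprocess("The admission process is simple!") -> ['admiss', 'process', 'simpl']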
15.4.2 Block Diagram
15.4.3 Algorithm and Code: Chatbot Response Generation
Using TF‑IDF and Cosine Similarity
The code implements a chatbot that uses TF‑IDF and cosine similarity to find
the most similar sentence to the user’s response. It generates a response based on the
similarity, and if there is no understanding, it responds with a default message. The
TF‑IDF matrix is used to represent the importance of words in the sentences, and cosine similarity is used to measure the similarity between the user's response and the previous sentences.
Algorithm
Program Code
def response(user_response):
    # Initialize the response variable for the chatbot
    robo_response = ''
In the provided code snippet, cosine similarity is utilized to find the most similar
sentence to the user’s input. Cosine similarity is a metric which is used to measure
the similarity between two vectors, particularly in high‑dimensional spaces.
How it helps the chatbot generate responses:
1. Sentence Tokenization: The user’s input and other sentences are tokenized,
which means they are split into individual words or phrases.
2. TF‑IDF Vectorization: The TfidfVectorizer from the ‘sklearn.feature_
extraction.text’ module is used to convert the tokenized sentences to numer‑
ical representations called TF‑IDF vectors. TF‑IDF, which stands for Term
Frequency‑Inverse Document Frequency, is a technique commonly utilized
in information retrieval and text mining. Based on the frequency and rarity
of the words in the entire corpus, it assigns corresponding weights.
3. Cosine Similarity Calculation: The TF‑IDF vector for the user’s input
is compared with the TF‑IDF vectors of all the other sentences in the
‘sentence_tokens’ list. The cosine similarity is calculated between the vec‑
tors using the ‘cosine_similarity’ function from the ‘sklearn.metrics.pair‑
wise’ module.
The cosine similarity between two vectors can be determined by evaluating the dot product of the vectors and dividing it by the product of their magnitudes, as illustrated in the sketch below.
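Putting these steps together, a hedged sketch of the response function described above — following the common NLTK/scikit-learn chatbot pattern rather than reproducing the authors' exact code, and passing the corpus list as an explicit parameter instead of a module-level variable — could look like this:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def response(user_response, sentence_tokens):
    # Append the user's input, vectorize the whole corpus, and pick the closest sentence.
    robo_response = ""
    sentence_tokens.append(user_response)
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentence_tokens)
    sims = cosine_similarity(tfidf)[-1, :-1]   # similarity of the input (last row) to every stored sentence
    best_idx = sims.argmax()
    sentence_tokens.pop()                      # restore the corpus
    if sims[best_idx] == 0:
        robo_response = "I am sorry, I don't understand you."
    else:
        robo_response = sentence_tokens[best_idx]
    return robo_response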
Algorithm of Pyresparser
Algorithm
13. Parse the resume document using the appropriate parser for the file format
(e.g., PyPDF2 for PDF).
14. Preprocess the parsed text to clean and normalize it.
15. Segment the preprocessed text into sentences.
16. Use spaCy’s NER module to identify entities (names, addresses, and phone
number) in each sentence.
17. Add the identified entities to the list of extracted entities.
18. Use pattern‑matching techniques to extract specific information (educa‑
tional qualifications, work experience, and skills) from the parsed text.
19. Add the extracted information to the list of extracted entities.
20. Convert the list of extracted entities into a structured format like JSON or
DataFrame.
21. Return the parsed resume information in the chosen structured format.
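For reference, typical usage of the pyresparser library follows the pattern below; this assumes pyresparser and its spaCy/NLTK dependencies are installed, and the file name is a placeholder rather than a file from the study.

from pyresparser import ResumeParser

# Parse a resume file and obtain a dictionary of extracted fields.
data = ResumeParser("sample_resume.pdf").get_extracted_data()
print(data.get("name"))
print(data.get("skills"))    # list of skill keywords found in the resume
print(data.get("degree"))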
• Resume collections.
• Keywords searching and skills extraction.
• Matching skills with industry requirements.
• Calculating percentage requirement fulfilled by the candidates.
• Creating a QR code of the uploaded resume and a self‑introductory video of the user (see the sketch after this list).
• Using Blockchain technology for data privacy and security.
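For the QR-code feature in the list above, a minimal sketch using the widely used qrcode package could look as follows; the resume URL is a placeholder, not an address from the actual system.

import qrcode

resume_url = "https://example.org/resumes/candidate_123.pdf"   # placeholder link to the stored resume
img = qrcode.make(resume_url)          # returns a PIL image
img.save("candidate_123_qr.png")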
15.5.2 Resume Collection
Different machine learning algorithms are employed to suggest a shortlist of resumes
to recruiters from a large pool of resumes. However, these decisions rely on the
assumption that the data provided in the resumes is structured and standardized
within the same field.
For instance, if a company requires the “Docker” skill, the system can search for the keyword “Docker” in the candidates' resumes. However, this method is not able to give any insight into the candidate's proficiency level in that specific field. It
is unable to determine the candidate’s level of expertise or competence in that par‑
ticular area.
15.5.3 Blockchain Implementation
In the proposed Figure 15.5, the process initiates with a request being initialized.
This request triggers the creation of a block that contains the student’s resume.
After the formation of the block, it is transmitted to peer industries within the
network. The block node then undergoes validation to ensure its authenticity and
integrity. If the block successfully passes the validation process, it is added to
the existing Blockchain, expanding the chain with the new block. Finally, the
request reaches the industry, marking the completion of the cycle. This diagram
visually represents the sequential flow of the request and highlights the crucial
steps involved in the creation and dissemination of blocks within the Blockchain
network.
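A toy illustration of the hash-linked block structure implied by this flow is shown below; it is deliberately simplified and omits the peer transmission, consensus, and signature steps a real deployment would need.

import hashlib, json, time

def make_block(resume_record, previous_hash):
    # Build a block holding one resume record and chain it to the previous block's hash.
    block = {"timestamp": time.time(), "resume": resume_record, "previous_hash": previous_hash}
    block["hash"] = hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

def is_valid(chain):
    # Each block must reference the hash of the block before it.
    return all(chain[i]["previous_hash"] == chain[i - 1]["hash"] for i in range(1, len(chain)))

# Example usage:
genesis = make_block({"student": "genesis"}, previous_hash="0")
chain = [genesis, make_block({"student": "S123", "skills": ["Python", "SQL"]}, genesis["hash"])]
print(is_valid(chain))   # True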
The algorithms are written in the Python language and are used for keyword searching and extraction. The extracted data is stored in a MySQL database and fetched whenever required for matching against the given or mandatory skill set of the industry. Based on the candidate's skills, the system infers the candidate's field of interest, and the framework then suggests online courses and YouTube channels matching that interest.
Based on this score, the framework classifies candidates into three groups:
1. Fresher
2. Intermediate
3. Experienced
The score is calculated as the sum of the percentage of skills matched and the marks obtained for mentioning other important details such as projects, achievements, and educational details.
Candidates and companies interact through a web‑based interface developed using the “Streamlit” library. The skill‑match percentage is divided by 2 to scale it to 50 marks, and the remaining 50 marks are awarded for mentioning the other important resume details.
The admin view lists candidate details such as the recommended fields, the name of the candidate, etc. The admin is also provided with a pie chart showing the recommended fields for the candidates, depicting the percentage share of each field.
In Figure 15.11, the company can add or delete skills from the skill set that the candidate needs to have; the recruiter can also add skills manually. This skill set is then matched with the skills of the candidate, and the score is calculated from the percentage of skills matched with the company's required skill set. The candidate's extracted skill set and the company's skill set are stored in two different arrays, and their intersection is taken to calculate the percentage score, as sketched below.
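A sketch of this intersection-based scoring, combined with the 50/50 split and the three-group classification described earlier, is given below; the group thresholds are assumptions, since the chapter does not state specific cut-offs.

def skill_match_percentage(candidate_skills, required_skills):
    # Percentage of the company's required skills found in the candidate's skill set.
    candidate = {s.lower() for s in candidate_skills}
    required = {s.lower() for s in required_skills}
    if not required:
        return 0.0
    return 100.0 * len(candidate & required) / len(required)

def total_score(match_percentage, other_details_score):
    # Skill match is scaled to 50 marks; other resume details contribute the remaining 50.
    return match_percentage / 2 + min(other_details_score, 50)

def classify(score):
    # Assumed cut-offs, for illustration only.
    if score >= 70:
        return "Experienced"
    if score >= 40:
        return "Intermediate"
    return "Fresher"

# Example: a candidate matching half the required skills, with 30/50 marks for other details.
pct = skill_match_percentage(["Python", "Docker", "SQL"], ["Docker", "Kubernetes"])
print(pct, classify(total_score(pct, other_details_score=30)))   # 50.0 Intermediate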
FIGURE 15.10 Pie chart depicting the percentage of candidates interested in each field.
15.9 REMARK
This book chapter explores an innovative IoT‑based learning system that integrates
machine learning, AI, blockchain, and RFID technology. The system aims to bridge
the gap between academia and industry by providing a comprehensive solution for
skill assessment and collaboration between the two sectors. The chapter discusses
the implementation details, with a particular focus on the development of an aca‑
demic chatbot using NLP algorithms. The chatbot's ability to understand language and generate human‑like responses enables meaningful interactions and personalized
recommendations for skill improvement.
The integration of blockchain technology in the Resume Analyzer module, along
with NLP techniques, enhances security, privacy, and trust in the resume verification
process. It enables efficient text processing and addresses the challenges associated
with natural language extraction. However, the system may face difficulties in disam‑
biguating similar words with different meanings. Despite this, it serves as a primary
tool for identifying eligible candidates for job opportunities.
Additionally, the use of QR codes for recruiters streamlines the hiring process,
saving time and providing convenience. However, it is important to carefully evaluate
the challenges and limitations associated with blockchain technology in the resume
analysis system. Factors such as complexity, scalability, and integration efforts
should be considered based on the specific requirements and goals of the application.
REFERENCES
1. Resume Screening Using LSTM, International Journal of Research Publication and
Reviews, 3(4), 2567–2569, https://ijrpr.com/uploads/V3ISSUE4/IJRPR3705.pdf
2. Adamopoulou, E., and Moussiades, L. (2020). An overview of chatbot technol‑
ogy. In: Maglogiannis I., Iliadis L., Pimenidis E. (eds), Artificial Intelligence
Applications and Innovations. AIAI 2020. IFIP Advances in Information and
Communication Technology, vol. 584, Springer, Cham. https://www.researchgate.net/
publication/341730184_An_Overview_of_Chatbot_Technology
3. Ghandeharioun, A., McDuff, D., Czerwinski, M., and Rowan, K. (2019). EMMA:
An emotion‑aware wellbeing chatbot. In: 2019 8th International Conference on
Affective Computing and Intelligent Interaction (ACII), Cambridge, UK, 2019, pp. 1–7,
doi:10.1109/ACII.2019.8925455. https://ieeexplore.ieee.org/abstract/document/8925455
4. Patel, N. P., Parikh, D. R., Patel, D. A., and Patel, R. R. (2019). AI and web‑based
human‑like interactive university chatbot (UNIBOT). In: 2019 3rd International
Conference on Electronics, Communication and Aerospace Technology (ICECA),
Coimbatore, India, 2019, pp. 148–150, doi:10.1109/ICECA.2019.8822176. https://ieeexplore.ieee.org/abstract/document/8822176
5. Lin, Y., Lei, H., Addo, P. C., and Li, X. (2016). Machine learned resume‑job matching
solution. https://arxiv.org/abs/1607.07657
6. Resnick, P., and Varian, H. R. (1997). Recommender systems. Communications of the
ACM, 40(3), 56–58. doi:10.1145/245108.245121
7. Al‑Otaibi, S. T., and Ykhlef, M. (2012). A survey of job recommender systems.
International Journal of the Physical Sciences, 7(29), 5127–5142. https://academicjour‑
nals.org/journal/IJPS/article‑full‑text‑pdf/B19DCA416592.pdf
8. Malinowski, J., Keim, T., Wendt, O., and Weitzel, T. (2006). Matching people and
jobs: A bilateral recommendation approach. https://www.researchgate.net/publi‑
cation/232615527_Matching_People_and_Jobs_A_Bilateral_Recommendation_
Approach
9. Golec, A., and Kahya, E. (2007). A fuzzy model for competency‑based employee evalu‑
ation and selection. Computers & Industrial Engineering, 52(1), 143–161. https://www.
researchgate.net/publication/222332549_A_fuzzy_model_for_competencybased_
employee_evaluation_and_selection
10. Paparrizos, I., Cambazoglu, B. B., and Gionis, A. (2011). Machine learned job rec‑
ommendation. In: Proceedings of the Fifth ACM Conference on Recommender
Systems, ACM, New York, pp. 325–328. https://www.researchgate.net/publication/
221141098_Machine_learned_job_recommendation
11. Bojars, U., and Breslin, J. G. ResumeRDF: Expressing skill information on the semantic
web. https://www.researchgate.net/publication/266448089_ResumeRDF_Expressing_
skill_information_on_the_Semantic_Web
12. (2015). Resume analyzer: An automated solution to recruitment process. International
Journal of Engineering and Technical Research, 3(8). https://www.erpublication.org/
published_paper/IJETR032886.pdf
13. Bojars, U. Extending FOAF with resume information. https://www.w3.org/2001/sw/
Europe/events/foaf‑galway/papers/pp/extending_foaf_with_resume/
14. Verma, M. (2017). Cluster based ranking index for enhancing recruitment process using
text mining and machine learning. International Journal of Computer Applications,
157(9). https://www.researchgate.net/publication/312518297_Cluster_based_Ranking_
Index_for_Enhancing_Recruitment_Process_using_Text_Mining_and_Machine_
Learning
15. Jurka, T. P., Collingwood, L., Boydstun, A. E., Grossman, E., and van Atteveldt, W.
(2013). A supervised learning package for text classification. The R Journal. https://
journal.r‑project.org/archive/2013/RJ‑2013‑001/index.html
16. Li, L., Chu, W., Langford, J., and Schapire, R. E. (2010). A contextual‑bandit approach
to personalized news article recommendation. In: Proceedings of the Nineteenth
International Conference on World Wide Web. https://arxiv.org/pdf/1003.0146.pdf
16 Machine‑Learning
Techniques for
Effective Text Mining
Shivam Singh, Chandrakant D. Kokane,
Vilas Deotare, and Tushar Waykole
16.2.3 Stopword Removal
• Stopwords are common words that appear frequently in the language (e.g.,
“the,” “and,” and “is”) but rarely add significantly to the overall meaning of
the text. Stopwords are removed to reduce noise and focus on more relevant
words that provide crucial information.
Preprocessing and text representation help to turn raw text data into an organized
and comprehensible form, laying the groundwork for effective text mining.
These transformed representations can then be fed into various machine‑
learning algorithms for tasks such as text classification, clustering, sentiment
analysis, topic modeling, and more. Proper preprocessing and representation
are essential for extracting accurate and useful insights from text data as well as
increasing the overall performance of text mining models. Figure 16.2 depicts
the activity diagram detailing the preprocessing procedures within the context
of text mining.
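The sketch below illustrates these preprocessing steps in plain Python; the sample sentence and the abbreviated stopword list are illustrative assumptions:

```python
# Minimal preprocessing sketch: lowercasing, simple tokenization, and stopword removal.
import re

STOPWORDS = {"the", "and", "is", "a", "an", "of", "to", "in", "that"}  # assumed subset

def preprocess(text: str) -> list[str]:
    tokens = re.findall(r"[a-z]+", text.lower())      # normalize case, keep alphabetic tokens
    return [t for t in tokens if t not in STOPWORDS]  # drop stopwords, keep content words

print(preprocess("The system removes the stopwords and keeps the words that carry meaning."))
# -> ['system', 'removes', 'stopwords', 'keeps', 'words', 'carry', 'meaning']
```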
1. Naive Bayes:
• In the context of text categorization, Naive Bayes posits that the pres‑
ence of one word in a document is unrelated to the presence of other
words (thus the term “naive”). It computes the likelihood of a text
belonging to a specific class based on the likelihood of individual words
occurring in that class.
• Naive Bayes is a computationally efficient method that requires little
training data. It is effective for high‑dimensional feature spaces such as
BoW representations.
2. Support Vector Machines (SVMs):
• SVMs are a strong and commonly used supervised learning technique
for text classification. Its goal is to determine the ideal hyperplane in a
high‑dimensional space for separating data points of different classes.
• SVM attempts to discover the hyperplane in the feature space that opti‑
mizes the margin between data points of various classes in the context
of text classification. It is useful for applications with complex decision
boundaries and can efficiently handle high‑dimensional feature vectors.
• SVM can handle noisy data and performs well with less training data.
However, when dealing with very large datasets, it may experience scal‑
ing challenges.
3. Decision Trees:
• Decision Trees are a non‑parametric supervised learning approach that
can be used for classification and regression applications. They partition
the data recursively based on feature values to generate a tree‑like struc‑
ture that predicts the class labels of subsequent occurrences.
• Decision trees in text classification make binary judgments at each node
based on the values of specific words or features, which results in the
assignment of a class label at the tree’s leaves.
• Decision Trees are easily interpretable and visualized, making them
useful for understanding the decision‑making process in text categori‑
zation problems. They are, nevertheless, prone to overfitting, particu‑
larly with deep trees and noisy data.
4. Neural Networks:
• Because of their ability to automatically learn complicated patterns
from text input, neural networks, particularly deep learning models,
have gained great interest in text categorization.
• For sequential data such as words or texts, RNNs and long short‑term
memory (LSTM) networks are appropriate. They are capable of captur‑
ing text’s sequential dependencies, making them useful for sentiment
analysis and language modeling.
• CNNs are frequently employed for text classification problems, par‑
ticularly when dealing with fixed‑length input like BoW representa‑
tions. From text data, CNNs can learn local patterns and hierarchical
representations.
• Transfer learning with pretrained language models such as BERT and
GPT‑3 has also demonstrated exceptional performance in text categori‑
zation tasks, exploiting knowledge obtained from large‑scale pretraining.
Each supervised learning algorithm has its strengths and weaknesses when applied
to text classification. The choice of algorithm depends on factors like the size of
the dataset, the complexity of the task, the interpretability required, and the avail‑
ability of computational resources. Proper evaluation and experimentation are
crucial to selecting the most suitable algorithm for a specific text classification prob‑
lem. Figure 16.3 presents the activity diagram delineating the operational flow of
Supervised Learning for Text Classification within the domain of text mining.
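To make such a comparison concrete, the sketch below trains the three classical classifiers discussed above on a tiny illustrative dataset with scikit‑learn; the example texts and labels are assumptions, not benchmark data:

```python
# Sketch: comparing Naive Bayes, a linear SVM, and a Decision Tree on toy text data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline

texts = ["great product, works well", "terrible quality, broke fast",
         "excellent support and value", "awful experience, do not buy"]
labels = ["pos", "neg", "pos", "neg"]  # illustrative labels

for name, clf in [("Naive Bayes", MultinomialNB()),
                  ("Linear SVM", LinearSVC()),
                  ("Decision Tree", DecisionTreeClassifier(random_state=0))]:
    model = make_pipeline(TfidfVectorizer(), clf)   # vectorize text, then classify
    model.fit(texts, labels)
    print(name, model.predict(["great value", "broke after a day"]))
```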
1. Bag‑of‑Words (BoW):
• BoW is a common and simple text data representation approach.
It entails generating a vocabulary of unique words (or tokens) found
across the dataset. Each document is then represented numerically as a
vector, with each entry representing the frequency of a certain term in
the document.
• BoW treats each word separately and disregards word order and con‑
text. While straightforward, it captures the incidence of various words
in documents, making it appropriate for text categorization tasks such
as sentiment analysis and spam detection.
2. TF‑IDF (Term Frequency‑Inverse Document Frequency):
• The TF‑IDF representation is a variant of the BoW representation that
assigns a weight to each word in the document to indicate its relevance
in the document relative to the overall dataset.
• TF is a measure of the frequency of a term in a document, whereas
Inverse Document Frequency (IDF) is a measure of the rarity of a word
across all documents. The TF‑IDF weight is the product of these two
variables.
• TF‑IDF aids in the identification of words that are discriminative and
informative for a given class because they are common in the document
of interest but uncommon in other documents.
3. Word Embeddings:
• Word embeddings are dense vector representations of words in a continuous
vector space. They capture the semantic relationships
between words, allowing algorithms to comprehend word similarity
and context.
• In text classification tasks, pretrained word embeddings such as
Word2Vec, GloVe, and FastText are extensively employed. These
embeddings are derived from big text corpora and can be applied to downstream
classification tasks, transferring the semantic knowledge learned during pretraining.
FIGURE 16.3 Supervised learning for text classification.
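To make the contrast between the BoW and TF‑IDF representations concrete, the scikit‑learn sketch below vectorizes three assumed example documents with both schemes:

```python
# Sketch: Bag-of-Words counts versus TF-IDF weights for the same small corpus.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs are pets"]

bow = CountVectorizer()
X_bow = bow.fit_transform(docs)          # raw term counts per document
print(bow.get_feature_names_out())
print(X_bow.toarray())

tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(docs)      # counts reweighted by inverse document frequency
print(X_tfidf.toarray().round(2))        # frequent-but-common words receive lower weights
```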
1. Data Preprocessing: Clean the text data and remove URLs, special char‑
acters, and emojis. Tokenize the tweets and remove stopwords.
2. Feature Engineering: Convert the preprocessed tweets into numerical rep‑
resentations using TF‑IDF or word embeddings.
Real‑World Example: A company wants to gauge public sentiment about their latest
product release. They use sentiment analysis to analyze thousands of tweets men‑
tioning the product. The analysis reveals that overall sentiment is positive, but some
negative feedback points to specific issues that need to be addressed for product
improvement.
1. Data Preprocessing: Clean the emails, remove HTML tags, and normalize
the text (e.g., convert to lowercase).
2. Feature Engineering: Convert the text into numerical representations
using BoW or TF‑IDF.
3. Model Selection: Train classifiers like Naive Bayes, SVMs, or Decision
Trees.
4. Model Evaluation: Use metrics like accuracy, precision, recall, and F1
score to evaluate the model’s performance on a separate test set.
5. Prediction: Deploy the best‑performing model to automatically classify
incoming emails as spam or ham.
Real‑World Example: An email service provider wants to protect its users from
spam emails. By employing a spam detection system using supervised learning, they
can accurately filter spam emails and improve user experience and security.
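A minimal end‑to‑end sketch of such a pipeline with scikit‑learn is shown below; the six example emails and their labels are illustrative assumptions, not the provider's data:

```python
# Sketch: spam/ham classification with TF-IDF features, Naive Bayes, and standard metrics.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

emails = ["win a free prize now", "meeting moved to 3pm", "cheap loans click here",
          "lunch tomorrow?", "claim your reward today", "project report attached"]
labels = ["spam", "ham", "spam", "ham", "spam", "ham"]   # illustrative labels

X_train, X_test, y_train, y_test = train_test_split(
    emails, labels, test_size=0.33, random_state=0, stratify=labels)

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))  # precision, recall, F1 per class
```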
Benefits:
Challenges:
Unsupervised learning for text clustering is a valuable technique for discovering pat‑
terns and organizing large text corpora without the need for explicit class labels. Its
applications span across various domains, including information retrieval, content
analysis, and document organization. However, proper evaluation and interpretation
of clusters are essential for deriving meaningful insights from unsupervised text clus‑
tering. Figure 16.4 delineates the operational mechanics underlying Unsupervised
Learning for Text Clustering.
Each of these unsupervised clustering algorithms has its strengths and weaknesses.
The choice of algorithm depends on the specific characteristics of the data and the
nature of the clusters to be discovered. Experimenting with different algorithms is
essential to find the most suitable one for a particular clustering task.
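As a concrete illustration, the sketch below clusters a handful of assumed documents with K‑means over TF‑IDF vectors; the documents and the number of clusters are assumptions for the example:

```python
# Sketch: grouping short documents into k clusters using TF-IDF features and K-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = ["stock markets fell sharply today",
        "the central bank raised interest rates",
        "the team won the championship final",
        "star striker injured before the match"]

X = TfidfVectorizer(stop_words="english").fit_transform(docs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)  # two assumed topics

for doc, label in zip(docs, km.labels_):
    print(label, doc)   # documents sharing a cluster id were grouped together
```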
The choice of feature representation depends on the nature of the data and the clus‑
tering task at hand. Proper feature engineering is essential for achieving accurate
and meaningful clustering results and plays a critical role in the success of clustering
algorithms.
1. Data Preprocessing: Clean the text data and remove irrelevant informa‑
tion, special characters, and punctuation. Tokenize the text into words and
apply stemming or lemmatization to reduce words to their base form.
2. Feature Representation: Convert the preprocessed text into numerical
representations using techniques like BoW, TF‑IDF, or word embeddings.
These representations capture the semantic meaning and context of words.
3. Sentiment Classification: Utilize supervised machine‑learning algorithms,
such as Naive Bayes, SVMs, or deep learning models like LSTM or BERT,
to classify the sentiment of the text. These models are trained on labeled
data with sentiment annotations [3].
4. Sentiment Analysis Output: The output of sentiment analysis is the clas‑
sification of the text into positive, negative, or neutral sentiments.
Applications:
Significance:
• Pros:
– Supervised learning provides accurate sentiment predictions when
trained on sufficient and representative labeled data.
– It can handle complex relationships between features and sentiment
labels.
• Cons:
– Requires a large amount of labeled data for training.
– Performance may suffer if the training data is biased or does not
fully represent the distribution of sentiments in real‑world data.
2. Unsupervised Sentiment Analysis: Unsupervised sentiment analysis is an
approach where the sentiment analysis model is not provided with labeled
data during training. Instead, it aims to identify patterns and structures in
the data without predefined sentiment labels
• Approach:
– Data Preparation: Text data undergoes preprocessing, including
cleaning, tokenization, and normalization.
– Feature Representation: The preprocessed text is converted into
numerical representations using techniques like BoW, TF‑IDF, or
word embeddings.
– Clustering: Unsupervised clustering algorithms like K‑means,
Hierarchical Clustering, or DBSCAN are applied to group similar
text samples together based on their numerical representations.
– Sentiment Assignment: Sentiments are assigned to the clusters
based on the predominant sentiment of the text samples within each
cluster. For example, if most samples in a cluster are positive, that
cluster is assigned a positive sentiment.
– Evaluation (Optional): Since unsupervised sentiment analysis
doesn’t have labeled data for evaluation, the quality of the clusters
can be assessed using internal clustering evaluation metrics.
• Pros:
– Unsupervised approaches can be applied when labeled data is
scarce or unavailable.
– It can discover hidden patterns and structures in the data without
relying on predefined sentiment labels.
• Cons:
– The sentiment assignments may not always match human judgment
or predefined sentiment labels.
– It can be challenging to interpret and validate the accuracy of the
results in the absence of labeled data.
1. Data Preprocessing: The text data is cleaned and tokenized into individual
words or phrases.
2. Linguistic Features: NER systems often use linguistic features, such as
POS tagging, to identify named entities. Certain POS patterns are indicative
of named entities, like proper nouns.
3. Machine‑learning Models: NER is often approached as a supervised
learning problem, where machine‑learning models are trained on annotated
text data that includes the labeled named entities. Popular machine‑learning
algorithms like conditional random fields (CRFs), SVMs, or deep learning
models like bidirectional long short‑term memory (BiLSTM) networks are
commonly used [2].
4. Named Entity Classification: During training, the model learns to classify
each word or phrase in the text into predefined categories, such as “Person,”
“Organization,” “Location,” etc.
5. Named Entity Extraction: Once the model is trained, it can be used to pro‑
cess new, unseen text data and identify and extract named entities present in
the text.
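As an illustration of these steps, a pretrained spaCy pipeline (assuming the small English model en_core_web_sm has been installed) extracts entities in a few lines:

```python
# Sketch: extracting named entities with a pretrained spaCy model.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Tim Cook announced new Apple products in Cupertino last Tuesday.")

for ent in doc.ents:
    print(ent.text, ent.label_)   # e.g., 'Tim Cook' PERSON, 'Apple' ORG, 'Cupertino' GPE
```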
Significance:
• NER has various practical applications across different domains:
– Information Extraction: NER is used to extract valuable informa‑
tion from unstructured text, helping in knowledge discovery and
information retrieval.
– Search Engines: NER improves the accuracy of search engines by
identifying and recognizing entities mentioned in search queries or
web pages.
Overall, NER is a fundamental NLP task that plays a crucial role in various down‑
stream applications, improving the understanding and processing of text data by
identifying and classifying specific entities of interest. Depicted in Figure 16.6 is the
sequential process illustrating the operational flow of NER functioning.
• Cons:
– Creating accurate and comprehensive rules can be labor intensive
and time‑consuming.
– Rule‑based approaches may struggle with handling complex or
ambiguous cases.
2. Machine‑learning‑Based NER Techniques: Machine‑learning‑based
NER techniques use supervised or unsupervised learning methods to auto‑
matically learn patterns and features from labeled training data. These
methods have become more popular due to their ability to handle complex
linguistic patterns and generalize well across different domains.
• Approach:
– Data Preparation: A labeled dataset is prepared, where the text
data is annotated with entity labels (e.g., “Person,” “Organization,”
“Location,” etc.).
– Feature Extraction: Features are extracted from the text data, such
as word embeddings, POS tags, contextual information, etc., to rep‑
resent the words in a numerical format suitable for machine‑learn‑
ing algorithms.
– Model Training: Supervised machine‑learning algorithms, like
CRF, SVMs, or deep learning models like BiLSTM networks, are
trained on the labeled data to learn the relationship between the
features and entity labels.
– Model Evaluation: The trained model is evaluated on a separate
test dataset to measure its performance using metrics like precision,
recall, F1 score, etc.
– Prediction: Once the model is trained and evaluated, it can be used
to predict named entities in new, unseen text data.
• Pros:
– Machine‑learning‑based NER can automatically learn complex
patterns and generalize to different contexts and domains.
– It can handle large amounts of data and adapt to new data.
• Cons:
– Requires a significant amount of labeled data for training.
– Model complexity and training time may be higher compared to
rule‑based methods.
Hybrid Approaches:
In practice, hybrid approaches that combine both rule‑based and machine‑learn‑
ing techniques are often used. Rule‑based methods can be used for specific entity
types or known patterns, while machine‑learning models can be applied to handle
more ambiguous cases or new entity types. Such hybrid approaches leverage the
strengths of both techniques to improve NER performance.
In conclusion, NER is a crucial task in NLP, enabling the extraction of valuable
information from text data. Both rule‑based and machine‑learning‑based techniques
have their advantages and applications, and the choice between them depends on the
specific requirements and characteristics of the NER task at hand.
Overall, NER and Entity Linking are powerful NLP techniques with various
real‑world applications across diverse domains. They contribute to improving infor‑
mation retrieval, knowledge extraction, and understanding in the era of big data and
unstructured text.
1. Recurrent Neural Networks (RNNs): RNNs are widely used for sequen‑
tial data processing, making them suitable for handling text sequences.
They have a feedback mechanism that allows them to maintain a hidden
state and consider the context from previous words while processing each
word in a text. However, RNNs suffer from the vanishing gradient problem,
limiting their ability to capture long‑range dependencies in texts.
2. Long Short‑Term Memory (LSTM): LSTMs are a type of RNN that
addresses the vanishing gradient problem. They introduce memory cells
that allow information to be stored and retrieved over long periods, making
them better at handling long sequences and capturing long‑term dependen‑
cies in text [5].
Challenges:
1. Data Preprocessing: Clean the text data by removing stopwords and spe‑
cial characters, and perform tokenization.
2. Word Embeddings: Convert the preprocessed text into word embeddings
using techniques like Word2Vec or GloVe, representing each word as a
dense numerical vector.
3. Model Architecture: Utilize a combination of CNN and LSTM layers to
capture both local and global features in the text data. The CNN layers iden‑
tify local patterns (e.g., n‑grams), while LSTM layers process the sequential
information.
4. Model Training: Train the deep learning model on the labeled dataset
using cross‑entropy loss and backpropagation. Fine‑tune the model on the
training data to optimize the classification performance.
5. Evaluation: Evaluate the model on a separate test dataset using metrics like
accuracy, precision, recall, and F1 score.
Results: The deep learning model achieves high accuracy in classifying news articles
into their respective topics. The combination of CNN and LSTM allows the model
to effectively capture relevant features and patterns in the text, leading to improved
performance compared to traditional machine‑learning methods.
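A minimal Keras sketch of such a CNN + LSTM architecture is given below; the vocabulary size, sequence length, and number of topics are placeholder values rather than figures from this case study:

```python
# Sketch: a small CNN + LSTM text classifier in Keras; hyperparameters are illustrative.
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, NUM_TOPICS = 20000, 200, 5   # placeholder values

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),                       # padded integer word indices
    layers.Embedding(VOCAB_SIZE, 128),                    # word embeddings
    layers.Conv1D(64, kernel_size=5, activation="relu"),  # local n-gram patterns
    layers.MaxPooling1D(pool_size=2),
    layers.LSTM(64),                                      # sequential, longer-range context
    layers.Dense(NUM_TOPICS, activation="softmax"),       # topic probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(X_train, y_train, validation_split=0.1, epochs=5) would then train the classifier.
```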
1. Data Preprocessing: Clean the text data by removing noise, special char‑
acters, and stopwords.
2. BERT Embeddings: Use pretrained BERT (Bidirectional Encoder
Representations from Transformers) to convert the text data into contextu‑
alized word embeddings.
3. Fine‑tuning BERT: Fine‑tune the pretrained BERT model on the senti‑
ment analysis task using the labeled dataset. Update the model’s weights to
adapt to the specific sentiment classification task.
4. Model Training: Train the fine‑tuned BERT model on the labeled dataset,
using categorical cross‑entropy loss and gradient descent optimization.
5. Evaluation: Evaluate the BERT‑based sentiment analysis model on a sepa‑
rate test dataset using metrics like accuracy, precision, recall, and F1 score.
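A compact sketch of this fine‑tuning workflow with the Hugging Face transformers library is shown below; the model name, the two example sentences, and the training settings are illustrative assumptions:

```python
# Sketch: fine-tuning a pretrained BERT model for sentiment classification with transformers.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

texts = ["the product is fantastic", "worst purchase I have ever made"]  # illustrative data
labels = [1, 0]
enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

class TinyDataset(torch.utils.data.Dataset):
    def __len__(self):
        return len(labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in enc.items()}
        item["labels"] = torch.tensor(labels[i])
        return item

args = TrainingArguments(output_dir="out", num_train_epochs=1, per_device_train_batch_size=2)
Trainer(model=model, args=args, train_dataset=TinyDataset()).train()
```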
The future of machine learning for text mining is exciting, with continuous advance‑
ments unlocking new possibilities and applications. As the field progresses, ethical
considerations, responsible AI development, and societal impacts will remain essen‑
tial components of shaping the direction of text mining research and its transforma‑
tive applications [5].
16.10 CONCLUSION
In this chapter, we have explored the dynamic and transformative field of “Machine
Learning Techniques for Effective Text Mining.” Text mining, powered by machine
learning and deep learning, has emerged as a powerful tool to extract valuable
insights from unstructured text data, enabling us to make informed decisions and
understand human language in novel ways.
We began by delving into the definition and scope of text mining, understanding
its importance in processing vast amounts of textual data and extracting meaningful
information from it. We discussed how text mining is employed in various real‑world
applications, including sentiment analysis, NER, text classification, clustering, and
more.
Supervised learning algorithms, such as Naive Bayes, SVMs, Decision Trees,
and Neural Networks, showcased their effectiveness in text classification tasks. We
explored the process of preprocessing and text representation, laying the foundation
for building accurate and robust text classification models.
The chapter then shifted its focus to unsupervised learning techniques, where
algorithms like K‑means, Hierarchical Clustering, and DBSCAN have been instru‑
mental in discovering patterns and structures in unlabeled text data through text
clustering.
Deep learning stole the spotlight as we delved into its application in text mining.
RNNs, LSTM, and the revolutionary Transformers have redefined NLP and pushed
the boundaries of text understanding and generation.
Real‑world case studies illustrated the effectiveness of machine‑learning tech‑
niques in solving practical text mining challenges, showcasing their significance in
diverse domains, including social media monitoring, customer feedback analysis,
sentiment analysis in news articles, and more.
However, we also acknowledged the challenges and ethical considerations in text
mining. Data quality, interpretability, bias, and privacy emerged as key concerns that
demand careful attention and responsible AI development.
As we look to the future, emerging trends in transfer learning, multimodal learn‑
ing, and low‑resource learning promise to shape the landscape of text mining, open‑
ing up new possibilities and applications. Ethical considerations, transparency, and
accountability will remain paramount in driving the responsible advancement of text
mining technologies.
In conclusion, the fusion of machine learning and text mining has paved the
way for unprecedented opportunities in understanding and utilizing textual data.
As this field continues to evolve, researchers, practitioners, and policymakers must
collaborate to ensure the ethical and responsible development of text mining tech‑
niques, harnessing the potential of machine learning to empower us with insights and
knowledge from the vast realm of human language. The journey of text mining is one
of continuous exploration, discovery, and innovation, and its impact on society is set
to grow in profound ways in the years to come.
REFERENCES
[1] Cohen, Aaron M., and William R. Hersh. “A survey of current work in biomedical text
mining.” Briefings in Bioinformatics 6, no. 1 (2005): 57–71.
[2] Doğan, Emre, K. Buket, and Ahmet Müngen. “Generation of original text with text min‑
ing and deep learning methods for Turkish and other languages.” In: 2018 International
Conference on Artificial Intelligence and Data Processing (IDAP), pp. 1–9. IEEE,
2018.
[3] Lewis, David D., Yiming Yang, Tony Russell‑Rose, and Fan Li. “RCV1: A new bench‑
mark collection for text categorization research.” Journal of Machine Learning
Research 5 (2004): 361–397.
[4] Albert, Noel, and Matthew Thomson. “A synthesis of the consumer‑brand relationship
domain: using text mining to track research streams, describe their emotional associa‑
tions, and identify future research priorities.” Journal of the Association for Consumer
Research 3, no. 2 (2018): 130–146.
[5] Jin, Gang. “Application optimization of NLP system under deep learning technology in
text semantics and text classification.” In: 2022 International Conference on Education,
Network and Information Technology (ICENIT), pp. 279–283. IEEE, 2022.
[6] Zhao, Bei, and Wei Gao. “Machine learning based text classification technology.”
In: 2022 IEEE 2nd International Conference on Mobile Networks and Wireless
Communications (ICMNWC), pp. 1–5. IEEE, 2022.
17 Emails Classification
and Anomaly Detection
using Natural Language
Processing
Tanvi Mehta, Renu Kachhoria,
Swati Jaiswal, Sunil Kale, Rajeswari Kannan,
and Rupali Atul Mahajan
17.1 INTRODUCTION
The Enron Corporation, an American energy company headquartered in Houston, Texas,
filed for bankruptcy as a result of the Enron scandal, which became public in 2001 [1]. The
scandal also brought down Enron’s auditor, Arthur Andersen, then amongst the five largest
accountancy and audit firms in the world. At the time, Enron was noted as the worst audit
failure in addition to being the biggest bankruptcy reorganization in American history. The
majority of its clients had left, and the business eventually stopped functioning [2]. Despite
suffering billion‑dollar losses in pensions and asset prices, Enron’s employees and stock‑
holders won only limited compensation through litigation.
Customer satisfaction (CS) assessment is now a key metric for assessing the suc‑
cess of businesses in the market [3]. Client satisfaction has become the top priority
for all kinds of businesses. Analyzing client reviews and comments for a product or ser‑
vice is one technique to gauge CS [4]. Domo claims that we produce more than 2.5
quintillion bytes of data every day, with this data generated from a variety of sources,
including social media, emails, Amazon, YouTube, and Netflix.
Email is said to be one of the earliest forms of electronic business communication. It is inevitable
in a workflow scenario that involves both internal and external actors and is defined
by both [5]. The email bodies inherently include the characteristics of the organi‑
zational process viewpoint. The organizational model, however, has not been fully
taken into account in numerous publications that concentrated on email analysis for
process model mining [6]. So, this research proposes ways like Anomaly Detection,
Social Network Analysis, Email Classification, and Word Cloud, that are employed
using Machine Learning, and Natural Language Processing to evaluate the informa‑
tion and assist toward corporate development.
Anomaly detection, also known as outlier detection, is a technique used in data
mining to locate rare items, events, or observations that deviate markedly from the
bulk of the data [7]. Generally, anomalous objects will
point to some sort of issue, like financial fraud, a structural flaw, a health issue, or
syntax errors in a text. For instance, sudden bursts of activity rather than infrequent
items are frequently intriguing objects in the context of abuse and network intrusion
detection [8]. Isolation Forest is used to create a model of typical behavior and iden‑
tify unusual conduct before the public controversy.
Finding the communities inside a virtual community is crucial because it enables
the identification of members who have common interests and behavior prediction [9].
Social network analysis is a method for analyzing social systems utilizing networks
and graph theory [10]. By changing the visual representation of a network’s nodes and
edges to reflect certain properties of interest, these visualizations offer a method for
qualitatively evaluating networks. Additionally, emails in the collection are categorized
into documentation, transactions, attorney, etc. This is accomplished by first employing
a bag of words, then SVM, Naive Bayes, and RNN approaches. Furthermore, a word
cloud is created for visualization and comparative analysis is done.
A survey of the preceding scholarly papers is provided in Section 17.2. A compre‑
hensive outline of the methods employed in the aforementioned investigations is pro‑
vided in Section 17.3. An explanation of the approaches used is given in Section 17.4.
Section 17.5 offers an in‑depth description of how the suggested tasks are executed.
In Section 17.6, the conclusions and observations of the research are in‑depth ana‑
lyzed. Section 17.7 addresses the outcome and prospective applications.
17.3 TECHNIQUES
The proposed techniques for analyzing the emails of the Enron organization are
briefly explained in this section.
17.3.1 Isolation Forest
An isolation forest examines randomly subsampled data using a tree‑like structure built
on randomly selected attributes [20]. Samples that travel deeper into a tree require more
splits to isolate and are therefore less likely to be anomalies [21]. Conversely, samples
that end up on shorter branches tend to be anomalies, since the tree found it easier to
separate them from the rest of the data.
S(x, n) = 2^(−E(h(x)) / c(n)) (17.1)
where S(x, n) is the anomaly score of observation x; E(h(x)) is the average path length of
observation x over the isolation trees; c(n) is the average path length of an unsuccessful
search in a binary search tree; and n is the number of external nodes.
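In practice, this scoring scheme is available off the shelf; the sketch below fits scikit‑learn's IsolationForest on assumed two‑dimensional toy data to flag injected outliers:

```python
# Sketch: fitting an Isolation Forest and flagging outliers with scikit-learn.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))   # typical behaviour
outliers = np.array([[6.0, 6.0], [-7.0, 5.0]])           # injected anomalies
X = np.vstack([normal, outliers])

iso = IsolationForest(n_estimators=100, contamination=0.01, random_state=0).fit(X)
labels = iso.predict(X)            # +1 for inliers, -1 for anomalies
scores = iso.score_samples(X)      # lower scores indicate more anomalous points
print(np.where(labels == -1)[0])   # indices flagged as anomalies
```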
17.3.6 Gensim
Gensim is an open‑source Python toolkit that aims to represent documents as seman‑
tic vectors as quickly and painlessly as possible for humans and computers [28].
Gensim uses unsupervised machine‑learning methods to analyze unstructured, raw
digital text.
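As a brief illustration of the kind of workflow Gensim supports, the sketch below trains a small Word2Vec model on a few assumed, already‑tokenized sentences:

```python
# Sketch: training a small Word2Vec model with gensim on toy tokenized sentences.
from gensim.models import Word2Vec

sentences = [["please", "review", "the", "attached", "contract"],
             ["schedule", "the", "board", "meeting", "for", "friday"],
             ["review", "the", "quarterly", "trading", "report"]]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=20, seed=1)
print(model.wv.most_similar("review", topn=3))  # nearest neighbours in the learned vector space
```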
17.4.1 Email Dataset
The Enron email dataset consists of around 0.5 million emails between the employ‑
ees and managers of the company. This data is needed to be cleaned and processed
for feature extraction and selection.
17.4.4 Model Formation
Several techniques are used for examining the data of emails, including anomaly
detection, social network analysis, email classification, and word cloud formation.
17.5.2 Anomaly Detection
Finding patterns in the data that do not match the anticipated (normal) behavior is
known as anomaly detection. Novelties, noise, outliers, exceptions, and deviations
are other terms for anomalies. The algorithmic flow of the Isolation Forest technique
is demonstrated in Figure 17.2.
Figure 17.9 shows that the top 500 recipients have received most of the
emails in the organization.
17.5.4.2 Tokenization
In this type of text preprocessing, the text is first normalized and converted to
lowercase, punctuation is removed, and finally the text is split into words; the
resulting units are called tokens.
PMI(word1, word2) = log2 [ p(word1, word2) / ( p(word1) × p(word2) ) ] (17.2)
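As a quick worked example with assumed probabilities, if p(word1) = 0.1, p(word2) = 0.2, and p(word1, word2) = 0.05, then PMI(word1, word2) = log2(0.05 / 0.02) = log2(2.5) ≈ 1.32, indicating that the two words co‑occur more often than would be expected by chance.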
Accuracy = (TP + TN) / (TP + TN + FP + FN) (17.3)
Precision = TP / (TP + FP) (17.4)
Recall = TP / (TP + FN) (17.5)
F1 Score = 2TP / (2TP + FP + FN) (17.6)
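For completeness, these four metrics can also be computed with scikit‑learn; the label vectors in the sketch below are assumed values for illustration, not results from this study:

```python
# Sketch: computing accuracy, precision, recall, and F1 from predictions.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # illustrative ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # illustrative model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
```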
17.5.5 Pseudocode
function ClassifyEmailText(t): (T, l)
input: t: string ‑ the email text to be classified.
TABLE 17.1
Evaluation Metrics
Model   Positive (Precision / Recall / F1)   Negative (Precision / Recall / F1)   Accuracy
NB      0.92 / 0.91 / 0.89                   0.87 / 0.93 / 0.91                   0.924
SVM     0.85 / 0.89 / 0.86                   0.98 / 0.86 / 0.87                   0.873
RNN     0.97 / 0.97 / 0.97                   0.97 / 0.97 / 0.97                   0.978
17.7 CONCLUSION
This research suggests different methods for analyzing the email dataset of Enron
company. Techniques like Anomaly Detection, Social Network Analysis, Email
Classification, and Word Cloud generation are used for examining the anoma‑
lies. Enhanced and futuristic technologies like natural language processing and
machine learning are applied to investigate the mails and the social network formed.
Algorithmic techniques like Isolation Forest, NB, SVM, and RNN are used to train
suitable models. The RNN model was the most accurate, reaching 97.8%, based on
the findings. An experiment has been conducted as part of this study to elevate the
model’s analytical outcomes. The goal is to investigate and perform more research
on anomaly detection in line with fraud analysis in various sectors using the hybrid
approach, wherein the dataset may be hosted in a cloud environment like AWS.
As a consequence, all the traits and features are analyzed, tested, and the most accu‑
rate findings are obtained. As a result, this will significantly aid in both the growth
and success of organizations as well as the advancement of business.
REFERENCES
1. N. Shashidhar et al. (2022). Topic modeling in the ENRON dataset. In: Hu B., Xia Y.,
Zhang Y., Zhang L.J. (eds) Big Data‑BigData 2022. BigData 2022. Lecture Notes in
Computer Science, vol. 13730, Springer, Cham. doi:10.1007/978‑3‑031‑23501‑6_4.
2. M. MacDonnell et al. (2022). Exploring social‑emotional learning, school climate, and
social network analysis. Journal of Community Psychology, 51(1), 84–102.
3. R. Benbenishty et al. (2017). A research synthesis of the associations between socioeco‑
nomic background, inequality, school climate, and academic achievement. Review of
Educational Research, 87(2), 425–469.
4. T. Mehta et al. (2022). A comparative study on approaches for text quality predic‑
tion using machine learning and natural language processing. In: 2022 International
Conference on Smart Generation Computing, Communication and Networking
(SMART GENCON), Bangalore, India, 2022, pp. 1–5.
25. K. Rao et al. (2017). Text analysis for author identification using machine learning.
Journal of Emerging Technologies and Innovative Research. 4(6), pp.138–141.
26. T. Mehta et al. (2022). YouTube Ad view sentiment analysis using deep learning and
machine learning. International Journal of Computer Applications 184(11), 10–14.
27. S. D. Kale et al. (2017). A systematic review on author identification methods.
International Journal of Rough Sets and Data Analysis (IJRSDA). 4(2), pp. 81–91.
28. K. Chang et al. (2011). Word cloud model for text categorization. IEEE 11th International
Conference on Data Mining, Vancouver, BC, Canada, 2011, pp. 487–496.
29. S. D. Kale et al. (2018). Author identification using sequential minimal optimization
with rule‑based decision tree on Indian literature in Marathi. Procedia Computer
Science, Elsevier, vol. 35, pp. 1086–1101.
30. F. Chang et al. (2019). Hot topic community discovery on cross social networks. MDPI
Future Internet, 11(3), 60–76.
Index
Adam (Adaptive Moment Estimation) 64 Generative adversarial networks (GANs) 169
aerial surveillance 166 glomerular filtration rate (GFR) 88
Agriculture Data Exchange (ADEx) 173 GoogleNet 134
AlexNet network 103
Alzheimer’s disease (AD) 91 healthy–diseased (HD) 183
anomaly detection 276 hierarchical clustering 248
anonymization 13 Hypertext Markup Language (HTML) 192
artificial neural networks (ANNs) 179
IndCareer.com 191
bag of words (BoW) 218 Indian classical music (ICM) 204
Bidirectional Encoder Representations from International Data Corporation (IDC) 45
Transformers (BERT) 242 Internet of Things (IoT) 160
big medium very big (BMV) 183 isolation forest 273
BLEU (Bilingual Evaluation Understudy) 65
JavaScript 192
Careers360 190 JavaScript Object Notation (JSON) 192
cerebral spinal fluid 134
chronic kidney disease (CKD) 88 K‑means clustering 248
chronic venous insufficiency (CVI) 117
CollegeDekho.com 190 Latent Dirichlet Allocation (LDA) 237
convolution layer 134 Locally Interpretable Model-Agnostic
convolutional neural networks 10 Explanations (LIME) 147
customer satisfaction (CS) 271 long short-term memory (LSTM) 11