1st N.V. Naik, 2nd Dudekula Nasreen, 3rd Somu Chowdeswar Reddy
Department of Computer Science and Engineering
Lakireddy Bali Reddy College of Engineering (Autonomous)
Mylavaram, India
• Appetite loss; nausea; pale stools; swelling (oedema) of the legs; loss of weight; weakness or exhaustion; and yellowing of the skin and eyes.[10]

Fig. 2. Stages of Fatty Liver Disease

Some patients get fatty liver disease completely out of the blue. Its causes include excessive weight gain, type 2 diabetes, insulin resistance, metabolic syndrome, high blood fat levels (particularly triglycerides), adverse effects from several drugs, certain infections such as hepatitis C, and uncommon hereditary diseases.[11]

Risk factors for fatty liver disease include heavy alcohol consumption, exposure to certain toxins, genetics, obesity, obstructive sleep apnea, older age, polycystic ovary syndrome (PCOS), pregnancy, starvation, rare genetic conditions like Wilson disease, hypobetalipoproteinemia, smoking, and use of certain medications like methotrexate (Trexall), tamoxifen (Nolvadex), and amiodarone (Pacerone).[8]

Machine learning (ML) is the process of analysing massive amounts of data to identify patterns that may be used to forecast a variety of outcomes. In a variety of disciplines, machine learning approaches have emerged as a promising tool for prediction and decision-making. A machine learning model would be a great help in recognizing disorders and making sound healthcare decisions in real time. It would also allow earlier identification of individuals with significant risk factors, so that hospital resources can be optimized [21].

In this paper, Naïve Bayes (NB), Support Vector Machine (SVM) and a hybrid of ANN with eXtreme Gradient Boosting (XGBoost) are applied to the dataset, and their accuracy, sensitivity, specificity, duration and AUROC are measured.

II. LITERATURE SURVEY

Pei X et al. proposed a model to predict FLD that can support medical professionals in categorising people who are at high risk of FLD and in making individual diagnoses, decisions about treatment, and plans for FLD prevention. A total of 3,419 participants were chosen, and 845 of them were screened for FLD. Classification models were applied to detect the disease. The models included in this study are LDA, KNN, ANN, LR, RF and XGBoost. The prediction accuracy was measured using AUC, sensitivity, specificity, positive predictive value, and negative predictive value [1]. It demonstrated that machine learning models yield more accurate predictions, with the XGBoost model having the best accuracy.

Weidong Ji et al. created and evaluated machine learning (ML) models that can be used to identify individuals with NAFLD in a large population. This research involved 304,145 individuals who took part in the national physical examination, and their survey results and physical measurement data were used as candidate covariates in the model. The least absolute shrinkage and selection operator (LASSO) was used to select features from the candidate covariates, and the classifier with the highest performance was used to generate an importance score for each covariate in NAFLD. The screening model for NAFLD was then developed using four ML approaches. XGBoost performed best of the four ML algorithms, with BMI, age and waist circumference ranking highest in importance.[2]

Chieh-Chen Wu et al. set out to design a model to predict FLD that would help medical practitioners categorise patients, establish a diagnosis, and treat and prevent FLD. To predict FLD, classification algorithms such as RF, NB, ANN, and LR were developed. The area under the receiver operating characteristic (ROC) curve was used to assess the performance of the four models. The experiment involved 577 individuals, 377 of whom had fatty livers. The random forest model outperformed the others.[3]

Cheng-fu Xu et al. used machine learning techniques to identify the best clinical prediction model for NAFLD. Participants in a health examination at Zhejiang University took part in a cross-sectional study. Questionnaires, lab testing, physical exams, and hepatic ultrasonography were used. The machine learning techniques were then implemented with the free program Weka. The tasks included feature selection and classification. A screening model was created using feature selection approaches that remove unnecessary features. A prediction model was created using classification and assessed using the F-measure. [4] Eleven state-of-the-art machine learning methods were evaluated. 2,522 (24%) of the 10,508 registered participants met the NAFLD diagnostic criteria. Using a variety of statistical tests, the top five risk variables for NAFLD were found to be BMI, triglycerides, gamma-glutamyl transpeptidase (γGT), serum alanine aminotransferase (ALT), and uric acid. [20] The classifiers were evaluated with 10-fold cross-validation. The results revealed that, among the 11 methods tested, the Bayesian network model performed the best. It achieved up to 83% accuracy, 0.878 specificity, 0.675 sensitivity, and 0.655 F-measure. The Bayesian network model improved the F-measure by 9.17% compared with logistic regression.[22]

III. ARCHITECTURE

The architecture of the proposed system is as follows. First, data is collected; next, unnecessary data is removed; then the models are trained and the algorithms are compared on accuracy [1].
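The workflow of the proposed system (collect data, remove unusable records, train, compare accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy records, the function names, and the two stand-in models (a majority-class baseline and a one-feature BMI rule) are hypothetical and do not represent the classifiers evaluated in this work.

```python
import random

# Hypothetical toy records: (BMI, triglycerides in mmol/L, label), label 1 = fatty liver.
# None marks a missing measurement. All values are made up for illustration.
RAW = [
    (31.2, 2.4, 1), (22.5, 1.1, 0), (28.9, None, 1), (35.0, 3.0, 1),
    (24.1, 1.3, 0), (None, 1.9, 0), (29.8, 2.2, 1), (21.7, 0.9, 0),
    (33.4, 2.7, 1), (23.3, 1.0, 0), (27.5, 1.8, 1), (20.9, 1.2, 0),
]


def clean(rows):
    """Remove unusable data: drop records that contain a missing value."""
    return [r for r in rows if None not in r]


def split(rows, test_fraction=0.3, seed=0):
    """Shuffle reproducibly and split into (train, test)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def train_majority(train):
    """Baseline model: always predict the most common training label."""
    ones = sum(label for *_, label in train)
    majority = 1 if 2 * ones >= len(train) else 0
    return lambda features: majority


def train_bmi_threshold(train):
    """One-feature rule: predict 1 when BMI exceeds the midpoint of the class means."""
    pos = [r[0] for r in train if r[2] == 1]
    neg = [r[0] for r in train if r[2] == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda features: 1 if features[0] > threshold else 0


def accuracy(model, rows):
    """Fraction of records whose predicted label matches the true label."""
    return sum(model(r[:2]) == r[2] for r in rows) / len(rows)


def compare(raw=RAW):
    """Run the whole workflow and return the test accuracy of each model."""
    train, test = split(clean(raw))
    models = {
        "majority": train_majority(train),
        "bmi_threshold": train_bmi_threshold(train),
    }
    return {name: accuracy(model, test) for name, model in models.items()}


if __name__ == "__main__":
    for name, acc in compare().items():
        print(f"{name}: {acc:.2f}")
```

The same structure applies when the stand-in models are swapped for real classifiers: only the `train_*` functions change, while cleaning, splitting and scoring stay the same, which keeps the comparison between algorithms fair.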
a) Naïve Bayes: Naïve Bayes is a classifier based on Bayes' theorem. It applies Bayes' theorem to the data set to predict the class of new records. It is easy to use and assumes that the features are independent of one another given the class [14].

$$P(b \mid A) = \frac{P(A \mid b)\, P(b)}{P(A)}$$

where $A = (a_1, a_2, \ldots, a_n)$ is the feature vector, $P(b)$ is the prior probability of class $b$, and $P(b \mid A)$ is the posterior probability of class $b$ given $A$.

b) Random Forest: Random forest is a commonly used machine learning algorithm; one of its uses is handwriting recognition. In this algorithm, the dataset is divided into subsets based on features, and a decision tree is constructed for each subset [17]. The output of the random forest is the majority vote of all the decision trees, so it brings together the results of the various trees.

Metric            Naïve Bayes   Random Forest   Hybrid (ANN+XGBoost)
Accuracy (%)      62.570455     82.424546       86.8836
Sensitivity (%)   62.676507     82.530598       86.9896
Specificity (%)   62.348799     82.202890       86.6619
AUROC             0.788671      0.888671        0.9060

Here the hybrid of ANN with XGBoost gave better results than Naïve Bayes and Random Forest on accuracy, sensitivity, specificity and AUROC.

a) Accuracy: Accuracy is the proportion of a model's predictions that are correct. The hybrid model achieved the highest accuracy in the detection of fatty liver disease.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

where $TP$, $TN$, $FP$ and $FN$ are the numbers of true positives, true negatives, false positives and false negatives.
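The four metrics reported for the models (accuracy, sensitivity, specificity and AUROC) can be computed from a model's predictions as in the following minimal sketch. It uses the standard definitions: sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and AUROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The labels, scores, the 0.5 decision threshold and the function names are hypothetical; they are not the paper's data or code.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, where 1 means disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def summary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auroc(y_true, scores):
    """AUROC via pairwise comparison of positive and negative scores; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

if __name__ == "__main__":
    print(summary_metrics(y_true, y_pred))
    print("AUROC:", auroc(y_true, scores))
```

On this toy input, accuracy, sensitivity and specificity are each 0.75 and the AUROC is 0.875. Accuracy, sensitivity and specificity depend on the chosen threshold, while AUROC is computed from the raw scores and is threshold-free, which is one reason it is often preferred for unbalanced data.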
Fig. 5. Graphical representation of Model Sensitivity

c) Specificity: Specificity measures the ability of an algorithm or model to correctly identify negative cases, that is, the proportion of actual negatives that are predicted as negative. It is also referred to as the true negative rate in the literature.[13]

Fig. 6. Graphical representation of Model Specificity

d) AUROC: The ROC curve is used to visualize the performance of a model, and the AUROC is the area under that curve. For unbalanced data, the AUROC is more revealing than accuracy. It is a widely reported performance statistic that is simple to calculate using multiple software packages.

REFERENCES
[1] Pei X, Deng Q, Liu Z, Yan X, Sun W: Machine Learning Algorithms for Predicting Fatty Liver Disease. Ann Nutr Metab 2021;77:38-45. doi: 10.1159/000513654.
[2] Ji W, Xue M, Zhang Y, Yao H, Wang Y. A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population. Front Public Health. 2022 Apr 4;10:846118. doi: 10.3389/fpubh.2022.846118. PMID: 35444985; PMCID: PMC9013842.
[3] Wu CC, Yeh WC, Hsu WD, Islam MM, Nguyen PAA, Poly TN, Wang YC, Yang HC, Jack Li YC. Prediction of fatty liver disease using machine learning algorithms. Comput Methods Programs Biomed. 2019 Mar;170:23-29. doi: 10.1016/j.cmpb.2018.12.032. Epub 2018 Dec 29. PMID: 30712601.
[4] Han Ma, Cheng-fu Xu, Zhe Shen, Chao-hui Yu, You-ming Li, "Application of Machine Learning Techniques for Clinical Predictive Modeling: A Cross-Sectional Study on Nonalcoholic Fatty Liver Disease in China", BioMed Research International, vol. 2018, Article ID 4304376, 9 pages, 2018. https://doi.org/10.1155/2018/4304376
[5] M. F. Rabbi, S. M. Mahedy Hasan, A. I. Champa, M. AsifZaman and M. K. Hasan, "Prediction of Liver Disorders using Machine Learning Algorithms: A Comparative Study," 2020 2nd International Conference on Advanced Information and Communication Technology (ICAICT), Dhaka, Bangladesh, 2020, pp. 111-116, doi: 10.1109/ICAICT51780.2020.9333528.
[6] C. Anuradha, D. Swapna, B. Thati, V. N. Sree and S. P. Praveen, "Diagnosing for Liver Disease Prediction in Patients Using Combined Machine Learning Models," 2022 4th International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, 2022, pp. 889-896, doi: 10.1109/ICSSIT53264.2022.9716312.
[7] Islam, Md & Wu, Chieh-Chen & Poly, Tahmina & Nguyen, Phung Anh & Yang, Hsuan-Chia & Li, Yu-Chuan. (2019). Prediction of Fatty Liver Disease using Machine Learning Algorithms. Computer methods and programs in biomedicine. 170. 10.1016/j.cmpb.2018.12.032.
[8] Rahman, A. K. M. & Shamrat, F.M. & Tasnim, Zarrin & Roy, Joy & Hossain, Syed. (2019). A Comparative Study On Liver Disease Prediction Using Supervised Machine Learning Algorithms. 8. 419-422.
[9] El-Shafeiy, Engy & El-Desouky, Ali & Elghamrawy, Sally. (2018). Prediction of Liver Diseases Based on Machine Learning Technique for Big Data. 10.1007/978-3-319-74690-6_36.
[10] A.M. Hall and A.L. Smith. (1999), “Feature Selection for Machine Learning: Comparing a Correlation-Based Filter Approach to the Wrapper”, In Proceedings of the Twelfth International Florida Artificial Intelligence Research Society Conference, AAAI Press pp. 235-239.
[11] Torkadi, P.P.; Apte, I.C.; Bhute, A.K. Biochemical evaluation of patients of alcoholic liver disease and non-alcoholic liver disease. Indian J. Clin. Biochem. 2014, 29, 79–83.
[12] Robles-Diaz, M.; Garcia-Cortes, M.; Medina-Caliz, I.; Gonzalez-Jimenez, A.; Gonzalez-Grande, R.; Navarro, J.M.; Castiella, A.; Zapata, E.M.; Romero-Gomez, M.; Blanco, S.; et al. The value of serum aspartate aminotransferase and gamma-glutamyl transpeptidase as biomarkers in hepatotoxicity. Liver Int. 2015, 35, 2474–2482.
[13] Arieira, C.; Monteiro, S.; Xavier, S.; Dias de Castro, F.; Magalhaes, J.; Moreira, M.J.; Marinho, C.; Cotter, J. Hepatic steatosis and patients with inflammatory bowel disease: When transient elastography makes the difference. Eur. J. Gastroenterol. Hepatol. 2019, 31, 998–1003.
[14] M. Ghosh, M. Mohsin Sarker Raihan, M. Raihan, L. Akter, A. Kumar Bairagi et al., "A comparative analysis of machine learning algorithms to predict liver disease," Intelligent Automation & Soft Computing, vol. 30, no. 3, pp. 917–928, 2021.
[15] Ravi Kumar R., Babu Reddy M. and Praveen, 2019 “An evaluation of feature selection algorithms in machine learning”, International journal of scientific & technology research, 8(12) PP. 2071–2074.
[16] Dundar, T.T.; Yurtsever, I.; Pehlivanoglu, M.K.; Yildiz, U.; Eker, A.; Demir, M.A.; Mutluer, A.S.; Tektaş, R.; Kazan, M.S.; Kitis, S.; et al. Machine Learning-Based Surgical Planning for Neurosurgery: Artificial Intelligent Approaches to the Cranium. Front. Surg. 2022, 9, 863633.
[17] Sakatani, K.; Oyama, K.; Hu, L.; Warisawa, S. Estimation of Human Cerebral Atrophy Based on Systemic Metabolic Status Using Machine Learning. Front. Neurol. 2022, 13, 869915.
[18] Yen, H.H.; Wu, P.Y.; Chen, M.F.; Lin, W.C.; Tsai, C.L.; Lin, K.P. Current Status and Future Perspective of Artificial Intelligence in the Management of Peptic Ulcer Bleeding: A Review of Recent Literature. J. Clin. Med. 2021, 10, 3527.
[19] Yen, H.-H.; Wu, P.-Y.; Su, P.-Y.; Yang, C.-W.; Chen, Y.-Y.; Chen, M.-F.; Lin, W.-C.; Tsai, C.-L.; Lin, K.-P. Performance Comparison of the Deep Learning and the Human Endoscopist for Bleeding Peptic Ulcer Disease. J. Med. Biol. Eng. 2021, 41, 504–513.
[20] Yen, H.H.; Su, P.Y.; Zeng, Y.H.; Liu, I.L.; Huang, S.P.; Hsu, Y.C.; Chen, Y.Y.; Yang, C.W.; Wu, S.S.; Chou, K.C. Glecaprevir-pibrentasvir for chronic hepatitis C: Comparing treatment effect in patients with and without end-stage renal disease in a real-world setting. PLoS ONE 2020, 15, e0237582.
[21] Sakatani, K.; Oyama, K.; Hu, L.; Warisawa, S. Estimation of Human Cerebral Atrophy Based on Systemic Metabolic Status Using Machine Learning. Front. Neurol. 2022, 13, 869915.
[22] Demšar, J.; Curk, T.; Erjavec, A.; Gorup, Č.; Hočevar, T.; Milutinovič, M.; Možina, M.; Polajnar, M.; Toplak, M.; Starič, A.; et al. Orange: Data mining toolbox in Python. J. Mach. Learn. Res. 2013, 14, 2349–2353.