The value of machine learning approaches
Abstract
Background The application of machine learning (ML) for identifying early gastric cancer (EGC) has drawn increasing attention. However, evidence-based support for its diagnostic performance is lacking. Hence, this systematic review and meta-analysis was conducted to assess the performance of image-based ML in EGC diagnosis.
Methods We performed a comprehensive electronic search in PubMed, Embase, Cochrane Library, and Web of Science up to September 25, 2022. QUADAS-2 was used to judge the risk of bias of the included articles. We performed the meta-analysis using a bivariate mixed-effects model; sensitivity analyses and heterogeneity tests were also performed.
Results Twenty-one articles were included. The sensitivity (SEN), specificity (SPE), and SROC of ML-based models
were 0.91 (95% CI: 0.87–0.94), 0.85 (95% CI: 0.81–0.89), and 0.94 (95% CI: 0.39–1.00) in the training set and 0.90 (95%
CI: 0.86–0.93), 0.90 (95% CI: 0.86–0.92), and 0.96 (95% CI: 0.19–1.00) in the validation set. The SEN, SPE, and SROC
of EGC diagnosis by non-specialist clinicians were 0.64 (95% CI: 0.56–0.71), 0.84 (95% CI: 0.77–0.89), and 0.80 (95% CI:
0.29–0.97), and those by specialist clinicians were 0.80 (95% CI: 0.74–0.85), 0.88 (95% CI: 0.85–0.91), and 0.91 (95% CI:
0.37–0.99). With the assistance of ML models, the SEN of non-specialist physicians in the diagnosis of EGC was significantly improved (0.76 vs 0.64).
Conclusion ML-based diagnostic models show excellent performance in identifying EGC. With the assistance of ML models, the diagnostic accuracy of non-specialist clinicians can be improved to the level of specialists. These results suggest that ML models can effectively assist less experienced clinicians in diagnosing EGC under endoscopy and have broad clinical application value.
Keywords Machine learning, Gastric cancer, Artificial intelligence, Endoscopy, Neural networks
† Yiheng Shi, Haohan Fan, and Li Li contributed equally to this work and share first authorship.
*Correspondence:
Bei Miao
[email protected]
Sujuan Fei
[email protected]
Full list of author information is available at the end of the article
© The Author(s) 2024. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
Shi et al. World Journal of Surgical Oncology (2024) 22:40 Page 2 of 13
– Only performed analysis for the risk factors, with no ML-based model completely constructed
– Lacked the following outcome measures: sensitivity (SEN), specificity (SPE), receiver operator characteristic curve (ROC), calibration curve, c-index, accuracy, precision rate, recall rate, confusion matrix, diagnostic fourfold table, and F1 score
– Assessed the accuracy using univariate analysis

Search strategy
A comprehensive electronic search was implemented up to September 25, 2022, in PubMed, Embase, Cochrane Library, and Web of Science. The strategy was designed based on Medical Subject Headings (MeSH) and free words. No restrictions were placed on region or language.

Study screening and data extraction
We used EndNote X9 for the management of the retrieved papers. After duplicate-checking, potentially eligible articles were screened by browsing titles and abstracts, and we downloaded the full texts of potentially eligible articles. Studies that met the pre-set eligibility criteria were included after the full texts were read. A pre-designed form was used for data extraction, covering the following: title, author, publication date, nationality, study type, EGC cases, total cases, images of EGC, total images, EGC cases in the training set, total cases in the training set, images of EGC in the training set, total images in the training set, EGC cases in the validation set, total cases in the validation set, images of EGC in the validation set, total images in the validation set, model type, variables for model construction, and comparisons with clinicians. The above processes were completed independently by two reviewers (SYH and MB), and their results were cross-checked. Any disagreements between them were resolved by a third reviewer (FSJ).

Quality assessment
Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) [19] was applied to evaluate the risk of bias. QUADAS-2 covers the following 4 domains: patient selection, index test, reference standard, and flow and timing. Each domain includes several items that can be answered "yes," "no," or "uncertain," corresponding to "low," "high," and "unclear" risk of bias, respectively. If all items in a domain are answered "yes," the domain is graded as "low" risk of bias. If any item in a domain is answered "no," there is potential bias, and the risk should be assessed according to the established guideline. "Unclear" means that the study provides no detailed information, making it difficult for reviewers to assess its risk of bias. The above processes were completed independently by the same two reviewers, and their results were cross-checked. Any disagreements between them were resolved by a third reviewer (FSJ).

Statistical analysis
We used a bivariate mixed-effects model for the meta-analysis. The model takes into account both fixed and random effects and better handles heterogeneity across studies and the correlation between SEN and SPE, making the results more robust and reliable [20, 21]. The numbers of true positive (TP), false positive (FP), true negative (TN), and false negative (FN) cases in the original studies were needed, but several studies reported only the SEN and SPE. In this situation, we used the SEN and SPE in combination with the number of EGC cases and total cases to calculate TP, FP, FN, and TN. Some studies provided only the ROC; in this case, we used Origin to extract the SEN and SPE at the optimal Youden index from the ROC and subsequently calculated TP, FP, TN, and FN. The outcome variables in the bivariate mixed-effects model comprised the SEN and SPE as well as the negative likelihood ratio (NLR), positive likelihood ratio (PLR), diagnostic odds ratio (DOR), and 95% confidence intervals (95% CI). A summary ROC (SROC) was produced, and the area under the curve was computed. Deeks' funnel plot was used to assess publication bias.
Subgroup analysis was performed based on the data sets (training set and validation set) and modeling variables (fixed images and dynamic videos). Moreover, we summarized the results for non-specialist clinicians/specialist clinicians, non-specialist clinicians/specialist clinicians with the assistance of ML, and video validation.
All data analyses were performed in Stata 15.0, and p < 0.05 implied statistical significance.

Results
Study selection
There were 8758 articles retrieved through the literature search, of which 1394 were from PubMed, 3866 from Embase, 138 from Cochrane Library, and 3360 from Web of Science; 4683 ineligible articles were removed due to duplication and other reasons. We screened the remaining 4075 articles by browsing their titles and abstracts, and 39 articles preliminarily met the inclusion criteria. Among these 39 articles, the full text of 1 study could not be obtained, and the full texts of the other 38 were read. After excluding conference summaries, reviews, studies with full texts unavailable, and studies for which the diagnostic performance of the ML models could not be assessed, 21 articles were finally included. The flow diagram of study selection is presented in Fig. 1, and the detailed search strategies are shown in Table S1.
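The two data-recovery steps described in the Statistical analysis section (picking the operating point with the highest Youden index from a published ROC curve, then back-calculating the 2x2 table from SEN, SPE, and case counts) can be sketched as below; the ROC points and case counts are invented for illustration, and the review itself used Origin for the curve extraction.

```python
# Hypothetical sketch of the two data-recovery steps from the Methods.
# Step 1: choose the ROC operating point maximizing Youden's J = SEN + SPE - 1.
# Step 2: reconstruct TP, FP, FN, TN from SEN, SPE, and case counts.

def best_youden(roc_points):
    """roc_points: list of (sensitivity, specificity) pairs digitized from a ROC curve."""
    return max(roc_points, key=lambda p: p[0] + p[1] - 1)

def two_by_two(sen, spe, n_positive, n_total):
    """Back-calculate the diagnostic 2x2 table, rounding to whole cases."""
    n_negative = n_total - n_positive
    tp = round(sen * n_positive)
    fn = n_positive - tp
    tn = round(spe * n_negative)
    fp = n_negative - tn
    return tp, fp, fn, tn

roc = [(0.99, 0.40), (0.91, 0.82), (0.75, 0.95)]   # invented operating points
sen, spe = best_youden(roc)                        # (0.91, 0.82), J = 0.73
print(two_by_two(sen, spe, n_positive=100, n_total=300))   # (91, 36, 9, 164)
```

Rounding to whole cases is what makes the reconstructed table usable as count data in the bivariate model.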
Fig. 2 Risk of bias and clinical applicability assessment of included studies by QUADAS-2
29–114), respectively. No evident publication bias was found (p = 0.51). More details are provided in Supplementary Fig. 1.

Diagnostic performance of ML models in the image validation set
There were 17 studies [13, 22–24, 26, 29–40] that validated the performance of the ML models for diagnosing EGC, and 6 of them [22, 24, 26, 29, 30, 35] included more than 1 set of data. The pooled AUC, SEN, and SPE were 0.96 (95% CI: 0.19–1.00), 0.90 (95% CI: 0.86–0.93), and 0.90 (95% CI: 0.86–0.92) (Fig. 4A, B). The PLR, NLR, and DOR were 8.7 (95% CI: 6.6–11.4), 0.11 (95% CI: 0.08–0.15), and 80 (95% CI: 47–138), respectively. No evident publication bias was noted (p = 0.84). More details are provided in Supplementary Fig. 2.
Fig. 3 Diagnostic performance of the ML models in image training set. A SROC; B forest plot of pooled SEN and SPE
Fig. 4 Diagnostic performance of the ML models in image validation set. A SROC; B forest plot of pooled SEN and SPE
Diagnostic performance of clinicians
We divided the clinicians into specialists and non-specialists according to their working experience and the number of endoscopies performed. There were 72 non-specialist clinicians, and the pooled AUC, SEN, and SPE were 0.80 (95% CI: 0.29–0.97), 0.64 (95% CI: 0.56–0.71), and 0.84 (95% CI: 0.77–0.89) (Fig. 5A, B). The PLR, NLR, and DOR were 4 (95% CI: 2.9–5.3), 0.44 (95% CI: 0.37–0.52), and 9 (95% CI: 6–13), respectively. No evident publication bias was noticed (p = 0.94). There were 76 specialist clinicians, and the pooled AUC, SEN, and SPE were 0.91 (95% CI: 0.37–0.99), 0.80 (95% CI: 0.74–0.85), and 0.88 (95% CI: 0.85–0.91) (Fig. 6A, B). The PLR, NLR, and DOR were 6.7 (95% CI: 5.4–8.4), 0.23 (95% CI: 0.18–0.30), and 29 (95% CI: 21–41), respectively. No evident publication bias existed (p = 0.27). More details are provided in Supplementary Figs. 3 and 4.

Diagnostic performance of clinicians with the assistance of ML models
There were 6 studies [13, 24, 29, 30, 35, 41] reporting the performance of clinicians in diagnosing EGC
Fig. 5 Diagnostic performance of non-specialist clinicians in the diagnosis of EGC through endoscopic images. A SROC; B forest plot of pooled SEN
and SPE
Fig. 6 Diagnostic performance of specialist clinicians in the diagnosis of EGC by endoscopic images. A SROC; B forest plot of pooled SEN and SPE
with the assistance of ML models. We also divided these clinicians into specialist clinicians and non-specialist clinicians; there were 16 specialist clinicians and 12 non-specialist clinicians. With the assistance of the ML models, the pooled AUC, SEN, and SPE of non-specialist clinicians were 0.90 (95% CI: 0.36–0.99), 0.76 (95% CI: 0.68–0.83), and 0.87 (95% CI: 0.83–0.90) (Fig. 7A, B). The PLR, NLR, and DOR were 6 (95% CI: 4.1–8.3), 0.27 (95% CI: 0.19–0.38), and 21 (95% CI: 11–43). No evident publication bias existed (p = 0.10). With the assistance of the ML models, the pooled AUC, SEN, and SPE of specialist clinicians were 0.93 (95% CI: 0.38–1.00), 0.89 (95% CI: 0.82–0.93), and 0.86 (95% CI: 0.81–0.90), respectively (Fig. 8A, B). The PLR, NLR, and DOR were 6 (95% CI: 4.6–8.6), 0.13 (95% CI: 0.08–0.21), and 48 (95% CI: 26–87), respectively. No evident publication bias was noticed (p = 0.22). More details are provided in Supplementary Figs. 5 and 6.
Fig. 7 Diagnostic performance of non-specialist clinicians with assistance of the machine learning models in the diagnosis of EGC by endoscopic
images. A SROC; B forest plot of pooled SEN and SPE
Fig. 8 Diagnostic performance of specialist clinicians with assistance of the machine learning models in the diagnosis of EGC by endoscopic
images. A SROC; B forest plot of pooled SEN and SPE
Diagnostic performance of ML models in the video validation set
There were 4 studies [13, 24, 30, 39] that validated the diagnostic performance of ML models in real-time videos. The pooled AUC, SEN, and SPE were 0.94 (95% CI: 0.39–1.00), 0.91 (95% CI: 0.82–0.96), and 0.86 (95% CI: 0.75–0.93) (Fig. 9A, B). The PLR, NLR, and DOR were 6 (95% CI: 3.5–12.1), 0.11 (95% CI: 0.05–0.22), and 60 (95% CI: 20–176), respectively. No evident publication bias existed (p = 0.08). More details are provided in Supplementary Fig. 7.

Diagnostic performance of clinicians in the video validation set
There were 3 studies [13, 30, 39] that validated the performance of clinicians (n = 20) in the diagnosis of EGC in real-time videos. The pooled AUC, SEN, and SPE were 0.90 (95% CI: 0.58–0.98), 0.83 (95% CI: 0.77–0.88), and
Fig. 9 Performance of ML models in the diagnosis of EGC in video validation set. A SROC; B forest plot of pooled SEN and SPE
0.85 (95% CI: 0.77–0.90) (Fig. 10A, B). The PLR, NLR, and DOR were 5 (95% CI: 3.6–8.2), 0.20 (95% CI: 0.15–0.27), and 27 (95% CI: 17–44), respectively. No evident publication bias was noticed (p = 0.51). More details are provided in Supplementary Fig. 8.

Discussion
In this study, we systematically searched articles regarding the application of ML for the diagnosis of EGC, assessed the application value of image-based ML models for EGC diagnosis, and compared the performance of these models with clinicians of different skill levels. Moreover, we assessed the diagnostic performance of ML models in real-time videos. The analysis revealed that ML models performed better than clinicians (both specialists and non-specialists) in diagnosing EGC from endoscopic images, and that the diagnostic performance of non-specialist clinicians could be improved to the level of specialists with the assistance of ML models. ML models also presented remarkable performance in real-time video diagnosis, with sensitivity and specificity both higher than those of clinicians.
Fig. 10 Performance of clinicians in the diagnosis of EGC in video validation set. A SROC; B forest plot of pooled SEN and SPE
ML is a crucial part of artificial intelligence. It draws on multiple disciplines and learns from large amounts of historical data to construct algorithm models that provide accurate prediction and assessment for new data [42, 43], a process that moves from summarizing experience to flexible use. ML techniques have been extensively employed in screening gastrointestinal malignancies, mainly in assisting endoscopic diagnosis, automatic pathological examination, and tumor invasion depth detection, and have produced the desired results [44]. Bang et al. [45] reviewed the diagnostic performance of endoscopic image-based ML for early esophageal cancer: the AUC, SEN, and SPE were 0.97 (95% CI: 0.95–0.99), 0.94 (95% CI: 0.89–0.96), and 0.88 (95% CI: 0.76–0.94). Jiang et al. [46] included 16 articles and found that the AUC, SEN, and SPE of AI-assisted EGC diagnosis were 0.96 (95% CI: 0.94–0.97), 86% (95% CI: 77–92%), and 93% (95% CI: 89–96%). However, Luo et al. [47] included 15 articles and reported that the pooled AUC, SEN, and SPE of endoscopic image-based AI in the detection of EGC were 0.94, 0.87 (95% CI: 0.87–0.88), and 0.88 (95% CI: 0.87–0.88). Such variance in diagnostic performance across studies indicates significant heterogeneity among different models. ML models can suffer from overfitting or underfitting on specific datasets, which can limit their application and generalization [48, 49]. We therefore strictly differentiated between the results of the training set and the validation set, which helped us analyze whether the ML models were at risk of overfitting or underfitting and reflects, from an evidence-based medicine perspective, whether the goodness-of-fit of existing ML models presents any challenges. Fortunately, our results showed no evidence of overfitting or underfitting. Additionally, validating model performance on different datasets with adequate external validation is necessary to improve a model and increase its reliability and applicability [50].

There is currently a lack of articles comparing the performance of ML-based models with clinicians of different skill levels and with ML-assisted clinicians in EGC diagnosis, as well as studies validating the diagnostic performance of ML models in real-time videos. Our study has filled this gap.

According to our study, the mainstream ML method is the CNN. The CNN is among the most typical DL models and includes multiple algorithm architectures such as VGG, GoogLeNet, ResNet, and DenseNet [51]. It has excellent image recognition and classification ability and has been widely applied in endoscopic image-based diagnosis [27, 52]. Xie et al. [53] reported that the AUC, SEN, and SPE of CNNs in endoscopic image-based GC diagnosis were 0.89, 0.83, and 0.94. Islam et al. [54] reported that the SROC and SEN of the CNN model in EGC diagnosis were 0.95 and 0.89, respectively. Among the articles included, only 2 articles [25, 26] used conventional ML methods (SVM). Miyaki et al. [25] found that the mean SVM output value for cancerous lesions was 0.846 ± 0.220, evidently higher than that for reddened lesions (0.381 ± 0.349) and surrounding tissue (0.219 ± 0.277). Li et al. [26] reported that the SEN, SPE, and accuracy of SVM in diagnosing EGC were all over 90%, indicating good application value. However, conventional ML methods such as SVM have more limitations than DL models: they rely on experienced experts to manually design image features, require multiple calculations to obtain the best cutoff value, and perform poorly on large-scale datasets [44, 55, 56]. All of these problems impede the further development of conventional ML methods.

We observed in this study that ML-based models had a higher diagnostic sensitivity than clinicians. These models showed diagnostic performance as good as that of clinical specialists for both images and videos. With the assistance of ML, the diagnostic sensitivity of non-specialists and specialists for EGC was significantly improved, while no such improvement was observed in specificity, and the specificity of ML-assisted specialists was slightly lower than that of the ML models. This indicates that ML assistance increased the specialists' misdiagnosis rate. Misdiagnosis by ML models during image recognition is often attributed to poor endoscopic image resolution producing an abnormal mucosal background color, which can be induced by residual foam, blood, and food residues at the lesion site, and to confusable tissue structures such as atrophic gastritis, intestinal metaplasia, and ulcers [29, 30]. ML models can also interfere with clinical experts' judgment by presenting them with misidentified information, as reported by Tang et al. [24]. In addition, in video diagnosis, the SROC, SEN, and SPE of ML models for EGC were 0.94 (95% CI: 0.39–1.00), 0.91 (95% CI: 0.82–0.96), and 0.86 (95% CI: 0.75–0.93), greater than those of clinicians, whose SROC, SEN, and SPE were 0.90 (95% CI: 0.58–0.98), 0.83 (95% CI: 0.77–0.88), and 0.85 (95% CI: 0.77–0.90). Comparing the performance of ML models in EGC diagnosis between images and real-time videos, we found that video slightly outperformed image on SEN (0.91 vs. 0.90), while image slightly outperformed video on SROC (0.96 vs. 0.94) and SPE (0.90 vs. 0.86). However, this is not enough to clarify whether ML models perform better on images or on real-time videos, because only 4 papers validated the detection performance of ML models in real-time videos, with a significantly
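The heterogeneity and pooling issues raised in the Discussion can be illustrated with a deliberately simplified univariate sketch: DerSimonian-Laird random-effects pooling of logit-transformed sensitivities. This is not the method of the review, which used a bivariate mixed-effects model [20] that additionally captures the SEN-SPE correlation; the per-study TP/FN counts below are invented.

```python
import math

def pool_logit_sen(studies):
    """studies: list of (TP, FN) pairs. DerSimonian-Laird random-effects
    pooling of logit sensitivities; returns the pooled sensitivity."""
    y, v = [], []
    for tp, fn in studies:
        p = tp / (tp + fn)
        y.append(math.log(p / (1 - p)))   # logit of the study sensitivity
        v.append(1 / tp + 1 / fn)         # approximate variance of the logit
    w = [1 / vi for vi in v]              # fixed-effect (inverse-variance) weights
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)                  # between-study variance
    w_star = [1 / (vi + tau2) for vi in v]                   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    return 1 / (1 + math.exp(-pooled))    # back-transform to a proportion

studies = [(90, 10), (85, 15), (160, 20), (45, 12)]   # invented per-study counts
print(round(pool_logit_sen(studies), 3))
```

The larger the between-study variance tau² becomes, the more the random-effects weights flatten toward equality, so heterogeneous studies pull the pooled estimate away from the purely inverse-variance answer.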
Acknowledgements
We would like to thank the researchers and study participants for their contributions.

Authors' contributions
Conceptualization: YS, YH. Data curation: YS, LL. Formal analysis: YS, SF. Investigation: HF, SF. Methodology: FQ, SF. Project administration: MZ. Software: YS, LL. Supervision: LL, BM. Validation: LL, YH. Visualization: HF, YH. Writing - original draft: YS, LL, YH. Writing - review & editing: YS, BM, SF. All authors contributed to the article and approved the submitted version.

Funding
The authors declare that they did not receive any funding from any source.

Availability of data and materials
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare no competing interests.

Author details
1 Department of Gastroenterology, The Affiliated Hospital of Xuzhou Medical University, 99 West Huaihai Road, Jiangsu Province 221002, Xuzhou, China. 2 First Clinical Medical College, Xuzhou Medical University, Jiangsu Province 221002, Xuzhou, China. 3 Institute of Digestive Diseases, Xuzhou Medical University, 84 West Huaihai Road, Jiangsu Province 221002, Xuzhou, China. 4 Key Laboratory of Gastrointestinal Endoscopy, Xuzhou Medical University, Jiangsu Province 221002, Xuzhou, China. 5 College of Nursing, Yangzhou University, Yangzhou 225009, China.

Received: 14 November 2023 Accepted: 23 January 2024

References
1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–49.
2. Thrift AP, El-Serag HB. Burden of gastric cancer. Clin Gastroenterol Hepatol. 2020;18(3):534–42.
3. Ajani JA, Lee J, Sano T, Janjigian YY, Fan D, Song S. Gastric adenocarcinoma. Nat Rev Dis Primers. 2017;3:17036.
4. Miller KD, Nogueira L, Devasia T, Mariotto AB, Yabroff KR, Jemal A, Kramer J, Siegel RL. Cancer treatment and survivorship statistics. CA Cancer J Clin. 2022;72(5):409–36.
5. Ajani JA, D'Amico TA, Bentrem DJ, Chao J, Cooke D, Corvera C, Das P, Enzinger PC, Enzler T, Fanta P, et al. Gastric cancer, version 2.2022, clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2022;20(2):167–92.
6. Hamashima C, Okamoto M, Shabana M, Osaki Y, Kishimoto T. Sensitivity of endoscopic screening for gastric cancer by the incidence method. Int J Cancer. 2013;133(3):653–9.
7. Telford JJ, Enns RA. Endoscopic missed rates of upper gastrointestinal cancers: parallels with colonoscopy. Am J Gastroenterol. 2010;105(6):1298–300.
8. Veitch AM, Uedo N, Yao K, East JE. Optimizing early upper gastrointestinal cancer detection at endoscopy. Nat Rev Gastroenterol Hepatol. 2015;12(11):660–7.
9. Raftopoulos SC, Segarajasingam DS, Burke V, Ee HC, Yusoff IF. A cohort study of missed and new cancers after esophagogastroduodenoscopy. Am J Gastroenterol. 2010;105(6):1292–7.
10. Rugge M, Genta RM, Di Mario F, El-Omar EM, El-Serag HB, Fassan M, Hunt RH, Kuipers EJ, Malfertheiner P, Sugano K, et al. Gastric cancer as preventable disease. Clin Gastroenterol Hepatol. 2017;15(12):1833–43.
11. Ren W, Yu J, Zhang ZM, Song YK, Li YH, Wang L. Missed diagnosis of early gastric cancer or high-grade intraepithelial neoplasia. World J Gastroenterol. 2013;19(13):2092–6.
12. Pimenta-Melo AR, Monteiro-Soares M, Libânio D, Dinis-Ribeiro M. Missing rate for gastric cancer during upper gastrointestinal endoscopy: a systematic review and meta-analysis. Eur J Gastroenterol Hepatol. 2016;28(9):1041–9.
13. Li J, Zhu Y, Dong Z, He X, Xu M, Liu J, Zhang M, Tao X, Du H, Chen D, et al. Development and validation of a feature extraction-based logical anthropomorphic diagnostic system for early gastric cancer: a case-control study. EClinicalMedicine. 2022;46:101366.
14. van der Sommen F, de Groof J, Struyvenberg M, van der Putten J, Boers T, Fockens K, Schoon EJ, Curvers W, de With P, Mori Y, et al. Machine learning in GI endoscopy: practical guidance in how to interpret a novel field. Gut. 2020;69(11):2035–45.
15. Gottlieb K, Daperno M, Usiskin K, Sands BE, Ahmad H, Howden CW, Karnes W, Oh YS, Modesto I, Marano C, et al. Endoscopy and central reading in inflammatory bowel disease clinical trials: achievements, challenges and future developments. Gut. 2021;70(2):418–26.
16. Rezaeijo SM, Chegeni N, Baghaei Naeini F, Makris D, Bakas S. Within-modality synthesis and novel radiomic evaluation of brain MRI scans. Cancers (Basel). 2023;15(14):3565.
17. Khanfari H, Mehranfar S, Cheki M, Mohammadi Sadr M, Moniri S, Heydarheydari S, Rezaeijo SM. Exploring the efficacy of multi-flavored feature extraction with radiomics and deep features for prostate cancer grading on mpMRI. BMC Med Imaging. 2023;23(1):195.
18. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
19. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MM, Sterne JA, Bossuyt PM. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36.
20. Chu H, Cole SR. Bivariate meta-analysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. J Clin Epidemiol. 2006;59(12):1331–2 (author reply 1332–1333).
21. McDowell M, Jacobs P. Meta-analysis of the effect of natural frequencies on Bayesian reasoning. Psychol Bull. 2017;143(12):1273–312.
22. Yao Z, Jin T, Mao B, Lu B, Zhang Y, Li S, Chen W. Construction and multi-center diagnostic verification of intelligent recognition system for endoscopic images from early gastric cancer based on YOLO-V3 algorithm. Front Oncol. 2022;12:815951.
23. Ueyama H, Kato Y, Akazawa Y, Yatagai N, Komori H, Takeda T, Matsumoto K, Ueda K, Matsumoto K, Hojo M, et al. Application of artificial intelligence using a convolutional neural network for diagnosis of early gastric cancer based on magnifying endoscopy with narrow-band imaging. J Gastroenterol Hepatol. 2021;36(2):482–9.
24. Tang D, Ni M, Zheng C, Ding X, Zhang N, Yang T, Zhan Q, Fu Y, Liu W, Zhuang D, et al. A deep learning-based model improves diagnosis of early gastric cancer under narrow band imaging endoscopy. Surg Endosc. 2022;36(10):7800–10.
25. Miyaki R, Yoshida S, Tanaka S, Kominami Y, Sanomura Y, Matsuo T, Oka S, Raytchev B, Tamaki T, Koide T, et al. A computer system to be used with laser-based endoscopy for quantitative diagnosis of early gastric cancer. J Clin Gastroenterol. 2015;49(2):108–15.
26. Li Y, Xie X, Yang X, Guo L, Liu Z, Zhao X, Luo Y, Jia W, Huang F, Zhu S, et al. Diagnosis of early gastric cancer based on fluorescence hyperspectral imaging technology combined with partial-least-square discriminant analysis and support vector machine. J Biophotonics. 2019;12(5):e201800324.
27. Li L, Chen Y, Shen Z, Zhang X, Sang J, Ding Y, Yang X, Li J, Chen M, Jin C, et al. Convolutional neural network for the diagnosis of early gastric cancer based on magnifying narrow band imaging. Gastric Cancer. 2020;23(1):126–32.
28. Jin T, Jiang Y, Mao B, Wang X, Lu B, Qian J, Zhou H, Ma T, Zhang Y, Li S, et al. Multi-center verification of the influence of data ratio of training sets on test results of an AI system for detecting early gastric cancer based on the YOLO-v4 algorithm. Front Oncol. 2022;12:953090.
29. Hu H, Gong L, Dong D, Zhu L, Wang M, He J, Shu L, Cai Y, Cai S, Su W, et al. Identifying early gastric cancer under magnifying narrow-band images with deep learning: a multicenter study. Gastrointest Endosc. 2021;93(6):1333–1341.e1333.
30. He X, Wu L, Yu H. Real-time use of artificial intelligence for diagnosing early gastric cancer by endoscopy: a multicenter, diagnostic study. United European Gastroenterol J. 2021;9(Suppl 8):777.
31. Zhou B, Rao X, Xing H, Ma Y, Wang F, Rong L. A convolutional neural network-based system for detecting early gastric cancer in white-light endoscopy. Scand J Gastroenterol. 2022;58(2):157–62.
32. Zhang LM, Zhang Y, Wang L, Wang JY, Liu YL. Diagnosis of gastric lesions through a deep convolutional neural network. Dig Endosc. 2021;33(5):788–96.
33. Wu L, Zhou W, Wan X, Zhang J, Shen L, Hu S, Ding Q, Mu G, Yin A, Huang X, et al. A deep neural network improves endoscopic detection of early gastric cancer without blind spots. Endoscopy. 2019;51(6):522–31.
34. Wu L, He X, Liu M, Xie H, An P, Zhang J, Zhang H, Ai Y, Tong Q, Guo M, et al. Evaluation of the effects of an artificial intelligence system on endoscopy quality and preliminary testing of its performance in detecting early gastric cancer: a randomized controlled trial. Endoscopy. 2021;53(12):1199–207.
35. Tang D, Wang L, Ling T, Lv Y, Ni M, Zhan Q, Fu Y, Zhuang D, Guo H, Dou X, et al. Development and validation of a real-time artificial intelligence-assisted system for detecting early gastric cancer: a multicentre retrospective diagnostic study. EBioMedicine. 2020;62:103146.
36. Noda H, Kaise M, Higuchi K, Koizumi E, Yoshikata K, Habu T, Kirita K, Onda T, Omori J, Akimoto T, et al. Convolutional neural network-based system for endocytoscopic diagnosis of early gastric cancer. BMC Gastroenterol. 2022;22(1):237.
37. Kanesaka T, Lee TC, Uedo N, Lin KP, Chen HZ, Lee JY, Wang HP, Chang HT. Computer-aided diagnosis for identifying and delineating early gastric cancers in magnifying narrow-band imaging. Gastrointest Endosc. 2018;87(5):1339–44.
38. Ikenoyama Y, Hirasawa T, Ishioka M, Namikawa K, Yoshimizu S, Horiuchi Y, Ishiyama A, Yoshio T, Tsuchida T, Takeuchi Y, et al. Detecting early gastric cancer: comparison between the diagnostic ability of convolutional neural networks and endoscopists. Dig Endosc. 2021;33(1):141–50.
39. Horiuchi Y, Hirasawa T, Ishizuka N, Tokai Y, Namikawa K, Yoshimizu S, Ishiyama A, Yoshio T, Tsuchida T, Fujisaki J, et al. Performance of a computer-aided diagnosis system in diagnosing early gastric cancer using magnifying endoscopy videos with narrow-band imaging (with videos). Gastrointest Endosc. 2020;92(4):856–65 (e851).
40. Horiuchi Y, Aoyama K, Tokai Y, Hirasawa T, Yoshimizu S, Ishiyama A, Yoshio T, Tsuchida T, Fujisaki J, Tada T. Convolutional neural network for differentiating gastric cancer from gastritis using magnified endoscopy with narrow band imaging. Dig Dis Sci. 2020;65(5):1355–63.
41. Gong L, Wang M, Shu L, He J, Qin B, Xu J, Su W, Dong D, Hu H, Tian J, et al. Automatic captioning of early gastric cancer via magnification endoscopy with narrow band imaging. Gastrointest Endosc. 2022;96(6):929–942.e6.
42. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–30.
43. Handelman GS, Kok HK, Chandra RV, Razavi AH, Lee MJ, Asadi H. eDoctor: machine learning and the future of medicine. J Intern Med. 2018;284(6):603–19.
44. Cao R, Tang L, Fang M, Zhong L, Wang S, Gong L, Li J, Dong D, Tian J, et al. Artificial intelligence in gastric cancer: applications and challenges. Gastroenterol Rep. 2022;10:goac064.
45. Bang CS, Lee JJ, Baik GH. Computer-aided diagnosis of esophageal cancer and neoplasms in endoscopic images: a systematic review and meta-analysis of diagnostic test accuracy. Gastrointest Endosc. 2021;93(5):1006–1015 (e1013).
46. Jiang K, Jiang X, Pan J, Wen Y, Huang Y, Weng S, Lan S, Nie K, Zheng Z, Ji S, et al. Current evidence and future perspective of accuracy of artificial intelligence application for early gastric cancer diagnosis with endoscopy: a systematic and meta-analysis. Front Med. 2021;8:629080.
47. Luo D, Kuang F, Du J, Zhou M, Liu X, Luo X, Tang Y, Li B, Su S. Artificial intelligence-assisted endoscopic diagnosis of early upper gastrointestinal cancer: a systematic review and meta-analysis. Front Oncol. 2022;12:855175.
48. Charilaou P, Battat R. Machine learning models and over-fitting considerations. World J Gastroenterol. 2022;28(5):605–7.
49. Hosseinzadeh M, Gorji A, Fathi Jouzdani A, Rezaeijo SM, Rahmim A, Salmanpour MR. Prediction of cognitive decline in Parkinson's disease using clinical and DAT SPECT imaging features, and hybrid machine learning systems. Diagnostics (Basel). 2023;13(10):1691.
50. Heydarheydari S, Birgani MJT, Rezaeijo SM. Auto-segmentation of head and neck tumors in positron emission tomography images using non-local means and morphological frameworks. Pol J Radiol. 2023;88:e365–70.
51. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
52. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak J, van Ginneken B, Sánchez CI. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88.
53. Xie F, Zhang K, Li F, Ma G, Ni Y, Zhang W, Wang J, Li Y. Diagnostic accuracy of convolutional neural network-based endoscopic image analysis in diagnosing gastric cancer and predicting its invasion depth: a systematic review and meta-analysis. Gastrointest Endosc. 2022;95(4):599–609 (e597).
54. Islam MM, Poly TN, Walther BA, Lin MC, Li YJ. Artificial intelligence in gastric cancer: identifying gastric cancer using endoscopic images with convolutional neural network. Cancers (Basel). 2021;13(21):5253.
55. Zhou S. Sparse SVM for sufficient data reduction. IEEE Trans Pattern Anal Mach Intell. 2022;44(9):5560–71.
56. Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37(2):505–15.
57. Chen S, Lu S, Tang Y, Wang D, Sun X, Yi J, Liu B, Cao Y, Chen Y, Liu X. A machine learning-based system for real-time polyp detection (DeFrame): a retrospective study. Front Med (Lausanne). 2022;9:852553.
58. Gong EJ, Bang CS, Lee JJ, Baik GH, Lim H, Jeong JH, Choi SW, Cho J, Kim DY, Lee KB, et al. Deep learning-based clinical decision support system for gastric neoplasms in real-time endoscopy: development and validation study. Endoscopy. 2023;55(8):701–8.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.