The efficacy of artificial intelligence in diabetic retinopathy screening: a systematic review and meta-analysis

Abstract

Background

To evaluate the efficacy of artificial intelligence (AI) in screening for diabetic retinopathy (DR) using fundus images and optical coherence tomography (OCT) in comparison to traditional screening methods.

Methods

This systematic review was registered with PROSPERO (ID: CRD42024560750). Systematic searches were conducted in PubMed Medline, Cochrane Central, ScienceDirect, and Web of Science using keywords such as “diabetic retinopathy,” “screening,” and “artificial intelligence.” Only studies published in English from 2019 to July 22, 2024, were considered. We also manually reviewed the reference lists of relevant reviews. Two independent reviewers assessed the risk of bias using the QUADAS-2 tool, resolving disagreements through discussion with the principal investigator. Meta-analysis was performed using MetaDiSc software (version 1.4). Pooled sensitivity and specificity were calculated, summary receiver operating characteristic (SROC) plots and forest plots were generated, and subgroup analyses were performed according to clinician type (ophthalmologists vs. retina specialists) and imaging modality (fundus images vs. fundus images + OCT).

Results

Eighteen studies were included. Meta-analysis showed that AI systems demonstrated superior diagnostic performance compared to doctors, with a pooled sensitivity, specificity, Q index, and diagnostic odds ratio of 0.877, 0.906, 0.94, and 153.79, respectively. The Fagan nomogram analysis further confirmed the strong diagnostic value of AI. Subgroup analyses revealed that factors such as imaging modality and doctor expertise can influence diagnostic performance.

Conclusion

AI systems have demonstrated strong diagnostic performance in detecting diabetic retinopathy, with sensitivity and specificity comparable to or exceeding traditional clinicians.

Introduction

Diabetes is a major global health issue, affecting an estimated 463 million people worldwide. This number is projected to increase to 700 million by 2045 [1]. Diabetic retinopathy (DR) is a leading cause of vision loss globally, affecting millions of people with diabetes. Early detection and timely intervention are crucial to prevent vision loss [2]. Clinically, DR is defined as a microvascular condition that affects the capillaries of the retina, causing damage and secondary visual impairment. The underlying mechanisms involve long-standing hyperglycemia and its sequelae [2, 3]. Among diabetic patients, the global prevalence of DR was 22.27% in 2020, and the number of people with DR was estimated to be 103.12 million worldwide [1]. Traditional screening methods often rely on manual examination by ophthalmologists, which can be time-consuming, resource-intensive, and subject to human error.

Artificial intelligence (AI) has emerged as a promising tool for automating DR screening, offering potential improvements in efficiency, accuracy, and accessibility [4]. Recent advancements in computing power have made deep learning the leading AI technique for DR screening. Many deep learning models have outperformed traditional feature-based machine learning methods [5]. This systematic review and meta-analysis aimed to evaluate the efficacy of AI-based screening for DR using fundus images and optical coherence tomography (OCT) in comparison to traditional methods. By synthesizing the existing evidence, this study seeks to inform healthcare decision-makers about the potential benefits and drawbacks of AI-assisted DR screening and guide future research efforts.

Materials and methods

Search strategy

This systematic review was registered with PROSPERO (ID: CRD42024560750). We conducted a systematic review and meta-analysis to determine the efficacy of artificial intelligence in the screening of diabetic retinopathy. A systematic search of PubMed Medline, Cochrane Central, ScienceDirect, and Web of Science was conducted to identify studies on DR and AI. We used a combination of keywords and Medical Subject Headings (MeSH) terms, including “diabetic retinopathy,” “screening,” “artificial intelligence,” “deep learning,” “machine learning,” and “computer-aided diagnosis.” The search was conducted across all fields, including the title, abstract, and MeSH terms, as outlined in Supplementary Table 1. We included publications in English published from 2019 to July 22, 2024. We also reviewed the reference lists of relevant articles, such as previous country- or region-based systematic reviews and meta-analyses of DR screening using AI. This strategy identified all articles used in previous reviews. No informed consent was required because of the retrospective nature of the study.

Study selection and eligibility criteria

Studies were included if they met the following criteria: (1) diagnostic accuracy studies; (2) a clear definition of the random sampling procedure; (3) a response rate above 60%, to ensure sufficient representation and minimize selection bias; (4) participants ≥ 18 years old; (5) a known diagnosis of type 1 or type 2 diabetes mellitus; and (6) diabetic retinopathy of any stage (mild, moderate, or severe). We excluded studies that (1) were duplicates; (2) did not have full-text articles; (3) included patients with other retinal diseases; (4) included patients with persistent retinal impairments other than diabetic retinopathy in one or both eyes; (5) included patients with previous retinal surgeries or interventions; or (6) included patients with a contraindication to fundus photography. Four reviewers independently screened titles, abstracts, and full texts for relevance and performed data extraction using a predefined data collection sheet. Any disagreements were addressed through discussion with the principal investigator.

Data extraction

After obtaining the complete articles, four reviewers—MS, WA, HA, and WH—independently analyzed the features of the included studies and extracted findings related to the diagnostic effectiveness of AI from each study. Any discrepancies between the reviewers were first discussed among the four to reach a consensus. If disagreements persisted, they were resolved through discussion with a fifth investigator, AA, who acted as an arbitrator to ensure consistency and accuracy in data extraction. The reviewers gathered key indicators, including sensitivity (SE), specificity (SP), the number of patients with diabetic retinopathy (DR), and the overall number of participants from the studies. These indicators were then used to calculate the outcome variables for the diagnostic meta-analysis, which included true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The results were compiled into contingency tables for use in the meta-analysis. If a study presented different types of DR or employed various algorithms, leading to multiple contingency tables, we considered these as independent entities.
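The back-calculation of a contingency table from reported summary figures can be sketched as follows. This is a minimal illustration rather than the reviewers' actual procedure: the names `ContingencyTable` and `reconstruct_table`, the rounding of fractional counts to the nearest integer, and the example study are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ContingencyTable:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def reconstruct_table(sensitivity, specificity, n_diseased, n_total):
    """Rebuild a 2x2 table from SE, SP, the number with DR, and the total."""
    if not (0 <= sensitivity <= 1 and 0 <= specificity <= 1):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")
    if not (0 <= n_diseased <= n_total):
        raise ValueError("n_diseased must be between 0 and n_total")
    n_healthy = n_total - n_diseased
    # Assumption: fractional counts are rounded to the nearest integer.
    tp = round(sensitivity * n_diseased)
    tn = round(specificity * n_healthy)
    # The remaining two cells follow from the row totals.
    fn = n_diseased - tp
    fp = n_healthy - tn
    return ContingencyTable(tp=tp, fp=fp, fn=fn, tn=tn)


# Hypothetical study: SE 0.90, SP 0.80, 100 of 500 participants with DR.
table = reconstruct_table(0.90, 0.80, n_diseased=100, n_total=500)
print(table)
```

Because rounding can shift a cell by one, the sensitivity and specificity recomputed from the rebuilt table may differ slightly from the published values.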

Quality assessment

To evaluate the quality of the included studies, two reviewers (WH and WA) used the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool and RevMan 5.4.1. The QUADAS-2 framework consists of four components for assessing the risk of bias: patient selection, index test, reference standard, and flow and timing. Each component contains two or three pertinent questions. The response to each question was Yes, No, or Unclear; the latter was used only when there was insufficient information to judge. A component was deemed to be at low risk if all responses were “Yes” and at high risk if the response to any of the pertinent questions was “No.” Furthermore, the components of patient selection, index test, and reference standard were also assessed for clinical applicability. A rating of “low risk” for these components indicates that the studies included in the review are less likely to be biased.
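The rating rule described above can be written as a short function. This is only a sketch: the function name `domain_risk` is hypothetical, and the "unclear" outcome for a mix of "Yes" and "Unclear" answers is an assumption, since the text states only the low-risk and high-risk rules.

```python
def domain_risk(answers):
    """Rate one QUADAS-2 domain from its signalling-question answers."""
    normalized = [answer.strip().lower() for answer in answers]
    if not normalized:
        raise ValueError("at least one answer is required")
    allowed = {"yes", "no", "unclear"}
    if any(answer not in allowed for answer in normalized):
        raise ValueError("answers must be Yes, No, or Unclear")
    if "no" in normalized:
        return "high"      # any "No" puts the domain at high risk
    if all(answer == "yes" for answer in normalized):
        return "low"       # all "Yes" means low risk
    return "unclear"       # assumption: remaining mixes are rated unclear


print(domain_risk(["Yes", "Yes", "Yes"]))
print(domain_risk(["Yes", "No", "Unclear"]))
print(domain_risk(["Yes", "Unclear"]))
```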

Data synthesis and analysis

We used MetaDiSc software (version 1.4) and employed a bivariate random-effects model to analyze the outcome variables (true positives, false positives, false negatives, and true negatives). This model was chosen because it accounts for both within- and between-study variability, making it suitable for diagnostic accuracy studies that involve heterogeneous data sources. Unlike univariate models, the bivariate random-effects approach considers the correlation between sensitivity and specificity, allowing for more precise pooled estimates despite variations in study design, AI algorithms, and patient populations. The results were presented using summary receiver operating characteristic (SROC) plots, along with forest plots and a Fagan nomogram. Furthermore, the bivariate random-effects model was used to determine the pooled sensitivity, specificity, area under the curve (AUC), diagnostic odds ratio (DOR), adjusted diagnostic odds ratio (ADOR), and positive and negative likelihood ratios (LR+ and LR-, respectively). In addition, we calculated the Q index for both the AI screening method and doctor practices to determine whether AI enhances diagnostic accuracy.
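For readers unfamiliar with these measures, the sketch below computes them for a single 2x2 table using their standard definitions. It does not reproduce the pooled random-effects estimates, which need not equal the values obtained by plugging pooled sensitivity and specificity into these formulas. The function name and the example counts are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return SE, SP, LR+, LR-, and DOR for a single 2x2 table."""
    if min(tp, fp, fn, tn) <= 0:
        # Tables with empty cells need a continuity correction first,
        # which this sketch does not handle.
        raise ValueError("all four cells must be positive")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    dor = (tp * tn) / (fp * fn)  # algebraically equal to LR+ / LR-
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "dor": dor,
    }


# Hypothetical counts, not taken from any included study.
metrics = diagnostic_metrics(tp=90, fp=80, fn=10, tn=320)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```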

The Fagan nomogram was analyzed along with the positive and negative post-test probabilities to show how AI test results modify the probability that DR is present. This visualization helps convey the effectiveness of AI in altering diagnostic probabilities compared to traditional methods. Additionally, subgroup analyses were performed according to clinician type (ophthalmologists vs. retina specialists) and imaging modality (fundus images vs. fundus images + OCT). These subgroups were chosen because imaging modality can influence AI diagnostic performance through variations in image resolution and the depth of structural detail captured. Moreover, AI performance may vary depending on whether it is compared to general ophthalmologists or retina specialists, as the latter have more specialized expertise in diabetic retinopathy diagnosis. Understanding these differences provides valuable insights into the real-world applicability of AI screening.
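A Fagan nomogram is a graphical form of Bayes' theorem on the odds scale: post-test odds equal pre-test odds multiplied by the likelihood ratio. The sketch below shows that calculation. The function name, the pre-test probability, and the likelihood ratios in the example are illustrative assumptions and are not the values used in this review.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a disease probability with a likelihood ratio (Bayes on odds)."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical inputs: 50% pre-test probability, LR+ of 9, LR- of 0.25.
after_positive = post_test_probability(0.5, 9.0)
after_negative = post_test_probability(0.5, 0.25)
print(f"after a positive result: {after_positive:.2f}")
print(f"after a negative result: {after_negative:.2f}")
```

A likelihood ratio of 1 leaves the probability unchanged, which is why tests with LR+ far above 1 and LR- far below 1 are the most informative.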

Results

Selection and characteristics of eligible studies

A flowchart of the literature search and study selection process is presented in Fig. 1. The systematic search identified a total of 2065 records from databases and 26 from registers. After 910 duplicate records were removed, 722 records were marked as ineligible by automation tools and an additional 333 records were removed for other reasons, resulting in 126 records being screened. Of these, 56 records were excluded. Subsequently, 67 reports were assessed for eligibility; 40 of these were retrospective studies and were excluded, and 18 studies were ultimately included in the review.

The eligible studies exhibited diverse characteristics, contributing to a comprehensive understanding of the research question. The 18 studies comprised randomized controlled trials, cross-sectional studies, and prospective studies. They were conducted in various settings, including the general population, hospitals, and community clinics, across multiple countries such as the United States of America and China. Two of the studies used OCT for imaging, while 16 studies captured the retina with a fundus camera only. The included studies assessed DR and macular edema (ME) in the screening process, and 9 of them assessed DR only. We examined the diagnostic AI used by each study for detecting DR and ME, as well as the quality of the images, the geographical areas where the studies were conducted, and the sample sizes involved (Table 1).

Fig. 1 Flow diagram of literature selection

Table 1 Summary of the data obtained from the included studies

Quality assessment

Figures 2 and 3 show the summary chart and bar chart for quality assessment of the included studies. Four studies (Baget-Bernaldiz et al., 2021; Li et al., 2021; Nunez do Rio et al., 2021; and Scheetz et al., 2021) had a high risk of bias for patient selection. One study, Wongchaisuwat et al., 2020, had an unclear risk of bias for the index test. One study, Scheetz et al., 2021, reported unclear information on the establishment and blinding of reference grading. Thirteen of the 18 studies performed poorly in the flow and timing evaluation. All studies showed low concerns regarding the applicability of patient selection, index test, and reference grading.

Fig. 2 QUADAS-2 risk of bias and applicability concerns summary plot

Fig. 3 QUADAS-2 risk of bias and applicability concerns bar plot

Threshold analysis and heterogeneity test

To investigate the presence of a threshold effect, we calculated Spearman’s correlation coefficient between sensitivity and specificity. The correlation coefficient was −0.45, suggesting a moderate threshold effect across the included studies. An asymmetry was also observed in the summary ROC curve, which indicates that different studies may have applied varying diagnostic thresholds, potentially affecting the balance between sensitivity and specificity. We assessed heterogeneity across the included studies using the I² statistic and Cochran’s Q test. Cochran’s Q test was significant for both sensitivity and specificity (p < 0.05). The SE, SP, LR+, LR-, and DOR are shown in Table 2. Differences in study design, patient populations, and diagnostic thresholds likely contribute to the heterogeneity observed in the sensitivity and specificity estimates (see Fig. 4).
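The threshold-effect check can be illustrated with a small, dependency-free Spearman correlation. This is a sketch under stated assumptions: the helper names are hypothetical, tied values receive average ranks, and the per-study sensitivities and specificities below are invented for illustration. A strongly negative coefficient means that studies with higher sensitivity tend to have lower specificity, which is the pattern a varying threshold produces.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises if either input has zero variance."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    covariance = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    x_var = sum((a - x_mean) ** 2 for a in x)
    y_var = sum((b - y_mean) ** 2 for b in y)
    if x_var == 0 or y_var == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(x_var * y_var)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least two")
    return pearson(average_ranks(x), average_ranks(y))


# Invented per-study values showing a perfect sensitivity/specificity trade-off.
sensitivities = [0.95, 0.90, 0.85, 0.80]
specificities = [0.80, 0.85, 0.90, 0.95]
rho = spearman(sensitivities, specificities)
print(f"Spearman rho = {rho:.2f}")
```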

Fig. 4 Results of meta-analysis and forest plots of artificial intelligence devices. (A) Forest plot of pooled Se. (B) Forest plot of pooled Sp. (C) Summary receiver operating characteristics (SROC) plot

Table 2 The combined predictive value of all included studies

Synthesis of results

The included data on AI systems and doctors in screening for diabetic retinopathy were analyzed using MetaDiSc software (version 1.4). The AI-based screening systems demonstrated high diagnostic accuracy, with a pooled sensitivity and specificity of 0.877 (95% CI: 0.870–0.884) and 0.906 (95% CI: 0.904–0.908), respectively. Additional diagnostic performance metrics, including the diagnostic odds ratio (DOR) and likelihood ratios (LR+ and LR-), are summarized in Table 2.

For doctors, the pooled sensitivity and specificity were 0.751 (95% CI: 0.736–0.766) and 0.941 (95% CI: 0.936–0.946), respectively. Additional performance metrics, including the DOR and likelihood ratios (LR+ and LR-), are summarized in Table 2.

The Fagan nomogram analysis demonstrated the clinical utility of AI-based DR screening. If a patient tests positive for DR using AI, the post-test probability of truly having the disease increases to 84.92%, indicating that AI effectively enhances disease detection. Conversely, a negative AI result reduces the probability of disease to just 3.56%, highlighting AI’s potential to rule out DR with confidence. These findings reinforce the role of AI in triaging patients, allowing ophthalmologists to focus on high-risk cases requiring further evaluation (Fig. 5). Subgroup analyses based on imaging modality and doctor expertise revealed further insights into diagnostic performance. These results are shown in Table 3.

Fig. 5 Fagan nomogram of artificial intelligence (AI) for the diagnosis of diabetic retinopathy (DR)

Table 3 Results of subgroup analysis

Meta regression and sensitivity analysis

To explore the sources of heterogeneity, we performed a meta-regression analysis using the MetaDiSc software (version 1.4), evaluating the potential influence of covariates such as imaging modality (fundus vs. OCT) and doctor expertise (retina specialist vs. ophthalmologist). The analysis revealed that both the doctor and AI datasets exhibited high levels of heterogeneity, with I² values exceeding 98% for doctors and 99% for AI systems. This heterogeneity suggests substantial variability in study methodologies, including differences in AI model architecture, image quality, patient populations, and diagnostic thresholds used to classify diabetic retinopathy (see Fig. 6).
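For reference, I² is conventionally derived from Cochran's Q and its degrees of freedom (the number of studies minus one) and expresses the share of total variation attributed to between-study differences rather than chance. The sketch below uses that standard definition. The function name and the example Q values are hypothetical and are not the statistics reported in this review.

```python
def i_squared(cochran_q, degrees_of_freedom):
    """I² as a percentage: max(0, (Q - df) / Q) * 100."""
    if cochran_q <= 0 or degrees_of_freedom < 1:
        raise ValueError("Q must be positive and df at least 1")
    return max(0.0, (cochran_q - degrees_of_freedom) / cochran_q) * 100.0


# Hypothetical values: 21 studies give 20 degrees of freedom.
print(f"Q = 2000: I² = {i_squared(2000.0, 20):.1f}%")
print(f"Q = 100:  I² = {i_squared(100.0, 20):.1f}%")
print(f"Q = 15:   I² = {i_squared(15.0, 20):.1f}%")
```

The first case shows why very high I² values arise easily in large diagnostic datasets: when individual studies are very precise, even modest differences between them inflate Q far beyond its degrees of freedom.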

Fig. 6 Results of meta-analysis and forest plots of doctors. (A) Forest plot of pooled Se. (B) Forest plot of pooled Sp. (C) Summary receiver operating characteristics (SROC) plot

High heterogeneity affects the interpretation of pooled results by potentially exaggerating or underestimating AI’s diagnostic performance in different settings. One key factor contributing to heterogeneity is the variation in AI training datasets—models trained on diverse populations may generalize better than those trained on homogeneous datasets. Additionally, differences in study inclusion criteria (e.g., DR severity grading) and reference standards for diagnosis could influence pooled estimates.

Despite this heterogeneity, sensitivity analyses confirmed the robustness of the results. The overall pooled estimates remained significant across different subgroup exclusions, indicating that AI consistently demonstrated strong diagnostic performance across multiple study conditions.

Publication bias

Publication bias was assessed through visual inspection of a funnel plot and statistical tests, including Egger’s test. The funnel plot (Fig. 7) demonstrated asymmetry, suggesting the presence of potential publication bias. Specifically, there was a noticeable lack of studies on the left side of the funnel, indicating that smaller studies with non-significant or smaller effect sizes may be underrepresented in the analysis. This was further supported by Egger’s test, which produced a statistically significant result (t = 2.1400, p = 0.0472), confirming asymmetry and the likelihood of publication bias. A trim-and-fill plot (Fig. 8) was then made to adjust for the publication bias.
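Egger's test regresses each study's standardized effect (effect divided by its standard error) on its precision (one divided by the standard error) and asks whether the intercept differs from zero. The sketch below implements that regression with ordinary least squares. The function name and the input data are hypothetical, and treating the effect as a log diagnostic odds ratio is an assumption, since the text does not state which effect measure was used. The p-value would come from a t distribution with n − 2 degrees of freedom and is left out to keep the example free of external dependencies.

```python
import math


def egger_test(effects, standard_errors):
    """Return (intercept, intercept_se, t_statistic, degrees_of_freedom)."""
    n = len(effects)
    if n != len(standard_errors) or n < 3:
        raise ValueError("need at least three studies with matching lengths")
    precision = [1.0 / se for se in standard_errors]
    standardized = [e / se for e, se in zip(effects, standard_errors)]
    x_mean = sum(precision) / n
    y_mean = sum(standardized) / n
    sxx = sum((x - x_mean) ** 2 for x in precision)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(precision, standardized))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance of the fitted line, with n - 2 degrees of freedom.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(precision, standardized))
    residual_variance = rss / (n - 2)
    intercept_se = math.sqrt(residual_variance * (1.0 / n + x_mean ** 2 / sxx))
    return intercept, intercept_se, intercept / intercept_se, n - 2


# Invented data in which less precise studies report larger effects.
effects = [1.21, 1.38, 1.525, 1.76, 2.0]      # assumed to be log DORs
standard_errors = [0.1, 0.2, 0.25, 0.4, 0.5]
intercept, intercept_se, t_statistic, df = egger_test(effects, standard_errors)
print(f"intercept = {intercept:.3f}, t = {t_statistic:.2f}, df = {df}")
```

An intercept well away from zero, as in this invented dataset, corresponds to the funnel plot asymmetry described above.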

Fig. 7 Funnel plot

Fig. 8 Trim-and-fill funnel plot

Discussion

In recent years, artificial intelligence has revolutionized many fields, including healthcare. AI can be used as a screening tool to aid the early detection of many conditions, such as diabetic retinopathy [24]. AI models, especially those based on deep learning and machine learning techniques, have demonstrated effectiveness in screening for diabetic retinopathy.

In our systematic review and meta-analysis, we included 18 studies with a total of 214,463 patients. We assessed the effectiveness of AI compared to human graders (ophthalmologists and retina specialists) by analyzing the sensitivity, specificity, and diagnostic odds ratio (DOR). Our results reveal that AI shows strong diagnostic performance. However, there was notable heterogeneity among the studies, as well as concerns about publication bias.

One of our key findings was the high heterogeneity among the included studies, with I² values frequently surpassing 97%. This heterogeneity could arise from differences in imaging techniques, AI algorithms, and study demographics [25, 26]. Although we performed subgroup analyses to identify the source of this considerable variability, the main source remains unclear. One potential source is the variable threshold among the studies: different studies may use different cut-points to detect and categorize diabetic retinopathy, and as these thresholds vary, the false positive and false negative results vary as well. Another contributing factor is the diversity of AI training data. Models trained on a more uniform dataset may encounter challenges in diverse clinical settings, whereas models trained on a diverse dataset tend to perform better. Notably, a study by Wu et al. indicated that machine learning models trained on a diverse dataset are more effective in clinical situations [27]. These differences in AI training produce varying diagnostic performance, leading to high heterogeneity [28]. Moreover, the use of different imaging techniques, such as fundus photography and OCT, may have contributed to this heterogeneity.

To explore the source of heterogeneity further, we performed subgroup analyses based on clinician type (ophthalmologist versus retina specialist) and imaging technique (fundus image only versus fundus image + OCT). Interestingly, the results revealed that AI outperformed clinicians in sensitivity: AI had a pooled sensitivity of 0.877 and specificity of 0.906, while retina specialists had a sensitivity of 0.750 and specificity of 0.949. It is worth noting that general ophthalmologists had lower sensitivity than retina specialists, which indicates that AI can considerably outperform general ophthalmologists in detecting diabetic retinopathy. This aligns with the findings of previous studies. A study by Gulshan et al. found that, in screening for diabetic retinopathy, AI could outperform general ophthalmologists because AI is consistent in analyzing and identifying minute patterns in large datasets compared to human experts [29].

Regarding imaging techniques, using fundus images together with OCT did not show a significant difference compared to using fundus images alone. This indicates that AI is useful for diabetic retinopathy screening over a wide range of settings, especially in areas with limited resources where OCT may not be readily available. Previous research by Abramoff et al. and Gulshan et al. showed that AI trained on fundus images alone could achieve results comparable to those of clinicians [13, 29]. For instance, Gulshan et al. found that AI trained on fundus images achieved a sensitivity of 90.3% and a specificity of 98.1% in detecting referable diabetic retinopathy without the need for OCT. This means that AI can recognize important indicators of diabetic retinopathy, such as microaneurysms, hemorrhages, and exudates, based solely on fundus images, making it highly efficient for broad clinical use [29].

This study addresses critical gaps in the literature. It is the first to include a detailed subgroup analysis based on imaging modality and clinician expertise. The efficacy of AI screening using fundus photography versus OCT has not been discussed before. Moreover, this study uniquely compares AI with clinicians of different levels of expertise, including general ophthalmologists and retina specialists. This approach allows for a deeper understanding of how AI performance varies across different clinical scenarios, and of whether more advanced imaging modalities or higher levels of expertise are required for efficient screening. Furthermore, our meta-analysis includes recently published studies, ensuring that the findings reflect the most current data available in the field.

The use of AI in DR screening is promising in terms of efficiency. Yet whether AI screening is cost-effective is an important consideration. The literature is conflicting, which may be attributable to factors such as geography and deployment strategy [30]. Artificial intelligence has demonstrated significant cost-effectiveness in screening for diabetic retinopathy. A Markov model analysis conducted in rural China evaluated the economic viability of AI screening compared to no screening and ophthalmologist-led screening. The study found that AI screening increased quality-adjusted life years (QALYs) by 0.16 at an incremental cost of $180.19 compared to no screening. In contrast, ophthalmologist-led screening was less effective and more expensive. The incremental cost-effectiveness ratio (ICER) for AI screening was $1,107.63, which was well below the cost-effectiveness threshold of one to three times the per capita GDP, affirming its feasibility in resource-constrained environments [31].
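The ICER quoted above is the incremental cost divided by the incremental QALYs. The short sketch below applies that definition to the rounded figures given in the text; the function name is hypothetical. The rounded inputs give about $1,126 per QALY, while the reported $1,107.63 corresponds to a QALY gain of roughly 0.1627, which rounds to the stated 0.16.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if incremental_qalys <= 0:
        raise ValueError("incremental QALYs must be positive for this sketch")
    return incremental_cost / incremental_qalys


# Figures as quoted in the text (rounded in the source).
icer_from_rounded_inputs = icer(180.19, 0.16)
implied_qaly_gain = 180.19 / 1107.63

print(f"ICER from rounded inputs: ${icer_from_rounded_inputs:,.2f} per QALY")
print(f"QALY gain implied by the reported ICER: {implied_qaly_gain:.4f}")
```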

Furthermore, a systematic review of AI-based DR screening systems highlighted their potential to scale effectively, reducing screening costs and improving accessibility, particularly in underserved regions. Notably, AI systems reduce reliance on ophthalmologists for initial DR detection, enabling healthcare systems to reallocate resources to treatment and follow-up care, thereby improving cost-effectiveness across the care pathway [30]. By combining accurate diagnosis with reduced costs, AI-based DR screening offers a scalable solution to alleviate the economic burden of vision loss, making it an invaluable tool in achieving global health equity [30, 31].

Despite the promising results, our study has several limitations that should be acknowledged. First, the high heterogeneity across studies limits the generalizability of the findings. Second, the presence of publication bias, as indicated by the asymmetry in the funnel plot and Egger’s test, suggests that smaller or non-significant studies may be underrepresented in the analysis. Although we applied the trim-and-fill method to adjust for missing studies, publication bias remains a concern that could have influenced the pooled estimates. Third, in the meta-regression, we did not analyze patient characteristics such as age, sex, and duration of the disease, which may be a source of heterogeneity and need further study. Finally, only studies published in English were included, which may introduce bias by omitting literature in other languages.

Conclusion

In conclusion, AI systems have demonstrated strong diagnostic performance in detecting diabetic retinopathy, with sensitivity and specificity comparable to or exceeding traditional clinicians. Their ability to outperform non-specialist clinicians highlights the promise of integrating AI into clinical practice, particularly for large-scale populations. As AI technology continues to improve and becomes more cost-effective, its integration could significantly enhance the early detection and treatment of DR, ultimately reducing the global burden of blindness. Emphasizing the practical application of AI in clinical settings will be vital for realizing its full benefit. Furthermore, future research should aim to standardize AI evaluation metrics and dataset diversity to reduce variability in future meta-analyses.

Data availability

All data generated or analyzed during this study are included in this published article.

References

  1. Teo ZL, Tham Y-C, Yu M, Chee ML, Rim TH, Cheung N, et al. Global prevalence of diabetic retinopathy and projection of burden through 2045. Ophthalmology. 2021;128(11):1580–91. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.ophtha.2021.04.027.

  2. Antonetti DA, Silva PS, Stitt AW. Current Understanding of the molecular and cellular pathology of diabetic retinopathy. Nat Reviews Endocrinol. 2021;17(4):195–206. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41574-020-00451-4.

  3. Miller DJ, Cascio MA, Rosca MG. Diabetic retinopathy: The role of mitochondria in the neural retina and microvascular disease. Antioxidants. 2020 Sept 23;9(10):905. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/antiox9100905

  4. Grzybowski A, Singhanetr P, Nanegrungsunk O, Ruamviboonsuk P. Artificial intelligence for diabetic retinopathy screening using color retinal photographs: from development to deployment. Ophthalmol Therapy. 2023;12(3):1419–37. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s40123-023-00691-3.

  5. Tufail A, Rudisill C, Egan C, Kapetanakis VV, Salas-Vega S, Owen CG, et al. Automated diabetic retinopathy image assessment software. Ophthalmology. 2017;124(3):343–51. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.ophtha.2016.11.014.

  6. Lim JI, Regillo CD, Sadda SR, Ipp E, Bhaskaranand M, Ramachandra C, et al. Artificial intelligence detection of diabetic retinopathy. Ophthalmol Sci. 2023;3(1):100228. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.xops.2022.100228.

  7. Zhang W, Li D, Wei Q, Ding D, Meng L, Wang Y, et al. The validation of deep Learning-based grading model for diabetic retinopathy. Front Med. 2022;9. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fmed.2022.839088.

  8. Liu R, Li Q, Xu F, Wang S, He J, Cao Y, et al. Application of artificial intelligence-based dual-modality analysis combining fundus photography and optical coherence tomography in diabetic retinopathy screening in a community hospital. Biomed Eng Online. 2022;21(1). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12938-022-01018-2.

  9. Li N, Ma M, Lai M, Gu L, Kang M, Wang Z, et al. A stratified analysis of a deep learning algorithm in the diagnosis of diabetic retinopathy in a real-world study. J Diabetes. 2021;14(2):111–20. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/1753-0407.13241.

  10. Dow ER, Khan NC, Chen KM, Mishra K, Perera C, Narala R, et al. Artificial Intelligence-human hybrid workflow enhances teleophthalmology for the detection of diabetic retinopathy. Ophthalmol Sci. 2023;3(4):100330. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.xops.2023.100330.

  11. Verbraak FD, Abramoff MD, Bausch GCF, Klaver C, Nijpels G, Schlingemann RO, et al. Diagnostic accuracy of a device for the automated detection of diabetic retinopathy in a primary care setting. Diabetes Care. 2019;42(4):651–6. https://doiorg.publicaciones.saludcastillayleon.es/10.2337/dc18-0148.

  12. Wintergerst MWM, Bejan V, Hartmann V, Schnorrenberg M, Bleckwenn M, Weckbecker K, et al. Telemedical diabetic retinopathy screening in a primary care setting: quality of retinal photographs and accuracy of automated image analysis. Ophthalmic Epidemiol. 2021;29(3):286–95. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/09286586.2021.1939886.

  13. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. Npj Digit Med. 2018;1(1). https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41746-018-0040-64.

  14. Quellec G, Lamard M, Lay B, Guilcher AL, Erginay A, Cochener B et al. Instant automatic diagnosis of diabetic retinopathy [Internet]. 2024 [cited 2024 Jul 7]. Available from: https://arxiv.org/abs/1906.11875

  15. Wang Y, Shi D, Tan Z, Niu Y, Jiang Y, Xiong R, et al. Screening referable diabetic retinopathy using a semi-automated deep learning algorithm assisted approach. Front Med. 2021;8. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fmed.2021.740987.

  16. Baget-Bernaldiz M, Pedro R-A, Santos-Blanco E, Navarro-Gil R, Valls A, Moreno A, et al. Testing a deep learning algorithm for detection of diabetic retinopathy in a Spanish diabetic population and with Messidor database. Diagnostics. 2021;11(8):1385. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/diagnostics11081385.

  17. Nunez do Rio JM, Nderitu P, Bergeles C, Sivaprasad S, Tan GS, Raman R. Evaluating a deep learning diabetic retinopathy grading system developed on mydriatic retinal images when applied to non-mydriatic community screening. J Clin Med. 2022;11(3):614. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/jcm11030614.

  18. Gulshan V, Rajan RP, Widner K, Wu D, Wubbels P, Rhodes T, et al. Performance of a deep-learning algorithm vs manual grading for detecting diabetic retinopathy in India. JAMA Ophthalmol. 2019;137(9):987. https://doiorg.publicaciones.saludcastillayleon.es/10.1001/jamaophthalmol.2019.2004.

  19. Rogers TW, Gonzalez-Bueno J, Garcia Franco R, Lopez Star E, Méndez Marín D, Vassallo J, et al. Evaluation of an AI system for the detection of diabetic retinopathy from images captured with a handheld portable fundus camera: the MAILOR AI study. Eye. 2020;35(2):632–8. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41433-020-0927-8.

  20. Scheetz J, Koca D, McGuinness M, Holloway E, Tan Z, Zhu Z, et al. Real-world artificial intelligence-based opportunistic screening for diabetic retinopathy in endocrinology and Indigenous healthcare settings in Australia. Sci Rep. 2021;11(1). https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-021-94178-5.

  21. Zhang Y, Shi J, Peng Y, Zhao Z, Zheng Q, Wang Z, et al. Artificial intelligence-enabled screening for diabetic retinopathy: A real-world, multicenter and prospective study. BMJ Open Diabetes Res Care. 2020;8(1). https://doiorg.publicaciones.saludcastillayleon.es/10.1136/bmjdrc-2020-001596.

  22. Wongchaisuwat N, Trinavarat A, Rodanant N, Thoongsuwan S, Phasukkijwatana N, Prakhunhungsit S, et al. In-person verification of deep learning algorithm for diabetic retinopathy screening using different techniques across fundus image devices. Transl Vis Sci Technol. 2021;10(13):17. https://doiorg.publicaciones.saludcastillayleon.es/10.1167/tvst.10.13.17.

  23. Ming S, Xie K, Lei X, Yang Y, Zhao Z, Li S, et al. Evaluation of a novel artificial intelligence-based screening system for diabetic retinopathy in community of China: A real-world study. Int Ophthalmol. 2021;41(4):1291–9. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s10792-020-01685-x.

  24. Lu W, Tong Y, Yu Y, Xing Y, Chen C, Shen Y. Applications of artificial intelligence in ophthalmology: general overview. J Ophthalmol. 2018;2018:1–15. https://doiorg.publicaciones.saludcastillayleon.es/10.1155/2018/5278196.

  25. Ting DS, Pasquale LR, Peng L, Campbell JP, Lee AY, Raman R, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2018;103(2):167–75. https://doiorg.publicaciones.saludcastillayleon.es/10.1136/bjophthalmol-2018-313173.

  26. Grauslund J. Diabetic retinopathy screening in the emerging era of artificial intelligence. Diabetologia. 2022;65(9):1415–23. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s00125-022-05727-0.

  27. Wu J-H, Liu TY, Hsu W-T, Ho JH-C, Lee C-C. Performance and limitation of machine learning algorithms for diabetic retinopathy screening: Meta-analysis. J Med Internet Res. 2021;23(7). https://doiorg.publicaciones.saludcastillayleon.es/10.2196/23863.

  28. Wang Z, Li Z, Li K, Mu S, Zhou X, Di Y. Performance of artificial intelligence in diabetic retinopathy screening: A systematic review and meta-analysis of prospective studies. Front Endocrinol. 2023;14. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fendo.2023.1197783.

  29. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402. https://doiorg.publicaciones.saludcastillayleon.es/10.1001/jama.2016.17216.

  30. Rajesh AE, Davidson OQ, Lee CS, Lee AY. Artificial intelligence and diabetic retinopathy: AI framework, prospective studies, head-to-head validation, and cost-effectiveness. Diabetes Care. 2023;46(10):1728–39. https://doiorg.publicaciones.saludcastillayleon.es/10.2337/dci23-0032.

  31. Huang X-M, Yang B-F, Zheng W-L, Liu Q, Xiao F, Ouyang P-W, et al. Cost-effectiveness of artificial intelligence screening for diabetic retinopathy in rural China. BMC Health Serv Res. 2022;22(1). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12913-022-07655-6.

Acknowledgements

Not applicable.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Contributions

Conceptualization: AA; Data curation: WA, WH, HA, MS; Formal analysis: WA, HA; Methodology: AA, MS, WH; Supervision: AA; Validation: AA; Writing – original draft: WA, HA, WH, MS; Writing – review and editing: AA, WA.

Corresponding author

Correspondence to Abdullah S. Alqahtani.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Alqahtani, A.S., Alshareef, W.M., Aljadani, H.T. et al. The efficacy of artificial intelligence in diabetic retinopathy screening: a systematic review and meta-analysis. Int J Retin Vitr 11, 48 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40942-025-00670-9

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s40942-025-00670-9

Keywords