Artificial Intelligence-enabled Decision Support in Surgery: State-of-the-art and Future Directions

Ann Surg. 2023 Jul 1;278(1):51-58. doi: 10.1097/SLA.0000000000005853. Epub 2023 Mar 21.

Abstract

Objective: To summarize state-of-the-art artificial intelligence-enabled decision support in surgery and to quantify deficiencies in scientific rigor and reporting.

Background: To positively affect surgical care, decision-support models must exceed current reporting guideline requirements by performing external and real-time validation, enrolling adequate sample sizes, reporting model precision, assessing performance across vulnerable populations, and achieving clinical implementation; the degree to which published models meet these criteria is unknown.

Methods: Embase, PubMed, and MEDLINE databases were searched from their inception to September 21, 2022 for articles describing artificial intelligence-enabled decision support in surgery that uses preoperative or intraoperative data elements to predict complications within 90 days of surgery. Scientific rigor and reporting criteria were assessed and reported according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines.

Results: Sample size ranged from 163 to 2,882,526, with 8/36 articles (22.2%) featuring sample sizes of fewer than 2000; 7 of these 8 articles (87.5%) had below-average (<0.83) area under the receiver operating characteristic curve or accuracy. Overall, 29 articles (80.6%) performed internal validation only, 5 (13.8%) performed external validation, and 2 (5.6%) performed real-time validation. Twenty-three articles (63.9%) reported precision. No articles reported performance across sociodemographic categories. Thirteen articles (36.1%) presented a framework that could be used for clinical implementation; none assessed clinical implementation efficacy.
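
The precision reporting counted in the Results can be illustrated with a minimal sketch. It is not taken from any of the reviewed articles, and it assumes that "precision" means a confidence interval around a performance estimate. The sketch computes the area under the receiver operating characteristic curve and a percentile bootstrap 95% confidence interval. The function names `auroc` and `bootstrap_ci`, the resample count, and the synthetic labels and scores are all hypothetical choices made for illustration.

```python
import random


def auroc(labels, scores):
    """Probability that a random positive case outscores a random
    negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement, keeping label and score paired.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic data (assumption): about 30% complication rate, with
# complications scoring higher on average than non-complications.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(200)]
scores = [rng.gauss(0.6 if y else 0.4, 0.15) for y in labels]

point = auroc(labels, scores)
ci_low, ci_high = bootstrap_ci(labels, scores, n_boot=200)
print(f"AUROC {point:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

With a small sample the interval is wide, which is why a point estimate alone says little about how a model will perform on new patients.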

Conclusions: Artificial intelligence-enabled decision support in surgery is limited by reliance on internal validation, small sample sizes that risk overfitting and sacrifice predictive performance, and failure to report confidence intervals, precision, equity analyses, and clinical implementation. Researchers should strive to improve scientific quality.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Artificial Intelligence*
  • Humans
  • ROC Curve