Deep Learning Assistance Versus Manual Reading in Cervical Cytopathology

06/12/2026
Key Takeaways
- Higher sensitivity was observed with deep learning-assisted reading than with manual microscopy in routine liquid-based cytology.
- Specificity remained similar, and slide review was markedly faster when assistance was available.
- Sensitivity gains were reported across several prespecified subgroups, and the authors described implementation-related limitations.
The study took place at four pathology centers in China: Peking Union Medical College Hospital, Shenzhen Maternity and Child Healthcare Hospital, Anhui Province Hospital, and Zhejiang Cancer Hospital. Between April 7 and September 27, 2023, 1,920 women aged 18 years or older undergoing routine liquid-based cytology were included after exclusions. Four non-expert cytopathologists with 1 to 3 years of experience interpreted slides with and without deep learning assistance. The reference standard was an expert consensus reached by two senior cytopathologists, with a third senior cytopathologist as adjudicator. The trial used randomized reading order, a four-week washout period, and CONSORT-AI reporting, with sensitivity and specificity as primary endpoints and AUC, predictive values, and reading efficiency as secondary outcomes.
Sensitivity was 14.3% higher with assistance than without it (95% confidence interval, 7.6% to 21.1%; p<0.001). Specificity was 85.1% without assistance and 86.5% with assistance, a difference that was not statistically significant (p=0.238). Average AUC improved from 0.782 to 0.861 with assistance, and median reading time fell from 175 seconds to 30 seconds per slide; both differences were significant at p<0.001. The prespecified margins were 5% for sensitivity superiority and -5% for specificity non-inferiority, and both criteria were met.
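The margin comparison can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' analysis: it assumes the common decision rule that superiority is met when the lower bound of the 95% confidence interval for the difference exceeds the margin, and it treats the reported differences as percentage points. The function and variable names are hypothetical. Because no confidence interval for the specificity difference is reported here, only the point difference is computed for that endpoint.

```python
def meets_superiority(ci_lower, margin):
    """Assumed rule: superiority holds when the lower CI bound exceeds the margin."""
    return ci_lower > margin


def relative_reduction(before, after):
    """Fractional reduction from `before` to `after`."""
    return (before - after) / before


# Values reported in the summary (assumed to be percentage points).
sens_ci = (7.6, 21.1)        # 95% CI for the sensitivity difference
superiority_margin = 5.0     # prespecified sensitivity margin

sens_superior = meets_superiority(sens_ci[0], superiority_margin)

# Specificity: only the point difference can be derived from the reported values.
spec_diff = round(86.5 - 85.1, 1)

# Median reading time per slide, in seconds.
time_reduction = relative_reduction(175, 30)

print(f"Sensitivity superiority met under assumed rule: {sens_superior}")
print(f"Specificity point difference: {spec_diff} percentage points")
print(f"Relative reduction in median reading time: {time_reduction:.1%}")
```

Under these assumptions, the lower bound of 7.6 sits above the 5-point margin, and the drop from 175 to 30 seconds corresponds to a reduction of roughly 83% in median reading time.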
Across prespecified analyses, sensitivity gains were reported by menopausal status, for ASC-US detection, at two study sites, with the natural sediment method, and across both scanners. As a separate result, the standalone deep learning system achieved an AUC of 0.879, sensitivity of 91.8%, and specificity of 84.0%. The authors stated that no HSIL+ cases were missed by the standalone system, although some non-HSIL cases were missed. These findings added context to the assisted versus manual comparison within the reported study setting.
The authors noted several limitations, including reliance on expert consensus rather than histologic confirmation and too few atypical glandular cell cases for assessment. They also reported that cytopathologists could not be blinded, and that observer attention may have been influenced by awareness of being studied. The study population was mainly Han Chinese, which limited assessment across ethnic groups, and the authors called for future follow-up and histologic confirmation in real-world evaluation. The reported findings remained limited to improved sensitivity and faster reading within this crossover framework.
