A global artificial intelligence (AI) competition, the Prostate cANcer graDe Assessment (PANDA) challenge, compiled and publicly released a European cohort for AI development, the largest publicly available dataset of prostate biopsies to date. The challenge organisers fully reproduced the top-performing algorithms, externally validated their generalisation to independent United States and European cohorts, and compared them with the reviews of pathologists. Through such a community-driven competition, the PANDA challenge provides a curated, diverse dataset and a catalogue of models for prostate cancer pathology, and represents a blueprint for evaluating AI algorithms in digital pathology, according to an article published on 13 January 2022 in Nature Medicine.
Pathologists classify tumours into different Gleason growth patterns based on the histological architecture of the tumour tissue. Based on the distribution of Gleason patterns, biopsy specimens are categorised into 5 grade groups. However, the assessment is subjective, with considerable inter- and intra-pathologist variability that leads to both undergrading and overgrading of prostate cancer.
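The article does not spell out how Gleason patterns translate into the 5 grade groups. As an illustration only, the sketch below follows the widely used International Society of Urological Pathology (ISUP) convention, in which the primary (most common) and secondary patterns are combined. The function name is hypothetical, and benign biopsies and any challenge-specific label handling are not covered.

```python
def isup_grade_group(primary: int, secondary: int) -> int:
    """Map primary and secondary Gleason patterns (3 to 5) to a grade group (1 to 5).

    Assumes the standard ISUP scheme; this is an illustrative sketch,
    not the labelling code used in the PANDA challenge.
    """
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("Gleason patterns are expected to be 3, 4 or 5")

    score = primary + secondary
    if score == 6:
        return 1  # 3+3
    if score == 7:
        # 3+4 and 4+3 share a score but fall into different grade groups
        return 2 if primary == 3 else 3
    if score == 8:
        return 4  # 4+4, 3+5 or 5+3
    return 5  # scores 9 and 10


# A biopsy that is mostly pattern 4 with some pattern 3
print(isup_grade_group(4, 3))
```

The split at score 7 shows why grade groups depend on the distribution of patterns rather than on the summed score alone.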
AI algorithms have shown promise for grading prostate cancer, both in prostatectomy samples and in biopsies, and for assisting pathologists in microscopic review, but they are susceptible to various biases in their development and validation. This represents a key barrier to their implementation in clinical practice. The authors wrote that, despite the promise of AI for diagnosing prostate cancer in biopsies, the results have been limited to individual studies and lack validation in multinational settings.
Competitions have been shown to accelerate medical imaging innovations, but their impact is hindered by a lack of reproducibility and independent validation. The PANDA competition setup isolated the developers from the independent evaluation of the algorithms’ performance, minimising the potential for information leakage and offering a true assessment of the diagnostic power of these techniques.
The PANDA challenge is the largest histopathology competition organised to date. In total, 1,290 developers joined to catalyse the development of reproducible AI algorithms for Gleason grading using 10,616 digitised prostate biopsies. The challenge validated that a diverse set of submitted algorithms reached pathologist-level performance on independent cross-continental cohorts that were fully blinded to the algorithm developers.
On the United States and European external validation sets, the algorithms achieved agreements with expert uropathologists of 0.862 (quadratically weighted κ, 95% confidence interval [CI] 0.840–0.884) and 0.868 (95% CI 0.835–0.900), respectively. The performance exhibited by this group of algorithms adds to the evidence for the maturity of AI for this task.
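For readers unfamiliar with the agreement metric reported here, the following is a minimal sketch of quadratically weighted κ, written from the standard definition of the statistic. It is not the evaluation code used in the challenge; the function name, the integer label encoding (0 to 5) and the example data are assumptions made for illustration.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Quadratically weighted Cohen's kappa for integer labels 0..num_classes-1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same, non-empty set of cases")
    if num_classes < 2:
        raise ValueError("at least two classes are required")
    if any(not 0 <= x < num_classes for x in list(rater_a) + list(rater_b)):
        raise ValueError("labels must lie in the range 0..num_classes-1")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts O[i, j]
    hist_a = Counter(rater_a)                  # marginal counts for rater A
    hist_b = Counter(rater_b)                  # marginal counts for rater B
    max_dist = (num_classes - 1) ** 2

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Penalty grows with the squared distance between the two grades
            weight = (i - j) ** 2 / max_dist
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two raters labelled independently
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        # Both raters gave one identical label to every case; treating this
        # as perfect agreement is a convention chosen for this sketch.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Made-up labels for six biopsies (0 = benign, 1 to 5 = grade groups)
pathologist = [0, 1, 2, 3, 4, 5]
algorithm = [0, 1, 2, 3, 5, 5]
print(quadratic_weighted_kappa(pathologist, algorithm, num_classes=6))
```

Because the weights are quadratic, a disagreement of several grade groups is penalised far more heavily than a disagreement between adjacent groups, which suits an ordinal scale such as this one.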
The authors stated that successful generalisation across different patient populations, laboratories and reference standards, achieved by a variety of algorithmic approaches, warrants evaluating AI-based Gleason grading in prospective clinical trials. Taken together, the PANDA consortium showed that the combination of AI and innovative study designs, together with prespecified and rigorous validation across diverse cohorts, can be utilised to solve challenging and important medical problems.
To stimulate further advancement of the field, the full development set of 10,616 biopsies has been made publicly available for non-commercial research use at panda.grand-challenge.org.
This work was supported by the Dutch Cancer Society, Netherlands Organization for Scientific Research, Google LLC, Verily Life Sciences, Swedish Research Council, Swedish Cancer Society, Swedish eScience Research Center, EIT Health, Karolinska Institutet, Åke Wiberg Foundation and Prostatacancerförbundet, Academy of Finland, Cancer Foundation Finland, and ERAPerMed.
Reference
Bulten W, Kartasalo K, Chen P-HC, et al. Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nature Medicine; Published online 13 January 2022. DOI: https://doi.org/10.1038/s41591-021-01620-2