Automated deep-learning system for Gleason grading of prostate cancer using biopsies: a diagnostic study - 31/01/20

DOI: 10.1016/S1470-2045(19)30739-9

Wouter Bulten, MSc ^a,*, Hans Pinckaers, MD ^a, Hester van Boven, MD ^c, Robert Vink, MD ^d, Thomas de Bel, MSc ^a, Bram van Ginneken, Prof, PhD ^b, Jeroen van der Laak, PhD ^a, Christina Hulsbergen-van de Kaa, PhD ^d, Geert Litjens, PhD ^a

^a Department of Pathology, Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, Netherlands

^b Department of Radiology & Nuclear Medicine, Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, Netherlands

^c Department of Pathology, Antoni van Leeuwenhoek Hospital, The Netherlands Cancer Institute, Amsterdam, Netherlands

^d Laboratory of Pathology East Netherlands, Hengelo, Netherlands

^* Correspondence to: Wouter Bulten, Department of Pathology, Radboud University Medical Center, Nijmegen 6500 HB, Netherlands

Summary

Background

The Gleason score is the strongest correlating predictor of recurrence for prostate cancer, but has substantial inter-observer variability, limiting its usefulness for individual patients. Specialised urological pathologists have greater concordance; however, such expertise is not widely available. Prostate cancer diagnostics could thus benefit from robust, reproducible Gleason grading. We aimed to investigate the potential of deep learning to perform automated Gleason grading of prostate biopsies.

Methods

In this retrospective study, we developed a deep-learning system to grade prostate biopsies following the Gleason grading standard. The system was developed using randomly selected biopsies, sampled by the biopsy Gleason score, from patients at the Radboud University Medical Center (pathology report dated between Jan 1, 2012, and Dec 31, 2017). A semi-automatic labelling technique was used to circumvent the need for manual annotations by pathologists, using pathologists’ reports as the reference standard during training. The system was developed to delineate individual glands, assign Gleason growth patterns, and determine the biopsy-level grade. For validation of the method, a consensus reference standard was set by three expert urological pathologists on an independent test set of 550 biopsies. Of these 550, 100 were used in an observer experiment, in which the system, 13 pathologists, and two pathologists in training were compared with respect to the reference standard. The system was also evaluated on an external test dataset of 886 cores, including 245 cores from a different centre that were independently graded by two pathologists.
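For orientation, biopsy-level Gleason grades are commonly summarised as grade groups 1 to 5, following the ISUP 2014 convention. The sketch below shows only that standard mapping from a primary and secondary growth pattern to a grade group. It is an illustration, not the study's grading procedure, and the function name `grade_group` is hypothetical.

```python
def grade_group(primary: int, secondary: int) -> int:
    """Map primary and secondary Gleason patterns to an ISUP grade group.

    Illustrative helper (hypothetical name). It covers only the
    conventional mapping for patterns 3, 4 and 5, not how any
    particular system derives those patterns from tissue.
    """
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("patterns must be 3, 4 or 5")
    score = primary + secondary
    if score == 6:
        return 1                          # 3+3
    if score == 7:
        return 2 if primary == 3 else 3   # 3+4 versus 4+3
    if score == 8:
        return 4                          # 4+4, 3+5, 5+3
    return 5                              # scores 9 and 10


# The clinical thresholds reported for the system are cut-offs on this scale.
is_grade_group_2_or_more = grade_group(3, 4) >= 2
```

Benign biopsies have no Gleason patterns and fall outside this mapping; a pipeline would need a separate label for them.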

Findings

We collected 5759 biopsies from 1243 patients. The developed system achieved a high agreement with the reference standard (quadratic Cohen’s kappa 0·918, 95% CI 0·891–0·941) and scored highly at clinical decision thresholds: benign versus malignant (area under the curve 0·990, 95% CI 0·982–0·996), grade group of 2 or more (0·978, 0·966–0·988), and grade group of 3 or more (0·974, 0·962–0·984). In an observer experiment, the deep-learning system scored higher (kappa 0·854) than the panel (median kappa 0·819), outperforming 10 of 15 pathologist observers. On the external test dataset, the system obtained a high agreement with the reference standard set independently by two pathologists (quadratic Cohen’s kappa 0·723 and 0·707) and within inter-observer variability (kappa 0·71).
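The agreement figures above are quadratically weighted Cohen's kappa values. As a minimal sketch of how such a statistic is computed from two sets of ordinal labels, assuming grades are encoded as integers from 0 to `n_classes - 1` (for example 0 for benign and 1 to 5 for grade groups, which is an assumption of this example and not a detail stated in the summary):

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Cohen's kappa with quadratic weights for two equal-length label lists.

    Labels must be integers in range(n_classes), with n_classes >= 2.
    Disagreements are penalised by the squared distance between the two
    grades, so a one-step error costs less than a four-step error.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)

    # Observed confusion matrix between the two raters.
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0

    # Marginal label counts for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes))
              for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters used one identical class; kappa is formally
        # undefined, and this sketch reports perfect agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Toy labels for illustration only, not data from the study.
reference = [0, 1, 2, 3, 4, 5]
prediction = [0, 1, 2, 3, 4, 5]
kappa = quadratic_weighted_kappa(reference, prediction, n_classes=6)
```

In practice, `sklearn.metrics.cohen_kappa_score` with `weights="quadratic"` computes the same statistic. The confidence intervals quoted above would need an additional resampling step that this sketch does not attempt.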

Interpretation

Our automated deep-learning system achieved a performance similar to pathologists for Gleason grading and could contribute to prostate cancer diagnosis. The system could assist pathologists by screening biopsies, providing second opinions on grade group, and presenting quantitative measurements of volume percentages.

Funding

Dutch Cancer Society.

Outline

Introduction

Methods

Study design and participants

Test methods

Statistical analysis

Role of the funding source

Results

Discussion

Vol 21, No. 2

P. 233-241 - February 2020