Vienna, Austria

ESTRO 2023

Session Item

Saturday
May 13
09:00 - 10:00
Stolz 2
TCP/NTCP modelling and prediction
Karen Kirkby, United Kingdom;
Nienke Hoekstra, The Netherlands
Mini-Oral
Physics
External validation of radiomics and deep learning models for recurrence-free survival prediction
Yan Li, The Netherlands
MO-0062

Abstract

External validation of radiomics and deep learning models for recurrence-free survival prediction
Authors:

Yan Li1, Baoqiang Ma1, Hung Chu1, Johannes Albertus Langendijk1, Lisanne Vania van Dijk1, Nanna Maria Sijtsema1

1University Medical Center Groningen, Department of Radiation Oncology, Groningen, The Netherlands

Purpose or Objective

Prediction models can be used to identify patients at high and low risk of recurrence, with recurrence-free survival (RFS) events including local recurrence, regional recurrence and distant metastases. Studies investigating treatment intensification for high-risk patients or de-intensification for low-risk patients could then be performed to determine the optimal treatment for individual patients. Previously, a radiomics model, a deep-learning (DL) model and a hybrid model were developed for RFS prediction in oropharyngeal cancer (OPC) patients within the HECKTOR 2022 challenge. Good model performance was obtained in the training and test datasets. The goal of the current study was to test the generalizability of these models by evaluating their performance in an additional external validation dataset.

Material and Methods

As part of the HECKTOR 2022 challenge, we developed and trained a radiomics model, a deep-learning (DL) model and a hybrid model on the provided training dataset of 481 OPC patients from 7 centers and tested them on the independent test dataset of 339 OPC patients from 3 centers. In total, 219 OPC patients who were treated at our institution were included in the current external validation. The primary tumor and pathologic lymph nodes (GTVp+n) were auto-contoured and used as input for RFS prediction.

The radiomics model variables were weight, HPV status, surface area and Run Length Non-Uniformity of the GTVp+n on the CT scan. The DL model input included weight, chemotherapy, gender, age, the PET scan and the GTVp+n contour. The hybrid model was the DL model in combination with the risk output of the radiomics model. The concordance index (c-index) of each model was calculated in our external dataset. Kaplan-Meier (KM) curves were generated after stratifying patients into high- and low-risk groups at the median value of the predicted risk. The log-rank test was used to compare the two risk groups.
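To make the evaluation steps concrete, the following is a minimal sketch of a c-index calculation and a median-based risk split. It is not the authors' code: the function names, the use of Harrell's formulation of the c-index, the handling of tied risks, and the choice to assign patients exactly at the median to the low-risk group are all assumptions made for illustration.

```python
from statistics import median


def concordance_index(times, events, risks):
    """Harrell's c-index (assumed formulation, not confirmed by the abstract).

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when patient i also has the higher predicted risk. Tied risks
    count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def split_by_median_risk(risks):
    """Label each patient 'high' or 'low' relative to the median risk.

    Assumption: patients exactly at the median go to the low-risk group.
    """
    cutoff = median(risks)
    return ["high" if r > cutoff else "low" for r in risks]


if __name__ == "__main__":
    # Hypothetical toy data: follow-up time, event flag (1 = recurrence
    # observed, 0 = censored) and model-predicted risk per patient.
    times = [5, 8, 12, 20, 24, 30]
    events = [1, 1, 0, 1, 0, 0]
    risks = [0.9, 0.7, 0.6, 0.4, 0.5, 0.1]
    print(concordance_index(times, events, risks))
    print(split_by_median_risk(risks))
```

A c-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.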

Results

The validation performance of the radiomics (c-index=0.67), DL (c-index=0.67) and hybrid (c-index=0.68) models in our external set was good and comparable with their performance in the HECKTOR test set (Table 1). The KM curves of the three models showed good and statistically significant separation between the high- and low-risk RFS groups (Figure 1).
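The significance of the separation between two risk groups is assessed with a log-rank test. The sketch below shows the standard two-group form of that test in plain Python. It is illustrative only: the function name and the toy data are hypothetical, and the abstract does not state which software or test variant the authors used.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    At each distinct event time, the observed number of events in
    group A is compared with the number expected if both groups
    shared the same hazard.
    """
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events occurring exactly at time t
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero; test is undefined")
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical toy data for a high-risk and a low-risk group
    chi2, p = logrank_test([1, 2], [1, 1], [3, 4], [1, 1])
    print(chi2, p)
```

With samples this small the chi-square approximation is rough; the example only illustrates the mechanics of the test.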


The radiomics model, which uses interpretable imaging biomarkers, achieved the same performance as the DL model. The hybrid model showed the best performance in the training set and in our external set, yet its generalizability may be lower, as suggested by its lower performance in the HECKTOR test set.

Table 1. C-indexes of the three models

Figure 1. Kaplan-Meier (KM) curves of the three models


Conclusion

The radiomics, DL and hybrid models for RFS prediction all showed good performance when validated in our external OPC dataset. Since the radiomics model is the most interpretable, it may be the most promising tool for RFS prediction and could contribute to more individualized treatment.