Copenhagen, Denmark
Onsite/Online

ESTRO 2022

Session Item

Sunday
May 08
14:15 - 15:15
Mini-Oral Theatre 1
13: Implementation of new technology
Livia Marrazzo, Italy;
Stefanie Ehrbar, Switzerland
Mini-Oral
Physics
Systematic input evaluation for deep learning-based pre-treatment quality assurance
Cecile Wolfs, The Netherlands
MO-0548

Abstract

Authors:

Cecile Wolfs¹, Frank Verhaegen¹

¹GROW - School for Oncology, Maastricht University Medical Center+, Radiation Oncology (Maastro), Maastricht, The Netherlands

Purpose or Objective

In pre-treatment quality assurance (QA) with electronic portal imaging device (EPID) dosimetry, gamma analysis with standard criteria and thresholds on gamma pass rates are commonly used for dose comparison and error detection. However, studies show that deep learning (DL) methods provide higher sensitivity for detecting errors, because full dose comparison images can be used as input and error causes can be identified [1-3]. While gamma analysis is the traditional dose comparison method of choice, other comparison methods (e.g. dose difference maps) could further improve error detection when using DL. Moreover, image preprocessing steps, such as normalization and image resizing, are known to influence DL model performance. The objective of this work is to systematically evaluate the impact of different dose comparison and image preprocessing methods on the performance of a DL model for error identification in pre-treatment QA.

Material and Methods

For 53 VMAT treatment plans of 46 lung cancer patients, mechanical errors were simulated (MLC leaf positions, monitor unit scaling, collimator rotation). Two DL classification levels were assessed: error type (Level 1) and error magnitude (Level 2). Portal dose images were predicted from the treatment plans with and without errors, and subsequently compared using the dose comparison methods listed in Table 1. Preprocessing consisted of cropping the dose comparison images by applying a 10% low dose threshold, normalizing the pixel values (min/max or mean/stdev; Table 1) and resizing to a square image size (Table 1). Combining every classification level, dose comparison method, normalization method and image size yielded 144 input datasets. The DL network architecture consisted of blocks of 2 convolutional layers and a max pooling layer, followed by dense layers. The exact network (e.g. number of convolutional blocks) and hyperparameters (e.g. learning rate) were optimized for each input dataset.
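
The three preprocessing steps described above (low dose cropping, normalization, resizing) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of nested lists for images, the bounding-box interpretation of the 10% threshold (taken relative to the maximum of the error-free dose image) and the nearest-neighbour resizing are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def crop_to_dose_threshold(comparison, reference_dose, fraction=0.10):
    """Crop `comparison` to the bounding box of pixels whose reference dose
    is at least `fraction` of the maximum reference dose (assumed reading
    of the 10% low dose threshold)."""
    cutoff = fraction * max(max(row) for row in reference_dose)
    rows = [i for i, row in enumerate(reference_dose)
            if any(v >= cutoff for v in row)]
    cols = [j for j in range(len(reference_dose[0]))
            if any(row[j] >= cutoff for row in reference_dose)]
    return [row[cols[0]:cols[-1] + 1]
            for row in comparison[rows[0]:rows[-1] + 1]]


def normalize_min_max(image):
    """Rescale pixel values linearly to the range [0, 1]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # guard against a constant image
    return [[(v - lo) / span for v in row] for row in image]


def normalize_mean_std(image):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    flat = [v for row in image for v in row]
    mu = mean(flat)
    sigma = pstdev(flat) or 1.0  # guard against a constant image
    return [[(v - mu) / sigma for v in row] for row in image]


def resize_square(image, size):
    """Nearest-neighbour resize to a size x size image (a simple stand-in
    for whatever interpolation a real pipeline would use)."""
    h, w = len(image), len(image[0])
    return [[image[i * h // size][j * w // size] for j in range(size)]
            for i in range(size)]


# Toy 4 x 4 example: dose is only delivered to the central 2 x 2 pixels.
reference = [[0, 0, 0, 0], [0, 10, 20, 0], [0, 30, 40, 0], [0, 0, 0, 0]]
comparison = [[1, 1, 1, 1], [1, 2, 3, 1], [1, 4, 5, 1], [1, 1, 1, 1]]

cropped = crop_to_dose_threshold(comparison, reference)
model_input = resize_square(normalize_mean_std(cropped), 4)
```

In this toy case the crop keeps only the central 2 x 2 region of the comparison image, which is then normalized and upsampled to 4 x 4. Swapping `normalize_mean_std` for `normalize_min_max`, or changing `size`, would produce the other preprocessing variants.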


Results

Figure 1 shows that relatively simple dose comparison methods, such as ratio analysis or relative dose differences, provide the highest DL model performance, although gamma analysis with strict criteria (particularly in the distance-to-agreement) also performs well. Mean/stdev normalization particularly improves Level 2 classification. Higher image resolution improves error identification, as more details of the dose comparison images are preserved.
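
To make the two simple dose comparison methods named above concrete, the following sketch computes a pixel-wise ratio map and a relative dose difference map between a portal dose image predicted with an error and the error-free one. The exact definitions used in the study are listed in Table 1 and are not reproduced here. Expressing the difference as a percentage of the maximum reference dose, and mapping pixels with zero reference dose to a ratio of 1, are assumptions made for illustration, as are the function names.

```python
def dose_ratio(evaluated, reference, eps=1e-9):
    """Pixel-wise ratio evaluated / reference.
    Pixels with (near) zero reference dose are set to 1.0 (assumption)."""
    return [[e / r if abs(r) > eps else 1.0 for e, r in zip(erow, rrow)]
            for erow, rrow in zip(evaluated, reference)]


def relative_dose_difference(evaluated, reference):
    """Pixel-wise dose difference as a percentage of the maximum
    reference dose (a global normalization, assumed here)."""
    d_max = max(max(row) for row in reference)
    return [[100.0 * (e - r) / d_max for e, r in zip(erow, rrow)]
            for erow, rrow in zip(evaluated, reference)]


# Toy 2 x 2 portal dose images (arbitrary units).
no_error = [[100.0, 50.0], [0.0, 200.0]]
with_error = [[110.0, 50.0], [0.0, 190.0]]

ratio_map = dose_ratio(with_error, no_error)
difference_map = relative_dose_difference(with_error, no_error)
```

Both maps keep the sign and location of each deviation, whereas a gamma map combines dose difference and distance-to-agreement into a single non-negative value per pixel.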



Conclusion

The choice of dose comparison method has a larger impact on DL-based error identification in pre-treatment QA than image preprocessing does. Model performance can be improved by applying mean/stdev normalization and a high image resolution, but the latter requires more computational resources and longer training times. While this is not a major issue for 2D images, it may be for 2D images per treatment segment or for 3D reconstructed dose volumes.


1. Nyflot et al. 2019 Med Phys 46: 456-464

2. Potter et al. 2020 Med Phys 47: 4711-4720

3. Kimura et al. 2021 Med Phys 48: 4769-4783