ESTRO 2022

Session Item

Imaging acquisition and processing

Session Type: Poster (digital)

Track: Physics

Standardising Nomenclatures in Breast Radiotherapy Imaging Data using Machine Learning Algorithms

Phillip Chlap, Australia

Presentation Number: PO-1618

Abstract

Authors:

Ali Haidar¹,²,³, Matthew Field¹,²,³, Vikneswary Batumalai¹,⁴, Kirrily Cloak¹,²,³, Daniel Al Mouiee¹,²,³, Phillip Chlap¹,²,³, Xiaoshui Huang⁵,²,³, Vicky Chin¹,²,³, Martin Carolan⁶, Jonathan Sykes⁷,⁸, Shalini Vinod¹,²,³, Geoffrey Delaney¹,²,³, Lois Holloway¹,²,³

¹University of New South Wales, South Western Sydney Clinical School, Sydney, Australia; ²South Western Sydney Local Health District, Liverpool and Macarthur Cancer Therapy Centres, Sydney, Australia; ³Ingham Institute for Applied Medical Research, Medical Physics Research Group, Sydney, Australia; ⁴GenesisCare, Radiation Oncology, Sydney, Australia; ⁵University of Sydney, ImageX, Sydney, Australia; ⁶Wollongong Hospital, Illawarra Cancer Care Centre, Wollongong, Australia; ⁷Sydney West Radiation Oncology Network, Radiation Oncology, Sydney, Australia; ⁸University of Sydney, Institute of Medical Physics, Sydney, Australia

Purpose or Objective

Data mining and analyses using retrospective radiotherapy imaging datasets sourced from one or more centres require translation of local ontologies for structure names into a standardised ontology. Our aim was to investigate machine learning (ML) based tools for standardising target and organ-at-risk (OAR) volume definition in breast cancer radiotherapy plans.

Material and Methods

Radiotherapy imaging data for 1613 breast cancer patients treated between 2014 and 2018 were collected from a single centre. The volumes were initially classified based on discussions with clinicians. Of these patients, 1440 were selected for ML model development and 173 were used for testing (hold-out). To represent each target and OAR volume, four types of characteristics were generated: textual features, geometric features, dosimetry features, and central slices, where the central slice is the slice with the highest number of contoured pixels in a volume. Five datasets were created from the original cohort: the first four represented different subsets of volumes and the last represented the whole list of volumes (Table 1). For each dataset, 15 sets of feature combinations were created to see how the use of different attributes affected the standardisation performance.
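The two steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the 15 feature sets are the non-empty subsets of the four feature types (2^4 - 1 = 15; the abstract reports only the count), and that a contoured volume is available as a stack of binary 2D masks. The names `FEATURE_TYPES`, `feature_combinations`, and `central_slice_index` are hypothetical.

```python
from itertools import combinations

# The four feature types named in the abstract.
FEATURE_TYPES = ("text", "geometric", "dosimetry", "image")


def feature_combinations(features=FEATURE_TYPES):
    """Return every non-empty subset of the feature types.

    Assumption: this is how the 15 feature sets were formed.
    """
    return [
        combo
        for size in range(1, len(features) + 1)
        for combo in combinations(features, size)
    ]


def central_slice_index(mask_volume):
    """Return the index of the slice with the most contoured pixels.

    mask_volume is a list of 2D slices, each a list of rows of 0/1 values.
    Ties resolve to the first such slice (an assumption).
    """
    pixel_counts = [sum(sum(row) for row in slice_) for slice_ in mask_volume]
    return pixel_counts.index(max(pixel_counts))


# Toy volume with three 2x2 slices; the middle slice has the most pixels.
toy_volume = [
    [[0, 1], [0, 0]],
    [[1, 1], [1, 0]],
    [[0, 0], [1, 0]],
]
combos = feature_combinations()
central = central_slice_index(toy_volume)
```

Under the subset assumption, `len(combos)` matches the 15 feature sets mentioned in the text, and `central` picks the middle slice of the toy volume.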

Three types of artificial neural networks were used to model different combinations of features: feed forward neural networks (FFNN), convolutional neural networks (CNNs), and multi-input neural networks (MINN). FFNN were used for training tabular data combinations (e.g. text and dosimetry features), a CNN was used for training imaging data (central slices), and MINN were used for training tabular and imaging combinations (e.g. text and imaging features). Classification accuracy was used to compare the developed models against each other over the hold-out dataset.
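The mapping from feature combination to network type can be written as a small dispatch rule. The sketch below is illustrative only: it assumes that text, geometric, and dosimetry features are all handled as tabular inputs (the abstract confirms this for text and dosimetry) and that the feature sets are the non-empty subsets of the four feature types. The function `architecture_for` and the constants are hypothetical names.

```python
from collections import Counter
from itertools import combinations

FEATURE_TYPES = ("text", "geometric", "dosimetry", "image")
IMAGE_FEATURE = "image"  # central slices; the others are assumed tabular


def architecture_for(combo):
    """Pick the network type for a feature combination, per the text above."""
    if not combo:
        raise ValueError("a feature combination must not be empty")
    has_image = IMAGE_FEATURE in combo
    has_tabular = any(feature != IMAGE_FEATURE for feature in combo)
    if has_image and has_tabular:
        return "MINN"  # tabular and imaging inputs together
    if has_image:
        return "CNN"   # imaging data only
    return "FFNN"      # tabular data only


# Assumed enumeration: every non-empty subset of the four feature types.
all_combos = [
    combo
    for size in range(1, len(FEATURE_TYPES) + 1)
    for combo in combinations(FEATURE_TYPES, size)
]
architecture_counts = Counter(architecture_for(c) for c in all_combos)
```

If the enumeration assumption holds, the 15 feature sets split into seven FFNN models, one CNN model, and seven MINN models per dataset.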

Table 1 Classes used in datasets.

Results

Classification accuracy of each of the developed models is shown in Fig. 1. The best model (a MINN) reported 99.416% classification accuracy over the hold-out samples when used to standardise all the nomenclatures in a breast radiotherapy plan into 21 different classes (Dataset 5). Nineteen samples belonging to different classes were misclassified, 10 of which were predicted as ‘exclude’ (i.e. not to be used). Three types of features were used with this model: textual features, dosimetry features, and images. Integrating several features resulted in greater classification accuracy than employing single characteristics. Reliable performance was observed with all the datasets when the text feature was used as input to the model, which is consistent with the traditional approach, in which clinicians look at text first to standardise nomenclatures.
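Classification accuracy here is the percentage of hold-out samples assigned to the correct class. As a rough consistency check, the sketch below back-calculates the number of hold-out samples implied by the two reported figures. It assumes the 19 misclassified samples account for all errors behind the 99.416% figure; the resulting total is an inference, not a number reported in the abstract. All names are hypothetical.

```python
def classification_accuracy(n_correct, n_total):
    """Return classification accuracy as a percentage."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_correct / n_total


REPORTED_ACCURACY = 99.416  # percent, best MINN on Dataset 5 (from the text)
N_MISCLASSIFIED = 19        # misclassified hold-out samples (from the text)

# Inferred, not reported: total hold-out samples implied by the two figures.
implied_total = round(N_MISCLASSIFIED / (1 - REPORTED_ACCURACY / 100))

# Recompute the accuracy from the inferred total as a sanity check.
recomputed = classification_accuracy(
    implied_total - N_MISCLASSIFIED, implied_total
)
```

Under that assumption the implied total is about 3253 volumes across the 173 hold-out patients, and recomputing the accuracy from it reproduces the reported value to three decimal places.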

Fig. 1 Modelling results.

Conclusion

Standardisation of nomenclatures using ML is feasible on single-institution data if multiple features are included in the model. This project is ongoing; federated ML will be investigated for standardising radiotherapy data across different centres, guidelines, and anatomical sites.