University of Twente Student Theses
Reliability of Uncertainty for Assessing the generalisability of Deep Learning Segmentation for Head and Neck Organs at Risk
Aalst, Joëlle E. van (2024) Reliability of Uncertainty for Assessing the generalisability of Deep Learning Segmentation for Head and Neck Organs at Risk.
Full text not available from this repository.
Full Text Status: | Access to this publication is restricted |
Embargo date: | 17 September 2026 |
Abstract: | Deep learning (DL) segmentation models for organs at risk (OARs) in radiotherapy have reached the accuracy of manual segmentation. The high-risk environment of radiotherapy has led to cautious adoption of these autosegmentation models, which hampers their optimal deployment. During manual quality assurance (QA) of DL segmentation, many corrections are made, especially within 2 mm, with a variable but often minimal effect on dose planning. These corrections partially eliminate the efficiency gain promised by DL segmentation models and re-introduce inter- and intra-observer variability. The use of uncertainty in automated or semi-manual quality assurance of autosegmentation results is an emerging field. However, there is no consensus on the optimal method and metrics to compute uncertainty maps. Furthermore, there is disagreement on the theoretical understanding of uncertainty representation in radiotherapy and on the disentanglement of uncertainty for out-of-distribution detection and data uncertainty visualisation. For out-of-distribution detection, no practical research with real-world radiotherapy data has been performed. This thesis evaluates the reliability of common uncertainty estimation methods and metrics. Additionally, we test their ability to distinguish between in-distribution and out-of-distribution data by withholding patient subsets during training, defined by radiotherapy-specific constraints (metal artefacts, a bolus and a bitelock). Using an nnU-Net segmentation model for 20 head and neck cancer organs at risk, we computed uncertainty estimations with Monte Carlo dropout, deep ensemble modelling, and test-time augmentation. Based on reliability diagrams and model performance stability, we find that these methods produce trustworthy uncertainty maps. However, the quality of the uncertainty metrics varies greatly, with entropy outperforming mutual information and variance. Finally, we show that uncertainty can effectively detect out-of-distribution cases, particularly when the visibility or shape of the organs at risk differs between subsets. |
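The three uncertainty metrics named in the abstract (entropy, mutual information and variance) can all be derived from repeated stochastic predictions, whether those come from Monte Carlo dropout passes, ensemble members or test-time augmentations. The sketch below is illustrative only and is not taken from the thesis: it assumes the samples are stacked softmax outputs of shape `(T, C, ...)`, uses the textbook definitions of each metric, and the function name `uncertainty_maps` and the `eps` parameter are hypothetical. The thesis may define or aggregate these metrics differently.

```python
import numpy as np


def uncertainty_maps(probs, eps=1e-12):
    """Voxel-wise uncertainty from T stochastic softmax samples.

    probs: array of shape (T, C, ...) with T samples over C classes
           (assumed layout, not confirmed by the thesis).
    Returns (entropy, mutual_information, variance), each of shape (...).
    """
    probs = np.asarray(probs, dtype=np.float64)

    # Mean predictive distribution over the T samples, shape (C, ...).
    mean_p = probs.mean(axis=0)

    # Predictive entropy: entropy of the mean distribution.
    entropy = -np.sum(mean_p * np.log(mean_p + eps), axis=0)

    # Expected entropy: average of the per-sample entropies.
    expected_entropy = -np.sum(probs * np.log(probs + eps), axis=1).mean(axis=0)

    # Mutual information: predictive entropy minus expected entropy.
    mutual_information = entropy - expected_entropy

    # Variance across samples, summed over classes.
    variance = probs.var(axis=0).sum(axis=0)

    return entropy, mutual_information, variance
```

When all samples agree, mutual information and variance are close to zero while entropy still reflects how confident the shared prediction is. When samples disagree, all three increase.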
Item Type: | Essay (Master) |
Clients: | University Medical Centre Groningen, Groningen, The Netherlands |
Faculty: | TNW: Science and Technology |
Subject: | 44 medicine, 50 technical science in general |
Programme: | Technical Medicine MSc (60033) |
Link to this item: | https://purl.utwente.nl/essays/103608 |