University of Twente Student Theses


Inherently interpretable Machine Learning for Probability of Default Estimation in IRB Models

Hottenhuis, Wouter (2022) Inherently interpretable Machine Learning for Probability of Default Estimation in IRB Models.

Abstract: In this thesis, we investigate inherently interpretable machine learning algorithms for use in internal ratings-based (IRB) models. Three high-potential models are assessed and compared on their applicability to IRB models, specifically to the probability of default component. To assess and compare the candidate algorithms effectively, a framework is constructed to score the different models. Research on the industry’s perspective and the regulatory context showed that the models should be evaluated on three main categories: interpretability, performance, and implementation. These categories are split into criteria, which are used to score the models. The status quo in probability of default modelling, a logistic regression model, is included in the comparison as a baseline. The investigated models are i) Logistic Model Tree (LMT), ii) Generalized Additive Models with Structured Interactions constructed with disentangled feed forward neural networks (GAMI-Net), and iii) Genetic Programming based Symbolic Regression (GPSR). In terms of performance, the LMT and the GAMI-Net outperformed the logistic regression. Although the GPSR did not outperform the logistic regression, it has other interesting qualities that may prove useful in future research. The GAMI-Net sacrificed less interpretability than the LMT did to achieve better performance. The LMT has more disadvantages, which makes it less suitable for adoption in the IRB model landscape. The GPSR gives up less in terms of interpretability and has advantages over the logistic regression. However, the algorithm uses neural networks to construct the final model, which is the main disadvantage of the GAMI-Net.
To conclude, the only real challenger to the logistic regression model is the GAMI-Net, which seems to strike the right balance in the interpretability-performance trade-off.
Item Type: Essay (Master)
Faculty: BMS: Behavioural, Management and Social Sciences
Subject: 85 business administration, organizational science
Programme: Industrial Engineering and Management MSc (60029)