University of Twente Student Theses


Forecasting Mortgage Prepayment

Star, Tim van der (2022) Forecasting Mortgage Prepayment.

Abstract: Embedded in the Dutch mortgage contract is the option for mortgagors to prepay part of their residential mortgage outside their scheduled contractual payments. Mortgage providers have to consider three types of prepayment: partial, full and arbitrage. Prepayments alter the expected cash flows from a mortgage and, because they are stochastic, make it hard for mortgagees to value their mortgage portfolio accurately at the aggregate level. We study how different models, determinants and undersampling techniques help predict the observed prepayment rate for the Dutch mortgage portfolio of Allianz. Over the period 2014-2021, the current Allianz prepayment model forecasts prepayment cash flows with a total error of -19.3% relative to the cash flows actually observed in the Allianz portfolio. Furthermore, the average yearly forecast cash flow error of the Allianz model is 22.1%, which Allianz would like to reduce. The aim of this thesis is therefore to develop a model that does so. We perform a literature review to identify prepayment determinants and use these determinants in a preliminary analysis of the Allianz mortgage data. We investigate the relevance of these determinants to the prepayment rate for all models. We then construct three models (a logistic regression model, a random forest model and a neural network model) and investigate their ability to forecast each type of prepayment separately. Furthermore, we evaluate each model on its average and total prepayment cash flow error. We undersample the training set in order to reduce the data imbalance towards the non-prepayment class. By undersampling to different degrees we obtain several training sets of different sizes, each of which is used to train the models.
We create data sets in which the prepayment observations (the minority class) make up 10%, 20%, 30%, 40% and 50% of the training data and explore whether undersampling improves the predictive power of each model. By evaluating each model on multiple portfolio-level error metrics and loan-level metrics, we determine which model best replicates the observed conditional prepayment rate (CPR) for each of the three prepayment types (partial, full and arbitrage) present in the Allianz portfolio. Of the models that give insight into relevant prepayment variables, we find that logistic regression is the only one to give clear, interpretable relationships between the modelled variables and each of the three prepayment rates. Furthermore, we find that all models are very imprecise at predicting prepayments at the individual loan level, and we therefore attach less weight to the loan-level results. For partial prepayments, the random forest model trained on the undersampled training set in which the minority class represents 30% of all observations produces the CPR forecast closest to the observed partial prepayment CPR, with the lowest weighted root mean square error (WRMSE) of 0.205%. For full prepayments, the random forest model trained on the data set in which the minority class (in this case full prepayments) represents 50% of the training data has the lowest WRMSE relative to the observed full prepayment CPR, at 0.581%. For the arbitrage prepayment CPR (arbitrage prepayments take place when refinancing), the random forest model without undersampling performs best, with a WRMSE of 0.392%.
Reviewing the cash flow estimation, we find that the neural network model trained on the original imbalanced training set, without undersampling, achieves a yearly cash flow error of 17.5%, which is the lowest error of all models and roughly 4.5 percentage points lower than the 22.1% of the benchmark Allianz model. We therefore achieve the goal of this thesis: a model with a lower error than the benchmark Allianz model.
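The undersampling step described in the abstract can be illustrated with a minimal sketch. It assumes plain random undersampling of the majority (non-prepayment) class, which may differ from the procedure used in the thesis. The function name `undersample`, the 0/1 label encoding and the toy data are hypothetical and chosen only for illustration.

```python
import random


def undersample(labels, minority_share, seed=0):
    """Return sorted row indices of a training subset in which the
    minority class (label 1) makes up roughly `minority_share` of rows.

    All minority rows are kept; majority rows (label 0) are dropped at
    random. This is an assumed, generic procedure, not the thesis code.
    """
    if not 0 < minority_share < 1:
        raise ValueError("minority_share must lie strictly between 0 and 1")
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Solve n_min / (n_min + n_keep) = minority_share for n_keep.
    n_keep = round(len(minority) * (1 - minority_share) / minority_share)
    # Never ask for more majority rows than exist.
    n_keep = min(n_keep, len(majority))
    rng = random.Random(seed)
    kept_majority = rng.sample(majority, n_keep)
    return sorted(minority + kept_majority)


# Hypothetical toy data: 1000 loan-month observations, 20 prepayments.
labels = [1] * 20 + [0] * 980

# The five minority shares named in the abstract.
for share in (0.10, 0.20, 0.30, 0.40, 0.50):
    idx = undersample(labels, share)
    n_min = sum(labels[i] for i in idx)
    print(f"target {share:.0%}: {len(idx)} rows, "
          f"minority share {n_min / len(idx):.1%}")
```

Because every minority row is kept, a higher target share means a smaller training set, which is why the abstract speaks of data sets of different sizes.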
Item Type: Essay (Master)
Faculty: BMS: Behavioural, Management and Social Sciences
Subject: 85 business administration, organizational science
Programme: Industrial Engineering and Management MSc (60029)