University of Twente Student Theses
Investigating the Impact of Synthetic Data Balancing Techniques on Fairness in Credit Risk Machine Learning Models
Tunc, Johan (2025) Investigating the Impact of Synthetic Data Balancing Techniques on Fairness in Credit Risk Machine Learning Models.
Full text not available from this repository.
Full Text Status: | Access to this publication is restricted |
Abstract: | In credit risk modeling, machine learning (ML) algorithms often face the challenge of class imbalance, where default cases are significantly underrepresented compared to non-default cases. A default occurs when a borrower fails to repay a loan. To address this issue, synthetic data balancing techniques such as the Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN) are commonly applied before ML models are trained for credit risk assessment. However, the impact of these methods on both predictive performance and fairness is underexplored. This thesis investigates how synthetic data balancing techniques influence model accuracy and fairness in credit risk datasets. Using open-source data, classifiers such as logistic regression (LR) and XGBoost are evaluated with standard metrics including area under the ROC curve (AUC-ROC), precision, recall, and F1 score. Fairness is assessed using Equalized Odds, Demographic Parity, and Disparate Impact Ratio (DIR). Results show that while both balancing techniques modestly improve the predictive performance of logistic regression, their effect on XGBoost is minimal. Importantly, both methods reduce fairness disparities between genders, supporting more equitable model outcomes and aligning with regulatory requirements such as the EU AI Act and GDPR. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science, 83 economics |
Programme: | Business & IT BSc (56066) |
Link to this item: | https://purl.utwente.nl/essays/107511 |
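The abstract names three fairness metrics (Demographic Parity, Disparate Impact Ratio, Equalized Odds) without defining them, and the full text is not available from this repository. The sketch below shows one common way to compute them for a binary classifier and a binary sensitive attribute. It is an illustration under stated assumptions, not the thesis's own implementation: the function name `fairness_report`, the choice of label `0` (non-default) as the favorable outcome, the `"m"`/`"f"` group coding, and the toy data are all hypothetical.

```python
def _rate(flags):
    """Fraction of True values in an iterable of booleans."""
    flags = list(flags)
    if not flags:
        raise ValueError("empty group or label slice")
    return sum(flags) / len(flags)


def fairness_report(y_true, y_pred, group, privileged, favorable=0):
    """Group fairness metrics for binary predictions (hypothetical helper).

    Assumption: `favorable` is the label a person would want to receive,
    here 0 = predicted non-default.
    """
    rows = list(zip(y_true, y_pred, group))

    def favorable_rate(in_privileged, true_is_favorable=None):
        # P(prediction is favorable | group [, true label]) for one group.
        return _rate(
            p == favorable
            for t, p, g in rows
            if (g == privileged) == in_privileged
            and (true_is_favorable is None
                 or (t == favorable) == true_is_favorable)
        )

    sr_priv = favorable_rate(True)      # selection rate, privileged group
    sr_unpriv = favorable_rate(False)   # selection rate, other group

    # Equalized odds compares the rates separately for each true label.
    tpr_gap = abs(favorable_rate(True, True) - favorable_rate(False, True))
    fpr_gap = abs(favorable_rate(True, False) - favorable_rate(False, False))

    return {
        "demographic_parity_diff": abs(sr_priv - sr_unpriv),
        # Undefined (ZeroDivisionError) if the privileged rate is 0.
        "disparate_impact_ratio": sr_unpriv / sr_priv,
        "equalized_odds_diff": max(tpr_gap, fpr_gap),
    }


# Toy data (made up for illustration): 1 = default, 0 = non-default.
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1, 1]
group = ["m", "m", "m", "m", "f", "f", "f", "f"]

report = fairness_report(y_true, y_pred, group, privileged="m")
print(report)
```

On this toy data the favorable-prediction rate is 0.75 for the `"m"` group and 0.25 for the `"f"` group, so the demographic parity difference is 0.5 and the disparate impact ratio is about 0.33. A perfectly balanced classifier would give a difference of 0 and a ratio of 1. Comparing such a report before and after resampling the training data is one way to examine the question the abstract poses.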