University of Twente Student Theses


Analysis of random forest algorithms

Smulders, Janiek (2021) Analysis of random forest algorithms.

Abstract: Random forests are implemented in many different R packages. This study analyses the performance of several of these implementations and provides guidelines on which R package to use. The packages studied are extraTrees, party, randomForestSRC, ranger, RLT, RRF and KnowGRRF. Only regression problems are considered. Each package is compared to randomForest with respect to mean squared error (MSE), runtime and variable importance, by testing the packages on different types of data. Based on the computations in this research, RLT is recommended for numerical data when the goal is the lowest MSE. In all other cases ranger is suggested, as it has a significantly lower runtime. Furthermore, the mean decrease in accuracy found in randomForest or the unbiased mean decrease found in party are the recommended methods for obtaining the variable importance.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 31 mathematics
Programme: Applied Mathematics BSc (56965)