University of Twente Student Theses
Feature Importance versus Feature Selection in Predictive Modeling for Formula 1 Race Standings
Cheteles, BSc Octaviana-Alexandra (2024) Feature Importance versus Feature Selection in Predictive Modeling for Formula 1 Race Standings.
Abstract: | In the fast-paced world of Formula 1, teams rely on drivers’ skills, technological advancements, and, most importantly, the strategic use of data to gain a competitive edge. This research paper aims to determine how to accurately extract, from publicly available data, the features that most influence race outcomes. The analysis uses two methods: feature importance and feature selection. The first approach divides the features into weather, car, and driver categories, develops a predictive model for each category, and assesses the importance of the top features within each. A final model is then built from the feature set of each category model that gives the lowest root mean square error (RMSE), and it is compared with the second approach, which applies feature selection from the start to a new model. Additional features are derived from the existing ones and used in both approaches. Prediction accuracy is measured by RMSE, with the lowest possible value targeted, and the features are evaluated with feature importance scores. The following features were found to be the most significant for the outcome in all models: the grid position, the average braking point, and the variance of the braking points. The importance-based model achieved the lowest RMSE: 0.005 with Random Forest Regression and 0.006 with Gradient Boosting Regression. The model that used feature selection had a deviation of 0.93 with Random Forest Regression and 2.23 with Gradient Boosting Regression. The RMSE values decreased for all models when the new features were added. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Computer Science BSc (56964) |
Link to this item: | https://purl.utwente.nl/essays/101020 |
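The abstract evaluates models by RMSE and ranks features by importance scores before keeping the strongest ones. The sketch below illustrates that general idea only; it is not the thesis's implementation. It assumes permutation importance (the increase in RMSE when one feature column is shuffled) as a stand-in for whatever importance scores the Random Forest and Gradient Boosting models produced. The column names, the synthetic rows, and `toy_model` are all hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(predict, rows, y_true, seed=0):
    """Increase in RMSE when one feature column is shuffled.

    A larger increase means the model depends more on that feature.
    This is an assumed stand-in, not the scoring used in the thesis.
    """
    rng = random.Random(seed)
    baseline = rmse(y_true, [predict(r) for r in rows])
    scores = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with only column j replaced by its shuffled value.
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        scores.append(rmse(y_true, [predict(r) for r in shuffled]) - baseline)
    return scores


# Synthetic rows with hypothetical columns:
# (grid_position, mean_braking_point, unused_feature)
rows = [(float(i + 1), 100.0 + (7 * i) % 20, float((3 * i) % 5)) for i in range(20)]


def toy_model(row):
    """Hypothetical fixed predictor; it deliberately ignores the third column."""
    grid, braking, _unused = row
    return grid + 0.05 * (braking - 100.0)


# Targets are generated by the toy model itself, so the baseline RMSE is zero.
targets = [toy_model(r) for r in rows]

scores = permutation_importance(toy_model, rows, targets)

# Simple importance-based selection: keep the two highest-scoring columns.
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
selected = ranked[:2]
print(scores, selected)
```

Because the toy model never reads the third column, shuffling it leaves the predictions unchanged and its score is zero, so it is the column dropped by the selection step. With a trained regressor in place of `toy_model`, the same two functions would apply unchanged.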