University of Twente Student Theses

The impact of the Dutch weather on the health of horses

Padje, J. van 't (2020) The impact of the Dutch weather on the health of horses.

This is the latest version of this item.

Abstract: Gut feeling and farm wisdom often attribute diseases in horses to specific weather conditions, which might lead to false assumptions. The goal of this research is to test whether these assumptions are valid by answering two questions: "What is the influence of the Dutch weather on the health of horses?" and "To what extent can the Dutch weather be used to predict the occurrence of colic, laminitis, respiratory disease and skin disease?" To answer these questions, the data of animal clinic Den Ham is used. This data required pre-processing: duplicate horses are merged, measured horse temperatures are extracted and the data is grouped into consults. These consults are labelled with one or more of the previously mentioned diseases using the text description of the consult and the administered medication. The labelling is performed with a bag-of-words approach using Stochastic Gradient Descent, testing different classifiers, loss functions and other parameters. This data is merged with the weather data for Heino from the weather station of the KNMI (the Royal Netherlands Meteorological Institute), which needed imputation of some values and variables. The missing values are imputed using a k-nearest-neighbours approach. The missing variables are taken from the weather station in Hoogeveen, which most likely differs least from Heino for these variables. Visualizations are made to find obvious correlations between the diseases and changes in the weather and to see the occurrence of the diseases over a year. To find correlations between the weather and the diseases, the weather variables are split into two groups: the weather variables on the days where the disease occurs and the weather variables on the remaining days. Permutation tests are performed for significance testing between the two groups of weather variables.
When a significant difference is found between the weather conditions of those two groups, the weather variable is considered to be correlated with the disease. Predictions are made using ensemble methods, which are compared to four single classifiers: Logistic Regression, Support Vector Machine, Decision Tree and Neural Network. The ensemble methods Voting, Bagging and Boosting are tested. Voting combines Logistic Regression, Support Vector Machine, Decision Tree and Neural Network. Bagging is performed once for each of these four classifiers and Boosting is performed using Decision Trees only. These methods produce the following results. The measured temperature of the horses can be obtained from the data with an accuracy of 0.99805. With ten nearest neighbours, an $R^2$ score of 0.99556 is achieved for the imputation of the missing weather values. The surrounding weather stations did not give very different results for the missing variables; therefore the weather station of Hoogeveen is used as a donor for the missing variables, since this weather station is closest to Heino and Den Ham. The visualizations of the changes in the weather do not show any obvious correlations. The correlation analysis does not show clear links between specific weather variables and any one of the diseases: roughly the same variables are correlated with each of the diseases. Laminitis has turned out to be the hardest to predict, with an accuracy of 65%, obtained using a single Support Vector Machine or a single Neural Network. Colic and skin disease are both predicted best using the Bagging algorithm with Decision Trees, with 70% and 74% accuracy respectively. The best result has been achieved for respiratory disease, with an accuracy of 79.8%, obtained with the Voting algorithm, with Bagging of Support Vector Machines and with a single Support Vector Machine.
Better results can be expected when better-structured veterinary data is used, since the labelling of the consults has proven to be challenging. From this data, we cannot conclude that the Dutch weather influences the health of horses, nor is the weather a good predictor for these diseases.
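The permutation test mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which test statistic or how many permutations were used, so the absolute difference in group means, the function name `permutation_test`, its parameters and the toy temperature values below are all assumptions made for illustration, not the thesis's implementation or data.

```python
import random
from statistics import mean


def permutation_test(disease_days, other_days, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    disease_days: values of one weather variable on days the disease occurred.
    other_days:   values of the same variable on the remaining days.
    Returns an estimated p-value. The choice of statistic is an assumption.
    """
    rng = random.Random(seed)
    observed = abs(mean(disease_days) - mean(other_days))
    pooled = list(disease_days) + list(other_days)
    n = len(disease_days)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n]) - mean(pooled[n:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical daily mean temperatures (made-up values, not thesis data).
temps_on_disease_days = [20, 21, 22, 23, 24, 25, 26, 27]
temps_on_other_days = [5, 6, 7, 8, 9, 10, 11, 12]
p_value = permutation_test(temps_on_disease_days, temps_on_other_days)
```

Under this sketch, a weather variable would be flagged as correlated with a disease when `p_value` falls below a chosen significance level; the level used in the thesis is not given in the abstract.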
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)