University of Twente Student Theses


Identification of a drinking water softening model using machine learning

Jenden, J.N. (2020) Identification of a drinking water softening model using machine learning.

Abstract: This report identifies Machine Learning (ML) models of the water softening treatment process in a Water Treatment Plant (WTP), using two different ML algorithms applied to time series data: eXtreme Gradient Boost (XGBoost) and Recurrent Neural Networks (RNNs). In addition, a control method for the draining of pellets in the softening reactor is explored, based on collected softening treatment data and the resulting ML models. In particular, the pH is identified as a potential variable for the control of pellet draining within a softening reactor. The pH forecasts produced by the ML models are able to predict the future behaviour of the pH and potentially anticipate when the pellets should be drained. To implement the ML algorithms, the inputs and outputs of the ML models are first identified. The pH within the softening reactor is selected as the output because of its potential use for control. Subsequently, water softening treatment data is collected from a water company in the Netherlands. After collection, the data is pre-processed and analysed in order to better interpret the ML results and to improve the performance of the trained ML models. During pre-processing, two ML data splitting methods are implemented: walk-forward and train-validation-test. The performance of the models is gauged using two evaluation metrics: Mean Squared Error (MSE) and R-squared. Lastly, predictions are carried out using the trained ML models for a set of forecast horizon lengths. Comparing the XGBoost and RNN pH predictions, the RNN in general performs better than the XGBoost method: the RNN model with a train-validation-test split has an MSE value of 0.0004 (4 d.p.) and an R-squared value of 0.9007 (4 d.p.). Extending the forecast horizon to four hours for the RNN walk-forward model yielded MSE values below 0.01, but only negative R-squared values. This suggests that the predictions are relatively close to the actual data points but do not follow their shape well. The evaluation metric results suggest that it is possible to create a well-performing model using the RNN method for a forecast horizon length of one minute. However, this model depends heavily on the current pH value and is therefore deemed not to be a good predictor of the pH. Increasing the horizon length leads to only slightly lower MSE values, but the R-squared values are in general negative, indicating a poor fit.

Keywords: Machine Learning (ML), water softening treatment, Water Treatment Plant (WTP), time series, eXtreme Gradient Boost (XGBoost), Recurrent Neural Network (RNN), pH, control, pellet draining, softening reactor, forecast, inputs, outputs, pre-process, data splitting method, walk-forward, train-validation-test, evaluation metric, Mean Squared Error (MSE), R-squared, prediction, forecast horizon
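
The abstract reports forecasts with a small MSE but a negative R-squared. The following is a minimal sketch of how the two metrics can disagree, using their standard definitions. The pH readings and the flat forecast are hypothetical values chosen for illustration and are not data from the thesis; the function names are likewise assumptions.

```python
def mse(actual, predicted):
    """Mean Squared Error: the average of the squared residuals."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical pH readings that vary only slightly around 8.50
# (illustrative values, NOT measurements from the thesis).
actual = [8.50, 8.52, 8.48, 8.51, 8.49]

# Hypothetical flat forecast with a small constant offset: it stays close
# to every reading but does not follow the ups and downs of the signal.
predicted = [8.45, 8.45, 8.45, 8.45, 8.45]

print("MSE:", mse(actual, predicted))
print("R-squared:", r_squared(actual, predicted))
```

Because the pH varies over a narrow range, the total sum of squares is very small, so even small residuals exceed it and push R-squared below zero while the MSE stays under 0.01.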
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Programme: Systems and Control MSc (60359)