Forecasting Water Levels in Twente Canal Using Time Series Analysis
Yurttaş, Görkem (2024)
Introduction:
Water levels in the Twente canal reached historic lows in 2018, which hindered the supply
chains and logistics operations of the businesses that use the canal. As a result, it has
become apparent that predicting such unprecedented changes in water levels is crucial to
preventing these negative effects from recurring.
This research aims to provide an answer to how forecast models for the Twente canal could
be made using time series analysis. The research is part of a bigger project in which a digital
twin of the Twente canal is being made. The forecasts provided by this study serve the purpose
of monitoring the water levels in this project and also provide a basis for any further models to
be developed for the project.
Theoretical Framework:
To identify the relevant modelling techniques and to ground the research in theory, a
theoretical framework was built from the existing literature. Several models, namely ARMA,
ARIMA, SARMA, SARIMA, PARMA, ARIMAX, SARIMAX and MLR, were identified as a result.
Additionally, several important properties such as seasonality and stationarity were
described.
Data Understanding and Transformation:
The data used in this report were acquired from Rijkswaterstaat and KNMI, both of which are
public data sources. In general, the data were of high quality and required little cleaning in
terms of outliers and measurement types. However, the Rijkswaterstaat data were measured at
10-minute intervals and had to be aggregated into daily averages. After the data were cleaned
and aggregated, the properties of the dataset were examined. The data exhibited low variance
and standard deviation, which could be a result of averaging the measurements per day.
Additionally, the data were found to be normally distributed, non-stationary and non-seasonal.
Finally, several exogenous variables, for which data were gathered from KNMI, were
investigated for use in modelling. Unfortunately, none of these were deemed suitable, for
various reasons.
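The aggregation from 10-minute readings to daily averages could look roughly like the sketch below. It uses only the Python standard library; the function name `daily_mean` and the sample readings are hypothetical and are not taken from the Rijkswaterstaat data or from the thesis itself.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def daily_mean(readings):
    """Average (timestamp, level) readings per calendar day."""
    by_day = defaultdict(list)
    for timestamp, level in readings:
        by_day[timestamp.date()].append(level)
    # Return the days in chronological order.
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Hypothetical input: two days of 10-minute readings (144 per day),
# with made-up level values in arbitrary units.
start = datetime(2018, 7, 1)
readings = [
    (start + timedelta(minutes=10 * i), 1000.0 if i < 144 else 990.0)
    for i in range(288)
]

daily = daily_mean(readings)
for day, level in daily.items():
    print(day, level)
```

Averaging 144 readings into one value per day smooths out short-lived fluctuations, which is consistent with the low variance observed in the daily series.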
Modelling:
After the data were explored and transformed, they were ready to be modelled. Several
candidate models were identified based on the properties highlighted above. ARIMA models in
particular were found suitable for modelling the time series. Additionally, although not
expected to add value, an ARIMAX model using temperature as an exogenous variable was built
for research purposes. Parameters for the models were estimated using the methods proposed in
the theoretical framework. After the initial estimations, the models were fitted and compared
on their information criteria. Based on this comparison, the models with the lowest criterion
values were chosen for forecasting.
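As an illustration of comparing candidate models on an information criterion, the sketch below differences a series once and scores AR(p) fits by AIC. It is a simplified, dependency-free stand-in, not the procedure used in the study: the models are fitted by ordinary least squares rather than by a full ARIMA estimator, the series is synthetic, and the helper names `solve` and `ar_aic` are hypothetical.

```python
import math
import random
from itertools import accumulate


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ar_aic(series, p, max_p):
    """Fit AR(p) with an intercept by least squares and return its Gaussian AIC.

    The first max_p points are skipped so every order is scored on the same sample.
    """
    rows = [[1.0] + [series[t - k] for k in range(1, p + 1)]
            for t in range(max_p, len(series))]
    targets = series[max_p:]
    k = p + 1  # intercept plus p lag coefficients
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, targets))
    n = len(targets)
    # AIC up to an additive constant; the extra +1 counts the noise variance.
    return n * math.log(rss / n) + 2 * (k + 1)


# Synthetic, non-stationary "level" series: cumulative sum of AR(1) increments.
random.seed(0)
increments, prev = [], 0.0
for _ in range(500):
    prev = 0.6 * prev + random.gauss(0.0, 1.0)
    increments.append(prev)
levels = list(accumulate(increments))

# First difference (d = 1) to remove the stochastic trend, then compare orders.
diffed = [b - a for a, b in zip(levels, levels[1:])]
max_p = 3
aics = {p: ar_aic(diffed, p, max_p) for p in range(max_p + 1)}
best_p = min(aics, key=aics.get)
print(aics, "lowest AIC at p =", best_p)
```

Scoring every candidate on the same observations matters here, because AIC values are only comparable when they are computed from the same sample.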
Results and Conclusion:
In general, the modelling phase produced mixed results. In particular, the long-term forecasts
failed because the predicted values converged to the sample mean of the training dataset.
Possible reasons for this behaviour are unincorporated seasonality and low variance in the
dataset. On the other hand, the short-term predictions were highly accurate and reproduced the
patterns that the actual values follow. Unfortunately, it is debatable how these short-term
forecasts could be utilised.
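The convergence of long-term forecasts to the mean can be illustrated with the simplest stationary case, an AR(1) model, whose h-step point forecast is mean + phi^h * (last value - mean). The sketch below assumes illustrative parameter values and a hypothetical function name `ar1_forecast`; it is not one of the fitted models from the study.

```python
def ar1_forecast(last_value, mean, phi, horizon):
    """Iterate the AR(1) point forecast: y_hat[h] = mean + phi * (y_hat[h-1] - mean)."""
    forecasts = []
    current = last_value
    for _ in range(horizon):
        current = mean + phi * (current - mean)
        forecasts.append(current)
    return forecasts


# Illustrative values only: the last observation sits 1.0 below the series mean.
path = ar1_forecast(last_value=9.0, mean=10.0, phi=0.9, horizon=60)

for h in (1, 7, 30, 60):
    print(f"h={h:2d}  forecast={path[h - 1]:.4f}")
```

With |phi| < 1 the gap to the mean shrinks by a factor of phi at every step, so the first few forecasts still carry information from the latest observation while distant ones are essentially the mean.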