University of Twente Student Theses

Evaluating the explainability and performance of an elementary versus a statistical impact-based forecasting model: A case study of tropical cyclone early action in the Philippines

Sedhain, Sahara (2022) Evaluating the explainability and performance of an elementary versus a statistical impact-based forecasting model: A case study of tropical cyclone early action in the Philippines.

Abstract: Climate-vulnerable countries can expect an increase in the number of disasters, so investment in preparedness needs to be scaled up. Recently, there has been a remarkable shift in the focus of disaster risk practitioners, from traditional response mechanisms to proactive approaches of acting early based on impact-based forecasts (IbF). These activities can only be implemented effectively when the right information reaches the right people at the right time. To that end, automatic trigger mechanisms are being developed, in which pre-designed models assess the impact and inform decisions with minimal human judgement. As modelling algorithms grow more complex, the results of such models become harder to interpret, especially for users outside the domain. Benchmarking different approaches to IbF, together with an interpretable evaluation mechanism, is therefore a top priority for humanitarian decision-makers, yet remains relatively unexplored. This study evaluated two different models: (1) an existing statistical trigger model, operationalized for informing decisions on typhoon early actions in the Philippines, which uses a machine learning algorithm with several predictor variables, and (2) an elementary trigger model used for informing cyclone early action in Bangladesh, which combines damage curves and a composite index overlay. For an objective comparison, the elementary model was adapted to the Philippines, placing both the statistical and the elementary model in the same spatial context. The models were evaluated based on (1) their performance in damage prediction and their sensitivity to different risk indicators, in hindsight, for Typhoon Kammuri (2019) in the Philippines, and (2) their interpretability/explainability based on their architecture and parameters. To support this further, an interactive decision support tool was built for post-hoc evaluation.
Our findings suggest that, in retrospect, both models would have triggered with a minimum lead time of 72 hours, which is considered adequate for carrying out the pre-defined early actions. However, the performance of both models at the trigger time is not satisfactory, with F1 scores of 0.05 and 0.26 for the statistical and elementary models, respectively. Performance did not improve over lead time, which can be attributed to the characteristics of this typhoon, which deviated considerably from its forecasted track. In relative terms, however, the elementary model performed better and would have been able to maximize the impact reduced through early action, suggesting that, for this particular case, the more complex model was not necessarily the better choice. At the same time, the overall results show that both models perform inconsistently across lead times, and the elementary model does not show improvement in performance, even with observed typhoon data. Of the two models, the elementary model was able to correctly predict higher damage percentages, while the statistical model was more conservative in its predictions. The statistical model better captures the characteristics of damage associated with the typhoon track, which is not considered in the elementary model. In conclusion, the results indicate that a broader statistical analysis of events with different characteristics is needed to examine the overall suitability of these models for the implementation goal. A common evaluation framework needs to be built, not only to benchmark IbF models against each other, but also to communicate the uncertainties and considerations to relevant stakeholders. The interactive dashboard built in this research has the potential to be further expanded to fit that purpose.
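For readers unfamiliar with the F1 score reported above, the following is a minimal sketch of how such a score can be computed from binary predictions. It assumes that each area (for example, a municipality) is labelled as either exceeding a damage threshold or not, both in the model output and in the observed data; the abstract does not state how the labels were derived, so the function name `f1_score` and the example values are purely illustrative and are not taken from the thesis.

```python
from typing import Sequence


def f1_score(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """Return the F1 score (harmonic mean of precision and recall)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)

    # With no true positives, precision and recall are both zero (or undefined),
    # so the score is conventionally reported as 0.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five areas (not data from the thesis):
# True means "damage above the chosen threshold".
example_predicted = [True, True, False, False, True]
example_observed = [True, False, True, False, False]
example_f1 = f1_score(example_predicted, example_observed)
print(f"F1 = {example_f1:.2f}")
```

Because the F1 score ignores true negatives, a model can score low when damaging events are rare and the forecast track shifts, even if most areas are correctly classified as unaffected.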
Item Type: Essay (Master)
Clients: 510 - Netherlands Red Cross, The Hague, Netherlands
Faculty: ITC: Faculty of Geo-information Science and Earth Observation
Subject: 38 earth sciences
Programme: Geoinformation Science and Earth Observation MSc (75014)
Link to this item: https://purl.utwente.nl/essays/94451