University of Twente Student Theses


Anomaly detection for Linux system log

Ma, R. (2020) Anomaly detection for Linux system log.

Abstract: The goal of this study is to validate effective methods for detecting anomalies in Linux Syslog data collected during CI/CD procedures. Automatic detection improves debugging efficiency by reducing the time spent manually searching for errors in large volumes of logs. For this purpose, two types of anomaly detection methods are evaluated: a workflow-based method and a PCA-based method. In the experiments, natural language processing (NLP) methods such as word2vec and TF-IDF are tested for preprocessing and encoding the log message body. Long short-term memory (LSTM) and principal component analysis (PCA) models are implemented as representatives of the workflow-based and PCA-based methods, respectively. Both methods outperform the baseline method, stupid backoff, which is the current solution used by the thesis sponsor company. LSTM and PCA both achieve a relatively balanced trade-off between recall and precision. The F1 score, the harmonic mean of precision and recall, reaches 0.9043 for PCA and 0.9124 for LSTM, compared with 0.6411 for the baseline. The conclusion section discusses the use cases for which each method is suitable. The two methodologies proposed in this thesis contribute towards detecting anomalies in an unsupervised manner when no labels are provided.
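The F1 scores quoted in the abstract combine precision and recall through their harmonic mean. As a minimal sketch of that calculation, the Python snippet below computes precision, recall, and F1 from confusion counts. The function name `f1_from_counts` and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) for anomaly-detection counts.

    tp: anomalous log entries correctly flagged
    fp: normal log entries wrongly flagged
    fn: anomalous log entries that were missed
    """
    # Guard against division by zero when nothing was flagged or nothing was anomalous.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only (not results from the thesis).
precision, recall, f1 = f1_from_counts(tp=90, fp=10, fn=30)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a detector scores a high F1 only when precision and recall are both high.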
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Interaction Technology MSc (60030)