University of Twente Student Theses

Login

Two-Stage Overfitting of Neural Network-Based Video Coding In-Loop Filter

Jánosi, József-Hunor (2023) Two-Stage Overfitting of Neural Network-Based Video Coding In-Loop Filter.

[img] PDF
1MB
Abstract:Modern video coding standards like the Versatile Video Coding (VVC) produce compression artefacts, due to their block-based, lossy compression techniques. These artefacts are mitigated to an extent by the in-loop filters inside the coding process. Neural Network (NN) based in-loop filters are being explored for the denoising tasks, and in recent studies, these NN-based loop filters are overfitted on test content to achieve a content-adaptive nature, and further enhance the visual quality of the video frames, while balancing the trade-off between quality and bitrate. This loop filter is a relatively low-complexity Convolutional Neural Network (CNN) that is pretrained on a general video dataset and then fine-tuned on the video that needs to be encoded. Only a set of parameters inside the CNN architecture, named multipliers, are fine-tuned, thus the bitrate overhead, that is signalled to the decoder, is minimized. The created weight update is compressed using the Neural Network Compression and Representation (NNR) standard. In this project, an exploration of high-performing hyperparameters was conducted, and the two-stage training process was employed to, potentially, further increase the coding efficiency of the in-loop filter. A first-stage model was overfitted on the test video sequence, it explored on which patches of the dataset it could improve the quality of the unfiltered video data, and then the second-stage model was overfitted only on these patches that provided a gain. The model with best-found hyperparameters saved on average 1.01% (Y), 4.28% (Cb), and 3.61% (Cr) Bjontegaard Delta rate (BD-rate) compared to the Versatile Video Coding (VVC) Test Model (VTM) 11.0 NN-based Video Coding (NNVC) 5.0, Random Access (RA) Common Test Conditions (CTC). The second-stage model, although exceeded the VTM, it underperformed with about 0.20% (Y), 0.23% (Cb), and 0.18% (Cr) BD-rate with regards to the first-stage model, due to the high bitrate overhead created by the second-stage model.
Item Type:Essay (Master)
Faculty:EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject:54 computer science
Programme:Computer Science MSc (60300)
Link to this item:https://purl.utwente.nl/essays/97398
Export this item as:BibTeX
EndNote
HTML Citation
Reference Manager

 

Repository Staff Only: item control page