Memory and inference time considerations in a C translated QKeras model
Author(s): Dotsika, Adamantia (2022)
Abstract:
As applications that use neural networks multiply, the need for compact C models grows. The work of [1] presents interesting results on QKeras models and their impact on memory footprint. With that in mind, this paper presents a modified version of the keras2c library adapted to the needs of QKeras models. The modified library is used to study the influence of data representation on memory and inference time in C-translated QKeras models. The results show a 2.5x memory reduction for the fixed-point representation with no penalty in inference time. Although the accuracy of the inference output could not be studied, the results are promising and warrant further investigation.
Document(s):
Dotsika_BA_EEMCS.pdf