University of Twente Student Theses


Investigating vision transformers for human activity recognition from skeletal data

Joseph, A.M. (2023) Investigating vision transformers for human activity recognition from skeletal data.

Abstract: Transformers are increasingly used across a wide range of applications. Recent work shows that vision transformers also perform well on Human Activity Recognition tasks based on skeletal trajectories. However, some aspects remain unexplored, both in the input representation and in the model architecture. We investigate two of them. First, we use skeletal keypoint trajectories as inputs, decomposed both locally and globally. Second, we introduce convolutional learning into transformers through tubelet embeddings, which we hypothesize extract better spatio-temporal information. We evaluate our model on two datasets, NTU RGB+D 120 and HR-Crime. We observe that decomposing the keypoints globally and locally does not improve performance. We also observe that adding a tubelet embedder to a simple transformer architecture gives results similar to the baseline at significantly lower computational cost. We also discuss the limitations of our work and how it could be improved.
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)
Link to this item: https://purl.utwente.nl/essays/94291
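
The abstract mentions tubelet embeddings without describing them. As a rough illustration only, the sketch below shows one common way such an embedding can be formed for skeletal data: the sequence is cut into non-overlapping blocks spanning a few frames and a few joints, and each block is flattened and linearly projected into a token. This is equivalent to a convolution whose kernel size and stride both equal the block size. The function name `tubelet_embed`, the input layout (frames, joints, coordinates), the block sizes, and the embedding width are all assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def tubelet_embed(seq, t_size, j_size, weight, bias):
    """Turn a keypoint sequence into tubelet tokens (illustrative sketch).

    seq    : array of shape (T, J, C) = (frames, joints, coordinates)
    t_size : frames per tubelet (assumed, not from the thesis)
    j_size : joints per tubelet (assumed, not from the thesis)
    weight : array of shape (t_size * j_size * C, D)
    bias   : array of shape (D,)
    Returns an array of shape (num_tokens, D).
    """
    T, J, C = seq.shape
    if T % t_size or J % j_size:
        raise ValueError("T and J must be divisible by the tubelet size")
    nt, nj = T // t_size, J // j_size
    # Split the time and joint axes into (block index, offset within block).
    blocks = seq.reshape(nt, t_size, nj, j_size, C)
    # Group the two block indices first: (nt, nj, t_size, j_size, C).
    blocks = blocks.transpose(0, 2, 1, 3, 4)
    # Flatten every tubelet into a single vector.
    flat = blocks.reshape(nt * nj, t_size * j_size * C)
    # One shared linear projection applied to every tubelet.
    return flat @ weight + bias


# Hypothetical input: 12 frames, 18 joints, 2D (x, y) coordinates.
rng = np.random.default_rng(0)
seq = rng.standard_normal((12, 18, 2))
embed_dim = 32
weight = rng.standard_normal((3 * 6 * 2, embed_dim)) * 0.02
bias = np.zeros(embed_dim)

tokens = tubelet_embed(seq, t_size=3, j_size=6, weight=weight, bias=bias)
print(tokens.shape)  # (12, 32): 4 time blocks x 3 joint blocks
```

In this toy setting the 12 x 18 = 216 per-frame keypoints become 12 tokens. Since self-attention cost grows quadratically with the number of tokens, a shorter token sequence is one plausible source of the lower computational cost the abstract reports, although the abstract itself does not state the cause.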

 
