Pretraining Deep Reinforcement Learning with Imitation Learning for Autonomous UAV Exploration in Unknown Indoor Environments

Author(s): Pietersma, D.N. (2025)

Abstract:

Autonomous Unmanned Aerial Vehicles (UAVs) have great potential for indoor exploration tasks such as search and rescue, surveillance, and inspection, owing to their compact size, operational flexibility, and high maneuverability. Autonomy lets them operate without relying on human expertise.

Traditional exploration methods suffer from limited performance and high computational cost, while Deep Reinforcement Learning (DRL) performs better but is unstable during training and can struggle to converge.

This thesis proposes a hybrid approach that integrates Behavior Cloning (BC), a form of Imitation Learning (IL), with the Soft Actor-Critic (SAC) DRL algorithm for continuous control to improve UAV exploration performance, robustness, and generalization. 
The designed method begins by collecting expert trajectories and using them to pretrain a policy with IL. The pretrained policy is then fine-tuned with DRL in a photorealistic indoor environment in NVIDIA Isaac Sim. The state given to the algorithm includes semantic visual inputs with depth perception to enhance situational awareness. 
Experiments show that the IL-pretrained policy fine-tuned with DRL converges faster and achieves higher coverage and success rates than baselines trained purely with DRL or IL. The results also show that incorporating semantic visual inputs enhances exploration performance by providing structured high-level environmental information.
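
The pretraining step described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a hypothetical linear expert in place of recorded UAV trajectories, a small tanh-linear policy in place of a neural network, and plain gradient descent on a mean-squared-error loss. All function names, dimensions, and hyperparameters are illustrative assumptions. In a full pipeline, the resulting weights would initialize the SAC actor before fine-tuning, which is not shown here.

```python
import numpy as np


def collect_expert_data(n_samples, state_dim, action_dim, rng):
    """Hypothetical stand-in for expert trajectories: a fixed tanh-linear expert."""
    expert_weights = rng.normal(size=(state_dim, action_dim))
    states = rng.normal(size=(n_samples, state_dim))
    actions = np.tanh(states @ expert_weights)  # continuous actions in [-1, 1]
    return states, actions


def bc_pretrain(states, actions, epochs=200, lr=0.1):
    """Behavior cloning: fit a tanh-linear policy to expert actions via MSE."""
    weights = np.zeros((states.shape[1], actions.shape[1]))
    losses = []
    for _ in range(epochs):
        pred = np.tanh(states @ weights)
        error = pred - actions
        losses.append(float(np.mean(error ** 2)))
        # Gradient of the MSE through tanh, using d tanh(z)/dz = 1 - tanh(z)^2
        grad = states.T @ (error * (1.0 - pred ** 2)) * (2.0 / error.size)
        weights -= lr * grad
    return weights, losses


rng = np.random.default_rng(0)
states, actions = collect_expert_data(512, 8, 3, rng)
weights, losses = bc_pretrain(states, actions)
print(f"BC loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The loss falling over the epochs indicates that the policy is moving toward the expert's behavior, which is the starting point the fine-tuning stage then builds on.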

In conclusion, this thesis shows the benefits of combining IL with DRL in complex, unknown indoor environments, laying the foundation for deploying autonomous UAVs in mission-critical scenarios.

Document(s):

MTP_ILRL_indoor_exploration_uav_Desiree_Pietersma.pdf