University of Twente Student Theses


PIRATE Robot Autonomous Navigation through Complex Pipe Networks using Reinforcement Learning

Vacariu, P. (2021) PIRATE Robot Autonomous Navigation through Complex Pipe Networks using Reinforcement Learning.

Abstract: Many industries rely on pipe systems. Regardless of what the pipes transport, a leak caused by pipe failure can be costly, disastrous, or both, so periodic inspection is important. When the pipes are easily accessible, inspection is relatively easy. This is not always the case, for instance when pipes are buried underground; maintenance then becomes expensive in both time and resources. One solution is to inspect the pipes with camera systems. Of the available options, autonomous pipe inspection robots offer the most advantages: they can travel long distances, navigate through bends, and require little human intervention. This project focuses on improving the navigation controller for the Pipe Inspection Robot for AuTonomous Exploration (PIRATE). The robot consists of several modular sections separated by rotational joints and is equipped with wheels that allow it to navigate through pipe systems of various diameters. Autonomous navigation uses a Hierarchical Reinforcement Learning architecture in which multiple agents are organized hierarchically. The top-level agent uses the options framework to pick one of the subpolicies, which then generates the actions sent to the robot. In this project, the hierarchical model is extended. The goal is to introduce a decision mechanism that allows the PIRATE robot to pick one of the bends in a T-junction and navigate through it. To achieve this, the hierarchical structure is complemented with a feudal learning approach: the top-level policy generates not only an option that selects one of the low-level policies, but also a subgoal that is passed to the active subpolicy and acts as a decision element.
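The two-part decision described above (an option selecting a low-level policy, plus a subgoal passed to it) can be illustrated with a minimal sketch. This is not the thesis implementation: the option names, the subgoal values, and the hand-written rules standing in for learned policies are all hypothetical, chosen only to show the data flow.

```python
import random
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels: the abstract does not name the subpolicies
# or specify how subgoals are encoded.
OPTIONS = ("drive_straight", "take_bend")
SUBGOALS = ("left", "right")


@dataclass(frozen=True)
class Decision:
    option: str             # which low-level policy becomes active
    subgoal: Optional[str]  # which branch of the T-junction to take, if any


def top_level_policy(at_t_junction: bool, rng: random.Random) -> Decision:
    """Stand-in for the learned top-level agent: emits an option and a subgoal."""
    if at_t_junction:
        return Decision("take_bend", rng.choice(SUBGOALS))
    return Decision("drive_straight", None)


def sub_policy(decision: Decision) -> str:
    """Stand-in for a low-level policy whose action depends on the subgoal."""
    if decision.option == "take_bend":
        return "turn_" + str(decision.subgoal)
    return "forward"
```

In a real controller both functions would be trained policies acting on sensor observations; the point of the sketch is only that the subgoal travels with the option and changes what the active subpolicy does.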
Being able to navigate through T-junctions makes certain inspection missions possible. If the pipe network is already known, the robot can follow a predetermined path to reach a given position. If nothing is known in advance about the structure of the pipe system, the robot can explore it and gather data that can be used to build a virtual map. The project also studies hyperparameter tuning: with a Population Based Training approach, the hyperparameters of each policy can be tuned, increasing the performance and robustness of each agent.
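For readers unfamiliar with Population Based Training, the sketch below shows one generic exploit/explore round. It assumes a population stored as plain dictionaries, a top/bottom fraction of 25%, and multiplicative perturbation factors of 0.8 and 1.2; these are illustrative choices, not values taken from the thesis.

```python
import copy
import random


def pbt_step(population, rng, perturb=(0.8, 1.2), fraction=0.25):
    """Run one exploit/explore round in place and return the population.

    Each member is a dict: {"score": float, "hparams": dict, "weights": object}.
    Assumes fraction <= 0.5 so the top and bottom groups do not overlap
    for populations of two or more members.
    """
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for member in bottom:
        if any(member is t for t in top):
            continue  # population too small to separate the groups
        parent = rng.choice(top)
        # Exploit: copy the weights of a better-performing member.
        member["weights"] = copy.deepcopy(parent["weights"])
        # Explore: perturb each copied hyperparameter by a random factor.
        member["hparams"] = {
            name: value * rng.choice(perturb)
            for name, value in parent["hparams"].items()
        }
    return population
```

Between rounds, each member would continue training with its own hyperparameters and have its score re-evaluated; that training loop is omitted here.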
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 50 technical science in general, 52 mechanical engineering, 54 computer science
Programme: Electrical Engineering MSc (60353)