An end-to-end approach to a Reinforcement Learning application in the transport logistics

Ramón Gómez, Nerea

This research describes an end-to-end application of Reinforcement Learning (RL) to a multimodal routing problem. The application starts with data collection through Blockchain, in order to guarantee the security, transparency and scalability of the databases. With robust data in place, we build a fictional simulation of a routing scenario with three modes of transport. The RL agent has to learn how to design routes using a model-free RL method. Four different agents are trained, each following a different strategy to design the route. After training, the results are evaluated by comparing them against the decisions made by Dijkstra's shortest-path algorithm. This comparison is granular enough to judge the accuracy of the RL algorithm. The application demonstrates the feasibility of applying RL to a routing problem with the high level of variability found in real life.
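
The evaluation idea summarised above, a model-free agent checked against a Dijkstra baseline, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four-node network, its costs and mode labels are hypothetical, and tabular Q-learning stands in as one common model-free method, since the abstract does not name the algorithm or the four agent strategies actually used.

```python
import heapq
import random

# Hypothetical toy network (not from the thesis): node -> [(next_node, mode, cost)].
GRAPH = {
    "A": [("B", "truck", 4.0), ("C", "rail", 2.0)],
    "B": [("D", "truck", 5.0)],
    "C": [("B", "truck", 1.0), ("D", "ship", 8.0)],
    "D": [],
}


def dijkstra(graph, source, target):
    """Baseline: return (cost, path) of the cheapest route from source to target."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, _mode, edge_cost in graph[node]:
            if nxt not in settled:
                heapq.heappush(queue, (cost + edge_cost, nxt, path + [nxt]))
    return float("inf"), []


def q_learning(graph, source, target, episodes=2000, alpha=0.5,
               gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning; the reward is the negative cost of the chosen leg."""
    rng = random.Random(seed)
    # One Q-value per outgoing edge of each node.
    q = {node: [0.0] * len(edges) for node, edges in graph.items()}
    for _ in range(episodes):
        node = source
        while node != target:
            edges = graph[node]
            if rng.random() < epsilon:  # explore a random leg
                action = rng.randrange(len(edges))
            else:  # exploit the current estimate
                action = max(range(len(edges)), key=lambda a: q[node][a])
            nxt, _mode, edge_cost = edges[action]
            future = max(q[nxt]) if q[nxt] else 0.0  # terminal node has no legs
            q[node][action] += alpha * (-edge_cost + gamma * future - q[node][action])
            node = nxt
    return q


def greedy_route(graph, q, source, target):
    """Follow the highest-valued leg at each node of the learned Q-table."""
    node, path, cost = source, [source], 0.0
    while node != target:
        action = max(range(len(graph[node])), key=lambda a: q[node][a])
        node, _mode, edge_cost = graph[node][action]
        path.append(node)
        cost += edge_cost
    return cost, path


if __name__ == "__main__":
    baseline = dijkstra(GRAPH, "A", "D")
    learned = greedy_route(GRAPH, q_learning(GRAPH, "A", "D"), "A", "D")
    print("Dijkstra:", baseline)
    print("Q-learning:", learned)
```

On a small deterministic network like this one, the learned route can be compared leg by leg with the Dijkstra route. The toy graph is acyclic and every node leads to the destination, which keeps each training episode finite; a realistic simulation with variable costs would need a richer state and reward design.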

Ramón Gómez, Nerea (2022)