During the outgoing phase (periodic report: FIRST), a deep reinforcement learning (RL) environment was designed to train an agent to localise an underwater target using range-only methods. This environment has been made publicly available in a GitHub repository:
https://github.com/imasmitja/RLforUTracking. In addition, several deep RL algorithms were implemented and their performance was tested in the designed environment. Specifically, the following algorithms were implemented:
•DDPG: Deep Deterministic Policy Gradient
•TD3: Twin Delayed DDPG
•SAC: Soft Actor-Critic
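To illustrate what a range-only localisation environment involves, the following is a minimal sketch and not the code from the RLforUTracking repository. It assumes a static target on a 2D plane, an agent that measures its range to the target after every move, a linearised least-squares position estimate, and a reward equal to the negative estimation error. The class name `RangeOnlyTrackingEnv`, the helper `estimate_target`, the fixed penalty, and all parameter values are hypothetical.

```python
import math
import random


def estimate_target(measurements):
    """Linearised least-squares fix from (x, y, range) tuples.

    Returns None when there are too few measurements or the geometry
    is degenerate (for example, all agent positions on one line).
    """
    if len(measurements) < 3:
        return None
    x0, y0, r0 = measurements[0]
    # Accumulate the 2x2 normal equations obtained by subtracting the
    # first range equation from each of the others.
    saa = sab = sbb = sac = sbc = 0.0
    for x, y, r in measurements[1:]:
        a = 2.0 * (x - x0)
        b = 2.0 * (y - y0)
        c = (x * x + y * y - x0 * x0 - y0 * y0) - (r * r - r0 * r0)
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    det = saa * sbb - sab * sab
    if abs(det) < 1e-9:
        return None
    return ((sbb * sac - sab * sbc) / det, (saa * sbc - sab * sac) / det)


class RangeOnlyTrackingEnv:
    """Toy environment (hypothetical): one agent, one static target."""

    NO_FIX_PENALTY = -100.0  # assumed reward while no estimate exists

    def __init__(self, target=(50.0, -30.0), noise_std=0.0, seed=0):
        self.target = target
        self.noise_std = noise_std
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.pos = (0.0, 0.0)
        self.measurements = []
        return self._observe()

    def _observe(self):
        # Range-only measurement: true distance plus Gaussian noise.
        measured = math.dist(self.pos, self.target)
        measured += self.rng.gauss(0.0, self.noise_std)
        obs = (self.pos[0], self.pos[1], measured)
        self.measurements.append(obs)
        return obs

    def step(self, action):
        """Move by (dx, dy), take a new range, and score the estimate."""
        self.pos = (self.pos[0] + action[0], self.pos[1] + action[1])
        obs = self._observe()
        estimate = estimate_target(self.measurements)
        if estimate is None:
            reward = self.NO_FIX_PENALTY
        else:
            reward = -math.dist(estimate, self.target)
        return obs, reward, estimate


env = RangeOnlyTrackingEnv()
for move in [(10.0, 0.0), (0.0, 10.0), (-10.0, 0.0)]:
    observation, reward, estimate = env.step(move)
```

A policy trained with DDPG, TD3, or SAC would replace the hard-coded moves with actions chosen from the observations. With noisy ranges, the trajectory geometry determines how quickly the estimate converges, which is what the agent is rewarded for learning.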
Several field tests were also conducted during the reporting period:
•Tracking a static and a moving target in Monterey Bay (California, USA): In this test, we used a Wave Glider to localise an underwater docking station and a long-range AUV (LRAUV).
•Tracking a static target in the harbor of Sant Feliu (Catalonia, Spain): In this test, a Sparus II AUV localised a standalone acoustic modem deployed in the middle of the harbor. This test was especially important because it demonstrated that the algorithms (and the strategy) are platform-independent (i.e. they can be deployed on a variety of vehicles, from a glider to a conventional AUV).
The results have been presented at numerous meetings and conferences, through both oral and poster presentations. Examples include the IEEE 18th International Conference on Automation Science and Engineering (CASE2022), held in Mexico City (Mexico) and virtually from August 20 to 24, 2022; the Deep Learning Barcelona Symposium (DLBCN2021), in December 2022; and the Ocean Sciences Meeting (OSM2020), held virtually in February 2022, where the AIforUTracking objectives and first achievements were shown. I also attended the Machine Learning on Monterey Bay (MLonMB2021) workshop, held at the University of California Santa Cruz (California, USA) on November 10, 2021.
During the incoming phase, both the environments and the algorithms were modified in order to apply multi-agent reinforcement learning (MARL). With these modifications, several MARL algorithms were implemented, trained, and tested to cooperatively localise and track an underwater target using acoustic localisation techniques. In addition to the standard state-of-the-art algorithms, a novel MARL algorithm called TransfQMix was developed during this phase. This new architecture uses transformers to extend the QMix algorithm. It outperforms state-of-the-art algorithms in several scenarios, such as the Spread task of the Multi-Agent Particle Environment (from OpenAI) and StarCraft II. Both environments are well-known benchmarks within the community and therefore highlight the performance obtained with TransfQMix.
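For readers unfamiliar with QMix, the sketch below illustrates the monotonic mixing idea on which it is built: per-agent Q-values are combined into a joint value using weights that are generated from the global state and forced to be non-negative, so that raising any agent's Q-value can never lower the joint value. This is a simplified illustration under stated assumptions, not the TransfQMix architecture (no transformers are involved here). The class name `MonotonicMixer`, the single-layer linear hypernetworks, and the random untrained parameters are all hypothetical.

```python
import math
import random


def elu(x):
    """Exponential linear unit, a monotonically increasing activation."""
    return x if x > 0 else math.exp(x) - 1.0


def linear(weights, state):
    """Bias-free linear map: one output per row of `weights`."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


class MonotonicMixer:
    """QMix-style mixer with hypothetical single-layer hypernetworks."""

    def __init__(self, n_agents, state_dim, hidden=4, seed=0):
        rng = random.Random(seed)

        def rand(rows):
            return [[rng.uniform(-1.0, 1.0) for _ in range(state_dim)]
                    for _ in range(rows)]

        self.n_agents = n_agents
        self.hidden = hidden
        # Each hypernetwork maps the global state to mixing parameters.
        self.hyper_w1 = rand(n_agents * hidden)
        self.hyper_b1 = rand(hidden)
        self.hyper_w2 = rand(hidden)
        self.hyper_b2 = rand(1)

    def mix(self, agent_qs, state):
        n, h = self.n_agents, self.hidden
        # The absolute value enforces non-negative mixing weights, which
        # is what makes the joint value monotonic in every agent's Q.
        w1 = [abs(v) for v in linear(self.hyper_w1, state)]
        w2 = [abs(v) for v in linear(self.hyper_w2, state)]
        # Biases are unconstrained: they do not affect monotonicity.
        b1 = linear(self.hyper_b1, state)
        b2 = linear(self.hyper_b2, state)[0]
        hidden = [
            elu(sum(agent_qs[a] * w1[a * h + j] for a in range(n)) + b1[j])
            for j in range(h)
        ]
        return sum(hj * wj for hj, wj in zip(hidden, w2)) + b2


mixer = MonotonicMixer(n_agents=3, state_dim=5)
global_state = [0.2, -0.4, 1.0, 0.0, 0.7]
q_total = mixer.mix([1.0, 2.0, 0.5], global_state)
```

Because of this monotonicity, each agent can select its action greedily from its own Q-values at execution time while remaining consistent with the jointly trained value.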
In addition, the data from the RL algorithms and the field tests conducted in Monterey Bay were post-processed during the incoming phase, and a manuscript was written and published in the journal Science Robotics. This was a major milestone for the AIforUTracking project, as it shared the main outputs and results with the community. The project was also presented at the Science is Wonderful! science fair, organized by the European Commission in Brussels in 2023, reaching almost 4000 children over the course of two days.