Design of an AUV controller based on an improved PPO algorithm
Abstract
Objectives: To improve the robustness of autonomous underwater vehicle (AUV) controllers to environment modeling errors, a reinforcement learning control strategy that introduces contextual information and a curriculum learning training mechanism is proposed. Methods: First, contextual information is embedded into the policy network by using the interaction history data as part of the policy network input. Second, a curriculum learning training mechanism is designed that gradually increases the disturbance strength during training, avoiding the training instability and early stopping caused by excessive disturbance. Depth-keeping control experiments were conducted in a simulation environment, and the effectiveness of the algorithm was further verified with a physical AUV in a pool. Results: The experimental results show that the proposed algorithm improves the convergence speed by 25% and the steady-state reward value by 10.8%, effectively improving the training process. The proposed algorithm achieves tracking with no steady-state error in the simulation environment, and the mean
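The two mechanisms named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `HistoryContext`, the function `disturbance_scale`, the zero-padding at episode start, the history length, and the linear ramp are all assumptions made for illustration, since the abstract does not specify the history encoding or the shape of the curriculum schedule.

```python
from collections import deque


class HistoryContext:
    """Hypothetical helper: builds the policy input from the current
    observation plus the last `history_len` (observation, action) pairs."""

    def __init__(self, obs_dim, act_dim, history_len):
        self.obs_dim = obs_dim
        self.act_dim = act_dim
        self.history_len = history_len
        self.buffer = deque(maxlen=history_len)
        self.reset()

    def reset(self):
        # Assumption: zero-pad the history so the input size is fixed
        # from the first step of every episode.
        self.buffer.clear()
        pad = [0.0] * (self.obs_dim + self.act_dim)
        for _ in range(self.history_len):
            self.buffer.append(list(pad))

    def push(self, obs, action):
        # Store one interaction step; the oldest step is dropped automatically.
        self.buffer.append(list(obs) + list(action))

    def policy_input(self, obs):
        # Current observation followed by the flattened history, oldest first.
        flat = [x for step in self.buffer for x in step]
        return list(obs) + flat


def disturbance_scale(iteration, ramp_iters, max_scale):
    """Assumed linear curriculum: disturbance strength grows from 0 at
    iteration 0 to `max_scale` at `ramp_iters`, then stays constant."""
    frac = min(max(iteration / ramp_iters, 0.0), 1.0)
    return max_scale * frac
```

In a training loop, `disturbance_scale` would be evaluated once per PPO iteration and passed to the simulator, while `policy_input` would replace the raw observation fed to the policy network. Starting with weak disturbances lets the policy first learn the nominal depth-keeping task before it has to cope with strong modeling errors.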