Performance-prescribed reinforcement learning trajectory-tracking control of an unmanned surface vehicle

Wang Runxu; Wang Ning; Huo Yang

doi:10.19693/j.issn.1673-3185.04946

Abstract

[Objectives] Under complex navigation conditions, saturation of the propulsion system can cause the trajectory-tracking errors of an unmanned surface vehicle (USV) to oscillate and destabilize. To address this problem, a reinforcement learning optimal control strategy based on prescribed performance constraints is proposed. [Methods] First, a smooth saturation function is introduced to handle the amplitude constraint on the control input. Second, an improved prescribed performance control scheme is designed, in which asymmetric performance boundaries strictly constrain the convergence range of the tracking errors while relaxing the stringent dependence on the initial error states. Third, a reinforcement learning optimization framework based on Actor-Critic networks is established; it approximates the optimal control policy and the value function through online iterative learning, optimizing USV tracking performance under state constraints. Finally, the stability of the closed-loop tracking control system is rigorously proven using Lyapunov stability theory. [Results] Numerical simulations on the KVLCC2 ocean-going tanker show that the proposed method effectively handles USV trajectory tracking under saturation constraints, with the tracking errors remaining within the prescribed performance boundaries at all times. [Conclusions] The study provides a new solution for high-performance tracking control of constrained USVs and has practical value for engineering applications.
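The abstract names a smooth saturation function and an asymmetric performance boundary but does not give their functional forms. The sketch below illustrates the two ideas with commonly used choices, which are assumptions here and not necessarily the authors' design: a `tanh`-based saturation and an exponentially decaying performance function scaled differently on the lower and upper sides. All function names and parameter values are hypothetical.

```python
import math


def smooth_saturation(u, u_max):
    """Smooth stand-in for a hard limit |u| <= u_max (assumed tanh form)."""
    return u_max * math.tanh(u / u_max)


def performance_function(t, rho0, rho_inf, decay):
    """Assumed exponential bound: starts at rho0 and decays toward rho_inf."""
    return (rho0 - rho_inf) * math.exp(-decay * t) + rho_inf


def asymmetric_bounds(t, lower_scale, upper_scale,
                      rho0=2.0, rho_inf=0.1, decay=0.5):
    """Return (lower, upper) error bounds; the two sides use different scales."""
    rho = performance_function(t, rho0, rho_inf, decay)
    return -lower_scale * rho, upper_scale * rho


def within_bounds(e, t, lower_scale, upper_scale):
    """Check whether tracking error e lies strictly inside the bounds at time t."""
    lower, upper = asymmetric_bounds(t, lower_scale, upper_scale)
    return lower < e < upper
```

With `lower_scale=0.5` and `upper_scale=1.0`, an initial error of 1.5 is admissible while an error of -1.5 is not, which is the kind of asymmetry the abstract refers to. The saturated command never exceeds `u_max` in magnitude, yet remains differentiable, which is what makes it convenient for gradient-based analysis.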
