USV maneuvering target tracking control based on the HM-SAC algorithm

Abstract: [Objectives] Unmanned surface vehicles (USVs) that track a high-speed maneuvering target with limited information suffer from response delays and low tracking efficiency. To address these problems, this paper proposes a USV target tracking control method based on deep reinforcement learning. [Methods] The method builds on the Soft Actor-Critic (SAC) framework, with an observation space, action space, and reward function designed for limited-information conditions. A long short-term memory (LSTM) network is embedded in the SAC hidden layers so that the policy draws on both the current environment state and historical memory. This refines the reinforcement learning "state-action" mapping and lets the USV learn a temporally optimal policy, ultimately achieving effective tracking control of maneuvering targets under limited information. [Results] Simulations show that, in environments with wind, waves, and currents and with delayed, noisy observations, the proposed method quickly closes on a maneuvering target and then keeps tracking it at a safe distance. Multi-episode randomized robustness tests confirm the effectiveness of the proposed algorithm. [Conclusions] The proposed HM-SAC (Hidden Long Short-term Memory Soft Actor-Critic) algorithm yields an efficient, adaptive tracking policy and shows clear advantages among USV target tracking control methods.
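The Methods statement, an LSTM embedded in the SAC hidden layers so that actions depend on both the current observation and remembered history, can be illustrated with a small sketch. This is not the authors' network: the class name `RecurrentActor`, the layer sizes, the omission of bias terms, the log-std clipping range, and the 4-dimensional observation are all assumptions made for illustration, and the weights are random rather than trained. The sketch only shows how a recurrent state is carried from one control step to the next inside a tanh-squashed Gaussian actor of the kind SAC uses.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained actor would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class RecurrentActor:
    """Hypothetical actor: dense layer -> LSTM cell -> tanh-squashed Gaussian head.

    Bias terms are omitted to keep the sketch short.
    """

    def __init__(self, obs_dim, hidden_dim, act_dim, seed=0):
        self.rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.W_in = rand_matrix(hidden_dim, obs_dim, self.rng)
        # One stacked matrix for the four LSTM gates (input, forget, cell, output),
        # applied to the concatenation [features, previous hidden state].
        self.W_gates = rand_matrix(4 * hidden_dim, 2 * hidden_dim, self.rng)
        self.W_mu = rand_matrix(act_dim, hidden_dim, self.rng)
        self.W_log_std = rand_matrix(act_dim, hidden_dim, self.rng)

    def initial_state(self):
        """Zero hidden and cell state, used at the start of an episode."""
        return [0.0] * self.hidden_dim, [0.0] * self.hidden_dim

    def step(self, obs, state):
        """Return a sampled action in [-1, 1] and the updated LSTM state."""
        h_prev, c_prev = state
        H = self.hidden_dim
        x = [math.tanh(v) for v in matvec(self.W_in, obs)]
        z = matvec(self.W_gates, x + h_prev)
        i = [sigmoid(v) for v in z[0:H]]             # input gate
        f = [sigmoid(v) for v in z[H:2 * H]]         # forget gate
        g = [math.tanh(v) for v in z[2 * H:3 * H]]   # candidate cell update
        o = [sigmoid(v) for v in z[3 * H:4 * H]]     # output gate
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        mu = matvec(self.W_mu, h)
        # Clip the log standard deviation to an assumed range for numerical safety.
        log_std = [max(-5.0, min(2.0, v)) for v in matvec(self.W_log_std, h)]
        action = [
            math.tanh(m + math.exp(s) * self.rng.gauss(0.0, 1.0))
            for m, s in zip(mu, log_std)
        ]
        return action, (h, c)


# Usage: feed two consecutive (made-up) observations, carrying the memory forward.
actor = RecurrentActor(obs_dim=4, hidden_dim=8, act_dim=2)
state0 = actor.initial_state()
obs = [0.5, -0.2, 0.1, 0.0]
action1, state1 = actor.step(obs, state0)
action2, state2 = actor.step(obs, state1)
```

Because the LSTM state is passed back in at every step, the same observation can produce a different hidden representation depending on what came before it, which is the property the abstract relies on when observations are delayed or noisy.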

     
