Attitude control of catamaran based on deep reinforcement learning

  • Abstract:
    Objective Traditional control algorithms for the longitudinal motion of catamarans depend on precise mathematical models and system parameters. To reduce this dependency, a longitudinal motion control algorithm based on deep reinforcement learning is proposed.
    Methods A reward function and a neural network structure are designed, the relevant hyperparameters are tuned, and the resulting controller is combined with a catamaran model. Experiments compare the deep deterministic policy gradient (DDPG) algorithm, a deep reinforcement learning method, with a linear quadratic regulator whose optimal solution is found by a genetic algorithm (GA-LQR). The comparison covers control performance under three different control modes and robustness under different operating conditions and initial states.
    Results Under the same operating conditions, DDPG controls slightly better than GA-LQR, but its fin angle output during control is more aggressive. In simulations under different operating conditions and initial states, the control performance of DDPG degrades markedly when the system and environment models change substantially; when they change only slightly, DDPG adapts better than GA-LQR. Overall, the performance of DDPG is comparable to that of GA-LQR.
    Conclusions The DDPG algorithm, based on deep reinforcement learning, has application potential in the longitudinal motion control of catamarans and offers a new research direction and methodological support for ship motion control under complex sea conditions.
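
The abstract does not state the reward function that was used. As a rough illustration of what "designing a reward function" can mean for this kind of problem, the following is a minimal sketch that assumes a negative quadratic cost on heave displacement, pitch angle, and fin angle. The function name `step_reward`, its arguments, and the weights are all hypothetical and are not taken from the paper.

```python
import math


def step_reward(heave_m, pitch_rad, fin_angle_rad,
                w_heave=1.0, w_pitch=1.0, w_fin=0.1):
    """Hypothetical per-step reward: the negative of a weighted quadratic cost.

    heave_m        heave displacement from equilibrium (assumed state variable)
    pitch_rad      pitch angle (assumed state variable)
    fin_angle_rad  commanded fin angle (assumed action)
    The weights are illustrative placeholders, not values from the paper.
    """
    cost = (w_heave * heave_m ** 2
            + w_pitch * pitch_rad ** 2
            + w_fin * fin_angle_rad ** 2)
    return -cost


# Zero motion and zero fin deflection give the maximum reward of zero.
calm = step_reward(0.0, 0.0, 0.0)

# Any motion or fin deflection lowers the reward.
disturbed = step_reward(0.2, math.radians(3.0), math.radians(15.0))
```

Under this assumed form, raising `w_fin` penalizes large fin deflections more heavily, which is one plausible way to temper the aggressive fin angle output reported in the results.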

     
