Collision avoidance path planning of tourist ship based on DDPG algorithm
Keywords:
- mixed waterway
- ship domain
- ship collision avoidance
- deep deterministic policy gradient (DDPG) algorithm
- failure region exploration strategy
Abstract: Objective Sailing safety is the central issue in ship navigation. If collision avoidance depends entirely on the captain's individual state and judgment, potential safety hazards remain. To coordinate all ships (tourist ships, cargo ships, etc.) in key waters and predict their routes, an anti-collision early warning mechanism needs to be established. Methods Using the deep deterministic policy gradient (DDPG) algorithm and Fujii's ship domain model, an electronic chart is used to simulate ship navigation routes. An improved DDPG strategy based on focused learning of failure regions is proposed, and the parameters of the ship domain model are adapted to the characteristics of tourist ships, so as to improve the accuracy of route prediction and collision avoidance. Results Compared with the unimproved algorithm, the improved DDPG algorithm and ship domain model raise the correct collision avoidance rate from 84.9% to 89.7% and reduce the average error between the simulated and real routes from 25.2 m to 21.4 m. Conclusion Ship collision avoidance path planning based on the improved DDPG algorithm and improved ship domain model can realize route supervision in the monitored waters; when a predicted route intersects with that of another ship, the dispatcher is alerted, realizing an effective anti-collision early warning mechanism.
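The collision check behind this mechanism can be stated compactly. The Python sketch below tests whether a target ship falls inside a Fujii-style elliptical ship domain (reference [19]); the 4.0 and 1.6 semi-axis coefficients are the commonly quoted Fujii values, and the function name, coordinate convention, and example dimensions are illustrative assumptions rather than the paper's code.

```python
import math

def in_fujii_domain(own_pos, own_heading_rad, own_length, target_pos,
                    k_major=4.0, k_minor=1.6):
    """True if target_pos lies inside the elliptical ship domain.

    Fujii-style domain: an ellipse centred on own ship with semi-major
    axis k_major * L along the heading and semi-minor axis k_minor * L
    abeam. 4.0 and 1.6 are the commonly quoted Fujii coefficients; the
    paper tunes such parameters for tourist ships.
    """
    dx = target_pos[0] - own_pos[0]
    dy = target_pos[1] - own_pos[1]
    # Rotate the offset into the body frame: x' ahead, y' abeam.
    ahead = dx * math.cos(own_heading_rad) + dy * math.sin(own_heading_rad)
    abeam = -dx * math.sin(own_heading_rad) + dy * math.cos(own_heading_rad)
    a, b = k_major * own_length, k_minor * own_length
    return (ahead / a) ** 2 + (abeam / b) ** 2 <= 1.0

# Example: a 30 m tourist ship heading along the x-axis; a target 80 m
# ahead lies inside the 4L = 120 m forward semi-axis, so an alert fires.
print(in_fujii_domain((0.0, 0.0), 0.0, 30.0, (80.0, 0.0)))  # True
```

In the early warning setting, this test would run against every predicted track point of every other vessel; any intrusion into the domain flags the crossing for the dispatcher.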
Table 1. Comparison of simulation data between the original and improved DDPG algorithms

| Algorithm | Comparison runs | Collisions | Collision rate /% | Correct turns | Correct turn rate /% | Mean track-point deviation /m |
|---|---|---|---|---|---|---|
| Original algorithm | 1 000 | 0 | 0 | 849 | 84.9 | 25.2 |
| Improved algorithm | 1 000 | 0 | 0 | 897 | 89.7 | 21.4 |
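The "improved algorithm" row reflects the failure-region focused learning strategy named in the abstract. As a rough illustration, the Python sketch below shows one plausible form of such a mechanism: a replay buffer that keeps transitions from failed episodes (collision or ship-domain intrusion) in a separate pool and oversamples them when forming DDPG training batches. The class name, pool sizes, and 0.3 failure fraction are illustrative assumptions, not values from the paper.

```python
import random
from collections import deque

class FailureFocusedReplay:
    """Replay buffer biased toward transitions recorded near failures.

    One plausible reading of "failure-region focused learning":
    transitions from episodes that ended in failure are kept in a
    separate pool and oversampled, so the DDPG actor/critic revisit the
    states where the policy previously failed.
    """

    def __init__(self, capacity=100_000, failure_fraction=0.3):
        self.normal = deque(maxlen=capacity)
        self.failure = deque(maxlen=capacity // 4)
        self.failure_fraction = failure_fraction

    def add_episode(self, transitions, failed):
        # transitions: list of (state, action, reward, next_state, done)
        pool = self.failure if failed else self.normal
        pool.extend(transitions)

    def sample(self, batch_size):
        # Draw a fixed fraction from the failure pool, the rest from the
        # normal pool; early in training, when the pools are still small,
        # the batch may be shorter than batch_size.
        n_fail = min(int(batch_size * self.failure_fraction),
                     len(self.failure))
        batch = random.sample(self.failure, n_fail)
        batch += random.sample(self.normal,
                               min(batch_size - n_fail, len(self.normal)))
        random.shuffle(batch)
        return batch
```

Under this scheme the rest of the DDPG update loop is unchanged; only the sampling distribution over stored experience shifts toward the failure regions.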
[1] WU F, LI Z T. Analysis of the sustainable development of China's inland river in the new era[J]. Pearl River Water Transport, 2020(15): 87–88 (in Chinese).
[2] TONG L. Key points of investigation and handling of inland watercraft collision avoidance accidents[C]//Papers on Navigation Safety and Management in Bridge Area (1). Zhuhai: China Nautical Society, 2010: 3 (in Chinese).
[3] NI S K, LIU Z J, CAI Y, et al. Ship collision avoidance decision aids based on genetic algorithm[J]. Journal of Shanghai Maritime University, 2017, 38(1): 12–15 (in Chinese).
[4] NI S K, LIU Z J, CAI Y, et al. Ship collision avoidance path planning based on hybrid genetic algorithm[J]. Journal of Shanghai Maritime University, 2019, 40(1): 21–26 (in Chinese).
[5] SHANG M D, ZHU Z Y, ZHOU T. Research on intelligent anti-collision method of USV based on improved ant colony algorithm[J]. Ship Engineering, 2016, 38(9): 6–9 (in Chinese).
[6] SONG Y. Research on ship path planning algorithm[D]. Wuhan: Wuhan University of Technology, 2018 (in Chinese).
[7] OUYANG Z L, WANG H D, WANG J Y, et al. Automatic collision avoidance algorithm for unmanned surface vessel based on improved Bi-RRT algorithm[J]. Chinese Journal of Ship Research, 2019, 14(6): 8–14 (in Chinese).
[8] YAN Z P, YANG Z W, WANG L, et al. Research status of Markov theory in unmanned systems[J]. Chinese Journal of Ship Research, 2018, 13(6): 9–18 (in Chinese).
[9] WANG C B, ZHANG X Y, ZHANG J W, et al. Method for intelligent obstacle avoidance decision-making of unmanned vessel in unknown waters[J]. Chinese Journal of Ship Research, 2018, 13(6): 72–77 (in Chinese).
[10] DING Z G, ZHANG X Y, WANG C B, et al. Intelligent collision avoidance decision-making method for unmanned ships based on driving practice[J]. Chinese Journal of Ship Research, 2021, 16(1): 96–104, 113 (in Chinese). doi: 10.19693/j.issn.1673-3185.01781
[11] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Cambridge, MA: MIT Press, 1998.
[12] ZHOU Z H. Machine learning[M]. Beijing: Tsinghua University Press, 2016 (in Chinese).
[13] GÖRGES D. Relations between model predictive control and reinforcement learning[J]. IFAC-PapersOnLine, 2017, 50(1): 4920–4928. doi: 10.1016/j.ifacol.2017.08.747
[14] ENJALBERT S, VANDERHAEGEN F. A hybrid reinforced learning system to estimate resilience indicators[J]. Engineering Applications of Artificial Intelligence, 2017, 64: 295–301. doi: 10.1016/j.engappai.2017.06.022
[15] SHI Y M, DU J, AHN C R, et al. Impact assessment of reinforced learning methods on construction workers' fall risk behavior using virtual reality[J]. Automation in Construction, 2019, 104: 197–214. doi: 10.1016/j.autcon.2019.04.015
[16] GENDERS W, RAZAVI S. Evaluating reinforcement learning state representations for adaptive traffic signal control[J]. Procedia Computer Science, 2018, 130: 26–33. doi: 10.1016/j.procs.2018.04.008
[17] BU L Z. Study of robot arm control based on deep reinforcement learning[D]. Xuzhou: China University of Mining and Technology, 2019 (in Chinese).
[18] CHEN X L, CAO L, LI C X, et al. Deep reinforcement learning via good choice resampling experience replay memory[J]. Control and Decision, 2018, 33(4): 600–606 (in Chinese).
[19] FUJII Y, TANAKA K. Traffic capacity[J]. The Journal of Navigation, 1971, 24(4): 543–552. doi: 10.1017/S0373463300022384
[20] UHLENBECK G E, ORNSTEIN L S. On the theory of the Brownian motion[J]. Physical Review, 1930, 36(5): 823. doi: 10.1103/PhysRev.36.823