Deep reinforcement learning-based beamforming optimization for secure and green offshore integrated sensing and communications

李旭东; 赵晓楠; 荣寒潇; 崔杨; 姚如贵

doi:10.19693/j.issn.1673-3185.04890

DRL-empowered secure and green beamforming optimization for maritime heterogeneous ISAC

Abstract

[Objective] Offshore integrated sensing and communications (ISAC) systems face dynamic and heterogeneous challenges, including frequent node mobility, time-varying channels with strong interference, and cross-network eavesdropping threats. Conventional optimization methods are computationally expensive and struggle to respond in real time. [Method] To address this problem, an intelligent beamforming optimization framework based on deep reinforcement learning (DRL) is proposed. The secrecy energy efficiency (SEE) maximization problem is modeled as a Markov decision process, and a composite reward function is designed to guide policy optimization. Rate-splitting multiple access (RSMA) is introduced for fine-grained management of cross-network interference, and the sensing signal is reused as built-in green jamming against eavesdroppers, which improves physical layer security without additional power consumption. The policy is trained with proximal policy optimization (PPO) under a hybrid scheme that combines supervised pre-training with online fine-tuning, giving fast convergence and adaptation to dynamic conditions. [Result] Simulations use typical offshore parameters: a carrier frequency of 18 GHz, a transmit power of 35 dBm, and 3 users. The results show that the SEE of the proposed scheme is more than 18% higher than that of the conventional RSMA alternating optimization scheme, and that it converges more than 30% faster. The proposed scheme is also more robust under channel estimation errors and abrupt topology changes. [Conclusion] The proposed DRL-based intelligent beamforming method jointly improves secrecy energy efficiency, real-time responsiveness, and robustness in complex marine environments, and offers a new approach to intelligent resource management in ISAC.
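The abstract does not state how SEE is computed, so the following is only a minimal sketch of one common way to define it: secrecy rate divided by total consumed power, for a single user and a single eavesdropper. The function name, its parameters, and the assumption that the legitimate user cancels the known sensing waveform while the eavesdropper cannot are all hypothetical and are not taken from the paper. It is meant only to illustrate why treating the sensing beam as jamming at the eavesdropper can raise SEE at no extra transmit power.

```python
import numpy as np


def secrecy_energy_efficiency(h_user, h_eve, w_comm, w_sense,
                              noise_power, circuit_power,
                              sensing_jams_eve=True):
    """Toy SEE in bit/s/Hz per watt (hypothetical model, not the paper's).

    h_user, h_eve : complex channel vectors to the user and the eavesdropper
    w_comm, w_sense : communication and sensing beamforming vectors
    Assumption: the user removes the known sensing waveform, while the
    eavesdropper sees it as interference when sensing_jams_eve is True.
    """
    # Received signal powers |h^H w|^2 (np.vdot conjugates its first argument).
    sig_user = np.abs(np.vdot(h_user, w_comm)) ** 2
    sig_eve = np.abs(np.vdot(h_eve, w_comm)) ** 2
    jam_eve = np.abs(np.vdot(h_eve, w_sense)) ** 2 if sensing_jams_eve else 0.0

    # Achievable rates of the user and of the eavesdropper.
    rate_user = np.log2(1.0 + sig_user / noise_power)
    rate_eve = np.log2(1.0 + sig_eve / (jam_eve + noise_power))

    # Secrecy rate is clipped at zero; power includes both beams plus circuits.
    secrecy_rate = max(rate_user - rate_eve, 0.0)
    total_power = (np.linalg.norm(w_comm) ** 2
                   + np.linalg.norm(w_sense) ** 2
                   + circuit_power)
    return secrecy_rate / total_power


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_antennas = 4  # illustrative value only
    h_u = rng.standard_normal(n_antennas) + 1j * rng.standard_normal(n_antennas)
    h_e = rng.standard_normal(n_antennas) + 1j * rng.standard_normal(n_antennas)
    w_c = h_u / np.linalg.norm(h_u)   # matched-filter beam toward the user
    w_s = h_e / np.linalg.norm(h_e)   # sensing beam that happens to cover the eavesdropper
    for flag in (True, False):
        see = secrecy_energy_efficiency(h_u, h_e, w_c, w_s, 1.0, 0.5, flag)
        print(f"sensing_jams_eve={flag}: SEE={see:.4f}")
```

In this toy model the transmit power is identical in both cases, so any SEE gain comes purely from the sensing beam degrading the eavesdropper's rate. A DRL agent of the kind described in the abstract could use a quantity like this as one term of its reward, but the paper's actual reward design is not given here.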
