Abstract:
Objectives Unmanned surface vehicle (USV) swarms have become essential in critical maritime applications such as search and rescue, environmental monitoring, and military operations, owing to their enhanced robustness and operational efficiency compared with single-USV systems. However, traditional path planning methods for USV swarms face significant limitations in complex and dynamic marine environments. Conventional algorithms such as A*, RRT*, the artificial potential field (APF) method, and the dynamic window approach (DWA) operate on a passive-response basis, relying on static or quasi-static environmental parameters (e.g., obstacle positions and velocities) to construct their models. This approach lacks predictive capability for dynamic targets and an active decision-making mechanism, resulting in poor adaptability and insufficient robustness. While deep reinforcement learning (DRL) approaches enable end-to-end policy learning, they suffer from high sample complexity, high training costs, weak generalization, and difficulty integrating high-level task constraints (e.g., task priorities and safety thresholds). To address these challenges, this study proposes a novel adaptive path planning framework that leverages the advanced reasoning and decision-making capabilities of large language models (LLMs) to improve the performance of USV swarms in complex scenarios.
Methods The proposed method, adaptive path planning with tool-function chains (APPT), uses a multi-component design to achieve intelligent, adaptive path planning. First, a planning encoder is developed to process environmental data, extracting critical features such as obstacle density, dynamic obstacle movement patterns, and task constraints (including path length limits and safety distance requirements). This encoder converts unstructured environmental and task information into structured prompt vectors, which are then fed into the LLM. Second, leveraging prompt engineering, an LLM-driven USV swarm path planning agent is constructed. This agent incorporates a library of classical path planning algorithms (A*, RRT*, APF, DWA) as plug-and-play "tool functions". The LLM dynamically assembles optimal tool chains by computing the cosine similarity between the prompt vectors (representing environmental and task demands) and the capability feature vectors of the tool functions. Third, an adaptive iterative optimization mechanism, guided by user input, is implemented. Based on three key evaluation metrics, namely planning time (total duration from prompt input to plan generation), total path length (sum of the individual USV path lengths in the swarm), and safety (distance between USVs and obstacles relative to obstacle expansion radii), the LLM recursively adjusts tool function parameters (e.g., A* heuristic weights, APF attraction/repulsion gains). This recursive adjustment is driven by structured prompt templates that incorporate scenario details, current performance metrics, and optimization goals, ensuring that the framework can flexibly adapt to evolving task requirements.
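The cosine-similarity matching step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature dimensions, capability values, and function names are hypothetical assumptions chosen only to show how a prompt vector selects a tool chain.

```python
import math

# Hypothetical capability feature vectors for each tool function, over
# illustrative dimensions: [dynamic-obstacle handling, path optimality,
# planning speed, safety]. These values are assumptions, not from the paper.
TOOL_CAPABILITIES = {
    "A*":   [0.2, 0.9, 0.6, 0.7],
    "RRT*": [0.4, 0.8, 0.4, 0.6],
    "APF":  [0.8, 0.4, 0.9, 0.5],
    "DWA":  [0.9, 0.5, 0.8, 0.8],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def assemble_tool_chain(prompt_vector, top_k=2):
    """Rank tool functions by similarity to the encoded scenario prompt
    vector and return the top_k names as an ordered tool chain."""
    ranked = sorted(
        TOOL_CAPABILITIES.items(),
        key=lambda item: cosine_similarity(prompt_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]

# Example: a scenario dominated by dynamic obstacles with a high safety weight.
chain = assemble_tool_chain([0.9, 0.3, 0.5, 0.8])
print(chain)  # → ['DWA', 'APF']
```

In APPT the ranking is performed by the LLM over encoder-produced prompt vectors rather than by a fixed function; the sketch only makes the similarity-based selection criterion concrete.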
Results Extensive experiments were conducted in dynamically generated obstacle environments (100 m × 100 m maps) containing 3 fixed obstacles (radii: 8 m, 10 m, and 15 m) and 3 dynamic obstacles (radius: 5 m, with discrete movement updates). The results validated the effectiveness of the APPT method across multiple dimensions. In tool selection, the APPT method achieved an average accuracy of 89.7% across various scenarios, with performance adapting to the environmental and task parameters. For instance, when the safety weight was prioritized (0.6–0.8), the accuracy of safety-related tool selection reached 95.6%; similarly, when the planning time weight was high (0.6–0.8), the time-related accuracy peaked at 96.2%. Regarding path optimization, the APPT method reduced total path length by 14.55% after iterative parameter adjustment (e.g., optimizing APF parameters to mitigate local-minimum oscillations). Compared with a conventional parameter optimization algorithm, the APPT method maintained comparable path quality (only a 0.7% increase in path length) while reducing optimization time by 61% (7.52 s vs. 19.48 s), highlighting its efficiency.
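As a quick arithmetic check, the reported 61% reduction in optimization time follows directly from the two raw timings quoted in the results:

```python
# Reported optimization times: 19.48 s for the conventional parameter
# optimization algorithm vs. 7.52 s for the APPT method.
conventional_s = 19.48
appt_s = 7.52

# Relative reduction in optimization time achieved by APPT.
reduction = (conventional_s - appt_s) / conventional_s
print(f"{reduction:.0%}")  # → 61%
```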
Conclusions The APPT method represents a paradigm shift in USV swarm path planning by fully leveraging the reasoning and analytical capabilities of LLMs. By integrating prompt engineering and dynamic tool-chain assembly, the method overcomes the adaptability and robustness limitations of traditional approaches and the scalability challenges of DRL methods. The APPT method not only achieves high tool selection accuracy (89.7% on average) across complex environments but also enables efficient, user-guided iterative optimization, leading to notable improvements in both path quality and planning efficiency. In practice, the APPT framework offers a versatile solution that can be tailored to diverse maritime tasks, from civilian environmental monitoring to military operations, by adjusting evaluation metrics and tool parameters. Theoretically, it bridges the gap between LLMs and engineering applications in maritime robotics, providing a foundation for future research on intelligent, adaptive multi-agent systems in complex and dynamic environments.