Abstract:
Objective Accurate and continuous wave-direction perception is an essential prerequisite for shipborne marine environmental awareness, intelligent navigation, route optimization, and safe offshore operations. However, vision-based wave-direction estimation onboard a moving vessel remains challenging because the observed sea-surface texture is highly susceptible to nighttime low illumination, partial rain–fog occlusion, strong specular reflections, and vessel motions such as roll, pitch, and yaw. These factors can cause texture degradation, viewpoint disturbances, image blur, and unstable wave-direction regression. To address these challenges, this paper proposes a shipborne wave-direction estimation method that combines multispectral monocular vision with inertial measurement unit (IMU) assistance, with all-weather sensing as the target requirement. Dominant wave-direction estimation is treated as the core task, and the feasibility of the proposed method is validated using limited field data collected in real shipborne scenarios.
Methods A visible-light and long-wave infrared multispectral imaging system with synchronized attitude acquisition is first constructed. The image streams and IMU measurements are temporally aligned using timestamps, and sequential samples are generated via a sliding-window strategy. To enhance the visibility and stability of sea-surface texture under nonuniform illumination and low-contrast conditions, contrast limited adaptive histogram equalization (CLAHE) is applied to the input images. A U-Net-based segmentation network is then employed to extract the effective sea-surface region while suppressing interference from the sky, vessel structures, wake, and localized high-intensity reflections. On this basis, a ResNet-18 backbone initialized with transfer learning is used to encode visual features from each frame. The synchronized vessel attitude angles are represented using sine–cosine encoding to avoid angular discontinuities and are further embedded into the same feature space as the visual representations via a multilayer perceptron. The visual and attitude features are fused and fed into a lightweight multi-head self-attention module, which models cross-frame temporal dependencies and learns stable directional representations from evolving wave textures. The final wave direction is represented as a two-dimensional unit vector on the angular circle and is recovered using the arctangent function, thereby mitigating discontinuities near the 0°/360° boundary. In a subset of system-level validation experiments, three orthogonally arranged cameras perform independent inference. Their outputs are verified using geometric consistency constraints, and valid results are fused in post-processing to improve output stability.
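The sine–cosine encoding and the unit-vector output described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the (cos, sin) component order, and the use of the two-argument arctangent are assumptions made for illustration. The sketch only shows why working on the unit circle avoids the jump at the 0°/360° boundary.

```python
import math

def encode_angle(deg):
    """Map an angle in degrees to a (cos, sin) pair on the unit circle.

    The (cos, sin) ordering is an assumption; the paper does not state it.
    """
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))

def decode_angle(vec):
    """Recover an angle in [0, 360) degrees from a (cos, sin)-like pair.

    The pair need not have unit length: atan2 depends only on its direction.
    """
    x, y = vec
    ang = math.degrees(math.atan2(y, x))
    if ang < 0.0:
        ang += 360.0
    if ang >= 360.0:  # guard against rounding up to exactly 360.0
        ang = 0.0
    return ang

def mean_direction(degs):
    """Average several directions as vectors, then decode the result."""
    vecs = [encode_angle(d) for d in degs]
    x = sum(v[0] for v in vecs) / len(vecs)
    y = sum(v[1] for v in vecs) / len(vecs)
    return decode_angle((x, y))

if __name__ == "__main__":
    samples = [359.0, 1.0]
    # The arithmetic mean of the raw angles points the opposite way.
    print("naive mean:", sum(samples) / len(samples))
    # The vector mean stays next to the 0/360 boundary, as expected.
    print("vector mean:", mean_direction(samples))
```

Two directions that straddle the boundary, such as 359° and 1°, average to 180° when treated as plain numbers, whereas their vector mean stays at the boundary. The same reasoning applies to a regression target: a network that predicts the two components never has to learn a discontinuous jump.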
Results Field experiments were conducted in the Guangzhou–Zhuhai coastal waters. Under visible-light conditions, the proposed method achieved a mean absolute error of 0.41°, a standard deviation of 0.28°, and a success rate of 99.90% for wave-direction estimation. Under long-wave infrared conditions, the mean absolute error was 1.14°, the standard deviation was 0.89°, and the success rate was 98.94%. These results show that visible-light imagery provides clear wave-crest and wave-trough texture cues under daytime illumination, whereas long-wave infrared imaging still preserves effective sea-surface texture information under the nighttime and low-illumination conditions included in this study. The multispectral configuration therefore improves the continuity of shipborne wave-direction estimation across varying illumination conditions. In addition, incorporating IMU-based attitude information and temporal attention modeling helps mitigate the effects of vessel motion and single-frame fluctuations, resulting in more stable wave-direction estimation over time.
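Angular error statistics such as those reported above are usually computed on wrapped differences, so that a prediction of 359.5° against a reference of 0.5° counts as a 1° error, not a 359° error. The abstract does not state how its errors were computed, so the following is only a sketch under that assumption. The function names are hypothetical. The success rate is left out because its criterion is not defined here.

```python
import math

def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two directions, in [0, 180] degrees."""
    diff = (pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)

def error_stats(preds, truths):
    """Return (mean absolute error, population standard deviation) in degrees.

    Assumes wrapped angular differences; the paper may define these differently.
    """
    errs = [angular_error(p, t) for p, t in zip(preds, truths)]
    mae = sum(errs) / len(errs)
    var = sum((e - mae) ** 2 for e in errs) / len(errs)
    return mae, math.sqrt(var)

if __name__ == "__main__":
    # Illustrative values only; these are not data from the field experiments.
    preds = [359.5, 10.0, 20.0]
    truths = [0.5, 11.0, 18.0]
    print(error_stats(preds, truths))
```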
Conclusions The proposed multispectral monocular-vision–IMU multimodal temporal estimation method shows good stability on the collected visible-light daytime samples, long-wave infrared low-light/nighttime samples, and a limited set of complex-environment samples. The results indicate that the integration of multispectral imaging, attitude assistance, effective sea-surface extraction, and temporal correlation modeling is feasible for shipborne dominant wave-direction estimation. It should be noted that the current dataset does not systematically cover varying rain intensities, fog conditions, or different wind–wave coupling regimes. Therefore, the results should be regarded as a feasibility validation based on limited field data rather than a comprehensive demonstration of adaptability to all-weather operating conditions. Future work will focus on expanding the field dataset, improving model robustness under severe weather and complex sea states, and extending the framework from dominant wave-direction estimation to more comprehensive wave-parameter perception.