Abstract:
Ship target recognition in complex battlefield environments faces three problems: small sample sizes, high annotation costs, and complex sea-surface interference. To address them, an improved UNet++ model that integrates the SE-Net channel attention mechanism is proposed for end-to-end classification of ship images. ResNet50 serves as the encoder backbone in a bidirectional encoder-decoder feature refinement mechanism, an embedded SE-Net module dynamically recalibrates channel feature responses, and a joint loss function is introduced to enhance feature discrimination. A weakly supervised learning strategy is adopted that requires only image-level labels, which significantly reduces annotation cost. On a self-built dataset of seven ship classes, the improved model reached a classification accuracy of 92.0%, with the training loss falling from an initial 1.40 to 0.20 and the validation loss from 1.50 to 0.23, outperforming mainstream backbone networks and several alternative attention mechanisms. In applicability experiments under complex sea conditions such as scale changes, mist, and dense fog, accuracy improved by 8.9% to 18.5% over the baseline. Ablation experiments and visual analysis verified the effectiveness of each module. This study provides an effective technical solution for accurate ship target recognition under complex sea conditions.
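The channel recalibration performed by the SE-Net module can be illustrated with a minimal sketch. It assumes the standard squeeze-and-excitation form: global average pooling per channel, a two-layer bottleneck with ReLU, and a sigmoid gate that rescales each channel. The function name `se_block`, the weight matrices, the reduction ratio, and the toy feature map are hypothetical and not taken from the paper, whose model embeds the module in a UNet++ with a ResNet50 encoder.

```python
import math
import random


def sigmoid(x):
    """Logistic function used as the channel gate."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def se_block(feature_map, w1, w2):
    """Squeeze-and-excitation over a feature map of C channels, each H x W.

    w1 has shape (C // r) x C and w2 has shape C x (C // r), where r is an
    assumed reduction ratio. Returns the rescaled map and the channel scales.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, h) for h in matvec(w1, squeezed)]
    scales = [sigmoid(s) for s in matvec(w2, hidden)]
    # Recalibrate: multiply every value in a channel by that channel's scale.
    rescaled = [
        [[value * s for value in row] for row in channel]
        for channel, s in zip(feature_map, scales)
    ]
    return rescaled, scales


# Toy example: 4 channels of size 2 x 2, reduction ratio r = 2 (assumed values).
random.seed(0)
channels, height, width, reduction = 4, 2, 2, 2
toy_map = [
    [[random.random() for _ in range(width)] for _ in range(height)]
    for _ in range(channels)
]
toy_w1 = [[random.uniform(-1, 1) for _ in range(channels)]
          for _ in range(channels // reduction)]
toy_w2 = [[random.uniform(-1, 1) for _ in range(channels // reduction)]
          for _ in range(channels)]
toy_out, toy_scales = se_block(toy_map, toy_w1, toy_w2)
print("channel scales:", [round(s, 3) for s in toy_scales])
```

In a trained network the two weight matrices are learned, so channels that carry discriminative ship features would receive scales near 1 while channels dominated by sea clutter would be suppressed; the random weights here only demonstrate the data flow.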