Abstract: To detect the location and species of small target organisms (sea cucumbers, scallops, starfish, and sea urchins) quickly and accurately in complex underwater environments, this study designed a small target organism detection algorithm based on an improved YOLOv5s. In the feature extraction stage, a self-attention residual module built on multi-head self-attention was introduced to strengthen the network’s global modeling ability and enhance target feature information. In the feature fusion stage, the feature fusion network was changed to a bidirectional feature pyramid structure with lateral connections to improve the fusion of feature information from different stages. In the detection stage, the large target detection scale was discarded and a small target detection scale was added to improve the detection accuracy for small target organisms. Finally, the α-CIoU loss function was adopted as the bounding box regression loss to improve bounding box regression accuracy and thus overall detection accuracy. In the qualitative test, the improved model detected and correctly labeled almost all aquatic product targets visible to the naked eye, which demonstrates the effectiveness of the improved algorithm. In the α value selection test, α = 2.0 gave the best result: the mean average precision (mAP) reached 0.857, higher than for the other values tested and 0.016 higher than with α = 1.0. In the ablation experiment, each optimization method increased the detection accuracy of the model when it was added. The mAP of the improved model finally reached 0.873, which was 0.032 higher than that of the original model, while the number of model parameters was reduced by 26.8% to only 5 M.
In the comparative experiment, the mAP of the improved model was more than 0.020 higher than that of Faster RCNN, YOLOv3, YOLOv4, YOLOv5s, YOLOvX, SSD, NAS-FCOS, and an improved YOLOv5. The detection speed of the improved model on the local server reached 139 frames/s, which was 14 frames/s higher than that of YOLOv5s and slightly lower than that of SSD, a model known for its detection speed. These results indicate that the improved model meets the requirements for a lightweight, real-time detector. The improved model was also successfully deployed on Android mobile devices.
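
For readers unfamiliar with the α-CIoU loss named in the abstract, the following is a minimal sketch of its commonly published form, L = 1 − IoU^α + (ρ²/c²)^α + (βv)^α, for a single pair of boxes. It is an illustration and not the implementation used in this study: the function name `alpha_ciou_loss`, the corner-coordinate box format `(x1, y1, x2, y2)`, and the stabilizing constant `eps` are assumptions introduced here. With α = 1.0 the expression reduces to the standard CIoU loss.

```python
import math


def alpha_ciou_loss(box_pred, box_gt, alpha=2.0, eps=1e-9):
    """Illustrative alpha-CIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for explanation only; not the authors' code.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes (c^2).
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers (rho^2).
    rho2 = ((px1 + px2) - (gx1 + gx2)) ** 2 / 4 + ((py1 + py2) - (gy1 + gy2)) ** 2 / 4

    # Aspect-ratio consistency term v and its trade-off weight beta.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    beta = v / (1 - iou + v + eps)

    # Each CIoU term is raised to the power alpha.
    return 1 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha
```

Calling `alpha_ciou_loss((0, 0, 10, 10), (0, 0, 10, 10))` gives a value that is effectively zero, while disjoint boxes give a loss above 1 because the IoU term vanishes and the center-distance penalty remains. In a training framework the weight β is usually treated as a constant during backpropagation; this scalar sketch does not model gradients.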