Abstract: To address the shortage of training data faced in practice by convolutional-neural-network-based target detection models for Ochotona curzoniae, a data augmentation method based on the fusion of foreground and background is proposed. First, the foreground and background of each training image are separated; the separated foregrounds are randomly transformed, and the foreground regions remaining in the separated backgrounds are covered with background pixels, yielding a foreground set and a background set, respectively. A foreground and a background are then randomly selected from the two sets and fused by pixel addition. Next, a sample is randomly selected from the training set, and the region inside its labeled bounding box is fused into random positions in the training images using the cut-and-paste method, yielding the augmented data set. The model is trained with two-stage weakly supervised transfer learning: the first stage pre-trains the model on the augmented data set, and the second stage fine-tunes the pre-trained model to obtain the detection model. Under the same experimental conditions, results for Ochotona curzoniae detection in natural scenes show that the average precision (AP) of the detection model trained with this method is higher than that of models trained without data augmentation or with Mosaic or Cutout data augmentation. The best AP obtained with the proposed foreground-background fusion augmentation is 78.4%, compared with 72.6% for the Mosaic method, 75.86% for the Cutout method, and 77.4% for the Random Erasing method.
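The two fusion steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `fuse_by_addition` and `cut_and_paste`, the boolean foreground mask, and the `(x1, y1, x2, y2)` box format are all assumptions made for illustration. In particular, the sketch assumes that the background is zeroed under the foreground mask before the addition, so that the pixel sum behaves like compositing and does not saturate.

```python
import numpy as np


def fuse_by_addition(foreground, mask, background):
    """Fuse a foreground onto a background by pixel addition.

    foreground, background: HxWx3 uint8 images of the same size.
    mask: HxW boolean array, True where the foreground object lies.
    Assumption: the two images are made disjoint (foreground zero outside
    the mask, background zero inside it) so the sum cannot overflow.
    """
    fg = np.where(mask[..., None], foreground, 0).astype(np.uint16)
    bg = np.where(mask[..., None], 0, background).astype(np.uint16)
    return np.clip(fg + bg, 0, 255).astype(np.uint8)


def cut_and_paste(src_image, box, dst_image, rng):
    """Paste the labeled box region of src_image at a random position in dst_image.

    box: (x1, y1, x2, y2) in pixel coordinates, end-exclusive (assumed format).
    Returns the new image and the bounding box of the pasted region.
    """
    x1, y1, x2, y2 = box
    patch = src_image[y1:y2, x1:x2]
    ph, pw = patch.shape[:2]
    dh, dw = dst_image.shape[:2]
    if ph > dh or pw > dw:
        raise ValueError("patch does not fit in the destination image")
    # Draw a top-left corner such that the patch stays inside the image.
    top = int(rng.integers(0, dh - ph + 1))
    left = int(rng.integers(0, dw - pw + 1))
    out = dst_image.copy()
    out[top:top + ph, left:left + pw] = patch
    return out, (left, top, left + pw, top + ph)
```

In this sketch the pasted region's new box is returned alongside the image so that the annotation of the augmented sample can be updated; how the original method handles overlap with existing objects is not stated in the abstract and is not modeled here.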