Research on Small Target Detection Algorithm of DETR Network Based on Improved SWIN Transformer


Fengchang Xu, Rayner Alfred, Rayner Henry Pailus, Jackel Vui Lung Chew, Ge Lyu, Xiaoyou Zhang, Xinliang Wang

Abstract

A small target is an object whose bounding box is very small. Two definitions are in use: (1) Relative definition: the ratios of the bounding box width and height to the width and height of the original image are less than 10%, or the ratio of the bounding box area to the total image area is less than 3%; (2) Absolute definition: the bounding box is smaller than 32×32 pixels. Small target detection has important applications in remote sensing imagery, medical imaging, industrial quality inspection and autonomous driving. Although large target detection has achieved remarkable results, small target detection still faces challenges such as low image resolution, small object size, strong background interference and insufficient samples, which lead to low detection accuracy and slow speed. In particular, the detection accuracy of the Swin Transformer-based DETR small target detection algorithm on the TinyPerson and WIDER FACE datasets still needs improvement. To address this low detection rate, this paper proposes an improved Swin Transformer-based DETR small target detection model that adopts the following optimization strategies. First, BiFPN (Bidirectional Feature Pyramid Network) is introduced to enrich multi-scale features so that small target features flow more fully between shallow and deep layers, improving detection accuracy. Second, the window size is adjusted dynamically and the attention weight of small target regions is increased to compensate for the feature loss caused by the local self-attention mechanism. Third, the Hungarian matching algorithm of DETR is improved by introducing the SimOTA (simplified Optimal Transport Assignment) strategy, which matches prediction boxes to small targets more efficiently, improving the recall and detection accuracy for small targets and reducing detection time. Finally, multi-scale training and data augmentation strategies (random cropping, scaling, random resampling of small targets, etc.) are used to increase the number of small target samples, and progressive learning rate decay and data perturbation are used to strengthen the model's perceptual ability. The improved model reaches a small target detection mAP of 50.8% on TinyPerson and 70.3% on WIDER FACE. The results show that the improved Swin Transformer-based DETR algorithm has excellent generalization ability, detection accuracy and robustness across different datasets and scenarios.
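
The relative and absolute definitions of a small target given in the abstract can be illustrated with a minimal Python sketch. The names `Box`, `is_small_relative` and `is_small_absolute` are hypothetical and do not come from the paper. The sketch assumes that the 10% rule applies to both the width ratio and the height ratio, and that "smaller than 32×32 pixels" is measured by box area; the paper may apply either rule differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box size in pixels (hypothetical helper type)."""
    width: float
    height: float


def is_small_relative(box: Box, image_w: float, image_h: float,
                      side_ratio: float = 0.10, area_ratio: float = 0.03) -> bool:
    """Relative definition: both side ratios below 10%, or area ratio below 3%.

    Assumption: the side rule requires BOTH width and height ratios to be small.
    """
    sides_small = (box.width / image_w < side_ratio
                   and box.height / image_h < side_ratio)
    area_small = (box.width * box.height) / (image_w * image_h) < area_ratio
    return sides_small or area_small


def is_small_absolute(box: Box, max_side: int = 32) -> bool:
    """Absolute definition: box area below 32 x 32 pixels.

    Assumption: size is compared by area rather than by each side separately.
    """
    return box.width * box.height < max_side * max_side


if __name__ == "__main__":
    image_w, image_h = 1920, 1080
    for box in (Box(20, 20), Box(200, 10), Box(600, 400)):
        print(box,
              "relative:", is_small_relative(box, image_w, image_h),
              "absolute:", is_small_absolute(box))
```

Under these assumptions the two definitions can disagree: a 200×10 box in a 1920×1080 image counts as small by the relative area rule but not by the absolute rule, so an evaluation should state which definition it uses.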
