A Two-Stage Region Proposal Optimization for Efficient Bird Detection in Deep Learning Pipelines
Abstract
Recent advances in deep learning have significantly improved object detection performance in computer vision. Nevertheless, region-based convolutional neural networks (R-CNNs) are often computationally inefficient because they generate an excessive number of candidate regions, many of which are redundant or irrelevant. This paper introduces a novel two-stage preprocessing strategy that optimizes the region proposal phase of R-CNN-based object detection systems, with a specific focus on bird species recognition. The proposed approach filters out high-frequency noise in non-object regions and reduces the total number of region proposals without compromising detection accuracy. Experimental evaluations on five benchmark bird datasets show that the method increases the proportion of region proposals with an Intersection over Union (IoU) greater than 0.5 from 80.95% to 86.11%. The number of positive proposals increases by 330% during training and by 726% during testing, while the number of redundant proposals is reduced by 55.48%. The average time required to generate region proposals per image is reduced by up to 70%. To assess the method in a different real-world scenario, it was also evaluated on the publicly available Eastern Cottontail Rabbits dataset, which serves as a challenging benchmark; there, the proportion of predicted bounding boxes with IoU greater than 0.5 increased by 8.66%, indicating a notable improvement in localization accuracy. To observe the impact of the method on single-stage object detectors, the five bird datasets were unified into a multiclass object detection dataset and evaluated using YOLOv8, where the proposed approach improved precision by approximately 18% and recall by 42%.
The results validate the effectiveness of the proposed preprocessing framework in improving both the efficiency and accuracy of object detection systems, making it well suited for deployment in parallel and distributed computing environments.
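
The headline metric above, the proportion of region proposals with IoU greater than 0.5, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `iou` and `fraction_above` are hypothetical, and the corner box format (x1, y1, x2, y2) and the single ground-truth box per image are assumptions made only for illustration.

```python
from typing import Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) corners, with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fraction_above(proposals: Sequence[Box], truth: Box,
                   threshold: float = 0.5) -> float:
    """Share of proposals whose IoU with the ground-truth box exceeds the threshold."""
    if not proposals:
        return 0.0
    hits = sum(1 for p in proposals if iou(p, truth) > threshold)
    return hits / len(proposals)
```

Under these assumptions, `fraction_above(proposals, truth)` returns a value between 0 and 1; multiplying by 100 gives a percentage comparable in form to the 80.95% and 86.11% figures reported in the abstract.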









