Abstract:
Objective To address the challenges of varying illumination, occlusion by foliage and branches, overlapping fruit, and recognition of targets at multiple distances when detecting mature tomatoes in facility environments.
Method This study designed a mature tomato fruit detection model, YOLOv8n-SPMF. First, the Conv module in the YOLOv8n backbone network was replaced with SPDConv to improve detection accuracy for small tomatoes. Second, a PSCEA attention mechanism was added to the backbone network to extract local detail and edge information of tomatoes, thereby enhancing the model's feature extraction capability. Third, the SPPF module in the backbone was replaced with a mixed pooling module (MixSPPF) to strengthen information fusion among features at different levels. Finally, a Focaler-MPDIoU loss function was adopted to improve bounding box regression in complex scenarios.
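The abstract does not spell out the loss, so the following is only a minimal sketch of one common way to combine the two published ideas it names: MPDIoU (IoU penalized by the squared distances between matching top-left and bottom-right corners, normalized by the squared image diagonal) and the Focaler-IoU linear remapping of IoU onto an interval [d, u]. The function names, the box format (x1, y1, x2, y2), and the default values d = 0.0 and u = 0.95 are assumptions made for illustration, not settings reported by the authors.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, target, img_w, img_h):
    """IoU minus normalized squared distances between matching corners."""
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2  # top-left
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2  # bottom-right
    norm = img_w ** 2 + img_h ** 2  # squared diagonal of the input image
    return iou(pred, target) - d1_sq / norm - d2_sq / norm


def focaler_iou(iou_value, d=0.0, u=0.95):
    """Linearly remap IoU from [d, u] to [0, 1], clamping outside the interval.

    d and u are hypothetical defaults chosen for this sketch.
    """
    return min(1.0, max(0.0, (iou_value - d) / (u - d)))


def focaler_mpdiou_loss(pred, target, img_w, img_h, d=0.0, u=0.95):
    """Assumed combination: L_MPDIoU + IoU - IoU_focaler."""
    iou_value = iou(pred, target)
    l_mpdiou = 1.0 - mpdiou(pred, target, img_w, img_h)
    return l_mpdiou + iou_value - focaler_iou(iou_value, d, u)
```

Under these assumptions, a perfectly matched box yields a loss of zero, while a non-overlapping box is still penalized in proportion to its corner distances, which gives the regression a useful gradient signal even when IoU is zero.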
Result Experimental results showed that the YOLOv8n-SPMF model achieved an mAP50 of 96.24% on the test set, with an mAP50-95 of 81.36%, a recall of 90.33%, a parameter count of 4.13 M, and a detection time of 11.7 ms per image. Compared with YOLOv3-tiny, YOLOv5n, YOLOv6n, YOLOv7-tiny, Faster R-CNN, YOLOv8n, YOLOv9t, YOLOv10n, YOLOv11n and YOLOv12n, the model's mAP50 was higher by 3.57, 0.78, 1.40, 0.05, 2.64, 0.74, 0.66, 0.72, 0.50 and 1.26 percentage points, respectively, and its precision was higher by 0.48, 0.95, 0.69, 0.38, 4.90, 0.11, 0.63, 1.90, 0.43 and 0.61 percentage points, respectively.
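As a reading aid, the baseline mAP50 values implied by these differences can be recovered by simple subtraction. The sketch below assumes each reported gain equals the proposed model's mAP50 minus the baseline's mAP50 on the same test set; the derived figures are arithmetic consequences of the numbers above, not independently reported results.

```python
# mAP50 of YOLOv8n-SPMF on the test set, in percent (from the abstract).
PROPOSED_MAP50 = 96.24

# Reported mAP50 gains over each baseline, in percentage points.
MAP50_GAINS = {
    "YOLOv3-tiny": 3.57,
    "YOLOv5n": 0.78,
    "YOLOv6n": 1.40,
    "YOLOv7-tiny": 0.05,
    "Faster R-CNN": 2.64,
    "YOLOv8n": 0.74,
    "YOLOv9t": 0.66,
    "YOLOv10n": 0.72,
    "YOLOv11n": 0.50,
    "YOLOv12n": 1.26,
}

# Implied baseline mAP50 = proposed mAP50 - reported gain.
implied_baseline_map50 = {
    name: round(PROPOSED_MAP50 - gain, 2) for name, gain in MAP50_GAINS.items()
}

for name, value in implied_baseline_map50.items():
    print(f"{name}: {value:.2f}%")
```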
Conclusion The YOLOv8n-SPMF model proposed in this paper exhibits high accuracy for mature tomato fruit detection in facility environments and can provide effective technical support for intelligent tomato harvesting.