Abstract:
Objective To make full use of context information and integrate multi-scale features, a YOLOv5s algorithm based on BiFPN and the Triplet attention mechanism (BTF-YOLOv5s) was proposed for identifying defective apples.
Method Firstly, additional weights were introduced into the weighted bidirectional feature pyramid network (BiFPN) to learn the importance of different input features. The model repeatedly fused multi-scale features through top-down and bottom-up bidirectional paths, which improved its multi-scale detection ability. Secondly, the Triplet attention mechanism was applied to the Neck layer to enhance the model's ability to represent the correlation between target and contextual information, so that the model could focus more on learning apple features. Finally, the Focal-CIoU loss function was used to adjust the loss weights, so that the model paid more attention to defective apple recognition and its perception ability improved. Different loss functions were compared through ablation experiments, the position of the attention mechanism in the YOLOv5 structure was varied, and the model was compared with mainstream algorithms.
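The weighted fusion step described above can be illustrated with a minimal sketch. It assumes the "fast normalized fusion" form commonly associated with BiFPN, in which each learnable weight is clipped to be non-negative and then divided by the sum of all weights; the abstract does not state which normalization BTF-YOLOv5s uses. The function name `fast_normalized_fusion`, the `eps` value, and the flat lists standing in for feature maps are hypothetical and chosen only for illustration.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors using normalized non-negative weights.

    Assumption: fast normalized fusion (ReLU on weights, then division by
    their sum). This is a sketch, not the authors' implementation.
    """
    # Clip each weight at zero so no input contributes negatively.
    clipped = [max(w, 0.0) for w in weights]
    # eps guards against division by zero when all weights are clipped.
    total = sum(clipped) + eps
    norm = [w / total for w in clipped]
    # Weighted sum, computed element by element across the inputs.
    length = len(features[0])
    fused = [sum(n * f[i] for n, f in zip(norm, features)) for i in range(length)]
    return fused, norm


# Two toy "feature maps" of length 2; the second input is weighted more heavily.
fused, norm = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 3.0])
```

In a real network the weights would be trainable parameters updated by backpropagation, and the inputs would be feature maps resized to a common resolution before fusion.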
Result Compared with the original YOLOv5s model, BTF-YOLOv5s improved the accuracy, recall and mAP by 5.7, 2.2 and 3.5 percentage points, respectively, and the memory usage of the model was 14.7 MB. The mAP of BTF-YOLOv5s was 5.7, 3.5, 13.3, 3.5, 2.9, 2.6, 2.8 and 0.3 percentage points higher than those of SSD, YOLOv3, YOLOv4, YOLOv5s, YOLOv7, YOLOv8n, YOLOv8s and YOLOv9, respectively.
Conclusion The BTF-YOLOv5s model shows clear advantages in identifying defective apples, providing technical support for picking robots to automatically sort high-quality and defective apples during the picking process.