Abstract:
Objective Post-harvest handling of vegetable baskets in protected greenhouses is still dominated by manual labor, which suffers from low efficiency and high labor intensity and seriously restricts the large-scale, intelligent development of agricultural production. Developing an agricultural robot with autonomous basket-grabbing functionality is a key technical path to breaking this bottleneck and improving production efficiency. Accurate basket pose estimation based on computer vision is the core premise and technical foundation for ensuring the stability and reliability of the robot's grabbing actions. However, the accuracy and real-time performance of existing pose estimation methods struggle to meet actual operational requirements in complex greenhouse environments, so further research and optimization are urgently needed.
Method Based on the YOLOv8-pose baseline model, this approach estimated the basket's pose by detecting its feature points and integrating the PnP algorithm. First, RGB images of baskets against diverse complex backgrounds were captured with a monocular camera to construct a dedicated dataset. Second, the BiFormer module, the GAM attention mechanism, and the Focaler_GIoU loss function were incorporated into the YOLOv8-pose framework to enhance keypoint detection robustness in challenging scenarios involving cluttered backgrounds and occlusions. Finally, leveraging the basket's predefined dimensional parameters and the detected 2D keypoint coordinates, the PnP algorithm was employed to solve for the basket's 3D pose in physical space.
Result The mean average precision (mAP) and precision of keypoint detection increased by 3.73 and 4.31 percentage points, respectively. The average positioning error decreased by 5.20 pixels, and the root mean square error (RMSE) between the detected keypoints and manually annotated keypoints decreased by 4.45 pixels on average. The pose estimation algorithm achieved higher accuracy when the camera was 1.7 to 1.9 m from the basket, highlighting the critical influence of the camera-to-basket distance on localization precision.
Conclusion The method proposed in this study provides a low-cost, high-precision solution for basket pose estimation in protected greenhouse scenarios and offers technical support for agricultural robots grasping baskets.