Abstract: To enable picking robots to quickly and accurately detect apple fruits at varying maturity levels in complex orchard environments (e.g., variable lighting, leaf occlusion, dense fruit clusters, and long viewing distances), this study proposes an apple fruit detection model based on an improved YOLOv8. First, the EMA (Efficient Multi-scale Attention) module is integrated into YOLOv8, focusing the model on fruit regions of interest, suppressing general feature information such as background and branch-leaf occlusion, and improving detection accuracy for occluded fruits. Second, the original C2f module is reconstructed with a more efficient three-branch DWR (Dilation-wise Residual) module for feature extraction, which strengthens small-object detection through multi-scale feature fusion. In addition, following the DAMO-YOLO design, the original YOLOv8 neck is rebuilt to efficiently fuse high-level semantic and low-level spatial features. Finally, the model is optimized with the Inner-SIoU loss function to improve recognition accuracy. With apples as the detection target in complex orchard environments, the proposed algorithm achieves a precision (P) of 86.1%, recall (R) of 89.2%, mAP50 of 94.0%, mAP50-95 of 64.4%, and F1 score of 87.6% on the test set. The improved algorithm outperforms the original model on most metrics and shows strong robustness in comparative experiments with different numbers of fruits, offering practical value for the precise fruit identification required by picking robots in complex environments.
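As a minimal illustration of the bounding-box loss named in the abstract, the sketch below implements only the "inner" auxiliary-box IoU that underlies Inner-SIoU: the IoU is computed on boxes scaled by a `ratio` factor around the same centers, which sharpens the gradient for high- or low-overlap samples. This is a plain-Python sketch under assumed conventions (boxes as `(cx, cy, w, h)` tuples; default `ratio=0.7`); the full Inner-SIoU additionally adds SIoU's angle, distance, and shape penalty terms, which are omitted here.

```python
def inner_iou(box1, box2, ratio=0.7):
    """IoU of auxiliary boxes shrunk (ratio < 1) or grown (ratio > 1)
    around the same centers, following the Inner-IoU idea.
    Boxes are (cx, cy, w, h) tuples (an assumed convention)."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = box1, box2
    # Half-extents of the scaled auxiliary boxes.
    hw1, hh1 = w1 * ratio / 2, h1 * ratio / 2
    hw2, hh2 = w2 * ratio / 2, h2 * ratio / 2
    # Intersection of the two auxiliary boxes.
    iw = max(0.0, min(cx1 + hw1, cx2 + hw2) - max(cx1 - hw1, cx2 - hw2))
    ih = max(0.0, min(cy1 + hh1, cy2 + hh2) - max(cy1 - hh1, cy2 - hh2))
    inter = iw * ih
    # Union of the auxiliary boxes (areas scale by ratio**2).
    union = (w1 * h1 + w2 * h2) * ratio ** 2 - inter
    return inter / union if union > 0 else 0.0


def inner_iou_loss(box1, box2, ratio=0.7):
    """Regression loss component: 1 - Inner-IoU."""
    return 1.0 - inner_iou(box1, box2, ratio)
```

In practice this term would replace the plain IoU inside the detector's box-regression loss, with `ratio` tuned on a validation set.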