Abstract: To enable harvesting robots to quickly and accurately detect apples of varying maturity levels in complex orchard environments (including different lighting conditions, leaf occlusion, dense apple clusters, and ultra-long-range vision scenarios), we propose an apple detection model based on an improved YOLOv8. First, the Efficient Multi-scale Attention (EMA) module is integrated into YOLOv8 so that the model focuses on the regions of interest for fruit detection and suppresses generic feature information such as background and foliage occlusion, thereby improving detection accuracy for occluded fruits. Second, the original C2f module is replaced with a more efficient three-branch Dilation-Wise Residual (DWR) module for feature extraction, which enhances small-object detection through multi-scale feature fusion. Simultaneously, inspired by the DAMO-YOLO design, the original YOLOv8 neck is reconstructed to efficiently fuse high-level semantic and low-level spatial features. Finally, the model is optimized with the Inner-SIoU loss function to further improve recognition accuracy. On a test set of apples in complex orchard environments, the proposed algorithm achieves Precision, Recall, mAP@0.5, mAP@0.5:0.95, and F1 scores of 86.1%, 89.2%, 94.0%, 64.4%, and 87.6%, respectively. The improved algorithm outperforms the original model on most metrics and demonstrates strong robustness in comparative experiments with varying fruit counts, offering practical value for the precise fruit identification required by harvesting robots in complex environments.