LESO-Net: A Lightweight and Efficient Segmentation Network for Small Objects

Author: Ding Zhenglong, Hu Yifan, Du Yuanhong, Xu Weijie, Wei Yamei, Yao Xuan

Affiliation: Nanjing University of Information Science and Technology, Nanjing, China

Fund Project: The National Natural Science Foundation of China (General Program, Key Program, Major Research Plan)

    Abstract:

    Small objects in images often pose significant challenges for segmentation owing to their irregular shapes and blurred boundaries: feature extraction is difficult, edge details are easily lost, and noise interference is pronounced. To address these problems, this paper proposes LESO-Net, a lightweight and efficient small object segmentation network based on the YOLOv8n-seg model. First, Deformable Convolutional Networks v2 (DCNv2) are used to replace the C2f module in the backbone, improving feature extraction and adaptive generalization for small objects of varying shapes. Second, a Large Separable Kernel Attention (LSKA) module is introduced into the neck network, improving segmentation accuracy while reducing computational complexity and memory usage. Finally, the loss function is optimized to alleviate class imbalance and insufficient bounding-box accuracy. Experiments on a self-built bubble dataset and the public high-resolution SAR image dataset (HRSID) show that, compared with the original YOLOv8n-seg model, LESO-Net improves precision by 1.2 and 2.5 percentage points and mAP@0.5 by 0.2 and 1.2 percentage points, respectively, while reducing the number of parameters by about 10%. These results demonstrate that LESO-Net achieves good overall performance and meets the requirements of small object segmentation tasks in complex scenes.
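    To make the LSKA idea concrete: the module approximates a large k x k depthwise attention kernel with cascaded 1-D separable convolutions, which is where the reductions in computational complexity and memory usage claimed above come from. The following is a minimal PyTorch sketch of an LSKA-style block in the spirit of Lau et al. [44]; the class name, the kernel size k=21, the dilation d=3, and the usage example are illustrative assumptions, not the authors' LESO-Net implementation.

    # Illustrative sketch only: an LSKA-style attention block after Lau et al. [44].
    # k=21 and d=3 are assumed example hyperparameters, not values from this paper.
    import torch
    import torch.nn as nn

    class LSKA(nn.Module):
        """Approximate a k x k depthwise attention with cascaded 1-D depthwise convs."""

        def __init__(self, dim: int, k: int = 21, d: int = 3):
            super().__init__()
            n = 2 * d - 1   # local depthwise kernel length (5 when d=3)
            m = -(-k // d)  # ceil(k/d): dilated depthwise kernel length (7)
            # Local context: a (2d-1)x(2d-1) depthwise kernel separated into 1-D strips.
            self.dw_h = nn.Conv2d(dim, dim, (1, n), padding=(0, n // 2), groups=dim)
            self.dw_v = nn.Conv2d(dim, dim, (n, 1), padding=(n // 2, 0), groups=dim)
            # Long-range context: a dilated depthwise kernel, also separated into strips.
            pad = (m // 2) * d
            self.dwd_h = nn.Conv2d(dim, dim, (1, m), padding=(0, pad),
                                   dilation=(1, d), groups=dim)
            self.dwd_v = nn.Conv2d(dim, dim, (m, 1), padding=(pad, 0),
                                   dilation=(d, 1), groups=dim)
            self.pw = nn.Conv2d(dim, dim, 1)  # 1x1 conv for channel interaction

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            attn = self.dw_v(self.dw_h(x))       # local spatial context
            attn = self.dwd_v(self.dwd_h(attn))  # enlarged receptive field
            attn = self.pw(attn)                 # mix channels
            return x * attn                      # attention re-weights the input

    if __name__ == "__main__":
        feat = torch.randn(1, 64, 40, 40)  # e.g. a neck-level feature map
        assert LSKA(64)(feat).shape == feat.shape

    Separating each 2-D depthwise kernel into a horizontal strip and a vertical strip lowers the per-layer kernel cost from O(k^2) to O(k), which is consistent with the complexity and memory savings the abstract attributes to LSKA.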

    References
    [1] Y. Sun, L. Su, S. Yuan and H. Meng, "DANet: Dual-Branch Activation Network for Small Object Instance Segmentation of Ship Images," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 11, pp. 6708-6720, Nov. 2023, doi: 10.1109/TCSVT.2023.3267127.
    [2] F. Quinton, R. Popoff, B. Presles et al., “A tumour and liver automatic segmentation (ATLAS) dataset on contrast-enhanced magnetic resonance imaging for hepatocellular carcinoma,” Data, vol. 8, no. 5, p. 79, 2023.
    [3] Li Menghao, Yuan Sannan. Traffic sign detection algorithm based on improved YOLOv5s [J]. Journal of Nanjing University of Information Science and Technology (Natural Science Edition), 2024, 16(1): 11-19. (in Chinese)
    [4] H. Xu, S. Li, and P. Xu, "Segmentation of Flotation Pulp Phase Bubbles Image Based on DA-Attention U-Net," Nonferrous Metals (Mineral Processing Section), no. 6, pp. 106-115, 131, 2024.
    [5] Z. Ding, W. Song, J. Xu et al., "A Deep-Learning-Based Low-Cost Micro-Leakage Measurement System for Industrial Applications," in IEEE/ASME Transactions on Mechatronics, vol. 29, no. 1, pp. 119-130, Feb. 2024, doi: 10.1109/TMECH.2023.3272797.
    [6] C. Pelletier, S. Valero, J. Inglada et al., "Assessing the robustness of random forests to map land cover with high resolution satellite image time series over large areas", Remote Sens. Environ., vol. 187, pp. 156-168, Dec. 2016.
    [7] H. Wang, C. Wang and H. Wu, "Using GF-2 imagery and the conditional random field model for urban forest cover mapping", Remote Sens. Lett., vol. 7, no. 4, pp. 378-387, Apr. 2016.
    [8] E. H. Houssein, K. Hussain, L. Abualigah et al., "An improved opposition-based marine predators algorithm for global optimization and multilevel thresholding image segmentation," Knowledge-Based Systems, Art. no. 107348, 2021.
    [9] H Gao, G Zhou, Y Cao et al., "Research on Edge Detection and Image Segmentation of Cabinet Region Based on Edge Computing Joint Image Detection Algorithms", International Journal of Reliability Quality and Safety Engineering, 2022.
    [10] D. Guo, L. Zhu, Y. Lu, H. Yu et al., "Small object sensitive segmentation of urban street scene with spatial adjacency between object classes", IEEE Trans. Image Process., vol. 28, no. 6, pp. 2643-2653, Jun. 2019.
    [11] Wang Zimin, Zhou Yue, Guan Tingqiang et al. Multifidus MRI image segmentation algorithm based on improved U2-Net [J]. Journal of Nanjing University of Information Science and Technology (Natural Science Edition), 2024, 16(3): 364-373. (in Chinese)
    [12] H. Li, Z. Lin, X. Shen et al., "A convolutional neural network cascade for face detection", Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 5325-5334, 2015.
    [13] S. Chandra and I. Kokkinos, "Fast exact and multi-scale inference for semantic image segmentation with deep Gaussian CRFs", Proc. Eur. Conf. Comput. Vis., pp. 402-418, 2016.
    [14] S. Zheng et al., "Conditional random fields as recurrent neural networks", Proc. IEEE Int. Conf. Comput. Vis., pp. 1529-1537, 2015.
    [15] N. V, P. G. Acharya, B. Suhas Krishnaprasad et al., "Design and Evaluation of a Real-Time Semantic Segmentation System for Autonomous Driving," 2024 3rd International Conference for Innovation in Technology (INOCON), Bangalore, India, 2024, pp. 1-6, doi: 10.1109/INOCON60754.2024.10511680.
    [16] S. Minaee, Y. Boykov, F. Porikli et al., "Image Segmentation Using Deep Learning: A Survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 7, pp. 3523-3542, Jul. 2022, doi: 10.1109/TPAMI.2021.3059968.
    [17] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
    [18] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Munich, Germany, Oct. 2015, Proceedings, Part III, Springer, 2015.
    [19] H. Zhao et al., "Pyramid scene parsing network," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
    [20] L. Chen et al., "Encoder-decoder with atrous separable convolution for semantic image segmentation," Proceedings of the European Conference on Computer Vision (ECCV), 2018.
    [21] K. He, G. Gkioxari, P. Dollár et al., "Mask R-CNN," Proceedings of the IEEE International Conference on Computer Vision, pp. 2961-2969, 2017.
    [22] A. Ma, J. Wang, Y. Zhong, and Z. Zheng, "FactSeg: Foreground activation-driven small object semantic segmentation in large-scale remote sensing imagery," IEEE Trans. Geosci. Remote Sens., vol. 60, 2022.
    [23] J. Li et al., "Class-incremental learning network for small objects enhancing of semantic segmentation in aerial imagery", IEEE Trans. Geosci. Remote Sens., vol. 60, 2022.
    [24] Y. Sun, L. Su, S. Yuan et al., "DANet: Dual-Branch Activation Network for Small Object Instance Segmentation of Ship Images," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 11, pp. 6708-6720, Nov. 2023, doi: 10.1109/TCSVT.2023.3267127.
    [25] G. Liu, Z. Chen, D. Liu et al., "FTMF-Net: A Fourier Transform-Multiscale Feature Fusion Network for Segmentation of Small Polyp Objects," in IEEE Transactions on Instrumentation and Measurement, vol. 72, pp. 1-15, 2023, Art no. 5020815, doi: 10.1109/TIM.2023.3293880.
    [26] Y. Liu, Z. Shao and N. Hoffmann, "Global attention mechanism: Retain information to enhance channel-spatial interactions", arXiv:2112.05561, 2021.
    [27] D. Ouyang et al., "Efficient Multi-Scale Attention Module with Cross-Spatial Learning," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10096516.
    [28] S. Woo, J. Park, J.-Y. Lee et al., "CBAM: Convolutional block attention module", Proc. Eur. Conf. Comput. Vis. (ECCV), pp. 3-19, 2018.
    [29] S. Zhou, Y. Zhao and D. Guo, "YOLOv5-GE Vehicle Detection Algorithm Integrating Global Attention Mechanism," 2022 3rd International Conference on Information Science, Parallel and Distributed Systems (ISPDS), Guangzhou, China, 2022, pp. 439-444, doi: 10.1109/ISPDS56360.2022.9874196.
    [30] C. Wang, X. Wang and Y. Sun, "Multi-Scale Convolutional Neural Network Fault Diagnosis Based on Attention Mechanism," 2023 42nd Chinese Control Conference (CCC), Tianjin, China, 2023, pp. 1-6, doi: 10.23919/CCC58697.2023.10241216.
    [31] S. Xie, R. Girshick, P. Dollár et al., "Aggregated Residual Transformations for Deep Neural Networks," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pp. 5987-5995, doi: 10.1109/CVPR.2017.634.
    [32] R. Sunkara and T. Luo, "No more strided convolutions or pooling: A new CNN building block for low-resolution images and small objects," Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, Cham, 2022.
    [33] S. Guo et al., "Intrusion object recognition of rail perimeter using an improved YOLOv5 algorithm," 2023 29th International Conference on Mechatronics and Machine Vision in Practice (M2VIP), Queenstown, New Zealand, 2023, pp. 1-6, doi: 10.1109/M2VIP58386.2023.10413417.
    [34] S. Xu, Y. Ji, G. Wang et al., "GFSPP-YOLO: A Light YOLO Model Based on Group Fast Spatial Pyramid Pooling," 2023 IEEE 11th International Conference on Information, Communication and Networks (ICICN), Xi'an, China, 2023, pp. 733-738, doi: 10.1109/ICICN59530.2023.10393445.
    [35] Z. Ding, W. Song and S. Zhan, "A Measurement System for the Tightness of Sealed Vessels Based on Machine Vision Using Deep Learning Algorithm," in IEEE Transactions on Instrumentation and Measurement, vol. 71, pp. 1-15, 2022, Art no. 5007115, doi: 10.1109/TIM.2022.3158989.
    [36] Z. Ding, Z. Yin, W. Song et al., "Simulations of Bubble Formation and Oscillation Behavior in Micro-Leakage Measurement System," in IEEE Transactions on Instrumentation and Measurement, vol. 72, pp. 1-9, 2023, Art no. 7502709, doi: 10.1109/TIM.2023.3261926.
    [37] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv:2004.10934, 2020.
    [38] G. Jocher et al., "YOLOv5: A State-of-the-Art Object Detection System," Ultralytics, 2020. Available online: https://github.com/ultralytics/YOLOv5
    [39] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors," Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, Jun. 2023, pp. 7464-7475.
    [40] J. Redmon, S. Divvala, R. Girshick et al., "You only look once: Unified, real-time object detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, Jun. 2016.
    [41] V. M. R, P. Rao and Lakshmeesha, "Advancements in Semantic Skin Lesion Segmentation: A Comparative Study with YOLOv8 for Accurate Skin Lesion Segmentation," 2023 Global Conference on Information Technologies and Communications (GCITC), Bangalore, India, 2023, pp. 1-6, doi: 10.1109/GCITC60406.2023.10425934.
    [42] Zhang Jiandong. Maritime object detection integrating deep supervision and improved YOLOv8 [J]. Journal of Nanjing University of Information Science and Technology (Natural Science Edition), 2024, 16(4): 482-489. (in Chinese)
    [43] X. Zhu, H. Hu, S. Lin et al., "Deformable ConvNets V2: More Deformable, Better Results," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9300-9308, 2019.
    [44] K. W. Lau, L. M. Po, and Y. A. U. Rehman, "Large separable kernel attention: Rethinking the large kernel attention design in CNN," Expert Systems with Applications, Art. no. 121352, 2024.
    [45] J. Yu, Y. Jiang, Z. Wang, Z. Cao et al., "UnitBox: An advanced object detection network," Proceedings of the 24th ACM International Conference on Multimedia, pp. 516-520, 2016.
    [46] H. Rezatofighi, N. Tsoi, J. Gwak et al., "Generalized intersection over union: A metric and a loss for bounding box regression", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 658-666, 2019.
    [47] T. Y. Lin, P. Goyal, R. Girshick et al., "Focal loss for dense object detection", Proceedings of the IEEE international conference on computer vision, pp. 2980-2988, 2017.
    [48] S. Wei, X. Zeng, Q. Qu et al., "HRSID: A High-Resolution SAR Images Dataset for Ship Detection and Instance Segmentation," IEEE Access, vol. 8, pp. 120234-120254, 2020.
    [49] Q. -L. Zhang and Y. -B. Yang, "SA-Net: Shuffle Attention for Deep Convolutional Neural Networks," ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 2021, pp. 2235-2239, doi: 10.1109/ICASSP39728.2021.9414568.
    [50] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
Cite this article

Ding Zhenglong, Hu Yifan, Du Yuanhong, Xu Weijie, Wei Yamei, Yao Xuan. LESO-Net: A Lightweight and Efficient Segmentation Network for Small Objects [J]. Journal of Nanjing University of Information Science and Technology.

History
  • Received: 2024-12-28
  • Revised: 2025-02-11
  • Accepted: 2025-02-13
