Cross-modal person re-identification based on fused attention and feature enhancement

Author:

CLC number: TP391.41

Funding: National Natural Science Foundation of China (62176126); Natural Science Foundation of Jiangsu Province for Excellent Young Scholars (BK20230095)

    Abstract:

    RGB-infrared person re-identification (Re-ID) is a challenging task that aims to match person images between the visible and infrared modalities, playing a crucial role in criminal investigation and intelligent video surveillance. To address the weak fine-grained feature extraction capability of current cross-modal person Re-ID methods, this paper proposes a person re-identification model based on fused attention and feature enhancement. First, automatic data augmentation is employed to mitigate the differences in viewpoint and scale among cameras, and a cross-attention multi-scale Vision Transformer is proposed to generate more discriminative feature representations by processing multi-scale features. Then, channel attention and spatial attention mechanisms are introduced to learn the information important for distinguishing features when fusing visible and infrared image features. Finally, a loss function built on an adaptive-weight hard triplet loss is designed to strengthen the correlation between samples and improve the ability to identify different persons across visible and infrared images. Extensive experiments on the SYSU-MM01 and RegDB datasets show that the proposed approach achieves mAP of 68.05% and 85.19%, respectively, outperforming many state-of-the-art approaches. Moreover, ablation experiments and comparative analysis validate the superiority and effectiveness of the proposed model.
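The abstract does not spell out the adaptive-weight hard triplet loss. As a rough illustration of the general idea only, the sketch below implements a softmax-weighted ("soft hard-mining") triplet loss in NumPy: for each anchor, distant positives and close negatives receive larger weights, approximating hardest-pair mining while letting every sample contribute. The function name and the exact weighting scheme are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array
    e = np.exp(x - x.max())
    return e / e.sum()

def adaptive_hard_triplet_loss(features, labels, margin=0.3):
    """Softmax-weighted ('adaptive') hard triplet loss over a batch.

    For each anchor, positive distances are averaged with weights that
    emphasize *far* positives, and negative distances with weights that
    emphasize *close* negatives -- a smooth surrogate for batch-hard mining.
    """
    n = features.shape[0]
    # pairwise Euclidean distance matrix, shape (n, n)
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)
    losses = []
    for i in range(n):
        pos = (labels == labels[i]) & (np.arange(n) != i)
        neg = labels != labels[i]
        if not pos.any() or not neg.any():
            continue  # anchor has no valid triplet in this batch
        wp = softmax(dist[i][pos])    # larger weight on distant positives
        wn = softmax(-dist[i][neg])   # larger weight on close negatives
        d_ap = (wp * dist[i][pos]).sum()
        d_an = (wn * dist[i][neg]).sum()
        losses.append(max(0.0, d_ap - d_an + margin))
    return float(np.mean(losses))
```

With well-separated identity clusters the margin is satisfied and the loss vanishes; when a negative sits closer to the anchor than its positive, the loss becomes positive and pulls the embedding apart.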

Cite this article

HUANG Chihan, SHEN Xiaobo. Cross-modal person re-identification based on fused attention and feature enhancement[J]. Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 2024, 16(4): 451-460.

History
  • Received: 2024-03-30
  • Published online: 2024-08-07
  • Publication date: 2024-07-28
