Fused Attention and Feature Enhancement Based Cross-Modal Person Re-Identification

Affiliation: Nanjing University of Science and Technology




    Abstract:

    RGB-infrared person re-identification, which matches person images across the visible and infrared modalities, is a challenging task that plays a crucial role in criminal investigation and intelligent video surveillance. To address the weak fine-grained feature extraction of current cross-modal person re-identification methods, this paper proposes a person re-identification model based on fused attention and feature enhancement. First, an automatic data augmentation technique is employed to mitigate the viewpoint and scale differences among cameras. Second, a cross-attention multi-scale Vision Transformer is introduced, which processes multi-scale features to generate more discriminative feature representations. Channel and spatial attention mechanisms are then proposed to learn the information that matters for distinguishing identities when fusing visible and infrared image features. Finally, a hard triplet loss with adaptive weights is designed, which strengthens the correlation between samples and improves the ability to discriminate between different persons across visible and infrared images. Extensive experiments on the SYSU-MM01 and RegDB datasets show that the proposed method achieves mAP scores of 68.05% and 85.19%, respectively, outperforming previous work; ablation experiments and comparative analysis further validate the superiority and effectiveness of the proposed model.
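The cross-attention multi-scale Vision Transformer mentioned in the abstract can be illustrated with a generic single-head cross-attention step in the style of CrossViT, where a token from one scale branch attends to the patch tokens of the other. This is a minimal NumPy sketch, not the paper's exact module; the projection weights are random stand-ins for learned parameters, and the token counts and dimensions are illustrative assumptions.

```python
import numpy as np

def cross_attention(q_tokens, kv_tokens, dim=16, seed=0):
    """One cross-attention head: queries from one scale branch attend to
    keys/values from the other branch. Weights are random stand-ins for
    learned projections. q_tokens: (Nq, D); kv_tokens: (Nk, D)."""
    rng = np.random.default_rng(seed)
    wq = rng.standard_normal((q_tokens.shape[1], dim)) * 0.1
    wk = rng.standard_normal((kv_tokens.shape[1], dim)) * 0.1
    wv = rng.standard_normal((kv_tokens.shape[1], dim)) * 0.1
    q = q_tokens @ wq                         # (Nq, dim)
    k = kv_tokens @ wk                        # (Nk, dim)
    v = kv_tokens @ wv                        # (Nk, dim)
    scores = q @ k.T / np.sqrt(dim)           # scaled dot-product, (Nq, Nk)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)   # row-wise softmax
    return attn @ v                           # (Nq, dim)

# A class token from the coarse-scale branch gathers information from the
# fine-scale branch's 7x7 grid of patch tokens (sizes are illustrative).
cls_coarse = np.random.default_rng(1).standard_normal((1, 32))
patch_fine = np.random.default_rng(2).standard_normal((49, 32))
fused = cross_attention(cls_coarse, patch_fine)
print(fused.shape)  # (1, 16)
```

Exchanging tokens in both directions and stacking such blocks is how multi-scale branches typically share information; the paper's actual block design may differ.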
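The channel and spatial attention used when fusing visible and infrared features can be sketched in the general style of CBAM-like modules: channel attention re-weights feature channels from pooled descriptors, and spatial attention gates each location. This NumPy sketch uses random weights in place of learned ones and a fixed gate in place of the usual learned convolution, so it only illustrates the mechanism, not the paper's specific design.

```python
import numpy as np

def channel_attention(x, reduction=4, seed=0):
    """Channel attention: pool spatial dims, pass descriptors through a
    shared bottleneck MLP, sigmoid-gate each channel. x: (C, H, W)."""
    c = x.shape[0]
    avg = x.mean(axis=(1, 2))                 # average-pooled descriptor, (C,)
    mx = x.max(axis=(1, 2))                   # max-pooled descriptor, (C,)
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((c // reduction, c)) * 0.1   # random stand-ins
    w2 = rng.standard_normal((c, c // reduction)) * 0.1   # for learned weights
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0)            # bottleneck + ReLU
    gate = 1 / (1 + np.exp(-(mlp(avg) + mlp(mx))))        # sigmoid, (C,)
    return x * gate[:, None, None]

def spatial_attention(x):
    """Spatial attention: pool across channels, gate each location.
    A learned conv normally mixes the pooled maps; a fixed mean stands in."""
    avg = x.mean(axis=0, keepdims=True)       # (1, H, W)
    mx = x.max(axis=0, keepdims=True)         # (1, H, W)
    gate = 1 / (1 + np.exp(-(avg + mx) / 2))  # sigmoid gate per location
    return x * gate

feat = np.random.default_rng(1).standard_normal((8, 4, 4))
out = spatial_attention(channel_attention(feat))
print(out.shape)  # (8, 4, 4)
```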
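The adaptive-weight hard triplet loss can be illustrated as a soft variant of batch-hard triplet mining: rather than picking only the single hardest positive and negative per anchor, distances are combined with softmax weights that emphasize far positives and near negatives. The exact weighting scheme below is an assumption for illustration, not necessarily the paper's formulation.

```python
import numpy as np

def adaptive_hard_triplet_loss(feats, labels, margin=0.3):
    """Triplet loss with softmax-style adaptive weights over pair distances.
    feats: (N, D) embeddings; labels: (N,) identity ids.
    Far positives and near negatives get larger weights, a soft version
    of batch-hard mining (illustrative weighting, not the paper's exact one)."""
    n = feats.shape[0]
    # Pairwise Euclidean distance matrix
    sq = (feats ** 2).sum(axis=1)
    d = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * feats @ feats.T, 1e-12))
    losses = []
    for i in range(n):
        pos = (labels == labels[i]) & (np.arange(n) != i)
        neg = labels != labels[i]
        if not pos.any() or not neg.any():
            continue
        dp, dn = d[i, pos], d[i, neg]
        wp = np.exp(dp) / np.exp(dp).sum()      # emphasize far positives
        wn = np.exp(-dn) / np.exp(-dn).sum()    # emphasize near negatives
        losses.append(max(0.0, margin + (wp * dp).sum() - (wn * dn).sum()))
    return float(np.mean(losses))

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])   # 4 identities, 2 samples each
loss = adaptive_hard_triplet_loss(feats, labels)
print(loss)
```

In training, such a loss is computed per mini-batch on the fused embeddings so that samples of the same identity are pulled together across modalities.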

Cite this article:

HUANG Chihan, SHEN Xiaobo. Fused Attention and Feature Enhancement Based Cross-Modal Person Re-Identification[J]. Journal of Nanjing University of Information Science & Technology.

History
  • Received: 2024-03-30
  • Revised: 2024-06-08
  • Accepted: 2024-06-11

Address: No. 219 Ningliu Road, Nanjing, Jiangsu, China    Postal code: 210044

Tel: 025-58731025    E-mail: nxdxb@nuist.edu.cn

Journal of Nanjing University of Information Science & Technology ® 2024. Technical support: Beijing Qinyun Technology Development Co., Ltd.