Cross-modal person re-identification based on fused attention and feature enhancement

doi:10.13878/j.cnki.jnuist.20240330001

2024, Vol. 16, No. 4, pp. 451-460

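Authors:
HUANG Chihan (黄驰涵), School of Design Art and Media, Nanjing University of Science & Technology, Nanjing 210094, China
SHEN Xiaobo (沈肖波), School of Computer Science and Engineering, Nanjing University of Science & Technology, Nanjing 210094, China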
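CLC number: TP391.41
Fund project: National Natural Science Foundation of China (62176126); Natural Science Foundation of Jiangsu Province, Excellent Young Scientists Fund (BK20230095)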


Abstract:

RGB-infrared person re-identification (Re-ID) is a challenging task that aims to match person images across the visible and infrared modalities, and it plays an important role in criminal investigation and intelligent video surveillance. To address the weak extraction of fine-grained features in current cross-modal person Re-ID, this paper proposes a person re-identification model based on fused attention and feature enhancement. First, automatic data augmentation is used to mitigate differences in viewpoint and scale among cameras, and a cross-attention multi-scale Vision Transformer processes multi-scale features to generate more discriminative feature representations. Next, channel attention and spatial attention mechanisms are introduced so that, when visible and infrared image features are fused, the model learns the information that matters for discriminating features. Finally, the loss function adopts a hard triplet loss with adaptive weights, which strengthens the correlation among samples and improves the ability to distinguish different persons in visible and infrared images. Extensive experiments on the SYSU-MM01 and RegDB datasets show that the proposed method achieves mAP of 68.05% and 85.19%, respectively, outperforming many state-of-the-art approaches. Ablation experiments and comparative analysis further validate the superiority and effectiveness of the proposed model.

Key words: person re-identification (Re-ID); cross-modal; cross attention; feature extraction; multi-scale
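
The abstract names a hard triplet loss with adaptive weights but does not give its formula on this page. As a rough illustration only, the sketch below assumes one common form of the idea: for each anchor, positive distances are weighted by a softmax (farther positives count more) and negative distances by a softmax over negated distances (closer negatives count more), followed by a margin hinge. The function name `weighted_hard_triplet_loss` and the parameters `margin` and `sharpness` are hypothetical and are not taken from the paper; the paper's actual weighting scheme may differ.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_hard_triplet_loss(embeddings, labels, margin=0.3, sharpness=1.0):
    """Triplet loss with softmax-weighted positives and negatives.

    Assumption (not from the paper): weights come from a softmax over
    scaled distances. A large `sharpness` approaches batch-hard mining
    (hardest positive, hardest negative); `sharpness=0` averages uniformly.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        # Distances to same-identity samples, excluding the anchor itself.
        pos = [euclidean(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        # Distances to samples of every other identity.
        neg = [euclidean(anchor, e)
               for e, l in zip(embeddings, labels) if l != label]
        if not pos or not neg:
            continue  # this anchor cannot form a triplet
        w_pos = softmax([sharpness * d for d in pos])   # far positives weigh more
        w_neg = softmax([-sharpness * d for d in neg])  # near negatives weigh more
        d_pos = sum(w * d for w, d in zip(w_pos, pos))
        d_neg = sum(w * d for w, d in zip(w_neg, neg))
        losses.append(max(0.0, d_pos - d_neg + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two identities whose features interleave on a line.
features = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
identities = [0, 0, 1, 1]
loss = weighted_hard_triplet_loss(features, identities, margin=0.3, sharpness=1.0)
```

In a cross-modal setting the positives and negatives would typically be drawn from both visible and infrared samples in the batch; this sketch treats all embeddings uniformly and leaves modality handling out.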

Cite this article

HUANG Chihan, SHEN Xiaobo. Cross-modal person re-identification based on fused attention and feature enhancement[J]. Journal of Nanjing University of Information Science & Technology, 2024, 16(4): 451-460


History

Received: 2024-03-30
Published online: 2024-08-07
Publication date: 2024-07-28

