Abstract: RGB-Infrared person re-identification is a challenging task that aims to match person images across the visible and infrared modalities, and it plays a crucial role in criminal investigation and intelligent surveillance. To address the weak fine-grained feature extraction capability of current cross-modal person re-identification methods, this paper proposes a person re-identification model based on fused attention and feature enhancement. First, an automatic data augmentation technique is employed to mitigate viewpoint and scale differences among cameras. Second, a cross-attention multi-scale Vision Transformer is introduced to generate more discriminative feature representations by processing multi-scale features. Furthermore, channel attention and spatial attention mechanisms are incorporated to learn the information most useful for distinguishing identities when fusing visible and infrared image features. Finally, a loss function based on adaptive weights is designed, incorporating a hard triplet loss to strengthen the correlations among samples and improve the model's ability to distinguish different persons across visible and infrared images. Extensive experiments on the SYSU-MM01 and RegDB datasets show that the proposed method achieves mAP scores of 68.05% and 85.19%, respectively, outperforming previous works. Moreover, ablation studies and comparative analysis validate the superiority and effectiveness of the proposed model.