Multi-feature pedestrian recognition based on cross-attention mechanism

2025-7-5- 21

Author: Wu Xinyi, DENG Zhiliang, LIU Yunping, DONG Juan, LI Jiaqi
Affiliation:Nanjing University of Information Science and Technology

Abstract:

Existing person re-identification methods struggle to extract features accurately because environmental noise is easily mistaken for person features. To address this, a multi-feature fusion branch network for person re-identification based on dynamic convolution and attention mechanisms is proposed. First, to cope with uncertain factors such as illumination changes, human pose, and object occlusion, dynamic convolution replaces the static convolution in ResNet50, yielding a more robust Dy-ResNet50 model. Second, because camera viewpoints vary widely and people are often occluded by objects, a self-attention mechanism and a cross-attention mechanism are embedded into the backbone network. Finally, the cross-entropy loss and the hard-sample triplet loss are used jointly as the loss function. Experiments are carried out on the public DukeMTMC-ReID, Market-1501, and MSMT17 datasets, and the model is compared with mainstream network models. The results show that on DukeMTMC-ReID the Rank-1 and mAP of the proposed model are 0.9 and 1.6 percentage points higher, respectively, than those of the current mainstream model.
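
The joint loss named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the common batch-hard form of the hard-sample triplet loss, Euclidean distances, a margin of 0.3, and an equal weighting of the two terms, none of which are stated in the abstract. The function names are hypothetical.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy over a batch; logits has shape (N, C)."""
    # Subtract the row maximum for a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def batch_hard_triplet(embeddings, labels, margin=0.3):
    """Batch-hard triplet loss (assumed variant; margin is an assumption)."""
    # Pairwise Euclidean distances, shape (N, N).
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2) + 1e-12)
    same = labels[:, None] == labels[None, :]
    # Hardest positive: the farthest sample sharing the anchor's identity.
    hardest_pos = np.where(same, dist, -np.inf).max(axis=1)
    # Hardest negative: the closest sample with a different identity.
    hardest_neg = np.where(~same, dist, np.inf).min(axis=1)
    return np.maximum(hardest_pos - hardest_neg + margin, 0.0).mean()

def combined_loss(logits, embeddings, labels, margin=0.3, weight=1.0):
    """Sum of both terms; the weight of 1.0 is an illustrative assumption."""
    return (cross_entropy(logits, labels)
            + weight * batch_hard_triplet(embeddings, labels, margin))
```

In this form the cross-entropy term trains the identity classifier, while the triplet term pulls each anchor's farthest same-identity sample closer than its nearest different-identity sample by at least the margin.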

Key words: person re-identification; dynamic convolution; self-attention; cross-attention

History

Received: November 13, 2023
Revised: December 28, 2023
Accepted: January 12, 2024

Address：No. 219, Ningliu Road, Nanjing, Jiangsu Province

Postcode：210044

Phone：025-58731025
