Unsupervised adversarial example detection method based on image transformation
Affiliation:

1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University; 2. Wuhan Vocational College of Software and Engineering (Wuhan Open University)

Clc Number:

T391.4


Abstract:

Deep neural networks (DNNs) are vulnerable to specially crafted adversarial examples and can be easily deceived. Although existing detection techniques can identify some malicious inputs, their protection remains insufficient against sophisticated attacks. This paper proposes a novel unsupervised adversarial example detection method that requires only unlabeled data. The core idea is to recast adversarial example detection as an anomaly detection problem through feature construction and fusion. To this end, five core components are designed: image transformation, a neural network classifier, heatmap generation, distance calculation, and an anomaly detector. First, the original image is transformed, and both the original and transformed images are fed into the neural network classifier. The prediction probability vectors and convolutional-layer features are extracted to generate heatmaps, extending the detector from the model's output layer alone to input-layer features and strengthening its ability to model and measure the differences between adversarial and normal samples. Then, the KL divergence between the probability vectors and the shift distance of the heatmap focal points before and after transformation are computed, and these distance features are fed into the anomaly detector to decide whether the input is an adversarial example. Experiments on the large-scale, high-quality ImageNet dataset show that the proposed detector achieves an average AUC of 0.77 against five different types of attacks, demonstrating good detection performance. Compared with other state-of-the-art unsupervised adversarial example detectors, it attains a TPR at least 15.88% higher at a comparable false-alarm rate, indicating a significant advantage in detection capability.
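
The pipeline described above can be summarized in a short sketch. The snippet below is a minimal, hypothetical illustration of the described steps, assuming a rotation as the image transformation, a channel-mean activation map as the heatmap, and scikit-learn's IsolationForest as the anomaly detector; all function and variable names (e.g. distance_features, heatmap_focus, clean_images) are illustrative and not taken from the paper.

# Hypothetical sketch of the detection pipeline described in the abstract.
# The rotation transform, channel-mean heatmap, and IsolationForest anomaly
# detector are assumptions for illustration, not the authors' exact design.
import numpy as np
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from sklearn.ensemble import IsolationForest


def heatmap_focus(feat_map: torch.Tensor) -> np.ndarray:
    """Channel-mean heatmap of a conv feature map (C, H, W); return its argmax as (row, col)."""
    heat = feat_map.mean(dim=0)
    idx = torch.argmax(heat)
    return np.array(np.unravel_index(idx.item(), heat.shape), dtype=float)


def distance_features(model, conv_layer, image: torch.Tensor) -> np.ndarray:
    """Two features per image: KL divergence of predictions and heatmap focal-point shift."""
    feats = {}
    hook = conv_layer.register_forward_hook(
        lambda m, i, o: feats.__setitem__("conv", o.detach()[0]))
    transformed = TF.rotate(image, angle=10.0)          # assumed image transformation
    with torch.no_grad():
        p = F.softmax(model(image.unsqueeze(0)), dim=1)[0]
        feat_orig = feats["conv"]
        q = F.softmax(model(transformed.unsqueeze(0)), dim=1)[0]
        feat_trans = feats["conv"]
    hook.remove()
    kl = F.kl_div(q.log(), p, reduction="sum").item()   # KL(p || q)
    shift = np.linalg.norm(heatmap_focus(feat_orig) - heatmap_focus(feat_trans))
    return np.array([kl, shift])


def build_and_run_detector(model, conv_layer, clean_images, test_images):
    """Fit the anomaly detector on distance features of unlabeled clean images, then score test inputs."""
    model.eval()
    train = np.stack([distance_features(model, conv_layer, x) for x in clean_images])
    detector = IsolationForest(contamination=0.1, random_state=0).fit(train)
    test = np.stack([distance_features(model, conv_layer, x) for x in test_images])
    return detector.predict(test)  # -1 = flagged as adversarial, 1 = normal

In this sketch, a large KL divergence or a large focal-point shift between the original and transformed image is treated as an anomaly signal, matching the intuition that predictions and attention on adversarial examples are less stable under benign transformations.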

History
  • Received: March 21, 2024
  • Revised: May 15, 2024
  • Accepted: May 16, 2024
  • Online:
  • Published:
