Unsupervised adversarial example detection method based on image transformation
Affiliation:

1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University; 2. Wuhan Vocational College of Software and Engineering (Wuhan Open University)

Clc Number:

T391.4


Abstract:

Deep neural networks (DNNs) are vulnerable to specially crafted adversarial examples and can be easily deceived. Although existing detection techniques can identify some malicious inputs, their protection remains insufficient against sophisticated attacks. This paper proposes a novel unsupervised adversarial example detection method that requires only unlabeled data. The core idea is to recast adversarial example detection as an anomaly detection problem through feature construction and fusion. To this end, five core components are designed: image transformation, a neural network classifier, heatmap generation, distance calculation, and an anomaly detector. First, the original image is transformed, and both the original and transformed images are fed into the neural network classifier. The prediction probability vectors and convolutional-layer features are extracted to generate heatmaps, extending the detector from the model's output layer alone to input-layer features and strengthening its ability to model and measure the differences between adversarial and normal samples. Then, the KL divergence between the probability vectors and the shift distance of the heatmap focal points before and after transformation are computed, and these distance features are fed into the anomaly detector to decide whether the input is an adversarial example. Experiments on the large-scale, high-quality ImageNet dataset show that the proposed detector achieves an average AUC of 0.77 against five different types of attacks, demonstrating good detection performance. Compared with other state-of-the-art unsupervised adversarial example detectors, it attains a TPR at least 15.88% higher at a comparable false-alarm rate, indicating a significant advantage in detection capability.
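
The pipeline described above can be summarized in a short sketch. The snippet below is a minimal, hypothetical illustration of the described steps, assuming a rotation as the image transformation, a channel-mean activation map as the heatmap, and scikit-learn's IsolationForest as the anomaly detector; all function and variable names (e.g. distance_features, heatmap_focus, clean_images) are illustrative and not taken from the paper.

# Hypothetical sketch of the detection pipeline described in the abstract.
# The rotation transform, channel-mean heatmap, and IsolationForest anomaly
# detector are assumptions for illustration, not the authors' exact design.
import numpy as np
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from sklearn.ensemble import IsolationForest


def heatmap_focus(feat_map: torch.Tensor) -> np.ndarray:
    """Channel-mean heatmap of a conv feature map (C, H, W); return its argmax as (row, col)."""
    heat = feat_map.mean(dim=0)
    idx = torch.argmax(heat)
    return np.array(np.unravel_index(idx.item(), heat.shape), dtype=float)


def distance_features(model, conv_layer, image: torch.Tensor) -> np.ndarray:
    """Two features per image: KL divergence of predictions and heatmap focal-point shift."""
    feats = {}
    hook = conv_layer.register_forward_hook(
        lambda m, i, o: feats.__setitem__("conv", o.detach()[0]))
    transformed = TF.rotate(image, angle=10.0)          # assumed image transformation
    with torch.no_grad():
        p = F.softmax(model(image.unsqueeze(0)), dim=1)[0]
        feat_orig = feats["conv"]
        q = F.softmax(model(transformed.unsqueeze(0)), dim=1)[0]
        feat_trans = feats["conv"]
    hook.remove()
    kl = F.kl_div(q.log(), p, reduction="sum").item()   # KL(p || q)
    shift = np.linalg.norm(heatmap_focus(feat_orig) - heatmap_focus(feat_trans))
    return np.array([kl, shift])


def build_and_run_detector(model, conv_layer, clean_images, test_images):
    """Fit the anomaly detector on distance features of unlabeled clean images, then score test inputs."""
    model.eval()
    train = np.stack([distance_features(model, conv_layer, x) for x in clean_images])
    detector = IsolationForest(contamination=0.1, random_state=0).fit(train)
    test = np.stack([distance_features(model, conv_layer, x) for x in test_images])
    return detector.predict(test)  # -1 = flagged as adversarial, 1 = normal

In this sketch, a large KL divergence or a large focal-point shift between the original and transformed image is treated as an anomaly signal, matching the intuition that predictions and attention on adversarial examples are less stable under benign transformations.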

History
  • Received: March 21, 2024
  • Revised: May 15, 2024
  • Accepted: May 16, 2024
  • Online:
  • Published:
