Research on Domain Adaptive Classification Based on Gradient Weight Pursuit
Affiliation:

1. School of Automation, Nanjing University of Information Science and Technology; 2. Nanjing University of Information Science and Technology

Fund Projects:

Science and Technology Innovation 2030 "New Generation of Artificial Intelligence" Major Project (No. 2018AAA0100400); National Natural Science Foundation of China (Nos. U21B2049, 61936005)

Abstract:

In this paper, we propose a pruning and optimization algorithm based on gradient weight pursuit (GWP) to address overfitting in unsupervised domain adaptation, i.e., the situation in which accuracy on the downstream task is far lower than on the training set. A dense-sparse-dense strategy is applied to both discrepancy-based and adversarial unsupervised domain adaptive methods. In the first, dense phase, the network is trained fully and learns which connections are important. The second phase is pruning. Unlike the pruning step of the original dense-sparse-dense strategy, the proposed algorithm considers both weights and gradients: it uses weight information (zero-order information) while also accounting for the influence of gradient information (first-order information) on the pruning process. In the final re-dense phase, the pruned connections are restored and the dense network is retrained with a smaller learning rate, so that the resulting network achieves strong results on downstream tasks. Experimental results show that, compared with the original discrepancy-based and adversarial domain adaptive methods, the proposed GWP effectively improves downstream-task accuracy and works in a plug-and-play manner.
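The pruning criterion described in the abstract combines zero-order (weight magnitude) and first-order (gradient magnitude) information. The following minimal sketch illustrates one plausible form of such a criterion; the function name `gwp_mask`, the mixing coefficient `lam`, and the additive scoring rule are illustrative assumptions, not the paper's exact formulation.

```python
def gwp_mask(weights, grads, sparsity, lam=1.0):
    """Return a 0/1 mask keeping the (1 - sparsity) fraction of connections
    with the largest combined importance score.

    Importance combines weight magnitude (zero-order information) with
    gradient magnitude (first-order information), in the spirit of the
    GWP pruning stage described above.
    """
    assert len(weights) == len(grads)
    scores = [abs(w) + lam * abs(g) for w, g in zip(weights, grads)]
    n_keep = len(scores) - int(sparsity * len(scores))
    if n_keep <= 0:
        return [0] * len(scores)
    # Keep every connection whose score reaches the n_keep-th largest score.
    threshold = sorted(scores, reverse=True)[n_keep - 1]
    return [1 if s >= threshold else 0 for s in scores]


# Sparse phase: train with weights multiplied by the mask.
# Re-dense phase: drop the mask and retrain all connections
# with a smaller learning rate.
mask = gwp_mask([0.5, -0.01, 0.2, 0.0], [0.0, 0.9, 0.1, 0.05], sparsity=0.5)
```

Note that a small weight with a large gradient (the second connection above) survives pruning under this criterion, which a purely magnitude-based rule would remove.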

    Reference
[1] Vapnik V N. Statistical learning theory [M]. New York: Wiley, 1998.
[2] Huang K Z, Zheng D N, Sun J, et al. Sparse learning for support vector classification [J]. Pattern Recognition Letters, 2010, 31(13): 1944-1951.
[3] Cortes C, Mohri M. Domain adaptation in regression [C] // Algorithmic Learning Theory, Lecture Notes in Computer Science, 2011, 6925: 308-323.
[4] Glorot X, Bordes A, Bengio Y. Domain adaptation for large-scale sentiment classification: a deep learning approach [C] // Proceedings of the 28th International Conference on Machine Learning, 2011: 513-520.
[5] Wang M, Deng W H. Deep visual domain adaptation: a survey [J]. Neurocomputing, 2018, 312: 135-153.
[6] Long M S, Cao Y, Wang J M, et al. Learning transferable features with deep adaptation networks [C] // Proceedings of the 32nd International Conference on Machine Learning, 2015: 97-105.
[7] Ganin Y, Lempitsky V. Unsupervised domain adaptation by backpropagation [C] // Proceedings of the 32nd International Conference on Machine Learning, 2015: 1180-1189.
[8] Long M S, Cao Z J, Wang J M, et al. Conditional adversarial domain adaptation [C] // Proceedings of the 32nd International Conference on Neural Information Processing Systems, 2018: 1647-1657.
[9] Chen L, Chen H, Wei Z, et al. Reusing the task-specific classifier as a discriminator: discriminator-free adversarial domain adaptation [J]. arXiv preprint arXiv:2204.03838, 2022.
[10] Han S, Pool J, Narang S, et al. DSD: dense-sparse-dense training for deep neural networks [J]. arXiv preprint arXiv:1607.04381, 2016.
[11] Han S, Pool J, Tran J, et al. Learning both weights and connections for efficient neural networks [C] // Proceedings of the 28th International Conference on Neural Information Processing Systems, 2015: 1135-1143.
[12] LeCun Y, Denker J S, Solla S A. Optimal brain damage [C] // Advances in Neural Information Processing Systems, 1990, 2: 598-605.
[13] Tzeng E, Hoffman J, Zhang N, et al. Deep domain confusion: maximizing for domain invariance [J]. arXiv preprint arXiv:1412.3474, 2014.
[14] Long M, Zhu H, Wang J, et al. Deep transfer learning with joint adaptation networks [J]. arXiv preprint arXiv:1605.06636, 2016.
[15] Hassibi B, Stork D G. Second order derivatives for network pruning: optimal brain surgeon [C] // Advances in Neural Information Processing Systems. Morgan Kaufmann, 1993.
[16] Jin X J, Yuan X T, Feng J S, et al. Training skinny deep neural networks with iterative hard thresholding methods [J]. arXiv preprint arXiv:1607.05423, 2016.
[17] Hinton G, Srivastava N, Krizhevsky A, et al. Improving neural networks by preventing co-adaptation of feature detectors [J]. arXiv preprint arXiv:1207.0580, 2012.
[18] Krizhevsky A, Sutskever I, Hinton G. ImageNet classification with deep convolutional neural networks [C] // Advances in Neural Information Processing Systems, 2012, 25: 1097-1105.
[19] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.
[20] Saito K, Watanabe K, Ushiku Y, et al. Maximum classifier discrepancy for unsupervised domain adaptation [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 3723-3732.
[21] Chen X Y, Wang S A, Long M S, et al. Transferability vs. discriminability: batch spectral penalization for adversarial domain adaptation [C] // Proceedings of the 36th International Conference on Machine Learning, 2019: 1081-1090.
[22] Cui S H, Wang S H, Zhuo J B, et al. Gradually vanishing bridge for adversarial domain adaptation [C] // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 12455-12464.
[23] Tang H, Jia K. Discriminative adversarial domain adaptation [C] // Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34: 5940-5947.
[24] Li S, Xie M X, Gong K X, et al. Transferable semantic augmentation for domain adaptation [C] // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 11516-11525.
[25] Li S, Xie M X, Lv F R, et al. Semantic concentration for domain adaptation [C] // Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 9102-9111.
[26] Paszke A, Gross S, Massa F, et al. PyTorch: an imperative style, high-performance deep learning library [C] // Proceedings of the 33rd International Conference on Neural Information Processing Systems, 2019: 8026-8037.
[27] Saenko K, Kulis B, Fritz M, et al. Adapting visual category models to new domains [C] // Proceedings of the 11th European Conference on Computer Vision, 2010: 213-226.
[28] Venkateswara H, Eusebio J, Chakraborty S, et al. Deep hashing network for unsupervised domain adaptation [C] // Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017: 5018-5027.
[29] Peng X C, Bai Q X, Xia X, et al. Moment matching for multi-source domain adaptation [C] // Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 1406-1415.
[30] Peng X, Usman B, Kaushik N, et al. VisDA: the visual domain adaptation challenge [J]. arXiv preprint arXiv:1710.06924, 2017.
[31] Loshchilov I, Hutter F. SGDR: stochastic gradient descent with warm restarts [J]. arXiv preprint arXiv:1608.03983, 2016.
[32] Karim N, Mithun N C, Rajvanshi A, et al. C-SFDA: a curriculum learning aided self-training framework for efficient source-free domain adaptation [C] // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 24120-24131.
[33] Litrico M, Del Bue A, Morerio P. Guiding pseudo-labels with uncertainty estimation for source-free unsupervised domain adaptation [C] // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 7640-7650.
History
  • Received: September 27, 2023
  • Revised: February 15, 2024
  • Accepted: February 25, 2024