A deep learning-based speaker recognition system for open set scenarios
Authors: GUO Xin, LUO Chengfang, DENG Aiwen
CLC Number: TN912.3; TP18

    Abstract:

    Speaker recognition accuracy is low for short utterances and for speech with overlapping noise. To address this, a new deep learning-based speaker recognition algorithm is proposed and deployed on an embedded device. Robustness is improved in two places: the encoding layer and the loss function. For the encoding layer, the NeXtVLAD technique, which is based on differential encoding, models both static and dynamic speaker features at the frame level. For the loss function, the cosine-prototypical loss, which is based on a few-shot learning framework, is fused with the additive margin classification loss AM-Softmax to train the speaker recognition model, so that the model pulls similar features together and pushes dissimilar features apart as far as possible in the feature space. The improved algorithm is then deployed on the Raspberry Pi platform to achieve speaker recognition with fast inference. Experimental results show that the system performs speaker recognition accurately and in real time under various open-set scenarios and meets the requirements of practical applications.
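
    The abstract does not give the loss formulas, so the following is only a minimal sketch of how a cosine-prototypical loss and AM-Softmax might be fused. It assumes the standard AM-Softmax form (a margin subtracted from the target-class cosine before scaling), a prototype defined as the mean of a speaker's support embeddings, and a simple weighted sum as the fusion rule. All function names, the scale, margin, and weight values are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def log_softmax_at(logits, target):
    """Log-probability of index `target` under a softmax over `logits`."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return logits[target] - log_sum


def am_softmax_loss(embedding, class_weights, label, scale=30.0, margin=0.2):
    """AM-Softmax: penalize the target-class cosine by `margin`, then scale.

    `scale` and `margin` are illustrative values (assumption).
    """
    logits = []
    for j, weight_vector in enumerate(class_weights):
        cos_j = cosine(embedding, weight_vector)
        if j == label:
            cos_j -= margin
        logits.append(scale * cos_j)
    return -log_softmax_at(logits, label)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_prototypical_loss(query, prototypes, label, scale=10.0, bias=0.0):
    """Softmax cross-entropy over scaled cosine similarities to prototypes.

    `scale` and `bias` would normally be learned; fixed here (assumption).
    """
    logits = [scale * cosine(query, p) + bias for p in prototypes]
    return -log_softmax_at(logits, label)


def fused_loss(query, support_sets, class_weights, label, weight=1.0):
    """Weighted sum of the two losses; the fusion rule is an assumption."""
    prototypes = [mean_vector(s) for s in support_sets]
    proto = cosine_prototypical_loss(query, prototypes, label)
    ams = am_softmax_loss(query, class_weights, label)
    return proto + weight * ams


# Toy example: two speakers with 2-D embeddings (made-up numbers).
support_sets = [
    [[1.0, 0.0], [0.9, 0.2]],  # speaker 0 support utterances
    [[0.0, 1.0], [0.1, 0.9]],  # speaker 1 support utterances
]
class_weights = [[1.0, 0.0], [0.0, 1.0]]  # one classifier vector per speaker
query = [1.0, 0.1]  # embedding that lies close to speaker 0

loss_correct = fused_loss(query, support_sets, class_weights, label=0)
loss_wrong = fused_loss(query, support_sets, class_weights, label=1)
print(loss_correct < loss_wrong)
```

    In this toy setup the query embedding lies near speaker 0, so the fused loss is lower when the label is 0 than when it is 1. This reflects the behaviour described above: similar features are pulled together and dissimilar ones pushed apart.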

Get Citation

GUO Xin, LUO Chengfang, DENG Aiwen. A deep learning-based speaker recognition system for open set scenarios[J]. Journal of Nanjing University of Information Science & Technology,2021,13(5):526-532

History
  • Received: August 8, 2021
  • Online: December 2, 2021