SEL

Text Adaptation for Speaker Verification with Speaker-Text Factorized Embeddings

We proposed the text-adaptation speaker verification task and an initial solution, the speaker-text factorization network, which can handle different text-mismatch conditions.

Channel Invariant Speaker Embedding Learning with Joint Multi-Task and Adversarial Training

Using deep neural networks to extract speaker embeddings has significantly improved speaker verification. However, such embeddings are still vulnerable to channel variability. Previous works have used adversarial training to suppress channel …

BUT System Description to VoxCeleb Speaker Recognition Challenge 2019

This paper describes the winning systems developed by the BUT team for the two tracks of the First VoxSRC Speaker Recognition Challenge.

Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification

Short-duration text-independent speaker verification has remained a hot research topic in recent years, and deep neural network based embeddings have shown impressive results in such conditions. Good speaker embeddings require the property of both small …

Margin matters: Towards more discriminative deep neural network embeddings for speaker recognition

Recently, speaker embeddings extracted from a speaker discriminative deep neural network (DNN) yield better performance than the conventional methods such as i-vector. In most cases, the DNN speaker classifier is trained using cross entropy loss with …

Knowledge Distillation for Small Foot-print Deep Speaker Embedding

Deep speaker embedding learning is an effective method for speaker identity modelling. Very deep models such as ResNet can achieve remarkable results but are usually too computationally expensive for real applications with limited resources. On the …

Data Augmentation Using Variational Autoencoder for Embedding Based Speaker Verification

Domain or environment mismatch between training and testing, such as various noises and channels, is a major challenge for speaker verification. In this paper, a variational autoencoder (VAE) is designed to learn the patterns of speaker embeddings …

On the Usage of Phonetic Information for Text-independent Speaker Embedding Extraction

We proposed a segment-level representation for phonetic information and a corresponding segment-level multi-task/adversarial training framework. We revisited the usage of phonetic information for text-independent embedding learning and designed experiments to verify the assumption that, for TI-SV, it could be beneficial to remove the phonetic variation from the final speaker embeddings.

Angular Softmax for Short-Duration Text-independent Speaker Verification

Recently, researchers have proposed building deep learning based end-to-end speaker verification (SV) systems and have achieved competitive results compared with the standard i-vector approach. In addition to deep learning architectures, optimization metric such …

Deep discriminant analysis for i-vector based robust speaker recognition

Linear Discriminant Analysis (LDA) has been used as a standard post-processing procedure in many state-of-the-art speaker recognition tasks. Through maximizing the inter-speaker difference and minimizing the intra-speaker variation, LDA projects …