
Focal KL-divergence based dilated convolutional neural networks for co-channel speaker identification

Recognizing the identities of multiple talkers from their overlapped speech is a challenging task and one of the main difficulties of the “cocktail party problem”. In this paper, a novel dilated convolutional neural network with a focal KL-divergence …

Joint i-vector with end-to-end system for short-duration text-independent speaker verification

Factor analysis based i-vector has been the state-of-the-art method for speaker verification. Recently, researchers have proposed DNN-based end-to-end speaker verification systems that achieve performance comparable to i-vector. Since these two …

Integrating Online i-vector into GMM-UBM for Text-dependent Speaker Verification

GMM-UBM is widely used for the text-dependent task because of its simplicity and effectiveness, while i-vector provides a compact representation of speaker information. It is therefore promising to fuse these two frameworks. In this paper, a variation of …

What Does the Speaker Embedding Encode?

The first attempt to systematically analyze the information encoded in speaker embeddings (prior to x-vectors). For a detailed analysis of x-vectors, refer to the paper “Probing the Information Encoded in X-vectors” from JHU.