I received my Ph.D. from Shanghai Jiao Tong University in September 2020, under the supervision of Kai Yu and Yanmin Qian. During my Ph.D., my research interests included deep learning based approaches for speaker recognition, speaker diarization and voice activity detection.
I serve as a regular reviewer for speech-related conferences/journals: Interspeech, ICASSP, ICME and TASLP.
Currently, I work at Tencent as a senior researcher, and my research area has extended to speech synthesis, which is a fascinating task.
PhD in Computer Science and Technology, 2020
Shanghai Jiao Tong University
BSc in Software Engineering, 2014
Northwestern Polytechnical University
Work on several research papers and contribute to
Working on deep learning based speaker recognition systems for real-world applications such as
This paper describes the winning systems developed by the BUT team for the four tracks of the Second DIHARD Speech Diarization Challenge, with source code available
We proposed the text-adaptation speaker verification task and an initial solution called Speaker-text factorization network, which can deal with different text-mismatch conditions
This paper describes the winning systems developed by the BUT team for the two tracks of the First VoxSRC Speaker Recognition Challenge
We proposed a segment-level representation for phonetic information and a corresponding segment-level multi-task/adversarial training framework. We revisited the use of phonetic information in text-independent embedding learning and designed experiments to verify the assumption that, for TI-SV, it can be beneficial to remove phonetic variation from the final speaker embeddings