Speaker adaptation is an important technology for fine-tuning either features or speech models to compensate for mismatch due to inter-speaker variation. In the last decade, eigenvoice (EV) speaker adaptation has been developed. It makes use of prior knowledge of the training speakers to provide a fast adaptation algorithm (in other words, only a small amount of adaptation data is needed). Inspired by the kernel eigenface idea in face recognition, kernel eigenvoice (KEV) was proposed.[1] KEV is a non-linear generalization of EV. It incorporates kernel principal component analysis, a non-linear version of principal component analysis, to capture higher-order correlations in order to further explore the speaker space and enhance recognition performance.
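The kernel principal component analysis step can be illustrated with a minimal sketch. This is not the KEV adaptation procedure from the cited papers; it only shows plain kernel PCA applied to a set of speaker vectors. The synthetic "supervectors", the RBF kernel, the value of `gamma`, and the function names `rbf_kernel`, `kernel_pca_fit` and `kernel_pca_project` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_pca_fit(X, gamma, n_components):
    """Kernel PCA on training vectors X (one row per training speaker)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    one_n = np.full((n, n), 1.0 / n)
    # Centre the data in feature space (done implicitly on the kernel matrix).
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale coefficients so each feature-space eigenvector has unit norm.
    alphas = eigvecs / np.sqrt(eigvals)
    return {"X": X, "K": K, "gamma": gamma, "alphas": alphas, "eigvals": eigvals}

def kernel_pca_project(model, X_new):
    """Coordinates of X_new along the leading kernel principal components."""
    X, K, alphas = model["X"], model["K"], model["alphas"]
    n, m = X.shape[0], X_new.shape[0]
    k = rbf_kernel(X_new, X, model["gamma"])
    one_n = np.full((n, n), 1.0 / n)
    one_m = np.full((m, n), 1.0 / n)
    kc = k - one_m @ K - k @ one_n + one_m @ K @ one_n
    return kc @ alphas

# Synthetic stand-ins for speaker supervectors (assumption: 20 speakers, 50 dims).
rng = np.random.default_rng(0)
train = rng.normal(size=(20, 50))
model = kernel_pca_fit(train, gamma=0.01, n_components=3)
train_proj = kernel_pca_project(model, train)          # shape (20, 3)
new_proj = kernel_pca_project(model, rng.normal(size=(1, 50)))  # shape (1, 3)
```

In linear EV the leading principal components of the training-speaker vectors span the speaker space; here the same role is played by components found in the kernel-induced feature space, which is how higher-order correlations can enter. Estimating a new speaker's position from adaptation data, as KEV does, is a separate step not shown in this sketch.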
Mak, B.; Ho, S. (2005). "Various Reference Speakers Determination Methods for Embedded Kernel Eigenvoice Speaker Adaptation". IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. Proceedings. ICASSP '05. Vol. 1. pp. 981–984. doi:10.1109/ICASSP.2005.1415280.
Mak, Brian Kan-Wing; Hsiao, Roger Wend-Huu; Ho, Simon Ka-Lung; Kwok, J. T. (July 2006). "Embedded kernel eigenvoice speaker adaptation and its implication to reference speaker weighting". IEEE Transactions on Audio, Speech, and Language Processing. 14 (4): 1267–1280. CiteSeerX 10.1.1.206.4596. doi:10.1109/TSA.2005.860836. S2CID 7527119.