LSTM-Based Robust Voicing Decision Applied to DNN-Based Speech Synthesis

R. Pradeep; M. Kiran Reddy; K. Sreenivasa Rao

doi:10.3103/S0146411619040096

Authors: Pradeep R.¹, Reddy M.K.², Rao K.S.²
Affiliations:
1. Advanced Technology Development Center
2. Department of Computer Science and Engineering
Issue: Volume 53, No. 4 (2019)
Pages: 328-332
Section: Article
URL: https://journal-vniispk.ru/0146-4116/article/view/175840
DOI: https://doi.org/10.3103/S0146411619040096
ID: 175840

Abstract

The quality of statistical parametric speech synthesis (SPSS) depends on voiced/unvoiced classification. Errors in the voicing decision can significantly degrade speech quality. This paper proposes a robust voicing detection method for SPSS based on the power spectrum and a long short-term memory (LSTM) network. The performance of the proposed method is evaluated on the CMU Arctic, Keele, and MIR-1K databases. Further, the effectiveness of the proposed method is analyzed for deep neural network (DNN)-based SPSS. The results show that the proposed method classifies voiced and unvoiced speech segments better, which significantly improves speech quality.
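The abstract names the power spectrum as the input representation but gives no implementation details. As a rough illustration only, the sketch below computes frame-wise power spectra with NumPy, the kind of feature sequence a frame-level voiced/unvoiced classifier such as an LSTM could consume. The function name `frame_power_spectrum`, the frame length, hop size, FFT size, Hann window, and the synthetic test tone are all assumptions made for this example; they are not taken from the paper, and the LSTM itself is not shown.

```python
import numpy as np


def frame_power_spectrum(signal, frame_len=400, hop=160, n_fft=512):
    """Return per-frame power spectra with shape (n_frames, n_fft // 2 + 1).

    All default values are illustrative assumptions (25 ms frames and a
    10 ms hop at 16 kHz), not settings reported by the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Number of complete frames that fit in the signal.
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)

    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])

    # One-sided FFT per frame (zero-padded to n_fft), then squared magnitude.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.abs(spectrum) ** 2


# Synthetic example: one second of a 200 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 200.0 * t)
features = frame_power_spectrum(tone)
print(features.shape)
```

Each row of `features` is one frame. In a voicing detector, a sequence of such rows (often log-compressed) would be passed to the sequence model, which outputs a voiced or unvoiced label per frame.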

Keywords

deep neural networks, long short-term memory, speech synthesis, voicing detection
