Exploiting LSTM structure in deep neural networks for speech recognition

T He, J Droppo - 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016 - ieeexplore.ieee.org
The CD-DNN-HMM system has become the state-of-the-art system for large vocabulary continuous speech recognition (LVCSR) tasks, in which deep neural networks (DNNs) play a key role. However, DNN training suffers from the vanishing gradient problem, which limits the training of deep models. In this work, we address this problem by incorporating the long short-term memory (LSTM) structure, originally proposed to help recurrent neural networks (RNNs) remember long-term dependencies, into the DNN. We also propose a generalized formulation of the LSTM block, which we name the general LSTM (GLSTM). Our experiments show that the proposed (G)LSTM-DNN scales well with more layers and achieves an 8.2% relative word error rate reduction on the 2000-hour Switchboard data set.
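The abstract does not spell out the (G)LSTM-DNN equations, but the core idea of carrying an LSTM-style gated cell state across network depth rather than time can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the class names, layer widths, and the choice to compute all gates from the previous layer's output alone are illustrative.

```python
import torch
import torch.nn as nn

class DepthLSTMBlock(nn.Module):
    """One feedforward layer with LSTM-style gating applied across depth.

    Hypothetical sketch: the layer index plays the role of the LSTM time
    step, so the cell state c carries information up the stack and gives
    gradients an additive path that mitigates vanishing in deep models.
    """
    def __init__(self, dim):
        super().__init__()
        self.input_gate = nn.Linear(dim, dim)
        self.forget_gate = nn.Linear(dim, dim)
        self.output_gate = nn.Linear(dim, dim)
        self.candidate = nn.Linear(dim, dim)

    def forward(self, h, c):
        i = torch.sigmoid(self.input_gate(h))   # input gate
        f = torch.sigmoid(self.forget_gate(h))  # forget gate
        o = torch.sigmoid(self.output_gate(h))  # output gate
        g = torch.tanh(self.candidate(h))       # candidate activation
        c = f * c + i * g                       # cell state carried across layers
        h = o * torch.tanh(c)                   # gated layer output
        return h, c

class LSTMDNN(nn.Module):
    """Stack of depth-gated blocks between input and output projections."""
    def __init__(self, in_dim, hidden_dim, out_dim, depth):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden_dim)
        self.blocks = nn.ModuleList([DepthLSTMBlock(hidden_dim) for _ in range(depth)])
        self.out = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        h = torch.tanh(self.proj(x))
        c = torch.zeros_like(h)                 # cell state starts empty
        for block in self.blocks:
            h, c = block(h, c)
        return self.out(h)                      # e.g. senone logits in a CD-DNN-HMM

# Illustrative dimensions only: spliced acoustic features in, senone posteriors out.
model = LSTMDNN(in_dim=440, hidden_dim=512, out_dim=9000, depth=10)
logits = model(torch.randn(8, 440))             # batch of 8 feature frames
```

Treating the layer index as the LSTM "time step" gives gradients an additive path through the cell state c, the same mechanism that lets recurrent LSTMs preserve long-term dependencies over long sequences.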