Publication date: Available online 14 December 2018
Source: Computer Speech & Language
Author(s): Rahhal Errattahi, Asmaa El Hannani, Thomas Hain, Hassan Ouahmane
Abstract
This paper addresses errors in continuous Automatic Speech Recognition (ASR) in two stages: error detection and error type classification. Unlike the majority of research in this field, we propose to handle recognition errors independently of the ASR decoder. We first establish an effective set of generic features derived exclusively from the recognizer output to compensate for the absence of ASR decoder information. Then, we apply variant Recurrent Neural Network (V-RNN) based models for error detection and error type classification. Such models use label dependency to learn additional information for classifying the recognized words. Experiments on the Multi-Genre Broadcast Media corpus show that the proposed generic feature setup achieves competitive performance compared to state-of-the-art systems in both tasks. Furthermore, we show that a V-RNN trained on the proposed feature set appears to be an effective classifier for ASR error detection, with an accuracy of 85.43%.
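To make the two tasks concrete, here is a minimal sketch of how per-word labels for error detection and error type classification are commonly derived in this kind of work: the recognizer output is aligned against a reference transcript by edit distance. This is an illustrative assumption, not the authors' procedure. The function names (`label_hypothesis`, `to_binary`, `accuracy`) and the label set (`C` correct, `S` substitution, `I` insertion) are hypothetical, and the abstract does not specify which error types the paper uses.

```python
from typing import List


def label_hypothesis(ref: List[str], hyp: List[str]) -> List[str]:
    """Label each hypothesis word as 'C', 'S' or 'I' via edit-distance alignment.

    Assumed label set (not taken from the paper): C = correct,
    S = substitution, I = insertion. Deleted reference words have no
    hypothesis word to label, so they produce no label here.
    """
    n, m = len(ref), len(hyp)
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(diag, dp[i - 1][j] + 1, dp[i][j - 1] + 1)

    # Backtrace from the end, collecting one label per hypothesis word.
    labels: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            labels.append("C" if ref[i - 1] == hyp[j - 1] else "S")
            i -= 1
            j -= 1
        elif j > 0 and dp[i][j] == dp[i][j - 1] + 1:
            labels.append("I")
            j -= 1
        else:
            # Deletion: a reference word with no hypothesis counterpart.
            i -= 1
    labels.reverse()
    return labels


def to_binary(labels: List[str]) -> List[int]:
    """Error detection targets: 1 if the word is an error, 0 if correct."""
    return [0 if lab == "C" else 1 for lab in labels]


def accuracy(predicted: List[int], gold: List[int]) -> float:
    """Fraction of words whose predicted label matches the gold label."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    hyp = "the cat sat on a the mat".split()
    types = label_hypothesis(ref, hyp)
    print(types)
    print(to_binary(types))
```

In this sketch the binary labels would serve as targets for the detection stage and the `S`/`I` labels as targets for the type classification stage; a classifier's per-word predictions can then be scored with `accuracy`, which is the kind of metric the 85.43% figure refers to.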
from Speech via a.sfakia on Inoreader https://ift.tt/2SSifr9