Summary: Automatic speech recognition systems like those at the core of Alexa convert speech into text, and one of their components is a language model that predicts which word will come after a sequence of words. Such models are typically n-gram-based, meaning they estimate the probability of the next word given the past n-1 words. But architectures like recurrent neural networks, which are commonly used in speech recognition because of their ability to learn long-range dependencies, are tough to incorporate into real-time systems and often struggle to ingest data from multiple corpora.
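For readers unfamiliar with n-gram language models, here is a minimal sketch of the idea in Python: the probability of the next word is estimated from counts of the preceding n-1 words in training text. The tiny corpus and function names below are illustrative assumptions, not taken from Amazon's paper, and a production model would add smoothing for unseen word sequences.

```python
from collections import defaultdict

# A minimal n-gram language model sketch (here n = 3, a trigram model):
# P(next word) is estimated from counts of the preceding n-1 = 2 words.
# The corpus below is a toy example for illustration only.
corpus = "alexa play some music alexa play the news".split()

trigram_counts = defaultdict(int)  # counts of (w1, w2, w3) triples
context_counts = defaultdict(int)  # counts of (w1, w2) contexts

for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    trigram_counts[(w1, w2, w3)] += 1
    context_counts[(w1, w2)] += 1

def next_word_prob(w1, w2, w3):
    """Maximum-likelihood estimate of P(w3 | w1, w2).
    Real systems apply smoothing so unseen trigrams get nonzero mass."""
    if context_counts[(w1, w2)] == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / context_counts[(w1, w2)]

print(next_word_prob("alexa", "play", "some"))  # 0.5
print(next_word_prob("alexa", "play", "the"))   # 0.5
```

Because the model only ever conditions on a fixed window of n-1 words, it is cheap enough for real-time decoding but blind to longer-range context, which is exactly the gap recurrent neural networks are meant to close.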
That’s why researchers at Amazon’s Alexa research division investigated techniques to make such AI models more practical for speech recognition. In a blog post and an accompanying paper (“Scalable Multi Corpora Neural Language Models for ASR”) scheduled to be presented at the upcoming Interspeech 2019 conference in Graz, Austria, they claim their approach can reduce the word error rate by 6.2% relative to the baseline.
...
Full Text: VentureBeat