Tacotron 2: Google’s New Model That Makes Training Speech AI Easier
Generating convincing artificial speech is one of the hottest trends in AI, and Google has been leading it. Google AI recently announced Tacotron 2, a neural network trained to produce realistic speech from text without requiring hand-built grammatical expertise.
To develop the new model, Google AI drew on two of its earlier speech synthesis projects, WaveNet and the original Tacotron. Tacotron 2 learns from text paired with recorded narration and works out for itself the linguistic rules to follow. A Tacotron-style network converts the text into a “mel-scale spectrogram”, which carries the rhythm and emphasis of the speech, and a WaveNet-style system then generates the audio of the spoken words from that spectrogram.
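To give a feel for what “mel-scale” means, here is a minimal sketch of how frequencies in hertz map onto the mel scale, which spaces pitches the way listeners tend to perceive them. It uses one common textbook formula (the 2595 and 700 constants). That choice is an assumption, as are the function names and the 8-band, 125 Hz to 7,600 Hz example range. None of this is taken from Google’s implementation.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595/700 formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands: int, f_min: float, f_max: float) -> List[float]:
    """Return band centre frequencies (Hz) that are evenly spaced on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_bands)]


# Illustrative values only: 8 bands between 125 Hz and 7,600 Hz.
centers = mel_band_centers(n_bands=8, f_min=125.0, f_max=7600.0)
for i, hz in enumerate(centers):
    print(f"band {i}: {hz:.1f} Hz")
```

The band centres sit close together at low frequencies and spread out at high ones. A spectrogram on this scale therefore spends more of its resolution where human hearing is most sensitive.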
The resulting speech is among the most convincing produced so far (have a look at the examples here). The rhythm is convincing, and it comes across as both soothing and a little startling. Tacotron 2 does sometimes stumble, though, on words whose pronunciation is not intuitive or that fall outside everyday American English. “Decorum” and “merlot” are two examples: Tacotron 2 stresses the first syllable of each when speaking. “And in extreme cases, it can even randomly generate strange noises,” the researchers write.
There is currently no way to manually control the tone of the speech. Even so, Google’s work in this direction is impressive. It also lowers the barrier to training such a system, because the model learns with little human assistance.
A speech synthesis system of this kind is an achievement that deserves applause. The changes still to come are likely to open up new ways of training AI systems.
She is a content marketer with more than five years of experience in IoT, blockchain, web, and mobile development. Over those years she has closely followed app development, and she now writes about existing and upcoming mobile app technologies. Her essence is more like a ballet dancer.