WorldOfTopics.com

Google unveils new voice translation technology that preserves original speech features

Google has introduced a technology that converts speech in one language directly into speech in another. The system, called Translatotron, translates a speaker's voice and plays back audio in the target language while preserving the intonation and timbre of the original. Unlike comparable systems, it does not need an intermediate stage in which speech is first transcribed into text; it produces the translation directly as audio.

Most current speech translation systems use a cascade method: the system first recognizes the speech automatically, then translates the resulting text, and finally converts the translated text into audio in the other language. As a result, the synthesized speech differs in many ways from the voice of the original speaker.
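The three cascade stages can be sketched as a chain of functions. This is a toy illustration only: the `Audio` type, the function names and the two-word lexicon are hypothetical stand-ins, not part of any Google API, and real systems use trained models at each stage.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Hypothetical stand-in for a recording: who spoke and which words."""
    speaker: str
    words: tuple


# Hypothetical toy lexicon used in place of a real translation model.
TOY_LEXICON = {"hola": "hello", "mundo": "world"}


def recognize_speech(audio):
    """Stage 1: speech -> source-language text. The speaker's voice is dropped here."""
    return " ".join(audio.words)


def translate_text(text):
    """Stage 2: source-language text -> target-language text."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def synthesize_speech(text, voice="default"):
    """Stage 3: target-language text -> audio in a generic synthetic voice."""
    return Audio(speaker=voice, words=tuple(text.split()))


def cascade_translate(audio):
    """Run the three stages in sequence, passing text between them."""
    source_text = recognize_speech(audio)
    target_text = translate_text(source_text)
    return synthesize_speech(target_text)


result = cascade_translate(Audio(speaker="alice", words=("hola", "mundo")))
```

Because only text passes between the stages, the output carries the synthetic voice rather than the original speaker's, which is the limitation the article describes.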

The cascade method has proven effective in practice, and its use in many systems, including Google's own service, is natural. Still, the Google team believes a better technology is possible, one with fewer intermediate stages and therefore fewer opportunities for error. For this reason, Translatotron uses an end-to-end translation system, which the developers present as an advance over the cascade method because it bypasses the intermediate speech-to-text stage.
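The intuition that fewer stages leave fewer opportunities for error can be illustrated with a little arithmetic. The sketch assumes, purely for illustration, that each stage succeeds independently with a fixed probability; the function name and the 0.9 rates are hypothetical and are not measurements of any real system.

```python
from math import prod


def pipeline_success_rate(stage_rates):
    """Probability that every stage succeeds, assuming the stages are independent."""
    return prod(stage_rates)


# Hypothetical per-stage success rates, chosen only to show the effect.
three_stage = pipeline_success_rate([0.9, 0.9, 0.9])  # recognition, translation, synthesis
one_stage = pipeline_success_rate([0.9])              # a single end-to-end step
```

Real stages are not independent, and a single end-to-end model is not guaranteed to match the accuracy of any one stage, so this shows only the motivation behind the design, not a result.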

Translatotron relies on a neural network. The source speech is first represented as a spectrogram, a visual representation of the signal's frequency content over time, and the network then generates a new spectrogram of the speech in the target language. Nothing else happens between these two steps; in particular, no text is produced.
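To make the term concrete, here is a minimal sketch of how a magnitude spectrogram can be computed from a waveform with a short-time Fourier transform. It uses only the Python standard library and a naive DFT for clarity; the frame length, hop size and test tone are arbitrary assumptions, and the article does not say what spectrogram settings Translatotron itself uses.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return one row per time frame; each row holds magnitudes for bins 0..frame_len//2."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        rows.append(row)
    return rows


# A 1000 Hz test tone sampled at 8000 Hz (arbitrary illustrative values).
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(1024)]
spec = magnitude_spectrogram(tone)

# Find the frequency bin with the most energy across all frames.
totals = [sum(row[k] for row in spec) for k in range(len(spec[0]))]
peak_bin = totals.index(max(totals))
peak_hz = peak_bin * SAMPLE_RATE / 64
```

Each row is one time slice and each column one frequency band, so the tone shows up as a single bright band. In practice a library FFT such as `numpy.fft.rfft` would replace the naive inner loop.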

Thus Translatotron is a single-step process rather than a sequence of separate tasks. This increases translation speed and reduces the likelihood of losing information and compounding errors along the way. The technology also reproduces in the translation the intonation, pauses and other characteristics of the original speech. The result still sounds somewhat "robotic", but it resembles the original voice much more closely.

Professional translators often pay attention not only to what is said, but also to how the words are pronounced: the intonation of the original speech can significantly change the meaning of a phrase. The engineers of the Translatotron project acknowledge that the new system has not yet surpassed the cascade method in translation accuracy; however, like all machine learning technologies, it is expected to improve gradually.

Author: Jake Pinkman

