The ATA Language Technology Division page contains an interesting video featuring Jost Zetzsche. In the video, Jost explains how machine translation has rapidly evolved from a separate, quite isolated technology into a new concept that is very much integrated in other translation tools and systems used by human translators.
Jost goes on to explain the three main arguments supporting his theory:
- SDL, among other providers, has integrated MT into its mainstream translation memory tools. This means that translators are able to leverage suggestions from SDL’s own generic MT engine (which, according to Jost, very often produces a lot of “garbage”). Translators working for enterprise clients that have an account with SDL will also benefit from the client’s customized MT database, which supposedly offers better quality. It is expected that all the major CAT tool providers will soon follow suit.
- Google will soon introduce its Google Translation Center, which will allow translation buyers and providers to use a common platform for exchanging translation jobs. Such a platform will rely heavily on MT and TM technologies.
- The initiatives of TAUS – the Translation Automation User Society – aim to pool the translated material of very large translation buyers (including the EU, Microsoft, and Oracle) in order to obtain better machine translation results by leveraging the enormous amount of translated material these organizations produce.
The conclusion? According to Jost, translators will not be able to oppose the radical change that all this will cause, so they’d better face the music and start learning new skills, such as machine translation post-editing.
Most of the changes in the game are due to increased availability of machine translation. The quality isn’t getting better; the best MT systems are just becoming more available. I don’t expect to be editing MT in Japanese-to-English translation for a while yet.
Hi Ryan, thanks for your comment.
If MT has not yet reached a high enough degree of reliability and accuracy for Japanese, things may be moving a bit faster for Romance languages. I don’t think that the results Google’s MT returns for Italian technical texts are completely useless.
Of course, if we move from the purely technical to anything that requires even a tiny bit of stylistic sophistication or syntactic complexity, MT will produce blunders and ridiculous results.
I’m sure you have come across some of the Microsoft support web pages translated by MT. What is your experience with them? For Italian, I think that they achieve their declared purpose of getting the user on the right track, no matter how stylistically bad they may sound. Do they exist for Japanese too?
I believe that the better MT we’re seeing today is due to the best MT becoming available, not to improvements in MT per se. The results look much the same, but the implications are far different (i.e. what we can expect from MT in the near future; in my opinion, not much).
I’ve used Google Translate and its peers to translate web pages from various European languages into English. The results are often passable. Machine translation between Japanese and English is usually a much closer approximation to garbage.
Ryan,
The online MT portals (Google Translate, Babelfish, etc.) are indeed examples of making MT available, but they are not a gauge of the best MT systems. They represent the lowest level of MT, usually based on just the MT engine and the standard dictionary for rule-based systems. It is not possible to customize them with one’s own terminology. The paid systems allow for all such features, making the MT system a productivity tool for the user rather than a push-the-button-and-discover-the-result experience.
I put together an ordered list of forum posts to read on understanding MT and how to implement it from the perspective of a translation service provider. It is at:
http://www.translatorscafe.com/cafe/MegaBBS/thread-view.asp?threadid=12723&messageid=160999#160999
Here is also a detailed explanation (not technical, but from a practical perspective, with reference to the underlying translation memory component) of how the Google Translate statistical MT system differs from rule-based systems:
http://www.translatorscafe.com/cafe/MegaBBS/thread-view.asp?threadid=12723&messageid=160937#160937
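The contrast between the two paradigms can be caricatured in a few lines of code. This is only a toy sketch with invented phrases and counts, not the actual Google or commercial rule-based implementations: a statistical system picks the target phrase most often aligned with the source phrase in a parallel corpus, while a rule-based system combines a dictionary with hand-written grammar rules.

```python
# Toy caricature of statistical vs. rule-based MT.
# All phrases, counts, and dictionary entries are invented for illustration.

# Statistical MT: choose the target phrase most frequently aligned
# with the source phrase in a parallel corpus (toy phrase table).
phrase_table = {
    "hard disk": {"disco rigido": 8, "disco fisso": 2},
    "power supply": {"alimentatore": 9, "alimentazione": 3},
}

def statistical_translate(phrase):
    """Return the most frequent corpus pairing; pass unknown phrases through."""
    candidates = phrase_table.get(phrase, {phrase: 1})
    return max(candidates, key=candidates.get)

# Rule-based MT: dictionary lookup plus hand-written rules
# (toy rule: English "ADJ NOUN" becomes Italian "NOUN ADJ").
dictionary = {"hard": "rigido", "disk": "disco"}

def rule_based_translate(adj, noun):
    """Translate word by word, then reorder adjective after the noun."""
    return f"{dictionary[noun]} {dictionary[adj]}"

print(statistical_translate("hard disk"))    # most frequent corpus pairing
print(rule_based_translate("hard", "disk"))  # dictionary + reorder rule
```

Both toy systems happen to agree here; the practical difference, as the linked thread explains, is in how each is customized and where each breaks down.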
The best of MT today is the dictionary/terminology extraction and customization work being done, which is then uploaded into the MT systems to fine-tune them to the specific terminology needs of customers.
Hi Jeff, thanks for your informative post and for the links you submitted.
I find your closing sentence about terminology extraction particularly interesting.
The very few experiments I have done with MT in a production environment involved technical documentation. I think this is (and will remain, for many years to come) the only area where MT is applicable.
Having large, reliable translation memories coming from previously human-translated projects allowed me to perform a bilingual term extraction using a rule-based system. This in turn produced an extensive, high-quality bilingual glossary where about 90% of the terms did not need any further editing work.
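The core idea behind bilingual term extraction from a TM can be sketched very roughly: across aligned segment pairs, count how often each source word co-occurs with each target word, and keep the strongest pairing. A minimal sketch with an invented toy TM follows; real extraction tools use linguistic rules and far larger statistics, and all the data here is illustrative only.

```python
from collections import Counter
from itertools import product

# Toy aligned TM: (English source, Italian target) segment pairs.
tm_pairs = [
    ("remove the cover", "rimuovere il coperchio"),
    ("replace the cover", "sostituire il coperchio"),
    ("remove the battery", "rimuovere la batteria"),
    ("replace the battery", "sostituire la batteria"),
]

STOPWORDS = {"the", "il", "la"}

def extract_term_pairs(pairs, min_count=2):
    """Count (source word, target word) co-occurrences across aligned
    segments and keep the most frequent target for each source word."""
    cooc = Counter()
    for src, tgt in pairs:
        src_words = [w for w in src.split() if w not in STOPWORDS]
        tgt_words = [w for w in tgt.split() if w not in STOPWORDS]
        cooc.update(product(src_words, tgt_words))
    best = {}
    for (s, t), n in cooc.items():
        if n >= min_count and n > best.get(s, ("", 0))[1]:
            best[s] = (t, n)
    return {s: t for s, (t, _) in best.items()}

glossary = extract_term_pairs(tm_pairs)
print(glossary)
```

With this toy data, words that consistently appear together ("cover"/"coperchio", "battery"/"batteria") outscore accidental pairings, which is the same signal a real extraction tool exploits at scale.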
Feeding the bilingual glossary correctly into the MT was the tricky, time-consuming part of the task. I noticed that in most cases I needed to adjust the grammar rules of the terms. However, once the setup was completed, the machine translation produced results that needed a degree of post-editing that was comparable to what is normally required with 80% fuzzy matches in a CAT tool. Without the MT, these would have been 0% matches.
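For readers unfamiliar with fuzzy matching: CAT tools score a new source segment against TM entries with an edit-distance-style similarity, and an "80% match" means the TM suggestion needs real but limited editing. A minimal sketch using Python's difflib follows; commercial CAT tools use their own proprietary metrics, so this is only an approximation of the idea.

```python
from difflib import SequenceMatcher

def fuzzy_match(new_segment: str, tm_source: str) -> int:
    """Return a rough similarity percentage between a new source
    segment and a translation-memory source segment."""
    ratio = SequenceMatcher(None, new_segment.lower(), tm_source.lower()).ratio()
    return round(ratio * 100)

tm_entry = "Press the power button to turn on the device."
new_text = "Press the power button to turn off the device."
score = fuzzy_match(new_text, tm_entry)
print(score)  # a high fuzzy match: only one word differs
```

The point of the comparison above is that well-fed MT output landed in roughly the same editing effort band as such high-percentage fuzzy matches, where the TM alone would have offered nothing.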
I do not think that MT allowed me to save any significant amount of time on that particular project. However, the first tangible result I appreciated was terminology consistency. I’m sure that with larger, ongoing projects this approach would allow for much faster turnaround times than projects where MT is not applied – and, no less important in my opinion, for better terminology consistency.
If we move away from technical documentation and into other areas where stylistic nuances in the source and target language play a much more important role, I’m inclined to agree with Ryan in saying that MT will not change things in the foreseeable future.
From reading feedback and comments from translators around the web, I get the impression that the scope and context of MT’s application are not well understood. It seems to me that many translators are refusing to adopt MT without even understanding what it means. This may be similar to what happened with CAT tools: the translators who embraced them were able to add value to their skills and obtain more work, while those who didn’t can no longer seriously hope to find work translating technical documentation.
Translators who work more as transcreators, or who deal with literary or marketing texts, can safely ignore MT. Those of us who constantly deal with repeat work from the same end clients involving technical documentation had better come to grips with MT, or we risk missing the boat.
Thank you for the information.
There is a need to check the accuracy and quality of documents translated by machine. As you said, learning new skills such as machine translation post-editing has to be done. We handle a number of projects like this, as there are many things that still have to be corrected. Inaccurately translated documents are a common occurrence that causes delays in projects, among other things.
Legal Translation Solutions