Thoughts on machine translation

There’s been a lot of buzz lately about machine translation, with translators and translation industry watchers either condemning machine translation as useless and damaging to the profession or touting it as the next big thing in the industry. ATA President Jiri Stejskal wrote his most recent column for the ATA Chronicle about MT, and reader Polly-Vous Français sent this link to a Financial Times blog about machine translation’s role in the blogosphere. Last evening, the Colorado Translators Association sponsored an excellent presentation by Cris Silva of All in Portuguese, which focused on integrating machine translation into your existing suite of computer-assisted translation tools.

Cris’ presentation, which she also gave at the recent ATA conference with Giovanna Boselli as a co-presenter, was interesting in that she used the ATA certification exam grading scale to score the output from Google Translate. Out of three sample texts that Cris submitted to Google Translate, two would have failed the ATA certification exam and one would have passed (for what it’s worth, I think that one in three, or about 33%, is a better passing rate than what is achieved by the humans who take the ATA exam!). So, what’s the outlook for MT? Friend? Foe? Colleague?

I’ll go on the record as saying that I think that most translators are much too paranoid about machine translation. MT technology has come a long way, and I think that we’re on the cusp of its being considered a standard productivity tool for translators, much as translation memory is today. However, I don’t think it’s worth losing sleep over machine translation sending human translators the way of the telegraph operator. For an industry leader’s outlook, check out 5 Questions with Renato Beninatto on Sarah Dillon’s blog “There’s Something About Translation.” In this interview, Renato makes the case that the translation industry is in a bind, with “good translators…scarce and becoming scarcer” and the demand for translation growing at 15-20% a year. Companies are turning to MT as an option because their volume of content demands it, not because they want to avoid human involvement in the process.

So, I decided to do my own unscientific test of what Google Translate is producing these days, using three different texts that are similar to what I might be translating on any given day: fluffy, dense and flowery. Let’s see how it did.

For the “fluffy” test, I pulled a tidbit from the French Wikipedia page about France’s first lady, Carla Bruni-Sarkozy. Let’s read about her life pre-Nicolas!

Alors qu’elle vit avec l’éditeur littéraire Jean-Paul Enthoven, elle entame une liaison avec le fils de celui-ci, Raphaël Enthoven, qui était alors marié avec Justine Lévy. En 2001, elle a un fils, Aurélien, avec Raphaël Enthoven[19]. En 2004, elle est l’un des personnages du premier best-seller de Justine Lévy Rien de grave. L’auteur, fille de Bernard-Henri Lévy (dont l’éditeur historique et meilleur ami n’est autre que le père de Raphaël Enthoven), y expose son important passage à vide et sa période de reconstruction à la suite de son divorce avec Raphaël Enthoven, parti avec Carla Bruni.

And the result from Google Translate:

While she lives with the literary editor Jean-Paul Enthoven, she began an affair with the son of the latter, Raphaël Enthoven, who was then married to Justine Lévy. In 2001, she has a son, Aurélien, with Raphael Enthoven [19]. In 2004, she is one of the first bestseller by Justine Lévy Nothing serious. The author, daughter of Bernard-Henri Lévy (including historical editor and best friend is none other than the father of Raphael Enthoven), sets out its important transition to vacuum and its period of reconstruction following his divorce from Raphaël Enthoven, who with Carla Bruni.

At first, with the exception of its confusion over the use of the present tense in French (which sometimes drives me crazy too!), things are looking good. Then, Mme. Bruni-Sarkozy herself becomes a bestseller, rather than a character in one. Finally, the “transition to vacuum” following “his” divorce is a complete disaster. This passage highlights some of the areas in which MT really struggles, for example the fact that French possessives such as “son” and “sa” agree with the noun that follows them, so they do not tell you whether “her” or “his” is meant. The key question: is this editable, or could I have translated it from scratch in less time? I’d choose to redo this one entirely.

Next, let’s move on to something dense. Here’s an excerpt from the Quebec department of labor’s information about its purpose:

Emploi-Québec agit à la fois comme producteur, utilisateur et diffuseur d’information nationale et régionale sur le marché du travail.

  • Son rôle de producteur de données lui permet d’offrir une information de qualité, fiable et à jour.
  • Utilisant ces données dans sa prestation de services, Emploi-Québec occupe une position privilégiée pour bien cerner les besoins d’information des personnes et des entreprises.

Google Translate tells us that:

Emploi-Québec acts both as a producer, user and disseminator of information on national and regional labor market.

* His role as a producer of data allows it to provide quality information, reliable and current.
* Using these data in its services, Emploi-Québec occupies a privileged position to identify the information needs of individuals and businesses.

Here, I’d say that with the exception of some misplaced modifiers and the pesky problem of French referring to all nouns as “he” or “she,” we’re moving closer to something that’s editable, more along the lines of what one would expect from a non-expert human translator.

Now, how about something flowery? Here’s a passage from a project I worked on last year, a promotional document for an arts festival in Senegal.

Faut-il s’étonner que, portés par cette lame de fond, les responsables africains aient, à leur tour, pris la décision historique de transformer l’Organisation de l’Unité Africaine (O.U.A.) en Union Africaine (U.A.) ? En reprenant à son compte la vision prospective de Kwame Nkrumah, la nouvelle Union Africaine a donné à l’Afrique de nouvelles ambitions.

And the result:

Is it any wonder that, carried by the slide background, African leaders have, in turn, took the historic decision to transform the Organization of African Unity (OAU) African Union (AU)? Showing in his account of the vision of Kwame Nkrumah, the new African Union has given to Africa for new ambitions. Complete their destiny.

Wow. To quote that immortal line from Cool Hand Luke, “What we’ve got here is a failure to communicate.” Even someone who doesn’t speak French can appreciate the exuberant incomprehensibility of this passage. Slide background? His account of the vision? And the OAU and the AU are now the same thing? We’ll just let Google Translate complete its destiny on passages such as this.

So, my unscientific observation would be that for straightforward texts, this type of MT is almost at the point where it could be a useful add-on to a translator’s tool suite. Many translation environment tools now incorporate some type of MT module, and on a text such as the second example, I would probably be willing to try this feature. For more nuanced texts, I think that the human touch is still critical.
