Online French-English MT Exercise, Level I





I. Introduction


Uses of online MT


With the growth of the Internet and the increasing availability of documents in a variety of languages, there has also been growth in the field of machine translation (MT) and in the availability of (often free) MT tools online. Online MT can meet the needs of some users, particularly those who are interested primarily in getting an idea of the content of a document but are not concerned with the “quality” of the language in the translation produced. (The focus is thus very much on content rather than on form. These are generally cases in which a professional translator would not be hired to do the job if MT weren’t used; users would likely just try to muddle through in the source language or do without.)


Challenges of online MT


MT has shortcomings of which translators — if not always the public — are well aware, and of course online MT is no exception to the rule. (In fact, in some ways it’s the classic case.) These shortcomings may affect both the form and the content of the translations produced. However, while it is rarely difficult to find things to criticize in the output of MT systems, critics may be less aware of the challenges faced by MT systems that can explain many of the (admittedly sometimes laughable) translations produced.


Today’s online systems face a number of challenges at a fairly general level. First and foremost, like any MT system, online MT works exclusively with the forms of words that are present in the text — character strings — with no knowledge about the real world or an author’s intentions, no critical judgment, and no ability to reason or infer any information as a human can do.


The challenges also include the lack of control over the subject matter, vocabulary and form of texts that are submitted. Of course, given the almost infinite variety of words, terms, structures, and even errors that can be present in texts, it is certainly not surprising that no system can deal successfully with all texts that are submitted for translation.


In addition, some specific linguistic phenomena are widely recognized as problematic for MT (and in fact for most natural language processing, or NLP, applications). These include problems with lexical units (e.g., challenges of delimiting units, part of speech ambiguity, semantic ambiguity, metaphorical and fixed expressions, anaphora), with sentence structures (e.g., structural ambiguities, the need to analyze more than a sentence at a time to do a correct translation), and with the task of translation in general (e.g., the need to have world knowledge, and not just linguistic knowledge, in order to produce a correct translation). (Machine translation and its challenges are discussed, for example, in chapters 1, 2, 3 and 10 of L’Homme (1999). For more information, you can also consult Hutchins and Somers (1992) and Melby (1996).)
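
To make the problem of delimiting lexical units concrete, here is a minimal sketch that contrasts a naive word-for-word lookup with a lookup that prefers the longest matching entry. The glossary, its English glosses and the function names are all invented for illustration; this is not how any particular online MT system works.

```python
# Toy glossary: every entry is an illustrative assumption,
# not data taken from a real MT system.
GLOSSARY = {
    ("un",): "a",
    ("coup",): "blow",
    ("de",): "of",
    ("main",): "hand",
    ("coup", "de", "main"): "helping hand",
}


def word_for_word(tokens):
    """Translate each token on its own, ignoring multi-word units."""
    return [GLOSSARY.get((tok,), tok) for tok in tokens]


def longest_match(tokens):
    """At each position, prefer the longest glossary entry that matches."""
    max_len = max(len(key) for key in GLOSSARY)
    out, i = [], 0
    while i < len(tokens):
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + size])
            if key in GLOSSARY:
                out.append(GLOSSARY[key])
                i += size
                break
        else:
            # Unknown token: pass it through unchanged.
            out.append(tokens[i])
            i += 1
    return out


tokens = "un coup de main".split()
print(" ".join(word_for_word(tokens)))  # a blow of hand
print(" ".join(longest_match(tokens)))  # a helping hand
```

Even this toy version shows the trade-off: the longer entry gives a better result here, but only because someone anticipated the expression and added it to the glossary.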


Ways of dealing with some challenges


One way of dealing with problems is to adjust an MT system or how it works (e.g., the content of its dictionaries or the resources it uses to analyze the source text or generate the target text), either in response to the errors observed or to adapt it to the type of texts that the system will generally be dealing with. Unfortunately, the former sometimes just introduces new errors in other contexts, and the latter isn’t really an option for online MT, which can be used to translate anything and everything.


As users have little or no control over how the online MT systems they use deal with challenges, an alternative is to think about what can be done in the source text itself to reduce the impact of these problems. This kind of pre-editing is in some ways a simpler version of what is known as controlled language: the implementation of writing and vocabulary rules designed to make texts as easily “understandable” as possible, both for humans and for natural language processing applications.


Controlled language is commonly used when MT is necessary (for example, when the volume of texts to be translated is too large, the deadlines too short, or the cost too high for human translators to do a job, or when translators are not available). By predicting and avoiding problems in translation as texts are written, the time required to revise (post-edit) MT output is also minimized.


Of course, controlled language is implemented in large-scale projects (and generally those using highly developed MT systems), and not for online MT! But thinking about the principles behind controlled language in even this context can help you to see how you can improve the results of MT or NLP without having any control whatsoever over the application itself. Most of all, it will remind you that the system itself is only part of the equation: to work well, it needs input that it can work well with.
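
As a rough illustration of the idea, the sketch below checks a source sentence against two simple writing rules before it is submitted for translation. The word limit, the list of flagged expressions and the function name are hypothetical choices made for this example; real controlled languages define their own, much more detailed, rule sets.

```python
import re

# Hypothetical rules, chosen only for illustration.
MAX_WORDS = 20
FLAGGED_EXPRESSIONS = ["coup de main", "à la mitaine"]


def check_sentence(sentence, max_words=MAX_WORDS):
    """Return a list of warnings for one source-language sentence."""
    warnings = []
    # Rule 1: keep sentences short.
    words = re.findall(r"\w+", sentence)
    if len(words) > max_words:
        warnings.append(f"too long: {len(words)} words (limit {max_words})")
    # Rule 2: avoid expressions that should not be translated word-for-word.
    lowered = sentence.lower()
    for expr in FLAGGED_EXPRESSIONS:
        if expr in lowered:
            warnings.append(f'fixed expression: "{expr}"')
    return warnings


print(check_sentence("Je dois tout faire à la mitaine."))
print(check_sentence("Ils jouent au badminton dans la cour."))
```

A sentence that produces no warnings is not guaranteed to translate well; the check only catches the problems its rules were written to anticipate.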


The exercises


For these exercises, you can use your choice of online MT systems, including any or all of the following:



II. Getting ready

  1. If you want, open Word (Start > Microsoft Office Word 2003), and paste the sentences below into a new document, so you can keep a copy of their translations by different MT systems and compare them. (You can even convert them to table form if you want to be really organized. To learn how, consult the document Word: Converting Text to Tables on the CERTT site.)
  2. Click on one or more of the links above to open the page for an MT tool in a Web browser.


III. Evaluating challenges in MT


Each of the sentences in the groups below poses at least one difficulty for MT. Sentences have been grouped together because each sentence in a group presents the same particular type of challenge (perhaps in addition to others).

For the sentences in each group below:


  1. Evaluate the sentences and try to identify the item(s) and phenomena that may cause problems for an online MT system, and why. For example:
    1. Some character strings in the sentences could correspond to different words belonging to distinct part of speech categories (part of speech ambiguity).
    2. Others could be associated with a single part of speech, but have two or more meanings (semantic ambiguity, including homonymy and polysemy).
    3. Some sentences may contain complex items, expressions or images that should not be broken down and translated word-for-word (issues relating to delimitation of lexical units, metaphors or fixed expressions).
    4. Others may have structures that could be analyzed in more than one way, depending on the elements that are linked by conjunctions or are modified by prepositional or other phrases, or on the combinations of part of speech classes that are identified (scope of conjunctions, attachment of modifiers, structural ambiguity).
    5. Still others may include cases in which pronouns or other items are used to take the place of a “content” word, or in which words have been left out instead of being repeated (anaphora).
    6. Finally, to be correctly translated some sentences may require reference to linguistic information from outside the sentence itself (need for extrasentential analysis), or real-world knowledge that is altogether external to the text.
  2. Once you’ve identified problematic items, try to determine what problem the sentences all share, and to find one or more terms that describe this problem (that you can use, for example, to name the group).
  3. For each sentence, think about the “correct” translation of the item and/or sentence an MT system would ideally produce, as well as the translation(s) that could result if a system fell into the “trap(s)” you identified.
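
If it helps to see part of speech ambiguity from the system's side, the sketch below looks up each character string of a sentence in a tiny lexicon and reports the strings that could belong to more than one category. The lexicon, its category labels and the function name are assumptions made for this example; a real system relies on far larger resources and on context to choose between the options.

```python
import re

# Toy lexicon: a handful of French forms and the categories each could
# belong to. Illustrative only, and far from complete.
LEXICON = {
    "ferme": {"adjective", "noun", "verb"},
    "ton": {"determiner", "noun"},
    "lit": {"noun", "verb"},
    "cours": {"noun", "verb"},
    "note": {"noun", "verb"},
    "porte": {"noun", "verb"},
}


def ambiguous_tokens(sentence):
    """List the tokens that the lexicon assigns to more than one category."""
    tokens = re.findall(r"[^\W\d_]+", sentence.lower())
    return [
        (tok, sorted(LEXICON[tok]))
        for tok in tokens
        if len(LEXICON.get(tok, ())) > 1
    ]


for tok, categories in ambiguous_tokens("L’étudiant ferme la porte derrière lui."):
    print(tok, "->", ", ".join(categories))
# ferme -> adjective, noun, verb
# porte -> noun, verb
```

A human reader resolves both cases instantly from the rest of the sentence; the point of the sketch is that the strings alone do not settle the question.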

Group A



Je me suis acheté un pull caca d'oie.

Mon nouveau système d’exploitation exige plus de mémoire vive dans mon ordinateur portable.

Ma belle-mère aime faire des casse-tête avec son petit-fils.

À la cabane à sucre, j’aime beaucoup les oreilles de crisse.

Le raton laveur adore les poissons rouges.


Group B


Ça baigne dans l'huile.

Il est parti sur un coup de tête.

J'aime donner des coups de main aux gens dans le besoin.

Il y a une quantité folle de nids de poule sur les routes de Montréal cette année.

Je suis aux prises avec des mauvaises herbes.

J’étais en train d’expliquer mon idée. Mais en cours de route, j’ai perdu le fil.

Ce programme ne tourne pas sur cette plateforme. Je dois tout faire à la mitaine.

Qui va à la chasse perd sa place.


Group C


À Notre-Dame, un bossu bosse à sonner les cloches du clocher. 

Cours, Forrest, cours !

Ton lit est comment ? Ferme ou pas très ferme ?

Quand on parle aux petits, c’est le ton ferme qui compte.

Un des participants remarque que nous avons une autre rencontre demain, et l’autre note le fait.


Group D


Le chasseur arme le chien du fusil et appuie sur la détente.

Comme dessert, il y avait des éclairs, des mille-feuilles, des religieuses, des choux, des petits fours, des sablés et des fondants, mais j'ai opté pour la tarte flambée à la carambole et la glace à la lime.

Ils jouent au badminton dans la cour.

Mon chat pèse douze livres et mesure trente-trois pouces.

Le son est très bon pour la santé, surtout dans des casseroles.

Je suis un séminaire de grec ancien.

La marche m’a fait du bien.

J’ai déposé mes affaires dans le local.


Group E


Il lit le numéro d’un ton ferme.

Je vois des étudiants débutants et avancés au cours de temps à autre.

Les étudiants boivent de la bière et les professeurs du café.

La créativité est la plus importante et la plus grande force des jeunes.

L’étudiant ferme la porte derrière lui.

Le combattant brave la garde.


Group F


Le juge Fish est un homme très sobre. Il porte sa robe (sa toge) avec beaucoup de dignité.

Mon frère est fermier et ma sœur est mannequin. Ma sœur trouve ses talons aiguille très chics, mais je rassure toujours mon frère que ses bottes de caoutchouc sont beaucoup plus pratiques.

Mon frère est fermier et ma sœur est mannequin. Mon frère trouve ses bottes de caoutchouc très pratiques, mais je rassure toujours ma sœur que ses talons aiguille sont beaucoup plus chics.


IV. Using online MT systems


Once you’ve thought about the challenges and the possibilities for translations, test your observations by translating the sentences using an online MT system.


  1. Copy the groups of sentences from the exercises.
    1. Select the sentences with your mouse;
    2. On the keyboard, hold down the Ctrl key and press the C key.

The sentences are copied to the Windows Clipboard. (You won’t see anything on the screen, but they’re there.)

  2. On the MT system Web page(s) you opened in your Web browser, paste these sentences into the field for the text to translate:
    1. Click in the field to place the cursor there;
    2. On the keyboard, hold down the Ctrl key and press the V key.

The sentences you copied appear in the field.

  3. Choose the language pair and direction that you want; in this case, French-English. (This is generally done from a drop-down list that appears on the page, or by clicking a button.)
  4. Click the Translate or OK button to translate the text.

After a moment, the translation will appear on the page in a new field. If you want, you can copy it and paste it into your Word document using the keyboard shortcuts above (Ctrl + C and Ctrl + V).


V. Evaluating the results

  1. Once the results are available, compare them to what you expected to find.
    1. Were your hypotheses verified?
    2. Did the system(s) fall into the traps you spotted?
    3. Did it/they successfully avoid any of the traps?
    4. Did it/they fall into any others you hadn’t expected?
    5. Did you observe any challenges that seem to go together? Which ones?
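
If you have many sentences and systems to compare, one possible way to organize this check is sketched below: for each sentence, you record the fragments you would expect in a good translation and the fragments that would betray a trap, then classify each MT output. The function name, the labels and the sample outputs are all made up for illustration; the outputs shown do not come from a real system.

```python
def classify_output(mt_output, expected, traps):
    """Compare one MT output against your hypotheses for a sentence.

    expected: fragments a good translation would probably contain.
    traps: fragments that suggest the system fell into a trap.
    Matching is a plain substring test, so results need a human check
    (for example, "head" would also match "ahead").
    """
    text = mt_output.lower()
    if any(fragment.lower() in text for fragment in expected):
        return "avoided trap"
    if any(fragment.lower() in text for fragment in traps):
        return "fell into trap"
    return "check by hand"


# Hypotheses for "Il est parti sur un coup de tête."
expected = ["on a whim", "on impulse"]
traps = ["head"]

# Invented outputs, standing in for what different systems might return.
print(classify_output("He left on a blow of head.", expected, traps))  # fell into trap
print(classify_output("He left on a whim.", expected, traps))          # avoided trap
print(classify_output("He departed suddenly.", expected, traps))       # check by hand
```

The "check by hand" category matters: an output that matches none of your fragments may reveal a trap you had not expected.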


VI. Evaluating possibilities

  1. Consider how the major challenges could have been avoided. Would it be possible to get better results by modifying the source text? How?
    1. If you want, try changing one or two of the sentences according to your hypothesis and re-submitting them. What are the results?


VII. Wrapping up

  1. To make a copy of your results as a backup or to transfer them to another computer, copy your Word document to a USB key, or e-mail yourself a copy as an attachment.


VIII. Questions for reflection


  • As you did these exercises, what did you notice about how the various online MT tools work? Did you note any differences between systems?
  • How do you think MT (and also specifically online MT) could be useful? For whom? In what kind of situation? Are the use and usefulness of MT in general and of online MT likely to be different?
  • There is no denying that there are serious problems with the translations produced. These would be obvious to almost anyone! But now that you’ve thought about where these problems come from, do you see the MT output and the problems the same way that you did before? Why or why not?
  • What kinds of problems did you identify? Which did you find most striking or important, and why?
  • Do you think these problems would be equally important for translation in the opposite direction? Would they be manifested in the same way, do you think? Why or why not?
  • Do you think that these problems could affect other types of NLP, translation or office tools? Which ones, and how?
  • Were you surprised at any of the potentially problematic items that one or more of the systems translated correctly? Which ones? Can you think why the system(s) might have succeeded (especially where others failed)?
  • Can you think of other potential problems that were not covered here but could be pertinent?
  • Were you able to think of ways of avoiding some problems with MT? How? Would these be measures that could be implemented on a large scale (e.g., in controlled language initiatives)? Do you think it would be worth doing so if a text was expected to be translated using MT?


Tutorial created by the CERTT Team. (2012-11-21)