Online French-English MT Exercise, Level I
Uses of online MT
With the growth of the Internet and the increasing availability of documents in a variety of languages, there has also been growth in the field of machine translation (MT) and in the availability of (often free) MT tools online. Online MT can meet the needs of some users, particularly those who are interested primarily in getting an idea of the content of a document, but are not concerned with the “quality” of the language in the translation produced. (The focus is thus very much on content rather than on form. These are generally cases in which a professional translator would not be hired to do the job if MT weren’t used; users would likely just try to muddle through in the source language or do without.)
Challenges of online MT
MT has shortcomings of which translators — if not always the public — are well aware, and of course online MT is no exception to the rule. (In fact, in some ways it’s the classic case.) These shortcomings may affect both the form and the content of the translations produced. However, while it is rarely difficult to find things to criticize in the output of MT systems, critics may be less aware of the challenges faced by MT systems that can explain many of the (admittedly sometimes laughable) translations produced.
Today’s online systems face a number of challenges at a fairly general level. First and foremost, like any MT system, online MT works exclusively with the forms of words that are present in the text — character strings — with no knowledge about the real world or an author’s intentions, no critical judgment, and no ability to reason or infer any information as a human can do.
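To make the point about character strings concrete, here is a deliberately naive sketch in Python. It is not how any of the online systems actually work; the glossary and the function name are invented for this illustration. It simply replaces each string it recognizes with one fixed equivalent, so a fixed expression such as "coup de main" comes out word for word.

```python
# Toy word-for-word "translator": a naive illustration, not a real MT system.
# The glossary entries are hypothetical and chosen only for this example.
TOY_GLOSSARY = {
    "il": "he",
    "donne": "gives",
    "un": "a",
    "coup": "blow",
    "de": "of",
    "main": "hand",
}

def word_for_word(sentence: str) -> str:
    """Replace each known character string with its single glossary entry."""
    words = sentence.lower().rstrip(".").split()
    # Unknown strings are passed through in angle brackets.
    return " ".join(TOY_GLOSSARY.get(w, f"<{w}>") for w in words)

print(word_for_word("Il donne un coup de main."))  # he gives a blow of hand
```

A human reader knows that the sentence means "He lends a hand", but nothing in the strings themselves says so.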
The challenges also include the lack of control over the subject matter, vocabulary and form of texts that are submitted. Of course, given the almost infinite variety of words, terms, structures, and even errors that can be present in texts, it is certainly not surprising that no system can deal successfully with all texts that are submitted for translation.
In addition, there are some specific linguistic phenomena that are recognized as widely problematic for MT (and in fact, most natural language processing, or NLP, applications). These include problems with lexical units (e.g., challenges of delimiting units, part of speech ambiguity, semantic ambiguity, metaphorical and fixed expressions, anaphora), with sentence structures (e.g., structural ambiguities, the need to analyze more than a sentence at a time to do a correct translation), and with the task of translation in general (e.g., the need to have world — and not just linguistic — knowledge in order to produce a correct translation). (Machine translation and its challenges are discussed for example in chapters 1, 2, 3 and 10 of L’Homme (1999). For more information, you can also consult Hutchins and Somers (1992) and Melby (1996).)
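Part of speech ambiguity can be illustrated with a small sketch. The mini-lexicon below is hypothetical and far smaller than anything a real system would use; it only shows how quickly the possible analyses multiply when each form has several readings and nothing yet rules any of them out.

```python
from itertools import product

# Hypothetical mini-lexicon: each form maps to (part of speech, English gloss) pairs.
TOY_LEXICON = {
    "le": [("DET", "the"), ("PRON", "it")],
    "ton": [("NOUN", "tone"), ("DET", "your")],
    "ferme": [("ADJ", "firm"), ("VERB", "closes"), ("NOUN", "farm")],
}

def candidate_readings(words):
    """Return every combination of readings the lexicon allows, with no filtering."""
    options = [TOY_LEXICON[w] for w in words]
    return list(product(*options))

readings = candidate_readings(["le", "ton", "ferme"])
print(len(readings))  # 2 * 2 * 3 = 12
```

Only one of the twelve combinations ("the firm tone") fits a sentence like "c’est le ton ferme qui compte"; choosing it requires looking at the surrounding structure, which is exactly where systems can go wrong.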
Ways of dealing with some challenges
One way of dealing with problems is to try to adjust an MT system or how it works (e.g., the content of its dictionaries or the resources it uses to analyze the source text or generate the target text) according to the errors observed, or to adapt it to the type of texts that the system will generally be dealing with. Unfortunately, the former sometimes just introduces new errors in other contexts, and the latter isn’t really an option for online MT, which can be used to translate anything and everything.
As users have little or no control over how the online MT systems they use deal with challenges, an alternative is to think about what can be done in the source text itself to reduce the impact of these problems. This kind of pre-editing is in some ways a simpler version of what is known as controlled language: that is, the implementation of writing and vocabulary rules designed to make texts as easily “understandable” as possible, both for humans and for natural language processing applications.
Controlled language is commonly used when MT is necessary (for example, when the volume of texts to be translated is too large, the deadlines too short, or the cost too high for human translators to do a job, or when translators are not available). By predicting and avoiding problems in translation as texts are written, the time required to revise (post-edit) MT output is also minimized.
Of course, controlled language is implemented in large-scale projects (and generally those using highly developed MT systems), and not for online MT! But thinking about the principles behind controlled language in even this context can help you to see how you can improve the results of MT or NLP without having any control whatsoever over the application itself. Most of all, it will remind you that the system itself is only part of the equation: to work well, it needs input that it can work well with.
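As a rough idea of what a pre-editing check might look like, the Python sketch below flags sentences that break two toy rules. The word limit and the watch list are assumptions made up for this example; real controlled languages define their own, much more detailed rules.

```python
import re

# Hypothetical rules for illustration only.
MAX_WORDS = 20
WATCH_LIST = ["coup de main", "coup de tête", "à la mitaine"]

def pre_edit_warnings(sentence: str) -> list:
    """Return a list of warnings for a sentence that breaks one of the toy rules."""
    warnings = []
    word_count = len(re.findall(r"\w+", sentence))
    if word_count > MAX_WORDS:
        warnings.append(f"long sentence ({word_count} words)")
    lowered = sentence.lower()
    for expression in WATCH_LIST:
        if expression in lowered:
            warnings.append(f"fixed expression: {expression}")
    return warnings

print(pre_edit_warnings("Il est parti sur un coup de tête."))
```

A writer who sees a warning can then rephrase the sentence (for example, by replacing the fixed expression with a literal equivalent) before submitting the text for translation.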
For these exercises, you can use your choice of online MT systems, including any or all of the following:
- Bing Translator
- Google Translate
- PROMT Online
- SDL Free Translation
Each of the sentences in the groups below poses at least one difficulty for MT. Sentences have been grouped together because each sentence in a group presents a particular type of challenge (perhaps in addition to others).
For the sentences in each group below:
Je me suis acheté un pull caca d'oie.
Mon nouveau système d’exploitation exige plus de mémoire vive dans mon ordinateur portable.
Ma belle-mère aime faire des casse-tête avec son petit-fils.
À la cabane à sucre, j’aime beaucoup les oreilles de crisse.
Le raton laveur adore les poissons rouges.
Ça baigne dans l'huile.
Il est parti sur un coup de tête.
J'aime donner des coups de main aux gens dans le besoin.
Il y a une quantité folle de nids de poule sur les routes de Montréal cette année.
Je suis aux prises avec des mauvaises herbes.
J’étais en train d’expliquer mon idée. Mais en cours de route, j’ai perdu le fil.
Ce programme ne tourne pas sur cette plateforme. Je dois tout faire à la mitaine.
Qui va à la chasse perd sa place.
À Notre-Dame, un bossu bosse à sonner les cloches du clocher.
Cours, Forrest, cours !
Ton lit est comment ? Ferme ou pas très ferme ?
Quand on parle aux petits, c’est le ton ferme qui compte.
Un des participants remarque que nous avons une autre rencontre demain, et l’autre note le fait.
Le chasseur arme le chien du fusil et appuie sur la détente.
Comme dessert, il y avait des éclairs, des mille-feuilles, des religieuses, des choux, des petits fours, des sablés et des fondants, mais j'ai opté pour la tarte flambée à la carambole et la glace à la lime.
Ils jouent au badminton dans la cour.
Mon chat pèse douze livres et mesure trente-trois pouces.
Le son est très bon pour la santé, surtout dans des casseroles.
Je suis un séminaire de grec ancien.
La marche m’a fait du bien.
J’ai déposé mes affaires dans le local.
Il lit le numéro d’un ton ferme.
Je vois des étudiants débutants et avancés au cours de temps à autre.
Les étudiants boivent de la bière et les professeurs du café.
La créativité est la plus importante et la plus grande force des jeunes.
L’étudiant ferme la porte derrière lui.
Le combattant brave la garde.
Le juge Fish est un homme très sobre. Il porte sa robe (sa toge) avec beaucoup de dignité.
Mon frère est fermier et ma sœur est mannequin. Ma sœur trouve ses talons aiguille très chics, mais je rassure toujours mon frère que ses bottes de caoutchouc sont beaucoup plus pratiques.
Mon frère est fermier et ma sœur est mannequin. Mon frère trouve ses bottes de caoutchouc très pratiques, mais je rassure toujours ma sœur que ses talons aiguille sont beaucoup plus chics.
Once you’ve thought about the challenges and the possibilities for translations, test your observations by translating the sentences using an online MT system.
The sentences are copied to the Windows Clipboard. (You won’t see anything on the screen, but they’re there.)
The sentences you copied appear in the field.
After a moment, the translation will appear on the page in a new field. If you want, you can copy it and paste it into your Word document using the keyboard shortcuts above (Ctrl + C and Ctrl + V).
As you did these exercises, what did you notice about how the various online MT tools work? Did you note any differences between systems?
How do you think MT (and also specifically online MT) could be useful? For whom? In what kind of situation? Are the use and usefulness of MT in general and of online MT in particular likely to differ?
There is no denying that there are serious problems with the translations produced. These would be obvious to almost anyone! But now that you’ve thought about where these problems come from, do you see the MT output and the problems the same way that you did before? Why or why not?
What kinds of problems did you identify? Which did you find most striking or important, and why?
Do you think these problems would be equally important for translation in the opposite direction? Would they show up in the same way? Why or why not?
Do you think that these problems could affect other types of NLP, translation or office tools? Which ones, and how?
Were you surprised at any of the potentially problematic items that one or more of the systems translated correctly? Which ones? Can you think why the system(s) might have succeeded (especially where others failed)?
Can you think of other potential problems that were not covered here but could be pertinent?
Were you able to think of ways of avoiding some problems with MT? How? Would these be measures that could be implemented on a large scale (e.g., in controlled language initiatives)? Do you think it would be worth doing so if a text was expected to be translated using MT?
Tutorial created by the CERTT Team. (2012-11-21)