Le Migou Tutorial, Level I 





I. Introduction


Le Migou is an online monolingual concordancer that was developed at the Observatoire de linguistique Sens-Texte (OLST) at Université de Montréal by Patrick Drouin (http://www.mapageweb.umontreal.ca/drouinp/). It allows a user to consult general and specialized language corpora which, for the most part, have been compiled by members of OLST. The corpora are mainly in French, but there is one in English (the SACOT corpus on terrorism), and one in Korean (on computer science).


A concordancer such as Le Migou can make it easier to search through texts by allowing a user to search for one or several character strings and display all occurrences of these character strings in context.
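
As a rough illustration of what "displaying all occurrences in context" means, here is a minimal Python sketch of a keyword-in-context display. It is not Le Migou's implementation: the function name concordance, the width parameter and the sample sentences are all assumptions made for this example.

```python
import re

def concordance(text, search_string, width=30):
    """Return each occurrence of search_string with `width` characters of context on each side."""
    lines = []
    # re.escape treats the search string literally; IGNORECASE is an assumption of this sketch.
    for match in re.finditer(re.escape(search_string), text, flags=re.IGNORECASE):
        start, end = match.start(), match.end()
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        # Right-align the left context so the search string lines up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Hypothetical two-sentence "corpus" used only for this example.
sample = (
    "The group waged a campaign of terror. "
    "Experts say terror networks adapt quickly."
)

for line in concordance(sample, "terror", width=20):
    print(line)
```

Each printed line shows one occurrence of the search string between square brackets, with a little text on either side, which is essentially what a concordancer's results table presents.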


Le Migou is conveniently available online at all times. However, since the corpora are already built, you have less flexibility in terms of content: you can consult only the domains and corpora that are provided.


Please note that Le Migou’s interface is available in French only. As mentioned above, there is an English corpus that you can search in English, but the interface itself always remains in French.


To use Le Migou, you will need an account. (Both the account and the use of the tool are free.) You can register from Le Migou’s website by clicking the Access button. Your username and password will be emailed to you.


II. Getting ready

  1. Open the web browser of your choice (for example, Internet Explorer or Mozilla Firefox) from the shortcut on the Desktop or from the Start menu.

  2. Open Le Migou’s home page by clicking on the following link: http://olst.ling.umontreal.ca/?page_id=54.
  3. Click the Access button to open the Le Migou search page.


III. Getting to know the interface

  1. The search interface has several fields:
    1. In the first search field next to Premier mot, you can enter a word or group of words (character strings) for which to search in a corpus;
    2. In the drop-down list next to the search field, you can specify how you want to search for your character strings:
      1. Exactly the way it appears in the search field (without preceding or following characters) (Séquence exacte);
      2. Potentially followed by more characters (Débute par cette séquence);
      3. Potentially preceded by more characters (Se termine par cette séquence);
      4. Potentially preceded and followed by more characters (Séquence imbriquée). (See Note 1.)
  2. Below the first field, there is a drop-down list of available Boolean operators, which help you combine search strings. The first in the list is the ET (AND) operator, which searches for two words together. The second in the list is the OU (OR) operator, which searches for one word or the other. If the ET operator is used, you can specify that you want the words searched in the order in which you entered them by checking the Respecter l’ordre des mots checkbox next to the drop-down list.
  3. The Deuxième mot search field allows you to enter a second search string, if you wish. The drop-down list next to it allows you to specify how you want to search for this string (as in the Premier mot drop-down list).
  4. The drop-down list next to Corpus allows you to specify which corpus you wish to search. Le Migou gives you access to a series of corpora, mainly in French; most of these are specialized, but there are also two newspaper corpora (articles from Le Monde and the Canadian press). You can search through one corpus at a time by selecting the one you wish to search, or you can search all of them at once by selecting Tous les corpus.
  5. The drop-down list next to Distance maximale entre les mots allows you to specify how many words can appear between the specified search strings if you want to combine two strings using the ET operator.
  6. The drop-down list next to Nombre de contextes par page allows you to specify how many contexts will be displayed per page in the search results.
  7. Finally, the Lancer la recherche button allows you to start your search.
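
The search options described above can be sketched in Python to make their logic concrete. This is only an approximation under stated assumptions, not Le Migou's actual code: the mode keys, the function names, the case-insensitive comparison and the simple word splitting are all hypothetical choices made for this example.

```python
import re

# Hypothetical mode keys; the values quote the interface labels described in the tutorial.
MODES = {
    "exact": "Séquence exacte",
    "starts": "Débute par cette séquence",
    "ends": "Se termine par cette séquence",
    "contains": "Séquence imbriquée",
}

def word_matches(word, query, mode):
    """Apply one of the four matching modes to a single word (case-insensitive by assumption)."""
    word, query = word.lower(), query.lower()
    if mode == "exact":
        return word == query
    if mode == "starts":
        return word.startswith(query)
    if mode == "ends":
        return word.endswith(query)
    if mode == "contains":
        return query in word
    raise ValueError(f"unknown mode: {mode}")

def positions(words, query, mode):
    """Indexes of the words that match the query in the given mode."""
    return [i for i, w in enumerate(words) if word_matches(w, query, mode)]

def sentence_matches(sentence, first, second=None, operator="ET",
                     max_distance=5, keep_order=False,
                     first_mode="exact", second_mode="exact"):
    """Decide whether a sentence satisfies a one- or two-string search."""
    words = re.findall(r"\w+", sentence)  # simplistic word splitting (assumption)
    hits_1 = positions(words, first, first_mode)
    if second is None:
        return bool(hits_1)
    hits_2 = positions(words, second, second_mode)
    if operator == "OU":
        return bool(hits_1 or hits_2)
    # ET: both strings must occur, with at most max_distance words between them.
    for i in hits_1:
        for j in hits_2:
            if i == j:
                continue
            gap = abs(j - i) - 1
            if gap <= max_distance and (j > i or not keep_order):
                return True
    return False

candidates = ["terror", "terrorism", "counterterrorism", "terrorist", "bioterror", "error"]
for mode, label in MODES.items():
    print(label, "->", [w for w in candidates if word_matches(w, "terror", mode)])
```

Note that in this sketch "potentially followed by more characters" also accepts the bare string itself, so the broader modes always return at least what Séquence exacte returns.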

IV. Complete a search and analyze the results

  1. Complete a search for the term terror:
    1. Enter terror in the Premier mot search field;
    2. From the drop-down list next to it, select Séquence exacte;
    3. From the Corpus drop-down list, select Corpus SACOT (EN);
    4. Leave the Deuxième mot search field blank for now;
    5. Leave the rest of the default values;
    6. Click the Lancer la recherche button.

A new page opens displaying the search results.

  2. Analyze the search results, which are displayed in table form.
    1. The Identificateur column identifies the sentence in which the search string occurs. You can click on the sentence number to display the sentence in a longer context.
    2. The Corpus column identifies the corpus in which the search string occurs.
    3. The Document column identifies the name of the file in which the search string occurs.
    4. The Contextes column displays all occurrences of the search string in the context of about one sentence. The search string is indicated in red. (See Note 2.)
  3. Above the table, the number of contexts that are displayed on the page is indicated. (If this is the maximum number of results per page, there may be some results on subsequent pages.)
  4. Below the table, you can click the Suite des occurrences button to see the following occurrences.
  5. Finally, you can click the Nouvelle recherche button to return to the search page.
  6. Try the other search possibilities by searching for words that begin or end with the string terror or words which contain this string. Look at the different strings that are displayed.
    1. Of these possibilities, what is the most efficient search? Why?
    2. What are the advantages of using this type of search? What are the disadvantages?
  7. Try some searches that combine this string with another (for example, war, cell or violence) using ET or OU.
    1. What searches did you launch? What functions did you use? Why?
    2. What type of information can you gather from doing these types of searches? How is this information useful for translators or terminologists?
  8. Complete a search for all forms of the verb to wage:
    1. How can you search for all occurrences of the different forms of the verb?
    2. What effects will the different ways of searching have on the results? What are the difficulties you might encounter when searching for various forms of verbs using this type of tool?
    3. In what type of structures does this verb appear? With what types of words is this verb used (e.g., what are its typical subjects and objects, which in French are sometimes called its combinatoire)?
  9. Complete a similar search in the French corpus Corpus medical OLST (FR), for the verb traiter, and observe its structures and combinatoire.
    1. Do you think all occurrences of this verb have the same meaning? Why?
    2. Do you think the search results represent all possible meanings of the verb? Complete some searches in different corpora (for example, the computer science corpora) to test your hypothesis.
    3. Compare the structures and common subjects and objects of the occurrences in the other corpora with those in the medical corpus. Do you notice any differences? Any similarities?
    4. Do you see any differences between these results and those from your analysis of the verb to wage? Might these be significant? Why?
    5. How is this information useful for translators or terminologists?


NOTE 1: By default, Le Migou searches single words, that is, character strings which are graphically delimited by spaces, punctuation marks, symbols and carriage returns. 
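
To make this definition of a "word" concrete, the following Python sketch splits a text wherever a space, punctuation mark, symbol or line break occurs. It only approximates the behaviour described in the note: the function name split_into_words is hypothetical, and how Le Migou actually treats hyphens and apostrophes is an assumption here.

```python
import re

def split_into_words(text):
    """Split text into runs of letters and digits.

    Everything else (spaces, punctuation marks, symbols, line breaks)
    is treated as a delimiter, which is an assumption based on Note 1.
    """
    # [^\W_] matches any Unicode letter or digit, but not the underscore.
    return re.findall(r"[^\W_]+", text)

print(split_into_words("The counter-terrorism unit's report\narrived."))
print(split_into_words("Séquence exacte, séquence imbriquée"))
```

Under this assumption, a hyphenated form such as counter-terrorism is seen as two separate words, which helps explain why an exact search for a compound or multiword expression may behave differently from what you expect.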


NOTE 2: To make searching easier, the format of the texts has been modified slightly, for example, to separate punctuation marks from the words they follow. Sometimes some effort will be required to interpret the contexts as they appear. 
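
The kind of modification described in this note can be pictured with a short Python sketch. It is purely illustrative: the exact preprocessing applied to the corpora is not documented in this tutorial, and the function name separate_punctuation and the set of punctuation marks handled are assumptions.

```python
import re

def separate_punctuation(text):
    """Insert a space before common punctuation marks so they stand apart from words."""
    # Any whitespace already present before the mark is normalized to a single space.
    return re.sub(r"\s*([.,;:!?])", r" \1", text)

print(separate_punctuation("War was waged, and terror followed."))
```

A naive rule like this also splits numbers such as 3.5, which is one reason why contexts processed this way can look slightly unnatural and take some effort to read.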


V. Questions for reflection


  • What are your first impressions of the functions and functioning of Le Migou?
  • Are the search functions in Le Migou similar to those in other concordancers that you have worked with? What about the interface? Is it similar or different? Is the presentation of results in Le Migou similar to the presentation of results in other concordancers you have worked with?
  • What are some of the advantages and disadvantages of using an online concordancer compared to a concordancer that searches through corpora stored on your own computer?
  • What are the main difficulties that you face when searching through corpora using a concordancer such as Le Migou?


Tutorial developed by the CERTT team (2007).