Google Search Engine Tutorial, Level I
The fastest and most comprehensive way of finding a variety of resources on the Web is using search engines, tools that allow you to look for occurrences of character strings (representing words, terms, expressions, etc.) in Web documents. Search engines such as Google and AltaVista work by creating and consulting indexes: lists of all of the character strings in each document they find on the Web. Without these indexes, tools would have to search through tens of millions of Web pages to find what a user is looking for, a task that would take days!
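To see why an index makes searching fast, consider the minimal sketch below. It is not Google's implementation; the three "pages" and the function name build_index are made up for illustration. The index maps each word to the pages that contain it, so answering a query is a single lookup rather than a scan of every page.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

# Hypothetical miniature "Web" of three pages (illustrative data only).
pages = {
    "page1": "public transportation in Ottawa",
    "page2": "Ottawa bus schedule",
    "page3": "public library hours",
}

index = build_index(pages)

# Looking up a word consults the index instead of re-reading every page.
print(sorted(index["ottawa"]))  # ['page1', 'page2']
print(sorted(index["public"]))  # ['page1', 'page3']
```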
Individual search engines such as Google compile and index Web pages independently. Google is the most popular search engine on the Web and performs millions of queries daily. It maintains one of the largest databases of Web pages and also includes blogs, discussion boards, files (.pdf, .doc, .rtf, etc.), images and videos, giving you access to many different kinds of documents. To learn more about Google and its features, consult the help files (www.google.ca/intl/en/about.html), or the Google Guide at www.googleguide.com.
However, despite Google’s sheer size and popularity, it is not the only search engine. Studies have shown that there can be significant variation in results of different search engines (for example, because they index pages at different times, or because they rank them using different criteria); therefore, it may be beneficial to consult other engines when searching. You can find out more about some other search engines in resources such as the Bare Bones Web tutorial at www.sc.edu/beaufort/library/pages/bones/bones.shtml.
Another alternative to Google is a different kind of search tool: the meta-search engine. Meta-search engines, such as Dogpile (www.dogpile.com), do not index Web pages themselves; instead, they search within several other engines simultaneously.
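The idea of searching several engines at once can be sketched as follows. This is only an illustration, not a description of how Dogpile works: engine_a and engine_b are hypothetical stand-ins that return fixed result lists, and the sketch queries them one after another rather than truly simultaneously. The merged list alternates between the engines and drops duplicates.

```python
def engine_a(query):
    # Hypothetical stand-in for one engine's ranked results.
    return ["a.example/1", "shared.example", "a.example/2"]

def engine_b(query):
    # Hypothetical stand-in for a second engine's ranked results.
    return ["shared.example", "b.example/1"]

def meta_search(query, engines):
    """Query each engine, then interleave the results and drop duplicates."""
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged

merged = meta_search("public transportation Ottawa", [engine_a, engine_b])
print(merged)
# ['a.example/1', 'shared.example', 'b.example/1', 'a.example/2']
```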
For this tutorial, we will concentrate on Google’s features and options. You can use many of these functions in other search tools, although you may need to use different techniques or enter your queries a bit differently (e.g. using different structures, symbols or operators).
The search will return numerous results; the total number of pages found and your query are displayed at the top right of the page. The first results are those considered by Google to be most relevant. The first line of each result indicates the title of the Web page. These titles are underlined and in blue font – they are links to the Web pages. Under each title there is an excerpt from the Web page called a snippet. Within the snippet, you will see your query strings in bold. The address or URL (uniform resource locator) of the page is shown in green.
On the right of the window you may see a list of Sponsored Links. These are paid advertisements offered by Google.
Click on one or two of the results of the search to see the pages that were found.
You will notice in the top right portion of the results section that the words is, there and in are not hyperlinks as public, transportation and Ottawa are. (The hyperlinks will redirect you to Answers.com, a free online reference site. The page that appears contains dictionary entries, encyclopedia articles and other information related to the word.)
Unlike Google in English, Google in French does not insert hyperlinks for search words at the top right of the screen.
The order of your search strings affects the results received and the order in which they are presented (called page ranking). Google gives higher rank to Web pages that have the strings in the same order as in the search. Google also considers proximity and ranks documents in which the search strings are near each other higher than those in which the strings are further apart.
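The effect of order and proximity can be illustrated with a toy scoring function. Google's actual ranking method is not described in this tutorial, so everything below (the function names, the two sample pages and the scoring rule) is an assumption made purely for illustration. A page ranks higher when the search strings appear in the same order as in the query and closer together.

```python
def order_and_proximity(query, text):
    """Return (in_order, span) for a query, or None if a word is missing.

    in_order: True when the first occurrences follow the query's word order.
    span: distance in words between the earliest and latest first occurrence.
    """
    words = text.lower().split()
    terms = query.lower().split()
    if not terms or any(term not in words for term in terms):
        return None
    positions = [words.index(term) for term in terms]
    in_order = positions == sorted(positions)
    span = max(positions) - min(positions)
    return in_order, span

def rank_key(score):
    # Pages with the words in query order sort first, then smaller spans.
    in_order, span = score
    return (not in_order, span)

query = "public transportation Ottawa"
page_a = "Public transportation in Ottawa is reliable"
page_b = "Ottawa has many parks and some public funding for transportation"

scores = {
    "page_a": order_and_proximity(query, page_a),  # (True, 3)
    "page_b": order_and_proximity(query, page_b),  # (False, 9)
}
ranked = sorted(scores, key=lambda name: rank_key(scores[name]))
print(ranked)  # ['page_a', 'page_b']
```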
Auto-stemming, or word variation, is a feature of Google that searches not only the exact string you entered, but also similar forms. These other forms may be plural, possessive, inflected or conjugated forms of words.
When searching multiple strings, there is an implicit AND operator between the strings. This means that Google only returns the pages that match all of the search strings (or their variants).
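The implicit AND can be pictured as a filter that keeps only the pages containing every search string. The sketch below is a simplified illustration with made-up pages; unlike Google, it matches exact words only and ignores word variants.

```python
def matches_all(query, text):
    """Implicit AND: every query word must appear in the page text."""
    page_words = set(text.lower().split())
    return all(term in page_words for term in query.lower().split())

# Hypothetical pages (illustrative data only).
pages = {
    "page1": "public transportation in Ottawa",
    "page2": "Ottawa bus schedule",
    "page3": "public library hours",
}

hits = [name for name, text in pages.items() if matches_all("public Ottawa", text)]
print(hits)  # ['page1']
```

Only page1 is returned, because it is the only page that contains both public and Ottawa.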
When you search many common abbreviations and acronyms, Google will also recognize and return results containing the full forms.
The term public transport is preferred in the British Isles and most Commonwealth countries, whereas public transportation, public transit and mass transit are used most often in North America. The term transit is less likely to include long-distance forms of public transportation, such as long-distance or commuter railroads, intercity buses, or intercity railways.
You can put a phrase, proper name or series of words in a specific order in quotation marks to search for these strings in that order, or to indicate an exact search string.
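The difference between an ordinary search and a search in quotation marks can be sketched as a check for consecutive words. This is a simplified illustration, not Google's method; the function name and sample texts are made up.

```python
def contains_phrase(phrase, text):
    """True when the words of `phrase` appear consecutively, in order, in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

# Both texts contain the two words, but only the first has them as a phrase.
print(contains_phrase("public transportation", "Ottawa public transportation map"))  # True
print(contains_phrase("public transportation", "transportation for the public"))     # False
```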
In the Find web pages that have… section, each field corresponds to a simple operator explained above.
all these words: search items entered in the search field with the implicit AND operator
the exact wording or phrase: items in quotation marks
one or more of these words: the OR or | operator
The But don’t show pages that have… section allows you to specify strings to be excluded from the search.
any of these unwanted words: search strings to be excluded by using the minus (-) operator
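The correspondence between these fields and the simple operators can be sketched as a small function that assembles the equivalent query string. The function and parameter names are hypothetical, and the word limousine is an invented example. Note that the minus sign is attached directly to its word, while OR has a space on either side.

```python
def build_query(all_words=(), exact_phrase="", any_words=(), unwanted_words=()):
    """Assemble a query string equivalent to the Advanced Search fields."""
    parts = list(all_words)                                   # implicit AND
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')                     # quotation marks
    if any_words:
        parts.append(" OR ".join(any_words))                  # OR operator
    parts.extend(f"-{word}" for word in unwanted_words)       # minus operator
    return " ".join(parts)

query = build_query(
    all_words=["Ottawa"],
    exact_phrase="public transportation",
    any_words=["taxi", "cab"],
    unwanted_words=["limousine"],
)
print(query)  # Ottawa "public transportation" taxi OR cab -limousine
```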
Need more tools? Here you can further refine your search.
Results per page: You can define how many results appear on each page.
Language: You can specify a particular language. All of the results returned will be in this language.
File type: You can specify a particular file format (e.g. .pdf, .doc, .rtf) for your results, or exclude one specific file type from the results.
Domain: This feature allows you to search within a given domain (e.g. gc.ca for Government of Canada Web pages or uottawa.ca for University of Ottawa Web pages).
Click the plus sign beside Date, usage rights, numeric range, and more to see additional features.
Date: When you choose one of the date options, only pages modified (updated, created, indexed) within the specified timeframe appear in the results. This information is gathered by Googlebot, which crawls and indexes Web pages.
Usage Rights: This refers to the legal shareability of published content. For more information, click the Usage rights link.
Where your keywords show up: This function allows you to specify where on a page the search strings must appear. You may choose anywhere, title, text, URL or links in the page.
Region: You can choose a country from which the results should originate.
Numeric range: You can enter two numbers in the fields; only results with numbers in this range will be returned. This corresponds to the two-point ellipsis (..) operator.
SafeSearch: This allows you to remove or allow explicit adult content in the search results. There are three options for filtering – moderate, strict or none. By default, the SafeSearch option is set to moderate. For more information, click the SafeSearch link. You can further edit this setting on the Preferences page.
Find pages similar to the page (under Page-Specific tools): Find pages similar to the page entered in the search field.
Find pages that link to the page (under Page-Specific tools): Find pages that have links to the page entered in the search field.
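Several of these refinements can also be typed directly into the search field. The sketch below assembles such a query. The .. operator is the one mentioned above; site: and filetype: are the Google operators corresponding to the Domain and File type options, although this tutorial does not name them. The function and parameter names are hypothetical.

```python
def refine_query(terms, domain=None, file_type=None, numeric_range=None):
    """Append typed operators that mirror some Advanced Search options."""
    parts = [terms]
    if domain:
        parts.append(f"site:{domain}")                        # Domain option
    if file_type:
        parts.append(f"filetype:{file_type.lstrip('.')}")     # File type option
    if numeric_range:
        low, high = numeric_range
        parts.append(f"{low}..{high}")                        # Numeric range option
    return " ".join(parts)

query = refine_query(
    "public transportation",
    domain="gc.ca",
    file_type=".pdf",
    numeric_range=(2005, 2009),
)
print(query)  # public transportation site:gc.ca filetype:pdf 2005..2009
```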
Google definitions allows you to search the Web for glossary and dictionary entries and other definitions of words and terms. It can gather several definitions together and present them to the user.
Google allows you to save preferred settings, so that you don’t need to reset them every time you search. (Note that this will not work in the Writing Centre, where preferences are lost when you shut down the computer.)
These are just a few of the search options that Google offers. You can find out much more about Google tools by exploring the home page and the more > even more links that appear at the top of the page.
In addition to its well-known search engine, Google offers a lesser-known Directory of Web resources. (Its principles are similar to those of the Yahoo! Directory, which is better known.) The directory contains sites chosen by human editors and classified into categories and sub-categories to facilitate Web browsing.
The pages are subject to Importance Ranking, a system which places higher-quality pages (as determined by their Google page rank) at the top of the list. You can easily search within Google categories or perform a Web search from the Directory.
The green bar to the left of each result shows the quality ranking of the Web page.
NOTE: If you enter your search string(s) and press the I’m Feeling Lucky button, you will be taken directly to the first result. This can be useful when you are confident about your query, or when you are querying popular pages.
NOTE: Punctuation marks and special characters such as . ! ? , ; @ / # are ignored in Google searches. However, the Google search field can also be used as a calculator, and in these instances, mathematical symbols are used. For more information, refer to Google’s calculator help files (www.google.com/help
NOTE: There is no way to force Google to recognize capitals. By default, searches are not case-sensitive.
NOTE: Actually, there are some exceptions to this rule. Google also searches links that point to the pages, and the keywords that are entered in the HTML code to describe the page’s contents. So if a word is found in a link to the page or the keywords entered for the page, but does not appear on the page itself, Google may still be able to locate the page for you.
NOTE: When using simple operators +, - and ~, do not enter a space between the operator and the word (i.e., + search is incorrect, while +search is correct).
NOTE: Be sure to insert a space on either side of OR or the vertical bar (e.g., string1 OR string2, string1 | string2).
NOTE: Since Google ranks pages with many occurrences of the search strings highest, you may find that the first results still contain both taxi and cab. You are likely to see more of a difference as you go down the list to lower-ranked pages.
NOTE: For more information about crawling and indexing by Googlebot, refer to Google’s support and help pages or the Google Guide (http://www.googleguide.com/
NOTE: Not all online resources can be found by search engines. The Deep Web or Invisible Web refers to pages that have not been indexed by search engines. These pages can include databases, sites that require a login and password or information that is not linked from other pages. A query in a search engine will return many results, but remember that not all Web resources will be found.
Not all Web pages are of high quality. What are some things that should be considered when assessing a Web page? What qualities or features make a Web page a reliable resource? Are there ways you can evaluate these qualities or features by looking at a Web page or site?
How can Directories help you focus your search? How are the results you find likely to differ from the results of searching using a search engine?
Google is not the only search engine available. Some other search engines are Yahoo! (www.yahoo.com or the Canadian site http://ca.yahoo.com), Ask (www.ask.com) and AltaVista (www.altavista.com or the Canadian site http://ca.altavista.com). Take a look at one or all of these other search engines. Try repeating some of the searches above and see how the results compare. Do these search engines offer the same or similar search options?
You may also want to try a meta-search engine, a tool that synthesizes the results of searching in many different search engines at once. One example of a meta-search engine is Dogpile (www.dogpile.com). Try a search or two with this tool. What do you think are the advantages and challenges of using a meta-search engine?
Tutorial written by Cheryl McBride. (2009-06-11)