Module III: Search Engine comparisons based on the subject of libraries and literacy.
(ready to use)
|Education, Social Studies
|Preschool, Kinder, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, Continuing, Informal
Literacy, library construction, community outreach, poor communities|
Rationale of the Unit
|Having once read that the construction of libraries has a direct, positive influence on the literacy rates in poorer communities, my wish to find information surrounding this issue has been keen|
As an avid Google user, I ran the above-stated search query, which finally led me to the following website, returned as the first query result.
Following this back to its home page (http://www.ala.org/olos/ ), I found it to be the Office for Literacy and Outreach Services, an office of the American Library Association and a most pertinent information source for the topics in which I have an interest.
Due to the high ranking of this site within my Google search, I decided to apply the same query in three other search engines to see if the OLOS website would also be returned as one of the query results.
SEARCH ENGINES TO COMPARE:
Google (http://www.google.com/ )
Northern Light (http://www.northernlight.com/ )
Ixquick (http://www.ixquick.com/ ), considered a "major metacrawler"
Vivisimo (http://vivisimo.com/ ), another "major metacrawler"
Background and Resources
Activities and Open-ended problems
1) OLOS website existence
2) Query rank of OLOS site
3) Number of hits
4) Relevancy of other websites
Dialogues, Discussions, and Presentations
The comparison of query results was based on the first 20 results returned by each of the four search engines. While this might seem too small a sample for a solid comparison of search engine behavior, my rationale is that the study used one specific search engine, Google, as the baseline for comparison. Putting myself in the place of the average impatient web user (a highly opinionated statement), I reasoned that if I could not find my desired result within the first 20 results, my tendency would be to move on to another search query, another search engine, or both.
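The top-20 check described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `normalize` and `target_ranks` are hypothetical, the result list is assumed to have been copied by hand from a search engine's results page, and the URL comparison is a simplifying assumption (it ignores the scheme, a leading "www." and a trailing slash).

```python
from urllib.parse import urlparse


def normalize(url):
    """Reduce a URL to host + path so that small differences do not matter."""
    parts = urlparse(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def target_ranks(results, target, cutoff=20):
    """Return the 1-based ranks, within the first `cutoff` results,
    of every page that is the target site or lies underneath it."""
    prefix = normalize(target)
    ranks = []
    for rank, url in enumerate(results[:cutoff], start=1):
        page = normalize(url)
        if page == prefix or page.startswith(prefix + "/"):
            ranks.append(rank)
    return ranks
```

Calling `target_ranks(results, "http://www.ala.org/olos")` on a result list returns an empty list when the site does not appear in the first 20 results, and a list of positions when the site or one of its sub-pages does.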
Query: "library construction" literacy poor communities
Website search goal: http://www.ala.org/olos (Office for Literacy and Outreach Services, subcommittee of the ALA).
Relevant Information Goals: the information I was searching for concerned libraries, the physical construction of the library building itself, organizations and projects, and their effects on poor communities of the world, specifically, if possible, on the literacy rate of the community. The OLOS site was my measure of relevancy for other websites.
Google results:
Number of Results: 137
Website search goal: Found
Rank: 1st and 2nd query results
The first 2 results were actually subsets of the root OLOS page for which I was searching.
Relevancy of Other Results: most of the remaining first 20 results did not meet my Relevant Information Goals. Several were state sites relaying discussions and policies on future grants for library construction and aspirations towards literacy programs within schools and/or the community. While this is somewhat relevant to my search topic, it is still outside the realm of what I was seeking (and had found within the OLOS site).
Northern Light results:
Number of Results: 131 items in 81 sources
Website search goal: Not Found
Relevancy: Department of Education
Several state sites (FL, UT, CA, TX, TN, KS, IL)
Websites with descriptions such as "Atlas of Florida vascular plants" (rank: #10) proved to be very confusing.
Ixquick results:
Number of Results: 30 unique top-ten pages selected from at least 8,120,365 matching results. 26 results total
Website search goal: OLOS not found.
Query result #12 very relevant: http://www.ala.org/alonline/news/2000/000410.html
Relevancy: #1 result: New York State Library (http://www.nysl.nysed.gov/library/fs/ncl-gen.htm ). This same website ranked 3rd in the Google search.
Again, many state websites dealing with state legislative laws and proposals (RI, CA, NE, AZ)
Confusion again was caused by such results as #7: Religion Books Alphabetic Index of Titles Commencing with "L" (http://book.netstoreusa.com/index/bkixrkl.shtml )
Web Page Article title: Clinton Cites ALA in Announcing "Digital Divide" Initiative
Ixquick query description: THE MAGAZINE OF THE AMERICAN LIBRARY ASSOCIATION :Go back to American Libraries home page Go to American Library Association home page SEARCH AMERICAN LIBRARIES Like what you see? Join the American Library Association online News brief: http://www.ala.org/alonline/news/2000/000410.html
The query description is not useful at all in determining the content of the cited website. Had I not been specifically searching for a site under the root www.ala.org site, I most likely would have bypassed this result. Not many users, I'll venture a guess, will take the time to open many cited sites without some kind of relevant description. It becomes a waste of time.
Vivisimo results:
Number of Results: 70
Website search goal: http://www.ala.org/olos/coleman_2001_speaker.html
Relevancy of Other Results: Of all the search engines, this one seemed to return the biggest mix of the results from the other three.
As stated by Lawrence and Giles, information retrieval technology may not require exact matches.
Also, the information I want may be found on pages that are hidden from direct queries. In other words, they are not indexed by a search engine. One must know the exact URL and go to it directly rather than through a search engine.
Google uses link analysis as its main method of page ranking. Even though advanced search syntax is not supported, link analysis provides for high relevancy. (***cite references and facts as to why). Perhaps the #1 ranking of the OLOS site is due to the fact that Google automatically looks for terms in close proximity as one of its primary search criteria. Thus, while the double quotes might be somewhat helpful, they are not as necessary in Google as in other search engines that do not weigh word proximity as heavily.
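To give a rough feel for what link analysis means, here is a simplified scoring sketch in the spirit of the published PageRank idea: a page scores highly when other well-scored pages link to it. It is an assumption-laden illustration, not Google's actual ranking method; the function name `link_scores`, the damping value and the tiny link graph in the usage note are all hypothetical.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing each page's score along its links.

    `links` maps a page name to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores
```

For a hypothetical graph such as `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page "c" ends up with the highest score because two pages link to it, which is the intuition behind a frequently linked site such as OLOS ranking near the top.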
Northern Light has one of the largest search indexes on the web but while this may seem, and can be, a positive feature, it also seems to interfere with relevancy when using fairly simple (limited syntax) searches. (***check on validity of this statement).
While Northern Light offers the unique function of separating results into sub-categories, I found this feature too limiting, and it produced results even further from what I considered relevant. Even though relevancy percentages are displayed for each document, relevancy, as with most search engines, seems to be a subjective determination.
Purges duplicate results from major search engines.
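Purging duplicates when combining result lists from several engines might look roughly like the sketch below. How any particular metacrawler actually does this is not documented here; the function name `merge_unique` is hypothetical, and treating URLs as identical after lowercasing and dropping a trailing slash is a crude assumption.

```python
def merge_unique(result_lists):
    """Combine several result lists, keeping only the first copy of each URL."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            # Crude comparison key (an assumption): ignore case and a trailing slash.
            key = url.strip().lower().rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged
```

Results are kept in the order in which they are first seen, so the ordering of the input lists decides which engine's copy of a duplicate survives.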
Sometimes strange search results are due to the relevance ranking systems that search engines use. Unfortunately, most of these systems are not made public. Some systems might recognize a string of search terms as a complete search phrase instead of separate words. There may be an internal set of stop words or Boolean terms that causes the search engine to interpret the query differently than intended. Many times, the reasons behind a high return of irrelevant results just cannot be pinpointed.
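The effect of quoted phrases and stop words on how a query is read can be illustrated with a small sketch. It assumes one plausible interpretation only: text in double quotes is kept as a phrase, and words on a stop list are silently dropped. The function name `parse_query` and the stop-word list are hypothetical and do not describe any particular search engine.

```python
import re

# Hypothetical stop-word list; real engines do not publish theirs.
STOP_WORDS = {"a", "an", "and", "of", "or", "the"}


def parse_query(query, stop_words=STOP_WORDS):
    """Split a query into quoted phrases and remaining single terms."""
    phrases, terms = [], []
    # Each match is either a double-quoted phrase or a run of non-space characters.
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:
            phrases.append(phrase.lower())
        elif word.lower() not in stop_words:
            terms.append(word.lower())
    return phrases, terms
```

Under these assumptions, the query used in this unit, "library construction" literacy poor communities, is read as one phrase plus three separate terms, while a query such as the history of libraries loses "the" and "of" before it is ever matched against pages.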
1) The Organization of Information, Arlene G. Taylor
2) Searching the World Wide Web, Steve Lawrence and C. Lee Giles
Relevant Web sites:
ON-LINE SEARCH ENGINE RESOURCES:
3) http://www10.org/cdrom/papers/577/ (Paper on Rank Aggregation)