University Park, Pa. -- Search experts who tout the benefits of using advanced query markers have it wrong: Web searches with "and," "or," "must appear" and "phrase" fare no better than simple, dressed-down submissions, according to a Penn State researcher.
"Our research shows that query operators, commonly thought to narrow searches, don't return more relevant results and don't reduce the number of non-relevant results," said Bernard J. Jansen, assistant professor of information sciences and technology and co-author of "Coverage, Relevance, and Ranking: The Impact of Query Operators on Web Search Engine Results." The article, written with Caroline Eastman of the University of South Carolina, appeared in the October 2003 issue of ACM Transactions on Information Systems.
While 80 to 90 percent of Web searchers typically forgo query operators, common wisdom is that more expert users who tailor their searches end up with more and better quality results. But this may be a holdover from earlier search systems that aren't as sophisticated as what's now in the marketplace, Jansen said.
Instead of requiring users to formulate complex queries, today's search engine systems utilize more complex algorithms to retrieve and rank documents, he added. As a result, the barrier for Internet searches is low.
"If you can type and spell -- and most search engines help with spelling -- then you can pretty much conduct Web searches," Jansen said. "It's like today's cars: If you can turn the key, you don't need to know much else about the automobile."
For their study, Jansen and Eastman used real queries submitted to the Excite search service on May 2, 2001. The researchers then selected Google, AOL and MSN as the search engines on which to test the queries. The queries were run first with their markers and then without them. The results were scored for the number of documents retrieved, and 6,000 documents were evaluated for relevance and for the numerical ranking of relevant documents.
Some unexpected results occurred.
Operators like "must appear" that are typically used to reduce the number of results sometimes increased them. This occurred on all three search engines, Jansen said. In addition, the search engines showed great variability in the number of documents retrieved for particular queries. On average, MSN reported retrieving 17 times more results than the other two search engines.
All three search engines basically returned the same number of relevant documents, probably because the Web-crawling technologies they use are similar, according to the researchers. On average, Google did a little better retrieving relevant documents.
"In response to competitive pressures, search engine developers have drawn pretty much on the same intellectual capital from the same pool of researchers and practitioners," Jansen said. "As a result, the technology in these search engines appears to be fairly comparable."
The research results have implications for search engine developers and businesses marketing on the Web. Because only about half of a search's Top 10 documents will be relevant, search engine designers should look at clustering similar documents into topics, Jansen said. Search engines also need to personalize searches to individuals. This would provide users with targeted personal assistance based on their previous Web interactions, he added.
As for businesses, the research highlights how crucial key word selection is for their sites.
"Businesses and organizations can't count on users formulating queries that narrow in on their particular sites," the Penn State researcher said. "They should do good market research to identify key words and then make sure they have a good presence in search engine results."
The IST Web site is http://ist.psu.edu