New Capabilities Allow for Spelling Variations and Compound Word Handling
SEATTLE – June 7, 2011 – Serials Solutions®, a business unit of ProQuest® LLC, announced today it has enhanced its German-language search functionality within the Summon™ web-scale discovery service by enabling a host of sophisticated features such as support for spelling variations and intelligent handling of various word forms, compound words and stop words.
While the Summon™ service has always supported UTF-8 encoding for foreign-language search, the complexities of the German language present challenges for returning the most relevant search results to users. With the new, sophisticated German-language search functionality in the Summon™ service, users will be able to search quickly and easily in German, without the need to account for the language’s unique characters and phrasing.
“Our goal has always been to provide researchers with the fastest and easiest discovery solutions available, and enhancing language search capabilities is high on the priority list,” said Andrea Michalek, director of technology, discovery services with Serials Solutions. “We have always supported match-for-match search in German, but we are now taking it a step further by offering researchers a truly native search functionality to ensure they receive the most relevant search results.”
The following sophisticated German-language search features are now available through the Summon™ service:
- Support for Spelling Variations – The Summon™ service now includes support for umlauts (ä, ö and ü), other types of diacritics and the sharp “s” (ß) within German-language search. The German umlauts (ä, ö and ü) can be spelled as “ae”, “oe” and “ue” respectively, and the Summon™ service allows users to search for words with umlauts using these variations. Additionally, users may substitute “ss” for “ß” and will receive relevant results that include “ß.”
- German-Language Stemming – Stemming is a commonly used technique in search engines that reduces different forms of a word to a common stem, so a search for one form also matches the others. The Summon™ service now applies stemming to German-language search, allowing users to search for either the singular form or the plural form of most nouns.
- Enhanced Handling of Compound Words – The German language allows for the productive formation of compound words. For example, “Abwasserbehandlungsanlage” means “wastewater treatment plant.” To account for this, the Summon™ service performs a process called decompounding, which splits compound words into their components, enabling users to receive results for “wastewater treatment plant” even if they search with its components “Abwasser,” “Behandlung” and “Anlagen.”
- Intelligent Stop Word Support – The Summon™ service enables smart handling of stop words. For example, when users search for “und der Dichter” (meaning “and the poet”), the Summon™ service returns documents containing the phrase “und der Dichter” as well as documents containing “Dichter” without the stop words “und” and “der.”
- Intelligent Relevancy Scoring – Across all of the above features, the Summon™ service ranks documents containing the exact form of the search terms above documents containing only alternate or derived forms.
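For readers who want a concrete picture of three of the features above (spelling variations, stop words and exact-form promotion), the following is a minimal Python sketch. It is not the Summon™ implementation, which this release does not describe; the names `UMLAUT_MAP`, `STOP_WORDS`, `fold`, `content_terms` and `score` are hypothetical, the stop word list is a tiny illustrative sample, and stemming and decompounding are left out.

```python
import re

# Assumed mapping for illustration: umlauts and sharp s to their ASCII spellings.
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

# Tiny illustrative stop word list; a real list would be much longer.
STOP_WORDS = {"und", "der", "die", "das"}


def fold(term: str) -> str:
    """Lowercase a term and replace umlauts and sharp s with ASCII spellings."""
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in term.lower())


def tokens(text: str) -> list[str]:
    """Split text into lowercase word tokens, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def content_terms(query: str) -> list[str]:
    """Return the folded query terms with stop words removed."""
    return [fold(t) for t in tokens(query) if t not in STOP_WORDS]


def score(query: str, document: str) -> int:
    """Return 0 for no match, 1 for a variant match, 2 for an exact match.

    A variant match means every non-stop-word query term appears in the
    document after folding. An exact match additionally requires the query
    text, as typed, to appear in the document.
    """
    doc_terms = {fold(t) for t in tokens(document)}
    terms = content_terms(query)
    if not terms or not all(t in doc_terms for t in terms):
        return 0
    return 2 if query.lower() in document.lower() else 1
```

In this sketch a query for “Strasse” still finds a document containing “Straße”, but a document matching the query exactly as typed receives the higher score, which mirrors the ranking behavior described in the last bullet.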
The Serials Solutions® Summon™ service currently enables sophisticated search features in Dutch, French, and Japanese, and will be adding enhancements for search in additional languages in the near future. The Library at the University of Konstanz became the first German institution to adopt the Summon™ web-scale discovery service, and recently completed the full integration of the solution into its technology suite.