
Business Information Systems

Content
1 Introduction
This paper presents Faceted Wikipedia Search, an alternative search interface for the English edition of Wikipedia. Faceted Wikipedia Search allows users to ask complex questions, like “Which rivers flow into the Rhine and are longer than 50 kilometers?” or “Which skyscrapers in China have more than 50 floors and were constructed before the year 2000?” against Wikipedia knowledge. Such questions cannot be answered using keyword-based search as provided by Google, Yahoo, or Wikipedia’s own search engine.
In order to answer such questions, a search engine must draw on structured knowledge, which needs to be extracted from the underlying articles. On the user interface side, a search engine requires an interaction paradigm that enables inexperienced users to express complex questions against a heterogeneous information space in an exploratory fashion. For formulating queries, Faceted Wikipedia Search relies on the faceted search paradigm. Faceted search enables users to navigate a heterogeneous information space by combining text search with a progressive narrowing of choices along multiple dimensions [6,7,5].
The user subdivides an entity set into multiple subsets. Each subset is defined by an additional restriction on a property. These properties are called the facets. For example, facets of an entity “person” could be “nationality” and “year-of-birth”. By selecting multiple facets, the user progressively expresses the different aspects that make up the overall question. Realizing a faceted search interface for Wikipedia poses three challenges:
1. Structured knowledge needs to be extracted from Wikipedia with precision and recall that are high enough to meaningfully answer complex queries.
2. As Wikipedia describes a wide range of different types of entities, a search engine must be able to deal with a large number of different facets. As the number of facets per entity type may also be high, the search engine must apply smart heuristics to display only the facets that are likely to be relevant to the user.
3. Wikipedia describes millions of entities. In order to keep response times low, a search engine must be able to efficiently deal with large amounts of entity data.
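The progressive narrowing described above, and the facet selection problem in challenge 2, can be illustrated with a minimal Python sketch. It is not the implementation used by Faceted Wikipedia Search or neofonie search. The records, the property names (`flows_into`, `length_km`, and so on), the helper functions, and the coverage threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy records; names and values are placeholders, not DBpedia data.
ENTITIES = [
    {"name": "River A", "type": "river", "flows_into": "Rhine", "length_km": 120},
    {"name": "River B", "type": "river", "flows_into": "Rhine", "length_km": 35},
    {"name": "River C", "type": "river", "flows_into": "Danube", "length_km": 210},
    {"name": "Tower D", "type": "skyscraper", "country": "China", "floors": 60},
]


def narrow(entities, prop, predicate):
    """Keep only entities that carry `prop` and whose value satisfies `predicate`."""
    return [e for e in entities if prop in e and predicate(e[prop])]


def facet_counts(entities, prop):
    """Count how often each value of `prop` occurs in the current result set."""
    return Counter(e[prop] for e in entities if prop in e)


def relevant_facets(entities, min_coverage=0.5):
    """Naive heuristic (an assumption): offer a property as a facet only if
    at least `min_coverage` of the current results carry it."""
    if not entities:
        return []
    usage = Counter(p for e in entities for p in e if p != "name")
    return sorted(p for p, n in usage.items() if n / len(entities) >= min_coverage)


# "Which rivers flow into the Rhine and are longer than 50 kilometers?"
# Each call adds one restriction on one facet.
rivers = narrow(ENTITIES, "type", lambda v: v == "river")
into_rhine = narrow(rivers, "flows_into", lambda v: v == "Rhine")
long_rivers = narrow(into_rhine, "length_km", lambda v: v > 50)

print([e["name"] for e in long_rivers])
print(facet_counts(rivers, "flows_into"))
print(relevant_facets(ENTITIES))
```

Each call to `narrow` corresponds to the user selecting one more facet value, and `facet_counts` produces the per-value counts a faceted interface typically shows next to each choice. A linear scan like this would not meet challenge 3 for millions of entities; a production system would rely on precomputed index structures instead.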
Faceted Wikipedia Search addresses these challenges by relying on two software components: the DBpedia Information Extraction Framework is used to extract structured knowledge from Wikipedia [4], and neofonie search, a commercial search engine, is used as an efficient faceted search implementation.
This paper is structured as follows: Section 2 describes the Faceted Wikipedia Search user interface and explains how facets are used for navigating and filtering Wikipedia knowledge. Section 3 gives an overview of the DBpedia Information Extraction Framework and the resulting DBpedia knowledge base. Section 4 describes how the efficient handling of facets is realized inside neofonie search. Section 5 compares Faceted Wikipedia Search with related work.
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (only limited: Kindle).
The file format PDF always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). Unfortunately, on the small screens of e-readers or smartphones, PDFs are rather annoying, requiring too much scrolling.
This eBook uses Watermark-DRM, a “soft” copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.