
Experimental IR Meets Multilinguality, Multimodality, and Interaction
Description
This book constitutes the refereed proceedings of the 6th International Conference of the CLEF Initiative, CLEF 2015, held in Toulouse, France, in September 2015.
The 31 full papers and 20 short papers presented were carefully reviewed and selected from 68 submissions. They cover a broad range of issues in the fields of multilingual and multimodal information access evaluation. Also included are a set of labs and workshops designed to test different aspects of mono- and cross-language information retrieval systems.

Content
- Intro
- Preface
- Organization
- Keynotes
- Personal Information Systems and Personal Semantics
- Evaluating the Search Experience: From Retrieval Effectiveness to User Engagement
- Beyond Information Retrieval: When and How Not to Find Things
- Contents
- Experimental IR
- Experimental Study on Semi-structured Peer-to-Peer Information Retrieval Network
- 1 Introduction
- 2 Related Work
- 3 Experimental Methodology
- 3.1 Testbeds and Queries
- 3.2 Retrieval Process and Evaluation
- 4 Retrieval Effectiveness of Semi-structured P2PIR Systems
- 4.1 Centralised System
- 4.2 Semi-structured P2PIR Architecture
- 5 Retrieval Models in Semi-structured System
- 6 The Resource Selection Methods on Semi-structured P2PIR Systems
- 6.1 Message Complexity
- 6.2 Retrieval Effectiveness
- 7 Conclusion
- References
- Evaluating Stacked Marginalised Denoising Autoencoders Within Domain Adaptation Methods
- 1 Introduction
- 2 Domain Adaptation Problem and Methods
- 2.1 Stacked Marginalized Denoising Autoencoders
- 3 Datasets and Evaluation Framework
- 4 Evaluation Results
- 4.1 Comparing sMDAs to Other Domain Adaptation Approaches
- 5 Conclusion
- References
- Language Variety Identification Using Distributed Representations of Words and Documents
- 1 Introduction
- 2 Related Work
- 3 Continuous Skip-Gram Model
- 3.1 Learning Sentence Vectors
- 3.2 Classification Using Distributed Representations
- 4 Alternative Methods for Language Variety Identification
- 4.1 Information Gain Word-Patterns
- 4.2 Emotion-labeled Graphs
- 5 Evaluation
- 6 Conclusions
- References
- Evaluating User Image Tagging Credibility
- 1 Introduction
- 2 A Multi-Topic Tagging Credibility Dataset (MTTCred)
- 2.1 User Credibility Dataset Design
- 2.2 Dataset Creation
- 2.3 Dataset Statistics
- 2.4 Deriving a Ground Truth Credibility Score
- 3 User Credibility Features
- 4 Problem Definition
- 4.1 Data Exploration
- 4.2 User Classification Experiments
- 4.3 Credible Users Retrieval Experiments
- 5 Conclusions and Future Work
- References
- Web and Social Media
- Tweet Expansion Method for Filtering Task in Twitter
- 1 Introduction
- 2 Filtering Method
- 2.1 Overview
- 2.2 Expansion Steps
- 3 Experimental Results
- 3.1 Experimental Setup
- 3.2 Results
- 3.3 Discussion
- 4 Conclusion and Future Work
- References
- Real-Time Entity-Based Event Detection for Twitter
- 1 Introduction
- 2 Background
- 2.1 Named Entities in Events and Twitter
- 3 Entity-Based Event Detection
- 3.1 Pre-processing
- 3.2 Clustering
- 3.3 Burst Detection
- 3.4 Cluster Identification
- 3.5 Event Merging
- 4 Experimentation
- 5 Results and Discussion
- 5.1 Effect of Named Entities
- 5.2 Nouns, Verbs, Hashtags and Retweets
- 5.3 Evaluation Measures
- 6 Conclusion
- References
- A Comparative Study of Click Models for Web Search
- 1 Introduction
- 2 Click Models
- 3 Evaluation Measures
- 4 Experimental Setup
- 5 Results
- 6 Conclusion
- References
- Evaluation of Pseudo Relevance Feedback Techniques for Cross Vertical Aggregated Search
- 1 Introduction
- 2 System Overview
- 3 System Details
- 3.1 Query Reformulation
- 3.2 Source Specific Query Reformulation
- 3.3 Result Aggregation
- 4 Evaluation
- 5 Results
- 5.1 Qualitative Evaluation
- 5.2 Quantitative Evaluation
- 5.3 Evaluation Guideline
- 6 Conclusions and Future Work
- References
- Long Papers with Short Presentation
- Analysing the Role of Representation Choices in Portuguese Relation Extraction
- 1 Introduction
- 2 Related Work
- 3 Data Annotation
- 4 Method
- 4.1 Conditional Random Field Model
- 4.2 Representation
- 4.3 Features
- 5 Evaluation
- 5.1 Analysis and Discussion
- 6 Conclusions
- References
- An Investigation of Cross-Language Information Retrieval for User-Generated Internet Video
- 1 Introduction
- 2 Related Work
- 3 Experimental Test Set and Evaluation
- 3.1 Query Construction for the CLIR Task
- 4 CLIR Using Single Field Indexes
- 5 CLIR Using Combined Metadata Fields
- 6 Conclusions and Further Research
- References
- Benchmark of Rule-Based Classifiers in the News Recommendation Task
- 1 Introduction
- 2 On-line Task: Setup and Results
- 2.1 Algorithms
- 2.2 Performance
- 3 Off-line Task: Setup and Results
- 3.1 Data and Task
- 3.2 Algorithms
- Association Rule Classifiers.
- Rule Learning (Baseline).
- Decision Trees.
- 3.3 Experimental Evaluation
- Trading Speed for Accuracy.
- Optimizing CBA.
- 4 Conclusion and Future Work
- References
- Enhancing Medical Information Retrieval by Exploiting a Content-Based Recommender Method
- 1 Introduction
- 2 Framework of the Integrated IR Model
- 2.1 Background to Recommender Systems Applications
- 2.2 Combining Information Retrieval and Recommender Systems
- 3 Experimental Test Collection
- 3.1 Query Set
- 3.2 Click-Through Data
- 3.3 Document Collection
- 3.4 Query Relevance Data
- 4 Experimental Investigation
- 4.1 Information Retrieval Component
- 4.2 Recommender Component
- 4.3 Combination of Results
- 4.4 Experimental Results
- 5 Conclusions and Further Investigations
- References
- Summarizing Citation Contexts of Scientific Publications
- 1 Introduction
- 2 Related Work
- 3 Materials and Methods
- 3.1 Data and Tools
- 3.2 Suggested Approach
- 3.3 Text Summarization
- 4 Experimental Results
- 5 Discussion and Conclusions
- References
- A Multiple-Stage Approach to Re-ranking Medical Documents
- 1 Introduction
- 2 Related Work
- 3 Methods
- 4 Experiments
- 4.1 Data
- 4.2 Evaluation Settings
- 4.3 Results
- 5 Conclusion
- References
- Exploring Behavioral Dimensions in Session Effectiveness
- 1 Introduction
- 2 Prior Studies on IR Simulation
- 3 The Session Simulator
- 3.1 Simulator Definition
- 3.2 A Simulation Step
- 4 Study Design
- 4.1 Research Questions
- 4.2 Test Collection and Search Engine
- 4.3 Search Goals, Gains and Cost Constraints
- 4.4 Query Formulation Strategies
- 4.5 Snippet Scanning and Stopping Behavior
- 4.6 Relevance Related Behavior
- 4.7 Session Generation
- 5 Experimental Results
- 6 Summary
- References
- Short Papers
- META TEXT ALIGNER: Text Alignment Based on Predicted Plagiarism Relation
- 1 Introduction
- 2 META TEXT ALIGNER
- 2.1 Predicting the Plagiarism Type
- 2.2 Using Predicted Plagiarism Type to Improve Text Alignment's Performance
- 3 Experiments and Analysis
- 4 Conclusion and Future Works
- References
- Automatic Indexing of Journal Abstracts with Latent Semantic Analysis
- 1 Introduction
- 2 Background
- 2.1 MeSH Hierarchy
- 2.2 PubMed Annotation
- 3 Method
- 3.1 Data
- 3.2 Latent Semantic Analysis
- 3.3 Choosing Closest Neighbors
- 3.4 MeSH Tag Scoring and Selection
- 3.5 Additional Ranking Experiments and Learning-to-Rank
- 4 Results and Discussion
- 5 Conclusions
- References
- Shadow Answers as an Intermediary in Email Answer Retrieval
- 1 Introduction
- 2 Shadow Answer
- 3 Experiment Data
- 4 Experiment Process
- 5 Experiment Results
- 6 Conclusions
- References
- Are Topically Diverse Documents Also Interesting?
- 1 Introduction
- 2 Methods
- Measuring Debates' Topical Diversity.
- Measuring Debate's Interestingness.
- Correlation of Debates' Topical Diversity and Interestingness.
- 3 Analysis
- 3.1 Datasets and Experimental Setup
- 3.2 Results
- Measuring Topical Diversity of Debates.
- Measuring Interestingness of Debates.
- The Correlation Between Interestingness and Diversity.
- 4 Conclusion
- References
- Modeling of the Question Answering Task in the YodaQA System
- 1 Introduction
- 2 Question Answering Approaches
- 3 Benchmarking
- 4 YodaQA Question Answering System
- 4.1 System Architecture
- 4.2 Reference Baseline
- 4.3 System Performance
- 5 Conclusion and Future Work
- 5.1 Benchmarking
- References
- Unfair Means: Use Cases Beyond Plagiarism
- 1 Introduction
- 2 Related Work
- 2.1 Types of Plagiarism
- 2.2 Plagiarism Detection
- 3 Types of Unfair Means Problems
- 3.1 Review of University Guidelines
- 3.2 Interviews with Staff in the University of Sheffield
- 4 Discussion
- 5 Summary
- References
- Instance-Based Learning for Tweet Monitoring and Categorization
- 1 Introduction
- 2 Methods
- 2.1 Overall Architecture of the System
- 2.2 Preprocessing
- 2.3 Indexing
- 2.4 k-NN
- 3 Results and Discussions
- 3.1 Q1: Is It Better to Build One KB for Each Domain, or to Merge Automotive and Banking into the Same KB?
- 3.2 Q2: Is It Better to Build One KB for Each Language, or to Merge English and Spanish into the Same KB?
- 4 Conclusion
- References
- Are Test Collections "Real"? Mirroring Real-World Complexity in IR Test Collections
- 1 Introduction
- 2 Status Quo and Related Work
- 3 Modality Categorization
- 4 Building Complex Collections
- 5 Conclusions
- References
- Evaluation of Manual Query Expansion Rules on a Domain Specific FAQ Collection
- 1 Introduction
- 2 Related Work
- 3 Test Collection
- 4 Methodology
- 4.1 Retrieval Models
- 4.2 Query Expansion Rules
- 5 Evaluation
- 5.1 QE Rules Accuracy
- 5.2 Retrieval Evaluation
- 6 Conclusion
- References
- Evaluating Learning Language Representations
- 1 Introduction and Motivation
- 2 Testing Outcome Versus Process
- 3 Existing Tests for Human Language Learning
- 4 Requirements for a Learning-Focused Evaluation
- 5 Conclusion
- References
- Automatic Segmentation and Deep Learning of Bird Sounds
- 1 Introduction
- 2 Automatic Segmentation of Noisy Bird Sounds
- 3 Deep Neural Network Classification
- 4 Results
- 5 Discussion and Conclusions
- References
- The Impact of Noise in Web Genre Identification
- 1 Introduction
- 2 Previous Work
- 3 Experiments
- 4 Conclusion
- References
- On the Multilingual and Genre Robustness of EmoGraphs for Author Profiling in Social Media
- 1 Introduction
- 2 Emotion-Labelled Graphs
- 3 Evaluation Framework
- 3.1 PAN-AP-14 Corpus
- 3.2 Methodology
- 4 Experimental Results
- 4.1 Age and Gender Identification
- 4.2 The Impact of EmoGraphs
- 5 Conclusions
- References
- Is Concept Mapping Useful for Biomedical Information Retrieval?
- 1 Introduction
- 2 Related Work
- 3 Experiments Using Concepts
- 3.1 Bag-of-Concepts vs. Bag-of-Words
- 3.2 Concepts as Phrases
- 4 Conclusions
- References
- Using Health Statistics to Improve Medical and Health Search
- 1 Introduction
- 2 Probabilistic Model for Personalized Health Search
- 2.1 Epidemiological Measures of Occurrence
- 2.2 Estimating Probability Distributions
- 3 Evaluation
- 3.1 Experimental Setup
- 3.2 Results
- 4 Discussion and Future Work
- References
- Determining Window Size from Plagiarism Corpus for Stylometric Features
- 1 Problem Statement
- 2 Methodology
- 3 Analysis and Results
- 4 Conclusion
- References
- Effect of Log-Based Query Term Expansion on Retrieval Effectiveness in Patent Searching
- 1 Introduction
- 2 Related Work
- 3 Experiments
- 3.1 Baseline Runs
- 3.2 Using
- 4 Analysis of the Retrieval Results
- 5 Conclusions and Future Work
- References
- Integrating Mixed-Methods for Evaluating Information Access Systems
- 1 Introduction
- 2 Related Work
- 3 Case Study: WorldCat.org
- 4 Data Integration
- 5 Conclusions
- References
- Teaching the IR Process Using Real Experiments Supported by Game Mechanics
- 1 Motivation
- 2 State-of-the-Art
- 3 System Overview
- 4 Conducting Experiments
- 5 Gamification
- 6 Conclusions and Future Work
- References
- Tweet Contextualization Using Association Rules Mining and DBpedia
- 1 Introduction
- 2 The Proposed Approaches for Tweet Contextualization
- 2.1 Statistical Approach Based on Inter-Terms Association Rules (ARE)
- 2.2 Semantic Approach Based on DBpedia (DBE)
- 3 Experiments and Results
- 4 Conclusion
- References
- Best of the Labs
- Search-Based Image Annotation: Extracting Semantics from Similar Images
- 1 Introduction
- 2 Related Work
- 3 Semantic Search-Based Image Annotation
- 3.1 Retrieval of Similar Images
- 3.2 Text Processing
- 4 ImageCLEF 2014 Annotation Challenge
- 4.1 Scalable Concept Image Annotation Task
- 4.2 DISA Participation
- 5 Evaluation
- 6 Conclusions and Future Work
- References
- NLP-Based Classifiers to Generalize Expert Assessments in E-Reputation
- 1 Introduction
- 2 Related Work
- 3 Reputation Monitoring
- 3.1 Approaches
- Terms Weighting.
- Baselines.
- Multi-word Expression.
- Lexical Context.
- 4 Author Profiling
- 4.1 Approaches
- 5 Experimental Evaluation and Results
- 5.1 Evaluation
- 5.2 Reputation Monitoring
- Reputation Dimensions.
- Priority Detection.
- 5.3 Author Profiling
- Author Ranking.
- Author Categorization.
- 5.4 Classes Distribution Issue and Perspectives
- 6 Conclusions
- References
- A Method for Short Message Contextualization: Experiments at CLEF/INEX
- 1 Introduction
- 2 Method Description
- 2.1 Sentence Scoring
- 2.2 Topic-Comment Relationship in Contextualization Task
- 2.3 Sentence Re-ordering
- 3 Evaluation
- 4 Other Applications of the Sentence Retrieval
- 4.1 Snippet Retrieval
- 4.2 Query Expansion
- 5 Conclusion
- References
- Towards Automatic Large-Scale Identification of Birds in Audio Recordings
- 1 Introduction
- 2 Feature Engineering
- 2.1 Metadata
- 2.2 openSMILE
- 2.3 Segment-Probabilities
- 3 Feature Selection
- 4 Training and Classification
- 5 Results
- 6 Discussion
- References
- Optimizing and Evaluating Stream-Based News Recommendation Algorithms
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 The Analyzed Scenario
- 3.2 The Evaluation Framework
- 4 The Recommender Algorithms
- 4.1 The Implemented Algorithms
- 4.2 Implementation Details
- 4.3 Discussion
- 5 Evaluation
- 5.1 Metrics
- 5.2 Evaluation Setup
- 5.3 Evaluation Results
- 5.4 Complexity Dependent Evaluation
- 5.5 Discussion
- 6 Conclusion
- References
- Information Extraction from Clinical Documents: Towards Disease/Disorder Template Filling
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 System Architecture
- 3.2 Pipeline Processing of Individual Modules
- 4 Evaluation
- 4.1 Dataset
- 4.2 Evaluation Metric
- 4.3 Results and Discussion
- 5 Conclusions
- References
- Adaptive Algorithm for Plagiarism Detection: The Best-Performing Approach at PAN 2014 Text Alignment Competition
- 1 Introduction
- 2 Text Alignment
- 3 Our Approach
- 3.1 Seeding
- 3.2 Extension
- 3.3 Filtering
- 3.4 Adaptive Behavior
- 4 Experimental Results
- 5 Conclusions and Future Work
- References
- Question Answering via Phrasal Semantic Parsing
- 1 Introduction
- 2 Related Work
- 3 The Task
- 4 Recognizing the Structure of Query Intention
- 4.1 Phrase Detection
- 4.2 Phrase Dependency Parsing with Multiple Heads
- 5 Converting Phrase Dependency Graph into Structured Queries
- 6 Instantiating Query Intention Regarding Existing KBs
- 7 Experiments
- 7.1 Datasets
- 7.2 Main Results
- 8 Conclusion and Future Work
- References
- Labs Overviews
- Overview of the CLEF eHealth Evaluation Lab 2015
- 1 Introduction
- 2 Tasks Motivations
- 2.1 Task 1
- 2.2 Task 2
- 3 Materials and Methods
- 3.1 Speech and Text Documents
- 3.2 Human Annotations, Queries, and Relevance Assessments
- 3.3 Evaluation Methods
- 4 Results
- 5 Conclusions
- References
- General Overview of ImageCLEF at the CLEF 2015 Labs
- 1 Introduction
- 2 ImageCLEF 2015: The Tasks, the Data and the Participation
- 2.1 The Image Annotation Task
- 2.2 The Medical Classification Task
- 2.3 The Medical Clustering Task
- 2.4 The Liver CT Annotation Task
- 3 Conclusions
- References
- LifeCLEF 2015: Multimedia Life Species Identification Challenges
- 1 LifeCLEF Lab Overview
- 1.1 Motivations
- 1.2 Evaluated Tasks
- 1.3 Main Contributions
- 2 Task1: PlantCLEF
- 2.1 Context
- 2.2 Dataset
- 2.3 Task Description
- 2.4 Participants and Results
- 3 Task2: BirdCLEF
- 3.1 Context
- 3.2 Dataset
- 3.3 Task Description
- 3.4 Participants and Results
- 4 Task3: FishCLEF
- 4.1 Context
- 4.2 Dataset
- 4.3 Task Description
- 4.4 Participants and Results
- 5 Conclusions and Perspectives
- References
- Overview of the Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab 2015
- 1 Introduction
- 2 Living Labs for IR
- 2.1 Living Labs API
- 2.2 Evaluation Metric
- 3 Use-Case 1: Product Search
- 3.1 Task and Data
- 3.2 Submissions and Results
- 4 Use-Case 2: Web Search
- 4.1 Task and Data
- 4.2 Results
- 5 Discussion and Conclusions
- References
- Stream-Based Recommendations: Online and Offline Evaluation as a Service
- 1 Introduction
- 2 Related Work
- 2.1 Benchmarking Using Static Datasets
- 2.2 Recommendations in Dynamic Settings
- 3 Task Descriptions
- 3.1 Task 1: Benchmark News Recommendations in a Living Lab
- 3.2 Task 2: Benchmark News Recommendations in a Simulated Environment
- 4 The Offline Evaluation Framework
- 4.1 Idomaar Architecture
- 4.2 Idomaar Data Workflow
- 4.3 Discussion
- 5 Evaluation
- 5.1 Participation
- 5.2 The Baseline Algorithm
- 5.3 Evaluated Algorithms
- 5.4 Evaluation Results
- 5.5 Discussion
- 6 Conclusion and Outlook
- References
- Overview of the PAN/CLEF 2015 Evaluation Lab
- 1 Introduction
- 2 Plagiarism Detection
- 2.1 Related Work
- 2.2 Community-Driven Construction of Evaluation Resources
- 2.3 Text Alignment Corpus Construction
- 3 Author Identification
- 3.1 Related Work
- 3.2 Evaluation Setup
- 3.3 Corpus
- 3.4 Evaluation Results
- 4 Author Profiling
- 4.1 Related Work
- 4.2 Experimental Settings
- 4.3 Evaluation Results
- 5 Conclusions
- References
- Overview of the CLEF Question Answering Track 2015
- 1 Introduction
- 2 Tasks
- 2.1 QALD: Question Answering Over Linked Data
- 2.2 Entrance Exams Task
- 2.3 BioASQ: Biomedical Semantic Indexing and Question Answering
- 3 Participation
- 4 Conclusions
- References
- Overview of the CLEF 2015 Social Book Search Lab
- 1 Introduction
- 2 Participating Organisations
- 3 The Amazon/LibraryThing Corpus
- 4 The SBS Suggestion Track
- 4.1 Information Needs
- 4.2 Evaluation
- 5 The SBS Interactive Track
- 5.1 User Tasks
- 5.2 Experiment Structure
- 5.3 System and Interfaces
- 5.4 Participants
- 5.5 Procedure
- 5.6 Results
- 6 Conclusions and Plans
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.