
Natural Language Processing and Information Systems
Description
The 22 full papers, 19 short papers, and 16 poster papers presented were carefully reviewed and selected from 125 submissions. The papers are organized in the following topical sections: feature engineering; information extraction; information extraction from resource-scarce languages; natural language processing applications; neural language models and applications; opinion mining and sentiment analysis; question answering systems and applications; semantics-based models and applications; and text summarization.

Content
- Intro
- Preface
- Organization
- Invited Papers
- Linguistic Musicology
- Using Machine Reading to Aid Cancer Understanding and Treatment
- Contents
- Feature Engineering
- Fine-Grained Opinion Mining from Mobile App Reviews with Word Embedding Features
- 1 Introduction
- 2 Related Work
- 3 Baseline Model
- 4 Word Embedding Model
- 4.1 Synonym Expansion
- 4.2 Clustering
- 5 Experiments and Results
- 5.1 Data
- 5.2 Experimental Setup
- 5.3 Evaluation Results
- 6 Conclusion
- References
- Feature Selection Using Multi-objective Optimization for Aspect Based Sentiment Analysis
- 1 Introduction
- 2 Method
- 2.1 Brief Overview of MOO
- 2.2 Non-dominated Sorting Genetic Algorithm-II (NSGA-II)
- 2.3 Problem Formulation
- 2.4 Problem Encoding
- 2.5 Fitness Computation
- 2.6 Features
- 3 Experiments and Analysis
- 3.1 Datasets
- 3.2 Results and Analysis
- 3.3 Comparisons
- 3.4 Feature Selection: Analysis
- 4 Error Analysis
- 4.1 OTE
- 4.2 Sentiment Classification
- 5 Conclusion
- References
- Feature Selection and Class-Weight Tuning Using Genetic Algorithm for Bio-molecular Event Extraction
- 1 Introduction
- 2 Major Steps for Event Extraction
- 2.1 Event Trigger Extraction
- 2.2 Argument Extraction by Edge Detection
- 3 Features
- 4 Experimental Results and Analysis
- 4.1 Comparison with Existing Systems
- 5 Conclusion
- References
- Automated Lexicon and Feature Construction Using Word Embedding and Clustering for Classification of ASD Diagnoses Using EHR
- Abstract
- 1 Introduction
- 2 Manual and Automated Creation of Lexicons as Features
- 3 ASD Case Status Classification
- 4 Conclusion
- Acknowledgements
- References
- Multi-objective Optimisation-Based Feature Selection for Multi-label Classification
- 1 Introduction
- 2 Multiobjective Feature Subset Selection
- 3 Experimental Setup
- 4 Results
- 5 Conclusion
- References
- Information Extraction
- WikiTrends: Unstructured Wikipedia-Based Text Analytics Framework
- 1 Introduction
- 2 Related Work
- 3 The WikiTrends Framework
- 3.1 WikiTrends Parser
- 3.2 WikiTrends Extractors
- 3.3 WikiTrends Analysis Layer
- 4 Evaluation
- 4.1 Test Set
- 4.2 Gender Extractor Evaluation
- 4.3 Location Extractor Evaluation
- 4.4 Time Extractor Evaluation
- 5 Conclusion and Future Work
- References
- An Improved PLDA Model for Short Text
- 1 Introduction
- 2 Related Works
- 3 Model and Algorithms
- 3.1 Notation
- 3.2 ST-PLDA Model
- 3.3 Model Inference
- 3.4 Classification
- 4 Experiment Analysis
- 4.1 Data Sets
- 4.2 Classification Performance
- 4.3 Evaluation of Topics
- 4.4 Sensitivity Analysis
- 5 Conclusions
- References
- Mining Incoherent Requirements in Technical Specifications
- 1 Motivations and Objectives
- 2 Construction of a Corpus of Incoherent Requirements
- 2.1 Corpus Compilation Method
- 2.2 Extraction of Incoherent Requirements
- 3 Annotating Incoherence in Requirements
- 4 A Preliminary Categorization of Incoherence in Requirements
- 4.1 Partial or Total Incompatibilities Between Expressions
- 4.2 Incompatible Events
- 4.3 Contextual Incompatibilities
- 4.4 Terminological Variations and Other Discrepancies
- 4.5 Category Synthesis
- 5 Analysis of Errors in the Incoherence Analysis
- 6 Conclusion and Perspectives
- References
- Enriching Argumentative Texts with Implicit Knowledge
- 1 Introduction
- 2 Related Work
- 3 Annotating Implicit Knowledge in Arguments
- 3.1 The Microtext Corpus
- 3.2 Task I: Revealing Implicit Knowledge in Argumentative Texts
- 3.3 Task IIa: Situation Entity Types Annotations
- 3.4 Task IIb: ConceptNet Relations Annotations
- 3.5 Task III: Retrieving Similar Sentences from a Wikipedia Corpus
- 4 Analysis of the Annotations
- 4.1 Task I: Data Statistics and Evolution of Annotator Agreement
- 4.2 Task IIa: Analysis of Situation Entity Types Annotations
- 4.3 Task IIb: Analysis of ConceptNet Relation Type Annotations
- 4.4 Task III: Aligning Knowledge Annotations with Wikipedia
- 5 Conclusion
- References
- Technical Aspect Extraction from Customer Reviews Based on Seeded Word Clustering
- 1 Introduction
- 2 Related Work
- 3 Computing Word Technicality
- 4 Word Clustering
- 4.1 Constructing the Technical Semantic Space
- 4.2 Seeded Word Clustering
- 5 Evaluation
- 5.1 Technical Scores
- 5.2 Technical Clusters
- 5.3 Information Extraction
- 6 Conclusion and Future Work
- References
- Event Detection for Heterogeneous News Streams
- 1 Introduction
- 2 Related Work
- 3 EDNC: Event Detection and News Clustering
- 3.1 Event Detection
- 3.2 Estimating the Temporal Windows of Events
- 3.3 Event-Based Clustering of News
- 4 Characteristics of EDNC
- 5 Experimental Results
- 5.1 Dataset
- 5.2 Event Detection Evaluation
- 5.3 News Clustering
- 6 Conclusions and Future Work
- References
- Twitter User Profiling Model Based on Temporal Analysis of Hashtags and Social Interactions
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Proposed Model
- 3.1 Social and Temporal Profiles' Construction
- 3.1.1 Interactive and Temporal User Profiles' Construction
- 3.1.2 Thematic and Temporal User Profiles' Construction
- 3.2 Users' Clustering Based on Social and Temporal Profiles
- 4 Experimental Illustration
- 4.1 Illustration of Clustering Based on Social Interactions
- 4.2 Illustration of Clusters Based on Thematic Profiles
- 4.3 Formal Concept Analysis of Temporal Thematic Profiles
- 5 Conclusion and Future Work
- References
- Extracting Causal Relations Among Complex Events in Natural Science Literature
- 1 Introduction
- 2 Data Set
- 3 Task
- 4 Experiments
- 5 Results
- 6 Conclusion
- References
- Supporting Experts to Handle Tweet Collections About Significant Events
- 1 Introduction
- 2 System Architecture
- 3 Experiments and Evaluation
- 4 Conclusion
- References
- Named Entity Classification Based on Profiles: A Domain Independent Approach
- 1 Introduction
- 2 Method: Named Entity Classification Through Profiles
- 3 Evaluation
- 4 Conclusions and Future Work
- References
- Information Extraction from Resource-Scarce Languages
- Does the Strength of Sentiment Matter? A Regression Based Approach on Turkish Social Media
- 1 Introduction
- 2 Related Work
- 3 Data Collection and Preprocessing
- 4 Features Extracted from Tweets
- 4.1 Lexical Features
- 4.2 Emoticons
- 4.3 Features Based on Sentiment Scores
- 4.4 Word Embedding
- 5 Experiments and Results
- 6 Conclusion
- References
- AL-TERp: Extended Metric for Machine Translation Evaluation of Arabic
- Abstract
- 1 Introduction
- 2 Related Work
- 2.1 TER and TER-plus
- 2.2 METEOR and AL-BLEU
- 3 AL-TERp
- 3.1 Normalization
- 3.2 Arabic WordNet
- 3.3 Stemming
- 3.4 Paraphrase Database
- 3.5 Edit Costs Optimization
- 4 Experiments and Results
- 4.1 Data and Performance Criteria
- 4.2 Results
- 4.3 Discussion
- 5 Conclusions
- Acknowledgments
- References
- A Morphological Approach for Measuring Pair-Wise Semantic Similarity of Sanskrit Sentences
- 1 Introduction
- 2 Proposed Methodology
- 2.1 Implementation
- 3 Experimental Results
- 4 Conclusion and Future Work
- References
- Word-Level Identification of Romanized Tunisian Dialect
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Tunisian Dialect Identification
- 3.1 TD Identification Using N-Gram Based Cumulative Frequency Addition
- 3.2 TD Identification Using Support Vector Machines
- 4 Experiments and Results
- 5 Conclusion
- References
- Named Entity Recognition in Turkish: Approaches and Issues
- 1 Introduction
- 2 Named Entities in Turkish
- 3 A Review of Named Entity Recognition Studies
- 4 Outstanding Issues
- 5 Conclusion
- References
- Natural Language Processing Applications
- Simplified Text-to-Pictograph Translation for People with Intellectual Disabilities
- 1 Introduction
- 2 Status Quaestionis
- 3 Objectives
- 4 System Description
- 4.1 Pre-processing
- 4.2 Applying Alpino
- 4.3 Syntactic Simplification
- 5 Evaluation
- 6 Conclusion
- References
- What You Use, Not What You Do: Automatic Classification of Recipes
- 1 Introduction
- 2 Related Work
- 3 Feature Sets for Recipe Classification
- 4 Results
- 4.1 Experimental Setup
- 4.2 Classification Results
- 4.3 Visualization
- 4.4 Feature Analysis
- 4.5 Error Analysis
- 5 Discussion, Conclusion and Future Work
- References
- Constructing Technical Knowledge Organizations from Document Structures
- 1 Introduction
- 2 Formalized Document Components
- 3 Constraint-Based Construction of Technical Knowledge Organizations
- 3.1 Extracting Partial Hierarchies from Document Structures
- 3.2 Unification of Concept Candidates
- 3.3 Constraint-Based Hierarchy Construction
- 4 Application and Implementation Remarks
- 5 Conclusion
- References
- Applications of Natural Language Techniques in Solving SmartHome Use-Cases
- 1 Introduction
- 2 Methods
- 3 Results
- 4 Conclusion
- References
- A Method for Querying Touristic Information Extracted from the Web
- 1 Introduction
- 2 Proposed Method
- 3 Discussion and Conclusion
- References
- Towards Generating Object-Oriented Programs Automatically from Natural Language Texts for Solving Mathematical Word Problems
- 1 Introduction
- 2 Related Work
- 3 System Description
- 4 Dataset, Results and Discussions
- 5 Conclusions
- References
- Authorship Attribution System
- 1 Method
- 2 Implementation
- 3 Test Methodology
- 4 Results of Experiments
- 5 Conclusion
- References
- Neural Language Models and Applications
- Estimating Distributed Representations of Compound Words Using Recurrent Neural Networks
- 1 Introduction
- 2 Distributed Representation of Compound Words
- 2.1 Neural Network Language Model
- 2.2 Problem Definition
- 3 Related Work
- 4 Methodology
- 4.1 Learning Distributed Representation of Words
- 4.2 Recurrent Neural Networks
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Experiment 1
- 5.3 Experiment 2
- 6 Conclusion
- References
- A Large-Scale CNN Ensemble for Medication Safety Analysis
- 1 Introduction
- 2 Related Research
- 3 Method
- 3.1 Input Processing
- 3.2 CNN Architecture
- 3.3 Ensemble of CNNs
- 4 Dataset Construction and Preprocessing
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Results and Discussion
- 6 Conclusion and Future Work
- References
- A Feature Based Simple Machine Learning Approach with Word Embeddings to Named Entity Recognition on Tweets
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Implementation and Feature Set
- 4 Experimental Study
- 5 Conclusions
- Acknowledgements
- References
- Challenges and Solutions with Alignment and Enrichment of Word Embedding Models
- 1 Introduction
- 2 Our Approach: Local Enrichment via Latent Words
- 3 Experimental Setup and Results
- 4 Conclusion
- References
- Legalbot: A Deep Learning-Based Conversational Agent in the Legal Domain
- 1 Introduction
- 2 Conversational Systems
- 2.1 Encoder-Attention-Decoder
- 3 Dataset
- 4 Training
- 5 Evaluation
- 6 Conclusion
- References
- Using Word Embeddings for Computing Distances Between Texts and for Authorship Attribution
- 1 Introduction
- 2 Corpora
- 3 Method
- 4 Results
- 5 Discussion and Conclusion
- References
- Combination of Neural Networks for Multi-label Document Classification
- 1 Introduction
- 2 Networks and Combination Approaches
- 2.1 Individual Nets
- 2.2 Combination
- 3 Experiments
- 3.1 Tools and Corpus
- 3.2 Results of the Individual Networks
- 3.3 Results of Unsupervised Combinations
- 3.4 Results of Supervised Combinations
- 4 Conclusions and Future Work
- References
- Supporting Business Process Modeling Using RNNs for Label Classification
- Abstract
- 1 Introduction
- 2 Label Classification
- 3 Experimental Evaluation
- 4 Conclusion
- Acknowledgment
- References
- A Convolutional Neural Network Based Sentiment Classification and the Convolutional Kernel Representation
- Abstract
- 1 Introduction
- 2 Modelling on Convolutional Neural Network
- 3 Convolutional Kernel Representation for Textual Data
- 4 Experiments and Analysis
- 5 Conclusions and Future Works
- Acknowledgments
- References
- Quotology - Reading Between the Lines of Quotations
- 1 Introduction
- 2 Related Work
- 3 Proposed RTE System for Quote-Explanation Pair
- 4 Experiment Result and Analysis
- 5 Conclusion and Future Works
- References
- Opinion Mining and Sentiment Analysis
- Automatically Labelling Sentiment-Bearing Topics with Descriptive Sentence Labels
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Preliminaries of the JST Model
- 3.2 Modelling the Relevance Between Sentiment-Bearing Topics and Sentences
- 3.3 Automatic Sentence Label Selection
- 4 Experimental Setup
- 4.1 Setup
- 4.2 Baselines
- 4.3 Evaluation Task
- 5 Experimental Results
- 5.1 Human Evaluation
- 5.2 Qualitative Analysis
- 6 Conclusion
- References
- Does Optical Character Recognition and Caption Generation Improve Emotion Detection in Microblog Posts?
- 1 Introduction
- 2 Methods and Experimental Setup
- 2.1 Features
- 2.2 Corpus
- 3 Results
- 4 Conclusion and Future Work
- References
- Identifying Right-Wing Extremism in German Twitter Profiles: A Classification Approach
- 1 Introduction
- 2 Related Work
- 3 Profile Classification
- 4 Evaluation
- 4.1 Data Collection and Annotation
- 4.2 Experimental Results
- 4.3 Qualitative Discussion
- 5 Conclusions and Outlook
- References
- An Approach for Defining the Author Reputation of Comments on Products
- 1 Introduction
- 2 Related Research
- 3 Data Retrieval and Corpus Preparation
- 4 Proposed Approach
- 5 Experiments and Results
- 6 Conclusion
- References
- Quality of Word Embeddings on Sentiment Analysis Tasks
- 1 Introduction
- 2 Word Embedding Corpora and Models
- 3 Sentiment Analysis Tasks
- 4 Results
- 5 Discussion
- References
- Question Answering Systems and Applications
- A Hierarchical Iterative Attention Model for Machine Comprehension
- 1 Introduction
- 2 Problem Notation, Datasets
- 2.1 Definition and Notation
- 2.2 Reading Comprehension Datasets
- 3 Proposed Approach
- 3.1 Tree-LSTM Model
- 3.2 Inference Attention Model
- 4 Experiments
- 4.1 Experimental Setups
- 4.2 Results
- 5 Related Work
- 6 Conclusions
- References
- A Syntactic Parse-Key Tree-Based Approach for English Grammar Question Retrieval
- 1 Introduction
- 2 Related Work
- 3 Proposed Approach
- 3.1 Parse-Key Tree Construction
- 3.2 Parse-Key Tree Similarity
- 3.3 Ranking
- 4 Performance Evaluation
- 4.1 Experimental Setup
- 4.2 Parameter Tuning and Learning
- 4.3 Methods Used for Performance Comparison
- 4.4 Performance Results
- 5 Conclusion
- References
- A Discriminative Possibilistic Approach for Query Translation Disambiguation
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Possibility Theory
- 3.1 Possibility Distribution
- 3.2 Possibility and Necessity Measures
- 3.3 Possibilistic Networks (PN)
- 4 The Discriminative Possibilistic QT Approach
- 4.1 The Degree of Possibilistic Relevance (DPR)
- 4.2 Illustrative Example
- 5 Experiments and Discussion
- 6 Conclusion and Future Work
- References
- Building TALAA-AFAQ, a Corpus of Arabic FActoid Question-Answers for a Question Answering System
- 1 Introduction
- 2 Related Work
- 2.1 Question-Answer Corpora for Non-Arabic Languages
- 2.2 Arabic Question-Answer Corpora
- 3 TALAA-AFAQ: A Corpus of Arabic FActoid Question-Answers
- 3.1 Building of the Corpus of Arabic FActoid Question-Answers
- 3.2 Validation of TALAA-AFAQ
- 4 Corpus Access and Usage
- 5 Corpus Statistics
- 6 Conclusion and Future Work
- References
- Detecting Non-covered Questions in Frequently Asked Questions Collections
- 1 Introduction
- 2 Related Work
- 3 Retrieval Models
- 4 Experiments
- 5 Conclusion
- References
- Semantics-Based Models and Applications
- Vector Space Representation of Concepts Using Wikipedia Graph Structure
- 1 Introduction
- 2 Related Work
- 3 Concept Representation
- 3.1 (Reverse) PageRank
- 4 Word Sense Disambiguation (WSD)
- 4.1 Coherence Modeling Using Integer Programming (IP)
- 4.2 Key Entity Modeling
- 4.3 VSM Key Entity Recognition
- 5 Experiments
- 5.1 Experiment 1: Semantic Relatedness
- 5.2 Experiment 2: Word Sense Disambiguation
- 6 Conclusion
- References
- Composite Semantic Relation Classification
- 1 Introduction
- 2 Composite Semantic Relation Classification
- 2.1 Semantic Relation Classification
- 2.2 Existing Approaches for Semantic Relation Classification
- 3 From Single to Composite Relation Classification
- 3.1 Introduction
- 3.2 Commonsense KB Lookup
- 3.3 Distributional Navigational Algorithm (DNA)
- 3.4 Neural Entity/Relation Model
- 4 Baseline Models
- 5 Experimental Evaluation
- 5.1 Training and Test Dataset
- 5.2 Results
- 6 Conclusion
- References
- Vector Embedding of Wikipedia Concepts and Entities
- 1 Introduction
- 2 Related Works
- 3 Distributed Representation of Concepts
- 4 Evaluation
- 5 Conclusion
- A Appendix: Python Libraries
- B Appendix: Pruning Wikipedia Pages
- References
- TEKNO: Preparing Legacy Technical Documents for Semantic Information Systems
- 1 Introduction
- 2 Structure Recovery for Technical Documents
- 2.1 Methodology and Distinction
- 2.2 Knowledge Representation
- 2.3 Classification
- 3 Implementation Remarks and Case Study
- 4 Conclusion
- References
- Improving Document Classification Effectiveness Using Knowledge Exploited by Ontologies
- 1 Introduction
- 2 Proposed Classification Model
- 3 Conclusion and Future Work
- References
- Text Summarization
- Document Aboutness via Sophisticated Syntactic and Semantic Features
- 1 Introduction
- 2 Related Work
- 3 Our Proposal: Swat
- 3.1 More on Feature Generation (Stage 2)
- 4 Experimental Setup
- 4.1 Datasets
- 4.2 Tools
- 4.3 Experimental Results
- 5 Feature and Error Analysis
- 6 Future Work
- References
- Summarizing Web Documents Using Sequence Labeling with User-Generated Content and Third-Party Sources
- 1 Introduction
- 2 Summarization with User-Generated Content and Third-Party Sources
- 2.1 Basic Idea
- 2.2 Data Preparation
- 2.3 Summarization by SoCRFSum
- 3 Results and Discussion
- 3.1 Experimental Setup
- 3.2 Baselines
- 3.3 Evaluation Method
- 3.4 Results
- 3.5 Feature Contribution
- 3.6 Summarization with L2R Methods
- 3.7 Error Analysis
- 4 Conclusion
- References
- "What Does My Classifier Learn?" A Visual Approach to Understanding Natural Language Text Classifiers
- 1 Introduction
- 2 Related Research
- 3 Document Influence Matrices
- 3.1 Classifying Text Using CNNs
- 3.2 Computing Document Influence Matrices
- 3.3 Creating a Visual Representation from Document Influence Matrices
- 4 Experiments and Results
- 4.1 Analyzing Most Often Highlighted Words
- 4.2 Measuring the Quality of the Visual Representations
- 5 Application to Natural Language Requirements Classification
- 6 Conclusions and Future Work
- References
- Gated Neural Network for Sentence Compression Using Linguistic Knowledge
- 1 Introduction
- 2 Models
- 2.1 Linguistic Knowledge
- 2.2 Lookup Table
- 2.3 Linguistic Knowledge Based Recurrent Neural Network (LK-RNN)
- 2.4 Linguistic Knowledge Based Gated Neural Network (LK-GNN)
- 3 Experiment
- 3.1 Datasets
- 3.2 Baseline and Proposed Models
- 4 Evaluation and Analysis
- 4.1 Automatic Evaluation
- 4.2 Human Evaluation
- 4.3 Visualization of Gating Mechanism
- 5 Related Works
- 6 Conclusion and Future Work
- References
- A Study on Flexibility in Natural Language Generation Through a Statistical Approach to Story Generation
- 1 Context and Motivation
- 2 Statistical Approach Based on Language Models
- 2.1 Positional Language Models for Macroplanning
- 2.2 Factored Language Models for Surface Realisation
- 3 Experimental Scenario: Story Generation
- 3.1 Determining the Structure and Content (Macroplanning)
- 3.2 Surface Realisation
- 4 Evaluation and Discussion
- 4.1 Error Analysis
- 5 Conclusion and Future Work
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.