
Natural Language Understanding and Intelligent Applications
Description
The 48 revised full papers, presented together with 41 short papers, were carefully reviewed and selected from 216 submissions. The papers cover fundamental research in language computing, multilingual access, web mining and text mining, machine learning for NLP, knowledge graphs, and NLP for social networks, as well as applications in language computing.

Content
- Intro
- Message from the Program Committee Co-chairs
- Organization
- Contents
- Fundamentals on Language Computing
- Integrating Structural Context with Local Context for Disambiguating Word Senses
- Abstract
- 1 Introduction
- 2 Proposed Approach
- 2.1 Generate Permuted-Lexicon-Sequence
- 2.2 Proposed Model
- 3 Evaluation
- 3.1 Data Sets
- 3.2 Experiments
- 4 Related Work
- 5 Conclusion
- Acknowledgements
- References
- Tibetan Multi-word Expressions Identification Framework Based on News Corpora
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Brief Description of Tibetan MWE Identification Framework
- 4 Tibetan MWE Identification Based on the Combination of Context Analysis and Language Model-Based Analysis
- 4.1 Context Analysis
- 4.2 Two-Word Coupling Degree
- 4.3 Tibetan Syllable Inside Word Probability
- 5 Experiments
- 5.1 Experimental Data
- 5.2 Evaluation
- 5.2.1 Evaluation for Different Strategies in Identifying Framework
- 5.2.2 Evaluation for the Effect of Context Analysis Granularity
- 5.2.3 Evaluation on Large Corpus
- 6 Conclusion
- Acknowledgements
- References
- Building Powerful Dependency Parsers for Resource-Poor Languages
- 1 Introduction
- 2 Our Approach
- 2.1 Data Preprocessing
- 2.2 Projecting Dependencies and POS Tags
- 2.3 CRF-Based POS Tagging Model
- 2.4 Graph-Based Dependency Parsing Model
- 3 Enhancing the Parsers
- 3.1 Subtree Based Features
- 3.2 Word-Cluster Based Features
- 4 Experiments
- 4.1 Data Sets
- 4.2 Results on POS Tagging
- 4.3 Results on Parsing
- 5 Related Work
- 6 Conclusions
- References
- Bidirectional Long Short-Term Memory with Gated Relevance Network for Paraphrase Identification
- 1 Introduction
- 2 Related Works
- 3 Methodology
- 3.1 Embedding Layer
- 3.2 Sentence Modeling with Bi-LSTM
- 3.3 Gated Relevance Network
- 3.4 Max-Pooling Layer and MLP
- 3.5 Model Training
- 4 Experiments
- 4.1 Dataset and Evaluation Metrics
- 4.2 Parameter Settings
- 4.3 Baselines
- 4.4 Results of Comparison Experiments
- 5 Conclusion
- References
- Syntactic Categorization and Semantic Interpretation of Chinese Nominal Compounds
- Abstract
- 1 Introduction
- 2 Related Literature
- 3 Syntactic Categorization of Nominal Compounds in Chinese
- 3.1 Basic Rules
- 3.2 Context-Based Rules
- 3.3 Rules of Named Entities
- 3.4 Rules for Syntactic Categorization
- 3.5 Syntactic Categorization Experiments
- 4 Automatic Semantic Interpretation of Head-Modifier Nominal Compounds
- 4.1 Description of the System
- 4.2 Resources and Similarity Computation
- 4.3 Noun Matching
- 4.4 Acquisition of Semantic Interpretation Templates
- 4.5 Experiments of Automatic Semantic Interpretation
- 5 Application in Syntactic Parsing and Machine Translation
- 5.1 Correction in Syntactic Parsing
- 5.2 Application in Machine Translation
- 6 Conclusions
- Acknowledgement
- References
- TDSS: A New Word Sense Representation Framework for Information Retrieval
- 1 Introduction
- 2 Related Work
- 3 A New Word Sense Representation Framework
- 4 TDSS Sense Extraction
- 4.1 Explanation Words and Context Extraction
- 4.2 Sense Graph Construction
- 4.3 Sense Generation and Weighting
- 5 Experiments
- 5.1 Evaluating Explanation Word Extraction
- 5.2 Evaluating Word Sense Generation
- 5.3 Case Study: Query Rewriting
- 6 Conclusions
- References
- A Word Vector Representation Based Method for New Words Discovery in Massive Text
- Abstract
- 1 Related Work
- 2 The New Word Discovery Method Based on Word Vector Pruning
- 2.1 Data Preprocessing
- 2.2 Word Vector Representation and Training
- 2.3 Mining n-Gram Word String
- 2.4 Pruning Based on Word Vector
- 3 Experiment Results
- 3.1 Data Sets and Experimental Settings
- 3.2 The Result of New Word Detection
- 3.3 Comparative Analysis
- 3.4 Different Vector Similarity Measure Pruning Comparison
- 4 Conclusions and Future Work
- Acknowledgments
- References
- Machine Translation and Multi-lingual Information Access
- Better Addressing Word Deletion for Statistical Machine Translation
- 1 Introduction
- 2 The Proposed Approach
- 2.1 Undesired WD Classification
- 2.2 Undesired WD Model
- 2.3 Integration into SMT Decoder
- 3 Evaluation Metric - Recall of WD
- 3.1 Unigram Recall
- 4 Evaluation
- 4.1 Experiment Setup
- 4.2 Corpus
- 4.3 Results
- 4.4 Recall of WD vs Human Evaluation
- 5 Related Work
- 6 Conclusion and Future Work
- References
- A Simple, Straightforward and Effective Model for Joint Bilingual Terms Detection and Word Alignment in SMT
- 1 Introduction
- 2 Related Work
- 3 The Proposed Joint Model
- 3.1 The Framework for Jointly Detecting Bilingual Term Pairs and Aligning Words
- 3.2 The Joint Model
- 3.3 Derivation Details
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Results and Analysis
- 5 Conclusion
- References
- Bilingual Parallel Active Learning Between Chinese and English
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Corpus Annotation
- 3.1 Corpus Selection
- 3.2 Annotation of Chinese Corpus
- 3.3 Mapping to English Corpus
- 3.4 Manual Adjustment
- 3.5 Alignment Statistics
- 4 Bilingual Parallel Active Learning
- 4.1 Problem Definition
- 4.2 BPAL Algorithm
- 5 Experiments
- 5.1 Corpora
- 5.2 Experimental Methods
- 5.3 Features for Relation Classification
- 5.4 Evaluation Metrics
- 5.5 Experimental Results
- 6 Conclusion
- Acknowledgement
- References
- Study on the English Corresponding Unit of Chinese Clause
- Abstract
- 1 Chinese-to-English Clause-Aligned Parallel Corpus
- 2 ECUCC Grammatically Annotated Corpus
- 2.1 Grammatical Analytic Principles of ECUCC
- 2.2 Grammatical Analytic System of ECUCC
- 3 Classification and Statistical Analysis of ECUCC
- 3.1 Sentences and Clauses
- 3.2 Major Clauses and Subordinate Clauses
- 3.3 Functions of Subordinate Clauses: Adverbial and Attributive
- 3.4 Structures of Subordinate Clauses: Restrictive Relative and Non-defining
- 3.5 Simple Clauses and Coordinate Clauses
- 3.6 General Analysis
- 4 Conclusion and Further Research
- Acknowledgments
- References
- Research for Uyghur-Chinese Neural Machine Translation
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Model
- 3.1 Pre-process
- 3.2 Pointer-NMT Model
- 3.3 Post-process
- 4 Experiment
- 4.1 Experiment Set
- 4.2 Results of Experiment
- 5 Conclusion
- Acknowledgements
- References
- MaxSD: A Neural Machine Translation Evaluation Metric Optimized by Maximizing Similarity Distance
- 1 Introduction
- 2 Learning Task
- 3 MaxSD Model: Maximizing Similarity Distance Model
- 3.1 MaxSD Model
- 3.2 Bi-LSTM and BiC-LSTM Networks
- 4 Experiments and Results
- 4.1 Datasets
- 4.2 Setups
- 4.3 Results
- 5 Conclusion
- References
- Automatic Long Sentence Segmentation for Neural Machine Translation
- 1 Introduction
- 2 Related Work
- 3 Neural Machine Translation
- 4 The Segmentation Method
- 4.1 The Split Model
- 4.2 The Reordering Model
- 4.3 Joint Model: Combining the Two Submodels
- 5 Experiment
- 5.1 Setup
- 5.2 The Split Model
- 5.3 The Reordering Model
- 5.4 Comparison
- 5.5 Analysis
- 6 Conclusion and Future Work
- References
- Machine Learning for NLP
- Topic Segmentation of Web Documents with Automatic Cue Phrase Identification and BLSTM-CNN
- 1 Introduction
- 2 Related Work
- 3 Models
- 3.1 BLSTM (Bidirectional Long Short Term Memory)
- 3.2 CNN for Paragraph Representation
- 3.3 Model Learning
- 4 Features
- 4.1 Frequent Subsequence Mining Based Cue Phrase Identification
- 4.2 Other Features
- 5 Experiments
- 5.1 Data and Setup
- 5.2 Results
- 5.3 Error Analysis
- 6 Conclusion and Future Work
- References
- Multi-task Learning for Gender and Age Prediction on Chinese Microblog
- 1 Introduction
- 2 Multi-task Convolutional Neural Network (MTCNN)
- 2.1 Model Description
- 2.2 Model Learning
- 3 Weibo Data
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Baselines
- 4.3 Results
- 4.4 Error Analysis
- 5 Related Work
- 6 Conclusion and Future Work
- References
- Dropout Non-negative Matrix Factorization for Independent Feature Learning
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 NMF as a Linear Neural Network
- 3.2 Dropout and Sequential NMF
- 3.3 Complexity Analysis
- 4 Experimental Results
- 4.1 Datasets
- 4.2 Experimental Settings
- 4.3 Clustering Results
- 4.4 Parameter Selection and Convergence Analysis
- 4.5 Case Study
- 5 Conclusion
- Acknowledgement
- References
- Analysing the Semantic Change Based on Word Embedding
- 1 Introduction
- 2 Related Work
- 3 Approaches
- 3.1 Word Embedding
- 3.2 Random Project Forest
- 3.3 DBSCAN
- 4 Experiments
- 4.1 Preparations
- 4.2 Detecting the Semantic Change Based on Word Embedding
- 4.3 Analysing the Semantic Trend with Word Embedding
- 4.4 Clustering on the Similar Words and Context Words
- 5 Conclusion and Future Work
- References
- Learning Word Sense Embeddings from Word Sense Definitions
- 1 Introduction
- 2 Methodology
- 2.1 Definition Understanding Model
- 2.2 Training Definition Understanding Model with Definitions of Monosemous Words
- 2.3 Word Sense Embedding Learning
- 2.4 Training with Word Sense Embeddings to Represent Words in Definitions
- 3 Experiments
- 3.1 Setup
- 3.2 Qualitative Evaluations
- 3.3 Quantitative Evaluations
- 4 Related Work
- 5 Conclusion
- References
- Information Extraction, Question Answering and Knowledge Acquisition
- Character-Based LSTM-CRF with Radical-Level Features for Chinese Named Entity Recognition
- 1 Introduction
- 2 Related Work
- 3 Neural Network Architecture
- 3.1 LSTM
- 3.2 CRF
- 3.3 Radical-Level LSTM
- 3.4 Tagging Scheme
- 4 Network Training
- 4.1 LSTM Variants
- 4.2 Pretrained Embeddings
- 4.3 Training
- 5 Experiments
- 5.1 Data Sets
- 5.2 Results
- 6 Conclusion
- References
- Improving First Order Temporal Fact Extraction with Unreliable Data
- 1 Introduction
- 2 Related Work
- 3 Dataset Construction
- 4 Model
- 4.1 PCNN Model
- 4.2 Curriculum Learning
- 4.3 Label Dropout
- 4.4 Instance Attention
- 5 Experiments
- 5.1 Main Results
- 5.2 Influence of Curriculum Learning Parameters
- 5.3 Influence of Label Dropout Parameters
- 6 Conclusion
- References
- Reducing Human Effort in Named Entity Corpus Construction Based on Ensemble Learning and Annotation Categorization
- 1 Introduction
- 2 Related Work
- 3 Preliminaries
- 3.1 Definitions
- 3.2 System Architecture
- 4 Method
- 5 Experimental Setup
- 5.1 Datasets
- 5.2 Taggers
- 5.3 Pre-annotators
- 5.4 Assisted Annotation Experiments
- 6 Results and Analysis
- 6.1 Performance of Pre-annotated Annotations
- 6.2 Performance of Annotators
- 6.3 Analysis
- 7 Conclusion
- References
- A Convolution BiLSTM Neural Network Model for Chinese Event Extraction
- 1 Introduction
- 2 Trigger Labeling
- 2.1 Language Specific Issues
- 2.2 Word-Based Method
- 2.3 Character-Based Method
- 3 Argument Labeling
- 3.1 Input Layer
- 3.2 Output Layer
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Network Training
- 4.3 Trigger Labeling
- 4.4 Argument Labeling
- 5 Conclusion
- References
- Detection of Entity Mixture in Knowledge Bases Using Hierarchical Clustering
- Abstract
- 1 Introduction
- 2 Background and Related Work
- 2.1 Knowledge Base and Knowledge Service
- 2.2 Entity Disambiguation and Entity Linking
- 3 Detection of Homonymous Entities Mixture
- 3.1 Definition of Homonymous Entities Mixture
- 3.2 Workflow of Entity Mixture Detection
- 3.3 Hierarchical Clustering for Detection of Entity Mixture
- 4 Experiments of Entity Mixture Detection
- 4.1 Experiment Environments and Settings
- 4.2 Analysis of Experimental Results
- 5 Conclusion
- Acknowledgments
- References
- Knowledge Base Question Answering Based on Deep Learning Models
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 Topic Entity Extraction Models
- 3.2 Deep Structured Semantic Models
- 3.3 Candidates Retrieval
- 3.4 Answer Selection
- 4 Experiment
- 4.1 Data Set
- 4.2 Setup
- 4.3 Evaluation Metric
- 4.4 Experimental Results
- 5 Conclusion
- References
- An Open Domain Topic Prediction Model for Answer Selection
- 1 Introduction
- 2 Topic Prediction Model
- 2.1 Training Data Acquisition
- 2.2 Model Training
- 3 Topic Prediction Model for Answer Selection
- 3.1 Implicit Topic Matching Feature
- 3.2 Explicit Topic Matching Feature
- 4 Related Work
- 5 Experiment
- 5.1 Evaluation on Topic Prediction
- 5.2 Experiment on Answer Selection
- 6 Conclusion
- References
- Joint Event Extraction Based on Skip-Window Convolutional Neural Networks
- Abstract
- 1 Introduction
- 2 Event Extraction Task
- 3 Methodology
- 3.1 Word Embedding Learning
- 3.2 Skip-Window Convolutional Neural Networks
- 3.3 Label Vectors Learning
- 3.4 Use RNNs to Operate Sequence Labeling
- 3.5 Training
- 4 Experiment
- 4.1 Dataset and Evaluation Metric
- 4.2 Baselines
- 4.3 Overall Performance
- 4.4 The Effectiveness of S-CNNs
- 5 Related Work
- 6 Conclusion
- Acknowledgments
- References
- Improving Collaborative Filtering with Long-Short Interest Model
- 1 Introduction
- 2 Related Work
- 2.1 Collaborative Filtering
- 2.2 Neural Network Language Model
- 3 Our Approach
- 3.1 Problem Definition
- 3.2 Long-Short Interest Model
- 3.3 Feature Based Collaborative Filtering
- 4 Experiment
- 4.1 Experimental Setup
- 4.2 Benchmark Models
- 4.3 Overall Results
- 4.4 Effects on Different Users
- 5 Conclusions
- References
- Discourse Analysis
- Leveraging Hierarchical Deep Semantics to Classify Implicit Discourse Relations via Mutual Learning Method
- Abstract
- 1 Introduction
- 2 Related Works
- 3 Method
- 3.1 Mutual Learning Neural Model
- 3.2 Learning Model Parameters and Semantic Embeddings
- 4 Experiment
- 4.1 Datasets
- 4.2 Classification Results
- 5 Conclusion
- Acknowledgment
- References
- Transition-Based Discourse Parsing with Multilayer Stack Long Short Term Memory
- 1 Introduction
- 2 Related Work
- 3 Long Short Term Memory Theory
- 4 Transition-Based Parsing
- 5 Method
- 5.1 EDU Representation
- 5.2 Stack LSTM
- 5.3 Multilayer Stack LSTM Discourse Parsing Model
- 5.4 Parsing with Multilayer Stack LSTM Model
- 5.5 Composition Functions
- 6 Experiment
- 6.1 Data
- 6.2 Results and Discussion
- 7 Conclusion
- References
- Predicting Implicit Discourse Relation with Multi-view Modeling and Effective Representation Learning
- 1 Introduction
- 2 Overview of the Penn Discourse Treebank
- 3 Model
- 3.1 Discourse Relation Scoring Model
- 3.2 Max-Margin Learning
- 4 Multi-level Representations for the Arguments
- 5 Implementation Details
- 6 Experiments
- 6.1 First-Level Relation Recognition
- 6.2 Second-Level Relation Recognition
- 7 Discussion
- 8 Related Work
- 9 Conclusion
- References
- A CDT-Styled End-to-End Chinese Discourse Parser
- 1 Introduction
- 2 The Chinese Discourse Tree Bank
- 3 End-to-End Chinese Discourse Parser
- 3.1 System Overview
- 3.2 Elementary Discourse Unit Detector
- 3.3 Discourse Relation Recognizer
- 3.4 Discourse Parse Tree Generator
- 3.5 Attribution Labeler
- 4 Experiments
- 4.1 Experimental Setting
- 4.2 Experimental Results and Analysis
- 5 Related Work
- 6 Conclusion
- References
- NLP for Social Media
- Events Detection and Temporal Analysis in Social Media
- 1 Introduction
- 2 Related Work
- 3 Keywords Extraction
- 3.1 User Authority Estimation
- 3.2 Words Score
- 3.3 Keywords Selection
- 4 Events Detection
- 4.1 Building KeyGraph
- 4.2 Community Detection
- 5 Temporal Analysis
- 6 Experiment Analysis
- 6.1 Dataset
- 6.2 Experiment Result and Analysis
- 7 Conclusions
- References
- Discovering Concept-Level Event Associations from a Text Stream
- 1 Introduction
- 2 Burst Information Networks
- 2.1 Burst Detection
- 2.2 BINet Construction
- 3 Event Association Discovery
- 3.1 Event Extraction
- 3.2 Major Event Identification
- 3.3 Event Association Pair Ranking
- 4 Experiments and Evaluations
- 4.1 Data
- 4.2 End-to-end Evaluation
- 4.3 Future Event Prediction with Association Knowledge
- 5 Related Work
- 6 Conclusion
- References
- A User Adaptive Model for Followee Recommendation on Twitter
- 1 Introduction
- 2 Approach Overview
- 3 User Modeling
- 3.1 Topology-Based Neural Network
- 3.2 Content-Based Neural Network
- 3.3 Adaptive Layer
- 4 Learning
- 5 Experiments
- 5.1 Experiment Setup
- 5.2 Comparison with Recommendation Methods
- 5.3 Analysis of Our Model
- 6 Related Work
- 7 Conclusions
- References
- Who Will Tweet More? Finding Information Feeders in Twitter
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Task Description
- 3.2 User Features
- 4 Experiments
- 4.1 Data Preparation
- 4.2 Data Description
- 4.3 Experiment Setting
- 4.4 Results
- 5 Examples
- 6 Conclusion and Future Work
- References
- Short Papers
- Discrete and Neural Models for Chinese POS Tagging: Comparison and Combination
- 1 Introduction
- 2 Baseline Models
- 2.1 The Discrete Model
- 2.2 The Neural Model
- 2.3 Training Method
- 3 Combination
- 3.1 Feature Combination
- 3.2 Stacked Learning
- 4 Incorporating Cluster Features
- 5 Experiments
- 5.1 Experimental Settings
- 5.2 Development Results
- 5.3 Final Results
- 6 Related Work
- 7 Conclusion
- References
- Improving Word Vector with Prior Knowledge in Semantic Dictionary
- Abstract
- 1 Introduction
- 2 Our Approach
- 2.1 Hownet Semantic Knowledge
- 2.2 Similar Words
- 2.3 Character Embedding
- 3 Experiment
- 3.1 Correlation with Human Judgement
- 3.2 Temporal Tagging
- 3.3 People Daily NER
- 4 Conclusion
- Acknowledgement
- References
- Adapting Attention-Based Neural Network to Low-Resource Mongolian-Chinese Machine Translation
- Abstract
- 1 Introduction
- 2 Attention-Based Neural Network Machine Translation
- 3 Sub-words Mongolian-Chinese NMT
- 4 Applying Monolingual Data to NMT
- 5 Applying Correction Model to NMT
- 6 Evaluation
- 6.1 Setting
- 6.2 Evaluation of Attention-Based NMT
- 6.3 Evaluation of Sub-words NMT
- 6.4 Evaluation of Monolingual
- 6.5 Evaluation of NMT Correction
- 6.6 Comparison
- 7 Conclusion
- Acknowledgements
- References
- Sentence Similarity on Structural Representations
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Structural Representations
- 3.1 Motivation
- 3.2 Shallow Syntactic Tree
- 3.3 Dependency Tree
- 4 Experiment
- 4.1 Baseline
- 4.2 Experimental Setup
- 4.3 Experimental Results and Analysis
- 5 Summary
- References
- Word Sense Disambiguation Using Context Translation
- Abstract
- 1 Introduction
- 2 Proposed Approach
- 2.1 The Bayesian Classifier
- 2.2 WSD Methods Based on Context Translation
- 3 Experiment
- 3.1 Experimental Setup
- 3.2 Evaluation Results
- Acknowledgments
- References
- Cyrillic Mongolian Named Entity Recognition with Rich Features
- Abstract
- 1 Introduction
- 2 Construction of Cyrillic Mongolian
- 2.1 Characteristics of Mongolian
- 2.2 Collection of Corpus
- 2.3 Annotation of Corpus
- 3 The Model
- 3.1 CRF Framework
- 3.2 Features
- 4 Experiment
- 4.1 Setting up
- 4.2 Results and Analysis
- 5 Conclusion
- Acknowledgements
- References
- Purchase Prediction via Machine Learning in Mobile Commerce
- 1 Introduction
- 2 Related Work
- 3 Problem Definition
- 4 Method
- 4.1 Training Module and Prediction Module
- 4.2 Feature Project
- 4.3 Filtering Module
- 4.4 Reduced Data
- 5 Experiments
- 5.1 Data Description
- 5.2 Two Rule-Based Baselines
- 5.3 Result
- 6 Conclusion
- References
- Exploring Long Tail Data in Distantly Supervised Relation Extraction
- 1 Introduction
- 2 Related Work
- 3 Problem Definition
- 4 Rule Learning Using EBL-Based Distant Supervision
- 4.1 Algorithm DistantEBL
- 4.2 Relation Keyword Extraction
- 5 Experiments
- 5.1 Data Generation
- 5.2 Evaluation on Long Tail Data
- 5.3 Evaluation on Standard Data
- 6 Conclusion
- References
- Detecting Potential Adverse Drug Reactions from Health-Related Social Networks
- Abstract
- 1 Introduction
- 2 Detecting Potential ADRs on Health-Related Social Networks
- 2.1 Data Acquisition Module
- 2.2 Potential ADRs Detecting Module
- 2.3 Associated Protein Recognition Modules
- 3 Experiments and Result Analysis
- 3.1 Datasets
- 3.2 Performance on Recognizing Mentions of Diseases and ADRs
- 3.3 Performance on Potential ADRs Detection
- 3.4 Associated Proteins for Potential ADRs
- 4 Conclusion and Future Work
- Acknowledgements
- References
- Iterative Integration of Unsupervised Features for Chinese Dependency Parsing
- Abstract
- 1 Introduction
- 2 Previous Work
- 3 Joint Word Segmentation, POS Tagging and Dependency Parsing Model
- 3.1 Character-Based Joint Model
- 3.2 Unsupervised Feature Using in Joint Model
- 4 Iterative Exploring of Unsupervised Features for Chinese Dependency Parsing
- 4.1 Preliminary Investigation
- 4.2 Iterative Exploring of Unsupervised Feature
- 5 Experiments
- 5.1 Experimental Settings
- 5.2 Experimental Result and Analyses
- 6 Conclusions
- Acknowledgments
- References
- Can We Neglect Function Words in Word Embedding?
- Abstract
- 1 Introduction
- 2 Word Embedding Models
- 3 Experiments
- 3.1 Dataset and Settings
- 3.2 Tasks
- 3.3 Results and Discussion
- 4 Related Work
- 5 Conclusion
- Acknowledgements
- References
- A Similarity Algorithm Based on the Generality and Individuality of Words
- Abstract
- 1 Introduction
- 2 Related Works
- 3 Word Similarity Calculation
- 3.1 Sememe Similarity Calculation
- 3.2 Concepts Similarity Calculation
- 3.3 Word Similarity Calculation
- 4 Experiment and Result
- 4.1 Experimental Data
- 4.2 Evaluation Standard
- 4.3 Experimental Results and Analysis
- 5 Conclusion
- Acknowledgement
- References
- An Improved Information Gain Algorithm Based on Relative Document Frequency Distribution
- Abstract
- 1 Introduction
- 2 Information Gain
- 3 An Improved Information Gain Algorithm Based on Relative Document Frequency Distribution
- 3.1 Feature Selection by Categories
- 3.2 Reducing the Impact of Unbalanced Data Sets
- 3.3 Reducing the Impact of the Low-Frequency Characteristics
- 3.4 Within-Class Word Frequency Distribution
- 3.5 Between-Class Features Selection Based on Relative Document Frequency Distribution
- 3.5.1 Limitations of Features Selection Based on Absolute Document Frequency Distribution
- 3.5.2 Advantages of Features Selection Based on Relative Document Frequency Distribution
- 4 Experiment Results and Analysis
- 4.1 Experimental Procedure
- 4.2 Evaluation Criterion
- 4.3 Experiments Results Analysis and Comparison
- 5 Conclusions and Future Work
- Acknowledgements
- References
- Finding the True Crowds: User Filtering in Microblogs
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Categorization of Users in Microblogs
- 3.1 Categorization of Microblog Users
- 4 Unified Filtering Model for Advertisers
- 4.1 Study on Non-content Features
- 4.2 Topic-Specific Divergence (TSD)
- 4.3 Advertisers Filtering Model
- 5 Comparative Experimental Analysis
- 6 Conclusions and Future Work
- References
- Learning to Recognize Protected Health Information in Electronic Health Records with Recurrent Neural Network
- Abstract
- 1 Introduction
- 2 De-Identification with Recurrent Neural Network
- 2.1 Records Preprocessing and Skeleton Generation
- 2.2 Chunk Representation Schemes
- 2.3 Sequence Labeling Using RNNs
- 3 Experiments and Discussion
- 3.1 Datasets
- 3.2 Parameters of the Framework
- 3.3 Performance on i2b2 Datasets
- 3.4 Performance on Chinese Dataset
- 4 Conclusion
- References
- Sentiment Classification of Social Media Text Considering User Attributes
- 1 Introduction
- 2 Datasets
- 2.1 Data
- 2.2 User Attributes
- 3 The Proposed Method
- 3.1 Some Notations
- 3.2 Content-Based Method
- 3.3 Feature-Based Method
- 3.4 Graph-Based Method
- 3.5 Combination Strategy
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Performance Comparison
- 4.3 Effects of Pruning
- 4.4 Attribute Group Preference Analysis
- 5 Related Work
- 6 Conclusion and Future Work
- References
- Learning from User Feedback for Machine Translation in Real-Time
- 1 Introduction
- 2 Related Work
- 3 Online Learning Framework
- 3.1 Anchor-Based Word Alignment Method
- 3.2 Online Translation Model
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Results and Analysis
- 5 Conclusion
- References
- GuideRank: A Guided Ranking Graph Model for Multilingual Multi-document Summarization
- 1 Introduction
- 2 Related Work
- 2.1 Multilingual Multi-document Summarization
- 2.2 Graph-Based Extractive Summarization Models
- 3 Methods
- 3.1 CoRank Model
- 3.2 GuideRank Model
- 4 Experiment
- 4.1 Dataset
- 4.2 Baseline Models
- 4.3 Experimental Results
- 5 Analysis
- 6 Conclusion
- References
- Fast-Syntax-Matching-Based Japanese-Chinese Limited Machine Translation
- Abstract
- 1 Introduction
- 2 Architecture
- 3 Algorithm
- 4 Experiment
- 4.1 Dataset
- 4.2 Preprocess Result and Discussion
- 4.3 FSM Result and Discussion
- 4.4 LMT Result and Discussion
- 5 Conclusion
- Acknowledgements
- References
- Value at Risk for Risk Evaluation in Information Retrieval
- 1 Introduction
- 2 Value at Risk in Finance
- 2.1 Concept of Value at Risk
- 2.2 General Formula of Value at Risk
- 2.3 Variance-Covariance Method
- 3 Value at Risk in IR (VaR_IR)
- 3.1 Estimation of Variance of Ranking
- 3.2 Incorporation of Effectiveness Metric and Effectiveness Improvement
- 3.3 The Formula of VaR_IR
- 4 Empirical Evaluation
- 4.1 Evaluation Setup
- 4.2 Risk Evaluation Results in Session Track
- 5 Conclusions
- References
- Chinese Paraphrases Acquisition Based on Random Walk N Step
- Abstract
- 1 Introduction
- 2 Background
- 3 Paraphrases Acquisition Based on Graph Model
- 3.1 Graph Model Constructed for Phrase Table
- 3.2 Paraphrases Acquisition Based on Random Walk
- 3.3 Paraphrase Credibility Based on Expected Number Steps
- 3.4 Augmented Graph Based Model for Multiple Language Pairs
- 4 Experiment
- 4.1 Experiment Data and Parameters
- 4.2 Experiment Result
- 5 Conclusion
- Acknowledgments
- References
- A Micro-topic Model for Coreference Resolution Based on Theme-Rheme Structure
- 1 Introduction
- 2 Micro-topic Scheme
- 2.1 Elementary Discourse Units
- 2.2 Theme-Rheme Structure
- 2.3 Patterns of Thematic Progression
- 2.4 Model Representation
- 3 Proposed Model
- 3.1 Identifying the EDUs
- 3.2 Identifying the TRS
- 3.3 Recognizing Thematic Progression Patterns
- 3.4 Identifying Mention-Pairs
- 4 Experiments and Results
- 5 Conclusion and Further Work
- References
- Learning from LDA Using Deep Neural Networks
- 1 Introduction
- 2 Related Work
- 3 Methods
- 4 Experiments
- 4.1 Database and Experimental Setup
- 4.2 Results
- 5 Topic Discovery by Transfer Learning
- 6 Conclusion and Future Work
- References
- Relation Classification: CNN or RNN?
- 1 Introduction
- 2 Model
- 2.1 Model Training
- 2.2 Position Indicators
- 3 Experiments
- 3.1 Database
- 3.2 Experimental Setup
- 3.3 Results
- 4 Discussion
- 4.1 Impact of Long Context
- 4.2 Proportion of Long Context
- 4.3 Semantic Accumulation
- 5 Conclusion
- References
- Shared Tasks
- Ensemble of Feature Sets and Classification Methods for Stance Detection
- 1 Introduction
- 2 Problem Description
- 3 Feature Engineering
- 3.1 Text Preprocessing
- 3.2 Word Selection
- 3.3 Latent Semantic Features
- 3.4 Lexical Features
- 4 Model Training
- 4.1 Feature Ranking and Selection
- 4.2 Model Ensemble
- 5 A Performance Study
- 5.1 Importance of the Features
- 5.2 Performance on Test Data
- 6 Conclusion
- References
- Exploiting External Knowledge and Entity Relationship for Entity Search
- 1 Introduction
- 2 Related Works
- 3 Method
- 4 Implementation
- 4.1 Offline Prediction Component
- 4.2 Online Prediction Component
- 5 Experiment
- 5.1 Case Study
- 5.2 Performance on Large-Scale Dataset
- 6 Conclusion
- References
- A Flexible and Sentiment-Aware Framework for Entity Search
- 1 Introduction
- 2 Related Work
- 3 System Design
- 3.1 Data Collection and Data Alignment
- 3.2 Query Rewriting
- 3.3 Entity Ranking
- 4 Experiments
- 5 Conclusion
- References
- Word Segmentation on Micro-Blog Texts with External Lexicon and Heterogeneous Data
- 1 Introduction
- 2 The Baseline CRF-Based WSTagger
- 3 Exploring External Lexicon Features
- 4 The Guide-Feature Based Approach for Exploiting CTB7 and PD
- 5 The Coupled Approach for Exploring CTB7 and PD
- 6 The Merge-then-re-decode Ensemble Approach
- 7 Experiments
- 7.1 Datasets
- 7.2 Heterogeneity of WB and CTB7
- 7.3 Results on CTB7-dev/test
- 7.4 Results on WB-dev
- 7.5 Reported Results on WB-test
- 8 Related Work
- 9 Conclusion
- References
- Open Domain Question Answering System Based on Knowledge Base
- 1 Introduction
- 2 Related Work
- 3 Architecture
- 3.1 Topic Entity Linking
- 3.2 Predicate Scoring
- 3.3 Answer Pattern
- 3.4 Ranking
- 4 Experiment
- 4.1 Dataset
- 4.2 Experiment Settings
- 4.3 Benchmark Systems
- 4.4 Results
- 4.5 Error Analysis
- 4.6 Dataset Analysis
- 5 Conclusion
- References
- Recurrent Neural Word Segmentation with Tag Inference
- 1 Introduction
- 2 The Network Architecture
- 2.1 Character Feature Vectors
- 2.2 LSTM
- 2.3 Tag Inference
- 2.4 Model Training
- 3 Experiments
- 3.1 Features
- 3.2 Feature Evaluation
- 3.3 Model Evaluation
- 4 Conclusion
- References
- Chinese Word Similarity Computing Based on Combination Strategy
- Abstract
- 1 Introduction
- 2 Combination Strategy
- 3 Chinese Word Similarity Computation Strategy
- 3.1 HowNet
- 3.2 Word2Vector
- 3.3 Chinese FrameNet (CFN)
- 3.4 Other
- 4 Experiments
- 4.1 Data and Evaluation Metrics
- 4.2 Experiments Results
- 5 Conclusion and Future Work
- Acknowledgements
- References
- An Empirical Study on Chinese Microblog Stance Detection Using Supervised and Semi-supervised Machine Learning Methods
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Data Preprocessing
- 4 Stance Detection Based on Supervised Learning
- 5 Stance Detection Based on Semi-supervised Learning
- 6 Experiments and Results
- 6.1 Experiment Dataset
- 6.2 Experiment Results Based on Supervised Learning
- 6.3 Experiment Results Based on Semi-supervised Learning
- 7 Conclusions and Future Work
- Acknowledgements
- References
- Combining Word Embedding and Semantic Lexicon for Chinese Word Similarity Computation
- 1 Introduction
- 2 Methodology
- 2.1 Similarity Computation Based on Tongyici Cilin
- 2.2 Similarity Computation Based on Embedding Vectors
- 2.3 Combination Strategies
- 2.4 A-Posteriori Improvements
- 3 Experiment Settings
- 3.1 Data Set
- 3.2 Evaluation
- 4 Results and Analysis
- 4.1 Results of Submission
- 4.2 Result of Improvement
- 5 Conclusion
- References
- Football News Generation from Chinese Live Webcast Script
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Sports News Generation
- 3.1 Keywords Dictionary
- 3.2 Key Sentences
- 3.3 News Generation
- 3.4 Readability Improvement
- 4 Experiments
- 4.1 Keywords Collection
- 4.2 Weights Alignment Between Hot Events and Key Sentences
- 4.3 Scores of Automatic Evaluation
- 5 Conclusion
- Acknowledge
- References
- Convolutional Deep Neural Networks for Document-Based Question Answering
- 1 Introduction
- 2 Convolutional Sentence Model
- 2.1 Embedding Layer
- 2.2 Convolution Layer
- 2.3 Non-linearity Layer
- 2.4 Max-Pooling Layer
- 3 Convolutional Matching Model
- 3.1 Interact Layer
- 3.2 Multi-layer Perceptron
- 3.3 Training
- 4 Attentive Pooling
- 5 Experiments
- 5.1 Dataset
- 5.2 Evaluation Metrics
- 5.3 Embedding
- 5.4 Results and Discussion
- 5.5 Attentive Pooling Visualization
- 6 Conclusion
- References
- Research on Summary Sentences Extraction Oriented to Live Sports Text
- Abstract
- 1 Introduction
- 2 Expanding Correlated Words Method Based on Word2vec
- 2.1 Model Training on Word2vec
- 2.2 Correlated Words Extension
- 3 Summary Sentence Extraction Based on CRFs
- 3.1 Conditional Random Fields
- 3.2 Extraction Model
- 3.3 Feature Selection
- 4 Experiment and Results
- 4.1 Data Set
- 4.2 Evaluating Indicator
- 4.3 Result and Analysis
- 5 Conclusion
- Acknowledgements
- References
- Short Papers
- Statistical Entity Ranking with Domain Knowledge
- 1 Introduction
- 2 Framework of Learning to Rank Entity
- 2.1 Problem Formulation
- 2.2 Extending Entity Extension by External Resources
- 2.3 Feature for Learning
- 2.4 Word Segmentation and 2-Gram Words
- 2.5 Ranking Model Design
- 3 Experiments
- 3.1 DataSet Description
- 3.2 Experiments Design and Results
- 4 Conclusion
- References
- Study on the Method of Precise Entity Search Based on Baidu's Query
- Abstract
- 1 Introduction
- 2 Query String Parsing
- 2.1 Movies, TV Shows Query String Classification
- 2.2 The Extraction of the Matched Words in Restaurants' Query
- 2.3 The Extraction of the Matched Words in Name's Query
- 3 Semantic Extension and Matching Rules
- 3.1 Word2vec Word Vector Model and Semantic Extension
- 3.2 Matching Rule
- 4 Evaluation Method and Experimental Result Analysis
- 4.1 Evaluation Standard
- 4.2 Experimental Results and Analysis
- 5 Conclusion
- References
- Overview of the NLPCC-ICCPOL 2016 Shared Task: Chinese Word Similarity Measurement
- Abstract
- 1 Introduction
- 2 Dataset Construction
- 2.1 Word Selection
- 2.2 Word Pair Generation
- 2.3 Similarity Score Annotation
- 3 Task Setup
- 4 Evaluation Results and Analysis
- 4.1 Overall Results
- 4.2 Inter-annotator Agreement and Evaluation Results
- 4.3 Part of Speech on Similarity Computation
- 4.4 Word Length on Similarity Computation
- 4.5 Polysemous Words on Similarity Computation
- 5 Participating Systems
- 6 Conclusion
- Acknowledgement
- Appendix A: 91 Word Pairs with Standard Deviation Greater Than 2
- References
- Exploring Various Linguistic Features for Stance Detection
- Abstract
- 1 Introduction
- 2 Related Works
- 3 Approach
- 3.1 Lexical Features
- 3.2 Morphology
- 3.3 Semantics
- 3.4 Syntax
- 4 Experiments
- 4.1 Datasets
- 4.2 Experimental Setting
- 4.3 Experimental Results
- 5 Conclusion
- Acknowledgments
- References
- Overview of Baidu Cup 2016: Challenge on Entity Search
- Abstract
- 1 Introduction
- 2 Entity Search Task
- 2.1 Task Definition
- 2.2 Dataset
- 2.2.1 Query Preparation
- 2.2.2 Association Entities Preparation
- 2.2.3 Annotation
- 2.2.4 Dataset Arrangement
- 3 Challenge Results
- 4 Future Work
- Acknowledgement
- Reference
- A Feature-Rich CRF Segmenter for Chinese Micro-Blog
- 1 Introduction
- 2 Our Method
- 2.1 Model
- 2.2 Features
- 3 Experiment
- 3.1 Experimental Results
- 4 Conclusion
- References
- NLPCC 2016 Shared Task Chinese Words Similarity Measure via Ensemble Learning Based on Multiple Resources
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Frameworks of Word Similarity Measure
- 3.2 Word Similarity Computation via Different Algorithms
- 4 Experiments and Results Analysis
- 4.1 Task Dataset and Evaluation Method
- 4.2 Multiple Resources
- 4.3 Experimental Results and Analysis
- 5 Conclusion and Future Works
- Acknowledgments
- References
- Overview of the NLPCC-ICCPOL 2016 Shared Task: Sports News Generation from Live Webcast Scripts
- Abstract
- 1 Task
- 2 Data
- 3 Participants
- 4 Results
- 4.1 Automatic Evaluation
- 4.2 Manual Evaluation
- 5 Conclusions
- Acknowledgments
- References
- Sports News Generation from Live Webcast Scripts Based on Rules and Templates
- Abstract
- 1 Introduction
- 2 Related Work
- 3 System Description
- 3.1 System Architecture
- 3.2 Rules Based on Common Sense
- 3.3 Sentence Extraction and Generation
- 4 Evaluation Results and Discussions
- 5 Conclusions and Future Work
- Acknowledgments
- References
- A Deep Learning Approach for Question Answering Over Knowledge Base
- Abstract
- 1 Introduction
- 2 Background
- 3 Approach
- 3.1 Entity Mention
- 3.2 Relation Classification
- 3.3 Ranking
- 4 Experiments
- 4.1 Datasets
- 4.2 Settings
- 4.3 Search
- 4.4 Model Tuning
- 4.5 Results
- 5 Conclusion
- Acknowledgments
- References
- Stance Detection in Chinese MicroBlogs with Neural Networks
- 1 Introduction
- 2 Related Work
- 3 Model Based Neural Network Overview
- 3.1 Word Embedding Layer
- 3.2 Convolution Neural Layer
- 3.3 Bi-Directional LSTM
- 3.4 Pooling Layer
- 3.5 Training
- 4 Experiment
- 4.1 Parameter Settings
- 4.2 Result Analysis
- 5 Conclusion and Future Work
- References
- Overview of the NLPCC-ICCPOL 2016 Shared Task: Chinese Word Segmentation for Micro-Blog Texts
- 1 Introduction
- 2 Data
- 2.1 Background Data
- 3 Description of the Task
- 3.1 Tracks
- 4 Evaluations
- 4.1 Evaluation Metric
- 4.2 Results
- 4.3 Some Representative Systems
- 5 Analysis
- 6 Conclusion
- References
- Overview of NLPCC Shared Task 4: Stance Detection in Chinese Microblogs
- Abstract
- 1 Introduction
- 2 Stance Detection
- 2.1 Stance Detection
- 2.2 Stance Detection and Sentiment Analysis
- 3 Dataset for Stance Detection in Chinese Microblogs
- 3.1 Dataset Construction and Annotation
- 3.2 Statistics of the Dataset
- 4 Evaluation Settings
- 4.1 Sub-tasks
- 4.2 Evaluation Metrics
- 5 Submission Results and Discussions
- 5.1 Submission Result for Task A
- 5.2 Submission Result for Task B
- 5.3 Discussions
- 6 Conclusions
- Acknowledgment
- References
- Combining Deep Learning with Information Retrieval for Question Answering
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Framework
- 3.1 Topic Phrase Detecting
- 3.2 NBSVM-Based Ranking
- 3.3 CNN-Based Ranking
- 3.4 Re-ranking
- 4 Experiment
- 4.1 Train
- 4.2 Experimental Results
- 5 Conclusion
- Acknowledgment
- References
- A Hybrid Approach to DBQA
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Hybrid Approach via Rank SVM
- 3.1 Measures for Surface String Similarity
- 3.2 Features Based on Retrieval Models
- 3.3 Features Based on Deep Learning
- 4 Experiments
- 4.1 Evaluation Metrics
- 4.2 Experiments Results and Analysis
- 5 Conclusion
- Acknowledgments
- References
- A Chinese Question Answering Approach Integrating Count-Based and Embedding-Based Features
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 Data Exploration
- 3.2 Data Preprocessing
- 3.3 Feature Extraction
- 3.4 Model Ensemble
- 4 Experiment
- 5 Conclusion and Future Work
- References
- Overview of the NLPCC-ICCPOL 2016 Shared Task: Open Domain Chinese Question Answering
- Abstract
- 1 Background
- 2 Task Description
- 2.1 KBQA Task
- 2.2 DBQA Task
- 3 Evaluation Metrics
- 4 Evaluation Results
- 5 Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.