Natural Language Understanding and Intelligent Applications

Name: Natural Language Understanding and Intelligent Applications | 5th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2016, and 24th International Conference on Computer Processing of Oriental Languages, ICCPOL 2016, Kunming, China, December 2-6, 2016, Proceedings
Brand: Springer
Price: 96.29 EUR
Availability: OnlineOnly

Chin-Yew Lin, Nianwen Xue, Dongyan Zhao, Xuanjing Huang, Yansong Feng (Editors)

Springer (Publisher)

Published on 30 November 2016

XXII, 952 pages

E-Book

PDF with digital watermarking

978-3-319-50496-4 (ISBN)

€96.29 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Intro
Message from the Program Committee Co-chairs
Organization
Contents
Fundamentals on Language Computing
Integrating Structural Context with Local Context for Disambiguating Word Senses
Abstract
1 Introduction
2 Proposed Approach
2.1 Generate Permuted-Lexicon-Sequence
2.2 Proposed Model
3 Evaluation
3.1 Data Sets
3.2 Experiments
4 Related Work
5 Conclusion
Acknowledgements
References
Tibetan Multi-word Expressions Identification Framework Based on News Corpora
Abstract
1 Introduction
2 Related Work
3 Brief Description of Tibetan MWE Identification Framework
4 Tibetan MWE Identification Based on the Combination of Context Analysis and Language Model-Based Analysis
4.1 Context Analysis
4.2 Two-Word Coupling Degree
4.3 Tibetan Syllable Inside Word Probability
5 Experiments
5.1 Experimental Data
5.2 Evaluation
5.2.1 Evaluation for Different Strategies in Identifying Framework
5.2.2 Evaluation for the Effect of Context Analysis Granularity
5.2.3 Evaluation on Large Corpus
6 Conclusion
Acknowledgements
References
Building Powerful Dependency Parsers for Resource-Poor Languages
1 Introduction
2 Our Approach
2.1 Data Preprocessing
2.2 Projecting Dependencies and POS Tags
2.3 CRF-Based POS Tagging Model
2.4 Graph-Based Dependency Parsing Model
3 Enhancing the Parsers
3.1 Subtree Based Features
3.2 Word-Cluster Based Features
4 Experiments
4.1 Data Sets
4.2 Results on POS Tagging
4.3 Results on Parsing
5 Related Work
6 Conclusions
References
Bidirectional Long Short-Term Memory with Gated Relevance Network for Paraphrase Identification
1 Introduction
2 Related Works
3 Methodology
3.1 Embedding Layer
3.2 Sentence Modeling with Bi-LSTM
3.3 Gated Relevance Network
3.4 Max-Pooling Layer and MLP
3.5 Model Training
4 Experiments
4.1 Dataset and Evaluation Metrics
4.2 Parameter Settings
4.3 Baselines
4.4 Results of Comparison Experiments
5 Conclusion
References
Syntactic Categorization and Semantic Interpretation of Chinese Nominal Compounds
Abstract
1 Introduction
2 Related Literature
3 Syntactic Categorization of Nominal Compounds in Chinese
3.1 Basic Rules
3.2 Context-Based Rules
3.3 Rules of Named Entities
3.4 Rules for Syntactic Categorization
3.5 Syntactic Categorization Experiments
4 Automatic Semantic Interpretation of Head-Modifier Nominal Compounds
4.1 Description of the System
4.2 Resources and Similarity Computation
4.3 Noun Matching
4.4 Acquisition of Semantic Interpretation Templates
4.5 Experiments of Automatic Semantic Interpretation
5 Application in Syntactic Parsing and Machine Translation
5.1 Correction in Syntactic Parsing
5.2 Application in Machine Translation
6 Conclusions
Acknowledgement
References
TDSS: A New Word Sense Representation Framework for Information Retrieval
1 Introduction
2 Related Work
3 A New Word Sense Representation Framework
4 TDSS Sense Extraction
4.1 Explanation Words and Context Extraction
4.2 Sense Graph Construction
4.3 Sense Generation and Weighting
5 Experiments
5.1 Evaluating Explanation Word Extraction
5.2 Evaluating Word Sense Generation
5.3 Case Study: Query Rewriting
6 Conclusions
References
A Word Vector Representation Based Method for New Words Discovery in Massive Text
Abstract
1 Related Work
2 The New Word Discovery Method Based on Word Vector Pruning
2.1 Data Preprocessing
2.2 Word Vector Representation and Training
2.3 Mining n-Gram Word String
2.4 Pruning Based on Word Vector
3 Experiment Results
3.1 Data Sets and Experimental Settings
3.2 The Result of New Word Detection
3.3 Comparative Analysis
3.4 Different Vector Similarity Measure Pruning Comparison
4 Conclusions and Future Work
Acknowledgments
References
Machine Translation and Multi-lingual Information Access
Better Addressing Word Deletion for Statistical Machine Translation
1 Introduction
2 The Proposed Approach
2.1 Undesired WD Classification
2.2 Undesired WD Model
2.3 Integration into SMT Decoder
3 Evaluation Metric - Recall of WD
3.1 Unigram Recall
4 Evaluation
4.1 Experiment Setup
4.2 Corpus
4.3 Results
4.4 Recall of WD vs Human Evaluation
5 Related Work
6 Conclusion and Future Work
References
A Simple, Straightforward and Effective Model for Joint Bilingual Terms Detection and Word Alignment in SMT
1 Introduction
2 Related Work
3 The Proposed Joint Model
3.1 The Framework for Jointly Detecting Bilingual Term Pairs and Aligning Words
3.2 The Joint Model
3.3 Derivation Details
4 Experiments
4.1 Experimental Setup
4.2 Results and Analysis
5 Conclusion
References
Bilingual Parallel Active Learning Between Chinese and English
Abstract
1 Introduction
2 Related Work
3 Corpus Annotation
3.1 Corpus Selection
3.2 Annotation of Chinese Corpus
3.3 Mapping to English Corpus
3.4 Manual Adjustment
3.5 Alignment Statistics
4 Bilingual Parallel Active Learning
4.1 Problem Definition
4.2 BPAL Algorithm
5 Experiments
5.1 Corpora
5.2 Experimental Methods
5.3 Features for Relation Classification
5.4 Evaluation Metrics
5.5 Experimental Results
6 Conclusion
Acknowledgement
References
Study on the English Corresponding Unit of Chinese Clause
Abstract
1 Chinese-to-English Clause-Aligned Parallel Corpus
2 ECUCC Grammatically Annotated Corpus
2.1 Grammatical Analytic Principles of ECUCC
2.2 Grammatical Analytic System of ECUCC
3 Classification and Statistical Analysis of ECUCC
3.1 Sentences and Clauses
3.2 Major Clauses and Subordinate Clauses
3.3 Functions of Subordinate Clauses: Adverbial and Attributive
3.4 Structures of Subordinate Clauses: Restrictive Relative and Non-defining
3.5 Simple Clauses and Coordinate Clauses
3.6 General Analysis
4 Conclusion and Further Research
Acknowledgments
References
Research for Uyghur-Chinese Neural Machine Translation
Abstract
1 Introduction
2 Related Work
3 Model
3.1 Pre-process
3.2 Pointer-NMT Model
3.3 Post-process
4 Experiment
4.1 Experiment Set
4.2 Results of Experiment
5 Conclusion
Acknowledgements
References
MaxSD: A Neural Machine Translation Evaluation Metric Optimized by Maximizing Similarity Distance
1 Introduction
2 Learning Task
3 MaxSD Model: Maximizing Similarity Distance Model
3.1 MaxSD Model
3.2 Bi-LSTM and BiC-LSTM Networks
4 Experiments and Results
4.1 Datasets
4.2 Setups
4.3 Results
5 Conclusion
References
Automatic Long Sentence Segmentation for Neural Machine Translation
1 Introduction
2 Related Work
3 Neural Machine Translation
4 The Segmentation Method
4.1 The Split Model
4.2 The Reordering Model
4.3 Joint Model: Combining the Two Submodels
5 Experiment
5.1 Setup
5.2 The Split Model
5.3 The Reordering Model
5.4 Comparison
5.5 Analysis
6 Conclusion and Future Work
References
Machine Learning for NLP
Topic Segmentation of Web Documents with Automatic Cue Phrase Identification and BLSTM-CNN
1 Introduction
2 Related Work
3 Models
3.1 BLSTM (Bidirectional Long Short Term Memory)
3.2 CNN for Paragraph Representation
3.3 Model Learning
4 Features
4.1 Frequent Subsequence Mining Based Cue Phrase Identification
4.2 Other Features
5 Experiments
5.1 Data and Setup
5.2 Results
5.3 Error Analysis
6 Conclusion and Future Work
References
Multi-task Learning for Gender and Age Prediction on Chinese Microblog
1 Introduction
2 Multi-task Convolutional Neural Network (MTCNN)
2.1 Model Description
2.2 Model Learning
3 Weibo Data
4 Experiments
4.1 Experimental Setup
4.2 Baselines
4.3 Results
4.4 Error Analysis
5 Related Work
6 Conclusion and Future Work
References
Dropout Non-negative Matrix Factorization for Independent Feature Learning
Abstract
1 Introduction
2 Related Work
3 Methodology
3.1 NMF as a Linear Neural Network
3.2 Dropout and Sequential NMF
3.3 Complexity Analysis
4 Experimental Results
4.1 Datasets
4.2 Experimental Settings
4.3 Clustering Results
4.4 Parameter Selection and Convergence Analysis
4.5 Case Study
5 Conclusion
Acknowledgement
References
Analysing the Semantic Change Based on Word Embedding
1 Introduction
2 Related Work
3 Approaches
3.1 Word Embedding
3.2 Random Project Forest
3.3 DBSCAN
4 Experiments
4.1 Preparations
4.2 Detecting the Semantic Change Based on Word Embedding
4.3 Analysing the Semantic Trend with Word Embedding
4.4 Clustering on the Similar Words and Context Words
5 Conclusion and Future Work
References
Learning Word Sense Embeddings from Word Sense Definitions
1 Introduction
2 Methodology
2.1 Definition Understanding Model
2.2 Training Definition Understanding Model with Definitions of Monosemous Words
2.3 Word Sense Embedding Learning
2.4 Training with Word Sense Embeddings to Represent Words in Definitions
3 Experiments
3.1 Setup
3.2 Qualitative Evaluations
3.3 Quantitative Evaluations
4 Related Work
5 Conclusion
References
Information Extraction, Question Answering and Knowledge Acquisition
Character-Based LSTM-CRF with Radical-Level Features for Chinese Named Entity Recognition
1 Introduction
2 Related Work
3 Neural Network Architecture
3.1 LSTM
3.2 CRF
3.3 Radical-Level LSTM
3.4 Tagging Scheme
4 Network Training
4.1 LSTM Variants
4.2 Pretrained Embeddings
4.3 Training
5 Experiments
5.1 Data Sets
5.2 Results
6 Conclusion
References
Improving First Order Temporal Fact Extraction with Unreliable Data
1 Introduction
2 Related Work
3 Dataset Construction
4 Model
4.1 PCNN Model
4.2 Curriculum Learning
4.3 Label Dropout
4.4 Instance Attention
5 Experiments
5.1 Main Results
5.2 Influence of Curriculum Learning Parameters
5.3 Influence of Label Dropout Parameters
6 Conclusion
References
Reducing Human Effort in Named Entity Corpus Construction Based on Ensemble Learning and Annotation Categorization
1 Introduction
2 Related Work
3 Preliminaries
3.1 Definitions
3.2 System Architecture
4 Method
5 Experimental Setup
5.1 Datasets
5.2 Taggers
5.3 Pre-annotators
5.4 Assisted Annotation Experiments
6 Results and Analysis
6.1 Performance of Pre-annotated Annotations
6.2 Performance of Annotators
6.3 Analysis
7 Conclusion
References
A Convolution BiLSTM Neural Network Model for Chinese Event Extraction
1 Introduction
2 Trigger Labeling
2.1 Language Specific Issues
2.2 Word-Based Method
2.3 Character-Based Method
3 Argument Labeling
3.1 Input Layer
3.2 Output Layer
4 Experiments
4.1 Experimental Setup
4.2 Network Training
4.3 Trigger Labeling
4.4 Argument Labeling
5 Conclusion
References
Detection of Entity Mixture in Knowledge Bases Using Hierarchical Clustering
Abstract
1 Introduction
2 Background and Related Work
2.1 Knowledge Base and Knowledge Service
2.2 Entity Disambiguation and Entity Linking
3 Detection of Homonymous Entities Mixture
3.1 Definition of Homonymous Entities Mixture
3.2 Workflow of Entity Mixture Detection
3.3 Hierarchical Clustering for Detection of Entity Mixture
4 Experiments of Entity Mixture Detection
4.1 Experiment Environments and Settings
4.2 Analysis of Experimental Results
5 Conclusion
Acknowledgments
References
Knowledge Base Question Answering Based on Deep Learning Models
1 Introduction
2 Related Work
3 Methods
3.1 Topic Entity Extraction Models
3.2 Deep Structured Semantic Models
3.3 Candidates Retrieval
3.4 Answer Selection
4 Experiment
4.1 Data Set
4.2 Setup
4.3 Evaluation Metric
4.4 Experimental Results
5 Conclusion
References
An Open Domain Topic Prediction Model for Answer Selection
1 Introduction
2 Topic Prediction Model
2.1 Training Data Acquisition
2.2 Model Training
3 Topic Prediction Model for Answer Selection
3.1 Implicit Topic Matching Feature
3.2 Explicit Topic Matching Feature
4 Related Work
5 Experiment
5.1 Evaluation on Topic Prediction
5.2 Experiment on Answer Selection
6 Conclusion
References
Joint Event Extraction Based on Skip-Window Convolutional Neural Networks
Abstract
1 Introduction
2 Event Extraction Task
3 Methodology
3.1 Word Embedding Learning
3.2 Skip-Window Convolutional Neural Networks
3.3 Label Vectors Learning
3.4 Use RNNs to Operate Sequence Labeling
3.5 Training
4 Experiment
4.1 Dataset and Evaluation Metric
4.2 Baselines
4.3 Overall Performance
4.4 The Effectiveness of S-CNNs
5 Related Work
6 Conclusion
Acknowledgments
References
Improving Collaborative Filtering with Long-Short Interest Model
1 Introduction
2 Related Work
2.1 Collaborative Filtering
2.2 Neural Network Language Model
3 Our Approach
3.1 Problem Definition
3.2 Long-Short Interest Model
3.3 Feature Based Collaborative Filtering
4 Experiment
4.1 Experimental Setup
4.2 Benchmark Models
4.3 Overall Results
4.4 Effects on Different Users
5 Conclusions
References
Discourse Analysis
Leveraging Hierarchical Deep Semantics to Classify Implicit Discourse Relations via Mutual Learning Method
Abstract
1 Introduction
2 Related Works
3 Method
3.1 Mutual Learning Neural Model
3.2 Learning Model Parameters and Semantic Embeddings
4 Experiment
4.1 Datasets
4.2 Classification Results
5 Conclusion
Acknowledgment
References
Transition-Based Discourse Parsing with Multilayer Stack Long Short Term Memory
1 Introduction
2 Related Work
3 Long Short Term Memory Theory
4 Transition-Based Parsing
5 Method
5.1 EDU Representation
5.2 Stack LSTM
5.3 Multilayer Stack LSTM Discourse Parsing Model
5.4 Parsing with Multilayer Stack LSTM Model
5.5 Composition Functions
6 Experiment
6.1 Data
6.2 Results and Discussion
7 Conclusion
References
Predicting Implicit Discourse Relation with Multi-view Modeling and Effective Representation Learning
1 Introduction
2 Overview of the Penn Discourse Treebank
3 Model
3.1 Discourse Relation Scoring Model
3.2 Max-Margin Learning
4 Multi-level Representations for the Arguments
5 Implementation Details
6 Experiments
6.1 First-Level Relation Recognition
6.2 Second-Level Relation Recognition
7 Discussion
8 Related Work
9 Conclusion
References
A CDT-Styled End-to-End Chinese Discourse Parser
1 Introduction
2 The Chinese Discourse Tree Bank
3 End-to-End Chinese Discourse Parser
3.1 System Overview
3.2 Elementary Discourse Unit Detector
3.3 Discourse Relation Recognizer
3.4 Discourse Parse Tree Generator
3.5 Attribution Labeler
4 Experiments
4.1 Experimental Setting
4.2 Experimental Results and Analysis
5 Related Work
6 Conclusion
References
NLP for Social Media
Events Detection and Temporal Analysis in Social Media
1 Introduction
2 Related Work
3 Keywords Extraction
3.1 User Authority Estimation
3.2 Words Score
3.3 Keywords Selection
4 Events Detection
4.1 Building KeyGraph
4.2 Community Detection
5 Temporal Analysis
6 Experiment Analysis
6.1 Dataset
6.2 Experiment Result and Analysis
7 Conclusions
References
Discovering Concept-Level Event Associations from a Text Stream
1 Introduction
2 Burst Information Networks
2.1 Burst Detection
2.2 BINet Construction
3 Event Association Discovery
3.1 Event Extraction
3.2 Major Event Identification
3.3 Event Association Pair Ranking
4 Experiments and Evaluations
4.1 Data
4.2 End-to-end Evaluation
4.3 Future Event Prediction with Association Knowledge
5 Related Work
6 Conclusion
References
A User Adaptive Model for Followee Recommendation on Twitter
1 Introduction
2 Approach Overview
3 User Modeling
3.1 Topology-Based Neural Network
3.2 Content-Based Neural Network
3.3 Adaptive Layer
4 Learning
5 Experiments
5.1 Experiment Setup
5.2 Comparison with Recommendation Methods
5.3 Analysis of Our Model
6 Related Work
7 Conclusions
References
Who Will Tweet More? Finding Information Feeders in Twitter
1 Introduction
2 Related Work
3 Method
3.1 Task Description
3.2 User Features
4 Experiments
4.1 Data Preparation
4.2 Data Description
4.3 Experiment Setting
4.4 Results
5 Examples
6 Conclusion and Future Work
References
Short Papers
Discrete and Neural Models for Chinese POS Tagging: Comparison and Combination
1 Introduction
2 Baseline Models
2.1 The Discrete Model
2.2 The Neural Model
2.3 Training Method
3 Combination
3.1 Feature Combination
3.2 Stacked Learning
4 Incorporating Cluster Features
5 Experiments
5.1 Experimental Settings
5.2 Development Results
5.3 Final Results
6 Related Work
7 Conclusion
References
Improving Word Vector with Prior Knowledge in Semantic Dictionary
Abstract
1 Introduction
2 Our Approach
2.1 Hownet Semantic Knowledge
2.2 Similar Words
2.3 Character Embedding
3 Experiment
3.1 Correlation with Human Judgement
3.2 Temporal Tagging
3.3 People Daily NER
4 Conclusion
Acknowledgement
References
Adapting Attention-Based Neural Network to Low-Resource Mongolian-Chinese Machine Translation
Abstract
1 Introduction
2 Attention-Based Neural Network Machine Translation
3 Sub-words Mongolian-Chinese NMT
4 Applying Monolingual Data to NMT
5 Applying Correction Model to NMT
6 Evaluation
6.1 Setting
6.2 Evaluation of Attention-Based NMT
6.3 Evaluation of Sub-words NMT
6.4 Evaluation of Monolingual
6.5 Evaluation of NMT Correction
6.6 Comparison
7 Conclusion
Acknowledgements
References
Sentence Similarity on Structural Representations
Abstract
1 Introduction
2 Related Work
3 Structural Representations
3.1 Motivation
3.2 Shallow Syntactic Tree
3.3 Dependency Tree
4 Experiment
4.1 Baseline
4.2 Experimental Setup
4.3 Experimental Results and Analysis
5 Summary
References
Word Sense Disambiguation Using Context Translation
Abstract
1 Introduction
2 Proposed Approach
2.1 The Bayesian Classifier
2.2 WSD Methods Based on Context Translation
3 Experiment
3.1 Experimental Setup
3.2 Evaluation Results
Acknowledgments
References
Cyrillic Mongolian Named Entity Recognition with Rich Features
Abstract
1 Introduction
2 Construction of Cyrillic Mongolian
2.1 Characteristics of Mongolian
2.2 Collection of Corpus
2.3 Annotation of Corpus
3 The Model
3.1 CRF Framework
3.2 Features
4 Experiment
4.1 Setting up
4.2 Results and Analysis
5 Conclusion
Acknowledgements
References
Purchase Prediction via Machine Learning in Mobile Commerce
1 Introduction
2 Related Work
3 Problem Definition
4 Method
4.1 Training Module and Prediction Module
4.2 Feature Project
4.3 Filtering Module
4.4 Reduced Data
5 Experiments
5.1 Data Description
5.2 Two Rule-Based Baselines
5.3 Result
6 Conclusion
References
Exploring Long Tail Data in Distantly Supervised Relation Extraction
1 Introduction
2 Related Work
3 Problem Definition
4 Rule Learning Using EBL-Based Distant Supervision
4.1 Algorithm DistantEBL
4.2 Relation Keyword Extraction
5 Experiments
5.1 Data Generation
5.2 Evaluation on Long Tail Data
5.3 Evaluation on Standard Data
6 Conclusion
References
Detecting Potential Adverse Drug Reactions from Health-Related Social Networks
Abstract
1 Introduction
2 Detecting Potential ADRs on Health-Related Social Networks
2.1 Data Acquisition Module
2.2 Potential ADRs Detecting Module
2.3 Associated Protein Recognition Modules
3 Experiments and Result Analysis
3.1 Datasets
3.2 Performance on Recognizing Mentions of Diseases and ADRs
3.3 Performance on Potential ADRs Detection
3.4 Associated Proteins for Potential ADRs
4 Conclusion and Future Work
Acknowledgements
References
Iterative Integration of Unsupervised Features for Chinese Dependency Parsing
Abstract
1 Introduction
2 Previous Work
3 Joint Word Segmentation, POS Tagging and Dependency Parsing Model
3.1 Character-Based Joint Model
3.2 Unsupervised Feature Using in Joint Model
4 Iterative Exploring of Unsupervised Features for Chinese Dependency Parsing
4.1 Preliminary Investigation
4.2 Iterative Exploring of Unsupervised Feature
5 Experiments
5.1 Experimental Settings
5.2 Experimental Result and Analyses
6 Conclusions
Acknowledgments
References
Can We Neglect Function Words in Word Embedding?
Abstract
1 Introduction
2 Word Embedding Models
3 Experiments
3.1 Dataset and Settings
3.2 Tasks
3.3 Results and Discussion
4 Related Work
5 Conclusion
Acknowledgements
References
A Similarity Algorithm Based on the Generality and Individuality of Words
Abstract
1 Introduction
2 Related Works
3 Word Similarity Calculation
3.1 Sememe Similarity Calculation
3.2 Concepts Similarity Calculation
3.3 Word Similarity Calculation
4 Experiment and Result
4.1 Experimental Data
4.2 Evaluation Standard
4.3 Experimental Results and Analysis
5 Conclusion
Acknowledgement
References
An Improved Information Gain Algorithm Based on Relative Document Frequency Distribution
Abstract
1 Introduction
2 Information Gain
3 An Improved Information Gain Algorithm Based on Relative Document Frequency Distribution
3.1 Feature Selection by Categories
3.2 Reducing the Impact of Unbalanced Data Sets
3.3 Reducing the Impact of the Low-Frequency Characteristics
3.4 Within-Class Word Frequency Distribution
3.5 Between-Class Features Selection Based on Relative Document Frequency Distribution
3.5.1 Limitations of Features Selection Based on Absolute Document Frequency Distribution
3.5.2 Advantages of Features Selection Based on Relative Document Frequency Distribution
4 Experiment Results and Analysis
4.1 Experimental Procedure
4.2 Evaluation Criterion
4.3 Experiments Results Analysis and Comparison
5 Conclusions and Future Work
Acknowledgements
References
Finding the True Crowds: User Filtering in Microblogs
Abstract
1 Introduction
2 Related Work
3 Categorization of Users in Microblogs
3.1 Categorization of Microblog Users
4 Unified Filtering Model for Advertisers
4.1 Study on Non-content Features
4.2 Topic-Specific Divergence (TSD)
4.3 Advertisers Filtering Model
5 Comparative Experimental Analysis
6 Conclusions and Future Work
References
Learning to Recognize Protected Health Information in Electronic Health Records with Recurrent Neural Network
Abstract
1 Introduction
2 De-Identification with Recurrent Neural Network
2.1 Records Preprocessing and Skeleton Generation
2.2 Chunk Representation Schemes
2.3 Sequence Labeling Using RNNs
3 Experiments and Discussion
3.1 Datasets
3.2 Parameters of the Framework
3.3 Performance on i2b2 Datasets
3.4 Performance on Chinese Dataset
4 Conclusion
References
Sentiment Classification of Social Media Text Considering User Attributes
1 Introduction
2 Datasets
2.1 Data
2.2 User Attributes
3 The Proposed Method
3.1 Some Notations
3.2 Content-Based Method
3.3 Feature-Based Method
3.4 Graph-Based Method
3.5 Combination Strategy
4 Experiments
4.1 Experimental Settings
4.2 Performance Comparison
4.3 Effects of Pruning
4.4 Attribute Group Preference Analysis
5 Related Work
6 Conclusion and Future Work
References
Learning from User Feedback for Machine Translation in Real-Time
1 Introduction
2 Related Work
3 Online Learning Framework
3.1 Anchor-Based Word Alignment Method
3.2 Online Translation Model
4 Experiments
4.1 Experimental Setup
4.2 Results and Analysis
5 Conclusion
References
GuideRank: A Guided Ranking Graph Model for Multilingual Multi-document Summarization
1 Introduction
2 Related Work
2.1 Multilingual Multi-document Summarization
2.2 Graph-Based Extractive Summarization Models
3 Methods
3.1 CoRank Model
3.2 GuideRank Model
4 Experiment
4.1 Dataset
4.2 Baseline Models
4.3 Experimental Results
5 Analysis
6 Conclusion
References
Fast-Syntax-Matching-Based Japanese-Chinese Limited Machine Translation
Abstract
1 Introduction
2 Architecture
3 Algorithm
4 Experiment
4.1 Dataset
4.2 Preprocess Result and Discussion
4.3 FSM Result and Discussion
4.4 LMT Result and Discussion
5 Conclusion
Acknowledgements
References
Value at Risk for Risk Evaluation in Information Retrieval
1 Introduction
2 Value at Risk in Finance
2.1 Concept of Value at Risk
2.2 General Formula of Value at Risk
2.3 Variance-Covariance Method
3 Value at Risk in IR (VaR_IR)
3.1 Estimation of Variance of Ranking
3.2 Incorporation of Effectiveness Metric and Effectiveness Improvement
3.3 The Formula of VaR_IR
4 Empirical Evaluation
4.1 Evaluation Setup
4.2 Risk Evaluation Results in Session Track
5 Conclusions
References
Chinese Paraphrases Acquisition Based on Random Walk N Step
Abstract
1 Introduction
2 Background
3 Paraphrases Acquisition Based on Graph Model
3.1 Graph Model Constructed for Phrase Table
3.2 Paraphrases Acquisition Based on Random Walk
3.3 Paraphrase Credibility Based on Expected Number Steps
3.4 Augmented Graph Based Model for Multiple Language Pairs
4 Experiment
4.1 Experiment Data and Parameters
4.2 Experiment Result
5 Conclusion
Acknowledgments
References
A Micro-topic Model for Coreference Resolution Based on Theme-Rheme Structure
1 Introduction
2 Micro-topic Scheme
2.1 Elementary Discourse Units
2.2 Theme-Rheme Structure
2.3 Patterns of Thematic Progression
2.4 Model Representation
3 Proposed Model
3.1 Identifying the EDUs
3.2 Identifying the TRS
3.3 Recognizing Thematic Progression Patterns
3.4 Identifying Mention-Pairs
4 Experiments and Results
5 Conclusion and Further Work
References
Learning from LDA Using Deep Neural Networks
1 Introduction
2 Related Work
3 Methods
4 Experiments
4.1 Database and Experimental Setup
4.2 Results
5 Topic Discovery by Transfer Learning
6 Conclusion and Future Work
References
Relation Classification: CNN or RNN?
1 Introduction
2 Model
2.1 Model Training
2.2 Position Indicators
3 Experiments
3.1 Database
3.2 Experimental Setup
3.3 Results
4 Discussion
4.1 Impact of Long Context
4.2 Proportion of Long Context
4.3 Semantic Accumulation
5 Conclusion
References
Shared Tasks
Ensemble of Feature Sets and Classification Methods for Stance Detection
1 Introduction
2 Problem Description
3 Feature Engineering
3.1 Text Preprocessing
3.2 Word Selection
3.3 Latent Semantic Features
3.4 Lexical Features
4 Model Training
4.1 Feature Ranking and Selection
4.2 Model Ensemble
5 A Performance Study
5.1 Importance of the Features
5.2 Performance on Test Data
6 Conclusion
References
Exploiting External Knowledge and Entity Relationship for Entity Search
1 Introduction
2 Related Works
3 Method
4 Implementation
4.1 Offline Prediction Component
4.2 Online Prediction Component
5 Experiment
5.1 Case Study
5.2 Performance on Large-Scale Dataset
6 Conclusion
References
A Flexible and Sentiment-Aware Framework for Entity Search
1 Introduction
2 Related Work
3 System Design
3.1 Data Collection and Data Alignment
3.2 Query Rewriting
3.3 Entity Ranking
4 Experiments
5 Conclusion
References
Word Segmentation on Micro-Blog Texts with External Lexicon and Heterogeneous Data
1 Introduction
2 The Baseline CRF-Based WSTagger
3 Exploring External Lexicon Features
4 The Guide-Feature Based Approach for Exploiting CTB7 and PD
5 The Coupled Approach for Exploring CTB7 and PD
6 The Merge-then-re-decode Ensemble Approach
7 Experiments
7.1 Datasets
7.2 Heterogeneity of WB and CTB7
7.3 Results on CTB7-dev/test
7.4 Results on WB-dev
7.5 Reported Results on WB-test
8 Related Work
9 Conclusion
References
Open Domain Question Answering System Based on Knowledge Base
1 Introduction
2 Related Work
3 Architecture
3.1 Topic Entity Linking
3.2 Predicate Scoring
3.3 Answer Pattern
3.4 Ranking
4 Experiment
4.1 Dataset
4.2 Experiment Settings
4.3 Benchmark Systems
4.4 Results
4.5 Error Analysis
4.6 Dataset Analysis
5 Conclusion
References
Recurrent Neural Word Segmentation with Tag Inference
1 Introduction
2 The Network Architecture
2.1 Character Feature Vectors
2.2 LSTM
2.3 Tag Inference
2.4 Model Training
3 Experiments
3.1 Features
3.2 Feature Evaluation
3.3 Model Evaluation
4 Conclusion
References
Chinese Word Similarity Computing Based on Combination Strategy
Abstract
1 Introduction
2 Combination Strategy
3 Chinese Word Similarity Computation Strategy
3.1 HowNet
3.2 Word2Vector
3.3 Chinese FrameNet (CFN)
3.4 Other
4 Experiments
4.1 Data and Evaluation Metrics
4.2 Experiments Results
5 Conclusion and Future Work
Acknowledgements
References
An Empirical Study on Chinese Microblog Stance Detection Using Supervised and Semi-supervised Machine Learning Methods
Abstract
1 Introduction
2 Related Work
3 Data Preprocessing
4 Stance Detection Based on Supervised Learning
5 Stance Detection Based on Semi-supervised Learning
6 Experiments and Results
6.1 Experiment Dataset
6.2 Experiment Results Based on Supervised Learning
6.3 Experiment Results Based on Semi-supervised Learning
7 Conclusions and Future Work
Acknowledgements
References
Combining Word Embedding and Semantic Lexicon for Chinese Word Similarity Computation
1 Introduction
2 Methodology
2.1 Similarity Computation Based on Tongyici Cilin
2.2 Similarity Computation Based on Embedding Vectors
2.3 Combination Strategies
2.4 A-Posteriori Improvements
3 Experiment Settings
3.1 Data Set
3.2 Evaluation
4 Results and Analysis
4.1 Results of Submission
4.2 Result of Improvement
5 Conclusion
References
Football News Generation from Chinese Live Webcast Script
Abstract
1 Introduction
2 Related Work
3 Sports News Generation
3.1 Keywords Dictionary
3.2 Key Sentences
3.3 News Generation
3.4 Readability Improvement
4 Experiments
4.1 Keywords Collection
4.2 Weights Alignment Between Hot Events and Key Sentences
4.3 Scores of Automatic Evaluation
5 Conclusion
Acknowledge
References
Convolutional Deep Neural Networks for Document-Based Question Answering
1 Introduction
2 Convolutional Sentence Model
2.1 Embedding Layer
2.2 Convolution Layer
2.3 Non-linearity Layer
2.4 Max-Pooling Layer
3 Convolutional Matching Model
3.1 Interact Layer
3.2 Multi-layer Perceptron
3.3 Training
4 Attentive Pooling
5 Experiments
5.1 Dataset
5.2 Evaluation Metrics
5.3 Embedding
5.4 Results and Discussion
5.5 Attentive Pooling Visualization
6 Conclusion
References
Research on Summary Sentences Extraction Oriented to Live Sports Text
Abstract
1 Introduction
2 Expanding Correlated Words Method Based on Word2vec
2.1 Model Training on Word2vec
2.2 Correlated Words Extension
3 Summary Sentence Extraction Based on CRFs
3.1 Conditional Random Fields
3.2 Extraction Model
3.3 Feature Selection
4 Experiment and Results
4.1 Data Set
4.2 Evaluating Indicator
4.3 Result and Analysis
5 Conclusion
Acknowledgements
References
Short Papers
Statistical Entity Ranking with Domain Knowledge
1 Introduction
2 Framework of Learning to Rank Entity
2.1 Problem Formulation
2.2 Extending Entity Extension by External Resources
2.3 Feature for Learning
2.4 Word Segmentation and 2-Gram Words
2.5 Ranking Model Design
3 Experiments
3.1 DataSet Description
3.2 Experiments Design and Results
4 Conclusion
References
Study on the Method of Precise Entity Search Based on Baidu's Query
Abstract
1 Introduction
2 Query String Parsing
2.1 Movies, TV Shows Query String Classification
2.2 The Extraction of the Matched Words in Restaurants' Query
2.3 The Extraction of the Matched Words in Name's Query
3 Semantic Extension and Matching Rules
3.1 Word2vec Word Vector Model and Semantic Extension
3.2 Matching Rule
4 Evaluation Method and Experimental Result Analysis
4.1 Evaluation Standard
4.2 Experimental Results and Analysis
5 Conclusion
References
Overview of the NLPCC-ICCPOL 2016 Shared Task: Chinese Word Similarity Measurement
Abstract
1 Introduction
2 Dataset Construction
2.1 Word Selection
2.2 Word Pair Generation
2.3 Similarity Score Annotation
3 Task Setup
4 Evaluation Results and Analysis
4.1 Overall Results
4.2 Inter-annotator Agreement and Evaluation Results
4.3 Part of Speech on Similarity Computation
4.4 Word Length on Similarity Computation
4.5 Polysemous Words on Similarity Computation
5 Participating Systems
6 Conclusion
Acknowledgement
Appendix A: 91 Word Pairs with Standard Deviation Greater Than 2
References
Exploring Various Linguistic Features for Stance Detection
Abstract
1 Introduction
2 Related Works
3 Approach
3.1 Lexical Features
3.2 Morphology
3.3 Semantics
3.4 Syntax
4 Experiments
4.1 Datasets
4.2 Experimental Setting
4.3 Experimental Results
5 Conclusion
Acknowledgments
References
Overview of Baidu Cup 2016: Challenge on Entity Search
Abstract
1 Introduction
2 Entity Search Task
2.1 Task Definition
2.2 Dataset
2.2.1 Query Preparation
2.2.2 Association Entities Preparation
2.2.3 Annotation
2.2.4 Dataset Arrangement
3 Challenge Results
4 Future Work
Acknowledgement
Reference
A Feature-Rich CRF Segmenter for Chinese Micro-Blog
1 Introduction
2 Our Method
2.1 Model
2.2 Features
3 Experiment
3.1 Experimental Results
4 Conclusion
References
NLPCC 2016 Shared Task Chinese Words Similarity Measure via Ensemble Learning Based on Multiple Resources
Abstract
1 Introduction
2 Related Work
3 Methodology
3.1 Frameworks of Word Similarity Measure
3.2 Word Similarity Computation via Different Algorithms
4 Experiments and Results Analysis
4.1 Task Dataset and Evaluation Method
4.2 Multiple Resources
4.3 Experimental Results and Analysis
5 Conclusion and Future Works
Acknowledgments
References
Overview of the NLPCC-ICCPOL 2016 Shared Task: Sports News Generation from Live Webcast Scripts
Abstract
1 Task
2 Data
3 Participants
4 Results
4.1 Automatic Evaluation
4.2 Manual Evaluation
5 Conclusions
Acknowledgments
References
Sports News Generation from Live Webcast Scripts Based on Rules and Templates
Abstract
1 Introduction
2 Related Work
3 System Description
3.1 System Architecture
3.2 Rules Based on Common Sense
3.3 Sentence Extraction and Generation
4 Evaluation Results and Discussions
5 Conclusions and Future Work
Acknowledgments
References
A Deep Learning Approach for Question Answering Over Knowledge Base
Abstract
1 Introduction
2 Background
3 Approach
3.1 Entity Mention
3.2 Relation Classification
3.3 Ranking
4 Experiments
4.1 Datasets
4.2 Settings
4.3 Search
4.4 Model Tuning
4.5 Results
5 Conclusion
Acknowledgments
References
Stance Detection in Chinese MicroBlogs with Neural Networks
1 Introduction
2 Related Work
3 Model Based Neural Network Overview
3.1 Word Embedding Layer
3.2 Convolution Neural Layer
3.3 Bi-Directional LSTM
3.4 Pooling Layer
3.5 Training
4 Experiment
4.1 Parameter Settings
4.2 Result Analysis
5 Conclusion and Future Work
References
Overview of the NLPCC-ICCPOL 2016 Shared Task: Chinese Word Segmentation for Micro-Blog Texts
1 Introduction
2 Data
2.1 Background Data
3 Description of the Task
3.1 Tracks
4 Evaluations
4.1 Evaluation Metric
4.2 Results
4.3 Some Representative Systems
5 Analysis
6 Conclusion
References
Overview of NLPCC Shared Task 4: Stance Detection in Chinese Microblogs
Abstract
1 Introduction
2 Stance Detection
2.1 Stance Detection
2.2 Stance Detection and Sentiment Analysis
3 Dataset for Stance Detection in Chinese Microblogs
3.1 Dataset Construction and Annotation
3.2 Statistics of the Dataset
4 Evaluation Settings
4.1 Sub-tasks
4.2 Evaluation Metrics
5 Submission Results and Discussions
5.1 Submission Result for Task A
5.2 Submission Result for Task B
5.3 Discussions
6 Conclusions
Acknowledgment
References
Combining Deep Learning with Information Retrieval for Question Answering
Abstract
1 Introduction
2 Related Work
3 Framework
3.1 Topic Phrase Detecting
3.2 NBSVM-Based Ranking
3.3 CNN-Based Ranking
3.4 Re-ranking
4 Experiment
4.1 Train
4.2 Experimental Results
5 Conclusion
Acknowledgment
References
A Hybrid Approach to DBQA
Abstract
1 Introduction
2 Related Work
3 Hybrid Approach via Rank SVM
3.1 Measures for Surface String Similarity
3.2 Features Based on Retrieval Models
3.3 Features Based on Deep Learning
4 Experiments
4.1 Evaluation Metrics
4.2 Experiments Results and Analysis
5 Conclusion
Acknowledgments
References
A Chinese Question Answering Approach Integrating Count-Based and Embedding-Based Features
1 Introduction
2 Related Work
3 Methods
3.1 Data Exploration
3.2 Data Preprocessing
3.3 Feature Extraction
3.4 Model Ensemble
4 Experiment
5 Conclusion and Future Work
References
Overview of the NLPCC-ICCPOL 2016 Shared Task: Open Domain Chinese Question Answering
Abstract
1 Background
2 Task Description
2.1 KBQA Task
2.2 DBQA Task
3 Evaluation Metrics
4 Evaluation Results
5 Conclusion
References
Author Index

