
Human Language Technology. Challenges for Computer Science and Linguistics
Contents
- Title Page
- Preface
- Organization
- Table of Contents
- Speech Processing
- Data-Driven Approaches to Objective Evaluation of Phoneme Alignment Systems
- Introduction
- Experiments
- Data
- Front-End Processing
- HMM Training
- Differences between Systems
- Non-parametric Ranking of Variances
- Parametric Bayesian Models
- Results
- Non-parametric Ranking Results
- Parametric Bayesian Results
- Discussion
- Conclusions
- References
- Phonetically Transcribed Speech Corpus Designed for Context Based European Portuguese TTS
- Introduction
- Methodology
- Description
- The Speech Corpus by Graphemes
- Phonetic Transcription by Rules
- Phonetic Transcription with Vocalic Reduction
- The Vocalic Reduction Influence
- Conclusions
- References
- Robust Speech Recognition in the Car Environment
- Introduction
- Background
- Spectral Subtraction
- Weighted Finite State Transducers in Speech Recognition
- Adaptation of the Acoustic Model HMM
- Experimental Conditions
- Drivers Japanese Speech Corpus in Car Environment
- WFST Network Construction
- Evaluation of the Baseline Model
- Evaluation of Nonlinear Spectral Subtraction
- Speaker Adaptation of Acoustic Model
- Conclusions
- References
- Corpus Design for a Unit Selection TtS System with Application to Bulgarian
- Introduction
- Unit Selection Text to Speech
- Unit Selection Module
- Spoken Corpus for TTS
- Corpus Design Strategies
- Utterance Selection Methods
- The Proposed Corpus Selection Method
- Results
- Experimental Evaluation
- Discussion
- References
- Automatic Identification of Phonetic Similarity Based on Underspecification
- Introduction
- Speech Recognition System and Corpus
- Experiment
- Results
- General Information
- Analysis of Underspecified System
- Phone [f]
- Common Phonetic Properties of [f] and [th]
- Phonetic Neighbourhood
- Frequency of Occurrence
- Discussion
- Conclusion
- References
- Error Detection in Broadcast News ASR Using Markov Chains
- Introduction
- Features for Error Detection
- Models for Error Detection
- Maximum Entropy Models
- Markov Chains
- Gaussian Mixture Models
- Corpus
- Experiments
- Evaluation
- Automatic Transcription
- Error Detection Results
- Result Analysis
- Impact of the Transition Probability Matrix
- Summary and Future Work
- References
- Pronunciation and Writing Variants in an Under-Resourced Language: The Case of Luxembourgish Mobile N-Deletion
- Introduction
- The Study of Written and Spoken Variants
- Text Normalization for Written Variants
- Pronunciation Modeling of Spoken Variants
- Effects of Luxembourgish MND on Written and Pronunciation Variants
- The Current Study
- Data Collection
- Characterizing Potential Mobile -N Sites
- MND in Transcriptions
- MND and Word List Coverage
- Summary and Prospects
- References
- Morpheme-Based and Factored Language Modeling for Amharic Speech Recognition
- Introduction
- Language Modeling
- Factored Language Modeling
- The Morphology of Amharic
- The Baseline Speech Recognition System
- Speech and Text Corpus
- The Acoustic and Language Models
- Performance of the Baseline System
- Morpheme-Based and Factored Language Models
- Morpheme-Based Language Models
- Amharic Factored Language Models
- Lattice Rescoring Experiment
- Lattice Rescoring with Morpheme-Based Language Models
- Lattice Rescoring with Factored Language Models
- Conclusion
- References
- The Corpus Analysis Toolkit - Analysing Multilevel Annotations
- Introduction
- Linguistic Annotation
- Multilevel Annotations
- Tiers, Intervals, and Points
- Inter-tier Analysis
- Inter-annotation Analysis
- The Corpus Analysis Toolkit
- An Integrated Analytical Framework
- Supported Formats
- Internal Representation
- Toolkit Inventory
- Analysing a Corpus
- Corpus Overview
- Acquiring General Corpus Information
- Interval-Based Corpus Analysis
- Investigation of Temporal Inclusion
- Extracting Multilevel Representations
- The Toolkit and Emerging Standards
- Future Work
- Concluding Remarks
- References
- Computational Morphology/Lexicography
- Time Durations of Phonemes in Polish Language for Speech and Speaker Recognition
- Introduction
- Phoneme Segmentation
- Experimental Data
- Statistics Collection
- Results
- Conclusions
- References
- Polysemous Verb Classification Using Subcategorization Acquisition and Graph-Based Clustering
- Introduction
- Related Work
- Japanese Verb Description
- Selectional Preferences
- Link Analysis
- Distributional Similarity
- Clustering Method
- Experiments
- Data and Evaluation
- Results
- Conclusion
- References
- Estimating the Proximity between Languages by Their Commonality in Vocabulary Structures
- Introduction
- Basics of the Comparative Method
- Linguistic Specifications
- Formal Specifications
- Recent Works on Vocabulary Structure
- Analogy in Morphology
- A Measure of Similarity between Vocabulary Structures
- Experiments and Results
- Languages and Purpose of the Experiments
- Experiments with Swadesh Lists
- Experiments with a Multilingual Lexicon Extracted from the Acquis Communautaire
- Conclusion
- References
- Toposlaw - A Lexicographic Framework for Multi-word Units
- Introduction
- Objects in Toposlaw
- Morphology and Variants of a Name
- Morphological Description of Components
- Inflection Graphs
- Graph Management
- Filtering Graphs
- Tracing Paths in a Graph
- New Graphs
- Dictionary Management
- Conclusions and Perspectives
- References
- Parsing
- Parsing CFGs and PCFGs with a Chomsky-Schützenberger Representation
- Introduction
- Preliminaries
- Notation
- The Chomsky-Schützenberger Theorem
- An Encoding for CFGs
- Parsing with C-S Representations
- The Algorithm
- Analysis
- Weights and PCFGs
- An Example
- Conclusion and Further Work
- References
- Syntactic Analysis Using Finite Patterns: A New Parsing System for Czech
- Introduction
- Current Approaches to Czech Language Parsing
- SET - Syntactic Analysis as Pattern Matching Linking Rules
- The Parsing Algorithm
- The Pattern Definitions
- The SET Parser
- Experiments and Preliminary Results
- Testing Data
- Conclusions
- References
- Using SRX Standard for Sentence Segmentation
- Introduction
- SRX Standard
- Disambiguation Strategies
- Results for English and Polish
- Conclusions
- References
- Using Lexicon-Grammar Tables for French Verbs in a Large-Coverage Parser
- Introduction
- The Verbal Lexicon lglex
- The Lefff Syntactic Lexicon and the Alexina Format
- Conversion of the Verbal Lexicon lglex into a Lexicon in the Alexina Format
- Sketch of the Conversion Process
- Resulting Lexicon
- Integration in the frmg Parser
- Evaluation and Discussion
- Conclusion and Future Work
- References
- Computational Semantics
- Effect of Overt Pronoun Resolution in Topic Tracking
- Introduction
- Related Work
- Pronoun Resolution
- Pre-processing
- Identification of Antecedent
- Tracking Based on Term Weighting and Adaptation
- Experiments
- Anaphora Resolution
- Topic Tracking
- Conclusion
- References
- Sentiment Intensity: Is It a Good Summary Indicator?
- Introduction
- Related Work
- Sentiment Intensity and Summarisation
- Sentiment Annotated Corpus
- Sentiment Analysis
- Summarisation Algorithm Based on Sentiment Intensity
- Hypothesis Test for Sentiment Intensity Usefulness
- Experimental Results
- Sentiment Analysis
- Summarisation
- Conclusions
- References
- Syntactic Tree Kernels for Event-Time Temporal Relation Learning
- Introduction
- Previous Works
- Pattern Based Methods
- Rule Based Methods
- Anchor Based Methods
- Syntactic Tree Kernels in SVM
- Simple Event-Time Kernel
- Tree Kernels
- Composite Kernels
- Corpus Description
- Experiments
- Conclusion
- References
- The WSD Development Environment
- Introduction
- WSD Development Environment
- Corpora
- Feature Generation
- Feature Selection
- Machine Learning Algorithms
- Runtime
- Reports
- Experiments
- Conclusions
- References
- Semantic Analyzer in the Thetos-3 System
- Semantic Processing - General Premises
- Predicate-Argument Structure
- Semantic Interpretation of SGs
- Semantic Relations
- Facial Expressions and Gestures
- Requests
- Conclusion
- References
- Unsupervised and Open Ontology-Based Semantic Analysis
- Introduction
- Motivation, Theory and Practice
- The a-Grammar, a Pattern-Based Grammar
- A Tree-Like Representation Conversion
- A Compositional Analysis
- Examples
- Ontology-Based Semantic Analysis
- Evaluation
- Logical Form Evaluation
- DRS Evaluation
- Conclusion and Further Work
- References
- Entailment
- Non Compositional Semantics Using Rewriting
- Introduction
- Semantic Role Labelling
- Building and Using Semantic Representations
- From Labelled Dependency Structures to FOL Formulae
- Illustrating Example
- Checking Entailment
- Evaluation
- Conclusion
- References
- Defining Specialized Entailment Engines Using Natural Logic Relations
- Introduction
- Extended Model of Natural Logic
- Transformation-Based TE and Specialized Entailment Engines
- Entailment Rules and Atomic Edits
- Combination Based on Natural Logics
- Order of Composition
- Example of Application of the Proposed Framework to RTE Pairs
- Conclusions
- References
- Dialogue Modeling and Processing
- Czech Senior COMPANION: Wizard of Oz Data Collection and Expressive Speech Corpus Recording and Annotation
- Introduction
- Data Collection Process
- Dialogue Corpus Characteristics
- Design and Recording of the Expressive Speech Corpus
- Annotation Using Communicative Functions
- Conclusions and Future Work
- References
- Abstractive Summarization of Voice Communications
- Introduction
- Related Work
- Paper Outline
- Automatic Argumentative Analysis
- Argumentative Structure - Issues and Theories
- Computing Argumentative Annotations
- The A3 Algorithm
- Experimental Results
- Abstract Summarization of Conversations
- Conclusions
- Future Work
- References
- Natural Language Based Communication between Human Users and the Emergency Center: POLINT-112-SMS
- Credits
- Introduction
- Challenging Aspects of the Project
- User Modeling
- Linguistic Challenge of SMS Processing
- Knowledge Representation and Reasoning
- System Development Methodology
- Elements of the Logical/Physical Model. System Architecture
- Language Coverage Related Issues
- Project Resources
- PolNet
- Verbo-Nominal Collocations Dictionary
- Concluding Remarks
- References
- Dialogue Organization in Polint-112-SMS
- Introduction
- System Architecture
- Dialogue Organization
- The "Philosophy" of Dialogue in the System
- Responsibilities of the Dialogue Maintenance Module
- Dialogue-Oriented Features of the Situation Analysis Module
- Evaluation
- User Surveys
- Known Problems
- References
- Digital Language Resources
- Valuable Language Resources and Applications Supporting the Use of Basque
- Introduction
- Strategy to Develop HLT in Basque
- Useful Applications and Resources
- Spelling Checker/Corrector
- Lemmatization-Based On-Line Dictionaries
- Lemmatization-Based Search Machine
- Transfer-Based Machine Translation System
- EDBL: Lexical Database for Basque
- BasWN: Basque WordNet
- EPEC: Syntactically Annotated Text Corpus
- ZTC: Morphosyntactically Annotated Text Corpus
- Conclusions
- References
- Clues to Compare Languages for Morphosyntactic Analysis: A Study Run on Parallel Corpora and Morphosyntactic Lexicons
- Introduction
- State of the Art
- Description of Resources
- Experiment and Analysis of the Results
- Corpus-Based Study
- Using Morphosyntactic Lexicons
- Conclusions and Further Work
- References
- Corpus Clouds - Facilitating Text Analysis by Means of Visualizations
- Introduction
- Corpus Clouds
- Corpus Query Tool
- Corpus Inquiry Tasks and the Aim of Corpus Clouds
- Design Overview
- Challenges with the Visualizations
- Some Design Principles
- Evaluation and Future Work
- Conclusion
- References
- Acquiring Bilingual Lexica from Keyword Listings
- Introduction
- Collecting the Corpus
- Procedure for the Keywords Extraction
- Scan and Split
- Alignment
- Evaluation
- Alignment with GIZA as a Baseline
- Recall from Documents
- Precision
- Conclusions
- References
- Annotating Sanskrit Corpus: Adapting IL-POSTS
- Introduction
- POS Tagging in Sanskrit
- Sanskrit Morphology
- MSRI Hierarchical Tagset Schema
- Adaptations for Sanskrit
- Proposed IL-POSTS for Sanskrit
- POS Results and Current Status
- Conclusion
- References
- Effective Authoring Procedure for E-learning Courses' Development in Philological Curriculum Based on LOs Ideology
- Introduction
- Theoretical Background
- Authoring Practice and Guiding Principles
- Course Prototype Structures
- Course Design Requirements
- Results
- References
- Acquisition of Spatial Relations from an Experimental Corpus
- Introduction
- Description of the Problem
- Experiment
- Results
- Type 1
- Type 2
- Results and the Participants' Profiles
- Conclusion
- References
- Which XML Standards for Multilevel Corpus Annotation?
- Introduction
- Requirements
- Standards and Best Practices
- ISO TC37 / SC4
- TEI
- XCES
- TIGER-XML
- PAULA
- Discussion
- Standards in NKJP
- Metadata, Primary Data and Structure
- Segmentation
- Morphosyntax
- Syntactic Words
- Named Entities and Syntactic Groups
- Word Senses
- Conclusion
- References
- Corpus Academicum Lithuanicum: Design Criteria, Methodology, Application
- Introduction
- The Use of Language Corpora
- Digitalised Resources of the Lithuanian Language
- The Building of Corpus Academicum Lithuanicum
- Corpus Design
- Representativeness
- Encoding of Textual Data
- Automatic Encoding
- Perspectives
- References
- WordNet
- The EM-Based Wordnet Synsets Annotation of NP/PP Heads
- Introduction
- Data Resources
- Semantic Annotation
- The EM Selection Algorithm
- Related Works
- The Experiment
- Manually Annotated Data for an Evaluation of the Algorithm
- Efficiency of the Algorithm
- Evaluation of the Algorithm
- Conclusions
- References
- Unsupervised Word Sense Disambiguation with Lexical Chains and Graph-Based Context Formalization
- Introduction
- Lexical Chains
- The WSD Algorithm
- Evaluations
- Conclusions
- References
- An Access Layer to PolNet - Polish WordNet
- Introduction
- Access Layer Architecture
- WQuery Language
- Data Types
- Basic Syntax
- Typical Queries Appearing in POLINT-112-SMS
- Obtaining Word Meanings
- Creating and Composing Frames
- Refreshing PolNet Cache
- Discussion
- Conclusion
- References
- Document Processing
- OTTO: A Tool for Diplomatic Transcription of Historical Texts
- Introduction
- Requirements of Transcription Tools
- Characteristics of Historical Texts
- Meta-information: Header and Comments
- Requirements of Transcription Tools
- Related Work
- OTTO
- Conclusion and Future Work
- References
- Automatic Author Attribution for Short Text Documents
- Introduction
- Authorship Analysis Classification
- Features
- Algorithm
- Corpus
- Experiments
- Conclusions
- References
- BioExcom: Detection and Categorization of Speculative Sentences in Biomedical Literature
- Introduction
- Task
- Goal
- Definition of Biological Speculation in Articles
- Importance of Speculative Sentences in Biological Literature
- Categorization into Prior and New Speculation
- Automatic Annotation of Speculative Sentences by Contextual Exploration Processing
- The Contextual Exploration Processing
- Computational Architecture of the CE Engine and Overview of Text Treatment
- The Linguistic Markers of Speculation in Biological Sentences
- Categorization of Speculative Sentences
- BioExcom Implementation
- Evaluation
- Evaluation Methodology
- Results of the Evaluation
- Perspectives
- References
- Experimenting with Automatic Text Summarisation for Arabic
- Introduction
- Related Work
- Summarisers for Arabic: AQBTSS and ACBTSS
- ACBTSS Concepts
- Experimental Design
- The Document Collection
- The Evaluation Scale
- The Subjects
- Additional Experiments with Sakhr
- Results
- AQBTSS versus ACBTSS
- Sakhr Summarisation System
- Discussion of Results
- Conclusions and Future Work
- References
- Enhancing Opinion Extraction by Automatically Annotated Lexical Resources
- Introduction
- Related Work
- The Opinion Extraction System
- The Learning and Classification System
- Lexical Resources for OM
- Experiments
- The WWC Opinion Markup Language
- The Datasets: MPQA and I-CAB Opinion
- Evaluation Models and Measures
- The Experiments
- Results and Conclusions
- The Results
- Statistical Significance Tests
- Inter-Annotator Agreement
- Conclusions
- References
- Technical Trend Analysis by Analyzing Research Papers' Titles
- Introduction
- System Behavior
- Related Work
- Utilization of Research Papers' Structures
- Automatic Generation of Survey Articles and Technical Trend Maps
- Analysis of Research Papers' Titles
- Analyzing the Structure of Japanese Titles
- Analyzing the Structure of English Titles
- Experiments
- Experimental Method
- Experimental Results
- Discussion
- Conclusion
- References
- Information Processing (IR, IE, other)
- Extracting and Visualizing Quotations from News Wires
- Introduction
- Related Work
- Overall Architecture
- Pre-processing with SXPipe
- Named Entities Recognition
- Verbatims Extraction
- Parsing and Post-processing
- Anaphora Resolution
- Quotation Extraction
- Web Interface for Visualization
- Conclusions and Perspectives
- References
- Using Wikipedia to Improve Precision of Contextual Advertising
- Introduction
- Problem Statement
- Contributions
- Organizations
- Related Work
- Keyword Matching
- Semantic Advertising
- Keyword Extraction
- Wikipedia Matching
- Finding Similar Articles
- Dimension Reduction
- Combining Ranking Functions
- Experiments
- Data and Methodology
- Average Precision
- Results for the Ambiguous Dataset
- Performance Gain and t-Interval
- Conclusion
- References
- Unsupervised Extraction of Keywords from News Archives
- Introduction
- Related Work
- Belga News Agency Archive
- Automatic Extraction of Keywords
- TextRank
- Chi-Square Test
- Information Radius
- Raw Frequency
- Evaluation
- Conclusions
- References
- Machine Translation
- Automatic Evaluation of Texts by Using Paraphrases
- Introduction
- Related Work
- Automatic Evaluation of Texts
- Text Evaluation Using Paraphrases
- The Benefits of Paraphrases in Text Evaluation
- Data
- Paraphrases in Text Evaluation
- An Automatic Method of Text Evaluation Using Paraphrases
- Procedure for Text Evaluation
- Paraphrase Methods
- Experiments
- Experimental Settings
- Experimental Results
- Discussion
- Conclusions
- References
- Packing It All Up in Search for a Language Independent MT Quality Measure Tool - Part Two
- Introduction
- Research Setting
- Results
- Original Results with Complearn
- Results with WMT08 and WMT10 Data
- Discussion and Conclusions
- References
- Author Index
System Requirements
File format: PDF
Copy protection: Watermark DRM (Digital Rights Management)
- Computer (Windows; MacOS X; Linux): To read the file, use the free software Adobe Reader, Adobe Digital Editions, or another PDF viewer of your choice (see E-Book Help).
- Tablet/Smartphone (Android; iOS): Before downloading, install the free Adobe Digital Editions app or the PocketBook app (see E-Book Help).
- E-book reader: Bookeen, Kobo, Pocketbook, Sony, Tolino, and many others.
The PDF file format always displays a book page identically on any hardware. A PDF is therefore also suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small displays of e-readers or smartphones, PDFs are unfortunately rather tedious because too much scrolling is required. Watermark DRM is a "soft" form of copy protection. Technically, everything therefore remains possible, even unauthorized redistribution. However, the buyer of the e-book is recorded as a watermark in visible and invisible places, so that in the event of misuse the trail can be traced back.
For more information, see our E-Book Help.