
Programming for Corpus Linguistics
Description
All about e-books | Answers to questions about e-books, copy protection, and file formats can be found in our Info & Help section.

Content
- Intro
- Contents
- I Programming and Corpus Linguistics
- 1 Introduction
- 1.1 PROGRAMMING IN CORPUS LINGUISTICS
- 1.1.1 The Computer in Corpus Linguistics
- 1.1.2 Which Programming Language?
- 1.1.3 Useful Aspects of Java
- 1.1.4 Programming Language Classification
- 1.2 ROAD-MAP
- 1.2.1 What is Covered
- 1.2.2 Other Features of Java
- 1.3 GETTING JAVA
- 1.4 PREPARING THE SOURCE
- 1.4.1 Running your First Program
- 1.5 SUMMARY
- 2 Introduction to Basic Programming Concepts
- 2.1 WHAT DOES A PROGRAM DO?
- 2.1.1 What is an Algorithm?
- 2.1.2 How to Express an Algorithm
- 2.2 CONTROL FLOW
- 2.2.1 Sequence
- 2.2.2 Choice
- 2.2.3 Multiple Choice
- 2.2.4 Loop
- 2.3 VARIABLES AND DATA TYPES
- 2.3.1 Numerical Data
- 2.3.2 Character Data
- 2.3.3 Composite Data
- 2.4 DATA STORAGE
- 2.4.1 Internal Storage: Memory
- 2.4.2 External Storage: Files
- 2.5 SUMMARY
- 3 Basic Corpus Concepts
- 3.1 HOW TO COLLECT DATA
- 3.1.1 Typing
- 3.1.2 Scanning
- 3.1.3 Downloading
- 3.1.4 Other Media
- 3.2 HOW TO STORE TEXTUAL DATA
- 3.2.1 Corpus Organisation
- 3.2.2 File Formats
- 3.3 MARK-UP AND ANNOTATIONS
- 3.3.1 Why Use Annotations?
- 3.3.2 Different Ways to Store Annotations
- 3.3.3 Error Correction
- 3.4 COMMON OPERATIONS
- 3.4.1 Word Lists
- 3.4.2 Concordances
- 3.4.3 Collocations
- 3.5 SUMMARY
- 4 Basic Java Programming
- 4.1 OBJECT-ORIENTED PROGRAMMING
- 4.1.1 What is a Class, What is an Object?
- 4.1.2 Object Properties
- 4.1.3 Object Operations
- 4.1.4 The Class Definition
- 4.1.5 Accessibility: Private and Public
- 4.1.6 APIs and their Documentation
- 4.2 INHERITANCE
- 4.2.1 Multiple Inheritance
- 4.3 SUMMARY
- 5 The Java Class Library
- 5.1 PACKAGING IT UP
- 5.1.1 Introduction
- 5.1.2 The Standard Packages
- 5.1.3 Extension Packages
- 5.1.4 Creating your Own Package
- 5.2 ERRORS AND EXCEPTIONS
- 5.3 STRING HANDLING IN JAVA
- 5.3.1 String Literals
- 5.3.2 Combining Strings
- 5.3.3 The String API
- 5.3.4 Changing Strings: The StringBuffer
- 5.4 OTHER USEFUL CLASSES
- 5.4.1 Container Classes
- 5.4.2 Array
- 5.4.3 Vector
- 5.4.4 Hashtable
- 5.4.5 Properties
- 5.4.6 Stack
- 5.4.7 Enumeration
- 5.5 THE COLLECTION FRAMEWORK
- 5.5.1 Introduction
- 5.5.2 Collection
- 5.5.3 Set
- 5.5.4 List
- 5.5.5 Map
- 5.5.6 Iterator
- 5.5.7 Collections
- 5.6 SUMMARY
- 6 Input/Output
- 6.1 THE STREAM CONCEPT
- 6.1.1 Streams and Readers
- 6.2 FILE HANDLING
- 6.2.1 Reading from a File
- 6.2.2 Writing to a File
- 6.3 CREATING YOUR OWN READERS
- 6.3.1 The ConcordanceReader
- 6.3.2 Limitations & Problems
- 6.4 RANDOM ACCESS FILES
- 6.4.1 Indexing
- 6.4.2 Creating the Index
- 6.4.3 Complex Queries
- 6.5 SUMMARY
- 6.6 STUDY QUESTIONS
- 7 Processing Plain Text
- 7.1 SPLITTING A TEXT INTO WORDS
- 7.1.1 Problems with Tokenisation
- 7.2 THE STRINGTOKENIZER CLASS
- 7.2.1 The StringTokenizer API
- 7.2.2 The PreTokeniser Explained
- 7.2.3 Example: The File Tokeniser
- 7.2.4 The FileTokeniser Explained
- 7.3 CREATING WORD LISTS
- 7.3.1 Storing Words in Memory
- 7.3.2 Alphabetical Wordlists
- 7.3.3 Frequency Lists
- 7.3.4 Sorting and Resorting
- 7.4 SUMMARY
- 8 Dealing with Annotations
- 8.1 INTRODUCTION
- 8.2 WHAT IS XML?
- 8.2.1 An Informal Description of XML
- 8.3 WORKING WITH XML
- 8.3.1 Integrating XML into your Application
- 8.3.2 An XML Tokeniser
- 8.3.3 An XML Checker
- 8.4 SUMMARY
- II Language Processing Examples
- 9 Stemming
- 9.1 INTRODUCTION
- 9.2 PROGRAM DESIGN
- 9.3 IMPLEMENTATION
- 9.3.1 The Stemmer Class
- 9.3.2 The RuleLoader Class
- 9.3.3 The Rule Class
- 9.3.4 The Rule File
- 9.4 TESTING
- 9.4.1 Output
- 9.4.2 Expansion
- 9.5 STUDY QUESTIONS
- 10 Part of Speech Tagging
- 10.1 INTRODUCTION
- 10.2 PROGRAM DESIGN
- 10.3 IMPLEMENTATION
- 10.3.1 The Processor
- 10.3.2 The Lexicon
- 10.3.3 The Suffix Analyser
- 10.3.4 The Transition Matrix
- 10.4 TESTING
- 10.5 STUDY QUESTIONS
- 11 Collocation Analysis
- 11.1 INTRODUCTION
- 11.1.1 Environment
- 11.1.2 Benchmark Frequency
- 11.1.3 Evaluation Function
- 11.2 SYSTEM DESIGN
- 11.3 IMPLEMENTATION
- 11.3.1 The Collocate
- 11.3.2 The Comparators
- 11.3.3 The Span
- 11.3.4 The Collocator
- 11.3.5 The Utility Class
- 11.4 TESTING
- 11.5 STUDY QUESTIONS
- III Appendices
- 12 Appendix
- 12.1 A LIST OF JAVA KEYWORDS
- 12.2 RESOURCES
- 12.3 RINGCONCORDANCEREADER
- 12.4 REFERENCES
- Index
System requirements
File format: PDF
Copy-Protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware, which makes it suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.