
Programming for Corpus Linguistics with Python and Dataframes
Daniel Keller (Author)
Cambridge University Press
Published on 20 June 2024
Book
Paperback/Softback
114 pages
978-1-108-82258-9 (ISBN)
Description
This Element offers intermediate and experienced programmers algorithms for corpus linguistic (CL) programming in Python using dataframes, which provide a fast, efficient, and intuitive set of methods for working with large, complex datasets such as corpora. It demonstrates principles of dataframe programming applied to CL analyses and gives complete algorithms for creating concordances; producing lists of collocates, keywords, and lexical bundles; and performing key feature analysis. It also presents an algorithm for creating dataframe corpora, including methods for tokenizing, part-of-speech tagging, and lemmatizing with spaCy. Together these form a set of core skills that can be applied to a range of CL research questions, as well as to original analyses not possible with existing corpus software.
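To give a sense of what a dataframe-based concordance might look like, here is a minimal sketch. It is not taken from the book: the toy corpus, the `concordance` function, and the column names (`token`, `node`, `L1`, `R1`, and so on) are assumptions made for illustration. It assumes a corpus stored with one token per row and uses pandas `Series.shift` to line up each token with its neighbours.

```python
import pandas as pd

# Hypothetical toy corpus: one row per token (an assumed layout, not the book's).
tokens = "the cat sat on the mat and the dog sat on the rug".split()
corpus = pd.DataFrame({"token": tokens})


def concordance(df, node, window=2):
    """Return a keyword-in-context table for `node`.

    Each hit gets `window` tokens of left context (L2, L1, ...)
    and `window` tokens of right context (R1, R2, ...).
    """
    out = pd.DataFrame({"node": df["token"]})
    # shift(i) moves tokens down, so row n sees the token i positions earlier.
    for i in range(window, 0, -1):
        out[f"L{i}"] = df["token"].shift(i)
    # shift(-i) moves tokens up, so row n sees the token i positions later.
    for i in range(1, window + 1):
        out[f"R{i}"] = df["token"].shift(-i)
    left = [f"L{i}" for i in range(window, 0, -1)]
    right = [f"R{i}" for i in range(1, window + 1)]
    hits = out.loc[out["node"] == node, left + ["node"] + right]
    # Context that falls off either end of the corpus is shown as an empty string.
    return hits.fillna("")


kwic = concordance(corpus, "sat", window=2)
print(kwic)
```

Because the context columns are built with vectorised shifts and not a Python loop over tokens, the same approach should scale to much larger corpora, though a real corpus would also need handling for text and sentence boundaries.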
More details
Language
English
Place of publication
Cambridge
United Kingdom
Target group
Professional and scholarly
Product notice
Paperback (trade)
Illustrations
Worked examples or Exercises
Dimensions
Height: 229 mm
Width: 152 mm
Thickness: 7 mm
Weight
178 g
ISBN-13
978-1-108-82258-9 (9781108822589)
€75.30
Content
1. Data frame corpora
2. Python basics for corpus linguistics
3. Working with data frames
4. Algorithms for common corpus linguistic tasks
5. Creating data frame corpora
6. Conclusion
References