
How to do Linguistics with R
Description
This book has a companion website: http://doi.org/10.1075/z.195.website
Contents
- Intro
- How to do Linguistics with R
- Title page
- LCC data
- Dedication page
- Table of contents
- Acknowledgements
- Introduction
- 1. Who is this book written for?
- 2. The quantitative turn in linguistics
- 3. How to use this textbook
- 1. What is statistics?
- What you will learn from this chapter:
- 1.1 Statistics and statistics
- 1.2 Formulating and testing your hypotheses
- 1.2.1 Null and alternative hypotheses
- 1.2.2 Those mysterious p-values
- 1.2.3 Type I and Type II errors
- 1.2.4 One-tailed and two-tailed statistical tests
- 1.3 What statistics cannot do for you
- 1.4 Types of variables
- 1.5 Summary
- 2. Introduction to R
- What you will learn from this chapter:
- 2.1 Introduction
- 2.2 Installation of the basic distribution and add-on packages
- 2.3 First steps with R
- 2.3.1 Starting R
- 2.3.2 R syntax
- 2.3.3 Exiting from R or terminating a process
- 2.3.4 Getting help
- 2.4 Main types of R objects
- 2.5 RStudio
- 2.6 Importing and exporting your data and saving your graphs
- 2.6.1 Importing your data to R
- 2.6.2 Exporting your data from R
- 2.6.3 Saving your graphs
- 2.7 Summary
- 3. Descriptive statistics for quantitative variables
- What you will learn from this chapter:
- 3.1 Analysing the distribution of word lengths: Basic descriptive statistics
- 3.1.1 The data
- 3.1.2 Measures of central tendency
- 3.1.3 Measures of dispersion
- 3.2 Bad times, good times: Visualization of a distribution and finding outliers
- 3.3 Zipf's law and word frequency: Transformation of quantitative variables
- 3.4 Summary
- 4. How to explore qualitative variables
- What you will learn from this chapter:
- 4.1 Frequency tables, proportions and percentages
- 4.2 Visualization of categorical data
- 4.3 Basic Colour Terms: Deviations of Proportions in subcorpora
- 4.3.1 The data and hypothesis
- 4.3.2 Deviation of proportions as a measure of dispersion
- 4.4 Summary
- 5. Comparing two groups
- What you will learn from this chapter:
- 5.1 Comparing group means (medians): An overview of the tests
- 5.2 Comparing the number of associations triggered by high- and low-frequency nouns with the help of an independent t-test
- 5.2.1 Data and hypothesis
- 5.2.2 Descriptive statistics and visualizations
- 5.2.3 Choosing an appropriate test to compare the measures of central tendency in two groups
- 5.2.4 Confidence intervals and standard errors
- 5.3 Comparing concreteness scores of high- and low-frequency nouns with the help of a two-tailed Wilcoxon test
- 5.3.1 Data and hypotheses
- 5.3.2 Descriptive statistics and visualizations: Strip charts and rug plots
- 5.3.3 Inferential statistics: The two-tailed Wilcoxon test
- 5.4 Comparing associations produced by native and non-native speakers: A paired one-tailed t-test
- 5.4.1 Creating simulation data
- 5.4.2 Performing the paired t-test
- 5.5 Summary
- 6. Relationships between two numerical variables
- What you will learn from this chapter:
- 6.1 What is correlation?
- 6.2 Word length and word recognition: The Pearson product-moment correlation coefficient
- 6.2.1 The data and hypothesis
- 6.2.2 Descriptive statistics and visualizations
- 6.2.3 Testing the significance of the correlation coefficient
- 6.3 Emergence of grammar from lexicon: Spearman's ρ and Kendall's τ
- 6.3.1 The data and hypothesis
- 6.3.2 Exploring the data and computing correlation coefficients
- 6.4 Visualization of correlations between more than two variables with the help of correlograms
- 6.5 Summary
- 7. More on frequencies and reaction times
- What you will learn from this chapter:
- 7.1 The basic principles of linear regression analysis
- 7.2 Putting several factors together: Predicting reaction times in a lexical decision task
- 7.2.1 Data and hypotheses
- 7.2.2 The lm() function and interpretation of its output
- 7.2.3 Selecting the explanatory variables
- 7.2.4 Checking for outliers and overly influential observations
- 7.2.5 Checking the regression assumptions
- 7.2.6 Testing and interpreting interactions
- 7.2.7 Checking for overfitting
- 7.2.8 Robust regression: Bootstrap
- 7.3 Summary
- 8. Finding differences between several groups
- What you will learn from this chapter:
- 8.1 What is ANOVA?
- 8.2 Motion events in Nicaraguan Sign Language: Independent one-way ANOVA
- 8.2.1 Theoretical background and data
- 8.2.2 Exploring the data
- 8.2.3 Assumptions of one-way parametric ANOVA
- 8.2.4 Performing parametric one-way ANOVA
- 8.2.5 Alternative tests
- 8.2.6 Post-hoc tests
- 8.3 Development of spatial modulations in Nicaraguan Sign Language: Independent factorial (two-way) ANOVA
- 8.3.1 The data and hypothesis
- 8.3.2 Descriptive statistics for different groups and interaction plot
- 8.3.3 Assumptions of parametric factorial ANOVA
- 8.3.4 ANOVA and orthogonal contrasts
- 8.3.5 Alternative tests
- 8.3.6 Post-hoc tests
- 8.4 Do native English and native Mandarin Chinese speakers conceptualize time differently? Repeated-measures and mixed-design ANOVA (mixed GLM method)
- 8.4.1 The data and hypothesis
- 8.4.2 Fitting a mixed-design ANOVA with the help of mixed GLM
- 8.4.3 Post-hoc tests
- 8.5 Summary
- 9. Measuring associations between two categorical variables
- What you will learn from this chapter:
- 9.1 Testing independence
- 9.2 The story of over is not over: Metaphoric and non-metaphoric uses in two registers (analysis of a 2-by-2 contingency table)
- 9.2.1 The data and hypothesis
- 9.2.2 Visualizations, proportions and measures of effect size: Odds ratios, Cramér's V and the ϕ coefficient
- 9.2.3 Testing statistical significance: The χ²-test of independence
- 9.3 Metaphorical and non-metaphorical uses of see in four registers (analysis of a 4-by-2 table)
- 9.3.1 The data and hypothesis
- 9.3.2 Descriptive statistics and visualizations
- 9.3.3 Testing the statistical significance and analysing the residuals: The χ²-test and mosaic and association plots
- 9.4 Summary
- 10. Association measures
- What you will learn from this chapter:
- 10.1 Measures of association: A brief typology
- 10.1.1 Frequencies that you will need in order to compute association measures
- 10.1.2 Unidirectional (asymmetric) vs. bidirectional (symmetric) measures
- 10.1.3 Contingency-based vs. non-contingency-based measures
- 10.2 Case study: The Russian ditransitive construction and its collexemes
- 10.2.1 Theoretical background and data
- 10.2.2 Computation of some popular association measures
- 10.3 Summary
- 11. Geographic variation of quite: Distinctive collexeme analysis
- What you will learn from this chapter:
- 11.1 Introduction to distinctive collexeme analysis
- 11.2 Distinctive collexeme analysis of quite + ADJ in different varieties of English
- 11.2.1 Theoretical background and data
- 11.2.2 Simple distinctive collexeme analysis of quite + ADJ in British and American English
- 11.2.3 Multiple distinctive collexeme analysis: Quite + ADJ in the British, American and Canadian varieties of English
- 11.3 Summary
- 12. Probabilistic multifactorial grammar and lexicology
- What you will learn from this chapter:
- 12.1 Introduction to logistic regression
- 12.2 Logistic regression model of Dutch causative auxiliaries doen and laten
- 12.2.1 Theoretical background and data
- 12.2.2 Fitting a binary logistic regression model: Main functions
- 12.2.3 Selection of variables
- 12.2.4 Testing possible interactions
- 12.2.5 Identifying outliers and overly influential observations
- 12.2.6 Checking the regression assumptions
- 12.2.7 Testing for overfitting
- 12.2.8 Interpretation of the model
- 12.3 Summary
- 13. Multinomial (polytomous) logistic regression models of three and more near synonyms
- What you will learn from this chapter:
- 13.1 What is multinomial regression?
- 13.2 Multinomial models of English permissive constructions
- 13.2.1 Data and hypotheses
- 13.2.2 Contrasting allow and permit with let
- 13.2.3 'One vs. rest' approach
- 13.3 Summary
- 14. Conditional inference trees and random forests
- What you will learn from this chapter:
- 14.1 Conditional inference trees and random forests
- 14.2 Conditional inference trees and random forests of three English causative constructions
- 14.2.1 The data and hypotheses
- 14.2.2 Fitting a conditional inference tree model
- 14.2.3 Random forests
- 14.3 Summary
- 15. Behavioural profiles, distance metrics and cluster analysis
- What you will learn from this chapter:
- 15.1 What are Behavioural Profiles?
- 15.2 Behavioural Profiles of English analytic causatives
- 15.2.1 Data and theoretical background
- 15.2.2 Computation of numeric BP vectors from the categorical data
- 15.2.3 Distance matrix
- 15.2.4 Hierarchical cluster analysis
- 15.2.4.1 Identifying the clusters
- 15.2.4.2 Interpretation of the cluster solution: Snake plots and effect size measures
- 15.2.4.3 Validation of a cluster solution
- 15.2.5 Partitioning methods
- 15.2.5.1 General introduction
- 15.2.5.2 Partitioning around centroids (k-means)
- 15.2.5.3 Partitioning around medoids
- 15.3 Summary
- 16. Introduction to Semantic Vector Spaces
- What you will learn from this chapter:
- 16.1 Distributional models of semantics and Semantic Vector Space models
- 16.2 A Semantic Vector Space model of English verbs of cooking
- 16.2.1 Theoretical background and data
- 16.2.2 Creating vectors of weighted co-occurrence frequencies
- 16.2.3 Cosine similarity
- 16.3 Summary
- 17. Language and space
- What you will learn from this chapter:
- 17.1 Making maps with R
- 17.2 What is multidimensional scaling?
- 17.3 Computation and representation of geographical distances
- 17.4 Computation and representation of linguistic distances: The Kruskal non-metric MDS
- 17.4.1 Recoding the dataset
- 17.4.2 Computation of Gower distances
- 17.5 The Mantel test for distance matrices
- 17.6 Summary
- 18. Multidimensional analysis of register variation
- What you will learn from this chapter:
- 18.1 Multidimensional analysis of register variation
- 18.2 Case study: Register variation in the British National Corpus
- 18.2.1 The data and research question
- 18.2.2 Principal Component Analysis
- 18.2.3 Factor Analysis
- 18.3 Summary
- 19. Exemplars, categories, prototypes
- What you will learn from this chapter:
- 19.1 Register variation of Basic Colour Terms: Simple Correspondence Analysis
- 19.1.1 The data and hypothesis
- 19.1.2 Simple Correspondence Analysis
- 19.2 Visualization of exemplars and prototypes of lexical categories: Multiple Correspondence Analysis of Stuhl and Sessel
- 19.2.1 The data and theoretical background
- 19.2.2 Multiple Correspondence Analysis
- 19.3 Summary
- 20. Constructional change and motion charts
- What you will learn from this chapter:
- 20.1 The past and present of the future: Diachronic motion charts of be going to and will
- 20.1.1 Theoretical background and data
- 20.1.2 Motion charts
- 20.2 Summary
- Epilogue
- Appendix 1. The most important R objects and basic operations with them
- Appendix 2. Main plotting functions and graphical parameters in R
- 24.1 A scatterplot with text labels
- References
- Subject Index
System requirements
File format: PDF
Copy protection: Adobe DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free Adobe Digital Editions software before downloading (see E-Book Help).
- Tablet/smartphone (Android; iOS): Install the free Adobe Digital Editions app or the PocketBook app before downloading (see E-Book Help).
- E-book readers: Bookeen, Kobo, Pocketbook, Sony, Tolino and many others (not Kindle)
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small displays of e-readers or smartphones, PDFs are unfortunately rather tedious to read because too much scrolling is required.
Adobe DRM is a "hard" form of copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the e-book. You therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend authorizing the reading software with your personal Adobe ID after installing it.
Further information is available in our E-Book Help.