How to do Linguistics with R

Name: How to do Linguistics with R | Data exploration and statistical analysis
Brand: John Benjamins Publishing Company
Price: 130.99 EUR
Availability: OnlineOnly

Data exploration and statistical analysis

Natalia Levshina (Author)

John Benjamins Publishing Company

1st Edition

Published on 25 November 2015

XI, 443 pages

E-Book

PDF with Adobe-DRM

System requirements

978-90-272-6845-7 (ISBN)

€130.99 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Intro
How to do Linguistics with R
Title page
LCC data
Dedication page
Table of contents
Acknowledgements
Introduction
1. Who is this book written for?
2. The quantitative turn in linguistics
3. How to use this textbook
1. What is statistics?
What you will learn from this chapter:
1.1 Statistics and statistics
1.2 Formulating and testing your hypotheses
1.2.1 Null and alternative hypotheses
1.2.2 Those mysterious p-values
1.2.3 Type I and Type II errors
1.2.4 One-tailed and two-tailed statistical tests
1.3 What statistics cannot do for you
1.4 Types of variables
1.5 Summary
2. Introduction to R
What you will learn from this chapter:
2.1 Introduction
2.2 Installation of the basic distribution and add-on packages
2.3 First steps with R
2.3.1 Starting R
2.3.2 R syntax
2.3.3 Exiting from R or terminating a process
2.3.4 Getting help
2.4 Main types of R objects
2.5 RStudio
2.6 Importing and exporting your data and saving your graphs
2.6.1 Importing your data to R
2.6.2 Exporting your data from R
2.6.3 Saving your graphs
2.7 Summary
3. Descriptive statistics for quantitative variables
What you will learn from this chapter:
3.1 Analysing the distribution of word lengths: Basic descriptive statistics
3.1.1 The data
3.1.2 Measures of central tendency
3.1.3 Measures of dispersion
3.2 Bad times, good times: Visualization of a distribution and finding outliers
3.3 Zipf's law and word frequency: Transformation of quantitative variables
3.4 Summary
4. How to explore qualitative variables
What you will learn from this chapter:
4.1 Frequency tables, proportions and percentages
4.2 Visualization of categorical data
4.3 Basic Colour Terms: Deviations of Proportions in subcorpora
4.3.1 The data and hypothesis
4.3.2 Deviation of proportions as a measure of dispersion
4.4 Summary
5. Comparing two groups
What you will learn from this chapter:
5.1 Comparing group means (medians): An overview of the tests
5.2 Comparing the number of associations triggered by high- and low-frequency nouns with the help of an independent t-test
5.2.1 Data and hypothesis
5.2.2 Descriptive statistics and visualizations
5.2.3 Choosing an appropriate test to compare the measures of central tendency in two groups
5.2.4 Confidence intervals and standard errors
5.3 Comparing concreteness scores of high- and low-frequency nouns with the help of a two-tailed Wilcoxon test
5.3.1 Data and hypotheses
5.3.2 Descriptive statistics and visualizations: Strip charts and rug plots
5.3.3 Inferential statistics: The two-tailed Wilcoxon test
5.4 Comparing associations produced by native and non-native speakers: A paired one-tailed t-test
5.4.1 Creating simulation data
5.4.2 Performing the paired t-test
5.5 Summary
6. Relationships between two numerical variables
What you will learn from this chapter:
6.1 What is correlation?
6.2 Word length and word recognition: The Pearson product-moment correlation coefficient
6.2.1 The data and hypothesis
6.2.2 Descriptive statistics and visualizations
6.2.3 Testing the significance of the correlation coefficient
6.3 Emergence of grammar from lexicon: Spearman's ρ and Kendall's τ
6.3.1 The data and hypothesis
6.3.2 Exploring the data and computing correlation coefficients
6.4 Visualization of correlations between more than two variables with the help of correlograms
6.5 Summary
7. More on frequencies and reaction times
What you will learn from this chapter:
7.1 The basic principles of linear regression analysis
7.2 Putting several factors together: Predicting reaction times in a lexical decision task
7.2.1 Data and hypotheses
7.2.2 The lm() function and interpretation of its output
7.2.3 Selecting the explanatory variables
7.2.4 Checking for outliers and overly influential observations
7.2.5 Checking the regression assumptions
7.2.6 Testing and interpreting interactions
7.2.7 Checking for overfitting
7.2.8 Robust regression: Bootstrap
7.3 Summary
8. Finding differences between several groups
What you will learn from this chapter:
8.1 What is ANOVA?
8.2 Motion events in Nicaraguan Sign Language: Independent one-way ANOVA
8.2.1 Theoretical background and data
8.2.2 Exploring the data
8.2.3 Assumptions of one-way parametric ANOVA
8.2.4 Performing parametric one-way ANOVA
8.2.5 Alternative tests
8.2.6 Post-hoc tests
8.3 Development of spatial modulations in Nicaraguan Sign Language: Independent factorial (two-way) ANOVA
8.3.1 The data and hypothesis
8.3.2 Descriptive statistics for different groups and interaction plot
8.3.3 Assumptions of parametric factorial ANOVA
8.3.4 ANOVA and orthogonal contrasts
8.3.5 Alternative tests
8.3.6 Post-hoc tests
8.4 Do native English and native Mandarin Chinese speakers conceptualize time differently? Repeated-measures and mixed-design ANOVA (mixed GLM method)
8.4.1 The data and hypothesis
8.4.2 Fitting a mixed-design ANOVA with the help of mixed GLM
8.4.3 Post-hoc tests
8.5 Summary
9. Measuring associations between two categorical variables
What you will learn from this chapter:
9.1 Testing independence
9.2 The story of over is not over: Metaphoric and non-metaphoric uses in two registers (analysis of a 2-by-2 contingency table)
9.2.1 The data and hypothesis
9.2.2 Visualizations, proportions and measures of effect size: Odds ratios, Cramér's V and the φ coefficient
9.2.3 Testing statistical significance: The χ²-test of independence
9.3 Metaphorical and non-metaphorical uses of see in four registers (analysis of a 4-by-2 table)
9.3.1 The data and hypothesis
9.3.2 Descriptive statistics and visualizations
9.3.3 Testing the statistical significance and analysing the residuals: The χ²-test and mosaic and association plots
9.4 Summary
10. Association measures
What you will learn from this chapter:
10.1 Measures of association: A brief typology
10.1.1 Frequencies that you will need in order to compute association measures
10.1.2 Unidirectional (asymmetric) vs. bidirectional (symmetric) measures
10.1.3 Contingency-based vs. non-contingency-based measures
10.2 Case study: The Russian ditransitive construction and its collexemes
10.2.1 Theoretical background and data
10.2.2 Computation of some popular association measures
10.3 Summary
11. Geographic variation of quite: Distinctive collexeme analysis
What you will learn from this chapter:
11.1 Introduction to distinctive collexeme analysis
11.2 Distinctive collexeme analysis of quite + ADJ in different varieties of English
11.2.1 Theoretical background and data
11.2.2 Simple distinctive collexeme analysis of quite + ADJ in British and American English
11.2.3 Multiple distinctive collexeme analysis: Quite + ADJ in the British, American and Canadian varieties of English
11.3 Summary
12. Probabilistic multifactorial grammar and lexicology
What you will learn from this chapter:
12.1 Introduction to logistic regression
12.2 Logistic regression model of Dutch causative auxiliaries doen and laten
12.2.1 Theoretical background and data
12.2.2 Fitting a binary logistic regression model: Main functions
12.2.3 Selection of variables
12.2.4 Testing possible interactions
12.2.5 Identifying outliers and overly influential observations
12.2.6 Checking the regression assumptions
12.2.7 Testing for overfitting
12.2.8 Interpretation of the model
12.3 Summary
13. Multinomial (polytomous) logistic regression models of three and more near synonyms
What you will learn from this chapter:
13.1 What is multinomial regression?
13.2 Multinomial models of English permissive constructions
13.2.1 Data and hypotheses
13.2.2 Contrasting allow and permit with let
13.2.3 'One vs. rest' approach
13.3 Summary
14. Conditional inference trees and random forests
What you will learn from this chapter:
14.1 Conditional inference trees and random forests
14.2 Conditional inference trees and random forests of three English causative constructions
14.2.1 The data and hypotheses
14.2.2 Fitting a conditional inference tree model
14.2.3 Random forests
14.3 Summary
15. Behavioural profiles, distance metrics and cluster analysis
What you will learn from this chapter:
15.1 What are Behavioural Profiles?
15.2 Behavioural Profiles of English analytic causatives
15.2.1 Data and theoretical background
15.2.2 Computation of numeric BP vectors from the categorical data
15.2.3 Distance matrix
15.2.4 Hierarchical cluster analysis
15.2.4.1 Identifying the clusters
15.2.4.2 Interpretation of the cluster solution: Snake plots and effect size measures
15.2.4.3 Validation of a cluster solution
15.2.5 Partitioning methods
15.2.5.1 General introduction
15.2.5.2 Partitioning around centroids (k-means)
15.2.5.3 Partitioning around medoids
15.3 Summary
16. Introduction to Semantic Vector Spaces
What you will learn from this chapter:
16.1 Distributional models of semantics and Semantic Vector Space models
16.2 A Semantic Vector Space model of English verbs of cooking
16.2.1 Theoretical background and data
16.2.2 Creating vectors of weighted co-occurrence frequencies
16.2.3 Cosine similarity
16.3 Summary
17. Language and space
What you will learn from this chapter:
17.1 Making maps with R
17.2 What is multidimensional scaling?
17.3 Computation and representation of geographical distances
17.4 Computation and representation of linguistic distances: The Kruskal non-metric MDS
17.4.1 Recoding the dataset
17.4.2 Computation of Gower distances
17.5 The Mantel test for distance matrices
17.6 Summary
18. Multidimensional analysis of register variation
What you will learn from this chapter:
18.1 Multidimensional analysis of register variation
18.2 Case study: Register variation in the British National Corpus
18.2.1 The data and research question
18.2.2 Principal Component Analysis
18.2.3 Factor Analysis
18.3 Summary
19. Exemplars, categories, prototypes
What you will learn from this chapter:
19.1 Register variation of Basic Colour Terms: Simple Correspondence Analysis
19.1.1 The data and hypothesis
19.1.2 Simple Correspondence Analysis
19.2 Visualization of exemplars and prototypes of lexical categories: Multiple Correspondence Analysis of Stuhl and Sessel
19.2.1 The data and theoretical background
19.2.2 Multiple Correspondence Analysis
19.3 Summary
20. Constructional change and motion charts
What you will learn from this chapter:
20.1 The past and present of the future: Diachronic motion charts of be going to and will
20.1.1 Theoretical background and data
20.1.2 Motion charts
20.2 Summary
Epilogue
Appendix 1. The most important R objects and basic operations with them
Appendix 2. Main plotting functions and graphical parameters in R
24.1 A scatterplot with text labels
References
Subject Index

Schweitzer Fachinformationen
