
Statistics with JMP
Description
Reviews / Votes
"For teachers of applied statistics, this book provides a rich resource of course material, examples and applications." (Zentralblatt MATH, 1 June 2015)
Persons
Peter Goos and David Meintrup, Department of Mathematics, Statistics and Actuarial Sciences of the Faculty of Applied Economics of the University of Antwerp, Belgium.
Content
Preface
Acknowledgments
1 What is statistics?
1.1 Why statistics?
1.2 Definition of statistics
1.3 Examples
1.4 The subject of statistics
1.5 Probability
1.6 Software
2 Data and its representation
2.1 Types of data and measurement scales
2.1.1 Categorical or qualitative variables
2.1.2 Quantitative variables
2.1.3 Hierarchy of scales
2.1.4 Measurement scales in JMP
2.2 The data matrix
2.3 Representing univariate qualitative variables
2.4 Representing univariate quantitative variables
2.4.1 Stem and leaf diagram
2.4.2 Needle charts for univariate discrete quantitative variables
2.4.3 Histograms and frequency polygons for continuous variables
2.4.4 Empirical cumulative distribution functions
2.5 Representing bivariate data
2.5.1 Qualitative variables
2.5.2 Quantitative variables
2.6 Representing time series
2.7 The use of maps
2.8 More graphical capabilities
3 Descriptive statistics of sample data
3.1 Measures of central tendency or location
3.1.1 Median
3.1.2 Mode
3.1.3 Arithmetic mean
3.1.4 Geometric mean
3.2 Measures of relative location
3.2.1 Order statistics, quantiles, percentiles, deciles
3.2.2 Quartiles
3.3 Measures of variation or spread
3.3.1 Range
3.3.2 Interquartile range
3.3.3 Mean absolute deviation
3.3.4 Variance
3.3.5 Standard deviation
3.3.6 Coefficient of variation
3.3.7 Dispersion indices for nominal and ordinal variables
3.4 Measures of skewness
3.5 Kurtosis
3.6 Transformation and standardization of data
3.7 Box plots
3.8 Variability charts
3.9 Bivariate data
3.9.1 Covariance
3.9.2 Correlation
3.9.3 Rank correlation
3.10 Complementarity of statistics and graphics
3.11 Descriptive statistics using JMP
4 Probability
4.1 Random experiments
4.2 Definition of probability
4.3 Calculation rules
4.4 Conditional probability
4.5 Independent and dependent events
4.6 Total probability and Bayes' rule
4.7 Simulating random experiments
5 Additional aspects of probability theory
5.1 Combinatorics
5.1.1 Addition rule
5.1.2 Multiplication principle
5.1.3 Permutations
5.1.4 Combinations
5.2 Number of possible orders
5.2.1 Two different objects
5.2.2 More than two different objects
5.3 Applications of probability theory
5.3.1 Sequences of independent random experiments
5.3.2 Euromillions
6 Univariate random variables
6.1 Random variables and distribution functions
6.2 Discrete random variables and probability distributions
6.3 Continuous random variables and probability densities
6.4 Functions of random variables
6.4.1 Functions of one discrete random variable
6.4.2 Functions of one continuous random variable
6.5 Families of probability distributions and probability densities
6.6 Simulation of random variables
7 Statistics of populations and processes
7.1 Expected value of a random variable
7.2 Expected value of a function of a random variable
7.3 Special cases
7.4 Variance and standard deviation of a random variable
7.5 Other statistics
7.6 Moment generating functions
8 Important discrete probability distributions
8.1 The uniform distribution
8.2 The Bernoulli distribution
8.3 The binomial distribution
8.3.1 Probability distribution
8.3.2 Expected value and variance
8.4 The hypergeometric distribution
8.5 The Poisson distribution
8.6 The geometric distribution
8.7 The negative binomial distribution
8.8 Probability distributions in JMP
8.8.1 Tables with probability distributions and cumulative distribution functions
8.8.2 Graphical representations
8.9 The simulation of discrete random variables with JMP
9 Important continuous probability densities
9.1 The continuous uniform density
9.2 The exponential density
9.2.1 Definition and statistics
9.2.2 Some interesting properties
9.3 The gamma density
9.4 The Weibull density
9.5 The beta density
9.6 Other densities
9.7 Graphical representations and probability calculations in JMP
9.8 Simulating continuous random variables in JMP
10 The normal distribution
10.1 The normal density
10.2 Calculation of probabilities for normally distributed variables
10.2.1 The standard normal distribution
10.2.2 General normally distributed variables
10.2.3 JMP
10.2.4 Examples
10.3 Lognormal probability density
11 Multivariate random variables
11.1 Introductory notions
11.2 Joint (discrete) probability distributions
11.3 Marginal or unconditional (discrete) probability distribution
11.4 Conditional (discrete) probability distribution
11.5 Examples of discrete bivariate random variables
11.6 The multinomial probability distribution
11.7 Joint (continuous) probability density
11.8 Marginal or unconditional (continuous) probability density
11.9 Conditional (continuous) probability density
12 Functions of several random variables
12.1 Functions of several random variables
12.2 Expected value of functions of several random variables
12.3 Conditional expected values
12.4 Probability distributions of functions of random variables
12.4.1 Discrete random variables
12.4.2 Continuous random variables
12.5 Functions of independent Poisson, normally, and lognormally distributed random variables
13 Covariance, correlation, and variance of linear functions
13.1 Covariance and correlation
13.2 Variance of linear functions of two random variables
13.3 Variance of linear functions of several random variables
13.4 Variance of linear functions of independent random variables
13.4.1 Two independent random variables
13.4.2 Several pairwise independent random variables
13.5 Linear functions of normally distributed random variables
13.6 Bivariate and multivariate normal density
13.6.1 Bivariate normal probability density
13.6.2 Graphical representations
13.6.3 Independence, marginal, and conditional densities
13.6.4 General multivariate normal density
14 The central limit theorem
14.1 Probability density of the sample mean from a normally distributed population
14.2 Probability distribution and density of the sample mean from a non-normally distributed population
14.2.1 Central limit theorem
14.2.2 Illustration of the central limit theorem
14.3 Applications
14.4 Normal approximation of the binomial distribution
Appendix A The Greek alphabet
Appendix B Binomial distribution
Appendix C Poisson distribution
Appendix D Exponential distribution
Appendix E Standard normal distribution
Index
Chapter 1
What is statistics?
The world is ready for the truth; the modern age is here; every year another report appears that examines poverty by means of statistical research rather than romantic claptrap.
(from The Crimson Petal and the White, Michel Faber, p. 334)
In this introductory chapter, we give a general description of the topics of statistics and probability theory. Some examples illustrate the purpose and applications of both disciplines, as well as the differences between them. As statistics has more applications in science, industry, and economics than probability theory, statistics is typically given far more attention in degree subjects like business, industrial and bio-science engineering, applied economics, and natural or social sciences. Nevertheless, one should pay some attention to probability theory as well. In fact, both disciplines are strongly connected to each other: it is impossible to understand the working of statistical inference without a sound knowledge of probability theory. Therefore, in this book, we discuss both probability theory and statistics.
1.1 Why statistics?
For many years, statistics has been a subject, often a dreaded one, in several fields of study at universities and colleges. The reason is that quite a few people will, sooner or later, be confronted with problems of data analysis during their professional activities. A sound statistical background not only allows us to analyze the data and to make concrete decisions based on the analysis, but it also provides an advantage in the data collection process.
Nevertheless, statistics is not immediately perceived as useful by most students. This is mainly because, during a statistics course, they are still unfamiliar with the sorts of practical decision problems managers, economists, engineers, and researchers face on a daily basis. Many students will start realizing the usefulness of statistics when they start to work on their bachelor's or master's thesis. The many examples in this basic course are intended to advance this awareness by several years.
In an introductory statistics course, one often finds a whole series of quotes as an attempt to motivate students. A classic example is "Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." from the British writer Herbert George Wells (1866-1946). More recent is the judgment by the US quality guru W. Edwards Deming, to whom a large part of the downright spectacular economic recovery in Japan after World War II is attributed. He claimed that "Statistics is too important to be left to statisticians. The goal is to have many statistically-skilled workers: engineers, scientists, managers..." Hal Varian, chief economist at Google, says the following: "I keep saying that the most sexy job in the next 10 years will be statistician. And I'm not kidding." In Europe, Willy Buysse, former CEO at SN Brussels Airlines, states that too few decisions are made based on data. Only recently have his many years of diligence in establishing a research department, where statistical and other quantitative methods are used to address all sorts of problems, been rewarded.
Another justification for a thorough training in statistical methods can be found in the so-called Six Sigma improvement program. The purpose of this program is to solve concrete problems with a large financial impact both in service and industrial companies, and to reduce the number of faults and defects to 3.4 per million opportunities. The approach is based on statistical methods, as presented in Figure 1.1. The figure shows that the traditional method to solve a practical problem is to immediately search for practical solutions. This approach is typically based on guessing and trial-and-error, so that it will often take a long time to find a final solution to the problem. The Six Sigma improvement program promotes a more thoughtful, scientific approach to problems. First, data is collected in the so-called measurement phase. Then, using statistical methods, the data is carefully examined. This often leads to interesting insights and recommendations to improve existing products, services, or processes. The Six Sigma approach also relies on the use of statistical process control and statistically designed experiments. Hence, statistics helps to find the best possible solution for all kinds of practical problems.
Figure 1.1 Using statistical methods to solve problems.
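The Six Sigma target of 3.4 faults and defects per million opportunities can be made concrete with a small calculation. The following Python sketch computes defects per million opportunities (DPMO); the function name `dpmo` and the counts used are hypothetical illustrations, not taken from the book.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Return defects per million opportunities (DPMO)."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    return defects / total_opportunities * 1_000_000


# Hypothetical counts: 17 defects found in 50,000 units,
# each unit offering 100 opportunities for a defect.
rate = dpmo(defects=17, units=50_000, opportunities_per_unit=100)
print(f"DPMO: {rate:.1f}")  # approximately 3.4
```

With these assumed counts, the process would sit right at the Six Sigma target; any larger defect count for the same volume would exceed it.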
To achieve successful cooperation between practitioners, on the one hand, and statisticians, on the other, some openness is required on both sides. Engineers, economists, or scientists need a solid knowledge of the basic principles and techniques of statistics. Statistics is thus an indispensable skill in the repertoire of an effective employee. This explains why statistics is taught not only in the first and second years of many bachelor's degrees in engineering, sciences, and economics, but also later, for example in master's programs.
Finally, a thorough training in statistics is also a prerequisite for students of political and social sciences. They will also be confronted with numerous data sets in their professional careers that are impossible to interpret without a statistical background. For them, statistics is a stepping stone to econometric research methods.
1.2 Definition of statistics
The word statistics may sound familiar to anyone. A statistic usually refers to numerical information, for example, information about
- the population of a country: birth and death rates, immigration and emigration, ... (such statistics are called population statistics),
- the economy: employment and unemployment rates, investments, prices, gross national products (GNP), ... (these statistics are called economic statistics), or
- a company or sector: sales figures, income statements, growth, acquisitions, layoffs, ... (these figures are called business statistics).
More formally, statistics can be defined as the set of methodologies for collecting, representing, analyzing, and interpreting data. This shows that the statistical science is a very general auxiliary science, which plays an important role in almost any environment. Applications of statistics are countless in engineering, medicine, economics, natural sciences, and business management, but statistics is also used in literature, history, political science, criminology, and even musicology.
In our modern society, data is massively present:
- computer files in companies contain sales data, cost data, and customer data (such as addresses, ordered quantities, and order frequencies),
- the financial pages of newspapers contain stock prices, commodity prices, and exchange rates,
- federal and regional authorities regularly publish data on population, trade, and industry, and
- the Internet is a source of numerous data sets.
Companies collect data naturally and actively. Among other things, this takes place by carrying out experiments (e.g., to design new products), in the context of statistical process control, or by measuring all kinds of properties of products, services, and processes. By continuously analyzing data, quality departments of companies attempt to deliver products or services with as few defects as possible and with the highest reliability. In addition, business processes are organized in such a way that waste is minimized, inspections of finished products are reduced to the minimum, and customer requirements are satisfied with minimal costs.
Research agencies collect data via surveys by phone, by post, via the Internet or by street interviews. Such surveys are designed to gather information about the shopping behavior of consumers, about the voting behavior of the population, or public opinion on social issues.
Statistics allows us to turn data into usable information. The role that statistics plays here is best illustrated by some examples.
1.3 Examples
Example 1.3.1
An airline conducted a study on the behavior of its passengers on intercontinental flights and recorded
- the number of passengers with reservations that do not show up (the so-called no-shows),
- the weight of the luggage of passengers (often there is a limit of 20 kilograms), and
- the time the passengers arrive before the official departure time of the flight (for intercontinental flights, the passengers are asked to be at the airport at least two hours prior to departure).
The company recorded this data over several months and then made a distinction between passengers in economy class and passengers in business class. The data is analyzed with the aim of instituting appropriate policies. An example may be to allow overbooking, that is, to take more reservations than there are seats on the plane, or to apply more stringent action against passengers carrying too much luggage.
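To see how no-show data could inform an overbooking policy, here is a minimal Python sketch. It assumes, purely for illustration, that each passenger with a reservation shows up independently with the same probability, so that the number of show-ups follows a binomial distribution. The function name and all numbers are hypothetical; real passenger behavior need not satisfy these assumptions.

```python
from math import comb


def prob_overbooked(seats: int, reservations: int, p_show: float) -> float:
    """Probability that more passengers show up than there are seats,
    assuming independent show-ups with common probability p_show."""
    if reservations <= seats:
        return 0.0  # without overbooking, everyone who shows up gets a seat
    return sum(
        comb(reservations, k) * p_show**k * (1 - p_show) ** (reservations - k)
        for k in range(seats + 1, reservations + 1)
    )


# Hypothetical flight: 200 seats and an assumed show-up probability of 0.92.
for reservations in (200, 205, 210, 215):
    risk = prob_overbooked(seats=200, reservations=reservations, p_show=0.92)
    print(f"{reservations} reservations -> P(too many show-ups) = {risk:.4f}")
```

Under these assumptions, the airline could pick the largest number of reservations for which the risk stays below a level it is willing to accept.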
Example 1.3.2
In the production of coffee, the humidity during production is of crucial importance for the quality of the final product. The humidity is kept under control by a system that does not work flawlessly. Therefore, several measurements of the humidity are taken daily to determine whether it remains within appropriate limits. This approach is referred to as statistical process control.
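The idea of checking whether measurements stay within appropriate limits can be sketched in a few lines of Python. The example below is a simplified, Shewhart-style check: it derives limits as the mean plus or minus three standard deviations of a reference sample and flags new measurements outside them. The humidity values, the three-sigma rule, and the function names are assumptions made for illustration; practical control charts often work with subgroup means or moving ranges instead.

```python
from statistics import mean, stdev


def control_limits(reference, k=3.0):
    """Return (lower, upper) limits as mean -/+ k sample standard deviations."""
    center = mean(reference)
    spread = stdev(reference)
    return center - k * spread, center + k * spread


def out_of_control(measurements, lower, upper):
    """Return (index, value) pairs for measurements outside [lower, upper]."""
    return [(i, x) for i, x in enumerate(measurements) if not lower <= x <= upper]


# Hypothetical humidity readings (in percent) from a period judged to be stable.
reference = [11.8, 12.1, 12.0, 11.9, 12.2, 12.0, 11.9, 12.1]
lower, upper = control_limits(reference)

# Hypothetical readings taken today; the third one is clearly too high.
today = [12.0, 12.1, 13.4, 11.9]
flagged = out_of_control(today, lower, upper)
print(f"limits: [{lower:.2f}, {upper:.2f}], flagged: {flagged}")
```

A flagged reading would prompt a check of the humidity control system before further production.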
Example 1.3.3
A filling machine for bottles usually has several filling heads, so that many bottles can be...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.