Mathematics for Digital Science 3

Name: Mathematics for Digital Science 3 | Data Analysis and Optimization
Brand: Wiley
Price: 130.99 EUR
Availability: OnlineOnly

Data Analysis and Optimization

Gérard-Michel Cochard Mhand Hifi(Author)

Wiley (Publisher)

1st Edition

Published on 19. June 2025

444 pages

E-Book

ePUB with Adobe-DRM

System requirements

978-1-394-38853-0 (ISBN)

€130.99incl. 7% vat

System requirements

for ePUB with Adobe-DRM

E-Book Single Licence

Available for download

Description

More details

Other editions

Persons

Content

Preface ix

Chapter 1. Linear Modeling for Two-Dimensional Data 1

1.1. Basic statistics 1

1.2. Linear adjustment 3

1.3. Linear correlation 7

Chapter 2. Multidimensional Data Analysis 13

2.1. Multidimensional tables 14

2.2. Analysis of a point cloud 24

2.2.1. Spatial analysis of individuals 24

2.2.2. Analysis in variable space 30

2.2.3. Link between the two spaces 32

2.3. Principal component factor analysis 35

2.3.1. Principles 35

2.3.2. Principal component factor analysis in practice 41

2.4. Appendix: useful software 48

2.5. Tanagra software 48

2.6. R software 56

Chapter 3. Introduction to Automatic Classification 65

3.1. Similarity and distance 66

3.1.1. Similarity, dissimilarity 66

3.1.2. Distance 67

3.2. Basics of information theory 69

3.3. Classification methods 78

3.4. Classification by partitioning 80

3.4.1. Partitioning principle 80

3.4.2. Inertia 81

3.4.3. Mobile center algorithm 82

3.5. Hierarchical classification 89

3.5.1. Principle 89

3.5.2. Aggregation strategies 90

3.5.3. Ward's aggregation strategy 98

3.6. Appendices 101

3.6.1. Huygens-König theorem 101

3.6.2. Classification software 103

Chapter 4. Linear Programming 109

4.1. An introductory example 110

4.2. General formulation 113

4.3. Geometry of a linear program 116

4.4. Simplex algorithm 118

4.5. Two-phase method. 129

4.6. Duality 135

4.7. Degeneracy 139

4.8. Introduction to integer linear programming 140

4.8.1. Gomory cuts 140

4.8.2. Branch and Bound algorithm 151

Chapter 5. Elements of Graph Theory 159

5.1. Definition and representations of a graph 160

5.1.1. Graphical representation 160

5.1.2. Tables associated with a graph 161

5.2. Main concepts and technology 162

5.2.1. Elements of a graph 163

5.2.2. Analysis of graph structure 168

5.3. Transitive closure of a graph's vertices 172

5.4. Decomposition of a directed graph into strongly connected components 173

5.5. Trees 175

5.5.1. Introduction 175

5.5.2. Trees in a graph 175

5.5.3. Spanning tree in a graph 177

5.6. Finding a fundamental system of independent cycles of a connected graph 179

5.7. Extremum spanning tree 180

5.7.1. General information 180

5.7.2. Extremum spanning tree search algorithms 180

Chapter 6. Path Optimization 187

6.1. Extremal-length paths 188

6.1.1. General information 188

6.1.2. Solving algorithms 189

6.1.3. Searching for e -extremal paths 204

6.2. Hamiltonian path search 205

6.2.1. Decomposition into strongly connected components 205

6.2.2. Branch and Bound algorithm 207

Chapter 7. Transportation Problems 219

7.1. Maximum flow 220

7.1.1. General 220

7.1.2. Ford-Fulkerson theorem 223

7.1.3. Ford-Fulkerson algorithm 224

7.1.4. Practical solution method 230

7.2. Least cost transportation 236

7.2.1. Issues 236

7.2.2. Northwest Corner algorithm 238

7.2.3. Least cost method 240

7.2.4. Balas-Hammer or Vogel's algorithm 243

7.2.5. Stepping stone algorithm 246

7.2.6. Inequality of supply and demand 256

7.2.7. Unconnected graph and degeneration 257

7.2.8. Potentials and dual program 261

7.3. Assignment problems 263

7.3.1. Issues 263

7.3.2. Hungarian algorithm 269

Chapter 8. Scheduling Problems 279

8.1. Planning a project 280

8.1.1. Introduction 280

8.1.2. Earliest and latest date method 283

8.1.3. Gantt chart 289

8.2. Flow-shop problem 291

8.2.1. Johnson's algorithm for two machines 293

8.2.2. Case of three machines 297

8.3. The job-shop problem 306

References 311

List of Authors 313

Index 315

1
Linear Modeling for Two-Dimensional Data

CONCEPTS COVERED IN THIS CHAPTER. -

This brief chapter serves as a reminder of the concepts presented in detail in Volume 1. It primarily provides an overview of basic statistical analysis tools, particularly linear regression and correlation for two-dimensional data.

References: [SAP 11].

1.1. Basic statistics

Consider a population of n elements. Each element i is characterized by the value of a variable x = xi. The n values xi constitute a one-dimensional statistical series, whose characteristics are:

The average is defined by:

In this definition, it is assumed that all elements have the same statistical weight If the weights are not equal, the following expression is used:

where pi represents the statistical weight of individual i.

Variance v(x) is defined as the average of the squares of the deviations from the average:

Huygens' theorem provides another method for calculating variance:

This relationship is often summarized as "the average of squares minus the square of the mean".

Standard deviation s(x) is defined as the square root of the variance:

EXAMPLE 1.1.-

Consider the statistical series shown in Figure 1.1, which represents the number of rainy days over 10 consecutive years at a given location.

Figure 1.1. Statistical series

The average can be easily calculated by assigning equal statistical weight to each measurement. The average of the squares the variance v(x) = 1284 and the standard deviation s(x) = 35,83 are also determined.

Figure 1.2 shows the graphical representation of the statistical series in the form of a histogram. This histogram illustrates the distribution of data regarding the number of rainy days over the 10 years.

The average is a measure of the position of the statistical series along the number of days axis, while the standard deviation serves as a dispersion parameter, providing an indicator of the spread of the statistical series.

Figure 1.2. Graphical representation of the statistical series

1.2. Linear adjustment

Now, consider a two-dimensional statistical series, where each element is characterized by the values of two variables, x and y. For each variable, various statistical measures can be calculated, such as the average, variance and standard deviation.

To graphically represent this two-dimensional series, a two-dimensional Cartesian coordinate system is used. The x-axis represents the variable x. and the y-axis represents the variable y. Each element i of the series is represented as a point (xi,yi) in this coordinate system, where the coordinate xi corresponds to the value of the variable x, and the coordinate yi corresponds to the value of the variable y. Figure 1.3 shows examples of graphical representations of two two-dimensional series.

Figure 1.3. Example of two two-dimensional series

When observing a two-dimensional series and detecting a certain structure in the set of representative points, we may be inclined to model this structure using a curve. This involves finding a mathematical function that best describes the relationships between the variables x and y. In the examples shown in Figure 1.3, a straight line can be proposed for modeling the first example, and a parabola for the second example, as shown in Figure 1.4. These models are adjustments that simplify the representation of trends or relationships observed in the data.

Figure 1.4. Examples of adjustments

The linear adjustment is the simplest of all analytical adjustments. It involves obtaining the equation of the straight line that "best fit" the set of representative points of the series.

A classic method for obtaining the equation of the line in linear adjustment is the least squares method. This method involves minimizing the sum of the squares of the deviations between the observed values and the values predicted by the line. For the variables x and y, the respective means, denoted by and are calculated assuming equal statistical weight for each value of i:

Next, the deviations from these averages for each point in the series are calculated (convenient to work with "centered" coordinates):

It is easy to verify that:

The squares of these deviations are obtained by squaring these values:

The least squares method involves finding the coefficients a and b of the equation of the line y = ax + b. Alternatively, using the centered coordinates, the equation becomes Y´ = AX + B, where for each of the representative points, The relationship between (A, B) and (a,b) is:

The goal is to optimize the sum of squared deviations to a minimum. Mathematically, this involves minimizing the following objective function:

In other words, the aim is to minimize the following quantity:

The minimum of M corresponds to the cancellation of the first derivatives with respect to A and B, the only unknowns in M. Taking the partial derivatives:

which leads to:

These conditions lead to the following equations:

EXAMPLE 1.2.-

Let us consider the statistical series showing the number of rainy days (x) and umbrella sales in local currency (y) (see Figure 1.5).

Figure 1.5. Statistical series (x, y)

Figure 1.6. Detailed adjustment calculations

Figure 1.7. Adjustment line.

Figure 1.6 summarizes the calculations required to determine the best-fit adjustment line, with the values of a = 1311.53 and b = 8831.78. Figure 1.7 displays the best-fit adjustment line.

1.3. Linear correlation

Figure 1.8. Different correlation situations

In the case of adjustment, the goal is to express y as a function of x. This choice is arbitrary, as x could be expressed as a function of y. In this case, two adjustment lines would be obtained, both intersecting at the point

By treating the variables x and y symmetrically, the concept of correlation between these variables can be introduced. Correlation measures the relationship between two variables and quantifies the possible influence of one on the other. Figure 1.8 presents various examples of scatter plots to illustrate different correlation situations.

In particular, in the case of linear correlation, it is interesting to note that when the two best-fit adjustment lines, y = f(x) and x = f´(y), coincide, this indicates maximum linear correlation between the variables x and y.

EXAMPLE 1.3.-

For the series in Example 1.2, the following two best-fit adjustment lines are obtained:

Figure 1.9 shows that the two straight lines are very close to each other, indicating a strong correlation between the variables.

Figure 1.9. Adjustment lines.

The two best-fit adjustment lines have direction coefficients a and a´. If the lines coincide, then equivalently, a × a´ = 1. Now,

The maximum correlation corresponds to the following equality (known as the Cauchy-Schwarz equality):

The analytical definition of the linear correlation is:

which is simply

EXAMPLE 1.3 (CONTINUED).-

Let us return to Example 1.3. The equations of the adjustment lines are:

The linear correlation coefficient is close to 1, i.e. r = 0.98 ~ 1. This indicates an almost maximal linear correlation between the variables x and y. In this case, a strong relationship exists between x and y.

The linear correlation coefficient r is often written in another form, using the standard deviations s(x) and s(y):

Furthermore, the covariance cov(x, y) is defined by:

It follows that:

In the case of linear fitting, the expression for M is:

The minimum is found by replacing A and B with the values obtained:

By definition, M and therefore Mmin are positive or zero quantities. This leads to the Cauchy-Schwarz inequality:

Figure 1.10. Variations in the linear correlation coefficient

This inequality implies that the linear correlation coefficient lies in the range -1 = r = 1. This means that the linear correlation coefficient can take values between -1 and 1, inclusive. Figure 1.10 shows such a correlation scale, where different ranges of r values are...

System requirements

Save as PDF Copy link into clipboard

Schweitzer Fachinformationen

Mathematics for Digital Science 3

Description

More details

Other editions

Additional editions

Persons

Content

1
Linear Modeling for Two-Dimensional Data

CONCEPTS COVERED IN THIS CHAPTER. -

1.1. Basic statistics

EXAMPLE 1.1.-

1.2. Linear adjustment

EXAMPLE 1.2.-

1.3. Linear correlation

EXAMPLE 1.3.-

EXAMPLE 1.3 (CONTINUED).-

System requirements

Schweitzer Fachinformationen

Mathematics for Digital Science 3

Description

More details

Other editions

Additional editions

Persons

Content

1 Linear Modeling for Two-Dimensional Data

CONCEPTS COVERED IN THIS CHAPTER. -

1.1. Basic statistics

EXAMPLE 1.1.-

1.2. Linear adjustment

EXAMPLE 1.2.-

1.3. Linear correlation

EXAMPLE 1.3.-

EXAMPLE 1.3 (CONTINUED).-

System requirements

1
Linear Modeling for Two-Dimensional Data