Nonlinear Parameter Optimization Using R Tools


John C. Nash (author)

Wiley (publisher)

1st edition

Published 3 April 2014

304 pages

E-book

ePUB with Adobe DRM

PDF with Adobe DRM

978-1-118-88475-1 (ISBN)

€60.99 incl. 7% VAT

System requirements apply for ePUB with Adobe DRM and PDF with Adobe DRM

(Note: the file format and copy protection you want are selected only in the e-book provider's system.)

E-book single license

Available as download

Contents

Preface xv

1 Optimization problem tasks and how they arise 1

1.1 The general optimization problem 1

1.2 Why the general problem is generally uninteresting 2

1.3 (Non-)Linearity 4

1.4 Objective function properties 4

1.4.1 Sums of squares 4

1.4.2 Minimax approximation 5

1.4.3 Problems with multiple minima 5

1.4.4 Objectives that can only be imprecisely computed 5

1.5 Constraint types 5

1.6 Solving sets of equations 6

1.7 Conditions for optimality 7

1.8 Other classifications 7

References 8

2 Optimization algorithms - an overview 9

2.1 Methods that use the gradient 9

2.2 Newton-like methods 12

2.3 The promise of Newton's method 13

2.4 Caution: convergence versus termination 14

2.5 Difficulties with Newton's method 14

2.6 Least squares: Gauss-Newton methods 15

2.7 Quasi-Newton or variable metric method 17

2.8 Conjugate gradient and related methods 18

2.9 Other gradient methods 19

2.10 Derivative-free methods 19

2.10.1 Numerical approximation of gradients 19

2.10.2 Approximate and descend 19

2.10.3 Heuristic search 20

2.11 Stochastic methods 20

2.12 Constraint-based methods - mathematical programming 21

References 22

3 Software structure and interfaces 25

3.1 Perspective 25

3.2 Issues of choice 26

3.3 Software issues 27

3.4 Specifying the objective and constraints to the optimizer 28

3.5 Communicating exogenous data to problem definition functions 28

3.5.1 Use of "global" data and variables 31

3.6 Masked (temporarily fixed) optimization parameters 32

3.7 Dealing with inadmissible results 33

3.8 Providing derivatives for functions 34

3.9 Derivative approximations when there are constraints 36

3.10 Scaling of parameters and function 36

3.11 Normal ending of computations 36

3.12 Termination tests - abnormal ending 37

3.13 Output to monitor progress of calculations 37

3.14 Output of the optimization results 38

3.15 Controls for the optimizer 38

3.16 Default control settings 39

3.17 Measuring performance 39

3.18 The optimization interface 39

References 40

4 One-parameter root-finding problems 41

4.1 Roots 41

4.2 Equations in one variable 42

4.3 Some examples 42

4.3.1 Exponentially speaking 42

4.3.2 A normal concern 44

4.3.3 Little Polly Nomial 46

4.3.4 A hypothequial question 49

4.4 Approaches to solving 1D root-finding problems 51

4.5 What can go wrong? 52

4.6 Being a smart user of root-finding programs 54

4.7 Conclusions and extensions 54

References 55

5 One-parameter minimization problems 56

5.1 The optimize() function 56

5.2 Using a root-finder 57

5.3 But where is the minimum? 58

5.4 Ideas for 1D minimizers 59

5.5 The line-search subproblem 61

References 62

6 Nonlinear least squares 63

6.1 nls() from package stats 63

6.1.1 A simple example 63

6.1.2 Regression versus least squares 65

6.2 A more difficult case 65

6.3 The structure of the nls() solution 72

6.4 Concerns with nls() 73

6.4.1 Small residuals 74

6.4.2 Robustness - "singular gradient" woes 75

6.4.3 Bounds with nls() 77

6.5 Some ancillary tools for nonlinear least squares 79

6.5.1 Starting values and self-starting problems 79

6.5.2 Converting model expressions to sum-of-squares functions 80

6.5.3 Help for nonlinear regression 80

6.6 Minimizing R functions that compute sums of squares 81

6.7 Choosing an approach 82

6.8 Separable sums of squares problems 86

6.9 Strategies for nonlinear least squares 93

References 93

7 Nonlinear equations 95

7.1 Packages and methods for nonlinear equations 95

7.1.1 BB 96

7.1.2 nleqslv 96

7.1.3 Using nonlinear least squares 96

7.1.4 Using function minimization methods 96

7.2 A simple example to compare approaches 97

7.3 A statistical example 103

References 106

8 Function minimization tools in the base R system 108

8.1 optim() 108

8.2 nlm() 110

8.3 nlminb() 111

8.4 Using the base optimization tools 112

References 114

9 Add-in function minimization packages for R 115

9.1 Package optimx 115

9.1.1 Optimizers in optimx 116

9.1.2 Example use of optimx() 117

9.2 Some other function minimization packages 118

9.2.1 nloptr and nloptwrap 118

9.2.2 trust and trustOptim 119

9.3 Should we replace optim() routines? 121

References 122

10 Calculating and using derivatives 123

10.1 Why and how 123

10.2 Analytic derivatives - by hand 124

10.3 Analytic derivatives - tools 125

10.4 Examples of use of R tools for differentiation 125

10.5 Simple numerical derivatives 127

10.6 Improved numerical derivative approximations 128

10.6.1 The Richardson extrapolation 128

10.6.2 Complex-step derivative approximations 128

10.7 Strategy and tactics for derivatives 129

References 131

11 Bounds constraints 132

11.1 Single bound: use of a logarithmic transformation 132

11.2 Interval bounds: Use of a hyperbolic transformation 133

11.2.1 Example of the tanh transformation 134

11.2.2 A fly in the ointment 134

11.3 Setting the objective large when bounds are violated 135

11.4 An active set approach 136

11.5 Checking bounds 138

11.6 The importance of using bounds intelligently 138

11.6.1 Difficulties in applying bounds constraints 139

11.7 Post-solution information for bounded problems 139

Appendix 11.A Function transfinite 141

References 142

12 Using masks 143

12.1 An example 143

12.2 Specifying the objective 143

12.3 Masks for nonlinear least squares 147

12.4 Other approaches to masks 148

References 148

13 Handling general constraints 149

13.1 Equality constraints 149

13.1.1 Parameter elimination 151

13.1.2 Which parameter to eliminate? 153

13.1.3 Scaling and centering? 154

13.1.4 Nonlinear programming packages 154

13.1.5 Sequential application of an increasing penalty 156

13.2 Sumscale problems 158

13.2.1 Using a projection 162

13.3 Inequality constraints 163

13.4 A perspective on penalty function ideas 167

13.5 Assessment 167

References 168

14 Applications of mathematical programming 169

14.1 Statistical applications of math programming 169

14.2 R packages for math programming 170

14.3 Example problem: L1 regression 171

14.4 Example problem: minimax regression 177

14.5 Nonlinear quantile regression 179

14.6 Polynomial approximation 180

References 183

15 Global optimization and stochastic methods 185

15.1 Panorama of methods 185

15.2 R packages for global and stochastic optimization 186

15.3 An example problem 187

15.3.1 Method SANN from optim() 187

15.3.2 Package GenSA 188

15.3.3 Packages DEoptim and RcppDE 189

15.3.4 Package smco 191

15.3.5 Package soma 192

15.3.6 Package Rmalschains 193

15.3.7 Package rgenoud 193

15.3.8 Package GA 194

15.3.9 Package gaoptim 195

15.4 Multiple starting values 196

References 202

16 Scaling and reparameterization 203

16.1 Why scale or reparameterize? 203

16.2 Formalities of scaling and reparameterization 204

16.3 Hobbs' weed infestation example 205

16.4 The KKT conditions and scaling 210

16.5 Reparameterization of the weeds problem 214

16.6 Scale change across the parameter space 214

16.7 Robustness of methods to starting points 215

16.7.1 Robustness of optimization techniques 218

16.7.2 Robustness of nonlinear least squares methods 220

16.8 Strategies for scaling 222

References 223

17 Finding the right solution 224

17.1 Particular requirements 224

17.1.1 A few integer parameters 225

17.2 Starting values for iterative methods 225

17.3 KKT conditions 226

17.3.1 Unconstrained problems 226

17.3.2 Constrained problems 227

17.4 Search tests 228

References 229

18 Tuning and terminating methods 230

18.1 Timing and profiling 230

18.1.1 rbenchmark 231

18.1.2 microbenchmark 231

18.1.3 Calibrating our timings 232

18.2 Profiling 234

18.2.1 Trying possible improvements 235

18.3 More speedups of R computations 238

18.3.1 Byte-code compiled functions 238

18.3.2 Avoiding loops 238

18.3.3 Package upgrades - an example 239

18.3.4 Specializing codes 241

18.4 External language compiled functions 242

18.4.1 Building an R function using Fortran 244

18.4.2 Summary of Rayleigh quotient timings 246

18.5 Deciding when we are finished 247

18.5.1 Tests for things gone wrong 248

References 249

19 Linking R to external optimization tools 250

19.1 Mechanisms to link R to external software 251

19.1.1 R functions to call external (sub)programs 251

19.1.2 File and system call methods 251

19.1.3 Thin client methods 252

19.2 Prepackaged links to external optimization tools 252

19.2.1 NEOS 252

19.2.2 Automatic Differentiation Model Builder (ADMB) 252

19.2.3 NLopt 253

19.2.4 BUGS and related tools 253

19.3 Strategy for using external tools 253

References 254

20 Differential equation models 255

20.1 The model 255

20.2 Background 256

20.3 The likelihood function 258

20.4 A first try at minimization 258

20.5 Attempts with optimx 259

20.6 Using nonlinear least squares 260

20.7 Commentary 261

Reference 262

21 Miscellaneous nonlinear estimation tools for R 263

21.1 Maximum likelihood 263

21.2 Generalized nonlinear models 266

21.3 Systems of equations 268

21.4 Additional nonlinear least squares tools 268

21.5 Nonnegative least squares 270

21.6 Noisy objective functions 273

21.7 Moving forward 274

References 275

Appendix A R packages used in examples 276

Index 279

Chapter 1
Optimization problem tasks and how they arise

In this introductory chapter we look at the classes of problems for which we will discuss solution tools. We also consider the interrelationships between different problem classes as well as among the solution methods. The treatment is quite general, and R is only incidental to this chapter except for some examples. In effect, we are writing our list of things to do.

1.1 The general optimization problem

The general constrained optimization problem can be stated as follows.

Find
$$x = \operatorname{argmin} f(x)$$
such that
$$c(x) \ge 0$$

Note that $f$ is a scalar function but $x$ is a vector. There may or may not be constraints on the values of $x$, and these are expressed formally in the vector of functions $c$. While these functions are general, many problems have much simpler constraints, such as requirements that the values of $x$ be no less than some lower bounds or no greater than some upper bounds, as we shall discuss in the following text.

We have specified the problem as a minimization, but maximization problems can be transformed to minimizations by multiplying the objective function by $-1$.
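As a minimal sketch of this sign change, the following uses `optimize()` from base R on a made-up one-parameter function. The function `g` and the search interval are illustrative assumptions, not taken from the text.

```r
# Hypothetical function to maximize; by construction its maximum is at x = 2.
g <- function(x) -(x - 2)^2 + 3

# Turn the maximization into a minimization by negating the objective.
neg_g <- function(x) -g(x)
res <- optimize(neg_g, interval = c(-10, 10))  # assumed search interval

x_max <- res$minimum      # minimizer of -g, hence maximizer of g
g_max <- -res$objective   # undo the sign change to recover the maximum value
```

`optimize()` also accepts `maximum = TRUE`, which performs the same sign change internally.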

Note also that we have asked for the set of arguments x that minimize the objective, which essentially implies the global minimum. However, many—if not most—of the numerical methods in optimization are able to find only local minima and quite a few problems are such that there may be many local minima and possibly even more than one global minimum. That is, the global minimum may occur at more than one set of parameters x and may occur on a line or surface.
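The dependence on the starting point can be sketched with a made-up quartic that has two local minima, only one of which is global. The function and the two starting values are assumptions chosen for illustration; `optim()` with `method = "BFGS"` stands in for a typical local method.

```r
# Hypothetical objective with two local minima (near x = -1.3 and x = 1.1).
f <- function(x) x^4 - 3 * x^2 + x

# Run the same local minimizer from two different starting points.
from_left  <- optim(par = -1.5, fn = f, method = "BFGS")
from_right <- optim(par =  1.5, fn = f, method = "BFGS")

# Each run reports the minimum of the basin it started in.
rbind(left  = c(par = from_left$par,  value = from_left$value),
      right = c(par = from_right$par, value = from_right$value))
```

Both runs terminate normally, yet only one of them has found the global minimum, and nothing in the output of a single run reveals which.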

1.2 Why the general problem is generally uninteresting

While there do exist methods for tackling the general optimization problem, almost all the “real” work of optimization in problems related to statistics and modeling tends to be done by more specialized methods that work on problems that are restricted in some ways by the nature of the objective or the constraints (or lack thereof). Indeed, for a number of particular problems, there are very specialized packages expressly designed to solve them. Unfortunately, the user often has to work quite hard to decide if his or her problem actually matches the design considerations of the specialized package. Seemingly small changes (for example, a condition that parameters must be positive) can render the specialized package useless. On the other hand, a very general tool may be quite tedious for the user to apply, because objective functions and constraints may require a very large amount of program code in some cases.

In the real world, the objective function and the constraints are not only functions of $x$ but also depend on data; in fact, they may depend on vast arrays of data, particularly in statistical problems involving large systems.
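One common way to let an objective depend on data in R is to pass the data as extra named arguments, which `optim()` forwards to the objective function. The sketch below uses made-up data and a straight-line model; the names `line_ss`, `xdata`, and `ydata` are hypothetical.

```r
# Made-up data, for illustration only.
xdata <- 1:10
ydata <- c(5.1, 6.9, 9.2, 10.8, 13.1, 15.0, 16.8, 19.2, 20.9, 23.1)

# Objective: sum of squared deviations from the line par[1] + par[2] * x.
# The data arrive as arguments rather than as global variables.
line_ss <- function(par, x, y) {
  fitted <- par[1] + par[2] * x
  sum((y - fitted)^2)
}

# optim() passes the extra named arguments x and y on to line_ss().
res <- optim(par = c(0, 0), fn = line_ss, x = xdata, y = ydata,
             method = "BFGS")
res$par
```

Because this particular model is linear in its parameters, the result can be checked against `coef(lm(ydata ~ xdata))`.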

To illustrate, consider the following examples, which, while “small,” illustrate some of the issues we will encounter.

Cobb–Douglas example

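The Cobb–Douglas production function (Nash and Walker-Smith, 1987, p. 375) predicts the quantity of production $\mathit{prodn}$ of a commodity as a function of the inputs of capital $\mathit{kapital}$ (it appears traditional to use a K for this variable) and labour $\mathit{labour}$ used, namely,

$$\mathit{prodn} = b_1 \, \mathit{kapital}^{b_2} \, \mathit{labour}^{b_3} \tag{1.1}$$

A traditional approach to this problem is to take logarithms to get

$$\log(\mathit{prodn}) = \log(b_1) + b_2 \log(\mathit{kapital}) + b_3 \log(\mathit{labour}) \tag{1.2}$$

However, the two forms imply very different ways in which errors are assumed to exist between the model and real-world data. Let us assume (almost certainly dangerously) that data for $\mathit{kapital}$ and $\mathit{labour}$ are known precisely, but there may be errors in the data for $\mathit{prodn}$. Let us use the name $\mathit{Dprodn}$. In particular, if we use additive errors of the form

$$\mathit{errors} = \mathit{data} - \mathit{model} \tag{1.3}$$

then we have

$$\log(\mathit{Dprodn}) = \log(b_1) + b_2 \log(\mathit{kapital}) + b_3 \log(\mathit{labour}) + \mathit{errorsL} \tag{1.4}$$

where we have given these errors a particular name $\mathit{errorsL}$. This means that the errors are actually multiplicative in the real scale of the data.

$$\mathit{Dprodn} = b_1 \, \mathit{kapital}^{b_2} \, \mathit{labour}^{b_3} \exp(\mathit{errorsL}) \tag{1.5}$$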

If we estimate the model using the log form, we can sometimes get quite different estimates of the parameters than using the direct form. The “errors” have different weights in the different scales, and this alters the estimates. If we really believe that the errors are distributed around the direct model with constant variance, then we should not be using the log form, because it implies that the relative errors are distributed with constant variance.
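The difference between the two estimation routes can be sketched on synthetic data. Everything below is an assumption made for illustration: the variable names, the "true" parameter values, the noise level, and the usual Cobb–Douglas form of a multiplier `b1` times the inputs raised to powers `b2` and `b3`. The log form is fitted with `lm()` and the direct form with `nls()`, started from the log-form estimates.

```r
set.seed(42)                       # synthetic data, for illustration only
n <- 50
kapital <- runif(n, 1, 10)
labour  <- runif(n, 1, 10)
# Assumed "true" parameters, with additive errors on the direct scale.
prodn <- 2 * kapital^0.3 * labour^0.6 + rnorm(n, sd = 0.3)

# Log form: linear in log(b1), b2, b3, so ordinary linear regression applies.
fit_log <- lm(log(prodn) ~ log(kapital) + log(labour))
b_log <- c(b1 = exp(coef(fit_log)[[1]]),
           b2 = coef(fit_log)[[2]],
           b3 = coef(fit_log)[[3]])

# Direct form: nonlinear least squares, started from the log-form estimates.
fit_direct <- nls(prodn ~ b1 * kapital^b2 * labour^b3, start = as.list(b_log))
b_direct <- coef(fit_direct)

# Sum of squared errors on the original (direct) scale for both estimates.
ss_direct_scale <- function(b) {
  sum((prodn - b[[1]] * kapital^b[[2]] * labour^b[[3]])^2)
}
rbind(log_form = b_log, direct_form = b_direct)
c(log_form = ss_direct_scale(b_log), direct_form = ss_direct_scale(b_direct))
```

The two sets of estimates generally differ, and the direct-form fit cannot have a larger sum of squares on the original scale, since that is the quantity it minimizes.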

Hobbs' weed infestation example

This problem is also a nonlinear least squares problem. As we shall see later, it demonstrates a number of computational issues. The problem came across my desk sometime in 1974 when I was working on the development of a program to solve nonlinear least squares estimation problems. I had written several variants of Gauss–Newton methods in BASIC for a Data General NOVA system. This early minicomputer offered a very limited environment of a 10 character per second teletype with paper tape reader and punch that allowed access to a maximum 8K byte (actually 4K word) segment of the machine. Arithmetic was particularly horrible in that floating point used six hexadecimal digits in the mantissa with no guard digit.

The problem was supplied by Mr. Dave Hobbs of Agriculture Canada. As I was told, the observations ($y$) are weed densities per unit area over 12 growing periods. I was never given the actual units of the observations. Here are the data (Figure 1.1).

# draw the data
y <- c(5.308, 7.24, 9.638, 12.866, 17.069, 23.192, 31.443,
       38.558, 50.156, 62.948, 75.995, 91.972)
t <- 1:12
plot(t, y)
title(main = "Hobbs' weed infestation data", font.main = 4)

Figure 1.1

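It was suggested that the appropriate model was a 3-parameter logistic, that is,

$$y \approx \frac{b_1}{1 + b_2 \exp(-b_3 t)} \tag{1.6}$$

where $y$ is the weed density, $t$ is the growing period, and $b_1$, $b_2$, and $b_3$ are the parameters to be estimated. We shall see later that there are other forms for the model that may give better computational properties.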
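A minimal sketch of this estimation problem in R follows. It assumes the common three-parameter logistic form `b1 / (1 + b2 * exp(-b3 * t))`; the helper names `hobbs_res` and `hobbs_ss` and the starting values are hypothetical choices for illustration, and other starting values may cause `nls()` to fail.

```r
y <- c(5.308, 7.24, 9.638, 12.866, 17.069, 23.192, 31.443,
       38.558, 50.156, 62.948, 75.995, 91.972)
t <- 1:12

# Residuals (data minus model) for parameters b = (b1, b2, b3),
# under the assumed logistic form.
hobbs_res <- function(b) y - b[1] / (1 + b[2] * exp(-b[3] * t))

# Sum of squared residuals: the objective to be minimized.
hobbs_ss <- function(b) sum(hobbs_res(b)^2)

start <- c(b1 = 200, b2 = 50, b3 = 0.3)   # assumed starting values
hobbs_ss(start)

# Fit the same model with nls() from package stats.
fit <- nls(y ~ b1 / (1 + b2 * exp(-b3 * t)), start = as.list(start))
coef(fit)
deviance(fit)   # residual sum of squares at the nls() solution
```

Writing the objective as an explicit R function, as in `hobbs_ss`, also makes it usable with general minimizers such as `optim()`.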

1.3 (Non-)Linearity

What do we mean by “nonlinear?” The word clearly implies “not a straight line,” and many researchers take this to apply to the model they are trying to estimate. However, for the process of estimation, which generally involves minimizing a loss function such as a sum of squared deviations or maximizing a likelihood function, the key issue is that of solving a set of equations to find the result.

When we minimize the sum of squares for a model that is linear in the parameters, such as the log form of the Cobb–Douglas function (1.2) above, where the logarithm of the multiplier and the two exponents appear only to the first power, we can apply standard calculus to arrive at the normal equations. These are a set of linear equations. However, when we want to minimize the sum of squares from the original model (1.1), it is generally necessary to use an iterative method from some starting set of the three parameters.

For the purposes of this book, “nonlinear” will refer to the process of finding a solution and will imply that there is no method that finds a solution via a predetermined set of solutions of linear equations. That is, while we use a lot of linear algebra in finding solutions to the problems of interest in this book, we cannot, in advance, specify how many such subproblems are needed.
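For contrast, the linear case can be sketched as a single linear solve. The data below are made up; the point is only that the normal equations give the least squares estimates in one step, with no starting values and no iteration.

```r
# Made-up data for a model that is linear in its parameters.
x1 <- c(1.2, 2.3, 3.1, 4.8, 5.5, 6.9, 7.4, 8.8)
x2 <- c(0.5, 1.9, 1.1, 3.2, 2.8, 4.1, 3.9, 5.6)
yy <- c(3.1, 5.8, 5.9, 10.2, 10.4, 13.9, 14.1, 18.0)

X <- cbind(1, x1, x2)   # design matrix with an intercept column

# Normal equations: (X'X) beta = X'y, solved once.
beta_ne <- solve(crossprod(X), crossprod(X, yy))

# The same estimates from lm(), which works via a QR decomposition.
beta_lm <- coef(lm(yy ~ x1 + x2))

cbind(normal_equations = as.vector(beta_ne), lm = unname(beta_lm))
```

Forming `X'X` explicitly can lose accuracy when the columns of `X` are nearly collinear, which is one reason `lm()` uses the QR decomposition instead.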

1.4 Objective function properties

There are some particular forms of the objective function that lead to specialized, but quite common, solution methods. This gives us one dimension or axis by which to categorize the optimization methods we shall consider later.

1.4.1 Sums of squares

If the objective function is a sum of squared terms, we can use a method for solving nonlinear least squares problems. Clearly, the estimation of the Cobb–Douglas production model above by minimizing the sum of squared residuals is a problem of this type.

We note that the Cobb–Douglas problem is linear in the parameters in the case of the log-form model. The linear least squares problem is so pervasive that it is worth noting how it may be solved because some approaches to nonlinear problems can be viewed as solving sequences of linear problems.
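The idea of solving a nonlinear least squares problem as a sequence of linear ones can be sketched with a bare Gauss–Newton iteration. This is an unsafeguarded illustration on synthetic, noise-free data, not a production method: the model, the data, and the starting values are all assumptions, and there is no step-length control.

```r
# Synthetic, noise-free data from 2 * exp(-0.5 * x), for illustration only.
xd <- 0:5
yd <- 2 * exp(-0.5 * xd)

# Model p[1] * exp(-p[2] * x) and its Jacobian with respect to p.
model_fn <- function(p) p[1] * exp(-p[2] * xd)
jac_fn <- function(p) cbind(exp(-p[2] * xd),
                            -p[1] * xd * exp(-p[2] * xd))

p <- c(1.5, 0.4)   # assumed starting values
for (iter in 1:20) {
  r <- yd - model_fn(p)   # current residuals (data minus model)
  J <- jac_fn(p)
  # Linear least squares subproblem: choose d to minimize ||r - J d||^2.
  d <- qr.solve(J, r)
  p <- p + d
  if (sqrt(sum(d^2)) < 1e-10) break   # stop when the step is negligible
}
p
```

Each pass through the loop is an ordinary linear least squares problem in the step `d`. How many passes are needed is not known in advance.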

1.4.2 Minimax approximation

It is sometimes important to have an upper bound on the deviation of a model from “data.” We, therefore, wish to find the set of parameters in a model that minimizes the maximum deviation, hence a minimax problem. In particular, consider that there may be relatively simple approximations to some specialized and awkward-to-compute special functions. This sort of approximation problem is less familiar to statistical workers than sums-of-squares problems. Moreover, the small residuals may render some traditional methods such as the R function nls() ill-suited to their solution.
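A minimax objective can be sketched as follows. The task chosen here, approximating `exp(x)` on the interval from 0 to 1 by a straight line, is an illustrative assumption, as are the grid and the use of Nelder–Mead in `optim()`. The least squares line serves as the starting point and as a comparison.

```r
# Grid on [0, 1] and the function values to be approximated.
xg <- seq(0, 1, length.out = 101)
fg <- exp(xg)

# Minimax objective: largest absolute deviation of the line from exp(x).
max_dev <- function(par) max(abs(fg - (par[1] + par[2] * xg)))

# Least squares line, used as the starting point and for comparison.
ls_par <- unname(coef(lm(fg ~ xg)))

# The objective is not smooth, so a derivative-free method is used here.
mm <- optim(ls_par, max_dev, method = "Nelder-Mead")

c(least_squares = max_dev(ls_par), minimax = mm$value)
```

The minimax fit trades a somewhat larger sum of squares for a smaller worst-case deviation. A general-purpose search like this is only a rough tool for such nonsmooth objectives.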

1.4.3 Problems with multiple...

System requirements

File format: ePUB
Copy protection: Adobe DRM (Digital Rights Management)

System requirements:

Computer (Windows; MacOS X; Linux): Before downloading, install the free software Adobe Digital Editions (see E-Book Help).
Tablet/smartphone (Android; iOS): Before downloading, install the free Adobe Digital Editions app or the PocketBook app (see E-Book Help).
E-book reader: Bookeen, Kobo, Pocketbook, Sony, Tolino, and many others (not Kindle)

The ePUB file format is very well suited to novels and non-fiction, that is, to "flowing" text without a complex layout. On e-readers or smartphones, line and page breaks adapt automatically to the small displays.
Adobe DRM is a "hard" form of copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the e-book. You therefore need to prepare your reading hardware before downloading.

Please note: after installing the reading software, we strongly recommend authorizing it with your personal Adobe ID.

Further information can be found in our E-Book Help.

File format: PDF
Copy protection: Adobe DRM (Digital Rights Management)

System requirements: the same software and devices as listed for ePUB above.

The PDF file format always displays a book page identically on any hardware. PDF is therefore also suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small displays of e-readers or smartphones, PDFs are unfortunately rather tedious because too much scrolling is required.
