Fast Sequential Monte Carlo Methods for Counting and Optimization

Brand: Wiley
Price: 107.99 EUR
Availability: OnlineOnly

Reuven Y. Rubinstein, Ad Ridder, Radislav Vaisman (Authors)

Wiley (Publisher)

Published on 13 November 2013

208 pages

E-Book

ePUB with Adobe-DRM


978-1-118-61235-4 (ISBN)

€107.99 incl. 7% VAT


E-Book Single Licence

Available for download

Content

Preface
1. Introduction to Monte Carlo Methods
2. Cross-Entropy Method
2.1. Introduction
2.2. Estimation of Rare-Event Probabilities
2.3. Cross-Entropy Method for Optimization
2.4. Continuous Optimization
2.5. Noisy Optimization
3. Minimum Cross-Entropy Method
3.1. Introduction
3.2. Classic MinxEnt Method
3.3. Rare Events and MinxEnt
3.4. Indicator MinxEnt Method
3.5. IME Method for Combinatorial Optimization
4. Splitting Method for Counting and Optimization
4.1. Background
4.2. Quick Glance at the Splitting Method
4.3. Splitting Algorithm with Fixed Levels
4.4. Adaptive Splitting Algorithm
4.5. Sampling Uniformly on Discrete Regions
4.6. Splitting Algorithm for Combinatorial Optimization
4.7. Enhanced Splitting Method for Counting
4.8. Application of Splitting to Reliability Models
4.9. Numerical Results with the Splitting Algorithms
4.10. Appendix: Gibbs Sampler
5. Stochastic Enumeration Method
5.1. Introduction
5.2. OSLA Method and Its Extensions
5.3. SE Method
5.4. Applications of SE
5.5. Numerical Results
A. Additional Topics
A.1. Combinatorial Problems
A.1.1. Counting
A.1.2. Combinatorial Optimization
A.2. Information
A.2.1. Shannon Entropy
A.2.2. Kullback-Leibler Cross-Entropy
A.3. Efficiency of Estimators
A.3.1. Complexity
A.3.2. Complexity of Randomized Algorithms
Bibliography
Abbreviations and Acronyms
List of Symbols
Index

Chapter 1 Introduction to Monte Carlo Methods

Monte Carlo methods are a class of computational algorithms that rely on repeated random sampling to approximate unknown quantities. They are best suited to implementation as a computer program, and they are typically used when exact results from a deterministic algorithm are not available.

The Monte Carlo method was developed in the 1940s by John von Neumann, Stanislaw Ulam, and Nicholas Metropolis while they were working on the Manhattan Project at the Los Alamos National Laboratory. It was named after the Monte Carlo Casino, a famous casino where Ulam's uncle often gambled away his money.

We mainly deal in this book with two well-known Monte Carlo methods, called importance sampling and splitting, and in particular with their applications to combinatorial optimization, counting, and estimation of probabilities of rare events.

Importance sampling is a well-known variance reduction technique in stochastic simulation studies. The idea behind importance sampling is that certain values of the input random variables have a greater impact on the output parameters than others. If these “important” values are sampled more frequently, the variance of the output estimator can be reduced. However, such direct use of importance sampling distributions will result in a biased estimator. To eliminate the bias, the simulation outputs must be modified (weighted) by using a likelihood ratio factor, also called the Radon-Nikodym derivative [108]. The fundamental issue in implementing importance sampling is the choice of the importance sampling distribution.
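The weighting step described above can be illustrated with a minimal sketch. The setting is an assumption chosen for illustration, not an example from the book: it estimates the tail probability P(X > t) of a standard normal variable by sampling from a normal distribution shifted to mean t and multiplying each hit by the likelihood ratio exp(-t x + t^2 / 2). The function names are hypothetical.

```python
import math
import random


def rare_tail_estimate(threshold, n, rng):
    """Importance sampling estimate of P(X > threshold) for X ~ N(0, 1).

    Assumed proposal: N(threshold, 1), so samples land near the rare region.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)  # draw from the shifted proposal
        if x > threshold:
            # Likelihood ratio f(x) / g(x) of N(0, 1) against N(threshold, 1).
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


def exact_tail(threshold):
    """Exact standard normal tail probability, for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


rng = random.Random(1)
print(rare_tail_estimate(4.0, 20000, rng), exact_tail(4.0))
```

Without the weight, the sample mean of the indicator would estimate a probability under the shifted distribution, which is the bias the paragraph refers to.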

In the case of counting problems, it is well known that a straightforward application of importance sampling typically yields poor approximations of the quantity of interest. In particular, Gogate and Dechter [56] [57] show that poorly chosen importance sampling in graphical models such as satisfiability models generates many useless zero-weight samples, which are often rejected, yielding an inefficient sampling process. To address this problem, which is called the problem of losing trajectories, these authors propose a clever sample search method, which is integrated into the importance sampling framework.

With regard to probability problems, a wide range of applications of importance sampling have been reported successfully in the literature over the last decades. Siegmund [115] was the first to argue that, using an exponential change of measure, asymptotically efficient importance sampling schemes can be built for estimating gambler's ruin probabilities. His analysis is related to the theory of large deviations, which has since become an important tool for the design of efficient Monte Carlo experiments. Importance sampling is now a subject of almost any standard book on Monte Carlo simulation (see, for example, [3] [108]). We shall use importance sampling widely in this book, especially in connection with rare-event estimation.

The splitting method dates back to Kahn and Harris [62] and Rosenbluth and Rosenbluth [97]. The main idea is to partition the state-space of a system into a series of nested subsets and to consider the rare event as the intersection of a nested sequence of events. When a given subset is entered by a sample trajectory during the simulation, numerous random retrials are generated, with the initial state for each retrial being the state of the system at the entry point. By doing so, the system trajectory is split into a number of new subtrajectories, hence the name “splitting”. Since then, hundreds of papers have been written on this topic, both from a theoretical and a practical point of view. Applications of the splitting method arise in particle transmission (Kahn and Harris [62]), queueing systems (Garvels [48], Garvels and Kroese [49], Garvels et al. [50]), and reliability (L'Ecuyer et al. [76]). The method has been given new impetus by the RESTART (Repetitive Simulation Trials After Reaching Thresholds) method in the sequence of papers by Villén-Altamirano and Villén-Altamirano [122–124]. A fundamental theory of the splitting method was developed by Melas [85], Glasserman et al. [54] [55], and Dean and Dupuis [38] [39]. Recent developments include the adaptive selection of the splitting levels in Cérou and Guyader [24], the use of splitting in reliability networks [73] [109], quasi-Monte Carlo estimators in L'Ecuyer et al. [77], and the connection between splitting for Markovian processes and interacting particle methods based on the Feynman-Kac model in Del Moral [89].
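The main idea of splitting can be sketched on a toy problem. Everything specific here is an assumption for illustration and is not taken from the book: the process is a simple random walk started at 1 with up-probability `p_up`, the rare event is reaching `top` before 0, the nested levels are 2, 3, ..., `top`, and every trajectory that enters a new level is replaced by a fixed number `split` of retrials started at the entry point.

```python
import random


def splitting_estimate(p_up, top, n_start, split, rng):
    """Fixed-splitting estimate of P(walk from 1 reaches `top` before 0)."""
    count = n_start  # trajectories currently sitting at `level`
    for level in range(1, top):
        hits = 0
        for _ in range(count):
            x = level
            # Run until the walk enters the next level or is absorbed at 0.
            while 0 < x <= level:
                x += 1 if rng.random() < p_up else -1
            if x == level + 1:
                hits += 1
        if hits == 0:
            return 0.0
        # Split every trajectory that entered the next level,
        # except at the final level where we only count arrivals.
        count = hits * split if level + 1 < top else hits
    # Each original trajectory was multiplied `split` times at top - 2 levels.
    return count / (n_start * split ** (top - 2))


def exact_ruin_probability(p_up, top):
    """Classical gambler's ruin formula (valid for p_up != 0.5)."""
    ratio = (1.0 - p_up) / p_up
    return (1.0 - ratio) / (1.0 - ratio ** top)


rng = random.Random(0)
print(splitting_estimate(0.4, 10, 2000, 2, rng), exact_ruin_probability(0.4, 10))
```

Dividing the final count by `n_start * split ** (top - 2)` undoes the multiplication of trajectories, which keeps the estimator unbiased in this fixed-splitting variant.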

Let us introduce the notion of a randomized algorithm. A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic to solve a deterministic problem such as a combinatorial optimization problem. As a result, the algorithm's running time, its output, or both are random variables. In general, introducing randomness may result in an algorithm that is far simpler, more elegant, and sometimes even more efficient than its deterministic counterpart.

Example 1.1 Checking Matrix Multiplication

Suppose we are given three $n \times n$ matrices $A$, $B$, and $C$ and we want to check whether $AB = C$. A trivial deterministic algorithm would be to run a standard multiplication algorithm and compare each entry of $AB$ with $C$. Simple matrix multiplication requires $O(n^3)$ operations. A more sophisticated algorithm [88] takes only $O(n^{2.376})$ operations. Using randomization, however, we need only $O(n^2)$ operations, with an extremely small probability of error [88]. The randomized procedure is as follows:

1. Pick a random $n$-dimensional vector $r$.
2. Multiply both sides of $AB = C$ by $r$, that is, obtain $A(Br)$ and $Cr$.
3. If $A(Br) = Cr$, then declare $AB = C$, otherwise, $AB \neq C$.

This algorithm runs in $O(n^2)$ operations because matrix multiplication is associative, so $(AB)r$ can be computed as $A(Br)$, thus requiring only three matrix-vector multiplications for the algorithm.
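The procedure of Example 1.1 can be sketched in a few lines of Python. Two details are assumptions not stated in the text: the random vector is drawn with independent 0/1 entries, and the check is repeated over several independent vectors so that a wrong product is missed only with very small probability. The function names are hypothetical, and integer matrices are used to avoid floating-point comparison issues.

```python
import random


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: O(n^2) operations."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def randomized_product_check(a, b, c, trials=40, rng=None):
    """Return False if AB != C was detected, True if every trial passed."""
    if rng is None:
        rng = random.Random()
    n = len(c[0])  # length of the random vector r
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(n)]
        # Three matrix-vector products: B r, A (B r), and C r.
        if mat_vec(a, mat_vec(b, r)) != mat_vec(c, r):
            return False  # a witness vector proves AB != C
    return True  # AB = C with high probability


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_good = [[19, 22], [43, 50]]
c_bad = [[19, 22], [43, 51]]
print(randomized_product_check(a, b, c_good, rng=random.Random(0)))
print(randomized_product_check(a, b, c_bad, rng=random.Random(0)))
```

A `False` answer is always correct, while a `True` answer can be wrong only if every random vector happened to hide the discrepancy.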

For more examples and foundations on randomized algorithms, see the monographs [88] [90].

We shall consider not only randomized algorithms but also random structures. The latter comprise random graphs (such as Erdős-Rényi graphs), random Boolean formulas, and so on. Random structures are of interest both as a means of understanding the behavior of algorithms on typical inputs and as a mathematical framework in which one can investigate various probabilistic techniques to analyze randomized algorithms.

This book deals with Monte Carlo methods and their associated randomized algorithms for solving combinatorial optimization and counting problems. In particular, we consider combinatorial problems that can be modeled by integer linear constraints. To clarify, denote by $\mathcal{X}^*$ the set of feasible solutions of a combinatorial problem, which is assumed to be a subset of an $n$-dimensional integer vector space and which is given by the following linear constraints:

(1.1)

Here, $A$ is a given $m \times n$ matrix and $b$ is a given $m$-dimensional vector. Most often we require the variables to be nonnegative integers and, in particular, binary integers.

In this book, we describe in detail various problems, algorithms, and mathematical aspects that are associated with (1.1) and its relation to decision making, counting, and optimization. Below is a short list of problems associated with (1.1):

1. Decision making: Is $\mathcal{X}^*$ nonempty?
2. Optimization: Solve $\max_{x \in \mathcal{X}^*} S(x)$ for a given objective (performance) function $S$.
3. Counting: Calculate the cardinality $|\mathcal{X}^*|$ of $\mathcal{X}^*$.
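For very small instances the three problems listed above can be answered by exhaustive enumeration, which makes their relationship concrete. The sketch below rests on assumptions made only for illustration: the constraints are taken to have the inequality form "matrix times vector is componentwise at most the right-hand side", the variables are binary, and the matrix, right-hand side, and objective function are made-up toy data.

```python
from itertools import product


def feasible_points(a, b):
    """Yield every binary vector x with (a x)_i <= b_i for all rows i.

    Assumption: inequality constraints over binary variables.
    """
    n = len(a[0])
    for x in product((0, 1), repeat=n):
        if all(sum(aij * xj for aij, xj in zip(row, x)) <= bi
               for row, bi in zip(a, b)):
            yield x


def objective(x):
    """Hypothetical performance function used only for this example."""
    return x[0] + 2 * x[1] + 3 * x[2]


a = [[1, 1, 1]]  # one constraint: x1 + x2 + x3 <= 2
b = [2]

points = list(feasible_points(a, b))
is_nonempty = len(points) > 0        # decision making
best = max(points, key=objective)    # optimization
count = len(points)                  # counting
print(is_nonempty, best, objective(best), count)
```

The enumeration visits all 2^n binary vectors, so it is usable only for tiny n, which is exactly why sampling-based methods are of interest for these problems.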

It turns out that, typically, all three of the above problems are hard to solve, and the counting problem is the hardest of them. However, there are problems for which decision making is easy (polynomial time) but counting is hard [90]. As an example, finding a feasible path (and also the shortest path) between two fixed nodes in a network is easy, whereas counting the total number of paths between the two nodes is difficult. Some other examples of hard counting and easy decision-making problems include:

How many different variable assignments will satisfy a given satisfiability formula in disjunctive normal form?
How many different variable assignments will satisfy a given 2-satisfiability formula (constraints on pairs of variables)?
How many perfect matchings are there for a given bipartite graph?
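To make the last question concrete, the following sketch counts the perfect matchings of a small balanced bipartite graph by brute force over all permutations. The biadjacency matrix convention (entry 1 when left node i is joined to right node j) and the example graph are assumptions chosen for illustration. The running time grows like n!, so this is a reference computation for tiny graphs, not a practical counting method.

```python
from itertools import permutations


def count_perfect_matchings(biadj):
    """Count perfect matchings of a balanced bipartite graph.

    biadj[i][j] is 1 if left node i is adjacent to right node j, else 0.
    Each permutation assigns left node i to right node perm[i].
    """
    n = len(biadj)
    return sum(
        all(biadj[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )


# A 6-cycle viewed as a bipartite graph with three nodes on each side.
cycle = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
print(count_perfect_matchings(cycle))
```

Deciding whether at least one perfect matching exists can be done in polynomial time with standard matching algorithms, whereas no polynomial-time algorithm is known for the count.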

In Chapter 5, we follow the saying “counting is hard, but decision making is easy” and employ relevant decision-making algorithms, also called oracles, to derive fast Monte Carlo algorithms for counting.

Below is a detailed list of interesting hard counting problems.

The Hamiltonian cycle problem. How many Hamiltonian cycles does a graph have? That is, how many tours does a graph contain in which every node is visited exactly once (except for the beginning/end node)?
The permanent problem. Calculate the permanent of a matrix $A$, or equivalently, the number of perfect matchings in a bipartite balanced graph with $A$ as its biadjacency matrix.
The self-avoiding walk problem. How many self-avoiding random walks of length $n$ exist, when we are allowed to move at each grid point in any neighboring direction with equal probability?
The connectivity problem. Given two different nodes in a directed or undirected graph, say $s$ and $t$, how many paths exist from $s$ to $t$ that do not traverse the same edge more than once?
The satisfiability problem. Let $\mathcal{X}$ be a collection of all sets of $n$ Boolean variables $\{x_1, \ldots, x_n\}$. Thus, ...
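As a small illustration of one of the counting problems listed above, the self-avoiding walk problem can be solved exactly for short lengths by depth-first enumeration. The sketch assumes the two-dimensional square lattice with walks started at the origin, which the text does not specify, and the function name is hypothetical. The number of walks grows exponentially with the length, so exact enumeration quickly becomes infeasible.

```python
def count_self_avoiding_walks(length):
    """Count self-avoiding walks of the given length on the square lattice."""

    def extend(visited, position, remaining):
        if remaining == 0:
            return 1
        x, y = position
        total = 0
        for step in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if step not in visited:  # the walk may not revisit a grid point
                visited.add(step)
                total += extend(visited, step, remaining - 1)
                visited.remove(step)  # backtrack
        return total

    return extend({(0, 0)}, (0, 0), length)


for n in range(1, 7):
    print(n, count_self_avoiding_walks(n))
```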
