怎样理解“No Free Lunch”定理

以下节选自:Taking Sudoku Seriously: The Math Behind the World’sMost Popular Pencil Puzzle by JASON ROSENHOUSE AND LAURA TAALMANEverything we have done is readily automated. A computer can be programmed to produce valid Sudoku squares, and then, either by removing symmetrical pairs or by placing symmetrical pairs, produce a sound puzzle. Given a partially filled square, the computer can easily determine the number of valid completions. The difficulty lies not in producing sound puzzles, but in finding those that are interesting, challenging, and beautiful.The quest for sound Sudoku puzzles is an example of a search problem. Such problems have the following general form: We have a large collection of objects. Each of these objects has a score attached to it measuring its desirability for our purposes. We desire objects possessing high scores. But the collection is too large to examine each object individually. We can examine only a small percentage of the total space. How should we conduct our search to maximize our chances of finding the desired objects?From now on we shall refer to the “large collection of objects” as the search space. It can be helpful to imagine your search space as a surface with peaks and valleys. The peaks represent high-scoring points and the valleys represent low-scoring points. Most points lie somewhere between these extremes. The surface in this metaphor is commonly referred to as the search landscape. In terms of Sudoku, the search landscape contains all partially filled-in Sudoku squares, and desirability is determined by the number of completions the pseudo-puzzles admit. The goal is to locate a partially filled-in Sudoku square that admits just one solution.The simplest of all approaches is random search. We select our objects at random, and we get what we get. It is a primitive method, but often effective. Specifically, it will be effective when the number of desirable objects in the space is a relatively large percentage of the total. For example, suppose just one-tenth of 1 percent of the objects have the properties we seek. If we choose randomly, then we would expect that one out of every one thousand objects will be desirable. Using a computer, we can typically search many millions of possibilities in a short period of time. In such a situation, random search works rather well.The number of partially filled-in Sudoku squares is enormous, but with a fast computer we could likely do a random search, choosing subsets of a Sudoku square until one is found with a unique solution. We could also do a more direct search, removing pairs sequentially until no more can be removed, as we did at the start of this chapter. It does not take long at all to produce a sound puzzle in this way.The problem comes when our criterion is something more than mere soundness. We might seek to minimize the number of starting clues, for example. For Sudoku puzzles, it is currently thought that eighteen is the minimum number of clues for a sound, symmetrical puzzle. (For a non-symmetrical puzzle, seventeen clues appears to be the minimum.) Beginning with a valid solution square and selecting eighteen cells at random is unlikely to produce a sound puzzle. It is a straightforward calculation to show that the number of ways of selecting eighteen cells from a total of eighty-one is on the order of 1016, which is too large for an exhaustive search. Finding such puzzles in a reasonable amount of time requires a better method.If random search is impractical, what are the alternatives? One possibility is known as a hill-climbing algorithm. We pick our initial point at random, but from there we confine our search to the local area around that point. We look for nearby points possessing a higher score than our current point. If we find one, then we use this second point as the beginning of a new, local search. This continues until we can no longer find a nearby, higher-scoring point. As the name suggests, this works rather well if your landscape looks like a simple pile of sand. Your initial, randomly chosen point may have a low score near the bottom of the pile. But no matter where you begin, you will always work your way to the top. The reverse situation works as well. If your landscape resembles a bowl and you wish to find the lowest point, you can with confidence employ a hill-descending algorithm.On the other hand, these algorithms perform poorly when the landscape has many peaks and valleys. You will certainly climb to the peak of whichever hill you chance upon, but you are unlikely to find the globally highest point. Worse, you will have no way of knowing whether your particular peak represents a local or a global peak. Hill-climbers also fare poorly when the landscape is highly discontinuous. If the landscape is very jagged, so that high points and low points are thoroughly mixed together, then you will need a different sort of algorithm.We could also try the so-called evolutionary algorithms. The idea is to mimic the processes of genetic mutation and natural selection lying at the heart of the evolutionary process in nature. Our search begins by choosing several points at random. We examine those points and select a subset consisting of those which, by chance, have the highest score. These are allowed to vary randomly, producing a new set of “offspring” points. We then repeat the process. Such algorithms have proven themselves effective in solving a variety of engineering problems. On the other hand, like the hill-climbers, they are only effective when the landscape is tolerably smooth and regular.What we really need is an all-purpose algorithm, one that will be effective regardless of the landscape to be searched. Such an algorithm would be a marvelous thing. Alas, it does not exist. If an algorithm performs especially well on one kind of landscape, then I promise you there are other landscapes on which it works very poorly. More precisely, the average performance of a given search algorithm over all possible landscapes is no better than a random search. This is a consequence of a collection of results known collectively, and somewhat facetiously, as the No Free Lunch Theorems .So sad. Solving a search problem requires tailoring the algorithm to the landscape. In math, as in life, there is no substitute for hard work. And it takes a lot of hard work, and a mix of human and computer techniques, to find an eighteen-clue needle in the Sudoku haystack. D. Wolpert, W. Macready, “The No Free Lunch Theorems for Optimization,” IEEE Transactions on Evolutionary Computation, Vol. 1, No. 1, pp. 67–82, 1997.
■网友


推荐阅读