Hello, bonjour, sain bainuu, salemetsiz be, welcome to the 10th week of Statistical Mechanics: Algorithms and Computations from the Physics Department of Ecole Normale Superieure. This week's lecture is devoted to a discussion of the alpha and the omega of Monte Carlo. The alpha will be Count Buffon's legendary needle-throwing experiment, which has fascinated many people, children and adults, learning circles and noble courts, ever since its invention in 1777. It will lead us to a discussion of variance reduction methods. The omega will be Levy's stable distributions, which anybody interested in statistics, and especially in Monte Carlo algorithms, has to be at least aware of. It will lead us to a discussion of the case where the variance of an observable is infinite. This week's tutorial will contain a short review of all the material covered in these ten weeks, and then we'll have a party. So, let's get started right away with this week of Statistical Mechanics: Algorithms and Computations.

The experiment consists in randomly throwing needles of length a onto a wooden floor with cracks that are a distance b apart, like this. The positions of the needles are random on an infinite floor, the needles do not roll into cracks as they do in real life, they do not interact with each other, and the angle phi is uniformly distributed between 0 and 2*pi. So, let us introduce variables: x and y (clearly y is irrelevant to the problem), the center position x_center, and the angle phi. All the cracks in the floor are equivalent, there is a symmetry between the tip and the eye of the needle, and there is also a symmetry between phi and -phi. So we may look at a reduced problem, where phi goes from 0 to pi/2, that is, 90°, and x_center goes from 0 to b/2. We suppose that the angle phi is uniformly distributed, and also that x_center is uniformly distributed.

Now, the x-coordinate of the tip is given by x_center - (a/2) cos(phi), and if this x-coordinate of the tip is smaller than 0, we have a hit. More precisely, the number of hits of a needle centered at x and oriented along phi is N_hits(x, phi), which is given by this formula. The average that we want to compute is given (as so many times in this course) by the integral over x from 0 to b/2 and over phi from 0 to pi/2 of N_hits(x, phi), divided by the normalizing integral dx from 0 to b/2, dphi from 0 to pi/2. This gives the following integral. So we have the solution of our problem, under the condition that we know how to integrate dx from 0 to 1 of arccos(x). Well, think about it twice and you'll find out that it would have been wiser to first integrate over x and then over phi. We then find that the expected number of hits is proportional to the integral dphi from 0 to pi/2 of cos(phi), and this integral we know how to do: it is equal to 1. So we end up with the result that the expected number of hits is (2/pi)(a/b). In the special case when the length of the needle a is equal to the distance b between the floor boards, we find that the mean number of hits is equal to 2/pi. So, by throwing needles onto a parquet, we can compute the number pi. This is what has fascinated many people since 1777. We can now write a program to do the Buffon experiment ourselves. It suffices to sample x_center between 0 and b/2 and phi between 0 and pi/2, and then to check whether the tip of the needle is on the other side of the crack, as shown here.
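Here is a minimal sketch of such a program, in the spirit of direct_needle.py; the variable names and the number of trials are my own choices, and it assumes a needle no longer than the board spacing (a <= b):

import random, math

a, b = 1.0, 1.0                # needle length and distance between cracks (assume a <= b)
n_trials = 2000
n_hits = 0
for trial in range(n_trials):
    x_center = random.uniform(0.0, b / 2.0)       # reduced problem: 0 < x_center < b/2
    phi = random.uniform(0.0, math.pi / 2.0)      # reduced problem: 0 < phi < pi/2
    x_tip = x_center - (a / 2.0) * math.cos(phi)  # x-coordinate of the tip
    if x_tip < 0.0:                               # the tip is on the other side of the crack: a hit
        n_hits += 1
print('fraction of hits:', n_hits / float(n_trials))
print('exact mean number of hits, 2 a / (pi b):', 2.0 * a / (math.pi * b))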
Well, you see that in this program we have pi as an input, because we draw a random number between 0 and pi/2, and yet the program is supposed to compute pi. So you might be interested in a historically authentic version that does the Buffon experiment without inputting pi. This is shown here in the patch (a sketch along these lines is given at the end of this section). So now we can do Buffon's experiment for as many needles as we want, and you see here the output for 2000 needles being thrown on the parquet.

Let's look at this program and its output, and let's ask a question: how do the needles hit the cracks? Do they hit the cracks more at the tip, at the eye, or at the center? Can you tell from this picture? Mathematically speaking, we can introduce an internal variable l that goes from 0 to a, and N_hits can be written as an integral over l from 0 to a of N_hits(l). Now, we could go into a mathematical argument and probabilistic reasoning, but it is much easier to look at this question experimentally. So, in order to find out whether we have more hits at the eye, at the center or at the tip, and in order to avoid getting into probabilistic arguments, let's take a second needle, here, and let's make a little gadget. This will be gadget number 1. All right, you see that the center coordinate of one of the needles is equal to the eye coordinate of the other. Now, with gadget number 1, I throw one of the needles randomly onto the floor, but if I don't tell you which one of the needles was thrown randomly, you will not be able to tell. You see that the number of hits at the tip of one of the needles must be equal to the number of hits at the center of the other needle. But because these needles are equivalent, we have proven, more or less, that the probability, or the mean number of hits, at the center must be equal to the mean number of hits at the tip. Now, we could glue together the needles at different parts: we could glue them together like this, or like this, or like this, and in this way we prove that the hitting probability N_hits(l) must be independent of l. From this we can prove that N_hits is equal to a constant times the length of the needle. So the gadget that we had before was of length 3/2 and had 3/2 times as many hits as the original needle. And this we already saw in our starting calculation, because the number of hits was proportional to a, the length of the needle.

So now, can we compute Upsilon without doing a complicated calculation and computing the integral of arccos(x)? Indeed we can, because the world of gadgets is not limited to straight gadgets: we can glue together the two needles at an angle, like this. All right, so this is my gadget number 2: it is a bent needle that also has an internal variable l that goes from 0 to the length of the needle, and I can throw this gadget. Now look at this bent needle, gadget number 2, that I throw onto the floor. Each point along the internal variable collects hits as if it were on its own, so what I find is that each internal point on this gadget has the same mean number of hits as before. And this remains true even if the needle itself was not just two straight needles joined together, but a curved needle. So we find a very strong generalization: the mean number of hits is equal to Upsilon times the length of the needle, independently of the shape of the needle.
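Before moving on to Barbier's argument, here is a sketch of the pi-free version mentioned above. It relies on the rejection trick inspired by the Monaco games of week 1: sample a point uniformly in the unit square, keep it only if it falls inside the quarter circle, and use it to obtain the cosine of a uniform angle, so that neither pi nor any trigonometric function is ever needed. The variable names and the number of trials are my own, and the details may differ from the course's direct_needle_patch.py:

import random

a, b = 1.0, 1.0
n_trials = 2000
n_hits = 0
for trial in range(n_trials):
    x_center = random.uniform(0.0, b / 2.0)
    # sample a random direction without using pi: draw a point in the unit
    # square and reject it unless it lies inside the quarter circle
    while True:
        dx, dy = random.uniform(0.0, 1.0), random.uniform(0.0, 1.0)
        r_sq = dx ** 2 + dy ** 2
        if 0.0 < r_sq <= 1.0:
            break
    cos_phi = dx / r_sq ** 0.5            # cosine of an angle uniform in [0, pi/2]
    if x_center < (a / 2.0) * cos_phi:    # same hit condition as before
        n_hits += 1
# estimate of pi from the hit fraction, with pi never entering the program
print('estimate of pi:', 2.0 * a * n_trials / (b * float(n_hits)))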
So now this has a very strong consequence, which was first found by Barbier in 1860. He considered throwing needles of length pi times b. Let's start with straight needles and do this experiment here on the screen: I throw a first needle, a second needle, a third needle, and so on. You see that the number of hits can be equal to 0, 1, 2, 3, or even 4. Now, we can do the same experiment with needles of length pi times b but of a different shape: we can take round needles of the same length, and let's throw them: first, second, third, fourth, and so on. What Barbier first noticed is that for the round needles we have two hits, two hits, two hits, two hits and two hits. But for the round needles, we also have the law that the mean number of hits, which is equal to two, is equal to Upsilon times the length of the needle. This length of the needle is pi (taking b = 1), so we find that Upsilon must be equal to 2/pi. And you see that Barbier was able to compute for us the outcome of the original needle experiment without ever doing the experiment. He found, without any calculation, that N_hits must be equal to 2/pi times a/b, and that is the final result.

But now, wait a minute: do we find that the outcome of the round-needle experiment is the same as the outcome of the straight-needle experiment? No, of course not. Let us look at the two experiments, the straight-needle experiment and the round-needle experiment, on their respective landing pads. For the straight-needle experiment, as a function of x and phi, we see that we can have 0 hits, 1 hit, 2 hits, 3 hits and 4 hits, and we find that the mean number of hits is equal to 2. This would be quite a complicated calculation if we did it ourselves, and it would be quite a complicated Monte Carlo simulation, because we would have a lot of fluctuations. Now, let's look at the landing pad of the round-needle experiment. Again we have the variables x and phi, and the number of hits is 2 everywhere. We have no fluctuations, and the outcome also is 2, but the two experiments are of course different. It is only the observable "mean number of hits" which is the same. So if we are only interested in computing the mean number of hits, we are much better off changing our experimental set-up and throwing the round needles rather than the straight needles. What I explained here in a very simple setting, but the very famous setting of the Buffon-Barbier needle experiment, has many applications; it is called variance reduction. You see here, in the round-needle landing pad, we have complete variance reduction: the variance of this observable is equal to zero, while in the other landing pad the variance is very large (a numerical comparison of the two landing pads is sketched at the end of this section). So this variance reduction is a powerful concept: it means that for any observable we want to compute, we should redo the calculation and think about a new set-up of our algorithm.

So before continuing, please take a moment to download, run and modify the two programs that we discussed in this section. There is direct_needle.py, which does the original experiment, and direct_needle_patch.py, which takes inspiration from the Monaco games to avoid computing sines and cosines and to avoid using pi as an input in order to compute pi. And especially, try to think over this amazing argument by Barbier, telling us that the mean number of hits is proportional to the length of the needle, independent of the shape. This is called Barbier's theorem.
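To see the variance reduction numerically, here is a small sketch that throws both a straight needle and a round needle of length pi*b onto their landing pads and compares the fluctuations of the number of hits. The crack-counting helper, the variable names and the number of trials are my own choices, not taken from the course programs:

import random, math

b = 1.0
L = math.pi * b          # both needles have total length pi * b
n_trials = 10000

def n_crossings(low, high, spacing):
    # number of cracks (multiples of spacing) inside the interval (low, high)
    return int(math.floor(high / spacing)) - int(math.floor(low / spacing))

straight_hits, round_hits = [], []
for trial in range(n_trials):
    x_center = random.uniform(0.0, b)
    # straight needle of length pi*b at a random angle: 0, 1, 2, 3 or 4 hits
    phi = random.uniform(0.0, math.pi / 2.0)
    half_span = (L / 2.0) * math.cos(phi)
    straight_hits.append(n_crossings(x_center - half_span, x_center + half_span, b))
    # round needle (circle of circumference pi*b, i.e. diameter b):
    # every crack inside its x-extent is crossed exactly twice
    round_hits.append(2 * n_crossings(x_center - b / 2.0, x_center + b / 2.0, b))

for name, hits in [('straight needles', straight_hits), ('round needles', round_hits)]:
    mean = sum(hits) / float(n_trials)
    variance = sum((h - mean) ** 2 for h in hits) / float(n_trials)
    print('%-16s mean number of hits: %.3f   variance: %.3f' % (name, mean, variance))

Both observables have mean value 2, but the straight-needle hits fluctuate between 0 and 4, while the round-needle hits do not fluctuate at all.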
All during this course, we have insisted on the relationship between sampling and integration. So, to wrap it up, for the omega of Monte Carlo, let's do one last integral. We call it the gamma integral: the integral from 0 to 1 of dx x^gamma. This very simple integral can be done analytically: the outcome is 1/(gamma+1), and the integral exists for gamma > -1. So let's do this integral by sampling: simply take x as a random number between 0 and 1 and compute this random number to the power of gamma. This is done in the program direct_gamma.py, and we add a few lines to do the error analysis. We compute the mean value, we compute the mean square value, and afterwards we output the value of the gamma integral +/- the error. Notice that here we have a direct-sampling algorithm, not a Markov-chain sampling algorithm, so the Gaussian error analysis is OK. The output of this program direct_gamma.py is shown here for gamma = 2, 1, 0, -0.2, -0.4, -0.8. You see that most of the time the outcome of the integral +/- the error agrees with the analytical result. But for gamma = -0.8, where the program runs just fine, we get the result that the gamma integral is equal to 4 +/- 0.1. This does not agree with the analytical result, which would be 1/(gamma+1) = 1/(1 - 0.8) = 1/0.2 = 5.

To make progress, let us modify our original program direct_gamma.py and compute the running averages. So, all during a very long simulation, we output the current estimate of the gamma integral, by computing the sum of the observables, the random numbers to the power of gamma, divided by the length of the run so far. This gives direct_gamma_running.py, whose output is shown here (a sketch combining the direct sampling, the error analysis and the running average follows below). We see that initially we have very unpredictable results, and then for a very long time the simulation zeroes in on a result which is different from 5, the exact value. And then, after hundreds of thousands of trials, all of a sudden we hit an enormous value of x to the gamma: notice that gamma is negative, so if x is close to zero, x^gamma becomes very large. So here we see that we approach a wrong result, then all of a sudden we go through the roof, and then we continue our calculation. While the running average looks almost constant, the error estimate, obtained from the average of x^(2*gamma), comes out much too small; then the very large sample hikes up the mean value and also increases the error estimate.

Now let us analyze what the problem is here. The problem is that we estimate the variance of our distribution through the sum over the random numbers to the power of 2*gamma, as you see here in the program, and this sum approximates an integral, the integral from 0 to 1 of dx x^(2*gamma). This is the integral from 0 to 1 of dx x^(-1.6), and this integral is infinite. So the distribution of x^gamma for gamma = -0.8 exists and its mean is finite, but it has an infinite variance. Clearly the situation is very difficult to analyze in the absence of an analytical solution.
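Here is a minimal sketch of this direct sampling, combining the naive Gaussian error analysis of direct_gamma.py with the running average of direct_gamma_running.py. The variable names, the number of trials and the printing interval are my own choices, so this is not the course code verbatim:

import random, math

gamma = -0.8
n_trials = 1000000
sigma = sigma_sq = 0.0
for trial in range(1, n_trials + 1):
    obs = random.uniform(0.0, 1.0) ** gamma    # the observable x^gamma
    sigma += obs
    sigma_sq += obs ** 2
    if trial % 100000 == 0:                    # running average, as in direct_gamma_running.py
        print('running average after %7i trials: %.4f' % (trial, sigma / float(trial)))
mean = sigma / float(n_trials)
# naive Gaussian error bar: it relies on the sampled <x^(2 gamma)>, which
# approximates a divergent integral for gamma <= -1/2, so it cannot be trusted here
error = math.sqrt(sigma_sq / float(n_trials) - mean ** 2) / math.sqrt(float(n_trials))
print('gamma integral: %.4f +/- %.4f   (exact result: %.4f)' % (mean, error, 1.0 / (gamma + 1.0)))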
Now we continue in two steps to analyze the gamma integral. What we do in the first program, direct_gamma_average.py, is to take the average Sigma/N and to compute the probability distribution of Sigma/N as a function of N, the number of samples over which we average. We take N = 1, 10, 100, 1000, 10000, and so on. For N = 1, we have the original observable, and its probability distribution is proportional to (Sigma/N)^(-1+1/gamma), as we discussed previously in lecture 4. So it goes like this, and it decreases like 1/x^2.25. Then we take the average Sigma/N for N = 10, and we have this distribution; then for N = 100 we get a new distribution, for N = 1000, and so on. All these distributions have mean value 5, but the most probable values only very slowly approach 5. And this was the reason why we obtained a wrong result: the average measured in a run of finite length turned out to be very different from 5.

So finally, let me give you, without justification, a way to rescale all these curves onto one universal curve. The rescaling is the following: instead of considering the probability distribution of Sigma/N, we consider (Sigma/N minus the mean value), divided by N^(-1-gamma). In our case gamma = -0.8, and the mean value is equal to 5, so we have to look at the histogram of (Sigma/N - 5) divided by N^(-0.2) (a small sketch of this rescaling is given at the very end of this lecture). And you see that the same data that depended so much on N fall, approximately for large N, onto one and the same curve. This is written up in the program direct_gamma_average_rescaled.py, which also contains the analytically known limiting distribution in the limit of N going to infinity. This distribution, which you see being approached here for larger and larger N and which is written here as an analytical formula, is one example of the famous Levy stable distributions. These Levy stable distributions depend on the power gamma of our distribution with infinite variance, and on the precise asymptotic behavior for x going to plus infinity and for x going to minus infinity. In our case the distribution goes as 1.25 / x^2.25 for x going to plus infinity, and it is zero for x going to minus infinity. And what Levy showed in the 1930s is that these very erratic averages, as we saw in our initial calculations, satisfy very strict laws: in the limit of N going to infinity, if they are properly rescaled, they obey these stable distributions.

So before concluding this lecture and this whole lecture series, please take a moment to download, run and modify the programs we discussed in this lecture. There was direct_gamma.py, the program that was supposed to compute the integral of dx x^gamma between 0 and 1, and that had a lot of problems doing this for gamma between -1 and -0.5, even though the integral existed. Then there was direct_gamma_running.py, the program that computed the running average and had this point going through the roof. Then we had the programs direct_gamma_average.py and direct_gamma_average_rescaled.py. So in conclusion, we have studied in this lecture two outstanding problems: the original Buffon needle experiment and the famous Levy stable distributions, which brought order and understanding to very erratic random processes. So let me thank you for all of your interest and very active participation during this course.
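Finally, for readers who want to reproduce this rescaling themselves, here is a minimal sketch in the spirit of direct_gamma_average.py and direct_gamma_average_rescaled.py. The number of independent runs, the values of N and the use of quartiles instead of a full histogram are my own choices, not the course code:

import random

gamma = -0.8
mean_exact = 1.0 / (gamma + 1.0)          # = 5 for gamma = -0.8
n_runs = 5000                             # independent estimates of Sigma/N for each N
for N in [1, 10, 100, 1000]:
    rescaled = []
    for run in range(n_runs):
        sigma = sum(random.uniform(0.0, 1.0) ** gamma for trial in range(N))
        # rescaled average (Sigma/N - mean) / N^(-1 - gamma), as described in the lecture
        rescaled.append((sigma / float(N) - mean_exact) / N ** (-1.0 - gamma))
    rescaled.sort()
    # for large N the quartiles should become (almost) independent of N,
    # signalling the collapse onto the Levy stable distribution
    print('N = %5i   quartiles of the rescaled average: %.3f  %.3f  %.3f' %
          (N, rescaled[n_runs // 4], rescaled[n_runs // 2], rescaled[3 * n_runs // 4]))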