# 🎲 The Monte Carlo method

The Monte Carlo method is used to approximate mathematical expressions that are extremely complex or simply very difficult to evaluate exactly. Its name refers to the Monte Carlo casino in Monaco, which is famous for gambling. A roulette wheel is, in effect, a random number generator, and random numbers are exactly what the Monte Carlo method relies on.

The method was created by Stanislaw Marcin Ulam. During an illness he spent his time playing solitaire and realized that it was much easier to get a general picture of the game's outcome by playing many times and counting the proportion of each result than by calculating all the probabilities.

Ulam brought this idea to his work. He had been part of the Manhattan Project, developed by the United States government in collaboration with Canada and the United Kingdom during World War II, whose objective was nothing less than building the first atomic bomb. Although the method was not used in the development of the first atomic bombs, once the war was over it was applied in quantum physics and nuclear physics research.

Ulam was working with John von Neumann and convinced him of the great potential of the new method, so the two mathematicians worked together to develop it. Besides Ulam, the greatest contributions came from Nicholas Constantine Metropolis (a Greek-American mathematician and physicist who also worked on the Manhattan Project) and Enrico Fermi (an Italian-born physicist with outstanding contributions to nuclear, quantum and particle physics). The first formal publication of the Monte Carlo method appeared in 1949, in an article by Ulam and Metropolis. Metropolis also led the team that carried out the first simulation with this method, on the ENIAC computer at the University of Pennsylvania (1948).
The Monte Carlo method is considered one of the 10 most important algorithms of the 20th century. The diffusion of neutrons during fission behaves in a completely random way, and simulating that kind of process is where the Monte Carlo method first found its place. Today the method is fundamental to the algorithms that generate 3D images, and it provides solutions to a large number of mathematical problems that can be turned into computer experiments with samples of random numbers.

The Monte Carlo method can solve problems such as computing complex volumes, minimizing costs, optimization, and so on. The idea is to generate random samples that follow a specific probability function.

Suppose we are studying the probability function of a die that, as we know, has six faces, each with the same probability of coming up. In this case the probability function is 1/6 for each face, and if we add up all the probabilities the result is one (1). In this example the function is discrete, because it only takes whole-number values that can be counted. We can also have continuous functions, such as the weight of each individual in a population. In that case the total probability is the area under the function, which must be calculated with an integral, and the result is again one (1).

Integrals of continuous functions, or of highly complex probability distributions (for example, when calculating an expected value), are often very difficult to evaluate analytically. That is when numerical sampling methods are needed to approximate the result. The Monte Carlo method is well suited to this, because it generates random samples that are not only independent but also uniformly distributed, which gives a better approximation.
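As a minimal sketch of the die example above, the following Python snippet simulates rolls of a fair die and estimates both the probability of each face and the expected value by sampling. The function name `simulate_die`, the number of rolls and the fixed seed are illustrative assumptions, not part of the original text.

```python
import random


def simulate_die(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times.

    Returns (freqs, mean): the observed frequency of each face and the
    sample mean, which approximates the expected value of one roll.
    The seed is fixed only to make the sketch reproducible (an assumption).
    """
    rng = random.Random(seed)
    counts = {face: 0 for face in range(1, 7)}
    total = 0
    for _ in range(n_rolls):
        face = rng.randint(1, 6)  # each face has probability 1/6
        counts[face] += 1
        total += face
    freqs = {face: count / n_rolls for face, count in counts.items()}
    return freqs, total / n_rolls


freqs, mean = simulate_die(100_000)
print(freqs)                # each frequency should be close to 1/6
print(sum(freqs.values()))  # the frequencies add up to 1
print(mean)                 # should be close to the exact expected value 3.5
```

The exact expected value of a fair die is (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5, so the sample mean gives a quick check that the sampling behaves as described.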
The result of an integral is the area under the curve of a function between two points a and b. The classical way to model it is to add up successive rectangles and triangles, so that an area we do not know is obtained as the sum of areas we do know, such as those of the rectangle and the triangle. The Monte Carlo method offers an alternative: it solves mathematical and physical problems by simulating random variables.

To illustrate the difference between the two approaches, suppose we want to find the area of the following figure:

![Graph](_static/images/the-monte-carlo-method/graph_1.jpeg)

This area is half of a square with side l = 1. The area of the square is l^2, so half of it is l^2 / 2 = 1^2 / 2 = 0.5.

What the Monte Carlo method does is generate points inside that square, that is, it generates random values and spreads them uniformly. If we count the total number of points and the number of points that fall below the line, we find 20 points in total and 10 below the line. Dividing the points under the line by the total gives 10/20 = 0.5.

## Steps to use the Monte Carlo method

1. Generate points uniformly distributed between 0 and 1 on both the "x" axis and the "y" axis.
2. Draw the line y = x, which in this case is our function.
3. Count the total number of iterations or points, which in this case is 20.
4. Count the number of points that fall below the line, which in this case is 10.
5. Find the fraction of the total that lies below the line: 10 is 50% of 20, that is, 10/20 = 0.5.
6. Multiply the total area by that fraction (points under the function divided by total points) to obtain the approximate area under the curve: l^2 × 0.5 = 1^2 × 0.5 = 0.5.

On the other hand, if we calculate the integral:

![Formula](_static/images/the-monte-carlo-method/formula_1.png)

and evaluate it between 0 and 1:

![Formula](_static/images/the-monte-carlo-method/formula_2.png)

We can conclude that the Monte Carlo method approximates the value of the integral. We say "approximates" because the points are random, so we may just as well find 9 or perhaps 11 points below the line.

We are now going to find the area under the curve of another function, this time generating the random numbers with Excel. The idea is to find the area under the curve of the function y = √x between x = 0 and x = 4, using the Monte Carlo method.
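Before turning to the spreadsheet, the counting procedure used for the y = x example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `area_under_y_equals_x`, the fixed seed and the sample sizes are not from the original text.

```python
import random


def area_under_y_equals_x(n_points, seed=0):
    """Estimate the area under y = x on the unit square by counting points.

    The seed is fixed only so the sketch is reproducible (an assumption).
    """
    rng = random.Random(seed)
    below = 0
    for _ in range(n_points):
        x = rng.random()  # uniform in [0, 1) on the x axis
        y = rng.random()  # uniform in [0, 1) on the y axis
        if y < x:         # the point lies below the line y = x
            below += 1
    square_area = 1.0 * 1.0  # side l = 1, so l^2 = 1
    return square_area * below / n_points


# With only 20 points the estimate fluctuates; with more points it
# settles near the exact value 0.5.
for n in (20, 1_000, 100_000):
    print(n, area_under_y_equals_x(n))
```

Running it with increasing numbers of points shows why the result is called an approximation: small samples can land noticeably above or below 0.5, while large samples stay close to it.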
As we can see, we are only interested in the section of the curve where "x" goes from 0 to 4, so the random numbers must be limited to that section as follows:

x = lower value + (upper value - lower value) × random number

In this case the lower value is 0 and the upper value is 4, so:

x = 0 + (4 - 0) × Random = 4 × Random

In the same way, for "y":

y = lower value + (upper value - lower value) × random number

y = 0 + (2 - 0) × Random = 2 × Random

We must also define when a random point is below the curve. The condition is that if y < √x, the point is definitely below the function.

The first thing to set up in Excel is a column that counts the iterations, so that we know how far to drag the calculations. This is column "A", and in this case we are going to use 1000 points uniformly distributed inside a rectangle whose base is defined by "x" and whose height is defined by "y". Here the base measures x = 4 and the height y = 2, so the total area of the simulation rectangle is: area = 4 × 2 = 8.

Column "B" will hold the random values for the "x" axis generated by the program, so the formula for that column is: `= RAND()`

![Graph](_static/images/the-monte-carlo-method/graph_2.png)

Column "C" has the same formula, since we also need random values for the "y" axis.

Column "D" holds the random values of "x" bounded by the limits established for our function, that is, x = 4 × Random. To fill this column, place the formula `= B2 * 4` in cell D2 and drag it down until the counter in column "A" reaches 1000.
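The scaling rule and the "below the curve" condition defined above can also be sketched in Python. This is a rough equivalent under stated assumptions, not the spreadsheet itself: the helper names `scale` and `area_under_sqrt`, the fixed seed and the default of 1000 points are illustrative choices.

```python
import math
import random


def scale(lower, upper, u):
    """Map a random number u in [0, 1) to the interval [lower, upper)."""
    return lower + (upper - lower) * u


def area_under_sqrt(n_points=1000, seed=0):
    """Estimate the area under y = sqrt(x) for x between 0 and 4.

    Points are sampled in the rectangle with base 4 and height 2.
    The seed is fixed only for reproducibility (an assumption).
    """
    rng = random.Random(seed)
    x_low, x_high = 0.0, 4.0
    y_low, y_high = 0.0, 2.0
    hits = 0
    for _ in range(n_points):
        x = scale(x_low, x_high, rng.random())  # x = 4 * Random
        y = scale(y_low, y_high, rng.random())  # y = 2 * Random
        if y < math.sqrt(x):                    # point is below the curve
            hits += 1
    rect_area = (x_high - x_low) * (y_high - y_low)  # 4 * 2 = 8
    return rect_area * hits / n_points


print(area_under_sqrt(1000))
print(area_under_sqrt(200_000))
```

For reference, the exact value of the integral of √x from 0 to 4 is 16/3, approximately 5.33, so the estimates should fall near that number and get closer as the number of points grows.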
In the same way as the previous step, column "E" holds the values of "y" bounded specifically for this function. As defined previously, y = 2 × Random, so cell E2 must contain the formula `= C2 * 2`, which is again dragged down until the counter reaches 1000.

Column "F" contains y = √x evaluated at the "x" values of column "D". This is the value that is compared with column "E" to decide whether each point is below the curve of the function or not.

Column "G" holds a condition: if the value in column "E" is less than the value in column "F", the random point is under the curve and the result should be 1; otherwise the result should be 0. To do this, the conditional is placed in cell G2 using the form = IF(the condition to check; the value if it is true; the value if it is false), which in this case will be: = IF(E2 …

… 2 and y = √x

As in the previous exercise, column "A" holds random values of "x", but this time we do not multiply them by any number, because the interval that interests us goes from 0 to 1 and the random values generated by Excel already lie between 0 and 1. The formula for the first column is therefore: `= RAND()`

![Graph](_static/images/the-monte-carlo-method/graph_5.jpeg)

Column "B" represents the function y = √x and contains the square root of the random values in column "A". In the same way, column "C" represents the function y = x^2 and contains the values of column "A" squared. Column "D" holds random values for "y". These are the green points scattered across the bounding square of total area 1, and they are the points we are going to compare with both functions to find out whether they fall inside or outside the intersection area.
Column "E" combines two conditions: it checks whether the random value in column "D" is greater than y = x^2 and less than y = √x. If both hold, the point lies within the intersection between the two functions. So that cell E2 returns ones (1) and zeros (0) instead of "true" or "false", which are easier to count in Excel, it must be written as: = --AND(D2>C2;D2