2024 February 29

(As mentioned before, current MIT students should not read this material.)

- Exercises on random walks
- Percolation theory and finding the critical point
- Percolation theory and fractal dimension
- The coastline paradox
- Ant in a labyrinth: random walk with invasion percolation

Having looked at how the largest cluster size $s_{\max}$ varies with $p$ for a fixed grid size $L$, now we consider the converse: what happens as $L$ varies for a fixed $p$.
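As a concrete reference for the experiments that follow, here is a minimal sketch of how $s_{\max}$ can be measured: generate an $L \times L$ grid of sites, each occupied independently with probability $p$, and flood-fill to find the clusters. (The function name and structure are my own, not necessarily the code used to produce the graphs below.)

```python
import random
from collections import deque

def largest_cluster_size(L, p, seed=None):
    """Generate an L x L site-percolation grid (each site occupied
    independently with probability p) and return the size of the
    largest cluster under 4-neighbour connectivity."""
    rng = random.Random(seed)
    occupied = [[rng.random() < p for _ in range(L)] for _ in range(L)]
    seen = [[False] * L for _ in range(L)]
    best = 0
    for i in range(L):
        for j in range(L):
            if occupied[i][j] and not seen[i][j]:
                # breadth-first flood fill of one cluster
                size, queue = 0, deque([(i, j)])
                seen[i][j] = True
                while queue:
                    x, y = queue.popleft()
                    size += 1
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < L and 0 <= ny < L and \
                                occupied[nx][ny] and not seen[nx][ny]:
                            seen[nx][ny] = True
                            queue.append((nx, ny))
                best = max(best, size)
    return best
```

Tracking the second-largest cluster as well is a one-line change (keep the two largest sizes seen).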

We learned in class that $s_{\max}$ follows certain functional forms in the limit of large $L$:

$$s_{\max} \propto \begin{cases} \log L & p < p_c \\ L^{D} & p = p_c \\ L^{d} & p > p_c \end{cases}$$

Here, $d$ is the dimension of the grid being used, and $D$ is the *fractal dimension* of the
critical percolating cluster^{1}The
fractal scaling $s_{\max} \propto L^{D}$ is just one of many power laws that describe
the behavior of a percolating cluster at or near the critical point. The
exponents of these power laws are referred to as “universal”
critical exponents because their value does not depend on the local
shape of the grid being used as a substrate, but only on the grid’s
dimension; thus all two-dimensional grids follow the same power laws.
For $d \geq 6$, the exponents stop depending on the
dimension, and equal the values given by mean-field
theory. Generally speaking we only have exact results for $d = 2$, where conformal
field theory can be used, or $d \geq 6$, where mean-field theory is
correct..

As an aside, it is worth noting that this behavior for $s_{\max}$ is not obvious, and it is plausible for there
to exist a *range* of values of $p$ which exhibit large clusters^{2}i.e.,
cluster sizes distributed like a power law, and therefore $s_{\max}$ would be a power law in $L$, without having infinite clusters.
It wasn’t until the 1980s that it was proved^{3}e.g., “Sharpness
of the Phase Transition in Percolation Models” by Michael Aizenman
and David J. Barsky that there is a single phase
transition at a single critical point, at which clusters go immediately from being
small to infinite.

Can we observe these functional forms empirically? Let us start in 2D, and graph how $s_{\max}$ and $s_2$ (the second-largest cluster size) vary with $L$ for a fixed $p$ well below $p_c$.

We have shown the least-squares fit of $s_{\max}$ as a linear function of $\log L$ for large $L$; it is quite a good fit. $s_2$ also appears to grow like $\log L$, although I did not perform a fit.
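The fit itself is ordinary least squares on $(\log L,\ s_{\max})$ pairs. A self-contained sketch (the helper name is my own, and it assumes the measurements have already been collected):

```python
import math

def fit_log(Ls, ys):
    """Least-squares fit of y = a + b*ln(L); returns (a, b)."""
    xs = [math.log(L) for L in Ls]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    # standard closed-form simple linear regression
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
        / sum((x - xbar) ** 2 for x in xs)
    a = ybar - b * xbar
    return a, b
```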

Moving up to a larger value of $p$:

We are now fitting against two different functional forms. Where
$s_{\max}$ is almost as large as one would expect at the
critical point, we are fitting against a power law; where $s_{\max}$ is significantly smaller than one expects at
the critical point, we fit against $\log L$^{4}Specifically, we use a power law where $s_{\max}$ is at least 80% of its value at $p_c$ for that $L$, and $\log L$ where $s_{\max}$ is at most 20% of its value at $p_c$ for that $L$.. Again, for large $L$ we see that $\log L$ is a good fit against $s_{\max}$.

At a value of $p$ closer still to $p_c$:

It is hard to say at this point that $\log L$ is a good fit for $s_{\max}$; the data is looking more and more like a power law.

Finally we get to $p = 0.59$:

An excellent fit to a power law! Or at least, it *looks* like
an excellent fit until it is compared to the fit with $p = 0.5927$, very close to $p_c$:

I find it remarkable how much of a difference there is between 0.59
and 0.5927; the former has a very slight but visible bend to the graph
as $s_{\max}$ starts to fall behind slightly for large $L$. Also, one can see that $s_{\max}$ and $s_2$ are not *quite* parallel when $p = 0.59$; $s_2$ is slowly but surely catching up on $s_{\max}$. At $p = 0.5927$, however, $s_{\max}$ and $s_2$ both perfectly match a power law for as high a
value of $L$ as my computer could handle, and visually appear
exactly parallel^{5}Though
from the exponents we see $s_2$ is in fact slightly catching up on $s_{\max}$.. This is because at $p_c$ the system exhibits *scale
invariance*.

What of the fractal dimension? Our fit gives an estimate of $D$ just shy of the correct value, which is known to be exactly $91/48 \approx 1.896$.
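Estimating $D$ amounts to taking a least-squares slope in log–log coordinates, since $s_{\max} \propto L^{D}$ becomes linear there. A sketch (helper name my own; synthetic power-law data stands in for real measurements):

```python
import math

def estimate_fractal_dimension(Ls, smax):
    """Estimate D from s_max ~ L^D as the least-squares slope of
    log(s_max) against log(L)."""
    xs = [math.log(L) for L in Ls]
    ys = [math.log(s) for s in smax]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
        / sum((x - xbar) ** 2 for x in xs)

# sanity check on exact power-law data with D = 91/48
Ls = [64, 128, 256, 512]
smax = [0.9 * L ** (91 / 48) for L in Ls]
```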

Continuing, as soon as $p$ rises above $p_c$ it is already obvious we have overshot:

And it just gets more obvious as $p$ increases; at larger $p$ still, the second-largest cluster can no longer keep up:

Here we see that $s_{\max}$ grows with an exponent very close to the predicted relation of $s_{\max} \propto L^{d} = L^{2}$ for $p > p_c$.

Of course, the same observations hold in 3D. Here is the behavior at the critical point:

Now we can use these graphs to resolve a bit of a mystery. We saw previously that for any fixed $L$, $s_{\max}$ varies continuously as a function of $p$. However, the theoretical functional forms for $s_{\max}$ do not vary continuously: they abruptly change from $\log L$ when $p < p_c$ to a power law at $p = p_c$. How can we resolve this disparity?

It *appears* from these graphs that we were wrong about the
functional forms, and the functional form actually does change
continuously; first $\log L$ fits well for small $p$, then gradually becomes more like a power law as $p$
approaches $p_c$. However this isn’t true – the functional forms
as stated above are correct. What actually happens is that although the
functional form for large $L$ does abruptly change at $p_c$, as $p$ gets closer to $p_c$, how large $L$ needs to be for the limiting functional form to be
applicable grows considerably.

This is related to the notion of *uniform
convergence*; while $s_{\max}$ is a continuous function of $p$ for any fixed $L$, as we take the limit $L \to \infty$, $s_{\max}$ converges to a discontinuous function. This is
because the convergence is *not* uniform,^{6}necessarily, as uniform convergence of
continuous functions always gives a continuous function
but rather the convergence is slower for $p$ near $p_c$.

How large $L$ needs to be to be considered “large” is called
the *correlation
length* $\xi$; we find it by asking the question: what is the
probability that two occupied sites belong to the same cluster^{7}More
rigorously, how much *more* likely are they to be in the same
cluster than two far-away points are to be in the same cluster? This is the
same question when $p < p_c$, but for $p > p_c$ two distant points have a nonzero chance
of lying in the same cluster because there is an infinite cluster. Thus
what we are really asking is the chance two sites lie in the same
*finite* cluster.? If they are further than $\xi$ apart, the probability falls exponentially with
the distance $r$ between them, like $e^{-r/\xi}$. Another way to see this is that the
radius of a random cluster tends to be at most around $\xi$^{8}or a
suitable power thereof.

Therefore when $\xi$ is as big as $L$, the largest *finite* clusters tend to
span the whole grid, and the system behaves roughly as if it were at
$p_c$. Only for $L \gg \xi$ can we observe the limiting behavior.

$\xi$ itself obeys a power-law relationship (see the above footnote on universal critical percolation exponents):

$$\xi \propto |p - p_c|^{-\nu}$$

where $\nu = 4/3$ for all 2D grids. Thus when $p = 0.59$ we find $\xi \approx |0.59 - 0.5927|^{-4/3} \approx 2.6 \times 10^3$, and $L$ would need to be significantly larger than that to observe the $\log L$ behavior, whereas our graph did not extend nearly that far.
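The arithmetic behind this estimate, using the standard 2D site-percolation values $p_c \approx 0.592746$ and $\nu = 4/3$ (the function name is my own):

```python
# correlation length xi ~ |p - p_c|^(-nu) for 2D site percolation
P_C = 0.592746  # critical threshold, square-lattice site percolation
NU = 4 / 3      # universal 2D correlation-length exponent

def correlation_length(p):
    """Correlation length (in lattice spacings, up to a constant factor)."""
    return abs(p - P_C) ** -NU
```

At $p = 0.59$ this gives a few thousand lattice spacings; at $p = 0.55$ it is already down to under a hundred, which is why the $\log L$ fit looked fine there.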

We can see this with a 3D plot of $s_{\max}$ as a function of $p$ and $L$:

The ramp region with $p > p_c$ corresponds to a power-law growth rate, and the constant slope of 2 shows that the whole region has the same exponent of $L^2$. The flatter region with $p < p_c$ has an asymptotic growth rate of $\log L$. As $L$ gets larger, the transition between the flat region and the ramp becomes sharper and sharper, but for any fixed value of $L$ it remains continuous and unbroken.

The behavior for $p > p_c$ is clear, as each occupied site has some fixed, high probability of being in the big cluster, so $s_{\max}$ scales with the volume of the domain, $L^d$. To understand the transition from $\log L$ to $L^d$ better, let us see if we can explain the $\log L$ behavior.

The key observation is that when $p < p_c$, far away sites are unlikely to belong to the same cluster; the probability decays exponentially with distance between the sites, so as $L \to \infty$ most pairs of sites become uncorrelated. Thus we can imagine each of the clusters as being (mostly) independent of each other. There is some finite average size for a cluster, so as $L$ is increased the number of clusters $N$ grows proportionally to the volume of the domain:

$$N \propto L^d$$

So to find the largest cluster, imagine taking $N$ iid samples $s_1, s_2, \ldots, s_N$ from some probability distribution $f$ of cluster sizes. $f$ will be somewhat “nice” in that it has finite mean, finite variance, and goes to 0 rapidly as $s \to \infty$.

Let $M$ be the maximum of the $s_i$; we wish to find the probability distribution of $M$. Now certainly by definition of $M$,

$$P(M < s) = P(s_1 < s)\,P(s_2 < s)\cdots P(s_N < s) = F(s)^N,$$

where we have used independence of the $s_i$ and identity of the $s_i$'s distributions. Note that $F$ is simply the cumulative distribution function of $f$, i.e., its integral.

Let $m$ be the median of $M$. Then

$$\frac{1}{2} = P(M < m) = F(m)^N \implies F(m) = 2^{-1/N} = e^{-\ln 2/N} \approx 1 - \frac{\ln 2}{N}.$$

We have used the Taylor series for $e^x$; the approximation is accurate because $N$ is extremely large and so $\ln 2 / N$ is extremely small.

So now we need to know the asymptotic behavior of $F$. A logical guess is that $f$ is a normal distribution. In that case, its cdf is called the error function (after scaling and translating appropriately), in this context more specifically called the Q-function: $1 - F(s) = Q(s)$. An excellent approximation for the tail behavior of $Q$ is $Q(x) \approx \phi(x)/x$, where $\phi$ is the standard normal density, so we get

$$\frac{\phi(m)}{m} \approx \frac{\ln 2}{N} \implies m \approx \sqrt{2 \ln N}.$$

Recall $m$ is the median of the maximum cluster size, and that $N \propto L^d$. In the end we find that $m \propto \sqrt{\log L}$ … which is not correct!

So our guess that cluster sizes are normally distributed is not good; the normal distribution has too little weight in the tails.
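We can see the failure numerically: solving $Q(m) = 1 - 2^{-1/N} \approx \ln 2 / N$ for a standard normal tail gives an $m$ that grows only like $\sqrt{\ln N}$, so squaring $N$ multiplies $m$ by roughly $\sqrt{2}$ (times slowly varying corrections) rather than doubling it. A sketch using bisection on the Q-function (helper name my own):

```python
import math

def median_max_normal_tail(N):
    """Median m of the max of N iid samples whose upper tail is standard
    normal: solve Q(m) = 1 - 2**(-1/N) (~ ln(2)/N) by bisection."""
    target = 1.0 - 2.0 ** (-1.0 / N)
    lo, hi = 0.0, 50.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # Q(x) = erfc(x / sqrt(2)) / 2 is decreasing in x
        if 0.5 * math.erfc(mid / math.sqrt(2.0)) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```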

Let us see if we can derive the tail behavior of $f$ from first principles. To start, suppose we are in 1D, and we have $p < p_c = 1$.

Consider an occupied site, and for simplicity ignore its left neighbor. For it to be part of a cluster of size at least $s$, the $s - 1$ sites to its right must all be occupied, which has probability $p^{s-1}$. Each step further imposes another factor of $p$, so $f$ is an exponential distribution. That is, $f(s) \propto p^s$, and likewise its integral $1 - F(s) \propto p^s$. This gives:

$$p^m \propto \frac{\ln 2}{N} \implies m \propto \frac{\ln N}{\ln(1/p)} \propto \log L,$$

which is now correct.
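Both halves of this 1D argument are easy to check: the largest 1D cluster is just the longest run of consecutive occupied sites, and the exponential tail predicts $m \propto \log N$, i.e. squaring $N$ roughly doubles $m$. (A sketch with my own helper names; the run-length bounds in the test are rough expectations, not exact values.)

```python
import math
import random

def longest_run_1d(L, p, seed=0):
    """Largest cluster in a 1D lattice of L sites = longest run of
    consecutive occupied sites."""
    rng = random.Random(seed)
    best = run = 0
    for _ in range(L):
        if rng.random() < p:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best

def predicted_median_max(N, p):
    """Median of the max of N iid sizes with tail 1 - F(s) ~ p^s:
    solve p^m = 1 - 2**(-1/N), giving m ~ log(N) growth."""
    return math.log(1.0 - 2.0 ** (-1.0 / N)) / math.log(p)
```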

What about higher than 1D? The whole difficulty with $d \geq 2$ is the presence of loops in the grid, i.e., there is more than one route between any two particular sites. If there were only one route connecting two sites it would be very easy to predict whether they are in the same cluster.

Thus we turn to the Bethe lattice, which is the infinite tree where each vertex has $z$ neighbors. If we take $z = 2d$, then the Bethe lattice strongly resembles the $d$-dimensional grid, in that both are regular graphs with vertex degree $2d$. In fact, the Bethe lattice is the universal covering graph of the integer lattice; that is, it is what you get when you remove all the loops from the integer lattice. E.g., if you go up, right, down, left on the Bethe lattice you do not return to where you started – the only way to return to the origin is to exactly retrace your steps.
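This can be made concrete: a position on the Bethe lattice with $z = 4$ is a *reduced word* of moves (cancelling only immediate backtracking), while its image on the grid is just the vector sum of the moves. (A sketch; the representation choices are my own.)

```python
def bethe_position(moves):
    """Position on the z=4 Bethe lattice: the reduced word of moves,
    cancelling only immediate backtracking (up/down, left/right)."""
    inverse = {'up': 'down', 'down': 'up', 'left': 'right', 'right': 'left'}
    word = []
    for m in moves:
        if word and word[-1] == inverse[m]:
            word.pop()        # exactly retracing the last step
        else:
            word.append(m)
    return tuple(word)

def grid_position(moves):
    """Image under the covering map: the same moves taken on Z^2."""
    step = {'up': (0, 1), 'down': (0, -1), 'left': (-1, 0), 'right': (1, 0)}
    x = y = 0
    for m in moves:
        dx, dy = step[m]
        x, y = x + dx, y + dy
    return (x, y)

# the loop from the text: closed on the grid, open on the tree
loop = ['up', 'right', 'down', 'left']
```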

The Bethe lattice being a tree, there is a unique path connecting any
two sites, which again makes it easy to calculate the cluster size
distribution. An occupied site has $z$ neighbors, each with a probability $p$ of being occupied. Each “newly” occupied neighbor
opens up an additional $z - 1$ new neighbors which have the potential to be
occupied themselves. We can model this process of growing a cluster as a
weighted random walk that starts $z$ units to the right of the origin and at each time
step has a probability $p$ of going $z - 2$ units to the right and probability $1 - p$ of going 1 unit to the left; the
size of the cluster is one plus the number of times the random walk went
to the right before reaching the origin.^{9}The
distance from the origin represents the number of sites adjacent to the
incomplete cluster that have not yet been determined to be
occupied or not. Thus the size of the cluster is $1 + (T - z)/(z - 1)$, where $T$ is the time it takes for the random walk to reach
the origin.

For $p < 1/(z - 1)$, which is the critical threshold for the Bethe lattice, this random walk is biased towards the left. The path of a biased random walk can be thought of as diffusing from a mean position which is moving with time. For large times $t$, this mean position is to the left of the origin (specifically, the mean position is at $z - (1 - p(z-1))t$), but the standard deviation only grows like $\sqrt{t}$, like all random walks / diffusion processes. Therefore to remain to the right of the origin at time $t$ requires being $\Theta(\sqrt{t})$ standard deviations to the right of the mean. Since diffusion creates a normal distribution, the probability of remaining $\Theta(\sqrt{t})$ standard deviations from the mean scales like

$$\exp\left(-\Theta(\sqrt{t})^2\right) = e^{-\Theta(t)},$$

which is an exponential decay, exactly as we saw in 1D (recall that the time it takes this random walk to cross the origin is proportional to the size of the cluster).
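The frontier walk is easy to simulate directly, and in the subcritical regime its sample mean matches the known mean cluster size $(1+p)/(1-(z-1)p)$ on the Bethe lattice (helper names and parameter choices are my own):

```python
import random

def bethe_cluster_size(p, z, rng, cap=10 ** 6):
    """Sample the size of the cluster containing a given occupied site
    on the Bethe lattice (coordination number z) via the frontier walk:
    start z undetermined neighbours from the origin; each step resolves
    one of them (+(z-2) with prob p, -1 with prob 1-p)."""
    frontier, rights, steps = z, 0, 0
    while frontier > 0 and steps < cap:
        if rng.random() < p:
            frontier += z - 2   # occupied: opens z-1 new sites, closes 1
            rights += 1
        else:
            frontier -= 1       # vacant: closes 1 site
        steps += 1
    return 1 + rights

rng = random.Random(42)
# subcritical example: z = 4 (so p_c = 1/3), p = 0.2
sizes = [bethe_cluster_size(0.2, 4, rng) for _ in range(100000)]
mean = sum(sizes) / len(sizes)  # theory: (1+p)/(1-(z-1)p) = 3.0 here
```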

Thus we have recovered the desired exponential scaling for cluster sizes on the Bethe lattice, but how does this help us with the standard integer lattice when $d \geq 2$?

Let $\varphi$ be the universal covering map from the Bethe
lattice to the integer lattice (i.e. $\varphi$ “curls up” the Bethe lattice by restoring all
the loops), and consider a random cluster $C$ on the Bethe lattice. Let $\varphi(C)$ be the cluster on the integer lattice
consisting of the points $\varphi(x)$ for $x$ in $C$.^{10}Note
that if some collection of points in the Bethe lattice is connected,
then its image is also connected, because $\varphi$ is a covering map and therefore sends paths
to paths. Certainly $\varphi(C)$ is either the same size as or smaller than
$C$, depending on whether different points in $C$ have the same image. (E.g., recall that the paths
“right, right, up, up” and “up, up, right, right” go to
*different* points on the Bethe lattice, and the *same*
point on the integer lattice.)

Now $\varphi(C)$ is some sort of a random cluster, but
is it drawn from the same probability distribution as randomly generated
clusters on the integer lattice? No, in fact it tends to be larger.
Specifically, fix an origin point $0$,^{11}As a
slight abuse of notation we will use $0$ to mean the origin of both the Bethe lattice
and the integer lattice, i.e. we identify $0$ with $\varphi(0)$. and suppose $C$ is a random cluster containing $0$ on the Bethe lattice, and $C'$ is a random cluster containing $0$ on the integer lattice. (We are, of course,
using the same value of $p$ for both lattices, and
considering only finite clusters.) Then we claim that for any finite
collection of points $X$ on the integer lattice, we have:

$$P(X \subseteq \varphi(C)) \geq P(X \subseteq C').$$

For simplicity suppose $X$ consists of a single point $x$. If $x \in C'$, then there exists a non-self-intersecting path from $0$ to $x$ that passes only through points in $C'$. Let us make a list $\pi_1, \pi_2, \ldots$ of all finite non-self-intersecting paths that go from $0$ to $x$. Then $x \in C'$ if and only if $\pi_i \subseteq C'$ for at least one of the $\pi_i$.

To compute this, we add up $P(\pi_i \subseteq C')$ over all of the $\pi_i$, and then remove the probability that at least two of the paths are in $C'$, etc. Each $P(\pi_i \subseteq C')$ is easy to compute, because it is just $p^{|\pi_i|}$, where $|\pi_i|$ is the length of $\pi_i$. Note that these events are not independent of each other, because some of the paths pass through the same points as other paths.

To compute $P(x \in \varphi(C))$, we do exactly the same
process involving adding up the $P(\tilde{\pi}_i \subseteq C)$. Because $\varphi$ is a covering map, each path $\pi_i$ has a *unique* preimage $\tilde{\pi}_i$ starting at $0$ on the Bethe lattice that is also
a path. Thus we can do this adding-up process on the Bethe lattice, and
add up the $P(\tilde{\pi}_i \subseteq C)$ and subtract duplicates,
etc.

Certainly

$$P(\tilde{\pi}_i \subseteq C) = P(\pi_i \subseteq C'),$$

because the paths $\tilde{\pi}_i$ and $\pi_i$ have the same length, and we are using the same probability $p$ on both lattices.

However the joint probabilities of the paths are not the same, because there are paths that pass through the same points on the integer lattice but do not intersect on the Bethe lattice. (E.g. “right, up, up” and “up, right, right”; or indeed all of the $\pi_i$ end at the same point $x$ on the integer lattice but the $\tilde{\pi}_i$ end at different points on the Bethe lattice.) The events $\pi_i \subseteq C'$ are more correlated with each other than the events $\tilde{\pi}_i \subseteq C$, so the probability that at least one event happens is greater for the latter than the former. That is, $P(x \in \varphi(C)) \geq P(x \in C')$.

Similarly this holds if $X$ is any finite collection of points, not just one point.

It follows that for any integer $n$,

$$P(|C'| \geq n) \leq P(|\varphi(C)| \geq n) \leq P(|C| \geq n),$$

where we have used that $|\varphi(C)| \leq |C|$.

Since we previously saw that when $p < 1/(z - 1)$ (the critical point on the Bethe lattice, where $z = 2d$ here), the tail probability distribution of the $|C|$ is exponential (i.e. $P(|C| \geq n) \leq e^{-cn}$ for some $c > 0$), it follows that $P(|C'| \geq n)$ is bounded from above by an exponential.

What about bounded from below? That is easy: the $d$-dimensional integer lattice contains the 1D lattice as a subset, and so the size distribution is bounded from below by the distribution on the 1D lattice, which is also exponential. Therefore $1 - F(s) = e^{-\Theta(s)}$ for large $s$.

Now let us jump back to our earlier claim: if we have $N$ iid randomly generated clusters, and $s_i$ is the size of one of them, and $m$ is the median of the size of the largest one, then

$$1 - F(m) \approx \frac{\ln 2}{N},$$

so therefore

$$e^{-\Theta(m)} \approx \frac{\ln 2}{N} \implies m \propto \ln N \propto \log L,$$

and we have demonstrated the desired scaling law (recall $N \propto L^d$).

While this “proof” is certainly quite hand-wavy, it is largely sound but for one critical weakness: the critical points of the Bethe lattice and the integer lattice are different. This argument is only valid for $p < 1/(z - 1)$, the former critical point. For $1/(z-1) < p < p_c$ a more sophisticated argument is needed. This kind of approximation process, where we replace the integer lattice with an easier lattice, will never be able to bring us all the way to $p_c$, because the easier lattice has a lower critical threshold. To write a proof that is valid for all $p < p_c$ requires somehow exploiting the critical behavior, as the value of $p_c$ is, in general, not definable analytically.

*Follow RSS/Atom feed for updates.*