All posts from 2022.

Why is it warmer in summer?

2022 September 13

It’s not a trick question: yes, it is warmer in summer because there is more sunlight. But there is surprising nuance in trying to nail down exactly what is going on – and if you don’t believe me, which of the following explanations sounds more plausible?

  1. The Earth warms during the day and cools during the night1Via radiation of infrared light to space due to the blackbody effect; see my series on the greenhouse effect for more details.; summer days are longer and therefore able to reach a higher temperature.

  2. The Earth warms during the day and cools during the night; on long days, warming exceeds cooling, so as summer carries on the Earth accumulates more and more heat.

These are very different theories – if, say, an Earth year were only 20 days long (and each day were still the same length), the second explanation predicts very mild seasonal variation, while the first predicts that seasonal variation would be unchanged.

We can see right away that the first explanation is not fully correct. Consider a location that is 45 degrees from the equator, such as Milan. The longest day of the year is the summer solstice, on June 21:

However, the hottest day of the year is more than a month later:

So after the solstice, the days are getting shorter, but Milan keeps getting hotter. Perhaps the second explanation is correct, then. Can we delve in and test it further?

Theoretical model

We are interested in a fixed location on Earth, and let us suppose that it has a temperature, or more precisely heat energy2We suppose that the heat capacity is constant over the range of temperatures of interest, so energy and temperature are linearly related., of E, varying with time t. Sunlight warms the Earth, and infrared emissions cool it; the latter is approximately proportional to E3Black-body radiation varies with the fourth power of temperature per the Stefan-Boltzmann law, so we need a cubic correction factor, which is modest over the range of temperatures of interest; but we are not trying to be quantitative yet.. Then the governing equation for E is

\frac {dE}{dt} = \omega (S - E)

where S represents insolation, which may depend on t, and the various constants are absorbed into \omega which has units of one over time. \omega is a frequency, and 1 / \omega is the characteristic timescale for the Earth system to restore equilibrium.

S has units of temperature (or energy, according to the units chosen for E) and represents the equilibrium temperature of that location of Earth. If S were constant, then the temperature E would simply exponentially decay to S. However let us now take S to be the day-averaged insolation at 45 degrees, which is well-approximated by a sine wave:

S = S_0 + a \cos (\tau t / P)

where \tau = 2\pi and P is the insolation period, i.e. 1 year. So we need to solve

\frac {dE}{dt} = \omega (S_0 + a \cos (\tau t / P) - E).

A pretty sensible (and correct, fortunately) guess is that E is also a sine wave, with E = S_0 + A \cos (\tau (t - t_0) / P). Here, t_0 indicates the phase offset between peak insolation and peak temperature: for Milan, we had t_0 = 35 days.

We could thus solve for E very directly, referring to some less-used trig identities for sums of sines. However, the easiest derivations for these identities are from complex numbers, so instead we will forestall needing trig at all by using complex numbers from the start. First, notice that the differential equation for E is linear: if E_1, E_2 are the temperature functions corresponding to the insolations S_1, S_2, then

\frac {d(E_1 + E_2)}{dt} = \frac {dE_1}{dt} + \frac {dE_2}{dt} = \omega ((S_1 + S_2) - (E_1 + E_2))

and thus E_1 + E_2 is a solution for an insolation of S_1 + S_2.4Indeed this linearity is how we knew that the constant term of E is S_0. In particular, if S is complex, the real and imaginary parts act independently of each other. Therefore, instead of the previous choice for S we can use

S = S_0 + a e^{\tau i t / P}

and we plug in E = S_0 + A \exp(\tau i (t - t_0) / P) in the differential equation to get


\begin{aligned}
    A (\tau i / P) e^{\tau i (t - t_0) / P} &= \omega a e^{\tau i t / P} - \omega A e^{\tau i (t - t_0) / P} \\
    A ((\tau i / P) + \omega) &= \omega a e^{\tau i t_0 / P} \\
    A &= \frac {P\omega}{P\omega + \tau i} a e^{\tau i t_0 / P}.
\end{aligned}

Since A must be real, this fixes the phase offset: \tan (\tau t_0 / P) = \tau / (P\omega). Then, returning to real numbers, we get that the solution to the differential equation is

E = S_0 + \frac {P\omega}{\sqrt{\tau^2 + (P\omega)^2}} a \cos(\tau (t / P) - \arctan (\tau / P\omega)).

Here P\omega is a unitless number representing how many “times” the system can relax to equilibrium within one period P of the forcing. When it is very large, the system rapidly approaches equilibrium: the amplitude A is close to the forcing amplitude a, and the phase shift \tau t_0 / P is nearly zero. When P\omega is quite small, the system takes many times the length of a single forcing period to reach equilibrium, thus greatly damping the amplitude A, and causing the phase shift \tau t_0 / P to approach a maximum of 90 degrees (i.e., 3 months out of phase).

Thus, the differential equation for E is essentially a low-pass filter: components of S with a frequency higher than \omega are dampened, whereas components with a lower frequency are passed through. In fact, it is exactly the equation for an RC circuit with 1 / \omega = RC.
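If you would like to see this behavior concretely, here is a minimal numerical sketch (Python; the values of \omega, S_0 and a are illustrative, not fitted to any observations). It integrates the differential equation directly and compares the resulting amplitude and lag with the closed-form solution above.

```python
# Integrate dE/dt = omega*(S0 + a*cos(tau*t/P) - E) and compare the late-time
# amplitude and lag of E against the closed-form solution.
import numpy as np

tau = 2 * np.pi
P = 365.0            # forcing period, days
omega = 1 / 40.0     # relaxation frequency, 1/days (illustrative)
S0, a = 10.0, 5.0    # mean forcing and forcing amplitude (arbitrary units)

dt = 0.1
t = np.arange(0, 10 * P, dt)
E = np.empty_like(t)
E[0] = S0
for i in range(1, len(t)):   # forward Euler with a small step
    E[i] = E[i-1] + dt * omega * (S0 + a * np.cos(tau * t[i-1] / P) - E[i-1])

# Closed-form prediction: damped amplitude and phase lag t0
A_pred = P * omega / np.hypot(tau, P * omega) * a
lag_pred = np.arctan(tau / (P * omega)) * P / tau   # days

last = t > 9 * P                                    # final period only (transient decayed)
A_num = (E[last].max() - E[last].min()) / 2
lag_num = t[last][np.argmax(E[last])] % P           # peak occurs this many days after peak forcing

print(f"amplitude: predicted {A_pred:.3f}, numerical {A_num:.3f}")
print(f"phase lag: predicted {lag_pred:.1f} days, numerical {lag_num:.1f} days")
```

With these illustrative values (1 / \omega = 40 days and P = 1 year) the lag comes out to about 35 days, in line with the closed form.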

How can we make use of this solution? We will not be able to sensibly predict \omega from first principles as it depends on a wide variety of physical attributes of the terrain and atmosphere, such as heat capacities, thermal conductivity, albedo, emissivity, etc. Therefore we cannot predict S, or a, either: while insolation in units of Watts per square meter can be observed easily, we have folded an unknown constant into \omega to bring S to units of temperature. Thus the amplitude A is not useful.

However, observation of E does tell us the phase-offset t_0, and therefore \omega. In the case of Milan, where we observed t_0 = 35 days, we get \tau / P\omega = \tan (\tau t_0 / P) = 0.687 or 1 / \omega = 0.109P = 40 days5\tan x \approx x for small x so it is not a surprise that 1 / \omega is approximately equal to t_0. Thus the timescale for temperatures in Milan to approach equilibrium is (in theory!) 40 days.
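That inversion is a one-line calculation; here is a quick check (taking one year as 365.25 days):

```python
# Invert the phase relation tan(tau * t0 / P) = tau / (P * omega) using Milan's observed lag.
import numpy as np

tau, P, t0 = 2 * np.pi, 365.25, 35.0          # P and t0 in days
ratio = np.tan(tau * t0 / P)                  # tau / (P * omega)
inv_omega = ratio * P / tau                   # 1 / omega, in days
print(f"tau/(P*omega) = {ratio:.3f}, 1/omega = {inv_omega:.1f} days = {inv_omega/P:.3f} P")
```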

With the time to restore equilibrium so much longer than a single day, and comparable to the length of a season, this supports our second explanation, that summers are hot because of the accumulation of heat locally over an extended time.

Replicating diurnal and seasonal cycles

But there is a critical flaw in this argument so far! To get here, we used the daily-averaged insolation S – we let sunlight smoothly vary over the seasons, completely erasing the existence of day and night. And, consequently, the resulting temperature E also smoothly varies with the seasons, and does not accurately replicate the observed day-night cycle of hot and cold.

To better understand this issue, let us crudely approximate the diurnal plus seasonal cycle as a sum of two sine waves. As the differential equation for E is linear, the solution will be


\begin{aligned}
    E &= S_0 \\
    &+ \frac {P_1\omega}{\sqrt{\tau^2 + (P_1\omega)^2}} a_1 \cos(\tau (t / P_1) - \arctan (\tau / P_1\omega)) \\
    &+ \frac {P_2\omega}{\sqrt{\tau^2 + (P_2\omega)^2}} a_2 \cos(\tau (t / P_2) - \arctan (\tau / P_2\omega)).
\end{aligned}

(More generally, we could use this technique to find the solution for any periodic forcing function: take the Fourier transform to decompose it into a sum of sine waves, solve for each separately, and sum the solutions.)

Here, there are two periods: P_1, say, is one day, and P_2 is one year. The resulting temperature function also varies with these two periods. And indeed, we observe that temperatures vary throughout the day, and throughout the year.

So, have we fully explained observed temperature variations? Not quite: we have two problems, one with the phase offsets and one with the relative amplitudes. As we saw above, we expect 1 / \omega to be on the order of 40 days to be able to produce the observed phase offset of the annual temperature cycle. However 1 / \omega is then 40 times larger than P_1, so the phase offset of the diurnal temperature cycle will be nearly 90 degrees (a quarter phase).

Phase offset of diurnal cycle

What is the observed phase offset of the diurnal temperature cycle? Well, sunlight does not actually act like a sine wave over the course of a day: it is more of a truncated sine wave, as solar heating does not become increasingly negative as the night gets deeper – once sunlight goes to zero it goes no lower.

In the P_1 \ll 1 / \omega regime, the morning low and evening high are predicted to occur at the point where solar heating is equal to its average daily value; this is a generalization of the quarter phase offset from sinusoidal forcing to arbitrary periodic forcing.

Ignoring atmospheric phenomena, solar heating during the day is proportional to the cosine of the solar zenith angle, i.e., the angle between the sun and straight up, and is given by

\cos \Theta = \sin \phi \sin \delta + \cos \phi \cos \delta \cos (\tau t / P_1)

where \Theta is solar zenith angle, \phi is latitude, \delta is declination of the sun6the angle between the subsolar point and the equator, and t is time of day, with t = 0 being solar noon and t = P_1 / 2 being solar midnight.

The intensity of sunlight varies with its angle \Theta, being most intense when bearing directly down; specifically, the intensity is proportional to \cos \Theta. The condition \cos \Theta > 0 corresponds exactly to the sun being above the horizon.

Thus we wish to find the daily average of \cos \Theta, counting only the times when it is positive, and so need to identify the times of sunrise and sunset (when \cos \Theta = 0). Sunrise and sunset occur at t = \pm \alpha where

\cos \frac {\tau \alpha}{P_1} = -\tan \phi \tan \delta.

Then the average daily solar heating is


\begin{aligned}
    \overline{\cos \Theta} &= \frac 1{P_1} \int_{t = -\alpha}^{t = \alpha} (\sin \phi \sin \delta + \cos \phi \cos \delta \cos(\tau t / P_1))\ dt \\
    &= \frac {2\alpha}{P_1} \sin \phi \sin \delta + \frac 2\tau \cos \phi \cos \delta \sin(\tau \alpha / P_1).
\end{aligned}

Now we need to solve for \cos \Theta_m = \overline{\cos \Theta} to find the angle \Theta_m of the sun at which point it is providing the mean solar heating; this will occur at the time t_m which is predicted to be the time of the temperature lows and highs. We have

\cos (\tau t_m / P_1) = \left(\frac {2\alpha}{P_1} - 1 \right) \tan \phi \tan \delta + \frac 2 \tau \sin(\tau \alpha / P_1)

Let us assemble a few example values. First, we tabulate sunset times \alpha, listed in hours after solar noon:

latitude     equinox   summer solstice   winter solstice
equator         6            6                 6
30 degrees      6            6.97              5.03
45 degrees      6            7.71              4.29
60 degrees      6            9.24              2.76

Then, the predicted time t_m of the daily temperature maximum, again in hours after solar noon:

latitude     equinox   summer solstice   winter solstice
equator        4.76         4.76              4.76
30 degrees     4.76         5.22              4.20
45 degrees     4.76         5.49              3.70
60 degrees     4.76         5.86              2.53
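These tables follow directly from the two formulas above; here is a short sketch (Python) that reproduces them, taking the solar declination \delta to be 0 at the equinoxes and \pm 23.44 degrees at the solstices:

```python
# Reproduce the two tables above: sunset time alpha and predicted time t_m of the daily
# temperature maximum, both in hours after solar noon.
import numpy as np

tau, P1 = 2 * np.pi, 24.0   # hours

def sunset_alpha(phi, delta):
    """Hours after solar noon at which cos(Theta) = 0."""
    return np.arccos(-np.tan(phi) * np.tan(delta)) * P1 / tau

def t_max(phi, delta):
    """Hours after solar noon at which cos(Theta) equals its daily average."""
    alpha = sunset_alpha(phi, delta)
    rhs = (2 * alpha / P1 - 1) * np.tan(phi) * np.tan(delta) \
          + (2 / tau) * np.sin(tau * alpha / P1)
    return np.arccos(rhs) * P1 / tau

for lat in (0, 30, 45, 60):
    phi = np.radians(lat)
    row = []
    for delta_deg in (0, 23.44, -23.44):   # equinox, summer solstice, winter solstice
        delta = np.radians(delta_deg)
        row.append(f"{sunset_alpha(phi, delta):5.2f} / {t_max(phi, delta):5.2f}")
    print(f"{lat:3d} deg:  " + "   ".join(row))
```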

Again let us consider Milan, which sits at 45 degrees of latitude. How do these predictions fare?

For most of summer, solar noon is around 1.30pm local time. At the solstice, solar noon is 1.25pm, and the peak temperature is historically observed at 4.30pm, about 3 hours later. This is noticeably earlier than the predicted peak temperature. Worse yet, the temperature minimum is at 6am, just a few minutes after dawn! This is about 7.5 hours before solar noon, again much earlier than the prediction.

The prediction is even worse at the winter solstice. Local noon is at 12.21pm, sunset 4.42pm, and sunrise 8am. The predicted temperature maximum and minimum are at 4.03pm and 8.39am, but are observed at 2.30pm and 6.45am. In particular, temperatures are already rising (very slightly) before sunrise, whereas we should have expected temperature to linearly decline through the whole night, as the Earth is shedding heat to space at a uniform rate.

This does not just occur in Milan; for almost any location in the world, the actual phase offset for the diurnal temperature cycle is much lower than the predicted phase offset. Furthermore, the observed diurnal temperature cycle tends to be quite asymmetric, with the daily low around dawn and the daily high a few hours after noon, despite the solar forcing being approximately symmetric.

Relative amplitudes of diurnal and seasonal variations

Now let us consider the second problem: the relative amplitude of the diurnal and seasonal variations.

When we were considering only a single cycle, the amplitude was not helpful to consider because of the unknown constants involved; but with two variations at different periods (one daily and one yearly), we can compare their relative amplitudes and see if it is consistent with observations.

As our differential equation for E is a low-pass filter, variations of higher frequency than \omega will have their amplitude scaled downwards proportionally: when P_1 \ll 1 / \omega, we have

A_1 = \frac {P_1\omega}{\sqrt{\tau^2 + (P_1\omega)^2}} a_1 \approx \frac {P_1\omega}{\sqrt{\tau^2}} a_1 = \frac {P_1\omega}{\tau} a_1

whereas for P_2 \gg 1 / \omega we have

A_2 = \frac {P_2\omega}{\sqrt{\tau^2 + (P_2\omega)^2}} a_2 \approx a_2.

Recall that our observations in Milan suggest the timescale 1 / \omega is approximately 40 days, giving us


\begin{aligned}
    A_1 &= 0.00398 a_1 \\
    A_2 &= 0.824 a_2
\end{aligned}

Thus the amplitude of the diurnal cycle in Milan should be suppressed 207 times more than the seasonal cycle. The difference between the high and low temperature in Milan is typically around 10 C, whereas the difference between summer and winter temperatures is about 20 C, so observations suggest A_2 is about double A_1. To produce these observations, then, the diurnal variation a_1 in insolation needs to be about 100 times greater than the seasonal variation a_2.
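As a check on this arithmetic, a few lines of code reproduce the damping factors and the required forcing ratio (assuming, as above, 1 / \omega = 40 days and observed diurnal and seasonal temperature ranges of roughly 10 C and 20 C):

```python
# Damping factors for the diurnal and seasonal forcing with 1/omega = 40 days.
import numpy as np

tau = 2 * np.pi
omega = 1 / 40.0                 # 1/days, inferred from Milan's 35-day seasonal lag
P1, P2 = 1.0, 365.0              # diurnal and seasonal forcing periods, in days

def damping(P):
    return P * omega / np.hypot(tau, P * omega)

d1, d2 = damping(P1), damping(P2)
print(f"A1/a1 = {d1:.5f}, A2/a2 = {d2:.3f}, ratio = {d2/d1:.0f}")   # ~0.00398, 0.824, 207

# Observed ranges suggest A2 ~= 2*A1, so the required ratio of forcing amplitudes is:
print(f"required a1/a2 = {d2 / (2 * d1):.0f}")                       # ~100
```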

The solar heating during the day is \cos \Theta times the solar constant, which at solar noon at the summer solstice at 45 N is

\cos \Theta = \sin \phi \sin \delta + \cos \phi \cos \delta = 0.93

and at solar midnight is zero, for a daily range of 0.937The diurnal cycle in solar heating is only poorly approximated by a sine wave; the “effective” daily range should be somewhat larger, but not enough so to change any conclusions.. The daily range at the winter solstice is even less, being only 0.368 at solar noon and 0 at solar midnight.

For the seasonal variation, we compare the daily averaged solar heating \overline {\cos \Theta} at the two solstices. In summer we find 0.367, and in winter 0.0858, for a range of 0.281.
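These values come from the same daily-averaged insolation formula derived earlier; a quick sketch to verify them:

```python
# Diurnal and seasonal ranges of the solar forcing cos(Theta) at 45 degrees latitude.
import numpy as np

tau, P1 = 2 * np.pi, 24.0
phi = np.radians(45.0)

def mean_cos_theta(delta):
    """Daily-averaged cos(Theta), using the sunrise/sunset angle alpha from above."""
    alpha = np.arccos(-np.tan(phi) * np.tan(delta)) * P1 / tau
    return (2 * alpha / P1) * np.sin(phi) * np.sin(delta) \
           + (2 / tau) * np.cos(phi) * np.cos(delta) * np.sin(tau * alpha / P1)

d_summer, d_winter = np.radians(23.44), np.radians(-23.44)

noon_summer = np.sin(phi) * np.sin(d_summer) + np.cos(phi) * np.cos(d_summer)
print(f"diurnal range (summer solstice): {noon_summer:.2f}")         # ~0.93
print(f"seasonal range of daily mean:    "
      f"{mean_cos_theta(d_summer) - mean_cos_theta(d_winter):.3f}")  # ~0.281
```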

Thus the amplitude of the diurnal variation in forcing is up to around 4 times larger than the amplitude of the seasonal variation, but we needed an amplitude 100 times larger to replicate the observations.

The specific numbers will vary quite a bit from location to location, especially the diurnal and seasonal variations in temperature, which strongly depend on the complex behavior of local climatological conditions. Nonetheless we would consistently find that our predictions from this model substantially underestimate observed diurnal temperature swings: if it really takes most of a season for local temperatures to reflect changes in solar forcing, then individual days and nights would be substantially smoothed over.

Reality

We are left with an irreconcilable conflict: we can explain the seasonal lag in temperatures by supposing that a location is gradually warmed over the course of a season, or we can explain the diurnal lag in temperatures by supposing that a location is gradually warmed over the course of a day, but we cannot do both with a single physical object with a single response timescale. Either it responds on a timescale comparable to a day, in which case there would be no seasonal lag, or it responds on a timescale of a season, in which case its diurnal lag is much too large and diurnal variation much too small.

Clearly the differential equation for local temperature E does not accurately reflect the actual processes going on that determine the temperature at a location; let us consider more carefully what is going on. Our first question should be, what does the temperature “at a location” on Earth mean – what is the actual physical object being measured? The graphs shown above reflect measurements of air temperature, but air is not particularly stable, and certainly does not stick around for multiple seasons. Worse yet, air near the surface receives very little direct solar heating: it is mostly heated by the Earth below, and to a lesser extent by air above it, and half of the heat it radiates goes down into the Earth instead of out towards space.

Perhaps more useful is the temperature of the solid Earth itself. This is directly heated by the sun – at least, the very top of it is. (Well, on much of the Earth it is the vegetation on top of the ground that is heated by the sun, which in turn mostly loses heat to the surrounding air and not into the ground.) We now have the problem of deciding how deep into the Earth we should be considering: further below the surface, temperatures are smoothed out on a longer time scale. The diurnal cycle of temperature only occurs in the top 10 to 20 cm, and the seasonal cycle is completely smoothed out at depths below 10 to 20 m.

What’s more, temperatures at depth are lagged relative to the surface, so that at a depth of perhaps 5 meters the ground is often hottest in winter8Note that the differential equation we’ve been using so far can only produce a phase shift of up to a quarter phase. The reason that temperatures underground can be phase shifted by more than that is that they do not directly interact with the surface, but rather interact via soil at an intermediate depth. Each “layer” of soil is slightly phase shifted and slightly damped relative to the layer above it, so these phase shifts can add up to arbitrarily large values at depth.. Temperatures many thousands of years into the past can be reconstructed by simply digging a deep enough borehole and measuring the temperature there.

All of this of course presumes we are talking about a location on land: in the ocean the concerns are different but no easier, with issues of how deep sunlight can penetrate into the water and ocean currents, particularly coastal upwelling.

A simple improvement over the differential equation above is the approach of Cronin and Emanuel (2013). They consider the temperature anomalies T_S' of the surface and T_A' of the air just above the surface, and assume (their equation 1)


\begin{aligned}
    C_A \frac {dT_A'}{dt} &= \lambda (T_S' - T_A') - B T_A' \\
    C_S \frac {dT_S'}{dt} &= \lambda (T_A' - T_S')
\end{aligned}

where C_A, C_S are the heat capacities of the air and surface, \lambda is the coupling constant between the surface and air, and B is the coupling constant between the air and space.9Additive constants are missing because they considered temperature anomalies T' rather than temperatures T. Note that only a coupling between air and space is explicitly given; there is no explicit insolation term, so the equations as given are only capable of relaxing towards equilibrium. In their numerical simulations, though, they were forced with a step function change in insolation, which is not explicitly shown in the equation. Since their forcing function was a step function instead of sinusoidal, their results were transients instead of periodic. Here, B / C_A is the equivalent of our \omega, above.

Having two variables, this system of equations supports two fundamental modes, which relax on different timescales. The longer timescale, which is the more important, was found to be upwards of 100 days for reasonable choices. The shorter timescale was not explicitly calculated in the paper but from the cited literature may be comparable to 2 days.
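To illustrate where the two timescales come from, here is a sketch that extracts them as the eigenvalues of the coupled system. The parameter values below are rough placeholders chosen only to land in a plausible range; they are not the values used by Cronin and Emanuel.

```python
# The relaxation timescales of the two-box (surface + air) system are the negative
# reciprocals of the eigenvalues of its coupling matrix.
import numpy as np

C_A = 1.0e7    # heat capacity of the air column, J / (m^2 K)    -- assumed
C_S = 2.0e7    # heat capacity of the surface layer, J / (m^2 K) -- assumed
lam = 20.0     # surface-air coupling, W / (m^2 K)               -- assumed
B = 2.0        # air-space coupling, W / (m^2 K)                  -- assumed

# d/dt [T_A', T_S'] = M [T_A', T_S']
M = np.array([[-(lam + B) / C_A,  lam / C_A],
              [ lam / C_S,       -lam / C_S]])

timescales_days = sorted(-1 / np.linalg.eigvals(M).real / 86400)
print(f"fast mode: {timescales_days[0]:.1f} days, slow mode: {timescales_days[1]:.0f} days")
```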

This is very convenient for resolving our difficulties with the observations: having two different relaxation timescales, the system can exhibit the observed behavior in both the seasonal cycle (due to response on the longer timescale) and the diurnal cycle (the shorter timescale). Adjusting the constants appropriately we would likely be able to create a temperature model that broadly agrees with the observations.

Indeed, this is probably not too far off from a qualitative understanding of our observations. Generally speaking solar heating is mostly applied to the surface of the Earth, which transfers this heat to the lowest layers of the atmosphere, which then radiate it to space. The fundamental mode with the faster timescale corresponds to the difference between T_A' and T_S' (as \lambda \gg B), and so it is on this faster timescale that the temperature of the air responds to the temperature of the surface. Thus our observation of a phase shift between the hottest time of the day and solar noon informs us about the timescale C_AC_S / ((C_A + C_S) \lambda) of this coupling. However the Earth-air system loses heat to space on the much slower C_A / B timescale, which is how the diurnal cycle can have a temperature lag of only a few hours and yet the seasonal cycle have a temperature lag of over a month.

However this is a bit too convenient. It is simplistic to just divide the system into two parts, land and air, and treat each as a homogeneous box. The temperature of the ground varies with depth, as does how strongly each depth interacts with the surface. Air within a few centimeters of the surface can be much hotter than air at 2 meters above the surface, where temperature measurements are conventionally made. The boundary layer of the atmosphere, which spans the lowest 50 to 2000 meters, is subject to high turbulence due to friction between wind and the ground, allowing for mixing on a short timescale; then the troposphere, the bottom 10 km or so, is subject to convection which mixes it on a timescale of days.

Besides vertical mixing, there is also horizontal mixing, especially along lines of latitude. Milan, like most locations at mid-latitudes, tends to have wind blow from the east, so that the air temperature anticipates the changes in sunlight over the diurnal cycle. Furthermore, diffuse scattering of sunlight due to atmospheric phenomena causes significant warming in infrared wavelengths even before the sun is directly visible at sunrise.

Attempting to create a massive model that simulates the temperatures of each component of the system and their interrelations is a monumental task with innumerable sources of error; such an explicit simulation is only viable for very particular purposes where the errors can be controlled and prevented from compounding upon each other. For our purposes, it is more appropriate to analyze more abstractly how we expect complicated systems to respond in general.

Consider the many physical objects that comprise the surface Earth, such as the vegetation and different layers of the ground and air. Each such component has its own physical characteristics, such as its temperature E_i (which may vary with time), and can be subdivided into further components as necessary. We can linearly approximate the response of E_i to solar forcing, like before, as

\frac {dE_i}{dt} = \omega_i (S - E_i)

where \omega_i is the component’s response frequency, which may vary from one component to another. Of course, each of the components interacts in some way with every other, so we have a much bigger system of equations

\frac {dE_i}{dt} = \omega_i (S - E_i) + \sum_j \omega_{ij} (E_j - E_i)

We will suppose the \omega_{ij} are generally unimportant compared to the \omega_i: for, if two components interact highly with each other, they are probably in close physical proximity (such as adjacent layers in the ground), so they will respond to solar forcing in a similar way, so they will generally have a similar temperature anyhow, and heat flow between them is not so important.10Side note: because \omega_{ij} depends on 1 / C_i, where C_i is the heat capacity of the ith component, in general \omega_{ij} \neq \omega_{ji}.

Now, what happens if a solar forcing S with frequency \omega is applied to the system? Each component acts as a low-pass filter with frequency cut-off \omega_i, so E_i will oscillate with S when \omega_i \gg \omega and will not respond when \omega_i \ll \omega. The phase offset of the response is small when \omega_i \gg \omega and approaches a quarter phase when \omega_i \ll \omega.
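A small numerical illustration of this point: for a few illustrative response times 1 / \omega_i (made-up values, not estimates of any real component), the damping and lag of the response to the diurnal and seasonal forcings are:

```python
# For each component i, a sinusoidal forcing of period P is damped by
# P*omega_i / sqrt(tau^2 + (P*omega_i)^2) and lagged by arctan(tau/(P*omega_i)).
import numpy as np

tau = 2 * np.pi
periods = {"diurnal (1 day)": 1.0, "seasonal (365 days)": 365.0}

for inv_omega in (0.1, 2.0, 40.0):          # response times 1/omega_i in days, illustrative
    omega_i = 1 / inv_omega
    print(f"1/omega_i = {inv_omega:4.1f} days:")
    for name, P in periods.items():
        damp = P * omega_i / np.hypot(tau, P * omega_i)
        lag = np.arctan(tau / (P * omega_i)) * P / tau
        print(f"  {name:20s} amplitude x{damp:5.3f}, lag {lag:6.2f} days")
```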

If you then measure the temperature of the system at some point in time, you will effectively be measuring some sort of an average of the temperatures of its components; this depends on where you are measuring the temperature, and the details of the interaction terms \omega_{ij} that cause changes in the temperature of each component to influence the temperatures of every other one.

Therefore, as sunlight varies over a day, we observe that the parts of the Earth with a very large \omega_i respond to the sunlight by rapidly changing temperature: for example, the top few centimeters of the ground and the bottom few meters of air, which are most directly influenced by sunlight and have a quite low heat capacity. These parts also respond to seasonal variations in sunlight by rapidly changing temperature. If they were the only relevant parts of the surface Earth, then explanation (1) at the very beginning would be correct, and we would observe the hottest part of the year lagging only a few hours behind the brightest part of the year.

However, there are other relevant parts of the Earth, such as the top few meters of ground, top 10-ish km of the troposphere, and the air masses elsewhere on Earth but at the same latitude.11Going even further afield, the yet more distant parts of the Earth have a response timescale even greater than a year, and so are not important to either seasonal or diurnal variations. Their temperatures respond to local variations in sunlight not in hours but in weeks or months, which is too slow for them to be relevant to the diurnal temperature cycle, but not too slow for the seasonal cycle.

Since there are many components to the Earth system with many different characteristic response times to variations in forcing, we expect that forcing of any period will create a lagged response: if there were large weekly or monthly variations in sunlight then the response would be dominated by the portions of the atmosphere with a weekly or monthly characteristic time scale and we would observe the hottest part of the week or month to be moderately after the brightest part. This is what we would expect from most sufficiently messy systems: they will exhibit responses to all frequencies in their input.

Let us conclude by returning to the original question: why is the summer warmer than the winter? Because of the large lag of the hottest day in the summer after the sunniest day in the summer, it must be that the seasonal variation in temperature is dominated by objects that respond to variations in solar heating with a timescale comparable to the seasonal lag. Thus, explanation (2) is (more) correct: the summer is hotter due to the accumulation of more and more heat over the season.

The greenhouse effect part 5: Differences between model and reality

2022 September 12

  1. Part 1: What is the greenhouse effect? An accessible, scientific introduction
  2. Appendix A: What is the atmosphere?
  3. Appendix B: Ozone
  4. Part 2: Physics of light and temperature
  5. Part 3: Temperature of the Earth without an atmosphere
  6. Part 4: A model of the greenhouse effect
  7. Part 5: Differences between model and reality

We have considered several simple examples of atmospheric models where we were able to exactly calculate the temperature that the surface of the Earth would have to maintain the system in equilibrium. While the greenhouse effect acts in the real world to raise the temperature of the Earth in just the same way as it acts in the model, there are many additional complications to the real climate system that our simple examples do not represent. With the aid of detailed observations of the atmosphere it is possible to build sophisticated computer simulations of the climate; but our simple model is sufficient to understand the basic principles underlying the greenhouse effect.

In part 4 we introduced a “single layer” model, that is, we assumed that the atmosphere was at a single uniform temperature. The real atmosphere has temperatures that vary wildly with height, and the greenhouse gases in the atmosphere are only sensitive to specific wavelengths of infrared light, with different gases located in different concentrations in different parts of the atmosphere.

We briefly discuss the structure of the atmosphere; a more detailed explanation can be found in appendix A. The lowest layer of the atmosphere, called the troposphere, is the bottom 10 to 15 km of the atmosphere and contains about 80% of its mass. The troposphere is well-mixed because it is heated from below12This fact is how real-world greenhouses work, which is totally unrelated to the greenhouse effect. A greenhouse prevents the air near the ground from rising and mixing with the air above, causing hot air to be trapped near the surface. It has nothing to do with blocking infrared radiation, as can be demonstrated by placing small vents in the roof and sides of a greenhouse, which causes it to cool to ambient temperatures.. While infrared radiation is one way that energy goes upwards in the troposphere, another major component is hot air rising. Above the troposphere is the stratosphere, which is well stratified into distinct layers because it is heated from above by the ozone layer.

The main greenhouse gases located in the Earth’s atmosphere are water vapor, carbon dioxide, methane, and ozone, listed in decreasing order of their contribution to the greenhouse effect. Ozone is mostly located in the ozone layer in the stratosphere. While ozone only has a small greenhouse effect, it plays a very important role in Earth’s climate and ecosystem because it absorbs ultraviolet radiation.

Methane is a very efficient greenhouse gas compared to carbon dioxide, but it is present at a much lower concentration. Most methane released into the atmosphere decays to carbon dioxide within about 10 years. Levels of atmospheric methane today are approximately 3 times the natural amount, with the main sources being natural gas mining and livestock.

Carbon dioxide is a very stable gas that is only removed in significant amounts through photosynthesis and diffusion into the surface of the ocean. When excess carbon dioxide is added to the atmosphere, most of it remains for hundreds of years, and some remains for tens of thousands of years. Because of its stability, carbon dioxide is well-mixed through all layers of the atmosphere and plays a key role in the greenhouse effect.

Today carbon dioxide is at about 415 ppm (as of 2022) in the Earth’s atmosphere, of which about 135 ppm is from artificial sources. Natural levels of carbon dioxide vary from 180 ppm to 280 ppm on a timescale of about 100 000 years13Before the industrial revolution, the natural level of carbon dioxide was already at roughly 280 ppm, at the high end of the natural cycle.. The main artificial source of carbon dioxide is fossil fuel burning.

Water is the most important gas in the atmosphere and has a direct effect on almost every aspect of the climate. While water is chemically unreactive in atmospheric conditions, it readily condenses from vapor into liquid or solid, making clouds, and sometimes precipitating out. Because of this, almost all water vapor is located in the warm air nearest the surface, with very little found at higher altitudes. Since water is mostly confined near the surface it does not have as large a greenhouse effect as it would if it were evenly mixed throughout the atmosphere.

Furthermore, water rapidly enters or leaves the atmosphere in response to changes in weather conditions, with warmer conditions typically increasing the amount of water. This gives other greenhouse gases a compounding effect: any warming caused by the addition of carbon dioxide to the atmosphere results in an increase in water concentration, causing further warming. In extreme cases this could cause a “runaway” greenhouse effect, as is thought to have happened to Venus. Finally, water forms clouds when it condenses, which have complicated and hard-to-understand effects on the climate, and are capable of either warming or cooling the Earth depending on their altitude.

These gases are capable of absorbing different wavelengths of infrared light, as seen below. The attenuation of light due to a gas is the fraction of light that would be absorbed by the gas – that is, how opaque it is, or the opposite of how transparent it is. For example, a photon with a wavelength of 13 microns emitted by the surface of the Earth going directly upwards has about a 50% chance of being absorbed by a water molecule, assuming it is not scattered or absorbed by any other gas. At 15 microns we see that both water and carbon dioxide are capable of absorbing almost all light emitted by the surface of the Earth, whereas around 11 microns, called the infrared atmospheric window, most light emitted by the surface of the Earth passes to space without being absorbed by the atmosphere.

Although the atmosphere absorbs almost all emissions from the surface of the Earth at specific wavelengths, that does not mean that light of that wavelength is not emitted to space. Recall that in the single-layer model, even when the atmosphere was totally opaque to infrared radiation, infrared radiation still reached space because the atmosphere itself was emitting it. Therefore, at wavelengths like 7 microns and 15 microns where the atmosphere is very effective at absorbing the Earth’s radiation, almost all light at those wavelengths that reaches space was emitted by the atmosphere. Because the atmosphere is colder than the surface of the Earth, the atmosphere is less effective at emitting heat to space. In particular, this gives us another perspective for understanding the Earth’s effective temperature: the effective temperature measures the temperature of the part of the atmosphere that is emitting infrared radiation directly to space14Or more specifically, the average of the temperatures of the various parts of the Earth and atmosphere, weighted according to what proportion of the emissions to space come from that part..

Another way to describe this situation is that instead of the surface of the Earth radiating energy directly to space, energy flows from the surface to/through the atmosphere in the form of infrared radiation, and then from the atmosphere to space. The atmosphere itself has many layers, with the lower layers warming the upper layers. The surface of the Earth must be warmer than the atmosphere, as otherwise energy would not flow from the surface to the atmosphere; and likewise each layer of the atmosphere must be warmer than the layers above it so that energy will flow from the lower layers to the upper layers15Although note that this simplified explanation ignores the ozone layer, where temperatures actually increase with height. Ozone absorbs ultraviolet radiation, which comes only from the Sun, and not from the Earth, so this causes the reverse behavior of temperature increasing with height.. The more layers there are, the hotter the surface of the Earth needs to be to push the same amount of heat outwards. Since the surface continues to receive the same amount of energy from the Sun no matter how many layers of greenhouse gases are added, if not enough heat is being expelled from the surface, then the surface will simply heat up until enough is.

The total effects of the greenhouse gases in the Earth’s atmosphere can be seen in the following figure. The figure presents typical infrared emissions from the Earth when the surface of the Earth is 294.2 K (21 C, 70 F) and there are no clouds. The blue line illustrates the emissions of a perfect blackbody the same size and shape as the Earth at 294.2 K. In the absence of the atmosphere, the Earth’s emissions would be very close to the blue line; note that the emissions are closest to the blue line in the infrared atmospheric window around 8 to 12 microns. However, in the presence of the atmosphere, certain regions of the Earth’s emission spectrum are dominated by emissions from the atmosphere, which is cooler than the surface of the Earth. The green and red lines show blackbody spectra of other temperatures; for example, this allows us to estimate that the emissions near 15 microns, which are caused by carbon dioxide, come from a layer of the atmosphere with a temperature near 225 K (-48 C, -55 F).

Observe the close relationship between the attenuation of the four gases, particularly water and carbon dioxide, and the features of the Earth’s infrared spectrum. The wavelengths that are strongly attenuated by the gases are those that have much lower emissions to space. Particularly, notice that at wavelengths where water strongly attenuates, the spectrum has a temperature of about 255 K, and at wavelengths where carbon dioxide strongly attenuates, the spectrum has a temperature of about 225 K. This suggests that the highest levels of the atmosphere with high concentrations of water typically have a temperature of around 255 K, whereas the highest levels of the atmosphere with sufficient carbon dioxide typically have a temperature of around 225 K.

Since water is typically only found in the lowest parts of the atmosphere (see appendix A), whereas carbon dioxide is uniformly found throughout, in the region near 15 microns where both water and carbon dioxide strongly attenuate, it is carbon dioxide that determines the temperature of the emissions to space. (The small spike right at 15 microns is due to carbon dioxide found in the upper stratosphere.)

What happens if carbon dioxide is increased?

The previous figure of the Earth’s spectrum was made using a radiative transfer model, which uses the same ideas presented in part 4 but with much greater sophistication. Given the temperature of the surface and a description of the temperature and chemical composition of each level of the atmosphere, the radiative transfer model highly accurately computes the emission and absorption of radiation at every level, taking into account the different properties of each chemical at every wavelength16Our model of part 4 only considered two wavelengths of light, shortwave and longwave. However the radiative transfer model, LBLRTM, simulated 15 million different wavelengths, which was smoothed to about 1000 wavelengths in the figure – without this smoothing the graph would have been so spiky as to be totally unreadable. Numerous other details we ignored in our simple model were properly simulated by LBLRTM. The atmospheric composition used is the US Atmospheric Standard of 1976, which defined a carbon dioxide concentration of 314 ppm, far below the current value of 415 ppm as of 2022..

So, what happens if the amount of carbon dioxide is increased in the atmosphere? As we saw in the previous sections, adding carbon dioxide should increase the temperature of the Earth; but with the use of a highly accurate radiative transfer model, it seems that it should be easy to give a quantitative and exact answer. Indeed, if we re-run the radiative transfer model with the same surface temperature but a higher concentration of carbon dioxide, we find that the “holes” in the emission spectrum corresponding to attenuation from carbon dioxide become slightly deeper and wider, so that less infrared emissions reach space, and a higher surface temperature is needed to maintain the same infrared emissions.

While this gives a first decent estimate of the increase in temperature due to an increase in carbon dioxide, it assumes that the temperature and composition of the atmosphere is unchanged by the addition of carbon dioxide. However, as the surface temperature rises, the temperature of the lowest layers of the atmosphere also rises. Since warmer air holds much more water, this increases the amount of atmospheric water, which is the most potent greenhouse gas.

This process by which any warming causes an increase in atmospheric water, thereby causing further warming, is called the water vapor feedback process. This makes the climate more sensitive to changes in temperature – any perturbation is amplified. Fortunately, this amplification is self-limiting, and does not compound upon itself endlessly17It is thought that Venus entered a water vapor feedback process that was not self-limiting, and just grew forever in a runaway greenhouse effect until its oceans boiled away entirely. It is also expected that Earth will eventually enter a similar runaway greenhouse effect in about one billion years, and that artificial climate change will not be able to trigger this early..

It is the existence of climate change feedbacks like the water vapor feedback that makes it very challenging to accurately predict how much the temperature will rise when carbon dioxide is added to the atmosphere. Water vapor feedback is an example of a linear feedback process where every time a bit of carbon dioxide is added, the amount of water vapor goes up a bit in response. Much more difficult are nonlinear feedback processes, particularly so-called “tipping points” which have no observable effect on the climate until a critical threshold is reached, at which point there is a very large feedback effect.

Many of these nonlinear feedback processes are poorly understood and difficult to predict, and for some it is not even known whether they exist. One widely speculated process regards the release of methane clathrates, which are vast reserves of methane trapped in ice buried under the ocean floor, particularly in the arctic. Estimates of the size of these reserves vary widely, but are typically on the order of thousands of gigatons of carbon. If this methane were to be released to the atmosphere by the melting of the ice, it would cause a large and rapid rise in temperature. Some scientists believe that a process like this was responsible for certain sudden climate changes in the past18Specifically the Permian-Triassic extinction event of 252 million years ago and the Paleocene-Eocene Thermal Maximum of 55 million years ago., and measurements have found an increase in methane releases in the arctic, but the role of methane clathrates in climate change is still unknown.

A nonlinear feedback process which scientists have more understanding of is the ice-albedo feedback, a process by which rising temperatures cause ice to melt, which lowers the albedo of the Earth (as ice is very reflective), which then warms the planet further. This is particularly important in the arctic, where highly-reflective ice covers the sea, which has very low reflectivity. A simplified perspective of the arctic ice-albedo feedback is the idea that the arctic supports two stable states: a high ice state, where cold global temperatures allow a large ice cap, whose high albedo encourages cold temperatures; or a low ice state, where warm global temperatures cause a small ice cap, so that the low albedo of dark ocean waters encourages warm temperatures. It is speculated that intermediate ice levels are unstable, which explains why the Earth has sharply transitioned between glacial periods (popularly called “ice ages”, although that term has a different technical meaning) and interglacial periods instead of smoothly varying. Again, the extent to which this will play a role in the climate response to an increase in carbon dioxide is hard to predict.

The greenhouse effect part 4: A model

2022 September 05

  1. Part 1: What is the greenhouse effect? An accessible, scientific introduction
  2. Appendix A: What is the atmosphere?
  3. Appendix B: Ozone
  4. Part 2: Physics of light and temperature
  5. Part 3: Temperature of the Earth without an atmosphere
  6. Part 4: A model of the greenhouse effect
  7. Part 5: Differences between model and reality

In this part we will use what we have learned about light and temperature to create a simple model of the Earth with its atmosphere, which we can solve to see what effect changing the atmosphere has on the temperature of the modeled Earth.

The model is not quantitative, meaning that it is not accurate enough for the numerical results from the model to agree with the true numbers, but it is qualitative, meaning that the overall behavior of the model agrees with the overall behavior of the Earth. Therefore by understanding how the model works we can improve our understanding of the Earth system; in particular, the way the greenhouse effect works in the model is the same way that the greenhouse effect works on the Earth.

Simple model of Earth with atmosphere

Consider the following figure of the Earth19We mean “the surface of the Earth” when we say “the Earth”, as the interior of the Earth only very slowly exchanges heat with the surface, so it can be ignored. and its atmosphere, with energy flowing between them.

The arrows in the diagram represent energy flowing between the different objects in the form of light; we ignore other forms of energy transfer20All energy exchanged with the Sun or with space is in the form of light, but some of the energy exchanged between the Earth and the atmosphere is in other forms. In particular, hot water molecules that physically move from the surface into the air bring a large amount of energy with them, called latent heat. Heat conduction plays a lesser role. We also omit geothermal heating, which is energy flowing from the interior of the Earth to the surface. This is estimated to be 47 TW, or 0.092 watts per square meter.. On the left of the diagram is shortwave radiation, that is, visible light. The variable I is the rate at which energy from the Sun is being absorbed by the surface of the Earth; we assume that none of this shortwave radiation is absorbed by the atmosphere. We omit from the diagram light from the Sun that strikes the Earth (or clouds in the atmosphere) and is reflected to space; I is only the portion that is absorbed. Recall from part 3 that

I = \pi R^2 (1 - \alpha) S.

On the right of the diagram is longwave radiation, that is, infrared light. The variable A_\text{up} is the rate at which radiation leaves the top of the atmosphere to space; some of this is emitted by the atmosphere, and the rest is emitted by the Earth and passed through the atmosphere. The variable A_\text{down} is the rate at which radiation emitted by the atmosphere strikes the Earth. Finally, the variable E is the rate at which radiation is emitted by the Earth, which will either be absorbed by the atmosphere or transmitted to space.21We briefly remark on the arrows that are absent from the diagram. The most interesting omission is the arrow from the Sun to the atmosphere; we have already commented on that. A tiny fraction of the light emitted to space goes on to strike the Sun or other bodies, but we are uninterested in where exactly it goes once it leaves the Earth. Space is filled with cosmic microwave background radiation, so there should be arrows representing microwave light from space to each of the other objects, but the amount is so tiny as to be totally insignificant – only 1.6 GW of it reaches the Earth, or 3 microwatts per square meter. Finally, the Sun emits a tremendous amount of light into space that does not strike the Earth, but we are not interested in that.

As discussed in part 3 we are interested in the equilibrium of the system, that is, when the total amount of energy entering each object equals the energy leaving that object.

So far we have not made any assumptions about the physics of the atmosphere, so there is insufficient information to solve for the model’s equilibria. However before we go further let us look at what we can conclude so far.

Let T be the average22Whenever we say the “average” temperature of a (spherical) object in the context of blackbody radiation, we mean the fourth root of the arithmetic mean of the fourth power of the surface temperature, weighted by surface area and emissivity. That is, we use exactly the average that makes the Stefan-Boltzmann law work with the result. For objects like the Earth, where the temperature does not vary tremendously from one location to another, this average is close to the ordinary arithmetic mean. For tidally-locked or slowly rotating objects like Mercury or the Moon, the distinction can be very important. temperature of the Earth. Then as before, we know that

E = 4 \pi R^2 \sigma T^4

where R is the radius of the Earth.

At this point in the calculations in part 3 we solved for the effective temperature T_e that satisfied the equality E = I. However, the introduction of the atmosphere changes this equation. Since the Earth is at equilibrium, the total flow of energy in and out of the surface is equal, giving the equality

\begin{aligned}
    E &= I + A_{\text{down}} \\
    4 \pi R^2 \sigma T^4 &= \pi R^2 (1 - \alpha) S + A_{\text{down}},
\end{aligned}

where the additional term A_{\text{down}} is downward heating from the atmosphere, which has the effect of increasing the temperature T of the surface of the Earth, demonstrating the greenhouse effect. Our goal is to find the temperature T of the Earth, which will require calculating A_{\text{down}} and A_{\text{up}}.

Since the atmosphere is at equilibrium, we also have

E = A_{\text{up}} + A_{\text{down}}.

Combining this equation with E = I + A_{\text{down}} gives us I = A_{\text{up}}, which is to say that the total amount of sunlight the Earth absorbs equals the total amount of light it emits to space, much as we would expect23In fact, direct measurements of light going in and out of the Earth agree with each other up to the accuracy with which they can be measured..

Simple model: no greenhouse gas

To solve for the temperature T of the Earth, we need the values of A_{\text{down}} and A_{\text{up}}, or equivalently we need to know what fraction of the energy E emitted by the Earth is absorbed by the atmosphere or transmitted through it to space. We know that E = A_{\text{down}} + A_{\text{up}} but that is not sufficient information by itself. To find these values we need some kind of additional assumption about the physics of the atmosphere.

The simplest assumption is the total absence of gases that interact with infrared light, so that the atmosphere has no greenhouse gas. In this case, all light from the Earth is transmitted directly to space, and the atmosphere neither absorbs nor emits any infrared light.

In this case, we have E = A_{\text{up}} and A_{\text{down}} = 0. Then E = I and we get exactly the situation of part 3, so that the temperature of the surface of the Earth equals the effective temperature: T = T_e.

Simple model: atmosphere has one layer, with ample greenhouse gas

The next simplest assumption that can be made about the Earth’s atmosphere is to assume that it fully absorbs all upwards emissions from the Earth while maintaining a single uniform temperature, which is usually described by saying it has a “single layer”. Thus A_{\text{up}} consists only of emissions from the atmosphere. As the atmosphere has a uniform temperature, it emits the same amount of energy upwards and downwards, so we get A_{\text{down}} = A_{\text{up}} and

A_{\text{down}} = \frac 12 E.

Combining with E = I + A_{\text{down}} we get

\begin{aligned}
    E &= I + \frac 12 E \\
    E &= 2I \\
    4 \pi R^2 \sigma T^4 &= 2 \pi R^2 (1 - \alpha) S
\end{aligned}

so

T = 2^{1/4} T_e = 1.1892 T_e = 303 \text{ K}.

This is 30 C or 86 F, somewhat above the true average temperature of the Earth.

Simple model: atmosphere has one layer, with only some greenhouse gas

Let us assume again that the atmosphere is a single layer of uniform temperature, but now suppose it does not have sufficient greenhouse gas to fully absorb all infrared emissions from the Earth. Let \lambda be the fraction of such emissions that are absorbed, so that the atmosphere absorbs \lambda E and transmits (1 - \lambda)E. Since half of the atmosphere’s emissions are downwards (by the uniform temperature assumption again), we get A_{\text{down}} = \frac \lambda 2 E, so that

\begin{aligned}
    E &= I + \frac \lambda 2 E \\
    E &= \frac 1 {1 - (\lambda / 2)} I,
\end{aligned}

and therefore T = (1 - (\lambda / 2))^{-1/4} T_e.

When \lambda = 0, we recover the situation of the model with no greenhouse gas, and get T = T_e. When \lambda = 1, we find again the result of the model with ample greenhouse gas, and get T = 2^{1/4} T_e. As we vary the amount of greenhouse gas in the atmosphere from none to ample, \lambda varies between 0 and 1, and the temperature calculated from the model varies from T_e to 2^{1/4} T_e, increasing as greenhouse gas is added. This illustrates how adding greenhouse gases to the atmosphere causes the temperature to rise.
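To put numbers on this, here is a tiny sketch that evaluates T = (1 - \lambda/2)^{-1/4} T_e for a few values of \lambda, taking T_e \approx 255 K for the Earth's effective temperature (the standard value, consistent with the 303 K figure above):

```python
# Surface temperature of the one-layer model as the absorbed fraction lambda varies.
T_e = 255.0  # K, effective temperature of the Earth (approximate standard value)

for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    T = (1 - lam / 2) ** -0.25 * T_e
    print(f"lambda = {lam:.2f}:  T = {T:5.1f} K ({T - 273.15:5.1f} C)")
```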

Multiple layers

With the single-layer assumption we saw that the Earth’s temperature increased with additional greenhouse gases up to a fixed maximum. However, as more and more greenhouse gas is added to the atmosphere, the assumption that the atmosphere can maintain a uniform temperature would break down as the emissions from the Earth can’t penetrate past the very bottom of the atmosphere.

If we instead approximate the atmosphere as being composed of multiple layers, each fully absorbing all incident infrared radiation and each having its own uniform temperature, then with n such layers of greenhouse gases the surface temperature rises to (n + 1)^{1/4} T_e: each additional layer raises the temperature further, though by a diminishing amount (for example, fifteen layers are needed to double the surface temperature).

Similarly, we could modify the model to accommodate anti-greenhouse gases24Hazy conditions can have an anti-greenhouse-like effect, although the mechanism is not exactly the same; for example, major volcanic eruptions cool the Earth for a few years by putting sulfur aerosols in the stratosphere. which absorb shortwave radiation and transmit longwave radiation. Each such layer of anti-greenhouse gases decreases the temperature by a factor of 2^{-1/4}, countering the effects of one layer of greenhouse gases.

The greenhouse effect part 3: Temperature of the Earth without an atmosphere

2022 August 29

  1. Part 1: What is the greenhouse effect? An accessible, scientific introduction
  2. Appendix A: What is the atmosphere?
  3. Appendix B: Ozone
  4. Part 2: Physics of light and temperature
  5. Part 3: Temperature of the Earth without an atmosphere
  6. Part 4: A model of the greenhouse effect
  7. Part 5: Differences between model and reality

Equilibrium temperature

So far we have been using temperatures without any discussion about what temperature is or what it measures. Though the technical details of temperature are tricky, in everyday life and for our purposes we can consider temperature as a measure of how much thermal energy is in an object. There are two important properties of temperature: adding heat to an object causes its temperature to increase25One major exception is phase changes like ice melting or liquid water evaporating; another of course is black holes, as mentioned earlier., and when two objects are allowed to freely exchange energy then heat will flow from the hotter object to the colder object. As we discussed in part 2, if these two properties were not true then life would not be possible.

We say an object is in thermal equilibrium with its surroundings if its temperature is not changing over time. This is the same thing as saying there is no net flow of heat in or out of the object. What happens if an object was at thermal equilibrium but is transiently heated up? The object is hotter than before and will therefore lose heat faster than before; for example, through blackbody radiation, which is greater the hotter an object is. Since the object will now have a net flow of heat outwards, it will cool down over time. Similarly, if an object was at thermal equilibrium and is transiently cooled down, it will experience a net flow of heat inwards and heat up over time.

Therefore we expect that in general an object will have a unique equilibrium temperature, that is, the one temperature at which it would be at thermal equilibrium with its surroundings. Above this temperature, the object will cool down, and below this temperature, the object will heat up, and at this temperature the object will be in thermal equilibrium. The equilibrium temperature depends on the object’s surroundings and the interactions between the object and its surroundings. If these change, then the equilibrium temperature can also change. For example, consider a room heated by a heater on a cold day; the room reaches some steady temperature at which the heat it gains from the heater is balanced by the heat lost to the outside. Opening a window decreases the equilibrium temperature, so the room will cool off until it approaches this new equilibrium temperature.

By analyzing the equilibria of a system we often find it much easier to understand how the system changes over time. The most direct approach to studying the changes in a system, which I would call the “dynamic” method, is to determine the current state of the system and how that state is changing. For example, if we know the current temperature of an object, and we know how the temperature is changing, then we can predict the future temperature of the object. Alternatively we can analyze the equilibria of a system, which I call the “static” method. The static method loses information about transient fluctuations or detailed behavior of the system, but it often is better at giving an overall understanding of the behavior of the system. The static method is also usually easier and more robust to model error26Model error refers to details of the real-world system which are omitted in the mathematical model. Regardless of how detailed and precise the model is, for a model to be useful there will always be some further detail that is missing..
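To make the contrast concrete, here is a minimal Python sketch of the dynamic method for a single object; the linear relaxation law and all of the numbers are invented purely for illustration, and the static method would skip the loop entirely and simply report the equilibrium value.

# The "dynamic" method: step the temperature forward in time from its current
# value, given a rule for how it changes. The relaxation law and the constants
# here are made up for illustration only.

def step(T, T_eq, rate, dt):
    """One Euler step: the temperature relaxes toward its equilibrium value."""
    return T + rate * (T_eq - T) * dt

T = 295.0     # current temperature, in kelvin
T_eq = 280.0  # equilibrium temperature, in kelvin
rate = 0.2    # relaxation rate, per day (invented)
for day in range(60):
    T = step(T, T_eq, rate, dt=1.0)
print(round(T, 2))  # approaches 280.0, which is all the static method would report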

In the specific example of studying the temperature of the Earth, the dynamic method requires knowing the current temperature of the Earth, exactly how much energy the Earth is receiving from the Sun at each point in time, and exactly how much energy the Earth is losing to space at each point in time. Knowing these three things, we can calculate the temperature of the Earth at any point in the future. Of course, this is totally infeasible because, for example, the amount of sunlight reflected back into space depends on how many clouds there are and what shape they have, which rapidly changes within hours. If we tried to make predictions in this way they would become wildly inaccurate almost immediately, because any small error compounds upon itself (which is not to suggest that such predictions are impossible, just that this naive approach is not viable).

Alternatively, using the static method we first measure how much energy the Earth is typically receiving from the Sun, and then calculate the temperature at which the Earth would emit as much energy as it is receiving. This calculation tells us the equilibrium temperature of the Earth, that is, the temperature at which it radiates the same amount of energy as it receives, so that the Earth’s temperature does not change. While this method does not explain any oscillations or fluctuations around the predicted equilibrium temperature, it does capture the most important features that are relevant to the Earth’s temperature.

Temperature of the Earth with no atmosphere

We now have the pieces to understand what temperature the Earth would have in the absence of an atmosphere. The Earth gains energy through light received from the Sun; some of this light is reflected back to space but most of it is absorbed by the Earth’s surface. The proportion of light reflected back to space is called the albedo of Earth, and it equals approximately 30%27Of course, if the Earth had no atmosphere it would have a significantly different albedo, among many other major differences.. The Earth loses energy by radiating it to space due to the blackbody effect, that any warm object emits light; the amount of energy lost this way depends on the Earth’s temperature. The equilibrium temperature of the Earth is the temperature at which the energy gains from the Sun are equal to the energy losses due to the blackbody effect.

The rate at which energy is received from the Sun is equal to

\pi R^2 (1 - \alpha) S

where R is the radius of the Earth, \alpha is the albedo of the Earth, and S is the insolation of the Earth (the amount of energy in sunlight received by the Earth, per area and per time). The reason for the factor of \pi R^2 is that, from the perspective of the Sun, the Earth appears to be a disc of radius R, so that \pi R^2 is the total effective area illuminated by the Sun28While the surface area of the Earth is 4 \pi R^2 and half of that is exposed to sunlight at any time, the amount of sunlight a location receives depends on the angle the Sun is above the horizon, and the average illuminated location receives half as much light as it would receive under direct, full sunlight.. Since the albedo \alpha is the proportion of light reflected back to space, 1 - \alpha is the proportion absorbed.

From the Stefan-Boltzmann law, the rate at which energy is radiated from the Earth due to the blackbody effect is

4 \pi R^2 \sigma T^4

where again R is the radius of the Earth, \sigma is the Stefan-Boltzmann constant, and T is the temperature of the Earth. Here 4 \pi R^2 is the surface area of a sphere with radius R.

At the equilibrium temperature T_e the energy received and the energy emitted are equal, so

\pi R^2 (1 - \alpha) S = 4 \pi R^2 \sigma T_e^4.

If we know the values of \alpha, S, and \sigma we can solve for T_e:

T_e = \left(\frac {1 - \alpha}{4 \sigma} S\right)^{1/4}.

Observe that the radius R of the Earth has no effect; if the Earth were larger, it would absorb more sunlight and also emit more blackbody radiation.

If we use sensible values29We take albedo \alpha = 0.3, insolation S = 1366 \text{ W m}^{-2}, and Stefan-Boltzmann constant \sigma = 5.67 \cdot 10^{-8} \text{ W m}^{-2} \text{K}^{-4}. As per a previous note, we use an emissivity \epsilon = 1. With a more realistic \epsilon = 0.96, we get T_e = 257 K. for \alpha, S, and \sigma, we compute the equilibrium temperature

T_e = 255 \text{ K}

or -18 C or -1 F.
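If you want to check the arithmetic yourself, the calculation is a couple of lines of Python, using the values from the footnote above (and emissivity 1):

# Equilibrium temperature of the Earth without an atmosphere, using the values above.
alpha = 0.3        # albedo
S = 1366.0         # insolation, in W / m^2
sigma = 5.67e-8    # Stefan-Boltzmann constant, in W / m^2 / K^4

T_e = ((1 - alpha) * S / (4 * sigma)) ** 0.25
print(round(T_e))  # 255 (kelvin)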

The temperature of an astronomical body computed in this way, by ignoring all atmospheric effects, is called the effective temperature; specifically, we find the effective temperature of an object by measuring how much light it emits and using the Stefan-Boltzmann law to determine the temperature needed to emit that much light. In fact, when we previously spoke about the temperature of the surface of the Sun, we really meant the effective temperature of the Sun30Like the gas planets, the Sun does not have a solid surface, but instead gradually becomes denser and more opaque closer to the center. The “surface” is defined somewhat arbitrarily in terms of a certain level of opacity. The exact temperature at this depth would be difficult to measure, but is likely very close to the effective temperature.. A distant astronomer attempting to measure the temperature of the Earth would be measuring its effective temperature.

The effective temperature of the Earth differs from the true temperature of the surface of the Earth in two important ways:

First, the actual surface of the Earth is considerably warmer than the effective temperature; this difference is caused by the atmosphere, and in particular by the greenhouse effect, which is the subject of the next part.

Second, the effective temperature can be thought of as a suitable average31Rather than the typical arithmetic mean, the fourth root of the arithmetic mean of the fourth powers is the suitable average. This is always warmer than the arithmetic mean, although not significantly so for the Earth. of the temperature across all locations. These spatial variations are important to weather and the climate but do not directly pertain to the greenhouse effect, so we will not discuss them here.

The greenhouse effect part 2: Physics of light and temperature

2022 August 22

  1. Part 1: What is the greenhouse effect? An accessible, scientific introduction
  2. Appendix A: What is the atmosphere?
  3. Appendix B: Ozone
  4. Part 2: Physics of light and temperature
  5. Part 3: Temperature of the Earth without an atmosphere
  6. Part 4: A model of the greenhouse effect
  7. Part 5: Differences between model and reality

Light

Light, also called radiation or electromagnetic radiation, is a familiar aspect of our daily lives. Light can be emitted, absorbed, reflected, transmitted, or refracted by objects, and travels in a straight line otherwise. When light strikes our eyes, it is absorbed and gives us information about the object that emitted or reflected that light, which we call “seeing” the object. Light is a type of energy.

A beam of light is made of many individual photons, which are indivisible parcels of light. While the brightness of a beam of light depends on how many photons are in it, not all photons are the same. The properties of a photon can be described with a single number, its wavelength; two photons with the same wavelength are indistinguishable32Instead of wavelength, photons are sometimes described by their frequency or energy. These are related by E = h \nu = h c / \lambda, where E is the energy of the photon, \nu is the frequency, \lambda is the wavelength, h = 6.626 \cdot 10^{-34} \text{J}\cdot\text{s} is Planck’s constant and c = 3 \cdot 10^8 \text{m}/\text{s} is the speed of light. For consistency we will only use wavelength..
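As an aside, the relation in the footnote makes it easy to compute the energy carried by a single photon; the wavelength below (0.5 microns, green light) is just an arbitrary example:

# Energy of a single photon from its wavelength, using E = h c / lambda from the footnote.
h = 6.626e-34  # Planck's constant, in J * s
c = 3e8        # speed of light, in m / s

wavelength = 0.5e-6        # 0.5 microns (green light), an arbitrary example
print(h * c / wavelength)  # about 4e-19 joules per photon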

The wavelength of a photon affects what objects it can interact with. For example, radio waves, which have a wavelength of 1 meter or more, can pass through walls but will interact with the antenna of a radio receiver; and the microwaves in a microwave oven, which have a wavelength of 12.2 centimeters, are easily absorbed by the water in food to heat it up but cannot pass through the small holes in the metallic screen covering the door33The interior of the microwave oven is surrounded on all sides by metal, forming a Faraday cage from which microwaves cannot escape..

In particular, certain wavelengths of light are able to interact with the light-sensitive cells (the “rod” and “cone” cells) in our eyes to produce sight; light of these wavelengths is called visible light, while light of other wavelengths cannot be seen34Except that very strong x-rays shone directly into the eye can appear faintly blue; this was discovered in 1895, before the dangers of x-rays were known.. The various wavelengths of visible light appear to our eye as different colors. Just as the chemicals in our food are perceived by us as having different tastes, the different wavelengths of visible light are perceived as different colors; and just as some chemicals are tasteless, some wavelengths of light cannot be seen at all.

While wavelength and color are closely related, the wavelength of a photon is a single number (a length) that describes its physical properties, but the color of a beam of light is a far more complicated property that depends on how light interacts with the human eye and how the brain interprets this interaction, and therefore is also slightly different from person to person. The longest wavelength a photon can have and still be detectable to the human eye is about 0.7 microns35A micron is one millionth of a meter. “Micron” is short for “micrometer”, which is also written 1 μm. A micron is about a hundred times smaller than the thickness of a sheet of paper, and is a bit smaller than the typical bacterium. We will usually use microns to describe wavelengths of light.; light of this wavelength appears dull red. Longer wavelengths of light are infrared, microwave, and radio waves, which are invisible to the eye. The shortest visible wavelength is around 0.39 microns, which appears deep blue or violet; shorter wavelengths are ultraviolet, x-rays, and gamma rays. Light with wavelengths between 0.39 and 0.7 microns is visible.

Rainbows are formed by separating the photons in a beam of light according to their wavelength; so if all the photons in a beam of visible light have the same wavelength, then its color will be one of the colors of the rainbow. Other colors, such as pink, magenta, or brown, can only be formed by a beam of light containing a mixture of photons of different wavelengths. Two different mixtures can appear to be the same color; for example, light at 0.58 microns (which appears yellow) cannot be made by a standard computer monitor, but a monitor can make a mixture of red and green that appears that same yellow color to the human eye. Some colors, called impossible or imaginary colors, cannot be produced by any beam of light, but can be seen for example using afterimages.

The mix of wavelengths that are in a beam of light is called the spectrum of that beam. We can graph a spectrum by showing how much energy the beam has at different wavelengths; where the spectrum is low, the beam has very few photons around that wavelength, and where the spectrum is high, the beam has many photons around that wavelength. Two simple examples of spectra are shown below.

To illustrate the usefulness of spectra for understanding light, the following figure shows real-world observations of the spectra of light from four commercially available light bulbs. The shaded region indicates visible wavelengths of light. We can see from the spectra that the incandescent and halogen bulbs emit many more photons of long wavelengths; they will appear “warmer” (that is, with tones of yellow or red). In comparison, the fluorescent and LED bulbs will appear “cooler” (more white or blue). We can also see that the incandescent and halogen produce a great deal of infrared light that is not visible to the eye. While more than 99% of the light produced by the LED is visible light, only 15% of the light produced by the incandescent bulb and 10% of the light produced by the halogen bulb is visible. This contributes to the much greater efficiency of fluorescent and LED lighting.

Spectra of real-world observations of four commercially available light bulbs. The shaded region indicates the portion of the spectrum that is visible to the eye; the other light is wasted for purposes of illumination. The spectra have been scaled so that all four contain the same amount of energy in the visible region; each of the four bulbs would have the same brightness to the eye. The LED and fluorescent lights emit almost all of their light as visible radiation, which contributes to their excellent efficiency. By contrast, the incandescent and halogen bulbs waste a great deal of energy emitting infrared light.

The simplest possible bulb, the incandescent light, works by heating an object (the filament) until it is so hot that it glows visibly due to the blackbody effect, which is discussed below. The close match of the observed spectrum of the incandescent light with the theoretical spectrum of a blackbody at a temperature of 2950 K (2667 C, 4850 F), also shown in the figure, suggests that the filament of the incandescent bulb used in the experiment had a temperature near 2950 K. The only commonly available materials with a melting point above that are carbon and tungsten; while the first light bulb filaments used carbon, since around 1906 tungsten filaments have been used almost exclusively.

The poor match of the observed spectrum of the LED and fluorescent spectra with a theoretical blackbody spectrum shows that those two bulbs use a different physical process to emit light.

The relationship between temperature and light: blackbody radiation

It has long been understood that hot objects feel hot from a distance, although a scientific explanation of this phenomenon has only come recently. This process is different from conduction, which is the movement of heat from a hot object to an object it is touching. For example, a fire in a fireplace feels warm from a distance even before the air in the room has begun to warm up, so the warmth is not due to conduction via the air. Furthermore, a person near the fireplace feels warmer on the side facing the fire even though the air on both sides is the same temperature. This also can be observed with a hot stovetop, which feels warm from the side even though rising hot air should only be felt directly above the stovetop.

This happens because every object emits light according to its temperature, an effect called the blackbody effect; light emitted in this way is called blackbody radiation36The first person to study blackbodies was Gustav Kirchhoff in 1860, who was unable to determine the formula for blackbody radiation but called it “a problem of the highest importance”; this proved true when the discovery of the formula led to the discovery of the photon by Max Planck and Albert Einstein around 1905, for which Einstein received the Nobel Prize.. Since light is a form of energy, emitting light causes an object to cool down37With a very few exceptions; for example, black holes warm up when they lose energy., and absorbing light causes an object to warm up.

It is a necessary fact of life that the hotter an object is the more blackbody radiation it emits; so if two objects of different temperature are placed near each other then the hotter object emits more light than it absorbs, cooling down, and the cooler object absorbs more light than it emits, warming up. If this were not true and instead hotter objects emitted less blackbody radiation, then hotter objects would get hotter and hotter, while cooler objects would get cooler and cooler, until everything in the universe would be either extremely hot or extremely cold38Since black holes become colder when they absorb energy, and they start colder than their surroundings, they just get even colder over time until they approach absolute zero; for example the black hole in the center of the Milky Way, called Sagittarius A^*, is approximately 1.7 \cdot 10^{-14} K. The rest of the universe is currently about 2.7 K, so Sgr A^* is gaining energy from its surroundings and continuing to cool down. Eventually the rest of the universe will cool down until it is even colder than Sgr A^*, so Sgr A^* will start losing energy over time and warm up until it ultimately explodes..

An exact formula for the amount of light emitted by an object by the blackbody effect was found in 1879, called the Stefan-Boltzmann law. The amount of light emitted is

A \sigma T^4,

where A is the area of the object, T is its temperature relative to absolute zero (for example, using Kelvin), and \sigma is a constant39Real-world objects actually emit slightly less light than indicated by this formula; the proper formula is A \epsilon \sigma T^4 where \epsilon is the emissivity of the object, a number between 0 and 1 that depends on the material the object is made out of and the wavelength of light that we are interested in. Most objects in daily life have an emissivity around 0.9 to 1 in infrared wavelengths. The surface of the Earth has an average emissivity of about 0.96 in infrared wavelengths. For simplicity we take \epsilon = 1 from now on.. As an example, doubling the temperature of an object multiplies its emissions by 2^4 = 16. Using this law we can calculate the temperature of an object by measuring its blackbody radiation; this is how infrared thermometers work. In fact, Josef Stefan used this law to give the first accurate estimate for the temperature of the surface of the Sun, 5700 K, by comparing sunlight to light emitted from a hot object with a known temperature. The true temperature is 5772 K.

As another example, we can use this law to estimate how much light is given off by an electric stovetop. A hot stovetop (or any object) begins to visibly glow a very dull red when it reaches around 800 K (527 C, 980 F), called the Draper point; compare this to room temperature of about 300 K (27 C, 80 F). Using the Stefan-Boltzmann law, we see that the hot stovetop emits (800 / 300)^4 \approx 50 times more light than it would at room temperature. From this we can crudely estimate that you need to be within \sqrt{50} / 2 \approx 3 stovetop-lengths of the stovetop to noticeably feel its light compared to surroundings at room temperature.
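This estimate is easy to reproduce; the snippet below just evaluates the Stefan-Boltzmann law at the two temperatures (taking the emissivity to be 1, as elsewhere in this part):

# Ratio of blackbody emission at the Draper point (800 K) to room temperature (300 K).
sigma = 5.67e-8  # Stefan-Boltzmann constant, in W / m^2 / K^4

def emitted_power(T, area=1.0):
    """Power radiated by a blackbody with the given area (m^2) and temperature (K)."""
    return area * sigma * T ** 4

print(round(emitted_power(800) / emitted_power(300)))  # about 51 times more light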

As well as knowing the total amount of blackbody radiation emitted at a specific temperature, we also know what wavelengths blackbody radiation has (that is, the radiation’s spectrum). An exact formula for how much each wavelength occurs in blackbody radiation is called Planck’s law40Discovered in 1900, and directly leading to Einstein’s prediction of the photon.. Planck’s law tells us that blackbody radiation is mostly near a specific wavelength, and that the hotter an object is the shorter that wavelength is. A room temperature object will mostly emit light near 14 microns, which is infrared light, and an object at the Draper point of 800 K will mostly emit light nearer 5 microns, which is still infrared. The Sun, however, mostly emits light of wavelengths near 0.7 microns, so it is easily visible to the eye.
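For the curious reader, Planck’s law can be stated compactly, although we will not need it in what follows: the energy emitted per unit wavelength near a wavelength \lambda by a blackbody at temperature T is proportional to

\frac {2 h c^2}{\lambda^5} \cdot \frac {1}{e^{h c / (\lambda k T)} - 1}

where h is Planck’s constant, c is the speed of light, and k is Boltzmann’s constant.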

Above we see a comparison of the emissions from a blackbody (such as a stovetop, which is a good approximation to a blackbody) at a temperature of 800 K against the same object with a temperature of 300 K; the object emits 50 times more light when hot. Almost all of the light emitted is infrared; the hot object emits 10 million times more infrared light than visible light, which is marked with the shaded region of the diagram. While we can’t see the infrared light at all, there is so much of it that we can feel it as heat when nearby. However, we have no trouble seeing the visible emissions of the stovetop because our eyes are tremendously sensitive at detecting visible light – so much so that laboratory tests have found that people can sometimes detect a single photon in ideal conditions.

An idealized comparison of the sunlight that reaches the Earth and of the light that the Earth emits to space is given above. While the amount of energy that enters the Earth equals the amount of energy that leaves the Earth, causing the Earth to be in balance, the wavelength of the incoming and outgoing light is different.

In the context of climate science and the greenhouse effect, the term longwave radiation is used to describe blackbody radiation emitted by the Earth (including the Earth’s surface, oceans, the atmosphere, clouds, etc.) and the term shortwave radiation is used to describe blackbody radiation emitted by the Sun. Since objects on the Earth tend to be much colder than the surface of the Sun, longwave radiation has much longer wavelength than shortwave radiation. Specifically, since objects on the Earth are about 20 times colder than the surface of the Sun, they emit radiation that has a wavelength about 20 times longer.

Because of the large gap between the wavelengths of longwave and shortwave radiation, certain chemicals will interact strongly with one type of radiation but not with the other. The most common gases in our atmosphere, oxygen and nitrogen, are transparent to both longwave and shortwave radiation. However greenhouse gases such as water vapor, carbon dioxide and methane are opaque to longwave radiation but transparent to shortwave radiation. These gases interfere with longwave radiation emitted by the Earth so that it does not cool as effectively, but do not interfere with shortwave radiation from the Sun, so they cause a net warming effect called the greenhouse effect. The details of this process will be explored in the following parts. Conversely, sulfur dioxide in the stratosphere can form sulfate particles that, like clouds, are partially reflective of sunlight and thus cool the Earth by decreasing how much shortwave radiation it receives.

The greenhouse effect appendix B: Ozone

2022 August 15

  1. Part 1: What is the greenhouse effect? An accessible, scientific introduction
  2. Appendix A: What is the atmosphere?
  3. Appendix B: Ozone
  4. Part 2: Physics of light and temperature
  5. Part 3: Temperature of the Earth without an atmosphere
  6. Part 4: A model of the greenhouse effect
  7. Part 5: Differences between model and reality

Ozone is a molecule with three atoms of oxygen, written O3. Ozone is a highly reactive chemical that is very damaging to the lungs in both brief and prolonged exposure; however, ozone is mostly located in the ozone layer, in the stratosphere, and is only rarely found near the surface.

Ozone located at the surface is mostly, though not entirely, due to man-made activity. It is formed when other dangerous pollutants such as NO and NO2 react with sunlight, and is typically associated with smog. Ozone pollution is frequently caused by car emissions and, in China and India, coal burning, but has been dramatically reduced in the West due to strict car emission standards.

The ozone in the ozone layer is both created and destroyed by reactions with sunlight. The intense ultraviolet radiation is strong enough to destroy the bond in O2, releasing free oxygen atoms, which quickly react with other O2 to form O3. Ozone is even better than O2 at absorbing ultraviolet radiation because its bonds are weaker, although it can be destroyed by the radiation in the process.

Because ozone absorbs ultraviolet radiation, the ozone layer greatly decreases the amount of ultraviolet light that reaches the surface. Therefore the ozone layer is beneficial to human health, and is speculated by some to even be necessary for complex life on land.

The location of the ozone layer is caused by a balance between intensity of sunlight and the amount of oxygen in the atmosphere. The ozone layer doesn’t go higher because the atmosphere becomes so thin that there isn’t enough oxygen for the reactions that make ozone to proceed quickly. The ozone layer doesn’t go lower because the ultraviolet light necessary to make ozone is all absorbed by the time sunlight gets that far.

The ozone hole

The ozone hole is an unnatural seasonal decrease in the amount of ozone in the ozone layer above Antarctica during spring each year. The ozone over Antarctica began to diminish around 1980, reaching one third of the natural amount by 1990; ozone levels in Antarctica have remained low since 1990, although there is some weak indication that ozone levels may have recently been increasing slowly.41The reason why the last bit of the Antarctic ozone layer was not destroyed is because the upper-most part of the Antarctic stratosphere does not have the right conditions for destroying ozone, and the stratosphere does not mix well. Ozone levels in the rest of the world have decreased by a lesser amount of around 5%.

Concern over the loss of ozone began in 1974 with the discovery that certain chemicals, called CFCs, could efficiently destroy ozone when exposed to ultraviolet radiation. CFCs are stable, non-reactive chemicals with low toxicity and a boiling point near room temperature; these properties made them well suited for use in aerosol cans, refrigerators, and fire extinguishers. Before the development of CFCs by Thomas Midgley Jr., highly toxic or explosive chemicals were often used for those purposes. Besides his research in CFCs, Midgley is best known for the development of leaded gasoline, which caused the worst man-made environmental crisis of American history (and arguably world history) by poisoning tens of millions of American children, and many more worldwide.42While the worldwide banning of leaded gasoline and the banning of leaded paint in the US and EU has greatly decreased the amount of lead in the environment, exposure to environmental lead continues to kill 140 000 people every year and contribute to 600 000 new cases of intellectual disability in children annually. However, the great stability of CFCs also meant that they could remain in the atmosphere for hundreds of years.

In 1984, on-the-ground observations in Antarctica revealed astonishingly low amounts of ozone in the stratosphere. NASA independently reported satellite measurements confirming the decline starting around 1980, although the satellite measurements were so low that they were initially believed to be a measurement error.

The announcement of the sudden destruction of the ozone layer in Antarctica brought immediate alarm, though the cause of the ozone hole remained unclear at first. It was soon confirmed that CFCs were responsible for the ozone hole through a previously unknown series of chemical reactions that were specific to the atmospheric conditions in Antarctica. In 1987 the Montreal Protocol was signed, banning CFCs globally, with the phase-out beginning in 1991. The Montreal Protocol is one of very few treaties to be ratified by every UN member and is directly responsible for averting disaster.

Today, the ozone hole poses a minor health hazard to people in and near Antarctica, and is possibly a small contributor to skin cancer worldwide. There is not much reliable scientific data on the effects of a missing ozone layer on human health, but a study has found that a region in southern Chile experiences 50% more skin cancer than normal. (The exact shape and size of the ozone hole varies from year to year, and it can sometimes reach South America.) The ozone hole also has a significant effect on the climate in Antarctica. Recall that the stratosphere is warmed by the ozone layer; in the absence of ozone, the stratosphere is cooler than usual, so wind circulation around Antarctica has strengthened significantly in response to the colder temperatures. These winds drive major ocean circulation patterns, so the stronger winds have changed the ocean circulation, leading to changes in sea ice formation, temperature, and precipitation in Antarctica. For this reason, the climate of Antarctica has been behaving anomalously for the last several decades compared to the rest of the world.

Computer models suggest that the ozone layer may return to normal levels around 2060. Without regulation of CFCs it is predicted that more than half of ozone worldwide would have been destroyed by then. The EPA estimates that the Montreal Protocol will have prevented 300 million cases of skin cancer and 2 million skin cancer deaths among Americans born before 2100.

While CFCs are greenhouse gases, their concentration is too low to contribute significantly to the greenhouse effect. Carbon dioxide does not have a significant effect on the ozone hole either directly or indirectly through climate change, and conversely the ozone hole does not contribute significantly to global climate change.

Ozone is itself a greenhouse gas, although because of its small concentration in the Earth’s atmosphere it only contributes about 5% to the greenhouse effect. Also, as ozone absorbs ultraviolet radiation it contributes an anti-greenhouse effect. The balance of these effects depends on the concentration and distribution of ozone.

The greenhouse effect appendix A: What is the atmosphere?

2022 August 08

  1. Part 1: What is the greenhouse effect? An accessible, scientific introduction
  2. Appendix A: What is the atmosphere?
  3. Appendix B: Ozone
  4. Part 2: Physics of light and temperature
  5. Part 3: Temperature of the Earth without an atmosphere
  6. Part 4: A model of the greenhouse effect
  7. Part 5: Differences between model and reality

The atmosphere is a mixture of gases that lies on the surface of the Earth. Like all gases, it tries to expand to fill the volume it is in, but it is drawn to the Earth by the Earth’s gravity. The balance between outward expansion and gravity, called hydrostatic equilibrium, causes the atmosphere to be thickest at the surface of the Earth and thinner at higher altitudes. There is no sharp upper boundary to the atmosphere: it just continues to become thinner. (Similarly, the Sun and the gas giants do not have a sharp boundary between them and space either.)

Half of the atmosphere is within 5.5 km (3.5 miles) of the surface of the Earth. A commercial plane at a cruising altitude of 11 km (7 miles) is above 80% of the Earth’s atmosphere, with only 20% of the Earth’s atmosphere between it and space.
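As a rough check on these figures, suppose the density of the atmosphere simply fell off exponentially with altitude, with a scale height of about 8 km (a commonly quoted value which I will not derive here). Then the fraction of the atmosphere below a given altitude is easy to compute:

# Crude model: density falls off as exp(-z / H), with an assumed scale height
# H of about 8 km. The real atmosphere is not isothermal, so this is only a
# rough consistency check on the figures quoted above.
import math

H = 8.0  # assumed scale height, in km

def fraction_below(z_km):
    """Fraction of the atmosphere's mass below altitude z_km in this crude model."""
    return 1 - math.exp(-z_km / H)

print(round(fraction_below(5.5), 2))   # about 0.5: half the atmosphere is below 5.5 km
print(round(fraction_below(11.0), 2))  # about 0.75 here; the quoted 80% reflects the
                                       # fact that the real atmosphere is not isothermal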

Gases in the atmosphere

The main components of the Earth’s atmosphere are as follows:

Gas Chemical Fraction of atmosphere
Nitrogen N2 78%
Oxygen O2 21%
Argon Ar 1%
Water H2O 0.25%
Carbon dioxide CO2 0.04%
Neon Ne 18 ppm
Helium He 5 ppm
Methane CH4 1.9 ppm
Ozone O3 0.4 ppm

All of these chemicals except water and ozone are roughly uniformly distributed through the atmosphere; that is, at any location at any altitude you expect the air to be about 21% oxygen.

Nitrogen, argon, neon, and helium are all chemically unreactive and do not play a significant role in the climate.

Oxygen, of course, plays an important role in life. In fact, the Earth’s atmosphere did not have any oxygen for the first two billion years, until photosynthesis created the oxygen we have in our atmosphere today. While oxygen does not directly have a significant role in the climate, its initial appearance in the atmosphere caused enormous climatic upheavals43This is called the Great Oxygenation Event, which may have involved a “snowball Earth” mostly or entirely covered in ice..

Carbon dioxide and methane are important greenhouse gases, and the effects of greenhouse gases are discussed in detail in the other parts. Carbon dioxide is mostly unreactive but is necessary for photosynthesis. Methane is only mildly reactive in atmospheric conditions, with a lifetime of about 10 years in the troposphere or 120 years in the stratosphere before it breaks down into carbon dioxide and water. Methane is the main source of water in the upper stratosphere.

Water is the most exceptional of the components of the Earth’s atmosphere because it can readily change between gas, liquid, and solid at conditions typical in the atmosphere. Liquid and solid water will sometimes, depending on conditions, form large enough droplets or crystals to fall out of the atmosphere, whereas water vapor of course does not. Since the transition between phases strongly depends on the temperature, and temperature changes greatly with altitude, this means that the amount of water varies strongly with altitude. In particular, almost all of the water in the atmosphere is very near the surface; water that rises far above the surface of the Earth typically cools so much that it condenses to liquid or solid form and falls. In comparison, most of the other gases listed above are evenly mixed through the whole atmosphere.

Furthermore, since water readily enters and leaves the atmosphere through evaporation and precipitation, the amount of water in the atmosphere changes rapidly in just weeks. We see this in our day-to-day lives when we notice that one day is much more or less humid than normal. The other gases listed above typically remain in the atmosphere for very long periods of time.

Layers of the atmosphere

For the purpose of scientific study the atmosphere is typically organized into a series of layers. The layers differ from each other in their temperature and the types of phenomena that occur in each layer.

The bottom 10 to 15 km of the atmosphere is called the troposphere, and contains about 80% of the atmosphere. The troposphere is heated from below because the bottom is touching the Earth’s surface, which is warmed directly by sunlight. Because the troposphere is heated from below and hot air rises, it is a very active part of the atmosphere, and almost all of the weather that we are interested in occurs here. In particular, almost all of the water is in the troposphere, so clouds predominantly appear in this layer. The name “troposphere” refers to the constant “turning over” of the air there.

Above the troposphere and continuing to 50 km is the stratosphere. The important difference between the troposphere and the stratosphere is that the stratosphere is mostly heated from above, by the ozone layer. Because it is heated from above, and heat rises, the stratosphere is very stable and mixes slowly (that is, it is stratified, giving it its name). In particular, it does not mix readily with the troposphere. This lack of mixing can be best observed in strong thunderstorms that form anvil clouds at the boundary.

The mesosphere, extending up to 85 to 100 km, and the thermosphere, extending up to 500 to 1000 km, are the next layers in the atmosphere. Around 100 km can be found the sodium layer, which is formed by the ablation of incoming meteors; it has a column density of about 1 milligram per square kilometer. This is also approximately the altitude at which gases begin to separate based on molecular weight, with lighter gases reaching higher altitudes. At 160 km the atmosphere is too thin to sustain sound waves within the range of human hearing44Higher pitches require denser air to be transmitted; if the time between molecular collisions is longer than the period of the pressure wave, then the high and low pressures will simply be averaged out. The highest transmissible pitch drops by an octave roughly every 5 km of altitude.. The International Space Station orbits at 400 km and slowly falls over time due to drag from the air: it is boosted by rockets approximately once a month.

The upper boundary of the thermosphere depends strongly on daily variations in solar activity, referred to as space weather. Above the thermosphere is the exosphere, at which point the air is so thin that it no longer acts like a gas; instead, each of the molecules behaves independently of the others and their paths (or orbits!) are controlled by gravity. The exosphere could be considered part of space; different scientists use different definitions for when space “begins” according to what physical phenomena are relevant to their discipline.

The Great Patton (version 2)

2022 August 03

A few years ago I was experimenting with unusual ways to combine videos. Casting about for movie clips to remix, I struck upon an idea: what if I combined one of the most famous movie speeches, from The Great Dictator, with the iconic film rendition of General Patton’s war speech, from Patton?

How to combine them was obvious: The Great Dictator, released in 1940, was in black and white, whereas Patton’s speech stands Patton before an enormous, vibrant American flag, so I took the luminance from the former and the color from the latter. I was unsatisfied with my first version of this (viewable at the link above; see technical details below for more information), so today I redid the video, with this result:

(Use headphones, as the audio is in stereo.)

The two speeches ostensibly concern the same topic, World War II, but they could not be more antithetical. Patton vividly portrays the gruesome brutality of war, using the bloody imagery to call the listening soldiers to aggression and violence. Chaplin’s speech, however, invokes abstract ideals of democracy, peace, and unity, promising to the civilian victims of war that it is but a temporary falling away from the path, and calling upon soldiers to engage in a metaphorical fight for liberty, peace, and happiness.

Patton

George Patton (1885-1945), nicknamed “Old Blood and Guts”, was one of the most successful and aggressive US Army generals in the European theater of World War II. In 1944, he delivered a motivational speech, repeated with variations to different US infantry units preparing to fight in the war; it is arguably the most famous pre-battle speech in history. A literary analysis of the genre of battle speeches identifies key elements in the formula for such a speech:

  1. an opening that focuses on the valor of the men rather than the impact of the speech (the common trope here is to note how “brave men require few words”),

  2. a description of the dangers arrayed against them,

  3. the profits to be gained by victory and the dire consequences of defeat,

  4. the basis on which the general pins his hope of success, and finally

  5. a moving peroration: the big emotional conclusion of the speech.

While the essay focuses on the literary form as it was used in the classical era, the author does point out that Patton’s speech adheres to this same formula. The purpose is to steel the warrior’s courage to maintain cohesion even in extremis. Patton’s speech does this ably, tying violent descriptions of the terrors of combat to the urge to push onwards, aggress, and surmount these terrors through victory instead of retreat. While contemporary officers sometimes criticized Patton for his vulgarity and unprofessionalism, his speeches were hugely popular with the men they were intended for, for whom these horrors were visceral. And it is likely this contributed to his success as a general: “battles are not won by killing all of the enemies, but by making the enemy run away … battles are principally won in the minds of the combatants”45Source, which has more information on the role of generalship.. And it was not just in his speeches: Patton sought to inspire his troops with a deliberately cultivated image of a strong, decisive, capable leader through his uniform, his signature ivory-handled pistols, his “war face”, and so on.

The 1970 film Patton opens with Patton delivering his 1944 speech to the troops in front of a massive American flag, with the text abbreviated and lightly adapted to movie format (the line where he nonchalantly observes “You are not all going to die. Only two percent of you right here today would be killed” was one of those cut), as follows:

Be seated. Now, I want you to remember that no bastard ever won a war by dying for his country. He won it by making the other poor dumb bastard die for his country. Men, all this stuff you’ve heard about America not wanting to fight, wanting to stay out of the war, is a lot of horse dung. Americans, traditionally, love to fight. All real Americans love the sting of battle.

When you were kids, you all admired the champion marble shooter, the fastest runner, the big league ball players, the toughest boxers. Americans love a winner and will not tolerate a loser. Americans play to win all the time. Now, I wouldn’t give a hoot in hell for a man who lost and laughed. That’s why Americans have never lost and will never lose a war. Because the very thought of losing is hateful to Americans.

Now, an army is a team. It lives, eats, sleeps, fights as a team. This individuality stuff is a bunch of crap. The bilious bastards who wrote that stuff about individuality for the Saturday Evening Post don’t know anything more about real battle than they do about fornicating.

Now, we have the finest food and equipment, the best spirit, and the best men in the world.

You know, by God, I actually pity those poor bastards we’re going up against. By God, I do. We’re not just going to shoot the bastards. We’re going to cut out their living guts and use them to grease the treads of our tanks. We’re going to murder those lousy Hun bastards by the bushel.

Now, some of you boys, I know, are wondering whether or not you’ll chicken-out under fire. Don’t worry about it. I can assure you that you will all do your duty. The Nazis are the enemy. Wade into them. Spill their blood. Shoot them in the belly. When you put your hand into a bunch of goo that a moment before was your best friend’s face, you’ll know what to do. [my video mix stops here]

Now, there’s another thing I want you to remember. I don’t want to get any messages saying that we are holding our position. We’re not holding anything. Let the Hun do that. We are advancing constantly and we’re not interested in holding onto anything – except the enemy. We’re going to hold onto him by the nose, and we’re gonna kick him in the ass. We’re gonna kick the hell out of him all the time, and we’re gonna go through him like crap through a goose!

Now, there’s one thing that you men will be able to say when you get back home, and you may thank God for it. Thirty years from now when you’re sitting around your fireside with your grandson on your knee, and he asks you, “What did you do in the great World War II?” – you won’t have to say, “Well, I shoveled shit in Louisiana.”

Alright now you sons-of-bitches, you know how I feel. Oh, I will be proud to lead you wonderful guys into battle anytime, anywhere. That’s all.

One more note: while the delivery in the movie is serious, rough, and gruesome, apparently the real 1944 speech was given in a much lighter and more humorous tone. While I doubt video of the original speech exists (Patton’s presence on the military base was a matter of high secrecy), his other speeches are filled with wry quips and delivered comedically.

The Great Dictator

The Great Dictator, released in 1940, was Charlie Chaplin’s first real talkie; filming had begun the previous fall, coinciding with the start of World War II. In the film Charlie Chaplin (1889 - 1977) plays both Adenoid Hynkel, a tissue-thin parody of Hitler, and an anonymous Jewish barber who is a victim of Hynkel’s persecution. The film largely carries on Chaplin’s characteristic slapstick style, alternating between Hynkel’s absurd megalomania and the barber’s bumbling naivete.

Inevitably, the identical-looking Hynkel and barber become interchanged, with Hynkel imprisoned and the barber ushered to the head of a military parade. At the very end of the movie the barber is compelled to address an enormous crowd of citizens from the newly conquered “Osterlich”. Here, the facade drops: it is not the barber, nor Hynkel, that speaks, but Chaplin himself. We do not see (much of) the crowd’s reaction, and the camera narrowly frames Chaplin’s face, for he is no longer addressing the people of Osterlich but directly talking to the 1940s movie audience – painfully conscious of the rising Nazi empire – calling for democracy, liberty, peace, and unity.

The common appearance of the barber and Hynkel parallels the real-life similarity between Chaplin and Hitler. Chaplin was acutely aware of this, as they were born just 4 days apart, and would sometimes ruminate on the whims of fate that made “he the madman, I the comic”. Indeed their resemblance, along with the false belief that Chaplin was Jewish, had led to German censorship of Chaplin films: of course The Great Dictator was banned in occupied Europe, although it is believed that Hitler watched it twice.

I’m sorry, but I don’t want to be an Emperor – that’s not my business. I don’t want to rule or conquer anyone. I should like to help everyone, if possible – Jew, gentile, black man, white. We all want to help one another; human beings are like that. We want to live by each other’s happiness, not by each other’s misery. We don’t want to hate and despise one another. In this world there’s room for everyone and the good earth is rich and can provide for everyone.

The way of life can be free and beautiful.

But we have lost the way.

Greed has poisoned men’s souls, has barricaded the world with hate, has goose-stepped us into misery and bloodshed. We have developed speed but we have shut ourselves in. Machinery that gives abundance has left us in want. Our knowledge has made us cynical, our cleverness hard and unkind. We think too much and feel too little. More than machinery, we need humanity. More than cleverness, we need kindness and gentleness. Without these qualities, life will be violent and all will be lost.

The aeroplane and the radio have brought us closer together. The very nature of these inventions cries out for the goodness in men, cries out for universal brotherhood for the unity of us all. Even now my voice is reaching millions throughout the world, millions of despairing men, women, and little children, victims of a system that makes men torture and imprison innocent people.

To those who can hear me I say, do not despair. The misery that is now upon us is but the passing of greed, the bitterness of men who fear the way of human progress. The hate of men will pass and dictators die; and the power they took from the people will return to the people and so long as men die, liberty will never perish.

Soldiers: Don’t give yourselves to brutes, men who despise you, enslave you, who regiment your lives, tell you what to do, what to think and what to feel; who drill you, diet you, treat you like cattle, use you as cannon fodder. Don’t give yourselves to these unnatural men, machine men, with machine minds and machine hearts! You are not machines! You are not cattle! You are men! You have the love of humanity in your hearts. You don’t hate; only the unloved hate, the unloved and the unnatural.

Soldiers: Don’t fight for slavery! Fight for liberty! In the seventeenth chapter of Saint Luke it is written, “the kingdom of God is within man” – not one man, nor a group of men, but in all men, in you, you the people have the power, the power to create machines, the power to create happiness. You the people have the power to make this life free and beautiful, to make this life a wonderful adventure.

Then, in the name of democracy, let us use that power! Let us all unite! Let us fight for a new world, a decent world that will give men a chance to work, that will give youth a future and old age a security. By the promise of these things, brutes have risen to power, but they lie! They do not fulfill their promise; they never will. Dictators free themselves, but they enslave the people!! Now, let us fight to fulfill that promise! Let us fight to free the world, to do away with national barriers, to do away with greed, with hate and intolerance. Let us fight for a world of reason, a world where science and progress will lead to all men’s happiness.

Soldiers: In the name of democracy, let us all unite!

Technical notes

When I first made my mashup in 2019, I extracted the red-green-blue pixel values (r_i, g_i, b_i) from the two clips and combined them like so:

r =
\begin{cases}
    r_1 \frac {r_0 + g_0 + b_0}{r_1 + g_1 + b_1} & \text{if } r_1 + g_1 + b_1 > 0 \\
    r_0 & \text{otherwise}
\end{cases}

The analogous formulas give g and b, and the resulting values are then clipped to the range [0, 255]. Thus (before any clipping), the total intensity of the result r + g + b equals the intensity r_0 + g_0 + b_0, whereas the ratio of red to green to blue of (r, g, b) matches that of (r_1, g_1, b_1).
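In hindsight the combination step itself is easy to express with numpy; here is a minimal sketch of that 2019 approach, operating on a single frame as an 8-bit RGB array and ignoring all of the video handling around it:

import numpy as np

def combine_frames(frame0, frame1):
    """Take the total intensity from frame0 and the color ratios from frame1.

    frame0, frame1: uint8 arrays of shape (height, width, 3), in RGB order.
    """
    f0 = frame0.astype(np.float64)
    f1 = frame1.astype(np.float64)
    total0 = f0.sum(axis=2, keepdims=True)  # r0 + g0 + b0 at each pixel
    total1 = f1.sum(axis=2, keepdims=True)  # r1 + g1 + b1 at each pixel
    safe_total1 = np.where(total1 > 0, total1, 1)        # avoid dividing by zero
    mixed = np.where(total1 > 0, f1 * total0 / safe_total1, f0)
    return np.clip(mixed, 0, 255).astype(np.uint8)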

This approach has several problems: due to the clipping, colors with high saturation in the original stream 1 end up at maximum value in the result. Also, extreme artifacts occur in dark regions of the picture due to quantization near 0.

There is an additional concern, which contributes to the severe artifacts. Most video streams, including those I used as sources, are encoded in yuv420p format. This makes use of the Y’UV color space, instead of the RGB color space, in which Y’ is the gamma-adjusted luma of a point, U is the blue-yellow component, and V is the red-cyan component. The yuv420p format has four times the resolution in the Y’ component as in the U or V components (twice in each of the horizontal and vertical directions), because human perception is much more discriminating of variations in light levels than in hues.

However, most python libraries I have used for manipulating videos (moviepy and imageio being the two I have found best) do not permit manipulation of the raw video stream data, but instead convert everything to and from RGB implicitly. Thus the processing converts the videos from Y’UV to RGB, manipulates the RGB data, and then converts RGB to Y’UV, with each step introducing significant rounding errors and conversion losses46Each RGB value is a single byte from 0 to 255, and the conversion Y’UV <-> RGB is strongly nonlinear. Y’UV is optimized for human perception, which does not map very neatly to distinct RGB components..

More logical, then, would be to directly manipulate the Y’UV data, which was how I made version 2 of the video. On my first attempt I just used the Y’ data from The Great Dictator and the UV data from Patton, but was able to improve the result with some slight tweaking. The ffmpeg command I used for the processing was:

ffmpeg -i input/greatdictator.webm -i input/patton.webm \
    -i input/patton_denoise.wav \
    -filter_complex FILTER \
    -map '[vid]' -map '[aud]' \
    ENCODING_OPTIONS \
    output/greatpatton.mp4

There are three inputs: the two videos, and an audio file taken from the Patton video and processed with sox to remove static. These are passed through a filter, resulting in a video stream and an audio stream which are assembled into an output file. The ENCODING_OPTIONS are simply some flags for optimizing the output for YouTube:

ENCODING_OPTIONS='-c:v libx264 -profile:v high -level 4.2 \
    -crf 17 -tune film -movflags +faststart'

The filter itself was produced by joining together 7 linear filters with semicolons. Let us go through them one at a time:

[0:v]extractplanes=y,
trim=start=5.5:duration=201.5,
scale=1440:816,
pad=1440:1080:-1:-1,
setsar=sar=1,
setpts=PTS-STARTPTS[y1]

Take the video stream from input 0 (The Great Dictator), extract the Y’ plane, cut a clip from it (with length 201.5 seconds, starting from 5.5 seconds), and adjust the resolution to match the other video. Adjusting the video is done by scaling it and then padding with black. The last filter, setpts=PTS-STARTPTS, changes the stream’s timestamp so that the clip starts from 0 seconds instead of 5.5 seconds (it’s not clear if this is necessary but I think it makes merging the streams later more reliable).

[1:v]trim=start=66.5:duration=201.5,
setpts=PTS-STARTPTS,
extractplanes=y+u+v[y2][u][v]

Take the video stream from input 1 (Patton), cut a clip from it, fix the timestamps, and then extract the Y’, U, and V planes.

[y1][y2]mix=weights=0.93 0.07[y]

Mix 93% of the stream y1 with 7% of the stream y2. The eye is very sensitive to small changes in luminance, and having just a hint of the luminance from Patton makes it much easier to distinguish features. The outline of the uniform, the stars on the helmet, the medals, the features of the face, and so on all suddenly pop out distinctly, whereas they were completely invisible without this.

[y][u][v]mergeplanes=0x001020:yuv420p,
eq=contrast=1.5:saturation=2[vid]

Merge together the single-plane streams y, u, and v to make a single stream in yuv420p format. The contrast is heightened because The Great Dictator had mostly intermediate greys, and the saturation is also increased to make the flag more vivid.

[0:a]atrim=start=5.5:duration=201.5,
pan=1c|c0<c0+c1,
volume=volume=3.5,
asetpts=PTS-STARTPTS[a0]

Take the audio stream from input 0, cut a clip, average the two channels into one channel with pan, and boost the volume by a factor of 3.5.

[2:a]atrim=start=66.5:duration=201.5,
pan=1c|c0<c0+c1,
asetpts=PTS-STARTPTS,
afade=t=in:d=2[a1]

Likewise with the audio stream from input 2, adding two seconds of fade in so that the opening trumpet (which is in the middle of a longer fanfare) is not so blaring.

[a0][a1]amerge[aud]

Combine the two monochannel audio streams into a single stereo stream.

What is the greenhouse effect? An accessible, scientific introduction. Part 1

2022 August 01

  1. Part 1: What is the greenhouse effect? An accessible, scientific introduction
  2. Appendix A: What is the atmosphere?
  3. Appendix B: Ozone
  4. Part 2: Physics of light and temperature
  5. Part 3: Temperature of the Earth without an atmosphere
  6. Part 4: A model of the greenhouse effect
  7. Part 5: Differences between model and reality

Click on images to zoom. An older version of this series is also available as a single pdf.

Introduction

Growing up, I understood that it gets warmer in the summer because there was “excess” sunlight warming the Earth, and cooler in winter because there was “insufficient” sunlight47There is a subtlety here that I did not notice at the time: does it get warmer in summer because the excess sunlight causes more warming to happen each day than cooling happens in the night, or does it get warmer because the days get longer and therefore the equilibrium temperature of each day is higher than the previous? This is similar to asking on what time scale the Earth system returns to equilibrium – yearly, or daily. The (incomplete) answer is that the former effect is arguably dominant, as can be seen by observing that the hottest day of the year is usually about 1 to 2 months after summer solstice, and thus the days continue to get hotter even when they are getting shorter. More specifically, if we approximate the Earth’s temperature at a point as an exponential decay to equilibrium, subject to a sinusoidal forcing, then we find that the temperature is sinusoidal with a phase offset from the forcing such that the tangent of the phase offset equals the ratio of the decay timescale to the forcing period. Taking the phase offset to be 1.5 months of a 12 month year, the ratio is \tan(2 \pi / 8) = 1, so the time scale for the Earth to reach equilibrium is about 1 forcing period, i.e., 1 year.. However I was confused that these effects happened to cancel each other exactly: if there was any tendency for the summer warming to exceed the winter cooling on average, however slightly, then year after year the temperature would steadily increase unendingly. Why doesn’t this happen? Furthermore, I did not even think to wonder what “excess” or “insufficient” sunlight could even mean: surely any amount of sunlight would warm the Earth – by what process was the Earth cooling in winter?

Like many students in the US, I was taught about the Earth’s climate, climate change, and the greenhouse effect only once before graduate school, in grade school. At that age, students are often not prepared to engage with such questions, much less have the necessary background in physics to answer them; and if we do not understand the processes by which the Earth warms and cools, then we will not understand how the greenhouse effect impacts those processes no matter how well, or poorly, it is taught.48As an example, here is a lesson plan designed by NASA in which students perform an experiment where greenhouse gases are “simulated” using plastic wrap – ironically how greenhouses work, but having nothing to do with the greenhouse effect. Here is an experiment for high schoolers in which students measure the temperature of bags with and without carbon dioxide (which is generated in situ by an exothermic chemical reaction). Neither of these plans contains any information related to the greenhouse effect. Many of these students have become adults who see the words “climate change” and “greenhouse effect” increasingly often in news media and policy discussion, without realizing that their grasp of these concepts is at best superficial.

The goal of this series is to better introduce the reader to the greenhouse effect by first covering the requisite topics in physics and then building off of them to show how the greenhouse effect occurs as a consequence. Specifically, we investigate properties of energy, light, heat, and temperature, including the blackbody effect: the fact that hot objects emit light, which is how the Earth cools off. Balancing the blackbody effect against sunlight, together with features of the Earth’s atmosphere, leads inevitably to the greenhouse effect: that certain gases (called “greenhouse gases”) cause the Earth to be warmer.

Understanding the greenhouse effect is necessary to engage with public policy and the prospects for the climate of our future. Along the way to this goal, we will encounter a variety of other topics, such as geoengineering, the lifetime of different greenhouse gases in the atmosphere, the processes that govern the structure of the atmosphere, the temperature of black holes, and even color.

While this series is intended for a scientifically-curious reader, it does not presume any specific advanced scientific background, and the principles are introduced with digressions and examples to provide context and help connect them to real-world experience. Scientific terms are put in bold when their definition is given by the surrounding text, and in italics otherwise; that is, terms in italics are jargon that may have a different scientific meaning than their plain English meaning.

In part 1 we introduce and motivate the remaining parts. Readers who are satisfied with the summary may skip the rest of the series.

In part 2 we explain how energy enters and leaves the Earth, which happens in the form of light. We discuss light and the different types (that is, wavelengths) of light, and we introduce a graph called a spectrum which shows what types of light are in a beam. We then describe the blackbody effect, which states that all objects emit light, and relates the amount and type of light emitted by an object to the temperature of the object.

In part 3 we introduce the concept of an equilibrium of a system, which can be thought of as a typical or average state for the system to be in; it is frequently much easier to find the equilibria of a system than to describe the exact behavior of the system as it changes over time. We discuss equilibria in the context of temperature and calculate the equilibrium temperature for the Earth in the absence of the atmosphere.

In part 4 we present a simplified model of the Earth and its atmosphere. We calculate the equilibrium temperature of the model Earth, and see what exactly changes as we change the amount of greenhouse gas. The finding that the temperature goes up as greenhouse gas is added is the greenhouse effect.

In part 5 we explore the differences between the simple model and the real-world climate of the Earth.

In appendix A we give an overview of what the atmosphere is, starting from questions like why it is thicker on the bottom and whether it has a top.

In appendix B we discuss the ozone layer and ozone hole. While they have no bearing on the greenhouse effect, they are a frequent point of confusion with the greenhouse effect. The worldwide effort to control the ozone hole by regulating CFCs is also the only notable example of a global agreement to address a global environmental problem, and thus makes a useful point of comparison to the future regulation of carbon dioxide emissions.

Summary

The Earth is warmed by visible light it receives from the Sun, and is cooled by emitting infrared light (which is invisible) to space. Greenhouse gases in the atmosphere are those that absorb infrared light, but not visible light. Some of the infrared light emitted by the Earth is absorbed by the greenhouse gases and then re-emitted, some of which hits the Earth.

The Earth’s temperature is in balance when the total energy it receives equals the total energy it emits. As infrared light is returned to the Earth by greenhouse gases, the energy the Earth receives is increased, so it must emit more energy to maintain balance. The Earth emits more light when it is warmer. In this way adding greenhouse gases to the atmosphere causes the Earth to become warmer.

Further reading

A very short and easy to read introduction to climate change is What We Know About Climate Change by Kerry Emanuel. The book is about as long as this series and is targeted to a non-scientific audience. In addition to reviewing the material covered here (without any equations), the book discusses what climate change is, how we know it is happening, what the effects of climate change are, and what can be done about climate change.

Sustainable Energy – Without the Hot Air by David MacKay is an excellent, readable introduction to energy in modern society, and is also targeted to non-scientists. The book discusses the major uses of energy and the major available sources of energy, and invites the reader to consider how to balance these against each other in a future without fossil fuels. The reader is empowered with the information necessary to come up with their own energy plan, and in doing so is forced to confront the difficulty of finding an agreeable plan in which energy production satisfies energy demand.

Temperature units

For more information, see my four-part series on temperature, beginning with What is negative temperature?

In this series we use a mixture of three different scales for describing temperature: degrees Celsius (formerly called “centigrade”), which is the international standard temperature scale for day-to-day use; degrees Fahrenheit, which is the standard in the United States for day-to-day use; and Kelvin, which is typically used within physics and other sciences when discussing temperatures far from room temperature or for theoretical physics. The reason for the difficulty of converting between these different scales is that Celsius and Fahrenheit do not place zero temperature at zero degrees, but rather at -273.15 C and -459.67 F, respectively. Indeed, both of those scales were invented and popularized long before the scientific community reached a consensus that there was such a thing as zero temperature (called absolute zero to avoid confusion with other temperatures like 0 C or 0 F) or how cold it was compared to known temperatures.

To rectify these shortcomings, Lord Kelvin invented a temperature scale with zero temperature at zero degrees, and was the first to reliably calculate the difference between that temperature and known temperatures, building off of contemporary work in thermodynamics by him and other scientists. The Kelvin scale was chosen so that the difference between two temperatures is the same number of degrees as in the Celsius scale; as a result one can convert from Celsius to Kelvin by simply adding 273.15 degrees, which has the effect of shifting the zero to the correct place. Converting between Fahrenheit and the other two scales requires both shifting and multiplying by a constant.
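
As a concrete illustration of these conversions, here is a minimal sketch (the 273.15 shift and the 9/5 factor are the standard definitions; the function names are mine):

```python
def celsius_to_kelvin(c):
    # Kelvin only shifts the zero point; the size of a degree is unchanged.
    return c + 273.15

def fahrenheit_to_celsius(f):
    # Fahrenheit requires both a shift and a rescaling of the degree size.
    return (f - 32) * 5 / 9

def fahrenheit_to_kelvin(f):
    return celsius_to_kelvin(fahrenheit_to_celsius(f))

print(celsius_to_kelvin(-273.15))     # 0.0, absolute zero
print(fahrenheit_to_kelvin(-459.67))  # 0.0 (up to rounding), absolute zero
print(celsius_to_kelvin(20))          # 293.15, room temperature
```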

Celsius and Fahrenheit continue to be more useful than Kelvin for describing temperatures near room temperature (for which Kelvin can be unwieldy), but are very inconvenient in the context of thermodynamics. Climate science in general, and this series in particular, is mostly concerned with physical processes that take place near room temperature, but explains these processes with thermodynamics, and so uses both Celsius and Kelvin according to what is convenient in each context. (American meteorology continues to primarily use Fahrenheit for historical reasons.)

Links

2022 April 12

Music

Nils Frahm - Says

Scala and Kolacny Brothers - Evigheden

Unicorn Heads - A Mystical Experience

A mathematical April fools joke: an interactive model of a sphere tiled by hexagons. I haven’t figured out how it works….

It’s not possible to represent real numbers exactly in a computer; the most common representation, IEEE754, has a variety of oddities, notably both positive and negative zero. For most purposes these behave identically, but one surprising difference is that compilers can optimize x + (-0) into x but can’t optimize x + (+0) into x! Why? Because (+0) + (-0) = (+0), so the additive identity in IEEE754 is negative zero! Therefore in some circumstances it might be slightly more efficient to use negative zero instead of positive zero.
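
This is easy to check for yourself; a quick sketch in Python (whose floats are IEEE754 doubles on typical platforms):

```python
import math

# Adding +0.0 is not a no-op: it turns -0.0 into +0.0...
print(math.copysign(1.0, -0.0 + 0.0))   # 1.0, i.e. the result is +0.0
# ...whereas adding -0.0 leaves every input unchanged, including -0.0.
print(math.copysign(1.0, -0.0 + -0.0))  # -1.0, i.e. the result is -0.0
print(math.copysign(1.0, 0.0 + -0.0))   # 1.0, i.e. the result is +0.0
```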

A Japanese researcher conducted an experiment for 24 days in which they slept at whatever location their cat chose to sleep. They experienced no reduction in average quality of sleep.

“In reality, there are no such things [as dates in Excel spreadsheets]. What you have are floating point numbers and pious hope.”

Italy has started a superbonus scheme for investments in energy-efficient building upgrades. It is called a superbonus because the government subsidizes homeowners for 110% of the capital cost of the upgrade. An estimated 33 billion euros will be invested in the scheme, which has already increased GDP by 0.7% this year. The advantage of such a scheme is that it delegates decision making to the smallest scale, where people can fine-tune for their particular circumstances, and the 110% reimbursement helps cover the many non-financial burdens that come with upgrading your home. The limited duration of the subsidy may encourage more uptake due to fear of missing out. The downside of delegating decision making is that the system is vulnerable to fraud, with 1 billion euros of fraud already having been identified. (Using a fixed menu of subsidy amounts for each type of upgrade would be the reverse trade-off: less room for fraud, but less flexible decision-making by the people most attuned to the circumstances.) Unfortunately fraud is inevitable in any country-sized project.

Images

The definitive Monty Hall explainer

2022 April 06

“Imagine you’re on a gameshow. There are three doors, behind one of which is a car–” “You’re telling it wrong. There are three doors, behind one of which is a car.”

I have been obsessed for far too many years with incorrect statements of the Monty Hall problem, once going so far as to pick up a Korean pop-science book just to see if its explanation was wrong (conclusion: I can’t read Korean), and I fear the only way to sate this obsession is to write my own. Er, my own correct explanation, that is.

I will spoil the ending by giving the take-away lesson up front. The Monty Hall problem involves probability, which is to say it has an element of randomness. Randomness is a property of a process that yields some result, not a property of the result itself:

Source: https://xkcd.com/221/

Therefore any presentation of the Monty Hall problem that simply describes the sequence of events is incomplete, and has no single right answer: a complete statement must also give the underlying random process that yields the observed events.

Treating randomness as a process, not a result, is often the domain of computer science, but it is a broadly applicable concept that is vital to successfully navigating any confusing problem involving probability. Losing track of your sources of randomness and how they determine the observed outcomes is a common cause of errors. Perhaps related is one of the benefits of pre-committing to the statistical analysis performed in a scientific study – by designing your analysis before seeing the data, you are forced to grapple with the data generation process, rather than the particular observed data values.

The Monty Hall problem

The common presentation of the Monty Hall problem, as seen in any number of popular media, is akin to the following:

You are on a gameshow hosted by Monty Hall featuring 3 doors, one of which conceals a fancy car you want, and the other two hiding worthless gag “prizes”. The host asks you to choose which door’s prize you want: you choose door 1. He then opens door 3, revealing a gag prize, and gives you a chance to change your mind about which door to take. What is the probability that door 2 conceals the car?

The classic answers are either 1/2 or 2/3, with the latter being “correct”. However if you read the spoiler above, you already see a major issue: how can the answer be a probability between 0 and 1 if no source of randomness is stated in the problem?

It is commonly understood from the context of a gameshow that the three prizes are initially distributed behind the three doors uniformly at random, which serves as a source of randomness. With some re-wording we can suppose the player’s initial door choice was also at random, although this turns out to be redundant with the prizes being distributed at random. (Just as in rock-paper-scissors it is redundant to choose at random if you know your opponent is choosing at random.)

So let us consider the three possibilities:

  1. Case 1. Car is behind door 1, and events transpire as described.
  2. Case 2. Car is behind door 2, and events transpire as described.
  3. Case 3. Car is behind door 3, which is inconsistent with the specified sequence of events, so something else happens.

Again we are stuck: the problem as stated doesn’t tell us what happens in the 1/3 of cases that the car was behind door 3. Clearly something happens, but what?

The host’s behavior

What we are missing is a complete description of the host’s behavior. It is not enough to know what the host did in actuality, but also what the host would do in every counterfactual. We aren’t even told if the host’s behavior is random or deterministic! Indeed we could justify any answer from 0 to 1 by carefully choosing what procedure for the host to follow.

Suppose for example that the host’s procedure is to always open door 3. Then the cases are:

  1. Car is behind door 1 and host reveals door 3 has a gag prize.
  2. Car is behind door 2 and host reveals door 3 has a gag prize.
  3. Car is behind door 3 and host reveals door 3 has the car.

Then, the conditional probability that the car is behind door 2 given that the host reveals a gag prize is equal to 1/2. (Of course, the probability given that the host reveals a car is zero, since we can see the car is behind door 3.)

Note that, in this case, we don’t actually need to know how the player chooses which door for their initial selection, since it had no influence on the sequence of events. (Assuming that the player has no prior information about the location of the car and therefore that conditioning on their selecting door 1 does not change the probability of where the car is.)

Now suppose that the player’s procedure is to always select door 1, and the host’s procedure is to always open the lowest numbered door that is not the selection and is a gag prize. Then the cases are:

  1. Car is behind door 1 and host reveals door 2 has a gag prize.
  2. Car is behind door 2 and host reveals door 3 has a gag prize.
  3. Car is behind door 3 and host reveals door 2 has a gag prize.

Now, the conditional probability that the car is behind door 2 given that the host revealed a gag prize behind door 3 is 100%!

I was pleasantly surprised to see that the importance of specifying the host’s procedure was clearly explained by none other than Monty Hall himself, in a 1991 interview with the New York Times, a very rare example of math or science being accurately reported on in mainstream media:

[Monty Hall] picked up a copy of Ms. vos Savant’s original column, read it carefully, saw a loophole and then suggested more trials.

On the first, the contestant picked Door 1.

“That’s too bad,” Mr. Hall said, opening Door 1. “You’ve won a goat.”

“But you didn’t open another door yet or give me a chance to switch.”

“Where does it say I have to let you switch every time? I’m the master of the show. Here, try it again.”

[…] Whenever the contestant began with the wrong door, Mr. Hall promptly opened it and awarded the goat; whenever the contestant started out with the right door, Mr. Hall allowed him to switch doors and get another goat. The only way to win a car would have been to disregard Ms. vos Savant’s advice and stick with the original door.

Furthermore:

Dr. Diaconis and Mr. Gardner both noticed the same loophole when they compared Ms. vos Savant’s wording of the problem with the versions they had analyzed in their articles.

“The problem is not well-formed,” Mr. Gardner said, “unless it makes clear that the host must always open an empty door and offer the switch.”

This is not an idle objection. Any problem statement necessarily depends on the reader sharing some context to understand the meaning of the words being used, just as we inferred that the prizes were distributed randomly (and that the player’s initial selection is independent of where the prize is) from the context of it being a gameshow. However, it is a much bigger step to infer the host’s behavior from the incomplete information as given above. The original gameshow Let’s Make a Deal that Monty Hall hosted featured a wide assortment of variety games that did not follow any fixed format or rigid rules; it would have been entirely in keeping with that show for Hall to adapt the format dynamically with circumstances, or even entice players away from the main prize with cash incentives. So it is not reasonable to leave the reader to guess at the procedure Hall is following from contextual clues alone: the procedure should be fully specified.

Full statement of problem and solution

Here is a (more) complete statement of the Monty Hall problem:

A gameshow has three doors, one of which is chosen in advance uniformly at random to conceal a car; the others conceal gag prizes. A player is asked to select a door, and does so at random. Regardless of the choices so far, the host then opens a door, chosen at random from among those doors which the player did not select and which do not conceal the car. Given that the player selected door 1 and the host opened door 3, what is the probability that door 2 has the car?

There are three sources of randomness (which are implied to be independent), and we need 12 cases to fully work out every possibility. However it is fairly clear that the problem is symmetric with respect to which door the player selected, so we can fix that as nonrandom without changing the problem. This leaves us with only four cases:

(It is simple enough to see how to write out the other 8 cases to allow for the player selecting door 2 or 3, remembering to divide the probabilities by 3 so they still add up to 1.)

In cases 2 and 3, the host is “randomly” choosing which door to open from only one possibility. The host’s randomness is only relevant in cases 1a and 1b.

The conditional probability is

\frac {P(\text{car behind 2 and host opens 3})}{P(\text{host opens 3})} = \frac {1/3}{1/6 + 1/3} = \frac 2 3.

We’ve used the counterfactual outcomes to assess the conditional probability of the outcome that occurred. If we can eliminate case 1a, for example, case 1b rises in probability to 1/3 and the answer we get at the end is 1/2. Alternatively, if we add cases 2a, 2b, 3a, 3b in which the host reveals the car, then the probability of that branch lowers from 1/3 to 1/6 and we again get 1/2 as the answer.
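
As a sanity check, here is a minimal Monte Carlo sketch of the problem as stated above; note that the code is forced to spell out the host’s procedure explicitly, which is exactly the information the usual incomplete statement omits.

```python
import random

conditioned = 0   # runs where the player picked door 1 and the host opened door 3
car_behind_2 = 0

for _ in range(1_000_000):
    car = random.randint(1, 3)   # car placed uniformly at random
    pick = 1                     # the player's selection; fixed, by symmetry
    # The host opens a uniformly random door that is neither the player's
    # pick nor the car.
    opened = random.choice([d for d in (1, 2, 3) if d != pick and d != car])
    if opened == 3:              # condition on the observed sequence of events
        conditioned += 1
        car_behind_2 += (car == 2)

print(car_behind_2 / conditioned)  # approximately 2/3
```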

If the incomplete Monty Hall problem, as typically stated, lacks sufficient information to uniquely determine the answer, what makes this version of the problem the correct way of completely specifying it? While correct statements of the Monty Hall problem are rare in pop science discussions of it, most agree on a more-or-less acceptable explanation of how to get to the answer 2/3. Working backwards from this solution one finds the problem being solved. Indeed, this was the original justification vos Savant gave when critics of her column which popularized the problem pointed out the ambiguities: she said that from her explanation of the answer it was clear what was intended in the problem. (And, in vos Savant’s defense, the original incomplete statement of the problem was not hers but quoting from an inquiring reader.)

Other sources and variants

While most popular discussions of the Monty Hall problem fail to properly state it, some do. Wikipedia, of course, gives the correct explanation, saying “The given probabilities depend on specific assumptions about how the host and contestant choose their doors.”, and has the most thorough discussion of the problem and its variants of any source I’ve seen. The New York Times interview with Hall, discussed above, doesn’t give an explicit mathematical treatment but lucidly captures the key point that the host’s behavior must be specified.

Three decades before vos Savant’s column popularized the Monty Hall problem, an exactly equivalent problem was given by Martin Gardner (the same who was quoted above):

Three prisoners, A, B, and C, are in separate cells and sentenced to death. The governor has selected one of them at random to be pardoned. The warden knows which one is pardoned, but is not allowed to tell. Prisoner A begs the warden to let him know the identity of one of the two who are going to be executed. “If B is to be pardoned, give me C’s name. If C is to be pardoned, give me B’s name. And if I’m to be pardoned, secretly flip a coin to decide whether to name B or C.”

The warden tells A that B is to be executed. Prisoner A is pleased because he believes that his probability of surviving has gone up from 1/3 to 1/2, as it is now between him and C. Prisoner A secretly tells C the news, who reasons that A’s chance of being pardoned is unchanged at 1/3, but he is pleased because his own chance has gone up to 2/3. Which prisoner is correct?

Since it is both unambiguous and has priority in time, I would prefer that this formulation supplant the classic presentation of the Monty Hall problem; but of course it was the classic presentation’s very ambiguity that led to its popularity.

A popular variant is to suppose there are 100 doors, of which the host opens 98. (vos Savant proposed 1000000 doors, but I guess people got tired of writing that many zeros.) This improves the intuition behind the correct solution to the problem but doesn’t by itself improve the clarity of the problem statement. Similarly, consider another formulation:

Three tennis players. Two are equally-matched amateurs; the third is a pro who will beat either of the amateurs, always.

You blindly guess that Player A is the pro; the other two then play. Player B beats Player C. Do you want to stick with Player A in a Player A vs. Player B match-up, or do you want to switch? And what’s the probability that Player A will beat Player B in this match-up?

After writing this article I found an interesting twist in which the player bribes the host in advance to choose their behavior, subject to the limitation that the host must open exactly one door that is not the player’s selection nor the car. The goal then is to choose a strategy (for both player and host) to maximize the chance of getting the car, and calculate that probability.

(Note that some of the behaviors I’ve mentioned above result in getting the car with 0% or 100% probability, but they do not satisfy the specified constraint on the host’s behavior, so we can’t use them here.)

Suppose, say, the bribed host opens door 3. This outcome occurs in every case that the car is behind door 2 (i.e. “case 2” above), and between 0% and 100% of the cases that the car is behind door 1 (i.e. “case 1b” above), with the probability depending on what the player bribed the host to do. Thus here the player does at least as well by switching as by staying, and likewise if the bribed host opened door 2 (cases 3 and 1a above). Therefore always switching is an optimal strategy for the player, with which the player wins in cases 2 and 3, with a probability of 2/3 – regardless of the behavior of the host. (Though for certain host behaviors this is not the only optimal strategy.)

The only thing the player can change by bribing the host is the distribution of wins and losses between doors 2 and 3: e.g., so that whenever the host opens door 3 it is a guaranteed win, but at the cost that they only win half the time that door 2 is opened. The overall win rate will still be the same.

Most popular presentations of the Monty Hall problem ask only whether you should switch, but as we just saw, switching dominates staying in most interpretations of the problem, side-stepping the central issue; this is why I phrased the problem as asking more specifically for the probability of finding the car.

Temperature part 1: What is negative temperature?

2022 April 05

  1. Part 1: What is negative temperature?
  2. Part 2: What is temperature? (coming)
  3. Part 3: Why does entropy increase? Resolving Loschmidt’s paradox. (coming)
  4. Part 4: Temperature of black holes (coming?)

This is a four-part exploration of topics related to temperature. In part 1 I start by asking about the units in which we measure temperature and end up investigating a model that permits negative temperature. We are left with some foundational questions unresolved, and in part 2 we must retreat to the very basics of statistical mechanics to justify the concept of temperature. Part 3 edges into philosophical grounds, attempting to justify the central assumption of statistical mechanics that entropy increases with time, with implications on the nature of time itself. Finally in part 4 I jump topics to black holes to look at their unusual thermodynamic properties.

Units for describing temperature

Our bodies are able to directly sense variations in the temperature of objects we touch, which forms the motivation for our understanding of temperature. The first highly successful quantified measurements of this property came with the invention of the mercury thermometer by Fahrenheit, along with his eponymous scale on which temperatures were measured. He chose to use higher numbers to represent the sensation we call “hotter”.

With the discovery of the ideal gas laws came the realization that temperature is not “affine”, by which I mean that the laws of physics are not the same if you translate all temperatures up or down a fixed amount. In particular, there exists a minimum temperature. The Fahrenheit and Celsius scales were designed in ignorance of this fact, but they were already well-established by that point and continued to be used for historical reasons. Kelvin decided to address this problem by inventing a new scale, in which he chose to use zero for the minimum temperature.

Now with our modern theory of statistical mechanics, temperature is no longer just a thing that thermometers measure but is defined in terms of other thermodynamic properties:

 T = \frac {dU}{dS}

where T is the temperature of a system (in Kelvin), U is its energy, and S is its entropy. We will skip past the question of “what is energy” (not because it is easy, but because there is no good answer) to quickly define entropy. While in classical thermodynamics entropy was originally defined in terms of temperature (via the above equation), in statistical mechanics entropy is defined as a constant times the amount of information needed to specify the system’s microstate:

 S = k_B \log \Omega

where \Omega is the number of possible microstates (assuming uniform distribution), and we will return to k_B in just a moment.

Unfortunately there remain two more inelegancies with this modern definition of temperature. The first is that the proportionality constant is a historical accident of Celsius’s choice to use properties of water to define temperature. Indeed, temperature is in the “wrong” units altogether: we like to think of temperature as related to energy somehow. It turns out that the Boltzmann constant accomplishes exactly that conversion:

 k_B = 1.380649 \cdot 10^{-23} \text{ J / K}

The k_B that appeared in the definition of entropy was just there to get the units to agree with historical usage – if we dropped it, we could be measuring temperature in Joules instead! Well, more likely in zeptoJoules, since room temperature would be a bit above 4 \cdot 10^{-21} J. In summer we’d talk about it being a balmy 4.2 zeptoJoules out, but 4.3 would be a real scorcher.

(When measuring temperature in Joules it should not be confused with thermal energy, which is also in Joules! The former is an intensive quantity, while the latter is an extensive quantity. Maybe it is for the best that we have a special unit for temperature, just to avoid this confusion.)
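
These figures are easy to verify by multiplying by the Boltzmann constant; a minimal sketch (the Celsius values are my own guesses at “balmy” and “scorcher”):

```python
K_B = 1.380649e-23  # J / K

def celsius_to_zeptojoules(c):
    return K_B * (c + 273.15) / 1e-21

print(celsius_to_zeptojoules(22))  # ~4.08 zJ, room temperature
print(celsius_to_zeptojoules(31))  # ~4.20 zJ, a balmy summer day
print(celsius_to_zeptojoules(38))  # ~4.30 zJ, a scorcher
```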

Thermodynamic beta

To introduce what I perceive to be the second inelegancy of the modern definition of temperature, I will walk through a worked example of calculating the temperature of a simple system.

Suppose we have N magnetic particles in an external magnetic field. Each particle can either be aligned with the field, which I will call “spin down”, or against it, i.e. “spin up”; a spin up particle contains E more energy than one that is spin down. This is the Ising model with no energy contained in particle interactions.

A microstate is a particular configuration of spins. The ground state or lowest-energy state of the model is the configuration that has all spins down. If some fraction p of the spins are up, then the energy is

 U = pNE.

We need to calculate the distribution of possible microstates that correspond to a given energy level. Certainly we can find the number \Omega of microstates with pN up spins, but are they all equally likely to occur? Though it may seem strange, perhaps depending on the geometry of the system certain particles may have correlated spins, causing certain microstates to be more probable than others.

In fact, with certain minimal assumptions, the distribution over all possible microstates is necessarily uniform. We suppose that nearby particles in the system can interact by randomly exchanging their spins. Furthermore distant particles can exchange spins via some chain of intermediate particles: if not, the system would not have a single well-defined temperature, but each component of the system would have its own temperature.

Thus the system varies over all configurations with pN up spins: it remains to see that the distribution is uniform. The particle interaction is adiabatic and time reversible, so the probability that two particles exchange spins is independent of what their spins are, and therefore it can be shown that the Markov chain for the state transitions converges to a uniform distribution over all accessible configurations. That is, the distribution of microstates is uniform over a sufficiently long time period; on shorter timescales the system does not have a well-defined temperature.

The number of configurations with N particles and pN up spins is \Omega = \binom {N}{pN}, and \log \Omega is the information needed to specify which microstate we are in. Thus the entropy is


\begin{aligned}
    S &= k_B \log \binom {N}{pN} \\
    &= k_B \log \frac {N!}{(pN)!((1 - p)N)!} \\
    &\approx k_B N (\log N - p \log pN - (1 - p) \log ((1 - p)N)) \\
    &= -k_B N (p \log p + (1 - p) \log (1 - p))
\end{aligned}

where we have used Stirling’s approximation. (Note that when using Stirling’s approximation on binomial coefficients, the second term in the approximation always cancels out.)
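
As a quick numerical check of this approximation, here is a small sketch comparing the exact log-binomial entropy to the Stirling-approximated form, in units of k_B (the values of N and p are arbitrary):

```python
from math import comb, log

def exact_entropy(N, p):
    # S / k_B = log C(N, pN)
    return log(comb(N, round(p * N)))

def stirling_entropy(N, p):
    # S / k_B ≈ -N (p log p + (1 - p) log(1 - p))
    return -N * (p * log(p) + (1 - p) * log(1 - p))

for N in (100, 10_000):
    print(N, exact_entropy(N, 0.3), stirling_entropy(N, 0.3))
# The relative error shrinks as N grows (Stirling's approximation improves).
```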

What is the point of all this faffing about with microstates and our information about them? After all, with temperature defined in this way, temperature is not a strictly physical property but rather depends on our information about the world. However at some point we would like to get back to physical phenomena, like the sensation of hot or cold that we can feel directly: this connection will be justified in part 2.

Now

\frac {dS}{dp} = k_B N (\log (1 - p) - \log p)

and p = U / (NE), so dp / dU = 1 / (NE) and


\begin{aligned}
    \frac {dS}{dU} &= \frac {k_B}{E} (\log (1 - p) - \log p) \\
    T = \frac {dU}{dS} &= \frac {E}{k_B} \cdot \frac 1{\log (1 - p) - \log p}.
\end{aligned}

Alright, so what have we learned? Let’s plug in some values for p and see. First, for the ground state p = 0, we have \log p = -\infty, so we get T = 0 (in Kelvin). That’s not a big surprise: the lowest energy state is also absolute zero.

What about the highest energy state p = 1? Now we have \log (1 - p) = -\infty, so again T = 0. Wait, what?

Maybe this model just thinks everything is absolute zero? How about the midpoint p = 1/2: then \log p - \log (1 - p) = 0 so T = \infty. Great.

We can just graph this so we can see what is really going on:

(Note the temperature is only defined up to a scalar constant that depends on the choice of E.) As the system gets hotter, the temperature starts from +0 Kelvin, increases through the positive temperatures, hits \infty K, increases through the negative temperatures, and then reaches -0 Kelvin, which is the hottest possible temperature.

So zero Kelvin is two different temperatures, both the hottest and coldest possible temperatures. Positive and negative infinity are the same temperature. (Some sources in the literature I read incorrectly state that positive and negative infinity are different temperatures.) Negative temperatures are hotter than positive temperatures.
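
To reproduce the graph numerically, here is a minimal sketch of T as a function of p, in units of E / k_B, showing the progression from +0 through +\infty and the negative temperatures to -0:

```python
from math import log

def temperature(p):
    # T = (E / k_B) / (log(1 - p) - log p), here in units of E / k_B
    return 1 / (log(1 - p) - log(p))

for p in (0.01, 0.25, 0.49, 0.51, 0.75, 0.99):
    print(p, temperature(p))
# p = 0.01 gives ~+0.22 (cold); p = 0.49 gives ~+25 (very hot);
# p = 0.51 gives ~-25; p = 0.99 gives ~-0.22 (the hottest of these).
```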

This is quite a mess, but one simple change in definition fixes everything, which is to use inverse temperature, also called thermodynamic beta or coldness:

 \beta = \frac 1{k_B T} = \frac 1{k_B} \frac {dS}{dU}.

\beta has units of inverse Joules, but thinking of temperature from an information theory perspective and using 1 bit = \log 2, we can equate room temperature to about 45 gigabytes per nanoJoule. This is a measure of how much information is lost (i.e. converted into entropy) by adding that much energy to the system.
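
The gigabytes-per-nanoJoule figure can be checked directly; a minimal sketch (taking room temperature to be 295 K):

```python
from math import log

K_B = 1.380649e-23   # J / K
T_ROOM = 295         # K, assumed

beta = 1 / (K_B * T_ROOM)            # in nats per joule
bytes_per_joule = beta / log(2) / 8  # 1 bit = log 2 nats; 8 bits per byte
gb_per_nanojoule = bytes_per_joule / 1e9 * 1e-9  # bytes -> GB, per J -> per nJ
print(gb_per_nanojoule)              # ~44 GB per nJ
```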

With \beta, the coldest temperature is +\infty and the hottest temperature is -\infty. Ordinary objects (including all unconfined systems) have positive temperatures while exotic systems like the magnetic spin system we considered above can have negative temperatures. If you prefer hotter objects to have a higher number on the scale, you can use -\beta, but then ordinary objects have negative temperature.

Negative temperature

The magnetic spin system of the previous section is a classic introduction to statistical mechanics, but as we’ve seen it admits the surprising condition of negative temperature. What does it mean for an object to have negative temperature, and can this happen in reality?

The first demonstration of a negative temperature was in 1951 by Purcell and Pound, with a system very similar to the magnetic spin system we described. They exposed a LiF crystal to a strong 6376 Gauss magnetic field, causing the magnetized crystal to align with the imposed field. The magnetic component of the crystal had a temperature near room temperature at this point. Then, reversing the external field, the crystal maintained its original alignment, causing its temperature to be near negative room temperature, then cooling to room temperature through \infty K over the course of several minutes. As it cooled, it dissipated heat to its surroundings.

Why did it cool off? On a time scale longer than it takes for the LiF’s magnet to reach internal thermal equilibrium, but shorter than it takes to cool off, the LiF’s magnet and the other forms of energy (e.g. molecular vibrations) within the LiF crystal are two separate thermal systems, each with their own temperature. However these two forms of energy can be exchanged, and on a time scale longer than it takes for them to reach thermal equilibrium with each other they can be treated as a single system with some intermediate temperature. As the molecular vibrations had a positive temperature, and the magnetic component had a negative temperature, thermal energy flowed from the latter to the former.

Consider more generally any tangible object made of ordinary atoms and molecules in space. The object will contain multiple energy states corresponding to various excitations of these atoms, such as vibrations and rotations of molecules, electronic excitations, and other quantum phenomena. For these excitations, the entropy increases rapidly with energy, so the corresponding temperatures will always be positive. Whatever other forms of energy the object exhibits, they will eventually reach equilibrium with the various atomic excitations, and thus converge on a positive temperature.

While such an object will exhibit increasing entropy with energy on the large scale, maybe it is possible for it to have a small-scale perturbation in entropy and thus a narrow window with negative temperatures. Perhaps, but some of the atomic excitations are very low energy, and thus they have a gradually increasing entropy that is smooth even to quite a small scale. Only under some very unusual circumstance could that object have negative temperatures.

Nonetheless, in 2012 researchers were able to bring a small cloud of atoms to a negative temperature in their motional modes of energy. I read the paper but didn’t really understand it.

As an aside – what if some exotic object had a highest energy state but no lowest energy state? This object could only have negative temperatures, would always be hotter than any ordinary objects it was in contact with, so would perpetually have thermal energy flowing from it.

(An object with no highest or lowest energy state could never reach internal thermal equilibrium, wouldn’t have a well-defined temperature, and its interactions with other objects couldn’t be represented as an average of small-scale phenomena, making statistical mechanics inapplicable.)

Lasers

However, negative temperatures are closely related to a phenomenon which is essential in the functioning of an important modern technology: lasers! Lasers produce light when excited electrons in the lasing medium drop to a lower energy level, emitting a photon. While an excited electron can spontaneously emit a photon, lasers produce light through stimulated emission of an excited electron with a photon of the matching wavelength: the stimulated photon is synchronized exactly with the stimulating photon, which is why lasers can produce narrow, coherent beams of light. However, the probability that a photon stimulates emission from an excited electron is equal to the probability that the photon would be absorbed by an unexcited electron, so for a beam of light to occur the majority of the electrons must be excited.

This is an example of population inversion, where a higher-energy state has greater occupancy than a lower-energy state. Population inversion is a characteristic phenomenon of negative temperatures: an object at thermal equilibrium with a positive temperature will always have occupancy decrease with energy, while at negative temperature it will always have occupancy increase with energy. (At infinite temperature, occupancy of all states is equal.)

But is the lasing medium at thermal equilibrium? Certainly we have to consider a short enough time scale for the electronic excitations to be treated separately from the motional energy modes. However even considering only electronic modes, modern lasers do not operate at thermal equilibrium. Originally lasers used three energy levels (a ground state and two excited states), for which it would have been debatable whether calling the lasing medium “negative temperature” is appropriate. Now, most lasers use four (or more) energy levels due to their far greater efficiency.

Such a laser has two low-energy levels, and two high-energy levels. The narrow transitions (between the bottom two or the top two levels) are strongly coupled with vibrational or other modes, so electrons rapidly decay to the lower of the two. As a result, there is a very low occupancy in the second-lowest energy level, making it easy to generate a strong population inversion between the two middle energy levels, which is the lasing transition. Thus the laser still operates at positive (but non-equilibrium) temperature, as most electrons remain in the ground state even when a population inversion between two higher levels is created.

Appendix: Proof of uniformity of distribution in microcanonical ensemble

We claimed that, under reasonable physical assumptions, the spin system with pN up spins will eventually be equally likely to be in any of the \binom {N}{pN} such configurations. (A particular configuration of up / down spins is called a microstate, and the collection of all microstates with the same energy is called a microcanonical ensemble (MCE).) This is intuitively sensible since any particular spin goes on a random walk over all particles, and thus any particular particle is eventually equally likely to have any of the starting spins. However I found rigorously proving this more involved than I thought, and I haven’t seen any similar argumentation elsewhere, so let’s work out all the details.

Any two “nearby” particles can exchange spins with some probability, not necessarily equal for all such pairs. Let us discretize so that in each time step either 0 or 1 such exchanges happen. The graph of particles with edges for nearby particles is connected: otherwise the system would actually be multiple unrelated systems. There are finitely many microstates in the MCE, and we can get from any one to any other in at most N^2 time steps with positive probability, for example using bubble sort.

Let Q be the state transition matrix. In the language of Markov chains, we have a time-homogeneous Markov chain (because the probability of a swap does not depend on the spins involved) which is irreducible (i.e. any state can get to any other) and is aperiodic (a consequence of the identity transition having positive probability) and thus ergodic (irreducible and aperiodic). First we would like to establish that the Markov chain converges to a stationary distribution: this is effectively exploring the behavior of Q^k for large k.

The behavior of Q^k is dominated by its largest eigenvalue. Since all elements of Q^{N^2} are positive real numbers, by the Perron-Frobenius theorem it has a positive real eigenvalue which has strictly larger magnitude than any other eigenvalue, which has an eigenspace of dimension 1, and which furthermore has an eigenvector v with all positive coefficients.

Now the eigenvalues of Q^{N^2} are simply the eigenvalues of Q raised to the power N^2, so Q likewise has a real eigenvalue r > 0 with strictly larger magnitude than any other eigenvalue, an eigenspace of dimension 1, and so on. (Use both N^2 and N^2 + 1 to show that r cannot be negative; to show r is not complex, other powers can be used.) Then for any vector w with positive components we get

 \lim_{k \to \infty} \frac 1{r^k} Q^k w = v

by writing w in the eigenspace decomposition for Q, using generalized eigenvectors if Q is defective. (In fact r = 1.)

Alternatively, observe that for any two different states a, b such that the transition a \to b has positive probability, then a and b are related by a single swap of two spins. The transition b \to a is the same swap, and therefore has the same probability, as the probability of a swap does not depend on the values of the spins. Therefore Q = Q^T is real symmetric, and thus is diagonalizable with real eigenvalues. This saves us some casework above for dealing with complex eigenvalues or generalized eigenvectors.

We have seen that for any starting state, the system eventually converges to the same stationary distribution. (This is a basic result of Markov chains / ergodic theory, that any ergodic chain converges to a stationary distribution, but I didn’t see a proof spelled out somewhere.) What is that distribution? Since Q is a transition matrix, the total probabilities going out of each state equals 1. Then from Q = Q^T we see that the total probabilities going in to each state is 1, so the uniform distribution is stationary.
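
As an illustration (not a substitute for the argument above), here is a tiny simulation of such a spin-exchange chain on a small ring, checking that every microstate with the same number of up spins is visited roughly equally often; the ring geometry and the swap probability are arbitrary choices:

```python
import random
from collections import Counter

N = 6
state = [1, 1, 0, 0, 0, 0]   # 2 up spins out of 6, so C(6, 2) = 15 microstates
counts = Counter()

for _ in range(300_000):
    i = random.randrange(N)              # pick a random adjacent pair on a ring
    j = (i + 1) % N
    if random.random() < 0.5:            # swap their spins with probability 1/2
        state[i], state[j] = state[j], state[i]
    counts[tuple(state)] += 1

print(len(counts))                                   # 15: every microstate reached
print(min(counts.values()) / max(counts.values()))   # close to 1: roughly uniform
```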

More generally, for any system in which the MCE respects some symmetry such that each state has the same total probability going in, the uniform distribution will be a stationary distribution; and if the system is ergodic, then it always converges to that unique stationary distribution.

As a side note, using Q^{N^2} instead of Q and taking care that the entries are strictly positive instead of nonnegative, permitting a time step to have 0 exchanges, and so forth is only necessary to deal with the case that the connectivity graph is bipartite: if the graph is bipartite, the system can end up oscillating between two states with each swap. This is purely an artifact of the discretization; in the real system, as time becomes large, the system “loses track” of whether there were an even or odd number of swaps. Thus a lot of the complication in the proof comes from dealing with an unimportant case.

Mystery Hunt 2022 Team Statistics

2022 February 02

With MIT Mystery Hunt 2022 (wrap up video, stats summary) having come to a close, the second year that Mystery Hunt was completely virtual, I am also winding down the second year that I helped out with a portion of the behind-the-scenes tech stack for team I’m not a planet either… and sometimes less “behind the scenes” when things go awry! It is a unique position to be in, with goals driven by a mixture of what automated tools I would like to make use of, what support the team needs so that people can puzzle together uninterrupted, and what surprises the hunt organizers throw our way.

Being in this position also lets me piece together some statistics on the team’s progress on the hunt, which I also did last year. First, how many puzzles were available to our team, and how many of them had been solved at any time:

Like all images here, click to zoom. Note that this year events were not considered “puzzles” and are not included in the statistics. We see that for a period of about 4 hours on Saturday afternoon we were reduced to just three open puzzles, and briefly only two. The story behind this is apparent when we separately track puzzles by round:

The Investigation was the first round of hunt, and solving its meta unlocked the second round. Two puzzles from this round went untouched for a very long time until they were backsolved near the end of the hunt, using the solution of the meta to work backwards to determine their answers. They were also the first and third most popular puzzles for other teams to request the solution to.

The Ministry was the second round of hunt, containing 25 puzzles, 5 metapuzzles, 1 metametapuzzle, and a run-around. Completing The Ministry was intended by the hunt organizers to serve as an intermediate goal that would be feasible for non-competitive teams to accomplish, so was structured like the ending of hunt. 60 teams successfully finished The Ministry.

As a result, progress was bottle-necked on completing The Ministry: unlocking the third round required completing the run-around, which was unlocked 15 minutes after completing the metametapuzzle, which itself was unlocked by completing all 5 metas. The only other puzzles available at the time were the two from The Investigation.

The third and by far the largest round was Bookspace, containing many “subrounds”. We were steadily working our way through it when time was called. One puzzle from this round was recorded as being unlocked long before it actually was, because we were given information about it at the beginning of The Ministry.

Below is the list of every puzzle solved by the team. Some of the solution times of early puzzles were inaccurate as we had to manually record puzzle solutions in the beginning of the hunt. In total we gained access to 84 puzzles, of which we solved 65.

Puzzle Round Solve time
Kid Start-up The Ministry 22m 54s
The Boy with Two Heads The Ministry 37m 15s
Something Command The Quest Coast 44m 51s
Does Any Kid Still Do This Anymore New You City 51m 7s
Crewel The Ministry 52m 21s
Ada Twist Scientist The Investigation 58m 24s
My First ABC The Investigation 1h 0m
The Day You Begin The Ministry 1h 1m
The Wonderful Wizard of Oz The Investigation 1h 4m
Fruit Around The Ministry 1h 4m
The Hobbit The Ministry 1h 9m
Make Way for Ducklings The Ministry 1h 13m
Dinotopia The Ministry 1h 18m
Harold and the Purple Crayon The Ministry 1h 23m
The Talking Tree The Ministry 1h 43m
Tikki Tikki Tembo The Ministry 1h 47m
Cemetery Boys The Ministry 1h 50m
Pippi Långstrump/Pippi Longstocking The Ministry 1h 53m
The Missing Piece The Investigation 1h 56m
Your Name is a Song The Ministry 2h 9m
Watership Down The Ministry 2h 13m
Kiki’s Delivery Service The Ministry 2h 15m
Teach Us Amelia Bedelia The Investigation 2h 17m
Just a Dream The Ministry 2h 18m
Potions The Quest Coast 2h 21m
The Ministry The Ministry 2h 26m
Where The Wild Things Are The Investigation 2h 29m
Peter Pan The Investigation 2h 32m
The Last Olympian The Ministry 2h 58m
A Wizard of Earthsea The Ministry 2h 59m
Magically Delicious The Quest Coast 3h 16m
The Investigation The Investigation 4h 56m
A Wrinkle in Time The Ministry 5h 9m
Too Many Toys The Investigation 5h 20m
Go The F*** To Sleep The Ministry 5h 30m
Oxford Children’s Dictionary The Ministry 5h 34m
Sometime After Midnight The Ministry 5h 43m
Frankenstein’s Music Lake Eerie 5h 53m
I Don’t Have a Clue! Noirleans 5h 54m
The Colour Out of Space Lake Eerie 6h 23m
Mysterious Mechanics Noirleans 6h 27m
The Adventures of Pinocchio The Ministry 7h 33m
The Mad Scientist’s Assistant Lake Eerie 8h 1m
The Hound of the Vast-Cur Villes Noirleans 8h 10m
Sorcery for Dummies The Quest Coast 8h 19m
Alice’s Adventures in Wonderland The Ministry 8h 57m
Dancing Triangles Noirleans 9h 31m
The Thin Pan Noirleans 9h 50m
Albumistanumerophobia Lake Eerie 10h 10m
Charlotte’s Web The Ministry 10h 51m
A Number of Games The Quest Coast 14h 2m
Curious Customs The Quest Coast 14h 24m
The Enchanted Garden The Quest Coast 15h 8m
Curious and Determined Noirleans 16h 20m
Billie Barker The Ministry 18h 18m
Randy and Riley Rotch The Ministry 19h 2m
Everybody Must Get Rosetta Stoned New You City 19h 55m
Danni Dewey The Ministry 20h 11m
Trickster Tales Noirleans 20h 34m
Herschel Hayden The Ministry 20h 35m
Alexei Lewis The Ministry 20h 45m
Book Reports New You City 30h 37m
My Dinner With Big Boi New You City 30h 38m
The Messy Room The Investigation 51h 13m
The Neverending Story The Investigation 51h 35m

The longest two puzzle solves were backsolves; we received the solution to My Dinner With Big Boi as a bonus; and Book Reports was recorded in our system as available long before it actually was unlocked. The five longest puzzles from The Ministry were all metas, so the longest standard puzzle solution was Trickster Tales.

Compared to last year, we solved a much higher fraction of the puzzles accessible to us, and the slowest ordinary solutions were much faster; although conversely there were no 3 minute puzzle solutions this year. Many fewer puzzles were open at any time, possibly due to some combination of each individual puzzle being “bigger” and the hunt being much more linear.

Cheap and simple mask fit testing

2022 February 01

  1. Introduction
  2. Disclaimer
  3. Masks
    1. What are fit testing and seal checking?
    2. How to perform a seal check
    3. Cloth and surgical masks
    4. Acquiring genuine N95s
    5. Donning and doffing
    6. Mask re-usability
    7. Do glasses provide protection?
  4. OSHA fit testing
  5. Quick-and-easy fit testing
    1. Preparing the testing solution
    2. Performing a fit test
    3. Results
  6. Other resources
  7. Thoughts on protecting against covid

Introduction

Short version: Read the disclaimer, how to make the testing solution, how to perform the test, and my results.

US healthcare workers who rely on respirators, such as N95s, for their safety undergo annual fit testing to make sure that they are protected. Of course professional fit testing equipment is priced like all aspects of American healthcare… beyond the reach of the typical American. Here I detail the steps I took to imitate this procedure and how you can perform them as well.

Professional PPE usage is intended to provide a level of protection suitable for someone in sustained direct contact with highly-infectious covid patients. If you were to have a passing encounter with a contagious person in public, your exposure risk would be much lower than in a hospital setting. My goal is to gain most of the protection of professional PPE with a minimum fraction of the effort, in the hopes that this compromised level of protection is more than sufficient for my likely exposure level.

If you live in my area and would like masks or to borrow my testing solution, let me know!

Disclaimer

If you need professional-level respiratory protection, disregard this article and follow the relevant regulations. When I write things like (e.g.) disposing of a mask after a single usage is wasteful, I am writing for those, like me, who are looking for modest protection from incidental covid exposure in a public setting. I am not an expert.

Masks

Masks are not perfect at blocking small particles. There are two ways they can fail: either through leakage around the sides, or through inadequate filtration of air that passes through the filter. Masks must compromise between these two failure modes, as additional layers of filtration material improve filtration but increase breathing resistance, and therefore increase the tendency for air to pass through any small gaps around the sides.

What are fit testing and seal checking?

The goal of fit testing is to identify a model of mask that gives a high quality fit for a specific person. The person is exposed to a strong-smelling/tasting chemical while wearing a mask; the odiferous chemical is aerosolized into tiny droplets for which the mask’s material has a high filtration efficiency, so detection of the chemical indicates that air is leaking around the mask. Note that a fit test does not test whether the filtration material is adequate: it only checks for leaks. A wide variety of particulate sizes and types would need to be used to validate the quality of the filtration media, but are irrelevant for detecting leakage.

Those in a professional setting will typically undergo fit testing annually through a procedure regulated by OSHA; see below. A variety of masks are tested until one that passes the test is found. Like other garb, a proper fit depends on the shape of the mask matching the person wearing it, so there is no one “best” mask for everyone.

Once an appropriate model of mask has been found, seal checking (or a “fit check”) is performed every time that mask is donned or adjusted, before entering the hazardous area. It only takes a few seconds and I do it every time I go out: I find leaks the majority of the time.

How to perform a seal check

To perform a seal check, cover the filtration surface of the respirator with your hands (while wearing it), and either breathe in (negative pressure) or breathe out (positive pressure). You should feel increased breathing resistance, and you should not feel any air passing the side of the mask. The mask should also visibly deflate / inflate slightly. Adjust the mask and repeat if leaks are found.

See a video of a seal check.

The increased breathing resistance is more apparent in the negative pressure test, as breathing in is harder than breathing out. However breathing in tends to tighten the seal, concealing leaks, so leakage around the sides is easier to detect when breathing out. If you wear glasses, fogging of the glasses can be the most obvious sign of leakage around the nose on breathing out. I suggest using a mask with a foam nose insert and wearing it high up on the nose to reduce leakage there.

Don’t forget to shave: studies find that facial hair located at the mask’s seal increases leakage by a factor of 20 to 1000. Even a small stubble compromises the seal.

Cloth and surgical masks

Neither fit testing nor seal checking serves much function for cloth or surgical masks, or KN95s. (Though if you try, let me know!)

Double-masking to improve filtration is generally unhelpful as it can increase leakage instead. However, a close-fitting cloth mask worn over a surgical mask can be beneficial if it improves the fit of the latter. Never wear an N95 over another mask: this combines the fit of a surgical mask with the breathing resistance of a respirator.

A better alternative to double-masking is a mask fitter / mask brace. A medical face mask rated at ASTM-2 or ASTM-3 with a correctly sized mask fitter might be compared to the protection of an N95.

Cloth masks are often found to be substantially inferior to surgical masks.

Acquiring genuine N95s

I have purchased 3M Aura N95s online from Northern Safety Industrial and Home Depot. In both cases they cost on the order of $2 per mask. Another source is Industrial Safety Products. I believe that mask manufacturers, particularly 3M whose masks are very popular in healthcare, prioritize selling to medical distributors, so any masks sold to the general public are after medical demands have been satisfied. (Note that 3M masks have become scarce again since covid omicron hit the news.)

Only purchase N95s and KN95s from an established seller… and certainly not off of Amazon.

The majority of KN95s in the US are counterfeit!

For a mask to qualify as an N95 it must be verified by NIOSH, a part of the CDC, to meet specific performance requirements. (“KN95” is a standard regulated by China, and inferior to N95 due to the use of ear loops; “FFP2” is the European equivalent, with “FFP1” inferior and “FFP3” superior.) Refer to this CDC page for guidance on recognizing counterfeit masks and how to look up your mask’s NIOSH approval.

I believe the majority of “counterfeit” N95 masks in the US simply lie about being NIOSH approved; this can easily be detected by the above. Detecting forgeries, which imitate genuinely approved N95 masks, is harder.


Donning and doffing

Above I linked a video of a seal check, which included demonstration of donning and doffing. CDC procedures are to only handle masks by the straps when donning and doffing. I doubt this matters much; of course you should wash your hands after handling the mask regardless.

A little etymological diversion… “don” and “doff” are contractions of “do on” and “do off”. They had become obsolete by the 17th century; Sir Walter Scott later repopularized them, though the words “dup/dub” (open) and “dout” (put out) remained extinct.

Mask re-usability

Disposable masks are meant to be used once. However this is excessive and wasteful of limited resources.

Unless you are visibly soiling the respirator surface, the limiting factor on the re-use of your mask is the strength of the elastic band. As it weakens with use, particularly during donning and doffing, the mask is held less tightly to the face and leaks develop. This is also why masks with ear loops can’t seal as effectively as those whose straps go behind the head.

Certain masks allow for the tightness of the elastic head bands to be adjusted by the wearer: this greatly extends the life and performance of the mask by allowing the fit to be tightened as the elasticity weakens with re-use.

The quality of an N95’s fit declines measurably after around 5 to 30 uses, though it remains far superior to cloth or surgical masks.

Masks can be sterilized between uses with UV light; however leaving them in a paper bag at room temperature for several days is a simpler and more reliable technique. I rotate through several masks and don’t reuse them more than once every 3 days. I write the date of its first use on each mask’s bag and then infrequently dispose of the oldest mask.

Water and other cleaning fluids should only be applied to cloth masks. Brief exposure to water (e.g. in light rain) is not a concern. Alcohol-based cleaners will destroy the filtration material!

Source (pdf). Mask failures were found to significantly increase after the second day of mask re-use. The subjects were hospital workers who had performed OSHA-regulated fit testing and were trained to do a user seal check with every mask donning. The length of each shift and the number of donnings / doffings performed were not recorded. Workers consistently overestimated the likelihood that their mask would pass the fit test (even though seal checks would have revealed any obvious problems). I would not read much into the exact numbers found by the study, but the rise in mask failures with reuse is significant.

Do glasses provide protection?

A study based on data from a Chinese hospital from January to March 2020 found that wearing glasses reduced the risk of being hospitalized with covid by a factor of about 10. This study has been sometimes cited to suggest that glasses provide some protection against infection. However, as one might guess from the ridiculous conclusion they reached, the study is riddled with fundamental methodological flaws. Besides, anyone who has worn glasses in rain or wind is well aware they do not provide any protection against air-borne droplets reaching the eye.

I am not aware of any firm evidence that covid can, or cannot, be transmitted via the eyes. If you wish to protect your eyes you should wear goggles or a full-face respirator. I don’t think this is necessary but I did drop $5 on getting cheap goggles in case they might be useful in the future.

OSHA fit testing

OSHA publishes official regulations on how fit testing should be performed which employers are required to follow for employees who will be exposed to respiratory hazards. All fit testing procedures that I have reviewed are derived from these procedures. If reading “regulationese” is not your desired pastime, 3M has a one page quick reference to the qualitative test.

Quantitative fit testing involves puncturing the mask being tested with a particle-measuring device while the user is wearing it; it does not rely on the wearer’s senses to test the fit of the mask. This test yields a filtration efficiency, allowing one to quantify how much better one mask is than another. We will not be considering quantitative fit tests further.

Qualitative fit testing yields only a “pass” or “fail” depending on whether the wearer detected any of the test substance while wearing the mask.

I summarize the official OSHA procedure as follows:

  1. A dilute solution of 0.83 grams of sodium saccharin in 100 mL of distilled water is prepared.

  2. A concentrated solution of 83 grams of sodium saccharin in 100 mL of distilled water is prepared.

  3. During the whole test the subject wears an enclosed hood into which the test solutions will be released.

  4. With no mask, the dilute solution is introduced into the hood to verify the subject can taste it, and at what concentration.

  5. With the mask being tested, the concentrated solution is introduced into the hood. The wearer performs several acts, including moving the head, grimacing, exercising, and talking, and should include the sort of motions that will be done in the hazardous environment.

  6. The mask’s fit passes if the concentrated solution is not detected at all.

Bitrex, a nasty bitter substance which OSHA describes as a “taste aversion agent”, can be substituted for the sodium saccharin solution. This increases the sensitivity of the test, with the downside that you have to taste Bitrex.

The OSHA procedure recommends sticking the tip of your tongue out which I found to be unhelpful.

Why use the oddly specific concentration of 83 g / 100 mL? When I attempted to make that concentration, I found that it was just a little higher than the saturation concentration of sodium saccharin. Note that saturation is strongly dependent on temperature, and OSHA specifies warm water. My guess is therefore that they simply chose the saturation point, to make the strongest-tasting solution possible. However, when I tried to look up the saturation point of sodium saccharin, I got wildly conflicting information: most sources simply said “greater than 10 g / 100 mL”, and others gave values well below what I observed.

The instructions I give in the next section use half this concentration to reduce annoyances like precipitation on temperature change.

Quick-and-easy fit testing

The process I describe is based on the official OSHA procedure described above; I removed the portions that are annoying (e.g., using distilled water), difficult (making a hood), or expensive (buying medical-grade equipment). This simplified procedure retains only the core concept of applying strong-tasting particulates to the outside of a mask and testing if they can be detected.

Preparing the testing solution

You require sodium saccharin, tap water, and an aerosolizer (a small rechargeable mister with a refillable reservoir); a scale and a way to measure about 20 mL of water are helpful but optional.

To prepare the testing solution:

  1. Weigh out about 8 grams of sodium saccharin.

  2. Mix into 20 mL of room-temperature tap water; it will take some effort to dissolve fully.

Alternatively, if you do not have any measuring equipment:

  1. Collect enough water to fill about one third of the aerosolizer’s reservoir (10 mL out of 30 mL).

  2. Slowly add in sodium saccharin, mixing it as you go, until there are solid particles that do not dissolve even with continuous mixing for more than a minute. This will be about 8 grams – nearly as much sodium saccharin as water.

  3. Dilute by mixing 1 to 1 with tap water. Mix thoroughly and pour liquid into the reservoir, disposing of any excess or remaining precipitate.

The purpose of diluting sodium saccharin is to discourage precipitation out of solution. The exact concentration is not important.
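For concreteness, here is a small Python sketch of the concentration arithmetic above; the names and constants are my own labels, and nothing depends on the exact values:

```python
# Rough arithmetic for the test solution, in grams of sodium saccharin
# per 100 mL of water. The exact concentration is not important.
OSHA_CONCENTRATED = 83.0          # g per 100 mL, from the OSHA procedure
TARGET = OSHA_CONCENTRATED / 2    # the roughly half-strength mix used here

def saccharin_grams(water_ml, concentration=TARGET):
    """Grams of sodium saccharin to dissolve in the given volume of water."""
    return concentration * water_ml / 100

print(saccharin_grams(20))   # ~8.3 g in 20 mL, matching the ~8 g above
print(saccharin_grams(30))   # ~12.5 g if you wanted to fill a 30 mL reservoir
```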

Once you have filled and re-attached the reservoir to the aerosolizer, and charged it, you are ready to perform a test. The button on the aerosolizer toggles whether it is on or off; check that it works. Some aerosolizers operate by dispensing once each time it is triggered, which may allow for greater control over the amount dispensed.

If its performance declines, check it is charged and check for precipitate accumulating on the outlet. When left unattended for a long time I find a significant build up of sodium saccharin clogging the device. It is easy to clean with a damp cloth or by flushing with tap water.

Performing a fit test

Test indoors in a location you can ventilate. See a video demonstration, though I recommend breathing naturally, unlike the person in the video.

Wearing your mask, point the aerosolizer at your face from a few inches away and turn it on for a few seconds. I suggest starting a little farther and then moving close and going around the whole perimeter of the mask, especially around the nose and under the chin. You don’t have to worry about getting the mist in your eyes.

While spraying the aerosol, breathe shallowly in and out through your mouth. I and others have found the solution produces no odor, so do not bother trying to smell it. Rather, it yields a very mild sweet flavor at the back of the tongue or throat, which can take a few seconds to become detectable. The solution is very strong and you will detect even small amounts getting past the mask. Take note of whether you can detect it, and under what circumstances.

If 10 or so seconds of spraying the solution directly at the seals of the mask produces no sensation, then the result is a complete success.

In any case, after removing the mask there should be a very stark difference as even lingering aerosol in the air will be easily detectable.

While the official OSHA procedure is a binary success or failure, it is easy enough to find gradations in how strong the taste is or how much spray is needed to be detectable. Assessing the quality of the mask fit based on what you detect is left to your judgement; I describe my own experiences below to provide context.

Do not breathe in the aerosol directly without a close-fitting respirator! In initial testing I tried to get a tiny whiff from a foot away without my mask but inhaled too strongly, and ended up coughing vigorously for the next two hours, with a sickly sweet flavor to every cough. If you avoid direct exposure it’s not too bad: others found the lingering aerosol to be unpleasant but tolerable without a mask.

If you are paranoid you can use a more dilute solution, but as long as you don’t spray it at your unmasked face you’ll be fine.

Results

I have used this process with about 15 masks across 10 people; the masks included a variety of N95 models, and one P100 mask.

Note that the concentration of solution described above is very strong. This is valuable as it increases the discrimination of the test – but do not be disappointed if your mask is not perfect. The results of this procedure cannot be directly compared to those of the official OSHA procedure, as the method of applying the aerosol is different and depends on how the user operates the aerosolizer.

Some of the masks I helped fit test. At right is the popular 3M Aura 9210+, which fits me very well. Below is a 3M 9502+ KN95 mask with ear loops, and at left is the 3M 8210 N95 mask, which is stiffer than the Aura series. Which of these masks fit best varied from person to person. The aerosolizer we used (the “Nano Mini Disinfecting Sprayer” sold by Emerald Prairie Health) is at top.

Every N95 I helped test experienced some degree of leakage. Wearing a carefully adjusted 3M Aura mask, I could only detect a faint hint of sweetness in my throat after several seconds of spraying the aerosol directly at the mask’s seals; compare this to my experience above of exposure without a mask. On this basis I feel very confident in the protection that this mask provides me.

Some masks had different outcomes on different people, and some people had different outcomes with different masks: fortunately everyone who tried this procedure with me appeared to be satisfied with at least one mask they tried, up to their standard of precaution.

Some people, myself included, felt reassured about the benefit of their mask after experiencing the sharp contrast between breathing comfortably while aerosol was sprayed at their face and the unpleasantness of passing through that room without a mask well after testing had stopped.

The GVS SPR457 mask, rated at P100, which convincingly passed the fit test on its owner. They also tested a Moldex 4200 Airwave. The latter is very comfortable to breathe through on account of its low breathing resistance, but it does not have an adjustable nose piece, and was a poor fit for the people who tested it.

The P100-rated mask proved to be the only mask totally impervious to the testing solution – the person wearing it could not detect the solution at all.

If you decide you are only safe with a respirator rated above an N95, then you should absolutely perform a fit test: a P100 that does not fit you is worse than an N95 that does. There is no reason to invest in an expensive, uncomfortable mask but not invest in checking whether it fits you.

Other resources

Source. Several masks were tested with various degrees of fit: they found that ill-fitting N95s exceeded the performance of surgical / cloth masks. Most of the failure of cloth masks was due to leakage from poor fit, not due to the filtration being inadequate. Note the very small sample size.

Thoughts on protecting against covid

Wearing a mask or getting vaccinated has often been compared to wearing a seatbelt. Let us extend this analogy a little further, and talk about the multiple layers of protection in the Swiss cheese model of hazards.

I found this nice graphic of the Swiss cheese model on Wikipedia… alongside not one but two such graphics specialized to covid-19 hazards. Nothing like seeing your exact idea in graphical form to make you feel unoriginal. Both of the artists seem very confused about the concept of “layers”, though, so I stuck with the generic graphic.

I will somewhat arbitrarily group protection measures in three layers:

  1. Avoiding hazardous conditions.

  2. Avoiding hazard exposure risk when in hazardous conditions.

  3. Mitigating the severity of the hazard when it happens.

For avoiding car crashes, this means:

  1. Not driving on a road with unsafe drivers.

  2. Driving safely and “defensive driving”.

  3. Wearing a seatbelt, as well as airbags and other safety equipment.

For mitigating covid, this means:

  1. Physical distancing from people with covid.

  2. Wearing a mask and having substantial ventilation or air filtration when around people with covid.

  3. Getting vaccinated, bolstering your immune system with sufficient sleep, and other health measures like losing weight.

You only need to take precautions around contagious people, just as you only need to avoid drivers who get into car accidents… but in practice that means taking precautions around everyone. I would like to emphasize that getting covid or spreading covid is not something that only happens to “other” people or to “bad” people. Wearing a mask around someone is not a moral condemnation of them any more than wearing a seatbelt when they drive is. “We’ve been friends for 10 years, you don’t need to wear a mask around me” is asking you to “trust” that they can’t catch covid… might as well “trust” that they won’t develop diabetes or cancer. Being a loyal friend does not make you immune to medical ailments!

So let us return to risk assessment, free of any distractions about morals. To catch covid, all three layers of protection must fail you. (Similarly for other endpoints of interest, such as being hospitalized due to or dying of covid.)
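To make the “all layers must fail” arithmetic concrete, here is a toy Python sketch; the per-layer failure probabilities are invented placeholders for illustration, not estimates:

```python
# Toy "Swiss cheese" calculation: pick an endpoint (say, catching covid) and
# suppose each layer fails independently with some probability. The endpoint
# occurs only when every layer fails, so the probabilities multiply.
# These numbers are made up for illustration -- they are not risk estimates.
layers = {
    "avoiding hazardous conditions":  0.5,
    "masking and ventilation":        0.1,
    "vaccination and general health": 0.05,
}

p_all_fail = 1.0
for layer, p_fail in layers.items():
    p_all_fail *= p_fail

print(f"chance that every layer fails: {p_all_fail:.4f}")  # 0.0025 with these numbers
```

Independence between layers is only an approximation, but this multiplicative structure is why even an imperfect extra layer still cuts the overall risk.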

Early in the pandemic, without vaccinations or general access to quality masks, the only effective layer of protection was physical distancing (confusingly called “social distancing” in the media). This would be akin to having a 1950s car with no seatbelt or airbag in a snow storm: best just stay home.

Now this has changed. Vaccines, of course, are amazing, with perhaps a 10x or 20x reduction in death risk, and lesser protection against hospitalization and symptomatic illness. But vaccinations’ real value is in how easy and consistent they are: you can never forget to “put on” your vaccination, or have to decide around whom you will go unvaccinated. If you are willing to put in just a little more effort for a higher level of protection, then you should invest in the other layers of protection.

My approach is to treat the second layer, masking, as my main line of defense against covid. It is not an impervious layer: vaccination is so easy that it’d be foolish to go without it as a backup, especially for situations where wearing a mask is infeasible. But I find investing in masks much preferable to investing in the first layer, which would entail avoiding people on the presumption that they may be contagious. Wearing a high quality mask whose fit I have verified gives me the confidence to not worry about who I interact with. (Don’t forget seal checks each time you wear your mask! I wouldn’t be surprised if doing seal checks reduces the leakage of my mask by a factor of 10.)

Of course, you can never be too careful about what might be in your mask.

Links

2022 January 28

Music

Talos Principle OST - Virgo Serena

Maxence Cyrin - Where is my mind (piano cover)

Zbigniew Preisner - Requiem for my friend - Lacrimosa

Haken - Atlas Stone

Mathematical overkill: using Goedel’s compactness theorem to solve a geometry problem by creating a formal system whose consistency is equivalent to the existence of a solution to the problem.

Letterlocking is a security technique of intricately folding and sealing a letter to protect it from inspection or tampering in transit without visibly damaging the letter. Letterlocking has been used for more than 700 years and folding methods were often personalized by individual letter writers. Recent work has allowed researchers to defeat historical locked letters using xrays and computer modeling to inspect them without damaging them.

In Penney’s game, for a fixed n > 2, two players sequentially name a different sequence of coin flips of length n (e.g., the first player might say TTH, and then the second player might respond HHH). Remarkably the second player always has a winning strategy, making this a non-transitive game.
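One way to see the non-transitivity is to simulate it; here is a quick Monte-Carlo sketch (my own, not from the linked article) of the classic cycle for length-3 patterns:

```python
import random

def p_second_wins(first, second, trials=20_000):
    """Estimate the probability that `second` appears before `first`
    in a run of fair coin flips."""
    n = len(first)
    wins = 0
    for _ in range(trials):
        flips = ""
        while True:
            flips += random.choice("HT")
            tail = flips[-n:]
            if tail == first:
                break
            if tail == second:
                wins += 1
                break
    return wins / trials

# A non-transitive cycle: each response beats the pattern it follows.
pairs = [("HHT", "THH"), ("THH", "TTH"), ("TTH", "HTT"), ("HTT", "HHT")]
for first, response in pairs:
    print(f"{response} beats {first} about {p_second_wins(first, response):.0%} of the time")
```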

Herbert Dingle (1890 - 1978) was a respected physicist who became reputed as a crank in his later years. However, it seems he had no idea what he was doing all along:

At this point his colleagues were convinced that Dingle was insane through age or loneliness […] But there is a small point. Dingle was not crazy. For 35 years from 1920 to 1935 (that is before starting his campaign against Relativity) Dingle had written, held conferences about and lectured on a theory of which he’d never understood any part.

[…] Nor is [it] a novel event for a famous scientist to start supporting an absurd or ‘heretical’ theory, completely losing any credibility, maybe for ideological or political reasons, or out of academic rivalry. Here, though, we face a different matter, and an even more chilling one: someone who is a supposed expert in a sector in which ‘peer reviews’ exist with all the accolades and the respectability that entails, who shows that he hasn’t understood a word of things that he’s been left to discuss for years. […] if one looks at his books on the history of scientific philosophy, they are full of blunders. In practice, it’s not that Dingle forgot some things, or was acting under false pretences. He really didn’t understand some things.

That blog also has a fascinating mini-biography of Alexander Grothendieck, one of the greatest and most independent mathematicians of the last century. I hadn’t really internalized just how unusual it is to go to North Vietnam during the US-Vietnam War to teach mathematics in a war zone, just one of many oddities of his life.

Grothendieck seemed indifferent to the danger. When the bombings got too violent, his hosts moved their classes to the jungle. It wasn’t a problem for Grothendieck. He dressed as a Vietnamese peasant, wore sandals made from old car tires, and slept on the ground. The math lessons were very advanced, and Alexander hove into the sights of the western secret service, which continued to track him for years. But his Vietnamese visit had an important outcome in that Grothendieck became the rapporteur of the dissertations of Hoàng Xuân Sính, the first important female Vietnamese mathematician and founder of Than Long University [and first female professor in Vietnam of a technical field], who gained her doctorate under Alexander’s supervision in 1975.

Images

Source tweet

The first prototype computer mouse. It was demoed in 1968 at The Mother of All Demos, which also demonstrated hyperlinking, windows, version control, video conferencing, and real-time multi-user text editing.

From False Knees.

Halos around the sun observed from Rhone-Alpes, France. Refraction through ice crystals is responsible for most of the features seen.

Media sources for legal news and the Jan 6 insurrection investigation

2022 January 24

Anybody who’s seen their field of expertise appear in the mainstream media knows that for quality reporting you should look to investigative reporters who specialize in the field. (Incidentally for math news that is exclusively Quanta Magazine.) With the increased attention on the investigation into the Jan 6 insurrection, here are some of the sources I’ve been following that are at least a step up from the mainstream:


  1. Via radiation of infrared light to space due to the blackbody effect; see my series on the greenhouse effect for more details.↩︎

  2. We suppose that the heat capacity is constant over the range of temperatures of interest, so energy and temperature are linearly related.↩︎

  3. Black-body radiation varies with the fourth power of temperature per the Stefan-Boltzmann law, so we need a cubic correction factor, which is modest over the range of temperatures of interest; but we are not trying to be quantitative yet.↩︎

  4. Indeed this linearity is how we knew that the constant term of E is S_0.↩︎

  5. \tan x \approx x for small x so it is not a surprise that 1 / \omega is approximately equal to t_0↩︎

  6. the angle between the subsolar point and the equator↩︎

  7. The diurnal cycle in solar heating is only poorly approximated by a sine wave; the “effective” daily range should be somewhat larger, but not enough so to change any conclusions.↩︎

  8. Note that the differential equation we’ve been using so far can only produce a phase shift of up to a quarter phase. The reason that temperatures underground can be phase shifted by more than that is that they do not directly interact with the surface, but rather interact via soil at an intermediate depth. Each “layer” of soil is slightly phase shifted and slightly damped relative to the layer above it, so these phase shifts can add up arbitrarily high at depth.↩︎

  9. Additive constants are missing because they considered temperature anomalies T' rather than temperatures T. Note that only a coupling between air and space is explicitly given; there is no explicit insolation term, so the equations as given are only capable of relaxing towards equilibrium. In their numerical simulations, though, they were forced with a step function change in insolation, which is not explicitly shown in the equation. Since their forcing function was a step function instead of sinusoidal, their results were transients instead of periodic.↩︎

  10. Side note: because \omega_{ij} depends on 1 / C_i, where C_i is the heat capacity of the ith component, in general \omega_{ij} \neq \omega_{ji}.↩︎

  11. Going even further afield, the yet more distant parts of the Earth have a response timescale even greater than a year, and so are not important to either seasonal or diurnal variations.↩︎

  12. This fact is how real-world greenhouses work, which is totally unrelated to the greenhouse effect. A greenhouse prevents the air near the ground from rising and mixing with the air above, causing hot air to be trapped near the surface. It has nothing to do with blocking infrared radiation, as can be demonstrated by placing small vents in the roof and sides of a greenhouse, which causes it to cool to ambient temperatures.↩︎

  13. Before the industrial revolution, the natural level of carbon dioxide was already at roughly 280 ppm, at the high end of the natural cycle.↩︎

  14. Or more specifically, the average of the temperatures of the various parts of the Earth and atmosphere, weighted according to what proportion of the emissions to space come from that part.↩︎

  15. Although note that this simplified explanation ignores the ozone layer, where temperatures actually increase with height. Ozone absorbs ultraviolet radiation, which comes only from the Sun, and not from the Earth, so this causes the reverse behavior of temperature increasing with height.↩︎

  16. Our model of part 4 only considered two wavelengths of light, shortwave and longwave. However the radiative transfer model, LBLRTM, simulated 15 million different wavelengths, which was smoothed to about 1000 wavelengths in the figure – without this smoothing the graph would have been so spiky as to be totally unreadable. Numerous other details we ignored in our simple model were properly simulated by LBLRTM. The atmospheric composition used is the US Atmospheric Standard of 1976, which defined a carbon dioxide concentration of 314 ppm, far below the current value of 415 ppm as of 2022.↩︎

  17. It is thought that Venus entered a water vapor feedback process that was not self-limiting, and just grew forever in a runaway greenhouse effect until its oceans boiled away entirely. It is also expected that Earth will eventually enter a similar runaway greenhouse effect in about one billion years, and that artificial climate change will not be able to trigger this early.↩︎

  18. Specifically the Permian-Triassic extinction event of 252 million years ago and the Paleocene-Eocene Thermal Maximum of 55 million years ago.↩︎

  19. We mean “the surface of the Earth” when we say “the Earth”, as the interior of the Earth only very slowly exchanges heat with the surface, so it can be ignored.↩︎

  20. All energy exchanged with the Sun or with space is in the form of light, but some of the energy exchanged between the Earth and the atmosphere is in other forms. In particular, hot water molecules that physically move from the surface into the air bring a large amount of energy with them, called latent heat. Heat conduction plays a lesser role. We also omit geothermal heating, which is energy flowing from the interior of the Earth to the surface. This is estimated to be 47 TW, or 0.092 watts per square meter.↩︎

  21. We briefly remark on the arrows that are absent from the diagram. The most interesting omission is the arrow from the Sun to the atmosphere; we have already commented on that. A tiny fraction of the light emitted to space goes on to strike the Sun or other bodies, but we are uninterested in where exactly it goes once it leaves the Earth. Space is filled with cosmic microwave background radiation, so there should be arrows representing microwave light from space to each of the other objects, but the amount is so tiny as to be totally insignificant – only 1.6 GW of it reaches the Earth, or 3 microwatts per square meter. Finally, the Sun emits a tremendous amount of light into space that does not strike the Earth, but we are not interested in that.↩︎

  22. Whenever we say the “average” temperature of a (spherical) object in the context of blackbody radiation, we mean the fourth root of the arithmetic mean of the fourth power of the surface temperature, weighted by surface area and emissivity. That is, we use exactly the average that makes the Stefan-Boltzmann law work with the result. For objects like the Earth, where the temperature does not vary tremendously from one location to another, this average is close to the ordinary arithmetic mean. For tidally-locked or slowly rotating objects like Mercury or the Moon, the distinction can be very important.↩︎

  23. In fact, direct measurements of light going in and out of the Earth agree with each other up to the accuracy with which they can be measured.↩︎

  24. Hazy conditions can have an anti-greenhouse-like effect, although the mechanism is not exactly the same; for example, major volcanic eruptions cool the Earth for a few years by putting sulfur aerosols in the stratosphere.↩︎

  25. One major exception is phase changes like ice melting or liquid water evaporating; another of course is black holes, as mentioned earlier.↩︎

  26. Model error refers to details of the real-world system which are omitted in the mathematical model. Regardless of how detailed and precise the model is, for a model to be useful there will always be some further detail that is missing.↩︎

  27. Of course, if the Earth had no atmosphere it would have a significantly different albedo, among many other major differences.↩︎

  28. While the surface area of the Earth is 4 \pi R^2 and half of that is exposed to sunlight at any time, the amount of sunlight a location receives depends on the angle the Sun is above the horizon, and the average illuminated location receives half as much light as it would receive under direct, full sunlight.↩︎

  29. We take albedo \alpha = 0.3, insolation S = 1366 \text{ W m}^{-2}, and Stefan-Boltzmann constant \sigma = 5.67 \cdot 10^{-8} \text{ W m}^{-2} \text{K}^{-4}. As per a previous note, we use an emissivity \epsilon = 1. With a more realistic \epsilon = 0.96, we get T_e = 257 K.↩︎

  30. Like the gas planets, the Sun does not have a solid surface, but instead gradually becomes denser and more opaque closer to the center. The “surface” is defined somewhat arbitrarily in terms of a certain level of opacity. The exact temperature at this depth would be difficult to measure, but is likely very close to the effective temperature.↩︎

  31. Rather than the typical arithmetic mean, the fourth root of the arithmetic mean of the fourth powers is the suitable average. This is always warmer than the arithmetic mean, although not significantly so for the Earth.↩︎

  32. Instead of wavelength, photons are sometimes described by their frequency or energy. These are related by E = h \nu = h c / \lambda, where E is the energy of the photon, \nu is the frequency, \lambda is the wavelength, h = 6.626 \cdot 10^{-34} \text{J}\cdot\text{s} is Planck’s constant and c = 3 \cdot 10^8 \text{m}/\text{s} is the speed of light. For consistency we will only use wavelength.↩︎

  33. The interior of the microwave oven is surrounded on all sides by metal, forming a Faraday cage from which microwaves cannot escape.↩︎

  34. Except that very strong x-rays shone directly into the eye can appear faintly blue; this was discovered in 1895 before the dangers of x-rays were known.↩︎

  35. A micron is one millionth of a meter. “Micron” is short for “micrometer”, which is also written 1 μm. A micron is about a hundred times smaller than the thickness of a sheet of paper, and is a bit smaller than the typical bacterium. We will usually use microns to describe wavelengths of light.↩︎

  36. The first person to study blackbodies was Gustav Kirchhoff in 1860, who was unable to determine the formula for blackbody radiation but called it “a problem of the highest importance”; this proved true when the discovery of the formula led to the discovery of the photon by Max Planck and Albert Einstein around 1905, for which Einstein received the Nobel Prize.↩︎

  37. With a very few exceptions; for example, black holes warm up when they lose energy.↩︎

  38. Since black holes become colder when they absorb energy, and they start colder than their surroundings, they just get even colder over time until they approach absolute zero; for example the black hole in the center of the Milky Way, called Sagittarius A^*, is approximately 1.7 \cdot 10^{-14} K. The rest of the universe is currently about 2.7 K, so Sgr A^* is gaining energy from its surroundings and continuing to cool down. Eventually the rest of the universe will cool down until it is even colder than Sgr A^*, so Sgr A^* will start losing energy over time and warm up until it ultimately explodes.↩︎

  39. Real-world objects actually emit slightly less light than indicated by this formula; the proper formula is A \epsilon \sigma T^4 where \epsilon is the emissivity of the object, a number between 0 and 1 that depends on the material the object is made out of and the wavelength of light that we are interested in. Most objects in daily life have an emissivity around 0.9 to 1 in infrared wavelengths. The surface of the Earth has an average emissivity of about 0.96 in infrared wavelengths. For simplicity we take \epsilon = 1 from now on.↩︎

  40. Discovered in 1900, and directly leading to Einstein’s prediction of the photon.↩︎

  41. The reason why the last bit of the Antarctic ozone layer was not destroyed is because the upper-most part of the Antarctic stratosphere does not have the right conditions for destroying ozone, and the stratosphere does not mix well.↩︎

  42. While the worldwide banning of leaded gasoline and the banning of leaded paint in the US and EU has greatly decreased the amount of lead in the environment, exposure to environmental lead continues to kill 140 000 people every year and contribute to 600 000 new cases of intellectual disability in children annually.↩︎

  43. This is called the Great Oxygenation Event, which may have involved a “snowball Earth” mostly or entirely covered in ice.↩︎

  44. Higher pitches require denser air to be transmitted; if the time between molecular collisions is longer than the period of the pressure wave, then the high and low pressures will simply be averaged out. The highest transmissible pitch drops by an octave roughly every 5 km of altitude.↩︎

  45. Source, which has more information on the role of generalship.↩︎

  46. Each RGB value is a single byte from 0 to 255, and the conversion Y’UV <-> RGB is strongly nonlinear. Y’UV is optimized for human perception, which does not map very neatly to distinct RGB components.↩︎

  47. There is a subtlety here that I did not notice at the time: does it get warmer in summer because the excess sunlight causes more warming to happen each day than cooling happens in the night, or does it get warmer because the days get longer and therefore the equilibrium temperature of each day is higher than the previous? This is similar to asking on what time scale the Earth system returns to equilibrium – yearly, or daily. The (incomplete) answer is that the former effect is arguably dominant, as can be seen by observing that the hottest day of the year is usually about 1 to 2 months after summer solstice, and thus the days continue to get hotter even when they are getting shorter. More specifically, if we approximate the Earth’s temperature at a point as an exponential decay to equilibrium, subject to a sinusoidal forcing, then we find that the temperature is sinusoidal with a phase offset from the forcing such that the tangent of the phase offset equals the ratio of the decay timescale to the forcing period. Taking the phase offset to be 1.5 months of a 12 month year, the ratio is \tan(2 \pi / 8) = 1, so the time scale for the Earth to reach equilibrium is about 1 forcing period, i.e., 1 year.↩︎

  48. As an example, here is a lesson plan designed by NASA in which students perform an experiment where greenhouse gases are “simulated” using plastic wrap – ironically how greenhouses work, but having nothing to do with the greenhouse effect. Here is an experiment for high schoolers in which students measure the temperature of bags with and without carbon dioxide (which is generated in situ by an exothermic chemical reaction). Neither of these plans contain any information related to the greenhouse effect.↩︎

Follow RSS/Atom feed or twitter for updates.