Last video I left you with a puzzle. The setup involves two sliding blocks in a perfectly

idealized world where there’s no friction, and all collisions are perfectly elastic,

meaning no energy is lost. One block is sent towards another smaller one, which starts

off stationary, and there’s a wall behind it so that the small one bounces back and

forth until it redirects the big block’s momentum enough to outpace it away from the

wall.

If that first block has a mass which is some power of 100 times the mass of the second,

for example 1,000,000 times as much, an insanely surprising fact popped out: The total number

of collisions, including those between the second mass and the wall, has the same starting

digits as pi. In this example, that’s 3,141 collisions.

If it was one trillion times the mass, it would take 3,141,592 collisions before this

happens, almost all of which happen in one huge burst.

Speaking of unexpected bursts, in the short time since that video lots of people have

shared solutions, attempts, and simulations, which is awesome. See the description for

some of my favorites. So why does this happen?! Why should pi show up in such an unexpected

place, and in such an unexpected manner?

First and foremost this is a lesson about using a phase space, also commonly called

a configuration space, to solve problems. So rest assured that you’re not just learning

about an esoteric algorithm for pi, the tactic here is core to many other fields.

To start, when one block hits another, how do you figure out the velocity of each

one after the collision? The key is to use the conservation of energy, and the conservation

of momentum. Let’s call their masses m1 and m2, and their velocities v1 and v2, which

will be variables changing throughout the process.

At any given moment, the total kinetic energy is (½)m1(v1)^2 + (½)m2(v2)^2. Even though

v1 and v2 will change as the blocks get bumped around, the value of this expression must

remain constant. The total momentum of the two blocks is m1*v1 + m2*v2. This also remains

constant when the blocks hit each other, but it can change as the second block bounces

off the wall. In reality, that second block would transfer its momentum to the wall during

this collision. Again we’re being idealistic, say thinking of the wall as having infinite

mass, so such a momentum transfer won’t actually move the wall.
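
If you want to see those two conservation laws in action, here's a minimal sketch in Python. It uses the standard closed-form result for a one-dimensional elastic collision, which is what you get by solving the energy and momentum equations together. The function names and the sample numbers are made up for this illustration, not something from the video.

```python
def collide_blocks(m1, m2, v1, v2):
    """Velocities (v1, v2) after a perfectly elastic 1D collision.

    Closed-form solution of the energy and momentum equations.
    """
    total_mass = m1 + m2
    new_v1 = ((m1 - m2) * v1 + 2 * m2 * v2) / total_mass
    new_v2 = ((m2 - m1) * v2 + 2 * m1 * v1) / total_mass
    return new_v1, new_v2


def kinetic_energy(m1, m2, v1, v2):
    """Total kinetic energy of the two blocks."""
    return 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2


def momentum(m1, m2, v1, v2):
    """Total momentum of the two blocks."""
    return m1 * v1 + m2 * v2


if __name__ == "__main__":
    # Hypothetical example: big block 100x heavier, sliding left at speed 1.
    m1, m2, v1, v2 = 100.0, 1.0, -1.0, 0.0
    new_v1, new_v2 = collide_blocks(m1, m2, v1, v2)
    print("before:", kinetic_energy(m1, m2, v1, v2), momentum(m1, m2, v1, v2))
    print("after: ", kinetic_energy(m1, m2, new_v1, new_v2), momentum(m1, m2, new_v1, new_v2))
```

Both printed pairs should agree up to floating-point rounding, since a block-on-block collision changes neither quantity.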

So we’ve got two equations and two unknowns. To put these to use, try drawing a picture

to represent the equations.

You might start by focusing on this energy equation. Since v1 and v2 are changing, maybe

you think to represent this equation on a coordinate plane where the x-coordinate represents

v1, and the y-coordinate represents v2. So individual points on this plane encode the

pair of velocities of our blocks. In that case, the energy equation represents an ellipse,

where each point on this ellipse gives you a pair of velocities, and all points of this

ellipse correspond to the same total kinetic energy.

In fact, let’s actually change our coordinates a little to make this a perfect circle, since

we know we’re on a hunt for pi. Instead of having the x-coordinate represent v1, let

it be sqrt(m1)*v1, which for the example shown stretches our figure in the x-direction by

sqrt(10). Likewise, have the y-coordinate represent sqrt(m2)*v2. That way, when you

look at this conservation of energy equation, it’s saying ½(x^2 + y^2) = (some constant),

which is the equation for a circle. Which specific circle depends on the total energy.

At the beginning, when the first block is sliding to the left and the second one is

stationary, we are at this leftmost point on the circle, where the x-coordinate is negative

and the y-coordinate is 0. What about after the collision, how do we know what happens?

Conservation of energy tells us we must jump to some other point on this circle, but which

one?

Well, use the conservation of momentum! This tells us that before and after a collision,

the value m1*v1 + m2*v2 must stay constant. In our rescaled coordinates, that looks like

saying sqrt(m1)*x + sqrt(m2)*y = (some constant), which is the equation for a line with slope

-sqrt(m1/m2). Which specific line depends on what that constant momentum is. But we

know it must pass through our first point, which locks us into place.
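
Written out, the change of coordinates and the two conservation laws look like this. The symbols E and p are just labels I'm introducing here for the constant energy and the constant momentum.

```latex
\begin{align*}
x &= \sqrt{m_1}\,v_1, \qquad y = \sqrt{m_2}\,v_2 \\
\tfrac{1}{2} m_1 v_1^2 + \tfrac{1}{2} m_2 v_2^2 &= \tfrac{1}{2}\left(x^2 + y^2\right) = E \\
m_1 v_1 + m_2 v_2 &= \sqrt{m_1}\,x + \sqrt{m_2}\,y = p \\
\Longrightarrow \quad y &= -\sqrt{\frac{m_1}{m_2}}\,x + \frac{p}{\sqrt{m_2}}
\end{align*}
```

The second line is the circle, and the last line is the momentum line with its slope of -sqrt(m1/m2).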

Just to be clear what all this is saying: All other pairs of velocities which would

give the same momentum live on this line, just as all other pairs of velocities which

give the same energy live on our circle. So notice, this gives us one and only one other

point that we could jump to. And it should make sense that it’s something where the

x-coordinate gets a little less negative and the y-coordinate is negative, since that corresponds

to our big block slowing down a little while the little block zooms off towards the wall.

When the second block bounces off the wall, its speed stays the same, but its velocity goes from

negative to positive. In the diagram, this corresponds to reflecting about the x-axis,

since the y-coordinate gets multiplied by -1. Then again, the next collision corresponds

to a jump along a line of slope -sqrt(m1 / m2), since staying on such a line is what conservation

of momentum looks like in this diagram.

This gives us a very satisfying picture of how we hop around on our picture, where you

keep going until the velocity of that smaller block is both positive, and smaller than the

velocity of the big one, meaning they’ll never touch again. That corresponds to this

region of the diagram, so in our process, we keep bouncing until we land in that region.
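
This hopping process can be sketched as a tiny simulation that only tracks velocities. It's a rough sketch under a few assumptions of mine: the wall is on the left, leftward velocities are negative, the small block has mass 1, and the function name `count_collisions` is made up for the example.

```python
def count_collisions(mass_ratio):
    """Count every collision (block-block and block-wall) for a given m1/m2."""
    m1, m2 = float(mass_ratio), 1.0
    v1, v2 = -1.0, 0.0  # big block slides left toward the stationary small block
    count = 0
    while True:
        if v1 < v2:
            # The big block is closing in on the small one: elastic collision.
            v1, v2 = (
                ((m1 - m2) * v1 + 2 * m2 * v2) / (m1 + m2),
                ((m2 - m1) * v2 + 2 * m1 * v1) / (m1 + m2),
            )
            count += 1
        elif v2 < 0:
            # The small block is heading for the wall: its velocity flips sign.
            v2 = -v2
            count += 1
        else:
            # Both move right and the small block is no faster: the end zone.
            break
    return count


if __name__ == "__main__":
    for ratio in (1, 100, 10_000, 1_000_000):
        print(ratio, count_collisions(ratio))
```

Tracking positions isn't needed here, because the two kinds of collision are forced to alternate: after the blocks hit each other, the small one can only hit the wall next, and vice versa.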

What we’ve drawn here is called a “phase diagram”, which is a simple but powerful

idea in math where you encode the state of some system, in this case the velocities of

our sliding blocks, as a single point in some abstract space. What’s powerful here is

that it turns questions about dynamics into questions about geometry. In this case, the

dynamical idea of all pairs of velocities that conserve energy corresponds to the geometric

object of a circle, and counting the total number of collisions turns into counting the

number of hops along these lines, alternating between vertical and diagonal.

Specifically, why is it that when the mass ratio is a power of 100, that number of steps

shows the digits of pi?

Well, if you stare at this picture, maybe, just maybe, you might notice that all the

arc-lengths between the points of this circle seem to be about the same. It’s not immediately

obvious that this should be true, but if it is, it means that computing the value of that

one arc length should be enough to figure out how many collisions it takes to get around

the circle to the end zone.

The key here is to use the ever-helpful inscribed angle theorem, which says that whenever you

form an angle using three points on a circle P1, P2 and P3 like this, it will be exactly

half the angle formed by P1, the circle’s center, and P3. P2 can be anywhere on this

circle, except in that arc between P1 and P3, and this fact will be true.

So now look at our phase space, and focus specifically on three points like these. Remember

this first vertical hop corresponds to the small block bouncing off the wall, and the

second hop along a slope of -sqrt(m1 / m2) corresponds to a momentum-conserving block

collision. Let’s call the angle between this momentum line and the vertical “theta”.

Then using the inscribed angle theorem, the arc length between these bottom two points,

measured in radians, will be 2*theta. Notice, since this momentum line has the same slope

for all of those jumps from the top of the circle to the bottom, the same reasoning means

all of these arcs must also be 2*theta.
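
If you'd rather check the equal-arcs claim numerically than trust the picture, here's a small sketch. It records where we land on the circle after each block-on-block collision and compares that angle with pi + 2*k*theta. The function name `hop_angles` and the choice of a 100 : 1 ratio are my own assumptions for the example.

```python
import math


def hop_angles(m1, m2, hops=5):
    """Return theta and the polar angle of the phase-space point
    (sqrt(m1)*v1, sqrt(m2)*v2) right after each block-block collision."""
    theta = math.atan(math.sqrt(m2 / m1))  # angle between momentum line and vertical
    v1, v2 = -1.0, 0.0                     # start at the leftmost point of the circle
    angles = []
    for _ in range(hops):
        # Block-block collision: jump along the momentum line.
        v1, v2 = (
            ((m1 - m2) * v1 + 2 * m2 * v2) / (m1 + m2),
            ((m2 - m1) * v2 + 2 * m1 * v1) / (m1 + m2),
        )
        x, y = math.sqrt(m1) * v1, math.sqrt(m2) * v2
        angles.append(math.atan2(y, x) % (2 * math.pi))
        # Wall bounce: reflect about the x-axis before the next collision.
        v2 = -v2
    return theta, angles


if __name__ == "__main__":
    theta, angles = hop_angles(100.0, 1.0)
    for k, angle in enumerate(angles, start=1):
        print(k, angle, math.pi + 2 * k * theta)
```

Each pair of printed angles should match up to rounding, which is the inscribed angle theorem doing its job. Keep `hops` small compared to the total number of collisions, since the sketch doesn't check for the end zone.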

So for each hop, if we drop down a new arc, like so, then after each collision we cover

another 2*theta radians of the circle. We stop once we’re in this endzone, corresponding

to both blocks moving to the right, with the smaller one going slower. But you can also

think of this as stopping at the point when adding another arc of 2*theta would overlap

with a previous one.

In other words, how many times do you have to add 2*theta to itself before it covers

more than 2*pi radians? The answer to this is the same as the number of collisions between

our blocks.

Or, simplifying things a little, what’s the largest integer multiple of theta that

doesn’t surpass pi?

For example, if theta was 0.01 radians, then multiplying by 314 would put you a little

less than pi, but multiplying by 315 would bring you over that value. So the answer would

be 314, meaning if our mass ratio were one such that the angle theta in our diagram was

0.01, the blocks would collide 314 times.

In fact, let’s go ahead and compute theta, say when the mass ratio is 100 : 1. Remember

that the rise-over-run slope of this constant momentum line is -sqrt(m1/m2), which in this

example is -10. That would mean the tangent of this angle theta, opposite over adjacent,

is that run over the negative rise, which is 1/10 in this example. So theta = arctan(1/10).

In general, it’ll be the inverse tangent of the square root of the small mass over

the square root of the big mass.

If you go and plug these into a calculator, you’ll notice that the arctan of each such

small value is quite close to the value itself. For example, arctan(1/100), corresponding

to a big mass of 10,000 kilograms, is extremely close to 0.01.

In fact, it’s so close that for the sake of our central question, it might as well

be 0.01. That is, analogous to what we saw a moment ago, adding this to itself 314 times

won’t surpass pi, but the 315th time would. Remember, unraveling why we’re doing this,

that’s a way of counting how many jumps on the phase diagram it takes to get to the end

zone, which is a way of counting how many times the blocks collide until they’re sailing

off never to touch again. So that’s why a mass ratio of 10,000 gives 314 collisions.

Likewise a mass ratio of 1,000,000 to 1 will give an angle of arctan(1/1,000) in our diagram.

This is extremely close to 0.001. And again, if we ask about the largest integer multiple

of this theta that doesn’t surpass pi, it’s the same as it would be for the precise value

of 0.001: 3,141. These are the first four digits of pi, because that is by definition

what the digits of pi mean. This explains why with a mass ratio of 1,000,000, the number

of collisions is 3,141.
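
The counting rule from the last few paragraphs fits in a couple of lines of Python. This is a sketch rather than a definitive formula: the name `predicted_collisions` is made up, and I'm assuming that in the edge case where a multiple of theta lands exactly on pi, that multiple doesn't count.

```python
import math


def predicted_collisions(m1, m2):
    """Count collisions from the angle theta alone, with no simulation."""
    # theta is the angle between the momentum line and the vertical.
    theta = math.atan(math.sqrt(m2 / m1))
    # Largest integer k with k * theta strictly below pi.
    return math.ceil(math.pi / theta) - 1


if __name__ == "__main__":
    for exponent in range(1, 4):
        ratio = 100 ** exponent
        print(ratio, predicted_collisions(ratio, 1))
```

For mass ratios of 100, 10,000 and 1,000,000 this gives 31, 314 and 3,141.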

All this relies on the hope that the arctan of a small value is sufficiently close to

the value itself, which is another way of saying that the tangent of a small value is

approximately that value. Intuitively, there’s a nice reason this is true. Looking at a unit

circle, the tangent of any given angle is the height of this little triangle divided

by its width. When that angle is really small, the width is basically 1, and the height is

basically the same as the arc length along the circle, which by definition is theta.

To be more precise about it, the Taylor series expansion of tan(theta) shows that this approximation

will only have a cubic error term. So for example, tan(1/100) differs from 1/100 by

something on the order of 1/1,000,000. So even if we consider 314 steps with this angle,

the error between the actual value of arctan(1/100) and the approximation of 0.01 won’t have

a chance to accumulate enough to be significant.
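
To put rough numbers on that, here is the arithmetic for the 10,000 : 1 case, using the standard Taylor series for tangent and arctangent. The decimals are rounded.

```latex
\begin{align*}
\tan\theta &= \theta + \tfrac{1}{3}\theta^3 + \tfrac{2}{15}\theta^5 + \cdots \\
\arctan x &= x - \tfrac{1}{3}x^3 + \tfrac{1}{5}x^5 - \cdots \\
\arctan(0.01) &\approx 0.01 - 3.33\times 10^{-7} \\
314 \cdot \arctan(0.01) &\approx 3.14 - 1.05\times 10^{-4} \approx 3.1399 < \pi \\
315 \cdot \arctan(0.01) &\approx 3.15 - 1.05\times 10^{-4} \approx 3.1499 > \pi
\end{align*}
```

The accumulated error of about 0.0001 is much smaller than the gap between 3.14 and pi, which is about 0.0016, so the count stays at 314.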

So, let’s zoom out and sum up: When blocks collide, you can figure out how their velocities

change by slicing a line through a circle in a velocity phase diagram, each curve representing

a conservation law. Most notably, the conservation of energy plants the circular seed that ultimately

blossoms into the pi we find in the final count.

Specifically, due to some inscribed angle geometry, the points we hit on this circle

are spaced out evenly, separated by the angle we were calling 2*theta. This lets us rephrase

the question of counting collisions as instead asking how many times we must add 2*theta

to itself before it surpasses 2pi.

If theta looks like 0.001, the answer to that question has the same first digits as pi.

And when the mass ratio is some power of 100, because arctan(x) is so well approximated

by x for small values, theta is sufficiently close to this value to give the same final

count.

I’ll emphasize again what this phase space allowed us to do, because this is a lesson

useful for all sorts of math, like differential equations, chaos theory, and other flavors

of dynamics: By representing the relevant state of your system as a single point in

an abstract space, it lets you translate problems of dynamics into problems of geometry.

I repeat myself because I don’t want you to come away just remembering a neat puzzle

where pi shows up unexpectedly, I want you to think of this surprise appearance as a

distilled remnant of the deeper relationship at play.

And if this solution leaves you feeling satisfied, it shouldn’t. Because there is another perspective,

more clever and pretty than this one, due to Galperin in the original paper on this

phenomenon, which invites us to draw a striking parallel between the dynamics of these blocks,

and that of a beam of light bouncing between two mirrors. Trust me, I’ve saved the best

for last on this topic, so I hope to see you again next video.