This is a video I’ve been excited to make for a while. The story here braids together

prime numbers, complex numbers and pi in a very pleasing trio.

Quite often in modern math, especially that which flirts with the Riemann zeta function,

these three seemingly unrelated objects show up in unison, and I want to give you a peek

at one instance where this happens; one of the few that doesn’t require too heavy a

technical background. That’s not to say it’s easy –in fact

this is one of the most intricate videos I’ve ever done– but the culmination is worth

it. What we’ll end up with is a formula for

pi, a certain alternating infinite sum. This formula is actually written on the mug

I’m drinking coffee from as I write this, and a fun but almost certainly apocryphal

story is that the beauty of this formula is what inspired Leibniz to quit being a lawyer

to instead pursue math. Whenever you see pi show up in math, there’s

a circle hiding somewhere, sometimes very sneakily. So the goal here is not just to

discover this sum, but to really understand the circle hiding behind it.

You see, there’s another way to prove the same result that you and I will spend some

meaningful time building to with just a few lines of calculus. But it’s one of those

proofs that leaves you thinking “Okay, I suppose that’s true”, without a sense

for why, for where the hidden circle is. On path you and I will take, what you’ll

see is that fundamental truth behind this sum and the circle it hides is a certain regularity

in prime numbers behave within complex numbers.

To start the story, imagine yourself with nothing more than a pencil, some paper, and

a desire to find a formula for computing pi. There are countless ways to approach this,

but as a broad outline for the plotline hear, you’ll start by asking how many lattice

points of the plane sit inside a big circle. That question can lead asking about how to

express numbers as the sum of two squares, which in turn will lead to factoring integers

inside the complex plane. From there, bringing in a function named chi

will give us a formula for pi that at first seems to involve a crazy complicated pattern

dependent on the distribution of primes, but a slight shift in perspective will simplify

it dramatically and expose the ultimate gold nugget.

It’s a lot, but good math takes time, and we’ll take it step by step.

When I say “lattice point”, I mean a point (a, b) in the plane where a and b are both

integers, a point where grid lines cross. If you draw a circle centered at the origin,

say with radius 10, how many lattice points would you guess are inside that circle?

Well, there’s one lattice point for each unit of area, so the answer should be approximately

equal to the area of the circle, pi*R2, which in this case is pi(102). And for a really

big radius, like 1,000,000, you’d expect this to be more accurate, in the sense that

the percent error between the estimate pi*R2 and the actual count of lattice points should

get smaller. If you can find a second way to answer the

same question, how many lattice points are in this circle, it might lead you to another

way to express the area of a circle, and hence another way to express pi.

So you play! And you wonder. And maybe, especially if you just watched a certain calculus video,

you might try looking through every possible ring that a lattice point could sit on.

For each of those points, (a, b), its distance from the origin in the square root of a2+b2,

and since a and b are both integers, a2+b2 is also an integer, so you only have to consider

rings whose radii are the square roots of whole numbers.

A radius of 0 just gives that single origin point. A radius of 1 hits 4 lattice points…

radius sqrt(2) hits 4 lattice points as well...sqrt(3) doesn’t hit any lattice points...sqrt(4)

hits four again...a radius of sqrt(5) actually hits 8 lattice points...

So what you need is a systematic way to count how many lattice points are a given distance

away from the origin, and to tally them all up.

If you pause and try this for a moment, you’ll see that the pattern seems pretty chaotic,

which is a good sign that some very interesting math will come into play.

In fact, as you’ll see, this pattern is rooted in the distribution of primes.

For example, look at the ring with radius sqrt(25).

It hits (5, 0), since 52 + 02 = 25. It also hits (4, 3), since 42+32 = 25, and likewise

it hits (3, 4).... and (0, 5)... What’s really happening here is that you’re

counting how many pairs of integers (a, b) have the property that a2 + b2 = 25, and it

looks like there’s a total of 12. As another example, look at the ring with

radius sqrt(11). It doesn’t hit any lattice points, which corresponds with the fact that

you cannot find two integers whose squares add up to 11.

Now, many times in math when you see a question that has to do with the 2d plane, it can be

surprisingly fruitful to just ask what it looks like when you think of that plane as

the set of all complex numbers. So instead of thinking of this lattice point

as the pair of integer coordinates (3, 4), think of it as the single complex number 3

+ 4i. That way, another way to think of the sum

of the squares of its coordinates, 32+42, is to multiply this number by 3 - 4i.

This is called its “complex conjugate”, it’s what you get by reflecting over the

real axis, replacing i with -i. This might seem like a strange step if you

don’t have much of a history with complex numbers, but describing this distance as a

product can be unexpectedly useful. It’ll turn our question into a factoring problem,

which is ultimately why patterns among prime numbers come into play.

Algebraically, this relation is straight-forward enough to verify. You get a 32, then the 3*(-4i)

cancels with the 4i*3, then you have -(4i)2, which because i2 is -1 becomes +42.

This is also quite nice to see geometrically; and if the following feels unfamiliar, I do

have another video where I go into more detail about why complex multiplication looks the

way it does. The number 3+4i has a magnitude of 5, and

some angle off the horizontal. Multiplying by 3-4i rotates by that same angle in the

opposite direction, putting it on the positive real axis, then stretches by that same magnitude,

5, meaning you land on the number 25, the square of the magnitude.

The collection of all these lattice points a + bi, where a and b are integers, has a

special name: the “Gaussian integers”, named after Martin Sheen.

Geometrically, you will still be asking the same question, how many of these lattice points

(Gaussian integers) are a given distance away from the origin, like sqrt(25). But we’re

just phrasing it in a more algebraic way: How many gaussian integers have the property

that multiplying by their complex conjugate gives 25?

This might seem needlessly complex, but it’s the key to understanding the seemingly random

pattern for how many lattice points are a given distance from the origin.

To see why, we need to understand how numbers factor inside the Gaussian integers.

As a quick refresher, among the ordinary integers, every number can be factored as some unique

collection of prime numbers. For example, 2,250 can be factored as 2*32*53, and there

is no other collection of primes that also multiplies to 2,250.

Unless you just make some of the primes in this factorization negative.

So really, factorization in the integers is not perfectly unique, it’s almost unique,

with the exception that some of the factors might be multiplied by -1.

Analogy with primes Factoring works very similarly in Gaussian

integers. Some numbers, like 5, can be factored into

smaller Gaussian integers, in this case (2+i)(2-i). This Gaussian integer (2+i), cannot be factored

into anything smaller, so we call it a “Gaussian prime”.

Again, this factorization is almost unique. But this time, not only can you multiply each

of the factors by -1 to get a factorization that looks different, you can also be extra

sneaky by multiplying one factor by i, and another by -1. This gives a different way

to factor 5 into Gaussian primes. But other than the things you can get by multiplying

some of these factors by -1, i or -i, factorization within the Gaussian integers is unique.

If you can figure out how ordinary primes numbers factor inside the Gaussian integers,

it’ll tell you all all other integers factor, so here we pull in a crucial and surprising

fact. Primes which are 1 above a multiple of 4,

like 5, 13, 17, can be always factored into exactly 2 distinct gaussian primes.

This corresponds with the fact that rings with radii equal to the square root of one

of these primes always hit lattice points. In fact they always hit 8, as you’ll see

in a bit. Primes that are 3 above a multiple of 4, like

3, 7, 11 and so on, cannot be factored further in the Gaussian integers. Not only are they

primes in the integers, they are also Gaussian primes, unsplittable even when i is in the

picture. This corresponds with the fact that a ring

whose radius is the square root of one of these will never hit any lattice points.

This pattern is the regularity within primes that we’ll ultimately exploit. And in a

later video I might explain why on earth this is true; why a prime number’s remainder

when divided by 4 has anything to do with whether or not it factors inside the Gaussian

integers, and hence, whether it can be expressed as the sum of two squares. But here we’ll

take it as given. The prime number 2 is special, because it

does factor, as (1+i)(1-i), but these two gaussian primes are a 90o rotation away from

each other, so you can multiply one by i to get the other. And that fact will make us

want to treat 2 a little differently for where this is all going.

Remember, our goal here is to count how many lattice points are a given distance away from

the origin. Doing this systematically for all distances sqrt(N) can lead us to a formula

for pi. And counting the number of lattice points

with a given magnitude, like sqrt(25), is the same as asking how many Gaussian integers

have the property that when you multiply by its complex conjugate, you get 25.

So here’s the recipe for finding all Gaussian integer with this property.

First, factor 25, which in the ordinary integers is 52, but since 5 factors further as (2+i)(2-i),

25 breaks down into these four Gaussian primes. Next, organize these into two columns, with

conjugate pairs sitting right next to each other.

Now multiply what’s in each column, and you’ll come out with two Gaussian integers.

Because everything in the right is a conjugate to everything in the left, what comes out

will be a complex conjugate pair, which multiply to make 25.

Picking an arbitrary standard, let’s call that product from the left column the output

of our recipe. Notice, there are three choices for how to

divvy up the primes that can affect that output: Here, both copies of 2+i are in the left column,

and that left column product is 3+4i; you could also have only one copy of 2+i is in

the left column, in which case that product is 5; or both copies of 2+i are in the right

column, which will give an output of 3-4i. Those three possible outputs are all lattice

points on the circle with radius sqrt(25). So, why does this recipe not yet capture all

12 lattice points? Well, remember how I mentioned that a factorization

into gaussian primes can look different if you multiply some of them by i, -1 or -i?

If you write the factorization of 25 differently, maybe splitting up one of those 5’s as (-1+2i)(-1-2i),

it can affect our result. But the only effect that would have is to

multiply the total output by i, -1 or -i, so as a final step to our recipe, say you

have to make one of 4 choices: Multiply the product from the left column either by 1,

i, -1 or -i. The three lattice points we originally found,

can each be multiplied by i, -1, or -i, and that accounts for all 12 ways to construct

a gaussian integer whose product with its own conjugate is 25.

Extend to 53 The best way to see how this recipe counts

lattice points more generally is to just see more examples.

If instead we were looking at 125, which is 53, we would have had 4 choices for how to

divvy up primes conjugate pairs into the two columns, either including 0, 1, 2, or all

three copies of 2+i in that left column. Those four choices, multiplied by final four

choices of multiplying the product from the left column by 1, i, -1 or -i, would suggest

there are 16 lattice points a distance of sqrt(125) away from the origin.

And indeed, if you draw that circle and count, you’ll find that it hits exactly 16 lattice

points

If you introduce a factor like 3, which doesn’t break down into the product of two conjugate

Gaussian primes, it really mucks up the whole system.

When you’re divvying up primes between the two columns, there’s no way to split up

that 3. No matter where you put it, it leaves the columns imbalanced, and when you take

the product of all the numbers in each column you wouldn’t end up with a conjugate pair.

So for a number like 3*53, which is 375, there’s actually no lattice point you hit; no Gaussian

integer whose product with its own conjugate gives 375.

Extend to 32*53 But if you add a second factor of 3, then

you have an option. You can throw one 3 in the left column, and the other in the right.

Since 3 is its own complex conjugate, this keeps things balanced, in the sense that the

products of the left and right columns will be complex conjugates of each other.

But it doesn’t add any new options. There will still be a total of 4 choices for how

to divvy up those factors if 5, multiplied by the final 4 choices of multiplying by 1,

i, -1 or -i. So just like the sqrt(125) circle, this guy

also hits exactly 16 lattice points.

Let’s just sum up where we are. When you’re counting how many lattice points

lie on a circle with radius sqrt(N), the first step is to factor N.

For prime factors like 5, 13, and 17, which factor into a conjugate pair of Gaussian primes,

the number of choices you have is one more than the exponent that shows up with that

factor. For prime factors like 3, 7 and 11, which

are already Gaussian primes and can’t be split, if they show up with an even power,

you have one and only one choice for what to do with them. If it’s an odd exponent,

you’re screwed, you have zero choices. And no matter what, you have those 4 choices

at the end.

By the way, this process right here is, I think, the most complicated part of the video.

It took me a couple times to think through that yes, this is a valid way to count lattice

points, so don’t be shy if you want to pause and scribble things down to get a feel for

it. The one last thing to mention is how factors

of 2 affect the count.

If your number is even, the factor of 2 breaks down as (1+i)(1-i), so you can divvy up that

conjugate pair between the columns. At first it might look like this doubles your options,

depending on how you choose divvy these up between the columns.

However, since multiplying one of these gaussian primes by i gives you the other one, if you

swap them between the columns, the effect on the output from the left column is to multiply

it by i or -i. So this is redundant with that final step,

where we take the product of the left column and choose to multiply it either by 1, i,

-1 or -i. This means a factor of 2, or any power of

2, doesn’t actually change the count at all; it doesn’t hurt, it doesn’t help.

For example, a circle with radius sqrt(5) hits 8 points, and one with radius sqrt(10)

also hits 8 points, as does sqrt(20)... and sqrt(40). Factors of 2 just don’t make a

difference.

What’s about to happen is number theory at its best.

We have this complicated recipe telling us how many lattice points sit on a circle with

radius sqrt(N), and it depends on the prime factorization of N.

To turn it into something simpler, we’re going to exploit the regularity of prime numbers

that those which are 1 above a multiple of 4 split into distinct gaussian prime factors,

while those that are 3 above a multiple of 4 cannot be split.

Introduce chi To do this, we’ll bring in a simple function,

which I’ll label with the greek letter “chi”. For inputs 1 above a multiple of 4, the output

of chi is 1. For inputs 3 above a multiple of 4, it outputs -1. And for even numbers,

it gives 0. So if you evaluate chi on the natural numbers,

it gives this nice cyclic pattern 1, 0, -1, 0, and repeat.

Chi has a special property, it’s what’s called a “multiplicative” function. If

you evaluate it on two number and multiply, like chi(3)*chi(5), the result is the same

as chi evaluated on the product of those two numbers, in this case chi(15). Likewise, chi(5)*chi(5)

= chi(25)...and this works for any two numbers, go ahead an try it.

Show answer to counting question in terms of chi

So, for our central question of counting lattice points in this way that involves factoring

a number, I’m going to write the number of choices we have using chi in what at first

seems to be a more complicated way, but it has the benefit of treating all prime factors

the same way. For each prime power, like 53, write (chi(1)

+ chi(5) + chi(52) + chi(53)), adding up the value of chi on all the power of this prime

up to the one that shows up in the factorization. Since 5 is 1 above a multiple of 4, all of

these are just 1, so this sum comes out to 4, which reflects the fact that the factor

of 53 gives us 4 options for how to divvy up its two Gaussian prime factors between

the columns. For something like 34, this looks like (chi(1)

+ chi(3) + chi(32) + etc.). Since chi(3) is -1, this sum oscillates, 1 - 1 + 1, etc. If

it’s an even power, like 4 in this case, the sum comes out to 1, which encapsulates

the fact that there is only 1 choice for what to do with those unsplittable 3’s. If it’s

an odd power, that sum is 0, indicating that we have no choices.

For powers of two, this looks like (1 + 0 + 0 + etc.), indicating that with a factor

of 2, we always have 1 choice for what to do with it, it neither helps nor hurts the

lattice point cause. And as always, that 4 in front reflects the

final choice of multiplying the output by 1, i, -1 or -i.

We’re getting close to the culmination now, things are starting to look organized, so

take a moment, pause and ponder, make sure this all feels good.

Take the number 45 as an example, which factors as 325. The expression for the number of lattice

points is 4*(1 + chi(3) + chi(32))(1 + chi(5)), which is the same as 4*(1 choice for what

to do with the 3’s)*(2 choices for how to divvy up the Gaussian prime factors of 5).

It might seem like expanding out this sum is really complicated, because it involves

looking at all possible combinations of these prime factors.

But because chi is multiplicative, each one of these combinations corresponds to a divisor

of 45, so really this looks like 4*(chi(1) + chi(3) + chi(5) + chi(9) + chi(15) + chi(45)).

This covers every number which divides evenly into 45, once, and only once.

And it works like that for any number, there’s nothing special about 45.

That’s pretty interesting, and I think wholly unexpected. This question of counting the

number of lattice points a distance sqrt(N) away from the origin involves adding up the

value of this simple function over all divisors of N.

Now, remember why we’re doing all of this. The total number of lattice points inside

a big circle with radius R should be about pi*R2.

But on the other hand, we can count those same lattice points by looking through all

numbers N between 0 and R2 and counting how many lattice points are a distance sqrt(N)

from the origin. We’ll go ahead and ignore the origin dot,

with radius 0, since it doesn’t follow the pattern of the rest, and one little dot won’t

make a difference as we let R grow towards infinity.

From all of this Gaussian integer and factoring and chi function stuff we’ve been doing,

the answer for each N looks like adding up the value of chi on every divisor of N, and

multiplying by 4. Let’s just put that 4 in the corner for

now and remember to bring it back in a moment. Rearrange sum

At first, adding up the values of all these rows seems super random. Numbers with lots

of factors have lots of divisors, primes only have two divisors, and you might think that

you’d need perfect knowledge of the distribution of primes to get anything useful out of this.

But if instead you organize these into columns, the puzzle starts to fit together.

How many numbers between 1 and R2 have 1 as a divisors; well, all of them, so our sum

should include R2*chi(1) How many have 2 as a divisors. Well, about

half of them, so that accounts for (R2/2)*chi(2) in our total tally up.

About a third of the rows have a chi(3), so that’s (R2/3)*chi(3).

Keep going like this, and the total count of lattice points looks like R2*chi(1) +(R2/2)*chi(2)

+ (R2/3) * chi(3), and so on. Factoring out that R2, and bringing back the 4 that needs

to be multiplied in, this means the total number of lattice points in this big circle

is approximately 4R2(..this sum..). Since chi is 0 on all even numbers, and oscillates

between 1 and -1 for odd numbers, that sum looks like 1 - ⅓ + ⅕ - 1/7, and so on.

This is what we wanted! An alternative expression for the number of lattice points in a big

circle, which we know to be around pi*R2 The bigger R is, the more accurate both these

estimates are, so the error between these sides can be arbitrarily small.

Dividing by R2, this gives us an infinite sum which should converge to pi.

And the reason this sum is so simple, just oscillating back and forth between odd numbers,

stems from the regular pattern for how prime numbers factor inside the Gaussian integers.

If you’re curious, there are two main branches of number theory: Algebraic number theory,

and analytic number theory. Very loosely speaking, the former deals with

new number systems, like these Gaussian integers you and I looked at, while the latter deals

with things like the Riemann zeta function, or its cousins called “L” functions which

involve multiplicative functions like the central character chi from our story.

The path we just walked is a little glimpse of where those two fields intersect. Both

are pretty heavy-duty fields with lots of active research and unsolved problems. So

if this feels like something which will take time to mentally digest, like there’s more

patterns to be uncovered and understood, it’s because it is, and there are.

So, you are all demonstrably the kind of people who watch in-depth math videos in your free

time, and I know that some subportion of you are software engineers, or soon-to-be software

engineers, so before you go there’s a little recruiting I’d like to do.

This video is sponsored by Remix, which is a planning platform for public transit. They

enable cities to find the most efficient and cost-effective ways to serve the communities

and demographics they want. And, if you think about if for a moment, doing

this well can involve some seriously interesting optimization problems. And indeed, their looking

for mathematically oriented programmers who can tackle problems involving a variety of

optimization techniques so that, as one of their engineers phrased it to me, they can

“trick the universe into letting them create quality schedules.”

If you want to work with smart people on interesting problems, take a look at their site and careers

page, which I’ve linked to on the screen and in the description.