The phrase “exponential growth” is familiar to most people, and yet human intuition has

a hard time really recognizing what it means sometimes.

We can anchor on a sequence of small-seeming numbers, then become surprised when suddenly

those numbers look big, even if the overall trend follows an exponential perfectly consistently.

This right here is the data for recorded cases of COVID-19, aka the Coronavirus, outside

mainland China, at least as of the time I’m writing this.

Never one to waste an opportunity for a math lesson, I thought this might be a good time

for us all to go back to the basics on what exponential growth is, where it comes from,

what it implies, and maybe most pressingly, how to know when it’s coming to an end.

Exponential growth means that as you go from one day to the next, the count gets multiplied by

some constant.

In our data, the number of cases each day tends to be between 1.15 and 1.25 times the

number of cases the previous day.

Viruses are a textbook example of this kind of growth because what causes new cases are

the existing cases.

If the number of cases on a given day is N, and we say each individual with the virus

is, on average, exposed to E people on a given day, and each exposure has a probability p

of becoming an infection, the number of new cases each day is E*p*N. The fact that N itself

is a part of this is what really makes things go fast because as N gets big, the rate it

grows also gets big.

One way to think of this is that as you add on these new cases to get the next day’s

count, you can factor out the N, so it’s just the same as multiplying by some constant

bigger than 1.
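To make that factoring step concrete, here is a minimal sketch in Python. The values E = 3 and p = 0.05 are made up for illustration (they are not estimates from the data), and the function names are my own. It adds E*p*N new cases each day and checks that the result matches multiplying by (1 + E*p) once per day.

```python
def next_day(n, exposures, p_infect):
    """One day of growth: existing cases plus E * p * N new cases."""
    return n + exposures * p_infect * n


def simulate(n0, exposures, p_infect, days):
    """Apply the daily update repeatedly and return the final count."""
    n = n0
    for _ in range(days):
        n = next_day(n, exposures, p_infect)
    return n


# Hypothetical values: E = 3 exposures per day and p = 0.05,
# so the daily multiplier is 1 + E * p = 1.15.
E, p = 3, 0.05
step_by_step = simulate(100, E, p, 10)
closed_form = 100 * (1 + E * p) ** 10
print(step_by_step, closed_form)
```

The two printed numbers should agree up to floating-point rounding, which is all that "factoring out the N" is saying.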

This is sometimes easier to see if we put the y-axis on a logarithmic scale, meaning

each step of a fixed distance corresponds to multiplying by a certain factor; in this

case, each step is another power of 10.

On this scale, exponential growth looks like a straight line.

With our data, it took 20 days to go from 100 to 1,000, and 13 days to go from that

to 10,000, and by doing a linear regression to find the best fit line, you can look at

the slope of that line to say it tends to multiply by 10 every 16 days on average.
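As a rough sketch of that regression, the Python below fits a least-squares line to log10(cases) against the day number. The slope is the number of powers of 10 gained per day, so its reciprocal is the number of days per tenfold increase. The data here is synthetic, generated to grow by exactly 10x every 16 days; it is not the real case data, and the function name is my own.

```python
import math


def fit_log_line(days, cases):
    """Least-squares fit of log10(cases) against day.

    Returns (days per tenfold increase, R squared of the fit).
    """
    logs = [math.log10(c) for c in cases]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    sxx = sum((x - mean_x) ** 2 for x in days)
    syy = sum((y - mean_y) ** 2 for y in logs)
    slope = sxy / sxx                  # powers of 10 gained per day
    r_squared = sxy ** 2 / (sxx * syy)  # 1.0 means a perfect straight line
    return 1 / slope, r_squared


# Synthetic data (an assumption for illustration): exact 10x growth every 16 days.
days = list(range(0, 40, 4))
cases = [100 * 10 ** (d / 16) for d in days]
tenfold_days, r_squared = fit_log_line(days, cases)
print(tenfold_days, r_squared)
```

With real data the R squared value would land somewhere below 1, and how far below is one way to quantify how close the exponential fit is.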

This regression also lets us be more quantitative about how close the exponential fit really

is, and to use the technical jargon here, the answer is that it’s really freaking

close.

It can be hard to digest what this really means, if true.

If you see one country with 6,000 cases, while another has 60, it’s easy to think the second

is doing 100 times better, and hence doing fine.

But if you’re in a situation where numbers multiply by 10 every 16 days, another way

to view the same fact is that the second country is about a month behind the first.
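The "month behind" figure comes from a one-line calculation, sketched here in Python. It assumes both countries follow the same trend of multiplying by 10 every 16 days; the function name is hypothetical.

```python
import math


def days_behind(cases_ahead, cases_behind, days_per_tenfold=16):
    """Days needed for the smaller count to reach the larger one,
    assuming both follow the same 10x-every-N-days trend."""
    return days_per_tenfold * math.log10(cases_ahead / cases_behind)


lag = days_behind(6000, 60)
print(lag)
```

A factor of 100 is two factors of 10, so the lag is two 16-day periods, or about 32 days.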

This is, of course, rather worrying if you draw out the line.

I’m recording this on March 6th, and if the present trend continues, it would mean

hitting 1M cases in 30 days (April 5th), hitting 10M in 47 days (April 22nd), 100M in 64 days

(May 9th), and 1 billion in 81 days (May 26th).
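If you want to check the calendar arithmetic, here is a small sketch that converts those day counts into dates. The year 2020 is my assumption, since only "March 6th" is stated; the day offsets are the ones quoted above.

```python
from datetime import date, timedelta

# Assumption: the year is 2020; the text only says March 6th.
start = date(2020, 3, 6)

# Day offsets as quoted in the text.
offsets = {"1M": 30, "10M": 47, "100M": 64, "1B": 81}

projected = {label: start + timedelta(days=d) for label, d in offsets.items()}
for label, day in projected.items():
    print(label, day.strftime("%B %d"))
```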

Needless to say, though, you can’t draw out a line like this forever; it clearly must

start slowing down at some point, but the crucial question is when.

Is it like the SARS outbreak of 2002, which capped out at about 8,000 cases, or more like the

Spanish Flu in 1918, which ultimately infected about 27% of the world’s population?

In general, just drawing a line through your data is not a great way to make predictions,

but remember that there’s an actual reason to expect an exponential here.

If the number of new cases each day is proportional to the number of existing cases, it means

each day you multiply by some constant, so moving forward d days is the same as multiplying

by that constant d times.

It is inevitable, though, that this factor in front of N eventually decreases.

Even in the most perfectly pernicious model for a virus, which would be where every day,

each person with the virus is exposed to a random subset of the world’s population,

at some point most of the people they’re exposed to will already be sick, and so can’t

become new cases.

In our equation, this means the probability of infection should include some factor to

account for the probability that a person you’re exposed to isn’t already infected,

which for a random exposure model would be (1 - the proportion of people in the world

who are infected).

When you include a factor like that and solve for how N grows, you get what’s known as

a logistic curve, which is essentially indistinguishable from an exponential at the beginning, but

ultimately levels off as it approaches the total population size, as you’d expect.

True exponentials essentially never exist in the real world; they’re all the beginnings

of logistic curves.
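Here is a minimal day-by-day sketch of that modified rule in Python, where the new cases are E*p*(1 - N/population)*N. The parameters (E*p = 0.15, a population of one million, 100 starting cases) are invented for illustration, and this discrete simulation only approximates the smooth logistic curve.

```python
def simulate_logistic(n0, growth, population, days):
    """Daily totals where new cases = growth * (1 - N / population) * N."""
    totals = [n0]
    for _ in range(days):
        n = totals[-1]
        new_cases = growth * (1 - n / population) * n
        totals.append(n + new_cases)
    return totals


# Hypothetical parameters: E * p = 0.15 and a population of one million.
totals = simulate_logistic(n0=100, growth=0.15, population=1_000_000, days=200)

early_ratio = totals[10] / totals[9]  # day-over-day multiplier early on
final_total = totals[-1]              # where the curve ends up
print(early_ratio, final_total)
```

Early on the day-over-day multiplier is nearly 1.15, just like the pure exponential, while the final total sits right up against the population size.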

The point where this curve goes from curving up to instead curving down is known as the

“inflection point”.

At that point, the number of new cases each day, represented by the slope of this curve,

is roughly constant, and will soon start decreasing.

So one number that people will often follow with epidemics is the “growth factor”,

which is defined as the ratio between the number of new cases one day, and the number of new

cases the previous day.

So, just to be clear, if you were looking at the totals from one day to the next, then

tracking the changes between these totals, the growth factor is the ratio between two

successive changes.

While you’re growing exponentially, this factor will stay consistently above 1, whereas

seeing a growth factor around 1 is a sign you’ve hit the inflection.
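In code, the growth factor is a ratio of differences. This sketch takes a list of cumulative totals (the numbers below are made up, not real data) and returns the growth factor for each pair of successive days.

```python
def growth_factors(totals):
    """Ratios between successive daily changes in cumulative totals.

    Assumes every daily change is positive, so no division by zero.
    """
    changes = [b - a for a, b in zip(totals, totals[1:])]
    return [later / earlier for earlier, later in zip(changes, changes[1:])]


# Made-up totals: the daily changes are 20, 24, 24, 12.
totals = [100, 120, 144, 168, 180]
factors = growth_factors(totals)
print(factors)
```

For these totals the factor goes from above 1, to exactly 1, to below 1, which is the pattern you would see passing through an inflection point.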

This can make for another counterintuitive fact while following the data.

Think about what it would look like for the number of new cases one day to be about 15%

more than the number of new cases the previous day, and contrast that with what it would

feel like for it to be about the same.

Just looking at the totals, they really don’t feel that different, but if the growth factor

is 1, it could mean you’re at the inflection point of a logistic, which means the total

number of cases will max out around 2 times wherever you are now.

But a growth factor bigger than 1 means you’re on the exponential part, which could imply

orders of magnitude of growth still lie ahead of you.
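The "2 times wherever you are now" claim can be checked numerically. The sketch below simulates a logistic curve with invented parameters (15% daily growth, a population of one million), finds the day with the most new cases, and compares the total on that day with the eventual total. It is a discrete approximation, so expect a ratio near 2 rather than exactly 2.

```python
def logistic_totals(n0, rate, population, days):
    """Daily totals with new cases = rate * (1 - N / population) * N."""
    totals = [n0]
    for _ in range(days):
        n = totals[-1]
        totals.append(n + rate * (1 - n / population) * n)
    return totals


# Hypothetical parameters, chosen only for illustration.
totals = logistic_totals(100, 0.15, 1_000_000, 300)
new_cases = [b - a for a, b in zip(totals, totals[1:])]

peak_day = new_cases.index(max(new_cases))            # approximate inflection point
factor_at_peak = new_cases[peak_day] / new_cases[peak_day - 1]
ratio = totals[-1] / totals[peak_day]                 # final total vs. total at inflection
print(peak_day, factor_at_peak, ratio)
```

At the peak, the growth factor is very close to 1 and the final total is roughly double the total on that day.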

While in the worst case this saturation point would be the total population, it’s of course

not true that people with the virus are randomly shuffled around the world’s population like

this; people are clustered in communities.

But when you run simulations where there’s even a little bit of travel between clusters

like these, the growth is not actually much different.

What you end up with is a kind of fractal pattern, where communities themselves function

like individuals.

Each one has some exposure to others, with some probability of spreading the infection,

so the same underlying exponential-inducing laws apply.

Fortunately, saturating the whole population is not the only thing that causes the growth

factor to go down.

The amount of exposure goes down when people stop gathering and traveling, and the infection

rate goes down when people wash their hands more.

The other thing that’s counterintuitive about exponential growth is how sensitive

it is to this constant.

For example, if the daily growth rate is 15%, and we’re at 21,000 cases now, that means 61 days from

now it’s over 100 million.

But if through a bit less exposure and infection it drops to 5%, it doesn’t mean the projection

drops by a factor of 3, it actually drops to around 400,000.
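Those two projections are just compound growth, which this short sketch reproduces. It assumes the rate stays fixed for all 61 days, which is the whole premise of drawing out the line; the function name is my own.

```python
def project(current, daily_rate, days):
    """Compound growth: multiply by (1 + daily_rate) once per day."""
    return current * (1 + daily_rate) ** days


at_15_percent = project(21_000, 0.15, 61)
at_5_percent = project(21_000, 0.05, 61)
print(round(at_15_percent), round(at_5_percent))
print(at_15_percent / at_5_percent)  # how many times smaller the second projection is
```

Cutting the rate to a third shrinks the 61-day projection by a factor in the hundreds, not by a factor of 3.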

So if people are sufficiently worried, there’s much less to worry about, but if no one is

worried, that’s when you should worry.