The phrase “exponential growth” is familiar to most people, and yet human intuition has
a hard time really recognizing what it means.
We can anchor on a sequence of small-seeming numbers, then become surprised when suddenly
those numbers look big, even if the overall trend follows an exponential perfectly consistently.
This right here is the data for recorded cases of COVID-19, aka the Coronavirus, outside
mainland China, at least as of the time I’m writing this.
Never one to waste an opportunity for a math lesson, I thought this might be a good time
for us all to go back to the basics on what exponential growth is, where it comes from,
what it implies, and maybe most pressingly, how to know when it’s coming to an end.
Exponential growth means that going from one day to the next involves multiplying by some constant.
In our data, the number of cases each day tends to be between 1.15 and 1.25 times the
number of cases the previous day.
Viruses are a textbook example of this kind of growth because what causes new cases are
the existing cases.
If the number of cases on a given day is N, and we say each individual with the virus
is, on average, exposed to E people on a given day, and each exposure has a probability p
of becoming an infection, the number of new cases each day is E*p*N. The fact that N itself
is a part of this is what really makes things go fast, because as N gets big, the rate at which it
grows also gets big.
One way to think of this is that as you add on these new cases to get the next day’s
count, you can factor out the N, so it’s just the same as multiplying by some constant
bigger than 1.
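As a rough sketch of that factoring step, here is the daily update written out in Python. The function name and the specific values of E and p are made up for illustration; the only thing taken from the discussion above is that each day adds E*p*N new cases.

```python
def next_day(n_cases, exposures_per_day, infection_prob):
    """Add E*p*N new cases to N, which is the same as multiplying N by (1 + E*p)."""
    new_cases = exposures_per_day * infection_prob * n_cases
    return n_cases + new_cases


# Hypothetical values, chosen only so that E*p = 0.2
E, p = 4, 0.05

counts = [100.0]
for _ in range(10):
    counts.append(next_day(counts[-1], E, p))

# Ratio of each day's count to the previous day's count
daily_ratios = [later / earlier for earlier, later in zip(counts, counts[1:])]
print(daily_ratios)
```

Every entry of daily_ratios comes out equal to 1 + E*p, up to floating-point rounding, which is the constant bigger than 1.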
This is sometimes easier to see if we put the y-axis on a logarithmic scale, meaning
each step of a fixed distance corresponds to multiplying by a certain factor; in this
case, each step is another power of 10.
On this scale, exponential growth looks like a straight line.
With our data, it took 20 days to go from 100 to 1,000, and 13 days to go from that
to 10,000, and by doing a linear regression to find the best fit line, you can look at
the slope of that line to say it tends to multiply by 10 every 16 days on average.
This regression also lets us be more quantitative about how close the exponential fit really
is, and to use the technical jargon here, the answer is that it’s really freaking close.
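To make the regression idea concrete, here is a minimal sketch that fits a line to the base-10 logarithm of a case series and reads off how many days a tenfold increase takes. The series is synthetic (100 cases growing 15% per day), not the recorded data, and fit_line is a hypothetical helper written out by hand so that nothing beyond the standard library is needed.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares: return slope, intercept, and R^2 of the best-fit line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Synthetic series, NOT real case counts: 100 cases growing 15% per day
days = list(range(40))
cases = [100 * 1.15 ** d for d in days]

# On a log scale an exponential is a straight line, so fit log10(cases) against days
slope, intercept, r_squared = fit_line(days, [math.log10(c) for c in cases])

# The slope is in powers of 10 per day, so its reciprocal is days per factor of 10
days_per_tenfold = 1 / slope
print(days_per_tenfold, r_squared)
```

For this made-up series the slope works out to roughly 16.5 days per factor of 10, and R^2 is essentially 1 because the data is a perfect exponential. Real data would be noisier, and R^2 is one way to put a number on how close the fit is.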
It can be hard to digest what this really means, if true.
If you see one country with 6,000 cases, while another has 60, it’s easy to think the second
is doing 100 times better, and hence doing fine.
But if you’re in a situation where numbers multiply by 10 every 16 days, another way
to view the same fact is that the second country is about a month behind the first.
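The "about a month" figure is just a logarithm in disguise. Here is a small sketch of that arithmetic, assuming both countries follow the same trend of multiplying by 10 every 16 days; days_behind is a hypothetical helper name.

```python
import math


def days_behind(cases_ahead, cases_behind, days_per_tenfold):
    """Days the smaller count needs to reach the larger one on a shared exponential trend."""
    return days_per_tenfold * math.log10(cases_ahead / cases_behind)


lag = days_behind(6000, 60, 16)
print(lag)
```

A factor of 100 is two powers of 10, and at 16 days per power of 10 that comes to 32 days.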
This is, of course, rather worrying if you draw out the line.
I’m recording this on March 6th, and if the present trend continues, it would mean
hitting 1M cases in 30 days (April 5th), hitting 10M in 47 days (April 22nd), 100M in 64 days
(May 9th), and 1 billion in 81 days (May 26th).
Needless to say, though, you can’t draw out a line like this forever. It clearly must
start slowing down at some point, but the crucial question is when.
Is it like the SARS outbreak of 2002, which capped out at about 8,000 cases, or more like the
Spanish Flu of 1918, which ultimately infected about 27% of the world’s population?
In general, just drawing a line through your data is not a great way to make predictions,
but remember that there’s an actual reason to expect an exponential here.
If the number of new cases each day is proportional to the number of existing cases, it means
each day you multiply by some constant, so moving forward d days is the same as multiplying
by that constant d times.
It is inevitable, though, that this factor in front of N eventually decreases.
Even in the most perfectly pernicious model for a virus, which would be where every day,
each person with the virus is exposed to a random subset of the world’s population,
at some point most of the people they’re exposed to will already be sick, and so can’t
become new cases.
In our equation, this means the probability of infection should include some factor to
account for the probability that a person you’re exposed to isn’t already infected,
which for a random exposure model would be (1 - the proportion of people in the world
who are infected).
When you include a factor like that and solve for how N grows, you get what’s known as
a logistic curve, which is essentially indistinguishable from an exponential at the beginning, but
ultimately levels off as it approaches the total population size, as you’d expect.
True exponentials essentially never exist in the real world; they’re all the beginnings
of logistic curves.
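Here is a minimal day-by-day simulation of that random-exposure model, where the daily growth term picks up a factor of (1 - N/population). The function name, the rate of 0.2, and the population of one million are all assumptions for illustration. It is a discrete sketch of the idea, not a fitted epidemiological model.

```python
def simulate_logistic(n0, rate, population, days):
    """Each day add rate * (1 - N/population) * N new cases to the running total N."""
    counts = [float(n0)]
    for _ in range(days):
        n = counts[-1]
        # (1 - n / population) is the chance that an exposed person is not yet infected
        counts.append(n + rate * (1 - n / population) * n)
    return counts


# Hypothetical parameters, for illustration only
population = 1_000_000
counts = simulate_logistic(n0=10, rate=0.2, population=population, days=200)

early_ratio = counts[1] / counts[0]        # close to 1 + rate while N is tiny
final_fraction = counts[-1] / population   # approaches 1 as the curve levels off
print(early_ratio, final_fraction)
```

Early on, the day-over-day ratio is almost exactly 1 + rate, just like a pure exponential. By the end of the run nearly the whole population is infected and growth has stalled.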
The point where this curve goes from curving up to instead curving down is known as the inflection point.
At that point, the number of new cases each day, represented by the slope of this curve,
is roughly constant, and will soon start decreasing.
So one number that people will often follow with epidemics is the “growth factor”,
which is defined as the ratio between the number of new cases one day and the number of new
cases the previous day.
So, just to be clear, if you were looking at the totals from one day to the next, then
tracking the changes between these totals, the growth factor is the ratio between two
successive changes.
While you’re growing exponentially, this factor will stay consistently above 1, whereas
seeing a growth factor around 1 is a sign you’ve hit the inflection.
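To pin down that definition, here is a short sketch that computes growth factors from a list of daily totals. The totals are made up, and growth_factors is a hypothetical helper; it assumes no day has zero new cases, since that would mean dividing by zero.

```python
def growth_factors(totals):
    """Ratio of each day's new cases to the previous day's new cases."""
    # Changes between successive totals are the new cases per day
    new_cases = [later - earlier for earlier, later in zip(totals, totals[1:])]
    # Growth factor is the ratio between two successive changes
    return [later / earlier for earlier, later in zip(new_cases, new_cases[1:])]


# Made-up daily totals, NOT real data
totals = [100, 120, 144, 168, 192]
factors = growth_factors(totals)
print(factors)
```

Here the new cases per day are 20, 24, 24, 24, so the growth factor starts at 1.2 and then sits at 1, even though the totals keep climbing the whole time.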
This can make for another counterintuitive fact while following the data.
Think about what it would look like for the number of new cases one day to be about 15%
more than the number of new cases the previous day, and contrast that with what it would
feel like for it to be about the same.
Just looking at the totals, they really don’t feel that different, but if the growth factor
is 1, it could mean you’re at the inflection point of a logistic, which means the total
number of cases will max out around 2 times wherever you are now.
But a growth factor bigger than 1 means you’re on the exponential part, which could imply
orders of magnitude of growth still lie ahead of you.
While in the worst case this saturation point would be the total population, it’s of course
not true that people with the virus are randomly shuffled around the world’s population like
this; people are clustered in communities.
But when you run simulations where there’s even a little bit of travel between clusters
like these, the growth is not actually much different.
What you end up with is a kind of fractal pattern, where communities themselves function like individuals.
Each one has some exposure to others, with some probability of spreading the infection,
so the same underlying exponential-inducing laws apply.
Fortunately, saturating the whole population is not the only thing that causes the growth
factor to slow.
The amount of exposure goes down when people stop gathering and traveling, and the infection
rate goes down when people wash their hands more.
The other thing that’s counterintuitive about exponential growth is how sensitive
it is to this constant.
For example, if the daily growth rate is 15%, and we’re at 21,000 cases now, that means 61 days from
now it’s over 100 million.
But if through a bit less exposure and infection the rate drops to 5%, it doesn’t mean the projection
drops by a factor of 3; it actually drops to around 400,000.
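That comparison is quick to check. Here is a small sketch, assuming a fixed daily growth rate compounding for 61 days from 21,000 cases; project is a hypothetical helper name.

```python
def project(current_cases, daily_growth_rate, days):
    """Compound a constant daily growth rate over the given number of days."""
    return current_cases * (1 + daily_growth_rate) ** days


at_15_percent = project(21_000, 0.15, 61)
at_5_percent = project(21_000, 0.05, 61)
print(round(at_15_percent), round(at_5_percent))
```

The first comes out a bit over 100 million and the second a bit over 400,000, so cutting the rate by a factor of 3 cuts the projection by a factor of a few hundred.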
So if people are sufficiently worried, there’s much less to worry about, but if no one is
worried, that’s when you should worry.