Last time, we left off with an important question.
Should we continue manually programming rules to identify fingers,
or should we try something else?
To help us decide, let’s pick back up on our history of artificial intelligence.
We left off with the first generation of AI researchers of the 1950s and 60s
making some wildly optimistic predictions.
When these predictions failed to materialize in the early 1970s,
AI funding was dramatically cut in the US and elsewhere,
leading to what is now called the First AI Winter.
Entire areas of research, such as neural networks, dried up for nearly a decade.
Fortunately for us, AI came screaming back in the 1980s,
thanks to a newfound ability to make piles of money instead of losing them.
To make these piles of money, the efforts of AI were sharply focused
onto narrow and profitable subproblems.
One of the earliest of these problems belonged to the Digital Equipment Corporation.
DEC’s hot new line of VAX computers was complicated.
So complicated, in fact, that you needed an expert just to buy one.
Since DEC sold every cable, connection, piece of hardware and software separately,
customers would often end up with a bunch of computer parts that wouldn’t work together.
When training salespeople to help customers arrive at valid configurations proved challenging,
DEC turned to Carnegie Mellon Professor John McDermott for help.
McDermott built and delivered a system that dramatically and effectively solved DEC’s problem
and that, five years later, would be saving DEC forty million dollars a year.
So, how did McDermott do it?
How did he harness the power of artificial intelligence to ensure
DEC customers received the exact correct combination of chips, modules, software and cables?
McDermott did something that sounds simple: he asked the experts.
DEC already had experts that knew how to solve these problems,
so McDermott rolled up his sleeves and began turning the experts’ knowledge into code.
This process, dubbed “knowledge engineering,” proved tedious.
Especially when DEC’s own experts disagreed with one another over ideal configurations.
But McDermott persisted, and the first commercially successful expert system, R1, was born.
Word of R1’s success quickly spread, and artificial intelligence was back in vogue.
The subsequent growth was explosive.
By 1985, 150 companies were collectively spending in excess of 1 billion dollars annually on internal AI groups.
Demand for AI experts quickly outgrew supply,
academic conferences were overwhelmed by business types,
and many professors answered the call of venture capital and spun off their own AI companies.
And, like all periods of economic boom fueled by dramatically overhyped advances in technology,
the expert systems of the 1980s continued to grow indefinitely without any problems!
The market for expert systems and the companies that sold them evaporated in the late 80s.
So, what went wrong here?
What made the market for expert systems disappear?
Well, the economics here were largely controlled by perceptions, which are finicky.
But there were very real weaknesses in the expert systems approach
that became more and more apparent as the decade advanced.
One weakness was very practical:
huge systems of rules are brittle.
In our example, we already have 30 rules,
and it’s easy to imagine this number growing tenfold after considering more finger orientations,
sizes and shapes.
The issue now is, what if something changes?
What if we want to recognize other objects?
We would have to completely start over.
Piles of hand-coded rules are ridiculously complex to maintain and update.
Rewrites just aren’t feasible.
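The brittleness argument can be made concrete with a toy sketch. This is not from the original discussion: the function name, the region description, and every threshold below are invented for illustration. The point is that each rule encodes one hand-picked assumption, and a new case, like a sideways finger, invalidates those assumptions rather than just adding to them.

```python
def looks_like_finger(width, height):
    """Hand-coded rules for an upright finger: a tall, thin region.

    Both arguments are the pixel dimensions of a candidate region.
    All thresholds here are guesses a knowledge engineer might make.
    """
    if width <= 0 or height <= 0:
        return False
    aspect = height / width
    if aspect < 2.5:     # rule 1: fingers are elongated
        return False
    if width > 60:       # rule 2: fingers are narrow
        return False
    if height > 400:     # rule 3: not taller than a hand in frame
        return False
    return True

# Works for the cases the rules were written for:
print(looks_like_finger(20, 120))   # upright finger-like region -> True

# But a sideways finger (wide and short) fails the elongation rule,
# so supporting it means rewriting the rules, not adding one more:
print(looks_like_finger(120, 20))   # -> False
```

Every new orientation, size, or object multiplies the rules, and the rules interact, which is why these systems grow brittle as they grow large.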
By 1987, DEC’s R1 system had over 10,000 rules and over a person-century of time invested.
While maintaining huge systems of handwritten rules is certainly a daunting task,
a deeper problem with the expert system approach also emerged in the 1980s.
This problem had everything to do with the types of problems expert systems are able to solve.
Expert systems excelled at a specific kind of problem:
the type of abstract reasoning problems we humans consider difficult.
Things like configuring computer systems, solving logic problems,
playing chess and inferring the structure of chemicals.
Tasks that require experts.
Early AI researchers naturally assumed that, since these tasks were hard for humans,
they would also be hard for machines.
And the tasks that were easier for humans, like recognizing a face, would also be easier for machines.
Remarkably, this assumption turned out to be completely backwards.
As the linguist and cognitive scientist Steven Pinker writes,
“the main lesson of thirty-five years of AI research is that the hard problems are easy
and the easy problems are hard.”
This phenomenon, known today as Moravec’s paradox,
missed by generations of brilliant scientists,
is a result of misunderstanding just what intelligence is.
Language, reasoning and abstract thought are certainly central to being human.
But that doesn’t mean that these skills represent the most sophisticated functions of our brain.
There are many key processes in our brain, such as vision,
that happen too quickly and too far below the surface of consciousness
for us to understand by simply thinking about how we think.
What this means for our problem is that no matter how much time we spend writing code
describing what we think fingers look like,
our code will never be as good at recognizing fingers as our visual cortex is.
Simply turning knowledge into code, knowledge engineering,
assumes that we know what’s going on between our ears.
And we simply don’t, when it comes to many of the problems we solve on a day-to-day basis.
So, what are we to do, then?
How are we going to program computers to think if we don’t even understand how we do it ourselves?
Fortunately, throughout the cycles of AI boom and bust, groups of researchers were quietly working on a solution.
An alternative approach that allows us to sidestep many of the problems of knowledge engineering.
An approach that has been incredibly successful in the last couple decades
and today is arguably responsible for its very own economic bubble.
Next time, we’ll see why this alternative approach is so effective.