KEVIN SMITH: Tool use is one of the defining characteristics
of an intelligent species like humans.
Every culture we know of uses objects
to shape the environment, culminating
in the tools and machines we use to build massive cities,
fly in planes, or communicate instantly across the world.
And while most of the tools we use
have been created and passed down from other people,
the spark of innovation to repurpose objects to serve
as tools exists in all of us.
We could easily see how to use a rock as a hammer
or put an old book under a table leg to stabilize it,
even if those aren't typical uses.
These capabilities come so easily to us
that we often forget how complex these behaviors are.
Despite their universality in people,
only a handful of other animals, just a few great apes, whales,
and birds, use objects in this way.
And we tend to think of these as some
of the most intelligent behaviors
that other species display.
For example, consider the problem this crow faces.
She wants to get the food floating
in the water in this tube.
But in order to reach it, she needs to raise the water level.
And for that, she needs to know that heavy objects can displace
water, and then what kind of objects
would be heavy enough to do this.
This reasoning process is impressively complex
and requires not just knowing how our actions will affect
the world, but also picking out the right actions
and in the right sequence to get the food.
KELSEY ALLEN: In this work, we set out
to better understand what makes people such capable tool users.
We designed the virtual tools game,
a novel task that requires creative physical problem
solving similar to how we think people might use a new tool.
In this game, people see a 2D scene with a goal,
like getting the red ball into the green area.
To accomplish this goal, people must choose one of three tools
from the side of the screen and then
place it somewhere in the level to launch,
stop, or even support other objects.
The catch is that people can only
use a single tool in a single place to solve the problem.
But they can try as many times as they want.
And this way, they can learn what happens after each attempt
and potentially update their plans accordingly.
We made a number of levels for the virtual tools game
which were designed to test different kinds
of physical concepts, such as indirectly launching a goal
object by hitting another object in the scene,
supporting an object, potentially preventing it
from falling, or even tipping and opening objects,
among many other different kinds of physical concepts.
We hypothesized that people need three critical capabilities
to solve these kinds of physical problems.
First, an object-oriented prior that
guides the initial actions to those
that will make a difference in the scene;
second, an ability to imagine the effects of their actions
before taking them; and third, a way
of rapidly updating those strategies when
their current attempts fail.
We captured these components in the Sample Simulate Update
model, or SSUP, as we call it for short.
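The three components described above (sample, simulate, update) can be sketched in code. This is a minimal, illustrative Python sketch and not the authors' implementation: the one-dimensional "level", the constants `TARGET`, `TOLERANCE`, and `OBJECTS`, the noisy `simulate` function, and the simple move-toward-the-best update rule are all assumptions made for illustration. It also mirrors the game's rule that each real attempt uses a single placement, with as many attempts as needed.

```python
import random
from dataclasses import dataclass

# All constants and names below are hypothetical, chosen for illustration.
TARGET = 0.62               # placement that solves the toy level
TOLERANCE = 0.03            # how close a placement must be to count as solved
OBJECTS = [0.2, 0.6, 0.9]   # object positions the prior is centered on


@dataclass
class Result:
    solved: bool
    attempts: int
    last_action: float


def clip(x):
    """Keep a placement inside the toy level, which spans [0, 1]."""
    return min(1.0, max(0.0, x))


def real_reward(x):
    """Toy 'real world': reward falls off linearly with distance from TARGET."""
    return max(0.0, 1.0 - abs(x - TARGET))


def simulate(x, rng, noise=0.05, runs=4):
    """Noisy internal simulation: average a few perturbed reward estimates."""
    return sum(real_reward(x) + rng.gauss(0, noise) for _ in range(runs)) / runs


def ssup_sketch(max_attempts=10, seed=0, threshold=0.95, max_sims=20):
    rng = random.Random(seed)
    mean, spread, rate = 0.5, 0.2, 0.5   # simple Gaussian policy over placements
    action = mean
    for attempt in range(1, max_attempts + 1):
        best_r = float("-inf")
        for sim in range(max_sims):
            # Sample: object-oriented prior at first, learned policy afterwards.
            if attempt == 1 and sim < 5:
                x = clip(rng.choice(OBJECTS) + rng.gauss(0, 0.1))
            else:
                x = clip(rng.gauss(mean, spread))
            # Simulate: imagine the outcome before acting.
            r = simulate(x, rng)
            # Update: pull the policy toward the best imagined placement so far.
            if r > best_r:
                best_r, action = r, x
                mean += rate * (x - mean)
            if best_r >= threshold:
                break
        # Act in the "real" level with the best placement found in imagination.
        if abs(action - TARGET) <= TOLERANCE:
            return Result(True, attempt, action)
        # Update from the real failure: shift toward the attempt, narrow the search.
        mean += rate * real_reward(action) * (action - mean)
        spread = max(0.02, spread * 0.7)
    return Result(False, max_attempts, action)
```

Calling `ssup_sketch(seed=1)` returns a `Result` recording whether the toy level was solved, how many real attempts were used, and the last placement tried. The point of the sketch is only the structure: most of the search happens in imagined simulations, and each real attempt is used to refine where to look next.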
KEVIN SMITH: If people's skill with tools
is based on these three capabilities,
then we would expect SSUP to behave
like our participants, which is, in fact, what we found.
When people typically solve a level in just a few attempts,
SSUP usually finds the answer quickly, too.
On levels that people find more difficult,
SSUP also takes longer to find a solution.
When we looked at individual levels,
we found that SSUP didn't just solve them
at similar rates to people, but also in similar ways.
The SSUP model often starts a level
by trying one of a variety of different actions
which match the distribution of ways
that people start that level out.
Similarly, the SSUP model predicts
that people should solve the level in particular ways, which
often describes the solutions that people do in fact find.
People's performance can't be explained by simple models.
Even deep networks, which have achieved strong performance
in other games like Atari, don't generalize
to new levels in the virtual tools game
if they weren't explicitly trained on them.
Instead, it seems like people have
a good intuitive understanding of how their actions will
affect the objects in each level that they then use
to solve the game generally.
KELSEY ALLEN: So what does this tell us
about human cognition and physical reasoning?
We suggest that people are such amazing tool users because
of their incredible ability to make
use of very rich, internal physical simulation engines
to rapidly update their beliefs about what kinds of actions
are likely to be successful.
If we want to build machines that are similarly
flexible in their physical reasoning,
they will require both more structured policies
and better physical models than what is currently the norm.
Looking towards the future, we plan
to use the virtual tools game as a platform
for studying other aspects of tool cognition,
both at a larger scale and across a much
broader range of scenarios than has been possible in the past.
This is an exciting step toward understanding
the computational and cognitive mechanisms that
have allowed the human species to develop tools ranging from those as simple
as a rock-based hammer to those as complex as an airplane.
JOSHUA TENENBAUM: I'm excited about this work
for many reasons.
But one of them is how it relates to the broader
context of current research in artificial and natural intelligence.
One of the AI developments that everybody's been hearing about,
and that has been very exciting in the last few years,
is the progress in reinforcement learning.
Systems that can learn from the mistakes
that they make capture the intuition that we know
is very important in human and animal learning of what we call
"trial and error learning."
But there is a very big difference between the way
trial and error learning works in today's
artificial reinforcement learning systems and what
you can see humans do, like in this task
that we've been studying here, this virtual tools game.
So today's AI systems in reinforcement learning,
they might learn from thousands, or millions,
or even billions of examples and experiences,
mistakes that they make.
And while that can be very powerful,
it also is just much slower than what we see in human trial and error
learning.
So we're really excited that here we
have a task where we can study experimentally
how humans learn from just one or a few mistakes
and can very quickly get better.
And we even have a computational model, the SSUP model,
that starts to capture that.
Looking forward, this might even lead
to advances in machine systems, artificial reinforcement
learning that can learn as quickly and as flexibly as