- [Voiceover] So here I'm gonna talk about the gradient.
And in this video, I'm only gonna describe
how you compute the gradient,
and in the next couple ones
I'm gonna give the geometric interpretation.
And I hate doing this,
I hate showing the computation
before the geometric intuition
since usually it should go the other way around,
but the gradient is one of those weird things
where the way that you compute it
actually seems kind of unrelated to the intuition
and you'll see that.
We'll connect them in the next few videos.
But to do that, we need to know
what both of them actually are.
So on the computation side of things,
let's say you have some sort of function.
And I'm just gonna make it a two-variable function.
And let's say it's f of x, y, equals x-squared sine of y.
The gradient is a way of packing together
all the partial derivative information of a function.
So let's just start by computing the partial derivatives
of this guy.
So partial of f with respect to x
is equal to,
so we look at this and we consider x the variable
and y the constant.
Well in that case sine of y is also a constant.
As far as x is concerned,
the derivative of x-squared is 2x
so we see that this will
be 2x times that constant, sine of y.
Whereas the partial derivative
with respect to y.
Now we look up here
and we say x is considered a constant
so x-squared is also considered a constant
so this is just a constant times sine of y,
so that's gonna equal
that same constant times the cosine of y,
which is the derivative of sine.
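Written out in symbols, the two partial derivatives just computed look like this. This is only a restatement of the spoken steps, using the usual curly-d notation for partials:

```latex
\begin{align*}
f(x, y) &= x^2 \sin(y) \\
\frac{\partial f}{\partial x} &= 2x \sin(y) && \text{($y$ held constant)} \\
\frac{\partial f}{\partial y} &= x^2 \cos(y) && \text{($x$ held constant)}
\end{align*}
```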
So now what the gradient does
is it just puts both of these together
in a vector.
And specifically, maybe I'll change colors here,
you denote it with a little upside-down triangle.
The name of that symbol is nabla,
but you often just pronounce it del,
you'd say del f or gradient of f.
And what this equals
is a vector
that has those two partial derivatives in it.
So the first one is the partial derivative
with respect to x,
2x times sine of y.
And the bottom one, partial derivative with respect to y,
x-squared cosine of y.
And notice, maybe I should emphasize,
this is actually a vector-valued function.
So maybe I'll give it a little bit more room here
and emphasize that it's got an x and a y.
This is a function that takes in
a point in two-dimensional space
and outputs a two-dimensional vector.
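If it helps to see "point in, vector out" concretely, here is a minimal Python sketch. The names f, grad_f and numeric_grad, the step size h, and the sample point are all illustrative choices, not anything from the video. The finite-difference function is only there as a rough sanity check on the hand-computed partials.

```python
import math

def f(x, y):
    """The example function: f(x, y) = x^2 * sin(y)."""
    return x**2 * math.sin(y)

def grad_f(x, y):
    """Gradient of f: both partial derivatives packed into one 2D vector."""
    df_dx = 2 * x * math.sin(y)   # y, and so sin(y), treated as a constant
    df_dy = x**2 * math.cos(y)    # x, and so x^2, treated as a constant
    return (df_dx, df_dy)

def numeric_grad(func, x, y, h=1e-6):
    """Central-difference estimate of the gradient (assumed check, not exact)."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (df_dx, df_dy)

point = (1.5, 0.7)  # arbitrary sample point in the xy-plane
print("by hand:  ", grad_f(*point))
print("numerical:", numeric_grad(f, *point))
```

The two printed vectors should agree to several decimal places, which is a quick way to catch a slipped partial derivative.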
So you could also imagine doing this
with three different variables.
Then you would have three partial derivatives,
and a three-dimensional output.
And the way you might write this more generally
is we could go down here and say the gradient
of any function
is equal to a vector with its partial derivatives.
Partial of f with respect to x,
and partial of f with respect to y.
And in some sense, we call these partial derivatives.
I like to think of the gradient as the full derivative
cuz it kind of captures all of the information
that you need.
So a very helpful mnemonic device
with the gradient is to think about this triangle,
this nabla symbol as being a vector
full of partial derivative operators.
And by operator, I just mean like partial
with respect to x,
something where you could give it a function,
and it gives you another function.
So you give this guy the function f
and it gives you this expression,
this multi-variable function as a result.
So the nabla symbol is this vector full
of different partial derivative operators.
And in this case it might just be two of them,
and this is kind of a weird thing
because it's like what,
this is a vector, it's got like operators in it,
that's not what I thought vectors do.
But you can kind of see where it's going.
It's really just,
you can think of it as a memory trick,
but in some sense it's a little bit deeper than that.
And really when you take this triangle
and you say ok let's take this triangle
and you can kind of imagine multiplying it by f,
really it's like an operator taking in this function
and it's gonna give you another function.
It's like you take this triangle and you put an f
in front of it, and you can imagine,
like this part gets multiplied, quote unquote
multiplied with f, this part gets quote unquote
multiplied with f but really you're just saying
you take the partial derivative with respect to x
and then with y, and on and on.
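As a sketch of that mnemonic in symbols, for the two-variable case and the example function from earlier: the "multiplication" here is just each operator being applied to f, not an actual product.

```latex
\[
\nabla =
\begin{bmatrix}
  \dfrac{\partial}{\partial x} \\[6pt]
  \dfrac{\partial}{\partial y}
\end{bmatrix},
\qquad
\nabla f =
\begin{bmatrix}
  \dfrac{\partial f}{\partial x} \\[6pt]
  \dfrac{\partial f}{\partial y}
\end{bmatrix}
=
\begin{bmatrix}
  2x \sin(y) \\[2pt]
  x^2 \cos(y)
\end{bmatrix}
\quad \text{for } f(x, y) = x^2 \sin(y).
\]
```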
And the reason for doing this,
this symbol comes up a lot in other contexts.
There are two other operators that you're gonna learn about
called the divergence and the curl.
We'll get to those later,
all in due time.
But it's useful to think about this
vector-ish thing of partial derivatives.
And I mean one weird thing about it,
you could say ok so this nabla symbol is a vector
of partial derivative operators.
What's its dimension?
And it's like how many dimensions do you got?
Because if you had a function of three variables
that would mean that you should treat this
like it's got three different operators as part of it.
And you know I'd kinda, finish this off down here,
and if you had something that was 100-dimensional
it would have 100 different operators in it
and that's fine.
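To make the "as many operators as inputs" idea concrete, here is a rough Python sketch that estimates a gradient for any number of variables, one partial derivative per input. The function names, the step size h, and both test functions are hypothetical examples chosen for illustration. It uses central differences, so the result is an approximation rather than an exact derivative.

```python
import math

def numeric_gradient(func, point, h=1e-6):
    """Estimate the gradient of func at point: one partial per coordinate."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h    # nudge only the i-th input, hold the rest constant
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2 * h))
    return grad

def g(p):
    """Hypothetical three-variable function: x^2 * sin(y) + z."""
    x, y, z = p
    return x**2 * math.sin(y) + z

def sum_of_squares(p):
    """Hypothetical n-variable function: x_1^2 + x_2^2 + ... + x_n^2."""
    return sum(v**2 for v in p)

print(len(numeric_gradient(g, [1.0, 2.0, 3.0])))              # 3
print(len(numeric_gradient(sum_of_squares, [1.0] * 100)))     # 100
```

Three inputs give a three-component gradient and 100 inputs give 100 components, matching the idea that the nabla symbol has whatever dimension the function's input has.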
It's really just again,
kind of a memory trick.
So with that, that's how you compute the gradient.
Not too much to it,
it's pretty much just partial derivatives,
but you smack em into a vector.
Where it gets fun and where it gets interesting
is with the geometric interpretation.
I'll get to that in the next couple videos.
It's also a super important tool
for something called the directional derivative.
So you've got a lot of fun stuff ahead.