
- [Voiceover] In the last couple videos, I
showed how you can take a function,
just a function with two inputs,
and find the tangent plane to its graph,
and the way that you think about this,
you first find a point,
some kind of input point,
which I'll just write
abstractly as x nought and y nought.
And you see where that point ends up on the graph,
and you wanna find a new function,
a new function which we were calling L,
and maybe you say L sub f,
which also is a function of x and y.
And you want the graph of that function
to be a plane tangent to the graph.
Now this often goes by another name.
This will go under the name local linearization,
which is kind of a long word.
And what this basically means,
the word local means you're looking at
a specific input point.
So in this case,
it's a specific input point x nought, y nought,
and the idea of a linearization,
a linearization, means you're approximating
the function with something simpler,
with something that's actually linear,
and I'll tell you what I mean by linear
in just a moment.
But the whole idea here is that we don't really care about
tangent planes in an abstract 3D space
to some kind of graph.
The whole reason for doing this,
is that this is a really good way to approximate a function,
which is potentially a very complicated function
with something that's much easier,
something that has constant partial derivatives.
Now my goal in this video is gonna be to show
how we write this local linearization here
in vector form, because it'll be both more compact,
and hopefully easier to remember,
and also it's more general.
It'll apply to things that have more than
just two input variables like this one does.
So just to remind us of where we were,
and what we got to in the last couple videos,
I'll write a little bit more abstractly this time,
rather than a specific example.
The way you do this local linearization
is first you find the partial derivative
of f with respect to x,
which I'll write with the subscript notation.
And you evaluate that
at x nought, y nought.
You evaluate it at the point
about which you're approximating
and then you multiply that by x minus that constant.
So the only variable right here,
everything is a constant,
but the only variable part is that x.
And then we add to that,
basically doing the same thing with y.
You take the partial derivative with respect to y,
you evaluate it at the input point,
the point about which you are linearizing,
and then you multiply it by y minus y nought.
And then to this entire thing,
you add a constant,
because you wanna make sure that when
you evaluate this function at the input point itself,
it matches the original function.
You see, when you plug in x nought and y nought,
this term goes to zero,
cause x nought minus x nought is zero,
and this term goes to zero,
and this is the whole reason we kind of paired
up these terms and organized the constants in this way.
This way, you can just think about adding
whatever the function itself evaluates to at that point.
And this will ensure that your linearization
actually equals the function itself
at the local point.
Cause hopefully if you're approximating it near a point,
then at that point, it's actually equal.
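
Written out, the formula being built up here is

$$L_f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0)$$

where $(x_0, y_0)$ is the "x nought, y nought" point. Plugging in $(x_0, y_0)$ kills both variable terms, leaving just $f(x_0, y_0)$, the matching property just described.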
So what do I mean by this word linear?
The word linear has a very precise formulation,
especially in the context of linear algebra,
and admittedly, this is not actually
a linear function in the technical sense.
But loosely what it means,
and the reason people call it linear,
is that this x term here, this variable term,
doesn't have anything fancy going on with it.
It's just being multiplied by a constant,
and similarly this y term it's just being
multiplied by a constant.
It's not squared, there's no square root,
it's not in an exponent or anything like that.
And although there is a more technical meaning
of the word linear,
this is all it really needs in this context.
This is all you need to think about.
Each variable is just multiplied by a constant.
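
In symbols, that loose sense of "linear" just means the approximation has the shape

$$L(x, y) = a\,x + b\,y + c$$

for some constants $a$, $b$, $c$ (strictly speaking, an affine function, because of the constant $c$, which is why it isn't linear in the technical linear-algebra sense).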
Now you might see this in a more complicated form,
or what's at first a more complicated form using vectors.
So first of all, let's think about
how we would start describing
everything going on here with vectors.
So for the input,
rather than talk about the input as being a pair of numbers,
what I wanna say is that there's some vector
that has these as its components,
and we just wanna capture that all
and give that a name.
And kind of unfortunately, the name that we give this,
it's very common to just call it x,
a bold-faced x, which is easier
to do when typing than when writing,
so I'll just kind of try to emphasize bold-faced
x equals this vector.
That's a little confusing, cause x
is already one of the input variables, which is just a number,
but I'll try to emphasize it,
just making it bold.
You'll see this in writing a lot.
X is this input vector,
and then similarly, the specified
input about which we are approximating,
you would call, I'll make it a nice
bold-faced x nought.
We do that nought to just kind of indicate
that it's a constant of some kind.
And what that is,
it's a vector containing the two numbers
x nought, y nought.
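
As a quick summary of the notation so far:

$$\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad \mathbf{x}_0 = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}$$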
So this is just us starting to write things
in a more vectorized way
and the convenience here is that
if you're dealing with a function
with three input variables or four or a hundred,
you could still just write it as this
bold-faced x with the understanding
that the vector has a lot more components.
So now, let's take a look at
these first two terms in our linearization.
We can start thinking of this
as a dot product, actually.
So let me first just kind of move this guy
out of the way and give ourselves some room.
So he's gonna just go up there,
it's the same guy,
and now I wanna think about writing this other term here
as a dot product.
And what that looks like is we have
the two partial derivatives, f sub x
and f sub y, indicating the partial derivatives
with respect to x and y,
and each one of them is evaluated
at our bold-faced x nought,
and then this one is also evaluated
at that bold-faced x nought.
So really, you're thinking about this as being
a vector that contains two different variables;
you're just packing it into a single symbol.
And the dot product here
is against a vector whose first component
is x minus x nought,
so I'd write that as x minus x nought, the number,
and then similarly,
y minus y nought, the number.
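
Written out, the dot product being described is

$$\begin{bmatrix} f_x(\mathbf{x}_0) \\ f_y(\mathbf{x}_0) \end{bmatrix} \cdot \begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix} = f_x(\mathbf{x}_0)\,(x - x_0) + f_y(\mathbf{x}_0)\,(y - y_0)$$

which is exactly the pair of linear terms from before.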
But we can write each one of these
in a more compact form,
where this vector that has the partial derivatives,
that's the gradient, and if that feels unfamiliar,
maybe go back and check out the videos on the gradient,
but this whole vector is basically just saying,
take the gradient and evaluate it
at that vector input, bold-faced x nought.
And in the second component here,
that's telling you you've got
x and y minus x nought and y nought.
So what you're basically doing is taking
the bold-faced input,
the variable vector x,
and then you're subtracting off
bold-faced x nought, where x nought is some kind of constant.
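
In that compact notation, the two linear terms collapse to

$$\nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)$$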
So this right here,
this is just vector terms where you're thinking of this
as being a vector with two components,
and this one is a vector with two components,
but if your function happened to
be something more complicated,
with, say, a hundred input variables,
this would be the same thing you write down.
You would just understand that when you expand this,
there's gonna be a hundred different
components in the vector.
And this is what a linear term
looks like in vector terminology,
cause this dot product is telling you
that all of the components
of that bold-faced x vector,
the ones it expands into,
non-bold-faced x, y, z, whatever else it expands to,
all of those are just being multiplied
by some kind of constant.
So we take that whole thing,
that's how you simplify the first couple terms here,
and of course, we just add on
the value of the function itself.
So you would take that as the linear term.
Now, I kind of like to add it on to the front,
actually, where you think about taking
the function itself and evaluating it
at that constant input x nought,
cause that way you can kind of think
this is your constant term,
and then the rest of the stuff here is your linear term.
Cause later on if we start adding other terms
like a quadratic term
or more complicated things,
you can kind of keep adding them on the end.
So this right here
is the expression that you will often see
for the local linearization.
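
That is, with the constant term written out front:

$$L_f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)$$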
And the only place where the actual variable shows up,
the variable vector, is right here, in this guy.
Cause when you evaluate the function
f at a specified input, that's just a constant.
When you evaluate the gradient at that input,
it's just a constant.
And we're subtracting off that
specified input, which is just a constant.
So this is the only place where your variable shows up.
So once all is said and done,
and once you do your computations,
this is a very simple function.
And the important part is that this is much simpler
than the function f itself,
which allows you to
maybe compute something more quickly,
if you're writing a program that needs to
deal with some kind of complicated function
but runtime is an issue. Or maybe
it's a function that you never knew in the first place,
but you were able to approximate its value at a point,
and approximate its gradient.
So this is what lets you approximate
the function as a whole near that point.
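
As a concrete illustration of that use case, here's a minimal Python sketch. The particular function f and the point x0 here are made-up examples for illustration, not anything from the video:

```python
import numpy as np

def f(v):
    # A sample "complicated" function of two variables
    # (a hypothetical choice; the video keeps f abstract).
    x, y = v
    return np.sin(x) * np.exp(y)

def grad_f(v):
    # Gradient of the sample f above: (cos(x)e^y, sin(x)e^y).
    x, y = v
    return np.array([np.cos(x) * np.exp(y), np.sin(x) * np.exp(y)])

x0 = np.array([1.0, 0.5])  # the point we linearize about (made up)

def L(v):
    # Local linearization: f(x0) + grad f(x0) . (v - x0).
    # Everything except v is a constant, so this is cheap to evaluate.
    return f(x0) + grad_f(x0) @ (v - x0)

# Near x0, the cheap linear function closely matches f:
v = x0 + np.array([0.01, -0.02])
print(f(v), L(v))  # the two values agree to several decimal places
```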
So again, this might look very abstract,
but if you just kind of unravel everything
and think back to where it came from,
and look at the specific example
of a tangent plane,
hopefully it all makes a little bit of sense
and you see that this is really just the simplest
possible function that evaluates to the same value
as f when you input this point,
and whose partial derivatives all evaluate
to the same values as those of f
at that specified point.
And if you wanna see more examples of this,
and what it looks like and maybe how you can use it
to approximate certain functions,
I have an article on that, that you can go check out,
and it would be particularly good to kind of
go in with a piece of paper
and sort of work through the examples yourself
as you read through it.
And with that said,
I will see you next video.