- [Voiceover] Okay, so we are finally ready

to express the quadratic approximation

of a multivariable function in vector form.

So, I have the whole thing written out here

where f is the function that we are trying to approximate,

x naught, y naught is the constant point

about which we are approximating,

and then this entire expression

is the quadratic approximation,

which I've talked about in past videos,

and if it seems very complicated or absurd,

or you're unfamiliar with it,

I'm just dissecting it real quick.

This over here is the Constant term,

this is just gonna evaluate to a constant,

everything over here is the Linear term,

because it just involves taking a variable

multiplied by a constant,

and then the remainder, every one of these components

will have two variables multiplied into it.

So x squared comes up, and x times y,

and y squared comes up, so that's the quadratic term.

Quadratic.
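
For reference, here is a sketch of the component form being dissected, reconstructed from the description above. The name Q_f for the approximation is an assumption; the transcript never names it on the page.

```latex
Q_f(x, y) = f(x_0, y_0)
  + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)
  + \tfrac{1}{2} f_{xx}(x_0, y_0)(x - x_0)^2
  + f_{xy}(x_0, y_0)(x - x_0)(y - y_0)
  + \tfrac{1}{2} f_{yy}(x_0, y_0)(y - y_0)^2
```

The first line is the Constant term, the second is the Linear term, and the last three lines make up the quadratic term.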

Now, to vectorize things, first of all,

let's write down the input, the input variable (x,y)

as a vector, and typically we'll do that

with a boldfaced x to indicate that it's a vector,

and its components are just gonna be the single variables,

x and y, the non-boldfaced.

So this is the vector representing the variable input,

and then correspondingly a boldfaced x

with a little subscript o, x naught,

is gonna be the constant input,

the single point in space near which we are approximating.

So when we write things like that,

this Constant term, simply enough,

is gonna look like evaluating your function

at that boldfaced x naught.

So that's probably the easiest one to handle.

Now the linear term, this looks like a dot product,

and if we kind of expand it out as the dot product,

it looks like we're taking the partial derivative

of f with respect to x,

and then the partial derivative with respect to y,

and we're evaluating both of those

at that boldfaced x naught input.

X naught as its input.

Now, each one of those partial derivatives

is multiplied by variable minus constant numbers,

so this looks like taking the dot product,

here, I'm gonna erase the word "linear".

We're taking with x minus x naught,

and y minus y naught.

This is just expressing the same linear term,

but as a dot product, but the convenience here

is that this is totally the same thing

as saying the gradient of f,

that's the vector that contains all the partial derivatives,

evaluated at the special input, x naught,

and then we're taking the dot product between that

and the variable vector, boldfaced x, minus x naught.

Since when you do this component-wise,

boldfaced x minus x naught, if we kinda think here,

it'll be x the variable minus x naught the constant,

y the variable minus y naught the constant,

which is what we have up there.
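
Written out, the rewriting of the linear term just described looks roughly like this, using subscripts for partial derivatives:

```latex
f_x(\mathbf{x}_0)(x - x_0) + f_y(\mathbf{x}_0)(y - y_0)
  = \begin{bmatrix} f_x(\mathbf{x}_0) \\ f_y(\mathbf{x}_0) \end{bmatrix}
    \cdot
    \begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}
  = \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)
```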

So this expression kind of vectorizes the whole linear term,

and now the beef here, the hard part,

how are we gonna vectorize this quadratic term?

Now that's what I was leading to in the last couple videos,

where I talked about how you express a quadratic form

like this with a matrix, and the way that you do it,

I'm just gonna kinda scroll down to give us some room,

the way that you do it is we'll have a matrix

whose components are all of these constants.

It'll be this 1/2 times the second partial derivative

evaluated there, and I'm just gonna, for convenience's sake,

I'm gonna just take 1/2 times the second partial derivative

with respect to x, and leave it as understood

that we're evaluating it at this point.

And then, on the other diagonal,

you have 1/2 times the other kind of partial derivative

with respect to y two times in a row.

And then we're gonna multiply it by this constant here,

but this term kind of gets broken apart

into two different components.

If you'll remember, in the quadratic form video,

it was always things where it was a,

and then 2b and c, as your constants for the quadratic form,

so if we're interpreting this as two times something,

then it gets broken down,

and on one corner shows up as 1/2 fxy,

and on the other one, kind of 1/2 fxy.

So both of these together are gonna constitute

the entire mixed partial derivative.

And then the way that we express the quadratic form

is we're gonna multiply this by, well by what?

Well, the first component is whatever the thing is

that's squared here, so it's gonna be that x minus x naught,

and then the second component is

whatever the other thing squared is,

which in this case is y minus y naught,

and of course we take that same vector

but we put it in on the other side too.

So let me make a little bit of room,

'cause this is gonna be wide.

So we're gonna take that same vector,

and then kind of put it on its side.

So it'll be x minus x naught as the first component,

and then y minus y naught as the second component,

but it's written horizontally, and this,

if you multiply out the entire matrix,

is gonna give us the same expression that you have up here.
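
As a sketch, multiplying that product out gives back the quadratic term. Every second partial derivative is understood to be evaluated at (x_0, y_0), and the two mixed partials are assumed to be equal, which is why the same 1/2 f_xy sits in both off-diagonal corners:

```latex
\begin{bmatrix} x - x_0 & y - y_0 \end{bmatrix}
\begin{bmatrix}
  \tfrac{1}{2} f_{xx} & \tfrac{1}{2} f_{xy} \\
  \tfrac{1}{2} f_{xy} & \tfrac{1}{2} f_{yy}
\end{bmatrix}
\begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}
  = \tfrac{1}{2} f_{xx}(x - x_0)^2
  + f_{xy}(x - x_0)(y - y_0)
  + \tfrac{1}{2} f_{yy}(y - y_0)^2
```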

And if that seems unfamiliar, if that seems,

you know, how do you go from there to there,

check out the video on quadratic forms,

or you can check out the article where I'm talking

about the quadratic approximation as a whole,

I kind of go through the computation there.

Now, this matrix right here is almost the Hessian matrix,

this is why I made a video about the Hessian matrix.

It's not quite, because everything

has a 1/2 multiplied into it,

so I'm just gonna kinda take that out

and we'll remember we have to multiply

a 1/2 in at some point, but otherwise,

it is the Hessian matrix, which we denote

with a kind of boldfaced H, boldfaced H,

and emphasize that it's the Hessian of f.

The Hessian is something you take of a function.

And like I said, remember each of these terms

we should be thinking of as evaluated

on the special input point,

evaluating it at that special, you know,

boldfaced x naught input point.

I was just kind of too lazy to write it in,

each time the x naught y naught, x naught y naught,

x naught y naught, all of that.

But what we have then is we're multiplying it on the right

by this whole vector, which is the variable vector, boldfaced x,

minus boldfaced x naught, that's what that entire vector is,

and then we kind of have the same thing on the left,

you know, boldfaced vector x minus x naught,

except that we transpose it, we kind of put it on its side,

and the way you denote that,

you have a little T there, for transpose.

So this term captures all of the quadratic information

that we need for the approximation.
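
Here is a minimal numerical sketch of that claim: the Hessian form, with the 1/2 pulled out front, matches the component form. The second-derivative values and the function names below are made up for illustration and do not come from the video.

```python
# Hypothetical second partial derivatives at the point (x0, y0).
# These numbers are invented purely for illustration.
f_xx, f_xy, f_yy = 2.0, -1.0, 3.0


def quadratic_term_components(dx, dy):
    """Quadratic term written out component by component."""
    return 0.5 * f_xx * dx**2 + f_xy * dx * dy + 0.5 * f_yy * dy**2


def quadratic_term_hessian(dx, dy):
    """Same term written as (1/2) * v^T H v, with v = (dx, dy)."""
    H = [[f_xx, f_xy],
         [f_xy, f_yy]]  # Hessian, assuming equal mixed partials
    v = [dx, dy]
    # Matrix-vector product H v, then the dot product v . (H v).
    Hv = [sum(H[i][j] * v[j] for j in range(2)) for i in range(2)]
    return 0.5 * sum(v[i] * Hv[i] for i in range(2))


# dx and dy stand for x - x0 and y - y0.
dx, dy = 0.3, -0.2
print(quadratic_term_components(dx, dy), quadratic_term_hessian(dx, dy))
```

The two printed values should agree up to floating-point rounding.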

So just to put it all together,

if we go back up and we put the Constant term that we have,

the Linear term, and this quadratic form

that we just found, all together,

what we get is that the quadratic approximation of f,

which is a function we'll think of as having a vector input,

boldfaced x, it equals the function itself evaluated at,

you know, whatever point we're approximating near,

plus the gradient of f, which is kind of,

it's the vector analog of a derivative,

evaluated at that point, so this is a constant vector,

dot product with the variable vector,

x minus the constant vector x naught,

that whole thing, plus 1/2 the, then we'll just copy down

this whole quadratic term up there,

the variable minus the constant

multiplied by the Hessian,

which is kind of like an extension

of the second derivative to multivariable functions,

and we're evaluating that, no, let's see,

we're evaluating it at the constant,

at the constant, x naught, and then on the right side,

we're multiplying it by the variable, x minus x naught.
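
Put together as one formula, the description above reads as follows. The name Q_f is an assumed label for the approximation, and H_f denotes the Hessian of f:

```latex
Q_f(\mathbf{x}) = f(\mathbf{x}_0)
  + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)
  + \tfrac{1}{2} (\mathbf{x} - \mathbf{x}_0)^{T} \, \mathbf{H}_f(\mathbf{x}_0) \, (\mathbf{x} - \mathbf{x}_0)
```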

And this, this is the quadratic approximation

in vector form, and the important part is, now,

it doesn't just have to be of a two variable input.

You could imagine plugging in a three variable input,

or a four variable input, and all of these terms make sense.

You know, you take the gradient of a four variable function,

you'll get a vector with four components.

You take the Hessian of a four variable function,

you would get a four by four matrix.

And all of these terms make sense.
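
To illustrate that the vector form does not care about the number of inputs, here is a small sketch in plain Python. The helper `quadratic_approx` and the three-variable test function are hypothetical, chosen only so that the gradient and Hessian are easy to work out by hand.

```python
import math


def quadratic_approx(f0, grad, hess, x, x0):
    """f(x0) + grad . (x - x0) + 1/2 (x - x0)^T H (x - x0), for any dimension."""
    n = len(x0)
    d = [x[i] - x0[i] for i in range(n)]  # the vector x - x0
    linear = sum(grad[i] * d[i] for i in range(n))
    quadratic = 0.5 * sum(d[i] * hess[i][j] * d[j]
                          for i in range(n) for j in range(n))
    return f0 + linear + quadratic


# Hypothetical example function: f(x, y, z) = e^x * sin(y) + z^2
def f(p):
    x, y, z = p
    return math.exp(x) * math.sin(y) + z**2


x0 = [0.0, 0.0, 1.0]       # point we approximate near
f0 = f(x0)                 # constant term
grad = [0.0, 1.0, 2.0]     # (e^x sin y, e^x cos y, 2z) evaluated at x0
hess = [[0.0, 1.0, 0.0],   # row of f_xx, f_xy, f_xz at x0
        [1.0, 0.0, 0.0],   # row of f_yx, f_yy, f_yz at x0
        [0.0, 0.0, 2.0]]   # row of f_zx, f_zy, f_zz at x0

x = [0.1, -0.1, 1.05]      # a nearby input
approx = quadratic_approx(f0, grad, hess, x, x0)
exact = f(x)
print(approx, exact)
```

The gradient has three components and the Hessian is three by three, but the function body is the same one you would use for two variables or four.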

And I think it's also prettier to write it this way,

because it looks a lot more like a Taylor expansion

in the single variable world.

You have, you know, a constant term,

plus the value of a derivative, times x minus a constant,

plus 1/2 what's kind of like the second derivative term,

what's kind of like taking an x squared,

but this is how it looks in the vector world.
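
For comparison, here is the single-variable second-order Taylor approximation being alluded to, with a as an assumed name for the constant point:

```latex
f(x) \approx f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2
```

The gradient plays the role of f'(a), the Hessian plays the role of f''(a), and the transposed vector times the Hessian times the vector plays the role of the squared factor.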

So in that way it's actually maybe a little bit

more familiar than writing it out

in the full, you know, component by component term,

where it's easy to kind of get lost in the weeds there.

So, full vectorized form of the quadratic approximation

of a scalar valued multivariable function.

Boy, is that a lot to say.