- [Instructor] In the last video I introduced

a constrained optimization problem

where we were trying to maximize this function,

f of x, y equals x squared times y,

but subject to a constraint that your values of x and y

have to satisfy x squared plus y squared equals one.

And the way we were visualizing this

was to look at the x, y plane where this circle here

represents our constraint.

All of the points that make up this set,

x squared plus y squared equals one,

and then this curvy line here is one of the contours of f,

meaning, we're setting f of x, y equal to some constant.

And then I was varying around that constant c.

So for high values of c, the contour would look

something like this, this is where the value

of x squared times y is big.

And then for small values of c,

the contours would look like this.

So all the points on this line would be f of x, y

equals like 0.01 in this case, something like that.

Then the way to think about maximizing this function

is to try to increase that value of c

as much as you can without it falling off the circle.

And the key observation is

that happens when they're tangent.

So, you know, you might kind of draw this out

in a little sketch and say there's some curve

representing your constraint, which in this case

would be, you know, where our circle is.

And then the curve representing the contour

would just kiss that curve,

just barely touch it in some way.

Now, that's pretty, but in terms of solving the problem,

we still have some work to do.

And the main tool we're gonna use here is the gradient.

So let me go ahead and draw a lot more contour lines

than there already are for x squared times y.

So this is many of the contour lines,

and I'll draw the gradient field, the gradient field of f.

So I've made a video about the relationship

between the gradient and contour lines.

And the upshot of it is that these gradient vectors,

every time they pass through a contour line,

they're perpendicular to it.

And the basic reason for that

is if you walk along the contour line,

the function isn't changing value,

so if you want it to change most rapidly,

you know, it kind of makes sense you should walk

in the perpendicular direction,

so that no component of the walk that you're taking

is, you know, useless, is along the line

where the function doesn't change.

But again, there's a whole video on that

that's worthy checking out if this feels unfamiliar.

For our purposes, what it means is that

when we're considering this point of tangency,

the gradient of f at that point is gonna be some vector

perpendicular to both the curves at that point.

So that little vector represents the gradient of f

at this point on the plane.

And we can do something very similar

to understand the other curve.

Right now I've just written it as a constraint,

x squared plus y squared equals one.

But you know, to give that function a name,

let's say we've defined g of x, y to be

x squared plus y squared,

x squared plus y squared.

In that case, this constraint is pretty much

just one of the contour lines for the function g,

and we can take a look at that.

If we go over here and we look at

all of the other contour lines for this function g,

and it should make sense that they're circles,

because this function is x squared plus y squared.

And if we took a look at the gradient of g,

and we go over and ask about the gradient of g,

it has that same property,

that every gradient vector,

if it passes through a contour line,

is perpendicular to it.

So over on our drawing here,

the gradient vector of g would also be perpendicular

to both these curves.

And you know, maybe in this case,

it's not as long as the gradient of f, or maybe it's longer.

There's no reason that it would be the same length,

but the important fact is that it's proportional.

And the way that we're gonna write this in formulas

is to say that the gradient of f evaluated,

let's see, evaluated at whatever the maximizing value

of x and y are, so we should give that a name probably.

Maybe x sub m, y sub m, the specific values

of x and y that are gonna be at this point

that maximizes the function subject to our constraint.

So that's gonna be related to the gradient of g,

it's not gonna be quite equal,

so I'll leave some room here.

Related to the gradient of g,

evaluated at that same point, xm, ym.

And like I said, they're not equal, they're proportional.

So we need to have some kind

of proportionality constant in there.

You almost always use the variable lambda,

and this guy has a fancy name,

it's called a Lagrange multiplier.

Lagrange, Lagrange was one

of those famous French mathematicians.

I always get him confused with some of the other

French mathematicians at the time like Legendre or Laplace,

there's a whole bunch of things.

Let's see, multiplier, distracting myself talking here.

So Lagrange multiplier.

So there's a number of things in multivariable calculus

named after Lagrange, and this is one of the big ones.

This is a technique that he kind of developed

or at the very least popularized.

And the core idea is to just set

these gradients equal to each other,

'cause that represents when the contour line

for one function is tangent to the contour line of another.

So this, this is something that we can actually work with.

And let's start working it out, right,

let's see what this translates to in formulas.

So I already have g written here,

so let's go ahead and just evaluate

what the gradient of g should be.

And that's the gradient of x squared plus y squared.

And the way that we take our gradient is

it's gonna be a vector

whose components are all partial derivatives.

So the first component is the partial derivative

with respect to x.

So we treat x as a variable, y looks like a constant.

The derivative is two x.

The second component the partial derivative

with respect to y, so now we're treating y as the variable,

x is the constant, so the derivative looks like two y.

So that's the gradient of g.

Then the gradient of f, gradient of f.

It's gonna look like gradient of,

let's see, what is x?

What is f?

It's x squared times y.

So x squared times y.

We do the same thing.

First component partial derivative with respect to x,

x looks like a variable, so it's derivative is two times x

and then that y looks like a constant when we're up here.

But then partial derivative with respect to y,

that y looks like a variable, that x squared just looks

like a constant sitting in front of it.

So that's what we get.

And now if we kind of work out

this Lagrange mulitplier expression

using these two vectors, what we have written,

what we're gonna have written is that this vector,

two xy x squared

is proportional with the proportionality constant lambda to

the gradient vector for g.

So two x two y.

And if you want you can think about this

as two separate equations.

I mean right now it's one equation with vectors,

but really what this is saying is

you've got two separate equations,

two times xy is equal to lambda,

ah, gotta change colors a lot here.

Lambda times two x.

Hm, gonna be stickler for color.

Keep red all of the things associated with g.

And then,

this second equation is that x squared

is equal to lambda times two y.

And this might seem like a problem,

because we have three unknowns, x, y,

and this new lambda that we introduced.

Kind of shot ourselves in the foot

by giving ourselves a new variable to deal with.

But we only have two equations.

So in order to solve this, we're gonna need three equations.

And the third equation is something

that we've known the whole time.

It's been part of the original problem.

It's the constraint itself,

x squared plus y squared equals one.

So that, that third equation,

x squared plus y squared is equal to one.

So these are the three equations

that characterize our constrained optimization problem.

The bottom one just tells you

that we have to be on this unit circle here.

Allow me to just highlight it.

We have to be on this unit circle.

And then these top two tell us what's necessary

in order for our contour lines,

the contour of f and the contour of g

to be perfectly tangent with each other.

So, in the next video, I'll go ahead and solve this.

At this point, it's pretty much just algebra to deal with,

but it's worthy going through.

And then in the next couple ones,

I'll talk about a way that you can encapsulate

all three of these equations into one expression,

and also a little bit about the interpretation

of this lambda that we introduced.

'Cause it's not actually just a dummy variable,

it has a pretty interesting meaning in physical contexts

once you're actually dealing

with a constrained optimization problem in practice.

So I'll see you in the next video.