3.5.3 Examples:

1. (x - c)² + y² = 1. As we have seen above, this equation represents the family of circles of unit radius the centres of which lie on the x-axis (Fig. 19). Geometrically speaking, we see at once that the envelope must consist of the two lines y = 1 and y= -1. We can verify this by means of our rule, because the two equations (x - c)² + y² = 1 and -2(x - c) = 0 yield immediately the envelope in the form y² = 1.

2. The family of circles of unit radius passing through the origin, the centres of which must therefore lie on the circle of unit radius about the origin, is given by the equation

(x - cos c)² + (y - sin c)² = 1, or x² + y² - 2x cos c - 2y sin c = 0.

The derivative with respect to c equated to zero yields x sin c - y cos c = 0. These two equations are satisfied by the values x = 0 and y = 0. However, if x² + y² ≠ 0, it readily follows from our equations that sin c = y/2, cos c = x/2, so that on eliminating c we obtain x² + y² = 4. Thus, our rule yields for the envelope the circle of radius 2 about the origin, as is anticipated by geometrical intuition, but it also yields the isolated point x = 0, y = 0.
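The result of Example 2 can be checked numerically. The following is a minimal sketch, assuming the family is written as f(x, y, c) = x² + y² - 2x cos c - 2y sin c = 0; the function names are illustrative only. It confirms that the point (2 cos c, 2 sin c) satisfies both f = 0 and f_c = 0, and that the origin does so as well.

```python
import math

def f(x, y, c):
    # Circle of unit radius through the origin with centre (cos c, sin c).
    return x * x + y * y - 2 * x * math.cos(c) - 2 * y * math.sin(c)

def f_c(x, y, c):
    # Partial derivative of f with respect to the parameter c.
    return 2 * x * math.sin(c) - 2 * y * math.cos(c)

def envelope_point(c):
    # Candidate point of contact on the circle of radius 2 about the origin.
    return 2 * math.cos(c), 2 * math.sin(c)

residuals = []
for k in range(12):
    c = 2 * math.pi * k / 12
    x, y = envelope_point(c)
    residuals.append(max(abs(f(x, y, c)), abs(f_c(x, y, c))))
max_residual = max(residuals)

# The origin satisfies both equations for any c (here c = 1 as a sample).
origin_residual = max(abs(f(0.0, 0.0, 1.0)), abs(f_c(0.0, 0.0, 1.0)))
print(max_residual, origin_residual)
```

Both residuals vanish up to rounding, which is why the rule returns the isolated point together with the circle of radius 2.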

3. The family of parabolas (x - c)² - 2y = 0 (Fig. 20) also has an envelope, which both by intuition and by our rule is found to be the x-axis.

4. Next, consider the family of circles (x - 2c)² + y² - c² = 0 (Fig. 21). Differentiation with respect to c gives 2x - 3c = 0 and we find by substitution that the equation of the envelope is

y² = x²/3;

hence, the envelope consists of the two lines y = x/√3 and y = -x/√3. The origin is an exception, in that contact does not occur there.
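The tangency in Example 4 can be verified directly. The sketch below assumes the family is (x - 2c)² + y² = c², i.e. circles with centre (2c, 0) and radius |c|, and that the envelope consists of the lines y = ±x/√3. A line touches a circle exactly when its distance from the centre equals the radius. The helper name is illustrative.

```python
import math

def distance_to_line(px, py, a, b, d):
    # Distance from the point (px, py) to the line a*x + b*y + d = 0.
    return abs(a * px + b * py + d) / math.hypot(a, b)

max_gap = 0.0
for c in (-2.0, -0.5, 0.25, 1.0, 3.0):
    centre_x, centre_y, radius = 2.0 * c, 0.0, abs(c)
    for sign in (1.0, -1.0):
        # Line y = sign * x / sqrt(3), written as x - sign*sqrt(3)*y = 0.
        dist = distance_to_line(centre_x, centre_y, 1.0, -sign * math.sqrt(3.0), 0.0)
        max_gap = max(max_gap, abs(dist - radius))
print(max_gap)
```

For c = 0 the circle shrinks to the origin, which is consistent with the remark that contact does not occur there.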

5. Another example is the family of straight lines on which unit length is intercepted by the x- and y-axes. If α = c is the angle indicated in Fig. 22, these lines are given by the equation

x/cos α + y/sin α = 1.

The condition for the envelope is

x sin α/cos² α - y cos α/sin² α = 0,

which, in conjunction with the equation of the lines, yields the envelope in parametric form

x = cos³ α, y = sin³ α.

From these equations, we obtain the further equation

x^(2/3) + y^(2/3) = 1.

This curve is called the astroid. It has four symmetrical branches meeting at four cusps (Figs. 23/24).
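A short numerical check of the astroid example follows. It is a sketch under the assumption that the lines are written as x/cos α + y/sin α = 1, so that the candidate point of contact is (cos³ α, sin³ α); the function names are illustrative.

```python
import math

def line(x, y, a):
    # Left side minus right side of x/cos(a) + y/sin(a) = 1.
    return x / math.cos(a) + y / math.sin(a) - 1.0

def line_da(x, y, a):
    # Partial derivative of the line equation with respect to the angle a.
    return x * math.sin(a) / math.cos(a) ** 2 - y * math.cos(a) / math.sin(a) ** 2

worst = 0.0
for k in range(1, 10):
    a = k * math.pi / 20  # angles strictly between 0 and pi/2
    x, y = math.cos(a) ** 3, math.sin(a) ** 3
    worst = max(
        worst,
        abs(line(x, y, a)),
        abs(line_da(x, y, a)),
        abs(x ** (2 / 3) + y ** (2 / 3) - 1.0),
    )
print(worst)
```

Only the first quadrant is sampled, so that the fractional powers stay real; the other three branches follow by symmetry.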

6. The astroid also appears as the envelope of the family of ellipses

x²/c² + y²/(1 - c)² = 1

with semi-axes c and 1 - c (Fig. 24).

7. The family of curves (x - c)² - y³ = 0 shows that under certain circumstances our process may fail to give an envelope. The rule gives here the x-axis. However, as Fig. 25 shows, this is not an envelope, but the locus of the cusps of the curves.

8. In the case of the family

(x - c)³ - y² = 0,

we again find that the discriminant curve is the x-axis (Fig. 26). This is again the cusp-locus; but it touches each of the curves, and in this case must be regarded as an envelope.

9. Another example, in which the discriminant curve consists of the envelope plus the locus of double points, is given by the family of strophoids (Fig. 27)

[x² + (y - c)²](x - 2) + x = 0.

All the curves of this family are similar to each other and arise from one another by translation parallel to the y-axis. By differentiation, we obtain f_c = -2(y - c)(x - 2) = 0, so that we must have either x = 2 or y = c. However, the line x = 2 does not enter into the matter, because no finite value of y corresponds to x = 2. We therefore have y = c, so that the discriminant curve is x²(x - 2) + x = 0, i.e., x(x - 1)² = 0. This curve consists of the two straight lines x = 0 and x = 1. As we see from Fig. 27, only x = 0 is an envelope; the line x = 1 passes through the double points of the curves.

10. An envelope need not be the locus of the points of intersection of neighbouring curves; this is shown by the family of identical and parallel cubical parabolas y - (x - c)³ = 0. No two of these curves intersect each other. The rule gives the equation f_c = 3(x - c)² = 0, so that the x-axis y = 0 is the discriminant curve. Since all the curves of the family are touched by it, it is also the envelope (Fig. 28).

11. The notion of the envelope enables us to give a new definition for the evolute of a curve C. Let C be given by x = φ(t), y = ψ(t). We then define the evolute of C as the envelope of the normals of C. As the normals of C are given by

(x - φ(t)) φ'(t) + (y - ψ(t)) ψ'(t) = 0,

the envelope is found by differentiating this equation with respect to t:

(x - φ) φ'' + (y - ψ) ψ'' - (φ'² + ψ'²) = 0.

From these two equations, we obtain the parametric representation of the envelope

x = φ - ρ ψ'/√(φ'² + ψ'²), y = ψ + ρ φ'/√(φ'² + ψ'²),

where

ρ = (φ'² + ψ'²)^(3/2)/(φ' ψ'' - ψ' φ'')

denotes the radius of curvature. These equations are identical with those given for the evolute.

12. Let a curve C be given by x = φ(t), y = ψ(t). We form the envelope E of the circles with their centres on C and passing through the origin O. Since the circles are given by

x² + y² - 2xφ(t) - 2yψ(t) = 0,

the points of E satisfy, in addition, the equation

xφ'(t) + yψ'(t) = 0.

Hence, if P is the point (φ(t), ψ(t)) and Q(x, y) the corresponding point of E, then OQ is perpendicular to the tangent to C at P. Since, by definition, PQ = PO, PO and PQ make equal angles with the tangent to C at P.

If we now imagine O to be a luminous point and C a reflecting curve, then QP is the reflected ray corresponding to OP. The envelope of the reflected rays is called the caustic of C with respect to O. The caustic is the evolute of E: the reflected ray PQ is normal to E, since a circle with centre P touches E at Q, and the envelope of the normals of E is the evolute, as we have seen in the preceding example.

For example, let C be a circle passing through O. Then E is the path described by the point O' of a circle C' congruent to C which rolls on C and starts with O and O' coinciding. In fact, during the motion, O and O' always occupy symmetrical positions with respect to the common tangent of the two circles. Thus, E will be a special epicycloid, in fact, a cardioid. As the evolute of an epicycloid is a similar epicycloid (Vol. I, Chapter 5, Ex. 2/3), the caustic of C with respect to O is in this case a cardioid.

3.5.4 Envelopes of Families of Surfaces: The remarks made about the envelopes of families of curves apply with few changes to families of surfaces. If, in the first instance, we consider a one-parameter family of surfaces f(x, y, z, c) = 0 in a definite interval of parameter values c, we shall say that a surface E is the envelope of the family, if it touches each surface of the family along an entire curve and, moreover, these curves of contact form a one-parameter family of curves on E which completely cover E.

An example is given by the family of all spheres of unit radius with centres on the z-axis. We see intuitively that the envelope is the cylinder x² + y² - 1 = 0 with unit radius and axis along the z-axis; the family of curves of contact is simply the family of circles parallel to the xy-plane with unit radius and centre on the z-axis.

The envelopes of spheres of constant radius the centres of which lie along curves are called tube-surfaces.

As in 3.5.2, if we assume that the envelope exists, we can find it by the following heuristic method. We first consider the surfaces f(x, y, z, c) = 0 and f(x, y, z, c + h) = 0, corresponding to two different parameter values c and c + h. These two equations determine the curve of intersection of the two surfaces (we expressly assume that such a curve of intersection does exist). In addition to the two equations above, this curve also satisfies the third equation

[f(x, y, z, c + h) - f(x, y, z, c)]/h = 0.

If we let h tend to zero, the curve of intersection will approach a definite limiting position and this limit curve is determined by the two equations

f(x, y, z, c) = 0, f_c(x, y, z, c) = 0.

This curve is often referred to in a non-rigorous but intuitive way as the intersection of neighbouring surfaces of the family. It is still a function of the parameter c, so that all the curves of intersection for the different values of c form in space a one-parameter family of curves. If we eliminate the quantity c from the two equations, we obtain an equation, which is called the discriminant. As in 3.5.2, we can show that the envelope must satisfy this discriminant equation.

Just as in the case of plane curves, we may readily convince ourselves that a plane touching the discriminant surface also touches the corresponding surface of the family, provided that f_x² + f_y² + f_z² ≠ 0. Hence, the discriminant surface again gives the envelopes of the family and the loci of the singularities of the surfaces of the family.

As a first example, consider the family of spheres above

x² + y² + (z - c)² - 1 = 0.

For finding the envelope, we have the additional equation

f_c = -2(z - c) = 0.

Obviously, for fixed values of c, these two equations represent the circle of unit radius parallel to the xy-plane at the height z=c. If we eliminate the parameter c between the two equations, we obtain the equation of the envelope in the form x²+y²-1=0, which is the equation of the right circular cylinder with unit radius and the z-axis as axis.

While for families of curves the formation of the envelope has a meaning only for one-parameter families, in the case of families of surfaces, it is also possible to find envelopes of two-parameter families f(x, y, z, c1, c2) = 0. For example, if we consider the family of all spheres with unit radius and centre on the xy-plane, represented by the equation

(x - c1)² + (y - c2)² + z² - 1 = 0,

intuition at once tells us that the two planes z = 1 and z = -1 touch a surface of the family at every point. In general, we shall say that a surface E is the envelope of a two-parameter family of surfaces, if at every point P of E the surface E touches a surface of the family in such a way that, as P ranges over E, the parameter values c1, c2, corresponding to the surface touching E at P, range over a region of the c1c2-plane, and, moreover, different points (c1, c2) correspond to different points P of E. A surface of the family then touches the envelope in a point, and not, as before, along an entire curve.

With assumptions similar to those made in the case of plane curves, we find that the point of contact of a surface of the family with the envelope, if it exists, must satisfy the equations f(x, y, z, c1, c2) = 0, ∂f/∂c1 = 0, ∂f/∂c2 = 0. In general, we can find from these three equations the point of contact of each separate surface by assigning the corresponding values to the parameters. Conversely, if we eliminate the parameters c1 and c2, we obtain an equation which the envelope must satisfy.

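For example, the family of spheres with unit radius and centre on the xy-plane is given by the equation

f(x, y, z, c1, c2) = (x - c1)² + (y - c2)² + z² - 1 = 0

with the two parameters c1 and c2. The rule for forming the envelope yields the two equations

∂f/∂c1 = -2(x - c1) = 0, ∂f/∂c2 = -2(y - c2) = 0.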

Thus, we have the discriminant equation z² - 1 = 0 and, in fact, the two planes z = 1 and z = -1 are envelopes, as we have already seen intuitively.

Exercises 3.5

1. Let z = u(x, y) be the equation of a tube-surface, i.e., the envelope of a family of spheres of unit radius with centres on some curve y = f(x) in the xy-plane. Prove that

2. Find
(a) the envelope of the two-parameter family of planes for which

where P, Q, R denote the points of interception of the planes with the co-ordinate axes and O is the origin,
(b) the envelope of the planes for which

3. Let C be an arbitrary curve in the plane and consider the circles of radius p the centres of which lie on C. Prove that the envelope of these circles is formed by the two curves parallel to C at the distance p.

4*. A family of straight lines in space may be given as the intersection of two planes depending on a parameter t:

Prove that, if these straight lines are tangents to some curve, i.e., possess an envelope, then

5*. A family of planes is given by the equation

where t is a parameter.
(a) Find the equation of the envelope of the planes in cylindrical co-ordinates (r, z, θ).
(b) Prove that the envelope consists of the tangents to a certain curve.

6. If a body is always thrown from the same initial position with the same initial velocity, but at different angles, its trajectories form a family of parabolas (it is assumed that the motion always takes place in the same vertical plane). Prove that the envelope of these parabolas is another parabola.

7*. Find the envelope of the family of spheres which touch the three spheres

8. If a plane curve C is given by x = f(t), y = g(t), its polar reciprocal C' is defined as the envelope of the family of straight lines

where (ξ, η) are current co-ordinates.
(a) Prove that C is also the polar reciprocal of C' .
(b) Find the polar reciprocal of the circle

(c) Find the polar reciprocal of the ellipse

Hints and Answers

3.6 MAXIMA AND MINIMA

3.6.1 Necessary Conditions: The theory of maxima and minima for functions of several variables, like that for functions of a single variable, is one of the most important applications of differentiation.

We shall begin by considering a function u = f(x, y) of two independent variables x, y, which we shall represent by a surface in xyu-space. We say that this surface has a maximum with the co-ordinates (x0, y0) if all other values of u in a neighbourhood of that point (all around the point) are less than u(x0, y0). Geometrically speaking, such a maximum corresponds to a hill-top on the surface.  In the same manner, we shall call the point (x0, y0) a minimum if all other values of the function in a given neighbourhood of P0(x0, y0) are larger than u0 = u(x0, y0). Just as with functions of one variable, these concepts always refer only to a  sufficiently small neighbourhood  of a point in question. Considered as a whole, the surface may very well have points which are higher than the hill-tops. Analytically speaking, in order that our definition will apply to functions of more than two independent variables, we say:
A function u = f(x, y, ···) has a maximum (or a minimum) at the point (x0, y0, ···), if at every other point in a neighbourhood of (x0, y0, ···) the function assumes a smaller value (or a larger value) than at the point itself.

If in the neighbourhood of (x0, y0, ···) the function assumes values which are not larger than the value of the function at the point (but may be equal to it), we say that the function has an improper maximum at the point. We define an improper minimum in a similar way.

We again emphasize that this definition refers to a suitably chosen neighbourhood of the point, extending in all directions about the point. Thus, in a closed region, the value of a maximum may very well lie below the largest value assumed by the function in the region. (We already know that a continuous function always assumes a greatest and a smallest value in a closed region!) If the largest value is reached at a point P of the boundary, it need not be a maximum in the sense defined above, as we have already seen for functions of one variable. In fact, if the function is only defined in a closed region, we cannot find a complete neighbourhood of P in which the function is defined; on the other hand, if the closed region is contained within a larger region in which the function is defined, then in this larger region the function may not have a maximum at P, as the following example shows. The function u = -x - y is defined over the entire xy-plane, but we consider it only in the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1. In this closed region, it reaches its largest value 0 at the origin. However, this largest value is not a maximum. In fact, if we consider a neighbourhood all around the origin, we find that the function assumes values greater than zero. However, if we know that the largest or smallest value of a function is assumed at a single point interior to a region, that point must necessarily be a maximum or minimum in the sense defined above.

We shall first present necessary conditions for the occurrence of an extreme value. (As in the case of functions of one variable, we use the terms extreme value, extreme point when we do not wish to distinguish between maxima and minima.) In fact, we will find conditions which must be satisfied at a point (x, y, ···) if there is to be an extreme value at that point. The equations

f_x(x0, y0, z0, ···) = 0, f_y(x0, y0, z0, ···) = 0, f_z(x0, y0, z0, ···) = 0, ···

are necessary conditions for the occurrence of a maximum or minimum of a differentiable function u = f(x, y, ···) at the point P0 with co-ordinates (x0, y0, z0, ···).

On the other hand, as will be seen later, the terms stationary value, stationary point include points at which there are neither maxima nor minima.

In fact, these conditions follow at once from the known conditions for functions of one independent variable. If we consider the variables y, z, ··· as fixed at the values y0, z0, ··· and the function in the neighbourhood of P0 as a function of the single variable x, this function of x must have an extreme value at the point x = x0, and by our previous results we must have f_x(x0, y0, z0, ···) = 0.

Geometrically speaking, the vanishing of the partial derivatives in the case of functions of two independent variables means that at the point (x0, y0) the tangent plane to the surface u = f(x, y) is parallel to the xy-plane.

For many purposes, it is more convenient to combine these conditions in the single equation

df = f_x dx + f_y dy + f_z dz + ··· = 0.

In words: At an extreme point, the differential (linear part of the increment) of the function must vanish, whatever values we assign to the differentials dx, dy, dz, ··· of the independent variables x, y, z, ···. Conversely, if the above equation is satisfied for arbitrary values of dx, dy, ···, it follows that at the given point f_x = f_y = f_z = ··· = 0. We need only take all but one of the differentials of the (mutually independent) variables equal to zero.

The equations

f_x(x0, y0, z0, ···) = 0, f_y(x0, y0, z0, ···) = 0, f_z(x0, y0, z0, ···) = 0, ···

involve as many unknowns x0, y0, z0, ··· as there are equations. Hence, as a rule, we can calculate with them the position of the extreme points. However, a point obtained in this way need not by any means be an extreme point. For example, consider the function u = xy. Our two equations at once give x = 0, y = 0. However, in the neighbourhood of the point x = 0, y = 0, the function assumes both positive and negative values, depending on the quadrant. The function therefore does not have an extreme value there. The geometrical representation of the surface u = xy, which is a hyperbolic paraboloid, shows that the origin is a saddle point (3.1.2, Fig. 1).
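The behaviour of u = xy near the origin can be made concrete with a few function values. The sketch below uses central differences as a stand-in for the partial derivatives; the step sizes are arbitrary choices made for illustration.

```python
def u(x, y):
    return x * y

def partials(x, y, h=1e-6):
    # Central-difference approximations to u_x and u_y.
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return ux, uy

ux0, uy0 = partials(0.0, 0.0)

# One sample point in each quadrant, close to the origin.
eps = 1e-3
quadrant_values = [u(eps, eps), u(-eps, eps), u(-eps, -eps), u(eps, -eps)]
print(ux0, uy0, quadrant_values)
```

Both partial derivatives vanish at the origin, yet the sign of u alternates from quadrant to quadrant, so that the stationary point is neither a maximum nor a minimum.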

It is useful to have a simple expression for a point at which the above equations are satisfied, irrespective of whether the function has an extreme value there or not. We accordingly say that, if there is a point (x0, y0, z0, ···) at which f_x = 0, f_y = 0, f_z = 0, ··· or at which

df = f_x dx + f_y dy + f_z dz + ··· = 0,

the function has a stationary value.

Every point interior to a closed region, at which a differentiable function assumes its largest or smallest values is a stationary point.

In order to decide whether and when our system of equations really yields an extreme value, we must make further investigations. However, in many cases, the state of affairs is clear from the outset, especially when we know that the largest or smallest value of a function must be assumed at an interior point P of a region and discover that our equations determine only a single stationary system x = x0, y = y0, ···. This system of values must then determine the point P, which is necessarily a stationary point. However, if such considerations do not apply, we must investigate more closely; we will postpone this aspect to the appendix. Meanwhile, we shall illustrate the foregoing results by examples.

3.6.2 Examples:

1. For the function u = x² + y², the partial derivatives vanish only at the origin, so that this point alone may be an extreme point. Actually, the function has a minimum there, since at all points (x, y) other than (0, 0) the function must be positive as a sum of squares.

2. The function

u = √(1 - x² - y²)

has the partial derivatives

u_x = -x/√(1 - x² - y²), u_y = -y/√(1 - x² - y²),

which only vanish at the origin. Here we have a maximum, for at all other points (x, y) in the neighbourhood of the origin the quantity 1 - x² - y² under the square root is less than it is at the origin.

3. We wish to construct the triangle for which the product of the sines of the three angles is a maximum, i.e., to find the maximum of the function

f(x, y) = sin x sin y sin(x + y)

in the region 0 ≤ x ≤ π, 0 ≤ y ≤ π, 0 ≤ x + y ≤ π. Since f is positive inside this region, its largest value is positive. On the boundary of the region, where the equality sign holds in at least one of the inequalities defining the region, we have f(x, y) = 0, so that the largest value must lie inside the region.

If we equate the derivatives to zero, we obtain the two equations

cos x sin y sin(x + y) + sin x sin y cos(x + y) = 0,
sin x cos y sin(x + y) + sin x sin y cos(x + y) = 0.

Since 0 < x < π, 0 < y < π, 0 < x + y < π, these yield tan x = tan y or x = y. Substituting this value into the first equation, we obtain the relation sin 3x = 0, whence x = π/3, y = π/3 is the only stationary point, the required triangle being equilateral.
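The stationary point of Example 3 can be confirmed numerically. The following sketch assumes f(x, y) = sin x sin y sin(x + y) on the triangle x ≥ 0, y ≥ 0, x + y ≤ π; the grid size n = 60 is an arbitrary choice that places π/3 exactly on the grid.

```python
import math

def f(x, y):
    return math.sin(x) * math.sin(y) * math.sin(x + y)

def grad(x, y):
    # Partial derivatives of f obtained by the product rule.
    common = math.sin(x) * math.sin(y) * math.cos(x + y)
    fx = math.cos(x) * math.sin(y) * math.sin(x + y) + common
    fy = math.sin(x) * math.cos(y) * math.sin(x + y) + common
    return fx, fy

# Coarse grid search over the closed triangular region.
n = 60
best = max(
    (f(i * math.pi / n, j * math.pi / n), i, j)
    for i in range(n + 1)
    for j in range(n + 1 - i)
)

gx, gy = grad(math.pi / 3, math.pi / 3)
print(best, gx, gy)
```

The grid maximum sits at the indices (20, 20), i.e. x = y = π/3, where the value is (√3/2)³ = 3√3/8 and both partial derivatives vanish up to rounding.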

4. Three points P1, P2, P3 with co-ordinates (x1, y1), (x2, y2), (x3, y3), respectively, are the vertices of an acute-angled triangle. We wish to find a fourth point P with co-ordinates (x, y) such that the sum of its distances from P1, P2 and P3 is as small as possible. This sum of distances is a continuous function of x and y, and has a minimum at some point P inside a large circle enclosing the triangle. This point P cannot lie at a vertex of the triangle, since then the foot of the perpendicular from one of the other two vertices onto the opposite side would yield a smaller sum of distances. Moreover, P cannot lie on the circumference of the circle, if this is sufficiently far away from the triangle. We now form with the distances

r_i = √((x - x_i)² + (y - y_i)²), i = 1, 2, 3,

the function

f(x, y) = r_1 + r_2 + r_3,

which is differentiable everywhere except at P1, P2, P3. We know that at the point P the partial derivatives with respect to x and y must vanish. Thus, differentiation yields for P the conditions

(x - x_1)/r_1 + (x - x_2)/r_2 + (x - x_3)/r_3 = 0,
(y - y_1)/r_1 + (y - y_2)/r_2 + (y - y_3)/r_3 = 0.

According to these equations, the three plane vectors u1, u2, u3 with the components

((x - x_1)/r_1, (y - y_1)/r_1), ((x - x_2)/r_2, (y - y_2)/r_2), ((x - x_3)/r_3, (y - y_3)/r_3)

have the vector sum 0. Moreover, each of these vectors is of unit length. When combined geometrically, they form an equilateral triangle, i.e., each vector is brought into the direction of the next one by a rotation through 2π/3 (Fig. 29). Since these three vectors have the same directions as the three vectors from P1, P2, P3 to P, it follows that each of the three sides of the triangle must subtend at the point P the same angle 2π/3.
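The 2π/3 property can be observed numerically. The sketch below is an illustration only: the triangle is a hypothetical acute-angled example, and the point P is located with Weiszfeld's fixed-point iteration, a standard numerical scheme that is not part of the argument above.

```python
import math

# Hypothetical acute-angled triangle chosen for illustration.
vertices = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]

def weiszfeld_step(x, y):
    # One Weiszfeld update: weighted mean of the vertices with weights 1/r_i.
    num_x = num_y = denom = 0.0
    for vx, vy in vertices:
        r = math.hypot(x - vx, y - vy)
        num_x += vx / r
        num_y += vy / r
        denom += 1.0 / r
    return num_x / denom, num_y / denom

# Start at the centroid, which lies strictly inside the triangle.
x = sum(v[0] for v in vertices) / 3.0
y = sum(v[1] for v in vertices) / 3.0
for _ in range(500):
    x, y = weiszfeld_step(x, y)

# Unit vectors from the vertices to P; their sum should vanish.
units = []
for vx, vy in vertices:
    r = math.hypot(x - vx, y - vy)
    units.append(((x - vx) / r, (y - vy) / r))
sum_x = sum(ux for ux, _ in units)
sum_y = sum(uy for _, uy in units)

# Angles between consecutive unit vectors, in degrees.
angles = []
for i in range(3):
    ax, ay = units[i]
    bx, by = units[(i + 1) % 3]
    dot = max(-1.0, min(1.0, ax * bx + ay * by))
    angles.append(math.degrees(math.acos(dot)))
print((x, y), angles)
```

At the computed point the three unit vectors cancel and each pair encloses an angle of 120 degrees, in agreement with the condition derived above.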

3.6.3 Maxima and Minima with Subsidiary Conditions: The problem of determining the maxima and minima of functions of several variables frequently presents itself in a form other than that treated above. For example, if we wish to find the point of a given surface φ(x, y, z) = 0 which is at the smallest distance from the origin, then we have to determine the minimum of the function

f(x, y, z) = √(x² + y² + z²),

where, however, the quantities x, y, z are no longer three independent variables, but are linked by the equation of the surface φ(x, y, z) = 0 as a subsidiary condition. In fact, such extreme values with subsidiary conditions do not represent a fundamentally new problem. Thus, in our example, we need only solve for one of the variables, say z, in terms of the other two and then substitute this expression in the formula for the distance

√(x² + y² + z²)

in order to reduce the problem to that of determining the stationary values of a function of the two variables x, y.

However, it is more convenient as well as more elegant to express the conditions for a stationary value in a symmetrical form, in which no preference is given to any one of the variables. As a very simple case, which is nevertheless typical, we will consider the problem:

Find the stationary values of a function f(x, y) when the two variables x, y are not mutually independent, but are linked by a subsidiary condition

φ(x, y) = 0.

In order to give a geometrical interpretation to the analytical treatment, we assume, first of all, that the subsidiary condition, as in Fig. 30, is represented by a curve without singularities in the xy-plane and that, moreover, the family of curves f(x, y) = c = const covers a portion of the plane. The problem then is: Among the curves of the family which intersect the curve φ = 0, find the one for which the constant c is the largest or the smallest possible.

As we describe the curve φ = 0, we cross the curves f(x, y) = c and, in general, c changes monotonically; at the point where the sense in which we run through the c-scale is reversed, we may expect an extreme value. We see from Fig. 30 that this occurs for the curve of the family which touches the curve φ = 0. The co-ordinates of the point of contact will be the required values x = ξ, y = η, corresponding to the extreme value of f(x, y). If the two curves f = const and φ = 0 touch, they have the same tangent. Thus, at the point x = ξ, y = η, there holds the proportional relation

f_x : f_y = φ_x : φ_y

or, if we introduce the constant of proportionality λ, the two equations

f_x + λφ_x = 0, f_y + λφ_y = 0

are satisfied. Together with the equation

φ(x, y) = 0,

they allow us to determine the co-ordinates (ξ, η) of the point of contact as well as the constant of proportionality λ.

This argument may fail, for example, when the curve φ = 0 has a singular point, say a cusp as in Fig. 31 below, at the point (ξ, η) at which it meets a curve f = c with the largest or smallest possible c. However, in this case, we have

φ_x(ξ, η) = 0 and φ_y(ξ, η) = 0.

In any case, we are led intuitively to the rule, which we shall prove in 3.6.4:

In order that an extreme value of the function f(x, y) may occur at the point x = ξ, y = η, with the subsidiary condition φ(x, y) = 0, the point (ξ, η) being such that the two equations

φ_x(ξ, η) = 0, φ_y(ξ, η) = 0

are not both satisfied, it is necessary that there should be a constant of proportionality λ such that the two equations

f_x(ξ, η) + λφ_x(ξ, η) = 0, f_y(ξ, η) + λφ_y(ξ, η) = 0

are satisfied together with the equation

φ(ξ, η) = 0.

This rule is known as Lagrange's method of undetermined multipliers, and the factor λ as the Lagrange multiplier.

We observe that this rule yields for the determination of the quantities ξ, η and λ as many equations as there are unknowns. We have therefore replaced the problem of finding the positions of the extreme values (ξ, η) by a problem in which there is the additional unknown λ, but in which we have the advantage of complete symmetry. Lagrange's rule is usually expressed as follows:

In order to find the extreme values of the function f(x, y), subject to the subsidiary condition φ(x, y) = 0, we add to f(x, y) the product of φ(x, y) and an unknown factor λ independent of x and y, and write down the known necessary conditions,

F_x = f_x + λφ_x = 0, F_y = f_y + λφ_y = 0,

for an extreme value of F := f + λφ. In conjunction with the subsidiary condition φ = 0, these conditions serve to determine the co-ordinates of the extreme value and the constant of proportionality λ. Before proceeding to prove the rule of undetermined multipliers rigorously, we shall illustrate its use by means of a simple example: Find the extreme values of the function

u = xy

on the unit circle with centre at the origin, i.e., with the subsidiary condition

x² + y² - 1 = 0.

By our rule, differentiating

xy + λ(x² + y² - 1)

with respect to x and to y, we find that at the stationary points the two equations

y + 2λx = 0, x + 2λy = 0

must be satisfied. In addition, we have the subsidiary condition

x² + y² - 1 = 0.

On solving these equations, we obtain the four points

(1/√2, 1/√2), (-1/√2, -1/√2), (1/√2, -1/√2), (-1/√2, 1/√2).

The first two of these give the maximum u = 1/2, the second two the minimum u = -1/2 of the function u = xy. The fact that the first two really give the maximum and the second two the minimum of the function u can be shown as follows: on the circumference, the function must assume a maximum and a minimum and, since the circumference has no boundary point, these points must be stationary points of the function.
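The example can be checked numerically. The sketch below assumes the stationarity conditions y + 2λx = 0, x + 2λy = 0 for F = xy + λ(x² + y² - 1); the multiplier values -1/2 and 1/2 are obtained by solving these equations at the four points and are not stated in the text.

```python
import math

def residuals(x, y, lam):
    # Stationarity of F = x*y + lam*(x**2 + y**2 - 1) plus the constraint.
    return (y + 2 * lam * x, x + 2 * lam * y, x * x + y * y - 1.0)

s = 1.0 / math.sqrt(2.0)
candidates = [(s, s, -0.5), (-s, -s, -0.5), (s, -s, 0.5), (-s, s, 0.5)]
worst = max(abs(r) for x, y, lam in candidates for r in residuals(x, y, lam))
values = [x * y for x, y, _ in candidates]

# Independent check: scan u = x*y along the circle x = cos(t), y = sin(t).
scan = [
    math.cos(t) * math.sin(t)
    for t in (2 * math.pi * k / 3600 for k in range(3600))
]
u_max, u_min = max(scan), min(scan)
print(worst, values, u_max, u_min)
```

The scan along the circle reproduces the extreme values 1/2 and -1/2 found at the four stationary points.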

3.6.4. Proof of the Method of Undetermined Multipliers in the Simplest Case: As we expect, we arrive at an analytical proof of the method of undetermined multipliers by reducing it to the known case of free extreme values. We assume that at the extreme point the two partial derivatives φ_x(ξ, η) and φ_y(ξ, η) do not both vanish; in order to be specific, assume that φ_y(ξ, η) ≠ 0. Then, by 3.1.3, the equation φ(x, y) = 0 determines y = g(x) uniquely in a neighbourhood of this point as a continuously differentiable function of x. If we substitute this expression in f(x, y), the function

f(x, g(x))

must have a free extreme value at the point x = ξ. Then the equation

f_x + f_y g'(x) = 0

must hold at x = ξ. Moreover, the implicitly defined function y = g(x) satisfies the relation φ_x + φ_y g'(x) = 0 identically. If we multiply this relation by λ = -f_y/φ_y and add it to f_x + f_y g'(x) = 0, we obtain

f_x + λφ_x = 0

and, by the definition of λ, the equation

f_y + λφ_y = 0.

This establishes the method of undetermined multipliers.

This proof demonstrates the importance of the assumption that the derivatives φ_x and φ_y do not both vanish at the point (ξ, η). If both these derivatives vanish, the rule fails, as is shown analytically by the following example. We wish to minimize the function

f(x, y) = x² + y²,

subject to the condition

φ(x, y) = (x - 1)³ - y² = 0.

By Fig. 32, the shortest distance from the origin to the curve (x - 1)³ - y² = 0 is obviously given by the line joining the origin to the cusp S of the curve (we can easily prove that the unit circle with centre at the origin has no other point in common with the curve). The co-ordinates of S, i.e., x = 1 and y = 0, satisfy the equations φ(x, y) = 0, f_y + λφ_y = 0 no matter what value is assigned to λ, but

f_x + λφ_x = 2 ≠ 0.
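The failure of the rule at the cusp can be seen in a few lines. This is a sketch under the assumption that the curve is φ(x, y) = (x - 1)³ - y² = 0, which has a cusp at S = (1, 0), and that f(x, y) = x² + y²; the parametrization x = 1 + s², y = s³ of the curve is introduced here only for the check.

```python
def f(x, y):
    return x * x + y * y

def grad_f(x, y):
    return 2 * x, 2 * y

def grad_phi(x, y):
    # phi(x, y) = (x - 1)**3 - y**2
    return 3 * (x - 1) ** 2, -2 * y

fx, fy = grad_f(1.0, 0.0)
px, py = grad_phi(1.0, 0.0)

# f_x + lam*phi_x at S for several trial multipliers: it never vanishes.
x_residuals = [fx + lam * px for lam in (-100.0, -1.0, 0.0, 1.0, 100.0)]

# Trace the curve as x = 1 + s**2, y = s**3; f is smallest at s = 0.
samples = [f(1 + s * s, s ** 3) for s in (k / 100.0 for k in range(-200, 201))]
min_value = min(samples)
print(px, py, x_residuals, min_value)
```

Both derivatives of φ vanish at S, so no choice of λ can cancel f_x = 2, although S is the point of the curve nearest to the origin.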

We can state the proof of the method of undetermined multipliers in a slightly different way, which is particularly convenient for a generalization. We have seen that the vanishing of the differential of a function at a given point is a necessary condition for the occurrence of an extreme value of the function at that point. For the present problem, we can also make the statement:

In order that the function f(x, y) may have an extreme value at the point (ξ, η), subject to the subsidiary condition φ(x, y) = 0, it is necessary that the differential df shall vanish at that point, it being assumed that the differentials dx and dy are not independent of each other, but are chosen in accordance with the equation

dφ = φ_x dx + φ_y dy = 0

derived from φ = 0. Thus, at the point (ξ, η), the differentials dx and dy must satisfy the equation

df = f_x dx + f_y dy = 0

whenever they satisfy the equation dφ = 0. If we multiply the first of these equations by a number λ, undetermined in the first instance, and add it to the second, we obtain

(f_x + λφ_x) dx + (f_y + λφ_y) dy = 0.

If we determine λ so that

f_y + λφ_y = 0,

as is possible by virtue of the assumption that φ_y ≠ 0, it necessarily follows that (f_x + λφ_x) dx = 0, and since the differential dx can be chosen arbitrarily, for example, equal to 1, we have

f_x + λφ_x = 0.

3.6.5 Generalization of the Method of Undetermined Multipliers: We can extend the method of undetermined multipliers to a larger number of variables as well as to a larger number of subsidiary conditions. We shall consider a special case which includes every essential feature. We shall seek the extreme values of the function

u = f(x, y, z, t)

when the four variables x, y, z, t satisfy the two subsidiary conditions

φ(x, y, z, t) = 0, ψ(x, y, z, t) = 0.

We assume that at the point (ξ, η, ζ, τ) the function takes a value which is an extreme value, when compared with the values at all neighbouring points satisfying the subsidiary conditions. Moreover, we assume that in the neighbourhood of the point P(ξ, η, ζ, τ) two of the variables, say z and t, can be represented as functions of the other two variables x and y by means of the equations

φ(x, y, z, t) = 0, ψ(x, y, z, t) = 0.

In fact, in order to ensure that such solutions z = g(x, y) and t = h(x, y) can be found, we assume that at the point P the Jacobian

∂(φ, ψ)/∂(z, t) = φ_z ψ_t - φ_t ψ_z

does not vanish. If we now substitute the functions

z = g(x, y), t = h(x, y)

into the function u = f(x, y, z, t), then f(x, y, z, t) becomes a function of the two independent variables x and y, and this function must have a free extreme value at the point x = ξ, y = η, whence its two partial derivatives must vanish at that point. The two equations

f_x + f_z ∂z/∂x + f_t ∂t/∂x = 0, f_y + f_z ∂z/∂y + f_t ∂t/∂y = 0

must therefore hold. In order to calculate from the subsidiary conditions the four derivatives

∂z/∂x, ∂z/∂y, ∂t/∂x, ∂t/∂y

occurring there, we could write down the two pairs of equations

φ_x + φ_z ∂z/∂x + φ_t ∂t/∂x = 0, ψ_x + ψ_z ∂z/∂x + ψ_t ∂t/∂x = 0,
φ_y + φ_z ∂z/∂y + φ_t ∂t/∂y = 0, ψ_y + ψ_z ∂z/∂y + ψ_t ∂t/∂y = 0

and solve them for the unknowns

∂z/∂x, ∂z/∂y, ∂t/∂x, ∂t/∂y,

which is possible because the Jacobian

∂(φ, ψ)/∂(z, t)

does not vanish. The problem would then be solved.

Instead, we prefer to retain formal symmetry and clarity by proceeding as follows. We determine two numbers l and m in such a way that the two equations

are satisfied at the point where the extreme value occurs.. The determination of the multipliers l and m. is possible, since we have assumed that the Jacobian does not vanish. If we multiply the equations

by l and m, respectively, and add them to the equation

we obtain

Hence, by the definition of l and m,

Similarly, if we multiply the equations

by l and m, respectively, and add them to the equation

we obtain the additional equation

Thus, we arrive at the result:

If the point (ξ, η, ζ, τ) is an extreme point of f(x, y, z, t), subject to the subsidiary conditions

φ(x, y, z, t) = 0, ψ(x, y, z, t) = 0,

and if at that point the Jacobian ∂(φ, ψ)/∂(z, t) is not zero, then two numbers λ and μ exist such that at the point (ξ, η, ζ, τ)

f_x + λφ_x + μψ_x = 0,
f_y + λφ_y + μψ_y = 0,
f_z + λφ_z + μψ_z = 0,
f_t + λφ_t + μψ_t = 0

and also the subsidiary conditions are satisfied:

φ(x, y, z, t) = 0, ψ(x, y, z, t) = 0.

These last conditions are perfectly symmetrical. Every trace of emphasis on the two variables x and y has disappeared from them, and we should equally well have obtained them, if, instead of assuming that ∂(φ, ψ)/∂(z, t) ≠ 0, we had merely assumed that any one of the Jacobians of φ and ψ with respect to two of the four variables did not vanish, so that in the neighbourhood of the point in question a certain pair of the quantities x, y, z, t (although possibly not z and t) could be expressed in terms of the other pair. Naturally, we have paid a price for this symmetry of our equations; in addition to the unknowns ξ, η, ζ, τ, we now have to find λ and μ. Thus, instead of four unknowns, we have six, determined by the six equations above.
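As an illustration of the six equations in six unknowns, the following minimal sketch treats a hypothetical problem that is not taken from the text: f = x² + y² + z² + t² with the linear conditions φ = x + y + z + t − 4 = 0 and ψ = x + 2y + 3z + 4t − 10 = 0. For this choice the multiplier equations are linear, so they can be solved exactly by elimination; the helper name solve_linear is an assumption made for the example.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    # Augmented matrix with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero pivot in this column and swap it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Normalise the pivot row, then clear this column in every other row.
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


# Hypothetical test problem (an assumption, not from the text):
#   f   = x^2 + y^2 + z^2 + t^2
#   phi = x + y + z + t - 4 = 0
#   psi = x + 2y + 3z + 4t - 10 = 0
# Unknowns in the order x, y, z, t, lam, mu.
A = [
    [2, 0, 0, 0, 1, 1],  # f_x + lam*phi_x + mu*psi_x = 0
    [0, 2, 0, 0, 1, 2],  # f_y + lam*phi_y + mu*psi_y = 0
    [0, 0, 2, 0, 1, 3],  # f_z + lam*phi_z + mu*psi_z = 0
    [0, 0, 0, 2, 1, 4],  # f_t + lam*phi_t + mu*psi_t = 0
    [1, 1, 1, 1, 0, 0],  # phi = 0
    [1, 2, 3, 4, 0, 0],  # psi = 0
]
rhs = [0, 0, 0, 0, 4, 10]
x, y, z, t, lam, mu = solve_linear(A, rhs)

# Jacobian of (phi, psi) with respect to (z, t): phi_z*psi_t - phi_t*psi_z.
jacobian_zt = 1 * 4 - 1 * 3
```

For this particular problem the system gives x = y = z = t = 1 with λ = −2 and μ = 0, and the Jacobian with respect to z and t equals 1, so the exceptional case does not arise.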

Here, as well, we could have carried out the proof somewhat more elegantly by using differential notation. In this notation, the necessary condition for the occurrence of an extreme value at the point P is the equation

df = f_x dx + f_y dy + f_z dz + f_t dt = 0,

where the differentials dz and dt are to be expressed in terms of dx and dy. These differentials are linked by the relations

dφ = φ_x dx + φ_y dy + φ_z dz + φ_t dt = 0,
dψ = ψ_x dx + ψ_y dy + ψ_z dz + ψ_t dt = 0,

obtained by differentiating the subsidiary conditions. If we assume that the two-rowed determinants occurring here do not all vanish at the point (ξ, η, ζ, τ), for example, if we assume that the expression ∂(φ, ψ)/∂(z, t) ≠ 0, then we can determine two numbers λ and μ which satisfy the two equations

f_z + λφ_z + μψ_z = 0,
f_t + λφ_t + μψ_t = 0.

If we multiply the equation dφ = 0 by λ, the equation dψ = 0 by μ and add them to the equation df = 0, then the last two equations yield

(f_x + λφ_x + μψ_x) dx + (f_y + λφ_y + μψ_y) dy = 0.

Since here dx and dy are independent differentials (that is, arbitrary numbers), it follows that the numbers λ and μ also satisfy the equations

f_x + λφ_x + μψ_x = 0,
f_y + λφ_y + μψ_y = 0,

and we are once again led to the method of undetermined multipliers.

In exactly the same manner, we can state and prove the method of undetermined multipliers for arbitrary numbers of variables and subsidiary conditions. The general rule follows:

If in a function

u = f(x₁, x₂, . . . , xₙ)

not all n variables x₁, x₂, . . . , xₙ are independent, but are interlinked by the m subsidiary conditions (m < n)

φ₁(x₁, . . . , xₙ) = 0, φ₂(x₁, . . . , xₙ) = 0, . . . , φₘ(x₁, . . . , xₙ) = 0,

then we introduce m multipliers λ₁, λ₂, . . . , λₘ and equate the derivatives of the function

F = f + λ₁φ₁ + λ₂φ₂ + . . . + λₘφₘ

with respect to x₁, x₂, . . . , xₙ, when λ₁, λ₂, . . . , λₘ are constant, to zero. The equations

∂F/∂x₁ = 0, ∂F/∂x₂ = 0, . . . , ∂F/∂xₙ = 0

thus obtained, together with the m subsidiary conditions

φ₁ = 0, φ₂ = 0, . . . , φₘ = 0,

represent a system of m + n equations for the m + n unknown quantities x₁, x₂, . . . , xₙ, λ₁, λ₂, . . . , λₘ. These equations must be satisfied at every extreme value of f unless at that extreme value every one of the Jacobians of the m functions φ₁, φ₂, . . . , φₘ with respect to m of the variables x₁, . . . , xₙ vanishes.
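The general rule can be checked numerically for a candidate point. The sketch below is only an illustration under stated assumptions: the function name multiplier_residuals, the use of central differences and the test problem (f = x₁ + x₂ + x₃ on the sphere x₁² + x₂² + x₃² = 3) are choices made here and do not come from the text.

```python
def multiplier_residuals(f, constraints, x, lambdas, h=1e-5):
    """Return the n derivatives dF/dx_i (central differences) followed by the m constraint values.

    F = f + sum(lambda_k * phi_k); all returned numbers vanish at a point
    that satisfies the m + n equations of the multiplier rule.
    """
    def big_f(point):
        return f(point) + sum(lam * phi(point) for lam, phi in zip(lambdas, constraints))

    derivatives = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        derivatives.append((big_f(forward) - big_f(backward)) / (2 * h))
    return derivatives + [phi(x) for phi in constraints]


# Hypothetical problem (an assumption): f = x1 + x2 + x3 on the sphere x1^2 + x2^2 + x3^2 = 3.
def f(p):
    return p[0] + p[1] + p[2]


def sphere(p):
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2 - 3.0


# Candidate found by hand: 1 + 2*lam*x_i = 0 with x_i = 1 gives lam = -1/2.
residuals = multiplier_residuals(f, [sphere], [1.0, 1.0, 1.0], [-0.5])
```

All four residuals (three derivatives and one constraint value) should be zero up to rounding, which confirms that (1, 1, 1) with λ = −1/2 satisfies the m + n = 4 equations for this example.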

In connection with the method of undetermined multipliers, we must still note the following important fact. The rule gives us an elegant formal method for determining the points where extreme values occur, but it merely gives us a necessary condition. There arises then the further question whether and when the points, which we find by means of the multiplier method, do actually give us a maximum or a minimum of the function. We shall not deal here with this question as it would lead us much too far afield. As in the case of free extreme values, when we apply the method of undetermined multipliers, we usually know beforehand that there exists an extreme value. Thus, if the method determines the point P uniquely and the exceptional case, when all the Jacobians vanish, does not occur anywhere in the region under discussion, we can be sure that we have really found the point where the extreme value occurs.
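That the rule is only a necessary condition can be seen from a small hypothetical example, which is not taken from the text: f(x, y) = x³ with the subsidiary condition φ(x, y) = y = 0. The sketch below evaluates the multiplier equations for F = f + λφ at the origin and then compares values of f along the constraint.

```python
# Hypothetical example (an assumption): f(x, y) = x^3 with the condition phi(x, y) = y = 0.
def f(x, y):
    return x ** 3


def multiplier_equations(x, y, lam):
    """Return (F_x, F_y, phi) for F = f + lam*phi with f = x^3 and phi = y."""
    return (3 * x ** 2, lam, y)


# The multiplier equations are satisfied at x = 0, y = 0, lam = 0 ...
residuals = multiplier_equations(0.0, 0.0, 0.0)

# ... yet on the line y = 0 the function takes both smaller and larger values nearby.
eps = 1e-3
below = f(-eps, 0.0)
above = f(eps, 0.0)
```

Here φ_y = 1 does not vanish, so the exceptional case is not involved; the origin satisfies the rule and is nevertheless neither a maximum nor a minimum of f on the constraint.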

3.6.6 Examples:

1. As a first example, we attempt to find the maximum of the function f(x, y, z) = x²y²z², subject to the subsidiary condition x² + y² + z² = c². On the spherical surface x² + y² + z² = c², the function must assume a largest value and, since the spherical surface has no boundary points, this value must be a maximum in the sense defined above. According to the rule, we form the expression

F = x²y²z² + λ(x² + y² + z² − c²)

and obtain by differentiation

2xy²z² + 2λx = 0,
2x²yz² + 2λy = 0,
2x²y²z + 2λz = 0.

The solutions with x = 0, y = 0, or z = 0 can be excluded, because at these points the function f assumes its least value, i.e., zero. The other solutions of these equations are x² = y² = z², λ = −x⁴. Using the subsidiary condition, we obtain the values

x = ±c/√3, y = ±c/√3, z = ±c/√3

for the required co-ordinates.

At all these points, the function assumes the same value c⁶/27, which is accordingly the required maximum. Hence any triad of numbers satisfies the relation

∛(x²y²z²) ≤ c²/3 = (x² + y² + z²)/3,

i.e., the geometric mean of three positive numbers x², y², z² never exceeds their arithmetic mean.

In fact, it is true that for any arbitrary set of positive numbers the geometric mean never exceeds the arithmetic mean. The proof is similar to that just given.

Another proof is given in Ex. 18.
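The result of Example 1 can be checked numerically. The sketch below is an illustration only: the radius c = 2, the sample size and the helper names are assumptions made here. It evaluates the three multiplier equations at x = y = z = c/√3 with λ = −x⁴ and compares f at random points of the sphere with the value c⁶/27.

```python
import math
import random


def f(x, y, z):
    return x * x * y * y * z * z


def multiplier_residuals(x, y, z, lam):
    """Left-hand sides of the three equations obtained by differentiating F."""
    return (
        2 * x * y ** 2 * z ** 2 + 2 * lam * x,
        2 * x ** 2 * y * z ** 2 + 2 * lam * y,
        2 * x ** 2 * y ** 2 * z + 2 * lam * z,
    )


def random_point_on_sphere(radius, rng):
    """Normalise a Gaussian vector to obtain a point on the sphere of the given radius."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(t * t for t in v))
        if norm > 1e-12:
            return [radius * t / norm for t in v]


c = 2.0                      # arbitrary radius chosen for the illustration
a = c / math.sqrt(3.0)       # coordinate of the stationary point x = y = z = c/sqrt(3)
lam = -a ** 4
f_max = c ** 6 / 27

rng = random.Random(0)       # fixed seed so that the run is reproducible
largest_sampled = max(f(*random_point_on_sphere(c, rng)) for _ in range(10000))
```

No sampled value should exceed c⁶/27, in agreement with the inequality between the geometric and arithmetic means.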

2. Find the triangle with sides x, y, z and given perimeter 2s and the largest possible area. By a well-known formula, the square of the area is given by

f(x, y, z) = s(s − x)(s − y)(s − z).

Hence, we must find the maximum of this function subject to the subsidiary condition

φ(x, y, z) = x + y + z − 2s = 0,

where x, y, z are restricted by the inequalities

x ≥ 0, y ≥ 0, z ≥ 0, x + y ≥ z, x + z ≥ y, y + z ≥ x.

On the boundary of this closed region, i.e., whenever one of these inequalities becomes an equation, we have always f = 0. Consequently, the largest value of f occurs in the interior and is a maximum. We form the function

F(x, y, z) = s(s − x)(s − y)(s − z) + λ(x + y + z − 2s)

and obtain by differentiation the three conditions

−s(s − y)(s − z) + λ = 0,
−s(s − x)(s − z) + λ = 0,
−s(s − x)(s − y) + λ = 0.

By solving each of these equations for λ and equating the three resulting expressions, we obtain x = y = z = 2s/3, i.e., the solution is an equilateral triangle.
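A quick numerical comparison supports the conclusion of Example 2. In the sketch below the semi-perimeter s = 3, the sample size and the helper names are assumptions chosen for illustration; random triangles of perimeter 2s are generated by rejection sampling and the square of their area is compared with the value s⁴/27 attained by the equilateral triangle.

```python
import random


def area_squared(x, y, z):
    """Square of the area of a triangle with sides x, y, z (Heron's formula)."""
    s = (x + y + z) / 2
    return s * (s - x) * (s - y) * (s - z)


def random_triangle(s, rng):
    """Sides of a random triangle with perimeter 2s.

    With x + y + z = 2s the triangle inequalities are equivalent to
    each side being smaller than s, which is what the loop enforces.
    """
    while True:
        x, y = rng.uniform(0, s), rng.uniform(0, s)
        z = 2 * s - x - y
        if 0 < z < s:
            return x, y, z


s = 3.0                       # arbitrary semi-perimeter chosen for the illustration
side = 2 * s / 3              # side of the equilateral triangle
equilateral_value = area_squared(side, side, side)

rng = random.Random(0)        # fixed seed so that the run is reproducible
largest_sampled = max(area_squared(*random_triangle(s, rng)) for _ in range(10000))
```

The equilateral triangle gives f = s·(s/3)³ = s⁴/27, and no sampled triangle should give a larger value.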

3. We shall now prove the theorem: The inequality

uv ≤ u^α/α + v^β/β

holds for every u ≥ 0, v ≥ 0 and every α > 0, β > 0 for which

1/α + 1/β = 1.

This inequality is certainly valid if either u or v vanishes, whence we may restrict ourselves to values of u and v such that uv ≠ 0. If the inequality holds for a pair of numbers u, v, it also holds for all numbers ut^(1/α), vt^(1/β), where t is an arbitrary positive number. Hence, we need only consider values of u, v for which uv = 1 and have to show that the inequality

u^α/α + v^β/β ≥ 1

holds for all positive numbers u, v such that uv = 1.

For this purpose, we solve the problem of finding the minimum of u^α/α + v^β/β, subject to the subsidiary condition uv = 1. Obviously, this minimum exists and occurs at a point (u, v), where u ≠ 0, v ≠ 0. Hence, there exists a multiplier −λ, for which

u^(α−1) − λv = 0, v^(β−1) − λu = 0.

On multiplication by u and v, respectively, these yield u^α = λ, v^β = λ. Taken with uv = 1, these imply that u = v = 1. The minimum of the function u^α/α + v^β/β is therefore 1/α + 1/β = 1, i.e., the statement

u^α/α + v^β/β ≥ 1

when uv = 1 has been proved.

If in the inequality uv ≤ u^α/α + v^β/β, just proved, we replace u and v by

u = uᵢ/(u₁^α + ··· + uₙ^α)^(1/α), v = vᵢ/(v₁^β + ··· + vₙ^β)^(1/β),

respectively, where u₁, u₂, ···, uₙ, v₁, v₂, ···, vₙ are arbitrary non-negative numbers and at least one u and one v is non-zero, and if we then sum the inequalities thus obtained for i = 1, . . ., n, we obtain Hölder's inequality

u₁v₁ + ··· + uₙvₙ ≤ (u₁^α + ··· + uₙ^α)^(1/α) (v₁^β + ··· + vₙ^β)^(1/β).

This inequality holds for any 2n numbers uᵢ, vᵢ, where uᵢ ≥ 0, vᵢ ≥ 0 (i = 1, ···, n), not all the uᵢ and not all the vᵢ are zero and the indices α, β are such that α > 0, β > 0, 1/α + 1/β = 1.
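Hölder's inequality is easy to test on numerical data. The following sketch is an illustration under assumptions made here (the exponent α = 3, the sample sizes and the function name holder_sides); it evaluates both sides for random non-negative numbers and also for a case in which equality is expected, namely vᵢ = uᵢ^(α−1).

```python
import random


def holder_sides(u, v, alpha, beta):
    """Return (left, right) sides of Hoelder's inequality for non-negative u_i, v_i."""
    left = sum(ui * vi for ui, vi in zip(u, v))
    right = (sum(ui ** alpha for ui in u) ** (1 / alpha)
             * sum(vi ** beta for vi in v) ** (1 / beta))
    return left, right


alpha = 3.0                      # arbitrary exponent chosen for the illustration
beta = alpha / (alpha - 1)       # chosen so that 1/alpha + 1/beta = 1

rng = random.Random(1)           # fixed seed so that the run is reproducible
results = []
for _ in range(1000):
    u = [rng.uniform(0, 5) for _ in range(6)]
    v = [rng.uniform(0, 5) for _ in range(6)]
    results.append(holder_sides(u, v, alpha, beta))
holds = all(left <= right * (1 + 1e-12) for left, right in results)

# Equality case: v_i = u_i^(alpha - 1), so that v_i^beta = u_i^alpha.
u_eq = [1.0, 2.0, 3.0]
v_eq = [ui ** (alpha - 1) for ui in u_eq]
left_eq, right_eq = holder_sides(u_eq, v_eq, alpha, beta)
```

For the equality case both sides equal u₁^α + u₂^α + u₃^α = 36, up to rounding.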

4. Find the point on the closed surface

φ(x, y, z) = 0

which is at the smallest distance from the fixed point (ξ, η, ζ). If the distance is a minimum, its square is also a minimum, whence we consider the function

F(x, y, z) = (x − ξ)² + (y − η)² + (z − ζ)² + λφ(x, y, z).

Differentiation yields the conditions

2(x − ξ) + λφ_x = 0, 2(y − η) + λφ_y = 0, 2(z − ζ) + λφ_z = 0,

or, in another form,

(x − ξ)/φ_x = (y − η)/φ_y = (z − ζ)/φ_z.

These equations state that the fixed point (ξ, η, ζ) lies on the normal to the surface at the point of extreme distance (x, y, z), whence, in order to travel along the shortest path from a point to a (differentiable) surface, we must travel in a direction normal to the surface. Of course, a further discussion is required to decide whether we have found a maximum or a minimum or neither. (For example, consider a point inside a spherical surface. The points of extreme distance lie at the ends of the diameter through the point; the distance to one of these points is a minimum, to the other a maximum.)
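The case of the sphere mentioned in parentheses can be worked out numerically. In the sketch below the radius R = 2, the fixed point q and the helper names are assumptions chosen for illustration. With φ = x² + y² + z² − R², the gradient at a point p is 2p, and the normal condition says that p − q is parallel to this gradient, i.e., that their cross product vanishes.

```python
import math
import random


def distance(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normal_condition(p, q):
    """Cross product of (p - q) with grad phi = 2p; it vanishes when q lies on the normal at p."""
    d = tuple(pi - qi for pi, qi in zip(p, q))
    grad = tuple(2 * pi for pi in p)
    return cross(d, grad)


R = 2.0                       # radius of the sphere (arbitrary choice)
q = (0.5, -0.3, 0.8)          # fixed point inside the sphere (arbitrary choice)
q_norm = math.sqrt(sum(t * t for t in q))

# Ends of the diameter through q.
nearest = tuple(R * t / q_norm for t in q)
farthest = tuple(-R * t / q_norm for t in q)

# Compare with random points of the sphere.
rng = random.Random(0)        # fixed seed so that the run is reproducible
sampled = []
for _ in range(5000):
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(t * t for t in v))
    if norm > 1e-12:
        sampled.append(distance([R * t / norm for t in v], q))
```

Both ends of the diameter satisfy the normal condition, but only one of them gives the minimum distance R − |q|; the other gives the maximum R + |q|, so the multiplier equations alone do not decide between the two.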

Exercises 3.6

1. Find the largest and smallest distances of a point on the ellipse

from the straight line x + y − 4 = 0.

2. The sum of the lengths of the twelve edges of a rectangular block is a; the sum of the areas of the six faces is a²/25. Calculate the lengths of the edges when the excess of the volume of the block over that of a cube with edges equal to the shortest edge of the block is largest.

3. Determine the maxima and minima of the function

4. Show that the maximum of

is equal to the larger of the roots of the equation in l

.

5. Calculate the maximum values of

6. Find the stationary points of the function

and discuss their nature.

7*. Find the values of a and b for the ellipse

of least area that contains the circle

8. Find the quadrilateral with given edges a, b, c, d which includes the largest area.

9. Which point of the sphere

is at the largest distance from the point (1, 2, 3)?

10. Let P1, P2, P3, P4 be a convex quadrilateral. Find the point O for which the sum of the distances from P1, P2, P3, P4 is a minimum.

11. Find the point (x, y, z) of the ellipsoid

for which

is a minimum, where A, B, C denote the intercepts between the tangent plane at (x, y, z) (x > 0, y > 0, z > 0) and the co-ordinate axes.

12. Find the rectangular parallelepiped of largest volume inscribed in the ellipsoid

13. Find the rectangle with largest perimeter inscribed in the ellipse

14. Find the point of the ellipse

for which the tangent is at the largest distance from the origin.

15*. Prove that the length l of the largest axis of the ellipsoid

is given by the largest real root of the equation

Hints and Answers
