5.106 Sets of Intervals on a line. The Heine-Borel theorem: We shall now proceed to prove some theorems concerning the oscillations of a function which are of particular importance, as we shall see later on in the theory of integration. These theorems depend upon a general theorem concerning the intervals on a line.

Suppose that we are given a set of intervals on a straight line, i.e., an aggregate each of the members of which is an interval (a, b). We make no restriction as to the nature of these intervals; their number may be finite or infinite and they may or may not overlap*; any number of them may be included in others.

* The word overlap is used in its obvious sense; two intervals overlap if they have points in common which are not end points of either. Thus (0, 2/3) and (1/2, 1) overlap. A pair of intervals such as (0, 1/2) and (1/2, 1) may be said to abut.

In passing, it is worth while to give a few examples of sets of intervals to which we shall have occasion to return later on:

(i) If the interval (0, 1) is subdivided into n equal parts, then the n intervals thus formed define a finite set of non-overlapping intervals which just cover the line.

(ii) We take every point x of the interval (0, 1) and associate with x the interval (x - ε, x + ε), where ε is a positive number less than 1, except that we associate (0, ε) with 0 and (1 - ε, 1) with 1, and, in general, reject any part of any interval which projects outside the interval (0, 1). Thus, we define an infinite set of intervals and, obviously, many of them overlap with one another.

(iii) We take the rational points p/q of the interval (0, 1) and associate with p/q the interval

(p/q - ε/q³, p/q + ε/q³),

where ε is positive and less than 1. We regard 0 as 0/1 and 1 as 1/1; in these two cases, we reject the part of the interval which lies outside (0, 1). Thus, we obtain an infinite set of intervals, which plainly overlap with each other, since there are an infinity of rational points, other than p/q, in the interval associated with p/q.

Heine-Borel theorem: Let there be given an interval (a, b) and a set of intervals I, each of the members of which is included in (a, b). Moreover, let I possess the properties:

(i) Every point of (a, b), other than a or b, lies inside, i.e., 'in and not at an end of', at least one interval of I;

(ii) a is the left hand and b the right hand end point of at least one interval of I.

Then it is possible to choose a finite number of intervals from the set I which form a set of intervals with the properties (i) and (ii).

We know that a is the left hand end point of at least one interval of I, say (a, a₁). We also know that a₁ lies inside at least one interval of I, say (a'₁, a₂). Similarly, a₂ lies inside an interval (a'₂, a₃) of I. It is plain that this argument may be repeated indefinitely, unless after a finite number of steps aₙ coincides with b.

If aₙ does coincide with b after a finite number of steps, then there is nothing further to prove, for we have obtained a finite set of intervals, selected from the intervals of I and possessing the properties required. If aₙ never coincides with b, then the points a₁, a₂, a₃, ··· must (since each lies to the right of its predecessor) tend to a limiting position, but this limiting position may, as far as we can tell, lie anywhere in (a, b).

Let us suppose now that the process just indicated, starting from a, is performed in all possible ways, so that we obtain all possible sequences of the types a₁, a₂, a₃, ···. Then we can prove that there must be at least one such sequence which arrives at b after a finite number of steps.

There are two possibilities with regard to any point x between a and b. Either

(i) x lies to the left of some point aₙ of some sequence or
(ii) it does not.

We subdivide the points x into two classes L and R according to whether (i) or (ii) is true. The class L certainly exists, since all points of the interval (a, a₁) belong to L. We shall now prove that R does not exist, so that every point x belongs to L.

If R exists, then L lies entirely to the left of R and the classes L, R form a section of the real numbers between a and b, to which corresponds a number ξ₀. The point ξ₀ lies inside an interval of I, say (ξ', ξ''), and ξ' belongs to L, whence it lies to the left of some term aₙ of some sequence. However, we then can take (ξ', ξ'') as the interval (a'ₙ, aₙ₊₁) associated with aₙ in our construction of the sequence a₁, a₂, a₃, ···, and all points to the left of ξ'' lie to the left of aₙ₊₁. Whence, there are points of L to the right of ξ₀ and this contradicts the definition of R. It is therefore impossible that R should exist.

Thus, every point x belongs to L. Now, b is the right hand end point of an interval of I, say (b₁, b), and b₁ belongs to L. Hence, there is a member aₙ of a sequence a₁, a₂, a₃, ··· such that aₙ > b₁. But we may then take the interval (a'ₙ, aₙ₊₁), corresponding to aₙ, to be (b₁, b), and thus obtain a sequence in which the term after the nth term coincides with b and therefore a finite set of intervals with the required properties. Thus, the theorem is proved.
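The chain a, a₁, a₂, ··· of the proof can be imitated numerically. What follows is only a minimal sketch under stated assumptions: the family is a finite sample (so it illustrates the construction and proves nothing), the function name finite_subcover is hypothetical, and at each step the interval reaching furthest to the right is chosen, which is one possible rule among the many the proof allows.

```python
from fractions import Fraction

def finite_subcover(a, b, family):
    """Pick a chain of intervals from `family` leading from a to b.

    Start with an interval whose left end is a; then repeatedly pick an
    interval that has the current right end strictly inside it, always
    taking the one that reaches furthest to the right.
    """
    starters = [iv for iv in family if iv[0] == a]
    if not starters:
        raise ValueError("property (ii) fails: no interval starts at a")
    chain = [max(starters, key=lambda iv: iv[1])]
    while chain[-1][1] < b:
        current = chain[-1][1]
        candidates = [iv for iv in family if iv[0] < current < iv[1]]
        if not candidates:
            raise ValueError(f"property (i) fails at the point {current}")
        chain.append(max(candidates, key=lambda iv: iv[1]))
    return chain

# Assumed sample of example (ii): eps = 1/4, centres k/8, clipped to (0, 1).
eps = Fraction(1, 4)
family = [
    (max(Fraction(0), Fraction(k, 8) - eps), min(Fraction(1), Fraction(k, 8) + eps))
    for k in range(9)
]
chain = finite_subcover(Fraction(0), Fraction(1), family)
print(chain)
```

Exact fractions are used so that the end-point comparisons are not disturbed by rounding. With a finite sample the loop must stop, because each step moves the right end strictly to the right; the content of the theorem is that a finite chain exists even when I is infinite.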

It is instructive to consider the examples in §106 in the light of this theorem.

(i) Here the conditions of the theorem are not satisfied; the points 1/n, 2/n, 3/n, ... do not lie inside any interval of I.
(ii) Here the conditions of the theorem are satisfied. The set of intervals

(0, 2ε), (ε, 3ε), (2ε, 4ε), ..., (1 - 2ε, 1),

associated with the points ε, 2ε, 3ε, ..., 1 - ε, possesses the properties required.
(iii) In this case, we can prove by using the theorem that there are, if ε is small enough, points of (0, 1) which do not lie in any interval of I.

If every point of (0, 1) were inside an interval of I (with the obvious reservation regarding the end points), then we could find a finite number of intervals of I possessing the same property and having therefore a total length larger than 1. Now, there are two intervals of total length 2ε, for which q = 1, and q - 1 intervals of total length 2ε(q - 1)/q³, associated with any other value of q. The sum of the lengths of any finite number of intervals of I can therefore not be larger than 2ε times that of the series

1 + 1/2³ + 2/3³ + 3/4³ + ··· + (q - 1)/q³ + ···,

which in Chapter VIII will be shown to converge, whence it follows that, if ε is small enough, the supposition that every point of (0, 1) lies inside an interval of I leads to a contradiction.
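To get a feeling for how small ε must be, one can add up many terms of the series. The sketch below assumes the series meant is 1 + 1/2³ + 2/3³ + ··· + (q - 1)/q³ + ···, which is what the interval lengths 2ε(q - 1)/q³ suggest; the function name series_partial_sum is hypothetical. A partial sum proves nothing about convergence (that is left to Chapter VIII); it only indicates the size of the sum.

```python
def series_partial_sum(q_max: int) -> float:
    """Partial sum 1 + 1/2**3 + 2/3**3 + ... + (q_max - 1)/q_max**3."""
    return 1.0 + sum((q - 1) / q**3 for q in range(2, q_max + 1))

partial = series_partial_sum(100_000)
print(partial)          # size of the sum after 100000 terms
print(2 * 0.3 * partial)  # total-length bound for the trial value eps = 0.3
```

Each term (q - 1)/q³ is less than 1/q², so the partial sums grow slowly. If the sum of the series is S, the contradiction in the text arises as soon as 2εS < 1.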

The reader may be tempted to think that this proof is needlessly elaborate and that the existence of points of the interval, not in any interval of I, follows at once from the fact that the sum of all these intervals is less than 1. But the theorem to which he would be appealing is (when the set of intervals is infinite) far from obvious and can only be proved rigorously by some such use of the Heine-Borel theorem as is made in the text.

5.107 The oscillation of a continuous function: We shall now apply the Heine-Borel theorem to the proof of two important theorems concerning the oscillation of a continuous function.

THEOREM I: If f(x) is continuous throughout the interval (a, b), then we can divide (a, b) into a finite number of sub-intervals (a, x₁), (x₁, x₂), ..., (xₙ, b), in each of which the oscillation of f(x) is less than an assigned positive number δ.

Let ξ be any number between a and b. Since f(x) is continuous for x = ξ, we can determine an interval (ξ - ε, ξ + ε) such that the oscillation of f(x) in this interval is less than δ. Indeed, it is obvious that there are an infinity of such intervals corresponding to every ξ and every δ, for, if the condition is satisfied for any particular value of ε, then it is satisfied a fortiori for any smaller value. What values of ε are admissible will naturally depend upon ξ; we have at present no reason for supposing that a value of ε admissible for one value of ξ will be admissible for another. We shall call the intervals thus associated with ξ the δ-intervals of ξ.

If ξ = a, then we can determine an interval (a, a + ε) and so an infinity of such intervals, having the same property. We call these the δ-intervals of a and we can define in a similar manner the δ-intervals of b.

Consider now the set I of intervals formed by taking all the δ-intervals of all points of (a, b). It is plain that this set satisfies the conditions of the Heine-Borel theorem; every point interior to the interval is interior to at least one interval of I, and a and b are end points of at least one such interval. We can therefore determine a set I' which is formed by a finite number of intervals of I and which possesses the same property as I itself.

In general, the intervals which compose the set I' will overlap, as in Fig. 32. But, obviously, their end points subdivide (a, b) into a finite set of intervals I'' each of which is included in an interval of I' and in each of which the oscillation of f(x) is less than δ. Thus Theorem I is proved.

THEOREM II: Given any positive number δ, we can find a number η such that, if the interval (a, b) is subdivided in any manner into sub-intervals of length less than η, then the oscillation of f(x) in each of them will be less than δ.

Take δ₁ < ½δ and construct, as in Theorem I, a finite set of sub-intervals j in each of which the oscillation of f(x) is less than δ₁. Let η be the length of the least of these sub-intervals j. If we now subdivide (a, b) into parts each of length less than η, then any such part must lie entirely within at most two successive sub-intervals j. Hence, by virtue of (3) of §104, the oscillation of f(x), in one of the parts of length less than η, cannot exceed twice the largest oscillation of f(x) in a sub-interval j and is therefore less than 2δ₁, and therefore less than δ.
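The statement of Theorem II can be tried out numerically. The sketch below is an illustration under assumptions, not a proof: the oscillation is only estimated from sample points, only subdivisions into equal parts are tested (the theorem allows any subdivision into sufficiently short parts), and the function names and the test function x² on (0, 1) are chosen here for the purpose.

```python
def oscillation(f, lo, hi, samples=200):
    """Estimate the oscillation of f on (lo, hi) from equally spaced sample points."""
    values = [f(lo + (hi - lo) * k / samples) for k in range(samples + 1)]
    return max(values) - min(values)

def max_oscillation(f, a, b, pieces):
    """Largest estimated oscillation over `pieces` equal sub-intervals of (a, b)."""
    width = (b - a) / pieces
    return max(
        oscillation(f, a + i * width, a + (i + 1) * width) for i in range(pieces)
    )

def pieces_needed(f, a, b, delta):
    """Double the number of equal pieces until every oscillation is below delta."""
    pieces = 1
    while max_oscillation(f, a, b, pieces) >= delta:
        pieces *= 2
    return pieces

n = pieces_needed(lambda x: x * x, 0.0, 1.0, 0.1)
print(n, 1.0 / n)  # number of equal pieces found, and their common length
```

For an increasing function such as x² on (0, 1) the sampled estimate agrees with the true oscillation, since the extreme values are taken at the end points of each piece. The point of the theorem is that one length of piece serves for the whole interval, although the admissible ε of Theorem I varies from point to point.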

This theorem is of fundamental importance in the theory of definite integrals (Chapter VII). Without the use of this or some similar theorem, it is impossible to prove that a function, which is continuous throughout an interval, necessarily possesses an integral over that interval.

5.108 Continuous functions of several variables: The notions of continuity and discontinuity may be extended to functions of several independent variables (Chapter II, §31 et seq.). However, their application to such functions raises questions which are much more complex and more difficult than those which we have considered in this chapter. It would be impossible for us to discuss these questions in any detail here; but we shall, in the sequel, require to know what is meant by a continuous function of two variables and, accordingly, we give the following definition. It is a straightforward generalization of the last form of the definition of §99.

The function f(x, y) of the two variables x and y is said to be continuous for x = ξ, y = η if, given any positive number δ, however small, we can choose ε(δ) so that

|f(x, y) - f(ξ, η)| < δ

when 0 ≤ |x - ξ| ≤ ε(δ) and 0 ≤ |y - η| ≤ ε(δ), i.e., if we can draw a square with sides parallel to the co-ordinate axes and of length 2ε(δ), with centre at the point (ξ, η) and which is such that the value of f(x, y) at any point inside it or on its boundary differs from f(ξ, η) by less than δ*.

* The reader should draw a figure to illustrate the definition.

Naturally, this definition presupposes that f(x, y) is defined at all points of the square in question and, in particular, at the point (ξ, η). Another method of stating the definition is: f(x, y) is continuous for x = ξ, y = η, if f(x, y) → f(ξ, η) when x → ξ and y → η in any manner. This statement is apparently simpler. However, it contains phrases the precise meaning of which has not yet been explained and can only be explained with the aid of inequalities like those which occur in our original statement.

It is easy to prove that the sums, products and, in general, quotients of continuous functions of two variables are themselves continuous. A polynomial in two variables is continuous for all values of the variables, and the ordinary functions of x and y, which occur in everyday analysis, are generally continuous, i.e., they are continuous except for pairs of values of x and y linked by special relations.

The reader should observe carefully that to assert the continuity of f(x, y) with respect to the two variables x and y is to assert much more than its continuity with respect to each variable considered separately. It is plain that, if f(x, y) is continuous with respect to x and y, then it is continuous with respect to x (or y) when any fixed value is assigned to y (or x). However, the converse is by no means true. For example, let

f(x, y) = 2xy/(x² + y²)

when neither x nor y is zero, and f(x, y) = 0 when either x or y is zero. Then, if y has any fixed value, zero or not, f(x, y) is a continuous function of x and, in particular, continuous for x = 0; in fact, its value when x = 0 is zero and it tends to the limit zero as x → 0. In the same way, it may be shown that f(x, y) is a continuous function of y. However, f(x, y) is not a continuous function of x and y for x = 0, y = 0. Its value when x = 0, y = 0 is zero, while, if x and y tend to zero along the straight line y = ax, then

f(x, y) = 2a/(1 + a²),

which may have any value between -1 and 1.
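The behaviour along straight lines through the origin is easily checked by computation. The sketch assumes that the function of the example is f(x, y) = 2xy/(x² + y²) off the axes and 0 on them, which agrees with the statement that the values along y = ax may be anything between -1 and 1; the sample slopes and step sizes are arbitrary choices.

```python
def f(x: float, y: float) -> float:
    """2xy/(x^2 + y^2) when neither x nor y is zero, and 0 otherwise."""
    if x == 0 or y == 0:
        return 0.0
    return 2 * x * y / (x * x + y * y)

# Along y = a*x the value does not depend on how near the origin we are.
for slope in (1.0, 0.5, -1.0):
    print(slope, [f(t, slope * t) for t in (1e-1, 1e-3, 1e-6)])

# With y fixed, the value tends to zero with x.
print([f(t, 0.3) for t in (1e-1, 1e-3, 1e-6)])
```

The first loop shows a constant value on each line, different for different slopes, so no single limit exists at the origin; the last line shows the separate continuity in x for a fixed y.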

5.109 Implicit functions: Already in Chapter II, we have encountered the idea of an implicit function. Thus, if x and y are linked by the relation

y⁵ - xy - y - x = 0,   (1)

then y is an 'implicit function' of x.

However, it is far from obvious that such an equation does really define a function y of x, or several such functions. In Chapter II, we were content to take this for granted. We are now in a position to consider whether the assumption we made then was justified.

We shall find the following terminology useful. Suppose that it is possible to surround a point (a, b), as in the preceding section, with a square throughout which a certain condition is satisfied. We shall call such a square a neighbourhood of (a, b) and say that the condition in question is satisfied in the neighbourhood of (a,b) or near (a,b), meaning by this simply that it is possible to find some square throughout which the condition is satisfied. It is obvious that similar words may be used when we are dealing with a single variable, the square being replaced by an interval on a line.

THEOREM:

If

(i) f(x,y) is a continuous function of x and y in the neighbourhood of (a,b),
(ii) f(a,b) = 0,
(iii) f(x, y) is, for all values of x in the neighbourhood of a, a steadily increasing function of y, in the stricter sense of §96,

then

(1) there exists a unique function y = φ(x) which, when substituted in the equation f(x, y) = 0, satisfies it identically for all values of x in the neighbourhood of a,
(2) φ(x) is continuous for all values of x in the neighbourhood of a.

In Fig. 33, the square represents a 'neighbourhood' of (a, b) throughout which the conditions (i) and (iii) are satisfied, and P is the point (a, b). If we take Q and R as in the figure, it follows from (iii) that f(x, y) is positive at Q and negative at R. This being so and f(x, y) being continuous at Q and at R, we can draw lines QQ' and RR' parallel to OX, so that R'Q' is parallel to OY and f(x, y) is positive at all points of QQ' and negative at all points of RR'. In particular, f(x, y) is positive at Q' and negative at R', whence, by virtue of (iii) and §101, it vanishes once and only once at a point P' on R'Q'. The same construction gives us a unique point at which f(x, y) = 0 on each ordinate between RQ and R'Q'. Moreover, it is obvious that the same construction can be carried out to the left of RQ. The aggregate of points such as P' yields the graph of the required function y = φ(x).

It remains to prove that φ(x) is continuous. This is most simply effected by using the idea of the 'limits of indetermination' of φ(x) as x → a (§96). Suppose that x → a and let λ and Λ be the limits of indetermination of φ(x) as x → a. Evidently, the points (a, λ) and (a, Λ) lie on QR. Moreover, we can find a sequence of values of x such that φ(x) → λ as x → a through the values of the sequence; and since f{x, φ(x)} = 0 and f(x, y) is a continuous function of x and y, we have

f(a, λ) = 0.

Hence λ = b, since by (iii) f(a, y) vanishes only once on QR; and similarly Λ = b. Thus, φ(x) tends to the limit b as x → a, whence φ(x) is continuous for x = a. It is evident that we can show in exactly the same way that φ(x) is continuous for any value of x in the neighbourhood of a.
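The construction of the proof, in which f(x, y) changes sign exactly once on each ordinate, suggests a simple way of computing the implicit function: halve the ordinate repeatedly, keeping the half on which the sign changes. The sketch below is a minimal illustration; the function f(x, y) = y³ + y - x, the bounds -2 and 2 and the name phi are assumptions chosen here, not taken from the text.

```python
def f(x: float, y: float) -> float:
    """Assumed example: continuous, and strictly increasing in y for every x."""
    return y**3 + y - x

def phi(x: float, y_low: float = -2.0, y_high: float = 2.0, tol: float = 1e-12) -> float:
    """Locate the single zero of y -> f(x, y) between y_low and y_high by bisection."""
    if not (f(x, y_low) < 0 < f(x, y_high)):
        raise ValueError("f must be negative at y_low and positive at y_high")
    while y_high - y_low > tol:
        mid = (y_low + y_high) / 2
        if f(x, mid) < 0:
            y_low = mid   # the zero lies above mid
        else:
            y_high = mid  # the zero lies at or below mid
    return (y_low + y_high) / 2

for x in (0.0, 0.5, 2.0):
    print(x, phi(x))
```

Here y_low and y_high play the parts of the points R and Q of the figure. The strict monotony in y is what guarantees that there is only one zero to find; without it the halving process would still stop, but it might pick out one of several zeros.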

It is clear that the truth of the theorem would not be affected if in Condition (iii) we were to change 'increasing' to 'decreasing'. As an example, let us consider Equation (1), taking a = 0, b = 0. Evidently, Conditions (i) and (ii) are satisfied. Moreover, since

f(x, y) - f(x, y') = (y - y')(y⁴ + y³y' + y²y'² + yy'³ + y'⁴ - x - 1)

has, when x, y and y' are sufficiently small, the sign opposite to that of y - y', Condition (iii) (with 'decreasing' for 'increasing') is satisfied. It follows that there is one and only one continuous function y which satisfies Equation (1) identically and vanishes with x.

The same conclusion would follow for the equation

y² - xy - y - x = 0.

In this case the function in question is

y = ½{1 + x - √(1 + 6x + x²)},

where the square root is positive. The second root, in which the sign of the square root is changed, does not satisfy the condition of vanishing with x.

There is one point in the proof which the reader should be careful to observe. We have assumed that the hypotheses of the theorem were satisfied 'in the neighbourhood of (a, b)', i.e., throughout a certain square

a - ε ≤ x ≤ a + ε, b - ε ≤ y ≤ b + ε.

The conclusion holds 'in the neighbourhood of x = a', i.e., throughout a certain interval a - ε₁ ≤ x ≤ a + ε₁. There is nothing to show that the ε₁ of the conclusion is the ε of the hypotheses, and indeed this is generally untrue.

5.110 Inverse functions: In particular, suppose that f(x,y) is of the form F(y) - x. We then obtain the theorem:

If F(y) is a function of y, continuous and steadily increasing (or decreasing) in the stricter sense of §95 in the neighbourhood of y = b and F(b) = a, then there is a unique continuous function y = φ(x) which is equal to b when x = a and satisfies the equation F(y) = x identically in the neighbourhood of x = a.

The function thus defined is called the inverse function of F(y).

For example, let y³ = x, a = 0, b = 0. Then all the conditions of the theorem are satisfied. The inverse function is

y = ∛x.

If we had assumed that y² = x, then the conditions of the theorem would not have been satisfied, because y² is not a steadily increasing function of y in any interval which includes y = 0: it decreases when y is negative and increases when y is positive. Thus, in this case, the conclusion of the theorem does not hold, since y² = x defines two functions of x, viz., y = √x and y = -√x, both of which vanish when x = 0 and each of which is defined only for positive values of x, so that the equation has sometimes two solutions and sometimes none. The reader should consider the more general equations

y²ⁿ = x, y²ⁿ⁺¹ = x

in the same way. Another interesting example is given by the equation

already considered in Exercise xiv 7.

Similarly, the equation

sin y = x

has just one solution which vanishes with x, viz., the value of arsin x which vanishes with x. Of course, there are an infinity of solutions, given by the other values of arsin x (Example xv 10), which do not satisfy this condition.

So far, we have considered only what happens in the neighbourhood of a particular value of x. Let us suppose now that F(y) is positive and steadily increasing (or decreasing) throughout an interval (a, b). Given any point x of (a, b), we can determine an interval i including x and a unique and continuous inverse function φᵢ(x) defined throughout i.

By virtue of the Heine-Borel theorem, we can select from the set I of intervals i a finite sub-set covering the entire interval (a, b) and it is plain that the finite set of functions φᵢ(x), corresponding to the sub-set of intervals i thus selected, define together a unique inverse function φ(x) continuous throughout (a, b).

We thus obtain the theorem:

If x = F(y), where F(y) is continuous and increases steadily and strictly from A to B as y increases from a to b, then there is a unique inverse function y = φ(x) which is continuous and increases steadily and strictly from a to b as x increases from A to B.

It is worthwhile to show how this theorem can be obtained directly without the help of the more difficult theorem of §109. Suppose that A < x < B and consider the class of values of y such that (i) a < y < b and (ii) F(y) ≤ x. This class has an upper bound η and plainly F(η) ≤ x. If F(η) were less than x, we could find a value of y such that y > η and F(y) < x, and η would not be the upper bound of the class considered, whence F(η) = x. The equation F(y) = x has therefore a unique solution, say y = η = φ(x); plainly, η increases steadily and continuously with x, which proves the theorem.
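The direct argument also indicates how the inverse function may be computed: the value sought is the upper bound of the class of y for which F(y) ≤ x, and this bound can be enclosed between two values of y which are brought together by repeated halving. The sketch below is one-variable and assumes a strictly increasing F; the name inverse and the example F(y) = y³ on (-2, 2) are choices made here for illustration.

```python
def inverse(F, a: float, b: float, x: float, tol: float = 1e-12) -> float:
    """Approximate the upper bound of the class {y in (a, b): F(y) <= x}.

    Assumes F is continuous and strictly increasing on (a, b) and that
    F(a) < x < F(b).
    """
    low, high = a, b  # throughout: F(low) <= x < F(high)
    while high - low > tol:
        mid = (low + high) / 2
        if F(mid) <= x:
            low = mid   # mid belongs to the class, so the bound is at least mid
        else:
            high = mid  # mid is outside the class, so the bound is at most mid
    return low

def cube(y: float) -> float:
    return y**3

for x in (-0.125, 1.0, 5.0):
    print(x, inverse(cube, -2.0, 2.0, x))
```

The same routine applied to F(y) = y² on an interval containing y = 0 would not be justified, for the class of y with F(y) ≤ x is then no longer the set of all points to the left of a single point.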

MISCELLANEOUS EXAMPLES ON CHAPTER V

1. Show that, in general,

where α = a/A, β = (bA - aB)/A² and η is of the first order of smallness when x is large. Indicate any exceptional cases.

2. Determine α, β and γ so that

where η is of the first order of smallness when x is large. Indicate any exceptional cases.

3. Show that, if P(x) is a polynomial axⁿ + bxⁿ⁻¹ + ... + k, the first coefficient a of which is positive, then P(x + h) - P(x) and

increases steadily from a certain value of x onwards.

4. Prove that

when x → ∞.

5. Show that

Use the formula

6. Show that

where η is of the first order of smallness when x is large.

7. Find values of a and b such that

has the limit zero as x → ∞ and prove that

8. Evaluate

9. Prove that

10. Prove that

is of the fourth order of smallness when x is small and find the limit of f(x)/x⁴ as x → 0.

11. Prove that

is of the sixth order of smallness when x is small and find the limit of f(x)/x⁶ as x → 0.

12. From a point P on a radius OA of a circle, extended beyond the circle, a tangent PT is drawn to the circle, touching it at T and TN is drawn perpendicular to OA. Show that NA/AP ® 1 as P moves up to A.

13. Tangents are drawn to a circular arc at its central point and its extremities; Δ is the area of the triangle formed by the chord of the arc and the two tangents at the extreme points and Δ' the area of that formed by the three tangents. Show that Δ/Δ' → 4 as the length of the arc tends to zero.

14. For what values of a does {a + sin(1/x)}/x tend to (1) ∞, (2) -∞, as x → 0?

[To ∞ if a > 1, to -∞ if a < -1. Otherwise the function oscillates.]

15. If f(x) = 1/q when x = p/q, and f(x) = 0 when x is irrational, then f(x) is continuous for all irrational and discontinuous for all rational values of x.

16. Show that the function of the graph in Fig. 30 may be represented by either of the formulae

17. Show that the function f(x), which is equal to 0 when x = 0, to 1/2 - x when 0 < x < 1/2, to 1/2 when x = 1/2, to 3/2 - x when 1/2 < x < 1 and to 1 when x = 1, assumes every value between 0 and 1 once and once only as x increases from 0 to 1, but is discontinuous for x = 0, x = 1/2 and x = 1. Show also that the function may be represented by the formula

18. Let f(x) = x when x is rational and f(x) = 1 - x when x is irrational. Show that f(x) assumes every value between 0 and 1 once and once only as x increases from 0 to 1, but is discontinuous for every value of x except x = 1/2.

19. Prove that a function which is increasing at every point of (a, b) is an increasing function in (a, b).

Show that a function which is ' increasing on the right ' at every point of (a, b) is not necessarily an increasing function in (a,b), but is so if it is continuous. (Math. Trip. 1926)

[We say that 'f(x) is increasing at x' when (i) f(x') ≥ f(x) for all x' of some interval to the right of x; and (ii) f(x') ≤ f(x) for all x' of some interval to the left of x. When (i) alone is given, we say that f(x) is 'increasing on the right'.

We have to prove that f(x₂) ≥ f(x₁) if a ≤ x₁ < x₂ ≤ b. We divide the points x of (x₁, b) into two classes L and R, L if f(x') ≥ f(x₁) for all x' of (x₁, x), and R in the contrary case, and denote by β the number corresponding to the section. The conclusion will follow if β = b (i.e., if R does not exist).

If β < b and f(β) ≥ f(x₁), we can, by (i), find an interval to the right of β in which f(x) ≥ f(β) ≥ f(x₁), and this contradicts the definition of β, whence f(β) < f(x₁) if β < b. So far we have used only (i).

If (ii) is true as well, then there are points to the left of β at which f(x) ≤ f(β) < f(x₁) and this again contradicts the definition of β, whence β = b, as required. The same conclusion follows if (i) only is given but f(x) is continuous; in fact, then f(x) < f(x₁) for values of x to the left of but sufficiently near to β.

The example a = 0, b = 2, f(x) = x for 0 ≤ x < 1, f(x) = x - 1 for 1 ≤ x ≤ 2 shows that the conclusion does not follow from (i) alone.]

20. As x increases from -π/2 to π/2, y = sin x is continuous and steadily increases, in the stricter sense, from -1 to 1. Deduce the existence of a function x = arsin y which is a continuous and steadily increasing function of y from y = -1 to y = +1.

21. Show that the numerically least value of artan y is continuous for all values of y and increases steadily from -π/2 to π/2 as y varies through all real values.

22. Examine whether the equation

where P(x, y) is a polynomial containing no term of degree less than 2, defines a unique function vanishing at x = 0 and continuous in the neighbourhood of x = 0. (Math. Trip. 1936)

23. Discuss, along the lines of §§109-110, the solution of the equations

in the neighbourhood of x = 0, y = 0.

24. If

then one value of y is given by

[If y - αx = η, then, say,

It is evident that η is of the second order, xη of the third and η² of the fourth order of smallness, and -2eη = Ax² - (AB/e)x³, the error being of the fourth order.]

25. If x = ay + by² + cy³, then one value of y is given by

where

26. If x = ay + byⁿ, where n is an integer larger than unity, then one value of y is given by

where α = 1/a, β = -b/aⁿ⁺¹, γ = nb²/a²ⁿ⁺¹.

27. Show that the least positive root of the equation xy = sin x is a continuous function of y throughout the interval (0, 1) and decreases steadily from π to 0 as y increases from 0 to 1.

[The function is the inverse of (sin x)/x; apply §110.]

28. The least positive root of xy = tan x is a continuous function of y throughout the interval (1, ∞) and increases steadily from 0 to π/2 as y increases from 1 towards ∞.

29. A function f(x) is said to be upper semi-continuous at x, if

f(x') < f(x) + δ

for every positive δ and all x' of an interval (depending on x and δ) around x. Prove that a function which is upper semi-continuous at all points of (a, b) has an upper bound, which it attains in (a, b). (Math. Trip. 1924)

[In order to prove the existence of an upper bound M, replace 'bounded' by 'bounded above' in the proof of Theorem I of §103. In order to prove that f(x) attains the value M, make corresponding changes in the argument of §106. We find that f(x) assumes near β values as near as we please to M and this contradicts the inequality

f(x) < f(β) + δ

if f(β) < M and δ is sufficiently small.

We can define lower semi-continuity similarly by an inequality

f(x') > f(x) - δ.

A lower semi-continuous function has an attained lower bound. A function which is both upper and lower semi-continuous is continuous.]
