CHAPTER I

Preliminary Remarks on Analytical Geometry and Vector Analysis

In an interpretation and application of the mathematical facts which form the main subject of this second volume, it is often convenient to employ the simple fundamental concepts of analytical geometry and vector analysis. Hence, even though many will already have a certain knowledge of these subjects, it seems advisable to summarize their elements in a brief introductory chapter. However, this chapter need not be studied before the rest of the book is read; the reader is advised to refer to the facts collected here only when he finds a need for them while studying the later parts of the book.

1.1 RECTANGULAR CO-ORDINATES AND VECTORS

1.1.1 Co-ordinate Axes: In order to fix a point in a plane or in space, one generally employs, as is well known, a rectangular co-ordinate system. In the plane, we take two perpendicular lines, the x-axis and the y-axis; in space, three mutually perpendicular lines, the x-axis, the y-axis and the z-axis. Taking the same unit length on each axis, we assign to each point of the plane an x- and a y-co-ordinate, and to each point of space an x-, a y- and a z-co-ordinate (Fig. 1). Conversely, there corresponds to every set of values (x, y) or (x, y, z) just one point of the plane or of space; a point is completely determined by its co-ordinates.

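Using Pythagoras' theorem, we find that the distance between the two points (x1, y1), (x2, y2) is given by

$$r = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2},$$

while the distance between the points with co-ordinates (x1, y1, z1) and (x2, y2, z2) is

$$r = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}.$$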
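
As a small numerical illustration of the distance obtained from Pythagoras' theorem, the following Python sketch treats a point as a tuple of co-ordinates; the function name `distance` is chosen here for illustration only.

```python
import math

def distance(p, q):
    """Distance between two points given as co-ordinate tuples of equal length."""
    # Sum of squared co-ordinate differences, then the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(distance((0, 0), (3, 4)))          # plane: prints 5.0
print(distance((1, 0, 0), (0, 1, 0)))    # space: the square root of 2
```

The same function serves for the plane and for space, since only the number of co-ordinates differs.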

In setting up a system of rectangular axes, we must pay attention to the orientation of the co-ordinate system.

In 5.2.1, we distinguished between positive and negative senses of rotation in the plane. The rotation by 90° which brings the positive x-axis of a plane co-ordinate system into the position of the positive y-axis in the shortest way defines a sense of rotation. According to whether this sense of rotation is positive or negative, we say that the system of axes is right-handed or left-handed (Figs. 2, 3). It is impossible to change a right-handed system into a left-handed one by a rigid motion confined to the plane. A similar distinction occurs with co-ordinate systems in space. For, if one imagines oneself standing on the xy-plane with one's head in the direction of the positive z-axis, it is possible to distinguish two types of co-ordinate system by means of the apparent orientation of the co-ordinate system in the xy-plane. If this system is right-handed, the system in space is also said to be right-handed, otherwise left-handed (Figs. 4, 5). A right-handed system corresponds to an ordinary right-handed screw; for if we make the xy-plane rotate about the z-axis (in the sense prescribed by its orientation) and simultaneously translate it along the positive z-axis, the combined motion is obviously that of a right-handed screw. Similarly, a left-handed system corresponds to a left-handed screw. No rigid motion in three dimensions can transform a left-handed system into a right-handed system.

In what follows, we shall always use right-handed systems of axes.

We may also assign an orientation to a system of three arbitrary axes passing through one point, provided these axes do not all lie in one plane, just as we have done here for a system of rectangular axes.

1.1.2 Directions and Vectors. Formulae for Transforming Axes: An oriented line l in space or in a plane, i.e., a line traversed in a definite sense, represents a direction; every oriented line which can be made to coincide with the line l in position and sense by displacement parallel to itself represents the same direction. It is customary to specify a direction relative to a co-ordinate system by drawing an oriented half-line in the given direction, starting from the origin of the co-ordinate system, and on this half-line taking the point with co-ordinates α, β, γ which is at unit distance from the origin. The numbers α, β, γ are called the direction cosines of the direction. They are the cosines of the three angles δ1, δ2, δ3 which the oriented line l makes with the positive x-, y- and z-axes * (Fig. 6); by the distance formula, they satisfy the relation

$$\alpha^2 + \beta^2 + \gamma^2 = 1.$$

If we restrict ourselves to the xy-plane, a direction can be specified by the angles δ1, δ2 which the oriented line l with this direction passing through the origin forms with the positive x- and y-axes, or by the direction cosines α = cos δ1, β = cos δ2, which satisfy the equation

$$\alpha^2 + \beta^2 = 1.$$

A line-segment of given length and given direction will be called a vector, more specifically, a bound vector, if the initial point is fixed in space, and a free vector, if the position of the initial point is immaterial. In the sequel, and indeed throughout most of this volume, we shall omit the adjectives free and bound, and if nothing is said to the contrary, we shall always assume vectors to be free. We denote vectors by bold type, e.g., a, b, c, x, A. Two free vectors are said to be equal if one of them can be made to coincide with the other by displacement parallel to itself. We will at times call the length of a vector its absolute value and denote it by |a|.

* The angle which one oriented line forms with another may always be taken as being between 0 and π, because in the sequel only the cosines of such angles will be considered.

If we drop from the initial and final points of a vector v perpendiculars to an oriented line l, we obtain an oriented segment on l corresponding to the vector. If the orientation of this segment is the same as that of l, we call its length the component of v in the direction of l; if the orientations are opposite, we call the negative value of the length of the segment the component of v in the direction of l. The component of v in the direction of l will be denoted by $v_l$. If δ is the angle between the direction of v and that of l (Fig. 7), we always have

$$v_l = |\mathbf{v}| \cos\delta.$$

A vector v of length 1 is called a unit vector. Its component in a direction l is equal to the cosine of the angle between l and v. The components of a vector v in the directions of the three axes of a co-ordinate system are denoted by v1, v2, v3. If we transfer the initial point of v to the origin, we see that

$$|\mathbf{v}|^2 = v_1^2 + v_2^2 + v_3^2.$$

If α, β, γ are the direction cosines of the direction of v, then

$$v_1 = |\mathbf{v}|\,\alpha, \quad v_2 = |\mathbf{v}|\,\beta, \quad v_3 = |\mathbf{v}|\,\gamma.$$

A free vector is completely determined by its components v1, v2, v3.

An equation

$$\mathbf{v} = \mathbf{w}$$

between two vectors is therefore equivalent to the three ordinary equations

$$v_1 = w_1, \quad v_2 = w_2, \quad v_3 = w_3.$$

There are different reasons why the use of vectors is natural and advantageous. Firstly, many geometrical concepts and a still larger number of physical concepts such as force, velocity, acceleration, etc., immediately reveal themselves as vectors independent of the particular co-ordinate system. Secondly, we can set up simple rules for calculating with vectors analogous to the rules for calculating with ordinary numbers; by means of these, many arguments can be developed in a simple way, independently of the particular co-ordinate system chosen.

We begin by defining the sum of two vectors a and b. For this purpose, we displace the vector b parallel to itself until its initial point coincides with the final point of a. Then the starting point of a and the end point of b determine a new vector c (Fig. 8), which starts at the starting point of a and ends at the end point of b. We call c the sum of a and b and write

$$\mathbf{c} = \mathbf{a} + \mathbf{b}.$$

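For this additive process, there obviously hold the commutative law

$$\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$$

and the associative law

$$\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c},$$

as a glance at Figs. 8 and 9 shows.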

We obtain immediately from the definition of vector addition the projection theorem: The component of the sum of two or more vectors in a direction l is the sum of the components of the individual vectors in that direction, i.e.,

$$(\mathbf{a} + \mathbf{b})_l = a_l + b_l.$$

In particular, the components of a + b in the directions of the co-ordinate axes are a1 + b1, a2 + b2, a3 + b3.

Hence, in order to form the sum of two vectors, we have the simple rule: The components of the sum are equal to the sums of the corresponding components of the summands.
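
The rule for adding vectors by components may be tried out with a few lines of Python. This is only a sketch: vectors are represented as tuples of components, and the helper name `add` is an assumption of this example.

```python
def add(a, b):
    """Sum of two vectors given by their components along the axes."""
    return tuple(x + y for x, y in zip(a, b))

a, b, c = (1, 2, 3), (4, 5, 6), (-1, 0, 2)

print(add(a, b))                               # prints (5, 7, 9)
print(add(a, b) == add(b, a))                  # commutative law
print(add(a, add(b, c)) == add(add(a, b), c))  # associative law
```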

Every point P with co-ordinates (x, y, z) may be determined by the position vector from the origin to P, the components of which in the directions of the axes are just the co-ordinates of the point P. We take three unit vectors in the directions of the three axes, e1 in the x-direction, e2 in the y-direction, e3 in the z-direction. If the vector v has the components v1, v2, v3, then

$$\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3.$$

We call $\mathbf{v}_1 = v_1\mathbf{e}_1$, $\mathbf{v}_2 = v_2\mathbf{e}_2$, $\mathbf{v}_3 = v_3\mathbf{e}_3$ the vector components of v.

Using the projection theorem stated above, we easily obtain the transformation formulae which determine (x', y', z'), the co-ordinates of a given point P with respect to the axes Ox', Oy', Oz', in terms of (x, y, z), its co-ordinates with respect to another set * of axes Ox, Oy, Oz, which has the same origin as the first set and may be obtained from it by rotation. The three new axes form angles with the three old axes, the cosines of which may be expressed by the following scheme, where, for example, γ1 is the cosine of the angle between the x'-axis and the z-axis.

          x     y     z
    x'    α1    β1    γ1
    y'    α2    β2    γ2
    z'    α3    β3    γ3

* Note that, in accordance with the convention adopted, both systems of axes are to be right-handed.

We drop from P perpendiculars to the axes Ox, Oy, Oz, with feet P1, P2, P3 (Fig. 1). The vector from O to P is then equal to the sum of the vectors from O to P1, from O to P2 and from O to P3. The direction cosines of the x'-axis relative to the axes Ox, Oy, Oz are α1, β1, γ1, those of the y'-axis α2, β2, γ2 and those of the z'-axis α3, β3, γ3. By the projection theorem, we know that x', which is the component of the vector OP in the direction of the x'-axis, must be equal to the sum of the components of OP1, OP2, OP3 in the direction of the x'-axis, whence

$$x' = \alpha_1 x + \beta_1 y + \gamma_1 z,$$

because α1x is the component of OP1 in the direction of the x'-axis, etc. Carrying out similar arguments for y' and z', we obtain the transformation formulae

$$\begin{aligned} x' &= \alpha_1 x + \beta_1 y + \gamma_1 z, \\ y' &= \alpha_2 x + \beta_2 y + \gamma_2 z, \\ z' &= \alpha_3 x + \beta_3 y + \gamma_3 z. \end{aligned}$$

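Since the components of a bound vector v in the directions of the axes are expressed by the formulae

$$v_1 = x_2 - x_1, \quad v_2 = y_2 - y_1, \quad v_3 = z_2 - z_1,$$

in which (x1, y1, z1) are the co-ordinates of the starting point and (x2, y2, z2) the co-ordinates of the end point of v, it follows that the same transformation formulae hold for the components of the vector as for the co-ordinates:

$$\begin{aligned} v_1' &= \alpha_1 v_1 + \beta_1 v_2 + \gamma_1 v_3, \\ v_2' &= \alpha_2 v_1 + \beta_2 v_2 + \gamma_2 v_3, \\ v_3' &= \alpha_3 v_1 + \beta_3 v_2 + \gamma_3 v_3. \end{aligned}$$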
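
The transformation of co-ordinates under a rotation of the axes can be sketched numerically as follows. The sketch assumes that the scheme of direction cosines is stored row by row (one row per new axis); the names `new_coordinates` and `z_rotation_scheme`, and the choice of a rotation about the z-axis as the test case, are assumptions of this example.

```python
import math

def new_coordinates(scheme, point):
    """Co-ordinates of `point` relative to the new axes.

    Each row of `scheme` holds the direction cosines of one new axis
    relative to the old axes Ox, Oy, Oz.
    """
    return tuple(sum(c * x for c, x in zip(row, point)) for row in scheme)

def z_rotation_scheme(theta):
    """Scheme for new axes obtained by turning the old ones through theta about Oz."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c, s, 0.0),     # x'-axis
            (-s, c, 0.0),    # y'-axis
            (0.0, 0.0, 1.0)] # z'-axis

scheme = z_rotation_scheme(math.pi / 2)
# After a quarter turn the new x'-axis lies along the old y-axis, so the
# point (1, 0, 0) has co-ordinates close to (0, -1, 0) in the new system.
print(new_coordinates(scheme, (1.0, 0.0, 0.0)))
```

Since the components of a vector transform in the same way as co-ordinates, the same function may be applied to vector components.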

1.1.3 Scalar Multiplication of Vectors: Following conventions similar to those for the addition of vectors, we now define the product of a vector v by a number c: If v has the components v1, v2, v3, then cv is the vector with the components cv1, cv2, cv3. This definition agrees with that of vector addition, because v + v = 2v, v + v + v = 3v, etc. If c > 0, cv has the same direction as v and the length c|v|; if c < 0, the direction of cv is opposite to the direction of v and its length is (-c) |v|. If c = 0, we see that cv is the zero vector with the components 0,0,0.

We can also define the product of two vectors w and v, where this multiplication of vectors satisfies rules of calculation which are partly similar to those of ordinary multiplication. There are two different kinds of vector multiplication. We begin with scalar multiplication which is simpler and the more important for our purposes.

The scalar product * uv of the vectors u and v is the product of their absolute values and the cosine of the angle δ between their directions:

$$\mathbf{uv} = |\mathbf{u}|\,|\mathbf{v}| \cos\delta.$$

Hence, the scalar product is simply the component of one of the vectors in the direction of the other multiplied by the length of the second vector.

* It is sometimes also called the inner product.

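By the projection theorem, the scalar product obeys the distributive law

$$(\mathbf{u} + \mathbf{v})\mathbf{w} = \mathbf{uw} + \mathbf{vw},$$

while the definition yields at once the commutative law

$$\mathbf{uv} = \mathbf{vu}.$$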

On the other hand, there is an essential difference between the scalar product of two vectors and the ordinary product of two numbers, because the product can vanish although neither factor vanishes.

If the lengths of u and v are not zero, the product uv vanishes if, and only if, the two vectors u and v are perpendicular to each other.

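In order to express the scalar product in terms of the components of the two vectors, we take both the vectors u and v with their starting points at the origin. We denote their vector components by $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ and $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$, respectively, so that $\mathbf{u} = \mathbf{u}_1 + \mathbf{u}_2 + \mathbf{u}_3$ and $\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3$. In the equation $\mathbf{uv} = (\mathbf{u}_1 + \mathbf{u}_2 + \mathbf{u}_3)(\mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3)$, we can expand the product on the right-hand side in accordance with the rules of calculation which we have just established; if we note that the products $\mathbf{u}_1\mathbf{v}_2$, $\mathbf{u}_1\mathbf{v}_3$, $\mathbf{u}_2\mathbf{v}_1$, $\mathbf{u}_2\mathbf{v}_3$, $\mathbf{u}_3\mathbf{v}_1$, $\mathbf{u}_3\mathbf{v}_2$ vanish, because the factors are perpendicular to each other, we obtain $\mathbf{uv} = \mathbf{u}_1\mathbf{v}_1 + \mathbf{u}_2\mathbf{v}_2 + \mathbf{u}_3\mathbf{v}_3$. Now the two factors in each product on the right lie along the same axis, so that, by definition, $\mathbf{u}_1\mathbf{v}_1 = u_1 v_1$, etc., where $u_1, u_2, u_3$ and $v_1, v_2, v_3$ are the components of u and v, respectively. Hence

$$\mathbf{uv} = u_1 v_1 + u_2 v_2 + u_3 v_3.$$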

This equation could have been taken as the definition of the scalar product and is an important rule for calculating the scalar product of two vectors given in terms of their components. In particular, if we take u and v as unit vectors with direction cosines α1, α2, α3 and β1, β2, β3, respectively, the scalar product is equal to the cosine of the angle δ between u and v, which is accordingly given by the formula

$$\cos\delta = \alpha_1\beta_1 + \alpha_2\beta_2 + \alpha_3\beta_3.$$
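
The component formula for the scalar product lends itself to direct computation. The Python sketch below is illustrative only; the names `dot`, `length` and `angle` are chosen for this example, and the cosine is clamped to the interval from -1 to 1 merely to guard against rounding errors.

```python
import math

def dot(u, v):
    """Scalar product of two vectors from their components."""
    return sum(a * b for a, b in zip(u, v))

def length(v):
    """Absolute value of a vector."""
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle between two non-zero vectors, taken between 0 and pi."""
    cos_delta = dot(u, v) / (length(u) * length(v))
    return math.acos(max(-1.0, min(1.0, cos_delta)))

print(dot((1, 2, 3), (4, -5, 6)))     # prints 12
print(dot((1, 1, 0), (1, -1, 0)))     # prints 0: the vectors are perpendicular
print(angle((1, 0, 0), (0, 1, 0)))    # a right angle, pi/2
```

The second example shows a vanishing scalar product although neither factor is the zero vector.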

The physical meaning of the scalar product is exemplified by the fact, proved in elementary physics, that a force f which moves a particle of unit mass through the directed distance v does work amounting to fv.

1.1.4 The Equations of the Straight Line and of the Plane: Let a straight line in the xy-plane or a plane in xyz-space be given. In order to find their equations, we erect a perpendicular to the line (or the plane) and specify a definite positive direction along this normal; it does not matter which of the two possible directions is taken as positive (Fig. 10). Denote the vector with unit length and the direction of the positive normal by n. The points of the line (or plane) are characterized by the property that the position vector x from the origin to them has a constant projection p on the direction of the normal; in other words, the scalar product of this position vector and the normal vector n is constant. If α, β (or α, β, γ) are the direction cosines of the positive direction of the normal, i.e., the components of n, then the required equation of the line (or plane) is

$$\alpha x + \beta y = p \quad (\text{or } \alpha x + \beta y + \gamma z = p),$$

where p has the following meaning: The absolute value |p| is the distance of the line (or plane) from the origin. Moreover, p is positive if the line (or plane) does not pass through the origin and n is in the direction of the perpendicular from the origin to the line (or plane); p is negative if the line (or plane) does not pass through the origin and n has the opposite direction; p is zero if the line (or plane) passes through the origin. Conversely, if α, β (or α, β, γ) are direction cosines, this equation represents a line (or plane) at a distance |p| from the origin, and its normal has these direction cosines.

The expression αx + βy - p (or αx + βy + γz - p) on the left-hand side of this so-called normal or canonical form of the equation of the straight line (or plane) also has a geometrical meaning for any point P(x, y) (or P(x, y, z)) not lying on the line (or plane). Since αx + βy (or αx + βy + γz) is the projection of the position vector from O to P on the normal, we see at once that the expression αx + βy - p (or αx + βy + γz - p) is the perpendicular distance of the point P from the line (or plane) and is positive for points on one side of the line or plane (namely, that towards which the positive normal points) and negative for points on the other side.

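We obtain from the canonical form of the equation other forms of the equation of the straight line (or plane) by multiplying by an arbitrary non-vanishing factor. Conversely, an arbitrary linear equation

$$Ax + By + D = 0 \quad (\text{or } Ax + By + Cz + D = 0)$$

represents a straight line (or plane) provided not all the coefficients A, B (or A, B, C) are zero.* For example, in the second of these equations, we may divide by $\sqrt{A^2 + B^2 + C^2}$ and set

$$\alpha = \frac{A}{\sqrt{A^2 + B^2 + C^2}}, \quad \beta = \frac{B}{\sqrt{A^2 + B^2 + C^2}}, \quad \gamma = \frac{C}{\sqrt{A^2 + B^2 + C^2}}, \quad p = -\frac{D}{\sqrt{A^2 + B^2 + C^2}}.$$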

In this way, we obtain an equation which is seen to represent a plane at a distance |p| from the origin, the normal of which has the direction cosines α, β, γ. Corresponding remarks hold for the equation of the straight line.
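
The passage from a general linear equation to the canonical form, and the use of the canonical form as a signed distance, may be sketched in Python as follows. The function names `normal_form` and `signed_distance` and the sample plane 2x + 2y + z - 9 = 0 are assumptions of this example.

```python
import math

def normal_form(A, B, C, D):
    """Canonical form (alpha, beta, gamma, p) of the plane Ax + By + Cz + D = 0."""
    r = math.sqrt(A * A + B * B + C * C)
    if r == 0:
        raise ValueError("A, B, C must not all vanish")
    return (A / r, B / r, C / r, -D / r)

def signed_distance(plane, point):
    """Value of alpha*x + beta*y + gamma*z - p for the given point."""
    alpha, beta, gamma, p = plane
    x, y, z = point
    return alpha * x + beta * y + gamma * z - p

plane = normal_form(2, 2, 1, -9)           # direction cosines 2/3, 2/3, 1/3 and p = 3
print(signed_distance(plane, (2, 2, 1)))   # approximately 0: the point lies on the plane
print(signed_distance(plane, (0, 0, 0)))   # approximately -3: the origin lies on the negative side
print(signed_distance(plane, (4, 4, 2)))   # approximately +3
```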

* If A = B = 0 (or A = B = C = 0), D must also be zero, and any point of the plane (or of space) satisfies the equation.

A straight line in space may be determined by any two planes passing through the line. Thus, we obtain for a line in space two linear equations

$$A_1 x + B_1 y + C_1 z + D_1 = 0, \quad A_2 x + B_2 y + C_2 z + D_2 = 0,$$

which are satisfied by (x, y, z), the co-ordinates of any point on the line. Since an infinite number of planes pass through a given line, this representation of a line in space is not unique.

 

Frequently, it is more convenient to represent a line analytically in parametric form by means of a parameter t. If we consider three linear functions of t

$$x = a_1 + b_1 t, \quad y = a_2 + b_2 t, \quad z = a_3 + b_3 t,$$

where the b1, b2, b3 are not all zero, then, as t traverses the number axis, the point (x, y, z) describes a straight line. We see this at once by eliminating t between each pair of equations, whereby we obtain two linear equations for x, y, z.

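The direction cosines α, β, γ of the line in its parametric form are proportional to the coefficients b1, b2, b3, because these direction cosines are proportional (Fig. 11) to x1 - x2, y1 - y2, z1 - z2, the differences of the co-ordinates of two points P1, P2 with the co-ordinates

$$x_1 = a_1 + b_1 t_1, \quad y_1 = a_2 + b_2 t_1, \quad z_1 = a_3 + b_3 t_1$$

and

$$x_2 = a_1 + b_1 t_2, \quad y_2 = a_2 + b_2 t_2, \quad z_2 = a_3 + b_3 t_2.$$

Hence

$$\alpha = \frac{x_1 - x_2}{\rho} = \frac{b_1 (t_1 - t_2)}{\rho}, \quad \beta = \frac{y_1 - y_2}{\rho} = \frac{b_2 (t_1 - t_2)}{\rho}, \quad \gamma = \frac{z_1 - z_2}{\rho} = \frac{b_3 (t_1 - t_2)}{\rho},$$

where ρ denotes the length of the segment P1P2. Hence

$$\alpha : \beta : \gamma = b_1 : b_2 : b_3.$$

Since the sum of the squares of the direction cosines is unity, it follows that

$$\alpha = \frac{b_1}{\pm\sqrt{b_1^2 + b_2^2 + b_3^2}}, \quad \beta = \frac{b_2}{\pm\sqrt{b_1^2 + b_2^2 + b_3^2}}, \quad \gamma = \frac{b_3}{\pm\sqrt{b_1^2 + b_2^2 + b_3^2}},$$

where the double sign of the square root corresponds to the fact that we can choose either of the two possible senses on the line. By means of the direction cosines, we can easily bring the parametric representation of the line into the form

$$x = x_0 + \alpha\tau, \quad y = y_0 + \beta\tau, \quad z = z_0 + \gamma\tau,$$

where (x0, y0, z0) is a fixed point on the line; the new parameter τ is connected with the previous parameter t by the equation

$$\tau = \pm\sqrt{b_1^2 + b_2^2 + b_3^2}\,(t - t_0),$$

in which t0 is the value of t belonging to the point (x0, y0, z0) and the sign agrees with that chosen for the square root above.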

 

From the fact that α² + β² + γ² = 1, it follows that

$$\tau^2 = (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2.$$

Hence the absolute value of τ is the distance between (x0, y0, z0) and (x, y, z). The sign of τ indicates whether the direction of the line is from the point (x0, y0, z0) to the point (x, y, z) or vice versa; in the first case, τ is positive, in the second case negative.
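
The statement that the parameter of the normalized representation measures distance along the line can be checked numerically. In the Python sketch below, the names `direction_cosines` and `point_on_line` and the sample coefficients (1, 2, 2) are assumptions of this example; the positive sign of the square root is chosen.

```python
import math

def direction_cosines(b):
    """Direction cosines of a line whose parametric coefficients are b1, b2, b3."""
    r = math.sqrt(sum(c * c for c in b))
    if r == 0:
        raise ValueError("the coefficients must not all vanish")
    return tuple(c / r for c in b)

def point_on_line(p0, cosines, tau):
    """Point reached from p0 after the signed distance tau along the line."""
    return tuple(x + c * tau for x, c in zip(p0, cosines))

cosines = direction_cosines((1, 2, 2))       # approximately (1/3, 2/3, 2/3)
p0 = (1.0, 1.0, 1.0)
p = point_on_line(p0, cosines, 3.0)          # approximately (2, 3, 3)
print(p)
print(math.dist(p0, p))                      # approximately 3, the absolute value of the parameter
```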

From this result we obtain a useful expression for (x, y, z), the co-ordinates of a point P on the segment joining the points P0(x0, y0, z0) and P1(x1, y1, z1), namely

$$x = \lambda_0 x_0 + \lambda_1 x_1, \quad y = \lambda_0 y_0 + \lambda_1 y_1, \quad z = \lambda_0 z_0 + \lambda_1 z_1,$$

where λ0 and λ1 are positive and λ0 + λ1 = 1. If τ and τ1 denote the distances from P0 of the points P and P1, respectively, we find that λ1 = τ/τ1 and λ0 = 1 - τ/τ1, because if we calculate α, say, from x1 = x0 + ατ1, and substitute this value, α = (x1 - x0)/τ1, in the equation x = x0 + ατ, we obtain the above expression.

Let a straight line be given by

$$x = x_0 + \alpha\tau, \quad y = y_0 + \beta\tau, \quad z = z_0 + \gamma\tau.$$

We now seek the equation of the plane which passes through the point (x0, y0, z0) and is perpendicular to this line. Since the direction cosines of the normal to this plane are α, β, γ, the canonical form of the required equation is

$$\alpha x + \beta y + \gamma z = p,$$

and since the point (x0, y0, z0) lies on the plane

$$\alpha x_0 + \beta y_0 + \gamma z_0 = p.$$

Hence, the equation of the plane through (x0, y0, z0) perpendicular to the line with direction cosines α, β, γ is

$$\alpha (x - x_0) + \beta (y - y_0) + \gamma (z - z_0) = 0.$$

In the same way, the equation of a straight line in the xy-plane which passes through the point (x0, y0) and is perpendicular to the line with direction cosines α, β is

$$\alpha (x - x_0) + \beta (y - y_0) = 0.$$

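Later on, we shall require a formula for δ, the angle between two planes, given by the equations

$$\alpha x + \beta y + \gamma z = p, \quad \alpha' x + \beta' y + \gamma' z = p'.$$

Since the angle between the planes is equal to the angle between their normal vectors, the scalar product of these vectors is cos δ, so that

$$\cos\delta = \alpha\alpha' + \beta\beta' + \gamma\gamma'.$$

In the same way, we have for the angle δ between the two straight lines

$$\alpha x + \beta y = p, \quad \alpha' x + \beta' y = p'$$

in the xy-plane

$$\cos\delta = \alpha\alpha' + \beta\beta'.$$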

Exercises 1.1

1. Prove that the quantities α1, α2, ··· , γ3 (cf.), which define a rotation of axes, satisfy the relations

2. If a and b are two vectors with starting point O and end points A and B, then the vector with O as starting point and the point dividing AB in the ratio q :1 - q as final point is given by

3. The centre of mass of the vertices of a tetrahedron PQRS may be defined as the point dividing MS in the ratio 1 : 3, where M is the centre of mass of the triangle PQR. Show that this definition is independent of the order in which the vertices are taken and that it agrees with the general definition of the centre of mass.

4. If in the tetrahedron PQRS the centres of the edges PQ, RS, PR, QS, PS, QR are denoted by A, A', B, B', C, C', respectively, then all the lines AA', BB', CC' pass through the centre of mass and bisect one another at that point.

5. Let P1, ··· , Pn be n arbitrary particles in space, with masses m1, m2, ··· , mn, respectively. Let G be their centre of mass and p1, ··· , pn denote the vectors with starting point O and final points P1, ··· , Pn. Prove that

1.2 THE AREA OF A TRIANGLE, THE VOLUME OF A TETRAHEDRON, THE VECTOR MULTIPLICATION OF VECTORS

1.2.1 The Area of a Triangle: In order to calculate the area of a triangle in the xy-plane, we imagine it moved parallel to itself until one of its vertices is at the origin; let the other two vertices be P1(x1, y1) and P2(x2, y2) (Fig. 12). Write down the equation of the line joining P1 to the origin in its canonical form

$$\frac{-y_1}{\sqrt{x_1^2 + y_1^2}}\,x + \frac{x_1}{\sqrt{x_1^2 + y_1^2}}\,y = 0;$$

hence one has for the distance h of the point P2 from this line (except possibly for the sign) the expression

$$h = \frac{x_1 y_2 - x_2 y_1}{\sqrt{x_1^2 + y_1^2}}.$$

Since the length of the segment OP1 is $\sqrt{x_1^2 + y_1^2}$, we find that twice the area of the triangle, which is the product of the base OP1 and the height h, is given (except possibly for the sign) by

$$2A = x_1 y_2 - x_2 y_1.$$

This expression can be either positive or negative; it changes sign if we interchange P1 and P2. We now make the following assertion: The expression A has a positive or negative value according to whether the sense in which the vertices OP1P2 are traversed is the same as the sense of the rotation associated with the co-ordinate axes or not. Instead of proving the fact by a more detailed investigation of the argument given above, which is quite feasible, we prefer to prove it in the following way. We rotate the triangle OP1P2 about the origin O until P1 lies on the positive x-axis. (The case in which O, P1, P2 lie on one line, so that A = ½(x1y2 - x2y1) = 0, can be omitted.) This rotation does not change the value of A. After the rotation, P1 has the co-ordinates x1' > 0, y1' = 0, and the co-ordinates of the new P2 are x2' and y2'. The area of the triangle is now

$$A = \tfrac{1}{2}\, x_1' y_2',$$

whence it has the same sign as y2'. However, the sign of y2' is the same as the sign of the sense in which the vertices OP1P2 are traversed (Fig.13), and thus the statement is proved.

For the expression x1y2 - x2y1, which gives twice the area with its proper sign, it is customary to introduce the symbolic notation

$$x_1 y_2 - x_2 y_1 = \begin{vmatrix} x_1 & y_1 \\ x_2 & y_2 \end{vmatrix},$$

which we call a two-row determinant or a second order determinant.

If no vertex of the triangle is at the origin of the co-ordinate system, for example, if the three vertices are (x0, y0), (x1, y1), (x2, y2), we obtain by moving the axes parallel to themselves for the area A of the triangle

$$A = \frac{1}{2} \begin{vmatrix} x_1 - x_0 & y_1 - y_0 \\ x_2 - x_0 & y_2 - y_0 \end{vmatrix}.$$
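
The signed area of a triangle with arbitrary vertices may be computed directly from the two-rowed determinant of the co-ordinate differences. In the following Python sketch the function name `signed_area` is an assumption of this example.

```python
def signed_area(p0, p1, p2):
    """Signed area of the triangle p0 p1 p2 in the xy-plane.

    The value is positive when the vertices are traversed in the sense
    of rotation associated with the co-ordinate axes, negative otherwise.
    """
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

print(signed_area((0, 0), (1, 0), (0, 1)))   # prints 0.5
print(signed_area((0, 0), (0, 1), (1, 0)))   # prints -0.5: two vertices interchanged
print(signed_area((0, 0), (1, 1), (2, 2)))   # prints 0.0: the points lie on one line
```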

1.2.2 Vector Multiplication of Two Vectors: Besides the scalar product of two vectors, we have the important concept of the vector product. The vector product [ab] or a × b of the vectors a and b is defined as follows (Fig. 14):

We lay off a and b from a point O. Then a and b are two sides of a parallelogram in space. The vector product [ab] = c is a vector the length of which is numerically equal to the area of the parallelogram and the direction of which is perpendicular to the plane of the parallelogram, the sense of direction being such that a, b and c = [ab], in that order, form a right-handed system, i.e., if we look at the plane from the end point of the vector c, we see the shortest rotation from the direction of a to that of b as a positive rotation. If a and b lie in the same straight line, we must have [ab] = 0, since the area of the parallelogram is zero.

Rules of Calculation of the Vector Product:

(1) If a ≠ 0 and b ≠ 0, then [ab] = 0 if, and only if, a and b have the same direction or opposite directions, because then and only then the area of the parallelogram with sides a and b equals zero.

(2) There holds the equation

$$[\mathbf{ab}] = -[\mathbf{ba}].$$

This follows at once from the definition of [ab].

(3) If $a$ and $b$ are real numbers, then

$$[(a\mathbf{a})(b\mathbf{b})] = ab\,[\mathbf{ab}],$$

because the parallelogram with sides $a\mathbf{a}$ and $b\mathbf{b}$ has an area $ab$ times that of the parallelogram with sides a and b and lies in the same plane as the latter.

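(4) The distributive law holds:

$$[\mathbf{a}(\mathbf{b} + \mathbf{c})] = [\mathbf{ab}] + [\mathbf{ac}], \quad [(\mathbf{b} + \mathbf{c})\mathbf{a}] = [\mathbf{ba}] + [\mathbf{ca}].$$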

We shall prove the first of these formulae; the second follows from it when Rule (2) is applied.

We shall now give a geometrical construction for the vector product [ab] which will demonstrate directly the truth of the distributive law.

Let E be the plane perpendicular to a through the point O. We project b orthogonally on E, thus obtaining a vector b' (Fig. 15). Then [ab'] = [ab], because firstly the parallelogram with sides a and b has the same base and the same altitude as the parallelogram with sides a and b'; and secondly the directions of [ab'] and [ab] are the same, since a, b, b' lie in one plane and the sense of rotation from a to b' is the same as that from a to b. Since the vectors a and b' are the sides of a rectangle, the length of [ab'] = [ab] is the product |a||b'|. Hence, if we increase the length of b' in the ratio |a| : 1, we obtain a vector b'' which has the same length as [ab']. However, [ab] = [ab'] is perpendicular to both a and b', so that we obtain [ab] = [ab'] from b'' by a rotation through 90° about the line a. The sense of this rotation must be positive when looked at from the end point of a. We shall call such a rotation a positive rotation about the vector a.

Hence, we can form [ab] in the following way: Project b orthogonally onto the plane E, lengthen it in the ratio |a| : 1 and rotate it positively through 90° about the vector a.

In order to prove that [a(b + c)] = [ab] + [ac], we proceed as follows: b and c are the sides OB, OC of a parallelogram OBDC, the diagonal OD of which is the sum b + c. We now perform the three operations of projection, lengthening and rotation on the whole parallelogram OBDC instead of on the individual vectors b, c, b + c; we thus obtain a parallelogram OB1D1C1, the sides OB1, OC1 of which are the vectors [ab] and [ac] and the diagonal of which is the product [a(b + c)]. Hence follows the equation [ab] + [ac] = [a(b + c)] (Fig. 16).

(5) Let a and b be given by their components along the axes a1, a2, a3 and b1, b2, b3, respectively. What is the expression for the vector product [ab] in terms of the vector components?

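We express a by the sum of its vector components in the directions of the axes. If e1, e2, e3 are the unit vectors in the directions of the axes, then

$$\mathbf{a} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3$$

and similarly

$$\mathbf{b} = b_1\mathbf{e}_1 + b_2\mathbf{e}_2 + b_3\mathbf{e}_3.$$

Now the distributive law yields

$$[\mathbf{ab}] = \sum_{i=1}^{3} \sum_{j=1}^{3} [(a_i\mathbf{e}_i)(b_j\mathbf{e}_j)],$$

which, by Rules (1) and (3), may be rewritten

$$[\mathbf{ab}] = a_1 b_2 [\mathbf{e}_1\mathbf{e}_2] + a_1 b_3 [\mathbf{e}_1\mathbf{e}_3] + a_2 b_1 [\mathbf{e}_2\mathbf{e}_1] + a_2 b_3 [\mathbf{e}_2\mathbf{e}_3] + a_3 b_1 [\mathbf{e}_3\mathbf{e}_1] + a_3 b_2 [\mathbf{e}_3\mathbf{e}_2].$$

Now, by the definition of the vector product,

$$[\mathbf{e}_1\mathbf{e}_2] = \mathbf{e}_3 = -[\mathbf{e}_2\mathbf{e}_1], \quad [\mathbf{e}_2\mathbf{e}_3] = \mathbf{e}_1 = -[\mathbf{e}_3\mathbf{e}_2], \quad [\mathbf{e}_3\mathbf{e}_1] = \mathbf{e}_2 = -[\mathbf{e}_1\mathbf{e}_3],$$

whence

$$[\mathbf{ab}] = (a_2 b_3 - a_3 b_2)\,\mathbf{e}_1 + (a_3 b_1 - a_1 b_3)\,\mathbf{e}_2 + (a_1 b_2 - a_2 b_1)\,\mathbf{e}_3.$$

The components of the vector product [ab] = c are therefore

$$c_1 = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}, \quad c_2 = \begin{vmatrix} a_3 & a_1 \\ b_3 & b_1 \end{vmatrix}, \quad c_3 = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}.$$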
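
The component expressions a2b3 - a3b2, a3b1 - a1b3, a1b2 - a2b1 for the vector product translate directly into code. The Python sketch below is illustrative; the function names `cross` and `dot` are assumptions of this example.

```python
def cross(a, b):
    """Components of the vector product [ab] in a right-handed system."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

def dot(u, v):
    """Scalar product from components."""
    return sum(x * y for x, y in zip(u, v))

a, b = (1, 2, 3), (4, 5, 6)
c = cross(a, b)
print(cross((1, 0, 0), (0, 1, 0)))   # prints (0, 0, 1)
print(c)                             # prints (-3, 6, -3)
print(dot(c, a), dot(c, b))          # prints 0 0: c is perpendicular to a and b
print(cross(b, a))                   # prints (3, -6, 3), the negative of c
```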

In Physics, we use the vector product of two vectors to represent a moment. A force f acting at the end point of the position vector x has the moment [fx] about the origin.

1.2.3 The Volume of a Tetrahedron: Consider a tetrahedron (Fig. 17) the vertices of which are the origin and three other points P1, P2, P3 with the co-ordinates (x1, y1, z1), (x2, y2, z2), (x3, y3, z3), respectively. In order to express the volume of this tetrahedron in terms of the co-ordinates of its vertices, we proceed as follows: The vectors $\mathbf{x}_1$ = OP1 and $\mathbf{x}_2$ = OP2 are the sides of a triangle the area of which is half the length of the vector product $[\mathbf{x}_1\mathbf{x}_2]$. This vector product has the direction of the perpendicular from P3 to the plane of the triangle OP1P2; hence the length h of this perpendicular (the height of the tetrahedron) is given by the scalar product of the vector $\mathbf{x}_3$ = OP3 and the unit vector in the direction of $[\mathbf{x}_1\mathbf{x}_2]$, because h is equal to the component of OP3 in the direction of $[\mathbf{x}_1\mathbf{x}_2]$. Since the absolute value of $[\mathbf{x}_1\mathbf{x}_2]$ is twice the area A of the triangle OP1P2 and the volume V of the tetrahedron is equal to Ah/3, we have

$$V = \frac{1}{6}\,[\mathbf{x}_1\mathbf{x}_2]\,\mathbf{x}_3.$$

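Or, since the components of [x1x2] are given by

$$\begin{vmatrix} y_1 & z_1 \\ y_2 & z_2 \end{vmatrix}, \quad \begin{vmatrix} z_1 & x_1 \\ z_2 & x_2 \end{vmatrix}, \quad \begin{vmatrix} x_1 & y_1 \\ x_2 & y_2 \end{vmatrix},$$

we can write

$$V = \frac{1}{6} \left( x_3 \begin{vmatrix} y_1 & z_1 \\ y_2 & z_2 \end{vmatrix} + y_3 \begin{vmatrix} z_1 & x_1 \\ z_2 & x_2 \end{vmatrix} + z_3 \begin{vmatrix} x_1 & y_1 \\ x_2 & y_2 \end{vmatrix} \right).$$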

This also holds for the case in which O, P1, P2 lie on a straight line; in this case, it is true, the direction of [x1x2] is indeterminate, so that h can no longer be regarded as the component of OP3 in the direction of [x1x2], but nevertheless A = 0, so that V = 0, and this follows also from the above expression for V, since in this case all the components of [x1x2] vanish.

Here again the volume of the tetrahedron is given with a definite sign, as the area of the triangle was earlier; and we can show that the sign is positive if the three axes OP1, OP2, OP3 taken in that order form a system of the same type (right-handed or left-handed, as the case may be) as the co-ordinate axes, and negative if the two systems are of the opposite type. In fact, in the first case, the angle δ between $[\mathbf{x}_1\mathbf{x}_2]$ and $\mathbf{x}_3$ lies in the interval 0 ≤ δ ≤ π/2 and, in the second case, in the interval π/2 ≤ δ ≤ π, as follows immediately from the definition of $[\mathbf{x}_1\mathbf{x}_2]$, and V is equal to

$$\frac{1}{6}\,\bigl|[\mathbf{x}_1\mathbf{x}_2]\bigr|\,|\mathbf{x}_3| \cos\delta.$$

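The expression

$$x_3 \begin{vmatrix} y_1 & z_1 \\ y_2 & z_2 \end{vmatrix} + y_3 \begin{vmatrix} z_1 & x_1 \\ z_2 & x_2 \end{vmatrix} + z_3 \begin{vmatrix} x_1 & y_1 \\ x_2 & y_2 \end{vmatrix}$$

in our formulae may be expressed more briefly by the symbol

$$\begin{vmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{vmatrix},$$

called a three-rowed determinant or determinant of third order. On expanding the two-rowed determinants, we see that

$$\begin{vmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{vmatrix} = x_1 y_2 z_3 + x_2 y_3 z_1 + x_3 y_1 z_2 - x_1 y_3 z_2 - x_2 y_1 z_3 - x_3 y_2 z_1.$$

Just as in the case of the triangle, we find that the volume of the tetrahedron with vertices (x0, y0, z0), (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) is

$$V = \frac{1}{6} \begin{vmatrix} x_1 - x_0 & y_1 - y_0 & z_1 - z_0 \\ x_2 - x_0 & y_2 - y_0 & z_2 - z_0 \\ x_3 - x_0 & y_3 - y_0 & z_3 - z_0 \end{vmatrix}.$$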
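
The signed volume of a tetrahedron can be computed from the co-ordinates of its four vertices by forming the three-rowed determinant of the co-ordinate differences. In the Python sketch below, the helper names `det3` and `tetrahedron_volume` are assumptions of this example; `det3` expands in terms of the elements of the third row.

```python
def det3(r1, r2, r3):
    """Three-rowed determinant, expanded in terms of the third row."""
    a, b, c = r1
    d, e, f = r2
    g, h, k = r3
    return g * (b * f - c * e) + h * (c * d - a * f) + k * (a * e - b * d)

def tetrahedron_volume(p0, p1, p2, p3):
    """Signed volume of the tetrahedron with vertices p0, p1, p2, p3."""
    # Rows are the co-ordinate differences p1 - p0, p2 - p0, p3 - p0.
    rows = [tuple(c - c0 for c, c0 in zip(p, p0)) for p in (p1, p2, p3)]
    return det3(*rows) / 6

O, P1, P2, P3 = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(tetrahedron_volume(O, P1, P2, P3))   # 1/6: same type as the co-ordinate axes
print(tetrahedron_volume(O, P2, P1, P3))   # -1/6: two vertices interchanged
```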

Exercises 1.2 (more difficult exercises are indicated by an *)

1. What is the distance of the point P(x0, y0, z0) from the straight line l given by

2*. Find the shortest distance between two straight lines l and l' in space, given by the equations

3. Show that the plane through the three points (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) is given by

4. In a uniform rotation, let (α, β, γ) be the direction cosines of the axis of rotation, which passes through the origin, and ω the angular velocity. Find the velocity of the point (x, y, z).

5. Prove Lagrange's identity

6. The area of a convex polygon with the vertices P1(x1, y1), P2(x2, y2), ··· , Pn(xn, yn) is given by half the absolute value of

1.3 SIMPLE THEOREMS ON DETERMINANTS OF THE SECOND AND THIRD ORDER

1.3.1 Laws of Formation and Principal Properties: The determinants of the second and third order occurring in the calculation of the area of a triangle and the volume of a tetrahedron, together with their generalization, the determinant of order n, or n-rowed determinant, are very important in that they enable formal calculations in all branches of mathematics to be expressed in a compact form. We shall now develop the properties of determinants of the second and third order; those of higher order we shall rarely need. It may, however, be pointed out that all the principal theorems may be generalized at once to determinants with any number of rows. For their theory, we must refer the reader to books on algebra and determinants.* By their definitions in 1.2.1 and 1.2.3, the determinants

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \quad \text{and} \quad \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix}$$

are expressions formed in a definite way from their elements a, b, c, d and a, b, c, d, e, f, g, h, k, respectively. The horizontal lines of elements (such as d, e, f in our example) are called rows and the vertical lines (such as c, f, k) are called columns.

* For example, H. W. Turnbull, The Theory of Determinants, Matrices and Invariants (Blackie & Son, Ltd., 1929).

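We need not spend any time on discussing the formation of the two-rowed determinant

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$

For the three-rowed determinant, we give the diagonal rule which exhibits the symmetrical way in which the determinant is formed:

$$\begin{array}{ccccc} a & b & c & a & b \\ d & e & f & d & e \\ g & h & k & g & h \end{array}$$

We repeat the first two columns after the third and then form the product of each triad of numbers in the diagonal lines, multiply the products associated with lines slanting downwards and to the right by +1, the others by -1, and add them. In this way, we obtain

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = aek + bfg + cdh - ceg - afh - bdk.$$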
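
The diagonal rule for the three-rowed determinant can be written out as a short Python function. This is a sketch only; the name `det3_diagonal` and the sample determinant are assumptions of this example, and the rule applies to determinants of the third order only.

```python
def det3_diagonal(m):
    """Three-rowed determinant by the diagonal rule.

    `m` is a sequence of three rows (a, b, c), (d, e, f), (g, h, k).
    """
    (a, b, c), (d, e, f), (g, h, k) = m
    # Diagonals slanting downwards to the right count with +1,
    # those slanting downwards to the left with -1.
    return a * e * k + b * f * g + c * d * h - c * e * g - a * f * h - b * d * k

print(det3_diagonal([(1, 2, 3), (4, 5, 6), (7, 8, 10)]))   # prints -3
print(det3_diagonal([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))    # prints 1
```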

We shall now prove several theorems on determinants:

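(1) If the rows and columns of a determinant are interchanged, its value is unaltered, i.e.,

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & c \\ b & d \end{vmatrix}, \quad \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = \begin{vmatrix} a & d & g \\ b & e & h \\ c & f & k \end{vmatrix}.$$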

This follows immediately from the above expressions for the determinants.

(2) If two rows (or two columns) of a determinant are interchanged, its sign is altered, i.e., the determinant is multiplied by -1.

By virtue of (1), this need only be proved for the columns, and it can be verified at once by the law of formation of the determinant given above.

(3) In 2.3, we have introduced three-rowed determinants by the equations

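Using (2), we write this in the form

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = g\begin{vmatrix} b & c \\ e & f \end{vmatrix} - h\begin{vmatrix} a & c \\ d & f \end{vmatrix} + k\begin{vmatrix} a & b \\ d & e \end{vmatrix};$$

then in the determinants on the right hand side the elements are in the same order as on the left hand side. If we interchange the last two rows and then write down the same equation, using (2), we obtain

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = -d\begin{vmatrix} b & c \\ h & k \end{vmatrix} + e\begin{vmatrix} a & c \\ g & k \end{vmatrix} - f\begin{vmatrix} a & b \\ g & h \end{vmatrix},$$

and similarly

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = a\begin{vmatrix} e & f \\ h & k \end{vmatrix} - b\begin{vmatrix} d & f \\ g & k \end{vmatrix} + c\begin{vmatrix} d & e \\ g & h \end{vmatrix}.$$

We call these three equations the expansion in terms of the elements of the third row, the second row, and the first row, respectively. By interchanging columns and rows, which according to (1) does not alter the value of the determinant, we obtain the expansion by columns, for example by the first column,

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = a\begin{vmatrix} e & f \\ h & k \end{vmatrix} - d\begin{vmatrix} b & c \\ h & k \end{vmatrix} + g\begin{vmatrix} b & c \\ e & f \end{vmatrix}.$$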

An immediate consequence of this is the theorem:

(4) If all the elements of one row (or column) are multiplied by a number r, the value of the determinant is multiplied by r.
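The expansions by rows and columns, and the way Theorem (4) follows from them, can be tried out numerically. The sketch below is only an illustration under the assumption that the determinant is stored as a list of three rows; the helper names `minor`, `expand_row` and `expand_col` are hypothetical. Each element is multiplied by its two-rowed sub-determinant and by the sign (-1)^(i+j), which reproduces the alternating signs of the expansions.

```python
def det2(m):
    """Two-rowed determinant of [[a, b], [c, d]]."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def minor(m, i, j):
    """Delete row i and column j (both counted from 0) of a 3 x 3 array."""
    return [[m[r][c] for c in range(3) if c != j] for r in range(3) if r != i]


def expand_row(m, i):
    """Expansion in terms of the elements of row i."""
    return sum((-1) ** (i + j) * m[i][j] * det2(minor(m, i, j)) for j in range(3))


def expand_col(m, j):
    """Expansion in terms of the elements of column j."""
    return sum((-1) ** (i + j) * m[i][j] * det2(minor(m, i, j)) for i in range(3))


m = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # arbitrary sample elements
values = [expand_row(m, i) for i in range(3)] + [expand_col(m, j) for j in range(3)]

# Theorem (4): multiplying one row by r = 7 multiplies the determinant by 7,
# because every term of the expansion by that row contains exactly one element of it.
scaled = [[7 * x for x in m[0]], m[1], m[2]]
print(values, expand_row(scaled, 0))
```

All six expansions give the same number, and the scaled determinant is seven times that number.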

From (2) and (4), we deduce the theorem:

(5) If the elements of two rows (or two columns) are proportional, i.e., if every element of one row (or column) is the product of the corresponding element in the other row (or column) and the same factor r, then the determinant is equal to zero.

In fact, according to (4), we can take the factor outside the determinant. If we then interchange the equal rows, the value of the determinant is unchanged, but by (2) it should change sign. Hence its value is zero.

In particular, a determinant, in which one row or column consists entirely of zeros, has the value zero, as also follows from the definition of a determinant.

(6) The sum of two determinants, having the same number of rows, which differ only in the elements of one row (or column) is equal to the determinant which coincides with them in the rows (or columns) common to the two determinants and in the one remaining row (or column) has the sums of the corresponding elements of the two non-identical rows (or columns).

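For example,

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} + \begin{vmatrix} a & m & c \\ d & n & f \\ g & p & k \end{vmatrix} = \begin{vmatrix} a & b + m & c \\ d & e + n & f \\ g & h + p & k \end{vmatrix}.$$

In fact, if we expand in terms of the rows (or columns) in question, which in our example consist of the elements b, e, h and m, n, p, respectively, and add, we obtain the expression

$$-(b + m)\begin{vmatrix} d & f \\ g & k \end{vmatrix} + (e + n)\begin{vmatrix} a & c \\ g & k \end{vmatrix} - (h + p)\begin{vmatrix} a & c \\ d & f \end{vmatrix},$$

which clearly is just the expansion of the determinant

$$\begin{vmatrix} a & b + m & c \\ d & e + n & f \\ g & h + p & k \end{vmatrix}$$

in terms of the column b + m, e + n, h + p. This proves the statement.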

Similar statements hold for two-rowed determinants.

(7) If we add to each element of a row (or column) of a determinant the same multiple of the corresponding element of another row (or column), the value of the determinant is unchanged.

By (6), the new determinant is the sum of the original determinant and a determinant which has two proportional rows (or columns); by (5), this second determinant is zero.*
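Theorems (2), (5), (6) and (7) can be checked on a numerical example. The following Python sketch is an illustration only: the function name `det3`, the sample numbers and the dictionary `checks` are assumptions made here, and the determinant is evaluated by the diagonal rule.

```python
def det3(m):
    """Three-rowed determinant of [[a, b, c], [d, e, f], [g, h, k]] by the diagonal rule."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * e * k + b * f * g + c * d * h - c * e * g - a * f * h - b * d * k


m = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # arbitrary sample elements
D = det3(m)

swapped = [m[1], m[0], m[2]]                       # (2): first two rows interchanged
proportional = [m[0], [3 * x for x in m[0]], m[2]]  # (5): second row = 3 * first row
other = [m[0], [4, -2, 6], m[2]]                   # (6): differs from m in the second row only
summed = [m[0], [p + q for p, q in zip(m[1], other[1])], m[2]]
sheared = [m[0], [y + 5 * x for x, y in zip(m[0], m[1])], m[2]]  # (7): row 2 + 5 * row 1

checks = {
    "(2) interchange changes the sign": det3(swapped) == -D,
    "(5) proportional rows give zero": det3(proportional) == 0,
    "(6) determinants add along one row": det3(summed) == D + det3(other),
    "(7) adding a multiple of a row": det3(sheared) == D,
}
for statement, holds in checks.items():
    print(statement, holds)
```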

* The rule for an expansion in terms of rows or columns may be extended to define determinants of the fourth and higher order. Given a system of sixteen numbers, for example,

we define a determinant of the fourth order by the expression

and similarly we can introduce determinants of the fifth, sixth, ..., nth order in succession. It turns out that in all essential properties these agree with the determinants of two or three rows. However, determinants of more than three rows cannot be expanded by the diagonal rule. We shall not consider further details here.
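The successive definition described in this footnote lends itself to a recursive sketch. The Python function below assumes that the expansion is always taken along the first row (the footnote's own formula may use a different row or column, which by the theorems above gives the same value); the name `det` is chosen here for illustration.

```python
def det(m):
    """Determinant of a square list of rows, defined by expansion along the first row.

    A determinant of order n is reduced to n determinants of order n - 1,
    with alternating signs, until one-rowed determinants remain.
    """
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete the first row and column j to obtain the sub-determinant.
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(sub)
    return total


diagonal4 = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
equal_rows4 = [[1, 2, 3, 4], [5, 6, 7, 8], [1, 2, 3, 4], [0, 1, 0, 1]]
print(det(diagonal4), det(equal_rows4))
```

The number of terms grows like n!, so this definition is suited to explanation rather than to the numerical evaluation of large determinants.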

The following examples illustrate how the above theorems are applied to the evaluation of determinants. We have

as we can prove by the diagonal rule. A determinant in which only the elements in the so-called principal diagonal differ from zero is equal to the product of these elements.

Evaluation of a determinant:

Hence

Another example is

If we now expand this in terms of the first column, we obtain

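1.3.2 Application to Linear Equations: Determinants are of fundamental importance in the theory of linear equations. In order to solve the two equations

$$ax + by = A, \qquad cx + dy = B$$

for x and y, we multiply the first equation by c, the second by a and subtract the second from the first; then we multiply the first equation by d and the second by b and subtract. We thus obtain

$$x\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} A & b \\ B & d \end{vmatrix}, \qquad y\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & A \\ c & B \end{vmatrix}.$$

If we assume that the determinant

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

differs from zero, these equations yield at once the solution

$$x = \frac{\begin{vmatrix} A & b \\ B & d \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}}, \qquad y = \frac{\begin{vmatrix} a & A \\ c & B \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}},$$

which can be verified by substitution. However, if the determinant vanishes, the equations

$$x\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} A & b \\ B & d \end{vmatrix}, \qquad y\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & A \\ c & B \end{vmatrix}$$

would lead to a contradiction if either of the determinants on the right hand sides were different from zero. However, if

$$\begin{vmatrix} A & b \\ B & d \end{vmatrix} = \begin{vmatrix} a & A \\ c & B \end{vmatrix} = 0,$$

our formulae tell us nothing about the solution.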

Hence, we obtain the fact, which is particularly important for our purposes, that a system of equations of the above form, the determinant of which is different from zero, always has a unique solution.
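The solution by determinants can be sketched in a few lines of Python. The sketch assumes that the system is written as ax + by = A, cx + dy = B with integer coefficients; the function name `solve2` is hypothetical, and exact fractions are used so that no rounding obscures the result.

```python
from fractions import Fraction


def solve2(a, b, c, d, A, B):
    """Solve ax + by = A, cx + dy = B by determinants.

    Returns (x, y) if the determinant ad - bc differs from zero; otherwise
    returns None, because the formulae then give no information.
    """
    D = a * d - b * c
    if D == 0:
        return None
    x = Fraction(A * d - b * B, D)  # right hand sides replace the first column
    y = Fraction(a * B - A * c, D)  # right hand sides replace the second column
    return x, y


# Sample system: 2x + y = 5, x - 3y = -1.
x, y = solve2(2, 1, 1, -3, 5, -1)
print(x, y, 2 * x + y, x - 3 * y)  # substitution reproduces the right hand sides
```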

If our system of equations is homogeneous, i.e., if A = B = 0, our calculations lead to the solution x = 0, y = 0, provided that the determinant of the system is different from zero.

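For the three equations with three unknowns

$$ax + by + cz = A, \qquad dx + ey + fz = B, \qquad gx + hy + kz = C,$$

a similar discussion leads to a similar conclusion. We multiply the first equation by $\begin{vmatrix} e & f \\ h & k \end{vmatrix}$, the second by $-\begin{vmatrix} b & c \\ h & k \end{vmatrix}$, the third by $\begin{vmatrix} b & c \\ e & f \end{vmatrix}$ and add to obtain

$$x\left(a\begin{vmatrix} e & f \\ h & k \end{vmatrix} - d\begin{vmatrix} b & c \\ h & k \end{vmatrix} + g\begin{vmatrix} b & c \\ e & f \end{vmatrix}\right) + y\left(b\begin{vmatrix} e & f \\ h & k \end{vmatrix} - e\begin{vmatrix} b & c \\ h & k \end{vmatrix} + h\begin{vmatrix} b & c \\ e & f \end{vmatrix}\right) + z\left(c\begin{vmatrix} e & f \\ h & k \end{vmatrix} - f\begin{vmatrix} b & c \\ h & k \end{vmatrix} + k\begin{vmatrix} b & c \\ e & f \end{vmatrix}\right) = A\begin{vmatrix} e & f \\ h & k \end{vmatrix} - B\begin{vmatrix} b & c \\ h & k \end{vmatrix} + C\begin{vmatrix} b & c \\ e & f \end{vmatrix}.$$

However, by our formulae for the expansion of a determinant in terms of the elements of a column, this equation can be written in the form

$$x\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} + y\begin{vmatrix} b & b & c \\ e & e & f \\ h & h & k \end{vmatrix} + z\begin{vmatrix} c & b & c \\ f & e & f \\ k & h & k \end{vmatrix} = \begin{vmatrix} A & b & c \\ B & e & f \\ C & h & k \end{vmatrix}.$$

By Rule (5), the coefficients of y and z vanish, since each of them has two equal columns, so that

$$x\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = \begin{vmatrix} A & b & c \\ B & e & f \\ C & h & k \end{vmatrix}.$$

In the same way, we derive the equations

$$y\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = \begin{vmatrix} a & A & c \\ d & B & f \\ g & C & k \end{vmatrix}, \qquad z\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix} = \begin{vmatrix} a & b & A \\ d & e & B \\ g & h & C \end{vmatrix}.$$

If the determinant

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & k \end{vmatrix}$$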

is not zero, the last three equations yield the value of the unknowns. Provided that this determinant is not zero, the equations can be solved uniquely for x, y, z. If the determinant is zero, it follows that the right hand sides of the above equations must also be zero, whence the equations cannot be solved unless A, B, C satisfy the special conditions which are expressed by the vanishing of every determinant on the right hand side.
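For three equations, the same procedure can be sketched as follows. The Python code assumes integer coefficients given as a list of three rows and right hand sides given as a list of three numbers; the names `det3` and `solve3` are hypothetical. Each unknown is the quotient of the determinant in which the corresponding column has been replaced by the right hand sides and the determinant of the system.

```python
from fractions import Fraction


def det3(m):
    """Three-rowed determinant by the diagonal rule."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * e * k + b * f * g + c * d * h - c * e * g - a * f * h - b * d * k


def solve3(coeffs, rhs):
    """Solve three linear equations in x, y, z by determinants.

    Returns None when the determinant of the system is zero, since the
    formulae then do not determine a unique solution.
    """
    D = det3(coeffs)
    if D == 0:
        return None
    solution = []
    for j in range(3):
        # Replace column j by the right hand sides A, B, C.
        replaced = [row[:j] + [rhs[i]] + row[j + 1:] for i, row in enumerate(coeffs)]
        solution.append(Fraction(det3(replaced), D))
    return tuple(solution)


coeffs = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # sample system with determinant 39
rhs = [5, 4, 22]
print(solve3(coeffs, rhs))
```

For the homogeneous system with the same coefficients, `solve3(coeffs, [0, 0, 0])` returns three zeros, in agreement with the remark on homogeneous equations.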

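In particular, if the system of equations is homogeneous, so that A = B = C = 0, and if its determinant is different from zero, it again follows that x = y = z = 0.

In addition to the cases above, in which the number of equations is equal to the number of unknowns, we shall occasionally encounter systems of two (homogeneous) equations with three unknowns, for example

$$ax + by + cz = 0, \qquad dx + ey + fz = 0.$$

If not all of the three determinants

$$D_1 = \begin{vmatrix} b & c \\ e & f \end{vmatrix}, \qquad D_2 = \begin{vmatrix} c & a \\ f & d \end{vmatrix}, \qquad D_3 = \begin{vmatrix} a & b \\ d & e \end{vmatrix}$$

are zero, if, for example, D3 ≠ 0, our equations can first be solved for x and y:

$$x = z\,\frac{D_1}{D_3}, \qquad y = z\,\frac{D_2}{D_3},$$

or

$$x : y : z = D_1 : D_2 : D_3.$$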

This has the geometrical meaning: We are given two vectors u and v with the components a, b, c and d, e, f, respectively. We seek a vector x which is perpendicular to u and v, i.e., which satisfies the equations

Thus, x is in the direction of [uv].
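A short Python sketch makes this concrete. It assumes that the three two-rowed determinants formed from the components a, b, c and d, e, f are taken in the cyclic order bf - ce, cd - af, ae - bd, which are the components of the vector product; the function names `cross` and `dot` are chosen here for illustration.

```python
def cross(u, v):
    """Components of the vector product [uv] as three two-rowed determinants."""
    a, b, c = u
    d, e, f = v
    return (b * f - c * e, c * d - a * f, a * e - b * d)


def dot(u, v):
    """Scalar product of two vectors."""
    return sum(p * q for p, q in zip(u, v))


u = (1, 2, 3)  # sample components a, b, c
v = (4, 5, 6)  # sample components d, e, f
x = cross(u, v)
print(x, dot(u, x), dot(v, x))  # both scalar products vanish
```

Any multiple of `x` satisfies the two homogeneous equations as well, which is why only the ratios of the unknowns are determined.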

Exercises 1.3

1. Show that the determinant

can always be reduced to the form

merely by repeated application of the following processes:

(1) Interchanging two rows or two columns, (2) adding a multiple of one row (or column) to another row (or column).

2. If all the three determinants

vanish, then the necessary and sufficient condition for the existence of a solution of the three equations

is

3. State the condition that the two straight lines

either intersect or are parallel.

4*. Prove Properties (1) to (7), given in 1.3.1, for determinants of the fourth order.

5. Prove that the volume of a tetrahedron with vertices (x1, y1, z1), (x2, y2, z2), (x3, y3, z3), (x4, y4, z4) is given by


1.4 AFFINE TRANSFORMATIONS AND THE MULTIPLICATION OF DETERMINANTS

We shall conclude these preliminary remarks by discussing the simplest facts relating to the so-called affine transformations and at the same time obtain an important theorem on determinants.

1.4.1. Affine Transformations of the Plane and Space: We mean by a mapping or transformation of a portion of space (or of a plane) a law by which each point has assigned to it another point of a space (or a plane) as image point; we call the point itself the original point, or sometimes the model (in contrast to the image). We obtain a physical expression of the concept of mapping by imagining that the portion of space (or plane) in question is occupied by some deformable substance and that our transformation represents a deformation in which every point of the substance moves from its original position to a certain final position.

Using a rectangular system of co-ordinates, we take (x, y, z) as the co-ordinates of the original point and (x', y', z') as those of the corresponding image point.

The transformations which are not only the simplest and most easily understood ones, but are also of fundamental importance for the general case, are the affine transformations. An affine transformation is one in which the co-ordinates (x', y', z') (or, in the plane, (x', y')) of the image point are expressed linearly in terms of those of the original point. Such a transformation is therefore given by the three equations

or in the plane by the two equations

with constant coefficients a, b, ···. These assign an image point to every point of space (or plane). The question at once arises whether we can interchange the relationship between image and original point, i.e., whether every point of space (or of the plane) has an original point corresponding to it. The necessary and sufficient condition for this is that the equations

shall be capable of being solved for the unknowns x, y, z (or x, y), no matter what the values of x', y', z' are. By 1.3.2, an affine transformation has an inverse and, in fact, a unique inverse (i.e., every image point has one and only one original point), provided that its determinant

is different from zero. We shall confine our attention to affine transformations of this type and shall not discuss what happens when D = 0.

By introducing an intermediate point (x", y", z"), we can decompose the general affine transformation into the transformations

and

Here (x, y, z) is mapped first onto (x", y", z") and then (x", y", z") is mapped onto (x', y', z'). Since the second transformation is merely a parallel translation of the space (or plane) as a whole and is therefore quite easily understood, we may restrict our study to the first. We shall therefore only consider affine transformations of the form

with non-vanishing determinants.

The results of 1.3.2 for linear equations enable us to express the inverse transformation by the formulae

in which a', b',··· are certain expressions formed from the coefficients a, b, ···. Due to the uniqueness of the solution, the original equations also follow from these latter equations. In particular, it follows from x = y = z = 0 that x' = y' = z' = 0 and conversely.
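One way of forming the coefficients of the inverse transformation explicitly is by means of the two-rowed sub-determinants, as the solution formulae of 1.3.2 suggest. The Python sketch below assumes a transformation without translation, stored as a list of three rows of integers; the names `det3`, `inverse3` and `apply` are hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Three-rowed determinant by the diagonal rule."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * e * k + b * f * g + c * d * h - c * e * g - a * f * h - b * d * k


def inverse3(m):
    """Coefficients of the inverse transformation, assuming det3(m) != 0."""
    D = det3(m)
    if D == 0:
        raise ValueError("determinant is zero; no unique inverse exists")

    def cofactor(i, j):
        # Signed two-rowed determinant left after deleting row i and column j.
        rows = [r for r in range(3) if r != i]
        cols = [c for c in range(3) if c != j]
        sub = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
               - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
        return (-1) ** (i + j) * sub

    # Entry (i, j) of the inverse is the cofactor of element (j, i), divided by D.
    return [[Fraction(cofactor(j, i), D) for j in range(3)] for i in range(3)]


def apply(m, p):
    """Image of the point p under the transformation with coefficient rows m."""
    return tuple(sum(m[i][j] * p[j] for j in range(3)) for i in range(3))


m = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # sample coefficients, determinant 39
original = (1, 2, 3)
image = apply(m, original)
print(image, apply(inverse3(m), image))  # the inverse recovers the original point
```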

The characteristic geometrical properties of affine transformations are stated in the following theorems:

(1) In space, the image of a plane is a plane; in the plane, the image of a straight line is a straight line.

In fact, by 1.1.4, we can write the equation of the plane (or the line) in the form

The numbers A, B, C (or A, B) are not all zero. The co-ordinates of the image points of the plane (or of the line) satisfy the equation

Hence the image points themselves lie on a plane (or a line), because the coefficients

of the co-ordinates x', y', z' (or x', y') cannot all be zero; otherwise the equations

would hold, and these we may regard as equations in the unknowns A, B, C (or A, B). But we have shown above that it follows from these equations that A = B = C =0 (or A = B = 0).

(2) The image of a straight line in space is a straight line.

This follows immediately from the fact that a straight line may be regarded as the intersection of two planes; by (1), its image is also the intersection of two planes and is therefore a straight line.

(3) The images of two parallel planes of space (or of two parallel lines of the plane) are parallel.

In fact, if the images had points of intersection, the originals would have to intersect at the original points of these intersections.

(4) The images of two parallel lines in space are two parallel lines.

In fact, as the two lines lie in a plane and do not intersect one another, by (1) and (2), the same is true for their images. The images are therefore parallel.

The image of a vector v is of course a vector v' leading from the image of the starting point of v to the image of the end point of v. Since the components of the vector are the differences of the corresponding co-ordinates of the starting and end points, under the most general affine transformation, they are transformed according to the equations

1.4.2 The Combination of Affine Transformations and the Resolution of the General Affine Transformation. If we map a point (x, y, z) onto an image point (x', y', z') by means of the transformation

and then map (x', y', z') onto a point (x", y", z") by means of a second affine transformation

we readily see that (x, y, z) and (x", y", z") are also related by an affine transformation. In fact,

where the coefficients are given by the equations

We say that this last transformation is the combination or resultant of the first two transformations. If the determinants of the first two transformations are different from zero, their inverses can be formed, whence the compound transformation also has an inverse. The coefficients of the compound transformation are obtained from those of the original transformations by multiplying corresponding elements of a column of the first transformation and of a row of the second transformation, adding the three products thus obtained, and using this product of column and row as the coefficient which stands in the column with the same number as the column used and in the row with the same number as the row used.
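The rule of combining a row of the second transformation with a column of the first can be sketched in Python. The code assumes transformations without translation, each stored as a list of three coefficient rows; the names `compose`, `apply` and `det3` and the sample coefficients are hypothetical. As an additional check, not derived at this point of the text, the determinant of the compound transformation is compared with the product of the two determinants.

```python
def det3(m):
    """Three-rowed determinant by the diagonal rule."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * e * k + b * f * g + c * d * h - c * e * g - a * f * h - b * d * k


def apply(m, p):
    """Image of the point p under the transformation with coefficient rows m."""
    return tuple(sum(m[i][j] * p[j] for j in range(3)) for i in range(3))


def compose(second, first):
    """Coefficients of the transformation 'first, then second'.

    The coefficient in row i and column j is the product of row i of the
    second transformation and column j of the first.
    """
    return [[sum(second[i][r] * first[r][j] for r in range(3)) for j in range(3)]
            for i in range(3)]


first = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
second = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
combined = compose(second, first)

p = (1, 2, 3)
print(apply(combined, p), apply(second, apply(first, p)))  # same image point
print(det3(combined), det3(second) * det3(first))          # same number in this example
```

Note that the order matters: `compose(first, second)` is in general a different transformation.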

In the same way, a combination of the transformations

yields the new transformation

We mean by a primitive transformation one in which two (or one) of the three (or two) co-ordinates of the image are the same as the corresponding co-ordinates of the original points. In physical terms, we may think of a primitive transformation as one in which the space (or plane) undergoes stretching in one direction only (the stretching, of course, varying from point to point) so that all the points are simply moved along a family of parallel lines. A primitive affine transformation in which the motion takes place parallel to the x-axis is analytically represented by formulae of the type

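The general affine transformation in the plane

$$x' = ax + by, \qquad y' = cx + dy,$$

with a non-vanishing determinant, can be obtained by a combination of primitive transformations.

In the proof we may assume * that a ≠ 0. We introduce an intermediate point (ξ, η) by the primitive transformation

$$\xi = ax + by, \qquad \eta = y,$$

the determinant of which does not vanish. From ξ, η, we obtain x', y' by a second primitive transformation

$$x' = \xi, \qquad y' = \frac{c}{a}\,\xi + \frac{ad - bc}{a}\,\eta,$$

with the determinant

$$\begin{vmatrix} 1 & 0 \\ \dfrac{c}{a} & \dfrac{ad - bc}{a} \end{vmatrix} = \frac{1}{a}\begin{vmatrix} a & b \\ c & d \end{vmatrix}.$$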

This yields the required resolution into primitive transformations.
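The resolution can be verified on a numerical example. The Python sketch below assumes that the plane transformation is written x' = ax + by, y' = cx + dy with a ≠ 0 and integer coefficients; the first primitive step changes only the first co-ordinate, the second only the second. The function names `primitive_steps` and `apply2` are hypothetical.

```python
from fractions import Fraction


def primitive_steps(a, b, c, d):
    """Two primitive transformations whose combination is x' = ax + by, y' = cx + dy.

    first:  xi = a x + b y,  eta = y
    second: x' = xi,         y' = (c/a) xi + ((ad - bc)/a) eta
    Assumes a != 0.
    """
    if a == 0:
        raise ValueError("this sketch assumes a != 0")
    first = [[Fraction(a), Fraction(b)], [Fraction(0), Fraction(1)]]
    second = [[Fraction(1), Fraction(0)], [Fraction(c, a), Fraction(a * d - b * c, a)]]
    return first, second


def apply2(m, p):
    """Image of the point p = (x, y) under the coefficient rows m."""
    return (m[0][0] * p[0] + m[0][1] * p[1], m[1][0] * p[0] + m[1][1] * p[1])


a, b, c, d = 2, 3, 1, 4
first, second = primitive_steps(a, b, c, d)
p = (5, -7)
direct = (a * p[0] + b * p[1], c * p[0] + d * p[1])
via_primitives = apply2(second, apply2(first, p))
print(direct, via_primitives)
```

The determinants of the two steps are a and (ad - bc)/a, whose product is the determinant ad - bc of the given transformation.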

* If a = 0, then b ≠ 0, and by interchanging x and y we can return to the case a ≠ 0. Such an interchange, represented by the transformation X = y, Y = x, is itself effected by the three successive primitive transformations

ξ1 = x - y, ξ2 = ξ1, X = -ξ2 + η2 = y,
η1 = y, η2 = ξ1 + η1 = x, Y = η2 = x.
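The three steps of this footnote can be followed directly in code. In the Python sketch below, the function name `swap_by_primitives` is hypothetical, and the Greek letters are written out as `xi` and `eta`; each line changes only one of the two co-ordinates.

```python
def swap_by_primitives(x, y):
    """Interchange x and y by three successive primitive transformations."""
    xi1, eta1 = x - y, y          # first step: only the first co-ordinate changes
    xi2, eta2 = xi1, xi1 + eta1   # second step: only the second co-ordinate changes
    X, Y = -xi2 + eta2, eta2      # third step: only the first co-ordinate changes
    return X, Y


print(swap_by_primitives(3, 8))
```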

In a similar way, the affine transformation in space

with a non-vanishing determinant, can be resolved into primitive transformations.

At least one of the three determinants

must be different from zero; otherwise, as the expansion in terms of the elements of the last row shows, we should have

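As in the preceding case, we can then assume without loss of generality (1) that the determinant ae - bd, formed from the coefficients of x and y in the first two equations, is different from zero and (2) that a ≠ 0. The first intermediate point (ξ, η, ζ) is given by the equations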

The determinant of this primitive transformation is a, which is not zero. For the second transformation to ξ', η', ζ', we wish to set ξ' = ξ, ζ' = ζ, and also to have η' = y'. One primitive transformation then remains. If we introduce in the equation

the quantities ξ, η, ζ instead of x, y, z, we obtain the second primitive transformation in the form

The determinant of this transformation is (ae - bd)/a, which does not vanish. The third transformation must then be
