1.6 Systems of non-homogeneous linear equations: Consider the system of linear equations

(1)

where not all the terms on the right hand side are equal to zero, together with the two homogeneous systems

Denote the matrix of (2) by A, that of (3) by A0. The following propositions are obvious, but important:

Proposition 1: If x=(x1,···,xn) is a solution of (1), and h=(y1,···,yn) a solution of (2), then x + h is a solution of (1).

Proposition 2: If x and x1 are solutions of (1), then h = x - x1 is a solution of (2).

Proposition 3: If x1,···,x n are solutions of (1), and c1 + ··· + cn = 1, then =c1x1+···+cnxn is a solution of (1).

Proposition 4: If (x1,···,xn) is a solution of (1), then (x1,···,xn,1) is a solution of (3).

Proposition 5: Let (z1,···, zn, z0) be a solution of (3). If z0 = 0, then (z1,···,zn) is a solution of (2): if z0 ¹ 0, then (z1,···,zn)(z1/z0, ··· , zn/z0) is a solution of (1).

Propositions 1 and 2 are condensed in

Theorem 1: If x is an arbitrary solution of (1), then. we get all solutions of (1) are obtained by addition of the solution h of (2).

As has been shown in the introduction, (1) may have no solution. Consider next the necessary and sufficient condition for the existence of a solution.

Theorem 2: System (1) has solutions if and only if Rank A = Rank A0.

Proof: Let r be Rank A. Since r is equal to the Rank of the vector space generated by the columns of A, and A0 is formed by A and a column added to A, Rank A0 is either equal to r or r + 1. The solutions of (2) form a vector space X(A) of rank n - r. Let

(y1, ··· , yn) (4)

be the vectors of this space. The vectors

(y1, ··· , yn, 0) (5)

form a vector space X' consisting of (n + 1)-vectors. n-vectors (4) are independent if and only if the corresponding vectors (5) are independent, whence

Rank X' = Rank X(A) = n - r,

that is, X' Í X(A0). Since every vector of X' is a solution of (3), then

n - r = Rank X' £ Rank X(A0) = (n + 1)— Rank A0, (6)

where the equality applies if and only if the vector spaces X' and XA0) are identical. But X' is identical to X(A0) if in every solution of (3) the coordinate z0 vanishes; Proposition 5 yields in this case that (1) has no solution. Hence for n - r = n + 1 Rank A0, that is, when Rank A0 = r + 1, System (1) has no solution. If X(A0) contains (n + 1) vectors which do not belong to X', then by Proposition 5, System (1) does not have a solution. This applies if and only if n— r < n + 1 - Rank A0, i.e., Rank A < r + 1, whence then A0 = r and Theorem 2 has been proved.

In order to discover the solutions of (1), one may solve the homogeneous system (3) by sweep out and consider only those solutions for which z0>1. The sweeping method yields a matrix with n + 1 columns and t = Rank A0 rows [cf. 1.5, (3)] of the type

This matrix corresponds to a system of homogeneous linear equations equivalent to (3):

where i1, ···, in, in+ 1 is a permutation of the subscripts 1, ··· , n, 0. If none of the subscripts i1, ···, it is the subscript 0, then it may be assumed without loss of generality that in+1-t is zero. Then System (1) is equivalent to

(7)

In this case, the equations are soluble, whence r = t. Hence solubility occurs only if the matrix A cannot be swept without seeping out its last column, i.e., if there results a row in which every element but the last one is zero. A row of this type corresponds to the condition z0 = 0, and if this condition is satisfied, System (1) has no solution, whence follows

Theorem 3: On applying sweep out to the matrix A in such a manner that the coefficients of z0 remain the last column, either one obtains a solution (7) or a row results which corresponds to an equation z0 = 0, and which shows that System (1) has no solution.

Equations 1-5 and 1-6 show that the solubility of a system of linear equations does not depend on the number of equations and of unknown quantities, but on the Ranks of certain matrices. These Ranks are limited by the number of equations and unknown quantities as the Rank of a matrix can neither exceed the number of rows, nor the number of columns.

Proposition 6: If m = n, System (1) has exactly one solution if and only if Rank A = n.

Proof: Rank A = n' cannot exceed n. If n' < n, then either (1) has no solution or its solutions are m a (1,l)-correspondence to the solutions of (2) [cf. Theorem 1] and the solutions of (2) form a vector space of Rank n n' [cf. 1.5, Propositions 2 and 3]; this vector space contains more than one element. If Rank A == n, then the solutions of (2) form & vector space of Rank 0, i.e. System (2) has only the trivial solution (0, ··· , 0). As Rank A0 cannot exceed n, Rank A0 = Rank A, whence there exists a solution of (1), but it follows from Theorem 1 that there exist only one solution.

The coordinates of n-vectors and the elements of matrices have been assumed to be numbers. No special assumption has been introduced whether this term should be refer to real or complex numbers. Naturally, the preceding investigation have been made in such a manner that they are independent of any special assumption. It may he mentioned in anticipation that the investigations so far apply unaltered if the notion of number is replaced by the notion of element of any particular* field. The general notion of a field will be explained in Chapter II; it will not be used before.

* A field may also be finite; in this case, a vector space contains only a finite number of elements. Nowhere use is made of infinite vector spaces (cf. especially the Proof of 1.6, Proposition 6.

1.7 Method of orthogonalization* Numbers in this section are assumed to be real**. Especially the n-vectors are to have real coordinates.

* This section may be omitted at a first reading.
** This method is readily generalized for abstract real fields, i.e. , fields in which 0 cannot be represented as a sum of squares.

Definition 1. The scalar product of two n-vectors a = (a1, ··· ,an) and b=(b1, ··· ,bn) is the number

This definition yields:

1: ab =ba - commutative law;
2.
a(b + g) = ab + ag - distributive law;
3. ab = 0, if and only if a is orthogonal to b;

4.   aa > 0,   if   a ¹ 0,
      = 0   if   a = 0.

Definition 2: The non-negative square root of aa is said to be the length |a| of a.

Thus |a| > 0, unless a = 0. The fact that length is a generalization of the notion of absolute value is seen from the case n = 1 as well as from the case n = 2, when a = (a1,a2) is represented by the complex number a1 + a2i. This justifies the notation |a|. If one is given a system of homogeneous linear equations with the matrix A and is to solve the system by the method of orthogonalization., one forms a system of n independent n-vectors of length 1

each of them orthogonal to each other, in such a way, that

is a base of the vector space R(A) generated by the rows of A. The vector space generated by

contains only n-vectors which are orthogonal to the n-vectors of R(A), and these are solutions of the given system. It will be shown how the n-vectors (1) can be found and that (3) is a base of X(A). These considerations yield a second proof of 1-5, Proposition 3.

Definition 3: The n-vectors b1, ··· ,bm form an orthogonal system, if

The n-vectors of an orthogonal system are therefore of length 1, and are mutually orthogonal.

Let

be an orthogonal system and

then

whence a = 0 implies c1= ··· = cm = 0. Thus

Proposition 1: The n-vectors of an orthogonal system are independent.

Let k be an arbitrary n-vector; in (6), let

when it follows from (7) that

Hence l is orthogonal to the n-vectors (5). If l ¹ 0, set

so that the n-vectors

form an orthogonal system. Since b m+1 is independent of the n-vectors (5), so are l and k. If l = 0, then k = a depends on (5), whence follows

Proposition 2: If k is independent of (5), and b m+1 is defined by (6), (8) and (9), then (10) is an orthogonal system and the vector space generated by (10) is the same as the vector space generated by b 1, ··· ,b m, k. If l = 0, then k depends on (5).

Proposition 2 can serve two different purposes: Firstly, to extend an orthogonal system (5) to an orthogonal system (10) in such a way that a particular n-vector k, independent of (5), is dependent on (10); secondly, to state whether an arbitrary n-vector depends on (5). Let

be an orthogonal system; in order to discover whether the unit-vector ek depends on (5), set in (8) k = ek, whence c = ek b i = bik and

Hence l = 0 if and only if

Thus, ek depends on (5) if and only if the column vector b k of (11) has the length 1 and is orthogonal to other column vectors of (11). For m < n, one finds readily by this method a unit-vector which is independent of (5).

Let

omit the rows of A which depend on b1, and let

then omit the rows of A which depend on b1, b2, let

and continue until the orthogonal system (2) of independent n-vectors is obtained which forms a base of R(A).

This orthogonal system can on extended further with the aid of an n-vector k which does not depend on it. One may choose this vector, for example, out of the unit-vectors. The procedure can be repeated and will stop after n steps as the n orthogonal n-vectors

are independent and form therefore a base of the complete vector space of Rank n.

An arbitrary n-vector can be represented by

a is a solution of the system of homogeneous equations with the matrix A if and only if it is orthogonal to R(A), i.e., if it is orthogonal to b1, ··· ,b r. Hence a is a solution if and only if

hence the solutions are the n-vectors which depend on

This proves 1.5 Proposition 3 without reference to the method of sweep out.

The method of orthogonalization has certain advantages over the method of sweep out, as it yields bases of the vector spaces R(A) and X(A) which are orthogonal systems; however, it is not very convenient for practical calculations. Sweep out employs only rational operations whereas during orthogonalization a square root must be found at every step.

1.8 Determinants Let A = ((aik)) be a square matrix with n rows and n columns. It is possible to allot to A a certain number which therefore is a function of the matrix. You may also consider this value as a function of the n² elements of the matrix or as a function of the n column vectors, because, when the elements or the column vectors are given in their proper order, the matrix is also known. Obviously, functions of n² variables can be formed in infinitely different ways. The particular function which will be considered here is called a determinant and is denoted by

If one view a1, ··· , an as an abbreviation for the vertical column vectors of ((aik)), the notation given on the right hand side becomes identical to the notation on the left hand side. However, it is useful to express the determinant as a function of the column vectors, since it will be shown to be a linear function of those n-vectors.

The determinant is supposed to have the properties:

It is not obvious that there exists a function which has these properties. It may be that they contradict one another. For example, if in (b) the restriction k ¹ m is omitted, the conditions contradict each other, since it follows from (a) and (c) that det (2e1, ··· , en) = 2, whereas (b) and (c) would yield 1. To start with, we assume that such a function exists and derive its properties ; existence and uniqueness will be proved later on.

Proposition 1: If anyone of the n-vectors a1,···,an is equal to 0, then det A=0.

Proof: Let ai = 0, whence

ai = 0ai , det A = det (a1, ··· , 0ai, ., ,, «,) = 0 det A = 0.

Proposition 2: A determinant is not altered when ak is replaced ak + cai.

Proof: For c = 0, the proposition obvious, for e ¹ 0,

detA = 1/cdet(···,cai,···,ak,···) = 1/cdet(···,cai,···,ak+cai ,···) = det(···,cai,···,ak+cai ,···).

This proposition shows that column additions do not alter a determinant, where column-addition is understood in the same sense as row addition in 1.4. If any column depends on other columns, say an = c1a1 + ··· + cn-1an-1, one can reduce it by n = 1 column additions to the n-vector 0, whence Proposition 1 yields:

Proposition 3: When the column-vectors are interdependent, a determinant is equal to zero.

Another consequence of Proposition 2 is:

Proposition 4. An interchange of two columns changes a determinant's sign.

Proof:

This proposition yields

Proposition 5: An even permutation of columns does not alter a determinant, an odd permutation alters a its sign.

Proposition 6: according to whether the permutation 1, ··· , in is even or odd.

A determinant with two equal columns is zero ; this is a special case of Proposition 3 or 4).

Proposition 7 Let B be the matrix obtained by replacement of a particular column ai of A by b, then

Proof: Let the n - 1column vectors ak (for ¹ 1) be dependent, then each of the three determinants in (2) is zero and the formula applies. Let a1 depend on the n - 1 n-vectors ak, then det A=0, and the determinant on the right hand side of (2) can be reduced by column addition to det B. Without loss of generality, we call therefore assume that a1,···,an are independent ; hence they form a base of the vector space containing every n-vector; hence b depends on them, say b=c1a1+···+cnan. Using column addition, you may reduce the column b in B to ciai and similarly in the determinant on the right hand side of (2). Hence both the sides are equal to (1 + c1) det A and the proposition has been proved.

Let B be the matrix obtained by replacing a) a1 by b and let

then

Repetition of this procedure yields

This result is expressed by

Proposition 8: A determinant is a linear function of each of its column vectors.

Consider especially and let

then Proposition 8 yields

Proposition 9:

The number Ak is called the cofactor of ak. If Ai = (A1i, ··· ,Ani)) is the n-vector A, then det A is the scalar product

det A = Sai Ai.

Especially,

Ak1 = det(e k, a2, ··· , an).

By Proposition 9, Ak1 beomes

and hence

After n repetitions, this process yields

where k, s, ..., q assume the values 1, ··· , n independently. The determinants on the right hand side take the values +1, -1, 0 accordingly to whether k, i, s, ··· , q form an even or an odd permutation of 1, ··· , n or subscripts are the same. Hence follows

Proposition 10: If there exists a function det A which satisfies the conditions (a), (b), (c) above, then det A is equal to

where the sum has to be taken over all those products ak1, ··· , aqn for which the upper subscripts are permutations of 1, ··· , n; the signs +/- are used for even/odd permutations.

Theorem 1: The function det A which satisfies the conditions (a), (b), (c) exists and is equal to the function D(A), defined by (7).

Proof: Proposition 10 yields that det A either does not exist or is equal to D(A), whence it must be shown that D(A) satisfies Conditions (a), (b), (c). Thus: (a) If ai is replaced cai, then in every term of the sum (7) exactly one factor is multiplied by c, whence D(A) is replaced by c. (c) If aj = e j for j = 1, ··· , n, then ajj = 1, ajk = 0 for j ¹ k. Hence a11, a22, ··· , ann = 1, whereas the other terms in the sum (7) vanish. Since 1, 2, ··· , n is an even permutation, D(A) = +1. (b) In order to prove that (b) holds for D(A), prove first that D(A) satisfies Proposition 4. If in A the subscripts i and k are interchanged, the terms in (7) are not altered, but every even permutation becomes odd and conversely, whence D(A) is transformed into D(A). If ai = ak, the interchange of i and k cannot alter D(A) ; hence D(A) = -D(A) = 0. If in A the column vector am is replaced by am + an», in every term ak1, ··· , aqn the factor arm is replaced by a rn + a rm and the term is therefore increased by ak1, ··· , ark, ··· , aqr, The sum of these additional terms taken with the corresponding sign ± is equal to the determinant obtained by replacement of am by ak.

Hence D(A) is replaced hy D(A) + det (a1,·· , ak,···, ak,···,···, an) = D(A) and the theorem has been proved.

Hence Propositions 1 to 10, which had been proved under the assumption that a function satisfying the conditions (a), (b), (c) exists, apply unconditionally. Moreover, it follows from Proposition 10 that this function (determinant) is determined uniquely.

Proposition 11:

 

Proof: If ai is replaced in A by aj, the determinant is zero. Hence (8) follows from (5).

Proposition 12: Let a1, ···,an be the column vectors, a1, ···,an the row vectors of A. Then

Proof: In every term ak1, ··· ,aqm of (7), the factors can be ordered in such a way that the superscripts have their natural order 1, ··· , n; then the order of the subscripts is the inverse permutation of h, ..., q. A permutation is odd (even ) when its inverse permutation is odd (even), whence

where j, ··· , p takes all the permutations of 1, ··· , n and the sign ± has to be taken depending on whether the permutation is even or odd. Hence det (A) = a1, ···,an.

It follows from this proposition that in every proposition which applies to determinants, we may interchange the column and row vectors (i.e., the upper and lower indices). This duality of rows and columns in a determinant can be extended to the cofactors Ak with the aid of

Proposition 15: If one replaces in the matrix A the element aki by 1 and those elements which are in the same row or in the same column as aki by 0, the determinant of this matrix is equal to

Proof: Let ai be replaced by e k,then every term aj1, ··· , ag1, ··· , aqn of (7) js zero unless g = k. Since therefore in every non-vanishing term the factor a occurs, no factor akj, j ¹ i can occur. Hence Ak does not depend on the elements akj which therefore may be replaced by any value, e.g, by 0. In exactly the same manner, it can be proved that if in A the row ak is replaced by the n-vector e i, the determinant becomes independent of agi for g ¹ k. Hence follows the proposition. The essence of some of the propositions, proved above, is given by the following theorem and subsequent formulae.

Theorem 2: det A is a linear function of its row vectors (column vectors). It is invariant to row addition (column addition), to even permutation of the rows (columns) and to interchanges of rows with the columns with corresponding indices. det A only changes its sign if an odd permutation of the rows (columns) is applied. If rank A < n, then det A = 0.

last next