30. Numerical solution of linear equations

The theory of linear equations states that evaluation of solutions of every system of linear equations demands eventually solution of one or several systems of n linear equations each with n unknowns and a non-zero determinant. As has already been pointed out, while Cramer's Rule yields for such equations a closed expression of the solution in the form of a quotient of determinants, it is hardly useful for practical computations. The reason is, on the one hand, that the evaluation of determinants of higher order is laborious, on the other hand, that during employment of Cramer's Rule to equations with approximated coefficients the propagation of errors is unfavourable.

For this reason, it is preferable to employ for numerical work the method which involves elimination of one unknown from n equations to obtain n-1 equations with n - 1 unknowns, etc. Once you have arrived at a single equation in one unknown, you solve it and obtain the values of the previously eliminated variables by substitution. If you use for this procedure a suitable scheme, you will have a very useful method which also allows control of the operations.

Other methods involve to set out from any approximate solution of a system of equations and then step by step to improve these solutions until the desired accuracy is reached. For this procedure, the first approximation need not be very good; in fact, in the worst case, you can start from arbitrarily chosen values for the unknowns, when you will have more steps to take, and proceed with the iteration until an approximate solution is reached. Let there be given the system of equations

a11x1 + ··· + a1nxn = c1, (30.1)
·············································
an1x1 + ··· + annxn = cn,

or Ax = c with |A| 0. Let x (1) be a first approximation to the solution. You then construct a sequence x (2), x (3), ··· , using a regular (n,n)-matrix P yet to be determined:

x (n + 1), = x (n), + P(Ax (n), - c), ( = 1, 2, ···). (30.2)

You must discover whether the sequence x (n) converges towards the solution v of (30.1)*. Once a term in the sequence, say x (m), equals the exact solution, that is Ax (m) - c = 0, then (30.2) tells that all the terms following x (n) will be equal to x (n); conversely, x (n+1)= x (n) only when Ax (n)- c = 0, that is, when the exact solution has been reached.

If you set z (n) = x (n) - x, then, because Ax - c = 0,

z (n+1) = x (n+1) - x = x (n) - x + P(Ax (n)- c) =
= x (
n) - x + P(Ax (n)- c) - P (Ax - c) =
= z (n)+ P(Ax (n)- x) = (PA + E)+ Pz(n).

Hence

z (n+1) = (PA + E)nz (1),

and convergence occurs when

* The convergence of a sequence of (r,s)-matrices is here defined as follows: If

M (n)= (m(n)ik) and M = (mik) (i = 1, ··· , r; k = 1, ··· , s),

then is element by element convergence, that is,

As a rule, you select for the matrix P the diagonal matrix with the elements

pii = -1/aii (1, ··· , n).

You can do so, because you may assume that all aii 0; if that were not the case at the start, you can rearrange the equations (30.1), since |A| 0. With this matrix P, you have

A sufficient condition of convergence is now that the sums of the absolute values of the elements of each row of this matrix are at the most equal to a number q < 1, that is

For the proof, let PA + E = B = (bik). Then, in the matrix B, the sum of the absolute values of the elements in each row is at most equal to q, and certainly |bik| q. For the elements of the matrix B², you have the estimate

|bi1b1k + ··· + binbnk | |bi1| |b1k| + ··· + |bin| |bnk| (|bi1|) + ··· + |bin|)qq².

You use induction to prove in an analogous manner that the absolute values of all the elements of the matrix Bn are at most equal to qn. In fact, if you set Bn-1=(bik(n-1)) and you have proved already that |bik(n-1)| qn-1, then you have for the elements of Bn =(bik(n))

|bik(n)| = |bi1bik(n-1) + ··· + binbnk(n-1)|
|b
i1| |bik(n-1)| + ··· + |bin| |bnk(n-1)|
(|b
i1| + ··· + |bin|)qn-1 qn.

From q < 1 follows therefore and hence

Another sufficient condition of convergence is that the sum of the absolute values of the elements of every column in (30.3) is at most q < 1. The proof follows simply from the fact that with also .

The detailed iteration formula for this method is

xi(n+1)=xi(n)-(ai1xi(n)+··+ai,i-1xi-1(n)+ai,ixi(n)+ai,i+1xi+1(n)+···+ai,nxn(n)-ci,)/aii= = -(ai1xi(n)+···+ai,i-1xi-1(n)+ai,ixi(n)+ai,i+1xi+1(n)+···+ai,nxn(n)-ci,)/aii.

Another approximation method replaces immediately in the last formula the values of the n-th approximation by the already computed value of the (n+1)-th approximation, whence

xi(n+1)=-(ai1xi(n+1)+···+ai,i-1xi-1(n+1)+ai,i+1xi+1(n)+···+ai,nxn(n)-ci,)/aii.(30.4)

We will also present for this method a sufficient convergence condition. For the sake of simplicity, we will confine consideration to a system of equations with real coefficients and, correspondingly, to a real solution. We do so because, firstly, this is the case in most applications, secondly, you can reduce every system of equations with complex coefficients by separation of real and imaginary parts to a system of twice as many equations with twice as many unknowns.

In order to be able to formulate the convergence condition, we require several concepts, which will only be discussed in detail in Part VIII. An (n,n)-matrix is symmetric when A' = A. The product formed out of a column y with the n variables y1, ··· , yn

is called the quadratic form, associated wit the matrix A. It can happen that always Q(y1, ··· , yn) 0 whatever real numbers are substituted for y1, ··· , yn and that Q(y1, ··· , yn) = 0 only for y1 = ··· = yn= 0. In that case, the quadratic form Q or also the matrix A is said to be positive definite. We will prove in Section 34 that the values of the determinant of a positive definite matrix as well as of all its main diagonal elements are positive.

The method of approximation with the interation formula (30.4) is only sure to converge if the coefficient matrix A of the system of equations is symmetric and positive definite. In order to prove this statement, consider the expression

For sufficiently large values of |x1|, ··· , |xn|, the quadratic sum Q is larger than that of the linear term Scixi and, since Q is positive definite, F becomes for such values arbitrarily large. Consequently, F must have a minimum, which you find from the equations

This is the same system as the (30.1). Since that system has a unique solution, F has a minimum.

We will now show that on replacement of xi(n) by xi(n+1), according to (30.4), there holds the inequality

As long as F has not yet reached its minimum, that is, as long as you have not reached the exact solution of (30.1), you approach at each iteration step the minimum, that is, the exact solution. You have

Since, as has been pointed out above,always aii>0 for a positive definite matrix A , you have DF < 0.

The convergence condition, by which the matrix of the system of equations must be symmetric and positive definite, appears at first to be very restrictive. However, you can confirm without great effort that every arbitrary system of equations with a non-zero determinant can be replaced by a system with the same solution and a symmetric and positive definite matrix. Let

Ax = c (30.5)

be an arbitrary system of equations with |A| 0. Multiply by A' from the left to obtain

A'Ax = A'c (30.6);

conversely, (30.5) follows again from (30.6) by multiplication by A'-1. Hence the two systems (30.5) and (30.6) have the same solution x. The matrix A'A of the system (30.6) is now symmetric and positive definite. In fact, firstly,

(A'A)' = A'A" =A'A.

Secondly, setting Ay = z, you find

y'A'Ay = z'z = z1² + ··· + zn²,

whence y'A'Ay 0. Obviously, the equality sign holds only for z = 0; however, from z = Ay = 0 follows y = 0, since we have assumed that |A| 0.

Exercises

22. Find the general solution of the system of equations

x1 + 2x2 + x3 - x4 = 1,
x
2 + x3 + x4 + x5 + x6 = 1,
x
1 + x2 + x4 + 2x6 = 2,
2x
2 + 2x3 + x4 - x6 = 0.

23. The equations of Exercise 22 with the right hand side of the last equation being 1 instead of 0.

24. Find the Rank of the matrix B of Exercise 18 in dependence on the rank of the matrix A.

Answers

last next