STEP 11

SYSTEMS OF LINEAR EQUATIONS 1

Solution by elimination

Many physical systems can be modelled by a set of linear equations which describe relationships between system variables. In simple cases, there are two or three variables; in complex systems (for example, in a linear model of the economy of a country) there may be several hundred variables. Linear systems also arise in connection with many problems of numerical analysis. Examples of these are the solution of partial differential equations by finite difference methods, statistical regression analysis, and the solution of eigenvalue problems (see, for example, Gerald and Wheatley ( 1994)). A brief introduction to this last topic is given in STEP 17.

Hence there arises a need for rapid and accurate methods for solving systems of linear equations. The student will already be familiar with solving by elimination systems of equations with two or three variables. This Step presents a formal description of the Gauss elimination method for n-variable systems and discusses certain errors which might arise in their solutions. Partial pivoting, a technique which enhances the accuracy of this method, is discussed in. in the next Step.

1. Notation and definitions

We first consider an example in three variables:

,

a set of three linear equations in the three variables or unknowns x, y, z. During solution of such a system, we determine a set of values for x, y and z which satisfies each of the equations. In other words, if values (X, Y, Z) satisfy al1 equations simultaneously, then they constitute a solution of the system.

Consider now the general system of n equations in n variables:

Obviously, the dots indicate similar terms in the variables and the remaining (n - 3) equations.

In this notation, the variables are denoted by x1, x2, . . , xn; they are sometimes referred to as xi , i = 1, 2, · ·. , n. The coefficients of the variables may be detached and written in a coefficient matrix:

The notation aij will be used to denote the coefficient of xj in the i-th equation. Note that aij occurs in the i-th row and j-th column of the matrix.

The numbers on the right-hand side of the equations are called constants; they may be written in a column vector:

The coefficient matrix may be combined with the constant vector to form the augmented matrix:

As a rule, one works in the elimination method directly with the augmented matrix.

2. The existence of solutions

For any particular solution of n linear equations, there may be a single solution (X1, X2, . . . Xn), or no solution, or an infinity of solutions. In the theory of linear algebra, theorems are given and conditions stated which allow to make a decision regarding the category into which a given system falls. We shall not treat the question of existence of solutions in this book, but for the benefit of students, familiar with matrices and determinants, we state the theorem:

Theorem: A linear system of n equations in n variables with coefficient matrix A and non-zero constants vector b has a unique solution, if and only if the determinant of A is not zero.

If b = 0, the system has the trivial solution x = 0. It has no other solution unless the determinant of A is zero, when it has an infinite number of solutions.

If the determinant of A is non-zero, there exists an n´ n matrix, called the inverse of A (denoted by A-1) such that the matrix product of A-1 and A is equal to the n ´ n-identity or unit matrix I. The elements of the identity matrix are 1 on the main diagonal and 0 elsewhere. Its algebraic properties include Ix = x for any n´1 vector x, and IM = MI = M for any n´n matrix M. For example, the 3´3 identity matrix is

Multiplication of the equation Ax = b from the left by the inverse matrix A-1 yields A-1Ax = A-1b, whence the unique solution is x = A-1b (since A-1A = I and Ix = x). Thus, in principle, a linear system with a unique solution may be solved by first evaluating A-1 and then A-1b. This approach is discussed in more detail in the optional Step 14. The Gauss elimination method is a more general and efficient direct procedure for solving systems of linear equations.

3. Gauss elimination method

In Gauss elimination, the given system of equations is transformed into an equivalent system which has upper triangular form; this form is readily solved by a process called back-substitution. We shall demonstrate the process by solving the example of Section 1.

Transformation into upper triangular form:

First stage: Eliminate x from equations R2 and R3 using equation R1.

Second stage: Eliminate y from R3' using R2':

The system, now in upper triangular form, has the coefficient matrix:

Solution by back-substitution

The system in upper triangular form is easily solved by obtaining z from R3", then y from R2", and finally x from R1". This procedure is called back-substitution. Thus:

4. The transformation operations

During transformation of a system to upper triangular form, one or more of the following elementary operations are used at every step:

  1. Multiplication of an equation by a constant;
  2. Subtraction from one equation some multiple of another equation;
  3. Interchange of two equations.

Mathematically speaking, it should be clear to the student that performing elementary operations on a system of linear equations leads to equivalent systems with the same solutions. This statement requires proof which may be found as a theorem in books on linear algebra (cf., for example, Anton (1993)). It forms the basis of all elimination methods for solving systems of linear equations.

5. General treatment of the elimination process

We will now describe the application of the elimination process to a general n´ n linear system, written in general notation, which is suitable for implementation on a computer (Pseudo-code). Before considering the general system, the process will be described by means of a system of three equations. We begin with the augmented matrix, and display in the column (headed by m) the multipliers required for the transformations.

First stage: Eliminate the coefficients a21 and a31, using row R1:

Second stage: Eliminate the coefficient a32 using row R2:

The matrix is now in the form which permits back-substitution. The full system of equations at this stage, equivalent to the original system, is

Hence follows the solution by back-substitution:

We now display the process for the general n ´ n system, omitting the primes (') for convenience. Recall that the original augmented matrix was:

First stage: Eliminate the coefficients a21, a31,. . . , an1 by calculating the multipliers

and then

This leads to the modified augmented system

Second stage: Eliminate the coefficients a32, a42, . . . , an2 by calculating the multipliers

and then

whence

We continue to eliminate unknowns, going on to columns 3, 4, . . so that by the beginning of the k-th stage we have the augmented matrix

k-th stage: Eliminate ak+l,k, ak+2,k, . . , ak+n,kl by calculating the multipliers

and then

At the end of the k-th stage, we obtain the augmented system

Continuing in this way, we obtain after n -1 stages the augmented matrix

Note that the original coefficient matrix has been transformed into the upper triangular form.

We now back-substitute: We find xn = bn/ann, and then

Notes

1. The diagonal elements akk, used at the k-th stage of the successive elimination, are referred to as pivot elements
.

2. In order to proceed from one stage to the next, the pivot elements must be non-zero, since they are used as divisors in the multipliers and in the final solution. If at any stage a pivot element vanishes, rearrange the remaining rows of the matrix, in order to obtain a non-zero pivot; if this is not possible, then the system of linear equations has no solution.

3. If a pivot element is small compared with the elements in its column which have to be eliminated, the corresponding multipliers used at that stage will be larger than one in magnitude. The use of large multipliers in elimination and back-substitutions leads to magnification of round-off errors; this can be avoided by using the partial pivoting described in Step12.

4. During the elimination method, an extra check column, containing the sums of the rows, should be computed at each stage (cf. the following example). Its elements are treated in exactly the same way as the equation coefficients. After each stage is completed, the new row sums should equal the new check column elements ( within roundoff error).

6. A numerical example with check column

Consider the system of equations:

0.34x1 - 0.58x2 + 0.94x3 = 2.0
0.27x1 + 0.42x2 + 0.134x3 = 1.5
0.20x1 - 0.51x2 + 0.54x3 = 0.8

The manipulations leading to the solution are set out in tabular form below. For the purpose of illustration, the calculations have been executed in 3D floating point arithmetic. For example, at the first stage, the multiplier 0.794 arises as follows:

while the value -0.0900 arises from the sequence of operations

Work with so few significant digits leads to errors in the solution, as is shown below by an examination of the so called residuals.

Next, we back-substitute:

As a check, we sum the original three equations to obtain 0.81x1 - 0.67x2 = 0.43. Inserting the solution yields 0.81´ 0.89- 0.67 ´ 2.1 + 1.61 ´ 3.03 = 4.3049.

In order to assess the accuracy of the solution, we insert the solution into the left-hand sides of each of the original equations and compare the results with the right-hand side constants. For the present example, we find:

which yields the residuals

It seems reasonable to conclude that, if the residuals are small, the solution is a good one. As a rule, this is true. However, at times, small residuals do not indicate that a solution is acceptable. This aspect is discussed under the heading ill-conditioning in Step 12.

Checkpoint

  1. What types of operations are permissible during the transformation of the augmented matrix?
  2. What is the final form of the coefficient matrix, before backsubstitution begins?
  3. What are pivot elements? Why must small pivot elements be avoided if possible?

EXERCISES

Solve the following systems by Gauss elimination:

a)
x
1 + x2 - x3 = 0,
2x1 - x2 + x3 = 6,
3x1 + 2x2 - 4x3 = -4.

b)
5.6 x + 3.8 y + 1.2z = 1.4,
3.lx + 7.ly - 4.7z = 5.1,
1.4x - 3.4y + 8.3z = 2.4.

c)
2x + 6&127 + 4z = 5,
6x + 19y + 12z = 6,
2x + 8y + 14z = 7.

d)
1.3x + 4.6y + 3.lz= -1,
5.6x + 5.8y + 7.9z = 2,
4.2x + 3.2y + 4.5z = -3.

Answers

last next