STEP 27

CURVE FITTING 2*

Least squares and linear equations*

In this optional Step, we consider certain advanced topics concerning systems of linear equations that arise out of the discrete least squares problem discussed in the preceding Step.

  1. Pseudo-inverse

    We recall that we are given data points (x1, y1), . . , (xn, yn) and we wish to find parameters c1, . . , ck multiplying the basis functions f1, ···, f k such that

    is minimized, where the errors ei are given by

    Ideally, we would like S to be zero, i.e., each of the ei to be zero. If c and y are the vectors

    ,

    respectively, and A is the n x k matrix

    then requirement S = 0 is equivalent to

    .

    If n > k, we have an overdetermined system of linear equations, since the number of equations is larger than that of the unknowns. As pointed out in the preceding Step, it is generally not possible to find a solution to this system, but we may find c1, . . , ck such that Ac is 'close' to y (in the least squares sense). If there is a solution of the least squares problem, then we write

    The matrix is called the pseudo-inverse or generalized inverse of A. We note that when n=k and A is invertible (i.e., A has an inverse), then

  2. The normal equations

    In order to minimize S, we must minimize

    .

    If we take for j = 1, 2, . . . , k the partial derivatives with respect to cj and set them equal to zero, we get the normal equations

    or, equivalently,

    .

    If M is a matrix (or vector) with (i, j)-th element mij, then we know from linear algebra that its transpose, denoted by MT, is the matrix with (i, j)-th element mji, i.e., MT is obtained from M by swapping rows and columns. For example, if

    then

    Evidently, the normal equations may be written

    ,

    where the matrix A has the entries . As an example, let . Then

    and

    ,

    so that

    ,

    where in the sums i runs from 1 to n and

    .

    Then the system is precisely the system given in Exercise 3 of the preceding Step.

    If the matrix is invertible, then there is a solution to the least squares problem. We then have , so that the psceudo-inverse A+ is given by

    .

    The case when does not have an inverse is beyond the scope of this text. It is in this situation that the importance of the pseudo-inverse becomes apparent.

  3. QR factorization

    If the matrix has an inverse, then, in principle, the normal equations can be solved to find the least squares solution. However, as we have noted in the preceding Step, the normal equations may be ill-conditioned. If so, then an alternative technique for finding the least squares solution is to use a QR factorization of A. In this technique, A is expressed as

    ,

    where Q is an n x n orthogonal matrix and R is an n x k matrix the first k rows of which form an upper triangular matrix and the last n - k rows of which contain zeros. (The inverse of an orthogonal matrix M is equal to its transpose, i.e., ) A technique for finding a QR factorization of a matrix will be given in the next section.

    As an example, let

    .

    Then we can write A = QR, where Q is given to 5d by

    and

    .

    It may be verified that is the 6 x 6 identity matrix.

    Recalling that Ac = y is an overdetermined system of equations, we substitute A = QR and multiply through by to obtain

    .

    Since the matrix R has the form

    ,

    where is an (n - k) x k matrix of zeros, we partition as

    .

    The vector q1 is k ´ 1, while the vector q2 is (n - k) ´ 1, and it turns out that the solution of the least squares problem is obtained by solving the equation

    Thus, once a QR factorization of A has been obtained, we can find the least squares approximation by solving an upper triangular system by back-substitution.

    For example, suppose we wish to fit a parabola to the experimental data, given in Section 3 of Step 26. The relevant matrix A with its QR factorization has been given above, and we see that

    .

    The calculation of yields

    .

    Upon solving , we find to 3D the same solution as before, namely, c3=-0.357, c2=2.700, and c1=-1.200.

  4. The QR factorization process

    A complete explanation of how to determine a QR factorization is beyond the scope of this course. However, we will now give a brief outline which should enable the reader to write an appropriate computer program. (The calculations are normally too complex for hand calculations!) Modifications of the process, in order to improve the computational efficiency, are usually implemented in computer software packages. For instance, it is not necessary to calculate Q explicitly, in order to obtain the system for the least squares problem.

    Most techniques of QR factorization are based on orthogonal transformations, in which A is pre-multiplied by a sequence of orthogonal matrices which successively reduce the elements below the main diagonal in each column to zero. There are a number of ways of choosing these transformations. Here we shall only consider Householder transformations, in which we pre-multiply A by Householder matrices.

    An n ´ n Householder matrix has the form

    ,

    where w is an n x 1 vector, such that

    .

    Note that H is symmetric and orthogonal

    In order to find a QR factorization of A, we apply to it appropriate Householder transformations which transform it into R. If we set , this may be achieved by calculating the sequence

    for , where is a Householder matrix. The effect of this Householder matrix is to make the last elements in the -th column of zero. The final matrix A(k) is the required R.

    The corresponding Householder matrix is of the form , where the first components of are zero. If the components of the -th column of are , then the
    -th element of should be taken to be

    ,

    where

    .

    Since can be either positive or negative, it is best to choose the sign to be opposite to , i.e., to assume that is positive, if is negative and vice versa. This choice maximizes and minimizes round-off errors. The remaining elements of are given by

    .

    Finally, this matrix is given by the product from which follows Q by transposition. However, as has been pointed out earlier, it is not necessary to find Q explicitly, in order to obtain the least squares solution. Instead, we set y(0) = y, and, while calculating , we may also obtain

    .

    The final result is a transformation of the original system into the system i.e,, the system . In fact, the calculations may be carried out without a need to explicitly form the Householder matrices. Once we have found , we know that

    ,

    i.e., may be evaluated by first calculating In a similar manner, may be found without explicit evaluation of

    Checkpoint

    1. If A is a matrix with elements how may the normal equations be expressed in terms of A, c, and y?
    2. How may the the least squares solution be found by means of QR factorization of A?
    3. What type of transformations are used to produce a QR factorization?

    EXERCISES

    1. Given the data points let Write down the matrix A with and obtain the normal equations. Verify that these normal equations are the same as those obtained in Exercise 3 of the preceding Step.
    2. For A of Exercise l, find H(1) (take A(0)) = A) and hence calculate A(1) = H(1)A. Verify that the second, third, and fourth components in the first column of this last matrix vanish.

Answers

last next