In this optional Step, we consider certain advanced topics concerning systems of linear equations that arise out of the discrete least squares problem discussed in the preceding Step.
We recall that we are given data points (x1, y1), . . , (xn, yn) and we wish to find parameters c1, . . , ck multiplying the basis functions f1, ···, f k such that
![]()
is minimized, where the errors ei are given by

Ideally, we would like S to be zero, i.e., each of the ei to be zero. If c and y are the vectors
,
respectively, and A is the n x k matrix

then requirement S = 0 is equivalent to
.
If n > k, we have an overdetermined system of linear equations, since the number of
equations is larger than that of the unknowns. As pointed
out in the preceding Step, it is generally
not possible to find a solution to this system, but we
may find c1, . . , ck
such that Ac is
'close' to y (in the least squares sense). If there is a solution
of the least squares problem,
then we write
![]()
The matrix
is called the pseudo-inverse or generalized
inverse of A. We
note that when n=k and A is invertible
(i.e., A has an inverse), then
![]()
In order to minimize S, we must minimize
.
If we take for j = 1, 2, . . . , k the partial derivatives with respect to cj and set them equal to zero, we get the normal equations
![]()
or, equivalently,
.
If M is a matrix (or vector) with (i, j)-th element mij, then we know from linear algebra that its transpose, denoted by MT, is the matrix with (i, j)-th element mji, i.e., MT is obtained from M by swapping rows and columns. For example, if

then

Evidently, the normal equations may be written
,
where the matrix A has the
entries
. As an example, let
. Then

and
,
so that
,
where in the sums i runs from 1 to n and
.
Then the system
is precisely
the system given in Exercise
3 of the preceding Step.
If the matrix
is invertible,
then there is a solution to the least squares problem. We
then have
, so that the psceudo-inverse A+
is given by
.
The case when
does not have
an inverse is beyond the scope of this text. It is in
this situation that the importance of the pseudo-inverse becomes apparent.
If the matrix
has an inverse,
then, in principle, the normal equations can be solved to
find the least squares solution. However, as we have
noted in the preceding Step, the normal equations may be ill-conditioned. If so, then an alternative technique for
finding the least squares solution is to use a QR factorization of A. In this technique, A is
expressed as
,
where Q is an n
x n orthogonal matrix and R is an n x k matrix
the first k rows of which form an upper triangular matrix and the last n - k rows of which
contain zeros. (The inverse of an orthogonal
matrix M is equal
to its transpose, i.e.,
) A technique
for finding a QR factorization of a matrix will be
given in the next section.
.
Then we can write A = QR, where Q is given to 5d by

and
.
It may be verified that
is the 6 x 6 identity matrix.
Recalling that Ac = y is an overdetermined system of
equations, we
substitute A = QR and multiply through by
to obtain
.
Since the matrix R has the form
,
where
is an (n - k) x k
matrix of zeros, we partition
as
.
The vector q1 is k ´ 1, while the vector q2 is (n - k) ´ 1, and it turns out that the solution of the least squares problem is obtained by solving the equation
![]()
Thus, once a QR factorization of A has been obtained, we can find the least squares approximation by solving an upper triangular system by back-substitution.
For example, suppose we wish to fit a parabola to the experimental data, given in Section 3 of Step 26. The relevant matrix A with its QR factorization has been given above, and we see that
.
The calculation of
yields
.
Upon solving
, we find to 3D the same solution as
before, namely, c3=-0.357,
c2=2.700, and c1=-1.200.
A complete explanation of how to
determine a QR
factorization is
beyond the scope of this course. However, we will now
give a brief outline which should enable the reader to
write an appropriate computer program. (The calculations
are normally too complex for hand calculations!)
Modifications of the process, in order to improve the computational efficiency, are usually implemented in computer software packages. For instance, it is not necessary to
calculate Q explicitly, in order to obtain the
system
for the least squares problem.
Most techniques of QR factorization are based on orthogonal transformations, in which A is pre-multiplied by a sequence of orthogonal matrices which successively reduce the elements below the main diagonal in each column to zero. There are a number of ways of choosing these transformations. Here we shall only consider Householder transformations, in which we pre-multiply A by Householder matrices.
An n ´ n Householder matrix has the form
,
where w is an n x 1 vector, such that
.
Note that H is symmetric and orthogonal
![]()
In order to find a QR
factorization of A, we apply to it appropriate Householder
transformations
which transform it into R. If we set
, this may be achieved by calculating the
sequence
![]()
for
, where
is a Householder matrix. The effect of this
Householder matrix is to make the last
elements in the
-th column of
zero. The final matrix A(k) is the
required R.
The corresponding Householder
matrix
is of the form
, where the first
components of
are zero. If the components of the
-th column of
are
, then the
-th element of
should be taken
to be
,
where
.
Since
can be either
positive or negative, it is best to choose the sign to be
opposite to
, i.e., to assume that
is positive, if
is negative and
vice versa. This choice maximizes
and minimizes round-off errors. The remaining elements of
are given by
.
Finally, this matrix is given by
the product
from which follows Q by
transposition. However, as has been pointed out earlier,
it is not necessary to find Q explicitly, in order
to obtain the least
squares solution.
Instead, we set y(0) = y, and, while calculating
, we may also obtain
.
The final result is a
transformation of the original system
into the system
i.e,, the
system
. In fact, the calculations may be
carried out without a need to explicitly form the Householder
matrices. Once
we have found
, we know that
,
i.e.,
may be
evaluated by first calculating
In a similar
manner,
may be found without explicit
evaluation of ![]()