STEP 26

CURVE FITTING 1

1. Least squares

Scientists often wish to fit a smooth curve to experimental data. Given (n + 1) points, an obvious approach is to use the interpolating polynomial of degree n, but when n is large, this is usually unsatisfactory. Better results are obtained by piecewise use of polynomials, i.e., by fitting lower degree polynomials through subsets of the data points. The use of spline functions, which, as a rule, provide a particularly smooth fit, has become widespread (Step 28 ).

A rather different, but often quite suitable approach is a least square fit, in which, instead of trying to fit points exactly, a polynomial of low degree (often linear or quadratic) is obtained which fits the points closely (after all, the points themselves may not, in general, be exact, but subject to experimental error).

2. An illustration of the problem

Suppose we are studying experimentally the relationship between two variables x and y - for example, quantities x of drug injected and observed responses y, reorded in a laboratory experiment. By carrying out the appropriate experiment, say, six times, we obtain six pairs of values (xj, yj), which can be plotted on a diagram such as Figure 11(a).

Fig. 12 Fitting a straight line and a parabola

We may believe that the relationship between the variables can be described satisfactorily by a function y = f (x), but that the y-values, obtained experimentally, are subject to errors (or noise). Therefore one arrives at the mathematical model:

with n data, where f (xi ) are the values of y, corresponding to the value of xi, used in the experiment, and ei is the experimental error involved in the measurement of the variable y at the point. Thus, the error in y at the observed point is

In the problem of curve fitting, we use the information of the sample data points to determine a suitable curve (i.e., find a suitable function f ) so that the equation y = f (x) gives a description of the (x, y) relationship, in other words, it is hoped that predictions made by means of this equation will not be too much in error.

How does on choose the function f ? There is an unlimited range of functions available. Figure 11(b) shows four possibilities. The polygon A passes through all six points; intuitively, however, we would prefer to fit a straight line B, or an exponential curve such as C. The curve D is clearly not a good candidate for our model.

3. A general approach to the problem

Let us, first of all, answer the question regarding the choice of function. Given a set of values (x1, y1), (x2, y2),. . , (xn, yn); we shall pick a function which we can specify completely except for·the values of a set of k parameters c1, c2, .. , ck; we shall denote this function by . We then choose values for the parameters which will make the errors at the observation points (xi, yi) as small as possible. Next, we shall suggest three ways by which the phrase as small as possible can be given specific meaning.

Examples of functions to use are:

  1. (Polynomials),
  2. ( Combinations of· sine functions),
  3. ( Combination of cosine functions).

    These examples) may be termed general, linear forms:

  4. , where the functions are a preselected set of functions.

In 1., the set of functions is ; in 2., with a constant chosen to coincide with a periodicity in the data, while in 3., the set is Other functions commonly used in curve fitting are exponential functions, Bessel functions, Legendre polynomials, and Chebyshev polynomials (cf., for example, Burden and Faires (1993)).

4. The meaning of Errors as small as possible

We now present criteria which make precise the concept of choosing a function to make measurement errors as small as possible. We suppose that the curve to be fitted can be expressed in a general linear form, with a known set of functions

The errors at the n data points are:

If the number of data points is less than or equal to the number of parameters, i.e., , it is possible to find values for {c1, c2,. .. ., ck) which make all the errors ei zero. If n is an infinite number of solutions for {ci} which make al1 the errors zero, then an infinite number of curves of the given form pass through all the experimental points; in this case, the problem is not fully determined, i.e., more information is needed to choose an appropriate curve.

If n > k, which, in practice, is mostly the case, then it is not normally possible to make all the errors zero by a choice of the {ci}. There are three possible choices:

  1. A set {ci} which minimizes the total absolute error, i.e., minimize the sum:
  2. a set {ci} which minimizes the maximum absolute error, i.e., minimizes ;
  3. a set {cI} which minimizes the sum of the squares of the errors, i.e., minimizes .

In general, Procedures 1 and 2 are not readily applied. Procedure 3 leads to a linear system of equations for the set {cI}, referred to as the Principle of least squares; it is used almost exclusively.

5. The least squares method and normal equations

In order to apply the principle of least squares, use has to be made of partial differentiation, a calculus technique which may not be known to some readers of this text. For that reason, a general description of the method will not be given until the Step 27(which is optional), but we describe it here and give examples, in order to show how it is used.

The sum of squared errors to be minimized is

The n values of (xi, yi) are the known measurements taken from n experiments. When they are inserted on the right-hand side, S becomes an expression involving only the k unknowns c1, c2, . . , ck. In other words, S may be regarded as a function of the ci, i.e., . The problem is now to choose that set of values {ci} which makes S a minimum.

A theorem in calculus tells us that, under certain conditions which are usually satisfied in practice, the minimum of S occurs when all the partial derivatives

vanish. The partial derivative coincides here with the differential coefficient , while all the other ci are held constant; for instance, if S = 3cl + 5c2, then

Thus, we have to solve the system of k equations:

This system is a set of equations which is linear in the variables cl, c2, . . , ck and is referred to as the normal equations for the least squares approximation. One of the numerical methods presented in Step 11, Step 13, or Step 15 may be used to obtain the required set {cI} which minimizes S. . However, we note that the normal equations may be ill-conditioned, when it is preferable to invoke QR factorization as outlined in the next (optional) Step 27, or to employ orthogonal basis functions (cf. for example, Conte and de Boor ( 1980).

6. Example

The following points were obtained in an experiment:

.

We shall plot the points on a diagram and use the method of least squares to fit through them

a) a straight line, and b) a parabola.

 

The plotted points are shown in Figure 12(a). In order to fit a straight line, we have to find a function y = cl + c2x, i.e., a first degree polynomial which minimizes

Differentiating first with respect to cl (keeping c2 constant) and then with respect to c2 (keeping cl constant), and setting the results equal to zero, yields the normal equations:

We may divide both equations by -2, take the summation operations through the brackets, and rearrange, in order to obtain:

We see that, in order to obtain a solution, we have to evaluate the four sums and insert them into these equations. We can arrange the work in a table as follows (the last three columns are devoted to fitting of the parabola and the required sums are in the last row):

The corresponding normal equations for fitting a straight line are:

The solutions to 2D are c1 = 2.13 and c2 = 0.20, whence the required line is (Figure 12(b)):

In order to fit a parabola, we must find the second degree polynomial

which minimizes

.

Taking partial derivatives and proceeding as above we obtain the normal equations:

Inserting the values for the sums (see the table above), we obtain the system of linear equations:

.

The solution to 3D is c1 = -1.200, c2 = 2.700, and c3 = -0.357. The required parabola is therefore (retaining 2D):

.

it is also plotted in Figure 13(b). Obviously, the parabola is a better fit than the straight line!

Checkpoint

  1. What is meant by the term error at a point?
  2. Give three criteria which may be applied to choose the set (ci).
  3. How are the normal equations obtained?

EXERCISES

  1. For the example above (the data points are shown in Figure 12(a)) compute the value of S, the sum of the squares of the errors at the points, from 1. the fitted line, and 2. the fitted parabola. Plot the points on graph paper, and fit a straight line by eye (i.e., use a ruler to draw a line, guessing its best position). Determine the value of S for this line and compare it with the value for the least squares line. Fit a straight line by the least squares method to each of the following sets of data:

    a) Toughness x and percentage of nickel y in eight specimens of alloy steel.

    b) Aptitude test marks x, given to six trainee sales people, and their first-year sales y in thousands of dollars.

    For both sets of data, plot the points and draw the least squares line. Use the lines to predict the % - nickel of a specimen of steel the toughness of which is 38, and the likely first-year sales of a trainee sales person who obtains a mark of 48 in the aptitude test.

  2. Obtain the normal equations for fitting a third-degree polynomial y = c1 + c2x + c3x2 + c3x3 to a set of n points. Show that they can be written in matrix form (all sums being from i =1 to i=n

    Deduce the matrix form of the normal equations for fitting a fourth-degree polynomial.

  3. Use the least squares method to fit a parabola to the points (0,0), (1,I), (2,3), (3,3), and (4,2). Find the value of S for this fit.
  4. Find the normal equations which arise while fitting by the least squares method an equation of the form y = c1 + c2sin x to the set of points Solve them for c1 and c2.

Answers

last next