Fitting data points to polynomial expressions

Suppose we have a set of experimental data that we want to fit with a straight line. For specificity, imagine that we have measured absorbance as a function of concentration in a test of the Beer-Lambert law:

   concentration (M)     absorbance

        0.01                0.095
        0.02                0.202
        0.04                0.413
        0.06                0.629
        0.10                1.101
        0.15                1.735
First let's put this data into a list of (x, y) points:
   data = { {0.01, 0.095}, {0.02, 0.202}, {0.04, 0.413},
            {0.06, 0.629}, {0.10, 1.101}, {0.15, 1.735} }
We can plot the data with
   ListPlot[data]
from which we see that the data looks fairly linear at low concentrations, but starts to curve upward as the concentration increases. (Well, it's actually a bit hard to see this, but it will become more evident once we try to fit the data with a straight line.)
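
If the plot is hard to read, axis labels can help. Here is a small sketch, assuming data is defined as above; the label strings are our own choice, copied from the column headings of the table:

   ListPlot[data, AxesLabel -> {"concentration (M)", "absorbance"}]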

To fit the data to a straight line, type

   Fit[data, {1, conc}, {conc}]
This tells Mathematica to fit the data in terms of two functions of the variable conc; these functions are 1 and conc. You should get the result
   -0.044375 + 11.6875 conc
from which we see that the slope and y-intercept of the line are 11.6875 and -0.044375, respectively. (Note that Mathematica prints the result to its default of six significant figures, even though our data has fewer.)

The general form of the Fit command is:

   Fit[ <data-list>, <function-list>, <variable-list> ]
The first argument is a list of (x, y) data points, in the standard Mathematica form of a nested list. The second argument is a list of functions that Mathematica will use to try to fit the points. The third argument is a list of the independent variables in the data set. In this example, we have only one independent variable, so the variable list is a list with one element: the variable conc. (Of course, we could name the independent variable anything we want, as long as the same name appears in the list of functions and the list of variables.) Later we'll see how we can fit data as a function of two or more independent variables.
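
To illustrate the point about names, here is the same fit with the independent variable called x instead of conc. The symbol x is an arbitrary choice for this sketch, and we assume it has no value assigned to it:

   Fit[data, {1, x}, {x}]
   -0.044375 + 11.6875 x
The coefficients are the same as before; only the name of the variable in the result has changed.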

You might be asking yourself, "Why is there a 1 in the list of functions?" Good question! Let's try leaving the 1 out:

   Fit[data, {conc}, {conc}]
   11.2461 conc
Here we have instructed Mathematica to fit the data to a straight line that goes through the origin. The 1 in the function list is used to fit the y-intercept in our original example. Think of the fit as finding the best constants A and B such that the data is described by the line
   A * 1 + B * conc
When we leave out the 1, we are telling Mathematica not to look for a constant term in the fitting equation. This is equivalent to finding the best constant B such that the data is described by the line
   B * conc
with A implicitly set to zero.
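
One way to check this picture, sketched here under the assumption that data is defined as above: Fit chooses A and B by least squares, so solving the same least-squares problem directly should reproduce the coefficients. The sketch uses the built-in LeastSquares function, which we believe is available in Mathematica 6 and later; the names design and absorb are our own.

   design = Map[{1, First[#]} &, data];   (* one row {1, conc} per data point *)
   absorb = data[[All, 2]];               (* the measured absorbances *)
   LeastSquares[design, absorb]
   {-0.044375, 11.6875}
The two numbers are A and B, in that order. Dropping the column of 1s from design corresponds to leaving the 1 out of the function list.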

In the example presented here, we might want to force the y-intercept to be zero; after all, that is what the Beer-Lambert law predicts: zero absorbance at zero concentration. But if our apparatus is dirty or malfunctioning, it may register a "baseline" absorbance even when there is nothing in our sample cell. This would show up as a constant error in the absorbance, independent of concentration. So the fit to the line

   A * 1 + B * conc
lets us estimate this error. Notice that the slopes of the two straight lines are slightly different: 11.6875 with the constant term and 11.2461 without it.

As we mentioned in class, the Beer-Lambert law can break down at high concentrations. Let's add a quadratic term to our fitting equation to see if the data is described better that way:

   linfit = Fit[data, {1, conc}, {conc}]
   -0.044375 + 11.6875 conc
   quadfit = Fit[data, {1, conc, conc^2}, {conc}]
   -0.00325599 + 9.91865 conc + 11.1374 conc^2
Now let's plot both fits, along with the experimental data:
   points = ListPlot[data]
   curve1 = Plot[linfit, {conc, 0, 0.15}]
   curve2 = Plot[quadfit, {conc, 0, 0.15}]
   Show[points, curve1]
   Show[points, curve2]
We see that the quadratic curve does seem to fit the data better. In addition, the quadratic fit leads to a y-intercept that is much closer to zero (about -0.0033, compared with -0.0444 for the straight line), in accord with our intuition.
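
A numerical check can back up the visual impression. The sketch below assumes that data, linfit and quadfit are defined as above and that conc has no value assigned to it. It computes the sum of squared residuals for each fit; the names xs, ys and ssr are our own, and a smaller value means the curve passes closer to the points.

   xs = data[[All, 1]];   (* concentrations *)
   ys = data[[All, 2]];   (* absorbances *)
   ssr[fit_] := Total[(ys - (fit /. conc -> xs))^2]   (* sum of squared residuals *)
   ssr[linfit]
   ssr[quadfit]
Because the quadratic fit has an extra adjustable constant, its value can never be larger than the linear one, so the interesting question is how much smaller it is.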
