The construction of a least-squares approximant usually requires that one have in hand a basis for the space from which the data are to be approximated. As the example of the space of “natural” cubic splines illustrates, the explicit construction of a basis is not always straightforward.
This section makes clear that an explicit basis is not actually needed; it is sufficient to have available some means of interpolating in some fashion from the space of approximants. For this, the fact that the Curve Fitting Toolbox™ spline functions support work with vector-valued functions is essential.
This section discusses these aspects of least-squares approximation by “natural” cubic splines.
You want to construct the least-squares approximation to given data (x,y) from the space S of “natural” cubic splines with given breaks b(1) < ... < b(l+1).
If you know a basis, (f1,f2,...,fm), for the linear space S of all “natural” cubic splines with break sequence b, then you have learned to find the least-squares approximation in the form c(1)f1+c(2)f2+ ... +c(m)fm, with the vector c the least-squares solution to the linear system A*c = y, whose coefficient matrix is given by
A(i,j) = fj(x(i)), i=1:length(x), j=1:m.
In other words, c = A\y.
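As a minimal illustration of this recipe, the sketch below uses a space for which a basis is known explicitly: cubic polynomials with the monomial basis 1, t, t^2, t^3. This choice of space is an assumption made only for the example (it is not the “natural” spline space), and the data are made up.

```matlab
% Hypothetical data: sites x and values y (column vectors).
x = linspace(0,1,25).';
y = sin(2*pi*x) + 0.05*cos(40*x);

% Known basis for cubic polynomials: fj(t) = t^(j-1), j = 1:4.
m = 4;
A = zeros(length(x),m);
for j = 1:m
    A(:,j) = x.^(j-1);      % A(i,j) = fj(x(i))
end

% Least-squares solution of A*c = y.
c = A\y;

% The residual of a least-squares solution is orthogonal to the columns of A.
r = A*c - y;
normalEqResidual = norm(A.'*r);
```

The orthogonality check is only a sanity test; the point is that everything needed from the basis is the matrix A.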
The general solution seems to require that you know a basis. However, in order to construct the coefficient sequence c, you only need to know the matrix A. For this, it is sufficient to have at hand a basis map, namely a function F, say, so that F(c) returns the spline given by the particular weighted sum c(1)f1+c(2)f2+ ... +c(m)fm. For, with that, you can obtain, for j=1:m, the j-th column of A as fnval(F(ej),x), with ej the j-th column of eye(m), the identity matrix of order m.
Better yet, the Curve Fitting Toolbox spline functions can handle vector-valued functions, so you should be able to construct the basis map F to handle vector-valued coefficients c(i) as well. However, by agreement, in this toolbox, a vector-valued coefficient is a column vector, hence the sequence c is necessarily a row vector of column vectors, i.e., a matrix. With that, F(eye(m)) is the vector-valued spline whose i-th component is the basis element fi, i=1:m. Hence, assuming the vector x of data sites to be a row vector, fnval(F(eye(m)),x) is the matrix whose (i,j)-entry is the value of fi at x(j), i.e., the transpose of the matrix A you are seeking. On the other hand, as just pointed out, your basis map F expects the coefficient sequence c to be a row vector, i.e., the transpose of the vector A\y. Hence, assuming, correspondingly, the vector y of data values to be a row vector, you can obtain the least-squares approximation from S to data (x,y) as
F(y/fnval(F(eye(m)),x))
To be sure, if you wanted to be prepared for x and y to be arbitrary vectors (of the same length), you would use instead
F(y(:).'/fnval(F(eye(m)),x(:).'))
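The switch from column vectors to row vectors can be checked numerically. In MATLAB, y/B returns the least-squares solution of c*B = y, so dividing the row vector y by the transpose of A yields the transpose of A\y. The sketch below verifies this with made-up data and a matrix B built from monomials, which stands in here for fnval(F(eye(m)),x) purely for illustration.

```matlab
x = linspace(0,1,25);            % data sites as a row vector
y = exp(x) + 0.01*sin(30*x);     % made-up data values, row vector
m = 4;
B = zeros(m,length(x));
for i = 1:m
    B(i,:) = x.^(i-1);           % B(i,j) = fi(x(j)), i.e., B = A.'
end
cRow = y/B;                      % row-vector coefficients, solves cRow*B = y
cCol = B.'\y.';                  % column-vector coefficients, c = A\y
difference = norm(cRow - cCol.');
```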
What exactly is required of a basis map F for the linear space S of “natural” cubic splines with break sequence b(1) < ... < b(l+1)? Assuming the dimension of this linear space is m, the map F should set up a linear one-to-one correspondence between m-vectors and elements of S. But that is exactly what csape(b, . ,'var') does.
To be explicit, consider the following function F:
function s = F(c)
% b, the break sequence, is assumed to be visible here (e.g., F is a nested function).
s = csape(b,c,'var');
For given vector c (of the same length as b), it provides the unique “natural” cubic spline with break sequence b that takes the value c(i) at b(i), i=1:l+1. The uniqueness is key. It ensures that the correspondence between the vector c and the resulting spline F(c) is one-to-one. In particular, m equals length(b). More than that, because the value f(t) of a function f at a point t depends linearly on f, this uniqueness ensures that F(c) depends linearly on c (because c equals fnval(F(c),b) and the inverse of an invertible linear map is again a linear map).
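The two properties required of F, that it reproduces c at the breaks and that it depends linearly on c, can be spot-checked numerically. The sketch below assumes Curve Fitting Toolbox is installed and writes F as an anonymous function, so that b is captured when F is defined. The coefficient vectors and evaluation sites are made up.

```matlab
b  = 1810:40:1970;                   % a sample break sequence
F  = @(c) csape(b,c,'var');          % basis map as an anonymous function
m  = length(b);

c1 = (1:m).^2;                       % two arbitrary coefficient row vectors
c2 = cos(1:m);
t  = linspace(b(1),b(end),101);      % evaluation sites inside the basic interval

% F(c) takes the value c(i) at b(i).
interpErr = norm(fnval(F(c1),b) - c1);

% Linearity: F(c1 + 2*c2) agrees with F(c1) + 2*F(c2).
linErr = norm(fnval(F(c1 + 2*c2),t) - (fnval(F(c1),t) + 2*fnval(F(c2),t)));

% F(eye(m)) is vector-valued; fnval returns one row per basis element.
basisValues = fnval(F(eye(m)),t);    % size m-by-length(t)
```

Both error measures should be at rounding level; a check like this is not a proof, but it catches a mis-specified end condition quickly.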
Putting it all together, you arrive at the following code
csape(b,y(:).'/fnval(csape(b,eye(length(b)),'var'),x(:).'),'var')
for the least-squares approximation by “natural” cubic splines with break sequence b.
Let's try it on some data, the census data, say, which is provided in MATLAB® by the command
load census
and which supplies the years, 1790:10:1990, as cdate and the values as pop. Use the break sequence 1810:40:1970.
b = 1810:40:1970;
s = csape(b, ...
    pop(:)'/fnval(csape(b,eye(length(b)),'var'),cdate(:)'),'var');
fnplt(s, [1750,2050],2.2);
hold on
plot(cdate,pop,'or');
hold off
Have a look at the figure Least-Squares Approximation by “Natural” Cubic Splines With Three Interior Breaks, which shows, in thick blue, the resulting approximation, along with the given data.
This looks like a good approximation, except that it doesn't look like a “natural” cubic spline. A “natural” cubic spline, to recall, must be linear to the left of its first break and to the right of its last break, and this approximation satisfies neither condition. This is due to the following facts.
The “natural” cubic spline interpolant to given data is provided by csape in ppform, with the interval spanned by the data sites its basic interval. On the other hand, evaluation of a ppform outside its basic interval is done, in MATLAB ppval or Curve Fitting Toolbox spline function fnval, by using the relevant polynomial end piece of the ppform, i.e., by full-order extrapolation. In case of a “natural” cubic spline, you want instead second-order extrapolation. This means that you want, to the left of the first break, the straight line that agrees with the cubic spline in value and slope at the first break. Such an extrapolation is provided by fnxtr. Because the “natural” cubic spline has zero second derivative at its first break, such an extrapolation is even third-order, i.e., it satisfies three matching conditions. In the same way, beyond the last break of the cubic spline, you want the straight line that agrees with the spline in value and slope at the last break, and this, too, is supplied by fnxtr.
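To see the difference between the two kinds of extrapolation, the sketch below compares the second derivative of a “natural” cubic spline interpolant outside its basic interval before and after applying fnxtr. It assumes Curve Fitting Toolbox is installed; the values at the breaks are made up.

```matlab
b = 1810:40:1970;                    % break sequence
v = [1 4 2 5 3];                     % made-up values at the breaks
s  = csape(b,v,'var');               % "natural" cubic spline interpolant (ppform)
sx = fnxtr(s);                       % same spline, extended by straight lines

tOut = [1750 2050];                  % sites outside the basic interval [1810,1970]

% Second derivative outside the basic interval:
d2Plain = fnval(fnder(s,2), tOut);   % end cubic pieces continued, generally nonzero
d2Xtr   = fnval(fnder(sx,2),tOut);   % linear extension, so zero

% Inside the basic interval the two forms agree.
tIn = linspace(b(1),b(end),41);
insideErr = norm(fnval(s,tIn) - fnval(sx,tIn));
```

A nonzero d2Plain alongside a vanishing d2Xtr is exactly the symptom described above: without fnxtr, the evaluated function is not linear beyond the end breaks.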
[Figure: Least-Squares Approximation by “Natural” Cubic Splines With Three Interior Breaks]
The following one-line code provides the correct least-squares approximation to data (x,y) by “natural” cubic splines with break sequence b:
fnxtr(csape(b,y(:).'/ ...
    fnval(fnxtr(csape(b,eye(length(b)),'var')),x(:).'),'var'))
But it is, admittedly, a rather long line.
The following code uses this correct formula and plots, in a thinner, red line, the resulting approximation on top of the earlier plots, as shown in Least-Squares Approximation by “Natural” Cubic Splines With Three Interior Breaks.
ss = fnxtr(csape(b,pop(:)'/ ...
    fnval(fnxtr(csape(b,eye(length(b)),'var')),cdate(:)'),'var'));
hold on, fnplt(ss,[1750,2050],1.2,'r'),grid, hold off
legend('incorrect approximation','population', ...
    'correct approximation')
The one-line solution works perfectly if you want to approximate by the space S of all cubic splines with the given break sequence b. You don't even have to use the Curve Fitting Toolbox spline functions for this because you can rely on the MATLAB spline. You know that, with c a sequence containing two more entries than does b, spline(b,c) provides the unique cubic spline with break sequence b that takes the value c(i+1) at b(i), all i, and takes the slope c(1) at b(1), and the slope c(end) at b(end). Hence, spline(b,.) is a basis map for S.
More than that, you know that spline(b,c,xi) provides the value(s) at xi of this interpolating spline. Finally, you know that spline can handle vector-valued data. Therefore, the following one-line code constructs the least-squares approximation by cubic splines with break sequence b to data (x,y):
spline(b,y(:).'/spline(b,eye(length(b)+2),x(:).'))
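A minimal sketch of this construction on the census data used earlier, relying only on base MATLAB. The intermediate variable B and the orthogonality check at the end are additions for illustration: they confirm that the computed coefficients do solve the least-squares problem.

```matlab
load census                          % supplies cdate and pop
b = 1810:40:1970;                    % break sequence
x = cdate(:).';  y = pop(:).';       % data as row vectors

% Values at x of the length(b)+2 basis splines, one per row.
B = spline(b,eye(length(b)+2),x);

% Row vector of coefficients: end slopes first and last, break values between.
c  = y/B;
pp = spline(b,c);                    % least-squares cubic spline in ppform

% Check: the residual is orthogonal to every basis spline.
r = ppval(pp,x) - y;
normalEqResidual = norm(B*r.');
```

The fit can then be evaluated at any sites t with ppval(pp,t).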