Smoothing spline

Smoothing splines are function estimates,, obtained from a set of noisy observations of the target, in order to balance a measure of goodness of fit of to with a derivative based measure of the smoothness of. They provide a means for smoothing noisy data. The most familiar example is the cubic smoothing spline, but there are many other possibilities, including for the case where is a vector quantity.

Cubic spline definition

Let be a set of observations, modeled by the relation where the are independent, zero mean random variables. The cubic smoothing spline estimate of the function is defined to be the minimizer of
Remarks:

is a smoothing parameter, controlling the trade-off between fidelity to the data and roughness of the function estimate. This is often estimated by generalized cross-validation, or by restricted marginal likelihood which exploits the link between spline smoothing and Bayesian estimation.
The integral is often evaluated over the whole real line although it is also possible to restrict the range to that of.
As , the smoothing spline converges to the interpolating spline.
As , the roughness penalty becomes paramount and the estimate converges to a linear least squares estimate.
The roughness penalty based on the second derivative is the most common in modern statistics literature, although the method can easily be adapted to penalties based on other derivatives.
In early literature, with equally-spaced ordered, second or third-order differences were used in the penalty, rather than derivatives.
The penalized sum of squares smoothing objective can be replaced by a penalized likelihood objective in which the sum of squares terms is replaced by another log-likelihood based measure of fidelity to the data. The sum of squares term corresponds to penalized likelihood with a Gaussian assumption on the.
Derivation of the cubic smoothing spline

It is useful to think of fitting a smoothing spline in two steps:

First, derive the values.
From these values, derive for all x.

Now, treat the second step first.
Given the vector of fitted values, the sum-of-squares part of the spline criterion is fixed. It remains only to minimize, and the minimizer is a natural cubic spline that interpolates the points. This interpolating spline is a linear operator, and can be written in the form
where are a set of spline basis functions. As a result, the roughness penalty has the form
where the elements of A are. The basis functions, and hence the matrix A, depend on the configuration of the predictor variables, but not on the responses or.
A is an n×n matrix given by.
Δ is an '×n matrix of second differences with elements:
,,
W is an '× symmetric tri-diagonal matrix with elements:
, and, the distances between successive knots.
Now back to the first step. The penalized sum-of-squares can be written as
where.
Minimizing over by differentiating against. This results in:
and

De Boor's approach

De Boor's approach exploits the same idea, of finding a balance between having a smooth curve and being close to the given data.
where is a parameter called smooth factor and belongs to the interval, and are the quantities controlling the extent of smoothing. In practice, since cubic splines are mostly used, is usually. The solution for was proposed by Reinsch in 1967. For, when approaches, converges to the "natural" spline interpolant to the given data. As approaches, converges to a straight line. Since finding a suitable value of is a task of trial and error, a redundant constant was introduced for convenience.
is used to numerically determine the value of so that the function meets the following condition:
The algorithm described by de Boor starts with and increases until the condition is met. If is an estimation of the standard deviation for, the constant is recommended to be chosen in the interval. Having means the solution is the "natural" spline interpolant. Increasing means we obtain a smoother curve by getting farther from the given data.

Multidimensional splines

There are two main classes of method for generalizing from smoothing with respect to a scalar to smoothing with respect to a vector
. The first approach simply generalizes the spline smoothing penalty to the multidimensional setting. For example, if trying to estimate we might use the Thin plate spline penalty and find the minimizing
The thin plate spline approach can be generalized to smoothing with respect to more than two dimensions and to other orders of differentiation in the penalty. As the dimension increases there are some restrictions on the smallest order of differential that can be used, but actually Duchon's original paper, gives slightly more complicated penalties that can avoid this restriction.
The thin plate splines are isotropic, meaning that if we rotate the co-ordinate system the estimate will not change, but also that we are assuming that the same level of smoothing is appropriate in all directions. This is often considered reasonable when smoothing with respect to spatial location, but in many other cases isotropy is not an appropriate assumption and can lead to sensitivity to apparently arbitrary choices of measurement units. For example, if smoothing with respect to distance and time an isotropic smoother will give different results if distance is measure in metres and time in seconds, to what will occur if we change the units to centimetres and hours.
The second class of generalizations to multi-dimensional smoothing deals directly with this scale invariance issue using tensor product spline constructions. Such splines have smoothing penalties with multiple smoothing parameters, which is the price that must be paid for not assuming that the same degree of smoothness is appropriate in all directions.

Related methods

Smoothing splines are related to, but distinct from:

Regression splines. In this method, the data is fitted to a set of spline basis functions with a reduced set of knots, typically by least squares. No roughness penalty is used.
Penalized Splines. This combines the reduced knots of regression splines, with the roughness penalty of smoothing splines.
Elastic maps method for manifold learning. This method combines the least squares penalty for approximation error with the bending and stretching penalty of the approximating manifold and uses the coarse discretization of the optimization problem; see thin plate splines.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...

Smoothing spline

Cubic spline definition

Derivation of the cubic smoothing spline

De Boor's approach

Multidimensional splines

Related methods