Truncated regression model


Truncated regression models are a class of models in which the sample has been truncated for certain ranges of the dependent variable. That means observations with values in the dependent variable below or above certain thresholds are systematically excluded from the sample. Therefore, whole observations are missing, so that neither the dependent nor the independent variable is known. This is in contrast to censored regression models where only the value of the dependent variable is clustered at a lower threshold, an upper threshold, or both, while the value for independent variables is available.
Sample truncation is a pervasive issue in quantitative social sciences when using observational data, and consequently the development of suitable estimation techniques has long been of interest in econometrics and related disciplines. In the 1970s, James Heckman noted the similarity between truncated and otherwise non-randomly selected samples, and developed the Heckman correction.
Estimation of truncated regression models is usually done via parametric maximum likelihood method. More recently, various semi-parametric and non-parametric generalisation were proposed in the literature, e.g., based on the local least squares approach or the local maximum likelihood approach, which are kernel based methods.