Previous Chapter: Preprocessing First Chapter: Contents

Missing values should be treated with care in any model. Simply setting
the missing values to zero is sometimes suggested, but this is a *very*
dangerous approach: the missing elements might just as well be set to 1237
or any other value, as there is nothing special about zero. Another approach
is to impute the missing elements from an ANOVA model or something similar.
While better than simply setting the elements to zero, this is still not a
good approach. In two-way PCA and any-way PLS estimated through NIPALS-like
algorithms, the approach normally advocated in chemometrics is simply to
skip the missing elements in the appropriate inner products of the algorithm.
This approach has been shown to work well for small amounts of randomly
missing data, but it can be problematic in some cases.
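As a sketch of the skipping idea (this is not code from the tutorial; the function name and the convention of coding missing elements as NaN are assumptions for illustration), a one-component NIPALS iteration can restrict each inner product to the elements that are actually present:

```python
import numpy as np

def nipals_missing(X, n_iter=500, tol=1e-10):
    """One-component PCA by NIPALS, skipping missing (NaN) elements
    in the inner products. Illustrative sketch only."""
    present = ~np.isnan(X)
    Xz = np.where(present, X, 0.0)   # zeros act only as placeholders in sums
    t = Xz[:, 0].copy()              # initial score vector
    for _ in range(n_iter):
        # Loading p_j: sum t_i * x_ij over rows where x_ij is present,
        # divided by the sum of t_i^2 over those same rows.
        p = (Xz.T @ t) / (present.T @ (t * t))
        p /= np.linalg.norm(p)
        # Score t_i: analogous sums over the present columns of row i.
        t_new = (Xz @ p) / (present @ (p * p))
        if np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new):
            t = t_new
            break
        t = t_new
    return t, p
```

Because missing elements never enter the numerator or denominator, the iteration behaves as if those elements had zero weight.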

A better way, though, to handle missing data follows from the idea that
the model is estimated by optimizing the loss function over the non-missing
data *only*. This is a more sensible way of handling randomly missing data.
The loss function for any model of the incomplete data **X** can thus be
stated as a weighted least squares loss, sum of w_ij (x_ij - m_ij)^2, where
m_ij is the model estimate of x_ij and the weight w_ij is zero if x_ij is
missing and one otherwise.

Another approach to handling incomplete data is to impute the missing
data iteratively during the estimation of the model parameters. The missing
data are initially replaced with either sensible or random values. A
standard algorithm is used for estimating the model parameters using all
data. After each iteration the model of **X** is calculated, and the
missing elements are replaced with the model estimates. The iterations
and replacements continue until the estimates of the missing elements
no longer change and the overall convergence criterion is fulfilled.
It is easy to see that, when the algorithm has converged, the elements
replacing the missing values will have zero residual.
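The steps above can be sketched as follows, here using a truncated SVD as the "standard algorithm" and the grand mean as the initial fill (both are our assumptions for illustration, not prescriptions from the tutorial):

```python
import numpy as np

def impute_fit(X, rank=1, n_iter=500, tol=1e-9):
    """Iterative data imputation: fill missing (NaN) elements, fit a
    rank-`rank` model to the completed data, replace the missing elements
    with the model estimates, and repeat until they stabilise."""
    miss = np.isnan(X)
    Xc = np.where(miss, np.nanmean(X), X)   # sensible initial fill: grand mean
    M = Xc
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        M = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # current model of X
        # stop when the imputed elements no longer change
        if np.max(np.abs(Xc[miss] - M[miss]), initial=0.0) < tol:
            break
        Xc[miss] = M[miss]                  # replace missing with estimates
    return M
```

At convergence the missing elements equal their model estimates, so their residuals are exactly zero, as stated above.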

How, then, do these two approaches compare? Henk Kiers has shown that they give identical results, which can also be realized by considering data imputation more closely: as the residuals corresponding to missing elements are zero at convergence, they do not influence the parameters of the model, which is the same as saying they have zero weight in the loss function.

Algorithmically, however, there are some differences. Consider two competing algorithms for estimating a model of data with missing elements: one where the parameters are updated by weighted least squares regression with zero weights for missing elements, and one where ordinary least squares regression combined with data imputation is used. Direct weighted least squares regression is computationally more costly per iteration than ordinary least squares regression, and will therefore slow down the algorithm. Iterative data imputation, on the other hand, often requires more iterations (typically 30-100% more). It is difficult to say which method is preferable, as this depends on the implementation, the size of the data, and the computer used. Data imputation has the advantage of being easy to implement, also for problems that are otherwise difficult to estimate under a weighted loss function.


*The N-way tutorial*
*Copyright © 1998*
*R. Bro*