Department of Food Science

Faculty of Science

University of Copenhagen

# Sparse PARAFAC

Variable selection in high-dimensional models can improve interpretability and performance. One way to achieve sparsity is through penalized minimization problems that bound the $L_1$ norm of some parameter entity. This concept has received huge attention within fields such as statistical learning, data mining, and signal processing.

SPARAFAC *(Sparse PARAFAC)*

In analogy with sparse principal component analysis, the aim in SPARAFAC is to estimate a PARAFAC-like model where sparsity is induced on the model parameters. Sparsity in several modes is (here) referred to as multidimensional co-clustering.

##### Download algorithm

An algorithm for estimation of a SPARAFAC model can be downloaded here.

References:

Rasmussen MA and Bro R (2012) A tutorial on the LASSO approach to sparse modelling, *Chemometrics and Intelligent Laboratory Systems*, 119, 21-31.

## Algorithmic details

The SPARAFAC model is estimated by solving a constrained optimization problem with the following alternating scheme:

**Input:** $\mathbf{X}$ - data, $k$ - number of components, $\lambda$ - penalty from the *Lagrangian* formulation, and additional constraints (nonnegativity and orthogonality; orthogonality is not defined for otherwise constrained components).

**Initialize** with random numbers, a Tucker3 solution, or an unconstrained PARAFAC solution.

**For** $i = 1, \ldots$

1. Estimate $\mathbf{A}$ while $\mathbf{B}$ and $\mathbf{C}$ are fixed.
2. Estimate $\mathbf{B}$ while $\mathbf{A}$ and $\mathbf{C}$ are fixed.
3. Estimate $\mathbf{C}$ while $\mathbf{A}$ and $\mathbf{B}$ are fixed.

**Repeat** steps 1-3 until convergence.

Due to its similarity with the Alternating Least Squares (ALS) algorithm, a natural name is Alternating Shrunken Least Squares (ASLS).
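The alternating scheme above can be sketched for a single component as follows. This is a minimal illustration, not the downloadable SPARAFAC implementation: all names (`sparafac_rank1`, `lam`, the choice of penalizing only the $\mathbf{B}$-mode, and the unit-norm scaling of $\mathbf{a}$ and $\mathbf{c}$) are assumptions made for the sketch.

```python
import numpy as np

def soft_threshold(x, lam):
    """Elementwise soft thresholding: sign(x) * max(|x| - lam, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparafac_rank1(X, lam=0.1, n_iter=200, seed=0):
    """One-component ASLS sketch with a lasso penalty on the B-mode loadings.

    X is an I x J x K array. a and c are updated by ordinary least squares
    and scaled to unit norm; b is updated by a soft-thresholded least
    squares step. Illustrative only; not the published SPARAFAC code.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    a = rng.standard_normal(I)
    b = rng.standard_normal(J)
    c = rng.standard_normal(K)
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)  # mode-2 unfolding
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        # 1) a given b, c: plain least squares, then normalize
        z = np.kron(b, c)
        a = X1 @ z / (z @ z)
        a /= np.linalg.norm(a)
        # 2) b given a, c: least squares followed by soft thresholding
        z = np.kron(a, c)
        b = soft_threshold(X2 @ z, lam) / (z @ z)
        # 3) c given a, b: plain least squares, then normalize
        z = np.kron(a, b)
        c = X3 @ z / (z @ z)
        c /= np.linalg.norm(c)
    return a, b, c
```

Because the scale of the component is free to move between modes, the unpenalized modes are kept at unit norm here so that the penalty on $\mathbf{b}$ is not trivially circumvented by rescaling.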

As none of these problems are convex, a solution might be a local minimum. Hence multiple starts are needed to detect local solutions.

### Soft thresholding

Estimation of $\mathbf{p}$ when $\mathbf{t}$ is known corresponds to solving $ \mathrm{min}(\frac{1}{2}\left\|\mathbf{X} - \mathbf{t}\mathbf{p}^T\right\|_F^2 + \lambda\left\|\mathbf{p}\right\|_1) $ with respect to $\mathbf{p}$. Letting $\mathbf{p}_{LS} = (\mathbf{t}^T\mathbf{t})^{-1}\mathbf{t}^T\mathbf{X} $ denote the least squares solution, the solution (for $\mathbf{t}$ scaled so that $\mathbf{t}^T\mathbf{t} = 1$) is a soft-thresholded version of $\mathbf{p}_{LS}$ by $\lambda$:

$$ p_j = S(p_{LS,j},\lambda), \qquad S(x,\lambda) = \mathrm{sign}(x)\,\max(|x| - \lambda,\, 0) $$

Below it is exemplified how this operator works.
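A small numerical sketch of the soft-thresholding operator: entries whose magnitude does not exceed $\lambda$ are set to zero, while the remaining entries are shrunk toward zero by $\lambda$. The vector used here is an arbitrary illustrative example.

```python
import numpy as np

def soft_threshold(x, lam):
    """S(x, lam) = sign(x) * max(|x| - lam, 0), applied elementwise."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

p_ls = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
# With lam = 1: entries with |x| <= 1 become 0; -2.0 -> -1.0, 1.5 -> 0.5
print(soft_threshold(p_ls, 1.0))
```

Note that soft thresholding both selects variables (exact zeros) and shrinks the surviving coefficients, which is the source of the bias of lasso-type estimates.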

Author: Morten Arendt Rasmussen, mortenr@life.ku.dk.