Sparse PARAFAC

Variable selection in high dimensional models can improve the interpretability and performance. One way to achieve sparsity is through penalized minimization problems with bounding of the $L_1$ norm of some parameter entity. This concept has received huge attention within fields such as statistical learning, data mining, signal processing.

 


SPARAFAC (Sparse PARAFAC)

In sparse principal component analysis the aim is to estimate a PARAFAC-like model where sparsity is induced on the model parameters. Sparsity in several modes is (here) refered to as multidimentional coclustering. 

Download algorithm

An algorithm for estimation of a SPARAFAC model can be downloaded here.

 

References:

Rasmussen MA and Bro R (2012) A tutorial on the LASSO approach to sparse modelling, Chemometrics and Intelligent Laboratory Systems, 119,21:31

 


Algorithmic details

Solving a constrained optimization problems like:

Input: $\mathbf{X}$ - data, $k$ - number of components, $\lambda$ - penalty from the Lagrangian formulation, additional constraints (nonnegativity and orthogonality (not defined for otherwise constrained components))

Initialize with random numbers, tucker3 solution or unconstrained PARAFAC.

For  i=1,...,

  1. Estimate $\mathbf{A}$ when $\mathbf{B}$ and $\mathbf{C}$ is fixed.
  2. Estimate $\mathbf{B}$ when $\mathbf{A}$ and $\mathbf{C}$ is fixed.
  3. Estimate $\mathbf{C}$ when $\mathbf{A}$ and $\mathbf{B}$ is fixed.

Repeat 1-3 until convergence

 

Due to similarities with the Alternating Least Squares (ALS) algortihm, a name could be Alternating Shrunken Least Squares (ASLS). 

 

As none of these problems are convex, a solution might be a local minimum. Hence multible starts for determing local solutions are needed.

 

Soft thresholding

Estimation of $\mathbf{p}$ when $\mathbf{t}$ is known correspond to solving: $ \mathrm{min}(\frac{1}{2}\left|\mathbf{X} - \mathbf{t}\mathbf{p}^T\right|_F^2 + \lambda\left|\mathbf{p}\right|_1) $ with respect to $\mathbf{p}$. Seeting $\mathbf{p_{LS}} = (\mathbf{t}^T\mathbf{t})^{-1}\mathbf{t}^T\mathbf{X} $ to the least squares solution, the solution is a soft thresholded  version of $\mathbf{p_{LS}}$ by $\lambda$: 

$$ p_j = S(p_{LSj},\lambda) $$

Below is exemplified how this operator works.


Author: Morten Arendt Rasmussen, mortenr@life.ku.dk.