Four-way chromatographic data with retention time shifts



To understand the chemistry of color formation during sugar processing, an experiment was conducted to explore the presence and amount of chemical analytes in thick juice, an intermediate product in sugar production. The molecular entities of the thick juice samples were separated by size on a chromatographic system and detected by fluorescence, in the hope that the individual fluorophores could be separated and detected. The primary aspect of these data is the problem of modeling chromatographic data with retention time shifts. Analysis of chromatographic data is often hampered by such shifts, which cause the chromatographic profiles of specific analytes to differ from run to run. This is problematic because it prevents or severely degrades the application of multilinear models.
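To make the effect concrete, the following minimal sketch (not part of the original data set) simulates one analyte's elution profile over 28 fractions as a Gaussian peak and shifts it by a few fractions. The Gaussian shape, the peak position, the width, and the function names are illustrative assumptions. The cosine between the reference and the shifted profile (the congruence) falls below 1 as the shift grows, which is what conflicts with the multilinear assumption of one common profile per analyte.

```python
import math

N_FRACTIONS = 28  # number of collected fractions in this data set


def elution_profile(center, width=2.0, n=N_FRACTIONS):
    """Hypothetical Gaussian elution profile sampled at fractions 0..n-1."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n)]


def congruence(p, q):
    """Cosine of the angle between two profiles (1.0 means same shape)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


# Compare an unshifted reference peak with copies shifted by 0, 1, 2 and 4 fractions.
reference = elution_profile(center=12.0)
similarities = {
    shift: congruence(reference, elution_profile(center=12.0 + shift))
    for shift in (0, 1, 2, 4)
}

for shift, value in similarities.items():
    print(f"shift of {shift} fractions: congruence = {value:.3f}")
```

With these assumed settings the congruence decreases steadily with the shift, so a single time profile can no longer describe the same analyte in every run.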

Get the data

The data are available in zipped MATLAB 4.2 format. Download the data and type load data in MATLAB. If you use the data, we would appreciate it if you reported the results to us as a courtesy, in view of the work involved in producing and preparing the data. You may also want to cite the data by referring to

Bro, R., Andersson, C.A., Kiers, H.A.L., PARAFAC2 - Part II. Modeling chromatographic data with retention time shifts. Journal of Chemometrics, 13, 295-309, 1999.


Data (Matlab format)


Fifteen samples of thick juice from different sugar factories were introduced into a Sephadex G25 low-pressure chromatographic system using an NH4Cl/NH3 buffer (pH 9.00) as carrier. In this way the high-molecular-weight reaction products between reducing sugars and amino acids/phenols are separated from the low-molecular-weight free amino acids and phenols. The high-molecular-weight substances elute first, followed by the low-molecular-weight species. Aromatic components are retarded the most. The sample size was 300 µL and a flow of 0.4 mL/min was used. Twenty-eight discrete fractions were sampled and measured spectrofluorometrically on a Perkin Elmer LS50 B spectrofluorometer.

The column was a 20 cm long glass cylinder with an inner radius of 10 mm packed with Sephadex G25-fine gel. The water used was doubly ion-exchanged and Millipore-filtered upon degassing. The excitation-emission matrices were collected using a standard 10 mm by 10 mm quartz cuvette, scanning at 1500 nm/min with 10 nm slit widths in both excitation and emission monochromators (250 - 440 nm excitation, 250 - 560 nm emission). The size of the four-way data set is 28 (fractions) × 20 (excitation) × 78 (emission) × 15 (samples).
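As a quick consistency check on these dimensions, the sketch below computes the number of elements in the four-way array, the number of excitation-emission matrices, and the excitation step implied by 20 points spanning 250 to 440 nm. It assumes equally spaced excitation wavelengths with both end points included, which the text does not state explicitly; the variable names are illustrative.

```python
# Mode sizes as given in the text: fractions, excitation, emission, samples.
shape = {"fractions": 28, "excitation": 20, "emission": 78, "samples": 15}

# Total number of fluorescence intensities in the four-way array.
n_elements = 1
for size in shape.values():
    n_elements *= size

# One excitation-emission matrix (EEM) is measured per fraction and sample.
n_eems = shape["fractions"] * shape["samples"]

# Assumption: equally spaced excitation wavelengths, end points included.
ex_start, ex_stop = 250, 440  # nm
ex_step = (ex_stop - ex_start) / (shape["excitation"] - 1)
excitation_nm = [ex_start + i * ex_step for i in range(shape["excitation"])]

print(n_elements, n_eems, ex_step)
```

Under that assumption the excitation axis has a 10 nm step. The emission grid is not derived here, because 78 points between 250 and 560 nm do not give a round step and the page does not specify the sampling.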

The chromatographic data are four-way. Ideally, they are quadrilinear, with the modes corresponding to time profiles (28), excitation spectra (20), emission spectra (78), and sample concentration profiles (15). Hence a four-way PARAFAC model should be capable of uniquely and meaningfully describing the variation. In this case, a four-component model seems adequate. However, because of shifts in the retention times, these data cannot be meaningfully analyzed with the PARAFAC model. In Bro et al. (1999) it is shown that the PARAFAC2 model, on the other hand, provides a meaningful model that specifically handles the retention time shifts in the different samples.
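Written out, the quadrilinear structure described above takes the following standard form for an F-component four-way PARAFAC model. The symbols are chosen here for illustration and are not taken from the original page: x_ijkl is the fluorescence intensity of fraction i at excitation wavelength j and emission wavelength k for sample l, and e_ijkl is the residual.

```latex
x_{ijkl} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf}\, d_{lf} + e_{ijkl},
\qquad i = 1,\dots,28,\; j = 1,\dots,20,\; k = 1,\dots,78,\; l = 1,\dots,15
```

Here a_if, b_jf, c_kf and d_lf are the elements of the time profile, excitation spectrum, emission spectrum and sample concentration profile of component f, with F = 4 in this case. The model assumes that each component has a single time profile shared by all samples. A retention time shift makes that profile differ from sample to sample, which is why the plain PARAFAC model is not appropriate for these data.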

Copyright © 1996-2002 Rasmus Bro. All rights reserved.