goal is to represent the variation present in many variables using a small number of ‘factors’. A new
row space is constructed in which to plot the samples, by redefining the axes using factors rather than
the original measurement variables. These new axes, called principal components (PCs), allow
the analyst to probe matrices with many variables and view the true multivariate nature of the data
in a relatively small number of dimensions. With this view, human pattern recognition can be used
to identify structures in the data (Beebe et al., 1998).
The first PC explains the maximum amount of variation possible within the data set in one direction.
The coordinates of the sample in a coordinate system defined by the principal components
are called scores. The loading vectors are the bridge between the variable space and the PC
space. The loadings tell us how much each variable contributes to each PC.
In matrix form: $X = TP^{T}$, where X is the analysed data matrix, T is the score matrix, and P is the
loading matrix. Only a limited number of significant PCs are relevant in describing the information in X.
This leads to the following decomposition: $X = T_f P_f^{T} + E$, where $T_f$ is the score matrix with
dimensions s × f, $P_f$ is the loading matrix with dimensions w × f, and E is a residual matrix with the
same dimensions as X (Esbensen et al., 1994). Here s is the number of samples, w the number of
wavelength variables, and f the number of retained PCs. In this particular case, T contains information
about the samples, and P contains information about the wavelengths.
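The decomposition into scores, loadings and a residual can be illustrated with a minimal sketch. It uses the NIPALS iteration in plain Python; the choice of NIPALS, the mean centring, and the function names `center_columns`, `nipals_pca` and `sum_of_squares` are assumptions made for illustration and are not taken from the text or from the UNSCRAMBLER program.

```python
import math

def center_columns(X):
    """Subtract the mean of each column (variable) from that column."""
    n_rows, n_cols = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in X]

def nipals_pca(X, n_components, tol=1e-12, max_iter=500):
    """Split X (s x w) into scores T (s x f), loadings P (w x f) and residual E."""
    E = [row[:] for row in X]              # residual, deflated one PC at a time
    s, w = len(E), len(E[0])
    score_cols, loading_cols = [], []
    for _ in range(n_components):
        # Start from the column of E with the largest sum of squares.
        col = max(range(w), key=lambda j: sum(E[i][j] ** 2 for i in range(s)))
        t = [E[i][col] for i in range(s)]
        if sum(v * v for v in t) == 0.0:
            break                          # nothing left to explain
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(E[i][j] * t[i] for i in range(s)) / tt for j in range(w)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]      # unit-length loading vector
            t_new = [sum(E[i][j] * p[j] for j in range(w)) for i in range(s)]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        # Remove the part of E described by this PC.
        E = [[E[i][j] - t[i] * p[j] for j in range(w)] for i in range(s)]
        score_cols.append(t)
        loading_cols.append(p)
    T = [list(r) for r in zip(*score_cols)]    # s x f
    P = [list(r) for r in zip(*loading_cols)]  # w x f
    return T, P, E

def sum_of_squares(M):
    """Sum of squared elements of a matrix stored as a list of rows."""
    return sum(v * v for row in M for v in row)
```

With a centred matrix `Xc`, `T, P, E = nipals_pca(Xc, 3)` returns the three blocks of the decomposition, and `1 - sum_of_squares(E) / sum_of_squares(Xc)` gives the fraction of the total variation described by the retained PCs.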
2.3. Partial least squares regression (PLS)
PLS is used to build a model correlating X and y, where X contains the fluorescence spectra and
y (s × 1) is a vector containing the property of interest. Multivariate calibration models were built
correlating the fluorescence emission spectra with the physical as well as chemical properties of the samples.
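How a PLS model relates X to a single property y, and how cross validation produces one prediction per sample, can be sketched as follows. This is a hedged illustration only: it assumes the NIPALS form of PLS1, mean centring, and leave-one-out ("full") cross validation, none of which is specified in the text, and the names `pls1_fit`, `pls1_predict` and `loo_predictions` are hypothetical.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (one y variable) by NIPALS on mean-centred data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]   # X residual
    f = [v - y_mean for v in y]                                  # y residual
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break                       # y residual no longer related to X
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one new spectrum x using the fitted components in order."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat

def loo_predictions(X, y, n_components):
    """Leave-one-out cross validation: predict each sample from all the others."""
    predictions = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        predictions.append(pls1_predict(model, X[i]))
    return predictions
```

Each sample is predicted by a model that never saw it, which is why cross validation is attractive when only a few samples are available.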
PCA and PLS regressions were performed with the use of the UNSCRAMBLER program, version 6.0
(Camo AS, Norway). The model performance is validated by cross validation due to our
small data set (Noergaard, 1995). The predictive performances of the PLS models are assessed
by the RMSEP (root mean square error of prediction) criterion and Pearson’s correlation
coefficient R:
$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{N} \left(Y_{i,\mathrm{predicted}} - Y_{i,\mathrm{measured}}\right)^{2}}{N}}$$
where N is the number of samples.
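The two performance measures can be computed directly from paired predicted and measured values. The following is a minimal sketch in plain Python; the function names `rmsep` and `pearson_r` are illustrative, and any input values used with them are placeholders, not data from this study.

```python
import math

def rmsep(predicted, measured):
    """Root mean square error of prediction over N paired values."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)

def pearson_r(predicted, measured):
    """Pearson correlation coefficient between predicted and measured values."""
    n = len(measured)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(ss_p * ss_m)
```

RMSEP is expressed in the units of the property being predicted, whereas R is dimensionless, so the two are best read together.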
Schematically, the procedure can be summarized as follows:
Emission fluorescence spectra recorded at four excitation wavelengths → transformation to ASCII
files → introduction to the UNSCRAMBLER program (each spectrum is represented by 751
points) → the four emission spectra are placed sequentially, resulting in rows of 4 × 751 = 3004 points →
creation of an X matrix for the 20 samples with dimension 20 × 3004 → removal of the common
peaks (the analysed matrix becomes 20 × 1115) → application of PCA → construction of PLS models
with the introduction of $Y_i$ matrices with dimensions 20 × 1 containing data on different properties.
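The data layout in this procedure can be sketched in a few lines of plain Python. The block order (450, 400, 350, 280 nm) and the block sizes after variable removal (205, 306, 282 and 322 variables) follow the captions of Figs. 1 and 2; the exact positions of the kept variables inside each 751-point block are not given as indices in the text, so `HYPOTHETICAL_KEPT_RANGES` is a placeholder and the function names are illustrative.

```python
N_POINTS = 751                          # points per emission spectrum
EXCITATIONS_NM = [450, 400, 350, 280]   # concatenation order, as in Fig. 1

# Placeholder 1-based inclusive ranges in the 3004-variable row. Only the
# block sizes (205, 306, 282, 322) come from the Fig. 2 caption; the start
# positions are assumptions made for this sketch.
HYPOTHETICAL_KEPT_RANGES = [(1, 205), (752, 1057), (1503, 1784), (2254, 2575)]

def concatenate_spectra(spectra_by_excitation):
    """Place the four emission spectra of one sample end to end (one X row)."""
    row = []
    for excitation in EXCITATIONS_NM:
        spectrum = spectra_by_excitation[excitation]
        if len(spectrum) != N_POINTS:
            raise ValueError("each spectrum must have 751 points")
        row.extend(spectrum)
    return row

def keep_variables(row, kept_ranges):
    """Keep only the listed variable ranges (1-based, inclusive) of a row."""
    reduced = []
    for first, last in kept_ranges:
        reduced.extend(row[first - 1:last])
    return reduced

def build_matrix(samples, kept_ranges):
    """Build the analysed X matrix: one reduced row per sample."""
    return [keep_variables(concatenate_spectra(s), kept_ranges) for s in samples]
```

Applied to 20 samples, `build_matrix` yields 20 rows of 1115 variables, matching the dimensions of the analysed matrix.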
3. Results and discussion
Fig. 1. Emission spectra of the eucalyptus wood samples as introduced to the UNSCRAMBLER program. Emission variables 1–751 correspond to $\lambda_{ex}$ = 450 nm; variables 752–1502 correspond to $\lambda_{ex}$ = 400 nm; variables 1503–2253 correspond to $\lambda_{ex}$ = 350 nm; variables 2254–3004 correspond to $\lambda_{ex}$ = 280 nm.
Fig. 2. Concatenated emission spectra of eucalyptus wood samples. Emission variables 1–205: $\lambda_{ex}$ = 450 nm, emission: 375–418 and 498–558 nm; emission variables 206–511: $\lambda_{ex}$ = 400 nm, emission: 340–375 and 437–555 nm; emission variables 512–793: $\lambda_{ex}$ = 350 nm, emission: 374–514 nm; emission variables 794–1115: $\lambda_{ex}$ = 280 nm, emission: 334–494 nm.
Fig. 3. A PCA 3-dimensional score plot of the samples. 1–10: different clones of eucalyptus; N: juvenile wood; A: mature wood; 1 or 2 in the third position: independent measurements of the same wood.
Table 1. Physical and chemical properties of juvenile (N) eucalyptus wood and the corresponding Kraft pulps
Ash content
a
Guaiacyl G Wood sample
Solubility in Lignin content
a
Yield
a
of Kraft Kappa number
pulping lignin units
b
NaOH
a
1N 0.62
18.7 648
18.1 49.5
16.8 1.10
581 17.4
20.4 51.2
2N 23.2
0.67 545
15.8 3N
50.0 21.4
18.8 0.84
638 17.2
20.6 49.9
4N 21.1
0.85 688
18.8 50.6
20.0 5N
18.2 0.94
679 18.0
19.5 50.5
6N 20.7
22.0 7N
0.76 631
18.3 50.5
21.4 20.8
8N 0.75
681 15.3
50.3 20.2
0.77 731
17.4 18.5
49.5 9N
21.1 10N
0.77 21.0
696 20.9
48.8 27.3
a Results expressed in % on dry basis.
b Results expressed in μmol/g of lignin.
Raw fluorescence emission spectra of ground adult and juvenile eucalyptus wood are presented
in Fig. 1. The spectral shapes of all samples are much alike, with small differences in intensity.
Each emission spectrum shown is the average of two measurements and comprises 751 points.
The emission spectra recorded at the indicated excitation wavelengths were placed sequentially
and introduced as rows in an X matrix whose dimension for the 20 samples is 20 × 3004
(751 × 4 = 3004). Rayleigh scattering gives a large contribution to the emission spectrum at the emission
wavelength corresponding to the actual excitation wavelength; as it is common to all samples, it can
be removed from the spectral emission region. Moreover, we have removed the spectral areas
where the samples have similar intensities and shapes and, consequently, do not permit
differentiation, with the result shown in Fig. 2. The analysed matrix X thus becomes a 20 × 1115 matrix.
We have observed that the fluorescence emission intensity is higher for excitation at 400 and
350 nm, as compared to 280 and 450 nm. A PCA applied to the data in Fig. 2, using the
UNSCRAMBLER software, provided three vectors (PCs) that explain 99% of
the total variance. Score plots of the 20 samples and their duplicate measurements with
respect to the three identified principal components (PC1, PC2, and PC3) are shown in Fig. 3.
The duplicate measurements lie close to each other, implying little effect of the
heterogeneity of sample texture. Moreover, a clear clustering of juvenile (N) and adult (A) wood
samples is demonstrated. This indicates that juvenile and adult wood samples can be
discriminated by their fluorescence spectra. Such discrimination provides further evidence
pointing to the predictive potential of PCA, since it qualitatively corroborates earlier data,
according to which pulps from different biological sources could be discriminated using PCA of
their fluorescence spectra (Billa et al., 1999a).
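As a hedged illustration of what it means for a set of PCs to explain a given share of the total variance, one common definition is written out here. It assumes a mean-centred X and the decomposition $X = T_f P_f^{T} + E$ with f retained PCs; the centring is an assumption and is not stated in the text. With s = 20 samples and w = 1115 variables:

```latex
\text{explained variance}(f)
  = 1 - \frac{\lVert E \rVert_F^{2}}{\lVert X \rVert_F^{2}}
  = 1 - \frac{\sum_{i=1}^{s}\sum_{j=1}^{w} e_{ij}^{2}}
             {\sum_{i=1}^{s}\sum_{j=1}^{w} x_{ij}^{2}}
```

Under this definition, a value of 0.99 for f = 3 means that the residual matrix E carries only 1% of the sum of squares of the centred data.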
In an effort to add further information to the correlation, quantitative data such as the ash
content and the sodium hydroxide solubility of the samples (Tables 1 and 2) were introduced into the
program. PLS models were constructed to correlate the fluorescence spectral data
(X: 20 × 1115) with ash content and sodium hydroxide solubility ($Y_i$: 20 × 1).
Fig. 4 presents the predicted versus measured values plot
Table 2. Physical and chemical properties of adult (A) eucalyptus wood and the corresponding Kraft pulps
Guaiacyl G Lignin content
a
Solubility in Yield
a
of Kraft Wood sample
Kappa number Ash content
a
NaOH
a
lignin units
b
pulping 0.24
498 1A
11.7 18.6
57.7 14.4
15.5 55.4
12.6 515
2A 0.27
21.7 3A
21.1 0.25
515 10.9
56.7 13.6
4A 14.4
56.0 12.3
548 0.25
21.9 56.8
10.3 598
13.5 0.35
18.1 5A
6A 0.29
20.1 510
10.6 58.0
14.9 17.6
55.6 0.26
621 11.4
7A 21.3
8A 13.8
57.2 11.1
19.3 516
0.24 9A
496 11.0
56.3 13.5
0.27 20.1
0.22 560
11.4 10A
55.9 21.2
14.6
a Results expressed in % on dry basis.
b Results expressed in μmol/g of lignin.
Fig. 4. Predicted versus measured plot for the ash content with reference to the third PC. PLS modelling.
Fig. 5. Predicted versus measured plot for the sodium hydroxide solubility with reference to the third PC. PLS modelling.
for the ash content. The Pearson correlation coefficient is 0.7. The ash content is higher for the
juvenile samples, which also show higher interclonal variability. The sodium hydroxide solubility
of the eucalyptus wood is likewise higher for the young shoots, again with larger interclonal variability
(R = 0.87, Fig. 5). Hot alkali solution extracts low-molecular-weight carbohydrates, consisting
mainly of hemicellulose and degraded cellulose. This is in accordance with previous studies
showing that young tissues are more fragile and soluble than older ones (Hatton and
Hunt, 1993).
We then examined whether fluorescence spectroscopic information derived from the wood
samples could be correlated with structural information on lignins obtained by thioacidolysis
(Tables 1 and 2). Thioacidolysis provides quantitative data on the guaiacyl (G) and syringyl
(S) lignin monomeric units engaged in β-O-4 bonds (Lapierre, 1993). The data for the lignin
monomeric composition, shown in Fig. 6, indicated that the lignins from juvenile trees are
enriched in G units. Moreover, the S/G ratio varies from 3.4 to 5.1 (results not shown), being higher
for the adult trees, which indicates that they are enriched in S units, in accordance with the studies
on lignification of Terashima et al. (1993). The above results corroborate previous data, where
fluorescence spectroscopy information of residual kraft lignins was correlated with structural
information emerging from quantitative $^{31}$P NMR spectroscopy (Billa et al., 1999b).
Furthermore, Figs. 7 and 8 present the PLS models for the yield and the Kappa number,
respectively, of the kraft pulps obtained from the different clones. The correlation coefficient
is higher for the pulp yield (R = 0.86) than for the Kappa number (R = 0.68).
According to these data, adult trees are characterised by higher pulping yields and lower Kappa
numbers, thus indicating that they are more suitable for chemical pulp production.
4. Conclusions