Accepted Manuscript
Analytical methods
Blends of olive oil and seeds oils: characterisation and olive oil quantification using fatty acids composition and chemometric tools. Part II
M. Monfreda, L. Gobbi, A. Grippa
PII: S0308-8146(13)01107-2
DOI: http://dx.doi.org/10.1016/j.foodchem.2013.07.141
Reference: FOCH 14523
To appear in: Food Chemistry
Received Date: 24 February 2013
Revised Date: 25 July 2013
Accepted Date: 30 July 2013
Please cite this article as: Monfreda, M., Gobbi, L., Grippa, A., Blends of olive oil and seeds oils: characterisation and olive oil quantification using fatty acids composition and chemometric tools. Part II, Food Chemistry (2013), doi: http://dx.doi.org/10.1016/j.foodchem.2013.07.141
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Blends of olive oil and seeds oils: characterisation and olive oil quantification using fatty acids composition and chemometric tools. Part II
Monfreda M. a, Gobbi L. a, Grippa A. b
a. Department of Management, Sapienza University of Rome, via del Castro Laurenziano 9, 00161 Rome, Italy.
b. Department of Business and Law, Roma Tre University, Via Ostiense, 159, 00154 Rome, Italy.
Corresponding author:
Maria Monfreda, Department of Management, Sapienza University of Rome, via del Castro Laurenziano 9, 00161 Rome, Italy. E-mail: [email protected] Telephone: 0039 3396093528
ABSTRACT
A method to verify the percentage of olive oil in a blend, in compliance with Commission Regulation EU No. 29/2012, was developed by GC-FID analysis of methyl esters of fatty acids, followed by chemometric tools (PCA, TFA, SIMCA and PLS).
First of all, binary blends of twelve olive oils and one sunflower oil were studied, in order to evaluate the variability associated with the fatty acids profile of olive oils (Monfreda, Gobbi & Grippa, 2012). In this study, binary blends of twelve olive oils with four types of seeds oils (peanut, corn, rice and grape seed oils) were evaluated. These four groups of blends were analysed and processed separately, each group consisting of 36 samples with 40%, 50% and 60% of olive oil content. Chemometric tools were also applied to the global data set (180 samples, including those analysed in the previous paper).
Outstanding results were achieved, showing that the proposed method would be capable of discriminating blends with a difference in olive oil concentration lower than 5% (a standard error of prediction of 3.97% was obtained with PLS). Therefore blends containing 45% and 55% of olive oil were also analysed with the current method and added to the data sets for chemometric assessment with supervised tools. SIMCA still provided good models; however, the best performance was achieved by processing each group of binary blends (consisting of 60 samples) separately, rather than applying SIMCA to the overall data set (300 samples). On the other hand, PLS did not show significant improvements.
KEYWORDS: Olive oil; blend; Principal Component Analysis; Target Factor Analysis; Soft Independent Models of Class Analogy; Partial Least Squares.
1. Introduction
The subject of blends of olive oils and other vegetable oils is currently handled by Commission Regulation EU No. 29/2012 on marketing standards for olive oil, which repealed EC Regulation 1019/2002. Both the current and repealed regulations set the following trade description for blends whose labeling highlights the presence of olive oil elsewhere than in the list of ingredients, using words, images or graphics: 'Blend of vegetable oils (or the specific names of the vegetable oils concerned) and olive oil', directly followed by the percentage of olive oil in the blend. It is also stipulated that the presence of olive oil may be highlighted by images or graphics on the labeling of a blend only where it accounts for more than 50% of the blend concerned.
Although the field of legal blends of olive oils and seed oils has been regulated for over ten years by European Union law, there is no official method capable of quantifying the percentage of olive oil in a blend with other vegetable oils. Moreover, the implementation of Commission Regulation EU No. 29/2012 needs an analytical method for verifying whether the percentage of olive oil in a blend is lower or higher than 50%, the limit value for a legal use of the aforementioned images or graphics on the labeling.
Many studies have focused on the detection of adulterants in olive oil; many analytical methods have been proposed, often followed by chemometric tools (Gurdeniz & Ozen, 2009; Kasemsumran, Kang, Christy & Ozaki, 2005; Maggio, Cerretani, Chiavaro, Kaufman & Bendini, 2010; Peña, Cárdenas, Gallego & Valcárcel, 2005; Poulli, Mousdis & Georgiou, 2007; Priego Capote, Rohman, & Che Man, 2012; Ruiz Jiménez & Luque de Castro, 2007). This issue has been widely developed because it relates to the need to identify a possible fraud consisting in selling olive oil adulterated with cheaper oils. However, the detection and quantification of olive oil in a legal blend requires, as already stated (De la Mata, Dominguez-Vidal, Bosque Sendra, Ruiz-Medina, Cuadros Rodríguez, Ayora-Cañada, 2012), a change in point of view. A possible marketing fraud concerning oil blends might be the trade of such mixtures in packages with labels bearing images or graphics highlighting the presence of olive oil when, in fact, the olive oil content is lower than 50%.
Fasciotti et al. (2010), during a study aimed at assessing olive oil adulteration by soybean oil, found linear relationships between triacylglycerol (TAG) areas and olive oil concentrations in blends, suggesting that their method could quantify olive oil in a commercial blend.
Few studies dealing with edible oil blends can actually be found: TAGs were analysed by HPLC (De la Mata-Espinosa, Bosque-Sendra, Bro, Cuadros-Rodríguez, 2011) or GC-MS (Ruiz-Samblás, Marini, Cuadros-Rodríguez, González-Casado, 2012), while the quantification was carried out by partial least squares (PLS). Oil blends were also characterised by ATR-FTIR, followed by PLS (De la Mata, Dominguez-Vidal, Bosque Sendra, Ruiz-Medina, Cuadros Rodríguez, Ayora-Cañada, 2012). All these studies provided errors of prediction ranging from 8% to 10%.
The control of legal blends of olive oils and seed oils is a very hard issue: first of all, the behavior of such blends in a concentration range centered on 50% (the discriminant value for legal purposes) needs to be assessed. Moreover, during the development of an analytical method, the minimum appreciable variation in the concentration of olive oil in this specific range should be taken into account for the performance evaluation.
A method capable of differentiating blends containing 50% of olive oil from blends containing 40% and 60% of it was proposed by this research group (Monfreda, Gobbi & Grippa, 2012). Methyl esters of fatty acids were analysed by GC-FID and results were processed by chemometric tools; olive oil was also quantified.
Methyl esters of fatty acids are a parameter that shows some advantages, because the limit values, set out by the EEC Regulation No. 2568/91, are consistent for each category of olive oil (virgin, refined, pomace, etc.). Variations of this parameter depend only on the type of oil. This is a very important advantage in the control of blends, where olive oil has to be detected regardless of its category.
The main goal of the first study was to investigate the variability associated with the olive oil. For this reason, olive oil samples with fatty acids profiles extremely different from each other were mixed with only one sample of sunflower oil.
In the light of the noteworthy results achieved, the aim of this study was to go on to assess the variability associated with the fatty acids composition of seeds oils. The model proposed by the previous work was therefore extended to binary blends of olive oils with four types of vegetable oils: corn, peanut, rice and grape seed.
Moreover, in the previous work a standard error of prediction of 1.51% was obtained, suggesting that the proposed method could really be able to distinguish blends with a difference in olive oil content of less than 10%. Mixtures containing 45% and 55% of olive oil were therefore analysed and checked with this method. As in the previous work, the methyl esters of fatty acids were analysed by GC-FID, followed by chemometric tools.
2. Materials and Methods
Preparation of blends and samples, as well as the analyses of methyl esters of fatty acids by GC-FID, were carried out at the same time as the analyses of the previous work (Monfreda, Gobbi & Grippa, 2012), according to the procedures and using the gas chromatograph described in par. 2.1 and 2.2 of the previous paper.
Twelve samples of olive oil were mixed with four vegetable oils: corn, peanut, rice and grape seed (the fatty acids composition of pure oils is reported in tables 5 and 6 of the previous article), obtaining binary blends with 40%, 45%, 50%, 55% and 60% in olive oil volume. Four groups of binary blends were obtained: olive-corn, olive-peanut, olive-rice and olive-grape seed. Each group (or data set) consists of sixty samples divided into five categories, depending on the olive oil content. Blends of olive oil and sunflower oil containing 45% and 55% of olive oil were also prepared (using the sunflower oil of the previous work) and analysed, in order to have, as a whole, five groups of blends.
One-way Anova was performed on each data set in order to compare, for each variable, the variance within any category with the variance between categories. The fatty acids used as variables were: myristic, palmitic, palmitoleic, margaric, margaroleic, stearic, oleic, linoleic, arachidic, linolenic, eicosenoic, behenic and lignoceric acids.
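The comparison made by one-way Anova can be illustrated with a minimal sketch. It is not the SPSS procedure used by the authors: the function name `one_way_anova_f` and the oleic acid values are hypothetical, chosen only to show how the ratio between the between-category and within-category mean squares is formed for a single variable.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F ratio: between-category mean square / within-category mean square."""
    k = len(groups)                                # number of categories
    n_total = sum(len(g) for g in groups)          # total number of samples
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the category means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the samples around their own category mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical oleic acid percentages (illustrative values, not measured data)
oleic = [
    [55.0, 56.0, 57.0],  # blends with 40% of olive oil
    [59.0, 60.0, 61.0],  # blends with 50% of olive oil
    [63.0, 64.0, 65.0],  # blends with 60% of olive oil
]
f_ratio = one_way_anova_f(oleic)
print(f_ratio)
```

A large F ratio means that the variable changes more between categories than within them, which is the property required of a discriminant variable; the significance threshold depends on the degrees of freedom and on the chosen confidence level.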
Principal Component Analysis (PCA), Target Factor Analysis (TFA), Soft Independent Models of Class Analogy (SIMCA) and Partial Least Squares (PLS) were applied as chemometric tools. They are described in the previous paper (par. 2.4).
Multivariate statistical analyses were carried out as follows:
- PCA, TFA, SIMCA and PLS were applied to the four data sets (olive-corn, olive-peanut, olive-rice and olive-grape seed), excluding, at the beginning, blends containing 45% and 55% of olive oil; each data set consisted of 36 samples and three categories (40%, 50% and 60% of olive oil concentration).
- PCA, TFA, SIMCA and PLS were applied to a data set of 180 samples, obtained by gathering these four data sets together with the one processed in the previous work (blends of olive oil and sunflower oil, 36 samples).
- SIMCA and PLS were applied to the five data sets (olive-corn, olive-peanut, olive-rice, olive-grape seed and olive-sunflower), including blends with 45% and 55% of olive oil; each data set consisted of 60 samples. The fifth data set (olive oil – sunflower oil) was processed because in the previous paper blends with 45% and 55% of olive oil were not evaluated.
- SIMCA and PLS were finally applied to the overall data set consisting of 300 samples (five groups of sixty samples each).
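The sample counts quoted in the steps above follow from the experimental design (twelve olive oils, five seed oils, three or five concentration levels). The short sketch below only reproduces that arithmetic; the names `design`, `SEED_OILS`, `LEVELS_REDUCED` and `LEVELS_FULL` are illustrative and do not come from the paper.

```python
from itertools import product

N_OLIVE_OILS = 12
SEED_OILS = ["corn", "peanut", "rice", "grape seed", "sunflower"]
LEVELS_REDUCED = [40, 50, 60]        # % of olive oil, first round of analyses
LEVELS_FULL = [40, 45, 50, 55, 60]   # % of olive oil, including 45% and 55%


def design(seed_oils, levels, n_olive=N_OLIVE_OILS):
    """List every (olive oil index, seed oil, olive oil %) blend in a data set."""
    return list(product(range(1, n_olive + 1), seed_oils, levels))


per_group_reduced = len(design(["corn"], LEVELS_REDUCED))   # one binary group, 3 levels
per_group_full = len(design(["corn"], LEVELS_FULL))         # one binary group, 5 levels
global_reduced = len(design(SEED_OILS, LEVELS_REDUCED))     # five groups, 3 levels
global_full = len(design(SEED_OILS, LEVELS_FULL))           # five groups, 5 levels
print(per_group_reduced, per_group_full, global_reduced, global_full)
```

The four values correspond to the 36, 60, 180 and 300 samples mentioned in the text.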
All the computations were performed using V-PARVUS (Forina, Lanteri, Armanino, Cerrato-Oliveiros, Casolino, 2010) and SPSS (IBM Statistics computer program, 2010).
3. Results and Discussion
3.1 Preliminary statistical tests
One-way Anova was performed on the four data sets (olive oil – corn oil, olive oil – peanut oil, olive oil – rice oil and olive oil – grape seed oil) in order to select, for further statistical analysis, the most significant fatty acids for discrimination between categories. Each data set consists of 60 samples that contain 40%, 45%, 50%, 55% and 60% of olive oil, forming five classes.
However, the application of one-way Anova to these data sets yielded only one or two significant variables (those with a between-category variability significantly higher than the within-category variability). Such a result is probably due to the fact that the between-category variability tends to decrease as the difference in olive oil content between categories decreases, so that it becomes, for many variables, comparable to the within-category variability. One or two variables, however, are not enough for multivariate statistical analysis. As a consequence, one-way Anova and the first multivariate statistical analyses were performed on 36 samples for each data set (similarly to what was done in the previous work), excluding blends with 45% and 55% of olive oil.
From one-way Anova, applied to the reduced data sets, the following variables proved most significant for discrimination between categories:
- myristic, oleic, linoleic, arachidic, linolenic, eicosenoic, behenic and lignoceric acids for blends of peanut oil and olive oil;
- myristic, oleic, linoleic, arachidic, linolenic, behenic and lignoceric acids for blends of corn oil and olive oil;
- myristic, linoleic, arachidic, linolenic, eicosenoic, behenic and lignoceric acids for blends of rice oil and olive oil;
- myristic, oleic, linoleic, arachidic, linolenic, eicosenoic and behenic acids for blends of grape seed oil and olive oil.
Multivariate statistical analyses were therefore performed, for each data set, on the variables just mentioned. Moreover, Anova showed that some variables (palmitic, palmitoleic, margaric, margaroleic and stearic acids) do not have discrimination power for any of the data sets analysed, because the between-category variability is not significantly different from the within-category one.
3.2 Principal component analysis (PCA)
Variables were preprocessed by column autoscaling (Monfreda, Gobbi & Grippa, 2012). PCs with eigenvalues >1 were highlighted.
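As a rough illustration of these two steps, the sketch below autoscales a data matrix column by column and keeps the components whose eigenvalue exceeds 1. It assumes NumPy and uses synthetic data; the function names and the simulated variables are hypothetical, and the computations in the paper were performed with V-PARVUS, not with this code.

```python
import numpy as np


def autoscale(X):
    """Column autoscaling: subtract each column mean and divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pca_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X, sorted in decreasing order."""
    Z = autoscale(X)
    corr = np.cov(Z, rowvar=False)           # covariance of autoscaled data = correlation matrix
    return np.linalg.eigvalsh(corr)[::-1]    # eigvalsh returns ascending order, so reverse


# Synthetic data set: 36 blends (12 per level) and 4 hypothetical variables.
rng = np.random.default_rng(0)
conc = np.repeat([40.0, 50.0, 60.0], 12)     # % of olive oil
X = np.column_stack([
    conc + rng.normal(0.0, 1.0, conc.size),        # increases with olive oil content
    -conc + rng.normal(0.0, 1.0, conc.size),       # decreases with olive oil content
    0.5 * conc + rng.normal(0.0, 1.0, conc.size),  # weaker dependence
    rng.normal(0.0, 1.0, conc.size),               # unrelated to olive oil content
])

eigvals = pca_eigenvalues(X)
significant = eigvals > 1.0                  # PCs retained by the eigenvalue > 1 rule
explained = 100.0 * eigvals / eigvals.sum()  # % of total variance per PC
print(eigvals, significant, explained)
```

After autoscaling every variable has unit variance, so the eigenvalues sum to the number of variables and an eigenvalue above 1 marks a component that explains more variance than any single original variable.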
3.2.1 Blends of olive oil and peanut oil.
A data set of 36 samples and eight variables (myristic, oleic, linoleic, arachidic, linolenic, eicosenoic, behenic and lignoceric acids, selected by Anova, par. 3.1) was processed. The first two PCs explain 93.36% of the total variance.
The biplot of PC2 vs PC1 is shown in Fig. 1 (a). A well-defined separation of samples in accordance with the percentage of olive oil was achieved on PC1. The variables with positive loadings on PC1, where blends with 60% of olive oil are grouped, are oleic and linolenic acids, while all the other variables have negative loadings on PC1, where blends with 60% of peanut oil are grouped.
Such loading distribution is very consistent with the fatty acid composition of pure oils: in fact, olive oil has a significantly higher content of oleic and linolenic acids compared to peanut oil, while the latter has a significantly higher concentration of the other fatty acids compared to olive oil (Monfreda, Gobbi & Grippa, 2012 (tables 2 and 3)).
Blends that are more similar to olive oil than peanut oil (with 60% of olive oil) are therefore grouped toward positive values of PC1, whilst blends that are more similar to peanut oil (with 40% of olive oil) cluster toward negative values of PC1. Blends containing 50% of olive oil cluster in the middle of the plot.
3.2.2 Blends of olive oil and corn oil.
From the PCA, applied to a data set of 36 samples and seven variables (myristic, oleic, linoleic, arachidic, linolenic, behenic and lignoceric acids, selected by Anova, par. 3.1), only one PC with eigenvalue >1 was extracted, explaining 72.37% of the total variance.
In this case too, a well-defined separation of samples in accordance with the percentage of olive oil was achieved (Fig. 1 (b)): blends containing 60% of olive oil, grouped toward positive values of PC1, have a high content of oleic acid, the only variable with a positive loading on PC1.
Blends with 40% of olive oil, grouped together on negative values of PC1, show a greater content of myristic, linoleic, behenic and lignoceric acids, variables with high negative loadings on PC1. Similarly to the previous case (blends of olive oil and peanut oil), the mixtures containing 50% of olive oil are grouped around the axis origin.
Arachidic and linolenic acids show negative loadings on both PC1 and PC2, along a direction with no variability due to the percentage of olive oil in the mixtures (Fig. 1 (b)).
It is possible to correlate this loadings distribution with the fatty acids profile of the pure oils (Tables 2 and 3 in the previous paper). Olive oils show a high relative amount of oleic acid, which is, in fact, the variable most related to the mixtures containing 60% of olive oil, while the pure corn oils have a high content of linoleic acid, a variable strongly related to the blends with 60% of corn oil. Myristic, behenic and lignoceric acids are minor components of the pure oils (even if corn oil contains a greater amount of them than olive oils do), but the preprocessing by autoscaling tends to highlight even small differences among blends. This explains why blends with 60% of corn oil show a higher relative amount of these compounds compared to blends containing a greater amount of olive oil.
3.2.3 Blends of olive oil and rice oil.
PCA, applied to a data set of 36 samples and seven variables (myristic, linoleic, arachidic, linolenic, eicosenoic, behenic and lignoceric acids, selected by Anova, par. 3.1), allowed the extraction of only one PC with eigenvalue >1, explaining 79.43% of the total variance.
In this case too, the biplot of PC2 versus PC1 (Fig. 1 (c)) shows a very good separation of samples in accordance with the percentage of olive oil. Such a separation is achieved on PC1, but here blends with 60% of olive oil have negative values on PC1 and samples containing 40% of olive oil are grouped toward positive values of PC1.
Regarding the loadings distribution, all the variables have high positive loadings on PC1: in other words, the samples richest in rice oil have the highest content of all the fatty acids used as variables in PCA. These results were compared with the fatty acids profile of pure oils (Tables 2 and 3 in the previous paper), which shows, in fact, that rice oil has a higher content of all fatty acids used as variables in the present statistical analysis. Oleic acid, the most abundant fatty acid in olive oil, was excluded from the PCA because Anova showed that it is not a discriminant variable for this kind of mixture.
3.2.4 Blends of olive oil and grape seed oil.
A data set of 36 samples and seven variables (myristic, oleic, linoleic, arachidic, linolenic, eicosenoic and behenic acids, selected by Anova, par. 3.1) was processed. The first two PCs explain 81.47% of the total variance.
In this case too, a well-defined separation of samples in accordance with the percentage of olive oil was achieved on PC1, as can be seen from the biplot of PC2 versus PC1 (Fig. 1 (d)).
Blends containing 60% of olive oil, grouped toward positive values of PC1, have a high relative content of oleic, linolenic, arachidic and behenic acids (these differences were found as well between pure olive oils and grape seed oils).
Blends with 40% of olive oil, clustered toward negative values of PC1, show a higher content of linoleic acid (this variable has a significantly higher content in a pure grape seed oil compared to an olive oil) and myristic acid (a minor component for which autoscaling tends to highlight even small differences between the pure oils mixed to prepare the blends).
3.2.5 PCA applied to a data set made up by all blends containing 40%, 50% and 60% of olive oil
All samples so far processed by PCA were put together with the samples of blends of olive oil and sunflower oil (analysed in the previous article). PCA was then applied to a data set of 180 samples.
Variables were selected by eliminating those most frequently excluded from the statistical analyses carried out on each separate data set. Margaric, margaroleic, palmitic, palmitoleic and stearic acids were therefore eliminated, because, as highlighted in par. 3.1, such variables do not have a discrimination power for any of the data sets analysed.
PCA was applied to eight variables (myristic, oleic, linoleic, arachidic, linolenic, eicosenoic, behenic and lignoceric acids).
The first two PCs have eigenvalues >1 and explain 85.10% of the total variance.
Samples were displayed, first of all, depending on both the percentage of olive oil contained therein, and the seed oil used in the mixture. In this way, fifteen classes were highlighted (three classes for each type of seed oil).
Through the evaluation of Fisher weights (calculated, for each pair of classes and for each PC, as the ratio of the between-category variance to the within-category variance), PC1, PC2 and PC4 were found to be the most significant components for the purpose of grouping samples according to their class.
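A Fisher weight of this kind can be sketched for one pair of classes on one PC. The form used here, squared difference of the class means divided by the sum of the class variances, is one common definition and is an assumption: the exact formula implemented in V-PARVUS is not given in the text. The scores are hypothetical.

```python
from statistics import mean, variance


def fisher_weight(scores_a, scores_b):
    """Between-class over within-class variability for two classes on one component."""
    between = (mean(scores_a) - mean(scores_b)) ** 2   # separation of the class means
    within = variance(scores_a) + variance(scores_b)   # spread inside the two classes
    return between / within


# Hypothetical scores of two classes on two components (illustrative values only)
pc1_class1 = [-2.1, -1.9, -2.0, -2.2, -1.8]
pc1_class3 = [1.9, 2.1, 2.0, 2.2, 1.8]
pc2_class1 = [0.5, -0.4, 0.1, -0.6, 0.4]
pc2_class3 = [0.3, -0.5, 0.2, -0.3, 0.3]

w_pc1 = fisher_weight(pc1_class1, pc1_class3)
w_pc2 = fisher_weight(pc2_class1, pc2_class3)
print(w_pc1, w_pc2)
```

A component with a high weight separates the two classes well relative to their internal scatter, so ranking the PCs by this value for each pair of classes indicates which score plots are worth displaying.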
The score plots of PC2 versus PC1 and PC4 versus PC1 are shown in Fig. 2 (a) and 2 (b), respectively, where fifteen groups, almost all well separated from each other, can be seen. Moreover, groups of blends containing the same types of pure oils are closer to each other than groups with the same percentage of olive oil. These graphical results clearly indicate that the chemical properties of mixtures are more closely linked to their qualitative composition (the two types of oils of which they are made up) than to the actual content of olive oil. This conclusion could be explained by considering the analytical working range: blends containing 50% of olive oil ± 10%. In other words, olive oil, the component that all classes have in common, does not have an adequate concentration to determine a grouping of samples that is independent of the other vegetable oil in the blend. These results demonstrate, furthermore, how different the analytical problem of defining blends of olive oil and other vegetable oils is from the study of olive oil adulteration.
Results obtained from PCA were also displayed by dividing samples into only three classes, based on the percentage of olive oil. The most significant components for the purpose of grouping samples according to their class, evaluated by means of the Fisher weights, were, in this case, PC3 and PC6. The biplot of PC6 versus PC3 is shown in Fig. 2 (c), in which classes 1 and 2 overlap, as do classes 2 and 3, whilst classes 1 and 3 appear completely separated.
From this biplot it can be deduced that, although the evaluation of many blends based only on the olive oil content is by far more complex than studying sets of binary blends, it is possible, even with unsupervised tools, to identify a trend in the distribution of samples belonging to such complex systems.
From the loadings distribution (Fig. 2 (c)), it can be seen that linoleic acid has the highest loading in the direction of maximum separation between classes, which is indeed where samples with 40% of olive oil tend to cluster. Myristic, linolenic, eicosenoic, behenic and lignoceric acids have a positive weight in the same direction, whilst oleic and arachidic acids have high loadings in an almost orthogonal direction, which does not appear correlated with the olive oil percentage in blends.
3.3. Target Factor Analysis (TFA)
Target factor analysis was applied to the four data sets already processed by PCA, to which six "pure" objects, corresponding to the fatty acid profiles of the same number of pure oils, were added: olive oil (having a composition equal to the mean value among the 241 olive oils already considered in the previous paper for the TFA), sunflower oil (the same used for TFA in the previous work) and corn, peanut, rice and grape seed oil (different from the samples used for preparing blends).
Target Factor Analysis was applied, as an exploratory tool, in order to check whether the fingerprints of the pure oils used for preparing blends would be recognized. Variables were preprocessed by column autoscaling. Factors with eigenvalues >1 were chosen as significant.
With regard to blends of peanut oil and olive oil, two significant factors were extracted with PCA; the same number of target factors should then be identified by TFA, in other words the two types of oil used to produce the blends. The six factors, ordered according to the residual variance, gave as first and second target factors the chromatographic profiles of olive oil and peanut oil, respectively.
As regards blends of corn oil and olive oil, only one significant factor was obtained by PCA. The first target factor identified by TFA was the chromatographic profile of olive oil, while corn oil was found as second target factor.
For blends of rice oil and olive oil PCA also identified only one significant factor. TFA allowed the identification of rice oil as first target factor, while olive oil was found as second target factor.
With regard to blends of grape seed oil and olive oil, two significant factors were extracted with PCA, while the first and second target factors, identified by TFA, were the chromatographic profiles of olive oil and corn oil, respectively. The fatty acids composition of grape seed oil appears as third target factor. This failure to recognise the grape seed oil could be due to the similarity of the fatty acids profiles between the grape seed oil used for preparing blends and the corn oil used as pure object in TFA.
TFA was then applied to the data set consisting of all blends (as was done for PCA in par. 3.2.5), leading to the identification of olive oil, the only type of oil present in all samples, as first target factor.
This unsupervised chemometric tool allowed, on the whole, a good description of the systems under study (except for blends of grape seed oil and olive oil), even if the number of significant factors identified is not always consistent with the number of pure oils used for preparing blends.
3.4. SIMCA
SIMCA was applied to the four data sets already processed by PCA (blends of olive oil with each type of seed oil), performing a cross validation with 6 cancellation groups. The mathematical model of each class was built with 7 components for blends of olive oil and peanut oil, and 6 components for the other blends. SIMCA was applied, considering a 95% confidence level to define the class space and an unweighted augmented distance (Wold, Sjostrom, 1977).
Models with classification ability (modeling rate), sensitivity (the percentage of objects belonging to the category which are correctly identified by the mathematical model) and specificity (the percentage of objects from other categories which are classified as foreign) equal to 100% were obtained. The prediction ability (prediction rate) was 100% for blends of olive oil and rice oil, and 97.22% for the remaining data sets.
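The definitions of sensitivity and specificity given above translate directly into counts. The sketch below is illustrative only: the function names and the acceptance pattern are hypothetical, and the reading of 97.22% as 35 correct predictions out of 36 samples is an inference from the data set size, not a figure stated in the text.

```python
def rate(hits, total):
    """Percentage of hits over total."""
    return 100.0 * hits / total


def sensitivity(accepted, own_class):
    """% of objects of the modelled class that the class model accepts."""
    own = [a for a, o in zip(accepted, own_class) if o]
    return rate(sum(own), len(own))


def specificity(accepted, own_class):
    """% of objects of other classes that the class model rejects as foreign."""
    foreign = [a for a, o in zip(accepted, own_class) if not o]
    return rate(sum(1 for a in foreign if not a), len(foreign))


# Hypothetical class model for the 50% category of one 36-sample data set:
# 12 objects belong to the class and 24 come from the 40% and 60% categories.
own_class = [True] * 12 + [False] * 24
accepted = [True] * 12 + [False] * 24    # every own object accepted, every foreign one rejected

sens = sensitivity(accepted, own_class)
spec = specificity(accepted, own_class)
pred_rate = round(rate(35, 36), 2)       # 35 of 36 samples predicted correctly
print(sens, spec, pred_rate)
```

With 36 samples per data set, a single misassigned sample in cross validation is enough to lower the prediction rate from 100% to 97.22%.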
These noteworthy results, shown in Table 1, are consistent with those obtained with PCA, where in each biplot the three classes are already grouped along the PC1 axis.
SIMCA was also applied to the data set of 180 samples and 8 variables, described in par. 3.2.5, divided into three classes based on the percentage of olive oil. A cross validation was performed with 20 cancellation groups and 7 components were used to build the mathematical model of each class. Classification ability for each class was 100%, and the mean value of prediction ability was 97.78%. The mean values of sensitivity and specificity were equal to 95.56% and 99.72%, respectively, as shown in Table 1.
Coomans plots (Coomans et al., 1984), representing the distances of samples from classes 1 (blends with 40% of olive oil) and 2 (blends with 50% of olive oil) and from classes 2 and 3 (blends with 60% of olive oil), are shown in Fig. 3 (a) and 3 (b), respectively.
The application of SIMCA, a class modeling tool, allowed a step forward compared to PCA (which revealed a trend in the distribution of samples), since it demonstrated that it is possible to achieve a true recognition of blends containing 40%, 50% or 60% of olive oil.
3.5 PLS
PLS was applied to the four data sets, already used for PCA and SIMCA. Twelve samples were extracted from each data set in order to construct the external calibration set. Variables were column centered, six cancellation groups were used for model validation, obtaining the best prediction with three latent variables for the data set of olive oil – peanut oil, and four latent variables for the other data sets. Therefore the closed form was calculated with a complexity of three for the first data set, while a complexity of four was used for the remaining data sets.
PLS was also applied to the global data set of 180 samples, as already done with PCA and SIMCA. Thirty samples were extracted for the construction of the external calibration set, fifteen cancellation groups were used for model validation and the best prediction was obtained with seven latent variables. The closed form was calculated with a complexity of seven.
The parameters used for the evaluation of the models are shown in Table 1.
Performance of the models, evaluated through the mean of the standard deviation of the error of prediction (SDEP), expressed as percentage of olive oil in blends, is more than satisfactory. Its values do not show significant differences among the data sets, even considering the one of 180 samples.
Model stability (the variability of SDEP among cancellation groups) was also good, because of the low values of the standard deviation of SDEP in cancellation groups and the standard deviation of the mean of SDEP in cancellation groups. These parameters are slightly higher, but still acceptable, for blends of corn oil – olive oil and blends of rice oil – olive oil.
The RMSEP values (Kowalski, Seasholtz, 1991; Massart et al., 1997) obtained for the external evaluation set were very satisfactory; in this case too, blends of corn oil – olive oil and blends of rice oil – olive oil have the highest error of prediction (4.22% and 4.98%, respectively).
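For reference, the root mean square error of prediction can be computed as in the following sketch, which uses the usual definition (square root of the mean squared difference between predicted and reference values). The olive oil percentages are hypothetical, and the function is not taken from V-PARVUS.

```python
from math import sqrt


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over an external set of samples."""
    n = len(y_true)
    return sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical external samples: reference vs predicted % of olive oil
y_true = [40.0, 50.0, 60.0, 40.0]
y_pred = [43.0, 46.0, 60.0, 40.0]

error = rmsep(y_true, y_pred)
print(error)
```

Because the error is expressed in the same units as the response (percentage of olive oil), it can be compared directly with the 5% or 10% spacing between the blend categories.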
An overall evaluation of the regression model obtained by putting together all samples, which represents the most important analytical challenge of this work, is very positive, as these results (mean of SDEP equal to 1.68% and RMSEP equal to 3.97%) were achieved by mixing olive oils with the widest variability in fatty acids composition with five types of seed oils.
3.6 Supervised chemometric tools applied to data sets including blends with 45% and 55% of olive oil
Results achieved with the chemometric tools described so far, especially those obtained from modeling and regression tools, show that the proposed method is able to discriminate blends with a difference in concentration lower than 5%.
Considering, moreover, that the calibration was carried out using blends with a concentration difference equal to 10% (mixtures containing 40%, 50% and 60% of olive oil), a further statistical analysis was performed by adding the blends with 45% and 55% of olive oil previously excluded from chemometric evaluation.
Although Anova has already shown (par. 3.1.) that variables tend to lose their discriminant power when comparing blends with a concentration difference of 5%, the aim of this statistical analysis was to verify whether the performance of supervised tools could still be improved.
Variables selected by Anova for the reduced data sets (par. 3.1.) were used for blends of olive oil – corn oil, olive oil – peanut oil, olive oil – rice oil and olive oil – grape seed oil, while the variables selected in the previous paper were used for the data set olive oil – sunflower oil.
SIMCA was applied to the five data sets, performing a cross validation with ten cancellation groups. The mathematical model of each class was built with seven components for the blends of olive oil – peanut oil and olive oil – sunflower oil; six components were used for the other blends. Results are shown in Table 2: the models still allow a good differentiation of samples as a function of the amount of olive oil in the blend.
All samples considered so far were put together, and a data set of 300 samples and eight variables (selected in par. 3.2.5.) was therefore drawn up. SIMCA was applied, performing a cross validation with twenty cancellation groups and using seven components to build the mathematical model of each class. The results obtained, shown in Table 2, are still very satisfactory, allowing a fairly good classification of blends with a concentration difference of 5% in olive oil. It must be said, nevertheless, that SIMCA gave the best performance when applied to each type of binary blend, rather than to the overall data set.
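Sensitivity and specificity of the class models are reported alongside the classification and prediction rates. The sketch below assumes the usual class-modelling definitions: sensitivity is the percentage of samples of a class accepted by the model of that class, and specificity is the percentage of samples of the other classes rejected by it. The function name and the four example samples are hypothetical.

```python
def sensitivity_specificity(true_classes, accepted_by):
    """Return {class: (sensitivity %, specificity %)}.

    accepted_by[i] is the set of class labels whose model accepts sample i.
    """
    result = {}
    for c in sorted(set(true_classes)):
        own = [acc for t, acc in zip(true_classes, accepted_by) if t == c]
        others = [acc for t, acc in zip(true_classes, accepted_by) if t != c]
        sens = 100.0 * sum(c in acc for acc in own) / len(own)
        # With a single class there are no foreign samples to reject
        spec = 100.0 * sum(c not in acc for acc in others) / len(others) if others else float("nan")
        result[c] = (sens, spec)
    return result


# Hypothetical example: two blends with 45% and two with 50% of olive oil;
# the second sample is accepted by both class models
true_classes = [45, 45, 50, 50]
accepted_by = [{45}, {45, 50}, {50}, {50}]
stats = sensitivity_specificity(true_classes, accepted_by)
print(stats)
```

In this example the model of the 50% class accepts one of the two 45% samples, so its specificity drops to 50% while both sensitivities remain at 100%.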
Coomans plots, representing the distances of samples from classes 2 (blends with 45% of olive oil) and 3 (blends with 50% of olive oil) and from classes 3 and 4 (blends with 55% of olive oil), are shown in Fig. 4(a) and 4(b) respectively. These plots highlight the model's ability to differentiate blends with 50% of olive oil from those containing 45% and 55%; they show that a number of samples belonging to classes 2 and 3 (Fig. 4(a)) fall in the area (bounded by the square in the lower left, starting from the origin of the axes) where the two classes overlap. The same thing happens in Fig. 4(b) for a number of samples belonging to classes 3 and 4.
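The way a sample is placed in a Coomans plot can be sketched as follows. Each sample has a distance from each of the two class models, and each model has a critical distance; samples lying within both critical distances fall in the lower-left square where the two classes overlap. The function name, its arguments and the numerical values are hypothetical.

```python
def coomans_region(dist_a, dist_b, crit_a, crit_b):
    """Return the region of a Coomans plot in which a sample falls."""
    in_a = dist_a <= crit_a  # accepted by the model of class A
    in_b = dist_b <= crit_b  # accepted by the model of class B
    if in_a and in_b:
        return "overlap"  # lower-left square starting from the origin
    if in_a:
        return "class A only"
    if in_b:
        return "class B only"
    return "neither class"


# Hypothetical distances, with both critical distances set to 1.0
print(coomans_region(0.6, 0.8, 1.0, 1.0))
print(coomans_region(0.6, 2.5, 1.0, 1.0))
```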
Coomans plots for blends of olive oil and corn oil are shown, as an example, in Fig. 4(c) and 4(d): no sample is found in the area where classes 2 and 3, or classes 3 and 4, overlap. Therefore the Coomans plots confirm that the best performance of SIMCA is achieved when each type of binary blend is processed separately.
As done for SIMCA, multivariate regression by PLS was applied to the five data sets; twenty samples, forming the external calibration set, were extracted. Variables were centered and ten cancellation groups were used for model validation, obtaining the best prediction with seven latent variables for the data set of olive oil – sunflower oil, six latent variables for the data set of olive oil – peanut oil, and five latent variables for the other data sets. Therefore the closed form was calculated with a complexity of seven for the first data set and six for the second data set, while a complexity of five was used for the remaining data sets.
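The choice of model complexity can be illustrated with a minimal sketch. It assumes that the best prediction corresponds to the number of latent variables giving the lowest cross-validated SDEP, with ties resolved in favour of the simpler model; this criterion is an assumption, and the function name and SDEP values are hypothetical.

```python
def best_complexity(sdep_by_complexity):
    """Return the number of latent variables with the lowest SDEP.

    Ties are resolved in favour of the lower complexity (assumed rule).
    """
    return min(sdep_by_complexity, key=lambda k: (sdep_by_complexity[k], k))


# Hypothetical cross-validated SDEP (% of olive oil) for 3 to 7 latent variables
sdep_by_complexity = {3: 2.4, 4: 2.0, 5: 1.7, 6: 1.8, 7: 1.9}
chosen = best_complexity(sdep_by_complexity)
print(chosen)  # 5
```

The closed form would then be calculated on all calibration samples with the chosen complexity.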
PLS was, moreover, applied to the overall data set of 300 samples, already processed by SIMCA. Thirty samples were extracted in order to construct the external calibration set. Twenty-seven cancellation groups were used for model validation, obtaining the best prediction with seven latent variables; the closed form was then calculated with a complexity of seven.
The parameters used for model evaluation are shown in Table 2.
Comparing these data with the PLS results shown in Table 1 (for blends of olive oil – sunflower oil, data are reported in the previous paper), it is quite clear that there are no significant differences: regarding the mean of SDEP, the performance of the models tends to improve, except for the data set of blends of olive oil – peanut oil and the global data set.
Regarding the errors of prediction on the external test set, assessed through the RMSEP, there has been a slight worsening of the models' performance, with the exception of the blends of olive oil – rice oil, for which there has instead been an improvement.
Differences between the PLS results shown in Tables 1 and 2 are, however, rather small. For this reason it can be argued that quantitative analyses of blends containing an amount of olive oil close to 50% need a calibration with blends containing 40%, 50% and 60% of olive oil: the models show errors of prediction of less than 5% (RMSEP = 3.97%). When calibration is done by adding blends containing 45% and 55% of olive oil, such results are roughly confirmed (RMSEP = 4.62%), but not improved. This finding is fundamental when an unknown blend needs to be compared with a calibration set representative of a large variability in the fatty acid profiles of both olive oils and seed oils.
However, when blends with 45% and 55% of olive oil are added to the calibration set of blends with only one type of seed oil, even if prediction errors with PLS are not always reduced, the knowledge of such systems improves, because of the power of class modeling tools.
4. Conclusions
The extension of the previous study to blends of olive oils with four other types of seed oils (corn, peanut, rice and grape seed) has fully confirmed the earlier results, leading to the construction of models capable of verifying and recognising the percentage of olive oil in a binary blend.
Moreover, the application of supervised tools to the data set consisting of all the blends with 40%, 50% and 60% of olive oil yielded a noteworthy classification model through SIMCA (sensitivity equal to 95.56% and specificity of 99.72%) as well as an excellent quantitative model by means of PLS (mean value of SDEP equal to 1.68% and RMSEP of 3.97%).
Good classification models were still obtained by adding blends containing 45% and 55% of olive oil. In this case, the best results were achieved by applying SIMCA to the separate data sets of binary blends rather than to the overall data set.
Regarding the PLS algorithm, the introduction of blends with 45% and 55% of olive oil did not bring significant improvements; indeed, in some cases there was even a slight deterioration in the errors of prediction. However, an error of prediction (on the external evaluation set, RMSEP) lower than 5% was confirmed.
Moreover, the assessment of blends which differ by 10% in olive oil concentration allowed the identification of the variables most significant for the purpose of this study, while the analyses of blends differing by 5% in olive oil concentration allowed a complete evaluation of the potential and limits of the proposed method.
In conclusion, the application of this method for analysing an unknown binary blend, in order to verify compliance with Commission Regulation EU No. 29/2012, could be managed by using a calibration set of blends with 40%, 45%, 50%, 55% and 60% of olive oil, if comparison with only one type of binary blend is needed. If, on the other hand, the type of seed oil (of the unknown blend) is not known, a comparison with different types of binary blends would be necessary, and hence the calibration would be carried out with blends containing 40%, 50% and 60% of olive oil.
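The recommendation above amounts to a simple rule, sketched here for illustration only; the function name and its argument are hypothetical.

```python
def calibration_levels(seed_oil_known):
    """Return the olive oil percentages suggested for the calibration blends."""
    if seed_oil_known:
        # Comparison with a single type of binary blend
        return [40, 45, 50, 55, 60]
    # Comparison with several types of binary blends
    return [40, 50, 60]


print(calibration_levels(True))
print(calibration_levels(False))
```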
References
Commission Regulation EC No. 1019/2002 on marketing standard for olive oil, Official Journal of the European Communities, L155, 2002, 27-36.
Commission Regulation EU No. 29/2012 on marketing standard for olive oil, Official Journal of the European Communities, L12, 2012, 14-26.
Commission Regulation EEC No. 2568/91 of 11 July 1991 on the characteristics of olive oil and olive-residue oil and on the relevant methods of analysis. Official Journal of the European Communities, L248, 1991, 1-83 and successive modifications.
Coomans, D., Broeckaert, I., Derde, M.P., Tassin, A., Massart, D.L., & Wold, S. (1984). Use of a microcomputer for the definition of multivariate confidence regions in medical diagnosis based on clinical laboratory profiles. Computers and Biomedical Research, 17, 1-14.
De la Mata-Espinosa, P., Bosque-Sendra, J.M., Bro, R., & Cuadros-Rodríguez, L. (2011). Olive oil quantification of edible vegetable oil blends using triacylglycerols chromatographic fingerprints and chemometric tools. Talanta, 85, 177-182.
De la Mata, P., Dominguez-Vidal, A., Bosque Sendra, J.M., Ruiz-Medina, A., Cuadros Rodríguez, L., & Ayora-Cañada, M.J. (2012). Olive oil assessment in edible oil blends by means of ATR-FTIR and chemometrics. Food Control, 23, 449-455.
Fasciotti, M., & Pereira Netto, A.D. (2010). Optimization and application of methods of triacylglycerol evaluation for characterization of olive oil adulteration by soybean oil with HPLC-APCI-MS-MS. Talanta, 81, 1116-1125.
Forina, M., Lanteri, S., Armanino, C., Cerrato-Oliveiros, C., & Casolino, C. V-PARVUS 2010: An Extendable Package of Programs for Data Explorative Analysis, Classification and Regression Analysis, Department of Chimica e Tecnologie Farmaceutiche e Alimentari, University of Genova, Genova, Italy. URL http://www.parvus.unige.it.
Gurdeniz, G., & Ozen, B. (2009). Detection of adulteration of extra-virgin olive oil by chemometric analysis of mid-infrared spectral data. Food Chemistry, 116, 519-525.
IBM SPSS Statistics computer program, Version 19, 2010.
Kasemsumran, S., Kang, N., Christy, A., & Ozaki, Y. (2005). Partial Least Squares Processing of Near-Infrared Spectra for Discrimination and Quantification of Adulterated Olive Oils. Spectroscopy Letters, 38, 839-851.
Kowalski, B.R., & Seasholtz, M.B. (1991). Recent developments in multivariate calibration. Journal of Chemometrics, 5, 129-145.
Maggio, R.M., Cerretani, L., Chiavaro, E., Kaufman, T.S., & Bendini, A. (2010). A novel chemometric strategy for the estimation of extra virgin olive oil adulteration with edible oils. Food Control, 21, 890-895.
Massart, D.L., Vandeginste, B.G.M., Buydens, L.M.C., De Jong, S., Lewi, P.J., & Smeyers-Verbeke, J. (1997). Data Handling in Science and Technology 20A, Handbook of Chemometrics and Qualimetrics Part A, Elsevier, Amsterdam.
Monfreda, M., Gobbi, L., & Grippa, A. (2012). Blends of olive oil and sunflower oil: characterisation and olive oil quantification using fatty acid composition and chemometric tools. Food Chemistry, 134, 2283-2290.
Peña, F., Cárdenas, S., Gallego, M., & Valcárcel, M. (2005). Direct olive oil authentication: Detection of adulteration of olive oil with hazelnut oil by direct coupling of headspace and mass spectrometry, and multivariate regression techniques. Journal of Chromatography A, 1074, 215-221.
Poulli, K.I., Mousdis, G.A., & Georgiou, C.A. (2007). Rapid synchronous fluorescence method for virgin olive oil adulteration assessment. Food Chemistry, 105, 369-375.
Priego Capote, F., Ruiz Jiménez, J., & Luque de Castro, M.D. (2007). Sequential (step-by-step) detection, identification and quantitation of extra virgin olive oil adulteration by chemometric treatment of chromatographic profiles. Analytical and Bioanalytical Chemistry, 388, 1859-1865.
Rohman, A., & Che Man, Y.B. (2012). Authentication of extra virgin olive oil from sesame oil using FTIR spectroscopy and gas chromatography. International Journal of Food Properties, 15, 1309-1318.
Ruiz-Samblás, C., Marini, F., Cuadros-Rodríguez, L., & González-Casado, A. (2012). Quantification of blending of olive oils and edible vegetable oils by triacylglycerol fingerprint gas chromatography and chemometric tools. Journal of Chromatography B, 910, 71-77.
Wold, S., & Sjöström, M. (1977). In B.R. Kowalski (Ed.), Chemometrics, Theory and Application, ACS Symposium Series No. 52 (pp. 243-282). American Chemical Society, Washington, DC.
Table 1 Results obtained by applying SIMCA and PLS to the four data sets (blends with 40%, 50% and 60% of olive oil) and to the data set of 180 samples. Peanut Corn
Rice
Grape seed Data set of 180 samples
SIMCA Mean % classification rate
100.00 100.00 100.00 100.00
100.00
Mean % prediction rate
97.22
97.78
Mean % sensitivity
100.00 100.00 100.00 100.00
95.56
Mean % specificity
100.00 100.00 100.00 100.00
99.72
Mean SDEP in canc. groups
1.57
1.76
1.50
1.17
1.68
SDEP st. dev. in canc. groups
0.29
0.64
0.64
0.37
0.40
St. dev. of the mean of SDEP in canc. groups 0.12
0.26
0.38
0.16
0.10
RMSEP
4.22
4.98
2.83
3.97
97.22
100.00 97.22
PLS
1.28
Table 2 Results obtained by applying SIMCA and PLS to the four data sets (blends with 40%, 45%, 50%, 55% and 60% of olive oil) and to the data set of 300 samples. Sunflower Peanut Corn
Rice
Grape seed Data set of 300 samples
Mean % classification rate
100.00
100.00 98.33
100.00 100.00
85.33
Mean % prediction rate
83.33
86.67
96.67
85.00
79.67
Mean % sensitivity
100.00
100.00 100.00 98.33
100.00
95.67
Mean % specificity
99.17
100.00 98.75
100.00 100.00
88.83
Mean SDEP in canc. groups
1.42
1.74
1.35
1.29
0.87
1.98
SDEP st. dev. in canc. groups
0.64
0.69
0.74
0.76
0.47
0.48
St. dev. of the mean of SDEP in canc. groups 0.20
0.22
0.23
0.24
0.15
0.09
RMSEP
2.34
4.96
3.53
3.38
4.62
SIMCA
81.67
PLS
2.83
Figure 1 Biplot of PC2 versus PC1 for blends of olive oil - peanut oil (a), olive oil - corn oil (b), olive oil - rice oil (c), olive oil - grape seed oil (d).
Figure 2 Data set of 180 samples: score plots of PC2 versus PC1 (a) and PC4 versus PC1 (b), obtained considering fifteen groups of samples. Biplot of PC6 versus PC3 (c), with samples divided into only three classes, based on the olive oil content.
Figure 3 Data set of 180 samples: Coomans plots for classes 1 and 2 (a) and for classes 2 and 3 (b).
Figure 4 Data set of 300 samples: Coomans plots for classes 2 and 3 (a) and for classes 3 and 4 (b). Blends of olive oil and corn oil: Coomans plots for classes 2 and 3 (c) and for classes 3 and 4 (d).
We proposed an analytical method for detecting the percentage of olive oil in a blend.
Fatty acids methyl esters analysis and chemometric tools were used.
This method meets requirements set out by Regulation EU No. 29/2012.
A noteworthy quantification model was developed.