Accepted Manuscript

Title: A new measure of orthogonality for multi-dimensional chromatography
Authors: Michelle Camenzuli, Peter J. Schoenmakers
PII: S0003-2670(14)00670-9
DOI: http://dx.doi.org/doi:10.1016/j.aca.2014.05.048
Reference: ACA 233292
To appear in: Analytica Chimica Acta
Received date: 22-2-2014
Revised date: 22-5-2014
Accepted date: 27-5-2014

Please cite this article as: Michelle Camenzuli, Peter J. Schoenmakers, A new measure of orthogonality for multi-dimensional chromatography, Analytica Chimica Acta, http://dx.doi.org/10.1016/j.aca.2014.05.048

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

A new measure of orthogonality for multi-dimensional chromatography

Michelle Camenzuli a,b,*, Peter J. Schoenmakers a,b

a van ’t Hoff Institute for Molecular Sciences, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
b COAST, Science Park 904, 1098 XH Amsterdam, The Netherlands

* Corresponding author’s contact details; Email: [email protected], Phone: +31-20-525 7040

Highlights

• A new method for measuring orthogonality in multidimensional separations is introduced.
• Our method also diagnoses areas where peaks are clustered in the separation space.
• The new method comprises a number of equations which are easily implemented in Microsoft Excel.
• We applied the method to 8 computer-generated and 2 experimental multidimensional chromatograms.
• The method compared favorably against established methods.

Abstract

Multi-dimensional chromatographic techniques, such as (comprehensive) two-dimensional liquid chromatography and (comprehensive) two-dimensional gas chromatography, are increasingly popular for the analysis of complex samples, such as protein digests or mineral oils. The reason behind the popularity of these techniques is the superior performance, in terms of peak-production rate (peak capacity per unit time), that multi-dimensional separations offer compared to their one-dimensional counterparts. However, to fully utilize the potential of multi-dimensional chromatography it is essential that the separation mechanisms used in each dimension be independent of each other. In other words, the two separation mechanisms need to be orthogonal. A number of algorithms have been proposed in the literature for measuring chromatographic orthogonality. However, these methods have their limitations, such as reliance on the division of the separation space into bins, the need for specialist software or the requirement of advanced programming skills. In addition, some of the existing methods for measuring orthogonality include regions of the separation space that do not feature peaks. In this paper we introduce a number of equations which provide information on the spread of the peaks within the separation space in addition to measuring orthogonality, without the need for complex computations or division of the separation space into bins.

KEYWORDS: multi-dimensional chromatography, LC×LC, GC×GC, orthogonality, separation science

1. INTRODUCTION

When analysing very complex samples, it is essential to use a technique that is capable of providing the maximum separation power possible. Multi-dimensional chromatographic techniques, for example comprehensive two-dimensional liquid chromatography (LC×LC), are much more powerful than their one-dimensional counterparts. This is because of the larger peak capacity that multi-dimensional techniques afford in a reasonable time [1, 2]. The high peak capacities arise from the combination of two or more separation techniques within the one system. However, the choice of separation mechanisms in each dimension has a large effect on whether the high peak capacity of the corresponding multi-dimensional system can be effectively exploited. In order to attain the maximum effective peak capacity, the separation mechanisms in each dimension must be independent from each other. In other words, the dimensions must be chromatographically orthogonal [3]. Multi-dimensional techniques that use orthogonal separation mechanisms are capable of fully exploiting the various chemical and physical properties of the sample to obtain better separations [3].

The importance of chromatographic orthogonality is not restricted to multi-dimensional chromatography. In the pharmaceutical industry, part of the validation process for quality-control methods requires the development of two separation methods, the separation mechanisms of which must be chromatographically orthogonal. This ensures that the quantification of impurities within the sample is as accurate as possible [4].

The importance of chromatographic orthogonality has led to the development of a variety of methods for its measurement, particularly in multi-dimensional chromatography. Perhaps the most widely known methods involve dividing the separation space into bins. The number of bins containing peaks is counted and related back to the total number of bins within the separation space [5, 6]. Another variation of the bin-counting method has been proposed by D. Stoll, an author of [6]. It involves drawing a box around the part of the separation space which contains peaks. The bins within this box are counted and reported as a proportion of the total number of bins within the separation space. These methods are elegant in their simplicity and are effective. However, they are strongly affected by the decision the user must make with regard to the total number of bins to use in the division of the separation space [7]. The total number of bins in these methods is ideally equal to the total number of components within the sample. With complex samples, which contain hundreds or perhaps even thousands of peaks, determining the total number of components in the sample is not straightforward. Peaks often co-elute within such samples, making estimates of the total number of peaks (and, thus, the ideal number of bins) quite error-prone. Furthermore, the width of the bins is determined by the peak width, which is assumed constant. This assumption or
requirement is fine for temperature-programmed elution in GC and gradient elution in LC. However, in GC×GC the second-dimension separation is usually performed in (near-)isothermal mode and in LC×LC the second dimension may also involve isocratic elution to eliminate the need to equilibrate the column between runs. It is well known that the peak width in isothermal GC or isocratic LC is not constant but increases with increasing retention time. This could pose a problem for the selection of the bin width. Another variant of the bin-counting methods is the fractal approach [8]. Although it relies on bin counting, it is implemented in quite a different manner. This approach is based on the mathematical concept of fractals, which relates to the scaling of self-similar objects. In the case of multi-dimensional separation, these self-similar objects are bins. The implementation of this approach involves applying a number of bins which scales with the length/height of the bins. The logarithm of the number of bins required to cover the used separation space is plotted against the logarithm of the length/height scaling parameter. The slope of this plot is multiplied by -1, which results in a value of dimensionality [8]. For a completely orthogonal two-dimensional separation, a dimensionality value of 2.00 is obtained. A completely non-orthogonal two-dimensional separation would have a dimensionality value of 1.00 [8]. Because the dimensions of the bins are scaled, the fractal approach does not rely on correct determination of the peak width, which gives it a potential advantage over more established bin methods. However, it still shares the other limitations of the bin-counting methods. That is, the number of bins must be appropriate for the number of sample components and it is not possible to automate the fractal approach at this stage. It is important to note that the fractal approach loses the simplicity with which the Gilar and Stoll bin-counting methods can be implemented. This ease of implementation is one of the strong aspects of the Gilar and Stoll bin-counting methods.
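
The log-log slope calculation described above can be illustrated with a generic box-counting sketch in Python. This is not the procedure of reference [8]: the function names, the hand-picked box sizes (2, 4, 8 and 16 boxes per axis) and the two synthetic peak sets are assumptions made purely for illustration, and the sketch does not address the choice of an appropriate number of bins discussed above.

```python
import math


def occupied_boxes(points, boxes_per_axis):
    """Count grid boxes holding at least one peak (points normalized to [0, 1])."""
    k = boxes_per_axis
    cells = set()
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last box.
        cells.add((min(int(x * k), k - 1), min(int(y * k), k - 1)))
    return len(cells)


def box_counting_dimension(points, boxes_per_axis=(2, 4, 8, 16)):
    """Return -1 times the least-squares slope of log(count) versus log(box edge)."""
    log_size = [math.log(1.0 / k) for k in boxes_per_axis]
    log_count = [math.log(occupied_boxes(points, k)) for k in boxes_per_axis]
    mean_x = sum(log_size) / len(log_size)
    mean_y = sum(log_count) / len(log_count)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(log_size, log_count))
             / sum((a - mean_x) ** 2 for a in log_size))
    return -slope


# Hypothetical test sets: peaks on the upward diagonal, and peaks on a full grid.
diagonal = [(i / 999, i / 999) for i in range(1000)]
grid = [((i + 0.5) / 64, (j + 0.5) / 64) for i in range(64) for j in range(64)]

print(box_counting_dimension(diagonal))  # close to 1 (non-orthogonal limit)
print(box_counting_dimension(grid))      # close to 2 (orthogonal limit)
```

With real data the result depends on the range of box sizes chosen, which is one reason the sketch should be read as an illustration of the slope calculation only.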

There are other methods that do not require the division of the separation space into bins. Such methods include measures derived from information theory [9], the minimum-convex-hull method and the kernel method [10]. The information-theory approach is based on determining the amount of mutual information shared by the two dimensions. Such mutual information includes the peaks which cluster along the right-leaning (upward) diagonal of the separation space. The proportion of mutual information compared to the total separation space ‘entropy’, or the total spread of peaks, is expressed by the term synentropy. In information theory an orthogonal separation would have a synentropy value of 0% [9]. The drawback of this technique for measuring orthogonality is its reliance on the assumption that peaks only cluster along the upward diagonal. This is certainly the most common form of clustering, but not the only possible one. The spreading angle method of Liu, Patterson and Lee [11] also shares this limitation. In this case, the measure of orthogonality is the amount of separation space used. To calculate this, two vectors corresponding to the retention times of each dimension are determined. These vectors are used to create a correlation matrix which is then used to determine the correlation or peak spreading angle. Once this angle is known, a fan-like shape is constructed with its apex located at the origin of the separation space. The spreading angle determines the width of the apex. The area enclosed within the fan describes the area of the separation space occupied by peaks [11]. It is clear that a fan with an apex located at the origin assumes that peaks tend to only cluster around the upward diagonal of the separation space. This limitation of the spreading angle method has been pointed out previously [12,13]. Unlike in information theory and the peak spreading angle, peaks are not assumed to only cluster along the diagonal in the convex-hull and kernel methods. The latter methods are often used in ecological home-range studies and have recently been applied to multi-dimensional
chromatography [10]. Although these methods are reported to be quite effective, the minimum-convex-hull method includes parts of the separation space which do not include peaks, thus biasing the measure of orthogonality. This is not so much the case in the kernel method. In the kernel method, an area around each peak is blurred to form a kernel. The summed area covered by the kernels is the indicator of the degree of orthogonality in this method. The analyst sets a threshold which is a multiple of the height of an individual kernel. However, it should be noted that the selection of an appropriate threshold is difficult [10] and appears to be done empirically. The amount by which the summed kernel area exceeds the set threshold is the measure of separation space coverage which is directly related to orthogonality [10]. It follows that selection of the appropriate threshold value can significantly alter the results of the kernel method. In addition, both the convex-hull method and the kernel method require some familiarity with computer programming tools such as Matlab. Such programs are not straightforward in their use, in addition to being expensive for use in (industrial) practice.

Very recently another measure of orthogonality, which does not necessarily require the use of specialist software, has been proposed. This method is based on the calculation of the nearest-neighbor distance for each peak [14,15], i.e. the distance from a given peak to its closest neighbor. The arithmetic mean and the harmonic mean of these distances can then be calculated. The magnitude of the arithmetic mean is directly proportional to the amount of spreading of the peaks within the multidimensional chromatogram. It follows that for an orthogonal separation the magnitude of the arithmetic mean should be high compared to non-orthogonal separations [14,15]. The harmonic mean is a measure of the degree of clustering that occurs within the multidimensional separation space. In cases where the degree of clustering is high, the nearest-neighbor distances are low and consequently the magnitude of the harmonic mean will be low. In short, orthogonal separations should have a high arithmetic mean and a high harmonic mean [14,15]. One problem of the nearest-neighbor approach is that it is highly dependent on the number of peaks within the separation space. As the number of peaks within the separation space decreases, the nearest-neighbor distance increases. As a result, the reported degree of orthogonality is artificially increased.

In this paper we report a set of “Asterisk” equations which assess chromatographic orthogonality based solely on the experimentally measured peak retention times. The analyst is not required to make any decisions with regard to setting parameter values, nor is any specialist software required, as all calculations are easily accomplished in Microsoft Excel. No hulls or boxes are drawn around the peaks, which prevents the inclusion of non-peak-containing regions of the separation space, thus minimizing error. Furthermore, the separation space is not divided into bins, thus preventing bias in the measurement of orthogonality due to variations in the number of bins used.

We compare our equations to other measures of orthogonality for a series of ‘test’ scenarios with various numbers of components. We will investigate the repeatability of our approach and the effect of the number of sample components on the reported value of orthogonality. To conclude the paper, we will apply the asterisk equations to experimental data from two GC×GC separations.

2. MATERIALS AND METHODS

2.1 Theory

In this section we will describe the basis for the asterisk equations. The implementation and interpretation of the asterisk equations will be described in more detail in the methods and discussion sections, respectively. For the derivation of the asterisk equations readers are referred to the supplementary information.

In our approach, the separation space is crossed by four lines as shown in figure 1. These lines have been given the arbitrary names of the Z-, Z+, Z1 and Z2 lines. It is important to note here that these lines cross the separation space; they do not divide it. All components are considered with respect to each of the Z lines regardless of their position within the separation space. In the case of an orthogonal separation, the spread of sample components around these lines will be maximized. It is this spread that we are interested in determining. The spread of sample components around the Z- and Z+ lines is affected by both the 1st and 2nd dimension separation mechanisms. The Z1 line considers the spread of components in the 1st dimension, whilst the spread around the Z2 line is related to the 2nd dimension only.

The spread of peaks around these four lines is computed using equations 1 to 4. In these equations, the expressions in the brackets are computed for each peak in the separation. These bracketed expressions calculate the distance of a given peak from a given Z line. It follows that for each bracketed expression there is a series of values or distances, one for each peak within the chromatogram. These distances are orthogonal to the respective Z lines for which they were calculated. The standard deviation of these distances is then calculated, which in effect determines the degree of spreading of peaks around each of the four Z lines. This is why the values computed by equations 1-4 are named SZx, where S refers to “spreading” around line Zx. A comprehensive explanation as to why we restrict the number of lines to four is included in the supplementary information.

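\[ S_{Z-} = \sigma\left\{ {}^{1}t_{R,\mathrm{norm}(i)} - {}^{2}t_{R,\mathrm{norm}(i)} \right\} \qquad (1) \]

\[ S_{Z+} = \sigma\left\{ {}^{2}t_{R,\mathrm{norm}(i)} - \left( 1 - {}^{1}t_{R,\mathrm{norm}(i)} \right) \right\} \qquad (2) \]

\[ S_{Z1} = \sigma\left\{ {}^{1}t_{R,\mathrm{norm}(i)} - 0.5 \right\} \qquad (3) \]

\[ S_{Z2} = \sigma\left\{ {}^{2}t_{R,\mathrm{norm}(i)} - 0.5 \right\} \qquad (4) \]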

Here $\sigma$ is the standard deviation of the values of the bracketed expressions for all the peaks within a chromatogram, while ${}^{1}t_{R,\mathrm{norm}(i)}$ and ${}^{2}t_{R,\mathrm{norm}(i)}$ are the normalized 1st and 2nd dimension retention times of component i, given by equation 5:

\[ t_{R,\mathrm{norm}(i)} = \frac{t_{R(i)} - t_{R,\mathrm{first}}}{t_{R,\mathrm{last}} - t_{R,\mathrm{first}}} \qquad (5) \]

where $t_{R(i)}$ is the retention time of component i, and $t_{R,\mathrm{first}}$ and $t_{R,\mathrm{last}}$ are the retention times of the first and last eluting components, respectively.

The computed $S_{Z-}$, $S_{Z+}$, $S_{Z1}$ and $S_{Z2}$ values are then entered into equations 6 to 9 to produce what we call the Z parameters. Since the Z parameters range from 0 to 1, they can readily be expressed as a percentage, which is the form we will use throughout this manuscript.

\[ Z_{-} = \left| 1 - 2.5 \left| S_{Z-} - 0.4 \right| \right| \qquad (6) \]

\[ Z_{+} = \left| 1 - 2.5 \left| S_{Z+} - 0.4 \right| \right| \qquad (7) \]

\[ Z_{1} = 1 - \left| 2.5 \times S_{Z1} \times \sqrt{2} - 1 \right| \qquad (8) \]

\[ Z_{2} = 1 - \left| 2.5 \times S_{Z2} \times \sqrt{2} - 1 \right| \qquad (9) \]
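
As a plausibility check on the constants in equations 6 to 9 (our own illustration under stated assumptions, not the derivation given in the supplementary information), consider an idealised separation in which the normalized retention times in the two dimensions are independent and uniformly distributed on [0, 1], and take $\sigma$ to be the population standard deviation. Reading equations 1, 3, 6 and 8 as $S_{Z-} = \sigma\{{}^{1}t - {}^{2}t\}$, $S_{Z1} = \sigma\{{}^{1}t - 0.5\}$, $Z_{-} = |1 - 2.5|S_{Z-} - 0.4||$ and $Z_{1} = 1 - |2.5 \times S_{Z1} \times \sqrt{2} - 1|$, one obtains

\[ S_{Z-} = \sqrt{\tfrac{1}{12} + \tfrac{1}{12}} = \tfrac{1}{\sqrt{6}} \approx 0.408, \qquad S_{Z1} = \tfrac{1}{\sqrt{12}} \approx 0.289, \]

\[ Z_{-} = \left| 1 - 2.5 \left| 0.408 - 0.4 \right| \right| \approx 0.98, \qquad Z_{1} = 1 - \left| 2.5 \times 0.289 \times \sqrt{2} - 1 \right| \approx 0.98 . \]

By symmetry the same values hold for $S_{Z+}$, $Z_{+}$, $S_{Z2}$ and $Z_{2}$, so an evenly filled separation space gives Z parameters close to 100%. At the other extreme, peaks lying exactly on the Z- line give $S_{Z-} = 0$ and hence $Z_{-} = |1 - 2.5 \times 0.4| = 0$.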

The Z parameters describe the use of the separation space with respect to the corresponding Z line. For instance, the Z- parameter describes the spread or use of the separation space with respect to line Z- in figure 1. In terms of orthogonality, the spread of peaks around each of these lines is equally important. Therefore, we bundle the Z parameters into the main asterisk equation (equation 10), which gives the value that measures the degree of orthogonality. This value is designated the A_O value and is expressed as a percentage. The more orthogonal two separation mechanisms are with respect to each other, the higher the values of the Z parameters and, in turn, the higher the A_O value. A completely orthogonal separation will have an A_O value of 100%.

\[ A_{O} = \sqrt{ Z_{-} \times Z_{+} \times Z_{1} \times Z_{2} } \qquad (10) \]

2.2 Application of the asterisk equations and other selected methods for measuring orthogonality

To test and compare the performance of the asterisk equations with other measures of orthogonality, a number of two-dimensional-chromatography scenarios were generated in Microsoft Excel. The majority of these scenarios are not commonly experienced within the field. However, they were included to provide a rigorous test of the different measures of chromatographic orthogonality. The scenarios were generated containing various numbers of peaks, i.e. 500, 250, 50, 25 and 10 peaks.

2.2.1 Bin-counting methods (Gilar and Stoll-Gilar methods)

The bin-counting method was introduced by Gilar et al. [5] and later modified by Davis, Stoll and Carr [6]. The application and interpretation of the two methods are quite similar. The details of these methods have been reported extensively in the literature [5–7]. Therefore, within this text we only present a brief description of their application.

Firstly, the retention times of the peaks within the two-dimensional plot are normalized in a manner analogous to equation 5. The normalized retention times are plotted with the first-dimension retention times on the x-axis and the second-dimension retention times on the y-axis. The separation space is then divided into bins where the number of bins is (ideally) equal to the number of components in the sample. In the Gilar method, the number of bins which contain peaks is counted and the measure of orthogonality, O, is calculated using equation 11:

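\[ O = \frac{\sum \mathrm{bins} - \sqrt{P_{\max}}}{0.63\, P_{\max}} \qquad (11) \]

where $\sum \mathrm{bins}$ is the number of bins containing peaks and $P_{\max}$ is the total number of bins. O = 1 for an orthogonal separation, based on the observation that systems close to orthogonal have a ratio of occupied bins to total bins of 0.63 [5].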
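
Equation 11 can be sketched in Python as follows. This is a minimal illustration rather than the Excel implementation used in this work: the function name `gilar_orthogonality`, the square grid of `bins_per_axis` by `bins_per_axis` bins and the clamping of points on the upper edge into the last bin are assumptions made for the example, and the inputs are assumed to have been normalized already with equation 5.

```python
import math


def gilar_orthogonality(x, y, bins_per_axis):
    """Equation 11 for normalized retention times x, y in [0, 1]."""
    occupied = set()
    for a, b in zip(x, y):
        # Clamp so that a normalized time of exactly 1.0 falls in the last bin.
        col = min(int(a * bins_per_axis), bins_per_axis - 1)
        row = min(int(b * bins_per_axis), bins_per_axis - 1)
        occupied.add((col, row))
    p_max = bins_per_axis ** 2  # total number of bins
    return (len(occupied) - math.sqrt(p_max)) / (0.63 * p_max)


# Hypothetical example: ten peaks lying exactly on the diagonal of a 10 x 10 grid.
diagonal = [i / 9 for i in range(10)]
print(gilar_orthogonality(diagonal, diagonal, 10))  # 0.0: ten occupied bins, sqrt(100) = 10
```

The subtraction of the square root of the bin count is what sends a fully correlated separation, which still occupies one diagonal line of bins, to O = 0.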

The Stoll-modified Gilar approach also commences with the normalization of retention times using equation 5. Then, a box is drawn to surround the region of the separation space that contains peaks. All of the bins within the box are counted and divided by the total number of bins. It follows that a value of 1 is obtained for an orthogonal separation where ideally all bins contain exactly one peak.

Both bin-counting methods were implemented in Microsoft Excel. The number of bins within the separation space was taken to be identical to the number of components within the chromatogram, which is the ideal situation for the bin-counting methods.

2.2.2 Pearson’s correlation coefficient and the linear-regression correlation coefficient

Orthogonal separation mechanisms by definition should not be correlated, thus the peaks should be spread throughout the separation space. As such, it appears logical to propose statistical correlation coefficients as a measure of orthogonality. Two correlation coefficients have been used for this purpose: Pearson’s correlation coefficient and the linear-regression correlation coefficient (R²) [16,17]. In this work these correlation coefficients were calculated using built-in functions in Microsoft Excel.
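
For readers working outside Excel, the two coefficients can be sketched in a few lines of Python. The function name `pearson_r` and the four-corner data set are assumptions chosen for illustration; the sketch relies on the standard result that, for a straight-line fit with one predictor, R² equals the square of Pearson's coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: peaks stacked in the four corners of the normalized space.
corners_x = [0.0, 0.0, 1.0, 1.0]
corners_y = [0.0, 1.0, 0.0, 1.0]

r = pearson_r(corners_x, corners_y)
r_squared = r ** 2  # R^2 of a simple linear regression of y on x
print(r, r_squared)  # 0.0 0.0
```

The example returns zero correlation even though the peaks are heavily clustered, which illustrates why an absence of correlation alone does not demonstrate good use of the separation space.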

2.2.3 The nearest-neighbor distance approach

In our use of the nearest-neighbor approach, we calculate the arithmetic mean (A) and the harmonic mean (H) using the Matlab code made accessible by the authors of references [14,15]. This code was run using Matlab R2012b (Mathworks, Natick MA, USA) and the input consisted of the raw 1st dimension and 2nd dimension retention times for the sample components. In this method, a high A and high H imply a highly orthogonal separation.
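
The two means can be illustrated with the Python sketch below. It is not the Matlab code of references [14,15] and does not reproduce any scaling of the retention times that code may apply; the function names and the two small peak sets are assumptions made for the example.

```python
import math
import statistics


def nearest_neighbor_distances(points):
    """Distance from each peak to its closest neighbor (needs at least two peaks)."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def nn_means(points):
    """Return (arithmetic mean A, harmonic mean H) of the nearest-neighbor distances."""
    d = nearest_neighbor_distances(points)
    return statistics.mean(d), statistics.harmonic_mean(d)


# Hypothetical peak sets in a normalized separation space.
spread = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
clustered = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]

print(nn_means(spread))     # A and H coincide when all distances are equal
print(nn_means(clustered))  # H falls well below A when two peaks sit close together
```

Because the harmonic mean is dominated by the smallest distances, it responds to clustering much more strongly than the arithmetic mean does.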

2.2.4 The asterisk equations

All calculations involving the asterisk equations were performed in Microsoft Excel. Firstly, the 1st and 2nd dimension retention times of the components were normalized according to equation 5. The normalized retention times were then fed into equations 1-4 to produce the $S_{Z-}$, $S_{Z+}$, $S_{Z1}$ and $S_{Z2}$ values. These values were used in conjunction with equations 6 to 9 to give the Z parameters, which were used to calculate the A_O value with equation 10.

3. RESULTS AND DISCUSSION

We examined the validity of the asterisk equations using a series of computer-generated scenarios. These scenarios were designed to push the asterisk equations to their limit. As such, most of these scenarios are unlikely to be encountered in experimental multi-dimensional chromatography. Figure 2 illustrates the values of the asterisk equation parameters for each of the scenarios containing 500 components.
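
For readers who prefer a script to a spreadsheet, the calculation chain (normalization with equation 5, spreads from equations 1-4, Z parameters from equations 6-9 and A_O from equation 10) can be sketched in Python. Several things here are assumptions rather than details taken from this work: the use of the population standard deviation, the function names, the retention-time windows and the two seeded synthetic scenarios, which are stand-ins and not the scenarios of figure 2.

```python
import math
import random
import statistics


def normalize(times):
    """Equation 5: scale retention times to [0, 1] using the first and last peaks."""
    first, last = min(times), max(times)
    if last == first:
        raise ValueError("all retention times are identical")
    return [(t - first) / (last - first) for t in times]


def asterisk(t1, t2):
    """Return (Z-, Z+, Z1, Z2, A_O) as fractions from raw retention times."""
    x, y = normalize(t1), normalize(t2)
    sd = statistics.pstdev  # assumption: population standard deviation
    s_minus = sd([a - b for a, b in zip(x, y)])        # equation 1
    s_plus = sd([b - (1 - a) for a, b in zip(x, y)])   # equation 2
    s_1 = sd([a - 0.5 for a in x])                     # equation 3
    s_2 = sd([b - 0.5 for b in y])                     # equation 4
    z_minus = abs(1 - 2.5 * abs(s_minus - 0.4))        # equation 6
    z_plus = abs(1 - 2.5 * abs(s_plus - 0.4))          # equation 7
    z_1 = 1 - abs(2.5 * s_1 * math.sqrt(2) - 1)        # equation 8
    z_2 = 1 - abs(2.5 * s_2 * math.sqrt(2) - 1)        # equation 9
    a_o = math.sqrt(z_minus * z_plus * z_1 * z_2)      # equation 10
    return z_minus, z_plus, z_1, z_2, a_o


def uniform_scenario(n, seed=0):
    """Hypothetical scenario: peaks scattered at random over both dimensions."""
    rng = random.Random(seed)
    t1 = [rng.uniform(0.0, 60.0) for _ in range(n)]  # arbitrary 1st-dimension window
    t2 = [rng.uniform(0.0, 5.0) for _ in range(n)]   # arbitrary 2nd-dimension window
    return t1, t2


def diagonal_scenario(n, seed=0):
    """Hypothetical scenario: second dimension fully correlated with the first."""
    t1, _ = uniform_scenario(n, seed)
    return t1, list(t1)


for name, (t1, t2) in [("uniform", uniform_scenario(500)),
                       ("diagonal", diagonal_scenario(500))]:
    labels = ("Z-", "Z+", "Z1", "Z2", "A_O")
    print(name, ", ".join(f"{k} = {v:.0%}" for k, v in zip(labels, asterisk(t1, t2))))
```

The population standard deviation is used so that the spreads of normalized data cannot exceed 0.5 in equations 3 and 4, which keeps Z1 and Z2 positive even for very small peak sets; a spreadsheet using the sample standard deviation would give slightly different values for small numbers of peaks.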

The A_O value clearly ranks the scenarios in terms of the degree of orthogonality. Scenario A, which is completely orthogonal, was ranked appropriately with an A_O value of 98%. Scenario E was the 2nd most orthogonal with an A_O value of 70%; this scenario was followed by scenarios F (64%) and C (63%), scenario B (32%), scenarios G and D (26% and 22%) and finally scenario H (9%). Scenario H is a particularly unusual scenario, generated because we thought it would push the asterisk equations to the absolute limit. In this scenario 500 components are perfectly overlaid in the four corners of the separation space. This would cause a high spread for the Z1 and Z2 lines but also a good spread for the Z- and Z+ lines, as the top right corner and bottom left corner components are far from the Z+ line whilst the converse is true for the Z- line. Despite the difficulty of scenario H, the asterisk equations still proved valid, which illustrates the robustness of these equations.

us

Although the A O value is the measure of orthogonality for the asterisk equations, the Z parameters can be used to semi-quantitatively describe the use of the separation space, thus

an

diagnosing any areas of clustering. This ability of the Z parameters can facilitate the comparison of multi-dimensional chromatographic methods between systems and samples, in terms of the

M

use of the separation space. This is shown by the values of the Z parameters for the scenarios in figure 2. There is no clustering of components in scenario A (the orthogonal scenario) and so all

d

the Z parameters are high (Z- = 97%, Z+ = 98%, Z1 = 97%, Z2 = 99%). Scenarios which have a

Ac ce pt e

high degree of clustering around the diagonals consequently have high Z1 and Z2 values but considerably lower Z- and Z+ values. Such scenarios include scenarios B, D and G. Scenario E simulates a separation where components are well distributed in the 1st dimension but are poorly retained in the 2nd dimension. Similarly in scenario E, components are well distributed in both dimensions but all components are poorly retained in either the 1st or the 2nd dimension. Clearly, the separation mechanisms in scenario E are very different and so one could argue that these are orthogonal selectivities. In fact, the selectivities in scenario E produce a mathematically orthogonal separation but not a chromatographically orthogonal one, as not all of the separation space is used. As such, the A O value for scenario E was reasonable and the values of Z1 and Z2 are high, which reflects the good distribution of components in each dimension. However the Z-

Page 15 of 31

and Z+ values are moderate as the components are not spread evenly around these lines, indicating that clustering in the separation space is occurring with respect to these diagonals. Scenarios C and F simulate separations where the entire separation space is used, but the peaks are spread quite heterogeneously. In these scenarios the values of all the Z parameters are reasonably high. However, for scenario C, the Z2, Z- and Z+ values are slightly lower than the Z1 parameter, indicating that some clustering is detected by the Z2, Z- and Z+ parameters. We can see in figure 2 that scenario C shows some clustering with respect to the Z2 line, as the components are more concentrated below this line, because of low retention in the 2nd dimension. In scenario C the clustering of components around the Z- and Z+ lines occurs at the corners of the separation space, which suppresses their values. Like scenario C, scenario F also scores a moderately high A_O value due to its good use of the separation space. However, the clustering of components in the top left corner of the space reduced the value of the Z- parameter. From these observations, some general guidelines can be formulated about the use of the Z parameters to diagnose clustering within the separation space. If clustering is occurring in the top left and/or bottom right corners of the separation space, the Z- parameter will be reduced with respect to the other parameters. Similarly, the value of the Z+ parameter will be reduced if clustering is occurring in the top right and/or bottom left corners of the space. In general, the values of the Z1 and Z2 parameters will be high except in cases where a large proportion of components elute near the void or close to the end of a programmed run (e.g. towards the end of an LC gradient). This is the case in scenario C, where most components eluted near the void in the 2nd dimension, causing a lower Z2 value.

The influence of the number of components present in the separation on the values of the asterisk equation parameters was examined using scenarios A-G in Figure 3. We computed the asterisk equation parameters for samples containing 500, 250, 50, 25 and 10 components.

The A_O value remains constant (within 5%) for the majority of scenarios when 50 or more components are present in the sample. Below 50 components, the A_O values for all scenarios drop considerably. However, it is important to note that the A_O value is able to correctly rank the scenarios in terms of orthogonality irrespective of the number of components within the sample. The order of ranking for scenarios C, E and F did change. However, these scenarios are similar in the fraction of the separation space that is used and they are consistently ranked below the completely orthogonal scenario A and above the non-orthogonal scenarios, which is a correct assessment. A similar effect of the number of components on the value of the Z parameters is also seen. During these computations we also calculated the standard deviation in the asterisk equation parameters for 7 replicates of scenarios A-G for 500, 250, 50, 25 and 10 components. For samples containing 25 or more components, the standard deviation was low (below 0.10) for the A_O and Z parameters, irrespective of the number of components in the sample (figure 3).

3.1 Comparison of the asterisk equations to other measures of orthogonality

We compared the asterisk equations to some common methods for measuring orthogonality: the bin-counting methods of Gilar and Stoll, correlation coefficients (Pearson’s and linear-regression coefficients) and the recently introduced nearest-neighbor approach. We chose the bin-counting methods because they are simple and effective, and the correlation coefficients because they are simple and, at first glance, an obvious starting point for measuring orthogonality. The nearest-neighbor method was chosen because, like the asterisk equations, it not only describes the degree of orthogonality but also aims to point out any clustering in the separation space. We did not consider the convex-hull method as it may be considered a special case (“curved version”) of the Stoll bin-counting method, which is essentially a rectangular hull [6, 7]. Information theory and the peak spreading angle were not used in this work as the assumption that peaks can only cluster along the upward diagonal is not desirable. Table 1 compares these measures of orthogonality with the A_O value for scenarios A-H (see Figure 2), where each scenario contains 500 components. Of all of these methods, the arithmetic mean from the nearest-neighbor approach is the least specific measure of orthogonality. All of the scenarios, except scenarios A and H, have the same arithmetic mean. The arithmetic mean does not suffice to rank the scenarios in terms of orthogonality. The nearest-neighbor approach fares slightly better if we also consider the value of the harmonic mean. Yet the approach is still not able to clearly rank the scenarios. The bin-counting methods of Gilar and Stoll are much more specific than the nearest-neighbor method and the correlation coefficients. In fact, the linear-regression correlation method proved to be the most inaccurate means of measuring orthogonality, resulting in three incorrect assessments. The A_O values and the bin-counting methods were able to clearly rank the scenarios in terms of orthogonality.

However, the orthogonality values for the bin-counting methods vary more strongly than the A_O values with the number of components in the sample. The number of components does not affect the other measures of orthogonality quite as strongly (figure 4). The value of orthogonality for the bin-counting methods is also greatly dependent on the number of bins. This has been documented elsewhere [7].

Unlike the bin-counting methods and the correlation coefficients, the nearest-neighbor approach can be used to diagnose the presence of clustering in the separation space. It does this through the value of the harmonic mean. We saw earlier that the Z parameters of the asterisk equations can also indicate the presence of clustering. In Table 2 we compare the Z parameters of the asterisk equations with the harmonic mean of the nearest-neighbor approach for diagnosing clustering. Both methods highlight a large degree of clustering in scenarios B, D and G. The harmonic mean has the advantage that it is only one parameter to evaluate, compared to four parameters in the asterisk equations. However, the four Z parameters have the benefit of providing more information about the location of the clustering within the separation space. This was discussed in detail at the beginning of the discussion section of this paper.

3.2 Application of the asterisk equation to experimental data

Previous discussion of the performance of the asterisk equation has been restricted to computer-generated scenarios created to rigorously test the performance of the AO and Z parameters. Here we apply the asterisk equation to two experimental GC×GC separations and compare the assessments to those of the Gilar bin-counting method, which was the other best-performing method in Table 1.

Figure 5a and b show the apex plots and asterisk parameters for two GC×GC separations of flame accelerants, which yield similar chromatograms (unpublished data from our group). Experimental details are not listed, as such information is irrelevant for the present discussion. The separation in Figure 5b is more orthogonal than the separation in Figure 5a, and this is shown by the corresponding AO values (57% for Figure 5a and 63% for Figure 5b).

Both separations have a high degree of clustering along the Z- line, which reduces the value of the Z- parameter. The results of the Gilar bin-counting method (0.41 and 0.55 for Figure 5a and 5b, respectively) were in agreement with the AO values. However, implementation of the bin-counting method was tedious and could not be automated. When using the bin-counting method, the Excel spreadsheet had to undergo significant adjustments to ensure that the number of bins matched the number of peaks in the sample. This is not a problem when analyzing samples which always have the same number of components. However, it becomes tedious and time-consuming if different analyses require changes in the number of bins. The application of the asterisk equations is straightforward, fast and independent of the number of components. However, one limitation of both the asterisk equations and the bin-counting method is that they are not suitable for peaks which are broad and do not have a clearly defined apex, as in polymer separations. Addressing this limitation will be the focus of future work on the asterisk equations.

4. CONCLUSION

We have introduced a new approach to measuring orthogonality in multi-dimensional chromatography. This approach uses a series of equations known under the heading of the asterisk equations.

These equations consist of one main equation which gives the measure of orthogonality, known as the AO value. Within this equation, four parameters known as the Z parameters are computed by a sub-set of equations. The Z parameters help to diagnose, semi-quantitatively, areas of the separation space where sample components are clustered, reducing the orthogonality. The asterisk equations were compared to the bin-counting methods, correlation coefficients and the nearest-neighbor approach for a series of computer-generated scenarios. These scenarios were created to rigorously test these methods and the asterisk equations. The latter proved to be the most specific, and the method was not greatly affected by the number of components in the sample, unlike the bin-counting methods. We also applied the asterisk equations to experimental GC×GC separations, which contained 283 components and varied in the degree of orthogonality. The assessment of the degree of orthogonality by the asterisk equations matched the visual assessment of these data and was in agreement with the Gilar bin-counting method, which was the benchmark for our method. Furthermore, the asterisk equations can easily and quickly be implemented in Microsoft Excel. Finally, the asterisk equations provide information on where in the two-dimensional separation space clustering occurs. This is especially useful in situations in which a visual interpretation of the data is not easily possible, for example during computer-aided optimization of two-dimensional separations.

5. ACKNOWLEDGEMENTS

The authors would like to acknowledge the following people, institutions and companies for their support of this work:

Anjoe Sampat, for allowing us to test the asterisk equations on GC×GC data from her PhD research.

Dr. Witold Nowik for generously supplying us with the Matlab code for the Nearest Neighbor Distance approach.

Dr. Gabriel Vivó-Truyols, Martin Lopatka, Andrea Gargano and Prof. Sjoerd van der Wal for their insightful discussions.

Members of the COAST HYPERformance LC×LC project for their financial support: Akzo Nobel, Avantor, DSM, Free University of Brussel, Hogeschool van Arnhem en Nijmegen, NWO, RIKILT, Shell, Syngenta, ThermoFisher Scientific, TNO, University of Amsterdam and the University of Groningen.

6. REFERENCES

[1] G. Guiochon, N. Marchetti, K. Mriziq, R.A. Shalliker, Implementations of two-dimensional liquid chromatography, J. Chromatogr. A, 1189 (2008) 109–68.

[2] D.R. Stoll, X. Li, X. Wang, P.W. Carr, S.E.G. Porter, S.C. Rutan, Fast, comprehensive two-dimensional liquid chromatography, J. Chromatogr. A, 1168 (2007) 3–43.

[3] J.C. Giddings, Sample dimensionality: a predictor of order-disorder in component peak distribution in multidimensional separation, J. Chromatogr. A, 703 (1995) 3–15.

[4] G. Xue, A.D. Bendick, R. Chen, S.S. Sekulic, Automated peak tracking for comprehensive impurity profiling in orthogonal liquid chromatographic separation using mass spectrometric detection, J. Chromatogr. A, 1050 (2004) 159–171.

[5] M. Gilar, P. Olivova, A.E. Daly, J.C. Gebler, Orthogonality of separation in two-dimensional liquid chromatography, Anal. Chem., 77 (2005) 6426–34.

[6] J.M. Davis, D.R. Stoll, P.W. Carr, Dependence of effective peak capacity in comprehensive two-dimensional separations on the distribution of peak capacity between the two dimensions, Anal. Chem., 80 (2008) 8122–34.

[7] M. Gilar, J. Fridrich, M.R. Schure, A. Jaworski, Comparison of orthogonality estimation methods for the two-dimensional separations of peptides, Anal. Chem., 84 (2012) 8722–32.

[8] M.R. Schure, The dimensionality of chromatographic separations, J. Chromatogr. A, 1218 (2011) 293–302.

[9] P.J. Slonecker, X. Li, T.H. Ridgway, J.G. Dorsey, Informational orthogonality of two-dimensional chromatographic separations, Anal. Chem., 68 (1996) 682–689.

[10] S.C. Rutan, J.M. Davis, P.W. Carr, Fractional coverage metrics based on ecological home range for calculation of the effective peak capacity in comprehensive two-dimensional separations, J. Chromatogr. A, 1255 (2012) 267–76.

[11] Z. Liu, D.G. Patterson Jr., M.L. Lee, Geometric approach to factor analysis for the estimation of orthogonality and practical peak capacity in comprehensive two-dimensional separations, Anal. Chem., 67 (1995) 3840–3845.

[12] M. Gilar, P. Olivova, A.E. Daly, J.C. Gebler, Orthogonality of separation in two-dimensional liquid chromatography, Anal. Chem., 77 (2005) 6426–6434.

[13] M. Gilar, J. Fridrich, M.R. Schure, A. Jaworski, Comparison of orthogonality estimation methods for the two-dimensional separations of peptides, Anal. Chem., 84 (2012) 8722–8732.

[14] W. Nowik, S. Héron, M. Bonose, M. Nowik, A. Tchapla, Assessment of two-dimensional separative systems using nearest-neighbor distance approach. Part 1: orthogonality aspects, Anal. Chem., 85 (2013) 9449–9458.

[15] W. Nowik, M. Bonose, S. Héron, M. Nowik, A. Tchapla, Assessment of two-dimensional separative systems using the nearest neighbor distances approach. Part II: separation quality aspects, Anal. Chem., 85 (2013) 9459–68.

[16] E. Van Gyseghem, I. Crosiers, S. Gourvénec, D.L. Massart, Y. Vander Heyden, Determining orthogonal and similar chromatographic systems from the injection of mixtures in liquid chromatography-diode array detection and the interpretation of correlation coefficients color maps, J. Chromatogr. A, 1026 (2004) 117–128.

[17] E. Van Gyseghem, M. Jimidar, R. Sneyers, D. Redlich, E. Verhoeven, D.L. Massart, Y. Vander Heyden, Orthogonality and similarity within silica-based reversed-phased chromatographic systems, J. Chromatogr. A, 1074 (2005) 117–131.

Figure captions

Figure 1: Graphical representation of the principles underlying the asterisk equations. SZx terms refer to the standard deviation of the distances of peaks from the Zx line.

Figure 2: Values for the asterisk equation parameters obtained for the computer-generated scenarios A-H. Note that in scenario H each of the four corners contains one point, which is actually 125 components perfectly overlaid on that corner point.

Figure 3: The effect of varying the number of components on the values of the asterisk equation parameters for scenarios A-G. Panels on the left show the effect of the number of components on the value of the asterisk equation parameters (%). Panels on the right show the standard deviation in the asterisk equation parameters (plotted as a percentage).

Figure 4: Orthogonality values for scenarios A-G using different methods for assessing orthogonality. Orthogonality values were computed for each scenario containing 500, 250 and 50 components.

Figure 5a and 5b: GC×GC separations of flame accelerants containing 283 components. Experimental data is unpublished data from a member of our research group.

Table 1: Orthogonality values for scenarios A-H using different methods for measuring orthogonality. NND = nearest neighbor distance approach.

            Orthogonality measure
Scenario    AO (%)    NND      Gilar    Stoll    Pearson's    R²
a           98        0.030    0.94     1.00     0.01         0.00
b           33        0.014    0.31     0.19     0.97         0.95
c           63        0.022    0.52     0.37     0.10         0.01
d           22        0.015    0.35     0.26     0.92         0.85
e           70        0.022    0.67     0.55     0.92         0.85
f           64        0.022    0.56     0.57     -0.17        0.03
g           26        0.015    0.36     0.26     -0.92        0.84
h           9         1.00     -0.06    0.12     -0.103       0.01

Table 2: Comparison of the asterisk equation parameters and the nearest neighbor distance (NND) approach parameters for scenarios A-G.

            asterisk equation                                   NND
Scenario    AO (%)    Z- (%)    Z+ (%)    Z1 (%)    Z2 (%)      arithmetic mean    harmonic mean
a           98        97        98        97        99          0.030              0.021
b           33        19        66        97        88          0.014              0.009
c           63        83        83        97        63          0.022              0.012
d           22        35        26        76        73          0.015              0.010
e           70        78        70        95        96          0.022              0.014
f           64        72        90        79        82          0.022              0.013
g           26        32        35        78        79          0.015              0.010

