Analytica Chimica Acta 807 (2014) 143–152


Analytica Chimica Acta journal homepage: www.elsevier.com/locate/aca

How to select equivalent and complimentary reversed phase liquid chromatography columns from column characterization databases

Endler M. Borges ∗

Universidade Estadual de Campinas, Instituto de Química, Rua Monteiro Lobato, Cidade Universitária “Zeferino Vaz”, Distrito de Barão Geraldo, Campinas, SP, Caixa Postal 6154, CEP 13083-970, Brazil

Highlights

• The SRM 870, Tanaka and PQRI tests were compared.

• Special attention was given to the different ways of interpreting the data.

• Correlations between the Tanaka and PQRI tests were reported.

• An interactive tool is provided, with the data supplied in eight Microsoft Excel tables.

• Disadvantages of PCA use are highlighted.

Article info

Article history:
Received 31 July 2013
Received in revised form 18 October 2013
Accepted 5 November 2013
Available online 15 November 2013

Keywords:
Column characterization
Similar and dissimilar columns
Reversed phase liquid chromatography (RPLC)
Principal Component Analysis (PCA)

Abstract

Three RP-LC column characterization protocols [Tanaka et al. (1989), Snyder et al. (PQRI, 2002), and NIST SRM 870 (2000)] were assessed, using both Euclidian distance and Principal Component Analysis, for their effectiveness at identifying equivalent columns. These databases utilize specific chromatographic properties such as hydrophobicity, hydrogen bonding, shape/steric selectivity, and ion exchange capacity of stationary phases. The chromatographic parameters of each test were shown to be uncorrelated. Despite this, the three protocols were equally successful in identifying similar and/or dissimilar stationary phases. The veracity of the results is supported by several real-life pharmaceutical separations. The use of Principal Component Analysis to identify similar/dissimilar phases appears to have some limitations in terms of loss of information. In contrast, the use of Euclidian distances is a much more convenient and reliable approach. The use of auto scaled data is favoured over the use of weighting factors, as the former data transformation is less affected by the addition or removal of columns from the database. The use of these free databases and their corresponding software tools is shown to be valid for identifying similar columns with equivalent chromatographic selectivity and retention as a “backup column”. In addition, dissimilar columns with complementary chromatographic selectivity can be identified for method development screening strategies.

© 2013 Elsevier B.V. All rights reserved.

∗ Tel.: +55 19 983055999/19 98839054. E-mail address: [email protected]
http://dx.doi.org/10.1016/j.aca.2013.11.010

1. Introduction

Mobile phase composition (i.e. pH, type and proportion of the organic modifier), the operating HPLC parameters (i.e. temperature, gradient time and slope) and the stationary phase used can dramatically influence the chromatographic performance and selectivity of the separation [1–6]. While the influence of the former parameters can be predicted and modelled successfully, the influence of using differing stationary phases is less well understood. Even stationary phases of the same nominal chemistry from different manufacturers can give rise to differing chromatographic selectivities [7–9]. By 2005, there were approximately 220 C18 columns (i.e. USP L1 classification) commercially available and, every year, a plethora of new L1 columns is launched.


E.M. Borges / Analytica Chimica Acta 807 (2014) 143–152

Several attempts have been made to produce a definitive set of chemical probes and test conditions to best characterize the huge number of stationary phases available. As of 2013, it is believed that there are over 1000 reversed-phase stationary phases commercially available [1–3]. These chromatographic characterization approaches have been reviewed in several manuscripts [10–27], including that of Lesellier and West [17], who discussed the main stationary phase characterization procedures developed over the last 20 years. To date, a unified column characterization approach has not been agreed upon by chromatographers or stationary phase manufacturers. An early attempt to do so was made by Tanaka and co-workers [28]. Since then, the USP Working Group on HPLC Columns, the Impurities Working Group of the PQRI Drug Substance Technical Committee in collaboration with Lloyd Snyder [29–38], the NIST Standard Reference Material (SRM) 870 [39–41] and the group of Euerby and Petersson [42–48] have expanded this type of work. These groups have all attempted to create a testing protocol and rationale that can assess the most important chromatographic properties of a stationary phase. Most of these approaches have used chemometric and statistical methods to visualize the similarities/dissimilarities of the stationary phases in the databases, based either on Principal Component Analysis (PCA) or on computing a numerical “similarity” factor as a measure of a stationary phase’s equivalent or complementary chromatographic selectivity. The column characterization databases developed by these three groups (which are now freely available [49,50]) can be used to rapidly identify similar stationary phases as a “backup column” or dissimilar stationary phases for method development screening strategies. This paper evaluates how stationary phase characterization databases can assist in the selection of similar/dissimilar stationary phases.
The supplementary material includes all the Microsoft Excel spreadsheets, which the reader can use to perform their own evaluations.

2. Experimental

All calculations were performed using Microsoft Excel 2007. Correlation coefficients, standard deviations (s) and averages were obtained using the Microsoft Excel functions CORREL, STDEV and AVERAGE, respectively. The percentage relative standard deviations (%RSD) were calculated by dividing s by the average and multiplying by 100. PCA was performed using STATISTICA 7.1 (StatSoft Inc., Tulsa, OK, USA). The data were auto scaled before performing the PCA. The SRM 870 and PQRI databases were accessed via the USP web site [49] and the Tanaka database via the Analytical Chemistry Development web site [50].

2.1. Description of the column characterization parameters

2.1.1. Tanaka column characterization approach

In 1989, Tanaka [28] reported a simple, rapid and very useful approach for evaluating the chromatographic properties of various stationary phases intended for reversed phase (RP) use. The protocol determines six chromatographically relevant variables, which are briefly described below [42–48,50].

Hydrophobicity, kPB: the retention factor for pentylbenzene, which reflects the surface area and surface coverage (ligand density). Chromatographic conditions: methanol–water (80:20, v/v).

Hydrophobic selectivity, αCH2: the retention factor ratio between pentylbenzene and butylbenzene, αCH2 = kPB/kBB. This is a measure of the surface coverage of the phase, as expressed by the selectivity between alkylbenzenes differing by one methylene group, which is dependent on the ligand density. Chromatographic conditions: as for hydrophobicity.

Shape selectivity, αT/O: the retention factor ratio between triphenylene and o-terphenyl, αT/O = kT/kO. This descriptor is a measure of the shape selectivity, which is influenced by the spacing of the ligands and probably also by the shape/functionality of the silylating reagent. Chromatographic conditions: as for hydrophobicity.

Hydrogen bonding capacity, αC/P: the retention factor ratio between caffeine and phenol, αC/P = kC/kP. This is an empirical descriptor relating to the concentration and accessibility of silanol groups in the stationary phase. Chromatographic conditions: methanol–water (30:70, v/v).

Total ion-exchange capacity, αB/P pH 7.6: the retention factor ratio between benzylamine and phenol, αB/P pH 7.6 = kB/kP. This is an estimate of the total silanol activity. Chromatographic conditions: methanol–phosphate buffer (pH 7.6; 20 mM) (30:70, v/v).

Acidic ion-exchange capacity, αB/P pH 2.7: the retention factor ratio between benzylamine and phenol, αB/P pH 2.7 = kB/kP. This is a measure of the acidic activity of the silanol groups. Chromatographic conditions: methanol–phosphate buffer (pH 2.7; 20 mM) (30:70, v/v).

Euerby et al. [42–48,50] evaluated 322 columns using the conditions mentioned above; this database is freely available on the web [50] and in Table S1. These columns are, for the most part, chemically bonded type B silica (150 mm × 4.6 mm, dp 5 µm), but there are some exotic columns such as polymer coated zirconium oxide, polymer coated alumina and chemically bonded type A silica. Chromatographic tests were carried out at 40 °C and 1 mL min−1, with detection at 220 nm. Flow rates were adjusted using Eq. (1), where F is the flow rate and d the column internal diameter of the new or original method.

$$F_{\mathrm{new}} = F_{\mathrm{original}} \, \frac{d_{\mathrm{new}}^{2}}{d_{\mathrm{original}}^{2}} \qquad (1)$$
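The flow-rate adjustment of Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function name and the 4.6 mm to 2.1 mm example are assumptions chosen for the sketch and do not come from the databases.

```python
def scale_flow_rate(f_original: float, d_original: float, d_new: float) -> float:
    """Eq. (1): scale the flow rate by the ratio of the squared column internal diameters."""
    if d_original <= 0 or d_new <= 0:
        raise ValueError("column internal diameters must be positive")
    return f_original * d_new ** 2 / d_original ** 2


# Hypothetical example: a 1.0 mL/min method on a 4.6 mm i.d. column
# transferred to a 2.1 mm i.d. column (diameters in the same unit).
f_new = scale_flow_rate(1.0, 4.6, 2.1)
print(f"Adjusted flow rate: {f_new:.3f} mL/min")
```

Because only the ratio of the diameters enters the equation, any length unit can be used as long as both diameters share it.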

Supplementary material related to this article can be found, in the online version, at http://dx.doi.org/10.1016/j.aca.2013.11.010.

2.1.2. SRM 870 column characterization approach

The NIST SRM 870 test, developed in 2000 [39–41,49], consists of a mixture of five test solutes (uracil, toluene, ethylbenzene, quinizarin and amitriptyline), which are analyzed using a methanol–phosphate buffer (pH 7; 5 mM) (80:20, v/v) mobile phase at 23 °C to evaluate the following parameters:

Hydrophobicity: the retention factor for ethylbenzene, kE, which reflects the surface area and surface coverage (ligand density).

Silanol activity: the retention factor and tailing factor for amitriptyline, kami and Tfami. Tfami reflects how suitable a stationary phase is for the analysis of basic solutes, and kami reflects the participation of ion-exchange interactions in the retention of basic solutes.

Activity towards chelators: the tailing factor for quinizarin, TfQ, which reflects the metal content of the stationary phase.

This database is freely available on the web [49] and in Table S2. These columns are, for the most part, chemically bonded type B silica (150–250 mm × 4.6 mm, dp 5–10 µm), but there are some chemically bonded type A silica columns. Chromatographic tests were carried out at 23 °C and 2 mL min−1, with detection at 254 nm, 210 nm or 480 nm. In the event of coelution of quinizarin and amitriptyline, data for each component can be obtained by selective detection at 210 nm and 480 nm. At 210 nm, the area of quinizarin is approximately 2% of the area of amitriptyline, making the interference with amitriptyline small [40].


2.1.3. PQRI column characterization approach

The theory underpinning the PQRI test (alternatively known as the hydrophobic subtraction method) has been described in a series of publications [29–37]. Six column parameters (see below) are determined by multiple linear regression analysis of the retention of 16 test probes against their known physicochemical properties. Chromatographic conditions of acetonitrile–phosphate buffer (20 mM) (50:50, v/v) at 35 °C are employed to determine: relative retention (kEB), hydrophobicity (H), steric interaction (S*), hydrogen-bond acidity (A) and basicity (B), and relative silanol ionization or cation-exchange capacity at pH 2.8 (C 2.8) and pH 7 (C 7.0). It is recommended that conditions be controlled within the following limits: temperature, 35 °C ± 0.5 °C; acetonitrile concentration, 50% ± 0.05%; pH 2.8 ± 0.1; pH 7.0 ± 0.1. When available, 150 mm × 4.6 mm columns packed with 5 µm particles are preferred, with a flow rate of 2.0 mL min−1. For columns with different particle sizes and/or dimensions, the flow rate is adjusted using Eq. (1). Columns must be equilibrated with the pH 2.8 mobile phase for at least 10 h before testing. Complete equilibration with the pH 7.0 mobile phase is achieved within 1 h [33]. More than 500 different RPC columns have so far been tested by this procedure, in each case starting with a virgin column. This database is freely available on the web [49] and in Table S3. These columns are, for the most part, chemically bonded type B silica (150 mm × 4.6 mm, dp 5 µm), but there are some exotic columns such as polymer coated zirconium oxide, polymer coated alumina and chemically bonded type A silica columns.

3. Results and discussion

3.1. Statistical analysis

3.1.1. Tanaka stationary phase characterization approach

With the exception of the expected correlation between the Tanaka chromatographic parameters kPB and αCH2 (R² > 0.7), no correlation was observed between the other parameters for the 322 stationary phases in the ACD database (downloaded October 2012); see Table S1. Thus, the kPB, αT/O, αC/P, αB/P pH 7.6 and αB/P pH 2.7 chromatographic parameters appear to be independent of each other, representing different chromatographic properties associated with the stationary phases. Chromatographic parameters with high percentage relative standard deviations (%RSD) (e.g. αB/P pH 2.7, 421% RSD) should facilitate greater differentiation between stationary phases than parameters possessing low %RSDs (e.g. αCH2, 10% RSD). Therefore, based on their %RSD, the relative importance of the six chromatographic parameters in discriminating between stationary phases is as follows: αB/P pH 2.7 > αB/P pH 7.6 > αC/P ≫ kPB > αT/O > αCH2. It is suggested that, since the αCH2 and kPB parameters encode the same information, the former be discarded [50].

3.1.2. Statistical analysis of the SRM 870 column characterization database

Table S2 highlights that the four chromatographic parameters of the 112 stationary phases (as of October 2012) on the USP website [52] are uncorrelated, suggesting that each parameter represents a different property associated with the stationary phase. The %RSD suggests that the relative order of importance in discriminating between stationary phases is TfQ ≥ kami ≥ Tfami > kE. The ability of the SRM 870 parameters to discriminate between stationary phases, as suggested by their %RSD range (41–69%), appears to be lower than that of the five Tanaka parameters (%RSD range = 38–421%).
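The %RSD and correlation screening described above can be reproduced outside a spreadsheet. The following is a minimal sketch, assuming NumPy is available; it uses only three parameters of the five phases listed in Table 1 as a toy data set, so the resulting numbers are purely illustrative and are not the database-wide values quoted in the text. The function and variable names are assumptions made for this sketch.

```python
import numpy as np


def percent_rsd(values):
    """%RSD = 100 * s / mean, with s the sample standard deviation (n - 1), as Excel STDEV."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()


# Toy data: kPB, aCH2 and aB/P pH 2.7 for the five phases of Table 1
# (n = 96, 151, 152, 219, 238); one row per stationary phase.
names = ["kPB", "aCH2", "aB/P pH 2.7"]
data = np.array([
    [3.32, 1.48, 0.10],
    [3.20, 1.47, 0.10],
    [3.69, 1.46, 0.13],
    [2.98, 1.45, 0.10],
    [3.73, 1.47, 0.11],
])

# %RSD per parameter: a larger spread suggests greater discriminating power.
rsd = {name: percent_rsd(data[:, j]) for j, name in enumerate(names)}
ranking = sorted(rsd, key=rsd.get, reverse=True)

# Pairwise correlation coefficients (equivalent to Excel CORREL) and R^2.
r = np.corrcoef(data, rowvar=False)
r_squared = r ** 2

for name in ranking:
    print(f"{name}: {rsd[name]:.1f} %RSD")
print(np.round(r_squared, 2))
```

Since these five phases were selected for their similarity, their %RSD values are necessarily small; applied to a full database table, the same code would give the ranking used in this section.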


The lack of correlation between the retention factor and the asymmetry factor of amitriptyline reinforces the observation that stationary phases with high ion-exchange properties do not necessarily provide poor peak shape for amitriptyline [11], while the lack of correlation between kami and kE shows the importance of the ion-exchange mechanism to retention in RP mode. In addition, the lack of correlation between the tailing factors of amitriptyline and quinizarin suggests that the peak tailing of bases is not directly related to the metal content of the stationary phases. For example, certain stationary phases (i.e. Zorbax Extend C18, high purity silica and double end-capped) displayed a good tailing factor for quinizarin (1.1) but a substantially larger tailing factor for amitriptyline (3.4), while Kromasil C18 and Partisil ODS-1 gave high tailing factors for quinizarin (3.75 and 3.61, respectively) and low tailing factors for amitriptyline (1.45 and 1.59, respectively) [39–41,49]. It has been previously shown that the surface metal activity value for a phase is highly dependent on the previous history of the stationary phase. For example, shipping and storage of columns in pure organic solvents such as acetonitrile and methanol [55] and the previous use of the stationary phase will affect the metal activity of that stationary phase [56,57]. Hence, the repeatability and reproducibility of the metal content of differing stationary phases is typically poor [58,59]. The metal activity parameter is only valid for the stationary phase under evaluation at that moment in time, and it may be potentially misleading to report such values.

3.1.3. Statistical analysis of the PQRI column characterization database

The USP database contains 550 stationary phases (as of October 2012) evaluated using the PQRI protocol (see Table S3). As with the Tanaka and SRM 870 protocols, the PQRI parameters were observed to be uncorrelated with each other.
Based on their %RSD, the relative importance of the six parameters in discriminating between stationary phases was observed to be: C (2.8) > B > S > A > C (7.0) ≫ H. The size of the %RSD range (i.e. 22–439%) for the six PQRI parameters was similar to that of the five Tanaka parameters (i.e. 38–421%), illustrating a similar discriminating power for these two approaches.

3.2. Assessment of similar/dissimilar stationary phases

The ability to select similar or dissimilar chromatographic stationary phases from the hundreds of phases commercially available is vitally important to the practicing chromatographer. Those working in method development laboratories require columns of complementary chromatographic selectivity to be identified in order to maximize the possibility of separating all components in a mixture, whereas those working in QC laboratories often require a “backup column” of equivalent chromatographic properties (especially in terms of selectivity). To date, the chemometric approach of PCA or the use of Euclidian distances between the multiple chromatographic parameters has been used to determine the similarity/dissimilarity of LC columns.

3.2.1. Principal component analysis

The chemometric approach of PCA is a versatile tool for the interpretation of large datasets. For example, in the Tanaka database, the six parameters (i.e. variables) are reduced by a projection of the 322 stationary phases (i.e. objects) onto a smaller number of new variables termed principal components (PCs). These are orientated so that the first PC describes as much of the original variation between the objects (i.e. stationary phases) as possible. The second PC is orientated orthogonally to the first and describes as much of the remaining variation as possible; this can be extended to a third PC, etc. The projection of objects onto a PC is called a score, and by plotting the scores for two PCs one can visually see differences and similarities between the objects (i.e. stationary phases): the distance between two stationary phases in the score plot depicts how similar the chromatographic properties of the phases are. How much of each original variable is included in a PC is described by the loadings. By plotting the loadings for the two PCs it is possible to assess the relative importance of each variable (i.e. chromatographic parameter): the further the variable is from the origin, the more important it is. One can also visually assess correlations between parameters (i.e. positively correlated variables will be located close together, whereas inversely correlated variables will lie at 180° to one another).

Fig. 1. PCA using the six parameters in the Tanaka test and the 322 LC columns described in Table S1: (a) score plot; (b) loading plot. Axes: PC1 (Factor 1, 34.50%) and PC2 (Factor 2, 23.17%).

In this section, we discuss some important limitations of using PCA in data interpretation [51,52]. Numerous groups [16,17,20,21,42–48,54] have reported the use of PCA to identify similar and dissimilar stationary phases. However, in most of these applications, it was reported that information was lost on reducing the projections of the six variables onto two PCs. The data set comprising the six Tanaka chromatographic parameters for the 322 widely differing types of stationary phases (i.e. C8, C18, diol, Aqua, embedded polar phases, phenyl, phenylhexyl, pentafluorophenyl, etc.) shown in Table S1 was subjected to PCA. Fig. 1 highlights the loss of information when the six variables are reduced to two PCs, in that only 58% of the variability of the data was described by the two PCs. The PCA loading plot (Fig. 1b) suggests that the relative importance of the variables is different from that predicted by the %RSD values. For example, PCA indicated the relative importance

of the parameters in discriminating between stationary phases was as follows: αB/P pH 2.7 ≅ αB/P pH 7.6 ≅ kPB ≅ αCH2 ≫ αC/P ≅ αT/O, while the %RSD values indicate an order of αB/P pH 2.7 > αB/P pH 7.6 > αC/P ≫ kPB > αT/O > αCH2. Fig. 1b also suggests that the following pairs of variables are closely related: αB/P 2.7 and αB/P 7.6, αT/O and αC/P, and αCH2 and kPB. However, these suggested correlations differ from those observed in Table S1 (with the exception of the previously mentioned relationship between αCH2 and kPB). This disparity between the PCA and the statistical correlations can be attributed to the loss of information when the six parameters are reduced to two PCs. The two PCs from the PCA of the Tanaka data set (see Table S1 and Fig. 1) resulted in a 42% loss of information.

According to the loadings shown in Fig. 1b, the stationary phases located in the top left quadrant of Fig. 1a should possess high αB/P 2.7 and αB/P 7.6 values, while stationary phases located in the top right quadrant should possess high hydrophobic retentivity (i.e. high kPB values). Indeed, the highly hydrophobic Grom-Sil ODS-7 pH (n = 128, kPB = 13.7) and Ultracarb ODS30 (n = 286, kPB = 13.3) stationary phases are located in the top right quadrant, while the phases possessing high αB/P 2.7 and αB/P 7.6 values (i.e. the zirconium phases ZirChrom MS and ZirChrom PDB [n = 306 and 307] and the mixed alkyl/cation exchange phases Primesep 100 and Hypersil Duet [n = 223 and 139]) are located in the top left quadrant.

Another complication with PCA is that the position of the stationary phases in the score plot and of the chromatographic parameters in the loading plot may change on the addition or removal of stationary phases from the database, since doing so may dramatically change the %RSD of the variables. Fig. 1a highlights the presence of a number of outliers, which possess extremely high silanophilic (i.e. αC/P, αB/P 2.7 and αB/P 7.6) values. Inclusion of these compresses the majority of the other stationary phases into a very tight grouping around the origin, which is detrimental to the visualization of similar and dissimilar stationary phases. Table S4 (from which 17 outlier columns were removed) indicates that the relative importance remains the same as in Table S1 (see Section 3.1.1) and that the %RSD for the hydrophobic parameters (kPB, αCH2 and αT/O) remained unaffected; however, the relative importance of the silanophilic parameters (αC/P, αB/P pH 7.6 and αB/P pH 2.7) was dramatically reduced (compare Tables S1 and S4). The PCA plots of the full database (see Fig. 1) and the repeated PCA of the database excluding the 17 outliers (see Fig. 2) highlight the differences that can be observed on the addition/removal of data. Fig. 3a and b are expansions of Fig. 1a (full database) and Fig. 2a (outliers removed), respectively, and clearly demonstrate the effect of inserting or deleting stationary phases from the dataset on the positioning of the stationary phases in the score plots. In the full database (Fig. 3a), the two phases nearest to the HyPURITY C18 phase (n = 151) are a second batch of the 3 µm HyPURITY C18 (n = 152) and the 2.7 µm Poroshell 120 SB-C18 (n = 222), whereas in the reduced database (Fig. 3b) the Pursuit C18 (n = 238) and Discovery C18 (n = 96) are the closest phases.

Removal of the polar outliers (see Fig. 2b) results in a higher importance being attributed to αC/P and αT/O than in Fig. 1b. The correlations observed between the variables in Fig. 2b are also different from those observed in Fig. 1b. Removal of the outliers spreads the phases out more in the score plot (see Fig. 2a), aiding the visualization of the data. The same observations can be made with the SRM 870 and PQRI databases.
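The sensitivity of PCA to the composition of the data set can be illustrated with a short sketch, assuming NumPy. The data below are synthetic random numbers with three artificial outliers, not the Tanaka database, and the PCA is computed by singular value decomposition of the auto scaled matrix rather than with STATISTICA; all names are assumptions made for the illustration.

```python
import numpy as np


def autoscale(x):
    """Centre each column and divide by its sample standard deviation."""
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)


def pca(x, n_components=2):
    """PCA of auto scaled data via SVD.

    Returns scores, loadings (unit-length eigenvectors; some packages
    rescale them) and the fraction of variance explained by every PC.
    """
    z = autoscale(x)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T
    return scores, loadings, explained


# Synthetic stand-in for a database: 50 "columns" x 6 "parameters".
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 6))
x[:3, 4:] += 15.0  # three artificial outliers, extreme in two parameters

scores_full, loadings_full, expl_full = pca(x)
scores_trim, loadings_trim, expl_trim = pca(x[3:])  # outliers removed

print(f"Variance retained by 2 PCs, full set:    {expl_full[:2].sum():.0%}")
print(f"Variance retained by 2 PCs, trimmed set: {expl_trim[:2].sum():.0%}")
```

Comparing `scores_full[3:]` with `scores_trim` shows how the coordinates of the unchanged objects shift when the outliers are removed, which is the behaviour discussed for Figs. 1 to 3. One minus the retained fraction corresponds to the "loss of information" quoted in the text.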
PCA is undoubtedly a very powerful tool for visualizing large data sets. However, because these databases are living documents, with more phases being added on a monthly basis, and because subsets of differing classes of phases may need to be analyzed, the positioning of phases relative to one another may change depending on the size of the database. In addition, the potential for PCA to lose information on reducing the multiple projections to two or more PCs, and the fact that dedicated software is required to visualize the data, may diminish its widespread use within the chromatographic fraternity.

Fig. 2. PCA using the six parameters in the Tanaka test and excluding outliers, which are shown in Fig. 1: (a) score plot; (b) loading plot. Axes: PC1 score (41%) and PC2 score (19%).

Fig. 3. Columns located close to the HyPURITY C18 material using PCA: (a) PCA performed using the database shown in Table S1 (axes: PC1 score 35%, PC2 score 23%); (b) PCA performed using the database shown in Table S4 (axes: PC1 score 41%, PC2 score 19%).

3.2.2. Euclidian distance

A simple alternative approach to the identification of similar/dissimilar stationary phases is calculation of the Euclidian distance in the multi-dimensional variable space (i.e. six dimensions for the Tanaka approach) between the stationary phase of interest and the others in the database. This can be simply achieved using a spreadsheet programme (e.g. Microsoft Excel) as follows:

(1) All variables are “auto scaled” by subtracting the mean value of the variable and dividing by its standard deviation. (In Table S1, the data are “auto scaled” in rows 348–669 and columns C–H.)

(2) In Table S1, the Euclidian distance, which is termed the distance factor (Fs), is calculated according to Eq. (2) between two stationary phases, one being chosen as a reference. For example, in Table S1 the values for the reference stationary phase are inserted in row 347 and the Fs values are calculated in the spreadsheet (column I, rows 348–669). The stationary phase that has the smallest distance (i.e. is most similar chromatographically) to the reference stationary phase receives a ranking of 1; rankings are given in column J (rows 348–669) of Table S1.

$$F_s = \sqrt{(k_{PB,1} - k_{PB,2})^2 + (\alpha_{CH_2,1} - \alpha_{CH_2,2})^2 + (\alpha_{T/O,1} - \alpha_{T/O,2})^2 + (\alpha_{C/P,1} - \alpha_{C/P,2})^2 + (\alpha_{B/P\,7.6,1} - \alpha_{B/P\,7.6,2})^2 + (\alpha_{B/P\,2.7,1} - \alpha_{B/P\,2.7,2})^2} \qquad (2)$$
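The two spreadsheet steps above (auto scaling, then Eq. (2) and ranking) can be sketched as follows, assuming NumPy. The five rows of Table 1 are used here as a toy database; because the auto scaling then relies on the mean and standard deviation of only five phases, the Fs values and the ranking obtained are illustrative and are not expected to reproduce those of the full 322-column database. The function name is an assumption of this sketch.

```python
import numpy as np


def rank_by_similarity(data, ref_index):
    """Auto scale the data, then compute Fs (Eq. (2)) from every row to the reference row.

    Returns the distance factors and 1-based rankings (the reference itself,
    with Fs = 0, is ranked 1).
    """
    data = np.asarray(data, dtype=float)
    # Step (1): auto scaling with the sample standard deviation (Excel STDEV).
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Step (2): Euclidian distance in the six-dimensional parameter space.
    fs = np.sqrt(((z - z[ref_index]) ** 2).sum(axis=1))
    order = np.argsort(fs, kind="stable")
    ranks = np.empty(len(fs), dtype=int)
    ranks[order] = np.arange(1, len(fs) + 1)
    return fs, ranks


# Toy database: the six Tanaka parameters of the phases in Table 1
# (kPB, aCH2, aT/O, aC/P, aB/P pH 7.6, aB/P pH 2.7).
phases = ["Discovery C18", "HyPURITY C18 5 um", "HyPURITY C18 3 um",
          "Polaris C18 Ether", "Pursuit C18"]
tanaka = np.array([
    [3.32, 1.48, 1.51, 0.39, 0.28, 0.10],
    [3.20, 1.47, 1.60, 0.37, 0.29, 0.10],
    [3.69, 1.46, 1.59, 0.39, 0.41, 0.13],
    [2.98, 1.45, 1.63, 0.46, 0.38, 0.10],
    [3.73, 1.47, 1.57, 0.38, 0.37, 0.11],
])

fs, ranks = rank_by_similarity(tanaka, ref_index=1)  # reference: HyPURITY C18 5 um
for i in np.argsort(ranks):
    print(f"{ranks[i]}  {phases[i]:<20s} Fs = {fs[i]:.2f}")
```

For a dissimilarity search, the same output is simply read from the bottom of the ranking instead of the top.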

To illustrate how to find equivalent stationary phases to the HyPURITY C18 one would simply insert the auto scaled data for the HyPURITY C18 stationary phase parameters (row 498, Table S1) into row 347 of Table S1. The distance factor (Fs) and rankings between the HyPURITY C18 column parameters and the 321 stationary phases in the database are calculated and shown in column I, rows 348–669 and column J and rows 348–668 respectively. Table 1, illustrates the four most similar (i.e. ranked 2–5) and the reference stationary phase (i.e. ranked 1). The most similar stationary phase to the HyPURITY C18 (n = 151) based on the Tanaka chromatographic parameters is shown to be the Discovery C18 (n = 96, ranked 2, Fs = 0.17). Figs. 4 and 5 validated this prediction in that both of these phases generated very similar chromatographic retention and selectivity for the isocratic separations of hydrophilic bases and the gradient separation of related impurities in a pharmaceutical drug substance. The approach was equally applicable to more polar Aqua phases in that the closest phase to

148

E.M. Borges / Analytica Chimica Acta 807 (2014) 143–152

Table 1 Stationary phases classed as most similar to the HyPURITY C18. The ranking was determined using the Euclidian distance (according to Error! Reference source not found.) to the Tanaka database (Table S1 or S2). The complete calculations of the Euclidian distance between the stationary phases can be seen in Tables S1 and S4. n

Description

kPB

˛CH2

˛T/O

˛C/P

˛B/P pH 7.6

˛B/P pH 2.7

Rank

96 151 152 219 238

Discovery C18 HyPURITY C18 5 ␮m HyPURITY C18 3 ␮m Polaris C18 Ether Pursuit C18

3.32 3.20 3.69 2.98 3.73

1.48 1.47 1.46 1.45 1.47

1.51 1.60 1.59 1.63 1.57

0.39 0.37 0.39 0.46 0.38

0.28 0.29 0.41 0.38 0.37

0.10 0.10 0.13 0.10 0.11

2 1 5 3 4

2

4

1

6

a

3 5 4

2

6 3

1

b 5

0

5

10

15

20

25

Time (min) Fig. 4. Isocratic HPLC separation of a mixture of hydrophilic bases using (a) HyPURITY C18 and (b) Discovery C18 (150 mm × 4.6 mm, dp 5 ␮m). Mobile phase: methanol: potassium phosphate buffer (pH 2.7; 20 mM) (3.3:96.7, v/v), flow rate 1.0 mL min−1 , temperature 60 ◦ C. Detection at 210 nm. Solutes identification: 1 = nicotine; 2 = benzylamine; 3 = terbutaline; 4 = procainamide; 5 = salbutamol; 6 = phenol.

the Atlantis dC18 (n = 74) was shown to be the Capcell Pak C18AQ (n = 83, ranked 5, Fs = 0.28). The chromatographic similarity of these two phases was verified in Fig. 6, which shows the gradient separation of degradation products from a drug substance. These results confirm the veracity of the Euclidian distance approach using the

Fig. 6. Gradient HPLC separation of degradants in a pharmaceutical drug substance using (a) Atlantis dC18 and (b) Capcell Pak C18AQ columns. Chromatographic conditions and column dimensions were the same for both columns. Mobile phase: 18–70% methanol over 40 min, then 70% methanol from 40 to 42 min. Temperature: 30 °C. Flow rate: 1.0 mL min⁻¹. Both columns were 150 mm × 4.6 mm. Detection at 210 nm.

Tanaka approach. This methodology of assessing column similarity/dissimilarity is used in the free “Column Selector” software for interrogating the Tanaka database, supplied by Analytical Chemistry Development [53]. In contrast to the PCA approach, adding stationary phases to the database, or removing them from it, does not affect the classification, as the Euclidian distance approach (according to Eq. (2)) is totally independent of the stationary phases present in the data set. Thus, the rankings obtained in Tables S1 and S4 are identical (for example, insert stationary phase parameters in rows 347 and 331 of Tables S1 and S4, respectively, and compare the rankings obtained). The PQRI approach [29–37] also utilizes Euclidian distances (i.e. the distance factor, Fs) to identify stationary phases with equivalent properties; however, instead of auto scaling the data, differing weighting factors are used (see Eq. (3)). The advantage of auto scaling the data is that it removes the necessity of using weighting

Fig. 5. Gradient HPLC separation of impurities in a pharmaceutical drug substance using (a) HyPURITY C18 and (b) Discovery C18 columns. Chromatographic conditions and column dimensions were the same for both columns. Mobile phase: 18–70% methanol over 40 min, then 70% methanol from 40 to 42 min. Temperature: 30 °C. Flow rate: 1.0 mL min⁻¹. Both columns were 150 mm × 4.6 mm. Detection at 210 nm.

Fig. 7. Isocratic HPLC separation of a mixture of hydrophilic bases using (a) Selectosil C18 and (b) Nucleosil C18 columns. Horizontal axis: time (min). Mobile phase: methanol–potassium phosphate buffer (pH 2.7; 20 mM) (3.3:96.7, v/v), flow rate 1.0 mL min⁻¹, temperature 60 °C, detection at 210 nm.


factors. A comparison of the PQRI approach using weighting factors and non-auto scaled data (Eq. (3)) with that using auto scaled data (Eq. (4)) to find an equivalent stationary phase to the Nucleosil C18 (n = 279) illustrated that both approaches identified the Selectosil C18 and Brava BDS C18 phases (n = 317 and 79, respectively) as the most similar to the Nucleosil C18 (see Table S3: insert auto scaled and non-auto scaled data for Nucleosil C18 in rows 568 and 569, respectively; the rankings obtained using Eqs. (3) and (4) are given in columns K and J, respectively). The results were verified by the similar chromatographic retention and selectivity observed for the isocratic separation of hydrophilic bases on the Nucleosil and Selectosil C18 phases (see Fig. 7).

\[
F_s = \sqrt{12.5(H_1 - H_2)^2 + 100(S_1 - S_2)^2 + 30(A_1 - A_2)^2 + 143(B_1 - B_2)^2 + 83\left(C(2.8)_1 - C(2.8)_2\right)^2 + 83\left(C(7.0)_1 - C(7.0)_2\right)^2} \tag{3}
\]

\[
F_s = \sqrt{(H_1 - H_2)^2 + (S_1 - S_2)^2 + (A_1 - A_2)^2 + (B_1 - B_2)^2 + \left(C(2.8)_1 - C(2.8)_2\right)^2 + \left(C(7.0)_1 - C(7.0)_2\right)^2} \tag{4}
\]

Despite this excellent agreement, there is only a weak correlation (R² = 0.4) between the rankings of the two approaches. This may, in part, be because the weighting factors were derived from a smaller data set of fewer than 300 stationary phases in the early PQRI development [31–38] and may warrant updating. In contrast, the use of auto scaled data (Eq. (4)) does not involve any weighting factors and so removes the necessity of updating them.

3.3. Comparison between various column characterization protocols

3.3.1. PQRI and Tanaka chromatographic parameters

As of October 2012, there were approximately 148 stationary phases (prepared with type B silica) common to both the PQRI and Tanaka databases; these data (plus three additional phases, ACE C18-AR, ACE C18-PFP and Kinetex C18; USP web site accessed May 2013) are given in Table S5. Correlations between the Tanaka and PQRI chromatographic parameters for the 151 stationary phases are shown in Table 2 (calculated in rows 155–166 and columns C–N of Table S5). Supplementary material related to this article can be found, in the online version, at http://dx.doi.org/10.1016/j.aca.2013.11.010.

The H chromatographic parameter is correlated with the kPB and αCH2 parameters of the Tanaka test, but its %RSD is lower than that of kPB, probably because kPB values are numerically larger than H values. For example, ACE C18 and ACE Phenyl have kPB values of 4.6 and 1.2 and H values of 1 and 0.65, respectively.

Steric interactions are thought to increase as the bonded phase becomes more crowded. Hence, an increase in S should be observed as the chain length or concentration of the bonded phase is increased. It is of interest to note that C8 packings, which possess higher surface coverage than C18 packings, exhibited lower αT/O values than C18 packings (i.e. ACE C8 and ACE C18 100 Å materials possessed αT/O values of 1.00 and 1.53, respectively), while shape and steric selectivity have previously been shown not to be [53] (see Table S5). The S value should also increase for narrow-pore stationary phases because of the compression of the ends of the alkyl chains. S has a significant effect on stationary phase selectivity, especially for molecules of different shape, and it is not related to the αT/O parameter. However, its %RSD is higher than that of αT/O. For example, the conventional ODS (i.e. XBridge C18) and polar embedded phases (i.e. XBridge Shield RP18) possess αT/O values of 1.38 and 2.27, respectively, while their S values are 0.028 and −0.026. Widely differing probes are used for these two tests, so it is not surprising that the correlation is weak. In addition, differing temperatures and organic modifiers are used for these tests, which may affect the steric discrimination of the stationary phases.

The hydrogen bonding parameters (A, B and αC/P) and the ion-exchange parameters (C (2.8), C (7.0), αB/P pH 2.7 and αB/P pH 7.6) are unrelated (see Table S5, rows 155–166 and columns C–N). However, stationary phases which have high αC/P, αB/P pH 2.7 and αB/P pH 7.6 values (e.g. ZirChrom-PDB and Resolve C18) also have high C (2.8) and C (7.0) values (see Tables S1 and S3).

3.3.2. SRM 870 with the PQRI and Tanaka chromatographic parameters

At the present moment, there are 79 stationary phases common to both the SRM 870 and PQRI databases (see Table S6). No correlation was observed between the six PQRI parameters and the four

SRM 870 parameters (see rows 80–90 and columns A–Z in Table S7). In comparison, there are only 34 stationary phases common to both the SRM 870 and Tanaka databases (see Table S6). As before, no correlation was observed between the six Tanaka parameters and the four SRM 870 parameters (see rows 37–46 and columns A–K in Table S7).

3.4. Do the databases identify comparable results with respect to phase similarity/dissimilarity?

3.4.1. Tanaka and PQRI approaches

Given that the two approaches generate chromatographic parameters which are essentially uncorrelated, the question arises: “will they still generate similar results with respect to identifying equivalent and complementary phases?”. In order to address this, a sub-dataset of 151 stationary phases, which are common to both databases, has been used to compare the results from both databases. The subsets of the Tanaka and PQRI characterization databases are given in rows 2–152 and columns C–N of Table S5; both sets of data were “auto scaled” (see rows 179–326) as explained in Section 3.2.2. Values for the selected stationary phase are inserted in row 171 and columns C–N. The distance factor (Fs) and rankings for the Tanaka approach using Eq. (2) are shown in columns Q and R, rows 179–326, and those for the PQRI approach using Eq. (4) are shown in columns O and P, rows 179–326. Both approaches successfully identified the Selectosil C18 and Discovery C18 phases as the most equivalent phases in the reduced database as alternatives for the Nucleosil C18 and HyPURITY C18 phases, respectively (see Table S5). Despite this excellent agreement, there were subtle differences in the ranking order between the two approaches for other phases. For example, Table 3 highlights the different ranking order for equivalent columns to the XBridge C18 column.
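The two PQRI distance factors referred to in the text, Eq. (3) with weighting factors on non-auto scaled data and Eq. (4) without weights on auto scaled data, differ only in the weights applied to each squared difference. A minimal Python sketch follows. It assumes that Fs is the square root of the (weighted) sum of squared differences, with each weight multiplying the squared difference as the terms of Eq. (3) are printed; the function names and the parameter vectors are hypothetical placeholders and are not entries from any database.

```python
import math

# Weighting factors as they appear in Eq. (3), in the order
# H, S, A, B and the two C terms (low pH, then pH 7.0).
EQ3_WEIGHTS = (12.5, 100.0, 30.0, 143.0, 83.0, 83.0)


def fs_weighted(p1, p2, weights=EQ3_WEIGHTS):
    """Eq. (3)-style distance on non-auto scaled parameters.

    Assumption: each weight multiplies the squared difference and the
    square root of the sum is taken.
    """
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p1, p2)))


def fs_autoscaled(z1, z2):
    """Eq. (4)-style distance: unweighted, intended for auto scaled values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z1, z2)))


# Hypothetical raw parameter vectors (H, S, A, B, C low pH, C pH 7.0).
reference = (1.00, 0.02, -0.10, 0.00, 0.10, 0.20)
candidate = (0.98, 0.01, -0.05, 0.01, 0.05, 0.25)

print(round(fs_weighted(reference, candidate), 3))
```

Because the weights are fixed numbers, the weighted form would need revisiting if the spread of the database changed, whereas the auto scaled form takes its scaling from the data set itself.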
In Table 3, ET and EQ are the Euclidian distances between the stationary phases in the Tanaka and PQRI tests, respectively, and RT and RQ are the corresponding rankings. Both approaches afforded similar results; however, subtle differences in the ranking are to be expected because the two approaches use uncorrelated chromatographic parameters. Fig. 8 illustrates that the practical consequences of the different rankings are very small in chromatographic selectivity and retention


Table 2
Correlations between the chromatographic parameters of the PQRI and Tanaka tests. These correlations are observed for the dataset of 151 stationary phases common to both tests (Table S5).

Parameter           S        A        B  C (2.8)  C (7.0)      kPB     αCH2     αT/O     αC/P  αB/P pH 7.6  αB/P pH 2.7
H                0.50     0.56    −0.35     0.21    −0.29     0.79     0.90    −0.14    −0.35        −0.38        −0.16
S                         0.08     0.19    −0.40    −0.51     0.46     0.46    −0.03    −0.56        −0.32        −0.37
A                                 −0.29     0.43     0.39     0.45     0.53    −0.08     0.16         0.19         0.05
B                                          −0.55     0.04    −0.28    −0.18     0.20    −0.05         0.11        −0.09
C (2.8)                                              0.56    −0.02     0.11    −0.10     0.17         0.08         0.55
C (7.0)                                                      −0.29    −0.25     0.06     0.44         0.56         0.47
kPB                                                                    0.76     0.00    −0.31        −0.26        −0.23
αCH2                                                                           −0.03    −0.27        −0.35        −0.25
αT/O                                                                                    −0.13         0.13        −0.14
αC/P                                                                                                  0.41         0.18
αB/P pH 7.6                                                                                                        0.30

Table 3
Stationary phases classed as most similar to the XBridge C18. The ranking was determined using Euclidian distances according to the Tanaka database (Table S5 and Eq. (2)) and the PQRI database (Table S5 and Eq. (4)).

                                             PQRI           Tanaka
Column number  Column description          EQ     RQ      ET     RT
124            XBridge C18                 0.00   1       0.00   1
67             HyPURITY C18                0.18   2       0.66   11
45             Discovery C18               0.19   3       0.57   8
4              ACE C18                     0.23   4       0.47   3
55             Genesis C18                 0.45   5       1.03   26
124            XBridge C18                 0.00   1       0.00   1
61             Hypersil BDS                0.50   7       0.45   2
4              ACE C18                     0.23   4       0.47   3
80             Novapak C18                 1.49   66      0.48   4
103            Pursuit C18                 0.52   8       0.52   5
149            ACE 3 C18-AR                1.62   71      1.23   40
150            ACE 3 C18-PFP               2.78   105     3.43   124
139            Zorbax Eclipse XDB-C18      0.69   19      0.96   22
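The agreement between the two sets of distances and rankings in Table 3 can be examined numerically. The sketch below computes Pearson correlation coefficients between EQ and ET, and between RQ and RT, for the eleven distinct columns listed in Table 3 (the XBridge C18 and ACE C18 rows, which appear twice, are entered once). It is only an illustration: the rows of Table 3 are a hand-picked subset of the 151 common phases, so the resulting coefficients say nothing about the correlation over the full database, and the helper `pearson_r` is a hypothetical name introduced here.

```python
import math

# (description, EQ, RQ, ET, RT) transcribed from Table 3, duplicates removed.
table3 = [
    ("XBridge C18", 0.00, 1, 0.00, 1),
    ("HyPURITY C18", 0.18, 2, 0.66, 11),
    ("Discovery C18", 0.19, 3, 0.57, 8),
    ("ACE C18", 0.23, 4, 0.47, 3),
    ("Genesis C18", 0.45, 5, 1.03, 26),
    ("Hypersil BDS", 0.50, 7, 0.45, 2),
    ("Novapak C18", 1.49, 66, 0.48, 4),
    ("Pursuit C18", 0.52, 8, 0.52, 5),
    ("ACE 3 C18-AR", 1.62, 71, 1.23, 40),
    ("ACE 3 C18-PFP", 2.78, 105, 3.43, 124),
    ("Zorbax Eclipse XDB-C18", 0.69, 19, 0.96, 22),
]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


eq = [row[1] for row in table3]
rq = [row[2] for row in table3]
et = [row[3] for row in table3]
rt = [row[4] for row in table3]

r_dist = pearson_r(eq, et)  # PQRI vs Tanaka distances
r_rank = pearson_r(rq, rt)  # PQRI vs Tanaka rankings
print(f"R^2 (EQ vs ET) = {r_dist ** 2:.2f}")
print(f"R^2 (RQ vs RT) = {r_rank ** 2:.2f}")
```

A rank-based coefficient such as Spearman's would arguably suit ranking data better; Pearson's is used here only because it matches the R² figure quoted elsewhere in the text.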

terms as both the EQ and ET values are
