International Journal of Pharmaceutics 475 (2014) 504–513


International Journal of Pharmaceutics journal homepage: www.elsevier.com/locate/ijpharm

Spectral fluctuation dividing for efficient wavenumber selection: Application to estimation of water and drug content in granules using near infrared spectroscopy

Takuya Miyano a,*, Manabu Kano b, Hideaki Tanabe a, Hiroshi Nakagawa a, Tomoyuki Watanabe a, Hidemi Minami a

a Formulation Technology Research Laboratories, Pharmaceutical Technology Division, Daiichi Sankyo Co., Ltd., 1-12-1, Shinomiya, Hiratsuka, Kanagawa 254-0014, Japan
b Department of Systems Science, Kyoto University, Kyoto, Japan

ARTICLE INFO

Article history: Received 30 May 2014; Received in revised form 22 August 2014; Accepted 6 September 2014; Available online 12 September 2014

ABSTRACT

In process analytical technology (PAT) based on near infrared (NIR) spectroscopy, wavenumber selection is crucial to develop an accurate and robust calibration model. The present research proposes new efficient spectral dividing and wavenumber selection methods to significantly reduce the computational load required by conventional wavenumber selection methods such as interval partial least squares (iPLS). The proposed method, named spectral fluctuation dividing (SFD), divides a whole spectrum into multiple spectral intervals at local minimum points of the spectral fluctuation profile, which consists of the standard deviation of absorbance at each wavenumber in a calibration set. SFD is combined with PLS (SFD–PLS) to select the spectral intervals at which input variables have significant influence on a target response. The usefulness of SFD–PLS was demonstrated through its application to the problems of estimating water and drug content in granules. PLS models based on SFD–PLS achieved higher estimation accuracy than those based on conventional methods including iPLS, PLS-beta, and variable influence on projection (VIP). In addition, SFD–PLS was more than 10 times faster than the conventional variable selection methods including PLS-beta and VIP; in particular, SFD–PLS was more than 25 times faster than iPLS. Consequently, the proposed SFD–PLS is a promising wavenumber selection method.

© 2014 Elsevier B.V. All rights reserved.

Keywords: Near infrared (NIR) spectroscopy; Wavenumber selection; Process analytical technology (PAT); Interval partial least squares (iPLS); Calibration
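The dividing step summarized in the abstract can be illustrated with a minimal sketch. It assumes NumPy and a calibration matrix with one spectrum per row. The function name `sfd_intervals`, the use of the sample standard deviation (`ddof=1`), the strict local-minimum rule, and the half-open index intervals are illustrative assumptions; the text above does not specify them.

```python
import numpy as np


def sfd_intervals(X):
    """Split wavenumber indices at local minima of the fluctuation profile.

    X: calibration spectra, shape (n_samples, n_wavenumbers).
    Returns a list of half-open index intervals (start, stop).
    """
    X = np.asarray(X, dtype=float)
    # Spectral fluctuation profile: standard deviation of absorbance at each
    # wavenumber over the calibration set (ddof=1 is an assumption).
    profile = X.std(axis=0, ddof=1)

    # Strict interior local minima: lower than both neighbours.
    # Flat minima are not detected by this simple rule (assumption).
    inner = np.arange(1, profile.size - 1)
    is_min = (profile[inner] < profile[inner - 1]) & (
        profile[inner] < profile[inner + 1]
    )
    cuts = inner[is_min]

    # Each minimum starts a new interval (assumed convention).
    bounds = np.concatenate(([0], cuts, [profile.size]))
    return [(int(a), int(b)) for a, b in zip(bounds[:-1], bounds[1:])]
```

Each returned pair can be used to slice the columns of the spectral matrix, for example `X[:, start:stop]`, before a PLS model is fitted on that interval.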

* Corresponding author. Tel.: +81 463 31 6954; fax: +81 463 31 6475. E-mail address: [email protected] (T. Miyano).
http://dx.doi.org/10.1016/j.ijpharm.2014.09.007
0378-5173/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Solid dosage forms, typified by tablets, are manufactured through multiple processes such as granulation and blending. Improving the understanding of manufacturing processes is one of the key goals of the Food and Drug Administration's (FDA's) process analytical technology (PAT) initiative (FDA, 2004). The pharmaceutical industry has already introduced PAT to enhance product quality and production efficiency (Kourti and Davis, 2012; Cogdill et al., 2007). To put PAT to practical use, real-time process monitoring and rapid measurement have been intensively explored. For example, real-time monitoring of water content in granules revealed that tablet properties depend on the water profile during the entire granulation process as well as on the residual water content in the granules (Hartung et al., 2011). Rapid measurement of drug content in granules in the blending process makes it easy to adjust the target tablet weight in the subsequent tableting process so as to optimize drug content in the tablets (Kim et al., 2011).

Near infrared (NIR) spectroscopy has been extensively employed as a PAT tool for monitoring various quality attributes because it is rapid and nondestructive (Reich, 2005; Roggo et al., 2007). Examples include water content in granules in a granulation process (Frake et al., 1997; Rantanen et al., 2000), blend uniformity in a blending process (Wu and Khan, 2009; Nakagawa et al., 2013), film thickness in a film coating process (Kirsch and Drennen, 1995, 1996), and drug content in granules and tablets after their sampling (Chalus et al., 2005; Porfire et al., 2012). Nakagawa et al. (2013) conducted a literature survey of blend uniformity analysis with NIR spectrometers.

Quality estimation using NIR spectroscopy generally needs a calibration model that describes the relationship between NIR spectra and the quality, e.g., water or drug content in granules. An NIR spectrum consists of absorbance at several hundred to several thousand wavenumber points in the range from about 12,500 cm⁻¹ to 4000 cm⁻¹. To deal with a large number of input variables and multicollinearity, partial least squares (PLS)


regression has been widely used for building calibration models. The use of a whole spectrum, however, is not always optimal and can even degrade the estimation accuracy because parts of the spectrum may carry no useful information for estimating the target quality. Thus, wavenumber selection is important for building an accurate PLS model (Nadler and Coifman, 2005). Wavenumbers can be selected manually on the basis of knowledge about the NIR spectroscopic properties of the analyte (Namkung et al., 2008). However, it is difficult to accurately identify the wavenumbers necessary to estimate the target response because of the complicated nature of NIR spectra, such as the overlap of wide absorption bands, the effects of neighboring functional groups, and the effects of physical attributes such as the particle size of granules.

To overcome this difficulty, statistical wavenumber selection methods have been proposed (Xiaobo et al., 2010; Kim et al., 2011). One approach is to use univariate measures such as the regression coefficients of a PLS model (PLS-beta) and variable influence on projection (VIP); the wavenumbers at which input variables have greater influence on the target response are selected one by one (Chong and Jun, 2005). Another approach is to use intervals of wavenumbers, as in interval PLS (iPLS); a whole spectrum is divided into several spectral intervals of equal width, and the spectral intervals at which input variables have significant influence on the target response are selected (Nørgaard et al., 2000). Spectral interval-based wavenumber selection is useful because absorbance at neighboring wavenumbers is often strongly correlated. Although iPLS was originally developed to enhance a graphical understanding of important spectral regions, it can be used as a wavenumber selection tool.

Combinations of iPLS with other methods such as moving window (Kasemsumran et al., 2006), changeable size moving window (Du et al., 2004), and genetic algorithm (Leardi and Nørgaard, 2004; Arakawa et al., 2011) have also been proposed and have improved the estimation accuracy. These iPLS-based methods, however, often require a heavy computational load to optimize tuning parameters through numerous repeated PLS modeling runs. iPLS is a kind of group-wise wavenumber selection method. Another group-wise wavenumber selection method, referred to as nearest correlation spectral clustering-based variable selection (NCSC-VS), has been proposed, and its superiority to other methods such as stepwise, PLS-beta, VIP, selectivity ratio, Lasso, and group Lasso has been demonstrated (Kano and Fujiwara, 2013). NCSC-VS can be applied not only to spectral data but also to general data. However, it requires a heavy computational load.

The present study proposes new efficient spectral dividing and wavenumber selection methods for NIR spectroscopy-based PLS modeling. The proposed spectral dividing method, named spectral fluctuation dividing (SFD), divides a whole spectrum into


multiple spectral intervals at local minimum points of the spectral fluctuation profile, which consists of the standard deviation of absorbance at each wavenumber in a calibration set. SFD can be combined with PLS (SFD–PLS) to select, as a spectral interval-based method, the spectral intervals at which input variables have significant influence on a target response. In addition, the usefulness of the proposed SFD–PLS is demonstrated in terms of both improving the estimation accuracy and reducing the computational load through its application to the problems of estimating water and drug content in granules.

2. Materials and methods

2.1. Materials

Granules containing a drug substance (Daiichi-Sankyo, Japan) were used as the analyte. The granules were manufactured at various scales: 4–100 kg in a granulation process and 0.4–500 kg in a blending process. In the granulation process, the drug substance and several excipients were granulated in a fluid bed granulator: NFLO-5 (Freund, Japan) for 4 kg scale, Aeromatic Fielder (GEA Pharma Systems, Belgium) for 10 kg scale, WSG-120 (Powrex, Japan) for 100 kg scale, or GPCG-120 (Glatt, Germany) for 100 kg scale. In the blending process, the granules were blended with a lubricant using a blender: S-3-S (Tsutsui Scientific Instruments, Japan) for 0.4 kg scale, TCV-10 (Tokuju, Japan) for 2 kg scale, PM-1000 (Bohle, Germany) for 100–300 kg scale, TB-1200 (Tanico, Japan) for 100–300 kg scale, or PM-2000 (Kotobuki, Japan) for 500 kg scale. Granules during the granulation process and those after the blending process were sampled.

2.2. Near infrared (NIR) measurement

In the granulation process, real-time NIR measurement was performed using a Fourier-transform NIR spectrometer MPA (Bruker GmbH, Germany) or an equivalent Matrix-F (Bruker GmbH, Germany) through a fiber-optic probe mounted in the fluid bed granulator. The NIR spectra were obtained every minute during the granulation process.

For blended granules, NIR measurement was performed using a Fourier-transform NIR spectrometer MPA. About 0.2 g of the sampled granules was put into a dedicated vial and measured to obtain the NIR spectrum. Measurement conditions are shown in Table 1. All the NIR spectra were recorded by OPUS 6.5 software (Bruker GmbH, Germany). Periodic calibration of the NIR spectrometer caused slight differences in the recorded wavenumber points, even though a fixed wavenumber range from 4000 cm⁻¹ to 12,500 cm⁻¹ was set. In

Table 1
Experimental conditions to prepare the calibration and validation sets used for constructing and validating PLS models.

                                      Water content estimation    Drug content estimation
NIR measurement with the diffuse reflectance method
  Wavenumber range (cm⁻¹)             12,500–4000                 12,500–4000
  Resolution (cm⁻¹)                   8                           8
  No. of wavenumber points            2201                        2202
  Integration time                    8 times                     64 times
  No. of NIR spectra
    Calibration set                   96 (13 batches)             64 (64 batches)
    Validation set                    58 (7 batches)              40 (40 batches)
Reference measurement
  Measurement method                  LOD                         HPLC
  Range of value (%)
    Calibration set                   1.1–17.0                    67.7–130.7
    Validation set                    1.7–15.6                    73.1–124.2


order to analyze each set of spectra with PLS, the OPUS 6.5 software automatically adjusts the spectra to share the same wavenumber points by interpolation. Thus, the number of wavenumber points differed slightly between the data sets (2201 and 2202 points; Table 1).
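The alignment of spectra onto shared wavenumber points can be sketched as follows. This is only an illustration assuming NumPy and plain linear interpolation; the interpolation scheme actually used by OPUS is not stated in the text, and the function name `resample_spectrum` is hypothetical.

```python
import numpy as np


def resample_spectrum(wavenumbers, absorbance, grid):
    """Linearly interpolate one spectrum onto a common wavenumber grid.

    wavenumbers, absorbance: 1-D arrays of equal length for one spectrum.
    grid: 1-D array of target wavenumbers (cm^-1) shared by all spectra.
    """
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    # np.interp requires increasing x values, whereas NIR spectra are often
    # stored from high to low wavenumber, so sort first.
    order = np.argsort(wavenumbers)
    return np.interp(grid, wavenumbers[order], absorbance[order])
```

Grid points outside the recorded range are clamped to the edge values by `np.interp`, so the common grid should lie within the range covered by every spectrum.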

2.3. Reference measurement

In the granulation process, granules were sampled from the fluid bed granulator every ten minutes during granulation and at the end of both spraying and drying. The water content of the sampled granules was measured by the loss on drying (LOD) method using an HR73 (Mettler-Toledo, Japan) or an equivalent HR83 (Mettler-Toledo, Japan). The LOD values were associated with the corresponding NIR spectra. The drug content in the blended granules, on the other hand, was measured by high performance liquid chromatography (HPLC) using an Alliance Waters 2695 Separations Module (Waters Corporation, US). These reference methods have their own uncertainty. The precision of the LOD method is 0.2% (standard deviation) according to the technical data for the LOD equipment. The precision of the HPLC method is 0.4% (standard deviation) according to the results of a recovery test.

$$p_k = \frac{X_{k-1}^{T} t_k}{t_k^{T} t_k} \qquad (7)$$

$$b_k = \frac{y_{k-1}^{T} t_k}{t_k^{T} t_k} \qquad (8)$$

The above procedure is repeated until the number of adopted latent variables $K$ is reached. The optimal $K$ can be determined by leave-one-out cross validation (LOOCV) so that the standard error of cross validation (SECV) is minimized. SECV is a measure of the estimation error in the LOOCV, and the standard error is calculated as follows:

$$\text{Standard error} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2} \qquad (9)$$

where $y_i$ is the reference output value of the $i$th sample and $\hat{y}_i$ is the estimated output value of the $i$th sample. In this study, $K$ was determined in the range from 1 to 20. The maximum value of $K$ was set at 20 for all the spectral preprocessing methods on the basis of our prior knowledge to avoid the risk of over-fitting. When the maximum is raised, SECV can decrease further, but the risk of over-fitting increases.
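The recursion in Eqs. (3)–(5), (7), and (8) and the LOOCV-based choice of $K$ can be sketched with NumPy as follows. The weight vector is computed with the standard PLS1 definition, $w_k = X_{k-1}^{T} y_{k-1} / \lVert X_{k-1}^{T} y_{k-1} \rVert$, which is an assumption because that step is not reproduced in this text. The function names are hypothetical, and the sketch is not the authors' implementation.

```python
import numpy as np


def pls1_nipals(X, y, K):
    """Fit a K-component PLS1 model; return (beta, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # mean centering: X_0 and y_0
    W, P, b = [], [], []
    for _ in range(K):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)            # ASSUMPTION: standard PLS1 weight
        t = Xk @ w                           # Eq. (5): score vector
        tt = t @ t
        p = Xk.T @ t / tt                    # Eq. (7): loading vector
        bk = yk @ t / tt                     # Eq. (8): weighting value
        Xk = Xk - np.outer(t, p)             # Eq. (3): deflate X
        yk = yk - bk * t                     # Eq. (4): deflate y
        W.append(w)
        P.append(p)
        b.append(bk)
    W, P, b = np.column_stack(W), np.column_stack(P), np.array(b)
    # Regression vector acting on mean-centred spectra.
    beta = W @ np.linalg.solve(P.T @ W, b)
    return beta, x_mean, y_mean


def standard_error(y, y_hat):
    """Eq. (9): root of the mean squared estimation error."""
    return float(np.sqrt(np.mean((np.asarray(y_hat) - np.asarray(y)) ** 2)))


def secv(X, y, K):
    """Standard error of leave-one-out cross validation for K components."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i        # leave sample i out
        beta, xm, ym = pls1_nipals(X[keep], y[keep], K)
        preds[i] = (X[i] - xm) @ beta + ym
    return standard_error(y, preds)


def select_k(X, y, k_max=20):
    """Number of latent variables in 1..k_max that minimizes SECV."""
    return min(range(1, k_max + 1), key=lambda K: secv(X, y, K))
```

In this sketch `k_max` must not exceed the rank of the mean-centred training data. The LOOCV loop refits the model once per sample and per candidate $K$, which is the kind of repeated PLS modeling that makes exhaustive interval searches costly.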

2.4. Partial least squares (PLS)

In this section, PLS is briefly introduced. In PLS, after mean centering of the input and output variables, the input variable matrix (preprocessed NIR spectra) $X \in \mathbb{R}^{N \times M}$ and the output variable vector (water or drug content in granules) $y \in \mathbb{R}^{N}$ are decomposed as follows:

$$X = T P^{T} + E \qquad (1)$$

$$y = T b + f \qquad (2)$$

where $T \in \mathbb{R}^{N \times K}$ is the latent variables' score matrix whose columns are the score vectors $t_k \in \mathbb{R}^{N}$ ($k = 1, 2, \ldots, K$), $P \in \mathbb{R}^{M \times K}$ is the loading matrix of $X$ whose columns are the loading vectors $p_k \in \mathbb{R}^{M}$ ($k = 1, 2, \ldots, K$), $b \in \mathbb{R}^{K}$ is the weighting vector of $y$, $E \in \mathbb{R}^{N \times M}$ is the residual matrix of $X$, $f \in \mathbb{R}^{N}$ is the residual vector of $y$, $N$ is the number of samples, $M$ is the number of input variables (wavenumber points), and $K$ is the number of adopted latent variables.

The nonlinear iterative partial least squares (NIPALS) algorithm can be used to construct a PLS model (Wold et al., 2001). Suppose that the first to $(k-1)$th score vectors are $t_1, t_2, \ldots, t_{k-1}$, the loading vectors of $X$ are $p_1, p_2, \ldots, p_{k-1}$, and the weighting values of $y$ are $b_1, b_2, \ldots, b_{k-1}$. The $k$th residual input matrix and the $k$th residual output vector are expressed as follows:

$$X_k = X_{k-1} - t_k p_k^{T} \qquad (3)$$

$$y_k = y_{k-1} - b_k t_k \qquad (4)$$

Here $X_0 = X$ and $y_0 = y$. The $k$th score vector $t_k$ is given by

$$t_k = X_{k-1} w_k \qquad (5)$$

where $w_k \in \mathbb{R}^{M}$ is the $k$th weighting vector.

2.5. Calibration and validation sets

Calibration and validation sets are shown in Table 1. In the granulation process, each batch had four to twelve sampling time points according to the granulation time. After the blending process, the blended granules were sampled to make one sample in each batch. The calibration set was used to construct PLS models and to evaluate their calibration accuracy on the basis of the standard error of calibration (SEC). The validation set, which was different from the calibration set, was used to evaluate their estimation accuracy on the basis of the standard error of prediction (SEP) and the coefficient of determination ($R^2$). The deviation of SEP from SEC measures the risk of over-fitting to the calibration set. SEC and SEP are calculated with Eq. (9) by using all the samples in the calibration set and those in the validation set, respectively. $R^2$ is calculated as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2} \qquad (10)$$

where all the samples in the validation set are used and $\bar{y}$ is the mean of the reference values in the validation set.

2.6. Spectral preprocessing
