G Model ACA 233278 No. of Pages 5

Analytica Chimica Acta xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Analytica Chimica Acta journal homepage: www.elsevier.com/locate/aca

Analysis of breath samples for lung cancer survival

Birgitta Schmekel a,b, Fredrik Winquist c,*, Anders Vikström d

a Division of Clinical Physiology, County Council of Östergötland, Linköping, Sweden
b Clinical Physiology, Department of Medicine and Health, Faculty of Health Sciences, Linköping University, Linköping, Sweden
c Department of Physics, Chemistry and Biology, Linköping University, Linköping SE-581 83, Sweden
d Department of Pulmonary Medicine, University Hospital of Linköping, County Council of Östergötland, Linköping, Sweden

H I G H L I G H T S

• Analyses of exhaled air offer a large diagnostic potential.
• Patients with diagnosed lung cancer were studied using an electronic nose.
• Excellent predictions and stable models of survival days were obtained.
• Consecutive measurements were very important.

G R A P H I C A L A B S T R A C T

Predictions of survival days for lung cancer patients.

A R T I C L E I N F O

Article history:
Received 8 February 2014
Received in revised form 14 May 2014
Accepted 20 May 2014
Available online xxx

Keywords:
Breath analysis
Electronic nose
Lung cancer
Survival prediction

A B S T R A C T

Analyses of exhaled air by means of electronic noses offer a large diagnostic potential. Such analyses are non-invasive, and samples can easily be obtained even from severely ill patients and repeated at short intervals. Lung cancer is the deadliest malignant tumor worldwide; monitoring its progression is of great importance and may help in choosing the best therapy. In this report, twenty-two patients with diagnosed lung cancer and ten healthy volunteers were studied using breath samples collected several times at set intervals and analysed by an electronic nose. The samples were divided into three groups: group d for patients who survived less than one year, group s for patients who survived more than one year, and group h for the healthy volunteers. Prediction models based on partial least squares and artificial neural nets could not classify the three groups d, s and h together, but separated group d from group h well. Using an artificial neural net, group d could also be separated from group s. Excellent predictions and stable models of survival days for group d were obtained with both partial least squares and artificial neural nets, with correlation coefficients of 0.981 and 0.985, respectively. Finally, the importance of consecutive measurements was shown. © 2014 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +46 730933946. E-mail address: [email protected] (F. Winquist).

http://dx.doi.org/10.1016/j.aca.2014.05.034
0003-2670/© 2014 Elsevier B.V. All rights reserved.

Please cite this article in press as: B. Schmekel, et al., Analysis of breath samples for lung cancer survival, Anal. Chim. Acta (2014), http://dx.doi.org/10.1016/j.aca.2014.05.034

B. Schmekel et al. / Analytica Chimica Acta xxx (2014) xxx–xxx

1. Introduction

In the past, the scent of exhaled air has been used to diagnose diseases such as diabetes and liver disease. Presently, a large diagnostic potential is associated with the analysis of exhaled air by means of an electronic nose (EN). Such analyses are non-invasive, and samples can easily be obtained even from severely ill patients and repeated at short intervals. The analyses may give information concerning various metabolic pathways and disorders. Exhaled air contains a complex mixture of volatile organic compounds (VOCs) arising from volatile constituents in the blood. This was first reported by Pauling et al. [1], who combined gas chromatography with mass spectrometry (GC–MS). Since then, many studies have sought to associate specific VOCs with certain diseases, such as mercaptans and alkanes with liver disturbances [2] and various amines with uremic diseases [3]. GC–MS has identified alkanes, alkane derivatives and benzene derivatives in exhaled air from lung cancer patients [4], and combinations of multiple VOCs have discriminated patients with lung cancer from non-cancer patients [5]. GC–MS is, however, of limited use for biomarker detection in clinical practice, owing to its complex handling and the large and complex data sets it produces.

The emerging technology of electronic noses in the 1980s offered an alternative approach. They were originally described as a synthetic olfaction [6], since they operate in a way similar to the olfactory sense. An electronic nose consists of a gas sensor array with partially overlapping selectivities, a data collecting unit and signal processing routines for pattern recognition. When the gas sensor array interacts with the VOCs in a breath sample, a characteristic fingerprint is generated, which can be matched against previously recorded fingerprints using pattern recognition routines. Electronic noses have attracted considerable interest since their first appearance and have found applications in many fields, recently also in medicine [7–9]. For example, bacteria have been identified [10], and the number of applications has grown considerably to include many diagnostic fields [11], most of them concerning lung diseases such as lung cancer [5] and asthma [12].

Lung cancer is the deadliest malignant tumor worldwide, with an overall 5-year survival below 15%. If it is diagnosed and treated at an early stage, the 5-year survival increases considerably, to approximately 50%. A number of new and more efficacious anti-cancer treatments have been introduced during the last five years.
The high costs and unpredictable efficacy of these drugs necessitate a method for monitoring, prediction of outcome and/or drug tailoring. Monitoring of lung cancer progression by means of biomarker analyses or reliable scoring of well-being is presently lacking, and cheap and simple methods are therefore needed for this purpose. To monitor the progress of the disease, several measurements must be made at certain time intervals. Furthermore, measurements before and after treatment may be made to document the effects of a given treatment. Follow-up studies after lung cancer surgery have been carried out by means of GC–MS [13] and a nanomaterial-based sensor array [14], but studies using the e-nose technique have mostly been cross-sectional [4]. Although repeated measurements with a seven-sensor e-nose were made in subjects with chronic obstructive pulmonary disease to study reproducibility [15], true longitudinal studies establishing the utility of ENs for monitoring disease progression and pharmacological response have not been published. Poli et al. [13], in a three-year follow-up study using GC–MS, showed that lung cancer surgery influenced the concentrations of some exhaled VOCs, suggesting that measurements of exhaled VOCs could be useful in monitoring the disease. Although some biomarkers and certain gene signatures have been claimed to serve as prognostic and predictive tools, a gain in overall survival has not been shown [16]. Prospective studies on the efficacy of ENs in predicting outcome have not been reported previously either. We therefore performed a pilot study to investigate whether EN signals might predict the outcome of disease in patients with end-stage lung cancer during palliative chemotherapy.
This paper describes how an electronic nose was used to “predict” the survival of a group of patients suffering from lung cancer, by analysing a series of samples of exhaled breath collected at specific time intervals before and after chemotherapy. The prediction model obtained can then be used to predict the expected survival of new patients, giving a measure of the severity of their disease. It is also expected that effects of treatment may be indicated. As information from new patients is added to the prediction model, it should become more accurate over time.

2. Materials and methods

2.1. Multivariate data analysis (MVDA)

In many analytical fields, MVDA is increasingly used to investigate and get an overview of large amounts of data, and it is a key technique when evaluating fingerprints from the sensor array in the electronic nose. Two basic principles are used: one is to find structure and correlations between samples, the other is to build a prediction model from calibration sets and use it to predict real data. For the first, principal component analysis (PCA) is most often used [17]; for the second there are many alternative methods, of which partial least squares, also called projection to latent structures (PLS) [18], and artificial neural nets (ANNs) [19] are often used. ANNs consist of an input layer, one or more hidden layers and an output layer. The layers are connected to each other through logistic (sigmoid) transfer functions, and training is often done by backpropagation of errors. When dealing with non-linear data, ANNs often give better predictions than linear methods such as PLS. Since ANNs are sensitive to large numbers of input variables, the most important variables, as indicated by the regression coefficients of the PLS model, can be selected as inputs.

There are various ways to assess the validity of a prediction model. The correlation coefficient is often used, as is the root mean square error of prediction (RMSEP). In many practical applications the RPD (relative predicted deviation) value is used. This is defined as the standard deviation of the whole data set divided by the standard error of prediction. For a useful model, this value should be 2 or higher.
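The three validation measures named above can be written out in a few lines. The following is a minimal sketch in plain Python, not the calculation performed by the software used in this study. The function names are illustrative, and it is assumed here that the RMSEP serves as the standard error of prediction in the RPD ratio and that the sample standard deviation (n − 1 in the denominator) is used.

```python
import math


def rmsep(true_values, predicted_values):
    """Root mean square error of prediction."""
    n = len(true_values)
    squared = [(t - p) ** 2 for t, p in zip(true_values, predicted_values)]
    return math.sqrt(sum(squared) / n)


def correlation(true_values, predicted_values):
    """Pearson correlation coefficient between true and predicted values."""
    n = len(true_values)
    mean_t = sum(true_values) / n
    mean_p = sum(predicted_values) / n
    cov = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(true_values, predicted_values))
    var_t = sum((t - mean_t) ** 2 for t in true_values)
    var_p = sum((p - mean_p) ** 2 for p in predicted_values)
    return cov / math.sqrt(var_t * var_p)


def rpd(true_values, predicted_values):
    """Standard deviation of the reference values divided by the RMSEP.

    Assumption: RMSEP is used as the standard error of prediction.
    """
    n = len(true_values)
    mean_t = sum(true_values) / n
    sd = math.sqrt(sum((t - mean_t) ** 2 for t in true_values) / (n - 1))
    return sd / rmsep(true_values, predicted_values)
```

With these definitions, a model whose prediction error is half the spread of the reference values gives an RPD of about 2, the threshold for a useful model stated above.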
The software SIRIUS 6.5 (Pattern Recognition Software, PRS, Bergen, Norway) was used for PLS analysis, and the software Brainmaker (California Scientific Software, USA) for ANN analysis.

2.2. The electronic nose (EN)

The EN was a model 2010 from Applied Sensor AB (Linköping, Sweden), which is normally used for sequential analysis with a sample carousel and an injector needle. The EN was modified by disconnecting the carousel and attaching sample collecting tubing to the injector needle. Aluminium bags containing breath samples were placed in a specially designed sample holder and attached to the sample collecting tubing. The tubing was thermostated to 55 °C and the sample holder to 40 °C to avoid water condensation.

The sensor array in the EN consisted of 10 metal–oxide–semiconductor field effect transistors (MOSFET) and 12 metal oxide semiconductor (MOS) sensors. These two sensor types represent two different sensing classes. MOS sensors are more sensitive to stable alkanes than MOSFET sensors, while the latter are more sensitive to nucleophilic compounds (e.g. ammonia, amines). Each of the 22 sensors in the array has its own sensing profile, and by using a multivariate approach, each component of the VOC mixture can be identified.

2.3. The sample bags

In contrast to commercially available bags, which may be permeable to various gases and may contain emissions from glue or plastics, our sample bags are impermeable to gas diffusion (even for hydrogen gas) and do not emit VOCs. Thick aluminium foil (thickness 14 µm) used for food preparation fulfils these demands, and the foil was obtained from Skultuna AB, Västerås, Sweden. An aluminium foil sheet (55 mm × 40 mm) was folded to give a double-layered bag, approximately 25 cm × 35 cm in size, which could collect up to 3 L of a breath sample. A sealed inlet tube, into which the patient exhaled, was connected to one corner of the sample bag.

2.4. Measuring procedure

Subjects were instructed not to smoke or eat for at least 2 h prior to sampling of exhaled air. All subjects were asked to rinse their mouth with tap water before sampling, which was done by slow expirations into the aluminium bag. For simplicity, we sampled the whole breath, without excluding the initial portion of the exhaled bolus originating from the airway dead space. A mixture of expired air from alveolar and airway dead space regions was thereby collected. Since the volume of alveolar air by far exceeds that originating from the airway dead space, we judged that mainly alveolar air was sampled for analysis by the EN. The sample bag was placed in the sample holder and connected to the sample inlet tube within 1 h. Samples were then injected into the EN for 40 s, and peak responses from the sensor array were recorded. Each sample was measured four times, and the mean of the last two measurements was used for calculation.

2.5. Breath samples

Twenty-four patients with diagnosed lung cancer were included. Breath samples were collected before any anticancer treatment was given and thereafter every third week, up to four times, immediately before each dose of treatment. One patient withdrew consent, another patient survived only one month and one of the samples retrieved was an obvious measurement failure, leaving samples from twenty-two patients for further analysis. Furthermore, only two samples were collected from one of the cancer patients.
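The replicate handling described in Section 2.4 (four injections per bag, with the mean of the last two used for calculation) can be illustrated as follows. This is a hedged sketch only: the function name and the data layout (one list of peak responses per injection, in injection order) are assumptions made for illustration and are not taken from the instrument software.

```python
def mean_of_last_two(replicates):
    """Average the last two replicate readings, sensor by sensor.

    `replicates` holds the readings in injection order; each reading is a
    list with one peak response per sensor (hypothetical layout).
    """
    if len(replicates) < 2:
        raise ValueError("at least two replicate readings are required")
    second_last, last = replicates[-2], replicates[-1]
    if len(second_last) != len(last):
        raise ValueError("replicate readings must cover the same sensors")
    return [(a + b) / 2 for a, b in zip(second_last, last)]
```

The result is one fingerprint per breath sample, with one averaged value per sensor signal.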
The mean age of the patients was 71 years; 19 were smokers or ex-smokers and ten also had moderate chronic obstructive pulmonary disease. The pathological anatomical diagnoses were adenocarcinoma (n = 14), squamous cell carcinoma (n = 5), large cell carcinoma (n = 2) and small cell carcinoma (n = 3). All except one patient had end-stage lung cancer at the time of inclusion [stage 4 (n = 17), stage 3 (n = 6)]. The average delay from the time when cancer was first suspected to sampling of exhaled air and start of chemotherapy was 2.7 months (0–6.6 months). Patients were monitored for clinical data for up to nearly two years, and at the end of the observation period five patients were still alive.

Chemotherapy was given in accordance with international guidelines. All subjects visited the clinic three times at three-week intervals; in six cases the third visit was followed by a fourth visit after at least three months. Clinical baseline data including case history were documented at all visits, and chemotherapy was started at visit 1. Ten patients received second-line therapy due to progression of disease. None of the patients suffered from any serious infection during the terminal stage.

Ten subjectively healthy non-smoking female volunteers, aged 42–64 years, were recruited as controls. Previously recorded data indicated no significant difference depending on gender or age (data not shown).

3. Results and discussion

Samples from all subjects were sub-grouped into sets according to survival data: 12 samples obtained from patients who survived less than one year were denoted “d” (diseased), 10 samples obtained from patients who survived more than one year were denoted “s” (survivors) and 10 samples from healthy volunteers were denoted “h”. Group d seemed to be in a rapidly developing process, whereas group h was stable. Group s tended to be an intermediate and heterogeneous group. It was thus anticipated that groups d and h would separate in MVDA, while predictions for group s would be less decisive.

Since each participant was measured at least three times and each measurement generated 32 sensor values, a 32 × 96 matrix was obtained. In the first data analysis, PLS modelling was used to classify the three groups d, s and h, with rather poor results. Classification with an ANN, using ten inputs, ten nodes in the hidden layer and one output node, gave similar results.

When only the two classes d and h were analysed, the predictions improved considerably. Predictions using a PLS model are shown in Fig. 1. Here and in the following, the classes were assigned the numerical values 1 and 2, respectively. The PLS model used three components, giving RMSEP = 0.154, a correlation coefficient of 0.954 and RPD = 3.2, indicating a good prediction. The same figure shows results from a corresponding class prediction based on an ANN. Like the PLS model, this model works well, with RMSEP = 0.134, a correlation coefficient of 0.976 and RPD = 3.7.

Finally, classification studies on data from classes d and s showed that only a weak prediction could be obtained with PLS modelling (correlation coefficient = 0.86 and RPD = 1.92), whereas the ANN predictions were good (correlation coefficient = 0.97 and RPD = 3.91), as can be seen in Fig. 2.

Models for prediction of survival in days were also developed.
Thus, a PLS-based prediction model was developed for the survival of patients in class d and class s. As expected, survival in class s is difficult to predict. The development of the disease depends on individual, genetic and/or other factors and is therefore, for these long survival times, very difficult to predict. It is thus of great importance that class s can be separated from class d by using an ANN.
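Two steps of the classification procedure lend themselves to a short illustration: choosing ten ANN inputs from the PLS regression coefficients, and turning a continuous model output into one of the class values 1 or 2. The sketch below assumes that "most important" means largest absolute regression coefficient and that outputs are assigned to the nearer class value, i.e. split at 1.5. Neither rule is stated in the text, and all function names are hypothetical.

```python
def select_top_variables(coefficients, k=10):
    """Indices of the k coefficients with the largest absolute value.

    Assumption: variable importance is ranked by |regression coefficient|.
    The indices are returned in ascending order.
    """
    ranked = sorted(range(len(coefficients)),
                    key=lambda i: abs(coefficients[i]),
                    reverse=True)
    return sorted(ranked[:k])


def decode_class(prediction, threshold=1.5):
    """Map a continuous model output to class value 1 or 2.

    Assumption: the midpoint 1.5 separates the two class values.
    """
    return 1 if prediction < threshold else 2


def classification_accuracy(true_classes, predictions, threshold=1.5):
    """Fraction of samples whose decoded class equals the true class."""
    decoded = [decode_class(p, threshold) for p in predictions]
    hits = sum(1 for t, d in zip(true_classes, decoded) if t == d)
    return hits / len(true_classes)
```

Reporting RMSEP and the correlation coefficient on the continuous outputs, as done here, preserves more information than the decoded classes alone, since it shows how close each prediction lies to its true class value.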

Fig. 1. Classification of groups d and h using PLS (purple, square markers) and ANN (blue, diamond markers). The dotted lines show the true class values (1 or 2). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)


Fig. 2. Classification of groups d and s using PLS (purple, square markers) and ANN (blue, diamond markers). The dotted lines show the true class values (1 or 2). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

For class d, however, PLS modelling showed an excellent prediction capacity and a stable model. Cross-validated true versus predicted values, based on “leave one out”, are shown in Fig. 3, with RMSEP = 17.0, correlation coefficient = 0.981 and RPD = 4.9.

A prediction model based on an ANN was also developed. As in the ANN classification study, the same strategy was used for choosing ten input variables. True versus predicted values of survival are shown in Fig. 4; as for the PLS study, a very good prediction capacity and a stable model were obtained, with RMSEP = 15.4, correlation coefficient = 0.985 and RPD = 5.6.

The effect of analysing a series of consecutive data was also examined. As samples from the patients were collected three to four times, information concerning the development of the disease was expected to be found in the data. PLS prediction models for reduced data were therefore studied. For data collected once, the correlation coefficient was 0.252; for data collected twice, 0.581; and for data collected three times, 0.871. This should be compared with the correlation coefficient of 0.981 for all measurements (also including some data collected four times). This clearly shows the importance of consecutive sampling.
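The “leave one out” validation and the reduced-data comparison can be sketched together. The example below assumes that each patient's row is built by concatenating the sensor values from successive visits, so that dropping later visits shortens the row. To stay self-contained it uses a deliberately simple stand-in model, a straight line fitted to the mean of each row, rather than PLS or an ANN. All names are hypothetical.

```python
def first_visits(features_by_visit, n_visits):
    """Concatenate the sensor values of the first n_visits visits."""
    row = []
    for visit in features_by_visit[:n_visits]:
        row.extend(visit)
    return row


def leave_one_out(rows, targets, fit, predict):
    """Predict each target from a model trained on all other samples."""
    predictions = []
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_targets = targets[:i] + targets[i + 1:]
        model = fit(train_rows, train_targets)
        predictions.append(predict(model, rows[i]))
    return predictions


def fit_line(rows, targets):
    """Stand-in model (not PLS): least-squares line on the row mean.

    Requires at least two training rows with different means.
    """
    xs = [sum(r) / len(r) for r in rows]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, targets))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, row):
    """Apply the fitted line to the mean of a single row."""
    slope, intercept = model
    return slope * (sum(row) / len(row)) + intercept
```

Calling `leave_one_out` on rows built with `first_visits(patient, 1)`, `first_visits(patient, 2)` and so on, and scoring each set of predictions against the true survival days, mirrors the comparison of data collected once, twice and three times reported above. Any regression method exposing a fit and a predict step can be substituted for the stand-in line.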

Fig. 3. Predictions of survival days by using PLS.

Fig. 4. Predictions of survival days by using ANN.

4. Conclusions

Access to estimates of likely survival is of great value when choosing anticancer treatment and monitoring the disease. Such information may be used as a basis for decisions on drug tailoring or choice of anti-cancer treatment, as well as for estimating the efficiency of treatment. It is important to stress that the results of this study rest upon data recorded in a limited number of patients; still, the very good power of both PLS and ANN to predict survival in the d group indicates that the models are stable. These results need to be confirmed in a larger population, which may easily be done, since the device can be scaled down to a much smaller size to be truly portable and designed so that patients can exhale directly into it.

The strategy for application in a clinical setting is to confirm the relevance of the present data in a larger study population and to relate smell points from new patients to the present prediction curves in order to predict survival. If there is a good correlation between EN data and survival, it would be possible to identify patients who are likely to progress early after treatment and who should therefore be monitored closely for relapse. Provided that the present prediction curve is also valid for a larger population, a consequence might be that more efficient or alternative therapy would be chosen for certain lung cancer patients. Conversely, patients with a good prognosis would find this information reassuring. Another application might be to monitor patients during treatment in order to enable early termination of less effective treatments.

Acknowledgements

The technical assistance of Mrs Izabella Sandberg, R.N., is gratefully acknowledged. This study was supported by the County Council of Östergötland.

References

[1] L. Pauling, A.B. Robinson, R. Teranishi, P. Cary, Quantitative analysis of urine vapour and breath by gas–liquid partition chromatography, Proc. Natl. Acad. Sci. U. S. A. 68 (1971) 2374–2376.
[2] H. Kaji, M. Hisamura, N. Sato, M. Murao, Clin. Chim. Acta 85 (1978) 279.
[3] M. Simenhof, J. Burke, L. Saukkonen, A. Ordinario, R. Doty, N. Engl. J. Med. 297 (1977) 132.
[4] K.D.G. van de Kant, et al., Clinical use of exhaled volatile compounds in pulmonary diseases: a systematic review, Respir. Res. 13 (2012) 117.
[5] A. D’Amico, G. Pennazza, M. Santonico, E. Martinelli, C. Roscioni, G. Galluccio, R. Paolesse, An investigation on electronic nose diagnosis of lung cancer, Lung Cancer 68 (2010) 170–176.
[6] K. Persaud, G. Dodds, Analysis of discrimination mechanisms in the mammalian olfactory system using a model nose, Nature 299 (1982) 352–355.
[7] A. Pavlou, A.P. Turner, Sniffing out the truth: clinical diagnosis using the electronic nose, Clin. Chem. Lab. Med. 38 (2000) 99–112.
[8] P. Montuschi, N. Mores, A. Trove, C. Mondino, P.J. Barnes, The electronic nose in respiratory medicine, Respiration 85 (2013) 72–84.
[9] N. Fens, M.P. van der Schee, P. Brinkman, P.J. Sterk, Exhaled breath analysis by electronic nose in airways disease. Established issues and key questions, Clin. Exp. Allergy 43 (7) (2013) 705–715.
[10] M. Bruins, Z. Rahim, A. Bos, W.W. van de Sande, H.P. Endtz, Diagnosis of active tuberculosis by e-nose analysis of exhaled air, Tuberculosis (Edinb.) 93 (2013) 232–238.
[11] A.D. Wilson, M. Baietto, Advances in electronic-nose technologies developed for biomedical applications, Sensors (Basel) 11 (2011) 1105–1176.
[12] S. Dragonieri, R. Schot, B.J. Mertens, C.S. Le, S.A. Gauw, A. Spanevello, O. Resta, N.P. Willard, T.J. Vink, K.F. Rabe, E.H. Bel, P.J. Sterk, An electronic nose in the discrimination of patients with asthma and controls, J. Allergy Clin. Immunol. 120 (2007) 856–862.
[13] D. Poli, et al., Breath analysis in non small cell lung cancer patients after surgical tumour resection, Acta Biomed. 79 (2008) 64–72.
[14] Y. Broza, et al., A nanomaterial-based breath test for short-term follow-up after lung tumour resection, Nanomed. Nanotechnol. Biol. Med. 9 (2013) 15–21.
[15] R.A. Incalzi, et al., Reproducibility and respiratory function correlates of exhaled breath fingerprint in chronic obstructive pulmonary disease, PLoS One 7 (2012) 10–18.
[16] E. Thunnissen, et al., Prognostic and predictive biomarkers in lung cancer: a review, Virchows Arch. 464 (2014) 47–58.
[17] S. Wold, K. Esbensen, P. Geladi, Principal component analysis: a tutorial, Chemom. Intel. Lab. Syst. 2 (1987) 37–52.
[18] P. Geladi, B. Kowalski, Partial least square regression: a tutorial, Anal. Chim. Acta 185 (1986) 1–17.
[19] J. Lawrence, Introduction to Neural Networks and Expert Systems, California Scientific Software, Nevada City, USA, 1992.
