Journal of Applied Bacteriology 1992, 73, 94-98

A multiple logistic model for predicting the occurrence of Campylobacter jejuni and Campylobacter coli in water

E. Skjerve and O. Brennhovd
Department of Food Hygiene, Norwegian College of Veterinary Medicine, Oslo, Norway

3596/2/91: accepted 15 February 1992

E. SKJERVE AND O. BRENNHOVD. 1992. A multiple logistic regression model was established to predict the occurrence of Campylobacter jejuni/coli, related to index bacteria such as faecal coliforms, faecal streptococci, and sulphite-reducing clostridia, in a water source in southern Norway. The fitted model indicated that faecal coliforms were strong predictors for C. jejuni/coli, although the water temperature also had a strong influence on results. Sulphite-reducing clostridia, faecal streptococci, and season of the year had no significant influence on the results, in spite of their apparent predictor value as separate variables. The model employed offers a new approach to the relationship between index bacteria and the occurrence of pathogenic bacteria in water. Similar models can also be established in general food microbiology.


Correspondence to: Dr Eystein Skjerve, Department of Food Hygiene, The Norwegian College of Veterinary Medicine, PO Box 8146, DEP, 0033 Oslo, Norway.

Pathogenic micro-organisms in water are of major concern for human health. Although water-borne pathogens are particularly important causes of infections in developing countries, frequently leading to death, they also pose significant health hazards in industrialized countries if present in drinking water or swimming pools. Campylobacter jejuni/coli are recognized as major causes of human enteritis. Several water-borne outbreaks have been reported, including two from Norway (Mentzing 1981; Vogt et al. 1982; Taylor et al. 1983; Rogol et al. 1983; Dahl & Melby 1987).

Although direct testing for most enteric bacterial pathogens is possible, their detection in water or foods is often laborious. Such analytical problems have led to the supplementation of tests for pathogens with appropriate tests for indicator bacteria (Mossel 1982a). The principle of employing indicator organisms was introduced almost 100 years ago by Schardinger (1892) and Smith (1899), who independently suggested using Escherichia coli as an indicator of enteric pathogens such as Salmonella typhi. According to Mossel (1981), Ingram introduced the term 'marker organisms', and distinguished between indicator bacteria that suggest inadequate bacteriological quality of a general nature, and index organisms whose presence provides evidence of the potential occurrence of specific pathogens or toxin-formers (Mossel 1981, 1982b). This terminology will be used throughout this paper, the term index bacteria being employed for those which indicate the presence of C. jejuni/coli.

Several bacteria have been suggested as markers of faecal pollution of water, among them coliform bacteria (E. coli and related lactose-fermenting, Gram-negative bacteria), faecal coliforms (mostly, though not exclusively, E. coli), faecal streptococci, Enterobacteriaceae, and sulphite-reducing clostridia (mostly Clostridium perfringens). Each bacterial group has had its proponents and opponents, but relevant markers for different environments have not been established (Hendriksen 1955; Buttiaux & Mossel 1961; Bonde 1963; Geldreich & Kenner 1969; Dutka 1973; Smith et al. 1973; Mossel 1978, 1982a, b; Bisson & Cabelli 1980; Cabelli 1982; Muller & Mossel 1982). Thus, the question as to which marker organisms are the best for different kinds of water is still controversial, and there is an obvious need for a new approach to deal with the matter.

Mossel (1982a) suggested the use of a ratio, colony forming units (cfu) of marker organism/cfu of pathogenic organism (&-factor), to ascertain the predictor value of the marker (index bacteria), and to determine from this the acceptable level for the marker organism in question. A linear relationship is difficult to use in this case, as most analyses for pathogens in water or food are qualitative (present or absent), quantification demanding a substantially increased workload. An alternative is to use statistical methods which can handle such dichotomous (±) outcomes. Logistic regression is one of these techniques, which is being increasingly applied in the analysis of epidemiological data and in medical decision making, adequate computer software having become available (Breslow & Day 1980; Schmitz 1986; Hosmer & Lemeshow 1989). There is also increasing agreement among statisticians that inferences based on regression methods often allow a better interpretation of data than analysis of variance approaches and simple hypothesis testing (Oakes 1986).

In the present study, we introduce a multiple logistic regression model for deciding which variables can serve as predictor variables for the occurrence of C. jejuni/coli in water. With the established model the probability of isolating C. jejuni/coli in specific water samples may be predicted, based on results from analyses for index bacteria and other markers.


Data were obtained from a study on the occurrence of index bacteria and C. jejuni/coli, through 1 year, in a water source in southern Norway (Brennhovd & Kapperud 1991). Water samples (246) were examined for faecal coliforms (FCB), faecal streptococci (FS), sulphite-reducing clostridia (SRC), and C. jejuni/coli (CAMP). Filter methods as described by Brennhovd et al. (1991) were used. Briefly, samples of 100 ml were filtered through separate nitrocellulose membrane filters (Millipore) with a pore size of 0.45 µm and a diameter of 47 mm. Membranes were placed face up on selective/indicator agar plates, whereafter plates were incubated and typical colonies were counted. For enumeration of CAMP, Preston agar (Bolton & Robertson 1982) was incubated for 24 and 48 h at 42-43°C in a microaerobic atmosphere; for FCB, m-Endo agar LES (Difco) was incubated at 44°C for 24 h; for FS, m-Enterococcus agar (Difco) was incubated at 44°C for 24 h; and for SRC, SFP agar (Difco) with D-cycloserine (Sigma) was incubated anaerobically at 44°C for 24 h. Water temperature and dates were also recorded. An external validation data set (n = 96) containing the same variables was obtained from three different water sources around Oslo, Norway, as described by Brennhovd et al. (1991).

Table 1 Information in the data set used in establishing a multiple logistic model for prediction of the occurrence of Campylobacter jejuni/coli in a Norwegian water source

Season: December-February (1), March-May (2), June-August (3), September-November (4)
Temperature: 0-2°C (1), 2-7°C (2), 7-12°C (3), >12°C (4)
Faecal coliform bacteria (FCB)/100 ml (log10)
Faecal streptococci (FS)/100 ml (log10)
Sulphite-reducing clostridia (SRC)/100 ml (log10)
Campylobacter jejuni/coli present in 100 ml of water (Y/N)
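The variable coding used for the data set can be sketched in a few lines of Python. This is only an illustration, not the authors' Multreg workflow: the function names are hypothetical, the temperature intervals are taken from the coefficients reported for the fitted model, and the handling of the exact boundary values (2, 7 and 12°C) is an assumption. The base-10 logarithm of the count plus 1 follows the transformation described under statistical analysis.

```python
import math


def season_code(month: int) -> int:
    """Season code: Dec-Feb = 1, Mar-May = 2, Jun-Aug = 3, Sep-Nov = 4."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    # December (12) wraps to 0 so that Dec, Jan, Feb share code 1.
    return (month % 12) // 3 + 1


def temperature_code(temp_c: float) -> int:
    """Temperature class 1-4; which class owns a boundary value is assumed."""
    if temp_c <= 2.0:
        return 1
    if temp_c <= 7.0:
        return 2
    if temp_c <= 12.0:
        return 3
    return 4


def log_count(cfu_per_100ml: float) -> float:
    """log10(count + 1); the constant 1 avoids the undefined logarithm of 0."""
    return math.log10(cfu_per_100ml + 1.0)


if __name__ == "__main__":
    # Hypothetical sample: taken in July at 14.5 degrees C with 99 FCB per 100 ml.
    print(season_code(7), temperature_code(14.5), log_count(99))
```

Keeping the transformation in one helper makes it harder to apply the logarithm to a raw zero count by mistake.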

Statistical analysis

A programme for logistic regression in epidemiology was used in the analyses (Multreg, Ludwig Institute of Cancer Research, Epidemiology and Statistical Unit, Sao Paulo, Brazil). Data for FCB, FS and SRC were transformed to logarithms, and dummy variables for seasons and temperature were created. The structure of the final data set is shown in Table 1. Before logarithmic transformation, a constant of 1 was added to FCB, FS and SRC to avoid the non-existing logarithm of 0. Of all 246 samples, 84 (34%) were found to harbour campylobacters. FCB were detected in 90% of samples, SRC in 97% and FS in 85% of all samples. The levels of FCB, SRC and FS are shown in Table 2.

Table 2 Counts of faecal coliform bacteria (FCB)/100 ml, faecal streptococci (FS)/100 ml and sulphite-reducing clostridia (SRC)/100 ml given as median and range counts

Columns: Faecal coliform bacteria (FCB)/100 ml; Faecal streptococci (FS)/100 ml; Sulphite-reducing clostridia (SRC)/100 ml
Median: 25 23
Range: (0-1000) (0-500) (0-500)

The Kruskal-Wallis test was performed with the statistical/epidemiological programme EPI-Info (Centers for Disease Control, Atlanta, GA/World Health Organization, Geneva, Switzerland). The logistic model was built mainly as suggested by Hosmer & Lemeshow (1989):

(1) Univariate logistic analysis of each variable.

(2) Variables with a P-value < 0.25 from Wald's test (coefficient/standard error, s.e.) from the univariate analysis were included in the model building steps.

(3) Selection of variables in the logistic model was done in a forward selection process, using the likelihood ratio for the model with and without the new variable as the criterion. A variable was considered to improve the model if the increase in the likelihood ratio was significant at a P-value of 0.20. FCB, FS and SRC were tried both as continuous variables, and after creation of dummy variables (groups: 0, 1-10, 11-100 and > 100 cfu/100 ml).

(4) Goodness of fit of the model was assessed by the Hosmer-Lemeshow χ²-test.



(5) The model was validated on the second data set using the Hosmer-Lemeshow χ²-test for goodness of fit.

(6) The Kruskal-Wallis test was run to determine the influence of season and temperature on the index bacteria (FCB, FS, SRC).

Establishing a prediction model

From the results of the logistic analysis, a probability function was established as described by Hosmer & Lemeshow (1989):

$$P^{*} = P(\text{C. jejuni/coli}) = \frac{e^{T}}{1 + e^{T}}$$

where

$$T = \text{constant} + (\text{coefficient}_1 \times \text{var}_1) + (\text{coefficient}_2 \times \text{var}_2) + \ldots$$

P* in the prediction equation represents the overall predicted probability of the occurrence of C. jejuni/coli.

Table 4 Prediction of the occurrence of Campylobacter jejuni/coli using a multiple logistic regression model. Predicted and observed frequencies of detection for different cut-off levels for the prediction function P*

P* interval    Samples (n)    Predicted    Observed
0.0-0.1        54             3.1          3
0.1-0.2        48             6.9          11
0.2-0.3        25             6.2          4
0.3-0.4        26             8.9          7
0.4-0.5        33             13.9         14
0.5-0.6        11             5.9          5
0.6-0.7        12             7.8          6
0.7-0.8        10             7.3          9
0.8-0.9        17             14.7         16
0.9-1.0        10             9.2          9
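The Hosmer-Lemeshow statistic quoted for Table 4 can be approximately re-derived from the ten rows of that table. The sketch below uses the usual grouped form of the statistic, the sum over groups of (observed - expected)² / (expected × (1 - expected/n)). It is an assumption that Multreg computed the statistic in exactly this way, and the function name is hypothetical. Because the predicted counts are rounded to one decimal, the result should come out close to the reported χ² of 8.33 rather than match it exactly.

```python
# Rows of Table 4: (samples in the P* interval, predicted positives, observed positives)
TABLE_4 = [
    (54, 3.1, 3), (48, 6.9, 11), (25, 6.2, 4), (26, 8.9, 7), (33, 13.9, 14),
    (11, 5.9, 5), (12, 7.8, 6), (10, 7.3, 9), (17, 14.7, 16), (10, 9.2, 9),
]


def hosmer_lemeshow(rows):
    """Grouped Hosmer-Lemeshow chi-square from (n, expected, observed) rows."""
    stat = 0.0
    for n, expected, observed in rows:
        mean_p = expected / n  # average predicted probability in the group
        stat += (observed - expected) ** 2 / (expected * (1.0 - mean_p))
    return stat


if __name__ == "__main__":
    print("samples:", sum(n for n, _, _ in TABLE_4))
    print("Hosmer-Lemeshow chi-square:", round(hosmer_lemeshow(TABLE_4), 2))
```

The group sizes sum to the 246 samples of the study, which is a quick check that the rows have been read correctly.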

Logarithms of counts gave better fit than original counts. Univariate logistic regression analysis revealed Wald statistics with P-values of less than 0.25 for all variables. All initial variables were therefore tested in the model. The main results from the model building are shown in Table 3. The fitted model included temperature intervals and FCB as a continuous variable. SRC, FS and season had no significant predictor value in the overall model. No interactions between the predictor variables in the model were detected in the analyses (P > 0.15). Table 4 shows the predicted number of samples with C. jejuni/coli compared with observed values. The Hosmer-Lemeshow 'goodness of fit' statistic based on data from Table 4 gave a χ² of 8.33 (P = 0.50), indicating a very good fit with empirical data. Applying the same model to the external validation data set gave a χ² of 10.67 (P = 0.30).

Table 3 Multiple logistic model for prediction of probability of finding Campylobacter jejuni/coli in water. Model given with coefficients for each parameter with standard error (s.e.), and Wald test for deviation from zero (coefficient/s.e.) with corresponding P-value

Variable            Coefficient (s.e.)    Wald (P)
Temp. 0-2°C         0
Temp. 2-7°C         2.17 (0.45)
Temp. 7-12°C        2.65 (0.51)
Temp. >12°C         1.12 (0.49)
FCB/100 ml (log)    1.62 (0.25)
Constant            -4.38 (0.57)

FCB, Faecal coliform bacteria.

The probability function was thus established with the T-function from the final multiple regression model:

$$P^{*} = \frac{e^{T}}{1 + e^{T}}, \qquad T = -4.38 + (1.62 \times \log \mathrm{FCB}) + C_{\mathrm{Temp}}$$

where C_Temp = 0 (0-2°C), 2.18 (2-7°C), 2.65 (7-12°C), 1.12 (>12°C).
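As a worked illustration of the fitted function, the following Python sketch evaluates P* for a given FCB count and water temperature, and inverts the relation to find the FCB level that corresponds to a chosen probability. Several points are assumptions rather than statements from the paper: the function names are hypothetical, the logarithm is taken as log10 of the count plus 1 (the transformation applied to the data), the temperature term for 2-7°C uses the value 2.18 quoted with the equation (Table 3 lists 2.17), and the assignment of the boundary temperatures 2, 7 and 12°C to the lower interval is a guess.

```python
import math

INTERCEPT = -4.38      # constant of the fitted model
COEF_LOG_FCB = 1.62    # coefficient for log FCB


def temperature_term(temp_c: float) -> float:
    """C_Temp for the four temperature intervals (boundary handling assumed)."""
    if temp_c <= 2.0:
        return 0.0
    if temp_c <= 7.0:
        return 2.18    # value given with the equation; Table 3 lists 2.17
    if temp_c <= 12.0:
        return 2.65
    return 1.12


def predicted_probability(fcb_per_100ml: float, temp_c: float) -> float:
    """P* = e^T / (1 + e^T), written in the equivalent form 1 / (1 + e^-T)."""
    t = (INTERCEPT
         + COEF_LOG_FCB * math.log10(fcb_per_100ml + 1.0)
         + temperature_term(temp_c))
    return 1.0 / (1.0 + math.exp(-t))


def fcb_for_probability(target_p: float, temp_c: float) -> float:
    """FCB count per 100 ml at which P* equals target_p (0 if already exceeded)."""
    logit = math.log(target_p / (1.0 - target_p))
    log_fcb = (logit - INTERCEPT - temperature_term(temp_c)) / COEF_LOG_FCB
    # A negative count means P* is above target_p even with no FCB detected.
    return max(0.0, 10.0 ** log_fcb - 1.0)


if __name__ == "__main__":
    print(predicted_probability(99, 8.0))   # 99 FCB per 100 ml at 8 degrees C
    print(fcb_for_probability(0.5, 8.0))    # FCB level giving P* = 0.5 at 8 degrees C
```

The inverse function does numerically what reading an acceptable FCB level off a probability curve would do graphically, one temperature interval at a time.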

Figure 1 shows an idealized graphical interpretation of the predicted probabilities, P*, of finding C. jejuni/coli in the four different temperature intervals. If an acceptable prevalence of C. jejuni/coli is established, acceptable levels of FCB can also be read from the same figure. The

