Bacterial colony counting by Convolutional Neural Networks.

Alessandro Ferrari1,2, Stefano Lombardi1, Alberto Signoroni1

Abstract: Counting bacterial colonies on microbiological culture plates is a time-consuming and error-prone, yet fundamental, task in microbiology. Computer vision based approaches can increase the efficiency and reliability of the process, but accurate counting is challenging due to the high variability of agglomerated colonies. In this paper, we propose a solution that adopts Convolutional Neural Networks (CNNs) to count the colonies contained in confluent agglomerates, and that scored an overall accuracy of 92.8% on a large, challenging dataset. The proposed CNN-based technique for estimating the cardinality of colony aggregates outperforms traditional image processing approaches, making it a promising approach for many related applications.

I. INTRODUCTION
Interest in laboratory automation in the Clinical Microbiology (CM) field [1], [2] is rising, mainly due to the steadily increasing number of samples that clinical facilities have to process daily. The current challenge for CM labs is to increase efficiency, in terms of both speed and cost, together with the quality of specimen processing. Automation frees microbiologists from the laborious preanalytical phases, which comprise manual handling of specimens, manual streaking of samples on the culture media, and sample incubation. Most importantly, modern full laboratory automation (FLA) solutions allow bacterial growth to be monitored by means of digital images recorded at different phases of the incubation. However, microbiologists still need to screen a large number of images of bacterial cultures on Petri dishes every day. Computer vision techniques are well suited to reducing the burden of the screening operations while improving the reliability of the process.

In this work, we focus on a tool for assisting the screening of microbiological samples cultured on Petri dishes. The screening phase assesses whether a sample is indicative of an infectious disease in the patient (thus requiring further analysis) or is not relevant (and can be discarded). The bacterial load of these samples is relevant for diagnosing infectious diseases and is determined by colony counting. This study focuses on urine samples, which numerically represent a large portion of all samples examined in CM labs. As in many laboratories worldwide, specimens were inoculated on Trypticase Soy Agar with 5% sheep blood (an example is shown in Fig. 1). An automated solution for colony counting can improve bacterial load estimation.
To achieve this goal, an accurate and precise count of single colonies or small confluent agglomerates must be guaranteed, especially for relatively low bacterial loads (e.g. fewer than 80-100 colonies), so that the software can safely sort the plates into bacterial load intervals or screen out negative samples (i.e. those presenting a number of colonies lower than a given threshold). For higher bacterial loads, by contrast, especially when massive confluent areas occur, the count is not required to be exact and a coarse estimate is sufficient.

Typically, colony counting methods start with a segmentation process. It is then usually necessary to count the colonies contained in each segment and to discard outliers (see Fig. 2). This work focuses on this specific step. The distinctive aspect of the proposed approach is to train convolutional neural networks (CNNs) to estimate the cardinality of the colony aggregates, yielding a solution that can outperform traditional image processing methods. Once the number of colonies contained in each segment is established and the outliers are rejected, determining the overall count is trivial. More generally, for medium-high bacterial loads where large agglomerates are present, the proposed solution can be regarded as a (possibly extendible) component of a bacteria counting evaluation system.

Fig. 1: Example of a blood agar plate and image segments showing confluent colonies.

This work was partially supported by MIUR: Progetto Cluster Fabbrica Intelligente, Adaptive Manufacturing, CTN01 00163 216730.
1 Information Engineering Dept., University of Brescia, via Branze 38, I25123 Brescia (Italy)
2 Copan Italia S.p.A., via Perotti 10, I25125 Brescia (Italy)
978-1-4244-9270-1/15/$31.00 ©2015 IEEE

II. RELATED WORK
The literature on bacterial colony counting reports a number of different methods. In [3], a distance transform is applied to a segmented, binarized image, and local maxima with values above a certain threshold are considered as colonies. A method that exploits a particular lighting setup, producing countable spot reflections on certain colonies, is proposed in [4]. The watershed transform is used to split clumps once the colonies have been segmented in both [5] and [6]. Another counting solution, based on grayscale morphological analysis, is introduced in [7]. These methods usually rely on extensive hand-tuning of parameters, which can be adjusted effectively only for a limited range of settings, so they may have difficulties with the large variety encountered in clinical routine. For example, it is hard to set a correct threshold for watershed splitting when dealing at the same time with micro-colonies of about 10 pixels in diameter on high resolution images and macro-colonies of hundreds or even thousands of pixels in diameter, especially in clinical settings where not all colonies have a regularly rounded morphology.

The shortcomings of traditional image processing methods have motivated classification approaches for determining segment cardinality. In [8], a method based on shape classification of the segments is presented: a Sanger neural network performs dimensionality reduction of the binary segments, which are then classified into categories from 1 to 7 colonies, or outliers, obtaining results similar to those attainable by watershed based techniques. In [9], a multistage classification identifies isolated colonies, which are detected by a classifier taking as input Zernike moment representations of the binary segments. The detected isolated colonies are then classified into different bacterial species.

CNNs are hierarchical models that are attaining state-of-the-art performance in many object classification and detection applications.
They were first introduced to overcome known problems of fully connected deep neural networks when handling high-dimensional structured inputs, such as images or speech [10]. Recently, favoured by the advent of fast GPUs and a few tricks added to the original design, CNNs have become state-of-the-art solutions for large scale object classification [11], [12] and for object detection on high resolution images [13], [12]. CNNs have already been applied to a variety of biomedical imaging problems. In [14], cells and nuclei of developing embryos of the C. elegans roundworm were segmented and located. A system that performs automatic segmentation of neuronal structures in electron microscopy images was presented in [15], while in [16] a CNN system was designed for mitosis detection in cell nuclei in breast cancer histology images, significantly outperforming previous solutions.

III. COLONIES DATASET
We built a dataset of about 17,000 small image segments, whose sizes vary depending on the dimension of the colonies and on the orientation of the colony aggregates. Segments are obtained from high resolution (0.0265 mm/pixel) RGB color images of clinical urine bacterial cultures inoculated on blood agar plates, collected by WaspLab™ automation systems (Copan Italia S.p.A., Italy) deployed at clinical sites. Segmentation is performed by means of thresholding techniques. Each segment is assigned to one of 7 classes: one for each number of colonies it contains, from 1 to 6, or an outlier class if it does not contain colonies but rather bubbles, dust or dirt on the agar (Fig. 2). Segments were labeled by humans by means of a dedicated GUI, and the labeling data were stored using a custom metadata format.

The dataset suffers from skewed classes, since most of the segments contain only one colony: 74.4% of segments contain an isolated colony, 11.6% contain 2 colonies, 5.1% contain 3 colonies, 2.8% contain 4 colonies, 1.6% contain 5 colonies, 1.7% contain 6 colonies and 2.6% are outliers. This is not surprising, as it reflects normal clinical work, where isolated colonies are more common than clustered ones. However, the cardinality of the dataset guarantees a sufficient number of examples for each class.

Fig. 2: Examples of dataset images containing from 1 (a) to 6 (f) colonies, and two examples of outliers, (g) and (h).

As stated, the dataset consists of variable-size images, whereas a traditional CNN requires a constant input dimensionality. Therefore, segments have been resized to a fixed size. Both cropping [11] and warping [13] approaches were tested. In the cropping approach, segments are cropped with a square bounding box whose edges are slightly longer than the greater of the segment's horizontal and vertical extents, replicating the bounding box border as padding where necessary. Conversely, in the warping approach, the segments are resized to fit a constant size by warping their content. The cropping method has the drawback that, if the segment is elongated in one direction, a lot of context may be included in the other direction. However, the morphological features of clustered colonies are not badly and unpredictably distorted, as can happen in the warping approach. This is probably why, in our experiments, the warping approach performed worse than the cropping one, which was therefore selected. Images have been resized to 128x128 pixels, since in our tests smaller sizes start to decrease the overall performance.

Dataset augmentation techniques based on class-preserving transformations have also been investigated and tested to increase the training set cardinality. Since binary
masks of the segments were available, segment masking has been performed in order to remove clutter falling inside the field of view of the CNN. After this first phase, different datasets were created in order to test the effect of each transformation: a horizontal flip of the images, doubling the training dataset; three different artificial color distortions in RGB color space; conversion of the masked dataset to grayscale; and seven different values of spatial rescaling before cropping. The last augmentation tested was a normalization with respect to the segment orientation. Since colonies cluster together at different angles, this process may enforce some rotation invariance. Normalization was obtained by calculating the moments of the image, converted to grayscale, as

$$M_{ij} = \sum_x \sum_y x^i y^j I(x, y) \qquad (1)$$

with $I(x, y)$ the image pixel intensities. The centroid is then calculated as

$$(c_1, c_2) = (M_{10}/M_{00},\; M_{01}/M_{00}) \qquad (2)$$

and, with this value, the central moments can be calculated:

$$\mu_{pq} = \sum_x \sum_y (x - c_1)^p (y - c_2)^q I(x, y) \qquad (3)$$

Given $\mu'_{20} = \mu_{20}/\mu_{00}$, $\mu'_{02} = \mu_{02}/\mu_{00}$ and $\mu'_{11} = \mu_{11}/\mu_{00}$, the segment orientation can be extracted by calculating its angle as follows:

$$\theta = \frac{1}{2} \arctan\left(\frac{2\mu'_{11}}{\mu'_{20} - \mu'_{02}}\right) \qquad (4)$$

However, despite the randomness of the colony confluence patterns, colonies often tend to cluster in orientations roughly aligned with the original streaking path. Moreover, colonies show reflections with a clear horizontal orientation, which is lost after rotation. Thus, enforcing rotation invariance often destroys those properties and does not always lead to a performance improvement. The dataset has been randomly split into 70% for training and 30% for testing.

IV. NETWORK TOPOLOGY AND TRAINING
The proposed colony classifier has been implemented using BVLC Caffe [17]. Similarly to [10], our CNN contains five learned layers, four convolutional and one fully connected, as shown in Fig. 3:
• 1st conv. layer, 20 feature maps with filter size 5x5;
• 2nd conv. layer, 50 feature maps with filter size 5x5;
• 3rd conv. layer, 100 feature maps with filter size 4x4;
• 4th conv. layer, 200 feature maps with filter size 4x4;
• first fully connected layer, 500 hidden units;
• soft-max output layer.

Fig. 3: Convolutional Neural Network topology.

In order to speed up learning convergence, non-saturating non-linearities f(x) = max(0, x), also referred to as the ReLU activation function [18], are adopted on all layers. Deep
convolutional neural networks with ReLUs learn considerably faster than their sigmoid counterparts. The output of each convolutional layer, after passing through the ReLU non-linearity, is normalized by means of Local Response Normalization [11] and then downsampled with non-overlapping max-pooling. To reduce over-fitting in the two fully-connected stages, random dropout (cross-validated at 75% of the weights) is applied to them [19]. Network weights are initialized with the scheme that the Caffe framework calls xavier, which draws each weight from a uniform probability distribution over $[-a, a]$, where

$$a = \sqrt{\frac{3}{\mathrm{fan}_{in}}} \qquad (5)$$

and $\mathrm{fan}_{in}$ is the number of input nodes. This scheme is a modification of the one examined in [20]. Bias values are initialized to constants. Training is performed with Stochastic Gradient Descent with a batch size of 64 and a momentum of 0.9. For regularization, weight decay is set to 0.0005. The learning rate is initialized to 0.01 and decreased by 0.01% at each iteration.

V. RESULTS AND DISCUSSION
The results are summarized in Fig. 4, which plots the testing accuracy against the number of training iterations. Several augmented datasets have been tested. Masking provided an accuracy gain of 1.5 percentage points over the original (unmasked) dataset. Converting the color space to grayscale gives equivalent results, suggesting that colony morphology provides the most discriminative features. Artificial color distortions did not bring a significant gain in overall accuracy, and neither did the rescaled and rotated datasets. The dataset augmented with horizontal flips gave the best performance, achieving an overall accuracy of 92.8%. During training, the testing accuracy flattens after 15,000 iterations.
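To put the iteration counts in context, the learning-rate schedule stated in Section IV can be evaluated numerically. The sketch below assumes that a decrease of 0.01% per iteration means multiplicative decay, i.e. the rate at iteration t is 0.01 · 0.9999^t; the paper does not spell out the exact policy, and the function name `learning_rate` is illustrative rather than part of Caffe.

```python
def learning_rate(iteration, base_lr=0.01, decay_per_iter=1e-4):
    """Learning rate after `iteration` steps.

    Assumption: the rate shrinks by a constant factor (1 - decay_per_iter)
    at every iteration, starting from base_lr.
    """
    return base_lr * (1.0 - decay_per_iter) ** iteration


# Evaluate the schedule at the start, at the reported plateau,
# and at the end of the reported training run.
for it in (0, 15_000, 50_000):
    print(f"iteration {it:>6}: lr = {learning_rate(it):.6f}")
```

Under this assumption the rate has dropped to roughly 22% of its initial value by iteration 15,000 and to below 1% of it by iteration 50,000.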
Training for 50,000 iterations took approximately 3 hours on an Nvidia Titan Black GPU. Fig. 5 shows the confusion matrix of the best performing model. Even though the classes corresponding to clusters of 3 to 6 colonies are sometimes misclassified, the wrongly selected labels remain close to the main diagonal, e.g. clusters of 5 colonies are often confused with clusters of 6 colonies. Thus, also considering the substantial symmetry of the confusion matrix, the overall final bacterial load estimation is only slightly affected. The discrimination of those aggregates is often hard even for a trained technician. The classifier effectively detected outliers, segments that are often very difficult to distinguish with standard image and feature analysis techniques.

Fig. 4: Overall accuracy of the tested CNN for the original and the considered expanded datasets.

Fig. 5: Confusion matrix for the horizontal flip dataset.

The per-colony error is used to assess counting accuracy:

$$Err = \sqrt{\frac{1}{C} \sum_{s \in S} (c_s - \bar{c}_s)^2} \qquad (6)$$

where $S$ is the test set, $c_s$ is the count for sample $s$ (considered empty, i.e. zero, if it is an outlier), $\bar{c}_s$ is the estimated count, and $C$ is the sum of the counts of all the aggregates in the test set. The best scoring CNN has a per-colony error of 0.29 on the test set. For comparison, a distance transform applied to the binary masks of the segments, combined with the watershed transform, has been implemented in a solution similar to [6]. The local maxima of the distance transform were used as markers for the watershed. The best per-colony error obtained in this case was 0.69. Thus, the proposed CNN approach drastically improves accuracy.
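The per-colony error can be sketched in a few lines. This is a minimal illustration that assumes Eq. (6) takes the square root of the summed squared count errors divided by C, and that outliers enter with a true count of zero; the function name and the toy counts are hypothetical and are not taken from the paper's data.

```python
import math


def per_colony_error(true_counts, estimated_counts):
    """Per-colony error under the assumed reading of Eq. (6).

    true_counts:      ground-truth colonies per segment (0 for outliers)
    estimated_counts: predicted colonies per segment (0 for predicted outliers)
    """
    if len(true_counts) != len(estimated_counts):
        raise ValueError("both sequences must have the same length")
    total = sum(true_counts)  # C: total number of colonies in the test set
    if total == 0:
        raise ValueError("the test set must contain at least one colony")
    squared = sum((c - e) ** 2 for c, e in zip(true_counts, estimated_counts))
    return math.sqrt(squared / total)


# Hypothetical toy example: six segments, two of them miscounted by one colony.
true_counts = [1, 1, 2, 3, 0, 6]
estimated = [1, 2, 2, 3, 0, 5]
print(per_colony_error(true_counts, estimated))
```

Because the sum is normalized by the number of colonies rather than the number of segments, a miscount on a large aggregate and a miscount on an isolated colony contribute equally to the numerator.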

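The observation that misclassifications stay close to the main diagonal of the confusion matrix can be quantified as the share of errors that are off by at most one class. The sketch below is illustrative only: it assumes the classes are ordered by colony count (an outlier class would need separate handling), and the matrix is a made-up toy example, not the one in Fig. 5.

```python
def near_diagonal_error_share(confusion, max_offset=1):
    """Fraction of misclassified samples predicted within `max_offset` classes
    of the true class. Rows are true classes, columns are predicted classes."""
    errors = 0
    near = 0
    for i, row in enumerate(confusion):
        for j, n in enumerate(row):
            if i == j:
                continue  # correct predictions are not errors
            errors += n
            if abs(i - j) <= max_offset:
                near += n
    return near / errors if errors else 0.0


# Hypothetical 4-class confusion matrix (true count in rows, predicted in columns).
toy_confusion = [
    [50, 2, 0, 0],
    [3, 20, 4, 1],
    [0, 5, 10, 3],
    [0, 0, 2, 8],
]
print(near_diagonal_error_share(toy_confusion))
```

A share close to 1 means that most errors change a segment's count by a single colony, which is why the overall load estimate is only slightly affected.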
REFERENCES
[1] P. P. Bourbeau and N. A. Ledeboer, “Automation in clinical microbiology,” Jour. of Clinical Microbiology, vol. 51, no. 6, pp. 1658–1665, 2013.
[2] S.M. Novak and E.M. Marlowe, “Automation in the clinical microbiology laboratory,” Clinics in Laboratory Medicine, vol. 33, no. 3, pp. 567–588, 2013.
[3] D.P. Mukherjee, “Bacterial colony counting using distance transform,” Internat. Jour. of Biomedical Computing, vol. 38, pp. 131–140, 1995.
[4] G. Corkidi, R. Diaz-Uribe, J.L. Folch-Mallol, and J. Nieto-Sotelo, “Covasiam: an image analysis method that allows detection of confluent microbial colonies and colonies of various sizes for automated counting,” Applied and Environmental Microbiology, vol. 64, no. 4, pp. 1400–1404, 1998.
[5] S.D. Brugger, C. Baumberger, M. Jost, W. Jenni, U. Brugger, and K. Mühlemann, “Automated counting of bacterial colony forming units on agar plates,” PLoS ONE, vol. 7, no. 3, e33695, 2012.
[6] C. Zhang, W.-B. Chen, W.-L. Liu, and C.-B. Chen, “An automated bacterial colony counting system,” in IEEE Internat. Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC 2008), 11-13 June 2008, Taichung, Taiwan, 2008, pp. 233–240.
[7] Anan Liu, Zheren Liu, Limin Song, and Dong Han, “Adaptive ideal image reconstruction for bacteria colony detection,” in Information Technology and Agricultural Engineering, E. Zhu and S. Sambath, Eds., vol. 134 of Advances in Intelligent and Soft Computing, pp. 353–360. Springer Berlin Heidelberg, 2012.
[8] A. Brunetti, G. L. Masala, and U. Bottigli, “Automatic cell colony counting by region-growing approach,” Il Nuovo Cimento, 2008.
[9] A. Ferrari and A. Signoroni, “Multistage classification for bacterial colonies recognition on solid agar images,” in Imaging Systems and Techniques (IST), 2014 IEEE Internat. Conference on, Oct 2014, pp. 101–106.
[10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov 1998.
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems 25, F. Pereira, C.J.C. Burges, L. Bottou, and K.Q. Weinberger, Eds., pp. 1097–1105. Curran Associates, Inc., 2012.
[12] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” CoRR, vol. abs/1409.4842, 2014.
[13] R.B. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” CoRR, vol. abs/1311.2524, 2013.
[14] F. Ning, D. Delhomme, Y. LeCun, F. Piano, L. Bottou, and P.E. Barbano, “Toward automatic phenotyping of developing embryos from videos,” Image Processing, IEEE Transactions on, vol. 14, no. 9, pp. 1360–1371, Sept 2005.
[15] D. Ciresan, A. Giusti, L.M. Gambardella, and J. Schmidhuber, “Deep neural networks segment neuronal membranes in electron microscopy images,” in Advances in Neural Information Processing Systems 25, F. Pereira, C.J.C. Burges, L. Bottou, and K.Q. Weinberger, Eds., pp. 2843–2851. Curran Associates, Inc., 2012.
[16] D.C. Cireşan, A. Giusti, L.M. Gambardella, and J. Schmidhuber, “Mitosis detection in breast cancer histology images with deep neural networks,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI 2013), Kensaku Mori, Ichiro Sakuma, Yoshinobu Sato, Christian Barillot, and Nassir Navab, Eds., vol. 8150 of Lecture Notes in Computer Science, pp. 411–418. Springer Berlin Heidelberg, 2013.
[17] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” arXiv preprint arXiv:1408.5093, 2014.
[18] V. Nair and G.E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proceedings of the 27th Internat. Conference on Machine Learning (ICML-10), June 21-24, 2010, Haifa, Israel, 2010, pp. 807–814.
[19] G.E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Improving neural networks by preventing co-adaptation of feature detectors,” CoRR, vol. abs/1207.0580, 2012.
[20] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Internat. Conference on Artificial Intelligence and Statistics, 2010, pp. 249–256.

