Segmentation of ultrasound images of fetal anatomic structures using random forest for low-cost settings.

Evelyn Arthur Anto 1,2, Benjamin Amoah 1,2, Alessandro Crimi 1,2

1 Swiss Federal Institute of Technology (ETH-Zurich)
2 African Institute for Mathematical Sciences, Ghana

This work was supported by ETH-Zurich, Switzerland and the African Institute for Mathematical Sciences, Ghana.

978-1-4244-9270-1/15/$31.00 ©2015 IEEE

Abstract— In ultrasound imaging, manual extraction of the contours of fetal anatomic structures from echographic images is very challenging because of speckle and low contrast. The extracted contours vary between human observers, so they are neither reproducible nor reliable. This calls for a method that can accurately segment fetal anatomic structures and thereby support biometric measurements such as head circumference and femur length. Most recent methods integrate global shape and appearance. Their drawback is that they cannot handle localized appearance variations: they rely on the assumption of a Gaussian gray-value distribution and require initialization near the optimal solution. In this manuscript, random forest is used to segment the head contour in fetal ultrasound scans acquired in low-cost settings, such as acquisitions performed in rural areas of low-income countries using low-cost portable machines.

I. INTRODUCTION

The manual extraction of contours of fetal anatomic structures from ultrasound (US) images has always been a difficult and time-consuming task, since most of these images are of poor quality due to speckle noise. The results obtained from such measurements are subject to the variability of the human observer and are hence unreliable and not reproducible. Measurements on echographic images give important insight in obstetrics, being the best way to estimate the age of a foetus. Several parameters serve as development indicators for the foetus, including the biparietal diameter (BPD), occipitofrontal diameter (OFD), head circumference (HC) and femur length (FL) [3]. These measurements are used to predict the gestational age (GA) of the foetus. The challenge is to measure these contours accurately for better predictions of the GA. Automatic segmentation of these biometric features can allow better reliability and reproducibility in measurement. Carneiro et al. [4] used a discriminative constrained probabilistic boosting tree classifier to segment structures and then predict GA from them. Jardim and Figueiredo [3] developed an iterative approach to segment fetal structures using maximum-likelihood estimation. For a review of other methods, the reader is referred to [7].

In this manuscript, segmentation of the head bone is discussed as an example of automatic feature identification. It is conducted on images acquired using a portable US machine in low-cost settings in rural areas of Ghana. In sub-Saharan Africa, the risk of maternal mortality was 1 in 38, in sharp contrast to 1 in 3700 among women in high-income countries. Maternal mortality in Ghana in 2013 was estimated at 380 per 100,000 live births, far above the 185 targeted in Millennium Development Goal (MDG) Five [10]. Despite the progress made so far, Ghana is not likely to achieve MDG Five, especially in isolated rural communities. Most maternal deaths are due to obstetric complications that can be prevented, or detected and managed, if pregnant women get early access to available intervention programs. With the goal of improving prenatal care management in rural areas of Ghana, a pilot project involving community health workers [9] and a low-cost portable US machine has been carried out, as depicted in Fig. 1.

In the following sections, random forest, a supervised classification method, is proposed to automatically segment the head contour in US scans acquired in low-cost settings, mainly using low-cost portable US machines to acquire scans directly in rural communities. The aim is to show that robust automatic feature identification is achievable even in low-cost settings and is therefore accessible in low-income countries.

II.
METHODS

Acquiring US images is non-invasive, cheap, and does not require ionizing radiation, unlike other medical imaging techniques. The artifacts and speckle noise inherent in US images make the automatic segmentation of anatomical structures in US imagery a real challenge [2]. US images follow a Rayleigh distribution, which is discussed below.

Fig. 1: Example of US acquisition in low-cost settings carried out within this project. Here the acquisition was performed by a trained physician using a low-cost portable US machine in a lecture room rather than in a clinic.

A. Rayleigh Distribution

Consider an image v as a real positive function defined on a rectangular domain Ω ⊂ R+, and let C be a closed contour. The gray levels are assumed to be uncorrelated and independently distributed for each scatterer. This results in a random walk in the R+ plane. Applying the central limit theorem to the random walk yields a zero-mean Gaussian probability density function in the complex plane, given by

$$ f(z \mid \psi) = \frac{1}{2\pi\psi}\, e^{-\frac{|z|^2}{2\psi}}, \quad 0 < \psi < \infty, \tag{1} $$

where z is complex and ψ is the variance of the distribution. For display as a real image, envelope detection is performed on the magnitude (in-phase) image. After this transformation, the distribution in the magnitude image becomes a Rayleigh distribution, given by

$$ f(x \mid \psi) = \frac{x}{\psi}\, e^{-\frac{x^2}{2\psi}}, \quad 0 < \psi < \infty, \tag{2} $$
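As an illustrative check of the step from (1) to (2), the following minimal Python sketch draws complex samples whose real and imaginary parts are independent zero-mean Gaussians with variance ψ, and compares the mean of their magnitudes with the Rayleigh mean √(πψ/2). The function names, the value ψ = 4 and the sample size are assumptions chosen for illustration and do not come from the experiments.

```python
import math
import random


def sample_envelope(psi, n, seed=0):
    """Return n magnitudes |z| with Re(z), Im(z) ~ N(0, psi), independent."""
    rng = random.Random(seed)
    sigma = math.sqrt(psi)  # psi is the variance of each component
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]


def rayleigh_pdf(x, psi):
    """Density of Eq. (2): (x / psi) * exp(-x^2 / (2 * psi)) for x >= 0."""
    if x < 0:
        return 0.0
    return (x / psi) * math.exp(-x * x / (2.0 * psi))


psi = 4.0  # assumed value, for illustration only
samples = sample_envelope(psi, 100_000)
empirical_mean = sum(samples) / len(samples)
theoretical_mean = math.sqrt(math.pi * psi / 2.0)  # mean of a Rayleigh variable

# Riemann-sum check that the density in Eq. (2) integrates to about 1.
step = 0.001
area = sum(rayleigh_pdf(k * step, psi) * step for k in range(40_000))

print(f"empirical mean   = {empirical_mean:.3f}")
print(f"theoretical mean = {theoretical_mean:.3f}")
print(f"integral of pdf  = {area:.4f}")
```

The two means should agree to within sampling error, which is what one would expect if the envelope of a zero-mean complex Gaussian signal is Rayleigh distributed with the same parameter ψ.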

In (2), x is real [6].

B. Random Forest Model

The random forest model was introduced by Breiman [8] and has been applied in several fields [1]. A forest is an ensemble of decision trees. Given an input object, a decision tree performs consecutive queries, arranged hierarchically, about the known features of the object under examination in order to estimate an unknown property of that object. Each query depends on the response to the previous one, and the further the process is carried, the greater the confidence in the response [1]. Each random decision is defined by a random attribute and its related random threshold [5].

Data points are denoted by a vector v = (x1, x2, ..., xn) ∈ F, where the components xi are attributes of the data point, called features, and F is the feature space. In our experiments on US images, an image v is a set of pixel values xi; the data points in our model are the intensity values of the gray-scale image. The forest is first trained on an image with known labels and then tested on a previously unseen image with data points v. The training data has binary labels c, since there are only two classes: bone and background. Tree testing predicts the class of each data point by applying a number of predefined tests along the hierarchical decision tree. The leaf nodes contain a predictor/estimator which associates an output with the input v [1]. The posterior probability that a pixel with feature vector v = (x1, x2, ..., xn) belongs to class c, in a forest of T trees, is the average of the leaf posteriors of the individual trees t ∈ {1, 2, ..., T}:

$$ \Pr(c \mid v) = \frac{1}{T} \sum_{t=1}^{T} \Pr\nolimits_t(c \mid v). \tag{3} $$
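Equation (3) amounts to averaging the class posteriors stored in the leaves that v reaches in each tree. The minimal Python sketch below makes this concrete; the three hand-built trees, their thresholds and their leaf probabilities are hypothetical values chosen for illustration, not trees learned from the US data.

```python
def tree_posterior(node, v):
    """Descend one tree and return the class posterior stored in the leaf.

    Internal nodes are tuples (feature_index, threshold, left, right);
    leaves are dicts mapping a class label to its probability.
    """
    while isinstance(node, tuple):
        feature_index, threshold, left, right = node
        node = left if v[feature_index] <= threshold else right
    return node


def forest_posterior(trees, v):
    """Eq. (3): average the per-tree posteriors Pr_t(c|v) over the T trees."""
    posterior = {}
    for tree in trees:
        for label, prob in tree_posterior(tree, v).items():
            posterior[label] = posterior.get(label, 0.0) + prob / len(trees)
    return posterior


# Hypothetical trees that split on a single gray value (feature index 0).
trees = [
    (0, 120, {"bone": 0.1, "background": 0.9}, {"bone": 0.8, "background": 0.2}),
    (0, 150, {"bone": 0.2, "background": 0.8}, {"bone": 0.9, "background": 0.1}),
    (0, 100, {"bone": 0.0, "background": 1.0},
     (0, 200, {"bone": 0.5, "background": 0.5}, {"bone": 1.0, "background": 0.0})),
]

pixel = (180,)  # feature vector v with one component, the gray value
posterior = forest_posterior(trees, pixel)
predicted_class = max(posterior, key=posterior.get)
print(posterior, predicted_class)
```

For the gray value 180 the three trees return bone probabilities 0.8, 0.9 and 0.5, so the forest posterior for bone is their mean, about 0.73, and the pixel is assigned to the bone class.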

The algorithm has four important parameters that need to be set: the maximum depth of the trees, the number of trees/estimators in the forest, the minimum sample split, and the random state. With a very shallow tree depth much information is lost, whereas an excessive tree depth leads to over-fitting of the model. The number of trees/estimators, which is the forest size, should be high to allow for enough randomness. The minimum sample split is the minimum number of samples required to split an internal node; this parameter is tree specific, so the more data points there are, the higher its value should be. The random state is the seed of the random number generator.

During the experiments, data points are given by pixel values; in particular, an image and its manual annotation of the head are used for training in a leave-one-out manner. Of the entire image, only about 100,000 pixels are used: a subsection of the image that excludes the labels and the pixels outside the ultrasound cone. The maximum depth is set to 100, the number of trees/estimators to 100, and the minimum sample split to 50. These parameters have not been optimized exhaustively. Training on a single scan takes 26 seconds, while testing a new image requires only 0.5 seconds. The computational time is greatly influenced by the forest size: the more trees, the more time is spent. The maximum depth has only a small influence on the time, which is barely noticeable for tree depths greater than 10. The minimum sample split and the random state do not influence the running time. The experiments were performed on a laptop with a 2.4 GHz processor and 4 GB of system RAM, specifications similar to those of the portable ultrasound machine used.

C. Data Description

The US images were acquired from 50 pregnant women using a commercially available diagnostic curvilinear transducer array at 4.5 MHz and a B-mode DP-20 Mindray US machine. All women gave written consent, and the study was carried out according to the Declaration of Helsinki. The acquired images were gray-scale images with a resolution of 800 × 600 pixels, acquired using different image gains for a better view. Fig. 2 illustrates an original US image of the HC of a foetus. Of the original image, only the US cone was considered. For our experiments the unnormalized gray values of pixels were used; nevertheless, the framework can use other types of features, such as filter responses, patches of the US B-mode image, or short segments of radio-frequency data. Acquisition was not carried out in hospitals or clinics but in private homes within rural communities in Ghana, at different stages of pregnancy. For training and testing, the labels and menu were removed from each image. An expert technician performed the manual annotation.
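To show how the four parameters of Section II-B enter training on gray-value features, the following pure-Python sketch grows a small bagged forest of randomized trees, in which each node draws a random attribute and a random threshold. This is a deliberate simplification and not the implementation used in the experiments, which the text does not name; Breiman's forests [8] additionally choose splits by an impurity criterion. The synthetic pixels, and the rule that gray values of 180 or more are bone, are assumptions made only so that the example runs without data.

```python
import random


def grow_tree(samples, depth, max_depth, min_samples_split, rng):
    """Grow one randomized tree from (feature_vector, label) pairs, label in {0, 1}.

    Returns a leaf (float, fraction of label 1) or a node
    (feature_index, threshold, left_subtree, right_subtree).
    """
    fraction = sum(label for _, label in samples) / len(samples)
    if (depth >= max_depth or len(samples) < min_samples_split
            or fraction in (0.0, 1.0)):
        return fraction
    feature = rng.randrange(len(samples[0][0]))   # random attribute
    values = [v[feature] for v, _ in samples]
    low, high = min(values), max(values)
    if low == high:
        return fraction
    threshold = rng.uniform(low, high)            # random threshold
    left = [s for s in samples if s[0][feature] <= threshold]
    right = [s for s in samples if s[0][feature] > threshold]
    if not left or not right:
        return fraction
    return (feature, threshold,
            grow_tree(left, depth + 1, max_depth, min_samples_split, rng),
            grow_tree(right, depth + 1, max_depth, min_samples_split, rng))


def train_forest(samples, n_trees, max_depth, min_samples_split, random_state):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(random_state)             # random state = RNG seed
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        forest.append(grow_tree(bootstrap, 0, max_depth, min_samples_split, rng))
    return forest


def predict_proba(forest, v):
    """Average the leaf fractions of all trees: probability of label 1 (bone)."""
    total = 0.0
    for node in forest:
        while isinstance(node, tuple):
            feature, threshold, left, right = node
            node = left if v[feature] <= threshold else right
        total += node
    return total / len(forest)


# Synthetic stand-in for annotated pixels (assumption: bright pixels are bone).
data_rng = random.Random(0)
pixels = []
for _ in range(1000):
    gray = data_rng.randrange(256)
    pixels.append(((gray,), 1 if gray >= 180 else 0))

# Parameter values quoted in the text: depth 100, 100 trees, split 50.
forest = train_forest(pixels, n_trees=100, max_depth=100,
                      min_samples_split=50, random_state=0)
bright = predict_proba(forest, (240,))
dark = predict_proba(forest, (30,))
print(bright, dark)
```

In this toy setting a bright pixel receives a bone probability close to 1 and a dark pixel a probability close to 0; thresholding the averaged probability at 0.5 would give the binary bone/background mask.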

Fig. 2: Example screenshot acquired during a gynecological session with the US machine used.
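The DICE coefficient used in Section III to score a segmentation against the manual annotation can be sketched in a few lines of Python. The sketch assumes that A and B are the sets of foreground (bone) pixel positions in two binary masks of equal shape; the function names, the tiny example masks and the convention for two empty masks are illustrative assumptions.

```python
def dice_coefficient(segmentation, annotation):
    """Eq. (4): D = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    a = {(r, c) for r, row in enumerate(segmentation)
         for c, value in enumerate(row) if value}
    b = {(r, c) for r, row in enumerate(annotation)
         for c, value in enumerate(row) if value}
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def mean_dice(segmentations, annotations):
    """Average DICE over paired segmentations and annotations."""
    scores = [dice_coefficient(s, m) for s, m in zip(segmentations, annotations)]
    return sum(scores) / len(scores)


# Tiny hypothetical masks: 1 marks a bone pixel, 0 marks background.
segmentation = [[0, 1, 1],
                [0, 1, 0]]
annotation = [[0, 1, 0],
              [0, 1, 1]]

score = dice_coefficient(segmentation, annotation)
print(round(score, 3))  # → 0.667
```

Here each mask has three foreground pixels and they share two, so D = 2·2 / (3 + 3) ≈ 0.667; identical masks give D = 1 and disjoint masks give D = 0.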

III. RESULTS

The experiments were carried out in a leave-one-out fashion: the scan left out was used to train the random forest model, which was then tested on the remaining scans. The quality of the segmentation is assessed using the DICE coefficient

$$ D = \frac{2\,|A \cap B|}{|A| + |B|}, \tag{4} $$

where A and B are, respectively, the ordered pixel values of the resulting segmented image and of the manual annotation. Using the described method, Fig. 3 and Fig. 4 depict, respectively, an acquired scan and the image resulting from its segmentation. With the leave-one-out training, an overall mean DICE of 0.75 was obtained. Fig. 5 depicts the average DICE values for single US scans under the leave-one-out training.

Fig. 3: Example of a testing image of a fetal head.

Fig. 4: Example of the resulting segmentation on US scans.

Fig. 5: Average DICE values for single US scans according to the leave-one-out training (vertical axis: average DICE coefficient, 0 to 1; horizontal axis: image samples, 0 to 50). It can be seen that results are consistent across the dataset.

IV. DISCUSSION

The segmentation of the head contour using random forest produces valid results even on images with maximum brightness, generalizes easily without tuning parameters or setting a threshold for binarization, and yields very few false positives, which can be detected using edge detection algorithms. The proposed method does not require Haar filtering as recently proposed for US images [14]. The approach has shown good results using only one image as a training dataset. However, the training can depend on the machine used to acquire the training image; this step can easily be repeated for each different machine. The method can easily be extended with an ellipse-fitting regression, which can then be used to estimate the GA, weight or delivery date [11]. Future work includes extending the features to the responses of Gaussian or derivative filters, considering patches of US scans [13], or extending the framework with a Maximum Likelihood estimation of the [12]. Integration with regression models [11] to predict GA and delivery date is ongoing work.

V. CONCLUSION

The random forest method can easily be used to segment biometric features in fetal US scans. The model has in fact been tested on US scans acquired with a low-cost portable device in rural areas, where this solution can lead to a more efficient fetal analysis suitable for low-income countries.

ACKNOWLEDGMENT

The authors are thankful to Dr. Kojo Pieterson for the clinical support. This research is part of the cooperation project Docmeup, carried out in rural communities in Ghana by ETHGlobal and the African Institute for Mathematical Sciences in Ghana [9]. More resources related to the cooperation project can be found on the website www.docmeup.org.

REFERENCES

[1] A. Criminisi, J. Shotton, and E. Konukoglu, Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning, Foundations and Trends in Computer Graphics and Vision, vol. 7, 2011, pp. 81-227.
[2] S. Kalpana, M. L. Dewal, and M. Rohit, Ultrasound imaging and image segmentation in the area of ultrasound: a review, International Journal of Advanced Science and Technology, vol. 24, Nov. 2010.
[3] S. M. Jardim and M. A. Figueiredo, Segmentation of fetal ultrasound images, Ultrasound in Medicine & Biology, vol. 31, no. 2, 2005, pp. 243-250.
[4] G. Carneiro, B. Georgescu, S. Good, and D. Comaniciu, Detection and measurement of fetal anatomies from ultrasound images using a constrained probabilistic boosting tree, IEEE Trans. Med. Imaging, vol. 27, no. 9, pp. 1342-1355, Sept. 2008.
[5] A. Criminisi and J. Shotton, Decision forests for computer vision and medical image analysis, Springer Science and Business Media, 2013.
[6] G. Slabaugh, G. Unal, M. Wels, T. Fang, and B. Rao, Statistical region-based segmentation of ultrasound images, Ultrasound in Medicine & Biology, vol. 35, no. 5, 2009, pp. 781-795.
[7] S. Rueda, S. Fathima, C. L. Knight, M. Yaqub, A. T. Papageorghiou, B. Rahmatullah, A. Foi, M. Maggioni, A. Pepe, J. Tohka, and J. A. Noble, Evaluation and comparison of current fetal ultrasound image segmentation methods for biometric measurements: a grand challenge, IEEE Transactions on Medical Imaging, vol. 33, pp. 797-813.
[8] L. Breiman, Random forests, Machine Learning, vol. 45, no. 1, 2001, pp. 5-32.
[9] B. Amoah, E. A. Anto, and A. Crimi, Phone-based prenatal care for communities and remote ultrasound imaging, in MobMed Prague, 2014.
[10] WHO, UNICEF, UNFPA, The World Bank, and The United Nations Population Division, Trends in Maternal Mortality: 1990 to 2013, UNICEF, 2014.
[11] S. Campbell and D. Wilkin, Ultrasonic measurement of fetal abdomen circumference in the estimation of fetal weight, Br. J. Obstet. Gynaecol., vol. 82, 1975, pp. 689-97.
[12] A. Sarti, C. Corsi, E. Mazzini, and C. Lamberti, Maximum likelihood segmentation of ultrasound images with Rayleigh distribution, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 52, no. 6, 2005, pp. 947-960.
[13] F. Schroff, A. Criminisi, and A. Zisserman, Object class segmentation using random forests, Proceedings of the British Machine Vision Conference, pp. 54.1-54.10, BMVA Press, Sept. 2008.
[14] A. Namburete and J. A. Noble, Fetal cranial segmentation in 2D ultrasound images using shape properties of pixel clusters, IEEE 10th International Symposium on Biomedical Imaging (ISBI), 2013.
