Computerized Medical Imaging and Graphics 40 (2015) 194–204


A manifold learning method to detect respiratory signal from liver ultrasound images

Jiaze Wu a,∗, Apoorva Gogna b, Bien Soo Tan b, London Lucien Ooi c, Qi Tian a, Feng Liu a, Jimin Liu a

a Singapore Bioimaging Consortium, Agency for Science, Technology and Research, #08-01, Matrix, 30 Biopolis Street, 138671 Singapore
b Department of Diagnostic Radiology, Singapore General Hospital, Singapore
c Department of Surgery, Singapore General Hospital, Singapore

∗ Corresponding author. Tel.: +65 6478 8411; fax: +65 6478 9049.

http://dx.doi.org/10.1016/j.compmedimag.2014.11.013
0895-6111/© 2014 Elsevier Ltd. All rights reserved.

Article info

Article history: Received 30 March 2014; Received in revised form 11 October 2014; Accepted 20 November 2014

Keywords: Liver ultrasound images; Respiratory gating; Respiratory signal; Manifold learning; Local tangent space alignment

Abstract

Respiratory gating has been widely applied for respiratory correction or compensation in image acquisition and image-guided interventions. A novel image-based method is proposed to extract the respiratory signal directly from 2D ultrasound liver images. The proposed method utilizes a typical manifold learning technique, local tangent space alignment, to detect the principal respiratory motion from a sequence of ultrasound images. This technique assumes that all the images lie on a low-dimensional manifold embedded in the high-dimensional image space, constructs an approximate tangent space at each point to represent its local geometry on the manifold, and then aligns the local tangent spaces to form a global coordinate system, in which the respiratory signal is extracted. The experimental results show that the proposed method can detect a relatively accurate respiratory signal with a high correlation coefficient (0.9775) with respect to the ground-truth signal obtained by tracking external markers, and achieves satisfactory computing performance (2.3 s for an image sequence of 256 frames). The proposed method is also used to create breathing-corrected 3D ultrasound images to demonstrate its potential application value. © 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Respiratory motion is a quasi-cyclic physiologic process that leads to motion and deformation of abdominal organs, e.g. the liver [1]. This physiologic process seriously affects the efficacy and efficiency of interventional and radio-therapeutic procedures performed for diagnosing and treating these diseased organs. A variety of respiratory motion modeling methods [2] have been proposed to overcome this problem, and they inevitably involve detection of the respiratory signal. Here, the respiratory signal can be considered a generalized pattern of human respiration and features the principal component of the 3D respiratory motion. The respiratory signal is valuable in multiple liver-related clinical applications. For instance, it may be used individually for image acquisition to capture the respiration-induced moving liver


[3], for respiratory gating during liver ablation interventions [4,5] and radiation therapies [6], and for improving the accuracy of quantitative evaluation of hepatic perfusion [7]. In addition, the respiratory signal can also be combined with a 4D respiratory motion model for motion correction or prediction during radiation therapy [8,9] and high intensity focused ultrasound (HIFU) [10,11]. In this case, the signal is first used in the pre-operative stage to establish a correspondence model with the 3D motion of the whole liver or of specific liver structures (e.g. vessels or lesions), and is then applied as input in the intra-operative stage to parameterize the motion model for motion estimation or prediction. One traditional and widely used way to obtain the respiratory signal is to place one or multiple external markers on the thoracic or abdominal skin to detect the dominant anterior-posterior translation of the skin [12–14], which is viewed as a measurement of the breathing phase. These markers are usually optical or electromagnetic (EM) sensors, whose position variations can be monitored by optical cameras or EM tracking devices. Recently, marker-less external tracking methods have been proposed to track the whole chest or abdomen surface using optical imaging devices to provide high-dimensional respiratory information [15,16]. However, these extra positioning or imaging devices are


not routinely available in clinical procedures, require long device setup times and extra surgical working space, and consequently lead to high financial costs. In order to relieve these problems of external marker-based methods, researchers and physicians have recently proposed another class of more flexible and lower-cost methods, i.e. purely image-based respiratory signal tracking [6,17–22], which extracts the respiratory signal directly from time-varying ultrasound (US) liver images during pre-operative image acquisition or intra-operative image-guided procedures. These methods detect the respiration-related information by applying specific image processing techniques to an image region or to the whole image. The image-region-based methods [6,17–19] first require a special image patch containing salient anatomical structures, such as the diaphragm, liver vessels or liver boundaries, to be specified on a reference image. Xu and Hamilton [6] calculated the statistical dependency of successive image patches located at the same position in all the processed images. These statistical values (mutual information or correlation coefficient) are representative of the principal respiratory motion. Hwang et al. [17] pointed out that this method depends heavily on a reference image with salient features, and that the statistical values cannot represent the respiratory signal well, especially at the key respiratory phases, the end of inhalation (EI) and the end of exhalation (EE). They therefore proposed to directly identify the motion of the salient anatomical features in image patches (called feature windows) and to calculate the displacements of these features with respect to those in the first image patch. These displacements reflect the respiratory motion of the anatomical features in the images and, hence, can better delineate the respiratory signal. However, the feature identification is based on a thresholding technique, which is susceptible to the speckle noise and intensity variations that usually appear in US images. Therefore, Wu et al. [18,19] proposed a template matching (TM) based method to estimate the respiratory motion of the anatomical structures in the image patches (also called template blocks). This method is robust to speckle noise and intensity variations and can recover the respiratory motion well. However, the methods above rely heavily on predefined distinctive anatomical features with strong contrast and large motion. When such features are absent or not distinctive in the images, or when they have small motion, these methods fail to extract a satisfactory respiratory signal. Sundar et al. [20] proposed a novel phase correlation method to detect the respiratory signal by considering whole images rather than image patches. This method uses the accumulated phase shift in the spectral domain of successive image frames as a measure of the respiratory motion. Wachinger et al. [21,22] pointed out and experimentally demonstrated that this method cannot recover the respiratory information from US images well, and further proposed a manifold learning (ML) based respiratory signal tracking method for 4D US imaging. The ML-based tracking method makes use of a classic nonlinear dimensionality reduction technique, Laplacian Eigenmaps (LEM) [23], to infer intrinsic low-dimensional structures (e.g. respiratory motion of at most three dimensions) from very high-dimensional data (e.g. images of hundreds of thousands of dimensions).
In this paper, we present a novel manifold learning based method to extract the respiratory signal from a sequence of 2D US B-mode images. The proposed method is based on a manifold learning technique, local tangent space alignment (LTSA) [24], which is a typical dimensionality reduction technique. This method assumes that a low-dimensional manifold is embedded in the high-dimensional (image) space and that each frame of the image sequence lies on this manifold. Under this assumption, images over the breathing cycle roughly form a continuous back-and-forth trajectory on the manifold in image space, with points at similar positions on the manifold corresponding to the same or similar states of the breathing cycle. LTSA uses a linear approximation within each neighborhood to construct a local coordinate system for the neighborhood, and then aligns these overlapping local coordinate systems to obtain a global coordinate system. Through this nonlinear mapping, LTSA assigns to each image a low-dimensional coordinate (namely the respiratory phase) by exploiting the neighborhood relationships, and is therefore well suited to extracting the respiratory information from the image sequence. In contrast to previous image-region-based methods, our ML-based method directly takes the entire image sequence as input to extract the respiratory information and has no special requirement on salient anatomical features in the images. In addition, our method could be more robust to artifacts and noise in US images and produces a much smoother respiratory signal than previous image-region-based methods, which is important for respiratory gating during image acquisition. Compared to the LEM-based tracking method, the main advantage of the proposed LTSA-based method is that it extracts local geometric information by constructing tangent planes of the local neighborhood of each point on the manifold, whereas LEM directly uses the distance relationship of each point to the other points on the manifold. Therefore, the LTSA used in this paper is able to construct a better mapping function to convert the high-dimensional images into the corresponding low-dimensional respiratory states. The experiments in this paper show that the proposed LTSA-based method extracts a more accurate and robust respiratory signal than other image-based methods; as a typical application example, it is also used to reconstruct respiration-corrected 3D US images from multiple 2D image sequences.

2. Methods

2.1. Manifold learning

Manifold learning belongs to a class of nonlinear dimensionality reduction techniques which aim to discover the underlying structure of complex high-dimensional data and reduce it to a simpler representation of much lower dimensionality, while still retaining the local spatial relationships of the original data. Local tangent space alignment [24] is such a manifold learning algorithm; it can efficiently learn a nonlinear embedding of high-dimensional data into a low-dimensional space, and can also reconstruct high-dimensional coordinates from the embedding coordinates. It has a number of attractive features: simple geometric intuition, straightforward implementation, and global optimization. For a set of D-dimensional input feature vectors X = {x1, x2, ..., xn}, xi ∈ R^D, i = 1, 2, ..., n, drawn from an intrinsically low-dimensional manifold lying in the D-dimensional space, LTSA attempts to find a nonlinear mapping that transforms these vectors into a d-dimensional space (d < D), producing a set of corresponding d-dimensional vectors Y = {y1, y2, ..., yn}, yi ∈ R^d, i = 1, 2, ..., n. The mapping is constructed so as to preserve, as much as possible, the local geometric information around each vector. For LTSA, the local geometric information of each input vector is described by the local coordinates of this vector with respect to its tangent space on the manifold. The algorithm begins by computing the k nearest neighbors (d < k < D) of every vector on the manifold. It then computes the first d principal components of each neighborhood of input vectors to obtain a d-dimensional subspace that approximates the local tangent space. It finally computes the local tangent coordinates of the data samples and finds an embedding that aligns them into a global coordinate system. The detailed process can be divided into four basic steps as follows:


Step 1: Identify nearest neighbors. For each input vector xi ∈ X, 1 ≤ i ≤ n, find its k nearest neighbors, i.e. the set of feature vectors Xi satisfying

$$\arg\min_{X_i} \sum_{x_j \in X_i} \| x_i - x_j \|^2, \quad 1 \le j \le n,\; j \ne i,$$

where the identified k nearest neighbors are denoted as Xi = {xi1, xi2, ..., xik} and, naturally, the indices of these neighbors in the input vector set X are denoted as Ni = {i1, i2, ..., ik}, 1 ≤ i1, i2, ..., ik ≤ n.

Step 2: Extract local information. Compute the correlation matrix G of these k neighbors, whose elements are

$$G_{p,q} = \left( x_{i_p} - \bar{x}_i,\; x_{i_q} - \bar{x}_i \right), \quad 1 \le p \le k,\; 1 \le q \le k, \qquad \bar{x}_i = \frac{1}{k} \sum_{p=1}^{k} x_{i_p},$$

where (·, ·) denotes the dot product of two vectors. Use principal component analysis (PCA) to compute the first d largest eigenvectors g1, g2, ..., gd of G, which represent the local coordinates of the input vector xi with respect to its local tangent space and its local geometric characteristics.

Step 3: Construct the sparse alignment matrix. Compute the matrix

$$Q = \left[\, \mathbf{1}_k/\sqrt{k},\; g_1, \ldots, g_d \,\right] \left[\, \mathbf{1}_k/\sqrt{k},\; g_1, \ldots, g_d \,\right]^T,$$

where 1k ∈ R^k is a column vector of all ones and Q is a k × k matrix. This matrix is accumulated into the sparse alignment matrix B using the update

$$B_{i_p, i_q} \leftarrow B_{i_p, i_q} + (E_k - Q)_{p,q}, \quad 1 \le p, q \le k,$$

where Ek denotes the k × k identity matrix, B is initialized with zeros, and the indices ip, iq are those of the neighbor set Ni defined in Step 1.

Step 4: Align global coordinates through eigendecomposition. To obtain the d features (coordinates) of the embedded vectors, solve the standard eigenproblem

$$B v = \lambda v$$

for the d + 1 smallest eigenvalues λ1, λ2, ..., λd, λd+1 and their corresponding eigenvectors v1, v2, ..., vd, vd+1. Drop the smallest eigenvalue λ1 ≈ 0 and the corresponding eigenvector v1, and form the embedding matrix such that the i-th coordinate (i = 1, 2, ..., n) of the j-th remaining eigenvector (j = 1, 2, ..., d) becomes the j-th coordinate of the projected i-th vector, namely Y = {y1, y2, ..., yn}.

2.2. Signal extraction

In this paper, the goal of the LTSA-based algorithm is to extract the corresponding respiratory state of each image from a US B-mode image sequence S consisting of n images, S = {I1, I2, ..., In}, spanning multiple breathing cycles. For each image sequence S, LTSA first converts each image Ii into a corresponding high-dimensional vector xi by row-major pixel serialization, forming a set of data vectors X = {x1, x2, ..., xn}. According to manifold learning theory and the approximate reproducibility of these US images due to breathing, the converted data vectors can reasonably be assumed to lie on or near a naturally low-dimensional manifold embedded in a high-dimensional space. Since the core idea of manifold learning is to maximally preserve local neighborhood information when mapping a manifold from the high-dimensional space to the low-dimensional space, similar images within a breathing cycle or across different cycles should have the same or adjacent coordinates in the mapped low-dimensional space. Finally, by running the LTSA-based manifold learning, a series of low-dimensional coordinates (1D for the respiratory signal in this paper) is extracted from the sequence of images. For US liver images, the respiratory motion is dominant relative to other motion, deformation or noise. Therefore, the first several eigenvalues (especially the first one) and the corresponding eigenvectors are highly relevant to the respiratory motion. In addition, the respiratory signal is considered a one-dimensional basic breathing pattern, which can be characterized by the principal component of the 3D respiratory motion. Therefore, the respiratory signal in this paper is defined to be one-dimensional and is represented by the first eigenvector extracted from the embedding matrix, namely d = 1.
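To make the procedure above concrete, the following is a minimal NumPy sketch of the LTSA-based signal extraction pipeline (Steps 1–4 plus the serialization and normalization of Section 2.2). It is written for illustration only and is not the authors' implementation; the function and variable names (ltsa_embedding, respiratory_signal, frames) are assumptions, and a dense eigendecomposition is used since the sequences here contain only a few hundred frames.

```python
import numpy as np

def ltsa_embedding(X, k=64, d=1):
    """LTSA as in Steps 1-4. X: (n, D) array of serialized image frames."""
    n = X.shape[0]
    # Step 1: k nearest neighbours of each frame (squared Euclidean distances).
    sq = (X ** 2).sum(axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    nbrs = np.argsort(dist2, axis=1)[:, :k]      # neighbourhood (here including the frame itself)
    B = np.zeros((n, n))                         # alignment matrix (kept dense for simplicity)
    for i in range(n):
        idx = nbrs[i]
        Xi = X[idx] - X[idx].mean(axis=0)        # centre the neighbourhood
        # Step 2: g_1..g_d are the top-d eigenvectors of the neighbourhood correlation
        # matrix, i.e. the left singular vectors of the centred neighbourhood matrix.
        U, _, _ = np.linalg.svd(Xi, full_matrices=False)
        # Step 3: accumulate E_k - Q into the alignment matrix.
        G = np.hstack([np.ones((k, 1)) / np.sqrt(k), U[:, :d]])
        B[np.ix_(idx, idx)] += np.eye(k) - G @ G.T
    # Step 4: take the d+1 smallest eigenvectors of B and drop the first (near-constant) one.
    _, vecs = np.linalg.eigh(B)
    return vecs[:, 1:d + 1]

def respiratory_signal(frames, k=64):
    """frames: (n, H, W) array of (optionally downsampled) B-mode images."""
    X = frames.reshape(frames.shape[0], -1).astype(np.float64)   # row-major serialization
    y = ltsa_embedding(X, k=k, d=1)[:, 0]                        # d = 1: respiratory signal
    return (y - y.min()) / (y.max() - y.min())                   # normalized to (0, 1)
```

An off-the-shelf alternative would be scikit-learn's LocallyLinearEmbedding with method='ltsa', which implements the same algorithm.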

2.3. 3D ultrasound

As stated in the introduction, the respiratory signal is valuable in multiple clinical situations, e.g. image acquisition. Acquisition of US images plays a key role in image-guided interventions. When US images can clearly visualize tumors, they can directly guide the needle insertion in image-guided interventions. Even when tumors are not visible under US imaging, US images can still be utilized for multi-modal image fusion [25,26] to register intra-operative images (e.g. US) with pre-operative images, such as computerized tomography (CT) and magnetic resonance (MR) images. Since 2D US provides relatively little information, it is extremely challenging to register a 2D US image with other image modalities. 3D US probes are still very expensive and also have a limited field of view. Therefore, freehand [27] and mechanically-swept [28] 3D US are usually more cost-effective options in image-guided interventions, and have also been comprehensively studied in the literature [29]. However, their main issue is that they create 3D US images from 2D images acquired at different orientations and breathing phases without providing any respiratory correction. The proposed method in this paper can easily be integrated with freehand or mechanically-assisted 3D US to acquire respiration-corrected 3D US images. Herein, we specifically discuss how to apply it to acquire respiration-corrected 3D US images using a conventional 2D US probe. The basic idea is to utilize the proposed gating method to detect multiple 2D images with a specific breathing phase from multiple 2D image sequences with different scanning angles, and then combine these detected 2D images into a respiration-corrected 3D image, which captures the moving liver at that specific breathing stage. To do so, the first step is to use a motorized mechanical arm to hold and regularly rotate a 2D US probe to acquire multiple image sequences with different scanning angles, as illustrated in Fig. 1. The proposed signal gating method is then applied to label the breathing phase of each frame of each image sequence. For each sequence, one image with the specific breathing phase is then selected. Finally, all the images with the same breathing phase and different scanning angles (from different image sequences) are collected together to reconstruct a 3D US image.

3. Experiments and results

3.1. Data acquisition

The US imaging system for acquiring the image videos is the Terason (Burlington, USA) t3000 with a 5C2 curved transducer. The frequencies used vary from 2.0 MHz to 5.0 MHz and the focusing depth is in the range of 16–19 cm. The US image resolution is 640 × 480 pixels and the temporal resolution is nearly 10 FPS. The pixel size is about 0.37 × 0.37 mm and varies slightly under different depth settings.


Fig. 1. A 2D US probe is rotated to acquire multiple image sequences with different scanning angles.

In order to avoid hand tremor of the US probe, a robotic arm (Fig. 2) was designed to hold the probe, which helps to scan the moving liver stably. US images are acquired from volunteers lying on a bench bed in the supine position. The US probe is first held by hand to flexibly scan the liver through the inter-rib spaces of the chest to avoid obstruction by the ribs. After a proper scanning position and orientation are found, the US probe is mounted onto the holder of the robotic arm with its position and orientation similar to those just selected manually. The mounted US probe can be tilted to scan the liver because the probe holder is actuated by a geared motor (Faulhaber DC-Micromotor 0615, Schönaich, Germany). The relative scanning angle is tracked by an encoder embedded in the motor. This motor is connected to a computer via an Ethernet (network) cable. On the computer, a customized imaging tool was developed and deployed to acquire image sequences at different angles. This tool sends commands to drive the motor and tilt the US probe at a regular angular interval to sweep the liver. In order to validate the breathing signal tracked by our method, an NDI (Ontario, Canada) Aurora electromagnetic (EM) tracking system is used to track an EM sensor placed on the umbilicus of the volunteers while the US images are acquired. The dominant motion direction of the tracked umbilicus sensor is used as the reference respiratory signal. The translational motion of the umbilicus is selected as the ground-truth signal because the umbilicus on the abdominal surface is usually a good position for monitoring abdominal respiration and is often adopted in respiratory motion modeling to obtain standard surrogate breathing signals [2]. To reduce the relative latency between US images and EM positions, a customized software tool was developed and deployed on the laptop computer of the Terason US system to acquire both the US images and the EM positions. The EM tracking device is connected to the laptop via a 1000 Mbps Ethernet cable. In our customized tool, a corresponding EM position is recorded on demand immediately before each US image is acquired. For the following experiments, eight healthy volunteers (male, average age 36, range 25–46) were recruited. For the first four experiments, eight US image sequences, numbered {S1, S2, ..., S8} for convenience, were collected from these volunteers under normal, free breathing. Each sequence consists of 256 frames. For the fifth experiment, 48 image sequences were acquired from one of these volunteers, and each sequence consists of 128 frames. All acquisitions were performed with the volunteers lying in the supine position. All the acquired US images and the corresponding EM positioning data were transferred to our working computer for further off-line analysis.

Relevant analysis experiments were run on a Dell workstation with an Intel Xeon E5620 CPU (2.4 GHz) and 12 GB RAM, using a single-threaded programming mode.

3.2. Experiments

It should be noted that the respiratory signals we are interested in are used to qualitatively characterize the general pattern (state or phase) of the respiratory motion, not to quantitatively measure the physical amount of motion. Therefore, for more convenient visual inspection, the respiratory signals extracted from the US images or tracked by the EM device are all linearly normalized to the interval (0, 1) using their minima and maxima. The correlation coefficient (CC) metric is applied to quantitatively evaluate the accuracy of the signals extracted by the different tracking methods against the reference signals tracked by the EM device.

3.2.1. Nearest neighbors

This experiment was performed to analyze how different numbers of nearest neighbors affect the signal accuracy and computing time, and to help determine the optimal neighborhood size as a trade-off between signal accuracy and computing efficiency. In this experiment, the images used as input vectors of the algorithm are kept at their original size to preserve all image details, in order to focus on the effect of the nearest neighbors. The neighbor numbers used in the experiments are 16, 32, 64 and 128.

3.2.2. Downsampling level

This experiment was executed to investigate how different levels of image downsampling influence the accuracy and efficiency of the signal extraction. In this experiment, the number of neighbors is fixed at 64 according to the results of the first experiment, and the downsampling levels for reducing the images horizontally and vertically are 1/2 × 1/2, 1/4 × 1/4, 1/8 × 1/8, 1/16 × 1/16 and 1/32 × 1/32, respectively. Since the US images in this paper have a resolution of 640 × 480 pixels, the reduced image resolutions are 320 × 240, 160 × 120, 80 × 60, 40 × 30 and 20 × 15 pixels, respectively.
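For illustration, the downsampling could be implemented as follows; this is a hedged sketch assuming OpenCV, where the factor of 16 corresponds to the 1/16 × 1/16 level (40 × 30 pixels for the 640 × 480 input images).

```python
import numpy as np
import cv2

def downsample_frames(frames, factor=16):
    """Reduce each B-mode frame by the given factor per axis before serialization."""
    h, w = frames[0].shape[:2]
    size = (w // factor, h // factor)            # cv2.resize expects (width, height)
    return np.stack([cv2.resize(f, size, interpolation=cv2.INTER_AREA) for f in frames])
```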


Fig. 2. Setup for acquiring US images in synchronization with EM signals. (A) On the right chest is the US transducer for imaging the right liver lobe, (B) on the left of the volunteer is a NDI Aurora EM tracking device, (C) on the umbilicus is an EM sensor tracked by this EM device, and (D) is a geared motor for tilting the US probe.

3.2.3. Comparison to Laplacian Eigenmaps

This experiment was executed to visually and quantitatively compare the proposed LTSA-based method to the LEM-based one [21]. Based on the first two experiments, the number of nearest neighbors is set to a constant of 64 and the image downsampling level is fixed at 1/16 × 1/16. In addition to the number of nearest neighbors, LEM has another important parameter, the kernel width, which is used to weight the influence of the neighbors and was experimentally set to 10 to obtain good results.
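For reference, the evaluation protocol of Section 3.2 amounts to the following small sketch (illustrative function names, not the authors' code); note that the CC itself is unaffected by the linear normalization, which mainly serves the visual comparisons in the figures.

```python
import numpy as np

def normalize01(sig):
    """Linearly normalize a signal to the interval (0, 1) using its minimum and maximum."""
    sig = np.asarray(sig, dtype=float)
    return (sig - sig.min()) / (sig.max() - sig.min())

def signal_cc(extracted, em_reference):
    """Correlation coefficient between an image-derived signal and the EM-tracked reference."""
    return np.corrcoef(normalize01(extracted), normalize01(em_reference))[0, 1]
```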

3.2.4. Comparison to template matching

We also performed an experiment to compare our method to another classic image-region-based signal tracking method [18], which is based on the template matching (TM) technique. For the TM-based method, a template block is first manually specified on the reference image (often the first image of an image sequence). This image block should contain some salient features distinct from the image background, such as vessels or liver boundaries. Afterwards, the method sequentially searches each frame of the image sequence for the block that best matches the template block, and calculates the 2D displacement between the matched block and the template block. This process produces a sequence of 2D displacements, which represents the local in-plane 2D respiratory motion of the liver. Of the two directional components of these displacements, the one with the greater motion is used as the respiratory signal. The TM-based method is quite simple in principle and fairly easy to implement, but it requires selecting a proper template block that contains prominent anatomical structures. Since the TM-based method extracts the respiratory motion by tracking only a small region of the liver images, for a fair comparison the input data of our method are not the entire frames but image blocks selected from each frame of the image sequence. These blocks have the same size as the template block used in the TM-based method, and their positions in all the images are fixed to the position of the template block. In this experiment, the image block has a size of 65 × 65 pixels, which repeated experiments showed to be a good choice. Since the template block is fairly small, no image downsampling is required when using our LTSA-based method. The number of nearest neighbors is fixed at 64.
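As a rough sketch of this TM baseline (not the original implementation), the block search can be written with OpenCV's matchTemplate; the 65 × 65 block size is taken from the experiment, while the block position and function names are illustrative.

```python
import numpy as np
import cv2

def tm_respiratory_signal(frames, top_left, size=65):
    """frames: (n, H, W) grayscale images; top_left = (y, x) of the template block
    chosen on the first (reference) frame."""
    y, x = top_left
    template = frames[0][y:y + size, x:x + size]
    disp = []
    for img in frames:
        res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, max_loc = cv2.minMaxLoc(res)        # max_loc is (x, y) of the best match
        disp.append((max_loc[0] - x, max_loc[1] - y))
    d = np.asarray(disp, dtype=float)
    axis = int(np.ptp(d, axis=0).argmax())           # directional component with larger motion
    sig = d[:, axis]
    return (sig - sig.min()) / (sig.max() - sig.min())
```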

3.2.5. 3D ultrasound

We performed an initial experiment to demonstrate the feasibility of creating respiration-corrected 3D ultrasound images from 2D ultrasound images. The US probe was regularly tilted to acquire 48 image sequences with different scanning angles from one volunteer. These sequences cover an angular range of 30.84°, and each sequence consists of 128 frames.
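The gating-based selection described in Section 2.3 then reduces to picking, for every tilted sweep, the frame whose extracted phase is closest to a chosen target phase. The sketch below assumes the respiratory_signal() helper sketched in Section 2.2; the target phase value is an illustrative assumption.

```python
import numpy as np

def select_gated_frames(sweeps, target_phase=0.0):
    """sweeps: list of (n, H, W) image sequences acquired at different tilt angles.
    Returns one frame per sweep, all at (approximately) the same breathing phase;
    target_phase = 0.0 or 1.0 corresponds to one end of the breathing cycle."""
    gated = []
    for frames in sweeps:
        phase = respiratory_signal(frames)                    # normalized to (0, 1)
        gated.append(frames[int(np.argmin(np.abs(phase - target_phase)))])
    return gated   # to be resampled into a 3D volume using the known tilt angles
```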

3.3. Results

3.3.1. Nearest neighbors

The experimental results are displayed in Fig. 3 for visual inspection, where two typical image sequences (S6 and S7) are used. It can be observed that when the number of nearest neighbors is small (for instance 16 or 32), the extracted respiratory signals (in red) are not sufficiently accurate and show severe distortions at some peaks and valleys in comparison to the EM-tracked reference signals (in green). However, the respiratory phases at the peaks and valleys are usually vital for respiratory gating and other applications, and therefore too small a neighborhood (16 or 32) is not a suitable option. When the number of nearest neighbors increases to 64, the extracted breathing signals are very close to the reference signals at the key respiratory phases as well as over the whole signal profiles. As the neighborhood size is further increased to 128, the extracted signals start to become unsmooth and show distortions at the peaks and valleys. More importantly, we also performed a quantitative analysis of the signal tracking accuracy under different numbers of nearest neighbors by calculating the CC values between the extracted signals and the EM-tracked reference signals; the results are listed in Table 1. Here, eight image sequences corresponding to different volunteers are used for the analysis. The mean and standard deviation (STD) of the CC values were also calculated over these sequences. It is observable that the quantitative results, on the whole, support the visual analysis of Fig. 3. The proposed method obtains the highest mean CC value and the lowest STD value for 64 nearest neighbors. We also analyzed the computing time of the proposed method under different numbers of neighbors; the results are listed in Table 2. As the neighborhood size increases, the time for extracting the respiratory signals also increases sharply, from about 80 s up to nearly 20 min. Therefore, after careful evaluation of both the signal accuracy and the computing time, the number of nearest neighbors of each feature vector on the manifold is set to 64, which is fixed in the following experiments.

3.3.2. Downsampling level

Fig. 4 gives a visual impression of the extracted signal accuracy for different levels of image downsampling, where two typical image sequences (S6 and S7) are used. For downsampling levels of 1/2 × 1/2, 1/4 × 1/4, 1/8 × 1/8 and 1/16 × 1/16, the extracted breathing signals (in red) have similar phases and shapes to the EM-tracked reference signals (in green). When the level is increased to 1/32 × 1/32, the extracted signals start to deviate from the EM-tracked reference signals in shape and smoothness.


Table 1. Quantitative signal accuracy for different numbers of nearest neighbors. The signal accuracy is calculated by comparing the extracted signals to the EM-tracked reference signals using the CC metric. The sign * indicates the image sequences that are used in Fig. 3.

Neighbor number    S1       S2       S3       S4       S5       S6*      S7*      S8       Mean     STD
16                 0.9512   0.9861   0.9347   0.9482   0.9522   0.9385   0.9701   0.9689   0.9562   0.01631
32                 0.9551   0.9872   0.9787   0.9580   0.9559   0.9583   0.9780   0.9780   0.9687   0.01218
64                 0.9692   0.9927   0.9842   0.9627   0.9642   0.9721   0.9818   0.9787   0.9757   0.00977
128                0.9613   0.8916   0.9828   0.9514   0.9548   0.9729   0.9775   0.9813   0.9592   0.02789

Table 2. Computing time (in seconds) of the proposed LTSA-based method under different numbers of nearest neighbors. The sign * indicates the image sequences that are used in Fig. 3.

Neighbor number    S1        S2        S3        S4        S5        S6*       S7*       S8        Mean
16                 80.77     87.24     83.94     83.75     84.18     88.51     81.77     87.56     84.72
32                 137.40    142.04    136.94    135.88    143.07    139.92    129.84    149.22    139.29
64                 337.30    341.25    346.58    356.34    321.28    336.06    324.15    363.15    340.76
128                1071.04   1204.62   1017.48   1144.26   1106.20   1044.09   1019.72   1196.89   1100.54

Table 3. Quantitative signal accuracy for different levels of image downsampling. The signal accuracy is calculated by comparing the extracted signals to the EM-tracked reference signals using the CC metric. The sign * indicates the image sequences that are used in Fig. 4.

Downsampling level    S1       S2       S3       S4       S5       S6*      S7*      S8       Mean     STD
1/2 × 1/2             0.9698   0.9927   0.9843   0.9628   0.9644   0.9721   0.9819   0.9780   0.9758   0.00967
1/4 × 1/4             0.9698   0.9925   0.9841   0.9625   0.9644   0.9740   0.9815   0.9769   0.9757   0.00954
1/8 × 1/8             0.9683   0.9931   0.9835   0.9629   0.9634   0.9744   0.9831   0.9791   0.9760   0.01001
1/16 × 1/16           0.9757   0.9915   0.9810   0.9648   0.9584   0.9799   0.9859   0.9828   0.9775   0.01026
1/32 × 1/32           0.9744   0.9885   0.9638   0.8939   0.9503   0.9730   0.9578   0.9819   0.9605   0.02772

More importantly, we also quantitatively evaluated the signal tracking accuracy under different levels of image downsampling by comparing the extracted signals to the EM-tracked reference signals using the CC metric; the results are listed in Table 3. Here, eight image sequences corresponding to different volunteers are used for the evaluation. The mean and STD of the CC values were also calculated over these sequences. In most cases, the proposed method obtains relatively high CC values from level 1/2 × 1/2 to 1/16 × 1/16.

Fig. 3. Different numbers of nearest neighbors affect how closely the extracted signals (in red) approximate the EM-tracked reference signal (in green). Two typical image sequences (S6 and S7) from different volunteers are used, corresponding to each column. The first row gives the first frame of each image sequence. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


Fig. 4. Different levels of image downsampling affect the approximation degree of the extracted signals (in red) to the EM-tracked reference signal (in green). Two typical image sequences (S6 and S7) from different volunteers are used, corresponding to each column. The first row gives the first frame of each image sequence. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The CC values begin to decrease noticeably when the level is set to 1/32 × 1/32. We also investigated the computing time of the proposed method under different downsampling levels, with the corresponding results shown in Table 4. We can see from this table that the computing time for the signal extraction decreases sharply as the image size decreases. Therefore, considering both the signal accuracy and the computing efficiency, the downsampling level 1/16 × 1/16 (corresponding image size 40 × 30 pixels) is best suited for the signal extraction and is kept constant in the following experiments.

3.3.3. Comparison to Laplacian Eigenmaps

The experimental results are displayed in Fig. 5, where two typical image sequences (S6 and S7) are used. It is noticeable from this figure that the signals (in red) extracted by the LEM-based method are quite different from the EM-tracked reference signals (in green), especially at the peaks and valleys, which correspond to the vital respiratory phases, the ends of inhalation (EI) and exhalation (EE). This inaccuracy in detecting the key phases limits the clinical application of the LEM-based method in respiratory-gating situations. By contrast, the signals detected by our method are highly similar to the EM-tracked ground-truth signals. By viewing their overlapping graphs, we observe nearly perfect matching, which indicates that our method is able to detect respiratory signals that are highly consistent with the reference ones at the key phases (EI and EE) as well as over the entire signal profiles. A quantitative analysis of the signal detection accuracy was also performed by calculating the CC values between the signals extracted by our LTSA-based method and by the LEM-based one and the ground-truth signals. The mean and STD of the CC values were also calculated over these eight sequences.

The experimental results are given in Table 5. Our method obtains higher CC values than the LEM-based method for all eight image sequences. The lower STD value also indicates that our method is more robust in the signal extraction than the LEM-based method. We also compared the computing performance of our method and the LEM-based one; the results are also listed in Table 5. The LEM-based method takes less than 1 s to process an image sequence of 256 frames, whereas our method needs more than 2 s to complete the same task. Evidently, our method trades a larger computational cost for higher signal accuracy. However, the increased computing time is acceptable given the higher signal accuracy and robustness achieved.

3.3.4. Comparison to template matching

The experimental results are illustrated in Fig. 6, where two typical image sequences (S6 and S7) are used. We can observe that the breathing signals (in red) detected by the TM-based method are fairly inaccurate in comparison to the EM-tracked reference signals (in green): there is considerable noise, there are distortions at the peaks and valleys, and the signal profiles clearly mismatch the reference ones. Furthermore, the TM-based method detects a much better signal for S7 than for S6. This is because the selected anatomical feature (liver boundary) in S6 has relatively low contrast, and the TM-based method depends strongly on properly choosing a template block containing salient anatomical features. In contrast, our LTSA-based method has no such preference for anatomical features and is able to extract smoother and more accurate respiratory signals with relatively little noise. Furthermore, we also performed a quantitative analysis of the signal tracking accuracy by calculating the CC values between the signals extracted by our method and by the TM-based one and the EM-tracked reference signals.


Table 4. Computing time (in seconds) of the proposed LTSA-based method under different levels of downsampling. The sign * indicates the image sequences that are used in Fig. 4.

Downsampling level    S1      S2      S3      S4      S5      S6*     S7*     S8      Mean
1/2 × 1/2             88.50   83.21   81.90   81.22   81.11   81.25   89.32   78.95   83.18
1/4 × 1/4             19.63   20.52   20.89   20.58   19.47   20.40   20.37   19.95   20.23
1/8 × 1/8             5.72    5.66    5.72    5.64    5.56    5.62    5.70    5.50    5.64
1/16 × 1/16           2.31    2.34    2.30    2.29    2.27    2.34    2.28    2.26    2.30
1/32 × 1/32           1.49    1.51    1.53    1.50    1.49    1.49    1.52    1.48    1.50

Fig. 5. Comparison in signal detection accuracy between the proposed LTSA-based method and previous LEM-based one. Two typical image sequences (S6 and S7) from different volunteers are used, corresponding to each column. The first row gives the first frame of each image sequence. For better visual analysis, the extracted signals (in red) are individually drawn for inspecting their whole profiles, and drawn together with the EM-tracked reference signals (in green) for comparing the signal accuracy. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 5. Comparison of the quantitative signal accuracy and computing time between our LTSA-based method and LEM-based one. The signal accuracy is calculated by comparing the extracted signals to the EM-tracked reference signals using the CC metric. The sign * indicates the image sequences that are used in Fig. 5.

Metric     Method   S1       S2       S3       S4       S5       S6*      S7*      S8       Mean     STD
CC         LEM      0.9437   0.9572   0.9610   0.9412   0.9358   0.9185   0.9606   0.9518   0.9462   0.01363
           LTSA     0.9757   0.9915   0.9810   0.9648   0.9584   0.9799   0.9859   0.9828   0.9775   0.01026
Time (s)   LEM      0.81     0.80     0.83     0.79     0.81     0.80     0.84     0.80     0.81     –
           LTSA     2.31     2.34     2.30     2.29     2.27     2.34     2.28     2.26     2.30     –

Table 6. Comparison of the quantitative signal accuracy and computing time between our LTSA-based method and TM-based one. The signal accuracy is calculated by comparing the extracted signals to the EM-tracked reference signals using the CC metric. The sign * indicates the image sequences that are used in Fig. 6.

Metric     Method   S1       S2       S3       S4       S5       S6*      S7*      S8       Mean     STD
CC         TM       0.9036   0.9302   0.9584   0.9576   0.9392   0.7818   0.9429   0.8652   0.9099   0.05639
           LTSA     0.9681   0.9747   0.9620   0.9858   0.9589   0.9681   0.9518   0.9781   0.9684   0.01027
Time (s)   TM       5.14     5.11     5.08     5.12     5.13     5.09     5.06     5.12     5.11     –
           LTSA     4.51     4.54     4.54     4.64     4.31     4.39     4.69     4.32     4.49     –


Fig. 6. Comparison in signal detection accuracy between the proposed LTSA-based method and the TM-based one. Two typical image sequences (S6 and S7) from different volunteers are used, corresponding to each column. The first row gives the first frame of each image sequence. For better visual analysis, the extracted signals (in red) are individually drawn for inspecting their whole profiles, and then drawn together with the EM-tracked reference signals (in green) for comparing the signal accuracy. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The mean and STD of the CC values were also calculated over these eight sequences. The experimental results are displayed in Table 6, where the higher mean value and lower STD value further demonstrate that our method consistently obtains more accurate and robust breathing signals than the TM-based one. In addition, the computing performance of our method and the TM-based one was also compared, and the corresponding results are given in Table 6. Our method takes less than 5 s to extract the signals from the image sequences of 256 frames, whereas the TM-based method needs more than 5 s for the same task. Therefore, in contrast to the TM-based one, our method is able to extract more accurate signals at a higher computing efficiency.

3.3.5. 3D ultrasound

Fig. 7 shows two 3D US volumes reconstructed from these image sequences: one created by combining the 2D images at the same frame position in the image sequences (left side of the figure), and the other reconstructed by merging the 2D images at the same expiration phase (right side of the figure). From the lateral views in this figure, it can be observed that the non-corrected volume shows severe jagged artifacts around the liver boundary and the three vessels. With our gating method applied, the breathing-corrected volume is well reconstructed, and, in particular, the liver boundary and vessels are clearly visible.

4. Discussion

As stated in the data acquisition subsection, the US images and the corresponding EM positions are always acquired in pairs to ensure approximate synchronization; in particular, a US image is scanned immediately after an EM position is recorded. Therefore, the latency for each pair of US image and EM position roughly comprises the transmission and acquisition time of one EM position. Since each EM position occupies a small data throughput (at most tens of bytes), this latency is very small and negligible. In general, this is a simple but practical technique for synchronizing US and EM data, similar to Barratt's work [30]. Quantitative measurement of the latency is quite difficult and requires special devices and a complex calibration process, which is beyond the scope of this paper. In fact, the experiments in this paper show that the EM-tracked respiratory signals have nearly no phase shift with respect to the signals extracted from the US images, which also implies that the latency between the US images and EM positions has been reduced to an acceptable level. According to the description in the methods section above, our signal extraction algorithm has only one controllable parameter: the number of nearest neighbors of each data point on the manifold. The neighborhood of a data point delineates its local geometric characteristics on the manifold and is usually utilized to map the nonlinear manifold in the high-dimensional space to the corresponding one in the low-dimensional space [24]. Theoretically, the number of nearest neighbors should be moderate: if the neighborhood is too small, it cannot collect sufficient local information and incurs incorrect embedding results, whereas too large a neighborhood loses locality and degrades the embedding results.


Fig. 7. 3D US images created from 2D images without correction (left) and with breathing correction (right).

In addition, when the neighborhood of a data point becomes larger, the time consumed in identifying the nearest neighbors and constructing the sparse alignment matrix increases accordingly. Therefore, reducing the computing time as far as possible is another important consideration when choosing the number of nearest neighbors. The visual inspection in Fig. 3 and the quantitative results in Table 1 show that 64 nearest neighbors is the most suitable choice for a sequence of 256 frames in this paper. It is also noticed from Table 1 that the differences in quantitative accuracy between different numbers of nearest neighbors are not very pronounced. However, we can still see clear visual differences at the key peaks and valleys (corresponding to the ends of expiration and inspiration, EE and EI), as illustrated in Fig. 3. The accuracy at the key respiratory phases plays an important role in breathing-gated applications, which often utilize the key phases for guiding 3D ultrasound reconstruction, radiation dose delivery and so forth. In fact, the trade-off between computing time and signal accuracy is subtle, and often varies under different conditions. If the time performance is critical, it is usually better to choose a smaller number of neighbors, for instance 16. In this paper, when using 64 neighbors, we are still able to extract the respiratory signal in a very short time (about 2.3 s for an image sequence of 256 frames, i.e. nearly 110 frames per second; see Table 5). Our method takes a US image sequence as input and serializes each frame as one feature vector in a row-major way, so direct use of the original images as feature vectors incurs a large time consumption. For example, as shown in Table 2, if the

number of nearest neighbors is fixed at 64, it takes more than 5 min to extract the respiratory signal from an image sequence of 256 frames. Therefore, downsampling the US images before serializing them can speed up the signal extraction process. Furthermore, when the input images are moderately reduced, the dominant respiration-induced motion in these images is preserved while minor intensity variations and noise, which are inherent in the US imaging process, are eliminated or weakened. Hence, image downsampling helps to detect more accurate and robust breathing signals while reducing the time consumption. However, if the images are reduced too much, part of the respiratory information appearing in the US images is also lost and the accuracy of the extracted signals deteriorates. The visual inspection in Fig. 4 and the quantitative results in Table 3 support this analysis. In addition to being used for breathing gating from liver US images, manifold learning techniques have recently been applied to other gating situations, such as breathing-gated 4D CT imaging of the lung [31,32], breathing-gated fluoroscopic imaging for lung cancer radiotherapy [33], breathing-gated MR [22,34,35] and heartbeat-gated intravascular ultrasound (IVUS) [36]. Therefore, it is reasonable to expect that the proposed LTSA-based method will also find value in other clinical fields.

5. Conclusion

This paper has presented a new approach for accurate and fast detection of surrogate respiratory signals of the moving liver directly from 2D US B-mode image sequences. The algorithm makes use of the LTSA-based nonlinear dimensionality reduction technique to extract the intrinsic low-dimensional respiratory phase


information embedded inside high-dimensional image data. LTSA uses a linear approximation for local geometry of each data point to construct a local coordinate system, and then aligns these overlapping local coordinate systems to obtain a global coordinate system to represent the data vectors in a low-dimensional space, which is finally projected as the respiratory signal. Multiple experiments have demonstrated that the proposed method surpasses other typical image-based methods in signal tracking accuracy and robustness at relatively high computing efficiency. Although only implemented and experimented using 2D US images, our method can also naturally be extended to operate on native 3D US images for respiratory gating. In future, we plan to integrate the proposed respiratory gating method into our ongoing imageguided robotic system for 3D/4D US imaging to capture the moving liver. Acknowledgement This work is supported by a research grant (Grant No. 1431AFG099) from the Joint Council Office (JCO), Agency for Science, Technology and Research (ASTAR), Singapore. References [1] von Siebenthal M. Analysis and modelling of respiratory liver motion using 4DMRI. ETH Zurich 2008, http://dx.doi.org/10.3929/ethz-a-005552073. [2] McClelland JR, Hawkes DJ, Schaeffter T, King AP. Respiratory motion models: a review. Med Image Anal 2013;17:19–42, http://dx.doi.org/10.1016/ j.media.2012.09.005. [3] von Siebenthal M, Cattin P, Gamper U, Lomax A, Székely G. 4D MR imaging using internal respiratory gating. In: MICCAI 2005. Heidelberg: Springer; 2005. p. 336–43. [4] Blackall JM, Penney GP, King AP, Hawkes DJ. Alignment of sparse freehand 3-D ultrasound with preoperative images of the liver using models of respiratory motion and deformation. IEEE Trans Med Imaging 2005;24:1405–16, http://dx.doi.org/10.1109/TMI.2005.856751. [5] Nicolau SA, Pennec X, Soler L, Ayache N. Clinical evaluation of a respiratory gated guidance system for liver punctures. Med Image Comput Comput Assist Interv 2007;10:77–85. [6] Xu Q, Hamilton RJ. A novel respiratory detection method based on automated analysis of ultrasound diaphragm video. Med Phys 2006;33:916–21, http://dx.doi.org/10.1118/1.2178451. [7] Zhang J, Ding M, Meng F, Yuchi M, Zhang X. Respiratory motion correction in free-breathing ultrasound image sequence for quantification of hepatic perfusion. Med Phys 2011;38:4737–48, http://dx.doi.org/10.1118/1.3606456. [8] Nguyen T-N, Moseley JL, Dawson LA, Jaffray DA, Brock KK. Adapting liver motion models using a navigator channel technique. Med Phys 2009;36:1061, http://dx.doi.org/10.1118/1.3077923. [9] Preiswerk F, Arnold P, Fasel B, Cattin PC. A Bayesian framework for estimating respiratory liver motion from sparse measurements. In: Yoshida H, Sakas G, Linguraru M, editors. Abdom. Imaging 2011. Heidelberg: Springer; 2011. p. 207–14. [10] Rijkhorst E-J, Rivens I, ter Haar G, Hawkes D, Barratt D. Effects of respiratory liver motion on heating for gated and model-based motioncompensated high-intensity focused ultrasound ablation. In: Fichtinger G, Martel A, Peters T, editors. MICCAI 2011. Heidelberg: Springer; 2011. p. 605–12. [11] Arnold P, Preiswerk F, Fasel B, Salomir R, Scheffler K, Cattin PC. 3D organ motion prediction for MR-guided high intensity focused ultrasound. In: Yoshida H, Sakas G, Linguraru M, editors. MICCAI 2011. Heidelberg: Springer; 2011. p. 623–30. [12] Khamene A, Warzelhan J, Vogt S, Elgort D, Chefd’Hotel C, Duerk J, et al. Characterization of internal organ motion using skin marker positions. 
In: Barillot C, Haynor D, Hellier P, editors. MICCAI 2004. Berlin, Heidelberg: Springer; 2004. p. 526–33, http://dx.doi.org/10.1007/978-3-540-30136-3_65. [13] Ernst F, Martens V, Schlichting S, Besirević A, Kleemann M, Koch C, et al. Correlating chest surface motion to motion of the liver using ε-SVR – a porcine study. In: MICCAI 2009. Berlin, Heidelberg: Springer; 2009. p. 356–64.

[14] Ernst F, Bruder R, Schlaefer A, Schweikard A. Correlation between external and internal respiratory motion: a validation study. Int J Comput Assist Radiol Surg 2012;7:483–92, http://dx.doi.org/10.1007/s11548-011-0653-6. [15] Schaerer J, Fassi A, Riboldi M, Cerveri P, Baroni G, Sarrut D. Multi-dimensional respiratory motion tracking from markerless optical surface imaging based on deformable mesh registration. Phys Med Biol 2012;57:357–73, http://dx.doi.org/10.1088/0031-9155/57/2/357. [16] Alnowam MR, Lewis E, Guy M, Wells K. Marker-less tracking for respiratory motion correction in nuclear medicine. In: Nucl Sci Symp Conf Rec (NSS/MIC), 2010. IEEE; 2010. p. 3118–21, http://dx.doi.org/10.1109/ NSSMIC.2010.5874374. [17] Hwang Y, Kim J-B, Kim YS, Bang W-C, Kim JDK, Kim C. Ultrasound image-based respiratory motion tracking. In: Proceeding SPIE, Med. Imaging 2012. 2012., http://dx.doi.org/10.1117/12.911766, 83200N–83200N–6. [18] Wu J, Li C, Huang S, Liu F, Tan BS, Ooi LL, et al. Fast and robust extraction of surrogate respiratory signal from intra-operative liver ultrasound images. Int J Comput Assist Radiol Surg 2013;8:1027–35, http://dx.doi.org/10.1007/s11548-013-0902-y. [19] Wu J, Chi Y, Li C, Tan BS, Ooi LL, Ramamurthy S, et al. Automatic and real-time identification of breathing pattern from ultrasound liver images. In: MIAR 2013. Berlin/Heidelberg: Springer; 2013. p. 27–34. [20] Sundar H, Khamene A, Yatziv L, Xu C. Automatic image-based cardiac and respiratory cycle synchronization and gating of image sequences. In: MICCAI 2009; 2009. p. 381–8. [21] Wachinger C, Yigitsoy M, Navab N. Manifold learning for image-based breathing gating with application to 4D ultrasound. In: MICCAI 2010; 2010. p. 26–33. [22] Wachinger C, Yigitsoy M, Rijkhorst E-J, Navab N. Manifold learning for image-based breathing gating in ultrasound and MRI. Med Image Anal 2012;16:806–18, http://dx.doi.org/10.1016/j.media.2011.11.008. [23] Belkin M, Niyogi P. Laplacian Eigenmaps for dimensionality reduction and data representation. Neural Comput 2003;15:1373–96, http://dx.doi.org/10.1162/ 089976603321780317. [24] Zhang Z, Zha H. Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM J Sci Comput 2003;26:313–38, http://dx.doi.org/10.1137/S1064827502419154. [25] Penney GP, Blackall JM, Hamady MS, Sabharwal T, Adam A, Hawkes DJ. Registration of freehand 3D ultrasound and magnetic resonance liver images. Med Image Anal 2004;8:81–91, http://dx.doi.org/10.1016/j.media.2003.07.003. [26] Kadoury S, Zagorchev L, Wood BJ, Venkatesan A, Weese J, Jago J, et al. A modelbased registration approach of preoperative MRI with 3D ultrasound of the liver for Interventional guidance procedures. In: ISBI 2012; 2012. p. 952–5, http://dx.doi.org/10.1109/ISBI.2012.6235714. [27] Fenster A, Downey DB, Cardinal HN. Three-dimensional ultrasound imaging. Phys Med Biol 2001;46:R67. [28] Neshat H, Cool DW, Barker K, Gardi L, Kakani N, Fenster A. A 3D ultrasound scanning system for image guided liver interventions. Med Phys 2013;40:112903, http://dx.doi.org/10.1118/1.4824326. [29] Solberg OV, Lindseth F, Torp H, Blake RE, Nagelhus Hernes TA. Freehand 3D ultrasound reconstruction algorithms – a review. Ultrasound Med Biol 2007;33:991–1009, http://dx.doi.org/10.1016/j.ultrasmedbio.2007.02.015. [30] Barratt DC, Davies AH, Hughes AD, Thom SA, Humphries KN. Accuracy of an electromagnetic three-dimensional ultrasound system for carotid artery imaging. Ultrasound Med Biol 2001;27:1421–5. 
[31] Georg M, Souvenir R, Hope A, Pless R. Manifold learning for 4D CT reconstruction of the lung. In: CVPR 2008 Work. 2008. p. 1–8, http://dx.doi.org/10.1109/CVPRW.2008.4563024. [32] Luo Z, Xi Z, Wang J, Tang D. Sorting 4DCT images based on manifold learning. In: ICICTA 2008; 2008. p. 181–5, http://dx.doi.org/10.1109/ICICTA.2008.150. [33] Lin T, Li R, Tang X, Dy JG, Jiang SB. Markerless gating for lung cancer radiotherapy based on machine learning techniques. Phys Med Biol 2009;54:1555–63, http://dx.doi.org/10.1088/0031-9155/54/6/010. [34] Usman M, Vaillant G, Atkinson D, Schaeffter T, Prieto C. Compressive manifold learning: estimating one-dimensional respiratory motion directly from undersampled k-space data. Magn Reson Med 2013, http://dx.doi.org/10.1002/mrm.25010. [35] Yigitsoy M, Wachinger C, Navab N. Manifold learning for image-based breathing gating in MRI. In: Proceeding SPIE, Med. Imaging 2011. 2011. p. 796210–7, http://dx.doi.org/10.1117/12.878027. [36] Isguder G, Unal G, Groher M, Navab N, Kalkan A, Degertekin M, et al. Manifold learning for image-based gating of intravascular ultrasound (IVUS) pullback sequences. In: Liao H, Eddie Edwards PJ, Pan X, Fan Y, Yang G-Z, editors. MIAR 2010. Berlin Heidelberg: Springer; 2010. p. 139–48, http://dx.doi.org/10.1007/978-3-642-15699-1 15.
