Medical Image Analysis 24 (2015) 106–124

Contents lists available at ScienceDirect

Medical Image Analysis journal homepage: www.elsevier.com/locate/media

A Markov random field approach to group-wise registration/mosaicing with application to ultrasound Jason Kutarnia∗, Peder Pedersen Worcester Polytechnic Institute, 100 Institute Rd, Worcester, MA 01609, United States

a r t i c l e

i n f o

Article history: Received 17 June 2014 Revised 25 May 2015 Accepted 26 May 2015 Available online 3 June 2015 Keywords: Ultrasound mosaicing Group-wise registration Discrete methods

a b s t r a c t In this paper we present a group-wise non-rigid registration/mosaicing algorithm based on block-matching, which is developed within a probabilistic framework. The discrete form of its energy functional is linked to a Markov Random Field (MRF) containing double and triple cliques, which can be effectively optimized using modern MRF optimization algorithms popular in computer vision. Also, the registration problem is simplified by introducing a mosaicing function which partitions the composite volume into regions filled with data from unique, partially overlapping source volumes. Ultrasound confidence maps are incorporated into the registration framework in order to give accurate results in the presence of image artifacts. The algorithm is initially tested on simulated images where shadows have been generated. Also, validation results for the group-wise registration algorithm using real ultrasound data from an abdominal phantom are presented. Finally, composite obstetrics image volumes are constructed using clinical scans of pregnant subjects, where fetal movement makes registration/mosaicing especially difficult. In addition, results are presented suggesting that a fusion approach to MRF registration can produce accurate displacement fields much faster than standard approaches. © 2015 Elsevier B.V. All rights reserved.

1. Introduction and prior work In this paper we will develop a novel group-wise image registration/stitching algorithm capable of producing composite volumes with structural continuity from many partially overlapping 3D sources. The definition of a discrete energy functional used to partition the composite volume into regions corresponding to each unique source is one contribution of this work and will be referred to as a mosaicing function. The second contribution is the development of a group-wise block matching algorithm that is derived using a probabilistic framework and enhanced with the pre-computed mosaicing function defined later. The motivation for this research was to produce composite obstetrics volumes for use in an ultrasound training simulator, where each image volume encompasses the mother’s entire abdomen. Unique challenges include fetal movements and extensive shadowing, which caused volumes acquired from one side of the abdomen to have few features in common with volumes acquired from the opposite side. This work builds on the research of Wachinger (2011), where rigid and non-rigid group-wise registration techniques were developed



Corresponding author. Tel.: +1 978 2731 602. E-mail addresses: [email protected], (J. Kutarnia), [email protected] (P. Pedersen). http://dx.doi.org/10.1016/j.media.2015.05.011 1361-8415/© 2015 Elsevier B.V. All rights reserved.

[email protected]

for ultrasound. In Wachinger et al. (2008b) rigid registration strategies were developed for 3D ultrasound mosaicing, and in Wachinger (2011) temporal group-wise non-rigid registration was applied to model liver motion in 4D. Our work links the accumulated pairwise estimates framework, developed for group-wise registration, to a discrete Markov Random Field and formulates an effective method to deal with many partially overlapping volumes. Previous work on extended field of view ultrasound includes the registration techniques developed in Poon and Rohling (2006), where a set of volumes acquired from a 3D ultrasound machine was stitched together by linking position information from a tracking system to each 3D volume. Non-rigid registration was performed and the volumes were compounded by averaging pixel values in overlapping regions. In vitro experiments were performed by stitching together volumes obtained from a fetal phantom. The composite volume was grown by sequentially adding adjacent volumes, using pair-wise registration techniques, until all volumes had been incorporated. In Ni et al. (2009) an improved technique for the pair-wise rigid registration of ultrasound, which uses a 3D scale invariant feature transform, was developed with ultrasound mosaicing as its intended application. The authors generated panoramic images of a liver phantom. In Kutter et al. (2009b) the authors utilized CT data to improve the group-wise affine registration of ultrasound volumes by fusing a USCT similarity metric with the standard US-US similarity metrics. This approach was used to produce a mosaic of the liver using clinical data.

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

A solution for the real-time rigid registration of 3D ultrasound volumes was presented in Schneider et al. (2012). Our approach adds the capability to handle the significant non-rigid deformations and imaging artifacts which are present in clinical abdominal and obstetrics ultrasound. Spatial compounding methods are an integral component in ultrasound mosaicing. Once the volumes are aligned, the overlapping regions must be combined in some way to produce the composite image. To this end Wachinger et al. (2008a) estimated the acoustic impedance from multiple overlapping ultrasound images. In Rajpoot et al. (2009) multiple single-view 3D echocardiography images were fused utilizing a method based on 3D wavelet transforms. These methods require precise alignment of the overlapping volumes and based on our experience with clinical obstetrics volumes this is not possible. The mosaicing function we propose in this paper is a very capable alternative in difficult scenarios where these methods fail. Similarity metrics are used to guide registration algorithms and those that have been designed with specific modalities in mind can improve the accuracy of the resulting deformation fields, as shown by Andriy (2010). For example, methods designed for ultrasound have taken into account the unique statistical properties of signal noise and also the speckle correlation between volumes acquired close to one another in time. An intensity based metric, which minimizes the residual complexity between two images, was presented in Myronenko and Song (2010) and was shown to be superior to mutual information. Other metrics, such as those developed by Mellor and Brady (2005), incorporate local phase information because it can provide more local structural information than using intensity alone. These methods have been shown to outperform traditional metrics such as mutual information and sum of squared differences in experimental settings, but they are costly to compute and do not account for the shadow artifacts that are prevalent in clinical ultrasound. An efficient similarity measure that can handle large deformations between many partially overlapping volumes (≥3) is desired. To this end we choose to implement an improved sum of squared differences metric capable of dealing with shadowing artifacts, which dominated the registration error when clinical data acquired from live subjects was used. Our metric can be efficiently calculated in the frequency domain and effectively handles obscured regions. MRF based image registration schemes have been proposed in earlier research and shown to produce satisfying results. In Glocker et al. (2011) a pair-wise hybrid geometric/iconic algorithm based on MRF theory was proposed and validated using lung CT data. A group-wise registration method based on MRFs was presented in Sotiras et al. (2009) and tested on 2D MR human skeletal muscle images. The previously cited discrete methods use first order derivatives to regularize the transformation fields, which we found to be problematic when dealing with partially overlapping volumes. In Kwon et al. (2008) a higher order regularization term based on second derivatives was proposed and used to perform pair-wise 2D registration. In Kwon et al. (2012) this higher order smoothness term was extended to 3D and used in the pair-wise registration of MR volumes. We extend this discrete second order smoothness term for use in a group-wise setting. Alpha expansion has also been effectively used to perform traditional pair-wise registration in So et al. (2011), where their contribution outperformed the diffeomorphic demons algorithm found in the Insight Toolkit, as well as other discrete approaches such as Glocker et al. (2008). Their idea was to perform alpha expansion using a set of displacement labels. Our work is unique in that, starting from a probabilistic framework, we develop a pair of MRF energies for the joint group-wise registration/mosaicing problem and then show how they may be efficiently optimized using current computer vision approaches. In contrast to previous efforts, we consider the mosaicing function as input to our registration method in order to focus the similarity metric on the most influential regions. This novel approach has been

107

specifically designed for the group-wise registration of many partially overlapping volumes. Choosing an optimal mosaicing function reduces unwanted image artifacts in the final composite volume as well as simplifies the group-wise registration problem, leading to increased computational performance. Existing compounding methods require precise alignment between volumes, which is not achievable with freehand 3D fetal ultrasound, making our method a feasible alternative. For the sake of speed we have chosen to implement an improved version of the sum of squared differences (SSD) similarity metric, where our modification increases the metric’s robustness in the presence of shadows. Accounting for these ultrasound specific artifacts is novel in the context of group-wise non-rigid registration. This gives us the ability to quickly compute all the similarity information needed for the entire registration process efficiently by using the FFT method derived in this paper. This computation occurs before any optimization is performed. Using the FFT for block matching was previously proposed in Orchard (2007) for pair-wise multimodal rigid registration and in this paper we extend the method for non-rigid registration in a group-wise setting, taking into account ultrasound specific shadowing artifacts. The data required for the optimization of the MRF is pre-computed and stored for easy access by whichever optimization algorithm is chosen. Due to the pairwise nature of the similarity metric, coupled with its fast pre-computation, we can include additional volumes with minimal effort. Existing similarity metric data is reused for the group-wise registration problem involving additional volumes. In addition, we present some results about the parallelization of image registration through deformation field fusion of independent solutions. 2. Theory: group-wise registration component 2.1. Group-wise registration in a probabilistic framework The registration problem will first be described in a maximum likelihood framework. If given a set of N overlapping ultrasound image volumes, V = {I1 , . . . , IN }, our goal is to calculate the N threedimensional deformation fields which align the volumes within the regions of overlap. We define a scene coordinate space in R3 and seek to transform all N volumes into this space where the final mosaiced volume will reside. Let the set of transformations be defined as T = {T1 , . . . , TN } where Tn : R3 → R3 . The resulting problem is a fairly complicated group-wise registration which can be formulated as an energy minimization and stated as

Tˆ = argmin E (T ; I1 , . . . , IN ).

(1)

T

In the equation above E represents the registration energy that we wish to minimize. A lower value of E, measured in the regions of overlap, indicates better alignment among the source volumes contained in V. The function E is usually split and written as a sum of two terms, E = M + R, which represents the total energy to be minimized. This can be looked at as a combination of the image matching criteria and a transformation regularization term, M and R respectively. The matching term measures alignment of image features and is referred to as a similarity metric, while the regularization term penalizes unlikely transformations such as those containing very large first or second derivatives. Each source in V has a set of control points which are used to generate its deformation field. Our strategy is to perform a series of group-wise translational alignments at these control point locations. Similarly, in a probabilistic framework this registration problem can be stated as

Tˆ = argmax P (T |I1 , . . . , IN ) T

≈ argmax P (I1 , . . . , IN |T ) P (T ), T

(2)

108

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

where the prior probability on the source volumes is ignored but could be utilized as part of a future extension. Taking the natural log of the argument in the last line of (2) yields

Tˆ ≈ argmax[ ln P (I1 , . . . , IN |T ) + ln P (T ) ].

(3)

T

Equation (3) takes the form of E and is a combination of the similarity metric and the prior probability on the set of transformations. The two probabilities in (3) correspond to M and R respectively. Concentrating on the first probability in (3), ln P (I1 , . . . , IN |T ), we can formulate a group-wise similarity metric known as accumulated pair-wise estimates. It was first derived in Wachinger et al. (2008b) and used for the rigid registration of multiple US volumes acquired from a fetal phantom. This approach was later extended to temporal deformable group-wise registration where a 3D-US system was used to model liver motion in Wachinger (2011). We will see that this formulation can be directly linked to the discrete graph based MRF framework. This approach also avoids the issue of choosing a fixed volume, which may be significantly out of alignment with the rest, and drawing the others toward it unrealistically. In the derivation that follows all image volumes are assumed to be conditionally independent of each other; thus, given a realization of one image volume all other volumes are independent. The first step in the derivation is to rewrite P (I1 , . . . , IN |T ) as the product of N conditional densities with respect to some image volume In . This can be done using the product rule and the property of conditional independence P (A, B|C ) = P (A|C ) P (B|C ), and so we can write

P (I1 , . . . , IN |T ) = P (I1 , . . . , In−1 , In+1 , . . . , IN |In , T ) P (In |T ) = P (In |T )

N 

(4)

P (Ii |In , T ).

i=n

In order to make the likelihood function “symmetric” with respect to all volumes, the Nth power of the joint density is taken and (4) is employed N times, iterating through the image volumes so that each takes a turn as the fixed volume. This results in



P (I1 , . . . , IN |T ) = N

N 



P (Ii |T )

N N  



P I j |Ii , T





.

(5)

i=1 j=i

i=1

N N N   1 1  ln P (Ii |T ) + ln P I j |Ii , T N N i=1 j=i

i=1



N N  







In,b (x ) = In (Tn (x + cb )) where x ∈ B





(7)

≈ In x + cb + dn,b . ↓

The symbol In,b represents a transformed version of In where the downward arrow is used to signify translation. The translational component dn, b is considered to be the exact deformation at the control point cb . Note that at each grid location a group-wise block matching is performed. If a grid location is common to many (≥3) overlapping volumes it will be shown that the sum of pairwise terms can serve as the group-wise measure of alignment for this set of blocks. First we will concentrate on simplifying the term ln P (I1 , . . . , IN |T ), which represents the block based similarity measure. The block formulation process for multiple image volumes, ↓ given the transformed volume Ii,b , can be approximated as

 ↓

I j,b (x ) ≈ f Ii,b x − d j,b



+ ε,

(8)

where x ∈ B and noise is represented by the term ε . This equation characterizes the block Ij, b as being a transformed and corrupted sub↓

Finally, the logarithm is applied to the joint probability function in (5) resulting in an expression for the likelihood function ln P (I1 , . . . , IN |T ), which can be used in practice to evaluate the similarity of multiple overlapping image volumes. This is shown in the following equation:

ln P (I1 , . . . , IN |T ) =

centered on the control points. The location of each block in the scene coordinate system can be determined using the block index b. The number of blocks along the x, y, and z dimensions are denoted by Bx , By , and Bz respectively. Also, we will define a block as a cubic region centered on a control point whose sides are equal to 2h. Individual voxels within a block will be indexed as offsets from the  grid coordinates that the block is centered on using the set B = x|x ∈ Z3 ∧ −h < xi < h . The control point indexes increase fastest along the x dimension and slowest along the z dimension. Let the each Tn ∈ T be the transformation defined by the set of displacement vectors corresponding to the control points associated with vectors for source n will be indexed usIn . The set of displacement  ing the notation Dn = dn,1 , . . . , dn,b , . . . , dn,B where dn,b ∈ R3 . The complete solution, consisting of all the displacement vectors for the N source volumes, is denoted as D = {D1 , . . . , Dn , . . . , DN }. Simplified notation for the translation and formation of blocks will be helpful in providing a cleaner presentation. Thus transformed regions near control point cb from image volume In will be represented using

image of the transformed source image Ii,b . Although we form I j,{1,...,B} from an intact image volume each Ij, b is assumed independent of the others for the purpose of deriving the metric. Starting with (6), a group-wise probabilistic block matching term can be constructed as follows:

ln P (I1 , . . . , IN |T ) ≈

N N  



ln P I j |Ii , Ti , T j



i=1 j=i

(6)

ln P I j |Ii , T .



2.2. Block matching formulation In this section a group-wise block matching algorithm will be described based on the probabilistic concepts discussed previously. We propose that the source volumes In should be broken down into overlapping blocks which are centered on each grid  point. The set of con trol point locations will be denoted as C = cb |cb ∈ R3 ∧ 1  b  B , where B denotes the total number of blocks. The control points will form a rectangular grid of uniform spacing h in the scene coordi nate system. Let In,{1,...,B} = In,1 , . . . , In,b , . . . , In,B represent the set of B overlapping blocks which comprise image volume n and are



ln P I j,{1,...,B} |Ii , Ti , T j



i=1 j=i

i=1 j=i

From (6) we see that the higher order joint density representing the group-wise registration problem can be estimated using the sum of pairwise densities.

N N  



N  N  B  i=1 j=i b=1

=

N  N  B 



ln P I j,b |Ii , di,b , d j,b







(9)



ln P I j,b |Ii,b , d j,b .

i=1 j=i b=1

The terms in the first line signify the probability of observing Ij , given Ii and the transformations Ti , Tj . Reasoning that a complete image can be estimated by overlapping blocks containing highly dependent random variables in common regions, the second line of (9) can be written. The third line of (9) is the result of applying the conditional independence property to the blocks forming an image volume. Due to ↓ our model of the imaging process, which produced I j,{1,...,B} from Ii , we can say that the overlapping blocks of Ij are independent given the set of control point displacement vectors. The approximation of (9)

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

becomes increasingly valid as the block size decreases and it also allows us to develop a computationally efficient implementation. Note that in the final result of (9) the transformed version of Ii is used as the scene image and not the original acquired volume. This is important because it corresponds to fixing the transformation of volume Ii ↓ during the calculation of P(I j,b |Ii,b , d j,b ). If we assume zero mean Gaussian noise then the term ↓ ln P(I j,b |Ii,b , d j,b ) can be simplified to the SSD metric calculated over the block designated by the index b,





P I j,b |Ii,b , d j,b



  2 ↓ ≈ I j,b (x ) − Ii,b x − d j,b SSD between vols. i,j over block b

x∈B

=







I j,b (x ) − Ii x + cb + di,b − d j,b

2

.

(10)

109

the quality of the final mosaic were the two biggest concerns. With this in mind, we found that larger search windows did not necessarily increase of quality of the final mosaic, an effect likely due to the multi-resolution approach of the registration; however, it did increase optimization time due to the larger label set. The set W = [−4, −3, . . . , 0, . . . , 3, 4] × [−4, −3, . . . , 0, . . . , 3, 4] × [−4, −3, . . . , 0, . . . , 3, 4], where the number of labels was |W | = 729, corresponding to h = 8, was found to be the optimal choice in our experiments. We compared this to the expanded set W = [−8, −7, . . . , 0, . . . , 7, 8] × [−8, −7, . . . , 0, . . . , 7, 8] × [−8, −7, . . . , 0, . . . , 7, 8], where |W | = 4913, and found a negligible increase in quality; thus, we do not recommend a displacement label set this large. Because all volumes are allowed to deform during optimization, the smaller label set can still result in significant non-rigid transformations.

x∈B

This type of simplification is discussed in detail in Roche et al. (2000). The block-wise SSD metric can be evaluated very fast using FFT techniques, as seen in Essannouni et al. (2007). This technique is applied to the group-wise similarity metric in Appendix A. The final likelihood term for the SSD based group-wise registration algorithm is

ln P (I1 , . . . , IN |T ) ≈

N  B  N   [I j,b (x ) − Ii (x + cb + (di,b − d j,b ))]2 .

(11)

i=1 j=i b=1 x∈B

This term takes into account the translation of both image volumes, Ii and Ij , during its calculation; however, (11) chooses fixed blocks of Ij to match within larger windows of Ii thus enabling efficient calculation. The origins of the windows and blocks are determined by the location of each cb . Let P = {i, j} be the set of volume pairs which have some amount of overlap. The right side of (11) can now be rewritten to show the symmetry of the metric,

ln P (I1 , . . . , IN |T ) ≈

B   i, j∈P b=1 x∈B







I j,b (x ) − Ii x + cb + di,b − d j,b







2

+ Ii,b (x ) − I j x + cb + d j,b − di,b

2

.

(12)

Thus for each pair of images that overlap we use symmetric blockmatching to calculate the similarity in the region of overlap and then sum these values together for all valid pairs resulting in the group-wise registration metric. In (12) we see that the driving term in the registration process is τ = di,b − d j,b , which is a vector variable representing the relative displacement between image volumes Ii and Ij at location cb . In a group-wise setting only the relative movement between overlapping image volumes is a factor in the similarity metric. We will rewrite the innermost summation of (12) in terms of τ = di,b − d j,b . The inner summation can be written as

SSDi,b j (τ ) =



I j,b (x ) − Ii (x + cb + τ )

x∈B



2

2

+ Ii,b (x ) − I j (x + cb − τ ) .





This section will introduce the improved metric specific for mosaicing ultrasound image volumes. Shadowing artifacts can substantially hinder efforts to register partially overlapping ultrasound volumes. These artifacts are very common in fetal ultrasound, the area our stitching efforts are concentrated in, thus our algorithm should attempt account for them. Our approach to compensate for shadows during non-rigid registration involves a pre-processing detection step followed by group-wise registration with a modified similarity metric. Shadow detection can be accomplished using a few different techniques, such as those presented in Hellier et al. (2010) and Karamalis et al. (2012). The latter describes a graph-based algorithm that labels each voxel in an ultrasound image with a confidence value from 0 to 1 and we have chosen to implement this method. Shadow maps were created by designating all voxels with confidence values less than 0.25 as occluded. This threshold was heuristically determined by Karmalis in his work (2012). During our experiments there was good agreement between the results of the automatic shadow detection algorithm and a visual inspection, so this parameter did not require further evaluation. Finally, the total similarity value is normalized by the number of voxels that were used in its calculation. The normalization is required in order to prevent the registration algorithm from forcing structures apart in the areas dominated by shadows. The metric that we are minimizing becomes the average error per voxel in shadow free regions. In the following discussion let us assume that Mi and Mj are Boolean masks identifying valid image regions that are free of artifacts. It is easy to incorporate these masks into the block matching framework presented in this paper by modifying (13) and this modification is shown below,

OVRi,b j (τ ) =



M j,b (x )Mi (x + cb + τ ) + Mi,b (x )M j (x + cb − τ )

x∈B

SHDi,b j (τ ) =



1 OVRi,b j

(τ )

x∈B

[Mi (x + cb + τ )I j,b (x ) − M j,b (x )Ii (x + cb + τ )]2 + [M j (x + cb − τ )Ii,b (x ) − Mi,b (x )I j (x + cb − τ )]2 .

(13)

Each control point is allowed to move within a fixed window during the optimization process. Assuming a uniform grid of control points, let the displacement window be

W = d|d ∈ Z3 ∧ − 12 h  dn  12 h .

2.3. Increasing the robustness of our group-wise block matching term in the presence of shadows

(14)

We have experimented with different search window sizes in order to determine the ideal displacement label set for use by the optimization algorithm. Computational performance and

(15) i, j OVRb

is the normalization factor, giving the number The function of unobscured voxels that volume i and volume j have in common around control point cb when their relative offset is τ . The funci, j tion SHDb calculates the sum of squared differences while ignoring shadow regions. This can be seen by noticing that the term inside i, j the summation of SHDb evaluates to 0 if the voxel location x is obstructed by shadows in both image volumes. Let us also assume that the masks calculated during preprocessing were used to eliminate those obscured regions in the ultrasound volumes via setting their

110

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

voxels equal to 0. Since we have determined that those areas were occluded during preprocessing we can assume their voxel intensities just represent noise anyway. Performing this step prior to registrai, j tion ensures that the term inside the summation of SHDb is only non-zero as long as Mi = 1 and M j = 1, which is an intuitive result. Equation (15) simply calculates the sum of squared differences using the voxels from the overlapping regions that have not been obscured by shadowing artifacts.

point cb is found by summing the L1 vector norms of the terms from (16). This contribution is expressed as



2.4. Transformation regularization We will now discuss the probability P (T ) in (3), representing the transformation regularization. Because our ultrasound registration problem involves many partially overlapping image volumes we found that 2nd order regularization was necessary. Minimizing the 2nd derivative as opposed to the 1st derivative has the effect of producing a smoother displacement field. The field’s value is dominated by the similarity metric in an area of overlap; however, in a region with no other overlapping volumes the field is totally dependent on the regularization energy to determine its value. Shearing artifacts may occur in the transition between the overlapping and nonoverlapping regions if a 1st order regularizer is used. Sudden changes in the deformation field tended to appear on this transition in our experiments with a 1st order regularizer. This is due to the fact that sudden changes are penalized nearly the same as smooth transitions with this type of regularization. Consider a 1D problem with 5 control points, where the last 2 points have displacement values of 4. In this simple example of shearing the value of the 1st order regularizer is 4, which is the result obtained after summing up the discrete 1st derivatives. Now let the last 2 control points have displacement values of 2 and 4 respectively. The shearing is effect is gone but the 1st order regularizer is still 4, which means the optimization method would not show any preference to this. The 2nd order regularizer would ensure that the optimizer prefers a smooth displacement field in these instances. This problem is unique to mosaicing because we are registering volumes that contain differing/limited views of the anatomy. In typical registration scenarios the structure being registered is completely contained in all volumes, thus the similarity measure is able to guide the transformation everywhere. Also, a 2nd order regularizer is invariant to linear transformations, which is a desirable property when the deformation contains mostly rotational components. For each image, the control point displacement set defining the transformation Tn is used to estimate its 2nd derivative using a central difference scheme. Each Tn is regularized independently from the rest of the transformations T{1,...n−1,n+1,...N} and the total regularization energy for an individual volume is calculated by summing up each control point’s contribution. The 2nd order derivatives along the x, y, and z dimensions of the deformation field are estimated at the control points using the following equations:

 dn,b−1 − 2dn,b + dn,b+1 ∂ 2 Tn  ≈ ∂ x2 x=c h2  b dn,b−Bx − 2dn,b + dn,b+Bx ∂ 2 Tn  ≈ ∂ y2 x=c h2 b  dn,b−Bx By − 2dn,b + dn,b+Bx By ∂ 2 Tn  ≈ . ∂ z2 x=c h2

(16)

b

Due to the discretization of the registration problem, the regularization of the deformation field only considers the translation at each control point in its calculation. This is a necessary approximation required to use the graph based optimization methods to be discussed. For volume In , the regularization contribution associated with control



     dn,b−1 − 2dn,b + dn,b+1 dn,b−Bx − 2dn,b + dn,b+Bx  +     h2 h2    dn,b−Bx By − 2dn,b + dn,b+Bx By   +  h2        2    2    2    ∂ Tn    ∂ Tn    ∂ Tn   ≈ (17) + + .  ∂ x2 x=cb   ∂ y2 x=cb   ∂ z2 x=cb 

Rn,b dn,b = 

Finally, the total regularization of the continuous transformation Tn is approximated as the sum of discrete terms calculated by applying (17) to every control point associated with Tn . The regularization   energy for image volume In is Rn (Dn ) = 1B Bb=1 Rn,b dn,b . The total energy for the group-wise registration problem is found by summing the regularization energy for each individual image volume as  R= N n=1 Rn (Dn ). This term is substituted into the registration formulation where λ controls how much influence the regularization has on the registration process.

2.5. Formulation of registration energy In this section the complete registration energy is presented, which is a combination of the similarity measure and the transformation regularization developed previously. Formulation of the registration energy is given as



E (D ) =

B N N    i=1



SHDi,b j di,b − d j,b



j>i b=1

     di,b−1 − 2di,b + di,b+1   di,b−Bx − 2di,b + di,b+Bx    +  +     B h2 h2 b=1     di,b−Bx By − 2di,b + di,b+Bx By   . +  (18)  h2 λ

B 

The first thing to note about this equation is the grouping of random variables from the set D into distinct terms. Like before, each discrete variable dn, b corresponds to the displacement of a control point that is located at cb and associated with volume n. The groupings are referred to as cliques and upon inspection it can be seen that there are terms in (18) that are  functions of double and triple i, j cliques. The similarity terms SHDb di,b − d j,b are examples of double clique terms because they are functions of two displacement variables. Each double clique term links the displacements of co-located control points belonging to a pair of different volumes. Notice in the  i, j SHDb di,b − d j,b terms that the volume identifiers i, j are different but the location identifier b is identical. As discussed before, this measures the block-wise similarity between volumes at a region surrounding cb and can be thought of as inter-volume terms. The triple clique terms found in the second row of (18) measure the regularization energy of the transformations. Each control point in volume n belongs to three triple cliques, one corresponding to each dimension of the volume. Looking at the indexes in these terms it is apparent that all triple clique terms are intra-volume terms since the volume identifier i is constant. This results from our assumption that the prior probabilities of the transformations in the set T should be independent. We can also write (18) in a more compact way where double (inter-volume) and triple (intra-volume) are  cliques of nodes collected into the sets di,b , d j,b ∈ NInter and di,a , di,b , di,c ∈ NIntra

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

D SS

111

assigns each node a label from the space L and the goal of the optimization is to find xˆ = argmin E (x ). The real valued terms  and

1,3

x

D SS D SS

1,2

Re

gu

1,3

Re

gu

S

iza

tio

no

fT

tio

3

no

fT

2

2

lar

iza

3

gu

iza

e ag Im

Re

lar

lar

e ag Im

SD

 are used to calculate the energy of a labeling and are commonly

2,3

tio

no

e ag Im

fT

1

1

i j (0, 0 ) + i j (1, 1 )  i j (1, 0 ) + i j (0, 1 ).

Fig. 1. Graph for Markov Random Field representing registration problem with three overlapping 2D images. Each control point node is represented by a circle, with its latent value being defined as a displacement vector. Probabilistic dependencies between nodes are modeled by the edges connecting them. For example, similarity dependencies are edges which extend between images while smoothness dependencies are represented by edges between nodes of the same image.

respectively. Equation (18) is now written as

E (D ) =





{di,b ,d j,b }

SHDi,b j di,b − d j,b

∈NInter

+



λ B

{

di,a ,di,b ,di,c ∈NIntra

}



   di,a − 2di,b + di,c   ,   h2

(19)

which can be described by a graphical model where the cliques are indicated using edges. This is the most common form found in the computer vision literature. A partial graph representing the MRF modeling a two dimensional version of our registration problem is shown in Fig. 1. The 3D version is a straightforward extension of this 2D version. Fig. 1 shows control point nodes corresponding to three overlapping 2D images and demonstrates the intra- and inter-volume dependencies. Using this graph we desire to make an inference on the displacement values for each node. In the implementation section the method chosen to make this inference will be discussed. The double clique terms corresponding to the similarity measure are modeled by the edges between images while the triple clique terms corresponding to the regularization are modeled by the edges between nodes within the same image. Not all edges are shown in this figure; however, remaining edges can be deduced by (19). 3. Theory: mosaicing function with applications 3.1. Overview and calculation The goal of the mosaicing function is to label each voxel in the final composite volume with the ID of the ultrasound source that its intensity value should originate from. This function can be used to produce the final mosaic as well as focus the registration on certain regions, thus increasing quality and efficiency. Similar to the group-wise registration algorithm, the mosaicing problem will also be formulated as an MRF; however, it will only contain pair-wise cliques. This MRF takes the form,

E (x ) =

 i∈V

i (xi ) +



referred to as unary and pairwise potentials. These symbols are commonly found in the literature and so are used here to introduce the graph cut theory; however, they are not equivalent to the symbols that were used earlier to define the registration problem. It is well known that if L = {0, 1} and the  and  terms satisfy a certain condition called sub-modularity then a globally optimal labeling can be found by solving a minimum cut problem on a specially constructed graph. Minimization of (20) with this binary label space is known as quadratic pseudo-Boolean optimization (QPBO), where quadratic refers to the highest ordered term being pairwise. The submodularity condition for a binary labeling problem can be stated as

i j (xi , x j ) where x ∈ L|V | .

(20)

(i, j )∈N

In this equation V is a set of nodes, N is a set of undirected edges connecting pairs of nodes, and L is the set of labels. The labeling x

(21)

The minimization of equations taking the form of (20) can be accomplished using a popular algorithm known as alpha expansion, which is based on graph cuts. Alpha expansion reduces the optimization of (20), which may have a large label space, to a series of binary labeling problems. During each iteration a label, referred to as alpha, is chosen from the set and has its optimal expansion calculated. Ultimately this procedure labels the voxels in the composite volume’s common scene coordinate system with a source ID. Thus the label space for this problem is defined as L = {1..N}, which leads to the partitioning of the abdomen by source volume. The voxel labelling is chosen in a way that attempts to minimize intensity and gradient differences between source volumes at the seams. Seam selection can be expressed as a discrete optimization problem in the form of (20). In this instance each element of the vector variable x corresponds to a single voxel in the common scene coordinate system and may take on values from 1 to N, where N is the number of sources. We have redefined T in this section so that it no longer represents a registration transformation. The 3D ultrasound source volumes are designated as I1 .....IN where In : R3 → R; now let T : R → R3 map the voxel index to its 3D spatial coordinates. Because the common scene space is comprised of a number of irregularly shaped overlapping 3D ultrasound volumes, not every voxel location in this space lies in the domain of each source volume. The unary term found in (20) is used by our mosaicing application to guarantee that the entire domain of  the composite volume, N n=1 Domain (In ), is comprised of valid image data and is calculated as



i (xi ) =

0

if T (i ) ∈ Domain(Ixi )



Otherwise

.

(22)

In our application this term guarantees that all imaged areas of the abdomen are included in the final composite volume. It also ensures that the voxel associated with xi is assigned a source that coincides with its location. The pairwise terms control where the boundaries are placed between overlapping volumes in the composite image. We wish to limit the intensity and gradient differences across the boundary, resulting in the most continuous composite volume. The following expression gives the pairwise interaction terms:

    i j (xi , x j ) = Ixi (T (i )) − Ix j (T (i )) + Ixi (T ( j )) − Ix j (T ( j ))     +∇ Ixi (T (i )) − ∇ Ix j (T (i )) + ∇ Ixi (T ( j )) − ∇ Ix j (T ( j )),

(23) where i, j are neighboring voxels. The first term in (23) measures the intensity difference if neighboring voxel locations in the composite volume are labeled with different sources. If neighboring locations in the composite volume are labeled with the same source then this term is 0 and no seam runs through this region. The second term

112

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

in (23) penalizes gradient differences along the seams in the composite volume which limits discontinuities along surfaces between neighboring sources. This procedure is necessary for the creation of large composite volumes due to inconsistencies between scans arising from subject movement and organ deformation during acquisition. These inconsistencies present themselves as large intensity differences between multiple overlapping volumes and in certain cases cannot be eliminated with non-rigid image registration. This was especially apparent when stitching fetal ultrasound volumes together. It is easy to show that the terms in (23) satisfy the submodularity constraint given in (21). We see that  ij (1, 1) evaluates to 0 because it signifies the neighboring voxels both taking the new label specified by alpha. Using the triangle inequality, X + Y  X + Y , we get the following result:

  Ixi (T (i )) − Ix j (T (i ))



Coronal/transverse slices formed using maximum intensities:

Coronal/transverse slices formed using mosaicing function:



 Ixi (T (i )) − Iα (T (i )) + Iα (T (i )) − Ix j (T (i ))

  Ixi (T ( j )) − Ix j (T ( j ))





 Ixi (T ( j )) − Iα (T ( j )) + Iα (T ( j )) − Ix j (T ( j )).

(24)

This proves the submodularity of the energy function. Ideally neighboring locations in the composite ultrasound image volume should come from the same source volume if possible. Substituting the newly defined terms i (xi ) and  ij (xi , xj ), from (22) and (23), into (20) and then optimizing with alpha-expansion results in a labeling that can be used to produce a mosaiced volume as well as guide group-wise registration. Let the final labeling be represented as (x ) : R3 → R where x ∈ R3 is a location in the common scene coordinate system. In the regions of the composite volume where no sources lie we set (x ) = 0 because there is no information available to determine what the mosaiced volume should look like here.

Mosaicing function overlaid onto slices from composite volume:

3.2. Generation of 3D mosaic Given a set of partially overlapping 3D volumes we desire to form a mosaiced volume that fills as much of the composite coordinate system as possible. To this end, the first step is to perform seam selection using the algorithm discussed above. This results in labeled regions of the composite coordinate system where each label corresponds to a particular source volume. At this point it is possible to create a mosaiced volume by simply assigning each voxel in the composite coordinate system with the intensity value from its designated source. Let the final image be designated as F (x ) which is constructed as follows:



F (x ) =

I(x) (x )

if (x ) > 0

0

otherwise

,

(25)

where  (x) is the final labeling. In order to improve the visual quality of the composite volume we perform group-wise registration before mosaicing. Finally, (25) is used to construct the composite volume. It is also possible to construct ultrasound mosaics using spatial compounding techniques. For example, one of the simplest methods to determine the intensity of a voxel in the composite volume is to use the maximum value of all the overlapping sources at its location. This type of approach works well when creating mosaics of stationary organs such as the liver, or those with predictable motion patterns such as the heart; however, they give poor results when faced with large non-deterministic movements, which is the type that occurs during fetal ultrasound scanning. Since our clinical application is the construction of fetal mosaics for use in a training simulator, our proposed method outperforms the spatial compounding techniques that require almost perfect alignment between volumes. Fig. 2 compares the mosaicing results of our proposed method with a simple spatial compounding approach. The first row shows two slices

Fig. 2. Comparison between spatial compounding technique and our proposed mosaicing function using fetal ultrasound data.

from a composite volume created using the maximum intensity technique, where the red circles indicate poor quality in the constructed mosaic. In this example rigid registration was performed before we compared spatial compounding to the proposed method. In the first volume the baby’s leg was straight but when the second volume was acquired the leg was bent at a 90° angle. This substantial misalignment was too much for non-rigid registration algorithms to correct without severely degrading the quality of the ultrasound image. With the proposed mosaicing technique the composite volume looks seamless despite the uncorrectable misalignment of limbs. The second row of Fig. 2 shows the same source volumes stitched using our proposed mosaicing function. The errors seen in the top two slices have been correctly dealt with, resulting in a composite volume that can be included in the simulator. In the last row the mosaicing function is overlaid onto the composite slices showing how the five overlapping volumes are stitched together. 3.3. Focused registration Because we are dealing with living structures, including a fetus that typically exhibits some motion during the ultrasound scanning, registration is both difficult and essential when attempting to construct an accurate training volume from clinical data. This seam selection algorithm enables the group-wise registration algorithm to focus in the neighborhoods of stitching boundaries, where anatomical features are similarly represented in all overlapping volumes and the movement between volumes in this region is therefore hopefully

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

minimized. We consider this a pre-processing task before group-wise non-rigid registration. The basic idea of focused registration is to consider a fixed region in the neighborhood of the implied seams and only compute the similarity metric from (18) inside this region. The performance of group-wise registration in terms of computation time is greatly improved due to the decreased complexity of the graph representing the registration problem. Since the registration energy expressed by (18) is composed of many thousands of discrete terms, the simplest way to introduce the mosaicing function is to use it to crop the similarity terms that may  negatively  affect the registration outcome. Thus certain i, j SHDb di,b − d j,b terms are removed from the registration energy if they lie too far away from the junction of multiple image volumes. The process for cropping the pairwise similarity terms can be formalized after introducing a few utility functions. Let Di (x ) be a distance function that measures the minimum distance from point x to the region in the composite volume labeled with source i. Next a parameter will be defined that controls the number of terms which are cropped from the registration energy. Let ς be the maximum distance from a seam that we wish to consider when calculating the similarity metric. This means that the width of the total region influencing the pair-wise alignment terms will be 2ς . The last step before expressing the new focused registration energy will be to define an indicator function that is used to remove blocks from consideration when they are greater than ς from a seam. The indicator i, j function δb is defined as



δbi, j (cb ) =

1

if Di (cb )  ς and D j (cb )  ς

0

otherwise

.

(26)

Using this indicator function the focused registration energy can now be defined as

E (d ) =

 i, j



{di,b ,d j,b }

δbi, j (cb ) SHDb di,b − d j,b

∈NInter

+

λ



B

{di,a ,di,b ,di,c }



   di,a − 2di,b + di,c   .   h2

(27)

∈NIntra

The only difference between (27) and (19) is the addition of the indicator function used to cancel the unwanted block-based similarity terms. The indicator function can also be thought of as eliminating specific inter-volume edges from the specially constructed MRF graph in Fig. 1. As ς increases, more blocks are considered by the registration algorithm until all overlapping regions influence the solution. When using this technique the mosaicing function is calculated prior to group-wise non-rigid registration and consequently it must be recalculated after the volumes have been aligned as there may be significant changes. Using phantom data from Section 6, Fig. 3 compares the initial mosaicing function with one calculated after deformable registration. There are some noticeable changes to the final stitching seam. 4. Implementation details Implementation choices, such as the transformation type and the optimization algorithm, are discussed in this section. Since the groupwise registration problem and the mosaicing function are formulated as discrete MRFs, the same family of optimization methods may be used for both. As advances are made in discrete optimization, the algorithm used in our mosaicing framework may be easily swapped out in favor of better performing methods. 4.1. The transformation function We choose to use a parametric transformation model where the parameters are the displacements of the control points in

113

After:

Before:

After:

Before:

Fig. 3. The mosaicing function is recalculated after non-rigid registration to account for changes in alignment.

3D space. The displacement of each voxel is computed as a weighted linear combination of control point displacements Dn =  dn,1 , . . . , dn,b , . . . , dn,B where dn,b ∈ R3 :

Dn (x ) =

B 

wb (x )dn,b .

(28)

b=1

In this expression n specifies the source volume, b identifies the control point that dn, b corresponds to, and B is the total number of points. The weighting function wb (x) represents the amount of influence that control point b has at location x. Using the displacement function in (28), the transformation at each voxel can be calculated as Tn (x ) = x + Dn (x ). Each unregistered volume has a unique grid of control points in the scene coordinate system. We are trying to determine the displacements in the set {D1 , . . . , Dn , . . . , DN }, where each Dn contains the control points defining the transformation of volume n. In our work the weighting functions were chosen from the tri-cubic convolution interpolation algorithm. Because these functions force the deformation field to take the exact value of the registration result at each control point location, a larger step is taken after each iteration when compared to weighting functions such as cubic Bsplines. This is beneficial because the computational requirement to optimize N three-dimensional displacement fields simultaneously is large. 4.2. Efficient optimization of registration energy using fusion techniques Since our displacement label space W is large and the MRF representing the registration energy given by (18) contains a considerable amount of double and triple clique terms, which may not be submodular, we are left with a substantial NP-hard optimization problem. Our strategy to optimize this energy will be to employ a parallelized alpha-expansion technique that was recently developed in Lempitsky et al. (2010), the details of which are included in Appendix B. It should be noted that each triple clique term in our energy function is expanded into six double clique terms before optimization, since the methods we employ operate on pair-wise potentials. This reduction in order of the MRF is discussed in Zikic et al. (2010). A key component of this algorithm will be the optimization of non-submodular binary-labeled MRFs, which are sub-problems of the registration procedure. The BHS (Boros, Hammer and Sun) algorithm presented in

114

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

Fig. 4. An abdominal slice of the Visible Human Female CT dataset is used to generate simulated images with position dependent shadow artifacts. The five simulated images, along with their respective shadow masks, are shown along the outside of the VHF slice.

Kolmogorov and Rother (2007) will be used for this purpose. The general idea of the fusion technique is to intelligently combine different suboptimal solutions, which were calculated in parallel using subsets of the complete displacement label set W, in order to produce a better solution with lower registration energy. The BHS algorithm effectively optimizes non-submodular binary labeled MRFs, where each node that has been labeled is part of the globally optimal solution. Thus the BHS algorithm is executed on a non-submodular binary labeling problem during each expansion and produces an output vector x which contains 0,1 or ?. The nodes given values 0 or 1 are considered labeled and part of the globally optimal solution whereas the nodes designated as ? are considered unlabeled. In our experience the registration algorithm produces very few unlabeled nodes, less than 0.01%, and those that are unlabeled simply retain their previous value. The registration energy in (18) reduces to the form shown in (20) because each triple clique term can be expanded into six double clique terms. In addition to the computational burden from the additional pairwise functions used to express the triple clique terms there is also overhead associated with using the more complex BHS algorithm, which is necessary due to the nonsubmodularity of the problem. In light of this complexity we propose an approach where mutually exclusive regions in the solution space are explored in parallel. This is accomplished by partitioning the label space W and then calculating the registration solution for each partition in parallel. By fusing these solutions the final result encompasses the entire space of W. Implementation details can be found in Appendix B.

5. Performance of registration framework on simulated data This section will demonstrate the effectiveness of our proposed improvements on the mosaic quality. Specifically, results showing the need for shadow detection and a group-wise registration framework are presented.

5.1. Impact of position dependent shadow artifacts on registration accuracy In this section we demonstrate the extent to which position dependent shadow artifacts can negatively affect group-wise non-rigid registration results. The experiments are conducted using simulated images, where the addition of shadow artifacts occurs after organ deformation. Using simulated images to study these effects, instead of real ultrasound data, allows us to add realistic shadows to multiple deformed variations of the same anatomy. This cannot be done using a tissue mimicking phantom or a human subject due to the fact that we cannot physically deform them between image acquisitions. The simulated images shown in Fig. 4 were generated from an abdominal slice of the Visible Human Female (VHF) CT dataset, as described in Ackerman et al. (1995). Based on the approach described in Kutter et al. (2009a) and Shams et al. (2008) we added position dependent shadow artifacts for five deformed variations of the VHF anatomy, which is also shown in Fig. 4. The idea is to relate CT Hounsfield units (HU) to acoustic impedance values which can be used to simulate ultrasound propagation. Each image’s corresponding shadow mask is shown adjacent to the section of the abdominal slice it was created from. In order to produce the organ deformations approximately 40 landmark points were inserted into the image and displaced. The dense deformation fields were then calculated using thin plate splines. As mentioned, the shadows were generated after the deformation field was applied in an attempt to closely model what happens in a clinical environment. The positioning of the organs greatly affects the way shadows manifest themselves in the ultrasound images so they must be calculated post deformation. Group-wise non-rigid registration experiments were performed on these five partially overlapping images utilizing the graph-based framework, and include both the standard SSD and the improved shadow-aware similarity metrics presented in Section 2. The convincing results are shown in Fig. 5. Each of the mosaics in Fig. 5 was produced by taking the maximum intensity value from the set of

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

115

Group-wise registration

Pair-wise registration

Fig. 5. Results demonstrating proposed method’s improved ability to perform registration in the presence of shadows when compared to the standard implementation.

overlapping images at each pixel location. This simple method for ultrasound mosaicing provides a good visual means for comparison. The top image shows the mosaic before any non-rigid registration was performed. Every organ has multiple outlines in the mosaic, each outline corresponding to one of the individual ultrasound images. Significant deformation between scans is visible here. The lower left image from Fig. 5 shows the results using standard SSD and it is apparent that the shadow artifacts caused a lot of issues during registration. Structures have not been properly aligned and many have become much less discernible. There is significant improvement in the registration quality when we include shadow maps, as can be seen by examining the image in the lower right of Fig. 5. This image closely resembles the original undeformed Visible Human Female slice shown in the center of Fig. 4, which our simulation was based on. The difference in quality was large enough to render numerical evaluation unnecessary. The results convincingly demonstrate the need to account for shadows during group-wise non-rigid registration. Using the efficient method introduced in this paper the additional time required for this is negligible. 5.2. Advantages of group-wise registration in the context of ultrasound mosaicing As an alternative to group-wise registration, it is possible to build a composite volume by performing pair-wise deformable registration between partially overlapping volumes; however, in this section we will present a simple experiment that demonstrates the benefits of choosing a group-wise method. In this scenario position dependent compression and shadow artifacts are simulated on a ring. The images are shown in Fig. 6, where the compression is along the direction of the beam. The applied transducer force has partially flattened the ring, and the combined reflectivity and absorptivity of the ring produces the shadow. The goal is to produce a mosaic of the deformed shape using both registration frameworks. The left side of Fig. 6 shows the result of the proposed method, which does a good job reconstructing the original source image. The right side of Fig. 6 shows the mosaic produced using the pair-wise formulation of our method. This is a poor reconstruction in comparison to the results of the group-wise approach. The only difference between the pair-wise and group-wise methods used in this experiment was the number of volumes allowed to participate in the optimization of (19). Starting with the source image on the left, the pair-wise based mosaic was constructed by sequentially adding its neighboring sources in

Mosaic results

Difference Images

Fig. 6. Comparison between pair-wise and group-wise registration using simulated images with position dependent compression/shadow artifacts.

a clock-wise direction until all 4 images had been included. We believe this was the best way to make a fair comparison between both frameworks, as the only variable altered was the number of volumes participating in the registration at any given time. The shadows made this problem especially challenging since no single image contained the entire structure. The group-wise registration proved to be key in generating an accurate reconstruction in this instance. In order to quantify the accuracy, the area of the original structure was compared to the area of the reconstructions. The original structure had an area of 71.95 cm2 , the proposed method’s reconstruction had an area of 74.71 cm2 , and finally the pair-wise based method’s had an area of 102.92 cm2 . We believe that group-wise registration is a key component of any ultrasound mosaicing technique where shadows may impact accuracy. 6. Validation of group-wise registration algorithm using the abdominal trauma phantom, FAST/ER FAN This section describes the procedure that we have used to measure the effectiveness of the group-wise registration algorithm on ultrasound data acquired from an abdominal phantom. An abdominal ultrasound phantom is ideal for this experiment because the internal anatomy remains static between overlapping sweeps, thus the

116

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124 Number of overlapping volumes per voxel in transverse slice of FASTFAN: 5 Vols.

Mosaicing of deformed FASTFAN data without group-wise registration Slices from 5 overlapping US volumes:

4 Vols. 3 Vols.

Dashed arrows denote random deformation using thin plate splines 2 Vols. 1 Vol. 0 Vols

Fig. 7. Degree of overlap increases as we move toward center mass. An intelligent group-wise registration algorithm designed for ultrasound mosaicing should be able to deal with varying degrees of overlap.

Slices of mosaiced volume showing significant misalignment at seams:

Table 1 Thin plate spline deformation field statistics.

Max disp. (cm) Mean disp. (cm)

Volume 1

Volume 2

Volume 4

Volume 5

1.7446 0.6828

1.9888 0.6955

1.9430 0.6792

2.2609 0.6737

Intensity differences between original slices and TPS deformed slices:

non-rigid misalignment between volumes can be more easily controlled. Also, visualizing registration improvement is more difficult with live subjects due to artifacts caused by respiration, muscle tension, or fetal movement between sweeps, which are all uncontrollable. After acquisition we applied an additional large non-rigid deformation to each volume in order to make the experiment more realistic. This is necessary because scanning a phantom does not simulate the motion artifacts caused by a patient’s movement/breathing and the deformation resulting from the non-constant pressure of the transducer against the skin. The abdominal phantom used in this section was borrowed from the Kyoto Kagaku Co. and was designed to provide training in the FAST (Focused Assessment with Sonography for Trauma) procedure. It contains various internal injuries that are detectable via ultrasound, such as the presence of free intra-peritoneal or pericardial fluid in traumatic patients. The Philips IU22 Ultrasound system was used in combination with a C5-1 convex array transducer, on which an Ascension Technology trakSTAR 6 DoF position sensor was mounted. The video output of the scanner was connected to a laptop running Stradwin software, which was developed by Treece et al. (2003). Sonographers at UMass Medical School performed the scanning for us and produced five partially overlapping volumes using swept 3D ultrasound. These volumes encompassed the majority of the phantom. Fig. 7 shows the degree of overlap at each voxel location in the composite volume. In the outer regions of the composite, where only 2 volumes overlap, a pair-wise similarity metric is used; however, we see that as many as 5 volumes overlap at each voxel close to the center. Non-rigid motion between overlapping volumes was simulated by applying random deformation fields to 4 of the 5 volumes using thin plate splines, as shown in Bookstein (1989). Each volume was deformed independently of the others by randomly positioning approximately 75 control points inside each and subsequently assigning random displacement values to them using a uniform distribution. Table 1 provides some statistics on the each volume’s unique displacement fields. It should also be noted that the depth of the C5-1 transducer was set to 16 cm. Most common ultrasound mosaicing tools, such as Stradwin, generate the composite volume by defining planar boundaries between overlapping volumes. We chose this seam selection method to validate the group-wise registration algorithm due to its popularity and also due to the fact that planar boundaries are unaware of the image misalignment through which they pass, thus the improvement in the composite volume before and after registration can be substantial. Fig. 8 shows the procedure used to generate the initial composite volume before our algorithm was applied. The top row of

Vol. 1

Vol. 2

Vol. 4

Vol. 5

Fig. 8. Diagram showing the procedure used to deform 4 of 5 overlapping volumes acquired from the FAST/ER FAN.

images in Fig. 8 are slices from the 5 original volumes that were obtained using swept 3D ultrasound. Dashed arrows denote the application of the random non-rigid transformations and the second row of images show how the original slices are deformed. Thin plate splines are popular because they have a closed-form solution, the interpolation is smooth, and there are no free parameters to adjust. Since 4 out of 5 volumes are randomly deformed, the net effects of the transformations in the overlapping areas are considerable because each adds to the misalignment. Two slices from the composite volume, which was produced using planar seams, are shown in the 3rd row of Fig. 8. Significant misalignment between volumes is obvious and will need to be corrected before the image data is able to be used in an ultrasound simulator. Instead of sequentially performing pair-wise nonrigid registration, and thus inefficiently growing the composite result one additional volume at a time, our technique simultaneously aligns all 5 overlapping volumes from the abdominal FAST/ER FAN at once, requiring only the single energy function defined in (19). The last row in Fig. 8 qualitatively shows the effect of the random transformations on each original volume. The difference images between each original volume and its deformed version are displayed. Major abdominal structures are significantly displaced between original and deformed volumes. Next we will describe how the various parameters of our algorithm were set for the validation experiment. We used a multiresolution approach based on a Gaussian pyramid with 3 levels. Each subsequent level was produced by convolving the volume with a 3D 1 [1 4 6 4 1] and then version of the typical Gaussian kernel w = 16

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

Mosaic before group-wise registration:

Mosaic after group-wise registration:

Group-wise difference before registration:

Group-wise difference after registration:

117

Mosaic before group-wise registration: Mosaic after group-wise registration:

Group-wise difference before registration:

Group-wise difference after registration:

Final mosaic with seam mask overlaid on top:

Final mosaic with seam mask overlaid on top:

Fig. 9. Result of registration showing improvement in slice orthogonal to planes displayed in Figs. 10 and 11.

down-sampling by a factor of 2, as described in Burt and Adelson (1983). Because the C5-1 transducer can produce sector ultrasound images with a wide angle, we limited the potential interactions by removing pair-wise terms from (19) if they corresponded to a situation where source indexes i, j were farther than 2 volumes apart. When using planar seams we found that restricting the registration for each region by only including adjacent volumes in the optimization produced excellent results and made the registration energy much more efficient to optimize. Every volume is still linked together in this framework but we need fewer terms in the MRF. Since ultrasound image quality is not on par with MR or CT and small structures are hard to identify, we believe the additional MRF terms would not noticeably improve the final composite image. We performed 6 iterations at the lowest resolution, where the original volumes were down-sampled by a factor of 4, then 8 iterations at the middle resolution and finally a single iteration at the original resolution where each voxel was approximately 0.48 mm3 . The regularization parameter in (19), denoted by lambda, was allowed to relax during optimization. This ensures that the initial iterations of the registration algorithm result in more rigid transformations, which attempt to globally align the overlapping volumes, before allowing more fluid-like transformations. We produced qualitative and quantitative results using the experimental procedure described above. Figs. 9 through 11 show improvement after registration and we see that the misalignment between structures spanning more than one image volume is almost entirely eliminated. Also shown in each figure is the corresponding planar seam mask, which identifies the source that each voxel in the slice gets its intensity value from. The next result we present quantifies the intensity differences between overlapping volumes in each region of the composite which has been designated by the planar seam mask. Referencing the seam mask in Fig. 10, for each region labeled as volume n we use its adjacent volumes n − 1 and n + 1 to calculate an error measure as follows:

DIFF (x ) =

 1  I(x)+1 (x ) − I(x) (x ) 3  +I(x) (x ) − I(x)−1 (x ) 



+ I(x)+1 (x ) − I(x)−1 (x ) ,

(29)

Fig. 10. Result of registration showing improvement in coronal slice of mosaiced volume.

Mosaic before group-wise registration:

Mosaic after group-wise registration:

Group-wise difference before registration:

Group-wise difference after registration:

Final mosaic with seam mask overlaid on top:

Fig. 11. Result of registration showing improvement in slice orthogonal to planes displayed in Figs. 9 and 10.

118

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124 Table 2 Performance increase realized by computing similarity metric in frequency domain. Method of comp.

Exec. time (s)

Speedup

Naive FFT

16040 1556

1.0 10.3

where  (x) is the mosaicing function described previously. Care must be taken when evaluating (29) for voxels that belong to the outer regions of the composite volume. In that case only a single term from (29) contributes to the difference measure since that region only has one adjacent volume. The error measure described by (29) simply adds the difference errors between all possible combinations of adjacent overlapping volumes at a specific voxel location in the composite volume. The result of (29) before and after group-wise registration is presented in Figs. 9 through 11. The improvement is substantial with much of the difference error being removed. This image gives a better visualization of our algorithm’s ability to correct misalignment between multiple (in our case 5) volumes when compared to the mosaiced results because we are not restricted to evaluating the error at just the boundary between adjacent volumes. Figs. 9 through 11 show that the anatomical structures were well aligned after registration, which is the most important aspect in ultrasound mosaicing. We should note that incomplete structures residing on a volume’s edge posed some difficulty. This can be attributed to the large deformation we applied after acquisition, which substantially changed the degree of overlap in some cases. Remaining errors are also due to the variation in the position/orientation of the transducer between collected volumes. This resulted in differing image intensities for the same structure. Despite these difficulties, high quality mosaicing results can be easily produced using clinical ultrasound data from live patients, and our experiments using fetal data show this. In order to understand the size of the problem we tracked the number of pair-wise energy terms needed for each level of the Gaussian pyramid. At every pyramid level in the FAST/ER FAN experiment the labeled nodes defined 5 non-rigid transformations, 1 corresponding to each source. The lowest resolution required approximately 9000 pair-wise energy terms in order to evaluate the group-wise similarity metric, the second lowest required 69,000, and the highest resolution needed 281,000. We should note that pre-computation using the FFT method at the highest resolution required (173 search window size )(281000 pair-wise energy terms ) (2 bytes per uint ) = 2.761 GB to hold the SHD data from (19). We also found that optimization at the highest resolution did not offer substantial improvement over registration at 1/2 resolution, as the relatively sizable structures were pretty well aligned by this point. However, ultrasound images containing finer features would certainly benefit from full resolution registration. The FAST/ER FAN data was also used to quantify the performance increase that we achieved by pre-computing the pair-wise similarity terms in (15) using the FFT approach. In order to strictly measure the improvement in optimization speed, the interpolation time was not included in the timing results. Both the FFT and the naive approach were implemented as Matlab functions and care was taken to vectorize the naive approach, utilizing Matlab’s strengths where possible. This ensured that our comparison was not unfairly biased. At the middle resolution of our pyramid scheme we measured the amount of time it took to complete an iteration of alpha expansion, which consisted of a single sweep over the label set. Alpha expansion required 1556 s using the FFT based approach and 16040 s using the naive approach, giving a speedup factor of 10.3. Most alpha expansion schemes use multiple passes over the label set and because the 1556 s already includes the pre-processing time, the speedup fac-

Fig. 12. Illustration of the two possible scan paths along the abdomen which can be followed to obtain individual ultrasound volumes from a pregnant subject using freehand 3D techniques.

tor between the FFT and naive approaches would only increase as the number of passes increased. We conclude that the FFT approach is the most efficient method to compute the group-wise similarity metric in (19) using an MRF optimization scheme. These results are summarized in Table 2. 7. Results from clinical ultrasound In order to validate our approaches we used clinical ultrasound data obtained from obstetrics patients at the University of Massachusetts Medical School (UMMS). With the same set-up that was used to scan the FAST/ER FAN, we collected individual volumes from the patients along multiple overlapping linear scan paths. Based on the visual assessments conducted by the obstetrics clinicians at UMMS, we found that the best stitching results were achieved using overlapping volumes produced from 2D ultrasound images acquired of mother’s sagittal plane. Fig. 12 refers to this as the primary probe orientation; however, acceptable volumes were also produced using the secondary probe orientation. The first step after 3D image acquisition is to rigidly register the multiple volumes together in order to remove global misalignments caused by the shifting of anatomical structures between scans; such shifts can be attributed to breathing, variation in muscle tone, fetal movements and non-uniform probe pressure. The rigid registration can be accomplished using the optimized modular algorithms found in the Insight Toolkit, developed by Ibanez et al. (2005). Shadow maps were incorporated into the pair-wise rigid-registration step as well. We found this was necessary for mosaicing clinical obstetrics data because the occlusion artifacts were especially strong. It should be noted that the FAST/ER FAN dataset did not require this to achieve good results. The center-most volume was chosen to be fixed during rigid registration. Now we will present results from the complete abdominal reconstruction of three pregnant patients who were 26–30 weeks along in their pregnancy. During clinical scanning at UMass Medical School 10+ individual image volumes were acquired from each subject, encompassing the entire fetus and placenta. We collected such a large number of scans because swept 2D ultrasound was used to generate the volumes and is very sensitive to fetal movement. During construction of the composite volume some of the individual volumes were found to be unusable making redundant scanning of the subject necessary. Any sudden movements by the baby would ruin an active scan. Each fetal volume was constructed using 4–5 overlapping source volumes. The number of sources used depended on the size of the mother’s abdomen and the sonographer who was conducting the

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

Fig. 13. Re-slices of composite volume created from Subject 1. Rightmost images show how final volume is partitioned into regions corresponding to overlapping source volumes.

scan. Since we worked in a clinical environment we had the hospital sonographers acquire the volumes to our specifications. The spatial extent of the sources is automatically regulated by the mosaicing function, which ensures that the composite volume covers the full range of the imaged region. The individual volumes were approximately 16 × 30 × 27 cm3 and comprised of cubic voxels where each dimension was 0.488 mm. The dimensions of the control point grid for an individual volume in the lowest level of the Gaussian pyramid, which had been down-sampled by a factor of 4, was typically around 15 × 25 × 20. In our hierarchal approach the resolution of this grid is increased by a factor of 2 prior to registration at the next pyramid level. Figs 13–15 show slices of the composite volume created from each of these subjects. The overall image quality is very good and the slices show anatomical details which would not be visible in a single scan. For example, at the top of Fig. 14 amniotic fluid is visible on both sides of the fetus which typically does not occur during clinical scanning due to the shadowing effects of obstetric ultrasound. Also visible are details of the baby’s limbs and spine which are hard to capture simultaneously. The mosaicing function was chosen to construct the composite volume from the registered source volumes as opposed to a voxel-wise weighting scheme because the overlapping source volumes were much too different in regions far away from the optimized seams. Again, this is mostly due to fetal movement and shadowing. Correcting the misalignment between volumes caused by the baby shifting its limbs is more difficult than correcting for pre-

119

Fig. 14. Re-slices of composite volume created from Subject 2. Rightmost images show how final volume is partitioned into regions corresponding to overlapping source volumes.

dictable movement such as elastic deformation or a heartbeat, which is why the mosaicing function is used. We do not require the limbs in all source volumes to be aligned, only the limbs in the sources specifically used to construct this region of the composite volume. The right side of each figure demonstrates how the mosaicing function partitions the composite volume into regions corresponding to each of the overlapping sources. By looking at the image continuity in Figs. 13–15 we see that stitching algorithm is effective at producing visually satisfying results from multiple partially overlapping ultrasound volumes. Our primary purpose for developing the algorithms explained in this paper is to produce anatomically correct abdominal ultrasound volumes for use in obstetrics ultrasound simulators. Clinicians can be expected to detect fetal position and placental position, as well as measure the abdominal circumference, femur length, biparietal index, and amniotic fluid index among others. When we refer to the anatomical correctness in this context we are referring to the capability to collect these measurements accurately. Fetal movement was the main contributor to misalignment in our application thus we desired a quick scan procedure, resulting in less sweeps and minimal overlap. The anatomical correctness of the volumes we generated was determined by the sonographers at UMMS. In order to evaluate the training value of each composite volume we loaded them into our laptop based ultrasound simulator by Pedersen and Skehan (2012), Liu et al. (2014) and encouraged the sonographers at UMass Med. School to compare the simulator experience to that of scanning a live patient. Our simulator allows the user to re-slice the

120

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

appeared seamless. Performance was further enhanced by splitting the solution space into distinct regions, exploring them in parallel, and then fusing the results. The framework developed here is modular, which allows improvements in optimization or similarity measure to be easily integrated into the algorithm. As more efficient, higher order MRF optimization methods are developed their speed/accuracy should be tested on the registration energy in (27). Improved MRF optimization could enable the use of more sophisticated registration models, such as those including both rigid and elastic components. Joint registration/segmentation would also be interesting to explore in this framework. Finally, work toward efficient pre-computation of more sophisticated group-wise similarity metrics should be completed. The qualitative evaluation by sonographers at UMass speaks to the ability of our proposed technique to overcome the significant challenges, to include large deformations and shadow artifacts, found in ultrasound mosaicing. The results suggest that intra-volume misalignment of many (≥2) partially overlapping ultrasound volumes can be effectively corrected using an intuitive group-wise MRF approach. The most important factor to consider when constructing ultrasound mosaics is how well major anatomical structures align when spanning more than one volume. With the techniques for producing composite image volumes in place, additional simulator development work includes the addition of landmarks that will be used in educational modules providing training on the acquisition of common fetal measurements, which is a task clinicians must be proficient in. Also, an evaluation of the simulator’s training benefit will be conducted by incorporating it into the curriculum of a small number of medical students. Fig. 15. Re-slices of composite volume created from Subject 3. Rightmost images show how final volume is partitioned into regions corresponding to overlapping source volumes.

composite volume in any orientation/position by utilizing a sham transducer with a 5 degree of freedom tracking system. From an initial evaluation the sonographers were impressed with the quality and realism that our composite volumes added to the training experience. The main issue was a slight blurring in some of the re-slices that were used to calculate certain fetal measurements such as abdominal circumference. Scanning procedures focusing on fetal development are very important for obstetrics sonographers to master and improving the effectiveness of the ultrasound training simulator can lessen the learning curve. To this end, careful selection of image volumes ensures the highest quality composite volume is constructed.

Appendix A. Efficient calculation of similarity metrics in the frequency domain This section describes the frequency domain calculation of the similarity metrics in detail. In our multi-resolution framework, where a 3 level Gaussian pyramid was constructed prior to registration, we found that the FFT formulation was always preferred over the naive formulation for reasonable sized blocks (15 × 15 × 15, 19 × 19 × 19). The control-point spacing was 8 voxels and the block-size used was 15 × 15 × 15 for each level of the pyramid, thus the only difference between the lower and higher resolutions was the additional terms in the MRF expression. Performance gains found in the frequency domain were carried across all pyramid resolutions.

8. Conclusions A.1. Group-wise SSD metric calculation The novel algorithm we presented in this paper essentially requires two MRF optimizations to mosaic N partially overlapping ultrasound volumes. The first optimization determines how the volumes should be stitched together by attempting to minimize the intensity/gradient differences between adjacent volumes. This could be used as preprocessing step to more sophisticated compounding techniques if desired. The second optimization determines the N deformation fields that bring the overlapping volumes into alignment in the neighborhoods of the stitching boundaries. Pre-computation of the similarity metric using FFTs and a focused registration energy resulted in a very efficient mosaicing algorithm. Since a mosaicing function is used to produce the composite volume we argue that the alignment in the region adjacent to the seam is the most important when considering the visual quality of the final product. Also, when the explicit link between two volumes has been severed by cropping their similarity terms in the MRF, these volumes are still indirectly linked together through a chain of remaining terms. In our experiments ς was typically set to 5 cm, which still provided a large region where the similarity metric was calculated. In this case, even though we are cropping a large number of MRF terms, the final composite volume

Our goal in this section is to introduce a computationally efficient way to evaluate (12), utilizing some properties of the Discrete Fast Fourier Transformation (FFT). The number of block matching operations required for a single iteration of the group-wise registration algorithm for multiple (|V | > 2) full resolution volumes can be calcu-

 

lated as 2|C | |V2 | , where the last term is a binomial coefficient if we

assume that all volumes overlap completely. Because the number of times this calculation is performed grows non-linearly with respect the number of volumes, we found that (12) may contain more than 105 three-dimensional block matching operations when |V | > 3. For example, in a registration problem with five completely overlapping volumes and an 18 × 18 × 18 control point grid the number of block

 

matching operations is found to be 2|C | |V2 | = 2(183 )10 = 116640. This makes it a serious computational bottleneck. The method of calculation will be shown in the context of a single block-matching operation, and simple bookkeeping will be required to fully evaluate all the block-matching terms of (12). The full calculation of the group-wise metric includes the two outer summations, where all

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

overlapping volume pairs are considered. Also, care must be taken in the instance where the two volumes fail to overlap at cb . One can naively break down (13) into two SSD calculations and assume that it will take 6 FFT operations to calculate; however, this can be reduced to four FFT operations, three forward and one inverse. Expanding equation (13) results in the following summation:

SSDi,b j (τ ) =



I j,b (x )2 − 2I j,b (x )Ii (x + cb + τ )2 + Ii (x + cb + τ )2

x∈B

+ Ii,b (x )2 − 2Ii,b (x )I j (x + cb − τ )2 + I j (x + cb − τ )2 . (A.1) The goal will be to write this expression as the result of two correlation operations between complex functions to be defined shortly. The correlations can be performed in the frequency domain using the circular cross-correlation theorem

(x ⊗ y )n =

N−1 

DFT

¯ ¯ x[m] y [m + n] ↔ X[k]Y [k].

(A.2)

m=0 

Let I j (x ) = I j (cb − x ) be a time reversed and translated version of Ij which centers the region surrounding control point cb at the origin.   Also, let Ii (x ) = Ii (x + cb ) and Ii,b (x ) = Ii,b (−x ). Let us define complex functions f, g, h, m as follows: 



f = Ii (x ) − jI j (x )



g=







if x ∈ B

0

otherwise 

1 if x ∈ B

0

otherwise

g= .

(A.3)

h=

Optimization with (A.4) is equivalent to the process in (13) when applying the functions defined in (A.3),



SSDi,b j (τ ) = Real



  g( x ) f ( x + τ ) + m(x )h(x + τ ).

x∈B





(A.5)

Because the image block is initially centered in the search window before the FFT is taken, and due to the effects of the circular correlation, the results from (A.5) must be rearranged before being assigned i, j to SSDb (τ ). Using the example where h = 8 we find that the size of

(τ ) is 31 × 31 × 31 after computation of (A.5). Care must be taken to extract the correct 17 × 17 × 17 region from this result because it will be stored in memory for the duration of the registration and referenced by the graph based optimization algorithm multiple i, j times during its progression, looking up the value SSDb (τ ) for pari, j SSDb

ticular i, j, b, τ values. Pre-computing SSDb (τ ) for each valid i, j, b, τ i, j

if x ∈ B

0

otherwise

Mi,Sb (x ) + jIi,Sb (x )

if x ∈ S

0

otherwise



I i,b (x ) − 2 jI i,b (x )

if x ∈ B

0

otherwise

M

k=

2



j,Sb



(x ) + jI



j,Sb

(x )

if x ∈ S

0

otherwise



M j,b (x ) + jM i,b (x )

m=

x∈B

Real IFFT F (u )G¯ (u ) + H (u )M¯ (u ) .

I j,b (x )2 − 2 jI j,b (x )



(A.4)

Due to the fixed displacement window, τ is known to be limited to {τ|τ ∈ Z3 ∧ −h  τn  h} thus the search window for each block matching operation can be defined as S = {x|x ∈ Z3 ∧ −2h < xi < 2h}. Next, we calculate DFTs which match the size of S. Because we are performing a circular correlation the function g is padded with zeros to match the size of f which is associated with the search window. For example, if the control point spacing h is set to eight voxels, then the DFTs will be 31 × 31 × 31 due to the definition of the search window S. In this case the final dimensions i, j of SSDb (τ ) should be 17 × 17 × 17 in order to match the limits of τ . Note that the DFT of m only has to be computed once at the beginning of the registration procedure because it is a constant mask, thus its computational burden is negligible. Equation (A.4) can be computed for each block using three forward FFTs and one inverse FFT as



Using the methodology presented in the previous section we can efficiently calculate all possible values of (15) for varying τ by employing multiple FFT operations. In order to simplify the presentation and provide equations which are easily translated into code, the notation Ii,Sb (x ) will be used to designate the search window around a control point cb ; thus, Ii,Sb (x ) = Ii (x + cb ) where x ∈ S. Let us define the following complex functions f, g, h, k, m, n which will be used in i, j the calculation of SHDb ,



2



m=

A.2. Improved shadow-aware metric calculation

f =

h = I j (x ) + Ii (x ) 2

combination using the efficient FFT technique described above generates all the data terms required by the graph-based optimization algorithm during registration. It should be mentioned that padding the argument of the FFT so that its dimensions are powers of two will result in a speedup of the FFT algorithm. For example functions of size 32 × 32 × 32 were faster to transform than functions of size 31 × 31 × 31. The reduction in execution time is not only due to the arithmetic gains by doing the computation in the frequency domain, but also due to the highly optimized FFT libraries available. Special low-level instructions and architectural specific speed-ups can greatly improve performance. Furthermore GPUs are highly suited for FFT computation.



−2 I j,b (x ) + jIi,b (x )



121



if x ∈ B

0



otherwise

Ii,Sb (x ) + jI

n=

2



j,Sb

(x )

2

if x ∈ S

0

otherwise

,

and also define the functions s, t to be used in the calculation of i, j OVRb ,



s=

M j,b (x ) + jM i,b (x )

if x ∈ B

0

otherwise

 t =

Mi,Sb (x ) + jM



j,Sb

(x )

0

if x ∈ S otherwise

,

where the prime notation indicates a time reversal here, i.e., I i,b (x ) = Ii,b (−x ). Using these functions we see that the following equations are equivalent to those presented in (15),

 OVRi,b j (τ ) = Real



 s¯(x )t (x + τ )

x∈B

SHDi,b j

(τ ) =

1 OVRi,b j

(τ )

 Real

 x∈B

f¯(x )g(x + τ )



¯ (x )n(x + τ ) . + h¯ (x )k(x + τ ) + m

(A.6)

These correlation operations can be written in the frequency domain using the cross-correlation theorem, thus the final similarity measure

122

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

is calculated using

OVRi,b j



1

Real(IDFT[F¯ (u )G(u )

puting sub-optimal solutions in parallel, one can think of other registration applications for this algorithm. For example, one may find that improved registration may result from the fusion of solutions calculated using different similarity measures or regularization techniques.

 (τ ) = Real IDFT S¯(u )T (u )

SHDi,b j (τ ) =



OVRi,b j (τ )

+ H¯ (u )K (u ) + M¯ (u )N (u )] ).

(A.7)

With this approach we achieved a speedup factor of 10.31 over the naive implementation, which is a substantial improvement. Our results suggest that an efficient group-wise similarity measure, which is robust to ultrasound shadowing, can be easily implemented using FFT operations. To our knowledge this is the first attempt at a groupwise metric for ultrasound mosaicing which accounts for shadowing artifacts such as those commonly experienced in a clinical setting. Appendix B. Parallel alpha expansion using a fusion approach B1. Implementation details This section will describe the details of the parallelized alpha expansion used to calculate the registration solution. The fusion process is discussed, which combines a set of registration solutions computed in parallel. In the three-dimension application of ultrasound mosaicing the cubic search window is separated into 8 distinct regions corresponding to the 8 corners of the cube. There is very little overhead associated with the fusion process since the label space is so large. For example, each fusion step requires the BHS algorithm to run once, whereas computing an initial solution using alpha-expansion with a limited displacement label set requires 92 BHS executions if |W | = 9 3 . |C | |C | Consider two registration solutions d0 ∈ W0 and d1 ∈ W1 , where |C | is simply the number of control points formally stated as the cardinality of the set C. The solutions are referred to as labellings where the subscript indicates the region of the displacement window W that was considered during their calculation. Also note that they are vectors whose lengths are equal to the number of control points. The goal of the fusion step is to determine a composite labeling dc where the displacement vector of each node must come from either d0 or d1 . Thus dc ∈ (W0 ∪ W1 )|C | and can be expressed as a linear combination of d0 and d1 using a binary vector y ∈ {0, 1}|C | as dc (y ) = d0 • (1 − y ) + d1 • y, where the multiplication is done element-wise. The binary vector y has an element associated with every displacement node and can be indexed using the same notation that was used to index the displacement vector d. Also, all inter and intra volume nodal relationships or cliques are identical in the new MRF problem. Using this construction for dc the fusion of two registration solutions can be written as a non-submodular binary-labeled MRF which can be solved using the BHS algorithm discussed earlier. The new binary optimization problem is shown below,

E (dc (y )) =



y

{yi,b ,y j,b }

Split displacement window W into W1 . . . W8 Calculate  all block-based similarity  terms d1 ← (0, 0, 0 ) , . . . , (0, 0, 0 ) for several sweeps do d2 . . . d8 ← d1 parfor i ∈ (1 . . . 8 ) do for α ∈ Wi do di ← di  α end for end parfor Wait for all threads d1 ← d1  d2 // 1st level of fusion d3 ← d 3  d 4 d5 ← d5  d6 d7 ← d7  d8 d1 ← d1  d3 // 2nd level of fusion d5 ← d 5  d 7 d1 ← d1  d5 // Final result end for return d1

B2. Performance measurement using the Visible Human Female anatomical dataset In order to evaluate the optimization method we constructed a finite element model (FEM) using the Visible Human Female dataset then subsequently deformed it with varying pressure applied in order to produce three overlapping and uniquely deformed image volumes. Although these images are visually different from ultrasound and do not suffer from the same artifacts, this framework provides a way to quantify the performance of the mosaicing/registration algorithm on multiple overlapping 3D volumes by using the

y

SHDi,b j (di,bi,b − d j,bj,b )

∈NInter

+

Algorithm 1 Parallelized alpha-expansion for MRF group-wise registration.



{yi,a ,yi,b ,yi,c }

 y   d i,a − 2dyi,b + dyi,c   i,a i,c  i,b   h2  

(B.1)

∈NIntra

The optimization of the fusion energy in (B.1) produces a binary solution vector y which equals 0 at nodes that are designated to use the displacement value from d0 and 1 at nodes that are designated to use the displacement value from d1 . The binary vector y is used to generate the improved composite solution dc . The procedure described above is known as fusion and will be designated by the operator . The pseudo-code for registration energy optimization is shown in algorithm 1. Although we used the fusion technique in order to improve the efficiency of our group-wise registration algorithm by com-

Fig. B.16. VHF stitching results showing the qualitative performance of our algorithm. The left half shows slices from the mosaiced volume when no registration is performed. The right half demonstrates how our algorithm can reconstruct the anatomy seamlessly, using multiple partially overlapping volumes. All images were formed using  (x).

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

123

(a)

(b)

(c)

(d)

Fig. B.17. Vector magnitude of misalignment in overlapping regions of Visible Human Female FEM before and after group-wise registration. (a) Axial slice showing error before registration. (b) Axial slice showing error after registration. (c) Coronal slice showing error before registration. (d) Coronal slice showing error after registration. Error is shown in mm.

Table B.3 Visible Human Female mosaicing accuracy/speed.

References

# Processes

Mean error (mm)

Std. dev. (mm)

Exec. time (s)

Speedup

1 4 8

0.8067 0.8278 0.7271

0.8566 0.6035 0.4585

3176.0 1470.0 811.6

1.00 2.16 3.91

deformation fields provided by the finite element simulations as the gold standard. The qualitative results before and after mosaicing can be seen in Fig. B.16. The voxel-wise registration error between overlapping volumes in the final mosaic is also displayed in Fig. B.17. The accuracy/performance results are compiled in Table B.3. The mean error is given in column 2 and the standard deviation of this error is given in column 3. Splitting the label space W into one, four, or eight partitions and then running parallel registration optimizations using the smaller sets still produced highly accurate mosaics of the finite element model and the speedup was impressive. The registration time in Table B.3 includes the processing time to build the similarity metric data-structure, which is done before any MRF optimization, and thus could be shortened by increased parallelization of that calculation. The number of processes in Table B.3 is equal to the number of partitions created from W. Varying the level of parallelization did not affect the visual quality of the final mosaics although there was a slight increase in accuracy when the registration was split between 8 processes. We have no definitive answer for this phenomenon and it is most likely related to the way the solution space was explored in the case of this particular experiment. The takeaway is that it is possible to split the registration between separate processes and still generate high quality mosaics.

Ackerman, M.J., Spitzer, V.M., Scherzinger, A.L., Whitlock, D.G., 1995. The Visible Human data set: an image resource for anatomical visualization. Medinfo 8 Pt 2, 1195– 1198. Andriy, M., 2010. Non-rigid Image Registration: Regularization, Algorithms and Applications. Ph.D. thesis. Oregon Health & Science University. Bookstein, F.L., 1989. Principal warps: thin-plate splines and the decomposition of deformations. IEEE Trans. Pattern Anal. Mach. intell. 11 (6), 567–585. Burt, P.J., Adelson, E.H., 1983. The Laplacian pyramid as a compact image code. IEEE Trans. Commun. 31 (4), 532–540. Essannouni, F., Haj Thami, R.O., Aboutajdine, D., Salam, A., 2007. Simple noncircular correlation method for exhaustive sum square difference matching. Opt. Eng. 46 (10), 107004–107004–4. Glocker, B., Komodakis, N., Tziritas, G., Navab, N., Paragios, N., 2008. Dense image registration through MRFS and efficient linear programming. Med. Image Anal. 12 (6), 731–741. Glocker, B., Sotiras, A., Komodakis, N., Paragios, N., 2011. Deformable medical image registration: setting the state of the art with discrete methods∗ . Annu. Rev. Biomed. Eng. 13 (1), 219–244. Hellier, P., Coup, P., Morandi, X., Collins, D.L., 2010. An automatic geometrical and statistical method to detect acoustic shadows in intraoperative ultrasound brain images. Med. Image Anal. 14 (2), 195–204. Ibanez, L., Schroeder, W., Ng, L., Cates, J., 2005. The ITK Software Guide. Kitware, Inc. ISBN 1-930934-15-7. second edition. http://www.itk.org/ItkSoftwareGuide.pdf. Karamalis, A., Wein, W., Klein, T., Navab, N., 2012. Ultrasound confidence maps using random walks. Med. Image Anal. 16 (6), 1101–1112. Kolmogorov, V., Rother, C., 2007. Minimizing nonsubmodular functions with graph cuts – a review. IEEE Trans. Pattern Anal. Mach. Intell. 29 (7), 1274– 1279. Kutter, O., Shams, R., Navab, N., 2009a. Visualization and GPU-accelerated simulation of medical ultrasound from ct images. Comput. Methods Prog. Biomed. 94 (3), 250– 266. Kutter, O., Wein, W., Navab, N., 2009b. Multi-modal registration based ultrasound mosaicing. In: Medical Image Computing and Computer-Assisted Intervention– MICCAI 2009. Springer, pp. 763–770. Kwon, D., Lee, K., Yun, I., Lee, S., 2008. Nonrigid image registration using dynamic higher-order MRF model. In: Forsyth, D., Torr, P., Zisserman, A. (Eds.), Computer Vision ECCV 2008. In: Lecture Notes in Computer Science, 5302. Springer Berlin Heidelberg, pp. 373–386.

124

J. Kutarnia, P. Pedersen / Medical Image Analysis 24 (2015) 106–124

Kwon, D., Yun, I.D., Pohl, K., Davatzikos, C., Lee, S.U., 2012. Nonrigid volume registration using second-order MRF model. In: 2012 9th IEEE International Symposium on Biomedical Imaging (ISBI), pp. 708–711. Lempitsky, V., Rother, C., Roth, S., Blake, A., 2010. Fusion moves for Markov random field optimization. IEEE Trans. Pattern Anal. Mach. Intell. 32 (8), 1392– 1405. Liu, L., Kutarnia, J., Pedersen, P.C., Belady, P., 2014. Personal training simulator for asynchronous learning of obstetric ultrasound. Medicine Meets Virtual Reality 21: NextMed/MMVR21 196, 252–258. Mellor, M., Brady, M., 2005. Phase mutual information as a similarity measure for registration. Med. Image Anal. 9 (4), 330–343. Myronenko, A., Song, X.B., 2010. Intensity-based image registration by minimizing residual complexity. IEEE Trans. Med. Imaging 29 (11), 1882–1891. Ni, D., Chui, Y.-P., Qu, Y., Yang, X.S., Qin, J., Wong, T.-T., Ho, S.S.H., Heng, P.-A., 2009. Reconstruction of volumetric ultrasound panorama based on improved 3D sift. Comp. Med. Imaging Graph. 33 (7), 559–566. Orchard, J., 2007. Efficient least squares multimodal registration with a globally exhaustive alignment search. IEEE Trans. Image Process. 16 (10), 2526–2534. Pedersen, P.C., Skehan, D., 2012. Personal low-cost ultrasound training system. Stud. Health Technol. Inform. 173, 344. Poon, T.C., Rohling, R.N., 2006. Three-dimensional extended field-of-view ultrasound. Ultrasound Med. Biol. 32 (3), 357–369. Rajpoot, K., Noble, J.A., Grau, V., Szmigielski, C., Becher, H., 2009. Multiview RT3D echocardiography image fusion. In: Functional Imaging and Modeling of the Heart. Springer, pp. 134–143. Roche, A., Malandain, G., Ayache, N., 2000. Unifying maximum likelihood approaches in medical image registration. Intl. J. Imaging Syst. Technol.: Spec. Issue 3D Imaging 11, 71–80.

Schneider, R.J., Perrin, D.P., Vasilyev, N.V., Marx, G.R., del Nido, P.J., Howe, R.D., 2012. Real-time image-based rigid registration of three-dimensional ultrasound. Med. Image Anal. 16 (2), 402–414. Shams, R., Hartley, R., Navab, N., 2008. Real-time simulation of medical ultrasound from CT images. In: Metaxas, D., Axel, L., Fichtinger, G., Szkely, G. (Eds.), Medical Image Computing and Computer-Assisted Intervention MICCAI 2008. In: Lecture Notes in Computer Science, 5242. Springer Berlin Heidelberg, pp. 734–741. So, R.W., Tang, T.W., Chung, A.C., 2011. Non-rigid image registration of brain magnetic resonance images using graph-cuts. Pattern Recognit. 44 (1011), 2450–2467. Sotiras, A., Komodakis, N., Glocker, B., Deux, J.-F., Paragios, N., 2009. Graphical models and deformable diffeomorphic population registration using global and local metrics. In: MICCAI (1), pp. 672–679. Treece, G.M., Gee, A.H., Prager, R.W., Cash, C.J., Berman, L.H., 2003. High-definition freehand 3-D ultrasound. Ultrasound Med. Biol. 29 (4), 529–546. Wachinger, C., 2011. Ultrasound Mosaicing and Motion Modeling – Applications in Medical Image Registration. University of Munich. Wachinger, C., Shams, R., Navab, N., 2008a. Estimation of acoustic impedance from multiple ultrasound images with application to spatial compounding. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008. CVPRW’08. IEEE, pp. 1–8. Wachinger, C., Wein, W., Navab, N., 2008a. Registration strategies and similarity measures for three-dimensional ultrasound mosaicing. Acad. Radiol. 15 (11), 1404– 1415. Zikic, D., Glocker, B., Kutter, O., Groher, M., Komodakis, N., Kamen, A., Paragios, N., Navab, N., 2010. Linear intensity-based image registration by Markov random fields and discrete optimization. Med. Image Anal. 14 (4), 550–562.

mosaicing with application to ultrasound.

In this paper we present a group-wise non-rigid registration/mosaicing algorithm based on block-matching, which is developed within a probabilistic fr...
6MB Sizes 0 Downloads 9 Views