Robust direction-of-arrival by matched-lags, applied to gunshots.

Izabela Lyon Freire: JASA Express Letters

[http://dx.doi.org/10.1121/1.4874223]

Published Online 13 May 2014

Robust direction-of-arrival by matched-lags, applied to gunshots Izabela Lyon Freire Program of Defense Engineering, Military Institute of Engineering, Praça General Tib urcio 80, Rio de Janeiro, 22290-270, Brazil [email protected]

Abstract: This work investigates the direction-of-arrival problem. A time-delay-estimate (TDE) obtained from a peak of a correlation function is subject to two types of error: type I, approximation errors, and type II, errors due to spurious signals. The iterative least-squares algorithm tentatively selects spatially coherent subsets of TDEs containing no type II errors and minor contributions of type I errors (“matchedlags”). Simulations use a seven-microphone array and a gunshot signal. The evaluation methodology is rigorous, comparing empirical distribution functions of estimation error of algorithms through two-sample, one-sided Kolmogorov–Smirnov tests, and quantifying differences with Cohen’s D. The direction-of-arrival estimate is improved, specifically at low signal-to-noise ratios. C 2014 Acoustical Society of America V

PACS numbers: 43.60.Jn, 43.60.Fg [CG] Date Received: December 12, 2013 Date Accepted: April 19, 2014

1. Introduction Among state-of-the-art solutions to the problem of the direction of arrival (DOA) estimation of a wave captured by a sensor array are those based on the time differences of arrival (TDOAs) of the wave at positions that are known to be occupied by the sensors. Each time delay estimate (TDE) is obtained as the time-argument that maximizes the generalized cross-correlation function (GCC) with phase transform weights (GCC-PHAT).1 Comparison of expected TDOAs to measured TDEs allows derivation of a DOA.2 TDEs derived by GCC-PHAT are subject to two types of errors: type I, approximation errors, and type II, errors caused by a possible peak of the correlation function due to spurious signals at one or both sensors. Errors of type II have been discussed in Ref. 3 and errors of type I in Refs. 3 and 4. A TDE subset selection principle is idealized: a DOA should be derived from a subset of TDEs with no type II errors and a minimum contribution of type I errors. In Ref. 5, a conceptually similar proposal was found, called the “matched-lag filter,” which constrains the signal related lags to have physically possible values. This letter proposes a first approach for amalgamation of the literature cited in this paragraph, and tentatively implements the principle of DOA estimation from a spatially coherent subset of TDEs derived by GCC-PHAT using an iterative least-squares (ILS) algorithm. Simulations inspired by the “sniper localization problem” use a sevenmicrophone array and vary the signal-to-noise ratio (SNR) of a gunshot signal, which is characteristically an impulsive, wideband sound. Three variations of this method are evaluated here: raw least-squares (RAW),6 weighted least-squares (WLS) with weights that were proposed specifically for DOA problems,3 and ILS. GCC-PHAT leastsquares (LS) methods are advocated for the estimation of DOA of impulsive signals.4 A new methodology for comparative analysis of DOA algorithms is proposed: the Kolmogorov–Smirnov statistical test7 compares cumulative distributions of error and Cohen’s D effect size measurement8 quantifies possible performance differences.

EL246 J. Acoust. Soc. Am. 135 (6), June 2014

C 2014 Acoustical Society of America V


[http://dx.doi.org/10.1121/1.4874223]


The ILS algorithm has been described before,9 however, without a deeper understanding of the reason why it performed well, and without statistical validation of results. The work in Ref. 9 was started with the specific objective of eliminating sensors that will eventually malfunction in an array. 2. Methods 2.1 Data creation Simulation data were created based on the paradigm of geometric acoustics,10 implemented by the authors of Ref. 11. An anechoic environment was adopted. The source was positioned at a random point, with an azimuth randomly taken from a uniform distribution in [0 , 360 ] and a zenith angle randomly taken from a uniform distribution in [45 , þ45 ]. The number of positions sampled is Ns ¼ 1024. An array of seven microphones was created, with sensor positions in Cartesian coordinates (in cm): [0, 0.4, 10], [3.45, 7, 6.9333], [8, 0, 7.1667], [5.25, 6.8333, 6.7667], [5.15, 6.5667, 6.6333], [10.2, 0.3, 6.5], [4.95, 6.5667, 6.7333]. This seven-microphone setup was inspired by the system described in Ref. 12. The sound speed was 350 m/s throughout the simulations. Noise was added to the signal of each sensor. The noise has a power spectral density (PSD) which approximates that of the open field environment in which the muzzleblast signal was recorded and was created by passing white noise through a 60-coefficient finite impulse response filter, obtained by Wiener filtering. The variance of the additive noise signal is such that a desired SNR is obtained, where SNRdB ¼ 10 log10 r2s =r2n . The SNR varies between 5 and 15 dB, in steps of 0.5 dB, for each sampled position of the source. A gunshot muzzleblast, shown in Fig. 1, was used as the source signal. It is a recording of a Browning Machine Gun, Cal. 0.50, M2, HB, taken at a distance of 236 m from the sniper. The signal was recorded at the Centro de Avaliaç~oes do Exercito (CAEx) in a grass-covered open field with a Behringer ECM8000 microphone, which covers frequencies up to 20 kHz. The sampling frequency is 96 kHz, which is well above the Nyquist rate. The duration of the source signal is 20 ms, and the simulated array recording has the same length, according to the default setting of the ISM_AUDIODATA function of the ISM toolbox.11 Because this function considers the impulse response between source and sensor, a delay was introduced before the simulated signal arrives at each sensor. This delay was kept and the source signal length was chosen so that the actual muzzleblast signal would be roughly centered in the array signal. The minimum length of the array window capable of containing the full muzzleblast in all sensor recordings would be the sum of the length of the actual muzzleblast signal and the maximum TDOA for the array, or approximately 4.5 ms overall. 2.2 Data analysis Let Ns be the number of DOAs that are sampled and the i index a random placement of the source, 1 i Ns . Let R represent the SNR obtained after addition of ambient noise to the signal. An estimate of DOA derived by algorithm A, where A 2 fRAW; WLS; ILSg, for

Fig. 1. The gunshot muzzleblast source signal used in simulations.

J. Acoust. Soc. Am. 135 (6), June 2014

Izabela Lyon Freire: Robust direction-of-arrival by matched-lags EL247


[http://dx.doi.org/10.1121/1.4874223]


source placement i and SNR R is denoted aÂði; RÞ. The original, simulated DOA is denoted aðiÞ. The DOA error is the three-dimensional angle between ai and aÂði; RÞ. Samples of size Ns of error for each algorithm at each SNR are thus obtained. From the samples of error, we derive the empirical cumulative distribution functions (CDFs) of error EA ðxÞ, with their corresponding true (but unknown) population CDFs FA ðxÞ. The empirical CDFs of error EA and EA0 of an ordered pair of algorithms A and A0 are compared by two-sample, one-sided Kolmogorov–Smirnov tests.7 In this test, the null hypothesis H0 states that CDFs of error are the same for the two algorithms compared, H0 : 8s;

FA ðxÞ ¼ FA0 ðxÞ:

(1)

The alternative hypothesis H1 in the one-sided test FA > FA states that H1 : 9sj FA ðxÞ > FA0 ðxÞ:

(2)

max ½EA ðsÞ EA0 ðsÞ:

(3)

The test statistic is

The performance difference between the two algorithms is quantified by Cohen’s D effect size measurement.8 The difference between means is given in units of pooled standard deviation. 3. Theory A plane wavefront propagating at sound speed c was synchronously recorded by an M-sensor array. Sensor positions were known, with sensor mi placed at position pi , 1 i M. The DOA of the wavefront, relative to the array, is represented by zenith angle h and azimuth / or, equivalently, by the unit vector ah;/ ¼ ½ sin hcos / sin hsin / cos hT . The TDOA spi pj ðaÞ (in units of time relative to direction a) between sensors mi and mj , considering Dpij ¼ ðpi pj Þ=c, is given by spi pj ðaÞ ¼ DpTij a;

(4)

while the corresponding TDE, ^s xi xj , is given by the time-argument maximizing the GCC-PHAT, ^r PHAT xi xj , between signal windows xi and xj . Simple measures that add robustness to this estimate are to search for the function maximum only within the range of physically possible delays smax ij , as in Ref. 3, and to interpolate the correlation function, as in Refs. 3 and 6, decreasing the frequency of occurrence of type II errors and the magnitude of type I errors. Pseudocode is listed in algorithm 1. Algorithm 1. GCC-PHAT and TDE, where F ð Þ is the discrete Fourier transform, j j is magnitude, and an asterisk denotes complex conjugate. The definition1 is given in a comment in line 2, and the pseudocode gives implementation details. The series ^r xi xj is interpreted circularly, starting at zero, defining the standard direction as positive lags (xi precedes xj ), from zero to the end, and backwards as negative lags. The TDE is ^r PHAT ^s xi xj ¼ arg maxs;jsjsmax xi xj ðsÞ, where the use of smax is advocated by the authors of Ref. 3. ij 1: function GCC-PHAT(xi , xj ) 2: R F fxi gF fxj g ä R is the cross-PSD; ^r PHAT is defined1 as F 1 fR=jRjg xi xj 3: R R= maxðjRjÞ ä normalization of R, preparing for the next instruction 4: RPHAT R=ðjRj þ Þ ä is used heuristically, to avoid division by zero 1 f PHAT g ä interpolation3,6 can increase precision of TDE 5: ^r PHAT F R xi xj PHAT 6: return ^r xi xj 7: end function end


Izabela Lyon Freire: Robust direction-of-arrival by matched-lags


[http://dx.doi.org/10.1121/1.4874223]


Let P be a set of pairs fi; jg of indexes. From P, the TDEs ^s xi ;xj are compared to the corresponding parameterized TDOAs spi pj ðaÞ, always taking i < j to avoid redundancy. The LS cost function nP ð^ a Þ is then defined as 1 X nP ðaÞ ¼ ð^s x x DpTij aÞ2 : (5) jPj fi;jg2P i j Then taking the gradient of Eq. (5) with respect to a and equating the result to zero, the minimum-cost a^PDOA is estimated as X 1 X ^s xi xj Dpij ; Dpij DpTij (6) a^PDOA ¼ fi;jg2P

fi;jg2P

where the RAW algorithm uses all available TDEs for estimation of a^h;/ , while ILS tries to derive a^h;/ from a spatially coherent subset of TDEs (matched-lags), iteratively eliminating the pair of sensors which most contributes to the cost function of Eq. (5) and then re-calculating the DOA of Eq. (6). The ILS algorithm applied to the DOA problem is detailed in the pseudocode algorithm 2. Nmin 3 non-coplanar pairs are necessary for a solution in threedimensional space, and the experiments reported in Sec. 4 use Nmin 5, a value which was derived considering that empirical gunshot audio data showed an increase in average n for solutions derived from less than five pairs of sensors.9 Algorithm 2. The ILS algorithm applied to the DOA problem. j j is the cardinality of a set and CNk is the number of combinations of N elements taken k by k. 1: function ILSðp1:M ; X1:M Þ 2 2: Let P be the set of all possible sensor pairs ä here, jPj ¼ CM 3: while jPj Nmin do 4: Obtain, from P, an estimate a^PDOA of DOA, according to Eq. (6) 5: Obtain the LS cost nP ð^ a PDOA Þ, according to Eq. (5) min 6: Store the minimum nP min ð^ a PDOA Þ obtained thus far, along with a^PDOA 2 T ^P 7: Find fi; j g 2 P with the largest ð^s xi xj Dpij a DOA Þ 8: P P fi; j g 9: end while min 10: return DOA a^PDOA of the iteration with smallest cost 11: end function end

Fig. 2. (Color online) Empirical CDF of error at (a) SNR ¼ 15 dB and (b) SNR ¼ 0 dB. EA ðxÞ is the empirical CDF of error of algorithm A. For picture clarity, the curves may not include the highest 2% sampled errors.




[http://dx.doi.org/10.1121/1.4874223]


Fig. 3. Two-sample, one-sided Kolmogorov–Smirnov tests: rejection of the null hypothesis H0 at significance level 5%. H0 states that the algorithms under comparison have the same CDF of DOA error. The alternative hypothesis H1 is given at the y axis. Rejection of H0 in favor of H1 is marked.

4. Results Results from comparative analysis of three algorithms, RAW, WLS,3 and ILS, under progressive signal-to-noise ratios are described. The source signal used in the simulations was a single-microphone real recording and included environmental perturbations. Some of the real-world directionality information may have been lost in the simulation. Empirical CDFs of DOA error are illustrated in Fig. 2. Comparisons are performed by a two-sample, one-sided Kolmogorov–Smirnov statistical test,7 with results reported in Fig. 3. Differences encountered in the samples are quantified by Cohen’s D measure of effect size,8 with results reported in Fig. 4. In Fig. 3, it occasionally happens that H0 is rejected in favor of both A > A0 and A > A0 . This means that the CDFs cross, which is in agreement with the definition of the test statistic, Eq. (3). In Fig. 4, a positive value of D ðA; A0 Þ indicates that A performs better than A0 . D is such that D ðA; A0 Þ ¼ D ðA0 ; AÞ. Statistically validated relationships include, for SNR in the interval [5, 7], ILS > WLS and ILS > RAW. In [8, 13.5], WLS > ILS. For most of the interval [3, 11], WLS > RAW. At 5, 3, and 2, RAW > WLS. These relationships can be quantified by Cohen’s D, according to Fig. 4. 5. Conclusions In the conditions created by the simulations described in this letter, ILS is the algorithm of choice for low SNR and WLS is the algorithm of choice for high SNR. The advantage offered by ILS fades with increasing signal quality. The algorithm WLS, which is based on quantifying how an error in a TDE propagates to an error in the DOA estimate according to the angle between the line joining the respective pair of sensors and the DOA estimate implied by the TDE, and also on the

Fig. 4. (Color online) Cohen’s D for varying SNR.


Izabela Lyon Freire: Robust direction-of-arrival by matched-lags


[http://dx.doi.org/10.1121/1.4874223]


quality of the GCC-PHAT function, offers improvement over RAW, as statistically validated in Fig. 3. The assessed conclusions are drawn by a novel methodology for the evaluation of DOA algorithms, which is designed to be rigorous. This methodology is advocated for DOA problems, because the Kolmogorov–Smirnov test allows a statistical comparison of the distribution functions of DOA error, which may even cross. That is, an algorithm A may perform better than A0 for some range of error, but this relationship may be inverted for other ranges of error. Cohen’s D is an interesting measure of the size of possible differences. The results obtained suggest that this methodology provides consistent results. In the application task of sniper localization, the recorded gunshot audio often has a critically low SNR. Due to this specificity of the problem, and considering the choice of TDOA methods, the selection of spatially coherent subsets of TDEs possibly implemented by the ILS algorithm is advocated for this application scenario. Preliminary real-world data can be found in Ref. 9. Acknowledgments This work was partially supported by CAPES. The author thanks the Exercito Brasileiro and CAEx for the gunshot audio signal used in simulations; Dirceu Gonzaga da Silva for suggesting research on sniper localization; and Jose Apolinario, Jr. for discussions which led to previously published research. Intellectual property rights related to this work are object of the INPI process BR 10 2012 023084 4 (decision pending). References and links 1

C. Knapp and G. Carter, “The generalized correlation method for estimation of time delay,” IEEE Trans. Acoust. Speech Signal Process. 24, 320–327 (1976). 2 J. Smith and J. Abel, “Closed-form least-squares source location estimation from range-difference measurements,” IEEE Trans. Acoust. Speech Signal Process. 35, 1661–1669 (1987). 3 K. Varma, T. Ikuma, and A. Beex, “Robust TDE-based DOA estimation for compact audio arrays,” in Sensor Array and Multichannel Signal Processing Workshop Proceedings (SAM), Rosslyn, VA (2002), pp. 214–218. 4 C. Szurbela and J. Olson, “Uncertainties associated with parameter estimation in atmospheric infrasound arrays,” J. Acoust. Soc. Am. 115, 253–258 (2004). 5 J. Spiesberger, “The matched-lag filter: Detecting broadband multipath signals with auto- and crosscorrelation functions,” J. Acoust. Soc. Am. 109, 1997–2007 (2001). 6 I. Freire and J. Apolin ario, Jr., “DOA of gunshot signals in a spatial microphone array: Performance of the interpolated generalized cross-correlation method,” in Argentine School of Micro-Nanoelectronics Technology and Applications (EAMTA), Buenos Aires, Argentina (2011), pp. 1–6. 7 F. J. Massey, Jr., “The Kolmogorov-Smirnov test for goodness of fit,” J. Am. Stat. Assoc. 46, 68–78 (1951). 8 J. Cohen, “A power primer,” Psych. Bull. 112, 155–159 (1992). 9 I. Freire, P. Prandel, and J. Apolin ario, Jr., “Sobre a escolha de sinais em arranjos de microfones estimando DOA com GCC-PHAT” (“Choice of signals from microphone arrays, for DOA estimation with GCC-PHAT”), in Simp osio Brasileiro de Telecomunicaç~ oes (SBrT), Brasılia, D. F., Brazil (2012). 10 J. Allen and D. Berkley, “Image method for efficiently simulating small-room acoustics,” J. Acoust. Soc. Am. 65, 943–950 (1979). 11 E. Lehmann and A. Johansson, “Prediction of energy decay in room impulse responses simulated with an image-source model,” J. Acoust. Soc. Am. 124, 269–277 (2008). 12 J. A. Mazurek, J. E. Barger, M. Brinn, R. J. Mullen, D. Price, S. E. Ritter, and D. Schmitt, “Boomerang mobile counter shooter detection system,” in Defense and Security (International Society for Optics and Photonics, Orlando, FL, 2005), pp. 264–282.



Copyright of Journal of the Acoustical Society of America is the property of American Institute of Physics and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.

Quantum theory as plausible reasoning applied to data obtained by robust experiments.

Detecting gunshots using wearable accelerometers.

Robust functional statistics applied to Probability Density Function shape screening of sEMG data.

Robust dose-response curve estimation applied to high content screening data analysis.

Robust human machine interface based on head movements applied to assistive robotics.

Hereditary Neuropathy with Liability to Pressure Palsies Masked by Previous Gunshots and Tuberculosis.

A Novel Robust Trilateration Method Applied to Ultra-Wide Bandwidth Location Systems.

Quantitative evaluation of robust skull stripping and tumor detection applied to axial MR images.

A robust estimation of exon expression to identify alternative spliced genes applied to human tissues and cancer samples.

Our experience in the management of segmental bone defects caused by gunshots.

Short and robust directional couplers designed by shortcuts to adiabaticity.

Ultrathin coating by plasma polymerization applied to corneal contact lens.

Homogenization by nitrogen cavitation technique applied to platelet subcellular fractionation.

Free malondialdehyde determination by HPLC applied to microsomal studies.

Science applied to specific sports.

Gaitography applied to prosthetic walking.

Hybrid functionals applied to perovskites.

Isotachophoresis applied to biomolecular reactions.

Pattern recognition applied to surgery.

Physiology applied to field hockey.

Robust mutant strain design by pessimistic optimization.

Robust ferromagnetism carried by antiferromagnetic domain walls.

Skin irritation induced by topically applied timolol.

Composite resin buttons used to direct forces applied to teeth by removable orthodontic devices.