Science of the Total Environment 476–477 (2014) 158–164


Analysis of carbon dioxide concentration skewness at a rural site

Isidro A. Pérez ⁎, M. Luisa Sánchez, M. Ángeles García, Marta Ozores, Nuria Pardo

Department of Applied Physics, Faculty of Sciences, University of Valladolid, Paseo de Belén, 7, 47011 Valladolid, Spain

HIGHLIGHTS

• 15 skewness coefficients were compared and the Yule coefficient was selected.
• A relationship between skewness, concentration, and meteorological variables was considered.
• CO2 skewness was related with sources and dispersive processes.
• Symmetric distributions were linked to an urban plume or summer features.

ARTICLE INFO

Article history:
Received 15 October 2013
Received in revised form 18 December 2013
Accepted 6 January 2014
Available online 23 January 2014

Keywords:
Kernel regression
CO2 distribution
Robust statistics
Triangular distribution

ABSTRACT

This paper provides evidence that symmetry of CO2 concentration distribution may indicate sources or dispersive processes. Skewness was calculated by different procedures with CO2 measured at a rural site using a Picarro G1301 analyser over a two-year period. The usual skewness coefficient was considered together with fourteen robust estimators. A noticeable contrast was obtained between day and night, and skewness decreased linearly with the logarithm of the height. One coefficient was selected from its satisfactory relationship with the median concentration in daily evolution. Three analyses based on the kernel smoothing method were conducted with this coefficient to investigate its response to yearly and daily evolutions, wind direction, and wind speed. Left-skewed distributions were linked to thermal turbulence during midday, especially in spring–summer, or with high wind speeds. Almost symmetric distributions were associated with sources, such as the Valladolid City plume reinforced with spring emissions and the lack of emissions in summer in the remaining directions. Finally, right-skewed distributions were related to low wind speeds and stable stratification at night, furthered by strong emissions in spring. Skewness intervals were proposed and their average median concentrations were calculated such that the relationship between skewness and concentration depends on the analysis performed. Since some skewness coefficients may also be negative, they provide better information about sources or dispersive processes than concentration.

© 2014 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +34 983 184 189; fax: +34 983 423 013. E-mail address: [email protected] (I.A. Pérez).
0048-9697/$ – see front matter © 2014 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.scitotenv.2014.01.019

1. Introduction

Rural environments are the best places not only for measuring CO2 background concentrations (Beardsmore and Pearman, 1987; Wang et al., 2010), but also for studying the influence of sources and atmosphere evolution, since the distribution of CO2 concentration is extremely sensitive to both factors. Soil and vegetation respiration are the main CO2 sources (Eler et al., 2013; Guo et al., 2013; Lohila et al., 2003; Mancinelli et al., 2013), while photosynthesis is the process by which plants remove CO2. However, the atmosphere also plays a key role in observed concentrations, since stable stratification and turbulence determine CO2 dilution in the lower atmosphere (Sánchez et al., 2010; Pérez et al., 2012a) and transport may have a major influence on measured values downwind from sources (Pérez et al., 2009a).

Statistics are frequent in assimilation of atmospheric observations. However, location indicators, such as the mean or the median, are the most widely considered. Spread estimators, such as standard deviation or interquartile range, are usually calculated to indicate the uncertainty of location statistics. A description of symmetry estimators is uncommon. However, previous analyses have shown that CO2 distribution skewness changes along the day (Pérez et al., 2012b,c). This behaviour merits detailed research in order to ensure adequate quantification thereof and investigate its behaviour. The present paper seeks to establish a relationship between CO2 distribution skewness and CO2 sources or dispersive processes in the low atmosphere. Nearly symmetric distributions may be linked with sources since Gaussian fits have successfully been used in both point and area sources (Beychok, 1994; Pérez et al., 2012d; Sparks and Toumi, 2010). However, skewed distributions indicate factors that determine scattered high or low values, such as transport from sources revealed by changes in wind direction, emissions that increase concentrations, or atmospheric processes that clean the atmosphere.


Although skewness may be used to isolate outliers (Heymann et al., 2012; Hubert and van der Veeken, 2008), the first part of this paper is devoted to skewness calculation using different statistics so as to explore their application possibilities outside the theoretical field, comparing them in an effort to observe their differences and choose one which may be deemed representative of distribution symmetry behaviour. Skewness coefficients are based on the observations at the tails of the distribution. As a result, various fractions of values or a range of procedures may be used to calculate them, such that outcomes and determination time may prove quite different. Certain methods used may be slow, one key point therefore being to simplify the proposed procedures so as to make their calculation computationally feasible. The relationship between the skewness coefficient and other variables forms the second part of this analysis and was established using the kernel smoothing method. This is a procedure widely used to locate air pollution sources or in trajectory analysis (Donnelly et al., 2012; Henry, 2008; Henry et al., 2002, 2011; Yu et al., 2004). The contribution made by the present paper lies in its simplified application. Kernel smoothing, also called nonparametric regression, is usually calculated to interpolate observations, the closeness of values proving a key in this procedure. However, kernel smoothing cannot be directly applied, since observations must previously be divided into groups where skewness must be calculated. The current research was conducted in three ways. Firstly, the time evolution of skewness and concentration is investigated so as to weigh the yearly and daily cycles. In a second step, wind direction is introduced to expand the analysis directionally. Finally, wind speed is considered to account for dispersive processes. Intervals of skewness are proposed and associated with ancillary variables.

2. Experimental description

The measuring campaign extended over two years, commencing on 15 October 2010 at CIBA (Low Atmosphere Research Centre), located on flat terrain 840 m above MSL in the north of Spain, 41°48′49″ N, 4°55′59″ W. Grass surrounded by crops is the main vegetation. Orographic effects, which proved noticeable in numerical implementations (Xu and Yi, 2013), were absent. CO2 dry concentrations were measured with a Picarro G1301 analyser, which uses cavity ring-down spectroscopy (Crosson, 2008). Values provided by the device were corrected slightly using a linear equation obtained with calibrations performed every two weeks (Pérez et al., 2013). This analyser also controls solenoid valves to measure at three levels, 1.8, 3.7, and 8.3 m. The meteorological variables, wind speed and wind direction, were recorded at 10 m. Finally, all variables were averaged in half-hour intervals.

3. Theoretical fundamentals

3.1. Skewness definition

The symmetry of a distribution is measured by the skewness coefficient, which is usually calculated by S1 (Table 1). Positive values of S1 indicate particularly frequent high observations. The dataset with a right tail is thus referred to as right-skewed, the left-skewed distributions being on the opposite side. One serious restriction of S1 is its extreme sensitivity to outliers, which may change the value of this coefficient considerably.

Table 1. Skewness coefficients.

| Name | Equation |
|---|---|
| Standardized third moment | $S_1 = \frac{1}{n}\sum_{i=1}^{n}\frac{(x_i-\mu)^3}{\sigma^3}$ |
| David and Johnson coefficientᵃ,ᵇ | $S_2 = \frac{Q_{1-\alpha} - 2M + Q_{\alpha}}{Q_{1-\alpha} - Q_{\alpha}}$ |
| Bowley's coefficientᵃ,ᵇ | $S_3 = \frac{Q_{0.75} - 2M + Q_{0.25}}{Q_{0.75} - Q_{0.25}}$ |
| Octile skewnessᶜ | $S_4 = \frac{Q_{0.875} - 2M + Q_{0.125}}{Q_{0.875} - Q_{0.125}}$ |
| Kelly's coefficientᵃ | $S_5 = \frac{Q_{0.9} - 2M + Q_{0.1}}{Q_{0.9} - Q_{0.1}}$ |
| Yule's coefficientᵃ,ᵇ | $S_6 = \frac{\mu - M}{\sigma}$ |
| Groeneveld and Meeden coefficientᵃ,ᵇ | $S_7 = \frac{\mu - M}{E\lvert x - M\rvert}$ |
| Ekström and Jammalamadaka coefficientᵃ | $S_8 = \frac{Q_{0.9} + Q_{0.8} - 4M + Q_{0.2} + Q_{0.1}}{(Q_{0.9} - Q_{0.1}) + (Q_{0.8} - Q_{0.2})}$ |
| Percentile skewnessᵈ | $S_9 = \frac{Q_{0.9} - M}{M - Q_{0.1}}$ |
| From a triangular distributionᵉ | $S_{10} = \frac{\sqrt{2}\,(a+b-2c)(2a-b-c)(a-2b+c)}{5\,(a^2+b^2+c^2-ab-ac-bc)^{3/2}}$ |
| Proposed from a triangular distribution | $S_{11} = \frac{b + a - 2c}{b - a}$ |
| Medcoupleᶜ,ᶠ | $S_{12} = \operatorname{med}_{x_i \le Q_{0.5} \le x_j} h(x_i, x_j)$, where $h(x_i, x_j) = \frac{(x_{(j)} - Q_{0.5}) - (Q_{0.5} - x_{(i)})}{x_{(j)} - x_{(i)}}$ with $x_{(i)} < x_{(j)}$; with $x_i = x_j = Q_{0.5}$, $h(x_i, x_j) = +1$ for $i > j$, $0$ for $i = j$, and $-1$ for $i < j$ |
| Medtripleᵉ,ᶠ | $S_{13} = \operatorname{med}_{i<j<k} h(x_i, x_j, x_k)$, where $h(x_i, x_j, x_k) = \frac{(x_{(k)} - x_{(j)}) - (x_{(j)} - x_{(i)})}{x_{(k)} - x_{(i)}}$ with $x_{(i)} \ne x_{(k)}$, and $h(x_i, x_j, x_k) = 0$ with $x_{(i)} = x_{(k)}$ |
| From the mean of a distribution, 1ᵍ | $S_{14} = \ln\frac{F(\mu)}{1 - F(\mu)}$ |
| From the mean of a distribution, 2ᵍ | $S_{15} = 2F(\mu) - 1$ |
| From the mode of a distributionᵍ | $S_{16} = 1 - 2F(c)$ |

n, number of observations; μ, mean; M, median; σ, standard deviation; Q, quantile; α, fraction of observations; E, expected value; [a,b], interval where the triangular distribution is defined; c, mode; med, median; x(i), sorted observations; F, cumulative probability.
ᵃ Ekström and Jammalamadaka (2012).
ᵇ Bonato (2011).
ᶜ Brys et al. (2003).
ᵈ Vose (2008).
ᵉ http://mathworld.wolfram.com/TriangularDistribution.html.
ᶠ Brys et al. (2004).
ᵍ Tajuddin (1999).

This disadvantage led to more robust symmetry statistics being proposed, such as S2, which involves the quantiles Q_α, where α is between 0 and 0.5. Its numerator accounts for symmetry since it is the difference between the distances from the median, M, to two quantiles with the same number of observations above and below each one respectively, (Q_{1−α} − M) − (M − Q_α). Its denominator is the quantile range, Q_{1−α} − Q_α. From this definition, S2 lies between −1 and +1. However, the S2 coefficient responds to a theoretical approach, since the fraction α must be fixed for practical applications. The S3 coefficient, sometimes called the Yule–Kendall index (Wilks, 2011), considers the quartiles. S4 uses α = 0.125 in S2, and S5 the first and ninth deciles. S3 is less sensitive to outliers than S4 and S5. However, S5 and S4 use more information from distribution tails and are more appropriate to detect asymmetry in observations.

S6 and S7 estimate symmetry by the difference between the mean and the median. Ekström and Jammalamadaka (2012) proposed a general measure of skewness. However, they also suggested the overall compromise equation S8, which considers the first, second, eighth, and ninth deciles. The percentile skewness, S9, is another symmetry coefficient, although it is rarely used. Its values are positive, below one for left-skewed distributions and above one when they are right-skewed.

Pérez et al. (2013) fitted CO2 concentrations to a triangular distribution taking the mode to begin an iterative procedure. They showed that the triangular distribution correctly describes CO2 concentrations, its skewness being provided by S10. However, since this distribution is defined in the interval [a,b], a new, simpler calculation than S10 is proposed by S11. It follows S2 by using the interval extremes and the mode, c, as a position estimator.

Medcouple, S12, is a more robust coefficient where quantiles of S2 are replaced by observations. However, it involves all observations, thus increasing calculation time.
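As an illustration of the definitions in Table 1, the following minimal sketch computes some of the coefficients with the Python standard library. Several choices are assumptions that the text does not specify: the linear-interpolation quantile rule, the use of the population standard deviation, and the function names, which are hypothetical. The medcouple version is a naive all-pairs form that skips the special kernel for observations tied with the median, so it only illustrates why the calculation time grows quickly with sample size.

```python
import math
import statistics


def quantile(sorted_x, p):
    """Quantile of already sorted data by linear interpolation (assumed rule)."""
    pos = p * (len(sorted_x) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])


def quantile_skewness(x, alpha):
    """S2 of Table 1; alpha = 0.25, 0.125 and 0.1 give S3, S4 and S5."""
    s = sorted(x)
    q_lo, m, q_hi = quantile(s, alpha), quantile(s, 0.5), quantile(s, 1 - alpha)
    return (q_hi - 2 * m + q_lo) / (q_hi - q_lo)


def moment_skewness(x):
    """S1: standardized third moment (population standard deviation assumed)."""
    mu, sigma = statistics.fmean(x), statistics.pstdev(x)
    return sum((v - mu) ** 3 for v in x) / (len(x) * sigma ** 3)


def yule_skewness(x):
    """S6: (mean - median) / standard deviation."""
    return (statistics.fmean(x) - statistics.median(x)) / statistics.pstdev(x)


def percentile_skewness(x):
    """S9: (Q0.9 - M) / (M - Q0.1); equal to one for a symmetric sample."""
    s = sorted(x)
    m = quantile(s, 0.5)
    return (quantile(s, 0.9) - m) / (m - quantile(s, 0.1))


def medcouple_naive(x):
    """S12 over all pairs, O(n^2); pairs with x_i == x_j are skipped (simplification)."""
    s = sorted(x)
    m = statistics.median(s)
    h = [((xj - m) - (m - xi)) / (xj - xi)
         for xi in s if xi <= m
         for xj in s if xj >= m and xj != xi]
    return statistics.median(h)
```

With the made-up sample `[1, 2, 3, 4, 100]`, `quantile_skewness(sample, 0.25)` stays at zero because the quartiles ignore the single extreme value, whereas `moment_skewness(sample)` is well above one, which mirrors the sensitivity of S1 to outliers described above.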
The long calculation time of medcouple may be successfully overcome with a procedure based on stem-and-leaf displays (Hoaglin et al., 2000) or histograms. Since only an approximate value is required, the median of h(xi,xj) need not be calculated exactly. Intervals of h(xi,xj) were established and their frequencies calculated. The median is located in the middle of the ranked values and may be approximately obtained from the cumulative frequencies, with value sorting proving unnecessary.

Medtriple, S13, emerges from medcouple by replacing the median by each observation. Its main disadvantage lies in the extremely long calculation time. To shorten this, observations are sorted and a sample is taken by forming intervals with the same number of observations, each interval providing one randomly chosen observation. The kernel h(xi,xj,xk) is calculated with the sampled observations, and medtriple is approximated with the procedure used to obtain medcouple.

The three last statistics are obtained from the cumulative probability distribution. S14 and S15 consider the mean and S16 the mode. S14 is calculated from the quotient between the fraction of observations below and the fraction of observations above the mean, which is positive, above one for right-skewed distributions and below one for left-skewed distributions. The logarithm of this quotient introduces a sign, although S14 can take any value. However, S15 (S16) is defined as the difference between the fraction of values below (above) and the fraction of values above (below) the mean (mode), which lies between −1 and +1.

3.2. Kernel smoothing

Two-variable analyses were performed by using the time of day, time of year, wind direction or wind speed as independent variables, and skewness or median concentration as dependent variables. Intervals of two independent variables were established and the dependent variable, Aij, was calculated at each interval and attributed to the interval centre, (xi,yj).
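The grouping step described in the previous paragraph could be sketched as follows. This is only an illustration under stated assumptions: the record layout `(x, y, concentration)`, the function names, and the `min_count` threshold are hypothetical, and the Yule coefficient is used as the dependent variable together with the median.

```python
import statistics
from collections import defaultdict


def yule_skewness(values):
    """(mean - median) / standard deviation, population form assumed."""
    return (statistics.fmean(values) - statistics.median(values)) / statistics.pstdev(values)


def grid_statistics(records, width_x, width_y, min_count=3):
    """Group (x, y, concentration) records into two-variable intervals.

    Returns {(x_centre, y_centre): (skewness, median)} so that each value
    is attributed to the centre of its interval. Cells with fewer than
    min_count observations (hypothetical threshold) or with zero spread
    are skipped because the skewness is undefined or unreliable there.
    """
    cells = defaultdict(list)
    for x, y, concentration in records:
        cells[(int(x // width_x), int(y // width_y))].append(concentration)

    grid = {}
    for (i, j), values in cells.items():
        if len(values) < min_count or statistics.pstdev(values) == 0:
            continue
        centre = ((i + 0.5) * width_x, (j + 0.5) * width_y)
        grid[centre] = (yule_skewness(values), statistics.median(values))
    return grid
```

For instance, with `x` as the hour of the day, `y` as the month, and both widths set to one, each cell collects the half-hourly concentrations of one hour in one month.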
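A representation with smooth curves was made with the expression

$$A(X,Y;h_1,h_2) = \frac{\sum_{i=1}^{N_1}\sum_{j=1}^{N_2} K\left(\frac{X-x_i}{h_1}\right) K\left(\frac{Y-y_j}{h_2}\right) A_{ij}}{\sum_{i=1}^{N_1}\sum_{j=1}^{N_2} K\left(\frac{X-x_i}{h_1}\right) K\left(\frac{Y-y_j}{h_2}\right)} \qquad (1)$$

where A is the magnitude calculated at point (X,Y). K is the Gaussian function

$$K(x) = (2\pi)^{-1/2} \exp\left(-0.5x^2\right), \quad -\infty < x < \infty. \qquad (2)$$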
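Eqs. (1) and (2) amount to a weighted average of the interval values with Gaussian weights. A minimal sketch follows, assuming the interval values are stored in a dictionary keyed by interval centre; the function names and this data layout are hypothetical. The sketch does not treat cyclic variables such as time of day or wind direction in any special way, which the text does not describe.

```python
import math


def gaussian_kernel(u):
    """Eq. (2): K(u) = (2*pi)^(-1/2) * exp(-0.5 * u^2)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_smooth(cells, X, Y, h1=1.0, h2=1.0):
    """Eq. (1): smoothed value at (X, Y).

    cells maps interval centres (x_i, y_j) to the value A_ij calculated
    in that interval; h1 and h2 are the bandwidths along each variable.
    """
    numerator = 0.0
    denominator = 0.0
    for (x_i, y_j), a_ij in cells.items():
        weight = gaussian_kernel((X - x_i) / h1) * gaussian_kernel((Y - y_j) / h2)
        numerator += weight * a_ij
        denominator += weight
    return numerator / denominator
```

Evaluating `kernel_smooth` on a dense grid of points (X, Y) gives the smooth surface from which bands or contour lines can be drawn. Halfway between two cells the result is the average of their values, and a constant field is returned unchanged.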

Bandwidths h1 and h2 were taken as equal to one unit of the interval selected. With these restrictions concerning the application of Eq. (1), this procedure may be regarded as a simpler version of that used by Donnelly et al. (2011) or Pérez et al. (2013). These plots, where points (X,Y) were distributed in a dense grid, provided the basis for skewness analysis so as to link skewness to a second variable, such as median concentration. Intervals of skewness were proposed, averages of this second variable were calculated in each skewness interval, and the statistical significance of the differences between these averages was investigated.

4. Results and discussion

4.1. Skewness coefficient values

CO2 skewness was calculated with the equations presented in Table 1. Triangular distributions were calculated from histograms with 2 ppm wide classes. Medcouple was obtained following the procedure described above by means of histograms with 0.001 classes. Medtriple was calculated with samples of around 400 observations and similar histograms to those used with medcouple. S14, S15 and S16 were obtained with the experimental cumulative distribution. The mode for S16 was calculated from the histogram with 2 ppm classes.

Values of the statistics are shown in Table 2. All the values for the whole day correspond to right-skewed distributions due to the highest values being reached during the night, linked to soil and plant respiration and atmospheric stability. S1 presented the highest values due to the key role played by outliers. However, S3 was the lowest, since the tail effect on skewness was smaller. The statistic proposed for triangular distribution, S11, is a similar and simpler alternative to the usual S10.

Bold figures correspond to values during midday, from 13 to 15 GMT. In this case, left-skewed distributions prevailed. This behaviour was particularly noticeable in S9, S10, S11, and S16. However, S1 and S13 indicated almost symmetric distributions.

Italic figures were calculated for values during midnight, from 0 to 2 GMT. In general, during this period skewness was greater than for the whole day, except for S1 and certain values of S10, S11, and S16. These three coefficients gave the lowest values in both periods (midday and midnight) and this behaviour may be related to their calculation, since they are based on the distribution mode, S16 directly, and S10 and S11 from the triangular distribution, obtained from the mode.

Table 2. Skewness values calculated using different procedures for all observations, for observations from 13 to 15 GMT (bold figures in the original), and for observations from 0 to 2 GMT (italics in the original).

| Skewness | 1.8 m, all | 1.8 m, 13–15 GMT | 1.8 m, 0–2 GMT | 3.7 m, all | 3.7 m, 13–15 GMT | 3.7 m, 0–2 GMT | 8.3 m, all | 8.3 m, 13–15 GMT | 8.3 m, 0–2 GMT |
|---|---|---|---|---|---|---|---|---|---|
| S1 | 3.68 | 0.332 | 2.723 | 3.141 | 0.333 | 2.436 | 2.252 | 0.443 | 2.225 |
| S3 | 0.135 | −0.119 | 0.231 | 0.104 | −0.122 | 0.133 | 0.053 | −0.115 | 0.025 |
| S4 | 0.217 | −0.042 | 0.254 | 0.194 | −0.041 | 0.208 | 0.149 | −0.039 | 0.139 |
| S5 | 0.276 | −0.091 | 0.362 | 0.216 | −0.09 | 0.279 | 0.132 | −0.08 | 0.141 |
| S6 | 0.365 | −0.053 | 0.406 | 0.311 | −0.052 | 0.325 | 0.219 | −0.05 | 0.209 |
| S7 | 0.326 | −0.097 | 0.383 | 0.264 | −0.09 | 0.298 | 0.172 | −0.085 | 0.18 |
| S8 | 0.271 | −0.099 | 0.352 | 0.216 | −0.095 | 0.26 | 0.131 | −0.093 | 0.138 |
| S9 | 1.966 | 0.823 | 2.243 | 1.716 | 0.836 | 1.848 | 1.414 | 0.844 | 1.439 |
| S10 | 0.177 | −0.268 | 0.2 | 0.157 | −0.308 | 0.076 | 0.15 | −0.286 | −0.039 |
| S11 | 0.184 | −0.288 | 0.21 | 0.162 | −0.336 | 0.078 | 0.155 | −0.309 | −0.04 |
| S12 | 0.201 | −0.092 | 0.274 | 0.168 | −0.092 | 0.183 | 0.104 | −0.092 | 0.079 |
| S13 | 0.227 | −0.008 | 0.252 | 0.193 | −0.005 | 0.211 | 0.144 | −0.003 | 0.148 |
| S14 | 0.568 | −0.068 | 0.674 | 0.466 | −0.048 | 0.568 | 0.331 | −0.037 | 0.417 |
| S15 | 0.277 | −0.034 | 0.325 | 0.229 | −0.024 | 0.277 | 0.164 | −0.018 | 0.206 |
| S16 | 0.178 | −0.232 | 0.138 | 0.155 | −0.261 | 0.049 | 0.105 | −0.254 | −0.035 |

The influence of sample size on medtriple was considered for the lowest level and the midnight period. Calculations were repeated 50 times with samples of the same size from about 100 to 300 observations in 50-observation intervals. Medtriple values were very similar and interquartile ranges of the 50 values calculated with samples over 200 observations were equal to or below 0.001.

Skewnesses presented in Table 2 for all observations and for 0 to 2 GMT decreased linearly with the logarithm of the height with very satisfactory fits.

In order to perform a more detailed analysis, one coefficient was chosen from the daily evolution with half-hourly observations, these providing a wide range of skewnesses and median concentrations. A close relationship between the two variables has previously been observed (Pérez et al., 2012c). In order to shorten medtriple calculation time, samples of around 200 observations were selected. Table 3 shows the Pearson correlation coefficients between the two variables. Values were noticeably high and somewhat lower for S16, since this skewness was calculated with the mode of the experimental distribution, which was sensitive to the histogram used. The S6 coefficient was selected due to the satisfactory relationship guaranteed by the largest Pearson correlation coefficient, and this was the only one used in the subsequent analyses presented in this paper.

Table 3. Pearson correlation coefficients between skewnesses and median concentrations calculated each half hour.

| Skewness | 1.8 m | 3.7 m | 8.3 m |
|---|---|---|---|
| S1 | 0.929 | 0.906 | 0.906 |
| S3 | 0.938 | 0.901 | 0.858 |
| S4 | 0.977 | 0.956 | 0.869 |
| S5 | 0.981 | 0.970 | 0.905 |
| S6 | 0.984 | 0.979 | 0.946 |
| S7 | 0.983 | 0.972 | 0.939 |
| S8 | 0.980 | 0.960 | 0.900 |
| S9 | 0.956 | 0.930 | 0.864 |
| S10 | 0.928 | 0.881 | 0.833 |
| S11 | 0.920 | 0.875 | 0.829 |
| S12 | 0.967 | 0.926 | 0.890 |
| S13 | 0.983 | 0.976 | 0.927 |
| S14 | 0.982 | 0.968 | 0.920 |
| S15 | 0.982 | 0.971 | 0.923 |
| S16 | 0.850 | 0.880 | 0.819 |

4.2. Daily–yearly analysis

One-hour intervals were considered for daily evolution, and one-month intervals for yearly evolution. Skewness and median concentration were calculated in each interval and the kernel smoothing method was used to obtain the curves for the lowest and highest levels, presented in Fig. 1. Skewnesses were especially low from 10 to 19 GMT and from April to September due to the intensive turbulence at this daily interval during these months. Moreover, the daily interval of low skewness was shorter in summer compared to spring, linked with low CO2 concentrations owing to lack of vegetation and less soil activity during this dry period. In contrast, high skewnesses observed during the night throughout the whole year were justified by the stable stratification of the low atmosphere. This daily evolution of skewness was described by Pérez et al. (2012b) although only for certain hours. Moreover, the highest values were associated with the highest concentrations of the lowest level, due to the lowest dispersion near the surface, where CO2 was emitted.

Fig. 1. Daily and yearly evolution of skewness (grey bands) and median concentration in ppm (black lines) at the highest (a) and lowest (b) levels.

Three skewness intervals were suggested from Fig. 1, one of which had negative skewnesses. Averages of median concentrations of these intervals were calculated and presented in Table 4. Their differences were statistically significant at a 95% confidence level. Lowest skewnesses were linked with concentrations about 392 ppm, intermediate skewnesses with concentrations around 395 ppm, and concentrations were about 400 ppm for the highest skewnesses. The close relationship between skewness and median concentration described by Pérez et al. (2012c) may be attributed to the daily cycle prevalence (Pérez et al., 2013). Moreover, the greatest contrasts in averages of median concentrations were observed at 1.8 m and were already presented in a simpler daily evolution (Pérez et al., 2012d).

Table 4. Average median CO2 concentrations (in ppm) for the skewness intervals proposed in the daily–yearly analysis.

| Skewness interval | 1.8 m | 8.3 m |
|---|---|---|
| <0.0 | 391.9 | 392.2 |
| 0.0–0.2 | 394.3 | 395.0 |
| >0.2 | 401.1 | 400.4 |
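The averaging of median concentrations within skewness intervals, as summarized in Table 4, could be sketched as follows. The function name, the input layout, and the convention that a value equal to an edge falls in the upper interval are assumptions. The significance test is not reproduced because the text does not name it.

```python
import bisect
import statistics
from collections import defaultdict


def average_by_skewness_interval(cells, edges):
    """Average the median concentration of the cells in each skewness interval.

    cells is an iterable of (skewness, median_concentration) pairs and
    edges the ascending interval limits, e.g. [0.0, 0.2] for the three
    intervals <0.0, 0.0-0.2 and >0.2. The returned keys are interval
    indices: 0 is below the first edge, len(edges) is above the last one.
    """
    groups = defaultdict(list)
    for skewness, concentration in cells:
        groups[bisect.bisect_right(edges, skewness)].append(concentration)
    return {index: statistics.fmean(values) for index, values in sorted(groups.items())}
```

As a usage example with invented values, `average_by_skewness_interval([(-0.05, 392.0), (-0.01, 391.0), (0.1, 395.0), (0.3, 401.0)], [0.0, 0.2])` groups the first two cells into interval 0 and averages them, while the other two cells each fill one interval.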

4.3. Wind direction analysis

Wind direction was considered to analyse the 8.3 m level observations. Sixteen wind sectors were used to calculate the skewness and median concentration. Monthly analysis (Fig. 2a) revealed positive skewnesses. However, the 0.15–0.20 skewness band was the most noticeable feature. This band enveloped the lowest skewness and approximately concurred with high and low concentrations. By combining skewness and concentration, the band extended from ENE to S from December to July, and from ENE to N, through S, from July to September. In this analysis, the nearly symmetric distributions described by small skewnesses were associated with sources that determined the infrequent presence of extreme scattered values and the joint displacement of the distribution towards high or low concentrations. The highest concentrations were caused by the plume of the city of Valladolid (317 000 inhabitants), located some 24 km to the SE, and reinforced by soil and vegetation emissions in spring, whose influence and shape have previously been observed (Pérez et al., 2009a, 2012d). During summer, emissions were small and turbulence during the day was high due to intensive heating, both factors determining small skewnesses and low concentrations.

A similar representation of concentrations was presented by García et al. (2008) with observations recorded using a MIR 9000 analyser in 2004–2005 at the same site. Yearly evolution and directional behaviour were similar although values were around 10–15 ppm lower, in agreement with the trend presented by Pérez et al. (2009b) with a three-year measuring campaign, or with the trend presented by Sánchez et al. (2010) for nine years.

When daily evolution was considered (Fig. 2b), a direct relationship between skewness and concentration was observed. Highest skewnesses and concentrations were obtained during the night, from 0 to 6 GMT in the Valladolid sector. However, the lowest values of both variables were obtained during midday, especially in the prevailing direction, around NE (Pérez et al., 2008), reinforced by high wind speed, which was more frequent during the day than during the night and linked to thermal turbulence in the lowest atmosphere.

Fig. 2. Directional analysis of yearly (a) and daily (b) evolution of skewness (grey bands) and median concentration in ppm (black lines).

Five skewness intervals were proposed in this analysis (Table 5). Left-skewed distributions were associated with averages of median concentrations around 395 ppm, while they were above 400 ppm for strongly right-skewed distributions. Although an increasing trend is observed, the lowest concentrations were closer to each other than the highest.

Table 5. Skewness intervals and average median concentrations for daily–directional analysis.

| Skewness interval | Concentration (ppm) |
|---|---|
| <−0.1 | 395.2 |
| −0.1 to 0.0 | 395.5 |
| 0.0–0.1 | 396.8 |
| 0.1–0.2 | 400.7 |
| >0.2 | 403.5 |

4.4. Wind speed analysis

Wind speed intervals were 1 m s−1 up to a wind speed of 5 m s−1. Above this wind speed, all observations were attributed to the same wind speed interval. The 8.3 m level was used.

When yearly evolution is investigated (Fig. 3a), skewness and concentration behaviours differed considerably. Skewness bands mainly decreased with wind speed, while median concentration presented a yearly evolution. Time analysis revealed that concentration decreased in summer for all wind speeds, and wind speed analysis indicated that concentration also decreased with wind speed for values above 2 m s−1. Table 6 presents the averages of wind speeds and median concentrations calculated for the proposed skewness intervals. Their differences were statistically significant at a 95% confidence level. Lowest skewnesses were observed with an average wind speed of 5 m s−1 and an average median concentration of about 395 ppm. Fig. 3a also shows that distributions for the highest wind speeds were nearly symmetric from March to October and in April were even slightly left-skewed. Dispersion processes linked to these high wind speeds were responsible for this behaviour. However, the remaining skewness intervals suggested were noticeably related to wind speed and slightly to concentration, which was around 397 ppm. This study expands the relationship between concentration and wind speed previously described by García et al. (2010) where seasonal behaviour was presented.

An almost direct relationship between skewness, concentration and wind speed was observed in the daily analysis (Fig. 3b). The daily evolution was observed in the whole wind speed range. Moreover, when wind speed increased, skewness and concentration decreased. In order to obtain a detailed relationship, five skewness intervals were proposed whose average wind speed values and median concentrations, shown in Table 6, presented significant differences at a 95% confidence level. The only noticeable feature was the lowest skewness linked to the highest average wind speed, but with the second lowest average median concentration. This result is a consequence of the low frequency of the lowest skewnesses and their distribution before and after midday when concentrations were small.

The link with wind direction, shown in Fig. 3c, revealed that skewness was strongly wind speed dependent. This behaviour was particularly noticeable in the sector from N to S through E, where skewness decreased considerably when wind speed increased and skewnesses were negative for wind speeds above around 4 m s−1. Concentrations were determined by the Valladolid urban plume, since the highest values, around 402 ppm, were obtained with wind speeds around 2 m s−1. Even for the highest wind speeds, the greatest concentrations, around 398 ppm, were obtained in the Valladolid direction. Although Fig. 3c presents five skewness bands, several intervals were tested to establish a relationship between skewness, wind speed, and concentration. Finally, the simple classification presented in Table 6 based on three groups determined differences between average values that were statistically significant at a 95% confidence level.

Table 6. Skewness intervals and average values of wind speed and median concentration for wind speed analysis.

| Analysis | Skewness interval | Wind speed (m s−1) | Concentration (ppm) |
|---|---|---|---|
| Yearly | <0.1 | 5.0 | 395.4 |
| Yearly | 0.1–0.2 | 3.3 | 397.3 |
| Yearly | >0.2 | 1.2 | 397.6 |
| Daily | <−0.1 | 5.3 | 396.1 |
| Daily | −0.1 to 0.0 | 3.9 | 395.5 |
| Daily | 0.0–0.1 | 2.1 | 397.3 |
| Daily | 0.1–0.2 | 1.5 | 401.6 |
| Daily | >0.2 | 0.9 | 403.8 |
| Wind direction | <0.1 | 4.4 | 396.4 |
| Wind direction | 0.1–0.2 | 1.7 | 398.5 |
| Wind direction | >0.2 | 0.5 | 400.4 |
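The classification of observations into the sixteen wind sectors of Section 4.3 and the wind speed intervals of Section 4.4 could be sketched as below. The text gives only the number of sectors and the 1 m s−1 width up to 5 m s−1, so centring the first sector on north, the sector names, and the function names are assumptions.

```python
SECTOR_NAMES = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
                "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def wind_sector(direction_deg, n_sectors=16):
    """Index of the wind sector, with sector 0 centred on north (assumed)."""
    width = 360.0 / n_sectors
    return int(((direction_deg % 360.0) + width / 2.0) // width) % n_sectors


def wind_speed_class(speed, width=1.0, top=5.0):
    """Index of the wind speed interval.

    Intervals are `width` wide up to `top`; every speed at or above
    `top` is attributed to the same last interval.
    """
    if speed >= top:
        return int(top / width)
    return int(speed // width)
```

For example, a direction of 135° falls in the sector named `SECTOR_NAMES[wind_sector(135.0)]`, which is SE, the direction of Valladolid from the site, and speeds of 4.9 and 7.2 m s−1 fall in intervals 4 and 5 respectively.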

Fig. 3. Wind speed analysis of yearly evolution (a), daily evolution (b) and directional behaviour (c) of skewness (grey bands) and median concentration in ppm (black lines).

5. Conclusions

Several skewness statistics were compared with the CO2 concentrations recorded at a rural site. The usual coefficient proved to be extremely sensitive to outliers and most robust statistics indicated daily skewness evolution better. Skewness coefficients based on triangular distribution and mode marked left-skewed distribution noticeably at midday. The Yule coefficient was selected due to its satisfactory daily evolution.

A relationship between skewness and sources or dispersive processes was established by means of ancillary variables. The daily cycle was the most important feature of both skewness and concentration, while the yearly cycle revealed the greatest soil and plant emissions in spring.

Wind direction analysis indicated that the most symmetric distributions were linked to two kinds of sources. The Valladolid City plume determined the highest concentrations, which were reinforced by natural emissions in spring. The lowest plant and soil emissions and summer turbulence provided the lowest concentrations.

Wind speed analysis revealed that skewness decreased when wind speed increased. However, the relationship between skewness and concentration depends on the analysis performed. Below 2 m s−1, the relationship between concentration and wind speed was extremely weak.

Skewness must not be regarded as a surrogate variable of concentration, but as a complementary one, since its sign and quantity can identify sources or dispersive processes in agreement with the corresponding analyses.

Conflict of interests

There is no conflict of interests.

Acknowledgements

The authors wish to acknowledge the financial support of the Ministry of Economy and Competitiveness, ERDF funds, and the Regional Government of Castile and Leon.

References

Beardsmore DJ, Pearman GI. Atmospheric carbon dioxide measurements in the Australian region: data from surface observatories. Tellus B 1987;39:42–66.
Beychok MR. Fundamentals of stack gas dispersion. 3rd ed. Irvine: Milton R. Beychok; 1994.
Bonato M. Robust estimation of skewness and kurtosis in distributions with infinite higher moments. Financ Res Lett 2011;8:77–87.
Brys G, Hubert M, Struyf A. A comparison of some new measures of skewness. In: Dutter R, Filzmoser P, Gather U, Rousseeuw J, editors. Developments in robust statistics. Springer; 2003. p. 98–113.
Brys G, Hubert M, Struyf A. A robust measure of skewness. J Comput Graph Stat 2004;13:996–1017.
Crosson ER. A cavity ring-down analyzer for measuring atmospheric levels of methane, carbon dioxide, and water vapor. Appl Phys B-Lasers Opt 2008;92:403–8.
Donnelly A, Misstear B, Broderick B. Application of nonparametric regression methods to study the relationship between NO2 concentrations and local wind direction and speed at background sites. Sci Total Environ 2011;409:1134–44.
Donnelly A, Broderick B, Misstear B. Relating background NO2 concentrations in air to air mass history using non-parametric regression methods: application at two background sites in Ireland. Environ Model Assess 2012;17:363–73.
Ekström M, Jammalamadaka SR. A general measure of skewness. Stat Probab Lett 2012;82:1559–68.
Eler K, Plestenjak G, Ferlan M, Čater M, Simončič P, Vodnik D. Soil respiration of karst grasslands subjected to woody-plant encroachment. Eur J Soil Sci 2013;64:210–8.
García MA, Sánchez ML, Pérez IA, de Torre B. Continuous carbon dioxide measurements in a rural area in the upper Spanish plateau. J Air Waste Manage Assoc 2008;58:940–6.
García MA, Sánchez ML, Pérez IA. Synoptic weather patterns associated with carbon dioxide levels in Northern Spain. Sci Total Environ 2010;408:3411–7.
Guo M, Wang X-F, Li J, Yi K-P, Zhong G-S, Wang H-M, et al. Spatial distribution of greenhouse gas concentrations in arid and semi-arid regions: a case study in East Asia. J Arid Environ 2013;91:119–28.
Henry RC. Locating and quantifying the impact of local sources of air pollution. Atmos Environ 2008;42:358–63.
Henry RC, Chang Y-S, Spiegelman CH. Locating nearby sources of air pollution by nonparametric regression of atmospheric concentrations on wind direction. Atmos Environ 2002;36:2237–44.
Henry RC, Vette A, Norris G, Vedantham R, Kimbrough S, Shores RC. Separating the air quality impact of a major highway and nearby sources by nonparametric trajectory analysis. Environ Sci Technol 2011;45:10471–6.
Heymann S, Latapy M, Magnien C. Outskewer: Using skewness to spot outliers in samples and time series. Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM; 2012. p. 527–34.
Hoaglin DC, Mosteller F, Tukey JW. Understanding robust and exploratory data analysis. New York: John Wiley & Sons; 2000.
Hubert M, van der Veeken S. Outlier detection for skewed data. J Chemometr 2008;22:235–46.
Lohila A, Aurela M, Regina K, Laurila T. Soil and total ecosystem respiration in agricultural fields: effect of soil and crop type. Plant Soil 2003;251:303–17.
Mancinelli R, Marinari S, Di Felice V, Savin MC, Campiglia E. Soil property, CO2 emission and aridity index as agroecological indicators to assess the mineralization of cover crop green manure in a Mediterranean environment. Ecol Indic 2013;34:31–40.
Pérez IA, García MA, Sánchez ML, de Torre B. Description of atmospheric variables measured with a RASS sodar: cycles and distribution functions. J Wind Eng Ind Aerodyn 2008;96:436–53.
Pérez IA, Sánchez ML, García MA, de Torre B. CO2 transport by urban plumes in the upper Spanish plateau. Sci Total Environ 2009a;407:4934–8.
Pérez IA, Sánchez ML, García MA, de Torre B. Daily and annual cycle of CO2 concentration near the surface depending on boundary layer structure at a rural site in Spain. Theor Appl Climatol 2009b;98:269–77.
Pérez IA, Sánchez ML, García MA. CO2 dilution in the lower atmosphere from temperature and wind speed profiles. Theor Appl Climatol 2012a;107:247–53.
Pérez IA, Sánchez ML, García MA, Pardo N. Analysis of CO2 daily cycle in the low atmosphere at a rural site. Sci Total Environ 2012b;431:286–92.
Pérez IA, Sánchez ML, García MA, Pardo N. Spatial analysis of CO2 concentration in an unpolluted environment in northern Spain. J Environ Manage 2012c;113:417–25.
Pérez IA, Sánchez ML, García MA, Pardo N. Analysis and fit of surface CO2 concentrations at a rural site. Environ Sci Pollut Res 2012d;19:3015–27.
Pérez IA, Sánchez ML, García MA, Pardo N. Carbon dioxide at an unpolluted site analysed with the smoothing kernel method and skewed distributions. Sci Total Environ 2013;456–457:239–45.
Sánchez ML, Pérez IA, García MA. Study of CO2 variability at different temporal scales recorded in a rural Spanish site. Agric For Meteorol 2010;150:1168–73.
Sparks N, Toumi R. Remote sampling of a CO2 point source in an urban setting. Atmos Environ 2010;44:5287–94.
Tajuddin IH. A comparison between two simple measures of skewness. J Appl Stat 1999;26:767–74.
Vose D. Risk analysis: a quantitative guide. Chichester: Wiley; 2008.
Wang Y, Munger JW, Xu S, McElroy MB, Hao J, Nielsen CP, et al. CO2 and its correlation with CO at a rural site near Beijing: implications for combustion efficiency in China. Atmos Chem Phys 2010;10:8881–97.
Wilks DS. Statistical methods in the atmospheric sciences. Amsterdam: Academic Press; 2011.
Xu X, Yi C. The influence of geometry on recirculation and CO2 transport over forested hills. Meteorol Atmos Phys 2013;119:187–96.
Yu KN, Cheung YP, Cheung T, Henry RC.
Identifying the impact of large urban airports on local air quality by nonparametric regression. Atmos Environ 2004;38:4501–7.
