
Journal of Environmental Radioactivity journal homepage: www.elsevier.com/locate/jenvrad

Description of spatial patterns of radionuclide deposition by lognormal distribution and hot spots

Andry Grubich a,*, V.I. Makarevich b, O.M. Zhukova c

a CJSC TIMET, 220015 Minsk, Belarus
b Belarussian State Institute of Metrology, Minsk, Belarus
c The Republican Centre of Radiation Control and Environmental Monitoring, Minsk, Belarus

Article info

Article history: Received 7 May 2013; received in revised form 3 September 2013; accepted 4 September 2013; available online 19 October 2013.

Keywords: Hot spots; Lognormal distribution; Chernobyl fallout; Fukushima fallout; 90Sr, 134Cs, 137Cs, 238Pu, 239+240Pu, 241Am

Abstract

Spatial distributions of activity density (kBq/m²) and activity concentration (Bq/kg) are studied on sites with non-cultivated soils. Fitting lognormal, Weibull and normal distributions to datasets with sample size n ≥ 60 showed that radionuclide deposition (90Sr, 137Cs, 238Pu, 239+240Pu, 241Am) due to Chernobyl fallout is described by the Weibull distribution in no more than 10% of cases, and by the lognormal distribution in the rest. However, the asymptotics of the right-hand tail of the empirical (sample) distribution quite often differ from the right-hand tail asymptotics of the lognormal distribution. The lognormal distribution is therefore only an approximate statistical model of the spatial pattern of radionuclides. Estimates of the site surface area with "hot spots" are considered. Distributions of 137Cs and 134Cs activity concentration on the territory contaminated by Fukushima fallout are also reviewed, and characteristics of activity concentration for the Fukushima and Chernobyl fallouts are compared. The results suggest that in both cases the spatial contamination of soil is described by approximately the same statistical models. © 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Daniels and Higgins (2002) gave a brief review of how frequently lognormal, Weibull and normal distributions are used to describe the content of radionuclides in different objects (grain, soil, plants) for different sources of radionuclide release into the environment (weapons and Chernobyl fallout, the Savannah River Plant, and so on). According to this review, in the vicinity of several nuclear production reactors, data on the content of 137Cs in different objects are most frequently described by the lognormal distribution, less frequently by the Weibull distribution, and least frequently by the normal distribution. The review also presents results of studies of soil contamination by Chernobyl fallout, but in this case no data on frequencies are given. As far as we know, no corresponding research has been carried out for Chernobyl deposition. One of the objectives of this article is to bridge this gap.

Daniels and Higgins (2002) paid special attention to the following results. Lognormal distributions describe the 137Cs and 90Sr contamination of the inhabited areas of the Bryansk region as a whole, except in the finite collection of points forming the tails of each distribution (Arutyunyan et al., 1993). For the fallout pattern of 137Cs in Austria, extreme values have a higher probability of occurrence than one would expect from a lognormal distribution (Pausch et al., 1998). In this connection it is worth noting that the shape of the right-hand tail of the distribution in the range of maximum extreme values in fact determines the site surface area occupied by so-called "hot spots". An estimate of the hot-spot area is essential information for protection and remediation measures. This is why the article analyzes the tail shapes of the empirical distributions for the datasets under study and considers several ways of computing the hot-spot area.

Finally, the article compares a number of characteristics of soil contamination by Chernobyl and Fukushima fallout, which is of interest for finding relations between statistics that are common to large-scale accidents at nuclear power plants.

2. Data

2.1. Site P3

* Corresponding author. Tel.: +375 172840818. E-mail address: [email protected] (A. Grubich). 0265-931X/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.jenvrad.2013.09.004

Letter P in the site name implies, as in Grubich (2012), that this site is located on the territory of Polessie State Radiation Ecological

A. Grubich et al. / Journal of Environmental Radioactivity 126 (2013) 264–272

Table 1a. Characteristics of sites' radioactive activity densities.

Site   Dataset  Nuclide     Sample size  x0 (kBq m⁻²)  CV      R/s    x(n)/x(1)  Skewness  Kurtosis
P3     1        137Cs       142          3181.0        0.499   4.50   18.4       0.687     0.1250
P3     2        90Sr        71           274.0         0.912   5.73   69.9       2.813     10.65
P3     3        241Am       114          14.4          0.735   7.26   65.5       3.272     15.17
P3     4        238Pu       85           2.78          0.853   6.33   80.0       2.853     10.50
P3     5        239,240Pu   85           5.89          0.778   6.36   72.2       2.713     9.976
P4.1   6        137Cs       34           2579.0        0.417   3.94   5.50       0.507     0.311
P4.1   7        90Sr        33           309.1         0.522   4.00   9.32       0.614     0.027
P4.1   8        241Am       34           14.9          0.421   4.93   9.13       1.285     3.155
P4.1   9        238Pu       34           3.19          0.712   5.25   19.4       2.651     9.019
P4.1   10       239,240Pu   34           6.65          0.612   5.11   15.4       2.031     6.076
P4.2   11       137Cs       34           3551.0        0.514   4.40   7.59       1.114     1.508
P4.2   12       90Sr        34           313.8         0.670   3.71   10.8       1.038     0.109
P4.2   13       241Am       34           10.4          0.404   4.13   4.60       1.007     0.681
P4.2   14       238Pu       34           2.55          0.375   4.50   4.91       0.802     1.032
P4.2   15       239,240Pu   34           5.41          0.390   4.55   4.98       1.102     1.735
P4.3   16       137Cs       34           4031.0        0.365   4.55   5.00       1.049     1.328
P4.3   17       90Sr        34           363.8         0.505   4.07   11.4       0.641     0.159
P4.3   18       241Am       34           11.8          0.343   4.74   5.57       0.514     1.177
P4.3   19       238Pu       34           3.16          0.376   4.62   4.93       1.012     1.522
P4.3   20       239,240Pu   34           6.80          0.382   4.61   4.75       1.019     1.812
P4.4   21       137Cs       34           4447.0        0.350   4.54   5.02       0.681     0.566
P4.4   22       90Sr        33           488.2         0.650   3.70   19.3       0.559     0.396
P4.4   23       241Am       34           20.1          0.414   3.83   6.39       0.559     0.277
P4.4   24       238Pu       34           4.20          0.599   4.34   9.15       1.994     4.632
P4.4   25       239,240Pu   34           8.97          0.586   4.54   9.87       2.065     5.139
P4     26       137Cs       136          3652.1        0.450   5.06   9.85       0.855     0.878
P4     27       90Sr        134          368.3         0.637   5.01   19.3       1.089     1.176
P4     28       241Am       136          14.3          0.488   4.86   9.92       1.288     1.909
P4     29       238Pu       136          3.28          0.590   6.18   19.4       2.602     9.488
P4     30       239,240Pu   136          6.96          0.560   6.45   18.4       2.385     8.646
P5     31       137Cs       100          2848.0        0.228   5.45   3.20       1.424     3.463
P6     32       137Cs       60           456.1         0.214   3.70   2.22       0.348     0.766
B1     33       137Cs       100          83.0          0.277   5.72   4.50       1.014     1.743
B3     34       137Cs       100          80.2          0.201   4.82   2.57       0.663     0.0570
P2     35       137Cs       84           913.0         0.456   5.36   8.50       1.004     1.655

Table 1b. Characteristics of sites' radioactive activity concentrations.

Site   Dataset  Nuclide     Sample size  x0 (Bq kg⁻¹)  CV      R/s    x(n)/x(1)  Skewness  Kurtosis
P4.1   36       137Cs       34           12,713.0      0.603   4.90   11.9       1.826     4.805
P4.1   37       90Sr        33           1851.9        1.037   4.57   35.8       2.350     6.184
P4.1   38       241Am       34           97.7          1.431   4.73   48.3       3.257     10.43
P4.1   39       238Pu       34           12.3          1.728   5.82   82.6       4.925     25.94
P4.1   40       239,240Pu   34           39.1          1.535   5.69   65.4       4.557     22.41
P4.2   41       137Cs       34           19,506.0      0.656   4.21   11.6       1.386     1.878
P4.2   42       90Sr        34           1776.3        0.696   3.56   17.4       0.508     0.767
P4.2   43       241Am       34           61.4          0.833   4.83   14.0       2.610     7.741
P4.2   44       238Pu       34           13.8          0.638   5.37   10.1       2.840     10.69
P4.2   45       239,240Pu   34           29.8          0.669   5.47   10.5       3.135     12.83
P4.3   46       137Cs       34           20,747.0      0.698   4.63   11.2       2.656     7.756
P4.3   47       90Sr        34           1921.3        0.844   4.53   31.3       2.557     7.389
P4.3   48       241Am       34           67.8          1.204   5.65   28.1       4.368     20.67
P4.3   49       238Pu       34           17.0          0.905   5.17   14.2       3.467     13.29
P4.3   50       239,240Pu   34           36.6          0.906   5.15   15.4       3.413     12.82
P4.4   51       137Cs       34           18,827.0      0.615   4.82   8.50       2.196     5.926
P4.4   52       90Sr        33           2171.4        0.777   3.81   35.0       0.839     0.071
P4.4   53       241Am       34           72.2          0.661   5.28   13.8       2.566     8.930
P4.4   54       238Pu       34           16.8          0.601   3.65   7.51       1.351     1.324
P4.4   55       239,240Pu   34           30.9          0.598   3.81   7.74       1.371     1.432
P4     56       137Cs       136          17,948.3      0.678   5.77   21.4       2.222     6.282
P4     57       90Sr        134          1929.0        0.840   5.45   47.7       1.802     4.141
P4     58       241Am       136          77.8          1.150   7.40   48.3       4.243     20.99
P4     59       238Pu       136          16.7          1.159   9.98   82.6       6.578     55.51
P4     60       239,240Pu   136          35.4          1.047   9.22   65.4       5.464     39.23
B1     61       137Cs       100          323.0         0.280   5.07   3.72       0.809     0.592
P2     62       137Cs       84           3622.0        0.694   7.13   20.1       3.438     16.71
F      63       137Cs       96           31,725.0      1.829   7.06   1108.0     4.076     21.14
F      64       134Cs       96           25,933.0      1.818   6.99   1138.0     4.033     20.59


Reserve (PSRER). On site P3 sampling was carried out using a central aligned square grid of 1 km × 1 km, so that a sampling point was situated in the center of each block of the regular grid. At each sampling point five increments were taken (four at the corners of a small square measuring about 10 m × 10 m and one at its center). A metal sampler that gave 20-cm-deep cores with a cross section of 12.56 cm² was used for increment sampling. The cores were then mixed to obtain a composite sample. If the sampling point was in the forest, the sampling sites were cleared of forest floor materials (branch and leaf litter, etc.). In total, 142 composite samples were taken on site P3. The site area was 142 km². Site P3 had an irregular shape. Geographical coordinates of sampling points on site P3 varied within the following range: longitude from 29°56′15.6″ E to 30°06′07″ E; latitude from 51°30′25.5″ N to 51°36′29.5″ N.

The 142 composite samples are presently stored in the samples bank of the Republican Centre of Radiation Control and Environmental Monitoring (RCRCEM), Minsk. In test samples, the activities of 90Sr, 137Cs, 241Am and 238Pu and the gross activity of 239Pu and 240Pu (hereafter 239+240Pu) were measured. In 2009 the results of the activity measurements were used for mapping the radioactive contamination of PSRER. Radionuclide activity was measured by gamma-spectrometry (137Cs, 241Am), alpha-spectrometry (238Pu, 239+240Pu) and beta-radiometry (90Sr). For the alpha-spectrometry measurements, plutonium was extracted from the samples by a radiochemical method. Samples for determination of 90Sr were also prepared by a radiochemical method. The activity of 137Cs was measured in all 142 samples, whereas the activities of 90Sr, 241Am, 238Pu and 239+240Pu were measured only in some of the samples: 71 for 90Sr, 114 for 241Am, and 85 for 238Pu and 239+240Pu. The sampling points corresponding to these samples were distributed rather evenly across site P3.

The mean values of sample activity density, x0 (kBq m⁻²), and activity concentration, x0 (Bq kg⁻¹), are listed in Table 1 as of January 1, 2009. Table 1 also shows non-dimensional quantities that characterize the nonuniformity of site contamination:

- Coefficient of variation, CV = s/x0, where s is the sample standard deviation and x0 is the sample mean.
- The "studentized" range, R/s, where R = x(n) − x(1); x(n) and x(1) are the maximal and minimal values of activity density (concentration), and n is the sample size.
- Extremal quotient, x(n)/x(1).

To make the description of the datasets more comprehensive, Table 1 also shows the values of sample skewness, g1, and sample kurtosis, g2.

2.2. Sites P4.1–4.4 and P4

Bondar et al. (2011) studied in detail several sites located on the territory of PSRER, including site No. 2, on which sampling was carried out using a central aligned square grid of 0.5 km × 0.5 km. Bondar et al. did not fit any distributions to these datasets. In this article, the results of sample activity measurements for site No. 2 were used to compose datasets for four equal-area sub-sites P4.1–P4.4 with surface areas of 8.5 km² each, which together form the site designated P4. Considering the four sub-sites P4.1–P4.4 makes it possible, on the one hand, to compare the characteristics of radioactive contamination of adjacent sites and, on the other hand, to find out how the spatial distribution of each radionuclide varies as the site surface area increases. Table 1 shows the characteristics of radionuclide contamination of both the sub-sites P4.1–P4.4 and site P4. For two sampling points the uncertainty of the 90Sr specific activity measurement in soil samples was anomalously large. These data are excluded from the analysis, so the 7th, 22nd, 37th and 52nd datasets in Table 1 have sample size n = 33 instead of 34. The boundaries between sub-sites P4.1–P4.4 run South–North and West–East and intersect at the point with coordinates 30°00′43.4″ E and 51°35′00″ N. It should be noted that site P4, with a surface area of 34 km², is located on the territory of site P3, with a surface area of 142 km². The sampling points on sites P3 and P4 are not the same, and soil sampling on these sites was done in different years.

2.3. Sites P5, P6

The surface areas of sites P5 and P6 were 25 m² and 15 m², respectively. The geographical coordinates of these tiny sites are 30°01′19.9″ E, 51°30′60.4″ N (site P5) and 29°52′30″ E, 51°39′15″ N (site P6). The 137Cs activity in the soil of sites P5 and P6 was measured in situ with the detector placed inside a lead collimator, which was put directly on the soil surface (Grubich, 2012). The measurements were carried out using a central aligned square grid of 0.5 m × 0.5 m. The results are given in Table 1a as of August 1, 2011.

2.4. Territory F

Numerous data on radionuclide activity concentration in soil resulting from Fukushima fallout are available on the internet. This article reviews datasets for 137Cs and 134Cs activity concentration published by the Ministry of Education, Culture, Sports, Science and Technology, MEXT (http://eq.wide.ad.jp/files_en/110606soil_1000_en.pdf). This file contains information on the contamination of 96 sampling points located within an area of about 5400 km² (the area of a semi-circular region bounded by the minimal, 20 km, and maximal, 62 km, distances to the Fukushima Dai-ichi Nuclear Power Plant). In Table 1b the territory on which these sampling points are located is labeled F. The data for the Fukushima fallout given in Table 1b can be interpreted as describing the properties of spatial distributions if the sampling points are arranged according to simple random sampling. This condition can be considered roughly satisfied. At some sampling points only one sample was taken (sampling depth 5 cm); at others several soil samples were taken. In total, 1110 samples were taken at the 96 sampling points. For instance, at the sampling point designated in this file as "[1] 62 km North/West", 47 samples were taken within the period from April 14 to June 2011. Where more than one sample was taken at a sampling point, the activity concentration at that point is taken below to be the arithmetic mean (mean of increments).

2.5. Sites B1, B3 and P2

Radioactive contamination of sites B1, B3 and P2 was considered earlier in Grubich (2012), where the symbols used here were first introduced. However, normal and Weibull distributions were not fitted in that work. Contamination characteristics of sites B1, B3 and P2 are given in Table 1; the corresponding dates of measurement are August 1, 2009, July 1, 2010 and December 20, 2008, respectively. Thus, all values of the sample mean, x0, in Table 1 are given as of the time when the relevant measurements were made. Recall that 241Am is the daughter of 241Pu, so the 241Am activity in soil is determined mainly by the decay of 241Pu over the period of time since the radioactive fallout.

Of all the sites described in Section 2, site B3 had the smallest surface area, 1.56 m², and territory F the largest, 5400 km². What the datasets in Table 1 have in common is that all of them were obtained either by sampling or by in situ measurements on sites that were not cultivated after the radioactive fallout.
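The non-dimensional statistics defined in Section 2.1 (coefficient of variation, studentized range, extremal quotient, skewness and kurtosis) can be computed from a dataset along the following lines. This is only a minimal sketch: the function name `contamination_stats` and the sample values are made up for illustration, and the moment-based estimators of skewness and excess kurtosis are an assumption, since the article does not state which estimators were used for Table 1.

```python
import numpy as np

def contamination_stats(x):
    """Non-dimensional statistics of one dataset (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()                      # sample mean, x0
    s = x.std(ddof=1)                    # sample standard deviation, s
    dev = x - mean
    m2 = np.mean(dev ** 2)
    g1 = np.mean(dev ** 3) / m2 ** 1.5   # skewness (assumed moment estimator)
    g2 = np.mean(dev ** 4) / m2 ** 2 - 3 # excess kurtosis (assumed estimator)
    return {
        "n": x.size,
        "mean": mean,
        "CV": s / mean,                          # coefficient of variation
        "R/s": (x.max() - x.min()) / s,          # studentized range
        "extremal_quotient": x.max() / x.min(),  # x(n)/x(1)
        "skewness": g1,
        "kurtosis": g2,
    }

# Made-up activity values, not data from Table 1.
sample = [2.0, 3.0, 4.0, 5.0, 16.0]
stats = contamination_stats(sample)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```

A single large value in a small sample is enough to push CV, the extremal quotient and the skewness up together, which is the behaviour the tables summarize.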

3. Methods

3.1. Fitting distributions to dataset

For fitting lognormal, normal and Weibull distributions to datasets we used simple graphic methods. The method of probability paper is well known (Gilbert, 1987). However, for fitting normal or lognormal distributions to data one may also use the relationship between the order statistics x(i), or their logarithms ln x(i), and the corresponding quantiles z_i (i = 1, 2, ..., n) of the standard normal distribution. In this case the fitting procedure consists in determining the linear regression equation

\[ y = a + bz, \tag{1} \]

which describes the scatter of points (z_i, x(i)) for the normal distribution and of points (z_i, ln x(i)) for the lognormal distribution. For the lognormal distribution this method is described in detail in Burmaster and Hull (1996), where the value of the empirical (sample) distribution function F_n(x) at x = x(i) is defined by the expression

\[ F_n(x_{(i)}) = (i - 0.5)/n. \tag{2} \]

To fit the Weibull distribution to the datasets we used the method whose idea is described, in particular, in Johnson et al. (1994). According to this method the following linear regression equation is found:

\[ y = a + bu, \tag{3} \]

which describes the scatter of points (u_i, ln x(i)), where

\[ u_i = \ln\ln\left[1 - F_n(x_{(i)})\right]^{-1}. \tag{4} \]

Equation (3) and definition (4) follow from the expression obtained by taking double logarithms of the Weibull cumulative distribution function

\[ W(x) = 1 - \exp\left[-(x/\beta)^{\alpha}\right] \tag{5} \]

and from the supposition that the empirical distribution is adequately described by the Weibull distribution.

As an example, Fig. 1a shows the fit of the lognormal distribution L(μ, σ²) to the dataset for 137Cs on territory F (coefficient of determination R² = 0.992). For comparison, the probability scale in percent used in the method of logarithmic probability paper is shown on top of the plot. The results of fitting the Weibull distribution W(α, β) to the same dataset are shown in Fig. 1b. For the Weibull distribution the coefficient of determination, R² = 0.925, proved to be much smaller than for the lognormal distribution. The plot for fitting this dataset with the normal distribution N(μ, σ²) is not given here, as the coefficient of determination is too small in this case, R² = 0.530.

Fig. 1. Fitting of distribution to dataset: fitting of lognormal distribution (a) and Weibull distribution (b) to 63rd dataset.

Of the lognormal, normal and Weibull distributions, the one with the largest coefficient of determination R² will be considered the distribution that best describes a dataset. Recall that the coefficients in equation (1) are estimators of the parameters of the lognormal distribution (points (z_i, ln x(i))) and of the normal distribution (points (z_i, x(i))): μ ≈ a, σ ≈ b. The coefficients in equation (3) are estimators of the Weibull distribution parameters: α ≈ 1/b, β ≈ exp(a).

3.2. Area of site surface with contamination in arbitrary interval

Cheng et al. (1994) used the quantity

\[ S(X > x) = [1 - F(x)]\,Q \tag{6} \]

for geochemical anomaly separation, where S(X > x) represents the area enclosed by contours which have contour values greater than x; F(x) is the distribution function; Q is the area of the site. Evidently, formula (6) can also be used to describe the total area of the site surface with activity density (kBq m⁻²) or concentration (Bq kg⁻¹) larger than a selected value x. From (6) it follows that the area of the site with activity density or concentration in the range from x_a to x_b equals

\[ S(x_a < X \le x_b) = [F(x_b) - F(x_a)]\,Q. \tag{7} \]

Cheng et al. (1994) considered the case in which the distribution function F(x) is determined from the spatial distribution of the measured quantity represented in the form of a map: to determine F(x), a grid with cells of equal size is laid over the map, and the number of cells with various values of the quantity x is counted. This method is appropriate in cases where sampling points are scattered across the site quite irregularly. For radioactive contamination, this method was used to determine numerical values of statistics for the fallout of 137Cs in the Mogilev region of Belarus (Grubich, 2012).

Datasets 1–62 in Table 1 were obtained by systematic grid sampling, in which equal-sized blocks (grid cells) correspond to the data x1, x2, ..., xn. In this case either the empirical distribution function (2) itself or the distribution function that best describes (2) can be used in (6) and (7). The same is valid for random sampling within blocks and for simple random sampling. In the latter case the applicability of the empirical distribution function (2) in (6) and (7) is less evident (one has to imagine blocks of various shapes with the same surface area). Recall that datasets 63 and 64 in Table 1 can approximately be considered as obtained by simple random sampling.

Fig. 2. Quantity ξ (equation (8)) against coefficient of variation CV.
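The regression-based fitting procedure of Section 3.1 (plotting positions (i − 0.5)/n, a straight-line fit against normal quantiles for the normal and lognormal cases, and against ln ln[1 − F_n]⁻¹ for the Weibull case) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, ordinary least squares via `numpy.polyfit` is assumed for the regression, and the demonstration data are synthetic.

```python
import math
from statistics import NormalDist

import numpy as np

def plotting_positions(n):
    """Empirical distribution function at the order statistics, (i - 0.5)/n."""
    i = np.arange(1, n + 1)
    return (i - 0.5) / n

def _line_fit(u, y):
    """Least-squares line y = a + b*u and its coefficient of determination."""
    b, a = np.polyfit(u, y, 1)            # polyfit returns [slope, intercept]
    r2 = np.corrcoef(u, y)[0, 1] ** 2
    return a, b, r2

def fit_by_probability_plot(x):
    xs = np.sort(np.asarray(x, dtype=float))        # order statistics x(i)
    F = plotting_positions(xs.size)
    z = np.array([NormalDist().inv_cdf(p) for p in F])  # standard normal quantiles
    u = np.log(-np.log(1.0 - F))                    # ln ln [1 - F_n]^(-1)
    a_n, b_n, r2_n = _line_fit(z, xs)               # normal: points (z_i, x(i))
    a_l, b_l, r2_l = _line_fit(z, np.log(xs))       # lognormal: (z_i, ln x(i))
    a_w, b_w, r2_w = _line_fit(u, np.log(xs))       # Weibull: (u_i, ln x(i))
    return {
        "normal": {"mu": a_n, "sigma": b_n, "R2": r2_n},
        "lognormal": {"mu": a_l, "sigma": b_l, "R2": r2_l},
        "weibull": {"alpha": 1.0 / b_w, "beta": math.exp(a_w), "R2": r2_w},
    }

# Synthetic lognormal sample, not data from the article.
rng = np.random.default_rng(0)
demo = rng.lognormal(mean=8.0, sigma=0.8, size=100)
fits = fit_by_probability_plot(demo)
best = max(fits, key=lambda name: fits[name]["R2"])
print(best, {name: round(f["R2"], 3) for name, f in fits.items()})
```

Selecting the model with the largest R² mirrors the selection rule stated in Section 3.1; it is a simple graphical criterion, not a formal goodness-of-fit test.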

4. Results and discussion

4.1. Frequency distributions

According to the derived coefficients of determination, the normal distribution describes none of the datasets in Table 1. The same conclusion can be drawn from the skewness and kurtosis values listed in Table 1. On sites P4.1–4.4 the spatial distributions of activity density are most often described by the lognormal distribution: 14 datasets out of 20. The remaining 6 datasets (7, 8, 17, 18, 22 and 23 in Table 1a) are described by Weibull distributions. Recall that sites P4.1–4.4 are parts of site P4. If instead of sites P4.1–4.4 we consider site P4, then for each of the radionuclides (90Sr, 137Cs, 238Pu, 239+240Pu, 241Am) the spatial distribution of activity density is best described by the lognormal distribution. The same holds for the distributions of activity concentration. This suggests that, to determine the type of distribution, one has to use samples much larger than those for sites P4.1–4.4 (at any rate, for a central aligned square grid). Therefore, below, instead of the datasets for individual sites P4.1–4.4 we consider only the datasets for the aggregate site P4.

For Chernobyl fallout (sites B1, B3, P2–P6) the following frequency distributions hold:

- For activity density, 14 datasets (2–5, 26–35 in Table 1a) out of 15 are described by the lognormal distribution, with coefficients of determination from 0.956 to 0.994, and only one dataset (the 1st, the distribution of 137Cs on P3) is best described by a Weibull distribution. For this dataset the coefficients of determination for the Weibull, lognormal and normal distributions were 0.985, 0.969 and 0.959, respectively.
- All seven datasets for activity concentration (datasets 56–62 in Table 1b) are best described by the lognormal distribution, with coefficients of determination from 0.929 to 0.995.

The spatial distributions of 134Cs and 137Cs activity concentration resulting from Fukushima fallout (territory F in Table 1b) are also best described by lognormal distributions, with coefficients of determination equal to 0.992. Thus, the datasets in Table 1 with sample size n ≥ 60 are in the majority of cases best described by the lognormal distribution.

4.2. Hot spots

According to the data in Table 1, for the datasets on sites B1, B3, P2–P6 and F the values of the non-dimensional quantity

\[ \xi = (x_{(n)} - x_0)/s \tag{8} \]

vary in the range from 2.06 to 9.24. Fig. 2 shows ξ against the sample coefficient of variation CV = s/x0. Circles and squares in Fig. 2 denote the values of ξ for activity density and activity concentration, respectively. As the plots in Fig. 2 show, there is a clear-cut trend for ξ to increase with the coefficient of variation. This trend is explained by the fact that the overwhelming majority of the datasets are best described by lognormal distributions. Indeed, the datasets under consideration have sample sizes n in the range from 60 to 142. For normally distributed samples of this size, ξ exceeding three is extremely unlikely. However, out of the 24 datasets being analyzed, ξ is less than three only for datasets 1, 32 and 34 (ξ equal to 2.76, 2.06 and 2.91, respectively). For the other 21 datasets we have ξ > 3, an event practically inconceivable for the normal distribution and quite expected for the lognormal distribution, since the shape of the lognormal distribution differs from that of the normal distribution the more, the larger the coefficient of variation is.

As an example, Fig. 3 shows plots of the probability density function of the lognormal distribution L(μ, σ²) for a common value of the parameter μ = 8 but for different values of the parameter σ (curves a, b and c). The values of σ were selected so that the coefficient of variation CV = [exp(σ²) − 1]^(1/2) of a lognormally distributed variable X equals 0.2, 0.3 and 0.5 for curves a, b and c, respectively. For reference, Fig. 3 also shows the probability density function of the normal distribution N(μ, σ²) with parameters μ = 3040 and σ = 608 (curve A). For a normally distributed variable X the coefficient of variation is CV = σ/μ, so CV = 0.2 for curve A. As can be seen, for small coefficients of variation the plots of both distributions are quite similar in appearance. But as the coefficient of variation increases, the skewness of the lognormal distribution increases rapidly and its shape differs more and more from the shape of the normal distribution.

Fig. 3. Probability density functions of lognormal (curves a, b, c) and normal (curve A) distributions.
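The lognormal relation CV = [exp(σ²) − 1]^(1/2) used for the curves of Fig. 3 can be inverted to obtain σ for a target CV. The short sketch below does this for CV = 0.2, 0.3 and 0.5 with the common parameter 8 quoted in the text; the helper names are hypothetical. For CV = 0.2 the resulting lognormal mean and standard deviation come out at about 3040 and 608, which agrees with the parameters quoted for the normal curve A (that curve A was chosen to match in this way is our reading, not something the text states).

```python
import math

def sigma_from_cv(cv):
    """Invert CV = sqrt(exp(sigma**2) - 1) for the lognormal shape parameter."""
    return math.sqrt(math.log(1.0 + cv ** 2))

def lognormal_mean(mu, sigma):
    """Mean of a lognormal variable with log-scale parameters mu and sigma."""
    return math.exp(mu + 0.5 * sigma ** 2)

mu = 8.0  # common log-scale parameter quoted for the curves of Fig. 3
for label, cv in zip("abc", (0.2, 0.3, 0.5)):
    sigma = sigma_from_cv(cv)
    mean = lognormal_mean(mu, sigma)
    sd = cv * mean  # standard deviation follows from CV = sd / mean
    print(f"curve {label}: CV={cv}, sigma={sigma:.4f}, mean={mean:.1f}, sd={sd:.1f}")
```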


Recall that the 1st dataset (the distribution of 137Cs on site P3) is best described by a Weibull distribution, and that the 32nd and 34th datasets correspond to the tiny sites P6 and B3 with surface areas of 15 m² and 1.56 m², respectively. For the two latter datasets, the coefficients of variation had the smallest values (CV = 0.214 and CV = 0.201). Thus, if a dataset is best described by the lognormal distribution and the sample coefficient of variation is sufficiently large (CV > 0.25), then even for a dataset with sample size n ≈ 100 the value of x(n) will, as a rule, be larger than the quantity (x0 + 3s). In such cases values x ~ x(n) may belong to highly contaminated local areas, or hot spots. (Here and hereinafter the tilde sign means "of the same order of magnitude as".)

To avoid possible misunderstanding, the following should be pointed out. For the data shown in Fig. 2 no dependence of ξ (equation (8)) on sample size is observed (which is easy to show by plotting the points (n, ξ)). This is explained by the fact that for sites B1, B3, P2–P6 and F the sample size does not change much (n from 60 to 142).

4.2.1. Evaluation of hot spots' area on site

To evaluate the total area of hot spots one can use both expression (6) and expression (7). From (7) it follows that the area of the site surface with contamination values in a small neighborhood of x (from [x − Δx] to [x + Δx]) equals

\[ S_x = 2\Delta x\,(dF/dx)\,Q, \tag{9} \]

where dF/dx is the probability density function (Δx/x [...]

\[ \Pr\{X > x\} = 1 - F_n(x), \tag{10} \]

described by the hyperbolic probability distribution:

\[ \Pr\{X > x\} = c\,x^{-d}, \tag{11} \]

where F_n(x) is the empirical distribution function; c and d are positive constants. Note that distribution (11) was also used in Cheng et al. (1994). Using the expression

\[ F_n(x) = 1 - c\,x^{-d}, \tag{12} \]

it is easy to compute the area of the site surface with values x ~ x(n) by means of expression (9).

Let us now find out how close the computed areas of the site surface with x ~ x(n) are when the empirical and the lognormal distributions are used. According to (9), one may use for this purpose, for example, the numerical values of the quantity

\[ r_L = (dL/dx)/(dF_n/dx), \tag{13} \]

where dL/dx and dF_n/dx are the probability density functions of the lognormal and empirical distributions, respectively, evaluated at a given x. Indeed, a deviation of r_L from one means that the numerical value of the site surface area with contamination level in the neighborhood of x (equation (9)) depends essentially on the probability density function (dL/dx or dF_n/dx) used in the computation. Evidently, if for some dataset ratio (13) differs noticeably from one, then the estimates (6) and (7) will also depend noticeably on which distribution (lognormal or empirical) is used. For this reason quantity (13) can be considered a measure of the "sensitivity" of estimates (6), (7) and (9) to the distribution used in the computations.

Fig. 4 shows the results of computing quantity (13) at x = x(n) for the datasets best described by lognormal distributions (datasets 2–5, 26–35 and 56–64 for sites B1, B3, P2–P6 and F). Circles and squares in Fig. 4 denote the values of ratio (13) for activity density and activity concentration, respectively. As can be seen, the values of ratio (13) are scattered over the range from 0.0034 to 3.08. Consequently, for most of these datasets the use of lognormal distributions in place of the empirical ones results in unacceptably large errors in the estimated area of the site surface with activity density or activity concentration of the order of the maximal extreme value x(n).

Fig. 4. Ratio r_L (equation (13)) for x = x(n) against ξ (equation (8)).

The scatter of r_L about unity (r_L = 1) implies that the slope of the right-hand tail of the empirical distribution in the majority of cases differs from the slope of the right-hand tail of the lognormal distribution. Examples of such differences in the tail shapes of empirical and lognormal distributions are given in Section 1. Note that Pausch et al. (1998) consider a case in which ratio (13) is much less than unity.
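One possible way to evaluate the ratio r_L of equation (13) is sketched below: a hyperbolic tail Pr{X > x} = c·x^(−d) is fitted by log-log regression to the largest order statistics, its density c·d·x^(−d−1) stands in for the empirical density, and the lognormal density is divided by it. The article does not say how many tail points were used or how the tail was fitted, so the choice of the k largest values, the function names and all numbers in the demonstration are assumptions.

```python
import math

import numpy as np

def fit_hyperbolic_tail(x, k=10):
    """Fit Pr{X > x} = c * x**(-d) to the k largest values (k is an assumption)."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    surv = 1.0 - (np.arange(1, n + 1) - 0.5) / n   # 1 - F_n at the order statistics
    slope, intercept = np.polyfit(np.log(xs[-k:]), np.log(surv[-k:]), 1)
    return math.exp(intercept), -slope             # c, d

def lognormal_pdf(x, mu, sigma):
    """Density of the lognormal distribution with log-scale parameters mu, sigma."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def ratio_r_L(x, mu, sigma, c, d):
    """Lognormal density divided by the density of the fitted hyperbolic tail."""
    tail_pdf = c * d * x ** (-d - 1.0)   # derivative of 1 - c * x**(-d)
    return lognormal_pdf(x, mu, sigma) / tail_pdf

# Synthetic sample lying exactly on a hyperbolic tail with c = 1, d = 1.7.
n = 50
surv = 1.0 - (np.arange(1, n + 1) - 0.5) / n
demo = surv ** (-1.0 / 1.7)
c_hat, d_hat = fit_hyperbolic_tail(demo)
# mu and sigma below are hypothetical lognormal parameters for illustration only.
r = ratio_r_L(float(demo.max()), mu=0.5, sigma=0.8, c=c_hat, d=d_hat)
print(f"c={c_hat:.3f}, d={d_hat:.3f}, r_L={r:.3f}")
```

Values of r well away from one indicate that area estimates near the sample maximum depend strongly on which tail model is used, which is the sensitivity the text describes.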

4.3. Some relations between the quantities From data in Table 1 it follows, as it should be, that there is a dependence between values of the coefﬁcient of variation and an extreme quotient. Thus, for sites B1, B3, P2eP6, F (excluding subsites P4.1eP4.4) this dependence is described by the following regression equation:

xðnÞ =xð1Þ ¼ 83:7CV2:36 for activity density (R2 ¼ 0.935);

(14)

270

A. Grubich et al. / Journal of Environmental Radioactivity 126 (2013) 264e272

(c) Fig. 5. The hyperbolic characteristic exponent d against extreme quotient x(n)/x(1).

xðnÞ =xð1Þ ¼ 83:0CV3:01

(15)

for activity concentration (R2 ¼ 0.882). There is also the tendency for the increase of value of the coefﬁcient of variation with increase of area of site surface. It seems to be explained by the fact, that with the increase of area of site surface probability of obtaining either smaller value of quantity x(1), or larger value of quantity x(n), or both at the same time grows gradually. It is obvious that in all cases like these value of the coefﬁcient of variation increases. For sites B1, B3, P2eP6, F dependence of the coefﬁcient of variation on the area of site surface, Q (km), is described by the following regression equations (values Q from 1.56$106 km2 to 5400 km2):

CV ¼ 0:459Q 0:068

(b)

(16)

for activity density (R2 ¼ 0.848);

CV ¼ 0:582Q 0:126

(17)

for activity concentration (R2 ¼ 0.856). Besides, for sites B1, B3, P2eP6, F from data in Tables 1a and 1b follows availability of regression (R2 ¼ 0.974):

g1 ¼ 0:913x 2:011;

(18)

where g1 is the skewness; x is the value of (8). According to regression equation (18) for x ¼ 2, values of the coefﬁcient of skewness, as it should be, is close to zero. Indeed, as can be seen from plots in Fig. 2, to values of x z 2 small values of CV correspond. In this case shape of lognormal distribution begins to look like the shape of normal distribution (see plots in Fig. 3), for which, as is well-known, g1 ¼ 0. 4.3.1. The hyperbolic characteristic exponent Salvadori et al. (1996) cite numerical values of the hyperbolic characteristic exponent d in (11), and also extreme values x(1) and x(n) of activity density for 137Cs contamination of territories of a number of European countries as a result of the Chernobyl fallout. Fig. 5 shows numerical values of the hyperbolic characteristic exponent against an extreme quotient x(n)/x(1) for datasets in Table 1 without sub-sites P4.1eP4.4 and for data of (Salvadori et al., 1996). Black circles, squares and circles in Fig. 5 denote values of d for activity density, activity concentration (sites B1, B3, P2eP6, F) and for activity density as per data Salvadori et al. (1996) respectively. From plots in Fig. 5 it follows that values of the hyperbolic characteristic exponent d, obtained for sites B1, B3, P2eP6, F, are, in

Fig. 6. Probability Pr{X > x}. Points, heavy gray curves and thin black curves show the probabilities for the empirical, lognormal and Weibull distributions, respectively.

principle, consistent with the results of Salvadori et al. (1996). It should also be stressed that they are consistent with the data for the Fukushima fallout as well. For the distributions of 137Cs and 134Cs on territory F, the values of d are 1.67 and 1.70, respectively. Note the tendency of the hyperbolic characteristic exponent to decrease as the extreme quotient increases, which is explained as follows. Small values of the coefficient of variation correspond to small values of the extreme quotient. In such cases the shape of the lognormal distribution, as stated in Section 4.2, becomes similar to the shape of the normal distribution. As a consequence, the values of the hyperbolic characteristic exponent are large. Put differently, the right-hand tail of the empirical distribution becomes flatter ("heavier") as the extreme quotient increases, and to a distribution with a flatter tail there evidently

A. Grubich et al. / Journal of Environmental Radioactivity 126 (2013) 264-272


correspond smaller values of the hyperbolic characteristic exponent. Note also that the plot in Fig. 5, taken together with dependency relations (14)-(17), implies a tendency for the hyperbolic characteristic exponent to decrease as the area of the site surface increases.

4.4. Fukushima fallout

For 137Cs on territory F, Fig. 6a shows probability (10) (points) against values of activity concentration. Also shown are plots of the function [1 - F(x)] for the lognormal and Weibull distributions (heavy gray and thin black curves, respectively). The two rightmost points in Fig. 6a correspond to the order statistics x(n-1) = 263 kBq/kg and x(n) = 410 kBq/kg. As an example, we estimate the area of the surface of territory F on which the values of activity concentration belong to the interval [x(n-1), x(n)]. According to Section 2.4, the approximate surface area of territory F is 5400 km². Using the asymptotics of the empirical distribution shown in Fig. 6a (dotted straight line) and expression (7), we obtain that the area of soil surface on territory F with 137Cs activity concentration from 263 kBq/kg to 410 kBq/kg is 36.5 km². If, instead of the asymptotics of the empirical distribution, we use the lognormal distribution that best describes the dataset (the probability plot [1 - L(x)] is shown in Fig. 6a by the heavy gray curve), we obtain a surface area of 52 km². The two estimates of site surface area thus differ by 43%; equivalently, the quotient (52 km²/36.5 km²) equals 1.43. This value, as it should, approximately corresponds to the point x = 6.5, rL = 1.49 on the plot in Fig. 4, obtained for quantity (13) in the case of 137Cs on territory F. For comparison, Fig. 6b and c show plots for the datasets of 90Sr activity density and of 238Pu activity concentration on site P4. In these two examples quantity (13) assumes its maximal and minimal values for the data in Table 1, not counting sub-sites P4.1-P4.4.
In Fig. 4 these datasets correspond to the points x = 3.71, rL = 3.08 and x = 9.24, rL = 0.0034. Evidently, the estimates of site surface area with values x ∈ [x(n-1), x(n)] obtained from the lognormal and empirical distributions in these two examples will differ by as much as hundreds of percent. Note also that the plots in Fig. 6 illustrate the extent to which the shape of the empirical distribution tail differs from the lognormal distribution tail.

4.5. Practical recommendations

According to the results obtained in Section 4.1, the overwhelming majority of datasets are best described by the lognormal distribution (only the distribution parameters vary from one dataset to another). Only in rare cases are datasets described by the Weibull distribution. On the other hand, the asymptotics of the right-hand tail of the empirical distribution, as per Section 4.2.2, quite often differs from that of the right-hand tail of the lognormal distribution. Moreover, even a seemingly small difference between the shape of the empirical distribution tail and the lognormal distribution tail results in substantially different numerical estimates of the hot spots' area (see the example of the 137Cs distribution on site F considered in Section 4.4). In view of these circumstances, the following recommendations may be given for practical purposes:

1) Determine the type of distribution function that best describes the dataset.
2) Use this distribution function for computing confidence limits.
3) Compute the site surface area with contamination values in the range of interest [xa, xb] by using:

Fig. 7. Coefficient of variation CV against area of site Q.

- The distribution function, if xb is much smaller than the maximal extreme value x(n).
- The asymptotics of the probability Pr{X > x} of the empirical distribution, if xa is of the order of magnitude of x(n).
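Step 3 of the recommendations can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the area with values in [xa, xb] is taken as the site area times Pr{xa < X <= xb} (a stand-in for expression (7)), the tail asymptotics is assumed to have the hyperbolic form Pr{X > x} ≈ c·x^(-d), the empirical exceedance probability uses the plotting position (n - i + 0.5)/n, and the sample is synthetic. All function names are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev


def fit_lognormal(data):
    """Maximum-likelihood lognormal fit: mean and std of ln(x)."""
    logs = [math.log(v) for v in data]
    return fmean(logs), pstdev(logs)


def lognormal_sf(x, mu, sigma):
    """Pr{X > x} for a lognormal distribution with parameters mu, sigma."""
    return 1.0 - NormalDist(mu, sigma).cdf(math.log(x))


def fit_tail(data, k):
    """Fit Pr{X > x} ~ c * x**(-d) to the k largest values.

    Least-squares line through (ln x, ln p). The plotting position
    p = (n - i + 0.5) / n is an assumption of this sketch.
    """
    xs = sorted(data)
    n = len(xs)
    lx, lp = [], []
    for i in range(n - k + 1, n + 1):  # 1-based ranks of the k largest values
        lx.append(math.log(xs[i - 1]))
        lp.append(math.log((n - i + 0.5) / n))
    mx, mp = fmean(lx), fmean(lp)
    slope = (sum((u - mx) * (v - mp) for u, v in zip(lx, lp))
             / sum((u - mx) ** 2 for u in lx))
    return math.exp(mp - slope * mx), -slope  # (c, d)


def tail_sf(x, c, d):
    """Assumed hyperbolic tail: Pr{X > x} = c * x**(-d)."""
    return c * x ** (-d)


def area_in_range(sf, xa, xb, site_area):
    """Assumed area estimate: site area times Pr{xa < X <= xb}."""
    return site_area * (sf(xa) - sf(xb))


# Synthetic, deterministic sample built from lognormal quantiles
# (illustrative only, not measured data).
n = 200
data = [math.exp(1.0 + 0.8 * NormalDist().inv_cdf((i - 0.5) / n))
        for i in range(1, n + 1)]

mu, sigma = fit_lognormal(data)
c, d = fit_tail(data, k=20)

xs = sorted(data)
xa, xb = xs[-2], xs[-1]   # interval [x(n-1), x(n)]
Q = 5400.0                # site area in km2, as quoted for territory F

area_lognormal = area_in_range(lambda x: lognormal_sf(x, mu, sigma), xa, xb, Q)
area_tail = area_in_range(lambda x: tail_sf(x, c, d), xa, xb, Q)
print(f"lognormal estimate: {area_lognormal:.2f} km2")
print(f"tail estimate:      {area_tail:.2f} km2")
```

Both estimators are passed to `area_in_range` as a survival function, which keeps the choice between the fitted distribution and the tail asymptotics explicit, in the spirit of the two cases listed above.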

4.6. Collation of the characteristics of Fukushima and Chernobyl fallout

The plots in Figs. 2, 4 and 5 and regression equations (15), (17) and (18) were obtained with the datasets for the Fukushima fallout included. As can be seen from Figs. 2, 4 and 5, the data for the Fukushima and Chernobyl fallouts agree (the points corresponding to the Fukushima fallout do not fall outside the "cloud" of points for the Chernobyl fallout). Moreover, if we consider the data for the Chernobyl fallout only, then in place of regression equations (17) and (18) we obtain the regression equations given for this case in Figs. 7 and 8, respectively. For comparison, the data for territory F are marked with gray squares in Figs. 7 and 8. As can be seen, the regression equations do not change noticeably, although the surface area of territory F is much larger than that of the remaining sites in Table 1. In our view, these circumstances highlight the following:

- The degree of soil contamination nonuniformity resulting from both fallouts is approximately the same.
- For both fallouts the dependencies between statistics are described by analogous regression equations with close numerical values of the coefficients.

These suppositions naturally need to be checked more thoroughly.
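The regression equations referred to in this section can be evaluated numerically. The sketch below reads equations (16)-(18) as CV = 0.459·Q^0.068, CV = 0.582·Q^0.126 and g1 = 0.913·x - 2.011 (with Q in km²), and shows one common way to obtain a power-law regression of the form CV = a·Q^b: ordinary least squares on ln CV against ln Q. The fitting method is an assumption, since the text does not state how the regressions were computed. The function names are hypothetical, and the fitted points are generated from equation (17) itself rather than taken from Table 1.

```python
import math


def cv_density(q_km2):
    """Equation (16) as read here: CV for activity density."""
    return 0.459 * q_km2 ** 0.068


def cv_concentration(q_km2):
    """Equation (17) as read here: CV for activity concentration."""
    return 0.582 * q_km2 ** 0.126


def skewness(x):
    """Equation (18) as read here: skewness g1 against quantity x."""
    return 0.913 * x - 2.011


def fit_power_law(qs, cvs):
    """Least squares on ln CV = ln a + b * ln Q; returns (a, b)."""
    lx = [math.log(q) for q in qs]
    ly = [math.log(v) for v in cvs]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = math.exp(my - b * mx)
    return a, b


# Site areas spanning the range quoted in the text (km2); the CV values
# are synthetic, generated from equation (17), not measured data.
areas = [1.56e-6, 1e-3, 1.0, 100.0, 5400.0]
cvs = [cv_concentration(q) for q in areas]

a, b = fit_power_law(areas, cvs)
print(f"recovered a = {a:.3f}, b = {b:.3f}")
for q in areas:
    print(f"Q = {q:g} km2: CV(density) = {cv_density(q):.3f}, "
          f"CV(concentration) = {cv_concentration(q):.3f}")
print(f"g1 at x = 2: {skewness(2.0):.3f}")
```

Refitting with and without the point for territory F, using real values from Table 1 in place of the synthetic ones, would be one way to repeat the comparison described above.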

Fig. 8. Coefficient of skewness g1 against quantity x (equation (8)).


5. Conclusions

The results of the analysis of datasets on radioactive contamination of soil with different radionuclides (90Sr, 134Cs, 137Cs, 238Pu, 239+240Pu, 241Am) on sites with surface area from 1.56 m² to 5.4·10¹⁰ m² agree with the general finding of Daniels and Higgins (2002) that the assumption of lognormality is an idealization. It was shown that, for evaluating the size of the hot spots' surface area, it is reasonable to use solely the empirical distribution of radionuclides. The degree of soil contamination nonuniformity for the Chernobyl and Fukushima fallouts is approximately the same. For the Chernobyl fallout the coefficient of variation, CV, slowly rises with increasing site area, Q, in accordance with the regression equation CV = a·Q^b (0 < b