Computers in Biology and Medicine 49 (2014) 67–73


An optimized Nash nonlinear grey Bernoulli model based on particle swarm optimization and its application in prediction for the incidence of Hepatitis B in Xinjiang, China

Liping Zhang a,b, Yanling Zheng a,b, Kai Wang b, Xueliang Zhang b, Yujian Zheng a,*

a School of Public Health, Xinjiang Medical University, Urumqi 830011, People's Republic of China
b Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830011, People's Republic of China

Article info

Article history: Received 15 November 2013; Accepted 12 February 2014

Keywords: Nonlinear grey Bernoulli model; Particle swarm optimization; Grey model; Hepatitis B

Abstract

In this paper, by using a particle swarm optimization algorithm to solve the optimal parameter estimation problem, an improved Nash nonlinear grey Bernoulli model termed PSO–NNGBM(1,1) is proposed. To test the forecasting performance, the optimized model is applied for forecasting the incidence of hepatitis B in Xinjiang, China. Four models, traditional GM(1,1), grey Verhulst model (GVM), original nonlinear grey Bernoulli model (NGBM(1,1)) and Holt–Winters exponential smoothing method, are also established for comparison with the proposed model under the criteria of mean absolute percentage error and root mean square percent error. The prediction results show that the optimized NNGBM(1,1) model is more accurate and performs better than the traditional GM(1,1), GVM, NGBM(1,1) and Holt–Winters exponential smoothing method.

© 2014 Published by Elsevier Ltd.

* Corresponding author. Tel.: +86 991 4362456; fax: +86 991 4365561.
E-mail addresses: [email protected] (L. Zhang), [email protected] (Y. Zheng).
http://dx.doi.org/10.1016/j.compbiomed.2014.02.008
0010-4825 © 2014 Published by Elsevier Ltd.

1. Introduction

Hepatitis B (HB) is one of the major diseases of mankind and a serious global public health problem. According to data from the World Health Organization (WHO), about 2 billion people have been infected with the hepatitis B virus (HBV), and over 350 million are chronic HBV carriers, who account for 5% of the world's population. Every year there are over 4 million acute clinical cases of HBV, and about 25% of carriers. About 1 million people die from chronic active hepatitis, cirrhosis or primary liver cancer annually [1]. China is a region with a high prevalence of HBV infection. An estimated 93 million people have been infected with HBV. HBV infection is not only a health problem but is also becoming a social problem [2].

In the Xinjiang Uygur Autonomous Region, HBV infection is a major public health problem. Since 2007, more than 60,000 new cases have been reported per year, of which more than 70% were HB patients. Data from the Xinjiang Center for Disease Control and Prevention (Xinjiang CDC) show that a total of 348,103 new cases of HB have been found since 2004, and the average annual incidence has reached 186 per 100,000. The HB incidence increased year by year, rising from 101 per 100,000 in 2004 to 210 per 100,000 in 2012. Accurately forecasting the epidemic tendency of HB can provide a theoretical basis for the prevention and control of the disease.

Statistical methods are the most commonly used methods for forecasting epidemic diseases. In traditional forecasting approaches, the forecasters are built on the assumption that the structure of the system to be forecasted is known. However, because of the limitation of information and knowledge, only part of the system structure can be fully known. The grey system theory was first proposed by Deng [3,4], mainly for systems with incomplete or uncertain information, to construct grey forecasting and decision-making models. As an advantage over conventional statistical models, grey models require only a limited amount of data to estimate the behavior of unknown systems. In recent years, the grey system theory has been successfully applied in various fields and has demonstrated satisfactory results [5–15].

Because it is constructed from an exponential function, the predictive ability of the traditional grey forecasting model GM(1,1) can still be improved despite its wide use. The nonlinear grey Bernoulli model NGBM(1,1) is a recently developed grey forecasting model proposed by Chen [16,17], which is a simple modification of GM(1,1) combined with the Bernoulli differential equation. It has a power exponent that can effectively manifest the nonlinear characteristics of real systems and flexibly determine the shape of the model's curve. By introducing an interpolated coefficient into the background value of NGBM(1,1), Chen et al. [18] proposed a Nash NGBM(1,1) model (NNGBM(1,1)), which strengthens the adaptability of the model towards the original data and eventually improves the accuracy of the NGBM model.

In this paper, by using the particle swarm optimization (PSO) algorithm to estimate the parameters of NNGBM(1,1), we propose an optimized NNGBM(1,1) model abbreviated as PSO–NNGBM(1,1). To evaluate the performance of the proposed model, we apply it to forecast the incidence of HB with nonlinear small sample characteristics in Xinjiang, China.

The remainder of this paper is organized as follows. A brief introduction to the PSO algorithm, the optimized NNGBM(1,1) model based on the PSO algorithm and the Holt–Winters additive model are presented and discussed in Section 2. In Section 3, a numerical example of fluctuating data and an empirical analysis on the incidence of HB are adopted to verify the feasibility and effectiveness of the optimized NNGBM(1,1) model proposed here. Finally, conclusions are made in Section 4.

2. Methods

2.1. The PSO algorithm

The PSO is an evolutionary computation technique developed by Kennedy and Eberhart based on the social behavior metaphor [19]. In PSO language, the entire collection of agents is considered as a swarm, each individual in the swarm is referred to as a particle and each particle is assigned a randomized velocity and is iteratively moved through the problem space. It is attracted towards the location of the best fitness achieved so far by the particle itself and by the location of the best fitness achieved so far across the whole population (global version of the algorithm). A complete theoretical analysis of the algorithm has been given by Clerc and Kennedy [20].

A particle represents a point in a D-dimensional space, and its status is characterized through its position and velocity. The ith particle is denoted as $X_i^k = (x_{i1}^k, x_{i2}^k, \ldots, x_{iD}^k)$. The velocity of this particle can be represented by $V_i^k = (v_{i1}^k, v_{i2}^k, \ldots, v_{iD}^k)$. The best previous position (the position giving the best fitness value) of the ith particle until iteration k is represented as $PB_i^k = (p_{i1}^k, p_{i2}^k, \ldots, p_{iD}^k)$. The best position in the entire swarm is denoted as $GB^k = (g_1^k, g_2^k, \ldots, g_D^k)$. The particles are manipulated according to the following equations:

$$v_{id}^{k} = w\, v_{id}^{k-1} + c_1 r_1 \left(p_{id}^{k-1} - x_{id}^{k-1}\right) + c_2 r_2 \left(g_{d}^{k-1} - x_{id}^{k-1}\right), \tag{1}$$

$$x_{id}^{k} = x_{id}^{k-1} + v_{id}^{k}, \quad d = 1, 2, \ldots, D, \tag{2}$$

where w is the inertia weight, c1 and c2 are two positive constant parameters called acceleration factors which control the maximum step size, and r1 and r2 are two random numbers in the range [0,1]. A large inertia weight facilitates a global search while a small inertia weight facilitates a local search. Eq. (1) shows that the new velocity is updated according to its previous velocity and the distance from its current position to both its best historical position and the global best position of the entire swarm. Then the particle moves toward a new position according to Eq. (2). The process is repeated until the stopping criterion is reached.

The process for implementing the global version of PSO is summarized as follows [21,22]:

Step 1: Initialize a population (array) of particles with random positions and velocities on D dimensions in the problem space.
Step 2: For each particle, evaluate the desired optimization fitness function in D variables.
Step 3: Compare the particle's fitness evaluation with the particle's previous best value (pbest). If the current value is better than pbest, then set the pbest value equal to the current value, and the pbest location equal to the current location in D-dimensional space.
Step 4: Compare the fitness evaluation with the population's overall previous best (gbest). If the current value is better than gbest, then reset gbest to the current particle's array index and value.
Step 5: Change the velocity and position of the particle according to Eqs. (1) and (2), respectively.
Step 6: Loop to Step 2 until a criterion is met, usually a sufficiently good fitness or a maximum number of iterations (generations).

The PSO can be used to solve many of the same kinds of problems as genetic algorithms (GA) [23,24]. It has been shown to be effective in optimizing difficult multidimensional discontinuous problems in a variety of fields and, in certain instances, to outperform other methods of optimization like GA [19].

2.2. PSO based Nash nonlinear grey Bernoulli model (PSO–NNGBM)

Assume that $X^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ is the original data sequence with n entries. Its 1-AGO (accumulated generating operator) sequence $X^{(1)}$ is

$$X^{(1)} = (x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)),$$

where

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n. \tag{3}$$

The generated mean sequence $Z^{(1)}$ of $X^{(1)}$ can be evaluated as follows:

$$Z^{(1)} = (z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)).$$

The grey differential equation of NGBM(1,1) is defined as

$$x^{(0)}(k) + a z^{(1)}(k) = b \left(z^{(1)}(k)\right)^{\gamma}, \tag{4}$$

and its whitenization differential equation is given by

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b \left(x^{(1)}\right)^{\gamma}. \tag{5}$$

Here $z^{(1)}(k) = \alpha x^{(1)}(k) + (1 - \alpha) x^{(1)}(k-1)$ is referred to as the background value of the grey derivative, and α is a production coefficient of the background value in the range [0,1], which is traditionally set to 0.5 [25]. When α is an undetermined value in the interval [0,1], the model is called the Nash nonlinear grey Bernoulli model (NNGBM(1,1)) [18,26].

When α = 0.5 and γ = 0, Eq. (4) reduces to the traditional GM(1,1) model,

$$x^{(0)}(k) + a z^{(1)}(k) = b. \tag{6}$$

When α = 0.5 and γ = 2, Eq. (4) reduces to the GVM model,

$$x^{(0)}(k) + a z^{(1)}(k) = b \left(z^{(1)}(k)\right)^{2}. \tag{7}$$
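As a small worked illustration of the accumulation in Eq. (3) and of the background value, the sketch below applies both to the fluctuating sequence (1, 2, 1.5, 3) that is used later in Section 3.1. The function names `ago` and `background` are hypothetical and chosen only for this example.

```python
from itertools import accumulate

def ago(x0):
    """1-AGO, Eq. (3): running sums of the original sequence."""
    return list(accumulate(x0))

def background(x1, alpha=0.5):
    """Background values z(k) = alpha*x1(k) + (1 - alpha)*x1(k-1), k = 2..n."""
    return [alpha * x1[k] + (1 - alpha) * x1[k - 1] for k in range(1, len(x1))]

x0 = [1, 2, 1.5, 3]            # original sequence X(0)
x1 = ago(x0)                   # [1, 3, 4.5, 7.5]
z1 = background(x1, 0.5)       # [2.0, 3.75, 6.0]
print(x1, z1)
```

With α = 0.5 each background value is simply the midpoint of two neighbouring accumulated values; other values of α shift it towards one of the two neighbours.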

According to the original grey forecasting model, the generating coefficient α is usually given as 0.5. However, a fixed α value is not the optimal selection for some real series [24]. If α is adjusted according to the characteristics of the original series, it will provide a better prediction performance. In order to give an optimal selection for the parameters α and γ, we take PSO as a parameter search technique to find near-optimal solutions and thereby establish the PSO–NNGBM. Define the fitness of each particle as

$$\min f(\alpha, \gamma) = \frac{1}{n} \sum_{k=1}^{n} \left| \frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)} \right| \times 100\%. \tag{8}$$

In the current work, the decision variables are α and γ. Let $X_i = (\alpha_i, \gamma_i)$ be the ith particle vector, where the parameters are random real numbers restricted to an appropriate range, with $\gamma_i \neq 1$ and $\alpha_i \in [0, 1]$.
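The global-best PSO procedure of Section 2.1 (Steps 1 to 6 with Eqs. (1) and (2)) can be sketched as below. This is a minimal illustration and not the Matlab toolbox used in the paper: the function name `pso_minimize`, the linearly decreasing inertia weight, the velocity clamp and the clipping of positions to the search box are all assumptions made for this example, and the toy objective stands in for the MAPE fitness of Eq. (8).

```python
import random

def pso_minimize(f, bounds, n_particles=24, n_iter=200,
                 c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Global-best PSO minimizing f over box bounds [(lo, hi), ...]."""
    rng = random.Random(seed)
    # Step 1: random positions and velocities inside the search box.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[rng.uniform(lo - hi, hi - lo) for lo, hi in bounds]
           for _ in range(n_particles)]
    # Steps 2-4 for the initial swarm: evaluate, then set pbest and gbest.
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    best = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[best][:], pbest_val[best]

    for it in range(n_iter):
        # Inertia weight decreasing linearly from w_start to w_end (assumption).
        w = w_start + (w_end - w_start) * it / max(1, n_iter - 1)
        # Step 5: move every particle using the bests of the previous iteration.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Eq. (1): inertia + pull towards pbest + pull towards gbest.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(lo - hi, min(hi - lo, v))      # velocity clamp (assumption)
                vel[i][d] = v
                # Eq. (2), with the position clipped to the box (assumption).
                pos[i][d] = max(lo, min(hi, pos[i][d] + v))
        # Steps 2-4: re-evaluate and update personal and global bests.
        for i in range(n_particles):
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    # Step 6: here the loop simply stops after n_iter iterations.
    return gbest, gbest_val

# Toy objective with its minimum at (0.3, -0.2), used only to exercise the code.
best, value = pso_minimize(lambda p: (p[0] - 0.3) ** 2 + (p[1] + 0.2) ** 2,
                           bounds=[(-1.0, 1.0), (-1.0, 1.0)])
print(best, value)
```

For the model in this section, `f` would take a particle (α, γ) and return the in-sample MAPE of the corresponding NNGBM(1,1) fit, with bounds chosen so that α stays in [0, 1] and γ avoids 1.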


With the calibrated $\hat{\alpha}$, the background values are $z^{(1)}(k) = \hat{\alpha} x^{(1)}(k) + (1 - \hat{\alpha}) x^{(1)}(k-1)$, $k = 2, 3, \ldots, n$. The structure parameters a and b can be estimated by the least squares method:

$$\hat{A} = [a, b]^{T} = (B^{T} B)^{-1} B^{T} Y, \tag{9}$$

where

$$Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}, \qquad
B = \begin{bmatrix}
-z^{(1)}(2) & [z^{(1)}(2)]^{\hat{\gamma}} \\
-z^{(1)}(3) & [z^{(1)}(3)]^{\hat{\gamma}} \\
\vdots & \vdots \\
-z^{(1)}(n) & [z^{(1)}(n)]^{\hat{\gamma}}
\end{bmatrix}. \tag{10}$$

Setting the initial value $\hat{x}^{(1)}(1) = x^{(0)}(1)$, we obtain the time response function

$$\hat{x}^{(1)}(k+1) = \left[ \left( x^{(0)}(1)^{(1-\hat{\gamma})} - \frac{b}{a} \right) e^{-a(1-\hat{\gamma})k} + \frac{b}{a} \right]^{1/(1-\hat{\gamma})}, \quad k = 1, 2, \ldots, n; \ \gamma \neq 1. \tag{11}$$

Applying the first-order inverse accumulated generation operation (1-IAGO) to $\hat{x}^{(1)}(k+1)$, we obtain the simulation and forecasting function of $X^{(0)}$ as

$$\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1), \quad k = 2, 3, \ldots \tag{12}$$
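For fixed α and γ, the chain from Eq. (3) to Eq. (12) can be sketched in plain Python as follows. The function names (`nngbm_fit`, `nngbm_predict`, `mape_fitness`) are hypothetical; the 2×2 normal equations of Eq. (9) are solved by hand to avoid external libraries, and the fitness averages over the fitted points k = 2, …, n. The sketch assumes that a is nonzero and that the bracket in Eq. (11) stays positive, since fractional powers of negative numbers are not handled.

```python
import math

def nngbm_fit(x0, alpha, gamma):
    """Estimate (a, b) by least squares, Eqs. (9)-(10), for fixed alpha, gamma."""
    x1, total = [], 0.0
    for value in x0:                       # 1-AGO, Eq. (3)
        total += value
        x1.append(total)
    z = [alpha * x1[k] + (1 - alpha) * x1[k - 1] for k in range(1, len(x1))]
    u = [-zk for zk in z]                  # first column of B
    v = [zk ** gamma for zk in z]          # second column of B
    y = x0[1:]                             # vector Y
    suu = sum(p * p for p in u)
    suv = sum(p * q for p, q in zip(u, v))
    svv = sum(q * q for q in v)
    suy = sum(p * r for p, r in zip(u, y))
    svy = sum(q * r for q, r in zip(v, y))
    det = suu * svv - suv * suv            # determinant of B^T B
    a = (svv * suy - suv * svy) / det
    b = (suu * svy - suv * suy) / det
    return a, b

def nngbm_predict(x0_first, a, b, gamma, steps):
    """Time response, Eq. (11), followed by 1-IAGO, Eq. (12)."""
    p = 1.0 - gamma
    x1_hat = [((x0_first ** p - b / a) * math.exp(-a * p * k) + b / a) ** (1.0 / p)
              for k in range(steps)]
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]

def mape_fitness(x0, alpha, gamma):
    """In-sample MAPE (%) over k = 2..n, usable as a PSO fitness."""
    a, b = nngbm_fit(x0, alpha, gamma)
    fitted = nngbm_predict(x0[0], a, b, gamma, len(x0))
    errors = [abs(x - f) / x for x, f in zip(x0[1:], fitted[1:])]
    return 100.0 * sum(errors) / len(errors)

# Sanity check on synthetic data: with alpha = 0.5 and gamma = 0 the model is
# GM(1,1), which should follow a geometric sequence very closely.
series = [10 * 1.1 ** k for k in range(6)]
a, b = nngbm_fit(series, 0.5, 0.0)
print(a, b, mape_fitness(series, 0.5, 0.0))
```

Passing `lambda p: mape_fitness(data, p[0], p[1])` to a PSO routine would then search over (α, γ) in the spirit of Eq. (8); forecasts beyond the sample are obtained by calling `nngbm_predict` with `steps` larger than the number of observations.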

2.3. Holt–Winters additive model

When there are only a few observations on which to base the forecast, another common and effective method is exponential smoothing method. After applying various exponential smoothing methods and comparing their mean square errors, it is observed that the Holt–Winters additive exponential smoothing method is the best one, giving the best fit model of the observed data [27]. In order to make a comparison and assess the strength of the forecast model PSO–NNGBM(1,1), a Holt–Winters exponential smoothing model is also implemented in this paper. A time series can be decomposed by Holt–Winters additive exponential smoothing into three main components, level (L), trend (T) and season (S) [28]:

$$L_t = \alpha (X_t - S_{t-s}) + (1 - \alpha)(L_{t-1} + T_{t-1}), \tag{13}$$

$$T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}, \tag{14}$$

$$S_t = \gamma (X_t - L_t) + (1 - \gamma) S_{t-s}. \tag{15}$$

Here α, β and γ are the damping factors, which vary between 0 and 1. The prediction equation is

$$\hat{X}_t(h) = L_t + h\, T_t + S_{t+h-rs}. \tag{16}$$
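The recursions in Eqs. (13) to (15) and the prediction equation (16) can be sketched as below. The paper does not state how the level, trend and seasonal terms were initialized, so the initial state is passed in explicitly; the function names and the synthetic series are assumptions made for illustration, and the forecast simply reuses the most recent seasonal estimate for the matching phase.

```python
def holt_winters_additive(x, s, alpha, beta, gamma, level0, trend0, season0):
    """Run Eqs. (13)-(15) over x; return the final state and one-step fits."""
    level, trend = level0, trend0
    season = list(season0)            # season[t % s] holds S_{t-s} before the update
    fitted = []
    for t, xt in enumerate(x):
        s_prev = season[t % s]
        fitted.append(level + trend + s_prev)   # forecast made one step earlier
        new_level = alpha * (xt - s_prev) + (1 - alpha) * (level + trend)   # Eq. (13)
        trend = beta * (new_level - level) + (1 - beta) * trend            # Eq. (14)
        season[t % s] = gamma * (xt - new_level) + (1 - gamma) * s_prev    # Eq. (15)
        level = new_level
    return level, trend, season, fitted

def forecast(level, trend, season, n_obs, s, h):
    """Eq. (16): level + h * trend + latest seasonal term for the target phase."""
    return level + h * trend + season[(n_obs + h - 1) % s]

# Synthetic series: linear trend plus a period-2 seasonal pattern.
pattern = [1.0, -1.0]
x = [10 + 2 * t + pattern[t % 2] for t in range(6)]
level, trend, season, fitted = holt_winters_additive(
    x, s=2, alpha=0.7, beta=0.02, gamma=0.1,
    level0=8.0, trend0=2.0, season0=pattern)
next_value = forecast(level, trend, season, n_obs=len(x), s=2, h=1)
print(fitted, next_value)
```

Because the synthetic series matches the assumed structure exactly and the initial state is correct, the recursion tracks it without error; on real data the smoothing factors would be chosen by minimizing the squared fitting errors.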

All the aforementioned parameters, reported in Table 4, are estimated by minimizing the sum of squared errors [11].

2.4. Measure of forecasting performance

Prediction accuracy is an important criterion for evaluating forecasting validity. For this reason, an error analysis based on two statistical measures, i.e. the mean absolute percentage error (MAPE) and the root mean square percentage error (RMSPE), is employed to estimate model performance and reliability. The MAPE and the RMSPE are defined as

$$\mathrm{MAPE} = \frac{1}{n} \sum_{k=2}^{n} \left| \frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)} \right| \times 100\%, \tag{17}$$

$$\mathrm{RMSPE} = \sqrt{ \frac{ \sum_{k=2}^{n} \left[ \left( x^{(0)}(k) - \hat{x}^{(0)}(k) \right) / x^{(0)}(k) \right]^{2} }{ n - 1 } } \times 100\%, \tag{18}$$

where $x^{(0)}(k)$ is the actual value at time k, $\hat{x}^{(0)}(k)$ is its fitting value and n is the number of data used for prediction. The criteria of MAPE and RMSPE are shown in Table 1 [29,30].

Table 1
Criteria of MAPE and RMSPE.

MAPE and RMSPE (%)    Forecasting power
<10                   Highly accurate forecasting
10–20                 Good forecasting
20–50                 Reasonable forecasting
>50                   Inaccurate forecasting
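The two error measures and the Table 1 classification can be sketched as follows. Both functions average over the points that are actually compared; with the PSO–NNGBM column of Table 2 (fitted values for k = 2, 3, 4) this gives approximately 10.26% and 15.73%, the figures reported there. The function names and the treatment of the class boundaries (10, 20 and 50 are assigned to the lower class) are assumptions made for this example.

```python
import math

def mape(actual, fitted):
    """Mean absolute percentage error (%) over the compared points."""
    errors = [abs(a - f) / a for a, f in zip(actual, fitted)]
    return 100.0 * sum(errors) / len(errors)

def rmspe(actual, fitted):
    """Root mean square percentage error (%) over the compared points."""
    squares = [((a - f) / a) ** 2 for a, f in zip(actual, fitted)]
    return 100.0 * math.sqrt(sum(squares) / len(squares))

def forecasting_power(error_pct):
    """Classify an error level according to Table 1."""
    if error_pct < 10:
        return "Highly accurate forecasting"
    if error_pct <= 20:
        return "Good forecasting"
    if error_pct <= 50:
        return "Reasonable forecasting"
    return "Inaccurate forecasting"

# PSO-NNGBM column of Table 2, for k = 2, 3, 4 of the sequence (1, 2, 1.5, 3).
actual = [2, 1.5, 3]
fitted = [1.9999, 1.9046, 3.1139]
m, r = mape(actual, fitted), rmspe(actual, fitted)
print(round(m, 2), round(r, 2), forecasting_power(m))
```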

3. Numerical illustrations

3.1. Validation of the PSO–NNGBM

Chen et al. [18] took the randomly fluctuating sequence $X^{(0)} = (1, 2, 1.5, 3)$ as a numerical example to demonstrate the precision and effectiveness of NNGBM. In this section, we also adopt this example to compare the forecasting accuracy of the PSO–NNGBM(1,1) with that of the NNGBM(1,1) and the traditional NGBM(1,1). Forecasting results are shown in Table 2 and Fig. 1. Table 2 reveals that the NNGBM(1,1) with α = 0.54 and γ = −1.7 has a higher accuracy in terms of MAPE than the traditional NGBM(1,1) with α = 0.5 and γ = −1.4867. The PSO–NNGBM(1,1) with α = 0.5856 and γ = −59.5180 holds the minimum MAPE/RMSPE of 10.26%/15.73%. Fig. 1 shows that the traditional NGBM(1,1) and the NNGBM(1,1) have similar forecasting results, while the fitting effect of the PSO–NNGBM model on the same sequence is better than that of the previous two models.

Table 2
Forecasting results from the NGBM models based on the fluctuation sequence.

Original value    NGBM (α = 0.5, γ = −1.4867)    NNGBM (α = 0.54, γ = −1.7)    PSO–NNGBM (α = 0.5856, γ = −59.5180)
1                 –                              –                             –
2                 2.0002                         2.0110                        1.9999
1.5               2.0687                         2.0744                        1.9046
3                 2.9122                         2.9909                        3.1139
MAPE (%)          13.62                          13.05                         10.26
RMSPE (%)         21.95                          22.11                         15.73

Fig. 1. Curves of original values and fitting values corresponding to different NGBM models. (Series plotted: original value, NGBM, NNGBM, PSO–NNGBM.)

3.2. Empirical results

In this section, the PSO–NNGBM(1,1) is used to forecast the incidence of HB in Xinjiang, China.

3.2.1. Data

Epidemiological incidence data were obtained from the report of Xinjiang CDC. The raw data are shown in Table 5, which demonstrates that there are some nonlinear fluctuations in the original data.

3.2.2. Simulation analysis

As the pathogenesis of HB is not entirely clear, modeling based on pathogenesis is difficult. In this section, we focus on establishing forecasting models with high predicting accuracy for the incidence of HB in Xinjiang based on data characteristics. Five prediction models, the original GM(1,1) model, the GVM model, the Holt–Winters additive model, the original NGBM(1,1) model and the PSO–NNGBM(1,1), are established for comparison.

To verify the effectiveness of the PSO–NNGBM(1,1), data from January 2012 to August 2012 are regarded as the verifying period, and September 2012 to December 2012 are reserved for ex post testing. Regarding the setting of the PSO parameters, a number of articles have already established the concept and setting values [22,31]. In this study, the default values selected for the parameters of the PSO are listed in Table 3. The software Matlab 7.1 is used in this study to find the optimal values of the generating coefficient $\hat{\alpha}$ and the power exponent $\hat{\gamma}$ that correspond to the smallest in-sample forecasting error. The evolution of fitness of the optimized NNGBM(1,1) is shown in Fig. 2. By taking the settings in Table 3, we obtain the minimum MAPE (4.59%) and the optimal parameter estimates ($\hat{\alpha} = 0.5340$, $\hat{\gamma} = -0.2967$). As can be seen from Fig. 2, the value of MAPE converges very fast to a stationary value, which demonstrates that PSO is an effective global optimization algorithm suitable for the parameter optimization of NNGBM(1,1).

For the sake of convenience, the detailed calculation and modeling process are omitted here. All the parameter estimation results of the five models are listed in Table 4. Only the actual values, the fitting and forecasting results and the corresponding errors are shown in Table 5. The fitting curves are illustrated in Fig. 3. Fig. 3 shows that the traditional forecasting model GM(1,1) and the GVM model cannot ideally catch the nonlinear characteristic of the incidence of HB, as the original data are highly nonlinear and nonstationary. For predictions involving nonlinear small sample time series, the performance of PSO–NNGBM(1,1) is better than that of the traditional grey forecasting models and the exponential smoothing method.

The model fitting results indicate that the traditional NGBM(1,1) and the PSO–NNGBM(1,1) have similar results, while the fitting and forecasting effect of the proposed optimized model on the same sequence is better than that of the original GM(1,1) model and the GVM model. This reflects that the influence of γ on forecasting accuracy is much higher than that of α. Clearly, the NGBM(1,1) model achieves a significant improvement upon the traditional GM(1,1). The Holt–Winters exponential smoothing method also has a low fitting error, presenting a high degree of fitting accuracy with a MAPE of less than 5% and an RMSPE of less than 10%.

Table 3
Default PSO parameter values in Matlab.

PSO parameter                                            Default value
Epochs between updating display                          100
Maximum number of iterations (epochs) to train           2000
Population size                                          24
c1                                                       2
c2                                                       2
Initial inertia weight                                   0.9
Final inertia weight                                     0.4
Epoch when inertial weight at final value                1500
Minimum global error gradient                            10^−25
Epochs before error gradient criterion terminates run    250
Error goal                                               NaN (unconstrained minimum)
Type flag                                                0 (common PSO)
PSO seed                                                 0 (for initial positions all random)

Fig. 2. Evolution of fitness of optimized PSO–NNGBM(1,1) model. (Plot annotations: fitness (MAPE) against epoch; 0.0458983 = MAPE([−0.296679, 0.533954]); PSO model: common PSO; dimensions: 2; number of particles: 24; minimize to: unconstrained; function: MAPE; inertia weight: 0.47305. The inset shows current positions, personal bests and the global best over Dimension 1 and Dimension 2.)

Table 4
The parameter evaluation of the fitting models.

Model             Parameter value
Holt–Winters      α = 0.7; β = 0.02; γ = 0.1
GM(1,1)           α = 0.5; γ = 0
GVM(1,1)          α = 0.5; γ = 2
NGBM(1,1)         α = 0.5; γ = −0.3039
PSO–NNGBM(1,1)    α = 0.5340; γ = −0.2967

Comparing the out-of-sample forecasting results of the above-mentioned five models from September 2012 to December 2012, the optimized NNGBM(1,1) model yields the lowest MAPE (10.97%) and RMSPE (16.00%). Note that the value for October is significantly lower than normal, so we can treat it as an outlier. Eliminating the influence of the outlier, the MAPE of the PSO–NNGBM(1,1) is less than 4.45% and the RMSPE is less than 5.51%, which can meet practical demands. The simulation results confirm that the NGBM(1,1) model combined with optimal parameters found by the PSO algorithm can reduce the forecasting error effectively.

In order to further demonstrate the superiority of the optimized model in this application, we apply the PSO–NNGBM model to analyze the annual incidence of HB from 2009 to 2012 in Xinjiang, which is a random fluctuation sequence

$$X^{(0)} = (211.9065, 186.6180, 209.1230, 209.6002).$$

The simulation results are illustrated in Table 6 and the fitting curves are shown in Fig. 4.
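The out-of-sample figures quoted for the PSO–NNGBM(1,1), with and without the October outlier, can be checked from the September to December rows of Table 5. The sketch below is only a verification aid; the helper name `pct_errors` is hypothetical.

```python
import math

def pct_errors(actual, forecast):
    """Return (MAPE, RMSPE) in percent over the compared points."""
    rel = [(a - f) / a for a, f in zip(actual, forecast)]
    mape = 100.0 * sum(abs(e) for e in rel) / len(rel)
    rmspe = 100.0 * math.sqrt(sum(e * e for e in rel) / len(rel))
    return mape, rmspe

months = ["Sept.", "Oct.", "Nov.", "Dec."]
actual = [16.2818, 13.3160, 19.0779, 17.8907]        # Table 5, original values
forecast = [17.2054, 17.3849, 17.6156, 17.8897]      # Table 5, PSO-NNGBM

mape_all, rmspe_all = pct_errors(actual, forecast)

# Drop October, which the text treats as an outlier.
keep = [i for i, m in enumerate(months) if m != "Oct."]
mape_wo, rmspe_wo = pct_errors([actual[i] for i in keep],
                               [forecast[i] for i in keep])

print(f"all four months: MAPE {mape_all:.2f}%, RMSPE {rmspe_all:.2f}%")
print(f"without October: MAPE {mape_wo:.2f}%, RMSPE {rmspe_wo:.2f}%")
```

The October point alone contributes an absolute percentage error of about 30%, which is why removing it lowers the MAPE from roughly 11% to under 4.5%.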

Table 5
Forecasting performance evaluation and comparison.

Month                    Original value   GM(1,1)   GVM       Holt–Winters   NGBM      PSO–NNGBM
Jan.                     16.2818          –         –         –              –         –
Feb.                     21.4523          20.0801   8.8304    21.4523        21.5794   21.4523
Mar.                     20.1184          19.3817   12.4467   20.1184        18.9330   18.8870
Apr.                     15.5942          18.7077   16.3160   15.5942        17.8795   17.8600
May.                     18.3216          18.0571   19.4678   18.3216        17.3668   17.3594
Jun.                     16.5935          17.4291   20.7825   16.2818        17.1267   17.1250
Jul.                     17.5836          16.8229   19.7094   21.6749        17.0517   17.0520
Aug.                     17.0885          16.2379   16.7016   17.4242        17.0885   17.0885
MAPE (%) (Jan.–Aug.)                      6.54      21.06     3.39           4.66      4.59
RMSPE (%) (Jan.–Aug.)                     8.65      28.70     8.28           6.51      6.50
Sept.                    16.2818          15.6731   12.8548   12.6074        17.2074   17.2054
Oct.                     13.3160 (a)      15.1281   9.1806    15.2772        17.3902   17.3849
Nov.                     19.0779          14.6019   6.2117    13.1799        17.6252   17.6156
Dec.                     17.8907          14.0941   4.0508    18.2929        17.9047   17.8897
MAPE (%) (Sept.–Dec.)                     15.51     49.23     17.61          10.99     10.97
RMSPE (%) (Sept.–Dec.)                    17.32     54.63     20.54          16.02     16.00

(a) Outlier.

Fig. 3. The comparison of the original and forecast values for five models. (Four panels plot the incidence of HB, unit: per 10^5, against month, 1 to 12; each panel shows the original values and PSO–NNGBM together with one of GM(1,1), GVM, Holt–Winters and NGBM.)


Table 6
Simulation results by Holt–Winters, NGBM(1,1) and PSO–NNGBM(1,1).

Year        Original value   Holt–Winters   NGBM (α = 0.5, γ = 0.6295)   PSO–NNGBM (α = 0.5651, γ = 0.7331)
2009        211.9065         –              –                            –
2010        186.6180         210.5148       185.2974                     186.6175
2011        209.1230         209.1230       207.1121                     209.1232
2012        209.6002         207.7312       209.6002                     208.1432
2013                         206.3395       199.4838                     192.1614
2014                         204.9477       182.2370                     168.6815
2015                         203.5560       161.7687                     142.9614
MAPE (%)                     4.57           0.56                         0.23
RMSPE (%)                    7.41           0.69                         0.40

Fig. 4. The comparison of the original and forecast values for three models. (Incidence of HB, unit: per 10^5, plotted against year; series: original value, Holt–Winters, NGBM, PSO–NNGBM.)

Table 6 shows that, with the lowest MAPE and RMSPE, the forecasting accuracy of the optimized NNGBM(1,1) is the best, and therefore the proposed optimized method is effective for use in the above application. Because no additional data have been published by Xinjiang CDC at present, the validation of the forecasting results will be completed after more data are released by Xinjiang CDC. Compared with the traditional NNGBM, which solves for the parameters based on the Nash equilibrium concept [18], the optimized model PSO–NNGBM is not only easy to understand but also simple to calculate.

All the results show that the PSO–NNGBM(1,1) is more accurate and performs better than the traditional GM(1,1) model, the GVM model, the NGBM(1,1) model and the exponential smoothing method. Moreover, optimizing the parameters with the PSO algorithm indeed improves the prediction accuracy of the grey model.

4. Discussion and conclusion

HB is among the most important infectious diseases in China. It is also an important public health problem in Xinjiang, China. Studying the transmission, epidemiology and vaccination of HBV is still a popular topic. In this paper, we propose an optimized NNGBM(1,1) model termed PSO–NNGBM(1,1) and apply it to the prediction of the incidence of HB in Xinjiang. By using the PSO technique, we gave an optimal estimation of the interpolated coefficient in the model background value and of the power exponent under the criterion of minimizing the mean absolute percentage error. The empirical results using PSO–NNGBM show the lowest MAPE/RMSPE among all the forecasting models. This success indicates that the PSO–NNGBM(1,1) improves the accuracy of the simulation and forecasting of the original NGBM(1,1) model and that it has a stronger adaptability to the original nonlinear sequence than the exponential smoothing method. Moreover, the success also demonstrates that the PSO is effective and suitable for the parameter optimization of NNGBM(1,1).

Conflict of interest statement

None declared.

Acknowledgments

This research was supported by the National Natural Science Foundation of China [11201399] and Support Academic Discipline Project of XinJiang Medical University-Health Measurements and Health Economics [XYDXK50780308].

References

[1] J.H. Pang, J.A. Cui, X.Y. Zhou, Dynamical behavior of a hepatitis B virus transmission model with vaccination, J. Theor. Biol. 265 (2010) 572–578.
[2] X.P. Zhang, F.H. Wang, Epidemiology and prevention of hepatitis B virus in China, J. Med. Coll. PLA 24 (2009) 301–308.
[3] J.L. Deng, Control problems of grey systems, Syst. Control Lett. 1 (1982) 288–294.
[4] M.S. Yin, Fifteen years of grey system theory research: a historical review and bibliometric analysis, Expert Syst. Appl. 40 (7) (2013) 2767–2775.
[5] J.C. Huang, Application of grey system theory in telecare, Comput. Biol. Med. 41 (2011) 302–306.
[6] L.C. Hsu, Using improved grey forecasting models to forecast the output of opto-electronics industry, Expert Syst. Appl. 38 (2011) 13879–13885.
[7] E. Kayacan, B. Ulutas, O. Kaynak, Grey system theory-based models in time series prediction, Expert Syst. Appl. 37 (2010) 1784–1789.
[8] Y. Peng, M. Dong, A hybrid approach of HMM and grey model for age-dependent health prediction of engineering assets, Expert Syst. Appl. 38 (2011) 12946–12953.
[9] Y.H. Lin, P.C. Lee, Novel high-precision grey forecasting model, Autom. Constr. 16 (2007) 771–777.
[10] C.X. Fan, S.Q. Liu, Wind speed forecasting method: gray related weighted combination with revised parameter, Energy Procedia 5 (2011) 550–554.
[11] V. Bianco, O. Manca, S. Nardini, A.A. Minea, Analysis and forecasting of nonresidential electricity consumption in Romania, Appl. Energy 87 (2010) 3584–3590.
[12] M.L. Lei, Z.R. Feng, A proposed grey model for short-term electricity price forecasting in competitive power markets, Int. J. Electr. Power Energy Syst. 43 (2012) 531–538.
[13] C.S. Lin, F.M. Liou, C.P. Huang, Grey forecasting model for CO2 emissions: a Taiwan study, Appl. Energy 88 (2011) 3816–3820.
[14] Y.T. Hsu, M.C. Liu, J. Yeh, H.F. Hung, Forecasting the turning time of stock market based on Markov–Fourier grey model, Expert Syst. Appl. 36 (2009) 8597–8603.
[15] M. Guo, J. Lan, J. Li, Z. Lin, X. Sun, Traffic flow data recovery algorithm based on gray residual GM(1,n) model, J. Transp. Syst. Eng. Inf. Technol. 12 (2012) 42–47.
[16] C.I. Chen, Application of the novel nonlinear grey Bernoulli model for forecasting unemployment rate, Chaos Solitons Fractals 37 (2008) 278–287.
[17] C.I. Chen, H.L. Chen, S.P. Chen, Forecasting of foreign exchange rates of Taiwan's major trading partners by novel nonlinear Grey Bernoulli model NGBM(1,1), Commun. Nonlinear Sci. Numer. Simul. 13 (2008) 1194–1204.
[18] C.I. Chen, P.H. Hsin, C.S. Wu, Forecasting Taiwan's major stock indices by the Nash nonlinear grey Bernoulli model, Expert Syst. Appl. 37 (2010) 7557–7562.
[19] J. Kennedy, Particle swarm optimization, in: Encyclopedia of Machine Learning, Springer, US, 2010, pp. 760–766.
[20] M. Clerc, J. Kennedy, The particle swarm-explosion, stability, and convergence in a multidimensional complex space, IEEE Trans. Evol. Comput. 6 (2002) 58–73.
[21] R. Eberhart, J. Kennedy, A new optimizer using particle swarm theory, in: Proceedings of the Sixth International Symposium on Micro Machine and Human Science, MHS'95, IEEE, 1995, pp. 39–43.
[22] Y. Shi, et al., Particle swarm optimization: developments, applications and resources, in: Proceedings of the 2001 Congress on Evolutionary Computation, vol. 1, IEEE, 2001, pp. 81–86.
[23] R.C. Eberhart, Y. Shi, Comparison between genetic algorithms and particle swarm optimization, in: Evolutionary Programming VII, Springer, Berlin Heidelberg, 1998, pp. 611–616.
[24] L.C. Hsu, Forecasting the output of integrated circuit industry using genetic algorithm based multivariable grey optimization models, Expert Syst. Appl. 36 (2009) 7898–7903.
[25] Z.X. Wang, K.W. Hipel, Q. Wang, S.W. He, An optimized NGBM(1,1) model for forecasting the qualified discharge rate of industrial wastewater in China, Appl. Math. Model. 35 (2011) 5524–5532.
[26] P.H. Hsin, Forecasting Taiwan's GDP by the novel Nash nonlinear grey Bernoulli model with trembling-hand perfect equilibrium, in: AIP Conference Proceedings, vol. 1557, 2013, p. 224.
[27] J.E. Hanke, A.G. Reitsch, D.W. Wichern, Business Forecasting, Prentice Hall, Upper Saddle River, NJ, 2001.
[28] M. Kolonko, A generalized crossover operation for genetic algorithms, Complex Syst. 9 (1995) 177–192.
[29] S.A. DeLurgio, Forecasting principles and applications, 1998.
[30] C.D. Lewis, Industrial and Business Forecasting Methods: A Practical Guide to Exponential Smoothing and Curve Fitting, Butterworth Scientific, London, 1982.
[31] I.C. Trelea, The particle swarm optimization algorithm: convergence analysis and parameter selection, Inf. Process. Lett. 85 (2003) 317–325.

Liping Zhang was born in Xinjiang, China, in 1980. She is an assistant professor in the College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, China. She received the B.S. degree and M.S. degree in Mathematics from Xinjiang University, Xinjiang, China, in 2003 and 2006, respectively. She is now working toward the Ph.D. degree in Preventive Medicine at Xinjiang Medical University, Urumqi, China. Her current research interests include grey theory, mathematical biology and epidemiology.
