
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 27, NO. 2, FEBRUARY 2016

Recurrent-Neural-Network-Based Multivariable Adaptive Control for a Class of Nonlinear Dynamic Systems With Time-Varying Delay Chih-Lyang Hwang, Senior Member, IEEE, and Chau Jan

Abstract— First, an approximate nonlinear autoregressive moving average (NARMA) model is employed to represent a class of multivariable nonlinear dynamic systems with time-varying delay. It is known that the disadvantages of robust control for the NARMA model are as follows: 1) suitable control parameters for a larger time delay are more sensitive to achieving desirable performance; 2) it only deals with bounded uncertainty; and 3) the nominal NARMA model must be learned in advance. Due to the dynamic feature of the NARMA model, a recurrent neural network (RNN) is applied online to learn it. However, the system performance deteriorates when the larger variations of the system vector functions are learned poorly. In this situation, a simple network is employed to compensate for the upper bound of the residue caused by the linear parameterization of the approximation error of the RNN. An e-modification learning law with a projection for the weight matrix is applied to guarantee its boundedness without persistent excitation. Under suitable conditions, semiglobally ultimately bounded tracking with a bounded estimated weight matrix is obtained by the proposed RNN-based multivariable adaptive control. Finally, simulations are presented to verify the effectiveness and robustness of the proposed control.

Index Terms— Lyapunov stability theory, multivariable sliding-mode control, nonlinear autoregressive moving average (NARMA), recurrent neural network (RNN).

I. INTRODUCTION

IT IS well known that most real-world dynamic systems are (highly) nonlinear. However, the linearization of such systems around the equilibrium states may yield linear models that are mathematically tractable. It is also known that a conventionally designed linear controller may not achieve an adequate performance over a variety of operating regimes, especially when the system is highly nonlinear [1]. Although a linear adaptive control problem with unknown system parameters can deal with this difficult situation, its effectiveness is limited. It is also known that a robust controller design based on a nominal system is not enough to stabilize a system with large uncertainty [2]. There are some

Manuscript received August 18, 2014; revised May 20, 2015; accepted May 31, 2015. Date of publication June 25, 2015; date of current version January 18, 2016. C.-L. Hwang is with the Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan (e-mail: [email protected]). C. Jan is with the Department of Mechanical Engineering, Nan Jeon University of Science and Technology, Tainan 737, Taiwan (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2015.2442437

nonlinear multivariable systems that can be modeled by an interconnected nonlinear matrix gain and a linear dynamic system, e.g., the Wiener model, the Hammerstein model, and the hysteresis model [3], [4]. Furthermore, a nonlinear autoregressive moving average (NARMA) model is a generalized representation of the input–output behavior of a finite-dimensional nonlinear discrete dynamic system. Compared with a nonlinear state-space representation of dynamic systems, the NARMA model does not require a state estimator, is easier to use for system identification, and can represent a wider class of nonlinear dynamic systems with time-varying delay for the controller design [5], [6]. Motivated by these observations, in this paper, a neural controller design for a class of unknown and multivariable NARMA models with time-varying delay is addressed. In the past, most studies used a multilayer neural network (MNN) or a radial basis function neural network (RBFNN), combined with tapped delays for the input, and a backpropagation or gradient training algorithm to deal with the dynamic problem [7]–[9]. On the other hand, recurrent neural networks (RNNs) have important capabilities not found in the MNN or the RBFNN, such as dynamic mapping without the need for tapped delays for the input. Therefore, the RNN (see [10]–[15]) is better suited to dynamic systems than the MNN or the RBFNN. An RNN can cope with a time-varying input or output through its own natural temporal operation. Hence, the RNN can approximate a dynamic mapping to the desired accuracy with fewer neurons. Recently, a comprehensive review of the research on the (delay-dependent) stability of continuous-time RNNs, including Hopfield neural networks, Cohen–Grossberg neural networks, and related models, was given in [16]. In addition, many papers discuss RNN-based adaptive control or stability analysis for nonlinear dynamic systems (with time-varying delay) [17]–[27].
For more details about their advantages and disadvantages, one can refer to these papers. Since the unknown system functions are learned by the RNN, the variations of the time delay do not adversely destroy the stability of the closed-loop system. However, the system performance deteriorates when the larger fluctuations of the system functions are learned poorly. Since the RNN possesses nonlinear connection weights, the stability of the RNN-based adaptive control system is difficult to verify and analyze. Hence, a linearly parameterized

2162-237X © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

HWANG AND JAN: RNNBMAC FOR A CLASS OF NONLINEAR DYNAMIC SYSTEMS

approximation error of the RNN is required for the controller design. To compensate for the residue caused by the linearized parameterization of the approximation error, a simple and extra network is established to estimate its upper bound. It is called the RNN with residue compensation [28]. In addition, an e-modification learning law with the projection for the weight matrix is designed to guarantee its boundedness and the stability of the network without the requirement of persistent excitation. The purpose of using the projection algorithm for the feedback weight matrix between the neurons is to ensure a stable neural network. Based on the advantages of sliding-mode control, e.g., fast response, less sensitivity to uncertainties, and easy implementation [29], [30], the proposed RNN-based multivariable adaptive control (RNNBMAC) is designed to contain two parts: 1) the equivalent control to deal with the online nominal dynamic behavior and 2) the switching control to enhance the system robustness. Under mild conditions, the convergent region for the tracking error of the proposed control can be smaller than that of robust control. The semiglobal stability of the overall system can be proved by the Lyapunov stability theory. In this paper, the following features (see [28]) are included. 1) The linearly parameterized weight for the vector function approximation error using the RNN is first derived. The upper bound of its residue caused by the linearly parameterized approximation error is estimated and then learned for the compensation. 2) An e-modification learning law with the projection for the weight matrix is applied to guarantee its boundedness without the requirement of persistent excitation. As compared with [28] and other recent papers (see [17]–[27]), the following important contributions of this paper are described. 1) The NARMA is considered to improve the system performance of robust control with a limited amplitude of uncertainty.
2) The variations of the time delay will not adversely destroy the stability of the closed-loop system; however, the system performance deteriorates when the larger variations of the system vector functions are learned poorly. 3) To cope with the time-varying delay, a new switching surface is designed and discussed. 4) The amplitude of the switching gain to obtain the specific convergent set of the switching surface is verified. 5) Simulations, including the reinforcement learning control [9], the robust control for the known NARMA with uncertainty, and the proposed RNNBMAC for the unknown NARMA with known time-varying delay, are presented to confirm the effectiveness and robustness of the developed RNNBMAC.


Fig. 1. Control block diagram of the overall system.

II. PROBLEM FORMULATION

Consider a class of nonlinear discrete-time multivariable dynamic systems with time-varying delay described by the following NARMA model:

Y(k) = f_1(X(k)) + F_2(X(k))U(k − d(k))    (1)

X(k) = [Y(k − 1) . . . Y(k − n_y) U(k − d(k) − 1) . . . U(k − d(k) − n_u)]    (2)

where d(k) represents the time-varying but known integer delay; Y(k) and U(k) ∈ R^n denote the system output and the control input, respectively; n_y and n_u are the degrees of the system output and the control input, respectively; the vector function f_1(X) ∈ R^n and the matrix function F_2(X) ∈ R^{n×n} are continuous but unknown; and F_2(X) is nonsingular for all X(k) [5], [6]. Without loss of generality, F_2(X) > 0 ∀X(k) is assumed. To begin with, two classes of neural networks are employed to learn the unknown vector function f_1(X) and the matrix function F_2(X). Since they are functions of X(k), the number of inputs of the two classes of neural networks grows as n_y, n_u, or n grows. Since n is the dimension of the input and output, it depends on the controlled system. In this situation, the computation time of a neural controller increases as n_y or n_u grows, and the effectiveness of the proposed control becomes limited. On the other hand, the RNN is a dynamic mapping, which is more suitable for learning input–output behavior with a dynamic feature [e.g., f_1(X) and F_2(X)]. Since the RNN possesses nonlinear connection weights, the stability of the RNN-based adaptive control system is difficult to verify and analyze. Hence, the linearly parameterized approximation error of the RNN is first derived [28]. To compensate for the residue caused by the linearized parameterization of the approximation error, an extra network is established to estimate its upper bound. Without the requirement of persistent excitation (see [31]–[33]), an e-modification learning law with a projection for the weight matrix is also designed to guarantee its boundedness. The proposed RNNBMAC contains two parts: 1) the equivalent control to deal with the online nominal dynamics [i.e., learning f_1(X) and F_2(X) by two classes of RNN] and 2) the switching control to enhance the system robustness. In addition, no state estimator is required for the proposed control (Fig. 1).

III. RECURRENT NEURAL NETWORK WITH RESIDUE COMPENSATION

Since the nonlinearly parameterized weight for the function approximation error is changed into a linearly parameterized



form, the RNN with residue is derived. At the beginning, the RNN performing as an approximator is suggested as follows:

f̂(x̄, Ŵ_1, Ŵ_2, Ŵ_3) = Ŵ_1^T(k) Φ̂(Ŵ_2^T(k)x̄(k) + Ŵ_3^T(k)z^{−1}(Φ̂))    (3)

where Ŵ_1^T(k) ∈ R^{n×m} is the output-hidden weight matrix, Ŵ_2^T(k) ∈ R^{m×(2n+1)} is the hidden-input weight matrix, Ŵ_3^T(k) ∈ R^{m×m} is the recurrent weight matrix, x̄(k) = [x^T(k) b]^T = [Y^T(k − 1) U^T(k − d(k) − 1) b]^T ∈ R^{(2n+1)×1} is the input vector with a known constant b, Φ̂(x̄) = Φ̂(Ŵ_2^T(k)x̄(k) + Ŵ_3^T(k)z^{−1}(Φ̂)) ∈ R^{m×1}, and z^{−1}(·) is the backward-time shift operator, i.e., z^{−1}(Φ̂) = z^{−1}{Φ̂[x̄(k)]} = Φ̂[x̄(k − 1)], where its ith component is φ̂_i(x̄_i) = [1 − exp(−2ρx̄_i)]/[1 + exp(−2ρx̄_i)], i = 1, 2, . . . , m, ρ > 0, and x̄_i(k) = (Ŵ_2^T(k)x̄(k) + Ŵ_3^T(k)z^{−1}(Φ̂))_i. The reason to use the hyperbolic tangent function is that the neuron model of the network is antisymmetric, which learns faster than a nonsymmetric activation function (e.g., the sigmoid function). If Ŵ_3(k) = 0, then the network becomes an MNN; if Ŵ_3(k) = 0 and Ŵ_2(k) is known, it becomes an RBFNN. The universal approximation theory is as follows.

Theorem 1: Suppose f(x): Ω → R^n denotes a continuous vector function, which is relatively bounded with respect to x(k) ∈ Ω (a compact subset of R^n). For an arbitrary constant ε > 0, there exist an integer m (the number of hidden neurons) and real constant matrices W̄_1, W̄_2, and W̄_3, where 0 < ε̄ ≤ ‖W̄_3‖_F ≤ W̆_3 < √m, such that

f(x) = W̄_1^T Φ(W̄_2^T x̄(k) + W̄_3^T z^{−1}(Φ)) + ε_f(x)    (4)

where ε_f(x) denotes the vector approximation error satisfying ‖ε_f(x)‖ ≤ ε, ∀x(k) ∈ Ω.

Remark 1: The constant matrices in Theorem 1 are not unique and satisfy the following inequalities: ‖W̄_1‖_F ≤ W̆_1, ‖W̄_2‖_F ≤ W̆_2, and 0 < ε̄ ≤ ‖W̄_3‖_F ≤ W̆_3 < √m, where ε̄, W̆_1, W̆_2, and W̆_3 are known. Since the RNN is a dynamic mapping, the condition ‖W̄_3‖_F < √m is required for a stable neural network.
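As a concrete illustration, the approximator (3) can be sketched in NumPy. The class name, the dimensions n and m, and the constants ρ and b below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Sketch of the recurrent approximator (3). Dimensions (n, m) and the
# constants rho, b are illustrative assumptions, not the paper's values.
class RNNApprox:
    def __init__(self, n=2, m=5, rho=1.0, b=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = 0.1 * rng.standard_normal((m, n))          # output-hidden weights
        self.W2 = 0.1 * rng.standard_normal((2 * n + 1, m))  # hidden-input weights
        self.W3 = 0.1 * rng.standard_normal((m, m))          # recurrent weights
        self.rho, self.b = rho, b
        self.phi_prev = np.zeros(m)  # z^{-1}(Phi): activation from step k-1

    def phi(self, v):
        # antisymmetric hyperbolic-tangent activation of (3), values in (-1, 1)
        return (1 - np.exp(-2 * self.rho * v)) / (1 + np.exp(-2 * self.rho * v))

    def step(self, y_prev, u_prev):
        # x_bar = [Y^T(k-1) U^T(k-d(k)-1) b]^T
        x_bar = np.concatenate([y_prev, u_prev, [self.b]])
        act = self.phi(self.W2.T @ x_bar + self.W3.T @ self.phi_prev)
        self.phi_prev = act                  # stored for the next step's z^{-1}(Phi)
        return self.W1.T @ act               # f_hat = W1^T Phi(...)
```

Setting `W3` to zero reduces the sketch to a feedforward (MNN-like) map, mirroring the remark after (3).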
Hence, a learning law for Ŵ_3(k) is modified as a projection algorithm to ensure an effective learning.

Lemma 1 [12]: The approximation error of the vector function f̃ = f − f̂ is transformed into the following linearly parameterized form:

f̃(x̄, Ŵ_1, Ŵ_2, Ŵ_3) = W̃_1^T(k)[Φ̂(x̄) − Φ̂′(x̄)Ŵ_2^T(k)x̄(k) − 2Φ̂′(x̄)Ŵ_3^T(k)z^{−1}(Φ̂)] + Ŵ_1^T(k)Φ̂′(x̄)[W̃_2^T(k)x̄(k) + 2W̃_3^T(k)z^{−1}(Φ̂)] + ε̃_f(x̄, k)    (5)

where

ε̃_f(x̄, k) = W̃_1^T(k)Φ̂′(x̄)[W̄_2^T x̄(k) + W̃_3^T(k)z^{−1}(Φ̂) + Ŵ_3^T(k)z^{−1}(Φ̂) + W̄_3^T z^{−1}(Φ̂)] + Ŵ_1^T(k)Φ̂′(x̄)W̃_3^T(k)z^{−1}(Φ̂) + W̄_1^T O[W̃_2^T(k)x̄(k) + W̃_3^T(k)z^{−1}(Φ̂) + Ŵ_3^T(k)z^{−1}(Φ̂)]^2 + ε_f(x̄)    (6)

Φ̂′(x̄) = diag[φ̂′_1(x̄), φ̂′_2(x̄), . . . , φ̂′_m(x̄)] ∈ R^{m×m} = diag[ρ(1 + φ_i(x̄))(1 − φ_i(x̄))], i = 1, 2, . . . , m    (7)

where φ̂′_i(x̄) = dφ̂_i(x)/dx|_{x=x̄}, 1 ≤ i ≤ m, and O[·]^2 denotes the sum of the order terms not less than two of the argument.

Remark 2: The first and the second terms of f̃(x̄, Ŵ_1, Ŵ_2, Ŵ_3) in (5) are canceled by the learning law of the weights Ŵ_1(k), Ŵ_2(k), and Ŵ_3(k). The residue ε̃_f(x̄, k) is unknown, but its upper bound is compensated by a simple network with the learning weight Ŵ_4(k).

Lemma 2 [12]: The residue term ‖ε̃_f(x̄, k)‖ is bounded by the following inequality:

‖ε̃_f(x̄, k)‖ ≤ W̄_4^T Ψ(k)    (8)

where the parameter vector W̄_4 ∈ R^{7×1} possesses all positive but unknown components, and the known function

Ψ(k) = Ψ(x̄, ‖Ŵ_1‖_F, ‖Ŵ_2‖_F, ‖Ŵ_3‖_F) = [1 ‖x̄(k)‖ ‖Ŵ_1(k)‖_F ‖Ŵ_3(k)‖_F ‖x̄(k)‖‖Ŵ_1(k)‖_F ‖x̄(k)‖‖Ŵ_2(k)‖_F ‖Ŵ_1(k)‖_F‖Ŵ_3(k)‖_F]^T.    (9)

Remark 3: Based on the result of [12], the total number of the connection weights of the dynamic mapping for the vector function approximation [i.e., f(X) ∈ R^n, n ≥ 2] using the RNN, the MNN, and the RBFNN is as follows:

N_RNN = n × m + m × (n_RNN + 1) + m × m, n_RNN = k, m ≥ 2
N_MNN = n × m + m × (n_MNN + 1), n_MNN = k × n_ord, n_ord ≥ 2
N_RBFNN = n × n_seg^{n_RBFNN} + 1, n_RBFNN = k × n_ord, n_seg ≥ 3

where n_RNN, n_MNN, and n_RBFNN denote the number of the input layers for the RNN, the MNN, and the RBFNN, respectively; and m, n_ord, and n_seg are the number of hidden neurons, the order of the nonlinear function, and the number of segments for every input signal (in general, it is odd), respectively. It indicates that the RNN is more suitable for the approximation of a dynamic mapping.

IV. DESIGN OF RNNBMAC

Since F_2(X) is a matrix function, it is decomposed as n column vector functions with RNN approximation

F_2(X) = [f_21(X) f_22(X) . . . f_2n(X)]    (10a)

f_2q(X) = W̄_12q^T Φ(W̄_22q^T x̄(k) + W̄_32q^T z^{−1}(Φ)) + ε_2q(x̄)    (10b)

where x̄(k) is the same as in (3), and ‖ε_2q(x̄)‖ ≤ ε, ∀x(k) ∈ Ω, q = 1, 2, . . . , n. Similarly

f_1(X) = W̄_11^T Φ(W̄_21^T x̄(k) + W̄_31^T z^{−1}(Φ)) + ε_1(x̄)    (11)

where ‖ε_1(x̄)‖ ≤ ε, ∀x(k) ∈ Ω. By Theorem 1, (10) and (11) are valid. Then, the following switching surface is defined:

S(k) = Σ_{j=1}^{d} H_j S(k − j) + T_1 E(k) + T_2 E(k − 1)    (12)


where S(k) ∈ R^n, T_l = diag(t_lii), |t_2ii/t_1ii| < 1, l = 1, 2, i = 1, 2, . . . , n, E(k) = R(k) − Y(k), R(k) is any bounded reference input, and H_j = h_jii I_n is chosen such that the polynomial equation I_n − Σ_{j=1}^{d} H_j z^{−j} = 0 is Hurwitz. Without loss of generality, it is assumed that T_1 > 0.

Remark 4: It is assumed that no pole-zero cancellation in (12) occurs. The switching surface had better contain a pole at 1^− so that its integral feature can eliminate the dc biasing tracking error.

A. Learning Laws for Weight Matrices

The learning laws for the weight matrices are designed as follows:

Ŵ_ij(k + 1) = Ŵ_ij(k) + Λ_ij(k) − η_ij Ŵ_ij(k), i = 1, 2, 4    (13a)

Ŵ_3j(k + 1) = Ŵ_3j(k) + Λ_3j(k) − η_3j Ŵ_3j(k) − P_j(k)Ŵ_3j(k), j = 1, 2_q, q = 1, 2, . . . , n    (13b)

where the basis functions Λ_ij(k) and the projection term P_j(k) are given in (14)–(17) below.
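A minimal sketch of the update (13b), combining the e-modification leak −ηŴ with the projection of (16c). The correction term Λ, the constants η, κ_j, and the bound m_j below are illustrative assumptions, not tuned values from the paper:

```python
import numpy as np

# Sketch of the e-modification update (13b) with the projection (16c).
# Lam stands for the correction term Lambda_3j(k); eta, kappa, m_j are
# illustrative constants, not the paper's tuning.
def update_W3(W3, Lam, eta=0.05, kappa=0.5, m_j=9.0):
    drift = Lam - eta * W3                       # e-modification: leak toward zero
    inner = np.trace(drift.T @ W3)               # tr{[Lam - eta*W3]^T W3}
    if np.linalg.norm(W3, 'fro') >= np.sqrt(m_j) and inner > 0:
        # projection gain of (16c): active only when the norm bound is hit
        # and the update would push the weight outward
        P = inner / (kappa + np.linalg.norm(W3, 'fro') ** 2)
    else:
        P = 0.0
    return W3 + drift - P * W3                   # (13b)
```

When the Frobenius norm of `W3` exceeds √m_j and the raw update points outward, the projection term cancels most of the outward growth, which is the mechanism keeping ‖Ŵ_3j(k)‖_F bounded.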

Λ_11(k) = −α_11[Φ̂(x̄) − Φ̂′(x̄)Ŵ_21^T(k)x̄(k) − 2Φ̂′(x̄)Ŵ_31^T(k)z^{−1}(Φ̂)]S^T(k − d)T_1/2    (14a)

Λ_12q(k) = −α_12q u_eqq(k − d)[Φ̂(x̄) − Φ̂′(x̄)Ŵ_22q^T(k)x̄(k) − 2Φ̂′(x̄)Ŵ_32q^T(k)z^{−1}(Φ̂)]S^T(k − d)T_1/2    (14b)

Λ_21(k) = −α_21 x̄(k)S^T(k − d)T_1 Ŵ_11^T(k)Φ̂′(x̄)/2    (15a)

Λ_22q(k) = −α_22q u_eqq(k − d)x̄(k)S^T(k − d)T_1 Ŵ_12q^T(k)Φ̂′(x̄)/2    (15b)

Λ_31(k) = −α_31 z^{−1}(Φ̂)S^T(k − d)T_1 Ŵ_11^T(k)Φ̂′(x̄)    (16a)

Λ_32q(k) = −α_32q u_eqq(k − d)z^{−1}(Φ̂)S^T(k − d)T_1 Ŵ_12q^T(k)Φ̂′(x̄)    (16b)

P_j(k) = 0, if ‖Ŵ_3j(k)‖_F < √m_j, or if ‖Ŵ_3j(k)‖_F ≥ √m_j and tr{[Λ_3j(k) − η_3j Ŵ_3j(k)]^T Ŵ_3j(k)} ≤ 0;
P_j(k) = tr{[Λ_3j(k) − η_3j Ŵ_3j(k)]^T Ŵ_3j(k)}/[κ_j + ‖Ŵ_3j(k)‖_F^2], if ‖Ŵ_3j(k)‖_F ≥ √m_j and tr{[Λ_3j(k) − η_3j Ŵ_3j(k)]^T Ŵ_3j(k)} > 0    (16c)

Λ_41(k) = α_41‖S(k − d)‖Ψ_1(k)/2    (17a)

Λ_42q(k) = α_42q u_eqq(k − d)‖S(k − d)‖Ψ_2q(k)/2    (17b)

where 0 ≤ κ_j, 0 < α_ij, 0 < η_ij < 1, for i = 1, 2, 3, 4, j = 1, 2_q, q = 1, 2, . . . , n, and u_eqq(k − d) is the qth component of U_eq(k − d) in (21). The learning laws (13a) and (13b) possess the learning rates α_ij, the error function S(k − d), and the specific basis functions Λ_ij(k), i = 1, 2, 3, 4, j = 1, 2_q, q = 1, 2, . . . , n. In general, the learning rate should be chosen small enough to avoid the instability of the closed-loop system. The term η_ij Ŵ_ij(k) in (13) is the reason for the boundedness of the learning weight matrices without the requirement of the persistent excitation condition. Otherwise, a drift of the learning weights probably occurs [12]. In addition, η_ij, i = 1, 2, 3, 4, j = 1, 2_q, q = 1, 2, . . . , n, are small to allow a possibility of effective learning of Ŵ_ij(k). Too large values of η_ij will force Ŵ_ij(k) to converge to a neighborhood of zero. Under this circumstance, if W̄_ij, i = 1, 2, 3, 4, j = 1, 2_q, q = 1, 2, . . . , n, are not very small, a poor learning of Ŵ_ij(k) or a poor approximation of the nonlinear vector or matrix function occurs. The projection term P_j(k)Ŵ_3j(k) in (16c) ensures that ‖Ŵ_3j(k)‖_F < √m_j as k → ∞. Due to F_2(X) > 0, a similar projection for the learning function F̂_2(k) should be applied to ensure that F̂_2(k) > 0 ∀k.

B. Proposed RNNBMAC

The following two signals are connected with the system uncertainties. The first one is

Δ_1(k) = F̂_2^{−1}(k)F_2(X) − I    (18)

where F̂_2(k) = {col[Ŵ_12q^T Φ̂(Ŵ_22q^T x̄(k) + Ŵ_32q^T z^{−1}(Φ̂))], q = 1, 2, . . . , n} is the learning of F_2(X), and ‖Δ_1(k)‖_F ≤ β < 1. The second one is

Δ_2(k) = −T_1{f̃_1(k) + F̃_2(k)U_eq(k)}

(19)

where f̃_1(k) = f_1(X) − f̂_1(k), F̃_2(k) = F_2(X) − F̂_2(k), and U_eq(k) is described by (21) and (22). Before designing the proposed RNNBMAC, Lemma 3 about the properties of the trace operator is given.

Lemma 3 [12]: Define the trace operation as tr[A] = Σ_{i=1}^{n} a_ii, where A ∈ R^{n×n}. Then, the following is obtained.
1) tr[ABC] = tr[CAB] = tr[BCA], where A, B, and C are three compatible square (or nonsquare) matrices.
2) tr[W̃^T Ŵ] = −[‖W̃‖_F^2 + ‖Ŵ‖_F^2 − ‖W̄‖_F^2]/2.

The main theorem of this paper is described as follows.

Theorem 2: Consider the controlled system (1) or (2) and the following RNNBMAC:

U(k − d) = U_eq(k − d) + U_sw(k − d)

(20)

where

U_eq(k − d) = −[T_1 F̂_2(k)]^{−1} {T_1 f̂_1(k) − Σ_{j=1}^{d} H_j S(k − j) + S(k − d) − T_1 R(k) − T_2 E(k − 1) + Θ(k)}    (21)

Θ(k) = T_1[Λ_11(k)/2 − η_11 Ŵ_11(k)]^T [Φ̂(k) − Φ̂′(k)Ŵ_21^T(k)x̄(k) − 2Φ̂′(k)Ŵ_31^T(k)z^{−1}(Φ̂)]
  + T_1 Ŵ_11^T(k)Φ̂′(k)[Λ_21(k)/2 − η_21 Ŵ_21(k)]^T x̄(k)
  + T_1 Ŵ_11^T(k)Φ̂′(k)[Λ_31(k) − 2η_31 Ŵ_31(k)]^T z^{−1}(Φ̂)
  − α_41 S(k − d)Ψ_1^T(k)Ψ_1(k)/4
  − (1 − η_41)S(k − d)Ŵ_41^T(k)Ψ_1(k)/‖S(k − d)‖
  + u_eq,1∼n(k − d)T_1[Λ_12,1∼n(k)/2 − η_12,1∼n Ŵ_12,1∼n(k)]^T [Φ̂(k) − Φ̂′(k)Ŵ_22,1∼n^T(k)x̄(k) − 2Φ̂′(k)Ŵ_32,1∼n^T(k)z^{−1}(Φ̂)]
  + u_eq,1∼n(k − d)T_1 Ŵ_12,1∼n^T(k)Φ̂′(k)[Λ_22,1∼n(k)/2 − η_22,1∼n Ŵ_22,1∼n(k)]^T x̄(k)
  + u_eq,1∼n(k − d)T_1 Ŵ_12,1∼n^T(k)Φ̂′(k)[Λ_32,1∼n(k) − 2η_32,1∼n Ŵ_32,1∼n(k)]^T z^{−1}(Φ̂)
  − α_42,1∼n u_eq,1∼n^2(k − d)S(k − d)Ψ_2,1∼n^T(k)Ψ_2,1∼n(k)/4
  − (1 − η_42,1∼n)u_eq,1∼n(k − d)S(k − d)Ŵ_42,1∼n^T(k)Ψ_2,1∼n(k)/‖S(k − d)‖    (22)

U_sw(k − d) = ξ(k) f̄_Δ̃2(k)[T_1 F̂_2(k)]^{−1} S(k − d)/[(1 − β)(1 + β)‖S(k − d)‖], if ‖S(k − d)‖ > 2(1 + β) f̄_Δ̃2(k)/[(1 − β)α]; U_sw(k − d) = 0, otherwise    (23)

where Δ̃_2(k) = Δ_2(k) + Θ(k) and ‖Δ̃_2(k)‖ ≤ f̄_Δ̃2(k). In addition, f̂_1(k) is the learning of f_1(X); Ŵ_ij(k), i = 1, 2, 3, and Ŵ_4j(k), j = 1, 2_q, q = 1, 2, . . . , n, are achieved from the learning laws (13)–(17); and 0 < α = 1 − μ(1 + β)^2/(1 − β)^2 < 1, where 0 < μ < 1 denotes the exponential convergence rate to the switching surface. The amplitude of the switching gain ξ(k) is achieved by the following inequality:

ξ_2(k) > ξ(k) > ξ_1(k) ≥ 0    (24)

where

ξ_1,2(k) = h_1(k) ∓ √[h_1^2(k) − h_2(k)]    (25)

h_1(k) = (1 − β)^2‖S(k − d)‖/[(1 + β) f̄_Δ̃2(k)] − (1 − β)    (26)

h_2(k) = (1 − β)^2[f̄_Δ̃2^2(k) + μ‖S(k − d)‖^2]/f̄_Δ̃2^2(k) > 0.    (27)

If the overall system satisfies the conditions (8), (18), (19), and (23) and ‖Ŵ_3j(0)‖_F ≤ W̆_3j < √m_j, for j = 1, 2_q, q = 1, 2, . . . , n, then the signals {Ŵ_ij(k), for i = 1, 2, 3, 4, j = 1, 2_q, q = 1, 2, . . . , n, S(k − d), U(k − d)} are bounded, and the performances ‖S(k − d)‖ < 2(1 + β) f̄_Δ̃2(k)/[(1 − β)α] and ‖Ŵ_3j(k)‖_F < √m_j, j = 1, 2_q, q = 1, 2, . . . , n, as k → ∞, are achieved.

Proof: If the following mathematical expressions are not vague, their arguments are omitted for simplicity. Consider the case of x ∈ Ω. Define the following Lyapunov function:

V(W̃_11, W̃_21, W̃_31, W̃_41, W̃_12,1∼n, W̃_22,1∼n, W̃_32,1∼n, W̃_42,1∼n, S) = Σ_{i=1}^{4} tr[W̃_i1^T W̃_i1]/α_i1 + Σ_{q=1}^{n} Σ_{i=1}^{4} tr[W̃_i2q^T W̃_i2q]/α_i2q + S^T S/2 = Z^T Q Z    (28)

where

Z^T = [‖W̃_11‖_F ‖W̃_21‖_F ‖W̃_31‖_F ‖W̃_41‖_F ‖W̃_121‖_F ‖W̃_221‖_F ‖W̃_321‖_F ‖W̃_421‖_F · · · ‖W̃_12n‖_F ‖W̃_22n‖_F ‖W̃_32n‖_F ‖W̃_42n‖_F ‖S‖]    (29)

Q = diag[1/α_11 1/α_21 1/α_31 1/α_41 1/α_121 1/α_221 1/α_321 1/α_421 · · · 1/α_12n 1/α_22n 1/α_32n 1/α_42n 1/2].    (30)

At the beginning, the case of P_j = 0, for j = 1, 2_q, q = 1, 2, . . . , n, is examined. By (12) and (13), the change rate of V in (28) is described as follows:

ΔV = Σ_{i=1}^{4} ΔV_i1 + Σ_{q=1}^{n} Σ_{i=1}^{4} ΔV_i2q + ΔV_4n+5
  = Σ_{i=1}^{4} tr[(W̃_i1 − Λ_i1 + η_i1 Ŵ_i1)^T (W̃_i1 − Λ_i1 + η_i1 Ŵ_i1) − W̃_i1^T W̃_i1]/α_i1
  + Σ_{q=1}^{n} Σ_{i=1}^{4} tr[(W̃_i2q − Λ_i2q + η_i2q Ŵ_i2q)^T (W̃_i2q − Λ_i2q + η_i2q Ŵ_i2q) − W̃_i2q^T W̃_i2q]/α_i2q
  + ΔS^T(k)ΔS(k)/2 + S^T(k − d)ΔS(k)    (31)

where

ΔS(k) = S(k) − S(k − d) = −(T_1 F̂_2)(I + Δ_1)U_sw + Δ_2 + Θ    (32)

where Δ_1 and Δ_2 are shown in (18) and (19), respectively. Based on Lemma 1

Δ_2 = −T_1{f̃_1 + F̃_2 U_eq} = −T_1{f̃_1 + Σ_{q=1}^{n} u_eqq f̃_2q}    (33)

where

f̃_1 = W̃_11^T[Φ̂ − Φ̂′Ŵ_21^T x̄ − 2Φ̂′Ŵ_31^T z^{−1}(Φ̂)] + Ŵ_11^T Φ̂′[W̃_21^T x̄ + 2W̃_31^T z^{−1}(Φ̂)] + ε̃_f1    (34a)

f̃_2q = W̃_12q^T[Φ̂ − Φ̂′Ŵ_22q^T x̄ − 2Φ̂′Ŵ_32q^T z^{−1}(Φ̂)] + Ŵ_12q^T Φ̂′[W̃_22q^T x̄ + 2W̃_32q^T z^{−1}(Φ̂)] + ε̃_f2q, q = 1, 2, . . . , n.    (34b)
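The derivation that follows invokes the trace identities of Lemma 3 repeatedly; they can be checked numerically on random matrices (a sketch, not part of the proof):

```python
import numpy as np

# Numerical check of Lemma 3's trace identities with random matrices.
rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

# 1) cyclic property of the trace: tr[ABC] = tr[CAB] = tr[BCA]
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
assert np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A))

# 2) with W_tilde = W_bar - W_hat:
#    tr[W_tilde^T W_hat] = -(||W_tilde||_F^2 + ||W_hat||_F^2 - ||W_bar||_F^2)/2
W_bar, W_hat = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
W_til = W_bar - W_hat
lhs = np.trace(W_til.T @ W_hat)
rhs = -(np.linalg.norm(W_til, 'fro') ** 2 + np.linalg.norm(W_hat, 'fro') ** 2
        - np.linalg.norm(W_bar, 'fro') ** 2) / 2
assert np.isclose(lhs, rhs)
```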


Then, ΔV_ij for i = 1, 2, 3, 4, j = 1, 2_q, and q = 1, 2, . . . , n is expressed as follows:

ΔV_ij = {−2tr[W̃_ij^T Λ_ij] + 2η_ij tr[W̃_ij^T Ŵ_ij] + tr[Λ_ij^T Λ_ij] − 2η_ij tr[Ŵ_ij^T Λ_ij] + η_ij^2 tr[Ŵ_ij^T Ŵ_ij]}/α_ij.    (35)

By inspecting (13)–(17), using Lemma 3 yields the following:

−2tr[W̃_11^T Λ_11]/α_11 = S^T T_1 W̃_11^T[Φ̂ − Φ̂′Ŵ_21^T x̄ − 2Φ̂′Ŵ_31^T z^{−1}(Φ̂)]
−2tr[W̃_21^T Λ_21]/α_21 = S^T T_1 Ŵ_11^T Φ̂′ W̃_21^T x̄
−2tr[W̃_31^T Λ_31]/α_31 = 2S^T T_1 Ŵ_11^T Φ̂′ W̃_31^T z^{−1}(Φ̂)
−2tr[W̃_41^T Λ_41]/α_41 = −‖S‖W̃_41^T Ψ_1
{tr[Λ_11^T Λ_11] − 2η_11 tr[Ŵ_11^T Λ_11]}/α_11 = −S^T T_1[Λ_11/2 − η_11 Ŵ_11]^T[Φ̂ − Φ̂′Ŵ_21^T x̄ − 2Φ̂′Ŵ_31^T z^{−1}(Φ̂)]
{tr[Λ_21^T Λ_21] − 2η_21 tr[Ŵ_21^T Λ_21]}/α_21 = −S^T T_1 Ŵ_11^T Φ̂′[Λ_21/2 − η_21 Ŵ_21]^T x̄
{tr[Λ_31^T Λ_31] − 2η_31 tr[Ŵ_31^T Λ_31]}/α_31 = −S^T T_1 Ŵ_11^T Φ̂′[Λ_31 − 2η_31 Ŵ_31]^T z^{−1}(Φ̂)
{tr[Λ_41^T Λ_41] − 2η_41 tr[Ŵ_41^T Λ_41]}/α_41 = ‖S‖[α_41‖S‖Ψ_1^T Ψ_1/4 − η_41 Ŵ_41^T Ψ_1].    (36)

Similarly

−2tr[W̃_12q^T Λ_12q]/α_12q = u_eqq S^T T_1 W̃_12q^T[Φ̂ − Φ̂′Ŵ_22q^T x̄ − 2Φ̂′Ŵ_32q^T z^{−1}(Φ̂)]
−2tr[W̃_22q^T Λ_22q]/α_22q = u_eqq S^T T_1 Ŵ_12q^T Φ̂′ W̃_22q^T x̄
−2tr[W̃_32q^T Λ_32q]/α_32q = 2u_eqq S^T T_1 Ŵ_12q^T Φ̂′ W̃_32q^T z^{−1}(Φ̂)
−2tr[W̃_42q^T Λ_42q]/α_42q = −u_eqq‖S‖W̃_42q^T Ψ_2q
{tr[Λ_12q^T Λ_12q] − 2η_12q tr[Ŵ_12q^T Λ_12q]}/α_12q = −u_eqq S^T T_1[Λ_12q/2 − η_12q Ŵ_12q]^T[Φ̂ − Φ̂′Ŵ_22q^T x̄ − 2Φ̂′Ŵ_32q^T z^{−1}(Φ̂)]
{tr[Λ_22q^T Λ_22q] − 2η_22q tr[Ŵ_22q^T Λ_22q]}/α_22q = −u_eqq S^T T_1 Ŵ_12q^T Φ̂′[Λ_22q/2 − η_22q Ŵ_22q]^T x̄
{tr[Λ_32q^T Λ_32q] − 2η_32q tr[Ŵ_32q^T Λ_32q]}/α_32q = −u_eqq S^T T_1 Ŵ_12q^T Φ̂′[Λ_32q − 2η_32q Ŵ_32q]^T z^{−1}(Φ̂)
{tr[Λ_42q^T Λ_42q] − 2η_42q tr[Ŵ_42q^T Λ_42q]}/α_42q = ‖S‖[α_42q u_eqq^2‖S‖Ψ_2q^T Ψ_2q/4 − η_42q u_eqq Ŵ_42q^T Ψ_2q], q = 1, 2, . . . , n.    (37)

Based on Lemma 2

‖T_1 ε̃_f1‖ ≤ W̄_41^T Ψ_1, ‖T_1 u_eqq ε̃_f2q‖ ≤ W̄_42q^T Ψ_2q.    (38)

Then, the following two inequalities are obtained:

S^T{−T_1 ε̃_f1 − S Ŵ_41^T Ψ_1/‖S‖ − S W̃_41^T Ψ_1/‖S‖} ≤ ‖S‖[‖T_1 ε̃_f1‖ − W̄_41^T Ψ_1] ≤ 0    (39a)

S^T{−u_eqq T_1 ε̃_f2q − S u_eqq Ŵ_42q^T Ψ_2q/‖S‖ − S u_eqq W̃_42q^T Ψ_2q/‖S‖} ≤ ‖S‖[‖u_eqq T_1 ε̃_f2q‖ − W̄_42q^T Ψ_2q] ≤ 0, q = 1, 2, . . . , n    (39b)

where Ψ_2q differs from Ψ_1 in that each of its components is multiplied by the signal |u_eqq|. Using (31)–(39) and Remark 2 yields

ΔV = Σ_{q=1}^{n} Σ_{j=1,2_q} Σ_{i=1}^{3} {2η_ij tr[W̃_ij^T(W̄_ij − W̃_ij)] + η_ij^2 tr[(W̄_ij − W̃_ij)^T(W̄_ij − W̃_ij)]}/α_ij
  + S^T{−T_1 ε̃_f1 − S Ŵ_41^T Ψ_1/‖S‖ − S W̃_41^T Ψ_1/‖S‖}
  + S^T{−u_eqq T_1 ε̃_f2q − S u_eqq Ŵ_42q^T Ψ_2q/‖S‖ − S u_eqq W̃_42q^T Ψ_2q/‖S‖}
  + ΔS^T ΔS/2 − S^T(T_1 F̂_2)(I + Δ_1)U_sw
  ≤ −Σ_{q=1}^{n} Σ_{j=1,2_q} Σ_{i=1}^{4} {η_ij(2 − η_ij)‖W̃_ij‖_F^2 − 2η_ij(1 − η_ij)W̆_ij‖W̃_ij‖_F − η_ij^2 W̆_ij^2}/α_ij + ΔS^T ΔS/2 − S^T(T_1 F̂_2)(I + Δ_1)U_sw.    (40)

Suppose that

−{η_ij(2 − η_ij)‖W̃_ij‖_F^2 − 2η_ij(1 − η_ij)W̆_ij‖W̃_ij‖_F − η_ij^2 W̆_ij^2}/α_ij ≤ −λ_ij‖W̃_ij‖_F^2/α_ij, i = 1, 2, 3, 4, j = 1, 2_q, q = 1, 2, . . . , n.    (41)

Hence, ΔV_ij ≤ −λ_ij V_ij for i = 1, 2, 3, 4, j = 1, 2_q, and q = 1, 2, . . . , n, which implies that G_ij/α_ij ≤ 0, where

G_ij = −[η_ij(2 − η_ij) − λ_ij]{[‖W̃_ij‖_F − η_ij(1 − η_ij)W̆_ij/(η_ij(2 − η_ij) − λ_ij)]^2 − (1 − λ_ij)η_ij^2 W̆_ij^2/[η_ij(2 − η_ij) − λ_ij]^2}    (42)

and η_ij(2 − η_ij) − λ_ij > 0, 1 > η_ij > λ_ij > 0 for i = 1, 2, 3, 4, j = 1, 2_q, and q = 1, 2, . . . , n. If ‖W̃_ij‖_F ≥ [(1 − η_ij) + (1 − λ_ij)^{1/2}]η_ij W̆_ij/[η_ij(2 − η_ij) − λ_ij], then G_ij ≤ 0 for i = 1, 2, 3, 4, j = 1, 2_q, and q = 1, 2, . . . , n. Similarly, ΔV_4n+5 ≤ −μV_4n+5



Fig. 2. Response of (55) and (56) by the proposed control. (a) r1 (k)(· · · ), y1 (k)(−), and ρ(k)(−·). (b) r2 (k)(· · · ), y2 (k)(−), and ρ(k)(−·). (c) u 1 (t)(· · · ) and u 2 (t)(−). (d) s1 (t)(· · · ) and s2 (t)(−).

(or ΔV̄_4n+5 = ΔV_4n+5 + μV_4n+5 ≤ 0) is assumed. Then

ΔV̄_4n+5 = U_sw^T(I + Δ_1)^T(T_1 F̂_2)^T(T_1 F̂_2)(I + Δ_1)U_sw/2 − Δ̃_2^T(T_1 F̂_2)(I + Δ_1)U_sw + Δ̃_2^T Δ̃_2/2 − S^T(T_1 F̂_2)(I + Δ_1)U_sw + μS^T S/2
  ≤ ξ^2 f̄_Δ̃2^2/[2(1 − β)^2] + ξ f̄_Δ̃2^2/(1 − β) + f̄_Δ̃2^2/2 − ξ f̄_Δ̃2‖S‖/(1 + β) + μ‖S‖^2/2
  = f̄_Δ̃2^2{ξ^2 − 2h_1ξ + h_2}/{2(1 − β)^2}    (43)

where the inequality has used (8), (18), and (19), and the expressions of h_1 and h_2 are shown in (26) and (27). From Lemma 4, the inequality ‖S‖ > 2(1 + β) f̄_Δ̃2/[(1 − β)α] results in the facts h_1 > 0 and h_1^2 − h_2 > 0. Then, the result (24) is achieved from the inequality ξ^2 − 2h_1ξ + h_2 < 0. Hence, the change rate of the Lyapunov function (28) becomes

ΔV ≤ −Σ_{q=1}^{n} Σ_{j=1,2_q} Σ_{i=1}^{4} (λ_ij V_ij) − μV_4n+5 ≤ −min_{i=1,2,3,4; j=1,2_q; q=1,2,...,n}(λ_ij, μ)V = −λ_0 V    (44)

where 0 < λ_0 < 1. Then, V(k + 1) − (1 − λ_0)V(k) ≤ 0. Hence, the region outside of the domain D making ΔV ≤ −λ_0 V is described as follows:

D = {Z ∈ R^{4n+5} | 0 ≤ ‖W̃_ij‖_F ≤ W_ij^*, i = 1, 2, 3, 4, j = 1, 2_q, q = 1, 2, . . . , n, 0 ≤ ‖S‖ ≤ S^*}    (45a)

where

W_ij^* = [(1 − η_ij) + √(1 − λ_ij)]η_ij W̆_ij/[η_ij(2 − η_ij) − λ_ij]    (45b)

S^* > 2(1 + β) f̄_Δ̃2/[(1 − β)α].    (45c)

Finally, from (20)–(27), {U} is bounded. Second, the case of P_j ≠ 0, j = 1, 2_q and q = 1, 2, . . . , n, is addressed as follows. The above results are the same except that ΔV_3j, j = 1, 2_q and q = 1, 2, . . . , n, in (35) possesses the extra second term on the right-hand side of

ΔV_3j ≤ −{η_3j(2 − η_3j)‖W̃_3j‖_F^2 − 2η_3j(1 − η_3j)W̆_3j‖W̃_3j‖_F − η_3j^2 W̆_3j^2}/α_3j + {2P_j tr[W̃_3j^T Ŵ_3j] − 2P_j tr[(Λ_3j − η_3j Ŵ_3j)^T Ŵ_3j] + P_j^2 tr[Ŵ_3j^T Ŵ_3j]}/α_3j ≤ −λ_3j V_3j.    (46)



Substituting P_j in (16c) and part 2) of Lemma 3 into the second term of (46) gives

G_3j = 2P_j tr[W̃_3j^T Ŵ_3j] − 2P_j tr[(Λ_3j − η_3j Ŵ_3j)^T Ŵ_3j] + P_j^2 tr[Ŵ_3j^T Ŵ_3j]
  = −[‖W̃_3j‖_F^2 + ‖Ŵ_3j‖_F^2 − ‖W̄_3j‖_F^2] tr[(Λ_3j − η_3j Ŵ_3j)^T Ŵ_3j]/[κ_j + ‖Ŵ_3j‖_F^2] − 2tr[(Λ_3j − η_3j Ŵ_3j)^T Ŵ_3j]^2/[κ_j + ‖Ŵ_3j‖_F^2] + tr[(Λ_3j − η_3j Ŵ_3j)^T Ŵ_3j]^2 tr[Ŵ_3j^T Ŵ_3j]/[κ_j + ‖Ŵ_3j‖_F^2]^2.    (47)

Since ‖W̄_3j‖_F^2 ≤ W̆_3j^2, if ‖Ŵ_3j‖_F ≥ √m_j > W̆_3j and tr[(Λ_3j − η_3j Ŵ_3j)^T Ŵ_3j] > 0, then G_3j < 0. Hence, the projection algorithm (16c) makes the original ΔV_3j ≤ −λ_3j V_3j more negative. Since 0 < ε̄ ≤ ‖Ŵ_3j(0)‖ ≤ W̆_3j ≤ √m_j and the result 0 < ε̄ ≤ ‖W̄_3j‖ ≤ W̆_3j ≤ √m_j exists, ‖Ŵ_3j‖ ≤ √m_j as k → ∞.

Remark 5: The signal Θ(k) in (22) is employed to cancel the unnecessary terms for the satisfaction of the stability of the closed-loop system. It is observed that: 1) if η_ij, i = 1, 2, 3, 4, j = 1, 2_q, and q = 1, 2, . . . , n, are small enough; 2) if ‖S(k − d)‖ → 0; and 3) if f̃_1(k), F̃_2(k) → 0, then Θ(k) is in the neighborhood of zero. The assumption ‖Δ̃_2(k)‖ ≤ f̄_Δ̃2(k) then seems reasonable.

Remark 6: A suitable selection of T_1 can reduce the degree of the singularity of T_1 F̂_2(k). In addition, the freezing technique is suggested to avoid its singularity.

Lemma 4: For the existence of condition (24), the following fact is achieved:

‖S(k − d)‖ > 2(1 + β) f̄_Δ̃2(k)/[(1 − β)α].    (48)

Proof: Since h_1(k) = (1 − β)^2‖S(k − d)‖/[(1 + β) f̄_Δ̃2(k)] − (1 − β) > 0, the following result is obtained:

‖S(k − d)‖ > (1 + β) f̄_Δ̃2(k)/(1 − β).    (49)

Based on the condition h_1^2(k) − h_2(k) > 0, the following equation is derived:

h_1^2(k) − h_2(k) = {(1 − β)^2‖S(k − d)‖/[(1 + β) f̄_Δ̃2(k)] − (1 − β)}^2 − (1 − β)^2[f̄_Δ̃2^2(k) + μ‖S(k − d)‖^2]/f̄_Δ̃2^2(k)
  = ‖S(k − d)‖[(1 − β)^4‖S(k − d)‖α − 2(1 − β)^3(1 + β) f̄_Δ̃2(k)]/[(1 + β)^2 f̄_Δ̃2^2(k)] > 0    (50)

where

0 < α = 1 − μ(1 + β)^2/(1 − β)^2 < 1.    (51)

From (51)

0 < μ = (1 − α)(1 − β)^2/(1 + β)^2 < 1.    (52)

From (50)

‖S(k − d)‖ > 2(1 + β) f̄_Δ̃2(k)/[(1 − β)α].    (53)

To simultaneously satisfy the conditions (49) and (53), the corresponding (48) is achieved.

Fig. 3. Response of the Fig. 2 case in [9]. (a) r1(k)(· · · ), y1(k)(−), and ρ(k)(−·). (b) r2(k)(· · · ), y2(k)(−), and ρ(k)(−·). (c) u1(t)(· · · ) and u2(t)(−).

Remark 7: The exponential convergence rate to the switching surface depends on 0 < μ = (1 − α)(1 − β)^2/(1 + β)^2 < 1. It indicates that the larger α < 1 or ‖Δ_1(k)‖ ≤ β < 1 [i.e., the smaller convergent region in (23) or the smaller estimated uncertain control gain in (18)], the smaller μ.
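The quantity ‖S(k − d)‖ that Lemma 4 and Remark 7 reason about is produced by the switching surface (12), a filtered tracking error. A sketch for n = 2 and d = 2 follows; the gains H_j, T_1, T_2 are illustrative choices (picked so that the Hurwitz condition after (12) holds), not the paper's values:

```python
import numpy as np

# Sketch of the switching surface (12): S(k) = sum_j H_j S(k-j) + T1 E(k) + T2 E(k-1).
# The gains below are illustrative; z^2 - 0.5z + 0.06 has roots 0.2 and 0.3,
# inside the unit circle, so the recursion is stable.
def switching_surface(E_seq, H, T1, T2, n=2):
    d = len(H)
    S_past = [np.zeros(n) for _ in range(d)]   # S(k-1), ..., S(k-d)
    E_prev = np.zeros(n)
    out = []
    for E in E_seq:
        S = sum(H[j] @ S_past[j] for j in range(d)) + T1 @ E + T2 @ E_prev
        out.append(S)
        S_past = [S] + S_past[:-1]             # shift the delay line
        E_prev = E
    return out
```

With a constant tracking error, the recursion settles to a bounded steady state, illustrating why the Hurwitz requirement on I_n − Σ H_j z^{−j} matters.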


Fig. 4. Response of the RNNBMAC with known system functions (57) and (58), d(k) = 1, and uncertainty (59). (a) r1(t)(···) and y1(t)(−). (b) r2(t)(···) and y2(t)(−). (c) e1(t)(···) and e2(t)(−). (d) u1(t)(···) and u2(t)(−). (e) s1(t)(···) and s2(t)(−).

Remark 8: To satisfy condition (24), the amplitude of the switching gain is proposed as

$$\xi(k)=h_1(k)+c_1\sqrt{h_1^{2}(k)-h_2(k)}\,\bigl\{1-c_2\exp[-c_3\|S(k)\|]\bigr\}\tag{54}$$

where $0<c_1\le 2^{-}$, $0<c_2\le 1^{-}$, and $0<c_3<M$ (a large constant) are the parameters of the switching gain.

V. SIMULATIONS AND DISCUSSION

Two simulation examples are presented. The first example, taken from [9], serves as a comparison; the other validates the effectiveness of the proposed control.
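As a sketch of how (54) behaves, the following Python function (treating ‖S(k)‖ as a scalar argument and using default parameter values borrowed from the simulations below are assumptions of this illustration) computes the switching-gain amplitude. Near the switching surface the factor 1 − c₂exp[−c₃‖S(k)‖] shrinks toward 1 − c₂, which attenuates the switching action and reduces chattering; far from the surface the factor approaches 1.

```python
import math

def switching_gain(h1, h2, S_norm, c1=0.5, c2=0.98, c3=200.0):
    """Amplitude xi(k) of the switching gain per (54).

    h1, h2 : the scalars h1(k), h2(k) of Lemma 4 (h1**2 - h2 > 0 assumed)
    S_norm : ||S(k)||, treated as a scalar in this sketch
    """
    attenuation = 1.0 - c2 * math.exp(-c3 * S_norm)
    return h1 + c1 * math.sqrt(h1 ** 2 - h2) * attenuation

# Near the surface the switching part is scaled by 1 - c2; far from it,
# the gain approaches h1 + c1*sqrt(h1**2 - h2).
near = switching_gain(2.0, 1.0, 0.0)
far = switching_gain(2.0, 1.0, 1.0)
assert near < far
```

This also makes the roles of c₂ (depth of attenuation at the surface) and c₃ (how quickly full gain is restored away from it) explicit.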

Example 1: Based on [9, Example 1], the controlled system is rewritten in the form described in (1) with

$$f_1(X)=\begin{bmatrix} f_{11}(X)\\ f_{12}(X)\end{bmatrix},\qquad F_2(X)=\begin{bmatrix} 2 & 0\\ 0 & 2\end{bmatrix}\tag{55}$$

where $f_{11}(X)=0.4y_1(k-2)/[1+y_1^{2}(k-1)]$, $f_{12}(X)=y_2(k-1)[0.1+0.005\cos(y_2(k-2))]$, and $d=2$. The reference input is $R^{T}(k)=[\,0.2\sin(0.01k\pi+0.25\pi)\;\;0.2\cos(0.02k\pi+0.25\pi)\,]$.


Fig. 5. Response of the Fig. 4 case with d(k) = 2. (a) r1(t)(···) and y1(t)(−). (b) r2(t)(···) and y2(t)(−). (c) e1(t)(···) and e2(t)(−). (d) u1(t)(···) and u2(t)(−). (e) s1(t)(···) and s2(t)(−).

In addition, the external disturbance is

$$\rho(k)=\begin{cases}\rho_1(k), & 300>k\\ \rho_2(k), & 300\le k\le 1000\end{cases}\tag{56}$$

where ρ₁(k) and ρ₂(k) are random numbers in the intervals [0, 0.2] and [0.3, 0.5], respectively. It is presented in Fig. 2(a) and (b) with the green dashed-dotted line. The response of the proposed RNNBMAC using the parameters of the switching gain c1 = 0.5, c2 = 0.98, and c3 = 200 in (54), the coefficients of the switching surface H1 = 1.23I2, H2 = −0.235I2, T1 = I2, and T2 = −0.5I2, the control parameters β = 0.001, μ = 0.003, and f̃2 = 0.005, and the learning parameters κ1 = κ21 = κ22 = 0.5, α1j = 0.02,

α2j = 0.004, α3j = 0.01, α4j = 0.0008, j = 1, 2₁, 2₂, and ηij = 0.008, i = 1, 2, 3, 4, j = 1, 2₁, 2₂, is shown in Fig. 2. The responses for nearby parameter values are similar to those in Fig. 2. On the other hand, the response using the reinforcement learning control [9] is shown in Fig. 3 and is poorer than that of Fig. 2. The reasons are summarized as follows.
1) Although [9] emphasizes its use of fewer learning parameters [e.g., two for system (55)], its performance is poor in coping with a random disturbance that changes suddenly.
2) Since [9] uses only one control parameter, the performance deteriorates when the two components of the reference input do not share the same frequency or phase [e.g., dotted lines in Fig. 3(a) and (b)].
3) Although the switching surface is biased after the jump of the external disturbance [Fig. 2(d)], the biased tracking error is eliminated [Fig. 2(a) and (b)] by the coefficient of the switching surface from Remark 4. This is one of the advantageous features of the proposed control.

Fig. 6. Response of the proposed RNNBMAC with time-varying delay between 1 and 2. (a) d(t). (b) r1(t)(···) and y1(t)(−). (c) r2(t)(···) and y2(t)(−). (d) e1(t)(···) and e2(t)(−). (e) u1(t)(···) and u2(t)(−). (f) s1(t)(···) and s2(t)(−). (g) f11(X)(···) and f̂11(t)(−). (h) f12(X)(···) and f̂12(t)(−). (i) f211(X) and f̂211(t). (j) f212(X) and f̂212(t). (k) f221(X) and f̂221(t). (l) f222(X) and f̂222(t).
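For concreteness, the Example 1 plant (55) with the disturbance (56) can be rolled out in a few lines of Python. The open-loop zero-input setting, the additive entry of ρ(k) into both outputs, and the zero initial conditions are assumptions of this sketch, since the model form (1) is not reproduced in this excerpt.

```python
import math
import random

random.seed(0)                      # reproducible disturbance samples
y1, y2 = [0.0, 0.0], [0.0, 0.0]     # assumed zero initial conditions
u1 = u2 = 0.0                       # zero control input for this sketch

for k in range(1000):
    # external disturbance (56): rho1 on [0, 0.2], then rho2 on [0.3, 0.5]
    rho = random.uniform(0.0, 0.2) if k < 300 else random.uniform(0.3, 0.5)
    # plant (55): f11, f12 plus F2 = 2*I acting on the delayed input
    y1.append(0.4 * y1[-2] / (1.0 + y1[-1] ** 2) + 2.0 * u1 + rho)
    y2.append(y2[-1] * (0.1 + 0.005 * math.cos(y2[-2])) + 2.0 * u2 + rho)

# with zero input, the open-loop outputs stay bounded near the disturbance level
assert max(abs(v) for v in y1 + y2) < 1.0
```

The jump of ρ(k) at k = 300 is visible as a level shift in both output sequences, mirroring the sudden-change scenario discussed above.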


Example 2: Consider a two-input, two-output system whose system matrices possess multiplicative uncertain components and a time-varying delay:

$$f_1(X)=\begin{bmatrix} f_{11}(X)[1+\Delta f_{11}(X)]\\ f_{12}(X)[1+\Delta f_{12}(X)]\end{bmatrix},\quad
F_2(X)=\begin{bmatrix} f_{211}(X)[1+\Delta f_{211}(X)] & f_{212}(X)[1+\Delta f_{212}(X)]\\ f_{221}(X)[1+\Delta f_{221}(X)] & f_{222}(X)[1+\Delta f_{222}(X)]\end{bmatrix}\tag{57}$$

where

$$\begin{aligned}
f_{11}(X)&=0.79u_1(k-d(k)-1)y_2(k-3)-0.31u_2(k-d(k)-2)y_1(k-2)+0.08y_1(k-2)y_1(k-4)\\
f_{12}(X)&=0.83u_2(k-d(k)-1)y_1(k-2)+0.23u_1(k-d(k)-2)y_1(k-3)-0.15y_1(k-3)y_2(k-4)\\
f_{211}(X)&=1.4+0.3\cos\bigl(2.5u_1(k-d(k)-1)u_2(k-d(k)-1)-y_1(k-1)y_2(k-2)\bigr)-0.3y_1(k-1)\\
f_{212}(X)&=0.45-0.3\sin\bigl(0.25u_1(k-d(k)-2)u_2(k-d(k)-1)-0.27y_1(k-2)y_2(k-1)\bigr)-0.3y_1(k-1)\\
f_{221}(X)&=0.56+0.45\sin\bigl(0.33u_1(k-d(k)-1)u_2(k-d(k)-2)-0.22y_1(k-1)y_2(k-2)\bigr)+0.35y_2(k-1)\\
f_{222}(X)&=1.6-0.3\cos\bigl(2u_1(k-d(k)-2)u_2(k-d(k)-1)+y_1(k-2)y_2(k-1)\bigr)-0.32y_1(k-2)
\end{aligned}\tag{58}$$

and

$$\begin{aligned}
\Delta f_{11}(X)&=-0.27\sin(0.02u_2(k-d(k)-1))\\
\Delta f_{12}(X)&=0.25\sin(0.015u_2(k-d(k)-2))\\
\Delta f_{211}(X)&=-0.13\sin(0.015u_1(k-d(k)-1))\\
\Delta f_{212}(X)&=0.25\sin(0.02u_1(k-d(k)-2))\\
\Delta f_{221}(X)&=-0.27\cos(0.015y_2(k-3))\\
\Delta f_{222}(X)&=0.28\sin(0.02y_1(k-2)).
\end{aligned}\tag{59}$$
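To make the structure of (57)–(59) concrete, the following Python sketch assembles the uncertain control-gain matrix F₂(X). The dictionary-style indexing convention (u1[1] standing for u₁(k − d(k) − 1), y2[3] for y₂(k − 3), and so on) is an assumption introduced for this illustration.

```python
import math

def F2(u1, u2, y1, y2):
    # nominal entries (58)
    f211 = 1.4 + 0.3 * math.cos(2.5 * u1[1] * u2[1] - y1[1] * y2[2]) - 0.3 * y1[1]
    f212 = 0.45 - 0.3 * math.sin(0.25 * u1[2] * u2[1] - 0.27 * y1[2] * y2[1]) - 0.3 * y1[1]
    f221 = 0.56 + 0.45 * math.sin(0.33 * u1[1] * u2[2] - 0.22 * y1[1] * y2[2]) + 0.35 * y2[1]
    f222 = 1.6 - 0.3 * math.cos(2.0 * u1[2] * u2[1] + y1[2] * y2[1]) - 0.32 * y1[2]
    # multiplicative uncertain components (59)
    d211 = -0.13 * math.sin(0.015 * u1[1])
    d212 = 0.25 * math.sin(0.02 * u1[2])
    d221 = -0.27 * math.cos(0.015 * y2[3])
    d222 = 0.28 * math.sin(0.02 * y1[2])
    # (57): each entry carries its own multiplicative perturbation
    return [[f211 * (1 + d211), f212 * (1 + d212)],
            [f221 * (1 + d221), f222 * (1 + d222)]]

# with small signals the diagonal entries stay dominant, as the offsets
# 1.4 and 1.6 in (58) suggest
u1 = {1: 0.1, 2: 0.2}
u2 = {1: -0.1, 2: 0.05}
y1 = {1: 0.3, 2: -0.2, 3: 0.1}
y2 = {1: 0.2, 2: 0.1, 3: -0.3}
M = F2(u1, u2, y1, y2)
assert M[0][0] > 1.0 and M[1][1] > 1.0
```

Because the perturbations in (59) are bounded sinusoids of small-gain arguments, each bracket 1 + Δf stays close to 1, which is consistent with treating (59) as a multiplicative uncertainty on (58).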

The reference input is R(t) = [sin(2πf t)  sin(2πf t)]ᵀ, where t = kTs, f = 1 Hz, and Ts = 0.01 s. Fig. 4 shows the response of the RNNBMAC for d(k) = 1 with known system functions (57) and (58) but with the uncertainty (59) (i.e., robust control), using the parameters of the switching gain c1 = 0.5, c2 = 0.98, and c3 = 200, the coefficients of the switching surface H1 = 0.99I2, T1 = I2, and T2 = −0.5I2, and the control parameters β = 0.001, μ = 0.003, and f̃2 = 0.005. The result meets our expectation. Similarly, the response of the Fig. 4 case except with d(k) = 2, H1 = 1.23I2, H2 = 0.235I2, T1 = I2, and T2 = −0.75I2 is shown in Fig. 5. The result is similar to that in Fig. 4 but with a slightly larger tracking error [Figs. 4(c) and 5(c)]. Some important observations from Figs. 4 and 5 are drawn as follows.
1) A nonlinear system with a larger time delay is more difficult to control [Figs. 4(c) and (e) and 5(c) and (e)].


2) A larger switching control, as compared with that of Fig. 4, must be used to ensure system stability. However, too large a switching gain (e.g., c1 = 1 or c3 = 500) results in system instability. In addition, the magnitude of the transient depends on the amplitude of the uncertainty (59).
3) The tracking performance is affected by the zero of the switching surface [34]. However, better tracking performance does not necessarily leave enough robustness to cope with extra uncertainty.
4) From the viewpoint of system stability and tracking performance, different control parameters must be considered for the same controlled system with different delays.
5) The response for (57)–(59) with a time-varying delay [e.g., Fig. 6(a)] is unstable when only a robust control is used. This implies that the time-varying delay indeed causes instability of the closed-loop system, which is one of the important motivations of this paper.

Under these circumstances, the above difficulties are tackled by the proposed RNNBMAC. The response of systems (57) and (58) with d(k) = 1 or 2 in Fig. 6(a) and the uncertainty (59), using the RNNBMAC with the parameters c1 = 0.8, c2 = 0.98, c3 = 400, H1 = 1.23I2 and H2 = −0.235I2 for d(k) = 2, H1 = 0.99I2 for d(k) = 1, T1 = I2, T2 = −0.5I2, β = 0.001, μ = 0.003, and f̃2 = 0.0025, and the learning parameters κ1 = κ21 = κ22 = 0.5, α1j = 0.02, α2j = 0.004, α3j = 0.01, α4j = 0.0008, and ηij = 0.0065, i = 1, 2, 3, 4, j = 1, 2₁, 2₂, is shown in Fig. 6, which is indeed satisfactory. Although the tracking error in Fig. 6(d) is a little larger than those in Figs. 4(c) and 5(c), simultaneous convergence of the tracking error is achieved with one set of suitable control parameters. Although the time-varying delay must be known in advance, it can be estimated online [35]. Finally, the output response of the Fig. 6 case without the uncertainty (59) is shown in Fig. 7, which is also satisfactory.

Fig. 7. Output response of the Fig. 6 case with Δf11 = Δf12 = Δf211 = Δf212 = Δf221 = Δf222 = 0. (a) r1(t)(···) and y1(t)(−). (b) r2(t)(···) and y2(t)(−).

VI. CONCLUSION

In this paper, a neural adaptive controller is designed for an approximate NARMA model that is highly nonlinear, unknown, and multivariable, with a known time-varying delay. The proposed controller contains an equivalent control and a switching control. The equivalent control uses two classes of vector functions learned by RNNs, the switching surface, and a bounded reference input. To compensate for the residue caused by the linearized parameterization of the function-approximation error, a simple network is also established to estimate its upper bound for controller compensation. A projection algorithm for learning the feedback weight matrix is applied to guarantee the stability of the two classes of RNNs. No state estimator or persistent excitation is required by the proposed control. Under suitable conditions, semiglobally ultimately bounded tracking with boundedness of the learned weight matrices is achieved. We demonstrate that an appropriate use of neural networks in control applications brings benefits such as simultaneous convergence of the tracking error, smoothness of the control input, and easy selection of control parameters under a time-varying delay. Simulations, including the reinforcement learning control [9], the robust control, and the proposed control, show the effectiveness and robustness of the proposed control. One future work is the online learning of the time-varying delay [35] to extend our methodology.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their valuable comments and Prof. Y.-H. Chen of the Georgia Institute of Technology, Atlanta, GA, USA, for improving the English.

REFERENCES

[1] H. K. Khalil, Nonlinear Systems, 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall, 1996.
[2] C.-L. Hwang, "Neural-network-based variable structure control of electrohydraulic servosystems subject to huge uncertainties without persistent excitation," IEEE/ASME Trans. Mechatronics, vol. 4, no. 1, pp. 50–59, Mar. 1999.
[3] C.-L. Hwang and C. J. Hsu, "Nonlinear control design for a Hammerstein model system," IEE Proc.-Control Theory Appl., vol. 142, no. 4, pp. 277–285, Jul. 1995.
[4] G. Tao and P. V. Kokotovic, Adaptive Control of Systems With Actuator and Sensor Nonlinearities. New York, NY, USA: Wiley, 1996.
[5] M. A. Brdys and G. J. Kulawski, "Dynamic neural controllers for induction motor," IEEE Trans. Neural Netw., vol. 10, no. 2, pp. 340–355, Mar. 1999.


[6] G. A. Barreto and A. F. R. Araujo, "Identification and control of dynamical systems using the self-organizing map," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1244–1259, Sep. 2004.
[7] F.-C. Chen and H. K. Khalil, "Adaptive control of a class of nonlinear discrete-time systems using neural networks," IEEE Trans. Autom. Control, vol. 40, no. 5, pp. 791–801, May 1995.
[8] S. Jagannathan and F. L. Lewis, "Multilayer discrete-time neural-net controller with guaranteed performance," IEEE Trans. Neural Netw., vol. 7, no. 1, pp. 107–130, Jan. 1996.
[9] Y.-J. Liu, L. Tang, S. Tong, C. L. P. Chen, and D.-J. Li, "Reinforcement learning design-based adaptive tracking control with less learning parameters for nonlinear discrete-time MIMO systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 26, no. 1, pp. 165–176, Jan. 2015.
[10] C.-C. Ku and K. Y. Lee, "Diagonal recurrent neural networks for dynamic systems control," IEEE Trans. Neural Netw., vol. 6, no. 1, pp. 144–156, Jan. 1995.
[11] M. K. Sudareshan and T. A. Condarcure, "Recurrent neural-network training by a learning automaton approach for trajectory learning and control system design," IEEE Trans. Neural Netw., vol. 9, no. 3, pp. 354–368, May 1998.
[12] C.-L. Hwang and C.-H. Lin, "A discrete-time multivariable neuroadaptive control for nonlinear unknown dynamic systems," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 30, no. 6, pp. 865–877, Dec. 2000.
[13] Z. Yi and K. K. Tan, "Multistability of discrete-time recurrent neural networks with unsaturating piecewise linear activation functions," IEEE Trans. Neural Netw., vol. 15, no. 2, pp. 329–336, Mar. 2004.
[14] Q. Zhu and L. Guo, "Stable adaptive neurocontrol for nonlinear discrete-time systems," IEEE Trans. Neural Netw., vol. 15, no. 3, pp. 653–662, May 2004.
[15] S. Cong and Y. Liang, "PID-like neural network nonlinear adaptive control for uncertain multivariable motion control systems," IEEE Trans. Ind. Electron., vol. 56, no. 10, pp. 3872–3879, Oct. 2009.
[16] H. Zhang, Z. Wang, and D. Liu, "A comprehensive review of stability analysis of continuous-time recurrent neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 7, pp. 1229–1262, Jul. 2014.
[17] Y. Xia, G. Feng, and J. Wang, "A novel recurrent neural network for solving nonlinear optimization problems with inequality constraints," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1340–1353, Aug. 2008.
[18] Y. Zhao, H. Gao, J. Lam, and K. Chen, "Stability analysis of discrete-time recurrent neural networks with stochastic delay," IEEE Trans. Ind. Electron., vol. 20, no. 8, pp. 1330–1339, Aug. 2009.
[19] C.-H. Lu, "Design and application of stable predictive controller using recurrent wavelet neural networks," IEEE Trans. Ind. Electron., vol. 56, no. 9, pp. 3733–3742, Sep. 2009.
[20] Z. Liu, H. Zhang, and Q. Zhang, "Novel stability analysis for recurrent neural networks with multiple delays via line integral-type L-K functional," IEEE Trans. Neural Netw., vol. 21, no. 11, pp. 1710–1718, Nov. 2010.
[21] Z. Wang, H. Zhang, and B. Jiang, "LMI-based approach for global asymptotic stability analysis of recurrent neural networks with various delays and structures," IEEE Trans. Neural Netw., vol. 22, no. 7, pp. 1032–1045, Jul. 2011.
[22] Y. Pan and J. Wang, "Model predictive control of unknown nonlinear dynamical systems based on recurrent neural networks," IEEE Trans. Ind. Electron., vol. 59, no. 8, pp. 3089–3101, Aug. 2012.
[23] H. Zhang, F. Yang, X. Liu, and Q. Zhang, "Stability analysis for neural networks with time-varying delay based on quadratic convex combination," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 4, pp. 513–521, Apr. 2013.
[24] X. Li and S. Song, "Impulsive control for existence, uniqueness, and global stability of periodic solutions of recurrent neural networks with discrete and continuously distributed delays," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 6, pp. 868–877, Jun. 2013.
[25] F. F. M. El-Sousy, "Intelligent optimal recurrent wavelet Elman neural network control system for permanent-magnet synchronous motor servo drive," IEEE Trans. Ind. Informat., vol. 9, no. 4, pp. 1986–2003, Nov. 2013.
[26] X. Le and J. Wang, "Robust pole assignment for synthesizing feedback control systems using recurrent neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 2, pp. 383–393, Feb. 2014.
[27] Z. Yan and J. Wang, "Robust model predictive control of nonlinear systems with unmodeled dynamics and bounded uncertainties based on neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 457–469, Mar. 2014.


[28] C.-L. Hwang, "Multivariable adaptive control of nonlinear unknown dynamic systems using recurrent neural-network," in Proc. IEEE Int. Joint Conf. Neural Netw., Vancouver, BC, Canada, Jul. 2006, pp. 5006–5011.
[29] K. Furuta, "VSS type self-tuning control," IEEE Trans. Ind. Electron., vol. 40, no. 1, pp. 37–44, Feb. 1993.
[30] K. D. Young, V. I. Utkin, and U. Ozguner, "A control engineer's guide to sliding mode control," IEEE Trans. Control Syst. Technol., vol. 7, no. 3, pp. 328–342, May 1999.
[31] Y. Liu, H. Wang, and C. Hou, "Sliding-mode control design for nonlinear systems using probability density function shaping," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 2, pp. 332–343, Feb. 2014.
[32] W. Chen, C. Wen, S. Hua, and C. Sun, "Distributed cooperative adaptive identification and control for a group of continuous-time systems with a cooperative PE condition via consensus," IEEE Trans. Autom. Control, vol. 59, no. 1, pp. 91–106, Jan. 2014.
[33] S.-L. Dai, C. Wang, and M. Wang, "Dynamic learning from adaptive neural network control of a class of nonaffine nonlinear systems," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 1, pp. 111–123, Jan. 2014.
[34] K. J. Astrom and B. Wittenmark, Computer-Controlled Systems: Theory and Design, 2nd ed. Englewood Cliffs, NJ, USA: Prentice-Hall, 1997.
[35] X. M. Ren, A. B. Rad, P. T. Chan, and W. L. Lo, "Online identification of continuous-time systems with unknown time delay," IEEE Trans. Autom. Control, vol. 50, no. 9, pp. 1418–1422, Sep. 2005.

Chih-Lyang Hwang (SM’08) received the B.E. degree in aeronautical engineering from Tamkang University, Taipei, Taiwan, in 1981, and the M.E. and Ph.D. degrees in mechanical engineering from Tatung University, Taipei, in 1986 and 1990, respectively. He was with the Department of Mechanical Engineering, Tatung University, where he was involved in teaching and research in the area of servo control and control of manufacturing systems and robotic systems. He was a Professor of Mechanical Engineering with Tatung University from 1996 to 2006. From 1998 to 1999, he was a Research Scholar with the George W. Woodruf School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA, USA. From 2006 to 2011, he was a Professor with the Department of Electrical Engineering, Tamkang University. Since 2011, he has been a Professor with the Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei. He has authored or co-authored over 111 journal and conference papers in the related field. His current research interests include robotics, fuzzy or neural-network modeling and control, sliding-mode control, visual tracking or navigation system, network-based control, and distributed sensor-network.

Chau Jan was born in Taiwan in 1972. He received the B.E. degree from the Department of Mechanical Engineering, National Cheng Kung University, Tainan, Taiwan, in 1996, and the M.E. and Ph.D. degrees in mechanical engineering from Tatung University, Taipei, Taiwan, in 1998 and 2003, respectively. He has been with the Department of Mechanical Engineering, Nan Jeon University of Science and Technology, Tainan, since 2003, where he is currently involved in teaching and research in the area of electronics and control systems. His current research interests include control systems, robust control, fuzzy or neural network modeling and control, and piezomechanics.
