IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 26, NO. 8, AUGUST 2015


Brief Papers

Missile Guidance Law Based on Robust Model Predictive Control Using Neural-Network Optimization

Zhijun Li, Yuanqing Xia, Chun-Yi Su, Jun Deng, Jun Fu, and Wei He

Abstract— In this brief, robust model-based predictive control is investigated for the problem of missile interception. Treating the target acceleration as a bounded disturbance, a novel guidance law using model predictive control is developed by incorporating the missile's internal constraints. The combined model predictive approach can be transformed into a constrained quadratic programming (QP) problem, which may be solved using a linear variational inequality-based primal–dual neural network over a finite receding horizon. Online solutions to multiple parametric QP problems are used so that constrained optimal control decisions can be made in real time. Simulation studies are conducted to illustrate the effectiveness and performance of the proposed guidance control law for missile interception.

Index Terms— Guidance law, primal–dual neural network (PDNN), robust model predictive control (MPC).

Manuscript received October 22, 2013; revised April 22, 2014; accepted July 27, 2014. Date of publication September 4, 2014; date of current version July 15, 2015. This work was supported in part by the Foundation of Key Laboratory of Autonomous Systems and Networked Control, Chinese Ministry of Education, under Grant 2013A04, in part by the National Natural Science Foundation of China under Grant 61174045, Grant U1201244, and Grant 61225015, in part by the Fundamental Research Funds for the Central Universities under Grant 2013ZG0035, in part by the Program for New Century Excellent Talents in University under Grant NCET-12-0195, and in part by the Ph.D. Programs Foundation, Ministry of Education of China, under Grant 20130172110026.

Z. Li and J. Deng are with the Key Laboratory of Autonomous System and Network Control, College of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China (e-mail: [email protected]). Y. Xia is with the School of Automation, Beijing Institute of Technology, Beijing 100081, China (e-mail: [email protected]). C.-Y. Su is with the Department of Mechanical and Industrial Engineering, Concordia University, Montreal, QC H4B 1R6, Canada, and also with the Key Laboratory of Autonomous System and Network Control, College of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China (e-mail: [email protected]). J. Fu is with the College of Information Science and Engineering, Northeastern University, Shenyang 110006, China (e-mail: [email protected]). W. He is with the Robotics Institute and the School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2014.2345734

I. INTRODUCTION

Proportional navigation (PN) is effective for missile interception and simple enough for practical implementation, and it underlies widely used guidance laws [1], [2]. In [3], considering a target moving on two different courses, a PN guidance law was proposed and investigated. The basic idea of PN is that the normal acceleration of the missile is made proportional to the angular velocity of the line of sight (LOS) [1]. Modified PN guidance laws were also developed to further enhance performance in [2] and [7]. However, such PN guidance laws apply to nonmaneuvering or weakly maneuvering targets, whereas the acceleration of a real target is actually fast varying. Owing to the unsatisfactory transient performance of traditional PN, a powerfully maneuvering target cannot be dealt with, and the interception performance between missile and target may be degraded or ineffective.

To enhance the interception performance when the target has maneuvering capability, various control approaches have been investigated in the development of guidance laws. For example, in [4], an adaptive nonlinear guidance law for interception was considered within integrated guidance and control by compensating for the target acceleration and control-loop dynamic uncertainties. In [5], using the zero-effort miss distance and considering an integrated missile autopilot and guidance loop, a sliding-mode control was developed. Using the idea of following a circular arc toward the target, an accurate guidance law was described in [6]; notably, this approach does not need any information on the range to the target. In [7], considering the problem of finite-time convergence, a second-order sliding-mode controller was presented to implement a hit-to-kill guidance strategy even in the presence of target maneuvers and actuator dynamic uncertainties. Moreover, using the errors between the actual and desired LOS angles as filter signals, a second-order sliding-mode guidance law incorporating the backstepping technique with impact-angle and impact-time constraints was presented in [8]. In [9], by designing a state observer to estimate the target states, a sliding-mode guidance law was also investigated for missile interception. In [21], an adaptive nonsingular terminal sliding-mode control was developed for a class of nonlinear systems subject to disturbances and uncertainties, based on a nonlinear longitudinal missile model.

In [11], considering various kinds of noise (state noise, target-movement noise, and measurement noise) and uncertain system parameters in the missile–target intercept model, a stochastic variable-structure approach was developed based on optimal control theory. In [12], a PN guidance law was formulated using nonlinear predictive control, and the proposed time-delay control estimates the target acceleration. However, none of the reported works address the internal constraints, including actuator saturation, the velocity increment, and limits on some of the states. To include these constraints in the controller design, model predictive control (MPC), also called receding-horizon control, is a promising tool because it can handle constraints through optimization procedures [13]–[15]. Utilizing the dynamics model, an objective function based on the current controlled variables can be optimized at each step, so that MPC produces a control input for the dynamic system within the system constraints. The intention of this brief is to combine this feature of MPC with linear variational inequality-based primal–dual neural-network (LVI-PDNN) optimization for missile interception in consideration of the internal constraints, which is the major motivation of this brief. MPC conducts online optimization of an objective function through an input–output predictor model within a finite sampling time. An objective

2162-237X © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 1. Planar engagement geometry.

function can be optimized at each sampling time using the current state. At the next step, the optimization is reconducted with updated information, and a sequence of control outputs becomes available at each control step. MPC therefore includes both control and planning, and online optimization is the fundamental issue in MPC implementation: whether the MPC technique can be implemented successfully depends on the efficiency of the online optimization. On the other hand, neurodynamic models for constrained optimization have emerged as a promising approach to handle the online computational burden, and have been investigated in several robotics applications [16], [17], [24], [25]. Numerous recently published results have shown better performance than traditional optimization approaches, especially in real-time implementation [18], [20], [22], [23], [33]–[36]. In [24], coordination of kinematically redundant robots was formulated as a complicated optimization problem subject to both equality and inequality constraints; a dual neural-network method was then used to conduct the optimization under joint-torque limits while minimizing the internal forces applied to the object. In [14], a support-vector-regression-based data-driven model was used in MPC for a biped robot. Owing to the nature of neural-network optimization, applying it within MPC for real-time target interception poses a new challenge. To the best of our knowledge, no work on missile interception using neural-network-based MPC has been reported. Therefore, in this brief, considering the physical constraints on missile and target motion, and taking the acceleration of the target as a bounded disturbance, a novel guidance law based on MPC is developed by incorporating missile acceleration constraints.
The combined model predictive approach can be transformed into a constrained quadratic programming (QP) problem, which is solved using an LVI-PDNN over a finite receding horizon. The applied neural network drives the formulated constrained QP to the exact optimal values. Simulation studies are conducted to illustrate the effectiveness and performance of the proposed guidance control law for missile interception.

II. INTERCEPT STRATEGY

The investigated situation is shown in Fig. 1 [7], where the missile body frame is denoted by XOZ and the inertial frame by X′O′Z′ (Fig. 2). The kinematics of the missile interception described in this frame are

Ṙ = V_R,  V_R = V_T − V_M
V̇_R = A_R,  A_R = A_T − A_M

where R = (r, θ) is the range of the missile to the target, r is the range along the LOS, and θ is the LOS angle; V_R and A_R are the first and second time derivatives of R, respectively; V_T and A_T are the target velocity and acceleration, respectively; and V_M and A_M are the missile velocity and acceleration, respectively.

Fig. 2. Defined symbols with respect to the fixed frame.

Then, the following state model of the missile–target engagement process can be obtained [7]:

ṙ = V_r    (1)
V̇_r = r ω_θ² + A_Tr − A_M sin(θ − φ_M)    (2)
θ̇ = ω_θ    (3)
ω̇_θ = (−2 V_r ω_θ + A_Tθ − A_M cos(θ − φ_M)) / r    (4)

Considering the normal acceleration of the missile A_M as the control input, (1)–(4) can be written as

ṙ = V_r    (5)
V̇_r = V_θ²/r + A_Tr − A_M sin(θ − φ_M)    (6)
θ̇ = V_θ / r    (7)
V̇_θ = −V_r V_θ / r + A_Tθ − A_M cos(θ − φ_M)    (8)

where V_θ = r ω_θ is the transversal component of the relative velocity rotating with the LOS. In actual implementations, the target acceleration A_T is difficult to estimate and must be regarded as unknown beforehand; thus, A_Tr and A_Tθ are handled as unknown bounded disturbances.

Assumption 2.1: Consider (5)–(8); the state variables r, θ, V_θ, V_r, and φ_M can be measured or obtained [7].

Assumption 2.2 [7]: The variables A_Tr and A_Tθ are supposed to satisfy |A_Tr| ≤ A_Tr^max, |Ȧ_Tr| ≤ Ȧ_Tr^max, |A_Tθ| ≤ A_Tθ^max, and |Ȧ_Tθ| ≤ Ȧ_Tθ^max, where the unknown bounds A_Tr^max, A_Tθ^max, Ȧ_Tr^max, and Ȧ_Tθ^max are difficult to obtain in actual implementations because the target acceleration is not measured.

Considering a direct hit [7], the control objective is to ensure that V_r < 0 with ω_θ = 0 (or V_θ = 0); another, less aggressive hit-to-kill guidance strategy is ω_θ = Δ/√r (or V_θ = Δ√r), where Δ > 0 is a constant [7].

Let the state vector be x = [r, V_r, θ, V_θ]^T and rewrite (5)–(8) in state space as

ẋ = d/dt [r; V_r; θ; V_θ] = [V_r; V_θ²/r; V_θ/r; −V_r V_θ/r] + [0 0; 1 0; 0 0; 0 1] ω − [0; sin(θ − φ_M); 0; cos(θ − φ_M)] u    (9)

where ω = [A_Tr, A_Tθ]^T is the disturbance vector and u = A_M. One can rewrite the above equation as

ẋ = f(x) + g(x)u + d(x)    (10)
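For intuition, the engagement model (5)–(8) can be rolled out numerically. The sketch below (not the authors' code; the initial state and the bounded disturbance profile are assumptions of this example) propagates the state with forward Euler and zero missile command:

```python
import numpy as np

def engagement_derivs(x, A_M, A_Tr, A_Ttheta, phi_M):
    """Right-hand side of the polar engagement model (5)-(8).

    x = [r, Vr, theta, Vtheta]: LOS range, closing velocity,
    LOS angle, and transversal relative velocity."""
    r, Vr, theta, Vtheta = x
    r_dot = Vr                                                        # (5)
    Vr_dot = Vtheta**2 / r + A_Tr - A_M * np.sin(theta - phi_M)       # (6)
    theta_dot = Vtheta / r                                            # (7)
    Vtheta_dot = (-Vr * Vtheta / r + A_Ttheta
                  - A_M * np.cos(theta - phi_M))                      # (8)
    return np.array([r_dot, Vr_dot, theta_dot, Vtheta_dot])

# Forward-Euler rollout with zero missile command, so only the (bounded)
# target acceleration drives the relative motion.
dt = 0.01
x = np.array([20000.0, -500.0, np.pi / 3, 50.0])   # illustrative initial state
for k in range(100):
    A_Tr, A_Ttheta = 3.0 * np.sin(k * dt), 3.0 * np.cos(k * dt)  # |A_T| bounded
    x = x + dt * engagement_derivs(x, A_M=0.0, A_Tr=A_Tr,
                                   A_Ttheta=A_Ttheta, phi_M=np.pi / 2)
print(x[0])   # range r after 1 s; it closes since Vr < 0
```

With V_r < 0 the range r shrinks, which is exactly the direct-hit objective stated above.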


where

f(x) = [V_r; V_θ²/r; V_θ/r; −V_r V_θ/r],  d(x) = [0 0; 1 0; 0 0; 0 1] ω,  g(x) = [0; sin(θ − φ_M); 0; cos(θ − φ_M)].

III. ROBUST MPC SCHEME

MPC can be described as an iterative optimization procedure: at each sampling time, an optimal input vector is obtained by optimizing a defined cost function using the measured or estimated current state. The difference from traditional control strategies lies in iteratively solving the optimization problem online. The tracking dynamics (10) can be rewritten in discrete time as

x(k + 1) = f(x(k)) + g(x(k)) u(k) + d(k)    (11)

subject to the constraints

Δu_min ≤ Δu(k) ≤ Δu_max    (12)
u_min ≤ u(k) ≤ u_max    (13)
x_min ≤ x(k) ≤ x_max    (14)
d_min ≤ d(k) ≤ d_max    (15)

where x(k) ∈ X, k = 1, 2, ..., N, and u(k) ∈ U, k = 1, 2, ..., Nu, with the prediction horizon N and the control horizon Nu satisfying 1 ≤ N and 0 ≤ Nu ≤ N; x = [x1 x2 x3 x4]^T = [r V_r θ V_θ]^T ∈ R⁴ is the state vector; u = A_M is the input; Δu_min, Δu_max, u_min, and u_max are the constant lower and upper bounds of the input; x_min and x_max are the lower and upper bounds of the state vector; d_min and d_max are the lower and upper bounds of the external disturbance vector; and

f(x) = [x1; x2; x3; x4] + Δt [x2; x4²/x1; x4/x1; −x2 x4/x1] ∈ R⁴
g(x) = Δt [0; sin(θ − φ_M); 0; cos(θ − φ_M)] ∈ R⁴
d(x) = Δt [0; A_Tr; 0; A_Tθ] ∈ R⁴

with the sampling step Δt. Since the control objective for (11) in MPC is to drive the state variables to the origin, the MPC objective function is defined as

J(k) = Σ_{j=1}^{N} x^T(k + j|k) Q x(k + j|k) + Σ_{j=0}^{Nu−1} Δu^T(k + j|k) R Δu(k + j|k)    (16)

with the predicted state x(k + j|k) and the input increment Δu(k + j|k), where

Δu(k + j|k) = u(k + j|k) − u(k − 1 + j|k)    (17)

and Q and R are weighting matrices of appropriate dimensions. The major benefit of using the cost function (16) is that it transforms the control problem of (11) into a constrained quadratic optimization problem, whose exact optimal solution can be obtained through a neurodynamic optimization approach. Define the following vectors:

x̄(k) = [x(k + 1|k), ..., x(k + N|k)]^T ∈ R^{4N}    (18)
Δū(k) = [Δu(k|k), ..., Δu(k + Nu − 1|k)]^T ∈ R^{Nu}    (19)
ū(k) = [u(k|k), ..., u(k + Nu − 1|k)]^T ∈ R^{Nu}    (20)

The predicted output of (11) can be described in the following form:
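One step of the discretized prediction model (11), with the Euler terms f, g, and d defined above, can be sketched as follows (the numerical values are assumptions of this example, not parameters from the brief):

```python
import numpy as np

def f_disc(x, dt):
    """Discretized drift term: x + dt * (free dynamics of (5)-(8))."""
    r, Vr, theta, Vtheta = x
    return x + dt * np.array([Vr, Vtheta**2 / r, Vtheta / r,
                              -Vr * Vtheta / r])

def g_disc(x, dt, phi_M):
    """Discretized input direction (sign convention as in (11))."""
    _, _, theta, _ = x
    return dt * np.array([0.0, np.sin(theta - phi_M),
                          0.0, np.cos(theta - phi_M)])

def d_disc(A_Tr, A_Ttheta, dt):
    """Discretized disturbance term built from the target acceleration."""
    return dt * np.array([0.0, A_Tr, 0.0, A_Ttheta])

def predict_step(x, u, dt, phi_M, A_Tr=0.0, A_Ttheta=0.0):
    # x(k+1) = f(x(k)) + g(x(k)) u(k) + d(k), cf. (11)
    return f_disc(x, dt) + g_disc(x, dt, phi_M) * u + d_disc(A_Tr, A_Ttheta, dt)

x0 = np.array([20000.0, -500.0, np.pi / 3, 50.0])  # illustrative state
x1 = predict_step(x0, u=10.0, dt=0.01, phi_M=np.pi / 2)
```

Chaining `predict_step` N times (reusing the state predicted at k − 1, as the text does) yields the stacked prediction used in (21).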

x̄(k) = G Δū(k) + f̃ + g̃ + d̄    (21)

where

G = [g(x(k|k−1)), ..., 0; g(x(k+1|k−1)), ..., 0; ⋮, ⋱, ⋮; g(x(k+N−1|k−1)), ..., g(x(k+N−1|k−1))] ∈ R^{4N×Nu}
f̃ = [f(x(k|k−1)); f(x(k+1|k−1)); ...; f(x(k+N−1|k−1))] ∈ R^{4N}
g̃ = [g(x(k|k−1)) u(k−1); g(x(k+1|k−1)) u(k−1); ...; g(x(k+N−1|k−1)) u(k−1)] ∈ R^{4N}
d̄(k) = [d(k+1|k), ..., d(k+N|k)]^T ∈ R^{4N}.

Then, the optimization objective function (16) with the state constraints (12)–(15) becomes

min (G Δū(k) + f̃ + g̃ + d̄(k))^T Q (G Δū(k) + f̃ + g̃ + d̄(k)) + Δū^T(k) R Δū(k)    (22)

subject to

Δū_min ≤ Δū(k) ≤ Δū_max    (23)
ū_min ≤ ū(k) ≤ ū_max    (24)
ū_min ≤ ū(k − 1) + Ĩ Δū(k) ≤ ū_max    (25)
x̄_min ≤ f̃ + g̃ + G Δū(k) ≤ x̄_max    (26)
d̄_min ≤ d̄(k) ≤ d̄_max    (27)

where ū_min, ū_max, Δū_min, and Δū_max are the lower and upper bounds of the input vectors; x̄_min and x̄_max are the lower and upper bounds of the state vectors; d̄_min and d̄_max are the lower and upper bounds of the external disturbance vectors; and

Ĩ = [I, 0, ..., 0; I, I, ..., 0; ⋮, ⋮, ⋱, ⋮; I, I, ..., I] ∈ R^{Nu×Nu}.

Then, the optimization objective (22) can be rewritten as the QP problem

min [Δū(k); d̄(k)]^T W [Δū(k); d̄(k)] + [c1; c2]^T [Δū(k); d̄(k)]    (28)

subject to

E1 Δū(k) ≤ b1    (29)
Δū_min ≤ Δū(k) ≤ Δū_max    (30)
d̄_min ≤ d̄(k) ≤ d̄_max    (31)
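For concreteness, the incremental-input map Ĩ and the stacked constraint pair (E1, b1) of (29) can be assembled as below. This is only a shape-level sketch for a scalar input: G, f̃, and g̃ are left as zero placeholders, and the bound values are assumptions of this example.

```python
import numpy as np

Nu, N = 2, 3                        # control and prediction horizons (example)
G = np.zeros((4 * N, Nu))           # placeholder for the prediction matrix G
f_tilde = np.zeros(4 * N)           # placeholder stacked drift term  f~
g_tilde = np.zeros(4 * N)           # placeholder stacked g(x)u(k-1) term g~

# I~: lower-triangular matrix of ones; it accumulates increments so that
# u(k+j) = u(k-1) + sum of the increments up to step j.
I_tilde = np.tril(np.ones((Nu, Nu)))

u_prev = np.zeros(Nu)               # u(k-1) replicated over the horizon
u_min, u_max = -100.0 * np.ones(Nu), 100.0 * np.ones(Nu)
x_min, x_max = -30000.0 * np.ones(4 * N), 30000.0 * np.ones(4 * N)

# E1 @ du <= b1 collects the input-magnitude constraint (25) and the
# state constraint (26) in one one-sided stack.
E1 = np.vstack([-I_tilde, I_tilde, -G, G])           # (2Nu + 8N) x Nu
b1 = np.concatenate([-u_min + u_prev,
                     u_max - u_prev,
                     -x_min + f_tilde + g_tilde,
                     x_max - f_tilde - g_tilde])     # length 2Nu + 8N
```

The row counts match the dimensions stated in the text: E1 has 2Nu + 8N rows and Nu columns.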


where the coefficients are defined as

W = [G^T Q G + R, G^T Q; Q G, Q] ∈ R^{(Nu+4N)×(Nu+4N)}
c1 = 2 G^T Q (g̃ + f̃) ∈ R^{Nu}
c2 = 2 Q (g̃ + f̃) ∈ R^{4N}
E1 = [−Ĩ, Ĩ, −G, G]^T ∈ R^{(2Nu+8N)×Nu}
b1 = [−ū_min + ū(k−1); ū_max − ū(k−1); −x̄_min + f̃ + g̃; x̄_max − f̃ − g̃] ∈ R^{2Nu+8N}.

Let ξ = [Δū^T(k), d̄^T(k)]^T and c = [c1^T, c2^T]^T. Then, (28)–(31) can be rewritten as

min ξ^T W ξ + c^T ξ    (32)

subject to

E ξ ≤ b    (33)

where the coefficients are

E = [E1, 0; I, 0; 0, I; −I, 0; 0, −I] ∈ R^{(4Nu+16N)×(Nu+4N)}
b = [b1; Δū_max; d̄_max; −Δū_min; −d̄_min] ∈ R^{4Nu+16N}.

IV. PRIMAL–DUAL NEURAL-NETWORK OPTIMIZATION

With (21), a unified QP formulation of the MPC for missile guidance control has been obtained; however, an efficient method for computing the QP solution online still needs to be developed. Based on [16] and [17], neurodynamic optimization approaches have been developed for robotic manipulators and for optimal feet-force distribution and control of quadruped robots, respectively. In this brief, we utilize the linear variational inequality (LVI)-based primal–dual neural network to perform the optimization of the MPC.

For constraints (12)–(14), one can define the corresponding dual decision vector β ∈ R^{4Nu+16N} and its upper/lower bounds y^± as

h = [ξ; β],  h^+ = [ξ^+; y^+],  h^− = [ξ^−; −y^−] ∈ R^{5Nu+20N}    (34)

where the elements y_i^+ ≫ 0 in y^+ are positive and represent +∞, and ξ^+ and ξ^− are the upper and lower bounds of ξ, respectively, composed of the corresponding upper and lower bounds of Δū(k) and d̄(k), owing to the definition ξ = [Δū^T(k), d̄^T(k)]^T. Thus, the convex set defined by the primal–dual decision vector h is Ω = {h | h^− ≤ h ≤ h^+}. One can choose the coefficient matrix M and the vector p as

M = [2W, −E^T; E, 0] ∈ R^{(5Nu+20N)×(5Nu+20N)},  p = [c; −b] ∈ R^{5Nu+20N}.    (35)

Theorem 4.1 [16], [17]: The optimization objective (32) with the constraint (33) can be transformed into seeking a vector h* ∈ Ω = {h | h^− ≤ h ≤ h^+} that satisfies

(h − h*)^T (M h* + p) ≥ 0,  ∀h ∈ Ω    (36)

where the coefficients M, p, and h^± are defined in (34) and (35).

Proof: The proof can be found in [16] and [17].

From the derivation in [16] and [17], it should be noted that the LVI (36) is equivalent to the following system of piecewise-linear equations:

P_Ω(h − (M h + p)) − h = 0    (37)

where the projection operator P_Ω(·) onto Ω is defined componentwise as P_Ω(h) = [P_Ω(h_1), ..., P_Ω(h_{5Nu+20N})]^T with

P_Ω(h_i) = h_i^− if h_i < h_i^−;  h_i if h_i^− ≤ h_i ≤ h_i^+;  h_i^+ if h_i > h_i^+,  ∀i ∈ {1, ..., 5Nu + 20N}.

To solve the linear projection equation (37), a neurodynamic system is built [16], [17]. However, since the matrix M is asymmetric, we develop the following modified neurodynamic system to solve (37):

ḣ = γ (I + M^T){P_Ω(h − (M h + p)) − h}    (38)

with a positive design parameter γ. The ith neuron of the LVI-PDNN in (38) can then be presented as

dh_i/dt = γ Σ_{j=1}^{ℓ} z_ij [ P_Ω( Σ_{k=1}^{ℓ} s_jk h_k − p_j ) − h_j ]    (39)

where i = 1, 2, ..., ℓ, with ℓ = 5Nu + 20N; z_ij is the ijth element of the matrix Z = I + M^T, and s_jk is the jkth element of the matrix S = I − M. The circuit implementing the LVI-PDNN in (38) is composed of integrators, limiters, multipliers, and summers. The structure of the primal–dual neural dynamics (38) is shown in Fig. 3: ξ enters the system after the coefficient matrices and vectors W, b, E, ξ^−, and ξ^+ are constituted, and the output of the primal–dual dynamical system is the signal h(t), whose first Nu elements are Δū.

Fig. 3. Block diagram of primal–dual dynamical system.

Theorem 4.2 [16], [17]: From any initial value of the state variables, the vector h(t) of the primal–dual dynamical system (38) converges to an equilibrium whose first Nu elements are the optimal values Δū of the QP problem in (12)–(14) and (22). Moreover, if there exists a constant ρ > 0 such that ‖h − P_Ω(h − (M h + p))‖₂² ≥ ρ ‖h − h*‖₂², the convergence is exponential.

The MPC scheme with neural-network optimization for the missile guidance law can be summarized as follows and is shown in Fig. 4.
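As an illustration of the dynamics (38), the sketch below integrates the primal–dual system for a small dense QP whose solution is known in closed form. The problem data, the box bounds, the symmetric large dual bounds, and the integration step are all assumptions of this example; with the sign convention of (35), the dual part settles at a negative multiplier here while the first n entries recover the constrained optimum, consistent with Theorem 4.2.

```python
import numpy as np

# Toy QP:  min  xi^T W xi + c^T xi   s.t.  E xi <= b,  with box bounds on xi.
W = np.eye(2)
c = np.array([-2.0, -2.0])           # unconstrained minimizer is [1, 1]
E = np.eye(2)
b = np.array([0.5, 0.5])             # active constraints clip it to [0.5, 0.5]

n, m = 2, 2                          # primal and dual dimensions
M = np.block([[2 * W, -E.T],
              [E, np.zeros((m, m))]])    # cf. (35)
p = np.concatenate([c, -b])

# Convex set Omega: box bounds on xi, large symmetric bounds on the dual part
# (an assumption of this sketch, standing in for the +/- infinity bounds).
h_lo = np.concatenate([-10.0 * np.ones(n), -1e6 * np.ones(m)])
h_hi = np.concatenate([10.0 * np.ones(n), 1e6 * np.ones(m)])

def project(h):
    """Piecewise-linear projection P_Omega of (37)."""
    return np.clip(h, h_lo, h_hi)

# Euler integration of h_dot = gamma (I + M^T){P(h - (Mh + p)) - h}, cf. (38).
gamma, dt = 1.0, 0.01
h = np.zeros(n + m)
ZT = np.eye(n + m) + M.T
for _ in range(5000):
    h = h + dt * gamma * ZT @ (project(h - (M @ h + p)) - h)

xi = h[:n]                           # first n entries approximate the optimum
print(xi)                            # approx. [0.5, 0.5]
```

The equilibrium satisfies the projection equation (37) exactly: at h* = [0.5, 0.5, −1, −1], Mh* + p = 0, so P_Ω(h* − (Mh* + p)) = h*.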

Fig. 5. Intercepting trajectory.

Fig. 6. Target acceleration.

Fig. 7. Missile acceleration.

Fig. 4. Block diagram of MPC scheme for missile guidance law based on the PDNN.

1) Let k = 1. Initialize the control time terminal T, the prediction horizon N, the control horizon Nu, the sampling step Δt, and the weighting matrices Q and R.
2) Compute the model matrices G, f̃, and g̃, the external disturbance d̄, and the primal–dual neural network (PDNN) matrices W, c, E, and b using (21)–(33).
3) Solve the QP optimization (32) and (33) using (38) to obtain the control increment vector Δū(k).
4) Obtain the input vector ū(k) and implement u(k|k) at instant k.
5) Compute the states x at instant k + 1 using (11).
6) If k + 1 < T, set k = k + 1 and go to step 2); otherwise, end.

Remark 4.1: By selecting N, Nu, and Q, there always exists a finite horizon length such that closed-loop stability can be guaranteed; see [15] and [26]–[29].

Remark 4.2: To solve the QP (32) and (33), a traditional sequential quadratic programming (SQP) method with gradient descent can be adopted, which requires repeatedly calculating the Hessian matrix of the Lagrangian [30]–[32]; examples include the MATLAB optimization routines QUADPROG and LINPROG. Because (32) and (33) must be solved online, numerical QP methods with complexity on the order of O(N⁴ + N + (4Nu + 16N)Nu² + (5Nu + 16N)³) may not be efficient enough for such real-time guidance systems, whereas the proposed LVI-PDNN method contains O(7(5Nu + 20N) + 2(5Nu + 20N)²) operations. The computational cost using the LVI-PDNN is thus greatly reduced, although the computational complexity naturally grows as the horizon N increases.

V. SIMULATION STUDIES

To verify the effectiveness of the proposed approach, extensive simulations are conducted. In the simulations, it is assumed that the guidance commands are not limited. The MPC parameters are chosen as N = 3, Nu = 2, Q = 0.1I, and R = 0.1I. The bounds of the input are chosen as Δū_min = [−100 −100]^T, Δū_max = [100 100]^T, ū_min = [−100 −100]^T, and ū_max = [100 100]^T.
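The receding-horizon procedure in steps 1)–6) above can be sketched as follows. For brevity, the sketch regulates a small generic linear plant x(k+1) = Ax(k) + Bu(k) rather than the nonlinear missile model, and the box-constrained QP of step 3) is solved by projected gradient descent as a crude stand-in for the PDNN; all numerical values are assumptions of this example.

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy discrete double integrator
B = np.array([[0.0], [0.1]])
N, Nu = 3, 2                              # prediction / control horizons
Q, R = np.eye(2), 0.1 * np.eye(1)
u_lo, u_hi = -1.0, 1.0                    # input bounds (constraint (13))

def prediction_matrices(A, B, N, Nu):
    """Stack x(k+j|k) = Phi_j x(k) + sum_i Theta_{j,i} u(k+i|k)."""
    n, m = A.shape[0], B.shape[1]
    Phi = np.vstack([np.linalg.matrix_power(A, j + 1) for j in range(N)])
    Theta = np.zeros((n * N, m * Nu))
    for j in range(N):
        for i in range(min(j + 1, Nu)):
            Theta[n*j:n*(j+1), m*i:m*(i+1)] = \
                np.linalg.matrix_power(A, j - i) @ B
    return Phi, Theta

Phi, Theta = prediction_matrices(A, B, N, Nu)
Qbar = np.kron(np.eye(N), Q)
Rbar = np.kron(np.eye(Nu), R)
H = Theta.T @ Qbar @ Theta + Rbar         # QP Hessian (constant here)
step = 1.0 / np.linalg.norm(H, 2)         # safe projected-gradient step

x = np.array([1.0, 0.0])                  # step 1: initial state
for k in range(600):                      # outer receding-horizon loop
    q = Theta.T @ Qbar @ (Phi @ x)        # step 2: QP data at current state
    u = np.zeros(Nu)
    for _ in range(200):                  # step 3: projected-gradient QP
        u = np.clip(u - step * (H @ u + q), u_lo, u_hi)
    x = A @ x + B @ np.array([u[0]])      # steps 4-5: apply first input only
                                          # step 6: shift horizon and repeat
print(np.linalg.norm(x))                  # regulated toward the origin
```

Only the first element of the optimized input sequence is applied at each instant, which is the defining feature of the receding-horizon loop summarized in Fig. 4.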
The bounds of the state variable are x̄_min = [−30000, ..., −30000]^T ∈ R^12 and x̄_max = [30000, ..., 30000]^T ∈ R^12, and the bounds of the disturbance are d̄_min = [−3, ..., −3]^T ∈ R^12 and d̄_max = [3, ..., 3]^T ∈ R^12. The initial position of the missile is x_M(0) = 0 m and y_M(0) = 0 m, its initial flight-path angle is φ_M = π/2 rad, and its initial velocity is V_M = 720 m/s. The target's initial position is x_T(0) = 20 000 m and y_T(0) = 20 000 m, its initial velocity is V_T = 460 m/s, and its initial flight-path angle is φ_T = π rad. The LOS angle is θ = π/3 rad, the LOS measurement is modeled as a first-order lag with a time constant of 30 ms, and the measurement noise is Gaussian with a standard deviation of 10 mrad. The target acceleration is chosen as A_T = 100 sin(t) m/s².

The intercept geometry in Fig. 5 shows that the proposed approach achieves the interception within a short time. The disturbance A_T is given in Fig. 6, and Fig. 7 shows the input signal, i.e., the missile acceleration of the MPC guidance law. The predictive control inputs

Fig. 8. Control inputs using MPC guidance law.

Fig. 11. Predictive control inputs using function QUADPROG.

of the MPC guidance law are shown in Fig. 8. To deal with the disturbance, the MPC adopts the PDNN optimization method, by which the disturbance can be suppressed using its known upper bound. We then compare with MPC using the SQP approach, where the MATLAB function QUADPROG is applied to the QP problem; Figs. 9–11 are obtained using QUADPROG. As shown in Fig. 9, the interception is not achieved successfully. Fig. 10 shows the missile acceleration, and the predictive control inputs are shown in Fig. 11. The performance is clearly inferior, and oscillations even occur in Fig. 11. This demonstrates that the neurodynamic optimization (38) performs much better for solving the QP problem than the SQP approach. Therefore, it is preferable to exploit the LVI-PDNN model to solve the QP problem for the missile guidance law.

Fig. 9. Intercepting trajectory using function QUADPROG.

Fig. 10. Missile acceleration using function QUADPROG.

VI. CONCLUSION

In this brief, an MPC scheme is proposed for missile interception. Based on the tracking kinematics, the proposed MPC approach handles a formulated QP problem using a neurodynamic optimization approach over a finite receding horizon. The applied neural network drives the formulated constrained QP to the exact optimal values. Extensive simulations illustrate the effectiveness and performance of the proposed PDNN-based MPC scheme for missile interception.

REFERENCES

[1] S. N. Ghawghawe and D. Ghose, "Pure proportional navigation against time-varying target manoeuvres," IEEE Trans. Aerosp. Electron. Syst., vol. 32, no. 4, pp. 1336–1347, Oct. 1996.
[2] Y. Ulybyshev, "Terminal guidance law based on proportional navigation," J. Guid., Control, Dyn., vol. 28, no. 4, pp. 821–824, Jul. 2005.
[3] L. C.-L. Yuan, "Homing and navigational courses of automatic target-seeking devices," J. Appl. Phys., vol. 19, no. 12, pp. 1122–1128, Dec. 1948.
[4] D. Chwa and J. Y. Choi, "Adaptive nonlinear guidance law considering control loop dynamics," IEEE Trans. Aerosp. Electron. Syst., vol. 39, no. 4, pp. 1134–1143, Oct. 2003.
[5] T. Shima, M. Idan, and O. M. Golan, "Sliding-mode control for integrated missile autopilot guidance," J. Guid., Control, Dyn., vol. 29, no. 2, pp. 250–260, Mar. 2006.
[6] I. R. Manchester and A. V. Savkin, "Circular-navigation-guidance law for precision missile/target engagements," J. Guid., Control, Dyn., vol. 29, no. 2, pp. 314–320, Mar. 2006.
[7] Y. B. Shtessel, I. A. Shkolnikov, and A. Levant, "Guidance and control of missile interceptor using second-order sliding modes," IEEE Trans. Aerosp. Electron. Syst., vol. 45, no. 1, pp. 110–124, Jan. 2009.
[8] N. Harl and S. N. Balakrishnan, "Impact time and angle guidance with sliding mode control," IEEE Trans. Control Syst. Technol., vol. 20, no. 6, pp. 1436–1449, Nov. 2012.
[9] A. Zhurbal and M. Idan, "Effect of estimation on the performance of an integrated missile guidance and control system," IEEE Trans. Aerosp. Electron. Syst., vol. 47, no. 4, pp. 2690–2708, Oct. 2011.
[10] X. Wang and J. Wang, "Partial integrated missile guidance and control with finite time convergence," J. Guid., Control, Dyn., vol. 36, no. 5, pp. 1399–1409, 2013.
[11] H. Wang, D. Cao, and X. Wang, "The stochastic sliding mode variable structure guidance laws based on optimal control theory," J. Control Theory Appl., vol. 11, no. 1, pp. 86–91, 2013.
[12] S. E. Talole, A. Ghosh, and S. B. Phadke, "Proportional navigation guidance using predictive and time delay control," Control Eng. Pract., vol. 14, no. 12, pp. 1445–1453, 2006.
[13] S. Chai, G.-P. Liu, D. Rees, and Y. Xia, "Design and practical implementation of internet-based predictive control of a servo system," IEEE Trans. Control Syst. Technol., vol. 16, no. 1, pp. 158–168, Jan. 2008.


[14] S. S. Ge, Z. Li, and H. Yang, "Data driven adaptive predictive control for holonomic constrained under-actuated biped robots," IEEE Trans. Control Syst. Technol., vol. 20, no. 3, pp. 787–795, May 2012.
[15] J. L. Garriga and M. Soroush, "Model predictive control tuning methods: A review," Ind. Eng. Chem. Res., vol. 49, no. 8, pp. 3505–3515, 2010.
[16] Y. Zhang, S. S. Ge, and T. H. Lee, "A unified quadratic-programming-based dynamical system approach to joint torque optimization of physically constrained redundant manipulators," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 34, no. 5, pp. 2126–2132, Oct. 2004.
[17] Z. Li, S. S. Ge, and S. Liu, "Contact-force distribution optimization and control for quadruped robots using both gradient and adaptive neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 8, pp. 1460–1473, Aug. 2014.
[18] Y.-J. Liu, C. L. P. Chen, G.-X. Wen, and S. Tong, "Adaptive neural output feedback tracking control for a class of uncertain discrete-time nonlinear systems," IEEE Trans. Neural Netw., vol. 22, no. 7, pp. 1162–1167, Jul. 2011.
[19] Y.-J. Liu, L. Tang, S. Tong, and C. L. P. Chen, "Adaptive NN controller design for a class of nonlinear MIMO discrete-time systems," IEEE Trans. Neural Netw. Learn. Syst., to be published.
[20] Y.-J. Liu, S.-C. Tong, D. Wang, T.-S. Li, and C. L. P. Chen, "Adaptive neural output feedback controller design with reduced-order observer for a class of uncertain nonlinear SISO systems," IEEE Trans. Neural Netw., vol. 22, no. 8, pp. 1328–1334, Aug. 2011.
[21] L. Wang, Z. Liu, C. L. P. Chen, Y. Zhang, S. Lee, and X. Chen, "Energy-efficient SVM learning control system for biped walking robots," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 5, pp. 831–837, May 2013.
[22] Z. Liu, G. Lai, Y. Zhang, X. Chen, and C. L. P. Chen, "Adaptive neural control for a class of nonlinear time-varying delay systems with unknown hysteresis," IEEE Trans. Neural Netw. Learn. Syst., to be published.
[23] R. Cui and W. Yan, "Mutual synchronization of multiple robot manipulators with unknown dynamics," J. Intell. Robot. Syst., vol. 68, no. 2, pp. 105–119, 2012.
[24] Z.-G. Hou, L. Cheng, and M. Tan, "Multi-criteria optimization for coordination of redundant robots using a dual neural network," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 40, no. 4, pp. 1075–1087, Aug. 2010.


[25] L. Cheng, Z.-G. Hou, and M. Tan, "Constrained multi-variable generalized predictive control using a dual neural network," Neural Comput. Appl., vol. 16, no. 6, pp. 505–512, 2007.
[26] L. Grüne and J. Pannek, Nonlinear Model Predictive Control: Theory and Algorithms. London, U.K.: Springer-Verlag, 2011.
[27] P. O. M. Scokaert and J. B. Rawlings, "Constrained linear quadratic regulation," IEEE Trans. Autom. Control, vol. 43, no. 8, pp. 1163–1169, Aug. 1999.
[28] J. A. Primbs and V. Nevistic, "Feasibility and stability of constrained finite receding horizon control," Automatica, vol. 36, no. 7, pp. 965–971, 2000.
[29] A. Jadbabaie and J. Hauser, "On the stability of receding horizon control with a general terminal cost," IEEE Trans. Autom. Control, vol. 50, no. 5, pp. 674–678, May 2005.
[30] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, Nonlinear Programming: Theory and Algorithms. New York, NY, USA: Wiley, 1993.
[31] F.-T. Cheng, R.-J. Sheu, and T.-H. Chen, "The improved compact QP method for resolving manipulator redundancy," IEEE Trans. Syst., Man, Cybern., vol. 25, no. 11, pp. 1521–1530, Nov. 1995.
[32] W. Li and J. Swetits, "A new algorithm for solving strictly convex quadratic programs," SIAM J. Optim., vol. 7, no. 3, pp. 595–619, 1997.
[33] Y. Xiao, Y. Liu, C.-S. Leung, J. P. Sum, and K. Ho, "Analysis on the convergence time of dual neural network-based kWTA," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 4, pp. 676–682, Apr. 2012.
[34] Y. Xia, G. Feng, and J. Wang, "A novel recurrent neural network for solving nonlinear optimization problems with inequality constraints," IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1340–1353, Aug. 2008.
[35] Q. Liu and J. Wang, "A one-layer recurrent neural network with a discontinuous hard-limiting activation function for quadratic programming," IEEE Trans. Neural Netw., vol. 19, no. 4, pp. 558–570, Apr. 2008.
[36] Q. Liu and J. Wang, "A one-layer projection neural network for nonsmooth optimization subject to linear equalities and bound constraints," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 5, pp. 812–824, May 2013.
