
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 26, NO. 4, APRIL 2015

Nonlinear Model Predictive Control Based on Collective Neurodynamic Optimization

Zheng Yan, Student Member, IEEE, and Jun Wang, Fellow, IEEE

Abstract—In general, nonlinear model predictive control (NMPC) entails solving a sequential global optimization problem with a nonconvex cost function or constraints. This paper presents a novel collective neurodynamic optimization approach to NMPC without linearization. Utilizing a group of recurrent neural networks (RNNs), the proposed collective neurodynamic optimization approach searches for optimal solutions to global optimization problems by emulating brainstorming. Each RNN is guaranteed to converge to a candidate solution by performing constrained local search. By exchanging information and iteratively improving the starting and restarting points of each RNN using the information of local and global best known solutions in a framework of particle swarm optimization, the group of RNNs is able to reach global optimal solutions to global optimization problems. The essence of the proposed collective neurodynamic optimization approach lies in the integration of capabilities of global search and precise local search. The simulation results of many cases are discussed to substantiate the effectiveness and the characteristics of the proposed approach.

Index Terms—Collective neurodynamic optimization, model predictive control (MPC), recurrent neural networks (RNNs).

I. INTRODUCTION

Manuscript received August 10, 2013; revised July 24, 2014 and December 14, 2014; accepted December 27, 2014. Date of publication January 15, 2015; date of current version March 16, 2015. This work was supported in part by the Research Grants Council, University Grants Committee, Hong Kong, under Grant CUHK416812E and in part by the National Natural Science Foundation of China under Grant 61273307. Z. Yan is with the Department of Mechanical and Automation Engineering, Chinese University of Hong Kong, Hong Kong (e-mail: [email protected]). J. Wang is with the School of Control Science and Engineering, Dalian University of Technology, Dalian 116024, China, and also with the Department of Mechanical and Automation Engineering, Chinese University of Hong Kong, Hong Kong (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2014.2387862

MODEL predictive control (MPC) is probably the most successful modern control technology in industry, with several thousand applications reported in [1]. An MPC design methodology is characterized by three main features: 1) an explicit model of the plant to be controlled; 2) a mechanism for computing control inputs by optimizing the predicted plant behaviors; and 3) a receding horizon. Compared with many other control techniques, MPC has many desirable features; e.g., it handles multivariable control problems naturally, it accounts for input and output constraints, and it adapts to model changes. With developments over three decades, MPC for linear systems is considered a mature technique [2]. However, as most real-world applications are inherently nonlinear,

linear MPC is not suitable for cases where the model is highly nonlinear or the operating points change frequently [3]. As a result, nonlinear MPC (NMPC) technology is in high demand. A fundamental difficulty of most existing NMPC approaches is the requirement to solve constrained global optimization problems with nonconvex functions [4]. So far, there is no universally reliable optimization method for solving such global optimization problems in real time. Thus, in-depth investigations on computationally efficient NMPC are necessary and rewarding. As an NMPC law essentially amounts to a sequential constrained optimization procedure, computational efficiency determines the applicability of any NMPC approach. In the literature, several approaches are available that aim at reducing the computational burden of NMPC. The first category is NMPC with online linearization. Linearization enables the NMPC to be synthesized by solving a sequence of quadratic programming problems, which can be solved very efficiently [5]–[7]. The second category is explicit MPC using multiparametric nonlinear programming, in which the optimal control signals are computed offline as an explicit function of the state and reference vectors, so that online operations reduce to a simple function evaluation [8]–[10]. The third category approximates the optimal time-varying feedback control law via function approximation techniques, such as set membership methodologies and neural networks [11]–[14]. One common limitation of most existing methods is that they may not scale to problems of high dimensionality. Furthermore, the reformulations generally result in suboptimal control signals with respect to the system performance index. There is thus a standing incentive to improve global optimality so as to achieve the best system performance.
The last three decades have witnessed the birth and growth of neurodynamic optimization, in which recurrent neural networks (RNNs) are developed to serve as optimization solvers. The essence of neurodynamic optimization lies in its inherent nature of parallel and distributed information processing and the availability of hardware implementation. Many RNN models with global convergence properties have been developed for solving various optimization problems, such as the projection-based neural networks [15]–[18], the dual-based neural networks [19]–[21], the neural networks with discontinuous activation functions [22], [23], and the neural networks for generalized convex optimization [24]–[26]. These neural networks have shown superior performance in terms of global optimality and low model complexity.

2162-237X © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Neurodynamic optimization research has brought many opportunities for the design and synthesis of NMPC. The incorporation of neural networks in MPC was probably first discussed in [27], where the use of an RNN to implement NMPC with quadratic programming was described. That work inspired many studies on applying neural networks to MPC in the last two decades. Generally speaking, the applications of neural networks to MPC fall into three categories: 1) using neural networks for system modeling or identification [28]–[32]; 2) using neural networks for solving optimization problems in real time [31]–[36]; and 3) using neural networks to approximate offline MPC laws [11], [12]. In these works, the use of neural networks in MPC design greatly improved the computational efficiency and the control performance. While neurodynamic optimization approaches with individual RNNs have achieved great successes, they reach their solvability limits at constrained optimization problems with unimodal objective functions and are incapable of global optimization in the presence of multimodal objective functions. In other words, most existing neurodynamic optimization approaches are gradient driven and thus insufficient for solving global optimization problems, as the neurodynamics are easily trapped in local minima of nonconvex objective functions. In parallel to neurodynamic optimization research, population-based evolutionary computation approaches have emerged as a branch of popular metaheuristic methods for global optimization in recent decades [37]–[41]. The population-based evolutionary optimization algorithms are stochastic, heuristic, discrete-time, and multiple-state in nature, with the capability of global search but deficiencies in precise local search and constraint handling.
In contrast, based on optimization and dynamic systems theories, neurodynamic optimization approaches are capable of constrained local search, but incapable of global search in the presence of nonconvexity. It is therefore natural to integrate the two types of computationally intelligent optimization methods for constrained global optimization. In this paper, a collective neurodynamic optimization approach is proposed for NMPC. A group of RNNs is employed in a cooperative way to tackle the real-time global optimization problem in NMPC. Each RNN carries out constrained local search according to its own dynamical equation and converges to a candidate solution. The improvement of the solution quality of each RNN is guided by its individual best known solution as well as the best known solution of the entire population. By iteratively improving the starting/restarting points for local search, the population of neurodynamic optimization models is expected to discover the global optimal solution to the global optimization problem. Based on the collective neurodynamic optimization approach, an NMPC law can be realized without any problem reformulation, and the optimal control signals can be computed in real time. The collective neurodynamic optimization approach offers a new paradigm for NMPC design and implementation. The rest of this paper is organized as follows. In Section II, the NMPC


problem formulation is given. In Section III, some background information is discussed. In Section IV, the collective neurodynamic optimization approach is delineated. In Section V, simulation results are provided. Finally, the conclusion is drawn in Section VI.

II. PROBLEM FORMULATION

Consider a discrete-time time-invariant nonlinear model of the form

x(k + 1) = f(x(k), u(k)), y(k) = h(x(k))   (1)

where x(k) ∈ ℝ^n is the state vector, u(k) ∈ ℝ^m is the input vector, y(k) ∈ ℝ^p is the output vector, and f(·) and h(·) are nonlinear functions. The following assumptions are used throughout this paper.

Assumption 1: All state variables are available.

Assumption 2: f(·) and h(·) are continuously differentiable functions of their arguments, with f(0, 0) = 0 and h(0) = 0.

In (1), x(k), u(k), and y(k) are required to fulfill the constraints

x(k) ∈ X, u(k) ∈ U, y(k) ∈ Y, ∀k ≥ 0.   (2)
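For illustration, a model of the form (1) together with the set-membership constraints (2) can be coded directly; the dynamics, output map, and bounds below are hypothetical placeholders, not taken from the paper.

```python
import numpy as np

# Hypothetical instance of the nonlinear model x(k+1) = f(x(k), u(k)), y(k) = h(x(k)).
def f(x, u):
    # Toy nonlinear dynamics (placeholder, not from the paper).
    return np.array([x[0] + 0.1 * x[1],
                     x[1] + 0.1 * np.sin(x[0]) + 0.1 * u[0]])

def h(x):
    return np.array([x[0]])

# Box constraints defining the sets X and U in (2) (assumed bounds).
X_LO, X_HI = np.array([-2.0, -2.0]), np.array([2.0, 2.0])
U_LO, U_HI = np.array([-1.0]), np.array([1.0])

def feasible(x, u):
    # Membership test x ∈ X, u ∈ U.
    return bool(np.all(x >= X_LO) and np.all(x <= X_HI)
                and np.all(u >= U_LO) and np.all(u <= U_HI))

x = np.zeros(2)
u = np.array([0.5])
x_next = f(x, u)
print(feasible(x_next, u))  # True for this toy instance
```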

Assumption 3: X, U, and Y are closed subsets of ℝ^n, ℝ^m, and ℝ^p, respectively. Both X and U contain the origin as an interior point.

An MPC law is supposed to optimize a performance index iteratively over a predicted future horizon via the explicit use of the system model (1). In MPC, the control inputs are obtained by solving a constrained optimization problem during each sampling interval, using the current state as an initial state. An MPC of the nonlinear system (1) can be formulated as follows:

min_{u(k)} J(u(k), x(k), y(k)) = Σ_{j=1}^{N} ||r(k + j) − y(k + j|k)||²_{Q_j} + Σ_{j=0}^{N_u − 1} ||u(k + j|k)||²_{R_j} + F(x(k + N|k))

s.t. u(k + j|k) ∈ U,   j = 0, 1, ..., N_u − 1
     x(k + j|k) ∈ X,   j = 1, 2, ..., N
     y(k + j|k) ∈ Y,   j = 1, 2, ..., N
     x(k + N|k) ∈ Ω_f   (3)

where u(k + j|k) denotes the predicted input vector, x(k + j|k) denotes the predicted state vector, y(k + j|k) denotes the predicted output vector, r(k + j) denotes a known output reference vector, N and N_u are, respectively, the prediction horizon (1 ≤ N) and the control horizon (0 < N_u ≤ N), ||·|| denotes the Euclidean norm, Q and R are weight matrices with compatible dimensions, x(k + N|k) denotes the predicted terminal state within the prediction horizon, F is a terminal cost, and Ω_f is a terminal constraint set. The notation ξ(k + j|k) denotes a model-based prediction, made at time k, of a vector at the future time k + j. The predicted values, even in the nominal undisturbed cases, need not be equal to the actual closed-loop values. The first term in the


cost function penalizes the deviations from the reference trajectory within the prediction horizon, the second term penalizes the control energy throughout the control horizon, and the third term serves closed-loop stability. It is assumed that the optimization problem (3) has a feasible solution at k = 0. Let us define the following vectors:

ū(k) = [uᵀ(k|k), ..., uᵀ(k + N_u − 1|k)]ᵀ ∈ ℝ^{N_u m}
x̄(k) = [xᵀ(k + 1|k), ..., xᵀ(k + N|k)]ᵀ ∈ ℝ^{N n}
ȳ(k) = [yᵀ(k + 1|k), ..., yᵀ(k + N|k)]ᵀ ∈ ℝ^{N p}
r̄(k) = [rᵀ(k + 1), ..., rᵀ(k + N)]ᵀ ∈ ℝ^{N p}

where r̄(k) is known in advance. In view of (1), denote a vector-valued function g_x(·) as the nonlinear mapping between x̄(k) and ū(k), and denote a vector-valued function g_y(·) as the nonlinear mapping between ȳ(k) and ū(k). Thereafter, the optimization problem (3) can be written in the following compact form:

min_{ū(k)} J(ū(k), x̄(k), ȳ(k))   s.t. ū(k) ∈ Ω   (4)

where Ω = {ū(k) | ū(k) ∈ U and g_x(ū(k)) ∈ X and g_y(ū(k)) ∈ Y}. Ω can also be equivalently written as Ω = {ū(k) | g_i(ū(k)) ≤ 0, i = 1, ..., q}, where each g_i(·) is a nonlinear function. At each time instant k, the solution to the optimization problem (4) offers a sequence of optimal control input vectors ū*(k) = [uᵀ(k|k), uᵀ(k + 1|k), ..., uᵀ(k + N_u − 1|k)]ᵀ, but only the first input vector is implemented as u(k). Due to the nonlinearity of the system model, (4) often becomes a nonconvex optimization problem. The success of an MPC approach largely depends on how efficiently and effectively (4) can be solved.

Many results on the stability of NMPC have been obtained [42]. Due to the input and output constraints in (3), the closed-loop system (1) is bounded-input bounded-output stable if (3) admits a feasible solution. Moreover, in view of the results presented in [2] and [43], sufficient conditions for asymptotic stability of NMPC using the formulation (3) can be summarized as follows. Let u = κ(x) be a control law such that X_f ⊂ X is a positively invariant set for the closed-loop system f(x, κ(x)). Let F(x) be a Lyapunov function associated with the system on X_f such that F(f(x, κ(x))) − F(x) ≤ −l(x, κ(x)) for all x ∈ X_f, where l(·) is a continuous function with l(0, 0) = 0 and l(x, u) > 0 for all (x, u) ≠ (0, 0). Then, the closed-loop system can be asymptotically stabilized in the NMPC framework. The terminal cost F and the terminal constraint Ω_f play important roles in the stability of NMPC, and many design methods for F and Ω_f are available [42]. In addition, it has been proved that there always exists a finite horizon length for which the MPC is stabilizing without the use of a terminal cost or constraint, for constrained linear systems [44] and nonlinear systems [45]. Hence, with proper selections of N, N_u, Q, and R, a cost function without a terminal penalty can also guarantee closed-loop stability. In this paper, it is assumed that the optimization problem (4) offers an optimal control input such that the closed-loop system is stabilizable.

III. BACKGROUND INFORMATION

In this section, we discuss some background information needed to develop population-based neurodynamic optimization for real-time NMPC.

A. Two-Layer Recurrent Neural Network

At each time instant k, denote u = ū(k) for brevity. The NMPC problem (4) is equivalent to the following optimization problem:

min J(u)   s.t. g(u) ≤ 0   (5)

where g(u) = (g₁(u), ..., g_q(u))ᵀ. In view of Assumption 2, J(u) is continuously differentiable. In [46], a two-layer RNN based on an augmented Lagrangian function was developed for seeking local minima of constrained nonconvex optimization problems subject to inequality constraints. The dynamic equation of the two-layer RNN for solving (5) is

ε d/dt [u; λ] = [ −∇J(u) − ∇g(u)λ − β∇g(u)Λ(λ²)g(u) ;  −λ + (λ + g(u))⁺ ]   (6)

where ε is a positive scaling constant, β is a positive parameter, ∇J(u) is the gradient of J(u), λ ∈ ℝ^q is a vector of Lagrange multipliers, ∇g(u) = (∇g₁(u), ..., ∇g_q(u)), λ² = (λ₁², ..., λ_q²)ᵀ, Λ(λ²) = diag(λ₁², ..., λ_q²), and (λ + g(u))⁺ is defined componentwise as

(λ + g(u))⁺ = 0 if λ + g(u) ≤ 0; (λ + g(u))⁺ = λ + g(u) if λ + g(u) > 0.   (7)

The above neurodynamic optimization model tackles nonconvex optimization problems by solving the equivalent Karush–Kuhn–Tucker (KKT) equations. According to the theoretical analysis in [46], the optimality and convergence properties of the two-layer neural network model (6) can be summarized as follows.
1) The set of equilibrium points of the neural network (6) is the set of KKT points of the nonconvex optimization problem (5).
2) Suppose that u* is a feasible and regular point of (5). Then u* is a strict minimum of (5) if there exists λ* ∈ ℝ^q such that (u*, λ*) is a KKT point pair and the Hessian matrix of the Lagrangian function is positive definite on the tangent subspace M(u*) = {d | dᵀ∇g_i(u*) = 0, d ≠ 0, ∀i ∈ I(u*)}, where I(u*) = {i | λ_i* > 0}.
3) Let z* = (u*ᵀ, λ*ᵀ)ᵀ be a KKT point of the optimization problem (5). There exists β > 0 such that the neural network (6) is asymptotically stable at z*, where z* corresponds to a strict local minimum of (5).
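As a minimal numerical sketch (not from the paper), descent-ascent dynamics of the form (6) and (7) can be simulated by forward-Euler integration on a one-dimensional toy problem; the objective, constraint, step size, and parameter values below are all assumptions.

```python
import numpy as np

# Forward-Euler simulation of two-layer RNN dynamics of the form (6)-(7)
# on a toy nonconvex problem (an assumption for illustration):
#   min J(u) = (u^2 - 1)^2   s.t.  g(u) = -u <= 0
# The feasible local minimum of J is u = 1 (u = -1 is infeasible).
eps, beta = 0.01, 1.0          # scaling constant eps and parameter beta
dt = 1e-4                      # Euler step size (assumption)

dJ = lambda u: 4.0 * u * (u**2 - 1.0)   # gradient of J
g  = lambda u: -u                        # inequality constraint
dg = lambda u: -1.0                      # gradient of g

u, lam = 0.5, 0.0              # initial neuronal state (u, lambda)
for _ in range(50000):
    du = -dJ(u) - dg(u) * lam - beta * dg(u) * lam**2 * g(u)
    dlam = -lam + max(0.0, lam + g(u))   # projection (lambda + g(u))^+
    u += dt / eps * du
    lam += dt / eps * dlam

print(round(u, 3))  # ≈ 1.0, the feasible local (here also global) minimum
```

Along this trajectory the constraint stays inactive (g(u) < 0), so the multiplier remains at zero while u follows the gradient flow of J into the feasible minimum.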


B. Particle Swarm Optimization

In a particle swarm optimization (PSO) algorithm, each particle in a swarm represents a potential solution. Each particle adapts its search pattern by learning from the experiences of itself and its neighbors [37]–[41]. Each particle has a fitness value and a velocity to adjust its search direction. The swarm is expected to move toward the global optimal solution through an iterative learning and movement process. Denote x_i = (x_{i1}, ..., x_{iq})ᵀ as the position of the ith particle and v_i = (v_{i1}, ..., v_{iq})ᵀ as the velocity of the ith particle, where q is the dimension of each particle. Define pbest_i = (pbest_{i1}, ..., pbest_{iq})ᵀ as the best previous position yielding the best fitness value for the ith particle, and gbest = (gbest₁, ..., gbest_q)ᵀ as the best position of the swarm. The movement of the ith particle in the swarm is updated as follows:

v_i ← v_i + c₁ · r₁ · (pbest_i − x_i) + c₂ · r₂ · (gbest − x_i)
x_i ← x_i + v_i   (8)

where c₁ and c₂ are two weighting parameters, and r₁ and r₂ are two random numbers lying in [0, 1]. The movement of the swarm stops when the number of iterations reaches a predefined maximum or gbest stops improving. PSO is efficient for unconstrained optimization owing to its global search capability, but it is weak in constraint handling for constrained optimization [47]. The presence of constraints may divert the search from optimizing the objective to merely seeking a feasible solution. Commonly used constraint-handling methods in PSO include dynamic penalty functions and feasibility rules. The drawback of penalty function methods is that they require careful fine-tuning of the penalty parameters; the drawback of feasibility rules is that they can cause premature convergence [48].
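A minimal sketch of a PSO iteration following the update rule (8) is given below; the objective function, swarm size, coefficient values, and the velocity clamp (a common practical safeguard that is not part of (8)) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Plain PSO following update rule (8) on a toy unconstrained problem.
def fitness(x):
    return float(np.sum(x**2))   # global minimum 0 at the origin

n, dim, c1, c2, vmax = 20, 2, 1.5, 1.5, 1.0
x = rng.uniform(-5.0, 5.0, (n, dim))          # particle positions
v = np.zeros((n, dim))                        # particle velocities
pbest = x.copy()
pbest_val = np.array([fitness(p) for p in x])
gbest = pbest[np.argmin(pbest_val)].copy()

for _ in range(200):
    r1 = rng.random((n, 1))
    r2 = rng.random((n, 1))
    v = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # rule (8)
    v = np.clip(v, -vmax, vmax)   # velocity clamp: a safeguard, not in (8)
    x = x + v
    vals = np.array([fitness(p) for p in x])
    better = vals < pbest_val
    pbest[better] = x[better]
    pbest_val[better] = vals[better]
    gbest = pbest[np.argmin(pbest_val)].copy()

print(round(fitness(gbest), 4))
```

Because pbest and gbest are updated only on improvement, the best-known fitness is nonincreasing over iterations, which is the monotonicity the collective approach later inherits for its personal and global best solutions.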
IV. COLLECTIVE NEURODYNAMIC OPTIMIZATION

Inspired by the paradigm of brainstorming, where each member in a population makes spontaneous contributions of ideas, a group of RNNs is employed and mobilized collectively to search for global optimal solutions to constrained global optimization problems. The collective neurodynamic optimization approach can be viewed as an integration of PSO and neurodynamic optimization. As PSO is deficient in precise constrained local search, each RNN serves as an individual agent to improve the computational efficiency. As a result, the advantages of both neurodynamic optimization and PSO can be combined and utilized [49]. In view of the discussion in Section III-A, the two-layer neural network model (6) can be applied for seeking KKT points of (5). Denote z = (uᵀ, λᵀ)ᵀ as the state variable of a neural network. Let N_r be the number of RNNs used in a group. At an iteration step s, s = 0, 1, ..., N_c, where N_c is the maximum iteration step, denote z_i as the state variable of the ith neural network and z̄_i as the equilibrium of the ith neural network model. The procedures of the collective neurodynamic optimization based on the neural network model (6) are summarized as follows.


1) For each RNN i = 1, ..., N_r:
   a) divide each dimension of the search space into N_r partitions of equal length, and initialize each dimension of z_i with a uniformly random value in the corresponding partition;
   b) perform local search according to the dynamic equation (6) to obtain z̄_i;
   c) initialize the personal best solution as z_p^i ← z̄_i;
   d) initialize the global best solution z_g* by evaluating the cost function over z_p^i, i = 1, ..., N_r.
2) Until a termination criterion is met (e.g., the maximum number of iterations is reached or a solution with an adequate cost function value is found), repeat:
   a) update the state variable of each RNN according to

      z_i ← c₀(z̄_i − z_i) + c₁r₁(z_p^i − z_i) + c₂r₂(z_g* − z_i)   (9)

   b) let each RNN perform local search guided by its dynamic equation (6) to obtain z̄_i;
   c) if J(z̄_i) < J(z_p^i), then update the personal best solution as z_p^i ← z̄_i;
   d) if J(z_p^i) < J(z_g*), then update the global best solution as z_g* ← z_p^i.
3) The termination criteria are as follows:
   a) the maximum number of iterations N_c is reached;
   b) J(z_g*) stops decreasing for T_n consecutive iterations, where T_n is a positive integer specified by the user.

When the procedure terminates, z_g* holds the best found solution. Similar to a brainstorming process, (9) reflects both the experiential knowledge of each individual agent and the socially exchanged knowledge of the whole population. The parameter c₀ is an inertial weight, c₁ and c₂ are positive coefficients used to scale the contributions of individual thinking and social cooperation, respectively, and r₁ and r₂ are random variables used to introduce stochastic elements into the algorithm. Compared with PSO methods, the collective neurodynamic optimization approach has several distinct qualitative differences, as follows.
1) PSO methods search for the global optima among an infinite number of candidate solutions, whereas the collective neurodynamic optimization approach searches among a finite number of candidates (i.e., local minima, local maxima, and saddle points) only. As a result, the collective neurodynamic optimization approach is expected to significantly increase the computational efficiency, and it is able to obtain the global optimal solution almost surely, provided that the number of neural networks N_r and the number of executed iterations N_c are sufficiently large.
2) PSO methods are generally weak in constraint handling, whereas the collective neurodynamic optimization approach guarantees constraint satisfaction


during the search process. The dynamic equation (6) ensures that the state variables always converge to the feasible region regardless of the initial points.
3) Both the search process and the information exchange rule in PSO methods are stochastic. In the collective neurodynamic optimization approach, the information exchange rule is stochastic as well, but the search process is deterministic.

Compared with conventional neurodynamic optimization approaches [15]–[26], the collective neurodynamic optimization approach is capable of global exploration of the search space. In conventional neurodynamic optimization approaches, a single neural network is applied as a goal-seeking computational model. In the collective neurodynamic optimization approach, multiple neural networks carry out parallel and simultaneous search with the aid of cooperative information exchange. The collective neurodynamic optimization approach operates on mixed continuous-discrete time scales: each individual model carries out constrained local search in continuous time, whereas the information exchange and neuronal state resets occur in discrete time. The computational complexity of the collective neurodynamic optimization approach depends on the complexity and characteristics of the optimization problem. The temporal complexity is mainly determined by two parameters, t_c and N_c, where t_c is the time elapsed for the state variable of a neural network to converge to a sufficiently small neighbourhood of its equilibrium, which is proportional to the positive scaling constant ε, and N_c is the maximum number of iterations allowed for neuronal state resets and information exchange. Users can choose ε and N_c based on their needs and the time allowed. In the worst case, the computational time of the collective neural network approach is roughly t_c N_c.
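To make the procedure concrete, here is a minimal sketch of the collective search on a one-dimensional nonconvex toy problem; the objective, the simplified projected gradient flow standing in for the full two-layer dynamics (6), and the unit coefficients in the restart rule (9) are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy nonconvex objective with two local minima; the global one is near u = -1.03.
J  = lambda u: (u**2 - 1.0)**2 + 0.25 * u
dJ = lambda u: 4.0 * u * (u**2 - 1.0) + 0.25
LO, HI = -2.0, 2.0                       # feasible interval (stands in for Omega)

def local_search(u):
    # Euler-discretized gradient flow projected onto [LO, HI], a simplified
    # stand-in for the constrained local search of the two-layer RNN (6).
    for _ in range(5000):
        u = np.clip(u - 0.01 * dJ(u), LO, HI)
    return u

Nr, Nc, c0, c1, c2 = 4, 10, 1.0, 1.0, 1.0
# Step 1: one RNN per equal-length partition of the search space.
edges = np.linspace(LO, HI, Nr + 1)
z = rng.uniform(edges[:-1], edges[1:])
zbar = np.array([local_search(zi) for zi in z])
pbest = zbar.copy()
gbest = pbest[np.argmin(J(pbest))]

# Step 2: iterate restarts guided by personal/global bests, per rule (9).
for _ in range(Nc):
    r1, r2 = rng.random(Nr), rng.random(Nr)
    z = c0 * (zbar - z) + c1 * r1 * (pbest - z) + c2 * r2 * (gbest - z)
    z = np.clip(z, LO, HI)
    zbar = np.array([local_search(zi) for zi in z])
    better = J(zbar) < J(pbest)
    pbest[better] = zbar[better]
    gbest = pbest[np.argmin(J(pbest))]

print(round(float(gbest), 2))  # -1.03, the global minimum
```

Even though each local search alone can be trapped in the shallower minimum near u = 0.97, the partitioned initialization and best-guided restarts let the group locate the deeper minimum.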
As t_c is normally on a microsecond time scale, t_c N_c is generally well acceptable for solving the optimization problem (4) in real time for synthesizing NMPC in practical applications. Using the collective neurodynamic optimization approach, there is no need to approximate the system model or reformulate the optimization problem, which makes NMPC design and implementation more convenient. The overall NMPC scheme based on the collective neurodynamic optimization approach is summarized as follows.
1) Let k = 1. Set the terminal control time T, the prediction horizon N, the control horizon N_u, and the weighting matrices Q and R.
2) Predict the future system behaviors using the model (1) and formulate the optimization problem (4).
3) Set the parameters N_r, N_c, and T_n of the collective neurodynamic optimization approach.
4) Solve the nonlinear optimization problem (4) using the collective neurodynamic optimization approach to obtain the optimal control input u(k).
5) Apply u(k) to the system and update the state information.
6) If k < T, set k = k + 1 and go to Step 2; otherwise, end.
It is worth noting that the neurodynamics-based nonlinear control system is essentially a two-time-scale system.
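The receding-horizon loop of Steps 1)-6) can be sketched as follows, with a hypothetical scalar plant and a crude grid search standing in for the collective neurodynamic solver; every numerical value below is an assumption for illustration.

```python
import numpy as np
from itertools import product

# Hypothetical scalar instance of model (1) (an assumption, not from the paper).
f = lambda x, u: 0.9 * x + 0.5 * np.sin(u) + 0.1 * u
Nu = 3                                   # control horizon
r = 1.0                                  # constant output reference
U_GRID = np.linspace(-2.0, 2.0, 21)      # discretized input constraint set U

def cost(x, useq):
    # Predicted cost over the horizon, mirroring the structure of (3) with F = 0.
    c = 0.0
    for u in useq:
        x = f(x, u)
        c += (r - x)**2 + 0.1 * u**2
    return c

x = 0.0
for k in range(30):
    # Step 4 (solver stand-in): pick the best input sequence by grid search;
    # Step 5: apply only the first input of that sequence, then recede.
    useq = min(product(U_GRID, repeat=Nu), key=lambda s: cost(x, s))
    x = f(x, useq[0])

print(round(x, 2))
```

Only the first input of each optimized sequence is applied before the problem is re-solved from the new state, which is the defining receding-horizon mechanism of Steps 2)-6).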

Fig. 1. Neuronal states with 10 random initial conditions at k = 400 in Example 1.

The neural networks can generally converge to their equilibria at least a thousand times faster than the controlled plant by letting ε be sufficiently small. As a result, the dynamic behaviors of the neurodynamic models will generally not jeopardize the stability of the closed-loop system.

V. SIMULATION RESULTS

In this section, simulation results on nonlinear control problems are provided to demonstrate the characteristics and effectiveness of the proposed NMPC scheme.

Example 1: Consider a wheeled mobile robot model [50] described as follows:

ẋ = v cos θ
ẏ = v sin θ
θ̇ = ω   (10)
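For prediction, the continuous-time kinematics (10) must be discretized; below is a simple Euler discretization with a hypothetical sampling period, together with the input constraints of Example 1.

```python
import numpy as np

# Euler discretization of the unicycle model (10) for prediction.
DT = 0.05  # hypothetical sampling period in seconds (not stated in the text)

def robot_step(state, v, w):
    x, y, th = state
    return np.array([x + DT * v * np.cos(th),
                     y + DT * v * np.sin(th),
                     th + DT * w])

def clamp_inputs(v, w):
    # Input constraints of Example 1: 0 <= v <= 2 and -pi/6 <= w <= pi/2.
    return min(max(v, 0.0), 2.0), min(max(w, -np.pi / 6), np.pi / 2)

s = np.array([0.0, 2.0, 0.0])       # initial position (0, 2), heading 0 assumed
v, w = clamp_inputs(3.0, -1.0)      # infeasible requests get clipped
s = robot_step(s, v, w)
print(v, round(w, 3))               # 2.0 -0.524
```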

where (x, y) denotes the position of the mobile robot, θ denotes the orientation, v denotes the linear velocity, and ω denotes the angular velocity. The control objective is to force the mobile robot to track a given reference trajectory

r_x(t) = t,   r_y(t) = 2.8 for t < 15, and r_y(t) = 0.0125t² for t ≥ 15.

The system constraints are 0 ≤ v ≤ 2 and −π/6 ≤ ω ≤ π/2. The initial position of the robot is (x(0), y(0)) = (0, 2). As the linearized model of system (10) at the origin is not controllable (the controllability matrix is not of full rank), MPC with Jacobian linearization (MPC-JL) cannot deliver desirable tracking performance. Set N = 5, N_u = 5, Q = diag(5, 500, 0.1), and R = 0.1I. Let F = 0, so the performance index becomes J = ||r̄(k) − x̄(k)||²_Q + ||ū(k)||²_R. Let N_r = 4, c₀ = 1, c₁ = 1, and c₂ = 1. During each sampling interval, the neural network (6) converges to one of its equilibrium points. For example, Fig. 1 shows the convergence behaviors of a neural network model at k = 400. Figs. 2 and 3, respectively, show the global


Fig. 2. Global best solution z_g* at k = 400 in Example 1.

Fig. 3. Stationary neuronal states of four networks at k = 400 in Example 1.


Fig. 5. Tracking errors in Example 1.

Fig. 6. Cost in Example 1.

best solution z_g* and the neuronal states of the networks during the collective search process. The effect of N_r is also investigated. For a given N_r, the proposed algorithm is executed 20 times. The experimental results are summarized in Fig. 4. Although it is difficult to quantify the sensitivity of the algorithm to N_r, the experimental results indicate that the proposed method becomes almost surely reliable as N_r increases. Fig. 4 indicates that the computational complexity in this example is low, as the numbers of RNNs and iterations are both small. In addition, we found by experiments that the success rate drops significantly if either c₁ or c₂ is 0. The control results are shown in Figs. 5–9. Compared with the method presented in [51], the method herein results in smaller tracking errors and a lower cost.

Fig. 4. Average number of iterations and success rate with respect to N_r.

Example 2: Consider a nonlinear and nonaffine model [52] as follows:

y(k + 1) = 0.2 cos(0.8(y(k) + y(k − 1))) + 0.4 sin(0.8(y(k) + y(k − 1)) + 2u(k) + u(k − 1)) + 0.1(9 + y(k) + y(k − 1)) + 2(u(k) + u(k − 1)) / (1 + cos(y(k))).   (11)

The control objective is to force the output variable to track a desired reference given as

r(k) = 0.5 + 0.05(sin(πk/50) + sin(πk/100) + sin(πk/150)).

The system constraint is −2π ≤ u ≤ 2π. The initial condition of the plant is (y(0), y(1)) = (0, 0.2). The NMPC parameters


Fig. 7. Mobile robot trajectory in Example 1.

Fig. 8. Linear velocity in Example 1.

Fig. 9. Angular velocity in Example 1.

Fig. 10. Neuronal states with 10 random initial conditions at k = 1 in Example 2.

are selected as N = 2, N_u = 2, Q = 10I, and R = I. Due to the nonlinearity of the model (11), the optimization problem (4) generally becomes nonconvex at each time instant.
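A one-step simulator of the nonaffine model (11) and the reference signal of Example 2 can be sketched as follows; treat the exact grouping of terms as a best-effort reading of the equations as stated in the text.

```python
import numpy as np

# One-step simulator of the nonaffine model (11), as stated in the text.
def plant(y, y_prev, u, u_prev):
    s = y + y_prev
    return (0.2 * np.cos(0.8 * s)
            + 0.4 * np.sin(0.8 * s + 2.0 * u + u_prev)
            + 0.1 * (9.0 + s)
            + 2.0 * (u + u_prev) / (1.0 + np.cos(y)))

def reference(k):
    # Desired output trajectory of Example 2.
    return 0.5 + 0.05 * (np.sin(np.pi * k / 50)
                         + np.sin(np.pi * k / 100)
                         + np.sin(np.pi * k / 150))

# Initial condition (y(0), y(1)) = (0, 0.2) from the text; zero past inputs assumed.
y2 = plant(0.2, 0.0, 0.0, 0.0)
print(round(float(y2), 3))  # 1.181
```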

Fig. 11. Global best solution z_g* at k = 1 in Example 2.

Fig. 12. Tracking error in Example 2.

As a result, a single neural network model (6) cannot be guaranteed to converge to the global minimum. In steady state, as shown in Fig. 10, the neural network (6) is convergent


Fig. 13. Control input in Example 2.

Fig. 16. Global best solution z_g* at k = 40 in Example 3.

Fig. 14. Tracking error in the presence of noise in Example 2.

Fig. 17. Tracking error of y1 in Example 3.

Fig. 15. Neuronal states with 10 random initial conditions at k = 40 in Example 3.

Fig. 18. Tracking error of y2 in Example 3.

to KKT points. The collective neurodynamic optimization approach is applied for computing the global optimal control input at each time instant. Let N_r = 3. The resulting z_g* is shown in Fig. 11. The effect of N_r in this example is also investigated: the success rate is 100% when N_r ≥ 2, which indicates that the computational complexity is very low.


Fig. 19. Cost in Example 3.

Fig. 20. Control inputs in Example 3.

The control results are shown in Figs. 12 and 13. MPC-JL is also applied to (11) for performance comparison. The collective neurodynamic optimization approach results in superior tracking performance without any need to reformulate the optimization problem in NMPC. In addition, assume that a random noise w(k) ∈ [−0.05, 0.05] with a uniform distribution is added to the model (11). The tracking error resulting from the proposed NMPC approach is shown in Fig. 14. The resulting tracking errors are bounded within a very small range, which indicates that the proposed NMPC approach can tolerate small noises.

Example 3: Consider a double parallel inverted pendulum system model [53]:

ẋ₁ = x₂
ẋ₂ = (m₁gr/j₁ − kr²/(4j₁)) sin x₁ + (kr/(2j₁))(l − b) + (α₁/j₁) sat(u₁) + (kr²/(4j₁)) sin x₃
ẋ₃ = x₄
ẋ₄ = (m₂gr/j₂ − kr²/(4j₂)) sin x₃ + (kr/(2j₂))(l − b) + (α₂/j₂) sat(u₂) + (kr²/(4j₂)) sin x₁   (12)
y₁ = x₁,  y₂ = x₃   (13)

where x₁ and x₃ are the vertical angular displacements of the pendulums, x₂ and x₄ are the angular velocities, u₁ and u₂ are the input torques generated by the servomotors, m₁ = 2 and m₂ = 2.5 are the pendulum masses, j₁ = 0.5 and j₂ = 0.625 are the moments of inertia, k = 100 is the spring constant, r = 0.5 is the pendulum height, l = 0.5 is the natural length of the spring, α₁ = 25 and α₂ = 25 are the control input gains, and g = 9.81 is the gravitational acceleration. All parameters are in SI units. The function sat(·) represents the actuators' nonlinearity, which is implemented as sin(10u). The control objective is to track the reference

r₁(t) = cos(πt) + tanh(t),  r₂(t) = sin(πt) + tanh(2t).

The initial condition is x(0) = [0.2, 0, −0.15, 0]ᵀ. The constraints are −0.5 ≤ y ≤ 2. Let N = 5, N_u = 2, Q = 10I, R = 0.1I, and the sampling frequency be 20 Hz. A single neural network (6) can converge to a local equilibrium from any initial point on a microsecond scale. For example, Fig. 15 shows the convergence behaviors at k = 40 with N_r = 4. The evolution of z_g* at the corresponding time instant is shown in Fig. 16. The computational complexity in this example is very low. The tracking errors are shown in Figs. 17 and 18. The cost value is shown in Fig. 19, where the total costs Σ_{k=1}^{T} J(k) for the method herein and MPC-JL are, respectively, 38.0563 and 58.2767. The control inputs are shown in Fig. 20. The collective neurodynamic optimization approach results in superior tracking performance.

VI. CONCLUSION

This paper presents a new NMPC law based on collective neurodynamic optimization. By directly and sequentially solving the global optimization problem arising in NMPC, multiple RNNs are utilized in a cooperative way to search for globally optimal control solutions. Guided by the individual best known solutions as well as the best known solution of the entire group, the RNNs collectively and iteratively improve the solution quality until a globally optimal control solution is found. With the collective neurodynamic optimization approach, the NMPC problem can be solved effectively in real time without any need for problem reformulation or model approximation. Simulation results substantiate the superior performance of the proposed approach. A potential hardware implementation of collective neurodynamic optimization (e.g., on graphics processing units) could further demonstrate its efficiency. Other topics for further investigation include robust MPC based on collective neurodynamics that explicitly accounts for disturbances and model mismatch, where the collective neurodynamic optimization approach may be combined with invariant-tube and minimax methods.
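As a concrete picture of the receding-horizon mechanism summarized above, the following sketch optimizes a finite-horizon cost at every sampling instant and applies only the first control move. The scalar plant, cost weights, and brute-force grid search below are illustrative assumptions standing in for the paper's model and for the collective neurodynamic optimizer.

```python
import numpy as np

# Hedged sketch of a receding-horizon NMPC loop: at each instant a
# finite-horizon cost is minimized over a control sequence, the first
# move is applied, and the horizon recedes one step.

def plant(x, u):
    # assumed scalar nonlinear plant, not the paper's benchmark model
    return 0.8 * np.sin(x) + u

def horizon_cost(x0, u_seq, ref, Q=10.0, R=0.1):
    # stage costs accumulated over the prediction horizon
    x, c = x0, 0.0
    for u in u_seq:
        x = plant(x, u)
        c += Q * (x - ref) ** 2 + R * u ** 2
    return c

def nmpc_step(x0, ref, N=3, grid=np.linspace(-1.0, 1.0, 9)):
    # stand-in global optimizer: exhaustive search over a coarse input grid
    best_u, best_c = None, np.inf
    for u_seq in np.stack(np.meshgrid(*[grid] * N), -1).reshape(-1, N):
        c = horizon_cost(x0, u_seq, ref)
        if c < best_c:
            best_c, best_u = c, u_seq
    return best_u[0]          # receding horizon: apply only the first move

x, traj = 1.5, []
for k in range(20):
    u = nmpc_step(x, ref=0.5)
    x = plant(x, u)
    traj.append(x)
print(abs(traj[-1] - 0.5))    # small residual tracking error
```

In the paper's scheme, the exhaustive search above is replaced by the collective neurodynamic optimizer, which makes the per-step global optimization tractable for the constrained multivariable problems of the examples.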


REFERENCES
[1] S. J. Qin and T. A. Badgwell, "A survey of industrial model predictive control technology," Control Eng. Pract., vol. 11, no. 7, pp. 733–764, 2003.
[2] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert, "Constrained model predictive control: Stability and optimality," Automatica, vol. 36, no. 6, pp. 789–814, 2000.
[3] G. Karer, G. Mušič, I. Škrjanc, and B. Zupančič, "Hybrid fuzzy modelling for model predictive control," J. Intell. Robot. Syst., vol. 50, no. 3, pp. 297–319, 2009.
[4] P. Tatjewski, Advanced Control of Industrial Processes: Structures and Algorithms. London, U.K.: Springer-Verlag, 2003.
[5] J. Mu, D. Rees, and G. P. Liu, "Advanced controller design for aircraft gas turbine engines," Control Eng. Pract., vol. 13, no. 8, pp. 1001–1015, 2005.
[6] G. Colin, Y. Chamaillard, G. Bloch, and G. Corde, "Neural control of fast nonlinear systems—Application to a turbocharged SI engine with VCT," IEEE Trans. Neural Netw., vol. 18, no. 4, pp. 1101–1114, Jul. 2007.
[7] M. Ławryńczuk and P. Tatjewski, "Nonlinear predictive control based on neural multi-models," Int. J. Appl. Math. Comput. Sci., vol. 20, no. 1, pp. 7–21, 2010.
[8] A. Bemporad, F. Borrelli, and M. Morari, "Model predictive control based on linear programming—The explicit solution," IEEE Trans. Autom. Control, vol. 47, no. 12, pp. 1974–1985, Dec. 2002.
[9] M. N. Zeilinger, C. N. Jones, and M. Morari, "Real-time suboptimal model predictive control using a combination of explicit MPC and online optimization," IEEE Trans. Autom. Control, vol. 56, no. 7, pp. 1524–1534, Jul. 2011.
[10] M. Kvasnica, J. Hledík, I. Rauová, and M. Fikar, "Complexity reduction of explicit model predictive control via separation," Automatica, vol. 49, no. 6, pp. 1776–1781, 2013.
[11] T. A. Johansen, "Approximate explicit receding horizon control of constrained nonlinear systems," Automatica, vol. 40, no. 2, pp. 293–300, 2004.
[12] B. M. Åkesson and H. T. Toivonen, "A neural network model predictive controller," J. Process Control, vol. 16, no. 9, pp. 937–946, 2006.
[13] M. Canale, L. Fagiano, and M. Milanese, "Set membership approximation theory for fast implementation of model predictive control laws," Automatica, vol. 45, no. 1, pp. 45–54, 2009.
[14] L. Fagiano, M. Canale, and M. Milanese, "Set membership approximation of discontinuous nonlinear model predictive control laws," Automatica, vol. 48, no. 1, pp. 191–197, 2012.
[15] Y. Xia, H. Leung, and J. Wang, "A projection neural network and its application to constrained optimization problems," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 49, no. 4, pp. 447–458, Apr. 2002.
[16] Y. Xia and J. Wang, "A general projection neural network for solving monotone variational inequalities and related optimization problems," IEEE Trans. Neural Netw., vol. 15, no. 2, pp. 318–328, Mar. 2004.
[17] X. Hu and J. Wang, "Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network," IEEE Trans. Neural Netw., vol. 17, no. 6, pp. 1487–1499, Nov. 2006.
[18] Q. Liu and J. Wang, "A one-layer projection neural network for nonsmooth optimization subject to linear equalities and bound constraints," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 5, pp. 812–824, May 2013.
[19] Y. S. Xia, G. Feng, and J. Wang, "A primal-dual neural network for online resolving constrained kinematic redundancy in robot motion control," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 35, no. 1, pp. 54–64, Feb. 2005.
[20] S. Liu and J. Wang, "A simplified dual neural network for quadratic programming with its KWTA application," IEEE Trans. Neural Netw., vol. 17, no. 6, pp. 1500–1510, Nov. 2006.
[21] X. Hu and J. Wang, "An improved dual neural network for solving a class of quadratic programming problems and its k-winners-take-all application," IEEE Trans. Neural Netw., vol. 19, no. 12, pp. 2022–2031, Dec. 2008.
[22] Q. Liu and J. Wang, "A one-layer recurrent neural network with a discontinuous activation function for linear programming," Neural Comput., vol. 20, no. 5, pp. 1366–1383, 2008.
[23] Q. Liu and J. Wang, "A one-layer recurrent neural network with a discontinuous hard-limiting activation function for quadratic programming," IEEE Trans. Neural Netw., vol. 19, no. 4, pp. 558–570, Apr. 2008.


[24] Z. Guo, Q. Liu, and J. Wang, "A one-layer recurrent neural network for pseudoconvex optimization subject to linear equality constraints," IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 1892–1900, Dec. 2011.
[25] Q. Liu, Z. Guo, and J. Wang, "A one-layer recurrent neural network for constrained pseudoconvex optimization and its application for dynamic portfolio optimization," Neural Netw., vol. 26, pp. 99–109, Feb. 2012.
[26] G. Li, Z. Yan, and J. Wang, "A one-layer recurrent neural network for constrained nonsmooth invex optimization," Neural Netw., vol. 50, pp. 79–89, Feb. 2014.
[27] D. A. White and D. A. Sofge, Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches. New York, NY, USA: Van Nostrand, 1992.
[28] S. Piche, B. Sayyar-Rodsari, D. Johnson, and M. Gerules, "Nonlinear model predictive control using neural networks," IEEE Control Syst., vol. 20, no. 3, pp. 53–62, Jun. 2000.
[29] J.-Q. Huang and F. L. Lewis, "Neural-network predictive control for nonlinear dynamic systems with time-delay," IEEE Trans. Neural Netw., vol. 14, no. 2, pp. 377–389, Mar. 2003.
[30] M. Han, J. Fan, and J. Wang, "A dynamic feedforward neural network based on Gaussian particle swarm optimization and its application for predictive control," IEEE Trans. Neural Netw., vol. 22, no. 9, pp. 1457–1468, Sep. 2011.
[31] Y. Pan and J. Wang, "Model predictive control of unknown nonlinear dynamical systems based on recurrent neural networks," IEEE Trans. Ind. Electron., vol. 59, no. 8, pp. 3089–3101, Aug. 2012.
[32] Z. Yan and J. Wang, "Model predictive control of nonlinear systems with unmodeled dynamics based on feedforward and recurrent neural networks," IEEE Trans. Ind. Informat., vol. 8, no. 4, pp. 746–756, Nov. 2012.
[33] L.-X. Wang and F. Wan, "Structured neural networks for constrained model predictive control," Automatica, vol. 37, no. 8, pp. 1235–1243, 2001.
[34] L. Cheng, Z.-G. Hou, and M. Tan, "Constrained multi-variable generalized predictive control using a dual neural network," Neural Comput. Appl., vol. 16, no. 6, pp. 505–512, 2007.
[35] Z. Yan and J. Wang, "Model predictive control for tracking of underactuated vessels based on recurrent neural networks," IEEE J. Ocean. Eng., vol. 37, no. 4, pp. 717–726, Oct. 2012.
[36] Z. Yan and J. Wang, "Robust model predictive control of nonlinear systems with unmodeled dynamics and bounded uncertainties based on neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 25, no. 3, pp. 457–469, Mar. 2014.
[37] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proc. IEEE Int. Conf. Neural Netw., Perth, WA, Australia, 1995, pp. 1942–1948.
[38] R. C. Eberhart and Y. Shi, "Particle swarm optimization: Developments, applications and resources," in Proc. Congr. Evol. Comput., Seoul, Korea, 2001, pp. 81–86.
[39] R. C. Eberhart, Y. Shi, and J. Kennedy, Swarm Intelligence. San Mateo, CA, USA: Morgan Kaufmann, 2001.
[40] F. van den Bergh and A. P. Engelbrecht, "A cooperative approach to particle swarm optimization," IEEE Trans. Evol. Comput., vol. 8, no. 3, pp. 225–239, Jun. 2004.
[41] J. Kennedy, "Particle swarm optimization," in Encyclopedia of Machine Learning. New York, NY, USA: Springer-Verlag, 2010, pp. 760–766.
[42] J. B. Rawlings and D. Q. Mayne, Model Predictive Control: Theory and Design. Madison, WI, USA: Nob Hill Publishing, 2009.
[43] L. Magni, G. De Nicolao, L. Magnani, and R. Scattolini, "A stabilizing model-based predictive control algorithm for nonlinear systems," Automatica, vol. 37, no. 9, pp. 1351–1362, 2001.
[44] J. A. Primbs and V. Nevistić, "Feasibility and stability of constrained finite receding horizon control," Automatica, vol. 36, no. 7, pp. 965–971, 2000.
[45] A. Jadbabaie and J. Hauser, "On the stability of receding horizon control with a general terminal cost," IEEE Trans. Autom. Control, vol. 50, no. 5, pp. 674–678, May 2005.
[46] X. Hu and J. Wang, "Convergence of a recurrent neural network for nonconvex optimization based on an augmented Lagrangian function," in Proc. 4th Int. Symp. Neural Netw., Nanjing, China, 2007, pp. 194–203.
[47] C.-L. Sun, J.-C. Zeng, and J.-S. Pan, "An improved vector particle swarm optimization for constrained optimization problems," Inf. Sci., vol. 181, no. 6, pp. 1153–1163, 2011.
[48] E. Mezura-Montes and C. A. Coello Coello, "Constraint-handling in nature-inspired numerical optimization: Past, present and future," Swarm Evol. Comput., vol. 1, no. 4, pp. 173–194, 2011.


[49] Z. Yan, J. Wang, and G. Li, "A collective neurodynamic optimization approach to bound-constrained nonconvex optimization," Neural Netw., vol. 55, pp. 20–29, Jul. 2014.
[50] D. Gu and H. Hu, "Receding horizon tracking control of wheeled mobile robots," IEEE Trans. Control Syst. Technol., vol. 14, no. 4, pp. 743–749, Jul. 2006.
[51] Y. Pan and J. Wang, "Model predictive control for nonlinear affine systems based on the simplified dual neural network," in Proc. IEEE Control Appl. Intell. Control, Saint Petersburg, Russia, Jul. 2009, pp. 683–688.
[52] Q. Yang, J. B. Vance, and S. Jagannathan, "Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 38, no. 4, pp. 994–1001, Aug. 2008.
[53] Y. Tang, M. Tomizuka, G. Guerrero, and G. Montemayor, "Decentralized robust control of mechanical systems," IEEE Trans. Autom. Control, vol. 45, no. 4, pp. 771–776, Apr. 2000.

Zheng Yan (S’11) received the B.Eng. degree in automation and computer-aided engineering and the Ph.D. degree in mechanical and automation engineering from the Chinese University of Hong Kong, Hong Kong, in 2010 and 2014, respectively. His current research interests include computational intelligence and model predictive control. Dr. Yan was a recipient of the Graduate Research Grant from the IEEE Computational Intelligence Society in 2014.

Jun Wang (S'89–M'90–SM'93–F'07) received the B.S. degree in electrical engineering and the M.S. degree in systems engineering from the Dalian University of Technology, Dalian, China, in 1982 and 1985, respectively, and the Ph.D. degree in systems engineering from Case Western Reserve University, Cleveland, OH, USA, in 1991. He held various academic positions with the Dalian University of Technology, Case Western Reserve University, and the University of North Dakota, Grand Forks, ND, USA. He also held various short-term or part-time visiting positions with the U.S. Air Force Armstrong Laboratory, San Antonio, TX, USA, in 1995, the RIKEN Brain Science Institute, Wako, Japan, in 2001, and the Huazhong University of Science and Technology, Wuhan, China, from 2006 to 2007. He was the Cheung Kong Chair Professor with Shanghai Jiao Tong University, Shanghai, China, from 2008 to 2011. He has been with the Dalian University of Technology as the National Thousand-Talent Chair Professor since 2011. He is currently a Professor with the Department of Mechanical and Automation Engineering, Chinese University of Hong Kong, Hong Kong. His current research interests include neural networks and their applications. Prof. Wang was a recipient of the Research Excellence Award from the Chinese University of Hong Kong from 2008 to 2009, two Natural Science Awards (first class) from the Shanghai Municipal Government in 2009 and the Ministry of Education of China in 2011, the Outstanding Achievement Award from the Asia Pacific Neural Network Assembly, the IEEE TRANSACTIONS ON NEURAL NETWORKS Outstanding Paper Award (with Qingshan Liu) in 2011, and the Neural Networks Pioneer Award from the IEEE Computational Intelligence Society in 2014. He has been the Editor-in-Chief of the IEEE TRANSACTIONS ON CYBERNETICS since 2014, and served as an Associate Editor of the journal and its predecessor from 2003 to 2013. He was a member of the Editorial Board of Neural Networks from 2012 to 2014.
He served as an Associate Editor of the IEEE TRANSACTIONS ON NEURAL NETWORKS from 1999 to 2009 and the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS–PART C from 2002 to 2005, and a member of the Editorial Advisory Board of the International Journal of Neural Systems from 2006 to 2012. He was a Guest Editor of the special issues of the European Journal of Operational Research in 1996, the International Journal of Neural Systems in 2007, and Neurocomputing in 2008. He served as the President of the Asia Pacific Neural Network Assembly in 2006, the General Chair of the 13th International Conference on Neural Information Processing in 2006, and the IEEE World Congress on Computational Intelligence in 2008. He has served on many committees, such as the IEEE Fellow Committee. He was an IEEE Computational Intelligence Society Distinguished Lecturer from 2010 to 2012.
