
A One-Layer Projection Neural Network for Nonsmooth Optimization Subject to Linear Equalities and Bound Constraints

Qingshan Liu, Member, IEEE, and Jun Wang, Fellow, IEEE

Abstract— This paper presents a one-layer projection neural network for solving nonsmooth optimization problems with generalized convex objective functions subject to linear equalities and bound constraints. The proposed neural network is designed based on two projection operators: one for the linear equality constraints and one for the bound constraints. The objective function in the optimization problem can be any nonsmooth function that is not restricted to be convex globally, but is required to be convex (pseudoconvex) on a set defined by the constraints. Compared with existing recurrent neural networks for nonsmooth optimization, the proposed model does not have any design parameter, which makes it more convenient to design and implement. It is proved that the output variables of the proposed neural network are globally convergent to the optimal solutions provided that the objective function is at least pseudoconvex. Simulation results of numerical examples are discussed to demonstrate the effectiveness and characteristics of the proposed neural network.

Index Terms— Differential inclusion, global convergence, Lyapunov function, nonsmooth optimization, projection neural network.

I. INTRODUCTION

CONSIDER the following general nonsmooth optimization problem:

    minimize f(x)
    subject to Ax = b, x ∈ Ω    (1)

where x = (x1, x2, . . . , xn)^T ∈ Rn; f : Rn → R is an objective function which is not necessarily convex or smooth; A ∈ Rm×n is a full row-rank matrix (i.e., rank(A) = m ≤ n); b = (b1, b2, . . . , bm)^T ∈ Rm; and Ω is a nonempty and closed convex set in Rn.

Manuscript received August 13, 2012; accepted January 26, 2013. Date of publication February 28, 2013; date of current version March 8, 2013. The work of Q. Liu was supported in part by the National Natural Science Foundation of China under Grant 61105060, the Program for New Century Excellent Talents in University under Grant NCET-12-0114, the Natural Science Foundation of Jiangsu Province of China under Grant BK2011594, and the Fundamental Research Funds for the Central Universities. The work of J. Wang was supported in part by the Research Grants Council of the Hong Kong Special Administrative Region, China, under Grant CUHK416811E and Grant CUHK416812E, and the National Natural Science Foundation of China under Grant 61273307. Q. Liu is with the School of Automation, Southeast University, Nanjing 210096, China (e-mail: [email protected]). J. Wang is with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, and also with the School of Control Science and Engineering, Dalian University of Technology, Dalian 116023, China (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2013.2244908

Constrained optimization arises in a broad variety of scientific and engineering applications where real-time solutions of the optimization problems are often required. One possible and very promising approach for real-time optimization is to apply recurrent neural networks based on circuit implementation. As parallel computational models for solving constrained optimization problems, recurrent neural networks have received a great deal of attention and found a wide range of applications over the past few decades (e.g., [14], [21], [34], [37], and references therein). In 1986, Tank and Hopfield [37] proposed a recurrent neural network for solving linear programming problems, which inspired many researchers to develop other neural networks for optimization. In 1988, the dynamical canonical nonlinear programming circuit (NPC) was introduced by Kennedy and Chua [21] for nonlinear programming by utilizing a finite penalty parameter, which can generate approximate optimal solutions. From then on, research on NPCs has been well developed and many neural network models have been designed for optimization problems [5], [8], [33], [39]. Among them, the Lagrangian network (based on the Lagrangian method) was proposed by Zhang and Constantinides [49] for solving convex nonlinear programming problems. The deterministic annealing neural network by Wang [40] was developed for linear and convex programming. Moreover, the projection method was introduced to design recurrent neural networks for solving smooth convex (pseudoconvex) optimization problems, such as the projection networks proposed by Xia et al. [42] and Hu and Wang [17], [18]. In order to reduce the model complexity, the dual and simplified dual neural networks were introduced for solving convex programming problems with dual variables only [19], [31]. In [26] and [27], we proposed several one-layer recurrent neural networks with lower model complexity for solving linear and quadratic programming problems. In recent years, recurrent neural networks based on the penalty method have been widely investigated for solving nonsmooth optimization problems. The generalized NPC (G-NPC) proposed by Forti et al. [14] can be considered as a natural extension of NPC for solving nonsmooth optimization problems with inequality constraints. Xue and Bian [48] proposed a recurrent neural network modeled by a differential inclusion


for nonsmooth convex optimization based on the subgradient and penalty parameter method. In [28], we proposed a one-layer recurrent neural network to solve problem (1) with only equality constraints. More recently, some one-layer recurrent neural networks were proposed for solving locally convex and pseudoconvex nonsmooth optimization problems [25], [30].

In this paper, we are concerned with the nonsmooth optimization problem (1), and a one-layer projection neural network model with an easily implemented structure is developed. Compared with the existing neural networks for nonsmooth optimization, the proposed neural network has several merits. The objective function of problem (1) does not need to be convex everywhere, and only needs to be convex (pseudoconvex) on a set defined by the constraints; the neural network does not have any design parameter; and some conditions in [25] and [48] are removed, such as the existence of a feasible point in the interior of Ω.

The remainder of this paper is organized as follows. Section II discusses some preliminaries. In Section III, the proposed projection neural network model is described. The optimality and global convergence of the proposed neural network are analyzed in Section IV. Next, in Section V, four illustrative examples are given to show the effectiveness and performance of the proposed neural network. Finally, Section VI concludes this paper and presents some future research directions.

II. PRELIMINARIES

In this section, we present some definitions and properties concerning set-valued maps, nonsmooth analysis, and convex analysis, which are needed for the theoretical analysis in this paper. We refer the reader to [1], [9], [10], and [15] for more thorough discussions.

Definition 1: Suppose E ⊂ Rn. F : x → F(x) is called a set-valued function from E → Rn if, to each point x of the set E, there corresponds a nonempty closed set F(x) ⊂ Rn.

Definition 2: A function ϕ : Rn → R is said to be Lipschitz near x ∈ Rn if there exist ε, δ > 0 such that, for any x′, x″ ∈ Rn satisfying ||x′ − x|| < δ and ||x″ − x|| < δ, we have |ϕ(x′) − ϕ(x″)| ≤ ε||x′ − x″||. If ϕ is Lipschitz near any point x ∈ Rn, then ϕ is said to be locally Lipschitz in Rn.

Assume that ϕ is Lipschitz near x. The generalized directional derivative of ϕ at x in the direction v ∈ Rn is given by

    ϕ0(x; v) = lim sup_{y→x, s→0+} [ϕ(y + sv) − ϕ(y)] / s.

The Clarke generalized gradient of ϕ is defined as ∂ϕ(x) = {y ∈ Rn : ϕ0(x; v) ≥ y^T v, ∀v ∈ Rn}. When ϕ is locally Lipschitz in Rn, ϕ is differentiable for almost all (a.a.) x ∈ Rn (in the sense of Lebesgue measure). Then, the Clarke generalized gradient of ϕ at x ∈ Rn is equivalent to

    ∂ϕ(x) = K{ lim_{n→∞} ∇ϕ(xn) : xn → x, xn ∉ N, xn ∉ E }


where K(·) denotes the closure of the convex hull, N ⊂ Rn is an arbitrary set with measure zero, and E ⊂ Rn is the set of points where ϕ is not differentiable.

Definition 3: A function ϕ : Rn → R, which is locally Lipschitz near x ∈ Rn, is said to be regular at x if the one-sided directional derivative exists for any direction v ∈ Rn, which is given by

    ϕ′(x; v) = lim_{ξ→0+} [ϕ(x + ξv) − ϕ(x)] / ξ

and we have ϕ0(x; v) = ϕ′(x; v). The function ϕ is said to be regular in Rn if it is regular for any x ∈ Rn.

Consider the following ordinary differential equation (ODE):

    dx/dt = ψ(x), x(t0) = x0.    (2)

A set-valued map is defined as

    φ(x) = ∩_{ε>0} ∩_{μ(N)=0} K[ψ(B(x, ε) − N)]

where μ(N) is the Lebesgue measure of the set N and B(x, ε) = {y : ||y − x|| ≤ ε}. A solution of (2) is an absolutely continuous function x(t) defined on an interval [t0, t1] (t0 ≤ t1 ≤ +∞), which satisfies x(t0) = x0 and the differential inclusion

    dx/dt ∈ φ(x), a.a. t ∈ [t0, t1].

Definition 4: Let E ⊂ Rn be a nonempty convex set. A function ϕ : E → R is said to be convex on E if, for any x′, x″ ∈ E and 0 ≤ α ≤ 1, we have

    ϕ(αx′ + (1 − α)x″) ≤ αϕ(x′) + (1 − α)ϕ(x″).

Moreover, ϕ is said to be strictly convex on E if the strict inequality holds whenever x′ ≠ x″ and 0 < α < 1. ϕ is said to be strongly convex on E if

    ϕ(αx′ + (1 − α)x″) ≤ αϕ(x′) + (1 − α)ϕ(x″) − σα(1 − α)||x′ − x″||²

where σ is a positive constant.

Definition 5: Let E ⊂ Rn be a nonempty convex set. A set-valued map F : E → Rm is said to be monotone on E if, for any x′, x″ ∈ E, we have

    (ξ′ − ξ″)^T (x′ − x″) ≥ 0, ∀ ξ′ ∈ F(x′), ξ″ ∈ F(x″).

Moreover, F is said to be strictly monotone on E if the strict inequality holds whenever x′ ≠ x″. F is said to be strongly monotone on E if

    (ξ′ − ξ″)^T (x′ − x″) ≥ σ||x′ − x″||², ∀ ξ′ ∈ F(x′), ξ″ ∈ F(x″)

where σ is a positive constant. A continuous function ϕ(x) is convex (strictly convex, strongly convex) if and only if its generalized gradient ∂ϕ(x) is a monotone (strictly monotone, strongly monotone) mapping.

Definition 6 ([36]): Let E ⊂ Rn be a nonempty convex set. A function ϕ : E → R is said to be pseudoconvex on E if, for any x′, x″ ∈ E, we have

    ∃ η ∈ ∂ϕ(x′) : η^T (x″ − x′) ≥ 0 ⇒ ϕ(x″) ≥ ϕ(x′).
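To make the monotonicity notion in Definition 5 concrete, the following minimal sketch (not part of the original paper; the test function f(x) = ||x||_1 and the sampling scheme are illustrative choices) checks the defining inequality numerically for the generalized gradient of a simple nonsmooth convex function.

```python
# Illustrative sketch (not from the paper): checking the monotonicity of the
# generalized gradient of the nonsmooth convex function f(x) = ||x||_1,
# whose generalized gradient at x has components sign(x_i) (any value in [-1, 1] at x_i = 0).
import numpy as np

def subgradient_l1(x):
    """Return one element of the generalized gradient of f(x) = ||x||_1."""
    return np.sign(x)          # sign(0) = 0 is an admissible choice in [-1, 1]

rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.normal(size=4).reshape(2, 2)
    xi1, xi2 = subgradient_l1(x1), subgradient_l1(x2)
    # Monotonicity inequality of Definition 5 for the subdifferential of a convex function
    assert (xi1 - xi2) @ (x1 - x2) >= -1e-12
print("monotonicity inequality of Definition 5 holds on all sampled pairs")
```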


It is noted that a convex function is obviously pseudoconvex.

Definition 7 ([36]): Let E ⊂ Rn be a nonempty convex set. A set-valued map F : E → Rm is said to be pseudomonotone on E if, for any x′, x″ ∈ E, we have

    ∃ ηx′ ∈ F(x′) : ηx′^T (x″ − x′) ≥ 0 ⇒ ∀ ηx″ ∈ F(x″) : ηx″^T (x″ − x′) ≥ 0.

Moreover, F is said to be strongly pseudomonotone on E if

    ∃ ηx′ ∈ F(x′) : ηx′^T (x″ − x′) ≥ 0 ⇒ ∀ ηx″ ∈ F(x″) : ηx″^T (x″ − x′) ≥ σ||x″ − x′||²

where σ is a positive constant. It is shown in [36] that a continuous function ϕ(x) is pseudoconvex if and only if its generalized gradient ∂ϕ(x) is a pseudomonotone mapping.

III. MODEL DESCRIPTION

This section describes the one-layer projection model for solving the nonsmooth optimization problem (1). First, some notations are introduced. In this paper, ||·|| denotes the Euclidean norm. The region where the constraints of problem (1) are satisfied (feasible region) is defined as S = {x ∈ Rn : Ax = b, x ∈ Ω}. The region where the equality constraints of problem (1) are satisfied is defined as E = {x ∈ Rn : Ax = b}; then, clearly S = E ∩ Ω. The optimal solution set of problem (1) is denoted as M. Throughout this paper, we assume that the optimal solution set of problem (1) is nonempty; i.e., M ≠ ∅.

The dynamic equations of the proposed projection neural network model for solving problem (1) are described as follows.

1) State equation:

    ε dy/dt ∈ −Pg(y) − (I − P)(y − g(y) + ∂f((I − P)g(y) + q)) + q.    (3)

2) Output equation:

    x = g(y)    (4)

where ε is a positive scaling constant, I is the identity matrix, P = A^T(AA^T)^{-1}A, q = A^T(AA^T)^{-1}b, ∂f is the generalized gradient of f, and g : Rn → Ω is a projection operator defined by

    g(u) = arg min_{v∈Ω} ||u − v||.

In general, the calculation of the projection of a point onto a convex set is nontrivial. However, if Ω is a box set or a sphere set, the calculation is straightforward. For example, if Ω = {u ∈ Rn : li ≤ ui ≤ hi, i = 1, 2, . . . , n}, then g(u) = [g(u1), g(u2), . . . , g(un)]^T and

    g(ui) = hi,  if ui > hi;  ui,  if li ≤ ui ≤ hi;  li,  if ui < li.

If Ω = {u ∈ Rn : ||u − s|| ≤ r, s ∈ Rn, r > 0}, then

    g(u) = u,  if ||u − s|| ≤ r;  s + r(u − s)/||u − s||,  if ||u − s|| > r.

The matrix P, called the projection matrix, has several desirable properties, such as symmetry, P² = P, (I − P)² = I − P, P(I − P) = 0, and ||P|| = 1, which can be derived directly from the definition of P.

Lemma 1 ([22]): For the projection operator g(x), the following inequality holds:

    (u − g(u))^T (g(u) − v) ≥ 0, ∀ u ∈ Rn, v ∈ Ω.

Furthermore, according to Lemma 1, we have the following result.

Lemma 2 ([24]): For the projection operator g(x), the following inequality holds:

    (u − v)^T (g(u) − g(v)) ≥ ||g(u) − g(v)||², ∀ u, v ∈ Rn.

Remark 1: In [27], we investigated quadratic programming problems in which the objective functions only need to be convex on the equality constraints. However, the method used in [27] is not suitable for general constrained nonlinear programming problems. Moreover, to obtain the global convergence of the neural network proposed in [27], the parameter in the model needs to be larger than an estimated lower bound. For more general nonlinear nonsmooth optimization, the proposed neural network in (3) and (4) is capable of solving problem (1) and its global convergence can be guaranteed, which will be proved in the next section.

Remark 2: Recurrent neural networks for solving convex optimization problems with convex objective functions have been well studied in the literature [7], [20], [46], [48]. Recently, some recurrent neural networks based on the penalty parameter method have been proposed for nonsmooth convex and nonconvex optimization problems [4], [14], [25], [30]. However, to guarantee the optimality and convergence of those neural networks, the penalty parameters need to be larger than estimated lower bounds. To avoid the inconvenience of estimating the penalty parameters, the proposed neural network in (3) and (4) does not contain any penalty parameter. Thus the proposed neural network in this paper is more convenient for solving (1).
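To illustrate the quantities entering the model (3) and (4), the following sketch (not from the paper; the equality constraint data A and b, the bounds, the objective, the value of ε, and the explicit-Euler step are all assumptions made for illustration) forms the projection matrix P and the vector q, implements the box projection g, checks the stated properties of P, and takes Euler steps of a single-valued selection of the state equation (3).

```python
# Minimal sketch (assumed data; not from the paper): building blocks of the
# projection neural network (3)-(4) for a box-constrained instance of problem (1).
import numpy as np

A = np.array([[1.0, -1.0]])          # full row-rank equality constraint matrix (assumed)
b = np.array([1.0])
lo, hi = np.array([-2.0, -2.0]), np.array([2.0, 2.0])   # bound constraints (assumed)

AAT_inv = np.linalg.inv(A @ A.T)
P = A.T @ AAT_inv @ A                # projection matrix onto the row space of A
q = A.T @ AAT_inv @ b
I = np.eye(A.shape[1])

# Properties of P stated in Section III
assert np.allclose(P, P.T) and np.allclose(P @ P, P) and np.allclose(P @ (I - P), 0)

def g(u):
    """Projection onto the box set Omega = {u : lo <= u <= hi}."""
    return np.clip(u, lo, hi)

def subgradient_f(x):
    """One element of the generalized gradient of an assumed objective,
    here f(x) = |x1| + |x2| for illustration only."""
    return np.sign(x)

def euler_step(y, eps=1e-5, h=1e-7):
    """One explicit-Euler step of the state equation (3) with a single
    subgradient selection (a sketch of the dynamics, not a full solver)."""
    gamma = subgradient_f((I - P) @ g(y) + q)
    dy = -P @ g(y) - (I - P) @ (y - g(y) + gamma) + q
    return y + (h / eps) * dy

y = np.array([0.5, -0.3])
for _ in range(2000):
    y = euler_step(y)
print("output x =", g(y))            # should approach a solution of the assumed instance
```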


Problem (1) has two special cases. The first one is with only equality constraints, described as follows:

    minimize f(x), subject to Ax = b.    (5)

The second one is with only bound constraints, described as follows:

    minimize f(x), subject to x ∈ Ω.    (6)

For problem (5), the proposed neural network can be described as the following system:

    ε dx/dt ∈ −Px − (I − P)∂f((I − P)x + q) + q.    (7)

It is considered as an improved version of the model investigated in [28].

For problem (6), the proposed neural network can be described as follows.

1) State equation:

    ε dy/dt ∈ −y + g(y) − ∂f(g(y)).    (8)

2) Output equation:

    x = g(y).    (9)

It is considered as an extension of the model investigated in [44] for nonsmooth optimization.

Throughout this paper, we denote X = {v ∈ Rn : v = (I − P)x + q, x ∈ Ω} and assume that the generalized gradient of the objective function f(x) of problem (1) is bounded on X. It is clear that X ⊂ E from the definitions of P and q. It is easy to show that S ⊂ X. Consequently, S ⊂ X ⊂ E.

A comparison of the proposed neural network with several other neural networks for solving problem (1) is shown in Table I. We can see that the neural network proposed here has the least number of neurons among the existing ones. The other neural network models in Table I require the objective function f(x) to be convex in Rn or include a penalty parameter. However, here the objective function f(x) may be nonconvex for the global convergence of the proposed neural network. Moreover, there is no penalty parameter in the proposed neural network.

TABLE I
COMPARISON OF RELATED NEURAL NETWORKS IN TERMS OF MODEL COMPLEXITY AND CONVERGENCE CRITERIA

Model Type                    | Layer(s) | Neurons | Penalty Parameter(s) | Convergence Condition                   | Reference(s)
Lagrangian network            | 2        | 3n + m  | No                   | f(x) is strictly convex                 | [49]
Primal-dual network           | 2        | 3n + m  | No                   | f(x) is convex                          | [43]
Projection network            | 2        | n + m   | No                   | f(x) is (strictly) convex               | [16], [38], [47]
Dual network                  | 1        | n + m   | No                   | f(x) is strictly convex and quadratic   | [45]
Simplified dual network       | 1        | n       | No                   | f(x) is strictly convex and quadratic   | [31]
One-layer network             | 1        | n       | Yes                  | f(x) is convex on S                     | [30]
One-layer projection network  | 1        | n       | No                   | f(x) is convex (pseudoconvex) on X      | This paper

IV. THEORETICAL ANALYSIS

To obtain the optimal solutions of problem (1) using recurrent neural networks, two main issues need to be addressed. The first one is the optimality of the neural networks; i.e., we need to find the relationship between the optimal solutions and the outputs (or equilibrium points) of the neural networks. In general, the outputs (or equilibrium points) correspond to the optimal solutions. The second one is the convergence of the neural network. To show the performance of the neural networks for solving optimization problems, the convergence is most important. In this section, these two issues of the proposed neural network will be investigated in detail.

Definition 8: ȳ ∈ Rn is said to be an equilibrium point of system (3) if

    0 ∈ Pg(ȳ) + (I − P)(ȳ − g(ȳ) + ∂f((I − P)g(ȳ) + q)) − q    (10)

i.e., if there exists γ̄ ∈ ∂f((I − P)g(ȳ) + q) such that

    Pg(ȳ) + (I − P)(ȳ − g(ȳ) + γ̄) − q = 0.    (11)

A. Optimality Analysis

From the definitions of P and q, it is easy to obtain the following lemma.

Lemma 3: Assume A is full row-rank. For any x ∈ Rn, Ax = b if and only if Px = q, where P and q are defined in (3).

The optimality of the proposed neural network can be described as the following theorem.

Theorem 1: Assume that the objective function f(x) in problem (1) is pseudoconvex on the feasible region S. Then x* ∈ Rn is an optimal solution of problem (1) if and only if there exists an equilibrium point y* ∈ Rn of system (3) such that x* = g(y*).

Proof: Assume x* ∈ Rn to be an optimal solution of problem (1). According to the Karush–Kuhn–Tucker conditions [3] for problem (1), there exist v* ∈ Rm, w* ∈ Rn, and γ* ∈ ∂f(x*) such that

    γ* + A^T v* + w* = 0    (12)
    Ax* = b    (13)
    x* = g(x* + w*).    (14)

According to (14), let y* = x* + w*; then x* = g(y*). It follows that

    y* = g(y*) + w*.    (15)

From (12), w* = −γ* − A^T v*, and substituting it into (15), we have

    y* = g(y*) − γ* − A^T v*.    (16)

Multiplying both sides of (16) with A results in

    v* = (AA^T)^{-1} A(g(y*) − y* − γ*).    (17)

Substituting (17) into (16), it follows that

    (I − P)(y* − g(y*) + γ*) = 0.    (18)

According to (13) and Lemma 3, we have Px* = q; i.e., Pg(y*) = q. Combining with (18), one gets Pg(y*) + (I − P)(y* − g(y*) + γ*) − q = 0. Thus y* is an equilibrium point of system (3).

Next, we prove that the opposite direction of the theorem is true. Assume ȳ ∈ Rn to be an equilibrium point of system (3) and x̄ = g(ȳ). According to Definition 8, there exists γ̄ ∈ ∂f((I − P)x̄ + q) such that

    P x̄ + (I − P)(ȳ − x̄ + γ̄) − q = 0    (19)


where x̄ = g(ȳ). Multiplying both sides of (19) with P, since P² = P and Pq = q, it follows that P x̄ = q and

    (I − P)(ȳ − x̄ + γ̄) = 0.    (20)

Thus, according to Lemma 3, x̄ = g(ȳ) is a feasible solution of problem (1). For any x ∈ S, from (20), we have (x − x̄)^T(I − P)(ȳ − x̄ + γ̄) = 0. Because Px = P x̄ = q for x ∈ S, it follows that

    (x − x̄)^T(ȳ − x̄ + γ̄) = 0.    (21)

According to Lemma 1, for x ∈ S ⊂ Ω, (x − x̄)^T(x̄ − ȳ) = (x − g(ȳ))^T(g(ȳ) − ȳ) ≥ 0. Then, (21) implies (x − x̄)^T γ̄ = (x − x̄)^T(x̄ − ȳ) ≥ 0. Since f(x) is pseudoconvex on S, we have f(x) ≥ f(x̄). Thus x̄ is an optimal solution of problem (1).

According to the proof of Theorem 1, it is easy to get the following corollary.

Corollary 1: An optimal solution of problem (1) is x̄ = g(ȳ) if the objective function f(x) is pseudoconvex at x̄, where ȳ is an equilibrium point of system (3).

Remark 3: From the above analysis, we know that if the state vector y(t) of the neural network in (3) and (4) converges to an equilibrium point ȳ, and f(x) is pseudoconvex at x̄ = g(ȳ), then the output vector x(t) of the neural network converges to an optimal solution of problem (1).

B. Convergence Analysis

In this subsection, the convergence property of the proposed neural network in (3) and (4) is discussed by using the Lyapunov method and nonsmooth analysis [11]–[13], [32].

Definition 9: The output vector of the neural network in (3) and (4) is said to be globally convergent to the optimal solution set M of problem (1) if, for any initial value y0 = y(t0) ∈ Rn,

    lim_{t→∞} dist(x(t), M) = 0

where dist(x(t), M) is the distance between x(t) and M, which is defined as dist(x, M) = min_{u∈M} ||x − u||.

To prove the global convergence of the proposed neural network, we first define

    V0(y) = ||y − g(ȳ)||² − ||y − g(y)||²    (22)

where ȳ is an equilibrium point of system (3). Then V0(y) has the properties given in the following lemma.

Lemma 4: For any y ∈ Rn, we have
(i) V0(y) ≥ ||g(y) − g(ȳ)||²;
(ii) V0(y) is differentiable and its gradient is ∇V0(y) = 2(g(y) − g(ȳ)).

Proof: (i) By simple calculation, we have V0(y) − ||g(y) − g(ȳ)||² = 2(y − g(y))^T(g(y) − g(ȳ)). According to Lemma 1, V0(y) − ||g(y) − g(ȳ)||² ≥ 0. Thus the result holds.

(ii) Assume ϕ(y) = ||y − g(y)||². Then

    ϕ(y) = min_{z∈Ω} ||y − z||².    (23)

Since the minimum on the right-hand side of (23) is uniquely attained at z = g(y), it follows from [2, Ch. 4, Th. 1.7] that ϕ(y) is differentiable and

    ∇ϕ(y) = ∇y ||y − z||² |_{z=g(y)} = 2(y − g(y)).    (24)

Furthermore, it is easy to get that ∇V0(y) = 2(g(y) − g(ȳ)).
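Both parts of Lemma 4 can be checked numerically. The sketch below (illustrative only; the box set Ω and the point playing the role of the equilibrium ȳ are made-up) verifies property (i) directly and property (ii) by central finite differences.

```python
# Sketch (assumed data): numerical check of Lemma 4 for a box projection g.
import numpy as np

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
g = lambda u: np.clip(u, lo, hi)          # projection onto the box Omega

y_bar = np.array([2.0, -0.5])             # plays the role of an equilibrium point (assumed)
V0 = lambda y: np.sum((y - g(y_bar))**2) - np.sum((y - g(y))**2)

rng = np.random.default_rng(1)
for _ in range(200):
    y = rng.uniform(-3, 3, size=2)
    # Property (i): V0(y) >= ||g(y) - g(y_bar)||^2
    assert V0(y) >= np.sum((g(y) - g(y_bar))**2) - 1e-9
    # Property (ii): gradient of V0 equals 2(g(y) - g(y_bar)), checked by central differences
    h, grad_fd = 1e-6, np.zeros(2)
    for i in range(2):
        e = np.zeros(2); e[i] = h
        grad_fd[i] = (V0(y + e) - V0(y - e)) / (2 * h)
    assert np.allclose(grad_fd, 2 * (g(y) - g(y_bar)), atol=1e-4)
print("Lemma 4 (i) and (ii) verified on all sampled points")
```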


Now, we present the global convergence of the proposed neural network.

Theorem 2: Assume that the objective function f(x) in problem (1) is convex on X. For any initial value y0 = y(t0) ∈ Rn, the output vector of the neural network in (3) and (4) is globally convergent to the optimal solution set M if y(t) is bounded and

    ∀x ∈ S, (x − x*)^T(γ − γ*) = 0 ⇔ x ∈ M

where x* ∈ M, γ ∈ ∂f(x), and γ* ∈ ∂f(x*).

Proof: Since f(x) is convex on X, it is convex on the feasible region S. According to Theorem 1, x̄ = g(ȳ) is an optimal solution of problem (1), where ȳ is an equilibrium point of system (3). Since ȳ is an equilibrium point of system (3), there exists γ̄ ∈ ∂f((I − P)x̄ + q) such that

    P x̄ + (I − P)(ȳ − x̄ + γ̄) − q = 0.    (25)

By substituting (25) into (3), it follows that

    ε dy/dt ∈ −P(x − x̄) − (I − P)(y − x + ∂f((I − P)x + q) − ȳ + x̄ − γ̄)

where x = g(y). Consider the Lyapunov function

    V(y) = (ε/2) [V0(y) + (y − ȳ)^T P(y − ȳ)]    (26)

where V0(y) is defined in (22). From Lemma 4(ii), we have

    ∇V(y) = ε [x − x̄ + P(y − ȳ)].

According to the chain rule, it follows that V(y(t)) is differentiable for almost all (a.a.) t ≥ t0 and it results in

    V̇(y(t)) = (∇V(y))^T ẏ(t)
             ≤ sup_{γ∈∂f(z)} [x − x̄ + P(y − ȳ)]^T [−P(x − x̄) − (I − P)(y − x + γ − ȳ + x̄ − γ̄)]
             = −(x − x̄)^T P(x − x̄) − (y − ȳ)^T P(x − x̄) − (x − x̄)^T(I − P)(y − ȳ) + (x − x̄)^T(I − P)(x − x̄) − inf_{γ∈∂f(z)} (x − x̄)^T(I − P)(γ − γ̄)

where z = (I − P)x + q. Since ||I − P|| ≤ 1, combining with Lemma 2, we have

    (x − x̄)^T(I − P)(x − x̄) ≤ ||x − x̄||² = ||g(y) − g(ȳ)||² ≤ (y − ȳ)^T(g(y) − g(ȳ)) = (x − x̄)^T(y − ȳ).

Then, since P x̄ = q, we have

    V̇(y(t)) ≤ −(x − x̄)^T P(x − x̄) − inf_{γ∈∂f(z)} (x − x̄)^T(I − P)(γ − γ̄)
             = −||Px − q||² − inf_{γ∈∂f(z)} (z − z̄)^T(γ − γ̄)

where z̄ = (I − P)x̄ + q = x̄. Since f is convex on X, ∂f is monotone on X. Then V̇(y(t)) ≤ 0.

Define H(y) = ||Px − q||² + inf_{γ∈∂f(z)} (z − z̄)^T(γ − γ̄). Assume there exists y̌ ∈ Rn such that x̌ = g(y̌) is an optimal solution of problem (1). Thus P x̌ = q. From the condition, (ž − z̄)^T(γ̌ − γ̄) = (x̌ − x̄)^T(γ̌ − γ̄) = 0, where ž = (I − P)x̌ + q = x̌ and γ̌ ∈ ∂f((I − P)x̌ + q). Thus H(y̌) = 0. Conversely, if there exists ŷ ∈ Rn such that H(ŷ) = 0, we have P x̂ = q, where x̂ = g(ŷ). Combining with ∂f(ẑ) being a compact convex subset of Rn, there exists γ̂ ∈ ∂f(ẑ) such that (ẑ − z̄)^T(γ̂ − γ̄) = 0, where ẑ = (I − P)x̂ + q = x̂. From the condition, x̂ = ẑ is an optimal solution of problem (1). Consequently, there exists y ∈ Rn such that H(y) = 0 if and only if x = g(y) is an optimal solution of problem (1).

From the boundedness of y(t) and (3), we infer that ||ẏ(t)|| is also bounded; denote this bound by M0. Then, there exists an increasing sequence {tk} with lim_{k→∞} tk = ∞ and a limit point ỹ such that lim_{k→∞} y(tk) = ỹ. Next, inspired by the proof in [23], we prove that H(ỹ) = 0. If it does not hold, that is, H(ỹ) > 0, then, since H(y) is lower semicontinuous by its definition, there exist δ > 0 and ε0 > 0 such that H(y) > ε0 for all y ∈ B(ỹ, δ), where B(ỹ, δ) = {y ∈ Rn : ||y − ỹ|| ≤ δ} is the δ neighborhood of ỹ. Since lim_{k→∞} y(tk) = ỹ, there exists a positive integer N such that, for all k ≥ N, ||y(tk) − ỹ|| ≤ δ/2. When t ∈ [tk − δ/(4M0), tk + δ/(4M0)] and k ≥ N, we have

    ||y(t) − ỹ|| ≤ ||y(t) − y(tk)|| + ||y(tk) − ỹ|| ≤ M0|t − tk| + δ/2 ≤ δ.

It follows that H(y(t)) > ε0 for all t ∈ [tk − δ/(4M0), tk + δ/(4M0)]. Since the Lebesgue measure of the set ∪_{k≥N} [tk − δ/(4M0), tk + δ/(4M0)] is infinite, we have

    ∫_{t0}^{∞} H(y(t)) dt = ∞.    (27)

Since V̇(y(t)) ≤ 0, V(y(t)) is monotonically nonincreasing. Combining with V(y(t)) ≥ 0, there exists a constant Vc such that lim_{t→∞} V(y(t)) = Vc. We have

    ∫_{t0}^{∞} H(y(t)) dt = lim_{s→∞} ∫_{t0}^{s} H(y(t)) dt ≤ −lim_{s→∞} ∫_{t0}^{s} V̇(y(t)) dt = −lim_{s→∞} [V(y(s)) − V(y(t0))] = −Vc + V(y(t0))

which contradicts (27). Therefore, we have H(ỹ) = 0, and then x̃ = g(ỹ) is an optimal solution of problem (1). That is, any limit point of the output vector is an optimal solution of problem (1). Then

    lim_{t→∞} dist(x(t), M) = 0.

Consequently, any trajectory of the output vector of the neural network is globally convergent to the optimal solution set M of problem (1).

As a special case of Theorem 2, if the objective function in problem (1) is strictly convex on X, the condition in Theorem 2 is obviously true since ∂f is strictly monotone. Then one gets the following corollary.

Corollary 2: Assume that the objective function f(x) in problem (1) is strictly convex on X. For any initial value y0 = y(t0) ∈ Rn, the output vector of the neural network in (3) and (4) is globally convergent to the unique optimal solution of problem (1) if y(t) is bounded.

Next, we investigate the models of the two special cases in Section III.

Corollary 3: Assume that the objective function f(x) in problem (5) is pseudoconvex on the equality constraint set E. For any initial value x0 = x(t0) ∈ Rn, the state vector of the neural network in (7) is globally convergent to the optimal solution set M1

    lim_{t→∞} dist(x(t), M1) = 0

where M1 is the optimal solution set of problem (5).

Proof: Let x̄ be an equilibrium point of system (7). According to Theorem 1, x̄ is an optimal solution of problem (5). Consider the Lyapunov function

    V1(x) = (ε/2) (x − x̄)^T(I + P)(x − x̄).

By using the chain rule, it follows that V1(x(t)) is differentiable for a.a. t ≥ t0 and it results in

    V̇1(x(t)) = (∇V1(x))^T ẋ(t)
             ≤ sup_{γ∈∂f(z)} (x − x̄)^T(I + P)(−Px − (I − P)γ + q)
             = sup_{γ∈∂f(z)} (x − x̄)^T(I + P)(−P(x − x̄) − (I − P)γ)

where z = (I − P)x + q and the last equality holds since q = P x̄. Furthermore, it follows that

    V̇1(x(t)) ≤ sup_{γ∈∂f(z)} (x − x̄)^T(I + P)(−P(x − x̄) − (I − P)γ)
             = −2(x − x̄)^T P(x − x̄) − inf_{γ∈∂f(z)} (x − x̄)^T(I − P)γ
             = −2(x − x̄)^T P(x − x̄) − inf_{γ∈∂f(z)} (z − z̄)^T γ


where z̄ = (I − P)x̄ + q = x̄. Then,

    V̇1(x(t)) ≤ −2(x − x̄)^T P(x − x̄) − inf_{γ∈∂f(z)} (z − z̄)^T γ
             = −2||Px − q||² − inf_{γ∈∂f(z)} (z − z̄)^T γ.

Since x̄ is an optimal solution of problem (5), from (18), we have (x − x̄)^T(I − P)γ̄ = 0; i.e., (z − z̄)^T γ̄ = 0, where γ̄ ∈ ∂f(x̄). Since f is pseudoconvex on E, it is pseudomonotone on E. Then (z − z̄)^T γ ≥ 0 for any γ ∈ ∂f(z) and it results in V̇1(x(t)) ≤ 0. Furthermore, from the definition of V1(x), V1(x) ≥ ε||x − x̄||²/2. Consequently, x(t) is bounded.

Define H1(x) = 2||Px − q||² + inf_{γ∈∂f(z)} (z − z̄)^T γ. Assume there exists x̌ ∈ Rn that is an optimal solution of problem (5). We have P x̌ = q and (x̌ − x̄)^T(I − P)γ̌ = 0 from (18), where γ̌ ∈ ∂f(x̌). Then it results in H1(x̌) = 0. Conversely, if there exists x̂ ∈ Rn such that H1(x̂) = 0, we have P x̂ = q. Combining with the fact that ∂f(ẑ) is a compact convex subset of Rn, there exists γ̂ ∈ ∂f(ẑ) such that (ẑ − z̄)^T γ̂ = 0, where ẑ = (I − P)x̂ + q = x̂. Since f is pseudoconvex on E, it follows that f(z̄) ≥ f(ẑ). Thus x̂ = ẑ is also an optimal solution of problem (5) as x̄ = z̄ is an optimal solution. Therefore, there exists x ∈ Rn such that H1(x) = 0 if and only if x is an optimal solution of problem (5). As the remainder of the proof is similar to that of Theorem 2, it is omitted here.

Corollary 4: Assume that the objective function f(x) in problem (6) is pseudoconvex on the bound constraint set Ω. For any initial value y0 = y(t0) ∈ Rn, the output vector of the neural network in (8) and (9) is globally convergent to the optimal solution set M2 if y(t) is bounded

    lim_{t→∞} dist(x(t), M2) = 0

where M2 is the optimal solution set of problem (6).

Proof: Let ȳ be an equilibrium point of system (8). According to Theorem 1, x̄ = g(ȳ) is an optimal solution of problem (6). Consider the Lyapunov function

    V2(y) = (ε/2) V0(y)

where V0(y) is defined in (22). According to Lemma 4(ii), we have ∇V2(y) = ε(x − x̄), where x = g(y) and x̄ = g(ȳ). By using the chain rule, it follows that V2(y(t)) is differentiable for a.a. t ≥ t0 and it results in

    V̇2(y(t)) = (∇V2(y))^T ẏ(t)
             ≤ sup_{γ∈∂f(x)} (x − x̄)^T(−y + x − γ)
             = −(x − x̄)^T(y − x) − inf_{γ∈∂f(x)} (x − x̄)^T γ.

According to Lemma 1, we have (x − x̄)^T(y − x) = (g(y) − g(ȳ))^T(y − g(y)) ≥ 0. Then

    V̇2(y(t)) ≤ −inf_{γ∈∂f(x)} (x − x̄)^T γ.

Since x̄ is an optimal solution of problem (6), from (18), we have (x − x̄)^T(ȳ − g(ȳ) + γ̄) = 0, where γ̄ ∈ ∂f(x̄). According to Lemma 1, we have (x̄ − x)^T(ȳ − g(ȳ)) = (g(ȳ) − g(y))^T(ȳ − g(ȳ)) ≥ 0. Then (x − x̄)^T γ̄ ≥ 0. Since f is pseudoconvex on Ω, ∂f is pseudomonotone on Ω. Then (x − x̄)^T γ ≥ 0 for any γ ∈ ∂f(x) and it results in V̇2(y(t)) ≤ 0.

Define H2(y) = inf_{γ∈∂f(x)} (x − x̄)^T γ. Assume there exists x̌ as an optimal solution of problem (6). On one hand, according to Theorem 1, there exists an equilibrium point y̌ ∈ Rn of system (8) such that x̌ = g(y̌). From (18), (x̌ − x̄)^T(y̌ − g(y̌) + γ̌) = 0, where γ̌ ∈ ∂f(x̌). According to Lemma 1, we have (x̌ − x̄)^T(y̌ − g(y̌)) = (g(y̌) − g(ȳ))^T(y̌ − g(y̌)) ≥ 0. Then (x̌ − x̄)^T γ̌ ≤ 0. On the other hand, from (18), since x̄ is an optimal solution of problem (6), (x̌ − x̄)^T(ȳ − g(ȳ) + γ̄) = 0, where γ̄ ∈ ∂f(x̄). According to Lemma 1, we have (x̄ − x̌)^T(ȳ − g(ȳ)) = (g(ȳ) − g(y̌))^T(ȳ − g(ȳ)) ≥ 0. Then (x̌ − x̄)^T γ̄ ≥ 0. Since f is pseudoconvex on Ω, ∂f is pseudomonotone on Ω. Then (x̌ − x̄)^T γ̌ ≥ 0. Hence (x̌ − x̄)^T γ̌ = 0. Then H2(y̌) = 0. Conversely, if there exists ŷ ∈ Rn such that H2(ŷ) = 0, combining with the fact that ∂f(x̂) is a compact convex subset of Rn, there exists γ̂ ∈ ∂f(x̂) such that (x̂ − x̄)^T γ̂ = 0, where x̂ = g(ŷ). Since f is pseudoconvex on Ω, it follows that f(x̄) ≥ f(x̂). Thus x̂ is also an optimal solution of problem (6) as x̄ is an optimal solution. Therefore, there exists y ∈ Rn such that H2(y) = 0 if and only if x = g(y) is an optimal solution of problem (6). As the remainder of the proof is similar to that of Theorem 2, it is omitted here.

C. Convergence Rate Analysis

It is well known that studying the convergence rate of neural networks is useful for real applications. This subsection studies the convergence rate of the proposed neural network. The results are stated as the following theorem.

Theorem 3: Assume that the objective function f(x) in problem (1) is strongly convex on X. For any initial value y0 = y(t0) ∈ Rn, the output vector of the neural network in (3) and (4) is globally convergent to the unique optimal solution of problem (1) if y(t) is bounded, and the convergence rate of the output vector is described by

    ||x(t) − x̄||² ≤ V0(y(t0)) + (y(t0) − ȳ)^T P(y(t0) − ȳ) − (2/ε) ∫_{t0}^{t} [ ||Px(s) − q||² + σ||(I − P)(x(s) − x̄)||² ] ds    (28)

where V0(y) is defined in (22) and σ is a positive constant.

Proof: Since f(x) is strongly convex on X, for any z = (I − P)x + q ∈ X, we have

    (z − z̄)^T(γ − γ̄) ≥ σ||z − z̄||² = σ||(I − P)(x − x̄)||²

where σ is a positive constant, z̄ = (I − P)x̄ + q, γ ∈ ∂f(z), and γ̄ ∈ ∂f(z̄). According to the proof of Theorem 2, with the same Lyapunov function as in (26), it follows that

    V̇(y(t)) ≤ −||Px − q||² − inf_{γ∈∂f(z)} (z − z̄)^T(γ − γ̄)
            ≤ −||Px − q||² − σ||(I − P)(x − x̄)||²



Fig. 1. Transient behaviors of the state variables of the neural network in (3) and (4) for solving (30) in Example 1.

Fig. 2. Transient behaviors of the output variables of the neural network in (3) and (4) for solving (30) in Example 1.

where σ is a positive constant. Then

    V(y(t)) ≤ V(y(t0)) − ∫_{t0}^{t} [ ||Px(s) − q||² + σ||(I − P)(x(s) − x̄)||² ] ds.    (29)

According to Corollary 2, the output vector of the neural network in (3) and (4) is globally convergent to the unique optimal solution of problem (1). Furthermore, the convergence rate can be derived from (29). In fact, from Lemma 4(i), V(y(t)) ≥ ε||x − x̄||²/2. Combining this with (29) and V(y(t0)) = ε[V0(y(t0)) + (y(t0) − ȳ)^T P(y(t0) − ȳ)]/2, it follows (28).

Remark 4: According to the formula in (28), we have

    ∫_{t0}^{t} [ ||Px(s) − q||² + σ||(I − P)(x(s) − x̄)||² ] ds ≤ (ε/2) [ V0(y(t0)) + (y(t0) − ȳ)^T P(y(t0) − ȳ) ].

Thus, for a fixed initial value y(t0), the smaller the ε, the smaller ||Px(s) − q||² + σ||(I − P)(x(s) − x̄)||² will be; i.e., the faster x(t) converges to the unique optimal solution.

Similar to the proof of Theorem 3, we can derive the following two corollaries.

Corollary 5: Assume that ∂f(x) is strongly pseudomonotone on the equality constraint set E. For any initial value x0 = x(t0) ∈ Rn, the state vector of the neural network in (7) is globally convergent to the unique optimal solution of problem (5), and the convergence rate of the state vector is described by

    ||x(t) − x̄||² ≤ (x(t0) − x̄)^T(I + P)(x(t0) − x̄) − (2/ε) ∫_{t0}^{t} [ 2||Px(s) − q||² + σ||(I − P)(x(s) − x̄)||² ] ds.

Corollary 6: Assume that ∂f(x) is strongly pseudomonotone on the bound constraint set Ω. For any initial value y0 = y(t0) ∈ Rn, the output vector of the neural network in (8) and (9) is globally convergent to the unique optimal solution of problem (6) if y(t) is bounded, and the convergence rate of the output vector is described by

    ||x(t) − x̄||² ≤ V0(y(t0)) − (2σ/ε) ∫_{t0}^{t} ||x(s) − x̄||² ds

where V0(y) is defined in (22).

V. SIMULATION RESULTS

In the following, four examples of optimization problems are solved using the proposed neural network.

Example 1: Consider the following nonlinear optimization problem:

    minimize f(x) = −3x1² + x2² + 2x1x2 + 6x1 − 2x2 − e^{x1} + e^{x2+2}
    subject to x1 − x2 = 1, −2 ≤ x1, x2 ≤ 2.    (30)

It is obvious that the objective function f(x) is nonconvex. However, if we substitute x2 = x1 − 1 into the objective function, then f̃(x1) = e^{x1+1} − e^{x1} + 3 is strictly convex. Consequently, the objective function of the problem is strictly convex on the equality constraint. Here, we use this simple example with simulations to show the performance of the proposed neural network compared with some other models in the literature.

Assume ε = 10^{-5} in the neural network (3). Figs. 1 and 2, respectively, depict the transient behaviors of the state vector y(t) and the output vector x(t) with 10 random initial values. Fig. 3 depicts the two-dimensional phase plot of (x1, x2)^T from 10 random initial points, in which the dashed line and rectangles indicate the equality and bound constraints, respectively. The simulation results show that the output variables are globally convergent to the unique optimal solution x* = (−1, −2)^T within the bound constraints.

However, many other neural networks proposed in the literature may not be used for solving this problem.
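To reproduce the qualitative behavior reported for Example 1, one can discretize the network (3) and (4) directly. The sketch below is an assumption-laden illustration (the forward-Euler discretization, step size, iteration count, and the value of ε are choices of this sketch, not the authors' simulation setup); since f in (30) is differentiable, its generalized gradient reduces to the ordinary gradient.

```python
# Sketch (assumed discretization): the network (3)-(4) applied to Example 1, problem (30).
import numpy as np

A = np.array([[1.0, -1.0]]); b = np.array([1.0])
lo, hi = np.array([-2.0, -2.0]), np.array([2.0, 2.0])

AAT_inv = np.linalg.inv(A @ A.T)
P = A.T @ AAT_inv @ A
q = A.T @ AAT_inv @ b
I2 = np.eye(2)
g = lambda u: np.clip(u, lo, hi)            # box projection

def grad_f(x):
    """Gradient of f(x) = -3x1^2 + x2^2 + 2x1x2 + 6x1 - 2x2 - exp(x1) + exp(x2 + 2)
    (f is differentiable, so its generalized gradient is the usual gradient)."""
    x1, x2 = x
    return np.array([-6*x1 + 2*x2 + 6 - np.exp(x1),
                      2*x2 + 2*x1 - 2 + np.exp(x2 + 2)])

eps, h = 1e-5, 1e-8                         # scaling constant and Euler step (assumed)
rng = np.random.default_rng(2)
y = rng.uniform(-2, 2, size=2)              # random initial state
for _ in range(20000):
    x = g(y)
    dy = -P @ x - (I2 - P) @ (y - x + grad_f((I2 - P) @ x + q)) + q
    y = y + (h / eps) * dy
print("output x(t) ->", g(y))               # expected to approach x* = (-1, -2)^T as reported
```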


Fig. 3. Two-dimensional phase plot of output variables (x1, x2)^T of the neural network in (3) and (4) for solving (30) in Example 1.

Fig. 4. Transient behaviors of the neural network in [38] for solving (30) in Example 1.

Fig. 5. Transient behaviors of the neural network in [47] for solving (30) in Example 1.

Fig. 6. Transient behaviors of the neural network in [16] for solving (30) in Example 1.

We give some simulations shown in Figs. 4–7. Among them, Fig. 4 depicts the transient behaviors of the neural network proposed in [38] from 10 random initial points, which shows that the state variables are oscillatory within the bound constraints. Fig. 5 depicts the transient behaviors of the neural network proposed in [47] from 10 random initial points. We can observe that the state variables are not convergent. Moreover, if the bound constraints are unbounded, the neural network in [47] is divergent. In Fig. 6, the neural network proposed in [16] is used for solving this problem. However, we can observe that the state variables of the neural network are not convergent due to the nonconvexity of the objective function in problem (30). Simulations are further given for the more recent neural network model in [41], as shown in Fig. 7.

The simulation results shown in Figs. 4–7 illustrate that the neural networks in [16], [38], [41], and [47] are not capable of solving some optimization problems with nonconvex objective functions.

Example 2: Consider a nonsmooth optimization problem as follows:

    minimize f(x) = |x1 − x2 + 2x3 + 1| + |x2 − x3 + x4 − 2| − |x1 + x3 − 2|
    subject to x1 + x2 + x3 + x4 = 8, x1 + x2 − x3 − x4 = 2, x1, x4 ≥ 0, x2, x3 ∈ R.    (31)

In this problem, the objective function f(x) is nonsmooth and nonconvex. If we substitute the equality constraints into the objective function f(x), we can see that f(x) is convex with respect to x1 and x3. Thus the objective function of the problem is convex on the equality constraints.
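For problem (31), a gradient selection of the piecewise-linear objective is easy to write down and can be plugged into the same discretized dynamics used for Example 1. The sketch below is illustrative; at points where all three absolute-value terms are nonzero it returns the exact gradient, while the convention sign(0) = 0 at kink points is simply one convenient (assumed) choice.

```python
# Sketch (illustrative selection at kinks): the objective of Example 2 and one
# gradient selection for it, usable in a discretized version of (3)-(4).
import numpy as np

# The three affine terms (coefficients a, offset c, sign w) of
# f(x) = |x1 - x2 + 2x3 + 1| + |x2 - x3 + x4 - 2| - |x1 + x3 - 2|
TERMS = [
    (np.array([1.0, -1.0, 2.0, 0.0]),  1.0,  1.0),
    (np.array([0.0,  1.0, -1.0, 1.0]), -2.0,  1.0),
    (np.array([1.0,  0.0,  1.0, 0.0]), -2.0, -1.0),
]

def f31(x):
    return sum(w * abs(a @ x + c) for a, c, w in TERMS)

def gradient_selection_f31(x):
    """Exact gradient where every term is nonzero; sign(0) = 0 is one
    convenient (assumed) choice at kink points."""
    return sum(w * np.sign(a @ x + c) * a for a, c, w in TERMS)

x_star = np.array([0.0, 5.0, 3.0, 0.0])   # optimal solution reported for Example 2
print("f(x*) =", f31(x_star))
print("a gradient selection at x*:", gradient_selection_f31(x_star))
```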


Fig. 7. Transient behaviors of the neural network in [41] for solving (30) in Example 1.

Fig. 8. Transient behaviors of the state variables of the neural network in (3) and (4) for solving (31) in Example 2.

Fig. 9. Transient behaviors of the output variables of the neural network in (3) and (4) for solving (31) in Example 2.

Fig. 10. Transient behaviors of the state variables of the neural network in (3) and (4) for solving (32) in Example 3.

Consequently, the proposed neural network in (3) and (4) is capable of solving this problem. Let ε = 10^{-5} in the simulations. Figs. 8 and 9, respectively, depict the transient behaviors of the state vector y(t) and the output vector x(t) with 10 random initial values, which show that the output variables are globally convergent to the unique optimal solution x* = (0, 5, 3, 0)^T.

However, for the neural networks proposed in [4] and [30], in order to estimate the lower bounds of the penalty parameters, the feasible regions of the optimization problems need to be bounded. Therefore, the neural networks proposed in [4] and [30] are not suitable for solving the problem in (31) due to the unboundedness of its feasible region. Recently, a recurrent neural network with a hard-limiting activation function has been proposed for solving constrained optimization problems with piecewise-linear objective functions [29]. Compared with the neural network in [29] for solving problem (31), here we avoid using any penalty parameter in the design of the projection neural network in (3) and (4).

Example 3: Consider a quadratic fractional optimization problem as follows [25]:

    minimize f(x) = (x^T Q x + a^T x + a0) / (c^T x + c0)
    subject to x1 + x2 − x3 = 3, x1 − 2x2 + x4 = 0, 2 ≤ x1, x2, x3, x4 ≤ 4    (32)

where

    Q = [ −1    0.5   1     0
           0.5  5.5  −1    −0.5
           1   −1     1     0
           0   −0.5   0     1 ],
    a = (1, −1, −1, 0)^T, a0 = −2, c = (1, 1, 1, −1)^T, c0 = 6.
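The following short sketch only evaluates the data of problem (32) as reconstructed above (it is not an optimization routine): it sets up Q, a, and c, confirms the feasibility of the optimal solution x* = (8/3, 7/3, 2, 2)^T reported later for this example, and evaluates the fractional objective there.

```python
# Sketch (data as reconstructed from (32)): evaluating the quadratic fractional objective.
import numpy as np

Q = np.array([[-1.0, 0.5, 1.0, 0.0],
              [ 0.5, 5.5,-1.0,-0.5],
              [ 1.0,-1.0, 1.0, 0.0],
              [ 0.0,-0.5, 0.0, 1.0]])
a, a0 = np.array([1.0,-1.0,-1.0, 0.0]), -2.0
c, c0 = np.array([1.0, 1.0, 1.0,-1.0]),  6.0

f = lambda x: (x @ Q @ x + a @ x + a0) / (c @ x + c0)

x_star = np.array([8/3, 7/3, 2.0, 2.0])      # optimal solution reported for Example 3
print("denominator c^T x* + c0 =", c @ x_star + c0)    # positive on the feasible box
print("f(x*) =", f(x_star))
# Feasibility of x* with respect to the equality constraints of (32)
print("x1 + x2 - x3  =", x_star[0] + x_star[1] - x_star[2])     # should equal 3
print("x1 - 2x2 + x4 =", x_star[0] - 2*x_star[1] + x_star[3])   # should equal 0
```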



Fig. 11. Transient behaviors of the output variables of the neural network in (3) and (4) for solving (32) in Example 3.

As stated in [25], the objective function is pseudoconvex on the feasible region. Let ε = 10^{-5} in the simulations. Figs. 10 and 11, respectively, depict the transient behaviors of the state vector y(t) and the output vector x(t) with 10 random initial values. Fig. 10 shows that the state vector y(t) of the neural network is convergent to an equilibrium point. Thus, according to Corollary 1, the output vector x(t) is convergent to the unique optimal solution x* = (8/3, 7/3, 2, 2)^T, which is shown in Fig. 11. Compared with the neural network in [25] for solving this problem, no penalty parameter is included in the proposed neural network, which provides a potential application for pseudoconvex optimization.

Example 4: Consider minimizing the condition number of a nonzero matrix A as follows:

    minimize κ(A), subject to A ∈ Ω0    (33)

where A ∈ Rn×n is a symmetric matrix, Ω0 is a compact convex set in Rn×n, and the condition number κ(A) is defined as

    κ(A) = λmax(A) / λmin(A)

in which λmax and λmin denote the maximum and minimum eigenvalues of matrix A, respectively. The optimization of the condition number has been investigated in the recent literature (e.g., [6], [35], and the references therein).

Consider a diagonal matrix

    A = [ a^T x + a0      0
          0               c^T x + c0 ]

where x = (x1, x2, x3, x4)^T ∈ R4, a = (−2, −1, 2, 0)^T, c = (1, −1, 2, 1)^T, a0 = 4, c0 = 2, and x ∈ Ω = {x ∈ R4 : 0 ≤ x ≤ 1}. The condition number κ(A) is pseudoconvex with respect to x on Ω [35] and it can be written as

    κ(A) = (a^T x + a0)/(c^T x + c0),  if a^T x + a0 ≥ c^T x + c0
           (c^T x + c0)/(a^T x + a0),  if a^T x + a0 < c^T x + c0.
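The piecewise expression for κ(A) above translates directly into code. The sketch below is illustrative (the Euler discretization of the bound-constrained model (8) and (9), the step size, and ε mirror the assumptions used for Example 1 and are not the authors' setup); it evaluates κ along a crude simulation and is expected to drive it toward 1, consistent with Fig. 12.

```python
# Sketch (assumed Euler discretization): network (8)-(9) applied to Example 4.
import numpy as np

a, a0 = np.array([-2.0, -1.0, 2.0, 0.0]), 4.0
c, c0 = np.array([ 1.0, -1.0, 2.0, 1.0]), 2.0
lo, hi = np.zeros(4), np.ones(4)
g = lambda u: np.clip(u, lo, hi)                 # projection onto Omega = [0, 1]^4

def kappa(x):
    """Condition number of the diagonal matrix A(x) given in Example 4."""
    p, r = a @ x + a0, c @ x + c0
    return p / r if p >= r else r / p

def grad_kappa(x):
    """Gradient of kappa on either smooth piece (kappa is differentiable there);
    at the kink p = r, either piece gives an admissible selection."""
    p, r = a @ x + a0, c @ x + c0
    if p >= r:
        return (a * r - c * p) / r**2
    return (c * p - a * r) / p**2

eps, h = 1e-5, 1e-8
rng = np.random.default_rng(3)
y = rng.uniform(0, 1, size=4)
for _ in range(20000):
    x = g(y)
    y = y + (h / eps) * (-y + x - grad_kappa(x))   # state equation (8), single selection
print("kappa along the output:", kappa(g(y)))       # expected to approach 1
```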


Fig. 12. Transient behaviors of the condition number along the output of the neural network in (8) and (9) for solving (33) in Example 4.

The neural network in (8) and (9) can be used for optimizing the condition number. Compared with the methods used in [6] and [35] for the optimization of the condition number, the recurrent neural network in (8) and (9) is a parallel computational model which is suitable for real-time optimization based on circuit implementation. Fig. 12 depicts the convergence of the condition number κ(A) for solving problem (33), which shows that it converges to 1 along the output of the neural network in (8) and (9), where ε = 10^{-5}.

VI. CONCLUSION

This paper presented a one-layer projection neural network with piecewise-linear activation functions for solving nonsmooth optimization problems with linear equalities and bound constraints. The global convergence conditions of the neural network were derived by means of the Lyapunov method and nonsmooth analysis. Compared with the existing recurrent neural networks for solving constrained optimization problems, the proposed projection neural network has several salient features. The objective functions in the investigated optimization problems are not required to be convex everywhere, but only need to be convex (pseudoconvex) on a set defined by the constraints. Moreover, the proposed neural network does not have any design parameter. In addition, some extra conditions required by many theoretical analyses in the literature are relaxed. Furthermore, numerical examples with simulation results were given to illustrate the effectiveness and performance of the proposed neural network.

As future research directions, we propose to investigate the projection method used in this paper for solving more generalized convex optimization problems. Moreover, further convergence properties of the proposed neural network, such as finite-time convergence to the feasible region, constitute another interesting direction. Furthermore, it is of interest to investigate how the proposed neural network can be realized in hardware for real-time optimization.


R EFERENCES [1] J. Aubin and A. Cellina, Differential Inclusions: Set-Valued Maps and Viability Theory. New York, USA: Springer-Verlag, 1984. [2] A. Auslender, Optimisation: Méthodes Numériques. Paris, France: Masson, 1976. [3] M. Bazaraa, H. Sherali, and C. Shetty, Nonlinear Programming: Theory and Algorithms, 3rd Ed. Hoboken, New Jersey, New York, USA: Wiley, 2006. [4] W. Bian and X. Xue, “Subgradient-based neural networks for nonsmooth nonconvex optimization problems,” IEEE Trans. Neural Netw., vol. 20, no. 6, pp. 1024–1038, Jun. 2009. [5] A. Bouzerdoum and T. Pattison, “Neural network for quadratic optimization with bound constraints,” IEEE Trans. Neural Netw., vol. 4, no. 2, pp. 293–304, Mar. 1993. [6] X. Chen, R. Womersley, and J. Ye, “Minimizing the condition number of a Gram matrix,” SIAM J. Optim., vol. 21, no. 1, pp. 127–148, 2011. [7] L. Cheng, Z. Hou, Y. Lin, M. Tan, W. Zhang, and F. Wu, “Recurrent neural network for non-smooth convex optimization problems with application to the identification of genetic regulatory networks,” IEEE Trans. Neural Netw., vol. 22, no. 5, pp. 714–726, May 2011. [8] E. Chong, S. Hui, and S. Zak, “An analysis of a class of neural networks for solving linear programming problems,” IEEE Trans. Autom. Control, vol. 44, no. 11, pp. 1995–2006, Nov. 1999. [9] F. Clarke, Optimization and Nonsmooth Analysis. New York, USA: Wiley, 1983. [10] A. Filippov, Differential Equations with Discontinuous Righthand Sides, Mathematics and its applications (Soviet series). Boston, USA: Kluwer Academic Publishers, 1988. [11] M. Forti, M. Grazzini, P. Nistri, and L. Pancioni, “Generalized lyapunov approach for convergence of neural networks with discontinuous or nonlipschitz activations,” Physica D, vol. 214, no. 1, pp. 88–99, 2006. [12] M. Forti and P. Nistri, “Global convergence of neural networks with discontinuous neuron activations,” IEEE Trans. Circuits Syst.-I, vol. 50, no. 11, pp. 1421–1435, Nov. 2003. [13] M. Forti, P. Nistri, and D. Papini, “Global exponential stability and global convergence in finite time of delayed neural networks with infinite gain,” IEEE Trans. Neural Netwo., vol. 16, no. 6, pp. 1449–1463, Nov. 2005. [14] M. Forti, P. Nistri, and M. Quincampoix, “Generalized neural network for nonsmooth nonlinear programming problems,” IEEE Trans. Circuits Syst.-I, vol. 51, no. 9, pp. 1741–1754, Sep. 2004. [15] M. Forti, P. Nistri, and M. Quincampoix, “Convergence of neural networks for programming problems via a nonsmooth Łojasiewicz inequality,” IEEE Trans. Neural Netw., vol. 17, no. 6, pp. 1471–1486, Nov. 2006. [16] X. Gao, “A novel neural network for nonlinear convex programming,” IEEE Trans. Neural Netw., vol. 15, no. 3, pp. 613–621, May 2004. [17] X. Hu and J. Wang, “Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network,” IEEE Trans. Neural Netw., vol. 17, no. 6, pp. 1487–1499, Nov. 2006. [18] X. Hu and J. Wang, “Design of general projection neural networks for solving monotone linear variational inequalities and linear and quadratic optimization problems,” IEEE Trans. Sys., Man Cybern.-B, vol. 37, no. 5, pp. 1414–1421, Oct. 2007. [19] X. Hu and J. Wang, “An improved dual neural network for solving a class of quadratic programming problems and its k-winners-take-all application,” IEEE Trans. Neural Netw., vol. 19, no. 12, pp. 2022–2031, Dec. 2008. [20] X. Hu and B. 
Zhang, “A new recurrent neural network for solving convex quadratic programming problems with an application to the k-winnerstake-all problem,” IEEE Trans. Neural Netw., vol. 20, no. 4, pp. 654–664, Apr. 2009. [21] M. Kennedy and L. Chua, “Neural networks for nonlinear programming,” IEEE Trans. Circuits Syst., vol. 35, no. 5, pp. 554–562, May 1988. [22] D. Kinderlehrer and G. Stampacchia, An Introduction to Variational Inequalities and Their Applications, New York, USA: Academic Press, 1982. [23] G. Li, S. Song, C. Wu, and Z. Du, “A neural network model for nonsmooth optimization over a compact convex subset,” in Proc. 3rd Int. Symp. Neural Netw., 2006, pp. 344–349. [24] Q. Liu, J. Cao, and Y. Xia, “A delayed neural network for solving linear projection equations and its analysis,” IEEE Trans. Neural Netw., vol. 16, no. 4, pp. 834–843, Jul. 2005.


[25] Q. Liu, Z. Guo, and J. Wang, “A one-layer recurrent neural network for constrained pseudoconvex optimization and its application for portfolio optimization,” Neural Netw., vol. 26, pp. 99–109, Feb. 2012. [26] Q. Liu and J. Wang, “A one-layer recurrent neural network with a discontinuous activation function for linear programming,” Neural Comput., vol. 20, no. 5, pp. 1366–1383, 2008. [27] Q. Liu and J. Wang, “A one-layer recurrent neural network with a discontinuous hard-limiting activation function for quadratic programming,” IEEE Trans. Neural Netw., vol. 19, no. 4, pp. 558–570, Apr. 2008. [28] Q. Liu and J. Wang, “A one-layer recurrent neural network for nonsmooth convex optimization subject to linear equality constraints,” in Proc. 15th Int. Conf. Neural Inf. Proc.. 2009, pp. 1003–1010. [29] Q. Liu and J. Wang, “Finite-time convergent recurrent neural network with a hard-limiting activation function for constrained optimization with piecewise-linear objective functions,” IEEE Trans. Neural Netw., vol. 22, no. 4, pp. 601–613, Apr. 2011. [30] Q. Liu and J. Wang, “A one-layer recurrent neural network for constrained nonsmooth optimization,” IEEE Trans. Syst., Man, Cybern.-B, vol. 41, no. 5, pp. 1323–1333, Oct. 2011. [31] S. Liu and J. Wang, “A simplified dual neural network for quadratic programming with its kwta application,” IEEE Trans. Neural Netw., vol. 17, no. 6, pp. 1500–1510, Nov. 2006. [32] W. Lu and T. Chen, “Dynamical behaviors of delayed neural network systems with discontinuous activation functions,” Neural Comput., vol. 18, no. 3, pp. 683–708, 2006. [33] C. Maa and M. Shanblatt, “Linear and quadratic programming neural network analysis,” IEEE Trans. Neural Netw., vol. 3, no. 4, pp. 580–594, Jul. 1992. [34] D. Mandic and J. Chambers, Recurrent Neural Networks For Prediction: Learning Algorithms, Architectures, and Stability. New York, USA: Wiley, 2001. [35] P. Maréchal and J. Ye, “Optimizing condition numbers,” SIAM J. Optim., vol. 20, no. 2, pp. 935–947, 2010. [36] J. Penot and P. Quang, “Generalized convexity of functions and generalized monotonicity of set-valued maps,” J. Optim. Theory Appl., vol. 92, no. 2, pp. 343–356, 1997. [37] D. Tank and J. Hopfield, “Simple neural optimization networks: An A/D converter, signal decision circuit, and a linear programming circuit,” IEEE Trans. Circuits Syst., vol. 33, no. 5, pp. 533–541, May 1986. [38] Q. Tao, J. Cao, M. Xue, and H. Qiao, “A high performance neural network for solving nonlinear programming problems with hybrid constraints,” Phys. Lett. A, vol. 288, no. 2, pp. 88–94, 2001. [39] J. Wang, “Analysis and design of a recurrent neural network for linear programming,” IEEE Trans. Circuits Syst.-I, vol. 40, no. 9, pp. 613–618, Sep. 1993. [40] J. Wang, “A deterministic annealing neural network for convex programming,” Neural Netw., vol. 7, no. 4, pp. 629–641, 1994. [41] Y. Xia, G. Feng, and J. Wang, “A novel recurrent neural network for solving nonlinear optimization problems with inequality constraints,” IEEE Trans. Neural Netw., vol. 19, no. 8, pp. 1340–1353, Aug. 2008. [42] Y. Xia, H. Leung, and J. Wang, “A projection neural network and its application to constrained optimization problems,” IEEE Trans. Circuits Syst.-I, vol. 49, no. 4, pp. 447–458, Apr. 2002. [43] Y. Xia and J. Wang, “A general methodology for designing globally convergent optimization neural networks,” IEEE Trans. Neural Netw., vol. 9, no. 6, pp. 1331–1343, Nov. 1998. [44] Y. Xia and J. 
Wang, “Global exponential stability of recurrent neural networks for solving optimization and related problems,” IEEE Trans. Neural Netw., vol. 11, no. 4, pp. 1017–1022, Jul. 2000. [45] Y. Xia and J. Wang, “A dual neural network for kinematic control of redundant robot manipulators,” IEEE Trans. Syst., Man Cybern.-B, vol. 31, no. 1, pp. 147–154, Feb. 2001. [46] Y. Xia and J. Wang, “A general projection neural network for solving monotone variational inequalities and related optimization problems,” IEEE Trans. Neural Netw., vol. 15, no. 2, pp. 318–328, Mar. 2004. [47] Y. Xia and J. Wang, “A recurrent neural network for solving nonlinear convex programs subject to linear constraints,” IEEE Trans. Neural Netw., vol. 16, no. 2, pp. 379–386, Mar. 2005. [48] X. Xue and W. Bian, “Subgradient-based neural networks for nonsmooth convex optimization problems,” IEEE Trans. Circuits Syst.-I, vol. 55, no. 8, pp. 2378–2391, Sep. 2008. [49] S. Zhang and A. Constantinides, “Lagrange programming neural networks,” IEEE Trans. Circuits Syst.-II, vol. 39, no. 7, pp. 441–452, Jul. 1992.


Qingshan Liu (S’07-M’08) received the B.S. degree in mathematics from Anhui Normal University, Wuhu, China, the M.S. degree in applied mathematics from Southeast University, Nanjing, China, and the Ph.D. degree in automation and computer-aided engineering from the Chinese University of Hong Kong, Shatin, Hong Kong, in 2001, 2005, and 2008, respectively. He is currently an Associate Professor with the School of Automation, Southeast University. In 2008, he joined the School of Automation, Southeast University. In 2010 and 2012, he was a Post-Doctoral Fellow with the Department of Mechanical and Automation Engineering, Chinese University of Hong Kong. His current research interests include optimization theory and applications, artificial neural networks, computational intelligence, and nonlinear systems.

Jun Wang (S’89–M’90–SM’93–F’07) received the B.S. degree in electrical engineering and the M.S. degree in systems engineering from the Dalian University of Technology, Dalian, China, in 1982 and 1985, respectively, and the Ph.D. degree in systems engineering from Case Western Reserve University, Cleveland, OH, USA, in 1991. He is currently a Professor with the Department of Mechanical and Automation Engineering, Chinese University of Hong Kong, Shatin, Hong Kong. He has held various academic positions with the Dalian University of Technology, Case Western Reserve University, and the University of North Dakota, USA. He held various short-term or part-time visiting positions with the U.S. Air Force Armstrong Laboratory in 1995, RIKEN Brain Science Institute in 2001, the Universite Catholique de Louvain in 2001, the Chinese Academy of Sciences in 2002, the Huazhong University of Science and Technology from 2006 to 2007, Shanghai Jiao Tong University as a Cheung Kong Chair Professor from 2008 to 2011, and the Dalian University of Technology as a National Thousand-Talent Chair Professor since 2011. His current research interests include neural networks and their applications. Dr. Wang was a recipient of the Research Excellence Award from the Chinese University of Hong Kong for research during 2008–2009, the Natural Science Award (First Class) from Shanghai Municipal Government in 2009, the Natural Science Award (First Class) from the Ministry of Education of China in 2011, the Outstanding Achievement Award from Asia Pacific Neural Network Assembly, the IEEE Transactions on Neural Networks Outstanding Paper Award (with Qingshan Liu) from the IEEE Computational Intelligence Society in 2011. He was an Associate Editor of the IEEE TRANSACTIONS ON CYBERNETICS and its Predecessor since 2003, and has been a member on the Editorial Board of Neural Networks since 2012. He was an Associate Editor of the IEEE TRANSACTIONS ON NEURAL NETWORKS from 1999 to 2009 and the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C from 2002 to 2005, and a member on the Editorial Advisory Board of the International Journal of Neural Systems from 2006 to 2012. He was a Guest Editor of special issues of the European Journal of Operational Research in 1996, the International Journal of Neural Systems in 2007, and Neurocomputing in 2008. He was the President of the Asia Pacific Neural Network Assembly in 2006, the General Chair of the 13th International Conference on Neural Information Processing in 2006 and the IEEE World Congress on Computational Intelligence in 2008. He was on many committees such as the IEEE Fellow Committee. He was an IEEE Computational Intelligence Society Distinguished Lecturer from 2010 to 2012.
