
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 23, NO. 6, JUNE 2012

Feedback Optimal Control of Distributed Parameter Systems by Using Finite-Dimensional Approximation Schemes

Angelo Alessandri, Senior Member, IEEE, Mauro Gaggero, and Riccardo Zoppoli

Abstract— Optimal control for systems described by partial differential equations is investigated by proposing a methodology to design feedback controllers in approximate form. The approximation stems from constraining the control law to take on a fixed structure, where a finite number of free parameters can be suitably chosen. The original infinite-dimensional optimization problem is then reduced to a mathematical programming one of finite dimension that consists in optimizing the parameters. The solution of such a problem is performed by using sequential quadratic programming. Linear combinations of fixed and parameterized basis functions are used as the structure for the control law, thus giving rise to two different finite-dimensional approximation schemes. The proposed paradigm is general since it allows one to treat problems with distributed and boundary controls within the same approximation framework. It can be applied to systems described by either linear or nonlinear elliptic, parabolic, and hyperbolic equations in arbitrary multidimensional domains. Simulation results obtained in two case studies show the potentials of the proposed approach as compared with dynamic programming.

Index Terms— Approximation scheme, distributed parameter system, neural network, optimal control.

I. INTRODUCTION

THE infinite-dimensional nature of systems described by partial differential equations (PDEs) makes the solution of the problems of designing and implementing optimal controllers much more difficult than for systems that are modeled by ordinary differential equations (ODEs). The former are usually referred to as distributed parameter systems (DPSs), whereas the latter are called lumped parameter systems (LPSs). DPSs have received special attention from the research community since they are widely employed for modeling purposes in a number of applications. For example, thermal convection, spatially distributed chemical reactions, flexible beams or plates, electromagnetic or acoustic waves, fluid flows, and other distributed phenomena that are of interest

Manuscript received August 1, 2011; revised March 14, 2012; accepted March 17, 2012. Date of publication April 30, 2012; date of current version May 10, 2012. A. Alessandri is with the DIME-University of Genoa, Genoa 16129, Italy (e-mail: [email protected]). M. Gaggero is with the ISSIA-National Research Council of Italy, Genoa 16149, Italy (e-mail: [email protected]). R. Zoppoli is with the DIST-University of Genoa, Genoa 16145, Italy (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2012.2192748

in both engineering and physics can be modeled as DPSs. Thus, a vast literature on the control of DPSs is available (see, among others, [1]–[8] and the references therein). Based on the preliminary results reported in [9], the goal of this paper is to provide a contribution to feedback optimal control for DPSs that is of general applicability (i.e., well suited for various types of PDEs in arbitrary domains) and performs successfully as compared with dynamic programming (DP) [10]. DP is quite a popular approach to the synthesis of optimal controllers for LPSs. As a DPS can be discretized to become an LPS, DP turns out to be a crucial comparison methodology whenever a feedback regulator is desired. Generally speaking, a closed-loop control is preferable to an open-loop one if the system under consideration is affected by uncertainties or external disturbances. Thus, it is not surprising that feedback control has attracted special attention in various research areas on DPSs, as is evident in the recent literature (see [11]–[14]). Unfortunately, deriving analytically the solution of optimal control problems for DPSs, either in open-loop or closed-loop form, is a nontrivial task except in very few cases, which strictly depend on the type of PDE, cost functional, and control inputs. Thus, the search for solutions in approximate form to such problems is a topic of fundamental importance. Roughly speaking, the existing approaches to the suboptimal design of controllers for DPSs can be classified into two types, usually named discretize-then-optimize and optimize-then-discretize [15]. In the first type of approach, the DPSs are discretized into approximate finite-dimensional LPSs as a preliminary step before accomplishing the design of the controllers (see, for example, [3], [16]–[19]).
Since the resulting LPSs may be described by a large number of ODEs, numerous model-reduction techniques for low-order approximation of various classes of DPSs are available, such as methods based on the Karhunen–Loève expansion, empirical eigenfunctions, and proper orthogonal decomposition [20]–[23]. The advantage of the discretize-then-optimize approach is the possibility of using the well-established techniques available for the design of feedback suboptimal controllers for LPSs like, for example, approximate DP [24]–[26]. The main limitation lies in the difficulty of finding suitable reduced-order LPS models that are accurate for all the possible classes of DPSs, especially for nonlinear systems defined in high-dimensional domains. By contrast, the optimize-then-discretize approach stems from the idea of casting and solving the problem



in its original infinite-dimensional setting via a variational formulation. Thus, the original optimal control problem is reduced to the problem of finding a numerical solution of a system of PDEs (made up of the original equation and an adjoint equation), for which a number of techniques are reported in the literature (see, among others, [27]–[31]). The main limitation of the optimize-then-discretize paradigm is that the resulting suboptimal solution to the original control problem is typically expressed in open-loop form. In this paper, a different approach is considered as a compromise between the discretize-then-optimize and optimize-then-discretize paradigms. More specifically, we address the approximate solution of optimal control problems for DPSs in line with the extended Ritz method (ERIM) [32], as an alternative to direct methods borrowed from the calculus of variations, such as the Ritz [33], [34] and weighted residual methods (least-squares, collocation, and Galerkin) [35], [36]. The basic idea lies in constraining the feedback control functions to take on fixed structures that depend on a finite number of free parameters to be tuned. By substituting such structures into the cost functional to be minimized and the constraints given by the model equation, the original functional optimization problem reduces to a mathematical programming one that consists in finding the optimal values of the parameters. Such a problem is nonlinear in general and, at least in principle, easier to solve than the original functional one. The Ritz method and the ERIM are associated with two different approximation schemes based on fixed-basis and variable-basis approximators, respectively. Generally speaking, the former type of scheme is simpler to deal with numerically since the approximate solution depends linearly on the unknown parameters.
Unfortunately, fixed-basis schemes are less appealing from a theoretical standpoint than variable-basis ones, as the latter may guarantee a uniform approximation with a number of parameters that grows at most polynomially with the dimension of the inputs of the function to be approximated (see [37], [38]). Instead, in the variable-basis case the approximate solution depends nonlinearly on the unknown parameters, and thus their optimal values are more difficult to find than in the fixed-basis case. The aforesaid entails a comparison between fixed-basis schemes based on the Ritz method and ERIM-based variable-basis schemes as to complexity, computational effort to find the optimal parameters, and effectiveness of the resulting controllers. Another comparison is performed with the solution obtained by a discretize-then-optimize approach based on approximate DP. More specifically, a DP approach with polynomial approximation of the cost-to-go functions is adopted to construct suboptimal regulators for the LPSs that result from the discretization of the corresponding DPSs. The optimizations involved in the search for the optimal parameters with the Ritz method and the ERIM are performed by using sequential quadratic programming (SQP) [39]. SQP techniques are well suited for such a task because of their capability of dealing with nonlinearities successfully, even in a high-dimensional setting. Such techniques rely on the idea of solving a sequence of quadratic programming problems constructed around the current solution, which is computed by


using a Newton or a quasi-Newton technique. As compared with [40] and [41], where SQP is used to solve optimal control problems for DPSs, our approach is able to perform the design of optimal feedback controllers instead of open-loop ones. The resulting approach is of general applicability, as it can be employed for the suboptimal solution of optimal distributed or boundary control problems for a large variety of systems, governed by either linear or nonlinear elliptic, parabolic, or hyperbolic PDEs defined in arbitrary domains.

The rest of this paper is organized as follows. In Section II, we introduce the class of optimal control problems for DPSs that will be considered throughout this paper. Section III contains two examples of optimal control problems for DPSs. In Section IV, the proposed approach for the approximate solution of such problems is presented. In Section V, the effectiveness of such a methodology is investigated by means of simulations. Finally, conclusions are drawn in Section VI.

II. PROBLEM STATEMENT

We consider a class of DPSs described by a model equation like the following:

Ly(x, t) = u(x, t),  x ∈ Ω, t ∈ [0, T]    (1)

where L : Y → U is a partial differential operator (either linear or nonlinear), Y and U are suitable spaces of functions defined in Ω × [0, T]; x ∈ Ω ⊂ R^d and t ∈ [0, T] are the space and time independent variables, respectively; y : Ω × [0, T] → R^m is the state of the system, with y ∈ Y; u : Ω × [0, T] → R^m is the source term of the system, with u ∈ U. Of course, we suppose to have also a sufficient number of initial and boundary conditions. Concerning boundary conditions, we assume that in general they take on the following form:

y(x, t) = s(x, t),  x ∈ Γ_D, t ∈ [0, T]    (2)
∂y(x, t)/∂η = q(x, t),  x ∈ Γ_N, t ∈ [0, T]    (3)

where s : Γ_D × [0, T] → R^m and q : Γ_N × [0, T] → R^m are given functions, and η stands for the unit normal vector directed outside Ω. Because of the type of boundary conditions, Γ_D and Γ_N are called Dirichlet and Neumann boundaries, respectively. These boundaries are such that Γ_D ∪ Γ_N = ∂Ω and Γ°_D ∩ Γ°_N = ∅, where ∂Ω, Γ°_D, and Γ°_N denote the boundary of Ω, the interior of Γ_D, and the interior of Γ_N, respectively. Distributed and boundary controls are the most common frameworks when dealing with an optimal control problem for DPSs. As to the former, the source term u of (1) (or only some of its components if m > 1) is assumed to be manipulable in order to obtain a desired behavior of the system, thus playing the role of a control input. Such a control input may change with the position in Ω, and for this reason it is usually referred to as distributed control. Besides control in the source term, one can regard the boundaries as controllable terms, thus dealing with boundary control. In this case, either s in (2) or q in (3) or both (or only some of their components if m > 1) should be regarded as control inputs. Of course, one can assume to have a combination of distributed and boundary controls within the same DPS. This topic is not conceptually difficult to deal with,



as it is a straightforward extension of the results presented hereby, but it requires quite a cumbersome notation and thus it is not addressed in this paper.

We tackle the solution of optimal distributed and boundary control problems by explicitly searching for a feedback control law. Such a control law for a DPS like (1) with boundary conditions (2) or (3) can be defined at the point x and time t as a function of a set of values the state takes on at other points near x or even at all x ∈ Ω at the same time. In other words, the control law may contain both local and nonlocal terms. Toward this end, in the case of distributed control, one can resort to an operator μ : Ω × [0, T] × Y → R^p that provides the information on the state to generate the feedback control. Thus, we search for a control function γ : Ω × [0, T] × R^p → R^m such that

u(x, t) = γ[x, t, μ(x, t, y)]    (4)

where x ∈ Ω and t ∈ [0, T]. In the case of Dirichlet boundary control, we consider an operator μ : Γ_D × [0, T] × Y → R^p and search for a control function γ : Γ_D × [0, T] × R^p → R^m such that

s(x, t) = γ[x, t, μ(x, t, y)]    (5)

where x ∈ Γ_D and t ∈ [0, T]. Of course, the case of Neumann boundary control can be easily obtained from the Dirichlet one by replacing Γ_D with Γ_N and s with q. An example of expression for μ(x, t, y) is

μ(x, t, y) = (1/|B_ρ(x)|) ∫_{B_ρ(x)} ⟨y(ξ, t), r(ξ)⟩ dξ    (6)

where B_ρ(x) is the ball of center x and radius ρ, |B_ρ(x)| denotes the measure of such a ball, r : Ω → R^m is a given weight function, and ⟨·, ·⟩ denotes the scalar product between vectors in R^m. Another example of μ(x, t, y) is

μ(x, t, y) = col[y(x^(1), t), y(x^(2), t), . . . , y(x^(p), t)],  x^(i) ∈ B_ρ(x), i = 1, 2, . . . , p.    (7)

Of course, any other expression for μ involving the state can be considered as well. Equation (6) describes a setting in which the control action for a given x ∈ Ω and t ∈ [0, T] depends on a weighted mean of the values the state takes on in the ball of radius ρ centered at x. Clearly, in this case p = 1. Of course, if needed, one can substitute B_ρ(x) with the entire domain Ω or with Γ_D or Γ_N in the case of boundary control. The limit case with ρ = 0 and r ≡ 1 corresponds to the setting in which the control input at x depends only on the value of the state at the same point: u(x, t) = γ[x, t, y(x, t)] in the case of distributed control, with the obvious changes in the case of boundary control. Of course, this is a simplified type of feedback, but, as we shall see in Section V, it allows one to obtain satisfactory results with a small computational burden. A choice of μ like in (7) accounts for a control action that depends on a finite set of values the state takes on at given points x^(i) ∈ B_ρ(x), i = 1, 2, . . . , p. In this case, p is equal to the number of points in which the state is taken

into account. One can also employ (7) whenever there is no complete knowledge of the state. This happens for a number of systems modeling real phenomena in which the state may not be completely accessible from the exterior of the DPS, as, for example, when only partial information on the state is available from sensors located at some points of the domain. Clearly, in this case one has to choose the form (7) for μ, where the points x^(i), i = 1, 2, . . . , p, coincide with the positions in which the sensors are located. Throughout this paper, with a little abuse of notation and for the sake of compactness, we shall briefly write γ(x, t, μ) instead of γ[x, t, μ(x, t, y)], where μ = col(μ_1, μ_2, . . . , μ_p) ∈ R^p has to be intended as the output of the operator μ(x, t, y) once x, t, and y are fixed. The candidate feedback control functions γ are assumed to belong to a set F, which is a subset of a real normed linear space of functions H. The goal of the optimal control problems considered in this paper is to find, among the feedback control functions γ ∈ F, an optimal function γ° (a solution is assumed to exist) that generates the optimal control inputs so as to minimize a cost functional

J : F → R    (8)

under the constraints given by (1). Thus, F ⊆ H is the set of admissible control functions since it specifies constraints on such functions. Of course, if there are no constraints, F ≡ H. Clearly, such a problem is one of functional optimization since the unknown is the optimal function γ° : Ω × [0, T] × R^p → R^m. To sum up, the considered optimal control problem for DPSs can be stated as follows.

Problem 1 (Optimal Control for DPSs): Find an optimal feedback control function γ° ∈ F that minimizes a given cost functional (8) under the constraints resulting from (1) together with boundary and initial conditions.
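On a spatially discretized state, the two example information operators can be sketched as follows. The grid, the state snapshot, the radius ρ, and the sensor locations below are illustrative assumptions, not values from the paper; r ≡ 1 is taken in (6), and the 1-D ball B_ρ(x) is an interval.

```python
import numpy as np

def mu_average(y, xs, x, rho):
    """Windowed-mean operator, cf. (6) with r = 1: averages the state
    over the ball B_rho(x), i.e., the interval [x - rho, x + rho]."""
    mask = np.abs(xs - x) <= rho
    return float(y[mask].mean())

def mu_samples(y, xs, sensor_xs):
    """Pointwise operator, cf. (7): stacks the state values at the grid
    nodes nearest to the p sensor locations."""
    idx = [int(np.argmin(np.abs(xs - s))) for s in sensor_xs]
    return y[idx]

xs = np.linspace(0.0, 1.0, 101)          # grid on Omega = [0, 1]
y = np.sin(np.pi * xs)                   # an illustrative state snapshot

m_avg = mu_average(y, xs, x=0.5, rho=0.1)      # scalar output, p = 1
m_pts = mu_samples(y, xs, [0.25, 0.5, 0.75])   # vector output, p = 3
```

The limit case ρ = 0 discussed above simply returns y at the grid node closest to x, which is exactly what `mu_samples` does for a single sensor.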
Solving Problem 1 analytically is a nontrivial task that is possible only in a few cases, which strictly depend on the type of PDE, cost functional, and control inputs. The first requisite to solve it analytically is to be able to find an exact solution of (1). As is well known, this is usually possible only for "simple" domains Ω (rectangles, circles, or combinations of them) and particular types of PDEs. Typically, as for LPSs, an analytical solution to an optimal control problem can be found under restrictive assumptions, i.e., a quadratic cost functional to be minimized and a linear model equation. Moreover, such a solution is often expressed in open-loop form. It can be computed by means of a distributed Pontryagin minimum principle (see [42, pp. 139–143]), which can be interpreted as the Euler–Lagrange equations solving the corresponding variational problem. The above discussion suggests searching for a numerical technique to compute approximate solutions to Problem 1. This will be discussed in Section IV, after the presentation of some optimal control problems for DPSs in Section III.

III. EXAMPLES OF OPTIMAL CONTROL PROBLEMS FOR DPSS

In the following, we shall describe two examples of optimal control problems for DPSs, whose approximate solutions will




be addressed in Section V. More specifically, in the first we consider a distributed control problem for the Burgers' equation, whereas the second is a boundary control problem for the optimal reheating of a slab.

A. Distributed Optimal Control of the Burgers' Equation

We consider a DPS described by a Burgers' equation like the following:

∂y(x, t)/∂t + (1/2) ∂y²(x, t)/∂x = u(x, t),  x ∈ Ω, t ∈ [0, T]
y(x, 0) = y₀(x),  x ∈ Ω
∂y(0, t)/∂x = ∂y(L, t)/∂x = 0,  t ∈ [0, T]    (9)

where L > 0 and T > 0; Ω = [0, L] is the space domain; y : Ω × [0, T] → R is the state of the system; u : Ω × [0, T] → R is the control input, representing a source term for (9); y₀ : Ω → R is a given initial state. We suppose that the source term u of (9) is manipulable, thus playing the role of a distributed control input. We impose an additional constraint on such a control input, i.e., u(x, t) ∈ [u_min, u_max] for all x ∈ Ω and t ∈ [0, T], where u_min and u_max are given constants. Our aim is to find an optimal feedback control function γ° : Ω × [0, T] × R^p → R such that, given u(x, t) = γ°(x, t, μ) for all x ∈ Ω and t ∈ [0, T], the performance index

J(γ) ≜ (1/2) ∫₀^T ∫₀^L y(x, t)² dx dt    (10)

is minimized. Concerning the operator μ, we focus on the expression given by (6) with ρ = 0 and r ≡ 1. In other words, we employ only the value of the state at the point x to compute the control input at the same point, i.e., we fix p = 1 and μ(x, t, y) = y(x, t). To sum up, we have to solve the following problem.

Problem Distributed Burgers Optimal Control (DBOC): Find an optimal control function γ° : Ω × [0, T] × R → R that minimizes the cost functional (10) under the constraints given by (9), u(x, t) = γ°(x, t, μ), and u(x, t) ∈ [u_min, u_max] for all x ∈ Ω and t ∈ [0, T].
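To get a feel for Problem DBOC, the following is a minimal discretize-and-simulate sketch: (9) is integrated with explicit finite differences (a small artificial viscosity nu is added purely for numerical stability of the scheme), the unknown γ is replaced by a hand-picked saturated proportional law u = clip(−k·y, u_min, u_max) that uses only μ = y(x, t) as in the problem statement, and (10) is estimated by a Riemann sum in time and the trapezoidal rule in space. The gain k, the grid sizes, the bounds, and the initial state are illustrative assumptions, not values from the paper.

```python
import numpy as np

def simulate_dboc(k, L=1.0, T=1.0, nx=101, nt=2001,
                  u_min=-1.0, u_max=1.0, nu=0.01):
    """Simulate the controlled Burgers' equation (9) under the local
    saturated feedback u = clip(-k*y) and return an estimate of (10)."""
    xs = np.linspace(0.0, L, nx)
    dx, dt = xs[1] - xs[0], T / (nt - 1)
    y = 0.5 * np.cos(np.pi * xs)            # illustrative initial state y0
    cost = 0.0
    for _ in range(nt - 1):
        u = np.clip(-k * y, u_min, u_max)   # stand-in feedback law gamma
        cost += dt * 0.5 * np.sum(0.5 * (y[:-1]**2 + y[1:]**2)) * dx
        flux = 0.5 * y**2                   # Burgers' flux term
        dfdx = np.gradient(flux, dx)        # central differences in space
        lap = np.zeros_like(y)
        lap[1:-1] = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / dx**2
        y = y + dt * (u - dfdx + nu * lap)  # explicit Euler step
        y[0], y[-1] = y[1], y[-2]           # homogeneous Neumann BCs
    return cost

cost_fb = simulate_dboc(k=2.0)   # damping feedback
cost_ol = simulate_dboc(k=0.0)   # u = 0: uncontrolled baseline
```

With the damping feedback the state decays toward zero and the accumulated cost is markedly smaller than in the uncontrolled case. An actual solution of Problem DBOC would instead optimize γ over a parameterized family, as developed in Section IV.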

B. Optimal Reheating of a Slab Let us consider a homogeneous slab (see Fig. 1) made up of some heat-conducting material that has to be reheated according to a suitable curve. Suppose that there exists an external source that is able to reheat one face of the slab, whereas in the other faces the heat is dispersed. For the sake of simplicity, we assume that the material properties, the initial temperature distribution, the external source, and the dispersion phenomena may vary only along a vertical section of the slab and not across any other cross section. Thus, we expect the temperature distribution at any time to vary only within a 2-D section. The temperature distribution of the slab evolves according to

Fig. 1. (a) Slab made up of some heat-conducting material. (b) 2-D model of the slab.

the following PDE (all the coefficients are normalized):

∂y(x, t)/∂t − Δy(x, t) = 0,  x ∈ Ω, t ∈ [0, T]
y(x, 0) = y₀(x),  x ∈ Ω
∂y(x, t)/∂η = α + q(t),  x ∈ Γ₁, t ∈ [0, T]
∂y(x, t)/∂η = β,  x ∈ Γ₂ ∪ Γ₄, t ∈ [0, T]
∂y(x, t)/∂η = α,  x ∈ Γ₃, t ∈ [0, T]    (11)

where Δ is the Laplacian operator; L₁ > 0, L₂ > 0, and T > 0; Ω = [0, L₁] × [0, L₂] is the space domain; ∂Ω = Γ₁ ∪ Γ₂ ∪ Γ₃ ∪ Γ₄; y : Ω × [0, T] → R is the state of the system, i.e., the temperature of the slab at given x and t; q : [0, T] → R is the control input, representing the external controllable heat quantity given to the slab at time t; α < 0 and β < 0 account for the outgoing heat fluxes at the boundaries; y₀ : Ω → R is a given initial temperature distribution. Since the manipulable term of (11) is in a boundary condition, we have a boundary control with the external heat source as actuator. As in the previous example, we impose an additional constraint on the control input, i.e., q(t) ∈ [q_min, q_max] for all t ∈ [0, T], where q_min and q_max are given constants. Our aim is to find an optimal feedback control function γ° : [0, T] × R^p → R that allows one to write q(t) = γ°(t, μ) for all t ∈ [0, T] so as to minimize the performance index

J(γ) ≜ ∫₀^T ∫_Ω (y(x, t) − y_ref(x, t))² dx dt    (12)

where y_ref : Ω × [0, T] → R is the reference temperature at any x ∈ Ω and t ∈ [0, T] as prescribed, for example, by a given reheating curve. Concerning the operator μ, we focus on (7) by taking into account the values of the state at the points x_a, x_b, and x_c ∈ Γ₃,



i.e., we assume to measure the temperature of the slab at such points via thermocouples (see Fig. 1). In other words, we fix p = 3 and μ(x, t, y) = col[y(x_a, t), y(x_b, t), y(x_c, t)]. To sum up, we have to solve the following problem.

Problem Boundary Heat Optimal Control (BHOC): Find an optimal control function γ° : [0, T] × R³ → R that minimizes the cost functional (12) under the constraints given by (11), q(t) = γ°(t, μ), and q(t) ∈ [q_min, q_max] for all t ∈ [0, T].

IV. APPROXIMATE SOLUTION OF DPS OPTIMAL CONTROL PROBLEMS

The proposed approach for the approximate solution of Problem 1 consists in replacing the original optimal control problem with a sequence of approximating problems, which are easier to solve than the original one. In order to obtain one of such easier problems, we constrain the admissible control functions γ ∈ F to take on fixed structures, where a finite, suitably large number of free parameters to be optimized is inserted. In other words, instead of γ we consider the functions

γ^n : Ω × [0, T] × R^p × R^{N(n)} → R^m

where γ^n(x, t, μ, w) has a fixed structure and w is a vector of parameters to be determined. The dimension of w is denoted by N(n) since, as we shall see in the following, the number of components of the vector w depends on n through a known function N : N₊ → N₊. The superscript n is useful to "measure" the complexity of the structure of the functions γ^n. Such functions, with a convenient choice of w, become an approximation of the optimal solution γ°.

A. Reduction to a Sequence of Approximating Problems

Once we have constrained the function γ to take on the structure γ^n, we can reduce Problem 1 (which, recall, is of functional optimization) to a mathematical programming problem. In general, such a problem is a nonlinear programming (NLP) one.
By substituting the fixed-structure function γ^n into the cost functional and the constraints, we obtain an NLP problem, where the unknown is no more a function but an N(n)-dimensional vector of parameters. The solution of this problem is sought by means of an NLP algorithm. By letting the number of free parameters vary, a set of fixed-structure functions is generated. More specifically, for any given value of n ∈ N₊, let us define the sets of the functions γ^n as follows:

A^n ≜ {γ^n(·, ·, ·, w) ∈ H : w ∈ R^{N(n)}},  n = 1, 2, . . . .

The choice of the parameterized functions γ^n is arbitrary and quite vast. The only requisite we ask in choosing such functions is that the sequence of sets {A^n}_{n=1}^∞ has an infinite nested structure

A^1 ⊂ A^2 ⊂ · · · ⊂ A^n ⊂ · · · .    (13)

Moreover, the functions γ^n must be such that the sequence {A^n}_{n=1}^∞ is dense in F. We introduce the following definitions.

Definition 1 (Density): A sequence {A^n}_{n=1}^∞ of sets of functions γ^n is dense in F ⊆ H if, for any ε > 0 and γ ∈ F, there exist a complexity n̄_ε and a parameter vector w such that ||γ^n − γ||_H ≤ ε for any n ≥ n̄_ε.

Definition 2 (Approximating Sequence): A sequence {A^n}_{n=1}^∞ of sets of functions γ^n that is dense in F ⊆ H and is endowed with the infinite nested structure (13) is called an "approximating sequence." The functions γ^n belonging to the sets A^n are called "approximating functions."

For each integer n specifying the complexity of the approximating functions γ^n, we take into account the presence of the constraints given by F by considering the minimization of the cost functional J over the set F^n ≜ A^n ∩ F given by the intersection of each A^n with the set F of admissible solutions. Let us define

F^n ≜ {γ^n(·, ·, ·, w) ∈ A^n ∩ F},  n = 1, 2, . . .

so that, instead of the nested structure (13), we consider the structure

F^1 ⊂ F^2 ⊂ · · · ⊂ F^n ⊂ · · · .    (14)

Because of the infinite nested structure (14), when the number of parameters increases, the set of parameterized functions F^n "invades" F. As such a set grows, the solutions of the corresponding NLP problems will approximate better and better (in a proper sense) the optimal solution of Problem 1. Based on the aforesaid, by substituting γ with γ^n in (1), boundary conditions (2) or (3), and a given cost functional (8), we obtain a cost function J^n : R^{N(n)} → R that depends on the vector of parameters w:

J^n(w) ≜ J[γ^n(x, t, μ, w)].    (15)

Thus, we formulate an approximation to Problem 1 as follows.

Problem 2 (NLP Approximating Problem): Find an optimal value w° ∈ R^{N(n)} that minimizes the cost function (15) under the constraints given by (1) (together with boundary and initial conditions) with γ^n instead of γ.

Solving Problem 2 is much easier than solving Problem 1, as the first is a mathematical programming problem, whereas the second is a functional optimization one. Of course, usually, it is not possible to solve Problem 2 exactly. Nevertheless, a number of efficient techniques exist to solve it numerically. Problem 2 can be viewed as an approximation of Problem 1, in the sense that, under proper assumptions, the optimal function γ^{n°} obtained after the solution of the first can be interpreted as an approximation of the optimal solution γ° of the second. Specifically, we can formally state that a sequence of Problems 2 with increasing n approximates Problem 1 better and better if the sequence {γ^{n°}}_{n=1}^∞ is such that lim_{n→∞} J(γ^{n°}) = J(γ°) and lim_{n→∞} ||γ^{n°} − γ°||_H = 0. The following theorem holds (see [34, p. 196] for the proof).

Theorem 1: Let us consider Problems 1 and 2 and assume that optimal solutions γ° and w° exist. Let the following assumptions also be verified:
1) the functional J(γ) is continuous in the norm of H;
2) the optimal function γ° is unique;
3) {A^n}_{n=1}^∞ is an approximating sequence;
4) the sequence {γ^{n°}}_{n=1}^∞ has a limit function γ̂ ∈ F.
Then, {γ^{n°}}_{n=1}^∞ is an optimizing sequence, i.e., lim_{n→∞} J(γ^{n°}) = J(γ°) and γ̂ = γ°.
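The "invasion" of F by the sets F^n, and the resulting optimizing sequence, can be illustrated on a toy problem with no dynamics (an assumption made purely for illustration): take J as a mean-squared tracking cost, take A^n as polynomials of degree n, and compute the optimal cost of each approximating problem. Since this toy family is a fixed-basis one, each Problem 2 reduces to linear least squares, and the nesting A^1 ⊂ A^2 ⊂ · · · forces the optimal costs to be non-increasing in n. The target function and grid are illustrative.

```python
import numpy as np

def best_cost(n, xs, target):
    """Optimal value of the toy cost J^n(w) over the set
    A^n = span{1, x, ..., x^n}, obtained by linear least squares."""
    basis = np.vstack([xs**i for i in range(n + 1)]).T
    w, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return float(np.mean((basis @ w - target) ** 2))

xs = np.linspace(-1.0, 1.0, 401)
target = np.exp(xs)          # illustrative "optimal" function to recover
costs = [best_cost(n, xs, target) for n in range(1, 8)]
# Nesting A^1 in A^2 in ... implies costs[0] >= costs[1] >= ...,
# with the optimal costs approaching J(gamma°) = 0 as n grows.
```

In the control setting of Problem 2, each cost evaluation would additionally involve the constraint (1), but the monotone improvement driven by the nested structure (14) is of the same nature.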


First, in the following we shall focus on the choice of convenient approximation schemes. Second, the use of SQP to solve Problem 2 will be detailed.


B. Ritz and Extended Ritz Approximation Schemes

As previously pointed out, the choice of the fixed-structure functions γ^n(x, t, μ, w) is quite arbitrary. The only requisite is that the sequence of sets {A^n}_{n=1}^∞ made up by such functions when the parameter vector w varies in R^{N(n)} has an infinite nested structure and the density property is satisfied. In the following, we shall focus on linear combinations of fixed and parameterized basis functions for γ^n, since they represent a good compromise between the accuracy of the overall approximation and the computational burden needed to solve the resulting NLP problems. Moreover, a number of theoretical results on the approximating capabilities of such structures are available in the literature (see [37], [38], [43]–[45], and the references therein).

The first choice for the parameterized functions γ^n consists in using, for each component γ^n_j, j = 1, 2, . . . , m, a linear combination of preassigned basis functions:

γ^n_j(x, t, μ, w) = Σ_{i=1}^{n} c_{ij} φ_i(x, t, μ) + b_j,  c_{ij} ∈ R, b_j ∈ R, j = 1, 2, . . . , m    (16)

where φ_1(x, t, μ), φ_2(x, t, μ), . . . , φ_n(x, t, μ) ∈ H are the first n functions of a sequence of basis functions. The components of the parameter vector w are all the coefficients c_{ij} and the "biases" b_j, namely w ≜ col(c_{ij}, b_j : i = 1, 2, . . . , n; j = 1, 2, . . . , m). In this case, the number of free parameters (i.e., the dimension of the vector w) is N(n) = (n + 1)m, and the integer n that measures the complexity of the approximating functions is simply the number of basis functions. Examples of basis functions are the terms of algebraic and trigonometric polynomials in the space of continuous functions on compact sets. Other examples are given by radial basis functions with fixed centers and widths. An approximation scheme that makes use of a linear combination of fixed basis functions like in (16) is a fixed-basis one. The idea of approximating a function belonging to an infinite-dimensional space H with linear combinations of functions belonging to a finite-dimensional approximation of the space itself is the essence of the Ritz method, one of the most used direct variational methods.

A natural extension of the previous choice for the functions γ^n consists in inserting some free parameters to be optimized in the basis functions, too. If we denote the vector of such parameters by κ_i ∈ R^l, i = 1, 2, . . . , n, the new basis functions take on the form ψ_i(x, t, μ, κ_i) instead of φ_i(x, t, μ). As the vector κ_i considerably increases the "flexibility" of the basis functions, we renounce using different functions ψ_1(x, t, μ, κ_1), ψ_2(x, t, μ, κ_2), . . . , ψ_n(x, t, μ, κ_n). Thus, in the following we shall use only a fixed-structure function ψ(x, t, μ, κ_i), hence dropping the subscript i in ψ_i. We choose the function ψ in such a way that ψ(·, ·, ·, κ_i) ∈ H for any κ_i ∈ R^l, and we call the functions ψ(·, ·, ·, κ_i) parameterized basis functions, whereas the functions φ_i(·, ·, ·) are called fixed basis functions. Thus, instead of (16) we write

γ^n_j(x, t, μ, w) = Σ_{i=1}^{n} c_{ij} ψ(x, t, μ, κ_i) + b_j,  c_{ij} ∈ R, b_j ∈ R, κ_i ∈ R^l, j = 1, 2, . . . , m    (17)

where the vector of parameters to be optimized is given by w ≜ col(c_ij, b_j, κ_i : i = 1, 2, . . . , n; j = 1, 2, . . . , m). The coefficients c_ij and b_j are called "outer" parameters, whereas the vectors κ_i are called "inner" parameters. In this case, the total number of free parameters (i.e., the dimension of the vector w) is N(n) = n(m + l) + m and, as in the fixed-basis case, the integer n that measures the complexity of the approximating functions is simply the number of basis functions. The approximating scheme that relies on (17) belongs to the class of variable-basis approximation schemes. With suitable choices of the function ψ, (17) models a variety of approximating families frequently used in applications, such as free-node splines, wavelets, trigonometric polynomials with free frequencies and phases, radial-basis-function networks with adjustable centers and widths, and feedforward neural networks (see [46], [47] and the references therein). Variable-basis approximation schemes like (17) are the essence of the ERIM for functional optimization. For the sake of brevity, the approximation schemes that solve Problem 1 in a suboptimal way by means of a sequence of Problems 2 will be denoted as the Ritz approximation scheme (RAS) in the fixed-basis case and the extended Ritz approximation scheme (ERAS) in the parameterized-basis case. The convergence speed of the sequences of approximating Problems 2 based on the RAS or ERAS depends on the choice of the class of basis functions. Usually, linear combinations involving a small number of basis functions are enough to give satisfactory approximations.
However, in some cases, especially when the functions γ^n depend on a large number of variables (i.e., when the domain Ω is multidimensional and the dimension p of the output of the operator μ is large), the RAS may incur the curse of dimensionality [32], [37]: the number n of basis functions needed to obtain a satisfactory approximation may grow too fast (e.g., exponentially) with the dimension of the variables on which the functions to be approximated depend. Such an issue can be mitigated by the use of parameterized basis functions. Generally speaking, for the same accuracy, the ERAS may require fewer basis functions than the RAS. The main drawback of the variable-basis approach is that the resulting NLP problem is more difficult to solve because of the presence of the free inner parameters, on which the approximate control law depends nonlinearly. As a consequence, the NLP solver may be trapped in local minima, which can compromise the effectiveness of the overall approximation. By contrast, in the fixed-basis case, the functions γ^n depend linearly on the unknown parameters, whose selection can be addressed more easily than in a variable-basis scheme. However, when global optimal solutions are found, the quality of the resulting control law based on the ERAS may in general be better, as parameterized basis functions have a greater approximation capability than fixed ones.
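To fix ideas, the two schemes can be sketched in code for a scalar control (m = 1) with Gaussian basis functions. This is our own illustrative sketch, with all function names ours, not the authors' implementation:

```python
import numpy as np

def gaussian(z, center, width):
    # Gaussian basis function evaluated at the stacked input z = (x, t, mu).
    return np.exp(-np.sum((z - center) ** 2) / (2.0 * width ** 2))

def gamma_fixed(z, c, b, centers, widths):
    """RAS-style fixed-basis scheme as in (16), with m = 1: only the outer
    weights c and the bias b are free; centers and widths are chosen a
    priori (e.g., spread over the domain with a low-discrepancy sequence)."""
    phi = np.array([gaussian(z, ctr, wid) for ctr, wid in zip(centers, widths)])
    return float(phi @ c + b)

def gamma_parameterized(z, c, b, kappas):
    """ERAS-style variable-basis scheme as in (17), with m = 1: each inner
    vector kappa_i = (center..., width) is optimized together with c and b."""
    phi = np.array([gaussian(z, k[:-1], k[-1]) for k in kappas])
    return float(phi @ c + b)
```

For an input of dimension d (the stacked (x, t, μ)), the free-parameter counts are N(n) = n + 1 for the fixed-basis scheme and N(n) = n(1 + l) + 1 with l = d + 1 for the parameterized one, matching (n + 1)m and n(m + l) + m with m = 1.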

Algorithm 1 Controller Design

1) Choose the form of the parameterized structure for the approximation of the optimal control law.
2) Choose the initial number n of basis functions used for the approximation.
3) Choose the initial guesses w(0) and λ(0); set i ← 0.
4) Solve numerically the model equation (1) with (4), where γ(x, t, μ) is replaced by γ^n(x, t, μ, w(i)).
5) Run the SQP algorithm by solving (23) and applying (24).
6) Check if a stop criterion for the SQP algorithm is satisfied. If so, set w° = w(i), λ° = λ(i), and go to step 7). Otherwise, set i ← i + 1 and go back to step 4).
7) Check if the obtained function γ^n°(x, t, μ, w°) is satisfactory. If yes, the procedure stops; otherwise, increase the number n of basis functions and go back to step 3).
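The outer loop of Algorithm 1 can be sketched on a toy problem. In this illustrative stand-in (ours, not the paper's), a linear least-squares fit with a fixed cosine basis replaces steps 4)–6) (the PDE solve plus the SQP inner iterations), and the loop doubles n at step 7) until the cost stops improving; in the nonconvex ERAS case one would rerun the loop from several random initial guesses, in the spirit of Remark 2:

```python
import numpy as np

def fit_once(target, xs, n):
    # Stand-in for steps 4)-6): with a FIXED cosine basis the inner problem
    # is linear least squares, so one solve replaces the whole SQP loop.
    Phi = np.cos(np.outer(xs, np.arange(n)))         # Phi[p, j] = cos(j * x_p)
    w, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return w, float(np.mean((Phi @ w - target) ** 2))

def design_controller(target, xs, n0=4, tol=1e-3, n_max=64):
    # Outer loop of Algorithm 1: grow n (step 7) while the cost keeps dropping.
    n, best_w, best_cost = n0, None, np.inf
    while n <= n_max:
        w, cost = fit_once(target, xs, n)
        if best_cost - cost < tol:                   # no significant improvement
            break
        best_w, best_cost, n = w, cost, 2 * n
    return best_w, best_cost

xs = np.linspace(0.0, np.pi, 200)
target = np.sin(3.0 * xs) ** 2                       # = 0.5 - 0.5*cos(6x)
w, cost = design_controller(target, xs)              # exact once n >= 7
```

Here the loop stops at n = 16, returning the n = 8 fit, which represents the target exactly; on a real DPS problem the stopping test of step 7) plays the role of this improvement check.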

C. Search of Optimal Parameters by SQP

In principle, Problem 2 can be solved with any descent method. However, a convenient choice is an SQP algorithm, since it allows one to solve nonlinear optimization problems with a large number of variables while enjoying the fast convergence rate of Newton's method. Without loss of generality, in the following we shall assume that the constraints in Problem 2 consist only of the model equation with the source term as control input. The extension to problems that include additional inequality constraints and boundary controls is straightforward. With these assumptions, Problem 2 can be written in the more compact form

    min_{w ∈ R^{N(n)}}  J^n(w)
    s.t.  c(w) = 0                                               (18)

where c(w) ≜ L y(x, t) − γ^n(x, t, μ, w). In order to ensure numerical tractability, we need to discretize the constraint c(w) = 0 in both time and space. Toward this end, a suitable numerical scheme has to be adopted, depending on the type of PDE one deals with. The differential operator L is then replaced by L_d, which accounts for both space and time discretizations. Let y_d^k ∈ R^{mP} be the vector of the discretized values of y(x, t) at the points x^(1), x^(2), . . . , x^(P) and time instants k = 0, 1, . . . , K, i.e., y_d^k ≜ col(y(x^(1), k), y(x^(2), k), . . . , y(x^(P), k)), where K is the number of discrete time instants and P is the number of space discretization points. Thus, we deal with the following discrete approximation of (18):

    min_{w ∈ R^{N(n)}}  J_d^n(w)
    s.t.  c_d(w) = 0                                             (19)

where c_d(w) ≜ L_d y − γ_d^n(w); J_d^n(w) and γ_d^n(w) result from the discretization of J^n(w) and γ^n(x, t, μ, w), respectively. As is well known, (19) can be tackled by using the Lagrange multiplier method (see [48]). Toward this end, (19) can be reformulated as follows:

    min_{w ∈ R^{N(n)}, λ ∈ R^{mPK}}  L(w, λ)
    s.t.  c_d(w) = 0                                             (20)

where L(w, λ) ≜ J_d^n(w) + λᵀ c_d(w) and λ ∈ R^{mPK} are the Lagrangian function and multiplier, respectively. The solution of (20) can be found by solving the system of first-order optimality conditions

    ∇_w L(w, λ) = ∇_w J_d^n(w) + ∇_w c_d(w) λ = 0
    ∇_λ L(w, λ) = c_d(w) = 0.                                    (21)

The SQP approach consists in searching for such a solution by approximating (20) with a quadratic programming problem with linear constraints. A sequence of iteration points is then derived that converges to a local minimum pair (w°, λ°), which is a solution of (21). More specifically, given an initial guess (w(0), λ(0)), at each iteration i = 0, 1, . . . a local model of the optimization problem (20) is constructed around the current solution (w(i), λ(i)) by using a quadratic approximation of the objective function L(w(i), λ(i)) and a linear approximation of the constraints c_d(w(i)). Thus, the following quadratic problems are considered for i = 0, 1, . . .:

    min_{Δw(i) ∈ R^{N(n)}}  L(w(i), λ(i)) + ∇_w L(w(i), λ(i))ᵀ Δw(i)
                            + (1/2) Δw(i)ᵀ ∇²_w L(w(i), λ(i)) Δw(i)
    s.t.  c_d(w(i)) + ∇_w c_d(w(i))ᵀ Δw(i) = 0                   (22)

where Δw(i) ≜ w(i+1) − w(i). Problem (22) can be solved by resorting to first-order optimality conditions. Toward this end, let

    L*(Δw(i), Δλ(i)) ≜ L(w(i), λ(i)) + ∇_w L(w(i), λ(i))ᵀ Δw(i)
                       + (1/2) Δw(i)ᵀ ∇²_w L(w(i), λ(i)) Δw(i)
                       + Δλ(i)ᵀ [c_d(w(i)) + ∇_w c_d(w(i))ᵀ Δw(i)]

be the Lagrangian function associated with (22), where Δλ(i) ≜ λ(i+1) − λ(i). The first-order optimality conditions, derived by computing the gradient of L* as in (21), are the following:

    [ ∇²_w L       ∇_w c_d ] [ Δw(i) ]       [ ∇_w L ]
    [ (∇_w c_d)ᵀ      0    ] [ Δλ(i) ]  =  − [  c_d  ]           (23)

Under the assumption that ∇²_w L(w(i), λ(i)) is positive definite, one can solve (23) in the unknowns Δw(i) and Δλ(i); the new pair (w(i+1), λ(i+1)) is obtained from (w(i), λ(i)) as

    w(i+1) = w(i) + Δw(i)
    λ(i+1) = λ(i) + Δλ(i)                                        (24)

for i = 0, 1, . . .. The above procedure is repeated until a given stop criterion is satisfied. Equations (23) and (24) can be regarded as the result of applying Newton's method to solve (21). If the Hessian matrix is replaced by an approximation, the solution is derived by a quasi-Newton method.
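Iterations (23) and (24) can be sketched on a small equality-constrained toy problem (our own example, not the paper's PDE problem): minimize J(w) = w1² + w2² subject to c(w) = w1 w2 − 1 = 0, whose minimizers are w = ±(1, 1) with multiplier λ = −2:

```python
import numpy as np

# Toy instance of (18): J(w) = w1^2 + w2^2, c(w) = w1*w2 - 1 = 0.
def grad_J(w):  return 2.0 * w
def c(w):       return np.array([w[0] * w[1] - 1.0])
def grad_c(w):  return np.array([[w[1]], [w[0]]])     # N x 1 constraint gradient

def sqp(w, lam, iters=30, tol=1e-12):
    for _ in range(iters):
        gL = grad_J(w) + grad_c(w) @ lam              # ∇_w L as in (21)
        H = 2.0 * np.eye(2) + lam[0] * np.array([[0.0, 1.0],
                                                 [1.0, 0.0]])  # ∇²_w L
        A = grad_c(w)
        # KKT system (23): [[H, A], [Aᵀ, 0]] [Δw; Δλ] = -[∇_w L; c]
        K = np.block([[H, A], [A.T, np.zeros((1, 1))]])
        rhs = -np.concatenate([gL, c(w)])
        if np.linalg.norm(rhs) < tol:                 # stop criterion
            break
        step = np.linalg.solve(K, rhs)
        w, lam = w + step[:2], lam + step[2:]         # update (24)
    return w, lam

w, lam = sqp(np.array([1.1, 0.9]), np.array([0.0]))
```

Starting close to (1, 1), the iterates converge quadratically; note that at the solution ∇²_w L is positive definite only on the tangent space of the constraint, yet the KKT matrix in (23) remains nonsingular.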


TABLE I
SIMULATION RESULTS FOR PROBLEM DBOC

Optimal cost:

Case  P    Δt      Δx     | RAS: n=5  n=10   n=20  | ERAS: n=5  n=10    n=20  | DP
A     21   0.5     0.5    | 0.405     0.405  0.405 | 0.405      0.405   0.405 | 0.403*
B     41   0.25    0.25   | 0.276     0.275  0.274 | 0.271      0.266*  0.272 | 0.283
C     81   0.125   0.125  | 0.227     0.224  0.222 | 0.212*     0.216   0.219 | 0.315
D     161  0.0625  0.0625 | 0.205     0.200  0.199 | 0.185*     0.192   0.195 | 0.383

Simulation time [s]:

Case  P    Δt      Δx     | RAS: n=5  n=10   n=20  | ERAS: n=5  n=10     n=20     | DP
A     21   0.5     0.5    | 1.54*     1.81   4.76  | 71.6       71.3     87.8     | 54
B     41   0.25    0.25   | 12.5*     18.5   34.0  | 285.0      298.8    335.2    | 7.6·10^4
C     81   0.125   0.125  | 26.6*     62.5   128.9 | 1.0·10^3   1.1·10^3 1.3·10^3 | 4.6·10^5
D     161  0.0625  0.0625 | 99.5*     255.7  516.9 | 3.9·10^3   4.5·10^3 5.2·10^3 | 1.2·10^6

The "*" indicates the best result in each row.

Fig. 2. Relationship between optimal cost and space discretization for various numbers n of basis functions for problem DBOC (plots of J versus Δx for the RAS, ERAS, and DP, with n = 5, 10, 20).

To sum up, the proposed approach for the solution of optimal control problems for DPSs can be formalized as in Algorithm 1. It is worth noting that the approximation procedure that led us to formulate Problem 2 does not depend on the particular type of DPS: it can be applied to a large variety of DPSs, described by either linear or nonlinear elliptic, parabolic, or hyperbolic equations. However, the nature of the DPS is important in the choice of the PDE solver that can be integrated most efficiently with the SQP algorithm.

Remark 1: At present, there is no systematic way to choose the initial number n of basis functions. A common methodology consists in starting at step 2) of Algorithm 1 with a small number of basis functions; such a number is then increased at step 7) as long as the accuracy of the approximation improves, i.e., as long as the value of the cost function decreases when n is increased. Alternatively, one can adopt a trial-and-error procedure that consists in fixing n and then evaluating the quality of the approximation. Indeed, increasing the number of basis functions does not guarantee a decrease of the cost function in general. This desirable property is ensured under more restrictive assumptions, such as convexity. The convexity of the cost function guarantees the convergence of the SQP algorithm to the global minimum; because of the density property of the approximating sequence, an increase in the number of basis functions then corresponds to a larger set over which the minimum is searched, and hence to a lower optimal cost. If convexity does not hold, the SQP algorithm can be trapped in local minima, and thus lower values of n may provide better results than higher ones.

Remark 2: As regards the initial guesses of the vector of parameters w(0) and the Lagrange multiplier λ(0), different choices may give rise to different values of w° and λ° because of the presence of local minima in the cost function to be minimized.

In order to mitigate this risk, a typical modus operandi consists in adopting a so-called multistart technique, which is based on solving several instances of Algorithm 1, each characterized by randomly chosen w(0) and λ(0). The values of w° and λ° to be used in the controller are those corresponding to the lowest cost.

V. NUMERICAL RESULTS

In this section, we solve the optimal control problems reported in Section III by using the RAS and ERAS methods. The results are compared with a discretize-then-optimize approach based on the approximate DP algorithm. Such an approach consists, as a first step, in reducing the DPS to a discrete-time finite-dimensional LPS; the second step consists in using the approximate DP algorithm to compute a discretized suboptimal control law. Specifically, we used polynomials to interpolate the cost-to-go functions. We also considered other techniques for such interpolations, like those based on neural networks (the so-called neuro-DP [49]); since they achieved performances similar to those of polynomials but with an increased computational effort, such results are not reported here. All the simulations were performed on a personal computer with an Intel Xeon Quad Core CPU with a clock frequency of 2.5 GHz and 8 GB of RAM, by using the implementation of the SQP algorithm available in MATLAB [50] and the software Comsol Multiphysics [51] for the numerical solution of PDEs via the finite element method.

A. Distributed Optimal Control of the Burgers' Equation

We consider problem DBOC with L = 10, T = 1, y0(x) = 0.001x²(L − x)², u_min = −1, and u_max = +1.
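As reported below, problems of this type are advanced in time with the Lax–Friedrichs scheme. The following is a minimal sketch under our own simplifying assumptions (Burgers flux f(y) = y²/2, zero Dirichlet boundaries, and an explicit source term standing in for the distributed control; the exact form of (9) and of its boundary conditions is not repeated here):

```python
import numpy as np

def lax_friedrichs_step(y, dx, dt, source):
    """One Lax-Friedrichs step for y_t + f(y)_x = source with the Burgers
    flux f(y) = 0.5*y**2. Boundary treatment is an illustrative choice."""
    f = 0.5 * y ** 2
    yp, ym = np.roll(y, -1), np.roll(y, 1)           # neighbors y_{p+1}, y_{p-1}
    fp, fm = np.roll(f, -1), np.roll(f, 1)
    y_new = 0.5 * (yp + ym) - dt / (2.0 * dx) * (fp - fm) + dt * source
    y_new[0] = y_new[-1] = 0.0                       # simple Dirichlet boundaries
    return y_new

L, P = 10.0, 161                                     # case D in Table I
x = np.linspace(0.0, L, P)
dx = x[1] - x[0]
dt = 0.0625
y = 0.001 * x ** 2 * (L - x) ** 2                    # initial profile y0(x)
y = lax_friedrichs_step(y, dx, dt, source=np.zeros(P))
```

With Δt = Δx as in case D, the CFL number max|y0|·Δt/Δx ≈ 0.625 keeps the scheme stable.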

Fig. 3. Approximate optimal state y^n°(x, t) and control input u^n°(x, t) for problem DBOC along a trajectory starting from y0(x), obtained by the ERAS in the case B with n = 20 basis functions.

To tackle this problem with the RAS and ERAS approaches, we used the Lax–Friedrichs scheme (see [52, p. 71]) to solve (9), with temporal and spatial discretizations of widths Δt and Δx, respectively. Concerning the parameterized functions γ^n, we used the approximating structures (16) and (17) with Gaussian basis functions. In the case of the RAS, the centers and widths of the Gaussians were fixed by using low-discrepancy Sobol' sequences (see [53], [54]), in order to guarantee a distribution of points in the domain Ω × [0, T] × [0, 1] as uniform as possible. Concerning the DP approach, we applied approximate DP with polynomials of degree three for the interpolation of the cost-to-go functions. The sampling of the sets to which the discretized state vector belongs at the various time stages was made by using 1000 samples generated by Sobol' sequences; the discretization of the model equation was performed again with the Lax–Friedrichs scheme. Table I contains the optimal costs and simulation times for some values of n, together with the details of the considered case studies. Fig. 2 shows the relationship between the optimal cost and the space discretization of the considered approaches for various numbers n of basis functions. Figs. 3 and 4 show, respectively, the approximate optimal state y^n°(x, t) and control input u^n°(x, t) along a trajectory starting from y0(x), and the approximate optimal state y^n°(x, t) at different time instants, obtained by the ERAS in the case B with n = 20 basis functions. Note that the effect of the distributed control is that of flattening the initial profile y0(x). Based on Table I and Fig. 2, we can compute the approximate order of convergence (i.e., the slope of the curves in Fig. 2) of the various methods.
The orders of convergence with n = 10 basis functions turn out to be 0.33 for the RAS and 0.35 for the ERAS, whereas there is no convergence for the DP approach, since for fine discretizations the values of the optimal cost increase as Δx decreases (i.e., as the discretization becomes finer). All three methods provide more or less the same results when the discretization of the spatial domain is coarse, i.e., in the cases A and B. When the discretization becomes finer, i.e., in the cases C and D, the DP approach suffers from the curse of dimensionality and exhibits a nonconvergent behavior. This may be ascribed to the fact that the finer the discretization, the higher the number P of discretization points in the domain Ω, and hence the higher the dimension of the state and control vectors of the discretized

system employed for DP. Thus, the use of 1000 points to sample the sets to which the discrete state vector belongs is completely inadequate for the high dimension of such sets; using more points, however, would have increased the computational burden of that approach too much. By contrast, a convergent behavior of the optimal cost is exhibited by both the RAS and ERAS. More specifically, once the discretization grid is fixed, the optimal cost provided by the RAS decreases as n increases, whereas this does not always occur with the ERAS, as pointed out in Remark 1 (see Table I). This can be explained by noting that a higher n also implies a larger number of parameters to be optimized. Such a growth is moderate with the fixed-basis scheme, since the only parameters to be optimized are the outer ones; it is higher with the variable-basis scheme because of the presence of both inner and outer parameters. Due to the large number of such parameters, the NLP problems may be difficult to solve, with the result that the optimization algorithm can be trapped in local minima. This may occur because of a bad initialization, as outlined in Remark 2.

B. Optimal Reheating of a Slab

We consider problem BHOC with L1 = 10, L2 = 1, T = 20, y0(x) = 10 + 20 exp(−x1²/20), q_min = 0, q_max = 100, and

    yref(x, t) = 20 + 40t,          if t ∈ [0, 2)
                 100,               if t ∈ [2, 10)
                 100 + 50(t − 10),  if t ∈ [10, 12)
                 200,               if t ∈ [12, 20]

for t ∈ [0, T] and all x ∈ Ω. In other words, the reference reheating curve is made up of two steps of the same height. In order to solve this problem with the RAS and ERAS approaches, we used the finite element method with quadratic Lagrangian elements (see [15]). As for problem DBOC, we employed the structures (16) and (17) for γ^n with Gaussian basis functions.
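The reference reheating curve yref is a direct transcription (the function name is ours):

```python
def y_ref(t: float) -> float:
    """Reference reheating curve of problem BHOC (uniform in space):
    two ramps, each raising the reference temperature by 100 units."""
    if 0.0 <= t < 2.0:
        return 20.0 + 40.0 * t
    if 2.0 <= t < 10.0:
        return 100.0
    if 10.0 <= t < 12.0:
        return 100.0 + 50.0 * (t - 10.0)
    if 12.0 <= t <= 20.0:
        return 200.0
    raise ValueError("t outside the horizon [0, 20]")
```

Note that the curve is continuous: the first ramp reaches 100 at t = 2 and the second reaches 200 at t = 12.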
In the case of the RAS, the centers and widths of the Gaussians were again fixed by using low-discrepancy Sobol' sequences, in order to guarantee a distribution of points in the domain Ω × [0, T] × [0, 200]³ as uniform as possible. We evaluated the performances with different meshes for the discretization of the domain Ω. Two coarse meshes were generated for the case studies A and B, whereas the meshes of the cases C and D are refinements of the mesh used in the case B (see Fig. 5 for a pictorial sketch of the meshes and Table II for details on the case studies). Concerning the DP approach, we applied approximate DP with polynomials of degree three for the interpolation of the cost-to-go functions. The sampling of the sets to which the discretized state vector belongs at the various time stages was made by using 1000 samples generated by Sobol' sequences; the discretization of the model equation was performed again with the finite element method with quadratic Lagrangian elements. Table II reports the values of the optimal costs and simulation times for some values of n, together with the details of the considered case studies. Fig. 6 shows the relationship between the optimal cost and the space discretization of the considered

Fig. 4. Approximate optimal states for problem DBOC at different time instants in the case B with n = 20 basis functions.

TABLE II
SIMULATION RESULTS FOR PROBLEM BHOC

Optimal cost:

Case  P    | RAS: n=2  n=5      n=10     | ERAS: n=2  n=5      n=10      | DP
A     31   | 1.21·10^5 5.92·10^4 2.65·10^4 | 9.57·10^4  5.01·10^4 1.59·10^4* | 1.05·10^5
B     62   | 1.21·10^5 5.39·10^4 2.40·10^4 | 9.38·10^4  4.21·10^4 1.46·10^4* | 1.02·10^5
C     207  | 1.20·10^5 4.76·10^4 1.65·10^4 | 9.01·10^4  3.40·10^4 1.34·10^4* | 1.32·10^5
D     749  | 1.19·10^5 4.72·10^4 1.59·10^4 | 8.98·10^4  1.95·10^4 1.22·10^4* | 1.87·10^5

Simulation time [s]:

Case  P    | RAS: n=2   n=5      n=10     | ERAS: n=2  n=5      n=10     | DP
A     31   | 1.41·10^3* 3.28·10^3 9.06·10^3 | 4.35·10^3  2.58·10^4 7.08·10^4 | 9.47·10^3
B     62   | 1.55·10^3* 3.79·10^3 9.91·10^3 | 5.66·10^3  2.74·10^4 7.63·10^4 | 6.81·10^4
C     207  | 1.94·10^3* 8.21·10^3 1.29·10^4 | 6.70·10^3  4.73·10^4 9.46·10^4 | 2.22·10^5
D     749  | 4.61·10^3* 9.97·10^3 3.23·10^4 | 1.04·10^4  6.37·10^4 1.97·10^5 | 7.94·10^5

The "*" indicates the best result in each row.

Fig. 5. Meshes used for the discretization of the domain Ω for problem BHOC.

Fig. 6. Relationship between optimal cost and space discretization for various numbers n of basis functions for problem BHOC (plots of J versus P for the RAS, ERAS, and DP, with n = 2, 5, 10).

approaches for various numbers n of basis functions. In Fig. 7, the behaviors of the temperature of the slab at the points x_a, x_b, and x_c in the case B with n = 5 basis functions are reported, together with the corresponding optimal control inputs. One can note that the reference temperature profile yref is followed quite well, and that the largest values of the control input occur in correspondence with the steps of the reference reheating curve. The temperature distribution of the slab at three different time instants is shown in Fig. 8. Starting from a nonuniform initial

condition, the temperature of the slab grows and tends to become quite uniform in the domain Ω. Using the information available in Table II and Fig. 6, the orders of convergence with n = 5 basis functions turn out to be 0.10 for the RAS and 0.17 for the ERAS, whereas there is no convergence for the DP approach. As in problem DBOC, the values of the optimal costs obtained by the RAS and ERAS methods decrease as the mesh used for the discretization of the domain Ω becomes finer. On the

Fig. 7. Temperatures of the slab at the points x_a, x_b, and x_c (first row) and optimal control inputs (second row) in the case B with n = 5 basis functions (RAS, ERAS, and DP).

Fig. 8. Temperature distribution of the slab at different time instants obtained by the ERAS in the case B with n = 5 basis functions.

contrary, the DP approach shows a nonconvergent behavior: for fine discretizations, it presents an increase of the cost rather than a reduction. Once the discretization grid is fixed, the optimal cost provided by both the RAS and ERAS decreases as the number n of basis functions increases (see Table II), in accordance with the best outcome pointed out in Remark 1, even though convexity is not verified. Also in this case, the RAS is characterized by a reduced computational burden, whereas the ERAS is more computationally demanding but able to provide accurate solutions with a smaller number of basis functions. Note that in problem BHOC the ERAS provides much better results than the RAS as compared with problem DBOC, as one can realize by looking at Figs. 2 and 6. This may be ascribed to the increased difficulty of problem BHOC, which is two-dimensional instead of one-dimensional like problem DBOC.

VI. CONCLUSION

We have presented an approach to design feedback optimal controllers for DPSs in approximate form, based on the classical Ritz and the more recent extended Ritz methods (denoted in this paper by RAS and ERAS, respectively), in comparison with a discretize-then-optimize method based on approximate DP. As the discretizations become finer, the computational time needed to perform the approximation with the DP approach grows rapidly because of the occurrence of the curse of dimensionality. On the contrary, the growth of the effort required for the controller design with both the RAS and ERAS is moderate. Moreover, though the difficulty in solving the related mathematical programming problems is higher for the ERAS than for the RAS, accurate solutions can be obtained by the ERAS with a smaller number of basis functions, particularly in high-dimensional settings, as shown via an extensive simulation campaign with two numerical examples. In practice, the ERAS approach has turned out to be more effective in constructing simpler controllers for DPSs, since fewer basis functions were required as compared with the solution obtained with the RAS. Such an advantage may deserve further research efforts from both theoretical and practical viewpoints as a prospect of future investigation.

REFERENCES

[1] O. Aamo and M. Krstic, Flow Control by Feedback: Stabilization and Mixing. New York: Springer-Verlag, 2003. [2] A. Smyshlyaev and M. Krstic, Adaptive Control of Parabolic PDEs. Princeton, NJ: Princeton Univ. Press, 2010. [3] P. Christofides, Nonlinear and Robust Control of PDE Systems: Methods and Applications to Transport Reaction Processes. Boston, MA: Birkhauser, 2001. [4] J. Lions, Optimal Control of Systems Governed by Partial Differential Equations. New York: Springer-Verlag, 1971. [5] A. Fursikov, Optimal Control of Distributed Systems: Theory and Applications. Boston, MA: American Mathematical Society, 2000.

[6] E. Zuazua, “Propagation, observation, and control of waves approximated by finite difference methods,” SIAM Rev., vol. 47, no. 2, pp. 197–243, 2005. [7] M. Gunzburger, Perspective in Flow Control and Optimization. Philadelphia, PA: SIAM, 2003. [8] E. Zuazua, “Controllability and observability of partial differential equations: Some results and open problems,” in Handbook of Differential Equations: Evolutionary Differential Equations, C. Dafermos and E. Feiteisl, Eds. New York: Elsevier, 2006, pp. 527–621. [9] A. Alessandri, R. Cianci, M. Gaggero, and R. Zoppoli, “Approximate solution of feedback optimal control problems for distributed parameter systems,” in Proc. 8th IFAC Symp. Nonlinear Control Syst., 2010, pp. 987–992. [10] R. Bellman, Dynamic Programming. Princeton, NJ: Princeton Univ. Press, 1957. [11] M. Demetriou, “Design of consensus and adaptive consensus filters for distributed parameter systems,” Automatica, vol. 46, no. 2, pp. 300–311, 2010. [12] M. Krstic, B.-Z. Guo, and A. Smyshlyaev, “Boundary controllers and observers for the linearized Schrödinger equation,” SIAM J. Control Opt., vol. 49, no. 4, pp. 1479–1497, 2011. [13] S. Tang and C. Xie, “State and output feedback boundary control for a coupled PDE-ODE system,” Syst. Control Lett., vol. 60, no. 8, pp. 540–545, 2011. [14] P. Frihauf and M. Krstic, “Leader-enabled deployment onto planar curves: A PDE-based approach,” IEEE Trans. Autom. Control, vol. 56, no. 8, pp. 1791–1806, Aug. 2011. [15] A. Quarteroni, Numerical Models for Differential Problems. New York: Springer-Verlag, 2009. [16] A. Steindl and H. Troger, “Methods for dimension reduction and their application in nonlinear dynamics,” Int. J. Solids Struct., vol. 38, nos. 10–13, pp. 2131–2147, 2001. [17] N. El-Farra, A. Armaou, and P. Christofides, “Analysis and control of parabolic PDE systems with input constraints,” Automatica, vol. 39, no. 4, pp. 715–725, 2003. [18] O. Agudelo, M. Baes, J. Espinosa, M. Diehl, and B. D. 
Moor, “Positive polynomial constraints for POD-based model predictive controllers,” IEEE Trans. Autom. Control, vol. 54, no. 5, pp. 988–999, May 2009. [19] H.-N. Wu and H.-X. Li, “Adaptive neural control design for nonlinear distributed parameter systems with persistent bounded disturbances,” IEEE Trans. Neural Netw., vol. 20, no. 10, pp. 1630–1644, Oct. 2009. [20] K. Karhunen, “Zur spektraltheorie stochastischer prozesse,” Ann. Acad. Sci. Fenn. A1, vol. 37, 1946. [21] M. Loeve, “Functions aleatoire de second ordre,” Comptes Rendus Acad. Sci., vol. 220, 1945. [22] K. Pearson, “On lines and planes of closest fit to systems of points in space,” Philosophical Mag., vol. 2, no. 6, pp. 559–572, 1901. [23] H. Hotelling, “Analysis of a complex of statistical variables into principal components,” J. Educat. Psy., vol. 24, no. 7, pp. 417–441, 1933. [24] R. Bellman, R. Kalaba, and B. Kotkin, “Polynomial approximation: A new computational technique in dynamic programming allocation processes,” Math. Comp., vol. 17, no. 82, pp. 155–161, 1963. [25] S. Mehraeen and S. Jagannathan, “Decentralized optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Jacobi-Bellman formulation,” IEEE Trans. Neural Netw., vol. 22, no. 11, pp. 1757–1769, Nov. 2011. [26] Y. Jiang and Z.-P. Jiang, “Approximate dynamic programming for optimal stationary control with control-dependent noise,” IEEE Trans. Neural Netw., vol. 22, no. 12, pp. 2392–2398, Dec. 2011. [27] R. Becker, H. Kapp, and R. Rannacher, “Adaptive finite element methods for optimal control of partial differential equations: Basic concepts,” SIAM J. Control Opt., vol. 39, no. 1, pp. 113–132, 2000. [28] R. Becker, “Mesh adaptation for stationary flow control,” J. Math. Fluid Mech., vol. 3, no. 4, pp. 317–341, 2001. [29] B. Vexler and W. Wollner, “Optimal control of the Stokes equations: A priori error analysis for finite element discretization with postprocessing,” SIAM J. Numer. Anal., vol. 
44, no. 5, pp. 1903–1920, 2006. [30] R. Becker, D. Meidner, and B. Vexler, “Efficient numerical solution of parabolic optimization problems by finite element methods,” Opt. Methods Softw., vol. 22, no. 5, pp. 813–833, 2007. [31] B. Vexler and W. Wollner, “Adaptive finite elements for elliptic optimization problems with control constraints,” SIAM J. Control Opt., vol. 47, no. 1, pp. 509–534, 2008. [32] R. Zoppoli, M. Sanguineti, and T. Parisini, “Approximating networks and extended Ritz method for the solution of functional optimization problems,” J. Opt. Theory Appl., vol. 112, no. 2, pp. 403–440, 2002.

[33] W. Ritz, “Über eine neue methode zur lösung gewisser variationsprobleme der mathematischen physik,” J. für die Reine und Angew. Math., vol. 135, no. 1, pp. 1–61, 1909. [34] I. Gelfand and S. Fomin, Calculus of Variations. Englewood Cliffs, NJ: Prentice Hall, 1963. [35] B. Finlayson, The Method of Weighted Residuals and Variational Principles. New York: Academic, 1972. [36] D. Burnett, Finite Element Analysis - From Concepts to Applications. Reading, MA: Addison-Wesley, 1988. [37] V. Kurkova and M. Sanguineti, “Comparison of worst-case errors in linear and neural-network approximation,” IEEE Trans. Inform. Theory, vol. 48, no. 1, pp. 264–275, Jan. 2002. [38] G. Gnecco, M. Sanguineti, and M. Gaggero, “Suboptimal solutions to team optimization problems with stochastic information structure,” SIAM J. Opt., vol. 22, no. 1, pp. 212–243, 2012. [39] J. Nocedal and S. Wright, Numerical Optimization. New York: SpringerVerlag, 2006. [40] S. S. Ravindran, “Numerical approximation of optimal control of unsteady flows using SQP and time decomposition,” Int. J. Numer. Methods Fluids, vol. 45, no. 1, pp. 21–42, 2004. [41] C. Xu, Y. Ou, J. Dalessio, E. Schuster, T. C. Luce, J. R. Ferron, M. L. Walker, and D. A. Humphreys, “Ramp-up-phase current-profile control of tokamak plasmas via nonlinear programming,” IEEE Trans. Plasma Sci., vol. 38, no. 2, pp. 163–173, Feb. 2010. [42] A. Sage, Optimum Systems Control. Englewood Cliffs, NJ: Prentice-Hall, 1968. [43] I. Singer, Best Approximation in Normed Linear Spaces by Elements of Linear Subspaces. New York: Springer-Verlag, 1970. [44] A. Pinkus, n-Widths in Approximation Theory. New York: SpringerVerlag, 1985. [45] A. Barron, “Universal approximation bounds for superpositions of a sigmoidal function,” IEEE Trans. Inf. Theory, vol. 39, no. 3, pp. 930– 945, May 1993. [46] S. C. Tong, Y. M. Li, and H.-G. 
Zhang, “Adaptive neural network decentralized backstepping output-feedback control for nonlinear largescale systems with time delays,” IEEE Trans. Neural Netw., vol. 22, no. 7, pp. 1073–1086, Jul. 2011. [47] A. Alessandri, M. Baglietto, G. Battistelli, and M. Gaggero, “Movinghorizon state estimation for nonlinear systems using neural networks,” IEEE Trans. Neural Netw., vol. 22, no. 5, pp. 768–780, May 2011. [48] D. P. Bertsekas, Nonlinear Programming, 2nd ed. Belmont, MA: Athena Scientific, 1999. [49] P. D. Bertsekas, Dynamic Programming and Optimal Control, 3rd ed. Belmont, MA: Athena Scientific, 2005. [50] T. Coleman, M. Branch, and A. Grace, Optimization Toolbox. Natick, MA: The MathWorks, 1999. [51] COMSOL Multiphysics Modeling Guide: Version 3.5, COMSOLAB, Stockholm, Sweden, 2008. [52] R. Leveque, Finite Volume Methods for Hyperbolic Problems. Cambridge, U.K.: Cambridge Univ. Press, 2004. [53] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods. Philadelphia, PA: SIAM, 1992. [54] I. Sobol, “The distribution of points in a cube and the approximate evaluation of integrals,” Zh. Vychisl. Mat. Mat. Fiz., vol. 7, no. 4, pp. 784–802, 1967.

Angelo Alessandri (S’88–M’97–SM’02) received the Laurea degree in electronic engineering and the Ph.D. degree in electronic engineering and computer science from the University of Genoa, Genoa, Italy, in 1992 and 1996, respectively. He was a Visiting Scientist with the Naval Postgraduate School, Monterey, CA, in 1998. From 1996 to 2005, he was a Research Scientist with the National Research Council of Italy. In 2005, he joined the University of Genoa, where he is currently an Associate Professor. His current research interests include optimal control, linear and nonlinear estimation, and neural networks. Dr. Alessandri was an Associate Editor of the IEEE TRANSACTIONS ON NEURAL NETWORKS and the Journal of Engineering Applications of Artificial Intelligence. He is an Editor of the International Journal of Adaptive Control and Signal Processing. He is an Associate Editor of the IEEE Control Systems Society Conference Editorial Board and the IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY.


Mauro Gaggero was born in Genoa, Italy, on December 8, 1981. He received the B.S. and M.S. degrees in electronic engineering and the Ph.D. degree in mathematical engineering from the University of Genoa, Genoa, in 2003, 2005, and 2010, respectively. He was a Post-Doctoral Fellow with the Faculty of Engineering, University of Genoa, in 2010. Since 2011, he has been a Research Scientist with the Institute of Intelligent Systems for Automation, National Research Council of Italy, Italy. His current research interests include distributed parameter systems, neural networks, optimal control, and functional optimization problems.

Riccardo Zoppoli received the Laurea degree in electronic engineering from the University of Genoa, Genoa, Italy, in 1965, and the Libero Docente degree in 1971. He was an Associate Professor of control theory with the University of Genoa from 1968 to 1979. Since 1980, he has been a Full Professor of control theory and operations research with the same university. In 1972, he was a NATO Visiting Scholar with the University of California, Los Angeles. From 1979 to 1985, he was responsible for the special project Distributed Process Control Systems supported by the National Research Council of Italy. He is the author of over 150 scientific papers in the area of automatic control and operations research. His current research interests include large-scale systems, team theory, infinite-dimensional optimization, and neural approximation. Dr. Zoppoli has served on various national and international technical and academic committees and as a member of several editorial boards of technical journals. He was the co-recipient of the Outstanding Paper Award of the IEEE TRANSACTIONS ON NEURAL NETWORKS in 2004.
