Adaptive Dynamic Programming for Control: Algorithms and Stability

By Huaguang Zhang, Derong Liu, Yanhong Luo, Ding Wang

There are many methods of stable controller design for nonlinear systems. In seeking to go beyond the minimum requirement of stability, Adaptive Dynamic Programming in Discrete Time approaches the challenging topic of optimal control for nonlinear systems using the tools of adaptive dynamic programming (ADP). The range of systems treated is extensive; affine, switched, singularly perturbed and time-delay nonlinear systems are discussed, as are the uses of neural networks and techniques of value and policy iteration. The text features three main aspects of ADP in which the methods proposed for stabilization and for tracking and games benefit from the incorporation of optimal control methods:
• infinite-horizon control, for which the difficulty of solving partial differential Hamilton–Jacobi–Bellman equations directly is overcome, and proof provided that the iterative value function updating sequence converges to the infimum of all the value functions obtained by admissible control law sequences;
• finite-horizon control, implemented in discrete-time nonlinear systems, showing the reader how to obtain suboptimal control solutions within a fixed number of control steps and with results more easily applied in real systems than those usually obtained from infinite-horizon control;
• nonlinear games, for which a pair of mixed optimal policies are derived for solving games both when the saddle point does not exist, and, when it does, avoiding the existence conditions of the saddle point.
Non-zero-sum games are studied in the context of a single network scheme in which policies are obtained guaranteeing system stability and minimizing the individual performance function, yielding a Nash equilibrium.
In order to make the coverage suitable for the student as well as for the expert reader, Adaptive Dynamic Programming in Discrete Time:
• establishes the fundamental theory involved clearly, with each chapter devoted to a clearly identifiable control paradigm;
• demonstrates convergence proofs of the ADP algorithms to deepen understanding of the derivation of stability and convergence with the iterative computational methods used; and
• shows how ADP methods can be put to use both in simulation and in real applications.
This text will be of considerable interest to researchers interested in optimal control and its applications in operations research, applied mathematics, computational intelligence and engineering. Graduate students working in control and operations research will also find the ideas presented here to be a source of powerful methods for furthering their study.
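The iterative value-function updating mentioned above, starting from a zero initial value function so that no stabilizing initial policy is required, can be illustrated with a minimal sketch. The scalar linear system x(k+1) = a·x(k) + b·u(k), the quadratic cost q·x² + r·u², and the closed-form inner minimization below are illustrative assumptions, not the book's general nonlinear setting; with V_i(x) = p_i·x², the update V_{i+1}(x) = min_u [q·x² + r·u² + V_i(a·x + b·u)] collapses to a scalar recursion on p_i:

```python
def value_iteration(a, b, q, r, iters=200):
    """ADP-style value iteration for the assumed scalar linear system.

    V_i(x) = p_i * x**2, with V_0 = 0 (no stabilizing initial policy
    needed); returns the converged kernel p and its iteration history.
    """
    p = 0.0
    history = [p]
    for _ in range(iters):
        # Minimizing u = -a*b*p*x / (r + b*b*p) gives this closed-form update,
        # the scalar Riccati-like recursion induced by the Bellman equation.
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        history.append(p)
    return p, history
```

The history is monotonically nondecreasing from zero, mirroring the convergence of the iterative value function sequence to the optimal value from below.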



Similar system theory books

Controlled and Conditioned Invariants in Linear System Theory

Using a geometric approach to system theory, this work discusses controlled and conditioned invariance to geometric analysis and design of multivariable control systems, presenting new mathematical theories, new approaches to standard problems, and applied mathematics topics.

Theory of Commuting Nonselfadjoint Operators

Considering integral transformations of Volterra type, F. Riesz and B. Sz.-Nagy noticed in 1952 that [49]: "The existence of such a variety of linear transformations, having the same spectrum concentrated at a single point, brings out the difficulties of characterization of linear transformations of general type by means of their spectra."

General Pontryagin-Type Stochastic Maximum Principle and Backward Stochastic Evolution Equations in Infinite Dimensions

The classical Pontryagin maximum principle (addressed to deterministic finite dimensional control systems) is one of the three milestones in modern control theory. The corresponding theory is by now well developed in the deterministic infinite dimensional setting and for stochastic differential equations.

Science and the Economic Crisis: Impact on Science, Lessons from Science

This book not only explores the ways in which the economic crisis and associated austerity policies have adversely impacted the physical and human infrastructure and conduct of scientific research, but also considers how science can help us to understand the crisis and provide original solutions. Starting with a detailed but accessible analysis of the scientific method and the nature of scientific prediction, the book proceeds to address the failure to forecast the economic crisis and the origins of the continuing inertia in economic policy and theory.

Additional info for Adaptive Dynamic Programming for Control: Algorithms and Stability

Sample text

IEEE Trans Syst Man Cybern, Part B, Cybern 40(3):831–844
119. Zhang HG, Cui LL, Zhang X, Luo YH (2011) Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Trans Neural Netw 22(12):2226–2236
120. Zhang HG, Wei QL, Liu D (2011) An iterative approximate dynamic programming method to solve for a class of nonlinear zero-sum differential games. Automatica 47(1):207–214
121. Zhao Y, Patek SD, Beling PA (2008) Decentralized Bayesian search using approximate dynamic programming methods.

In either the continuous-time domain or the discrete-time domain, the adaptive critic algorithms are easy to initialize, since initial policies are not required to be stabilizing. In the following, we first present some basic knowledge regarding nonlinear zero-sum differential games [120]. Consider the two-person zero-sum differential game (1.36), where x ∈ R^n, u ∈ R^k, w ∈ R^m, and the initial condition x(0) = x0 is given. The two control variables u and w are functions chosen on [0, ∞) by player I and player II from some control sets U[0, ∞) and W[0, ∞), respectively, subject to the constraints u ∈ U(t) and w ∈ W(t) for t ∈ [0, ∞), for given convex and compact sets U(t) ⊂ R^k, W(t) ⊂ R^m.
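A finite (matrix-game) analogue can make the saddle-point question in this zero-sum setting concrete: player I picks a row to minimize the payoff and player II picks a column to maximize it, and a pure-strategy saddle point exists exactly when the upper value min-max equals the lower value max-min; when they differ, mixed policies are needed. The payoff matrices below are illustrative assumptions:

```python
def upper_lower_values(A):
    """Upper and lower values of a finite zero-sum game with payoff matrix A.

    Player I (rows) minimizes, player II (columns) maximizes.
    A pure-strategy saddle point exists iff the two values coincide.
    """
    # Upper value: player I moves first, anticipating the worst column.
    upper = min(max(row) for row in A)
    # Lower value: player II moves first, anticipating the best row.
    lower = max(min(A[i][j] for i in range(len(A))) for j in range(len(A[0])))
    return upper, lower
```

For A = [[1, 2], [3, 4]] both values equal 2, so a pure saddle point exists; for the matching-pennies-style A = [[0, 1], [1, 0]] the values are 1 and 0, so only mixed policies can equalize the game.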

Therefore, in the following we present another method, called the iterative DHP algorithm, to implement the iterative ADP algorithm. Define the costate function λ(x) = ∂V(x)/∂x. Here, we assume that the value function V(x) is smooth so that λ(x) exists. Then (2.10) can be implemented as follows. First, we start with an initial costate function λ_0(·) = 0. Then, for i = 0, 1, . . . , from (2.45) we obtain the corresponding control law v_i(x) as

v_i(x(k)) = Ū φ(−(1/2)(ŪR)^{−1} g^T(x(k)) λ_i(x(k+1))).   (2.46)

The costate function is then updated by differentiating the iterated value function with respect to the state, both directly and through the control law:

λ_{i+1}(x(k)) = ∂[x^T(k)Qx(k) + W(v_i(x(k)))]/∂x(k)
  + (∂v_i(x(k))/∂x(k))^T ∂[x^T(k)Qx(k) + W(v_i(x(k)))]/∂v_i(x(k))
  + (∂x(k+1)/∂x(k))^T ∂V_i(x(k+1))/∂x(k+1)
  + (∂v_i(x(k))/∂x(k))^T (∂x(k+1)/∂v_i(x(k)))^T ∂V_i(x(k+1))/∂x(k+1).
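Specializing this costate recursion to an assumed scalar linear system x(k+1) = a·x(k) + b·v(k) with unconstrained control and cost q·x²(k) + r·v²(k) keeps the costate linear, λ_i(x) = 2·p_i·x, so no neural-network approximation is needed. The system, weights, and unconstrained simplification are illustrative assumptions, not the book's general constrained nonlinear setting:

```python
def dhp_iteration(a, b, q, r, iters=200):
    """Sketch of the iterative DHP recursion for an assumed scalar linear system.

    The costate is lambda_i(x) = 2 * p_i * x; returns the converged kernel p
    and the induced linear control gain K with v(x) = K * x.
    """
    p = 0.0                               # lambda_0(.) = 0, i.e. p_0 = 0
    for _ in range(iters):
        # Control law induced by the current costate: v_i(x) = K * x.
        K = -a * b * p / (r + b * b * p)
        # Costate update (scalar form of the chain-rule recursion):
        # direct cost term, control-law term, and propagated costate term
        # combine into p_{i+1} = q + r*K**2 + p*(a + b*K)**2.
        p = q + r * K * K + p * (a + b * K) ** 2
    K = -a * b * p / (r + b * b * p)      # final control gain
    return p, K
```

For stabilizable (a, b) the kernel p converges to the solution of the corresponding discrete algebraic Riccati equation and the closed loop a + b·K is stable, mirroring the convergence of the iterative DHP algorithm.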

