function [P,F] = olrp(beta,A,B,Q,R,H) %function [P,F] = olrp(beta,A,B,Q,R,H) % written by O. Mikhail %OLRP can have arguments: (beta,A,B,Q,R) if there are no cross products % (i.e. H=0). Set beta=1, if there is no discounting. % % OLRP calculates f of the feedback law: % % u = -fx % % that maximizes the function: % % sum {beta^t [x'Qx + u'Ru +2u'Hx] } % % subject to % x[t+1] = Ax[t] + Bu[t] % % where x is the nx1 vector of states, u is the kx1 vector of controls, % A is nxn, B is nxk, Q is nxn, R is kxk, H is nxk. % % Also returned is p, the steady-state solution to the associated % discrete matrix Riccati equation. % m=max(size(A)); [rb,cb]=size(B); if nargin==5; H=zeros(m,cb); end; p0=-.01*eye(m); dd=1; it=1; maxit=10000; % check tolerance; for greater accuracy set it to 1e-10 while (dd>1e-10 & it<=maxit); % equation 4.48 page 76 f0=(R+beta*A'*p0*A)-((beta*A'*p0*B)+H')*inv(Q+(beta*B'*p0*B))*((beta*B'*p0*A)+H); dd=max(max(abs(f0-p0))); %disp(' it: ') %it %disp(' dd: ') %dd it=it+1; p0=f0; end; P=p0; % now compute F the optimal policy rule F = inv(Q+(beta*B'*P*B))*((beta*B'*P*A)+H); % wrong F = beta*(inv(Q+(beta*B'*P'*B))*B'*P'*A); % also that was wrong F = inv(Q+B'*P*B)*B'*P*A; % SAY BYE BYE NOW