An efficient policy iteration algorithm for dynamic programming equations

We present an accelerated algorithm for the solution of static Hamilton–Jacobi–Bellman equations related to optimal control problems. Our scheme is based on a classic policy iteration procedure, which is known to have superlinear convergence in many relevant cases provided the initial guess is sufficiently close to the solution. This limitation often degenerates into a behavior similar to a value iteration method, with an increased computation time. The new scheme circumvents this problem by combining the advantages of both algorithms with an efficient coupling. The method starts with a coarse-mesh value iteration phase and then switches to a fine-mesh policy iteration procedure when a certain error threshold is reached. A delicate point is to determine this threshold in order to avoid cumbersome computations with the value iteration and at the same time to ensure the convergence of the policy iteration method to the optimal solution. We analyze the methods and efficient coupling in a number of examples in different dimensions, illustrating their properties.
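The coupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy one-dimensional discounted problem on [-1, 1] with running cost x^2, three controls that move one node left, stay, or move one node right, and a time step equal to the mesh size so that every move lands on a grid node. All function names, the discount rate `LAM`, the mesh sizes, and the fixed switching tolerance `switch_tol` are hypothetical choices made for illustration. Policy evaluation is done here by a plain fixed-point iteration rather than a linear solve.

```python
import math

ACTIONS = (-1, 0, 1)   # move one node left, stay, move one node right
LAM = 1.0              # discount rate (illustrative assumption)

def make_grid(n):
    """Uniform grid on [-1, 1]; the time step h is taken equal to the mesh size."""
    dx = 2.0 / (n - 1)
    return [-1.0 + i * dx for i in range(n)], dx

def q_value(v, xs, h, i, a):
    """Cost of applying control a at node i, then following the value v."""
    j = min(max(i + a, 0), len(xs) - 1)            # clamp to the domain
    return h * xs[i] ** 2 + math.exp(-LAM * h) * v[j]

def sup_diff(u, w):
    return max(abs(p - q) for p, q in zip(u, w))

def value_iteration(v, xs, h, tol, max_iter=100000):
    for _ in range(max_iter):
        new = [min(q_value(v, xs, h, i, a) for a in ACTIONS)
               for i in range(len(xs))]
        err, v = sup_diff(new, v), new
        if err < tol:
            break
    return v

def greedy_policy(v, xs, h):
    return [min(ACTIONS, key=lambda a: q_value(v, xs, h, i, a))
            for i in range(len(xs))]

def evaluate_policy(policy, v, xs, h, tol=1e-12, max_iter=100000):
    """Fixed-point iteration for the value of a frozen policy."""
    for _ in range(max_iter):
        new = [q_value(v, xs, h, i, policy[i]) for i in range(len(xs))]
        err, v = sup_diff(new, v), new
        if err < tol:
            break
    return v

def policy_iteration(v, xs, h, tol=1e-10, max_iter=100):
    policy = greedy_policy(v, xs, h)
    for _ in range(max_iter):
        v_new = evaluate_policy(policy, v, xs, h)
        policy = greedy_policy(v_new, xs, h)
        err, v = sup_diff(v_new, v), v_new
        if err < tol:                              # values have stopped changing
            break
    return v, policy

def interpolate(v_c, xs_c, xs_f):
    """Piecewise-linear transfer of coarse-grid values to the fine grid."""
    dx = xs_c[1] - xs_c[0]
    out = []
    for x in xs_f:
        k = min(int((x - xs_c[0]) / dx), len(xs_c) - 2)
        t = (x - xs_c[k]) / dx
        out.append((1 - t) * v_c[k] + t * v_c[k + 1])
    return out

def coupled_solve(n_coarse=11, n_fine=41, switch_tol=1e-2):
    # Phase 1: cheap value iteration on the coarse mesh, stopped early.
    xs_c, h_c = make_grid(n_coarse)
    v_c = value_iteration([0.0] * n_coarse, xs_c, h_c, switch_tol)
    # Phase 2: policy iteration on the fine mesh, warm-started from phase 1.
    xs_f, h_f = make_grid(n_fine)
    return policy_iteration(interpolate(v_c, xs_c, xs_f), xs_f, h_f)
```

Calling `coupled_solve()` returns the fine-mesh value function and a greedy policy for this toy problem. The sketch fixes the switching tolerance in advance; how to choose that threshold is the delicate point the abstract refers to, and nothing here should be read as the paper's criterion.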

Alla A.; Falcone M.; Kalise D.

2015-01-01

Publication year	2015
Journal	SIAM JOURNAL ON SCIENTIFIC COMPUTING
Volume	37
DOI	https://dx.doi.org/10.1137/130932284
Appears in types	2.1 Journal article

Files in this record:

File	Size	Format
12_AFK.pdf (open access; type: publisher's version; license: free access, view only)	806.51 kB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3746322
