Author
|
Conference
|
Journal
|
Organization
|
Year
|
DOI
Look for results that meet for the following criteria:
since
equal to
before
between
and
Search in all domains
Limit my searches in the following domains
Agriculture Science
Arts & Humanities
Biology
Chemistry
Computer Science
Economics & Business
Engineering
Environmental Sciences
Geosciences
Material Science
Mathematics
Medicine
Physics
Social Science
Multidisciplinary
Keywords
(5)
Error Bound
Finite Horizon
Numerical Analysis
Optimal Policy
Markov Decision Process
Subscribe
Academic
Publications
Numerical analysis of continuous time Markov decision processes over finite horizons
Edit
Numerical analysis of continuous time Markov decision processes over finite horizons
(
Citations: 4
)
BibTex
|
RIS
|
RefWorks
Download
Peter Buchholz
,
Ingo Schulz
Continuous time Markov decision processes (CTMDPs) with a finite state and action space have been considered for a long time. It is known that under fairly general conditions the reward gained over a
finite horizon
can be maximized by a so-called piecewise constant policy which changes only finitely often in a finite interval. Although this result is available for more than 30 years,
numerical analysis
approaches to compute the
optimal policy
and reward are restricted to discretization methods which are known to converge to the true solution if the discretization step goes to zero. In this paper, we present a new method that is based on uniformization of the CTMDP and allows one to compute an ε-optimal policy up to a predefined precision in a numerically stable way using adaptive time steps.
Journal:
Computers & Operations Research - CoR
, vol. 38, no. 3, pp. 651-659, 2011
DOI:
10.1016/j.cor.2010.08.011
Cumulative
Annual
View Publication
The following links allow you to view full publications. These links are maintained by other sources not affiliated with Microsoft Academic Search.
(
www.sciencedirect.com
)
(
www.informatik.uni-trier.de
)
(
dx.doi.org
)
(
linkinghub.elsevier.com
)
More »
Citation Context
(3)
...While efficient approximation [
7
,17,22] is of interest for a practitioner, being unable to determine whether or not optimal control exists—or if the optimal quality which was describedmorethanhalfacenturyagobyBellman[5]maybealimitthatcannotbereached— is very dissatisfying from a theoretical point of view...
Markus N. Rabe
,
et al.
Finite optimal control for time-bounded reachability in CTMDPs and con...
...Uniformization and discretization have been applied to approximate the optimal rewards for both infinite and finite horizons [26], [18], [27], [28], [
29
]...
Martin R. Neuhäußer
,
et al.
Time-Bounded Reachability Probabilities in Continuous-Time Markov Deci...
...The method we present is an adaption and extension of a recent algorithm [
11
] to compute the accumulated reward in a CTMDP over a finite interval...
...This then provides both theoretical and practical insights: (i) on the theoretical side, we have that all maximal (or minimal) values arising in model checking can be obtained by finite memory policies, (ii) on the practical side, we exploit recent algorithmic advances [
11
] to arrive at an efficient approximation algorithm—providing upper and lower bounds— based on the well known notion of uniformization...
...The following notations are mainly taken from [12] and are used similarly in [
11
]...
...It is based on uniformization [16] for CTMCs and its recent extension to CTMDPs with rewards [
11
], which, in our notation, treats the approximation of R max (C [0,T] true)...
...Eqns. (9) and (10), combined with the uniformization approach (8), can be used to derive (see [
11
]) the following sequences of vectors:...
...Notice that for zero rewards for probabilistic reachability, we have w (0) = w (0) = r = 0. From the known bounds for g ∗ , new bounds for g ∗−δ can then be computed as follows (see [
11, Theorem 3
]): Model Checking Algorithms for CTMDPs 235...
...where d is the decision vector chosen by the selection procedure using g t .A s shown in [
11
] the local error of a step of length δ is in O(δ 2 ) such that theoretically the...
Peter Buchholz
,
et al.
Model Checking Algorithms for CTMDPs
References
(12)
Dynamic Programming and Optimal Control
(
Citations: 1567
)
Dimitri Bertsekas
Published in 1995.
The Randomization Technique as a Modeling Tool and Solution Procedure for Transient Markov Processes
(
Citations: 175
)
D. Gross
,
D. R. Miller
Journal:
Operations Research
, vol. 32, no. 2, pp. 343-361, 1984
Markoff chains as an aid in the study of markoff processes
(
Citations: 121
)
A. Jensen
An equivalence between continuous and discrete time markov decision processes
(
Citations: 63
)
R. F. Serfozo
Conference:
Operations Research - OR
, 1979
Finite State Continuous Time Markov Decision Processes with a Finite Planning Horizon
(
Citations: 19
)
Bruce L. Miller
Journal:
Siam Journal on Control
, vol. 6, no. 2, 1968
Order by:
Citations
(4)
Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games
Markus N. Rabe
,
Sven Schewe
Published in 2011.
Time-Bounded Reachability Probabilities in Continuous-Time Markov Decision Processes
(
Citations: 3
)
Martin R. Neuhäußer
,
Lijun Zhang
Conference:
Quantitative Evaluation of Systems - QEST
, pp. 209-218, 2010
Efficient Approximation of Optimal Control for Markov Games
(
Citations: 1
)
Markus N. Rabe
,
Sven Schewe
,
Lijun Zhang
Journal:
Computing Research Repository - CORR
, vol. abs/1011.0, 2010
Model Checking Algorithms for CTMDPs
Peter Buchholz
,
Ernst Moritz Hahn
,
Holger Hermanns
,
Lijun Zhang