|Summary||Making decisions under uncertainty requires careful evaluation of the costs and benefits not only of the immediate action but also of choices to be made in the future. This evaluation is made harder by the stochastic nature of the problem, which requires many possible outcomes to be evaluated in parallel. In the traditional normative decision theory framework, the mathematical approach consists of the following steps. First, consider a stylized stochastic process that captures the evolution of the system's state over time (which may be only partially observable, e.g., due to noise in the data collection or incomplete information). Also consider a set of possible actions. Then derive the cost-optimal policy over time by reducing the problem to the Hamilton-Jacobi-Bellman (HJB) equation and/or using Pontryagin's Maximum Principle. Examples of decision problems with such characteristics (stochasticity, partial observability, complex temporal cost-benefit tradeoffs) include maintenance management, portfolio management and asset allocation, allocation of resources (multi-armed bandit problems), scheduling in queueing systems, and the like.
Such sequential decision problems can be modelled as (partially observable) Markov decision processes ((PO)MDPs). Although useful from a modelling perspective, (PO)MDPs have the disadvantage of being exceptionally hard to solve: optimal or ε-optimal policies are often elusive, and numerical solutions can be obtained only for problems with simple structure (e.g., a countable state space or low computational complexity). A challenging goal in this research area is to exploit additional structural properties of the domain, together with suitable approximations, to obtain good solutions efficiently.
The proposed research takes a fresh look at a classical problem at the heart of SOR and capitalizes on the strengths of combining stochastics and algorithmics. Progress along these directions will lead to the development of a novel methodological and analytic framework for decision-making under uncertainty.
|Supervisors||Remco van der Hofstad (TU/e) and Stella Kapodistria (TU/e)|
|PhD Student||Rowel Gündlach|
|Location||Technische Universiteit Eindhoven (TU/e)|