Editors: Mickael Randour and Jeremy Sproston

Authors, Title and AbstractPaperTalk

ABSTRACT. We study n-player turn-based games played on a finite directed graph. For each play, each player receives a gain that he wants to maximise. Instead of the well-known notions of Nash equilibrium (NE) and subgame perfect equilibrium (SPE), we focus on the recent notion of weak subgame perfect equilibrium (weak SPE), a refinement of SPE. In this setting, players who deviate cannot use the full class of strategies but only a subclass with a finite number of deviation steps. We are interested in the constraint problem: given an upper and a lower thresholds x, y in {0,1}^n, we want to determine if there exists a weak SPE such the gain of each player is componentwise between the two bounds. The goal of our ongoing research is to characterise the complexity of the constraint problem for the class of games with Boolean omega-regular objectives such that Muller, Büchi, Reachability ...

Jul 13 11:00

ABSTRACT. Timed automata and hybrid systems are important frameworks for the analysis of continuous-time systems. In this ongoing work, we study a stochastic extension of the latter. There are clear challenges regarding decidability: (i) the reachability problem is already quickly undecidable for non-stochastic hybrid systems; (ii) even in the simpler setting of timed automata, the finite abstraction known as the region graph does not preserve its correctness when lifting it to the stochastic setting. Our goal is to define the class of stochastic o-minimal hybrid systems (SoHSs) and to show that it ensures good properties while being reasonably rich (e.g., it encompasses continuous-time Markov chains). We look for decisiveness of SoHSs (as introduced by Adbulla et al.) and definability in our o-minimal structure. Hopefully, approximation of reachability probabilities could be obtainable in appropriate (and still quite rich) structures.

Jul 13 15:00

ABSTRACT. In recent years, two-player zero-sum games with multiple objectives have received a lot of interest as a model for the synthesis of complex reactive systems. In this framework, Player 1 wins if he can ensure that all objectives are satisfied against any behavior of Player 2. When this is not possible to satisfy all the objectives at once, an alternative is to use some preorder on the objectives according to which subset of objectives Player 1 wants to satisfy. For example, it is often natural to provide more significance to one objective over another, a situation that can be modelled with lexicographically ordered objectives for instance. Inspired by recent work on concurrent games with multiple omega-regular objectives by Bouyer et al., we investigate in detail turned-based games with monotonically ordered and omega-regular objectives. We study the threshold problem which asks whether Player 1 can ensure a payoff greater than or equal to a given threshold w.r.t. a given monotonic preorder. As the number of objectives is usually much smaller than the size of the game graph, we provide a parametric complexity analysis and we show that our threshold problem is fixed parameter tractable for all monotonic preorders and all classical types of omega-regular objectives. We also provide polynomial time algorithms for Buchi, coBuchi and explicit Muller objectives for a large subclass of monotonic preorders that includes among others the lexicographic preorder. In the particular case of lexicographic preorder, we also study the complexity of computing the values and the memory requirements of optimal strategies.

Jul 13 11:30

ABSTRACT. We consider a stochastic scheduling problem with both hard and soft tasks. Each task is described by a discrete probability distribution over possible execution times, a deadline and a distribution over inter-arrival times. Our scheduling problem is non-clairvoyant in the sense that the execution and inter-arrival times are subject to uncertainty modeled by stochastic distributions. Given an instance of our problem, we want to synthesize a schedule that is safe and efficient: safety imposes that deadline of hard tasks are never violated while efficient means that soft tasks should be completed as often as possible. To enforce that property, we impose a cost when a soft task deadline is violated and we are looking for a schedule that minimizes the expected value of the limit average of this cost.

First, we show that the dynamics of such a system can be modeled as a finite Markov Decision Process (MDP). Second, we show that the problem of deciding the existence of a safe and efficient scheduler is PP-hard and in EXPTIME. Third, we have implemented our synthesis algorithm that partly relies on the probabilistic model-checker Storm to analyze the underlying MDP. Given a set of both hard and soft tasks, we first compute the set of safe schedules, i.e., the ones in which the hard tasks always finish their execution, then we compute the scheduler that minimizes the limit average cost among the set of all safe schedules. Finally, we compare the optimal solutions obtained with our procedure against an adaptation of the earliest deadline first (EDF) algorithm to account for the soft task. We show that this EDF strategy can be arbitrarily worse in terms of the expected limit average cost for missing the deadlines of the soft tasks when compared to our optimal procedure.

Jul 13 16:30

ABSTRACT. We summarize our recent TACAS paper, where we provide an efficient algorithm for multi-objective model-checking problems on Markov decision processes (MDPs) with multiple cost structures. The key problem at hand is to check whether there exists a scheduler for a given MDP such that all objectives over cost vectors are fulfilled. Reachability and expected cost objectives are covered and can be mixed. Empirical evaluation shows the algorithm’s scalability. We discuss the need for output beyond Pareto curves and exploit the available information from the algorithm to support decision makers.

Jul 13 10:00

ABSTRACT. We propose a Reinforcement Learning (RL) algorithm to synthesize policies for a Markov Decision Process (MDP), such that a linear time property is satisfied. We convert the property into a Limit Deterministic Buchi Automaton (LDBA), then construct a product MDP between the automaton and the original MDP. A reward function is then assigned to the states of the product automaton, according to accepting conditions of the LDBA. With this reward function, our algorithm synthesizes a policy that satisfies the linear time property: as such, the policy synthesis procedure is "constrained" by the given specification. Additionally, we show that the RL procedure sets up an online value iteration method to calculate the maximum probability of satisfying the given property, at any given state of the MDP - a convergence proof for the procedure is provided. Finally, the performance of the algorithm is evaluated via a set of numerical examples. We observe an improvement of one order of magnitude in the number of iterations required for the synthesis compared to existing approaches.

Jul 13 17:00

ABSTRACT. Markov decision processes (MDPs) are a well-known formalism for mod- eling discrete event systems. Several extensions of the model have been proposed in order to encompass additional model-level uncertainty in MDPs. Here, we introduce a multi-scenario uncertainty model which has been proposed in the author’s PhD thesis and possible future research.

Jul 13 16:00