Admissibility in Concurrent Games

In this paper, we study the notion of admissibility for randomised strategies in concurrent games. Intuitively, an admissible strategy is one where the player plays 'as well as possible', because there is no other strategy that dominates it, i.e., that wins (almost surely) against a superset of adversarial strategies. We prove that admissible strategies always exist in concurrent games, and we characterise them precisely. Then, when the objectives of the players are ω-regular, we show how to perform assume-admissible synthesis, i.e., how to compute admissible strategies that win (almost surely) under the hypothesis that the other players play admissible strategies only.

1998 ACM Subject Classification D.2.4 Software/Program Verification, F.3.1 Specifying and Verifying and Reasoning about Programs, I.2.2 Program Synthesis


Introduction
In a concurrent n-player game played on a graph, all n players independently and simultaneously choose moves at each round of the game, and those n choices determine the next state of the game [13]. Concurrent games generalise turn-based games and it is well-known that, while deterministic strategies are sufficient in the turn-based case, randomised strategies are necessary for winning with probability one even for reachability objectives. Intuitively, randomisation is necessary because in concurrent games, in each round, players have no information about the concurrent choice of moves made by the other players. Randomisation allows for some probability of choosing a good move while not knowing the choice of the other players. As a consequence, there are two classical semantics that are considered to analyse these games qualitatively: winning with certainty (sure semantics in the terminology of [13]), and winning with probability one (almost sure semantics in the terminology of [13]). We consider both semantics here.
Previous papers on concurrent games are mostly concerned with two-player zero-sum games, i.e. two players that have fully antagonistic objectives. In this paper, we consider the more general setting of n-player non zero-sum concurrent games in which each player has its own objective. The notion of winning strategy is not sufficient to study non zero-sum games and other solution concepts have been proposed. One such concept is the notion of admissible strategy [1].
For a player with objective $\Phi$, a strategy $\sigma$ is said to be dominated by a strategy $\sigma'$ if $\sigma'$ does as well as $\sigma$ with respect to $\Phi$ against all the strategies of the other players and strictly better for some of them. A strategy $\sigma$ is admissible for a player if it is not dominated by any other of his strategies. Clearly, playing a strategy which is not admissible is sub-optimal and a rational player should only play admissible strategies. While recent works have studied the notion of admissibility for n-player non zero-sum game graphs [4,14,9,7,6], they are all concerned with the special case of turn-based games and this work is the first to consider the more general concurrent games.
Throughout the paper, we consider the running example in Figure 2. This is a concurrent game played by two players. Player 1's objective is to reach Trg, while Player 2 wants to reach $s_2$. Edges are labelled by pairs of moves of both players which activate that transition (where − means 'any move'). It is easy to see that no player can enforce its objective with or without randomisation, so there is no winning strategy in this game for either player. This is because moving from $s_0$ to $s_1$ and from $s_1$ to $s_2$ requires the cooperation of both players. Moreover, the transitions from $s_2$ behave as in the classical 'matching pennies' game: player 1 must choose between $f$ and $g$; player 2 between $f'$ and $g'$; and the target is reached only when the choices 'match'. So, randomisation is needed to make sure Trg is reached with probability one from $s_2$. In the paper, we will describe the dominated and admissible strategies of this game.
Technical contributions. First, we study the notion of admissible strategies for both the sure and almost sure semantics of concurrent games. We show in Theorem 1 that in both semantics admissible strategies always exist. The situation is thus similar to the turn-based case [4,9]. Nevertheless, the techniques used in this simpler case do not generalise easily to the concurrent case and we need substantially more involved technical tools here. To obtain our universal existence result, we introduce two weaker solution concepts: locally admissible moves and strongly cooperative optimal strategies. While cooperative optimal strategies were already introduced in [6] and shown equivalent to admissible strategies in the turn-based setting, they are strictly weaker than admissible strategies in the concurrent setting (both for the sure and the almost sure semantics), and they need to be combined with the notion of locally admissible moves to fully characterise admissible strategies. In the special case of safety objectives, we can show that admissible strategies are exactly those that always play locally admissible moves. This situation is depicted in Figure 1.
Second, we build on our characterisation of admissible strategies based on the notions of locally admissible moves and strongly cooperative optimal strategies to obtain algorithms to solve the assume-admissible synthesis problem for concurrent games. In the assume-admissible synthesis problem, we ask whether a given player has an admissible strategy that is winning against all admissible strategies of the other players. So this rule relaxes the classical synthesis rule by asking for a strategy that is winning against the admissible strategies of the other players only and not against all of them. This is reasonable as in a multi-player game, each player has his own objective which is generally not the complement of the objectives of the other players. The assume-admissible rule makes the hypothesis that players are rational, hence they play admissible strategies and it is sufficient to win against those strategies. Our algorithm is applicable to all ω-regular objectives and it is based on a reduction to a zero-sum two-player game in the sure semantics. While this reduction shares intuitions with the reduction that we proposed in [7] to solve the same problem in the turn-based case, our reduction here is based on games with imperfect information [17]. In contrast, in the turn-based case, games of perfect information are sufficient. The correctness and completeness of our reduction are proved in Theorem 2.
Related works. Concurrent reachability games were studied in [13] and algorithms to solve more general omega-regular objectives were given in [10]. The games studied there are two-player and zero-sum only. We rely on the algorithms defined in [10] to compute states from which players have almost surely winning strategies for their objectives when all the other players play adversely. States where a player has a (deterministic) strategy to surely win against all other players can be computed by a reduction to more classical turn-based game graphs [2]. Nash equilibria have been studied in concurrent games [5], but without randomised strategies. None of those papers consider the notion of admissibility.
We use the notion of admissibility to obtain synthesis algorithms for systems composed of several sub-systems starting from non zero-sum specifications. There have been a few other proposals in the literature that are based on refinements of the notion of Nash equilibrium (and not on admissibility), most notably: assume-guarantee synthesis [11] and rational synthesis [15,16]. Those works assume the simpler setting of turn-based games and so they do not deal with randomised strategies. In the context of infinite games played on graphs, one well known limitation of Nash equilibria is the existence of non-credible threats; admissibility does not suffer from this problem.
In [12], Damm and Finkbeiner use the notion of dominant strategy to provide a compositional semi-algorithm for the (undecidable) distributed synthesis problem. So while we use the notion of admissible strategy, they use a notion of dominant strategy. The notion of dominant strategy is strictly stronger: every dominant strategy is admissible but an admissible strategy is not necessarily dominant. Also, in multiplayer games with omega-regular objectives with complete information (as considered here), admissible strategies are always guaranteed to exist [4] while it is not the case for dominant strategies.

the set of possible successors of the state $s \in S$ when player $p$ performs action $a \in \Sigma_p(s)$. A particular case of concurrent games are the turn-based games. A game $G = (S, \Sigma, s_{init}, (\Sigma_p)_{p \in P}, \delta)$ is turn-based iff for all states $s \in S$, there is a unique player $p$ s.t. the successors of $s$ depend only on $p$'s choice of action, i.e., $\mathrm{Succ}(s, a)$ contains exactly one state for all $a \in \Sigma_p(s)$.
A history is a finite path $h = s_1 s_2 \ldots s_k \in S^*$ s.t. (i) $k \in \mathbb{N}$; (ii) $s_1 = s_{init}$; and (iii) for every $2 \le i \le k$, there exists $(a_1, \ldots, a_n) \in \Sigma^{|P|}$ with $s_i = \delta(s_{i-1}, a_1, \ldots, a_n)$. The length $|h|$ of a history $h = s_1 s_2 \ldots s_k$ is its number of states $k$; for every $1 \le i \le k$, we denote by $h_i$ the state $s_i$ and by $h_{\le i}$ the history $s_1 s_2 \ldots s_i$. We denote by $\mathrm{last}(h)$ the last state of $h$, that is, $\mathrm{last}(h) = h_{|h|}$. A run is defined similarly as a history except that its length is infinite. For a run $\rho = s_1 s_2 \ldots \in S^\omega$ and $i \in \mathbb{N}$, we also write $\rho_{\le i} = s_1 s_2 \ldots s_i$ and $\rho_i = s_i$. Let $\mathrm{Hist}(G)$ (resp. $\mathrm{Runs}(G)$) denote the set of histories (resp. runs) of $G$. The game is played from the initial state $s_{init}$ for an infinite number of rounds, producing a run. At each round $i \ge 0$, with current state $s_i$, all players $p$ select simultaneously an action $a^i_p \in \Sigma_p(s_i)$, and the state $\delta(s_i, a^i_1, \ldots, a^i_n)$ is appended to the current history. The selection of the action by a player is done according to strategies defined below.
Randomised moves and strategies. Given a finite set $A$, a probability distribution on $A$ is a function $\alpha : A \to [0, 1]$ such that $\sum_{a \in A} \alpha(a) = 1$; and we let $\mathrm{Supp}(\alpha) = \{a \mid \alpha(a) > 0\}$ be the support of $\alpha$. We denote by $\alpha(B) = \sum_{a \in B} \alpha(a)$ the probability of a given set $B$ according to $\alpha$. The set of probability distributions on $A$ is denoted by $D(A)$. A randomised move of player $p$ in state $s$ is a probability distribution on $\Sigma_p(s)$, that is, an element of $D(\Sigma_p(s))$. A randomised move that assigns probability 1 to an action and 0 to the others is called a Dirac move. We will henceforth denote randomised moves as sums of actions weighted by their respective probabilities. For instance, $0.5f + 0.5g$ denotes the randomised move that assigns probability 0.5 to each of $f$ and $g$ (and 0 to all other actions). In particular, we denote by $b$ a Dirac move that assigns probability 1 to action $b$.
Given a state $s$ and a tuple $\beta = (\beta_p)_{p \in P} \in \prod_{p \in P} D(\Sigma_p(s))$ of randomised moves from $s$, one per player, we let $\delta_r(s, \beta) \in D(S)$ be the probability distribution on states s.t. for all $s' \in S$: $\delta_r(s, \beta)(s') = \sum_{a \mid \delta(s,a) = s'} \beta(a)$, where $\beta(a_1, \ldots, a_n) = \prod_{i=1}^{n} \beta_i(a_i)$. Intuitively, $\delta_r(s, \beta)(s')$ is the probability to reach $s'$ from $s$ when the players play according to $\beta$.
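The definition of $\delta_r$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary encoding of randomised moves, and the toy transition function `pennies_delta` (a guess at the matching-pennies state $s_2$ of the running example, where a mismatch stays in $s_2$) are hypothetical and not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def delta_r(delta, state, beta):
    """Distribution on successor states induced by one randomised move per player.

    delta: function (state, tuple_of_actions) -> successor state
    beta:  list of dicts; beta[i][a] is the probability that player i+1 plays a
    """
    dist = {}
    # Enumerate every tuple of actions together with its product probability.
    for combo in product(*(b.items() for b in beta)):
        actions = tuple(a for a, _ in combo)
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        if prob > 0:
            succ = delta(state, actions)
            dist[succ] = dist.get(succ, Fraction(0)) + prob
    return dist


def pennies_delta(state, actions):
    """Assumed toy transition: Trg is reached only when the choices match."""
    return "Trg" if actions in {("f", "f'"), ("g", "g'")} else "s2"


half = Fraction(1, 2)
mixed = {"f": half, "g": half}          # the randomised move 0.5f + 0.5g
dirac_f_prime = {"f'": Fraction(1)}     # the Dirac move f'
result = delta_r(pennies_delta, "s2", [mixed, dirac_f_prime])
```

Exact rational arithmetic (`Fraction`) is used so that the resulting distribution can be compared without rounding issues.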
A strategy for player $p$ is a function $\sigma$ from histories to randomised moves (of player $p$) such that, for all $h \in \mathrm{Hist}(G)$: $\sigma(h) \in D(\Sigma_p(\mathrm{last}(h)))$. A strategy is called Dirac at history $h$ if $\sigma(h)$ is a Dirac move; it is called Dirac if it is Dirac in all histories. We denote by $\Gamma_p(G)$ the set of player-$p$ strategies in the game, and by $\Gamma^{det}_p(G)$ the set of player-$p$ strategies that only use Dirac moves (those strategies are also called deterministic); we might omit $G$ if it is clear from context. A strategy profile $\sigma$ for a subset $A \subseteq P$ of players is a tuple $(\sigma_p)_{p \in A}$ with $\sigma_p \in \Gamma_p$ for all $p \in A$. When the set of players $A$ is omitted, we assume $A = P$. Let $\sigma = (\sigma_p)_{p \in P}$ be a strategy profile. Then, for all players $p$, we let $\sigma_{-p}$ denote the restriction of $\sigma$ to $P \setminus \{p\}$ (hence, $\sigma_{-p}$ can be regarded as a strategy of player $-p$ that returns, for all histories $h$, a randomised move from $\prod_{q \in P \setminus \{p\}} D(\Sigma_q(\mathrm{last}(h))) \subseteq D(\Sigma_{-p}(\mathrm{last}(h)))$). We sometimes denote $\sigma$ by the pair $(\sigma_p, \sigma_{-p})$. Given a history $h$, we let $(\sigma_p)_{p \in A}(h) = (\sigma_p(h))_{p \in A}$.
Let $h$ be a history and let $\rho$ be a history or a run. Then, we write $h \subseteq_{pref} \rho$ iff $h$ is a prefix of $\rho$, i.e., $\rho_{\le |h|} = h$. Consider two strategies $\sigma$ and $\sigma'$ for player $p$, and a history $h$. We denote by $\sigma[h \leftarrow \sigma']$ the strategy that follows strategy $\sigma$ and shifts to $\sigma'$ as soon as $h$ has been played. Formally, $\sigma[h \leftarrow \sigma']$ is the strategy s.t., for all histories $h'$:

$$\sigma[h \leftarrow \sigma'](h') = \begin{cases} \sigma'(h') & \text{if } h \subseteq_{pref} h' \\ \sigma(h') & \text{otherwise.} \end{cases}$$

Probability measure and outcome of a profile. Given a history $h$, we let $\mathrm{Cyl}(h) = \{\rho \mid h \subseteq_{pref} \rho\}$ be the cylinder of $h$. To each strategy profile $\sigma$, we associate a probability measure $P_\sigma$ on certain sets of runs. First, for a history $h$, we define $P_\sigma(\mathrm{Cyl}(h))$ inductively on the length of $h$: $P_\sigma(\mathrm{Cyl}(s_{init})) = 1$, and $P_\sigma(\mathrm{Cyl}(h's')) = P_\sigma(\mathrm{Cyl}(h')) \cdot \delta_r(\mathrm{last}(h'), \sigma(h'))(s')$ when $|h| > 1$ and $h = h's'$. Based on this definition, we can extend the definition of $P_\sigma$ to any Borel set of runs on cylinders. In particular, the function $P_\sigma$ is well-defined for all ω-regular sets of runs, which we will consider in this paper [18]. We extend the Hist notation and let $\mathrm{Hist}(\sigma)$ be the set of histories $h$ such that $P_\sigma(\mathrm{Cyl}(h)) > 0$. Given a profile $\sigma$, we denote by $\mathrm{Outcome}(\sigma)$ the set of runs $\rho$ s.t. all prefixes $h$ of $\rho$ belong to $\mathrm{Hist}(\sigma)$. In particular, $P_\sigma(\mathrm{Outcome}(\sigma)) = 1$. Note that when $\sigma$ is composed of Dirac strategies then $\mathrm{Outcome}(\sigma)$ is a singleton. The outcome (set of histories) of a strategy $\sigma \in \Gamma_p$, denoted by $\mathrm{Outcome}(\sigma)$ ($\mathrm{Hist}(\sigma)$), is the union of the outcomes (sets of histories, respectively) of the profiles $\sigma'$ s.t. $\sigma'_p = \sigma$.
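The inductive definition of $P_\sigma(\mathrm{Cyl}(h))$ can be illustrated by the following Python sketch. It assumes strategies are encoded as functions from histories (tuples of states) to dictionaries of action probabilities; the toy game, in which the initial state is taken to be the matching-pennies state and Trg is absorbing, is a hypothetical simplification and not the game of Figure 2.

```python
from fractions import Fraction
from itertools import product


def delta_r(delta, state, beta):
    """One-step distribution on successors for a tuple of randomised moves."""
    dist = {}
    for combo in product(*(b.items() for b in beta)):
        actions = tuple(a for a, _ in combo)
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        if prob > 0:
            succ = delta(state, actions)
            dist[succ] = dist.get(succ, Fraction(0)) + prob
    return dist


def cylinder_probability(delta, profile, history):
    """P_sigma(Cyl(h)), computed by induction on the length of h.

    The history is assumed to start in the initial state of the game.
    """
    prob = Fraction(1)  # base case: the cylinder of the initial state
    for k in range(1, len(history)):
        prefix = history[:k]
        beta = [sigma(prefix) for sigma in profile]
        step = delta_r(delta, prefix[-1], beta)
        prob *= step.get(history[k], Fraction(0))
    return prob


def toy_delta(state, actions):
    """Assumed toy game: Trg is absorbing, a mismatch stays in s2."""
    if state == "Trg":
        return "Trg"
    return "Trg" if actions in {("f", "f'"), ("g", "g'")} else "s2"


def sigma_mixed(history):
    """Player 1 always plays 0.5f + 0.5g."""
    return {"f": Fraction(1, 2), "g": Fraction(1, 2)}


def tau_dirac(history):
    """Player 2 always plays the Dirac move f'."""
    return {"f'": Fraction(1)}


profile = [sigma_mixed, tau_dirac]
p_initial = cylinder_probability(toy_delta, profile, ("s2",))
p_late = cylinder_probability(toy_delta, profile, ("s2", "s2", "s2", "Trg"))
```

A history belongs to $\mathrm{Hist}(\sigma)$ exactly when the value returned by `cylinder_probability` is positive.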
Winning conditions. To determine the gain of all players in the game $G$, we define winning conditions that can be interpreted with two kinds of semantics, denoted by the symbols S for the sure semantics and A for the almost sure semantics. A winning condition $\Phi$ is a subset of $\mathrm{Runs}(G)$ called winning runs. From now on, we assume that concurrent games are equipped with a function $\Phi$, called the winning condition, and mapping all players $p \in P$ to a winning condition $\Phi(p)$. A profile $\sigma$ is A-winning for $\Phi(p)$ if $P_\sigma(\Phi(p)) = 1$, which we write $G, \sigma \models^A \Phi(p)$. A profile $\sigma$ is S-winning for $\Phi(p)$ if $\mathrm{Outcome}(G, \sigma) \subseteq \Phi(p)$, which we write $G, \sigma \models^S \Phi(p)$. Note that when $\sigma$ is Dirac, the two semantics coincide: $G, \sigma \models^S \Phi(p)$ iff $G, \sigma \models^A \Phi(p)$. The profile $\sigma$ is winning for the sure semantics from We often omit $G$ in notations when clear from the context. Most of our definitions and results hold for both semantics and we often state them using the symbol $\star \in \{S, A\}$ as in the following definition. Given a semantics $\star \in \{S, A\}$, a strategy $\sigma$ for player $p$ (from a history $h$) is called $\star$-winning for player $p$ if for every $\tau \in \Gamma_{-p}$, the profile $(\sigma, \tau)$ is $\star$-winning for player $p$ (from $h$). Note that a strategy $\sigma$ for player $p$ is S-winning iff $\mathrm{Outcome}(\sigma) \subseteq \Phi(p)$. We often describe winning conditions using the standard linear temporal operators $\Box$ and $\Diamond$; e.g. $\Box\Diamond S$ means the set of runs that visit $S$ infinitely often. See [3] for a formal definition.
A winning condition $\Phi(p)$ is prefix-independent if for all $s_1 s_2 \ldots \in \Phi(p)$, and all $i \ge 1$: $s_i s_{i+1} \ldots \in \Phi(p)$. When $\Phi(p)$ contains all runs that do not visit some designated set $Bad_p \subseteq S$ of states, we say that $\Phi(p)$ is a safety condition. A safety game is a game whose winning condition $\Phi$ is such that $\Phi(p)$ is a safety condition for all players $p$. Without loss of generality, we assume that safety games are so-called simple safety games: a safety game $(S, \Sigma, s_{init}, (\Sigma_p)_{p \in P}, \delta)$ is simple iff for all players $p$, for all $s \in S$: $s \in Bad_p$ implies that no $s' \notin Bad_p$ is reachable from $s$. That is, once the safety condition is violated, it remains violated forever at all future histories.
We note the following property of winning strategies.
for every $h \in \mathrm{Hist}(\sigma)$ which we show by contraposition. Assume there exists $h \in \mathrm{Hist}(\sigma)$ such that $\sigma \not\models^A_h \Phi(p)$. This means that $P_\sigma(\mathrm{Cyl}(h)) > 0$ and

Example 1. Let us consider three player-1 strategies in Fig. 2. (i) $\sigma_1$ is any strategy that plays $a$ in $s_0$; (ii) $\sigma_2$ is any strategy that plays $b$ in $s_0$, $d$ in $s_1$ and $f$ in $s_2$; and (iii) $\sigma_3$ is any strategy that plays $b$ in $s_0$, $d$ in $s_1$, and $0.5f + 0.5g$ in $s_2$. Clearly, $\sigma_1$ never allows one to reach Trg while some runs respecting $\sigma_2$ and $\sigma_3$ do (remember that there is no $\star$-winning strategy in this game). We will see later that the best choice of player 1 (among $\sigma_2$, $\sigma_3$) depends on the semantics we consider. In the almost-sure semantics, $\sigma_3$ is 'better' for player 1, because $\sigma_3$ is an A-winning strategy from all histories ending in $s_2$, while $\sigma_2$ is not. On the other hand, in the sure semantics, playing $\sigma_2$ is 'better' for player 1 than $\sigma_3$. Indeed, for all player-2 strategies $\tau$, either $\mathrm{Outcome}(\sigma_3, \tau)$ contains only runs that do not reach $s_2$, or $\mathrm{Outcome}(\sigma_3, \tau)$ contains at least one run that reaches $s_2$, but, in this case, it also contains a run of the form $h s_2^\omega$ (because, intuitively, player 1 plays both $f$ and $g$ from $s_2$). So, $\sigma_3$ is not winning against any $\tau$, while $\sigma_2$ wins at least against a player-2 strategy that plays $b'$ in $s_0$, $d'$ in $s_1$ and $f'$ in $s_2$. We formalise these intuitions in the next section.
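The almost-sure claim about $\sigma_3$ in Example 1 can be checked numerically with a small Python sketch. It relies on two assumptions that are not spelled out in the text: a mismatch in $s_2$ leaves the game in $s_2$ (consistent with the runs of the form $h s_2^\omega$ mentioned above), and player 2 is represented by a fixed sequence of probabilities of playing $f'$, one per round, which is a simplification of a history-dependent strategy. All names are hypothetical.

```python
from fractions import Fraction


def match_probability(p_f, q_f_prime):
    """Probability that the choices match in one round of s2.

    p_f:       probability that player 1 plays f (and g otherwise)
    q_f_prime: probability that player 2 plays f' (and g' otherwise)
    """
    return p_f * q_f_prime + (1 - p_f) * (1 - q_f_prime)


def reach_within(p_f, opponent_moves):
    """Probability of reaching Trg within len(opponent_moves) rounds from s2."""
    stay = Fraction(1)  # probability of still being in s2
    for q_f_prime in opponent_moves:
        stay *= 1 - match_probability(p_f, q_f_prime)
    return 1 - stay


half = Fraction(1, 2)
# Player 1 plays 0.5f + 0.5g; player 2 varies its move from round to round.
opponent = [Fraction(0), Fraction(1), Fraction(1, 3), Fraction(1), Fraction(0)]
mixed_reach = reach_within(half, opponent)
# Player 1 always plays f; player 2 always plays g'.
dirac_reach = reach_within(Fraction(1), [Fraction(0)] * 5)
```

With the uniform move, each round matches with probability 1/2 whatever player 2 does, so the probability of remaining in $s_2$ for $k$ rounds is $2^{-k}$ and tends to 0. With the Dirac move $f$, the opponent that always answers $g'$ keeps the play in $s_2$ forever.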

Admissibility
In this section, we define the central notion of the paper: admissibility [4,8]. Intuitively, a strategy is admissible when it plays 'as well as possible'. Hence the definition of admissible strategies is based on a notion of domination between strategies: a strategy $\sigma'$ dominates another strategy $\sigma$ when $\sigma'$ wins every time $\sigma$ does. Obviously, players have no interest in playing dominated strategies, hence admissible strategies are those that are not dominated. Apart from these (classical) definitions, we characterise admissible strategies as those that satisfy two weaker notions: they must be both strongly cooperative optimal and play only locally-admissible moves. Finally, we discuss important characteristics of admissible strategies that will enable us to perform assume-admissible synthesis (see Section 4).
In this section, we fix a game $G$, a player $p$, and, following our previous conventions, we denote by $\Gamma_{-p}$ the set $\{\sigma_{-p} \mid \sigma \in \Gamma\}$.
Admissible strategies We first recall the classical notion of admissible strategy [4,1].
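Values of histories. Before we discuss strongly cooperative optimal and locally admissible strategies, we associate values to histories. Let $h$ be a history, and $\sigma$ be a strategy of player $p$. Then, the value of $h$ w.r.t. $\sigma$ for semantics $\star \in \{S, A\}$ is defined as follows.

$$\chi^\star_\sigma(h) = \begin{cases} 1 & \text{if } \forall \tau \in \Gamma_{-p}: (\sigma, \tau) \models^\star_h \Phi(p) \\ 0 & \text{if } \exists \tau, \tau' \in \Gamma_{-p}: (\sigma, \tau) \models^\star_h \Phi(p) \text{ and } (\sigma, \tau') \not\models^\star_h \Phi(p) \\ -1 & \text{if } \forall \tau \in \Gamma_{-p}: (\sigma, \tau) \not\models^\star_h \Phi(p) \end{cases}$$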
Value $\chi^\star_\sigma(h) = 1$ corresponds to the case where $\sigma$ is $\star$-winning for player $p$ from $h$ (thus, against all possible strategies in $\Gamma_{-p}$). When $\chi^\star_\sigma(h) = 0$, $\sigma$ is not $\star$-winning from $h$ (because of $\tau'$ in the definition), but the other players can still help $p$ to reach his objective (by playing some $\tau$ s.t. $(\sigma, \tau) \models^\star_h \Phi(p)$, which exists by definition). Last, $\chi^\star_\sigma(h) = -1$ when there is no hope for $p$ to $\star$-win, even with the collaboration of the other players. In this case, there is no $\tau$ s.t. $(\sigma, \tau) \models^\star_h \Phi(p)$. Hence, having $\chi^\star_\sigma(h) = -1$ is stronger than saying that $\sigma$ is not winning: when $\sigma$ is not winning, we could have $\chi^\star_\sigma(h) = 0$ as well. We define the value of a history $h$ for player $p$ as the best value he can achieve with his different strategies: $\chi^\star_p(h) = \max_{\sigma \in \Gamma_p} \chi^\star_\sigma(h)$.

Strongly cooperative optimal strategies. We are now ready to define strongly cooperative optimal (SCO) strategies. Recall that, in the classical setting of turn-based games, admissible strategies are exactly the SCO strategies [8]. We will see that this condition is still necessary but not sufficient in the concurrent setting.
A strategy $\sigma$ of Player $p$ is $\star$-SCO iff $\chi^\star_\sigma(h) = \chi^\star_p(h)$ for all histories $h \in \mathrm{Hist}(\sigma)$. Intuitively, when $\sigma$ is a $\star$-SCO strategy of Player $p$, the following should hold: (i) if $p$ has a $\star$-winning strategy from $h$ (i.e. $\chi^\star_p(h) = 1$), then $\sigma$ should be $\star$-winning (i.e. $\chi^\star_\sigma(h) = 1$); and (ii) otherwise, if $p$ has no $\star$-winning strategy from $h$ but still has the opportunity to $\star$-win with the help of other players (hence $\chi^\star_p(h) = 0$), then $\sigma$ should enable the other players to help $p$ fulfil his objective (i.e. $\chi^\star_\sigma(h) = 0$). Observe that when $\chi^\star_p(h) = -1$, no continuation of $h$ is $\star$-winning for $p$, so $\chi^\star_\sigma(h) = -1$ for all strategies $\sigma$.
Example 3. Consider again the example in Figure 2. For the almost-sure semantics, we have For the sure semantics, we have: Let us consider again the three strategies $\sigma_1$, $\sigma_2$ and $\sigma_3$ from Example 1. We see that $\sigma_2$ is S-SCO but it is not A-SCO because, for all histories $h$ ending in $s_2$: $\chi^A_{\sigma_2}(h) = 0$ while $h \in \mathrm{Val}^A_{1,1}$. On the other hand, $\sigma_3$ is A-SCO; but it is not S-SCO. Indeed, one can check that, for all strategies $\tau \in \Gamma_2$: if $\mathrm{Outcome}(\sigma_3, \tau)$ contains a run reaching Trg, then it also contains a run that cycles in $s_2$. So, for all such strategies $\tau$, $(\sigma_3, \tau) \not\models^S \Phi(1)$, hence $\chi^S_{\sigma_3}(h) = -1$ for all histories that end in $s_2$; while $\chi^S_p(h) = 0$ since $\chi^S_{\sigma'}(h) = 0$ for all Dirac strategies $\sigma'$.

Next, let us build a strategy $\sigma'_3$ that is A-dominated by $\sigma_3$ (hence, not A-admissible), but A-SCO. We let $\sigma'_3$ play as $\sigma_3$ except that $\sigma'_3$ plays $c$ the first time $s_1$ is visited (hence ensuring that the self-loop on $s_1$ will be taken after the first visit to $s_1$). Now, $\sigma'_3$ is A-dominated by $\sigma_3$, because (i) $\sigma_3$ A-wins every time $\sigma'_3$ does; but (ii) $\sigma'_3$ does not A-win against the player-2 strategy $\tau$ that plays $d'$ only when $s_1$ is visited for the first time, while $\sigma_3$ A-wins against $\tau$. However, $\sigma'_3$ is SCO because playing $c$ keeps the value of the history equal to $0 = \chi^A_1(h)$ (intuitively, playing $c$ once does not prevent the other players from helping in the future). A similar example can be built in the S semantics. Thus, there are $\star$-SCO strategies which are not admissible, so being $\star$-SCO is not a sufficient criterion for admissibility.
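The three-valued classification used in Example 3 can be sketched in Python for the sure semantics. This is a deliberately coarse, hypothetical abstraction: it only looks at one round played in $s_2$, declares a win when Trg is the only possible successor, and quantifies over a small finite sample of player-2 moves rather than over all strategies in $\Gamma_2$.

```python
def successors(move1, move2):
    """States reachable with positive probability in one round from s2."""
    reached = set()
    for a, pa in move1.items():
        for b, pb in move2.items():
            if pa > 0 and pb > 0:
                matched = (a, b) in {("f", "f'"), ("g", "g'")}
                reached.add("Trg" if matched else "s2")
    return reached


def surely_wins(move1, move2):
    """Sure semantics, one-round abstraction: every outcome must be Trg."""
    return successors(move1, move2) == {"Trg"}


def value(move1, opponent_moves):
    """Value in {1, 0, -1} of a player-1 move against the sampled opponents."""
    results = [surely_wins(move1, m) for m in opponent_moves]
    if all(results):
        return 1    # wins against every sampled opponent move
    if any(results):
        return 0    # some opponent move helps, some other does not
    return -1       # no sampled opponent move lets player 1 win surely


opponents = [{"f'": 1.0}, {"g'": 1.0}, {"f'": 0.5, "g'": 0.5}]
dirac_f = {"f": 1.0}
dirac_g = {"g": 1.0}
mixed = {"f": 0.5, "g": 0.5}

value_dirac = value(dirac_f, opponents)
value_mixed = value(mixed, opponents)
best_value = max(value(m, opponents) for m in (dirac_f, dirac_g, mixed))
```

In this abstraction the Dirac move gets value 0 and the uniform move gets value −1, which mirrors the observation that $\sigma_3$ is not S-SCO while Dirac strategies achieve the best sure value 0 in $s_2$.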
Locally admissible moves and strategies. Let us now discuss another criterion for admissibility, which is more local in the sense that it is based on a domination between moves available to each player after a given history. Let $h$ be a history, and let $\alpha$ and $\alpha'$ be two randomised moves in $D(\Sigma_p)$. We say that $\alpha$ is $\star$-weakly dominated at $h$ by $\alpha'$ (denoted $\alpha \preceq^\star_h \alpha'$) iff for all $\sigma \in \Gamma_p$ such that $h \in \mathrm{Hist}(\sigma)$ and $\sigma(h) = \alpha$, there exists $\sigma' \in \Gamma_p$ such that $\sigma'(h) = \alpha'$ and $\sigma \preceq_\star \sigma'$. We write $\alpha \simeq^\star_h \alpha'$ when $\alpha \preceq^\star_h \alpha'$ and $\alpha' \preceq^\star_h \alpha$. When $\alpha \preceq^\star_h \alpha'$ but not $\alpha' \preceq^\star_h \alpha$, we say that $\alpha$ is $\star$-dominated at $h$ by $\alpha'$ and denote this by $\alpha <^\star_h \alpha'$. When a randomised move $\alpha$ is not $\star$-dominated at $h$, we say that $\alpha$ is $\star$-admissible at $h$. This allows us to define a more local notion of dominated strategy: a strategy $\sigma$ of player $p$ is $\star$-locally-admissible (LA) if $\sigma(h)$ is a $\star$-admissible move at $h$, for all histories $h$.

Example 4. Consider the Dirac move $f$ and the non-Dirac move $0.5f + 0.5g$ played from $s_2$ in the example in Figure 2. One can check that $0.5f + 0.5g <^S_{s_2} f$. Indeed, consider a strategy $\sigma$ s.t. $\sigma(h) = 0.5f + 0.5g$ for some $h$ with $\mathrm{last}(h) = s_2$. Then, playing $\sigma(h)$ from $h$ will never allow Player 1 to reach Trg surely at the next step, whatever Player 2 plays; while playing, for instance, $f$ (Dirac move) ensures that Player 1 reaches Trg surely at the next step against a Player-2 strategy that plays $f'$. Thus, $\sigma_2$ is S-LA but $\sigma_3$ is not.
On the other hand, after every randomised move played in state $s_2$, the updated state is $s_2$ or $s_3$ from which A-winning strategies exist, thus f ). It follows that both $\sigma_2$ and $\sigma_3$ are A-LA. However, in the long run, player 1 needs to play $\lambda f + (1 - \lambda) g$, with $\lambda \in (0, 1)$, infinitely often in order to A-win. In fact, $\sigma_3$ is A-winning from $s_2$ while $\sigma_2$ is not. Thus, there are $\star$-LA strategies which are not admissible, so being $\star$-LA is not a sufficient criterion for $\star$-admissibility.
We close this section by several lemmata that allow us to better characterise the notion of LA strategies. First, we observe that, while randomisation might be necessary for winning in certain concurrent games (for example, in Figure 2, no Dirac move allows player 1 to reach Trg surely from $s_2$, while playing repeatedly $f$ and $g$ with equal probability ensures that Trg is reached with probability 1), randomisation is useless when a player wants to play only locally admissible moves. This is shown by the next Lemma (point (i)), saying that, if a randomised move $\alpha$ plays some action $a$ with some positive probability, then $\alpha$ is weakly dominated by the Dirac move $a$. However, this does not immediately allow us to characterise admissible moves: some Dirac moves could be dominated (hence non-admissible), and some non-Dirac moves could be admissible too. Points (ii) and (iii) elucidate this: among Dirac moves, the non-dominated ones are admissible, and a non-Dirac move is admissible iff all the Dirac moves that occur in its support are admissible and equivalent to each other.

Lemma 2. For all histories $h$ and all randomised moves $\alpha$: (i) For all $a \in \mathrm{Supp}(\alpha)$: $\alpha \preceq^\star_h a$; (ii) Dirac moves that are not $\star$-dominated at $h$ by another Dirac move are admissible; (iii) A move $\alpha$ is $\star$-LA at $h$ iff, for all $a \in \mathrm{Supp}(\alpha)$: (1) $a$ is $\star$-LA at $h$; and (2) $a \simeq^\star_h b$ for all $b \in \mathrm{Supp}(\alpha)$.
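Proof. Proof of (i): Take a strategy $\sigma \in \Gamma_p$ such that $h \in \mathrm{Hist}(\sigma)$ and $\sigma(h) = \alpha$. Define $\sigma'$ as the strategy that plays as $\sigma$ except in $h$ where it plays $a$ instead of $\alpha$. One shows that $\sigma \preceq_\star \sigma'$. Consider any $\tau \in \Gamma_{-p}$ such that $(\sigma, \tau) \models^A_h \Phi(p)$. This means that for all $a \in \mathrm{Supp}(\alpha)$. Since $\sigma$ and $\sigma'$ are identical in all other histories, including those extending $h\,\delta(s, a, \tau(h))$, we deduce that $(\sigma', \tau) \models^A_h \Phi(p)$. The proof for the sure semantics is similar.

Proof of (ii): If a Dirac move $a$ is $\star$-dominated at $h$ by a move $\alpha'$ then by (i) it is $\star$-dominated at $h$ by a Dirac move $a' \in \mathrm{Supp}(\alpha')$. This shows (ii) by contraposition.

Proof of (iii): Assume $\alpha$ is a $\star$-LA move at $h$. By (i), for every $a \in \mathrm{Supp}(\alpha)$, $\alpha \preceq^\star_h a$ and $\alpha \not<^\star_h a$ (because $\alpha$ is a $\star$-LA move at $h$), so $\alpha \simeq^\star_h a$. Hence all the elements of $\mathrm{Supp}(\alpha)$ are equivalent and $\star$-LA at $h$. Assume now that for all $a \in \mathrm{Supp}(\alpha)$, $a$ is $\star$-LA at $h$ and for all $b \in \mathrm{Supp}(\alpha)$, $a \simeq^\star_h b$. We take $a \in \mathrm{Supp}(\alpha)$. By point (i) it holds that $\alpha \preceq^\star_h a$; so it remains to show that $a \preceq^\star_h \alpha$. Let $\sigma$ be such that $\sigma(h) = a$. We construct a strategy $\sigma'$ such that $\sigma'(h) = \alpha$ and $\sigma \preceq_\star \sigma'$. For every $b \simeq^\star_h a$, there exists a strategy $\sigma_b$ such that $\sigma_b(h) = b$ and $\sigma \preceq_\star \sigma_b$. We construct $\sigma'$ as follows. We start with $\sigma' = \sigma$, we set $\sigma'(h) = \alpha$, and for every $b \in \mathrm{Supp}(\alpha)$ and every $s' \in \mathrm{Succ}(s, b)$, we update $\sigma'$ to $\sigma'[hs' \leftarrow \sigma_b]$. This shows that $a \preceq^\star_h \alpha$. We conclude that $\alpha$ is $\star$-LA at $h$.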
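Lemma 2 reduces local admissibility of a randomised move to checks on the Dirac moves in its support, which the following Python sketch makes concrete. The weak-domination relation between Dirac moves is taken as an input (a set of ordered pairs, assumed reflexive and transitive); the two sample relations for $s_2$ are assumptions chosen to be consistent with Example 4 and the discussion that follows it, not data from the paper.

```python
def strictly_dominated(a, b, weak):
    """a < b: a is weakly dominated by b, but b is not weakly dominated by a."""
    return (a, b) in weak and (b, a) not in weak


def dirac_is_la(a, actions, weak):
    """Point (ii): a Dirac move not dominated by any Dirac move is admissible."""
    return not any(strictly_dominated(a, b, weak) for b in actions)


def move_is_la(alpha, actions, weak):
    """Point (iii): every support action is LA and all of them are equivalent."""
    support = [a for a, p in alpha.items() if p > 0]
    all_la = all(dirac_is_la(a, actions, weak) for a in support)
    all_equivalent = all(
        (a, b) in weak and (b, a) in weak for a in support for b in support
    )
    return all_la and all_equivalent


actions = ["f", "g"]
# Assumed relation for the sure semantics in s2: f and g are incomparable.
weak_sure = {("f", "f"), ("g", "g")}
# Assumed relation for the almost-sure semantics in s2: f and g are equivalent.
weak_almost = {(a, b) for a in actions for b in actions}

mixed = {"f": 0.5, "g": 0.5}
dirac_f = {"f": 1.0}

sure_mixed = move_is_la(mixed, actions, weak_sure)
sure_dirac = move_is_la(dirac_f, actions, weak_sure)
almost_mixed = move_is_la(mixed, actions, weak_almost)
```

Under these assumed relations, the uniform move fails the equivalence test in the sure semantics but passes it in the almost-sure semantics, in line with $\sigma_3$ being A-LA but not S-LA.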
This example seems to suggest that the local dominance of two moves coincides with the natural order on the values of histories that are obtained when playing those moves (in other words, $x <^\star_h y$ would hold iff the value of the history obtained by playing $x$ is smaller than or equal to the value obtained by playing $y$). This is not true for histories of value 0: we have seen that $a$ and $b$ are $\preceq^\star_h$-incomparable, yet playing $a$ or $b$ from $s_0$ yields a history with value 0 in all cases (even when $s_1$ is reached). The next Lemma gives a precise characterisation of the dominance relation between Dirac moves in terms of values:

Lemma 3. For all players $p$, histories $h$ with $\mathrm{last}(h) = s$ and Dirac moves $a, b \in \Sigma_p(s)$: $a \preceq^\star_h b$ if, and only if, the following conditions hold for every $c \in \Sigma_{-p}(s)$, where we write $s_{(a,c)} = \delta(s, (a, c))$ and $s_{(b,c)} = \delta(s, (b, c))$: (i) $\chi^\star_p(hs_{(a,c)}) \le \chi^\star_p(hs_{(b,c)})$; and (ii) if $\chi^\star_p(hs_{(a,c)}) = \chi^\star_p(hs_{(b,c)}) = 0$ then $s_{(a,c)} = s_{(b,c)}$.

$\models^\star_{hs_{(a,c)}} \Phi(p)$ then $(\sigma', \tau) \models^\star_{hs_{(b,c)}} \Phi(p)$. We deduce that $\sigma'$ is winning from $hs_{(b,c)}$ if $\sigma$ is winning from $hs_{(a,c)}$. As $\sigma$ is arbitrary this shows that $\chi^\star_p(hs_{(a,c)}) = 1 \Rightarrow \chi^\star_p(hs_{(b,c)}) = 1$. Dually, if $\chi^\star_p(hs_{(b,c)}) = -1$ then $\sigma'$ is losing from $hs_{(b,c)}$ and so is $\sigma$ from $hs_{(a,c)}$. This shows the implication $\chi^\star_p(hs_{(b,c)}) = -1 \Rightarrow \chi^\star_p(hs_{(a,c)}) = -1$. These two implications yield $\chi^\star_p(hs_{(a,c)}) \le \chi^\star_p(hs_{(b,c)})$.

Proof of (ii): We show the contrapositive: we consider $c$ such that $s_{(a,c)} \ne s_{(b,c)}$ and show that $\chi^\star_p(hs_{(a,c)})$ and $\chi^\star_p(hs_{(b,c)})$ cannot both be equal to 0.
For this purpose we assume that $\chi^\star_p(hs_{(a,c)}) = 0$ and show that $\chi^\star_p(hs_{(b,c)}) = 1$. Since $\sigma$ is chosen arbitrarily, we take it such that $\chi^\star_\sigma(hs_{(a,c)}) = \chi^\star_p(hs_{(a,c)}) = 0$. In particular there exists $\tau \in \Gamma_{-p}$ such that $(\sigma, \tau) \models^\star_{hs_{(a,c)}} \Phi(p)$. Let $\tau'$ be an arbitrary player $-p$ strategy. By $\sigma \preceq_\star \sigma'$, we have that $(\sigma, \tau') \models^\star \Phi(p)$ implies $(\sigma', \tau') \models^\star \Phi(p)$. Since $\tau'[hs_{(a,c)} \leftarrow \tau]$ and $\tau'$ are equal on histories incomparable to $hs_{(a,c)}$, it then holds that $(\sigma', \tau') \models^\star_{hs_{(b,c)}} \Phi(p)$ for every arbitrary profile $\tau'$. We have shown that $\sigma'$ is winning from $hs_{(b,c)}$ and hence that $\chi^\star_p(hs_{(b,c)}) = 1$ as claimed.
Characterisation and existence of admissible strategies. Equipped with our previous results, we can now establish the main results of this section. First, we show that $\star$-admissible strategies are exactly those that are both $\star$-LA and $\star$-SCO (Theorem 1(i)). Then, we show that admissible strategies always exist in concurrent games (Theorem 1(ii)).
To establish Theorem 1, we need an ancillary lemma, which relates the notions of $\star$-admissibility and the notions of $\star$-LA. To this end, we draw a link between (global) equivalence of strategies (in terms of $\approx_\star$) and the local equivalence of moves (in terms of $\simeq^\star_h$). For all player $p$ strategies $\sigma$ and all $v \in \{-1, 0, 1\}$, let us define $\mathrm{Hist}_v(\sigma)$ as $\mathrm{Hist}(\sigma) \cap \mathrm{Val}^\star_{p,v}$, i.e., the set of histories of $\sigma$ that have value $v$. Then:

Lemma 5. Let $\sigma$ and $\sigma'$ be two player $p$ strategies s.t.: (i) $\sigma$ is $\star$-LA; and (ii) for all histories $h$: $\chi^\star_p(h) = 1$ implies that $\sigma$ and $\sigma'$ are $\star$-winning from $h$. Then, the following conditions are equivalent:

It suffices to take an arbitrary $\tau \in \Gamma_{-p}$ and show that $P_{(\sigma,\tau)}(\Phi(p)) = 1$ iff $P_{(\sigma',\tau)}(\Phi(p)) = 1$. We first show that: ( This can be shown by induction using Lemma 4 where the base case is $P_{(\sigma,\tau)}(\mathrm{Cyl}(s_{init})) = 1 = P_{(\sigma',\tau)}(\mathrm{Cyl}(s_{init}))$ and the induction step is that $P_{(\sigma,\tau)}(\mathrm{Cyl}(h)) = P_{(\sigma',\tau)}(\mathrm{Cyl}(h))$ implies: We decompose the set of runs as $\mathrm{Val}^\star_{p,0}$ When intersecting with $\Phi(p)$, the last set in the union becomes empty and we have: Hence for $\sigma'' \in \{\sigma, \sigma'\}$, ) where we used that $\Diamond \mathrm{Val}^\star_{p,1}$ implies $\Phi(p)$ almost surely because $\sigma''$ is winning from histories of value 1. Hence it suffices to show that ). The former equality is due to (3). To prove the latter equality, we write for $\sigma'' \in \{\sigma, \sigma'\}$, This equality reflects the decomposition of the event $\mathrm{Val}^\star_{p,1}$ into the disjoint events parametrised by the history $h$ of value 0 just before entering $\mathrm{Val}^\star_{p,1}$. Using (4), Lemma 4 and (3) we deduce that ), ending the proof. Proof of (3) ⇒ (1): Straightforward.
Theorem 1 (Characterisation and existence of admissible strategies). The following holds for all strategies $\sigma$ in a concurrent game with semantics $\star \in \{S, A\}$: (i) $\sigma$ is $\star$-admissible iff $\sigma$ is $\star$-LA and $\star$-SCO; in the special case of simple safety objectives, if $\sigma$ is $\star$-LA then $\sigma$ is $\star$-admissible.
In particular, point (ii) implies that admissible strategies always exist in concurrent games.
Proof of Theorem 1. We start with the proof of point (i).

"$\sigma$ is $\star$-admissible" implies "$\sigma$ is $\star$-SCO": We show the contrapositive. Let $\sigma$ be a strategy that is not $\star$-SCO, and let us show that $\sigma$ is not $\star$-admissible. Since $\sigma$ is not $\star$-SCO, there is, by definition of $\star$-SCO, a history $h$ s.t. $\chi^\star_\sigma(h) < \chi^\star_p(h)$, and there exists a strategy $\sigma'$ s.t. $\chi^\star_{\sigma'}(h) = \chi^\star_p(h)$. Consider the strategy $\hat\sigma = \sigma[h \leftarrow \sigma']$ that plays like $\sigma$ and switches to $\sigma'$ after history $h$. We claim that $\sigma \prec_\star \hat\sigma$. Observe that, by construction: $\chi^\star_\sigma(h) < \chi^\star_{\hat\sigma}(h)$. Note also that $\chi^\star_{\hat\sigma}(h) = -1$ is not possible, because this means that all continuations of $h$ are losing for $p$, so it is not possible to have $\sigma$ with $\chi^\star_\sigma(h) < \chi^\star_{\hat\sigma}(h)$. So we have necessarily $\chi^\star_\sigma(h) = 0$ and $\chi^\star_{\hat\sigma}(h) = 1$. This means that $\hat\sigma$ is winning from $h$ (against all possible strategies of $-p$), while, by definition of $\chi^\star_\sigma(h) = 0$, $\sigma$ is, from $h$, losing for $p$ against at least one strategy of $-p$. Hence, $\sigma \prec_\star \hat\sigma$ and $\sigma$ is not $\star$-admissible.

"$\sigma$ is $\star$-admissible" implies "$\sigma$ is $\star$-LA": We show the contrapositive: we assume that $\sigma$ is not $\star$-LA and show that $\sigma$ is not $\star$-admissible. There exists a history $h \in \mathrm{Hist}(\sigma)$ such that $\alpha = \sigma(h)$ is $\star$-dominated at $h$ by a $\star$-LA move $b$ that can be chosen Dirac by virtue of Lemma 2 (i). There exists $a \in \mathrm{Supp}(\sigma(h))$ such that $a <^\star_h b$ (otherwise all the moves of the support of $\sigma(h)$ would be equivalent to the $\star$-LA move $b$ and so would be $\alpha$ by Lemma 2 (ii)). We already saw in the proof of Lemma 2 (i) that $\sigma \preceq_\star \sigma_a$, where $\sigma_a$ is the strategy that plays like $\sigma$ everywhere except in $h$ where it plays $a$ instead of $\alpha$. Then it suffices to show that $\sigma_a$ is $\star$-dominated. Using Lemma 3 and the notation therein with $a \preceq^\star_h b$, we know that for every $c \in \Sigma_{-p}(s)$, $\chi^\star_p(hs_{(a,c)}) \le \chi^\star_p(hs_{(b,c)})$ and if $\chi^\star_p(hs_{(a,c)}) = \chi^\star_p(hs_{(b,c)}) = 0$ then $s_{(a,c)} = s_{(b,c)}$. Here we can reuse the end of the proof of Lemma 3 to find a strategy $\sigma'$ such that $\sigma_a \preceq_\star \sigma'$ and that satisfies the further properties that it is $\star$-SCO at $hs_{(b,c)}$ for every $c \in \Sigma_{-p}(s)$, and it plays as h in every proper prefix of $h$. It remains to show that there exists a profile $\tau \in \Gamma_{-p}$ such that $(\sigma_a, \tau) \not\models^\star \Phi(p)$ and $(\sigma', \tau) \models^\star \Phi(p)$. Using Lemma 3, this time with $b \not\preceq^\star_h a$, we get the existence of a Dirac move $c \in \Sigma_{-p}(s)$ such that $\chi^\star_p(hs_{(a,c)}) < \chi^\star_p(hs_{(b,c)})$. We define $\tau \in \Gamma_{-p}$ such that $h \in \mathrm{Hist}(\sigma_a, \tau)$, $\tau(h) = c$ and a further property depending on cases described as follows. If $\chi^\star_p(hs_{(a,c)}) = -1$ then $\chi^\star_p(hs_{(b,c)}) \ge 0$ and we further require that $(\sigma', \tau) \models^\star_{hs_{(b,c)}} \Phi(p)$. If $\chi^\star_p(hs_{(a,c)}) = 0$ then $\chi^\star_p(hs_{(b,c)}) = 1$ and we further require that $(\sigma_a, \tau) \not\models^\star_{hs_{(a,c)}} \Phi(p)$. In both cases it holds that $(\sigma_a, \tau) \not\models^\star \Phi(p)$ and $(\sigma', \tau) \models^\star \Phi(p)$.

"$\sigma$ is $\star$-SCO and $\star$-LA" implies "$\sigma$ is $\star$-admissible": Let $\sigma$ be a strategy that is both $\star$-LA and $\star$-SCO. We show that, for every $\sigma'$, $\sigma \preceq_\star \sigma'$ implies $\sigma \approx_\star \sigma'$, which is a way of proving that $\sigma$ is $\star$-admissible. Let $\sigma' \in \Gamma_p$ be such that $\sigma \preceq_\star \sigma'$. We can assume without loss of generality that $\sigma'$ is winning from histories of value 1 (otherwise it suffices to change $\sigma'$ so that, when a history of value 1 is reached, it switches to a winning strategy from that history).
We can apply Lemma 5, and deduce that $\sigma \succcurlyeq_\star \sigma'$ as required.

In simple safety games: "$\sigma$ is ⋆-LA" implies "$\sigma$ is ⋆-admissible": To establish this case, we prove that, in simple safety games, "$\sigma$ is ⋆-LA" implies "$\sigma$ is ⋆-SCO". This implies that all ⋆-LA strategies are also ⋆-SCO, hence admissible. First, observe that, in the case of simple safety games, the two semantics A and S coincide. Indeed, a winning strategy for the sure semantics is also winning for the almost sure semantics. On the other hand, a winning strategy for the almost sure semantics ensures that no prefix generated by the strategy reaches the bad states (otherwise, this prefix would have positive measure, and the probability of winning would be < 1). Hence, a winning strategy for the almost sure semantics is also winning for the sure semantics. So, we can restrict ourselves to the sure semantics in the arguments that follow.
Moreover, since the objective is a simple safety one, it is easy to see that when $\sigma'$ is S-winning, then $\sigma$ is S-winning too. So $\sigma'$ is dominated by $\sigma$ and it is sufficient to prove that $\sigma'$ is admissible to show that $\sigma$ is admissible too.
So, when states of value 1 are entered, $\sigma'$ ensures that they are never left, and $\sigma'$ is thus winning from states of value 1. On the other hand, when a history of value 0 is entered, $\sigma'$ will also ensure that only histories of value 0 or 1 are visited. In both cases, $\sigma'$ is winning because we are considering a safety objective. In particular, if we stay in histories of value 0 forever, the strategy is winning because the bad states are never visited (otherwise, the value of the history would drop to −1).

Now, we move to the proof of point (ii).
Let $\sigma$ be a strategy for player $p$; we build an admissible strategy $\sigma'$ that weakly dominates $\sigma$. This proof is similar to the proof of an analogous result for SCO strategies in turn-based games ([6, Lemma 7]). The strategy $\sigma'$ is built dynamically along runs. It plays a winning strategy as soon as the current history has value 1. When the history has value −1, there is nothing more to do, and the strategy can play arbitrarily. Consider now runs in which the value is always 0. The strategy $\sigma'$ keeps in memory a wished strategy $\sigma_h$ for player $p$ and a wished strategy $\tau_h$ of player $-p$ such that $(\sigma_h, \tau_h) \models^\star_h \Phi(p)$. The strategies $\sigma_h$ and $\tau_h$ are inductively defined as follows. At the beginning, $h = s_{\mathrm{init}}$, and $\sigma_h$ and $\tau_h$ are chosen such that $(\sigma_h, \tau_h) \models^\star_h \Phi(p)$ and such that $\sigma_h$ is $h$-admissible. Here and below, if $\sigma$ satisfies these properties then the choice $\sigma_h = \sigma$ is made by default; otherwise another strategy $\sigma_h$ is chosen (and it dominates $\sigma$). After history $h$, $\sigma'$ plays $\sigma_h(h)$. If after some history $h$ the continuation $hs'$ chosen is not in $\mathrm{Hist}(\sigma_h, \tau_h)$ (this is the case only if player $-p$ plays something other than $\tau_h(h)$), then $\sigma_{hs'}$ and $\tau_{hs'}$ are chosen such that $\sigma_{hs'}$ is $hs'$-admissible and $(\sigma_{hs'}, \tau_{hs'}) \models^\star_{hs'} \Phi(p)$. Otherwise, if the wished profile is followed, then the wished strategy for player $-p$ is left unchanged ($\tau_{hs'} = \tau_h$); the wished strategy for player $p$ is left unchanged ($\sigma_{hs'} = \sigma_h$) if $\sigma_h$ is $hs'$-admissible, and otherwise $\sigma_{hs'}$ is defined as a strategy that plays an admissible move in $hs'$ and that dominates $\sigma_h$. The constructed strategy is ⋆-LA. It is also ⋆-SCO because from every history $h$ of value 0, $(\sigma', \tau_h) \models^\star_h \Phi(p)$, and from every history of value 1, $\sigma'$ is winning. Moreover, $\sigma'$ plays as $\sigma$ by default, or as a strategy that dominates $\sigma$. In the end, $\sigma'$ is an admissible strategy that weakly dominates $\sigma$.

Example 6. We consider again the example in Figure 2, and consider strategies $\sigma_2$ and $\sigma_3$ as defined in Example 1.
Remember that these two strategies do their best to reach $s_2$, and that, from $s_2$, $\sigma_2$ plays deterministically $f$, while $\sigma_3$ plays $f$ and $g$ with equal probabilities. From Example 3, we know that $\sigma_2$ is S-SCO but not A-SCO, while $\sigma_3$ is A-SCO but not S-SCO. Indeed, we have already argued in Example 2 that $\sigma_2$ is not A-admissible, and that $\sigma_3$ is not S-admissible. However, from Example 4, we know that $\sigma_2$ is S-LA and that $\sigma_3$ is A-LA. So, by Theorem 1, $\sigma_2$ is S-admissible and $\sigma_3$ is A-admissible, as expected.
Finally, we close the section with a finer characterisation of ⋆-admissible strategies. We show that: (i) in the sure semantics, there is always an S-admissible strategy that plays Dirac moves only; and (ii) in the almost-sure semantics, there is always an A-admissible strategy that plays Dirac moves only in histories of values 0 or −1. The difference between the two semantics should not be surprising, as we know already that randomisation is sometimes needed to win (i.e., from histories of value 1) in the almost sure semantics.

Proposition 1. For all player $p$ strategies $\sigma$ in a concurrent game with ⋆ ∈ {S, A}:
(i) If $\sigma$ is A-admissible then there exists a strategy $\sigma'$ that plays only Dirac moves in histories of value ≤ 0 such that $\sigma \simeq_A \sigma'$.
(ii) If $\sigma$ is S-admissible then there exists a Dirac strategy $\sigma'$ such that $\sigma \simeq_S \sigma'$.

Assume-admissible synthesis
In this section we discuss an assume-admissible synthesis framework for concurrent games. With classical synthesis, one tries to compute winning strategies for all players, i.e., strategies that always win against all possible strategies of the other players. Unfortunately, it might be the case that such unconditionally winning strategies do not exist, as in our example. As explained in the introduction, the assume-admissible synthesis rule relaxes the classical synthesis rule: instead of searching for strategies that win unconditionally, the new rule requires winning against the admissible strategies of the other players. So, a strategy may satisfy the new rule while not winning unconditionally. Nevertheless, we claim that winning against admissible strategies is good enough assuming that the players are rational: if we assume that players only play strategies that are good for achieving their objectives, they will be playing admissible ones.
The general idea of the assume-admissible synthesis algorithm is to reduce the problem (in a concurrent $n$-player game) to the synthesis of a winning strategy in a 2-player zero-sum concurrent game of imperfect information, in the S-semantics (even when the original assume-admissible problem is in the A-semantics), where the objective of player 1 is given by an LTL formula. Such games are solvable using techniques presented in [10].
More precisely, from a concurrent game $G$ in the semantics ⋆ ∈ {S, A} and a player $p$, we build a game $G^\star_p$ with the above characteristics, which is used to decide the assume-admissible synthesis rule. If such a solution exists, our algorithm constructs a witness strategy. For example, the game $G^\star_1$ corresponding to the game in Figure 2 is given in Figure 3. The main ingredients for this construction are the following.
(i) In $G^\star_p$, the protagonist is player $p$, and the second player is $-p$.
(ii) Although randomisation is needed to win in such games in general, we interpret $G^\star_p$ in the S-semantics only. In fact, we have seen that for the protagonist, Dirac moves suffice in states of value 0; so the only states where he might need randomisation are those of value 1 (randomisation does not matter if the value is −1 since the objective is lost anyway). Hence we define the winning condition to be $\Phi(p) \vee \Diamond\mathrm{Val}^\star_{p,1}$, enabling us to consider only histories of value 0 in $G^\star_p$, and thus hiding the parts of the game where randomisation might be needed. We also prove that we can restrict to Dirac strategies for $-p$ when it comes to admissible strategies.
(iii) In order to restrict the strategies to admissible ones, we only allow ⋆-LA moves in $G^\star_p$. These moves can be computed by solving classical 2-player games ([2]) using Lemma 3. For example, in Figure 3, moves $c$ and $c'$ are removed since they are not A-LA.
(iv) Last, since ⋆-admissible strategies are those that are both ⋆-LA and ⋆-SCO (see Theorem 1), we also need to ensure that the players play ⋆-SCO. This is more involved than ⋆-LA, as the ⋆-SCO criterion is not local, and requires information about the sequence of actual moves that have been played, which cannot be deduced, in a concurrent game, from the sequence of visited states. So, we store, in the states of $G^\star_p$, the moves that have been played by all the players to reach the state. For example, in Figure 3, the state labelled by $(s_1, (b, b'))$ means that $G$ has reached $s_1$, and that the last actions played by the players were $b$ and $b'$ respectively. However, players' strategies must not depend on this extra information since they do not have access to this information in $G$ either. We thus interpret $G^\star_p$ as a game of imperfect information where all the states labelled by the same state of $G$ are in the same observation class. Thanks to these constructions, we can finally encode the fact that the players must play ⋆-SCO strategies in the new objective of the game, which will be given as an LTL formula, as we describe below.
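To make ingredients (iii) and (iv) concrete, here is a minimal Python sketch of the state space of such an annotated game. It is only an illustration under stated assumptions: the dictionary encoding of the transition function, the names `annotate_game` and `observation`, and the toy game are all hypothetical, and the sets of ⋆-LA moves are taken as precomputed inputs rather than derived via Lemma 3.

```python
from collections import deque

def annotate_game(init, delta, la_p, la_others):
    """Unfold a concurrent game into states of the form (state, last_moves).

    delta:     dict mapping (state, move_of_p, move_of_others) -> next state
    la_p:      dict mapping state -> set of moves of p kept after pruning
    la_others: dict mapping state -> set of moves of -p kept after pruning
    Both la_* maps are assumed to be precomputed; nothing here decides LA-ness.
    """
    start = (init, None)            # no move has been played yet
    nodes, trans = {start}, {}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        s = node[0]
        for a in sorted(la_p.get(s, ())):
            for c in sorted(la_others.get(s, ())):
                if (s, a, c) not in delta:
                    continue
                # The target remembers the moves that led to it.
                target = (delta[(s, a, c)], (a, c))
                trans[(node, a, c)] = target
                if target not in nodes:
                    nodes.add(target)
                    queue.append(target)
    return nodes, trans

def observation(node):
    """Players observe only the underlying state, not the recorded moves."""
    return node[0]

# Hypothetical toy game (not the game of Figure 2).
delta = {
    ("s0", "a", "x"): "s0", ("s0", "a", "y"): "s1",
    ("s0", "b", "x"): "s1", ("s0", "b", "y"): "s1",
    ("s1", "a", "x"): "s1", ("s1", "a", "y"): "s1",
}
la_p = {"s0": {"a", "b"}, "s1": {"a"}}
la_others = {"s0": {"x"}, "s1": {"x"}}   # move y is assumed to be pruned

nodes, trans = annotate_game("s0", delta, la_p, la_others)
```

In this toy instance the two annotated copies of `s1` differ only in the recorded moves and fall in the same observation class, which is the imperfect-information aspect described above.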
To simplify the presentation, we restrict ourselves to prefix-independent winning conditions, although the results can be generalised. Also, to ensure we can effectively solve the subproblems mentioned above, we consider ω-regular objectives. The values of the histories thus depend only on their last states, i.e. for all pairs of histories $h_1$ and $h_2$: $\mathrm{last}(h_1) = \mathrm{last}(h_2)$ implies that $\chi^\star_p(h_1) = \chi^\star_p(h_2)$. We thus denote by $\chi^\star_p(s)$ the value $\chi^\star_p(h)$ of all histories $h$ s.t. $\mathrm{last}(h) = s$.
The next claim shows that there is always such a finite-state strategy and that it can be computed effectively.
Claim. Given a game $G$, a state $s$, and a player $p$, we can construct a finite-state stochastic Moore machine that encodes an admissible strategy.

We establish this claim constructively. We consider the following case analysis to describe the machine:
• if $\chi^\star_p(s) = 1$, then the machine plays a finite-state ⋆-winning strategy; such a finite-memory strategy always exists and can be computed effectively.
• if $\chi^\star_p(s) = -1$, then the machine plays arbitrarily.
• if $\chi^\star_p(s) = 0$, then the machine selects a lasso-shaped path $\rho = \rho_1 \cdot \rho_2^\omega$ such that $\rho$ is compatible with ⋆-LA moves of player $p$ and such that $\rho \in \Phi(p)$. Such a lasso path always exists as $\chi^\star_p(s) = 0$. Then the machine plays according to this lasso path either forever or up to a deviation by another player. If the lasso path is played forever then the outcome is $\rho \in \Phi(p)$, and if there is a deviation, the new state of the game is $s'$ and the three rules here are applied from $s'$ (according to the value of state $s'$).
Clearly, if we always choose the same lasso path when entering a given state $s$ with value 0, then the machine has finitely many states. Since, in states with value 0, it always plays ⋆-LA moves and the path $\rho$ chosen from $s$ is such that $\rho \in \Phi(p)$, we conclude that the strategy is ⋆-SCO and ⋆-LA, and thus ⋆-admissible. So, we are done.
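The case analysis of the claim can be sketched as a small Python class. This is an illustrative sketch only, under several simplifying assumptions that are ours: the class name `LassoMachine` and its fields are hypothetical, the winning strategy in value-1 states is taken to be memoryless and given as a table, and each lasso is given as a pair of lists of (state, move) steps fixed once per value-0 state.

```python
class LassoMachine:
    """Finite-memory strategy following the three rules of the claim.

    value:     dict state -> -1, 0 or 1 (assumed precomputed)
    winning:   dict value-1 state -> move (assumed memoryless for simplicity)
    lasso:     dict value-0 state -> (prefix, cycle), two lists of (state, move)
               pairs; the cycle is assumed non-empty
    arbitrary: the move played once the objective is lost
    """

    def __init__(self, value, winning, lasso, arbitrary, start):
        self.value, self.winning = value, winning
        self.lasso, self.arbitrary = lasso, arbitrary
        self._enter(start)

    def _enter(self, s):
        # Apply the three rules according to the value of s.
        v = self.value[s]
        if v == 1:
            self.mode = ("win", s)
        elif v == -1:
            self.mode = ("lost", s)
        else:
            self.mode = ("lasso", s, 0)

    def _step(self, entry, i):
        # i-th (state, move) pair of the lasso chosen for `entry`.
        prefix, cycle = self.lasso[entry]
        if i < len(prefix):
            return prefix[i]
        return cycle[(i - len(prefix)) % len(cycle)]

    def output(self):
        kind = self.mode[0]
        if kind == "win":
            return self.winning[self.mode[1]]
        if kind == "lost":
            return self.arbitrary
        _, entry, i = self.mode
        return self._step(entry, i)[1]

    def update(self, next_state):
        kind = self.mode[0]
        if kind in ("win", "lost"):
            self.mode = (kind, next_state)
            return
        _, entry, i = self.mode
        prefix, cycle = self.lasso[entry]
        j = i + 1
        if j >= len(prefix) + len(cycle):
            # Wrap around the cycle so that the memory stays finite.
            j = len(prefix) + (j - len(prefix)) % len(cycle)
        if self._step(entry, j)[0] == next_state:
            self.mode = ("lasso", entry, j)     # lasso still followed
        else:
            self._enter(next_state)             # deviation: restart the rules

# Hypothetical toy data.
value = {"u": 0, "v": 0, "w": 1, "x": -1}
lasso = {"u": ([("u", "a")], [("v", "b")]), "v": ([], [("v", "b")])}
machine = LassoMachine(value, {"w": "f"}, lasso, "a", "u")

played = []
for nxt in ["v", "v", "w"]:
    played.append(machine.output())
    machine.update(nxt)
played.append(machine.output())
```

Because the index into the lasso is wrapped around the cycle, the memory ranges over finitely many values, which mirrors the finiteness argument given above.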
Theorem 2 (Assume-admissible synthesis). Player $p$ has a ⋆-admissible strategy $\sigma$ that is ⋆-winning against all player $-p$ ⋆-admissible strategies in $G$ iff Player $p$ has an S-winning strategy in $G^\star_p$ for the objective $\Phi_{G^\star_p}$. Such a ⋆-admissible strategy $\sigma$ can be effectively computed (from any player $p$ S-winning strategy in $G^\star_p$).

Proof. Completeness: Assume there exists an admissible strategy $\sigma \in \Gamma_p(G)$ that wins against every admissible strategy. Let $\hat\sigma \in \Gamma^{\mathrm{det}}_p(G)$ be a realisation of $\sigma$. Note that every run $\rho \in \mathrm{Outcome}(G^\star_p, \hat\sigma)$ that satisfies $\Diamond\mathrm{Val}^\star_{p,1}$ also satisfies $\Phi_{G^\star_p}$. The other runs $\rho \in \mathrm{Outcome}(G^\star_p, \hat\sigma)$ satisfy $\neg\Diamond\mathrm{Val}^\star_{p,1} \equiv \Box(\neg\mathrm{Val}^\star_{p,1})$; for these runs we show that $\bigwedge_{q \neq p} \bigl(\Phi^\star_0(q) \wedge \Phi^\star_1(q)\bigr) \rightarrow \Phi(p)$. So let $\rho \in \mathrm{Outcome}(G^\star_p, \hat\sigma)$ be such that $\rho$ satisfies $\Box(\neg\mathrm{Val}^\star_{p,1}) \wedge \bigwedge_{q \neq p} \bigl(\Phi^\star_0(q) \wedge \Phi^\star_1(q)\bigr)$. By Lemma 7, there exists $\tau$ that only contains admissible strategies such that $\mathrm{Outcome}(\hat\sigma, \tau) = \{o(\rho)\}$. By assumption $(\sigma, \tau)$ is winning for $\Phi(p)$, hence so is $\rho$.

Correctness: We show the following stronger statement. If a strategy $\hat\sigma$ is winning for $\Phi_{G^\star_p}$ in $G^\star_p$ then one can construct an admissible strategy $\bar\sigma$ that wins against every profile of admissible strategies as follows: (i) take an extension $\sigma'$ of $\hat\sigma$; (ii) modify $\sigma'$ into $\sigma''$ that plays a winning strategy as soon as a history of value 1 is entered; (iii) use Lemma 4 to design an admissible strategy $\bar\sigma$ that weakly dominates $\sigma''$.
Let us explain how we build a strategy in $G$ with the desired properties, from any player $p$ strategy enforcing $\Phi_{G^\star_p}$ in $G^\star_p$. Remember that $G^\star_p$ ensures that the players play ⋆-LA moves only. We will use $\Phi_{G^\star_p}$ to make sure that, when SCO strategies are played by $-p$ (relying on the extra information we have encoded in the states), then $p$ reaches a state of value 1. First, consider $\Phi^\star_0(q)$ for $q \neq p$. Runs that satisfy this formula are either those that visit states of value 0 only finitely often ($\Diamond\neg\mathrm{Val}^\star_{q,0}$); or those that stay in states of value 0, in which case they must be either winning ($\Phi(q)$) or visit infinitely often states where Player $q$ could have been helped by the other players ($\Box\Diamond\mathrm{AfterHelpMove}^\star_q$). This is a necessary condition on runs visiting only value 0 states for the strategy to be SCO. Next, observe that $\Phi^\star_1(q)$ states that if a history of value 1 is entered then Player $q$ must win. This allows us to understand the left part of the implication in $\Phi_{G^\star_p}$: the implication can be read as 'if all other players play a ⋆-admissible strategy, then either $p$ should win ($\Phi(p)$) or a state of value 1 for player $p$ should eventually be visited ($\Diamond\mathrm{Val}^\star_{p,1}$)'. Then a strategy $\sigma$ (in $G$) that wins against admissible strategies can be extracted from a winning strategy $\hat\sigma$ (in $G^\star_p$) in a straightforward way, except when $\hat\sigma$ enforces reaching a state of value 1 ($\Diamond\mathrm{Val}^\star_{p,1}$ in $\Phi_{G^\star_p}$). In this case, $\sigma$ cannot follow $\hat\sigma$, but must rather switch to a winning strategy, which: (i) is guaranteed to exist since the state that has been reached has value 1; and (ii) can be computed using classical techniques [10]. The strategy $\sigma$ is not necessarily admissible but, by Theorem 1 (i), there is an admissible strategy $\bar\sigma$ with $\sigma \preccurlyeq_\star \bar\sigma$. By weak domination, $\bar\sigma$ wins against at least all the profiles that $\sigma$ wins against; in particular, it wins against the profiles of admissible strategies of the other players.
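Read literally, the description above suggests the following shape for the objective. This is our reconstruction from the prose, not a quotation of the formal definitions; in particular, the exact form of $\Phi^\star_1(q)$ is an assumption on our part.

```latex
\begin{align*}
\Phi^\star_0(q) &= \Diamond\neg\mathrm{Val}^\star_{q,0}
                   \;\vee\; \Phi(q)
                   \;\vee\; \Box\Diamond\mathrm{AfterHelpMove}^\star_q,\\
\Phi^\star_1(q) &= \Diamond\mathrm{Val}^\star_{q,1} \rightarrow \Phi(q),\\
\Phi_{G^\star_p} &= \Bigl(\bigwedge_{q \neq p} \Phi^\star_0(q) \wedge \Phi^\star_1(q)\Bigr)
                   \rightarrow \bigl(\Phi(p) \vee \Diamond\mathrm{Val}^\star_{p,1}\bigr).
\end{align*}
```

Under this reading, the left-hand side of the implication collects the conditions imposed on the other players, and the right-hand side is the winning condition $\Phi(p) \vee \Diamond\mathrm{Val}^\star_{p,1}$ of the protagonist.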
Observe that this strategy is compatible with $o$. From $\hat\sigma$, we can extract an admissible player 1 strategy in $G$: always play $b$ in $s_0$; always play $d$ in $s_1$; and play a winning strategy from $s_2$ (which is of value 1), for instance: always play $0.5f + 0.5g$ from $s_2$, like $\sigma_3$ does.
We conclude by a remark on games with simple safety objectives.
Remark 1. In the case of simple safety games, the situation is much simpler. We have seen in Theorem 1 that, for simple safety objectives, ⋆-LA strategies are exactly the admissible strategies. So, we can simply build $G_p$ from $G$ by pruning the actions which are not ⋆-LA (the labelling by actions is not necessary anymore since its sole purpose is to enforce SCO), and look for a player $p$ winning strategy in the resulting game.
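The pruning described in this remark can be sketched as follows, under assumptions of ours: the game is encoded as a dictionary, the sets of ⋆-LA moves are given as inputs (they are not computed here), and the pruned game is solved in the sure semantics with the standard greatest-fixpoint computation for safety. The function names and the toy game are hypothetical.

```python
def prune(delta, la_p, la_others):
    """Keep only the transitions whose moves are in the given LA sets."""
    return {(s, a, c): t for (s, a, c), t in delta.items()
            if a in la_p[s] and c in la_others[s]}

def sure_safe_region(states, delta, bad):
    """States from which p can surely avoid `bad` forever.

    Greatest fixpoint: repeatedly remove states in which no move of p
    keeps the play inside the current candidate set against every move
    of the other players. Every state is assumed to have some transition.
    """
    win = set(states) - set(bad)
    changed = True
    while changed:
        changed = False
        for s in sorted(win):
            moves_p = {a for (s0, a, _) in delta if s0 == s}
            moves_o = {c for (s0, _, c) in delta if s0 == s}
            safe = any(all(delta.get((s, a, c)) in win for c in moves_o)
                       for a in moves_p)
            if not safe:
                win.discard(s)
                changed = True
    return win

# Hypothetical toy game: move y of the other player leads to the bad state.
states = {"s0", "bad"}
delta = {
    ("s0", "a", "x"): "s0", ("s0", "a", "y"): "bad",
    ("s0", "b", "x"): "s0", ("s0", "b", "y"): "bad",
    ("bad", "a", "x"): "bad",
}
la_p = {"s0": {"a", "b"}, "bad": {"a"}}
la_others = {"s0": {"x"}, "bad": {"x"}}   # y is assumed not to be LA

before = sure_safe_region(states, delta, {"bad"})
after = sure_safe_region(states, prune(delta, la_p, la_others), {"bad"})
```

In this toy instance player $p$ has no unconditionally winning strategy from `s0`, but does win once the move assumed to be non-LA is pruned, which is the effect the assume-admissible rule is meant to capture.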

Figure 1: The relationships between the classes of Admissible, LA, and SCO strategies for three families of games. All the inclusions are strict.

Figure 2: A concurrent game where Players 1 and 2 want to reach Trg and $s_2$, respectively.

Figure 3: The game $G^A_1$ obtained from the game in Figure 2. Bold states $(s_0, (a, b'))$ and $(s_1, (b, b'))$ are the states of $\mathrm{AfterHelpMove}^A_2$. There is a $(b, b')$-labelled transition from all states in the dashed rectangle to $(s_1, (b, b'))$.

Example 7. In our running example, observe that $\neg\mathrm{Val}^A_{2,0} = \mathrm{Val}^A_{2,1} = \{\mathit{Win}\}$ since there is no state of value −1 in $G$. Hence, $\Phi(2) = \Diamond\mathit{Win} = \Diamond\mathrm{Val}^A_{2,1} = \Diamond\neg\mathrm{Val}^A_{2,0}$. Finally, $\mathrm{AfterHelpMove}^A_2 = \{(s_0, (a, b')), (s_1, (b, b'))\}$, so, after simplification: $\Phi_{G^A_1} = \bigl(\Diamond\mathit{Win} \vee \Box\Diamond\bigl((s_0, (a, b')) \vee (s_1, (b, b'))\bigr)\bigr) \rightarrow \Diamond\mathit{Win}$. Thus, to win in $G^A_1$ (under the sure semantics), player 1 must ensure that $\mathit{Win}$ is reached whenever player 2 visits the set of bold states in Figure 3 infinitely often. A winning strategy $\hat\sigma$ in $G^A_1$ consists in (eventually) always playing $b$ from all states in the dashed rectangle, and $d$ from $(s_1, (b, b'))$.