Sparse Positional Strategies for Safety Games†
†This work was partially supported by the DFG as part of the Transregional Collaborative Research Center “Automatic Verification and Analysis of Complex Systems” (SFB/TR 14 AVACS).
Abstract
We consider the problem of obtaining sparse positional strategies for safety games. Such games are a commonly used model in many formal methods, as they make the interaction of a system with its environment explicit. Example applications are the synthesis of finite-state systems from specifications in temporal logic and alternating-time temporal logic (ATL) model checking. Often, a winning strategy for one of the players is used as a certificate or as an artefact for further processing in the application. Small such certificates, i.e., strategies that can be written down very compactly, are typically preferred. For safety games, we only need to consider positional strategies. These map game positions of a player onto a move that is to be taken by the player whenever the play enters that position. For representing positional strategies compactly, a common goal is to minimize the number of positions for which a winning player’s move needs to be defined such that the game is still won by the same player, without visiting a position with an undefined next move. We call winning strategies in which the next move is defined for few of the player’s positions sparse. From a sparse winning positional strategy for the safety player in a synthesis game, we can compute a small implementation satisfying the specification used for building the game, and for ATL model checking, sparse strategies are easier to comprehend and thus help in analysing the cause of a model checking result.
Unfortunately, even roughly approximating the density of the sparsest strategy for a safety game has been shown to be NP-hard. Thus, to obtain sparse strategies in practice, one either has to apply some heuristics, or use some exhaustive search technique, like ILP (integer linear programming) solving. In this paper, we perform a comparative study of currently available methods to obtain sparse winning strategies for the safety player in safety games. Approaches considered include commonly known techniques, such as ILP or SAT (satisfiability) solving, and a novel technique based on iterative linear programming. The restriction to safety games is not only motivated by the fact that they are the simplest game model for continuous interaction between a system and its environment, and thus an evaluation of strategy extraction methods should start here, but also by the fact that they are sufficient for many applications, such as synthesis. The results of this paper shed light onto which directions of research in this area are the promising ones, and whether current techniques are already scalable enough for practical use.
Doron Peled and Sven Schewe (Eds.): First Workshop on Synthesis (SYNT 2012), EPTCS 84, 2012, doi:10.4204/EPTCS.84.1. © R. Ehlers & D. Moldovan. This work is licensed under the Creative Commons Attribution-No Derivative Works License.
Rüdiger Ehlers    Daniela Moldovan
1 Introduction
Games with regular winning conditions have proven to be valuable tools for the construction and analysis of complex systems and are suitable computation models for logics such as the monadic second-order logic of one or two successors [11, 17, 2, 18]. By reducing a decision problem to determining the winning player in a game, the algorithmic aspect of solving the problem can easily be separated from the details of the application under concern. Winning strategies for one of the players in a game can be used as certificates for the answer to the original problem or serve as artefacts to be used in other steps of the application.
For example, when synthesizing finite-state systems [19, 14] from temporal logic specifications, the winning strategy for the system player in the corresponding game is an artefact that represents a system satisfying the specification, and is used for building circuits that implement the specification. In alternating-time temporal logic (ATL) [2], the question is posed whether agents in a certain setting can ensure certain global properties of a system to hold. A winning strategy for one of the players in the corresponding model checking game represents a certificate for the fact that the agents can achieve their goal or that there exists a counter-strategy for the remaining agents to prevent this. The certificate can then be used for human inspection of why or why not the agents can achieve their goal. In µ-calculus model checking, a strategy for the induced model checking game explains why or why not a given system satisfies some property. Again, a winning strategy serves as a certificate that is useful for further analysis of the setting.
In all these cases, certificates and artefacts that have a smaller representation are normally preferred. Such solutions are easier to comprehend and have (computational) advantages if used in successive steps (like building circuits from strategies in a synthesis game) or for analysing why a certain property holds or not. While for automata over regular words, which can be seen as one-player games, there exist some results on obtaining compactly representable one-player strategies for Büchi [13] and generalised Büchi [4] acceptance conditions, little research has been performed on obtaining compactly representable strategies in two-player games, even though it has been noticed that these are desperately needed in practice [3].
In this paper, we consider the problem of obtaining sparse positional strategies in safety games. Whenever a player follows a positional strategy, the choice of action to perform in one of its positions only depends on the position the game is currently in. While positional strategies are too restricted to represent winning strategies in very expressive game types such as Muller or Streett games in general, for simpler game types such as parity or safety games, it is assured that whenever one of the players has a winning strategy, then that player also has a winning positional strategy. Positional strategies are suitable for giving insight into why a modal µ-calculus formula is valid in some model or for providing information about why a specification is unrealisable in synthesis, as the obligations are encoded into the game graph. Technically, positional strategies are represented as functions from the positions of a player to the next move of the player. Thus, at first glance, all strategies have the same size. However, if some position is never reachable along a play, then the player’s move at that position does not matter, and we can leave the move for this position undefined. Positional strategies with many undefined moves can be represented more compactly, have the advantages outlined above, and are what we aim at computing in this paper. The number of game positions for which a next move is defined in a positional strategy is called its density, and strategies with a low density are called sparse in the following.
For applications such as synthesis, positional strategies are not necessarily the best model: a Mealy or Moore machine that implements a specification can have far fewer states than the density of the sparsest winning positional strategy for the system player in a corresponding synthesis game. Nevertheless, even in synthesis, positional strategies are useful. For example, one of the more recent synthesis approaches, namely Bounded Synthesis [16], can easily be altered such that there exists a positional strategy that represents the smallest possible implementation. Furthermore, stronger non-approximability results are known for non-positional strategies: it was shown that the number of states of the smallest Mealy or Moore machine that implements a winning strategy in a safety game is NP-hard to approximate within any polynomial function [5], while for positional strategies, non-approximability of the density of a sparsest strategy is known only within any constant [6].
While safety games are the main computational model that we aim to tackle, the techniques we compare in this paper are also useful for more expressive game models such as parity games. Extracting winning strategies in parity games can be done by computing a strategy that follows some attractor sets computed during the game solving process [9]. If we leave the concrete choice of a successor position in such a game open whenever there is more than one possibility to follow the attractor, we obtain a non-deterministic strategy that leaves some room for density improvement. Any strategy that is a special case of this non-deterministic strategy is a valid winning strategy, just like every strategy that does not leave the set of winning positions in a safety game is a valid winning strategy. Thus, the techniques discussed here can also be applied to the parity game case, with the drawback that the sparsest winning strategy in a parity game might not be a special case of the non-deterministic strategy computed from the attractor sets, and thus may be missed. Nevertheless, as there is, to the best of our knowledge, no work on sparse strategies in parity games yet, applying an approach for obtaining sparse winning strategies in safety games is the best technique available so far.
We compare a variety of techniques for obtaining sparse winning strategies in this paper. Apart from a fully randomized heuristic, which serves as a comparison baseline, we use a smarter randomized heuristic that finds locally optimal strategies, and consider the usage of SAT and ILP solvers to obtain a sparsest strategy. A novel technique, based on the repeated application of a linear programming solver to obtain hints on which game position to add to the strategy domain next, provides a trade-off between the density of the strategy and the computation time needed. For comparison, we also consider a recent algorithm by Neider [15], which uses computational learning to obtain small non-positional strategies. As there is no standard benchmark set available for safety games, we take games from the Bounded Synthesis domain.
We start the following presentation by defining safety games. As we compare the techniques to obtain sparse positional winning strategies against the computational learning approach, which produces non-positional strategies, we use an action-based definition of safety games, which ensures that the strategy types stand on common ground. In Section 3, we describe the techniques considered for finding sparse strategies. Then, in Section 4, we briefly describe the benchmarks used. Preceded by a short description of the experimental setup (including the tools used), we state the experimental results in Section 5. We close with a discussion of the results and indicate open problems.
Due to space restrictions, we do not describe how the computational learning-based strategy finding approach [15] works or how games are produced from specifications in the Bounded Synthesis [16] process. Rather, we assume familiarity with these subjects in the corresponding Sections 3.6 and 4, and only explain the connection to this work.
2 Preliminaries
2.1 Safety Games
A safety game is defined as a tuple $\mathcal{G} = (V_0, V_1, \Sigma_0, \Sigma_1, E_0, E_1, v_{\mathit{init}})$. In the game, we have two competing players, namely player $0$ and player $1$. Player $0$ has the (finite) set of positions $V_0$, the (finite) set of actions $\Sigma_0$, and the partial edge function $E_0 : V_0 \times \Sigma_0 \rightharpoonup V_0 \cup V_1$. Player $1$ in turn has her set of positions $V_1$, her set of actions $\Sigma_1$ and her edge function $E_1 : V_1 \times \Sigma_1 \rightharpoonup V_0 \cup V_1$. The game also has a designated initial position $v_{\mathit{init}} \in V_0 \cup V_1$. For simplicity, we define $V = V_0 \cup V_1$, $\Sigma = \Sigma_0 \cup \Sigma_1$, and $E : V \times \Sigma \rightharpoonup V$ with $E(v, \sigma) = E_i(v, \sigma)$ for $v \in V_i$ if $E_i(v, \sigma)$ is defined and $E(v, \sigma) = \bot$ otherwise as shortcuts to be used in the following. If for some position $v$ and action $\sigma$, we have $E(v, \sigma) = v'$ for some position $v'$, then we call $v'$ a successor of $v$. The set of successors of a position $v$ is also denoted by $\mathit{succ}(v)$.
In a play of the game, the players move a pebble along the positions in the game. Starting from the initial position, whenever the pebble is in a position $v \in V_0$, player $0$ chooses an action $\sigma \in \Sigma_0$ and moves the pebble to position $E_0(v, \sigma)$. The case of the pebble being in a position $v \in V_1$ is analogous for player $1$. By concatenating the actions taken by the two players along the play, we obtain a decision sequence in the game.
Given a set $S$, we denote the set of finite sequences over $S$ by $S^*$, and the set of infinite sequences over $S$ by $S^\omega$. A sequence $\pi = \pi_0 \pi_1 \ldots \in V^* \cup V^\omega$ is then a play with a corresponding decision sequence $\rho = \rho_0 \rho_1 \ldots \in \Sigma^* \cup \Sigma^\omega$ if $\pi_0 = v_{\mathit{init}}$ and for all $i \in \mathbb{N}$, if $\pi_i \in V_0$, then $\rho_i \in \Sigma_0$ and $\pi_{i+1} = E_0(\pi_i, \rho_i)$ (or the play ends with $\pi_i$ if $E_0(\pi_i, \rho_i)$ is undefined), and if $\pi_i \in V_1$, then $\rho_i \in \Sigma_1$ and $\pi_{i+1} = E_1(\pi_i, \rho_i)$ (or the play ends with $\pi_i$ if $E_1(\pi_i, \rho_i)$ is undefined). Note that for every decision sequence, there is precisely one play to which it corresponds. Plays in a game are either winning for player $0$ or player $1$. Finite plays $\pi = \pi_0 \ldots \pi_n$ for which we have $\pi_n \in V_1$ are winning for player $0$, whereas for $\pi_n \in V_0$, the play is winning for player $1$. Infinite plays are won by player $0$.
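To make the definitions concrete, a safety game of this kind can be written down directly as a pair of partial edge functions. The following Python sketch uses a hypothetical toy game; all position and action names are invented for illustration and are not taken from the paper’s benchmarks:

```python
# Hypothetical encoding of a small safety game as Python dictionaries.
# E0/E1 map (position, action) -> successor position.
V0 = {"a", "c"}                      # positions of player 0 (the safety player)
V1 = {"b", "d"}                      # positions of player 1
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "z"): "b"}   # partial edge function of player 0
E1 = {("b", "u"): "a", ("d", "u"): "c"}                    # edge function of player 1
v_init = "a"                         # designated initial position

def successors(pos):
    """The set succ(pos): all successors of a position, over all actions."""
    E = E0 if pos in V0 else E1
    return {t for (s, _), t in E.items() if s == pos}
```

The same dictionary shape also accommodates games without strict alternation, since each edge may lead to a position of either player.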
2.2 Strategies
When playing the game, a player may follow a predefined strategy. Formally, a strategy for player $i \in \{0, 1\}$ is simply a function $f_i : \Sigma^* \to \Sigma_i$. A decision sequence $\rho = \rho_0 \rho_1 \ldots$ is said to correspond to $f_i$ if for the play $\pi = \pi_0 \pi_1 \ldots$ that corresponds to $\rho$ and all $j \in \mathbb{N}$, if $\pi_j \in V_i$, then $\rho_j = f_i(\rho_0 \ldots \rho_{j-1})$. If all decision sequences that correspond to a given strategy $f_i$ of player $i$ induce only plays that are winning for player $i$, then we call the strategy winning.
In safety games, it is assured that one of the two players has a winning strategy (see, e.g., [11]). If player $i$ has a strategy to win the game, then we say that player $i$ wins the game. We can restrict our attention to a special kind of strategies, namely positional strategies. We call a strategy $f_i$ positional if for all pairs of prefix decision sequences $\rho$ and $\rho'$ whose corresponding plays end in the same position, we have $f_i(\rho) = f_i(\rho')$. In other words, at any position in a play, the next decision of a player that follows a positional strategy only depends on the position the play is in at that time. As a consequence, such a positional strategy can also be described by a function $f_i : V_i \to \Sigma_i$ that maps every position of player $i$ in the game to an action to be chosen by the player whenever the position is visited. The restriction to positional strategies is motivated by the fact that in safety games, whenever there exists a winning strategy for one of the players, then there also exists a winning positional strategy for that player. The standard algorithm to solve safety games (i.e., determining the winner of the game) described in the next subsection also produces positional strategies as certificates/artefacts.
For comparing different strategies and in particular finding sparse strategies, we need to define a density measure for positional strategies. Recall that the motivation of focusing on sparse strategies is that they are better comprehensible certificates and have computational advantages when used as artefacts for further processing. For positional strategies, we only need to consider choices at positions in $V_i$ that are reachable along some play that corresponds to the strategy. If for a positional strategy $f_i$, there is some position $v$ that can never be reached along a play that corresponds to a decision sequence that in turn corresponds to the strategy, then $f_i(v)$ can be arbitrary without changing the behaviour of the strategy. We thus define the density of a positional strategy for player $i$ to be the number of positions of player $i$ that can be visited along some play that corresponds to this strategy.
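Under this definition, the density of a concrete positional strategy can be computed by a simple reachability analysis: explore the positions visited when player 0 follows the strategy and player 1 moves freely. The following sketch uses a hypothetical toy game with invented names:

```python
from collections import deque

def density(strategy, V0, E0, E1, v_init):
    """Number of player-0 positions reachable when player 0 follows
    `strategy` (a dict: position -> action) and player 1 moves freely."""
    seen, queue = {v_init}, deque([v_init])
    while queue:
        v = queue.popleft()
        if v in V0:
            nxt = [E0[(v, strategy[v])]]          # player 0's fixed choice
        else:
            nxt = [t for (s, _), t in E1.items() if s == v]  # any player-1 move
        for t in nxt:
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return len(seen & V0)

# Toy game (hypothetical names): choosing "x" at "a" keeps the play in
# {a, b}; choosing "y" additionally makes position "c" reachable.
V0 = {"a", "c"}
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "z"): "b"}
E1 = {("b", "u"): "a", ("d", "u"): "c"}
```

Note that the two strategies below differ in density even though both define a move for every player-0 position.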
More formally, we could also define $f_i$ as a partial function from $V_i$ to $\Sigma_i$ and define the strategy density to be the size of the domain of $f_i$. In this case, whenever the pebble is in a position for which $f_i$ is undefined, we assume that player $i$ declares that she loses the play. If the strategy is still winning under this modified definition of who wins a play, then the fact that $f_i$ is only a partial function apparently does not matter, and $f_i$ can be considered to be a valid positional strategy.
2.3 Solving Safety Games
For discussing the problem of obtaining sparse positional strategies in safety games, it is reasonable to separate the complexity of the process of solving the game (which is doable in polynomial time) from the actual optimization problem of minimizing the strategy (which is NP-hard). Solving the game means identifying the set of winning positions in the game, i.e., those positions for which, if any of them is the initial one, the safety player (player $0$) wins the game. Solving a safety game is relatively simple: it can be shown that the set of winning positions is precisely the largest set of positions that (1) does not contain a position of player $0$ that has no successors, (2) for which for every position of player $0$ in the set, one of its successors is in the set, and (3) for every position of player $1$ in the set, all of its successors are in the set. This largest set can be computed by starting with all positions, and successively removing any position that does not satisfy (1), (2), or (3). Once no more positions can be removed, the game solving process is complete.
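The fixpoint computation just described can be sketched in a few lines. The toy game below is hypothetical (invented names): player-0 position "e" is a dead end, which drags "d" and "c" out of the winning set as well:

```python
def solve_safety_game(V0, V1, E0, E1):
    """Compute player 0's winning positions by the standard fixpoint:
    repeatedly remove positions violating conditions (1)-(3)."""
    W = set(V0) | set(V1)
    def succ(p):
        E = E0 if p in V0 else E1
        return {t for (s, _), t in E.items() if s == p}
    changed = True
    while changed:
        changed = False
        for v in list(W):
            # player-0 positions need some successor in W (covers (1) and (2));
            # player-1 positions need all successors in W (condition (3)).
            keep = bool(succ(v) & W) if v in V0 else succ(v) <= W
            if not keep:
                W.discard(v)
                changed = True
    return W

# Toy game: "e" is a player-0 dead end.
V0 = {"a", "c", "e"}; V1 = {"b", "d"}
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "x"): "d"}
E1 = {("b", "x"): "a", ("d", "x"): "c", ("d", "y"): "e"}
winning = solve_safety_game(V0, V1, E0, E1)
```

In this example, only "a" and "b" survive the removal process, so player 0 wins if and only if the initial position is one of these two.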
Let $W$ be the set of winning positions and assume $v_{\mathit{init}} \in W$. Any positional strategy $f_0$ for which for all $v \in V_0 \cap W$, we have that $E_0(v, f_0(v)) \in W$, is a winning one, as it ensures that the set of winning positions is never left: such a choice exists for every position in $V_0 \cap W$ by condition (2) above, player $1$ cannot initiate leaving $W$ along a play by condition (3), and no dead end for player $0$ is part of $W$ by condition (1). At the same time, any positional strategy that allows leaving $W$ at some point in a play is not winning. This motivates the definition of a most permissive winning strategy $\Delta : V_0 \cap W \to 2^{\Sigma_0}$ for player $0$ in the game: we define $\Delta(v) = \{\sigma \in \Sigma_0 \mid E_0(v, \sigma) \in W\}$ for every $v \in V_0 \cap W$, as every concrete winning positional strategy must be a specialization of $\Delta$, i.e., have $f_0(v) \in \Delta(v)$ for every position $v$ that is reachable along some play that corresponds to $f_0$. For a procedure to find sparse positional strategies in a safety game, we can thus use $\Delta$ as a basis for finding a sparse specialization.
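Extracting the most permissive strategy from a precomputed winning set is a direct filter over player 0’s edges. A sketch on hypothetical toy data (the winning set is assumed to have been computed beforehand by the fixpoint described above):

```python
def most_permissive(V0, E0, W):
    """Most permissive strategy Delta: for every winning player-0
    position, keep exactly the actions whose successor stays inside W."""
    return {v: {a for (s, a), t in E0.items() if s == v and t in W}
            for v in V0 & W}

# Toy data (hypothetical names): "e" is a dead end, so the winning set
# is assumed to be {a, b}; at "a", only action "x" stays inside it.
V0 = {"a", "c", "e"}
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "x"): "d"}
W = {"a", "b"}
```

Every concrete winning positional strategy then picks one action from each of these sets.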
3 Approaches for Obtaining Sparse Winning Strategies
In the experimental evaluation to follow, we compare five techniques to obtain sparse winning strategies in games. In this section, we explain them and state the properties of the approaches. We are particularly interested in sparsest strategies in safety games, i.e., winning positional strategies with the lowest possible density.
3.1 Randomized Strategy Extraction
Probably the simplest way to obtain a concrete winning positional strategy from a most permissive strategy is to pick one allowed action arbitrarily for every winning position of player 0, and then to remove all positions that became unreachable from the strategy domain. Here, we perform a random pick, based on a uniform distribution over the available actions.
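This baseline can be sketched as follows; the toy game and all names are hypothetical, and the final pruning step restricts the strategy domain to the positions that remain reachable:

```python
import random

def random_extraction(permissive, E0, E1, v_init, seed=None):
    """Pick one allowed action uniformly at random for every winning
    player-0 position, then keep only the positions still reachable."""
    rng = random.Random(seed)
    full = {v: rng.choice(sorted(acts)) for v, acts in sorted(permissive.items())}
    seen, stack = {v_init}, [v_init]
    while stack:
        v = stack.pop()
        nxt = ([E0[(v, full[v])]] if v in full
               else [t for (s, _), t in E1.items() if s == v])
        for t in nxt:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return {v: a for v, a in full.items() if v in seen}

# Toy game (hypothetical): at "a", action "x" yields density 1, while
# "y" also drags "c" into the play (density 2).
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "z"): "b"}
E1 = {("b", "u"): "a", ("d", "u"): "c"}
strategy = random_extraction({"a": {"x", "y"}, "c": {"z"}}, E0, E1, "a", seed=42)
```

Depending on the random pick at "a", the result is either the sparse strategy of density 1 or the denser one of density 2, which is exactly why this heuristic only serves as a comparison baseline.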
3.2 Smarter Randomized Strategy Extraction
Given a game with the set $W$ of winning positions for player $0$ (and $v_{\mathit{init}} \in W$), another way to describe the problem of obtaining sparse winning positional strategies is to search for an as-large-as-possible set $X \subseteq V_0 \cap W$ of positions for which the concrete positional strategy function should be undefined. Any strategy that respects $X$ will then have the same density (as otherwise, there is some position that we can add to $X$, and thus $X$ is not as large as possible).
As the density of sparsest positional strategies is NP-hard to approximate within any constant [6], finding an approximately largest set $X$ is also NP-hard. However, we might settle for local optima, i.e., declare ourselves satisfied with a set $X$ such that there is no position of player $0$ in the game that can be added to $X$ such that there is still a winning positional strategy that respects $X$ (i.e., that is undefined for every position in $X$). Such a set can be obtained in time polynomial in the size of the game.
In particular, we can do so as follows: we first create a random permutation of the winning positions of player $0$, and then, for every position in the list, examine whether the safety game is still won by player $0$ if we remove all outgoing edges of that position. Whenever this is the case, we add the position to $X$ and continue. Whenever the safety game becomes losing for player $0$ with this change, we undo it and try the next position in the list. Once every position in the list has been tried, we have obtained a locally optimal set $X$ (whose local maximality can easily be proven by deriving a contradiction from assuming the converse).
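The local-optimum search can be sketched by combining the edge-deletion trial with a game re-solving check. The toy game and all names below are hypothetical:

```python
import random

def smarter_extraction(V0, V1, E0, E1, v_init, seed=None):
    """In random order, try to delete all outgoing edges of a player-0
    position; keep the deletion if player 0 still wins from v_init,
    undo it otherwise. Returns the set X of droppable positions."""
    def wins(E0_cur):
        W = set(V0) | set(V1)
        def succ(p):
            E = E0_cur if p in V0 else E1
            return {t for (s, _), t in E.items() if s == p}
        changed = True
        while changed:
            changed = False
            for v in list(W):
                ok = bool(succ(v) & W) if v in V0 else succ(v) <= W
                if not ok:
                    W.discard(v)
                    changed = True
        return v_init in W
    rng = random.Random(seed)
    order = sorted(V0)
    rng.shuffle(order)
    E0_cur, X = dict(E0), set()
    for v in order:
        trial = {k: t for k, t in E0_cur.items() if k[0] != v}
        if wins(trial):
            E0_cur, X = trial, X | {v}
    return X

# Toy game (hypothetical): "c" can always be dropped, "a" never,
# independently of the random order.
V0 = {"a", "c"}; V1 = {"b", "d"}
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "z"): "b"}
E1 = {("b", "u"): "a", ("d", "u"): "c"}
```

Each trial re-solves the game, so the overall running time stays polynomial in the game size.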
Since we randomize the permutation, for every game, there is a non-zero probability of obtaining a sparsest strategy. However, it is not hard to define a series of games whose sizes grow only linearly, but for which the probability of obtaining a sparsest strategy using the algorithm above is exponentially small in the game size.
3.3 Integer Linear Programming
Given a game, we can formulate the problem of obtaining a sparse positional winning strategy for player $0$ as an integer linear programming (ILP) problem, in which we use one variable $x_v$ per position $v$ in the game. Whenever we obtain a solution to the ILP problem, a variable value of $1$ is supposed to mean that the position can be reached from the initial position along some path that corresponds to the computed strategy, whereas a value of $0$ means the opposite. By minimizing the sum of the variables that correspond to the positions of player $0$, we can obtain a sparsest strategy.
Formally, an ILP problem is a three-tuple $(\mathcal{V}, f, C)$, for which $\mathcal{V}$ is a set of variables, $f$ is a linear function over $\mathcal{V}$ that is to be minimized, and $C$ is a set of linear constraints over the allowed values of $\mathcal{V}$. Given a game $\mathcal{G}$ and a most permissive strategy $\Delta$, we can encode the problem of obtaining a sparsest positional winning strategy for player $0$ that is a specialization of $\Delta$ into an ILP problem by setting $\mathcal{V} = \{x_v \mid v \in W\}$, $f = \sum_{v \in V_0 \cap W} x_v$, and:
$$C = \{0 \le x_v \le 1 \mid v \in W\} \cup \{x_{v_{\mathit{init}}} = 1\} \cup \Big\{\sum_{\sigma \in \Delta(v)} x_{E_0(v,\sigma)} \ge x_v \;\Big|\; v \in V_0 \cap W\Big\} \cup \{x_{E_1(v,\sigma)} \ge x_v \mid v \in V_1 \cap W,\ \sigma \in \Sigma_1,\ E_1(v,\sigma) \ne \bot\},$$
where additionally all variables are required to take integer values.
There are four types of constraints in this ILP formulation: first of all, all variable values are fixed between $0$ and $1$. Then, the variable corresponding to the initial position in the game is forced to be $1$. For every position of player $0$ whose variable value is $1$, the third kind of constraint ensures that the variable for some successor position that is reachable via some action allowed by $\Delta$ has to be set to $1$. Finally, for positions of player $1$ whose variable has a value of $1$, the variables for all successor positions have to be $1$. For actually obtaining a positional strategy from a variable assignment, for every position of player $0$ whose variable is set to $1$, we pick an action that leads to a successor whose variable is also set to $1$.
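As an illustration, the constraint system can be generated as plain data before handing it to an ILP solver. The following sketch mirrors the four constraint kinds on a hypothetical toy game; the 0/1 bounds and integrality are assumed to be declared as variable bounds of whichever solver is used:

```python
def build_ilp(V0, V1, E0, E1, W, permissive, v_init):
    """Emit the ILP as plain data: each constraint is (coeffs, rel, rhs),
    with coeffs mapping a position's variable to its coefficient.
    A sketch of the encoding, not the paper's implementation."""
    objective = {v: 1 for v in V0 & W}           # minimise sum over player-0 positions
    constraints = [({v_init: 1}, "==", 1)]       # the initial position is reached
    for v in sorted(V0 & W):
        # x_v = 1 forces some Delta-allowed successor variable to 1:
        coeffs = {E0[(v, a)]: 1 for a in permissive[v]}
        coeffs[v] = coeffs.get(v, 0) - 1
        constraints.append((coeffs, ">=", 0))
    for (s, _), t in sorted(E1.items()):
        if s in W:
            # x_s = 1 forces every successor of a player-1 position to 1:
            constraints.append(({t: 1, s: -1}, ">=", 0))
    return objective, constraints

# Toy game (hypothetical names):
V0 = {"a", "c"}; V1 = {"b", "d"}
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "z"): "b"}
E1 = {("b", "u"): "a", ("d", "u"): "c"}
objective, constraints = build_ilp(V0, V1, E0, E1, V0 | V1,
                                   {"a": {"x", "y"}, "c": {"z"}}, "a")
```

Feeding these rows into any ILP library (the paper’s experiments use lp_solve) then yields an assignment from which the strategy is read off as described above.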
3.4 SATbased Strategy Extraction
The ILP formulation of the sparsest positional strategy problem has the property that when regarding the variables as Boolean by interpreting $0$ as false and $1$ as true, all of the constraints can be represented as a disjunction of Boolean literals. For example, a constraint $x_{v'} \ge x_v$ can be written as $\neg x_v \lor x_{v'}$ in the Boolean domain. By rewriting the ILP instance in this way, SAT (satisfiability) solvers can be applied. A SAT-based approach to strategy finding has already been pursued in [8].
Most currently available SAT solvers, however, cannot take optimization objectives into account when computing a solution. For using such a solver, we could encode a cardinality constraint on the number of variables for player $0$'s positions that may be set to $1$, and perform a binary search for the best possible strategy density. For this paper, we use the SAT solver OPTSAT v1.1 [10], which has this functionality already built in.
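The rewriting into clauses is mechanical and can be sketched as follows, producing DIMACS-style clauses (lists of positive/negative integer literals) for a hypothetical toy game:

```python
def ilp_to_cnf(V0, V1, E0, E1, W, permissive, v_init):
    """Rewrite the 0/1 constraint system as CNF clauses in DIMACS style.
    A sketch of the translation, not the paper's implementation."""
    var = {v: i + 1 for i, v in enumerate(sorted(W))}
    clauses = [[var[v_init]]]                     # the initial position is reached
    for v in sorted(V0 & W):
        # x_v -> (x_s1 or x_s2 or ...) over the successors allowed by Delta
        clauses.append([-var[v]] + [var[E0[(v, a)]] for a in sorted(permissive[v])])
    for v in sorted(V1 & W):
        for (s, _), t in sorted(E1.items()):
            if s == v:
                clauses.append([-var[v], var[t]])  # x_v -> x_t
    return var, clauses

# Toy game (hypothetical names):
V0 = {"a", "c"}; V1 = {"b", "d"}
E0 = {("a", "x"): "b", ("a", "y"): "d", ("c", "z"): "b"}
E1 = {("b", "u"): "a", ("d", "u"): "c"}
W = V0 | V1
var, clauses = ilp_to_cnf(V0, V1, E0, E1, W, {"a": {"x", "y"}, "c": {"z"}}, "a")
```

The cardinality bound on the player-0 variables mentioned above would be added on top of these clauses, e.g., via a standard cardinality encoding.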
3.5 Repetitive Linear Programming
The integer linear programming approach from Section 3.3 is exact and guaranteed to find a sparsest strategy. As the problem of obtaining sparsest winning positional strategies is NPhard, we cannot expect ILP solvers to work fast on ILP instances that encode this problem in general. To counter this fact, we propose an alternative approach here, which implements a heuristic based on linear programming over the real numbers (LP). In contrast to ILP solving, LP solving can be performed in polynomial time.
Consider the constraint system built in the ILP approach of Section 3.3, but this time over the real numbers. After applying a linear programming solver to the system, we obtain a variable valuation $x : \mathcal{V} \to [0, 1]$. Some values might be $0$, some might be $1$, and in many cases, some values are strictly in between. Thus, the values might not represent an actual solution to the sparse strategy problem. We can however fix the valuation in an iterative fashion. Suppose that we start the linear programming solver on the problem again, but this time fix all variables that were $1$ after the previous solver run to $1$, fix all variables that had a value of $0$ to $0$, and additionally fix one variable whose previous value was strictly between $0$ and $1$ to $1$. The linear programming solver will compute a new solution, but possibly with a worse value of the objective function. However, the number of variables that are neither $0$ nor $1$ will have decreased by at least one. If we iterate the process until all variables have values of either $0$ or $1$, we have a blueprint for a sparse, but not necessarily sparsest, strategy. However, the complexity of this approach is only polynomial, and we use the LP solver to guide our search for a sparse winning strategy.
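The iteration can be sketched independently of any concrete LP library; `solve_lp` below is an assumed interface that takes the currently fixed variables and returns a (possibly fractional) solution over $[0, 1]$:

```python
def iterative_lp_rounding(solve_lp, variables):
    """Skeleton of the repeated-LP heuristic: freeze every variable that
    came back integral, force one fractional variable to 1, and re-solve,
    until the solution is fully integral. `solve_lp` is an assumed
    interface (dict of fixed values -> full solution dict)."""
    fixed = {}
    while True:
        sol = solve_lp(fixed)
        fractional = [v for v in variables if 0 < sol[v] < 1]
        if not fractional:
            return sol
        for v in variables:
            if sol[v] in (0, 1):
                fixed[v] = sol[v]
        fixed[fractional[0]] = 1      # force one fractional variable to 1

# Stand-in solver for illustration only: every unfixed variable comes
# back as 0.5, so each round freezes one more variable.
def fake_lp(fixed):
    return {v: fixed.get(v, 0.5) for v in ["p", "q"]}

rounded = iterative_lp_rounding(fake_lp, ["p", "q"])
```

Since each round removes at least one fractional variable, the loop makes at most as many LP calls as there are variables, which keeps the overall running time polynomial.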
3.6 Computational Learning of Sparse Strategies
Recently, the problem of obtaining compactly representable winning strategies in safety games has been tackled from a computational learning perspective by Neider [15]. In computational learning of a regular language of finite words, the task is to obtain a deterministic finite automaton (DFA) representation of such a language using only equality and containment checks. The idea in applying this technique to strategy extraction is to use the prefixes of the winning decision sequences for player $0$ in a game as the language to be learned; the learning process can, however, already be stopped after a subset of this language has been learned that is closed under appending allowed actions of player $1$ (i.e., those actions that are available to player $1$ at the respective point in the game). The left part of Figure 1 depicts an example automaton for such a language.
Note that the automaton is also concerned with actions of player $1$, and when taking its number of states as the size measure, it can easily be larger than the density of a positional strategy. However, at the same time, a strategy automaton can also be smaller, as it allows merging states with the same suffix language. Also, a strategy DFA might offer more than one possible action to player $0$ at some point in the play, and there is no guarantee that there actually exists a positional strategy in the game that the DFA represents (or over-approximates). As a consequence, the density of the sparsest positional strategy and the size of the smallest automaton-based representation of a strategy are incomparable.
For games that represent some synthesis problem and have strict alternation between the two players in the game, positional strategies are not necessarily the model of choice. Typically, when the safety player is winning such a game, it is desired to build a Mealy or Moore machine from a winning strategy that then represents a reactive system satisfying the specification from which the game is built. Such a Mealy or Moore machine takes the actions of the other player as input and produces player 0's actions as output. Any trace that the machine may produce must then be a winning decision sequence in the original game. A Mealy or Moore machine can have a size (represented by its number of states) that is far smaller than the density of the sparsest winning positional strategy in a game. For example, a game with many positions could be won by the system player by always playing the same action. A machine representing this strategy would have only one state, whereas many positions of the safety player in the game might be visited along a corresponding play. While it is always possible to translate a winning positional strategy of some density into a Mealy machine with at most that many states (assuming that player $1$ plays first in the game), the DFA produced by a computational learning approach is equally suitable as a starting point for a Mealy machine computation: we use the state set of the DFA as the state set of the Mealy machine, but contract every sequence of two successive transitions that represents the input and the output of one round into one transition of the Mealy machine. The number of states that then remain reachable is the size of the Mealy machine. Figure 1 illustrates this translation process. For a more thorough definition and discussion of the connection between Mealy/Moore machines and games, see [7].
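The contraction of transition pairs can be sketched as follows; the tiny strategy DFA is hypothetical, and strict input/output alternation is assumed (if the DFA offers several outputs for an input, an arbitrary one is kept in this sketch):

```python
def dfa_to_mealy(dfa_trans, dfa_init):
    """Contract pairs of successive DFA transitions (input action, then
    output action) into single Mealy transitions, and collect the states
    still reachable after complete input/output rounds."""
    mealy, seen, stack = {}, {dfa_init}, [dfa_init]
    while stack:
        q = stack.pop()
        for (s, inp), mid in dfa_trans.items():
            if s != q:
                continue
            for (m, out), tgt in dfa_trans.items():
                if m == mid:
                    mealy[(q, inp)] = (out, tgt)   # one transition per round
                    if tgt not in seen:
                        seen.add(tgt)
                        stack.append(tgt)
    return mealy, seen

# Tiny hypothetical strategy DFA: a request comes in, a grant goes out,
# and the automaton returns to its start state.
dfa_trans = {("q0", "req"): "q1", ("q1", "grant"): "q0"}
mealy, states = dfa_to_mealy(dfa_trans, "q0")
```

Here the two-state DFA collapses into a one-state Mealy machine, illustrating how the intermediate "after input" states disappear in the contraction.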
4 Benchmarks
To make our experimental evaluation as insightful as possible, we only consider games from practice as benchmarks, and leave out the commonly used randomly generated games and toy examples such as variants of tic-tac-toe or other folk games. Instead, we use games stemming from Bounded Synthesis [16], which is an approach for the synthesis of finite-state systems from specifications in temporal logic. Intuitively, a synthesis process that follows this approach starts by representing the specification as an automaton that ensures that for every trace of a system to be synthesized that we declare to be illegal, the automaton has some corresponding run on which some so-called rejecting state is visited infinitely often. If we now restrict the number of visits to these rejecting states along a run to some finite bound, and find a system for which all automaton runs for all traces of the system stay within this bound, then we have a valid implementation. At the same time, the problem of synthesizing such solutions can be reduced to safety game solving, which makes the approach conceptually simple.
Here, we consider two variants of building the games from specifications. The first one uses the classical construction from [16], adapted to finding Mealy machines instead of Moore machine implementations. In the second one, we use a modification proposed in [7]: we allow the system player to voluntarily put herself into an unnecessarily bad situation in the game. In a bounded synthesis game, positions are labelled by some counter vector, which is updated whenever both players have made their moves. The positions have the property that for two positions labelled by counter vectors $c$ and $c'$ such that $c_i \le c'_i$ for every component $i$, all of player 0's winning strategies for the position labelled by $c'$ are also winning strategies for the same game but with the position labelled by $c$ as the initial one. Thus, by allowing player $0$ to increase her counters voluntarily, we do not give her additional power. Additionally, we introduce a position for player $0$ to increase her counter values from the initial ones before the actual start of the game. While this modification does not give player $0$ more possibilities to win the game, it allows us to find sparser strategies. In fact, it is a corollary of Theorem 2 of [16] that there exists some Mealy machine with $n$ states that satisfies the specification and adheres to the given bound if and only if the bounded synthesis game with the counter increase possibility for player $0$ allows a strategy of density $n$. Thus, searching for the sparsest positional strategy will lead to the smallest Mealy-type implementation. Note that, strictly speaking, the safety games resulting from the modification do not conform to the safety game definition in Section 2 any more, as the counter increasing possibility leads to multiple successors that all correspond to the same action for some positions of player $0$. However, for approaches to find sparse positional winning strategies, this makes no difference. For both variants of Bounded Synthesis, we consider the following benchmarks:

a basic mutex (), for the linear-time temporal logic (LTL) specification , the input bits , and the output bits ,

a basic reaction scheme () with the specification for the input bits and the output bits ,

three dining philosophers () getting hungry at the same time, with for the input bit and the output bits (describing which philosophers are eating), and

some examples from [12], mostly arbiter and traffic light examples (). Unrealizable specifications have been left out.
All benchmarks are parametrized by the bound value. For example, the table entry in the following section will refer to the basic mutex example with a bound value of . If the second variant of the Bounded Synthesis process is used, in which player 0 has the possibility to voluntarily increase some counter values, the benchmark name appears primed, e.g., as in .
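The counter-vector property exploited by the second variant can be illustrated with a small sketch. This is a hypothetical representation (counter vectors as tuples of integers), not the paper's actual implementation:

```python
# Sketch: counter-vector dominance in bounded synthesis games.
# A position's counter vector is updated as the play proceeds; exceeding
# the bound loses for player 0. If c dominates d component-wise, then any
# strategy winning from the position labelled c also wins from the one
# labelled d -- so letting player 0 *increase* counters adds no power.

def dominates(c, d):
    """True iff counter vector c is component-wise at least d."""
    return len(c) == len(d) and all(ci >= di for ci, di in zip(c, d))

def voluntary_increases(c, bound):
    """All counter vectors obtainable from c by voluntarily raising
    individual entries, up to the bound (the pre-game increase step
    granted to player 0 in the modified construction)."""
    if not c:
        yield ()
        return
    head, tail = c[0], c[1:]
    for h in range(head, bound + 1):
        for rest in voluntary_increases(tail, bound):
            yield (h,) + rest
```

For example, `voluntary_increases((1, 0), 1)` yields `(1, 0)` and `(1, 1)`: raising a counter never unlocks new wins, but it lets several plays share one strategy position, which is what enables sparser strategies.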
5 Experimental Results
We implemented the approaches described in Section 3 in C++, except for the learning approach [15], for which we used an implementation provided by the author of [15] (also written in C++). For OPTSAT [10] and the learning-based tool, we used default settings. As the (integer) linear programming library, we used liblpsolve v.5.5.
For obtaining the benchmarks, we implemented a tool that computes safety games for the Bounded Synthesis approach, without using any symbolic data structures such as binary decision diagrams (BDDs). Benchmarks for which the preparation required more than 64 gigabytes of RAM were left out. This limit was frequently exceeded for the modified Bounded Synthesis approach, as many of the resulting games have a huge number of positions, even though the percentage of positions that are winning for the safety player, and thus serve as input to the strategy density optimization algorithms, is quite low. Typically, we scaled the bound for the synthesis benchmarks up to . However, if the number of winning positions in a game exceeded 10000 for some bound, or if increasing the bound would yield the same game, we did not consider higher bounds. All games were pruned to the positions reachable when player 0 follows some arbitrary specialization of the most permissive strategy.
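The pruning step can be sketched as a plain reachability computation. This is a minimal sketch assuming an explicit graph representation (dicts and sets), not the actual tool:

```python
from collections import deque

def prune_reachable(initial, successors, winning, player0):
    """Keep only positions reachable when player 0 follows some arbitrary
    specialization of the most permissive strategy: in her own positions
    she commits to one fixed winning successor, while all moves of the
    opponent are explored.  `successors` maps positions to successor
    lists, `winning` is the set of winning positions, and `player0` is
    the set of player-0 positions."""
    reachable, queue = {initial}, deque([initial])
    while queue:
        pos = queue.popleft()
        succs = [s for s in successors[pos] if s in winning]
        if pos in player0:
            succs = succs[:1]  # arbitrary specialization: pick one move
        for s in succs:
            if s not in reachable:
                reachable.add(s)
                queue.append(s)
    return reachable
```

Since the opponent's moves are all kept, the pruned game still contains every play the chosen specialization has to survive; only positions no specialization needs to visit are dropped.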
We used a Sun XFire computer with 2.6 GHz AMD Opteron processors running an x64 version of Linux for obtaining the results. All tools considered are single-threaded. We restricted the memory usage for the strategy extraction to 4 GB and set a timeout of 600 seconds per invocation. All tools were run five times (25 times for the randomized approaches) to level out fluctuations. The tables in the following report mean values.
5.1 Strategy Densities
Table 1 and Table 2 compare the obtained strategy densities (or sizes for non-positional strategies) on the classical Bounded Synthesis games, whereas Table 3 considers the Bounded Synthesis benchmarks with the modification that player 0 can increase counter values at will. Timeouts are represented by “t/o”. Since with the modification switched on, building the safety games resulted in running out of memory in many cases, Table 3 only has relatively few entries. The remaining benchmarks have a low to medium number of positions, as the non-winning positions have already been pruned away, and these constitute the majority of positions created while building the game. However, the large search spaces and the bad performance of the purely random strategy extraction approach show that the benchmarks are still far from trivial. The search space size (in bits) represents how many syntactically different positional strategies are possible, and is defined over the set of winning positions . Quite often, the sparsest strategies have only a very low density. This is not a surprising situation in synthesis, as many systems can be implemented in very few states. The combination of large search spaces and the availability of sparse winning strategies makes the benchmarks at hand an excellent competition ground for the sparse strategy extraction approaches. To compare the density of positional strategies and the size of learning-based strategies, all tables also give the number of input and output atomic propositions of each benchmark.
It can be seen that for both Bounded Synthesis variants, the randomized approach and the repetitive linear-programming approach are quite competitive with the exact minimization approaches (ILP and OPTSAT), despite the large search space. For many benchmarks for which very sparse strategies are possible (e.g., ), all of the approaches dealing with positional strategies find some sparsest strategy. Furthermore, there is no clear winner between the smart randomized approach and repetitive linear programming. For example, for the basic mutex (unprimed), the latter approach always finds a sparsest strategy, whereas the randomized approach does not. On the other hand, for the dining philosophers (unprimed) and other benchmarks like , the situation is reversed.
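What the exact minimization approaches compute can be shown in miniature by brute force. The sketch below enumerates all partial positional strategies of a toy safety game and keeps the sparsest winning one; this is the optimization problem the ILP and SAT encodings solve at scale, not the paper's actual encoding:

```python
from itertools import product

def wins(initial, successors, safe, player0, strategy):
    """Check that, following the partial positional `strategy` (a dict
    mapping some player-0 positions to a chosen successor), every play
    from `initial` stays in `safe` and never reaches a player-0 position
    with an undefined move."""
    seen, stack = {initial}, [initial]
    while stack:
        pos = stack.pop()
        if pos not in safe:
            return False
        if pos in player0:
            if pos not in strategy:
                return False          # undefined next move reached
            succs = [strategy[pos]]   # player 0 follows her strategy
        else:
            succs = successors[pos]   # the opponent moves adversarially
        for s in succs:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return True

def sparsest_strategy(initial, successors, safe, player0):
    """Exhaustively search for a winning strategy that defines moves for
    as few player-0 positions as possible."""
    best, p0 = None, sorted(player0)
    # each player-0 position is either left undefined (None)
    # or mapped to one of its successors
    for choice in product(*[[None] + successors[p] for p in p0]):
        strat = {p: c for p, c in zip(p0, choice) if c is not None}
        if wins(initial, successors, safe, player0, strat):
            if best is None or len(strat) < len(best):
                best = strat
    return best
```

The density of the result is simply `len(best)`; the heuristics discussed in Section 3 trade the exhaustive enumeration for randomization or repeated LP relaxations.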
When evaluating how well the computational learning approach works, we need to compare across tables. In Table 3, the approach is not listed. The reason is that due to the counter increase option of player 0, there can be many successors in the game that all correspond to the same action, and the implementation of the approach does not support such games. However, since the learning approach can already find the smallest strategy in the games produced by the classical Bounded Synthesis approach, this is no drawback. Recall from Section 4 that there exists a Mealy machine with a given number of states that satisfies the specification and respects the bound if and only if the modified synthesis game for that bound has a positional strategy of the corresponding density. This fact allows us to measure the success of the learning approach. We can see that in most cases, it did not find the minimal implementation. For example, for , the sparsest positional strategy has density , i.e., for . Intuitively, for this benchmark, a Mealy machine with two states that satisfies the specification would simply alternate between giving the grant to the two requesters. The Mealy machine sizes for the learning approach are, however, larger, and grow with the bound value. Thus, the learning approach can be fooled by needlessly large games. However, for benchmarks such as , for which the modified version of the game was too large to fit into 64 GB of memory, the learning approach deals well with the classical version of the game: a Mealy machine with 13 states is found, although the sparsest positional strategy has reachable positions of , for . The benchmark represents an elevator controller synthesis problem. For comparison, the numbers of states of the deterministic finite automata produced from the benchmarks by the learning-based approach are also given in Table 1 and Table 2.
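The two-state alternating arbiter mentioned above can be written down explicitly. This is an illustrative sketch only; the state and signal names are hypothetical, not the benchmark's actual identifiers:

```python
# Hypothetical two-state Mealy machine that alternates grants between
# two requesters: the output depends only on the current state, and each
# step toggles the state, so neither requester ever waits long and the
# grants are mutually exclusive.

def arbiter_step(state, inputs):
    """In state 0 grant requester 0, in state 1 grant requester 1,
    then toggle the state.  `inputs` (the request bits) are ignored by
    this particular minimal implementation."""
    grants = (1, 0) if state == 0 else (0, 1)
    return 1 - state, grants
```

Running the machine for a few steps produces the grant sequence (1,0), (0,1), (1,0), ...: a two-state implementation, which is exactly the small certificate that a sparse positional strategy in the synthesis game corresponds to.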
Benchmark  Search space  Random  ILP  OPTSAT  Random  RepLP  Learning  Learning  
(bits)  (dumb)  (smart)  (aut.)  (Mealy)  
13  3  4  4  3  13  13  13  13  13  10  3  
33  8  4  4  17.9248  27.4  13  13  14.12  13  20  7  
61  15  4  4  47.2842  43.56  13  13  16.52  13  34  13  
97  24  4  4  89.3233  66.6  13  13  17  13  52  21  
141  35  4  4  144.042  86.6  13  13  17  13  74  31  
7  3  2  2  0  7  7  7  7  7  5  3  
53  26  8  2  15.1699  51.08  41  41  41  43.2  30  11  
151  75  8  2  107.699  117.8  37  37  41.16  43.6  50  19  
309  154  8  2  291.248  211.2  35  35  48.36  59.2  74  29  
113  14  2  8  32  113  113  113  113  113  26  10  
177  22  2  8  56  169.6  113  113  113  113  29  12  
209  26  2  8  56  206.4  177  177  177  177  36  15  
145  18  2  8  60  135.4  113  113  113  113.4  27  11  
193  24  2  8  104  173.8  113  113  113  114  29  12  
321  40  2  8  168  275.6  177  177  177  177.2  37  16  
81  10  2  8  34  80.68  65  65  65  66  17  7  
105  13  2  8  56  98.6  65  65  65  65.4  19  8  
137  17  2  8  84  122.9  65  65  65  66  21  9  
9  4  2  2  6  5.32  3  3  3  3  6  3  
13  6  2  2  10  6.2  3  3  3  3  8  4  
17  8  2  2  14  6.12  3  3  3  3  10  5  
21  10  2  2  18  6.36  3  3  3  3  12  6  
25  12  2  2  22  6.52  3  3  3  3  14  7  
13  6  2  2  6  11.32  9  9  9  9.6  9  4  
17  8  2  2  9  13.72  9  9  9  9.2  18  8  
21  10  2  2  12  12.68  9  9  9  9.4  23  10  
25  12  2  2  15  12.76  9  9  9  10  28  12  
29  14  2  2  18  14.6  9  9  9  10  33  14  
53  13  4  4  56  38.12  13  13  13  13  10  5  
89  22  4  4  80  51.08  13  13  13  13  16  8  
141  35  4  4  120  65.96  13  13  13  13  24  12  
209  52  4  4  176  85.64  13  13  13  13  34  17  
293  73  4  4  248  134  13  13  13  13  46  23  
33  8  4  4  50.3399  26.6  9  9  9  9  2  1  
37  9  4  4  58.3399  27.72  9  9  9  9  2  1  
41  10  4  4  66.3399  30.44  9  9  9  9  2  1  
45  11  4  4  74.3399  32.84  9  9  9  9  2  1  
49  12  4  4  82.3399  29.32  9  9  9  9  2  1  
5  2  2  2  0  5  5  5  5  5  4  2  
9  4  2  2  2  6.6  5  5  5  5  8  4  
13  6  2  2  4  8.68  5  5  5  5  12  6  
17  8  2  2  6  6.92  5  5  5  5  16  8  
21  10  2  2  8  8.92  5  5  5  5  20  10  
49  12  4  4  47.794  43.08  13  13  13  13  15  8  
129  32  4  4  145.137  98.44  13  13  13  13  29  15  
241  60  4  4  297.293  154.1  13  13  13  13  47  24  
385  96  4  4  500.168  243.6  13  13  13  13  69  35  
561  140  4  4  753.762  350  13  13  13  13  95  48  
37  9  4  4  15  35.56  29  29  29  29  13  4  
65  16  4  4  42.6045  57  29  29  31.4  29.8  25  9  
101  25  4  4  82.3038  86.28  29  29  32.68  29  41  16  
145  36  4  4  134.683  111.1  29  29  34.6  29.2  61  25  
197  49  4  4  199.741  138.6  29  29  35.56  30.6  85  36  
409  51  8  8  270.176  385.6  217  217  228.2  217.8  59  17  
873  109  8  8  764.786  783.7  209  209  255.1  219  138  47  
1577  197  8  8  1623.06  1367  209  t/o  290.6  221  252  93  
2569  321  8  8  2941.41  2109  t/o  t/o  322.6  217.8  420  164  
209  52  8  4  188.229  168.7  25  25  29.48  25  51  26 
Benchmark  Search space  Random  ILP  OPTSAT  Random  RepLP  Learning  Learning  
(bits)  (dumb)  (smart)  (aut.)  (Mealy)  
733  183  8  4  863.098  526.9  25  25  40.04  26.6  139  70  
1641  410  8  4  2187.1  1084  25  t/o  59.88  26.6  273  137  
3029  757  8  4  4343.1  1851  25  t/o  51.24  26.2  491  246  
6273  784  16  8  7501.28  5477  t/o  t/o  338.3  253.8  439  218  
65  16  4  4  53.5489  53.8  17  17  21.32  17  20  7  
125  31  4  4  126.702  93.64  17  17  25.8  17.2  38  13  
201  50  4  4  226.385  135.9  17  17  27.24  17  61  21  
293  73  4  4  351.427  184  17  17  35.4  17  90  31  
401  100  4  4  501.829  218.6  17  17  32.2  17  125  43  
509  127  8  4  741.196  266.4  17  17  29.48  21  4  3  
1585  396  8  4  2755.88  404.4  17  17  31.72  21.4  6  4  
3081  770  8  4  5696.31  701  17  17  37.8  21.2  8  5  
5217  1304  8  4  9754.35  1120  17  t/o  41.32  21.2  10  6  
97  6  16  16  0  97  97  97  97  97  31  6  
417  26  16  16  20  417  97  97  97  97  111  13  
2017  126  16  16  155.098  2017  97  97  97  97  333  74  
353  44  2  8  40  314.3  201  201  201  201  28  13  
505  63  2  8  139  454.4  201  201  227.9  201  37  18  
633  79  2  8  211  582.1  201  201  229.5  201.4  47  23  
761  95  2  8  268  703.1  201  201  228.5  202.8  57  28  
889  111  2  8  325  816.4  201  201  216.4  204  67  33  
27  13  2  2  11  20.44  15  15  15.56  15  10  5 
Benchmark  Search space  Random  ILP  OPTSAT  Random  RepLP  
(bits)  (dumb)  (smart)  
33  8  4  4  66  28.04  9  9  9  9  
121  30  4  4  414.762  60.04  9  9  9  9  
289  72  4  4  1298.79  130.8  9  9  9  9  
561  140  4  4  2998.44  214.6  9  9  9  9  
961  240  4  4  5816.8  348.4  9  9  9  9  
13  6  2  2  7  7.88  7  7  7  7.2  
19  9  2  2  17.0947  7.96  7  7  7  8  
25  12  2  2  29.5098  7.88  7  7  7  7.2  
31  15  2  2  43.7633  9.64  7  7  7  7.4  
37  18  2  2  59.5361  9.24  7  7  7  8  
217  108  8  2  865.396  53.4  7  7  7  7  
785  392  8  2  4246.45  83.16  t/o  7  7  7  
1921  960  8  2  12504.8  114.3  t/o  7  7  7  
37  18  2  2  136.287  16.44  3  3  3  3  
97  48  2  2  472.304  31.24  3  3  3  3  
201  100  2  2  1164.66  56.92  3  3  3  3  
361  180  2  2  2365.96  83.64  3  3  3  3  
589  294  2  2  4239.85  118.5  3  3  3  3  
649  324  2  2  2517.11  39.32  5  5  5.24  6  
4609  2304  2  2  24739  71.32  t/o  5  5.56  6.2  
55  27  2  2  259.934  42.2  3  3  3  3  
129  64  2  2  773  105.8  3  3  3  3  
251  125  2  2  1747.67  196.9  3  3  3  3  
433  216  2  2  3357.28  347  3  3  3  3  
687  343  2  2  5785.47  547.3  3  3  3  3  
169  42  4  4  542.836  104.2  13  13  13.96  14.6  
769  192  4  4  3614.23  261  13  13  14.28  14.6  
2241  560  4  4  12946.8  781.5  9  9  9.8  9  
2521  1260  2  2  14742  186.8  t/o  5  6.84  5 
5.2 Computation Times
Table 4 presents computation times for the classical Bounded Synthesis benchmarks, whereas Table 5 describes the results for the modified version. For brevity, benchmarks for which all tools needed less than 50 ms of computation time have been left out.
The tables show no big surprises. The exact approaches time out for the largest benchmarks. For the benchmarks stemming from the modified Bounded Synthesis version, OPTSAT performs better than the ILP-based approach, whereas for the non-modified version, the ILP solver seems to be faster. The main difference between the two classes is that the number of successors of player 0's positions is much higher in the modified synthesis games. OPTSAT seems to be able to deal with this situation better. The learning approach is typically slower than the heuristics for positional strategies, but unlike the exact approaches, it did not time out for any of the benchmarks.
Benchmark  Random  ILP  OPTSAT  Random  RepLP  Learning 

(dumb)  (smart)  
0.00671  0.00948  0.06751  0.00804  0.0204  0.038  
0.00865  0.0109  0.4895  0.00736  0.0107  0.07821  
0.00795  2.58  3.241  0.00847  0.059  0.04376  
0.00925  129  89.7  0.0114  0.241  0.114  
0.00659  0.0132  0.0641  0.0117  0.0262  0.02546  
0.00874  0.0427  0.04663  0.0335  0.0584  0.06382  
0.00629  0.00974  0.07321  0.00747  0.00999  0.01083  
0.00831  0.0165  0.1267  0.00979  0.0151  0.07951  
0.00675  0.0133  0.05853  0.00664  0.0123  0.03176  
0.00845  0.0174  0.04967  0.0104  0.0156  0.1244  
0.00896  0.0237  0.3457  0.0112  0.0189  0.4291  
0.0107  0.0337  4.473  0.0134  0.0261  1.271  
0.00677  0.0177  0.1765  0.00989  0.0167  0.0701  
0.00834  0.0266  0.1027  0.00938  0.0187  0.1399  
0.00903  0.408  0.2805  0.035  0.555  0.1248  
0.0144  84.9  86.47  0.0589  1.46  0.846  
0.0219  548  t/o  0.111  3.24  5.05  
0.0325  t/o  t/o  0.214  8.11  19.91  
0.00762  0.0345  0.1598  0.00947  0.0205  0.1075  
0.0123  0.509  16.84  0.017  0.0906  1.963  
0.0245  2.02  t/o  0.0455  0.253  20.97  
0.0353  4.45  t/o  0.0691  0.571  163  
0.0885  t/o  t/o  0.917  92.8  247.5  
0.00761  0.0142  0.2191  0.00896  0.0156  0.06583  
0.00764  0.0173  1.272  0.0106  0.0193  0.1379  
0.00929  0.0219  1.51  0.0127  0.022  0.2704  
0.0127  0.0323  0.9498  0.0131  0.0377  0.01549  
0.0206  0.222  15.81  0.0288  0.197  0.06499  
0.0355  0.929  479.8  0.0715  0.729  0.04946  
0.0634  3.52  t/o  0.168  2.14  0.08163  
0.00909  0.0216  0.09432  0.0148  0.0216  0.584  
0.0283  0.136  2.672  0.0714  0.156  27.07  
0.00896  0.0302  0.104  0.0244  0.116  0.0816  
0.0107  0.0504  0.2223  0.0371  0.138  0.1752  
0.0127  0.0683  1.801  0.0458  0.341  0.2638  
0.0138  0.0817  13.89  0.0519  0.522  0.4677  
0.0153  0.154  4.138  0.0547  0.608  0.7734 
Benchmark  Random  ILP  OPTSAT  Random  RepLP 

(dumb)  (smart)  
0.0119  0.201  0.3138  0.0177  0.0528  
0.0272  1.83  1.118  0.107  0.224  
0.0728  20.1  1.535  0.66  2.06  
0.0109  0.0628  0.03041  0.018  0.0426  
0.0525  t/o  1.155  0.329  1.09  
0.197  t/o  5.147  4.6  14.4  
0.0143  0.0491  0.06251  0.0241  0.0456  
0.0333  0.365  0.1345  0.161  0.193  
0.055  1.6  0.5534  0.867  1.51  
0.0218  12  0.7623  0.0675  0.466  
0.477  t/o  130.8  15.2  248  
0.0246  0.0773  0.05977  0.136  0.0856  
0.0666  0.745  0.1788  0.973  0.814  
0.183  2.69  0.9863  2.94  3.32  