Abstract
Answer Set Programming (ASP) is a prominent knowledge representation language with roots in logic programming and nonmonotonic reasoning. Biennial ASP competitions are organized in order to furnish challenging benchmark collections and assess the advancement of the state of the art in ASP solving. In this paper, we report on the design and results of the Seventh ASP Competition, jointly organized by the University of Calabria (Italy), the University of Genova (Italy), and the University of Potsdam (Germany), in affiliation with the 14th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR 2017). (Under consideration for acceptance in TPLP.)
The Seventh Answer Set Programming Competition: Design and Results
MARTIN GEBSER
Institute for Computer Science, University of Potsdam, Germany
and MARCO MARATEA
DIBRIS, University of Genova, Italy
and FRANCESCO RICCA
Dipartimento di Matematica e Informatica, Università della Calabria, Italy
Keywords: Answer Set Programming; Competition
1 Introduction
Answer Set Programming (ASP) is a prominent knowledge representation language with roots in logic programming and nonmonotonic reasoning [Baral (2003), Brewka et al. (2011), Eiter et al. (2009), Gelfond and Leone (2002), Lifschitz (2002), Marek and Truszczyński (1999), Niemelä (1999)]. The goal of the ASP Competition series is to promote advancements in ASP methods, collect challenging benchmarks, and assess the state of the art in ASP solving (see, e.g., [Alviano et al. (2015), Alviano et al. (2017), Bruynooghe et al. (2015), Gebser et al. (2015), Lefèvre et al. (2017), Maratea et al. (2015), Marple and Gupta (2014), Calimeri et al. (2017)] for recent ASP systems, and [Gebser et al. (2018)] for a recent survey). Following the nowadays customary practice of publishing results of AI-based competitions in archival journals, where they are expected to remain available and can be used as references, the results of ASP competitions have been hosted in prominent journals of the area (see [Calimeri et al. (2014), Calimeri et al. (2016), Gebser et al. (2017b)]). Continuing this tradition, this paper reports on the design and results of the Seventh ASP Competition (http://aspcomp2017.dibris.unige.it), which was jointly organized by the University of Calabria (Italy), the University of Genova (Italy), and the University of Potsdam (Germany), in affiliation with the 14th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR 2017; http://lpnmr2017.aalto.fi).
The Seventh ASP Competition is conceived along the lines of the System track of previous competition editions [Calimeri et al. (2016), Lierler et al. (2016), Gebser et al. (2016), Gebser et al. (2017b)], with the following characteristics: benchmarks adhere to the ASP-Core-2 standard modeling language (http://www.mat.unical.it/aspcomp2013/ASPStandardization/), subtracks are based on language features utilized in problem encodings (e.g., aggregates, choice or disjunctive rules, queries, and weak constraints), and problem instances are classified and selected according to their expected hardness. Both single- and multi-processor categories are available in the competition, where solvers in the first category run on a single CPU (core), while they can take advantage of multiple processors (cores) in the second category. In addition to the basic competition design, which has also been addressed in a preliminary version of this report [Gebser et al. (2017a)], we detail the revised benchmark selection process as well as the results of the event, which were orally presented during LPNMR 2017 in Hanasaari, Espoo, Finland.
The rest of this paper is organized as follows. Section 2 introduces the format of the Seventh ASP Competition. In Section 3, we describe new problem domains contributed to this competition edition as well as the revised benchmark selection process for picking instances to run in the competition. The participant systems of the competition are then surveyed in Section 4. In Section 5, we then present the results of the Seventh ASP Competition along with the winning systems of competition categories. Section 6 concludes the paper with final remarks.
2 Competition Format
This section gives an overview of competition categories, subtracks, and scoring scheme(s), which are similar to the previous ASP Competition edition. One addition though concerns the system ranking of Optimization problems, where a ranking by the number of instances solved “optimally” complements the relative scoring scheme based on solution quality used previously.
Categories.
The competition includes two categories, depending on the computational resources provided to participant systems: SP, where one processor (core) is available, and MP, where multiple processors (cores) can be utilized. While the SP category aims at sequential solving systems, MP allows for exploiting parallelism.
Subtracks.
Both categories are structured into the following four subtracks, based on the ASP-Core-2 language features utilized in problem encodings:

Subtrack #1 (Basic Decision): Encodings consisting of non-disjunctive and non-choice rules (also called normal rules) with classical and built-in atoms only.

Subtrack #2 (Advanced Decision): Encodings exploiting the language fragment allowing for aggregates, choice as well as disjunctive rules, and queries, yet excepting weak constraints and non-head-cycle-free (non-HCF) disjunction.

Subtrack #3 (Optimization): Encodings extending the aforementioned language fragment by weak constraints, while still excepting non-HCF disjunction.

Subtrack #4 (Unrestricted): Encodings exploiting the full language and, in particular, non-HCF disjunction.
A problem domain, i.e., an encoding together with a collection of instances, belongs to the first subtrack its problem encoding is compatible with.
Example 1
To illustrate the subtracks and respective language features, consider the directed graph displayed in Figure LABEL:sub@fig:tsp:graph and the corresponding fact representation given in Figure LABEL:sub@fig:tsp:data. Facts over the predicate node/1 specify the nodes of the graph, those over edge/2 provide the edges, and cost/3 associates each edge with its cost. The idea in the following is to encode the well-known Traveling Salesperson problem, which is about finding a Hamiltonian cycle, i.e., a round trip visiting each node exactly once, such that the sum of edge costs is minimal. Note that the example graph in Figure LABEL:sub@fig:tsp:graph includes precisely two outgoing edges per node, and for simplicity the encodings in Figures LABEL:sub@fig:tsp:basic–LABEL:sub@fig:tsp:disjunctive build on this property, while accommodating an arbitrary number of outgoing edges would also be possible with appropriate modifications.
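Since the figures are not reproduced in this text-only version, the following facts sketch a possible instance representation; the concrete edges and costs are illustrative assumptions consistent with the discussion (four nodes, each with precisely two outgoing edges):

```
% assumed example graph (not the original figure): nodes, edges, and edge costs
node(1). node(2). node(3). node(4).
edge(1,2). edge(1,4). edge(2,1). edge(2,3).
edge(3,2). edge(3,4). edge(4,1). edge(4,3).
cost(1,2,2). cost(1,4,3). cost(2,1,2). cost(2,3,1).
cost(3,2,1). cost(3,4,2). cost(4,1,3). cost(4,3,2).
```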
The first encoding in Figure LABEL:sub@fig:tsp:basic complies with the language fragment of Subtrack #1, as it does not make use of aggregates, choice or disjunctive rules, queries, and weak constraints. Note that terms starting with an uppercase letter, such as X, Y, and Z, stand for universally quantified first-order variables, Y != Z is a built-in atom, and not denotes the (default) negation connective. Given this, the rule in line 1 expresses that exactly one of the two outgoing edges per node must belong to a Hamiltonian cycle, represented by atoms over the predicate cycle/2 within a stable model [Lifschitz (2008)]. Starting from the distinguished node 1, the least fixpoint of the rules in lines 2 and 3 provides the nodes reachable from 1 via the edges of a putative Hamiltonian cycle. The so-called integrity constraint, i.e., a rule with an empty head that is interpreted as false, in line 4 then asserts that all nodes must be reachable from the starting node 1, which guarantees that stable models coincide with Hamiltonian cycles. While edge costs are not considered so far, the encoding in Figure LABEL:sub@fig:tsp:basic can be used to decide whether a Hamiltonian cycle exists for a given graph (with precisely two outgoing edges per node).
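As the figure itself is not reproduced here, a minimal sketch consistent with the description above (normal rules with a built-in atom and default negation only) is:

```
cycle(X,Y) :- edge(X,Y), edge(X,Z), Y != Z, not cycle(X,Z).  % line 1
reach(Y) :- cycle(1,Y).                                      % line 2
reach(Y) :- reach(X), cycle(X,Y).                            % line 3
:- node(X), not reach(X).                                    % line 4
```

This reconstruction relies on the assumption of precisely two outgoing edges per node: for each node X, the rule in line 1 picks one of its two outgoing edges exactly when the other is not picked.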
The second encoding in Figure LABEL:sub@fig:tsp:advanced includes a choice rule in line 1, thus making use of language features permitted in Subtrack #2, but incompatible with Subtrack #1. The instance of this choice rule obtained for the node 1, {cycle(1,2); cycle(1,4)} = 1., again expresses that exactly one outgoing edge of node 1 must be included in a Hamiltonian cycle, and respective rule instances apply to the other nodes of the example graph in Figure LABEL:sub@fig:tsp:graph. Notably, the choice rule adapts to an arbitrary number of outgoing edges, and the assumption that there are precisely two per node could be dropped when using the encoding in Figure LABEL:sub@fig:tsp:advanced.
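A corresponding sketch of this encoding, with the choice rule in line 1 replacing the normal rule and the remaining lines unchanged (again an assumed reconstruction, not the original figure):

```
{ cycle(X,Y) : edge(X,Y) } = 1 :- node(X).  % line 1: choice rule
reach(Y) :- cycle(1,Y).
reach(Y) :- reach(X), cycle(X,Y).
:- node(X), not reach(X).
```

Instantiating line 1 for node 1 of the example graph indeed yields the ground rule {cycle(1,2); cycle(1,4)} = 1. mentioned in the text.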
The rules in lines 1 and 2 of the third encoding in Figure LABEL:sub@fig:tsp:disjunctive are disjunctive, and rule instances as follows are obtained together with line 3:
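The displayed rule instances are not reproduced in this text-only version; assuming a reconstruction of the disjunctive encoding in which reachability drives the disjunctive choice of outgoing edges, the instances relevant to node 3 would read as follows (lines 1 and 2 stand for the disjunctive rules, line 3 for the reachability rule):

```
% assumed encoding (lines 1-3)
cycle(1,Y) | cycle(1,Z) :- edge(1,Y), edge(1,Z), Y < Z.           % line 1
cycle(X,Y) | cycle(X,Z) :- reach(X), edge(X,Y), edge(X,Z), Y < Z. % line 2
reach(Y) :- cycle(X,Y).                                           % line 3

% ground instances relevant to node 3 of the example graph
cycle(3,2) | cycle(3,4) :- reach(3).
reach(3) :- cycle(2,3).
reach(3) :- cycle(4,3).
reach(2) :- cycle(3,2).
reach(4) :- cycle(3,4).
```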
Observe that reach(3) occurs in the body of a disjunctive rule with cycle(3,2) and cycle(3,4) in the head. These atoms further imply reach(2) or reach(4), respectively, which lead to two disjunctive rules, one containing cycle(2,3) in the head and the other cycle(4,3). As the latter two atoms also occur in the body of rules with reach(3) in the head, we have that all of the mentioned atoms recursively depend on each other. Since cycle(3,2) and cycle(3,4) jointly constitute the head of a disjunctive rule, this means that rule instances obtained from the encoding in Figure LABEL:sub@fig:tsp:disjunctive are non-HCF [Ben-Eliyahu and Dechter (1994)] and thus fall into a syntactic class of logic programs able to express problems at the second level of the polynomial hierarchy [Eiter and Gottlob (1995)]. Hence, the encoding in Figure LABEL:sub@fig:tsp:disjunctive makes use of a language feature permitted in Subtrack #4 only.
Given that either of the encodings in Figures LABEL:sub@fig:tsp:basic–LABEL:sub@fig:tsp:disjunctive yields stable models corresponding to Hamiltonian cycles, the weak constraint in Figure LABEL:sub@fig:tsp:optimize can be added to each of them to express the objective of finding a Hamiltonian cycle whose sum of edge costs is minimal. In case of the encodings in Figures LABEL:sub@fig:tsp:basic and LABEL:sub@fig:tsp:advanced, the addition of the weak constraint leads to a reclassification into Subtrack #3, since the focus is shifted from a Decision to an Optimization problem. For the encoding in Figure LABEL:sub@fig:tsp:disjunctive, Subtrack #4 still matches when adding the weak constraint, as non-HCF disjunction is excluded in the other subtracks.
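The weak constraint itself is likewise not reproduced here; in ASP-Core-2 syntax, it may be written along the following lines (an assumed reconstruction):

```
:~ cycle(X,Y), cost(X,Y,C). [C@1,X,Y]
```

Each chosen edge then contributes its cost C as a penalty at priority level 1, so that stable models minimizing the sum of penalties correspond to cost-minimal Hamiltonian cycles.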
Scoring Scheme.
The applied scoring schemes are based on the following considerations:

All domains are weighted equally.

If a system outputs an incorrect answer to some instance in a domain, this invalidates its score for the domain, even if other instances are solved correctly.
In general, up to 100 points can be earned in each problem domain. The total score of a system is the sum of points over all domains.
For Decision problems and Query answering tasks, the score of a system S in a domain P featuring N instances is calculated as

S_score(P) = N_S · 100 / N

where N_S is the number of instances successfully solved within the time and memory limits of 20 minutes wall-clock time and 12GB RAM per run.
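As a worked example (with assumed numbers, under the proportional calculation by fraction of solved instances scaled to 100), a system solving 15 out of N = 20 instances in a domain earns

```latex
S_{\mathit{score}}(P) \;=\; \frac{N_S \cdot 100}{N} \;=\; \frac{15 \cdot 100}{20} \;=\; 75
```

points in that domain.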
For Optimization problems, we employ two alternative scoring schemes. The first one, which has also been used in the previous competition edition, performs a relative ranking of systems by solution quality, following the approach of the Mancoosi International Solver Competition (http://www.mancoosi.org/misc/). Given M participant systems, the score of a system S for an instance I in a domain P featuring N instances is calculated as

S_score(I) = M_S(I) · 100 / (M · N)

where M_S(I) is

0, if S did neither produce a solution nor report unsatisfiability; or otherwise

the number of participant systems that did not produce any strictly better solution than S, where a confirmed optimum solution is considered strictly better than an unconfirmed one.
The score of system S in domain P is then taken as the sum of scores S_score(I) over the N instances I in P.
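As a worked example of this Mancoosi-style calculation (with assumed numbers), suppose M = 10 systems participate and a domain P features N = 20 instances; if, for an instance I, system S produces a solution and 7 participant systems fail to produce a strictly better one, then S earns

```latex
S_{\mathit{score}}(I) \;=\; \frac{M_S(I) \cdot 100}{M \cdot N} \;=\; \frac{7 \cdot 100}{10 \cdot 20} \;=\; 3.5
```

points for that instance.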
The second scoring scheme considers the number of instances solved "optimally", i.e., a confirmed optimum solution or unsatisfiability is reported. Hence, the score of a system in a domain is defined as above, with N_S being the number of instances solved optimally. This second scoring scheme (inspired by the MaxSAT Competition) gives more importance to solvers that can actually solve instances to the optimum, but it does not consider "non-optimal" solutions. The two measures provide alternative perspectives on the performance of participants solving optimization problems.
Note that, as with Decision problems and Query answering tasks, both scores range from 0 to 100 in each domain. The first scheme focuses on the best solutions found by participant systems, while the second focuses on completed runs.
In each category and respective subtracks, the participant systems are ranked by their sums of scores over all domains, in decreasing order. In case of a draw in terms of the sum of scores, the sums of runtimes over all instances are taken into account as a tie-breaking criterion.
3 Benchmark Suite and Selection
The benchmark suite of the Seventh ASP Competition includes 36 domains, where 28 stem from the previous competition edition [Gebser et al. (2017b)], while 8 domains, as well as additional instances for the Graph Colouring problem, were newly submitted. We first describe the eight new domains and then detail the instance selection process based on empirical hardness.
3.1 New Domains
The eight new domains of this ASP Competition edition can be roughly characterized as closely related to machine learning (Bayesian Network Learning, Markov Network Learning, and Supertree Construction), personnel scheduling (Crew Allocation and Resource Allocation), or combinatorial problem solving (Paracoherent Answer Sets, Random Disjunctive ASP, and Traveling Salesperson). While Traveling Salesperson constitutes a classical optimization problem in computer science, the five domains stemming from machine learning and personnel scheduling are application-oriented, and the contribution of such practically relevant benchmarks to the ASP Competition is particularly encouraged [Gebser et al. (2017b)]. Moreover, the Paracoherent Answer Sets and Random Disjunctive ASP domains contribute to Subtrack #4, which was sparsely populated in recent ASP Competition editions, and beyond theoretical interest these benchmarks are relevant to logic program debugging and industrial solver development. The following paragraphs provide more detailed background information for each of the new domains.
Bayesian Network Learning.
Bayesian networks are directed acyclic graphs representing (in)dependence relations between variables in multivariate data analysis. Learning the structure of Bayesian networks, i.e., selecting arcs such that the resulting graph fits given data best, is a combinatorial optimization problem amenable to constraint-based solving methods like the one proposed in [Cussens (2011)]. In fact, data sets from the literature serve as instances in this domain, while a problem encoding in ASP-Core-2 expresses optimal Bayesian networks, given by directed acyclic graphs whose associated cost is minimal.
Crew Allocation.
This scheduling problem, which has also been addressed by related constraint-based solving methods [Guerinik and Caneghem (1995)], deals with allocating crew members to flights such that the amount of personnel with certain capabilities (e.g., role on board and spoken language) as well as off-times between flights are sufficient. Moreover, instances with different numbers of flights and available personnel include cases where the personnel that may be allocated to flights is restricted in such a way that no feasible schedule exists.
Markov Network Learning.
As with Bayesian networks, the learning problem for Markov networks [Janhunen et al. (2017)] aims at the optimization of graphs representing the dependence structure between variables in statistical inference. In this domain, the graphs of interest are undirected and required to be chordal, while associated scores express marginal likelihood with respect to given data. Problem instances of varying hardness are obtained by taking samples of different size and density from literature data sets.
Resource Allocation.
This scheduling problem deals with allocating the activities of business processes to human resources such that role requirements and temporal relations between activities are met [Havur et al. (2016)]. Moreover, the total makespan of schedules is subject to an upper bound as well as optimization. The hardness of instances in this domain varies with respect to the number of activities, temporal relations, available resources, and upper bounds.
Supertree Construction.
The goal of the supertree construction problem [Koponen et al. (2015)] is to combine the leaves of several given phylogenetic subtrees into a single tree fitting the given subtrees as closely as possible. That is, optimization aims at preserving the structure of subtrees, where the introduction of intermediate nodes between direct neighbors is tolerated, while the avoidance of such intermediate nodes is an optimization target as well. Instances of varying hardness are obtained by mutating projections of binary trees with different numbers of leaves.
Traveling Salesperson.
The well-known traveling salesperson problem [Applegate et al. (2007)] is to find a round trip through a (directed) graph that is optimal in terms of the accumulated edge costs. Instances in this domain are of two kinds: they either stem from the TSPLIB repository (http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsplib.html) or are randomly generated to increase the variety in the ASP Competition.
Paracoherent Answer Sets.
Given an incoherent logic program P, i.e., a program without answer sets, a paracoherent (or semi-stable) answer set corresponds to a gap-minimal answer set of the epistemic transformation of P [Inoue and Sakama (1996), Amendola et al. (2016)]. The instances in this domain, used in [Amendola et al. (2017), Amendola et al. (2018)] to evaluate genuine implementations of paracoherent ASP, are obtained by grounding and transforming incoherent programs from previous editions of the ASP Competition. In particular, weak constraints single out answer sets of a transformed program containing a minimal number of atoms that are actually underivable from the original program.
Random Disjunctive ASP.
The disjunctive logic programs in this domain [Amendola et al. (2017)] express random 2QBF formulas, given as conjunctions of terms in disjunctive normal form, by an extension of the Eiter-Gottlob encoding [Eiter and Gottlob (1995)]. Parameters controlling the random generation of 2QBF formulas (e.g., number of variables and number of conjunctions) are set so that instances lie close to the phase transition region, while having an expected average solving time below the competition timeout of 20 minutes per run.
3.2 Benchmark Selection
Each entry below gives the number of selected instances, with the number of available instances in parentheses; per hardness class, the first entry refers to satisfiable instances and the second to unsatisfiable ones (or, for "too hard", to instances of unknown status).

Domain                           P   Easy             Medium            Hard              Too hard

Subtrack #1
Graph Colouring                  D   1 (1)   3 (5)    2 (2)   4 (21)    2 (3)    5 (16)   0 (0)    3 (3)
Knight Tour with Holes           D   2 (5)   3 (4)    4 (4)   0 (0)     4 (9)    0 (0)    0 (0)    7 (302)
Labyrinth                        D   4 (45)  0 (0)    5 (72)  0 (0)     7 (83)   0 (0)    0 (0)    4 (8)
Stable Marriage                  D   0 (0)   0 (0)    3 (3)   0 (0)     6 (15)   1 (1)    0 (0)    10 (55)
Visitall                         D   8 (14)  0 (0)    5 (5)   0 (0)     7 (40)   0 (0)    0 (0)    0 (0)

Subtrack #2
Combined Configuration           D   1 (1)   0 (0)    1 (1)   0 (0)     12 (44)  0 (0)    0 (0)    6 (34)
Consistent Query Answering       Q   0 (0)   0 (0)    0 (0)   0 (0)     0 (0)    0 (0)    0 (0)    20 (120)
Crew Allocation                  D   0 (0)   4 (10)   0 (0)   6 (11)    0 (0)    6 (10)   0 (0)    4 (6)
Graceful Graphs                  D   3 (3)   0 (0)    4 (4)   1 (1)     4 (28)   2 (2)    0 (0)    6 (21)
Incremental Scheduling           D   2 (11)  2 (6)    3 (47)  2 (11)    3 (37)   2 (10)   0 (0)    6 (76)
Nomystery                        D   4 (4)   0 (0)    4 (5)   0 (0)     4 (10)   0 (0)    0 (0)    8 (32)
Partner Units                    D   3 (9)   1 (1)    4 (34)  0 (0)     3 (15)   1 (1)    0 (0)    8 (32)
Permutation Pattern Matching     D   2 (16)  2 (32)   2 (14)  2 (58)    0 (0)    5 (14)   0 (0)    7 (20)
Qualitative Spatial Reasoning    D   5 (35)  4 (35)   4 (34)  2 (19)    3 (7)    2 (2)    0 (0)    0 (0)
Reachability                     Q   0 (0)   0 (0)    10 (30) 10 (30)   0 (0)    0 (0)    0 (0)    0 (0)
Ricochet Robots                  D   2 (2)   0 (0)    7 (18)  0 (0)     4 (181)  0 (0)    0 (0)    7 (38)
Sokoban                          D   2 (77)  2 (10)   2 (84)  2 (8)     5 (114)  2 (12)   0 (0)    5 (620)

Subtrack #3
Bayesian Network Learning        O   4 (4)   0 (0)    4 (8)   0 (0)     8 (19)   0 (0)    4 (20)   0 (6)
Connected Still Life             O   0 (0)   0 (0)    5 (5)   0 (0)     10 (70)  0 (0)    5 (45)   0 (0)
Crossing Minimization            O   1 (1)   0 (0)    1 (1)   0 (0)     17 (80)  0 (0)    1 (1)    0 (0)
Markov Network Learning          O   0 (0)   0 (0)    0 (0)   0 (0)     0 (0)    0 (0)    10 (10)  10 (50)
Maximal Clique                   O   0 (0)   0 (0)    0 (0)   0 (0)     10 (41)  0 (0)    10 (94)  0 (1)
MaxSAT                           O   0 (0)   0 (0)    4 (4)   0 (0)     0 (0)    0 (0)    0 (0)    16 (50)
Resource Allocation              O   – (3)   – (0)    – (3)   – (0)     – (0)    – (0)    – (0)    – (0)
Steiner Tree                     O   0 (0)   0 (0)    0 (0)   0 (0)     1 (1)    0 (0)    16 (45)  3 (3)
Supertree Construction           O   0 (0)   0 (0)    0 (0)   0 (0)     6 (30)   0 (0)    14 (30)  0 (0)
System Synthesis                 O   0 (0)   0 (0)    0 (0)   0 (0)     8 (16)   0 (0)    8 (80)   4 (4)
Traveling Salesperson            O   0 (0)   0 (0)    2 (2)   0 (0)     3 (3)    0 (0)    12 (60)  3 (3)
Valves Location                  O   6 (10)  0 (0)    2 (2)   0 (0)     7 (29)   0 (0)    5 (244)  0 (23)
Video Streaming                  O   11 (16) 0 (0)    0 (0)   0 (0)     0 (0)    0 (0)    8 (22)   1 (1)

Subtrack #4
Abstract Dialectical Frameworks  O   4 (18)  0 (0)    8 (20)  0 (0)     6 (122)  0 (0)    2 (2)    0 (0)
Complex Optimization             D   0 (0)   0 (0)    0 (0)   0 (0)     20 (78)  0 (0)    0 (0)    0 (0)
Minimal Diagnosis                D   7 (158) 2 (55)   3 (9)   2 (8)     4 (4)    1 (1)    0 (0)    0 (0)
Paracoherent Answer Sets         O   0 (0)   0 (0)    1 (1)   0 (0)     12 (112) 0 (0)    0 (0)    7 (43)
Random Disjunctive ASP           D   0 (0)   0 (0)    0 (0)   0 (0)     5 (48)   13 (73)  0 (0)    2 (2)
Strategic Companies              Q   0 (0)   0 (0)    0 (0)   0 (0)     0 (0)    0 (0)    0 (0)    20 (37)
Table 1 gives an overview of all problem domains of the Seventh ASP Competition, grouped by their respective subtracks; the eight new domains are those described in Section 3.1. The second column provides the computational task addressed in a domain, distinguishing Decision ("D") and Optimization ("O") problems as well as Query answering ("Q"). Further columns categorize the instances in each domain by their empirical hardness, where hardness classes are based on the performance of the same reference systems, i.e., clasp, lp2normal2+clasp, and wasp-1.5, as in the previous ASP Competition edition [Gebser et al. (2017b)]. (This choice of reference systems allows us to reuse the runtime results for previous domains gathered in exhaustive experiments on all available instances, which took about 212 CPU days on the competition platform.) Instances that do not belong to any of the listed hardness classes are in the majority of cases "very easy", and the remaining ones are "non-groundable"; we exclude such instances, which are uninformative regarding the system ranking, from the benchmark suite. The hardness classes are defined as follows:

Easy: Instances completed by at least one reference system in more than 20 seconds and by all reference systems in less than 2 minutes solving time.

Medium: Instances completed by at least one reference system in more than 2 minutes and by all reference systems in less than 20 minutes (the competition timeout) solving time.

Hard: Instances completed by at least one reference system in less than 40 minutes, while also at least one (not necessarily the same) reference system did not finish solving in 20 minutes.

Too hard: Instances such that none of the reference systems finished solving in 40 minutes.
For each of these hardness classes, numbers of available instances per problem domain are shown within parentheses in Table 1, further distinguishing satisfiable and unsatisfiable instances, whose respective numbers are given first and second, respectively. In case of instances classified as "too hard", however, no reference system could report unsatisfiability, and thus the numbers of instances listed second refer to an unknown satisfiability status. Note that there are likewise no "too hard" instances of Decision problems or Query answering domains known as satisfiable, so that the respective numbers are zero. For example, the Sokoban domain features satisfiable as well as unsatisfiable instances for each hardness class apart from the "too hard" one, where 0 instances are known as satisfiable and 620 have an unknown satisfiability status. Unlike that, "too hard" instances of Optimization problems are frequently known to be satisfiable, in which case none of the reference systems was able to confirm an optimum solution within 40 minutes. Moreover, we discarded any instance of an Optimization problem that was reported to be unsatisfiable, so that the respective numbers given second are zero for the first three hardness classes. This applies, e.g., to instances in the Bayesian Network Learning domain, including 4, 8, 19, and 20 satisfiable instances that are "easy", "medium", "hard", or "too hard", respectively, while the satisfiability status of further 6 "too hard" instances is unknown. Finally, the numbers in front of parentheses in Table 1 report how many instances were (randomly) selected per hardness class and satisfiability status, and the selection process is described in the rest of this section.
Given the numbers of satisfiable, unsatisfiable, or, in case of "too hard", unknown instances per hardness class, our benchmark selection process aims at picking 20 instances in each problem domain such that the four hardness classes are balanced, while another share of instances is added freely. Perfect balancing would consist of picking four instances per hardness class and another four instances freely, so that each hardness class contributes 20% of the instances in a domain. Since in most domains the instances are not evenly distributed over the hardness classes, however, it is not always possible to insist on at least four instances per class; rather, we have to compensate for underpopulated classes in which fewer instances are available.
The input to our balancing scheme includes a collection C of classes, where each class is identified with the set of its contained instances. The first step of balancing then determines the nonempty classes from which instances can be picked: C_≠∅ = {c ∈ C | c ≠ ∅}.
The number |C_≠∅| of nonempty classes is used to calculate how many instances should ideally be picked per class, where the calculation makes use of the parameters n and f, standing for the total number of instances to select per domain and the fraction of instances to pick freely, respectively:

pick(c) = ⌈n · (1 − f) / |C_≠∅|⌉   (1)

To account for underpopulated classes, in the next step we calculate the gap between the intended number pick(c) and the available instances in each class: gap(c) = pick(c) − |c|.
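These deterministic calculations can be sketched in ASP itself; the following is a hypothetical illustration, not the organizers' selection encoding, where the predicate names size/2, pick/2, and gap/2 are assumptions, instantiated with n = 20 and n · f = 4 (so that n · (1 − f) = 16):

```
% size(C,S): number S of available instances in hardness class C (input facts)
nonempty(C) :- size(C,S), S > 0.
classes(N) :- N = #count{ C : nonempty(C) }.
% pick(C,P): P is the ceiling of 16/N, i.e., n*(1-f) divided by the
% number of nonempty classes
pick(C,(16+N-1)/N) :- nonempty(C), classes(N).
% gap(C,G): positive G indicates underpopulation, negative G spare capacity
gap(C,P-S) :- pick(C,P), size(C,S).
```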
Example 2
In the Graceful Graphs domain, the "easy", "medium", "hard", and "too hard" classes contain 3, 5, 30, and 21 instances, respectively, when not (yet) distinguishing between satisfiable and unsatisfiable instances. Since all four hardness classes are nonempty, we obtain |C_≠∅| = 4. The calculation of instances to pick per class, with n = 20 and f = 1/5, yields pick(c) = ⌈20 · (4/5) / 4⌉ = 4, so that we aim at 4 instances per hardness class. Along with the number of instances available in each class, we then get gap("easy") = 1, gap("medium") = −1, gap("hard") = −26, and gap("too hard") = −17. Note that a positive number expresses underpopulation of a class relative to the intended number of instances, while negative numbers indicate capacities to compensate for such underpopulation.
Our next objective is to compensate for underpopulated classes by increasing the number of instances to pick from other classes in a fair way. Regarding hardness classes, our compensation scheme relies on the direct successor relation given by "easy" → "medium", "medium" → "hard", and "hard" → "too hard". We denote the strict total order obtained as the transitive closure of → by ≺, and its inverse relation by ≻. Calculations are performed symmetrically with respect to ≺ and ≻, such as determining the number of easier or harder instances available to compensate for the (potential) underpopulation of a class.
The possibility of compensation in favor of easier or harder instances is then determined from these availabilities.
The calculation is such that a positive gap, standing for the underpopulation of a class, is a prerequisite for obtaining a nonzero outcome, and the availability of easier or harder instances to compensate with is required in addition. Given the compensation possibilities, further calculations decide how many easier or harder instances, respectively, are to be picked to resolve an underpopulation, where the distribution should preferably be even, and tie-breaking in favor of harder instances is used as a secondary criterion if the number of instances to compensate for is odd. (The instances to distribute in either direction are limited by the respective compensation possibilities, and "easier" or "harder" refers to instances to be picked in addition; this reading is chosen for a convenient notation in the specification of classes whose numbers of instances are to be increased for compensation.)
It remains to choose classes whose numbers of instances are to be increased for compensation, where we aim to distribute instances to the closest classes with compensation capacities. An inductive calculation scheme accumulates the instances to distribute according to this objective: in a nutshell, it determines how many easier or harder instances, respectively, ought to be distributed up to a class c, along with corresponding increases of the number of instances to be picked from c. These increases are then added to the original number pick(c) of instances to pick from a class, yielding the final numbers pick′(c).
Example 3
Given gap("easy") = 1, gap("medium") = −1, gap("hard") = −26, and gap("too hard") = −17 from Example 2 for the Graceful Graphs domain, accumulating the gaps towards easier and harder classes indicates, for each class, the availability of easier or harder instances for compensation. Again note that positive numbers like gap("easy") = 1 represent a (cumulative) underpopulation, while negative numbers such as gap("hard") = −26 indicate compensation capacities.
Considering "easy" instances, we further find that no easier but one harder instance can be added to compensate for the underpopulation of the "easy" class, while the compensation possibilities for all other classes are zero. Given that instances to distribute are limited by compensation possibilities, which are nonzero at underpopulated classes only, it is sufficient to concentrate on "easy" instances in the Graceful Graphs domain, and one harder instance is to be picked more.
The calculation of instance number increases to compensate for underpopulated classes then distributes this instance to the closest harder class with compensation capacity. That is, the instance to distribute from the underpopulated "easy" class to some harder class increases the number of "medium" instances, while all other increases are zero. The final numbers of instances to pick per hardness class in the Graceful Graphs domain are thus pick′("easy") = 3, pick′("medium") = 5, pick′("hard") = 4, and pick′("too hard") = 4, where the "easy" class is capped by its three available instances. Note that 16 instances are to be selected from particular hardness classes in total, sparing the four instances to be picked freely, and also that our balancing scheme takes care of exchanging an "easy" for a "medium" instance.
After determining the numbers of instances to pick per hardness class, we also aim to balance between satisfiable and unsatisfiable instances within the same class. In fact, the above balancing scheme is general enough to be reused for this purpose by letting C consist of the subclasses of satisfiable or unsatisfiable instances, respectively, in a hardness class that includes at least one instance known to be satisfiable or unsatisfiable. (Otherwise, all subclasses to pick instances from are empty, which would lead to division by zero in (1).) Moreover, the parameters n and f used in (1) are fixed to the number pick′(c) of instances to pick from the hardness class at hand and to 0, which reflects that the satisfiability status should be balanced among all instances to be picked without allocating an additional share of instances to pick freely. For the strict total order on the subclasses in C, we use the corresponding direct successor relation on satisfiability statuses, along with its transitive closure and inverse relation.
Example 4
Reconsidering the Graceful Graphs domain, we obtain the following number of instances to pick based on their satisfiability status: , , , , , and . Note that for , while . The latter is due to rounding in , and then compensating for the underpopulated unsatisfiable instances by increasing the number of satisfiable “medium” instances to pick by one.
For instances of Decision problems or Query answering domains, secondary balancing based on the satisfiability status is generally void for “too hard” instances, of which none are known to be satisfiable or unsatisfiable. In the case of Optimization problems, where we discard instances known to be unsatisfiable, holds for , while our balancing scheme favors “too hard” instances known to be satisfiable over those with an unknown satisfiability status. This approach makes sure that the “too hard” instances to be picked possess solutions, yet confirming an optimum is hard, while instances with an unknown satisfiability status can still be contained among those that are picked freely.
Example 5
Regarding the Optimization problem in the Valves Location domain, we obtain , given that the 23 instances whose satisfiability status is unknown are not considered for balancing.
The described twofold balancing scheme, first considering the hardness of instances and then the satisfiability status of instances of similar hardness, is implemented by an ASPCore2 encoding that consists of two parts: a deterministic program part (having a unique answer set) takes care of determining the numbers from the runtime results of reference systems, and a nondeterministic part similar to the selection program used in the previous ASP Competition edition [Gebser et al. (2017b)] encodes the choice of 20 instances per domain such that lower bounds given by the calculated numbers are met. In comparison to the previous competition edition, we updated the deterministic part of the benchmark selection encoding by implementing the balancing scheme described above, which is more general than before and not fixed to a particular number of classes (regarding hardness or satisfiability status) to balance. The instance selection was then performed by running the ASP solver clasp with the options --rand-freq, --sign-def, and --seed to guarantee reproducible randomization, using the concatenation of the winning numbers in the EuroMillions lottery of 2nd May 2017 as the random seed. This process led to the numbers of instances picked per domain, hardness class, and satisfiability status listed in Table 1.
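The competition performed this seeded selection with clasp, but the effect of fixing a random seed can be shown with a minimal Python sketch; the instance names and the seed value below are made up for illustration (the actual seed was the concatenation of the EuroMillions winning numbers):

```python
import random

def pick_instances(candidates, k, seed):
    """Pick k instances reproducibly: running the selection again
    with the same seed yields the same result (illustrative sketch)."""
    rng = random.Random(seed)  # local RNG, independent of global state
    return sorted(rng.sample(candidates, k))

# Hypothetical candidate pool and seed value.
pool = [f"instance{i:02d}" for i in range(1, 51)]
first = pick_instances(pool, 20, seed=170502)
second = pick_instances(pool, 20, seed=170502)
assert first == second  # reproducible across runs
```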
As a final remark, we note that we had to exclude the Resource Allocation domain from the main competition in view of an insufficient number of instances belonging to the hardness classes under consideration. In fact, the majority of instances turned out to be “very easy” relative to an optimized encoding devised in the phase of checking/establishing the ASPCore2 compliance of the initial submissions by benchmark authors. This does not mean that the problem of Resource Allocation as such is trivial or uninteresting, but rather that time constraints on running the main competition unfortunately did not permit extending and then reassessing the collection of instances.
4 Participant Systems
Fourteen systems, registered by three teams, participate in the System track of the Seventh ASP Competition. The majority of the systems run in the SP category, while two (indicated by the suffix “mt” below) exploit parallelism in the MP category. In the following, we survey the registered teams and systems.
Aalto.
The team from Aalto University submitted nine systems that utilize normalization [Bomanson et al. (2014), Bomanson et al. (2016)] and translation [Bogaerts et al. (2016), Bomanson et al. (2016), Gebser et al. (2014), Janhunen and Niemelä (2011), Liu et al. (2012)] means. Two systems, lp2sat+lingeling and lp2sat+plingelingmt, perform translation to SAT and use lingeling or plingeling, respectively, as backend solver. Similarly, lp2mip and lp2mipmt rely on translation to Mixed Integer Programming along with a single- or multi-threaded variant of cplex for solving. The lp2acycasp, lp2acycpb, and lp2acycsat systems incorporate translations based on acyclicity checking, supported by clasp run as ASP, PseudoBoolean, or SAT solver, as well as the graphsat solver in case of SAT with acyclicity checking. Moreover, lp2normal+lp2sts takes advantage of the sat-to-sat framework to decompose complex computations into several SAT solving tasks. In contrast, lp2normal+clasp confines preprocessing to the (selective) normalization of aggregates and weak constraints before running clasp as ASP solver. Beyond syntactic differences between target formalisms, the main particularities of the available translations concern space complexity and the supported language features. Regarding space, the translation to SAT utilized by lp2sat+lingeling and lp2sat+plingelingmt comes along with a logarithmic overhead in case of non-tight logic programs that involve positive recursion [Fages (1994)], while the other translations are guaranteed to remain linear. Considering language features, the systems by the Aalto team do not support queries, and the backend solver clasp of lp2acycasp, lp2acycpb, and lp2normal+clasp provides a native implementation of aggregates, which the other systems treat by normalization within preprocessing.
Optimization problems are supported by all systems but lp2sat+lingeling, lp2sat+plingelingmt, and lp2normal+lp2sts, while only lp2normal+lp2sts and lp2normal+clasp are capable of handling non-HCF disjunction.
MeAsp.
The MEASP team from the University of Genova, the University of Sassari, and the University of Calabria submitted the multiengine ASP system measp2, which is an updated version of measp [Maratea et al. (2012), Maratea et al. (2014), Maratea et al. (2015)], the winning system in the Regular track of the Sixth ASP Competition. Like its predecessor, measp2 investigates features of an input program to select its backend among a pool of ASP grounders and solvers. Basically, measp2 applies algorithm selection techniques before each stage of the answer set computation, with the goal of selecting the most promising computation strategy overall. As regards grounders, measp2 can pick either dlv or gringo, while the available solvers include a selection of those submitted to the Sixth ASP Competition as well as the latest version of clasp. The first selection (basically corresponding to the choice of the grounder) is based on features of non-ground programs and was obtained by applying the PART decision list algorithm, whereas the choice of a solver is based on the multinomial classification algorithm k-Nearest Neighbors, used to train a model on features of ground programs extracted (whenever required) from the output generated by the grounder (for more details, see [Maratea et al. (2015)]).
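To illustrate the flavor of this second, solver-selection stage, the following minimal k-Nearest-Neighbors sketch picks a solver label from numeric features of a ground program. The feature vectors, labels, and Euclidean metric are invented for the example and do not reflect measp2's actual features or training data.

```python
import math
from collections import Counter

def knn_pick_solver(train, features, k=3):
    """Return the solver label most common among the k training
    points closest (Euclidean distance) to the feature vector."""
    nearest = sorted(train, key=lambda t: math.dist(t[0], features))[:k]
    labels = Counter(label for _, label in nearest)
    return labels.most_common(1)[0][0]

# Hypothetical training set: (ground-program features, best solver).
train = [
    ((120.0, 0.3), "clasp"),
    ((115.0, 0.4), "clasp"),
    ((900.0, 0.9), "wasp"),
    ((870.0, 0.8), "wasp"),
]
print(knn_pick_solver(train, (100.0, 0.35)))  # -> clasp
```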
Unical.
The team from the University of Calabria submitted four systems utilizing the recent idlv grounder [Calimeri et al. (2017)], developed as a redesign of (the grounder component of) dlv along with the addition of new features. Moreover, backends for solving are selected from a variety of existing ASP solvers. In particular, idlvclaspdlv makes use of dlv [Leone et al. (2006), Maratea et al. (2008)] for instances featuring a ground query; otherwise, it combines the grounder idlv with clasp executed with the option --configuration=trendy. The idlv+claspdlv system is a variant of the previous system that uses a heuristic-guided rewriting technique [Calimeri et al. (2018)] relying on hypertree decomposition, which aims to automatically replace long rules with sets of smaller ones that can possibly be evaluated more efficiently. idlv+waspdlv is obtained by using wasp in place of clasp. In more detail, wasp is executed with the options --shrinking-strategy=progression --shrinking-budget=10 --trim-core --enable-disjcores, which configure wasp to use two techniques tailored for Optimization problems. Inspired by measp, idlv+s [Fuscà et al. (2017)] integrates idlv+ with an automatic selector to choose between wasp and clasp on a per-instance basis. To this end, idlv+s implements classification by means of the well-known support vector machine technique. A more detailed description of the idlv+s system is provided in [Calimeri et al. (2019)].
5 Results
This section presents the results of the Seventh ASP Competition. We first announce the winners in the SP category and analyze their performance, then give an overview of the results in the MP category. Finally, we analyze the results in more detail, outlining some notable outcomes.
5.1 Results in the SP Category
Figures 2 and 3 summarize the results of the SP category, by showing the scores of the various systems, where Figure 2 utilizes function for computing the score of Optimization problems, while Figure 3 utilizes function . To sum up, considering Figure 2, the first three places go to the systems:

idlv+s, by the UNICAL team, with 2665 points;

idlv+claspdlv, by the UNICAL team, with 2655.9 points;

idlvclaspdlv, by the UNICAL team, with 2634 points.
Also, measp is quite close in performance, earning 2560 points in total.
Thus, the first three places are taken by versions of the idlv system pursuing the approaches outlined in Section 4. The fourth place, with very positive results, is instead taken by measp, which pursues a portfolio approach and was the winner of the last competition.
Going into the details of the subtracks, the three top-performing systems overall take the first places as well:

Subtrack #1 (Basic Decision): idlvclaspdlv and idlv+claspdlv with 400 points;

Subtrack #2 (Advanced Decision): idlv+claspdlv with 805 points;

Subtrack #3 (Optimization): idlv+claspdlv with 1015.9 points;

Subtrack #4 (Unrestricted): idlv+s with 450 points.
Considering Figure 3, which employs function for computing scores of Optimization problems, the situation is slightly different, i.e., measp now takes third place. To sum up, the first three places go to the systems:

idlv+s, by the UNICAL team, with 2330 points;

idlv+claspdlv, by the UNICAL team, with 2200 points;

measp, by the MEASP team, with 2185 points.
idlvclaspdlv is now fourth with the same score of 2185 points but a higher cumulative CPU time: the difference lies in Subtrack #3, where the relative results differ with respect to using , and with the new score computation measp earns 55 points more than idlvclaspdlv. In general, employing function for computing scores of Optimization problems leads to lower scores: indeed, is more restrictive than , given that only optimal results are considered.
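The precise scoring functions are defined by the competition rules; the following hypothetical sketch merely illustrates why a scheme that counts only proven-optimal results yields lower scores than one granting partial credit for good-but-unproven solutions. The function names and the half-credit rule are invented for the example.

```python
def score_optimal_only(results):
    """Count only instances solved to proven optimality."""
    return sum(1 for r in results if r == "optimum")

def score_partial_credit(results):
    """Also grant half credit for a solution whose optimality
    was not proven (hypothetical partial-credit rule)."""
    credit = {"optimum": 1.0, "solution": 0.5, "none": 0.0}
    return sum(credit[r] for r in results)

# Hypothetical outcomes of one solver over five instances.
runs = ["optimum", "solution", "solution", "none", "optimum"]
print(score_optimal_only(runs), score_partial_credit(runs))  # -> 2 3.0
```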
An overall view of the performances of all participant systems on all benchmarks is shown in the cactus plot of Figure 4. Detailed results are reported in Appendix A.
Official results in Figures 2 and 3 are complemented by the data shown in Figures 5 and 6. Figure 5 contains, for each solver, the number of (optimally) solved instances for each reasoning task of the competition, i.e., Decision, Optimization, and Query answering (denoted Dec, Opt, and Query in the figure, respectively). From the figure, we can see that idlv+claspdlv and idlvclaspdlv are the solvers that perform best on Decision problems, while idlv+s is the best on Optimization problems. As for Query answering, the first four solvers perform equally well. Figure 6, instead, reports, for each solver, the percentage of solved instances (resp. score) in the various subtracks out of the total number of (optimally) solved instances (resp. the global score), i.e., the “contribution” of the tasks in each subtrack to the results of the respective system.
5.2 Results in the MP Category
Figure 7 shows the results for the MP category. The bottom part of the figure reports the scores obtained by the two participant systems, which are cumulatively:

lp2sat+plingelingmt, by the Aalto team, 715 points;

lp2mipmt, by the Aalto team, 635 points.
Looking into the details of the subtracks, we can note that lp2sat+plingelingmt is better than lp2mipmt on Subtrack #1 and much better on Subtrack #2, while on Subtrack #3, where lp2sat+plingelingmt does not compete, lp2mipmt earns a considerable number of points, but not enough to reach the global score achieved by lp2sat+plingelingmt in the first two subtracks.
The top part of Figure 7, instead, complements the results by showing the “contribution” of solved instances in each subtrack out of the score of the respective system.
5.3 Analysis of the Results
There are some general observations that can be drawn from the results presented in this section. First, the best overall solver implements algorithm selection techniques, and continues the “tradition” of the efficiency of portfolio-based solvers in ASP competitions, given that claspfolio [Gebser et al. (2011)] and measp [Maratea et al. (2014)] were the overall winners of the 2011 and 2015 competitions, respectively. At the same time, this result outlines the importance of introducing new evaluation techniques and implementations. Indeed, although idlv+s applies a strategy similar to that of measp, idlv+s exploits a new component (i.e., the grounder [Calimeri et al. (2016)]) that was not present in measp (which is based on solvers from the previous competition). Second, the approach implemented by lp2normal using clasp confirms its very good behavior in all subtracks, and thus overall. Third, specific instantiations of the translation-based approach perform particularly well in some subtracks: this is the case for the lp2acycasp solver using clasp in Subtrack #3, especially when considering scoring scheme , but also for lp2mip, which compiles to a general-purpose solver, in the same subtrack, especially when considering scoring scheme (even if to a lesser extent). As far as the comparison between solvers in the MP category and their counterparts in the SP category is concerned, we can see that globally the score of lp2sat+plingelingmt and lp2sat+plingeling is the same, with the small advantage of lp2sat+plingelingmt in Subtrack #1 being compensated in Subtrack #2. Instead, lp2mipmt earns considerably more points than lp2mip, especially in Subtracks #1 and #3. In general, more specific research is probably needed on ASP solvers exploiting multi-threading to take real advantage of this setting.
6 Conclusion and Final Remarks
We have presented the design and results of the Seventh ASP Competition, with particular focus on the new problem domains, the revised benchmark selection process, the systems registered for the event, and the outcome of the evaluation.
In the following, we draw some recommendations for future editions. These resemble those made after the past event: for some of them, steps have already been taken in this seventh event, but they should still be considered, with the aim of widening the number of participant systems and application domains, starting from the next (Eighth) ASP Competition, which will take place in 2019 in affiliation with the 15th International Conference on Logic Programming and NonMonotonic Reasoning (LPNMR 2019) in Philadelphia, US:

We also tried to reintroduce a Model&Solve track at the competition. However, given the short call for contributions and the low number of expressions of interest received, we decided not to run the track. Despite this, we still think that a (restricted form of a) Model&Solve track should be reintroduced in the ASP Competition series.

Our aim with the reintroduction of a Model&Solve track was to target domains involving, e.g., discrete as well as continuous dynamics [Balduccini et al. (2017)], so that extensions like Constraint Answer Set Programming (CASP) [Mellarkod et al. (2008)] and incremental ASP solving [Gebser et al. (2008)] may be exploited. The mentioned extensions could be added as tracks of the competition, but for CASP the first necessary step would be a standardization of its language.

Given that basically all participant systems still rely on grounding, the availability of more grounders is crucial. In this event the IDLV grounder came into play, but there is also a need for more diverse techniques. This may also help to improve portfolio solvers, by exploiting machine learning techniques at the non-ground level (for a preliminary investigation, see [Maratea et al. (2013), Maratea et al. (2015)]).

Portfolio solvers have shown good performance in the editions in which they participated. However, no such system has so far exploited a parallel portfolio approach. Exploring the combination of these techniques could be an interesting topic of future research for further improving efficiency.

Another option for attracting (young) researchers from neighboring areas to the development of ASP solvers may be a track dedicated to modifications of a common reference system, in the spirit of the Minisat hack track of the SAT Competition series. This would lower the “entrance barrier” by keeping the effort of participation affordable, even for small teams.
Acknowledgments.
The organizers of the Seventh ASP Competition would like to thank the LPNMR 2017 officials for the colocation of the event. We also acknowledge the Department of Mathematics and Computer Science at the University of Calabria for supplying the computational resources to run the competition. Finally, we thank all solver and benchmark contributors, and participants, who worked hard to make this competition possible.
References
 Alviano et al. (2017) Alviano, M., Calimeri, F., Dodaro, C., Fuscà, D., Leone, N., Perri, S., Ricca, F., Veltri, P., and Zangari, J. 2017. The ASP system DLV2. In Proceedings of the Fourteenth International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR’17), M. Balduccini and T. Janhunen, Eds. Lecture Notes in AI (LNAI), vol. 10377. SpringerVerlag, 215–221.
 Alviano et al. (2015) Alviano, M., Dodaro, C., Leone, N., and Ricca, F. 2015. Advances in WASP. In Proceedings of the Thirteenth International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR’15), F. Calimeri, G. Ianni, and M. Truszczyński, Eds. Lecture Notes in Computer Science, vol. 9345. SpringerVerlag, 40–54.
 Amendola et al. (2017) Amendola, G., Dodaro, C., Faber, W., Leone, N., and Ricca, F. 2017. On the computation of paracoherent answer sets. In Proceedings of the ThirtyFirst AAAI Conference on Artificial Intelligence (AAAI’17), S. P. Singh and S. Markovitch, Eds. AAAI Press, 1034–1040.
 Amendola et al. (2018) Amendola, G., Dodaro, C., Faber, W., and Ricca, F. 2018. Externally supported models for efficient computation of paracoherent answer sets. In Proceedings of the ThirtySecond AAAI Conference on Artificial Intelligence (AAAI’18), S. A. McIlraith and K. Q. Weinberger, Eds. AAAI Press, 1720–1727.
 Amendola et al. (2016) Amendola, G., Eiter, T., Fink, M., Leone, N., and Moura, J. 2016. Semiequilibrium models for paracoherent answer set programs. Artificial Intelligence 234, 219–271.
 Amendola et al. (2017) Amendola, G., Ricca, F., and Truszczyński, M. 2017. Generating hard random Boolean formulas and disjunctive logic programs. In Proceedings of the TwentySixth International Joint Conference on Artificial Intelligence (IJCAI’17), C. Sierra, Ed. ijcai.org, 532–538.
 Applegate et al. (2007) Applegate, D., Bixby, R., Chvátal, V., and Cook, W. 2007. The Traveling Salesman Problem: A Computational Study. Princeton University Press.
 Balduccini et al. (2017) Balduccini, M., Magazzeni, D., Maratea, M., and Leblanc, E. 2017. CASP solutions for planning in hybrid domains. Theory and Practice of Logic Programming 17, 4, 591–633.
 Baral (2003) Baral, C. 2003. Knowledge Representation, Reasoning and Declarative Problem Solving. Cambridge University Press.
 BenEliyahu and Dechter (1994) BenEliyahu, R. and Dechter, R. 1994. Propositional semantics for disjunctive logic programs. Annals of Mathematics and Artificial Intelligence 12, 53–87.
 Bogaerts et al. (2016) Bogaerts, B., Janhunen, T., and Tasharrofi, S. 2016. Stableunstable semantics: Beyond NP with normal logic programs. Theory and Practice of Logic Programming 16, 56, 570–586.
 Bomanson et al. (2014) Bomanson, J., Gebser, M., and Janhunen, T. 2014. Improving the normalization of weight rules in answer set programs. In Proceedings of the Fourteenth European Conference on Logics in Artificial Intelligence (JELIA’14), E. Fermé and J. Leite, Eds. Lecture Notes in Artificial Intelligence, vol. 8761. SpringerVerlag, 166–180.
 Bomanson et al. (2016) Bomanson, J., Gebser, M., and Janhunen, T. 2016. Rewriting optimization statements in answerset programs. In Technical Communications of the Thirtysecond International Conference on Logic Programming (ICLP’16), M. Carro and A. King, Eds. Open Access Series in Informatics, vol. 52. Schloss Dagstuhl, 5:1–5:15.
 Bomanson et al. (2016) Bomanson, J., Gebser, M., Janhunen, T., Kaufmann, B., and Schaub, T. 2016. Answer set programming modulo acyclicity. Fundamenta Informaticae 147, 1, 63–91.
 Brewka et al. (2011) Brewka, G., Eiter, T., and Truszczyński, M. 2011. Answer set programming at a glance. Communications of the ACM 54, 12, 92–103.
 Bruynooghe et al. (2015) Bruynooghe, M., Blockeel, H., Bogaerts, B., De Cat, B., De Pooter, S., Jansen, J., Labarre, A., Ramon, J., Denecker, M., and Verwer, S. 2015. Predicate logic as a modeling language: Modeling and solving some machine learning and data mining problems with IDP3. Theory and Practice of Logic Programming 15, 6, 783–817.
 Calimeri et al. (2019) Calimeri, F., Dodaro, C., Fuscà, D., Perri, S., and Zangari, J. 2019. Efficiently coupling the IDLV grounder with ASP solvers. Theory and Practice of Logic Programming. To appear.
 Calimeri et al. (2016) Calimeri, F., Fuscà, D., Perri, S., and Zangari, J. 2016. IDLV: The new intelligent grounder of dlv. In Proceedings of AI*IA 2016: Advances in Artificial Intelligence  Fifteenth International Conference of the Italian Association for Artificial Intelligence, G. Adorni, S. Cagnoni, M. Gori, and M. Maratea, Eds. Lecture Notes in Computer Science, vol. 10037. Springer, 192–207.
 Calimeri et al. (2017) Calimeri, F., Fuscà, D., Perri, S., and Zangari, J. 2017. IDLV: The new intelligent grounder of DLV. Intelligenza Artificiale 11, 1, 5–20.
 Calimeri et al. (2018) Calimeri, F., Fuscà, D., Perri, S., and Zangari, J. 2018. Optimizing answer set computation via heuristicbased decomposition. In Proceedings of the Twentieth International Symposium on Practical Aspects of Declarative Languages (PADL’18), F. Calimeri, K. W. Hamlen, and N. Leone, Eds. Lecture Notes in Computer Science, vol. 10702. Springer, 135–151.
 Calimeri et al. (2016) Calimeri, F., Gebser, M., Maratea, M., and Ricca, F. 2016. Design and results of the fifth answer set programming competition. Artificial Intelligence 231, 151–181.
 Calimeri et al. (2014) Calimeri, F., Ianni, G., and Ricca, F. 2014. The third open answer set programming competition. Theory and Practice of Logic Programming 14, 1, 117–135.
 Cussens (2011) Cussens, J. 2011. Bayesian network learning with cutting planes. In Proceedings of the Twentyseventh International Conference on Uncertainty in Artificial Intelligence (UAI’11), F. Cozman and A. Pfeffer, Eds. AUAI Press, 153–160.
 Eiter and Gottlob (1995) Eiter, T. and Gottlob, G. 1995. On the computational cost of disjunctive logic programming: Propositional case. Annals of Mathematics and Artificial Intelligence 15, 34, 289–323.
 Eiter et al. (2009) Eiter, T., Ianni, G., and Krennwallner, T. 2009. Answer Set Programming: A Primer. In Reasoning Web. Semantic Technologies for Information Systems, 5th International Summer School  Tutorial Lectures. BrixenBressanone, Italy, 40–110.
 Fages (1994) Fages, F. 1994. Consistency of Clark’s Completion and Existence of Stable Models. Journal of Methods of Logic in Computer Science 1, 1, 51–60.
 Fuscà et al. (2017) Fuscà, D., Calimeri, F., Zangari, J., and Perri, S. 2017. IDLV+MS: Preliminary report on an automatic ASP solver selector. In Proceedings of the Twentyfourth RCRA International Workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA’17), M. Maratea and I. Serina, Eds. CEUR Workshop Proceedings, vol. 2011. CEURWS.org, 26–32.
 Gebser et al. (2014) Gebser, M., Janhunen, T., and Rintanen, J. 2014. Answer set programming as SAT modulo acyclicity. In Proceedings of the Twentyfirst European Conference on Artificial Intelligence (ECAI’14), T. Schaub, G. Friedrich, and B. O’Sullivan, Eds. Frontiers in Artificial Intelligence and Applications, vol. 263. IOS Press, 351–356.
 Gebser et al. (2008) Gebser, M., Kaminski, R., Kaufmann, B., Ostrowski, M., Schaub, T., and Thiele, S. 2008. Engineering an incremental ASP solver. In Proceedings of the Twentyfourth International Conference on Logic Programming (ICLP’08), M. Garcia de la Banda and E. Pontelli, Eds. Lecture Notes in Computer Science, vol. 5366. SpringerVerlag, 190–205.
 Gebser et al. (2015) Gebser, M., Kaminski, R., Kaufmann, B., Romero, J., and Schaub, T. 2015. Progress in clasp series 3. In Proceedings of the Thirteenth International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR’15), F. Calimeri, G. Ianni, and M. Truszczyński, Eds. Lecture Notes in Computer Science, vol. 9345. SpringerVerlag, 368–383.
 Gebser et al. (2011) Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T., Schneider, M. T., and Ziller, S. 2011. A portfolio solver for answer set programming: Preliminary report. In Proceedings of the Eleventh International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR’11). Lecture Notes in Computer Science, vol. 6645. Springer, Vancouver, Canada, 352–357.
 Gebser et al. (2018) Gebser, M., Leone, N., Maratea, M., Perri, S., Ricca, F., and Schaub, T. 2018. Evaluation techniques and systems for answer set programming: a survey. In Proceedings of the TwentySeventh International Joint Conference on Artificial Intelligence (IJCAI 2018), J. Lang, Ed. ijcai.org, 5450–5456.
 Gebser et al. (2016) Gebser, M., Maratea, M., and Ricca, F. 2016. What’s hot in the answer set programming competition. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016), D. Schuurmans and M. P. Wellman, Eds. AAAI Press, 4327–4329.
 Gebser et al. (2017a) Gebser, M., Maratea, M., and Ricca, F. 2017a. The design of the seventh answer set programming competition. In Proceedings of the Fourteenth International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR’17), M. Balduccini and T. Janhunen, Eds. Lecture Notes in AI (LNAI), vol. 10377. SpringerVerlag, 3–9.
 Gebser et al. (2017b) Gebser, M., Maratea, M., and Ricca, F. 2017b. The sixth answer set programming competition. Journal of Artificial Intelligence Research 60, 41–95.
 Gelfond and Leone (2002) Gelfond, M. and Leone, N. 2002. Logic Programming and Knowledge Representation – the AProlog perspective. Artificial Intelligence 138, 1–2, 3–38.
 Guerinik and Caneghem (1995) Guerinik, N. and Caneghem, M. V. 1995. Solving crew scheduling problems by constraint programming. In Proceedings of the First International Conference on Principles and Practice of Constraint Programming (CP’95), U. Montanari and F. Rossi, Eds. Lecture Notes in Computer Science, vol. 976. Springer, 481–498.
 Havur et al. (2016) Havur, G., Cabanillas, C., Mendling, J., and Polleres, A. 2016. Resource allocation with dependencies in business process management systems. In Proceedings of the Business Process Management Forum (BPM’16), M. L. Rosa, P. Loos, and O. Pastor, Eds. Lecture Notes in Business Information Processing, vol. 260. Springer, 3–19.
 Inoue and Sakama (1996) Inoue, K. and Sakama, C. 1996. A Fixpoint Characterization of Abductive Logic Programs. Journal of Logic Programming 27, 2, 107–136.
 Janhunen et al. (2017) Janhunen, T., Gebser, M., Rintanen, J., Nyman, H., Pensar, J., and Corander, J. 2017. Learning discrete decomposable graphical models via constraint optimization. Statistics and Computing 27, 1, 115–130.
 Janhunen and Niemelä (2011) Janhunen, T. and Niemelä, I. 2011. Compact translations of nondisjunctive answer set programs to propositional clauses. In Proceedings of the Symposium on Constructive Mathematics and Computer Science in Honour of Michael Gelfond’s 65th Anniversary. Lecture Notes in Computer Science, vol. 6565. Springer, 111–130.
 Koponen et al. (2015) Koponen, L., Oikarinen, E., Janhunen, T., and Säilä, L. 2015. Optimizing phylogenetic supertrees using answer set programming. Theory and Practice of Logic Programming 15, 45, 604–619.
 Lefèvre et al. (2017) Lefèvre, C., Béatrix, C., Stéphan, I., and Garcia, L. 2017. ASPeRiX, a firstorder forward chaining approach for answer set computing. Theory and Practice of Logic Programming 17, 3, 266–310.
 Leone et al. (2006) Leone, N., Pfeifer, G., Faber, W., Eiter, T., Gottlob, G., Perri, S., and Scarcello, F. 2006. The DLV system for knowledge representation and reasoning. ACM Trans. Comput. Log. 7, 3, 499–562.
 Lierler et al. (2016) Lierler, Y., Maratea, M., and Ricca, F. 2016. Systems, engineering environments, and competitions. AI Magazine 37, 3, 45–52.
 Lifschitz (2002) Lifschitz, V. 2002. Answer Set Programming and Plan Generation. Artificial Intelligence 138, 39–54.
 Lifschitz (2008) Lifschitz, V. 2008. Twelve definitions of a stable model. In Proceedings of the Twentyfourth International Conference on Logic Programming (ICLP’08), M. Garcia de la Banda and E. Pontelli, Eds. Lecture Notes in Computer Science, vol. 5366. SpringerVerlag, 37–51.
 Liu et al. (2012) Liu, G., Janhunen, T., and Niemelä, I. 2012. Answer set programming via mixed integer programming. In Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning (KR’12), G. Brewka, T. Eiter, and S. A. McIlraith, Eds. AAAI Press, 32–42.
 Maratea et al. (2012) Maratea, M., Pulina, L., and Ricca, F. 2012. The multiengine ASP solver measp. In Proceedings of the 13th European Conference on Logics in Artificial Intelligence (JELIA 2012), L. F. del Cerro, A. Herzig, and J. Mengin, Eds. Lecture Notes in Computer Science, vol. 7519. Springer, 484–487.
 Maratea et al. (2013) Maratea, M., Pulina, L., and Ricca, F. 2013. Automated selection of grounding algorithm in answer set programming. In Advances in Artificial Intelligence  Proceedings of the 13th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2013), M. Baldoni, C. Baroglio, G. Boella, and R. Micalizio, Eds. Lecture Notes in Computer Science, vol. 8249. Springer, 73–84.
 Maratea et al. (2014) Maratea, M., Pulina, L., and Ricca, F. 2014. A multiengine approach to answerset programming. Theory and Practice of Logic Programming 14, 6, 841–868.
 Maratea et al. (2015) Maratea, M., Pulina, L., and Ricca, F. 2015. Multilevel algorithm selection for ASP. In Proceedings of the Thirteenth International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR’15), F. Calimeri, G. Ianni, and M. Truszczyński, Eds. Lecture Notes in Computer Science, vol. 9345. SpringerVerlag, 439–445.
 Maratea et al. (2008) Maratea, M., Ricca, F., Faber, W., and Leone, N. 2008. Lookback techniques and heuristics in dlv: Implementation, evaluation and comparison to qbf solvers. Journal of Algorithms in Cognition, Informatics and Logics 63, 1–3, 70–89.
 Marek and Truszczyński (1999) Marek, V. W. and Truszczyński, M. 1999. Stable Models and an Alternative Logic Programming Paradigm. In The Logic Programming Paradigm – A 25Year Perspective, K. R. Apt, V. W. Marek, M. Truszczyński, and D. S. Warren, Eds. Springer Verlag, 375–398.
 Marple and Gupta (2014) Marple, K. and Gupta, G. 2014. Dynamic consistency checking in goaldirected answer set programming. Theory and Practice of Logic Programming 14, 45, 415–427.
 Mellarkod et al. (2008) Mellarkod, V., Gelfond, M., and Zhang, Y. 2008. Integrating answer set programming and constraint logic programming. Annals of Mathematics and Artificial Intelligence 53, 14, 251–287.
 Niemelä (1999) Niemelä, I. 1999. Logic Programming with Stable Model Semantics as Constraint Programming Paradigm. Annals of Mathematics and Artificial Intelligence 25, 3–4, 241–273.
Appendix A Detailed Results
We report in this appendix the detailed results, aggregated by solver. In particular, Figures 8–11 report, for each solver and each domain, the score with the computation for Optimization problems (ScoreASP2015), the score with the computation for Optimization problems (ScoreSolved), the sum of the execution times over all instances (Sum(Time)), the average memory usage on solved instances (Avg(Mem)), the number of solved instances (#Sol), the number of timed-out executions (#TO), the number of executions terminated because the solver exceeded the memory limit (#MO), and the number of executions with an abnormal outcome (#OE), the last counting the instances that could not be solved by a solver, thus including output errors, abnormal terminations, give-ups, as well as instances of a domain in which the solver did not participate. An “*” next to a score of 0 indicates that the solver was disqualified from a domain because it terminated normally but produced a wrong witness on some instance of the domain.