Bound Your Models! How to Make OWL an ASP Modeling Language

Abstract

To exploit the Web Ontology Language OWL as an answer set programming (ASP) language, we introduce the notion of bounded model semantics, as an intuitive and computationally advantageous alternative to its classical semantics. We show that a translation into ASP allows for solving a wide range of bounded-model reasoning tasks, including satisfiability and axiom entailment but also novel ones such as model extraction and enumeration. Ultimately, our work facilitates harnessing advanced semantic web modeling environments for the logic programming community through an “off-label use” of OWL.

Keywords: Answer Set Programming, Bounded-Model Semantics, Semantic Web

1 Introduction

Answer set programming (ASP) is a powerful declarative language for knowledge representation and reasoning [4]. In ASP the knowledge is encoded in a set of logical rules and interpreted under the stable model semantics [8, 9]. Recent developments have led to powerful systems, e.g. dlv [17] and gringo/clasp [6], which are capable of solving a large variety of problems [7]. In particular, ASP has been shown to be well suited for large combinatorial search problems, as the dedicated solvers are specifically designed to enumerate all solutions [2].

However, it has often been noted that, while being a powerful and versatile formalism, popularity and widespread adoption of logic programming in general and answer set programming in particular is hindered by the non-availability of user-friendly and scalable editing environments.

On the other hand, formalisms that come with more elaborate modeling tool support, most notably the Web Ontology Language OWL [31], are often preferred, even if the application scenario is actually of a constraint-satisfaction type, which does not go well with OWL’s standard semantics allowing for models of arbitrary size. Ontology editors like Protégé [16] provide user-friendly interfaces and, combined with the natural-language-like Manchester syntax [12], offer perspicuous access to a presumably complex and involved formalism.

We propose bounded model reasoning as an intuitive and simple approach to overcome this situation. To this end, we endow OWL with a non-standard model-theoretic semantics, modifying the modelhood condition by restricting the domain to a finite set of bounded size, induced by the named individuals occurring in the given OWL ontology. We note that this additional condition can be axiomatized in the latest version of OWL. While reasoning in OWL under the classical semantics is N2ExpTime-complete [15], we show that reasoning under the bounded model semantics is merely NP-complete. Still, when employing the axiomatization, existing OWL reasoners struggle with bounded model reasoning, due to the heavy combinatorics involved.

Therefore, we propose a different approach and define a translation of knowledge bases (the logical counterparts to OWL ontologies) into answer set programs [4], such that the set of bounded models coincides with the set of answer sets of the obtained program, allowing us to use existing answer set solvers (see [2] for an overview) for bounded model reasoning. In addition to the inferencing tasks typically used in semantic web technologies, this approach also allows for solving other, non-standard reasoning problems like model enumeration.

The benefits are manifold; in this work we particularly emphasize OWL as a modeling language for typical constraint-satisfaction-type problems. The translation-based approach can be seen as a higher-level layer on top of the ASP language. Although we focus on the description logic $\mathcal{SROIQ}$ and its native DL syntax, other syntax specifications such as the OWL 2 Manchester Syntax [12] strive towards user-friendliness by means of natural language features.

We have implemented the proposed approach, for which first preliminary evaluations on typical constraint-satisfaction-type problems not only demonstrate feasibility, but also suggest significant improvement compared to the axiomatized approach using highly optimized OWL reasoners.

The article is organized as follows. In Section 2 we introduce the necessary background on description logics and ASP. Then, in Section 3 we define the bounded model semantics and analyze their complexity. The particular encoding of a knowledge base into ASP is given in Section 4. A preliminary evaluation of the implemented system is summarized in Section 5. Finally, we conclude in Section 6 and discuss possible future directions.

2 Preliminaries

In this section we provide the necessary background of description logics and answer set programming.

2.1 Description Logics

OWL 2 DL, the version of the Web Ontology Language we focus on, is defined based on description logics (DLs, [3, 25]). We briefly recap the description logic $\mathcal{SROIQ}$ (for details see [13]). Let $N_I$, $N_C$, and $N_R$ be finite, disjoint sets called individual names, concept names and role names, respectively. These atomic entities can be used to form complex ones as displayed in Table 1.

A $\mathcal{SROIQ}$ knowledge base $\mathcal{K}$ is a tuple $(\mathcal{A}, \mathcal{T}, \mathcal{R})$ where $\mathcal{A}$ is a $\mathcal{SROIQ}$ ABox, $\mathcal{T}$ is a $\mathcal{SROIQ}$ TBox and $\mathcal{R}$ is a $\mathcal{SROIQ}$ RBox. Table 2 presents the respective axiom types available in the three parts, and we will refer to each TBox axiom as a general concept inclusion (GCI). The original definition of $\mathcal{SROIQ}$ contained more RBox axioms (expressing transitivity, (a)symmetry, (ir)reflexivity of roles), but these can be shown to be syntactic sugar. Moreover, the definition of $\mathcal{SROIQ}$ contains so-called global restrictions which prevent certain axioms from occurring together. These complicated restrictions, while crucial for the decidability of classical reasoning in $\mathcal{SROIQ}$, are not necessary for the bounded-model reasoning considered here, hence we omit them for the sake of brevity.

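The semantics of $\mathcal{SROIQ}$ is defined via interpretations $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ composed of a non-empty set $\Delta^{\mathcal{I}}$ called the domain of $\mathcal{I}$ and a function $\cdot^{\mathcal{I}}$ mapping individual names to elements of $\Delta^{\mathcal{I}}$, concept names to subsets of $\Delta^{\mathcal{I}}$ and role names to subsets of $\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$. This mapping is extended to complex role and concept expressions (cf. Table 1) and finally used to define satisfaction of axioms (see Table 2). We say that $\mathcal{I}$ satisfies a knowledge base $\mathcal{K} = (\mathcal{A}, \mathcal{T}, \mathcal{R})$ (or $\mathcal{I}$ is a model of $\mathcal{K}$, written: $\mathcal{I} \models \mathcal{K}$) if it satisfies all axioms of $\mathcal{A}$, $\mathcal{T}$, and $\mathcal{R}$. We say that a knowledge base $\mathcal{K}$ entails an axiom $\alpha$ (written $\mathcal{K} \models \alpha$) if all models of $\mathcal{K}$ are models of $\alpha$.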

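2.2 Answer Set Programming

We give a brief overview of the syntax and semantics of disjunctive logic programs under the answer-set semantics [9]. We fix a countable set $\mathcal{U}$ of (domain) elements, also called constants, and suppose a total order $<$ over the domain elements. An atom is an expression $p(t_1, \ldots, t_n)$, where $p$ is a predicate of arity $n \geq 0$ and each $t_i$ is either a variable or an element from $\mathcal{U}$. An atom is ground if it is free of variables. $B_{\mathcal{U}}$ denotes the set of all ground atoms over $\mathcal{U}$. A (disjunctive) rule $\rho$ is of the form

$a_1 \vee \cdots \vee a_n \leftarrow b_1, \ldots, b_k, not\ b_{k+1}, \ldots, not\ b_m$,

with $n \geq 0$, $m \geq k \geq 0$, $n + m > 0$, where $a_1, \ldots, a_n, b_1, \ldots, b_m$ are atoms, or a count expression of the form $\#count\{l : l_1, \ldots, l_i\} \bowtie u$, where $l$ is an atom and $l_j = p_j$ or $l_j = not\ p_j$, for $p_j$ an atom, $1 \leq j \leq i$, $u$ a non-negative integer, and $\bowtie\ \in \{\leq, <, =, >, \geq\}$. Moreover, “$not$” denotes default negation. The head of $\rho$ is the set $H(\rho) = \{a_1, \ldots, a_n\}$ and the body of $\rho$ is $B(\rho) = \{b_1, \ldots, b_k, not\ b_{k+1}, \ldots, not\ b_m\}$. Furthermore, $B^+(\rho) = \{b_1, \ldots, b_k\}$ and $B^-(\rho) = \{b_{k+1}, \ldots, b_m\}$. A rule $\rho$ is normal if $n \leq 1$ and a constraint if $n = 0$. A rule $\rho$ is safe if each variable in $\rho$ occurs in $B^+(\rho)$. A rule $\rho$ is ground if no variable occurs in $\rho$. A fact is a ground rule with empty body and no disjunction. An (input) database is a set of facts. A program is a finite set of rules. For a program $\Pi$ and an input database $D$, we often write $\Pi(D)$ instead of $D \cup \Pi$. If each rule in a program is normal (resp. ground), we call the program normal (resp. ground).

For any program $\Pi$, let $U_\Pi$ be the set of all constants appearing in $\Pi$. $Gr(\Pi)$ is the set of rules $\rho\sigma$ obtained by applying, to each rule $\rho \in \Pi$, all possible substitutions $\sigma$ from the variables in $\rho$ to elements of $U_\Pi$. For count expressions, $\{l : l_1, \ldots, l_i\}$ denotes the set of all ground instantiations of $l$, governed through $\{l_1, \ldots, l_i\}$. An interpretation $I \subseteq B_{\mathcal{U}}$ satisfies a ground rule $\rho$ iff $H(\rho) \cap I \neq \emptyset$ whenever $B^+(\rho) \subseteq I$, $B^-(\rho) \cap I = \emptyset$, and for each contained count expression, $N \bowtie u$ holds, where $N$ is the cardinality of the set of ground instantiations of $l$, $N = |\{l \mid l_1, \ldots, l_i\}|$, for $\bowtie\ \in \{\leq, <, =, >, \geq\}$ and $u$ a non-negative integer. $I$ satisfies a ground program $\Pi$ if each $\rho \in \Pi$ is satisfied by $I$. A non-ground rule $\rho$ (resp., a program $\Pi$) is satisfied by an interpretation $I$ iff $I$ satisfies all groundings of $\rho$ (resp., $Gr(\Pi)$). $I \subseteq B_{\mathcal{U}}$ is an answer set of $\Pi$ iff it is a subset-minimal set satisfying the Gelfond-Lifschitz reduct $\Pi^I = \{H(\rho) \leftarrow B^+(\rho) \mid I \cap B^-(\rho) = \emptyset, \rho \in Gr(\Pi)\}$. For a program $\Pi$, we denote the set of its answer sets by $AS(\Pi)$.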
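To make the reduct-based definition concrete, the following is a minimal Python sketch that enumerates the answer sets of a small ground normal program by brute force. It is an illustration only, not how ASP solvers work: the rule representation (a triple of head, positive body and negative body, with `None` as the head of a constraint) and all function names are assumptions made for this example, and disjunction and count expressions are left out.

```python
from itertools import chain, combinations

# A ground normal rule is (head, positive_body, negative_body).
# A constraint uses None as its head. (Representation assumed for this sketch.)

def reduct(program, candidate):
    """Gelfond-Lifschitz reduct: drop rules blocked by the candidate, strip 'not' literals."""
    return [(h, pos) for (h, pos, neg) in program if not (set(neg) & candidate)]

def least_model(positive_program):
    """Least model of a negation-free program; None if a constraint body becomes true."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_program:
            if set(pos) <= model:
                if head is None:
                    return None
                if head not in model:
                    model.add(head)
                    changed = True
    return model

def answer_sets(program):
    """Test every subset of the occurring atoms against the reduct condition."""
    atoms = sorted({a for h, pos, neg in program
                    for a in ([h] if h else []) + list(pos) + list(neg)})
    subsets = chain.from_iterable(combinations(atoms, k) for k in range(len(atoms) + 1))
    return [set(s) for s in subsets if least_model(reduct(program, set(s))) == set(s)]

# Two rules guessing between a and its complement: a :- not na.  na :- not a.
guess = [("a", [], ["na"]), ("na", [], ["a"])]
print(answer_sets(guess))
# Adding the constraint ":- a." removes the answer set containing a.
print(answer_sets(guess + [(None, ["a"], [])]))
```

The pair of mutually blocking rules is the same guessing pattern that a generate-and-test encoding relies on; the constraint then prunes unwanted candidates.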

3 Bounded Models

When reasoning in description logics, models can be of arbitrary cardinality. In many applications, however, the domain of interest is known to be finite. In fact, restricting DL reasoning to models of finite domain size (called finite model reasoning, a natural assumption in database theory), has become the focus of intense studies lately [18, 5, 24].

As opposed to assuming the domain to be merely finite (but of arbitrary, unknown size), we consider the case where the domain has an a priori known cardinality, more precisely, when the domain coincides with the set of named individuals mentioned in the knowledge base. We refer to such models as bounded models and argue that in many applications this modification of the standard DL semantics represents a more intuitive definition of what is considered and expected as model of some knowledge base.1

Definition 1 (Bounded-Model Semantics)

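Let $\mathcal{K}$ be a $\mathcal{SROIQ}$ knowledge base. An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ is said to be individual-bounded w.r.t. $\mathcal{K}$, if all of the following hold:

1. $\Delta^{\mathcal{I}} = \{a \mid a \in N_I(\mathcal{K})\}$,

2. for each individual $a \in N_I(\mathcal{K})$, $a^{\mathcal{I}} = a$.

Accordingly, we call an interpretation $\mathcal{I}$ an (individual-)bounded model of $\mathcal{K}$, if $\mathcal{I}$ is an individual-bounded interpretation w.r.t. $\mathcal{K}$ and $\mathcal{I} \models \mathcal{K}$ holds. A knowledge base $\mathcal{K}$ is called bounded-model-satisfiable if it has a bounded model. We say $\mathcal{K}$ bounded-model-entails an axiom $\alpha$ (written $\mathcal{K} \models_{\mathrm{bm}} \alpha$) if every bounded model of $\mathcal{K}$ is also a model of $\alpha$.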

Note that, under the bounded-model semantics, there is a one-to-one correspondence between (bounded) interpretations and sets of ground facts, if one assumes the set of domain elements fixed and known. That is, for every bounded-model interpretation $\mathcal{I}$, we find exactly one ABox $\mathcal{A}_{\mathcal{I}}$ with atomic concept assertions and role assertions defined by $\mathcal{A}_{\mathcal{I}} = \{A(a) \mid a \in A^{\mathcal{I}}\} \cup \{r(a,b) \mid (a,b) \in r^{\mathcal{I}}\}$ and likewise, every such ABox $\mathcal{A}$ gives rise to a corresponding interpretation $\mathcal{I}_{\mathcal{A}}$. This allows us to use ABoxes as representations of models.
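The correspondence between bounded interpretations and sets of ground facts can be sketched in a few lines of Python. The data layout (dictionaries mapping concept and role names to their extensions, facts as tuples) and the function names are assumptions made for this illustration.

```python
def to_abox(concept_ext, role_ext):
    """Turn an interpretation (extensions of names) into a set of ground facts."""
    facts = {(c, a) for c, ext in concept_ext.items() for a in ext}
    facts |= {(r, a, b) for r, ext in role_ext.items() for (a, b) in ext}
    return facts

def to_interpretation(facts, concepts, roles):
    """Recover the extensions of all concept and role names from a set of ground facts."""
    concept_ext = {c: {f[1] for f in facts if len(f) == 2 and f[0] == c} for c in concepts}
    role_ext = {r: {(f[1], f[2]) for f in facts if len(f) == 3 and f[0] == r} for r in roles}
    return concept_ext, role_ext

def number_of_bounded_interpretations(n_concepts, n_roles, n_individuals):
    """Each possible ground fact is independently present or absent."""
    return 2 ** (n_concepts * n_individuals + n_roles * n_individuals ** 2)

concept_ext = {"A": {"a"}, "B": set()}
role_ext = {"r": {("a", "b")}}
abox = to_abox(concept_ext, role_ext)
print(abox)
print(to_interpretation(abox, ["A", "B"], ["r"]) == (concept_ext, role_ext))
```

Since the domain is fixed, the round trip loses no information, and the count function shows why the search space, although finite, grows exponentially with the number of individuals.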

We briefly demonstrate the effects of bounded model semantics as opposed to finite model semantics (with entailment ) and the classical semantics. Let with , , and . First we note that has a bounded (hence finite) model representable as , thus is satisfiable under all three semantics. Then holds in all models of , therefore , , and . Opposed to this, merely holds in all finite models, whence and , but . Finally, only holds in all bounded models, thus , but and .

3.1 Extraction & Enumeration of Bounded Models

When performing satisfiability checking in DLs (the primary reasoning task considered there), a model constructed by a reasoner merely serves as witness to claim satisfiability, rather than an accessible artifact. However, as mentioned before, our approach aims at scenarios where a knowledge base is a formal problem description for which each model represents one solution. Then, retrieval of one, several, or all models is a natural task, as opposed to merely checking existence. With model extraction we denote the task of materializing an identified model in order to be able to work with it, i.e. to inspect it in full detail and reuse it in downstream processes. The natural continuation of model extraction is to make all models explicit, performing model enumeration. Conveniently, for both tasks we can use the introduced model representation via ABoxes.
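The difference between the two tasks can be illustrated with a lazy generator over sets of ground facts: model extraction takes the first accepted candidate, model enumeration exhausts the generator. The toy knowledge base (ABox {A(a)}, TBox {A ⊑ B}, individuals a and b) and all names below are hypothetical and only serve this sketch; the brute-force search stands in for a real solver.

```python
from itertools import combinations

def bounded_models(atoms, is_model):
    """Lazily yield every set of ground facts accepted by is_model, smallest first."""
    for size in range(len(atoms) + 1):
        for facts in combinations(atoms, size):
            if is_model(set(facts)):
                yield set(facts)

individuals = ["a", "b"]
atoms = [(c, x) for c in ("A", "B") for x in individuals]

def is_model(facts):
    # Hypothetical toy knowledge base: ABox {A(a)}, TBox {A ⊑ B}.
    return ("A", "a") in facts and all(
        ("B", x) in facts for x in individuals if ("A", x) in facts)

first = next(bounded_models(atoms, is_model))       # model extraction
everything = list(bounded_models(atoms, is_model))  # model enumeration
print(first)
print(len(everything))
```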

Most existing DL reasoning algorithms attempt to successively construct a model representation of a given knowledge base. However, most of the existing tableaux reasoners do not reveal the constructed model; moreover, in the non-bounded case models might end up being infinite, such that an explicit representation is impossible. Regarding enumeration, we note that this task is not supported, not even implicitly, by any state-of-the-art DL reasoner, also because in the non-bounded case the number of models is typically infinite and even uncountable. We stick to the notions of model extraction and enumeration, as their meaning should be quite intuitive, although in the general first-order case the term model expansion is used, e.g. in the work of Mitchell and Ternovska [19]. There, an initial (partial) interpretation representing a problem instance is expanded to ultimately become a model for the encoded problem.

3.2 Complexity of Bounded Model Reasoning

The combined complexity of reasoning in $\mathcal{SROIQ}$ over arbitrary interpretations is known to be N2ExpTime-complete [15]. Still, it is considered to be usable in practice since worst-case knowledge bases would be of very artificial nature. Restricting to bounded models leads to a drastic drop in complexity.

Theorem 3.1

The combined complexity of checking bounded-model satisfiability of $\mathcal{SROIQ}$ knowledge bases is NP-complete.

Proof

(Sketch) To show membership, we note that after guessing an interpretation $\mathcal{I}$, (bounded) modelhood can be checked in polynomial time. For this we let $\mathbf{C}$ contain all the concept expressions occurring in $\mathcal{K}$ (including subexpressions). Furthermore, let $\mathbf{R}$ contain all role expressions and role chains (including subchains) occurring in $\mathcal{K}$. Obviously, $\mathbf{C}$ and $\mathbf{R}$ are of polynomial size. Then, in a bottom-up fashion, we can compute the extension $C^{\mathcal{I}}$ of every element $C$ of $\mathbf{C}$ and the extension $r^{\mathcal{I}}$ of every element $r$ of $\mathbf{R}$ along the defined semantics. Obviously, each such computation step requires only polynomial time. Finally, based on the computed extensions, every axiom of $\mathcal{K}$ can be checked, again in polynomial time.
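The polynomial-time modelhood check from the membership argument can be sketched as a recursive evaluator that computes the extension of a concept expression from the extensions of its subexpressions. The tuple encoding of concepts, the dictionary layout of the interpretation, and the function names are assumptions of this illustration; only a fragment of the constructors is covered.

```python
def role_pairs(role, interp):
    """Extension of a role name or of an inverse role ('inv', name)."""
    if isinstance(role, tuple):
        return {(y, x) for (x, y) in interp["roles"][role[1]]}
    return set(interp["roles"][role])

def ext(concept, interp):
    """Extension of a concept expression, computed from its subexpressions."""
    domain = set(interp["domain"])
    op = concept[0]
    if op == "top":
        return domain
    if op == "name":
        return set(interp["concepts"][concept[1]])
    if op == "nominal":
        return {concept[1]}
    if op == "not":
        return domain - ext(concept[1], interp)
    if op == "and":
        return ext(concept[1], interp) & ext(concept[2], interp)
    if op == "or":
        return ext(concept[1], interp) | ext(concept[2], interp)
    if op not in ("some", "all", "atleast"):
        raise ValueError(f"unsupported constructor: {op}")
    # Role restrictions: ('some', r, C), ('all', r, C), ('atleast', n, r, C)
    if op == "atleast":
        n, role, filler = concept[1], concept[2], concept[3]
    else:
        n, role, filler = 1, concept[1], concept[2]
    pairs, target = role_pairs(role, interp), ext(filler, interp)

    def successors_in(x, wanted):
        return sum(1 for (u, v) in pairs if u == x and v in wanted)

    if op == "all":
        return {x for x in domain if successors_in(x, domain - target) == 0}
    return {x for x in domain if successors_in(x, target) >= n}

def satisfies_gci(sub, sup, interp):
    """A GCI sub ⊑ sup holds iff the extension of sub is contained in that of sup."""
    return ext(sub, interp) <= ext(sup, interp)

interp = {"domain": {"a", "b"},
          "concepts": {"A": {"a"}, "B": {"b"}},
          "roles": {"r": {("a", "b")}}}
print(ext(("some", "r", ("name", "B")), interp))
print(satisfies_gci(("name", "A"), ("some", "r", ("name", "B")), interp))
```

Each constructor is handled by a set operation over a domain of fixed size, which is where the polynomial bound per computation step comes from.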

To show hardness, we note that any 3SAT problem can be reduced to bounded-model satisfiability as follows: Let be a set of 3-clauses. Then satisfiability of coincides with the bounded-model satisfiability of the knowledge base containing the two axioms and , where if and if for any propositional symbol .

Note that this finding contrasts with the observation that bounded-model reasoning in first-order logic is PSpace-complete. We omit the full proof here, just noting that membership and hardness can be easily shown based on the fact that checking modelhood in FOL is known to be PSpace-complete [30] and, for the membership part, keeping in mind that NPSpace = PSpace thanks to Savitch’s Theorem [28]. This emphasizes the fact that, while the bounded-model restriction makes reasoning in FOL decidable, restricting to $\mathcal{SROIQ}$ still gives a further advantage in terms of complexity (assuming $\mathrm{NP} \neq \mathrm{PSpace}$).

3.3 Axiomatization of Bounded Models inside SROIQ

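When introducing a new semantics for some logic, it is worthwhile to ask if existing reasoners can be used. Indeed, it is easy to see that, assuming $N_I(\mathcal{K}) = \{a_1, \ldots, a_n\}$, adding the GCI $\top \sqsubseteq \{a_1\} \sqcup \ldots \sqcup \{a_n\}$ as well as the set of inequality axioms containing $a_i \not\approx a_j$ with $i < j$ to $\mathcal{K}$ will rule out exactly all the non-bounded models of $\mathcal{K}$. Denoting these additional axioms with $\mathcal{BM}$, we then find that $\mathcal{K}$ is bounded-model satisfiable iff $\mathcal{K} \cup \mathcal{BM}$ is satisfiable under the classical DL semantics and, likewise, $\mathcal{K}$ bounded-model-entails $\alpha$ iff $\mathcal{K} \cup \mathcal{BM} \models \alpha$ for any axiom $\alpha$. Consequently, any off-the-shelf $\mathcal{SROIQ}$ reasoner can be used for bounded-model reasoning, at least when it comes to the classical reasoning tasks.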
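As a rough illustration of the size and shape of this axiomatization, the following Python sketch generates the additional axioms as plain strings in DL notation for a given list of individual names. The function name and the string rendering are assumptions made for this example; an actual implementation would emit OWL axioms through an ontology API.

```python
def bounding_axioms(individuals):
    """One closure GCI over all nominals plus pairwise inequality axioms."""
    closure = "⊤ ⊑ " + " ⊔ ".join("{" + a + "}" for a in individuals)
    distinct = [f"{a} ≉ {b}"
                for i, a in enumerate(individuals)
                for b in individuals[i + 1:]]
    return [closure] + distinct

for axiom in bounding_axioms(["a", "b", "c"]):
    print(axiom)
```

For n individuals this yields a single disjunction with n nominals and n(n-1)/2 inequality axioms, which hints at the combinatorial load placed on a classical reasoner.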

However, the fact that the currently available DL reasoners are not optimized towards reasoning with axioms of the prescribed type (featuring disjunctions over potentially large sets of individuals) and that available reasoners do not support model extraction and model enumeration led us to develop an alternative computational approach based on ASP.

4 Encoding SROIQ Knowledge Bases into ASP

We propose an encoding of an arbitrary $\mathcal{SROIQ}$ knowledge base $\mathcal{K}$ into an answer set program $\Pi(\mathcal{K})$, such that the set of answer sets $AS(\Pi(\mathcal{K}))$ coincides with the set of bounded models of the given knowledge base. This allows us to use existing ASP machinery to perform both standard reasoning as well as model extraction and model enumeration quite elegantly. Intuitively, the set of all bounded models defines a search space, which can be traversed searching for models, guided by appropriate constraints. We thus propose an ASP encoding consisting of a generating part $\Pi_{gen}(\mathcal{K})$, defining all potential candidate interpretations, and a constraining part $\Pi_{chk}(\mathcal{K})$, ruling out interpretations violating the knowledge base.

Our translation into ASP requires a knowledge base in normal form which can be obtained by an easy syntactic transformation.

Definition 2 (Normalized Form [22])

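A GCI is normalized, if it is of the form $\top \sqsubseteq \bigsqcup_{i=1}^{n} C_i$, where $C_i$ is of the form $B$, $\{a\}$, $\forall r.B$, $\exists r.\mathit{Self}$, $\neg\exists r.\mathit{Self}$, $\geq n\, r.B$, or $\leq n\, r.B$, for $B$ a literal concept, $r$ a role, and $n$ a positive integer. A TBox $\mathcal{T}$ is normalized, if each GCI in $\mathcal{T}$ is normalized. An ABox $\mathcal{A}$ is normalized if each concept assertion in $\mathcal{A}$ contains only a literal concept, each role assertion in $\mathcal{A}$ contains only an atomic role, and $\mathcal{A}$ contains at least one assertion. An RBox $\mathcal{R}$ is normalized, if each role inclusion axiom is of the form $r \sqsubseteq s$ or $s_1 \circ s_2 \sqsubseteq r$. A $\mathcal{SROIQ}$ knowledge base $\mathcal{K} = (\mathcal{A}, \mathcal{T}, \mathcal{R})$ is normalized if $\mathcal{A}$, $\mathcal{T}$, and $\mathcal{R}$ are normalized.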

Given $\mathcal{K}$, the normalized form $\Omega(\mathcal{K})$ is obtained by applying a transformation $\Omega$, given in Table 3, which is mainly standard in DLs [22]. The normalized knowledge base $\Omega(\mathcal{K})$ is a model-conservative extension of $\mathcal{K}$, i.e. every (bounded) model of $\Omega(\mathcal{K})$ is a (bounded) model of $\mathcal{K}$ and every (bounded) model of $\mathcal{K}$ can be turned into a (bounded) model of $\Omega(\mathcal{K})$ by finding appropriate interpretations for the concepts and roles introduced by $\Omega$. It is therefore straightforward to extract a model for $\mathcal{K}$, given a model of $\Omega(\mathcal{K})$. In the remainder, we will assume a knowledge base in normalized form, if not stated otherwise.

4.1 Candidate Generation

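As shown, any potential bounded interpretation is induced by a set of individual assertions $\mathcal{B}$, such that for each concept name $A$, role name $r$ and individuals $a, b$ occurring in $\mathcal{K}$, either $A(a) \in \mathcal{B}$ or $\neg A(a) \in \mathcal{B}$, and either $r(a,b) \in \mathcal{B}$ or $\neg r(a,b) \in \mathcal{B}$. This construction is straightforward to encode via the following rules: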

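$$
\begin{aligned}
\Pi_{gen}(\mathcal{K}) := {} & \{A(X) \text{ :- } not\ \neg A(X), \top(X) \mid A \in N_C(\mathcal{K})\} \cup && (1)\\
& \{\neg A(X) \text{ :- } not\ A(X), \top(X) \mid A \in N_C(\mathcal{K})\} \cup && (2)\\
& \{ar(r,X,Y) \text{ :- } not\ \neg ar(r,X,Y), \top(X), \top(Y) \mid r \in N_R(\mathcal{K})\} \cup && (3)\\
& \{\neg ar(r,X,Y) \text{ :- } not\ ar(r,X,Y), \top(X), \top(Y) \mid r \in N_R(\mathcal{K})\} \cup && (4)\\
& \{\top(a) \mid a \in N_I(\mathcal{K})\}. && (5)
\end{aligned}
$$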

Recall that a rule is unsafe if a variable that occurs in the head does not occur in any positive body literal. The predicate $\top$ ensures safe rules; each of the guessing rules (1)-(4) would otherwise be unsafe. This predicate represents the $\top$-concept, to which statement (5) asserts each individual present in $\mathcal{K}$. The function $ar$ takes care of potential inverse roles (cf. Table 4). Whereas “$not$” denotes default negation, $\neg$ is without attached semantics and merely used as syntactic counterpart to the DL vocabulary. We show now that $\Pi_{gen}(\mathcal{K})$ computes $\mathbb{B}_{\mathcal{K}}$, the set of all constructible $\mathcal{B}$, and each $\mathcal{B}$ determines a solution of $\Pi_{gen}(\mathcal{K})$.

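Proposition 1 ($\mathbb{B}_{\mathcal{K}} = AS(\Pi_{gen}(\mathcal{K}))$)

Let $\mathcal{K}$ be a $\mathcal{SROIQ}$ knowledge base and $\Pi_{gen}(\mathcal{K})$ the logic program obtained by the translation given in (1)-(5). Then, it holds that $\mathbb{B}_{\mathcal{K}}$ coincides with the set of all answer sets of $\Pi_{gen}(\mathcal{K})$.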
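The generating part can be sketched as a small Python function that prints rules in the textual syntax accepted by common ASP solvers. Several simplifications here are assumptions of this sketch rather than part of the translation described above: names are taken to be valid lowercase ASP identifiers, the symbol ¬ is rendered by a hypothetical `neg_` prefix (an ordinary predicate name without built-in meaning), `top` stands for the ⊤ predicate, and only atomic roles are handled, so the inverse-role function is not modeled.

```python
def pi_gen(concepts, roles, individuals):
    """Emit guessing rules for concepts and atomic roles plus top/1 facts."""
    rules = []
    for c in concepts:
        rules.append(f"{c}(X) :- not neg_{c}(X), top(X).")
        rules.append(f"neg_{c}(X) :- not {c}(X), top(X).")
    for r in roles:
        rules.append(f"{r}(X,Y) :- not neg_{r}(X,Y), top(X), top(Y).")
        rules.append(f"neg_{r}(X,Y) :- not {r}(X,Y), top(X), top(Y).")
    rules += [f"top({a})." for a in individuals]
    return rules

def candidate_count(n_concepts, n_roles, n_individuals):
    """Number of candidates: one binary choice per possible ground fact."""
    return 2 ** (n_concepts * n_individuals + n_roles * n_individuals ** 2)

print("\n".join(pi_gen(["c"], ["r"], ["i1", "i2"])))
print(candidate_count(1, 1, 2))
```

Every variable in a rule head also occurs in a positive `top` literal, which keeps the rules safe for grounding.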

4.2 Axiom Encoding

In the next step, we turn each axiom $\alpha \in \mathcal{K}$ into a constraint, ultimately ruling out those candidate interpretations not satisfying $\alpha$. Moreover, each individual assertion in the ABox restricts the search space further, since for some present fact $A(a)$ any solution candidate containing $\neg A(a)$ is eliminated. We will successively introduce appropriate encodings for axioms of each knowledge base component, altogether manifested in the program $\Pi_{chk}(\mathcal{K})$, and will finally show that the program $\Pi(\mathcal{K}) = \Pi_{gen}(\mathcal{K}) \cup \Pi_{chk}(\mathcal{K})$ computes all bounded models of $\mathcal{K}$.

Encoding TBox Axioms

Since $\mathcal{T}$ is normalized, each GCI is of a certain form, which simplifies the encoding. We obtain $\Pi_{chk}(\mathcal{T})$ as follows:

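$$
\Pi_{chk}(\mathcal{T}) := \Big\{ \text{:- } trans(C_1), \ldots, trans(C_n) \;\Big|\; \text{for each } \top \sqsubseteq \bigsqcup_{i=1}^{n} C_i \text{ in } \mathcal{T} \Big\} \qquad (6)
$$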

Each concept expression $C_i$ is translated according to the function $trans(C_i)$ depicted in Table 4. Note that each $C_i$ is one of the forms given in Definition 2 (the ones listed in the first column of the table), i.e. not complex, with the nice effect that $trans$ can be realized non-recursively.

Encoding RBox Axioms

Role assertions and role inclusion axioms are also transformed into constraints, grouped in the program $\Pi_{chk}(\mathcal{R})$. According to their DL semantics, this yields:

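$$
\begin{aligned}
\Pi_{chk}(\mathcal{R}) := {} & \{\text{:- } ar(r,X,Y), not\ ar(s,X,Y) \mid r \sqsubseteq s \in \mathcal{R}\} \cup && (7)\\
& \{\text{:- } ar(s,X,Y), ar(r,X,Y) \mid \mathit{Dis}(r,s) \in \mathcal{R}\} \cup && (8)\\
& \{\text{:- } ar(s_1,X,Y), ar(s_2,Y,Z), not\ ar(r,X,Z) \mid s_1 \circ s_2 \sqsubseteq r \in \mathcal{R}\}. && (9)
\end{aligned}
$$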
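Read declaratively, each of these constraints rejects a candidate interpretation in which a particular pattern of role facts occurs. The following Python sketch performs the same three checks directly on role extensions given as sets of pairs; the function name and the data layout are assumptions of this illustration, and inverse roles are not handled.

```python
def violates_rbox(roles, inclusions=(), disjoint=(), chains=()):
    """Return True if the role extensions break any of the given RBox axioms."""
    for r, s in inclusions:          # r ⊑ s: every r-pair must be an s-pair
        if roles[r] - roles[s]:
            return True
    for r, s in disjoint:            # Dis(r, s): no shared pair allowed
        if roles[r] & roles[s]:
            return True
    for s1, s2, r in chains:         # s1 ∘ s2 ⊑ r: composed pairs must be r-pairs
        composed = {(x, z) for (x, y) in roles[s1] for (y2, z) in roles[s2] if y == y2}
        if composed - roles[r]:
            return True
    return False

roles = {"r": {("a", "b")},
         "s": {("a", "b"), ("b", "c")},
         "t": {("a", "c")}}
print(violates_rbox(roles, inclusions=[("r", "s")]))
print(violates_rbox(roles, disjoint=[("r", "s")]))
print(violates_rbox(roles, chains=[("r", "s", "t")]))
```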

Encoding ABox Axioms

The ABox itself represents an input database, which we can directly use. However, it remains to check that $\mathcal{A}$ does not contain contradictory knowledge, i.e. propositional clashes of the form $\{A(a), \neg A(a)\}$. Hence, the program $\Pi_{chk}(\mathcal{A})$ consists of $\mathcal{A}$ and one additional constraint for each concept and role name, ruling out inconsistent input ABoxes.

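$$
\begin{aligned}
\Pi_{chk}(\mathcal{A}) := {} & \mathcal{A} \cup && (10)\\
& \{\text{:- } A(X), \neg A(X) \mid A \in N_C(\mathcal{K})\} \cup && (11)\\
& \{\text{:- } ar(r,X,Y), \neg ar(r,X,Y) \mid r \in N_R(\mathcal{K})\}. && (12)
\end{aligned}
$$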

Note that the presence of $\neg A(a)$ does not cause an unsatisfiable program under the answer set semantics, since $\neg$ does not have any meaning under the semantics; $\neg A$ is treated as just another predicate name. Thus, the imposed constraints simulate the known DL semantics.

Theorem 4.1

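Let $\mathcal{K}$ be a normalized $\mathcal{SROIQ}$ knowledge base, and $\Pi(\mathcal{K})$ be the program obtained by applying Rules (1)-(12). Then, it holds that $AS(\Pi(\mathcal{K})) = \{\mathcal{B} \mid \mathcal{B} \in \mathbb{B}_{\mathcal{K}} \text{ and } \mathcal{I}_{\mathcal{B}} \models \mathcal{K}\}$.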

With this theorem in place, we benefit from the translation in many aspects. Most notably, in addition to the standard DL reasoning tasks, model extraction and model enumeration can be carried out without additional efforts, since both are natural tasks for answer set solvers.

5 Evaluation

We implemented our approach as an open-source tool, named .2 The obtained logic programs can be evaluated with most modern ASP solvers. However, the evaluation was conducted using [6] for grounding and solving, since it currently is the most prominent solver leading the latest competitions [2]. We present preliminary evaluation results based on simple ontologies, encoding constraint-satisfaction-type combinatorial problems. Existing OWL ontologies typically used for benchmarking, e.g. SNOMED or GALEN [29, 23], do not fit our purpose, since they are modeled with the classical semantics in mind and often have little or no ABox information.

Our tests provide runtimes of and the popular reasoner [10]. Whereas a direct comparison would not be fair, the conducted tests shall merely show the feasibility of our approach and the infeasibility of the axiomatization using standard DL reasoners. The evaluation itself is conducted on a standard desktop machine.3

5.1 Unsatisfiability

We construct an unsatisfiable knowledge base , with and as follows:

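$$
\mathcal{T}_n = \{A_1 \sqsubseteq \exists r.A_2, \ldots, A_n \sqsubseteq \exists r.A_{n+1}\} \cup \{A_i \sqcap A_j \sqsubseteq \bot \mid 1 \leq i < j \leq n+1\} \qquad (13)
$$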

Inspired by common pigeonhole-type problems, we enforce an $r$-chain of length $n+1$ without repeating elements; yet, given only $n$ individuals, such a model cannot exist. Table 5 depicts the runtimes for detecting unsatisfiability of , for increasing $n$. The durations correspond to the pure solving time of and pure reasoning time of , respectively, as both and have a comparable preprocessing. As the figures suggest, is a potential worst-case scenario, where both tools are doomed to test all combinations. On this task, constantly outperforms . For , is stopped after minutes, whereas detects unsatisfiability within seconds.
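The pigeonhole argument can be checked by brute force for very small $n$. The sketch below enumerates all bounded interpretations over $n$ individuals and tests the chain axioms $A_i \sqsubseteq \exists r.A_{i+1}$ and the pairwise disjointness of the $A_i$. The ABox is not recoverable from the text above, so the sketch assumes a single assertion $A_1(a_1)$, without which the empty interpretation would trivially satisfy the TBox; the function name and the bit-vector encoding are likewise assumptions. The search is exponential and only meant for $n \leq 2$.

```python
from itertools import product

def pigeonhole_satisfiable(n, disjoint=True):
    """Search all bounded interpretations over n individuals for a model."""
    inds = range(n)
    k = n + 1                                   # concepts A_1 .. A_{n+1}
    pairs = [(x, y) for x in inds for y in inds]
    for cbits in product([False, True], repeat=k * n):
        member = [cbits[i * n:(i + 1) * n] for i in range(k)]
        if not member[0][0]:                    # assumed ABox assertion A_1(a_1)
            continue
        if disjoint and any(member[i][x] and member[j][x]
                            for x in inds for i in range(k) for j in range(i + 1, k)):
            continue
        for rbits in product([False, True], repeat=n * n):
            r = {p for p, bit in zip(pairs, rbits) if bit}
            # A_i ⊑ ∃r.A_{i+1}: every member of A_i needs an r-successor in A_{i+1}
            if all(any((x, y) in r and member[i + 1][y] for y in inds)
                   for i in range(n) for x in inds if member[i][x]):
                return True
    return False

print(pigeonhole_satisfiable(2))                  # with disjointness axioms
print(pigeonhole_satisfiable(2, disjoint=False))  # chain axioms only
```

Dropping the disjointness axioms makes the knowledge base satisfiable, which shows that the clash really stems from having one concept more than there are individuals.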

5.2 Model Extraction and Model Enumeration

With Table 6, we next provide some figures for model extraction and partial enumeration (retrieving a given number of bounded models). To this end, we created a knowledge base modeling fully and correctly filled Sudokus, featuring named individuals, concept names and role name. When invoking a satisfiability test on this knowledge base using , no answer was given within minutes.

On average, provides a solution for a given Sudoku instance in around seconds, of which more than seconds are needed for grounding, while the actual solving is done in less than seconds. For model enumeration, we used the knowledge base but removed information concerning pre-filled cells, turning the task into generating new Sudoku instances. The size of the grounded program is MB, the grounding process taking around seconds as reflected in Table 6.

6 Conclusion

With this paper, we have established a starting point for further developments on both the theoretical and the practical side, and we can identify benefits for both the description logic and the logic programming community. For the latter, our approach enables one to use OWL as an ASP modeling language and therefore to make use of the available tool support. Although modeling features are limited, we argue that quite large and involved problem scenarios can be modeled in OWL ontologies. Clearly, evaluations of our system with respect to such ontologies remain an imperative issue.

Complementarily, model extraction and enumeration supplement DL reasoning tasks for which our ASP translation represents not only a feasible approach, but apparently also a use case of ASP in another research field. Moreover, the framework may be extended to realize non-standard reasoning tasks useful for debugging purposes such as axiom pinpointing, explanation, justification and abduction, exploiting the innate capabilities of ASP to realize minimization as well as model enumeration.

On a more practical level, the proposed translation can certainly be optimized to exploit more built-in features of today’s ASP solvers. In terms of harnessing the convenience of OWL modeling environments, we will implement an OWL API reasoner interface for , so that it can, e.g., be seamlessly integrated with other OWL software, such as Protégé [16].

Regarding future theoretical DL investigations, in recent years, significant extensions of the modeling and querying capabilities of DLs have been proposed and partially implemented. A major such extension is considering the reasoning task of answering queries, most prominently (unions of) conjunctive queries, positive queries, conjunctive 2-way regular path queries, and monadically defined queries subsuming all of the former [27]. It is not overly difficult to show that answering all these query types over knowledge bases (and hence over OWL ontologies) under the bounded model semantics is -complete, which again contrasts with the much worse results (if any) for the unbounded case [26, 11]. Moreover, as all these query formalisms can be straightforwardly expressed in a rule-based way, an integration in our framework is immediate. In the same way, rule-based extensions of OWL – monotonic [14, 21] or nonmonotonic [20, 1] – should be straightforward to accommodate, at the cost of the combined complexity jumping to ExpTime or NExpTime.

Acknowledgements

We are grateful for all the valuable feedback from our colleagues and the anonymous workshop reviewers, which helped greatly to improve this work.

Appendix A Proofs

Proof of Theorem 4.1

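By Proposition 1, $\Pi_{gen}(\mathcal{K})$ computes the set $\mathbb{B}_{\mathcal{K}}$. It remains to show that $\Pi_{chk}(\mathcal{K})$ obeys the bounded model semantics, and consequently excludes each $\mathcal{B}$ not inducing a bounded model $\mathcal{I}_{\mathcal{B}}$.

$AS(\Pi(\mathcal{K})) \subseteq \{\mathcal{B} \mid \mathcal{B} \in \mathbb{B}_{\mathcal{K}} \text{ and } \mathcal{I}_{\mathcal{B}} \models \mathcal{K}\}$

Let $\mathcal{B}$ be an answer set of $\Pi(\mathcal{K})$. From Proposition 1, we know $\mathcal{B} \in \mathbb{B}_{\mathcal{K}}$. We show now that the interpretation $\mathcal{I}_{\mathcal{B}}$ induced by $\mathcal{B}$ is a bounded model of $\mathcal{K}$, and therefore $\mathcal{I}_{\mathcal{B}} \models \alpha$, for each axiom $\alpha \in \mathcal{K}$. Then, let

• $\alpha \in \mathcal{R}$: we distinguish role disjointness and role inclusion axioms: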

• : Let , then by definition of , there is a ground constraint in , for all individuals . Since is an answer set, . Consequently either , or , hence .

• : then let be of the form , with , and be the ground constraint in . Since is an answer set, we have that, if and implies . And consequently , and , thus .

• : then is normalized and of the form . In Rule (6), is turned into a constraint in . Since is an answer set, it does not violate any of the grounded instances of in . Suppose now, towards a contradiction, that induced by does not satisfy , . Then, , for all . However, since does not violate , in each of the ground instantiations of , there exists a which is not satisfied by , . Then, is one of the expressions given in Definition 2, and we distinguish:

• : then , and for any . Consequently , which contradicts the assumption .

• : then , and for any . Consequently , which contradicts the assumption .

• : then and , thus necessarily . In order to not satisfy , . Consequently we have with as nominal guard concept, and therefore , which contradicts the assumption.

• : then