
# Markov Logic Networks with Statistical Quantifiers

Authors appear in strict alphabetical order.

Víctor Gutiérrez-Basulto
Cardiff University, UK
gutierrezbasultov@cardiff.ac.uk

Jean Christoph Jung
University of Bremen, Germany
KU Leuven, Belgium
jeanjung@uni-bremen.de

Ondřej Kuželka
KU Leuven, Belgium
ondrej.kuzelka@kuleuven.be

###### Abstract

Markov Logic Networks (MLNs) are well-suited for expressing statistics such as “with high probability a smoker knows another smoker” but not for expressing statements such as “there is a smoker who knows most other smokers”, which is necessary for modeling, e.g., influencers in social networks. To overcome this shortcoming, we investigate quantified MLNs, which generalize MLNs by introducing statistical universal quantifiers that make it possible to express the latter type of statistics in a principled way. Our main technical contribution is to show that the standard reasoning tasks in quantified MLNs, maximum a posteriori and marginal inference, can be reduced to their respective MLN counterparts in polynomial time.


## 1 Introduction

Markov Logic Networks (MLNs) (?) extend first-order logic (FOL) with means to capture uncertainty. Intuitively, this is achieved by softening the meaning of FOL formulas: each formula is associated with a weight such that the higher the weight, the higher the probability that the formula is satisfied. MLNs thus provide a compact representation of large Markov networks with repeated substructures. The kind of statistical regularities (holding for a given problem) that an MLN can encode depends directly on the type of quantifiers available in the language. Since MLNs are based on FOL, they come equipped with the standard $\forall$ and $\exists$ quantifiers of FOL. However, it has been observed that the modeling capabilities of these quantifiers might not be appropriate for application scenarios that require a form of quantification describing, e.g., *most*, *few*, or *at least* thresholds; for more details see, e.g., (??) and references therein, and Sec. 8 below.

With a similar motivation, in this paper we introduce an extension of MLNs, quantified MLNs (QMLNs), which generalizes standard MLNs with the statistical quantifier $\forall^*$. Indeed, standard MLNs lack means to describe certain types of statistics, e.g. the proportion of elements, say people, that are maximally connected to others. As a concrete example, in QMLNs we can write the following sentence using the universal statistical quantifier $\forall^*$:

$$\exists x\,\forall^* y : (\mathit{smoker}(x) \wedge \mathit{knows}(x,y) \wedge \mathit{smoker}(y)) \tag{1}$$

Roughly, this measures the proportion of the population that are smokers and are known by one particular smoker, namely one who knows the largest number of other smokers. This kind of modeling capability might be useful, for instance, in social network analysis to model influencers.

The study of QMLNs also has a strong theoretical motivation given the correspondence of MLNs with quantifier-free formulas and maximum entropy models constrained by statistics based on the random substitution semantics (??). Following arguments from (?), one can easily extend this correspondence to restricted QMLNs containing only formulas of the form

$$\alpha = Q_1 x_1 \ldots Q_n x_n, Q_{n+1} x_{n+1} \ldots Q_k x_k : \varphi(x_1,\ldots,x_k)$$

where $Q_i = \forall^*$ and $Q_j \in \{\forall, \exists\}$ for all $1 \le i \le n$ and $n < j \le k$. The statistics given by the above formula can be intuitively understood as follows. The statistical quantifiers correspond to uniform sampling of grounding substitutions of the respective bound variables. Hence, in this case, the respective statistic represents the probability that the formula is true in a given possible world after we ground the variables $x_1$, $\ldots$, $x_n$ using the randomly sampled substitution. However, it is not straightforward to extend these statistics to arbitrary quantifier prefixes, i.e. to allow any order of the quantifiers $\forall^*$, $\forall$ and $\exists$, as needed, for instance, for formulas like (1) above. In this paper, we address this problem by investigating (unrestricted) QMLNs.

Objective and Contributions. The main objective of this paper is to propose QMLNs as an extension of MLNs with means to express more complex statistics, and to develop technical foundations for them. Our main technical contributions are (i) the establishment of basic properties of QMLNs, analogous to those existing for standard MLNs; (ii) a generalization of the random substitution semantics to QMLNs; and (iii) a polynomial-time translation from QMLNs to MLNs, yielding a polynomial-time reduction of maximum a posteriori and marginal inference in QMLNs to their respective counterparts in standard MLNs. Furthermore, we pinpoint certain implications of extending MLNs to QMLNs in the context of symmetric weighted first-order model counting (WFOMC).

An extended version with an appendix can be found under https://bit.ly/2x5E9Rx.

## 2 Background and Notation

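FOL. We assume the reader is familiar with the function-free finite-domain fragment of first-order logic (FOL); see the appendix for details. Throughout the paper we use $\Delta$ to denote a finite set of constants and $\mathcal{V}$ an infinite set of variables. A term is an element of $\Delta \cup \mathcal{V}$. An atom is an expression of the form $R(t_1,\ldots,t_n)$, where $R$ is a predicate name with arity $n$ and $t_1,\ldots,t_n$ are terms. Given an FOL formula $\varphi$, a variable $x$ and a constant $a$, we use $\varphi[x/a]$ to denote the result of substituting every occurrence of $x$ in $\varphi$ with $a$. For tuples $\mathbf{x} = (x_1,\ldots,x_n)$ and $\mathbf{a} = (a_1,\ldots,a_n)$ of variables and constants, respectively, we write $\varphi[\mathbf{x}/\mathbf{a}]$ to denote the application of $\varphi[x_i/a_i]$ for all $1 \le i \le n$. For an atom $R(t_1,\ldots,t_n)$, we denote with $\mathit{vars}(R(t_1,\ldots,t_n))$ the set $\{t_1,\ldots,t_n\} \cap \mathcal{V}$, and if $\mathit{vars}(R(t_1,\ldots,t_n)) = \emptyset$, then we say that $R(t_1,\ldots,t_n)$ is ground. A ground formula is one containing only ground atoms. The grounding of a formula $\varphi$ is the set of all ground formulas obtained from $\varphi$ by substituting all its free variables with any possible combination of constants from $\Delta$; we denote with $\mathit{gr}(\varphi)$ the grounding of $\varphi$.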

Given a vocabulary $\sigma$ and a domain $\Delta$, a $\sigma$-structure over $\Delta$ is any set $\omega$ consisting only of facts of the form $R(a_1,\ldots,a_n)$ such that $R \in \sigma$, $R$ has arity $n$, and $a_1,\ldots,a_n \in \Delta$. We denote with $\Omega(\sigma,\Delta)$ the set of all $\sigma$-structures over $\Delta$ and refer to the members of $\Omega(\sigma,\Delta)$ as possible worlds. Throughout the paper we assume that a fixed vocabulary $\sigma$ is given and that all formulas use only predicate names from $\sigma$. The semantics of FOL is defined as usual in the context of MLNs. More precisely, we use only finite domains and assume that always . Formally, we write $\omega \models \varphi$ when a sentence $\varphi$ is satisfied in a structure $\omega$. Given a set $\Phi$ of sentences, we write $\omega \models \Phi$ if $\omega \models \varphi$ for all $\varphi \in \Phi$.

MLNs. A Markov Logic Network (MLN) is a finite set of weighted formulas $(\varphi, w)$, where $w \in \mathbb{R} \cup \{\infty\}$ is a weight and $\varphi$ is a FOL formula. If $w = \infty$ then $(\varphi, w)$ is called a hard constraint, otherwise a soft constraint. For an MLN $\Phi$, we use $\Phi_S$ and $\Phi_H$ to denote, respectively, the set of soft and hard constraints in $\Phi$. We sometimes view $\Phi_H$ simply as a set of FOL formulas, i.e. disregarding the weight component.

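The semantics of MLNs is defined as follows. Given an MLN $\Phi$ and a domain $\Delta$, $\Phi$ represents the following probability distribution over $\Omega(\sigma,\Delta)$:

$$p_\Phi(\omega) = \begin{cases} \dfrac{1}{Z}\exp\Big(\sum_{(\varphi_i,w_i)\in\Phi_S} w_i \cdot N(\varphi_i,\omega)\Big) & \text{if } \omega \models \Phi_H, \\ 0 & \text{otherwise,} \end{cases}$$

where $\omega$ is a world, $N(\varphi_i,\omega)$ is the number of ground formulas $\psi$ in the grounding of $\varphi_i$ such that $\omega \models \psi$, and $Z$ is the normalization constant.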

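Reasoning Problems. We study two reasoning problems: maximum a posteriori (MAP) inference, that is, given an MLN $\Phi$ and a domain $\Delta$, determine the world $\omega \in \Omega(\sigma,\Delta)$ maximizing $p_\Phi(\omega)$; and marginal inference (MARG), that is, given $\Phi$, $\Delta$ and a FOL sentence $\varphi$, determine $\Pr_{\Phi,\Delta}(\varphi)$ defined by

$$\Pr\nolimits_{\Phi,\Delta}(\varphi) = P_{\omega \sim p_\Phi(\omega)}[\omega \models \varphi],$$

that is, the probability of a world satisfying $\varphi$.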

### 2.1 Relational Marginal Problems

MLNs containing only quantifier-free FOL formulas can also be seen as solutions to a maximum entropy problem (?) (defined below) constrained by statistics which are based on the random-substitution semantics (??). Such statistics are defined as follows. For a possible world $\omega \in \Omega(\sigma,\Delta)$ and a FOL formula $\alpha(\mathbf{x})$, where $\mathbf{x} = (x_1,\ldots,x_k)$ with $x_1,\ldots,x_k$ the free variables of $\alpha$, the statistic of $\alpha$ is defined as $Q_\omega(\alpha) = P_{\mathbf{a} \sim \mathrm{Unif}(\Delta^k)}[\omega \models \alpha[\mathbf{x}/\mathbf{a}]]$, where $\mathrm{Unif}(\Delta^k)$ denotes the uniform distribution over elements of the set $\Delta^k$. Intuitively, the statistic measures how likely it is that the formula is satisfied when a random substitution of domain elements for its free variables is picked. The statistics can then be straightforwardly extended to statistics of probability distributions as follows. Let $p$ be a distribution over possible worlds from $\Omega(\sigma,\Delta)$; then $Q[\alpha]$ is defined as

$$Q[\alpha] = \mathbb{E}_\omega[Q_\omega(\alpha)] = \sum_{\omega \in \Omega(\sigma,\Delta)} p(\omega) \cdot Q_\omega(\alpha).$$

Now, we introduce the announced maximum entropy problem. Given a set of formulas $\alpha_1,\ldots,\alpha_m$, the maximum entropy relational marginal problem is the problem of finding a maximum-entropy distribution over possible worlds from $\Omega(\sigma,\Delta)$ satisfying the constraints $Q[\alpha_1] = q_1$, $\ldots$, $Q[\alpha_m] = q_m$. Here, the values $q_1$, $\ldots$, $q_m$ are typically estimated from data. Note that the solution of the relational marginal problem is a Markov logic network given by $\Phi = \{(\alpha_1,w_1),\ldots,(\alpha_m,w_m)\}$, where the weights $w_i$ are obtained from the dual problem of the maximum entropy relational marginal problem. In particular, when the constraints are estimated from a training example (possible world) $\hat{\omega} \in \Omega(\sigma,\hat{\Delta})$ and $|\hat{\Delta}| = |\Delta|$, then the MLN obtained from the maximum entropy problem is the same as the MLN that would have been obtained directly by maximum likelihood estimation on $\hat{\omega}$. However, when $|\hat{\Delta}| \neq |\Delta|$, the results obtained using maximum likelihood estimation are not statistically consistent (?), unlike the MLNs obtained using the relational marginal approach. Hence, the relational marginal view is more general from the statistical point of view; we refer to (?) for a detailed discussion.

In this work, the relational marginal view plays a key role in the development of the semantics for MLNs with statistical quantifiers.

## 3 Quantified Markov Logic Networks

We introduce the notion of quantified Markov Logic Networks (QMLNs), a generalization of standard Markov logic networks capable of expressing expectations using “statistical” quantifiers. In QMLNs, the main ingredients of MLNs, namely weighted formulas $(\varphi, w)$ with a weight $w$ and a FOL formula $\varphi$, are replaced with weighted quantified sentences $(\alpha, w)$.

A quantified sentence is a formula of the shape

$$\alpha = Q_1 x_1 \ldots Q_n x_n : \psi(x_1,\ldots,x_n),$$

where each $Q_i$ is a quantifier from $\{\forall, \forall^*, \exists\}$ and $\psi$ is a classical first-order formula with free variables precisely $x_1,\ldots,x_n$. Note that every FOL sentence is also a quantified sentence, but conversely a quantified sentence using the quantifier $\forall^*$ is not a FOL sentence.

###### Definition 1 (QMLN).

A Quantified Markov Logic Network (QMLN) is a finite set $\Phi$ of pairs $(\alpha, w)$ such that $\alpha$ is a quantified sentence and $w \in \mathbb{R} \cup \{\infty\}$.

Before we can give the semantics of QMLNs, we give the semantics for quantified sentences. Intuitively, given a quantified sentence $\alpha$ and a possible world $\omega$, we measure the extent to which $\alpha$ is satisfied in $\omega$.

###### Definition 2 (Sentence Statistics).

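Let $\omega \in \Omega(\sigma,\Delta)$ be a possible world and $\alpha$ be a quantified sentence. Then the $\alpha$-statistic of $\omega$, denoted $Q_\omega(\alpha)$, is defined as follows:

• if $\alpha$ is an FOL sentence, then

$$Q_\omega(\alpha) = \begin{cases} 1 & \text{if } \omega \models \alpha, \\ 0 & \text{otherwise;} \end{cases} \tag{2}$$

• if $\alpha = \forall x\, \alpha'$ is not an FOL sentence, then

$$Q_\omega(\alpha) = \min_{a \in \Delta} Q_\omega(\alpha'[x/a]); \tag{3}$$

• if $\alpha = \exists x\, \alpha'$ is not an FOL sentence, then

$$Q_\omega(\alpha) = \max_{a \in \Delta} Q_\omega(\alpha'[x/a]); \tag{4}$$

• if $\alpha = \forall^* x\, \alpha'$ is not an FOL sentence, then

$$Q_\omega(\alpha) = \frac{1}{|\Delta|} \sum_{a \in \Delta} Q_\omega(\alpha'[x/a]). \tag{5}$$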

Alternatively, the case of the $\forall^*$ quantifier, cf. (5) above, can be expressed as

$$Q_\omega(\alpha) = \mathbb{E}_{a \sim \mathrm{Unif}(\Delta)}[Q_\omega(\alpha'[x/a])] \tag{6}$$

where the expectation is w.r.t. a uniform distribution of $a$ over $\Delta$. From this we see that the definition of statistics given by Definition 2 generalizes that of statistics based on the random substitution semantics, cf. Section 2.1 above.
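The recursion of Definition 2 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the encoding of a world as a set of ground facts, the encoding of a prefix as a list of `(quantifier, variable)` pairs, and the names `statistic` and `knows` are assumptions made for this example. The matrix is passed as a Python predicate. Applying `min` and `max` to classical quantifiers at every level agrees with case (2), since the minimum or maximum of 0/1 values is the classical truth value.

```python
from fractions import Fraction

def statistic(prefix, matrix, world, domain, env=None):
    """Sentence statistic Q_omega(alpha) for alpha = prefix : matrix.

    prefix -- list of (quantifier, variable) pairs, with quantifier in
              {"forall", "exists", "forall*"} (hypothetical encoding)
    matrix -- predicate taking (world, assignment dict) and returning bool
    """
    env = env or {}
    if not prefix:
        # No quantifier left: indicator of omega |= psi under env.
        return Fraction(1 if matrix(world, env) else 0)
    (quantifier, var), rest = prefix[0], prefix[1:]
    values = [statistic(rest, matrix, world, domain, {**env, var: a})
              for a in domain]
    if quantifier == "forall":
        return min(values)                 # case (3)
    if quantifier == "exists":
        return max(values)                 # case (4)
    if quantifier == "forall*":
        return sum(values) / len(domain)   # case (5): uniform average
    raise ValueError(f"unknown quantifier: {quantifier}")

domain = ["a", "b", "c"]
world = {("knows", "a", "b"), ("knows", "a", "c"), ("knows", "b", "a")}
knows = lambda w, e: ("knows", e["x"], e["y"]) in w

q_stat = statistic([("exists", "x"), ("forall*", "y")], knows, world, domain)
q_classical = statistic([("exists", "x"), ("forall", "y")], knows, world, domain)
print(q_stat, q_classical)
```

In this toy world, `a` knows two of the three domain elements, so the statistical sentence evaluates to 2/3, while its classical counterpart is false because nobody knows everyone (including themselves).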

###### Remark 1.

We can easily check the following property of sentence statistics. Let $\alpha$ be a sentence and $\omega$ be a possible world. If $\alpha'$ is a sentence obtained from $\alpha$ by replacing every $\forall^*$ quantifier by its classical counterpart $\forall$, then

$$Q_\omega(\alpha) = 1 \text{ iff } \omega \models \alpha'.$$

As a result of this, we will sometimes abuse notation and write $\omega \models \alpha$ when $Q_\omega(\alpha) = 1$, even if $\alpha$ is not an FOL sentence.

###### Example 1.

In classical first-order logic, the sentence $\exists x\,\forall y : \mathit{knows}(x,y)$ asserts that there is someone who knows everyone else (e.g. in a social network). If we replace $\forall y$ by $\forall^* y$, we get the sentence

$$\exists x\,\forall^* y : \mathit{knows}(x,y)$$

whose associated statistic measures the proportion of people (domain elements) who are known by a person who knows the largest number of others. In graph-theoretical terms, the statistic measures the maximum out-degree of nodes, normalized by the domain size. Note that we could not directly express the same statistic in normal MLNs since, using normal MLNs, we could only express statistics corresponding to the sentence $\forall^* x\,\exists y : \mathit{knows}(x,y)$, which intuitively measures the proportion of people who know at least one person. As we show later in the paper, it is possible to construct MLNs with constraints encoding the same statistic, but in order to do that we will have to enlarge the vocabulary $\sigma$, introducing additional predicates.
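The graph-theoretical reading of Example 1 can be checked directly on a small directed graph. The sketch below uses made-up names (`ann`, `bob`, `cat`, `dan`) and a made-up `knows` relation; it contrasts the normalized maximum out-degree with the statistic expressible in a standard MLN, the proportion of people who know at least one person.

```python
from fractions import Fraction

# Hypothetical toy social network: an edge (x, y) means knows(x, y).
domain = ["ann", "bob", "cat", "dan"]
knows = {("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "ann")}

def out_degree(x):
    """Number of domain elements y with knows(x, y)."""
    return sum((x, y) in knows for y in domain)

# exists x forall* y: knows(x, y)  ->  maximum out-degree / |domain|
max_out = max(Fraction(out_degree(x), len(domain)) for x in domain)

# forall* x exists y: knows(x, y)  ->  share of nodes with out-degree > 0
some_out = Fraction(sum(out_degree(x) > 0 for x in domain), len(domain))

print(max_out, some_out)
```

Here the two statistics differ (3/4 versus 1/2), which illustrates that they capture different properties of the same world.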

Next we present another example of a slightly more complicated sentence.

###### Example 2.

Let us consider the sentence

$$\exists x\,\forall^* y\,\exists z : (\mathit{knows}(x,z) \wedge \mathit{knows}(z,y)) \vee \mathit{knows}(x,y).$$

Its classical counterpart asserts that there exists some $x$ such that everyone is either known by $x$ or known by someone whom $x$ knows, while the actual non-classical sentence measures, for the best such $x$, the fraction of the population for which this is true.

###### Definition 3 (Semantics of QMLNs).

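Given a QMLN $\Phi$ and a domain $\Delta$, the probability of a possible world $\omega \in \Omega(\sigma,\Delta)$ is defined as:

$$p_\Phi(\omega) = \begin{cases} \dfrac{1}{Z}\exp\Big(\sum_{(\alpha_i,w_i)\in\Phi_S} w_i \cdot Q_\omega(\alpha_i)\Big) & \text{if } \omega \models \Phi_H, \\ 0 & \text{otherwise,} \end{cases}$$

where $Z$ is the normalization constant and $Q_\omega(\alpha_i)$ is the $\alpha_i$-statistic of $\omega$.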

###### Definition 4 (Marginal Query Problem).

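Let $\alpha$ be a sentence and $p_\Phi$ be the distribution on possible worlds from $\Omega(\sigma,\Delta)$ defined by a QMLN $\Phi$. The marginal query problem is to compute the marginal probability $Q_{\Phi,\Delta}[\alpha]$ defined as:

$$Q_{\Phi,\Delta}[\alpha] = \mathbb{E}_\omega[Q_\omega(\alpha)] = \sum_{\omega \in \Omega(\sigma,\Delta)} p_\Phi(\omega) \cdot Q_\omega(\alpha).$$

###### Remark 2.

If $\alpha$ is a sentence that does not contain $\forall^*$ quantifiers and $p_\Phi$ is a distribution on possible worlds from $\Omega(\sigma,\Delta)$, then

$$Q_{\Phi,\Delta}[\alpha] = P_{\omega \sim p_\Phi(\omega)}[\omega \models \alpha].$$

In other words, the classical definition of marginal inference is a special case of the more general version given by Definition 4.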
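For very small domains, Definitions 3 and 4 can be evaluated by brute-force enumeration of all possible worlds. The sketch below is only meant to make the definitions concrete: it assumes a vocabulary with a single binary predicate `knows`, a two-element domain, one soft sentence $\exists x\,\forall^* y : \mathit{knows}(x,y)$ with an arbitrarily chosen weight, and an illustrative hard constraint (irreflexivity). All function names are hypothetical.

```python
import math
from itertools import chain, combinations, product

domain = ["a", "b"]
atoms = list(product(domain, repeat=2))   # ground atoms knows(x, y)

def worlds():
    """All subsets of the ground atoms, i.e. all possible worlds."""
    return [frozenset(s) for s in chain.from_iterable(
        combinations(atoms, r) for r in range(len(atoms) + 1))]

def q_alpha(world):
    """Statistic of alpha = exists x forall* y: knows(x, y)."""
    return max(sum((x, y) in world for y in domain) / len(domain)
               for x in domain)

def satisfies_hard(world):
    """Illustrative hard constraint: forall x: not knows(x, x)."""
    return all((x, x) not in world for x in domain)

def distribution(weight):
    """p_Phi for Phi = {(alpha, weight), (irreflexivity, infinity)}."""
    scores = {w: math.exp(weight * q_alpha(w)) if satisfies_hard(w) else 0.0
              for w in worlds()}
    z = sum(scores.values())              # normalization constant Z
    return {w: s / z for w, s in scores.items()}

p = distribution(weight=2.0)
# Marginal query Q_{Phi,Delta}[alpha] as an expectation over worlds.
marginal = sum(p[w] * q_alpha(w) for w in p)
print(marginal)
```

Enumeration is exponential in the number of ground atoms, so this only serves as a reference point for checking the definitions on toy inputs.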

## 4 Initial Observations

In this section we make some initial observations about QMLNs.

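It is easy to see that QMLNs generalize MLNs in the sense that we can view a weighted formula $(\varphi, w)$ as a quantified sentence with implicit $\forall^*$-quantification over all free variables of $\varphi$. Since the sentence statistic is a proportion whereas $N(\varphi,\omega)$ is a count, the weight has to be rescaled by the number of groundings. More formally, we have:

###### Proposition 1.

Let $\Phi$ be an MLN and obtain a QMLN $\Phi'$ from $\Phi$ by replacing every weighted formula $(\varphi(x_1,\ldots,x_k), w)$ with the weighted quantified sentence $(\forall^* x_1 \ldots \forall^* x_k : \varphi(x_1,\ldots,x_k),\ w \cdot |\Delta|^k)$. Then, for every finite domain $\Delta$ and every $\omega \in \Omega(\sigma,\Delta)$, we have $p_\Phi(\omega) = p_{\Phi'}(\omega)$.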

### 4.1 Negation in QMLNs

It is well known that in classical MLNs one can assume positive weights without loss of generality. We show an analogous property of QMLNs.

###### Definition 5 (Negation).

We define the negation of quantified sentences by

$$neg(\alpha) = \overline{Q_1} x_1 \ldots \overline{Q_n} x_n : \neg\psi(x_1,\ldots,x_n),$$

where $\overline{\forall}$ is $\exists$, $\overline{\exists}$ is $\forall$, and $\overline{\forall^*}$ is $\forall^*$.

It is easy to check that $neg(neg(\alpha))$ is equivalent to $\alpha$. Next we illustrate the way negation works in our setting on a concrete example.

###### Example 3.

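Let us see what happens if we take the sentence from Example 1 and negate it. Using Definition 5, we obtain

$$neg(\exists x\,\forall^* y : \mathit{knows}(x,y)) = \forall x\,\forall^* y : \neg\mathit{knows}(x,y).$$

For the statistic we have

$$\begin{aligned} Q_\omega(\forall x\,\forall^* y : \neg\mathit{knows}(x,y)) &= \min_{t \in \Delta} \frac{1}{|\Delta|} \sum_{u \in \Delta} \mathbb{1}(\omega \models \neg\mathit{knows}(t,u)) \\ &= \min_{t \in \Delta} \frac{1}{|\Delta|} \sum_{u \in \Delta} \big(1 - \mathbb{1}(\omega \models \mathit{knows}(t,u))\big) \\ &= 1 - \max_{t \in \Delta} \frac{1}{|\Delta|} \sum_{u \in \Delta} \mathbb{1}(\omega \models \mathit{knows}(t,u)) \\ &= 1 - Q_\omega(\exists x\,\forall^* y : \mathit{knows}(x,y)). \end{aligned}$$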

In the above example, the statistic of a negation of a sentence turns out to be equal to one minus the statistic of that sentence, which is intuitively desirable. By repeatedly applying the shown argument, one can show that this holds in general:

###### Proposition 2.

For any sentence $\alpha$ and any possible world $\omega$ the following holds:

$$Q_\omega(neg(\alpha)) = 1 - Q_\omega(\alpha).$$
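Proposition 2 can be sanity-checked exhaustively on a tiny instance. The following sketch, under the same hypothetical encoding of prefixes as `(quantifier, variable)` pairs, enumerates all 16 worlds over a two-element domain with one binary predicate and all 9 two-variable prefixes, and counts how often $Q_\omega(neg(\alpha)) \neq 1 - Q_\omega(\alpha)$. It is a test on one small case, not a proof.

```python
from fractions import Fraction
from itertools import product

# Dual quantifiers as in Definition 5 (string encoding is an assumption).
DUAL = {"forall": "exists", "exists": "forall", "forall*": "forall*"}

def statistic(prefix, matrix, world, domain, env=None):
    """Recursive sentence statistic following Definition 2."""
    env = env or {}
    if not prefix:
        return Fraction(int(matrix(world, env)))
    (q, var), rest = prefix[0], prefix[1:]
    vals = [statistic(rest, matrix, world, domain, {**env, var: a})
            for a in domain]
    if q == "forall*":
        return sum(vals) / len(domain)
    return min(vals) if q == "forall" else max(vals)

def neg(prefix, matrix):
    """Dualise every quantifier and negate the matrix."""
    return ([(DUAL[q], v) for q, v in prefix],
            lambda w, e: not matrix(w, e))

domain = ["a", "b"]
atoms = list(product(domain, repeat=2))
knows = lambda w, e: (e["x"], e["y"]) in w
prefixes = [[(q1, "x"), (q2, "y")] for q1 in DUAL for q2 in DUAL]

violations = 0
for bits in product([False, True], repeat=len(atoms)):
    world = {atom for atom, bit in zip(atoms, bits) if bit}
    for prefix in prefixes:
        neg_prefix, neg_matrix = neg(prefix, knows)
        lhs = statistic(neg_prefix, neg_matrix, world, domain)
        rhs = 1 - statistic(prefix, knows, world, domain)
        violations += lhs != rhs

print(violations)
```

Exact rational arithmetic via `Fraction` is used so that the equality test is not affected by floating-point rounding.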

Further we show that the same distribution represented by a QMLN can be represented by another QMLN in which we replace some of the sentences by their negations while also inverting the signs of their respective weights:

###### Proposition 3.

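Let $\Phi = \Psi \cup \{(\alpha, w)\}$ and $\Phi' = \Psi \cup \{(neg(\alpha), -w)\}$. Then, for any $\omega \in \Omega(\sigma,\Delta)$, we have:

$$p_\Phi(\omega) = p_{\Phi'}(\omega).$$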

It follows from Propositions 2 and 3 that we can focus on QMLNs that have only positive weights. It also follows that it makes no sense to have a sentence and its negation in the set of sentences defining a QMLN.
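The reason why flipping the sign of a weight together with negating its sentence leaves the distribution unchanged can be written out in one line. The derivation below is a sketch for a soft constraint with finite weight $w$; it only uses Proposition 2 and the fact that a factor independent of $\omega$ cancels against the normalization constant.

```latex
\begin{aligned}
\exp\big(-w \cdot Q_\omega(neg(\alpha))\big)
  &= \exp\big(-w \cdot (1 - Q_\omega(\alpha))\big)
  && \text{(Proposition 2)} \\
  &= e^{-w} \cdot \exp\big(w \cdot Q_\omega(\alpha)\big).
\end{aligned}
% The factor e^{-w} does not depend on \omega, so it multiplies every
% unnormalized world score and the normalization constant Z alike,
% and therefore cancels in p(\omega).
```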

### 4.2 Limit $w \to \infty$

In the seminal paper on Markov logic networks (?) it was shown that if the weights of formulas of an MLN tend to infinity at the same pace, in the limit the MLN will define a uniform distribution over models of the classical first-order logic theory consisting of the MLN’s rules. More precisely, let us denote by $\hat{\Phi}$ a classical first-order logic theory obtained from a given MLN $\Phi$ by replacing every $(\varphi, w)$ by $\forall x_1 \ldots \forall x_k : \varphi$, where $x_1,\ldots,x_k$ are precisely the free variables in $\varphi$. Then the possible worlds that have non-zero probability for $w \to \infty$ are the models of $\hat{\Phi}$. The next proposition straightforwardly generalizes this by establishing that an analogous property also holds for QMLNs containing sentences with arbitrary quantifier prefixes composed of $\forall$, $\exists$, and $\forall^*$.

###### Proposition 4.

Let $\Delta$ be a finite domain and $\Phi(w) = \{(\alpha_1, w),\ldots,(\alpha_m, w)\}$ be a QMLN such that $\hat{\Phi}$ has a model from $\Omega(\sigma,\Delta)$. Then

$$\lim_{w \to \infty} p_{\Phi(w)}(\omega) = \begin{cases} 0 & \text{if } \omega \not\models \hat{\Phi}, \\ \dfrac{1}{|\{\omega' \in \Omega(\sigma,\Delta) \mid \omega' \models \hat{\Phi}\}|} & \text{if } \omega \models \hat{\Phi}, \end{cases}$$

where $\hat{\Phi}$ is the set of sentences obtained from the weighted sentences in $\Phi(w)$ by replacing every occurrence of $\forall^*$ by $\forall$.
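The limit behaviour of Proposition 4 can be observed numerically on a toy instance. The sketch below assumes a two-element domain, a single binary predicate, and the one-sentence QMLN $\{(\exists x\,\forall^* y : \mathit{knows}(x,y), w)\}$; the function names are hypothetical. Scores are shifted by the constant $-w$ in the exponent, which cancels in the normalization and avoids overflow for large $w$.

```python
import math
from itertools import product

domain = ["a", "b"]
atoms = list(product(domain, repeat=2))   # ground atoms knows(x, y)
all_worlds = [frozenset(a for a, bit in zip(atoms, bits) if bit)
              for bits in product([0, 1], repeat=len(atoms))]

def q_alpha(world):
    """Statistic of alpha = exists x forall* y: knows(x, y)."""
    return max(sum((x, y) in world for y in domain) / len(domain)
               for x in domain)

def prob(world, weight):
    """p_{Phi(w)}(world); the shift by -weight cancels in the ratio."""
    z = sum(math.exp(weight * (q_alpha(v) - 1)) for v in all_worlds)
    return math.exp(weight * (q_alpha(world) - 1)) / z

# Worlds with statistic 1, i.e. models of exists x forall y: knows(x, y).
models = [v for v in all_worlds if q_alpha(v) == 1]

for weight in (1, 10, 50):
    print(weight, prob(models[0], weight), 1 / len(models))
```

As the weight grows, the probability of each model approaches one over the number of models, while all other worlds receive vanishing probability.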

### 4.3 Relational Marginal Problems and QMLNs

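From Section 2.1 and Definition 2, it follows that standard MLNs containing only quantifier-free FOL formulas are solutions of maximum entropy relational marginal problems constrained by sentence statistics of sentences that contain only $\forall^*$ quantifiers. Next we show that the same holds for relational marginal problems constrained by the more general sentence statistics (cf. Definition 2).

We use $(\alpha, q)$ to denote the constraint $Q[\alpha] = q$, where $Q[\alpha]$ and $q$ are as in Section 2.1. Let $\Delta$ be a finite domain and $\mathcal{C} = \{(\alpha_1,q_1),\ldots,(\alpha_m,q_m)\}$ be a set of constraints that the sought distribution must satisfy. We require some auxiliary notation. We define the sets $\mathcal{C}_0 = \{neg(\alpha) \mid (\alpha,0) \in \mathcal{C}\}$; $\mathcal{C}_1 = \{\alpha \mid (\alpha,1) \in \mathcal{C}\}$; and $\mathcal{C}_S = \{(\alpha,q) \in \mathcal{C} \mid 0 < q < 1\}$. Note that, if a probability distribution satisfies the constraints from $\mathcal{C}$, then any world that is not a model of $\mathcal{C}_0 \cup \mathcal{C}_1$ (in the sense of Remark 1) must have probability $0$. Below, we will have to treat this separately from the rest of the constraints. (Otherwise, if we just plugged the constraints from $\mathcal{C}_0$ and $\mathcal{C}_1$ into the optimization problem, Slater’s condition (?) would not hold.) Some additional definitions are required. We define

$$\Omega = \{\omega \in \Omega(\sigma,\Delta) \mid \omega \models \mathcal{C}_0 \cup \mathcal{C}_1\}$$

and for every $\omega \in \Omega$, we introduce a variable $P_\omega$ representing the probability of $\omega$. The optimization problem representing the maximum entropy relational marginal problem is then given by:

$$\begin{aligned} \sup_{\{P_\omega \mid \omega \in \Omega\}} \quad & \sum_{\omega \in \Omega} P_\omega \log \frac{1}{P_\omega} \\ \text{s.t.} \quad & \forall (\alpha_i,q_i) \in \mathcal{C}_S : \sum_{\omega \in \Omega} Q_\omega(\alpha_i) \cdot P_\omega = q_i, \\ & \forall \omega \in \Omega : P_\omega \ge 0, \qquad \sum_{\omega \in \Omega} P_\omega = 1. \end{aligned}$$

Assuming that there exists a feasible solution of the optimization problem such that $P_\omega > 0$ for all $\omega \in \Omega$, and using standard techniques from convex optimization (??), specifically the construction of Lagrangian dual problems and the use of Slater’s condition, we arrive at the solution

$$P_\omega = \frac{1}{Z} \exp\Big(\sum_{(\alpha_i,w_i) \in \Phi} w_i \cdot Q_\omega(\alpha_i)\Big) \tag{7}$$

where $Z = \sum_{\omega \in \Omega} \exp\big(\sum_{(\alpha_i,w_i) \in \Phi} w_i \cdot Q_\omega(\alpha_i)\big)$. (If the positivity assumption is not satisfied, we have to add additional hard formulas that explicitly rule out the worlds that have zero probability in every solution satisfying the given marginal constraints.)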

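The weights $w_i$ in (7) are solutions of the optimization problem (the dual of the maximum entropy relational marginal problem stated above) which is to maximize the following expression:

$$\sum_{(\alpha_i,q_i) \in \mathcal{C}_S} w_i q_i - \log\Bigg(\sum_{\omega \in \Omega} \exp\Big(\sum_{(\alpha_i,q_i) \in \mathcal{C}_S} w_i \cdot Q_\omega(\alpha_i)\Big)\Bigg)$$

When the parameters $q_i$ are estimated from a training example $\hat{\omega} \in \Omega(\sigma,\Delta)$, i.e. $q_i = Q_{\hat{\omega}}(\alpha_i)$, then this optimization problem is the same as maximizing the likelihood of $\hat{\omega}$. In general, e.g. if $\hat{\omega} \in \Omega(\sigma,\hat{\Delta})$ with $|\hat{\Delta}| \neq |\Delta|$, the two problems are not equivalent and, as mentioned before, maximum likelihood estimation is not statistically consistent.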
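For a single constraint, the dual above is a concave function of one weight, and its derivative is $q - \mathbb{E}_w[Q_\omega(\alpha)]$, so maximizing it amounts to matching the expected statistic to the target value. The sketch below illustrates this on a toy instance (two-element domain, one binary predicate, the sentence $\exists x\,\forall^* y : \mathit{knows}(x,y)$, no hard constraints). The target value `0.9`, the bisection bounds, and all function names are assumptions chosen for this example; bisection is used here only because the one-dimensional case allows it, not because it is the method of the paper.

```python
import math
from itertools import product

domain = ["a", "b"]
atoms = list(product(domain, repeat=2))
all_worlds = [frozenset(a for a, bit in zip(atoms, bits) if bit)
              for bits in product([0, 1], repeat=len(atoms))]

def q_alpha(world):
    """Statistic of alpha = exists x forall* y: knows(x, y)."""
    return max(sum((x, y) in world for y in domain) / len(domain)
               for x in domain)

def expected_stat(weight):
    """E[Q_omega(alpha)] under the QMLN {(alpha, weight)}."""
    scores = [math.exp(weight * q_alpha(v)) for v in all_worlds]
    return sum(s * q_alpha(v) for s, v in zip(scores, all_worlds)) / sum(scores)

def dual(weight, target):
    """Dual objective w*q - log sum_omega exp(w * Q_omega(alpha))."""
    log_z = math.log(sum(math.exp(weight * q_alpha(v)) for v in all_worlds))
    return weight * target - log_z

def fit_weight(target, lo=-50.0, hi=50.0, iters=100):
    """Bisection on the dual's derivative, target - E_w[Q]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_stat(mid) < target:
            lo = mid      # expectation too small: increase the weight
        else:
            hi = mid
    return (lo + hi) / 2

w_hat = fit_weight(target=0.9)
print(w_hat, expected_stat(w_hat))
```

With several constraints one would typically use a gradient-based method on the same objective; the moment-matching condition stays the same.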

We have thus verified that QMLNs are also solutions of maximum entropy relational marginal problems constrained by sentence statistics based on formulas with arbitrary quantifier prefixes, similar to what has been done in the propositional (?) and relational (?) settings.

## 5 A Translation For MAP-Inference

In this section we describe a translation from arbitrary quantified MLNs to quantified MLNs that contain the statistical quantifiers only as a leading prefix. We have already seen in Proposition 1 that the latter QMLNs correspond exactly to standard MLNs. Since the translation can be performed in polynomial time, the given translation establishes a polynomial time reduction of MAP in QMLNs to MAP in MLNs.

Overview. The given quantified MLN is translated by processing the weighted sentences one by one. More specifically, we show how to eliminate a single classical quantifier that appears before a block of $\forall^*$’s in the quantifier prefix of the quantified sentence. By exhaustively applying this elimination, we end up with a set of weighted sentences that either contain only the classical quantifiers $\forall$ and $\exists$ or that contain a prefix of $\forall^*$ quantifiers.

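For the description of the elimination, let us suppose that $(\alpha, w)$ is a weighted sentence with $\alpha$ defined as follows

$$Q_1 x_1 \ldots Q_k x_k\, \forall^* x_{k+1} \ldots \forall^* x_{k+l} : \psi(x_1,\ldots,x_{k+l})$$

where $Q_i \in \{\forall, \exists, \forall^*\}$ for $1 \le i \le k$, with $Q_k$ classical, and $\psi$ is a formula with free variables $x_1,\ldots,x_{k+l}$; recall that the formula $\psi$ may also contain variables bound by the quantifiers $\forall$ and $\exists$ but not by $\forall^*$. The quantified sentence $\alpha$ is transformed into a set of hard constraints, that is, weighted sentences of the shape $(\varphi, \infty)$ with $\varphi$ a FOL sentence, and a single weighted sentence $(\alpha', w')$ with $w' = |\Delta| \cdot w$ and $\alpha'$ being

$$Q_1 x_1 \ldots Q_{k-1} x_{k-1}\, \forall^* x_k \ldots \forall^* x_{k+l} : \psi'(x_1,\ldots,x_{k+l})$$

for some formula $\psi'$ to be defined below. Observe that the effect of the step is to turn the quantifier $Q_k$ into $\forall^*$.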

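Eliminating $Q_k$. In order to simplify notation in the description of the elimination step we will abbreviate $x_1,\ldots,x_{k-1}$ with $\mathbf{x}$ and $x_{k+1},\ldots,x_{k+l}$ with $\mathbf{z}$, and write, e.g., $\forall\mathbf{x}$ instead of $\forall x_1 \ldots \forall x_{k-1}$. We describe how to replace $Q_k x_k$ by $\forall^* x_k$. By Proposition 3, we can assume without loss of generality that $Q_k$ is in fact $\exists$. By the semantics, the value chosen for $x_k$ maximizes the sentence statistic for the variables $\mathbf{z}$ over all possible choices for $x_k$. Our main idea is to simulate the computation of the sentence statistic in the MLN itself. For this purpose, we introduce a fresh $k$-ary predicate name $\mathit{max}$ (in general, we need to introduce fresh predicate names for every transformed $\alpha$ separately, e.g. $\mathit{max}_\alpha$ etc.; for brevity, we do not show this explicitly in the text), set

$$\psi'(\mathbf{x}, x_k, \mathbf{z}) = \mathit{max}(\mathbf{x}, x_k) \wedge \psi(\mathbf{x}, x_k, \mathbf{z}),$$

and appropriately define $\mathit{max}$ (using hard constraints). More formally, let us denote with $\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a)$ the set of all assignments of $\mathbf{z}$ to values $\mathbf{b} \in \Delta^l$ such that $\psi(\mathbf{a}, a, \mathbf{b})$ is satisfied in world $\omega$, that is,

$$\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a) = \{\mathbf{b} \in \Delta^l \mid \omega \models \psi(\mathbf{a}, a, \mathbf{b})\}.$$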

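Our goal is to enforce that, in every world $\omega$, $\mathit{max}$ satisfies the following property:

• for every choice of values $\mathbf{a}$ for $\mathbf{x}$, there is precisely one $a^* \in \Delta$ such that $\omega \models \mathit{max}(\mathbf{a}, a^*)$, and moreover, for all $a \in \Delta$, this $a^*$ satisfies

$$|\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a)| \le |\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a^*)| \tag{8}$$

Indeed, this property formalizes the mentioned semantics for the sentence statistic for $\exists$. For enforcing it, observe that the inequality (8) is satisfied iff there is an injective mapping from the set on the left-hand side, $\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a)$, to the set on the right-hand side, $\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a^*)$. We exploit this observation as follows. First define a collection of linear orders on domain elements, one linear order for each assignment of a tuple of domain elements to the variables in $\mathbf{x}$. We represent the orders by a fresh predicate $\mathit{leq}$. The linear orders are enforced by hard constraints; more precisely, we ensure that one such linear order exists for any assignment of domain elements to the variables in $\mathbf{x}$ by adding hard constraints axiomatizing antisymmetry, transitivity, and totality, respectively:

$$\begin{aligned} \forall\mathbf{x}\,\forall y,z &: \mathit{leq}(\mathbf{x},y,z) \wedge \mathit{leq}(\mathbf{x},z,y) \Rightarrow y = z, && (9) \\ \forall\mathbf{x}\,\forall x,y,z &: \mathit{leq}(\mathbf{x},x,y) \wedge \mathit{leq}(\mathbf{x},y,z) \Rightarrow \mathit{leq}(\mathbf{x},x,z), && (10) \\ \forall\mathbf{x}\,\forall x,y &: \mathit{leq}(\mathbf{x},x,y) \vee \mathit{leq}(\mathbf{x},y,x). && (11) \end{aligned}$$

Next, we connect the linear order construction with the idea of injective mappings described above. This is done via another fresh predicate name $\mathit{fn}$ which encodes the required mapping. Intuitively, in $\mathit{fn}(\mathbf{x}, y, y', \mathbf{z}, \mathbf{z}')$, $\mathbf{x}$ refers to the current assignment to $x_1,\ldots,x_{k-1}$, the constants $y, y'$ refer to the elements we are interested in for $x_k$, and the function maps $\mathbf{z}$ to $\mathbf{z}'$. We add the following hard constraints:

$$\begin{aligned} \forall\mathbf{x}\,\forall y,y'\,\forall\mathbf{z} &: \mathit{leq}(\mathbf{x},y,y') \wedge \psi(\mathbf{x},y,\mathbf{z}) \Rightarrow \big(\exists\mathbf{z}' : \psi(\mathbf{x},y',\mathbf{z}') \wedge \mathit{fn}(\mathbf{x},y,y',\mathbf{z},\mathbf{z}')\big), \\ \forall\mathbf{x}\,\forall y,y',\mathbf{z},\mathbf{z}',\mathbf{z}'' &: \mathit{fn}(\mathbf{x},y,y',\mathbf{z},\mathbf{z}') \wedge \mathit{fn}(\mathbf{x},y,y',\mathbf{z},\mathbf{z}'') \Rightarrow \mathbf{z}' = \mathbf{z}'', \\ \forall\mathbf{x}\,\forall y,y',\mathbf{z},\mathbf{z}',\mathbf{z}'' &: \mathit{fn}(\mathbf{x},y,y',\mathbf{z},\mathbf{z}') \wedge \mathit{fn}(\mathbf{x},y,y',\mathbf{z}'',\mathbf{z}') \Rightarrow \mathbf{z} = \mathbf{z}''. \end{aligned}$$

The first two sentences enforce that if $\mathit{leq}(\mathbf{a}, a, a')$ holds then there exists a mapping from $\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a)$ to $\mathit{Wit}_{\psi,\omega}(\mathbf{a}, a')$. Injectivity of the mapping is ensured by the third sentence.

In order to define the predicate $\mathit{max}$, we add the following hard constraints: $\forall\mathbf{x}\,\exists y : \mathit{max}(\mathbf{x}, y)$ and

$$\forall\mathbf{x}, y, y' : \mathit{max}(\mathbf{x}, y) \wedge \mathit{leq}(\mathbf{x}, y, y') \Rightarrow y = y'$$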

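Correctness. We have given some intuition above, but let us provide some more details. First, it is not hard to see that the added hard constraints ensure that $\mathit{max}$ indeed satisfies the desired property stated above.

Now, let $\Delta$ be an arbitrary domain, $\omega$ be the most probable world of the QMLN $\Phi$ over domain $\Delta$, and let $\Phi'$ be obtained from $\Phi$ by the application of a single quantifier elimination step. Further, denote with $\sigma'$ the extended vocabulary. We call a world $\omega' \in \Omega(\sigma',\Delta)$ an extension of $\omega \in \Omega(\sigma,\Delta)$ if for every $R \in \sigma$ of arity $n$ and all $\mathbf{a} \in \Delta^n$, we have $R(\mathbf{a}) \in \omega$ iff $R(\mathbf{a}) \in \omega'$. It is easy to see that our construction ensures that, in fact, every world in $\Omega(\sigma',\Delta)$ is the extension of some world in $\Omega(\sigma,\Delta)$, and conversely, every world in $\Omega(\sigma,\Delta)$ has an extension in $\Omega(\sigma',\Delta)$ that satisfies the added hard constraints. Moreover, the sentence statistics for $\alpha$ and its replacement $\alpha'$ relate as follows: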

###### Lemma 5.

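Let $\omega'$ be an extension of $\omega$ that satisfies the hard constraints of $\Phi'$. Then

$$Q_{\omega'}(\alpha') = \frac{1}{|\Delta|} Q_\omega(\alpha).$$

The definition of the updated weight $w'$ now implies that $p_{\Phi'}(\omega') = c \cdot p_\Phi(\omega)$ for some constant $c$. This establishes the correctness of the described reduction.

###### Theorem 6.

If $\omega'$ is an extension of $\omega$, then $\omega$ is a most probable world of $\Phi$ over $\Delta$ iff $\omega'$ is a most probable world of $\Phi'$ over $\Delta$.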
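The relationship stated in Lemma 5 can be checked by hand on the sentence of Example 1, where the block $\mathbf{x}$ is empty and the fresh predicate $\mathit{max}$ is therefore unary. The sketch below uses a made-up three-element world; the extension is built by putting one maximizer into `max_pred`, with ties broken arbitrarily by Python's `max` (a hypothetical choice made only for this illustration).

```python
from fractions import Fraction

domain = ["a", "b", "c"]
knows = {("a", "b"), ("a", "c"), ("b", "a")}
n = len(domain)

def out_degree(x):
    """Size of the witness set for x, i.e. |{y : knows(x, y)}|."""
    return sum((x, y) in knows for y in domain)

# Q_omega(alpha) for alpha = exists x forall* y: knows(x, y)
q_alpha = max(Fraction(out_degree(x), n) for x in domain)

# Extension omega': max holds for exactly one maximizer x*.
x_star = max(domain, key=out_degree)
max_pred = {x_star}

# Q_omega'(alpha') for alpha' = forall* x forall* y: max(x) and knows(x, y)
q_alpha_prime = Fraction(
    sum(x in max_pred and (x, y) in knows for x in domain for y in domain),
    n * n)

print(q_alpha, q_alpha_prime)
```

Averaging over $x$ in addition to $y$ contributes the extra factor $1/|\Delta|$, because only the single element marked by $\mathit{max}$ yields non-zero terms; rescaling the weight by $|\Delta|$ compensates for it.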

## 6 A Translation For Marginal Inference

Due to ties in Equation (8), one world $\omega$ might have several extensions $\omega'$. Moreover, different worlds $\omega$ might have different numbers of extensions. Hence, the translation given in the previous section does not quite work for marginal inference. We fix these problems by further restricting the function encoded by $\mathit{fn}$ and the order encoded by $\mathit{leq}$. More specifically, our goal is to add another set of hard constraints such that, for every domain $\Delta$,

• every world $\omega \in \Omega(\sigma,\Delta)$ has the same number of extensions $\omega' \in \Omega(\sigma',\Delta)$ satisfying the hard constraints.

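To realize that, we again exploit the idea of the linear order. More specifically, we add a fresh binary predicate name $\le$ and enforce that it is a linear order on $\Delta$ by using hard constraints such as those in Equations (9)–(11). Based on $\le$, we break all possible ties that might occur in the definition of $\mathit{leq}$, in the sense that for a fixed choice of $\le$, there is exactly one choice of $\mathit{leq}$. First, we enforce that $\mathit{fn}$ has the right domain:

$$\forall\mathbf{x}\,\forall x,y\,\forall\mathbf{z},\mathbf{z}' : \mathit{fn}(\mathbf{x},x,y,\mathbf{z},\mathbf{z}') \Rightarrow \big(\psi(\mathbf{x},x,\mathbf{z}) \wedge \psi(\mathbf{x},y,\mathbf{z}')\big)$$

For breaking ties in $\mathit{leq}$, we add the following constraint stating that, if the function encoded by $\mathit{fn}$ is also surjective at given points $\mathbf{x}, x, y$, then $x \le y$:

$$\forall\mathbf{x}\,\forall x,y : \Big(\mathit{leq}(\mathbf{x},x,y) \wedge \big(\forall\mathbf{z}'.\, \psi(\mathbf{x},y,\mathbf{z}') \Rightarrow \exists\mathbf{z}.\, \mathit{fn}(\mathbf{x},x,y,\mathbf{z},\mathbf{z}')\big)\Big) \Rightarrow x \le y$$

Next we enforce that $\mathit{fn}$ preserves the order by including the constraint

$$\forall\mathbf{x}\,\forall x,y\,\forall\mathbf{z}_1,\mathbf{z}'_1,\mathbf{z}_2,\mathbf{z}'_2 : \big(\mathbf{z}_1 \le^* \mathbf{z}_2 \wedge \mathit{fn}(\mathbf{x},x,y,\mathbf{z}_1,\mathbf{z}'_1) \wedge \mathit{fn}(\mathbf{x},x,y,\mathbf{z}_2,\mathbf{z}'_2)\big) \Rightarrow \mathbf{z}'_1 \le^* \mathbf{z}'_2$$

where the order $\le^*$ is defined, using straightforward constraints, as the (unique) lexicographic extension of $\le$ to the arity of $\mathbf{z}$.

Finally, note that it can still be the case that two worlds have different numbers of extensions, because each of the functions represented by $\mathit{fn}$ is order-preserving w.r.t. $\le^*$ and has a uniquely defined domain but, apart from that, does not have to satisfy any other constraints. For instance, the number of injective order-preserving functions from an $m$-element set into an $n$-element set is $\binom{n}{m}$, which varies with $m$ and $n$. We address this by requiring that the functions represented by $\mathit{fn}$ map every element to the smallest element possible:

$$\forall\mathbf{x}\,\forall x,y\,\forall\mathbf{z}_1,\mathbf{z}'_1 : \mathit{fn}(\mathbf{x},x,y,\mathbf{z}_1,\mathbf{z}'_1) \Rightarrow \big(\forall\mathbf{z}'_2.\, \psi(\mathbf{x},y,\mathbf{z}'_2) \wedge \mathbf{z}'_2 \le^* \mathbf{z}'_1 \Rightarrow \exists\mathbf{z}_2.\, \mathit{fn}(\mathbf{x},x,y,\mathbf{z}_2,\mathbf{z}'_2)\big)$$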

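Correctness. Let $\Phi''$ be the result of adding the described constraints to $\Phi'$. Based on the given intuitions, one can easily show that the property stated at the beginning of this section is satisfied. Since there are $|\Delta|!$ possible choices for $\le$, we get that

$$p_{\Phi''}(\omega') = \frac{1}{|\Delta|!}\, p_\Phi(\omega),$$

for any extension $\omega'$ of $\omega$. Since $\omega \models \varphi$ iff $\omega' \models \varphi$ for any given sentence $\varphi$ over $\sigma$, we obtain the desired result: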

###### Theorem 7.

For every FOL sentence $\varphi$ over $\sigma$, we have

$$P_{\omega \sim p_{\Phi,\Delta}}(\varphi) = P_{\omega \sim p_{\Phi'',\Delta}}(\varphi).$$

Interestingly, we note that the result from Theorem 7 can be extended to computing a marginal query for a quantified sentence $\alpha$. See the appendix for details.

## 7 On QMLNs Restricted to Two Variables

It has been shown that marginal inference for MLNs can often be reduced to symmetric weighted first-order model counting (WFOMC), see e.g. (?) and references therein. In this context, the two-variable fragment of FOL (FO$^2$) is of particular importance, since symmetric WFOMC for FO$^2$ can be solved in polynomial time in data complexity (???). Since MLNs containing formulae with up to two variables (2-MLNs) can be encoded as WFOMC for FO$^2$, 2-MLNs are domain liftable (?). One wonders whether the same holds for quantified MLNs.

Let us first remark that the reduction described in Section 6 does not preserve the quantifier rank. Indeed, the elimination step introduces predicate names with greater arity. Thus, we cannot ‘reuse’ the above results on WFOMC for FO$^2$ to attain domain liftability for 2-QMLNs. Moreover, our reduction explicitly introduces transitivity axioms, while there are only a few known, very restricted cases where WFOMC is domain liftable in the presence of transitivity (?).

Given recent results by Kuusisto and Lutz (2018) on WFOMC for FO$^2$ with counting quantifiers, it is not surprising that a straightforward translation from QMLNs to MLNs that preserves the quantifier rank does not seem possible. Note that to compute the sentence statistic, we need to take into account the out-degree of domain elements. Indeed, Kuusisto and Lutz (2018) put considerable technical effort into showing that WFOMC for FO$^2$ with one functional axiom is polynomial time in data complexity. The complexity of WFOMC for FO$^2$ with many functional axioms or, more generally, with arbitrary counting quantifiers remains a challenging open problem. This gives an insight into the difficulties of studying the computational complexity of reasoning in QMLNs restricted to two variables. We plan to address this in the future. In particular, we will investigate the exact connection between QMLNs with two variables and WFOMC for extensions of FO$^2$ with means for counting.

## 8 Related Work

Classical FOL quantifiers were already considered in the original work on Markov Logic Networks (?), albeit without a rigorous definition. A precise treatment of them was carried out later on in (??). In particular, (?) show how to remove existential quantifiers while preserving marginal inference results. In all these works, however, MLNs with quantifiers were defined in a way that is equivalent to QMLNs with a prefix of $\forall^*$ quantifiers. As a consequence, it is not possible to directly represent statistics (features) that correspond to sentences in which $\exists$ or $\forall$ precedes $\forall^*$ in the quantifier block.

There has also been some work on other types of aggregation. For example, some works considered explicit constructs for counting in relational models (??). Lowd and Domingos (2007) introduced recursive random fields that are capable of emulating certain forms of more complex aggregation. However, recursive random fields do not seem capable of even representing statistics such as . Finally, Beltagy and Erk (2015) studied the effect of the domain closure assumption on the semantics of probabilistic logic when existential quantifiers are allowed.

In the context of probabilistic soft logic (PSL), Farnadi et al. (2017) recently introduced soft quantifiers based on quantifiers from fuzzy logic. However, their approach applies strictly to fuzzy logic: in PSL, random variables may acquire non-Boolean truth values. Also relevant to this work is the study of the effect of domain size and its extrapolation on the probability distributions encoded using various relational learning systems (???). However, none of these works studied the interplay of statistical and classical quantifiers.

## 9 Discussion and Future Work

In this paper, we have investigated QMLNs, an extension of MLNs with statistical quantifiers that makes it possible to express, e.g., measures of the proportion of domain elements fulfilling a certain property. We developed some key foundations by establishing a relation between MLNs and QMLNs. In particular, we provided a polynomial-time reduction of the standard reasoning tasks MAP and MARG in QMLNs to their counterparts in MLNs. Furthermore, we also showed how to generalize the random substitution semantics to QMLNs.

As for future work, it might be interesting to develop more direct approaches to MAP and MARG in QMLNs. Indeed, even if the developed translations provide polynomial-time reductions of reasoning in QMLNs to reasoning in MLNs (and, overall, a good understanding of the relation between QMLNs and MLNs), they do not yield an immediate practical approach, since the introduction of new predicates with greater arity is required. Another interesting direction for future work is to investigate the statistical properties of QMLNs. For MLNs with quantifier-free formulas, Kuželka et al. (2018) derived bounds on expected errors of the statistics’ estimates. However, obtaining similar bounds for the more general statistics considered here seems considerably more challenging because of the minimization and maximization that are involved in them.

Acknowledgments The first author was supported by EU’s Horizon 2020 programme under the Marie Skłodowska-Curie grant 663830 and the third one by the Research Foundation - Flanders (project G.0428.15).

## References

• [1992] Bacchus, F.; Grove, A. J.; Koller, D.; and Halpern, J. Y. 1992. From statistics to beliefs. In Proc. of AAAI-92, 602–608.
• [2015] Beame, P.; Van den Broeck, G.; Gribkoff, E.; and Suciu, D. 2015. Symmetric weighted first-order model counting. In Proc. of PODS-15, 313–328.
• [2015] Beltagy, I., and Erk, K. 2015. On the proper treatment of quantifiers in probabilistic logic semantics. In Proc. of IWCS-15, 140–150.
• [2004] Boyd, S., and Vandenberghe, L. 2004. Convex optimization. Cambridge University Press.
• [2017] Farnadi, G.; Bach, S. H.; Moens, M.; Getoor, L.; and De Cock, M. 2017. Soft quantification in statistical relational learning. Machine Learning 106(12):1971–1991.
• [2010] Jain, D.; Barthels, A.; and Beetz, M. 2010. Adaptive Markov logic networks: Learning statistical relational models with dynamic parameters. In Proc. of ECAI 2010, 937–942.
• [2014] Kazemi, S. M.; Buchman, D.; Kersting, K.; Natarajan, S.; and Poole, D. 2014. Relational logistic regression. In Proc. of KR-14.
• [2016] Kazemi, S. M.; Kimmig, A.; Van den Broeck, G.; and Poole, D. 2016. New liftable classes for first-order probabilistic inference. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems, 3117–3125.
• [2018] Kuusisto, A., and Lutz, C. 2018. Weighted model counting beyond two-variable logic. In Proc. of LICS-18.
• [2018] Kuželka, O.; Wang, Y.; Davis, J.; and Schockaert, S. 2018. Relational marginal problems: Theory and estimation. In Proc. of AAAI-18.
• [2007] Lowd, D., and Domingos, P. M. 2007. Recursive random fields. In Proc. of IJCAI-07, 950–955.
• [2008] Milch, B.; Zettlemoyer, L. S.; Kersting, K.; Haimes, M.; and Kaelbling, L. P. 2008. Lifted probabilistic inference with counting formulas. In Proc. of AAAI-08, 1062–1068.
• [2014] Poole, D.; Buchman, D.; Kazemi, S. M.; Kersting, K.; and Natarajan, S. 2014. Population size extrapolation in relational probabilistic modelling. In Proc. of SUM-14, 292–305.
• [2006] Richardson, M., and Domingos, P. M. 2006. Markov logic networks. Machine Learning 62(1-2):107–136.
• [2014] Schulte, O.; Khosravi, H.; Kirkpatrick, A. E.; Gao, T.; and Zhu, Y. 2014. Modelling relational statistics with Bayes nets. Machine Learning 94(1):105–125.
• [2013] Shalizi, C. R., and Rinaldo, A. 2013. Consistency under sampling of exponential random graph models. The Annals of Statistics 41(2):508–535.
• [2014] Singh, M., and Vishnoi, N. K. 2014. Entropy, optimization and counting. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, 50–59. ACM.
• [2017] Van den Broeck, G., and Suciu, D. 2017. Query Processing on Probabilistic Data: A Survey. Foundations and Trends in Databases. Now Publishers.
• [2014] Van den Broeck, G.; Meert, W.; and Darwiche, A. 2014. Skolemization for weighted first-order model counting. In Proc. of KR-14.
• [2011] Van den Broeck, G. 2011. On the completeness of first-order knowledge compilation for lifted probabilistic inference. In Proc. of NIPS-11, 1386–1394.