
# Lifted Marginal MAP Inference

*Paper accepted in UAI-18 (Sharma et al. 2018).*

Vishal Sharma1, Noman Ahmed Sheikh2, Happy Mittal1, Vibhav Gogate3 and Parag Singla1
1IIT Delhi, {vishal.sharma, happy.mittal, parags}@cse.iitd.ac.in
2IIT Delhi, nomanahmedsheikh@outlook.com
3UT Dallas, vgogate@hlt.utdallas.edu
###### Abstract

Lifted inference reduces the complexity of inference in relational probabilistic models by identifying groups of constants (or atoms) which behave symmetrically to each other. A number of techniques have been proposed in the literature for lifting marginal as well as MAP inference. We present the first application of lifting rules for marginal-MAP (MMAP), an important inference problem in models having latent (random) variables. Our main contribution is twofold: (1) we define a new equivalence class of (logical) variables, called Single Occurrence for MAX (SOM), and show that the solution lies at an extreme with respect to the SOM variables, i.e., predicate groundings differing only in the instantiation of the SOM variables take the same truth value; (2) we define a sub-class SOM-R (SOM Reduce) and exploit properties of extreme assignments to show that MMAP inference can be performed by reducing the domain of SOM-R variables to a single constant. We refer to our lifting technique as the SOM-R rule for lifted MMAP. Combined with existing rules such as decomposer and binomial, this results in a powerful framework for lifted MMAP. Experiments on three benchmark domains show significant gains in both time and memory compared to ground inference as well as lifted approaches not using SOM-R.



## 1 Introduction

Several real-world applications, such as those in NLP, vision and biology, need to handle non-i.i.d. data as well as represent uncertainty. Relational probabilistic models (?) such as Markov logic networks (?) combine the power of relational representations with statistical models to achieve this objective. The naïve approach to inference in these domains grounds the relational network into a propositional one and then applies existing inference techniques. This can often result in sub-optimal performance for a large number of applications, since inference proceeds oblivious to the underlying network structure.

Lifted inference (?) overcomes this shortcoming by collectively reasoning about groups of constants (atoms) which are identical to each other. Starting with the work of Poole (?), a number of techniques which lift propositional inference to the first-order level have been proposed in the literature. For instance, for marginal inference, exact algorithms such as variable elimination and AND/OR search, and approximate algorithms such as belief propagation and MCMC sampling, have been lifted to the first-order level (cf. (???????)). More recently, there has been increasing interest in lifting MAP inference, both exact and approximate (???). Some recent work has looked at the problem of approximate lifting, i.e., combining those constants (atoms) which are similar but not necessarily identical (???).

Despite a large body of work on lifted inference, to the best of our knowledge, there is no work on lifted algorithms for solving marginal maximum-a-posteriori (MMAP) queries. MMAP inference is ubiquitous in real-world domains, especially those having latent (random) variables. It is well known that in many real-world domains, the use of latent variables significantly improves prediction accuracy (?). Moreover, the problem also shows up in SRL domains, in tasks such as plan and activity recognition (?). Therefore, efficient lifted methods for solving the MMAP problem are quite desirable.

MMAP inference is much harder than marginal (sum) and MAP (max) inference because the sum and max operators do not commute. In particular, the latent (random) variables need to be marginalized out before the MAP assignment can be computed over the query (random) variables; as a result, MMAP is NP-hard even on tree graphical models (?). Popular approaches for solving MMAP include variational algorithms (?), AND/OR search (?) and parity solvers (?).
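The non-commutativity of max and sum can be seen on a toy two-variable model; the weight table below is an illustrative assumption, not taken from the paper:

```python
# Toy unnormalized weights w(q, s) over a query variable Q and a latent
# variable S (values are illustrative). MMAP sums out S first; joint MAP
# maximizes over both and then projects onto Q. The two disagree here.
w = {(0, 0): 1.0, (0, 1): 5.0,
     (1, 0): 3.0, (1, 1): 3.5}

# MMAP: argmax_q sum_s w(q, s)
marg = {q: w[(q, 0)] + w[(q, 1)] for q in (0, 1)}
mmap_q = max(marg, key=marg.get)

# Joint MAP: argmax_{q,s} w(q, s), projected onto Q
map_q = max(w, key=w.get)[0]

print(mmap_q, map_q)  # the two answers differ: 1 vs 0
```

Here the MAP projection picks q = 0 (driven by the single large weight), while marginalizing S out makes q = 1 optimal.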

In this paper, we propose the first lifting algorithm for MMAP by extending the class of lifting rules (???). As our first contribution, we define a new equivalence class of (logical) variables called Single Occurrence for MAX (SOM). We show that the MMAP solution lies at an extreme with respect to the SOM variables, i.e., predicate groundings which differ only in the instantiation of the SOM variables take the same truth (true/false) value in the MMAP assignment. The proof is fairly involved due to the presence of both MAX and SUM operations in MMAP, and involves a series of problem transformations followed by exploiting the convexity of the resulting function.

As our second contribution, we define a sub-class of SOM, referred to as SOM-R (SOM Reduce). Using the properties of extreme assignments, we show that the MMAP solution can be computed by reducing the domain of the SOM-R variables to a single constant. We refer to this as the SOM-R rule for lifted MMAP. The SOM-R rule is often applicable when none of the other rules is, and can result in significant savings since inference complexity is exponential in the domain size in the worst case.

Finally, we show how to combine the SOM-R rule with other lifting rules, e.g., binomial and decomposer, resulting in a powerful algorithmic framework for lifted MMAP inference. Our experiments on three different benchmark domains clearly demonstrate that our lifting technique can result in orders-of-magnitude savings in both time and memory compared to ground inference as well as vanilla lifting (not using the SOM-R rule).

## 2 Background

First-Order Logic: The language of first-order logic (?) consists of constant, variable, predicate, and function symbols. A term is a variable, a constant, or is obtained by application of a function to a tuple of terms. Variables in the first-order logic language are often referred to as logical variables, simply referred to as variables henceforth. A predicate defines a relation over the set of its arguments. An atom is obtained by applying a predicate symbol to the corresponding arguments. A ground atom is an atom having no variables in it. Formulas are obtained by combining predicates using a set of operators: ∧ (and), ∨ (or) and ¬ (not). Variables in a formula can be universally or existentially quantified using the operators ∀ and ∃, respectively. A first-order theory (knowledge base) is a set of formulas. We will restrict our attention to function-free finite first-order logic with Herbrand interpretations (?). We will also restrict our attention to the case of universally quantified variables. In the process of (partially) grounding a theory, we replace all (some) of the universally quantified variables with the possible constants in the domain. In the following, we will use capital letters (e.g., X, Y, Z) to denote logical variables and lower-case letters (e.g., x, y, z) to denote constants. We will use Δ_X to denote the domain of variable X.

Markov Logic: A Markov logic network (?) (MLN) is defined as a set of pairs (f_i, w_i), where f_i is a formula in first-order logic and w_i is the weight of f_i. We will use F to denote the set of all the formulas in the MLN. Let V denote the set of all the logical variables appearing in the MLN. An MLN can be seen as a template for constructing ground Markov networks. Given the domain Δ_X for every variable X ∈ V, the ground network constructed by the MLN has a node for every ground atom and a feature for every ground formula. Let P denote the set of all the predicates appearing in F. We will use T_g to denote all the ground atoms corresponding to the set P and t to denote an assignment, i.e., a vector of true/false values, to T_g. The distribution specified by an MLN is given as:

 P(T_g = t) = (1/Z) exp( ∑_{i=1}^{n} ∑_{j=1}^{m_i} w_i f_{ij}(t) )    (1)

where m_i denotes the number of groundings of the i-th formula and f_{ij} represents the feature corresponding to the j-th grounding of the i-th formula. The feature is on (1) if the corresponding ground formula is satisfied under the assignment t, and off (0) otherwise. Z is the normalization constant. Equivalently, in the potential-function representation, the distribution can be written as:

 P(t) = (1/Z) ∏_{i=1}^{n} ∏_{j=1}^{m_i} ϕ_{ij}(t)    (2)

where there is a potential ϕ_{ij} for each feature f_{ij} such that ϕ_{ij}(t) = exp(w_i f_{ij}(t)).
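As a concrete (hypothetical) instance of Equations 1 and 2, the following sketch evaluates the distribution of a tiny ground MLN with the single formula w : Smokes(x) ⇒ Cancer(x) over a two-constant domain; the predicate names, domain and weight are assumptions for illustration:

```python
import itertools
from math import exp

domain = ["a", "b"]
w = 1.5                                   # weight of Smokes(x) => Cancer(x)
atoms = [f"{p}({c})" for p in ("Smokes", "Cancer") for c in domain]

def feature(t, c):
    # f_ij for constant c: 1 if the ground clause Smokes(c) => Cancer(c) holds
    return 0.0 if (t[f"Smokes({c})"] and not t[f"Cancer({c})"]) else 1.0

def unnorm(t):
    # exp of the weighted feature sum (Equation 1 without 1/Z)
    return exp(sum(w * feature(t, c) for c in domain))

# Z sums the unnormalized weight over all 2^4 truth assignments
Z = sum(unnorm(dict(zip(atoms, vals)))
        for vals in itertools.product([False, True], repeat=len(atoms)))

t = {a: True for a in atoms}              # everybody smokes and has cancer
print(unnorm(t) / Z)                      # P(T_g = t)
```

The exponential cost of computing Z by enumeration is exactly what lifting aims to avoid.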

Marginal MAP (MMAP): Let the set of all predicates P be divided into two disjoint subsets P^M and P^S, referred to as MAX and SUM predicates, respectively. Let q (resp. s) denote an assignment to all the groundings of the predicates in P^M (resp. P^S). Note that T_g is the union of these groundings and, given an assignment (q, s), t = (q, s). Then, the marginal-MAP (MMAP) problem for MLNs can be defined as:

 argmax_q ∑_s ∏_{i=1}^{n} ∏_{j=1}^{m_i} ϕ_{ij}(q, s) = argmax_q W_M(q)    (3)
 where W_M(q) = ∑_s ∏_{i=1}^{n} ∏_{j=1}^{m_i} ϕ_{ij}(q, s)

W_M(q) is referred to as the MMAP objective function for the MLN M, and its maximizing assignment is referred to as the MMAP solution. Note that we can drop the factor 1/Z in Equation 3, since we are only interested in finding the maximizing assignment and Z is a constant.
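Equation 3 can be made concrete with a brute-force sketch over a toy potential table; the potential values below are assumptions for illustration, and the enumeration is exponential in the number of ground atoms:

```python
import itertools

def phi(q, s1, s2):
    # A single illustrative potential over one MAX atom q and two SUM atoms
    return 2.0 if (q or s1) else 0.5

def W(q):
    # MMAP objective W_M(q): sum out the SUM atoms for a fixed MAX assignment
    return sum(phi(q, s1, s2)
               for s1, s2 in itertools.product((False, True), repeat=2))

mmap = max((False, True), key=W)
print(mmap, W(mmap))  # maximizing MAX assignment and its objective value
```

A real solver must avoid this double enumeration; the lifting rules developed below shrink both the max and the sum.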

Preliminaries: We will assume that our MLN is in normal form (?), i.e., (a) no constants appear in any of the formulas, and (b) if X and Y appear at the same predicate position in one or more formulas, then Δ_X = Δ_Y. Any MLN can be converted into normal form by a series of mechanical operations. We will also assume that formulas are standardized apart, i.e., we rename the variables such that the sets of variables appearing in two different formulas are disjoint. We define an equivalence relation over the set of variables such that X ∼ Y if (a) X and Y appear at the same predicate position, OR (b) ∃Z such that X ∼ Z and Z ∼ Y. We will use X̄ to denote the equivalence class corresponding to variable X. Variables in the same equivalence class must have the same domain due to the normal form assumption. We will use Δ_X̄ to refer to the domain of the variables belonging to X̄.

Finally, though our exposition in this work is in terms of MLNs, our ideas can easily be generalized to other representations such as weighted parfactors (?) and probabilistic knowledge bases (?).

## 3 Single Occurrence for MMAP

### 3.1 Motivation

In this work, we are interested in lifting the marginal-MAP (MMAP) problem. Since MMAP is harder than both marginal and MAP inference, a natural question to examine is whether existing lifting techniques for MAP and marginal inference can be extended to MMAP, or further still, whether additional rules can be discovered for lifting the MMAP problem. Whereas many of the existing rules such as decomposer and binomial (applicable when the binomial predicate belongs to MAX) (??) extend in a straightforward manner to MMAP, unfortunately the SO rule (?), which is a powerful rule for MAP inference, is not directly applicable.

In response, we propose a new rule, referred to as Single Occurrence for MAX Reduce (SOM-R), which is applicable for MMAP inference. We first define a variable equivalence class, referred to as SOM, which requires that (1) no two variables in the class appear in the same formula, and (2) at least one of the variables in the class appears in a MAX predicate. We further define a sub-class of SOM, referred to as SOM-R, which imposes a third condition: (3) either all the SUM predicates in the theory contain a SOM variable or none of them does. Our SOM-R rule states that the domain of SOM-R variables can be reduced to a single constant for MMAP inference. Consider the following example MLN, henceforth referred to as M_1:

 w1 : Frnds(X, Y) ∧ Parent(Z, X) ⇒ Knows(Z, Y)
 w2 : Knows(U, V)
 SUM: Parent    MAX: Frnds, Knows

The equivalence classes in this example are given by {X}, {Y, V} and {Z, U}. It is easy to see that each of these equivalence classes satisfies the three conditions above and hence, the SOM-R rule can be applied over them. This makes the MMAP inference problem independent of the domain size and hence, it can be solved in O(1) time. Ground inference has to deal with O(n²) ground atoms (where n is the domain size), resulting in worst-case complexity exponential in n² (inference complexity is exponential in the number of ground atoms; here, we assume each equivalence class has domain size n). Further, in the absence of the SOM-R rule, none of the existing lifting rules applies and one has to resort to partial grounding, again resulting in worst-case exponential complexity.
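The equivalence-class computation and the two SOM conditions for the example MLN above can be sketched with a small union-find; the encoding of formulas as (predicate, argument) lists is an assumption of this sketch:

```python
# Formulas of the example MLN, standardized apart, in normal form.
formulas = [
    [("Frnds", ["X", "Y"]), ("Parent", ["Z", "X"]), ("Knows", ["Z", "Y"])],
    [("Knows", ["U", "V"])],
]
max_preds = {"Frnds", "Knows"}

parent = {}
def find(v):
    parent.setdefault(v, v)
    while parent[v] != v:
        parent[v] = parent[parent[v]]  # path halving
        v = parent[v]
    return v

def union(a, b):
    parent[find(a)] = find(b)

# Variables appearing at the same predicate position are equivalent.
position_var = {}
for f in formulas:
    for pred, args in f:
        for i, v in enumerate(args):
            find(v)  # register the variable
            if (pred, i) in position_var:
                union(v, position_var[(pred, i)])
            else:
                position_var[(pred, i)] = v

classes = {}
for v in parent:
    classes.setdefault(find(v), set()).add(v)

def is_som(cls):
    # (a) at most one distinct variable of the class per formula
    single = all(len({v for _, args in f for v in args if v in cls}) <= 1
                 for f in formulas)
    # (b) some variable of the class appears in a MAX predicate
    in_max = any(v in cls
                 for f in formulas for pred, args in f
                 if pred in max_preds for v in args)
    return single and in_max

for cls in sorted(classes.values(), key=min):
    print(sorted(cls), is_som(cls))
```

On this input the sketch recovers the three classes {X}, {Y, V} and {Z, U}, each satisfying the SOM conditions.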

We note that the conditions for identifying SOM and SOM-R specifically make use of the structure of the MMAP problem. Whereas condition 1 is the same as Mittal et al.'s SO condition, condition 2 requires the variables in the SOM class to belong to a MAX predicate. Condition 3 (for SOM-R) further refines the SOM conditions so that domain reduction can be applied.

We prove the correctness of our result in two phases. First, we show that the SOM property of an equivalence class implies that the MMAP solution lies at an extreme, meaning that predicate groundings differing only in the instantiation of the SOM class take the same truth value. Second, for the sub-class SOM-R, we further show that the domain can be reduced to a single constant for MMAP. Here, we rely on the properties of extreme assignments.

Our proof strategy makes use of a series of problem transformations followed by exploiting the convexity of the resulting function. These algebraic manipulations are essential to proving the correctness of our result, and are among the important contributions of our paper. Next, we describe each step in detail. The proofs of theorems (and lemmas) marked with (*) are presented in the supplement.

### 3.2 SOM implies Extreme Solution

We introduce some important definitions. We will assume that we are given an MLN M. Further, we are interested in solving an MMAP problem over M where the set of MAX predicates is given by P^M.

###### Definition 1.

(Single Occurrence for MAX) We say that a variable equivalence class X̄ is Single Occurrence for MAX (SOM) if (a) ∀f ∈ F, there is at most one variable from the set X̄ occurring in f, and (b) there exists a variable X ∈ X̄ and a predicate P ∈ P^M such that X appears in P.

Next, we define the notion of an extreme assignment.

###### Definition 2.

(Extreme Assignment) Let X̄ be a variable equivalence class. An assignment q to the MAX predicates lies at an extreme (with respect to X̄) if, ∀P ∈ P^M, all the groundings of P with the same instantiation of the variables not in X̄ take the same value in q.
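Definition 2 can be checked mechanically; the following sketch assumes groundings of a single binary predicate encoded as argument tuples, and tests whether atoms differing only at a designated argument position take the same value:

```python
domain = ["a", "b", "c"]

def is_extreme(assign, pos=0):
    # assign maps argument tuples -> bool for groundings of one predicate;
    # group atoms by the arguments outside `pos` and require a single value.
    groups = {}
    for args, val in assign.items():
        rest = tuple(a for i, a in enumerate(args) if i != pos)
        groups.setdefault(rest, set()).add(val)
    return all(len(vals) == 1 for vals in groups.values())

# Truth value depends only on the second argument: extreme w.r.t. position 0
knows = {(z, y): (y == "a") for z in domain for y in domain}
print(is_extreme(knows))      # True

knows[("a", "b")] = True      # now Knows(a,b) != Knows(b,b): not extreme
print(is_extreme(knows))      # False
```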

In M_1, an extreme assignment with respect to the variable equivalence class {Z, U} will assign the same truth value to the ground atoms Knows(z_1, y) and Knows(z_2, y) for any pair of constants z_1, z_2, i.e., to groundings of Knows that differ only in their first argument. We next define the notion of an MLN variablized with respect to a variable equivalence class.

###### Definition 3.

(Variablized MLN) Let X̄ be an equivalence class. Let M_X̄ be the MLN obtained by instantiating (grounding) all the variables not in X̄. We say that M_X̄ is variablized (only) with respect to the set X̄.

It is important to note that M_X̄ represents the same distribution as M. Further, M_X̄ can be converted back into normal form by introducing a new predicate for every combination of constants appearing in a predicate. In M_1, variablizing with respect to the equivalence class {Y, V} results in an MLN with formulas similar to:

 w1 : Frnds(x, Y) ∧ Parent(z, x) ⇒ Knows(z, Y)
 w2 : Knows(u, V)

where x, z and u are constants belonging to their respective domains. Frnds(x, Y), Knows(z, Y) and Knows(u, V) can be treated as unary predicates over the domain of the equivalence class {Y, V} since x, z and u are constants. Similarly, Parent(z, x) can be treated as a propositional predicate. We now state one of the main theorems of this paper.

###### Theorem 1.

Let M be an MLN and let X̄ be a SOM equivalence class. Then, an MMAP solution for M lies at an extreme with respect to X̄.

We will prove the above theorem by defining a series of problem transformations. In the following, we will work with the MLN M and a SOM variable equivalence class X̄. We will use P^M and P^S to denote the sets of MAX and SUM predicates, respectively; q and s will denote the assignments to the respective predicate groundings (see Background, Section 2).

#### 3.2.1 Problem Transformation (PT) 1

Objective PT1: Convert MMAP objective into a form which only has unary and propositional predicates.

###### Lemma 1.

Let M_X̄ denote the MLN variablized with respect to the SOM equivalence class X̄. Then, M_X̄ contains only unary and propositional predicates. Further, the MMAP objective can be written as:

 argmax_q W_M(q) = argmax_q W_{M_X̄}(q)

The proof that M_X̄ only has unary and propositional predicates follows immediately from the definition of M_X̄ (Defn. 3) and the fact that X̄ is SOM. Further, since M and M_X̄ define the same distribution, we have the equivalence of the MMAP objectives. Since M_X̄ only has unary and propositional predicates, we will split the assignment q to the groundings of P^M into (q_u, q_p), where q_u and q_p denote the assignments to the groundings of unary and propositional predicates, respectively. Similarly, we split the assignment s to P^S as (s_u, s_p).

#### 3.2.2 Problem Transformation 2

Objective PT2: In the MMAP objective, get rid of propositional MAX predicates.

###### Lemma 2.

(*) Consider the MMAP problem over M_X̄. Let q_p be some assignment to the propositional MAX predicates. Let M′_X̄ be the MLN obtained by substituting the truth values in q_p for the propositional predicates. Then, if M′_X̄ has a solution at an extreme for all possible assignments of the form q_p, then M_X̄ also has a solution at an extreme.

Therefore, in order to prove the extreme property for M_X̄, it is sufficient to prove it for a generic MLN M′_X̄, i.e., without making any assumptions on the form of q_p.

For ease of notation, we will drop the prime in M′_X̄ and simply refer to it as M_X̄. Therefore, we need to show that the solution to the following problem lies at an extreme:

 argmaxquWM~X(q)

where the propositional MAX predicates have been eliminated from M_X̄.

#### 3.2.3 Problem Transformation 3

Objective PT3: In the MMAP objective, get rid of unary SUM predicates using inversion elimination (?).
First, we note that the MMAP objective:

 W_{M_X̄}(q_u) = ∑_{s_p, s_u} ∏_{i=1}^{n} ∏_{j=1}^{m_i} ϕ_{ij}(q_u, s_p, s_u)
 can be equivalently written as:
 W_{M_X̄}(q_u) = ∑_{s_p, s_u} ∏_{i=1}^{n} ∏_{j=1}^{m} ϕ′_{ij}(q_u, s_p, s_u)

where m = |Δ_X̄|; ϕ′_{ij} = ϕ_{ij} if f_i contains a variable from X̄, and ϕ′_{ij} = (ϕ_{i1})^{1/m} otherwise. It is easy to see this equivalence, since the only variables in the theory are from the class X̄. When f_i contains a variable from X̄, it has exactly m groundings since X̄ is SOM. On the other hand, if f_i does not contain a variable from X̄, it contains only propositional predicates; we then raise its single potential to the power 1/m and multiply it m times in the latter expression to get an equivalent form. Next, we state an important lemma.

###### Lemma 3.

The MMAP problem over M_X̄ can be written as:

 argmax_{q_u} W_{M_X̄}(q_u) = argmax_{q_u} ∑_{s_p} ∏_{j=1}^{m} Θ_j(q_u, s_p)

where Θ_j is a function of the unary MAX and propositional SUM predicate groundings q_u and s_p, respectively.

Proof. We can write the MMAP objective as:

 W_{M_X̄}(q_u) = ∑_{s_p, s_u} ∏_{i=1}^{n} ∏_{j=1}^{m} ϕ′_{ij}(q_u, s_p, s_u)
             = ∑_{s_p, s_u} ∏_{j=1}^{m} ∏_{i=1}^{n} ϕ′_{ij}(q_u, s_p, s_u)
             = ∑_{s_p, s_u} ∏_{j=1}^{m} Φ_j(q_u, s_p, s_u)
             = ∑_{s_p} ∑_{s_u1, s_u2, …, s_um} ∏_{j=1}^{m} Φ_j(q_u, s_p, s_uj)
             = ∑_{s_p} ∏_{j=1}^{m} ∑_{s_uj} Φ_j(q_u, s_p, s_uj)    (apply inversion elimination)
             = ∑_{s_p} ∏_{j=1}^{m} Θ_j(q_u, s_p)

Proof Explanation: The second equality is obtained by interchanging the two products. The third equality is obtained by defining Φ_j = ∏_{i=1}^{n} ϕ′_{ij}. In the fourth equality, we have made explicit the dependence of Φ_j on s_uj, i.e., the SUM groundings corresponding to the j-th constant.
Inversion Elimination (?): Since Φ_j only depends on the s_uj groundings (among s_u), we can use inversion elimination to invert the sum over s_u and the product over j in the fifth equality.
Final Expression: We define Θ_j(q_u, s_p) = ∑_{s_uj} Φ_j(q_u, s_p, s_uj).
Note that, at this point, we have only propositional SUM and unary MAX predicates in the transformed MMAP objective.
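The inversion-elimination step is just distributivity of sums over independent blocks; a numeric check, with illustrative positive factor values assumed by the sketch:

```python
import itertools
from math import prod

m = 3
vals = (False, True)

def Phi(j, suj):
    # Illustrative positive factor: depends only on its own block s_uj
    return 1.0 + 0.5 * j + (2.0 if suj else 1.0)

# LHS: sum over joint assignments (s_u1, ..., s_um) of the product of factors
lhs = sum(prod(Phi(j, su[j]) for j in range(m))
          for su in itertools.product(vals, repeat=m))

# RHS: product over j of per-block sums (the inverted form)
rhs = prod(sum(Phi(j, suj) for suj in vals) for j in range(m))

print(abs(lhs - rhs) < 1e-9)
```

The inverted form replaces 2^m summands with m small sums, which is the source of the complexity reduction.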

#### 3.2.4 Problem Transformation 4

Objective PT4: Exploit symmetry of the potential functions in the MMAP objective.
We rename q_u to q and s_p to s for ease of notation in Lemma 3. The MMAP objective can be written as:

 W_{M_X̄}(q) = ∑_s ∏_{j=1}^{m} Θ_j(q_j, s)    (4)

Here, q = (q_1, q_2, …, q_m) and q_j represents the assignment to the unary MAX predicate groundings corresponding to the j-th constant. In the expression above, we have made explicit the dependence of Θ_j on q_j. We make the following two observations.

1) Due to the normal form assumption, all the groundings of a first-order formula are identical to each other (up to renaming of a constant). Hence, the resulting potential functions Θ_j are also identical to each other.

2) If there are r unary MAX predicates in M_X̄, then each q_j can take R = 2^r possible values (since there are r predicate groundings for each constant and each grounding is Boolean valued).

Using these observations, we note that the value of the product on the RHS of Equation 4 depends only on the number of times each distinct value appears among the q_j's (and not on which q_j takes which value). Let {v_1, v_2, …, v_R} denote the set of different values that the q_j's can take. Given a value v_l, let N_l denote the number of times v_l appears in q. Then, we have the following lemma.

###### Lemma 4.

The MMAP problem can be written as:

 argmax_q W_{M_X̄}(q) = argmax_{N_1, N_2, ⋯, N_R} ∑_s ∏_{l=1}^{R} f_l(s)^{N_l}

subject to the constraints that N_l ≥ 0 and ∑_{l=1}^{R} N_l = m. Here, f_l(s) = Θ(v_l, s).

Proof. The proof follows from the fact that the Θ_j's are symmetric to each other and that the q_j's take a total of R possible (non-unique) values, with the counts N_1, …, N_R summing to m.

We say that an assignment (N_1, N_2, …, N_R) subject to the constraints N_l ≥ 0 and ∑_{l=1}^{R} N_l = m is at an extreme if ∃l such that N_l = m. Note that for l′ ≠ l, an extreme assignment also implies that N_{l′} = 0. We have the following lemma.

###### Lemma 5.

(*) The solution to the MMAP formulation (Equation 4) lies at an extreme iff the solution to its equivalent formulation:

 argmax_{N_1, N_2, ⋯, N_R} ∑_s ∏_{l=1}^{R} f_l(s)^{N_l}

subject to the constraints N_l ≥ 0 and ∑_{l=1}^{R} N_l = m lies at an extreme.

#### 3.2.5 Proving Extreme

###### Lemma 6.

Consider the optimization problem:

 argmax_{N_1, N_2, ⋯, N_R} ∑_s g(s) × ∏_{l=1}^{R} f_l(s)^{N_l}

subject to the constraints N_l ≥ 0 and ∑_{l=1}^{R} N_l = m. Here, g is an arbitrary real-valued function independent of N_1, …, N_R. The solution of this optimization problem lies at an extreme.

Proof. Note that it suffices to prove the lemma assuming the N_l's are real-valued: if the solution is at an extreme with real-valued N_l's, it must also be at an extreme when the N_l's are further constrained to be integer-valued. We will use induction on R to prove the result. Consider the base case of R = 2: the function becomes ∑_s g(s) f_1(s)^{N_1} f_2(s)^{m − N_1}. This function is convex in N_1 and attains its maximum at N_1 = 0 or N_1 = m (see the supplement for a proof).

Assume that the induction hypothesis holds for R − 1; we need to show it for R. We will prove it by contradiction. Assume that the solution to this problem does not lie at an extreme. Then, in this solution, it must be the case that 0 < N_l < m for every l: if not, we can reduce the problem to an (R − 1)-sized one and apply our induction hypothesis to get an extreme solution. Let N_R have the optimal value k in this solution; clearly 0 < k < m. Then, substituting this value, we can obtain the optimal values of the remaining components by solving argmax_{N_1, …, N_{R−1}} ∑_s g′(s) ∏_{l=1}^{R−1} f_l(s)^{N_l}, subject to ∑_{l=1}^{R−1} N_l = m − k. Here, g′(s) = g(s) f_R(s)^k. Using the induction hypothesis, the solution to this sub-problem must be at an extreme, i.e., some N_l = m − k with the remaining components equal to 0, since m − k > 0. This is a contradiction.
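The base-case convexity argument can be checked numerically; the g, f_1, f_2 values below are arbitrary positive numbers assumed by the sketch, and the maximum over the grid indeed lands on an endpoint:

```python
# h(N1) = sum_s g(s) f1(s)^N1 f2(s)^(m - N1) is a sum of terms a * b^N1
# with b > 0, hence convex in N1; its maximum over [0, m] is at an endpoint.
m = 10
S = [0, 1, 2]
g = {0: 0.5, 1: 1.0, 2: 2.0}
f1 = {0: 1.3, 1: 0.7, 2: 1.1}
f2 = {0: 0.9, 1: 1.2, 2: 0.8}

def h(n1):
    return sum(g[s] * f1[s] ** n1 * f2[s] ** (m - n1) for s in S)

values = [h(n1) for n1 in range(m + 1)]
best = max(range(m + 1), key=lambda n1: values[n1])
print(best in (0, m))  # maximum attained at an extreme
```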

###### Corollary 1.

The solution to the optimization problem

 argmax_{N_1, N_2, ⋯, N_R} ∑_s ∏_{l=1}^{R} f_l(s)^{N_l}

subject to the constraints N_l ≥ 0 and ∑_{l=1}^{R} N_l = m lies at an extreme (this is the special case of Lemma 6 with g(s) = 1).

Theorem 1 (Proof): Corollary 1 combined with Lemma 5, Lemma 4, Lemma 3, Lemma 2 and Lemma 1 proves the theorem.

### 3.3 SOM-R Rule for lifted MMAP

We will first define the SOM-R (SOM Reduce) equivalence class, which is a sub-class of SOM. Following our notation, we will use P^M and P^S to denote the sets of MAX and SUM predicates, respectively, in the MMAP problem.

###### Definition 4.

We say that an equivalence class of variables X̄ is SOM-R if (a) X̄ is SOM, and (b) either every predicate P ∈ P^S contains a variable from X̄, or no predicate P ∈ P^S contains a variable from X̄.

Note that if P^S = ∅ (i.e., the problem is pure MAP), then any SOM equivalence class is also necessarily SOM-R. Next, we exploit the properties of extreme assignments to show that the domain of SOM-R variables can be reduced to a single constant for MMAP inference. We start with the definition of a reduced MLN.
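Condition (b) of the SOM-R definition can be checked directly on the example MLN above; the encoding (predicate → argument-variable lists per occurrence) is an assumption of this sketch:

```python
# Predicate occurrences of the example MLN, with Parent as the only SUM predicate.
predicates = {
    "Frnds": [["X", "Y"]],
    "Parent": [["Z", "X"]],
    "Knows": [["Z", "Y"], ["U", "V"]],
}
sum_preds = {"Parent"}

def is_som_r_condition_b(cls):
    # Either every SUM predicate contains a class variable, or none does.
    contains = [any(v in cls for occ in predicates[p] for v in occ)
                for p in sum_preds]
    return all(contains) or not any(contains)

for cls in ({"X"}, {"Y", "V"}, {"Z", "U"}):
    print(sorted(cls), is_som_r_condition_b(cls))
```

On this input all three equivalence classes pass, consistent with the earlier claim that the SOM-R rule applies to each of them.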

###### Definition 5.

Let F denote the set of (weighted) formulas in M. Let X̄ be a SOM-R equivalence class with |Δ_X̄| = m. We construct a reduced MLN M_r by considering the following two cases:

CASE 1: every P ∈ P^S contains a variable from X̄:

For every formula f containing a variable from X̄, add (f, w_f) to M_r.

For every formula f not containing a variable from X̄, add (f, w_f/m) to M_r.

CASE 2: no P ∈ P^S contains a variable from X̄:

For every formula f containing a variable from X̄, add (f, m · w_f) to M_r.

For every formula f not containing a variable from X̄, add (f, w_f) to M_r.

In each case, we reduce the domain of X̄ to a single constant in M_r.

We are ready to state our SOM-R rule for lifted MMAP.

###### Theorem 2.

(SOM-R Rule for MMAP) Let X̄ be a SOM-R equivalence class. Let M_r be the reduced MLN in which the domain of X̄ has been reduced to a single constant. Then, the MMAP problem over M can be equivalently solved over M_r.

Proof. Let P^M denote the set of MAX predicates in the problem. We prove the above theorem in two parts. In Lemma 7 below, we show that for every extreme assignment (with respect to X̄) to the groundings of P^M in M, there is a corresponding extreme assignment in M_r (and vice versa). In Lemma 8, we show that given two corresponding extreme assignments q̂ and q̂_r for the respective MLNs, the MMAP value at q̂ (in M) is a monotonically increasing function of the MMAP value at q̂_r (in M_r). These two facts, combined with the fact that the MMAP solution to the original problem is at an extreme (using Theorem 1), prove the desired result. Next, we prove each result in turn.

###### Lemma 7.

Let Q̂ (resp. Q̂_r) denote the set of extreme assignments to the groundings of P^M in M (resp. M_r). There exists a one-to-one mapping between Q̂ and Q̂_r.

Proof. Instead of directly working with M and M_r, we will prove this lemma for the corresponding variablized MLNs M_X̄ and M_{r,X̄}. This can be done since the process of variablization preserves the distribution as well as the set of extreme assignments. Let q̂ denote an extreme assignment to the MAX predicates in M_X̄. We will construct a corresponding assignment q̂_r for the MAX predicates in M_{r,X̄}. Since X̄ is SOM-R, M_X̄ has only unary and propositional predicates, whereas M_{r,X̄} is fully ground since the domain of X̄ is reduced to a single constant.

First, let us consider a propositional MAX predicate P in M_X̄. Since P is ground both in M_X̄ and in M_{r,X̄}, we can assign the value of P in q̂_r to be the same as in q̂. Next, let us consider a unary predicate P. Let the assignments to the m groundings of P in q̂ be given by the set {P(1), P(2), …, P(m)}. Since q̂ is extreme, each element in this set takes the same truth value. We can simply assign this value to the single ground appearance of P in M_{r,X̄}. Hence, we get a mapping from Q̂ to Q̂_r. It is easy to see that we can get a reverse mapping from Q̂_r to Q̂ in a similar manner. Hence proved.
Next, we state the relationship between the MMAP values obtained by the extreme assignments in M and M_r.

###### Lemma 8.

(*) Let M be an MLN and M_r be the reduced MLN with respect to the SOM-R equivalence class X̄. Let q̂ and q̂_r denote two corresponding extreme assignments in M and M_r, respectively. Then, ∃ a monotonically increasing function G such that W_M(q̂) = G(W_{M_r}(q̂_r)).

The proof of Lemma 8 exploits inversion elimination and symmetry of potential functions (similar to their use in Sections 3.2.3 and 3.2.4, respectively), which are our key insights for reducing the complexity of inference compared to ground inference (see supplement for details).

###### Corollary 2.

The SOM-R rule for the MMAP problem subsumes the SO rule for the MAP problem given by Mittal et al. (?).

The corollary follows from the fact that MAP is a special case of MMAP in which all the predicates are MAX.

## 4 Algorithmic Framework

The SOM-R rule can be combined with existing lifted inference rules such as lifted decomposition and conditioning (??) (with minor modifications) to yield a powerful search-based algorithm for solving MMAP (see Algorithm 1). The algorithm takes as input an MLN M, the set of MAX predicates P^M, the set of SUM predicates P^S and a ground MMAP solver S. It has six steps. In the first step, the algorithm checks whether the MLN, along with P^M and P^S, can be partitioned into disjoint MLNs that do not share any ground atoms. If this condition is satisfied, then the MMAP solution can be constructed by solving each component independently and simply concatenating the individual solutions. In the next three steps, we apply the decomposer (?), SOM-R (this work) and binomial (??) rules, in that order. The first two reduce the domain of all logical variables in the equivalence class to a single constant and thus yield exponential reductions in complexity; therefore, they are applied before the binomial rule, which creates multiple smaller sub-problems. In the algorithm, the decomposer step produces an MLN obtained from M by setting the domain of the decomposer equivalence class to a single constant. Similarly, the SOM-R step produces the reduced MLN obtained from M by applying the SOM-R rule (see Definition 5).

The binomial rule (steps 4a and 4b) efficiently conditions on the unary predicates and can be applied over the SUM as well as the MAX predicates. However, care must be taken to ensure that all MAX predicates are instantiated before the SUM predicates. Therefore, the binomial rule is applied over the SUM predicates only when the MLN has no MAX predicates (step 4b). In the algorithm, conditioning a unary predicate P on a count k refers to the MLN obtained from M by setting exactly k groundings of P to true and the remaining to false.

If none of the lifting rules is applicable and the MLN has only ground atoms, we return the solution produced by the propositional solver S. Otherwise, if not all predicates are ground, we resort to partial grounding: we heuristically ground a logical variable and recurse on the resulting MLN.

Finally, note that the algorithm returns the exponentiated weight of the MMAP assignment. The assignment can be recovered by tracing the recursion backwards.
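The rule priority described above can be summarized in a skeleton; the rule detectors below are stubs that always fail (an assumption of this sketch), so the call degenerates to the ground solver, but the control flow mirrors the described algorithm for the MAX-predicate case:

```python
# Stub rule detectors: a real implementation would analyze the MLN structure.
def find_decomposer(mln): return None
def find_som_r(mln): return None
def find_binomial(mln): return None

def reduce_domain(mln, cls): return mln          # stub: shrink cls to 1 constant
def condition(mln, pred, k): return mln          # stub: fix k true groundings

def lifted_mmap(mln, ground_solver, domain_size=2):
    if (cls := find_decomposer(mln)) is not None:
        # Decomposer: solve with a single constant; independent identical
        # components multiply, so raise to the domain size.
        return lifted_mmap(reduce_domain(mln, cls), ground_solver) ** domain_size
    if (cls := find_som_r(mln)) is not None:
        # SOM-R rule (Theorem 2): reduce the class domain to one constant.
        return lifted_mmap(reduce_domain(mln, cls), ground_solver)
    if (pred := find_binomial(mln)) is not None:
        # Binomial over a MAX predicate: branch on the count k of true
        # groundings and keep the best branch.
        return max(lifted_mmap(condition(mln, pred, k), ground_solver)
                   for k in range(domain_size + 1))
    return ground_solver(mln)                    # fully ground fallback

print(lifted_mmap("tiny-mln", lambda m: 42.0))
```

Partial grounding and the disjoint-component split are omitted here for brevity; they would slot in before and after the rule chain, respectively.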

Heuristics: (a) Binomial: In case of multiple possible binomial applications, we pick the one which results in the application of other lifting rules (in the priority order described above) using a one step look ahead. In case of a tie, we pick the one with maximum domain size.

(b) Partial Grounding: We pick the equivalence class which results in further application of lifting rules (in the priority order) using a one step look ahead. In case of a tie, we pick the one which has smallest domain size.

## 5 Experiments

The goal of our experiments is twofold. First, we would like to examine the efficacy of lifting for MMAP. Second, we would like to analyze the contribution of the SOM-R rule to lifting. Towards this end, we compare the following three algorithms: (1) Ground: ground inference with no lifting whatsoever; (2) Lifted-Basic: lifted inference without the SOM-R rule (we use the rules described in Algorithm 1; for Lifted-Basic, too many applications of the binomial rule led to a blow-up, so we restricted the algorithm to a single binomial application before any partial grounding; Lifted-SOM-R had no such issues); (3) Lifted-SOM-R: lifted inference using all our lifting rules, including SOM-R. For ground inference, we use a publicly available exact solver built on top of AND/OR search developed by Marinescu et al. (?).

We experiment with three benchmark MLNs: (1) Student (?) (2) IMDB (?) (3) Friends & Smokers (FS) (?). All the datasets are described in the lower part of Figure 1 along with the MAP predicates used in each case; the remaining predicates are treated as marginal predicates. Weights of the formulas were manually set.

We compare the performance of the three algorithms on two different metrics: (a) time taken for inference (b) memory used. We used a time-out of 30 minutes for each run. Memory was measured in terms of the number of formulas in the ground network in each case. We do not compare the solution quality since all the algorithms are guaranteed to produce MMAP assignments with same (optimal) probability. All the experiments were run on a 2.20 GHz Xeon(R) E5-2660 v2 server with 10 cores and 62 GB RAM.

Results: For each of the graphs in Figure 1, we plot time (or memory) on the y-axis (log scale) and domain size on the x-axis. Time is measured in seconds. Since we are primarily concerned with scaling behavior, we use the number of ground formulas as a proxy for actual memory usage. Domain size is measured as a function of a scaling factor, i.e., the number by which all of the starting domain sizes are multiplied. We refer to the domain descriptions (Figure 1) for the starting sizes.

Figures 1(a) and 1(d) compare the performance of the three algorithms on the Student dataset. None of the lifting rules apply for Lifted-Basic; hence, its performance is identical to Ground. For Lifted-SOM-R, all the variables (except teacher(T)) can be reduced to a single constant, resulting in a significant reduction in the size of the ground theory. Lifted-SOM-R is orders of magnitude better than Ground and Lifted-Basic in both time and memory.
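To make the memory metric concrete, the following sketch (illustrative only; the formula structure, variable names and domain sizes are made up and are not the actual Student MLN) counts ground formulas as the product of the domain sizes of each formula's variables, and shows how reducing SOM-R variable domains to a single constant shrinks the ground theory multiplicatively.

```python
from math import prod

def num_ground_formulas(formulas, domains):
    """Each formula is a tuple of logical-variable names; its number of
    groundings is the product of the variables' domain sizes."""
    return sum(prod(domains[v] for v in f) for f in formulas)

# Hypothetical theory with variables S, C, T and made-up domain sizes.
formulas = [("S", "C"), ("S", "C", "T")]
domains = {"S": 50, "C": 20, "T": 10}
print(num_ground_formulas(formulas, domains))  # 11000 ground formulas

# If S and C belong to SOM-R classes, their domains reduce to one constant.
reduced = dict(domains, S=1, C=1)
print(num_ground_formulas(formulas, reduced))  # 11 ground formulas
```

This mirrors the proxy used in our plots: the size of the full grounding grows with the product of all domain sizes, while the reduced theory grows only with the non-reduced domains.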

Figures 1(b) and 1(e) compare the three algorithms on the FS dataset. Here, Lifted-Basic performs identically to Lifted-SOM-R. This is because the binomial rule applies at the very beginning on Smokes, after which the theory decomposes; we never need to apply the SOM-R rule on this domain. Both Lifted-SOM-R and Lifted-Basic perform significantly better than Ground on this domain (in both time and memory).

The IMDB dataset (Figures 1(c) and 1(f)) presents a particularly interesting case of interspersed rule application. For Lifted-SOM-R, the SOM-R rule applies on the movie(M) variables, simplifying the theory, after which the binomial rule can be applied on the Mov, Dir and Act predicates. The theory decomposes after these binomial applications. For Lifted-Basic, though the binomial rule can be applied on Dir and Act, the movie variables still remain, eventually requiring partial grounding. Surprisingly, Ground does slightly better than both lifted approaches in time on smaller domains. This is due to the overhead of solving multiple sub-problems in the binomial rule, with little gain since the domains are quite small. Lifted-SOM-R has much better scaling behavior on larger domains, and it also needs significantly less memory than both of the other approaches.

In none of the above cases does Lifted-SOM-R ever have to partially ground the theory, making a very strong case for using Lifted-SOM-R for MMAP inference in many practical applications. Overall, our experiments clearly demonstrate the utility of SOM-R in scenarios where other lifting rules fail to scale.

## 6 Conclusion

We present the first lifting technique for MMAP. Our main contribution is the SOM-R rule, which states that the domain of a certain equivalence class of variables, referred to as SOM-R, can be reduced to a single constant for the purpose of MMAP inference. We prove the correctness of our rule through a series of problem transformations, followed by exploiting the properties of what we refer to as extreme assignments. Our experiments clearly demonstrate the efficacy of our approach on benchmark domains. Directions for future work include devising additional lifting rules, approximate lifting, and lifting in the presence of constraints (?), all in the context of MMAP, as well as experimenting with a wider set of domains.

#### Acknowledgements

Happy Mittal is supported by the TCS Research Scholars Program. Vibhav Gogate and Parag Singla are supported by the DARPA Explainable Artificial Intelligence (XAI) Program, number N66001-17-2-4032. Parag Singla is supported by an IBM Shared University Research Award and the Visvesvaraya Young Faculty Research Fellowship of the Govt. of India. Vibhav Gogate is supported by National Science Foundation grants IIS-1652835 and IIS-1528037. Any opinions, findings, conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views or official policies, either expressed or implied, of the funding agencies.

## References

• [de Salvo Braz, Amir, and Roth 2005] de Salvo Braz, R.; Amir, E.; and Roth, D. 2005. Lifted first-order probabilistic inference. In Proc. of IJCAI-05, 1319–1325.
• [de Salvo Braz, Amir, and Roth 2006] de Salvo Braz, R.; Amir, E.; and Roth, D. 2006. MPE and partial inversion in lifted probabilistic variable elimination. In Proc. of AAAI-06, 1123–1130.
• [Domingos and Lowd 2009] Domingos, P., and Lowd, D. 2009. Markov Logic: An Interface Layer for Artificial Intelligence. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers.
• [Van den Broeck et al. 2011] Van den Broeck, G.; Taghipour, N.; Meert, W.; Davis, J.; and De Raedt, L. 2011. Lifted probabilistic inference by first-order knowledge compilation. In Proc. of IJCAI-11.
• [Getoor and Taskar 2007] Getoor, L., and Taskar, B., eds. 2007. Introduction to Statistical Relational Learning. MIT Press.
• [Gogate and Domingos 2011] Gogate, V., and Domingos, P. 2011. Probabilistic theorem proving. In Proc. of UAI-11, 256–265.
• [Jha et al. 2010] Jha, A. K.; Gogate, V.; Meliou, A.; and Suciu, D. 2010. Lifted inference seen from the other side : The tractable features. In Proc. of NIPS-10, 973–981.
• [Kersting, Ahmadi, and Natarajan 2009] Kersting, K.; Ahmadi, B.; and Natarajan, S. 2009. Counting belief propagation. In Proc. of UAI-09.
• [Kimmig, Mihalkova, and Getoor 2015] Kimmig, A.; Mihalkova, L.; and Getoor, L. 2015. Lifted graphical models: a survey. Machine Learning 99(1):1–45.
• [Liu and Ihler 2013] Liu, Q., and Ihler, A. T. 2013. Variational algorithms for marginal MAP. Journal of Machine Learning Research 14(1):3165–3200.
• [Maaten, Welling, and Saul 2011] Maaten, L.; Welling, M.; and Saul, L. K. 2011. Hidden-unit conditional random fields. In International Conference on Artificial Intelligence and Statistics, 479–488.
• [Marinescu, Dechter, and Ihler 2014] Marinescu, R.; Dechter, R.; and Ihler, A. T. 2014. AND/OR search for marginal MAP. In Proc. of UAI-14, 563–572.
• [Mittal et al. 2014] Mittal, H.; Goyal, P.; Gogate, V.; and Singla, P. 2014. New rules for domain independent lifted MAP inference. In Proc. of NIPS-14, 649–657.
• [Mittal et al. 2015] Mittal, H.; Mahajan, A.; Gogate, V.; and Singla, P. 2015. Lifted inference rules with constraints. In Advances in Neural Information Processing Systems, 3519–3527.
• [Mittal et al. 2016] Mittal, H.; Singh, S. S.; Gogate, V.; and Singla, P. 2016. Fine grained weight learning in Markov logic networks. In Proc. of IJCAI-16 Wkshp. on Statistical Relational AI.
• [Mladenov, Kersting, and Globerson 2014] Mladenov, M.; Kersting, K.; and Globerson, A. 2014. Efficient lifting of MAP LP relaxations using k-locality. In Proc. of AISTATS-14, 623–632.
• [Niepert 2012] Niepert, M. 2012. Markov chains on orbits of permutation groups. In Proc. of UAI-12, 624–633.
• [Park 2002] Park, J. 2002. MAP Complexity Results and Approximation Methods. In Proc. of UAI-02.
• [Poole 2003] Poole, D. 2003. First-order probabilistic inference. In Proc. of IJCAI-03, 985–991.
• [Russell and Norvig 2010] Russell, S. J., and Norvig, P. 2010. Artificial Intelligence - A Modern Approach. Pearson Education.
• [Sarkhel et al. 2014] Sarkhel, S.; Venugopal, D.; Singla, P.; and Gogate, V. 2014. Lifted MAP inference for Markov logic networks. In Proc. of AISTATS-14, 895–903.
• [Sarkhel, Singla, and Gogate 2015] Sarkhel, S.; Singla, P.; and Gogate, V. G. 2015. Fast lifted MAP inference via partitioning. In Advances in Neural Information Processing Systems, 3240–3248.
• [Singla and Domingos 2008] Singla, P., and Domingos, P. 2008. Lifted first-order belief propagation. In Proc. of AAAI-08.
• [Singla and Mooney 2011] Singla, P., and Mooney, R. 2011. Abductive Markov logic for plan recognition. In Proc. of AAAI-11, 1069–1075.
• [Singla, Nath, and Domingos 2014] Singla, P.; Nath, A.; and Domingos, P. 2014. Approximate lifted belief propagation. In Proc. of AAAI-14, 2497–2504.
• [Van den Broeck and Darwiche 2013] Van den Broeck, G., and Darwiche, A. 2013. On the complexity and approximation of binary evidence in lifted inference. In Advances in Neural Information Processing Systems, 2868–2876.
• [Venugopal and Gogate 2012] Venugopal, D., and Gogate, V. 2012. On lifting the Gibbs sampling algorithm. In Proc. of NIPS-12, 1664–1672.
• [Xue et al. 2016] Xue, Y.; Ermon, S.; Gomes, C. P.; Selman, B.; et al. 2016. Solving marginal MAP problems with NP oracles and parity constraints. In Proc. of NIPS-16, 1127–1135.

## Lemmas

We will start by proving Lemma 0, which will be used in the proof of Lemma 2.

###### Lemma 0.

Given a function $f(x, y)$, let $(x^*, y^*)$ be a maximizing assignment, i.e., $(x^*, y^*) = \operatorname*{argmax}_{x, y} f(x, y)$. Then, $\forall y'$ s.t. $y' \in \operatorname*{argmax}_{y} f(x^*, y)$, $(x^*, y')$ is also a maximizing assignment.

Proof. We can write the following (in)equality:

$$f(x^*, y^*) \leq \max_{y} f(x^*, y) = f(x^*, y')$$

But since $(x^*, y^*)$ was a maximizing assignment for $f$, it must be the case that $f(x^*, y^*) \geq f(x^*, y')$. Hence, $f(x^*, y^*) = f(x^*, y')$, and $(x^*, y')$ must also be a maximizing assignment. Hence, proved.
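This argument is easy to check numerically. The sketch below (illustrative only; a random table stands in for $f$) finds a global maximizer $(x^*, y^*)$ by brute force and verifies that any $y'$ maximizing $f(x^*, \cdot)$ also gives a maximizing pair.

```python
import itertools
import random

random.seed(0)
X, Y = range(4), range(5)
f = {(x, y): random.random() for x in X for y in Y}  # random table for f

# Global maximizer (x*, y*) by brute force.
x_star, y_star = max(itertools.product(X, Y), key=lambda xy: f[xy])
# y' maximizing f(x*, .).
y_prime = max(Y, key=lambda y: f[(x_star, y)])

# (x*, y') attains the same (global) maximum value as (x*, y*).
assert f[(x_star, y_prime)] == f[(x_star, y_star)] == max(f.values())
```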

###### Lemma 2.

Consider the MMAP problem over an MLN $M$. Let $q_p$ be an assignment to the propositional MAP predicates. Let $M'$ be the MLN obtained by substituting the truth values in $q_p$ for the propositional predicates. Then, if $M'$ has a solution at extreme for all possible assignments of the form $q_p$, then $M$ also has a solution at extreme.

Proof. The MMAP problem can be written as:

$$\operatorname*{argmax}_{q_p, q_u} \sum_{s_p, s_u} W_{M, \tilde{X}}(q_p, q_u, s_p, s_u) \qquad (5)$$

Here, $q_p$ and $q_u$ denote an assignment to the propositional and unary MAX predicate groundings in $M$, respectively. Similarly, $s_p$ and $s_u$ denote an assignment to the propositional and unary SUM predicate groundings in $M$, respectively. Let $q_p^*$ denote an optimal assignment to the propositional MAX predicates. Then, using Lemma 0, we can get the MMAP assignment as a solution to the following problem:

$$\operatorname*{argmax}_{q_u} \sum_{s_p, s_u} W_{M', \tilde{X}}(q_u, s_u, s_p) \qquad (6)$$

where $M'$ is obtained by substituting the truth assignment $q_p^*$ in $M$. Since $M'$ has a solution at extreme for every assignment $q_p$, it must also have a solution at extreme when $q_p = q_p^*$. Hence, the solution to $M$ must be at extreme. Hence, proved.

###### Lemma 5.

The solution to the MMAP formulation lies at extreme iff the solution to its equivalent formulation:

$$\operatorname*{argmax}_{N_1, N_2, \cdots, N_R} \sum_{s} \prod_{l=1}^{R} f_l(s)^{N_l} \qquad (7)$$

subject to the constraints $N_l \geq 0\ \forall l$ and $\sum_{l=1}^{R} N_l = m$, lies at extreme.

Proof. If the solution to (7) lies at extreme, then $\exists\, l$ such that $N_l = m$ and $N_{l'} = 0$ for all $l' \neq l$. Let $u_l$ denote the value taken by the groundings of the unary MAX predicates corresponding to index $l$. Since $N_l = m$, all the groundings get the identical value $u_l$. Hence, the solution to the MMAP formulation lies at extreme. A similar proof strategy holds for the other direction as well.
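The extreme property can be sanity-checked by brute force on a toy instance (all numbers below are made up): over integer assignments $N_1, \ldots, N_R \geq 0$ with $\sum_l N_l = m$, the maximum of the objective in (7) is attained at a vertex where some $N_l = m$.

```python
import itertools
import math
import random

random.seed(1)
R, m, S = 3, 4, 5  # number of classes, domain size, number of s values
f = [[random.uniform(0.1, 2.0) for _ in range(S)] for _ in range(R)]

def objective(N):
    # sum over s of prod over l of f_l(s)^{N_l}
    return sum(math.prod(f[l][s] ** N[l] for l in range(R)) for s in range(S))

# All integer points of the simplex sum_l N_l = m.
all_N = [N for N in itertools.product(range(m + 1), repeat=R) if sum(N) == m]
best = max(map(objective, all_N))

# Vertices: exactly one N_l equals m, the rest are 0.
vertices = [tuple(m if l == j else 0 for l in range(R)) for j in range(R)]
vertex_best = max(map(objective, vertices))

assert math.isclose(best, vertex_best)
```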

###### Lemma.

(Induction Base Case in Proof of Lemma 6): Let $f_1$, $f_2$ and $g$ denote real-valued functions of a vector-valued input $s$ (recall that $s$ was an assignment to all the propositional SUM predicates in our original lemma). Further, let each of $f_1$, $f_2$ and $g$ be non-negative. Then, for $m \geq 0$, we define a function $h(N) = \sum_s f_1(s)^N f_2(s)^{m-N} g(s)$, where the domain of $N$ is restricted to the interval $[0, m]$, i.e., $N \in [0, m]$. The maximum of the function lies at $N = 0$ or $N = m$.

Proof. The first derivative of $h$ with respect to $N$ is:

$$\frac{dh}{dN} = \sum_s \Big( f_1(s)^N f_2(s)^{m-N} g(s) \times \big[\log(f_1(s)) - \log(f_2(s))\big] \Big)$$

The second derivative of $h$ with respect to $N$ is given as:

$$\frac{d^2h}{dN^2} = \sum_s \Big( f_1(s)^N f_2(s)^{m-N} g(s) \times \big[\log(f_1(s)) - \log(f_2(s))\big]^2 \Big) \geq 0$$

The inequality follows from the fact that each of $f_1$, $f_2$ and $g$ is non-negative. Hence, the second derivative of $h$ is non-negative, which means the function is convex. Therefore, the maximum value of this function must lie at the end points of its domain, i.e., either at $N = 0$ or at $N = m$.
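A numeric sketch of the same fact (with randomly generated positive $f_1$, $f_2$ and $g$; the instance is illustrative only): $h$ evaluated on a dense grid over $[0, m]$ never exceeds its value at the endpoints.

```python
import random

random.seed(2)
m, S = 3.0, 6
f1 = [random.uniform(0.2, 3.0) for _ in range(S)]
f2 = [random.uniform(0.2, 3.0) for _ in range(S)]
g = [random.uniform(0.2, 3.0) for _ in range(S)]

def h(N):
    # h(N) = sum_s f1(s)^N * f2(s)^(m-N) * g(s)
    return sum(f1[s] ** N * f2[s] ** (m - N) * g[s] for s in range(S))

# Convexity implies the maximum over [0, m] is attained at an endpoint.
grid = [m * i / 1000 for i in range(1001)]
assert max(h(N) for N in grid) <= max(h(0.0), h(m)) + 1e-9
```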

###### Lemma 8.

Let $M$ be an MLN and $M^r$ be the reduced MLN with respect to the SOM-R equivalence class $\tilde{X}$. Let $(q_p, q_u)$ and $(q_p, q_u^1)$ denote two corresponding extreme assignments in $M$ and $M^r$, respectively. Then, $\exists$ a monotonically increasing function $G$ such that $W_{M, \tilde{X}}(q_p, q_u) = G\big(W_{M^r, \tilde{X}}(q_p, q_u^1)\big)$.

Proof. First, we note that if we multiply the weight of a formula in an MLN by a factor $c$, then the corresponding potential (i.e., the potential corresponding to a grounding of the formula) gets raised to the power $c$: if $w$ gets replaced by $c \cdot w$, then, correspondingly, $\phi$ gets replaced by $\phi^c$. We will use this fact in the following proof.
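In the standard log-linear form of an MLN, the potential contributed by a formula with weight $w$ on a world with $n$ true groundings is $\exp(w \cdot n)$, so scaling the weight by $c$ raises the potential to the power $c$. A one-line numeric check (all values made up for illustration):

```python
import math

w, c, n = 1.7, 0.25, 3  # made-up weight, scaling factor, true-grounding count
phi = math.exp(w * n)             # potential with weight w
phi_scaled = math.exp(c * w * n)  # potential with weight c * w

# exp(c*w*n) == (exp(w*n))^c
assert math.isclose(phi_scaled, phi ** c)
```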

As in the case of Lemma 7, we will instead work with the variablized MLNs corresponding to $M$ and $M^r$, respectively. Let $(q_p, q_u)$ be the MMAP assignment for $\tilde{X}$ in $M$, and similarly let $(q_p, q_u^1)$ be the MMAP assignment for $\tilde{X}$ in $M^r$.

For MLN $M$, the MMAP objective at $(q_p, q_u)$ can be written as

$$\sum_{s_p, s_u} \Big( \prod_{i=1}^{r} \prod_{j=1}^{m} \phi_{ij}(q_p, q_u, s_p, s_u) \prod_{k=1}^{t} \phi_k(q_p, s_p) \Big) \qquad (8)$$

where the $\phi_{ij}$ are potentials over formulas containing some variable from $\tilde{X}$ and the $\phi_k$ are potentials over formulas which do not contain any variable from $\tilde{X}$. In particular, note that we have separated out the formulas which involve a variable from the class $\tilde{X}$ from those which do not. $r$ denotes the count of the formulas of the first type and $t$ denotes the count of the formulas of the second type. We will use this form in the following proof.

Let the reduced domain of $\tilde{X}$ in $M^r$ consist of the single constant corresponding to index $1$. Next, we prove the above lemma for the two cases considered in Definition 5:

CASE 1: Every SUM predicate contains a variable from $\tilde{X}$.
In this case, $\phi_{ij}$ and $\phi_k$ will not contain any propositional SUM predicate, i.e., $s_p = \emptyset$.

In this case, while constructing $M^r$, for formulas not containing any variable from $\tilde{X}$, we divided the weight by $m$. Combining this with the result stated at the beginning of this proof, the MMAP objective for $M^r$ can be written as:

$$W_{M^r, \tilde{X}}(q_p, q_u^1) = \sum_{s_u^1} \Big( \prod_{i=1}^{r} \phi_{i1}(q_p, q_u^1, s_u^1) \prod_{k=1}^{t} \phi_k(q_p)^{\frac{1}{m}} \Big) = \Big( \sum_{s_u^1} \prod_{i=1}^{r} \phi_{i1}(q_p, q_u^1, s_u^1) \Big) \prod_{k=1}^{t} \phi_k(q_p)^{\frac{1}{m}}$$

Next, for MLN $M$, we have:

$$W_{M, \tilde{X}}(q_p, q_u) = \sum_{s_u} \Big( \prod_{i=1}^{r} \prod_{j=1}^{m} \phi_{ij}(q_p, q_u, s_u) \prod_{k=1}^{t} \phi_k(q_p) \Big) = \Big( \sum_{s_u} \prod_{j=1}^{m} \prod_{i=1}^{r} \phi_{ij}(q_p, q_u, s_u) \Big) \prod_{k=1}^{t} \phi_k(q_p)$$
$$= \Big( \prod_{j=1}^{m} \sum_{s_u^j} \prod_{i=1}^{r} \phi_{ij}(q_p, q_u^j, s_u^j) \Big) \prod_{k=1}^{t} \phi_k(q_p) = \Big( \prod_{j=1}^{m} \sum_{s_u^j} \prod_{i=1}^{r} \phi_{ij}(q_p, q_u^1, s_u^j) \Big) \prod_{k=1}^{t} \phi_k(q_p)$$
$$= \Big( \sum_{s_u^1} \prod_{i=1}^{r} \phi_{i1}(q_p, q_u^1, s_u^1) \Big)^{m} \prod_{k=1}^{t} \phi_k(q_p)$$

where the second-to-last equality uses the fact that the assignment is at extreme, so $q_u^j = q_u^1$ for every $j$.