Tree Projections and Constraint Optimization Problems: Fixed-Parameter Tractability and Parallel Algorithms
Tree projections provide a unifying framework to deal with most structural decomposition methods of constraint satisfaction problems (CSPs). Within this framework, a CSP instance is decomposed into a number of sub-problems, called views, whose solutions are either already available or can be computed efficiently. The goal is to arrange portions of these views in a tree-like structure, called tree projection, which determines an efficiently solvable CSP instance equivalent to the original one. However, deciding whether a tree projection exists is NP-hard. Solution methods have therefore been proposed in the literature that do not require a tree projection to be given, and that either correctly decide whether the given CSP instance is satisfiable, or return that a tree projection actually does not exist. These approaches had not been generalized so far to deal with CSP extensions tailored for optimization problems, where the goal is to compute a solution of maximum value/minimum cost. The paper fills the gap, by exhibiting a fixed-parameter polynomial-time algorithm that either disproves the existence of tree projections or computes an optimal solution, with the parameter being the size of the expression of the objective function to be optimized over all possible solutions (and not the size of the whole constraint formula, used in related works). Tractability results are also established for the problem of returning the best solutions. Finally, parallel algorithms for such optimization problems are proposed and analyzed.
Given that the classes of acyclic hypergraphs, hypergraphs of bounded treewidth, and hypergraphs of bounded generalized hypertree width are all covered as special cases of the tree projection framework, the results in this paper directly apply to these classes. These classes are extensively considered in the CSP setting, as well as in conjunctive database query evaluation and optimization.
Keywords: Constraint Satisfaction Problems, AI, Optimization Problems, Structural Decomposition Methods, Tree Projections, Parallel Models of Computation, Conjunctive Queries, Query Optimization, Database Theory.
1.1 Optimization in Constraint Satisfaction Problems
Constraint satisfaction is a central topic of research in Artificial Intelligence, and has a wide spectrum of concrete applications ranging from configuration to scheduling, plan design, temporal reasoning, and machine learning, just to name a few.
Formally, a constraint satisfaction problem (for short: CSP) instance is a triple , where is a finite set of variables, is a finite domain of values, and is a finite set of constraints (see, e.g., [D03]). Each constraint , with , is a pair , where is a set of variables called the constraint scope, and is a set of assignments from variables in to values in indicating the allowed combinations of values for the variables in . A (partial) assignment from a set of variables to is explicitly represented by the set of pairs of the form , where is the value to which is mapped. An assignment satisfies a constraint if its restriction to , i.e., the set of pairs such that , occurs in . A solution to is a (total) assignment for which satisfying assignments exist such that . Therefore, a solution is a total assignment that satisfies all the constraints in .
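The definition above can be illustrated with a minimal Python sketch. The toy instance, the names `variables`, `domain`, `constraints`, and the helper functions are assumptions made for illustration only; the brute-force enumeration is exponential and is not one of the algorithms discussed in this paper.

```python
from itertools import product

# Hypothetical toy instance: variables, a shared domain, and constraints
# given as (scope, allowed value tuples) pairs.
variables = ["A", "B", "C"]
domain = [0, 1]
constraints = [
    (("A", "B"), {(0, 1), (1, 0)}),  # A and B must differ
    (("B", "C"), {(0, 1), (1, 0)}),  # B and C must differ
]

def satisfies(assignment, constraint):
    """True if the restriction of `assignment` to the scope is an allowed tuple."""
    scope, allowed = constraint
    return tuple(assignment[v] for v in scope) in allowed

def solutions(variables, domain, constraints):
    """Enumerate all total assignments satisfying every constraint (brute force)."""
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(satisfies(assignment, c) for c in constraints):
            yield assignment

all_solutions = list(solutions(variables, domain, constraints))
```

Here a solution is exactly a total assignment whose restriction to each scope occurs among the allowed tuples of the corresponding constraint.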
By solving a CSP instance we usually just mean finding any arbitrary solution. However, when assignments are associated with weights because of the semantics of the underlying application domain, we might instead be interested in the corresponding optimization problem of finding a solution of maximum or minimum weight (for short: Max and Min problems), whose modeling is possible in several variants of the basic CSP framework, such as the valued and semiring-based CSPs [BFMRSV96]. Moreover, we might be interested in the Top- problem of enumerating the best (w.r.t. Max or Min) solutions in the form of a ranked list (see, e.g., [FD10, BDGM12]), or even in the Next problem of computing the next solution (w.r.t. such an ordering) following one that is at hand [BRSVW10]. (Footnote 1: Related results on graphical models, conjunctive query evaluation, and computing homomorphisms on relational structures are transparently recalled hereinafter in the context of constraint satisfaction.)
CSP instances, as well as their extensions tailored to model optimization problems, are computationally intractable. Indeed, even just deciding whether a given instance admits a solution is a well-known NP-hard problem, which calls for practically effective algorithms and heuristics, and for the identification of specific subclasses, called “islands of tractability”, over which the problem can be solved efficiently. In this paper, we consider the latter perspective to attack CSP instances, by looking at structural properties of constraint scopes.
1.2 Structural Decomposition Methods and Tree Projections
The avenue of research looking for islands of tractability based on structural properties originated from the observation that constraint satisfaction is tractable on acyclic instances (cf. [MONTANARI197495, Y81]), i.e., on instances whose associated hypergraph (whose hyperedges correspond one-to-one to the sets of variables in the given constraints) is acyclic. (Footnote 2: There are different notions of hypergraph acyclicity. In the paper, we consider α-acyclicity, which is the most liberal one [F83].)
Motivated by this result, structural decomposition methods have been proposed in the literature as approaches to transform any given cyclic CSP into an equivalent acyclic one by organizing its constraints or variables into a polynomial number of clusters and by arranging these clusters as a tree, called decomposition tree. The satisfiability of the original instance can be then checked by exploiting this tree, with a cost that is exponential in the cardinality of the largest cluster, also called width of the decomposition, and polynomial if the width is bounded by a constant (see [GGLS16] and the references therein). Similarly, by exploiting this tree, solutions can be computed even to CSP extensions tailored for optimization problems, again with a cost that is polynomial over bounded-width instances. For instance, we know that (in certain natural optimization settings) Max is feasible in polynomial time over instances whose underlying hypergraphs are acyclic [KS06], have bounded treewidth [FD10], or have bounded hypertree width [GGS09, GS11].
Despite their different technical definitions, there is a simple framework encompassing all structural decomposition methods, which is the framework of tree projections [GS84]. (Footnote 3: The notion of submodular width [M13] does not fit this framework, as it is not purely structural.) The basic idea of these methods is indeed to “cover” all the given constraints via a polynomial number of clusters of variables and to arrange these clusters as a tree, in such a way that the connectedness condition holds, i.e., for each variable , the subgraph induced by the clusters containing is a tree. In particular, any cluster identifies a subproblem of the original instance, and it is required that all solutions to this subproblem can either be computed efficiently, or are already available (e.g., from previous computations). A tree built from the available clusters and covering all constraints is called a tree projection [GS84, SS93, GMS09, GrecoSIC17]. In particular, whenever such clusters are required to satisfy additional conditions, tree projections reduce to specific decomposition methods. For instance, if we consider candidate clusters given by all subproblems over variables at most (resp., over any set of variables contained in the union of constraints at most), then tree projections correspond to tree decompositions [RS86, D03] (resp., generalized hypertree decompositions [GLS02, GMS09]), and is their associated width.
Deciding whether a tree projection exists is NP-hard in general, that is, when a set of arbitrary clusters/subproblems is given [GMS09]. Moreover, the problem remains intractable in some specific settings, such as (bounded width) generalized hypertree decompositions [GMS09]. Therefore, designing tractable algorithms within the framework of tree projections is not an easy task. Ideally we would like to efficiently solve the instances without requiring that a tree projection be explicitly computed (or provided as part of the input). For standard CSP instances, algorithms of this kind have already been exhibited [SS93, GS84, CD05, GS10]. These algorithms are based on enforcing pairwise-consistency [BFMY83], also known in the CSP community as relational arc consistency (or arc consistency on the dual graph) [D03], 2-wise consistency [G86], and [KWRCB10]. Note that these algorithms are mostly used in heuristics for constraint solving algorithms. The idea is to repeatedly take, until a fixpoint is reached, any two constraints and and to remove from all assignments that cannot be extended over the variables in , i.e., for which there is no assignment such that the restrictions of and over the variables in coincide. Here, the crucial observation is that the order according to which pairs of constraints are processed is immaterial, so that this procedure is equivalent to Yannakakis’ algorithm [Y81], which identifies a correct processing order based on the knowledge of a tree projection. (Footnote 4: The algorithm has been originally proposed for acyclic instances. For its application within the tree projection setting, the reader is referred to [GS10].) Actually, it is even unnecessary to know that a tree projection exists at all, because any candidate solution can be certified in polynomial time.
Indeed, these algorithms are designed in a way that, whenever some assignment is computed that is subsequently found not to be a solution, then the (promised) existence of a tree projection is disproved. We define these algorithms computing certified solutions as promise-free, with respect to the existence of a tree projection (cf. [CD05, GS10]). We note that, so far, this kind of solution approach has not been generalized in the literature to deal with CSP extensions tailored for optimization problems.
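The pairwise-consistency procedure recalled above can be sketched as follows. This is a minimal illustration under stated assumptions: the representation of constraints as `(scope, rows)` pairs and the names `restrict` and `enforce_pairwise_consistency` are hypothetical, and the sketch covers only the fixpoint filtering step, not the certification of candidate solutions.

```python
def restrict(row, scope, shared):
    """Project a value tuple `row` over `scope` onto the variables in `shared`."""
    return tuple(row[scope.index(v)] for v in shared)

def enforce_pairwise_consistency(relations):
    """relations: dict name -> (scope tuple, set of value tuples).

    Returns a filtered copy in which every remaining tuple of each constraint
    agrees with at least one tuple of every other constraint on their shared
    variables. The loop runs until no tuple can be removed (a fixpoint).
    """
    rels = {name: (scope, set(rows)) for name, (scope, rows) in relations.items()}
    changed = True
    while changed:
        changed = False
        for p, (scope_p, rows_p) in rels.items():
            for q, (scope_q, rows_q) in rels.items():
                if p == q:
                    continue
                shared = [v for v in scope_p if v in scope_q]
                supported = {restrict(r, scope_q, shared) for r in rows_q}
                keep = {r for r in rows_p if restrict(r, scope_p, shared) in supported}
                if keep != rows_p:
                    rels[p] = (scope_p, keep)  # same key, so iteration stays valid
                    rows_p = keep
                    changed = True
    return rels
```

Tuples are only ever removed, so the loop terminates after at most as many rounds as there are tuples in the input; the pairs are visited in dictionary order, which is an arbitrary choice made only for this sketch.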
All previous algorithms proposed in the literature for computing the best CSP solutions in polynomial time [FD10, GGS09, KDLD05, BMR97, TJ03, GG13, GS11, DBLP:journals/corr/JoglekarPR15, AboKhamis:2016:FQA:2902251.2902280] (or, more generally, for optimizing functions in different application domains—see, e.g., [LS90]) require the knowledge of some suitable tree projection, which provides at each node a list of potentially good partial evaluations with their associated values to be propagated within a dynamic programming scheme. The main conceptual contribution of the present paper is to show that this knowledge is not necessary, since promise-free algorithms can be exhibited in the tree projection framework even when dealing with optimization problems.
More formally, we consider a setting where the given CSP instance is equipped with a valuation function to be maximized over the feasible solutions. The function is built from basic weight functions defined on subsets of variables occurring in constraint scopes, combined via some binary operator . (Footnote 5: In fact, our results are designed to hold in a more general setting where different binary operators may be used together in the definition of more complex valuation functions. However, for the sake of presentation, we shall mainly focus on a single operator, in the spirit of the (standard) valued and semiring-based CSP settings.) Moreover, we assume that a set of subproblems is given together with their respective solutions. Then, within this setting,
We provide a fixed-parameter polynomial-time algorithm [down-fell-99] for Max that either computes a solution (if one exists) having the best weight according to , or says that no tree projection can be built by using the available subproblems in . In any case, the algorithm does not output any wrong answer, because the computed solutions are certified. More precisely, the algorithm runs in time , where the parameter is the number of basic functions occurring in , is the size of the input, and is a fixed natural number. Thus, the running time has no exponential dependency on the input, but possibly on the fixed parameter .
We show that the Top- problem of returning the best solutions over all possible solutions is fixed-parameter tractable, too. As we may have an exponential number of solutions (w.r.t. ), tractability means here having a promise-free algorithm that computes the desired output with fixed-parameter polynomial delay: The first solution is computed in fixed-parameter polynomial-time, and any other solution is computed within fixed-parameter polynomial-time after the previous one.
Moreover, we complement the above results by studying the setting where a tree projection is given. In this case, we show that the task of computing the best solutions over a set of output variables is not only feasible in polynomial time (as we already know from the literature pointed out above), but it is even possible to define parallel algorithms that can exploit the availability of machines with multiple processors.
Concerning our main technical contributions, we stress here that different kinds of fixed-parameter polynomial-time algorithms can be defined for the problems of interest when varying the underlying parameter. For instance, a trivial choice would be to consider the overall number of constraints involved in the CSP at hand. In fact, our parameter is very often much smaller, so that our algorithms can be useful in all those applications where the optimization function consists of few basic functions, while the number of constraints is large (which makes computing any tree projection infeasible).
The rest of the paper is organized as follows. Section 2 illustrates some basic notions about CSPs and their structural properties. The formal framework for equipping CSP instances with optimization functions is introduced in Section 3. Our fixed-parameter tractability results are illustrated in Section 4. Parallel algorithms are presented in Section 5. Relevant related works are discussed in Section 6, and concluding remarks are drawn in Section 7.
Logic-Based Modeling of Constraint Satisfaction
Let be a CSP instance, with . Following [KV00], we shall exploit throughout the paper the logic-based characterization of as a pair , which simplifies the illustration of structural tractability results. In particular, is the constraint formula (associated with ), i.e., a conjunction of atoms of the form where , for each , is obtained by listing all the variables in the scope . The set of variables in is denoted by , while the set of atoms occurring in is denoted by . Moreover, DB is the constraint database, i.e., a set of ground atoms encoding the allowed tuples of values for each constraint, built as follows. For each constraint index and for each assignment , DB contains the ground atom where if is the -th variable in the list , then holds for each . No further ground atom is in DB.
In the following, for any set of variables and any assignment , denotes the partial assignment obtained by restricting to the variables in . Therefore, a (total) substitution is a solution to if holds for each . The set of all solutions to the CSP instance is denoted by . Moreover, for any set of variables, denotes the set .
Structural Properties of CSP Instances
The structure of a constraint formula is best represented by its associated hypergraph , where , i.e., variables are viewed as nodes, and where , i.e., for each atom in , contains a hyperedge including all its variables. For any hypergraph , we denote the sets of its nodes and of its hyperedges by and , respectively.
A hypergraph is acyclic if it has a join tree [BG81]. A join tree of is a labeled tree , where for each vertex , it holds that , and where the following conditions are satisfied:
- Covering Condition:
, for some vertex of , holds;
- Connectedness Condition:
for each pair of vertices in such that , and are connected in (via edges from ) and , for every vertex in the unique path linking and in .
Note that this definition is apparently more liberal than the traditional one (in [BG81]), where there is a one-to-one correspondence between hyperedges and vertices of the join tree. We find it convenient to allow multiple occurrences of the same hyperedge in the labels of different vertices of , but it is straightforward to show that a standard join tree may be obtained from by repeatedly contracting edges of the form , where (until such a one-to-one correspondence is met).
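As a rough illustration of the two conditions, the following Python sketch checks whether a given labeled tree is a join tree in the liberal sense used here, where the same hyperedge may label several vertices. The function name, the argument layout, and the assumption that `tree_edges` already forms a tree are all hypothetical choices for this sketch. The connectedness condition is tested in its equivalent per-variable form: the vertices whose label contains a given variable must induce a connected subtree.

```python
from collections import defaultdict

def is_join_tree(hyperedges, tree_edges, label):
    """Check the covering and connectedness conditions.

    hyperedges: iterable of sets of variables.
    tree_edges: list of (u, v) pairs, assumed (not verified) to form a tree.
    label: dict mapping each tree vertex to a set of variables.
    """
    edges = {frozenset(h) for h in hyperedges}
    labels = {v: frozenset(l) for v, l in label.items()}
    # Every label must be a hyperedge.
    if any(l not in edges for l in labels.values()):
        return False
    # Covering: every hyperedge labels at least one vertex.
    if edges - set(labels.values()):
        return False
    adjacency = defaultdict(set)
    for u, v in tree_edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Connectedness: for each variable, the vertices holding it are connected.
    for x in set().union(*edges):
        holders = {v for v, l in labels.items() if x in l}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adjacency[u] & holders:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if seen != holders:
            return False
    return True
```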
Structural decomposition methods have been proposed in the literature in order to provide a measure of the degree of acyclicity of hypergraphs, and in order to generalize positive computational results from acyclic hypergraphs to nearly-acyclic ones. Despite their different technical definitions, there is a simple framework encompassing all known (purely structural) decomposition methods. The framework is based on the concept of tree projection [GS84], which is recalled below.
For two hypergraphs and , we say that covers , denoted by , if each hyperedge of is contained in at least one hyperedge of . Let . Then, a tree projection of with respect to is an acyclic hypergraph such that . Whenever such a hypergraph exists, we say that the pair has a tree projection.
Consider the hypergraph depicted in Figure 1, and the hypergraph whose hyperedges are listed on the right of the same figure. Note that is (just) a graph and it contains a cycle over the nodes , , and .
The acyclic hypergraph shown in the middle is a tree projection of w.r.t. . For instance, note that the cycle is “absorbed” by the hyperedge , which is in its turn trivially contained in a hyperedge of .
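Verifying that a given candidate is a tree projection is easy, in contrast with deciding whether one exists. The sketch below is an illustration under stated assumptions: the function names are hypothetical, hypergraphs are plain lists of sets, acyclicity is tested with the classical GYO reduction (repeatedly dropping nodes that occur in a single hyperedge and hyperedges contained in other ones), and the small triangle instance is made up rather than taken from Figure 1.

```python
def covers(h_big, h_small):
    """True if each hyperedge of h_small is contained in some hyperedge of h_big."""
    return all(any(set(e) <= set(f) for f in h_big) for e in h_small)

def is_acyclic(hyperedges):
    """GYO reduction: the hypergraph is acyclic iff it reduces to nothing."""
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Drop nodes that occur in exactly one hyperedge.
        for e in edges:
            lonely = {x for x in e if sum(x in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Drop empty hyperedges and hyperedges contained in another one
        # (among identical copies, only the first is kept).
        kept = []
        for i, e in enumerate(edges):
            absorbed = not e or any(
                e < f or (e == f and j < i)
                for j, f in enumerate(edges) if j != i
            )
            if absorbed:
                changed = True
            else:
                kept.append(e)
        edges = kept
    return not edges

def is_tree_projection(h1, ha, h2):
    """True if ha is acyclic, covers h1, and is covered by h2."""
    return covers(ha, h1) and covers(h2, ha) and is_acyclic(ha)
```

For a triangle over three nodes, the single hyperedge containing all three nodes absorbs the cycle, provided that the second hypergraph offers a hyperedge large enough to contain it.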
Following [GS10], tree projections can be used to solve any CSP instance whenever we have (or we can build) an additional pair such that:
is a set of atoms (hence, corresponding to a set of constraint scopes). Each atom in clusters together the variables of a subproblem whose solutions are assumed to be available in the constraint database DB’ and that can be exploited in order to answer the original CSP instance . Atoms in will be called views, and will be called the view set. It is required that, for each atom , contains a base view with the same list of variables as .
DB’ is a constraint database that satisfies the following conditions:
holds for each base view ; that is, base views should be at least as restrictive as atoms in the constraint formula;
holds for each ; that is, any view cannot be more restrictive than the constraint formula, otherwise correct solutions may be deleted by performing operations involving such views.
Such a database DB’ is said to be legal for w.r.t. and DB.
The pair is used as follows. Let denote the view hypergraph precisely containing, for each view in , one hyperedge over the variables in . We look for a sandwich formula of w.r.t. , that is, a constraint formula such that includes all base views and is a tree projection of w.r.t. . By exploiting the sandwich formula , solving can be reduced to answering an acyclic instance, hence to a task which is feasible in polynomial time. Indeed, by projecting the assignments of any legal database DB’ over the (portions of the) views used in , a novel database can be obtained such that [GS84].
Most structural decomposition methods of constraint satisfaction problems can be viewed as special instances of this approach, where the peculiarities of each method lead to different ways of building the additional view set , with its associated database DB’. For instance, the methods based on generalized hypertree decompositions [GLS02, GMS09] and tree decompositions [RS86], for a constant width , fit into the framework as follows:
- -width generalized hypertree decompositions:
The method uses a set of views including, for each subformula of with and , a view that is built over the set of all variables on which these atoms are defined (hence, base views are obtained for =1) and whose assignments in the corresponding constraint database are all solutions to .
- -width tree decompositions:
The method uses the set of views consisting of the base views plus all the views that can be built over all possible sets of at most variables. In the associated constraint relations in , base views consist of the assignments in the corresponding atoms in DB, whereas each of the remaining views contains all possible assignments that can be built over them, hence assignments at most, where is the size of the largest domain over the selected variables.
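The non-base views used by the tree decomposition method can be generated mechanically. The following sketch is only illustrative: the function name, the use of a single shared domain, and the dictionary representation are assumptions, and base views (which are copied from the atoms in DB) are deliberately left out.

```python
from itertools import combinations, product

def tree_decomposition_views(variables, domain, k):
    """Build one view per set of at most k variables.

    Each view maps a sorted tuple of variables to the set of all value tuples
    over `domain`, so a view over k variables holds len(domain) ** k tuples.
    """
    views = {}
    for size in range(1, k + 1):
        for subset in combinations(sorted(variables), size):
            views[subset] = set(product(domain, repeat=size))
    return views
```

The number of views is polynomial in the number of variables for a fixed width, while the size of each view is exponential in the width only, which is where the bound on the number of assignments per view comes from.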
Consider a CSP instance such that and where and are the only two ground atoms in DB, for each . The constraint hypergraph associated with is precisely the hypergraph illustrated in Figure 1. Since is not acyclic, our goal is to apply a structural decomposition method for transforming the original instance into a novel acyclic one that “covers” all constraints in and is equivalent to it. To this end, let us consider the application on of the tree decomposition method with being the associated width, resulting in the pair . For instance, the base view is in and its associated tuples in are and . Moreover, for each natural number , a view having the form is in and the associated tuples in are , with . In particular, for , the hypergraph precisely coincides with the hypergraph (whose hyperedges are) illustrated in Figure 1. Consider then the constraint formula
and note that coincides with the acyclic hypergraph , which is a tree projection of w.r.t. . (Footnote 6: The fact that is a tree projection of w.r.t. witnesses that the treewidth of is 2. In general, a tree projection of w.r.t. exists if and only if has treewidth at most (see [GS10, GS13]).) Then, solving is equivalent to solving where is just the restriction of over the atoms in .
3 Valuation Functions and Basic Results
In this section, we illustrate a formal framework for equipping constraint formulas with valuation functions suited to express a variety of optimization problems. Moreover, we introduce and analyze a notion of embedding as a way to represent and study the interactions between constraint scopes and valuation functions.
In the following we assume that a domain of values, a constraint formula , and a set of weights totally ordered by a relation are given. Moreover, on the set , we define and as the operations returning any -maximum and the -minimum weight, respectively, over a given set of weights.
3.1 Formal Framework
Let be a set of variables. Then, a function associating each assignment with a weight is called a weight function (for ), and we denote by the set on which it is defined. If an assignment with is given, then we write as a shorthand for .
Let be a closed, commutative, and associative binary operator over being, moreover, distributive over . A valuation function (for over ) is an expression of the form , with . The set of all weight functions occurring in is denoted by . For an assignment , is the weight .
As an example, note that valuation functions built for basically correspond to those arising in the classical setting of weighted CSPs, where combinations of values in the constraints come associated with a cost and the goal is to find a solution minimizing the sum of the costs over the constraints. (Footnote 7: For more information on weighted CSPs, see Section 6.)
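A valuation function of this kind can be evaluated on a given assignment by combining the values of its weight functions. The sketch below assumes, for illustration only, that the combination operator is ordinary addition (as in weighted CSPs) and that each weight function is given as a pair made of its variables and a Python callable; the names are hypothetical.

```python
from functools import reduce
import operator

def evaluate(weight_functions, assignment, combine=operator.add):
    """Combine the values of all weight functions on `assignment`.

    weight_functions: non-empty list of (variables, callable) pairs; each
    callable receives the values of its variables in the listed order.
    combine: an associative and commutative binary operator.
    """
    values = [w(*(assignment[v] for v in scope)) for scope, w in weight_functions]
    return reduce(combine, values)
```

Because the operator is assumed to be associative and commutative, the order in which the weight functions are listed does not affect the result.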
A constraint formula equipped with a valuation function is called a (constraint) optimization formula, and is denoted by . For an optimization formula and a constraint database DB, we define the total order over the assignments in such that for each pair and in , if and only if .
For a set , we also define as the total order over the assignments in as follows. For each pair and of assignments in , if and only if there is an assignment with such that holds, for each assignment such that . Note that reflects a descending order over real numbers. To have an ascending order, we can consider operators distributing over (rather than over ), and define the order such that if and only if . Our results are presented by focusing, w.l.o.g., on only.
Two problems that naturally arise with constraint optimization formulas are stated next. The two problems receive as input an optimization formula , a set of distinguished variables, and a constraint database DB:
Compute an assignment such that there is no assignment with ; Answer NO SOLUTION, if . (Footnote 8: As usual, the fact that and hold is denoted by .)
Compute a list of distinct assignments from , where , and where for each , there is no assignment with . Note that the parameter is an additional input of the problem Top- and, as usual, is assumed to be given in binary notation. This means, in particular, that the answers to this problem can be exponentially many when compared to the input size.
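To make the two problem statements concrete, here is a brute-force sketch that ranks the projections of an explicitly given list of solutions over the distinguished variables. It is not the promise-free algorithm of this paper: it assumes that all solutions are already materialized, that weights are numbers compared with the usual order, and that a projected assignment is ranked by the best weight among the solutions extending it. The names `rank_projections` and `top_k` are hypothetical.

```python
def rank_projections(solutions, weight, output_vars):
    """Rank the projections of `solutions` over `output_vars`, best first.

    Each projection is scored by the maximum weight of a solution extending it.
    Returns a list of (projected value tuple, score) pairs.
    """
    best = {}
    for s in solutions:
        key = tuple(s[v] for v in output_vars)
        w = weight(s)
        if key not in best or w > best[key]:
            best[key] = w
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

def top_k(solutions, weight, output_vars, k):
    """Return the first k entries of the ranking (fewer if not enough exist)."""
    return rank_projections(solutions, weight, output_vars)[:k]
```

With `k = 1` the call returns a single best projection, or an empty list when there is no solution at all, which corresponds to the NO SOLUTION answer.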
3.1.1 Structured Valuation Functions
Our results are actually given in the more general setting of the structured valuation functions, where different binary operators can be used in the same constraint formula, and the order according to which the basic weight functions have to be processed is syntactically guided by the use of parentheses (in this case the evaluation order of operators does matter, in general).
Formally, a structured valuation function is either a weight function or an expression of the form , where and are in turn structured valuation functions, and is a binary operator with the same properties as the operator in Definition 3.1.
Clearly enough, any structured valuation function built with one binary operator can be transparently viewed as a standard valuation function by just omitting the parentheses. On the other hand, given a valuation function , we can easily build the set of all possible equivalent structured valuation functions, by just considering all possible legal ways of adding parentheses to .
Working on the elements of appears to be easier in the algorithms we shall illustrate in the following sections and, accordingly, our presentation will be focused on structured valuation functions. We will discuss in Section 4.4 how to move from structured valuation functions to equivalent standard valuation functions.
Consider a simple configuration scenario defined in terms of the CSP instance , where
and where the atoms in DB are those shown in Figure 2 using an intuitive graphical notation. Note that the instance is trivially satisfiable.
In fact, we are usually not interested in finding just any solution in this setting, but would rather like to single out one that matches as much as possible our preferences over the possible configurations. For instance, we might be interested in computing (the solution corresponding to) a car minimizing the sum of its price plus the cost that is expected to be paid, for the given quotation of the fuel, to cover 100,000 kilometers. Moreover, for cars that are equally ranked w.r.t. this first criterion, we might want to give preference to cars minimizing the emission of . For a sufficiently large constant (which can be treated in a symbolic way), this requirement can be modeled via the function such that
where is the identity weight function on each variable and where any real number is viewed as a constant weight function.
3.2 Structured Valuation Functions and Embeddings
It is easily seen that structured valuation functions introduce further dependencies among the variables, which are not reflected in the basic hypergraph-based representation of the underlying CSP instances. Therefore, when looking at islands of tractability for constraint optimization formulas, this observation motivates the definition of a novel form of structural representation where the interplay between functions and constraint scopes is made explicit. In order to formalize this structural representation, we introduce the concept of parse tree of a structured valuation function.
Let be a structured valuation function. Then, the parse tree of is a labeled rooted tree , where maps vertices either to variables or to binary operators, defined inductively as follows:
If is a weight function , then where . That is, has no edges and a unique (root) node , labeled by the variables occurring in .
Assume that with and , and with and being the root nodes of and , respectively. Then, is rooted at a fresh node , with the labeling function such that and its restrictions over and coincide with and , respectively.
Let be a set of variables, and let be the root of . Then, the output-aware parse tree of w.r.t. is the labeled tree rooted at a fresh node , and where is such that and its restriction over coincides with .
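The inductive construction of the parse tree, and of its output-aware variant, can be sketched in Python as follows. The encoding of structured valuation functions as nested tuples, the `"weight"` tag for leaves, and the integer vertex identifiers are assumptions made for this illustration.

```python
from itertools import count

def parse_tree(expr, ids=None, edges=None, labels=None):
    """Build the parse tree of a structured valuation function.

    expr is either ("weight", variables) for a basic weight function, or
    (operator_symbol, left, right) for a binary combination.
    Returns (root, edges, labels): leaves are labeled by sets of variables,
    internal vertices by their operator symbol.
    """
    ids = ids if ids is not None else count()
    edges = edges if edges is not None else []
    labels = labels if labels is not None else {}
    node = next(ids)
    if expr[0] == "weight":
        labels[node] = frozenset(expr[1])
    else:
        op, left, right = expr
        labels[node] = op
        for child in (left, right):
            child_root, _, _ = parse_tree(child, ids, edges, labels)
            edges.append((node, child_root))
    return node, edges, labels

def output_aware_parse_tree(expr, output_vars):
    """Add a fresh root, labeled by the output variables, above the parse tree."""
    ids = count()
    root = next(ids)
    edges, labels = [], {root: frozenset(output_vars)}
    inner_root, _, _ = parse_tree(expr, ids, edges, labels)
    edges.append((root, inner_root))
    return root, edges, labels
```

The fresh root has the root of the original parse tree as its only child, so the labeling restricted to the original vertices is unchanged.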
Now, we define the concept of embedding as a way to characterize how the parse tree of a structured function interacts with the constraints of an acyclic constraint formula.
Let be a structured valuation function for a constraint formula , let be an acyclic hypergraph with , let be a set of variables, and let be the associated output-aware parse tree.
We say that the pair can be embedded in if there is a join tree of and an injective mapping , such that every vertex is associated with a vertex of , called -separator, which satisfies the following conditions:
, i.e., the variables occurring in occur in the labeling of , too; and
there is no pair , of vertices adjacent to in such that their images and are not separated by , i.e., such that they occur together in some connected component of the forest .
The mapping is called an embedding of in , and is its witness.
Intuitively, condition (1) states that any leaf node of the parse tree (i.e., with ) is mapped into a -separator whose -labeling covers the variables involved in the domain of the underlying weight function. Moreover, it requires that the root node is mapped into a node whose -labeling covers the output variables in . On the other hand, condition (2) guarantees that the structure of the parse tree is “preserved” by the embedding. This is explained by the example below.
Recall the setting of Example 3.2 and the output-aware parse tree shown on the left of Figure 3. Observe that is acyclic, as it is witnessed by the join tree depicted on the right. Moreover, note that the figure actually shows that there is an embedding of in that maps each node to (the hyperedge containing the variables of) the atom , except for the leaf which is mapped to . Note also that the root is mapped to , which indeed covers the output variable MODEL, and that mapping constant functions is immaterial.
3.3 Properties of Embeddings for Structured Valuation Functions
We shall now analyze some relevant properties of embeddings, which are useful for providing further intuitions on this notion and will be used in our subsequent explanations.
Let be a structured valuation function for a constraint formula , let be an acyclic hypergraph with , and let be a set of variables. Let be any injective mapping and denote by its image. Thus, for each vertex , its inverse is the vertex in the parse tree whose image under is precisely . In the following, let us view as a tree rooted at the vertex , where is the root of . Moreover, in any rooted tree, we say that a vertex is a descendant of , if either is a child of , or is a descendant of some child of .
Assume that is an embedding of in , with being its witness. Let and be two distinct vertices in . Then, is a descendant of in if and only if is a descendant of in .
We prove the property by structural induction, from the root to the leaves of . In the base case, is the root of and thus is the root of . In this case, the result is trivially seen to hold. Now assume that the property holds on any vertex in the path connecting the root and a vertex . That is, is a descendant of in if and only if is a descendant of in . We show that the property holds on , too. To this end, we first claim the following.
Let be a path in such that: (i) is a child of , for each ; and (ii) is a descendant in of , for each . Then, is a descendant of , for each .
Proof. We prove the property by induction. Consider first the case where . The fact that is a descendant of is immediate by (ii). Then, assume that the property holds up to an index , with . We show that it holds on , too. Indeed, by inductive hypothesis, we know that is a descendant of , which is in turn a descendant of by (ii). Consider the vertex , and recall that since is an embedding, disconnects from . This means that is a descendant of .
We now resume the main proof.
Assume that is a descendant of . Hence, is a descendant of some vertex and we can apply the inductive hypothesis to derive that and are both descendants of in . Assume, for the sake of contradiction, that is not a descendant of . We distinguish two cases.
In the first case is a descendant of . This means that there is a path . Note that on this path we can apply the inductive hypothesis in order to conclude that is a descendant of , for each . Therefore, we are in the position to apply Claim 3.7, and we conclude that is a descendant of , for each . In particular, by transitivity, we get that is a descendant of . That is, is a descendant of . Contradiction.
The only remaining possibility is that there are two distinct vertices and that are children of a vertex , which is a descendant of or is precisely , and such that (resp., ) is a descendant of (resp., ) or coincides with (resp., ) itself. Consider then the paths and . Similarly to the case discussed above, note that on each of them we can apply Claim 3.7. Thus, we get that and are descendants of . Moreover, either is a descendant of , or coincides with . Similarly, either is a descendant of , or coincides with . Finally, recall that is an embedding, and hence must disconnect and . Therefore, it disconnects and , too. This contradicts the fact that is a descendant of .
Assume that is a descendant of in . Assume, for the sake of contradiction, that is not a descendant of . Because of the only-if part, we are guaranteed that is in any case not a descendant of . Therefore, there is a vertex disconnecting and and such that and are both descendants of . We can now apply the inductive hypothesis on the vertex that lies on the path connecting and the root and is the closest to (possibly coinciding with it). Therefore, we know that and are both descendants of . Hence, occurs in the path connecting and . Consider then the path . Because of the inductive hypothesis, we know that is a descendant of , for each . Therefore, we are in the position to apply Claim 3.7 and, by transitivity, we derive that is a descendant of . That is, is a descendant of . Contradiction.
In words, the above result tells us that embeddings preserve the descendant relationship. In fact, preserving this relationship suffices for an embedding to exist.
Let be an injective function satisfying condition (1) in Definition 3.4 for and for a join tree of . Assume that for each pair of distinct vertices in , is a descendant of in if and only if is a descendant of in . Then, there is an embedding of in (with witness ), which can be built in polynomial time from .
Based on , we build a function as follows. Let be any node in . If is a leaf node in or it is the root, then we set . Otherwise, i.e., if is an internal node with children and , then we first observe that, by hypothesis, and are both descendants of . Moreover, (resp., ) is not a descendant of (resp., ), and therefore there is a vertex in possibly coinciding with such that and occur in different components of as descendants of ; in particular, whenever , we have that is a descendant of . For the node , we now define .
Note that trivially satisfies condition (1) in Definition 3.4, as differs from only over non-leaf nodes different from the root. We claim that satisfies condition (2), too.
Recall that, by Definition 3.3, the output-aware parse tree is binary, and its root is the only vertex having one child. If is the root or a leaf, then condition (2) in Definition 3.4 trivially holds. Consider therefore any vertex with parent and children and . By the above construction (setting ), we know that and occur in different components of as descendants of . Moreover, whenever , we are guaranteed that is a descendant of (again by the above construction, this time setting ). Similarly, either , or is a descendant of . Hence, and occur in different components of as descendants of .
Consider now the parent and the child —the same line of reasoning applies to the child . By hypothesis, we know that occurs in the path connecting and , with being a descendant of . Moreover, by construction of (over , , and ), we have that: either occurs in the path connecting and , or ; either occurs in the path connecting and , or ; and is either a descendant of , or coincides with . Therefore, occurs in the path connecting and . That is, and occur in different components of .
Putting everything together, we have shown that condition (2) in Definition 3.4 holds on any vertex . ∎
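The descendant-preservation hypothesis used in the result above can be tested directly on small instances. The following is a minimal sketch, assuming that both rooted trees are given as parent-pointer dictionaries and that the mapping is a plain dictionary; the names `is_descendant`, `preserves_descendants`, `parent_pt`, `parent_jt`, and `xi` are hypothetical and do not come from the paper. The sketch checks only injectivity and the descendant relationship, not condition (1) of Definition 3.4.

```python
def is_descendant(parent, v, u):
    """Return True if v is a proper descendant of u.

    `parent` maps every vertex to its parent, with the root mapped to None.
    """
    v = parent[v]
    while v is not None:
        if v == u:
            return True
        v = parent[v]
    return False


def preserves_descendants(parent_pt, parent_jt, xi):
    """Check that the injective map xi preserves the descendant relation.

    parent_pt: parent pointers of the parse tree.
    parent_jt: parent pointers of the join tree, rooted at the image of
               the parse-tree root.
    xi:        dict from parse-tree vertices to join-tree vertices.
    """
    if len(set(xi.values())) != len(xi):
        return False  # xi is not injective
    nodes = list(parent_pt)
    return all(
        is_descendant(parent_pt, p, q) == is_descendant(parent_jt, xi[p], xi[q])
        for p in nodes
        for q in nodes
        if p != q
    )


# Parse tree: root r with single child a, which has children b and c.
parent_pt = {"r": None, "a": "r", "b": "a", "c": "a"}
# Join tree rooted at 1: 1 - 2, 2 - 3, 3 - 4, 2 - 5.
parent_jt = {1: None, 2: 1, 3: 2, 4: 3, 5: 2}

good = {"r": 1, "a": 2, "b": 4, "c": 5}
bad = {"r": 1, "a": 2, "b": 3, "c": 4}  # image of c lies below image of b
```

The check is quadratic in the number of parse-tree vertices, with each test walking one root path, which is in line with the polynomial-time bound stated above.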
4 Structural Tractability in the Tree Projection Setting
The concept of embedding has been introduced in Section 3 as a way to analyze the interactions of valuation functions with acyclic instances. However, the concept can be easily coupled with the tree projections framework in order to be applied to instances that are not precisely acyclic. This coupling is formalized below.
Let be a structured valuation function for a constraint formula , let be a set of variables, and let be a view set for . We say that can be embedded in if there is a sandwich formula of w.r.t. such that can be embedded in the acyclic hypergraph . If , then we just say that can be embedded in .
Recall the setting of Example 2.3 and the valuation function , where , for each solution . It is immediate to check that can be embedded in . Indeed, consider the sandwich formula depicted in Figure 1, and note that can be embedded in . In fact, as consists of a weight function only, this is witnessed by any join tree of because we can always build an embedding that maps to a vertex of such that . Note that, for this kind of valuation functions, checking the existence of an embedding always reduces to checking the existence of a tree projection.
Recall that deciding whether a pair of hypergraphs has a tree projection is an NP-complete problem [GMS09], so that the notion of embedding can hardly be exploited in a constructive way when combined with tree projections. However, we show in this section that the knowledge of an embedding (and of a tree projection) is not necessary to compute the desired answers. Indeed, a promise-free algorithm can be exhibited that either returns a solution to Max (and Top-) or detects that the given instance is not embeddable. The algorithm is in fact rather elaborate, so we start by illustrating some useful properties that help the intuition.
Hereinafter, let be a constraint formula, DB a constraint database, a set of variables, and a structured valuation function, all of them being provided as input to our reasoning problems. Moreover, to deal with the setting of tree projections, we assume that a view set for plus a constraint database DB’ that is legal for w.r.t. and DB are provided. Accordingly, to emphasize the role played by these structures, the problems of interest will be denoted as Max() and Top-().
4.1 Useful Properties of Embeddings
For a constraint database DB and an atom , the set will also be denoted by . Substitutions in will also be viewed as the ground atoms in DB to which they are unambiguously associated. If is an atom, denotes .
Without loss of generality, assume that, for each weight function , contains a function view over the variables in . Indeed, if , then we can just define as the projection over of any view such that . In particular, if such a view does not exist, then we can immediately conclude that cannot be embedded in .
Let us define as the constraint database obtained (in polynomial time) by enforcing pairwise consistency on DB’ w.r.t. [BFMY83]. The method consists of repeatedly applying, until a fixpoint is reached, the following constraint propagation procedure: Take any pair and of views in , and delete from DB any (ground atom associated with an) assignment in for which no assignment exists with . In words, the procedure removes, for each view , all its associated assignments that cannot be extended to some assignment in each of the remaining views. In database terminology, this is called a semijoin operation over and .
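The propagation procedure just described can be sketched as follows. This is an illustrative sketch rather than the implementation used in the paper: it assumes that each view is stored as a pair consisting of a tuple of variable names and a set of rows, and the names `semijoin` and `pairwise_consistency` are hypothetical.

```python
from itertools import permutations


def semijoin(r, s):
    """Rows of r that agree with at least one row of s on the shared variables."""
    vars_r, rows_r = r
    vars_s, rows_s = s
    shared = [v for v in vars_r if v in vars_s]
    idx_r = [vars_r.index(v) for v in shared]
    idx_s = [vars_s.index(v) for v in shared]
    # Projections of the rows of s onto the shared variables.
    keys = {tuple(row[i] for i in idx_s) for row in rows_s}
    return {row for row in rows_r if tuple(row[i] for i in idx_r) in keys}


def pairwise_consistency(views):
    """Apply semijoins over all ordered pairs of views until a fixpoint is reached.

    views: dict mapping a view name to (tuple_of_variables, set_of_rows).
    Returns a new dict; the input is left untouched.
    """
    db = {name: (vs, set(rows)) for name, (vs, rows) in views.items()}
    changed = True
    while changed:
        changed = False
        for a, b in permutations(db, 2):
            kept = semijoin(db[a], db[b])
            if len(kept) != len(db[a][1]):
                db[a] = (db[a][0], kept)
                changed = True
    return db


views = {
    "v1": (("X", "Y"), {(1, 2), (3, 4)}),
    "v2": (("Y", "Z"), {(2, 5), (9, 9)}),
    "v3": (("Z",), {(5,), (7,)}),
}
reduced = pairwise_consistency(views)
```

Each pass can only delete rows, so the loop terminates after at most as many productive passes as there are rows in total, which gives a polynomial bound.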
The crucial property enjoyed by the database , which we shall intensively use in our elaborations, is recalled below.
Proposition 4.3 ([Gs10]).
Assume there exists a tree projection of with respect to . Then, holds, for every and such that there is a hyperedge of with .
For any partial assignment , where , and for any constraint optimization formula , denote by the maximum weight that any assignment with can get according to , that is, , where denotes the minimum weight in the codomain of the valuation function.
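The quantity just defined can be illustrated by brute force. The sketch below assumes that the set of solutions is given explicitly as a list of dictionaries, that the valuation function is an ordinary Python callable, and that the minimum weight is returned when no solution extends the partial assignment; the names `best_extension_weight`, `solutions`, `valuation`, and `bottom` are hypothetical.

```python
def best_extension_weight(theta, solutions, valuation, bottom):
    """Maximum weight over the solutions that extend the partial assignment theta.

    theta:     dict from variables to values (a partial assignment).
    solutions: iterable of dicts (total assignments).
    valuation: callable mapping a solution to its weight.
    bottom:    value returned when no solution extends theta
               (assumed to be the minimum weight in the codomain).
    """
    weights = [
        valuation(s)
        for s in solutions
        if all(var in s and s[var] == val for var, val in theta.items())
    ]
    return max(weights, default=bottom)


solutions = [{"X": 1, "Y": 2}, {"X": 1, "Y": 3}, {"X": 2, "Y": 5}]


def total(s):
    """Toy valuation: sum of the values assigned to X and Y."""
    return s["X"] + s["Y"]
```

Enumerating all solutions is exponential in general; the sketch only fixes the meaning of the quantity and is not meant as an efficient procedure.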
Let be any vertex of occurring in the parse tree . Let denote the subexpression of whose parse tree is the subtree rooted at . Note that if holds for some weight function , then