Communication Complexity of Permutation-Invariant Functions

Communication Complexity of Permutation-Invariant Functions

Abstract

Motivated by the quest for a broader understanding of communication complexity of simple functions, we introduce the class of “permutation-invariant” functions. A partial function is permutation-invariant if for every bijection and every , it is the case that . Most of the commonly studied functions in communication complexity are permutation-invariant. For such functions, we present a simple complexity measure (computable in time polynomial in given an implicit description of ) that describes their communication complexity up to polynomial factors and up to an additive error that is logarithmic in the input size. This gives a coarse taxonomy of the communication complexity of simple functions. Our work highlights the role of the well-known lower bounds of functions such as Set-Disjointness and Indexing, while complementing them with the relatively lesser-known upper bounds for Gap-Inner-Product (from the sketching literature) and Sparse-Gap-Inner-Product (from the recent work of Canonne et al. [ITCS 2015]). We also present consequences to the study of communication complexity with imperfectly shared randomness where we show that for total permutation-invariant functions, imperfectly shared randomness results in only a polynomial blow-up in communication complexity after an additive overhead.

1 Introduction

Communication complexity, introduced by Yao [Yao79], has been a central object of study in complexity theory. In the two-way model, two players, Alice and Bob, are given private inputs and respectively, along with some shared randomness, and they exchange bits according to a predetermined protocol and produce an output. The protocol computes a function if the output equals with high probability over the randomness 1. The communication complexity of is the minimum over all protocols computing of the maximum, over inputs and , of the number of bits exchanged by the protocol. The one-way communication model is defined similarly except that all the communication is from Alice to Bob and the output is produced by Bob. For an overview of communication complexity, we refer the reader to the book [KN97] and the survey [LS09].

While communication complexity of functions has been extensively studied, the focus typically is on lower bounds. Lower bounds on communication complexity turn into lower bounds on Turing machine complexity, circuit depth, data structures, streaming complexity, just to name a few. On the other hand, communication complexity is a very natural notion to study on its own merits and indeed positive results in communication complexity can probably be very useful in their own rights, by suggesting efficient communication mechanisms and paradigms in specific settings. For this perspective to be successful, it would be good to have a compact picture of the various communication protocols that are available, or even the ability to determine, given a function , the best, or even a good, communication protocol for . Of course such a goal is overly ambitious. For example, the seminal work of Karchmer and Wigderson [KW90] implies that finding the best protocol for is as hard as finding the best (shallowest) circuit for some related function .

Given this general barrier, one way to make progress is to find a restrictive, but natural, subclass of all functions and to characterize the complexity of all functions within this class. Such approaches have been very successful in the context of non-deterministic computation by restricting to satisfiability problems [Sch78], in optimization and approximation by restricting to constraint satisfaction problems [Cre95, KSTW00], in the study of decision tree complexity by restricting to graph properties [Ros73], or in the study of property testing by restricting to certain symmetric properties (see the survey [Sud10, Gol10] and the references therein). In the above cases, the restrictions have led to characterizations (or conjectured characterizations) of the complexity of all functions in the restricted class. In this work, we attempt to bring in a similar element of unification to communication complexity.

In this work, we introduce the class of “permutation-invariant” (total or partial) functions. Let denote the set . A function is permutation invariant if for every bijection and every it is the case that . We propose to study the communication complexity of this class of functions.

To motivate this class, we note that most of the commonly studied functions in communication complexity including Equality [Yao79], (Gap) Hamming distance [Woo04, JKS08, CR12, Vid11, She12, PEG86, Yao03, HSZZ06, BBG14], (Gap) Inner Product, (Small-Set) Disjointness [KS92, Raz92, HW07, ST13], Small-Set Intersection [BCK14] are all permutation-invariant functions. Other functions, such as Indexing [JKS08], can be expressed without changing the input length significantly, as permutation-invariant functions. Permutation-invariant functions also include as subclasses, several classes of functions that have been well-studied in communication complexity, such as (AND)-symmetric functions [BdW01, Raz03, She11] and XOR-symmetric functions [ZS09]. It is worth noting that permutation-invariant functions are completely expressive if one allows an exponential blow-up in input size, namely, for every function there are functions , , s.t. is permutation-invariant and . So results on permutation-invariant functions that don’t depend on the input size apply to all functions. Finally, we point out that permutation-invariant functions have an important standpoint among functions with small communication complexity, as permutation-invariance often allows the use of hashing/bucketing based strategies, which would allow us to get rid of the dependence of the communication complexity on the input length . We also note that functions on non-Boolean domains that are studied in the literature on sketching such as distance estimation (given , decide if or if ) are also permutation-invariant. In particular, the resulting sketching/communication protocols are relevant to (some functions in) our class.

1.1 Coarse characterization of Communication Complexity

Permutation-invariant functions on -bits are naturally succinctly described (by bits). Given this natural description, we introduce a simple combinatorial measure (which is easy to compute, in particular in time given ) which produces a coarse approximation of the communication complexity of . We note that our objective is different from that of the standard objectives in the study of communication complexity lower bounds, where the goal is often to come up with a measure that has nice mathematical properties, but may actually be more complex to compute than communication complexity itself. In particular, this is true of the Information Complexity measure introduced by [CSWY01] and [BYJKS02], and used extensively in recent works, and which, until recently, was not even known to be approximately computable [BS15] (whereas communication complexity can be computed exactly, albeit in doubly exponential time). Nevertheless, our work does rely on known bounds on the information complexity of some well-studied functions and our combinatorial measure also coarsely approximates the information complexity for all the functions that we study.

To formally state our first theorem, let denote the randomized communication complexity of a function and denote its information complexity. Our result about our combinatorial measure (see Definition 3.2) is summarized below.

{restatable}

theoremmainthmone Let be a (total or partial) permutation-invariant function. Then,

In other words, the combinatorial measure approximates communication complexity to within a polynomial factor, up to an additive factor. Our result is constructive — given it gives a communication protocol whose complexity is bounded from above by . It would be desirable to get rid of the factor but this seems hard without improving the state of the art vis-a-vis communication complexity and information complexity. To see this, first note that our result above also implies that information complexity provides a coarse approximator for communication complexity. Furthermore, any improvement to the additive error in this relationship would imply improved relationship between information complexity and communication complexity for general functions (better than what is currently known). Specifically, we note:

Proposition 1.1.

Let be a function such that for every permutation-invariant partial function on . Then, for every (general) partial function on , we have .

Thus, even an improvement from an additive to additive would imply new relationships between information complexity and communication complexity.

1.2 Communication with imperfectly shared randomness

Next, we turn to communication complexity when the players only share randomness imperfectly, a model introduced by [BGI14, CGMS14]. Specifically, we consider the setting where Alice gets a sequence of bits and Bob gets a sequence of bits where the pairs are identically and independently distributed according to distribution , which means, the marginals of and are uniformly distributed in and and are -correlated (i.e., ).

The question of what can interacting players do with such a correlation has been investigated in many different contexts including information theory [GK73, Wit75], probability theory [MO05, BM11, CMN14, MOR06], cryptography [BS94, Mau93, RW05] and quantum computing [BBP96]. In the context of communication complexity, however, this question has only been investigated recently. In particular, Bavarian et al. [BGI14] study the problem in the Simultaneous Message Passing (SMP) model and Canonne et al. [CGMS14] study it in the standard one-way and two-way communication models. Let denote the communication complexity of a function when Alice and Bob have access to -correlated bits. The work of [CGMS14] shows that for any total or partial function with communication complexity it is the case that . They also give a partial function with for which . Thus, imperfect sharing leads to an exponential slowdown for low-communication promise problems.

One of the motivations of this work is to determine if the above result is tight for total functions. Indeed, for most of the common candidate functions with low-communication complexity such as Small-Set-Intersection and Small-Hamming-Distance, we show (in Section 4.1) that . 2 This motivates us to study the question more systematically and we do so by considering permutation-invariant total functions. For this class, we show that the communication complexity with imperfectly shared randomness is within a polynomial of the communication complexity with perfectly shared randomness up to an additive factor; this is a tighter connection than what is known for general functions. Interestingly, we achieve this by showing that the same combinatorial measure also coarsely captures the communication complexity under imperfectly shared randomness. Once again, we note that the factor is tight unless we can improve the upper bound of [CGMS14].

{restatable}

theoremmainthmtwo Let be a permutation-invariant total function. Then, we have

Furthermore, .

1.3 Overview of Proofs

Our proof of Theorem 1.1 starts with the simple observation that for any permutation-invariant partial function , its value is determined completely by , and (where denotes the Hamming weight of and denotes the (non-normalized) Hamming distance between and ). By letting Alice and Bob exchange and (using bits of communication), the problem now reduces to being a function only of the Hamming distance . To understand the remaining task, we introduce a multiparameter version of the Hamming distance problem where is undefined if or or . The function is if and if .

This problem turns out to have different facets for different choices of the parameters. For instance, if , then the communication complexity of this problem is roughly and the optimal lower bound follows from the lower bound on Gap Hamming Distance [CR12, Vid11, She12] whereas the upper bound follows from simple hashing. However, when , and different bounds and protocols kick in. In this range, the communication complexity turns out to be with the upper bound coming from the protocol for Sparse-Gap-Inner-Product given in [CGMS14], and a lower bound that we give based on a reduction from Set Disjointness. In this work, we start by giving a complete picture of the complexity of for all parameter settings. The lower bound for communication complexity, and even information complexity, of general permutation-invariant functions follows immediately - we just look for the best choice of parameters of that can be found in . The upper bound requires more work in order to ensure that Alice and Bob can quickly narrow down the Hamming distance to a range where the value of is clear. To do this, we need to verify that does not change values too quickly or too often. The former follows from the fact that hard instances of cannot be embedded in , and the latter involves some careful accounting, leading to a full resolution.

Turning to the study of communication with imperfectly shared randomness, we hit an immediate obstacle when extending the above strategy since Alice and Bob cannot afford to exchange and anymore, since this would involve bits of communication and we only have an additional budget of . Instead, we undertake a partitioning of the “weight-space”, i.e., the set of pairs , into a finite number of regions. For most of the regions, we reduce the communication task to one of the Small-Set-Intersection or Small-Hamming-Distance problems. In the former case, the sizes of the sets are polynomially related to the randomized communication complexity, whereas in the latter case, the Hamming distance threshold is polynomially related to the communication complexity. A naive conversion to protocols for the imperfectly shared setting using the results of [CGMS14] would result in an exponential blow-up in the communication complexity. We give new protocols with imperfectly shared randomness for these two problems (which may be viewed as extensions of protocols in [BGI14] and [CGMS14]) that manage to reduce the communication blow-up to just a polynomial. This manages to take care of most regions, but not all. To see this, note that any total function can be encoded as a permutation-invariant function and such functions cannot be partitioned into few classes. Our classification manages to eliminate all cases except such functions, and in this case, we apply Newman’s theorem to conclude that the randomness needed in the perfectly shared setting is only bits (since the inputs to are in the range ). Communicating this randomness and then executing the protocol with perfectly shared randomness gives us in this case a private-randomness protocol with communication .

1.4 Roadmap of this paper

In Section 2, we give some of the basic definitions and introduce the background material necessary for understanding the contributions of this paper. In Section 3, we introduce our measure and prove Theorem 1.1. In Section 4, we show the connections between communication complexity with imperfectly shared randomness and that with perfectly shared randomness and prove Theorem 1.2. We end with a summary and some future directions in Section 5.

2 Preliminaries

In this section, we provide all the necessary background needed to understand the contributions in this paper.

2.1 Notations and Definitions

Throughout this paper, we will use bold letters such as , , etc. to denote strings in , where the -th bit of will be accessed as . We denote by the Hamming weight of binary string , i.e., the number of non-zero coordinates of . We will also denote by the Hamming distance between binary strings and , i.e., the number of coordinates in which and differ. We also denote for every positive integer .

Very significant for our body of work is the definition of permutation-invariant functions, which we define as follows,

Definition 2.1 (Permutation-Invariant functions).

A (total or partial) function is permutation-invariant if for all and every bijection , (where is such that ).

We note the following simple observation about permutation-invariant functions.

Observation 2.2.

Any permutation-invariant function depends only on , , and . Since these numbers add up to , really depends on any three of them, or in fact any three linearly independent combinations of them. Thus, we have that for some appropriate functions , ,

We will use these representations of interchangeably throughout this paper. We will often refer to the slices of obtained by fixing and for some and , in which case we will denote the sliced by either or , and similarly for .

2.2 Communication Complexity

We define the standard notions of two-way (resp. one-way) randomized commmunication complexity3 (resp. ), that is studied under shared/public randomness model (cf. [KN97]).

Definition 2.3 (Randomized communication complexity ).

For any function , the randomized communication complexity is defined as the cost of the smallest randomized protocol, which has access to public randomness, that computes correctly on any input with probability at least . In particular,

where the minimum is taken over all randomized protocols , where Alice and Bob have access to public randomness.

The one-way randomized communication complexity is defined similarly, with the only difference being that we allow only protocols where only Alice communicates to Bob, but not other way round.

Another notion of randomized communication complexity that is studied, is under private randomness model. The work of [CGMS14] sought out to study an intermediate model, where the two parties have access to i.i.d. samples from a correlated random source , that is, Alice has access to and Bob has access to . In their work, they considered the doubly symmetric binary source, parametrized by , defined as follows,

Definition 2.4 (Doubly Symmetric Binary Source ).

is a distribution on , such that for ,

Note that corresponds to the standard notion of public randomness, and corresponds to the standard notion of private randomness.

Definition 2.5 (Communication complexity with imperfectly shared randomness [Cgms14]).

For any function , the ISR-communication complexity is defined as the cost of the smallest randomized protocol, where Alice and Bob have access to samples from , that computes correctly on any input with probability at least . In particular,

where the minimum is taken over all randomized protocols , where Alice and Bob have access to samples from .

For ease of notation, we will often drop the subscript and denote .

We use the term ISR as abbreviation for “Imperfectly Shared Randomness” and ISR-CC for “ISR-Communication Complexity”. To emphasize the contrast, we will use PSR and PSR-CC for the classical case of (perfectly) shared randomness. It is clear that if , then .

An extreme case of ISR is when . This corresponds to communication complexity with private randomness, denoted by . Note that for any . A theorem (due to Newman [New91]) shows that any communication protocol using public randomness can be simulated using only private randomness with an extra communication of additive (both in the 1-way and 2-way models). We state the theorem here for the convenience of the reader.

Theorem 2.6 (Newman’s theorem [New91]).

For any function (any range ), the following hold,

here, is also and is also .

2.3 Information Complexity

Information complexity4 is an interactive analogue of Shannon’s information theory [Sha48]. Informally, information complexity is defined as the minimum number of bits of information that the two parties have to reveal to each other, when the inputs are coming from the ‘worst’ possible distribution .

Definition 2.7 ((Prior-Free) Interactive Information Complexity; [Bra12]).

For any , the (prior-free) interactive information complexity of , denoted by , is defined as,

where, the infimum is over all randomized protocols such that for all such that , and the supremum is over all distributions over . [ is the mutual information between and conditioned on ]

We refer the reader to the survey by Weinstein [Wei15] for a more detailed understanding of the definitions and the role of information complexity in communication complexity.

A general question of interest is: what is the relationship between and ? It is straightforward to show . Upper bounding as a function of has been investigated in several works including the work of Barak et al. [BBCR13]. The cleanest relation known is that [Bra12]. Our first result, namely Theorem 1.1, shows that for permutation-invariant functions, is not much larger than .

2.4 Some Useful Communication Problems

Central to our proof techniques is a multi-parameter version of Gap-Hamming-Distance, which we define as follows.

Definition 2.8 (Gap-Hamming-Distance, , ).

We define as the following partial function,

Additionally we define as the following partial function,

Informally, computing is equivalent to the following problem:

  • Alice is given such that

  • Bob is given such that

  • They wish to distinguish between the cases and .

In the problem, they wish to distinguish between the cases or . Arguably, is an ‘easier’ problem than . However, in this work we will show that in fact is not much harder than .

We will use certain known lower bounds on the information complexity and one-way communication complexity of for some settings of the parameters. The two main settings of parameters that we will be using correspond to the problems of Unique-Disjointness and Sparse-Indexing (a variant of the more well-known Indexing problem).

IC lower bound for Unique-Disjointness

Definition 2.9 (Unique-Disjointness, ).

is given by the following partial function,

Note that is an instance of .

Informally, is the problem where the inputs satisfy and Alice and Bob wish to decide whether or (promised that it is one of them is the case).

Lemma 2.10.

For all , .

Proof.

Bar-Yossef et al. [BJKS04] proved that General-Unique-Disjointness, that is, unique disjointness without restrictions on , on inputs of length , has information complexity . We convert the general Unique-Disjointness instance into an instance of by a simple padding argument as follows. Given an instance of general Unique-Disjointness . Alice constructs and Bob constructs . Note that . Also, we have that , and . Thus, we have reduced General-Unique-Disjointness to , and thus the lower bound of [BJKS04] implies that . ∎

On top of the above lemma, we apply a simple padding argument in addition to the above lower bound, to get a more general lower bound for Unique-Disjointness as follows.

Proposition 2.11 (Unique Disjointness IC Lower Bound).

For all ,

Proof.

We look at two cases, namely and .
Case 1. []: We have from Lemma 2.10 that . We map the instance of to an instance of by the following reduction, and . This implies that .

Case 2. []: We have from Lemma 2.10 that . As before, we map the instance of to an instance of by the following reduction, and . This implies that .

Combining the above two lower bounds, we get that . ∎

1-way CC lower bound for Sparse-Indexing

Definition 2.12 (Sparse-Indexing, ).

is given by the following partial function,

Note that is an instance of .

Informally, is the problem where the inputs satisfy and and Alice and Bob wish to decide whether or (promised that one of them is the case).

Lemma 2.13.

For all , .

Proof.

Jayram et al. [JKS08] proved that if Alice is given , Bob is given , and Bob needs to determine upon receving a single message from Alice, then Alice’s message should consist of bits, even if they are allowed shared randomness. Using their result, we deduce that via the following simple padding argument: Alice and Bob double the length of their strings from to , with Alice’s new input consisting of while Bob’s new input consists of , where is the bitwise complement of and is the indicator vector for location . Note that the Hamming weight of Alice’s new string is equal to while its length is , as desired. ∎

On top of the above lemma, we apply a simple padding argument in addition to the above lower bound, to get a more general lower bound for Sparse-Indexing as follows.

Proposition 2.14 (Sparse-Indexing 1-way CC Lower Bound).

For all ,

Proof.

We look at two cases, namely and .
Case 1. []: We have from Lemma 2.13 that . We map the instance of to an instance of by the following reduction, and . This implies that .

Case 2. []: We have from Lemma 2.13 that . We map the instance of to an instance of by the following reduction, and . This implies that .

Combining the above two lower bounds, we get that . ∎

3 Coarse Characterization of Information Complexity

In this section, we prove the first of our results, namely Theorem 1.1, which we restate below for convenience of the reader.

\mainthmone

*

where is the combinatorial measure we define in Definition 3.2.

3.1 Overview of proof

We construct a measure such that . In order to do this, we look at the slices of obtained by restricting and . As in Observation 2.2, let be the restriction of to and . We define the notion of a jump in as follows.

Definition 3.1 (Jump in ).

is a jump in if , both , are in and is undefined for .

Thus, any protocol that computes with low error will in particular be able to solve a Gap-Hamming-Distance problem as in Definition 2.8. Thus, if is a jump for , then is a lower bound on . We will prove lower bounds on for any value of , , and by obtaining a variety of reductions from Unique-Disjointness, and then our measure will be obtained by taking the largest of these lower bounds for over all choices of and and jumps in .

Suppose . We construct a randomized communication protocol with cost that computes correctly with low constant error. The protocol works as follows: First, Alice and Bob exchange the values and , which requires communication (say, and ). Now, all they need to figure out is the range in which lies (note that finding exactly can require communication!). Let be the set of all jumps in . Note that the intervals are all pairwise disjoint. To compute , it suffices for Alice and Bob need to resolve each jump, that is, for each , they need to figure out whether or . We will show that any particular jump can be resolved with a constant probability of error using communication, and the number of jumps is at most . Although the number of jumps is large, it suffices for Alice and Bob to do a binary search through the jumps, which will require them to resolve only jumps each requiring communication. Thus, the total communication cost will be .5

3.2 Proof of Theorem 1.1

As outlined earlier, we define the measure as follows.

Definition 3.2 (Measure ).

Given a permutation-invariant function and integers , , s.t. , let be the function given by if there exist with , , and otherwise. (Note. by permutation invariance of , is well-defined.) Let be the set of jumps in , defined as follows,

Then, we define as follows.

where is a suitably large constant (which does not depend on ).

We will need the following lemma to show that the measure is a lower bound on .

Lemma 3.3.

For all for which is a meaningful problem6, the following lower bounds hold,

Next, we obtain randomized communication protocols to solve .

Lemma 3.4.

Let . Then, the following upper bounds hold

We defer the proofs of the Lemmas 3.3 and 3.4 to Section 3.3. For now, we will use these lemmas to prove Theorem 1.1. First, we show that each jump can be resolved using communication.

Lemma 3.5.

Suppose , and let and . Let be any jump in . Then the problem of , that is, deciding whether or can be solved, with a constant probability of error, using communication.

Proof.

We can assume without loss of generality that . This is because both Alice and Bob can flip their respective inputs to ensure . Also, if , then we can flip the role of Alice and Bob to get .

Since , from the definition of in Lemma 3.3 we have that,

We consider two cases as follows.

Case 1. or : In this case we have, from part (ii) of Lemma 3.4, a randomized protocol with cost .

Case 2. : In this case, we will show that . Clearly, and . We will now show that in fact . From part (ii) of Lemma 3.3 we know that either or . Thus, it suffices to show that . We know that . The left inequality gives (since ). The right inequality gives . Since , we have . ∎

Next, we obtain an upper bound on , that is, the number of jumps in .

Lemma 3.6.

For any function with , the number of jumps in is at most .

Proof.

Let be the set of all jumps in . Partition into , where and . From second part in Lemma 3.3 we know the following:

Let