# Limits of Ordered Graphs and Images

Omri Ben-Eliezer, Tel Aviv University, Israel. Email: omrib@mail.tau.ac.il.

Eldar Fischer, Technion - Israel Institute of Technology, Israel. Email: eldar@cs.technion.ac.il.

Amit Levi, University of Waterloo, Canada. Email: amit.levi@uwaterloo.ca. Research supported by the David R. Cheriton Graduate Scholarship. Part of this work was done while the author was visiting the Technion.

Yuichi Yoshida, National Institute of Informatics (NII), Japan. Email: yyoshida@nii.ac.jp. Research supported by JSPS KAKENHI Grant Number JP17H04676.

###### Abstract

The emerging theory of graph limits exhibits an interesting analytic perspective on graphs, showing that many important concepts and tools in graph theory and its applications can be described naturally in analytic language. We extend the theory of graph limits to the ordered setting, presenting a limit object for dense vertex-ordered graphs, which we call an orderon. Images are an example of dense ordered bipartite graphs, where the rows and the columns constitute the vertices, and pixel colors are represented by row-column edges; thus, as a special case, we obtain a limit object for images.

Along the way, we devise an ordered locality-preserving variant of the cut distance between ordered graphs, showing that two graphs are close with respect to this distance if and only if they are similar in terms of their ordered subgraph frequencies. We show that the space of orderons is compact with respect to this distance notion, which is key to a successful analysis of combinatorial objects through their limits. For the proof we combine techniques used in the unordered setting with several new techniques specifically designed to overcome the challenges arising in the ordered setting. We derive several results related to sampling and property testing on ordered graphs and images; for example, we describe how one can use the analytic machinery to obtain a new proof of the ordered graph removal lemma [Alon et al., FOCS 2017].

## 1 Introduction

Large graphs appear in many applications across all scientific areas. Naturally, it is interesting to try to understand their structure and behavior: When can we say that two graphs are similar (even if they do not have the same size)? How can the convergence of graph sequences be defined? What properties of a large graph can we capture by taking a small sample from it?

The theory of graph limits addresses such questions from an analytic point of view. The investigation of convergent sequences of dense graphs was started to address three seemingly unrelated questions asked in different fields: statistical physics, theory of networks and the Internet, and quasi-randomness. A comprehensive series of papers [BCL06a, BCL06b, LS06, FLS07, LS07, BCL08, BCL10, BCL12] laid the infrastructure for a rigorous study of the theory of dense graph limits, demonstrating various applications in many areas of mathematics and computer science. The book of Lovász on graph limits [Lov12] presents these results in a unified form.

A sequence $\{G_n\}_{n=1}^{\infty}$ of finite graphs, whose number of vertices tends to infinity as $n \to \infty$, is considered convergent (in unordered graphs, this is also called convergence from the left; see the discussion in [BCL08]) if the frequency of any fixed graph $F$ as a subgraph in $G_n$ converges as $n \to \infty$. Here, the frequency of $F$ in $G$ is roughly defined as the ratio of induced subgraphs of $G$ isomorphic to $F$ among all induced subgraphs of $G$ on $|V(F)|$ vertices. The limit object of a convergent sequence of (unordered) graphs in the dense setting, called a graphon, is a measurable symmetric function $W : [0,1]^2 \to [0,1]$, and it was proved in [LS06] that, indeed, for any convergent sequence $\{G_n\}$ of graphs there exists a graphon serving as the limit of $\{G_n\}$ in terms of subgraph frequencies. Apart from their role in the theory of graph limits, graphons are useful in probability theory, as they give rise to exchangeable random graph models; see e.g. [DJ08, OR15]. An analytic theory of convergence has been established for other types of discrete structures. These include sparse graphs, for which many different (and sometimes incomparable) notions of limits exist (see e.g. [BC17, BCG17] for two recent papers citing and discussing many of the works in this field); permutations, first developed in [HKM13] and further investigated in several other works; partial orders [Jan11]; and high dimensional functions over finite fields [Yos16]. The limit theory of dense graphs has also been extended to hypergraphs, see [Zha15, ES12] and the references within.

In this work we extend the theory of dense graph limits to the ordered setting, establishing a limit theory for vertex-ordered graphs in the dense setting, and as a by-product, for images, which can be viewed as ordered graph-like structures that are inherently dense; see Subsection 1.4 for a discussion regarding images. An ordered graph is a symmetric function $G : [n]^2 \to \{0,1\}$. $G$ is simple if $G(x,x) = 0$ for any $x \in [n]$. A weighted ordered graph is a symmetric function $F : [n]^2 \to [0,1]$. Unlike the unordered setting, where $G, G' : [n]^2 \to \{0,1\}$ are considered isomorphic if there is a permutation $\pi$ over $[n]$ so that $G(x,y) = G'(\pi(x),\pi(y))$ for any $x \neq y \in [n]$, in the ordered setting, the automorphism group of a graph $G$ is trivial: $G$ is only isomorphic to itself through the identity function.

For simplicity, we consider in the following only graphs (without edge colors). All results here can be generalized in a relatively straightforward manner to edge-colored graph-like ordered structures, in which pairs of vertices may have one of $r \geq 2$ colors (the definition above corresponds to the case $r = 2$). This is done by replacing the range $[0,1]$ with the $(r-1)$-dimensional simplex (which corresponds to the set of all possible distributions over $[r]$). As we shall see in Subsection 1.2, the main results proved in this paper are, in a sense, natural extensions of results in the unordered setting. However, proving them requires machinery that is heavier than that used in the unordered setting: the tools used in the unordered setting are not rich enough to overcome the subtleties materializing in the ordered setting. In particular, the limit object we use in the ordered setting, which we call an orderon, has a $4$-dimensional structure that is more complicated than the analogous $2$-dimensional structure of the graphon, the limit object for the unordered setting. The tools required to establish the ordered theory are described next.

### 1.1 Main ingredients

Let us start by considering a simple yet elusive sequence of ordered graphs, which has the makings of convergence. The odd-clique ordered graph $H_n$ on $2n$ vertices is defined by setting $H_n(i,j) = 1$ (i.e., having an edge between vertices $i$ and $j$) if and only if $i \neq j$ and $i, j$ are both odd, and otherwise setting $H_n(i,j) = 0$. In this subsection we closely inspect this sequence to demonstrate the challenges arising while trying to establish a theory for ordered graphs, and the solutions we propose for them. First, let us define the notions of subgraph frequency and convergence.

The (induced) frequency $t(F,G)$ of a simple ordered graph $F$ on $k$ vertices in an ordered graph $G$ with $n$ vertices is the probability that, if one picks $k$ vertices of $G$ uniformly and independently (repetitions are allowed) and reorders them as $x_1 \leq \dots \leq x_k$, $F$ is isomorphic to the induced ordered subgraph of $G$ over $x_1, \dots, x_k$. (The latter is defined as the ordered graph $H$ on $k$ vertices satisfying $H(i,j) = G(x_i,x_j)$ for any $i,j \in [k]$.) A sequence $\{G_n\}_{n=1}^{\infty}$ of ordered graphs is convergent if $|V(G_n)| \to \infty$ as $n \to \infty$, and the frequency $t(F,G_n)$ of any simple ordered graph $F$ converges as $n \to \infty$. Observe that the odd-clique sequence $\{H_n\}$ is indeed convergent: The frequency of the empty $k$-vertex graph in $H_n$ tends to $(k+1)2^{-k}$ as $n \to \infty$, the frequency of any non-empty $k$-vertex ordered graph containing only a clique and a (possibly empty) set of isolated vertices tends to $2^{-k}$, and the frequency of any other graph in $H_n$ is $0$.
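The definition of ordered frequency can be checked by exhaustive enumeration on small instances. The following is a minimal sketch, not part of the paper; the helper names `odd_clique` and `ordered_frequency` are illustrative assumptions, and graphs are stored as 0-indexed adjacency matrices.

```python
from itertools import product

def odd_clique(n):
    """Adjacency matrix of the odd-clique graph on 2n vertices.

    Vertex labels are 1..2n; label v is stored at index v - 1 (an
    implementation choice made here, not notation from the text).
    """
    size = 2 * n
    return [[1 if i != j and i % 2 == 1 and j % 2 == 1 else 0
             for j in range(1, size + 1)]
            for i in range(1, size + 1)]

def ordered_frequency(F, G):
    """Exact induced ordered frequency of F in G.

    Enumerates all k-tuples of vertices of G (repetitions allowed),
    sorts each tuple, and counts the tuples whose induced ordered
    subgraph equals F entry by entry.
    """
    k, n = len(F), len(G)
    hits = 0
    for picks in product(range(n), repeat=k):
        xs = sorted(picks)
        if all(G[xs[i]][xs[j]] == F[i][j]
               for i in range(k) for j in range(k)):
            hits += 1
    return hits / n ** k

if __name__ == "__main__":
    empty2 = [[0, 0], [0, 0]]
    print(ordered_frequency(empty2, odd_clique(50)))
```

For $k = 2$ the empty graph has frequency $3/4 + 1/(4n)$ in the odd-clique graph on $2n$ vertices, which approaches the limit $(k+1)2^{-k} = 3/4$; the extra $1/(4n)$ comes from picking the same odd vertex twice.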

In light of previous works on the unordered theory of convergence, we look for a limit object for ordered graphs that has the following features.

Representation of finite ordered graphs

The limit object should have a natural and consistent representation for finite ordered graphs. As is the situation with graphons, we allow graphs $G$ and $H$ to have the same representation when one is a blowup of the other. (A graph $G$ on $nt$ vertices is an ordered $t$-blowup of $H$ on $n$ vertices if $G(x,y) = H(\lceil x/t \rceil, \lceil y/t \rceil)$ for any $x$ and $y$.)

Usable distance notion

Working directly with the definition of convergence in terms of subgraph frequencies is difficult. The limit object we seek should be endowed with a metric, like the cut distance for unordered graphs (see discussion below), that should be easier to work with and must have the following property: A sequence of ordered graphs is convergent (in terms of frequencies) if and only if it is Cauchy in the metric.

Completeness and compactness

The space of limit objects must be complete with respect to the metric: Cauchy sequences should converge in this metric space. Combined with the previous requirements, this will ensure that any convergent sequence of ordered graphs has a limit (in terms of ordered frequencies), as desired. It is even better if the space is compact, as compactness is essentially an “ultimately strong” version of Szemerédi’s regularity lemma [Sze76], and will help to develop applications of the theory in other areas.

Additionally, we would like the limit object to be as simple as possible, without unnecessary over-representation. In the unordered setting, the metric used is the cut distance, introduced by Frieze and Kannan [FK96, FK99] and defined as follows. First, we define the cut norm $\|W\|_\square$ of a function $W : [0,1]^2 \to \mathbb{R}$ as the supremum of $\left| \int_{S \times T} W(x,y)\, dx\, dy \right|$ over all measurable subsets $S, T \subseteq [0,1]$. The cut distance between graphons $W$ and $W'$ is the infimum of $\|W^f - W'\|_\square$ over all measure-preserving bijections $f : [0,1] \to [0,1]$, where $W^f(x,y) = W(f(x),f(y))$.
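For a step function with $n$ equal-width steps, the supremum in the cut norm is attained on unions of steps, so it can be computed by brute force on small examples. The sketch below is illustrative only; the function name `cut_norm_step` and the matrix encoding are assumptions made here, and the running time is exponential in $n$.

```python
from itertools import product

def cut_norm_step(D):
    """Cut norm of the step function with n equal steps and value matrix D.

    For each subset S of rows, the best column set T takes either all
    columns with positive partial sum or all columns with negative
    partial sum, so only the 2^n choices of S are enumerated.
    """
    n = len(D)
    best = 0.0
    for S in product((0, 1), repeat=n):
        col_sums = [sum(D[i][j] for i in range(n) if S[i]) for j in range(n)]
        positive = sum(c for c in col_sums if c > 0)
        negative = -sum(c for c in col_sums if c < 0)
        best = max(best, positive, negative)
    # Each cell of the n-by-n grid has area 1 / n^2.
    return best / n ** 2

if __name__ == "__main__":
    print(cut_norm_step([[1, -1], [-1, 1]]))
```

For the $2 \times 2$ checkerboard matrix above, taking $S$ and $T$ to be the first step gives the maximum value $1/4$, while taking both steps gives $0$; this cancellation is what distinguishes the cut norm from the $L^1$ norm.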

For the ordered setting, we look for a similar metric; the cut distance itself does not suit us, as measure-preserving bijections do not preserve ordered subgraph frequencies in general. A first intuition is then to try graphons as the limit object, endowed with the metric $d_\square(W,W') = \|W - W'\|_\square$. However, this metric does not satisfy the second requirement: the odd-clique sequence is convergent, yet it is not Cauchy in $d_\square$, since the $d_\square$-distance between the graphon representations of $H_n$ and $H_{2n}$ is bounded away from zero for any $n$. Seeing that $d_\square$ seems “too strict” as a metric and does not capture the similarities between large odd-clique graphs well, it might make sense to use a slightly more “flexible” metric, which allows for measure-preserving bijections, as long as they do not move any of the points too far from its original location. In view of this, we define the cut-shift distance between two graphons $W, W'$ as

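$$d_\triangle(W,W') \stackrel{\rm def}{=} \inf_{f} \left( \mathrm{Shift}(f) + \|W^f - W'\|_\square \right), \tag{1}$$

where $f : [0,1] \to [0,1]$ is a measure-preserving bijection, $\mathrm{Shift}(f) = \sup_{x \in [0,1]} |f(x) - x|$, and $W^f(x,y) = W(f(x),f(y))$ for any $x, y \in [0,1]$. As we show in this paper (Theorem 1.2 below), the cut-shift distance settles the second requirement: a sequence of ordered graphs is convergent if and only if it is Cauchy in the cut-shift distance.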
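To see why a small shift helps for the odd-clique sequence, one can compare the step-function representations of the odd-clique graphs on $2n$ and $4n$ vertices on a common grid of $4n$ cells. The sketch below is an illustration under assumptions made here (the helper names, the grid encoding, and the particular bijection are not from the paper): it applies an interval exchange that swaps the two middle cells in every block of four, and reports its shift together with the $L^1$ distance after the swap, which upper-bounds the cut norm.

```python
def odd_clique(n):
    """Odd-clique graph on 2n vertices; label v is stored at index v - 1."""
    size = 2 * n
    return [[1 if i != j and i % 2 == 1 and j % 2 == 1 else 0
             for j in range(1, size + 1)]
            for i in range(1, size + 1)]

def step_matrix(G, cells):
    """Values of the step representation of G on a grid of `cells` equal cells."""
    t = cells // len(G)  # assumes len(G) divides cells
    return [[G[c // t][d // t] for d in range(cells)] for c in range(cells)]

def local_swap(cells):
    """Cell permutation swapping positions 1 and 2 inside each block of 4."""
    return [c + 1 if c % 4 == 1 else c - 1 if c % 4 == 2 else c
            for c in range(cells)]

def local_swap_bound(n):
    """Return (shift, l1) for the odd-clique graphs on 2n and 4n vertices."""
    cells = 4 * n
    A = step_matrix(odd_clique(n), cells)
    B = step_matrix(odd_clique(2 * n), cells)
    p = local_swap(cells)
    # The interval exchange moves each point by |p(c) - c| cell widths.
    shift = max(abs(p[c] - c) for c in range(cells)) / cells
    # A^f(c, d) = A(p(c), p(d)); the L1 distance upper-bounds the cut norm.
    l1 = sum(abs(A[p[c]][p[d]] - B[c][d])
             for c in range(cells) for d in range(cells)) / cells ** 2
    return shift, l1

if __name__ == "__main__":
    print(local_swap_bound(5))
```

In this construction the shift is $1/(4n)$ and the remaining $L^1$ discrepancy is $1/(8n)$, so the sum of the two terms in (1) is at most $3/(8n)$ for this pair, which tends to zero as $n$ grows.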

Consider now graphons as a limit object, coupled with the cut-shift distance as a metric. Do graphons satisfy the third requirement? In particular, does there exist a graphon whose ordered subgraph frequencies are equal to the limit frequencies for the odd-clique sequence? The answers to both of these questions are negative: it can be shown that such a graphon cannot exist in view of Lebesgue’s density theorem, which states that there is no measurable subset of $[0,1]$ whose density in every interval is $1/2$ (see e.g. Theorem 2.5.1 in the book of Franks on Lebesgue measure [Fra09]). Thus, we need a somewhat richer ordered limit object that will allow us to “bypass” the consequences of Lebesgue’s density theorem. Consider for a moment the graphon representations of the odd clique graphs. In these graphons, the domain $[0,1]$ can be partitioned into increasingly narrow intervals that alternately represent odd and even vertices. Intuitively, it seems that our limit object needs to be able to contain infinitesimal odd and even intervals at any given location, leading us to the following limit object candidate, which we call an orderon.

An orderon is a symmetric measurable function $W : ([0,1]^2)^2 \to [0,1]$ viewed, intuitively and loosely speaking, as follows. In each point $(x,a) \in [0,1]^2$, corresponding to an infinitesimal “vertex” of the orderon, the first coordinate, $x$, represents a location in the linear order of $[0,1]$. Each set $\{x\} \times [0,1]$ can thus be viewed as an infinitesimal probability space of vertices that have the same location in the linear order. The role of the second coordinate is to allow “variability” (in terms of probability) of the infinitesimal “vertex” occupying this point in the order. The definition of the frequency $t(F,W)$ of a simple ordered graph $F$ on $k$ vertices in an orderon $W$ is a natural extension of frequency in graphs. First, define the random variable $\mathbf{G}(k,W)$ as follows: Pick $k$ points in $[0,1]^2$ uniformly and independently, order them according to the first coordinate as $(x_1,a_1), \dots, (x_k,a_k)$ with $x_1 \leq \dots \leq x_k$, and then return a $k$-vertex graph $G$, in which the edge between each pair of vertices $i$ and $j$ exists with probability $W((x_i,a_i),(x_j,a_j))$, independently of other edges. The frequency $t(F,W)$ is defined as the probability that the graph generated according to $\mathbf{G}(k,W)$ is isomorphic to $F$.

Consider the orderon $W$ satisfying $W((x,a),(y,b)) = 1$ if and only if $a, b \leq 1/2$, and otherwise $W((x,a),(y,b)) = 0$. $W$ now emerges as a natural limit object for the odd-clique sequence: one can verify that the subgraph frequencies in it are as desired.
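The sampling procedure behind $\mathbf{G}(k,W)$ can be sketched directly, and a Monte Carlo estimate of the frequency then gives a sanity check for the odd-clique limit. This is a minimal sketch assuming an orderon is given as a Python callable on pairs of points; the names `odd_clique_orderon`, `sample_ordered_graph`, and `estimate_frequency` are hypothetical and chosen here for illustration.

```python
import random

def odd_clique_orderon(v1, v2):
    """Candidate limit of the odd-clique sequence: 1 iff both second coordinates are at most 1/2."""
    (_, a), (_, b) = v1, v2
    return 1.0 if a <= 0.5 and b <= 0.5 else 0.0

def sample_ordered_graph(k, W, rng):
    """Draw one k-vertex ordered graph from the orderon W."""
    # Pick k uniform points of the unit square and order them by first coordinate.
    points = sorted(((rng.random(), rng.random()) for _ in range(k)),
                    key=lambda p: p[0])
    G = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            # Edge (i, j) appears independently with probability W(p_i, p_j).
            if rng.random() < W(points[i], points[j]):
                G[i][j] = G[j][i] = 1
    return G

def estimate_frequency(F, W, trials, seed=0):
    """Monte Carlo estimate of the frequency of the ordered graph F in W."""
    rng = random.Random(seed)
    k = len(F)
    hits = sum(sample_ordered_graph(k, W, rng) == F for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    empty2 = [[0, 0], [0, 0]]
    print(estimate_frequency(empty2, odd_clique_orderon, trials=20000))
```

For the two-vertex empty graph the exact frequency in this orderon is $1 - (1/2)^2 = 3/4$, matching the limit $(k+1)2^{-k}$ of the odd-clique sequence for $k = 2$; the estimate should be close to that value up to sampling error.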

The cut-shift distance for orderons is defined similarly to (1), except that $f$ is now a measure-preserving bijection from $[0,1]^2$ to $[0,1]^2$ and $\mathrm{Shift}(f) = \sup_{(x,a) \in [0,1]^2} |x - \pi_1(f(x,a))|$, where $\pi_1$ is the projection to the first coordinate.

### 1.2 Main results

Let $\mathcal{W}$ denote the space of orderons endowed with the cut-shift distance. In view of Lemma 2.8 below, $d_\triangle$ is a pseudo-metric for $\mathcal{W}$. By identifying $W, U \in \mathcal{W}$ whenever $d_\triangle(W,U) = 0$, we get a metric space $\widetilde{\mathcal{W}}$. The following result is the main component for the viability of our limit object, settling the third requirement above.

###### Theorem 1.1.

The space $\widetilde{\mathcal{W}}$ is compact with respect to $d_\triangle$.

The proof of Theorem 1.1 is significantly more involved than the proof of its unordered analogue. While at a very high level, the roadmap of the proof is similar to that of the unordered one, our setting induces several new challenges, and to handle them we develop new shape approximation techniques. These are presented along the proof of the theorem in Section 4.

The next result shows that convergence in terms of frequencies is equivalent to being Cauchy in $d_\triangle$. This settles the second requirement.

###### Theorem 1.2.

Let $\{W_n\}_{n=1}^{\infty}$ be a sequence of orderons. Then $\{W_n\}$ is Cauchy in $d_\triangle$ if and only if $t(F,W_n)$ converges for any fixed simple ordered graph $F$.

As a corollary of the last two results, we get the following.

###### Corollary 1.3.

For every convergent sequence of ordered graphs $\{G_n\}_{n=1}^{\infty}$, there exists an orderon $W \in \mathcal{W}$ such that $t(F,G_n) \to t(F,W)$ for every ordered graph $F$.

The next main result is a sampling theorem, stating that a large enough sample from an orderon is almost always close to it in cut-shift distance. For this, we define the orderon representation $W_G$ of an $n$-vertex ordered graph $G$ by setting $W_G((x,a),(y,b)) = G(Q_n(x),Q_n(y))$ for any $x,a,y,b \in [0,1]$, where we define $Q_n(x) = \lceil nx \rceil$ for $x > 0$ and $Q_n(0) = 1$. This addresses the first requirement.

###### Theorem 1.4.

Let $k$ be a positive integer and let $W \in \mathcal{W}$ be an orderon. Let $G \sim \mathbf{G}(k,W)$. Then,

$$d_\triangle(W,W_G) \leq C \left( \frac{\log\log k}{\log k} \right)^{1/3}$$

holds with probability at least for some constant .

Theorem 1.4 implies, in particular, that ordered graphs are a dense subset in .

###### Corollary 1.5.

For every orderon and every , there exists a simple ordered graph on at most vertices such that .

#### Applications

We finish by mentioning two applications of the ordered limit theory to illustrate the use of our theory. We provide a full proof for the first one and a detailed sketch for the other. The first application is concerned with naturally estimable ordered graph parameters, defined as follows.

###### Definition 1.6 (Naturally Estimable Parameter).

An ordered graph parameter $f$ is naturally estimable if for every $\varepsilon > 0$ and $\delta > 0$ there is a positive integer $k = k(\varepsilon,\delta)$ satisfying the following. If $G$ is an ordered graph with at least $k$ nodes and $G|_k$ is the subgraph induced by a uniformly random ordered set of exactly $k$ nodes of $G$, then

$$\Pr_{G|_k}\left[\, |f(G) - f(G|_k)| > \varepsilon \,\right] < \delta.$$
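As an informal illustration of the definition, one can sample an induced ordered subgraph on $k$ nodes and compare a simple parameter on the sample and on the whole graph. The sketch below uses edge density as the parameter; this choice, and the helper names `induced_sample` and `edge_density`, are assumptions made here for illustration and do not come from the paper.

```python
import random

def odd_clique(n):
    """Odd-clique graph on 2n vertices; label v is stored at index v - 1."""
    size = 2 * n
    return [[1 if i != j and i % 2 == 1 and j % 2 == 1 else 0
             for j in range(1, size + 1)]
            for i in range(1, size + 1)]

def induced_sample(G, k, rng):
    """Subgraph induced by a uniformly random set of exactly k nodes, kept in order."""
    nodes = sorted(rng.sample(range(len(G)), k))
    return [[G[u][v] for v in nodes] for u in nodes]

def edge_density(G):
    """Fraction of vertex pairs that are edges (an example parameter)."""
    n = len(G)
    edges = sum(G[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)

if __name__ == "__main__":
    rng = random.Random(1)
    G = odd_clique(100)
    sample = induced_sample(G, 150, rng)
    print(edge_density(G), edge_density(sample))
```

Edge density does not depend on the vertex order, so it only illustrates the sampling step; order-sensitive parameters can be plugged into the same comparison.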

The following result provides an analytic characterization of ordered natural estimability.

###### Theorem 1.7.

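Let $f$ be a bounded simple ordered graph parameter. Then, the following are equivalent:

1. $f$ is naturally estimable.

2. For every convergent sequence $\{G_n\}_{n=1}^{\infty}$ of ordered simple graphs with $|V(G_n)| \to \infty$, the sequence of numbers $\{f(G_n)\}_{n=1}^{\infty}$ is convergent.

3. There exists a functional $\hat{f}$ over $\mathcal{W}$ that satisfies the following:

   1. $\hat{f}$ is continuous with respect to $d_\triangle$.

   2. For every $\varepsilon > 0$, there is $k = k(\varepsilon)$ such that for every ordered graph $G$ with $|V(G)| \geq k$, it holds that $|\hat{f}(W_G) - f(G)| \leq \varepsilon$.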

Our second application is a new analytic proof of the ordered graph removal lemma of [ABEF17], implying that every hereditary property of ordered graphs (and images over a fixed alphabet) is testable, with one-sided error, using a constant number of queries. (For the relevant definitions, see [ABEF17] and Definition 1.6 here.)

###### Theorem 1.8 ([ABEF17]).

Let be an hereditary property of ordered graphs, and fix . Then there exists satisfying the following: For every ordered graph on vertices that is -far from , the probability that does not satisfy is at least .

The proof is rather long and involved (but somewhat cleaner than the combinatorial proof in [ABEF17]), and here we only provide a detailed sketch for it. The complete proof will appear in the full version of this paper, and will contain a proof (via Theorem 1.7) that the distance from any given hereditary property of ordered graphs is a naturally estimable graph parameter.

### 1.3 Related work

The theory of graph limits has strong ties to the area of property testing, especially in the dense setting. Regularity lemmas for graphs, starting with the well-known regularity lemma of Szemerédi [Sze76], later to be joined by the weaker (but more efficient) versions of Frieze and Kannan [FK96, FK99] and the stronger variants of Alon et al. [AFKS00], among others, have been very influential in the development of property testing. For example, regularity was used to establish the testability of all hereditary properties in graphs [AS08], the relationship between the testability and estimability of graph parameters [FN07], and combinatorial characterizations of testability [AFNS09]. The analytic theory of convergence, built using the cut distance and its relation to the weak regularity lemma, has proved to be an interesting alternative perspective on these results. Indeed, the aforementioned results have equivalent analytic formulations, in which both the statement and the proof seem cleaner and more natural. A recent line of work has shown that many of the classical results in property testing of dense graphs can be extended to dense ordered graph-like structures, including vertex-ordered graphs and images. In [ABEF17], it was shown that the testability of hereditary properties extends to the ordered setting (see Theorem 1.8 above). Shortly after, in [BEF18] it was proved that characterizations of testability in unordered graphs can be partially extended to similar characterizations in ordered graph-like structures, provided that the property at stake is sufficiently “well-behaved” in terms of order.

Graphons and their sparse analogues have various applications in different areas of mathematics, computer science, and even social sciences. The connections between graph limits and real-world large networks have been very actively investigated; see the survey of Borgs and Chayes [BC17]. Graph limits have applications in probability and data analysis [OR15]. Graphons were used to provide new analytic proofs of results in extremal graph theory; see Chapter 16 in [Lov12]. Through the notion of free energy, graphons were also shown to be closely connected to the field of statistical physics [BCL12]. We refer the reader to [Lov12] for more details.

### 1.4 Limits of images

An interesting direction for future investigation is to establish a theory of convergence for images, suitable for practical applications. A two-dimensional image is one of the most widely investigated structures in computer science, being the main object of interest in computer vision. Nowadays, this field is largely dominated by deep learning based methods (see the recent survey [VDDP18]), that are usually very effective, but the mathematical theory behind them is not yet sufficiently established. The use of analytic models to represent images, like those naturally arising when studying theories of convergence, might be an intriguing approach in which meaningful mathematical results can be proved.

The ordered limit theory presented here can be easily adapted to binary images, i.e., ordered bipartite graphs , as long as . We believe that the results can be generalized to images with range for fixed (including greyscale images and RGB (red-green-blue) images, corresponding to the cases and ) through a suitable generalization of the relevant definitions, as was done for unordered graphs; see Chapter 5 of Lovász’s book [Lov12] for more details. The limit object for images can be viewed as a bipartite variant of an orderon: This is a measurable (and not necessarily symmetric) function , where sets of the form in the first and second coordinate of correspond to infinitesimal rows and columns, respectively. Convergence is in terms of (non-consecutive) sub-image frequencies, calculated by picking points in infinitesimal rows and points in infinitesimal columns uniformly at random, and inspecting the value of on their intersection.

While our type of limit object is a natural extension of the unordered one and has applications in other areas, it seems that for practical applications in computer vision, one has to design more specifically tailored types of limit objects for images. Let us present two possible use-cases. First, the problem of understanding a continuous scene from a (possibly sparse) series of still images is one of the central tasks in computer vision, and when describing each of the images as an analytic object in some suitable space, the continuous scene should naturally correspond to a smooth curve in this space. To design a useful limit object for problems of this type, one has to consider analytic objects that are relatively robust against geometric transformations typically occurring due to movements of the camera and the objects in the scene, temporary occlusions of elements in the scene, slight changes in the amount of light, and other challenging phenomena typically occurring in continuous scenes. On the other hand, the analytic representations should be “sufficiently far” from each other for images that do not look similar to the human eye. Another possible application is template matching, where the task is to find an approximate instance of a given pattern in an image, possibly rotated, shadowed, and partly occluded. Local sampling-based methods were shown to be useful for template matching [KRTA17], raising hopes that a local type of limit object might be helpful here. A recent line of work [BEKR17, BE19] studies local properties of images and pattern matching from the perspective of property testing, and might be useful in the development of such a limit object.

## 2 Preliminaries

In this section we formally describe some of the basic ingredients of our theory, including the limit object (the orderon) and several distance notions, including the cut-norm for orderons (both unordered and ordered variants are presented) and the cut-shift distance. We then show that the latter is a pseudo-metric for the space of orderons. This will later allow us to view the space of orderons as a metric space, by identifying orderons of cut-shift distance $0$.

The measure used here is the Lebesgue measure, denoted by $\lambda$. We start with the formal definition of an orderon.

###### Definition 2.1 (Orderon).

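An orderon is a measurable function $W : ([0,1]^2)^2 \to [0,1]$ that is symmetric in the sense that $W((x,a),(y,b)) = W((y,b),(x,a))$ for all $(x,a),(y,b) \in [0,1]^2$. For the sake of brevity, we also denote $W((x,a),(y,b))$ by $W(v_1,v_2)$ for $v_1, v_2 \in [0,1]^2$.

We denote the set of all orderons by $\mathcal{W}$.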

###### Definition 2.2 (measure-preserving bijection).

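A map $g : [0,1]^2 \to [0,1]^2$ is measure preserving if the pre-image $g^{-1}(X)$ is measurable for every measurable set $X$ and $\lambda(g^{-1}(X)) = \lambda(X)$. A measure preserving bijection is a measure preserving map whose inverse map exists (and is also measure preserving).

Let $\mathcal{F}$ denote the collection of all measure preserving bijections from $[0,1]^2$ to itself. Given an orderon $W \in \mathcal{W}$ and $f \in \mathcal{F}$, we define $W^f$ as the unique orderon satisfying $W^f((x,a),(y,b)) = W(f(x,a),f(y,b))$ for any $x,a,y,b \in [0,1]$. Additionally, denote by $\pi_1 : [0,1]^2 \to [0,1]$ the projection to the first coordinate, that is, $\pi_1(x,a) = x$ for any $(x,a) \in [0,1]^2$.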

### 2.1 Cut-norm and ordered cut-norm

The definition of the (unordered) cut-norm for orderons is analogous to the corresponding definition for graphons.

###### Definition 2.3 (cut-norm).

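Given a symmetric measurable function $W : ([0,1]^2)^2 \to \mathbb{R}$, we define the cut-norm of $W$ as

$$\|W\|_\square \stackrel{\rm def}{=} \sup_{S,T \subseteq [0,1]^2} \left| \int_{(v_1,v_2) \in S \times T} W(v_1,v_2)\, dv_1\, dv_2 \right|,$$

where the supremum is taken over measurable sets $S$ and $T$.

As we are working with ordered objects, the following definition of ordered cut-norm will sometimes be of use (in particular, see Section 6). Given $v_1, v_2 \in [0,1]^2$, we write $v_1 \leq v_2$ to denote that $\pi_1(v_1) \leq \pi_1(v_2)$. Let $\mathbf{1}_E$ be the indicator function for the event $E$.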

###### Definition 2.4 (Ordered cut-norm).

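Let $W : ([0,1]^2)^2 \to \mathbb{R}$ be a symmetric measurable function. The ordered cut norm of $W$ is defined as

$$\|W\|_{\square'} \stackrel{\rm def}{=} \sup_{S,T \subseteq [0,1]^2} \left| \int_{(v_1,v_2) \in S \times T} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right|.$$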

We mention two important properties of the ordered-cut norm. The first is a standard smoothing lemma, and the second is a relation between the ordered cut-norm and the unordered cut-norm.

###### Lemma 2.5.

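Let $W \in \mathcal{W}$ and let $\mu, \nu : [0,1]^2 \to [0,1]$ be measurable functions. Then,

$$\left| \int_{v_1,v_2} \mu(v_1)\, \nu(v_2)\, W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \leq \|W\|_{\square'}.$$
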
###### Proof.

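Fix partitions $\mathcal{S} = \{S_1, \dots, S_s\}$ and $\mathcal{T} = \{T_1, \dots, T_t\}$ of $[0,1]^2$. We show below that the claim holds when $\mu$ and $\nu$ are step functions on $\mathcal{S}$ and $\mathcal{T}$, respectively. Then, the proof is complete by the fact that all integrable functions are approximable in $L^1$ by step functions.

Since $\mu$ and $\nu$ are step functions, we can write $\mu = \sum_{i} a_i \mathbf{1}_{S_i}$ and $\nu = \sum_{j} b_j \mathbf{1}_{T_j}$ for some vectors $a \in [0,1]^s$ and $b \in [0,1]^t$. We define

$$f(a,b) \stackrel{\rm def}{=} \int_{v_1,v_2} \mu(v_1)\, \nu(v_2)\, W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2.$$

When $a \in \{0,1\}^s$ and $b \in \{0,1\}^t$, we have

$$\begin{aligned} |f(a,b)| &= \left| \int \sum_{i} \sum_{j} a_i b_j \mathbf{1}_{S_i}(v_1)\, \mathbf{1}_{T_j}(v_2)\, W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \\ &= \left| \int_{\bigcup_{i : a_i = 1} S_i} \int_{\bigcup_{j : b_j = 1} T_j} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \leq \|W\|_{\square'}, \end{aligned}$$

where the last inequality follows from the definition of the ordered cut-norm. As $f$ is bilinear in $a$ and $b$, and $|f(a,b)| \leq \|W\|_{\square'}$ for any $a \in \{0,1\}^s$ and $b \in \{0,1\}^t$, we have $|f(a,b)| \leq \|W\|_{\square'}$ for any $a \in [0,1]^s$ and $b \in [0,1]^t$. ∎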

###### Lemma 2.6.

Let $W : ([0,1]^2)^2 \to [-1,1]$ be a symmetric measurable function. Then,

$$\frac{\|W\|_{\square'}^2}{4} \leq \|W\|_\square \leq 2\|W\|_{\square'}.$$

###### Proof.

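The inequality $\|W\|_\square \leq 2\|W\|_{\square'}$ follows immediately from the fact that $W$ is symmetric. For the other inequality, let $\xi = \|W\|_{\square'}$, fix $\gamma > 0$, and let $S, T \subseteq [0,1]^2$ be a pair of sets satisfying

$$\left| \int_{S \times T} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \geq \xi - \gamma.$$

We partition $[0,1]^2$ into strips $I_1, \dots, I_{2/\xi}$, such that for every $i \in [2/\xi]$, $I_i = \left[ \frac{(i-1)\xi}{2}, \frac{i\xi}{2} \right) \times [0,1]$. For every $j \in [2/\xi]$, let $I_{(<j)} = \bigcup_{i < j} I_i$ (where $I_{(<1)} = \emptyset$). Then,

$$\begin{aligned} \xi - \gamma &\leq \left| \int_{S \times T} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \\ &\leq \sum_{i \in [2/\xi]} \left| \int_{(S \cap I_i) \times (T \cap I_i)} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| + \sum_{j \in [2/\xi]} \left| \int_{(S \cap I_{(<j)}) \times (T \cap I_j)} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right|. \end{aligned}$$

Note that by the fact that $|W(v_1,v_2)| \leq 1$ for all $v_1, v_2$,

$$\sum_{i \in [2/\xi]} \left| \int_{(S \cap I_i) \times (T \cap I_i)} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \leq \sum_{i \in [2/\xi]} \lambda(I_i \times I_i) \leq \xi/2,$$

and therefore,

$$\sum_{j \in [2/\xi]} \left| \int_{(S \cap I_{(<j)}) \times (T \cap I_j)} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \geq \frac{\xi}{2} - \gamma.$$

On the other hand, the above implies that there exists $j \in [2/\xi]$ such that

$$\left| \int_{(S \cap I_{(<j)}) \times (T \cap I_j)} W(v_1,v_2)\, \mathbf{1}_{v_1 \leq v_2}\, dv_1\, dv_2 \right| \geq \frac{\xi^2}{4} - \frac{\gamma \xi}{2}.$$

Note that for every $v_1 \in I_{(<j)}$ and $v_2 \in I_j$, we have that $v_1 \leq v_2$, and thus

$$\|W\|_\square \geq \left| \int_{(S \cap I_{(<j)}) \times (T \cap I_j)} W(v_1,v_2)\, dv_1\, dv_2 \right| \geq \frac{\xi^2}{4} - \frac{\gamma \xi}{2}.$$

Since the choice of $\gamma$ is arbitrary, the lemma follows. ∎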

### 2.2 The cut and shift distance

The next notion of distance is a central building block in this work. It can be viewed as a locality preserving variant of the unordered cut distance, which accounts for order changes resulting from applying a measure preserving function.

###### Definition 2.7.

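Given two orderons $W, U \in \mathcal{W}$ we define the CS-distance (cut-norm+shift distance) as:

$$d_\triangle(W,U) \stackrel{\rm def}{=} \inf_{f \in \mathcal{F}} \left( \mathrm{Shift}(f) + \|W - U^f\|_\square \right),$$

where $\mathrm{Shift}(f) = \sup_{(x,a) \in [0,1]^2} |x - \pi_1(f(x,a))|$.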

###### Lemma 2.8.

$d_\triangle$ is a pseudo-metric on the space of orderons.

###### Proof.

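First note that non-negativity follows trivially from the definition. In addition, it is easy to see that $d_\triangle(W,W) = 0$ for any orderon $W$. For symmetry,

$$\begin{aligned} d_\triangle(W,U) &= \inf_{g \in \mathcal{F}} \left( \mathrm{Shift}(g) + \|W - U^g\|_\square \right) = \inf_{g \in \mathcal{F}} \left( \mathrm{Shift}(g^{-1}) + \|W - U^g\|_\square \right) \\ &= \inf_{g^{-1} \in \mathcal{F}} \left( \mathrm{Shift}(g^{-1}) + \|W^{g^{-1}} - U\|_\square \right) = \inf_{f \in \mathcal{F}} \left( \mathrm{Shift}(f) + \|U - W^f\|_\square \right) \\ &= d_\triangle(U,W), \end{aligned}$$

where we used the fact that $g$ is a measure preserving bijection and that $\mathrm{Shift}(g) = \mathrm{Shift}(g^{-1})$ for any $g \in \mathcal{F}$.

Consider three orderons $W, U, Z \in \mathcal{W}$. We now show that $d_\triangle(W,U) \leq d_\triangle(W,Z) + d_\triangle(Z,U)$.

$$\begin{aligned} d_\triangle(W,U) &= \inf_{f,g \in \mathcal{F}} \left( \mathrm{Shift}(g^{-1} \circ f) + \|W - U^{g^{-1} \circ f}\|_\square \right) \\ &\leq \inf_{f,g \in \mathcal{F}} \left( \mathrm{Shift}(f) + \mathrm{Shift}(g^{-1}) + \|W^g - U^f\|_\square \right) \\ &\leq \inf_{f \in \mathcal{F}} \left( \mathrm{Shift}(f) + \|Z - U^f\|_\square \right) + \inf_{g \in \mathcal{F}} \left( \mathrm{Shift}(g) + \|W^g - Z\|_\square \right) \\ &= d_\triangle(W,Z) + d_\triangle(Z,U), \end{aligned}$$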

where the first equality holds since is a measure preserving bijection, and the last inequality follows from the triangle inequality; note that for any . ∎
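The two properties of the shift used in this proof, invariance under inversion and subadditivity under composition, can be checked on a discrete model in which a bijection permutes $n$ equal-width cells by translation. This is only a sanity check under that modeling assumption; the helper names are hypothetical.

```python
def shift(perm):
    """Shift of the interval exchange that sends cell c to cell perm[c]."""
    n = len(perm)
    return max(abs(perm[c] - c) for c in range(n)) / n

def inverse(perm):
    """Inverse permutation."""
    inv = [0] * len(perm)
    for c, p in enumerate(perm):
        inv[p] = c
    return inv

def compose(g, f):
    """Composition (g after f): cell c goes to g[f[c]]."""
    return [g[f[c]] for c in range(len(f))]

if __name__ == "__main__":
    f = [1, 0, 3, 2]
    g = [2, 0, 1, 3]
    print(shift(g), shift(inverse(g)))
    print(shift(compose(g, f)), shift(f) + shift(g))
```

Inversion only swaps the roles of source and target cells, so the maximum displacement is unchanged; subadditivity is the triangle inequality applied to the displacement of each point.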

## 3 Block orderons and their density in $\mathcal{W}$

In this section we show that weighted ordered graphs are dense in the space of orderons coupled with the cut-shift distance. To start, we have to define the orderon representation of a weighted ordered graph, called a naive block orderon. A naive $n$-block orderon is defined as follows.

###### Definition 3.1 (Naive block orderon).

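Let $n$ be a positive integer. For $z \in (0,1]$, we denote $Q_n(z) = \lceil nz \rceil$; we also set $Q_n(0) = 1$. An $n$-block naive orderon is a function $W : ([0,1]^2)^2 \to [0,1]$ that can be written as

$$W((x,a),(y,b)) = G(Q_n(x),Q_n(y)), \quad \forall x,a,y,b \in [0,1],$$

for some weighted ordered graph $G$ on $n$ vertices.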

Following the above definition, we denote by $W_G$ the naive block orderon defined using $G$, and view $W_G$ as the orderon “representing” $G$ in $\mathcal{W}$. Similarly to the unordered setting, this representation is slightly ambiguous (but this will not affect us). Indeed, it is not hard to verify that two weighted ordered graphs $F$ and $G$ satisfy $W_F = W_G$ if and only if both $F$ and $G$ are blowups of some weighted ordered graph $H$. Here, a weighted ordered graph $G$ on $nt$ vertices is a $t$-blowup of a weighted ordered graph $H$ on $n$ vertices if $G(x,y) = H(\lceil x/t \rceil, \lceil y/t \rceil)$ for any $x, y \in [nt]$.
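The ambiguity of the representation under blowups can be checked mechanically. The following sketch, with hypothetical helper names and 0-indexed matrices, builds a $t$-blowup and compares the naive block orderon values of a graph and its blowup on a grid of rational points; exact `Fraction` arithmetic is used so that ceilings at block boundaries are not affected by floating-point rounding.

```python
from fractions import Fraction
import math

def blowup(H, t):
    """t-blowup of a weighted ordered graph H.

    With 0-indexed storage, G[x][y] = H[x // t][y // t] is the same
    rule as G(x, y) = H(ceil(x/t), ceil(y/t)) for 1-indexed vertices.
    """
    n = len(H)
    return [[H[x // t][y // t] for y in range(n * t)] for x in range(n * t)]

def Q(n, z):
    """Index of the block containing z, with the convention Q(n, 0) = 1."""
    return 1 if z == 0 else math.ceil(n * z)

def block_value(G, x, y):
    """Value of the naive block orderon of G at first coordinates x and y.

    The second coordinates are ignored by a naive block orderon, so
    they are omitted from this helper.
    """
    n = len(G)
    return G[Q(n, x) - 1][Q(n, y) - 1]

if __name__ == "__main__":
    H = [[0, 0.5], [0.5, 0]]
    G = blowup(H, 3)
    grid = [Fraction(i, 12) for i in range(13)]
    print(all(block_value(G, x, y) == block_value(H, x, y)
              for x in grid for y in grid))
```

The agreement follows from the identity $\lceil \lceil ntz \rceil / t \rceil = \lceil nz \rceil$ for any positive integer $t$.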

We call an orderon $W \in \mathcal{W}$ a step function with at most $k$ steps if there is a partition $\mathcal{P} = \{S_1, \dots, S_k\}$ of $[0,1]^2$ such that $W$ is constant on every $S_i \times S_j$.

###### Remark (The name choices).

The definition of a step function in the space of orderons is the natural extension of a step function in graphons. Note that a naive block orderon is a special case of a step function, where the steps are rectangular (this is why we call these “block orderons”). The “naive” prefix refers to the fact that we do not make use of the second coordinate in the partition.

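For every $W \in \mathcal{W}$ and every partition $\mathcal{P} = \{S_1, \dots, S_k\}$ of $[0,1]^2$ into measurable sets, let $W_{\mathcal{P}}$ denote the step function obtained from $W$ by replacing its value at $((x,a),(y,b)) \in S_i \times S_j$ by the average of $W$ on $S_i \times S_j$. That is,

$$W_{\mathcal{P}}((x,a),(y,b)) = \frac{1}{\lambda(S_i)\lambda(S_j)} \int_{S_i \times S_j} W((x',a'),(y',b'))\, dx'\, da'\, dy'\, db',$$

where $i$ and $j$ are the unique indices such that $(x,a) \in S_i$ and $(y,b) \in S_j$, respectively.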
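On a discretized orderon, the stepping operator is plain block averaging. The sketch below assumes that the unit square has been cut into $m$ cells of equal measure, that the function is stored as an $m \times m$ matrix over these cells, and that a partition is given as a list of part labels, one per cell; these encoding choices and the name `stepping` are illustrative assumptions.

```python
def stepping(W, labels):
    """Replace every entry of W by the average of W over its block.

    W is an m-by-m matrix over m equal-measure cells, and labels[c]
    is the part of the partition that contains cell c.
    """
    m = len(W)
    parts = sorted(set(labels))
    cells = {p: [c for c in range(m) if labels[c] == p] for p in parts}
    average = {}
    for p in parts:
        for q in parts:
            total = sum(W[c][d] for c in cells[p] for d in cells[q])
            average[p, q] = total / (len(cells[p]) * len(cells[q]))
    return [[average[labels[c], labels[d]] for d in range(m)]
            for c in range(m)]

if __name__ == "__main__":
    W = [[0, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 0]]
    for row in stepping(W, [0, 0, 1, 1]):
        print(row)
```

In this example the two diagonal blocks average to $0.5$ and the off-diagonal blocks to $0$. Applying `stepping` twice with the same labels changes nothing, reflecting that a step function on the partition equals its own stepping.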

The next lemma is an extension of the regularity lemma to the setting of Hilbert spaces.

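###### Lemma 3.2 ([LS07], Lemma 4.1).

Let $\mathcal{K}_1, \mathcal{K}_2, \dots$ be arbitrary non-empty subsets of a Hilbert space $\mathcal{H}$. Then, for every $\varepsilon > 0$ and $f \in \mathcal{H}$ there is a $k \leq \lceil 1/\varepsilon^2 \rceil$ and there are $f_i \in \mathcal{K}_i$ ($1 \leq i \leq k$) and $\gamma_1, \dots, \gamma_k \in \mathbb{R}$ such that for every $g \in \mathcal{K}_{k+1}$

$$|\langle g, f - (\gamma_1 f_1 + \cdots + \gamma_k f_k) \rangle| \leq \varepsilon \|f\| \|g\|.$$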

The next lemma is a direct consequence of Lemma 3.2.

###### Lemma 3.3.

For every and there is a step function with at most steps such that

 ∥W−U∥□≤ε.
###### Proof.

We apply Lemma 3.2 to the case where the Hilbert space is , and each is the set of indicator functions of product sets , where is a measurable subset. Then for , there is an , which is a step function with at most steps. Therefore, we get a step function with at most steps such that for every measurable set

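$$\left| \int_{(v_1,v_2) \in S \times S} \left( W(v_1,v_2) - U(v_1,v_2) \right) dv_1\, dv_2 \right| \leq \varepsilon.$$

By the above and the fact that

$$\begin{aligned} \left| \int_{(v_1,v_2) \in (S \cup T) \times (S \cup T)} \left( W(v_1,v_2) - U(v_1,v_2) \right) dv_1\, dv_2 \right| = \Bigg| &\int_{(v_1,v_2) \in S \times S} \left( W(v_1,v_2) - U(v_1,v_2) \right) dv_1\, dv_2 \\ &+ 2 \cdot \int_{(v_1,v_2) \in S \times T} \left( W(v_1,v_2) - U(v_1,v_2) \right) dv_1\, dv_2 \\ &+ \int_{(v_1,v_2) \in T \times T} \left( W(v_1,v_2) - U(v_1,v_2) \right) dv_1\, dv_2 \Bigg| \leq \varepsilon, \end{aligned}$$

we get that for any two measurable sets $S, T \subseteq [0,1]^2$,

$$\left| \int_{(v_1,v_2) \in S \times T} \left( W(v_1,v_2) - U(v_1,v_2) \right) dv_1\, dv_2 \right| \leq 2\varepsilon,$$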

which implies the lemma. ∎

Similarly to the graphon case, the step function $U$ might not be a stepping of $W$. However, it can be shown that these steppings are almost optimal.

###### Claim 3.4.

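Let $W \in \mathcal{W}$, let $U$ be a step function, and let $\mathcal{P}$ denote the partition of $[0,1]^2$ into the steps of $U$. Then $\|W - W_{\mathcal{P}}\|_\square \leq 2\|W - U\|_\square$.

###### Proof.

The proof follows from the fact that $U = U_{\mathcal{P}}$ and the fact that the stepping operator is contractive with respect to the cut norm. More explicitly,

$$\|W - W_{\mathcal{P}}\|_\square \leq \|W - U\|_\square + \|U - W_{\mathcal{P}}\|_\square = \|W - U\|_\square + \|U_{\mathcal{P}} - W_{\mathcal{P}}\|_\square \leq 2\|W - U\|_\square.$$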
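The inequality of Claim 3.4 can be sanity-checked numerically on a small discretized example. The sketch below makes several assumptions of its own: functions are stored as matrices over equal-measure cells, the cut norm is computed by brute force over unions of cells, and the step function `U` uses arbitrary block values chosen for illustration. It is a numerical check, not a proof.

```python
import random
from itertools import product

def cut_norm_step(D):
    """Brute-force cut norm of a matrix over n equal cells (exponential in n)."""
    n = len(D)
    best = 0.0
    for S in product((0, 1), repeat=n):
        col_sums = [sum(D[i][j] for i in range(n) if S[i]) for j in range(n)]
        best = max(best,
                   sum(c for c in col_sums if c > 0),
                   -sum(c for c in col_sums if c < 0))
    return best / n ** 2

def stepping(W, labels):
    """Block averages of W over the partition encoded by labels."""
    m = len(W)
    cells = {}
    for c, p in enumerate(labels):
        cells.setdefault(p, []).append(c)
    avg = {(p, q): sum(W[c][d] for c in cells[p] for d in cells[q])
                   / (len(cells[p]) * len(cells[q]))
           for p in cells for q in cells}
    return [[avg[labels[c], labels[d]] for d in range(m)] for c in range(m)]

def subtract(A, B):
    """Entrywise difference A - B."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def check_claim(seed=0):
    """Return both sides of the inequality for a random symmetric W on 6 cells."""
    rng = random.Random(seed)
    m, labels = 6, [0, 0, 1, 1, 2, 2]
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            W[i][j] = W[j][i] = rng.random()
    # An arbitrary step function on the same partition (illustrative values).
    block = [[0.2, 0.7, 0.4], [0.7, 0.5, 0.1], [0.4, 0.1, 0.9]]
    U = [[block[labels[c]][labels[d]] for d in range(m)] for c in range(m)]
    lhs = cut_norm_step(subtract(W, stepping(W, labels)))
    rhs = 2 * cut_norm_step(subtract(W, U))
    return lhs, rhs

if __name__ == "__main__":
    print(check_claim())
```

The first returned value should never exceed the second, whatever seed or block values are used, since the argument in the proof carries over to this discrete setting.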

Using Lemma 3.3 and Claim 3.4 we can obtain the following lemma.

###### Lemma 3.5.

For every function and every , there is a partition of into at most sets with positive measure such that .

Using the above lemma, we can impose stronger requirements on our partition. In particular, we can show that there exists a partition of $[0,1]^2$ into sets of the same measure. Such a partition is referred to as an equipartition. Also, we say that a partition $\mathcal{P}$ refines $\mathcal{P}'$ if $\mathcal{P}$ can be obtained from $\mathcal{P}'$ by splitting each $P_j \in \mathcal{P}'$ into a finite number of sets (up to sets of measure $0$).

###### Lemma 3.6.

Fix some . Let be an equipartition of into sets, and fix such that divides . Then, for any , there exists an equipartition that refines with sets, such that .

###### Proof.

Let be a partition of into sets such that , and let be a common refinement of and , with . We construct an equipartition as follows. For every , consider all the sets consisting of . For each we let and partition into sets , each of measure , plus an exceptional part which is the residual set. That is,

Next, for every let and repartition each to sets of measure to get an equipartition of size . Let be a step function that agrees with on and on the complement. Since disagrees with on a set of measure at most , we have that

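$$\|W - U\|_\square \leq \|W - W_{\tilde{\mathcal{Q}}}\|_\square + \|W_{\tilde{\mathcal{Q}}} - U\|_\square \leq \frac{4\varepsilon}{9} + \frac{2k \cdot 2^{162/\varepsilon^2}}{q}.$$

By our choice of $q$ we get that

$$\|W - U\|_\square \leq \frac{4\varepsilon}{9} + \frac{1}{k}.$$

By construction $U$ is a step function with steps in $\mathcal{Q}$, and using Claim 3.4 we get that

$$\|W - W_{\mathcal{Q}}\|_\square \leq 2\|W - U\|_\square \leq \frac{8\varepsilon}{9} + \frac{2}{k},$$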

and the proof is complete. ∎

The next lemma is an (easier) variant of Lemma 3.6, in the sense that we refine two given partitions. However, the resulting partition will not be an equipartition.

###### Lemma 3.7.

Fix some and . Let be an equipartition of into sets, be a partition of into sets, and fix such that both and divide . Then, for any , there exists a partition that refines both and with sets, such that .

###### Proof.

Let be a partition of into sets such that , and let be a common refinement of the three partitions ,