Non-cooperatively assembling large structures: a 2D pumping lemma cannot be as powerful as its 1D counterpart.


Pierre-Étienne Meunier
Inria
pierre-etienne.meunier@inria.fr
   Damien Regnault
IBISC, Univ Évry, Université Paris-Saclay,
91025, Evry, France.
damien.regnault@univ-evry.fr
Abstract

We show the first asymptotically efficient constructions in the so-called “noncooperative planar tile assembly” model.

Algorithmic self-assembly is the study of the local, distributed, asynchronous algorithms run by molecules to self-organise, in particular during crystal growth. The general cooperative model, also called “temperature 2”, uses synchronisation to simulate Turing machines, build shapes using the smallest possible number of tile types, and perform other algorithmic tasks. However, in the non-cooperative (“temperature 1”) model, the growth process is entirely asynchronous, and mostly relies on geometry. Even though the model looks like a generalisation of finite automata to two dimensions, its 3D generalisation is capable of performing arbitrary (Turing) computation [SODA 2011], and of universal simulations [SODA 2014], whereby a single 3D non-cooperative tileset can simulate the dynamics of all possible 3D non-cooperative systems, up to a constant scaling factor.

However, we showed in [STOC 2017] that the original 2D non-cooperative model is not capable of universal simulations, and the question of its computational power is still widely open. Here, we show an unexpected result, namely that this model can reliably grow assemblies of size with only tile types, which is the first asymptotically efficient positive construction.

1 Introduction

Our ability to understand and control matter at the scale of molecules conjures a future where we can engineer our own materials, interact with biological networks to cure their malfunctions, and build molecular computers and nanoscale factories. The field of molecular computing and molecular self-assembly studies the algorithms run by molecules to exchange information, to self-organise in space, to grow and replicate. Our goal is to build a theory of how they compute, and of how we can program them.

One of the most successful models of algorithmic self-assembly is the abstract tile assembly model, imagined by Winfree [23]. In that model, we start with a single seed assembly and a finite number of tile types (with an infinite supply of each type), and attach tiles, one at a time, asynchronously and nondeterministically, to the assembly, based on a condition on their borders’ colours.

This model has served to bootstrap the field of molecular computing, which has since produced an impressive number of experimental realisations, from DNA motors [25] to arbitrary hard-coded shapes at the nanoscale [19], and cargo-sorting robots [22].

On the theoretical side, the abstract tile assembly model has been used to explore different features of self-organisation in space, especially in an asynchronous fashion. In many variants of the model, tile assembly is capable of simulating Turing machines [20, 21, 1, 3, 2, 18, 16, 15, 10, 6, 8, 23, 24]. More surprisingly, in its original form, the model is intrinsically universal [4], meaning that there is a single “universal” tileset capable of simulating the behaviour of any other tileset (that behaviour is encoded only in the seed of the universal tileset).

In the usual form of the model, a part of the assembly can “wait” for another to grow long enough to cooperate. In the non-cooperative model, however, any tile can attach to any location, as long as at least one side matches the colour of that location. Therefore, “synchronising” different parts of the assembly is impossible, and the main question becomes: what kind of computation can we do in a completely asynchronous way? The answer seems to depend crucially on the space in which the assemblies grow: in one dimension, non-cooperative tile assembly is equivalent to finite automata (in fact, deterministic tile assembly systems map directly to deterministic finite automata), and is therefore not too powerful. In three dimensions though, this model is capable of simulating Turing machines [2], and even of simulating itself intrinsically [12]. If instead of square tiles, we use tiles that do not tile the plane, the situation becomes even more puzzling: tiles whose shapes are regular polygons can perform arbitrary computation, but only if they have at least seven sides [9]. In a similar way, polyomino tiles can also simulate Turing machines, provided that at least one of their dimensions is at least two [7].

However, in two dimensions with regular square tiles, the capabilities of this model remain largely mysterious. All we know is that it cannot simulate the general (cooperative) model up to rescaling [12], and cannot simulate itself either [14], but we know very little about its actual computational power. A number of related questions and simpler models have been studied to try and approach this model: a probabilistic assembly schedule [2], negative glues [17], no mismatches [5, 11], and different tile shapes [9, 7].

Due to the proximity with finite automata, a first intuition is that we can try to “pump” parts of an assembly between two tiles of equal type, resulting in infinite, periodic paths, as shown in Figure 1.1. However, this is not always possible, as shown in Figure 1.2, where an attempt to pump would result in a glue mismatch, which would block the growth.

Figure 1.1: Example of “pumping” a path. The seed is the white tile, and the two tiles highlighted in red are of the same type. We can try to repeat the part between them infinitely many times.

Figure 1.2: An example path which, unlike the one in Figure 1.1, cannot be repeated even a single time completely. The seed is the white tile, and the two tiles highlighted in red are of the same type.
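The pumping attempt described above can be sketched in a few lines of Python. This is only a geometric illustration under our own assumptions: a path is a list of `(position, tile_type_name)` pairs, the names `translate`, `pump` and `is_simple` are hypothetical helpers, and the two example paths are made up (they are not the paths of Figures 1.1 and 1.2). The check only detects the pumped path running into itself; it does not model glues.

```python
def translate(path, v):
    """Translate every tile of `path` by the vector `v`, keeping tile types."""
    dx, dy = v
    return [((x + dx, y + dy), t) for (x, y), t in path]


def pump(path, i, j, times):
    """Return path[0..j] followed by `times` translated copies of path[i+1..j].

    Tiles i and j must have the same type; copy number k is translated by
    k * (pos_j - pos_i).
    """
    (xi, yi), ti = path[i]
    (xj, yj), tj = path[j]
    if ti != tj:
        raise ValueError("tiles i and j must have the same tile type")
    v = (xj - xi, yj - yi)
    segment = path[i + 1 : j + 1]
    pumped = list(path[: j + 1])
    for k in range(1, times + 1):
        pumped += translate(segment, (k * v[0], k * v[1]))
    return pumped


def is_simple(path):
    """True if no position is occupied twice (the sequence is self-avoiding)."""
    positions = [pos for pos, _ in path]
    return len(positions) == len(set(positions))


# A straight path: pumping the segment between the two "a" tiles works.
straight = [((0, 0), "s"), ((1, 0), "a"), ((2, 0), "b"), ((3, 0), "a")]
print(is_simple(pump(straight, 1, 3, 2)))  # True

# A path that curls back: the first repetition already hits an earlier tile.
curled = [((0, 0), "s"), ((1, 0), "a"), ((2, 0), "b"), ((2, 1), "c"), ((1, 1), "a")]
print(is_simple(pump(curled, 1, 4, 1)))  # False
```

In the second example the translated copy of the `b` tile lands on the position already occupied by `c`, which is the kind of obstruction that prevents pumping.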

Before this paper, a single positive construction was known, in which for all , a tileset could build multiple assemblies, all of Manhattan diameter (this means in particular that cannot build any infinite assembly). Even though that result was the first example of an algorithmic construction, the term “algorithm” in that case is to be taken in an extremely weak sense of a program whose running time is larger than its size. Indeed, the resulting assemblies were only a constant factor bigger than the program size, like in a program where we call the same function twice.

Here, we show a way to build an assembly of width with only different tile types, using the two dimensions to build a “controlled loop”:

Theorem 1.1.

For all , there is a tile assembly system , where , , and all assemblies are of width and height less than , and contain the same path of width .

However, there are strong reasons to believe that 2D noncooperative tile assembly is not capable of performing Turing computation, since it is in particular not capable of simulating Turing machines inside a rectangle [14], which is the only known form of Turing computation in tile assembly.

This result is therefore not meant as a first step towards “full-featured algorithms”, but will be useful as a benchmark against which strategies to characterise the computational power of this model can be evaluated.

2 Definitions and preliminaries

These definitions are for a large part taken from [14].

2.1 Abstract tile assembly model

The abstract tile assembly model was introduced by Winfree [23]. In this paper we study a restriction of this model called the temperature 1 abstract tile assembly model, or noncooperative abstract tile assembly model. For a more detailed definition of the full model, as well as intuitive explanations, see for example [20, 18].

A tile type is a unit square with four sides, each consisting of a glue type and a nonnegative integer strength. Let $T$ be a finite set of tile types. The sides of a tile type are respectively called north, east, south, and west, as shown in the following picture:

[Picture: a square tile with its four sides labelled North (top), East (right), South (bottom) and West (left).]

An assembly is a partial function $\alpha : \mathbb{Z}^2 \rightharpoonup T$ where $T$ is a set of tile types and the domain of $\alpha$ (denoted $\mathrm{dom}(\alpha)$) is connected. We let $\mathcal{A}^T$ denote the set of all assemblies over the set of tile types $T$. In this paper, two tile types in an assembly are said to bind (or interact, or be stably attached) if the glue types on their abutting sides are equal and have strength $\geq 1$. An assembly $\alpha$ induces a weighted binding graph $G_\alpha = (V, E)$, where $V = \mathrm{dom}(\alpha)$, and there is an edge $\{a, b\} \in E$ if and only if the tiles at positions $a$ and $b$ interact, and this edge is weighted by the glue strength of that interaction. The assembly is said to be $\tau$-stable if every cut of $G_\alpha$ has weight at least $\tau$. A tile assembly system is a triple $\mathcal{T} = (T, \sigma, \tau)$, where $T$ is a finite set of tile types, $\sigma$ is a $\tau$-stable assembly called the seed, and $\tau \in \mathbb{N}$ is the temperature.
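As an illustration of the binding graph, here is a minimal Python sketch, assuming all glues have strength 1 (the setting used in the rest of the paper). Under that assumption every cut of the binding graph has weight at least 1 exactly when the graph is connected, so stability at temperature 1 reduces to a connectivity check. The data layout (an assembly as a dictionary from positions to per-side glue labels) and the function names are our own choices for the example.

```python
from collections import deque

# Unit offset of each side, and the side that the neighbour presents back.
OFFSET = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
OPPOSITE = {"N": "S", "E": "W", "S": "N", "W": "E"}


def interacting_neighbours(assembly, pos):
    """Adjacent positions whose abutting glues are equal and non-null."""
    x, y = pos
    result = []
    for side, (dx, dy) in OFFSET.items():
        q = (x + dx, y + dy)
        if q in assembly:
            glue = assembly[pos].get(side)
            if glue is not None and glue == assembly[q].get(OPPOSITE[side]):
                result.append(q)
    return result


def is_1_stable(assembly):
    """With unit strengths: every cut has weight >= 1 iff the graph is connected."""
    if not assembly:
        return True  # convention chosen for this sketch only
    start = next(iter(assembly))
    seen = {start}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for q in interacting_neighbours(assembly, p):
            if q not in seen:
                seen.add(q)
                queue.append(q)
    return len(seen) == len(assembly)


bound = {(0, 0): {"E": "a"}, (1, 0): {"W": "a"}}
mismatched = {(0, 0): {"E": "a"}, (1, 0): {"W": "b"}}
print(is_1_stable(bound), is_1_stable(mismatched))  # True False
```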

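Given two $\tau$-stable assemblies $\alpha$ and $\beta$, we say that $\alpha$ is a subassembly of $\beta$, and write $\alpha \sqsubseteq \beta$, if $\mathrm{dom}(\alpha) \subseteq \mathrm{dom}(\beta)$ and for all $p \in \mathrm{dom}(\alpha)$, $\alpha(p) = \beta(p)$. We also write $\alpha \rightarrow_1^{\mathcal{T}} \beta$ if we can obtain $\beta$ from $\alpha$ by the binding of a single tile type, that is: $\alpha \sqsubseteq \beta$, $|\mathrm{dom}(\beta) \setminus \mathrm{dom}(\alpha)| = 1$ and the tile type at the position $\mathrm{dom}(\beta) \setminus \mathrm{dom}(\alpha)$ stably binds to $\alpha$ at that position. We say that $\gamma$ is producible from $\alpha$, and write $\alpha \rightarrow^{\mathcal{T}} \gamma$ if there is a (possibly empty) sequence $\alpha_1, \alpha_2, \ldots, \alpha_n$ where $n \in \mathbb{N} \cup \{\infty\}$, $\alpha = \alpha_1$ and $\alpha_n = \gamma$, such that $\alpha_1 \rightarrow_1^{\mathcal{T}} \alpha_2 \rightarrow_1^{\mathcal{T}} \ldots \rightarrow_1^{\mathcal{T}} \alpha_n$. A sequence of $n \in \mathbb{Z}^+ \cup \{\infty\}$ assemblies $\alpha_0, \alpha_1, \ldots$ over $\mathcal{A}^T$ is a $\mathcal{T}$-assembly sequence if, for all $1 \leq i < n$, $\alpha_{i-1} \rightarrow_1^{\mathcal{T}} \alpha_i$.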

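The set of productions, or producible assemblies, of a tile assembly system $\mathcal{T} = (T, \sigma, \tau)$ is the set of all assemblies producible from the seed assembly $\sigma$ and is written $\mathcal{A}[\mathcal{T}]$. An assembly $\alpha$ is called terminal if there is no $\beta$ such that $\alpha \rightarrow_1^{\mathcal{T}} \beta$. The set of all terminal assemblies of $\mathcal{T}$ is denoted $\mathcal{A}_{\Box}[\mathcal{T}]$.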

In this paper, we consider that $\tau = 1$. Thus, we make the simplifying assumption that all glue types have strength 0 or 1: it is not difficult to see that this assumption does not change the behavior of the model (if a glue type $g$ has strength $s_g \geq 1$, then in the $\tau = 1$ model a tile with glue type $g$ binds to a matching glue type on an assembly border irrespective of the exact value of $s_g$). Consider an assembly $\alpha$ which is producible by a tile assembly system at temperature $1$. Since only one glue of strength $1$ is needed to stably bind a tile type to an assembly, any path of the binding graph of $\alpha$ can grow if it is bound to the seed. Thus, at temperature $1$, it is more pertinent to consider paths instead of assemblies. We now introduce definitions which are useful for studying temperature $1$.
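To make the temperature 1 attachment rule concrete, the following Python sketch grows an assembly one tile at a time until no tile can attach. It is a toy simulator under our own assumptions: glues are plain labels of strength 1, a tileset maps hypothetical tile names to their per-side glues, and the scheduler always picks the smallest available attachment, so it follows only one of the possibly many assembly sequences.

```python
OFFSET = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
OPPOSITE = {"N": "S", "E": "W", "S": "N", "W": "E"}


def attachable(tileset, assembly):
    """All (position, tile name) pairs that can bind by one matching glue."""
    moves = set()
    for (x, y), name in assembly.items():
        for side, (dx, dy) in OFFSET.items():
            glue = tileset[name].get(side)
            q = (x + dx, y + dy)
            if glue is None or q in assembly:
                continue
            for other, glues in tileset.items():
                # One matching glue is enough; other sides are not inspected.
                if glues.get(OPPOSITE[side]) == glue:
                    moves.add((q, other))
    return moves


def grow(tileset, seed, max_steps=1000):
    """Attach tiles until terminal or until `max_steps` attachments are made.

    Returns (assembly, is_terminal).
    """
    assembly = dict(seed)
    for _ in range(max_steps):
        moves = sorted(attachable(tileset, assembly))
        if not moves:
            return assembly, True
        pos, name = moves[0]
        assembly[pos] = name
    return assembly, False


# Hypothetical four-tile system: a short path going east, then north.
tileset = {
    "s": {"E": "g0"},
    "a": {"W": "g0", "E": "g1"},
    "b": {"W": "g1", "N": "g2"},
    "c": {"S": "g2"},
}
final, terminal = grow(tileset, {(0, 0): "s"})
print(sorted(final.items()), terminal)
```

The `max_steps` bound is there because a tileset with a pumpable path would otherwise grow forever.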

2.2 Paths and non-cooperative self-assembly

Let $T$ be a set of tile types. A tile is a pair $((x, y), t)$ where $(x, y) \in \mathbb{Z}^2$ is a position and $t \in T$ is a tile type. Intuitively, a path is a finite or one-way-infinite simple (non-self-intersecting) sequence of tiles placed on points of $\mathbb{Z}^2$ so that each tile in the sequence interacts with the previous one, or more precisely:

Definition 2.1 (Path).

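A path is a (finite or infinite) sequence $P = P_0 P_1 P_2 \ldots$ of tiles $P_i = ((x_i, y_i), t_i) \in \mathbb{Z}^2 \times T$, such that:

  • for all $P_j$ and $P_{j+1}$ defined on $P$ it is the case that $t_j$ and $t_{j+1}$ interact, and

  • for all $P_j, P_k$ such that $j \neq k$ it is the case that $(x_j, y_j) \neq (x_k, y_k)$.

Whenever $P$ is finite, i.e. $P = P_0 P_1 P_2 \ldots P_{n-1}$ for some $n \in \mathbb{N}$, $n$ is termed the length of $P$. Note that by definition, paths are simple (or self-avoiding).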
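The two conditions of Definition 2.1 translate directly into a small checker. The sketch below is illustrative only: it assumes unit-strength glues, represents a finite path as a list of `(position, tile_name)` pairs, and uses a hypothetical tileset; `is_path` is our own name.

```python
OFFSET = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
OPPOSITE = {"N": "S", "E": "W", "S": "N", "W": "E"}


def is_path(tileset, path):
    """Check both conditions of the path definition on a finite sequence."""
    positions = [pos for pos, _ in path]
    # Second condition: no position is used twice (the path is simple).
    if len(set(positions)) != len(positions):
        return False
    # First condition: consecutive tiles are adjacent and their glues match.
    for (p, s), (q, t) in zip(path, path[1:]):
        step = (q[0] - p[0], q[1] - p[1])
        side = next((d for d, off in OFFSET.items() if off == step), None)
        if side is None:
            return False  # not adjacent on the grid
        glue = tileset[s].get(side)
        if glue is None or glue != tileset[t].get(OPPOSITE[side]):
            return False
    return True


# Hypothetical tileset, for illustration only.
tileset = {
    "s": {"E": "g0"},
    "a": {"W": "g0", "E": "g1"},
    "b": {"W": "g1", "N": "g2"},
    "c": {"S": "g2"},
}
good = [((0, 0), "s"), ((1, 0), "a"), ((2, 0), "b"), ((2, 1), "c")]
bad = [((0, 0), "s"), ((1, 0), "b")]  # glue g0 does not match glue g1
print(is_path(tileset, good), is_path(tileset, bad))  # True False
```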

Although a path is not an assembly, we know that each adjacent pair of tiles in the path sequence interact, implying that the set of path positions forms a connected set in and hence every path uniquely represents an assembly containing exactly the tiles of the path. More formally, since an assembly is a function of , it can also be interpreted as a subset of tiles; then, for a path , we define the assembly , which we observe is an assembly (or, vice versa, can be interpreted as a partial function from to tile types that is defined on a connected set), and we call a path assembly. A path is said to be producible by some tile assembly system if the assembly is producible, and we call such a a producible path. We define

to be the set of producible paths of . (Intuitively, although producible paths are not assemblies, any producible path has the nice property that it encodes an unambiguous description of how to grow from the seed , in () path order, to produce the assembly .)

For any path and integer , we write , or , for the position of and for the tile type of . Hence if then and .

Note that the domain of a producible assembly is a connected set in , and that in an assembly sequence of some tile assembly system each tile binding event adds a single node to the binding graph of to give a new binding graph , and adds at least one weight-1 edge joining to the subgraph . Hence, for any tile in a producible assembly , there is an edge-path (sequence of edges) in the binding graph of from to . From there, the following important fact about temperature 1 tile assembly is straightforward to see.

Observation 2.2.

Let be a tile assembly system and let . For any tile there is a producible path that for some contains .
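The argument behind Observation 2.2 is constructive, and can be sketched as a breadth-first search in the binding graph. The code below is an illustration under our own assumptions (unit-strength glues, a single-tile seed, a hypothetical tileset, and the function name `path_to`); it returns one sequence of interacting tiles from the seed to a chosen tile, which is simple because a breadth-first search never revisits a position.

```python
from collections import deque

OFFSET = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
OPPOSITE = {"N": "S", "E": "W", "S": "N", "W": "E"}


def path_to(tileset, assembly, seed_pos, target):
    """Return a list of (position, tile name) from the seed to `target`.

    Returns None if `target` is not connected to the seed in the binding graph.
    """
    parent = {seed_pos: None}
    queue = deque([seed_pos])
    while queue:
        p = queue.popleft()
        for side, (dx, dy) in OFFSET.items():
            q = (p[0] + dx, p[1] + dy)
            if q in assembly and q not in parent:
                glue = tileset[assembly[p]].get(side)
                if glue is not None and glue == tileset[assembly[q]].get(OPPOSITE[side]):
                    parent[q] = p
                    queue.append(q)
    if target not in parent:
        return None
    path = []
    p = target
    while p is not None:
        path.append((p, assembly[p]))
        p = parent[p]
    return path[::-1]


# Hypothetical assembly with a side branch ("d") that the path to "c" avoids.
tileset = {
    "s": {"E": "g0"},
    "a": {"W": "g0", "E": "g1", "N": "g3"},
    "b": {"W": "g1", "N": "g2"},
    "c": {"S": "g2"},
    "d": {"S": "g3"},
}
assembly = {(0, 0): "s", (1, 0): "a", (2, 0): "b", (2, 1): "c", (1, 1): "d"}
print(path_to(tileset, assembly, (0, 0), (2, 1)))
```

Note that `d` and `c` are adjacent on the grid but do not interact, so the search does not step from one to the other.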

When referring to the relative placements of positions in the grid graph of , we say that a position is east (respectively, west, north, south) of another position if (respectively , , ).

If two paths, or two assemblies, or a path and an assembly, share a common position we say they intersect at that position. Furthermore, we say that two paths, or two assemblies, or a path and an assembly, agree on a position if they both place the same tile type at that position and conflict if they place a different tile type at that position. We sometimes say that a path is blockable to mean that there is another path (producible by the same tile assembly system as produced ) that conflicts with .

The translation of a tile by a vector of is (the type of the tile is not modified, while its position is translated by ). The translation of a path by , written , is the path where and for all indices of , . As a convenient notation, for a path composed of subpaths and , when we write we mean (i.e. the translation of all of by ).

The width of an assembly is the number of columns on which has at least one tile, and the height of is the number of rows on which has at least one tile.
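As a small illustration of these definitions, the following Python sketch computes the width and height of a set of tiles and checks on an example that both are unchanged by translation. The representation (a list of `(position, tile_name)` pairs) and the function names are assumptions made for the example.

```python
def translate(path, v):
    """Translate each tile by `v`; tile types are left unchanged."""
    dx, dy = v
    return [((x + dx, y + dy), t) for (x, y), t in path]


def width(path):
    """Number of columns containing at least one tile."""
    return len({x for (x, _), _ in path})


def height(path):
    """Number of rows containing at least one tile."""
    return len({y for (_, y), _ in path})


path = [((0, 0), "s"), ((1, 0), "a"), ((2, 0), "b"), ((2, 1), "c")]
print(width(path), height(path))  # 3 2
moved = translate(path, (5, -3))
print(width(moved) == width(path), height(moved) == height(path))  # True True
```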

3 The tile assembly system.

3.1 Definition of the tile assembly system.

Our construction relies on two parameters . Also, we define the series as:

In this paper, we work on a zone of the plane delimited as follows: we will only consider positions such that and . The seed will be made of only one tile at position and any assembly will have a height bounded by and a width bounded by .

The aim of this section is to define the tile assembly system . The path of Figure 3.6 illustrates the definition of the set of tile types : each tile type is used exactly once in this assembly. This path is made of five parts: one is the seed and the four others are represented in green, orange, blue and red. The set of tile types is defined as the union of these five kinds of tile types and .

The seed.

Consider the tile type with only one glue called on its east side. We define and the seed is defined as the assembly made of only the tile . From now on, will always be the seed of our tile assembly system.

The green tile types.

The second kind of tile types is made of tile types called defined as follows:

  • for all the tile type is made of the glue on its west side and the glue on its east side; moreover the tile is defined as .

  • the tile type is made of the glue on its west side and the glue on its north side; moreover the tile is defined as .

  • the tile type is made of the glue on its south side and the glue on its west side; moreover the tile is defined as .

  • the tile type is made of the glue on its east side and the glue on its north side; moreover the tile is defined as .

This set of tile types is used to hardcode the path .

The orange tile types.

The third kind of tile types is made of tile types called defined as follows. For all the tile type is made of:

  • the glue on its south side;

  • the glue on its north side if and only if ;

  • the glue on its east side if and only if there exists such that .

For all , the tile is defined as . This set of tile types is used to hardcode the paths for . Note that, for all , the path is a prefix of the path . To summarize, for all :

  • if then is the following tile type:

  • if for some integer , then is the following tile type:

  • otherwise, is the following tile type:

The blue tile types.

The fourth kind of tile types is made of tile types called defined as follows:

  • for all the tile type is made of the glue on its west side and the glue on its east side; moreover the tile is defined as ;

  • the tile type is made of the glue on its west side and the glue on its south side; moreover the tile is defined as ;

  • the tile type is made of the glue on its north side and the glue on its west side; moreover the tile is defined as ;

  • for all , the tile type is made of the glue on its east side and the glue on its west side; moreover the tile is defined as ;

  • the tile type is made of the glue on its east side and the glue on its south side; moreover the tile is defined as .

This set of tile types is used to hardcode the path . For all , we define the path as translated by .

The red tile types.

The fifth kind of tile types is made of tile types called defined as follows. For all , the tile type is made of:

  • the glue on its north side;

  • the glue on its south side if and only if ;

  • the glue on its east side if and only if there exists such that .

For all , the tile is defined as . This set of tile types is used to hardcode the path defined as . For all and , we define the path as the prefix of of length translated by . Note that, for all , the path is a prefix of and that for all , is : the empty path of length . To summarize, for all :

  • if then is the following tile type:

  • if for some integer , then is the following tile type:

  • otherwise, is the following tile type:

Figure 3.1: In our examples, we consider and . The seed (in white) is at position and we represent the path . This path is producible by and this figure contains exactly one occurrence of each tile type of . The height of is and its width is . The tiles of and with a glue on their east side are marked by a black dot.

3.2 Basic properties.

The aim of this section is to define a set of paths which characterizes all the possible prefixes of a path producible by . These paths are obtained by gluing together the different paths defined in Section 3.1. When two of these paths are glued together, we have to verify that the result of this operation is also a path. To achieve this goal, we have to check two properties. The first one is that the last tile of the first path can be glued to the first tile of the second path. The second one is that the two paths do not intersect. We start with a first lemma which gives the positions occupied by the paths defined in Section 3.1. This lemma is useful to show that a path is west or north of another one and thus that these two paths do not intersect.

Lemma 3.1.

For any , for any and for any position occupied by:

  • a tile of , we have and ;

  • a tile of , we have and ;

  • a tile of , we have and ;

  • a tile of , we have and (for , we have ).

Proof.

Straightforward for the green, orange and blue paths. For any position occupied by , we have . Moreover, the position of is and the following tiles are all below the previous one and the length of is then the tile occupies the position . For the special case where , we have . ∎

Now, for all , we define as and we show that these sequences of tiles are paths producible by (see Figure 3.6 for an example of such a path).

Lemma 3.2.

For any and for any , is a path producible by .

Proof.

Consider and the paths , , and . Firstly, we show that these different paths do not intersect. For the special case where , recall that . By Lemma 3.1:

  • the seed is west of ;

  • the path is south of ;

  • the path is west of ;

  • the path is north of .

Secondly, we show that these different paths can be glued together:

  • the position of the seed is and there is a glue on its east side and the position of is and there is a glue on its west side;

  • the position of is and there is a glue on its north side and the position of is and there is a glue on its south side;

  • the position of is and there is a glue on its east side and the position of is and there is a glue on its west side;

  • the position of is and there is a glue on its south side and the position of is and there is a glue on its north side.

Then, the path is producible by . For all , is a prefix of and thus it is a path producible by . ∎

Note that for any , all prefixes of are also producible by . The paths will be used to characterize all the paths producible by . To achieve this goal, we need to know the positions of the free glues on these paths (see Figure 3.6).

Lemma 3.3.

For any and for any , the free glues of are:

  • the north glue of the tile whose position is if ;

  • for all the east glue of the tile whose position is ;

  • for all the east glue of the tile whose position is ;

  • the south glue of the tile whose position is if .

Proof.

Except for the last tile, for any tile of two of its glues are used to assemble the path . Since the tile types have two or three glues on their sides, we have to check the remaining glue on the tiles with three glues on their sides. First, all the tile types of and have only two glues on their sides, thus there are no free glues on the tiles of and . Note that for the special case where (see Figure 3.2) the last tile of is the last tile of . Nevertheless, the south glue of the tile whose position is is not free because of a mismatch with the tile whose position is and which has no glue on its north side. The tile types of with three glues on their sides are for all . For all , the glue on the east side of is free, whereas if the north glue of tile is free. Finally, the only tile types of with three glues on their sides are for all . For , the glue on the east side of is free. Moreover, since is the last tile of the path , its south glue is also free (except when because has no glue on its south side). ∎

Figure 3.2: The path (for and ). The south glue of its last tile causes a mismatch. Thus, this path is a dead-end.

A corollary of this lemma is that is a dead-end (see Figure 3.2). Also, for all , let be the vector . Since the last tile of is and since the type of its east glue is , for all producible by , if is a prefix of then can be written where is a path producible by .

3.3 Analysis of the prefixes.

Now, note that the only tile with a glue on its west side is , so any path producible by begins with a tile . In fact, this reasoning can be done for any glue and direction. Thus, for any path and any , if we know the position and the tile type of and the position of , then we can deduce the tile type of . Now consider two paths and which are producible by : they share a common prefix until they split away. They can split away only at a tile with at least three glues on its sides. From Lemma 3.3, we have the following fact:

Fact 3.4.

For any and for any path producible by , either is a prefix of or there exists such that is a prefix of .
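The reasoning leading to this fact relies on the tile type of the next tile being determined by the current tile type and the direction of growth. A minimal Python sketch of that lookup follows, assuming (as a property of the tileset, stated here as an assumption) that each glue appears on a given side of at most one tile type. The tileset below is hypothetical and much smaller than the one of Section 3.1; `successor_type` and `branching_types` are names chosen for the example.

```python
OPPOSITE = {"N": "S", "E": "W", "S": "N", "W": "E"}


def successor_type(tileset, name, side):
    """Tile type that can bind on `side` of a tile of type `name`, or None.

    Raises ValueError if the glue is shared by several candidate tile types,
    in which case the next tile type would not be determined.
    """
    glue = tileset[name].get(side)
    if glue is None:
        return None
    matches = [t for t, glues in tileset.items() if glues.get(OPPOSITE[side]) == glue]
    if len(matches) > 1:
        raise ValueError(f"glue {glue!r} is ambiguous on side {OPPOSITE[side]}")
    return matches[0] if matches else None


def branching_types(tileset):
    """Tile types with at least three glues: the only places where paths can split."""
    return sorted(t for t, glues in tileset.items() if len(glues) >= 3)


tileset = {
    "s": {"E": "g0"},
    "a": {"W": "g0", "E": "g1", "N": "g3"},
    "b": {"W": "g1", "N": "g2"},
    "c": {"S": "g2"},
    "d": {"S": "g3"},
}
print(successor_type(tileset, "s", "E"))  # a
print(branching_types(tileset))  # ['a']
```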

There are only different prefixes for a producible path which is large enough. Let us now look at how these different prefixes can grow. For all , by attaching tiles to the end of , this path will always grow into . For all , we define as the path obtained by attaching red tiles to until this is no longer possible (see Figure 3.3). For the special case , we have since we run out of red tiles. For the other special case , the path is : since this path is a dead-end, it is not possible to add any red tiles (see Figure 3.2). For the general case , is a dead-end since its last tile is and its only free glue is the south one, which causes a mismatch with the tile . In all cases, there are only red tiles with a free glue on their east side (see Lemma 3.3). Thus, if a path producible by is not a prefix of , then it has to use one of these free glues. The previous remarks are summarized in the following fact.

Figure 3.3: The path for and . This path is obtained by attaching red tiles to until a conflict occurs with . For all , the path is a dead-end. The path is a special case since it is equal to (see Figure 3.6) whose last tile has a free glue on its east side.

Fact 3.5.

For any and for any path producible by , there exists such that either:

  • is a prefix of ;

  • there exists such that is a prefix of .

Moreover, for all , is a dead-end.

We have now obtained different prefixes which are dead-ends and prefixes which are not. Let us look at how these different prefixes can grow. Consider : by attaching tiles to the end of , this path will always grow into (this assembly is producible by since is east of and , north of and south of (see Lemma 3.1)). The last tile of this path is and this path can keep growing to the north by attaching orange tiles at its end. We define the path as the path obtained by attaching orange tiles to until this is no longer possible (see Figure 3.4). The last tile of this path is since it is not possible to add an orange tile because of a mismatch with the tile . Moreover, if , this path is a dead-end because the last tile of this path has no east glue. Also, cannot be a prefix of and by applying Fact 3.4 to we obtain that either is a prefix of or there exists such that is a prefix of . These observations are summarized in the following fact.

Figure 3.4: The path (on the left) and (on the right) for and . These paths are obtained by attaching blue and orange tiles to and until a conflict occurs with . For all , the path is a dead-end if .

Fact 3.6.

For any and for any path producible by , there exists such that either:

  • is a prefix of ;

  • there exists such that either:

    • is a prefix of ;

    • there exists