# Markovian robots: minimal navigation strategies for active particles

###### Abstract

We explore minimal navigation strategies for active particles in complex, dynamical, external fields, introducing a class of autonomous, self-propelled particles which we call Markovian robots (MR). These machines are equipped with a navigation control system (NCS) that triggers random changes in the direction of self-propulsion of the robots. The internal state of the NCS is described by a Boolean variable that adopts two values. The temporal dynamics of this Boolean variable is dictated by a closed Markov chain – ensuring the absence of fixed points in the dynamics – with transition rates that may depend exclusively on the instantaneous, local value of the external field. Importantly, the NCS does not store past measurements of this value in continuous, internal variables. We show that, despite the strong constraints, it is possible to conceive closed Markov chain motifs that lead to nontrivial motility behaviors of the MR in one, two and three dimensions. By analytically reducing the complexity of the NCS dynamics, we obtain an effective description of the long-time motility behavior of the MR that allows us to identify the minimum requirements in the design of NCS motifs and transition rates to perform complex navigation tasks such as adaptive gradient following, detection of minima or maxima, or selection of a desired value in a dynamical, external field. We put these ideas in practice by assembling a robot that operates by the proposed minimalistic NCS to evaluate the robustness of MR, providing a proof-of-concept that is possible to navigate through complex information landscapes with such a simple NCS whose internal state can be stored in one bit. These ideas may prove useful for the engineering of miniaturized robots.

###### pacs:

-## I Introduction

As early as 1959, Feynman discussed the technology transfer from the macro- to the microscale, a highly relevant field of research nowadays in terms of medical applications such as targeted drug delivery and microsurgery feynman1960there. In recent years, the remarkable advance of nanoscience has made the fabrication of synthetic and molecular machines such as sensors and actuators possible bayley2001; hess2017; sanchez2009. Moreover, micrometer-sized devices capable of moving autonomously in a fluid are already a reality. We refer to these microdevices as microrobots. Microrobots can transport cargo and invade cells; healthcare applications for early diagnosis, targeted drug delivery or nanosurgery appear therefore realizable in the not too distant future solovev2012; sanchez2016; sanchez2017; Josephe1700362. There exists a large variety of microrobots, rigid and soft ones, whose self-propulsion can be achieved via electrical, chemical or optical stimulation solovev2012; sanchez2016; sanchez2017; Josephe1700362. The direction of navigation of these devices can be controlled remotely, for instance via a magnetic field in chemically-driven nanorods sanchez2009. However, the ultimate goal is to design and fabricate microrobots with a programmable, autonomous navigation system on board integrating sensors, an energy source and actuators. At present, the miniaturization of autonomous robots has advanced to the millimeter scale churaman2012first; kim_microbiorobotics_2017. Further progress along these lines requires the development of minimal, yet robust algorithms in the sense that they should work reliably in the presence of noise.

Physical properties of small-size objects, e.g. at the microscale, impose technical constraints on the design of microrobots: viscous forces dominate over inertial ones, fluctuations of thermal origin are not negligible and the instantaneous sensing of external signals can only involve local values, never gradients purcell1977life; Berg2008; Tindall2012. For this reason, here we consider a class of autonomous, self-propelled particles, which we refer to as Markovian robots (MR), that move at constant speed, are subject to fluctuations, and can only sense local values of an external field (see Fig.1). Notice that we adopt the same constraints small objects are subjected to, but we do not need to assume necessarily that we work at microscopic scales: the proposed navigation strategies are also of interest for macroscopic robots exposed to weakly modulated signals. Nevertheless, the long-term motivation of this study is to pave the way for the engineering, in a near future, of tiny, autonomous robots. With this intention in mind, we aim at conceiving simple machines that are able to navigate across a complex field – providing valuable information clues – in an autonomous way with a minimum of information storage capacity. Specifically, we equip these machines with a navigation control system (NCS) that triggers random changes in the self-propulsion direction of the robots. An essential aspect of the NCS is that it exhibits only two internal states, meaning that the NCS state can be stored in a single internal Boolean variable that adopts two values. Transitions between these Boolean values are determined by a closed Markov chain, with transition rates that may depend on the instantaneous local value of the external field; see Fig. 1 for sketches of the two relevant NCS discussed in this work. Only one of the transition pathways in the closed two-state Markov chain triggers random reorientation in the moving direction. It is worth noticing that the closed-loop nature of the investigated Markov chains ensures the constant resetting of the internal Boolean variable, preventing the presence of fixed points in the dynamics of this variable. Importantly, the NCS does not store previous measurements in the form of internal continuous variables, preventing a priori any mathematical operation to estimate the gradient of the external field. We show that, despite the requested constraints, it is possible to conceive closed Markov chain motifs that lead to non-trivial motility behaviors. By analytically reducing the complexity in the NCS dynamics, we obtain an effective description of the long-time motility behavior of the MR that allows us to identify the minimum requirements in the design of NCS motifs and transition rates to perform complex navigation tasks such as adaptive gradient following, detection of minima or maxima, or selection of a desired value in a dynamical, external field. We show that MR exhibit non-trivial motility behaviors in one, two and three dimensions.

We put these concepts in practice by assembling a macroscopic robot that operates by the proposed NCS and is subjected to the constraints indicated above. A series of statistical tests allows us to assess the robustness of the proposed minimalistic navigation algorithms. The performance of the robot provides solid evidence in favor of the practical interest of these ideas as well as a proof-of-concept that is possible to navigate through a complex information landscape with only 1-bit of memory. These ideas may prove of help in the engineering of miniature robots.

The minimalistic navigation strategies discussed here are fundamentally different from bacterial chemotactic strategies Berg2008; Tindall2012; Schnitzer1993; Celani2010; Cates2012; Flores2012; chatterjee2011chemotaxis; segall_temporal_1986; deGennes_chemotaxis_2004 as explained in the following. In Celani2010, Celani and Vergassola have cleverly shown that bacterial chemotaxis can be described in a Markovian way by enlarging the space of variables, beyond position and velocity variables, to include continuous (as opposed to Boolean) internal variables. The temporal dynamics of these continuous variables obeys a chain of ordinary differential equations, where the first of them depends on the external field. The frequency of changes in the moving direction of the bacterium is a function of these variables. According to Celani2010, chemotactic behavior is already obtained by keeping the first two of these internal continuous variables. The past measurements are encoded by these continuous variables Celani2010, that, from an algorithmic point of view, need to be constantly updated, see Fig. 2. These dynamical variables are somehow connected to intracellular chemical species, see tu2013 for more details. It is worth mentioning that mathematical procedures similar to the ones utilized here have also been used in the context of bacterial chemotaxis, namely a reduction of a complex internal dynamics in order to obtain the effective long-time motility behavior, see e.g. Schnitzer1993; Clark9150; Celani2010. However, we stress that the analogy between bacterial chemotaxis and the navigation strategies discussed here is limited to the observation that both strategies are Markovian and make use of internal states. Notably, there is no direct link between them; see Fig. 2 for a comparison at the algorithmic level. The differences become evident when looking closely: here, we discuss navigation strategies that make use of a single internal Boolean variable to describe the internal state of the moving entity, while in Celani2010 the internal state of the bacterium is described by (a minimum of) two continuous variables. From this, it is evident that MR have only two possible internal states, while in Celani2010 the internal state of a bacterium is given by vector , with and being two internal continuous variables, implying that there is an infinite (or at least a very large) number of potential internal states. Furthermore, the temporal evolution of the continuous variables and is given, as indicated above, by a hierarchy of ordinary differential equations, while in MR the temporal evolution of the Boolean variable is given by a closed Markov chain. The differences between both strategies are evident even for a trivial scenario where the external field is constant. The internal variables and would converge in this scenario to a fixed point and, thus, the internal state reaches a stationary state. In contrast, the internal state of MR never converges to a stationary value but oscillates ad infinitum. Another indication, how different these strategies are, is the following: the frequency at which the direction of motion changes in the model Celani2010 is a function of the internal state, i.e. in Fig.2, while that is not the case for MR. In summary, the mathematical structures of both strategies are, analytically and algorithmically, fundamentally different.

While the goal of bacterial chemotaxis models is to understand bacterial navigation, we aim here at conceiving minimal navigation algorithms to engineer simple self-propelled robots. Our intention is to provide new perspectives on the engineering of artificial active particles romanczuk_active_2012; vicsek_collective_2012; marchetti_hydrodynamics_2013; menzel_tuned_2015 by concentrating on the design of the navigation control system, while previous studies primarily focused on the design of autonomous self-propulsion mechanisms paxton_catalytic_2004; golestanian_designing_2007; kudrolli_swarming_2008; deseigne_collective_2010; bricard_emergence_2013; bricard2015; palacci_living_2013.

## Ii Markovian robots

In this section, several variants of MR are introduced with a particular focus on their capability of responding to a static, external (scalar) field. At first, the general dynamics in space – equal for all model variants – is formulated. Subsequently, several examples of increasing complexity of the internal robot dynamics, which controls the occurrence of reorientation events, are studied. In particular, an effective Langevin dynamics is derived analytically for each case, which reveals the large-scale robot dynamics in the diffusive limit. These concepts are illustrated within a didactic introduction first by means of a simplified version of the model where the reorientation rate is directly a function of the external field.

### Spatial dynamics

Throughout, individual robots are assumed to move at constant speed by means of an active self-propulsion mechanism. For simplicity, we focus on one-dimensional systems of linear size – generalizations to higher dimensions are commented on in Section IV. In one dimension, the dynamics of the robot is given by

(1) |

where is the position of the robot at time , denotes its active speed, is the bare diffusivity in the absence of active motion (), abbreviates white Gaussian noise and indicates the direction of active motion at time . The temporal dynamics of is controlled by a navigation control system (abbreviated NCS for short in the following), which we consider to operate with one internal Boolean variable that adopts two values. The dynamics of this internal Boolean variable is dictated by a closed Markov chain, see Fig. 1 which illustrates several NCS motifs and the robot dynamics. Notice that the dynamics of the NCS is affected by the external field via the -dependency of the transition rates. In all of these motifs, there is one particular transition leading to state (depicted by a red, dashed arrow in Fig. 1), which triggers a reversal of the driving engine and, thus, induces the inversion of the direction of active motion of the robot: . Given a certain NCS motif, we want to understand the motility response of the robot to an external field . This is addressed in the following.

### A didactic introduction

We start by studying the long-time behavior of the simplest possible scenario where the reorientation rate depends directly on the external field, i.e. there is no internal dynamics. Let us stress that we use this case as a didactic introduction to illustrate a series of fundamental concepts that will allow us to obtain a simplified long-time dynamics of NCS motifs of higher complexity (NCS motifs 1 and 2). Here, reversal events occur at a rate , which depends on the external signal . The temporal evolution of the system can be expressed in terms of the probabilities and to find a robot at position at time moving to the right and to the left, respectively. The associated Master equation gardiner_stochastic_2010 reads:

(2a) | ||||

(2b) |

By introducing the new variables and , we recast Eq. (2) into

(3a) | ||||

(3b) |

The variable of interest is representing the probability to find the robot at position at time . Since the local dynamics of [Eq. (3b)] is faster than [Eq. (3a)] and we are interested in the long-time behavior of the latter, we approximately set , enabling us to express to lowest order in spatial gradients. Inserting this expression into Eq. (3a) yields the following effective equation for the density:

(4) | ||||

All details related to the reorientation dynamics were coarse-grained by deriving the effective scalar equation (4) for . This approach is valid as long as the mean distance traversed by a robot in between two transitions is shorter than the characteristic scales at which the external field varies.

### Effective Langevin dynamics

Now we consider the inverse problem: starting with equation (4) for the density , we aim at finding a suitable Langevin equation in Ito’s interpretation kampen_stochastic_2011; gardiner_stochastic_2010 of the form

(5) |

whose associated Fokker-Planck equation

(6) |

for the evolution of the density is structurally identical to Eq. (4) ^{1}^{1}1The use of Stratonovich’s interpretation leads to a different Fokker-Planck equation kampen_stochastic_2011..
This approach is advantageous in several regards.
By obtaining an effective drift term and an effective diffusion coefficient , we characterize the transport properties of the MR, encoding the details of the NCS in and .
The physical interpretation of and as drift and dispersion, respectively, results from the short-time solution of Eq. (6) for the propagator risken_fokker-planck_1996

which determines the probability to find a robot at position after a short observation time given that it was observed at position at time . Notably, the propagator provides a direct way how to measure the mean local drift or bias as well as the position dependent dispersion .

Knowing drift and diffusion coefficient, we can further determine whether a NCS motif lets the MR display a long-time motility response to the external field as follows. The steady state solution of Eq. (6) for no-flux boundary conditions takes the form

(7) |

with a normalization coefficient . In general, a MR is said not to exhibit a long-term response to a non-constant external field if the stationary density is constant, i.e., Otherwise, the motif under consideration induces a response in the sense that the coupling to the external field increases or decreases the probability to find a robot in certain areas in space. The sign of the derivative of the stationary density distribution is determined by the simple criterion

(8) |

which follows directly from Eq. (7). A constant density requires all -dependencies in Eq. (7) to compensate each other. This implies the specific relation between drift and diffusion. The nature and form of the response depends on the topology of the motif and the functional form of the rates; this is addressed further below.

Notice that it is possible to obtain a motility response to an external field without involving biased motion. This is evident from Eq. (7): if and is still a function of , a nontrivial, stationary density profile will emerge. This kind of motility response is known in biology as chemokinesis. In contrast, directed motion requires a non-vanishing . In biology, a motility response involving a bias is known as chemotaxis.

For the introductory example considered above, the comparison of Eq. (4) and Eq. (6) reveals

(9) |

which satisfies the above-mentioned relation, i.e. , implying . We observe that though the diffusion depends on and the local drift is nonzero and varies over space, there is no long-time motility response – the long-time density distribution is flat as noticed when memory kernels were introduced Berg2008; Schnitzer1993. Using the terminology of chemotaxis, one can summarize that chemotactic and chemokinetic part compensate each other in this case.

### NCS motif 1: up- & downgradient motion

Now, we focus on a more complex scenario where the state of the navigation control system is given by an internal Boolean variable that adopts two values: and . The possible transitions are with rate and with rate . The latter transition triggers a reversal of the direction of active motion. This is motif in Fig. 1. Due to the presence of two internal states, we introduce four fields and with , which denote the probability to find a robot at position at time with internal state moving to the right () and to the left (), respectively. The temporal evolution of these fields is determined by the following Master equation:

(10a) | ||||

(10b) | ||||

(10c) | ||||

(10d) |

By introducing the change of variables and , we recast Eq. (10) into two groups of equations for the densities ,

(11a) | ||||

and for the differences , | ||||

(11b) | ||||

This seemingly innocent change of variables simplifies the problem substantially. If spatial derivatives in Eqs. (11) were absent, Eqs. (11a) would decouple completely from Eqs. (11b). Further, we note that the eigenvalues associated to the local dynamics of Eqs. (11a) are and , while the real parts of those associated to Eqs. (11b) are both negative, i.e. . The lesson is: in the long-wavelength limit, there is one eigenvector whose temporal evolution is slow while the other three are fast. Accordingly, we can define a new set of four fields by linear combination of those in Eqs. (11) in such a way that only one of those fields is slow. Due to number conservation, the total density , which is the primary quantity of interest, is the slow field (). In order to reduce Eqs. (11) to the density dynamics, we request local equilibrium and take , allowing us to express all fields as function of and spatial derivatives of it (see Appendix C for further details). By keeping all terms up to second order spatial derivatives of the density , we obtain an effective Fokker-Planck equation, cf. Eq. (6), where and adopt the form

(12a) | ||||

(12b) |

We highlight that yields the relation and, thus, the stationary density would be a constant according to Eq. (7). In other words, we learn that we need to require and at least one of the rates should depend on in a nontrivial way in order to design robots that respond to the external field . In the spatially homogeneous case, the diffusion coefficient [Eq. (12b)] reduces to the expression, which was derived in Ref. grossmann_diffusion_2016.

The potentially simplest example that leads to upgradient motion of MR is and , where is a constant, see Fig. 3. It is interesting to observe that the robots move downgradient if we make the opposite choice, namely and . Thus, the previous discussion reveals how the field-dependence of the transition rates controls whether robots tend to move up- or downgradient.

Notably, both types of robots are entirely indistinguishable in spatially homogeneous environments; it is therefore a priori impossible to infer information about the type of response on the basis of measurements, which are performed in spatially homogeneous external fields, i.e. at different levels of . To be concrete, consider the following gedankenexperiment: two types of robots, robots of type with and and robots of type with and exposed to the same external field ; this scenario is depicted in Fig. 3. If corresponds to an homogeneous environment such that , with being a constant, the diffusion coefficient and the mean rate of reversals are identical for both robot types; notice that the latter increases with the field value . One could easily be misled to think that robots tend to accumulate in those regions in space where the reversal rate is high, leading to an effective “trapping” of the robots in those regions. However, the distinct long-time behaviors of robots of type A and B provide clear evidence against this simplified picture. While robots of type A tend to move upgradient and accumulate close to , robots of type B tend to move downgradient and accumulate close to , see Fig. 3. Despite this evident qualitative difference, both types exhibit a higher reversal rate close to . This finding, i.e. the existence of different motility responses for robots of type A and B, highlights how subtle and nontrivial the impact of the NCS design is on the long-time motility response of robots: by exchanging the functional form of the transitions from and , we switch from up- to downgradient motion.

In addition, the previous analysis reveals that robots navigating by NCS motif exhibit a motility response that involves both, directed motion and position-dependent diffusion, i.e. it is neither purely chemotactic nor purely chemokinetic, but involves a combination of both, in the sense that the force and the diffusion coefficient are non-vanishing functions, which depend on the position via .

### NCS motif 2: adaptation

We introduce NCS motif (cf. Fig. 1) to obtain robots whose motility response is purely chemotactic, with a bias resulting from only. Adopting the terminology of bacterial chemotaxis, we call robots whose diffusion coefficient possesses an explicit dependency on , and thus on nonadaptive, while those with a constant diffusion coefficient are referred to as adaptive robots. Following this nomenclature, we seek to create chemotactic, adaptive robots. Adaptive MR are characterized by the independence of their motility pattern from the intensity of external stimuli in spatially homogeneous environments – the diffusion coefficient, for example, is independent of the basis level of the external field. We insist that only NCS motif can yield adaptive robots; motif leads always to nonadaptive motility responses.

Motif differs from motif by the existence of a backward transition from state to , which does not activate a reversal, and whose associated transition rate is . Following the analytical procedure outlined before, we first write the equations for and perform the change of variables and for all . Again, the dynamics of the ’s is fast and, furthermore, the system of equations for the ’s contains one fast and one slow mode allowing us, eventually, to reduce the four-dimensional system to the slow dynamics of the density . Keeping derivatives up to second order, we obtain a Fokker-Planck equation of the form given by Eq. (6) where

(13a) | ||||

(13b) |

In the limit , Eqs. (12) are recovered. Again, if all rates are equal, , chemotactic and chemokinetic part are related by and, thus, the stationary density is constant.

So far, no restrictions have been imposed on the rates , , and . Consequently, the terms and given by Eqs. (13) are generic for motif . In order to obtain adaptive robots, we want to choose these rates in such a way that becomes independent of , while still depends on it. With this idea in mind, we define

(14a) | ||||

(14b) |

In order to ensure all rates to be positive, we further choose , where and are positive constants. By inserting these rates into Eq. (13), we find

(15a) | ||||

(15b) |

Notably, Eq. (15b) is structurally identical to Eq. (12b) for motif , however, by definition it is independent of . Further, we defined the response function

(16) |

Notice that any function restricting the values of between and serves our purpose. This freedom of choice may be used to design according to the desired response.

With the above choice of rates, we obtain purely adaptive, chemotactic robots whose directed motion is controlled by only [cf. Eq. (15)]. In the absence of a external field gradient, , robots diffuse with a constant diffusion coefficient given by Eq. (15b) that is independent of the external field value. We notice that requesting is equivalent to fixing the average and variance of the run-time distribution of the robots; accordingly, their behavior in homogeneous environments of different (constant) field values is microscopically indistinguishable.

## Iii Performing complex tasks

In the following, we discuss the possibility of designing MR to perform multiple complex tasks by playing with the response function , cf. Eq. (16), on the basis of NCS motif . If we define such that in the interval of interest of field values, robots move upgradient. As a consequence, they accumulate around the maxima of in a complex landscape as shown in Figs. 4(a,b). This requires to be a decreasing function of . In addition, we have to make sure that is bounded by . As an example, we consider where and are chosen such that and , and where and are constants. On the other hand: if robots are supposed to move downgradient to accumulate in the minima of , the response function has to be a decreasing function of the signal, , and, thus, should be an increasing function of . This can be achieved by using the same functional form as before, but requesting and , cf. Figs. 4(a,b).

We can further design robots to accumulate at a given value of the external field as shown in Figs. 4(c,d). For this task, we need to be positive for and negative for . Fig. 4(c) illustrates this type of robot design: where is a Gaussian distribution centered at and of variance . The coefficients and are chosen such that and .

As a final example, we study how robots chase a signal that moves at speed as shown in Figs. 4(e)-(i). The analyzed scenario is analogous to recent bacterial chemotactic experiments performed with a moving chemoattractant signal Li_barrier_2017. For this purpose, the MR design for the detection of maxima is used, cf. the discussion of Figs. 4(a,b). As a signal, we use a Gaussian distribution that moves at a constant speed (remember that robots move at constant speed ). There is a critical signal speed above which the robots become decreasingly responsive. In the limit of high signal speeds (), robots just diffuse around as they would do in an homogeneous field. We quantify the efficiency of robots by the deviation from the homogeneous distribution , see Fig. 4(e), which is large if robots follow the moving signal and decreases to zero for non-responsive MR. Considering a simplified scenario, we derive a rough estimate of the crossover speed beyond which robots cannot follow the moving signal. The estimation is based on the idea of a quasi-stationary situation: the density distribution of robots should have relaxed into the stationary state before the signal has moved to ensure that robots can follow a dynamic signal. Imagine first a static field , where and are constants such that ; notice that is a length-scale that characterizes the spatial extent of the gradient. Assuming that we can linearize around , we approximate , yielding a linear restoring force as if it was coming from a harmonic potential. The characteristic relaxation time for an harmonic potential is . The critical speed is now given by the product of the relaxation rate and a characteristic size of the signal; therefore, we estimate that robots can follow any signal that travels at a speed less or equal to

(17) |

For the parameters used in the simulations shown in Fig. 4, the critical speed yields , indicated by a red (dashed) line in Fig. 4(e).

## Iv Two & three-dimensional systems

The results obtained so far regarding the motility response of MR, based on a one-dimensional approach, hold true qualitatively in higher dimensions. Below, we briefly outline how biased motion of MR can be addressed in higher spatial dimensions within the same theoretical framework and provide a proof of principle. Technical details as well as a full account of the general dynamics in two as well as in three dimensions can be found in Appendix D and Appendix E, respectively.

#### Two-dimensional case

The equation of motion of a MR in two dimensions reads

(18) |

where is the position of a robot in two dimensions and is a unit vector pointing in the direction of motion parametrized by the polar angle . The polar angle may undergo a stochastic, rotational dynamics due to small-scale spatial heterogeneities, thermal fluctuations or temporal variations of the active driving force mikhailov_self_1997; peruani_self_2007; romanczuk_brownian_2011; chepizhko2013:

(19) |

The noise amplitude parametrizes the persistence of trajectories during run phases, and denotes white, Gaussian noise.

The internal robot dynamics, which controls abrupt changes of the direction of motion, is determined as before by a certain NCS. The reorientation of robots may be implemented in several ways: the new direction of active motion could be selected from a probability distribution of reorientation angles, representing, for example, a cone centered around the previous orientation or it could be chosen uncorrelated with respect to the previous direction of motion. The qualitative behavior is independent of this choice.

At first, we put forward a heuristic argument valid for low angular noise intensities to illustrate why the results derived so far based on one-dimensional systems should hold in higher dimensions. Consider a robot equipped with some NCS, which controls the moments in time when the robot selects a new direction of motion from a certain probability distribution. The velocity of a robot may always be divided into its components parallel and perpendicular to the gradient orientation. The upgradient climbing speed is a random variable, which changes at each reorientation event. Thus, the speed has to be rescaled to obtain an average climbing speed. Furthermore, not every reorientation inevitably leads to a reversal of the direction of active motion with respect to the gradient orientation. Upon reorientation, the projection of the new direction of active motion onto the gradient is positive or negative with equal probability () if the direction of self-propulsion of a robot at each reorientation event is chosen randomly from the interval in two dimensions; in general, there is a reversal probability that the robot moves upgradient (downgradient) given that it was moving downgradient (upgradient) before the reorientation. These arguments indicate that it is always possible to come up with an effective one-dimensional description – in the sense of a projection – for the motion along the local gradient orientation, which is analogous to the problem considered in previous sections.

We now turn to a quantitative analysis of the problem in two dimensions. For the sake of concreteness, we formulate the problem for adaptive robots as discussed in Section II, which are controlled by NCS motif , cf. Fig. 1. In contrast to one dimension, where only two directions of motion (denoted by ) are possible, a continuum of orientations parametrized by the polar angle exists in two dimensions. Therefore, the probability densities are replaced by the probability densities to find a robot at position in state , moving into direction at time . We introduced the new symbol in order to avoid confusion with the probability density

(20) |

to find a robot at a certain position at time in state , independent of its direction of motion. Altogether, the full set of Master equations for robots controlled by NCS motif in two dimensions reads

(21a) | ||||

(21b) |

The details of the stochastic reorientation event, triggered by the NCS, are determined by the probability density function . For unbiased reorientations, this function should be symmetric: .

Just as in one dimension, the total density dynamics is a slow quantity since it is locally conserved. It is therefore possible to reduce the set of Master equations to the Fokker-Planck equation

(22) |

for the density . Technically, the derivation follows the same logic as in one dimension. At first, the dynamics of the probability densities , cf. Eq. (20), are derived by integration of Eqs. (21) over the angular variable . Those equations are coupled to the flux, which is determined by the local order parameter

(23) |

in two dimensions. The fields , which replace from the one-dimensional discussion, may again be adiabatically eliminated to obtain a reduced set of equations for the densities . Finally, the density dynamics follows by assuming local equilibrium. Details on this derivation are summarized in Appendix D. For the example considered above, namely adaptive MR with NCS motif , one obtains the local drift

(24) |

with the parameter-dependent prefactor

(25) |

Further, the constant diffusion coefficient reads

(26) |

In these effective transport quantities, the mean cosine of the reorientation distribution was abbreviated by :

(27) |

These results constitute a straightforward generalization of the results for the one-dimensional case.

#### Three-dimensional case

For the sake of completeness, we finally consider MR in three spatial dimensions. Their transport characteristics turn out to be marginally different from those in two dimensions, as explained below. This implies in particular that all qualitative statements made above hold true in three dimensions as well. However, there are some technical complications as a consequence of the three dimensional motion regarding the implementation of angular fluctuations as well as the angular reorientation, which are explained below for this reason. Again, adaptive MR controlled by NCS motif are considered for simplicity as an example. All details concerning the general dynamics of MR in three dimensions can be found in Appendix E.

The dynamics in space for MR in three dimensions,

(29) |

is unchanged with respect to previous cases. However, the orientation of the active driving force, determined by the unit vector , has to be parametrized differently in three dimensions. One could, for example, use spherical coordinates , where and . The angular dynamics for unbiased, orientational fluctuations reads then as follows:

(30a) | ||||

(30b) |

For a derivation of these equations, see grossmann_anistropic_2015, and for a detailed discussion on Brownian motion in 3D we refer the reader to Perrin; yosida1949; doi:10.1143/JPSJ.22.219; Brillinger1997; PhysRevLett.103.068102. All interpretations of multiplicative noise terms in the angular dynamics [Eq. (30)] are equivalent in this particular case. From a technical point of view, it is, however, more convenient to use Cartesian coordinates for the director , at least for analytical calculations.

Besides the continuous fluctuations of the direction of motion due to rotational diffusion [Eq. (30)], robots change the orientation of the active driving force in a discontinuous fashion each time that the NCS triggers one of this events: . Given that the previous direction of motion was , a novel orientation is chosen from a transition probability density . Since reorientations are supposed to occur in an unbiased manner, the transition probability density can only depend on the scalar product . Further, the normalization of the director, , has to be preserved. One may, therefore, parametrize

(31) |

where is the probability distribution function for the scalar product of the orientations just right before and after a reorientation event. Put differently, it denotes the probability density for the cosine of the angle between the vectors and , i.e. .

The internal robot dynamics is independent of the spatial dimension. Therefore, the general structure of the Master equations, which describe the dynamics of MR, remains unchanged in three dimensions, but transport terms are adapted accordingly. For MR controlled by NCS motif , these Master equations are given by

(32a) | ||||

(32b) |

which is the analogue of Eq. (21) for the corresponding two-dimensional case: the convective term is replaced by its three dimensional equivalent, the reorientation distribution is replaced by and the implementation of the director dynamics due to rotational noise has changed. The latter is determined in Cartesian coordinates by the operator

(33) |

where a sum over and is implicit. This parametrization is entirely equivalent to the angular representation [Eqs. (30)], which can be verified by insertion of the parametrization of via spherical coordinates grossmann_anistropic_2015.

In the diffusive limit, i.e. if the external signal varies weakly on spatial scales, which a robot traverses in between two reorientation events, a drift-diffusion approximation in the same spirit as in one and two spatial dimensions is feasible. The basic prerequisites of this derivation and its logic are analogous to the arguments put forward before; technical subtleties are summarized in Appendix E. It turns out that solely the speed and the angular noise intensity are rescaled by numerical factors, which depend on the spatial dimension. The drift reads [cf. Eq. (24)]

(34) |

in three dimensions, where the prefactor is structurally very similar to determined by Eq. (25) in two dimensions. Here, the prefactor is determined by

(35) |

Along similar lines, only a few numerical factors are replaced in the expression for the effective diffusion coefficient, which reads in three dimensions as follows:

(36) | ||||

Just as in two dimensions, the parameter denotes the mean cosine of the angle between the directors before and after the reorientation event. In three dimensions, it may be expressed by

(37) |

The stationary probability density is thus determined by

(38) |

for adaptive MR controlled by NCS motif .

Notably, the simple rescaling of speed and rotational diffusion described above is not a particularity of the example under consideration, but it is generally the only quantitative difference of the transport properties of MR in two and three dimensions. The proof of this result is sketched in Appendix E. In short, the behaviour of MR is qualitatively independent of the spatial dimension.

A comparison of individual-based model simulations and theoretical predictions in terms of the stationary probability density , as shown in Fig. 6, serves as sanity check that the analytically obtained transport coefficients, drift [Eq. (34)] and diffusion [Eq. (36)], provide a reasonable description of the large-scale transport of MR in the diffusive limit.

## V Tests with a real robot

We tested the concepts developed before in practice by assembling a macroscopic robot that operates with NCS motif as defined above. The robot – a Lego Mindstorms EV3 shown in Fig. 7(a) – was equipped with a single light sensor capable of reading light intensities, providing a signal in arbitrary units between and at the current position. A gray scale from black to white printed on paper (total length: ) was utilized as an external field [cf. the top panel of Fig. 7(c)]. The robot possessed two synchronously steered motors in the front, each of which are connected to one wheel. A metallic roller in the back of the robot served as stabilization. For simplicity, we focused on the one-dimensional scenario: the robot was attached to a metallic rail to prevent turns, thereby ensuring straight trajectories.

Being a real-word system, the robot was naturally subjected to a series of fluctuations. Vibrations of the arm that connected the light sensor to the robot and, moreover, imperfections in the printed gray scale itself resulted in noisy measurements of the signal intensity . Furthermore, imperfect rotations of the wheels imply varying step lengths and, hence, led effectively to noise in the particle position.

The robot was programed in the LabVIEW based Lego Mindstorms EV3-software. The basic flowchart of the algorithm is shown in Fig. 2. The temporal update was composed of a streaming and a signal processing step that were repeated continuously. The length of one streaming step was fixed to be of the wheel perimeter, resulting in a step length of approximately . Afterwards, the signal strength was read from the sensor. Based on this measurement, the internal state was updated and, possibly, a reversal of the direction of rotation of the wheels could be triggered. The transition rates , , and were translated into probabilities , and for the corresponding transitions, where and were used.

In the following, we aimed at testing the theoretical predictions at the level of exit probabilities. For this purpose, a single experimental run proceeded as follows: it was monitored whether a robot which was initially placed in the middle of the experimental setup reached the left (black) or right (white) boundary of the system first. Once the robot touched one of the boundaries, the experiment was stopped and repeated. In total, realizations were recorded. In cases, the robot left the system via the right boundary. A typical trajectory for an exit on the right boundary is displayed in Fig. 7(c).

Based on this experimental result, we first test the null hypothesis that the robot performed just an unbiased random walk. The exit probability on the right side of the system for a single experimental run should therefore be . The probability to observe exits on the right, given realizations in total, is determined by the Binomial distribution

(39) |

This distribution is shown for and in Fig. 7(b) by black circles. The total probability to observe exits to the right of the system or a more extreme result than this is determined by the tails of the Binomial distribution. It thereby constitutes the -value under the null hypothesis the robot performs an unbiased random walk freedman_statistics_1997. In the case under consideration, we obtain a -value of approximately . Accordingly, the null hypothesis may be discarded based on the standard significance level . In short, there is considerable statistical significance that the motion of the robot is biased due to the NCS at work.

The NCS was implemented such that the robot tended to move towards brighter areas in terms of the gray value and, thus, we expect the number of exits to the right to be larger than to the left. Simulations of the corresponding process predict that the probability to touch the right boundary first is . The Binomial distribution for this -value is represented by red squares in Fig. 7(b). The experimentally observed result () is indicated by a blue, vertical line. Apparently, the likelihood for the observed result given is higher than for the random walk:

(40) |

Based on the Akaike information criterion akaike_information_1998, we infer that the theoretically predicted value is considerably more likely than the random walk hypothesis corresponding to .

Finally, we specify the last statements regarding the likelihood of certain -values given the experimental observation. Using the Bayesian theorem meyer_introductory_1965, the following expression for the probability distribution for given a certain number of exits to the right out of total experimental realizations is deduced: