A Evaluation of Kullback–Leibler divergences

# An application of Pontryagin's principle to Brownian particle engineered equilibration

## Abstract

We present a stylized model of controlled equilibration of a small system in a fluctuating environment. We derive the equations governing the optimal control steering in finite time the system between two equilibrium states. The corresponding thermodynamic transition is optimal in the sense that occurs at minimum entropy if the set of admissible controls is restricted by certain bounds on the time derivatives of the protocols. We apply our equations to the engineered equilibration of an optical trap considered in a recent proof of principle experiment. We also analyze an elementary model of nucleation previously considered by Landauer to discuss the thermodynamic cost of one bit of information erasure. We expect our model to be a useful benchmark for experiment design as it exhibits the same integrability properties of well known models of optimal mass transport by a compressible velocity field.

## 1 Introduction

An increasing number of applications in micro and sub-micro scale physics call for the development of general techniques for engineered finite-time equilibration of systems operating in a thermally fluctuating environment. Possible concrete examples are the design of nano-thermal engines [13, 45] or of micro-mechanical oscillators used for high precision timing or sensing of mass and forces [33].

A recent experiment [36] exhibited the feasibility of driving a micro-system between two equilibria over a control time several order of magnitude faster than the natural equilibration time. The system was a colloidal micro-sphere trapped in an optical potential. There is consensus that non-equilibrium thermodynamics (see e.g. [49]) of optically trapped micron-sized beads is well captured by Langevin–Smoluchowski equations [24]. In particular, the authors of [36] took care of showing that it is accurate to conceptualize the outcome of their experiment as the evolution of a Gaussian probability density according to a controlled Langevin–Smoluchowski dynamics with gradient drift and constant diffusion coefficient. Finite time equilibration means that at the end of the control horizon, the probability density is solution of the stationary Fokker–Planck equation. The experimental demonstration consisted in a compression of the confining potential. In such a case, the protocol steering the equilibration process is specified by the choice of the time evolution of the stiffness of the quadratic potential whose gradient yields the drift in the Langevin–Smoluchowski equation. As a result, the set of admissible controls is infinite. The selection of the control in [36] was then based on simplicity of implementation considerations.

A compelling question is whether and how the selection of the protocol may stem from a notion of optimal efficiency. A natural indicator of efficiency in finite-time thermodynamics is entropy production. Transitions occurring at minimum entropy production set a lower bound in Clausius inequality. Optimal control of these transitions is, thus, equivalent to a refinement of the second law of thermodynamics in the form of an equality.

In the Langevin–Smoluchowski framework, entropy production optimal control takes a particularly simple form if states at the end of the transition are specified by sufficiently regular probability densities [6]. Namely, the problem admits an exact mapping into the well known Monge–Kantorovich optimal mass transport [50]. This feature is particularly useful because the dynamics of the Monge–Kantorovich problem is exactly solvable. Mass transport occurs along free-streaming Lagrangian particle trajectories. These trajectories satisfy boundary conditions determined by the map, called the Lagrangian map, transforming into each other the data of the problem, the initial and the final probability densities. Rigorous mathematical results [8, 14, 18] preside over the existence, qualitative properties and reconstruction algorithms for the Lagrangian map.

The aforementioned results cannot be directly applied to optimal protocols for engineered equilibration. Optimal protocols in finite time unavoidably attain minimum entropy by leaving the end probability densities out of equilibrium. The qualitative reason is that optimization is carried over the set of drifts sufficiently smooth to mimic all controllable degrees of freedom of the micro-system. Controllable degrees of freedom are defined as those varying over typical time scales much slower than the time scales of Brownian forces [3]. The set of admissible protocols defined in this way is too large for optimal engineered equilibration. The set of admissible controls for equilibration must take into account also extra constraints coming from the characteristic time scales of the forces acting on the system. From the experimental slant, we expect these restrictions to be strongly contingent on the nature and configuration of peripherals in the laboratory setup. From the theoretical point of view, self-consistence of Langevin–Smoluchowski modeling imposes a general restriction. The time variation of drift fields controlling the dynamics must be slow in comparison to Brownian and inertial forces.

In the present contribution, we propose a refinement of the entropy production optimal control adapted to engineered equilibration. We do this by restricting the set of admissible controls to those satisfying a non-holonomic constraint on accelerations. The constraint relates the bound on admissible accelerations to the pathwise displacement of the system degrees of freedom across the control horizon. Such displacement is a deterministic quantity, intrinsically stemming from the boundary conditions inasmuch we determine it from the Lagrangian map.

This choice of the constraint has several inherent advantages. It yields an intuitive hold on the realizability of the optimal process. It also preserves the integrability properties of the unrestricted control problem specifying the lower bound to the second law. This is so because the bound allows us to maintain protocols within the admissible set by exerting on them uniform accelerating or decelerating forces. On the technical side, the optimal control problem can be handled by a direct application of Pontryagin maximum principle [34]. For the same reasons as for the refinement of the second law [6], the resulting optimal control is of deterministic type. This circumstance yields a technical simplification but it is not a necessary condition in view of extensions of our approach. We will return to this point in the conclusions.

The structure of the paper is as follows. In section 2 we briefly review the Langevin–Smoluchowski approach to non-equilibrium thermodynamics [47]. This section can be skipped by readers familiar to the topic. In section 3 we introduce the problem of optimizing the entropy production. In particular we explain its relation with the Schrödinger diffusion problem [46, 1]. This relation, already pointed out in [38], has recently attracted the attention of mathematicians and probabilists interested in rigorous application of variational principles in hydrodynamics [5]. In section 4 we formulate the Pontryagin principle for our problem. Our main result follows in section 5 where we solve in explicit form the optimal protocols. Sections 6 and 7 are devoted to applications. In 6 we revisit the theoretical model of the experiment [36], the primary motivation of our work. In section 7 we apply our results to a stylized model of controlled nucleation obtained by manipulating a double well potential. Landauer and Bennett availed themselves of this model to discuss the existence of intrinsic thermodynamic cost of computing [31, 9]. Optimal control of this model has motivated in more recent years several theoretical [19] and experimental works [11, 28, 27].

Finally, in section 8 we compare the optimal control we found with those of [7]. This reference applied a regularization technique coming from instanton calculus [4] to give a precise meaning to otherwise ill-defined problems in non-equilibrium thermodynamics, where terminal cost seem to depend on the control rather than being a given function of the final state of the system.

In the conclusions we discuss possible extensions of the present work. The style of the presentation is meant to be discursive but relies on notions in between non-equilibrium physics, optimal control theory and probability theory. For this reason we include in appendices some auxiliary information as a service to the interested reader.

## 2 Kinematics and thermodynamics of the model

We consider a physical process in a -dimensional Euclidean space () modeled by a Langevin–Smoluchowski dynamics

 dξt=−∂ξtU(ξt,t)dt+√2βdωt (1)

The stochastic differential stands here for the increment of a standard -dimensional Wiener process at time [24]. denotes a smooth scalar potential and is a constant sharing the same canonical dimensions as . We also suppose that the initial state of the system is specified by a smooth probability density

 P(q≤ξtι

Under rather general hypotheses, the Langevin–Smoluchowski equation (1) can be derived as the scaling limit of the overdamped non-equilibrium dynamics of a classical system weakly coupled to an heat bath [51]. The Wiener process in (1) thus embodies thermal fluctuations of order . The fundamental simplification entailed by (1) is the possibility to establish a framework of elementary relations linking the dynamical to the statistical levels of description of a non-equilibrium process [47, 32]. In fact, the kinematics of (1) ensures that for any time-autonomous, confining potential the dynamics tends to a unique Boltzmann equilibrium state.

 peq(q)∝exp(−βU(q))

Building on the foregoing observations [47], we may then identify over a finite time horizon with the internal energy of the system. The differential of

 dU(ξt,t)=dt∂tU(ξt,t)+dξt1/2⋅∂ξtU(ξt,t) (3)

yields the energy balance in the presence of thermal fluctuations due to interactions with the environment. We use the notation for the Stratonovich differential [24]. From (3) we recover the first law of thermodynamics by averaging over the realizations of the Wiener process. In particular, we interpret

 W=E∫tftodt∂tU(ξt,t) (4)

as the average work done on the system. Correspondingly,

 Q=−E∫tftodξt1/2⋅∂ξtU(ξt,t) (5)

is the average heat discarded by the system into the heat bath and therefore

 W−Q=E(U(ξtf,tf)−U(ξtf,tf)) (6)

is the embodiment of the first law.

The kinematics of stochastic processes [41], allow us also to write a meaningful expression for the second law of thermodynamics. The expectation value of a Stratonovich differential is in general amenable to the form

 Q=−E∫tftιdt(v⋅∂ξtU)(ξt,t) (7)

where

 v(q,t)=−∂q(U(q,t)+1βlnp(q,t)) (8)

is the current velocity. For a potential drift, the current velocity vanishes identically at equilibrium. As well known from stochastic mechanics [20, 40], the current velocity permits to couch the Fokker–Planck equation into the form of a deterministic mass transport equation. Hence, upon observing that

 Extra open brace or missing close brace (9)

we can recast (7) into the form

 QT=Q−1βElnp(ξtf,tf)p(ξtι,tι)=E∫tftιdt∥v(ξt,t)∥2 (10)

which we interpret as the second law of thermodynamics (see e.g. [42]). Namely, if we define as the total entropy change in , (10) states that the sum of the entropy generated by heat released into the environment plus the change of the Gibbs–Shannon entropy of the system is positive definite and vanishes only at equilibrium. The second law in the form (10) immediately implies a bound on the average work done on the system. To evince this fact, we avail us of the equality

 W=E(U(ξtf,tf)−U(ξtι,tι)+1βlnp(ξtf,tf)p(ξtι,tι))+QT (11)

and define the current velocity potential

 F(q,t)=U(q,t)+1βlnp(q,t)

We then obtain

 W=E(F(ξtf,tf)−F(ξtι,tι))+QT≥E(F(ξtf,tf)−F(ξtι,tι))

In equilibrium thermodynamics the Helmholtz free energy is defined as the difference

 F=U−β−1S

between the internal energy and entropy of a system at temperature . This relation admits a non-equilibrium extension by noticing that the information content [48] of the system probability density

 S(q,t)=−lnp(q,t)

weighs the contribution of individual realizations of (1) to the Gibbs-Shannon entropy. We refer to [41] for the kinematic and thermodynamic interpretation of the information content as osmotic potential. We also emphasize that the notions above can be given an intrinsic meaning using the framework of stochastic differential geometry [40, 38]. Finally, it is worth noticing that the above relations can be regarded as a special case of macroscopic fluctuation theory [10].

## 3 Non-equilibrium thermodynamics and Schrödinger diffusion

We are interested in thermodynamic transitions between an initial state (2) at time and a pre-assigned final state at time also specified by a smooth probability density

 P(q≤ξtf

We also suppose that the cumulative distribution functions of (2) and (12) are related by a Lagrangian map such that

 P(ξtι

According to the Langevin–Smoluchowski dynamics (1), the evolution of probability densities obey a Fokker–Planck equation, a first order in time partial differential equation. As a consequence, a price we pay to steer transitions between assigned states is to regard the drift in (1) not as an assigned quantity but as a control. A priori a control is only implicitly characterized by the set of conditions which make it admissible. Informally speaking, admissible controls are all those drifts steering the process between the assigned end states (2) and (12) while ensuring that at any time the Langevin–Smoluchowski dynamics remains well-defined.

Schrödinger [46] considered already in 1931 the problem of controlling a diffusion process between assigned states. His work was motivated by the quest of a statistical interpretation of quantum mechanics. In modern language [17, 43], the problem can be rephrased as follows. Given (2) and (12) and a reference diffusion process, determine the diffusion process interpolating between (2) and (12) while minimizing the value of its Kullback–Leibler divergence (relative entropy) [30] with respect to the reference process. A standard application (appendix A) of Girsanov formula [24] shows that the Kullback–Leibler divergence of (1) with respect to the Wiener process is

 K(P∥Pω)=β2E∫tftιdt∥∂ξtU(ξt,t)∥2 (14)

and denote respectively the measures of the process solution of (1) with drift and of the Wiener process . The expectation value on the right hand side is with respect to as elsewhere in the text. A now well-established result in optimal control theory see e.g. [17, 43] is that the optimal value of the drift satisfies a backward Burgers equation with terminal condition specified by the solution of the Beurling–Jamison integral equations. We refer to [17, 43] for further details. What interest us here is to emphasize the analogy with the problem of minimizing the entropy production in a transition between assigned states.

Several observations are in order at this stage.

The first observation is that also (10) can be directly interpreted as a Kullback–Leibler divergence between two probability measures. Namely, we can write (appendix A)

 K(P∥PR)=β2E∫tftιdt∥v(ξt,t)∥2 (15)

for the path-space measure of the process

 dξt=∂ξtU(ξt,t)dt+√2βdωt (16)

evolving backward in time from the final condition (12) [25, 15].

The second observation has more far reaching consequences for optimal control. The entropy production depends upon the drift of (1) exclusively through the current velocity (8). Hence we can treat the current velocity itself as natural control quantity for (15). This fact entails major simplifications [6]. The current velocity can be thought as deterministic rather than stochastic velocity field (see [41] and appendix B). Thus, we can couch the optimal control of (15) into the problem of minimizing the kinetic energy of a classical particle traveling from an initial position at time and a final position at time specified by the Lagrangian map (13). In other words, entropy production minimization in the Langevin–Smoluchowski framework is equivalent to solve a classical optimal transport problem [50].

The third observation comes as a consequence of the second one. The optimal value of the entropy production is equal to the Wasserstein distance [26] between the initial and final probability measures of the system, see [21] for details. This fact yields a simple characterization of the Landauer bound and permits a fully explicit analysis of the thermodynamics of stylized isochoric micro-engines (see [39] and refs therein).

Finally, the construction of Schrödinger diffusions via optimal control of (14) corresponds to a viscous regularization of the optimal control equations occasioned by the Schrödinger diffusion problem (15).

## 4 Pontryagin’s principle for bounded accelerations

An important qualitative feature of the solution of the optimal control of the entropy production is that the system starts from (2) and reaches (12) with non-vanishing current velocity. This means that the entropy production attains a minimum value when the end-states of the transition are out-of-equilibrium. We refer to this lower bound as the refinement of the second law.

Engineered equilibration transitions are, however, subject to at least two further types of constraints not taken into account in the derivation of the refined second law. The first type of constraint is on the set of admissible controls. For example, admissible controls cannot vary in an arbitrary manner: the fastest time scale in the Langevin–Smoluchowski dynamics is set by the Wiener process. The second type is that end-states are at equilibrium. In mathematical terms, this means that the current velocity must vanish identically at and .

We formalize a deterministic control problem modeling these constraints. Our goal is to minimize the functional

 E=∫tftιdtβ∥νt∥2 (17)

over the set of trajectories generated for any given choice of the measurable control by the differential equation

 ˙χt =νt (18a) ˙νt =αt (18b)

satisfying the boundary conditions

 χtι=q&χtf=ℓ(q) (19)

We dub the dynamical variable running Lagrangian map as it describes the evolution of the Lagrangian map within the control horizon. We restrict the set of admissible controls to those enforcing equilibration at the boundaries of the control horizon

 Missing dimension or its units for \hskip (20)

whilst satisfying the bound

 |α(i)t|≤K(i)(q)(t% f−tι)2∀t∈[tι,tf]∀i=1,…,d (21)

We suppose that the are strictly positive functions of the initial data of the form

 K(i)(q)∝|ℓ(i)(q)−q(i)| (22)

The constraint is non-holonomic inasmuch it depends on the initial data of a trajectory. The proportionality (22) relates the bound on acceleration to the Lagrangian displacement needed to satisfy the control problem.

We resort to Pontryagin principle [34] to find normal extremals of (17). We defer the statement of Pontryagin principle as well as the discussion of abnormal extremals to appendix C. We proceed in two steps. We first avail us of Lagrange multipliers to define the effective cost functional

 A=∫tftιdt(β∥νt∥2+ηt⋅(˙χt−νt)+θt⋅(˙νt−αt))

subject to the boundary conditions (19), (20). Then, we couch the cost functional into an explicit Hamiltonian form

 A=∫tftιdt(ηt⋅˙χt+θt⋅˙νt−H(χt,νt,ηt,θt,αt)) (23)

with

 H(χt,νt,ηt,θt,αt)=ηt⋅νt+θt⋅αt−β∥νt∥2

Pontryagin’s principle yields a rigorous proof of the intuition that extremals of the optimal control equations correspond to stationary curves of the action (23) with Hamiltonian

 H⋆(χt,νt,ηt,θt)=maxα∈AH(χt,νt,ηt,θt,αt)=ηt⋅νt+∑di=1K(i)|θ(i)t|(tf−tι)2−β∥νt∥2

In view of the boundary conditions (19), (20) extremals satisfy the Hamilton system of equations formed by (18a) and

 ˙ν(i)t=∂θtH⋆=K(i)(tf−tι)2sgnθ(i)t (24a) ˙ηt=−∂χtH⋆=0 (24b) ˙θt=−∂νtH⋆=−ηt+2βνt (24c)

In writing (24a) we adopt the convention

 sgn0=0

## 5 Explicit solution in the 1d case

The extremal equations (18a), (24) are time-autonomous and do not couple distinct vector components. It is therefore non-restrictive to focus on the case in the time horizon .

The Hamilton equations are compatible with two behaviors. A “push-region” where the running Lagrangian map variable evolves with constant acceleration

 Missing dimension or its units for \hskip

and a “no-action” region specified by the conditions

 θt=0&−η⋆+2βν⋆=0 (25)

where follows a free streaming trajectory:

 ˙χt=ν⋆

We call switching times the values of corresponding to the boundary values of a no-action region. Switching times correspond to discontinuities of the acceleration . Drawing from the intuition offered by the solution of the unbounded acceleration case, we compose push and no-action regions to construct a single solution trajectory satisfying the boundary conditions. If we surmise that during the control horizon only two switching times occur, we obtain

 Unknown environment '% (26)

which implies

 Missing dimension or its units for \hskip (27)

Self-consistence of the solutions fixes the initial data in (27)

 θ0=βKt21T2sgnθ0

whilst the requirement of vanishing velocity at determines the relation between the switching times

 t2=T+sgnθ0sgnθTt1

Self-consistence then dictates

 sgnθtf=−sgnθt0

We are now ready to glean the information we unraveled by solving (24), to write the solution of (18a)

 Missing dimension or its units for \hskip (28)

The terminal condition on fixes the values of and :

 ℓ(q)=q+K(q)t1(T−t1)T2sgnθt0

The equation for well posed only if

 sgnθt0=sgn(ℓ(q)−q) (29)

The only admissible solution is then of the form

 t1=T2(1−√1−4δ) (30)

The switching time is independent of in view of (22). It is realizable as long as

 Missing dimension or its units for \hskip (31)

The threshold value of correspond to the acceleration needed to construct an optimal protocol consisting of two push regions matched at half control horizon.

### 5.1 Qualitative properties of the solution

Equation (28) complemented by (29) and the realizability bound (31) fully specify the solution of the optimization problem we set out to solve. The solution is optimal because it is obtained by composing locally optimal solutions. Qualitatively, it states that transitions between equilibrium states are possible at the price of the formation of symmetric boundary layers determined by the occurrence of the switching times. For the relative size of the boundary layers is

 t1T=T−t2T≈δ

In the same limit, the behavior of the current velocity far from the boundaries tends to the optimal value of the refined second law [6]. Namely, for we find

 K(q)t1T2sgn(ℓ(q)−q)δ≪1≈K(q)δTsgn(ℓ(q)−q)=ℓ(q)−qT

More generally for any , we can couch (28) into the form

 χt=q+(ℓ(q)−q)×⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩t22t1(T−t1)t∈[0,t1)2t−t12(T−t1)t∈[t1,T−t1](1−(T−t)22t1(T−t1))t∈(T−t1,T] (32)

The use of the value of the switching time to parametrize the bound simplifies the derivation of the Eulerian representation of the current velocity. Namely, in order to find the field satisfying

 νt=v(χt,t) (33)

we can invert (32) by taking advantage of the fact that all the arguments of the curly brackets are independent of the position variable .

We also envisage that the representation (32) may be of use to analyze experimental data when finite measurement resolution may affect the precision with which microscopic forces acting on the system are known.

## 6 Comparison with experimental swift engineering (ESE) protocols

The experiment reported in [36] showed that a micro-sphere immersed in water and trapped in an optical harmonic potential can be driven in finite time from an equilibrium state to another. The probability distribution of the particle in and out equilibrium remained Gaussian within experimental accuracy.

It is therefore expedient to describe more in detail the solution of the optimal control problem in the case when the initial equilibrium distribution in one dimension is normal, i.e. Gaussian with zero mean and variance . We also assume that the final equilibrium state is Gaussian and satisfy (13) with Lagrangian map

 ℓ(q)=σq+h

The parameters and respectively describe a change of the mean and of the variance of the distribution. We apply (13) and (32) for any to derive the minimum entropy production evolution of the probability density. In consequence of (22), the running Lagrangian map leaves Gaussian distributions invariant in form with mean value

 Eξt=h×⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩t22t1(T−t1)t∈[0,t1)(2t−t1)2(T−t1)t∈[t1,T−t1]2t1(T−t1)−(T−t)22t1(T−t1)t∈(T−t1,T] (34)

and variance

 Vξt=⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩(2t1(T−t1)+(σ−1)t2)24βt21(T−t1)2t∈[0,t1)(2(T−t1)+(σ−1)(2t−t1))24β(T−t1)2t∈[t1,T−t1](2t1(T−t1)+(σ−1)(2t1(T−t1)−(T−t)2))24βt21(T−t1)2t∈(T−t1,T] (35)

Finally, we find that the Eulerian representation (33) of the current velocity at is

 v(q,t)=⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩2t(h+q(σ−1))2t1(T−t1)+(σ−1)t2t∈[0,t1)2(h+q(σ−1))2(T−t1)+(σ−1)(2t−t1)t∈[t1,T−t1]2(T−t)(h+q(σ−1))2t1(T−t1)+(σ−1)(2t1(T−t1)−(T−t)2)t∈(T−t1,T] (36)

The foregoing expression allows us to write explicit expressions for the all the thermodynamic quantities governing the energetics of the optimal transition. In particular, the minimum entropy production is

 E∫tf0dtv2(ξt,t)=T(3T−4t1)3(T−t1)2E∞ (37)

with

 E∞=h2β+(σ−1)2βT (38)

the value of the minimum entropy production appearing in the refinement of the second law [6].

In Fig. 1 we plot the evolution of the running average values of the work done on the system, the heat release and the entropy production during the control horizon. In particular, Fig. a illustrates the first law of thermodynamics during the control horizon. A transition between Gaussian equilibrium states occurs without any change in the internal energy of the system. The average heat and work must therefore coincide at the end of the control horizon. The theoretical results are consistent with the experimental results of [36].

## 7 Optimal controlled nucleation and Landauer bound

The form of the bound (22) and running Lagrangian map formula (32) reduce the computational cost of the solution the optimal entropy production control to the determination of the Lagrangian map (13). In general, the conditions presiding to the qualitative properties of the Lagrangian map have been studied in depth in the context of optimal mass transport [50]. We refer to [18] and [21] respectively for a self-contained overview from respectively the mathematics an physics slant.

For illustrative purpose, we revisit here the stylized model of nucleation analyzed in [6]. Specifically, we consider the transition between two equilibria in one dimension. The initial state is described by the symmetric double well:

 pι(q)=Z−1ιexp−β(q2−¯q2)2σ2

In the final state the probability is concentrated around a single minimum of the potential:

 pf(q)=Z−1fexp−β(q−¯q)2((q−¯q)+¯q(3q−¯q))σ2

In the foregoing expressions is a constant ensuring consistency of the canonical dimensions.

We used the ensuing elementary algorithm to numerically determine the Lagrangian map. We first computed the median of the assigned probability distributions and then evaluated first the left and then right branch of the Lagrangian map. For the left branch, we proceeded iteratively in as follows

• We renormalized the distribution restricted to .

• We computed the quantile of the remaining distribution.

• We solved the ODE

 dℓdq=pι(q)pf(ℓ(q))

We skipped Step 3 whenever the difference turned out to be smaller than a given threshold ’resolution’. We plot the results of this computation in Fig. 2.

Once we know the Lagrangian map, we can numerically evaluate the running Lagrangian map (32) and its spatial derivatives. In Fig. 3 we report the evolution of the probability density in the control horizon for two reference values of the switching time.

Fig. 4 illustrates the the corresponding evolution of the current velocity.

The qualitative behavior is intuitive. The current velocity starts and ends with vanishing value, it catches up with the value for , i.e. when the bound on acceleration tends to infinity, in the bulk of the control horizon. There the displacement described by the running Lagrangian map occurs at speed higher than in the case. The overall value of the entropy production is always higher than in the limit. From (32) we can also write the running values of average heat released by the system. The running average heat is

 Extra open brace or missing close brace

and the running average work

 W(t)=∫Rdqpι(q)F(χt(q),t)+∫t0ds∫Rddqpι(q)ν2t(q)

with

 F(χt(q),t)=−∫q0dydχtdy(y)νt(y)−1β∫Rddqpι(q)lndχt(q)dq (39)

The second summand on the right hand side of (39) fixes the arbitrary constant in the Helmholtz potential in the same way as in the Gaussian case.

In Fig. 5 we plot the running average work, heat and entropy production.

## 8 Comparison with the valley method regularization

An alternative formalism to study transitions between equilibrium states in the Langevin–Smoluchowski limit was previously proposed in [7]. As in the present case, [7] takes advantage of the possibility to map the stochastic optimal control problem into a deterministic one via the current velocity formalism. Physical constraints on admissible controls are, however, enforced by adding to the entropy production rate a penalty term proportional to the squared current acceleration. In terms of the entropy production functional (17) we can couch the regularized functional of [7] into the form

 A=E+ετ2∥δχE∥2

stands for the variation of with respect to the running Lagrangian map. The idea behind the approach is the “valley method” advocated by [4] for instanton calculus. The upshot is to approximate field configurations satisfying boundary conditions incompatible with stationary values of classical variational principles by adding extra terms to the action functional. The extra term is proportional to the squared first variation of the classical action. Hence it vanishes whenever there exists a classical field configurations matching the desired boundary conditions. It otherwise raises the order of the time derivative in the problem thus permitting to satisfy extra boundary conditions.

Optimal control problems are well-posed if terminal costs are pure functionals of the boundary conditions. The rationale for considering valley method regularized thermodynamic functionals is to give a non-ambiguous meaning to the optimization of functionals whenever naive formulations of the problem yield boundary conditions or terminal costs as functional of the controls.

Contrasted with the approach proposed in the present work, [7] has one evident drawback and one edge. The drawback is that the quantities actually minimized are not anymore the original thermodynamic functionals. The edge is that the resulting optimal protocol has better analyticity properties. In particular, the running Lagrangian map takes the form

 χt=q+ℓ(q)−qT−2τ√εtanhT2τ√ε⎛⎜⎝t−τ√εsinh2t−T2τ√ε+sinhT2τ√εcoshT2τ√ε⎞⎟⎠ (40)

In fig. a we compare the qualitative behavior of the universal part of the running Lagrangian map predicted by the valley method and by the bound (21) on admissible current accelerations. The corresponding values of the running average entropy production are in fig. b.

The upshot of the comparison is the weak sensitivity of the optimal protocol to the detail of the optimization once the intensity of the constraint on the admissible control (i.e. the current acceleration) is fixed. We believe that this is an important observation for experimental applications (see, e.g., discussion in the conclusions of [27]) as the detail of how control parameters can be turned on and off in general depend on the detailed laboratory setup and on the restrictions by the available peripherals.

## 9 Conclusions and outlooks

We presented a stylized model of engineered equilibration of a micro-system. Owing to explicitly integrability modulo numerical reconstruction of the Lagrangian map, we believe that our model may provide an useful benchmark for the devising of efficient experimental setups. Furthermore extensions of the current model are possible although at the price of some complications.

The first extension concerns the form of the constraint imposed on admissible protocols. Here we showed that choosing the current acceleration constraint in the form (22) greatly simplifies the determination of the switching times. It also guarantees that optimal control with only two switching times exists for all boundary conditions if we allow accelerations to take sufficiently large values. The non-holonomic form of the constraint (21) may turn out to be restrictive for the study of transitions for which admissible controls are specified by given forces. If the current velocity formalism is still applicable to these cases, then the design of optimal control still follows the steps we described here. In particular, uniformly accelerated Lagrangian displacement at the end of the control horizon correspond to the first terms of the integration of Newton law in Peano–Picard series. The local form of the acceleration may then occasion some qualitative differences in the form of the running Lagrangian map. Furthermore, the analysis of the realizability conditions of the optimal control may also become more involved.

A further extension is optimal control when constraints on admissible controls are imposed directly on the drift field appearing in the stochastic evolution equation. Constraints of this type are natural when inertial effects become important and the dynamics is governed by Langevin–Kramers equation in the so-called under-damped approximation. In the Langevin–Kramers framework, finding minimum entropy production thermodynamic transitions requires instead a full-fledged formalism of stochastic optimal control [39]. Nevertheless, it is possible also in that case to proceed in a way analogous to one of the present paper by applying the stochastic version of Pontryagin principle [12, 29, 44].

We expect that considering these theoretical refinements will be of interest in view of the increasing available experimental resolution for efficient design of atomic force microscopes [33, 16].

## 10 Acknowledgments

The authors thank S. Ciliberto for useful discussions. The work of KS was mostly performed during his stay at the department of Mathematics and Statistics of the University of Helsinki. PMG acknowledges support from Academy of Finland via the Centre of Excellence in Analysis and Dynamics Research (project No. 271983) and to the AtMath collaboration at the University of Helsinki.

## Appendix A Evaluation of Kullback–Leibler divergences

Let us consider first the drift-less process

 dξt=√2βdωt (41)

with initial data (2). If we denote by the path-space Wiener measure generated by (41) in , Girsanov formula yields

 dPdPω=exp−β2∫tftι(dξt⋅∂ξtU+dt∥∂ξtU∥22)

The Kullback–Leibler divergence is defined as

 K(P||Pω)=E∫tftιlndPdPω

The expectation value is with respect the measure generated by (1):

 K(P∥Pω)=−β2E∫tftι(dξt⋅∂ξtU+dt∥∂ξtU∥22) =−β2E∫tftι((dξt+dt∂ξtU)⋅∂ξtU−dt∥∂ξtU∥22)

The last expression readily recovers (14) as is a Wiener process with respect to .

To show that the entropy production is proportional to the Kullback–Leibler divergence between the path-space measures of (1) and (16) we observe that

 dPRdPω=expβ2∫tftι(dξt1⋅∂ξtU−dt∥∂ξtU∥22) (42)

The stochastic integral is evaluated in the post-point prescription as the Radon–Nikodym derivative between backward processes must be a martingale with respect the filtration of future event (see e.g. [37] for an elementary discussion). We then avail us of the time reversal invariance of the Wiener process to write

 dPdPR=pι(ξtι)pf(ξtf)exp−β∫tftι(dξt1/2⋅∂ξtU) =exp−β∫tftι(dξt1/2⋅∂ξt(U+1βlnp)+dt∂tlnp)

Finally, the definition

 K(P∥PR)=E∫tftιlndPdPR

recovers (15) since probability conservation entails

 E∂tlnp=0

whilst the properties of the Stratonovich integral [40] yield

 E∫tftιdξt1/2⋅∂ξt(U+1βlnp)=−E∫tftιdt∥v∥2

We refer to e.g. [32, 35, 25, 15] for thorough discussions of the significance and applications of the entropy production in stochastic models of non-equilibrium statistical mechanics and to [22, 23] for applications to non-equilibrium fluctuating hydrodynamics and granular materials.

## Appendix B Current velocity and acceleration in terms of the generator of the stochastic process

The current velocity is the conditional expectation along the realizations of (1) of the time symmetric conditional increment

 Missing or unrecognized delimiter for \Big

A relevant feature of the time symmetry is that the differential can be regarded as the result of the action of a generator including only first order derivatives in space:

 v(ξt,t)=¯\mathdsDξtξt

where

 ¯\mathdsDξt:=\mathdsDξt+\mathdsD∗ξt2 (43)

On the right hand side of (43) there appear the scalar generator of (1)

 \mathdsDq=∂t−(∂qU)(q,t)⋅∂q+1β∂2q

and the generator of the dual process conjugated by time-reversal of the probability density in [41, 40]:

 \mathdsD∗q=∂t−(∂qU+2β∂qlnp)(q,t)⋅∂q−1β∂2q

The arithmetic averages of these generators readily defines a first order differential operator as in the deterministic case. Analogously, we define the current acceleration as

 a(q,t)=limτ↓0E(v(ξt+τ,t+τ)−v(ξt−τ,t−τ)∣∣ξt=q)2τ

or equivalently

 αt=a(ξt,t)=¯\mathdsD2ξtξt

## Appendix C Pontryagin principle

We recall the statement of Pontryagin’s principle for fixed time and fixed boundary conditions [2, 34].

Maximum Principle: Let the functional (44) be subject to the dynamical constraint (45) and the endpoint constraints with the parameter belonging for fixed to a set , the variable taking values in or in a open subset of and the time interval fixed. A necessary condition for a function and a corresponding solution of (45) to solve the minimization of (44) is that there exist a function t and a constant such that (non-triviality condition) for each fixed (maximum condition) obey the equations (Hamilton system condition) The proof of the maximum principles requires subtle topological considerations culminating with the application of Brouwer’s fixed point theorem. The maximum principle has, nevertheless, an intuitive content. Namely, we can reformulate the problem in an extended configuration space by adding the ancillary equation

 ˙ζt=L(ξt,πt,t) (46a) ζtι=0 (46b)

and looking for stationary point of the action functional

 ~A=ζtf+∫tftιdt(πt⋅˙ξt+ϕt˙ζt−(πt⋅b(ξt,αt,t)+ϕtL(ξt,αt,t)))

Let us make the simplifying assumption that any pair of trajectory and control variables satisfying the boundary have a non-empty open neighborhood where linear variations are well defined. Looking for a stationary point of (44) entails considering variations of under the constraints . Then it follows immediately that the stationary value of the Lagrange multiplier must satisfy

 ˙¯ϕt=0

This observation clarifies why the maximum principle is stated for some constant such that . In particular, if we can always rescale it to and recover familiar form of the Hamilton equations. Moreover, the Maximum principle coincides with the Hamilton form of the stationary action principle if and is quadratic in . If instead there exist stationary solutions for , they describe abnormal controls.

Abnormal control do not occur in the optimization problem considered in the main text. In the push regions where the acceleration is non-vanishing abnormal control drive the Lagrange multiplier away from zero. Thus, they are not compatible with the occurrence of switching times between push and no-action regions. Looking for abnormal control in the no-action region yields the requirement that all Lagrange multipliers vanish against the hypothesis of the maximum principle.

### References

1. R. Aebi. Schrödinger Diffusion Processes. Probability and its applications. Birkhäuser, 1996.
2. A. A. Agrachev and Y. Sachkov. Control Theory from the Geometric Viewpoint, volume 2 of Encyclopaedia of mathematical sciences: Control theory and optimization. Springer, 2004.
3. A. Alemany, M. Ribezzi, and F. Ritort. Recent progress in fluctuation theorems and free energy recovery. AIP Conference Proceedings, 1332(1):96–110, 2011, 1101.3174.
4. H. Aoyama, H. Kikuchi, I. Okouchi, M. Sato, and S. Wada. Valley Views: Instantons, Large Order Behaviors, and Supersymmetry. Nuclear Physics B, 553(3):644–710, August 1999, hep-th/9808034.
5. M. Arnaudon, A. B. Cruzeiro, C. Léonard, and J.-C. Zambrini. An entropic interpolation problem for incompressible viscid fluids. eprint arXiv, April 2017, 1704.02126.
6. E. Aurell, K. Gawȩdzki, C. Mejía-Monasterio, R. Mohayaee, and P. Muratore-Ginanneschi. Refined Second Law of Thermodynamics for fast random processes. Journal of Statistical Physics, 147(3):487–505, April 2012, 1201.3207.
7. E. Aurell, C. Mejía-Monasterio, and P. Muratore-Ginanneschi. Boundary layers in stochastic thermodynamics. Physical Review E, 85(2):020103(R), Februray 2012, 1111.2876.
8. J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375â393, January 2000.
9. C. H. Bennett. The thermodynamics of computation â A review. International Journal of Theoretical Physics, 21(12):905–940, 1982.
10. L. Bertini, A. D. Sole, D. Gabrielli, G. Jona-Lasinio, and C. Landim. Macroscopic fluctuation theory. Reviews of Modern Physics, 87:593–636, April 2015, 1404.6466.
11. A. Bérut, A. Arakelyan, A. Petrosyan, S. Ciliberto, R. Dillenschneider, and E. Lutz. Experimental verification of Landauer’s principle linking information and thermodynamics. Nature, 483:187â189, March 2012.
12. J.-M. Bismut. An Introductory Approach to Duality in Optimal Stochastic Control. SIAM Review, 20(1):62–78, 1978.
13. V. Blickle and C. Bechinger. Realization of a micrometre-sized stochastic heat-engine. Nature Physics, 8(2):143–146, December 2011.
14. Y. Brenier, U. Frisch, M. Hénon, G. Loeper, S. Matarrese, R. Mohayaee, and A. Sobolevskiǐ. Reconstruction of the early Universe as a convex optimization problem. Monthly Notices of the Royal Astronomical Society, 346(2):501–524, Nov 2003, astro-ph/0304214.
15. R. Chétrite and K. Gawȩdzki. Fluctuation relations for diffusion processes. Communications in Mathematical Physics, 282(2):469–518, September 2008, 0707.2725.
16. A. L. Cunuder, I. Martinez, A. Petrosyan, D. GuÃ©ry-Odelin, E. Trizac, and S. Ciliberto. Fast equilibrium switch of a micro mechanical oscillator. Applied Physics Letters, 109(11):113502, September 2016, 1606.02453.
17. P. Dai Pra. A stochastic control approach to reciprocal diffusion processes. Applied Mathematics and Optimization, 23(1):313–329, 1991.
18. G. De Philippis and A. Figalli. The MongeâAmpère equation and its link to optimal transportation. Bulletin of the American Mathematical Society, 51(4):527â580, May 2014, 1310.6167.
19. R. Dillenschneider and E. Lutz. Memory Erasure in Small Systems. Physical Review Letters, 102(21):210601, May 2009, 0811.4351.
20. I. Fényes. Eine wahrscheinlichkeitstheoretische Begründung und Interpretation der Quantenmechanik. Zeitschrift für Physik, 132(1):81–106, February 1952.
21. K. Gawȩdzki. Fluctuation Relations in Stochastic Thermodynamics. Lecture notes, 2013, 1308.1518.
22. G. Gradenigo, A. Puglisi, and A. Sarracino. Entropy production in non-equilibrium fluctuating hydrodynamics. Journal of Chemical Physics, 137(1):014509–014509, July 2012, 1205.3639.
23. G. Gradenigo, A. Puglisi, A. Sarracino, D. Villamaina, and A. Vulpiani. Out-of-equilibrium generalized fluctuation-dissipation relations. In R. Klages, W. Just, and C. Jarzynski, editors, Nonequilibrium Statistical Physics of Small Systems: Fluctuation Relations and Beyond, chapter 9. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, Germany, February 2013, 1203.4941.
24. K. Jacobs. Stochastic Processes for Physicists. Understanding Noisy Systems. Cambridge University Press, September 2010.
25. D.-Q. Jiang, M. Qian, and M.-P. Qian. Mathematical Theory of Nonequilibrium Steady States, volume 1833 of Lecture Notes in Mathematics. Springer, 2004.
26. R. Jordan, D. Kinderlehrer, and F. Otto. The Variational Formulation of the Fokker–Planck Equation. SIAM Journal on Mathematical Analysis, 29(1):1–17, January 1998.
27. Y. Jun, M. Gavrilov, and J. Bechhoefer. High-precision test of Landauer’s principle in a feedback trap. Physical Review Letters, 113:190601, November 2014, 1408.5089.
28. J. V. Koski, V. F. Maisi, J. P. Pekola, and D. V. Averin. Experimental realization of a Szilard engine with a single electron. Proceedings of the National Academy of Sciences of the United States of America, 111(38):13786â13789, September 2014, 1402.5907.
29. P. Kosmol and M. Pavon. Lagrange approach to the optimal control of diffusions. Acta Applicandae Mathematicae, 32:101–122, 1993. 10.1007/BF00998149.
30. S. Kullback and R. Leibler. On Information and Sufficiency. Annals of Mathematical Statistics, 22(1):79–86, 1951.
31. R. Landauer. Irreversibility and heat generation in the computing process. IBM Journal of Research and Development, 5:183–191, 1961.
32. J. L. Lebowitz and H. Spohn. A Gallavotti-Cohen Type Symmetry in the Large Deviation Functional for Stochastic Dynamics. Journal of Statistical Physics, 95(1):333–365, March 1999, cond-mat/9811220.
33. S. Liang, D. Medich, D. M. Czajkowsky, S. Sheng, J.-Y. Yuan, and Z. Shao. Thermal noise reduction of mechanical oscillators by actively controlled external dissipative forces. Ultramicroscopy, 84(1-2):119–125, July 2000.
34. D. Liberzon. Calculus of Variations and Optimal Control Theory. A Concise Introduction. Princeton University Press, 2012.
35. C. Maes, F. Redig, and A. V. Moffaert. On the definition of entropy production, via examples. Journal of Mathematical Physics, 41(3):1528–1554, March 2000.
36. I. A. Martínez, A. Petrosyan, D. Guéry-Odelin, E. Trizac, and S. Ciliberto. Engineered swift equilibration of a Brownian particle. Nature Physics, 12:843â846, May 2016.
37. P.-A. Meyer. Géométrie différentielle stochastique, II. Séminaire de probabilités de Strasbourg, S16:165–207, 1982.
38. P. Muratore-Ginanneschi. On the use of stochastic differential geometry for non-equilibrium thermodynamics modeling and control. Journal of Physics A: Mathematical and General, 46(27):275002, June 2013, 1210.1133.
39. P. Muratore-Ginanneschi and K. Schwieger. How nanomechanical systems can minimize dissipation. Physical Review E, 90(6):060102(R), December 2014, 1408.5298.
40. E. Nelson. Quantum fluctuations. Princeton series in Physics. Princeton University Press, 1985.
41. E. Nelson. Dynamical Theories of Brownian Motion. Princeton University Press, 2nd edition, 2001.
42. H. Qian. Mesoscopic nonequilibrium thermodynamics of single macromolecules and dynamic entropy-energy compensation. Physical Review E, 65(1):016102, December 2001, cond-mat/0110042.
43. S. Roelly and M. Thieullen. A characterization of Reciprocal Processes via an integration by parts formula on the path space. Probability Theory and Related Fields, 123:97–120, 2002.
44. L. C. G. Rogers. Duality in constrained optimal investment and consumption problems: a synthesis. In P. Bank, F. Baudoin, R. Carmona, H. Föllmer, L. C. G. Rogers, N. Touzi, and M. Soner, editors, Paris-Princeton Lectures on Mathematical Finance, volume 1814 of Lecture Notes in Mathematics, pages 95–131. Springer, June 2001 2003. Lectures from the CIRANO Workshop on Mathematical Financial Mathematics and Econometrics.
45. J. Roßnagel, O. Abah, F. Schmidt-Kaler, K. Singer, and E. Lutz. Nanoscale Heat Engine Beyond the Carnot Limit. Physical Review Letters, 112(3):030602, January 2014, 1308.5935.
46. E. Schrödinger. Über die Umkehrung der Naturgesetze. Sitzungsberichte der preussischen Akademie der Wissenschaften, physikalische mathematische Klasse, 8(9):144–153, 1931.
47. K. Sekimoto. Langevin Equation and Thermodynamics. Progress of Theoretical Physics Supplement, 130:17–27, 1998.
48. C. E. Shannon. A Mathematical Theory of Communication. The Bell System Technical Journal, 27:379â423, 623â656, July-October 1948.
49. E. H. Trepagnier, C. Jarzynski, F. Ritort, G. E. Crooks, C. J. Bustamante, and J. Liphardt. Experimental test of Hatano and Sasa’s nonequilibrium steady-state equality. Proceedings of the National Academy of Sciences, 101:15038–15041, 2004.
50. C. Villani. Optimal transport: old and new, volume 338 of Grundlehren der mathematischen Wissenschaften. Springer, 2009.
51. R. Zwanzig. Nonequilibrium statistical mechanics. Oxford University Press, 2001.