A dynamic domain decompositionfor a class of second order semi-linear equations

A dynamic domain decomposition for a class of second order semi-linear equations

Simone Cacace and Maurizio Falcone
Abstract

We propose a parallel algorithm for the numerical solution of a class of second order semi-linear equations coming from stochastic optimal control problems, by means of a dynamic domain decomposition technique. The new method is an extension of the patchy domain decomposition method presented in [6] for first order Hamilton-Jacobi-Bellman equations related to deterministic optimal control problems. The semi-Lagrangian scheme underlying the original method is modified in order to deal with (possibly degenerate) diffusion, by approximating the stochastic optimal control problem associated to the equation via discrete time Markov chains. We show that under suitable conditions on the discretization parameters and for sufficiently small values of the diffusion coefficient, the parallel computation on the proposed dynamic decomposition is faster than that on a static decomposition. To this end, we combine the parallelization with some well known techniques in the context of fast-marching-like methods for first order Hamilton-Jacobi equations. Several numerical tests in dimension two are presented, in order to show the features of the proposed method.

Key words.

AMS subject classifications.

1 Introduction

Domain decomposition techniques have become popular in the solution of partial differential equations arising in several applied contexts, including fluid dynamics and electromagnetism.

Given a decomposition of a domain in subsets of manageable size and prescribed suitable transmission conditions at the interfaces or overlapping regions between the sub-domains, massive parallel computations can be performed exploiting the computing power of modern clusters of CPUs and/or GPUs. This is the most common strategy to attack the so called curse of dimensionality, a quite general term to denote the difficulties related to the numerical approximation of problems in high dimension, both in terms of memory requirements and computational efforts. New efficient and accurate parallel algorithms are increasingly required, to open the way to the solution of real-life problems with a very huge number of degrees of fredom.

Despite such techniques have been mainly designed, analyzed and implemented in the framework of variational methods for elliptic and parabolic equations, in the last decades an increasing interest has also emerged in fields more related to hyperbolic equations, e.g. optimal control problems, differential games, front propagation and image processing. In this setting, a leading role is played by the theory of viscosity solutions for Hamilton-Jacobi equations, representing a solid theoretical background for a growing number of numerical methods for the solution of problems in robotics, aeronautics, electrical and aerospace engineering.

In [6] the authors proposed the patchy domain decomposition method, a parallel algorithm for the solution of deterministic optimal control problems, based on a semi-Lagrangian discretization of the corresponding first order Hamilton-Jacobi-Bellman equations. The main idea there can be summarized as follows. First the solution of the equation is computed on a grid which is coarse with respect to the actual (fine) grid. Then, an approximation of the optimal vector field associated to the underlying control problem is synthesized on the fine grid. Finally, a decomposition of the fine grid is dynamically build driven by the approximate optimal vector field, and the equation is solved in parallel on the corresponding sub-domains. This pre-computation has clearly a cost and produces a rather complex subdivision of the computational box, but gives the fundamental property that each sub-domain is, up to an error, independent on the others. This feature allows for an efficient parallelization, since transmission conditions at the interfaces of the sub-domains can be completely avoided. The computational resources can be better employed, so that no processor remains idle or computes useless information, thus saving CPU time to reach convergence. Despite the pre-computation step and the non trivial implementation, the overall performance of the patchy decomposition overcomes that of a static domain decomposition.

Another technique widely used to reduce the computational efforts when computing the solution to first order Hamilton-Jacobi equations, is to exploit the so called causality property. This term refers to a peculiarity of hyperbolic equations, namely the fact that, starting from the boundary data, information propagates in the domain along characteristics at finite speed. Fast-marching methods [18, 15, 16] have been developed trying to reproduce this feature at the discrete level. The main idea is to process the grid nodes in a suitable order that decouples, at least partially, the nonlinear system corresponding to the discretization of the equation. Moreover, the computation is localized, at each iteration, on the nodes that bring the relevant information. Due to the causality property, these nodes are very few, compared to the size of the whole grid. In this way, the solution can be computed in a cascade fashion, accelerating the convergence of the underlying scheme significantly. It turns out that each node converges in a predetermined and finite number of iterations, only one iteration in the most favorable cases. For this reason fast-marching methods are also termed single-pass methods.

Keeping these ideas in mind, in this paper we revisit the patchy domain decomposition in the context of second order semi-linear equations. We try to extend the main features of the method to a class of problems where diffusion appears, namely stationary nonlinear Hamilton-Jacobi-Bellman equations coming from stochastic optimal control problems. A prototype problem is the following:

 \vspace−9pt{−εΔu(x)+maxa∈A{−f(x,a)⋅∇u(x)}=l(x)x∈Ωu(x)=g(x)x∈∂Ω

where is the diffusion coefficient, is a compact set of admissible controls, is the controlled dynamics driving the system (aka the controlled advection), is a source term and is a boundary datum.

The semi-Lagrangian local solver of the original patchy method has to be modified in order to deal with diffusion. Indeed, the presence of diffusion invalidates the connection with deterministic optimal control, in the sense that characteristic curves associated to the hyperbolic equation are no longer well defined in this more general setting. Nevertheless, a weak notion of characteristics can be still provided via stochastic differential equations, interpreting the diffusion as a Wiener process. Then, a semi-Lagrangian discretization of the corresponding semi-linear equation can be still performed, approximating the stochastic process via discrete time Markov chains. The resulting scheme is able to compute the solution of equations with very degenerate diffusions. This is a very delicate problem in the community working on advection-diffusion equations, especially in cases of interest where the diffusion is much smaller than the advection (e.g. Navier-Stokes equations in fluid dynamics). Indeed, it is well known that a degeneracy in the diffusion translates into a lack of coercivity of the variational form associated to the equation. In particular, discretizations based on finite elements need to be stabilized with ad-hoc procedures, in order to capture the boundary or internal layers that typically arise in such equations.

In this more general setting the application of the patchy domain decomposition is not trivial. The main obstacle is that the transmission conditions at the interfaces cannot be avoided, due again to the presence of diffusion. A natural question arises: is the patchy domain decomposition still favorable than a static domain decomposition? We show that the answer to this question depends on the ratio between the controlled advection and the diffusion coefficient, a characteristic quantity similar to the Reynolds number for Navier-Stokes equations. More precisely, we show that if the controlled advection dominates diffusion and the discretization parameters are chosen in a suitable way, then a parallel computation on the dynamic decomposition can still be faster than that on a static decomposition. To reach this achievement, we need to exploit all the information collected in the pre-computation step of the dynamic decomposition. In particular, we try to employ, despite the diffusion, the causality property of hyperbolic equations discussed above. Further technical modifications should be applied to the local solver, in order to remove from the scheme the dependency of each node on itself, an issue that dramatically breaks down the causality. This results in a remarkable speedup for the computation, even more substantial if combined with a parallel algorithm.

The paper is organized as follows: in Section 2 we briefly review, for the readers convenience, the patchy domain decomposition method for first order Hamilton-Jacobi-Bellman equations proposed in [6]. Section 3 is devoted to the extension of the semi-Lagrangian local solver to second order semi-linear equations, in a form suitable for our purposes. In Section 4, we show how to adapt the patchy decomposition method to the more general setting of equations with diffusion, in particular we establish conditions on the parameters that guarantee the effectiveness of the dynamic domain decomposition for the parallel computation. Finally, in Section 5, we present several numerical tests in dimension two, in order to show the performance of the proposed method compared to its static version.

2 Patchy decomposition for first order HJB equations

In this section we review the patchy domain decomposition method, proposed in [6] to solve boundary value problems for first order Hamilton-Jacobi equations of the form

 {maxa∈A{−f(x,a)⋅∇u(x)−l(x,a)}=0x∈Ωu(x)=g(x)x∈∂Ω \hb@xt@.01(2.1)

where is an open set and is a compact set. Moreover, is a vector field and , , are scalar functions.

It is well known that equation can be interpreted, via the celebrated dynamic programming principle, as the Hamilton-Jacobi-Bellman equation associated to a suitable deterministic optimal control problem. Indeed, let us consider the following controlled dynamical system

 {˙y(t)=f(y(t),α(t)),t>0y(0)=x \hb@xt@.01(2.2)

where is the state of the system and is a generic function belonging to the following set of admissible controls:

 A={α:[0,+∞)→A, measurable}.

We denote by the solution to starting from using the control , and we define the first time arrival to the boundary as

 τ(x,α(⋅))=inf{t≥0:y(t;x,α(⋅))∈∂Ω}.

Finally, we consider the following functional:

 J(x,α(⋅))=∫τ(x,α(⋅))0l(y(s;x,α(⋅)),α(s))ds+g(y(τ(x,α(⋅)))).

In this setting, the minimum time problem with running cost and exit cost consists in finding, for each , an optimal control that minimizes among all the admissible controls. Under suitable assumptions on the data, it can be proved that the value function of the problem, i.e.

 u(x)=infα(⋅)∈AJ(x,α(⋅))x∈Ω

is the unique viscosity solution to . We refer the reader to [1] for the details.

The main advantage of this approach is that, once the value function is computed, we can quite easily synthesize an optimal control for the minimum time problem, by taking

 a∗(x)=argmina∈A{f(x,a)⋅∇u(x)+l(x,a)}x∈Ω. \hb@xt@.01(2.3)

Note that is in feedback form, namely it depends only on the the state an not on the time. It follows that, for each , we can compute the optimal trajectory starting at (i.e. a characteristic curve of the hyperbolic equation ) by simply plugging in the dynamical system :

This is not the case for other types of techniques, based on the Pontryagin maximum principle, which provide only necessary conditions for optimality and sub-optimal open-loop controls (i.e. controls that depend on time).

Unfortunately, from the numerical point of view, the Hamilton-Jacobi approach implies severe computational efforts, since it requires the computation of the value function on the whole state space . Considering that industrial applications demand the solution of optimal control problems at least in dimension six (as for the most simple second order dynamics in with controlled acceleration), this approach still suffers the curse of dimensionality mentioned in the Introduction. For this reason, the development of efficient parallel algorithms for Hamilton-Jacobi equations is nowadays an active field of research. The patchy domain decomposition method places itself exactly in this context.

The semi-Lagrangian discretization of equation is postponed to the next section, in a generalized form that recovers first order equations as a particular cases. The interested reader can refer to [6] for the original version of the scheme. Here, we prefer to keep the discussion free from technical details, and mainly focus on the ideas behind the construction of the parallel algorithm.

We consider two different discretizations and of the state space . The grid denotes the actual grid on which we want to solve the problem , also termed the fine grid, whereas is very coarse compared to (and possibly contained in) .

The first step of the method consists in computing a coarse solution on , also via parallel computations using a standard (static) domain decomposition technique. Due to the low resolution of the grid, this is a very cheap operation, but gives a first rough approximation on , say , of the actual solution . It is reconstructed by interpolation of on .

Now we employ to compute a feedback optimal control , via an appropriate discrete version of the synthesis procedure , presented in Section LABEL:PD2. We stress that is just a coarse approximation of the actual optimal control , but it is defined on the fine grid . This is enough to start the construction of the patchy domain decomposition.

We divide the boundary in a fixed number of disjoint sub-sets, denoted by , with . Then, for each , we compute the sub-domain as the numerical domain of dependence of through the optimal dynamics . Let us clarify this point in a continuous setting and postpone the actual implementation to Section LABEL:PD2. We denote by the characteristic function of the sub-boundary , and consider the following Cauchy-Dirichlet problem for the advection equation:

 \hb@xt@.01(2.4)

It is well known that the boundary datum acts as a source of information, that flows (backward) in along characteristics, according to the drift . The limit in time

 ϕ∞p(x)=limt→+∞ϕp(x,t)

is still a characteristic function, since the hyperbolic equation preserves the properties of (e.g. the maximum and the singularities). Then, we define the -th patch of our dynamic decomposition as

 Ωp={x∈Ω:ϕ∞p(x)=1}.

We remark that each patch is a bundle of characteristics enjoying, by construction, the fundamental property of being invariant with respect to the optimal dynamics, i.e. . This is not completely true, since is built using only the coarse control . Moreover, at the discrete level, the projection of the patches on the grid introduces an additional error, in particular if the dynamics defines very bended characteristic curves. Figure LABEL:patchy-decomposition shows the patchy decomposition for two test dynamics, in dimension two and three respectively. Note that, by construction, the patches do not overlap, sharing only sharp interfaces.

The next step is the parallel computation on the patches. This can be done avoiding completely the transmission conditions, exploiting the invariance property just discussed above. Since this feature is weakened at the discrete level, it can be enforced in the computation by imposing state constraint conditions at the interfaces.

Finally, all the solutions are merged together, producing a solution on the whole domain . As expected, this patchy solution is slightly different from that computed using a standard domain decomposition method. Nevertheless, the error is localized at the interfaces between the patches and does not propagate in the interior of the sub-domains, provided that the grid is not too much coarse compared to the fine grid . This is shown in [6] by numerical evidence, but a precise error estimate is still missing.

Despite the small errors, the absence of transmission conditions can give to the parallelization a remarkable speedup. In this respect a key role is played by the relative sizes of the patches. Indeed, we remark that the construction of the patchy decomposition is completely driven by the dynamics of the optimal control problem. Hence, even a subdivision of the boundary in sub-sets of the same size can produce a highly unbalanced domain decomposition. In these cases the performance of the parallelization is very poor, since the processors associated to the smaller patches complete their job (and remain idle) much earlier than those corresponding to the larger patches. This drawback was pointed out in [6] and can be overcome via a multi-level technique, which is currently under development. The idea is to alternate the construction of the patchy decomposition with the computation of the patchy solution. More precisely, one can start and continue the construction of the patches as long as they have about the same size, obtaining a first level of balanced sub-domains. The solution is then computed on the first level sub-decomposition. The new boundaries (possibly divided again in sub-sets of the same size) are employed to start and build the second level sub-decomposition. Moreover, the values of the first level solution at the corresponding points (correct values due to the causality property of hyperbolic equations) are assigned as boundary data for the computation of the second level solution. This procedure is then iterated and terminates when the sub-decompositions cover the whole domain .

3 A semi-Lagrangian scheme for second order HJB equations

In this section we present a semi-Lagrangian discretization for a special class of stationary second order semi-linear equations. The resulting scheme will be employed as local solver for the extension of the patchy method to this more general setting, as discussed in the next section.

The prototype problem we have in mind is the following boundary value problem for second order semi-linear equations, written in a control theory perspective:

 {maxa∈A{−D(x,a)u(x)−l(x,a)}=0x∈Ωu(x)=g(x)x∈∂Ω \hb@xt@.01(3.1)

where is an open set, , is a compact set and the second order differential operator is given by

 D(x,a)=12n∑i,j=1(d∑k=1σik(x,a)σjk(x,a))∂2∂xi∂xj+n∑i=1fi(x,a)∂∂xi

with , and , .

The crucial point here is that the connection with deterministic optimal control, depicted in the previous section, is lost due to the presence of the diffusion term . Indeed, in this case, characteristic curves are no longer well defined by the system of controlled ordinary differential equations . Nevertheless, a weak notion of characteristics is still available, interpreting these curves as generalized trajectories, namely solutions to the following system of controlled stochastic differential equations with dynamics :

 {dX(t)=f(X(t),α(t))dt+σ(X(t),α(t))dW(t),t>0X(0)=x \hb@xt@.01(3.2)

Here is a progressively measurable process, representing the state of a system evolving in starting from , the process is the control applied to the system at the time with values in the control set and is a -dimensional Wiener process. We define the set of admissible controls

 A={α:[0,+∞)→A, progressively measurable}

and we denote by the solution to starting from using the control . Finally, we define the first time arrival to the boundary as

 τ(x,α(⋅))=inf{t≥0:X(t;x,α(⋅))∈∂Ω}

and we consider the following cost functional ( stands for the probabilistic expectation):

 J(x,α(⋅))=E{∫τ(x,α(⋅))0l(X(s;x,α(⋅)),α(s))ds+g(X(τ(x,α(⋅))))}.

In this setting, the unique viscosity solution to the problem can be interpreted as the value function of the following stochastic minimum time problem with running cost and exit cost :

 u(x)=infα(⋅)∈AJ(x,α(⋅))x∈Ω.

Typical assumptions on , , and for the well posedness of the problem is boundedness and Lipschitz continuity in space uniformly in the control. We refer the interested reader to [7] and the references therein for further explanations and details.

Following [7], the semi-Lagrangian discretization of equation can be performed introducing a time step , interpreting the first order term in the operator as a directional derivative of along the dynamics and approximating the Wiener process in the stochastic differential equation by means of discrete time Markov chains. We introduce in the state space a structured grid with uniform step in each coordinate direction and nodes for . This leads to the following scheme, which is a nonlinear system in the form of a fixed point operator:

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩xa,si,k=xi+hf(xi,a)+s√hσk(xi,a)s=±1,k=1,...,du(xi)=mina∈A{12dd∑k=1∑s=±1u(xa,si,k)+hl(xi,a)}xi∈G∩Ωu(xi)=g(xi)xi∈G∩∂Ω \hb@xt@.01(3.3)

where denotes the -th row of . The evaluation of the operator is done by direct comparison discretizing the control set with points. Moreover, as usual in semi-Lagrangian approximations, the points do not lie in general on the grid and the values of at the these points need to be reconstructed by interpolation. A natural choice here is the bilinear interpolation, as shown in Figure LABEL:bilinear.

We choose a time step such that, for each node and every control , the points fall in the first neighboring cells of . The reason is twofold. First, it is easy to prove that the semi-Lagrangian scheme is consistent to equation only with order and we do not want to get a too poor accuracy in the approximation of the trajectories. Second and more important, we want to keep the stencil of the scheme strictly localized, in order to reduce the computational efforts.

Now, let us denote by , , , the four points involved in the reconstruction of the value , with weights , , , respectively. Without loss of generality, we can assume that for each and . It follows that

 u(xa,si,k)=3∑p=0λa,sip,ku(xa,sip,k)=λa,si0,ku(xi)+3∑p=1λa,sip,ku(xa,sip,k).

Substituting this expression in the scheme we get

 u(xi)=mina∈A{12dd∑k=1∑s=±1λa,si0,ku(xi)+12dd∑k=1∑s=±13∑p=1λa,sip,ku(xa,sip,k)+hl(xi,a)}, \hb@xt@.01(3.4)

which shows explicitly the dependency of on itself.

As explained in the Introduction, in the hyperbolic case () relevant information starts from the boundary of the domain, and propagates along characteristics backward in time at finite speed. At the discrete level, we can try to mimic this causality property, by means of an appropriate ordering of the grid nodes that tracks the front of information. In this way, we can get a cascade effect that decouples, at least partially, the nonlinear system, drastically reducing the number of iterations to reach convergence. Despite this property does not hold in the continuous case in presence of diffusion (), we can still have a numerical causality if some suitable condition on the parameters is satisfied. We will come back on this important point in the next section.

Here the key argument is that we have to remove in the self-dependency on , which makes the scheme strongly iterative by construction. This is a really crucial step, if we hope to benefit of the speedup induced by the causality property. To this end, we denote by

 Λ0(a)=12dd∑k=1∑s=±1λa,si0,k,Λ1(a)=12dd∑k=1∑s=±13∑p=1λa,sip,ku(xa,sip,k)+hl(xi,a),

so that

 u(xi)=mina∈A{Λ0(a)u(xi)+Λ1(a)}. \hb@xt@.01(3.5)

We remark that and do not depend on the value . Moreover, we note that for all , since the interpolation weights can be simultaneously equal to only if the time step . Then, it is well defined the value

 v(xi)=min¯a∈A{Λ1(¯a)1−Λ0(¯a)} \hb@xt@.01(3.6)

and we claim that . This kind of explicitation is trivial in the linear case without diffusion (i.e. for not depending on the control and ), but it is less evident (and to our knowledge also quite surprising) in the general nonlinear case. So let be the control achieving the minimum in , i.e.

 u(xi)=Λ0(a∗)u(xi)+Λ1(a∗).

This implies

 u(xi)=Λ1(a∗)1−Λ0(a∗)≥min¯a∈A{Λ1(¯a)1−Λ0(¯a)}=v(xi).

Conversely, let be the control achieving the minimum in , i.e.

 v(xi)=Λ1(¯a∗)1−Λ0(¯a∗).

This implies

 u(xi)≤Λ0(¯a∗)u(xi)+Λ1(¯a∗).

Since , we get

 u(xi)≤Λ1(a∗)1−Λ0(a∗)=v(xi).

Finally, we obtain the following modified scheme

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩xa,si,k=xi+hf(xi,a)+s√hσk(xi,a)s=±1,k=1,...,du(xi)=mina∈A⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩12dd∑k=1∑s=±13∑p=1λa,sip,ku(xa,sip,k)+hl(xi,a)1−12dd∑k=1∑s=±1λa,si0,k⎫⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎬⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎭xi∈G∩Ωu(xi)=g(xi)xi∈G∩∂Ω \hb@xt@.01(3.7)

where the value at each node now depends only on values at nodes different from . In Section LABEL:experiments we compare this scheme with the original scheme , showing the effectiveness of the modification in terms of iterations to reach convergence.

We conclude this section by remarking that the proposed semi-Lagrangian scheme can handle by construction very degenerate diffusions. Indeed, at points where the diffusion coefficient , we naturally recover the semi-Lagrangian scheme presented in [6] for first order Hamilton-Jacobi-Bellman equations (see also the numerical experiments in Section LABEL:experiments). This is not the case for other types of discretization, e.g. finite elements, where the degeneracy in the diffusion produces instabilities that need to be treated with very specific techniques. This is due to the fact that the variational forms associated to the equations under consideration suffer a lack of coercivity. We refer the interested reader to [12] for details and insights on this topic.

4 Patchy decomposition for second order semi-linear equations

In this section we aim to extend the patchy decomposition method to the class of second order semi-linear equations presented in the previous section.

The main idea is really simple: we first compute the patchy domain decomposition in the hyperbolic case with . Then we solve in parallel the full equation with using the patchy decomposition.

At a first look this attempt could seem meaningless. Indeed, due to the diffusion, information spreads instantaneously from the boundary to the whole domain and then it cannot be confined in independent sub-domains, as for the first order case. This implies that, in order to compute the correct solution in this more general setting, we cannot replace the transmissions between the patches with state constraint boundary conditions, as discussed in Section LABEL:PD1. So we loose the main advantage of the patchy decomposition compared to an arbitrary and static decomposition. Moreover, we recall that the preliminary step in the computation of the dynamic decomposition results in an additional cost in terms of CPU time. The reader can easily convince himself that, in presence of transmission conditions, the performance of a parallel computation does not significantly depend, in general, on the particular shape of the sub-domains, but only on their number. As pointed out in [6], if we assume that the sub-domains have about the same size, then the overall time consumption to compute the solution almost exclusively depends on the number of processors involved in the computation and the transmission delays.

Nevertheless, we show through the next sections that the patchy method with transmission conditions can still be competitive if we combine two different features, described in the following sub-sections.

4.1 The upwind diffusion ball condition

Here we present the first ingredient for the extension of the patchy method to second order semi-linear equations. In particular we show that, under suitable relations between the parameters, the discretization of equation behaves in a sense more like hyperbolic than elliptic.

To simplify the presentation we consider a special case in dimension two. Let and take of the form

 σ=√2ε(1001) \hb@xt@.01(4.1)

Equation writes

 {−εΔu(x)+maxa∈A{−f(x,a)⋅∇u(x)−l(x,a)}=0x∈Ωu(x)=g(x)x∈∂Ω

and the semi-Lagrangian scheme simplifies in

 ⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩xa,si,k=xi+hf(xi,a)+s√2εheks=±1,k=1,2u(xi)=mina∈A⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩142∑k=1∑s=±13∑p=1λa,sip,ku(xa,sip,k)+hl(xi,a)1−142∑k=1∑s=±1λa,si0,k⎫⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎬⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎭xi∈G∩Ωu(xi)=g(xi)xi∈G∩∂Ω

where denotes the -th element of the canonical base of .

As discussed in Section LABEL:SL2HJB, for each node and control , we want to reconstruct the value of at the four points (, ), via bilinear interpolation of the first neighboring grid nodes. To this end, it suffices to choose a time step such that

 hfmax+√2εh<Δx, \hb@xt@.01(4.2)

where denotes the maximum of the dynamics on . Note that condition can be localized for each node and control . This results in a more accurate approximation, but it is clearly more expensive from the computational point of view.

Now consider the half space defined by and also the ball of radius centered at , enclosing the four points that need to be reconstructed (see Figure LABEL:halfspace).

The key argument of our construction is that, if the diffusion ball is contained in the half space , then the value is computed using only grid nodes that are upwind with respect to the vector field , as shown in Figure LABEL:halfspace-(a). If we combine this property with a suitable order for processing the grid nodes (aka causality, see next subsection) we can accelerate the convergence of the scheme in the same spirit of fast-marching methods, reducing significantly the number of iterations. On the contrary, if part of crosses , then also downwind nodes are employed in the reconstruction of , as shown in Figure LABEL:halfspace-(b). It follows that some information is flowing in directions opposite to the vector field , so that cannot be computed in a single pass fashion and the number of iterations to reach convergence increases. This is clearly expected, since we are considering the particular example in which the diffusion uniformly spreads information in all the directions. But the crucial point here is that the semi-Lagrangian scheme propagates diffusion at speed , and not instantaneously as in the continuous case. Indeed, at each iteration, we move with discrete time steps from the point to the points (, ), adding the contribution of the vector field and the diffusion. Then, the upwind condition on the diffusion ball holds if the time step satisfies also

 hfmin−√2εh>0, \hb@xt@.01(4.3)

where denotes the minimum of the dynamics on . Note that also (LABEL:lower-bound) can be localized for each node and control , but in the following we will always consider only global conditions.

We now relate the discretization parameters to the data, aiming to satisfy both conditions and . To this end, we set with to be determined. Moreover, we define and . Substituting in , we get

 √αΔxω<(1−αΥ)Δx,

which implies, for and by straightforward computations, the following condition:

 α<1Υ−√1+4ωΔxΥ−12ωΔxΥ2=:¯¯¯¯α \hb@xt@.01(4.4)

with satisfying . On the other hand, substituting in , we immediately get

 α>1ωΔx=:α––. \hb@xt@.01(4.5)

By imposing the compatibility of the two conditions, i.e. , we easily obtain the following upwind diffusion ball condition:

 1ω<Δx1+Υ. \hb@xt@.01(4.6)

Summarizing, if condition is satisfied, we can choose such that both and hold. In particular, it is easy to check that is the only value satisfying for all and such that holds (see also Figure LABEL:alpha-opt).

Conversely, if fails, only one of the two conditions can be fulfilled, by choosing appropriately. We stress again that, for accuracy and computational issues discussed in Section LABEL:SL2HJB, from now on we will always satisfy condition , possibly loosing condition .

In the general case , where the diffusion coefficient is given by the matrix-valued map , we obtain the same condition with , where

We remark that identifies the diffusion vector with greatest length among the rows of , namely an upper bound for the radius of the diffusion ball.

For a fixed and small mesh size , condition can be clearly fulfilled if the controlled advection dominates diffusion, i.e. for . Note that is a characteristic quantity of the problem. It resembles the Reynolds number for Navier-Stokes equations and the case is of great interest in the applications [12]. In this perspective, condition describes an interesting threshold effect, by means of the mesh dependent parameter . Indeed, for , the upwind diffusion ball condition is satisfied, and the semi-Lagrangian scheme exhibits the same peculiarities of the hyperbolic case. Conversely, for , the upwind diffusion ball condition fails and an elliptic behavior emerges. Note that this effect is also related to the degree of anisotropy . Equivalently, we can fix , depending on the data , , and choose the mesh size according to the threshold . It follows that, at a coarse scale, i.e. for , the approximation “looks like” that of a controlled advection equation. Again, this effect also depends on the degree of anisotropy, in the sense that the larger is the coarser should be the mesh. On the other hand, at a sufficiently fine scale , diffusion starts to get noticed and this hyperbolic behavior disappears as . This is expected, due to the consistency of the semi-Lagrangian scheme with the considered semi-linear equation.

Finally, we remark that, from a numerical point of view, the effect of the upwind diffusion ball condition on the performance of the scheme can be less relevant than expected. Indeed, even if the diffusion ball is contained in the half space associated to , some grid points involved in the interpolation can be outside, as shown in Figure LABEL:halfspace-ex.

This strongly depends on the alignment to the grid of the optimal vector field achieving the minimum in . Nevertheless, as confirmed by the numerical experiments in Section LABEL:experiments, this drawback does not significantly affect the hyperbolic behavior of the scheme, since the downwind grid points contribute in these cases with very small interpolation weights, compared to the upwind ones.

4.2 The role of causality

Under the regime given by condition , we have a chance to get a good performance of the patchy method. To this end, the second crucial point for letting work all the machinery in this more general setting is the causality property of hyperbolic equations outlined in the previous section, a quite well established topic in the literature.

Starting from the work by Tsitsiklis [18] and then Sethian [14], a lot of research has been devoted to find an implicit order of the nodes of a grid that allows to compute the solution in just one or very few iterations, the so called single-pass property. With this idea in mind, the celebrated fast-marching method has been proposed to solve the eikonal equation. It can be proved that, in this special case, the right order corresponds to process the nodes by ascending values, progressively accepting as correct (and then removing from the computation) the node with the minimal value. This translates in computing the solution following its level sets, namely propagating information along its gradient curves. Since for the eikonal equation gradient curves coincide with characteristics, we get the correct solution.

Unfortunately, this ordering is optimal only for eikonal equations of isotropic type, namely for a dynamics that does not depend (or depends in a weak sense) on the control variable. We refer the reader to [16] for a detailed explanation of this problem. Whenever a strong anisotropy comes into play, the choice of a correct order for the grid nodes becomes a subtle topic, meaning that it may not even exist, despite the causality property always holds at the continuous level. This goes beyond the scope of this paper, but we refer the reader to [4] for a discussion on the applicability of fast-marching-like methods to general Hamilton-Jacobi equations. We only point out that, in order to compute the correct solution, one has to enlarge the stencil of the scheme and/or give up the single-pass property mentioned above. This results in more computational efforts and more iterations for the scheme to reach convergence.

We mention here also the fast-sweeping method [17, 19] and some of its generalizations [5], based on another technique that, in a weaker sense, still exploits the causality property. The grid nodes are alternatively swept in a pre-determined number of directions according to the dimension of the problem, until convergence is reached. This makes the method iterative by construction, but it allows to compute the solution to very general equations of Hamilton-Jacobi type. The number of iterations to reach convergence is strongly dependent on the problem and the mesh structure. Nevertheless, it can be proved that only sweeps are needed to compute the solution to the eikonal equation in dimension .

In Section LABEL:PD1, we denoted by the interpolation on the fine grid of the coarse solution on the coarse grid . Our strategy is then to sort the grid nodes of in a fast-marching fashion, according to the increasing values of . This can be accomplished with some additional but cheap time consumption, using some state-of-the-art sorting algorithm, embedded in the pre-computation step. Note that this procedure was already mentioned in [6] as a possible add-on for the original patchy method. Here it becomes an essential part of the proposed method. We will see in the next section that this ordering of the grid nodes, even if in general sub-optimal, can give an exceptional speedup to the computation.

4.3 The patchy algorithm

Here we summarize all the implementation steps of the patchy method for second order semi-linear equations. To this end, we present both the discrete version of the procedure for the synthesis of the feedback optimal control and the atcual construction of the patchy decomposition discussed in Section LABEL:PD1.

We consider the non modified semi-Lagrangian scheme in the special case without diffusion, i.e. :

 {u(xi)=mina∈A{u(xi+hf(xi,a))+hl(xi,a)}xi∈G∩Ωu(xi)=g(xi)xi∈G∩∂Ω

Once the value function is computed, we easily get

 a∗(xi)=argmina∈A{u(xi+hf(xi,a))+hl(xi,a)}xi∈G∩Ω. \hb@xt@.01(4.7)

Note that the computational cost of this operation mainly depends on the number of points used to discretize the control set , but also on the bilinear interpolation for the reconstruction of at the points .

We proceed with the construction of the dynamic decomposition. We divide the boundary nodes of in a fixed number of sub-sets , with , defining the initial guess

 ϕp(xi)={1xi∈Γp0otherwise

Then, for each and a fixed tolerance , we iterate until convergence the scheme

 ϕp(xi)=ϕp(xi+hf(xi,a∗(xi))xi∈G∩Ω, \hb@xt@.01(4.8)

which is a semi-Lagrangian discretization of the advection equation . Note that, due to the interpolation and differently from the continuous case, scheme spreads the sharp values of the initial guess, producing a final solution valued on the whole interval . This makes the patches fuzzy sub-sets of . Hence, an additional thresholding procedure is needed to define them correctly. Indeed, for a fixed , we set

 Ωp={xi∈G∩Ω:ϕp(xi)≥τP}.

In practice, we loose some information, projecting on the space of characteristic functions. Nevertheless, this allows to choose a poor tolerance , accelerating the convergence of the scheme. Moreover, by tuning the threshold parameter , we can achieve a desired level of overlapping between the patches, including sharp interfaces. We recall that, by condition , the stencil of the semi-Lagrangian scheme consists of first neighboring nodes. In addition, in this more general setting, we can no longer impose state constraint conditions at the boundaries of the patches. Then, some overlap is required, if we work with distributed memory architectures. On the other hand, sharp interfaces are still possible, if we prefer a shared memory architecture, as the one employed for the numerical experiments presented in the next section.

Finally, the new patchy algorithm summarizes as follows:

Initialization:

• Build two grids and such that .

• Fix tolerances , , , the number of patches and the threshold .

• Build the initial guess on equal to the exit cost on and (i.e. a very big value) otherwise.

Pre-computation:

• Starting from , compute on , iterating the scheme with until convergence (up to ).

• Build on by interpolation of .

• Compute on using in synthesis procedure .

• Divide the boundary nodes of in sub-sets.

• For , use to compute on the patch (with threshold ), iterating the scheme until convergence (up to ).

• For , sort the nodes of according to the increasing values of (to exploit causality).

Computation:

• For , starting from , compute on , iterating the scheme until convergence (up to ). At each iteration, update at the nodes where intersects other patches, using the transmission condition

 up(xi)=min{up(xi),uq(xi)} for all q≠p such that ∅≠Ωp∩Ωq∋xi
• Build the solution on , merging all the solutions on .

Note that all the steps of the method can be parallelized. In particular, the solution on the coarse grid can be computed by means of a standard domain decomposition technique. In the following section we will compare this dynamic domain decomposition to the classical static domain decomposition. The interested reader can find in [13] a good introduction to this topic and in [11, 8] some static domain decomposition methods for first order Hamilton-Jacobi equations.

We conclude this section by remarking that, in the case of second order semi-linear equations, the issue of balancing the size of the patches can no longer be addressed via the multi-level approach discussed in Section LABEL:PD1 for first order equations. Indeed, due to the presence of the diffusion, we are not guaranteed that the values of the solution at the boundary of the current level decomposition are correct. Some information could flow back in the future from patches that have not yet been built. In principle, we can continue the construction of the decomposition, postponing the computation of the solution. To this end we need more processors, exactly , where is the number of levels, which is clearly bounded but a priori unknown. In the case of only available processors, an alternative could be to find an iterative method to solve the following optimization problem: build an initial subdivision of the boundary such that the corresponding patches have about the same sizes. An interesting question is to understand if the additional computational cost of this optimization procedure is compensated or not by the balancing in size of the patches. This method is at present under development.

5 Numerical experiments

In this section, we present some numerical experiments in dimension two, performed on a server Supermicro 8045C-3RB using 1 CPU Intel Xeon Quad-Core E7330 2.4 GHz with 32 GB RAM, running under the Linux Gentoo operating system. The aim is to emphasize the features of the proposed method. In particular, we first show that the modified semi-Lagrangian scheme (LABEL:SLscheme3) allows for a substantial reduction of the number of iterations needed to reach convergence, compared to the original scheme (LABEL:SLscheme2). Next we show that it is able to compute the solution to general nonlinear equations with very degenerate diffusion. Finally, we combine the scheme with the upwind diffusion ball condition (LABEL:hyperbolicity) and the fast-marching-like sorting of the grid nodes, in order to get an extra reduction of the computational time. This will confirm the effectiveness of the extension of the patchy method to the more general setting of second order Hamilton-Jacobi equations.

In all the following tests, we set the domain and the control set (the unit ball centered in the origin), which is discretized by means of points. Moreover, if not differently specified, we take the boundary datum , the running cost and the diffusion , where and denotes the identity matrix. The corresponding second order operator in equation (LABEL:HJB2) is just , i.e. the laplacian with diffusion coefficient . Finally, we denote by the characteristic function of a generic subset of .

Now, let us give a list of typical problems, depending on the choice of the dynamics :

• for a given vector field . The corresponding equation is a stationary advection equation along with uniform source:

 b(x)⋅∇u(x)=1.
• for a given positive speed function . In this case it is easy to see that the in (LABEL:HJB1) is achieved for . The corresponding equation is the eikonal equation, whose solution represents the minimum time to reach the boundary starting from and traveling at speed :

 c(x)|∇u(x)|=1.

For we recover the distance function from the boundary .

• , where is a fixed switch parameter to activate/deactivate the control and is a fixed counter-clock-wise rotation with . The corresponding equation is a controlled advection equation with uniform source, also termed the Zermelo navigation problem. This is an example hard to compute, due to the strong anisotropy introduced by the control, namely the dependency of the speed on the direction . See below for further details.

Whenever we use to add the term diffusion to declare the presence of the laplacian, as for the following eikonal-diffusion equation

 {−εΔu(x)+c(x)|∇u(x)|=1x∈Ωu(x)=0x∈∂Ω \hb@xt@.01(5.1)

5.1 Self-dependency removal

Here we compare our modified semi-Lagrangian scheme (LABEL:SLscheme3) and the original one (LABEL:SLscheme2) without any removal of the self-dependency. In Table LABEL:table0 we report the number of iterations needed to converge for both schemes in the cases listed above. All the solutions are computed on a grid.

For a pure advection dynamics (A) with vector field and , the causality property is explicit, in the sense that, starting from the left boundary, information propagates to the right in the whole domain. Since, by default, we process the grid nodes from left to right and from top to bottom, then the modified SL scheme converges in this case in just one iteration. Each node is computed (backward in time) using only nodes on its left, hence it is correct by induction (see Figure LABEL:causalitya).

Note that the additional iteration reported in the table is the one needed to check the convergence. On the contrary, the original scheme employs the value of a node (which is set at the beginning to a big value as initial guess) to compute itself. It follows that the correct value carried by the left neighbor is changed by the interpolation and several additional fixed-point iterations are needed to fix it. The case of the eikonal equation (B) with uniform speed and is similar, as shown in Figure LABEL:causalityb. It turns out that characteristics are straight lines, moving from the boundary until they intercept the diagonals of the square. Using as before the default order for processing the grid nodes (left to right, top to bottom), it is easy to see that, at the end of the first iteration, only the nodes above the diagonal will contain correct values of the solution. For the remaining part of the domain, only the first neighboring nodes of the boundary are correctly updated. The same holds in the following iterations, as long as information reaches the middle of the grid. This gives precisely iterations (plus one to check the convergence) for the modified SL scheme and much more (145) for the original scheme. This behavior is confirmed also for more complicated dynamics, as for the eikonal equation (B) with a non homogeneous speed function and for the anisotropic dynamics (C). We finally remark that, in presence of diffusion , the causality property of hyperbolic equations is lost and the number of iterations necessarily increases. Nevertheless, we still get a relevant reduction of this number for the modified SL scheme. From now on, only this scheme will be employed.

5.2 Degenerate diffusion

The following tests aim at showing the built-in ability of the proposed semi-Lagrangian scheme to solve problems with degenerate diffusion. To this end, we allow the diffusion coefficient to depend also on , taking the form . With this choice, we consider again the eikonal-diffusion equation (LABEL:eikdiff) with speed and we note that it is elliptic in and hyperbolic otherwise. Accordingly, at the grid nodes belonging to , we recover by construction the semi-Lagrangian scheme designed for first order hyperbolic equations. Figure LABEL:degeneracy shows the level sets of the corresponding solution. We can immediately distinguish the two different behaviors, by looking at the smoothness of the solution. In particular, the sharp corners corresponding to the shocks in the hyperbolic part of the domain, are completely smoothed out in .

Now we consider the more suggestive and complicated example of the Zermelo navigation problem, whose dynamics is rewritten here for the reader’s convenience:

 f(x,a)=11+|x|2(Rθx|x|+η2a).

This problem consists in reaching the boundary in minimum time, starting from a point in and using the control . The first term in the dynamics represents a whirling drift from the origin to the boundary, whereas the fixed parameter allows to switch on/off the control. This introduces in the problem both inhomogeneity and strong anisotropy. It is easy to see that, for a fixed , the counter-clock-wise rotation guarantees the reachability of the boundary for each starting point in , even for , but we want to play with the control to minimize the time of arrival. In Figure LABEL:degeneracy2 we compare the level sets of the solutions and the corresponding optimal dynamics for three different cases. The first case (see Figure LABEL:degeneracy2a) represents a pure advection-diffusion equation, with uniform diffusion on the whole and no control, i.e. . In the second case (see Figure LABEL:degeneracy2b) also the control is active, i.e. , and we can clearly see how it produces a resistance to the drift that allows to reach the boundary in a smaller time. This is much more evident in the third case (see Figure LABEL:degeneracy2c) where the control is still active and the diffusion is switched off, i.e. .

5.3 General nonlinear problems

This test shows the ability of the proposed semi-Lagrangian scheme to solve nonlinear problems with very general diffusion terms and running costs , possibly depending on and . To this end, we consider a non homogeneous dynamics of the form

 f(x,a)=c(x)awithc(x)=1+max{x2,max{x1,0}}. \hb@xt@.01(5.2)

Moreover, we define and we consider three different diffusion terms:

 σ1≡0,σ2(x)=dε(x)I2,σ3(x,a)=dε(x)a. \hb@xt@.01(5.3)

Note that corresponds to the usual uniform two dimensional Wiener process, whereas defines a one dimensional Wiener process in the direction of the control . Finally, we consider three different running costs:

 l1≡1,l2(x)=1+|x1x2|,l3(x,a)=1+|x1x2|+∣∣∣a12+a2∣∣∣. \hb@xt@.01(5.4)

In Figure LABEL:generalsigmacost we show the level sets of the solutions corresponding to all the nine possible pairs for . We clearly see the effect of the diffusion, whenever applied in the upper half of the square. Moreover, we can also distinguish the different behavior between the 2D uniform diffusion and the 1D control driven diffusion.