Inviscid damping and the asymptotic stability of planar shear flows in the 2D Euler equations
Abstract
We prove asymptotic stability of shear flows close to the planar Couette flow in the 2D inviscid Euler equations on $\mathbb{T}\times\mathbb{R}$. That is, given an initial perturbation of the Couette flow small in a suitable regularity class, specifically Gevrey space of class smaller than 2, the velocity converges strongly in $L^2$ to a shear flow which is also close to the Couette flow. The vorticity is asymptotically driven to small scales by a linear evolution and weakly converges as $t \to \pm\infty$. The strong convergence of the velocity field is sometimes referred to as inviscid damping, due to the relationship with Landau damping in the Vlasov equations. This convergence was formally derived at the linear level by Kelvin in 1887 and it occurs at an algebraic rate first computed by Orr in 1907; our work appears to be the first rigorous confirmation of this behavior on the nonlinear level.
1 Introduction
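We consider the 2D Euler system in the vorticity formulation with a background shear flow:
(1.1) $\omega_t + y\partial_x\omega + U\cdot\nabla\omega = 0, \quad U = \nabla^{\perp}\Delta^{-1}\omega, \quad \omega(t=0) = \omega_{in}.$
Here, $(x,y) \in \mathbb{T}\times\mathbb{R}$, $\nabla^{\perp} = (-\partial_y, \partial_x)$, and $U = (U^x, U^y)$ and $\omega$ are periodic in the $x$ variable with period normalized to $2\pi$. The physical velocity is $(y,0) + U$ where $U$ denotes the velocity perturbation and the total vorticity is $-1 + \omega$. We denote the streamfunction by $\psi = \Delta^{-1}\omega$. The velocity itself satisfies the momentum equation
(1.2) $U_t + y\partial_x U + (U^y, 0) + U\cdot\nabla U = -\nabla P,$
where $P$ denotes the pressure. Linearizing the vorticity equation (1.1) yields the linear evolution
(1.3) $\omega_t + y\partial_x\omega = 0.$
In this work, we are interested in the long time behavior of (1.1) for small initial perturbations $\omega_{in}$. In particular, we show that all sufficiently small perturbations in a suitable regularity class undergo ‘inviscid damping’ and satisfy $U(t) \to (u_\infty(y), 0)$ as $t \to \infty$ for some $u_\infty$ determined by the evolution.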
The field of hydrodynamic stability started in the nineteenth century with Stokes, Helmholtz, Reynolds, Rayleigh, Kelvin, Orr, Sommerfeld and many others. Rayleigh [74] studied the linear stability and instability of planar inviscid shear flows using what is now referred to as the normal mode method. Such a method yields spectral instability or spectral stability depending on whether or not an unstable eigenvalue exists. In that work, Rayleigh proves the famous inflection point theorem which gives a necessary condition for spectral instability. At around the same time, Kelvin [45] constructed exact solutions to the linearized problem around the Couette flow (which are actually solutions of the nonlinear problem). This was the first attempt to solve the initial value problem of the linearized problem which was later developed further in [72, 21, 64].
Even to the present day, the methods used and the conclusions of these works are debated both on physical and mathematical grounds [90]. Experimental realizations of Couette and similar spectrally stable flows show instability and transition to turbulence for sufficiently high Reynolds numbers [76, 73, 60, 84, 14]. However, experiments are ultimately inconclusive (mathematically) since many factors are notoriously difficult to control, such as imperfections in the walls and viscous boundary layers. The paradox that Couette flow is known to be spectrally stable for all Reynolds numbers in contradiction with instabilities observed in experiments is now often referred to as the ‘Sommerfeld paradox’, or ‘turbulence paradox’. Of course, from a mathematical point of view, the notion of linear stability was not completely precise in the early works: was it enough that the linear operator has no growing mode (spectral stability), or should one consider a general initial perturbation and study the time evolution under the linear equation using, for instance, a Laplace transform in time (see [21, 64, 17])? These early works also predated the notion of Lyapunov stability and Sobolev spaces; indeed, the stability of (1.3) depends heavily on the norm chosen. A more variational approach, which is based on the conserved quantities and uses the notion of Energy-Casimir, was introduced by Arnold [1], and yields Lyapunov stability for a class of shear flows (which does not include Couette flow). We also refer to [42, 51] for the use of the variational approach in the Vlasov case. Recently there have been many mathematical studies of stability and instability of various flows (see for instance [32, 10, 41, 55]). We also refer to the following textbooks on the topic of hydrodynamic stability and instability [54, 29, 90].
There were many attempts in the literature to find an explanation for the Sommerfeld paradox (see [53] and the references therein). The first attempt might be due to Orr [72] in 1907, whose work plays a central role in ours. Orr’s observation can be summarized in modern terminology (and adapted to our infinite-in-$y$ setting) as follows. Given a disturbance $\omega_{in}$ in the vorticity, the linear evolution under (1.3) is simply advected by the background shear flow: $\omega(t,x,y) = \omega_{in}(x - ty, y)$. If one changes coordinates to $z = x - ty$ then the streamfunction $\phi(t,z,y) = \psi(t,x,y)$ in these variables solves $\partial_{zz}\phi + (\partial_y - t\partial_z)^2\phi = \omega_{in}$. On the Fourier side, $(z,y) \to (k,\eta)$,
(1.4) $\hat{\phi}(t,k,\eta) = -\dfrac{\hat{\omega}_{in}(k,\eta)}{k^2 + |\eta - kt|^2}.$
From (1.4), Orr made two important observations, together known now as the Orr mechanism. Firstly, if $k\eta > 0$ and $\eta$ is very large relative to $k$, then the streamfunction amplifies by a factor $\eta^2/k^2$ at a critical time given by $t_c = \eta/k$. These modes correspond to waves tilted against the shear which are being advected to larger lengthscales (lower frequencies). Orr suggested that this transient growth is a possible explanation for the observed practical instability or at least a reason to question the validity of the linear approximation. Moreover, this shows that the Couette flow is linearly unstable (in the sense of Lyapunov) in the kinetic energy norm. On the other hand, Orr states in [72, ART. 12] that “the motion is stable, for the most general disturbance, if sufficiently small”. Orr does not make precise the meaning of sufficiently small but concludes in this case that “the velocity-component eventually diminishes indefinitely as , and, the component of the relative velocity as ”. In fact, on the linear level, it is not about smallness but about regularity. Indeed, rigorous proof of the stability and decay on the linear level requires the use of a stronger norm on the initial data than on the evolution, as already noticed in Case [21] and Marcus and Press [64] where this linear stability and decay are proved.
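The transient growth described above can be checked numerically for a single Fourier mode. The following is a minimal sketch, assuming the closed form $|\hat{\phi}(t,k,\eta)| = |\hat{\omega}_{in}(k,\eta)| / (k^2 + (\eta - kt)^2)$ for the linearized streamfunction in the sheared frame; the function names and the sample mode $(k,\eta) = (2, 200)$ are hypothetical choices made for illustration only.

```python
# Sketch of the Orr mechanism for one Fourier mode of the linearized problem.
# Assumption: |phi_hat(t, k, eta)| = |omega_hat_in(k, eta)| / (k^2 + (eta - k t)^2).

def streamfunction_amplitude(t, k, eta, omega_hat=1.0):
    """Magnitude of the (k, eta) streamfunction mode at time t (assumed formula)."""
    return abs(omega_hat) / (k**2 + (eta - k * t)**2)

def orr_critical_time(k, eta):
    """Time at which the denominator k^2 + (eta - k t)^2 is smallest."""
    return eta / k

def amplification(k, eta):
    """Ratio of the peak amplitude (at the critical time) to the initial amplitude."""
    t_c = orr_critical_time(k, eta)
    return streamfunction_amplitude(t_c, k, eta) / streamfunction_amplitude(0.0, k, eta)

k, eta = 2, 200                    # hypothetical mode with k * eta > 0 and eta >> k
t_c = orr_critical_time(k, eta)    # critical time eta / k
gain = amplification(k, eta)       # equals 1 + eta^2 / k^2, close to eta^2 / k^2

print("critical time:", t_c)
print("amplification:", gain, "vs eta^2/k^2 =", eta**2 / k**2)
# The growth is transient: at t = 2 * t_c the amplitude is back to its initial
# value, and it decays like 1 / (k t)^2 afterwards.
print("amplitude at 2*t_c / initial:",
      streamfunction_amplitude(2 * t_c, k, eta) / streamfunction_amplitude(0.0, k, eta))
```

Under these assumptions the gain is exactly $1 + \eta^2/k^2$, which is why the factor $\eta^2/k^2$ is only meaningful when $\eta$ is large relative to $k$.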
Physically, the decay predicted by the Orr mechanism can be understood as the transfer of enstrophy to small scales (which yields the decay of the velocity by the Biot-Savart law) and the transient growth can be understood as the time-reversed phenomenon: the transfer of enstrophy from small scales to large scales and hence the growth of the velocity (see also [16, 59] for further discussion). The transfer to small scales by mixing is now considered a fundamental mechanism intimately connected with the stability of coherent structures and the theory of 2D turbulence [46, 36]. However, to our knowledge, our work is the first mathematically rigorous study of this mechanism in the full 2D Euler equations. We refer to [90, 81, 48, 40] for the most recent developments. Mathematically, one can also explain the transient growth by the non-normality of the linearized operator (also an insight first due to Orr). See for example [85], where the implications of this are studied in terms of the spectra and pseudospectra of the linearized Couette and Poiseuille flows. Indeed, the fact that for non-normal operators the pseudospectrum can be very different from the spectrum can be seen as another explanation of the transient linear growth [75]. See also [79] for further information.
In 1946, Landau [49] predicted rapid decay of the electric field in hot plasmas perturbed from homogeneous equilibrium by solving the linearized Vlasov equation with a Laplace transform. Now referred to as Landau damping, this somewhat controversial prediction of collisionless relaxation in a time-reversible physical model was confirmed by experiments much later in [62] and is now a well-accepted, ubiquitous phenomenon in plasma physics [77]. In [87], van Kampen showed that one way to interpret this mechanism was through the transfer of information to small scales in velocity space, a scenario completely consistent with time-reversibility and conservation of entropy. In this scenario, the free-streaming of particles creates rapid oscillations of the distribution function which are averaged away by the nonlocal Coulomb interactions (see also [20, 27]). The fundamental stabilizing mechanism in this picture is the phase-mixing due to particle streaming. The gap between the linear and nonlinear theory of Landau damping was only bridged recently by the groundbreaking work of Mouhot and Villani, who showed that the phase-mixing indeed persists in the nonlinear Vlasov equations for small perturbations [68] (see also [18, 44]).
The algebraic decay of the velocity field for solutions to (1.3) predicted by Orr can be most readily understood as a consequence of vorticity mixing driven by the shear flow, and hence can be considered as a hydrodynamic analogue of Landau damping, a viewpoint furthered by many authors [15, 78, 17, 5]. Hence the origin of the term inviscid damping. The first, and most fundamental, difference between (1.3) and the linearized Vlasov equations is the fact that the velocity field induced by mean-zero solutions to (1.3) in general does not converge back to the Couette flow, but in fact converges to a different nearby shear flow, whereas the electric field in the linearized Vlasov equations converges to zero. This ‘quasilinearity’ will be a major difficulty in studying inviscid damping on the nonlinear level. Another key difference is that unlike in the Vlasov equations, the decay of the velocity field in (1.3) cannot generally be better than the algebraic rate predicted by Orr, which is not even integrable for the $x$ component of the velocity; in contrast, in the Vlasov equations the decay is exponential for analytic perturbations.
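The algebraic rates can be read off one Fourier mode at a time. The sketch below assumes, as an illustration rather than a statement from the text, that a single mode $(k,\eta)$ with $k \neq 0$ of the linearized problem has velocity amplitudes $|\hat{U}^x| = |\eta - kt|\,|\hat{\omega}_{in}| / (k^2 + (\eta - kt)^2)$ and $|\hat{U}^y| = |k|\,|\hat{\omega}_{in}| / (k^2 + (\eta - kt)^2)$ in the sheared frame, and estimates the late-time decay exponents from a log-log slope. All names and sample values are hypothetical.

```python
import math

# Assumed single-mode amplitudes for the linearized problem (sheared frame):
#   |U^x_hat| = |eta - k t| * |omega_hat| / (k^2 + (eta - k t)^2)
#   |U^y_hat| = |k|         * |omega_hat| / (k^2 + (eta - k t)^2)

def velocity_mode_amplitudes(t, k, eta, omega_hat=1.0):
    """Return (|U^x_hat|, |U^y_hat|) for one mode at time t (assumed formulas)."""
    denom = k**2 + (eta - k * t)**2
    ux = abs(omega_hat) * abs(eta - k * t) / denom
    uy = abs(omega_hat) * abs(k) / denom
    return ux, uy

def loglog_slope(f, t1, t2):
    """Slope of log f against log t between two times: an empirical decay exponent."""
    return (math.log(f(t2)) - math.log(f(t1))) / (math.log(t2) - math.log(t1))

k, eta = 1, 5  # hypothetical mode, sampled well after its critical time eta / k
slope_x = loglog_slope(lambda t: velocity_mode_amplitudes(t, k, eta)[0], 1e3, 1e4)
slope_y = loglog_slope(lambda t: velocity_mode_amplitudes(t, k, eta)[1], 1e3, 1e4)

print("decay exponent of U^x mode:", slope_x)  # close to -1: not integrable in time
print("decay exponent of U^y mode:", slope_y)  # close to -2: integrable in time
```

In this single-mode picture the $x$ component decays like $1/t$, whose time integral diverges, while the $y$ component decays like $1/t^2$.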
It is well-known that the nonlinearity can change the picture dramatically. A clear example of this is the work of Lin and Zeng [56], who prove that there exist nontrivial periodic solutions to the vorticity equation (1.1) which are arbitrarily close to the Couette flow in $H^s$ for $s < 3/2$. They have also proved the corresponding, and related, result for the Vlasov equations [57]. In our setting, the primary interest is to rule out the possibility that weakly nonlinear effects create a self-sustaining process and push the solution out of the linear regime. The idea that the interaction between nonlinear effects and non-normal transient growth can lead to instabilities is classical in fluid mechanics (see e.g. [85]). The basic mechanism suggested in [85] is that nonlinear effects can repeatedly excite growing modes and precipitate a sustained cascade or so-called ‘nonlinear bootstrap’, studied further in the fluid mechanics context in, for example, [2, 89, 88]. Actually, this effect is very similar to what is at work behind plasma echoes in the Vlasov equations, first captured experimentally in [63]. This phenomenon is referred to as an ‘echo’ because the measurable result of nonlinear effects can occur long after the event. Very similar echoes have been studied and observed in 2D Euler, both numerically [89, 88] and experimentally [91, 92] (interestingly, non-neutral plasmas in certain settings make excellent realizations of 2D Euler).
The plasma echoes play a pivotal role in the work of Mouhot and Villani on Landau damping [68]. Although our approach to this challenge is quite different, one of the main difficulties we face is to precisely understand the weakly nonlinear effects at work, sometimes called nonlinear transient growth [89]. We will need a more precise alternative to the moment estimates of [68] which is tailored to the specific structure of 2D Euler; what we call the “toy model” (see §9 for a detailed discussion about the relationship of our work to [68]). The toy model, formally derived in §3.1.1, provides mode-by-mode upper bounds on the ‘worst possible’ growth of high frequencies that the weakly nonlinear effects can produce. The model is not just a heuristic and in fact plays a key role in our work: it is used in the construction of a norm specially designed to match the evolution of (1.1); this norm is the subject of §3. We remark that, to our knowledge, our model has not appeared in the literature before; however, related models have been studied in [89, 88].
The mixing phenomenon behind the inviscid damping also appears in many other fluid models, for example, more general shear profiles [6, 15], stratified shear flows [61, 19] and 2D Euler with the $\beta$-plane approximation to the Coriolis force [16, 86]. A particularly fundamental setting is the ‘axisymmetrization’ of vortices in 2D Euler which has important implications for the meta-stability of coherent vortex structures in atmosphere and ocean dynamics (see e.g. [36, 11, 78, 91, 92] for a small piece of the extensive literature). Actually, this stability problem was mentioned by Rayleigh [74] and was considered by Orr as well [72]. Interestingly, it is also relevant to the stability of charged particle beams in cyclotrons [22].
In general, phase-mixing, or ‘continuum damping’, can be directly associated with the continuous spectrum of the linearized operator and is a phenomenon shared by a number of infinite-dimensional Hamiltonian systems, for example the damping of MHD waves [83], the Caldeira-Leggett model from quantum mechanics [43] and synchronization models in biology [82]. See the series of works [5, 6, 66, 67, 8] which draws a connection between the van Kampen generalized eigenfunctions and the normal form transform to write the linearized 2D Euler and Vlasov-Poisson equations as a continuum of decoupled harmonic oscillators. See also [7] and the references therein for a recent survey which contains other examples and discusses some connections between these various models.
Phase mixing also shares certain similarities with scattering in the theory of dispersive wave equations (see for instance [50, 34, 58]) as already pointed out in [27, 18]. In both cases the long time behavior is governed by a linear operator, or a modified version of it due to long range interactions [37, 69] (something like this occurs in our Theorem 1). Unlike dissipative equations, the final linear evolution is usually chosen by the entire nonlinear dynamics and cannot be completely characterized by the relevant conservation laws. Also in both cases, the phenomena can be related to the continuous spectrum in the linear problem; for example, the RAGE theorem applies equally well to transport equations as to dispersive equations [26]. However, there are also clear differences: in dispersive wave equations, the dispersion uses the fact that different wave packets travel with different group velocities to yield decay of the $L^\infty$ norm, and hence nonlinear terms often become weaker. Normally, this decay costs spatial localization rather than regularity. In the inviscid damping (and Landau damping), the decay is due to the combination of the mixing which sends the information into high frequencies and the application of the inverse Laplacian (or any operator of negative order), which averages out the small scales. That is, dispersion transfers information to infinity in space whereas mixing transfers information to infinity in frequency.
1.1 Statement and discussion
In this section we state our nonlinear stability result and a few immediate corollaries. The key aspects of the proof are discussed after the statement.
The data will be chosen in a Gevrey space of class $1/s$ for $s > 1/2$ [35]; the origin of this restriction is (mathematically) natural and arises from the weakly nonlinear effects, discussed further in §3. We note that the analogous space for the Vlasov equations with Coulomb/Newton interaction is Gevrey-3 (e.g. $s > 1/3$) [68]. It is worth noting that unlike, for example, [33] where the Gevrey regularity is required due to the linear growth of high frequencies, here (and in [68]) the Gevrey regularity is required because of a potential nonlinear frequency cascade.
Our main result is
Theorem 1.
For all , there exists an such that for all if satisfies , and
then there exists with and such that
(1.5) 
where is given explicitly by
(1.6) 
with . Moreover, the velocity field satisfies
(1.7a)  
(1.7b)  
(1.7c) 
Remark 1.
Of course, by time-reversibility, Theorem 1 is also true for some and (which will generally not be equal to their counterparts). Also, due to the Hamiltonian structure of (1.1) (see e.g. [1, 66]), one could only hope to prove asymptotic stability in a norm weaker than the norm in which the initial data is given. This is an important theme underlying our work, and the works of [18, 44, 68], which is that decay costs regularity.
Remark 2.
From the proof of Theorem 1, it is clear that , as the effect of the nonlinear evolution is one order weaker than that of the linear evolution.
Remark 3.
Notice the surprisingly rapid convergence in (1.7a) (it is of course matched by a similar rapid convergence of the averaged vorticity). This arises from a subtle cancellation between the oscillations of and upon taking averages; indeed it was previously believed that the convergence should be and that (1.6) involved a logarithmic correction. The origin of the rapid convergence rate can be best understood from studying the linearized problem (1.3), a computation that we carry out in §A.4.
Remark 4.
The proof of Theorem 1 implies that if is compactly supported then remains supported in a strip for some for all time.
Remark 5.
Remark 6.
Both Orr and Kelvin (and many others) expressed doubt that the inviscid problem was stable unless the set of permissible data was of a certain type, suggesting that for general data the stability restriction would diminish with the inverse Reynolds number. To reconcile this viewpoint with Theorem 1, we conjecture that for high (but finite) Reynolds number flows, an analogous result to Theorem 1 holds with initial data where has Gevrey regularity uniformly in the Reynolds number and has Sobolev regularity with norm small with respect to the inverse Reynolds number. Note that in the viscous case, the flow will return to Couette, but on a timescale comparable to the Reynolds number. We are currently investigating the proof of this conjecture.
Remark 7.
The spatial localization is only used to assert that the velocity is in and to ensure the coordinate transformations used in the proof are not too drastic. This assumption can be relaxed to for any . It might be possible to treat more general cases with with some technical enhancements, as does not play an important role in the proof.
Corollary 1.
There exists an open set of smooth solutions to (1.1) for which is not precompact in as . In particular, and in general .
This shows the existence of solutions for which enstrophy is lost to high frequencies in the limit , which to our knowledge was not previously known for 2D Euler in any setting. See [81, 48, 40] for further discussions on the physical interest of this fact and the potential relationship with 2D turbulence. A related corollary is the following which shows the linear growth of Sobolev norms as a direct consequence of the mixing. Compare with the construction of Denisov [28] which yields superlinear growth of the gradient.
Corollary 2.
There exists an open set of smooth solutions to (1.1) for which for all and for all ,, .
Let us now outline the main new steps in the proof of Theorem 1. First, we provide a (well chosen) change of variable that adapts to the solution as it evolves and yields a new ‘relative’ velocity which is time-integrable while keeping the Orr critical times as in (1.4). This change of variables allows us to work on a quantity which has a strong limit as $t$ goes to infinity. This is related to the notion of “profile” used in dispersive wave equations (see [34] for instance) as well as the notion of “gliding regularity” in [68]. However, here it is important that the coordinate transformation depends on the solution, a source of large technical difficulty and an expression of the ‘quasilinearity’ alluded to above.
A second new idea is the use of a special norm that loses regularity in a very precise way adapted to the Orr critical times and the associated nonlinear effect. The construction of this norm is based on the so-called “toy model” which mimics the worst possible growth of high frequencies (derived in §3.1.1). This special norm allows us to control the nonlinear growth due to the resonances at the critical times. However, this comes with a big danger: energy estimates and cancellations tend to dislike ‘unbalanced’ norms, namely norms that assign different regularities to different frequencies (see for instance [65] for a similar problem). In particular, by design, our norm is not an algebra. This is one of the main technical problems that we have to overcome, and here the decay of the velocity is crucial.
In the course of the proof, we need to gain regularity from inverting the Laplacian to get the streamfunction from the vorticity; indeed the ellipticity is the origin of the decay. However, in the new variables the Laplacian is transformed to a weakly elliptic operator with coefficients that depend on the solution. This additional nonlinearity presents huge difficulties due to the limited regularity of the coefficients (relative to what is desired). This has similarities with elliptic estimates in domains with limited regularity used for water waves (see for instance [80, Appendix A]). Here, the interplay between regularity and decay will be crucial to ensure that the final estimate holds. As in (1.4), the loss of ellipticity is an expression of the Orr critical times. It will be important for our work that the norm derived from the toy model precisely ‘matches’ the loss of ellipticity.
Related to the issue of inverting the Laplacian in the new variables is the final technical ingredient in our proof, which is the need to obtain a variety of precise controls on the evolving coordinate system (see Proposition 2.5 below). This will require us to quantify the convergence of the background shear flow in several ways. In particular, we will need to carefully estimate how the modes that depend on force those that do not and in fact, this forcing loses a derivative (see the last term in (8.9)). However, the estimates turn out to be possible precisely under the assumption of Gevrey class $1/s$ with $s > 1/2$ (see the discussion after Proposition 2.5). Second to the toy model, this seems to be the next most fundamental use of the regularity $s > 1/2$.
1.2 Notation and conventions
See §A.1 for the Fourier analysis conventions we are taking. A convention we generally use is to denote the discrete (or ) frequencies as subscripts. By convention we always use Greek letters such as and to denote frequencies in the or direction and lowercase Latin characters commonly used as indices such as and to denote frequencies in the or direction (which are discrete). Another convention we use is to denote as dyadic integers where
When a sum is written with indices or it will always be over a subset of . This will be useful when defining Littlewood-Paley projections and paraproduct decompositions, see §A.1. Given a function , we define the Fourier multiplier by
We use the notation when there exists a constant independent of the parameters of interest such that (we analogously define). Similarly, we use the notation when there exists such that . We sometimes use the notation if we want to emphasize that the implicit constant depends on some parameter . We will denote the vector norm , which by convention is the norm taken in our work. Similarly, given a scalar or vector in we denote
We use a similar notation to denote the (or ) average of a function: . We also frequently use the notation . We denote the standard norms by . We make common use of the Gevrey norm with Sobolev correction defined by
Since for most of the paper we are taking as a fixed constant, it is normally omitted. We refer to this norm as the norm and occasionally refer to the space of functions
See §A.2 for a discussion of the basic properties of this norm and some related useful inequalities.
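To make the role of such a weighted norm concrete, here is a minimal sketch on a finite set of Fourier modes. It assumes the norm has the common form $\|f\|^2 = \sum_k \int |\hat{f}_k(\eta)|^2 e^{2\lambda|k,\eta|^s}\langle k,\eta\rangle^{2\sigma}\,d\eta$ with $|k,\eta| = |k| + |\eta|$ and $\langle v\rangle = (1+|v|^2)^{1/2}$; this form, the replacement of the $\eta$ integral by a finite sum, the helper names and the sample modes are all assumptions made for illustration, not definitions taken from the text.

```python
import math

def l1_size(k, eta):
    """|k, eta| = |k| + |eta| (assumed convention for the vector norm)."""
    return abs(k) + abs(eta)

def gevrey_norm_sq(modes, lam, sigma, s):
    """Discrete stand-in for a Gevrey norm with Sobolev correction.

    modes maps (k, eta) to a Fourier coefficient; the integral in eta is
    replaced by a sum over the listed frequencies.
    """
    total = 0.0
    for (k, eta), coeff in modes.items():
        size = l1_size(k, eta)
        bracket = math.sqrt(1.0 + size**2)
        total += abs(coeff)**2 * math.exp(2.0 * lam * size**s) * bracket**(2.0 * sigma)
    return total

def shear_frequencies(modes, t):
    """Frequencies of f(x - t*y, y) given those of f: (k, eta) -> (k, eta - k*t)."""
    return {(k, eta - k * t): coeff for (k, eta), coeff in modes.items()}

# Hypothetical profile with three modes.
profile = {(1, 2.0): 1.0, (-1, -2.0): 1.0, (0, 1.0): 0.5}
lam, sigma, s = 0.5, 2.0, 0.75

norm_profile = gevrey_norm_sq(profile, lam, sigma, s)
norm_sheared = gevrey_norm_sq(shear_frequencies(profile, 10.0), lam, sigma, s)

print("norm^2 of the profile:          ", norm_profile)
print("norm^2 after shearing to t = 10:", norm_sheared)
```

Under these assumptions the profile has a fixed norm, while the same function viewed in the original $(x,y)$ variables after shearing has a much larger one, since its $x$-dependent modes have moved to high frequency. This is one way to see why estimates are naturally carried out in coordinates that follow the shear.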
For , we define to be the integer part. We define for and with , and and the critical intervals
For minor technical reasons, we define a slightly restricted subset as the resonant intervals
Note this is the same as putting a slightly more stringent requirement on : .
2 Proof of Theorem 1
We now give the proof of Theorem 1, stating the primary steps as propositions which are proved in subsequent sections.
2.1 Coordinate transform
The original equations in vorticity form are (1.1), and we are trying essentially to prove that
as , where is the correction to the shear flow determined by . From the initial data alone, there is no simple way to determine ; it is chosen by the nonlinear evolution. In order to deal with this lack of information about how the final state evolves we choose a coordinate system which adapts to the solution and converges to the expected form as . The change of coordinates used is , where
(2.1a) $z(t,x,y) = x - tv,$
(2.1b) $v(t,y) = y + \dfrac{1}{t}\int_0^t \langle U^x\rangle(s,y)\,ds,$
where we recall $\langle U^x\rangle$ denotes the average of $U^x$ in the $x$ variable (or equivalently in the $z$ variable), namely $\langle U^x\rangle(t,y) = \frac{1}{2\pi}\int_{\mathbb{T}} U^x(t,x,y)\,dx$. The reason for the change $y \to v$ is not immediately clear; however, $v$ is named as such since it is an approximation for the background shear flow. If the velocity field in the integrand were constant in time, then we are simply transforming the variables so that the shear appears linear. It will turn out that this choice of $v$ ensures that the Biot-Savart law is in a form amenable to Fourier analysis in the variables $(z,v)$; in particular, even when the shear is time-varying we may still study the Orr critical times as was explained in (1.4). In this light, the motivation for the shift in $z$ is clear: we are following the flow in the horizontal variable to guarantee compactness, as done even by Orr [72], Kelvin [45] and many authors since then.
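The remark about a time-independent velocity can be illustrated in a simplified linear setting. The sketch below assumes a steady shear profile $u(y)$ and passive transport $\partial_t\omega + u(y)\partial_x\omega = 0$ (not the full nonlinear system), for which following the flow in the horizontal variable, $z = x - t\,u(y)$, freezes the profile. The shear `shear`, the datum `omega_in` and the sample points are hypothetical choices.

```python
import math

def shear(y):
    """Hypothetical steady shear: Couette flow plus a small correction."""
    return y + 0.1 * math.sin(y)

def omega_in(x, y):
    """Hypothetical initial vorticity perturbation."""
    return math.cos(x) * math.exp(-y * y)

def omega(t, x, y):
    """Candidate solution of d_t omega + shear(y) d_x omega = 0."""
    return omega_in(x - t * shear(y), y)

def transport_residual(t, x, y, h=1e-5):
    """Central-difference estimate of d_t omega + shear(y) d_x omega."""
    d_t = (omega(t + h, x, y) - omega(t - h, x, y)) / (2 * h)
    d_x = (omega(t, x + h, y) - omega(t, x - h, y)) / (2 * h)
    return d_t + shear(y) * d_x

def profile(t, z, y):
    """The solution viewed in the moving variable z = x - t * shear(y)."""
    return omega(t, z + t * shear(y), y)

points = [(0.5, 0.3, -0.7), (2.0, 1.1, 0.4), (5.0, -2.0, 1.3)]
max_residual = max(abs(transport_residual(t, x, y)) for t, x, y in points)
max_drift = max(abs(profile(t, z, y) - omega_in(z, y)) for t, z, y in points)

print("largest transport residual:", max_residual)  # finite-difference error only
print("largest drift of profile:  ", max_drift)     # rounding error only
```

In the nonlinear problem the shear is itself evolving and unknown in advance, which is why the coordinate system has to be built from the solution rather than fixed beforehand.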
Define and , hence
where
Expressing , and , we get the following evolution equation for ,
Using the definition of and the Biot-Savart law to transform to in the new variables, this becomes
The Biot-Savart law also gets transformed into:
(2.2) 
The original 2D Euler system (1.1) is expressed as
(2.3) 
In what follows we will write and specify when other variables are used. Next we transform the momentum equation to allow us to express in a form amenable to estimates. Denoting and we have by the same derivation on ,
Taking averages in we isolate the zero mode of the velocity field,
(2.4) 
Finally, one can express and as solutions to a system of PDE in the variables coupled to (2.3) (see §8.1 below for a detailed derivation):
(2.5a)  
(2.5b)  
(2.5c) 
Note that to leading order in , one can express as a time average of . Note also that we have a simple expression for from the Biot-Savart law:
(2.6) 
Given a priori estimates on the system (2.3), (2.5), we can recover estimates on the original system (1.1) by the inverse function theorem as long as remains sufficiently small (see §2.3). Compared to the original system (1.1), the system (2.3), (2.5) appears much more complicated and nonlinear. Indeed, is not divergence free and the dependence of on through is significantly more subtle than in the original variables. The main advantage of (2.3) is that formally has an integrable decay, indeed, we will see that if one is willing to pay four derivatives, the decay rate is formally (the decay we deduce is not quite as sharp).
2.2 Main energy estimate
In light of the previous section, our goal is to control solutions to (2.3) uniformly in a suitable norm as . The key idea we use for this is the carefully designed time-dependent norm written as
The multiplier has several components,
The index is the bulk Gevrey regularity and will be chosen to satisfy
(2.7a)  
(2.7b) 
where is a small parameter that ensures and is a parameter chosen by the proof. The reason for (2.7a) is to account for the behavior of the solution on the time-interval ; see Lemma 2.1 for this relatively minor detail. The use of a time-varying index of regularity is classical, for example the Cauchy-Kovalevskaya local existence theorem of Nirenberg [70, 71]. For more directly relevant works which use norms of this type, see [31, 52, 23, 47, 24, 68]. Let us remark here that to study analytic data, , we would need to add an additional Gevrey correction to with as an intermediate regularity so that we may take advantage of certain beneficial properties of Gevrey spaces; see for example Lemma A.3. In this case, the analytic regularity would simply be propagated more or less passively through the proof. Using the same idea, we may assume without loss of generality that is close to (say ), which simplifies some of the technical details but is not essential. The Sobolev correction with fixed is included mostly for technical convenience so we may easily quantify loss of derivatives without disturbing the index of regularity. We will also use the slightly stronger multiplier that satisfies to control the coefficients and ; see (3.10) below for the definition.
The main multiplier for dealing with the Orr mechanism and the associated nonlinear growth is
(2.8) 
where is constructed in §3 and describes the expected ‘worst-case’ growth due to nonlinear interactions at the critical times. What will be important is that imposes more regularity on modes which satisfy (the ‘resonant modes’) than those that do not (the ‘non-resonant modes’). The multiplier replaces growth in time by controlled loss of regularity and is reminiscent of the notion of losing regularity estimates used in [3, 25]. One of the main differences is that here we have to be more precise in the sense that the loss of regularity occurs for different frequencies during different time intervals.
With this special norm, we can define our main energy:
(2.9) 
where, for some constants , depending only on fixed by the proof,
(2.10) 
In a sense, there are two coupled energy estimates: the one on and the one on . The latter quantity is encoding information about the coordinate system, or equivalently, the evolution of the background shear flow. It turns out is a physical quantity that measures the convergence of the averaged vorticity to its time average (see (8.5) in §8.1) and satisfies a useful PDE (see (8.9) in §8.2). It will be convenient to get two separate estimates on as opposed to just one ( is essentially measuring how rapidly the averaged velocity is converging to its time average).
The goal is to prove by a continuity argument that this energy (together with some related quantities) is uniformly bounded for all time if is sufficiently small. We define the following controls referred to in the sequel as the bootstrap hypotheses,

;


‘CK’ integral estimates (for ‘Cauchy-Kovalevskaya’):
The CK terms above that appear without the prefactor arise from the time derivatives of and are naturally controlled by the energy estimates we are making. The others are related quantities that are controlled separately in Proposition 2.5 below.
By the well-posedness theory for 2D Euler in Gevrey spaces [9, 30, 31, 52, 47] we may safely ignore the time interval (say) by further restricting the size of the initial data. That is, we have the following lemma; see §A.3 for a sketch of the proof.
Lemma 2.1.
For all , there exists an such that if and , then , , with
(2.11) 
By Lemma 2.1, for the rest of the proof we may focus on times . Let be the connected set of times such that the bootstrap hypotheses (B1)-(B3) are all satisfied. We will work on regularized solutions for which we know takes values continuously in time, and hence is a closed interval with . The bootstrap is complete if we show that is also open, which is the purpose of the following proposition, the proof of which constitutes the majority of this work.
Proposition 2.1 (Bootstrap).
There exists an depending only on and such that if , and on the bootstrap hypotheses (B1)-(B3) hold, then for ,

,

,

and the CK controls satisfy:
from which it follows that .
The remainder of the section is devoted to the proof of Proposition 2.1, the primary step being to show that on , we have
(2.12) 
for some constant which is independent of and . If is sufficiently small then (2.12) implies Proposition 2.1. Indeed, the control is an immediate consequence of (B1) by Sobolev embedding for sufficiently small.
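The way an estimate of the form (2.12) closes the bootstrap can be illustrated by a schematic continuity argument. The display below is only a sketch: the energy $E$, the constants $4$, $2$ and $K$, the endpoint $T^\star$ and the normalization $E(1) \le \epsilon^2$ are placeholder assumptions chosen for illustration, not the precise content of the bootstrap hypotheses or of (2.12).

```latex
% Schematic continuity argument (illustrative constants, not the exact ones of this work).
% Hypothesis on [1, T^\star]:    E(t) \le 4\epsilon^2.
% Improved estimate (assumed):   E(t) \le E(1) + K\epsilon^3,
%                                with K independent of \epsilon and T^\star.
\begin{align*}
E(t) \;\le\; E(1) + K\epsilon^3 \;\le\; \epsilon^2 + K\epsilon^3 \;<\; 2\epsilon^2
\qquad \text{whenever } \epsilon < \tfrac{1}{K}.
\end{align*}
% The hypothesis with constant 4 therefore yields the strictly better constant 2.
% By continuity of E in time, the set of times on which the hypothesis holds is
% open as well as closed and nonempty, hence it is all of [1, \infty).
```

The point of the sketch is that the nonlinear contribution is one power of $\epsilon$ smaller than the quantity being controlled, which is what leaves room to improve the constant.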
To prove (2.12), it is natural to compute the time evolution of ,
The first contribution is of the form
(2.13) 
where CK stands for ‘Cauchy-Kovalevskaya’ since these three terms arise from the progressive weakening of the norm in time, and are expressed as
(2.14a)  
(2.14b) 
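The origin of terms of this kind can be seen in a simplified model. The sketch below assumes a toy multiplier $A(t,k) = e^{\lambda(t)|k|^s}$ with a non-increasing radius $\lambda(t)$, a periodic setting, and Plancherel normalized so that $\|g\|_2^2 = \sum_k |\hat g(k)|^2$. This multiplier is a hypothetical stand-in; the multiplier used in this work carries additional factors, each of which contributes its own term.

```latex
% Toy computation: time derivative of a norm whose weight weakens in time.
% Assumption: A(t,k) = e^{\lambda(t)|k|^s}, so \partial_t A = \dot\lambda(t)\,|k|^s A.
\begin{align*}
\frac{1}{2}\frac{d}{dt}\,\|A f\|_2^2
 &= \sum_k \partial_t A(t,k)\, A(t,k)\, |\hat f(t,k)|^2
   + \operatorname{Re}\sum_k A(t,k)^2\, \overline{\hat f(t,k)}\, \partial_t \hat f(t,k) \\
 &= \dot\lambda(t)\, \bigl\|\, |\nabla|^{s/2} A f \,\bigr\|_2^2
   + \operatorname{Re}\,\langle A f,\; A\,\partial_t f \rangle .
\end{align*}
% Since \dot\lambda \le 0, the first term is non-positive: weakening the norm in time
% produces a damping-like term that can absorb part of the nonlinear contribution.
```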
In what follows we define
(2.15a)  
(2.15b) 
Note that and if then .
Strictly speaking, equality (2.13) is not rigorous since it involves a derivative of , which is not a priori well-defined. To make this calculation rigorous, we first approximate the initial data of (1.1) by (for instance) analytic initial data and use that the global solutions of (1.1) stay analytic for all time (see [9, 31, 30]). Hence, we can perform all calculations on these solutions with regularized initial data and then pass to the limit to infer that (2.12) still holds. Similarly, the bootstrap is performed on these regularized solutions for which takes values continuously in time.
To treat the main term in (2.13), begin by integrating by parts, as in the techniques of [31, 52, 47]:
(2.16) 
Notice that the relative velocity is not divergence free:
The first term is controlled by the bootstrap hypothesis (B1). For the second term we use the ‘lossy’ elliptic estimate, Lemma 4.1, which shows that under the bootstrap hypotheses we have
(2.17) 
Therefore, by Sobolev embedding, and the bootstrap hypotheses,
(2.18) 
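The integration by parts used for (2.16) rests on a standard identity, recorded here in generic notation. In the sketch below, $u$ is a sufficiently regular velocity field, $f$ a real-valued function, and $A$ a Fourier multiplier that maps real functions to real functions. These symbols are placeholders and are not meant to reproduce the exact quantities or sign conventions of (2.16).

```latex
% Generic identity: split off the commutator, then integrate the transport part by parts.
% Boundary terms vanish by periodicity or decay (an assumption of this sketch).
\begin{align*}
\int A f \, A\bigl(u\cdot\nabla f\bigr)\,dx
 &= \int A f \,\bigl(u\cdot\nabla A f\bigr)\,dx
   + \int A f \,\bigl[A(u\cdot\nabla f) - u\cdot\nabla A f\bigr]\,dx \\
 &= -\frac{1}{2}\int (\nabla\cdot u)\,|A f|^2\,dx
   + \int A f \,\bigl[A(u\cdot\nabla f) - u\cdot\nabla A f\bigr]\,dx .
\end{align*}
% The first term vanishes when u is divergence free; otherwise it is bounded by
% \tfrac12 \|\nabla\cdot u\|_{L^\infty} \|A f\|_2^2. The second term is the commutator.
```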
To handle the commutator, , we use a paraproduct decomposition (see e.g. [13, 4]). Precisely, we define three main contributions: transport, reaction and remainder:
(2.19) 
where (the factors of are for future notational convenience)
Here and denotes the th Littlewood-Paley projection and means the Littlewood-Paley projection onto frequencies less than (see §A.1 for the Fourier analysis conventions we are taking). Formally, the paraproduct decomposition (2.19) represents a kind of ‘linearization’ for the evolution of higher frequencies around the lower frequencies. The terminology ‘reaction’ is borrowed from Mouhot and Villani [68] (see §9 for more information).
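The bookkeeping behind a paraproduct decomposition can be checked on a toy example. The sketch below is an illustration under simplifying assumptions, not the decomposition (2.19) itself: it works with finitely many integer Fourier modes on the circle, uses sharp dyadic blocks instead of the cutoffs of §A.1, and splits a plain product rather than a commutator. The helper names `block`, `partial_product`, `paraproduct_pieces` and `add` are hypothetical. It splits a product into a low-high piece (first factor at much lower frequency, analogous to transport), a high-low piece (analogous to reaction) and a remainder of comparable frequencies, and confirms that the three pieces sum to the full product.

```python
from collections import defaultdict


def block(k):
    """Dyadic block index: 0 for k = 0, and j >= 1 when 2**(j-1) <= |k| < 2**j."""
    return abs(k).bit_length()


def partial_product(f, g, keep):
    """Fourier coefficients of the part of f*g from block pairs (j, jp) with keep(j, jp)."""
    out = defaultdict(int)
    for k, a in f.items():
        for m, b in g.items():
            if keep(block(k), block(m)):
                out[k + m] += a * b  # modes k and m interact to produce mode k + m
    return {k: c for k, c in out.items() if c != 0}


def paraproduct_pieces(u, f):
    """Split u*f by the relative size of the two interacting frequency blocks."""
    low_high = partial_product(u, f, lambda j, jp: j <= jp - 2)   # u low, f high
    high_low = partial_product(u, f, lambda j, jp: jp <= j - 2)   # u high, f low
    remainder = partial_product(u, f, lambda j, jp: abs(j - jp) <= 1)
    return low_high, high_low, remainder


def add(*terms):
    """Sum several coefficient dictionaries, dropping zero entries."""
    out = defaultdict(int)
    for term in terms:
        for k, c in term.items():
            out[k] += c
    return {k: c for k, c in out.items() if c != 0}


# Toy data: dictionaries mapping integer frequency -> Fourier coefficient.
u = {1: 1, -1: 1, 8: 2}
f = {0: 3, 2: 1, 9: -1}
full = partial_product(u, f, lambda j, jp: True)
assert add(*paraproduct_pieces(u, f)) == full
```

The three conditions on the block indices are mutually exclusive and exhaustive, so every pair of interacting modes is counted exactly once; this is the only property of the splitting that the check relies on.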
Controlling the transport contribution is the subject of §5, in which we prove:
Proposition 2.2 (Transport).
Under the bootstrap hypotheses,
The proof of Proposition 2.2 uses ideas from the works of [31, 52, 47]. Since the velocity is restricted to ‘low frequency’, we will have the available regularity required to apply (2.17). However, the methods of [31, 52, 47] do not adapt immediately since is imposing slightly different regularities to certain frequencies, which is problematic. Physically speaking, we need to ensure that resonant frequencies do not incur a very large growth due to nonlinear interactions with nonresonant frequencies (which are permitted to be slightly larger than the resonant frequencies). Controlling this imbalance is why appears in Proposition 2.2.
Controlling the reaction contribution in (2.19) is the subject of §6. Here we cannot apply (2.17), as an estimate on this term requires in the highest norm on which we have control, and hence we have no regularity to spare. Physically, the reaction term is where the dangerous nonlinear effects are expressed, and a great deal of precision is required to control them. In §6 we prove:
Proposition 2.3 (Reaction).
Under the bootstrap hypotheses,
(2.20) 
The terms are defined below in (2.24). The first step to controlling the term in (2.20) involving is Proposition 2.4, proved in §4.2. This proposition treats as a perturbation of and passes the multipliers in the last term of (2.20) onto and the coefficients of . Physically, these latter contributions indicate the nonlinear interactions between the higher modes of and the coefficients , (which involve time-averages of (2.5)).
Proposition 2.4 (Precision elliptic control).
Under the bootstrap hypotheses,
(2.21) 
where the ‘coefficient Cauchy-Kovalevskaya’ terms are given by
(2.22a)  
(2.22b)  
(2.22c)  
(2.22d) 
The next step in the bootstrap is to provide good estimates on the coordinate system and the associated CK and CCK terms, a procedure that is detailed in §8. The following proposition provides controls on , the CCK terms arising in (2.22), the pair , and finally all of the terms. The norm defined by is stronger than that defined by , which we use to measure . It turns out that we will be able to propagate this stronger regularity on due to a time-averaging effect, derived via energy estimates on (2.5). By contrast, is expected to have essentially the regularity of and hence even (2.23b) has fewer derivatives than expected. On the other hand, it has a significant amount of time decay, which near critical times can be converted into regularity.
Proposition 2.5 (Coordinate system controls).
Under the bootstrap hypotheses, for sufficiently small and sufficiently large there is a such that
(2.23a)  
(2.23b)  