Explicit construction of quasiconserved local operator of translationally invariant nonintegrable quantum spin chain in prethermalization

# Explicit construction of quasiconserved local operator of translationally invariant nonintegrable quantum spin chain in prethermalization

Cheng-Ju Lin    Olexei I. Motrunich Department of Physics and Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, California 91125, USA
July 27, 2019
###### Abstract

We numerically construct translationally invariant quasi-conserved operators with maximum range which best-commute with a non-integrable quantum spin chain Hamiltonian, up to . In the large coupling limit, we find that the residual norm of the commutator of the quasi-conserved operator decays exponentially with its maximum range at small , and turns into a slower decay at larger . This quasi-conserved operator can be understood as a dressed total “spin-z” operator, by comparing with the perturbative Schrieffer-Wolff construction developed to high order reaching essentially the same maximum range. We also examine the operator inverse participation ratio of the operator, which suggests its localization in the operator Hilbert space. The operator also shows an almost exponentially decaying profile at short distance, while the long-distance behavior is not clear due to limitations of our numerical calculation. Further dynamical simulation confirms that the prethermalization-equilibrated values are described by a generalized Gibbs ensemble that includes such quasi-conserved operator.

## I Introduction

Two decades ago, the eigenstate thermalization hypothesis (ETH) was proposed as a mechanism accounting for the validity of the statistical mechanics in isolated quantum systems.Deutsch (1991); Srednicki (1994) In contrast, many-body localization (MBL) refers to a class of interacting systems that fail to thermalize due to the presence of strong disorder. Phenomenologically, MBL systems can be viewed as having an extensive number of local integrals of motion,Serbyn et al. (2013); Huse et al. (2014); Pekker et al. (2014); Chandran et al. (2015); O’Brien et al. (2016) analogous to integrable quantum systems.

Many research works have proposed systems “in between,” namely, systems that fail or partially fail to thermalize but are disorder-free. For example, Ref. Grover and Fisher, 2014 proposed a phase of matter called “quantum disentangled liquid,” where the system is composed of heavy degrees of freedom and light degrees of freedom, and where after a partial measurement the light degrees of freedom will localize around the heavy degrees of freedom. Recent numerical and theoretical works provided some support for the existence of such phases of matter.Garrison et al. (2017); Veness et al. (2017) Other studies observed that in some such systems the dynamics shows behavior similar to MBL systems.Schiulaz et al. (2015); van Horssen et al. (2015); Yao et al. (2016)

While numerous proposals have tried to realize MBL in translationally invariant systems, it was argued that this cannot happen in the true sense of MBL. However, some phenomenological aspects of MBL can still be realized in such systems.De Roeck and Huveneers (2015, 2014a, 2014b, b); Bols and De Roeck () That is, when one performs some dynamical simulation in such a system, the system will appear to be localized at some intermediate time scale, but will delocalize eventually. Therefore, one may view such “quasi-localization” or “asymptotic localization” as prethermalization, where the system equilibrates to a state which is described by a Gibbs ensemble controlled by some effective Hamiltonian (instead of the original Hamiltonian) at some intermediate time, and truly thermalizes only at much later time. Nevertheless, a recent work has proposed another model with translational invariance and has claimed to find true disorder-free localization,Smith et al. (2017) so this question is still open.

Prethermalization has been observed and studied in many different systems. In particular, various works showed that systems with weak integrability breaking exhibit this phenomenon.Essler et al. (2014); Bertini et al. (2016); Langen et al. (2016) In addition, prethermalization has been shown rigorously to exist in periodically driven many-body systems under strong driving frequencies using the Floquet-Magnus expansionKuwahara et al. (2016); Mori et al. (2016) and renormalization technique.Abanin et al. (2017a, b) The latter also applies to time-independent many-body systems, and in particular can be used to prove rigorously the presence of exponentially long relaxation times of “particles” such as doublons in the Hubbard model in the strong coupling limit.Bukov et al. (2015); Sensarma et al. (2010); Strohmaier et al. (2010) There are also very recent proposals utilizing the prethermalization to protect the edge modes in the topological superconductor.Kemp et al. (2017); Else et al. ()

In fact, we can view most of the aforementioned prethermalization systems as having quantities with hierarchically different thermalization time scales or having different rates of dynamics. Upon time evolution, the fast degrees of freedom relax very quickly, while the slow degrees of freedom evolve slowly during this initial period. This results in the apparent prethermalization stage, where the slow degrees of freedom appear to be frozen. These quantities with slow dynamics can be viewed as quasi-conserved.Fagotti (); Mierzejewski et al. (2015) Emergence of such a quasi-conserved quantity is what accounts for the prethermalization stage. If such a quantity could develop an exact conservation law, this would extend the prethermalization to infinitely long time and would correspond to partial breakdown of the ETH, as envisioned, e.g., in Refs. Grover and Fisher, 2014; Garrison et al., 2017.

Motivated by this point of view, in this paper we numerically systematically search for such hidden quasi-conserved quantities which cannot be directly identified from the Hamiltonian itself. Following the “slowest operator formalism” introduced in Ref. Kim et al., 2015, we numerically construct the quasi-conserved local operator for the non-integrable spin model

 H=∞∑j=−∞(JZjZj+1+hZj+gXj) , (1)

where , , and denote Pauli matrices operating on site of the one-dimensional chain. We constrain our slowest operator to be translationally invariant and represented as a sum of local terms. We find that, in the large limit, there exists a quasi-conserved operator whose thermalization time scale increases exponentially as one increases its maximum range up to some point. Furthermore, the operator can be understood as a dressed “total spin-z operator” (for appropriately chosen spin axes). This operator has a very slow dynamics compared to other quantities. We also simulate the dynamics of the quantum spin chain following a quench and confirm that this quasi-conserved quantity has a non-trivial effect. Specifically, at intermediate times, the system equilibrates to a state which can be described by a generalized Gibbs ensemble (GGE) that includes such a quantity as an “integral of motion.” While our study cannot reach infinite maximum range, we find that the rate of decrease of the slowest operator with the maximum range becomes weaker beyond some point and starts resembling behavior observed in regimes of good thermalization. A conservative interpretation of this behavior is that our system shows only prethermalization with very long time scale. Nevertheless, the available data does not rule out a more exotic possibility that the slowest operator converges and becomes exactly conserved in the thermodynamic limit, which would indicate breakdown of the ETH.

The paper is organized as follows. In Sec. II, we briefly describe the formalism we use to search for the slowest operator in the translationally invariant setting. In Sec. III, we present our numerical results focusing on the scaling of the “residual norm” (i.e., norm of the commutator with the Hamiltonian) versus the maximum range of the operator. In the large coupling limit, we find that the residual norm shows exponential decay at least on short distances and identify the slowest operator as quasi-conserved operator. As a comparison, in Sec. IV, we use the Schrieffer-Wolff approach to perturbatively construct a quasi-conserved operator, which can be understood as a dressed total spin- operator. We find that in the large coupling limit, the overlap between the perturbative construction and exact numerical construction of the slowest operator is almost 100%; thus we understand the nature of the slowest operator in this regime, at least up to some value of the maximum range. In Sec. V, we examine the operator inverse participation ratio and the weight distribution in the slowest operator at different distances, demonstrating its localization in the operator space and real space. To verify the conjecture that this quasiconserved quantity results in prethermalization, we explicitly simulate a quench dynamics in Sec. VI and confirm the importance of the quasiconserved quantity when describing the equilibrated values at intermediate time. Finally, in Sec.VII, we summarize and discuss some outstanding questions. Several appendices all focus on the Schrieffer-Wolff approach: Appendix A presents a ladder algebra formalism convenient for analytical calculations at low order. Appendices B and C present some analytical bounds on the convergence of the Schrieffer-Wolff procedure, while Appendix D presents better bounds calculated numerically. Finally, Appendix E compares these bounds with exact numerical calculations, finding that the former are gross overestimations; we trace possible origins of these overestimations and consider how one might improve upon them and speculate about implications for the Schrieffer-Wolff approach.

## Ii Method of the slowest operator

Our motivation is to numerically search for the operator that “best-commutes” with the Hamiltonian. We focus on translationally-invariant Hermitian operators obtained as sums of local terms and adopt the formalism of Ref. Kim et al., 2015. We restate this approach as a problem in the operator Hilbert space as follows.

We consider traceless, and translationally-invariant operators with maximum range ,

 Q(M)=∞∑j=−∞q(M)j , (2)

where is an operator with support on a region extending from site to site . We denote the space of traceless translationally-invariant operators with maximum range as . The operator space is a vector space, as one can easily verify. A natural basis for is provided by “Pauli string operators,” i.e., operators of the form where can be , , , or acting on site , and are independent for different . However, there is a “gauge degree of freedom” for the representation of . For instance, we can write using or , etc. We fix the gauge by requiring the operator on the first site, , to be non-identity in every Pauli string basis vector, i.e., can only be , , or , while can be , , , or . This also automatically satisfies the tracelessness condition. The Hermiticity condition of an operator just corresponds to the condition of real coefficients in this basis. It is now easy to see that the dimension of is .

We define the Frobenius inner product (also know as Hilbert-Schmidt inner product) on the operator space as

 ⟨Q,Q′⟩=Tr[q†jq′j]Tr[I⊗M] , (3)

where are understood in the above gauge acting on sites only and is the identity operator also acting on sites. One can easily see that the aforementioned Pauli-string operators are advantageous as they form an orthonormal basis under this inner product. The above inner product defines the norm , which we can view as an “intensive Frobenius norm” (see below). For example, . Note that instead of the conventional definition of the operator inner product, here we only take the local piece in the trace calculation after the gauge fixing. This definition has the advantage that the norm is “intensive,” compared to the conventional definition of Frobenius norm that would increase with the system size. In fact, if we consider a chain of length with periodic boundary conditions and operators (assuming ), we can easily verify that the above inner product is simply appropriately scaled conventional Frobenius inner product:

 ⟨Q,Q′⟩=Tr[Q†Q′]/(LTr[I⊗L]) . (4)

In other words, Eq. (3) is obtained from Eq. (4) when applied to this “gauge-fixing” writing of the translationally-invariant operators. If one does not use the gauge-fixing, one should use Eq. (4) to calculate the inner product. In what follows, we will always use only the intensive Frobenius norm, often dropping the descriptor “intensive” for brevity.

A natural embedding for is obtained by the tensor product with the identities, , where and . We will not emphasize the difference between and , since it only depends on what operator space one is considering, while the inner product in Eq. (3) is independent of the embedding. We can further consider the norm closure , which is a mathematically well-defined Hilbert space.

The commutator with a fixed operator can be viewed as a linear map between the operator spaces. We define the superoperator

Clearly, is a linear map from the operator space to space , since . In fact, for any operator and , we have . Using the Pauli string basis, we can write down the matrix representation for , which in general will be a matrix. We want to find an operator in that “best commutes” with the Hamiltonian, which we define as minimizing the residual norm under the constraint . This corresponds to finding the smallest singular value of , or the smallest eigenvalue of , where . The corresponding eigenoperator is the sought-for slowest operator; we will denote this operator as and the corresponding eigenvalue as , which will be the squared residual norm of the slowest operator. To avoid the trivial zero-eigenvalue solution given by the Hamiltonian itself, we add to , with large enough such that the slowest operator is nontrivial. Thus found operator is orthogonal to in the Frobenius inner product.

Note that in the --- Pauli-string basis, is always a symmetric matrix with real coefficients. This guarantees the eigenvalues to be real, and the eigenvectors can be chosen with real amplitudes in the --- Pauli-string basis. This means that the slowest operator can always be chosen to be Hermitian. In other words, we fix the overall phase of the eigenoperator by requiring the Hermicity of the operator, up to a minus sign.

We can argue that this defines a procedure to find a translationally invariant (quasi)-local conserved quantity in the thermodynamic limit. Indeed, consider the limit . Since is a decreasing function of bounded from below by , exists. If and exists, then we have a normalizable operator [hence quasilocal or local if for some finite already] which commutes with the Hamiltonian. If such (quasi)local conserved quantity does exist, a suitable thermal equilibrium description should include this quantity in the GGE. On the other hand, even though an arbitrary linear combination of eigenstate projectors commutes with the Hamiltonian, can be non-normalizable under our definition of the Frobenius norm. It is therefore not guaranteed that . Furthermore, even if , we cannot guarantee that the limit exists. In practice, one can only find with finite, but we can try to explore these questions by studying behaviors for increasing .

### ii.1 Simplifications due to symmetries

The size of the matrix can be further reduced by using time-reversal and parity symmetries. The time-reversal operation corresponds to the complex conjugation in the basis; this maps , while leaving the other Pauli operators unchanged. Therefore, the time-reversal-even (-odd) sector corresponds to even (odd) number of Pauli operators in the Pauli string basis respectively.

The matrix can be further simplified by utilizing the parity (i.e., mirror) symmetry with respect to the origin. To illustrate how the parity operation acts on the --- Pauli-string basis, we consider an example of . Upon parity operation, , where in the last equality we gauge-fixed the writing of . We see that the parity operation acts on the operators in by reversing the order of operators in each of the Pauli-string basis vector and gauge-fixing the expression. More specifically, if , where and can only be , , or , then . We can therefore easily form the parity-even and -odd subspaces by forming basis vectors.

### ii.2 Algorithm

For small maximum range , we exactly diagonalize the matrix to find the lowest eigenvalue and the slowest operator. For larger maximum range , iterative methods are preferred since one can construct as a sparse matrix. While Lanczos method is one of the standard iterative algorithms to find the lowest eigenpair, the smallness of the relevant eigenvalues in the large regime makes the convergence extremely slow. Fortunately, the positive-definite character of the matrix enables us to adapt a conjugate-gradient-based algorithm. Here, we use the “locally optimal block preconditioned conjugate gradient method” from Ref. Knyazev, 2001 to find the lowest eigenpair.

## Iii Scaling of the squared residual norm

Figure 1 shows the -dependence of the squared residual norm on a log-log plot and a semilogrithmic plot. For small , the dependence is roughly a power law, which is consistent with the result in Ref. Kim et al., 2015 in the regime where the system has good ergodic behavior. On the other hand, for large , first decays exponentially with but then turns into a slower decay at larger . The exponential decay was also observed in the case of such “slowest operator” construction in the MBL phase.O’Brien et al. (2016) This exponential behavior differentiates the speed of the dynamics of this operator compared to other quantities. As one increases the maximum range, one can optimize the residual norm exponentially better, which also indicates longer thermalization time scale, since the residual norm is related to the speed of the dynamics of the operator (see Sec. III.2 below). We therefore expect this quantity to be quasi-conserved, which can affect the thermalization of the system.

Interestingly, the exponential decay of for the slowest operator does not continue to larger . Instead, the decay trend seems to turn into a power law at larger . As discussed in the previous section, even though the scaling trend turns into a slower decay at large , one always gets an equal or smaller residual norm as one increases . If the residual norm goes to zero as and exists, then we would indeed obtain a conserved quasilocal operator. However, due to limits on our numerical calculations, we cannot reach larger maximum range and cannot be conclusive about the behavior of at large . The eventual turn to a slower decay (similar to behavior in the good ergodic regime ) may be signaling that beyond some time the operator will thermalize. Hence, it may well be that the observed behavior corresponds to a prethermalization phenomenon on some intermediate time scales, where the time scale can be parametrically large.

### iii.1 Next-slowest operators

While the exponential scaling of the slowest operator for large suggests that it is quasi-conserved, one may wonder how many quasi-conserved quantities exist. To answer this question, we further study the scaling of the squared residual norm of the first five slowest operators in the time-reversal and parity even (odd) sector, denoted as “TePe” (“ToPo”) in Fig. 2. The operators in the “TePo” and “ToPe” sectors have higher squared residual norms than the ones shown in the figure and are hence less interesting and not included. Here we only show results that are accessible using the exact diagonalization of the matrix , or .

Figure 2(a) shows the scaling of for . Note that the slowest operator in this case has a similar scaling trend compared to other operators. Therefore the speed of the dynamics is not hierarchically slower than for other degrees of freedom.

On the other hand, in panels Figs. 2(b) and  2(c), the slowest operator clearly has faster scaling than the next-slowest operators. This is another feature suggesting that for large , the speed of the dynamics of the slowest operator is hierarchically slower than other operators, resulting in apparent freezing of its dynamics and hence the prethermalization phenomenon. We conclude that in these particular cases, there is only one quasi-conserved quantity. This differs from the proposal in Ref. Abanin et al., 2017b that there may be two independent quasi-conserved quantities (excluding the energy itself) in the strong coupling regime. We suspect that this difference comes from our separation of operators into independent ones using the orthogonality in the Frobenius inner product.

### iii.2 Relation to operator norm and thermalization time scale

Minimizing the commutator with respect to the Frobenius norm is advantageous because it can be relatively easily calculated numerically and is independent of the system size. On the other hand, to relate the smallness of the commutator to the dynamics, it is more appropriate to use the conventional operator norm. Indeed, following Ref. Kim et al., 2015, let us consider a quench setting where we start from some initial state . Using the Heisenberg representation of observables, , and denoting the expectation value of the operator , the deviation of the expectation value from its initial value can be estimated as

 |⟨QH(t)⟩−⟨Q⟩| = ∣∣∣⟨∫t0dτdQHdτ(τ)⟩∣∣∣ (6) ≤ ∫t0dτ∣∣∣⟨dQHdτ(τ)⟩∣∣∣ ≤ ∫t0dτ∥[H,QH(τ)]∥op=t∥[H,Q]∥op ,

where we have used for arbitrary , and the above inequality holds for any initial state. If we assume that has unit operator norm, we see that for to deviate from its initial value by an order-one number, the time scale is . For a general not normalized , including the suitable normalization gives the time scale .

Figure 3 demonstrates the comparison between the Frobenius norm measure and the operator norm measure of the smallness of the commutator , where the slowest operator is as before obtained by minimizing the residual Frobenius norm for given . Note that the operator norm per site of a translationally invariant operator like , unlike the intensive Frobenius norm defined earlier, depends on the system size and should be obtained in the thermodynamic limit (a familiar example is the ground-state energy per site of a translationally invariant Hamiltonian). However, we expect the size dependence to diminish for increasing . We confirmed this by calculating the operator norms by diagonalizing the corresponding operators on finite systems up to size , and Fig. 3 shows our results for the largest ; we were able to go only up to because the calculations became prohibitively expensive for larger . Unlike the residual Frobenius norm, the residual operator norm can increase with since the minimization procedure is not with respect to the operator norm. This can also potentially serve as a criterion for picking an “optimal” quasi-conserved operator for some that gives the minimum residual operator norm measure. However, we do not observe a clear minimum of the residual operator norm measure for the accessible . Nevertheless, we can already bound from below from the data. Thus, for , we can bound which is already very long; while for , we can bound from below by approximately .

While here we were able to calculate the operator norm explicitly numerically, it is instructive to consider the following crude bound for the prethermalization condition obtained from the scaling of the residual Frobenius norm. First, we note that we can write , where has maximum range . We then have (recall that here and below we use the intensive Frobenius norm). On the other hand, for , heuristically we can estimate , and we also have exact bound . We therefore obtain

 ∥[H,Q(M)0]∥op∥Q(M)0∥op≤2M+12∥[H,Q(M)0]∥F∥Q(M)0∥F=2M+12√λ0(M) , (7)

(which is nonrigorous bound). To maximize the thermalization time scale, we find by minimizing the right-hand side and obtain a crude criterion

 dlog10λ0(M)dM|M=¯M∗=−log102 . (8)

Thus, the optimal from this heuristic bound is determined as the point where the magnitude of the slope of vs drops below value (assuming that the magnitude of the slope is decreasing with , as observed in Fig. 1). We expect (the latter defined from the true operator-norm minimization).

The above arguments also show how one may reconcile the fact that while the Frobenius norm measure is always decreasing with , the thermalization time scale could still be finite. The actual data for the operator norm vs Frobenius norm in Fig. 3 shows that the operator norm measure is numerically close to the Frobenius norm over the available maximum range , particularly for large . That is, the factor of in the heuristic bound Eq. (7) between the two measures is an overestimate, and at least over this range of the Frobenius norm measure can be used to bound the speed of the dynamics.

We can understand the rough agreement between the Frobenius and operator norm measures if the operators and have roughly similar “profiles” in the operator space. Indeed, in this case, the numerators on both sides of the inequality in Eq. (7) and the denominators should have similar relations, which would cancel out in the ratio (while the overestimating factor arose from using different limits of the relations between the Frobenius and operator norms for the denominator and numerator). We expect this to be particularly true when is “localized” in real space, which we indeed find in the strong coupling regime at least for the available —see our understanding of the slowest operator from the perturbative SW picture in Sec. IV and direct measurements of its profile in Sec. V.2. We do start observing some deviations between the Frobenius and operator norm measures for larger , which could be indicating changing localization properties; however, the differences are still small to reach definite conclusions.

Examining carefully all data in Fig. 3, we would like to point out that even though for the operator norm measure is smaller than the one for , it does not imply that the system with will exhibit prethermalization. For a fair comparison of the dynamics, one also needs to compare the thermalization time scale of to other degrees of freedom in the same system. We indeed know from the previous section, cf. Fig. 2, that for , the next-slowest operators have comparable relaxation times to and the prethermalization phenomenon is less likely than for , where the slowest operator is more separated from the rest. This could explain our findings in Sec. VI of clear prethermalization at and no prethermalization at .

While the residual norm provides us some bound on the thermalization time scale, it is also important to obtain the physical meaning of the slowest operator. In the system in the good ergodic regime studied in Ref. Kim et al., 2015, in the nontranslationally invariant setting, the slowest operator can be understood as dressed energy density modulation operator. On the other hand, in the translationally invariant setting, the slowest operator does not have simple connection to the energy density modulation and its physical meaning remains an open question. In the MBL system, Ref. O’Brien et al., 2016 used this approach to explicitly construct the approximately conserved operators as local integrals of motion. As we will show in the next Sec. IV, the slowest operator we found in the large regime can be understood as a dressed total spin- operator, coming from the solvable limit , which can be viewed as a quasi-local integral of motion.

## Iv Schrieffer-Wolff Construction of Quasi-Conserved Quantity

Reference Abanin et al., 2017b used a renormalization scheme to construct an effective Hamiltonian which commutes with up to some order in small parameter, which can then be used to describe the prethermalization dynamics. Here, we use an approach with similar spirit but based on the local Schrieffer-Wolff (SW) transformationDatta et al. (1996); Bravyi et al. (2011) to construct a quasi-conserved operator perturbatively. The term “local” is stressed since the generators are solved in the form of sum of local terms, in contrast with the “global” SW transformation, where the generators are solved using projectors of the eigenspaces.Bravyi et al. (2011) The locality in particular allows us to construct the quasi-conserved quantity numerically to high order and measure its properties exactly, in contrast to the more abstract construction in Ref. Abanin et al., 2017b. A popular variant of a local SW transformation was in fact proposed in Ref. MacDonald et al., 1988 as a perturbative treatment of the Hubbard model in the large limit; this reference used generalized “ladder” operators connecting different Hubbard sectors, and we discuss the relation to our approach in App. A. Before proceeding, we briefly point some differences with Ref. Bravyi et al., 2011. First, our setup works in the thermodynamic limit from the start. More importantly, we choose the solution of Eq. (12) for the generator that eliminates the off-diagonal part of among all the sectors, while in Ref. Bravyi et al., 2011 one is only focusing on the off-diagonal part between the ground-state sector and other sectors.

We first describe the specific SW transformation used here and how we numerically construct a perturbation series for a quasi-conserved operator to -th order. We then calculate the squared residual norm of and the overlap between and to demonstrate the similarity between the two operators. We will see that the slowest operator in the large regime can be understood—at least up to the maximum range accessible in our work—as , which is essentially dressed “total spin- operator.”

### iv.1 Procedure of SW transformation

In the large- limit, we can decompose , with being our solvable limit and treated as perturbation with small parameter . [For example, we can define so that for convenience in the intensive Frobenius norm, but the specific choice is not important.] We construct a unitary transformation , with being Hermitian and -independent, such that the rotated Hamiltonian commutes with up to order in the formal expansion in . Stated another way, the eigenvalues of define the corresponding unperturbed sectors, and we want to have only sector-diagonal terms up to order in , while sector-off-diagonal terms are present only in higher order. If we then undo the rotation on back to the original picture, i.e., perform the inverse rotation to define , we obtain an operator that commutes with up to order by construction.

To be more specific, we follow Ref. Datta et al., 1996 and consider an expansion of in powers of :

where and

for . Here we have used the notation “” to mean the summation conditions for and , while the function , where if the condition in the argument is true and otherwise, and counts the number of elements in that are equal to . By construction, each is -independent; it enters with a coefficient and is part of the -th term in Eq. (9) for . Furthermore, collects all the terms with powers higher than .

The generators of the SW transformation are solved order by order by finding such that

where we have defined as a part of an operator that is “diagonal” in the sector label; i.e., is the component of the operator that commutes with . Equivalently, is the component of in the kernel (nullspace) of . The remainder is the “off-diagonal” part of the operator, and can be also viewed as a component of orthogonal to the kernel of in the Frobenius inner productDatta et al. (1996). We can solve for the generator

where is the pseudoinverse of . Note that solving Eq. (11) is determined only up to a component in the kernel of , and we make a choice here where such component is zero, i.e., is composed of only sector-off-diagonal operators; this is common choice in the SW approach, cf. Refs. MacDonald et al., 1988; Datta et al., 1996; Bravyi et al., 2011. The described procedure generates an effective Hamiltonian which commutes with up to order by truncating out , obtaining .

An important property of the above SW transformation is its locality, which ensures the representability of and in finite-dimensional operator spaces, making the SW procedure programmable as operations of matrices and vectors. In fact, one can show that for and we have and , see Ref. Bravyi et al., 2011 and Proposition B.1 in App. B.

We remark that the SW transformation generally does not converge when one takes the limit. There are rigorous results for the convergence of the ground state energy estimates for gapped HamiltoniansDatta et al. (1996); Bravyi et al. (2011) but no known results for the ability of the SW procedure to capture the entire spectrum of interest here. Nevertheless, the SW transformation is well-defined for any finite and can be used to obtain rigorous bounds on the dynamics in the spirit of Refs. Mori et al., 2016; Kuwahara et al., 2016; Abanin et al., 2017b. Thus, one can show that, for small enough , , see Ref. Bravyi et al., 2011 and Theorem B.1 in App. B. The dynamics described by in the rotated picture does not truly conserve but only approximately. In other words, while conserves , the “remainder” does not and is responsible for the eventual thermalization of the dynamics, which can be very slow if is small.

We can thus intuitively understand the prethermalization via this perturbative SW construction.Mori et al. (2016); Kuwahara et al. (2016); Abanin et al. (2017b); Else et al. () The solvable limit defines different sectors labeled by different integers, which can be viewed as counting the number (up to some off-set) of some emergent “particles.” (see also App. A). The perturbation term introduces interactions within the sectors and transitions between the sectors. The interactions within the sectors are indeed the “diagonal” part of . At -th order, the coefficient in the SW perturbation theory basically describes the transition amplitude of any process with inter-sector transitions. The generator is set to rotate the picture such that these processes are eliminated. The remaining part basically describes the processes which start and end in the same sector connected by times of the inter-sector transitions. The perturbation series would be convergent for small enough if there were at most of such processes. However, generically, in a translationally invariant system, there are order such processes coming from combinatorial factorials in . The exponential suppression of the transition amplitude is then not enough to suppress the factorial factor. Therefore, even though at high order of , the transition amplitude is perturbatively small , manifesting slowness of individual processes, there are, however, too many ways of the transitions such that the system will eventually thermalize.

### iv.2 Quasi-conserved quantity by SW transformation

Once we have obtained the generators for the SW transformation, we can rotate back to the original picture and obtain the quasi-conserved operator. Consider

 I≡UH0U†=H0+n∑m=1ϵmIm+I>n , (13)

where

and collects all the higher-power in terms. We then obtain the quasi-conserved operator . In Appendix B, we show that . To compare with the slowest operator, we remove the part of that is parallel to and normalize the resulting operator:

 I(n)⊥ = I(n)−H⟨H,I(n)⟩∥H∥2F , (15) ~I(n) = I(n)⊥∥I(n)⊥∥F . (16)

For small enough , we can bound the squared residual norm as

The proof of this bound and a more precise statement is in Appendix C.

Applying the previous heuristic argument for the thermalization time scale, Eq. (7), we get . If we treat the perturbation strength as given, and the SW order as an optimization parameter, then we can find that the residual operator norm is minimized at . The thermalization time scale is therefore maximized as . Note that unlike Refs. Mori et al., 2016; Kuwahara et al., 2016; Abanin et al., 2017a, b, where the heating rate is proven to be , we only obtain . This can be traced back to the estimation of the convergence radius in Appendices. B and C to be , hence the squared residual norm . We suspect that a tighter convergence radius is possible (see App. D); hence the bound on the thermalization time-scale could be improved to .ftn () Without pursuing this tighter bound further, we leave this for future studies.

As mentioned earlier, the locality of and allows us to formulate this procedure in finite-dimensional operator Hilbert spaces amenable to numerical calculations. Figure 1(b) shows the squared residual norm calculated from such SW construction of the quasi-conserved operator for several values of parameter . Note that at order , the constructed operator has maximum range . The trend of at large more or less follows the trend of , where the residual norm drops almost exponentially in low order, and turns into a slower trend, which is possibly a manifestation of the combinatorial factor . While not appearing in the figure yet for large , we expect will eventually start increasing at high enough order ; this is because in generic systems the combinatorial factors (like the ones appearing in the previous paragraph) will win over the exponential suppression at large enough ; such behavior of is observed in the and cases. Nevertheless, noting that the above arguments are based on the “worst-case-scenario” analytical bounds on the perturbatively-constructed operators, our numerical results for in the larger cases do not rule out the possibility that . On the other hand, unlike the perturbative construction, the numerical minimization for the slowest operator is guaranteed to get an equal or smaller residual norm when increasing .

Figure 4(a) shows the overlap between the slowest operator with maximum range and the SW construction with order up to . The overlap at large is almost ! Accordingly, we can understand the slowest operator we found in the large limit as the translationally invariant sum of the dressed spin- operator, or the dressed . Interestingly, there appears to be a strong change in behavior at . For , the slowest operator looks like the dressed spin- operator, with an exponential scaling of the residual norm for small ; on the other hand, for , the slowest operator does not look like the dressed spin- operator, and its residual norm has a power-law scaling.

Note that despite the fact that the SW construction and the slowest operator have very high overlap , where can be a very small number as shown in Fig. 4(b), the difference between their squared residual norms can still be sizable. Indeed, consider , where and is some operator perpendicular to in the Frobenius inner product. The normalization condition of gives , hence . The squared residual norm of is . We can thus see that

where we expressed everything in terms of the small number and kept only terms that are expected to dominate. Note that while is a small number, no such smallness is expected for since the deviation direction is not special in any way. Since , we expect that is a number of order in the energy units of (and could be larger depending on the range of typical terms in ), which could be sufficient to explain the visible difference between the two residual norms in Fig. 1(b) despite the high overlap between and .

## V Characterizing the Slowest Operators

In this section, we analyze some properties of the quasi-conserved operators that we found in Sec. II. We measure their “locality” in the operator space and in the real space, to contrast different behaviors of the slowest operators between small and large regimes.

### v.1 Operator inverse participation ratio

From the previous section, we expect that for large the quasi-conserved operator looks like a dressed spin operator. It is therefore reasonable to expect that should be a sum of a small number of Pauli string operators, analogous to the local integrals of motion in MBL studies. O’Brien et al. (2016) Using the Pauli string basis , , , (without forming the parity-invariant basis), we measure the operator inverse participation ratio (OIPR) 111Here we call the quantity in Eq. (18) “operator inverse participation ratio” so that it is consistent with usual definition, e.g., as used in single-particle localization problems where for a normalized wavefunction the inverse participation ratio is ; this convention is different from that in Ref. O’Brien et al., 2016. defined as

 OIPR(Q(M)0)=⎛⎝3⋅4M−1∑i=1|ai|4⎞⎠−1 , (18)

where ’s are the amplitude of the --- Pauli-string basis and we assumed normalization . The OIPR is bounded from below by .

Figure 5 shows the OIPR of the slowest operator for different . Interestingly, for larger , the OIPR seems to converge to a finite value at large enough . This behavior is consistent with our expectation that the quasi-conserved operator is a dressed total spin operator. The convergence of the OIPR indicates locality in the operator space. On the other hand, for small , the OIPR does not saturate but instead grows strongly with . This suggests that the slowest operators we found in the ergodic regime are composed of an extensive number of the Pauli string basis states; hence they are “delocalized” in the operator space.

### v.2 Real-space profile of the slowest operator

In this subsection, we examine the real-space shape of the slowest operator more closely. We define as the weight of on range- operators. In other words, we can decompose , with being an operator with range exactly equal to , and define . The normalization condition ensures that . Figure 6(a) shows the weights measured for the slowest operator with .

For large , the weight has an almost-exponential decay at small . Figures 6(b)-(e) show the weights for at fixed when increasing from to . From the plots, we can see that for large , the weight of the profile is peaked on -local operators, which we can understand already from the leading order SW construction, see Eq. (34) in App. A. We also see that the exponentially decaying part of at short distances is essentially converged, or independent of . However, the “shape” of the operator at long distances is not yet converged and is hence undetermined. Despite the fact that we can not determine the long-distance behavior for the slowest operators due to computational limitations, it is clear that the short-distance decay becomes slower when one decreases .

On the other hand, for small , there is no clear exponential decay even at short distance. In fact, for fixed and , the weights appear to decay faster than exponentially (with a Gaussian-like profile). However, the overall curve shifts to larger as one increases , with no apparent convergence to some fixed curve independent of . This suggests the non-normalizability for the operators in the small regime and is also consistent with the result of increasing OIPR as one increases , since there are more Pauli string operators involved in .

## Vi Dynamical Simulation

In order to demonstrate the effect of the quasi-conserved operator that we found in the large limit, we perform a quench dynamics calculation and observe an intermediate prethermalization state. We explicitly show that to describe the prethermalization state, one needs to include the slowest operator in the generalized Gibbs ensemble (GGE). We prepare the initial state as a product state with all spins pointing in the positive- direction, at time . We evolve the state under the Hamiltonian Eq. (1) as and measure the evolution of the magnetizations , where . We use time-evolved block-decimation (TEBD) methodVidal (2003) to simulate the quench dynamics in a system of length with open boundary conditions. We use second-order Trotter-Suzuki decomposition with Trotter step , which is sufficiently small to achieve the desired accuracy. We control truncations of the MPS using “cut-off” , which means that we discard singular values smaller than . We also use “bond dimension” , which means that we keep at most singular values. Two different sets of truncation parameters are used and compared against each other in order to estimate the effect of truncations on the MPS: and . Figure 7 shows the results of the TEBD calculations. The loss of norm (truncation error) seen in the insets is due to various truncations and provides some measure of the accuracy of the time evolution (note that it is roughly compensated in the magnetization measurements by normalizing at each , so the exhibited magnetizations are still reasonably accurate over the time range shown).

The effective inverse temperature for any initial state is determined by finding the parameter such that equation is satisfied, where . The thermal value is defined as , where is the associated Gibbs ensemble. Since , it is easy to verify that the effective inverse temperature for this initial state. As a result, for any traceless observable , the thermal value . Hence, if the system thermalizes, the magnetizations should approach zero.

Figure 7 shows the dynamical evolution of the magnetizations for parameters and for system size . For , even though the magnetizations have not fully equilibrated yet on our simulation times, we can see that they are fluctuating around zero, which is the expected thermal value. It is therefore reasonable to assume that the magnetizations are equilibrating toward zero, and the system thermalizes, without any prethermalization stage. On the other hand, for , it is visually clear that is approaching a sizable nonzero value. is also approaching a small nonzero value, even though it is less clear visually. The prethermalization stage persists over our simulation time, which is consistent with our bound on in Sec. III.2.

Crude features in the dynamics for can in fact be understood easily as the precession of the spins. If , the spins, which are pointing along direction initially, will precess under persistently. The term introduces interactions among the spins, resulting in the decay of the precession, therefore the damping of the magnetization oscillation. There is a simple quasiparticle description to understand the oscillation and the decay.Lin and Motrunich (2017) Viewing as the “total particle number,” part of the term introduces “hopping” of the “particles.” The oscillation frequency can essentially be understood as the quasiparticle excitation energy. Even if we modeled the quasiparticles using an integrable hard-core boson Hamiltonian, the oscillations will damp eventually. However, the equilibrium value (at least at this intermediate stage) is not described by the Gibbs ensemble.

Here we verify the conjecture that, to describe these intermediate equilibrium values, one needs to include the quasi-conserved quantity into a generalized Gibbs ensemble (GGE). The GGE in this case is , and . [Here we used the above form for the GGE rather than , since the former is easier to evaluate numerically where one only needs to diagonalize once, instead of diagonalizing for each pair of . Furthermore, since and almost commute, we expect the two expressions are approximately the same.] The parameters are determined by finding the values satisfying the following equations

 ⟨ψ|H|ψ