On complexity of the quantum Ising model
We study the complexity of several problems related to the Transverse field Ising Model (TIM). First, we consider the problem of estimating the ground state energy, known as the Local Hamiltonian Problem (LHP). It is shown that the LHP for TIM on degree-3 graphs is equivalent modulo polynomial reductions to the LHP for general $k$-local ‘stoquastic’ Hamiltonians with any constant $k \ge 2$. This result implies that estimating the ground state energy of TIM on degree-3 graphs is a complete problem for the complexity class StoqMA, an extension of the classical class MA. As a corollary, we complete the complexity classification of 2-local Hamiltonians with a fixed set of interactions proposed recently by Cubitt and Montanaro. Secondly, we study quantum annealing algorithms for finding ground states of classical spin Hamiltonians associated with hard optimization problems. We prove that quantum annealing with TIM Hamiltonians is equivalent modulo polynomial reductions to quantum annealing with a certain subclass of $k$-local stoquastic Hamiltonians. This subclass includes all Hamiltonians representable as a sum of a $k$-local diagonal Hamiltonian and a 2-local stoquastic Hamiltonian.
- 1 Introduction and summary of results
- 2 Hard-core bosons and dimers
- 3 Simulation of eigenvalues and eigenvectors
- 4 Schrieffer-Wolff transformation and perturbative reductions
- 5 Reduction from degree-3 graphs to general graphs
- 6 Reduction from TIM to dimers
- 7 Reduction from dimers to range-2 bosons
- 8 Range-2 bosons with multi-particle interactions
- 9 Reduction from range-2 bosons to range-1 bosons
- 10 Range-1 bosons with a controlled hopping
- 11 From range-1 bosons to 2-local stoquastic Hamiltonians
- 12 Proof of the main theorems
- A Bounds on the energy splitting and matrix elements for the Ising chain
1 Introduction and summary of results
Numerical simulation of quantum many-body systems is a notoriously hard problem. A particularly strong form of hardness known as QMA-completeness [1] has been recently established for many natural problems in this category. Among them is the problem of estimating the ground state energy for certain physically-motivated quantum models such as Hamiltonians with nearest-neighbor interactions on the two-dimensional [2] and one-dimensional [3, 4] lattices, the Hubbard model [5, 6], and the Heisenberg model [5, 7]. In contrast, a broad class of Hamiltonians known as sign-free or stoquastic [8] has been identified for which certain simulation tasks become more tractable. By definition, stoquastic Hamiltonians must have real matrix elements with respect to some fixed basis and all off-diagonal matrix elements must be non-positive. Ground states of stoquastic Hamiltonians are known to have real non-negative amplitudes in the chosen basis. Thus, for many purposes, the ground state can be viewed as a classical probability distribution, which often enables efficient simulation by quantum Monte Carlo algorithms [9, 10, 11, 12]. A notable example of a model in this category is the transverse field Ising model (TIM). It has a Hamiltonian
$$H = \sum_{1 \le u \le n} \left( h_u X_u + g_u Z_u \right) + \sum_{1 \le u < v \le n} g_{u,v} Z_u Z_v. \qquad (1)$$
Here $n$ denotes the number of qubits (spins), $h_u$, $g_u$, $g_{u,v}$ are real coefficients, and $X_u$, $Z_u$ are the Pauli operators acting on a qubit $u$. Note that $H$ is a stoquastic Hamiltonian in the standard $Z$-basis iff $h_u \le 0$ for all $u$. This can always be achieved by conjugating $H$ with $Z_u$ for every qubit $u$ with $h_u > 0$. It is known that the ground state energy and the free energy of the TIM can be approximated to within a prescribed additive error in polynomial time using Monte Carlo algorithms [13] in the special case when the Ising interactions are ferromagnetic, that is, $g_{u,v} \le 0$ for all $u, v$. Another important special case is the TIM defined on the one-dimensional lattice with $g_u = 0$. In this case the Hamiltonian Eq. (1) is exactly solvable by the Jordan-Wigner transformation and its eigenvalues can be computed analytically [14]. The ground state and the thermal equilibrium properties of the TIM have been studied in many different contexts including quantum phase transitions [15], quantum spin glasses [16, 17] and quantum annealing algorithms [18, 19, 20, 21]. In the present paper we address two open questions related to the TIM. First, we consider the problem of estimating the ground state energy of the TIM and fully characterize its hardness in terms of the known complexity classes. Secondly, we study quantum annealing algorithms with TIM Hamiltonians and show that such algorithms can efficiently simulate a much broader class of quantum annealing algorithms associated with many important classical optimization problems.
To state our main results let us define two classes of stoquastic Hamiltonians. Let $\mathsf{TIM}(n, J)$ be the set of all $n$-qubit transverse field Ising Hamiltonians defined in Eq. (1) such that the coefficients $h_u$, $g_u$, $g_{u,v}$ have magnitude at most $J$ for all $u, v$. A TIM Hamiltonian is said to have interactions of degree $d$ iff each qubit is coupled to at most $d$ other qubits with $ZZ$ interactions. Such a Hamiltonian can be embedded into a degree-$d$ graph such that only nearest-neighbor qubits interact. We note that the terms of $H$ that are linear in $Z_u$ can be absorbed into the Ising interaction part by introducing one ancillary qubit and replacing each $Z_u$ by the product of $Z_u$ with the Pauli $Z$ operator of the ancilla. This transformation does not change the spectrum of $H$ except for doubling the multiplicity of each eigenvalue, see Section 2 for details.
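The stoquasticity condition on the TIM can be checked numerically on a small instance. The following is a minimal sketch, assuming the TIM has the standard form with a transverse-field term, a longitudinal-field term and pairwise Ising couplings; the names `op_on`, `tim_hamiltonian`, `is_stoquastic` and all numerical coefficients are hypothetical and chosen only for illustration. It conjugates the Hamiltonian by Pauli $Z$ on every qubit whose transverse field is positive and checks that the result has non-positive off-diagonal entries, the same spectrum, and a ground state with non-negative amplitudes.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])


def op_on(n, site_ops):
    """Tensor product over n qubits; site_ops maps a qubit index to its 2x2 operator."""
    out = np.eye(1)
    for u in range(n):
        out = np.kron(out, site_ops.get(u, I2))
    return out


def tim_hamiltonian(h, g, gzz):
    """Dense matrix of sum_u (h[u] X_u + g[u] Z_u) + sum_{u<v} gzz[u, v] Z_u Z_v."""
    n = len(h)
    H = np.zeros((2**n, 2**n))
    for u in range(n):
        H += h[u] * op_on(n, {u: X}) + g[u] * op_on(n, {u: Z})
        for v in range(u + 1, n):
            H += gzz[u, v] * op_on(n, {u: Z, v: Z})
    return H


def is_stoquastic(H, tol=1e-12):
    """True if every off-diagonal entry of the real matrix H is non-positive."""
    off_diagonal = H - np.diag(np.diag(H))
    return bool(np.all(off_diagonal <= tol))


# Hypothetical 4-qubit instance; the numbers carry no physical meaning.
h = np.array([0.7, -0.3, 1.1, 0.5])
g = np.array([0.2, -0.4, 0.0, 0.9])
gzz = np.triu(np.random.default_rng(0).normal(size=(4, 4)), k=1)

H = tim_hamiltonian(h, g, gzz)

# Conjugation by Z_u flips the sign of the X_u term and leaves Z terms unchanged.
U = op_on(4, {u: Z for u in range(4) if h[u] > 0})
H_stoq = U @ H @ U

energies, states = np.linalg.eigh(H_stoq)
# Fix the arbitrary global sign of the ground state returned by eigh.
ground = states[:, 0] * np.sign(states[:, 0].sum())

print(is_stoquastic(H), is_stoquastic(H_stoq), ground.min() >= -1e-9)
```

Dense matrices are used only because the instance is tiny; the construction scales exponentially in the number of qubits and is not meant as a simulation method.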
Let be the set of stoquastic -local Hamiltonians on qubits with the maximum interaction strength . By definition, iff
where is a hermitian operator acting on the qubits such that and all off-diagonal matrix elements of in the standard basis are real and non-positive. One can choose different operators for each pair of qubits. We shall provide a more explicit characterization of -local stoquastic Hamiltonians in terms of their Pauli expansion in Section 11, see Lemma 9.
Our first theorem asserts that any -local stoquastic Hamiltonian can appear as an effective low-energy theory emerging from the TIM on a degree- graph.
Theorem 1. Consider any Hamiltonian and a precision parameter . There exist , , and a Hamiltonian such that
(1) The -th smallest eigenvalues of and differ at most by for all .
(2) One can compute in time .
(3) has interactions of degree .
Here the maximum degree of all polynomial functions is some fixed constant that does not depend on any parameters (although we expect this constant to be quite large). The theorem has important implications for classifying complexity of the Local Hamiltonian Problem (LHP) [1, 22]. Recall that the LHP is a decision problem where one has to decide whether the ground state energy of a given Hamiltonian acting on qubits is sufficiently small, , or sufficiently large, . Here are some specified thresholds such that . The Hamiltonian must be representable as a sum of hermitian operators acting on at most qubits each, where is a small constant. Each -qubit operator must have norm at most . Such Hamiltonians are known as -local. Theorem 1 implies that the LHP for -local stoquastic Hamiltonians has the same complexity as the LHP for TIM. Indeed, consider an instance of the LHP for some Hamiltonian where . Choose a precision and let be the TIM Hamiltonian constructed in Theorem 1. Note that acts on qubits and has the interaction strength . Let be the ground state energy of . Then implies and implies . Since , the LHP for a -local stoquastic Hamiltonian has been reduced to the LHP for TIM. The converse reduction is trivial since any TIM Hamiltonian can be made stoquastic by a local change of basis. Thus we obtain
Corollary 1. The LHP for 2-local stoquastic Hamiltonians has the same complexity as the LHP for TIM with interactions of degree 3, modulo polynomial reductions.
It is known that the LHPs for -local and -local stoquastic Hamiltonians have the same complexity for any constant , modulo polynomial reductions . Thus estimating the ground state energy of TIM on a degree- graph is as hard as estimating the ground state energy of a general -local stoquastic Hamiltonian for . Furthermore, the LHP for -local stoquastic Hamiltonians is known to be a complete problem for the complexity class [8, 23]. This is an extension of the classical class where the verifier can accept quantum states as a proof. To examine the proof the verifier is allowed to apply classical reversible gates in a coherent fashion and, finally, measure some fixed qubit in the -basis. The verifier accepts the proof if the measurement outcome is ’’. Let be the acceptance probability of the verifier for a given problem instance maximized over all possible proofs. A decision problem belongs to if there exist a polynomial-size verifier as above and threshold probabilities such that for any yes-instance and for any no-instance . Here is the length of the problem instance , see  for a formal definition. Combining these known results and Corollary 1 we obtain
Corollary 2. The Local Hamiltonian Problem for TIM with interactions of degree 3 is complete for the complexity class StoqMA.
Finally, Theorem 1 completes the complexity classification of -local Hamiltonians with a fixed set of interactions proposed recently by Cubitt and Montanaro . The problem studied in  is defined as follows. Let be a fixed set of two-qubit hermitian operators. Consider a special case of the -local LHP such that Hamiltonians are required to have a form , where is a real coefficient and is an operator from applied to some pair of qubits. For brevity, let us call the above problem -LHP. The main result of Ref.  is that depending on the choice of , the problem -LHP is either complete for one of the complexity classes , , or can be solved in polynomial time on a classical computer, or can be reduced in polynomial time to the LHP for TIM. In addition, one can efficiently determine which case is realized for a given choice of . Combining this result and Corollary 2 one obtains
Corollary 3. For any fixed set of two-qubit hermitian operators, the LHP restricted to interactions from that set is either complete for one of the complexity classes NP, StoqMA, QMA, or can be solved in polynomial time on a classical computer.
We also prove an analogue of Theorem 1 which gives new insights on the power of quantum annealing (QA) algorithms [18, 24] with TIM Hamiltonians, which have received significant attention recently [19, 20, 21]. Recall that QA attempts to find a global minimum of a real-valued function that depends on binary variables by encoding into a diagonal problem Hamiltonian acting on qubits. To find the ground state of one chooses an adiabatic path , , where is some simple Hamiltonian usually chosen as the transverse magnetic field, . Initializing the system in the ground state of and traversing the adiabatic path slowly enough one can approximately prepare the ground state of . The running time of QA algorithms scales as , where is the minimum spectral gap of , see [18, 24, 25]. We focus on the special case of QA such that the objective function is a sum of terms that depend on at most $k$ variables each. Here $k$ is some small constant. This includes well-known optimization problems such as $k$-SAT, MAX-$k$-SAT and many variations thereof. We show that any quantum annealing algorithm as above can be efficiently simulated by quantum annealing with TIM Hamiltonians. The simulation has a slowdown at most .
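The role of the minimum spectral gap can be illustrated on a toy instance. The sketch below rests on assumptions that are not taken from the text: the adiabatic path is a linear interpolation between a transverse-field driver (minus the sum of Pauli $X$ operators) and the diagonal problem Hamiltonian, and the objective function is a hypothetical quadratic function of three binary variables with random weights. All names (`x_on`, `path`, `w1`, `w2`) are illustrative.

```python
import itertools
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])


def x_on(n, u):
    """Pauli X acting on qubit u of an n-qubit register."""
    out = np.eye(1)
    for q in range(n):
        out = np.kron(out, X if q == u else I2)
    return out


n = 3
rng = np.random.default_rng(2)
w1 = rng.normal(size=n)                      # hypothetical linear weights
w2 = np.triu(rng.normal(size=(n, n)), k=1)   # hypothetical pairwise weights

# Row x of `bits` is the bit string of basis state x (qubit 0 is the leftmost bit).
bits = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
f = bits @ w1 + np.einsum("xu,uv,xv->x", bits, w2, bits)

H_problem = np.diag(f)                        # diagonal problem Hamiltonian
H_driver = -sum(x_on(n, u) for u in range(n)) # assumed transverse-field driver


def path(s):
    """Assumed linear adiabatic path from the driver (s = 0) to the problem (s = 1)."""
    return (1.0 - s) * H_driver + s * H_problem


gaps = []
for s in np.linspace(0.0, 1.0, 51):
    E = np.linalg.eigvalsh(path(s))
    gaps.append(E[1] - E[0])
min_gap = min(gaps)

# At s = 1 the ground state is a basis vector labelling a minimizer of f.
_, vecs = np.linalg.eigh(path(1.0))
best = int(np.argmax(np.abs(vecs[:, 0])))
print(min_gap, best, int(np.argmin(f)))
```

The gap is sampled on a grid only, so `min_gap` is an upper estimate of the true minimum gap along the path.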
Fix some integer . We will say that is a -local stoquastic Hamiltonian iff is a sum of a -local stoquastic Hamiltonian and a -local diagonal Hamiltonian. Let be the set of all -local stoquastic Hamiltonians on -qubits with the maximum interaction strength .
Theorem 2. Consider any Hamiltonian with a non-degenerate ground state and a spectral gap . There exist , , a Hamiltonian , and an isometry such that
(1) has a non-degenerate ground state and a spectral gap at least .
(3) The isometry maps basis vectors to basis vectors.
(4) One can compute and the action of on any basis vector in time .
Here the maximum degree of all polynomial functions depends only on the locality parameter . We note that one can replace the constant in condition (2) by an arbitrary precision parameter . Then the same theorem holds with a scaling . One can also impose a restriction that the Hamiltonian has interactions of degree-. Then a similar theorem holds, but the isometry has slightly more complicated properties, see Section 12 for details.
Let us discuss implications of the theorem. Suppose is an adiabatic path such that is the problem Hamiltonian and . We assume that has a non-degenerate ground state and a spectral gap at least for all . Also we assume that . Since can be adiabatically rotated to without closing the gap, we can modify the path such that . Then the initial ground state is . Applying Theorem 2 to each Hamiltonian one obtains a family of TIM Hamiltonians such that has a non-degenerate ground state , the spectral gap at least , and the interaction strength at most . We will show that the map is sufficiently smooth, so that the family , , defines an adiabatic path and the time it takes to traverse the paths and differ at most by a factor , see Section 12 for details. Therefore one can (approximately) prepare the final state by initializing the system in the basis state and traversing the path . Measuring every qubit of the final state in the -basis one obtains a string of outcomes such that . Then , that is, the ground state of can be efficiently computed from . Thus we obtain
Any quantum annealing algorithm with -local stoquastic Hamiltonians can be simulated by a quantum annealing algorithm with TIM Hamiltonians. The simulation has overhead at most , where is the number of qubits and is the minimum spectral gap of the adiabatic path.
In the rest of this section we informally sketch the proof of the main theorems, discuss several open problems, and outline organization of the paper.
Sketch of the proof. The proof of Theorems 1,2 relies on perturbative reductions [22, 2] and the Schrieffer-Wolff transformation [26, 27, 28]. At each step of the proof we work with two quantum models: a simulator Hamiltonian acting on some Hilbert space and a target Hamiltonian acting on a certain subspace
Table 1. Models in the chain of reductions, ordered from the highest to the lowest energy scale.
|TIM, degree-3 graph|
|TIM, general graph|
|Hard-core dimers, triangle-free graph|
|Hard-core bosons, range-2|
|Hard-core bosons, range-1|
|Hard-core bosons, range-1, controlled hopping|
|2-local stoquastic Hamiltonians|
We apply the above step recursively several times such that the target Hamiltonian at each step becomes the simulator Hamiltonian at the next step. The recursion starts from the TIM with interactions of degree 3 at the highest energy scale, goes through several intermediate models listed in Table 1, and arrives at the given stoquastic Hamiltonian of Theorem 1 or Theorem 2 at the lowest energy scale. Overall, the proof requires nine different reductions.
For almost all of our reductions the Hamiltonian is diagonal in the standard basis, so that all eigenvalues and eigenvectors of can be easily computed. The only exception is the reduction from TIM with interactions of degree- to a general TIM. For this reduction we encode each qubit of the target model into the approximately two-fold degenerate ground subspace of the one-dimensional TIM on a chain of a suitable length. Accordingly, the Hamiltonian describes a collection of one-dimensional TIMs. We simulate the logical Ising interaction between some pair of logical qubits by applying the physical interaction to a properly chosen pair of qubits and , where is the chain encoding a logical qubit . The logical transverse field is automatically generated due to the energy splitting between the ground states of . The analysis of this reduction exploits recent exact results on the form-factors of the one-dimensional TIM .
We emphasize that the word “reduction” is used in two distinct senses. In the present paper we speak of a perturbative reduction from a Hamiltonian to a Hamiltonian when is the effective low-energy Hamiltonian derived from , following terminology in physics. However, if belongs to some particular class of Hamiltonians and belongs to some subclass , this is a reduction from the class to the class , according to terminology in computer science.
Open problems. Our work raises several questions. First, we expect that Theorems 1,2 can be extended in a number of ways. For example, one may ask whether the analogue of Theorem 1 holds for TIM Hamiltonians restricted to particular families of graphs, such as planar graphs or regular lattices. We note that a simple modification of our degree reduction method based on the one-dimensional TIM produces a simulator Hamiltonian which can be embedded into the 3D lattice of dimensions with periodic boundary conditions. We expect that applying additional perturbative reductions such as those described in Ref.  can further simplify the lattice. Likewise, we expect that Theorem 2 can be extended to the case when is a general -local stoquastic Hamiltonian by applying perturbative reductions of Ref. .
A challenging open question is whether TIM Hamiltonians defined on a 2D lattice can realize the topological quantum order. It has been recently shown that the hard-core bosons model defined on the kagome lattice has a topologically ordered ground state for a certain range of parameters [30, 31]. A preliminary analysis shows that the chain of reductions from TIM to hard-core bosons described in the present paper can be modified such that all intermediate Hamiltonians have geometrically local interactions. Assuming that the unphysical polynomial scaling of interactions in the simulator Hamiltonian can be avoided [27, 32], this points towards existence of topologically ordered phases described by TIM Hamiltonians.
Finally, a big open question is whether QA algorithms with TIM Hamiltonians can be efficiently simulated classically. It has been recently shown that the general-purpose quantum Monte Carlo algorithms fail to simulate certain instances of the QA with TIM efficiently, even though these instances have a non-negligible minimum spectral gap. This leaves a possibility that some more specialized algorithms taking advantage of the special structure of TIM Hamiltonians can succeed even though the general-purpose algorithms fail. Our results demonstrate that this is unlikely, since simulating the QA with TIM is as hard as simulating the QA with much more general -local stoquastic Hamiltonians.
The paper is organized as follows. Section 2 contains a rigorous definition of the models listed in Table 1. Our main technical tools are introduced in Sections 3,4 which present a general definition of a simulation, describe perturbative reductions based on the Schrieffer-Wolff transformation, and prove several technical lemmas used in the rest of the paper. Section 5 shows how to simulate a general TIM Hamiltonian using a special case of TIM with interactions of degree-. Sections 6-11 describe a chain of perturbative reductions between the models listed in Table 1. These reductions are combined together in Section 12 which contains the proof of Theorems 1,2. Finally, Appendix A proves certain bounds on eigenvalues and form-factors of the one-dimensional TIM which are used in Section 5.
2 Hard-core bosons and dimers
Consider a graph with a set of nodes and a set of edges . Define a Hilbert space with an orthonormal basis such that basis vectors are labeled by subsets of nodes . We shall identify subsets of nodes with configurations of particles that live at nodes of the graph. Each node can be either empty or occupied by a single particle. For any node define a particle number operator such that if and otherwise. We shall often consider diagonal Hamiltonians of the following form:
Here the second sum runs over all two-node subsets (not only nearest neighbors). The coefficients and can be viewed as a chemical potential and a two-particle interaction potential respectively.
Let us now define a hopping operator . Here are arbitrary nodes such that . By definition, annihilates any state in which both nodes are occupied or both nodes are empty. If one of the nodes is occupied and the other node is empty, transfers a particle from to or vice versa. Matrix elements of in the chosen basis are
Let be fixed integers. Define a subspace spanned by all subsets with exactly nodes. We shall refer to as an -particle sector. Obviously, the operators and preserve . A subset of nodes is said to be -sparse iff the graph distance between any distinct pair of nodes is at least . Define a subspace spanned by all -sparse subsets with exactly nodes. By definition, any subset of nodes is -sparse, so that . Note that the operators generally do not preserve . Below we consider hopping operators projected onto the subspace . Matrix elements of a projected hopping operator are defined by Eq. (3), where and run over all -sparse subsets of nodes.
Our first model is called hard-core bosons (HCB). It is defined on the Hilbert space , where and are fixed parameters. We shall refer to as the range of the model. The Hamiltonian is
Here is defined by Eq. (2) and all operators are projected onto the subspace . Thus moves a particle only if this does not violate the -sparsity condition. Otherwise annihilates a state. The coefficients are hopping amplitudes. We shall always assume that
for all . The coefficients and in may have arbitrary signs. Note that is a stoquastic Hamiltonian. Let be the set of Hamiltonians describing the -particle sector of range- hard-core bosons on a graph with nodes such that all the coefficients have magnitude at most . Here we take the union over all graphs with nodes. Our proof will only use HCB models with the range . Later on we shall define certain enhanced versions of the HCB which have multi-particle interactions, see Section 8, and/or controlled hopping terms, see Section 10. We note that the HCB model with non-positive hopping amplitudes has been recently studied by Childs, Gosset, and Webb  who showed that the corresponding LHP is -complete.
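The projected hopping operators can be made concrete on a small graph. The following is a minimal sketch under stated assumptions: the hopping operator has matrix element 1 between two configurations that differ by moving one particle between the two nodes, hopping is placed on the edges of the graph only, and every hopping coefficient is taken non-positive so that the resulting matrix is stoquastic. The exact form and sign convention of the model's equations are not reproduced here, and the helper names (`graph_distances`, `sparse_subsets`, `projected_hopping`) are hypothetical.

```python
import itertools
import numpy as np


def graph_distances(n, edges):
    """All-pairs shortest-path distances (Floyd-Warshall); inf if disconnected."""
    d = np.full((n, n), np.inf)
    np.fill_diagonal(d, 0)
    for u, v in edges:
        d[u, v] = d[v, u] = 1
    for k in range(n):
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    return d


def sparse_subsets(n, edges, m, r):
    """All m-node subsets whose pairwise graph distance is at least r."""
    d = graph_distances(n, edges)
    return [frozenset(S) for S in itertools.combinations(range(n), m)
            if all(d[a, b] >= r for a, b in itertools.combinations(S, 2))]


def projected_hopping(basis, u, v):
    """Hopping between u and v restricted to `basis`.

    A particle moves only if exactly one of u, v is occupied and the new
    configuration is again in `basis`; otherwise the state is annihilated.
    """
    index = {S: i for i, S in enumerate(basis)}
    W = np.zeros((len(basis), len(basis)))
    for S, i in index.items():
        if (u in S) != (v in S):
            j = index.get(S ^ {u, v})
            if j is not None:
                W[j, i] = 1.0
    return W


edges = [(0, 1), (1, 2), (2, 3), (3, 4)]        # path graph on 5 nodes
basis = sparse_subsets(5, edges, m=2, r=2)      # 2-particle, 2-sparse sector
W01 = projected_hopping(basis, 0, 1)

t = -1.0                                         # assumed non-positive hopping coefficient
mu = np.array([0.3, -0.1, 0.0, 0.2, -0.4])       # hypothetical chemical potentials
H = sum(t * projected_hopping(basis, u, v) for u, v in edges)
H = H + np.diag([sum(mu[u] for u in S) for S in basis])

print(len(basis), int(W01.sum()))
```

On this path graph the configuration with particles at nodes 0 and 2 is annihilated by the hop between nodes 0 and 1, since moving the particle would place two particles at distance one and leave the 2-sparse sector.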
Our second model is called hard-core dimers. This model also depends on a graph . We shall only consider triangle-free graphs . Let be a fixed integer parameter. A subset of nodes is said to be a dimer iff for some pair of nodes such that . Define an -dimer as a subset of nodes that can be represented as a disjoint union of dimers such that the graph distance between and is at least three for all . This particular choice of the distance guarantees that -dimers can be represented as ground states of a suitable Ising Hamiltonian, see Lemma 8 in Section 6. Examples of -dimers are shown on Fig. 1.
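The definition of multi-dimer configurations can be checked by brute force on a small triangle-free graph. The sketch below enumerates all node subsets that are unions of dimers (edges) whose pairwise graph distance is at least three, as in the definition above; the function names and the choice of a 10-node cycle are illustrative assumptions.

```python
import itertools
import numpy as np


def graph_distances(n, edges):
    """All-pairs shortest-path distances (Floyd-Warshall); inf if disconnected."""
    d = np.full((n, n), np.inf)
    np.fill_diagonal(d, 0)
    for u, v in edges:
        d[u, v] = d[v, u] = 1
    for k in range(n):
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    return d


def multi_dimers(n, edges, m, min_dist=3):
    """Node subsets formed by m dimers with pairwise distance >= min_dist.

    Distance >= 3 already forces the dimers to be disjoint, because two
    dimers sharing a node are at distance 0.
    """
    d = graph_distances(n, edges)

    def far_apart(e, f):
        return min(d[a, b] for a in e for b in f) >= min_dist

    found = set()
    for dimers in itertools.combinations(edges, m):
        if all(far_apart(e, f) for e, f in itertools.combinations(dimers, 2)):
            found.add(frozenset(node for e in dimers for node in e))
    return found


cycle = [(i, (i + 1) % 10) for i in range(10)]   # triangle-free 10-cycle
counts = [len(multi_dimers(10, cycle, m)) for m in (1, 2, 3)]
print(counts)
```

On the 10-cycle each dimer has three admissible partners, and three mutually distant dimers would need at least twelve nodes, so the third sector is empty.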
Let be the subspace spanned by all basis vectors such that is an -dimer. Note that the operators generally do not preserve . Below we consider hopping operators projected onto the subspace . Matrix elements of a projected hopping operator are defined by Eq. (3), where and run over all -dimers. The hard-core dimers (HCD) model has a Hilbert space and a Hamiltonian
where is defined by Eq. (2) and all operators are projected onto the subspace . The sum in Eq. (5) runs over all pairs of nodes (not only nearest neighbors). Although the Hamiltonian does not explicitly depend on the graph structure, the underlying Hilbert space does depend on the graph since the latter determines which subsets of nodes are -dimers. A hopping process induced by can change a dimer to some other dimer with , see Fig. 1. The coefficient is a hopping amplitude. We shall assume that . Then is a stoquastic Hamiltonian. Let be the set of Hamiltonians describing the -dimer sector of the hard-core dimers model on a graph with nodes such that all coefficients in have magnitude at most . Here we take the union over all triangle-free graphs with nodes.
Some perturbative reductions described below will alter the underlying graph . Whenever the choice of is not clear from the context, we shall use more detailed notations , , and instead of , , and . Our notations for various classes of Hamiltonians are summarized in Table 2.
|Transverse field Ising Model|
|-dimer sector of Hard-Core Dimers model.|
|-particle sector of Hard-Core Bosons with range .|
|same as .|
|with controlled hopping terms.|
|Stoquastic -Local Hamiltonians.|
Finally, consider a TIM Hamiltonian defined in Eq. (1). Let us add an ancillary qubit labeled by ’’ and consider a modified Hamiltonian defined as
Let be the global spin flip operator. Note that commutes with and , whereas and anti-commute. This implies that the restriction of onto the sectors and have exactly the same spectrum as the original Hamiltonian . Hence the full spectrum of is obtained from the one of by doubling the multiplicity of each eigenvalue. In particular, the LHPs for Hamiltonians Eqs. (1,6) have the same complexity. Finally, substituting into Eq. (1) one gets
where , and . Here we ignore the overall energy shift. Clearly, the coefficients and have magnitude at most . Below we shall work with TIM Hamiltonians as defined in Eq. (7).
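The claim that the ancillary qubit only doubles the multiplicity of each eigenvalue can be verified numerically. This is a minimal sketch, assuming the modified Hamiltonian is obtained by replacing each longitudinal-field term with an Ising coupling between that qubit and the ancilla; placing the ancilla at index 0 and all numerical coefficients are arbitrary illustrative choices.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])


def op_on(n, site_ops):
    """Tensor product over n qubits; site_ops maps a qubit index to its 2x2 operator."""
    out = np.eye(1)
    for u in range(n):
        out = np.kron(out, site_ops.get(u, I2))
    return out


def tim_hamiltonian(h, g, gzz, with_ancilla=False):
    """TIM matrix; with_ancilla=True replaces g[u] Z_u by g[u] Z_anc Z_u."""
    n = len(h)
    shift = 1 if with_ancilla else 0   # the ancilla occupies index 0 when present
    N = n + shift
    H = np.zeros((2**N, 2**N))
    for u in range(n):
        H += h[u] * op_on(N, {u + shift: X})
        field = {0: Z, u + shift: Z} if with_ancilla else {u: Z}
        H += g[u] * op_on(N, field)
        for v in range(u + 1, n):
            H += gzz[u, v] * op_on(N, {u + shift: Z, v + shift: Z})
    return H


rng = np.random.default_rng(1)
n = 3
h = rng.normal(size=n)
g = rng.normal(size=n)
gzz = np.triu(rng.normal(size=(n, n)), k=1)

spectrum = np.linalg.eigvalsh(tim_hamiltonian(h, g, gzz))
spectrum_anc = np.linalg.eigvalsh(tim_hamiltonian(h, g, gzz, with_ancilla=True))

# Every eigenvalue of the original model appears exactly twice.
print(np.allclose(spectrum_anc, np.repeat(spectrum, 2)))
```

The check works because the modified Hamiltonian is block diagonal in the ancilla's $Z$ eigenvalue, and the two blocks are mapped to each other by flipping all original spins.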
3 Simulation of eigenvalues and eigenvectors
In this section we give a formal definition of a simulation. It quantifies how close two different models are in terms of their low-energy properties such as the low-lying eigenvalues and eigenvectors. We consider a target model described by a Hamiltonian acting on some -dimensional Hilbert space and a simulator model described by a Hamiltonian acting on some Hilbert space of dimension at least . Our definition of a simulation depends on a particular encoding transformation that embeds into some -dimensional subspace of . We assume that is an isometry, that is, . The encoding enables a comparison between eigenvectors of the two models. We envision a situation in which the spectrum of consists of two well-separated groups of eigenvalues such that the smallest eigenvalues of are separated from the rest of its spectrum by a large gap. Let be the low-energy subspace spanned by the eigenvectors of associated with its smallest eigenvalues.
Let be a Hamiltonian acting on a Hilbert space of dimension . A Hamiltonian and an isometry (encoding) are said to simulate with an error if there exists an isometry such that
The image of coincides with the low-energy subspace .
Although we do not impose any restrictions on the encoding, in practice it must be sufficiently simple. For all our reductions (except for the one of Section 5) the encoding maps basis vectors to basis vectors. Whenever the choice of is clear from the context, we shall just say that simulates with an error . If one is interested only in reproducing eigenvalues of the target Hamiltonian, the encoding and condition (S3) can be ignored.
In the case of a zero error, , the target Hamiltonian coincides with the restriction of onto the low-energy subspace of , up to a change of basis described by . Clearly, any Hamiltonian simulates itself with a zero error since one can choose . We shall always assume that since otherwise the definition is meaningless (one can choose regardless of ). Note that has the dimension of energy while is dimensionless. Loosely speaking, and quantify simulation error for eigenvalues and eigenvectors respectively. Let us establish some basic properties of simulations.
Lemma 1 (Eigenvalue simulation).
Suppose simulates with an error . Then the -th smallest eigenvalues of and differ at most by for all .
Property (S1) implies that the spectrum of coincides with smallest eigenvalues of . The lemma now follows from (S2) and the standard Weyl’s inequality. ∎
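The last step of this argument uses Weyl's inequality: if two hermitian matrices differ by a perturbation, their sorted eigenvalues differ pairwise by at most the operator norm of that perturbation. The sketch below checks this on random matrices; it assumes, as the proof suggests, that property (S2) bounds the operator norm of the difference between the target Hamiltonian and the encoded restriction of the simulator. The matrices and the scale of the perturbation are hypothetical.

```python
import numpy as np


def random_symmetric(d, rng):
    """Random real symmetric d x d matrix."""
    A = rng.normal(size=(d, d))
    return (A + A.T) / 2


rng = np.random.default_rng(0)
H_target = random_symmetric(8, rng)      # stand-in for the target Hamiltonian
E = 0.01 * random_symmetric(8, rng)      # stand-in for the simulation error

# eigvalsh returns eigenvalues in ascending order, so entries are matched by rank.
shifts = np.abs(np.linalg.eigvalsh(H_target + E) - np.linalg.eigvalsh(H_target))
bound = np.linalg.norm(E, 2)             # operator (spectral) norm of the error

print(shifts.max() <= bound)
```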
Lemma 2 (Ground state simulation).
Suppose has a non-degenerate ground state separated from excited states by a spectral gap . Suppose simulates with an error such that . Then has a non-degenerate ground state and
Let be the ground state of . Note that is non-degenerate due to Lemma 1 and the assumption . Consider an unperturbed Hamiltonian and a perturbation . The perturbed Hamiltonian has a non-degenerate ground state . Using the first-order perturbation theory for eigenvectors one gets and thus . Here we used the fact that is an isometry and . Property (S3) then leads to Eq. (8). ∎
Importantly, our definition of a simulation is stable under compositions: if one is given some Hamiltonians such that simulates with a small error and simulates with a small error, this implies that simulates with a small error.
Lemma 3 (Composition).
Suppose simulates with an error and simulates with an error . Let be the spectral gap separating smallest eigenvalues of from the rest of the spectrum. Suppose and . Then simulates with an error , where
We shall always choose the simulator such that in which case .
Suppose act on Hilbert spaces respectively. Let and . By Lemma 1, the smallest eigenvalues of are separated from the rest of the spectrum by a spectral gap at least . Thus the low-energy subspace is well defined. Let be an isometry satisfying properties (S1-S3) for a simulator and a target Hamiltonian with an error . By definition, maps to the low-energy subspace . First, let us show that approximately maps to . More precisely, we claim that there exists a unitary operator such that
Indeed, let be the projector onto the low-energy subspace , where . Consider a perturbation . Note that . Applying Lemma 3.1 of Ref.  with an unperturbed Hamiltonian and a perturbation one gets
Taking into account that one gets
For brevity denote
By Jordan’s lemma, there exists an orthonormal basis such that the projectors and are block-diagonal in this basis with all blocks being either or projectors. Assuming that one has which implies that all blocks of and are the same. Consider some block. Without loss of generality, the restrictions of and onto this block have a form
for some such that . Then and thus . We conclude that for any block. Define a unitary
such that . Note that . Extending to the full space we obtain and which is equivalent to Eq. (10).
Now we are ready to prove that simulates with a small error. Define an isometry
Using the first part of Eq. (10) and the fact that maps to the low-energy subspace we conclude that maps to the low-energy subspace . Thus obeys property (S1) for the target Hamiltonian and the simulator . Furthermore, the second part of Eq. (10) implies that
Finally, let be the restriction of onto the low-energy subspace . Note that and . Thus
The first term is upper bounded by . Thus
To bound the second term we write and note that
The norm of the first term is upper bounded by
Since we assumed that , the last term is at most