Machine learning assisted measurement of local topological invariants

Marcello D. Caio Instituut-Lorentz, Universiteit Leiden, P.O. Box 9506, 2300 RA Leiden, The Netherlands    Marco Caccin Independent Researcher    Paul Baireuther Instituut-Lorentz, Universiteit Leiden, P.O. Box 9506, 2300 RA Leiden, The Netherlands    Timo Hyart International Research Centre MagTop, Institute of Physics, Polish Academy of Sciences, Aleja Lotnikow 32/46, PL-02668 Warsaw, Poland    Michel Fruchart Instituut-Lorentz, Universiteit Leiden, P.O. Box 9506, 2300 RA Leiden, The Netherlands The James Franck Institute, The University of Chicago, Chicago, IL 60637, USA

The continuous effort towards topological quantum devices calls for an efficient and non-invasive method to assess the conformity of components in different topological phases. Here, we show that machine learning paves the way towards non-invasive topological quality control. We introduce a local topological marker, able to discriminate between topological phases of one-dimensional wires. The direct observation of this marker in solid state systems is challenging, but we show that an artificial neural network can learn to approximate it from the experimentally accessible local density of states. Our method distinguishes different non-trivial phases, even for systems where direct transport measurements are not available and for composite systems. This new approach could find significant use in experiments, ranging from the study of novel topological materials to high-throughput automated material design.


Topological insulators and superconductors are phases of matter characterised by the exact quantisation of macroscopic observables and the appearance of edge states at the boundary of open systems Hasan and Kane (2010). Such peculiar edge states include condensed-matter realisations of Majorana bound states and unidirectional edge states, which are particularly robust against disorder and local perturbations. This makes them appealing for engineering devices such as qubits, quantum channels Dlaska et al. (2017), and eventually quantum computers Nayak et al. (2008). In a quantum device, several components in different topological phases can be brought together; see Fig. 1. Therefore, it is convenient to have a means of locally discriminating between different topological phases. To this end, in analogy with the two-dimensional Chern marker Bianco and Resta (2011); Caio et al. (2019), we introduce a local quantity which we name “winding marker” that locally distinguishes topological phases of one-dimensional systems with chiral symmetry. This is in contrast both with the global approach of standard topological invariants that are only defined for infinite systems Hasan and Kane (2010), and with approaches based on scattering matrices Akhmerov et al. (2011); Fulga et al. (2011, 2012); Beenakker (2015) that fundamentally characterise an interface. Although attempting a direct measurement of the winding marker in solid-state systems would raise numerous challenges, we will show that it can be related to readily available experimental data.

Figure 1: Winding marker in a composite sample. We show a hypothetical one-dimensional quantum device composed of three regions in different topological phases. These are distinguished by quantised values of the winding marker matching the bulk invariants of the three corresponding infinite-size systems, up to fluctuations due to disorder. Here we consider a Kitaev chain (see main text) with total length , parameters , , , , and , or (from left to right).

In one-dimensional topological insulators and superconductors, the local density of states (LDOS) can be obtained from the tunnelling differential conductance, observed by scanning tunnelling microscopy (STM) Nadj-Perge et al. (2014); Ruby et al. (2015); Pawlak et al. (2016); Feldman et al. (2016); Jeon et al. (2017), or with more elaborate setups Zhang et al. (2018a). STM provides a relatively non-invasive measurement of the LDOS as it does not require the deposition of contacts; this might be relevant for the non-destructive testing of topological devices, e.g. to assess whether a manufactured sample is in the expected topological phase. Although the LDOS of a system without edges does not carry information about its topology, the edge states will appear in the LDOS of a finite-size system. However, these may be obscured by the presence of disorder in the sample. Moreover, STM measurements only give access to the LDOS up to an unknown prefactor Chen (2007). The relation between the measured LDOS and the winding marker can therefore be subtle, even in the absence of disorder, but we shall see that it can be inferred using supervised machine learning.

Machine learning techniques are increasingly used in physics Mehta et al. (); Carleo and Troyer (2017); Carrasquilla and Melko (2017); Baireuther et al. (); Baireuther et al. (2019). In particular, several works applied machine learning to study topological phases, mostly focusing on their classification from numerically accessible quantities such as entanglement spectra van Nieuwenburg et al. (2017), density matrices Carvalho et al. (2018), Hamiltonians Zhang et al. (2018b) or their eigenvectors Holanda and Griffith (2019), loops of two-point correlation functions Zhang and Kim (2017), the local density of a single state Ohtsuki and Ohtsuki (2016, 2017); Araki et al. (), and its disorder-averaged version Yoshioka et al. (2018). In cold-atom systems, artificial neural networks were used to identify topological phases from the experimental momentum distributions Rem et al. (). In solid-state systems however, the issue of determining the topological nature of a given sample from experimentally accessible data remains open.

In this article, we show that the winding marker in the centre of a finite-size sample can be predicted from a measurement of the LDOS of the whole sample, by using supervised machine learning. Besides being able to distinguish trivial from topological phases, our method also discriminates between topological phases with distinct integer invariants. This is a non-trivial task as the simple counting of states is not available via STM measurement. Our method is of particular interest for unconventional superconductors, such as Sr2RuO4 Scaffidi and Simon (2015); Kallin and Berlinsky (2016) and one-dimensional Sahlberg et al. (2017) and two-dimensional Röntynen and Ojanen (2015) Shiba lattices, where large values of topological invariants are predicted. In those systems, the experimental determination of the number of edge states is highly challenging due to the lack of easily accessible electrical transport signatures and because of the difficulties in the accurate measurement of the quantised thermal conductance at low temperatures Jezouin et al. (2013); Banerjee et al. (2017, 2018); Kasahara et al. ().

Local winding marker. Topological phases of matter are characterised by quantised topological invariants, typically defined globally for infinite systems. These invariants often manifest themselves in the response function of the ground state to an appropriate gauge field Qi et al. (2008); Ludwig (2015). Hence, we can expect them to correspond to reasonably localised quantities in real space: topological invariants can be recast in terms of the Fermi projector on the ground state, which is nearsighted for gapped systems Aizenman and Graf (1998); Prodan and Kohn (2005); Bianco and Resta (2011). As first noticed by Bianco and Resta Bianco and Resta (2011) in the case of the anomalous Hall effect, this enables a local quantity closely related to the topological invariant to exist.

In this work, we focus on one-dimensional systems with chiral symmetry, such as polyacetylene Su et al. (1979) or the Kitaev chain Kitaev (2001). The chiral symmetry is realised by a unitary operator which anticommutes with the Hamiltonian . This class of topological systems is characterised by an integer-valued invariant called the winding number, defined in momentum space as Chiu et al. (2016)


Here, , where , , and is the projector on the states below the Fermi level. While convenient, translation invariance is not necessary to define the winding number. In an infinite system, it can be defined as the trace per unit volume Mondragon-Shem et al. (2014); Song and Prodan (2014); Rakovszky et al. (2017)


where is the position operator. In particular, this real-space formulation applies to disordered systems. The topological invariant in (2) is a global quantity, quantised even at strong disorder Mondragon-Shem et al. (2014); Song and Prodan (2014); Prodan and Schulz-Baldes (2016).
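As an illustration of the momentum-space definition, the sketch below evaluates a winding number numerically for a clean two-band chiral model. It relies on the standard fact that, when the Bloch Hamiltonian is purely off-diagonal in the chiral basis with off-diagonal entry q(k), the winding number equals the number of times q(k) encircles the origin over the Brillouin zone, up to a sign convention. The function name `winding_number` and the three example functions are hypothetical and are not taken from the text.

```python
import numpy as np

def winding_number(q_of_k, n_k=4001):
    """Count how often q(k) encircles the origin as k sweeps the Brillouin zone."""
    ks = np.linspace(0.0, 2.0 * np.pi, n_k)   # closed loop: endpoints coincide
    phase = np.unwrap(np.angle(q_of_k(ks)))    # continuous phase along the loop
    return int(np.rint((phase[-1] - phase[0]) / (2.0 * np.pi)))

# Hypothetical examples (not the parameters used in the paper).
def q_one(k):    # nearest-neighbour term dominates
    return 0.5 + np.exp(1j * k)

def q_zero(k):   # constant term dominates
    return 2.0 + np.exp(1j * k)

def q_two(k):    # second-neighbour term dominates
    return 0.5 + 0.4 * np.exp(1j * k) + 2.0 * np.exp(2j * k)

print(winding_number(q_one), winding_number(q_zero), winding_number(q_two))
# prints: 1 0 2
```

The integer counts the zeros of the corresponding polynomial in exp(ik) that lie inside the unit circle, which is why a second-neighbour term is needed to reach the value 2.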

Figure 2: Local density of states of topological and trivial samples. (a-f) Examples of normalised local densities of states for several disordered Kitaev chains of length , corresponding to winding numbers , and . It is relatively easy to guess that (a), (b) and (f) are in a topological phase, as there are clear edge states on the boundaries, and nothing in the bulk. However, it is not obvious that (a) and (b) have , while (f) has . It is even harder to identify the topology in the cases (c-e), because of the many peaks in the LDOS due to the disorder. It turns out that (c) corresponds to , while (d) and (e) correspond to .

In analogy with the Chern marker for two-dimensional systems introduced by Bianco and Resta Bianco and Resta (2011), we define the local winding marker


where is the position along the chain, labels the degrees of freedom in the unit cell of the Bravais lattice, and is the volume of the unit cell. While we focus here on one-dimensional systems, the same construction is available in all odd space dimensions.

The local winding marker can be computed for the experimentally relevant case of disordered finite-size systems with open boundaries and, notably, for composite systems. In Fig. 1, a chain is divided in three regions, each with different parameters of the Hamiltonian. Infinite-size systems with the corresponding parameters would have three different winding numbers. Away from the interfaces, the winding marker displays plateaux at the corresponding values, up to fluctuations due to disorder.
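A minimal numerical sketch of such a local marker is given below. Since the defining formula is not reproduced in this text, the code makes several assumptions: it uses the Su-Schrieffer-Heeger chain (a model of polyacetylene) instead of the Kitaev chain, the flat-band operator Q = 1 - 2P, sublattice projectors as the two chiral sectors, and a symmetrised commutator with the position operator whose overall sign is a convention. All function names are hypothetical.

```python
import numpy as np

def ssh_hamiltonian(n_cells, v, w):
    """Open SSH chain; sites ordered (cell 0, A), (cell 0, B), (cell 1, A), ..."""
    h = np.zeros((2 * n_cells, 2 * n_cells))
    for x in range(n_cells):
        a, b = 2 * x, 2 * x + 1
        h[a, b] = h[b, a] = v              # intra-cell hopping
        if x + 1 < n_cells:
            h[b, a + 2] = h[a + 2, b] = w  # inter-cell hopping
    return h

def local_winding_marker(h):
    """Assumed form of a local winding marker, summed over each unit cell."""
    n_sites = h.shape[0]
    n_cells = n_sites // 2
    _, vecs = np.linalg.eigh(h)            # eigenvalues come out in ascending order
    occupied = vecs[:, :n_cells]           # half filling: lowest half of the spectrum
    projector = occupied @ occupied.conj().T
    q = np.eye(n_sites) - 2.0 * projector  # flat-band version of the Hamiltonian
    gamma_plus = np.diag(np.tile([1.0, 0.0], n_cells))   # projector on sublattice A
    gamma_minus = np.diag(np.tile([0.0, 1.0], n_cells))  # projector on sublattice B
    q_pm = gamma_plus @ q @ gamma_minus
    q_mp = gamma_minus @ q @ gamma_plus
    x_op = np.diag(np.repeat(np.arange(n_cells, dtype=float), 2))  # cell position
    density = 0.5 * (q_mp @ (x_op @ q_pm - q_pm @ x_op)
                     + q_pm @ (q_mp @ x_op - x_op @ q_mp))
    return np.real(np.diag(density)).reshape(n_cells, 2).sum(axis=1)

topological = local_winding_marker(ssh_hamiltonian(40, v=0.5, w=1.0))
trivial = local_winding_marker(ssh_hamiltonian(40, v=1.0, w=0.5))
print(topological[18:22].round(3), trivial[18:22].round(3))
```

Under these assumptions the values in the middle of the chain should sit close to an integer of magnitude one in the first case and close to zero in the second, while the values near the two ends deviate strongly, since the marker is only expected to match the bulk invariant away from boundaries.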

Neural network assisted measurement. In order to infer the value of the winding marker from accessible experimental data, we use supervised machine learning in the form of a feedforward neural network. The spatially resolved density of the states close to the Fermi energy can be measured in STM experiments, as discussed in Chevallier and Klinovaja (2016) and observed in Nadj-Perge et al. (2014); Ruby et al. (2015); Pawlak et al. (2016); Feldman et al. (2016); Jeon et al. (2017) in the context of one-dimensional topological systems. However, there is little control over the number of states involved in the measurement when the system is disordered, and the LDOS can be measured only up to an unknown prefactor. In order to model such a measurement, we use as input of our neural network the LDOS corresponding to an energy window of size centred at the Fermi energy


Here, the sum runs over the internal degrees of freedom , and over the eigenstates of the Hamiltonian with energies . Besides, is a normalisation constant ensuring that , so that the neural network does not simply count the total number of states in the window. In our numerical calculations, we set to be of the bandwidth. In Fig. 2, we show some examples of normalised LDOS drawn from the dataset used to train our neural network. A visual analysis reveals no obvious connection between the shape of the LDOS and the number of topological edge states.
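The construction of this input can be sketched as follows. The code is an illustration under stated assumptions: the Fermi energy is taken to be zero, the normalisation is assumed to make the LDOS sum to one over the sample, and the window size is left as a free parameter `window_fraction` because its value is not given in this text. The SSH chain used as a test system and all function names are hypothetical.

```python
import numpy as np

def ssh_hamiltonian(n_cells, v, w):
    """Open SSH chain used here only as a convenient test system."""
    h = np.zeros((2 * n_cells, 2 * n_cells))
    for x in range(n_cells):
        a, b = 2 * x, 2 * x + 1
        h[a, b] = h[b, a] = v
        if x + 1 < n_cells:
            h[b, a + 2] = h[a + 2, b] = w
    return h

def normalised_ldos(h, n_internal, window_fraction):
    """LDOS per unit cell from eigenstates inside a window around zero energy."""
    energies, vecs = np.linalg.eigh(h)
    bandwidth = energies[-1] - energies[0]
    half_window = 0.5 * window_fraction * bandwidth
    in_window = np.abs(energies) <= half_window      # Fermi energy assumed at 0
    site_density = (np.abs(vecs[:, in_window]) ** 2).sum(axis=1)
    # Sum over the internal degrees of freedom of each unit cell.
    ldos = site_density.reshape(-1, n_internal).sum(axis=1)
    total = ldos.sum()
    # Assumed normalisation: unit total weight, so the state count is hidden.
    return ldos / total if total > 0 else ldos

ldos = normalised_ldos(ssh_hamiltonian(40, v=0.5, w=1.0), n_internal=2,
                       window_fraction=0.1)
print(ldos[:3].round(3), ldos[-3:].round(3))
```

In this clean example the window only contains the two edge states, so the weight is concentrated at the two ends of the chain. With disorder, bulk states can enter the window and the picture becomes much less clear.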

Although the winding marker is defined locally, we are interested in its value in the bulk of the system. Away from the sample boundaries, the winding marker corresponds to the topological invariant , up to fluctuations due to disorder. To remove these, we label each item in the training set of the neural network with the average of over a region of size in the centre of the sample of size . The neural network is then trained to predict from a normalised LDOS. Details about the architecture, implementation, and -fold cross-validation training and testing of the feedforward neural network are discussed in the Supplemental Material.

Figure 3: Neural network prediction of the winding marker. (a) Normalised distribution of the predicted winding marker with respect to the actual averaged winding marker . Most of the data concentrate in spots on the diagonal (dashed line), corresponding to integer values of the winding marker. Data outside of these spots mostly correspond to parameters of the Hamiltonian close to phase transitions. The spot at is less sharp than the others, as many data points correspond to phases where the topology is made trivial by disorder. (b) Normalised distribution and cumulative distribution function of the error . The corresponding mean absolute error is .

Results. In this work, we focus on the disordered Kitaev chain Kitaev (2001), where we include next-nearest-neighbour hoppings in order to explore the phase, in addition to the usual phases. For simplicity, we assume the hopping terms to be equal to the superconducting pairings, and consider the Hamiltonian , where , , , where are Pauli matrices in particle-hole space. Here, we consider uncorrelated disorder where are independent and identically distributed random variables following a normal distribution with mean and standard deviation , and similarly for . This Hamiltonian has both particle-hole and chiral symmetries. In a generic superconducting system, only particle-hole symmetry is present, and our method might be adapted to assess the corresponding topology. Here, we focus on the more delicate situation where several topologically non-trivial phases have to be distinguished.
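A real-space Hamiltonian of this kind can be assembled as in the sketch below. Because the explicit formula is not reproduced in this text, the code assumes a common Bogoliubov-de Gennes convention: an onsite term proportional to tau_z, and hopping blocks proportional to (-tau_z + i tau_y) when hopping and pairing amplitudes are equal. Prefactors and signs may differ from the convention of the paper, and the function name and parameter names are hypothetical.

```python
import numpy as np

TAU_X = np.array([[0.0, 1.0], [1.0, 0.0]])
TAU_Z = np.array([[1.0, 0.0], [0.0, -1.0]])
I_TAU_Y = np.array([[0.0, 1.0], [-1.0, 0.0]])   # i * tau_y, which is a real matrix

def disordered_kitaev_chain(n_sites, mu, t1, t2, sigma_mu, sigma_t1, sigma_t2, rng):
    """Open chain with first and second neighbour terms, hopping equal to pairing."""
    mus = rng.normal(mu, sigma_mu, n_sites)       # onsite potentials
    t1s = rng.normal(t1, sigma_t1, n_sites - 1)   # nearest-neighbour amplitudes
    t2s = rng.normal(t2, sigma_t2, n_sites - 2)   # second-neighbour amplitudes
    h = np.zeros((2 * n_sites, 2 * n_sites))

    def block(i, j):
        return (slice(2 * i, 2 * i + 2), slice(2 * j, 2 * j + 2))

    for j in range(n_sites):
        h[block(j, j)] = -mus[j] * TAU_Z
    for reach, amplitudes in ((1, t1s), (2, t2s)):
        for j, t in enumerate(amplitudes):
            hop = t * (-TAU_Z + I_TAU_Y)
            h[block(j, j + reach)] = hop
            h[block(j + reach, j)] = hop.T        # Hermitian conjugate (real matrix)
    return h

rng = np.random.default_rng(0)
h = disordered_kitaev_chain(30, mu=0.5, t1=1.0, t2=0.3,
                            sigma_mu=0.1, sigma_t1=0.1, sigma_t2=0.1, rng=rng)
gamma = np.kron(np.eye(30), TAU_X)                # chiral operator in this convention
print(np.allclose(gamma @ h + h @ gamma, 0.0))
```

The final check confirms that, in this convention, the disordered Hamiltonian anticommutes with tau_x on every site, so chiral symmetry survives each disorder realisation as long as hoppings and pairings stay equal.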

The dataset for the training and testing of the neural network consists of tuples obtained by randomly drawing the parameters , and uniformly from the interval and from . In Fig. 3, we show the two-dimensional distribution of the predicted winding marker with respect to the actual (directly calculated) averaged winding marker . For a perfect prediction, all the data points should lie on the diagonal; and, for a perfectly quantised winding marker, all the data should concentrate at the points for . In Fig. 3(a), three spots are indeed clearly visible, and their finite width is due to the presence of disorder. The normalised distribution of the error , in Fig. 3(b), shows the accuracy of the predictions. For our trained neural network, we obtain a root mean squared error . The tail in the distribution of errors in Fig. 3(b) corresponds to the subtle vertical features in Fig. 3(a), where the error is larger. In order to test the scalability of our approach, we consider a system twice as big, with length . We obtain a similar using a dataset composed of tuples . The influence of the size of the dataset on the MSE is discussed in the Supplemental Material.

We expect the network to recognise features associated with the topological edge states, and not inessential features specific to the system. To verify this hypothesis, we train and test the same neural network using as input the LDOS of a sample of length restricted to the central sites. As expected, the network trained in this way loses any predictive ability; see Supplemental Material. In Fig. 4, we show a slice of the phase diagram of the disordered Kitaev chain, comparing the values of (a) the predicted winding marker and (b) the spatially averaged winding marker over a range of parameters, for a single disorder realisation. Further, in panels (c) and (d) of Fig. 4, we show their average over disorder realisations. The remarkable agreement between the actual and predicted winding markers illustrates the accuracy of the network in parameter space, even for large disorder.

Figure 4: Predicted and reference phase diagrams. (a) Predicted winding marker for one disorder realisation. (b) Actual winding marker for one disorder realisation. (c) Predicted winding marker averaged over disorder realisations. (d) Actual winding marker averaged over disorder realisations. This slice of the phase diagram, corresponding to different values of the onsite potential and of the disorder amplitudes , is computed for a system of length , with and .

So far, we have used a neural network to infer the bulk topology of a homogeneous finite-size chain from its LDOS. Further, we can take advantage of the local character of the winding marker by applying our method to a composite chain. As a proof of principle, we focus on the simplest example where the left and right halves of a one-dimensional chain of size are potentially different. More precisely, both the average values and the standard deviations of the parameters and are independently chosen for the left and the right of the chain. For simplicity, is set to identically vanish throughout the chain, which implies that the winding number can be either or , on each side. The same procedure as before is then applied: the LDOS of the entire chain is used as the input of a feedforward neural network, which is trained using as labels the averages and of the winding marker over regions of size centred at and . The network outputs the predicted values of the averaged winding markers and . Here, L and R respectively label the left and right sides of the chain.

Figure 5: Neural network prediction for a composite system. (a) Normalised marginal distribution of the predicted winding marker for the left half of the chain with respect to the actual averaged winding marker for the left (same) half. (b) Normalised marginal distribution of the predicted winding marker for the left half of the chain with respect to the actual averaged winding marker for the right (other) half. (c) Sketch of the composite system, with an example of LDOS in blue; in this case while . The four-dimensional histogram from which the marginals are obtained is computed with bins for each dimension.

In Fig. 5, we show the two-dimensional marginal distributions (a) and (b) , for a chain of length , and tuples in the dataset. The neural network is identical to the one used for Fig. 3, except for the output layer, which now includes two units. As expected, Fig. 5(a) resembles Fig. 3(a) while Fig. 5(b) shows the lack of any meaningful correlation between the predicted value of the left half and the actual value of the right half. The marginals and (not shown) have identical features. For the trained neural network, we obtain a .

Discussion. In this article, we introduced a local marker to characterise the topology of finite-size, possibly composite, one-dimensional chiral systems. We have shown that machine learning techniques make it possible to infer the average of this local winding marker from the experimentally accessible local density of states. Crucially, not only are we able to distinguish topological from non-topological phases but we can also discriminate between topological phases with different invariants. Our approach is as non-invasive as possible, and is suitable even for systems where direct transport measurements cannot be used.

While the winding marker is a genuinely local quantity, the neural network fundamentally recognises interfaces, as it relies upon the appearance of topological edge states in the local density of states. This is reminiscent of the scattering matrix description of topological systems. Although the neural network predicts spatially averaged values of the winding marker, we have shown that it can locally predict the topology of adjacent regions in a composite system.

Here, we focused on a proof of concept where the LDOS and the topological marker are determined from a specific family of tight-binding Hamiltonians. When a larger family of Hamiltonians is considered, e.g. including more degrees of freedom and longer range hoppings, a larger network and training set might be required to maintain the same level of accuracy. For example, we expect our approach to distinguish even larger values of the topological invariant, possibly at a cost of an increased size of the neural network. In most experimental setups, the parameter space of the Hamiltonians is strongly constrained by the symmetries and the locality. Therefore, we expect that for each setup, it is possible to tailor and train a network which can efficiently identify the distinct topological phases. We can also wonder if a finite training set is enough to learn a general rule, allowing the predictions to remain accurate even when the Hamiltonians are generated from larger and larger subsets of the whole set of class BDI Hamiltonians. This question goes beyond the scope of this work, but is highly interesting from a fundamental point of view. Future directions for research also include extensions to the Chern marker for two-dimensional systems, as well as to other local topological markers.

Acknowledgments. We thank J. Tworzydło and C. Beenakker for fruitful discussions. This research was supported by the Netherlands Organisation for Scientific Research (NWO/OCW) as part of the Frontiers of Nanoscience (NanoFront) program, by an ERC Synergy Grant, and by the Foundation for Polish Science through the IRA Programme, co-financed by EU within SG OP.


Appendix A Neural network architecture

The input of our model is an array of nonnegative real numbers of fixed length representing the normalised local density of states (LDOS) of the finite system close to the Fermi energy; its output is the predicted winding number , a single real number. For this regression task, we employ a feedforward artificial neural network composed of hidden layers. Each hidden layer , with , contains rectified linear units (ReLU) to provide non-linearity, followed by a batch normalisation (BN) Ioffe and Szegedy () to speed up and stabilise the training by reduction of the internal covariate shift. The output layer is a single unit with linear activation, which corresponds to a linear mapping from the last hidden layer. To regularise the model and thus prevent overfitting, during training we apply dropout Srivastava et al. (2014) to the output of the last hidden layer with a dropout probability of . The weights and biases of the network are fitted using the Adam optimizer Kingma and Ba (2014), with a learning rate of . We use the mean squared error (MSE) as the loss function to train the parameters of the neural network, as it both provides a suitable metric for the regression problem and can be used in backpropagation, being differentiable with respect to the network weights. The implementation is done with the Keras package Chollet et al. (2015), using TensorFlow Abadi et al. (2015) as backend.

The network is formally a function space of maps parametrised by a set of weight matrices and bias vectors that can be expressed as




The transformation is described in Algorithm 1 of Ref. Ioffe and Szegedy (). It is parametrised by the feature-wise mean and variance values, which are fitted during training.
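The layer structure described in this appendix can be written out as a plain NumPy forward pass, shown below as a sketch. It assumes inference mode, so dropout is inactive and batch normalisation uses stored feature-wise statistics together with the usual scale and shift parameters. The layer sizes, the random weights and the helper names are hypothetical; the actual number of layers and units is not given in this text.

```python
import numpy as np

def batch_norm_inference(x, mean, var, gamma, beta, eps=1e-3):
    """Feature-wise normalisation with stored statistics, then scale and shift."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def forward(ldos, hidden_layers, output_layer):
    """Hidden layers apply affine map, ReLU, then BN; the output layer is linear."""
    h = np.asarray(ldos, dtype=float)
    for layer in hidden_layers:
        h = np.maximum(0.0, layer["W"] @ h + layer["b"])          # ReLU units
        h = batch_norm_inference(h, layer["mean"], layer["var"],
                                 layer["gamma"], layer["beta"])
    return output_layer["W"] @ h + output_layer["b"]              # linear output

def random_network(n_in, hidden_sizes, n_out, rng):
    """Untrained placeholder parameters, only to exercise the forward pass."""
    hidden, width = [], n_in
    for size in hidden_sizes:
        hidden.append({
            "W": rng.normal(0.0, 1.0 / np.sqrt(width), (size, width)),
            "b": np.zeros(size),
            "mean": np.zeros(size), "var": np.ones(size),
            "gamma": np.ones(size), "beta": np.zeros(size),
        })
        width = size
    output = {"W": rng.normal(0.0, 1.0 / np.sqrt(width), (n_out, width)),
              "b": np.zeros(n_out)}
    return hidden, output

rng = np.random.default_rng(0)
hidden, output = random_network(n_in=50, hidden_sizes=[32, 16], n_out=1, rng=rng)
print(forward(rng.random(50), hidden, output).shape)
```

Setting `n_out=2` gives the two-unit output layer used for a bipartite composite system, with no other change to the hidden layers.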

For the case of a bipartite composite system, the output of the model is a vector . The only change to the neural network architecture is that in this case, the output layer is now composed of two neurons with linear activation.

The model architecture is obtained by evaluating the MSE of different architectures on a fixed test set when trained on a fixed training set; a separate validation set is used for interrupting the training when the MSE on it is no longer decreasing. The contending ML models were AlexNet-like convolutional neural networks, boosted trees, and support vector machines with a linear kernel, but a feedforward neural network largely outperformed all of them. Once the general architecture of the model is found, the same strategy of MSE evaluation is used again to choose the hyperparameters of the network (e.g. number of hidden neurons, number of layers, dropout ratio).

After the architecture of the model is determined, a new dataset is generated to train and test the network. The presented results are obtained by -fold cross-validation with , where the whole dataset is randomly split in “folds” each containing the same fraction of data. One at a time, each fold is used for testing whereas the others are used for training. This allows us to estimate the expected error of our method as well as the uncertainty on this expected error, by computing the average and standard deviation of the MSE for all the folds in the cross-validation.
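The cross-validation procedure can be sketched in a few lines of NumPy. To keep the example self-contained, a linear least-squares fit stands in for the neural network, and the data are synthetic; the number of folds is left as a parameter since its value is not given in this text. The function names `k_fold_mse`, `fit_linear` and `predict_linear` are hypothetical.

```python
import numpy as np

def k_fold_mse(features, labels, k, fit, predict, rng):
    """Mean and standard deviation of the test MSE over k folds."""
    indices = rng.permutation(len(features))     # random split of the dataset
    folds = np.array_split(indices, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(features[train_idx], labels[train_idx])
        residual = predict(model, features[test_idx]) - labels[test_idx]
        scores.append(np.mean(residual ** 2))
    scores = np.array(scores)
    return scores.mean(), scores.std()

# Stand-in model: ordinary least squares instead of a neural network.
def fit_linear(x, y):
    return np.linalg.lstsq(x, y, rcond=None)[0]

def predict_linear(coefficients, x):
    return x @ coefficients

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))                    # synthetic inputs
y = x @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
mean_mse, std_mse = k_fold_mse(x, y, k=5, fit=fit_linear,
                               predict=predict_linear, rng=rng)
print(mean_mse, std_mse)
```

The mean over folds estimates the expected error, and the spread over folds gives the uncertainty on that estimate, which is the quantity shown as error bars when the dataset size is varied.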

Appendix B Kitaev model with second nearest neighbours

The Kitaev Hamiltonian with second nearest neighbours reads


The terms proportional to and break chiral symmetry and time-reversal symmetry (complex conjugation), but preserve particle-hole symmetry . As such, they collapse the ℤ invariant to a ℤ₂ invariant when present.

In the main text, we consider for simplicity, and set to preserve chiral symmetry. We mainly consider disordered homogeneous systems, where are independent and identically distributed random variables following a normal distribution with mean and standard deviation , and similarly for .

When all parameters are uniform in space, the system is translation invariant and one can block-diagonalize it in Bloch representation, where the Bloch Hamiltonian is


Appendix C Additional data

Figure 6: Mean squared error for different dataset sizes. A series of subsets with sizes evenly distributed on a logarithmic scale are randomly drawn from the main dataset. The MSE is computed for each subset, and represented on a log-log plot. The same is done for and for . Error bars represent the estimated uncertainty on the MSE.
Figure 7: Predictions from a bulk LDOS. (a) Normalised distribution of the predicted winding marker with respect to the actual averaged winding marker , for a system of size where we only consider the LDOS on the central sites (the middle of the chain). The dashed line corresponds to . There is hardly any correlation between the actual averaged winding marker and the prediction. (b) Normalised distribution and cumulative distribution function of the error . The corresponding mean absolute error is (the RMSE is ).

The influence of the size of the dataset on the performance of the model is illustrated in Fig. 6. A series of subsets with sizes evenly distributed on a logarithmic scale are randomly drawn from the main dataset. For each subset, the MSE and its uncertainty are estimated using the -fold cross-validation training and testing procedure discussed in the section Neural network architecture. The MSE quickly decreases up to a dataset size of roughly . For larger datasets, a slower decrease of the MSE compatible with a linear behaviour on logarithmic scale is observed. The uncertainty on the MSE, which represents the variability for different folds in the -fold procedure, is notably larger in the first part. This might be due to an inadequate sampling of the parameter space for small datasets, due to an insufficient number of data points. The same behaviour is observed both for chains of length and ; in the second part of the graph, the curves for both lengths appear to be parallel with a constant offset, up to the uncertainties.

We presume that the machine learning procedure distinguishes systems with different topological invariants from the existence and shape of the topologically protected edge states expected from the bulk-boundary correspondence principle. Hence, it should not be possible to learn anything from the LDOS of a bulk system. To verify this, we train and test the same neural network as in the main text, but using as input the LDOS of a sample of length where the LDOS is restricted to the central sites. In this region, the edge states have a vanishing contribution for most of the system parameters. The two-dimensional histogram and the distribution of the absolute error in Fig. 7 show that the artificial neural network has lost any meaningful predictive ability. Correspondingly, the RMSE is relatively large, although this value alone would not be sufficient to draw a conclusion. The origin of the peaks in the marginal distribution is not clear, but is probably not physical; they may correspond to noise amplified by the artificial neural network.
