# Recent developments in helioseismic analysis methods and solar data assimilation

## Abstract

We review recent advances and results in enhancing and developing helioseismic analysis methods and in solar data assimilation. In the first part of this paper we focus on selected developments in time-distance and global helioseismology. In the second part, we review the application of data assimilation methods to solar data. Relating solar surface observations as well as helioseismic proxies to solar dynamo models by means of data assimilation techniques is a promising new approach to exploring and predicting the magnetic activity cycle of the Sun.

###### Keywords:

Sun, helioseismology, data analysis

## 1 Introduction

Helioseismic inferences are subject to statistical and systematic errors of different origin. Some of these errors can be reduced by analyzing longer observations or by averaging over observations. Systematic errors, however, often result from neglecting inevitable instrumental effects on the observations or from model misspecifications in the data analysis, and cannot be diminished by, for example, longer observations. Consequently, more accurate models and a better account of instrumental effects are essential for the enhancement of local and global helioseismic analysis methods. This will improve the accuracy and reliability of estimates of helioseismic parameters like mode frequencies and wave travel times. Just recently, for example, progress was made in measuring the meridional flow in the deep layers with the time-distance technique, not only thanks to the availability of high-resolution data from the HMI instrument but also after the identification and removal of a systematic center-to-limb effect in the travel–time estimates (Zhao et al., 2012, 2013). In order to detect weak processes like the meridional flow in the deeper interior or highly dynamic processes like supergranular convective motions with the time-distance technique, travel–time measurements of high signal-to-noise ratio are needed. The development of spatial averaging strategies and optimized filters can help here. Regarding global helioseismic investigations, different models and estimation schemes are used to extract the mode parameters of modes of medium and large harmonic degree from the data of GONG, MDI or HMI (e.g., Anderson et al., 1990; Schou, 1992, 1998; Hill et al., 1996; Korzennik, 2005; Reiter et al., 2015). Recent advances have been made in the identification and eradication of systematic influences of different origin on the mode parameter estimates (Larson & Schou, 2008; Rabello-Soares et al., 2008; Vorontsov et al., 2009; Korzennik et al., 2013).
Further advances in global helioseismology have been made through the development of analysis methods that exploit the perturbation of mode eigenfunctions due to the advection of the acoustic waves by flows. This kind of perturbation is also described as mode coupling. It manifests itself in correlations that can be investigated by cross-spectral analysis of global oscillation data decomposed into spherical harmonics (Woodard, 2000; Schad et al., 2013). Such an approach is promising for inferences on the meridional flow inside the Sun, since the influence of this flow on mode frequencies is only of second order and very small (Roth & Stix, 2008; Gough & Hindman, 2010; Schad et al., 2011; Vorontsov, 2011).

One essential purpose of helioseismic investigations of dynamic processes within the solar interior is to explore and better understand the solar dynamo and the related activity cycle. In particular, some dynamo models predict a strong link between the meridional flow speed and the magnetic cycle period. Enabling observations, helioseismic constraints, and models to work hand in hand could help advance our understanding of the solar cycle and possibly produce reliable forecasts of solar phenomena and activity. The concepts of data assimilation are promising for addressing these issues. Data assimilation has been used extensively for decades to predict the weather on Earth. Moreover, by combining physically based models and well-chosen observables, it makes it possible to constrain processes inaccessible to any measurement through their indirect impact on observations. With the growing collection of solar data, this technique has become increasingly attractive in solar physics and is beginning to be adapted to this field.

In the following we review some of the limitations related to the estimation of helioseismic quantities in time-distance and global helioseismology, together with ideas and current attempts to overcome them. For the time-distance analysis, two recent advances, ensemble averaging and optimal filtering, are discussed in detail. The inclusion of enhanced models of the solar oscillation spectrum to improve global helioseismic parameter estimates is addressed by the “Global Helioseismic Metrology” project. As another advancement in global helioseismology we review the mode eigenfunction perturbation analysis that uses cross-spectra of solar oscillations. In the second part of this paper we give an overview of the latest developments and results from the application of data assimilation methods in solar physics with regard to activity cycle predictions.

## 2 Developments in local helioseismology

One recent challenge in local helioseismology is to obtain precise three-dimensional maps of the temporal evolution of the solar sub-surface layers. Here we focus on the time-distance method, which is one tool for obtaining such measurements and promises, for example, the detection of emerging magnetic flux or the analysis of convective motions like supergranulation.

#### Large and small separations

In time-distance helioseismology, travel times are measured between pairs of surface locations with angular heliocentric separation $\Delta$. A problem in the analysis of travel times is that information from larger depths is only obtained from signals measured at locations with larger separations, and these are adversely affected by larger noise levels due to the geometrical spreading of the wavefronts (Gizon & Birch 2004, eq. (31)). In recent years much of time-distance helioseismology has been carried out using what we will call small separations, like (Jackiewicz, Gizon & Birch, 2008) and (Zhao & Kosovichev 2003). In particular, for the analysis time periods appropriate for studying near-surface features, say 8 hours, and to get a signal-to-noise ratio sufficient to easily see perturbation signals, analyses have been restricted to relatively small separations . One of the difficulties with the restriction to small separations is the inability to separate horizontal and vertical flow signals near the surface. The travel–time signal from a flow is measured from the time difference between counterpropagating waves traveling between two points. Ray theory predicts that the time difference is given by the integral of the flow along the ray path $\Gamma$, $\delta\tau = -2\int_\Gamma \mathbf{v}\cdot\mathrm{d}\mathbf{s}/c^2$, where $\mathbf{v}$ is the flow velocity, $\mathrm{d}\mathbf{s}$ is the element of length along the ray path, and $c$ is the sound speed. When studying near-surface phenomena, like supergranulation or sunspots, the restriction to short distances means that horizontal and vertical flows are not cleanly separated (Zhao & Kosovichev, 2003). However, for large distances, the separation is much cleaner. To illustrate this point, Fig. 1b shows the travel–time contributions of the vertical and horizontal flow components as well as the sum for the shallow supergranular flow model shown in Fig. 1a. For such a shallow flow model, the horizontal component contributes almost nothing for . 
But the vertical contribution of at large can be used to define a constant of the model, namely an integral over depth of the vertical flow at cell center, .
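
As a rough numerical illustration of the ray-theory estimate, the sketch below integrates the flow along a ray with the midpoint rule. It assumes the commonly used ray approximation $\delta\tau = -2\int_\Gamma \mathbf{v}\cdot\mathrm{d}\mathbf{s}/c^2$; the function name, the uniform flow and sound speed, and the path length are hypothetical values chosen only to make the arithmetic easy to check, and the overall sign depends on which propagation direction is subtracted from which.

```python
def travel_time_difference(flow_along_ray, sound_speed, path_length, n_steps=1000):
    """Approximate delta_tau = -2 * integral(v_s / c^2 ds) along a ray.

    flow_along_ray(s): flow component along the ray at arc length s [m/s]
    sound_speed(s):    sound speed at arc length s [m/s]
    path_length:       total length of the ray path [m]
    """
    ds = path_length / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * ds  # midpoint of the i-th path element
        total += flow_along_ray(s) / sound_speed(s) ** 2 * ds
    return -2.0 * total


# Hypothetical numbers: uniform 100 m/s flow along the ray,
# uniform 10 km/s sound speed, 10 Mm path length.
dt = travel_time_difference(lambda s: 100.0, lambda s: 1.0e4, 1.0e7)
print(f"travel-time difference: {dt:.1f} s")
```

With these made-up inputs the integral is trivial ($-2 \times 100 \times 10^{7} / 10^{8}$ s, i.e. -20 s); realistic applications would pass depth-dependent profiles evaluated along a curved ray.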

To make full use of large travel distances, the signal-to-noise ratio in the travel–time maps needs to be increased. This requires an optimal choice of both phase-speed filters and spatial averaging schemes. For example, Ilonidis et al. (2013) used a broad phase-speed filter with a non-Gaussian shape and particular averaging schemes with multiple arc configurations to obtain their results on the detection of emerging magnetic flux (Ilonidis et al., 2011). However, the reliability of their findings still needs to be validated (Braun, 2012).

#### Averaging over features

Different averaging schemes are applied in time-distance helioseismology to obtain a significant signature of a particular feature in the wave travel times. As an example, Birch et al. (2010) employed numerical models to estimate that a rising flux tube causes travel–time shifts of the order of 1 s, a signal that could be detected with certainty only by averaging over 150 such events.
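
The benefit of averaging over many events can be sketched with the usual square-root law, assuming the noise in the individual events is independent and of equal variance. The helper names and the per-event noise level below are hypothetical and are not taken from Birch et al. (2010); only the 1 s signal is of the order quoted in the text.

```python
import math


def snr_after_averaging(signal, noise_per_event, n_events):
    """Signal-to-noise ratio of the mean of n_events independent measurements."""
    return signal * math.sqrt(n_events) / noise_per_event


def events_needed(signal, noise_per_event, target_snr):
    """Smallest number of events whose average reaches target_snr."""
    return math.ceil((target_snr * noise_per_event / signal) ** 2)


# Hypothetical values: 1 s signal, 4 s noise per event, target ratio of 3.
n = events_needed(1.0, 4.0, 3.0)
snr = snr_after_averaging(1.0, 4.0, n)
print(n, snr)
```

Correlated noise or systematic offsets common to all events would not average down in this way, so the estimate is an optimistic bound.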

A powerful method to increase the signal-to-noise ratio and to obtain average properties directly is to average either cross correlations or travel times about the locations of features detected near the surface. The basic assumptions of this feature averaging are that there is an underlying linear process that can be averaged and that the systematic errors are sufficiently understood. This averaging technique has been applied to small magnetic features (Duvall, Birch & Gizon, 2006; Felipe et al., 2012) and to supergranulation (Birch et al., 2006; Hirzberger et al., 2008; Duvall & Birch, 2010; Duvall & Hanasoge, 2013; Švanda, 2012). An example of features corresponding to the supergranulation cell centers is shown in Fig. 2 from Duvall & Birch (2010). To derive these feature locations, a map of the center-annulus travel–time differences of f-mode waves is used to approximate the horizontal divergence of the flow. Local maxima of a smoothed version of this map are located and only features farther apart than 22 Mm are accepted. The smaller (in terms of travel–time difference) of a pair closer than 22 Mm is rejected. This results in a useful choice of supergranules, although it is likely biased towards larger than average cells (Švanda, 2012).
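
The selection rule described above can be sketched as a greedy procedure that visits candidate maxima from strongest to weakest. This is an illustrative reading of the text, not the implementation of Duvall & Birch (2010): the function name and the example peaks are hypothetical, positions are treated as planar coordinates in Mm, and a greedy pass can differ from a strict pairwise rejection when three or more maxima lie close together.

```python
import math


def select_features(peaks, min_separation):
    """Keep local maxima that are farther apart than min_separation.

    peaks: list of (x, y, strength) with positions in Mm.
    Candidates are visited from strongest to weakest, so the weaker
    member of any too-close pair is the one that is rejected.
    """
    accepted = []
    for x, y, strength in sorted(peaks, key=lambda p: p[2], reverse=True):
        if all(math.hypot(x - ax, y - ay) > min_separation
               for ax, ay, _ in accepted):
            accepted.append((x, y, strength))
    return accepted


# Hypothetical divergence maxima along a line, strengths in arbitrary units.
peaks = [(0.0, 0.0, 5.0), (10.0, 0.0, 3.0), (30.0, 0.0, 4.0), (45.0, 0.0, 1.0)]
cells = select_features(peaks, 22.0)
print(cells)
```

Because stronger maxima always win, such a rule favours strong (and probably large) cells, which is in line with the bias noted in the text.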

From the feature analysis of Duvall & Birch (2010), only a single parameter is derived from each cell, namely the strength of the f-mode divergence signal. Travel times could be binned based on the value of this parameter, enabling some discrimination between cells of different strength or size. No one has published such results yet, but it is an obvious extension of the present work. However, it is likely that at least one additional parameter would be required to describe supergranular cells. Cell size comes to mind as a candidate, but it is very likely that the peak f-mode divergence signal will be highly correlated with the size. If the cells were to be described by two parameters, it would be useful to have parameters that are orthogonal. A way to analyze the horizontal divergence map to derive additional cell information is the Fourier segmentation procedure developed by Hirzberger et al. (2008).
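
The binning suggested above could look as follows. This is only a sketch of one possible scheme (equal-population bins ordered by divergence strength); the function name and the toy numbers are hypothetical, and no published analysis is being reproduced.

```python
def bin_by_strength(strengths, travel_times, n_bins):
    """Mean travel time in n_bins equal-population bins ordered by strength.

    strengths and travel_times are parallel lists with one entry per cell;
    the number of cells must be at least n_bins.
    """
    order = sorted(range(len(strengths)), key=lambda i: strengths[i])
    means = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        members = order[lo:hi]
        means.append(sum(travel_times[i] for i in members) / len(members))
    return means


# Toy data: four cells with made-up strengths and travel-time differences.
means = bin_by_strength([3.0, 1.0, 2.0, 4.0], [30.0, 10.0, 20.0, 40.0], 2)
print(means)
```

Equal-population bins keep the noise level comparable between bins, at the cost of unequal bin widths in the strength parameter.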

#### Optimized filters

An essential element of time-distance helioseismic methods is the construction and use of optimized filters. Filters have been used in time-distance helioseismology since its inception (Duvall et al., 1993), where it was shown that filtering the data in horizontal phase speed $\omega/k$ ($\omega$ is the circular frequency of a wave and $k$ its horizontal wavenumber) leads to isolated features in a time-distance correlation function. This is important because waves with the same horizontal phase speed travel to the same depth in the Sun (Duvall, 1982). In the early work, it was assumed that, by measuring travel–time differences between surface points, the effect of perturbations along the ray path could be measured just by considering waves with the particular phase speed corresponding to that depth. However, it was shown by Woodard (1998) that perturbations, such as supergranulation, spread power in the power spectrum by an amount corresponding to their inverse size. Supergranulation, with a spectrum peaking at spherical harmonic degree , spreads signal considerably in the power spectrum.

To study features such as supergranulation, it is useful to have a filter broad enough to admit most or all of the signal. This issue has been examined by Duvall & Hanasoge (2013). They have constructed filters with a central phase speed but a full width at half maximum (FWHM) that is independent of . For a range of large separations of , they have measured the center-annulus travel–time difference for the average supergranules versus filter width (Fig. 3a). At small widths, the signal strength is approximately linear with filter width until most of the supergranular signal is captured for . Another important parameter of the filter is the subsequent signal to noise ratio. This ratio is shown in Fig. 3b. It is found that narrow filters, in addition to not capturing most of the signal, do not yield the best signal to noise ratio. From Fig. 3b a filter width of was chosen for further study.
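
A phase-speed filter with a prescribed full width at half maximum can be sketched as a Gaussian weight in $\omega/k$. The Gaussian shape, the function name, and the numerical values are assumptions made for illustration; the text does not specify the functional form used by Duvall & Hanasoge (2013).

```python
import math


def phase_speed_filter(omega, k, v0, fwhm):
    """Gaussian weight in horizontal phase speed omega/k.

    v0 is the central phase speed and fwhm the full width at half maximum,
    both in the same units as omega/k.
    """
    if k == 0.0:
        return 0.0  # phase speed undefined; suppress this Fourier component
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    v = omega / k
    return math.exp(-0.5 * ((v - v0) / sigma) ** 2)


# Dimensionless toy values: unit weight at the centre, one half at v0 + fwhm/2.
w_centre = phase_speed_filter(1.5, 1.0, 1.5, 1.0)
w_half = phase_speed_filter(2.0, 1.0, 1.5, 1.0)
print(w_centre, w_half)
```

In practice the weight would be evaluated on the full three-dimensional Fourier grid of the data cube and multiplied with the transformed Dopplergrams before the cross-correlations are computed.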

### 2.1 Discussion

In this section we have touched on two relatively new advances in time-distance helioseismology: ensemble or feature averaging and optimal filtering. Both strategies improve the signal-to-noise ratio of travel–time measurements, which is necessary for measuring the solar subsurface velocities. Of course, other methods such as ring-diagram analysis (Hill, 1988), helioseismic holography (Lindsey & Braun, 1990), and Fourier-Hankel analysis (Braun et al., 1987) can provide independent views of the processes inside the Sun. However, all local helioseismic techniques require a certain strategy of averaging or filtering to increase the signal-to-noise ratio. Another issue of time-distance analysis concerns its reliability and ability to retrieve subsurface structures, especially in the presence of magnetic fields (Gizon et al., 2009; Moradi et al., 2010), but also of subsurface supergranular-like flows as shown by DeGrave et al. (2014). The authors of that simulation study showed that current time-distance techniques were not able to adequately retrieve the supergranular flow pattern and suggested that averaging schemes, as proposed by Duvall & Birch (2010) and Švanda et al. (2011), might help to obtain reliable results in this case.

## 3 Developments in global helioseismology

Global helioseismic analyses of mode frequencies have been very fruitful in shaping our current picture of the Sun’s internal structure and dynamics. The fidelity and resolution of these findings are restricted by the quality of the observations but also by systematic errors entering the analysis methods used to determine the mode parameters from the data. The systematic errors in particular are crucial for global helioseismic inferences and cannot be overcome by improving the quality or length of observations. In the following we focus on some of the difficulties in estimating global mode parameters and on the efforts to identify and eradicate the associated systematic errors by means of the “Global Helioseismic Metrology” project.

### 3.1 Measurement of solar oscillation frequencies: Uncertainties and limitations

A significant part of the uncertainty in global helioseismic measurements has been identified as systematic in nature. The evidence for these errors comes both from the direct comparison of the results provided by different data-analysis techniques applied to the same data (Larson & Schou, 2008; Vorontsov et al., 2009), and from helioseismic inversions, where they are revealed as an internal inconsistency in the input data set. Fig. 4 illustrates systematic errors in published SOHO MDI rotational splitting coefficients. They were revealed by an inversion that attempted to use the splitting measurements, accumulated over five years of observations and corrected for temporal variations in the solar internal rotation (“torsional oscillations”), to improve the measurement of the time-independent component of the rotation (Vorontsov et al., 2002a). The prominent horn-like structures in the mismatch between the data and the inverted model do not allow the measurement to benefit from the prolonged observation. Fig. 5 illustrates a simpler example with centroid ($m$-averaged) frequencies. Here, the centroid frequencies tabulated over the entire span of MDI observations were averaged without any corrections for variations with solar activity, in order to address the equation of state of solar plasma (Vorontsov et al., 2013a). The difference between the averaged frequencies and those measured in the first year of the SOHO mission (low solar activity) demonstrates a well-known frequency dependence, but only on average: the obvious outliers are the higher-degree modes, the most precious part of the data for this particular measurement (modes with turning points in the vicinity of the He II ionization region). Again, attempting to exploit a prolonged observation brings no benefit, but only makes systematic errors more evident.

The challenges to the accurate measurement of the solar oscillation frequencies are illustrated by two examples of Doppler-velocity power spectra shown in Fig. 6. The difficulties originate largely from the spatial leaks coming from modes of neighboring values of degree $l$ (Fig. 6a) and azimuthal order $m$ (Fig. 6b). The accurate modeling of the spatial leaks, which may blend into the target peak, puts very tough requirements on the accuracy of the leakage matrix. Further, the asymmetry of the line profiles (Fig. 6a) needs to be properly accounted for. Finally, deviations from the spherical symmetry of the equilibrium solar configuration lead, in general, to mode coupling: instead of individual modes described by a particular pair of $(l, m)$ values, we have coherent composite states. The biggest effect comes from the mode coupling by differential rotation; an example of how the mode coupling affects the power spectra is the asymmetry of the amplitudes of and leaks, clearly seen in Fig. 6b. The biggest challenge is data analysis at high degree $l$, where modes of individual degree can no longer be identified and the modal analysis has to be replaced by “ridge-fitting” techniques (Rabello-Soares et al., 2008; Korzennik et al., 2013; Reiter et al., 2015). Better spectral modeling in the intermediate-degree range, where systematic errors of frequency measurements are clearly seen and their origin can be identified, will also bring better confidence to the analysis of the lower-degree (Korzennik, 2005) and higher-frequency (Rhodes et al., 2011) domains of the solar-oscillation power spectra.

### 3.2 Global Helioseismic Metrology

Significant efforts are now being invested in enhancing the data analysis in global solar seismology. The standard SOHO MDI data analysis pipeline has been largely redeveloped, from better accounting for instrumental effects when mapping Dopplergrams to spherical harmonics to more elaborate techniques of frequency fitting (Larson & Schou, 2008), and extended for application to SDO HMI data (Larson & Schou, 2011). Below, we address in more detail the prospects of the “Global Helioseismic Metrology” project, based on modeling the solar acoustic oscillations in the continuous spectrum.

For each pair of target degree $l$ and azimuthal order $m$, the observed power spectrum is approximated by the spectral model defined as (Jefferies et al., 2006; Vorontsov & Jefferies, 2013b)

(1) |

In this model, is the phase integral of the trapped acoustic wave – a continuous function of frequency with at frequencies of acoustic resonances. The energy losses are assumed to be localized in the near-surface layers and described by the surface acoustic reflectivity . The average strength of the stochastic excitation is described by the excitation amplitude . The composite background is contributed by both the coherent and incoherent components of the solar noise. Parameter describes line asymmetry, which is governed by the depth and parity of the excitation source and by the coherent component of the solar noise. and are two components of the leakage matrix, which account for vertical and horizontal velocities on the solar surface, is the ratio of the magnitudes of the two velocity components. The leakage matrix is calculated using a computationally efficient semi-analytic approach described in (Vorontsov & Jefferies, 2005) and extended later to account for non-zero -angle and for the discrete bin sampling implemented in the SOHO MDI “medium-l” program. Mode coupling by differential rotation is treated as suggested in (Vorontsov, 2007).

The parameters of the spectral model resulting from fitting SOHO MDI power spectra obtained from the first year of observations (at low solar activity) are illustrated in Fig. 7. The results are shown for p modes of radial order from 1 to 10 and for f modes, in the degree range limited by . The maximum-likelihood solution was obtained by an iterative improvement of the spectral parameters and of the resonant frequencies and frequency splittings. The resulting agreement between the model and the data is almost adequate, as can be judged by comparing the $m$-averaged power spectra (Fig. 6). A small systematic inaccuracy in the predicted amplitudes of spatial leaks remains, however. The origin of this mismatch is not yet properly understood; it may be related, in part, to an asymmetric distortion of the point-spread function of the MDI instrument (Rabello-Soares et al., 2008).

An important property of this model, which is behind its diagnostic potential, is that, when the degree is not too high (less than about 100), the spectral parameters of individual modes do not depend on the degree and collapse to slowly varying functions of frequency only. The composite background (Fig. 7d) may look like an exception. However, an accurate measurement of the background at frequencies higher than about 2 mHz is a difficult task, because the background level appears to be significantly smaller than the resonant signals coming from the spatial leaks. As a result, the measurement of the background can be distorted by small inaccuracies in the leakage matrix. The fitted background is also significantly higher than the average at the lowest values of target degree ( mode of in Fig. 7d). A possible explanation of this excess is the contribution of the instrumental noise, which can probably be modeled by adding an component to the (otherwise degree-independent) solar .

Interestingly, the analysis of solar f modes reveals the same values of the excitation amplitude and “acoustic reflectivity” as solar p modes of similar frequencies (Fig. 7a,b). This indicates that excitation and damping mechanisms do not distinguish between p and f modes, despite the difference in their physical nature (the f modes are incompressible waves). The composite background of f modes appears to be smaller than that of p modes (Fig. 7d). Since at low frequencies the fitted background is dominated by the granulation noise, one possible explanation is that its contribution to the observational power should be modeled with a smaller (or zero) value of , the ratio of horizontal and vertical velocities. In data processing, a simple theoretical value has been used, which corresponds to incompressible motion: this approximation is hardly relevant to the granulation noise.

The major benefit to the p-mode data analysis, expected from the global description of spectral parameters, is that the global spectral variables ( as functions of frequency), inferred from the large volume of high-quality intermediate-degree data, can be used in measurements at lower degree $l$. Reducing the number of free parameters in low-degree measurements will bring significant improvement to the accuracy and precision of the frequencies (and frequency splittings) of the modes which penetrate into the deepest solar interior.

Fig. 8 shows the rotational splitting coefficients resulting from the measurement described above, in comparison with published splitting coefficients. The horn-like structures, which signify systematic errors (cf. Fig. 4), are now eliminated. As indicated by a detailed analysis (Vorontsov et al., 2009), the dominant part of the systematic errors came from discarding the effects of mode coupling by differential rotation in the original version of the SOHO MDI data analysis pipeline (which was later improved to include these effects, among others). Systematic errors in centroid frequencies, illustrated by Fig. 5, are apparently due to the temporal variation of the plate-scale error.

The differences between the centroid frequencies and their published values are shown in Fig. 9. The major part of the discrepancies is due to the line asymmetries, discarded in the original version of the MDI data analysis pipeline (cf. Fig. 7c). Smaller-scale features are apparently due to the combination of the mode-coupling effects with the plate-scale error. The centroid frequencies measured in the “Global Helioseismic Metrology” project have been used in a recent study targeted at the seismic diagnostics of the equation of state (Vorontsov et al., 2013a). Interestingly, it was found that these frequencies allow a significantly better agreement with solar models to be achieved. This is quite an unusual finding: better accuracy of observational data brings better agreement with theoretical models.

The final goal of the project is a new approach to helioseismic inversion, where frequency measurements will be eliminated from the analysis, and the parameters of the rotating solar model will be matched directly with p-mode power spectra. This approach will bring the benefits of streamlined regularization, above all by eliminating problems with error correlation and possible mode misidentification in frequency measurements.

### 3.3 Global helioseismology from cross-spectral analysis

In the past, global helioseismology was very successful in measuring the differential rotation inside the Sun from the splitting of the p-mode frequencies. But the meridional flow in the deeper interior is hardly accessible from mode frequencies, since the flow is small in amplitude and its influence on the frequencies is only of second order in the flow. Numerical simulations suggest frequency shifts of the order of a few nHz due to the flow (Roth & Stix, 2008; Chatterjee & Antia, 2009; Schad et al., 2011; Vorontsov, 2011). The perturbation of another characteristic of resonant waves by the meridional flow, the mode eigenfunction, and its potential for helioseismic analyses were recognized early but largely neglected, likely due to the difficulty of measuring this kind of perturbation and of discriminating it from systematic instrumental effects in the observations. Below we address the recent advances in using cross-spectral analysis for a global helioseismic measurement of the meridional flow from the perturbation of mode eigenfunctions.

#### Perturbation of mode eigenfunctions by advective mode coupling

Expressions for the eigenfunctions perturbed by the meridional flow were derived by several authors (Woodard, 2000; Schad et al., 2011; Vorontsov, 2011), who, however, used partly different assumptions and approximations. The perturbations are considered with respect to a flow-free, purely hydrodynamic reference model of the Sun from which the frequencies and eigenfunctions, , of the unperturbed p-modes are obtained. A flow leads to an advection of the acoustic waves, which causes a perturbation of eigenfunctions and eigenfrequencies of the p-modes with respect to the reference model. If the flow amplitude is small compared to the speed of sound, the perturbed eigenfunctions can be expanded linearly in terms of the unperturbed eigenfunctions: , where substitutes the triplet of the wave numbers of a mode. The perturbations are often considered as a kind of mode coupling and the expansion coefficients specify the coupling coefficients which determine the strength of the coupling.

Following the general mathematical framework for the perturbation of p-modes from Lavely & Ritzwoller (1992), Schad et al. (2011) have shown that the coupling coefficients of a mode due to the meridional flow are approximated to first order by a linear integral equation, whose integrand specifies the advection kernel,

(2) |

where . For the meridional flow, the coupling coefficients are purely imaginary. They are expected to be largest for modes of similar frequency and wave numbers . The meridional flow does not contribute to a self-coupling of modes. As a consequence, the shift of the mode frequencies due to the meridional flow is only of second order in the flow. Assuming azimuthal symmetry for the meridional flow, coupling is only possible between modes of equal azimuthal order, and the coupling coefficient for fixed pairs of and may be considered as a function of $m$. This function can be expanded in terms of polynomials in azimuthal order, where the polynomial order is given by the harmonic degree of the individual meridional flow components when expanded in spherical harmonics . Different polynomial expansions have been suggested. Schad et al. (2011) use a complete set of orthogonal polynomials that are based on the Clebsch–Gordan coefficients. In the asymptotic case of high degree and for coupling modes of equal radial order , Vorontsov (2011) uses associated Legendre functions, which are, however, not perfectly orthogonal on a discrete grid in $m$. The expression in Equation (2) defines a linear inversion problem. The polynomial expansion makes it possible to simplify the expression of the coupling coefficients in terms of one-dimensional integral equations and to investigate the harmonic components of the flow separately.
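
Why the absence of self-coupling pushes the frequency shift to second order can be seen from standard non-degenerate perturbation theory. The following is a schematic sketch and not the derivation of the cited papers: $H_{kk'}$ (matrix element of the advection operator between unperturbed modes $k$ and $k'$, linear in the flow amplitude $u$) and $\omega_k$ (unperturbed frequency) are notation assumed here, the operator is taken to be Hermitian, and nearly degenerate modes would need a separate quasi-degenerate treatment.

```latex
\begin{align}
  \delta(\omega_k^2) &\approx H_{kk}
      + \sum_{k' \neq k} \frac{|H_{kk'}|^2}{\omega_k^2 - \omega_{k'}^2}, \\
  H_{kk} = 0 \quad &\Longrightarrow \quad
  \delta(\omega_k^2) \approx
      \sum_{k' \neq k} \frac{|H_{kk'}|^2}{\omega_k^2 - \omega_{k'}^2}
      = \mathcal{O}(u^2).
\end{align}
```

The eigenfunction perturbation, by contrast, is already of first order in $H_{kk'}$, which is what makes it the more sensitive observable for a weak flow.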

#### Cross-spectral analysis of solar oscillations

The oscillating amplitude of individual global modes of harmonic degree $l$ and azimuthal order $m$ can be extracted by a spherical harmonic transformation (SHT) from sequences of Dopplergrams. Each of these spherical harmonic (SH) coefficients is a weighted sum over the amplitudes of several modes: . The weighting is given by the coupling coefficients , the amplitude of the radial eigenfunction component of mode observed at radius , where the respective absorption line is formed, and the leakage matrix elements of the observing instrument. The leakage originates from the imperfect orthogonality of the spherical harmonic functions when not integrated over the complete solar sphere. The leakage matrix elements are further largely influenced by the apodization mask and the projection of the solar velocity field onto the line-of-sight axis. As a consequence of leakage and mode coupling, modes of a specific target degree and order are not perfectly separated from modes of neighboring degree and order by the SHT. This results in a cross-talk between the SH coefficients, and the power of a mode is spread amongst the spectra of spherical harmonic coefficients of neighboring , where it shows up as sidelobes at the respective mode frequency.
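
The origin of leakage in incomplete coverage can be illustrated with a deliberately simplified toy: zonal ($m = 0$) harmonics reduce to Legendre polynomials in $x = \cos\theta$, which are orthogonal over the full range of $x$ but not over a truncated one. The truncation to $0 \le x \le 1$ is chosen for simplicity and is not the geometry of the visible disk; apodization and line-of-sight projection are ignored, and the function names are hypothetical.

```python
def legendre(l, x):
    """Legendre polynomial P_l(x) from the Bonnet recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def overlap(l1, l2, x_min, x_max, n_steps=20000):
    """Midpoint-rule integral of P_l1(x) * P_l2(x) over [x_min, x_max]."""
    dx = (x_max - x_min) / n_steps
    total = 0.0
    for i in range(n_steps):
        x = x_min + (i + 0.5) * dx
        total += legendre(l1, x) * legendre(l2, x)
    return total * dx


full = overlap(2, 3, -1.0, 1.0)  # complete coverage: orthogonal
part = overlap(2, 3, 0.0, 1.0)   # truncated coverage: cross-talk appears
print(full, part)
```

The non-zero truncated overlap (analytically 1/8 for this pair) plays the role of an off-diagonal leakage matrix element between degrees 2 and 3 in this toy setting.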

Woodard (2000) pointed out the sensitivity of the cross-spectrum of the SH coefficients to mode coupling. The cross-spectrum of two SH coefficients and is defined by , where is the Fourier transform of and denotes the statistical expectation value of the random variable . It is related to the coupling coefficients by

(3) |

where it is assumed that the solar modes are excited independently and is the auto-spectrum of mode .
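
The structure of such a relation can be checked with a small Monte Carlo sketch at a single frequency. It assumes that each SH coefficient is a weighted sum of independently excited complex Gaussian mode amplitudes, so that the expected cross-spectrum is a weighted sum of the mode auto-spectra. The weights stand in for products of leakage and coupling terms; their values, the mode powers, and the function names are hypothetical and do not reproduce Equation (3).

```python
import random


def complex_normal(rng, power):
    """Zero-mean complex Gaussian amplitude with expected squared modulus `power`."""
    s = (power / 2.0) ** 0.5
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def estimate_cross_spectrum(weights_1, weights_2, powers, n_real=20000, seed=1):
    """Average of a1 * conj(a2) over independent realizations of the mode amplitudes."""
    rng = random.Random(seed)
    acc = 0j
    for _ in range(n_real):
        amps = [complex_normal(rng, p) for p in powers]
        a1 = sum(w * a for w, a in zip(weights_1, amps))
        a2 = sum(w * a for w, a in zip(weights_2, amps))
        acc += a1 * a2.conjugate()
    return acc / n_real


def expected_cross_spectrum(weights_1, weights_2, powers):
    """Analytic expectation when the modes are excited independently."""
    return sum(w1 * w2.conjugate() * p
               for w1, w2, p in zip(weights_1, weights_2, powers))


# Two modes; the small imaginary weight mimics a weak coupling term.
w1, w2, powers = [1.0, 0.2j], [0.3, 1.0], [2.0, 1.0]
est = estimate_cross_spectrum(w1, w2, powers)
exp = expected_cross_spectrum(w1, w2, powers)
print(est, exp)
```

Cross terms between different modes average out because the excitations are independent, which is the assumption stated in the text.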

Schad et al. (2011) introduced a slightly different quantity to relate the coupling-coefficients with observations: the amplitude ratio . It measures the relative cross-talk of power of a reference mode with frequency between the SH coefficients due to mode coupling and leakage. Given certain assumptions on the separability of the mode frequencies in the solar oscillation spectrum, which are met only by modes of low and medium harmonic degree , the amplitude ratio is independent of the amplitude and can be approximated in first order by (Schad et al., 2011)

(4) |

Its expectation value is in leading order determined by the cross-spectrum, since . This quantity, denoted as the complex gain, is related to the gain in linear filter theory. The asymptotic statistical distribution of the estimator of the complex gain can be expressed analytically (Schad et al., 2013).
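
In linear filter theory the gain between an input and an output is estimated as the ratio of their cross-spectrum to the input auto-spectrum. The sketch below applies this estimator to synthetic data at a single frequency bin; the purely imaginary "true" gain, the noise level, and all names are hypothetical choices for illustration and are not the estimator or the values of Schad et al. (2013).

```python
import random


def estimate_complex_gain(x, y):
    """Least-squares gain g in y ~ g * x: cross-spectrum over auto-spectrum."""
    cross = sum(yi * xi.conjugate() for xi, yi in zip(x, y))
    auto = sum(abs(xi) ** 2 for xi in x)
    return cross / auto


rng = random.Random(0)
true_gain = 0.05j  # purely imaginary, as for the meridional-flow coupling
x = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(5000)]
y = [true_gain * xi + 0.01 * complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
     for xi in x]
gain = estimate_complex_gain(x, y)
print(gain)
```

Because the estimator is a ratio, it does not depend on the overall amplitude of the reference signal, which mirrors the amplitude independence of the ratio mentioned above.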

#### Influence of differential rotation

The toroidal velocity field of differential rotation also leads to a coupling of modes, which cannot be neglected in the eigenfunction perturbation analysis. The perturbation of eigenfunctions due to rotation was investigated by Vorontsov (2007, 2011) for the asymptotic case of large degree . The case of low and medium degree was investigated by Schad (2013). Formally, rotation adds a term to the coupling coefficient: . Again, rotation couples only modes of similar frequency and of equal azimuthal order, and the coupling coefficients can be expanded by polynomials in which are equal to the ones used for meridional flow. In contrast to the coupling coefficients due to the meridional flow, however, the rotational coupling coefficients are real valued and anti-symmetric with respect to the azimuthal order. The different symmetry properties can be exploited to compensate approximately for the influence of rotation on the amplitude ratios in analyses for the meridional flow. In this case, the amplitude ratios are symmetrized with respect to azimuthal order (Schad et al., 2013). Exemplary amplitude ratios estimated from about 6 years of MDI data for and , as well as simulated amplitude ratios from numerical forward computations of simple flow profiles, are shown in Fig. 10. The influence of rotation is clearly visible: both the real and the imaginary part of the amplitude ratios deviate significantly from azimuthal symmetry if rotation is present. The compensation of the rotational influence by azimuthal symmetrization is also illustrated using simulated amplitude ratios. The symmetrized amplitude ratios match very well the amplitude ratios obtained for a velocity field without solar rotation.
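The idea behind the symmetrization can be sketched with a toy decomposition. The example below assumes, purely for illustration, that an amplitude ratio as a function of azimuthal order $m$ is the sum of a flow contribution that is even in $m$ and a real-valued rotational contribution that is odd in $m$; averaging the values at $+m$ and $-m$ then removes the odd part. The functional forms and magnitudes are hypothetical and are not taken from Schad et al. (2013).

```python
def symmetrize(ratio):
    """Average a(m) and a(-m); this removes any part that is odd in m."""
    return {m: 0.5 * (ratio[m] + ratio[-m]) for m in ratio}


l = 5                      # toy harmonic degree
orders = range(-l, l + 1)  # azimuthal orders m = -l, ..., l

# Hypothetical contributions to the amplitude ratio:
flow = {m: 1j * 0.01 * (1 - (m / l) ** 2) for m in orders}  # even in m
rotation = {m: 0.03 * m / l for m in orders}                # real, odd in m

observed = {m: flow[m] + rotation[m] for m in orders}
symmetrized = symmetrize(observed)

for m in orders:
    print(m, observed[m], symmetrized[m])
```

In this idealized setting the symmetrized ratios coincide with the flow-only ratios; in real data the compensation is only approximate, as stated above.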

#### Application to data: Measurement of the meridional flow

First measurements of the meridional flow from an analysis of the perturbation of $p$-mode eigenfunctions were given by Schad et al. (2012, 2013) and Woodard et al. (2013).

Woodard et al. (2013) fitted a model to cross-spectra estimated from HMI data to measure the horizontal component of the meridional flow. The model for the cross-spectrum incorporates instrumental leakage, differential rotation, and the solar background. They analyzed HMI data with a length of 500 days and investigated couplings between modes of the same radial order for the harmonic degrees . Cross-spectra estimated from the HMI data and averaged over azimuthal order as well as cross-spectra after fitting different cross-spectral models to the data are depicted in Fig. 11. The comparison illustrates the improvement of the cross-spectral model by inclusion of flow-dependent eigenfunction perturbations (Woodard et al., 2013).

The estimated peak velocities of the horizontal flow component with harmonic degree 2 () are shown in Fig. 11 as a function of that is related to the lower turning point of the acoustic waves (Woodard et al., 2013).

Near the surface, the horizontal peak velocities are of the order of 20 m/s, as expected from local helioseismic measurements. Toward the interior, the velocities exhibit an unexpectedly large increase, and the authors assume that their measurements are likely affected by a systematic effect (Woodard et al., 2013).

Schad et al. (2012, 2013) set up a global helioseismic estimation scheme for the meridional flow based on the analysis of amplitude ratios of coupling modes. It takes leakage into account, and the influence of rotation is compensated by symmetrization of the amplitude ratios. They applied their method to MDI data covering 2004–2010 for modes with and investigated couplings between modes of different radial order and for flow components of harmonic degree . They found two flow components of harmonic degree and , which differ significantly from zero. Individual components, like the flow component were measured down to 0.5 $R_\odot$. The radial component of the flow is estimated from a composite of the individual flow components. The horizontal flow component is reconstructed from the radial components assuming mass conservation. The composite of flow components of even degree is depicted in Fig. 12 as a function of radius and latitude. The meridional flow exhibits a multi-cellular pattern over latitude and depth. Near the solar surface, the horizontal flow is consistent with subsurface flow measurements from ring-diagram analyses, indicating a poleward directed flow on each hemisphere with a small-scale latitudinal modulation and a speed of about 20 m/s at mid-latitudes. Their findings substantiate the assumption that the flow is confined between the tachocline region and the solar surface.

### 3.4 Discussion

The systematic features of the estimates of global mode parameters considered in Sec. 3.1 were found from analyses of medium-$l$ MDI data and can be reduced by the approach used by the global metrology project. Parameters estimated from data from the same instrument but with different preprocessing, or from data from other instruments, e.g., HMI or GONG, may show other systematic features. But independent of the instrument, one cannot avoid, for example, leakage of mode power due to the observational restrictions or mode coupling dominated by rotation.

The development of global analysis methods based on mode eigenfunction perturbations and the analysis of cross-spectra of spherical harmonic Doppler velocity coefficients is a promising emerging field for inferences on the meridional flow in the deep solar interior. In this approach, mode eigenfunction perturbations from other sources, for example from differential rotation, must be taken into account. Conversely, analyses of mode eigenfunction perturbations may also be used to investigate rotation. In the near-surface layers, the meridional flow measured by Schad et al. (2013) seems to be consistent with flow measurements from a ring-diagram analysis. By contrast, the recent measurement of the meridional flow from a deep-focusing time-distance helioseismic analysis of HMI data by Zhao et al. (2013) seems to disagree with their findings, although the HMI data investigated there cover a later observation period. Further investigations are necessary to test the reliability of these results.

The success of each of the global helioseismic methods considered here strongly depends on accurate knowledge of the leakage matrix. The inferences from global helioseismic analyses suffer from systematic effects if the leakage matrix is inaccurate or if the models used to estimate the helioseismic parameters from spectra or cross-spectra are misspecified. For example, a comparison study of splitting coefficients and rotation profiles obtained from medium-$l$ and full-disk MDI data indicates that the leakage matrix used for the medium-$l$ MDI data probably does not perfectly account for the apodization function, which might be responsible for a spurious polar jet in the solar rotation profile derived from the medium-$l$ MDI data (Larson & Schou, 2009).

The results obtained so far clearly illustrate the necessity of putting efforts into enhancing the models of the solar oscillation spectrum and of improving the accuracy of the leakage matrix of the respective instruments. A difficulty in the computation of the leakage matrix comes along with taking into account systematics that change over time, like the -angle, or from not properly known instrument characteristics, like the plate-scale error or the point spread function.

## 4 Data assimilation in solar physics

Understanding and predicting the solar activity cycle poses one of the main problems in solar physics and comes along with questions about the timing, amplitude, and shape of the currently evolving and following cycle. Convection, rotation and the mean meridional flow are thought to be key ingredients that drive the generation and evolution of the solar magnetic field. In the previous sections it was shown that new helioseismic techniques could provide estimates of some of these ingredients. We will see in this section how observations and helioseismic measurements could be combined with physical models to improve our knowledge of the solar activity cycle.

We have entered an era where extremely large amounts of data are available concerning the Sun, thanks to the high-resolution observations of satellites like Hinode or more recently SDO. At the same time, considerable progress has been made on the multi-dimensional numerical simulations of highly non-linear physical processes interacting in the solar interior and in its atmosphere. Although still far from realistic values of the parameters, several impressive local and global computations now seem to reach high levels of turbulence and capture relevant physical processes occurring in our star. Various 3D MHD codes are used for that purpose, among which the ASH code (e.g. Miesch et al., 2008; Brun et al., 2011; Nelson et al., 2013; Alvan et al., 2014), the PENCIL code (e.g. Käpylä et al., 2012; Warnecke et al., 2014), the MURaM code (e.g. Cheung et al., 2010; Rempel & Cheung, 2014) or the STAGGER code (e.g. Stein & Nordlund, 2012). This improvement is intimately linked to the recent fast development of high performance computing, several computers in the world now reaching the performance of tens of PetaFlops. Finally, we live today in a technological society in which strong solar flares, CMEs or any violent events linked to solar activity could cause significant damage to satellites, air traffic or telecommunication networks. That is why a solar cycle panel, whose role is to produce predictions of solar activity, was created in 1997 and has provided us with estimates of the sunspot number for Cycle 23 and current Cycle 24.

As Cycle 24 progressed, it became increasingly clear that it would be a weak cycle. A quantitative estimate of the cycle strength is given by the Wolf number, defined as $R = k\,(10\,g + f)$, with $g$ the number of sunspot groups, $f$ the total number of individual sunspots in all groups and $k$ a variable scaling factor that accounts for instruments or observation conditions. We now know that for Cycle 24, the monthly smoothed Wolf number reached a peak of about 82 in April 2014, which will probably become the official maximum. This second peak surpassed the level of the first peak (about 67 in February 2012). Many cycles are double peaked, but this is the first in which the second peak in sunspot number was larger than the first. These features make Cycle 24 the weakest cycle since Cycle 14, which peaked in 1906. Among the predictions, less than had anticipated such a small number. It thus still seems extremely difficult to provide reliable predictions of the long-term solar activity with the techniques that have been used so far (mostly relying on geomagnetic precursors or other statistical estimates; see Hathaway 2009 for a review on the subject). In meteorology, data assimilation, which combines observational data with numerical models, has been used for decades and is now routine in weather forecasting. Considering the high-quality observations at our disposal, the recent progress in numerical simulations and the need to produce predictions of the amplitude, timing and shape of the next solar cycle, it seems reasonable to try to apply data assimilation to solar physics (Brun, 2007).
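As a small worked illustration of the quantities discussed above, the sketch below computes a Wolf number from the standard definition $R = k\,(10\,g + f)$ and applies a 13-month running mean with half weights at both ends. The smoothing recipe is the commonly used one and is an assumption here, since the text does not specify how the monthly smoothed values were obtained; the function names and input numbers are hypothetical.

```python
def wolf_number(groups, spots, k=1.0):
    """Wolf number R = k * (10 * g + f) for g groups and f individual spots."""
    return k * (10 * groups + spots)


def smooth_13_month(monthly):
    """13-month running mean with half weights on the first and last month.

    Returns one smoothed value per month that has six neighbours on each side.
    """
    smoothed = []
    for i in range(6, len(monthly) - 6):
        window = monthly[i - 6:i + 7]
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        smoothed.append(total / 12.0)
    return smoothed


# Hypothetical daily count: 3 groups containing 12 spots in total.
print(wolf_number(3, 12))          # observer with k = 1
print(wolf_number(3, 12, k=0.6))   # observer with a different scaling factor

# Hypothetical monthly means; a constant series is left unchanged.
print(smooth_13_month([50.0] * 13))
```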

### 4.1 First attempts to introduce data into models

There have been first attempts to connect models and data, not exactly through data assimilation but rather by driving models with a time series of well-chosen data. This procedure has been implemented for flux-transport dynamo models and for photospheric and heliospheric magnetic field evolutions. In their mean-field dynamo model, Dikpati & Gilman (2006) introduced a surface source of magnetic field that depended upon the sunspot areas observed since 1874, whereas Choudhuri et al. (2007) chose the surface field at minimum as their driving observations. Both models produced good agreement with previously observed cycles but differed completely in their predictions for Cycle 24. One of the reasons was that one model (Dikpati’s) was dominated by advection while the other (Choudhuri’s) was dominated by diffusion, which drastically changes the characteristics of the memory of the system, as was shown by Yeates et al. (2008). It should be noted that both predictions turned out to be quite far from reality, either in the timing of the cycle or in its amplitude. Data-driven models for the solar atmosphere are now also being developed. A first attempt was made by Schrijver & DeRosa (2003), who introduced SOHO/MDI magnetograms into a flux-dispersal model to study the influence on coronal reconfigurations. More recently, Cheung & DeRosa (2012) simulated the evolution of the active region coronal field driven by temporal sequences of photospheric magnetograms from SDO/HMI. Under certain conditions, they found that data-driven simulations could produce flux ropes that were ejected from the modeled active region due to loss of equilibrium, possibly showing a way to predict violent events through simulations.

In the future, we would like observations and models to truly work together, feeding the simulations with data so that the model improves itself and reliable predictions become possible. Data assimilation techniques are designed for exactly this purpose (see Kalnay 2003 for a general introduction related to atmospheric sciences and Fournier et al. 2010 for applications to geophysics). They combine observational data and numerical models to produce what is called an analysis: the new information provided by the observational data is taken into account in order to correct and advance in time the “background” state that the numerical code has predicted. The correction, or increment, is obtained by weighting the innovation, i.e. the difference between the observational data and the observation operator applied to the background state. More specifically, let $\mathbf{x}^b$ be the background state vector characterizing the current state of the model, $H$ the observation operator and $\mathbf{y}$ the observational data to be assimilated in the model. One can then show that the analysis $\mathbf{x}^a$ is:

$$\mathbf{x}^a = \mathbf{x}^b + \mathbf{K}\left(\mathbf{y} - H(\mathbf{x}^b)\right) \qquad (5)$$

where $\mathbf{K}$ represents weights whose exact determination differs from one assimilation technique to another. Two approaches and their applications to solar physics will now be briefly presented and discussed.
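The analysis step can be made concrete with a few lines of code. The sketch below assumes the standard form of the update, analysis = background + weights × (observation − observation operator applied to the background), with a linear observation operator given as a small matrix. The state, the operator and the weights are arbitrary illustrative numbers and the function name is hypothetical.

```python
def analysis_update(x_b, y, H, K):
    """Return x_a = x_b + K (y - H x_b) for small matrices stored as lists."""
    n_state, n_obs = len(x_b), len(y)
    # Observation operator applied to the background state.
    hx = [sum(H[i][j] * x_b[j] for j in range(n_state)) for i in range(n_obs)]
    # Innovation: mismatch between data and background in observation space.
    innovation = [y[i] - hx[i] for i in range(n_obs)]
    # Increment: the innovation mapped back to state space by the weights.
    return [x_b[j] + sum(K[j][i] * innovation[i] for i in range(n_obs))
            for j in range(n_state)]


x_b = [1.0, 2.0]        # background state with two variables
H = [[1.0, 0.0]]        # only the first variable is observed
y = [1.5]               # one observation
K = [[0.5], [0.25]]     # illustrative weights, not derived from statistics

x_a = analysis_update(x_b, y, H, K)
print(x_a)
```

Note that the second, unobserved variable is also corrected: the weights carry the information from the observed to the unobserved part of the state, which is why their determination is the central difference between assimilation techniques.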

### 4.2 Sequential assimilation and application to solar physics

In the sequential assimilation technique, the background state is advanced in time by the numerical model and corrected (i.e. an analysis is performed) each time an observation becomes available. The analysis is performed using Equation 5, where the weight matrix is given by the Kalman gain matrix, which is a combination of the covariance matrices of the forecast errors and of the observational errors.

The analysis at time $t_i$ thus provides the new state vector from which the forecast is calculated by running the numerical model. This forecast step produces the new background, which in turn will be corrected by the observation at time $t_{i+1}$. This sequential process is illustrated in Fig. 13. The technique thus propagates information forward in time and also gives an estimate of the forecast errors and of their evolution.
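The forecast and analysis cycle described above can be sketched with a scalar Kalman filter. This is a minimal toy, assuming a linear one-variable model and noise-free synthetic observations; it is not the scheme used in any of the studies cited here, and all parameter values and names are hypothetical.

```python
def kalman_step(x, p, y, a, q, h, r):
    """One forecast/analysis cycle of a scalar Kalman filter.

    x, p : previous analysis and its error variance
    y    : new observation;  a : model factor;  q : model error variance
    h    : observation operator;  r : observation error variance
    """
    # Forecast: advance the state and its error variance with the model.
    x_f = a * x
    p_f = a * a * p + q
    # Analysis: Kalman gain from forecast and observation error variances.
    k = p_f * h / (h * h * p_f + r)
    x_a = x_f + k * (y - h * x_f)
    p_a = (1.0 - k * h) * p_f
    return x_a, p_a


a, q, h, r = 0.95, 0.0, 1.0, 0.01   # toy model and error settings
truth = 2.0                         # hidden true state
x, p = 0.0, 1.0                     # poor first guess, large uncertainty

for _ in range(10):
    truth = a * truth               # the true system evolves
    y = h * truth                   # synthetic, noise-free observation
    x, p = kalman_step(x, p, y, a, q, h, r)

print(x, truth, p)
```

The error variance `p` is propagated along with the state, which is the sense in which the sequential technique also provides an estimate of the forecast errors and of their evolution.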

As far as predictions of future solar activity are concerned, sequential assimilation was implemented for the first time by Kitiashvili & Kosovichev (2008) in a 1D mean-field dynamo model that jointly evolves the three components of the magnetic field and a measure of the magnetic helicity. The observations used were the annually smoothed Wolf sunspot number for the period 1857-2007. To derive the observational operator , the following relationship between the Wolf number and the toroidal (in the azimuthal direction) magnetic field was used: . The model used was a classical dynamo model in which the toroidal field owes its origin to the differential rotation shearing the poloidal (in the meridian plane) field lines (the $\Omega$-effect) and where the poloidal field is due to helical turbulence within the solar convection zone acting on the toroidal field (the $\alpha$-effect). This model contains a number of simplifications (see Brun et al. 2013 and Charbonneau 2010), most notably a rather crude parameterization of the effects of turbulence on the large-scale magnetic field, but it has the advantage of producing a cyclic variation of and an exponential growth of the magnetic energy (saturated here by the back-reaction of the magnetic field on the $\alpha$-effect).

Fig. 14 shows the result of their forecasting step for Cycle 24, after they applied their sequential assimilation to obtain the analysis at the time the paper was written (2008). Their predictions were shown for three different estimates of the Wolf number in 2008, since the real data were not yet available. The figure shows quite good agreement with what Cycle 24 looks like now, with a predicted maximum of around 80 in 2013 (slightly early). However, we should keep in mind that the model was very simple and the relationship between the observational data and the outputs of the model rather uncertain, leading the authors to state in their conclusion that additional observations, like the latitudinal distribution of sunspots, should probably also be taken into account to make progress on the subject.

First steps towards using a more complete model including a large-scale meridional circulation were undertaken recently by Dikpati & Anderson (2012) and Dikpati et al. (2014). Their idea is to use a 2D dynamo model in which the poloidal field is generated by the decay of active regions emerging at the solar surface (Babcock, 1961; Leighton, 1969) and not by small-scale turbulence in the convection zone as in classical dynamo models. The data they are planning to use are similar to what Kitiashvili & Kosovichev (2008) chose, namely the monthly smoothed sunspot number data from the Royal Observatory of Belgium, but they also intend to make efficient use of data concerning the solar meridional circulation. Their first step was thus to determine the response time of the whole system to perturbations of the meridional flow, which is not very well measured below the first few Mm of the Sun and which is known to produce large changes in the timing and possibly the shape of the magnetic cycles in those types of flux-transport dynamo models (Dikpati & Charbonneau, 1999; Jouve & Brun, 2007). Dikpati & Anderson found that in the advection-dominated regimes they considered, the model's time of peak response to a change in flow speed was rather short compared to the full circulation time, and thus that a modification of the amplitude of the flow would quickly show large changes in the evolution of the magnetic field. In a recent subsequent paper (Dikpati et al., 2014), they applied a sequential data assimilation technique to reconstruct the meridional flow speed at the solar surface (fixing the meridional flow profile to one large circulation cell per hemisphere) from synthetic observations of the magnetic field. The synthetic observations were produced by running the model with a fixed meridional flow profile and speed and then adding noise to mimic observational errors. They found that the best reconstruction of the meridional flow speed could be obtained when 10 or more observations were used with an updating time of 15 days and an observational error of less than .

Another relevant quantity to assess when dealing with forecasting in such dynamical systems is what Lhuillier et al. (2011) call the forecast horizon, i.e. the time interval over which reliable predictions can be achieved. They made a detailed analysis of the growth rate of perturbations applied to the magnetic, velocity or temperature field in geodynamo simulations and found that the limit of predictability is a combination of this growth time (estimated to be about 30 yr), of all types of errors affecting the initial conditions and of the limited numerical resolution. Recently, Sanchez et al. (2014) performed the same kind of analysis for a flux-transport mean-field dynamo model, similar to the one considered by Dikpati et al. (2014). They measured the rate associated with the exponential growth of an initial perturbation of the model trajectory, and found a characteristic $e$-folding time of 2.76 solar cycle durations. These results are quite promising for possible future predictions of solar activity. However, thorough studies of the sensitivity to all model parameters and of the predictive skill of these models will have to be undertaken before more complete sequential data assimilation can be applied and before predictions coming from simplified solar dynamo models can be relied upon.
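How an $e$-folding time is extracted from a twin experiment can be sketched as follows. The example assumes a deliberately trivial unstable model (multiplication by a fixed factor per step) in place of a dynamo simulation: a reference and a slightly perturbed trajectory are advanced together, and the growth rate is obtained from a least-squares fit to the logarithm of their separation. The model, the perturbation size and the function names are hypothetical.

```python
import math


def e_folding_time(times, separations):
    """Fit log(separation) = const + t / tau by least squares; return tau."""
    logs = [math.log(d) for d in separations]
    t_mean = sum(times) / len(times)
    l_mean = sum(logs) / len(logs)
    slope = (sum((t - t_mean) * (v - l_mean) for t, v in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    return 1.0 / slope


def model_step(x, growth=1.1):
    """Toy unstable model: every step amplifies the state by a fixed factor."""
    return growth * x


x_ref, x_pert = 1.0, 1.0 + 1e-6   # reference and perturbed initial states
times, separations = [], []
for n in range(1, 31):
    x_ref, x_pert = model_step(x_ref), model_step(x_pert)
    times.append(n)
    separations.append(abs(x_pert - x_ref))

tau = e_folding_time(times, separations)
print(tau)  # in units of model steps
```

In a chaotic model the fit would be restricted to the interval in which the separation grows exponentially, before it saturates at the size of the attractor.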

### 4.3 Variational assimilation and applications to solar physics

As opposed to sequential assimilation, the variational technique consists in adjusting the trajectory of the model to observations over a significant time interval. This is illustrated in Fig. 15. More formally, the analysis is found here by minimizing a cost function $J$, defined over the entire time window in which observations are available. This cost (or objective) function is the sum of two terms. The first one measures the distance between the outputs of the model and the observations, and the second one measures the distance to an a priori estimate of the background state, if there is any. This second term is called the background term. It can be used to provide the system with information about the regularity of the solutions we are looking for, approximations that our flows need to satisfy or typical values we expect at certain points in the domain. After the analysis is found, the forecast step is similar to the one performed in the sequential technique: it is calculated by applying the numerical model to the analysis which has just been determined. This technique propagates information both forward and backward in time, since the present state is estimated using the past and future observations available over the entire time window. This is important if we wish to reassimilate past data or, in other words, to propagate the current quality of observational data backward in time. This advantage of variational assimilation might be of great use for solar physics, for example if we wish to gain better insight into the state of the system in periods where observations are missing because of a lack of instruments or of a lack of surface events, as in the periods of grand minima. However, as stated before, we should keep in mind that the quality of the forecast will be strongly limited by the range of predictability in such dynamical systems (Lhuillier et al., 2011).
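The structure of such a cost function can be illustrated with a one-variable toy problem. The sketch assumes a linear model in which the state at step $i$ is the initial state multiplied by a fixed factor to the power $i$, Gaussian error statistics with given standard deviations, and synthetic noise-free observations. For this linear case the minimizer of the quadratic cost is available in closed form, so no minimization algorithm is needed; all names and numbers are hypothetical.

```python
def model_outputs(x0, a, n):
    """Model values at steps 1..n for the linear model x_i = a**i * x0."""
    return [a ** i * x0 for i in range(1, n + 1)]


def cost(x0, x_b, sigma_b, obs, sigma_o, a):
    """Background term plus observation misfit over the whole time window."""
    j_b = 0.5 * ((x0 - x_b) / sigma_b) ** 2
    model = model_outputs(x0, a, len(obs))
    j_o = 0.5 * sum(((m - y) / sigma_o) ** 2 for m, y in zip(model, obs))
    return j_b + j_o


def variational_analysis(x_b, sigma_b, obs, sigma_o, a):
    """Closed-form minimizer of the quadratic cost for this linear model."""
    num = x_b / sigma_b ** 2
    den = 1.0 / sigma_b ** 2
    for i, y in enumerate(obs, start=1):
        num += a ** i * y / sigma_o ** 2
        den += a ** (2 * i) / sigma_o ** 2
    return num / den


a = 0.9
obs = model_outputs(2.0, a, 5)      # synthetic observations, true x0 = 2
x_b, sigma_b, sigma_o = 1.0, 1.0, 0.1

x_a = variational_analysis(x_b, sigma_b, obs, sigma_o, a)
print(x_a, cost(x_a, x_b, sigma_b, obs, sigma_o, a))
```

Because the observations are trusted much more than the background in this setting, the analysis ends up close to the true initial state while still being pulled slightly toward the background value.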

The main drawback however of using variational assimilation is that it requires the development of an adjoint model which will be used to provide the gradient of the cost function with respect to all input variables. Both the values of the function itself and of its gradient are then combined in a minimization algorithm to produce the analysis. Simple recipes can be used to write the adjoint code of a numerical program by hand (e.g. Talagrand, 1991; Giering & Kaminski, 1998) but for very large codes, it can be tempting to resort to automatic differentiation (AD) algorithms. AD is becoming a very efficient and powerful tool used to produce the adjoint code of general circulation models in meteorology (among many others, we can quote the on-line tool *TAPENADE*, developed by the TROPICS team in INRIA Sophia Antipolis, France).
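What an adjoint model computes can be shown on a hand-written example. The sketch below assumes a scalar nonlinear toy model (a discretized logistic equation) and a cost function that is half the sum of squared misfits to synthetic observations. The adjoint sweep runs backward in time through the stored trajectory and returns the gradient of the cost with respect to the initial state, which is then compared with a finite-difference estimate. The model and all names are hypothetical and unrelated to any of the codes cited above.

```python
DT, R = 0.1, 1.0   # time step and growth rate of the toy model


def step(x):
    """One forward step of the discretized logistic model."""
    return x + DT * R * x * (1.0 - x)


def step_derivative(x):
    """Derivative of step(x) with respect to x (tangent linear model)."""
    return 1.0 + DT * R * (1.0 - 2.0 * x)


def forward(x0, n):
    """Trajectory [x_0, x_1, ..., x_n]."""
    traj = [x0]
    for _ in range(n):
        traj.append(step(traj[-1]))
    return traj


def cost(x0, obs):
    """J = 1/2 * sum_i (x_i - y_i)^2 for observations y_1..y_n."""
    traj = forward(x0, len(obs))
    return 0.5 * sum((traj[i + 1] - y) ** 2 for i, y in enumerate(obs))


def adjoint_gradient(x0, obs):
    """dJ/dx0 from one forward run and one backward (adjoint) sweep."""
    traj = forward(x0, len(obs))
    carried = 0.0                                  # sensitivity from later times
    for i in range(len(obs), 0, -1):
        dj_dxi = (traj[i] - obs[i - 1]) + carried  # total dJ/dx_i
        carried = step_derivative(traj[i - 1]) * dj_dxi
    return carried


obs = forward(0.3, 20)[1:]   # synthetic observations from true x0 = 0.3
x_guess = 0.2
grad = adjoint_gradient(x_guess, obs)
eps = 1e-6
grad_fd = (cost(x_guess + eps, obs) - cost(x_guess - eps, obs)) / (2 * eps)
print(grad, grad_fd)
```

The adjoint sweep costs about as much as one forward run regardless of the number of control variables, whereas finite differences need additional forward runs for every variable; this is why adjoint codes are worth the development effort for large models.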

Variational assimilation (or 3D/4D-VAR) was used recently in astrophysics for the problem of 2D stratified convection (Svedin et al., 2013) and, in the context of solar physics, for two main applications so far: models of solar flares and dynamo models. In both cases, only synthetic observations were used, i.e. observations produced by a model and not by nature. Nevertheless, those are first steps towards understanding how variational assimilation may help us understand the current state of our Sun and hopefully predict its magnetic activity. In 2007, Bélanger et al. used a phenomenological model, called the avalanche model, which, although a priori far removed from the physics of magnetic reconnection and the magnetohydrodynamical evolution of coronal structures, nonetheless reproduces quite well the observed statistical distribution of flare characteristics. This model is the continuous analogue of a sandpile on which grains are dropped one by one until the pile reaches an equilibrium conical shape. Adding more grains will then sometimes produce small to large avalanches or may have no consequences at all, leading to a strongly intermittent unloading while the loading remains slow and gradual. The cost function which is minimized here is the misfit between the outputs of the model and the synthetic observations (produced with a known set of parameters), in terms of the amount of energy released, averaged in time. They show that, despite the unpredictable (and unobservable) stochastic nature of the driving/triggering mechanism within the avalanche model, 4D-VAR succeeds in producing optimal initial conditions that adequately reproduce the time series of energy released by avalanches and flares. More recent work on the predictability of solar flares with the avalanche model has shown, however, that only a modified version of it, driven purely deterministically, could produce reliable predictions for large eruptive events (Strugarek & Charbonneau, 2014).

Variational data assimilation was also applied in the context of solar dynamo models. Since the idea is to produce predictions of future solar activity, trying to use a technique which is now used routinely in weather forecasting on Earth sounds like a reasonable way to go. In Jouve, Brun & Talagrand (2011), a variational data assimilation technique was developed, using a 2D mean-field dynamo model. As mentioned before, synthetic data were used, consisting of outputs from the model produced by a particular set of parameters and, more specifically, a particular choice of $\alpha$ as a function of latitude (designated by the coordinate here). After the observations were produced, a cost function was chosen to be the misfit between the model toroidal field and the observations of the same quantity. The result of the minimization was then supposed to reproduce the function used to calculate the observations. The authors analyzed how the quality of this recovery depends on the number and location of observations. In Fig. 16, an example of such a calculation is shown, where data were only present in one hemisphere (from to ). The recovery of the function was very good in the hemisphere where data were present, but the difference from the real solution (called the true state) was also found to be reduced in the other hemisphere when more observations were assimilated. This indicates that the cost function (and thus the values of the magnetic field) in one hemisphere was sensitive to the $\alpha$-effect in the other hemisphere, which is the kind of insight that can be gained from variational data assimilation techniques.

One of the challenges in solar physics that can be tackled under the angle of data assimilation is to get a better knowledge of the meridional flow and its role in the dynamo process. Hung, Jouve, Brun, Fournier & Talagrand, 2015 (in preparation) are now developing a variational data assimilation technique applied to a 2D spherical flux-transport dynamo model. This will allow consideration of various possible observations beyond the basic averaged sunspot number. For the magnetic field, one could use the amount of poloidal field at the poles, the timing of its reversal (that we can get from Hinode or SDO for example), the components of the multipolar expansion of the magnetic field (accessible of course on the Sun, DeRosa et al. 2012, but also now for other stars through spectropolarimetry, see Petit et al. 2008). For the velocity field, use could be made of the amplitude and profile of the meridional flow (both at the surface and deeper down thanks to new helioseismic techniques, as discussed in this paper) and possibly of torsional oscillations (Vorontsov et al., 2002b; Spruit, 2003). By applying data assimilation with those various sources of observations and an a priori knowledge of part of the meridional flow, the first results of this study indicate that it is possible to reconstruct not only its amplitude as in the work of Dikpati et al. (2014) but also its profile in the solar interior and its structure close to the base of the convection zone, which are of prime interest for dynamo modelers.

### 4.4 Perspectives for data assimilation in solar physics

The applications presented here are of course very preliminary in terms of models and observations. Firstly, they do not yet use real data but were only tested on synthetic observations or on very smoothed proxies. Secondly, the models are extremely simple compared to the most up-to-date 3D models evolving the full set of magneto-hydrodynamics equations in spherical geometry. However, and this may constitute a big difference from meteorology, we have limited access to observations within the solar interior (with the notable exception of what is learned from helioseismology and particularly from new global and local techniques, as shown in this paper), and moreover we do not yet have at our disposal a self-consistent 3D dynamo model reproducing the main characteristics of the large-scale solar magnetic field. We thus have to move step by step towards this goal by first considering surface observations assimilated in simplified solar models.

In this context, it could be useful to work on ensemble forecasting, similar to what is used today in weather predictions on Earth. In the case of weather forecasting, the idea is to make several predictions starting with slightly different initial conditions and to take an average of those predictions to get the actual forecast. This has mainly three goals: improving the forecast thanks to the averaging, providing an indication of the reliability of the prediction and giving a quantitative way to assess the quality of each individual forecast. In solar physics, we could think of forming an ensemble not by perturbing the initial conditions for the same model but rather by considering different models, where the key physical processes may have more or less impact on the evolution of the magnetic field. Each model would then provide its own forecast. Eventually, the averaging process would give us a way to distinguish between those various models and would indicate which characteristics of the cycle we are the most likely to correctly anticipate. That is probably a future development to be considered on the way towards better and more reliable long-term predictions of solar activity.
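The basic mechanics of ensemble forecasting can be sketched in a few lines. The example assumes the weather-forecasting variant described above, in which the ensemble is formed by perturbing the initial condition of a single toy model; the members could equally be forecasts from different models. The ensemble mean serves as the forecast and the spread as an indication of its reliability. The model and all numbers are hypothetical placeholders, not actual predictions.

```python
import statistics


def toy_model(x0, growth=1.05, n_steps=10):
    """Advance the initial state by repeated multiplication (toy dynamics)."""
    x = x0
    for _ in range(n_steps):
        x *= growth
    return x


# Ensemble of slightly different initial conditions.
initial_conditions = [0.9, 1.0, 1.1]
forecasts = [toy_model(x0) for x0 in initial_conditions]

ensemble_mean = statistics.mean(forecasts)     # the actual forecast
ensemble_spread = statistics.stdev(forecasts)  # indication of reliability

# Distance of each member from the mean, a simple per-member diagnostic.
deviations = [f - ensemble_mean for f in forecasts]
print(ensemble_mean, ensemble_spread, deviations)
```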

## 5 Summary

In the first part of this paper we reviewed some recent developments of local and global helioseismic data analysis methods. For the local helioseismic analysis we focused on the time-distance method as used for the estimation of velocity fields in the solar subsurface, e.g., supergranulation. Such analyses of rather rapidly evolving processes benefit from sophisticated averaging schemes and filtering methods that help to increase the signal-to-noise ratio and to retrieve signatures from travel–time shifts.

Regarding global helioseismology, we considered two advances: the metrology project and the mode eigenfunction perturbation analysis for inferences of the meridional flow. Global helioseismic inferences rely on the accuracy and reliability of estimated global seismic mode parameters, like the mode frequencies. The global helioseismic metrology project aims to improve the accuracy and reliability of the parameter estimates by better incorporating systematic influences in the parameter estimation scheme. These systematic influences cannot be overcome by longer observations and are of different origin, for example leakage of mode power in the $l$–$\nu$ diagram, which results essentially from the technical restrictions in observing the solar velocity field in the photosphere, and mode coupling dominated by rotation. The method used by the global helioseismic metrology project is able to reduce some of the systematic errors of the estimated parameters, which provides a better agreement with solar models constructed with recent versions of the equation of state.

One systematic influence, the mode coupling, actually results from a perturbation of the mode eigenfunctions by flows in the solar interior. These perturbations lead to cross-talk between the spherical harmonic coefficients of the Doppler velocity measurements, which manifests as leakage in the power spectra. This phenomenon was recently exploited to develop global helioseismic methods to infer the meridional flow in the deeper interior. Here, the characteristic cross-talk due to the meridional flow is investigated by a cross-spectral analysis of time series of these spherical harmonic velocity coefficients. The meridional flow measured by this approach shows a complex flow pattern over latitude and depth and extends from the surface down to the base of the convection zone. Since mode eigenfunctions are also perturbed by other kinds of disturbance, e.g., rotation, their analysis may also be of interest for studies of these perturbations.

One important objective of helioseismic investigations is to reveal the dynamic processes in the interior associated with the solar dynamo, e.g., the meridional flow, in order to better understand this mechanism. Another approach to exploring this mechanism, and moreover to predicting the related solar magnetic activity cycle, has recently been taken through the use of data assimilation methods. This subject was reviewed in detail in the second part of this paper. We discussed two approaches, sequential and variational data assimilation, and their application to solar data. So far, different kinds of observational quantities, for example the sunspot numbers, the speed of the meridional flow, and the polar magnetic field, as well as synthetic data, have been combined with rather simple models relating observations to the solar dynamo in order to produce forecasts of the solar magnetic cycle or of flares. The results of these investigations and the increasing amount of high-quality solar data make data assimilation a promising approach for forecasts of the solar activity cycle as well as of magnetic activity relevant for space weather.

###### Acknowledgements.

LJ would like to thank Sacha Brun for helpful comments on the data assimilation part in this paper. MR and AS have received funding from the European Research Council under the European Union's Seventh Framework Program (FP/2007-2013)/ERC Grant Agreement no. 307117. The authors thank M.F. Woodard for providing Fig. 11.

Conflict of Interest: The authors declare that they have no conflict of interest.
