MemGE: a new maximum entropy method for image reconstruction from solar X-ray visibilities
Maximum Entropy is an image reconstruction method conceived to image a sparsely occupied field of view and therefore particularly appropriate to achieve super-resolution effects. Although widely used in image deconvolution, this method has been formulated in radio astronomy for the analysis of observations in the spatial frequency domain, and an Interactive Data Language (IDL) code has been implemented for image reconstruction from solar X-ray Fourier data. However, this code relies on a non-convex formulation of the constrained optimization problem addressed by the Maximum Entropy approach and this sometimes results in unreliable reconstructions characterized by unphysical shrinking effects.
This paper introduces a new approach to Maximum Entropy based on the constrained minimization of a convex functional. In the case of observations recorded by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI), the resulting code provides the same super-resolution effects of the previous algorithm, while working properly also when that code produces unphysical reconstructions. Results are also provided of testing the algorithm with synthetic data simulating observations of the Spectrometer/Telescope for Imaging X-rays (STIX) in Solar Orbiter. The new code is available in the HESSI folder of the Solar SoftWare (SSW) tree.
Solar SoftWare (SSW) contains a large collection of computational procedures for the reconstruction of images from the X-ray data recorded by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) (2002SoPh..210....3L) in the time interval between February 2002 to August 2018 - see 2019ApJ...887..131D for a recent evaluation of the performance of the different available methods. Some of these methods apply directly to RHESSI counts while others have been conceived to process RHESSI visibilities, i.e. calibrated samples of the Fourier transform of the incoming photon flux, generated via a data stacking process. Among count-based methods, SSW includes Back Projection (2002SoPh..210...61H), Clean (1974A&AS...15..417H), Forward Fit (2002SoPh..210..193A), Pixon (1996ApJ...466..585M), and Expectation Maximization (2013A&A...555A..61B); among visibility-based methods, SSW includes MEMNJIT (2006ApJ...636.1159B; 2007SoPh..240..241S), a Maximum Entropy method; VISFWDFIT (2007SoPh..240..241S), which selects pre-defined source shapes based on their best fitting of visibilities; uvsmooth (2009ApJ...703.2004M), an interpolation/extrapolation method in the Fourier domain; VISCS (2017ApJ...849...10F), a catalogue-based compressed sensing algorithm; and VISWV (2018A&A...615A..59D), a wavelet-based compressed sensing algorithm. Although each one of these algorithms combines specific values with applicability limitations and specific flaws, a critical comparison of the maps of a given flaring event obtained by the application of all (or most) of these algorithms provides a good picture of what a reliable image of the event could be.
In particular, MEMNJIT provides reconstructions characterized by a notable degree of reliability. The capability of fitting the experimental observations is generally satisfactory both in terms of comparison between the predicted and experimental visibility profiles with respect to each RHESSI detector; and in terms of the reduced values computed considering either all detectors or just the detectors providing the observations. Further, the algorithm is robust with respect to the level of noise affecting the observations. Moreover, the computational time is among the smallest in the SSW scenario and allows reconstruction within a few seconds. Finally, MEMNJIT is characterized by super-resolution properties (1985A&A...143...77C).
However, MEMNJIT sometimes produces images with multiple unrealistically small sources. The origin of this problem is not totally clear but is related to the minimization technique and the setting of the convergence criteria. MEMNJIT addresses a constrained maximization of the entropy function, which turns into an optimization problem with two penalty terms, the chi-squared function relating the measured and predicted visibilities and a term that ensures the conservation of the overall flux. The optimization of these terms is computationally difficult especially since the optimization functional is not convex, which implies that the numerical schemes may suffer convergence issues. Therefore, the optimization procedure may continue on past the physical meaningful solution to find a solution involving multiple bright points (see Figure 1, where the reconstructed images have dimension , the pixel size is arcsec and a zoom is applied for sake of clarity). Of course, this outcome can generally be avoided by increasing the tolerance parameter from its default value (which is currently set at ) to data-dependent higher values, but this typically results in a worse fitting of the observations and a less accurate estimate of the emission flux and, in general, impedes the use of MEMNJIT in an automatic pipeline for data processing.
The present paper illustrates a new algorithm for the constrained maximization of the image entropy which, differently than MEMNJIT, relies on the optimization of a convex functional. Indeed, this approach aims at the minimization of under the constraint of maximum entropy, which leads to the formulation of a convex functional characterized by a single penalty term and, therefore a single regularization parameter that is fixed a priori. The minimization of this functional is performed iteratively and in an alternate fashion (combettes:hal-00643807): for each iteration, a gradient step minimizes the functional and then a proximal step minimizes the negative entropy and projects the result of the first step onto the hyper-surface of vectors with positive components and constant flux.
The method is implemented in Solar SoftWare (SSW) and can be reached via the HESSI GUI with the name MEMGE. We point out that the IDL code is implemented in a way that is relatively independent of the instrument providing the experimental visibilities and the related uncertainties. In particular, when using RHESSI visibilities, MEMGE images are very similar to MEMNJIT images when this latter approach works but MEMGE also provides meaningful images for those cases where the MEMNJIT images, made with the standard value of the tolerance parameter, are unphysical.
The plan of the paper is as follows. Section 2 provides some details about the formulation of MEMGE and the optimization algorithm. Section 3 illustrates the results of its application against both experimental visibilities recorded by RHESSI and synthetic visibilities generated within the simulation software of the Spectrometer Telescope for Imaging X-rays (STIX) on-board Solar Orbiter. Comments on these results are contained in Section 4. Our conclusions will be offered in Section 5.
2 The optimization problem
Both RHESSI and STIX provide observations, named visibilities, which are calibrated Fourier transforms of the incoming photon flux picked up at specific sampling points in the spatial frequency plane (for RHESSI, the number of recorded visibilities is variable and depends on the observation; for STIX, it is fixed at ). Therefore image reconstruction for RHESSI and STIX involves the solution of the inverse Fourier transform problem with limited data
where is the vector containing the observed visibilities; the photon flux image to reconstruct is transformed into the -dimension vector with , by means of the standard column-wise concatenation procedure; is the matrix computing the Discrete Fourier Transform (DFT) of at the spatial frequencies sampled by either RHESSI or STIX.
The mathematical basis of MEMNJIT is the constrained maximization of the entropy
where is the signal content of pixel , , is the total flux in the image and is the Euler’s number (). Three constraints are considered in the algorithm; one is
where is the vector of the experimental uncertainties. A second constraint involves the flux and is given by
Finally, a third constraint requires that all components of must be non-negative. The algorithm implemented in the MEMNJIT IDL code addresses the constrained maximum problem
where and are the Lagrange multipliers associated to constraints (3) and (5); these two parameters are not estimated a priori but are updated during the maximization process. The first main drawback of this approach is that the maximization problem (7) involves a functional which is not convex and therefore numerical schemes may lead to unstable solutions. Further, the functional is characterized by two Lagrange multipliers, whose updating process is sometimes non optimal. When one of these conditions occurs, MEMNJIT produces unphysical reconstructions like the ones shown in Figure 1.
MEMGE addresses these two issues by providing a different formulation of the maximum entropy optimization problem. The first idea is to replace the maximization problem (7) with the minimization problem
under the flux constraint (5) and where is the regularization parameter. The main advantage of this approach is that now the optimization problem is convex and therefore it can be addressed by relying on several numerical methods whose convergence properties are well-established. In particular, in MEMGE we adopted the following iterative scheme whereby, at each iteration we compute (combettes:hal-00643807)
A gradient step to minimize ;
A proximal step to maximize the entropy subjected to the positivity and flux constraints.
After this second step, the algorithm is accelerated by computing a linear combination with the approximation corresponding to the previous iteration and a monotonicity check is also performed (beck2009fast).
Two main technical aspects are involved by the implementation of the algorithm. First, the regularization parameter is a priori determined relying on the observation that an over-regularizing value of this parameter implies that the corresponding regularized solution must be related to the average flux by
for all . Simple numerical approximations show that (9) implies
where indicates the partial derivative along the -th direction. In order to determine we approximate each component of with . This results in a over-estimate of the regularization parameter; therefore the optimal value for the regularization parameter is chosen as a rate of the estimated , where this rate is determined as a function of the visibility signal-to-noise ratio using a heuristic look-up table.
Second, the realization of the flux constraint in the second step relies on the solution of the not-regularized constrained minimum problem
which is performed by applying the projected Landweber method (piana1997projected): the estimate of the flux is computed by summing up the pixel content of the solution of problem (11).
3 Application to X-ray visibilities
This section validates the reliability of MEMGE in the case of both observations provided by RHESSI and synthetic visibilies simulated according to the STIX imaging concept setup.
In order to illustrate the behavior of the new algorithm when applied to RHESSI observations, we consider two sets of events. The first set is made of the same cases considered in Figure 1 and we verified whether MEMGE is able to provide reconstructions that do not suffer the same pathological behavior characterizing MEMNJIT, while guaranteeing the same data fidelity. Specifically, Figure 2 refers to the same events considered in Figure 1, but this time the reconstruction method employed is MEMGE. Then, Figures 3 through 5 compare the reconstructions provided by MEMGE and MEMNJIT with the ones of VISCS, EM, Clean and uvsmooth for these same three datasets. Finally, Figures 6 through 8 show the reconstructions provided by all six algorithms for three events whereby MEMNJIT works properly (for the reconstructions in Figures 3 through 8 we do not report the comparisons between the measured and predicted visibilities since they show very similar behaviors among the reconstruction methods, with the only exception of uvsmooth, which is characterized by fitting performances systematically slightly worse). Tables 1 and 2 illustrate a quantitative comparison of performances from all codes and for all events considered in this sub-section.
|February 20 2002 ( keV; UT)|
|May 1 2002 ( keV; UT)|
|December 13 2007 ( keV; UT)|
|February 13 2002 ( keV; UT)|
|July 23 2002 ( keV; UT)|
|August 31 2004 ( keV; UT)|
a double footpoint flare in which one of the sources is more extended and has double the flux of the other source (Configuration 1);
a double footpoint flare in which the sources have the same size and the same flux (Configuration 2);
a loop flare with small curvature (Configuration 3);
a loop flare with large curvature (Configuration 4).
For each one of these configurations the STIX software utilized a Monte Carlo approach to produce a set of synthetic visibilities with associated the corresponding standard deviations. These visibility sets have been processed by MEMGE and the results are compared in Table 2 with the ones provided by the two other methods currently implemented in the STIX software tree, i.e. visibility-based Clean and count-based Expectation Maximization (EM) (2019A&A...624A.130M).
4 Discussion of results
One of the nice aspects of MEMGE (see Figures 1 and 2) is that it provides reliable reconstructions for those events where MEMNJIT, with its default tolerance parameter set to , produces multiple unrealistically small sources, while predicting the experimental visibilities with a statistical fidelity close to the MEMNJIT one. On the other hand, it behaves similarly to MEMNJIT for those flaring events where MEMNJIT reconstructions are reliable (see Figures 6, 7, and 8). As far as the morphological properties are concerned, MEMGE systematically introduces high-resolution effects. This is particularly visibile in the case of the reconstruction of the February 13 2002 event (Figure 6), where a convex loop-shape is reproduced by MEMGE and MEMNJIT (and also EM), while this convexity is much smeared out in the reconstructions provided by Clean and uvsmooth (however, we point out that the performances of Clean depend on what Regression Combined Method is used for the final Clean beam image and what Beam Width Factor is used). In this case, VISCS fails to produce the correct orientation of the loop and reproduces a concave shape. Analogously, the spatial resolution achieved by the two MEM codes (and by EM and, partly, by uvsmooth) is as fine as can be expected in the reconstruction of the July 23 2002 X Class flare, given the angular resolution of the modulation collimators used in the analysis: all four sources characterizing this emission topography are clearly visibile in the reconstructions. On the contrary, neither VISCS nor Clean are able to distinguish all four emitting regions. Tables 1 and 2 show that, when MEMNJIT works properly, the two maximum entropy methods provide similar estimates of the quantitative parameters, although MEMGE may reproduce the data with significantly smaller values (as in the case of the December 13 2007 and July 23 2002 events) but may require a higher computational time (as in the case of the February 13 2002 and July 23 2002 events). We finally point out that the tolerance parameter in MEMNJIT can be manually tuned to higher values in order to obtain more regularized reconstructions. However, Figure 10 shows that increasing the tolerance value results into morphologies closer to the ones corresponding to MEMGE reconstructions but with a less accurate ability of the method to both fit the observations and conserve the flux value given as input to the algorithm. Even more importantly, the figure shows that this tuning procedure is dependent on the experimental dataset considered.
Interestingly, the super-resolution properties of Maximum Entropy are confirmed by the analysis of synthetic STIX data illustrated in Table 3, where the performances of MEMGE are compared to the ones of visibility-based Clean and count-based EM (there is no MEMNJIT code adapted to the STIX framework). In fact, in the reconstructions of the foot-points, the ability of MEMGE to recover the ground-truth full width at half maximum (FWHM) outperforms the effectiveness of the other visibility-based algorithm (EM works systematically better, probably because the signal-to-noise ratio associated to counts is higher than the ones associated to the real and imaginary parts of the visibilities, as shown by 2019A&A...624A.130M).
Also, similarly to EM, MEMGE behaves better than Clean in both reproducing the exact total flux and best-fitting the synthetic measurements. EM and Clean seem more accurate in separately reproducing the flux of each one of the two circular sources in Configuration 1 and Configuration 2.
This paper introduces a novel algorithm, named MEMGE, implementing a Maximum Entropy approach to image reconstruction from X-ray visibilities in solar astronomy. The motivation of this effort relies on the fact that the only existing algorithm (MEMNJIT) realizing this approach for this kind of data provides reliable and super-resolved reconstructions for datasets associated to most events, but sometimes suffers numerical instabilities and lack of convergence to meaningful solutions. Differently than the old algorithm, the new implementation relies on the constrained minimization of a convex functional, which is realized by alternating gradient descent and proximal steps. On the one hand, the resulting code maintains the good imaging properties of the previous one, while guaranteeing, on the other hand, convergence to reliable reconstructions when MEMNJIT gives unphysical sources (unless the tolerance parameter is tuned in a data-dependent fashion).
MEMGE is available in the Solar SoftWare tree and can be most easily accessed through the HESSI GUI. It is one of the algorithms that is currently used for the population of the RHESSI image archive, and can be used also for the processing of STIX synthetic visibilities.
The nice imaging properties of this algorithm, together with its systematic reliability, make it particularly appropriate for the accurate estimation of morphological properties like the ones considered in the analysis of coronal hard X-ray sources (2008ApJ...673..576X; 2012ApJ...755...32G; 2012A&A...543A..53G; 2018ApJ...867...82D). MEMGE is an appropriate algorithm to realize a statistical study of these kinds of events.
|First Peak||Second Peak||Total flux ()||C-statistic|
|X||Y||FWHM||Flux ()||X||Y||FWHM||Flux ()|
|First Peak||Second Peak||Total flux ()||C-statistic|
|X||Y||FWHM||Flux ()||X||Y||FWHM||Flux ()|
|Peak||Total flux ()||C-statistic|
|Peak||Total flux ()||C-statistic|