MemGE: a new maximum entropy method for image reconstruction from solar Xray visibilities
Abstract
Maximum Entropy is an image reconstruction method conceived to image a sparsely occupied field of view and therefore particularly appropriate to achieve superresolution effects. Although widely used in image deconvolution, this method has been formulated in radio astronomy for the analysis of observations in the spatial frequency domain, and an Interactive Data Language (IDL) code has been implemented for image reconstruction from solar Xray Fourier data. However, this code relies on a nonconvex formulation of the constrained optimization problem addressed by the Maximum Entropy approach and this sometimes results in unreliable reconstructions characterized by unphysical shrinking effects.
This paper introduces a new approach to Maximum Entropy based on the constrained minimization of a convex functional. In the case of observations recorded by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI), the resulting code provides the same superresolution effects of the previous algorithm, while working properly also when that code produces unphysical reconstructions. Results are also provided of testing the algorithm with synthetic data simulating observations of the Spectrometer/Telescope for Imaging Xrays (STIX) in Solar Orbiter. The new code is available in the HESSI folder of the Solar SoftWare (SSW) tree.
1 Introduction
Solar SoftWare (SSW) contains a large collection of computational procedures for the reconstruction of images from the Xray data recorded by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) (2002SoPh..210....3L) in the time interval between February 2002 to August 2018  see 2019ApJ...887..131D for a recent evaluation of the performance of the different available methods. Some of these methods apply directly to RHESSI counts while others have been conceived to process RHESSI visibilities, i.e. calibrated samples of the Fourier transform of the incoming photon flux, generated via a data stacking process. Among countbased methods, SSW includes Back Projection (2002SoPh..210...61H), Clean (1974A&AS...15..417H), Forward Fit (2002SoPh..210..193A), Pixon (1996ApJ...466..585M), and Expectation Maximization (2013A&A...555A..61B); among visibilitybased methods, SSW includes MEMNJIT (2006ApJ...636.1159B; 2007SoPh..240..241S), a Maximum Entropy method; VISFWDFIT (2007SoPh..240..241S), which selects predefined source shapes based on their best fitting of visibilities; uvsmooth (2009ApJ...703.2004M), an interpolation/extrapolation method in the Fourier domain; VISCS (2017ApJ...849...10F), a cataloguebased compressed sensing algorithm; and VISWV (2018A&A...615A..59D), a waveletbased compressed sensing algorithm. Although each one of these algorithms combines specific values with applicability limitations and specific flaws, a critical comparison of the maps of a given flaring event obtained by the application of all (or most) of these algorithms provides a good picture of what a reliable image of the event could be.
In particular, MEMNJIT provides reconstructions characterized by a notable degree of reliability. The capability of fitting the experimental observations is generally satisfactory both in terms of comparison between the predicted and experimental visibility profiles with respect to each RHESSI detector; and in terms of the reduced values computed considering either all detectors or just the detectors providing the observations. Further, the algorithm is robust with respect to the level of noise affecting the observations. Moreover, the computational time is among the smallest in the SSW scenario and allows reconstruction within a few seconds. Finally, MEMNJIT is characterized by superresolution properties (1985A&A...143...77C).
However, MEMNJIT sometimes produces images with multiple unrealistically small sources. The origin of this problem is not totally clear but is related to the minimization technique and the setting of the convergence criteria. MEMNJIT addresses a constrained maximization of the entropy function, which turns into an optimization problem with two penalty terms, the chisquared function relating the measured and predicted visibilities and a term that ensures the conservation of the overall flux. The optimization of these terms is computationally difficult especially since the optimization functional is not convex, which implies that the numerical schemes may suffer convergence issues. Therefore, the optimization procedure may continue on past the physical meaningful solution to find a solution involving multiple bright points (see Figure 1, where the reconstructed images have dimension , the pixel size is arcsec and a zoom is applied for sake of clarity). Of course, this outcome can generally be avoided by increasing the tolerance parameter from its default value (which is currently set at ) to datadependent higher values, but this typically results in a worse fitting of the observations and a less accurate estimate of the emission flux and, in general, impedes the use of MEMNJIT in an automatic pipeline for data processing.
The present paper illustrates a new algorithm for the constrained maximization of the image entropy which, differently than MEMNJIT, relies on the optimization of a convex functional. Indeed, this approach aims at the minimization of under the constraint of maximum entropy, which leads to the formulation of a convex functional characterized by a single penalty term and, therefore a single regularization parameter that is fixed a priori. The minimization of this functional is performed iteratively and in an alternate fashion (combettes:hal00643807): for each iteration, a gradient step minimizes the functional and then a proximal step minimizes the negative entropy and projects the result of the first step onto the hypersurface of vectors with positive components and constant flux.
The method is implemented in Solar SoftWare (SSW) and can be reached via the HESSI GUI with the name MEMGE. We point out that the IDL code is implemented in a way that is relatively independent of the instrument providing the experimental visibilities and the related uncertainties. In particular, when using RHESSI visibilities, MEMGE images are very similar to MEMNJIT images when this latter approach works but MEMGE also provides meaningful images for those cases where the MEMNJIT images, made with the standard value of the tolerance parameter, are unphysical.
The plan of the paper is as follows. Section 2 provides some details about the formulation of MEMGE and the optimization algorithm. Section 3 illustrates the results of its application against both experimental visibilities recorded by RHESSI and synthetic visibilities generated within the simulation software of the Spectrometer Telescope for Imaging Xrays (STIX) onboard Solar Orbiter. Comments on these results are contained in Section 4. Our conclusions will be offered in Section 5.
2 The optimization problem
Both RHESSI and STIX provide observations, named visibilities, which are calibrated Fourier transforms of the incoming photon flux picked up at specific sampling points in the spatial frequency plane (for RHESSI, the number of recorded visibilities is variable and depends on the observation; for STIX, it is fixed at ). Therefore image reconstruction for RHESSI and STIX involves the solution of the inverse Fourier transform problem with limited data
(1) 
where is the vector containing the observed visibilities; the photon flux image to reconstruct is transformed into the dimension vector with , by means of the standard columnwise concatenation procedure; is the matrix computing the Discrete Fourier Transform (DFT) of at the spatial frequencies sampled by either RHESSI or STIX.
The mathematical basis of MEMNJIT is the constrained maximization of the entropy
(2) 
where is the signal content of pixel , , is the total flux in the image and is the Euler’s number (). Three constraints are considered in the algorithm; one is
(3) 
with
(4) 
where is the vector of the experimental uncertainties. A second constraint involves the flux and is given by
(5) 
with
(6) 
Finally, a third constraint requires that all components of must be nonnegative. The algorithm implemented in the MEMNJIT IDL code addresses the constrained maximum problem
(7) 
where and are the Lagrange multipliers associated to constraints (3) and (5); these two parameters are not estimated a priori but are updated during the maximization process. The first main drawback of this approach is that the maximization problem (7) involves a functional which is not convex and therefore numerical schemes may lead to unstable solutions. Further, the functional is characterized by two Lagrange multipliers, whose updating process is sometimes non optimal. When one of these conditions occurs, MEMNJIT produces unphysical reconstructions like the ones shown in Figure 1.
MEMGE addresses these two issues by providing a different formulation of the maximum entropy optimization problem. The first idea is to replace the maximization problem (7) with the minimization problem
(8) 
under the flux constraint (5) and where is the regularization parameter. The main advantage of this approach is that now the optimization problem is convex and therefore it can be addressed by relying on several numerical methods whose convergence properties are wellestablished. In particular, in MEMGE we adopted the following iterative scheme whereby, at each iteration we compute (combettes:hal00643807)

A gradient step to minimize ;

A proximal step to maximize the entropy subjected to the positivity and flux constraints.
After this second step, the algorithm is accelerated by computing a linear combination with the approximation corresponding to the previous iteration and a monotonicity check is also performed (beck2009fast).
Two main technical aspects are involved by the implementation of the algorithm. First, the regularization parameter is a priori determined relying on the observation that an overregularizing value of this parameter implies that the corresponding regularized solution must be related to the average flux by
(9) 
for all . Simple numerical approximations show that (9) implies
(10) 
where indicates the partial derivative along the th direction. In order to determine we approximate each component of with . This results in a overestimate of the regularization parameter; therefore the optimal value for the regularization parameter is chosen as a rate of the estimated , where this rate is determined as a function of the visibility signaltonoise ratio using a heuristic lookup table.
Second, the realization of the flux constraint in the second step relies on the solution of the notregularized constrained minimum problem
(11) 
which is performed by applying the projected Landweber method (piana1997projected): the estimate of the flux is computed by summing up the pixel content of the solution of problem (11).
3 Application to Xray visibilities
This section validates the reliability of MEMGE in the case of both observations provided by RHESSI and synthetic visibilies simulated according to the STIX imaging concept setup.
3.1 Rhessi
In order to illustrate the behavior of the new algorithm when applied to RHESSI observations, we consider two sets of events. The first set is made of the same cases considered in Figure 1 and we verified whether MEMGE is able to provide reconstructions that do not suffer the same pathological behavior characterizing MEMNJIT, while guaranteeing the same data fidelity. Specifically, Figure 2 refers to the same events considered in Figure 1, but this time the reconstruction method employed is MEMGE. Then, Figures 3 through 5 compare the reconstructions provided by MEMGE and MEMNJIT with the ones of VISCS, EM, Clean and uvsmooth for these same three datasets. Finally, Figures 6 through 8 show the reconstructions provided by all six algorithms for three events whereby MEMNJIT works properly (for the reconstructions in Figures 3 through 8 we do not report the comparisons between the measured and predicted visibilities since they show very similar behaviors among the reconstruction methods, with the only exception of uvsmooth, which is characterized by fitting performances systematically slightly worse). Tables 1 and 2 illustrate a quantitative comparison of performances from all codes and for all events considered in this subsection.
MEMGE  MEMNJIT  VISCS 
EM  Clean  uvsmooth 
MEMGE  MEMNJIT  VISCS 
EM  Clean  uvsmooth 
MEMGE  MEMNJIT  VISCS 
EM  Clean  uvsmooth 
MEMGE  MEMNJIT  VISCS 
EM  Clean  uvsmooth 
MEMGE  MEMNJIT  VISCS 
EM  Clean  uvsmooth 
MEMGE  MEMNJIT  VISCS 
EM  Clean  uvsmooth 
reduced  reduced (used)  flux  time  
February 20 2002 ( keV; UT)  
MEMGE  2.45  2.99  26.2  22 
MEMNJIT  2.43  2.54  21.0  11 
VISCS  2.88  3.52  18.1  3 
CLEAN  2.74  3.30  39.6  7 
EM  2.39  2.91  18.4  83 
UVSMOOTH  5.39  7.21  40.8  1 
May 1 2002 ( keV; UT)  
MEMGE  3.02  3.93  1.81  12 
MEMNJIT  2.90  1.68  1.49  12 
VISCS  2.18  2.64  1.56  3 
CLEAN  2.30  2.82  2.95  11 
EM  1.88  2.18  1.56  100 
UVSMOOTH  3.19  4.18  1.94  1 
December 13 2007 ( keV; UT)  
MEMGE  1.76  1.61  12.6  7 
MEMNJIT  7.45  2.29  5.63  59 
VISCS  2.64  3.01  6.47  3 
CLEAN  2.68  3.10  9.90  8 
EM  2.56  2.91  6.57  108 
UVSMOOTH  2.75  3.20  8.01  1 
reduced  reduced (used)  flux  time  
February 13 2002 ( keV; UT)  
MEMGE  0.96  0.99  4.15  20 
MEMNJIT  0.98  1.03  4.22  3 
VISCS  0.99  1.04  3.70  3 
CLEAN  1.07  1.15  6.67  8 
EM  0.97  1.01  3.67  45 
UVSMOOTH  2.13  2.67  7.38  1 
July 23 2002 ( keV; UT)  
MEMGE  3.42  3.53  2.90  20 
MEMNJIT  5.17  6.18  2.71  4 
VISCS  5.45  6.07  2.41  3 
CLEAN  4.78  5.04  2.85  10 
EM  4.21  4.67  2.41  158 
UVSMOOTH  6.17  6.86  3.19  1 
August 31 2004 ( keV; UT)  
MEMGE  1.24  1.32  1.44  14 
MEMNJIT  1.31  1.42  1.24  5 
VISCS  1.20  1.26  1.29  3 
CLEAN  1.22  1.29  1.70  10 
EM  1.25  1.33  1.29  82 
UVSMOOTH  1.76  2.08  1.54  1 
3.2 Stix
We simulated the following four configurations with an overall incident flux of photons (see Figure 9 and Table 3 for all parameters):

a double footpoint flare in which one of the sources is more extended and has double the flux of the other source (Configuration 1);

a double footpoint flare in which the sources have the same size and the same flux (Configuration 2);

a loop flare with small curvature (Configuration 3);

a loop flare with large curvature (Configuration 4).
For each one of these configurations the STIX software utilized a Monte Carlo approach to produce a set of synthetic visibilities with associated the corresponding standard deviations. These visibility sets have been processed by MEMGE and the results are compared in Table 2 with the ones provided by the two other methods currently implemented in the STIX software tree, i.e. visibilitybased Clean and countbased Expectation Maximization (EM) (2019A&A...624A.130M).
4 Discussion of results
One of the nice aspects of MEMGE (see Figures 1 and 2) is that it provides reliable reconstructions for those events where MEMNJIT, with its default tolerance parameter set to , produces multiple unrealistically small sources, while predicting the experimental visibilities with a statistical fidelity close to the MEMNJIT one. On the other hand, it behaves similarly to MEMNJIT for those flaring events where MEMNJIT reconstructions are reliable (see Figures 6, 7, and 8). As far as the morphological properties are concerned, MEMGE systematically introduces highresolution effects. This is particularly visibile in the case of the reconstruction of the February 13 2002 event (Figure 6), where a convex loopshape is reproduced by MEMGE and MEMNJIT (and also EM), while this convexity is much smeared out in the reconstructions provided by Clean and uvsmooth (however, we point out that the performances of Clean depend on what Regression Combined Method is used for the final Clean beam image and what Beam Width Factor is used). In this case, VISCS fails to produce the correct orientation of the loop and reproduces a concave shape. Analogously, the spatial resolution achieved by the two MEM codes (and by EM and, partly, by uvsmooth) is as fine as can be expected in the reconstruction of the July 23 2002 X Class flare, given the angular resolution of the modulation collimators used in the analysis: all four sources characterizing this emission topography are clearly visibile in the reconstructions. On the contrary, neither VISCS nor Clean are able to distinguish all four emitting regions. Tables 1 and 2 show that, when MEMNJIT works properly, the two maximum entropy methods provide similar estimates of the quantitative parameters, although MEMGE may reproduce the data with significantly smaller values (as in the case of the December 13 2007 and July 23 2002 events) but may require a higher computational time (as in the case of the February 13 2002 and July 23 2002 events). We finally point out that the tolerance parameter in MEMNJIT can be manually tuned to higher values in order to obtain more regularized reconstructions. However, Figure 10 shows that increasing the tolerance value results into morphologies closer to the ones corresponding to MEMGE reconstructions but with a less accurate ability of the method to both fit the observations and conserve the flux value given as input to the algorithm. Even more importantly, the figure shows that this tuning procedure is dependent on the experimental dataset considered.
Interestingly, the superresolution properties of Maximum Entropy are confirmed by the analysis of synthetic STIX data illustrated in Table 3, where the performances of MEMGE are compared to the ones of visibilitybased Clean and countbased EM (there is no MEMNJIT code adapted to the STIX framework). In fact, in the reconstructions of the footpoints, the ability of MEMGE to recover the groundtruth full width at half maximum (FWHM) outperforms the effectiveness of the other visibilitybased algorithm (EM works systematically better, probably because the signaltonoise ratio associated to counts is higher than the ones associated to the real and imaginary parts of the visibilities, as shown by 2019A&A...624A.130M).
Also, similarly to EM, MEMGE behaves better than Clean in both reproducing the exact total flux and bestfitting the synthetic measurements. EM and Clean seem more accurate in separately reproducing the flux of each one of the two circular sources in Configuration 1 and Configuration 2.
5 Conclusions
This paper introduces a novel algorithm, named MEMGE, implementing a Maximum Entropy approach to image reconstruction from Xray visibilities in solar astronomy. The motivation of this effort relies on the fact that the only existing algorithm (MEMNJIT) realizing this approach for this kind of data provides reliable and superresolved reconstructions for datasets associated to most events, but sometimes suffers numerical instabilities and lack of convergence to meaningful solutions. Differently than the old algorithm, the new implementation relies on the constrained minimization of a convex functional, which is realized by alternating gradient descent and proximal steps. On the one hand, the resulting code maintains the good imaging properties of the previous one, while guaranteeing, on the other hand, convergence to reliable reconstructions when MEMNJIT gives unphysical sources (unless the tolerance parameter is tuned in a datadependent fashion).
MEMGE is available in the Solar SoftWare tree and can be most easily accessed through the HESSI GUI. It is one of the algorithms that is currently used for the population of the RHESSI image archive, and can be used also for the processing of STIX synthetic visibilities.
The nice imaging properties of this algorithm, together with its systematic reliability, make it particularly appropriate for the accurate estimation of morphological properties like the ones considered in the analysis of coronal hard Xray sources (2008ApJ...673..576X; 2012ApJ...755...32G; 2012A&A...543A..53G; 2018ApJ...867...82D). MEMGE is an appropriate algorithm to realize a statistical study of these kinds of events.
Configuration 1  

First Peak  Second Peak  Total flux ()  Cstatistic  
X  Y  FWHM  Flux ()  X  Y  FWHM  Flux ()  
Simulated  
MEMGE  
EM  
CLEAN  
Configuration 2  
First Peak  Second Peak  Total flux ()  Cstatistic  
X  Y  FWHM  Flux ()  X  Y  FWHM  Flux ()  
Simulated  
MEMGE  
EM  
CLEAN  
Configuration 3  
Peak  Total flux ()  Cstatistic  
X  Y  
Simulated  10.00  
MEMGE  
EM  
CLEAN  
Configuration 4  
Peak  Total flux ()  Cstatistic  
X  Y  
Simulated  10.00  
MEMGE  
EM  
CLEAN 