Identifying typical Mg ii flare spectra using machine learning

Brandon Panos (1, 2), Lucia Kleint (1, 3), Cedric Huwyler (1), Säm Krucker (1, 4), Martin Melchior (1), Denis Ullmann (2), Sviatoslav Voloshynovskiy (2)

(1) University of Applied Sciences and Arts Northwestern Switzerland, Bahnhofstrasse 6, 5210 Windisch, Switzerland
(2) University of Geneva, CUI-SIP, 1205 Geneva, Switzerland
(3) Kiepenheuer Institut für Sonnenphysik (KIS), Schöneckstrasse 6, D-79104 Freiburg, Germany
(4) Space Sciences Laboratory, University of California, 7 Gauss Way, Berkeley, CA 94720, USA

IRIS performs solar observations over a large range of atmospheric heights, including the chromosphere, where the majority of flare energy is dissipated. The strong Mg II h&k spectral lines are capable of providing excellent atmospheric diagnostics, but have not been fully utilized for flaring atmospheres. We aim to investigate whether the physics of the chromosphere is identical for all flare observations by analyzing whether certain spectra occur in all flares. To achieve this, we automatically analyze hundreds of thousands of Mg II h&k line profiles from a set of 33 flares, and use a machine learning technique, which we call supervised hierarchical k-means, to cluster all profile shapes. We identify a single-peaked Mg II profile, in contrast to the double-peaked quiet Sun profiles, appearing in every flare. Additionally, we find extremely broad profiles with characteristic blue-shifted central reversals appearing at the front of fast-moving flare ribbons. These profiles occur during the impulsive phase of the flare, and we present results of their temporal and spatial correlation with non-thermal hard X-ray signatures, suggesting that flare-accelerated electrons play an important role in the formation of these profiles. The ratio of the integrated Mg II h&k lines can also serve as an opacity diagnostic, and we find higher opacities during each flare maximum. Our study shows that machine learning is a powerful tool for large scale statistical solar analyses.

Subject headings:
Sun: flares; chromosphere

1. Introduction

The Interface Region Imaging Spectrograph (IRIS, De Pontieu et al., 2014) routinely observes flares, yet statistical analyses of an ensemble of flares are rare. IRIS observes two of the brightest chromospheric lines, the optically thick Mg ii h&k resonance lines with core vacuum wavelengths at 2803.53 and 2796.35 Å, respectively. The h&k lines sample the entire chromosphere and provide excellent quiet Sun diagnostics (Leenaarts et al., 2013); however, few diagnostics based on these lines exist for the flaring Sun. Our goal is to use machine learning to statistically analyze several dozen flares and to answer the question of whether there are typical spectra that would indicate similar chromospheric physics in all flares.

In the standard model, a flare is caused by the impulsive reconfiguration of the coronal magnetic field, known as reconnection. This process releases on average erg/s of magnetically stored energy within a few minutes (e.g. Emslie et al., 2005). The energy is used to accelerate electrons and protons into space as well as into the thick target of the chromosphere, resulting in heating and a subsequent emission over a large band of frequencies. During this process the Mg ii h&k lines differ from the usual quiet Sun profiles in a few significant ways. The central reversals and subordinate lines often go into emission, the line wings undergo substantial broadening, and highly asymmetric profiles can be observed. Many attempts have been made to understand the behavior of the h&k lines during a flare (Machado et al., 1980; Lemaire & Gouttebroze, 1983; Kerr et al., 2016; Rubio da Costa et al., 2016; Liu et al., 2015; de la Cruz Rodriguez et al., 2016; Kowalski et al., 2017; Reid et al., 2017). Recent parameter studies by Rubio da Costa & Kleint (2017) using an artificial flaring atmosphere simulated with RADYN and the partial redistribution and non-LTE radiative transfer code RH, along with single flare observations by Kerr et al. (2015), have begun to extract Mg ii diagnostics for the flaring Sun. An important initial step is to identify all possible types of profiles that can be generated during a solar flare.

The arrival of large ground-based telescopes such as the Daniel K. Inouye Solar Telescope (DKIST) with an estimated data output of 5 PB per year (Reardon & Berukoff, 2014) as well as the already sizeable cumulative solar databases of IRIS and SDO places the discipline of heliophysics firmly within the territory of big data. The important aspects of this data can no longer be extracted and analyzed without the use of smart algorithms. Despite this wealth of solar data, large scale statistical studies with an ensemble of flares are a rarity in the heliophysics community, while publications based on single flare observations are the norm.

In this paper, we analyze hundreds of thousands of Mg ii spectral line profiles taken from 33 flares, and use a clustering algorithm to identify structures within the data set. The k-means clustering algorithm of MacQueen (1967) has been used in the past by Pietarila et al. (2007) and Viticchie & Sanchez Almeida (2011) to cluster the different shapes of Stokes profiles in order to investigate the magnetism of the quiet Sun. In the same spirit, we use the k-means algorithm in combination with a manual merging and splitting of clusters based on physical relevance and in-group variance to identify the observed shapes of Mg ii k-line profiles produced during flares.

Figure 1.— Example of a k-means clustering for the M6.5 flare on June 6, 2015. The Mg ii k-line profiles for a single raster are assigned to the groups in the bottom panel that they are most similar to. The quiet Sun was not colored because its profiles belong to different groups. For the full temporal evolution of this flare, see the online movie.

2. Data and Machine learning

2.1. IRIS

The data for this study were taken by NASA’s small explorer satellite IRIS, which was launched in 2013 and has since observed more than 400 flares according to a manually-kept list on its mission website. IRIS can take high quality slit-jaw images (SJI) of the solar atmosphere with a maximum field of view of in four different passbands, covering a range of heights from the photosphere to the transition region. It is also equipped with a spectrograph that can run simultaneously at high spatial (0.33-0.4 arcsec), spectral (0.056 Å in the NUV) and temporal (2s) resolutions. Various observing modes can be selected, from sit-and-stare, where the spectrograph slit remains stationary with respect to the Sun, to rasters with varying numbers of steps and slit orientations.

2.2. K-Means

We use an unsupervised clustering algorithm known as k-means to identify the different Mg ii k profiles that occur during a flare. Clustering algorithms partition similar observations into groups. If each observation consists of two features, they can be mapped as points on the Cartesian plane. k-means uses additional points called centroids to group the observations. The centroids are placed on the plane and each observation is assigned to one of the centroids with a straight line. The k-means objective is to find the assignments and positions of the centroids that minimize the sum of all the squared Euclidean distances. This quantity is known as the "within cluster distance" $J$, and is given by

$$J = \sum_{i=1}^{N} \sum_{j=1}^{k} \delta_{l_i j} \, \lVert x_i - \mu_j \rVert^2 .$$

Here, $\delta$ is the Kronecker delta and $l_i$ is a label assigning each observation $x_i$ to one of the $k$ groups, so that

$$\delta_{l_i j} = \begin{cases} 1 & \text{if } l_i = j, \\ 0 & \text{otherwise,} \end{cases}$$

while $\mu_j$ is the centroid of that group. To initialize the algorithm, the centroids are positioned in the locations of randomly selected data points. The k-means algorithm then minimizes the within cluster distance by iterating through a two step procedure known as coordinate descent. Firstly, the data is partitioned into groups by labeling each observation by the centroid it appears closest to

$$l_i = \underset{j}{\arg\min} \, \lVert x_i - \mu_j \rVert^2 .$$

Secondly, the centroids are moved to the mean position of each group according to

$$\mu_j = \frac{1}{N_j} \sum_{i=1}^{N} \delta_{l_i j} \, x_i ,$$

where $N_j$ is the number of all observations in group $j$. Because the centroids have shifted, new labels must be assigned to all the observations in accordance with step one. This process continues until the centroids converge. The final clustering depends on the initialization of the centroids. It is customary to repeat the clustering with a number of different centroid initializations and select the clustering with the lowest cost $J$. k-means was chosen for its simplicity, scalability and linear time complexity (Hartigan & Wong, 1979). The above algorithm is easily adapted to the purposes of grouping line profiles. Instead of two features being mapped to a 2-dimensional Cartesian plane, we composed each profile out of 216 points and mapped them to a 216-dimensional space, i.e., $x_i \in \mathbb{R}^{216}$. k-means requires the number of clusters $k$ to be chosen by hand. There are many methods, such as the "elbow technique" and silhouette analysis (Rousseeuw, 1987), which indicate the natural number of partitions of a data set; however, this number is in many cases left to the discretion of a professional and should not be automated. In conclusion, k-means generates $k$ groups, with each group containing a number of spectra that share similar features. Each group has a representative spectral profile, the mean profile of that group, called the centroid.
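The two-step iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this study: the function name `kmeans`, its parameters and the handling of empty groups are assumptions made for the example, and the dense distance array is only practical for small samples.

```python
import numpy as np


def kmeans(X, k, n_init=10, max_iter=300, seed=0):
    """Plain k-means with several random restarts (illustrative sketch).

    X is an (n_observations, n_features) array. Returns the labels,
    centroids and within cluster distance of the best restart.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _restart in range(n_init):
        # Initialize the centroids at randomly selected data points.
        centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
        for _step in range(max_iter):
            # Step 1: label each observation by its nearest centroid.
            d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Step 2: move each centroid to the mean of its group.
            # Assumption: an empty group keeps its previous centroid.
            new = centroids.copy()
            for j in range(k):
                members = X[labels == j]
                if len(members):
                    new[j] = members.mean(axis=0)
            if np.allclose(new, centroids):
                break
            centroids = new
        # Labels and cost for the final centroids of this restart.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        cost = d2[np.arange(len(X)), labels].sum()
        if best is None or cost < best[2]:
            best = (labels, centroids, cost)
    return best
```

Calling `kmeans(X, k=2)` on an array of normalized 216-point profiles returns one label per profile, the two mean profiles, and the cost of the best of the ten restarts.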

We experimented with a number of alternative distance metrics, most of which returned similar if not identical results to the Euclidean distance, while other distance measures found clusters that were hard to reconcile with any physical interpretation. Additionally, when updating the centroids one could select the median or a true representative point instead of the arithmetic mean. This could make the algorithm less susceptible to the effects of outliers at the price of being more computationally expensive. However, such effects are negligible if there are a sufficient number of normal data points. Furthermore, we acknowledge that there may exist a transformation and metric pair that is more suited to the clustering of spectral data, but did not wish to interject any biases into our data set.

Figure 1 shows the result of k-means applied to a single raster of the M6.5 class flare on June 6, 2015. The bottom panel shows 6 of the 53 centroids found by our adapted k-means algorithm explained in section 2.4. The color-coded positions of the spectra for this raster have been overlaid onto the SJI. The Mg ii k profiles emerging from each location are assigned to one of the 53 groups based on which centroid they are most similar to.

2.3. Data reduction

We analyzed 26 M- and 7 X-class flares with observations in the Mg ii spectral window and slits positioned directly over the flaring region. The details of each flare can be found in Table 1, which includes a variety of observational modes.

Before using the k-means algorithm, the data were prepared in several ways. The Mg ii spectra were selected over a time interval from 15 minutes before to 15 minutes after each flare, if available. The start and end times were determined from the GOES flare database. The spectra were then cropped to a window which contained both the Mg ii k line and the subordinate lines consisting of the two strongest red wing transitions with vacuum wavelengths at 2798.75 and 2798.82 Å. This step not only reduces computational demands but also helps mitigate the dimensionality problems common amongst machine learning algorithms with distance-based metrics. The profiles were then interpolated using a spectral sampling of 0.025 Å/pixel, resulting in all Mg ii k-line profiles having a total of 216 points. Since the line intensities in data numbers (DN) can vary over two orders of magnitude, each profile was divided by its maximum intensity. This ensures that the classification is only based on the shape of the profile and not on its intensity. Data loss from incomplete spectra was automatically handled during the cropping phase. To visualize the results, we projected the assigned spectra onto the 1400 Å slit-jaw image, which is sensitive to plasma at chromospheric and transition-region-like temperatures. If this filter was not used for a particular observation, the 2796 Å slit-jaw image, which sees the lower chromosphere, was substituted.
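The cropping, resampling and normalization steps can be illustrated with a short sketch. The function `prepare_profile` and its parameter `lam_min` (the lower edge of the spectral window in Å) are hypothetical, as is the choice to return `None` for spectra that do not cover the window. The grid of 216 points at 0.025 Å/pixel follows the text.

```python
import numpy as np


def prepare_profile(wavelength, intensity, lam_min, n_points=216, step=0.025):
    """Crop, resample and normalize one spectrum (illustrative sketch).

    lam_min is a hypothetical lower window edge in Angstrom. Returns None
    if the spectrum does not cover the window or contains no signal, which
    is an assumed way of dropping incomplete spectra.
    """
    # Common wavelength grid shared by all profiles.
    grid = lam_min + step * np.arange(n_points)
    if wavelength[0] > grid[0] or wavelength[-1] < grid[-1]:
        return None
    # Linear interpolation onto the common grid (wavelength must increase).
    resampled = np.interp(grid, wavelength, intensity)
    peak = resampled.max()
    if not np.isfinite(peak) or peak <= 0:
        return None
    # Divide by the maximum so that only the shape is retained.
    return resampled / peak
```

Profiles prepared in this way all have the same length and a maximum of one, so Euclidean distances between them compare shapes rather than intensities.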

2.4. Description of the k-means pipeline

The k-means algorithm was applied to Mg ii spectra collected from the subset of 4 M- and 4 X-class flares with the number of groups set to . Flaring groups were generated by manually selecting only the spectra that appeared over the flaring regions. The algorithm was repeated 10 times with different centroid initializations and the clustering with the lowest cost was selected (see section 2.2). If two or more centroids were similar, only a single centroid was retained.

Because every profile has to be assigned to a group, it is necessary to collect a number of non-flaring centroids to filter out the non-flaring spectra. Following the same procedure above, we generated quiet Sun and sunspot groups by manually selecting only the spectra that appeared over quiet Sun and sunspot regions within the 8 selected flares. Centroids associated with small energetic events due to flux emergence were generated by running k-means over rising flux regions. This completed the unsupervised part of the algorithm.

For the remaining flares, the centroids from the clustering were passed to another algorithm (a one-nearest neighbor classifier), where each profile was labeled by the centroid it appeared nearest to. Once the profiles of every flare were assigned to their groups, the variances of the flaring groups were monitored. If the variance was too high, an additional flaring group was introduced manually, with the proviso that the new group consistently corresponded with some feature on the SJI projections (see Figure 1). This procedure is demonstrated in Figure 2, and was continued until we were satisfied that all interesting flare groups were included. The final 53 groups can be seen in Figure 3. Our method of clustering deviates from the original k-means algorithm in that we manually merge and split groups based on the supervision of both the variance and the SJI projections. It is unclear whether the same results could be achieved simply by increasing the initial number of centroids in the original k-means algorithm. We refer to this new clustering method as "supervised hierarchical k-means", or SHK for short.
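The nearest-centroid labeling and the variance monitoring can be sketched as follows. Both function names are hypothetical, and the definition of a group's variance as the mean squared distance of its members from the centroid is an assumption made for illustration.

```python
import numpy as np


def assign_to_centroids(profiles, centroids):
    """Label each profile with the index of its nearest centroid."""
    # Squared Euclidean distances via |x|^2 - 2 x.c + |c|^2,
    # which avoids building a large three-dimensional array.
    d2 = (
        (profiles ** 2).sum(axis=1)[:, None]
        - 2.0 * profiles @ centroids.T
        + (centroids ** 2).sum(axis=1)[None, :]
    )
    return d2.argmin(axis=1)


def group_variances(profiles, centroids, labels):
    """Mean squared distance of each group's members from its centroid.

    Assumed definition of the in-group variance; empty groups yield NaN.
    """
    out = np.full(len(centroids), np.nan)
    for j in range(len(centroids)):
        members = profiles[labels == j]
        if len(members):
            out[j] = ((members - centroids[j]) ** 2).sum(axis=1).mean()
    return out
```

A group whose variance stands out from the others would then be a candidate for manual inspection and possible splitting.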

Figure 2.— Plot showing 80 profiles (light grey solid lines) that have been assigned to a hypothetical group A (black solid line). The variance of the profiles close to group A is given by , while the total variance for every profile is given by . A high variance indicates a potential flaring group that had not been found by the original k-means algorithm and was manually added later (broken black-line).
Figure 3.— Centroids found using the SHK algorithm, with manual merging and splitting of groups. The y-axis is in units of normalized intensity and the x-axis is in wavelengths (Å). The groups that we analyzed are color coded and have bold borders. The event that each group is associated with appears in the top left and right corners of each panel, with FE standing for flux emergence. It is important to note that groups other than those forming the basis of this study may occur in more than one category, and were assigned descriptions based on the locations from which they were collected.
Figure 4.— Plot showing examples of the four profile classes. The grey shading indicates the FWHM of each centroid.
Figure 5.— Average variances of groups 0, 4, 5 and 9 across all 33 flares. Vertical lines indicate the flares used to train each centroid (group 9 was included manually). The insets display group centroids with each flare's average profile overplotted in grey. Var is the average variance across all flares. Group 9 has been included as an example of a group with high variance. Profiles from group 0 occur in every flare with extremely low variance and are therefore a universal flare feature.

2.5. Centroids

The 53 profile types fall into four main classes: Quiet Sun, sunspot, flux emergence, and flaring centroids. Figure 4 shows examples of each of the four classes. The quiet Sun profiles have characteristic central reversals, well defined blue (2kv) and red (2kr) peaks and line wings that follow the temperature structure of the lower atmosphere. Flaring profiles on the other hand can be extremely broad and are often observed without central reversals and with the subordinate lines in emission. Similarly, sunspot profiles often have a single peak, but can be distinguished from flaring profiles based on their narrow width and lack of subordinate line emission. Flux emergence profiles share many features of quiet Sun and flaring profiles. They have raised wings and are broad, however there is no subordinate line emission.

3. Analysis

Groups 0, 4, 5, 11, 12 and 52 in Figure 3 appear with consistent behavior throughout our data set, indicating that they are related to flares. We now analyze our findings and discuss the behavior and defining features of each of these groups. We focus on two main results: 1) There are typical flare profiles that appear in every single flare and 2) There are special flare profiles at the front of flare ribbons.

3.1. Are there typical flare profiles?

We investigate if all flares share common profiles, which would indicate that the physics of the lower solar atmosphere may be similar in all flares.

The single-peaked profiles can be divided into two different types. The first is represented by centroid 0 and has small subordinate line emission and a FWHM of 0.5 Å. The second, represented by centroids 4 and 5, has a narrower convex shape with large subordinate line emission and a FWHM between 0.29 and 0.49 Å. Broader profiles such as those belonging to groups 1, 2 and 3 are rarer, have less predictable behavior and often appear with small central reversals. In Figure 5, we have plotted the average variances of groups 0, 4, 5 and 9 for each of the 33 flares. The single-peaked profiles belonging to group 0 appear in every flare with a total average variance of 0.52, and are always located over the ribbon or in regions of small brightenings. These profiles are not exclusive to solar flares and can be stimulated by an assortment of sub-flare energetic events such as local heating from small scale reconnections over flux emergence regions. We tested their prevalence by analyzing 8 non-flaring energetic events and 1 true quiet Sun observation. They did not appear in the quiet Sun observation but were seen in 7 of the 8 energetic events (see Table 2). The profiles therefore link both high and low energetic solar activity with temporal occurrences before, after, and during the flare, and can be identified as a universal flaring profile.
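For reference, the FWHM of a single-peaked, normalized profile can be estimated by locating the two half-maximum crossings. The sketch below is one simple way to do this and is not necessarily the procedure used for the values quoted above. The function name `fwhm` is hypothetical, the half maximum is measured from zero rather than from a continuum level, and a subordinate line that exceeds half maximum would be counted as part of the main peak.

```python
import numpy as np


def fwhm(wavelength, profile):
    """FWHM from linear interpolation at half maximum (illustrative sketch)."""
    half = 0.5 * profile.max()
    above = np.where(profile >= half)[0]
    left, right = above[0], above[-1]

    def crossing(i_out, i_in):
        # Interpolate between a sample below (i_out) and a sample
        # at or above (i_in) the half-maximum level.
        frac = (half - profile[i_out]) / (profile[i_in] - profile[i_out])
        return wavelength[i_out] + frac * (wavelength[i_in] - wavelength[i_out])

    # Fall back to the window edge if the profile never drops below half maximum.
    lam_left = crossing(left - 1, left) if left > 0 else wavelength[0]
    lam_right = (
        crossing(right + 1, right) if right < len(profile) - 1 else wavelength[-1]
    )
    return lam_right - lam_left
```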

Single-peaked profiles from groups 4 and 5 with large triplet emission occur in the wake of the ribbon front and can have long characteristic lifetimes of hour (for example flare 31 in our list). The large triplet emission associated with these groups may be due to the heating of both the upper and lower atmosphere by non-thermal electrons (Pereira et al., 2015). In every flare, the frequency of these profiles is strongly correlated with the GOES 1-8 Å channel.

Figure 6.— SJIs of four different flares in the 1400 Å channel with color coded group assignments overplotted (see Figure 3) and x- and y-axes given in arcseconds. Profiles assigned to group 11 (bottom right of panel C) and 12 appear at the leading edge of the flare ribbons in black. The temporal evolution of each flare can be seen in the online movies.

3.2. Ribbon-front IRIS profiles

Here we investigate which profiles occur at ribbon fronts, where accelerated electrons are thought to reach lower atmospheric layers. We find that profiles belonging to groups 11 and 12 occur at the leading edge of fast-moving flare ribbons; however, there are also a number of false positives generated by upflowing material. These false positives are similar to the ribbon-front profiles but lack subordinate line emission. Profiles from groups 11 and 12 seem to be less exaggerated versions of the rarer profiles in group 52, which also appear on the ribbon front with broader widths, deeper central reversals and larger subordinate line emissions. In Figure 6, we have plotted four flares during their impulsive phase. Panel A contains the upper ribbon of a two-ribbon flare and has profiles assigned to groups 11 and 12 at three different positions, each of which follows the direction of the ribbon: upwards for the highest point and downwards for the other two points. Similar behavior can be noted in panels B, C and D. In each case profiles from groups 11 and 12 follow the leading edge of the ribbon. Once the ribbon passes, profiles from groups 4 and 5 appear over the heated regions where the NUV emission is enhanced. When the ribbon starts to progress more slowly, the single-peaked profiles from group 0 take their place, as can be seen in the online movies.

Figure 7.— IRIS 1400 SJI with slanted black lines marking the positions of the raster steps. The time of the first slit position is shown in the top right corner along with the raster's cadence. The RHESSI hard X-ray contours for levels [.20, .35, .50, .65, .80, .90] appear in orange, with the noise level of the image starting at .15. An inset in the bottom left hand corner shows the 3 spectra assigned to centroid 52. The cyan and black markers can be seen to follow the leading edge of the ribbon front in the online movies. For clarity, only the ribbon-front profiles are shown.
Figure 8.— Profile counts for groups 11, 12 and 52 plotted over the GOES light curve and derivative. Subplots of the three groups have been included in flare panels 28, 21 and 29 with 20 of the profiles from that flare plotted in grey over their corresponding centroids. The histograms are normalized separately for each flare group. The ribbon-front IRIS profiles follow the GOES derivative and appear jointly in every flare observation.

Profiles belonging to group 52 have been observed by Rubio da Costa & Kleint (2017) (flare 28 of our list) to occur co-spatially and co-temporally with hard X-ray emission in the RHESSI 32-100 keV energy band. We found supporting evidence for the alignment of these profiles with hard X-ray signatures in the M6.5 flare observed both by IRIS and RHESSI on June 22, 2015 (flare 22 of our list). As seen in Figure 7, one RHESSI hard X-ray source coincides with the cyan markers, which indicate the positions of profiles from group 52. The contours were drawn using the "clean" reconstruction algorithm with a time integration equal to the cadence of the IRIS raster. The roll angle of RHESSI was not modified. The slanted black lines show the locations of the IRIS raster. Unfortunately, RHESSI was in Earth’s shadow during the impulsive phase of the flare; therefore, we cannot be sure whether the upper part of the ribbon had hard X-ray signatures previously, which may explain the black-colored profiles in that location and also the few cyan profiles seen in the movie at earlier times. Black ribbon-front profiles could still be related to X-ray emission if the X-ray emission is at least 10 times fainter than that of the main source, making it invisible due to RHESSI’s limited dynamic range. In the online movies, profiles from group 52 can be seen at the front of the fast-moving bottom ribbon between times T17:53:37-T17:57:00, but not exactly at the time shown in the figure. This could have several explanations. The profiles may have coincided with hard X-rays at these times, but this cannot be verified due to RHESSI’s orbit. The hard X-rays at 18:04 could be too weak to trigger the cyan profiles, or alternatively, hard X-rays do not necessarily trigger cyan-type profiles. They could also be hidden in this observation: post-flare loops with large amounts of downflowing material can be seen in the IRIS movie and faintly in this figure. The viewing angle means that IRIS observes the lower ribbon through the loops. This may explain why, instead of profiles from group 52 lining the lower ribbon, we see triangular profiles with large redshifts. In summary, cyan profiles seem to occur near hard X-ray contours, but there are exceptions and a future statistical study is desirable.

In Figure 8 we have plotted the frequency of occurrence of the three ribbon-front IRIS profiles for 8 flares. The GOES curve and derivative in the 1-8 Å channel have been included for each flare to outline the impulsive phase. The three panels of flares 28, 21 and 29 contain the ribbon centroids as well as 20 overplotted profiles from the flare in that panel. Profiles belonging to group 52 occur during the impulsive phase and cluster around the maximum of the GOES derivative, in contrast to groups 11 and 12 which are more dispersed in relation to the GOES derivative, and may be the result of lower energy electron bombardment appearing to follow soft X-ray signatures. Profiles assigned to group 52 have FWHMs of and characteristic blue shifted reversals at , whose wavelength calibration we have verified with quiet Sun profiles that were centered at 2796.34 Å. Furthermore, these profiles may have large non-thermal contributions to their line width. Since IRIS has a negligible instrumental broadening, the non-thermal velocities can be calculated using the formula from Milligan (2011) given by


where the second component is the squared Doppler velocity and the line formation temperature T was taken as K. Assuming the profiles are resolved, we find an upper limit of 95 km/s for the non-thermal velocities, neglecting both pressure and opacity broadening. The velocities would be diminished if the profiles were unresolved. Figure 9 shows the typical line shapes of ribbon-front IRIS profiles. The cyan profile represents the average profile from group 52, which appears in 42% of our flares. It is possible that IRIS misses the hard X-ray locations for some flares. The black profile is an average of every profile belonging to groups 11 and 12, which occur jointly in 100% of the observations. Note the striking similarity between the two averaged profiles. Both have central reversals at precisely the same wavelength, and emission from both the strong far red wing line and its partner line, forming a dimple. This leads us to believe that the ribbon profiles place a real physical constraint on the non-thermal velocity fields generated by the electron beam.
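As a worked illustration of this kind of estimate, assume a Gaussian line of FWHM $W$ with negligible instrumental broadening, for which a commonly used relation is $W = 2\sqrt{\ln 2}\,(\lambda_0/c)\sqrt{2 k_B T/M + v_{nth}^2}$. The sketch below inverts this relation for $v_{nth}$. The function name, the constants and the temperature passed by the caller are choices made for the example and are not taken from the analysis above.

```python
import math

C_KM_S = 299792.458      # speed of light [km/s]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
AMU = 1.66053907e-27     # atomic mass unit [kg]
MG_MASS = 24.305 * AMU   # mass of a magnesium atom [kg]


def nonthermal_velocity(fwhm, lam0, temperature):
    """Non-thermal velocity [km/s] from a Gaussian FWHM (illustrative sketch).

    fwhm and lam0 must share the same wavelength unit. Instrumental,
    pressure and opacity broadening are all neglected.
    """
    # Total squared 1/e velocity width implied by the FWHM.
    total_sq = (C_KM_S * fwhm / (2.0 * math.sqrt(math.log(2.0)) * lam0)) ** 2
    # Squared thermal (Doppler) velocity, converted from (m/s)^2 to (km/s)^2.
    thermal_sq = 2.0 * K_B * temperature / MG_MASS / 1.0e6
    # Clip at zero for lines narrower than the thermal width.
    return math.sqrt(max(total_sq - thermal_sq, 0.0))
```

For chromospheric temperatures the thermal term for magnesium is only a few km/s, so for very broad profiles the estimate is dominated by the measured line width.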

Figure 9.— Average profiles of groups 52 and (11+12) taken over all 33 flares, with non-thermal velocities of 95 and 54 km/s respectively. The position of the characteristic blue shifted reversal is indicated with a vertical blue line at and has a standard deviation of .

3.3. Line ratios

The intensity ratios of the Mg ii h&k lines can be used as a diagnostic for the optical depth. A ratio of 2:1 of k:h indicates that the lines are formed in optically thin conditions, and a ratio of 1:1 indicates optical thickness. A derivation of these ratios is given in the Appendix.

In addition to the h&k lines, there exists a companion set of triplet lines from transitions between the 3p $^2$P and 3d $^2$D states. These subordinate lines are located on the blue and red wings of the k-line and have vacuum wavelengths of 2791.60, 2798.75 and 2798.82 Å. The oscillator strength of the blue subordinate line (not visible in our window) is half that of the far red line. From here on, the strongest subordinate line at 2798.82 Å will be denoted by s.

Early observations by Lemaire et al. (1984) using NASA’s Orbiting Solar Observatory recorded k/h ratios, taken before and during a flare, in the range 0.9-1.5. Quiet Sun ratios were measured by Kohl & Parkinson (1976) in the range 1.14-1.46, while more recent high resolution IRIS observations by Kerr et al. (2015) found time averaged quiet Sun ratios of and flaring ratios in the range 1.07-1.19. The SHK algorithm allows us to perform a large scale multi-flare study of the k/h ratios using the groups to partition flaring and non-flaring profiles. Additionally, we monitor the trend of the k/s ratios.

IRIS’s sensitivity slowly degrades over time, therefore the spectral data must be recalibrated for each observation. Before calculating the line ratios we took into account the change in effective area and converted the measured intensities into physical units , see Kleint et al. (2016) for details. The ratios were then calculated by dividing the integrated intensities of the h&k lines taken over a window. For a window exceeding , the asymmetries in the wings, which contain a number of blended lines, result in inaccuracies.
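The ratio calculation itself amounts to integrating each calibrated line over a window and dividing. In the sketch below the function names are hypothetical, the window half-width `half_width` is a free parameter, and the default line centers are the standard vacuum wavelengths of the k and h lines.

```python
import numpy as np


def integrated_intensity(wavelength, intensity, center, half_width):
    """Trapezoidal integral of the intensity within center +/- half_width."""
    mask = np.abs(wavelength - center) <= half_width
    x, y = wavelength[mask], intensity[mask]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def kh_ratio(wavelength, intensity, half_width,
             k_center=2796.35, h_center=2803.53):
    """Ratio of the integrated k and h lines (illustrative sketch).

    The intensities are assumed to be calibrated to physical units already.
    """
    k_int = integrated_intensity(wavelength, intensity, k_center, half_width)
    h_int = integrated_intensity(wavelength, intensity, h_center, half_width)
    return k_int / h_int
```

The same helper can be pointed at the subordinate line to form a k/s ratio by passing its wavelength as the second center.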

Figure 10.— Plot of line ratios for several flares, with the GOES curves given by solid black lines. The raw data points for the k/h ratio (red) and k/s ratio (cyan) are averages of all measured values within the specified minute window. The k/h ratios can be inferred from the scale on the right of each panel. A vertical black line separates ratios calculated using non-flaring profiles for times and ratios calculated from flaring profiles for times . The solid red and cyan lines were generated by fitting a third order polynomial to the raw flaring data in a running window of width 9. Both the k/h and k/s ratios decrease during each flare, indicating an increase in opacity and enhanced subordinate line emission respectively.

Figure 10 shows the evolution of both the k/h and k/s line ratios. For times , the ratios of profiles belonging to quiet Sun groups were used, while profiles from flaring groups were used for times . This division is marked with a black vertical line across all panels. The raw data appear as points, with each point representing the average of the ratios measured within the specified minute window. In order to clearly show the trends, the data points of the k/h and k/s ratios for have been fitted with a third order polynomial applied over a running window of width 9. The numerical scales of the k/h line ratios for each flare are given on the right-hand side of each panel. It is clear that both the k/h and k/s ratios decrease during each flare, indicating an increase in opacity (due to enhanced electron densities) and enhanced subordinate line emission respectively.
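A running third order polynomial fit of this kind is closely related to a Savitzky-Golay filter. The sketch below shows one possible implementation with NumPy. The function name and the treatment of the series edges (the window is shifted inward rather than shortened) are assumptions, and the series is assumed to contain at least `width` points.

```python
import numpy as np


def running_polyfit(t, y, width=9, order=3):
    """Smooth y(t) with a polynomial fitted in a sliding window (sketch)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    half = width // 2
    smooth = np.empty_like(y)
    for i in range(len(y)):
        # Shift the window inward at the edges so that it keeps its width.
        lo = min(max(i - half, 0), max(len(y) - width, 0))
        hi = min(lo + width, len(y))
        # Centre the abscissa on the evaluation point for numerical stability.
        coeffs = np.polyfit(t[lo:hi] - t[i], y[lo:hi], order)
        smooth[i] = np.polyval(coeffs, 0.0)
    return smooth
```

Applied to the binned ratios of a single flare, the returned array would trace a smoothed trend line through the raw points.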

In Figure 11, we partitioned the ratios into flaring and non-flaring ratios based on a few representative quiet Sun and flaring groups. The average k/h ratios for flaring profiles across all 33 flares were found to be in comparison to the quiet Sun values of . The large variance in the ratios is a natural consequence of partitioning the profiles into groups. The results agree with the current ratio calculations of Kerr et al. (2015) based on flare number 15 of our list.

Figure 11.— Average k/h ratios for QS and flaring regions using the representative groups (39, 43, 38, 41, 42, 44) and (4, 5, 8, 11, 12, 52). The horizontal dashed line separates flaring and non-flaring k/h ratios based on the current literature. Flare 4 and 9 had very few profiles assigned to the chosen centroids.

The intensities in absolute units span a range of . In Figure 12, we plotted the high intensity portion of ratios of all 33 flares, and color coded ratios from profiles belonging to group 0, (11+12) and 52. The ratios remain far away from the optically thin line, and in general, higher intensity profiles occur closer to flare maximum and correspond to larger opacities. These results are not surprising since both emission and opacity depend on electron density, which based on half-width measurements of high Balmer lines with can reach values of at flare maximum, and vary over two orders of magnitude during a flare (Fritzová-Švestková & Švestka, 1967). We conclude that the Mg ii lines not only remain optically thick during a flare, but appear closer to the 1:1 ratio than in the quiet Sun.

Figure 12.— k/h ratios for groups 0, (11+12) and 52 for all 33 flares with h&k intensities given in absolute units. The optically thick and thin ratios are represented by the dashed lines and . Only the high intensity portion of the range has been plotted. Higher intensity profiles generally occur closer to the optically thick ratio, implying that during a flare, the h&k lines remain optically thick.

4. Discussion

4.1. Typical flare profiles

In this section, we review the common features of flare profiles and compare them to those of the quiet Sun. We refer to the known mechanisms of quiet Sun profile formation, and discuss the likely physical processes responsible for the observed differences.

The quiet Sun profiles are observed almost completely at their value, and have a formation height that extends over the entire chromosphere, with the left (k1v) and right (k1r) minima being formed in the lower chromosphere, the left (k2v) and right (k2r) peaks in the middle chromosphere, and line core photons coming from the upper chromosphere, just below the transition region. The wings contain contributions from the photosphere, and because they are formed under LTE conditions at , they follow the source function and consequently the lower atmospheric temperature structure. This results in raised wings as one samples lower and lower heights of the photosphere. The wings are most noticeable in profiles belonging to groups 44 and 45 in Figure 3. The raised wings become "flattened" for emissions over sunspots or during flares. For sunspots, the temperature is about K lower than the quiet Sun due to the stifling of convective energy by large kilogauss magnetic fields (e.g. Carlsson et al., 2015), which results in a diminished source function. For flares, the flattening could possibly be explained by a downward-moving condensation layer that forms an optically thick barrier between the photosphere and the rest of the atmosphere (Kowalski et al., 2017). The resulting emergent intensities would then only contain contributions from the chromosphere.

In addition to the flattened wings, we found single-peaked profiles to be prevalent in every flare and discuss their potential origin here. Under quiet Sun conditions, the Mg ii h&k line profiles have a characteristic central reversal. This reversal is common in strong chromospheric lines and is due to the source and Planck functions decoupling at heights comparable to the core formation height. Jefferies & Thomas (1960) found a frequency-independent approximation of the source function that holds for upper chromospheric heights, given by


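$$S = (1-\epsilon)\int \phi_\nu J_\nu \, d\nu + \epsilon B_\nu \qquad (6)$$

with mean intensity $J_\nu$, absorption profile $\phi_\nu$, Planck function $B_\nu$ and photon destruction probability $\epsilon$ approximated as the ratio of the collisional and spontaneous de-excitation coefficients, $\epsilon \approx C_{ul}/A_{ul}$. At heights close to the core formation height, the source function decreases on account of $A_{ul}$ being several orders of magnitude larger than $C_{ul}$.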

As explained by Leenaarts et al. (2013) and Leenaarts et al. (2012), the h&k line core photons are produced in the thin upper chromosphere where large photon mean free paths and low photon destruction probabilities prevail. As such, the radiation field is horizontally smoothed and the Eddington approximation holds. Consequently, the source function along with the emergent intensity decreases with height, resulting in an observed central reversal. We find that during a flare, the central reversals are commonly seen in emission. Eq. 6 demonstrates that an increase in electron density and temperature could result in a source function that continues to increase with height. A recent parameter study by Rubio da Costa & Kleint (2017) found that central emissions were indeed stimulated either by a large temperature spike in the upper chromosphere or by an increase in electron density at the same location. In both cases, the coupling of the source and Planck functions persists to heights above the core formation height, allowing the entire profile to form under LTE. The study demonstrated that unresolved up- and downflows can also lead to single-peaked profiles formed under non-LTE conditions. In this case, the complex velocity fields containing both condensation and evaporation patterns can produce photons that fill the central reversal. It is unclear which of the three mechanisms is responsible for the single-peaked flare profiles, which may result from an interplay between all three processes.
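To make the coupling argument concrete, the sketch below evaluates a two-level-atom source function of the form S = (1 - eps) J + eps B, with eps = C / (C + A), which reduces to C/A when A is much larger than C. This form, the collisional rate coefficient and the radiation-field values are illustrative assumptions for the example. Only the Einstein coefficient, of order 2.6e8 per second for Mg II k, is a standard atomic value, so the numbers indicate a trend rather than a model result.

```python
A_UL = 2.6e8     # spontaneous de-excitation rate of Mg II k in 1/s
Q_COLL = 1.0e-7  # assumed collisional rate coefficient in cm^3/s (illustrative)


def destruction_probability(electron_density):
    """Photon destruction probability eps = C / (C + A) for a two-level atom."""
    c_ul = Q_COLL * electron_density  # collisional de-excitation rate in 1/s
    return c_ul / (c_ul + A_UL)


def source_function(mean_intensity, planck, eps):
    """Two-level-atom source function S = (1 - eps) * J + eps * B."""
    return (1.0 - eps) * mean_intensity + eps * planck


# Hypothetical values in arbitrary units: the radiation field has dropped
# well below the local Planck function near the core formation height.
J_BAR, PLANCK = 0.1, 1.0

for n_e in (1e11, 1e13, 1e15):  # electron densities in cm^-3
    eps = destruction_probability(n_e)
    s = source_function(J_BAR, PLANCK, eps)
    print(f"n_e = {n_e:.0e} cm^-3  eps = {eps:.2e}  S/B = {s / PLANCK:.3f}")
```

In this toy setting, raising the electron density increases eps and pulls the source function away from the diluted radiation field and toward the Planck function, which is the qualitative behavior invoked for the filled-in line cores.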

The ribbon would provide the most favorable conditions for single-peak production, with enhanced electron densities, temperatures and Doppler velocities, which may explain the prevalence of these profiles in flares.

4.2. Profiles at leading edges of flare ribbons

Recent observations of the ribbon front have shown that NUV spectra resulting from the non-thermal electron beam may differ from other spectra within the field of view. Xu et al. (2016) observed a negative flare front in He i 10830 Å using the Goode Solar Telescope at Big Bear Solar Observatory. The ribbon front produced the broadest Mg ii line profiles in the field of view, with a FWHM of 1 Å. Tei et al. (2018) observed the leading edge of a C-class flare kernel on November 11, 2014 with IRIS. They observed Mg ii profiles with intensity enhancements in the blue wing and smaller blue (h2v) peaks in comparison to the red (h2r) peaks of the h-line. The apparent blueshifts lasted 9-48 s, with speeds of km/s and were followed by strong redshifts up to 51 km/s. They proposed a simple model where non-thermal energetic electrons would heat a deep region of the atmosphere which would then expand and carry cool chromospheric-temperature plasma into the corona. This mechanism was verified by a simple non-LTE cloud model (Beckers, 1964), where the emission of the rising cool plasma explained the blue wing enhancement, and the peak asymmetries were naturally reproduced on account of the cool upwards moving material shifting the maximum opacity of the line into the blue. The peak asymmetries and blue wing enhancements were not observed in lines such as Ca ii K, Ca ii 8542 Å and , which have significantly smaller optical depths than the Mg ii lines. Additionally, using a non-LTE simulation with partial redistribution taken into consideration for the line cores, Rubio da Costa & Kleint (2017) generated synthetic blue shifted h&k lines with the desired blue and red peak asymmetries by combining two spectral profiles produced by downflows at different chromospheric heights.

We find that profiles located at the ribbon front are often the broadest profiles in the field of view, only surpassed by even broader profiles due to filament eruptions (group 10) or large downflows at the end of the flares (group 8). Large broadenings may indicate large non-thermal velocities and thus turbulence. We therefore conclude that if the profiles are resolved, the turbulence is highest at the leading edge of the flare ribbon.

Profiles from the combination of groups 11 and 12 occur in every flare, although tenuously and with extremely low triplet emissions in the succession of M-class flares on October 26, 2014. However, the slit of IRIS was positioned poorly in relation to the active region’s major ribbon activity. On this day, flare 9 produced a profile assigned to group 11 with precise width and peak asymmetries, but with no triplet emission. The profile seems to be caused by upflowing material from a clearly visible erupting jet.

We conclude that the ribbon-front IRIS profiles have three possible origins: 1) They are generated by the superposition of unresolved downflows at different chromospheric heights, as demonstrated by Rubio da Costa & Kleint (2017). 2) They come about due to enhanced turbulence at the leading edge of the ribbon front, with triplet line emission due to the heating of the lower chromosphere, in line with the quiet Sun triplet modeling of Pereira et al. (2015). 3) They are generated by rising cool chromospheric-temperature material, as discussed by Tei et al. (2018). The last explanation seems unlikely, since both flare 9 and the cloud model of their study generated profiles similar to our ribbon-front IRIS profiles but without the subordinate emission. This suggests that profiles generated by upflowing material should be distinguished from profiles generated by non-thermal electron bombardment. The distinction rests on whether the two red wing triplet lines appear in emission.

5. Conclusions and outlook

We have shown that clustering based on machine learning is a very suitable tool to systematically identify and analyze spectra in a multitude of flares. Our main results can be summarized as follows:

  • Typical Mg ii h&k flare profiles consisting of a single peak and broad wings exist and appear in all flares. They are located over heated regions where the NUV emission is enhanced. The subordinate lines appear in emission.

  • Profiles at the leading edges of flare ribbons appear to follow X-ray signatures and also have typical shapes. They are generally the broadest profiles in the field of view and contain a blue-shifted reversal at 2796.27 Å.

  • The k/h-line ratios for flaring and non-flaring profiles are well separated, with values of and respectively. During a flare, this ratio decreases with higher intensity profiles appearing closer to the optically thick 1:1 ratio.
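For orientation, the wavelength of the blue-shifted reversal quoted in the second point can be converted into an equivalent line-of-sight velocity. The sketch below assumes a vacuum rest wavelength of 2796.35 Å for the Mg II k line and interprets the offset as a pure Doppler shift. This is a simplification, since the text also considers unresolved flows and turbulence as origins of the profile shape, and the function name is illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299792.458
MG_II_K_REST = 2796.35  # assumed vacuum rest wavelength in Angstrom


def doppler_velocity(observed, rest=MG_II_K_REST):
    """Non-relativistic Doppler velocity in km/s (negative means blueshift)."""
    return SPEED_OF_LIGHT_KM_S * (observed - rest) / rest


velocity = doppler_velocity(2796.27)
print(f"{velocity:.1f} km/s")  # about -8.6 km/s, i.e. a blueshift
```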

We may explain the prevalence of the single-peaked profiles through enhanced electron densities, enhanced temperatures, or unresolved up- and downflows, all of which are expected to occur in the wake of flare ribbons as a consequence of the high electron deposition rates from the corona (see section 4.1). The unusual shape of the profiles at the leading edge of flare ribbons could be generated either by unresolved downflows or by enhanced turbulence. In the case of unresolved downflows, a dip at a surprisingly constant wavelength may indicate similar downflows for all flares, and was also shown to occur in simulations. If the profiles are resolved, the enhanced turbulence would be greatest at the leading edge of the flare ribbon, which is also expected from flare models.

Furthermore, we suggest that the ribbon-front IRIS profiles along with the single peaked profiles are universal flare indicators, with the single peaked profiles linked to the generic heating and increased electron density of the chromosphere during a flare, and the ribbon-front IRIS profiles being the NUV counterpart to the magnetic reconnection event.

Extending the SHK analysis to include additional lines such as Si iv, C ii and O iv will allow us to analyze the atmospheric conditions at different heights and in more detail, leading to a better understanding of the flaring atmosphere. Rare profiles such as those belonging to group 16 should be understood more thoroughly using forward modeling with realistic flaring atmospheres and non-LTE radiative transfer codes. SHK may also be an ideal and reliable candidate for detecting downflows over the entire IRIS database.

Additionally, we plan on performing an exhaustive study of co-aligned IRIS and RHESSI observations to provide further evidence of the spatial and temporal relationship between hard X-ray emission and profiles from group 52.

We would like to thank the Swiss National Science Foundation for funding this research under grant number 407540_167158, as well as LMSAL and NASA for allowing us to download all the IRIS data from their servers. IRIS is a NASA small explorer mission developed and operated by LMSAL with mission operations executed at NASA Ames Research Center and major contributions to downlink communications funded by ESA and the Norwegian Space Centre. SK has been funded through NASA contract NAS 5-98033.

References


  • Beckers (1964) Beckers, J. M. 1964, PhD thesis, Sacramento Peak Observatory, Air Force Cambridge Research Laboratories, Mass., USA
  • Carlsson et al. (2015) Carlsson, M., Leenaarts, J., & De Pontieu, B. 2015, ApJ, 809, L30
  • de la Cruz Rodriguez et al. (2016) de la Cruz Rodriguez, J., Leenaarts, J., & Ramos, A. A. 2016, The Astrophysical Journal Letters, 830, L30
  • De Pontieu et al. (2014) De Pontieu, B., Title, A. M., Lemen, J. R., et al. 2014, Solar Physics, 289, 2733
  • Emslie et al. (2005) Emslie, A. G., Dennis, B. R., Holman, G. D., & Hudson, H. S. 2005, Journal of Geophysical Research: Space Physics, 110, A11103
  • Fritzová-Švestková & Švestka (1967) Fritzová-Švestková, L., & Švestka, Z. 1967, Solar Physics, 2, 87
  • Hartigan & Wong (1979) Hartigan, J. A., & Wong, M. A. 1979, Journal of the Royal Statistical Society. Series C (Applied Statistics), 28, 100
  • Jefferies & Thomas (1960) Jefferies, J. T., & Thomas, R. N. 1960, ApJ, 131, 695
  • Kastner & Bhatia (1997) Kastner, S., & Bhatia, A. 1997, Journal of Quantitative Spectroscopy and Radiative Transfer, 58, 217
  • Kastner (1993) Kastner, S. O. 1993, Space Sci. Rev., 65, 317
  • Kerr et al. (2016) Kerr, G. S., Fletcher, L., Russell, A. J. B., & Allred, J. C. 2016, The Astrophysical Journal, 827, 101
  • Kerr et al. (2015) Kerr, G. S., Simões, P. J. A., Qiu, J., & Fletcher, L. 2015, A&A, 582, A50
  • Kleint et al. (2016) Kleint, L., Heinzel, P., Judge, P., & Krucker, S. 2016, The Astrophysical Journal, 816, 88
  • Kohl & Parkinson (1976) Kohl, J. L., & Parkinson, W. H. 1976, ApJ, 205, 599
  • Kowalski et al. (2017) Kowalski, A. F., Allred, J. C., Daw, A., Cauzzi, G., & Carlsson, M. 2017, The Astrophysical Journal, 836, 12
  • Leenaarts et al. (2012) Leenaarts, J., Carlsson, M., & van der Voort, L. R. 2012, The Astrophysical Journal, 749, 136
  • Leenaarts et al. (2013) Leenaarts, J., Pereira, T. M. D., Carlsson, M., Uitenbroek, H., & De Pontieu, B. 2013, ApJ, 772, 89
  • Lemaire et al. (1984) Lemaire, P., Choucq-Bruston, M., & Vial, J. C. 1984, Solar Physics, 90, 63
  • Lemaire & Gouttebroze (1983) Lemaire, P., & Gouttebroze, P. 1983, A&A, 125, 241
  • Liu et al. (2015) Liu, W., Heinzel, P., Kleint, L., & Kašparová, J. 2015, Sol. Phys., 290, 3525
  • Machado et al. (1980) Machado, M. E., Avrett, E. H., Vernazza, J. E., & Noyes, R. W. 1980, ApJ, 242, 336
  • MacQueen (1967) MacQueen, J. 1967, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics (Berkeley, Calif.: University of California Press), 281–297
  • Milligan (2011) Milligan, R. O. 2011, The Astrophysical Journal, 740, 70
  • Pereira et al. (2015) Pereira, T. M. D., Carlsson, M., De Pontieu, B., & Hansteen, V. 2015, The Astrophysical Journal, 806, 14
  • Pietarila et al. (2007) Pietarila, A., Socas-Navarro, H., & Bogdan, T. 2007, The Astrophysical Journal, 663, 1386
  • Reardon & Berukoff (2014) Reardon, K. P., & Berukoff, S. 2014, in 2014 IEEE International Conference on Big Data (Big Data), 45–52
  • Reid et al. (2017) Reid, A., Mathioudakis, M., Kowalski, A., Doyle, J. G., & Allred, J. C. 2017, The Astrophysical Journal Letters, 835, L37
  • Rousseeuw (1987) Rousseeuw, P. J. 1987, Journal of Computational and Applied Mathematics, 20, 53
  • Rubio da Costa & Kleint (2017) Rubio da Costa, F., & Kleint, L. 2017, ApJ, 842, 82
  • Rubio da Costa et al. (2016) Rubio da Costa, F., Kleint, L., Petrosian, V., Liu, W., & Allred, J. C. 2016, The Astrophysical Journal, 827, 38
  • Tei et al. (2018) Tei, A., Sakaue, T., Okamoto, T. J., et al. 2018, ArXiv e-prints
  • Viticchie & Sanchez Almeida (2011) Viticchie, B., & Sanchez Almeida, J. 2011, 736, 71
  • Xu et al. (2016) Xu, Y., Cao, W., Ding, M., et al. 2016, The Astrophysical Journal, 819, 89

Appendix A Derivation of the k/h ratio

The collision rate between the h&k levels and the ground state is much smaller than the spontaneous radiative de-excitation rate (Leenaarts et al., 2013). Consequently, the line strengths are only loosely coupled and we can expect the emergent intensities of the two spectral lines to differ from one another appreciably. The flux recorded at Earth is given by


where characterizes the wavelength of photons from the transition , , is the atomic population of the h or k level, is the Einstein coefficient for spontaneous emission from the upper state j to the lower state i, is the volume and is the photon escape probability. Following the comprehensive review of photon escape probabilities by Kastner (1993), the mono-directional, frequency-averaged single-flight escape probability for a Doppler profile with constant emission along the line of sight is given by


where means we have averaged over a Doppler profile, is the direction of emission, is the optical depth at the center of the profile and is a dimensionless frequency variable. This equation stems from the assumption that the escape probability depends on the optical depth in the same manner as the one-dimensional equation of transfer, namely . For a detailed numerical calculation of Eq. A2 see Kastner & Bhatia (1997). The asymptotic behavior at and is relatively simple to calculate. For the optically transparent case, the integral simplifies to a Gaussian integral after expanding the exponential in a power series and taking the limit , giving , while for the optically opaque case, the escape probability can be written as a convergent series which in the limit gives .

In general, the flux ratio between two lines reduces to a product of the ratios of the photon escape probabilities, statistical weights of the lower level, element abundance, ionization fraction and collision strengths given by


where is the ionization energy of hydrogen, g a Gaunt factor to correct for quantum mechanical effects, the threshold energy required for the transition and the level's oscillator strength. Because we are analyzing two resonance lines from the same element and ionization state, the k/h ratio reduces to the ratio of the two levels' oscillator strengths multiplied by the ratio of the escape probabilities at the two line-core wavelengths


The h&k lines can therefore be used as an opacity diagnostic, with the optically thin case corresponding to a flux ratio of 2:1 and the optically thick case corresponding to a flux ratio of 1:1.
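The two limits can be checked numerically. The sketch below assumes the commonly used single-flight escape probability for a Doppler profile, p(tau0) = pi^(-1/2) times the integral of exp(-x^2 - tau0 exp(-x^2)) over all x, taken along the line of sight, together with an oscillator-strength ratio of 2, so that the k-line optical depth is twice that of the h line. This closed form, the integration limits and the function names are assumptions of the example and may differ in detail from Eq. A2.

```python
import math


def escape_probability(tau0, x_max=8.0, steps=4000):
    """Single-flight escape probability for a Doppler profile (Simpson's rule).

    Evaluates (2 / sqrt(pi)) * integral_0^x_max exp(-x^2 - tau0 * exp(-x^2)) dx,
    using the symmetry of the integrand about the line centre.
    """
    def integrand(x):
        gauss = math.exp(-x * x)
        return gauss * math.exp(-tau0 * gauss)

    h = x_max / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 2.0 / math.sqrt(math.pi) * total * h / 3.0


def kh_flux_ratio(tau_h, strength_ratio=2.0):
    """k/h flux ratio: oscillator-strength ratio times escape-probability ratio."""
    tau_k = strength_ratio * tau_h
    return strength_ratio * escape_probability(tau_k) / escape_probability(tau_h)


for tau_h in (1e-4, 1.0, 1e2, 1e4):
    print(f"tau_h = {tau_h:g}  k/h = {kh_flux_ratio(tau_h):.3f}")
```

Under these assumptions the ratio starts at 2 for negligible optical depth and settles close to 1 at large optical depth, where a slowly varying logarithmic factor keeps it from being exactly 1 at any finite depth.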

No.   Flare class   Date and time when raster started   Observation mode   CAD (sec)   FOV   FOV center (arcsec)   OBSID
1 M1.0 2014-06-12T18:44 Medium coarse 8-step raster 21.34 (-670,-306) 3863605329
2 M1.0 2014-10-26T18:52 Large sit-and-stare 16.20 (648,-287) 3864111353
3 M1.0 2014-11-07T09:37 Large coarse 16-step raster 23.34 (-646,224) 3860602088
4 M1.0 2014-10-26T15:31 Large sit-and-stare 5.36 (598,-307) 3880106953
5 M1.1 2014-09-06T11:23 Large sit-and-stare 9 (-709,-298) 3820259253
6 M1.1 2015-08-21T16:01 Medium dense 32-step raster 102 (-467,-336) 3660104044
7 M1.3 2014-02-02T21:08 Very large coarse 64-step raster 2051 (7,-123) 3880012095
8 M1.3 2014-06-12T11:09 Medium coarse 8-step raster 21 (-723,-303) 3863605329
9 M1.3 2014-10-26T18:52 Large sit-and-stare 16 (648,-287) 3864111353
10 M1.4 2015-03-12T05:45 Large sit-and-stare 5 (-185,-190) 3860107053
11 M1.5 2014-02-04T15:30 Large dense 64-step raster 1084 (245,-100) 3880010190
12 M1.5 2014-08-01T17:20 Large dense 64-step raster 2036 (-200,-230) 3800013190
13 M1.6 2015-03-12T05:45 Large sit-and-stare 5 (-185,-190) 3860107053
14 M1.8 2014-02-11T16:34 Very large dense 64-step raster 2049 (-197,-123) 3880012191
15 M1.8 2014-02-12T21:50 Large coarse 8-step raster 42 (140,-90) 3860257280
16 M1.8 2015-03-11T04:46 Large coarse 8-step raster 75 (-430,-194) 3860259280
17 M2.3 2014-11-09T15:17 Large coarse 4-step raster 37 (-217,-205) 3860258971
18 M2.4 2014-10-26T18:52 Large sit-and-stare 16 (648,-286) 3864111353
19 M2.9 2015-08-27T05:37 Large coarse 8-step raster 24 (24,708) 3860605380
20 M3.4 2014-10-27T20:56 Large sit-and-stare 16 (16,779) 3864111353
21 M3.9 2014-06-11T18:19 Medium coarse 8-step raster 21 (-781,-306) 3863605329
22 M6.5 2015-06-22T17:00 Large sparse 16-step raster 33 (72,192) 3660100039
23 M6.6 2014-10-27T20:56 Large sit-and-stare 16 (779,-271) 3864111353
24 M7.0 2014-04-18T12:33 Large sit-and-stare 9 (568,-230) 3820259153
25 M7.1 2014-10-26T18:52 Large sit-and-stare 16 (648,-287) 3864111353
26 M8.7 2014-10-21T18:10 Large sit-and-stare 16 (-359,-316) 3860261353
27 X1.0 2014-10-25T14:58 Large sit-and-stare 5 (408,-319) 3880106953
28 X1.0 2014-03-29T14:09 Very large coarse 8-step raster 72 (490,282) 3860258481
29 X1.6 2014-09-10T11:28 Large sit-and-stare 9 (-137,125) 3860259453
30 X1.6 2014-10-22T08:18 Very large coarse 8-step raster 131 (-292,-303) 3860261381
31 X2.0 2014-10-27T14:04 Large coarse 8-step raster 26 (727,-299) 3860354980
32 X2.1 2015-03-11T15:19 Large coarse 4-step raster 16 (-353,-197) 3860107071
33 X3.1 2014-10-24T20:52 Large sit-and-stare 16 (264,-302) 3860111353
Table 1. Flare list.

ID   Target   Date and time when raster started   Observation mode   CAD (sec)   FOV   FOV center (arcsec)   OBSID
A Quiet Sun 2014-06-07T07:29 Large sit-and-stare 17 (128,-574) 3820011653
B Supersonic downflow 2014-09-10T21:29 Very large sit-and-stare 5 (-71,111) 3800507454
C Brightenings near AR 2014-10-01T11:25 Large coarse 2-step raster 18 (-441,-125) 3860359362
D Brightenings near AR 2014-10-23T21:39 Large sit-and-stare 16 (48,-301) 3860261353
E Pores and flux emergence 2014-11-02T18:48 Large coarse 4-step raster 37 (-206,-327) 3860258971
F Flux emergence region 2014-11-09T02:16 Large sit-and-stare 10 (-292,177) 3860009153
G Small brightenings in spot 2014-11-19T14:08 Very large sit-and-stare 10 (-139,-298) 3860259254
H Jets/tiny flare 2014-11-21T05:09 Very large sit-and-stare 10 (186,-284) 3860259254
I Sunspot with light bridge 2014-12-06T06:03 Large sit-and-stare 5 (725,-342) 3860256053
Table 2. List of non-flaring observations.
Figure 13.— Centroid colors as seen in the online movies.