Data Analysis and Management for High-Resolution Solar Physics

High-Cadence Imaging and Imaging Spectroscopy at the GREGOR Solar Telescope —
A Collaborative Research Environment for High-Resolution Solar Physics

Leibniz Institute for Astrophysics Potsdam (AIP), An der Sternwarte 16, 14482 Potsdam, Germany
Astronomical Institute of the Slovak Academy of Sciences, 05960 Tatranská Lomnica, Slovak Republic
University of Potsdam, Institute of Physics and Astronomy, Karl-Liebknecht-Str. 24/25, 14476 Potsdam, Germany
Center of Excellence in Space Sciences India (CESSI), Indian Institute of Science Education and Research Kolkata, Nadia 741246, West Bengal, India
Received 2017 June 16; revised 2018 January 8; accepted 2018 February 26

In high-resolution solar physics, the volume and complexity of photometric, spectroscopic, and polarimetric ground-based data have increased significantly over the last decade, reaching data acquisition rates of terabytes per hour. This is driven by the desire to capture fast processes on the Sun and by the necessity for short exposure times “freezing” the atmospheric seeing, thus enabling post-facto image restoration. Consequently, large-format and high-cadence detectors are nowadays used in solar observations to facilitate image restoration. Based on our experience during the “early science” phase with the 1.5-meter GREGOR solar telescope (2014–2015) and the subsequent transition to routine observations in 2016, we describe data collection and data management tailored towards image restoration and imaging spectroscopy. We outline our approaches regarding data processing, analysis, and archiving for two of GREGOR’s post-focus instruments, i.e., the GREGOR Fabry-Pérot Interferometer (GFPI) and the newly installed High-Resolution Fast Imager (HiFI). The heterogeneous and complex nature of multi-dimensional data arising from high-resolution solar observations provides an intriguing but also a challenging example for “big data” in astronomy. The big data challenge has two aspects: (1) establishing a workflow for publishing the data for the whole community and beyond and (2) creating a Collaborative Research Environment (CRE), where computationally intense data and post-processing tools are co-located and collaborative work is enabled for scientists of multiple institutes. This requires either collaboration with a data center or frameworks and databases capable of dealing with huge data sets based on Virtual Observatory (VO) and other community standards and procedures.

Astronomical Databases — Sun: photosphere — Sun: chromosphere — methods: data analysis — techniques: image processing — techniques: spectroscopic
journal: ApJS

Corresponding author: Carsten Denker

Carsten Denker (ORCID 0000-0002-7729-6415)

Christoph Kuckein (ORCID 0000-0002-3242-1497)

Meetu Verma (ORCID 0000-0003-1054-766X)

Sergio J. González Manrique (ORCID 0000-0002-6546-5955)

Andrea Diercke (ORCID 0000-0002-9858-0490)

Harry Enke (ORCID 0000-0002-2366-8316)

Jochen Klar (ORCID 0000-0002-5883-4273)

Horst Balthasar (ORCID 0000-0002-4739-1710)

Rohan E. Louis (ORCID 0000-0001-5963-8293)

Ekaterina Dineva (ORCID 0000-0002-4645-4492)

1 Introduction

Challenges posed by “Big Data” certainly became a topic in solar physics with the launch of space missions such as the Solar and Heliospheric Observatory (SoHO, Domingo et al., 1995) and the Solar Dynamics Observatory (SDO, Pesnell et al., 2012). Synoptic full-disk images (visible, UV, and EUV), magnetograms, and Doppler maps are the core data products of both missions. However, SDO pushed the limits of spatial resolution to one second of arc and a cadence of 12 seconds. The result is about 300 million images and a total data volume of more than 3.5 petabytes during the seven-year mission time so far. Dataflow and data processing for SDO are described in Martens et al. (2012) concerning both hard- and software aspects, and in particular the data conditioning for helioseismology and near real-time data products needed in space weather prediction and forecast. Many of these data products are available from the Joint Science Operations Center (JSOC) at Stanford University. Considering the volume of data, various institutes around the world hold partial or full copies of SoHO and SDO data for advanced in-house processing.

The data volume of ground-based synoptic full-disk observations also significantly increased over the years, even though not reaching the magnitude of space data. Both helioseismology and space weather applications require continuous observations, thus telescope networks such as the Global Oscillation Network Group (GONG, Leibacher, 1999) and the Global Hα Network (Denker et al., 1999; Steinegger et al., 2000) are a natural choice to overcome the day-night cycle. Other important synoptic data sets are hosted at the “Digital Library” of the U.S. National Solar Observatory (NSO), e.g., the photospheric and chromospheric vector magnetograms of the Synoptic Optical Long-term Investigations of the Sun (SOLIS, Keller et al., 2003; Henney et al., 2009) program, or at institutional data repositories, e.g., full-disk images of the Chromospheric Telescope (ChroTel, Kentischer et al., 2008) operated by the Kiepenheuer Institute for Solar Physics in Freiburg, Germany, and coronal data of the Coronal Multichannel Polarimeter (CoMP, Tomczyk et al., 2008) and of the COSMO K-coronagraph (Tomczyk et al., 2016) from the Mauna Loa Solar Observatory. These data are typically accessible via FTP archives or can be requested via web-based query forms.

High-spectral-resolution spectroscopy and spectropolarimetry were historically the domain of ground-based telescopes. The Japanese Hinode (Kosugi et al., 2007) space mission with its 50-centimeter Solar Optical Telescope (SOT, Tsuneta et al., 2008) changed this field by providing high-spectral and high-spatial resolution full Stokes polarimetry with moderate temporal resolution but high sensitivity (Ichimoto et al., 2008). Hinode data are publicly available in the Data ARchive and Transmission System (DARTS, Miura et al., 2000) at the Institute of Space and Astronautical Science (ISAS) in Japan and are also mirrored to data centers around the world (Matsuzaki et al., 2007).

Access to high-resolution ground-based data is often difficult because they were obtained in campaigns led by a principal investigator (PI) and his/her team. In addition, poor documentation and offline storage of high-resolution data (tape drives and local hard-disk drives) have hampered the efforts to make them openly accessible. Fortunately, the situation is improving as more and more data holdings adopt the open-access paradigm. For example, high-resolution data from the 1.6-meter Goode Solar Telescope (Cao et al., 2010) at Big Bear Solar Observatory are listed on the observatory website and can be requested via a web interface. In preparation for the next generation of large-aperture solar telescopes, NSO tested a variety of observing modes including proposal-based queue observations executed by experienced scientists and observatory staff. These service-mode data also became publicly available and are accessible via the NSO Digital Library infrastructure.

At the moment, the Daniel K. Inouye Solar Telescope (DKIST, McMullin et al., 2014; Tritschler et al., 2016) is under construction with first light anticipated in 2020, and the European Solar Telescope (EST, Collados et al., 2010a, b) completed its preliminary design phase and was included in the 2016 roadmap of the European Strategy Forum for Research Infrastructures (ESFRI). Concepts for the exploitation of DKIST, with special emphasis on the “Big Data” challenge were presented in Berukoff et al. (2016). The current generation of 1-meter-class solar telescopes like GREGOR (Schmidt et al., 2012) and the Goode Solar Telescope can be considered as stepping stones to unveil the fundamental spatial scales of the solar atmosphere, i.e., the pressure scale height, the photon mean free path, and the elementary magnetic structure size. All scales are of the order of 100 kilometers or even smaller. This provides the impetus for large-aperture solar telescopes and high-resolution solar physics, i.e., to approach temporal and spatial scales that are otherwise only accessible with numerical radiative MHD simulations (e.g., Vögler et al., 2005; Rempel & Cheung, 2014; Beeck et al., 2015).

In this article, we introduce high-resolution ground-based imaging (spectroscopic) data obtained with the 1.5-meter GREGOR solar telescope and describe our approaches to data processing, analysis, management, and archiving. We show by example how the dichotomy of ground-based vs. space-mission data and synoptic vs. high-resolution data affects these approaches. In addition, we present the collaborative research environment (CRE) for GREGOR data as a concept tailored towards the needs of the high-resolution solar physics community. This is complementary to research infrastructures such as the Virtual Solar Observatory (VSO, Hill et al., 2004), which was established to allow easy access to solar data from various space missions as well as from ground-based observatories. The VSO reduces the effort of locating and downloading different data in different archives by providing a common web-based interface. The VSO design minimizes the hard- and software resources of federated archives and often simplifies integration of new data. The VSO does not store any data but integrates federated archives by offering a central registry and interfaces for distributed queries to multiple independent data repositories. An alternative access to solar data is provided by the SolarSoft (Bentley & Freeland, 1998) library, which is written mainly in the Interactive Data Language (IDL). Virtual observatory implementations in solar physics also exist at the European level with the European Grid of Solar Observations (Bentley, 2002) and more recently with the SOLARNET Virtual Observatory (SVO). Currently, only an SVO prototype is available. However, data can be searched based on events, data set specific parameters, and co-temporal observations.

In the following, we provide a comprehensive overview of GREGOR high-resolution data – from the photons arriving at the detector to the final data products. In Sect. 2, we describe the telescope, its post-focus instruments, and the particulars of high-cadence imaging. The data processing pipeline, its design, and its relation to other software libraries are introduced in Sect. 3. Data management and the access to the GREGOR GFPI and HiFI data (Sect. 4) comprise the domain-specific answers to the challenges provided by “Big Data” in solar and stellar astronomy. Finally, the conclusions in Sect. 5 develop a perspective for CREs in high-resolution solar physics and explore future extensions allowing database research.

2 GREGOR Solar Telescope and Instrumentation

2.1 GREGOR Solar Telescope

The 1.5-meter GREGOR solar telescope is the largest telescope in Europe for high-resolution solar observations (see Soltau et al. (2012) for the origin of the telescope’s name and why it is formatted in capital letters). Located at Observatorio del Teide, Izaña, Tenerife, Spain, the telescope exploits the excellent and stable seeing conditions of a mountain-island observatory site. The concept for the telescope’s mechanical structure (Volkmer et al., 2012) and the open design employing a foldable-tent dome (Hammerschlag et al., 2012) allow for wind flushing of the telescope platform, which minimizes dome and telescope seeing. The GREGOR telescope uses a double Gregory configuration (Soltau et al., 2012) to limit the field-of-view (FOV) to a diameter of 150″, to facilitate on-axis polarimetric calibrations in the symmetric light path, and to provide a suitable f-ratio to the GREGOR Adaptive Optics System (GAOS, Berkefeld et al., 2012) and the four post-focus instruments: Broad-Band Imager (BBI, von der Lühe et al., 2012), GREGOR Infrared Spectrograph (GRIS, Collados et al., 2012), GREGOR Fabry-Pérot Interferometer (GFPI, Denker et al., 2010; Puschmann et al., 2012, and references therein), and High-Resolution Fast Imager (HiFI, Denker et al., 2018b). In the present study, we discuss data processing, analysis, management, and archiving for the last two instruments.

2.2 GREGOR Fabry-Pérot Interferometer

The GFPI is a tunable, dual-etalon imaging spectrometer, where the etalons are placed near a conjugated pupil plane in the collimated beam. The etalons were manufactured by IC Optical Systems (ICOS) in Beckenham, UK. They have a 70-millimeter diameter free aperture and possess both high finesse ( – 50) and high reflectivity (%). Their coatings are optimized for the spectral range 5300 – 8600 Å, where the instrument achieves a spectral resolution of . Spectral scans are recorded with two precisely synchronized Imager QE CCD cameras from LaVision in Göttingen, Germany. One camera acquires narrow-band filtergrams, whereas the simultaneous broad-band images enable image restoration of the full spectral scan using various deconvolution techniques.

The 12-bit images with pixels are captured at a rate of up to 10 Hz for full frames depending on exposure time. The image scale of about 0.04″ pixel⁻¹ yields a FOV of . The image acquisition rate can be doubled with 2 × 2-pixel binning. The relatively small full-well capacity of 18 000 e⁻ and low readout noise of the detectors are well adapted to the instrument design considering the small full-width-at-half-maximum (FWHM) of the double-etalon spectrometer of just 25 – 40 mÅ, the maximum quantum efficiency of 60% at 5500 Å, and the short exposure times ( – 30 ms) needed to “freeze” the seeing-induced wavefront distortions. The typical cadence of  – 60 s for a spectral scan provides very good temporal resolution so that dynamic processes in the solar photosphere and chromosphere can be resolved.

However, both exposure time and cadence are already compromises because the high spectral resolution leads to a low number of incident photons at the detector, and small-scale features potentially move or evolve significantly in the given time interval. Faster detectors combining low readout noise, comparatively small full-well capacity, good quantum efficiency, and a high duty cycle with respect to the exposure time mitigate these limitations and are considered for future upgrades. In principle, the GFPI can be operated in a polarimetric mode, which produces spectral scans of the four Stokes parameters so that the magnetic field vector can be inferred for each pixel. However, validation of the polarimetric mode is still in progress. Consequently, imaging polarimetry is not covered in this study.

Figure 1: Imaging spectroscopy of active region NOAA 12139 observed on 2014 August 14 with the GFPI in the strong chromospheric absorption line Hα 6562.8 Å: broad-band image (top-left) near the Hα spectral region, line-core intensity image (bottom-left), blue line-wing image at  Å (top-middle), red line-wing image at  Å (bottom-middle), and chromospheric Doppler velocity derived with the Fourier phase method (top-right), where blue and red colors represent up- and downflows, respectively. Samples of spectral profiles (bottom-right) are shown for areas ‘a’ (black), ‘b’ (red), quiet-Sun ‘QS’ (blue), and Fourier Transform Spectrometer (FTS, Wallace et al., 1998) spectral atlas (green). The bullets on the quiet-Sun profile denote the positions of the three displayed filtergrams.

Sample data products obtained from a scan of the strong chromospheric absorption line Hα 6562.8 Å are compiled in Fig. 1 to illustrate the GFPI’s science capabilities. Ellerman bombs (see Rutten et al., 2013, for a review), as opposed to micro-flares, are typically observed as enhanced line-wing emission in Hα, Hβ, Hγ, etc. and have typical lifetimes of 1.5 – 7 min with a maximum of around 30 min (Pariat et al., 2007). Resolving their evolution requires fast spectral scans with imaging spectroscopy and post-facto image restoration to uncover small-scale dynamics within the photosphere and chromosphere. On 2014 August 14, active region NOAA 12139 was observed with the GFPI focusing on a complex group of sunspots and pores (see broad-band image in Fig. 1). Image restoration using Multi-Object Multi-Frame Blind Deconvolution (MOMFBD, Löfdahl, 2002; van Noort et al., 2005) was applied to the spectral data to enhance solar fine-structure. The blue and red line-wing filtergrams reveal two features with different spectral characteristics: an “Ellerman bomb” (Ellerman, 1917) visible as localized, small-scale brightenings (area ‘a’) and a system of dark fibrils with strong downflows associated with newly emerging flux (area ‘b’), respectively.

High-spatial resolution imaging and imaging spectroscopy belong to the standard observational techniques of large-aperture solar telescopes. Various studies based on GFPI data illustrate the potential of the instrument. Recently, Kuckein et al. (2017b) used GFPI Ca ii 8542.1 Å filtergrams to study sudden chromospheric small-scale brightenings. The combination of ground-based, high-resolution imaging spectroscopy and synoptic EUV full-disk images from space reveals that the brightenings belong to the footpoints of a micro-flare. To further investigate the bright kernels and the central absorption part (below the flaring arches), spectral inversions of the near-infrared Ca ii line were performed with the NICOLE code (Socas-Navarro et al., 2015). The retrieved average temperatures reveal rapid heating of about 600 K at the brightenings (footpoints) of the micro-flare. The inferred line-of-sight (LOS) velocities at the central absorption area show upflows of about  km s⁻¹. In contrast, downflows dominate at the other footpoints.

In another study, Verma et al. (2018) used high-resolution imaging and spectroscopic GFPI data in the photospheric Fe i 6173.3 Å spectral line to infer the three-dimensional velocity field associated with a decaying sunspot penumbra (Fig. 2). The velocities in the decaying penumbral region deviate from the usual penumbral flow pattern because of flux emergence in the vicinity of the sunspot. The detailed analysis is based not only on GFPI data, but includes HiFI and GRIS observations, which provide further photospheric and chromospheric diagnostics. In a study, with a similar set-up for GREGOR multi-wavelength and multi-instrument observations, Felipe et al. (2017) investigated the impact of flare-ejected plasma on sunspot fine-structure, i.e., a strong light-bridge, which experienced localized heating and changes of its magnetic field structure.

2.3 High-Resolution Fast Imager

Detectors based on scientific Complementary Metal-Oxide-Semiconductor (sCMOS) technology have become an alternative to standard CCD devices in astronomical applications, in particular when high-cadence and large-format image sequences are needed. In early 2016, we installed HiFI at the GREGOR solar telescope, where it observes the blue part of the spectrum (3850 – 5300 Å) using a dichroic beamsplitter in the GFPI’s optical path. Two Imager sCMOS cameras from LaVision (LaVision, 2015) are synchronized by a programmable timing unit and record time-series suitable for image restoration either separately for each channel or making use of both channels at the same time.

Figure 2: Three-dimensional flow field observed in active region NOAA 12597 on 2016 September 24. The horizontal flows were derived from a time-series of restored GFPI broad-band images at 6122.7 Å, whereas the Doppler velocities were determined from a restored narrow-band scan of the photospheric Fe i 6173.3 Å line. Color-coded local correlation tracking (LCT, November, 1989; Verma & Denker, 2011) vectors are superposed onto the average LOS velocity map, which was scaled between  km s⁻¹. The black and white contours delineate the umbra-penumbra and penumbra-granulation boundaries, respectively.

In typical observations, two of three spectral regions are selected, i.e., Ca ii H 3968.0 Å, Fraunhofer G-band 4307.0 Å, and blue continuum 4505.5 Å (see Fig. 3). The width of the filters is around 10 Å, so that typical exposure times range from a fraction of a millisecond to a few milliseconds. Thus, observations are not “photon-starved” as often encountered in imaging spectropolarimetry, where the transmission profile of (multiple) Fabry-Pérot etalons can be as narrow as 25 mÅ. Different count rates in both channels are balanced by choosing suitable neutral density filters. Short exposure times are essential to freeze the wavefront aberrations in a single exposure. The 2560 × 2160-pixel images with a FOV of about 64″ × 54″ are digitized as 16-bit integers and recorded with a data acquisition rate of almost 50 Hz. Thus, the image scale is about 0.025″ pixel⁻¹ or about 18 km on the solar surface at disk center. The diffraction-limited resolution of the GREGOR telescope with a diameter D = 1.5 m at a wavelength λ = 3968 Å is λ/D ≈ 0.055″. According to the Nyquist sampling theorem, HiFI images are critically sampled at the shortest wavelength of the standard interference filters.
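The sampling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the resolution criterion λ/D (without the Rayleigh factor 1.22) and taking the Ca ii H filter at 3968 Å as the shortest standard wavelength; the function names are hypothetical and not part of sTools.

```python
import math

# Conversion factor from radians to seconds of arc (about 206 265).
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def diffraction_limit_arcsec(wavelength_angstrom, aperture_m):
    """Angular resolution lambda / D in seconds of arc (assumed criterion)."""
    wavelength_m = wavelength_angstrom * 1.0e-10
    return wavelength_m / aperture_m * ARCSEC_PER_RADIAN


def is_critically_sampled(pixel_scale_arcsec, resolution_arcsec):
    """Nyquist criterion: at least two pixels per resolution element."""
    return 2.0 * pixel_scale_arcsec <= resolution_arcsec


# Values quoted in the text: 1.5-meter aperture, 0.025 arcsec per pixel.
# Assumption: 3968 Angstrom (Ca II H) is the shortest filter wavelength.
resolution = diffraction_limit_arcsec(3968.0, 1.5)
sampled = is_critically_sampled(0.025, resolution)
print(f"lambda/D = {resolution:.3f} arcsec, Nyquist sampled: {sampled}")
```

With these inputs, two pixels span 0.05″, slightly less than λ/D, which is consistent with the statement that the images are critically sampled at the blue end of the HiFI wavelength range.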

Figure 3: Blue continuum image 4505.5 Å of active region NOAA 12529 obtained with HiFI at 08:37 UT on 2016 April 11. The image was restored from a time-series of 100 short-exposure images with the speckle masking method implemented in KISIP.

In the standard HiFI observing mode, sets of 500 images are captured in each channel in 10 s and continuously written to a RAID-0 array of SSDs at a cadence below 20 s. The imaging system has achieved a sustained write speed summed over both channels of up to 660 MB s⁻¹. To limit the final data storage requirements, only the best images of a set are kept for image restoration and further data analysis. These settings are already a compromise considering that solar features move with velocities of several kilometers per second in the photosphere and several tens of kilometers per second in the chromosphere, often exceeding the speed of sound. Eruptive phenomena in the chromosphere reach even higher velocities in excess of 100 km s⁻¹. Thus, in the standard observing mode, solar features moving at more than 2 km s⁻¹ and traversing a pixel in less than about 10 s will be blurred, and their proper motions will not be properly resolved. However, considering that HiFI provides mainly context data and that only some pixels are exposed to high velocities, this compromise is acceptable. If higher temporal resolution is required, for example when tracking small-scale bright points or following the evolution of explosive events, changing the observing mode is an option, i.e., reading out smaller sub-fields to increase the data acquisition rate or dropping frame selection to keep all observed frames.
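The blurring threshold quoted above follows from the pixel size and the length of the acquisition window. A minimal sketch of that estimate, using the 18 km pixel size and the 10 s window from the text (the function names are hypothetical):

```python
def pixel_crossing_time_s(pixel_size_km, speed_km_s):
    """Time a feature needs to traverse one pixel at constant speed."""
    return pixel_size_km / speed_km_s


def is_blurred(pixel_size_km, speed_km_s, window_s):
    """A feature is smeared if it crosses a pixel within the acquisition window."""
    return pixel_crossing_time_s(pixel_size_km, speed_km_s) < window_s


PIXEL_SIZE_KM = 18.0   # about 0.025 arcsec at disk center
WINDOW_S = 10.0        # one set of 500 images

for speed in (1.0, 2.0, 100.0):  # km/s
    crossing = pixel_crossing_time_s(PIXEL_SIZE_KM, speed)
    print(f"{speed:6.1f} km/s: crossing time {crossing:6.2f} s, "
          f"blurred: {is_blurred(PIXEL_SIZE_KM, speed, WINDOW_S)}")
```

A feature moving at 2 km s⁻¹ crosses a pixel in 9 s, just inside the 10 s window, whereas an eruptive feature at 100 km s⁻¹ needs less than 0.2 s.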

Figure 4: Temporal evolution of the seeing conditions at the GREGOR solar telescope on 2017 March 25. Three series of 50 000 images were captured at 135 Hz (top), and the corresponding MFGS values were computed and are plotted for G-band images. The gray rectangle covers a 2-minute period, which is depicted at higher resolution (bottom-left). A 10-second period at even higher temporal resolution highlights the frame selection process (bottom-right). The best images, which are used for image restoration, are marked by gray vertical lines.

2.4 Implications of High-Cadence Imaging

As demonstrated above, many instrument designs in solar physics rely on high-cadence imaging. The data challenge arises from the combination of several factors: the daytime correlation time-scale of the seeing, the evolution time-scale of solar features, and large-format detectors. The latter are needed to either catch transient events or to observe large-scale, coherent features that provide structuring and connectivity in the solar atmosphere. To illustrate the implications, we carried out a small experiment with the HiFI cameras. Images were acquired at 135 Hz for only a small FOV of pixels, i.e., on the solar disk. Three sets of 50 000 images were written to disk with short interruptions at a sustained rate of 220 MB s⁻¹. An extended study based on an even higher image acquisition rate is presented in Denker et al. (2018a), where we evaluated image quality metrics and the impact of frame selection for AO-corrected images on image restoration with the speckle masking technique (Weigelt & Wirnitzer, 1983; von der Lühe, 1993; de Boer, 1993).

Figure 5: Decaying pore with light-bridge observed in active region NOAA 12643 on 2017 March 25. All G-band images were restored with MFBD with the exception of the best raw image (bottom-left). The best images from a 10-second time interval were used for the restored image, which is based on restored Zernike modes. The rms-intensity contrast and the MFGS value are used as image quality metrics. The images are scaled individually between minimum and maximum intensity, and the FOV is .

The median filter gradient similarity (MFGS, Deng et al., 2015) is an image quality metric recently introduced into solar physics. The results for G-band images of the full time-series are depicted in the top panel of Fig. 4, whereas the two lower panels show successively shorter time periods centered around the moment of best seeing conditions. The AO system locked on a decaying pore in active region NOAA 12643, which contained a light-bridge, umbral dots, and indications of penumbra-like small-scale features, i.e., elongated, alternating bright and dark features at the umbra-granulation boundary as well as chains of filigree in the neighboring quiet Sun, which point radially away from the pore’s center (Fig. 5). In the standard HiFI observing mode, images are selected within a 10-second period as indicated by the gray vertical bars in the lower-right panel of Fig. 4. This observing mode relies on a special implementation of frame selection (Scharmer, 1989; Scharmer & Löfdahl, 1991; Kitai et al., 1997), where not just the best solar image is selected but a set of high-quality images is chosen for image restoration.
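Frame selection of this kind, keeping a set of the highest-ranked images rather than a single best frame, can be sketched as follows. This is an illustration only: it uses the rms-intensity contrast as a stand-in quality metric instead of MFGS, represents images as nested lists, and all function names are hypothetical rather than taken from sTools.

```python
from statistics import mean, pstdev


def rms_contrast(image):
    """rms-intensity contrast: standard deviation divided by mean intensity."""
    pixels = [value for row in image for value in row]
    return pstdev(pixels) / mean(pixels)


def select_best_frames(frames, n_best, metric=rms_contrast):
    """Return the indices of the n_best frames ranked by the quality metric.

    The indices are returned in chronological order so that the selected
    frames can be passed on to image restoration as a time-ordered set.
    """
    ranked = sorted(range(len(frames)),
                    key=lambda index: metric(frames[index]),
                    reverse=True)
    return sorted(ranked[:n_best])


# Four tiny synthetic "frames" with increasing or decreasing contrast.
frames = [
    [[10, 10], [10, 10]],   # flat image, contrast 0.0
    [[5, 15], [5, 15]],     # contrast 0.5
    [[8, 12], [8, 12]],     # contrast 0.2
    [[1, 19], [1, 19]],     # contrast 0.9
]
best = select_best_frames(frames, n_best=2)
print("selected frame indices:", best)
```

Passing the metric as a parameter keeps the selection logic independent of the metric, so a different measure such as MFGS could be substituted without changing the ranking code.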

Obviously, even at this very high cadence, the image quality metrics still show strong variations, despite some clustering of the best images in Fig. 4, indicating that seeing fluctuations occur at even higher frequencies. Considering what lies ahead for high-resolution imaging, the data acquisition rates for the next generation of -pixel detectors recording at 100 Hz will amount to about 3 GB s⁻¹. This exceeds today’s typical data acquisition rates of less than 1 GB s⁻¹ and demonstrates that the data challenge will persist for years to come, in particular for the next generation of large-aperture solar telescopes such as DKIST and EST. In Denker et al. (2018a) we demonstrated that an image acquisition rate of  Hz is an appropriate choice, considering the marginal benefits in quality for the frame-selected images with respect to the demands on camera detectors, network bandwidth, data storage, and computing power.
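The order of magnitude of these data rates can be estimated from the detector format, the digitization depth, and the frame rate. The sketch below uses the HiFI values given earlier (2560 × 2160 pixels, 16-bit, about 50 Hz, one camera) and, purely as an assumption for illustration, a 4096 × 4096-pixel detector with 16-bit digitization for the next generation; the function name is hypothetical.

```python
def data_rate_gb_s(width, height, bytes_per_pixel, frame_rate_hz, n_cameras=1):
    """Raw data acquisition rate in gigabytes (10^9 bytes) per second."""
    bytes_per_second = width * height * bytes_per_pixel * frame_rate_hz * n_cameras
    return bytes_per_second / 1.0e9


# One HiFI camera while recording: 2560 x 2160 pixels, 16-bit, about 50 Hz.
hifi_rate = data_rate_gb_s(2560, 2160, 2, 50)

# Assumed next-generation detector: 4096 x 4096 pixels, 16-bit, 100 Hz.
next_gen_rate = data_rate_gb_s(4096, 4096, 2, 100)

print(f"HiFI, single camera: {hifi_rate:.2f} GB/s")
print(f"assumed 4k x 4k detector at 100 Hz: {next_gen_rate:.2f} GB/s")
```

Under these assumptions the estimate is roughly 3.4 GB s⁻¹ per camera, the same order as the figure quoted above, compared with about 0.55 GB s⁻¹ for a single HiFI camera during a burst.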

Multi-frame blind deconvolution (MFBD, Löfdahl, 2002) is another commonly used image restoration technique in solar physics. Figure 5 compiles some results based on the best set of G-band images. In Denker et al. (2018a), we established a benchmark for G-band images, i.e., an MFGS value of , where image restoration with the speckle masking technique becomes possible. Thus, the seeing conditions were only moderate to good on 2017 March 25. The aforementioned threshold was determined for the full FOV of the sCMOS detector, whereas the present observations used a much smaller FOV covering the immediate neighborhood of the AO lock point. The top row of Fig. 5 demonstrates that even compared to telescopes with 0.7 – 1 meter apertures, data obtained with larger 1.5-meter-class telescopes require a significant increase in the number of Zernike modes () in the restoration process. This significantly increases the computation time and poses challenges for imaging with 4-meter-class solar telescopes. MFBD has an advantage over speckle masking because it requires a smaller number of images for a restoration, in particular, when the images are of high quality taken under very good or excellent seeing conditions. Thus, restored time-series with higher cadence become possible. Already, selected images deliver good restorations. Note that MFGS loses its discriminatory power for restored images so that other metrics like the rms-intensity contrast become more important. A likely explanation is that the MFGS metric is sensitive to the fine-structure contents of an image, which is mainly encoded in the phases of the Fourier-transformed image. Thus, once almost diffraction-limited information is recovered, the MFGS metric reaches a plateau. The image contrast on the other hand is more closely related to the Fourier amplitudes.

3 sTools Data Pipeline

The “Optical Solar Physics” research group at AIP operates two of GREGOR’s facility instruments, GFPI and HiFI. The software package “sTools” (see Kuckein et al., 2017a, for a brief introduction) provides, among other features, the data processing pipeline for these two instruments. Major parts of the software were developed from 2013 – 2017 within the SOLARNET project, which is a “Research Infrastructures for High-Resolution Solar Physics” program following the Integrated Infrastructure Initiative (I3) model supported by the European Commission’s FP7 Capacities Program. The software package was written from scratch but builds on the code development for and experiences gained with data reduction tools for the Göttingen Fabry-Pérot Interferometer (Bendlin et al., 1992; Puschmann et al., 2006; Bello González & Kneer, 2008) and the Interferometric BIdimensional Spectropolarimeter (IBIS, Cavallini, 2006).

sTools is mainly written in IDL and utilizes other IDL libraries with robust and already validated programs whenever available. The SolarSoftWare (SSW) system (Bentley & Freeland, 1998; Freeland & Handy, 1998), for example, offers instrument-specific software libraries and utilities for ground-based instruments and space missions, which include database access and powerful string processing for metadata. The MPFIT (Markwardt, 2009) package is primarily used for spectral line fitting, the Coyote Library for image processing, graphics, and data I/O (Fanning, 2011), and the NASA IDL Astronomy User’s Library for reading and writing data in the Flexible Image Transport System (FITS, Wells et al., 1981; Hanisch et al., 2001) format. Data from imaging spectropolarimetry benefit especially from the possibility of writing FITS image extensions with individual headers (Ponz et al., 1994), i.e., polarization state, wavelength position, and calibration information are saved along with the corresponding filtergram. The result is a compact (a few gigabytes), self-describing data set, which can serve as input for various spectral analysis and inversion codes.

Computationally intense applications, in particular image restoration, make use of parallel computing implemented in other programming languages. Here, the IDL programs typically condition the input data and collect the output data for further processing. Currently, sTools provides interfaces for the Kiepenheuer-Institute Speckle Interferometry Program (KISIP, Wöger & von der Lühe, 2008) and MOMFBD. Recently, we started making use of the IDL built-in functions for parallel processing, so that other time consuming parts of the data processing are also more efficiently implemented.

All newly developed IDL routines use the prefix stools_ to avoid name space collisions with other libraries. No sub-folders exist within sTools, with the exception of documentation and individual IDL routines from external sources, which are not part of the aforementioned libraries. Instrument-specific and functional dependencies are declared in the naming schema for routines, e.g., stools_gfpi_ for GFPI data processing or stools_html_ for creating summary webpages for the observed data sets. In general, we aim to separate instrument-specific code from multi-purpose routines, which can be shared among applications. The majority of the sTools programs are independent of the computer hardware and the site where the software is installed. Site specifications are encapsulated in structures, which are defined in specific configuration routines (stools_cfg_). The same applies to expert knowledge, e.g., about spectral lines, filter properties, camera settings, telescope details, etc., which is collected in configuration routines that return structures with the required parameters based on tag names. For example, many fitting routines rely on information regarding the width of the spectral line and the spectral sampling. Based on the configuration information, the most suitable and validated fit parameters are chosen. The configuration routines are the only programs that may have to be adapted for specific sites, or if new observing modes are carried out with different filters and spectral lines.
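The configuration pattern described above can be illustrated with a minimal sketch. It is written in Python rather than IDL purely for illustration and is not taken from sTools: the tag names (line_a, line_b), the parameter names, and all numerical values are hypothetical placeholders.

```python
# Minimal sketch of a tag-based configuration lookup.
# All tags, keys, and numbers are hypothetical placeholders, not sTools values.
SPECTRAL_LINES = {
    "line_a": {"line_width_pm": 10.0, "sampling_pm": 2.0},
    "line_b": {"line_width_pm": 40.0, "sampling_pm": 5.0},
}


def cfg_spectral_line(tag):
    """Return a copy of the parameter structure for the given tag name."""
    try:
        return dict(SPECTRAL_LINES[tag])
    except KeyError:
        raise KeyError(f"unknown spectral line tag: {tag!r}") from None


def fit_window_points(tag, widths=3.0):
    """Derive the number of wavelength points used in a fit.

    The window spans `widths` line widths and is converted to sampling
    points using the configured spectral sampling.
    """
    cfg = cfg_spectral_line(tag)
    return int(round(widths * cfg["line_width_pm"] / cfg["sampling_pm"]))
```

With this separation, a fitting routine only asks for a tag name, and adapting to a new filter or spectral line means editing the configuration table rather than the fitting code.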

While writing the sTools routines, we placed special emphasis on proper documentation using standard IDL headers, which can be extracted with the doc_library procedure, on meaningful inline comments, on descriptive variable names, and on a consistent programming style and formatting. Testing routines for specific data processing steps is typically performed for several data sets with different observing characteristics. This ensures that new calibration steps or updated procedures do not lead to unintended consequences. In some cases, when implementing new observing set-ups for scanning multiple lines or using non-equidistant line sampling, major conceptual changes of the code became necessary. These test data are used to ensure that previously reduced and calibrated data are unaffected by the aforementioned changes. If changes affect already calibrated data, the data will be reprocessed. The version of sTools is tracked so that, if needed, data calibrated with older versions can be recovered. These changes are documented on the project website and major updates will be accompanied by data release publications in scientific journals. Initial data calibration for one data set takes about one day of computation time on a single processor. On the other hand, reprocessing all GFPI quick-look data, which utilizes the already calibrated data, takes just a few days. The latter data serve as a benchmark to validate the performance of new or updated programs. The sTools data pipeline is currently installed at AIP and at the German solar telescopes on Tenerife. The source code is maintained internally on an Apache Subversion (SVN) version control system. The latest released version can be downloaded by registered users as a tarball from the GREGOR webpages at AIP.

4 Data Management and Data Archive

4.1 Point of Departure

High-resolution solar observations are taken at 1-meter-class, ground-based solar telescopes around the world and with Hinode/SOT from space, where the latter provides in many respects (i.e., imaging and spectropolarimetry) the closest match to HiFI and GFPI data. Consequently, scientists, who work with such high-resolution data, are among the primary target groups for utilizing GFPI and HiFI data. In addition, we especially foresee close interactions with the solar physics community on spectral inversion codes and numerical modelling of the highly dynamic processes on the Sun. Therefore, our immediate priority is providing a CRE fostering collaborations among researchers with common interests. Our goal is to raise awareness of and stimulate interest in complex and heterogeneous spectropolarimetric data sets, which are inherent trademarks of imaging spectropolarimeters with a broad spectrum of user-defined observing sequences. Offering a data repository to the whole solar physics community or the general public at large will enhance the impact of these high-resolution data products. However, in this case, it may be advantageous to highlight key data sets obtained under the best seeing conditions or of particularly interesting events like solar flares. In any case, data access, as described in the data policy (Sect. 4.2), is granted to everyone who registers for the GREGOR GFPI and HiFI data archive. At a later stage, when the high-level data products have matured, a more differentiated access will be instantiated, dropping the registration requirement for the publicly available data.

The data specific challenges for GFPI and HiFI are:

  • The campaign- and PI-oriented nature of the data, which results in different combinations of post-focus instruments with changing set-up parameters, e.g., selection of diverse spectral lines, dissimilar spectral and spatial sampling, and different cadences.

  • Observations through Earth’s turbulent atmosphere, which brings about data with fast-changing quality and necessitates image restoration, where different restoration algorithms introduce multiplicity in the major data levels.

  • The complexity of data sets, which requires considerable efforts to condition the data products for a broader user base – even at the level of quick-look data.

  • The availability of personnel and financial resources for maintaining long-term access to data and for quality assurance beyond the typical funding cycle of third-party financial support.

Fortunately, AIP’s mission includes the development of research technology and e-infrastructure as a strategic goal. Thus, the collaboration between E-Science and Solar Physics allowed us to develop a tightly matched solution to the aforementioned data specific challenges.

4.2 Data Policy

The GREGOR consortium (i.e., Kiepenheuer Institute for Solar Physics, Max Planck Institute for Solar System Research, and Leibniz Institute for Astrophysics Potsdam) and GREGOR partners (i.e., Instituto de Astrofísica de Canarias and Astronomical Institute of the Academy of Sciences of the Czech Republic) agreed that in principle all data shall be publicly available. In the following, we use the term “consortium” when referring to GREGOR members and partners.

Since observing proposals contain proprietary information and original ideas of the PI and her/his team, the data of PI-led observing campaigns will be embargoed for one year. The embargo period can be extended upon request for another year when observations are related to PhD theses. Quick-look data are publicly available after storage at the GREGOR GFPI and HiFI archive and subsequent processing, typically after 4 – 6 weeks, and these data are not subject to the embargo period. This way, the PI of an observing campaign can be contacted to inquire about potential collaborations even within the embargo period. The consortium encourages, but does not require, collaboration with the originator of the data. All work based on GREGOR data is required to include an acknowledgement (see below), and the consortium asks the authors to include appropriate citations to the GREGOR reference articles published in 2012 as a special issue of Astronomische Nachrichten (Vol. 333/9). A detailed version of the data policy is publicly available on the GREGOR webpages at AIP.

4.3 GREGOR GFPI and HiFI Data Archive

The archive is based on the Daiquiri framework, which was developed by AIP’s R&D group “Supercomputing and E-Science”. Daiquiri is designed for creating highly customized web applications for data publication in astronomy. The features of Daiquiri comprise rich tools for user management, an SQL query interface to enable users to directly enter database queries via the webpage, means to download the results of queries in different formats following standards of the International Virtual Observatory Alliance (IVOA, Quinn et al., 2004), and plotting functions. The interaction with Daiquiri can be scripted using its VO Universal Worker Service (UWS, Harrison & Rixon, 2016) interface.

For the GREGOR archive, we use Daiquiri’s user management to implement a custom user-registration workflow. New users first register on the GREGOR data portal, and they do not have to belong to the GREGOR consortium. Therefore, registration is possible for anyone with an interest in GREGOR high-resolution data. To emphasize the intended collaborative nature of this research infrastructure, we use in the following the terms “GREGOR collaboration” or “collaborators” collectively for all registered users. After the registration, one of the project managers confirms the new user, again through the GREGOR portal. Only after this organizational confirmation do members of the technical staff activate the user. This procedure is necessary because users receive CRE privileges, allowing them to process data on institute-owned computers and to edit webpages and blog entries.

Along with their accounts for the web portal, users also obtain Secure Shell (SSH) login credentials for the data access node (Sect. 4.4). The SSH protocol facilitates efficient and fast distribution of the data products, while ensuring modern security and maintainability (in particular firewall configuration), compared to the popular but insecure File Transfer Protocol (FTP). On the data access nodes, an elaborate permissions system ensures access restrictions and data security. This is implemented using Linux Access Control Lists (ACLs), which extend the usual file permissions (user/group/world) common on UNIX systems. Users are organized in groups, which gain write permissions to certain sub-directories of the GREGOR archive. These ACL directories, which typically comprise data for a specific instrument, data processing level, and observing day, can have multiple groups, allowing for fine-grained access by the registered users (e.g., when accessing embargoed data) as well as the GREGOR archive administrators.
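As a rough illustration of group-based ACLs on an archive directory, the following Python sketch assembles the argument list for the standard Linux setfacl utility. The directory path and group names are hypothetical, and the sketch does not reflect the actual permission layout of the GREGOR archive; it only shows how several groups can be attached to one directory.

```python
def build_setfacl_command(directory, group_permissions):
    """Return the argument list for a recursive `setfacl -m` call.

    `group_permissions` maps a group name to a permission string such as
    "rwx" or "r-x". Groups are sorted so that the output is reproducible.
    """
    entries = ",".join(
        f"g:{group}:{perms}"
        for group, perms in sorted(group_permissions.items())
    )
    # -R applies the entries recursively, -m modifies the existing ACL.
    return ["setfacl", "-R", "-m", entries, directory]


# Hypothetical example: one directory per instrument, level, and observing day.
command = build_setfacl_command(
    "/archive/gfpi/level1/20140718",
    {"gregor_admin": "rwx", "campaign_pi": "r-x"},
)
```

The resulting list could be passed to subprocess.run on a system where setfacl is installed; building the command separately makes it easy to review before it is executed.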

Besides the user-registration workflow, Daiquiri is also used to set up the GREGOR webpages, some with access restrictions, some public. This includes the data products generated by the sTools data processing pipeline (see Sect. 3). Access to the data was initially limited to the GREGOR consortium but is now open to all registered users of the GREGOR GFPI and HiFI data archive. The data sets generated by sTools are currently in the process of being converted to FITS files containing image extensions. This reduces the level of complexity, as metadata are available and many images of a spectral scan or in a time-series are already aggregated. Once converted to FITS format, these data will be integrated into an SQL database and can be accessed using Daiquiri’s query functionality.

4.4 Collaborative Research Environment

The large number and high resolution of images and spectra, as well as the computational effort for their post-processing, demand capable and efficient structures for storage and data management. To make best use of GREGOR data and to encourage their usage, we implemented a dedicated CRE at AIP. This research infrastructure acted initially as a central hub for storage and processing of different data products as well as their distribution within the GREGOR consortium, but it is now open to all interested scientists. The CRE provides data space and data access with different levels of authorization, in addition to computational resources and customized tools for analysis and processing. Participation in the CRE is managed by the GREGOR consortium lead for GFPI and HiFI. Finally, collaborators have the option via the CRE to publish selected and curated “science-ready” data for the solar community, including a minted DOI registered with

Over the last decade, AIP provided similar CREs for several projects. For collaborations working on simulations of cosmological structure formation such as Constrained Local UniversE Simulations (CLUES, Gottlöber et al., 2010) and MultiDark (Riebe et al., 2013), AIP hosts hundreds of terabytes of file storage and a relational database with about 100 TB of carefully curated particle information, halo catalogs, and results of semi-analytical galaxy models. These data products are available via the database provided by the E-Science group at AIP. Observational collaborations, e.g., the Multi Unit Spectroscopic Explorer (MUSE, Bacon et al., 2010; Weilbacher et al., 2014) collaboration and the Radial Velocity Experiment (RAVE, Steinmetz et al., 2006) survey, also rely on a CRE for data management and processing.

The CRE hardware is integrated into the Almagest cluster at AIP. The whole cluster consists of over 50 nodes and about 3 PB of raw disk space. The different machines are connected through a high-performance InfiniBand network. Throughout the cluster we use the Linux distribution CentOS as operating system. Directly allocated for the GREGOR CRE are:

  • One shared compute node acting as login node and sharing the data among the collaboration. Collaborators can log in to this machine using the SSH protocol and copy data to their workstations for further processing and scientific analysis.

  • Two storage nodes with 80 TB of storage space are reserved for GREGOR data. Each of the nodes contains 24 hard-disks, which are combined into one logical RAID volume using the ZFS file system. Using ZFS’s send-receive feature, the disk content of the two nodes is mirrored, and the nodes are physically located in different buildings to minimize the risk of data loss due to catastrophic events.

  • A dedicated compute node, hosting the sTools pipeline, which is directly connected to the archive. This machine allows users to run their own programs and to visualize GFPI and HiFI (raw) data. This computer is also used for generating and updating the webpages with quick-look data. Remote users can log in by SSH and use Virtual Network Computing (VNC) for a desktop-like environment.

  • Additional resources are supplied to host the web application for the data archive, for connecting to the internet, and for backups.

  • Two internal compute nodes with mirrored installations of the sTools data processing pipeline. One of the nodes with a 64-core processor and 256 GB RAM is dedicated to image restoration and hosts the SVN repository of the sTools code including external libraries. The workstations of the researchers on the AIP campus can NFS-mount the respective volumes containing data and software libraries so that data processing and analysis is also possible locally.

  • In addition, a copy of the sTools data processing pipeline is installed at the German solar telescopes at Observatorio del Teide, Izaña, Spain. The computer network includes workstations and a dedicated multi-core computer for data processing and image restoration on site. In particular, the MFGS-based image selection for HiFI data is carried out immediately after the observations with an easy-to-use GUI written in IDL as an interface to sTools.

4.5 Data Levels

We distinguish three major levels of data products within the GREGOR GFPI and HiFI data archive:

  • Level 0 refers to raw data acquired with GFPI and HiFI. The data are written in a format native to the DaVis software of LaVision, which runs both instruments. A short ASCII header declares the basic properties of the images (e.g., DaVis version number, image size in pixels, number of image buffers, type of image compression, if applicable, etc.), which is followed by either compressed or uncompressed binary data blocks. Another free-format ASCII header, containing auxiliary information like a time stamp with microsecond accuracy, is placed after the data blocks at the end of the file. This time stamp results from the programmable timing unit (PTU), which provides external trigger signals for image acquisition. Specific settings of the observing mode and instrument parameters are saved in additional text files for each image sequence or spectral scan. These set files include, for example, telescope and AO status as well as a separate time stamp for the observing time based on the camera computer’s internal clock, which is synchronized with a local GPS receiver at the observatory.

  • The huge amount of large-format, high-cadence HiFI data (up to 4 TB per day with the current set-up) requires reducing the data already on site, directly after the observations. This is the standard procedure and includes dark and flat-field corrections. In addition, the image quality is determined with the MFGS metric, which facilitates frame selection. Only the best 100 out of 500 images in a set are kept. The calibrated image sequences (level 1) of the two synchronized sCMOS cameras are written as FITS files with image extensions. The metadata contained in the primary and image headers are partially SOLARNET-compliant (level 0.5), which means that they do not contain all mandatory SOLARNET keywords, and that they do not use any SOLARNET/FITS standard keywords in a way that conflicts with their definitions. HiFI level 0 data are typically deleted once the frame-selected and calibrated level 1 data are safely stored in the GREGOR archive. HiFI level 0 images are only kept when the seeing conditions were excellent, when fast or transient events were captured, or when special observing programs were carried out (see Denker et al., 2018a). Restoring a single HiFI image (level 2) from a set of 100 images with KISIP takes several tens of minutes on the 64-core compute node.

  • Level 1 GFPI data are typically created at AIP after the observing campaign, which reflects the high complexity of data from imaging spectroscopy. In most cases, the multiple images per wavelength point are simply destretched and co-added with no image restoration. However, all other calibration steps (e.g., alignment of narrow- and broad-band images, blueshift and prefilter-curve corrections) are carried out so that scientific exploitation of level 1 data is possible. After preprocessing level 1 data, i.e., after dark and flat-field corrections and determining the alignment of narrow- and broad-band images, a copy of narrow- and broad-band images is saved, which serves as the starting point for image restoration (level 2), thus avoiding preprocessing level 0 data twice. Only the best spectral scans are then chosen for level 2 processing, i.e., image restoration with MOMFBD or speckle deconvolution. Processing a single scan with MOMFBD takes several hours on the 64-core compute node.

  • Level 1 data are the starting point for creating quick-look data products such as time-lapse movies, Doppler velocity maps, and overview graphics for seeing conditions and observing parameters.

  • Level 2 data are restored data using MOMFBD and KISIP (see Sect. 3). These data processing steps are only included on demand, considering the significant amount of computational resources required. The best image restoration scheme is chosen by the researcher working with the data, and it is not unusual to select different ways to restore images or spectral scans depending on user preferences or a specific science case. Once the spectral scans are restored, other calibration steps still need to be applied, such as blueshift and prefilter-curve corrections, before physical parameters such as Doppler velocities or other spectral line properties can be determined.
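The frame selection mentioned for HiFI level 1 data (keeping the best 100 out of 500 images in a set) can be sketched as follows. This is a simplified Python illustration, not the sTools implementation: the quality values are assumed to be precomputed by some image-quality metric, and the sketch assumes that larger values mean better image quality.

```python
def select_best_frames(quality_scores, keep=100):
    """Return the indices of the `keep` best frames in acquisition order.

    `quality_scores` holds one precomputed quality value per frame;
    larger values are assumed to indicate better image quality.
    """
    if keep <= 0:
        raise ValueError("keep must be a positive integer")
    # Rank frame indices from best to worst quality.
    ranked = sorted(
        range(len(quality_scores)),
        key=lambda index: quality_scores[index],
        reverse=True,
    )
    # Restore chronological order for the selected frames.
    return sorted(ranked[:keep])
```

Returning indices in acquisition order keeps the selected frames usable as a time-ordered input set for a subsequent restoration step.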

Level 1 HiFI data (and occasionally level 0) and level 0 GFPI data are transferred from the GREGOR telescope to AIP by regular 2.5-inch external hard-disk drives with 2 – 4 TB storage capacity. Smaller data sets are often transferred over the internet. However, the physical transfer using hard-disks is preferred over copying over the internet due to limited bandwidth and network reliability at the observatory site. Data from all GREGOR instruments can be stored on site on a 100 TB storage array for up to three months. Keeping at least two copies of the data at different locations during the data transfer to the GREGOR archive mitigates potential data loss. Data integrity during the transfer is assured by monitoring the transfer logs and verifying that all files were transferred with correct sizes. The total amount of data, which was acquired with GFPI and HiFI, as well as with the now obsolete facility cameras of BIC, is summarized in the last row of Table 1. The other entries in the rows for 2014 – 2016 refer to the number of scientific data sets, which were obtained in various observing campaigns. The data volume refers to the sum over all data levels. GFPI level 1 data are roughly twice the size of level 0 data. Furthermore, the bulk of the data volume arises from level 0 data for BIC and from level 1 data for HiFI.
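A size-based completeness check of a transfer, as described above, could look like the following Python sketch. The manifest format (a mapping from relative file names to expected sizes in bytes) is an assumption made for this example and is not the format of the actual transfer logs.

```python
from pathlib import Path


def verify_transfer(manifest, destination):
    """Compare transferred files against a manifest of expected sizes.

    `manifest` maps relative file names to expected sizes in bytes.
    Returns a dictionary of problems; an empty dictionary means that
    every listed file exists at the destination with the expected size.
    """
    problems = {}
    for relative_name, expected_size in manifest.items():
        target = Path(destination) / relative_name
        if not target.is_file():
            problems[relative_name] = "missing"
            continue
        actual_size = target.stat().st_size
        if actual_size != expected_size:
            problems[relative_name] = (
                f"size mismatch: {actual_size} instead of {expected_size} bytes"
            )
    return problems
```

A size comparison only detects missing or truncated files; detecting corrupted content would additionally require checksums, which are not mentioned in the procedure above.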

Finally, 70 users are currently registered in the CRE, most of them from the GREGOR consortium. However, external users (at the moment ten) increasingly register with the CRE; most of them participated in (coordinated) observing campaigns or in the SOLARNET Access Program, which promotes observing campaigns with Europe’s telescopes and instruments for high-resolution solar physics. In the meantime, routine observations started at GREGOR in 2016, and we already see an influx of new international collaborators, who will certainly broaden the user base of the GREGOR CRE.

4.6 Use Case: GREGOR Early Science Phase

The GREGOR early science phase took place in 2014 and 2015, when members of the GREGOR consortium collaboratively carried out observing campaigns, i.e., in 2014 with the individual instruments and in 2015 with multi-instrument set-ups. Notably, a 50-day observing campaign with GFPI and BIC (the predecessor of HiFI) was carried out in July and August 2014. In a joint effort, scientists from all involved institutes submitted observing proposals, which were evaluated and condensed into a list of top-priority solar targets and feasible observing modes and strategies. The observations were carried out by experienced observers of all institutes together with novice observers (not necessarily novice scientists) to strengthen their observing skills and to familiarize them with the new instruments. The data, which were acquired with many different set-ups, were used to develop, test, and improve the sTools data processing pipeline. Finally, the level 1 data were stored in the GREGOR GFPI and HiFI data archive so that all scientists were able to access and analyze the data. Collaborations on specific studies, with the aim of publishing them in scientific journals, were coordinated using the blog facilities of the Daiquiri framework. This also includes the organization of GREGOR science meetings at AIP. The News Blog of the GREGOR CRE allowed us to publish poster presentations given at solar physics meetings (e.g., the SOLARNET IV Meeting in Lanzarote in 2016) and newly appearing journal articles and conference proceedings to a broader section of the solar physics community and the interested public. First results of the GREGOR early science phase were published in 2016 in the journals Astronomy & Astrophysics (Vol. 596) and Astronomische Nachrichten (Vol. 337/10), and more up-to-date GFPI and HiFI science publications are referenced in Sect. 2.2.

4.7 Future Plans

In the future, we plan to expand the data archive and data access infrastructure considerably. With a release of the data to the general solar community, new means of data access will be necessary, beyond users actively utilizing the CRE to interact with like-minded researchers. As before, the latter type of access will be managed via registration and subsequent confirmation by the collaboration. While collaborators will still be able to retrieve data through SSH connections, this is not suitable for public access by the general scientific community.

Therefore, we will extend the GREGOR archive towards a public data portal, which will offer a full search on all files in the archive, unless they are still embargoed, and downloads using the HTTP protocol, either through the browser or using command line tools. The selection of files will be based on the metadata of level 1 and 2 data of the GFPI and HiFI instruments. The archive will also allow for SQL queries to create custom result sets based on any scientific criteria. Standards defined and adopted by IVOA like UWS and Table Access Protocol (TAP) will expedite accessing the data through clients using interoperable Application Programming Interfaces (APIs). The results of the queries of a user and the queries themselves will be stored in a personal database to be retrieved at the user’s convenience without repeating the query to the whole GREGOR database.
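To make the idea of metadata-based SQL selection more concrete, the following Python sketch runs a query against a small in-memory SQLite database. The table name, column names, file names, and values are entirely hypothetical and do not describe the schema of the planned GREGOR database; the sketch only shows how embargo status and a quality criterion might be combined in a single query.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with a hypothetical metadata table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE observations ("
        "file_name TEXT, instrument TEXT, quality REAL, embargo_end TEXT)"
    )
    rows = [
        ("gfpi_scan_001.fits", "GFPI", 0.80, "2015-07-18"),
        ("gfpi_scan_002.fits", "GFPI", 0.40, "2015-07-18"),
        ("gfpi_scan_003.fits", "GFPI", 0.90, "2099-01-01"),  # still embargoed
        ("hifi_set_001.fits", "HiFI", 0.95, "2017-06-01"),
    ]
    connection.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", rows)
    return connection


def find_files(connection, instrument, min_quality, today):
    """Return non-embargoed files of one instrument above a quality limit.

    Dates are ISO 8601 strings, so string comparison matches date order.
    """
    query = (
        "SELECT file_name FROM observations "
        "WHERE instrument = ? AND quality >= ? AND embargo_end <= ? "
        "ORDER BY quality DESC"
    )
    cursor = connection.execute(query, (instrument, min_quality, today))
    return [row[0] for row in cursor]
```

In this toy example, find_files(build_demo_database(), "GFPI", 0.5, "2018-01-01") returns only the first file, because the second fails the quality limit and the third is still embargoed.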

Table 1: Number of data sets and stored data volume for GREGOR’s instruments.

Year         Data sets
2014         48 37
2015         69 42
2016         41 04 46
2017         35 17
Data volume  10.0 TB 13.6 TB 20.3 TB

5 Conclusions and Prospects

High-resolution solar observations are confronted with short time-scales in both the Sun’s and the Earth’s atmospheres: the former are related to the evolution and dynamics of solar features, whereas the latter impose strong boundary conditions for image restoration. The underlying assumption is that the observed object is not changing. Thus, a contiguous data set has to be recorded within a few seconds. This interval becomes even shorter when observing faster features like in the chromosphere or with larger telescopes offering higher spatial resolution and consequently smaller “diffraction-limited” pixels. These time-scales affect instrument design, observing modes and strategies, and in the end also data management and archiving.

Providing high-resolution data to the solar community is an ongoing process, i.e., the GREGOR GFPI and HiFI archive and the sTools data processing pipeline are not static. At present, we are working on the polarimetric calibration routines, and they will be added to the sTools package once they are tested. The CRE implemented for the GREGOR telescope takes this into account and also contains provisions for future data access beyond the consortium and collaboration. Currently, our efforts are focused on characterizing image quality to identify the best data sets of HiFI images and GFPI spectral scans, which are both affected by the varying seeing conditions (see Denker et al., 2018a). The algorithms and routines (e.g., with the MFGS method) developed for this purpose additionally allow us to monitor long-term trends in data quality as well as to establish a database of seeing conditions at Observatorio del Teide.

Synoptic solar images, magnetograms, and Doppler maps with a typical spatial resolution of one second of arc serve very successfully as input for feature identification and pattern recognition algorithms. In particular, in the context of space weather research and forecasting, tools were developed to enable easy access to such data products, e.g., the Heliophysics Event Knowledgebase (HEK, Hurlburt et al., 2012).

Even though the focus shifts from applications to more fundamental physics, a knowledgebase for high-resolution data (spatial resolution below 0.1″) is potentially very beneficial, bringing order into the plethora of small-scale features observed in the quiet-Sun (e.g., G-band bright points, filigree, blinkers, etc.) and in active regions (e.g., umbral dots, penumbral grains, Ellerman bombs, micro-flares, etc.). Morphological characteristics, photometric properties, and spectral/polarimetric features provide a wide parameter range, which can be stored in relational databases. A summary of image processing techniques and various ways of performing feature tracking is given in Aschwanden (2010), and Turmon et al. (2010) demonstrated the capability of multidimensional feature identification. Database research can reveal relationships among solar small-scale features such as those of the photospheric network, which would otherwise be missed in case studies, and can track changes with the solar activity cycle, when the contents of the database grows over time (e.g., McIntosh et al., 2014; Muller & Roudier, 1994; Jin et al., 2011; Roudier et al., 2017).

The GREGOR GFPI and HiFI archive and, in the future, associated relational databases are an attractive starting point for data mining and machine learning applications. In general, the huge amount of astronomical and astrophysical data has stimulated interest in exploring them effectively (see Ball & Brunner, 2010; Ivezić et al., 2014, for current research in this field). Machine learning algorithms gain knowledge from experience. A training data set teaches the underlying models to the machine, and new data sets can then be classified with the results from the initial training set. Furthermore, machine learning can be used in time-series and wavelet analysis. Data mining often employs machine learning techniques when data sets become overwhelmingly large, extracting useful information from raw data and detecting new relations or anomalies. Furthermore, data mining and machine learning help validate model assumptions and ensure consistency. The performance of machine learning models is mostly influenced by the amount and quality of the training data sets. A central repository with immediate access to calibrated high-resolution solar data, such as ours, speeds up the process of training neural networks or data mining.

As an example, DeepVel (Asensio Ramos et al., 2017) is a deep learning neural network, estimating horizontal velocities at three different atmospheric heights from time-series of high-resolution images. The training data were in this case numerical simulations. The advantage of DeepVel is not only its extremely fast execution, as compared to other optical flow techniques, but also its ability to infer velocity fields from subphotospheric layers. Machine learning will become increasingly helpful in image restoration and spectral inversions – both applications requiring significant computational resources.

The 1.5-meter GREGOR solar telescope was built by a German consortium under the leadership of the Kiepenheuer Institute for Solar Physics in Freiburg with the Leibniz Institute for Astrophysics Potsdam, the Institute for Astrophysics Göttingen, and the Max Planck Institute for Solar System Research in Göttingen as partners, and with contributions by the Instituto de Astrofísica de Canarias and the Astronomical Institute of the Academy of Sciences of the Czech Republic. We thank Dr. Michiel van Noort for his help in implementing the MOMFBD code at AIP. SJGM acknowledges support of project VEGA 2/0004/16 and is grateful for financial support from the Leibniz Graduate School for Quantitative Spectroscopy in Astrophysics, a joint project of the Leibniz Institute for Astrophysics Potsdam and the Institute of Physics and Astronomy of the University of Potsdam. CD and REL were supported by grant DE 787/3-1 of the German Science Foundation (DFG). This study is supported by the European Commission’s FP7 Capacities Program under Grant Agreement number 312495. The AIP Almagest cluster and its CREs were partially funded by a European Regional Development Fund (ERDF) grant in 2012. Development of VO interfaces and facilities has been supported by the German Federal Ministry of Education and Research (BMBF) Collaborative Research Program for the German Astrophysical Virtual Observatory (GAVO). Development of Daiquiri has been partially supported by a BMBF grant “Survey-Competence”. The Center of Excellence in Space Sciences India is funded by the Ministry of Human Resource Development, Government of India.
Facilities: GREGOR solar telescope (GFPI, HiFI)

Software: DeepVel (Asensio Ramos et al., 2017), KISIP (Wöger & von der Lühe, 2008), MOMFBD (Löfdahl, 2002; van Noort et al., 2005), MPFIT (Markwardt, 2009), SolarSoft (Bentley & Freeland, 1998; Freeland & Handy, 1998), and sTools (Kuckein et al., 2017a)


  • Aschwanden (2010) Aschwanden, M. J. 2010, Sol. Phys., 262, 235
  • Asensio Ramos et al. (2017) Asensio Ramos, A., Requerey, I. S., & Vitas, N. 2017, ArXiv e-prints, arXiv:1703.05128
  • Bacon et al. (2010) Bacon, R., Accardo, M., Adjali, L., et al. 2010, in Proceedings of SPIE, Vol. 7735, Ground-Based and Airborne Instrumentation for Astronomy III, 773508
  • Ball & Brunner (2010) Ball, N. M., & Brunner, R. J. 2010, Int. J. Mod. Phys. D, 19, 1049
  • Beeck et al. (2015) Beeck, B., Schüssler, M., Cameron, R. H., & Reiners, A. 2015, Astron. Astrophys., 581, A42
  • Bello González & Kneer (2008) Bello González, N., & Kneer, F. 2008, Astron. Astrophys., 480, 265
  • Bendlin et al. (1992) Bendlin, C., Volkmer, R., & Kneer, F. 1992, Astron. Astrophys., 257, 817
  • Bentley (2002) Bentley, R. D. 2002, in ESA Spec. Publ., Vol. 477, SOLSPA: Second Solar Cycle and Space Weather Euroconference, ed. H. Sawaya-Lacoste, 603–606
  • Bentley & Freeland (1998) Bentley, R. D., & Freeland, S. L. 1998, in ESA Spec. Publ., Vol. 417, Crossroads for European Solar and Heliospheric Physics. Recent Achievements and Future Mission Possibilities, 225–228
  • Berkefeld et al. (2012) Berkefeld, T., Schmidt, D., Soltau, D., von der Lühe, O., & Heidecke, F. 2012, Astron. Nachr., 333, 863
  • Berukoff et al. (2016) Berukoff, S., Hays, T., Reardon, K., et al. 2016, in Proceedings of SPIE, Vol. 9913, Software and Cyberinfrastructure for Astronomy IV, ed. G. Chiozzi & J. C. Guzman, 99131F
  • Cao et al. (2010) Cao, W., Gorceix, N., Coulter, R., et al. 2010, Astron. Nachr., 331, 636
  • Cavallini (2006) Cavallini, F. 2006, Sol. Phys., 236, 415
  • Collados et al. (2010a) Collados, M., Bettonvil, F., Cavaller, L., et al. 2010a, Astron. Nachr., 331, 615
  • Collados et al. (2010b) Collados, M., Bettonvil, F., Cavaller, L., et al. 2010b, in Proceedings of SPIE, Vol. 7733, Ground-Based and Airborne Telescopes III, ed. L. M. Stepp, R. Gilmozzi, & H. J. Hall, 77330H
  • Collados et al. (2012) Collados, M., López, R., Páez, E., et al. 2012, Astron. Nachr., 333, 872
  • de Boer (1993) de Boer, C. R. 1993, PhD thesis, Georg-August Universität Göttingen, Germany
  • Deng et al. (2015) Deng, H., Zhang, D., Wang, T., et al. 2015, Sol. Phys., 290, 1479
  • Denker et al. (2010) Denker, C., Balthasar, H., Hofmann, A., Bello González, N., & Volkmer, R. 2010, in Proceedings of SPIE, Vol. 7735, Ground-Based and Airborne Instrumentation for Astronomy III, ed. I. S. McLean, S. K. Ramsay, & H. Takami, 77356M
  • Denker et al. (2018a) Denker, C., Dineva, E., Balthasar, H., et al. 2018a, Sol. Phys., 293, 44
  • Denker et al. (1999) Denker, C., Johannesson, A., Marquette, W., et al. 1999, Sol. Phys., 184, 87
  • Denker et al. (2018b) Denker, C., Kuckein, C., Verma, M., et al. 2018b, Astron. Nachr., in preparation
  • Domingo et al. (1995) Domingo, V., Fleck, B., & Poland, A. I. 1995, Sol. Phys., 162, 1
  • Ellerman (1917) Ellerman, F. 1917, Astrophys. J., 46, 298
  • Fanning (2011) Fanning, D. W. 2011, Coyote’s Guide to Traditional IDL Graphics (Fort Collins, Colorado: Coyote Book Publishing)
  • Felipe et al. (2017) Felipe, T., Collados, M., Khomenko, E., et al. 2017, Astron. Astrophys., 608, A97
  • Freeland & Handy (1998) Freeland, S. L., & Handy, B. N. 1998, Sol. Phys., 182, 497
  • Gottlöber et al. (2010) Gottlöber, S., Hoffman, Y., & Yepes, G. 2010, ArXiv e-prints, arXiv:1005.2687
  • Hammerschlag et al. (2012) Hammerschlag, R. H., Kommers, J. N., Visser, S., et al. 2012, Astron. Nachr., 333, 830
  • Hanisch et al. (2001) Hanisch, R. J., Farris, A., Greisen, E. W., et al. 2001, Astron. Astrophys., 376, 359
  • Harrison & Rixon (2016) Harrison, P. A., & Rixon, G. 2016, Universal Worker Service Pattern Version 1.1, IVOA Recommendation 24 October 2016
  • Henney et al. (2009) Henney, C. J., Keller, C. U., Harvey, J. W., et al. 2009, in ASP Conf. Ser., Vol. 405, Solar Polarization V, ed. S. V. Berdyugina, K. N. Nagendra, & R. Ramelli, 47–50
  • Hill et al. (2004) Hill, F., Bogart, R. S., Davey, A., et al. 2004, in Proceedings of SPIE, Vol. 5493, Optimizing Scientific Return for Astronomy through Information Technologies, ed. P. J. Quinn & A. Bridger, 163–169
  • Hurlburt et al. (2012) Hurlburt, N., Cheung, M., Schrijver, C., et al. 2012, Sol. Phys., 275, 67
  • Ichimoto et al. (2008) Ichimoto, K., Lites, B., Elmore, D., et al. 2008, Sol. Phys., 249, 233
  • Ivezić et al. (2014) Ivezić, Ž., Connolly, A. J., Vanderplas, J. T., & Gray, A. 2014, Statistics, Data Mining and Machine Learning in Astronomy (Princeton, New Jersey: Princeton University Press)
  • Jin et al. (2011) Jin, C. L., Wang, J. X., Song, Q., & Zhao, H. 2011, Astrophys. J., 731, 37
  • Keller et al. (2003) Keller, C. U., Harvey, J. W., & Giampapa, M. S. 2003, in Proceedings of SPIE, Vol. 4853, Innovative Telescopes and Instrumentation for Solar Astrophysics, ed. S. L. Keil & S. V. Avakyan, 194–204
  • Kentischer et al. (2008) Kentischer, T. J., Bethge, C., Elmore, D. F., et al. 2008, in Proceedings of SPIE, Vol. 7014, Ground-Based and Airborne Instrumentation for Astronomy II, ed. I. S. McLean & M. M. Casali, 701413
  • Kitai et al. (1997) Kitai, R., Funakoshi, Y., Ueno, S., & Ichimoto, S. S. K. 1997, PASJ, 49, 513
  • Kosugi et al. (2007) Kosugi, T., Matsuzaki, K., Sakao, T., et al. 2007, Sol. Phys., 243, 3
  • Kuckein et al. (2017a) Kuckein, C., Denker, C., Verma, M., et al. 2017a, in IAU Symp., Vol. 327, Fine Structure and Dynamics of the Solar Atmosphere, ed. S. Vargas Domínguez, A. G. Kosovichev, L. Harra, & P. Antolin, 20–24
  • Kuckein et al. (2017b) Kuckein, C., Diercke, A., González Manrique, S. J., et al. 2017b, Astron. Astrophys., 608, A117
  • LaVision (2015) LaVision. 2015, Product Manual for DaVis 8.3: Imager sCMOS, Göttingen, Germany
  • Leibacher (1999) Leibacher, J. W. 1999, Adv. Space Res., 24, 173
  • Löfdahl (2002) Löfdahl, M. G. 2002, in Proceedings of SPIE, Vol. 4792, Image Reconstruction from Incomplete Data, ed. P. J. Bones, M. A. Fiddy, & R. P. Millane, 146–155
  • Markwardt (2009) Markwardt, C. B. 2009, in ASP Conf. Ser., Vol. 411, Astronomical Data Analysis Software and Systems XVIII, ed. D. A. Bohlender, D. Durand, & P. Dowler, 251–254
  • Martens et al. (2012) Martens, P. C. H., Attrill, G. D. R., Davey, A. R., et al. 2012, Sol. Phys., 275, 79
  • Matsuzaki et al. (2007) Matsuzaki, K., Shimojo, M., Tarbell, T. D., Harra, L. K., & Deluca, E. E. 2007, Sol. Phys., 243, 87
  • McIntosh et al. (2014) McIntosh, S. W., Wang, X., Leamon, R. J., et al. 2014, Astrophys. J., 792, 12
  • McMullin et al. (2014) McMullin, J. P., Rimmele, T. R., Martínez Pillet, V., et al. 2014, in Proceedings of SPIE, Vol. 9145, Ground-Based and Airborne Telescopes V, ed. L. M. Stepp, R. Gilmozzi, & H. J. Hall, 914525
  • Miura et al. (2000) Miura, A., Shinohara, I., Matsuzaki, K., et al. 2000, in ASP Conf. Ser., Vol. 216, Astronomical Data Analysis Software and Systems IX, ed. N. Manset, C. Veillet, & D. Crabtree, 180–183
  • Muller & Roudier (1994) Muller, R., & Roudier, T. 1994, Sol. Phys., 152, 131
  • November (1989) November, L. J. 1989, in High Spatial Resolution Solar Observations, ed. O. von der Lühe, Proc. Sacramento Peak Summer Workshop, 457
  • Pariat et al. (2007) Pariat, E., Schmieder, B., Berlicki, A., et al. 2007, Astron. Astrophys., 473, 279
  • Pesnell et al. (2012) Pesnell, W. D., Thompson, B. J., & Chamberlin, P. C. 2012, Sol. Phys., 275, 3
  • Ponz et al. (1994) Ponz, J. D., Thompson, R. W., & Munoz, J. R. 1994, Astron. Astrophys. Suppl. Ser., 105, 53
  • Puschmann et al. (2006) Puschmann, K. G., Kneer, F., Seelemann, T., & Wittmann, A. D. 2006, Astron. Astrophys., 451, 1151
  • Puschmann et al. (2012) Puschmann, K. G., Denker, C., Kneer, F., et al. 2012, Astron. Nachr., 333, 880
  • Quinn et al. (2004) Quinn, P. J., Barnes, D. G., Csabai, I., et al. 2004, in Proceedings of SPIE, Vol. 5493, Optimizing Scientific Return for Astronomy through Information Technologies, ed. P. J. Quinn & A. Bridger, 137–145
  • Rempel & Cheung (2014) Rempel, M., & Cheung, M. C. M. 2014, Astrophys. J., 785, 90
  • Riebe et al. (2013) Riebe, K., Partl, A. M., Enke, H., et al. 2013, Astron. Nachr., 334, 691
  • Roudier et al. (2017) Roudier, T., Malherbe, J. M., & Mirouh, G. M. 2017, Astron. Astrophys., 598, A99
  • Rutten et al. (2013) Rutten, R. J., Vissers, G. J. M., Rouppe van der Voort, L. H. M., Sütterlin, P., & Vitas, N. 2013, in J. Phys.: Conf. Ser., Vol. 440, Eclipse on the Coral Sea: Cycle 24 Ascending, ed. P. Cally, R. Erdélyi, & A. Norton, 012007
  • Scharmer & Löfdahl (1991) Scharmer, G., & Löfdahl, M. 1991, Adv. Space Res., 11, 129
  • Scharmer (1989) Scharmer, G. B. 1989, in NATO Adv. Sci. Inst. (ASI) Ser. C, Vol. 263, Solar and Stellar Granulation, ed. R. J. Rutten & G. Severino, 161–171
  • Schmidt et al. (2012) Schmidt, W., von der Lühe, O., Volkmer, R., et al. 2012, Astron. Nachr., 333, 796
  • Socas-Navarro et al. (2015) Socas-Navarro, H., de la Cruz Rodríguez, J., Asensio Ramos, A., Trujillo Bueno, J., & Ruiz Cobo, B. 2015, Astron. Astrophys., 577, A7
  • Soltau et al. (2012) Soltau, D., Volkmer, R., von der Lühe, O., & Berkefeld, T. 2012, Astron. Nachr., 333, 847
  • Steinegger et al. (2000) Steinegger, M., Denker, C., Goode, P. R., et al. 2000, in ESA Special Publication, Vol. 463, Proceedings of the 1st Solar and Space Weather Euroconference: The Solar Cycle and Terrestrial Climate, Solar and Space Weather, ed. A. Wilson, 617–622
  • Steinmetz et al. (2006) Steinmetz, M., Zwitter, T., Siebert, A., et al. 2006, Astron. J., 132, 1645
  • Tomczyk et al. (2008) Tomczyk, S., Card, G. L., Darnell, T., et al. 2008, Sol. Phys., 247, 411
  • Tomczyk et al. (2016) Tomczyk, S., Landi, E., Burkepile, J. T., et al. 2016, J. Geophys. Res., 121, 7470
  • Tritschler et al. (2016) Tritschler, A., Rimmele, T. R., Berukoff, S., et al. 2016, Astron. Nachr., 337, 1064
  • Tsuneta et al. (2008) Tsuneta, S., Ichimoto, K., Katsukawa, Y., et al. 2008, Sol. Phys., 249, 167
  • Turmon et al. (2010) Turmon, M., Jones, H. P., Malanushenko, O. V., & Pap, J. M. 2010, Sol. Phys., 262, 277
  • van Noort et al. (2005) van Noort, M., Rouppe van der Voort, L., & Löfdahl, M. G. 2005, Sol. Phys., 228, 191
  • Verma & Denker (2011) Verma, M., & Denker, C. 2011, Astron. Astrophys., 529, A153
  • Verma et al. (2018) Verma, M., Denker, C., Balthasar, H., et al. 2018, Astron. Astrophys., doi:10.1051/0004-6361/201731801, in press
  • Vögler et al. (2005) Vögler, A., Shelyag, S., Schüssler, M., et al. 2005, Astron. Astrophys., 429, 335
  • Volkmer et al. (2012) Volkmer, R., Eisenträger, P., Emde, P., et al. 2012, Astron. Nachr., 333, 816
  • von der Lühe (1993) von der Lühe, O. 1993, Astron. Astrophys., 268, 374
  • von der Lühe et al. (2012) von der Lühe, O., Volkmer, R., Kentischer, T. J., & Geißler, R. 2012, Astron. Nachr., 333, 894
  • Wallace et al. (1998) Wallace, L., Hinkle, K., & Livingston, W. 1998, An Atlas of the Spectrum of the Solar Photosphere from 13,500 to 28,000 cm⁻¹ (3570 to 7405 Å) (Tucson, Arizona: National Optical Astronomy Observatories)
  • Weigelt & Wirnitzer (1983) Weigelt, G., & Wirnitzer, B. 1983, Opt. Lett., 8, 389
  • Weilbacher et al. (2014) Weilbacher, P. M., Streicher, O., Urrutia, T., et al. 2014, in ASP Conf. Ser., Vol. 485, Astronomical Data Analysis Software and Systems XXIII, ed. N. Manset & P. Forshay, 451–454
  • Wells et al. (1981) Wells, D. C., Greisen, E. W., & Harten, R. H. 1981, Astron. Astrophys. Suppl. Ser., 44, 363
  • Wöger & von der Lühe (2008) Wöger, F., & von der Lühe, O. 2008, in Proceedings of SPIE, Vol. 7019, Advanced Software and Control for Astronomy II, ed. A. Bridger & N. M. Radziwill, 70191E