Unsupervised spectral decomposition of XRBs

Unsupervised spectral decomposition of X-ray binaries with application to GX 339–4

K. I. I. Koljonen
Aalto University Metsähovi Radio Observatory, Metsähovintie 114, FIN-02540 Kylmälä, Finland
New York University Abu Dhabi, PO Box 129188, Abu Dhabi, UAE
email: karri.koljonen@nyu.edu

In this paper, we explore unsupervised spectral decomposition methods for distinguishing the effect of different spectral components for a set of consecutive spectra from an X-ray binary. We use well-established linear methods for the decomposition, namely principal component analysis, independent component analysis and non-negative matrix factorisation (NMF). Applying these methods to a simulated dataset consisting of a variable multicolour disc black body and a cutoff power law, we find that NMF outperforms the other two methods in distinguishing the spectral components. In addition, due the non-negative nature of NMF, the resulting components may be fitted separately, revealing the evolution of individual parameters. To test the NMF method on a real source, we analyse data from the low-mass X-ray binary GX 339–4 and found the results to match those of previous studies. In addition, we found the inner radius of the accretion disc to be located at the innermost stable circular orbit in the intermediate state right after the outburst peak. This study shows that using unsupervised spectral decomposition methods results in detecting the separate component fluxes down to low flux levels. Also, these methods provide an alternative way of detecting the spectral components without performing actual spectral fitting, which may prove to be practical when dealing with large datasets.

Accretion, accretion discs – methods: data analysis – stars: black holes – X-rays: binaries – X-rays: individual: GX 339–4 – X-rays: stars.
pagerange: Unsupervised spectral decomposition of X-ray binaries with application to GX 339–4Referencespubyear: 2014

1 Introduction

Scientific consensus dictates that the X-ray spectra of black hole X-ray binaries (XRBs) and active galactic nuclei (AGNs) are modelled by three spectral components: a multicolour black body component representing the emission from the accretion disc in the soft X-rays, a power law with a cutoff or no cutoff representing the thermal/non-thermal Comptonisation of soft photons from a population of hot electrons in the hard X-rays, and a reflection component representing the reprocessed emission of hard X-rays from the accretion disc, i.e. photoelectrically absorbed or Compton-scattered emission, in the intermediate energies, that most prominently manifests itself in the form of an iron line. However, the magnitude of these components remains unknown at a given time. For example, the magnitude of the soft component in the hard state is usually difficult to determine. However, knowledge of the accretion disc geometry in the hard state, in particular the location of the inner edge of the accretion disc, is critical for the properties and/or the validity of all accretion disc models. It is especially important for radiatively inefficient models such as advection dominated accretion flow (ADAF, Ichimaru, 1977; Narayan & Yi, 1994). The quiescent properties of black hole transients are often cited as evidence for ADAFs and black hole event horizons (Narayan et al., 1997; Garcia et al., 2001; McClintock et al., 2003). However, for the ADAF model to be relevant, the inner edge of the accretion disc must increase by orders of magnitude between the outburst and quiescent states. Therefore, it is important to determine the magnitude of the accretion disc and reflection components in the hard state – a difficult task due to the low flux compared to the Comptonisation component. In fact, even the physical mechanism for the X-ray emission in the hard state is open to question, with possible contributions from direct synchrotron emission from the jet (Falcke & Biermann, 1999; Markoff et al., 2001; Russell et al., 2010) or synchrotron self-Compton radiation from the base of the jet (Markoff et al., 2005). This kind of spectral degeneracy is especially true for the Galactic XRBs and a striking example can be seen e.g. in Nowak et al. (2011), where three very different models are fitted equally well to the same data set of Cyg X-1, despite the excellent quality of the data that was obtained by all the X-ray satellites in orbit at the time. In addition, the magnitudes of different components in the intermediate X-ray states are difficult to determine as all the components are luminous. Extra complications in the interpretation of X-ray spectra are caused by the attenuation of interstellar matter in Galactic XRBs and the so-called warm absorbers (possibly associated with the winds of the accretion discs) in AGNs. Thus, modelling the X-ray spectra of accreting black holes often leads to a problem of degeneracy, i.e. multiple distinct models fit the observed data equally well. Even if an apparently good fit is obtained between the data and the model, it does not necessarily imply a match between theory and physical reality.

In this paper we demonstrate the use of linear unsupervised decomposition methods in separating a set of time series of X-ray spectra into subcomponents corresponding to distinct spectral components from the disc and the population of hot electrons. In Section 2 we will compare three different decomposition methods, namely principal component analysis (PCA), independent component analysis (ICA) and non-negative matrix factorisation (NMF), using simulated spectra mimicking the spectral behaviour from a typical XRB. This analysis provides a better estimate of the X-ray continuum models required to fit the X-ray spectra in XRBs by taking into account also the spectral variability in addition to the fitting of the time-averaged spectra. In Section 3 we apply the NMF to a set of spectra from the stellar mass black hole XRB GX 339–4. With a sufficiently long set of observations throughout the hardness-intensity diagram (HID) from GX 3394, we study how spectral decomposition reveals the change of spectral components in the HID. In Section 4 we present the conclusions of this paper and make suggestions for the future use of the spectral decomposition methods in the context of XRBs.

2 Unsupervised spectral decomposition

X-ray spectra can be decomposed via matrix factorisation techniques. Generally, this problem falls into the category of blind source separation (BSS), in which a set of source signals is estimated from a set of mixed signals with no information on the source signals or the mixing process. In this case the data matrix is a collection of discrete X-ray spectra, , consisting of flux values with energy measured at time . This matrix is taken to consist of a mix of a few, up to , separate source signals weighted over the energy bands by a weight matrix . Essentially, the columns of the weight matrix can be thought to represent a constant spectral component that has a variable amplitude dictated by the rows of the source signal matrix for each . In practice, is much smaller than or , as the intention is to describe the information in concisely. and are subsequently estimated in such a way that , i.e.


The BSS problem is very underdetermined since multiple equally valid solutions exists. However, by placing clever requirements on the factorisation one can reduce the number of possible solutions and obtain meaningful results. PCA, for example, assumes that the source signals are minimally correlated to each other and ICA assumes that the source signals are maximally independent of each other. A slightly different approach is introduced in NMF which imposes structural constraints, i.e. the non-negativity of the constituent matrices and . In the following we create a simulated dataset that mimics the spectral effects that are present in the usual X-ray data of XRBs and tackle it using the methods outlined above.

2.1 Simulations

Figure 1: The parameters, fluxes and resulting X-ray spectra of the simulated data for testing the different matrix factorisation methods. The five panels in the upper left side of the figure show the variability of the parameters of individual spectral components: top two for multicolour disc black body (sine wave for the disc normalisation and sloped line for the disc temperature), and bottom three for cutoff power law (saw wave for power law normalisation, sloped line for the spectral index, and sine wave for the cutoff energy). The two panels in the upper right side of the figure show the corresponding flux values from the disc and power law components. The bottom panel shows the resulting X-ray spectra faked using the models described above.

In order to gauge the appropriateness of a decomposition method (PCA/ICA/NMF) we created a simulated dataset that mimics the spectral effects that are present in the usual X-ray data of XRBs: absorbed (phabs) disc black body spectra (diskbb) with changing normalisation and temperature values and cutoff power law spectra (cutoffpl) with changing normalisation, spectral index and cutoff values. For ‘spectral pathways’ we use distinguishable functions such as sine and saw waves. In addition, we vary the disc temperature and the power law index so as to produce soft and hard X-ray states. We use isis (Houck & Denicola, 2000) to fake 200 spectra with an exposure of 5 ks, varying the black body normalisation sinusoidally from 500 to 2500, the black body temperature linearly from 0.3 keV to 1.3 keV and back, the power law normalisation as a saw wave from 1.0 to 2.0, power law index linearly from 1.5 to 2.5 and back, and a power law cutoff sinusoidally from 40 keV to 120 keV (Fig. 1). We used the RXTE response function from GX 339–4 pointing 70110–01–04–00 and then unfolded the spectra using isis to flux units (keV photons s cm keV). We would like to note that by using “flux-corrected” spectra, response matrix features are introduced in addition to the physical processes, which could have an effect in the low flux regimes.

As PCA, ICA and NMF are all methods that rely on linear decomposition, they are suited best for finding the variability of the normalisations of spectral components, and can behave erratically when dealing with more complicated effects such as changing optical depth and electron temperature that varies the cutoff and spectral slope of the X-ray spectra. When presented with data that varies non-linearly, linear methods can, however, be used as the non-linearity can be approximated as a collection of several linear components. Thus, we do not expect the decomposition to return the variability of the parameters themselves, but rather the fluxes of the spectral components that are collectively produced by the varying parameters. The quality of the decomposition is then measured by how well they are able to reproduce the disc and power law fluxes. In the following we review the different spectral decomposition methods and how they are able to reproduce the original fluxes of the simulated spectral components.

2.2 Principal component analysis

PCA (review e.g. Jolliffe, 2002) is one of the standard tools of time series analysis that has been used in astronomy mainly for stellar spectral classification (e.g. Whitney, 1983), galaxy spectral classification (e.g. Connolly et al., 1995), and quasar spectral classification (e.g. Francis et al., 1992). For studying variable X-ray spectra PCA has been used in AGNs e.g. by Vaughan & Fabian (2004) and Parker et al. (2014), and in XRBs by Malzac et al. (2006) and Koljonen et al. (2013). However, one of the drawbacks in using PCA for decomposing spectra is its tendency to cancel components as it works with the mean-subtracted spectra. This results in components that have either too exaggerated or diminished an impact on the X-ray spectra depending on the values of their normalisation. Additionally, if the eigenvectors of the principal components have both positive and negative values, it results in pivoting behaviour of the spectra, i.e. when the normalisation of that component increases it increases the impact of the component in the positive part and decreases the impact in the negative part of the eigenvector. However, this turns out to be a good proxy for the power law index (Parker et al., 2014).

We have used a singular value decomposition (SVD) in calculating the components of individual, mean-subtracted spectra that comprise on the left-hand side of Eq. 1. More detailed description of the method can be found in e.g. Malzac et al. (2006), Koljonen et al. (2013) and Parker et al. (2014).

2.2.1 Choosing the degree of the factorisation

Figure 2: Determining the degree of factorisation for PCA. Left: the log-eigenvalue (LEV) diagram of the simulated dataset. Right: the -diagram of the simulated dataset. After the LEV settles down to a straight line indicating the start of the noise. Likewise, in the -diagram achieves the value where further adding the number of components decreases the reduced value only by a small amount after .
Figure 3: The effect of sampling the data to the number of significant factorised components above the noise level in the -diagram. Left: The number of components above the noise level when the data is selected to be below a certain flux threshold. Right: The number of components above the noise level when the data is selected to span a certain number of observations from the beginning of the simulation.

In general, we would expect the quality of the factorisation, i.e. its similarity with the original data, to be an increasing function of the degree of the factorisation . A “good” value of would not be a local maximum for quality, but instead a point where the response of the quality to changes from being steep to shallow (i. e., a “good” value of provides a substantially better approximation than nearby smaller values, but only a slightly worse approximation than nearby larger values). One method of choosing the degree of the factorisation in PCA is the log-eigenvalue (LEV) diagram (e.g. Jolliffe, 2002; Koljonen et al., 2013). In the LEV diagram significant components can be distinguished from the noise by their deviation from the geometrical progression, i.e. straight line in the diagram. In addition, we devise a method similar to LEV diagram to measure the quality of the factorised spectra using a median of reduced -values of the resulting factorisation when compared to the simulated dataset. This method calculates the reduced -values between each individual spectrum from the factorisation , on the right-hand side of Eq. 1, with different degrees of factorisation and the spectra from the simulated dataset with associated errors and takes the median of these values, thus producing a quality measure of how well the factorisation with degree fits in to the data and reducing the number of components to those that vary above the noise level. As mentioned above the chosen degree of factorisation should be a point where the median of reduced -value changes from being steep to shallow. In addition, the value should be close to 1 so as to portray faithfully the original spectra. Let us call this method the -diagram and it can be formulated as follows:


The -diagram will be used for ICA and NMF results as well so as to make the comparison between different methods easier. This is because the LEV diagram cannot be used for ICA and NMF as in these methods the number of components of the factorisation has to be determined before starting the analysis itself.

Fig. 2 shows the LEV diagram and the -diagram for the simulated dataset with different degrees of factorisation. The LEV diagram in Fig. 2 indicates the start of the noise after , thus indicating that six components is sufficient to explain the simulated X-ray spectra. Likewise, in the -diagram after there appears to be no improvement to the value by adding more components, and thus it is sufficient to take six components into account. This also demonstrates that the -diagram leads to the same conclusion as the LEV diagram. Based on these methods it suffices to take into account six principal components for explaining the simulated X-ray spectra.

We also studied how the number of above the noise level changes in the -diagram on a flux-limited or time-limited sample. In this case, we use the same simulation setup, models, and parameter ranges, but increase the number of spectra five-fold to increase resolution. Fig. 3 shows how changes if the sample is restricted below some flux threshold (left panel), or if the sample is restricted to span a number of observations from the beginning of the simulation (right panel). To obtain , we see that use of the whole dataset is not necessary, although it is still a substantial amount (the flux level 0.7 photons s cm roughly divides the data in two equal sets). However, even in smaller samples the number of significant components is only slightly smaller.

Figure 4: The six components from the PCA sufficient to explain most of the variability in the X-ray spectra of the simulated dataset. The top six panels show the source signals of the principal components and the bottom panel shows the weights of the principal components with the inset showing the last five with a zoomed range for the y-axis.

2.2.2 Simulated data

Fig. 4 shows the six principal components; the source signals and weights across the energies, derived from the simulated dataset. It is clear that the first component () corresponds to the variations in the disc component, and the second component () correspond to the variations in the power law component. However, it is not clear how the remaining components contribute to the disc and power law components and they likely produce random negative and positive corrections to the disc and/or power law components. A comparison of the first two components with the disc and power law fluxes from the simulation is shown in Section 9.

2.3 Independent component analysis

ICA (for a review see e.g. Hyvärinen et al., 2001) is a linear solution to BSS similar to PCA, but instead of uncorrelatedness and orthogonality ICA relies on statistical independency of the constituent components. Similarly to PCA, ICA has been used in astronomy mainly for galaxy spectral classification (e.g. Lu et al., 2006; Allen et al., 2013). Statistical independency is a more strict rule than uncorrelatedness, as the uncorrelatedness is implied in independency but not vice versa. ICA relies on the central limit theorem which states that the mean of random processes will always approach gaussian distribution. Thus, ICA searches for non-gaussian components. ICA suffers from the same problems in spectral decomposition as PCA, as it also works with the mean-substracted spectra resulting in positive and negative weights and pivoting source spectra.

We have used a fastica algorithm (Hyvärinen, 1999) in calculating the components of individual, pre-whitened spectra that comprise , on the left-hand side of Eq. 1 using negative entropy to search for the non-gaussian components.

2.3.1 Choosing the degree of the factorisation

ICA has the disadvantage to PCA that it does not arrange the components via their fraction of variance. Additionally, the factorisation degree has to be determined before running the algorithm. As in PCA, we use the same method to determine the quality of the factorisation using the -diagram as explained in Section 2.2.1. Fig. 5 shows the -diagram for different degrees of factorisation. Similar to PCA results, after there appears to be no improvement to the value by adding more components, and thus it is enough to take six components into account for the ICA analysis.

Figure 5: Determining the degree of factorisation for ICA. The figure shows the -diagram of the simulated dataset. After the -diagram achieves the value where further adding the number of components decreases the reduced value only by a small amount, thus it is sufficient to take six components into account for the matrix factorisation.
Figure 6: The six components from the independent component analysis sufficient to explain most of the variability and X-ray spectra of the simulated dataset. The top six panels show the source signals and the bottom panel shows the weights of the independent components.

2.3.2 Simulated data

Fig. 6 shows the six independent components, divided into the source signals and weights as previously, derived from the simulated dataset. Unlike in PCA, all the components seem to be relevant in either the disc and/or power law component. Based on the energy range of the weights and the simulated fluxes in Fig. 1 it appears that the independent components represent the disc and independent components represent the power law spectral components. However, it can be seen that when the power law flux is low the disc flux leaks to all independent components. A comparison of the disc and power law components as represented by the respective independent components with the disc and power law fluxes from the simulation is shown in Section 9.

2.4 Non-negative matrix factorisation

NMF (Paatero & Tapper, 1994; Lee & Seung, 1999) provides an exciting alternative to traditional dimensional-reduction methods. In NMF, samples are represented by non-negative combinations of canonical components. The structure found by NMF methods is thus often very different from, and more intuitive to interpret than, that of more traditional eigenvector-based methods, such as PCA. By constructing samples as positive combinations, NMF also has the potential to disentangle canonical components which often overlap to create particular community samples. The price of this more intuitive decomposition is that the factorisation is approximate, i.e. not unique as in PCA, and components depend on the dimension of the decomposition, requiring greater care in the interpretation.

In NMF the matrices and are found by minimising a cost function under the constraint that they must be non-negative. Such a cost function can be constructed using some distance measure between and . In this paper, we use the generalised Kullback-Leibler (KL) divergence,


where . The KL divergence measures the information lost when is used to approximate , and thus minimising it will result in maximising the information of in . Additionally, it has been shown in Sajda et al. (2003) that minimising the above cost function is equivalent to maximising the likelihood of generating from when is assumed to be Poisson distributed with mean . Therefore, the KL divergence is appropriate cost function to use with X-ray counting data.

In this paper we use the nmf package (Gaujoux & Seoighe, 2010) that calculates the standard NMF (Brunet et al., 2004) by picking random starting values for and from a uniform distribution and then updating iteratively 10000 times to find a local minimum of the cost function with a multiplicative rule from Lee & Seung (2001):


To ensure that the algorithm does not get stuck in a local minimum we repeated the minimisation process for 50 different starting points (30–50 starting points is considered sufficient in Hutchins et al. 2008 and Brunet et al. 2004 to get robust estimate of the factorisation degree ) for each value of and 300 different starting points for the chosen degree of factorisation, and used the factorisation which minimised the cost function in Eq. 3.

2.4.1 Choosing the degree of the factorisation

NMF is not a hierarchical method; each component depends on the choice of degree and thus this choice should be made with care. Similar to PCA and ICA above, we use the -diagram as explained in Section 2.2.1. Fig. 7 shows the -diagram for different degrees of factorisation. After there seems to be no improvements to the value by adding more components, and thus it is enough to take six components into account for the NMF analysis.

Figure 7: Determining the degree of factorisation for NMF. The figure shows the -diagram of the simulated dataset. After the -diagram achieves the value where further adding the number of components decreases the reduced value only by a small amount.

2.4.2 Simulated data

Figure 8: The six components from the NMF analysis sufficient to explain most of the variability and X-ray spectra of the simulated dataset. The top six panels show the source signals of the NMF components and the bottom panel shows the weights of the NMF components.

Fig. 8 shows the components derived by NMF from the simulated dataset. The components are clearly divided to those representing the disc component () and the power law component (). The two components representing the disc retain even some information of the original parameters, showing the increasing and decreasing disc temperature and sinusoidally varying normalisation, though some mixture of both components can be clearly seen. The components representing the power law are more mixed with each other, but some similarity can be distinguished of the power law normalisation with and components, power law index with component and the cutoff with component. One has to bear in mind that one-to-one correspondence with the model parameters is not expected as their effects to the spectra is non-linear, but NMF performs quite well in distinguishing the disc and power law components from each other and tackling the non-linear effects. In the following we will compare all the decompositions from PCA, ICA and NMF to the fluxes derived from the disc and power law components of the simulated dataset.

2.5 The comparison of the decompositions

It is interesting to note that all methods converge on the same degree of factorisation however with seemingly different components. This shows that individual components of the factorisations do not likely represent any meaningful parameters of the spectral model, which is expected as the parameters vary in non-linear fashion. However, it is possible to compile the disc and power law component from a collection of the factorisation components. Fig. 9 shows the correlation between the model flux values (see Fig. 1) and individual or different combinations of the representing the disc or power law component of the different decomposition methods described above. The PCA can fairly well distinguish the disc flux. The power law flux is trickier to produce, and the component comes closest to that. It is not clear how the correcting components () would improve the match if added or subtracted from these “main” components. In addition, it seems that the components do not resemble the variations of the individual model parameters. ICA performs slightly better in distinguishing the flux components. Components add up to match the disc flux and to power law flux. However, likewise in PCA the correlation breaks down for low power law fluxes. NMF performs the best in distinguishing the two flux components. The components representing the disc flux () are equally good as in PCA and ICA. The components representing the power law flux () distinguish the power law flux of the simulated data the best out of the three methods with smaller spread and continuing the correlation to low power law fluxes as well.

Figure 9: The quality of the decompositions. The panels show the correlation between the assumed disc and power law fluxes from the PCA (left), ICA (centre) and NMF (right) analyses with the simulated disc (top) and power law (bottom) fluxes (see also Fig. 1).

The unique feature of NMF that allows only positive values for the factorisation, allows the resulting and to be united to form the spectra of individual components. Thus, we select the that corresponds to the disc and power law components mentioned above, and form the disc and power law spectra as and respectively. The resulting spectra are then fed into isis111This is done via isis load_data function, which can load ASCII spectral files as described in the isis documentation. As the spectra are already flux-corrected, only a diagonal response with 1 cm area and a nominal 1 sec integration time are assumed for fitting purposes. The units of the ASCII spectra should be photons s cm, thus and are multiplied by the energy bin widths and divided by the bin energy. It is also important to set minimum_stat_err variable lower than the input uncertainties to keep isis from modifying them. and fitted with diskbb or cutoffpl model for and respectively. The parameters of the fits are then compared (see Fig. 10) to the original ones in Fig. 1. It is clear that exactly similar parameter tracks are not achieved, but this is mainly driven by low flux values which increase the degeneracy of the model parameters.

Figure 10: The spectral parameters (red points) of the NMF spectra fitted independently for disc and power law components as compared to the parameter values of the simulation (solid lines).

3 Application to GX 339–4

GX 339–4 is a low-mass black hole X-ray binary discovered in the early 1970s (Markert et al., 1973). The mass of the black hole is 5.8 (Hynes et al., 2003; Muñoz-Darias et al., 2008) at a distance of 7–9 kpc (Zdziarski et a., 2004). The inclination is uncertain though likely less than 60 (Cowley et al., 2002). GX 339–4 exhibits recurrent outbursts and is one of the most observed XRBs, thus making it a very suitable object for detecting temporal changes in its X-ray spectra. In addition, the X-ray spectra of GX 339–4 can be usually fit with relatively simple model consisting of an absorbed multicolour disc black body component and/or a power law component with a cutoff, and an iron line (e.g. Dunn et al., 2008). GX 339–4 is often used as the prime example of a source exhibiting hysteresis in its X-ray spectra, i.e. the change from the power law dominated hard state to the disc dominated soft state occurs at a higher luminosity than the change from the soft state back to the hard state in the same outburst (e.g. Miyamoto et al., 1995; Nowak, 1995; Nowak et al., 2002). The outbursts of GX 339–4 also link radio observations to the spectral evolution such that the X-ray flux correlates with the radio flux in the hard state (Hannikainen et al., 1998; Corbel et al., 2000, 2013), the radio flux exhibits strong, transient outbursts before quenching when the source transits to the soft state, and the radio flux is again detected when the source transits back to the hard state but without strong outbursts (Fender et al., 2004). These characteristics make GX 339–4 a good source to probe accretion physics and the disc-jet connection. Other interesting observations include possible detections of the accretion disc in the hard state (Miller et al., 2006; Tomsick et al., 2008; Reis et al., 2010) which have implications on the accretion geometry and compact jet production.

Thus, we selected GX 339–4 as the source to test the best unsupervised spectral decomposition method found in this paper, namely the NMF analysis (see Section 2), because of the amount of data available, the ease of fitting the X-ray spectra with simple models, and the clear hysteresis the source exhibits with implications to the accretion physics.

3.1 Observations

We analysed the archived RXTE observations of GX 339–4 from year 2002 to year 2010 that include both RXTE/PCA and High-Energy X-ray Timing Experiment (RXTE/HEXTE) data totalling to 934 pointings. We also exclude pointings that exhibit large data gaps in the RXTE/HEXTE spectrum. Small data gaps (3 energy channels) are filled by interpolating data from the neighbouring points. The whole data set includes four outbursts from 2002, 2004, 2007 and 2010. A comprehensive look into these outbursts can be found e.g. from Dunn et al. (2008) and Motta et al. (2011) and references therein. Each pointing is individually reduced by the standard method as described in the RXTE cook book using heasoft 6.13. The individual spectra for the analysis is prepared as follows: The RXTE/PCA spectrum is extracted from the top layer of PCU–2, which has been operational for all the selected pointings. The RXTE/HEXTE spectrum is extracted from cluster B for data previous to 14 December 2009 when the cluster stopped rocking and it was permanently left as an off-source pointing position. After that we used cluster A for the source data and estimated the background using cluster B (due to cluster A not rocking and fixed to on-source pointing position). The RXTE/PCA and RXTE/HEXTE spectra are then unfolded and the flux values extracted from 3–25 keV for RXTE/PCA and from 20–200 keV for RXTE/HEXTE. RXTE/HEXTE spectra are then normalised to the RXTE/PCA such that the overlapping energy range from 20–25 keV is the same for both detectors. However, RXTE/PCA data is used in this region in the final spectra.

3.2 Results

Figure 11: Top: The quality of the degrees of factorisation. We chose for the degree of the NMF analysis (see text for more details). Bottom: The individual of the factorisation for each . The inset shows the same plot but for smaller range of to better show components at higher energies. The components for can be attributed to the disc, to the iron line and power law.

We used the exact same method as described in Section 2.4 to analyse the spectral data from GX 339–4. Fig. 11 shows the degree of factorisation in the upper panel for GX 339–4 data. We choose for the degree as it provides substantially better approximation than smaller degrees, but only a slightly worse approximation than larger degrees as was explained in Section 2.4.1. The lower panel of Fig. 11 shows the individual of the factorisation. From these we can see that the corresponds to the disc, corresponds to the iron line and the cutoff power law. The component also shows line-like features in the high energies which arise from the RXTE/HEXTE background lines present in the RXTE/HEXTE spectra after 2010.

As was shown in Section 9, we can form the disc and power law fluxes by adding together the appropriate , in this case and respectively, taking into account the iron line flux in the latter. As was stated in Section 9 these correspond fairly well to the actual spectral component fluxes. Thus, it is possible to follow the evolution of the disc and power law fluxes without doing any spectral fitting to the data.

In addition, it is possible to construct a disc fraction luminosity diagram (DFLD) and a power law fraction luminosity diagram (PFLD) using and . Here we use the PFLD defined as the total flux, , as a function of , which equals to 1 when the total flux is composed only of power law flux, and 0 when the total flux is composed only of disk flux.

If we take the above results as portraying the actual disc and power law components we can form the disc and power law spectra as and respectively, and feed them into isis for spectral fitting. We use the original data errors for fitting both and spectra and do not add any systematic error to the spectra. The spectra were fit with a model phabsdiskbb using an energy range 3–15 keV, and with a model phabscutoffpl using an energy range 10–140 keV, thus leaving out the iron line region for simplicity and to better concentrate on the power law continuum. Fig. 12 shows the results of the modelling plotted on the PFLD. This is done so as to show the evolution of the parameters rather than to provide hard values. As it is not clear how the data errors would be divided into the and components the resulting parameters have bigger errors than when fitted to the whole spectra. Thus, the parameter values in this presentation are more robust when they are changing gradually, and likely does not represent the actual value when they are in a region where the parameter value changes abruptly across the whole parameter range (i.e. a mix of colours in Fig. 12). In addition, parameter values that have their error values equal to the minimum or maximum value allowed are removed.

Figure 12: The change of parameters along the power law fraction luminosity diagram. The upper panels correspond to the disc parameters: normalisation and temperature, the middle and bottom panels correspond to the power law parameters (the power law normalisation is plotted in linear and logarithmic scales in order to distinguish better the parameter evolution for the high and low normalisation regimes).

From Fig. 12 we can see the following:

  1. The power law normalisation increases with the overall luminosity and the power law index increases with the power law fraction, the exception being when the power law fraction is below 0.01. The scatter in the lower part of the hard state appears to be due a changing value of the power law index from 1–1.5.

  2. As the power law fraction starts to decrease in the hard state, the disc normalisation starts to increase.

  3. The cutoff energy decreases with the luminosity in the hard state. Up to the critical luminosity (as discussed above) the cutoff is over 100 keV, but starts to decrease after the critical luminosity down to 30–40 keV.

  4. The difference between the transition fluxes when transferring to the soft state seem to occur from a difference in the disc normalisation and power law normalisation (an indication of a difference could be seen in the cutoff energy as well), while the disc temperature and power law index change in similar fashion for each transition.

  5. The disc gets hotter when transitioning to the soft state and there is an indication of heating up again when transitioning back to the hard state as well.

  6. In the soft state both the disc normalisation and temperature decrease slowly.

These results correspond well to the ones found in previous studies (e.g. in Dunn et al., 2008; Del Santo et al., 2008; Motta et al., 2011; Stiele et al., 2011; Cadolle Bel et al., 2011). In addition, the NMF analysis is able to track the changes of the disc component in the low disc fraction regime. These correspond to the regions where the disc component is weak and not required statistically in the spectral fits that have been performed previously. However, it is possible that the low energy tail of the response matrices that arise from escape and fluorescence peaks is mimicking the disc component in the faint spectral states. But, these peaks are most likely caused by photons that come from the hard spectral component and thus have the same variability producing soft excess to the hard, and not to the soft, factorised component. This shows the strength of the NMF analysis in situations where the disc component is harder to detect by using only spectral fitting. While the fluxes of the disc and power law components can be distinguished fairly well in the NMF analysis (Fig. 9), one has to be careful with the values of spectral components in the low flux regimes as these are most likely degenerate (see Fig. 10 and compare to Fig. 1).

As was found in previous studies, at the soft state the disc luminosity is driven by the -relation mimicking the emission from a black body with radius and temperature . It is usually assumed that the above relation is achieved when the inner disc is located at the innermost stable circular orbit (ISCO), though multiple factors can affect this scaling (see a more complete discussion on the difference between these factors in Salvesen et al. 2013). Fig. 13 shows the luminosity-temperature diagram for a set of selected points with a best-fit line in black, and the whole PFLD in the inset with the selected points in red. The best-fit line follows approximately the relation , which differs from the usual relationship. However, as the RXTE is sensitive only for energies 3 keV and upwards, the missing softer X-ray flux, as well as the absorption, affect the luminosity-temperature diagram as discussed in Dunn et al. (2008) (where their original correlation was and corrected ). Here we are concerned with the disc-temperature scaling in the broad sense and assume that when taking into account properly the data below 3 keV and the effect of absorption the scaling should be close to . From Fig. 13 it can be seen that GX 339–4 exhibits the disc-temperature relation in the disc component right after it starts the transition to the soft state, i.e. the inner disc seems to be at the ISCO already in the beginning of the intermediate state right after the peak of the outburst. This relation breaks in the soft state when the power law fraction achieves values close to 0.3–0.4.

Figure 13: Temperature-luminosity diagram of a selected set of data points with the best-fit line in black. The inset shows the whole power law fraction luminosity diagram with the selected points marked in red. The best-fit line follows approximately , which differs from the expected theoretical relation (see text for possible reasons).

4 Conclusions

In this paper we have demonstrated that unsupervised linear spectral decomposition methods can be used to follow the evolution of distinct spectral components. The non-linearities present in the original spectral components can be taken into account by adding multiple linear components together based on their weights across the spectral energies. These methods can resolve differently varying spectral components given that they present a measurable effect on the fluxes. Of the three methods used to tackle a simulated data set we showed that the non-negative matrix factorisation performs best over principal component analysis and independent component analysis. The non-negative matrix factorisation also has the additional benefit that the resulting factorisations are always positive, thus the components of these factorisations can be used to fit spectral models separately. Tracking the individual spectral parameter variations is not as robust compared to fluxes but they can be tracked reasonably well with good quality spectra and high flux values. This has the advantage over normal spectral fitting where different models fit the same spectra or a specific spectral component is not required statistically in the fits.

We applied the non-negative matrix factorisation to a set of RXTE spectra of the Galactic black hole X-ray binary GX 339–4 that includes four outbursts from the source. We found that five components are required to accurately represent the spectra, and that these can be divided into disc and power law components taking into account the iron line in the latter. Fitting the factorised disc and power law components separately with an appropriate spectral model corroborates the notion that the inner edge of the disc is at the innermost stable circular orbit at the onset of the intermediate state following the peak of the outburst. However, this analysis should be refined by including data below the RXTE soft X-ray limit.

In the future we envision that unsupervised linear spectral decomposition methods will be used in multiple situations involving the detection of separate spectral components. In order to extend these preliminary studies in detecting the accretion disc in the low disc flux states, as seen for example in the hard state, we would need to have access to both data from multiple sources and detectors more sensitive to the softer X-ray bands. In order to detect the spectral component responsible for the quasi-periodic oscillations in XRBs, different timing scales could be tested. Finally, these methods provide an alternative way of detecting the spectral components without performing actual spectral fitting, which might prove to be an invaluable tool when dealing with large datasets.


I thank the the referee for the careful reading of the manuscript and excellent suggestions that improved the paper. I thank Diana Hannikainen and Thomas Maccarone for useful discussions. I gratefully acknowledge support from Emil Aaltonen säätiö. This research has made use of data obtained from the High Energy Astrophysics Science Archive Research center (HEASARC), provided by NASA’s Goddard Space Flight center.


  • Allen et al. (2013) Allen J.T., Hewett P.C., Richardson C.T., Ferland G.J., Baldwin J.A., 2013, MNRAS, 430, 3510
  • Brunet et al. (2004) Brunet J.-P., Tamayo P., Golub T.R., Mesirov J.P., 2004, PNAS, 101, 4164
  • Cadolle Bel et al. (2011) Cadolle Bel, M. et al., 2011, A&A, 534, A119
  • Connolly et al. (1995) Connolly A. J. et al., 1995, AJ, 110, 2655
  • Corbel et al. (2000) Corbel S., Fender R.P., Tzioumis A.K., Nowak M., McIntyre V., Durouchoux P., Sood, R., 2000, A&A, 359, 251
  • Corbel et al. (2013) Corbel S., Coriat M., Brocksopp C., Tzioumis A.K., Fender R.P., Tomsick J.A., Buxton M.M., Bailyn C.D., 2013, MNRAS, 428, 2500
  • Cowley et al. (2002) Cowley A.P., Schmidtke P.C., Hutchings J.B., Crampton D., 2002, AJ, 123, 1741
  • Del Santo et al. (2008) Del Santo M., Malzac J., Jourdain E., Belloni T.M., Ubertini P., 2008, MNRAS, 390, 227
  • Dunn et al. (2008) Dunn R.J.H., Fender R.P., Körding E.G., Cabanac C., Belloni T., 2008, MNRAS, 387, 545
  • Falcke & Biermann (1999) Falcke H., Biermann P. L., 1999, A&A, 342, 49
  • Fender et al. (2004) Fender R.P., Belloni T.M., Gallo E., 2004, MNRAS, 355, 1105
  • Francis et al. (1992) Francis P.J., Hewett P.C., Foltz C.B., Chaffee F.H., 1992, ApJ, 398, 476
  • Garcia et al. (2001) Garcia M.R., McClintock J.E., Narayan R., Callanan P., Barret D., Murray S.S., 2001, ApJ, 553, L47
  • Gaujoux & Seoighe (2010) Gaujoux R., Seoighe C., 2010, BMC Bioinformatics, 11, 367
  • Hannikainen et al. (1998) Hannikainen D.C., Hunstead R.W., Campbell-Wilson D., Sood R.K., 1998, A&A, 337, 460
  • Houck & Denicola (2000) Houck J.C., Denicola L.A., 2000, in Manset N., Veillet C., Crabtree D., eds., ASP Conf. Ser. 216, Astronomical Data Analysis Software and Systems IX, p. 591
  • Hutchins et al. (2008) Hutchins L.N., Murphy S.M., Singh P., Graber J.H., 2008, Bioinformatics, 24(23), 2684
  • Hynes et al. (2003) Hynes R.I., Steeghs D., Casares J., Charles P.A., O’Brien K., 2003, ApJ, 583, L95
  • Hyvärinen (1999) Hyvärinen A., 1999, IEEE Transactions on Neural Networks 10(3), 626
  • Hyvärinen et al. (2001) Hyvärinen A., Karhunen J., Oja E., 2001, Independent Component Analysis, 2nd ed., John Wiley & Sons
  • Ichimaru (1977) Ichimaru S., 1977, ApJ, 214, 840
  • Jolliffe (2002) Jolliffe I.T., 2002, Principal Component Analysis, 2nd ed., Springer-Verlag, New York
  • Koljonen et al. (2013) Koljonen K.I.I., McCollough M.L., Hannikainen D.C., Droulans R., 2013, MNRAS, 429, 1173
  • Lee & Seung (1999) Lee D.D., Seung H.S., 1999, Nat, 401, 788
  • Lee & Seung (2001) Lee D.D., Seung H.S., 2001, in Leen T.K., Dietterich T.G., Tresp V., eds., Advances in neural information processing systems 13: Algorithms for Non-negative Matrix Factorization. MIT Press, Cambridge, MA, p. 556
  • Lu et al. (2006) Lu H., Zhou H., Wang J., Wang T., Dong X., Zhuang Z., Li C., 2006, AJ, 131, 790
  • Malzac et al. (2006) Malzac J. et al., 2006, A&A, 448, 1125
  • Markert et al. (1973) Markert T.H., Canizares C.R., Clark G.W., Lewin W.H.G., Schnopper H.W., Sprott G.F., 1973, ApJ, 184, L67
  • Markoff et al. (2001) Markoff S., Falcke H., Fender R., 2001, A&A, 372, L25
  • Markoff et al. (2005) Markoff S., Nowak M.A., Wilms J., 2005, ApJ, 635, 1203
  • McClintock et al. (2003) McClintock J.E., Narayan R., Garcia M.R., Orosz J.A., Remillard R.A., Murray S.S., 2003, ApJ, 593, 435
  • Miller et al. (2006) Miller J.M., Homan J., Steeghs D., Rupen M., Hunstead R.W., Wijnands R., Charles P.A., Fabian A.C., 2006, ApJ, 653, 525
  • Miyamoto et al. (1995) Miyamoto S., Kitamoto S., Hayashida K., Egoshi W., 1995, ApJ, 442, L13
  • Motta et al. (2011) Motta S., Muñoz-Darias T., Casella P., Belloni T., Homan J., 2011, MNRAS, 418, 2292
  • Muñoz-Darias et al. (2008) Muñoz-Darias T., Casares J., Martínez-Pais I.G., 2008, MNRAS, 385, 2205
  • Narayan & Yi (1994) Narayan R., Yi I., 1994, ApJ, 428, L13
  • Narayan et al. (1997) Narayan R., Garcia M.R., McClintock J.E., 1997, ApJ, 478, L79
  • Nowak (1995) Nowak M.A., 1995, PASP, 107, 1207
  • Nowak et al. (2002) Nowak M.A., Wilms J., Dove J.B., 2002, MNRAS, 332, 856
  • Nowak et al. (2011) Nowak M.A. et al., 2011, ApJ, 728, 13
  • Paatero & Tapper (1994) Paatero P., Tapper U., 1994, Environmetrics, 5(2), 111
  • Parker et al. (2014) Parker M.L., Marinucci A., Brenneman L., Fabian A.C., Kara E., Matt G., Walton D.J., 2014, MNRAS, 437, 721
  • Reis et al. (2010) Reis R.C., Fabian A.C., Miller J.M., 2010, MNRAS, 402, 836
  • Russell et al. (2010) Russell D.M., Maitra D., Dunn R.J.H., Markoff S., 2010, MNRAS, 405, 1759
  • Sajda et al. (2003) Sajda P., Shuyan D., Parra L.C, 2003, in Unser M.A, Aldroubi A., Laine A.F., eds, Proc. SPIE 5207, Wavelets: Applications in Signal and Image Processing X, San Diego, p. 321
  • Salvesen et al. (2013) Salvesen G., Miller J.M., Reis R.C., Begelman M.C., MNRAS, 431, 3510
  • Stiele et al. (2011) Stiele H., Motta S., Muñoz-Darias T., Belloni T.M., 2011, MNRAS, 418, 1746
  • Tomsick et al. (2008) Tomsick J.A. et al., 2008, ApJ, 680, 593
  • Vaughan & Fabian (2004) Vaughan S., Fabian A.C., 2004, MNRAS, 348, 1415
  • Whitney (1983) Whitney C.A., 1983, A&AS, 51, 443
  • Zdziarski et a. (2004) Zdziarski A.A., Gierliński M., Mikołajewska J., Wardziński G., Smith D.M., Harmon B.A., MNRAS, 351, 791
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description