Phase Errors in Diffraction-Limited Imaging: Contrast Limits for Sparse Aperture Masking
Abstract
Bispectrum phase, closure phase and their generalisation to kernel-phase are all independent of pupil-plane phase errors to first-order. This property, when used with Sparse Aperture Masking (SAM) behind adaptive optics, has been used recently in high-contrast observations at or inside the formal diffraction limit of large telescopes. Finding the limitations to these techniques requires an understanding of spatial and temporal third-order phase effects, as well as effects such as time-variable dispersion when coupled with the nonzero bandwidths in real observations. In this paper, formulae describing many of these errors are developed, so that a comparison can be made to the fundamental noise processes of photon and background noise. I show that the current generation of aperture-masking observations of young solar-type stars, taken carefully in excellent observing conditions, are consistent with being limited by temporal phase noise and photon noise. This has relevance for plans to combine pupil-remapping with spatial filtering. Finally, I describe calibration strategies for kernel-phase, including the optimised calibrator weighting as used for LkCa 15, and the restricted kernel-phase POISE technique that avoids explicit dependence on calibrators.
keywords:
techniques: interferometric, instrumentation: adaptive optics, instrumentation: high angular resolution

1 Introduction
The concepts of closure-phase, bispectrum phase (e.g. Hofmann & Weigelt, 1993), self-calibration and now kernel-phase (Martinache, 2010) are well-known as techniques that cancel out many instrumental effects due to pupil-plane phase errors. Despite the very long history of aperture-masking with a focus on fringe visibility amplitude (Fizeau, 1868; Michelson, 1891; Schwarzschild, 1896), it was the use of closure-phase that first enabled image reconstruction from this technique (Baldwin et al., 1986) as well as recent efforts in high-contrast imaging (e.g. Lloyd et al., 2006; Kraus & Ireland, 2012).
A simple explanation of closure-phase comes from a counting argument. For an interferometer with N (sub)apertures, the complex visibilities can be independently measured on each of the N(N−1)/2 baselines formed by each pair of (sub)apertures. An optical aberration consisting of a piston on each of the (sub)apertures amounts to N−1 degrees of freedom in the phase differences, leaving (N−1)(N−2)/2 additional measured quantities, which are the linearly-independent set of closure-phases. A set of observables which are independent of pupil-plane phase form an ideal starting point for precise model-fitting and imaging at the diffraction limit. This argument applies to both redundant and non-redundant pupil geometries, as realised by Martinache (2010). But if phase errors on a pupil are large, a redundant pupil configuration is at a disadvantage, because the pairs of pupil locations that form any given Fourier component may add out of phase and destructively interfere. In the case of observations taken behind adaptive optics, the choice of one technique over the other is not obvious.
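This counting can be checked mechanically. The sketch below (Python with numpy; the 9-hole geometry is a hypothetical stand-in, represented only by hole indices) builds the baseline-to-piston matrix and confirms that pistons occupy N−1 degrees of freedom, leaving (N−1)(N−2)/2 closure quantities:

```python
import numpy as np
from itertools import combinations

n_holes = 9
baselines = list(combinations(range(n_holes), 2))     # N(N-1)/2 = 36 baselines

# Each row maps per-hole pistons to one baseline phase: +1 on one hole, -1 on the other.
A = np.zeros((len(baselines), n_holes))
for row, (j, k) in enumerate(baselines):
    A[row, j], A[row, k] = 1.0, -1.0

piston_dof = np.linalg.matrix_rank(A)                 # N-1 = 8 (global piston is invisible)
n_closure = len(baselines) - piston_dof               # (N-1)(N-2)/2 = 28
print(len(baselines), piston_dof, n_closure)          # 36 8 28
```

The rank deficit of one reflects the unobservable global piston; everything orthogonal to the piston patterns survives as closure information.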
In this paper, I will outline the causes of contrast limitations in the aperture-masking interferometry and kernel-phase techniques, and methods to maximise contrast. In Section 2 the main causes of kernel-phase errors will be outlined. In Section 3 I will describe why the statistical correlations between closure-phases mean that kernel-phases are preferred as a primary observable, and will compare the contrast limits achievable by different pupil geometries. In Section 4.1 I will describe standard closure-phase calibration and its limitations, in Section 4.2 I will describe the calibration strategy as used in Kraus & Ireland (2012) to maximise contrast in aperture-masking interferometry observations, and in Section 4.3 I will describe the simpler POISE calibration strategy. In Section 5 I will conclude and outline the key areas where further research is needed.
1.1 Kernel-Phase
The definition of kernel-phase as used in this paper will be slightly simplified from the definition of Martinache (2010), as we will avoid the use of the "redundancy" matrix R. To first-order in pupil-plane phase (i.e. with a nearly flat wavefront), we can write the observed phase Φ in the Fourier transform of an image as:
\Phi = \Phi_o + A\,\phi   (1)
where φ is the pupil-plane phase and Φ_o is the phase of the Fourier transform of the object. These are represented as vectors where each vector element is one discrete point in the model pupil plane or the image discrete Fourier transform. The matrix A encodes the information about which parts of the pupil form each Fourier component. For example, a non-redundant baseline formed by two discrete pupil components only would have a +1 and a −1 in that row of A, with all other elements taking the value 0. This matrix is described in detail in Martinache (2010). Using singular value decomposition, we then find a matrix K, the kernel of A, such that K A = 0. By choosing K such that its number of nonzero rows is equal to its rank, this matrix enables us to project the Fourier phases onto a subspace, which we will call the kernel-phases θ = K Φ. On this subspace, the observables are not affected by pupil-plane phase errors at first-order:
K\,\Phi = K\,\Phi_o   (2)
A model of the object can therefore be directly compared to the observed kernel-phases by computing the Fourier transform and multiplying by the matrix K. For all reasonable 2-dimensional pupils, the rank of K is at least half the length of Φ, meaning that at least half the object Fourier-phase information is preserved when transforming from Fourier-phase to kernel-phase.
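As a concrete sketch of this construction (a toy 5-hole non-redundant geometry; the matrix A below contains only the ±1 entries described above), K can be found from the left-singular vectors of A with zero singular value, after which it annihilates any piston pattern:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n_holes = 5
pairs = list(combinations(range(n_holes), 2))          # 10 baselines
A = np.zeros((len(pairs), n_holes))
for r, (j, k) in enumerate(pairs):
    A[r, j], A[r, k] = 1.0, -1.0

# Rows of K span the left null space of A, found from the SVD.
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))                          # N-1 = 4
K = U[:, rank:].T                                      # (N-1)(N-2)/2 = 6 kernel phases

# First-order model: observed phase = object phase + A @ pistons (Equation 1).
phi_obj = rng.normal(size=len(pairs))
pistons = rng.normal(size=n_holes)
phase_obs = phi_obj + A @ pistons
# K @ phase_obs equals K @ phi_obj: pistons cancel (Equation 2).
```

The number of nonzero rows of K equals the rank deficit of A, matching the closure-phase counting argument of the introduction.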
2 Causes of Kernel-Phase Errors
There are three broad classes of kernel-phase errors: those that vary rapidly, approximating white noise in a sequence of exposures (random errors), those that are static throughout an observing run and can therefore be calibrated by observation of unresolved calibrator stars (static errors) and those which vary from one target to another (calibration errors). Calibration errors include quasi-static errors with a time variability measured in minutes or hours, as well as errors that depend on e.g. the sky position or the spectrum of the source observed. The goal of any combination of observing technique and analysis strategy is to both minimise the random errors, and to develop a calibration strategy where residual calibration errors are smaller than typical random errors. The following sections include error causes that could manifest themselves as one or several of these error classes.
2.1 General Pupil-Plane Phase Errors
We will examine first an abstract representation of pupil-plane phase errors that could cause random, calibration or static errors. We consider a closing triangle containing apertures a, b and c, as depicted in Figure 1. Each aperture has the same size and shape, and each baseline 1, 2 and 3 has data taken at the same time. That is, there are equivalent coordinate systems describing apertures a, b and c, centered on each aperture. This means that the visibility on each baseline is formed by the incoherent integral of visibilities arising from common spatio-temporal coordinates in subapertures a, b and c.
We will assign the symbols φ_a, φ_b and φ_c to the phase in subapertures a, b and c, the symbols ψ_1, ψ_2 and ψ_3 to the phase on baselines 1, 2 and 3 respectively, and will neglect amplitude variations (i.e. scintillation). The complex visibilities are then formed by:
V_1 = \overline{e^{i\psi_1}}, \quad \psi_1 = \phi_a - \phi_b   (3)
where the bar represents an average over the spatio-temporal coordinates corresponding to each aperture. This can be expanded to third-order in phase to:
V_1 \approx 1 + i\overline{\psi_1} - \tfrac{1}{2}\overline{\psi_1^2} - \tfrac{i}{6}\overline{\psi_1^3}   (4)
with similar expressions for V_2 and V_3. The bispectrum is given by the product of these three visibilities, which can be again expanded to third-order in phase:
V_1 V_2 V_3 = \overline{e^{i\psi_1}}\;\overline{e^{i\psi_2}}\;\overline{e^{i\psi_3}}   (5)
= e^{i(\overline{\psi_1}+\overline{\psi_2}+\overline{\psi_3})}\;\overline{e^{i\tilde\psi_1}}\;\overline{e^{i\tilde\psi_2}}\;\overline{e^{i\tilde\psi_3}}   (6)
\approx 1 - \tfrac{1}{2}\left(\overline{\tilde\psi_1^2}+\overline{\tilde\psi_2^2}+\overline{\tilde\psi_3^2}\right) - \tfrac{i}{6}\left(\overline{\tilde\psi_1^3}+\overline{\tilde\psi_2^3}+\overline{\tilde\psi_3^3}\right)   (7)
where we have considerably simplified the expansion by using the fact that the mean baseline phases sum to zero around a closing triangle, and by introducing the piston-corrected phases:
\tilde\psi_1 = \psi_1 - \overline{\psi_1}   (8)
\tilde\psi_2 = \psi_2 - \overline{\psi_2}   (9)
\tilde\psi_3 = \psi_3 - \overline{\psi_3}   (10)
A more complete derivation of this expansion is given in Appendix A. The closure phase is then most simply approximated by taking the leading terms in the real (0th order) and imaginary (3rd order) components of the bispectrum, giving CP \approx -\tfrac{1}{6}\left(\overline{\tilde\psi_1^3}+\overline{\tilde\psi_2^3}+\overline{\tilde\psi_3^3}\right).
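This third-order behaviour can be checked by Monte-Carlo: for independent Gaussian phase samples within each subaperture (an idealisation of the spatio-temporal average), the measured bispectrum phase should track minus one sixth of the summed third central moments of the piston-corrected baseline phases. A minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, trials, n = 0.1, 400, 2000          # rms phase (rad), trials, samples per aperture

a, b, c = rng.normal(0.0, sigma, size=(3, trials, n))
psi = np.stack([a - b, b - c, c - a])      # baseline phases around the closing triangle

V = np.exp(1j * psi).mean(axis=2)          # incoherently averaged visibilities, (3, trials)
closures = np.angle(V.prod(axis=0))        # measured closure phases

# Third-order prediction: -1/6 of the summed third central moments of the
# piston-corrected (mean-subtracted) baseline phases.
tpsi = psi - psi.mean(axis=2, keepdims=True)
preds = -(tpsi**3).mean(axis=2).sum(axis=0) / 6.0
```

For small rms phase the two quantities are tightly correlated trial-by-trial, with residual differences entering only at higher order.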
It is also worthwhile briefly considering the effects of averaging the visibilities for baselines 1, 2 and 3 over different spaces. This could be caused by differing subaperture shapes in conventional aperture-masking interferometry (amounting to non-closing triangles), or by disjoint integration times as found in other forms of interferometry. In this case, the leading terms in the closure-phase errors become first-order rather than third-order in pupil-plane phase. Clearly, this is something to be avoided at considerable effort in the case of high-contrast aperture-masking. The pupil "shape" can also be thought of as the pupil-plane amplitude within each subaperture. Where amplitude errors are taken into account, these closure-phase errors then become second-order, i.e. first-order in phase and first-order in amplitude, and could plausibly be the leading term.
2.2 Temporal Phase Errors
Our first application of Equation 7 to closure-phase errors is rapid temporal effects, which cause a random kernel-phase error. There are two key regimes that temporal errors operate in behind an AO system. Either exposure times are comparable to or shorter than the inverse of the AO system bandwidth (the short-exposure regime) or exposure times are significantly longer than these timescales (the long-exposure regime). Given typical coherence times at 2.2 microns or shorter wavelengths of 50 ms, and typical AO system bandwidths in the range 10-100 Hz, exposure times longer than 100 ms in the near-infrared are in the long-exposure regime.
In the long-exposure regime, we can make the approximation that piston noise is white up to some cutoff frequency f_c. This is not unrealistic because, in the frozen turbulence approximation, the atmosphere has an amplitude spectrum that falls with frequency, while the error signal from a Proportional-Integral-Differential (PID) controller in the mid-frequency range, where the proportional term dominates, gives residual errors proportional to the input signal amplitude multiplied by the frequency. The resultant error amplitude is then only a weak function of frequency, up to the servo loop cutoff. At this cutoff, three independent phenomena all tend to cut off the error spectrum rapidly: the atmospheric amplitude spectrum, the rapidly lowering gain of the servo approaching its Nyquist sampling frequency, and effects of spatial filtering.
We will now make a second set of approximations by assuming that the phase piston on each subaperture making up a closing triangle is uncorrelated and has identical phase noise σ_φ. This may not be reasonable for some AO systems (e.g. if tip/tilt errors dominate due to tip/tilt mirror bandwidth) but, as this depends on reconstructor and wavefront sensor details, it is a good first approximation.
An exposure of total time T can then be split into N = f_c T subexposures, each of which has independent phase noise, so that in each subexposure we have pupil-plane subaperture piston phases given by normal distributions:
\phi_a \sim N(0, \sigma_\phi^2)   (11)
\phi_b \sim N(0, \sigma_\phi^2)   (12)
\phi_c \sim N(0, \sigma_\phi^2)   (13)
Applying Equation 7 to this phase noise distribution for the N subexposures gives the standard deviation of closure-phase (see Appendix B for a derivation):
\sigma_{CP} \approx \sqrt{3}\,\sigma_\phi^3\,(f_c T)^{-1/2}   (14)
In the short-exposure regime, we are dominated by atmospheric piston, as in the case of aperture-masking interferometry without adaptive optics (e.g. Tuthill et al. 2000). In this regime, for typical exposure times less than 20 ms at 2.2 μm wavelength, or 50 ms at 4 μm wavelength without adaptive optics or fringe-tracking, we can still consider phase errors at third-order with reasonable accuracy. By evaluating Equation 7 numerically based on Kolmogorov turbulence, we arrive at:
\sigma_{CP} \propto (t/t_0)^{5/2}   (15)

which is valid for exposure times t shorter than the coherence time t_0. This kind of relationship also has relevance to long-baseline interferometry in the case of measurements where visibilities are measured simultaneously. Examples of this are MIRC (Monnier et al., 2006) or PAVO (Ireland et al., 2008) at the CHARA array. This relationship does not apply to scanning beam combiners, where fringes can be recorded non-simultaneously depending on group delay tracking accuracy.
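The long-exposure behaviour can be illustrated with a Monte-Carlo sketch: three subapertures with independent Gaussian piston noise, averaged over N subexposures. The closure-phase scatter should fall as N^{-1/2} and rise roughly as the cube of the rms piston phase (the cube law is only approximate once the phase noise approaches a radian):

```python
import numpy as np

rng = np.random.default_rng(2)

def closure_phase_std(sigma, n_sub, trials=3000):
    """Scatter of the closure phase for three subapertures with independent
    Gaussian piston noise, incoherently averaged over n_sub subexposures."""
    a, b, c = rng.normal(0.0, sigma, size=(3, trials, n_sub))
    V1 = np.exp(1j * (a - b)).mean(axis=1)
    V2 = np.exp(1j * (b - c)).mean(axis=1)
    V3 = np.exp(1j * (c - a)).mean(axis=1)
    return np.angle(V1 * V2 * V3).std()

s_base = closure_phase_std(0.15, 64)
s_more = closure_phase_std(0.15, 256)      # 4x the subexposures: scatter halves
s_half = closure_phase_std(0.075, 64)      # half the rms phase: ~8x smaller scatter
```

The absolute level is consistent with a √3 σ_φ³ N^{-1/2} law for small σ_φ, which is the form of the long-exposure scaling discussed above.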
2.3 Spatial Closure-Phase Errors
In this section, we will examine how wavefront phase corrugations affect closure or kernel phases, occurring as random, calibration and static errors. Calibration errors occur when there are slowly time-variable spatial aberrations (often called quasi-static speckles). To most easily compare kernel-phase to closure-phase, we adopt a factor of 1/√3 scaling to the closure-phase, so that adding the three baseline phases is equivalent to multiplying by a unit vector (e.g. one of the orthonormal columns of the matrix from Martinache 2010).
Figure 2 shows a comparison between simulated sparse aperture-masking and kernel-phase data analysis for a variety of aberration spatial frequencies and aberration amplitudes. For each amplitude and spatial frequency, the position angle of a sinusoidal aberration was randomly varied and the overall RMS kernel-phase computed. It can be seen that although both kernel-phase and closure-phase appear equivalent to first-order, they have quite different responses to high-order pupil-plane errors. The spatial filtering of an aperture-mask means that it can be effectively used at much lower instantaneous Strehl ratios than unobstructed-pupil kernel-phase, but in a high-Strehl regime, kernel-phase is in principle superior. For the 0.35 radians RMS phase error case (right-hand figure), Equation 7 predicts closure-phases approximately 2 times lower than the simulation, possibly due to Fourier sampling and windowing effects in the sparse aperture-masking pipeline used, and possibly due to effects higher than 3rd order in pupil-plane phase. For very high instantaneous Strehls S, kernel-phase in both geometries is expected to scale as the cube of the pupil-plane phase error, which is (−ln S)^{3/2} in the Maréchal approximation.
A comparison between imaging with an unobstructed aperture and with sparse aperture masks is complicated somewhat by the ability to window data, which smooths over high spatial frequency aberrations. This gives a further advantage in principle to an unobstructed aperture or a mask with large holes where the interferogram has a relatively small spatial extent. An example of a regime where fine spatial scale aberrations may dominate phase errors post-calibration is when aberrated pupil-plane elements or masks shift due to flexure effects.
2.4 Flat Field Errors
In sparse aperture masking, many pixels are used to record fringes from objects with intrinsically small spatial extents. If target and calibrator objects are not acquired on the same pixels, then the effect of flat field errors is to add random phase errors across the Fourier plane. These random errors are only static if alignment is perfect between target and calibrator star observations – otherwise flat field errors become a calibration error. A flat field error can be modelled as multiplication in the image plane by a function that is 1.0 everywhere plus white noise with standard deviation σ_f. A typical value of σ_f is 10⁻³, arising from a series of flat field exposures with a total of 10⁶ photoelectrons per pixel. Multiplication by this flat is equivalent to convolution in the Fourier domain, which spreads the power from the zero and near-zero spatial frequency components over the full Fourier plane. Clearly phase errors will then be proportional to σ_f and inversely proportional to visibility. Numerical simulations give the following relationship for closure-phase in sparse aperture-masking observations:
\sigma_{CP} \approx 0.3\,\sigma_f/V   (16)
where V is the fringe visibility, referenced to a perfect-Strehl interferogram of a point source. The constant of 0.3 varies between approximately 0.2 and 0.3 for different bandpass filters and aperture masks. To ensure that these errors are less than 10⁻³ radians with typical visibilities of 0.3, we need σ_f ≲ 10⁻³, meaning at least 10⁶ photons per pixel recorded when taking flat fields.
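A 1-D toy model (a single cosine fringe multiplied by a white-noise flat field; the pixel count and fringe frequency below are arbitrary choices) reproduces the proportionality of the Fourier-phase error to σ_f and to 1/V:

```python
import numpy as np

rng = np.random.default_rng(3)
n_pix, k = 256, 10                     # pixels per frame, fringe frequency (cycles/frame)

def fringe_phase_scatter(V, sigma_flat, trials=2000):
    """Phase scatter of one Fourier component of a fringe of visibility V
    when each frame is multiplied by an imperfect flat field."""
    x = np.arange(n_pix)
    frame = 1.0 + V * np.cos(2 * np.pi * k * x / n_pix)
    flats = 1.0 + rng.normal(0.0, sigma_flat, size=(trials, n_pix))
    spectra = np.fft.fft(frame * flats, axis=1)
    return np.angle(spectra[:, k]).std()

s1 = fringe_phase_scatter(0.3, 1e-2)
s2 = fringe_phase_scatter(0.6, 1e-2)   # doubling V roughly halves the phase error
s3 = fringe_phase_scatter(0.3, 2e-2)   # doubling sigma_flat doubles it
```

The 0.2-0.3 prefactor of the full 2-D masking case additionally depends on how the low-spatial-frequency power is distributed, which this toy model does not attempt to capture.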
2.5 Bad Pixels
The existence of bad pixels on an imaging array can often destroy sensitivity in traditional imaging over a small portion of the field of view. Like flat field errors, incorrectly accounting for bad pixels can cause significant calibration errors. By spreading the information over many pixels, it may seem at first glance that bad pixels would always do significant harm to the information content in aperture-masking observations. However, the limited Fourier support of this kind of observation, as long as it is better than Nyquist sampled, means that bad pixels can be very effectively corrected. In simulations, the algorithm below has proved effective at contrasts beyond 10 for arrays far worse than those found at telescopes where aperture masks are installed, meaning that if properly corrected, bad pixels are not a cause of kernel-phase errors.
The principle of this bad pixel correction algorithm is to assign values to the bad pixels so that the power in the Fourier domain outside the region of support permitted by the pupil geometry is minimised. We will call this region of the Fourier plane the zero region Z. We can turn this problem into a linear one by realising that the Fourier components corresponding to the set of bad pixel coordinates span a subspace of the components in Z, and we can find a vector b of bad pixel offsets to subtract so that the image Fourier transform on this subspace is identically zero.
The first step in this process is to create the matrix B which maps the bad pixel values onto Z. The measured values z in the Fourier-plane region Z are then modelled as:
\mathbf{z} = B\,\mathbf{b} + \mathbf{n}   (17)
with n being the remaining Fourier-plane noise. The bad pixel adjustments are then found using the Moore-Penrose pseudoinverse of B:
\hat{\mathbf{b}} = B^{+}\mathbf{z}   (18)
B^{+} = (B^{H}B)^{-1}B^{H}   (19)
The Moore-Penrose pseudoinverse can also be found by other methods such as singular value decomposition rather than direct computation of an inverse as in Equation 19, but this method suffices for a relatively small number of bad pixels. Although this algorithm is very quick (the matrix B⁺ is precomputed), the bad pixel correction of Equation 18 does have to be applied for every frame, with the computed values subtracted off each frame. It can also be used to correct for saturated pixels at the core of a PSF, pixels affected by transient events such as cosmic rays, or an acquisition error where a small portion of the interferogram is truncated by the detector edge.
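A minimal sketch of the algorithm, assuming a synthetic band-limited 32×32 "interferogram" whose Fourier power is confined to a low-frequency disc (a stand-in for the pupil's Fourier support; all sizes and thresholds here are arbitrary). The matrix B holds the zero-region response of a unit delta at each bad pixel, and the least-squares (Moore-Penrose) solution recovers the corruptions exactly in this noiseless case:

```python
import numpy as np

rng = np.random.default_rng(4)
n, r_support = 32, 5.0

# Synthetic band-limited image: Fourier power only inside a low-frequency disc.
fy, fx = np.meshgrid(np.fft.fftfreq(n) * n, np.fft.fftfreq(n) * n, indexing='ij')
inside = fx**2 + fy**2 <= r_support**2
spec = np.where(inside, rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)), 0.0)
img_true = np.fft.ifft2(spec).real             # real image, still band-limited

# Corrupt ten distinct pixels.
bad = rng.choice(n * n, size=10, replace=False)
ys, xs = np.unravel_index(bad, (n, n))
img = img_true.copy()
img[ys, xs] += rng.normal(0.0, 5.0, size=10)

# Zero region Z: frequencies the pupil cannot produce.
Z = ~inside
# B: zero-region Fourier response of a unit delta at each bad pixel (Equation 17).
B = np.exp(-2j * np.pi * (np.outer(fx[Z], xs) + np.outer(fy[Z], ys)) / n)
z = np.fft.fft2(img)[Z]
b_hat = np.linalg.lstsq(B, z, rcond=None)[0]   # Moore-Penrose solution (Equation 18)

img_fixed = img.copy()
img_fixed[ys, xs] -= b_hat.real                # subtract the recovered corruptions
```

Because the true image contributes nothing to Z, the zero-region data are a pure linear function of the corruptions, and the recovery is exact up to numerical precision; with noise present, the same least-squares solution is simply the minimum-power estimate.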
2.6 Dispersion and Wavelength-Dependent Phase Errors
Kernel-phase observations are often made in a broadband filter where different wavelengths are affected by both the atmosphere and optics in different ways. This causes a static kernel-phase error, which can become a calibration error unless observing conditions and spectrum are matched between target and calibrator observations. A general analysis of these errors is particularly difficult and beyond the scope of this paper, because the definition of kernel-phase is inherently monochromatic. However, we can put some limits on when this effect might become important, and the order of magnitude of the effect. We write the air refractive index difference between the blue and red edges of a filter as Δn, and the spectral difference between a target and calibrator as Δs, covering a fraction f of the bandpass. Assume both objects are observed at the same airmass. The image Fourier-plane phase error arising from this difference is:
\Phi_{\rm err} \approx f\,\Delta s\,\sin(2\pi\,\mathbf{b}\cdot\Delta\theta/\lambda)   (20)
where, for a baseline vector b, the change in angle on the sky between the long and short wavelength parts of the filter is:

\Delta\theta = \Delta n \tan z   (21)
Here z is the zenith distance angle, and this formula is only valid for airmasses less than approximately 3. The kernel-phase signature of this dispersion effect is very similar to that of a close companion of separation Δθ and flux ratio f Δs. For values of Δθ greater than about λ/b, the kernel-phase error is of the same magnitude as f Δs, and for smaller values of Δθ, the kernel-phase error scales down proportionally with Δθ (e.g. see Equation 5 of Le Bouquin & Absil, 2012). As an example, observing in the full H band at a zenith angle of 45 degrees from an altitude of 2600 m gives a Δθ of tens of milliarcseconds, which is larger than λ/b for the baselines of 8 m class telescopes. A 10% difference in the spectrum over the long-wavelength 10% of the H bandpass would then give kernel-phase errors of order 10⁻² radians.
The effect of observing at different airmasses is much more complex because, for flat spectra, dispersion does not give a nonzero kernel-phase. In general, it may be a nonlinear interaction between pupil-plane aberrations and dispersion that dominates the calibration errors.
2.7 Photon, Background and Readout Noise
Finally, we consider the fundamental limitation of random errors caused by photon, background and readout noise. Where the fringe visibility is V, the total number of photons collected in an interferogram is N_p, the number of background photons is N_b and the number of holes in the aperture mask is N_h, the closure-phase error due to photon (shot) noise is:
\sigma_{CP} = \frac{\sqrt{6}}{2}\,\frac{N_h}{V N_p}\sqrt{N_p + N_b + N_{\rm pix}\sigma_r^2}   (22)
The factor of √6/2 includes a factor of √3 due to photon noise from the three independent baselines making up the closure-phase, as well as a factor of 1/√2 due to the shot noise power at any nonzero spatial frequency being split equally between the real and imaginary parts. The readout noise in photon units is σ_r and the number of pixels is N_pix. The effect of both readout and background noise is affected by the size of the window function used prior to making the Fourier transform to compute the visibilities, and this effect can be minimised if fringes are directly fit to the data (e.g. the SAMP pipeline of Lacour et al. 2011).
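The 1/(V√N_p) photon-noise behaviour underlying this formula can be checked with a Poisson Monte-Carlo on a 1-D toy fringe (a single Fourier component; the frame size and fringe frequency below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
n_pix, k = 256, 12                         # pixels per frame, fringe frequency

def photon_phase_scatter(n_phot, V, trials=3000):
    """Scatter of the phase of one Fourier component of a Poisson-sampled fringe."""
    x = np.arange(n_pix)
    rate = (n_phot / n_pix) * (1.0 + V * np.cos(2 * np.pi * k * x / n_pix))
    frames = rng.poisson(rate, size=(trials, n_pix))
    return np.angle(np.fft.fft(frames, axis=1)[:, k]).std()

s = photon_phase_scatter(1e4, 0.5)         # close to sqrt(2/N_p)/V per baseline
s4 = photon_phase_scatter(4e4, 0.5)        # 4x the photons: scatter halves
```

The single-baseline scatter follows √(2/N_p)/V because the fringe amplitude in the Fourier transform is V N_p/2 while the shot noise contributes N_p/2 of variance to the quadrature perpendicular to the fringe phasor; the closure-phase formula then adds the √3 and hole-splitting factors discussed above.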
2.8 Dominant Error Terms
The most common kind of kernel-phase data taken so far has been sparse aperture-masking behind natural guide star adaptive optics, particularly at 1.5-2.4 micron wavelengths, so we will consider this regime first. We will also consider that adequate flat fields have been taken and bad pixels properly corrected. The adaptive optics system only locks when there are at least 100 visible photons per Shack-Hartmann lenslet in 0.01 s, or 10⁶ photons in 100 s. With a similar near-infrared and visible photon rate, and a similar masking subaperture size to a Shack-Hartmann lenslet size, Equation 22 would predict a 0.4 degree photon-limited closure-phase uncertainty for a 100 s integration and a 9-hole aperture mask.
We can use Equation 14 to predict the effect of temporal phase errors: in particularly good seeing, σ_φ could be as low as 0.3 radians (giving a temporal phase-noise limited Strehl of 0.9) and f_c could have a value of 10 Hz. This would give a temporal phase-noise component to the closure-phase uncertainty of 0.1 degrees. Perhaps not surprisingly given how much light an aperture-mask blocks, photon noise would dominate in this regime. However, for less than ideal seeing conditions and targets which are brighter in the infrared, the temporal phase noise dominates over photon noise. A characteristic "typical seeing" predicted closure-phase error for 0.5 radians RMS pupil-plane phase error is 0.5 degrees for a 100 s integration.
The closure-phase uncertainties predicted here are similar to the typical closure-phase uncertainties computed from the standard error of the mean of individual observation sets in survey papers such as Kraus et al. (2008). However, it is certainly true that the residuals when subtracting closure-phases from two point-sources are not always statistically consistent with these standard errors. This kind of residual is often called a calibration error, where the nonzero closure-phases described in Section 2.3 are not fully corrected by observations of a calibrator star. Typical uncalibrated closure-phases from the Keck 9-hole aperture mask are 3.5 degrees in H and K bands (CH4S and Kp filters), and 7 degrees in L band (Lp filter). These nonzero closure-phases are consistent with quasi-static spatial aberrations of 0.5 radians amplitude in the CH4S and Kp filters (e.g. Figure 2) and atmospheric dispersion in the Lp filter (Section 2.6). A small change in the cause of these nonzero closure-phases causes miscalibrations that can be larger than the temporal (subaperture piston) phase and photon noise effects.
3 Closure-Phase Correlations
One of the more confusing aspects of aperture-masking data analysis is knowing what to do with a linearly dependent set of closure-phases. As described in Kulkarni (1989), these phases may be linearly independent in the case of very low signal-to-noise per exposure when the bispectrum is averaged, but in the high signal-to-noise limit considered here, with N_h nonredundant subapertures, there are N_h(N_h−1)(N_h−2)/6 closure-phases but only (N_h−1)(N_h−2)/2 linearly independent closure-phases. A redundant aperture has an even higher degree of correlation of the bispectrum phases.
Simply choosing an arbitrary independent set of closure-phases for the purpose of modelling is not possible without a full consideration of the covariance matrix. If one considers only the simplest forms of closure-phase errors, namely that due to readout noise, then the problem of modelling the covariance matrix is not difficult. However, there are many other kinds of errors that can cause correlations between closure-phase errors.
Previous work has either gone to great lengths to diagonalise the measured covariance matrix of closure-phase (e.g. Kraus et al., 2008) or has made an approximate scaling of fitting errors to account for the closure-phase correlations (e.g. Hinkley et al., 2011). The difficulty in any approach based on real data is that the sample covariance matrix must be modelled, and cannot in general be measured completely from the data. The reason for this is that where there are fewer data frames taken than independent closure-phases, the sample covariance matrix is necessarily singular.
These difficulties are all avoided if, rather than considering closure-phases as a primary observable, the linear combinations that make the kernel-phases are seen as the primary observables. This has the added benefits of being able to extend the aperture-mask technique to considering baselines within each subaperture (consequently extending the usable field of view) and using the same language for all adaptive optics image analysis that is independent of pupil-plane phase to first order.
Of course, there are many different ways to form a set of kernel-phases from a set of closure-phases, or indeed a linearly independent set of kernel-phases. Martinache (2010) suggested that kernel-phases should be constructed so that only orthonormal linear combinations of Fourier phase are considered. However, this does not guarantee statistical independence. In the simplest case of a centrally-concentrated image limited by photon noise, the spatial concentration of the image variance means that neighbouring Fourier components have highly correlated phase errors. This amounts to a contrast loss when considering sigma excursions of kernel-phase because, just like aperture-masking, the kernel-phase technique as described by Martinache (2010) has a nearly flat contrast limit curve beyond separations of ~λ/D. However, standard imaging can have increasing contrasts as separations increase beyond the PSF centre. This apparent loss in sensitivity can be regained by properly considering the correlation between Fourier phases, as shown below.
3.1 Statistically-Independent Kernel Phase
Following from Section 1.1, we will define the matrix K_o that transforms the Fourier phase vector Φ to the vector of kernel-phases θ. This is an N_K by N_Φ matrix, where N_K is the number of kernel-phases and N_Φ is the number of Fourier phases. The subscript o indicates that this matrix produces an orthonormal set of phase linear combinations. We can compute the sample covariance matrix C_K of the kernel phases either directly or from the sample covariance matrix of the Fourier phases. This matrix can be diagonalised by the finite-dimensional spectral theorem:
C_K = U\,D\,U^{T}   (23)
The matrix U is then a unitary matrix which allows us to construct a set of statistically independent kernel phases based on a new kernel-phase operator K_s:
K_s = U^{T} K_o   (24)
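A sketch of this diagonalisation with numpy, using a random orthonormal stand-in for K_o and an artificial correlated Fourier-phase covariance (both hypothetical; in practice K_o comes from the pupil model and the covariance from the data). The transformed operator leaves a diagonal kernel-phase covariance and, since its rows are linear combinations of the rows of K_o, it retains the kernel property:

```python
import numpy as np

rng = np.random.default_rng(6)
n_phi, n_k = 20, 8

K_o = np.linalg.qr(rng.normal(size=(n_phi, n_k)))[0].T   # orthonormal rows (stand-in)

M = rng.normal(size=(n_phi, n_phi))
C_phi = M @ M.T / n_phi                    # correlated Fourier-phase covariance

C_K = K_o @ C_phi @ K_o.T                  # covariance of the orthonormal kernel phases
w, U = np.linalg.eigh(C_K)                 # spectral theorem: C_K = U diag(w) U^T
K_s = U.T @ K_o                            # statistically independent kernel phases

C_s = K_s @ C_phi @ K_s.T                  # now diagonal, equal to diag(w)
```

The same operation can be applied to a sample covariance matrix, with the usual caveat that the sample estimate must be full rank (or regularised) for the independence to be meaningful.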
As an example of the utility of this approach, I have simulated the effects of photon noise on kernel-phase contrast limits, as shown in Figure 3. The contrast standard deviation was estimated by first estimating the standard deviation of each kernel-phase (i.e. neglecting covariances), forming a vector σ, then computing the contrast error using standard formulae for weighted averages:
c = \left(\sum_i \theta_i m_i/\sigma_i^2\right) / \left(\sum_i m_i^2/\sigma_i^2\right)   (25)
\sigma_c = \left(\sum_i m_i^2/\sigma_i^2\right)^{-1/2}   (26)
Here m is the model kernel-phase divided by the contrast in the high-contrast limit; e.g. for a 100:1 brightness ratio companion, the kernel-phase would be well approximated by 0.01 m. It is clear that the contrast achieved by considering statistically independent kernel-phases defined by K_s is superior to the contrast achieved by orthonormal kernel-phases defined by K_o, for companions away from the PSF core.
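The weighted-average contrast fit can be written in a few lines (a hypothetical model vector m; independent errors assumed, as in the standard single-parameter weighted least-squares formulae):

```python
import numpy as np

def fit_contrast(theta, m, sigma):
    """Weighted least-squares fit of theta = c * m for a single contrast c,
    returning the estimate and its 1-sigma uncertainty."""
    w = m / sigma**2
    c_hat = np.sum(w * theta) / np.sum(w * m)
    sigma_c = 1.0 / np.sqrt(np.sum((m / sigma) ** 2))
    return c_hat, sigma_c

m = np.array([0.5, 1.0, -0.8, 0.2])        # model kernel-phase per unit contrast
theta = 0.01 * m                           # noiseless 100:1 companion signal
c_hat, sigma_c = fit_contrast(theta, m, np.ones_like(m))   # c_hat ≈ 0.01
```

Applying this with the diagonalised (statistically independent) kernel-phases makes the independence assumption exact rather than approximate.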
4 Calibration Strategies
For the situation where phase errors are mostly random, calibration is not required. This has been the case for faint aperture-mask observations with a laser guide star system, where obtaining calibration observations has a very significant observing time cost (e.g. Dupuy et al., 2009). When static phase errors dominate and random errors are larger than calibration errors, only a single suitable calibrator observation is required. A more typical situation in sparse aperture masking has been where random errors are small compared to calibration errors, and the choice and weight assigned to calibrator observations is critical in achieving the lowest possible model fit residuals and the highest contrasts. In this regime there is an obvious danger: where calibrators are chosen to minimise the calibrated kernel-phase, this biases the kernel-phase away from a detection, and may result in deeper contrast limits being quoted for a non-detection than is justified by the data. This problem is shared with the LOCI algorithm (Lafrenière et al., 2007).
4.1 Nearest Neighbour Calibration
The simplest calibration technique is to subtract the kernel-phases of a calibrator observed closest to the target in time or space. A small extension to this technique (e.g. Evans et al., 2012) is to use the average of several calibrators observed nearby in time, rejecting outlier calibrator observations. Outliers are most easily rejected by looking for calibrators that, when used to calibrate the target, give spuriously large closure-phases. For N_c calibrators, this amounts to calibrator weightings where each weight is either 0 or 1/N_u, with N_u the number of calibrators used. There are, however, several weaknesses to this technique:

1. With small numbers of calibrator observations, it is difficult to avoid subjectivity in the choice to reject particular calibrators.

2. For particularly noisy calibrator observations and small systematic kernel-phases, this process only adds noise.

3. All calibrators are weighted evenly, when the optimal weighting of individual calibrators may even be negative.

4. Any astrophysical structure in the calibrators, e.g. undetected faint companions, contributes a spurious signal to the final calibrated data.
The third point may not be obvious, and is illustrated in Figure 4. Whenever the calibrators all lie on one side of the target in some space, optimal calibration may extrapolate past the positions of the calibrators to the target. This space may be real (such as zenith distance, which produces nonzero kernel-phases due to dispersion) or a one-dimensional parameterisation of a hidden variable describing a time-variable aberration. This approach is similar to the potentially negative weighting of astrometric reference stars in precision astrometry (Lazorenko, 2006).
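The extrapolation can be made concrete with a toy model (all names and numbers hypothetical): a systematic kernel-phase that is linear in zenith distance, a target at z = 1.0, and two calibrators both at larger z. Minimising the calibrated residual subject to the weights summing to one (solved here via the KKT linear system) drives one weight negative:

```python
import numpy as np

rng = np.random.default_rng(7)
n_k = 30                                   # number of kernel phases

a_vec, b_vec = rng.normal(size=(2, n_k))

def sys_kphase(z):                         # hypothetical systematic, linear in z
    return a_vec * z + b_vec

z_target, z_cals = 1.0, np.array([1.2, 1.4])
k_t = sys_kphase(z_target)
K_c = np.array([sys_kphase(z) for z in z_cals])

# Minimise |k_t - w @ K_c|^2 subject to sum(w) = 1, via the KKT linear system.
G = K_c @ K_c.T
kkt = np.block([[2.0 * G, np.ones((2, 1))],
                [np.ones((1, 2)), np.zeros((1, 1))]])
rhs = np.concatenate([2.0 * K_c @ k_t, [1.0]])
w = np.linalg.solve(kkt, rhs)[:2]
print(w)                                   # ≈ [ 2. -1.]: extrapolates past both calibrators
```

The weights are exactly those of a linear extrapolation in z, removing the systematic perfectly in this noiseless toy; with noisy calibrators the optimal weights shrink back towards an even average.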
4.2 Optimised Calibrator Weighting
We will now proceed to define a more optimal set of calibrator weightings w_i. This set of calibrator weightings must minimise the residual closure-phases after fitting a model, without significantly biasing the model fit. In this section, we will describe this process as applied in Kraus & Ireland (2012), where the starting point is closure-phases rather than kernel-phases.
Following Appendix A of Kraus et al. (2008), we begin by considering the closure-phases only on a subspace S spanned by the linearly independent set of closure-phases. Furthermore, we construct a basis vector set on this subspace such that the closure-phase covariance matrix is diagonal (or nearly so) when projected onto it. To see how this is done, first note how the closure-phases θ_CP can be constructed linearly from the Fourier phases θ:
$\boldsymbol{\theta}_{\rm cp} = \mathbf{C}\boldsymbol{\phi}$   (27)
The matrix $\mathbf{C}\mathbf{C}^{+}$ then projects any set of closure-phases onto the set spanned by the linearly independent set of closure-phases. This matrix can be diagonalised as $\mathbf{U}^{\dagger}\mathbf{D}\mathbf{U}$ for a diagonal matrix $\mathbf{D}$ and a unitary matrix $\mathbf{U}$. The eigenvalues on the diagonal of $\mathbf{D}$ are either 0 or 1. By considering only the eigenvectors of $\mathbf{C}\mathbf{C}^{+}$ with non-zero eigenvalue, we can write:
$\mathbf{C}\mathbf{C}^{+} = \mathbf{S}^{T}\mathbf{S}$   (28)
for an $m \times n_{\rm cp}$ projection matrix $\mathbf{S}$, with $m$ the number of linearly independent closure-phases. $\mathbf{S}$ projects onto a subspace spanned by an orthonormal set of linear combinations of closure-phases.
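This construction can be sketched numerically for a non-redundant mask. The helper below builds the closure-phase matrix of Equation 27 over all hole triangles, then extracts the unit-eigenvalue eigenvectors of the associated projection (Equation 28); function names and the baseline ordering are illustrative.

```python
import numpy as np
from itertools import combinations

def closure_matrix(n_holes):
    """Matrix C mapping baseline phases to all closure phases (Eq. 27).

    Baselines are ordered as combinations(range(n_holes), 2); each
    closure triangle (i, j, k) uses phi_ij + phi_jk - phi_ik.
    """
    bl = {b: n for n, b in enumerate(combinations(range(n_holes), 2))}
    tris = list(combinations(range(n_holes), 3))
    C = np.zeros((len(tris), len(bl)))
    for t, (i, j, k) in enumerate(tris):
        C[t, bl[(i, j)]] += 1
        C[t, bl[(j, k)]] += 1
        C[t, bl[(i, k)]] -= 1
    return C

def independent_projection(C):
    """Orthonormal basis for the span of the closure phases (Eq. 28).

    The eigenvalues of C C^+ are 0 or 1; the rows of the returned matrix
    are the eigenvectors with unit eigenvalue.
    """
    P = C @ np.linalg.pinv(C)       # symmetric projection onto span(C)
    w, V = np.linalg.eigh(P)
    return V[:, w > 0.5].T          # (n_independent, n_cp)
```

For an $n$-hole mask this recovers the familiar $(n-1)(n-2)/2$ independent closure-phases out of $\binom{n}{3}$ triangles.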
Next, given a closure-phase covariance matrix $\boldsymbol{\Sigma}$, we can modify the projection matrix so that it projects onto a set of basis vectors with a diagonal covariance matrix. To accomplish this, we diagonalise the projection of $\boldsymbol{\Sigma}$:
$\mathbf{S}\boldsymbol{\Sigma}\mathbf{S}^{T} = \mathbf{W}\boldsymbol{\Lambda}\mathbf{W}^{T}$   (29)
Then our new matrix $\mathbf{S}' = \mathbf{W}^{T}\mathbf{S}$ is a projection matrix satisfying:
$\mathbf{S}'\boldsymbol{\Sigma}\mathbf{S}'^{T} = \boldsymbol{\Lambda}$   (30)
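The decorrelating rotation of Equations 29 and 30 is a standard eigendecomposition; a sketch, with illustrative variable names:

```python
import numpy as np

def decorrelating_projection(S, cov):
    """Rotate projection S so the projected covariance is diagonal.

    S   : (m, n_cp) orthonormal projection onto independent closure phases
    cov : (n_cp, n_cp) closure-phase covariance matrix
    Returns S2 with S2 cov S2^T diagonal, plus the diagonal variances.
    """
    proj_cov = S @ cov @ S.T
    lam, W = np.linalg.eigh(proj_cov)   # proj_cov = W diag(lam) W^T
    S2 = W.T @ S
    return S2, lam
```

Because `S2 cov S2^T` is diagonal by construction, chi-squared statistics on the rotated observables need no explicit covariance matrix.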
Representing the data in this way enables, for example, the construction of $\chi^{2}$ variables that can be computed as the sum over variance-normalised squared deviates of a set of independent data, without the explicit use of covariance matrices. A potential problem with this approach is that the sample covariance matrix estimated from the data has rank equal to min$(N_{\rm f}-1, m)$, where $N_{\rm f}$ is the number of data frames. Taken at face value, with $N_{\rm f}-1 < m$, this process unreasonably restricts the closure-phases of a model of the target to lie on a very limited subspace in the space spanned by the observed departures from the mean closure-phase. For this reason, we take $\boldsymbol{\Sigma}$ above to be the weighted mean sample covariance matrix of all target and calibrator observations, weighted by the inverse of the trace of each sample covariance matrix. We form the estimated errors of the target by:
(31) 
Our data and errors are then transformed to a set of kernel-phases $\boldsymbol{\theta}$:
$\boldsymbol{\theta} = \mathbf{S}'\boldsymbol{\theta}_{\rm cp}$   (32)
(33) 
The non-diagonal terms of the covariance matrix are ignored, and any values on the diagonal less than the median are set to the median. This is a crude method to ensure that our statistics are reasonably robust, without resorting to studentising a multidimensional distribution. An alternative to this approach might be a bootstrapping technique; however, in this case there is no obvious way to estimate the variables below or to account for the error in their estimation. The additional uncertainty accounts for calibration errors, to be further defined below.
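The median-floor heuristic just described is straightforward; a minimal sketch:

```python
import numpy as np

def robust_variances(cov):
    """Diagonal variances with a median floor (Section 4.2).

    Off-diagonal terms are discarded and any variance below the median
    is raised to the median, as a crude robustness measure.
    """
    v = np.diag(cov).copy()
    med = np.median(v)
    v[v < med] = med
    return v
```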
The next step is to find an optimal linear combination of calibrator kernel-phases with weights $\alpha_{i}$, $i = 1 \ldots N_{c}$, where $N_{c}$ is the number of possible calibrators. By optimal, we mean that we want to maximise the likelihood function for $\boldsymbol{\alpha}$ based on a null model for the calibrated kernel-phases $\boldsymbol{\theta}_{\rm cal}$:
$L(\boldsymbol{\alpha}) \propto P(\boldsymbol{\alpha})\,\exp\!\left(-\sum_{k}\theta_{{\rm cal},k}^{2}/2\sigma_{k}^{2}\right)$   (34)
$\boldsymbol{\theta}_{\rm cal} = \boldsymbol{\theta}_{T} - \sum_{i=1}^{N_{c}}\alpha_{i}\boldsymbol{\theta}_{i}$   (35)
where we have explicitly subscripted the target kernel-phases with $T$, and where $P(\boldsymbol{\alpha})$ is a Bayesian prior distribution for $\boldsymbol{\alpha}$. The use of a restrictive prior as a regulariser is essential where there are many calibrators in use, because if $N_{c} \geq m$ and there is a random error component, then there almost surely exists an $\boldsymbol{\alpha}$ such that $\boldsymbol{\theta}_{\rm cal} = 0$, subtracting any real astrophysical signal. The prior chosen in Kraus & Ireland (2012) was (as presented in their Equation 1, this was potentially confusing, because the division was element-by-element division and the vector norm was used without being explicitly described):
(36) 
where $\sigma_{i}^{2}$ is the internal sample variance of the $i$th calibrator's kernel-phases. This is certainly not the only choice of such a prior, but it does have the essential feature of preferring calibrator weights of zero, and also of reducing the weighting of calibrators with large internal sample variances.
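A sketch of the weight optimisation follows, with a quadratic penalty standing in for the exact prior of Equation 36 (so the maximum a posteriori weights reduce to a linear solve). This is an illustrative variant, not the exact Kraus & Ireland (2012) implementation.

```python
import numpy as np

def optimal_weights(target_kp, target_var, cal_kps, cal_vars, beta=1.0):
    """MAP calibrator weights alpha_i for the null model of Eq. 35.

    Minimises the null-model chi^2 of target_kp - sum_i alpha_i * cal_kps[i],
    plus a quadratic penalty preferring zero weights and down-weighting
    calibrators with large internal variances (a hypothetical stand-in
    for the prior of Eq. 36).  The MAP weights solve the normal equations
    (C W C^T + diag(p)) alpha = C W t.
    """
    w = 1.0 / target_var                               # inverse variances
    penalty = beta * np.mean(cal_vars, axis=1) / np.mean(target_var)
    lhs = (cal_kps * w) @ cal_kps.T + np.diag(penalty)
    rhs = (cal_kps * w) @ target_kp
    return np.linalg.solve(lhs, rhs)
```

Note that a noisy calibrator (large `cal_vars`) is driven towards zero weight by the penalty, exactly the behaviour the prior is designed to produce.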
Once an optimal set of weights has been found by maximising the likelihood function, the uncertainty on the calibrated kernelphases is given by:
$\sigma_{k}^{2} = \sigma_{T,k}^{2} + \sum_{i=1}^{N_{c}}\alpha_{i}^{2}\sigma_{i,k}^{2}$   (37)
Note that this neglects any uncertainty in estimating the weights $\alpha_{i}$.
Finally, the calibrator observations do not necessarily span the space of the hidden parameters causing non-zero point-source kernel-phases. For this reason, the additional "calibration error" term $\sigma_{\rm cal}$ in Equation 33 was iteratively adjusted so that the reduced $\chi^{2}$ for the null model was 1.0, i.e.:
$\frac{1}{m}\sum_{k=1}^{m}\frac{\theta_{{\rm cal},k}^{2}}{\sigma_{k}^{2} + \sigma_{\rm cal}^{2}} = 1$   (38)
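Solving Equation 38 for the calibration-error term can be done by bisection, since the null-model reduced $\chi^{2}$ decreases monotonically as the added variance grows; a sketch, with illustrative names:

```python
import numpy as np

def calibration_error(kp, var):
    """Added error term such that the null-model reduced chi^2 is 1
    (Eq. 38); zero if the data are already consistent with the null."""
    n = len(kp)

    def red_chi2(sig2):
        return np.sum(kp**2 / (var + sig2)) / n

    if red_chi2(0.0) <= 1.0:
        return 0.0
    lo, hi = 0.0, np.max(kp**2)     # red_chi2(hi) <= 1 by construction
    for _ in range(100):            # bisection on a monotonic function
        mid = 0.5 * (lo + hi)
        if red_chi2(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return np.sqrt(hi)
```

The early return implements the statement in the text that, for about half the data sets, no calibration error was needed.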
In approximately half of the data sets tested in the work leading up to Kraus & Ireland (2012), no calibration error was needed. With values of the calibrated kernel-phases and their errors so computed, a model such as a bright star plus faint companion, or a more complex image, can be fit using least-squares. This is, however, a biased fit, just like the LOCI technique (Lafrenière et al., 2007), because the process of computing the weights partly removes the binary signal, due to the null model for kernel-phase in Equation 35. For this reason, in Kraus & Ireland (2012), final values of model parameters were computed after iteratively recomputing the weights $\alpha_{i}$ with the best-fitting model subtracted from the target kernel-phases.
4.3 Restricted Kernel Phase (POISE)
An alternative to the complexity of the calibration strategy in the previous section is to ignore the kernel-phases that require calibration, i.e. those kernel-phases that are most affected by systematic errors. This is similar to choosing a prior in Equation 35 so that the calibrator weighting is fixed for some kernel-phases and left uniform for the other kernel-phases, so that both calibration errors and astrophysical signal are subtracted. The difference in the technique described in this section is that only the restricted set of kernel-phases for which calibration is not required is used for subsequent analysis. We will call these restricted observables the Phase Observationally Independent of Systematic Errors (POISE) observables. This technique is very similar to the technique of ignoring dominant Karhunen-Loève eigenimages as a means of calibrating more wide-field point-spread functions (Soummer et al., 2012).
Following Equation 28, we find a set of kernel-phases $\boldsymbol{\theta}$ for each image by a projection of the Fourier phases $\boldsymbol{\phi}$:
$\boldsymbol{\theta} = \mathbf{K}\boldsymbol{\phi}$   (39)
for general kernel-phase, remembering that:
$\mathbf{K} = \mathbf{S}\mathbf{C}$   (40)
for aperturemasking. The matrix is formed in a similar way to Equation 23, using the matrix of calibrator observations, which is an ( by ) matrix, with the total number of calibrator frames:
(41) 
This definition is almost the same as diagonalising the covariance matrix, except that we do not subtract the mean kernel-phases from $\mathbf{X}$.
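The POISE rotation can be sketched as follows; note that the second-moment matrix is deliberately formed without mean subtraction, so a constant systematic offset in the calibrator kernel-phases contributes to the eigenvalues. Function and variable names are illustrative.

```python
import numpy as np

def poise_basis(K, cal_phases):
    """Rotate kernel-phases to diagonalise the calibrator second-moment
    matrix (cf. Eq. 41); the mean kernel-phases are kept in.

    K          : (m, n_ph) kernel-phase projection matrix
    cal_phases : (n_frames, n_ph) Fourier phases of all calibrator frames
    Returns the rotated projection K2 and the calibrator kernel-phases
    expressed in the new basis.
    """
    x = cal_phases @ K.T        # calibrator kernel-phases (n_frames, m)
    mom = x.T @ x / len(x)      # second moment, mean NOT subtracted
    lam, W = np.linalg.eigh(mom)
    K2 = W.T @ K
    return K2, x @ W
```

In the rotated basis the calibrator kernel-phases have zero sample covariances, which is the property used in the classification step below.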
The calibrator kernel-phases on this new subspace, which have zero covariances, are naturally subdivided into image sets for each PSF calibrator observation. Within each image set, uncertainties are dominated by random errors, but between image sets there is a combination of random and calibration errors. We consider the sample variance for kernel-phase $k$ computed over all images to be systematic if:
(42) 
for all calibrator image sets. In the POISE technique, we simply compute the systematic error component for each kernel-phase, and:

Ignore kernel-phases whenever
(43)
A typical value for the threshold in Equation 43 is 1, which rejects approximately 1 to 3 out of 28 kernel-phases for 9-hole Keck aperture-masking data.

Add the systematic error component to each target observation's uncertainty estimate for the remaining kernel-phases.
This means that the process of calibration is completely independent of the target, which was not the case in Section 4.2, because in that technique calibrator weights were chosen to minimise the calibrated target kernel phases. The technique requires at least 3 calibrator image sets to differ significantly from simpler calibration techniques.
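The two-step classification above can be sketched as below. The exact inequality of Equation 42 is not reproduced here, so the threshold used (total variance exceeding $(1+\gamma)$ times the largest within-set variance) is an assumption; the reject-or-inflate logic is the part taken from the text.

```python
import numpy as np

def poise_classify(cal_kp_sets, gamma=1.0):
    """Classify each kernel-phase as systematic or retainable (Sec. 4.3).

    cal_kp_sets : list of (n_im, m) arrays, one per calibrator image set.
    A kernel-phase is rejected when its variance over all images exceeds
    (1 + gamma) times the largest within-set variance (a hypothetical
    form of the Eq. 42 test); otherwise the excess variance is returned
    as a systematic error term to add to the target uncertainties.
    """
    allk = np.vstack(cal_kp_sets)
    total_var = allk.var(axis=0)
    within_var = np.max([s.var(axis=0) for s in cal_kp_sets], axis=0)
    reject = total_var > (1.0 + gamma) * within_var
    sys_err = np.sqrt(np.clip(total_var - within_var, 0.0, None))
    sys_err[reject] = np.nan        # rejected kernel-phases are ignored
    return reject, sys_err
```

Because the target data never enter this function, the calibration is completely independent of the target, as emphasised above.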
As an example of the use of this technique, we consider the data set used in the November 2010 K' sparse aperture-mask observations of the LkCa 15 system (Kraus & Ireland, 2012). This data set consisted of 13 calibrator image sets of 12 images each, and 12 target image sets of 12 images each, all taken in good (0.6 arcsec) seeing. This is an ideal data set, especially given that all calibrators had previous sparse aperture-mask observations and were known to be single stars, and observations were continuous over a time period of 3.5 hours, with target and calibrator observations interspersed. This is also the highest-contrast detection published in the literature so far: the K-band detection of structure modelled as three compact sources around the star, with details reproduced in Table 1. Although much higher contrast is possible for brighter stars, especially when extreme adaptive optics may enable negligible piston phase errors, this is roughly the brightest star of its class: no known 5 Myr solar-mass star is in any association closer than Taurus.
When applying the POISE algorithm to this data set with a threshold value of 1.0 in Equation 43, only 1 of the 28 kernel-phases is removed as "systematic" by the calibrator observations, meaning that 96% of the closure-phase information is retained. A three point-source fit to these restricted kernel-phases had a reduced $\chi^{2}$ value of 0.92, as shown in Table 1. With a reduction of the threshold to 0.25, 4 kernel-phases are removed as "systematic" and the reduced $\chi^{2}$ becomes 1.00, but no fitted parameter changes by even 1$\sigma$. In addition, the variance of the mean for 50% of the image-set kernel-phases is dominated by random errors, and not by the systematic values from Equation 42. This means that quasi-static spatial aberrations do not significantly limit the signal-to-noise in the final image in this case. For this kind of observation, spatially filtering the input wavefront (e.g. Huby et al., 2012; Jovanovic et al., 2012) could not significantly improve the achievable calibration-limited contrast. The random errors of 0.5 degrees in each 240 s image set are also consistent with temporal phase piston errors, which would not be improved by spatial filtering. This argument of course breaks down for brighter targets (i.e. generally higher-mass, or closer and older targets), where existing adaptive optics systems perform much better and extreme AO is possible. In these situations, the residual phase error in Equation 14 can be smaller than 0.3 radians, effective correction speeds can exceed 100 Hz, and spatial filtering may become essential at the 10 magnitude contrast range enabled by this improved AO performance.
4.4 Imaging with POISE
For sufficiently complex sources, model-fitting is replaced with imaging. In general, imaging from kernel-phases alone is computationally intensive because of the non-linear relationship between the image plane and the Fourier phase. However, in the high-contrast regime, where interferometric visibility amplitudes are unity within errors, we can approximate the Fourier transform of an image normalised to a total flux of unity as:
$\tilde{I}(\mathbf{u}) \approx 1 - \int I(\boldsymbol{\sigma})\left(1 - e^{-2\pi i\,\boldsymbol{\sigma}\cdot\mathbf{u}}\right)d\boldsymbol{\sigma}$   (44)
In turn, the phase becomes:
$\phi(\mathbf{u}) \approx -\int I(\boldsymbol{\sigma})\sin(2\pi\,\boldsymbol{\sigma}\cdot\mathbf{u})\,d\boldsymbol{\sigma}$   (45)
We can consider the image to be made of discrete pixel values arranged in a vector $\mathbf{x}$, so that the integral in Equation 45 becomes a sum, and the Fourier phases $\boldsymbol{\phi}$ and kernel-phases $\boldsymbol{\theta}$ are given by matrix multiplication:
$\boldsymbol{\phi} = \mathbf{A}\mathbf{x}$   (46)
$\boldsymbol{\theta} = \mathbf{K}\mathbf{A}\mathbf{x}$   (47)
This linear approximation to imaging means that minimising the kernel-phase $\chi^{2}$ subject to a differentiable regulariser can be rapidly computed using a gradient-descent method. An example of such a regulariser is the maximum-entropy regulariser (e.g. Narayan & Nityananda, 1986):
$S(\mathbf{x}) = \sum_{j} x_{j}\ln(x_{j}/p_{j})$   (48)
for some prior image $\mathbf{p}$, often taken to be uniform within some finite field of view and zero elsewhere. The problem of maximum-entropy image reconstruction is then simply a problem of minimising the sum of the $\chi^{2}$ value and the regulariser:
$\hat{\mathbf{x}} = \underset{\mathbf{x}}{\mathrm{argmin}}\left[\chi^{2}(\mathbf{x}) + \lambda S(\mathbf{x})\right]$   (49)
The value of $\lambda$ is typically chosen so that the final image has a reduced $\chi^{2}$ value of 1.0. (Image-reconstruction code in the python language using this regulariser can be found at http://code.google.com/p/pysco, the repository where all code in this paper is intended to go after translation to python.) To see the result of this approach to imaging, we will again use the K' data set from Kraus & Ireland (2012). In that original paper, the optimised calibrator weighting scheme (see Section 4.2) enabled the MACIM algorithm (Ireland et al., 2006) to be used to create images directly from the closure-phases via an OIFITS input file. This approach ignored correlations between closure-phases. The image created by fitting directly to the kernel-phases with the maximum-entropy regulariser can be seen in Figure 5, where the resolved structures contain 1% of the total system flux and the reduced $\chi^{2}$ of the image is 1.0. Note that arbitrary point-symmetric flux could be added to this image and it would still fit the kernel-phases. A weakness of imaging from kernel-phases alone is that point-symmetric flux added to a bright central point source does not produce any phase information.
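A minimal gradient-descent sketch of this imaging step follows; real codes, such as the pysco code mentioned in the text, use more sophisticated optimisers and stopping rules, and the step size, iteration count and positivity floor here are illustrative choices.

```python
import numpy as np

def maxent_image(kp, kp_err, KA, prior, lam=1.0, n_iter=2000, step=1e-3):
    """Gradient-descent maximum-entropy imaging (cf. Eqs. 48-49),
    using the linear kernel-phase model theta = KA @ x.

    kp, kp_err : kernel-phase data and 1-sigma errors
    KA         : (n_kp, n_pix) combined kernel-phase/imaging matrix
    prior      : (n_pix,) positive prior image p
    """
    x = prior.copy()
    for _ in range(n_iter):
        resid = (KA @ x - kp) / kp_err**2
        grad_chi2 = 2 * KA.T @ resid
        grad_ent = lam * (np.log(x / prior) + 1)   # d/dx of sum x ln(x/p)
        # Gradient step, with a small positivity floor on the pixels.
        x = np.clip(x - step * (grad_chi2 + grad_ent), 1e-8, None)
    return x
```

In practice $\lambda$ would be tuned until the reduced $\chi^{2}$ of the reconstruction reaches 1.0, as described above.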
The image in Figure 5 is cosmetically at least as good as that shown in Kraus & Ireland (2012), but comes with the significant benefit that the calibration process does not directly affect the image: the POISE observables are independent of the calibrator observations.
Table 1. Three point-source model parameters for LkCa 15.

Parameter        KI12           POISE
Sep. (mas)       67.0 ± 3.2     65.1 ± 3.1
PA (deg)         12.3 ± 2.8     10.9 ± 2.9
Contrast (mag)   7.40 ± 0.19    6.89 ± 0.18
Sep. (mas)       64.4 ± 1.5     62.6 ± 1.9
PA (deg)         334.8 ± 1.5    333.4 ± 2.5
Contrast (mag)   6.59 ± 0.09    6.36 ± 0.11
Sep. (mas)       82.5 ± 2.4     78.0 ± 4.1
PA (deg)         302.3 ± 1.5    302.3 ± 2.8
Contrast (mag)   7.06 ± 0.12    7.02 ± 0.18
5 Conclusions
Aperture-mask interferometry has proven to be a powerful technique to recover high-contrast (up to 8 magnitudes), asymmetric information at the diffraction limit of large telescopes. The reason for this success is the ability of closure-phase, a kind of kernel-phase, to give an observable largely independent of time-variable aberrations. I have described many of the key sources of phase errors in this technique, as well as several strategies for mitigating them. Of note are the Phase Observationally Independent of Systematic Errors (POISE) observables, which are a subset of all possible linear combinations of closure-phases. Observations of calibrator stars inform which linear combinations of phases constitute the POISE observables, but the analysis of the target observations is performed quite independently of the calibrator observations, leading to a more robust calibration method.
The generalisation of the aperture-mask technique to full-pupil images shows great promise in the form of the full-pupil kernel-phase observables. Simulations show that pupil-plane phase errors higher than third-order affect full-pupil kernel-phase more than aperture-mask kernel-phase, meaning that full-pupil kernel-phase will likely be restricted to moderately high Strehl observations.
The analysis presented here has implicitly involved only a monochromatic PSF from an imaging system. Although the effect of dispersion was discussed and the POISE calibration technique ameliorates the effects of dispersion, a mathematical framework to clearly predict the effects of dispersion on kernel-phase was not developed. A future study of the effect of very broad bandwidths is needed. More importantly, an extension of this technique to work for the simultaneous wavelength-dispersed images formed by an integral-field unit could be very powerful. The scaling of the PSF with wavelength as a speckle-suppression technique could be equally well applied to observables in the Fourier domain as it has been in image-plane analyses.
Acknowledgments
M.I. would like to acknowledge many helpful conversations with and encouragement from a large number of people over the past 10 years as these ideas developed and have been tested in various contexts, in particular Jean-Philippe Berger, Adam Kraus, Shri Kulkarni, Sylvestre Lacour, David Lafrenière, James Lloyd, Frantz Martinache, John Monnier, Laurent Pueyo, J. Gordon Robertson, Anand Sivaramakrishnan and Peter Tuthill. The manuscript was also substantially improved following helpful comments from an anonymous referee.
Appendix A ThirdOrder Bispectrum Expansion
(50) 
The zeroth-order terms in the phases are trivially collected as 1, and the first-order terms clearly cancel to give 0. The second-order terms are:
(51) 
Moving from this equation to Equation 6 requires the substitution of Equations 8 through 10, as well as recognition of the following classes of trivial identities:
(52)  
(53) 
The 3rd order terms of Equation 50 are collected (after minor simplification of the coefficient 1/2 terms) as:
(54) 
Appendix B Temporal Phase Errors
In applying Equation 7 to temporal phase errors, we write the instantaneous values of the three phases as random variables that take a new random value at each statistically independent time step. We can then write:
(55)  
(56)  
(57)  
(58) 
Here Var represents the variance of a quantity, which in this special case of zero-mean quantities is simply the expectation of the square. The approximate equality in Equation 56 is used because we are ignoring the piston subtraction, which introduces an error of higher order. Each of the variables is an independent Gaussian variable with mean 0 and standard deviation $\sigma$, so their moments are standard results, and the expectation of a product of their moments is simply the product of the expectations of their respective moments. The variance on the right-hand side of Equation 57 can thus be simply but tediously evaluated as the sum over 36 mutual covariances to give a value of 12. Finally, Equation 14 follows directly from Equation 58, after counting the number of independent phase samples.
References
 Baldwin et al. (1986) Baldwin J. E., Haniff C. A., Mackay C. D., Warner P. J., 1986, Nature, 320, 595
 Dupuy et al. (2009) Dupuy T. J., Liu M. C., Ireland M. J., 2009, ApJ, 699, 168
 Evans et al. (2012) Evans T. M. et al., 2012, ApJ, 744, 120
 Fizeau (1868) Fizeau H., 1868, C.R.Acad.Sci., 66, 932
 Hinkley et al. (2011) Hinkley S., Carpenter J. M., Ireland M. J., Kraus A. L., 2011, ApJ, 730, L21
 Hofmann & Weigelt (1993) Hofmann K.-H., Weigelt G., 1993, A&A, 278, 328
 Huby et al. (2012) Huby E. et al., 2012, A&A, 541, A55
 Ireland et al. (2008) Ireland M. J. et al., 2008, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7013
 Ireland et al. (2006) Ireland M. J., Monnier J. D., Thureau N., 2006, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 6268
 Jovanovic et al. (2012) Jovanovic N. et al., 2012, MNRAS, 427, 806
 Kraus & Ireland (2012) Kraus A. L., Ireland M. J., 2012, ApJ, 745, 5
 Kraus et al. (2008) Kraus A. L., Ireland M. J., Martinache F., Lloyd J. P., 2008, ApJ, 679, 762
 Kulkarni (1989) Kulkarni S. R., 1989, AJ, 98, 1112
 Lacour et al. (2011) Lacour S., Tuthill P., Amico P., Ireland M., Ehrenreich D., Huelamo N., Lagrange A.-M., 2011, A&A, 532, A72
 Lafrenière et al. (2007) Lafrenière D., Marois C., Doyon R., Nadeau D., Artigau É., 2007, ApJ, 660, 770
 Lazorenko (2006) Lazorenko P. F., 2006, A&A, 449, 1271
 Le Bouquin & Absil (2012) Le Bouquin J.-B., Absil O., 2012, A&A, 541, A89
 Lloyd et al. (2006) Lloyd J. P., Martinache F., Ireland M. J., Monnier J. D., Pravdo S. H., Shaklan S. B., Tuthill P. G., 2006, ApJ, 650, L131
 Martinache (2010) Martinache F., 2010, ApJ, 724, 464
 Michelson (1891) Michelson A. A., 1891, Nature, 45, 160
 Monnier et al. (2006) Monnier J. D. et al., 2006, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 6268
 Narayan & Nityananda (1986) Narayan R., Nityananda R., 1986, ARA&A, 24, 127
 Schwarzschild (1896) Schwarzschild K., 1896, Astronomische Nachrichten, 139, 353
 Soummer et al. (2012) Soummer R., Pueyo L., Larkin J., 2012, ApJ, 755, L28
 Tuthill et al. (2000) Tuthill P. G., Monnier J. D., Danchi W. C., Wishnow E. H., Haniff C. A., 2000, PASP, 112, 555