The Multiplexed Imaging Method HighResolution Wide Field Imaging Using Physically Small Detectors
Abstract
We present the method of multiplexed imaging designed for astronomical observations of large sky areas in the IR, visible and UV frequencies. Our method relies on the sparse nature of astronomical observations. The method consists of an optical system that directs light from different locations on the focal plane of a telescope onto the same detector area and an algorithm that reconstructs the original widefield image. In this way we can use a physically small detector to cover a wide field of view. We test our reconstruction algorithm using public space telescope data. Our tests demonstrate the reliability and power of the multiplexed imaging method. Using our method it will be possible to increase the sky area covered with space telescopes by 13 orders of magnitude, depending on the specific scientific goal and optical parameters. This method can significantly increase the volume of astronomical surveys, including search programs for exoplanets and transients using space and ground instruments.
1 Introduction
1.1 Observing a large area of the sky
Surveying a large sky area is one of the most common and elementary types of observation. In principle, one would want to image as wide an area of the sky as possible, at high spatial resolution and through a telescope with a large aperture. Covering a wide field at high resolution requires a detector with a large physical area and many pixels, leading to high cost and complexity.
Astronomers deal with this problem in two ways:

Using complex and expensive arrays of detectors.

Using detectors with large pixels, at the expense of resolution.
We propose a novel method to address this issue.
1.2 The origin of noise in astronomical observations
The performance of the multiplexed imaging method strongly depends on the properties of the relevant noise. We consider the following origins of noise in astronomical observations.

Poisson noise  The difference between the expected flux of energy coming from a light source (or a noise source) and the flux actually arrived.

Background noise  The fluctuations in the number of photons originating in the space between the observer and the source (e.g, atmospheric background noise and zodiacal light).

Read noise and instrumental thermal noise (dark current): The noise coming from the process of reading the data from the detector and the electrons that originate in the electronics and the detector itself.
When imaging from the ground, all types of noise exist and contribute, while when imaging from space, in some situations we can neglect the background noise and assume that the only sources of noise are the read noise and the Poisson noise.
1.3 Sparsity of astronomical images
A very important feature of many astronomical images is that they are almost empty. When we pick a random patch of sky and observe it (not a specifically chosen close galaxy, nebula or dense star cluster) there are very few objects with nonzero flux. Most of them are either point source objects (the size of the seeing disk) or small patches (like distant galaxies) with sizes on the scale of a few arcseconds. Our analysis only applies to sparse images because the source of the improvement that we offer is the sparsity of the part of the sky observed.
2 The Multiplexed Imaging Method
2.1 The general idea
We propose to use the sparse nature of astronomical images to effectively measure all objects contained in the corrected field of view of a telescope using a physically small detector, without reducing the resolution. This is done by projecting different areas of the focal plane simultaneously onto the detector. Because of the scarcity of objects one can perform scientific measurements using the combined images with the same quality and greater efficiency compared to mosaicking. Flux measurements of known sources in combined images can be done directly (e.g to search for transients ans planets). Using subobservations it is possible to chart an unknown part of the sky.
In section 3 we analyze both cases and present our recovery algorithm. In section 4 we analyze the efficiency gain factor of our method under different dominant noise sources and show that multiplexed imaging is very efficient when the dominant noise source is either Poisson or read noise. In section 5 we show simulations of our charting recovery algorithm. In section 6 we analyze the possible impacts of the method on instruments designed for sky surveys, finding exoplanets, looking for transients, fast photometry and lucky imaging, finding orders of magnitude potential improvements to the area observed per unit time. A conceptual optical realization of this method is discussed in BenAmi et al. (in preparation), and laboratory experiments with this design are ongoing.
2.2 Definitions

The focal plane is divided into parts, each part can be covered by the detector in a single exposure of length .

Our device directs the light of parts () onto one detector.

An observation is composed of subobservations. Each subobservation is a measurement of the sum of the flux from areas on the focal plane.

The time each observation takes depends on the exposure time , readout time and slew time . When referring to two different exposure times, the duration of multiplexed imaging will be denoted by .

The total time required for one observation using multiplexed imaging is given by

Regular imaging can be thought of as multiplexed imaging with and (it will take observations to cover the whole area). In that case .

The efficiency of the system is given by the time required to cover the described area with the regular mode divided by the time required to do so with the multiplexed method, when both observations have the same SNR, meaning
(and is adjusted to match the SNR).

For example, under the assumptions that
(1) (which is not always the case, see section 4.2) and (which will make the SNR equal when the dominant source of noise is Poisson noise) this yields .

We will define the object surface density to be the number of sources in one part of the sky divided by the area of sky observed.

Denote by the average (over sources with different intensities and sizes) number of pixels that have a statistically significant contribution per object.

The assumption that a sky area is sparse means that (Fig. 2).

The best possible multiplexing and the absolute upper limit on satisfies , which means that we are measuring a nontrivial flux with every pixel of our detector.

Generally depends on many parameters such as the depth of the observation, the filter, the field observed, the plate scale and the seeing. values for a few common surveys are:
DSS
3 Recovering the original observation
We consider separately two scientific cases: charting and reobserving.
3.1 Charting
We define charting as an observation of a part of the sky that is unknown to the resolution and depth in question. In this mode, since we have no prior knowledge of the position of every object, we need to obtain several subobservations, allowing for each part of the sky to have a specific pattern of appearance (Fig. 3) in order to recover the correct position of each source.
3.2 Reobserving
When reobserving, we have a prior image of the relevant region of the sky. Our observational goal in reobserving mode is to measure the flux from previously known objects, measuring variability or searching for new transients. In the reobserving mode we can calculate for each pixel on the image apriory (using our known mapping of the sky) which areas on the focal plane contribute to the measured flux, allowing for a simple recovery algorithm. The number of subobservations required and therefore also the efficiency depends on whether the scientific mission requires to measure all the objects in the field, or just as many of them as possible. Those two cases are analyzed separately. The case where it suffices to measure only most of the objects should allow using a trivial recovery algorithm, and requires only subobservations. The other case, where all objects in the field are important, is similar to the charting mode because different objects falling on the same detector area must be dealt with. This case is not addressed in this paper as we expect it to be rare.
With recent advances in the recording of a multiwavelength static image of much of the sky, e.g., by surveys such as SDSS (York et al., 2000) PS1 (Kaiser et al., 2010) and PTF (Law et al., 2009, Rau et al., 2009) in the optical, GALEX (Martin et al., 2005) in the UV and 2MASS (Skrutskie et al., 2006), UKIDSS (Lawrence et al., 2007) and WISE (Wright et al., 2010) in the IR, the reobserving mode is likely to be the common mode (see applications 6.2, 6.3, 6.4, 6.5).
3.3 Construction of the sets
The recovery algorithm is simple, but to present it, we first need to introduce some notations and construct the sets of regions of sky combined during each subobservation. We use , , and as defined before.
Denote the set of sky regions combined during subobservation by . Denote the flux recorded in subobservation at pixel location by and denote the flux arriving to pixel from region on the focal plane by . The expected flux (without noise) at each pixel is therefore
Denote for each region on the focal plane, the representing vector (a binary vector of length ) indicating if the flux from region is combined during subobservation .
And denote the set of representing vectors . The representing vectors determine uniquely the set of regions that are included in each subobservation and they can be chosen by the algorithm designer ahead of making the observation. This allows us to choose in a special way the set of vectors such that there will be no ambiguity in the reconstruction algorithm.
Denote the vector of measured fluxes from all subobservations at a pixel on the detector by
If a specific pixel has exactly one nonzero flux contribution coming from a specific sky region , then there exists a real number such that
A simple example
When constructing the sets, the recovery ambiguity problem has to be dealt with, and to demonstrate it we will use a simple set of parameters: we choose , and . We will denote the focal plane subareas by as illustrated in Fig.3.
The sets we use are and .
The subobservations we use are therefore and .
The representing vectors will be , , .
For every pixel location we can construct .
If for a pixel location we observe
we can deduce that and .
If for a pixel location we observe
we can deduce that and .
If for a pixel location we observe
we can deduce that and .
This demonstrates a possible ambiguity, even without considering observational errors. There could be more than one combination of fluxes that will generate the same observed vector. This problem is inherent because we have less measurements than free parameters we are trying to measure (here we try to measure 3 free parameters, , but we have only 2 measurements, ).
This is where the sparsity assumption is necessary. We assume the original image is sparse, meaning that most of the measured parameters are , and therefore we assume the correct recovery in this case is and .
If for a pixel location we observe we cannot recover the original fluxes because the following cases are equally likely:

, and

, and .
In this case we cannot solve the ambiguity and we cannot determine neither the locations nor the fluxes of the nonzero sources. We should therefore construct the sets carefully, preventing ambiguities when few sources are contributing nonzero flux to a pixel location.
Preventing the ambiguity
The most basic requirement on our sets is that if there is only one nonzero flux contributing to a location then the recovery is unique. From here we can immediately deduce a strict demand on the construction of the set : For every pair of different vectors, and for all pairs of real numbers the following should hold:
This demand, implying and , defines the absolute lower bound for the number of subobservations to chart regions of the focalplane, which is .
It is suboptimal to recover only objects that do not overlap with other objects (this limits the multiplexing number (and therefore ) to be smaller than desired, leading to smaller region of the sky being observed). Therefore we wish to construct our sets such that unique recovery is guaranteed also when two objects fall on the same detector area, (and with high probability will be unique even when there are 3 or more objects falling on the same detector area). In the case of two objects, in a similar fashion, we have
Again, we choose the set to hold an analogue condition: for every quadruplet of different vectors, and for all quadruplets of real numbers the following should hold:
(2) 
In our simple example above, this condition does not hold, as
This fact is the cause of the ambiguity in the recovery.
When we consider the charting of weak sources, another source of confusion can be the noise (of all kinds). If two regions of sky have close representing vectors, i.e is small, then it is easy to confuse flux coming from region with flux coming from region . This is because the recovery algorithm can use only subobservations that contain only one of the regions to distinguish between sources coming from region and sources coming from region . The SNR of one subobservation is lower than the SNR of the reconstructed observation, meaning that statistically significant sources on the reconstructed image might not be significant in one subobservation. This means that weak (yet statistically significant sources) which are nonsignificant in a single subobservation can be mistakenly misplaced to positions with a representing vector which is close to the representing vector of the correct position. Therefore we prefer to choose the representing vectors such that the difference between every pair of vectors is nonzero in at least coordinates.
(3) 
Constructing a set for which conditions (2) and (3) hold is nontrivial, and is discussed in section 8.1, we will just remark that it can be done for a multiplexing of with if we want a robust recovery. In a slightly less robust case, when in the above notation ( is introduced later and is a confidence parameter of the algorithm, this condition means that with high probability there is no confusion between and due to noise) then we can use as little as subobservations for full recovery. On our simple example above, the minimal set that satisfies condition (2) and condition (3) with is:
The charting recovery algorithm
A detailed description of the charting recovery algorithm can be found in section 8.2, here we show only a simplified conceptual version.
The input of this algorithm is for every pixel position . The output is the fluxes, and the locations they are coming from, contributing to pixel .
Algorithm (with confidence parameter )

Go over all vectors and find the optimal and such that is minimal. Remember the best position and flux . If stop and output all pairs of that were found before.

Update

Go to step 1.
This algorithm is the simplest recovery algorithm one could think of, yet it captures the essence of the algorithm presented in section 8.2, which is statistically more accurate. This simple algorithm does not treat the case . We note that using the information from neighboring pixels we can easily solve ambiguities rising from the case . The results of employing this algorithm (without using data from neighboring pixels) to real data from GALEX are shown in section 5.
4 Efficiency analysis
To understand the behavior of the efficiency , we need to know the exposure time factor . This factor is determined by the constraint on that we compare the time needed for observations with equal SNR. The exposure time factor changes when different noise sources are dominant and for various observing modes (reobserving vs charting), and therefore each case should be handled separately. The efficiency for all cases (assuming condition (1)) is summarized on table 1.
Observing mode Dominant noise  Poisson  Background noise  Readnoise 

Reobserving  (*)  
Charting  (*) 
(*)Note that in section 4.2 we show that the method can improve the efficiency of background dominated observations if condition (1) is invalid and is not negligible compared to .
Denote the background noise in observation coming from region at pixel as , denote the read noise in observation at pixel by . We will use the notation to denote a Poisson random variable with expectancy (where is in units of photons). We will assume that all are independent random variables and that the background and read noise have mean and standard deviations respectively (and otherwise subtract the mean). To calculate the SNR of an observation, assuming that only one region contributes nonzero flux to pixel , lets first look at a specific subobservation:
For each region we have (the number of sets that contain ) subobservations containing it (in each subobservation we observe parts, there are such subobservations and the total number of parts is ). Assuming the common case that only the region contributes nonzero flux to pixel , the best SNR of region is achieved when taking the average of all subobservations for which , meaning that the best flux estimation of region is
Multiplying the signal by a factor does not change the SNR so
(4) 
Now, we analyze the efficiency of the method assuming that different parts of the noise are dominant.
4.1 Poisson noise is dominant
Assuming Poisson noise is the dominant noise source, we can rewrite equation 4 as
We can choose , so the equation becomes
and we get the same SNR as in the original observation. This means that the efficiency is
This means that in both observing modes the multiplexed imaging gains maximal efficiency when the dominant noise is Poisson noise, and since there is no dependence on one can use large numbers of subobservations guaranteeing high stability for the recovery algorithm.
4.2 Background noise is dominant
Assuming background noise is the dominant noise source and the fact that are all independent, we can rewrite equation 4 above to
Now we can calculate the SNR when using multiplexed imaging:
Recall that the original SNR was
So, requiring equal SNR’s, we need . This means that the efficiency is
It seems that the method does not help when the dominant noise is the background because the efficiency is . But for some applications (for example when performing shallow surveys) our assumption that cannot be satisfied because the required exposure time for the observation is small compared to slew time or readout time. In these cases multiplexed imaging allows using exposures with a factor of larger exposure time instead of different exposures, with efficiency
which would mean if is dominant and if is dominant.
4.3 Read noise is dominant
When the read noise is dominant, equation 4 gives
The SNR when using multiplexed imaging is then:
So if we choose we get
which is equal to the original SNR. From this we get the efficiency
In this case, the efficiency depends on the choice of and therefore there will be a difference between the charting mode and the reobserving mode. In reobserving, one needs only to extract the flux of most sources, neglecting overlapping stars and therefore typically will use and , meaning . When charting, the efficiency depends on the parameters , which are somewhat free to the choice of the system designer.
Note that to avoid having two indices such that we must have (counting argument). This reproduces the obvious inequality on the efficiency (efficiency cannot be greater than as we observe at most areas at a time). Note also that this boundary can be reached when . Using some reasonable parameters from our simulations, we get the efficiency which we expect will be typical for most uses.
5 Simulations
To simulate the operation of the charting algorithm, we performed several simulations using data observed with the GALEX satellite (scanning mode with 80 seconds exposure time). These observations were targeted at high galactic latitude, and therefore were sparse, with a measured allowing for high multiplexing. We used 349 regions, 1000x1000 pixels each. We used our algorithm with and (see Fig. 4). To study the performance of the recovery, we compare the reconstructed image with the original image with added background noise. Each reconstructed sky region is the average of 9 subobservations in which it is contained. Therefore the theoretical standard deviation of the reconstructed image is . We generated our set satisfying conditions (2) and (3) with (the closest pair of vectors is different in 4 coordinates).
We report a high fidelity () for recovering the correct positions of all the pixels which have flux contribution from 1 source (and at and above, see Fig. 5). We also report a high fidelity of recovering the correct combination of pixels with flux contribution of 2 or more sources ( for sources with more than and for sources with more than ).
6 Applications
There are many possible applications to the multiplexed imaging method, and many scientific missions operating in the visible, UV and IR (from space) can substantially increase their capabilities by using it. High multiplexing may be limited by the observed object density, or by practical/optical reasons. Here we discuss the expected improvements assuming no optical limitations, which will surely arise when designing a multiplexed imaging system with a large number of optical elements. We assume that the object density () can be as low as , and the expected improvement shown below assume this object density.
6.1 Sky surveys from space
Multiplexed imaging would be powerful for sky surveys from space because of the combination of the following factors:

Detectors are more expensive to operate in space, and the multiplexed imaging system may reduce the amount of expensive spacequalified hardware.

The background noise is substantially lower from space than from the ground.

There are no atmospheric aberrations, meaning that the PSF when imaging from space is substantially smaller, reducing , allowing for higher multiplexing.
Expected improvement: Our simulations suggest one can use multiplexing as high as with charting mode (section 5), leading to increased area coverage per unit time by a factor of .
6.2 Lucky imaging
The lucky imaging method is a technique used to decrease the effects of atmospheric aberrations using high frequency imaging (Law et al., 2006).
When imaging at high frequency, the dominant noise source is the read noise, meaning that high multiplexing might be useful.
One should notice that the current charting recovery algorithm will not suffice for this approach, because objects that fall on the same detector area will interfere differently on each subobservation, as a result of the atmospheric aberrations.
Therefore, we must assume that the logical limit set by the object density is lower than the bound we got from the object density, to prevent the interference of objects.
Since highspeed detectors are small and expensive, using the multiplexed imaging method the lucky imaging technique may become more useful for imaging larger sky areas.
Expected improvement: An increase in the range of 100 fold (assuming we cannot allow colliding sources) to 1000fold (assuming we can) in the area observed.
6.3 Fast photometry
Astronomers use high frequency observations () to search for rapid changes in the light flux coming from stars, e.g caused by random occultations by Kuiper belt objects (Schlichting et al., 2009) or by intrinsic changes of the stellar flux (e.g, astroseismology, Cunha et al., 2007).
In this scientific use, the dominant source of noise is either the readnoise or the Poisson noise, allowing for high multiplexing.
Expected improvement: An increase of up to 1000 in the amount of stars we will be able to monitor.
6.4 Searching for transient sources
When searching for transients, one seeks the appearance of new objects in the field of view. This means that in principle astronomers try to image as wide an area as possible, observing the same area of sky over and over again. With multiplexed imaging, one can use the reobserving mode.
Expected improvement from the ground: From the ground, when the background noise is dominant, multiplexing can help making shallow allsky surveys 10fold to 100fold more effective, increasing the cadence and reducing the efficiency drop due to slew time.
When designing new instruments, multiplexed imaging may help reduce the cost of survey telescopes, allowing for the use of smaller detectors, larger fnumbers and reducing the demands from the physical machinery.
Expected improvement from space: In this case the dominant noise source is either readnoise or Poisson noise. The expected improvement is up to a factor of 1000.
6.5 Searching for planets, eclipsing binaries and microlensing events
These applications involve monitoring bright stars regularly, to detect flux decrements due to occultation of the star by a planet or an increase of flux due to a lensing event. The flux variability scale can be as small as 0.0001%. At these levels of precision the dominant noise is the Poisson noise, allowing for high multiplexing. It will be especially beneficial when making shallow homogeneous searches for planets.
Expected Improvement: Improvement factor of up to 1000 when searching nondense sky areas, and roughly 10 when monitoring dense regions of the sky.
7 Summary
Multiplexed imaging systems could be gamechangers for future space missions, and would be useful also for new ground based instrumentation.
The use of the method may lead to a new generation of widefield surveying space telescopes as well as efficient groundbased instruments for lucky imaging, fast photometry, and transient and variability surveys.
8 Supplementary
8.1 Construction of the sets
In this section we show how one can construct the set such that there will be no ambiguity in the recovery of two interfering sources. We will start by noticing a useful fact:
Claim. Let be a linear combination of two binary vectors, , such that all the numbers in the set appear in . Then the only set of numbers such that binary vectors and is .
Proof. Assume that there exists such that .
W.l.o.g, and .
Since are binary vectors, the only numbers that can appear in are . So the sets must be equal, but since both sets are sorted, this means that
.
Q.E.D
If we assume that , we know that the numbers are statistically distinguishable in our measurement, meaning that for each index in the vector we can decide which of the values gets. This means that we can exactly construct as with no confusion, leading to a correct recovery of the indices . In order to construct the set we will introduce another helpful fact.
Claim. For every pair of binary vectors with equal sum where , all the numbers appear in the sum .
Proof. The sum of is which is distributed over coordinates, and meaning contains a coordinate with weight larger than 1, so contains the number . Since there is at least one coordinate where they are different and they both have the same number of 1’s, so they must be different on an even number of places, therefore and also appear in .
Q.E.D
To assure that all the numbers appear in every combination of for every we can choose the set to be a set of binary vectors with equal sum, and which will be chosen to be . We need to have , therefore we can determine by the relation , deriving that
Using an example ratio of , we get .
If we want to recover the position in the situation where (or and are indistinguishable due to noise) there needs to be only one way to recover from . We may also want to enforce that every pair of vectors are sufficiently distant to help reduce the confusion in the positions of weak sources. Both conditions can be satisfied by constructing the set from the empty set by inserting vectors with exactly ones in a random order, verifying the condition for every quadruplet of and the condition at every time. Experimentally we got that the achieved with this process is roughly with , which is about twice the value of needed without the above limiting conditions.
8.2 Reconstruction algorithm
Define the confidence parameter . For every pixel on the detector:

Let be the vector of all its subobservations. Initialize the set .

Construct the error vector and set the loss .

For every use weighted least squares to find the parameters such that the loss is minimal. Choose the pair that minimizes the loss.

If

Add v to the chosen vectors set .

Set the loss

Repeat stage 3.


Use the weighted least squares algorithm to find that minimizes , and output all the couples .
Footnotes
 affiliationmark:
 affiliationmark:
 slugcomment: Submitted to PASP
 footnotetext: http://archive.stsci.edu/cgibin/dss_form
References
 Bowen, I. S. 1938, AJ, 88, 113
 Cunha, M. S., Aerts, C., ChristensenDalsgaard, J., et al. 2007, A&A Rev., 14, 217
 Kaiser, N., Burgett, W., Chambers, K., et al. 2010, Proc. SPIE, 7733,
 Law, N. M., Mackay, C. D., & Baldwin, J. E. 2006, A&A, 446, 739
 Law, N. M., Kulkarni, S. R., Dekany, R. G., et al. 2009, PASP, 121, 1395
 Lawrence, A., Warren, S. J., Almaini, O., et al. 2007, MNRAS, 379, 1599
 Martin, D. C., Fanson,J., Schiminovich, D., et al. 2005, ApJ, 619, L1
 Rau, A., Kulkarni, S. R., Law, N. M., et al. 2009, PASP, 121, 1334
 Schlichting, H. E., Ofek, E. O., Wenz, M., et al. 2009, Nature, 462, 895
 Skrutskie, M. F., Cutri, R. M., Stiening, R., et al. 2006, AJ, 131, 1163
 Wright, E. L., Eisenhardt, P. R. M., Mainzer, A. K., et al. 2010, AJ, 140, 1868
 York, D. G., Adelman, J., Anderson, J. E., Jr., et al. 2000, AJ, 120, 1579