Radio interferometric gain calibration as a complex optimization problem

O.M. Smirnov, C. Tasse
Department of Physics and Electronics, Rhodes University, PO Box 94, Grahamstown, 6140 South Africa
SKA South Africa, 3rd Floor, The Park, Park Road, Pinelands, 7405 South Africa
GEPI, Observatoire de Paris, CNRS, Université Paris Diderot, 5 place Jules Janssen, 92190 Meudon, France
E-mail: o.smirnov@ru.ac.za
Accepted 2015 February 24. Received 2015 February 13; in original form 2014 October 30
Abstract

Recent developments in optimization theory have extended some traditional algorithms for least-squares optimization of real-valued functions (Gauss-Newton, Levenberg-Marquardt, etc.) into the domain of complex functions of a complex variable. This employs a formalism called the Wirtinger derivative, and derives a full-complex Jacobian counterpart to the conventional real Jacobian. We apply these developments to the problem of radio interferometric gain calibration, and show how the general complex Jacobian formalism, when combined with conventional optimization approaches, yields a whole new family of calibration algorithms, including those for the polarized and direction-dependent gain regime. We further extend the Wirtinger calculus to an operator-based matrix calculus for describing the polarized calibration regime. Using approximate matrix inversion results in computationally efficient implementations; we show that some recently proposed calibration algorithms such as StefCal and peeling can be understood as special cases of this, and place them in the context of the general formalism. Finally, we present an implementation and some applied results of CohJones, another specialized direction-dependent calibration algorithm derived from the formalism.

keywords:
Instrumentation: interferometers, Methods: analytical, Methods: numerical, Techniques: interferometric

Introduction

In radio interferometry, gain calibration consists of solving for the unknown complex antenna gains, using a known (prior, or iteratively constructed) model of the sky. Traditional (second generation, or 2GC) calibration employs an instrumental model with a single direction-independent (DI) gain term (which can be a scalar complex gain, or a complex-valued Jones matrix) per antenna, per some time/frequency interval. Third-generation (3GC) calibration also addresses direction-dependent (DD) effects, which can be represented by independently solvable DD gain terms, or by some parameterized instrumental model (e.g. primary beams, pointing offsets, ionospheric screens). Different approaches to this have been proposed and implemented, mostly in the framework of the radio interferometry measurement equation (RIME; see ME1), of which RRIME1, RRIME2 and RRIME3 provide a recent overview. In this work we restrict ourselves specifically to calibration of the DI and DD gain terms (the latter in the sense of being solved independently per direction).

Gain calibration is a non-linear least squares (NLLS) problem, since the noise on observed visibilities is almost always Gaussian (though other treatments have been proposed by Kazemi2013a). Traditional approaches to NLLS problems involve various gradient-based techniques (for an overview, see Madsen-NLLS), such as Gauss-Newton (GN) and Levenberg-Marquardt (LM). These have been restricted to functions of real variables, since complex differentiation can be defined in only a very restricted sense (in particular, $\partial\bar{z}/\partial z$ does not exist in the usual definition). Gains in radio interferometry are complex variables: the traditional way out of this conundrum has been to recast the complex NLLS problem as a real problem, by treating the real and imaginary parts of the gains as independent real variables.
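The restriction on complex differentiation can be checked numerically. The following is a minimal sketch, not taken from the paper: it evaluates the difference quotient of the complex conjugate function along two directions and shows that the two estimates disagree, so no single complex derivative exists. The helper name `difference_quotient`, the test point and the step size are illustrative assumptions.

```python
import numpy as np


def difference_quotient(f, z, h):
    """Finite-difference estimate of f'(z) using a complex step h (hypothetical helper)."""
    return (f(z + h) - f(z)) / h


z0 = 1.0 + 2.0j   # arbitrary test point (assumption)
step = 1e-6       # small step size (assumption)

# For a complex-differentiable function the two estimates below would agree.
along_real = difference_quotient(np.conj, z0, step)        # close to +1
along_imag = difference_quotient(np.conj, z0, 1j * step)   # close to -1

print(along_real, along_imag)
```

Because the limit depends on the direction of approach, the conjugate has no derivative in the usual sense, which is why gradient-based NLLS methods have traditionally been set up in terms of real and imaginary parts.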

Recent developments in optimization theory (CR-Calculus; ComplexOpt) have shown that using a formalism called the Wirtinger complex derivative (WirtingerDeriv) allows for a mathematically robust definition of a complex gradient operator. This leads to the construction of a complex Jacobian $J$, which in turn allows traditional NLLS algorithms to be applied directly to the complex variable case. We summarize these developments and introduce basic notation in Sect. LABEL:sec:Wirtinger. In Sect. LABEL:sec:unpol, we follow on from Tasse-cohjones to apply this theory to the RIME, and derive complex Jacobians for (unpolarized) DI and DD gain calibration.
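For orientation, the standard textbook definition of the Wirtinger derivatives is sketched below. The notation $z = x + iy$ is an assumption made here for illustration and may differ from the notation introduced later in the paper.

```latex
% Wirtinger derivatives for z = x + i y (standard definition, illustrative notation)
\frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\,\frac{\partial}{\partial y}\right),
\qquad
\frac{\partial}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\,\frac{\partial}{\partial y}\right)

% z and \bar{z} then behave as independent variables:
\frac{\partial z}{\partial z} = 1, \quad
\frac{\partial \bar{z}}{\partial z} = 0, \quad
\frac{\partial z}{\partial \bar{z}} = 0, \quad
\frac{\partial \bar{z}}{\partial \bar{z}} = 1

% Worked example: f(z) = |z|^2 = z\bar{z}
\frac{\partial f}{\partial z} = \bar{z},
\qquad
\frac{\partial f}{\partial \bar{z}} = z
```

The worked example shows why this is useful for least squares: a real-valued cost such as $|z|^2$ has no ordinary complex derivative away from the origin, yet both of its Wirtinger derivatives are well defined.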

In principle, the use of Wirtinger calculus and complex Jacobians ultimately results in the same system of LS equations as the real/imaginary approach. It does offer two important advantages: (i) equations with complex variables are more compact, and are more natural to derive and analyze than their real/imaginary counterparts, and (ii) the structure of the complex Jacobian can yield new and valuable insights into the problem. This is graphically illustrated in Fig. LABEL:fig:JHJ (in fact, this figure may be considered the central insight of this paper). Methods such as GN and LM hinge around a large matrix, $J^HJ$, with dimensions corresponding to the number of free parameters; construction and/or inversion of this matrix is often the dominant algorithmic cost. If $J^HJ$ can be treated as (perhaps approximately) sparse, these costs can be reduced, often drastically. Figure LABEL:fig:JHJ shows the structure of an example $J^HJ$ matrix for a DD gain calibration problem. The left column shows versions of $J^HJ$ constructed via the real/imaginary approach, for four different orderings of the solvable parameters. None of the orderings yield a matrix that is particularly sparse or easily invertible. The right column shows a complex $J^HJ$ for the same orderings. Panel (f) reveals sparsity that is not apparent in the real/imaginary approach. This sparsity forms the basis of a new fast DD calibration algorithm discussed later in the paper.
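To make the idea of structure in $J^HJ$ concrete, here is a minimal sketch for a toy DI problem, which is simpler than the DD case shown in the figure. It assumes the unpolarized model $v_{pq} = g_p m_{pq} \bar{g}_q$ over all antenna pairs $p \neq q$, and treats $g$ and $\bar{g}$ as independent variables in the Wirtinger sense. The function name `complex_jhj` and the parameter ordering are assumptions made for illustration, not the paper's construction.

```python
import numpy as np


def complex_jhj(g, M):
    """Build J^H J for the toy model v_pq = g_p * M[p, q] * conj(g_q), p != q.

    Columns of J are ordered as [d/dg_0 .. d/dg_{n-1}, d/dconj(g_0) .. d/dconj(g_{n-1})].
    Hypothetical helper for illustration only.
    """
    n = len(g)
    pairs = [(p, q) for p in range(n) for q in range(n) if p != q]
    J_g = np.zeros((len(pairs), n), dtype=complex)    # derivatives w.r.t. g_k
    J_gc = np.zeros((len(pairs), n), dtype=complex)   # derivatives w.r.t. conj(g_k)
    for row, (p, q) in enumerate(pairs):
        J_g[row, p] = M[p, q] * np.conj(g[q])   # d v_pq / d g_p
        J_gc[row, q] = g[p] * M[p, q]           # d v_pq / d conj(g_q)
    J = np.hstack([J_g, J_gc])
    return J.conj().T @ J


rng = np.random.default_rng(1)
n_ant = 5
g = rng.standard_normal(n_ant) + 1j * rng.standard_normal(n_ant)
M = np.ones((n_ant, n_ant), dtype=complex)   # point-source model (assumption)

JHJ = complex_jhj(g, M)
upper_left = JHJ[:n_ant, :n_ant]    # block coupling g with g
upper_right = JHJ[:n_ant, n_ant:]   # block coupling g with conj(g)
print(np.round(np.abs(upper_left), 2))
```

In this toy case the block coupling $g$ with itself comes out exactly diagonal, while the coupling between $g$ and $\bar{g}$ sits entirely in the off-diagonal blocks. This is the kind of structure that a sparse approximation can exploit.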

In Sect. LABEL:sec:separability, we show that different algorithms may be derived by combining different sparse approximations to $J^HJ$ with conventional GN and LM methods. In particular, we show that StefCal, a fast DI calibration algorithm recently proposed by Stefcal, can be straightforwardly derived from a diagonal approximation to a complex $J^HJ$. We show that the complex Jacobian approach naturally extends to the DD case, and that other sparse approximations yield a whole family of DD calibration algorithms with different scaling properties. One such algorithm, CohJones (Tasse-cohjones), has been implemented and successfully applied to simulated LOFAR data: this is discussed in Sect. LABEL:sec:implementations.
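As a rough illustration of what a StefCal-style DI solver looks like in practice, the sketch below alternates a per-antenna linear least-squares update with an averaging step on every second iteration. It is not the authors' implementation or their derivation: the function name `stefcal_sketch`, the unpolarized model $v_{pq} = g_p m_{pq} \bar{g}_q$, the exclusion of autocorrelations and the stopping rule are all assumptions made for this example.

```python
import numpy as np


def stefcal_sketch(V, M, n_iter=200, tol=1e-10):
    """StefCal-style gain solver for V[p, q] ~ g[p] * M[p, q] * conj(g[q]).

    V and M are (n_ant, n_ant) complex arrays with zero diagonals
    (autocorrelations excluded). Hypothetical helper for illustration only.
    """
    n_ant = V.shape[0]
    g = np.ones(n_ant, dtype=complex)
    for it in range(n_iter):
        g_old = g.copy()
        # With the other gains held fixed, each g_p solves a linear LS problem.
        Y = M * np.conj(g_old)[np.newaxis, :]        # Y[p, q] = M[p, q] * conj(g_q)
        g_new = np.sum(np.conj(Y) * V, axis=1) / np.sum(np.abs(Y) ** 2, axis=1)
        # Average with the previous estimate on every second iteration
        # to damp the oscillation of the plain alternating update.
        g = 0.5 * (g_new + g_old) if it % 2 == 1 else g_new
        if np.linalg.norm(g - g_old) <= tol * np.linalg.norm(g):
            break
    return g


# Noise-free toy data for a point-source model (all values are assumptions).
rng = np.random.default_rng(0)
n_ant = 8
g_true = (1 + 0.2 * rng.standard_normal(n_ant)) * np.exp(0.3j * rng.standard_normal(n_ant))
M = np.ones((n_ant, n_ant), dtype=complex)
np.fill_diagonal(M, 0)
V = np.outer(g_true, np.conj(g_true)) * M

g_est = stefcal_sketch(V, M)
V_fit = np.outer(g_est, np.conj(g_est)) * M
print(np.max(np.abs(V_fit - V)))
```

The solution is compared through the fitted visibilities rather than the gains themselves, because the gains are only determined up to a common phase factor. Each iteration costs only a few element-wise operations per baseline and requires no matrix inversion.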

In Sect. LABEL:sec:pol we extend this approach to the fully polarized case, by developing a Wirtinger-like operator calculus in which the polarization problem can be formulated succinctly. This naturally yields fully polarized counterparts to the calibration algorithms defined previously. In Sect. LABEL:sec:variations, we discuss other algorithmic variations, and make connections to older DD calibration techniques such as peeling (JEN:peeling).

While the scope of this work is restricted to LS solutions to the DI and DD gain calibration problem, the potential applicability of complex optimization to radio interferometry is perhaps broader. We will return to this in the conclusions.

Table 1: Notation and frequently used symbols