Control-Theoretic Approach to Communication with Feedback: Fundamental Limits and Code Design
Abstract
Feedback communication is studied from a control-theoretic perspective, mapping the communication problem to a control problem in which the control signal is received through the same noisy channel as in the communication problem, and the (nonlinear and time-varying) dynamics of the system determine a subclass of encoders available at the transmitter. The MMSE capacity is defined to be the supremum exponential decay rate of the mean square decoding error. This is upper bounded by the information-theoretic feedback capacity, which is the supremum of the achievable rates. A sufficient condition is provided under which the upper bound holds with equality. For the special class of stationary Gaussian channels, a simple application of Bode's integral formula shows that the feedback capacity, recently characterized by Kim, is equal to the maximum instability that can be tolerated by the controller under a given power constraint. Finally, the control mapping is generalized to the $m$-sender AWGN multiple access channel. It is shown that Kramer's code for this channel, which is known to be sum-rate optimal in the class of generalized linear feedback codes, can be obtained by solving a linear quadratic Gaussian control problem.
I Introduction
Feedback loops are central to many engineering systems. Their study naturally falls at the intersection of communication and control theories. However, the information-theoretic approach and the control-theoretic one have often evolved in isolation, separated by almost philosophical differences. In this paper we take one step toward bridging the gap, showing how tools from both disciplines can be applied to study the fundamental limits of feedback systems and to design efficient codes for communication in the presence of feedback.
Consider the feedback communication problem over an arbitrary point-to-point channel depicted in Fig. 1a. The encoder, which has causal and noiseless access to the channel outputs, wishes to communicate a continuous message $M$ to the decoder through $n$ channel uses. At the end of the $n$ transmissions, the decoder forms an estimate $\hat{M}$ based on the received channel outputs, and the mean square error (MSE) of the estimate is the performance metric of the communication.
We map this communication problem to the general (nonlinear and time-varying) controlled dynamical system depicted in Fig. 1b, in which the initial state of the system corresponds to the message $M$, and the control actions, received through the same noisy channel as in the communication problem, correspond to the signals transmitted by the encoder. In this representation, the set of controllers for a given system corresponds to a subclass of encoders for the communication problem. In fact, the system can be viewed as a filter which determines the information pattern [Witsenhausen2] on which the transmitted signals (actions) of the encoder (controller) can depend. A similar mapping for the special case of linear time-invariant (LTI) systems and controllers was first presented in [Elia2004].
We study three different channel models. First, we consider a general point-to-point channel. The MSE exponent is defined as the exponential decay rate of the MSE in the block length $n$, and the (feedback) minimum mean square error (MMSE) capacity is defined as the supremum of all achievable MSE exponents with feedback. We show that the MMSE capacity is upper bounded by the information-theoretic (feedback) capacity, the supremum of all achievable rates with feedback. We also present a sufficient condition under which the (information-theoretic) capacity coincides with the MMSE capacity. These results provide a step towards understanding the connection between estimation and information theory.
Second, we focus on the stationary Gaussian channel with feedback, whose capacity was recently characterized by Kim [YH]. Applying the discrete-time extension of Bode's integral formula [DisBode1] (cf. [Bode, Freudenberg]), we observe that the capacity of the Gaussian channel under power constraint $P$ is equal to the maximum instability which can be tolerated by a linear controller with power at most $P$, acting over the same stationary Gaussian channel. This follows almost immediately from the previous results [Elia2004, YH] and provides a step towards understanding the connection between stabilizability over a noisy channel and the capacity of that channel.
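For the memoryless AWGN special case, this correspondence is easy to check numerically. The sketch below is our own illustration (the function names are not from the paper): it computes the feedback capacity $C = \frac{1}{2}\log_2(1+P/N)$ and the largest pole magnitude $2^C$ that a scalar linear plant can have while remaining stabilizable over the channel with power at most $P$.

```python
import math

def awgn_feedback_capacity(P, N=1.0):
    """Capacity (bits/use) of a memoryless AWGN channel with noise power N.

    For the memoryless AWGN channel, feedback does not increase capacity."""
    return 0.5 * math.log2(1.0 + P / N)

def max_tolerable_pole(P, N=1.0):
    """Largest magnitude of an unstable pole that a scalar linear plant may
    have while remaining stabilizable over this channel with transmit power
    at most P, per the capacity-instability correspondence log2|a| < C."""
    return 2.0 ** awgn_feedback_capacity(P, N)

# At SNR P/N = 3 the capacity is 1 bit/use, so poles up to magnitude 2
# can be tolerated.
```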
Finally, we consider the $m$-sender additive white Gaussian noise (AWGN) multiple access channel (MAC) with feedback depicted in Fig. LABEL:fig:GMACa. We show that the linear code proposed by Kramer [KramerFeedback], which is known to be optimal among the class of generalized linear feedback codes [Ehsan], can be obtained as the optimal solution of a linear quadratic Gaussian (LQG) control problem given by a linear time-invariant (LTI) system controlled over a point-to-point AWGN channel, where the asymptotic cost is the average power of the controller. These results provide a step towards understanding how control tools can be used to design codes for communication.
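To give a flavor of an LQG problem whose asymptotic cost is the controller's average power, consider the scalar toy plant $x_{k+1} = a x_k + u_k + z_k$ with $z_k \sim \mathcal{N}(0, N)$; this model and the function below are our own stand-ins, not the construction from the paper. A numerical search over linear feedback gains recovers the minimum stationary power $(a^2-1)N$, which is exactly the power at which the AWGN capacity $\frac{1}{2}\log_2(1+P/N)$ equals the instability $\log_2 a$.

```python
def min_stabilizing_power(a, N=1.0, grid=200000):
    """Minimize the stationary control power E[u^2] for the scalar plant
    x_{k+1} = a*x_k + u_k + z_k, z_k ~ N(0, N), over stabilizing linear
    feedback laws u_k = -k*x_k (closed-loop pole b = a - k in (-1, 1))."""
    best = float("inf")
    for i in range(1, grid):
        b = -1.0 + 2.0 * i / grid          # candidate closed-loop pole
        k = a - b                          # corresponding feedback gain
        power = k * k * N / (1.0 - b * b)  # stationary E[u^2]
        best = min(best, power)
    return best

a, N = 2.0, 1.0
P = min_stabilizing_power(a, N)
# P is approximately (a^2 - 1) * N = 3: with this power the capacity
# 0.5 * log2(1 + P/N) equals the plant's instability log2(a) = 1.
```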
We now place our results in the context of the related literature. The results on the $m$-sender AWGN MAC generalize previous ones of Elia [Elia2004], who recovers Ozarow's code [Ozarow], a special case of Kramer's code, using control theory. Our approach differs from Elia's in both the model and the analysis. Our reduction is to a control problem over a single point-to-point channel for any number of senders, and our analysis is based on the theory of LQG control. In contrast, in [Elia2004] the communication is limited to the 2-sender MAC, which is mapped to a control problem also over a 2-sender MAC, and the analysis is based on the technique of Youla parameterization.
The connection between the MMSE and capacity has been investigated extensively and from different perspectives in the literature. For example, in a classic paper Duncan [Duncan] expresses the mutual information between a continuous random process and its version corrupted by white noise in terms of the causal MMSE. More recently, Forney [Forney] explains the role of the MMSE in the context of capacity-achieving lattice codes over AWGN channels. Guo et al. [Verdu] and Zakai [Zakai] showed that for a discrete random vector observed through an AWGN channel, the derivative of the mutual information between the input and output sequences with respect to the signal-to-noise ratio (SNR) is half the (noncausal) MMSE. We point out that these authors study the average MMSE of a vector observed over a noisy channel without feedback as a function of the SNR. In contrast, we consider the estimation of a single random variable (the message), given the observation of a whole block of length $n$, and we look at the exponential decay rate of the MMSE with $n$ at fixed SNR. Of more relevance to us is the recent work of Liu and Elia [Elia2009], who study linear codes over Gaussian channels obtained using a Kalman filter (KF) approach. For this class of codes, they show that the decay rate of the MMSE equals the mutual information between the message and the output sequence. In contrast, our results for the MMSE capacity are derived using an information-theoretic approach and hold for all codes over general channels.
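For intuition on the derivative relation of Guo et al., the scalar Gaussian case can be checked in closed form: with a standard Gaussian input, $I(\mathrm{snr}) = \frac{1}{2}\ln(1+\mathrm{snr})$ and $\mathrm{mmse}(\mathrm{snr}) = 1/(1+\mathrm{snr})$. The snippet below is our own sanity check of $dI/d\mathrm{snr} = \mathrm{mmse}/2$, not code from any cited work:

```python
import math

def mutual_info(snr):
    """I(X; sqrt(snr)*X + Z) in nats, for standard Gaussian X and Z."""
    return 0.5 * math.log(1.0 + snr)

def mmse(snr):
    """Noncausal MMSE of Gaussian X given sqrt(snr)*X + Z."""
    return 1.0 / (1.0 + snr)

def i_mmse_gap(snr, h=1e-6):
    """|dI/dsnr - mmse/2| via a central finite difference."""
    deriv = (mutual_info(snr + h) - mutual_info(snr - h)) / (2.0 * h)
    return abs(deriv - mmse(snr) / 2.0)

# i_mmse_gap(1.0) is negligible (finite-difference error only).
```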
Additional works in the literature have revealed connections between control theory and information theory. Without attempting to be exhaustive, we distinguish between those that use information theory to study control systems and those that use control theory to study communication systems. Within the first group, Mitter and Newton [Mitter03, Mitter05] studied estimation and filtering in terms of information and entropy flows. In the last decade, Bode-like fundamental limitations of controlled systems have been analyzed successfully from an information-theoretic perspective [iglesias, iglesias2, Nuno07, Nuno08, Yu]. In this context, we point out that our Lemma LABEL:generalupp, when specialized to the case of additive channels, provides an alternative proof of Theorem 4.2 in [Nuno08].
Within the second group, Elia [Elia2004] was the first to map linear codes for additive white Gaussian noise channels to an LTI system controlled over an AWGN channel. Subsequently, Wu et al. [Sriram] studied the Gaussian interference channel in terms of estimation and control. Tatikonda and Mitter [Tatikonda] used a Markov decision problem (MDP) formulation to study the capacity of Markov channels with feedback, and recently Coleman [Coleman] considered the design of the feedback encoder from a stochastic control perspective. Finally, we also refer the reader to the work in [MitterSurvey], which gives a historical perspective and contains selected additional references.
The rest of the paper is organized as follows. Section II presents the definitions and the mapping between the feedback communication and the control problem. Section III provides the upper bound on the MMSE capacity for a general point-to-point channel. The point-to-point stationary Gaussian channels and the AWGN multiple access channel are considered in Section LABEL:SGC and Section LABEL:MAC, respectively. Finally, Section LABEL:con concludes the paper.
Notation: A random variable is denoted by an upper case letter (e.g., $X$) and its realization is denoted by a lower case letter (e.g., $x$). Similarly, a random column vector and its realization are denoted by bold face symbols (e.g., $\mathbf{X}$ and $\mathbf{x}$, respectively). Upper case letters (e.g., $A$) also denote matrices, which can be differentiated from random variables based on the context. The $(i,j)$th element of $A$ is denoted by $A_{ij}$, and $A'$ and $A^*$ denote the transpose and conjugate transpose of the matrix $A$, respectively. We use the short notation $K_{\mathbf{X}}$ and $K_{\mathbf{X}\mathbf{Y}}$ for the covariance matrices $\mathrm{E}[\mathbf{X}\mathbf{X}']$ and $\mathrm{E}[\mathbf{X}\mathbf{Y}']$, respectively.
II Definitions and Control Approach
Consider the communication problem depicted in Fig. 1a, where the sender wishes to convey a message $M$ to the receiver through $n$ uses of a stochastic channel,
$Y_i = h_i(X_i, Z_i), \quad i = 1, \ldots, n,$
where $X_i$ and $Y_i$ denote the input and output of the channel, respectively, and $Z_i$ denotes the noise at time $i$. The set of mappings
$\{h_i\}_{i=1}^{n}$
and the distribution of the noise sequence $Z^n = (Z_1, \ldots, Z_n)$ determine the channel. The noise process is assumed to be independent of the message $M$. We assume that the output symbols are causally fed back to the sender and the transmitted symbol $X_i$ at time $i$ can depend on both the previous channel output sequence $Y^{i-1} = (Y_1, \ldots, Y_{i-1})$ and the message $M$.
Definition 1
We define a (feedback) code as

an encoder: a set of (stochastic) encoding maps $g_1, \ldots, g_n$ (for stochastic encoders we can write $X_i$ as a function of $(M, Y^{i-1}, W_i)$, where $\{W_i\}$ is a random process independent of $M$ and $Z^n$), also known to the receiver, such that for each $i = 1, \ldots, n$,
$X_i = g_i(M, Y^{i-1}),$   (1)
and

a decoder: a decoding map $\hat{m}$ which determines the estimate $\hat{M}$ of the message based on the received sequence $Y^n$, i.e.,
$\hat{M} = \hat{m}(Y^n).$   (2)
We assume that the message $M$ is a random variable uniformly distributed over the unit interval and does not depend on the block length $n$. As the performance measure of the communication, we consider the MSE,
$D_n = \mathrm{E}\big[(M - \hat{M})^2\big],$   (3)
where the expectation is with respect to the randomness of both the message and the channel. Note that the decoder does not affect the joint distribution of $(M, Y^n)$ and simply estimates the message at the end of the block. Hence, without loss of generality, we pick the optimal decoder, namely the MMSE estimator of the message given $Y^n$, and we call an encoder optimal if it minimizes the MSE $D_n$. Let
$E_n = -\frac{1}{n} \log D_n$
be the exponential decay rate of the MSE with respect to $n$.
Definition 2
The MSE exponent $E$ is called achievable (with feedback) if there exists a sequence of codes such that
$\liminf_{n \to \infty} -\frac{1}{n} \log D_n \geq E.$   (4)
The MMSE capacity is the supremum of all achievable MSE exponents.
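As a concrete example of an achievable MSE exponent, the classical Schalkwijk–Kailath linear feedback scheme over a memoryless AWGN channel drives the MSE down by the factor $1+\mathrm{snr}$ per channel use: at each use the encoder retransmits, at full power, the receiver's current estimation error. The sketch below is our own illustration (with a Gaussian rather than uniform message, for simplicity of the variance recursion) of the exact MMSE recursion:

```python
import math

def sk_mse(n, snr, var0=1.0):
    """Mean square error after n uses of the Schalkwijk-Kailath linear
    feedback scheme over an AWGN channel: the encoder transmits the scaled
    estimation error, shrinking its variance by (1 + snr) each use."""
    var = var0
    for _ in range(n):
        var /= (1.0 + snr)
    return var

snr, n = 3.0, 10
exponent = -math.log(sk_mse(n, snr)) / n   # MSE decay rate in nats/use
capacity = 0.5 * math.log(1.0 + snr)       # C in nats/use
# exponent equals 2 * capacity: the scheme achieves MSE exponent 2C.
```

The exponent $\ln(1+\mathrm{snr})$ is exactly twice the capacity $\frac{1}{2}\ln(1+\mathrm{snr})$, consistent with the factor of two relating rates and MSE exponents in Lemma 2.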
The communication problem described above can be viewed as a control problem in which the encoder wishes to control the receiver's knowledge of the message. In his notable paper [Witsenhausen], Witsenhausen wrote: "When communication problems are considered as control problems (which they are), the information pattern is never classical since at least two stations, not having access to the same data, are always involved". However, this is not the case for the feedback communication problem in Fig. 1a, since the encoder (controller) also has access, via feedback, to the information available at the decoder. In fact, below we show that this feedback communication problem can be represented as a control problem in which the controller has complete state information.
Consider the control problem in Fig. 1b, where the state at time $i$ is
$S_i = f_i(S_{i-1}, Y_i),$   (5)
with initial state
$S_0 = M,$
and $Y_i$ the corresponding output of the same noisy channel as in the communication problem. We refer to the mappings $f_1, \ldots, f_n$,
as the system. The controller, which observes the current state $S_{i-1}$, picks an action (symbol) $X_i$,
$X_i = \pi_i(S_{i-1}),$   (6)
according to a set of (stochastic) mappings $\pi_1, \ldots, \pi_n$.
We refer to the set $\{\pi_i\}_{i=1}^{n}$ as the controller.
The communication problem in Fig. 1a can be represented as the control problem in Fig. 1b as follows. Let the system be such that the state at time $i$ is the collection of the initial state $M$ and the past observations $Y^i$, namely,
$S_i = (M, Y^i).$   (7)
Also, let the controller be picked according to the encoder in the communication problem, so that the action in (6) equals the transmitted symbol in (1).
Then the joint distribution of all the random variables in the control problem is the same as that in the communication problem.
To complete the representation, let $\hat{S}_0$ be the MMSE estimate of the initial state $S_0 = M$ based on $Y^n$, and let the final cost be
$(S_0 - \hat{S}_0)^2.$
We call a controller optimal if it minimizes the final expected cost
$\mathrm{E}\big[(S_0 - \hat{S}_0)^2\big].$
Thus, the optimal controller represents the optimal encoder for the communication problem.
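The representation above can be sketched in a few lines: the state carries $(m, y^i)$, the controller applies the communication encoder to that state, and the resulting action passes through the channel. The callables `encoder` and `channel` below are hypothetical placeholders for the encoding maps and the channel law; none of these names come from the paper.

```python
import random

def run_control_representation(encoder, channel, n, seed=0):
    """Simulate the control system of Fig. 1b with state s_i = (m, y^i):
    the message is the initial state, the control action at each step is
    the encoder applied to the state, and the state update appends the
    newly observed channel output."""
    rng = random.Random(seed)
    m = rng.random()        # message = initial state, uniform on [0, 1)
    ys = []                 # past channel outputs carried in the state
    for _ in range(n):
        x = encoder(m, ys)          # control action = transmitted symbol
        ys.append(channel(x, rng))  # new output enters the state
    return m, ys

# Example: a simple linear feedback encoder over an AWGN channel.
m, ys = run_control_representation(
    encoder=lambda m, ys: m - (ys[-1] if ys else 0.0),
    channel=lambda x, rng: x + rng.gauss(0.0, 1.0),
    n=5,
)
```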
The system in (7) is the most general system which can represent all the encoders for the communication problem. However, if the system is more restricted, so that the state $S_i$ is a filtered version of $(M, Y^i)$, then the controller represents only a subclass of encoders in which the transmitted symbol depends on $(M, Y^{i-1})$ only through the state $S_{i-1}$ (see (6)). In that case, we can view the system as a filter which determines the information pattern available at the controller (encoder). Below, we show a subclass of encoders for which the state does not include all the past outputs as in (7), yet which contains all the optimal encoders. Let $\beta_i$ denote the conditional distribution of the message $M$ given the channel outputs $Y^i$.
Lemma 1
An optimal encoder which minimizes the MSE $D_n$ can be found in the subclass of encoders determined by the system whose state at time $i$ is the conditional distribution of the message given the channel outputs $Y^i$.
Proof:
See Appendix LABEL:appleminfo.
This lemma, which is based on a well-known result in stochastic control, provides an information pattern that is sufficient for building the optimal feedback encoders for communication over a general channel. For example, in the special case of memoryless channels, Shayevitz and Feder [Oferarxiv] proposed an explicit encoder which uses the information pattern described in Lemma 1, and showed that it is optimal in terms of the achievable rates.
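Lemma 1 rests on the fact that the conditional distribution of the message can be updated recursively, so it can serve as the controller's state. Below is a minimal Bayesian-filter sketch over a discretized message set; this is our own illustration, and `likelihoods[m]` stands for a hypothetical channel likelihood of the newest output under message bin `m`.

```python
def posterior_update(prior, likelihoods):
    """One step of the recursive state update suggested by Lemma 1: the new
    posterior over a discretized message set depends on the past outputs
    only through the previous posterior and the newest output's likelihoods."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

# Uniform prior over 4 message bins and one channel observation:
post = posterior_update([0.25] * 4, [0.1, 0.2, 0.3, 0.4])
# post is approximately [0.1, 0.2, 0.3, 0.4] (renormalized likelihoods).
```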
III MMSE Capacity
In this section we present the relationship between the MMSE capacity and the information-theoretic capacity [CoverThomas2006].
First, we review the definition of the information-theoretic achievable rates, which is an asymptotic measure based on a sequence of message sets whose sizes depend on the block length $n$. Consider the code in Definition 1 and let the message $M_n$ be uniformly distributed over a set $\mathcal{M}_n$ such that
$\frac{1}{n} \log |\mathcal{M}_n| = R.$   (8)
Let the probability of error be $P_e^{(n)} = \mathrm{P}\{\hat{M} \neq M_n\}$. A rate $R$ is called achievable (with feedback) if there exists a sequence of codes with message sets $\mathcal{M}_n$ for which $P_e^{(n)} \to 0$ as $n \to \infty$. The (feedback) capacity $C$ is the supremum of all achievable rates.
Recall that the MSE exponent is defined based on a message which does not depend on the block length $n$. The following lemma provides the connection between the achievable rates and MSE exponents.
Lemma 2
If the MSE exponent $E$ is achievable, then any rate $R < E/2$ is achievable. Conversely, if the rate $R$ is achievable and the probability of error satisfies $P_e^{(n)} \leq 2^{-2nR}$, then the MSE exponent $2R$ is achievable.
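The rate-to-exponent direction can be made concrete with a one-line computation: map each of $2^{nR}$ messages to the midpoint of an equal-width bin of $[0,1)$; when the block decoder picks the correct bin, the residual squared error is the variance of a uniform quantization error, of order $2^{-2nR}$. The check below is our own illustration with hypothetical parameter values (exponent measured in bits):

```python
import math

def quantized_mse(n, R):
    """MSE of representing a uniform message on [0, 1) by the midpoint of
    one of 2^(n*R) equal bins, assuming the block decoder selects the
    correct bin: the error is uniform on a bin, variance width^2 / 12."""
    width = 1.0 / 2 ** (n * R)
    return width * width / 12.0

n, R = 20, 0.5
E = -math.log2(quantized_mse(n, R)) / n   # MSE exponent in bits per use
# E = 2R + log2(12)/n, which tends to the exponent 2R as n grows.
```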