Separated design of encoder and controller for
networked linear quadratic optimal control
Abstract.
For a networked control system, we consider the problem of encoder and controller design. We study a discrete-time linear plant with a finite-horizon performance cost comprising a quadratic function of the states and controls, and an additive communication cost. We study separation in the design of the encoder and controller, along with related closed-loop properties such as the dual effect and certainty equivalence. We consider three basic formats for encoder outputs: quantized samples, real-valued samples at event-triggered times, and real-valued samples over additive noise channels. If the controller and encoder are dynamic, then we show that the performance cost is minimized by a separated design: the controls are updated at each time instant as per a certainty equivalence law, and the encoder is chosen to minimize an aggregate quadratic distortion of the estimation error. This separation is shown to hold even though a dual effect is present in the closed-loop system. We also show that this separated design need not be optimal when the controller or the encoder is to be chosen from within restricted classes.
1. Introduction
We consider discrete-time sequential decision problems for a control loop that has a communication bottleneck between the sensor and the controller (Figure 1). The design problem is to choose in concert an encoder and a controller. The encoder maps the sensor’s raw data into a causal sequence of channel inputs. Depending on the channel model adopted in this paper, the encoder performs either sequential quantization, sampling, or analog companding. The controller maps channel outputs into a causal sequence of control inputs to the plant. Such two-agent problems are generally hard because the information pattern is nonclassical, as the controller has less information than the sensor [51]. This gives scope for the controller to exploit any dual effect present in the loop, even when the plant is linear [14]. These two-agent problems are at the simpler end of a range of design problems arising in networked control systems [11, 3, 21, 1]. Naturally, one seeks formulations of these design problems as stochastic optimization problems whose solutions are tractable in some suitable sense.
The classical partially observed linear quadratic Gaussian (LQG) optimal control problem is a one-agent decision problem [52]. Given a linear, Gauss-Markov plant, one is asked for a causal controller, as a function of noisy linear measurements of the state, to minimize a quadratic cost function of states and controls. This problem has a simple and explicit solution, where the optimal controller ‘separates’ into two policies: one to generate a minimum mean-squared error estimate of the state from the noisy measurements, and the other to control the fully observed Gauss-Markov process corresponding to the estimate. A networked version of this problem is the following two-agent LQG optimal control problem [10]. Given a linear Gauss-Markov plant and a channel model, one is asked for an encoder and controller to minimize a performance cost which is a sum of a communication cost and a quadratic cost on states and controls. The communication cost is charged on decisions at the encoder, which are chosen to satisfy constraints imposed by the channel model. No causal encoding or control policies are, in general, excluded from consideration. As in the one-agent version, a certain ‘separated’ design is optimal, as has been suggested in various settings since the sixties [27, 42, 16, 5, 33, 45, 31, 53, 35, 6, 34, 56]. Precisely, the following combination is optimal: certainty equivalence controls with a minimum mean-squared error estimator of the state, and an encoder that minimizes a distortion for state estimation at the controller. The distortion is the average of a sum of squared estimation errors with time-varying coefficients depending on the coefficients of the performance cost. This separation is different from that obtained in the classical LQG problem, but it is still due to a linear evolution of the state, and the statistical independence of noises from all other current and past variables. As in the classical one-agent version [43, 41], the random variables need not be Gaussian.
1.1. Previous works
In the long history of the two-agent networked LQG problem, different channel models have been treated, leading to different types of encoders. We find in these works that the encoder is either a quantizer, an analog time-dependent compander, or an event-based sampler.
When a discrete alphabet channel is treated, the encoder is a time-dependent quantizer. Quantized control has been explored since the sixties, and structural results for this problem have seen spirited discussions over the years [27, 32, 16]. This problem was revisited by Borkar et al. [10] in recent years, setting off a new wave of interest. Surveys can be found in [35, 18]. For an additive noise channel, the encoder is a time-dependent, possibly nonlinear, compander. The corresponding networked LQG problem has been studied in [5], and more recently in [17, 19]. Analog channels with channel use restrictions lead to an encoder being an event-triggered sampler [2]. The networked LQG problem for event-triggered sampling is studied in [34].
The above papers suggested separated designs for the two-agent LQG problem with dynamic encoder and controller, and certainty equivalence controls. This is despite other results [13, 15] confirming the dual effect in the two-agent networked control problem. Thus, there can be an incentive for the controller to influence the estimation error, and yet the optimal controller chooses to ignore this incentive. Furthermore, for the two-agent LQG problem with event-triggered sampling, and with zero-order hold control between samples, Rabi et al. [39] showed through numerical computations that it is suboptimal to apply controls affine in the minimum mean-squared error (MMSE) estimate. The optimal controls are nonlinear functions of the received samples. Thus, the literature does not tell us when separation holds, and when it does not, for the general class of two-agent problems.
1.2. Our contributions
We make three main contributions. Firstly, we show that for the combination of a linear plant and nonlinear encoder, the dual effect is present. This confirms the results of Curry and others [13, 15], by establishing through a counterexample that there is a dual effect in the closed-loop system. In fact, each of the three models we allow for the channel endows the loop with the dual effect. The dual role of the controller lies in reducing the estimation error in the future, using the predicted statistics of the future state and knowledge of the encoding policy. Due to this dual role, we show that, in general, separated designs need not be optimal for linear plants with nonlinear measurements, even with independent and identically distributed (IID) Gaussian noise and quadratic costs. Examples 5 and 6 show instances where the dual effect matters. Example 3 shows how the dual effect in the two-agent networked LQ problem renders useless the techniques that work for the classical, single-agent, partially observed LQ problem. These examples illustrate the insufficiency of arguments offered in [27, 42, 16, 5, 33, 45, 31, 53, 35, 6, 34, 56] for the optimality of separation and certainty equivalent controls.
Our second contribution is a proof for separation in one specific design problem. We prove that for the dynamic encoder-controller design problem, it is optimal to apply separation and certainty equivalence. A key instrument in our proof is the class of ‘controls-forgetting encoders’ (introduced in Section 4.2) which we show to be optimal despite it being a strict subset of the general class of state-based encoders. We also notice that the result holds under a variety of schemes for charging communication costs. For example, it holds even when the encoder is an analog compander with hard amplitude limits. Our proof does not require the dual effect to be absent. Hence there is no contradiction with the fact that separation and certainty equivalence are not optimal for other design problems concerning the same plant-sensor combination. Our work also provides a direct insight to explain separation or the lack of it, in the form of a property of the optimal cost-to-go function (Example 4 in Section 6). Furthermore, we show that when this property does not hold, separation is no longer optimal.
Our third contribution points out some subtleties that arise when dynamic policies are involved. We explicitly demonstrate that with dynamic encoders for LQ optimal control, one cannot extend and apply a result of Bar-Shalom and Tse [7] which mandates absence of dual effect for certainty equivalence to be optimal. The classical notion of a dual effect was introduced for static measurement policies, and the dual role of the controls has been motivated through the notion of a probing incentive [14]. We ask if the concept of probing applies unchanged for dynamic measurement policies and point out some subtleties in answering this question.
In recent years, there has been a resurgence of interest in problems related to dynamic and decentralized decision making in stochastic control. Old problems and results have been reexamined and reinterpreted to find new insights and develop new methods, such as the common information approach [30, 36]. Others, such as [26], have sought to reinterpret the proof techniques used in [4]. Following in the path of [50], many new counterexamples have been identified that show optimality of nonlinear strategies for control problems under nonclassical information patterns [29, 57]. Similarly, drawing from the many works on two-agent networked LQG problems [13, 15, 10, 35, 18], we have sought to understand why a structural simplification can be found in some dynamic decision problems, despite the nonclassical information pattern and the consequent presence of a dual effect.
1.3. Outline
The remainder of the paper is organized as follows. In Section 2, we present a basic problem formulation, pertaining to encoder and controller design for data-rate limited channels. In Section 3, we discuss the notion of a dual effect and certainty equivalence, and present a counterexample to establish that there is a dual effect in the considered networked control system. In Section 4, we present a proof for separation in the two-agent networked LQG problem. In Section 5, we extend our results to other channel models, including event-triggered samples and additive noise channels. In Section 6, we present a number of examples to illustrate that in general, separation does not hold for constrained design problems, followed by the conclusions in Section 7.
2. Problem formulation
In this section, we describe a version of the two-agent networked LQG problem, corresponding to a rate-limited channel model. We consider an instantaneous, error-free, discrete-alphabet channel; the logarithm of the size of the alphabet is the bit rate. A control system that uses such a channel to communicate between its sensor and controller is depicted in Figure 1, and comprises four blocks. Each of these blocks, along with the performance cost, is described below, followed by a description of the design problems under consideration.
2.1. Plant
The plant state process is scalar, and its evolution law is linear:
(1) 
for Here is the controls process, and is the plant noise process, which is a sequence of independent random variables with constant variance , and zero means. The initial state has a distribution with mean and variance . At any time , the noise is independent of all state, control, channel input, and channel output data up to and including time . We assume that the state process is perfectly observed by the sensor.
2.2. Performance cost
The performance cost is a sum of the quadratic cost charged on states and controls, and a communication cost charged on encoder decisions:
(2) 
where and are suitably chosen scalar weights for the squares of the states and controls, respectively. The communication cost is an average quantity that depends on the encoding and control policies, and the channel model adopted.
2.3. Channel model
The channel model refers to an input-output description of the communication link from the sensor to the controller. We denote the channel input at time by , the corresponding output by , and the encoding map generating by . In Figure 1, we consider an ideal, discrete alphabet channel that faithfully reproduces inputs, and thus, . The encoder’s job is to pick at every time , the encoding map producing a channel output letter from the preassigned finite alphabet where the nonnegative integer is the preassigned size of the channel alphabet. Since the alphabet is fixed, we have a hard data-rate constraint at every time. Hence there is no explicit cost attached to communication, so in this case. In Section 5, we consider other channel models that permit the data rate or energy needed for each transmission to be chosen causally by the encoder.
2.4. Controller
The control signal is real valued and is to be computed by a causal policy based on the sequence of channel outputs. The controller has perfect memory, and thus remembers all of its past actions, and the causal sequence of channel outputs. Thus, in general, at every time the controller’s map takes the form:
2.5. Encoder
At all times, the encoder knows the entire set of control policies employed by the controller and the statistical parameters of the plant. With this prestored knowledge, the encoder works as a causal quantizer mapping the sequence of plant outputs. Thus, the encoder’s map takes the form:
Notice that we do not allow the encoder to directly view the sequence of inputs to the plant. This subtle point plays an important role in the examples we present in Section 6.
2.6. Design problems
For a given information pattern, different design spaces may arise due to engineering heuristics, hardware or software limitations, etc. Any such design space is a subset of the set of all admissible encoder and controller pairs. We identify four design problems, each associated with its own design space. For these design problems, an adopted channel model can be either the one described in Section 2.3, or any of the models from Section 5. First, we pose a singleagent design problem which has a classical information pattern.
Design problem 1 (Controller-only Design).
Next we pose a design problem where the design space is the largest possible nonrandomized set of admissible encoder-controller pairs. We consider every causally time-dependent encoder and controller. In other words, for this type of design problem, regardless of the choices one makes for channel and communication cost, at any time, the controller can update the control signal using all of the channel outputs up till then.
Design problem 2 (Dynamic Encoder-Controller Design).
Next we pose a design problem where the controller and encoder must respect a restriction on selecting the control signals or encoding maps. At every time, the control values must be chosen from a restricted set , such as the interval or the finite set . Likewise, the encoding maps have to be chosen from within restricted sets. For example, the encoding maps may be constrained to consist of two quantization cells , where the encoder threshold must be chosen from a restricted set , say the interval . Subject to these constraints, the controller and encoder policies are still to be dynamically chosen.
Design problem 3 (Constrained Encoder-Controller Design).
Next we pose a design problem where the controller must respect not only the information pattern in the dynamic encoder-controller design problem (Design problem 2), but also a restriction on updating controls. Basically, the control waveform is generated in a piecewise ‘open-loop’ way, while epochs and encoding maps are picked using dynamic policies. Let , be two random integers such that . Then the two epochs are and . These epochs are chosen by the controller respecting the inequalities: and , and hence have to be adapted to all the data available at the controller. Within an epoch, the controller must pick controls depending only on data at the start of the epoch. Precisely, given the condition that , and given the initial observation , the controls must be a fixed function of regardless of the data .
Design problem 4 (Hold-Waveform-Controller and Encoder Design).
For the linear plant (1), and the adopted channel model, the hold-waveform-controller and encoder design problem is to pick a causal sequence of encoding policies in concert with a causal sequence of policies for epochs and controls to minimize the performance cost (2). The controls are restricted to depend on the controller’s data in the specific form:
A special case of a hold-waveform controller is that of zero-order hold (ZOH) control, where an additional restriction forces the control waveform to be held constant over each epoch.
For all four design problems presented above, we assume the existence of measurable policies minimizing the associated costs. We avoid investigating the necessary technical qualifications except to say that if need be, one may allow randomized policies, or even reject the class of merely measurable policies in favour of the class of universally measurable policies [9].
3. Dual effect and certainty equivalence
We begin by presenting a definition of dual effect [14] and certainty equivalence [25]. We then present an example to establish that there is a dual effect of the controls in the networked control system introduced in Section 2.
3.1. Dual effect
In a feedback control loop, the dual effect is an effect that the controller may see in the rest of the loop. When it is present, the control laws affect not just the first moment, but also second, third and higher central moments of the controller’s nonlinear filter for the state. Below, we state this formally for a controlled Markov process with partial observations available to the controller:
(3) 
where the sequences and are the real-valued plant state and control processes, respectively, see Figure 2. The sequence is the observation process and the sequences and are the plant noise and observation noise processes, respectively. Assume that all the primitive random variables are defined on a suitable probability triple, . Now, consider two arbitrary admissible sets of control policies: . Once we pick one such set of control policies, they together with the measure define the states, observations and controls as random processes. The choice of policies fixes their statistics. We can advertise this relationship by (1) specifying random variables, for example, in the form , (2) specifying a filtration, for example, the one generated by the process as , or (3) specifying an expected value of a functional, for example, in the form
where stands for any element of the sample space of the primitive random variables. To minimize the notational burden, we advertise the dependence on the set of control policies only as needed. We now define the dual effect by defining its absence.
Definition 1 (Dual effect).
The networked control system in Figure 2 is said to have no dual effect of second order if

for any two sets of admissible control policies, and

for any two time instants ,
we have for every , and that for any given event ,
Thus, we require equality of the two sets of covariances of filtering/prediction/smoothing errors, corresponding to any two choices of control strategies. In the definition above, by choosing one set of control policies, say as resulting in , for all , we obtain the definition of Bar-Shalom and Tse [7].
3.2. Certainty equivalence
For the controlled Markov process (3), consider the general cost
where is a given nonnegative cost function. Imagine that a muse could at time supply to the controller the exact values of all primitive random variables by informing the controller of the exact element of the sample space . With such complete and acausal information, the controller could, in principle, solve the deterministic optimization problem
Let be an optimal control law for this deterministic optimization problem. We now state the definition of certainty equivalence from van der Water and Willems [46]:
Definition 2.
Clearly, this law is causal. Notice also that its form is tied to the performance cost, and to the statistics of the state and observation processes. It is possible for certainty equivalence control laws to be nonlinear, and such laws can be optimal even when separated designs may not be. For linear plants, they can sometimes be linear or affine, as indicated by the following proposition from [46] adapted to our problem.
Lemma 1 (Affine certainty equivalence laws for linear plants).
Definition 3 (Certainty equivalence property).
The certainty equivalence property holds for a stochastic control problem if it is optimal to apply the certainty equivalence control law.
For the stochastic control problem described in Lemma 1, with nonlinear measurements that do not result in a dual effect of the controls, Bar-Shalom and Tse [7] showed that the certainty equivalence property holds.
We now consider a simple example, and show that there is a dual effect of the control signal in the closed-loop system presented in Section 2.
Example 1.
For the plant (1), let , , and . Let this information be known to the encoder and the controller, which simply means that . Let the variance . For the objective function, let the horizon end at , and let . Let the channel alphabet be the discrete set .
For the given threshold , let the encoder at be:
(4) 
The optimal control law at is , where . Using the encoding policy and the optimal control signal , the performance cost with can be written as a function of the control at :
In the above expression, is the quantization distortion, which is thus proportional to the conditional variance of the controller’s minimum mean-squared estimation error of . Notice that is a function of , thus resulting in a dual effect of the control signal in the plant-encoder-channel combination. Figure 3 shows how the quantization distortion depends on . The total cost is also plotted and the optimal value is shown to be different from the certainty equivalent control .
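The dependence of the quantization distortion on the first control can be illustrated numerically. The following is a minimal sketch, not the computation behind Figure 3: it assumes a plant of the form x1 = a*x0 + u0 + w0 with a known initial state, Gaussian plant noise, and a binary encoder that compares x1 with a fixed threshold. The function names and the default parameter values are hypothetical and chosen only for illustration.

```python
import math


def std_normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binary_quantizer_distortion(u0, theta, a=1.0, x0=0.0, sigma_w=1.0):
    """MMSE distortion E[(x1 - E[x1 | z1])^2] for the bit z1 = 1{x1 > theta}.

    Assumed model (illustrative): x1 = a*x0 + u0 + w0, x0 known,
    w0 ~ N(0, sigma_w^2). The threshold theta is fixed in advance,
    so the control u0 shifts the mean of x1 relative to the threshold.
    """
    mean = a * x0 + u0
    var = sigma_w ** 2
    alpha = (theta - mean) / sigma_w
    pdf, cdf = std_normal_pdf(alpha), std_normal_cdf(alpha)
    if cdf <= 0.0 or cdf >= 1.0:
        # The bit is (numerically) deterministic and carries no information.
        return var
    # Variance explained by the two conditional means of a split Gaussian.
    explained = var * pdf * pdf / (cdf * (1.0 - cdf))
    return var - explained


if __name__ == "__main__":
    theta = 1.0
    for u0 in (-1.0, 0.0, 0.5, 1.0, 2.0):
        d = binary_quantizer_distortion(u0, theta)
        print(f"u0 = {u0:+.1f}  distortion = {d:.4f}")
```

Under these assumptions the distortion is smallest when the control places the mean of x1 exactly on the threshold, and grows as the mean moves away from it. A control that is best for the quadratic cost alone therefore need not be best once the estimation error is accounted for.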
4. Dynamic encodercontroller design
In this section we solve the dynamic encoder-controller design problem (Design problem 2) which allows both controls and encoders to be dynamic. We work out the details for the discrete alphabet channel with the fixed alphabet size . We begin by examining a known structural property of optimal encoders. This states that it is optimal for the encoder to apply a quantizer on the state , with the shape of the quantizer depending only on past quantizer outputs. Next, we present a structural property for encoders called controls-forgetting, which leads to separation. Finally, we show that one optimal encoder for Design problem 2 does indeed possess this property, which leads to separation and certainty equivalence for this problem.
4.1. Known structural properties of optimal encoders
Let us now formulate the encoder’s Markov decision problem. Fix the control policies to be the arbitrary, but admissible laws:
Then the optimization problem reduces to one of picking encoding policies. This is a single-agent, sequential decision problem, and hence one with a classical information pattern. The action space for this decision problem is the infinite-dimensional function space of discrete-valued encoders. At time , the encoder takes as input: the current and previous states, all previous outputs, and all previous encoding maps. For convenience, we can view this encoding map as a function of only the current state but with the rest of the inputs considered as parameters determining the form of this function. Thus, without loss of generality the encoder can be described as the function
having as its argument with its shape determined by Hence the action space at times can be described as: Identifying encoders as decisions to be picked is not enough, as the signal need not be Markov. We utilize the following property.
Lemma 2 (Striebel’s sufficient statistics).
For every design problem we have set up, the signals
form sufficient statistics for the encoding decision at time .
Proof.
See Striebel [44]. ∎
Hence, at every time , performance is not degraded by the encoder choosing to quantize just instead of quantizing the entire waveform . Of course the shape of the quantizer is allowed to vary with past encoder shapes, past encoder outputs, and past control inputs. But given the sufficient statistics, the encoder can forget the data: .
Denote by the data at the controller just after it has read the channel output and just before it has generated the control value . Similarly denote by the data at the controller just after it has generated the control value . Then
Also let
The problem we consider has two decision makers that jointly minimize a given cost function. The information available to these decision makers is not the same, and neither is the information available to each agent a subset of the information available to the agent downstream in the loop. Thus, the information pattern here is neither classical nor nested. We apply the common information approach to our problem. (Footnote 1: This approach was first proposed by Witsenhausen, as a conjecture in [51], to deal with multiple decision makers and nonclassical information patterns in a general setting. This conjecture was shown to be true by Varaiya and Walrand in [47] for a special case. Our terminology is derived from [36], where the conjecture has been studied in detail.) This approach allows a designer to treat a problem with multiple decision makers as a classical control problem with a single decision maker that has access to partial state information. When applied to our setup, this approach leads to the following structural result at the encoder. The encoding policy is selected based on the information available to the controller at the previous time instant namely . At times respectively, the data comprise the common information in this problem. The encoding map is applied to the state , which is private information available to the encoder. A similar approach has been used by others for problems of quantized control [12, 49, 55].
4.2. Controls-forgetting encoders and separation
We now present a structural property of encoders which ensures separation in design. Recall the plant (1) and cost (2), and define the following control-free part of the state:
At the encoder, the change of variables
(5) 
is causal and causally invertible. Hence the statistics are also sufficient statistics at the encoder. We now introduce the innovation encoding of Borkar and Mitter [10].
Definition 4 (Innovation encoder [10]).
An encoder with the inputs and outputs:
is admissible and is called an ‘innovation’ encoder.
The networked control system in Figure 1 redrawn with an innovation encoder is shown in Figure 4. Note that with innovation encoding, the control-free part of the state is not affected by the control policies, but obeys the recursion
For any sequence of causal encoders, one can find an equivalent sequence of innovation encoders such that when these two sets operate on the same sequence of plant outputs, they produce two sequences of channel inputs that are equal with probability one. Hence, if for a plant and channel, the dual effect is present in a certain class of causal encoders, then the dual effect is also present in the equivalent class of innovation encoders [15]. This is what the following example illustrates:
Example 2 (Dual effect in a loop with fixed innovation encoder).
We use the same setup as in Example 1, with the encoder replaced by an innovation encoder. For the given threshold , let the encoder at time be the following innovation encoder:
(6) 
The optimal control law at is still , where . For the control , notice that (4) and (6) tell us that this innovation encoder is equivalent to the causal encoder of Example 1. For the same applied control policy , and for the same realizations of primitive random variables, we get . Hence, with probability one the two nonlinear filters for the state given are the same. Thus for an event , we have:
Hence the results in Figure 3 apply also to this example.
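The change of variables behind the innovation encoder, and the equivalence used in Example 2, can be checked with a short simulation. This is a minimal sketch under stated assumptions: the plant is taken to be x_{t+1} = a*x_t + b*u_t + w_t, the control-free part is obtained by subtracting the accumulated effect of past controls from the state, and the innovation encoder at time 1 uses the threshold shifted by b*u_0. All function names and numerical values are hypothetical.

```python
import math
import random


def simulate_plant(a, b, x0, controls, noises):
    """Run x_{t+1} = a*x_t + b*u_t + w_t and return [x_0, ..., x_T]."""
    states = [x0]
    for u, w in zip(controls, noises):
        states.append(a * states[-1] + b * u + w)
    return states


def control_free_part(a, b, states, controls):
    """Subtract the accumulated effect of past controls from each state."""
    xis, effect = [], 0.0
    for t, x in enumerate(states):
        xis.append(x - effect)
        if t < len(controls):
            # Effect of controls u_0..u_t on x_{t+1}.
            effect = a * effect + b * controls[t]
    return xis


def check_example(seed=0, a=0.9, b=1.0, theta=0.5, steps=5):
    """Return (recursion holds, state bit equals innovation bit)."""
    rng = random.Random(seed)
    x0 = rng.gauss(0.0, 1.0)
    controls = [rng.uniform(-2.0, 2.0) for _ in range(steps)]
    noises = [rng.gauss(0.0, 1.0) for _ in range(steps)]
    states = simulate_plant(a, b, x0, controls, noises)
    xis = control_free_part(a, b, states, controls)
    # The control-free part evolves as xi_{t+1} = a*xi_t + w_t,
    # whatever the controls were.
    recursion_ok = all(
        math.isclose(xis[t + 1], a * xis[t] + noises[t], abs_tol=1e-9)
        for t in range(steps)
    )
    # Quantizing x_1 at theta is the same as quantizing xi_1 at theta - b*u_0.
    state_bit = states[1] > theta
    innovation_bit = xis[1] > theta - b * controls[0]
    return recursion_ok, state_bit == innovation_bit


if __name__ == "__main__":
    print(check_example(seed=0))
```

The second check makes the point of Example 2 concrete: an innovation encoder whose threshold tracks the applied control produces the same channel symbols as the fixed state quantizer, so it inherits the same dual effect.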
The encoder (quantizer) in the loop causes the dual effect. Furthermore, the encoder’s presence renders useless the techniques that worked in the case of the classical, single-agent, partially observed LQ control problem. The next example illustrates this.
Example 3.
We examine a scalar system as it evolves from time step 0 to time step 1. We have:
where is the process noise variable which is independent of , and We adopt the specific quantizing strategy given below (on the left in the form of an encoder for , and on the right, in the equivalent, innovation form):
Since the encoder at time is binary, the general control law at time has the form:
where are arbitrary real numbers. The process is fully observed at the controller. We have , and as noted in [56], one can write:
(7) 
where the noiselike random variable is given by: Then one can treat the problem as the control of the fully observed process to minimize the given cost, which can be rewritten as the following sum of two terms:
(8) 
Such a treatment actually works for the case of the classical, single-agent, partially observed LQ control problem. There two special things happen: (1) the random process is statistically independent of the control process and of the ‘state’ process , and (2) because the dual effect is absent, the second term on the RHS of (8) does not vary with . Therefore, by considering as the process to be controlled, we get a single-agent, fully observed LQ control problem.
In the two-agent problems considered in this paper, neither of the above-mentioned special things may happen. For this specific example, we have calculated, and then plotted in Figure 5 how the second moments of and vary with . The calculations are presented in Appendix A.
Next we define a class of encoders for which at prescribed times the statistics of , are independent of the control
Definition 5 (Controls-forgetting encoder).
Denote by the conditional density of given the data . An admissible encoding strategy is controls-forgetting from time if it takes the form:
where (1) is any admissible policy for encoding at time , (2) for the policies are adapted to the data
and (3) for fixed values of the data , the map produces the same output regardless of both the controls and the control policies
Clearly such controls-forgetting encoders exist. For example, consider a set of encoders that quantize in sequence to minimize the estimation distortion , where . Let the nonnegative function represent some notion of cost. For example, .
Lemma 3 (Distortions incurred by controls-forgetting encoders also forget controls).
Fix the time and the distortion measure . If the encoder is controls-forgetting from time , then for times , the distortions are statistically independent of the partial set of controls .
Proof.
The unconditional statistics of are independent of the entire control waveform, no matter what the encoder is. For times and for sets , is independent of because the encoding maps are controls-forgetting from time . Since , for all , the lemma follows. ∎
Definition 6 (Controls affine from time ).
A controller affine from time takes the following form:
(9) 
where the controls are generated by an admissible strategy , the controls are generated by an affine strategy , with the gains and offsets computed offline, and
4.3. Preliminary lemmas
The main result ahead is Theorem 1, which states that it is optimal for Design problem 2 to apply a separated design and certainty equivalence controls. In this subsection, we do some necessary groundwork towards proving that result.
Once we are prescribed an admissible encoder, the controls affect only the cost-to-go: . In the classical single-agent LQ problem, the ‘prescribed encoder’ is simply the linear observation process with prescribed signal-to-noise ratios. There, this cost-to-go can be expressed as a quadratic function of and . But in our two-agent LQ problem, because of the dual effect, the cost-to-go may have a nonquadratic dependence on the controls . However, we show that by restricting to controls-forgetting encoders and affine controls, the cost-to-go does get a quadratic dependence on controls. We use this reasoning and dynamic programming to show that for time going backwards from the following conclusions fall out:

it is optimal at time to apply as control a linear function of , and,

it is optimal at time to apply an encoding map that is controls-forgetting from time .
Lemma 4 (Optimal control at time ).
The optimal control policy at time is the linear law: , and the optimum cost-to-go is the expected value of a quadratic in and .
Proof.
At time , one is given , and is asked to pick to minimize the cost-to-go
and this lets us prove the Lemma. ∎
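The final-time minimization in Lemma 4 can be sketched numerically. This is an illustration under assumptions that are not taken from the source: the plant is x_{T+1} = a*x_T + b*u_T + w_T, the final cost-to-go is the conditional expectation of p*x_{T+1}^2 + q*u_T^2 given the controller's data, and the controller's estimate of x_T has mean x_hat and variance x_var. The function names and numerical values are hypothetical.

```python
def final_step_gain(a, b, p, q):
    """Gain k such that u_T = k * x_hat minimizes the final cost-to-go."""
    return -a * b * p / (q + b * b * p)


def final_cost_to_go(u, x_hat, x_var, a, b, p, q, sigma_w):
    """E[p*x_{T+1}^2 + q*u^2 | data] for x_{T+1} = a*x_T + b*u + w_T.

    x_hat and x_var are the conditional mean and variance of x_T given
    the controller's data; w_T is zero mean with variance sigma_w^2 and
    independent of that data.
    """
    second_moment = (a * x_hat + b * u) ** 2 + a * a * x_var + sigma_w ** 2
    return p * second_moment + q * u * u


if __name__ == "__main__":
    a, b, p, q, sigma_w = 1.2, 1.0, 1.0, 0.5, 1.0
    x_hat, x_var = 2.0, 0.3
    u_star = final_step_gain(a, b, p, q) * x_hat
    # Brute-force check on a grid around the candidate minimizer.
    grid = [u_star + 0.001 * k for k in range(-2000, 2001)]
    u_best = min(
        grid, key=lambda u: final_cost_to_go(u, x_hat, x_var, a, b, p, q, sigma_w)
    )
    print(u_star, u_best)
```

In this sketch the conditional variance x_var enters the final cost-to-go only as an additive term that the last control cannot change, which is why a linear law in the conditional mean is optimal at the final time regardless of how the estimate was produced.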
Lemma 5 (Optimal for separated, quadratic cost-to-go).
Fix the time . Consider the dynamic encoder-controller design problem (Design problem 2), for the linear plant (1), and the performance cost (2). Suppose that we apply an admissible controller along with an encoder that is controls-forgetting from time . Furthermore, suppose that the partial sets of policies:
are chosen such that the following three properties hold:

the cost-to-go at time takes the separated form:
where, and the term is a weighted sum of future distortions and depends only on the random sequence ,

the coefficients of the quadratic may depend on the control policies but not on the partial set of encoding maps and,

the term depends on the encoding maps but not on the partial set of control policies .
Then, it is optimal to apply an encoding map at time that does not depend on the data: . It also follows that the shapes of the encoding maps and their performance do not depend on the control .
Proof.
The proof exploits three facts. Firstly, the special form of makes the encoder’s performance cost at time a sum of a quadratic distortion between and , and a term gathering distortions at later times. Secondly, the minimum of the sum distortion depends only on the intrinsic shape of the conditional density and not on its mean. Thirdly, these facts and the controls-forgetting nature of later encoding maps allow the encoder to ‘ignore’ the control . We now start by writing the cost-to-go as: