# Optimal Local and Remote Controllers with Unreliable Uplink Channels

## Abstract

We consider a networked control system consisting of a remote controller and a collection of linear plants, each associated with a local controller. Each local controller directly observes the state of its co-located plant and can inform the remote controller of the plant’s state through an unreliable uplink channel. We assume that the downlink channels from the remote controller to local controllers are perfect. The objective of the local controllers and the remote controller is to cooperatively minimize a quadratic performance cost. We provide a dynamic program for this decentralized control problem using the common information approach. Although our problem is not a partially nested problem, we obtain explicit optimal strategies for all controllers. In the optimal strategies, all controllers compute common estimates of the states of the plants based on the common information from the communication network. The remote controller’s action is linear in the common estimated states, and the action of each local controller is linear in both the actual state of its co-located plant and the common estimated states. We illustrate our results with a platoon control problem for autonomous vehicles.

## 1 Introduction

The advent of information and communication technologies, along with the development of the Internet of Things (IoT), has drawn increasing attention to networked control systems (NCSs). NCSs are distributed systems in which information is exchanged through a network among various components (controllers, smart sensors, actuators, etc.). The connectivity of NCSs enables numerous new applications such as autonomous vehicles, smart grids, remote surgery, smart homes, and large manufacturing systems (see [2, 3, 4] and references therein). However, the network connection is subject to various communication constraints. One main constraint is the unreliability of communication channels, which can greatly affect the performance of an NCS [5, 6]. Therefore, the study of NCSs over unreliable channels is of great importance.

The effect of control over unreliable channels has been investigated in [7, 8, 9, 10, 11, 12] for NCSs with a single controller. However, most NCS applications consist of multiple sub-systems where each sub-system may be controlled by a remote controller as well as a local controller. For example, in unmanned aerial vehicle (UAV) systems, the UAVs are remotely controlled by a ground control station while a local computer in each UAV provides basic stability controls [13]. The overall system performance depends on the coordination among the remote controller and all local controllers through the communication network. In this paper, we consider an NCS consisting of a remote controller and a collection of linear plants, each associated with a local controller, as shown in Fig. 1. Each plant is directly controlled by a local controller which can perfectly observe the state of the plant. The remote controller can control all plants, but it does not have direct access to the states, as its name suggests. The objective of the local controllers and the remote controller is to cooperatively minimize an overall quadratic performance cost of the NCS. The remote controller and local controllers are connected by a communication network where the downlinks from the remote controller to local controllers are perfect, but the uplinks from local controllers to the remote controller are unreliable channels with random packet drops. Such a scenario arises in many situations where the remote controller is equipped with sufficient communication resources, but each local controller has limited transmission capabilities. For instance, the local controllers can be a group of battery-powered telerobots or autonomous vehicles with limited transmission power proximal to their co-located systems, while the remote controller can be a controlling operator connected to a power outlet or a base station with high transmission power.

When the local controllers are smart sensors or encoders that can only sense and transmit information, the NCS operation depends only on remote estimation and control. Remote estimation with a single smart sensor has been studied in [14, 15, 16, 17] and has been extended to the case with multiple smart sensors and general packet drop models in [18, 19]. Remote estimation and control of a linear plant has been studied in [20, 21, 22, 23, 24, 25] under various channel models between smart sensors and a remote controller. The problem considered in this paper is different from these previous works on NCS because our problem is a decentralized control problem with multiple controllers where the dynamics of each plant is controlled by the remote controller as well as the corresponding local controller. Finding optimal strategies in decentralized control problems is generally considered a difficult problem (see [26, 27, 28]). In general, linear control strategies are not optimal, and even the problem of finding the best linear control strategies is not convex [29]. Existing optimal solutions of decentralized control problems require either specific information structures, such as partially nested [30, 31, 32, 33, 34, 35], stochastically nested [36], or other specific properties, such as quadratic invariance [37] or substitutability [38, 39].

For the problem we consider in this paper, none of the above properties hold due to either the unreliable communication or the nature of dynamics and cost function. We use the common information approach to show that this problem is equivalent to a centralized sequential decision-making problem where the remote controller is the only decision-maker. We provide a dynamic program to obtain the optimal strategies of the remote controller in the equivalent problem. Then, using the optimal strategies of the equivalent problem, we obtain explicit optimal strategies for all local controllers and the remote controller. In the optimal strategies, all controllers compute common estimates of the states of the plants based on the common information from the communication network. The remote controller’s action is linear in the common estimated states, and the action of each local controller is linear in both the actual state of its co-located plant and the common estimated states. As an application of our problem, we apply our results to a simple platoon control problem for autonomous vehicles.

### 1.1 Notation

Random variables/vectors are denoted by upper case letters, and their realizations by the corresponding lower case letters. For a sequence of column vectors $x^1, \dots, x^n$, the notation $\mathrm{vec}(x^1, \dots, x^n)$ denotes the stacked column vector $[(x^1)^\intercal, \dots, (x^n)^\intercal]^\intercal$. The transpose and trace of a matrix $M$ are denoted by $M^\intercal$ and $\operatorname{tr}(M)$, respectively. In general, subscripts are used as time indices while superscripts are used to index controllers. For time indices $t_1 \le t_2$, $X_{t_1:t_2}$ (resp. $g_{t_1:t_2}$) is the shorthand notation for the variables $(X_{t_1}, \dots, X_{t_2})$ (resp. functions $(g_{t_1}, \dots, g_{t_2})$). Similarly, for $n_1 \le n_2$, $X^{n_1:n_2}$ (resp. $g^{n_1:n_2}$) is the shorthand notation for the variables $(X^{n_1}, \dots, X^{n_2})$ (resp. functions $(g^{n_1}, \dots, g^{n_2})$). For a set $\mathcal A$, the collection $\{X^n, n \in \mathcal A\}$ (resp. $\{g^n, n \in \mathcal A\}$) is denoted by $X^{\mathcal A}$ (resp. $g^{\mathcal A}$). Furthermore, the notation $\mathrm{vec}(X^{1:n}, Y^{1:m})$ is used to denote $\mathrm{vec}(X^1, \dots, X^n, Y^1, \dots, Y^m)$. The intersection of the events $E_1, \dots, E_n$ is denoted by $\cap_{k=1}^{n} E_k$.

The indicator function of a set $E$ is denoted by $\mathds 1_E(\cdot)$, that is, $\mathds 1_E(x) = 1$ if $x \in E$, and $0$ otherwise. If $E$ is an event, then $\mathds 1_E$ denotes the resulting random variable. $\mathbb P(\cdot)$, $\mathbb E[\cdot]$, and $\operatorname{cov}(\cdot)$ denote the probability of an event, the expectation of a random variable/vector, and the covariance matrix of a random vector, respectively. For random variables/vectors $X$ and $Y$, $\mathbb P(\cdot \,|\, Y = y)$ denotes the probability of an event given that $Y = y$, and $\mathbb E[X \,|\, y] := \mathbb E[X \,|\, Y = y]$. For a strategy $g$, we use $\mathbb P^{g}(\cdot)$ (resp. $\mathbb E^{g}[\cdot]$) to indicate that the probability (resp. expectation) depends on the choice of $g$. Let $\Delta(\mathbb R^d)$ denote the set of all probability measures on $\mathbb R^d$ with finite second moment. For any $\theta \in \Delta(\mathbb R^d)$, $\theta(E)$ denotes the probability of event $E$ under $\theta$. The mean and the covariance of a distribution $\theta$ are denoted by $\mu(\theta)$ and $\operatorname{cov}(\theta)$, respectively, and are defined as $\mu(\theta) = \int x \,\theta(dx)$ and $\operatorname{cov}(\theta) = \int (x - \mu(\theta))(x - \mu(\theta))^\intercal \,\theta(dx)$.

The notation $I_d$ (resp. $0_{d_1 \times d_2}$) is used to denote a $d \times d$ identity matrix (resp. a $d_1 \times d_2$ zero matrix). For a block matrix $M$, $M^{(n)}$ denotes the $n$-th block row of $M$. For example, for $M = \begin{bmatrix} M^{11} & M^{12} \\ M^{21} & M^{22} \end{bmatrix}$, $M^{(1)} = [M^{11} \; M^{12}]$ and $M^{(2)} = [M^{21} \; M^{22}]$.
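As a small numerical illustration of the block-row notation, the following NumPy sketch extracts block rows of a hypothetical 4×4 matrix split into two 2×4 block rows; the helper name `block_row` is illustrative, not from the paper.

```python
import numpy as np

# Hypothetical 4x4 block matrix M; each block row has d = 2 rows.
M = np.arange(16).reshape(4, 4)

def block_row(M, n, d):
    """Return the n-th (1-indexed) block row of M, with block height d."""
    return M[(n - 1) * d : n * d, :]

first = block_row(M, 1, d=2)    # top 2x4 block row, i.e. M^(1)
second = block_row(M, 2, d=2)   # bottom 2x4 block row, i.e. M^(2)
```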

### 1.2 Organization

The rest of the paper is organized as follows. We introduce the system model and formulate the multi-controller NCS problem in Section 2. In Section 3, we formulate an equivalent problem using the common information approach and provide a dynamic program for this problem. We solve the dynamic program in Section 4. In Section 5, we consider an application for autonomous vehicles using the system model of Section 2. Section 6 concludes the paper. The proofs of all the technical results of the paper appear in the Appendices.

## 2 System Model and Problem Formulation

Consider a discrete-time system with $N$ plants, $N$ local controllers $C^1, \dots, C^N$, and one remote controller $C^0$, as shown in Fig. 1. We use $\mathcal N$ to denote the set $\{1, 2, \dots, N\}$ and $\bar{\mathcal N}$ to denote $\{0\} \cup \mathcal N = \{0, 1, \dots, N\}$. The linear dynamics of plant $n \in \mathcal N$ are given by

 $X^n_{t+1} = A^{nn} X^n_t + B^{nn} U^n_t + B^{n0} U^0_t + W^n_t, \quad t = 0, \dots, T,$ (1)

where $X^n_t \in \mathbb R^{d^n_X}$ is the state of plant $n$ at time $t$, $U^n_t \in \mathbb R^{d^n_U}$ is the control action of controller $C^n$, $n \in \bar{\mathcal N}$, and $A^{nn}$, $B^{nn}$, $B^{n0}$ are matrices with appropriate dimensions. $X^n_0$ is a random vector with distribution $\pi_{X^n_0}$, and $W^n_t$ is a zero-mean noise vector at time $t$ with distribution $\pi_{W^n_t}$. The random vectors $X^1_0, \dots, X^N_0$, $W^n_t$, $n \in \mathcal N$, $t = 0, \dots, T$, are independent with finite second moments. Note that we do not assume that $X^n_0$ and $W^n_t$ are Gaussian.

The overall dynamics can be written as

 $X_{t+1} = A X_t + B U_t + W_t$ (2)

where $X_t = \mathrm{vec}(X^1_t, \dots, X^N_t)$, $U_t = \mathrm{vec}(U^0_t, U^1_t, \dots, U^N_t)$, $W_t = \mathrm{vec}(W^1_t, \dots, W^N_t)$, and the matrices $A$ and $B$ are defined as

 $A = \begin{bmatrix} A^{11} & & \text{\Large $0$} \\ & \ddots & \\ \text{\Large $0$} & & A^{NN} \end{bmatrix}, \qquad B = \begin{bmatrix} B^{10} & B^{11} & & \text{\Large $0$} \\ \vdots & & \ddots & \\ B^{N0} & \text{\Large $0$} & & B^{NN} \end{bmatrix}.$ (3)
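For concreteness, here is a minimal NumPy sketch of the stacked dynamics (2) with the block structure (3) for a hypothetical two-plant example; all matrices, dimensions, and noise scales below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dx, du = 2, 1          # per-plant state and input dimensions (illustrative)

# Block-diagonal A as in eq. (3)
A11 = np.array([[1.0, 0.1], [0.0, 1.0]])
A22 = np.array([[1.0, 0.2], [0.0, 1.0]])
A = np.block([[A11, np.zeros((dx, dx))],
              [np.zeros((dx, dx)), A22]])

# B: first block column couples the remote input U^0 to every plant,
# the diagonal blocks couple each local input U^n to its own plant.
B10 = np.array([[0.0], [0.1]])
B20 = np.array([[0.0], [0.1]])
B11 = np.array([[0.0], [1.0]])
B22 = np.array([[0.0], [1.0]])
B = np.block([[B10, B11, np.zeros((dx, du))],
              [B20, np.zeros((dx, du)), B22]])   # columns: U^0, U^1, U^2

x = rng.normal(size=2 * dx)                      # X_0 drawn from some pi_{X_0}
for t in range(5):
    u = np.zeros(3 * du)                         # vec(U^0_t, U^1_t, U^2_t)
    w = 0.01 * rng.normal(size=2 * dx)           # zero-mean noise W_t
    x = A @ x + B @ u + w                        # eq. (2)
```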

At each time $t$, the local controller $C^n$, $n \in \mathcal N$, perfectly observes the state $X^n_t$ and sends the observed state to the remote controller $C^0$ through an unreliable channel with link failure probability $p^n$. Let $\Gamma^n_t$ be a Bernoulli random variable describing the state of this channel, that is, $\Gamma^n_t = 0$ when the link is broken and $\Gamma^n_t = 1$ otherwise. We assume that, for each $n$, the variables $\Gamma^n_t$, $t = 0, \dots, T$, are independent and identically distributed (i.i.d.), independent across $n$, and independent of $X^1_0, \dots, X^N_0$ and $W^n_t$, $n \in \mathcal N$, $t = 0, \dots, T$. Furthermore, let $Z^n_t$ be the output of the channel between the local controller $C^n$ and the remote controller $C^0$. Then,

 $\Gamma^n_t = \begin{cases} 1 & \text{with probability } (1 - p^n), \\ 0 & \text{with probability } p^n. \end{cases}$ (4)

 $Z^n_t = \begin{cases} X^n_t & \text{when } \Gamma^n_t = 1, \\ \emptyset & \text{when } \Gamma^n_t = 0. \end{cases}$ (5)

We assume that the channel outputs $Z^{1:N}_t$ are perfectly observed by $C^0$. Furthermore, we assume that there exists a perfect link from $C^0$ to $C^n$, for $n \in \mathcal N$. Therefore, $C^0$ can share $Z^{1:N}_t$ and $U^0_{t-1}$ with $C^1, \dots, C^N$. All controllers select their control actions after observing $Z^{1:N}_t$. A schematic of the time ordering of the variables is shown in Fig. 2. We assume that for all $n \in \mathcal N$, the links from $C^n$ and $C^0$ to plant $n$ are perfect.
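The uplink of (4)-(5) can be sketched as follows, with Python's `None` standing in for the empty symbol $\emptyset$; the function name `uplink` and the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

def uplink(x_n, p_n):
    """Simulate one use of the unreliable channel in eqs. (4)-(5).

    Returns (gamma, z): gamma = 1 and z = x_n on success,
    gamma = 0 and z = None on a packet drop.
    """
    gamma = int(rng.random() >= p_n)   # P(gamma = 1) = 1 - p_n
    return gamma, (x_n if gamma == 1 else None)

# One transmission of a (scalar) state with a 20% drop probability.
gamma, z = uplink(x_n=3.7, p_n=0.2)
```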

Let $H^n_t$ denote the information available to controller $C^n$, $n \in \bar{\mathcal N}$, to make decisions at time $t$. Then,

 $H^n_t = \{X^n_{0:t},\, U^n_{0:t-1},\, Z^{1:N}_{0:t},\, U^0_{0:t-1}\}, \; n \in \mathcal N, \qquad H^0_t = \{Z^{1:N}_{0:t},\, U^0_{0:t-1}\}.$ (6)

Let $\mathcal H^n_t$ be the space of all realizations of $H^n_t$. Then, $C^n$'s actions are selected according to

 $U^n_t = g^n_t(H^n_t), \quad \forall n \in \bar{\mathcal N},$ (7)

where $g^n_t : \mathcal H^n_t \mapsto \mathbb R^{d^n_U}$ is a Borel measurable mapping. The collection of mappings $g^n_{0:T}$ is called the strategy of controller $C^n$ and is denoted by $g^n$. The collection $g^{0:N}$ of all controllers' strategies is called the strategy profile.

The instantaneous cost of the system is a general quadratic function given by

 $c_t(X^{1:N}_t, U^{0:N}_t) = S_t^\intercal R_t S_t, \;\text{ where }\; S_t = \mathrm{vec}(X^{1:N}_t, U^{0:N}_t), \quad R_t = \begin{bmatrix} R^{XX}_t & R^{XU}_t \\ R^{UX}_t & R^{UU}_t \end{bmatrix},$ (8)

and

 $R^{XX}_t = \begin{bmatrix} R^{X^1X^1}_t & \dots & R^{X^1X^N}_t \\ \vdots & \ddots & \vdots \\ R^{X^NX^1}_t & \dots & R^{X^NX^N}_t \end{bmatrix} =: \big[R^{X^iX^j}_t\big]_{i,j \in \mathcal N}, \qquad R^{XU}_t = (R^{UX}_t)^\intercal = \big[R^{X^iU^j}_t\big]_{i \in \mathcal N, j \in \bar{\mathcal N}}, \qquad R^{UU}_t = \big[R^{U^iU^j}_t\big]_{i,j \in \bar{\mathcal N}}.$ (9)

$R^{XX}_t$ is a symmetric positive semi-definite (PSD) matrix and $R^{UU}_t$ is a symmetric positive definite (PD) matrix.

The performance of the strategy profile $g^{0:N}$ is the total expected cost given by

 $J(g^{0:N}) = \mathbb E^{g^{0:N}}\Big[\sum_{t=0}^{T} c_t(X^{1:N}_t, U^{0:N}_t)\Big].$ (10)
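A one-step numerical sketch of evaluating the quadratic cost (8): the state/action dimensions and the choice $R_t = I$ below are illustrative (the identity satisfies the PSD/PD block requirements), not the paper's cost.

```python
import numpy as np

# Stacked states and actions for a hypothetical instant: two scalar plant
# states and three scalar actions (remote U^0 plus two local inputs).
x = np.array([1.0, -2.0])        # vec(X^{1:N}_t)
u = np.array([0.5, 0.0, -0.5])   # vec(U^{0:N}_t)
s = np.concatenate([x, u])       # S_t = vec(X^{1:N}_t, U^{0:N}_t)

R = np.eye(s.size)               # illustrative symmetric R_t
cost = s @ R @ s                 # c_t = S_t^T R_t S_t
```

With $R_t = I$ the cost is simply the squared norm of $S_t$, here $1 + 4 + 0.25 + 0 + 0.25 = 5.5$.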

Let $\mathcal G^n$ be the set of all control strategies for $C^n$, $n \in \bar{\mathcal N}$. Then, the optimal control problem for the NCS is formally defined below.

###### Problem 1.

For the system model described by (1)-(10), we would like to solve the following strategy optimization problem,

 $\inf_{g^n \in \mathcal G^n, \, n \in \bar{\mathcal N}} J(g^{0:N}).$ (11)
###### Remark 1.

Without loss of optimality, we can restrict attention to strategy profiles that ensure a finite expected cost at each time step. Because $R^{XX}_t$ is positive semi-definite and $R^{UU}_t$ is positive definite, a finite expected cost at all times is equivalent to

 $\mathbb E^{g^{0:N}}\big[(U^n_t)^\intercal U^n_t\big] = \mathbb E^{g^{0:N}}\big[g^n_t(H^n_t)^\intercal g^n_t(H^n_t)\big] < \infty, \quad \forall n \in \bar{\mathcal N}, \, \forall t.$ (12)

Therefore, in the subsequent analysis we will implicitly assume that the strategy profile under consideration, $g^{0:N}$, ensures that for all times $t$ and for all $n \in \bar{\mathcal N}$, $U^n_t$ has finite second moments, that is, (12) holds.

Problem 1 is an $(N+1)$-controller decentralized optimal control problem. Decentralized optimal control problems are generally believed to be hard. For decentralized linear-quadratic-Gaussian (LQG) control problems with a partially nested information structure, linear control strategies are optimal [30]. An information structure is partially nested if, whenever the action of a controller affects the information of another controller, the latter knows whatever the former knows. Note that Problem 1 is not a partially nested problem. In particular, $C^n$'s action $U^n_{t-1}$, $n \in \mathcal N$, affects $X^n_t$, and consequently, it affects $Z^n_t$. Since $Z^n_t$ is a part of the remote controller $C^0$'s information at $t$ but $C^n$'s information $H^n_{t-1}$ is not contained in $H^0_t$, the information structure in Problem 1 is not partially nested. Furthermore, in Problem 1, $X^n_0$ and $W^n_t$ are not necessarily Gaussian. Therefore, linear control strategies are not necessarily optimal for Problem 1.

Our approach to Problem 1 is based on the common information approach [40] for decentralized decision-making. We identify the common information among the controllers and use it to define a common belief on the system state. This common belief can serve as an information state for a dynamic program that characterizes optimal control strategies.

## 3 Equivalent Problem and Dynamic Program

We first provide a structural result for the local controllers’ strategies.

###### Lemma 1.

Let $\hat{\mathcal G}^n \subset \mathcal G^n$ denote the set of strategies of the local controller $C^n$ of the form $U^n_t = g^n_t(X^n_t, H^0_t)$, for $n \in \mathcal N$. Then,

 $\inf_{g^n \in \mathcal G^n, \, n \in \bar{\mathcal N}} J(g^{0:N}) = \inf_{g^n \in \hat{\mathcal G}^n, \, n \in \mathcal N, \; g^0 \in \mathcal G^0} J(g^{0:N}).$ (13)
###### Proof.

See Appendix B for a proof. ∎

Due to Lemma 1, we only need to consider strategies of the form $U^n_t = g^n_t(X^n_t, H^0_t)$ for the local controller $C^n$, $n \in \mathcal N$. That is, the local controller $C^n$ only needs to use its current state $X^n_t$ and the common information $H^0_t$ to make the decision at $t$.

According to the information structure (6) and Lemma 1, $H^0_t$ is the common information among $C^0, C^1, \dots, C^N$, and $X^n_t$ is the private information used by the local controller $C^n$ in its decision-making. Note that $C^0$ has no private information, since its information consists only of the common information $H^0_t$. Based on the common information approach [40], we construct below an equivalent centralized problem using the controllers' common information.

### 3.1 Equivalent Centralized Problem

Consider arbitrary control strategies $g^n \in \hat{\mathcal G}^n$, $n \in \mathcal N$, and $g^0 \in \mathcal G^0$ for the local and the remote controllers, respectively. Under these strategies,

 $U^n_t = g^n_t(X^n_t, H^0_t) = \mathbb E^{g}\big[g^n_t(X^n_t, H^0_t) \,\big|\, H^0_t\big] + \Big\{g^n_t(X^n_t, H^0_t) - \mathbb E^{g}\big[g^n_t(X^n_t, H^0_t) \,\big|\, H^0_t\big]\Big\}.$ (14)

We can rewrite (14) as

 $U^n_t = \bar g^n_t(H^0_t) + \tilde g^n_t(X^n_t, H^0_t)$ (15)

where

 $\bar g^n_t(H^0_t) = \mathbb E^{g}\big[g^n_t(X^n_t, H^0_t) \,\big|\, H^0_t\big], \qquad \tilde g^n_t(X^n_t, H^0_t) = g^n_t(X^n_t, H^0_t) - \mathbb E^{g}\big[g^n_t(X^n_t, H^0_t) \,\big|\, H^0_t\big].$ (16)

Observe that $\tilde g^n_t(X^n_t, H^0_t)$ is conditionally zero-mean given $H^0_t$, that is, $\mathbb E^{g}[\tilde g^n_t(X^n_t, H^0_t) \,|\, H^0_t] = 0$.

Note that $\bar g^n_t(H^0_t)$ is the conditional mean of $U^n_t$ given the remote controller's information $H^0_t$, and $\tilde g^n_t(X^n_t, H^0_t)$ can be interpreted as the deviation of $U^n_t$ from the mean $\bar g^n_t(H^0_t)$. Considering this representation, (15) suggests that at each time $t$, the problem of finding the optimal control action for $C^n$ is equivalent to the problem of finding the “mean value” of $U^n_t$ and the “deviation” of $U^n_t$ from the mean value.
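The mean/deviation split in (14)-(16) can be checked numerically: for a fixed realization of the common information, the deviation term is conditionally zero-mean by construction. The state distribution and the (nonlinear) strategy `g` below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Samples of X^n_t conditioned on a fixed realization of H^0_t
# (an arbitrary Gaussian here; the paper does not assume Gaussianity).
x_samples = rng.normal(loc=1.0, scale=0.5, size=100_000)

def g(x):
    """An arbitrary local strategy g^n_t(., h^0_t); illustrative only."""
    return 2.0 * x + 0.1 * x**2

u = g(x_samples)
u_bar = u.mean()        # Monte Carlo estimate of E[g(X^n_t) | H^0_t]
u_tilde = u - u_bar     # deviation term of eq. (16)

# The deviation is zero-mean given the common information.
assert abs(u_tilde.mean()) < 1e-9
```

For this example the exact conditional mean is $2\mu + 0.1\,\mathbb E[X^2] = 2 + 0.1 \times 1.25 = 2.125$, which the Monte Carlo estimate approaches.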

We will use the above representation of $U^n_t$ in terms of $\bar g^n_t$ and $\tilde g^n_t$ to formulate a centralized decision-making problem. In the centralized problem, the remote controller $C^0$ is the only decision-maker. At each time $t$, given the realization of the remote controller's information $H^0_t$, it makes three decisions:

1. Remote controller's control action $U^0_t$,

2. Mean value $\bar U^n_t$ of each local controller's control action, $n \in \mathcal N$,

3. A “deviation from the mean value” mapping $Q^n_t : \mathbb R^{d^n_X} \mapsto \mathbb R^{d^n_U}$, $n \in \mathcal N$, where $d^n_X$ and $d^n_U$ are the dimensions of $X^n_t$ and $U^n_t$, respectively.

The control actions applied to the system described by (1)-(5) are:

• $U^0_t$ as the control action of the remote controller,

• $\bar U^n_t + Q^n_t(X^n_t)$ as the control action of the $n$-th local controller, $n \in \mathcal N$.

We call the triple $(U^0_t, \bar U^{1:N}_t, Q^{1:N}_t)$ the prescription at time $t$. We denote it by $U^{\mathrm{prs}}_t$ and write $U^{\mathrm{prs}}_t = \phi_t(H^0_t)$, where $\phi_t = (\phi^0_t, \bar\phi^{1:N}_t, \tilde\phi^{1:N}_t)$, to indicate that the prescription is a function of the common information $H^0_t$. The functions $\phi_{0:T}$ are collectively referred to as the prescription strategy and denoted by $\phi^{\mathrm{prs}}$. The prescription strategy is required to satisfy the following conditions:

1. For every $t$ and every $n \in \mathcal N$, $\phi^0_t(H^0_t) \in \mathbb R^{d^0_U}$, $\bar\phi^n_t(H^0_t) \in \mathbb R^{d^n_U}$, and $\tilde\phi^n_t(H^0_t)$ is a measurable mapping from $\mathbb R^{d^n_X}$ to $\mathbb R^{d^n_U}$.

2. Define $U^n_t = \bar\phi^n_t(H^0_t) + [\tilde\phi^n_t(H^0_t)](X^n_t)$. Then, $\mathbb E^{\phi^{\mathrm{prs}}}[(U^n_t)^\intercal U^n_t] < \infty$ for any $n \in \mathcal N$.

3. We require that for any $n \in \mathcal N$,

 $\mathbb E^{\phi^{\mathrm{prs}}}\big[[\tilde\phi^n_t(H^0_t)](X^n_t) \,\big|\, H^0_t\big] = 0,$ (17)

where $\mathbb P^{\phi^{\mathrm{prs}}}$ is the probability measure induced by the prescription strategy $\phi^{\mathrm{prs}}$.

Denote by $\Phi^{\mathrm{prs}}$ the set of all prescription strategies satisfying the above conditions. Consider the following problem of optimizing the prescription strategy.

###### Problem 2.

Consider the system described by (1)-(9). Given a prescription strategy $\phi^{\mathrm{prs}} \in \Phi^{\mathrm{prs}}$, let

 $\Lambda(\phi^{\mathrm{prs}}) = \mathbb E^{\phi^{\mathrm{prs}}}\Big[\sum_{t=0}^{T} c^{\mathrm{prs}}_t(X^{1:N}_t, U^{\mathrm{prs}}_t)\Big]$ (18)

where for any $x^{1:N}_t$ and $u^{\mathrm{prs}}_t = (u^0_t, \bar u^{1:N}_t, q^{1:N}_t)$,

 $c^{\mathrm{prs}}_t(x^{1:N}_t, u^{\mathrm{prs}}_t) = c_t\big(x^{1:N}_t,\, u^0_t,\, \{\bar u^n_t + q^n_t(x^n_t)\}_{n \in \mathcal N}\big).$ (19)

Then, we would like to solve the following optimization problem,

 $\inf_{\phi^{\mathrm{prs}} \in \Phi^{\mathrm{prs}}} \Lambda(\phi^{\mathrm{prs}}).$ (20)

Note that any feasible prescription strategy in Problem 2 results in control strategies in Problem 1. On the other hand, any control strategies in Problem 1 can be represented by a prescription strategy in Problem 2. This equivalence between Problems 1 and 2 is formally stated in the following lemma.

###### Lemma 2.

Problems 1 and 2 are equivalent in the following sense:

1. For any control strategies $g^0 \in \mathcal G^0$ and $g^n \in \hat{\mathcal G}^n$, $n \in \mathcal N$, in Problem 1, there is a prescription strategy $\phi^{\mathrm{prs}} \in \Phi^{\mathrm{prs}}$ in Problem 2 such that for $t = 0, \dots, T$,

 $\phi^0_t(H^0_t) = g^0_t(H^0_t),$ (21)

 $\bar\phi^n_t(H^0_t) = \bar g^n_t(H^0_t) = \mathbb E^{g}\big[g^n_t(X^n_t, H^0_t) \,\big|\, H^0_t\big], \quad \forall n \in \mathcal N,$ (22)

 $[\tilde\phi^n_t(H^0_t)](X^n_t) = \tilde g^n_t(X^n_t, H^0_t) = g^n_t(X^n_t, H^0_t) - \mathbb E^{g}\big[g^n_t(X^n_t, H^0_t) \,\big|\, H^0_t\big], \quad \forall n \in \mathcal N,$ (23)

 $\Lambda(\phi^{\mathrm{prs}}) = J(g^{0:N}).$ (24)
2. Conversely, for any prescription strategy $\phi^{\mathrm{prs}} \in \Phi^{\mathrm{prs}}$ in Problem 2, there are control strategies $g^0 \in \mathcal G^0$ and $g^n \in \hat{\mathcal G}^n$, $n \in \mathcal N$, in Problem 1 such that for $t = 0, \dots, T$,

 $g^0_t(H^0_t) = \phi^0_t(H^0_t),$ (25)

 $g^n_t(X^n_t, H^0_t) = \bar g^n_t(H^0_t) + \tilde g^n_t(X^n_t, H^0_t) = \bar\phi^n_t(H^0_t) + [\tilde\phi^n_t(H^0_t)](X^n_t), \quad \forall n \in \mathcal N,$ (26)

 $J(g^{0:N}) = \Lambda(\phi^{\mathrm{prs}}).$ (27)
###### Proof.

See Appendix C for a proof. ∎

### 3.2 Information State for Problem 2

Since Problem 2 is a centralized decision-making problem for the remote controller $C^0$, $C^0$'s belief on the system states can be used as an information state for decision-making. Note that $C^0$'s information at any time $t$ is the common information $H^0_t$. Therefore, we define the common belief $\Theta_t$ as the conditional probability distribution of $\mathrm{vec}(X^{1:N}_t)$ given $H^0_t$. That is, under the prescription strategy's decisions $\phi_{0:t-1}$ until time $t$, for any measurable set $E$,

 $\Theta_t(E) := \mathbb P^{\phi^{\mathrm{prs}}_{0:t-1}}\big(\mathrm{vec}(X^{1:N}_t) \in E \,\big|\, H^0_t\big).$ (28)

Let $\Theta^n_t$ denote the marginal common belief on $X^n_t$. That is, for any measurable set $E^n$,

 $\Theta^n_t(E^n) := \mathbb P^{\phi^{\mathrm{prs}}_{0:t-1}}\big(X^n_t \in E^n \,\big|\, H^0_t\big).$ (29)

Then, for a given realization $h^0_t$ of $H^0_t$, the corresponding realization $\theta_t$ of $\Theta_t$ belongs to $\Delta(\mathbb R^{d_X})$, where $d_X = \sum_{n \in \mathcal N} d^n_X$, and the realization $\theta^n_t$ of $\Theta^n_t$ belongs to $\Delta(\mathbb R^{d^n_X})$.

Since the plants' dynamics are only coupled through the remote controller's actions, which belong to the common information, the common belief has the following conditional independence property.

###### Lemma 3.

Consider a feasible prescription strategy $\phi^{\mathrm{prs}} \in \Phi^{\mathrm{prs}}$. Then, the random vectors $X^1_t, \dots, X^N_t$ are conditionally independent given the common information $H^0_t$. That is, for any measurable sets $E^n$, $n \in \mathcal N$,

 $\Theta_t\Big(\prod_{n=1}^{N} E^n\Big) = \prod_{n=1}^{N} \Theta^n_t(E^n),$ (30)

where $\Theta_t$ and $\Theta^n_t$ are given by (28) and (29).

###### Proof.

The proof is a direct consequence of Part 2 of Claim 2 in Appendix A. ∎

From Lemma 3, the joint common belief $\Theta_t$ can be represented by the collection of marginal common beliefs $\Theta^{1:N}_t$.

We show in the following that the marginal common beliefs can be sequentially updated.

###### Lemma 4.

For any feasible prescription strategy $\phi^{\mathrm{prs}} \in \Phi^{\mathrm{prs}}$ and for any $n \in \mathcal N$, we recursively define the mappings $\nu^n_t$ as follows:

For any measurable set $E^n$,

 $[\nu^n_0(h^0_0)](E^n) = \begin{cases} \pi_{X^n_0}(E^n) & \text{if } z^n_0 = \emptyset, \\ \mathds 1_{E^n}(x^n_0) & \text{if } z^n_0 = x^n_0, \end{cases}$ (31)

 $[\nu^n_{t+1}(h^0_{t+1})](E^n) = \big[\psi^n_t\big(\nu^n_t(h^0_t), u^{\mathrm{prs}}_t, z^n_{t+1}\big)\big](E^n),$ (32)

where $h^0_{t+1}$ is the realization of $H^0_{t+1}$ and $\psi^n_t$ is defined as follows:

• If $z^n_{t+1} = x^n_{t+1}$, then

 $\big[\psi^n_t\big(\nu^n_t(h^0_t), u^{\mathrm{prs}}_t, z^n_{t+1}\big)\big](E^n) = \mathds 1_{E^n}(x^n_{t+1}).$ (33)
• If $z^n_{t+1} = \emptyset$, then

 $\big[\psi^n_t\big(\nu^n_t(h^0_t), u^{\mathrm{prs}}_t, \emptyset\big)\big](E^n) = \int\!\!\int \mathds 1_{E^n}\big(f^n_t(x^n_t, w^n_t, u^{\mathrm{prs}}_t)\big)\, [\nu^n_t(h^0_t)](dx^n_t)\, \pi_{W^n_t}(dw^n_t),$ (34)

where

 $f^n_t(x^n_t, w^n_t, u^{\mathrm{prs}}_t) = A^{nn} x^n_t + B^{nn}\big(\bar u^n_t + q^n_t(x^n_t)\big) + B^{n0} u^0_t + w^n_t.$ (35)

Then, $\nu^n_t(H^0_t)$ is a conditional probability of $X^n_t$ given $H^0_t$, that is, $\Theta^n_t = \nu^n_t(H^0_t)$.

###### Proof.

See Appendix D for a proof. ∎

Lemma 4 implies that the realization $\theta^n_t$ of the belief $\Theta^n_t$ can be updated according to

 $\theta^n_{t+1} = \psi^n_t\big(\theta^n_t, u^{\mathrm{prs}}_t, z^n_{t+1}\big).$ (36)
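A particle-based sketch of the update $\psi^n_t$ in (33)-(36): a received packet collapses the belief to a point mass at the received state, while a dropped packet pushes the particles through the closed-loop map $f^n_t$ of (35). The scalar system, the zero-deviation mapping `q`, and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
A_nn, B_nn, B_n0 = 1.0, 1.0, 0.5   # hypothetical scalar plant parameters

def f(x, w, u_bar, q, u0):
    """Closed-loop map f^n_t of eq. (35), vectorized over particles."""
    return A_nn * x + B_nn * (u_bar + q(x)) + B_n0 * u0 + w

def update(particles, u_bar, q, u0, z):
    """One step of psi^n_t: Dirac at z on reception, propagation on a drop."""
    if z is not None:                          # packet received, eq. (33)
        return np.full_like(particles, z)
    w = 0.1 * rng.normal(size=particles.size)  # fresh process noise W^n_t
    return f(particles, w, u_bar, q, u0)       # packet dropped, eq. (34)

theta = rng.normal(size=10_000)                # particles representing theta^n_t
q0 = lambda x: 0.0 * x                         # a mapping in Q^n(theta): zero mean
theta = update(theta, u_bar=0.2, q=q0, u0=-0.1, z=None)  # drop: propagate
theta = update(theta, u_bar=0.0, q=q0, u0=0.0, z=1.5)    # success: point mass
```

After the successful reception, every particle sits at the received state, mirroring the Dirac-delta branch of the update.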

Recall that $\mathcal Q^n$ is the space of all measurable functions from $\mathbb R^{d^n_X}$ to $\mathbb R^{d^n_U}$. We now define the space $\mathcal Q^n(\theta^n)$ of mappings for any $\theta^n \in \Delta(\mathbb R^{d^n_X})$ to be

 $\mathcal Q^n(\theta^n) = \Big\{q^n : \mathbb R^{d^n_X} \mapsto \mathbb R^{d^n_U} \text{ measurable}, \; \int q^n(x^n)\,\theta^n(dx^n) = 0\Big\}.$ (37)

Note that for any feasible prescription strategy $\phi^{\mathrm{prs}} \in \Phi^{\mathrm{prs}}$, (17) implies that for almost every realization $h^0_t$ under $\phi^{\mathrm{prs}}$,

 $\mathbb E^{\phi^{\mathrm{prs}}}\big[q^n_t(X^n_t) \,\big|\, h^0_t\big] = 0,$ (38)

where $q^n_t = \tilde\phi^n_t(h^0_t)$. Then, (38) and (29) imply that for almost every realization $h^0_t$, $\int q^n_t(x^n)\,\theta^n_t(dx^n) = 0$, that is, $q^n_t$ belongs to $\mathcal Q^n(\theta^n_t)$.

### 3.3 Dynamic Program for Problem 2

We can use the collection of marginal common beliefs $\Theta^{1:N}_t$ as an information state to construct a dynamic program for Problem 2. For that purpose, we will use the following definitions.

For every $x \in \mathbb R^d$, we use $\delta_x$ to denote the Dirac delta distribution at $x$. Then, for any measurable set $E$, $\delta_x(E) = \mathds 1_E(x)$.

For any $\theta^n \in \Delta(\mathbb R^{d^n_X})$, $\bar u^n \in \mathbb R^{d^n_U}$, $q^n \in \mathcal Q^n(\theta^n)$ for $n \in \mathcal N$, and $u^0 \in \mathbb R^{d^0_U}$, letting $u^{\mathrm{prs}} = (u^0, \bar u^{1:N}, q^{1:N})$, we define

• $\tilde c_t(\theta^{1:N}, u^{\mathrm{prs}})$, the expectation of the instantaneous cost $c^{\mathrm{prs}}_t(X^{1:N}_t, u^{\mathrm{prs}})$ when $X^n_t$ is distributed according to $\theta^n$ for each $n \in \mathcal N$. This function represents the remote controller's expected instantaneous cost at time $t$ when its beliefs on the system states are $\theta^{1:N}$ and it selects the prescription $u^{\mathrm{prs}}$.