The CEO Problem With Secrecy Constraints

Farshad Naghibi, Somayeh Salimi, and Mikael Skoglund

© 2014 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org. The authors are with the School of Electrical Engineering and ACCESS Linnaeus Center, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden (emails: {naghibi,somayen,skoglund}@ee.kth.se). Part of the material in this work was presented at the IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, 2014 [1].

Abstract

We study a lossy source coding problem with secrecy constraints in which a remote information source should be transmitted to a single destination via multiple agents in the presence of a passive eavesdropper. The agents observe noisy versions of the source and independently encode and transmit their observations to the destination via noiseless rate-limited links. The destination should estimate the remote source based on the information received from the agents within a certain mean distortion threshold. The eavesdropper, with access to side information correlated to the source, is able to listen in on one of the links from the agents to the destination in order to obtain as much information as possible about the source. This problem can be viewed as the so-called CEO problem with additional secrecy constraints. We establish inner and outer bounds on the rate-distortion-equivocation region of this problem. We also obtain the region in special cases where the bounds are tight. Furthermore, we study the quadratic Gaussian case and provide the optimal rate-distortion-equivocation region when the eavesdropper has no side information and an achievable region for a more general setup with side information at the eavesdropper.

CEO problem, multiterminal source coding, secrecy constraints, eavesdropping, equivocation.

I Introduction

As networks become more distributed, their vulnerability to malicious activity increases, which in turn raises concerns about their security. Consequently, information-theoretic security has gained attention among researchers as a concrete framework for analyzing secrecy in networks [2, 3]. Information-theoretic security, initially introduced by Shannon [4], exploits the different statistical characteristics of the information received at the legitimate receiver and at the eavesdropper. Moreover, unlike traditional cryptographic approaches to secrecy, it makes no assumptions about the computational power of the eavesdropper. Later, Wyner introduced the wiretap channel model in [5] and showed that perfectly secure communication without a shared secret key is possible if the channel from the transmitter to the eavesdropper is a degraded version of the channel to the legitimate receiver. This result was generalized to broadcast channels with confidential messages by Csiszár and Körner in [6]. Subsequently, many extensions to this problem have been developed and studied in the literature (see, for instance, [2], [3], and references therein).

In this paper, we consider secrecy in a multiterminal source coding problem. In particular, we study the problem of conveying an information source to a single destination via multiple agents (encoders) in the presence of a passive eavesdropper. The agents have access to noisy observations of the source and are connected to the destination via noiseless rate-limited links. They do not cooperate or communicate with one another and are not required to estimate the source themselves. This scenario is of interest for many applications, such as sensor networks or smart grid systems, where reconstruction of the source at the sensors and smart meters is not necessary. The distributed nature of such networks makes them more susceptible to eavesdropping. At each instant, the eavesdropper listens in on one of the links from the agents to the destination in order to obtain information about the source. In addition, it has access to side information correlated to the source. Since the agents do not know, prior to their transmissions, which link will be compromised by the eavesdropper, each agent should protect its own link so as to leak as little information as possible about the source. Our objective is to characterize the trade-off among the agents’ transmission rates, the distortion incurred at the destination, and the amount of information revealed to the eavesdropper. This setup can be viewed as an extension of the so-called CEO problem [7] in which secrecy constraints are considered.

I-A Related Work

The chief executive/estimation officer (CEO) problem was motivated in [7] by a communication and distributed processing system analogous to a scenario in which a firm’s CEO is interested in information about a source that cannot be observed directly. The CEO assigns a group of agents to independently observe a corrupted version of the source and communicate their observations. The lossless variant of this setup was initially studied by Gel’fand and Pinsker [8]. It was extended by Yamamoto and Itoh [9] as well as Flynn and Gray [10] to the lossy case with only two encoders, for which an achievable rate-distortion region was derived. The model was generalized to the CEO problem with many encoders by Berger and Viswanathan [7], who studied the trade-off between the end-to-end average distortion and the sum of the rates at which the agents transmit to the CEO. Multiterminal lossy source coding problems, including the CEO problem, are still open in general. However, for the special case of the quadratic Gaussian CEO problem [11], the sum-rate-distortion function for an infinite number of agents with identical signal-to-noise ratios (SNRs) was derived by Oohama [12], and later, the complete rate-distortion region with an arbitrary number of agents and arbitrary SNR values was characterized by Prabhakaran et al. [13] and Oohama [14]. More recently, Courtade and Weissman [15] gave the rate-distortion region of the CEO problem under the logarithmic-loss distortion measure.

Secure lossless source coding with uncoded side information at the legitimate decoder and the eavesdropper was studied by Prabhakaran and Ramchandran [16] under the assumption of no rate constraint on the encoder-decoder link. The minimum leakage rate was derived, and it was shown that, due to the side information at the eavesdropper, the usual Slepian-Wolf scheme [17] is not always optimal. Lossless source coding with coded side information at the decoder (the so-called one-helper problem) and no side information at the eavesdropper was studied by Tandon et al. [18], where the rate-equivocation region was characterized. This setup was extended by Gündüz et al. [19] to include side information at the eavesdropper; inner and outer bounds on the compression-equivocation rate region were derived that do not match in general. Secure distributed lossless compression of two correlated sources, in which both sources were to be estimated at the decoder, was considered by Luh and Kundur [20] without side information at the eavesdropper and by Gündüz et al. [21] with side information at the eavesdropper. These models were generalized by Salimi et al. [22] to the case where both the legitimate receiver and the eavesdropper have access to correlated side information and the eavesdropper can choose to intercept either link from the encoders to the decoder at each instant. In [22], inner and outer bounds on the compression-equivocation region were provided and proved to be tight for several special cases.

The extension to the lossy case was considered in [23, 24, 25], and more recently by Villard and Piantanida [26] in which inner and outer bounds on the rate-distortion-equivocation region were derived. The optimal characterization of the rate-distortion-equivocation region was first found in [24] for the lossy case with uncoded side information. Later in [26], the optimal characterization for the lossless case was also derived. A different setup was considered by Kittichokechai et al. [27] in which the eavesdropper can only access the coded side information, and the complete region was characterized under the logarithmic-loss distortion [15]. Chia and Kittichokechai [28] studied the case when the encoder has access to the side information of the decoder. Tandon et al. [29] considered a scenario with two legitimate receivers and investigated the privacy of side information at one receiver with respect to the other one. An alternative approach to provide secrecy in source coding problems is based on having a shared secret key between the transmitter and the legitimate receiver [30, 31, 32], although we do not exploit this approach in our work.

I-B Contributions

Our setup in this paper differs from the aforementioned scenarios in two main ways. First, the destination (CEO) is interested in estimating the original source, whereas in all prior works it estimates the agents’ observations. Second, the secrecy constraints in our problem are on the equivocation of the eavesdropper with respect to the remote source, not with respect to the observations of the agents. In fact, our setup is a generalization of the previous cases considered for lossy secure source coding problems. We extend our previous work [33] on the lossless variant of this problem to the lossy case and derive inner and outer bounds on the rate-distortion-equivocation region of the CEO problem with secrecy constraints. We also investigate the region in special cases where the bounds are tight, and we show that for these special cases our results coincide with previous results in the literature.

In addition, we consider the quadratic Gaussian CEO problem with secrecy constraints and provide the optimal characterization of the rate-distortion-equivocation region for the case when the eavesdropper has no side information and an achievable region for a more general setup with side information at the eavesdropper.

I-C Notations and Organization

In this paper, we use capital letters to indicate a random variable, lowercase letters to indicate a realization of a random variable, calligraphic letters to denote a set, e.g., , and to indicate the cardinality of the set. The notation denotes the sequence . The notation indicates that , , and form a Markov chain, i.e., or . We define for , and for . Finally, denotes the indicator function such that for , and otherwise.

The rest of the paper is organized as follows: In Section II, we describe the problem along with some definitions. Main results for inner and outer bounds on the rate-distortion-equivocation region are presented in Section III. Then, we study some special cases of our results in Section IV where the region is completely characterized. The rate-distortion-equivocation region for the quadratic Gaussian case is given in Section V. Finally, the paper is concluded in Section VI.

II Problem Setting

Fig. 1: The CEO problem with secrecy constraints.

We consider the CEO problem with secrecy constraints as depicted in Fig. 1. In this setup, two non-cooperative and independent agents have access to length- observations and , respectively, which are noisy versions of the source sequence . These observations are conditionally independent given . Each agent independently transmits a compressed version of its observation to the CEO over a rate-limited noiseless link. The CEO estimates the source sequence based on the information received from the two agents. An eavesdropper, referred to as Eve, with access to side information correlated to the source sequence, can eavesdrop on only one of the links from the agents to the CEO at each time instant in order to obtain as much information as possible about the source. Therefore, the agents’ transmission rates should be such that the CEO can reconstruct the source within a certain mean distortion threshold while, simultaneously, the equivocation at Eve is maximized. Eve’s equivocation with respect to either link corresponds to her uncertainty about the original source when she combines her side information with the information obtained from that link. We assume that Eve cannot access both links simultaneously: since the links are noise-free, she would in that case be more powerful than the CEO in estimating the source, owing to her additional side information. The sequences , , , and are independent and identically distributed (i.i.d.) according to joint distribution over the finite alphabet .
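The conditional independence of the two observations given the source can be checked numerically on a small example. The sketch below is only an illustration under assumptions that are not in the paper: a binary source observed through two binary symmetric channels, with hypothetical names `joint_pmf` and `is_conditionally_independent`.

```python
from itertools import product


def joint_pmf(p_x, flip_1, flip_2):
    """Build p(x, y1, y2) = p(x) p(y1|x) p(y2|x) for a binary source
    observed through two binary symmetric channels (an assumed toy model)."""
    pmf = {}
    for x, y1, y2 in product((0, 1), repeat=3):
        p_y1 = flip_1 if y1 != x else 1.0 - flip_1
        p_y2 = flip_2 if y2 != x else 1.0 - flip_2
        pmf[(x, y1, y2)] = p_x[x] * p_y1 * p_y2
    return pmf


def is_conditionally_independent(pmf, tol=1e-12):
    """Check p(x, y1, y2) * p(x) == p(x, y1) * p(x, y2) for every triple,
    which is equivalent to y1 and y2 being independent given x."""
    p_x, p_xy1, p_xy2 = {}, {}, {}
    for (x, y1, y2), p in pmf.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_xy1[(x, y1)] = p_xy1.get((x, y1), 0.0) + p
        p_xy2[(x, y2)] = p_xy2.get((x, y2), 0.0) + p
    return all(
        abs(p * p_x[x] - p_xy1[(x, y1)] * p_xy2[(x, y2)]) <= tol
        for (x, y1, y2), p in pmf.items()
    )


# Toy parameters chosen for illustration only.
pmf = joint_pmf({0: 0.5, 1: 0.5}, flip_1=0.1, flip_2=0.2)
print(is_conditionally_independent(pmf))
```

A distribution in which the second observation simply copies the first fails the check, since the two observations are then fully dependent given the source.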

Let be a finite distortion measure. We define the component-wise mean distortion between two sequences , in as

(1)
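
Assuming (1) is the usual per-letter average of the single-letter distortion, it can be sketched as follows. The function names and the two example distortion measures (Hamming and squared error) are illustrative choices, not notation from the paper.

```python
def mean_distortion(x, x_hat, d):
    """Component-wise mean distortion: the average of d(x_i, x_hat_i)
    over the n positions of two equal-length sequences."""
    if len(x) != len(x_hat):
        raise ValueError("sequences must have the same length")
    if len(x) == 0:
        raise ValueError("sequences must be non-empty")
    return sum(d(a, b) for a, b in zip(x, x_hat)) / len(x)


def hamming(a, b):
    """Single-letter Hamming distortion (finite-alphabet example)."""
    return 0.0 if a == b else 1.0


def squared_error(a, b):
    """Single-letter quadratic distortion (used in the Gaussian case)."""
    return (a - b) ** 2


print(mean_distortion([0, 1, 1, 0], [0, 1, 0, 0], hamming))      # 0.25
print(mean_distortion([1.0, 2.0], [1.0, 4.0], squared_error))    # 2.0
```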
Definition 1

A -code for compression and transmission of the source by the agents with secrecy constraints consists of an encoding function at each agent, for , and a decoding function at the CEO, . The equivocation rates for this code are defined as for .
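
The equivocation rate is a normalized conditional entropy of the source given what Eve observes. As a single-letter toy illustration only (the definition above is over length- sequences), the sketch below computes a conditional entropy in bits from a joint pmf; the variable `v` is a hypothetical stand-in for the pair formed by the intercepted message and Eve's side information.

```python
from math import log2


def conditional_entropy(pmf):
    """H(X | V) in bits for a joint pmf given as {(x, v): probability}."""
    p_v = {}
    for (_, v), p in pmf.items():
        p_v[v] = p_v.get(v, 0.0) + p
    # H(X | V) = -sum over (x, v) of p(x, v) * log2 p(x | v)
    return -sum(p * log2(p / p_v[v]) for (_, v), p in pmf.items() if p > 0.0)


# Eve learns nothing: V is independent of a uniform binary X.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# Eve learns everything: V is a copy of X.
copy = {(0, 0): 0.5, (1, 1): 0.5}

print(conditional_entropy(independent))  # 1.0
print(abs(conditional_entropy(copy)))    # 0.0
```

The two extremes bracket the quantity that the secrecy constraints try to keep large: one bit of uncertainty when Eve's observation is useless, none when it reveals the source.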

Definition 2

A tuple is said to be achievable if there exists such that for all there exists a sequence of -codes with

Let denote the rate-distortion-equivocation region defined as the set of all achievable tuples .

III Inner and Outer Bounds on the Rate-Distortion-Equivocation Region

III-A Inner Bound

Theorem 1

Let denote the region defined as the closure of the convex hull of the set of all tuples such that there exist random variables , , , and on some finite sets , , , and , respectively, according to the distribution and a function that satisfy

(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)

Then, we have .

Proof:

The proof is given in Appendix A.

Proposition 1

In Theorem 1, it suffices to consider auxiliary random variables and for with cardinalities and , respectively (see Appendix B for the proof).

The achievability scheme resulting in the inner bound is based on superposition coding and random binning at the agents, and joint decoding at the CEO. In particular, agent first transmits the bin index related to the auxiliary random variable with distribution via the noiseless link. Then, the agents send the remaining information that the CEO requires to reconstruct the source based on the Wyner-Ziv scheme [34]. The detailed proof is given in Appendix A; here we provide some intuition for the results. Inequalities (2)–(4) and (10) are similar to the Berger-Tung bounds [35, 36] that establish perfect estimation of and at the CEO, from which can be reconstructed within the distortion limit . In the equivocation bounds (5) and (6), the first term corresponds to Eve’s uncertainty about the source after decoding the codeword based on the received bin index combined with her side information, and the second term is the reduction in her uncertainty upon receiving the remaining information transmitted to the CEO by the agents. Finally, the last term in (5) and (6) stems from the fact that, in contrast to previous works, the secrecy constraints are on Eve’s equivocation with respect to the original source, whereas the information transmitted by each agent is a function of its own observation and not of the source, resulting in an increase in Eve’s uncertainty. Inequalities (8) and (9) depict a trade-off between Eve’s equivocation and the transmission rates, implying that each link’s transmission rate limits the other link’s equivocation rate.
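
The random binning step can be illustrated with a small toy, given here only as a sketch. It is not the scheme of Appendix A: superposition coding is omitted, and the joint-typicality decoder is replaced by a minimum-Hamming-distance rule, which is an assumption made to keep the example short. All names are hypothetical.

```python
import random
from itertools import product


def random_binning(codewords, num_bins, seed=0):
    """Assign every codeword to a bin chosen uniformly at random.
    Only the bin index is sent over the rate-limited link."""
    rng = random.Random(seed)
    return {cw: rng.randrange(num_bins) for cw in codewords}


def decode_from_bin(bin_index, bins, side_info):
    """Pick the codeword in the announced bin that is closest, in Hamming
    distance, to the decoder's side information (a stand-in for
    joint-typicality decoding)."""
    candidates = [cw for cw, b in bins.items() if b == bin_index]
    return min(
        candidates,
        key=lambda cw: sum(a != b for a, b in zip(cw, side_info)),
    )


codewords = list(product((0, 1), repeat=4))   # 16 toy codewords
bins = random_binning(codewords, num_bins=4)  # 2 bits instead of 4 per codeword

sent = (1, 0, 1, 1)
recovered = decode_from_bin(bins[sent], bins, side_info=sent)
print(recovered == sent)
```

With ideal side information the decoder recovers the codeword from the bin index alone; with noisier side information, more bins (a higher rate) are needed to keep the candidates in a bin distinguishable.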

Remark 1

The region of Theorem 1 can also be obtained by constructing six different codes achieving the corner points shown in Tables I and II and using the time-sharing technique between these points. Each corner point is achieved using a four-step communication to transmit variables , , , and to the CEO with different decoding orders, provided that is decoded prior to for . In each step, the information previously received and decoded at the CEO is used as side information for the current decoding step. Each code employs superposition coding, with as the first layer and as the second layer, and random binning based on the side information available at the CEO in each communication step.

Corner point Decoding order
1
2
3
4
5
6
TABLE I: Corner points of the inner region corresponding to different decoding orders: rates and distortion.

Corner point Decoding order
1
2
3
4
5
6
TABLE II: Corner points of the inner region corresponding to different decoding orders: equivocation rates.
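
The time-sharing step mentioned in Remark 1 amounts to taking convex combinations of corner points. A minimal sketch, with a hypothetical helper `time_share` and made-up two-coordinate points standing in for the actual tuples of Tables I and II:

```python
def time_share(point_a, point_b, fraction_a):
    """Operate the code for point_a during a fraction of the block and the
    code for point_b during the rest; each coordinate of the resulting
    tuple is the corresponding weighted average."""
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie in [0, 1]")
    if len(point_a) != len(point_b):
        raise ValueError("points must have the same number of coordinates")
    return tuple(
        fraction_a * a + (1.0 - fraction_a) * b
        for a, b in zip(point_a, point_b)
    )


# Placeholder corner points, not values from the paper.
print(time_share((1.0, 0.0), (0.0, 1.0), 0.25))  # (0.25, 0.75)
```

Sweeping `fraction_a` over [0, 1] for every pair of corner points traces the edges of the convex hull that appears in the statement of Theorem 1.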

III-B Outer Bound

Theorem 2

Let denote the region defined as the closure of the set of all tuples such that there exist random variables , , , and on some finite sets , , , and , respectively, which form Markov chains for with , and a function that satisfy

(11)
(12)
(13)
(14)
(15)
(16)
(17)
(18)

Then, we have .

Proof:

The proof is given in Appendix C.

Proposition 2

In Theorem 2, it suffices to consider auxiliary random variables and for with cardinalities and , respectively (see Appendix D for the proof).

IV Special Case: The One-Helper Problem with Secrecy Constraints

If Agent 1 has access to the source sequence , our setup reduces to the lossy source coding problem with a helper and an eavesdropper who can choose to listen in on either the source-destination link or the helper-destination link.

Corollary 1

In the above setting, if we additionally assume that the helper’s link is perfectly secure, our results coincide with the results given by Villard and Piantanida [26, Theorem 3]. The inner bound is obtained by setting , , and removing the constraints on in Theorem 1, and the outer bound can be proved along the same lines as Theorem 2.

Corollary 2

In the described one-helper problem with secrecy constraints, if , the helper’s sequence can be reconstructed by the destination losslessly. Then, the rate-distortion-equivocation region is characterized by

(19)
(20)
(21)
(22)

where the auxiliary random variables and satisfy the Markov chain .

The achievability proof follows from the proof of Theorem 1 by setting and . Inequalities (7)–(9) are inactive for this setup. The converse proof is given in [26, Theorem 3] for secure lossy source coding with uncoded side information. Note that if Eve intercepts the helper’s link, she can also reconstruct the helper’s sequence losslessly.

Corollary 3

For the lossless one-helper setting, i.e., , if , the rate-equivocation region is given by

(23)
(24)
(25)

The achievability proof follows from the proof of Theorem 1 by setting and . The converse proof is similar to the proof given in [16].

Corollary 4

For the lossless one-helper setting, i.e., , if the eavesdropper has no side information, the rate-equivocation region is characterized by

(26)
(27)
(28)
(29)

Achievability is a special case of Theorem 1, obtained by setting and to be constants, , and . The proof of the converse is given in Appendix E.

V The Quadratic Gaussian Case

In this section, we study the Gaussian CEO problem with secrecy constraints and quadratic distortion measure.

Let be a Gaussian source, i.e., . The observations at the agents are modeled as for , with , where Gaussian random variables , , and are mutually independent.
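
As a sanity check on this model, the sketch below assumes an additive form in which each observation equals the source plus independent zero-mean Gaussian noise (the symbols and variances here are illustrative assumptions). It computes the minimum mean-square error of the source given both raw observations, which no distortion target at the CEO can undercut regardless of the rates, and confirms the closed form by Monte Carlo simulation.

```python
import random


def mmse_from_observations(var_x, noise_vars):
    """MMSE of a zero-mean Gaussian source given independent additive
    Gaussian-noise observations: 1 / (1/var_x + sum_i 1/noise_var_i)."""
    return 1.0 / (1.0 / var_x + sum(1.0 / v for v in noise_vars))


def simulate_mmse(var_x, noise_vars, num_samples=50_000, seed=0):
    """Empirical squared error of the linear MMSE estimator."""
    rng = random.Random(seed)
    d_min = mmse_from_observations(var_x, noise_vars)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, var_x ** 0.5)
        ys = [x + rng.gauss(0.0, v ** 0.5) for v in noise_vars]
        # Conditional mean of x given the observations (zero-mean prior).
        x_hat = d_min * sum(y / v for y, v in zip(ys, noise_vars))
        total += (x - x_hat) ** 2
    return total / num_samples


# Illustrative variances only.
print(mmse_from_observations(1.0, [1.0, 1.0]))  # one third
print(simulate_mmse(1.0, [1.0, 1.0]))           # close to one third
```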

First, we consider the case where the eavesdropper has no side information. The model is depicted in Fig. 2 and the following theorem provides the complete rate-distortion-equivocation region for this Gaussian setup.

Fig. 2: The quadratic Gaussian case with no side information at Eve.

Theorem 3

In the quadratic Gaussian CEO problem with secrecy constraints, the rate-distortion-equivocation region is characterized by the set of all tuples satisfying

(30)
(31)
(32)
(33)
(34)
(35)
(36)
(37)

for some that satisfy

(38)

Proof:

The proof is given in Appendix F.

An example of the region of Theorem 3 is illustrated in Fig. 3 for different distortion constraints.

Fig. 3: An example of the rate-distortion-equivocation region for the quadratic Gaussian case with no side information at Eve and different distortion constraints (panels (a) to (d)).

Remark 2

We note that, in the finite-alphabet setting, equivocation as the secrecy measure represents Eve’s uncertainty about the source; in the Gaussian setting this interpretation does not carry over directly, since differential entropy can be negative. However, based on [37, Theorem 8.6.6], we can relate the equivocation rate (normalized differential entropy) to the estimation error at the eavesdropper. That is, we define the secrecy measure in the Gaussian setting as

(39)

for , where is an estimator of at the eavesdropper. Then, we have , and in this sense, equivocation rate provides a lower bound on the normalized distortion at Eve. Moreover, we can relate this to the information leakage rate as

(40)

which is also in line with the result in [38, Theorem 7.3].
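
The link between differential entropy and estimation error can be illustrated in the scalar case. The sketch assumes the cited result is the standard bound stating that the mean-square error of any estimator is at least exp(2h)/(2πe), with the differential entropy h measured in nats. The function names are hypothetical.

```python
from math import e, exp, log, pi


def gaussian_differential_entropy(variance):
    """Differential entropy, in nats, of a Gaussian with the given variance."""
    return 0.5 * log(2.0 * pi * e * variance)


def estimation_error_lower_bound(h_nats):
    """Lower bound on the mean-square estimation error implied by a
    differential entropy of h_nats: exp(2 h) / (2 pi e)."""
    return exp(2.0 * h_nats) / (2.0 * pi * e)


variance = 2.5  # illustrative value
h = gaussian_differential_entropy(variance)
print(estimation_error_lower_bound(h))  # approximately 2.5
```

For a Gaussian the bound is met with equality, which is why a high equivocation rate translates directly into a large unavoidable distortion at the eavesdropper in this setting.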

Next, we consider the case where Eve has access to additional side information correlated to the source as shown in Fig. 4. We model this side information as where is a Gaussian random variable with and is independent of , , and . The following theorem gives an inner bound for the rate-distortion-equivocation region of the quadratic Gaussian CEO problem with secrecy constraints and side information at the eavesdropper.

Theorem 4

In the quadratic Gaussian CEO problem with secrecy constraints and side information at the eavesdropper, a tuple is achievable if

(41)
(42)
(43)
(44)
(45)
(46)
(47)
(48)

where is the indicator function and

(49)