# The Finite State MAC with Cooperative Encoders and Delayed CSI

## Abstract

In this paper, we consider the finite-state multiple access channel (MAC) with partially cooperative encoders and delayed channel state information (CSI). Here partial cooperation refers to the communication between the encoders via finite-capacity links. The channel states are assumed to be governed by a Markov process. Full CSI is assumed at the receiver, while at the transmitters, only delayed CSI is available. The capacity region of this channel model is derived by first solving the case of the finite-state MAC with a common message. Achievability for the latter case is established using the notion of strategies; however, we show that optimal codes can be constructed directly over the input alphabet. This results in a single codebook construction that is then leveraged to apply simultaneous joint decoding. Simultaneous decoding is crucial here because it circumvents the need to rely on the capacity region’s corner points, a task that becomes increasingly cumbersome as the number of messages to be sent grows. The common message result is then used to derive the capacity region for the case with partially cooperating encoders. Next, we apply this general result to the special case of the Gaussian vector MAC with diagonal channel transfer matrices, which is suitable for modeling, e.g., orthogonal frequency division multiplexing (OFDM)-based communication systems. The capacity region of the Gaussian channel is presented in terms of a convex optimization problem that can be solved efficiently using numerical tools. The region is derived by first presenting an outer bound on the general capacity region and then suggesting a specific input distribution that achieves this bound. Finally, numerical results are provided that give valuable insight into the practical implications of optimally using conferencing to maximize the transmission rates.

## 1 Introduction

Temporal variations, a characteristic typical of wireless channels, may occur due to atmospheric changes, changes in the environment, the mobility of transmitters and/or receivers or time-varying intentional or unintentional interference. Since accurate channel state information (CSI) at both the transmitting and the receiving ends is crucial for efficient communications, measures are commonly incorporated in the communication protocol to enable channel state estimation. For example, the long term evolution (LTE) cellular communication standard relies on pilot signals transmitted at pre-scheduled time intervals and frequency slots to estimate the channel’s state [1]. Performed at the receiver, these estimations are then typically fed back to the transmitter, but obtaining perfect CSI at both ends of the channel in practical systems is a formidable challenge. More often than not, CSI is subject to channel estimation errors and feedback is not instantaneous due to some inevitable processing delay, and as a result, receivers and transmitters typically have access to only partial CSI. The impact of such partial CSI on the achievable performance, therefore, has attracted much attention in recent years. In the case of multiuser communication, performance is affected not only by channel characteristics, but also by interactions between the users. In particular, different forms of cooperation between the transmitting and receiving ends, a subject of growing interest in recent years (e.g., [2]), may significantly enhance performance. This paper aims to investigate the combined impact of both partial CSI and cooperation. More specifically, we focus on a two-user finite state Markov multiple access channel (FSM-MAC), with *partially* cooperative encoders and *delayed* CSI, as illustrated in Fig. ? and explained in the following text.

In the communication scenario under discussion, each of the two encoders wishes to send an independent private message through a time-varying MAC to the decoder. Delayed CSI is assumed to be available at the encoders, while full delayless CSI is assumed at the decoder. Different users may be subject to different CSI delays. It is further assumed that prior to each transmission block, the two encoders are allowed to hold a conference. More specifically, it is assumed that the encoders can communicate with each other over noise-free communication links of given capacities. We restrict the discussion to the case in which the conference held between the encoders is independent of the CSI.

The non-state-dependent MAC with partially cooperative encoders was first introduced by Willems [4], who also derived the capacity region for the discrete memoryless setting. Special cases of this channel model include that in which the encoders are ignorant of each other’s messages (i.e., the capacities of the communication links between them are both zero) and that in which the encoders fully cooperate (i.e., the capacities of the communication links are infinite). The first setting, where no conference is held, corresponds to the classical MAC, for which the capacity region was determined by Ahlswede [5] and Liao [6]. In contrast, in the second setting, where total cooperation is available, the encoders can act as one by fully sharing their private messages via the conference. The capacity region for this case is the part of the first quadrant below the so-called total cooperation line. This triangle-shaped region always contains the capacity region for the classical MAC.

In his proof of achievability for the conferencing MAC, Willems [4] introduced a coding scheme based on the capacity region for the MAC with a common message, derived by Slepian and Wolf in [7]. Willems showed that in order to achieve the capacity region, the encoders should use the cooperation link to share parts of their private messages and then use a coding scheme for the ordinary MAC with a common message. Although Willems’s model allows interactive communication between the encoders, it was shown both in [4] and later in [8] that a single round of communication between the encoders (referred to as a “pair of simultaneous monologues” in [4]) suffices to achieve optimality.
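For orientation, the region derived in [4] for the discrete memoryless, non-state-dependent conferencing MAC is commonly written as below. The symbols $C_{12}$ and $C_{21}$ (conferencing link capacities from Encoder 1 to Encoder 2 and vice versa) and $U$ (the auxiliary variable carrying the shared message parts) are notation assumed here for illustration. The region is the union, over all input distributions of the form $P_U P_{X_1|U} P_{X_2|U}$, of the rate pairs satisfying:

```latex
\begin{align}
R_1 &\le I(X_1;Y \mid X_2,U) + C_{12}, \\
R_2 &\le I(X_2;Y \mid X_1,U) + C_{21}, \\
R_1 + R_2 &\le \min\bigl\{ I(X_1,X_2;Y \mid U) + C_{12} + C_{21},\; I(X_1,X_2;Y) \bigr\}.
\end{align}
```

The additive capacity terms account for the message parts exchanged during the conference, which then play the role of a common message. The product form of the input distribution imposes the Markov relation $X_1 - U - X_2$.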

Additional multiuser settings that involve cooperation between users through communication links of finite capacities have been extensively treated in the literature. See, for example, [9] and [10] for studies of the MAC, [2] and [11] for studies of the interference channel with cooperating nodes, [17] for the broadcast channel, [18] and [19] for cooperative relaying and [3] and [20] and references therein for cooperation in cellular architectures. A comprehensive survey of cooperation and its role in communication can be found in [21]. It is important to note, however, that in all of the above settings the channel was not assumed to be time-varying.

Multiuser settings that combine both time-varying channels and user cooperation are obviously of major interest as well. A Gaussian fading MAC with cooperating encoders that have access to delayless CSI was considered in [22] and in [23]. As in our case, these works assume that cooperation is allowed only before the CSI becomes available at the encoders. The case in which the CSI becomes available to the encoders prior to transmission is treated in [24], where a MAC with perfect noncausal CSI is considered. The coding scheme introduced in [24] uses conferencing to share parts of the messages as well as CSI.

The notion of modeling time-varying channels as state-dependent channels dates back to Shannon [25], who characterized the capacity of the state-dependent, memoryless point-to-point channel with independent and identically distributed (i.i.d.) states available causally at the encoder. To establish achievability, Shannon presented a code construction that relied on “strategies” (or “strategy letters”) [26], a notion we also exploit in this paper. Gelfand and Pinsker [27], and later Heegard and El Gamal [28], studied the case in which the encoder observes the channel states noncausally. In both [27] and [28] a single letter expression for the capacity is derived using random binning. In [29], Goldsmith and Varaiya considered a fading channel with perfect CSI at both the transmitter and the receiver. It was shown that in such a case, the optimal strategy is to employ waterfilling over time.

As noted above, because perfect CSI is difficult to obtain in practical systems, models that involve partial or imperfect CSI have attracted considerable attention in recent years. Early work treated settings involving an i.i.d. state sequence with imperfect CSI, beginning with various point-to-point channel scenarios with partial CSI; among others, the causal, noncausal, rate-limited and noisy cases were addressed [30]. An extension of the result to the MAC with rate-limited CSI can be found in [33]. In [34], the authors derive the capacity region for the MAC with asymmetric quantized CSI at the encoders, where the quantization models the imperfection in the channel state estimation (full CSI at the decoder is assumed). Later, in [35] Lapidoth and Steinberg provided an inner bound for the capacity region of the MAC with strictly causal CSI at the encoders. In contrast to the point-to-point setting, where strictly causal CSI regarding an i.i.d. state sequence does not increase capacity, the capacity region of the MAC with causal CSI is strictly larger than the corresponding region without CSI. Li *et al*. presented an improved inner bound for the same setting in [36]. A comprehensive monograph on channel coding in the presence of side information can be found in [37], where an i.i.d. state sequence is assumed. An information theoretic model for a single user channel involving delayed CSI and a state process that is no longer restricted to be memoryless and i.i.d. was first introduced by Viswanathan [38], who derived the capacity while assuming an FSM channel. This result was later generalized by Caire and Shamai in [26], where they addressed a point-to-point channel in which the CSIs at both encoder and decoder admit some general joint probability law. A general capacity formula, which relies on the notion of *inf-information rate* [39], is then provided for the case of state processes with memory.
The result is then shown to boil down to a single-letter characterization in the case in which perfect CSI is available to the receiver, the CSI at the transmitter is given by a deterministic function of the channel state, and the two processes are jointly stationary and ergodic. By an appropriate choice of the above deterministic function, the result for Viswanathan’s delayed CSI model [38] is obtained as a special case of the result in [26]. A generalization of the point-to-point results of [26] to the MAC was presented by Das and Narayan in [40]. The generality of the channel model therein leads to multiletter characterization of the capacity region in various settings, which unfortunately provides limited insight into practical encoding schemes for channel models in this framework.

Taking a practically oriented approach, we focus in this paper on a specific channel definition that leads to single-letter results. Following [38], we model temporal variations by means of an FSM channel [41]. The channel state is determined on a per-symbol basis and governed by the underlying FSM process. An important extension of this idea to the multiuser case was introduced by Basher *et al*. in [43], presenting the FSM-MAC with delayed CSI and non-cooperating encoders, i.e., where no conference is held (see also [44] for a related source coding analysis). In the proof of the capacity region for this model, achievability was established by employing a coding scheme based on rate-splitting and multiplexing-coding combined with successive decoding at the receiver. Successive decoding was used in [43] to demonstrate that the two corner points of the capacity region are achievable. The whole capacity region is then achievable via time-sharing. Although the setting in [43] constitutes a special case of the general model in [40], the main contribution of [43] is the single-letter characterization of the capacity region and the detailed construction of the coding scheme.

In the current paper, accounting for the availability of a conferencing link between the encoders, we take a different approach from that taken in [43]. We base the proof of achievability on the coding scheme for the MAC with a common message as presented in [4], and therefore, we start by deriving the capacity region for the FSM-MAC with a common message and the same CSI properties as in [43]. We thus provide a solution to what has been, until now, an unsolved problem. Next, using the achievable scheme for the common message setting, the achievability of the conferencing region is established. We note that the large number of corner points induced by the presence of an additional transmission rate (namely, the rate of the common message) renders the provision of an achievable coding scheme for the common message setting based on achieving the region’s corner points an awkward task. Moreover, the use of rate-splitting and multiplexing-coding when a common message is involved yields a rather complex coding scheme, which we sought to avoid.

Therefore, we present an alternative coding scheme that employs strategy letters in the code construction (cf., e.g., [25] and [40]) and simultaneous decoding. However, unlike the case of Shannon’s classical result for the point-to-point channel with causal encoder CSI, here we show that optimal codes can be constructed directly over the input alphabet (as also shown for certain special cases in [40]). Namely, a single codebook is generated for each of the three messages over a super-alphabet that corresponds to the different realizations of the delayed CSI available at the encoders. At each time instance, a symbol that is correlated with the *current* available delayed CSI is selected by the encoders and transmitted to the channel. Thus, in contrast to previous works involving delayed CSI (cf., [38] and [43]), here rate-splitting is no longer required. The decoder then uses its access to full CSI (which deterministically defines the delayed state sequences as well) to reduce each codeword (originally constructed over a super-alphabet) to a sequence over the input alphabet and executes a simultaneous decoding scheme based on joint typicality. Indeed, one of the most significant contributions of our paper is this coding scheme for the MAC with a common message and delayed CSI. Not only does it avoid the unnecessary complexity of its rate-splitting and multiplexing counterpart and rely on a simpler codebook construction, it also achieves every possible point in the region rather than only the corner points. Furthermore, this two-user coding scheme is easily extendable to the case of multiple users with a *single* common message.

Based on the general results for the FSM-MAC with conferencing, we continue with the derivation of the capacity region for the special case of a vector Gaussian FSM-MAC with diagonal channel transfer matrices. This channel model can be used to represent an orthogonal frequency-division multiplexing (OFDM)-based communication system, employing single receive and transmit antennas, where the diagonal entries of the channel matrices represent the orthogonal sub-channels used by the OFDM scheme.

To derive the capacity region for the latter channel, we use a multivariate extension of a novel tool first derived in [45] (namely, a necessary and sufficient condition for a Gaussian triplet of random variables to satisfy a certain Markov relation), and demonstrate that Gaussian multivariate distributions maximize certain mutual information expressions under a Markovity constraint. The scalar version of this tool was employed by Lapidoth *et al*. [46] to provide an outer bound for the capacity region of the scalar Gaussian non-state-dependent MAC with conferencing encoders. Wigger and Kramer also used this tool in their solution for the capacity region of the three-user, non-state-dependent MIMO MAC with conferencing [47]. The need to use the tool from [45] stems from the fact that the input distribution of the conferencing channel must admit a certain Markovity constraint. For cases in which no Markov relation needs to be satisfied, the traditional approach to proving the optimality of Gaussian multivariate distributions involves employing either the Vector Max-Entropy Theorem (a direct extension of [48]) or a conditional version of it. Here, however, this approach fails since replacing a non-Gaussian vector satisfying the Markovity condition by a Gaussian vector of the same covariance matrix may result in a Gaussian vector that violates the Markovity condition. To overcome this issue we use a sufficient and necessary condition on the (auto- and cross-) covariance matrices of the involved Gaussian random vectors for them to admit a Markov relation [49].

We note that although Gaussian input vectors are shown to be optimal in this setting, the original form of the capacity region involves a non-convex optimization problem. To circumvent this difficulty, new variables are introduced to convert the optimization problem into a convex problem that can then be solved using numerical tools such as CVX [50]. The capacity region for the corresponding scalar Gaussian channel can be immediately derived from the result for the vector channel setting and serves as an extension of the result in [46] to the state-dependent case. The capacity region of the vector Gaussian FSM-MAC with a common message and the same CSI properties can also be easily derived from the result for the conferencing channel by exploiting the strong correspondence between the two models and using a simple analogy.

To gain some insight into the practical implications of the results we conclude this paper with a specific example, namely, a scalar AWGN channel with two possible states (‘Good’ and ‘Bad’). Numerical results are included to demonstrate the impact of different channel parameters on the capacity region and the optimal input distribution. Our interpretation of interactions between the different parameters produces valuable insights.

The remainder of the paper is organized as follows. In Section 2 we describe the two communication models of interest: the FSM-MAC with a common message and delayed CSI and the FSM-MAC with partially cooperative encoders and delayed CSI. In Sections 3 and 4, we state the capacity results for the common message and conferencing models, respectively. Each result is followed by its proof. Section 5 follows with the definition of the vector Gaussian FSM-MAC with diagonal channel transfer matrices and the derivation of the maximization problem defining its capacity region. The regions for the corresponding common message model and the scalar setting are given as special cases. The two-state Gaussian example is discussed in this section as well. Finally, Section 6 summarizes the main achievements and insights presented in this paper along with some possible future research directions and extensions.

## 2 Channel Models and Notation

In this paper, we investigate the capacity region of the FSM-MAC with partially cooperative encoders, full CSI at the decoder (receiver) and delayed CSI at the encoders (transmitters), as illustrated in Fig. ?. To this end, we first consider a different setting, which is the FSM-MAC with a common message and the same CSI properties, as depicted in Fig. ?. The derivation of the capacity region for the latter common message setting forms the basis for the achievability proof for the former setting where a conferencing link exists between the encoders. Since most definitions for both channels follow similar lines, we start by defining the common message setting and then extend the description for the setting of partially cooperative encoders.

We use the following notations. Matrices are denoted by nonitalicized capital letters, e.g., . Calligraphic letters denote sets, e.g., , while the cardinality of a set is denoted by . stands for the -fold Cartesian product of . An element of is denoted by , and its substrings as ; when , the subscript is omitted. We use the notation . Whenever the dimension is clear from the context, vectors (or sequences) are denoted by boldface letters, e.g., . Random variables are denoted by uppercase letter, e.g., , with similar conventions for random vectors. stands for the sequence of random variables , while stands for . The probability of an event is denoted by , while denotes conditional probability of given . Probability mass functions (PMFs) are denoted by the capital letter with a subscript that identifies the random variable and its possible conditioning. For example, for two jointly distributed random variables and , let , , and denote, respectively, the PMF of , the joint PMF of , and the conditional PMF of given . In particular, when and are discrete, represents the stochastic matrix whose elements are given by . We omit the subscripts if the arguments of the distribution are lower case versions of the random variables.

### 2.1 FSM-MAC with a Common Message and Delayed CSI

The FSM-MAC with a common message considered in this paper is illustrated in Fig. ?. The MAC setting consists of two senders and one receiver. Each sender chooses a pair of indices, , uniformly from the set , where denotes the common message and , , denotes the private message of the corresponding sender. The choices of , and are independent. The input to the channel from encoder is denoted by , and the output of the channel is denoted by .

At each instance of time, the FSM channel is assumed to be in one of a finite number of states . In each state, the channel is a discrete memoryless channel (DMC), with input alphabets and output alphabet . Let the random variable denote the channel state at time . Similarly, we denote by and the inputs and the output of the channel at time . The channel transition probability distribution at time depends on the state and the inputs at time , and it is given by . The channel output at any time is assumed to depend only on the channel inputs and state at time . Hence,

The state process, , is assumed to be an irreducible, aperiodic, finite-state, homogeneous and stationary Markov chain and is therefore ergodic. The state process is independent of the channel inputs and output when conditioned on the previous states, i.e.,

Furthermore, we assume that the state process is independent of the messages , and , i.e.,

We assume that full CSI is available at the decoder (i.e., the decoder knows at each time instance ). However, the encoders are only assumed to have access to delayed CSI, with delays and for Encoder 1 and Encoder 2, respectively. We let , , denote the channel state at time , and assume without loss of generality that . Now, let be the one-step state-transition probability matrix of the Markov process that governs the channel states, and let be its steady state probability distribution. The joint distribution of is stationary and is given by

where is the -th element of the d-step transition probability matrix of the Markov state process. To simplify the notation, we define the joint distribution of the random variables as the joint distribution of , i.e.,

where

The average probability of error

for the code is given in (Equation 1). We use standard definitions of achievability and of the capacity region [48]. Namely, a rate triplet is *achievable* for the FSM-MAC if there exists a sequence of codes with as . *The capacity region* is the closure of the set of achievable rates .
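The state statistics defined earlier in this subsection can be made concrete with a short numerical sketch. It computes the stationary distribution of a Markov state process and the joint PMF of the current state and two delayed states via the Markov factorization (stationary probability of the oldest state, then a multi-step transition to the more recent delayed state, then a multi-step transition to the current state). The function names, the example transition matrix, and the delay values are hypothetical and chosen only for illustration; rows of `K` are taken to index the current state and columns the next state.

```python
import numpy as np

def stationary_distribution(K):
    """Return the row vector pi satisfying pi @ K = pi and sum(pi) = 1."""
    n = K.shape[0]
    # Stack the balance equations pi (K - I) = 0 with the normalization row.
    A = np.vstack([K.T - np.eye(n), np.ones((1, n))])
    b = np.concatenate([np.zeros(n), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def delayed_state_joint(K, d_small, d_large):
    """Joint PMF of (S_t, S_{t-d_small}, S_{t-d_large}), indexed [s, s_a, s_b].

    Markov factorization:
    pi(s_b) * K^(d_large - d_small)(s_a | s_b) * K^(d_small)(s | s_a).
    """
    assert 0 <= d_small <= d_large
    pi = stationary_distribution(K)
    K_gap = np.linalg.matrix_power(K, d_large - d_small)  # rows s_b, cols s_a
    K_near = np.linalg.matrix_power(K, d_small)           # rows s_a, cols s
    return np.einsum("b,ba,as->sab", pi, K_gap, K_near)

# Hypothetical two-state chain and delays (not taken from the paper).
K = np.array([[0.9, 0.1],
              [0.3, 0.7]])
joint = delayed_state_joint(K, d_small=1, d_large=3)
```

Because the chain is stationary, every single-state marginal of `joint` equals the stationary distribution, which gives a quick sanity check on the factorization.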

### 2.2 FSM-MAC with Partially Cooperative Encoders and Delayed CSI

The FSM-MAC with partially cooperative encoders and delayed CSI is depicted in Fig. ?. The channel definition relies on Section 2.1, while taking the common message set to be . Here, however, conferencing between the encoders is introduced under the assumption that conferencing links of fixed and finite capacities and exist between the encoders. Accordingly, the amount of information exchanged between the encoders during the conference is bounded by and . The conference is assumed to take place prior to the transmission of a codeword through the channel and consists of consecutive pairs of communications, simultaneously transmitted by the encoders. Each communication depends on the message to be transmitted by the sending encoder and previously *received* communications from the other encoder. We denote the communications transmitted from encoder to the other encoder by . Note that here the state process is also assumed to be independent of the conference communications, i.e.,

The average probability of error

for the code is given by (Equation 3). The *achievable rates* and the *capacity region* for this channel are defined analogously to their definitions in Section 2.1.

## 3 The Capacity Region of the FSM-MAC with a Common Message and Delayed Transmitter CSI

In this section we state the capacity region of the FSM-MAC with a common message and delayed transmitter CSI, after which we present its proof.

## 4 The Capacity Region of the FSM-MAC with Partially Cooperative Encoders and Delayed Transmitter CSI

In this section we state the capacity region of the FSM-MAC with partially cooperative encoders and delayed transmitter CSI followed by its proof.

## 5 The Vector Gaussian FSM-MAC with Diagonal Channel Transfer Matrices, Conferencing and Delayed CSI

In this section we consider the vector Gaussian FSM-MAC with diagonal channel transfer matrices, partially cooperative encoders and delayed CSI. For every time instance , the channel model under consideration is:

where and are diagonal matrices, which are deterministic functions of the channel state . We denote the diagonal entries of these matrices by and , respectively, for and . Moreover, we assume that . For every , and are the channel input vectors and the channel output vector, respectively. is a proper complex zero mean additive white Gaussian noise (AWGN) process, independent of and for every . Thus, each noise sample is distributed according to , where is the identity matrix of dimensions . The input vector signals are assumed to satisfy the average power constraints

where we use the standard notation , and denotes the conjugate transpose of the matrix .

The motivation for examining the channel model in (Equation 4) stems from the fact that it can be used to represent an OFDM-based communication system, employing single receive and transmit antennas. OFDM is an efficient technique used to mitigate frequency selective fading, which is typical in modern wideband communication systems (see, e.g., [1]). The underlying idea behind OFDM is to split the channel’s bandwidth into separate sub-channels through which orthogonal signals are transmitted. By doing so, not only is the impact of intersymbol interference (ISI) dramatically reduced, but the transfer functions of each of the sub-channels boil down to multiplicative scalar gains. These gains are modeled by the diagonal entries of the channel matrices defined above. In this section we derive the maximization problem that specifies the capacity region for the vector Gaussian channel under consideration and convert it into a convex problem. The solution of this convex maximization problem, which can be easily obtained using a numerical tool such as CVX [50], also yields the optimal power allocation strategy among the sub-channels, which is another essential factor in an OFDM-based transmission.
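The decoupling into scalar sub-channels described above can be checked numerically. The following minimal sketch, under assumed values, draws hypothetical complex gains for one fixed channel state and verifies that the matrix form of a two-user MAC with diagonal channel matrices coincides with per-sub-channel scalar multiplications. The number of sub-channels, the unit-variance complex Gaussian draws, and all variable names are illustrative assumptions rather than parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
num_subchannels = 4  # hypothetical number of OFDM sub-channels

def complex_gaussian(size):
    """Draw i.i.d. proper complex Gaussian samples with unit variance."""
    return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2)

# Hypothetical per-sub-channel gains for one fixed channel state.
h1 = complex_gaussian(num_subchannels)
h2 = complex_gaussian(num_subchannels)

# Input vectors of the two encoders and the additive noise vector.
x1 = complex_gaussian(num_subchannels)
x2 = complex_gaussian(num_subchannels)
z = complex_gaussian(num_subchannels)

# Matrix form with diagonal channel transfer matrices.
y_matrix = np.diag(h1) @ x1 + np.diag(h2) @ x2 + z

# Equivalent form: each sub-channel sees only scalar gains.
y_scalar = h1 * x1 + h2 * x2 + z
```

The two outputs agree entry by entry, which is why the power allocation across sub-channels can be treated as a set of coupled scalar problems.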

### 5.1 Capacity Region

The corresponding capacity region for the analogous setting with a common message can be obtained from Theorem ? by taking:

where denotes the common message rate, and and denote the rates of the private messages (according to the common message channel definition in Section 2.1). The result is summarized in the following Corollary.

Note that the capacity regions in Theorem ? and Corollary ? are both given in the form of a convex optimization problem, which can be solved efficiently using numerical tools. In the following proof we first derive a slightly different, yet equivalent, region for the Gaussian conferencing model. This equivalent capacity region involves a nonconvex optimization problem that we then convert into a convex problem by an appropriate change of optimization variables.

### 5.2 Two-State Scalar AWGN Channel Example

To gain some intuition on the capacity region of the MAC with partially cooperative encoders and delayed CSI, we now consider the scalar Gaussian channel with only two possible states. The scalar channel corresponds to taking in the diagonal vector channel definition in (Equation 4). We denote the two possible channel states by and (where stands for ‘Good’ and for ‘Bad’), thus, . The two states differ in their associated channel gains. When , the gains are , whereas when the gains are . We assume without loss of generality that . The Markov model of the state process is illustrated in Fig. ?.

The state process is specified by the transition probability matrix:

which induces the following stationary distribution:

We start by examining the impact of the cooperation link capacities, and , on the capacity regions in the particular case of symmetric CSI delays, i.e., . Note that since , it immediately follows that . The capacity region is presented in Fig. ? for three different cases: (a) symmetrical capacities, represented by, , (b) single cooperation link, represented by, and (c) one infinite cooperation link, represented by, . The capacity regions were calculated by numerically solving the optimization problem induced by Theorem ? for the above three cases using CVX [50]. Throughout this example we assume , , , and (results of similar nature were observed for and ).

Note that in Fig. ?(a), which presents the region for the symmetrical case, as grows without bound, the capacity region increases and eventually adopts a triangular shape. This outcome is because the first three constraints on the rates , as given by ( ?)-( ?), also grow without bound, and thus, the binding constraint is the sum-rate constraint of ( ?). For the case of a single cooperation link shown in Fig. ?(b), the upper bound on remains fixed as grows, since the constraint in ( ?) does not change with and stays fixed at approximately . Finally, for the case of infinite cooperation link capacity , as shown in Fig. ?(c), we have that the constraint on in ( ?) and the first constraint on the sum-rate in ( ?) are both redundant. Hence, the only meaningful constraint on is ( ?), which does not involve (or ).

Next, we demonstrate that the capacity region of this setting grows as the cooperation link capacities grow, regardless of the specific assumptions on the relation between the delays of the CSI available at the encoders. To do so, we present the maximum sum-rate versus the cooperation link capacities for three different possible relations between the delays: (a) , (b) and (c) . For all three cases we assume and use the same values of the channel gains as before. The curves are shown in Fig. ?(a)-(c).

As expected, the sum-rate of case (c) (which exhibits the best CSI properties of the three) reaches the highest value as the capacities grow, whereas the sum-rate for case (b) (which exhibits the worst CSI properties) reaches the lowest value. Moreover, we note the correspondence between Fig. ?(a) and Fig. ?(a) (both corresponding to the case of symmetrical delays and equal cooperation link capacities). Evidence of this correspondence is the fact that when grow, the sum-rate, in both figures, approaches its maximal value, which is approximately bits per symbol.
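The ordering of the delay configurations can be related to how quickly delayed state knowledge becomes stale. As a rough sketch with hypothetical transition probabilities (not the values used to produce the figures), the code below builds a two-state Good/Bad chain, evaluates its closed-form stationary distribution, and measures how far the $d$-step transition rows are from that distribution as the delay $d$ grows. The variable names and the `staleness` helper are assumptions introduced for illustration.

```python
import numpy as np

# Hypothetical transition probabilities of the two-state chain.
good_to_bad = 0.1  # assumed P(next = Bad | current = Good)
bad_to_good = 0.3  # assumed P(next = Good | current = Bad)

# State order: index 0 = Good, index 1 = Bad.
K = np.array([[1.0 - good_to_bad, good_to_bad],
              [bad_to_good, 1.0 - bad_to_good]])

# Closed-form stationary distribution of a two-state chain.
pi = np.array([bad_to_good, good_to_bad]) / (good_to_bad + bad_to_good)

def staleness(K, pi, d):
    """Largest total-variation distance between a row of K^d and pi."""
    K_d = np.linalg.matrix_power(K, d)
    return 0.5 * np.abs(K_d - pi).sum(axis=1).max()

# Distance from the stationary law for delays d = 0, 1, ..., 5.
gaps = [staleness(K, pi, d) for d in range(6)]
```

For a two-state chain this distance shrinks geometrically with the delay, at a rate given by the second eigenvalue of the transition matrix, so a longer delay leaves the encoder with a state estimate that is closer to the uninformative stationary distribution.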

Another interesting aspect of the Gaussian channel example is the impact of the signal-to-noise ratio (SNR) on the correlations between the auxiliary random variable, , and the random variables and . These correlations are associated with the level of cooperation used in the scheme. We assume that the transmit powers satisfy and that , so that the SNR, in fact, equals , and restrict the analysis to the case where , i.e., a single and constant channel state [46]. We use throughout the same notations and expressions for the rate bounds as in [46]. Note that for the case where , the maximization problem in ( ?) turns out to be concave even without the transformation ( ?); thus no transformation is needed. The remaining optimization variables are and , which are defined through (cf., ( ?)-( ?))

We consider the case of symmetrical cooperation link capacities, i.e., . By the symmetry of the maximization problem in , optimality is achieved when . For this reason we use the notation and plot a single curve representing both correlations (which are calculated directly from according to ( ?)). The numerical results are shown in Fig. ?. The dashed blue and green lines designate the asymptotic value of the correlation and the critical SNR at which the correlation drops from unity, respectively. Results are shown for six different values of .

Although the effect of the SNR on the correlations could not be calculated analytically, we use asymptotic evaluations to gain some additional insight. Namely, we demonstrate that the optimal correlation admits

where .

We start by justifying the observation that the correlation approaches for small SNR values. For some positive value of and for , consider (cf. ( ?)-( ?)):

Now note that the last term in (Equation 7) is maximized for , which, in turn, implies that the correlation is equal to unity. As shown in Fig. ?, for smaller values of SNR the correlation is indeed higher, indicating that the scheme compensates for the low SNR via cooperation.

The asymptotic evaluation for low SNRs is valid up to some critical SNR value at which the correlation drops from its maximal value of unity. We define this critical value of SNR as

To calculate we restrict the analysis to the segment of SNRs at which the correlation is maximal (or equivalently, ) and consider (Equation 7) taken for and . As shown in (Equation 7), when and , the second logarithm achieves the minimum between the two terms. Fixing and increasing increases the second logarithm in (Equation 7) while the first term remains unchanged and equals . As long as

the optimum is achieved for . However, when (Equation 9) is no longer valid, the optimal value of must vary from 0. Thus, calculating reduces to solving the following equation:

yielding,

The value of is represented by the vertical dashed green line in the plots shown in Fig. ? and is observed to agree with the numerical results. Note that as the capacities grow, so does the value of , and hence, the transition between the low- and high-SNR regimes occurs at higher SNR values.
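The threshold behavior described above can be checked numerically under simplifying assumptions. The following sketch assumes a scalar, single-state channel with unit gains, equal powers, base-2 logarithms, and a symmetric sum-rate given by the minimum of two logarithmic bounds in the style of [46]. The names `beta`, `sum_rate`, `best_beta` and `critical_snr`, as well as the closed form inside `critical_snr`, follow from these assumptions and are not quoted from the equations of this section.

```python
import math


def sum_rate(beta, snr, c12, c21):
    """Assumed symmetric sum-rate: the minimum of the two sum-rate bounds."""
    # First bound: grows with the private-power fraction beta.
    with_links = 0.5 * math.log2(1.0 + 2.0 * beta * snr) + c12 + c21
    # Second bound: shrinks with beta, since correlation is reduced.
    coherent = 0.5 * math.log2(1.0 + 2.0 * snr * (2.0 - beta))
    return min(with_links, coherent)


def best_beta(snr, c12, c21, grid=2001):
    """Grid search for the beta in [0, 1] that maximizes the sum-rate."""
    betas = [i / (grid - 1) for i in range(grid)]
    return max(betas, key=lambda b: sum_rate(b, snr, c12, c21))


def critical_snr(c12, c21):
    """SNR solving 0.5*log2(1 + 4*snr) = c12 + c21 (assumed model only)."""
    return (2.0 ** (2.0 * (c12 + c21)) - 1.0) / 4.0


if __name__ == "__main__":
    c = 0.5
    snr_c = critical_snr(c, c)
    for snr in (0.5 * snr_c, snr_c, 2.0 * snr_c):
        print(f"snr={snr:.3f}  optimal beta={best_beta(snr, c, c):.3f}")
```

In this simplified model, the optimal `beta` stays at zero (unit correlation) for every SNR up to the threshold and becomes strictly positive beyond it, which mirrors the transition marked in the plots.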

As the SNR grows, the correlation asymptotically approaches some value in the interval ; this value is denoted by . To find this asymptotic correlation, we present the following analysis for the high-SNR regime (assuming ). We start by excluding as a possible solution for this case (a fact which will be used subsequently). Fixing and substituting into the sum-rate bounds on yields (cf. ( ?)-( ?)):

where (a) follows from the fact that . We thus get that for , by taking , the sum-rate is bounded by the sum of the cooperation link capacities. However, since is a constant that does not depend on the powers and , we conclude that cannot be equal to zero.

Next, assuming , we calculate by using some approximations that are easily justified at a high SNR. First, note that the first and second logarithms in (Equation 7) are monotonically increasing and decreasing, respectively, in . This implies that the optimum is achieved at the value of at which the functions intersect, that is

Using the fact that for high SNR we have:

the equation in (Equation 12) reduces to:

To further simplify the analysis we again assume a unit channel gain, that is, . After some algebra we obtain that the intersection point is given by

which by taking , reduces to

Therefore, the optimal correlation at infinite SNR is given by

The value of , for each value of the cooperation link capacities and , is represented by the horizontal dashed blue line in the plots shown in Fig. ?. Note that the numerical calculations indeed meet the asymptotic results for large values of SNR.
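The convergence to a limiting value at high SNR can also be reproduced with a few lines of code. As a hedged illustration, the sketch below assumes a scalar, single-state channel with unit gains, equal powers and base-2 logarithms, and locates the point at which an increasing and a decreasing sum-rate bound (in the style of [46]) intersect. The parameter `beta`, the helper names, and the limiting expression in `limiting_beta` are consequences of this assumed model rather than quotations of the expressions above.

```python
import math


def gap(beta, snr, c_sum):
    """Increasing bound minus decreasing bound (assumed model, c_sum = C12 + C21)."""
    rising = 0.5 * math.log2(1.0 + 2.0 * beta * snr) + c_sum
    falling = 0.5 * math.log2(1.0 + 2.0 * snr * (2.0 - beta))
    return rising - falling


def intersection_beta(snr, c_sum, iters=80):
    """Bisection on [0, 1]; returns 0.0 when the bounds do not cross."""
    if gap(0.0, snr, c_sum) >= 0.0:
        return 0.0  # low-SNR regime: full cooperation is optimal
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid, snr, c_sum) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def limiting_beta(c_sum):
    """High-SNR limit of the intersection point under the assumed model."""
    return 2.0 / (1.0 + 2.0 ** (2.0 * c_sum))


if __name__ == "__main__":
    c_sum = 1.0
    for snr in (1.0, 10.0, 1e3, 1e6):
        print(f"snr={snr:g}  beta={intersection_beta(snr, c_sum):.6f}")
    print(f"limit: {limiting_beta(c_sum):.6f}")
```

Bisection is used instead of a closed form so that the same routine still applies if the two bounds are replaced by other monotone expressions.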

To conclude, we interpret the numerical and analytical results in terms of the optimal transmission strategies of the users for each SNR regime. Recall that the symbols of the codewords transmitted by the users are modeled by the random variables and . The fact that for low SNR the correlation is at its maximal value of unity implies that both users tend to transmit the same codewords, which, in turn, indicates that they transmit the same message. However, the only common information the users share is the common message that they have created using the conference. Therefore, we conclude that when the channel quality is low, the best strategy for the users is to transmit the common message exclusively and to abandon their private messages (i.e., the parts of their original messages that they have not managed to share). As the SNR grows beyond , the correlation between the code symbols decreases to some positive value , asymptotically approaching (Equation 14). This decrease in correlation arises because, when the channel quality is higher, each user transmits not only the common (correlated) message but also its private (uncorrelated) message.

One can also get some additional insight by examining the behavior of the correlation coefficient from the rate perspective. As long as the sum-rate falls below the sum of the cooperation link capacities, i.e., , the transmission consists only of the correlated common message; namely, the users are fully cooperative. However, once the sum-rate crosses this threshold value, the transmitted codewords incorporate both the common and private messages, leading to a decrease of the optimal correlation coefficient.

## 6 Summary and Concluding Remarks

In this paper we considered the FSM-MAC with partially cooperative encoders and delayed CSI, and derived its capacity region. The achievability proof used another result of this paper, namely, the capacity region of the FSM-MAC with a common message and delayed CSI. The latter result was obtained by providing a coding scheme that relies on strategy letters. Nonetheless, using the fact that the decoder has access to full CSI, it was also shown that optimal codes can be constructed directly over the input alphabet. Thus, a single codebook was constructed, a fact that formed the basis for simultaneous joint decoding. This approach not only successfully avoids the unnecessary complexity of a coding scheme based on rate-splitting and multiplexing (in contrast to previous works involving delayed CSI [38]), but it also circumvents the need to rely on the corner points of the capacity region, which can render the analysis cumbersome and inefficient when the number of corner points is large.

The general conferencing result was then applied to the special case of the Gaussian vector MAC with diagonal channel transfer matrices, which models OFDM-based communication systems. The corresponding capacity region was given in the form of a convex optimization problem and the optimality of Gaussian Markovian inputs was established. This result serves as a generalization of [46] to the vector state-dependent case. Focusing on a two-state Gaussian FSM-MAC example, the crucial role of cooperation for low SNR values was demonstrated.

We finally note that an extension of the results to a more general state-dependent MAC with partially cooperative encoders and CSI at both transmitters and at the receiver (as, e.g., in [40]) is currently being investigated. Extensions of the results for the Gaussian vector FSM-MAC to general MIMO settings (see, e.g., [47]) and to the ISI channel are also being considered.

## 7 Proof of the Markov Relation in ()

We prove the Markov relation ( ?) using the following claims. The Markov property in ( ?) follows from the fact that , and , and thus, due to the stationary property of the state process, also (.

To show ( ?) consider the following relations

where (a) follows from the facts that is independent of given and is a deterministic function of . Now, since this is true for all and because the auxiliary random variable is defined as , we conclude that

Finally, to show ( ?) we use the following relations

where (a) follows from the facts that is independent of given and is independent of ( given . Again, the above holds for every , and by the definition of the random variable , we conclude that

## 8 Error Probability Analysis for the Achievability Proof of Theorem

We need to show that for the coding scheme presented in Section ? and for a rate triplet as given in Theorem ?, as . Define the event in (Equation 15) for any (recall that a fixed state sequence induces a fixed pair of delayed state sequences ). Denote the transmitted messages by . Using (Equation 15), the probability of error, when averaged over the ensemble of codebooks, can be written as in (Equation 16). By the union bound, (Equation 16) is further upper bounded by ( ?). We proceed with the following steps:

as by the law of large numbers.

To upper bound consider the following:

where step (a) is proven in App. Section 9, and as . Hence, for the probability to vanish as , the following must hold:

The mutual information term in (Equation 18) can be rewritten as,

where (a) follows from the mutual information chain rule and (b) follows from the fact that is independent of given , by the underlying channel model (see Section ?).

The upper bounds on , and are all observed to be redundant, since in all three types of events the codeword is assumed incorrect, which immediately implies that the codewords and are also incorrect. Hence, requiring the probability of error to vanish as produces the same upper bound as in (Equation 18) but with respect to the partial sum-rates , and . It can therefore be concluded that the upper bound in (Equation 18) is the dominating constraint.

To upper bound consider the following steps:

where the proof of step (a) is provided in App. Section 9, and as . It hence follows that as as long as,

Using similar arguments it can be shown that to guarantee that and vanish as the following conditions must hold,