Paranoid Secondary: Waterfilling in a Cognitive Interference Channel with Partial Knowledge


We study a two-user cognitive channel in which the primary flow is sporadic, cannot be re-designed, and operates below its link capacity. To study the impact of primary traffic uncertainty, we propose a block activity model that captures the random on-off periods of the primary’s transmissions. Each block in the model can be split into parallel Gaussian-mixture channels, such that each channel resembles a multiple access channel (MAC) from the point of view of the secondary user. The secondary senses the current state of the primary at the start of each block. We show that the optimal transmit power depends on the sensed state and that the optimal power profile is paranoid, i.e., either growing or decaying in power as a function of time. We show that such a scheme achieves capacity when there is no noise in the sensing. The optimal transmission for the secondary performs rate splitting and follows a layered water-filling power allocation for each parallel channel. The secondary rate approaches that of a genie-aided scheme for large block lengths. Additionally, if the fraction of time the primary uses the channel tends to one, the paranoid scheme and the genie-aided upper bound get arbitrarily close to a no-sensing scheme.


Cognitive radio, spectrum sensing, interference channel, Gaussian mixture channel, capacity, side information, rate splitting, water-filling.

1 Introduction

Cognitive wireless is a novel approach to deploying new wireless services in the presence of legacy devices with priority access to the channel. The aim of a cognitive framework is to communicate as an underlay to an underutilized primary network without degrading the primary communication beyond a predetermined threshold. The key constraint in our formulation is that the primary transceiver is a fixed legacy design, i.e., it was not designed with a cognitive secondary in mind. This constraint makes the classical interference channel [1] one-sided, i.e., a Z-interference channel with an additional power constraint. In this paper, we analyze how temporal opportunities due to sporadic primary traffic can be exploited, even if the secondary transmitter has incomplete information about the current state of primary traffic.

Our contributions are three-fold. First, we approximate the uncertainty in primary activity (interference channel with a sporadic primary) by a simple block activity model where the primary changes its state at most once during a block of fixed duration . In this model, the two sources of uncertainty in the primary traffic are captured by the initial state and the time of state change . Their actual values are unknown to the secondary, but their distribution and are known. We first show that the reliability constraint at the primary receiver, which requires that primary transmission should not be harmed, places an additional power constraint on the secondary transmitter (see for example, [25]) and the interference caused by the sporadic primary at the secondary receiver converts the AWGN channel to a Gaussian mixture channel.

Second, we present two sense-and-send schemes where the secondary senses the channel at the start of each block to look for temporal opportunities. The fixed primary design converts the effective channel into a MAC. The secondary splits its rate into two layers [20], [19]. We show that it is optimal to treat the primary message either as all public information or as all private information, but not both, because the fixed primary’s message cannot be split into private and public parts. We also present a no-sensing scheme and a genie-aided upper bound. We prove that for the sense-and-send schemes the secondary power profile is paranoid, i.e., the power monotonically increases or decreases during a block depending on whether or not the primary was using the channel at the start of the block. The paranoid profile arises from the effective noise of the mixture channels, which decays or grows depending on the starting state of the channel in a block. Additionally, we prove that when the sensing is perfect, the paranoid scheme is optimal. The proof is along the lines of a channel with delayed state information [22].

Third, we show that if there is no information about the primary traffic the rate splitting scheme achieves capacity. The effective channel is converted to a MAC with a fixed transmitter from the secondary’s point of view and hence rate splitting at the secondary transmitter and sequential decoding at the secondary receiver is optimal. Finally, we show that the paranoid scheme approaches a genie-aided upper bound for large block lengths and when the primary traffic becomes more persistent, both these schemes get close to the no-information scheme.

As with any active area, many variations of cognitive radio have been proposed and studied in the literature [12, and references therein]. Some of the earlier work [18, 16] in opportunistic spectrum allocation for cognitive flows was motivated by studies done by the FCC [7] showing significant spectral underutilization. Many aspects of cognitive radio have been studied, including coding with degraded messages [5, 23], capacity of the secondary flow with causal or non-causal information [15], stability of the queues at both flows for maximal secondary rate with a guaranteed primary throughput [21], and spectral shaping [25]. The case of an adaptive primary which acts as a feedback to the cognitive radio is considered in [6]. When interference is considered in terms of SINR constraints (see for example, [4, 9, 11, 10]), the effect of the primary communication at the secondary is not considered. Power control schemes have been derived for different power constraints [25, 24] when the primary is persistent but operating below capacity. The idea of opportunistic interference cancellation was introduced in [19]. However, none of the above results addresses the uncertainty in the channel when the primary traffic is fixed but sporadic. We extend the idea of opportunistic interference cancellation to derive sense-and-send schemes for a fixed and sporadic primary operating below capacity.

The rest of the paper is organized as follows. In Section 2, we present the system model and introduce a block activity model for a sporadic primary. In Section 3, we first present a genie-aided bound. Then we derive the paranoid secondary profile for a sense-and-send protocol when the sensing is noisy. Finally, we give two special cases: when the sensing is noiseless and when no sensing is performed. We conclude with some numerical examples.

2 System Model

2.1 Signal Model

(Fixed Primary)



Figure 1: A two-flow cognitive interference network.

We consider an interference channel (Figure 1), where the channel inputs and outputs are related as, and where are channel inputs, are the channel outputs and are independent zero mean, unit variance Gaussian noise. Each transmitter is individually average power constrained, such that and . We define the following for convenient interpretation: . We have fixed the primary power to (primary uses its full power budget) in our definition of and , leaving the secondary power as an optimization variable in our definition of and . Finally, we use the standard form [2] of interference channel by assuming . Thus the channel input-output are related as and where the variance of and is bounded above by one.

2.2 Block Activity Traffic Model for the Primary Flow

The primary is assumed to have the following two properties: its channel is underutilized i.e. it sends below the capacity of its own channel when it transmits, and its traffic is sporadic i.e. it does not occupy the channel continuously. We model the channel underutilization by assuming that the primary employs Gaussian random codes with a code rate of , where and . The sporadic traffic is modeled by assuming that the primary transmits for fraction of time, which results in an average primary rate of . Thus a secondary user can use the same channel as long as it ensures that the primary INR does not go above . We refer to this as the INR constraint. We capture the two uncertainties from the secondary’s point of view, the start time of the primary transmission bursts and the duration of the bursts, using the following block activity model. The primary transmissions are assumed to occur in blocks of time-slots, where is a finite constant. All transmissions by the primary and the secondary are considered to be slot synchronous. The primary user activity is labeled as the state of the primary channel and is denoted by , where is the time-slot index in a block. The state process is independent of the secondary channel’s input and output. The primary channel is either in the busy state, , or in the idle state, during the time-slot . To keep the analysis tractable, we assume that the primary user changes its state only once at time-slot in a block (see Figure 2). Thus, the tuple captures the two uncertainties related to the sporadic transmissions.

The starting state of the primary user for each block is assumed to be independently drawn from . Conditioned on the starting state, the switching time of the primary has a probability mass function, and the corresponding distribution function . If , then the primary user does not change its state during the block. The secondary transmitter is assumed to know , and . The secondary receiver is assumed to have perfect knowledge to do coherent decoding.

Figure 2: Simplified block activity model for the primary packets with only one allowed switching of state. This model is characterized by the starting state , the state-switching time and the blocklength .

Since the primary uses Gaussian codes, from the point of view of the secondary the channel always behaves like an AWGN channel. The effective noise as seen by the secondary receiver has the mixture distribution for ,


where, . The term is the probability of the state conditioned on the starting state , and is used to derive the power profile in Section 3. Note that the is the actual primary channel usage fraction and is what is seen by the secondary. A persistent primary can be represented by putting . If , the system is fully loaded and no secondary is allowed on the same channel. The interesting case is when and .
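The block activity model above can be illustrated with a minimal sketch. The slot convention (slots numbered 1 to T, the state flips after slot tau, and tau equal to T meaning no switch), the unit noise variance, and all function and parameter names are assumptions made for illustration only; they are not taken from the paper's equations.

```python
def state_sequence(s0, tau, T):
    """Primary state per slot: s0 for slots 1..tau, flipped afterwards.

    Assumed convention: tau == T means the state never changes in the block.
    """
    return [s0 if n <= tau else 1 - s0 for n in range(1, T + 1)]


def prob_busy(s0, pmf_tau):
    """P(primary busy in slot n | starting state s0) for n = 1..T.

    pmf_tau[k - 1] is the (hypothetical) probability that tau == k.
    """
    T = len(pmf_tau)
    probs = []
    for n in range(1, T + 1):
        # The state in slot n still equals s0 if and only if tau >= n.
        p_unchanged = sum(pmf_tau[n - 1:])
        probs.append(p_unchanged if s0 == 1 else 1.0 - p_unchanged)
    return probs


def mixture_noise_variance(s0, pmf_tau, interference_power):
    """Variance of the zero-mean Gaussian mixture seen in each slot.

    Assumes unit-variance receiver noise; interference_power is the received
    primary power at the secondary receiver (a hypothetical parameter).
    """
    return [1.0 + p * interference_power for p in prob_busy(s0, pmf_tau)]
```

With a uniform switching-time distribution and an idle start, the busy probability (and hence the average noise variance) grows from slot to slot, which is the effect that later drives the time-dependent power profile.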

2.3 Interleaved Block Code for the Secondary Flow

We now define the set of secondary codes over consecutive blocks, when the secondary has an estimate of the starting state for each block. Bold face letters represent the vector corresponding to the variable for all blocks, e.g. represents the starting state of all blocks. Since the time-slot of each block has identically distributed noise as given by Equation (1), the secondary designs different codebooks each matched to a given time slot across blocks. For example, during block the secondary transmits where denotes the component of a codeword from the codebook. Similarly the codeword is given by as shown in Figure 3.








Figure 3: Codewords are multiplexed across blocks, i.e. each time slot uses a different codeword which spans across blocks. The figure shows the sensing operation at the beginning of each block and the codeword, . Each of the codewords is picked from the corresponding codebook. The block size is a positive constant; however, the code length tends to infinity.

A cognitive interleaved block code of length consists of separate -codes, corresponding to each time slot of a block. Each of these codebooks further consists of two component codebooks corresponding to the sensed starting state of the block which may or may not match the actual starting state . The codebooks are defined by a set of encoding functions that map the set of equiprobable messages to channel inputs. The transmitter encoding function is defined as for and , i.e. . The decoding function maps the received vector to the message set, leading to probability of error The rate is if there exists a cognitive block code for sufficiently large and . A rate is said to be achievable if there exists an code for every and the capacity is defined to be the supremum of all the achievable rates .
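The multiplexing of per-slot codewords across blocks can be sketched as a simple transpose. This is only an illustration of the interleaving structure: the names are hypothetical, symbols are plain Python values, and the selection between the two component codebooks by sensed state is deliberately left out.

```python
def interleave(codewords):
    """Map per-slot codewords to per-block transmissions.

    codewords[n][m] is symbol m of the codeword assigned to time slot n.
    Returns blocks[m][n], the symbol sent in slot n of block m.
    """
    num_slots = len(codewords)
    num_blocks = len(codewords[0])
    return [[codewords[n][m] for n in range(num_slots)]
            for m in range(num_blocks)]


def deinterleave(blocks):
    """Inverse of interleave: collect slot n of every block into one codeword."""
    num_blocks = len(blocks)
    num_slots = len(blocks[0])
    return [[blocks[m][n] for m in range(num_blocks)]
            for n in range(num_slots)]


# Two slots per block, three blocks: each codeword has length three.
example = interleave([[1, 2, 3], [4, 5, 6]])
# example is [[1, 4], [2, 5], [3, 6]]
```

Each codeword therefore only ever sees the noise statistics of its own time slot, which is why a separate codebook per slot can be matched to that slot.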

Finally, consider the Han-Kobayashi scheme for interference channels, where each user’s data is split into private and public parts. The public information is decodable by both decoders and the private information is decodable only by the intended decoder. A special case of this structure is usable in our model. Firstly, since the primary encoder is fixed, its data can be considered either public (if the secondary can decode it, i.e. the cross channel can support the rate) or private, but it cannot be split into both. Secondly, since the primary decoder is fixed, it considers all the secondary information as noise, i.e. private. Since the primary data is fixed, the secondary can do rate splitting at the encoder and sequential decoding at the decoder without losing optimality. We will call the first layer the single user codeword with power since it is decoded using a single user decoder (treating everything else as noise), and the second layer the multiuser codeword with power since it is decoded using a multiuser receiver (with successive interference cancellation). The subscript will be replaced by its value (0 or 1) depending on the context.

3 Main Results

In the block activity model, the two sources of uncertainty in the primary traffic are captured by the initial state and the time of state change . Their actual values are unknown to the secondary, but their distribution and are assumed to be known. Our aim is to understand how lack of knowledge of these two parameters impacts the secondary rate. We first derive the case where a Genie provides the information about which serves as an upper bound and a design motivation for the general case of Section 3.2, where the secondary has an estimate of the initial state and has no knowledge of . We then consider two important special cases of the general scheme. First, we consider the case of perfect sensing, , and derive stronger results about the optimal secondary transmission design. Second, we consider the case where the secondary has no information about any of the unknown parameters, which is equivalent to not providing any useful information about .

3.1 Genie-aided Case: Secondary has perfect estimate of

In this section, we assume Genie-aided knowledge of at the secondary. As described in Section 2.2, the primary flow has two states, on and off. Hence the secondary transmitter uses two interleaved codes matched to the two channel states. Such a solution is called water-filling or water-pouring [8], as the power of each codebook decreases as the noise variance of the corresponding channel state increases. We use the following result from [17] to show that Gaussian codebooks are optimal for the secondary.
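For intuition, classical water-filling over a small number of channel states can be sketched as follows. This is the textbook version only: it ignores the INR and SIC constraints that the paper adds, and the function name, bisection approach, and parameters are assumptions for illustration.

```python
def waterfill(noise, weights, p_avg, iters=100):
    """Classical water-filling by bisection on the water level.

    noise[i]   : noise variance in state i
    weights[i] : fraction of time spent in state i (positive, sums to one)
    p_avg      : average power budget
    Returns per-state powers max(level - noise[i], 0) that meet the budget.
    """
    lo = min(noise)
    hi = max(noise) + p_avg / min(weights)  # level high enough to exceed the budget
    for _ in range(iters):
        level = 0.5 * (lo + hi)
        used = sum(w * max(level - n, 0.0) for n, w in zip(noise, weights))
        if used > p_avg:
            hi = level
        else:
            lo = level
    level = 0.5 * (lo + hi)
    return [max(level - n, 0.0) for n in noise]


# Primary off half the time (noise 1) and on half the time (noise 3).
powers = waterfill([1.0, 3.0], [0.5, 0.5], 2.0)
```

In this two-state example the quieter state receives more power, and with a small enough budget the noisier state receives none.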

Lemma 1 (Optimality of Gaussian codewords, see [17, Theorem ])

For a single-user scalar additive noise channel with nearest neighbor decoding irrespective of the noise distribution, the average probability of error over the ensemble of Gaussian codebooks of power , approaches zero as the blocklength tends to infinity for code rates below (and approaches one for rates above ).

Since the primary uses a Gaussian codebook, the equivalent noise at the secondary is also Gaussian. Further, since the decoder of the primary is fixed, the distribution of the noise at its receiver does not change the primary capacity as long as the variance (interference plus noise) is below . Hence Gaussian codebooks are optimal for the secondary. The secondary uses a Gaussian codebook with power when the primary is off, and a superposition Gaussian codebook with power when the primary is on. Define and to characterize the power allocation for the Gaussian codebook that achieves capacity.

Theorem 1 (Genie-aided Upper Bound)

When , then the optimal power allocation for the genie-aided case is given by


On the other hand, when , then the optimal power allocation is given by,


where , , . Here, and are the on and off fractions of the primary, respectively.

Proof 1

The proof is based on two steps. First we use Lemma 1 to show that Gaussian codebooks are optimal for the secondary transmissions. Then the optimal power allocation is obtained by solving the following optimization problem,

subject to

The rate achieved is a corner point of the three-user virtual MAC formed by the primary and the two code layers of the secondary. Hence sequential decoding at the secondary receiver achieves capacity. Complete details are in Appendix 5.

As shown in Figure 4, power allocation can be viewed as water-filling with a layer in the middle representing the active primary. For the time-slots when the primary is off, the secondary uses a single-user codebook matched to its own channel capacity. When the primary is on, the secondary transmitter splits its power into a two-layer Gaussian superposition codebook as shown in Figure 4, followed by sequential decoding at the secondary receiver. The decoding order at the secondary receiver is as follows. First, Layer 1 is decoded by treating everything else as noise; then the primary codeword is decoded by treating Layer 2 as noise; and finally Layer 2 is decoded interference-free.
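The three-step decoding order can be made concrete with a small rate calculation. This is a sketch under stated assumptions: unit-variance noise, rates in bits per channel use, and hypothetical names `p1`, `p2` (Layer 1 and Layer 2 powers) and `interference` (received primary power at the secondary receiver).

```python
import math


def capacity(snr):
    """AWGN capacity in bits per channel use."""
    return math.log2(1.0 + snr)


def layered_rates(p1, p2, interference):
    """Rates supported at each step of the sequential decoder.

    Step 1: Layer 1, treating the primary and Layer 2 as noise.
    Step 2: the primary, treating Layer 2 as noise (Layer 1 removed).
    Step 3: Layer 2, free of interference.
    """
    r_layer1 = capacity(p1 / (1.0 + interference + p2))
    r_primary_max = capacity(interference / (1.0 + p2))
    r_layer2 = capacity(p2)
    return r_layer1, r_primary_max, r_layer2


r1, rp, r2 = layered_rates(1.0, 1.0, 2.0)
# The three rates add up to the sum capacity of the virtual three-user MAC.
total = capacity(1.0 + 1.0 + 2.0)
```

When the primary's fixed rate equals `r_primary_max`, the three rates sum to the MAC sum capacity, which is one way to see why the operating point is a corner point.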

(Figure 4 labels: the two secondary code layers, the encoding order (filling), and the decoding order (emptying).)

Figure 4: Layered water-filling: power allocation can be thought of as filling a layered bucket with water. The presence of the primary causes one of the layers to be fixed. The size of the fixed layer (in red) is a function of , the size of Layer  depends on the interference gap and the rest of the power is put into Layer . When the primary is persistent, there is only one bucket to fill, as shown here. For a sporadic primary, there is another bucket without any layers for . For the block activity model, each of the time slots across the block look alike with replaced by or depending on the starting state.

3.2 The General Case: Secondary with noisy estimate of primary state

When the secondary does not know the channel states exactly for each time slot, it senses the channel at the beginning of each block (noisy estimate of ). The effective channel is a Gaussian mixture channel as given by Equation (1). A given time-slot (for the same sensed state) has the same channel statistics across different blocks, as shown in Figure 3. We take a cue from the form of the Genie-aided code in Theorem 1 and use a general superposition code. If , the secondary transmitter sends at a power level , (where etc.) and if , it sends at a power level . The above time-dependent power allocation exploits the secondary’s knowledge of the conditional probability distribution of the effective noise in each time slot caused by the primary. We shall omit the subscript in in the summation indices to avoid clutter. The error in the state estimate is characterized by the probability of missed detection and the probability of false alarm . The exact capacity with a noisy state estimate is unknown. To derive an achievable rate using Gaussian codes for secondary transmissions, we extend the optimization problem in (4) by generalizing the rate and the three constraints for each sensed state.

The average power constraint can be computed by summing the power levels weighted with the appropriate probability of occurrences,


Similarly the rate has to be calculated for the four possible combinations of . When the primary user is switched ‘off’ or ‘on,’ the effective secondary channel is AWGN with noise or respectively. When there is no primary on the channel, the secondary can support a rate of and in the presence of the primary, the secondary can support a rate of .
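The weighting by "probability of occurrences" can be sketched for the four (actual, sensed) starting-state combinations. The names below are hypothetical, and the average-power expression assumes that the transmit power depends only on the sensed state and the slot index; it is not a transcription of the paper's constraint.

```python
def joint_state_probs(p_busy_start, p_md, p_fa):
    """Joint probability of (actual start state, sensed start state).

    p_md: probability of missed detection (busy sensed as idle).
    p_fa: probability of false alarm (idle sensed as busy).
    State 1 means busy, state 0 means idle.
    """
    p_idle_start = 1.0 - p_busy_start
    return {
        (0, 0): p_idle_start * (1.0 - p_fa),  # correct idle detection
        (0, 1): p_idle_start * p_fa,          # false alarm
        (1, 0): p_busy_start * p_md,          # missed detection
        (1, 1): p_busy_start * (1.0 - p_md),  # correct busy detection
    }


def average_power(power_profile, probs):
    """Average transmit power per slot.

    power_profile[s_hat] lists the total power used in each slot when the
    sensed state is s_hat.
    """
    num_slots = len(power_profile[0])
    total = 0.0
    for s_hat in (0, 1):
        p_sensed = probs[(0, s_hat)] + probs[(1, s_hat)]
        total += p_sensed * sum(power_profile[s_hat]) / num_slots
    return total


probs = joint_state_probs(0.5, 0.1, 0.2)
avg = average_power({0: [2.0, 2.0], 1: [1.0, 1.0]}, probs)
```

The same four joint probabilities weight the four rate terms derived next.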

When , the secondary correctly detects that the primary is silent and sends . The achievable rate averaged over all possible primary switches is given by,

where , , and . Lemma 3 (see Appendix 6) was used to change the summation order in step (a).

In this case, the secondary correctly detects the primary user and sends ; the effective noise at the secondary receiver has a variance of at the beginning of the block, which changes to when the primary user changes its state at time-slot . The probability of this event, , is . The rate that can be achieved in this case is given by, , where and .

In this case, even though the primary user is transmitting, the secondary detects that there is no primary packet on the channel and sends . Since the primary user is actually on at the start of the block, the noise that the secondary receiver sees has a variance of , and it changes to when the primary changes its state. The probability of the event , is . The rate achieved for a given in this case is given by,

In this case, the secondary detects the primary user and sends , but the primary user is actually off at the start of the block, so the noise that the secondary receiver sees in the beginning is , and it changes to when the primary changes its state. The probability of the event is . The rate achieved for a given is given by,

Adding up all the terms derived above after weighting them with the appropriate probability of occurrences, leads to the average rate of

Two additional constraints are required to complete the problem formulation. The first is the INR constraint, which is imposed due to the limit on the maximum noise variance that can be experienced at the primary receiver, , for and . The second is the SIC constraint, which is imposed to ensure that the primary information can be decoded in the presence of the multiuser codeword , after the single user codeword has been decoded. This gives rise to the condition, which is the same as , for and . Note that these two constraints hold for each state and time slot, in contrast to the average power constraint, which holds only on average. We solve the following optimization problem to find the optimal power profile,


where is the constraint set defined by the INR, SIC and average power constraints derived above. Additionally, we have to consider positivity constraints for all the power variables, i.e. . This is with the understanding that the constraints do not become infeasible, i.e. and if , . We will assume complementarity and positivity but not discuss it further due to lack of space.

Theorem 2 (Monotonicity of the power profile)

For the optimization problem given in (6), Additionally,

  1. If then

  2. If then

  3. If then

  4. If then

Proof 2

See Appendix 7.

Due to the generality of our problem formulation in (6), the exact form of the power distribution cannot be derived in closed form. However, Theorem 2 establishes the monotonicity of the power allocation across time-slots. The optimal power profile is non-increasing in time if the start state is . That is, the secondary gets paranoid over time since it does not know when the primary transmitter will start transmitting, so it is better for the secondary to send more power in the initial time-slots and become more conservative as time progresses. In contrast, if , then the secondary bets more power in the later time-slots, as there is a higher chance that the primary will turn off in those slots, thereby creating a better channel for the secondary flow. In the next section, we show that the above result can be significantly strengthened for the special case of perfect state estimate .

3.3 Special Case I: Secondary with perfect estimate of primary state,

In this section we assume that there is no error in the secondary’s estimate of . When the starting state is perfectly known, the subchannels shown in Figure 3 behave as parallel channels. For such a channel, capacity can be achieved by sending at a constant power when channel side information is available at the receiver [13]. We show below that Gaussian codewords with power levels achieve the capacity if the sensing is error-free.

Theorem 3 (Capacity with perfect sensing, )

With perfect sensing, the capacity for the discrete cognitive interference channel is given by,

Proof 3

See Appendix 8.

The above result for finite input-output alphabets extends to continuous alphabets such that , , and . In order to find the power allocation , we have to solve an optimization problem similar to (4). When sensing is perfect and . There is no missed detection or false alarm, i.e. and . So, the rate and all the constraint equations for the perfect sensing protocol can be obtained by making these substitutions in (6).

Theorem 4 (Optimal Layer  power)

For the optimization problem given in (6) with and , the optimal power of the Layer  Gaussian codeword is given by, where are defined in Theorem 1 and .

Proof 4

See Appendix 9.

Due to the perfect estimate of the primary starting state, the rate splitting done at secondary is optimal, which allows us to find the exact value of the Layer 2 codewords. The monotonicity properties of the power levels still hold for the perfect sensing case.

Theorem 5 (Monotonic Layer  profile)

For the optimization problem given in (6) with , Additionally,

  1. If then

  2. If then

  3. If then

  4. If then

Proof 5

The proof follows directly from Theorem 2 for and .

The intuition for the above results is as follows. It is always better to allocate more power to Layer  codewords (while satisfying the power and interference constraints) as it has a higher contribution towards the secondary rate . For a fixed , below a certain , this upper bound is zero and all the power goes to the Layer  codeword. As increases, the proportion of the Layer  codeword keeps increasing and in the end all the power is put into the Layer  codeword. In short, we have to do layered water-filling for each time slot in the block as shown in Figure 4. Next, we show that the opportunistic superposition of [19] is a special case of the cognitive protocol when no sensing is done.

3.4 Special Case II: Secondary with no information about primary state

Consider the special case when the secondary does not sense the channel at all, i.e. the secondary transmitter only knows the statistics of the primary traffic and . Alternatively, the estimate is so noisy that it does not provide any information about . A similar analysis of this special case can also be found in [20]. Out of the available power , is assigned to the Layer  codeword and is assigned to the Layer  codeword. The average power constraint for the secondary is given by and the INR constraint can be written as . After decoding the single user codeword, the residual capacity of the channel is given by . To ensure that the primary is decodable, the SIC constraint is given by, . For such an , a secondary rate of can be achieved. Gaussian codebooks are optimal in this case too by Lemma 1. The optimal value of the superposition fraction () is the one which maximizes while satisfying all the constraints.

Theorem 6 (No-sensing Capacity)

For a cognitive interference channel where the secondary does not sense the channel, the optimal superposition fraction is given by

and the capacity is given by .

Proof 6

See Appendix 10.

From the secondary receiver’s perspective, the equivalent channel is a MAC. The fixed primary converts the equivalent MAC rate region (pentagon) to a single line (Figure 5a). Even though the rate splitting assumes sequential decoding, it turns out to be optimal [3] because the two code layers at the secondary make the rate tuple a corner point of a three-user virtual MAC [20] consisting of the two code layers of the secondary and the primary.

Remark 1 (Persistent Primary)

If the primary has persistent data (), the rate that the secondary can achieve is the same as proved above, i.e. the effective channel in the no-sensing case is a compound channel and the secondary has to code for the worst case.

The optimal distribution puts as much available power in as possible without violating the SIC, INR and power constraints. This can be thought of as layered water-filling, as shown in Figure 4. The layer due to the primary codeword is fixed. Available power is first assigned to the Layer 2 codeword. When the fixed layer is very high so that no power is left to put in the Layer 1 codeword, and if the fixed layer touches the noise floor, all the power is put in the Layer 1 codeword, . The idea of opportunistically doing interference cancellation has also been analyzed in [?, 19] and is a restatement of the rate splitting approach introduced in [20] for achieving time-sharing without cooperating encoders. The difference from [19] is the effect of the secondary transmissions on the primary, which gives rise to the power profile, and the proof of optimality using rate splitting and sequential decoding.
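The rule of putting as much power as possible into the Layer 2 codeword can be sketched numerically. Everything here is an illustrative assumption rather than the paper's formula: the primary is taken to be on, noise has unit variance, `p_total` is a power level that already satisfies the average power and INR constraints, `alpha` is defined as the Layer 2 fraction, and `primary_rate` is the primary's fixed code rate in bits per channel use (assumed positive).

```python
import math


def no_sensing_split(p_total, interference, primary_rate):
    """Largest Layer 2 fraction that keeps the primary decodable, and the rate.

    Assumed SIC condition:
        primary_rate <= log2(1 + interference / (1 + alpha * p_total)).
    """
    snr_needed = 2.0 ** primary_rate - 1.0
    # Largest Layer 2 power the primary decoding step can tolerate.
    p2_max = interference / snr_needed - 1.0
    alpha = min(1.0, max(0.0, p2_max / p_total))
    p2 = alpha * p_total
    p1 = p_total - p2
    # Layer 1 sees the primary and Layer 2 as noise; Layer 2 is decoded clean.
    rate = math.log2(1.0 + p1 / (1.0 + interference + p2)) + math.log2(1.0 + p2)
    return alpha, rate


alpha, rate = no_sensing_split(4.0, 3.0, 1.0)
```

Under these assumptions the rate is increasing in `alpha`, so the largest feasible fraction is the maximizer; `alpha` equal to zero corresponds to treating the primary purely as noise, and `alpha` equal to one to decoding the primary first.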

Figure 5b shows a detailed view of the different rate splitting regions as a function of and , for fixed , . In Region 1, the secondary user treats the primary data as noise (). In Region 2, the secondary uses both layers. In Region 3, the secondary receiver first decodes the primary’s data and then decodes its own ().

(Figure 5 panels: (a) Effective MAC (as seen by the secondary receiver); (b) Detailed view of the rate splitting regions, with the two single-layer regions each labeled "Only Layer".)
Figure 5: (A) The effective MAC as seen by the secondary receiver, where is the decodable primary rate as seen by the secondary receiver, is the rate supported by the cross channel between the primary transmitter and the secondary receiver, is the rate achievable by the secondary and is the maximum rate achievable by the secondary if there is no primary. (B) Detailed view of the regions of operation for the secondary user for as a function of the two variables and . The secondary user uses Layer  codewords in Region , Layer  codewords in Region  and a superposition code with both layers in Region . Note that is not a part of any region.

The Layer  protocol region is always the same w.r.t. . The increase in the Layer  protocol region with the increase in the can be attributed to the interference constraint at the primary receiver. This increase comes at the expense of the superposition protocol region. When increases, the power available to the secondary user keeps on decreasing due to the interference constraint. This decreases the power left for the Layer  codeword. Hence at higher , a part of the mixed protocol region gets converted to a multiuser protocol region as there is no more power left to put into the single user codeword.

The paranoid power profile is optimal if the sensing is perfect. However, even if the sensing is noise free, the time spent in sensing is still an overhead. This overhead can be high enough for the no-sensing scheme to outperform the perfect sensing scheme in some regimes, as shown in Figure 6.

Figure 6: The no-information lower bound can outperform perfect sensing scheme with a finite sensing overhead depending on the , the blocklength . Here, and .

Along the curve, points close to the right side () are obtained for and the points to the left side () are obtained for . Figure 7 plots the two extreme rate points on the y-axis in Figure 6 (where ) for different block sizes. The above-mentioned loss due to sensing decreases as increases. Finally, the bounds are closer together when the residual capacity of the primary channel is smaller, as shown in Figure 8. Different values of have no effect on the no-information lower bound, but the genie-aided upper bound is able to get a higher rate if the channel is idle for a longer period of time. This means that when is high, the two bounds are close.

Figure 7: The maximum achieved (two extreme rate points on the y-axis in Figure 6 where ) by the paranoid profile scheme gets closer to the genie-aided upper bound as increases which is due to smaller sensing overhead. Here, .
Figure 8: Comparison of no-sensing scheme and the Genie-aided scheme for different persistence () of the primary user’s data and . The dashed lines are for the no-sensing scheme and the solid lines are for the genie-aided scheme.

4 Conclusion

We approximated an interference channel in the presence of a fixed sporadic primary flow as a block activity channel. Such a block activity model can be broken into parallel channels in time, and multiplexed codebooks can be used when there is perfect sensing. We derived a paranoid scheme for the secondary user when the primary user is transmitting below capacity and showed that rate splitting at the secondary transmitter with sequential decoding is optimal when the estimation of the primary starting state is noise-free. Depending on the starting state of the primary during a block, the optimal power profile for Gaussian inputs is either growing or decaying in power as a function of the time slot. When the sensing overhead is considered, we showed that the paranoid scheme approaches the genie-aided scheme for large block lengths. Finally, we showed numerically that the paranoid and genie-aided schemes approach the no-information scheme when the primary channel is operating close to its capacity.


Proof of Theorem 1: We derive a special case of the water-filling result [8] for a channel with two states, which will be used to prove the optimal power allocation for the genie-aided case.

Lemma 2

For the optimization problem given by

subject to

with non-negative parameters , the maximizing is characterized as follows.

where and .

Proof 7

Let us start with assuming that there is no constraint on . The optimal solution is given by , where is chosen such that . If , then . Therefore, and . Note also ensures and by assumption . This proves part B of the Lemma.