This material was presented in part at the 2014 Allerton Conference on Communication, Control, and Computing, Monticello, IL, Oct. 2014 and in part at the 2015 Information Theory Workshop (ITW), Jerusalem, Israel, Apr. 2015. This work was supported by the National Science Foundation (NSF) under Grant CCF-1217058 and by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Research Grant. This paper was submitted to IEEE Trans. Inf. Theory.

# LP-decodable multipermutation codes

Xishuo Liu, Stark C. Draper

X. Liu is with the Dept. of Electrical and Computer Engineering, University of Wisconsin, Madison, WI 53706 (e-mail: xishuo.liu@wisc.edu). S. C. Draper is with the Dept. of Electrical and Computer Engineering, University of Toronto, ON M5S 3G4, Canada (e-mail: stark.draper@utoronto.ca).

###### Abstract

In this paper, we introduce a new way of constructing and decoding multipermutation codes. Multipermutations are permutations of a multiset that generally consist of duplicate entries. We first introduce a class of binary matrices called multipermutation matrices, each of which corresponds to a unique and distinct multipermutation. By enforcing a set of linear constraints on these matrices, we define a new class of codes that we term LP-decodable multipermutation codes. In order to decode these codes using a linear program (LP), thereby enabling soft decoding, we characterize the convex hull of multipermutation matrices. This characterization allows us to relax the coding constraints to a polytope and to derive two LP decoding problems. These two problems are respectively formulated by relaxing the maximum likelihood decoding problem and the minimum Chebyshev distance decoding problem.

Because these codes are non-linear, we also study efficient encoding and decoding algorithms. We first describe an algorithm that maps consecutive integers, one by one, to an ordered list of multipermutations. Based on this algorithm, we develop an encoding algorithm for a code proposed by Shieh and Tsai, a code that falls into our class of LP-decodable multipermutation codes. Regarding decoding algorithms, we propose an efficient distributed decoding algorithm based on the alternating direction method of multipliers (ADMM). Finally, we observe from simulation results that the soft decoding techniques we introduce can significantly outperform hard decoding techniques that are based on quantized channel outputs.

## 1 Introduction

Using permutations and multipermutations in communication systems dates back at least to [1], where Slepian considered using multipermutations in a data transmission scheme for the additive white Gaussian noise (AWGN) channel. In recent years, there has been growing interest in permutation codes due to their usefulness in various applications such as powerline communications (PLC) [2] and flash memories [3]. For PLC, permutation codes are proposed to deal with permanent narrow-band noise and impulse noise while delivering constant transmission power (see also [4]). In flash memories, information is stored in the pattern of charge levels of memory cells. Jiang et al. proposed using the relative ranking of memory cells to modulate information [3]. This approach alleviates the over-injection problem during cell programming. In addition, it can reduce errors caused by charge leakage (cf. [3]).

Error correction codes that use permutations are usually designed based on a specific distance metric over permutations. In the context of rank modulation, the commonly considered distance metrics include the Kendall tau distance (e.g., [5, 6, 7, 8, 9, 10]), the Chebyshev distance (e.g., [11, 12, 13, 14, 7, 9]), and the Ulam distance (e.g., [15]). Regardless of the choice of distance metric, these studies all consider hard decoding algorithms. In other words, the objective of each decoder is to correct some number of errors in the corresponding distance metric. In order to bring soft decoding to permutation codes, Wadayama and Hagiwara introduced linear programming (LP) decoding of permutation codes in [16]. Although the set of codes that can be decoded by LP decoding is restrictive, the framework is promising for two reasons. First, the algorithm is soft-in soft-out, which differentiates it from hard decoding algorithms based on quantized rankings of channel outputs, and it would therefore be expected to achieve lower error rates. Second, the algorithm is based on solving an optimization problem, which makes it possible to incorporate future advances in optimization techniques.

In this paper, we extend the idea in [16] to multipermutations. Multipermutations generalize permutations by allowing multiple entries of the same value: a multipermutation is a permutation of the multiset $\{1, \dots, 1, 2, \dots, 2, \dots, m, \dots, m\}$. The number of entries of value $i$ in the multiset is called the multiplicity of $i$. We denote by $\mathbf{r} = (r_1, \dots, r_m)$ the multiplicity vector of a multiset, where $r_i$ is the multiplicity of $i$; in other words, $\mathbf{r}$ is the histogram of the multiset. Furthermore, a multipermutation is called $r$-regular if $r_i = r$ for all $i$. A multipermutation code can be obtained by selecting a subset of all multipermutations of the multiset. In the literature, multipermutation codes are referred to as constant-composition codes when the Hamming distance is considered [17]. When the multipermutations under consideration are $r$-regular, they are known as frequency permutation arrays [18]. Recently, multipermutation codes under the Kendall tau distance and the Ulam distance were studied in [19] and [20], respectively. As mentioned in [20], there are two motivations for using multipermutation codes in rank modulation. First, the size of a codebook based on multipermutations can be larger than that based on permutations (i.e., those in [21]). Second, the number of distinct charges a flash memory can store is limited by the physical resolution of the hardware, and thus using permutations over large alphabets is impractical.

In fact, the construction of Wadayama and Hagiwara in [16] is defined over multipermutations. However in [16], multipermutations are described using permutation matrices. This results in two notable issues. First, since the size of a permutation matrix scales quadratically with the length of the corresponding vector, the number of variables needed to specify a multipermutation in this representation scales quadratically with the length of the multipermutation. But since multipermutations consist of many replicated entries, one does not need to describe the relative positions among entries of the same value. This intuition suggests that one can use fewer variables to represent multipermutations.

The second issue relates to code non-singularity as defined in [16]. To elaborate, we briefly review some concepts therein. In [16], a codebook is obtained by permuting an initial (row) vector $\mathbf{s}$ with a set of permutation matrices. If $\mathbf{s}$ contains duplicate entries, then there exist at least two different permutation matrices $P_1$ and $P_2$ such that $\mathbf{s}P_1 = \mathbf{s}P_2$. This means that we cannot differentiate between $\mathbf{s}P_1$ and $\mathbf{s}P_2$ by comparing the matrices $P_1$ and $P_2$. Due to this ambiguity, permutation matrices are not perfect proxies for multipermutations. To see this, note that the cardinality of a multipermutation code is not necessarily equal to the cardinality of the corresponding set of permutation matrices. As a result, it is not straightforward to calculate codebook cardinality from the set of permutation matrices. Furthermore, minimizing the Hamming distance between two multipermutations is not equivalent to minimizing the Hamming distance between two permutation matrices (the latter is defined as the number of disagreeing entries; cf. Section 3). This can be seen easily from the example above, where the Hamming distance between $\mathbf{s}P_1$ and $\mathbf{s}P_2$ is zero, but the Hamming distance between $P_1$ and $P_2$ is greater than zero.

In this paper, we address the above two problems by introducing the concept of multipermutation matrices. Multipermutation matrices and multipermutations are in a one-to-one relationship. In comparison to a permutation matrix, a multipermutation matrix is a more compact representation of a multipermutation. Further, due to the one-to-one relationship, we can calculate the cardinality of a multipermutation code by calculating the cardinality of the associated set of multipermutation matrices. In order to construct codes that can be decoded using LP decoding, we develop a simple characterization of the convex hull of multipermutation matrices. The characterization is analogous to the well-known Birkhoff polytope (cf. [22]). These results form the basis for the code constructions that follow. They may also be of independent interest to the optimization community. We consider the introduction of multipermutation matrices and the characterization of their convex hull to be our first set of contributions.

Building on these results, our second set of contributions includes code definitions and decoding problem formulations. By placing linear constraints on multipermutation matrices to select a subset of multipermutations, we form codebooks that we term LP-decodable multipermutation codes. Along this thread, we first present a simple and novel description of a code introduced by Shieh and Tsai in [23] (ST codes) that has known rate and distance properties. Then, we study two random coding ensembles and derive their size and distance properties. The code definitions using multipermutation matrices immediately imply an LP decoding formulation. We first relax the maximum likelihood (ML) decoding problem to form an LP decoding problem for arbitrary memoryless channels. In particular, for the AWGN channel, our formulation is equivalent to the decoding problem proposed in [16]. We then relax the minimum (Chebyshev) distance decoding problem to derive an LP decoding scheme that minimizes the Chebyshev distance in a relaxed code polytope.

Due to the non-linearity of multipermutation codes, we need efficient encoding and decoding algorithms, which brings us to our third set of contributions. To the best of our knowledge, there have been no encoding algorithms for the ST codes that were introduced in [23]. Therefore, we first focus on encoding and introduce an algorithm that ranks all multipermutations that are parameterized by the multiplicity vector $\mathbf{r}$. In other words, suppose all multipermutations are ranked such that each corresponds to a unique index in a range of consecutive integers. Our algorithm outputs the index that corresponds to a given input multipermutation. We use this ranking algorithm as the basis for developing a low-complexity encoding algorithm for the ST codes. Next, we develop an efficient decoding algorithm based on the alternating direction method of multipliers (ADMM), which has recently been used to develop efficient decoding algorithms for linear codes (e.g., [24, 25, 26, 27, 28]). The ADMM decoding algorithm in this paper requires two subroutines that perform Euclidean projections onto two distinct polytopes. Both projections can be solved efficiently, the first using techniques proposed in [29] and the second using the algorithm developed in Appendix 10.

We list below our contributions in this paper.

• We introduce the concept of multipermutation matrices (Section 3.1) and characterize the convex hull of all multipermutation matrices (Section 3.2).

• We propose LP-decodable multipermutation codes (Section 4.1) and redefine a code introduced by Shieh and Tsai (ST codes) using our framework (Section 4.2). Furthermore, we study two random coding ensembles and compare ST codes with the ensemble average (Section 4.3 and Appendix 9).

• We formulate two LP decoding problems (Section 5), one for maximizing the likelihood, and the other for minimizing the Chebyshev distance.

• We derive an efficient encoding algorithm for ST codes (Section 6.1).

• We develop an ADMM decoding algorithm for solving the LP decoding problem for memoryless channels (Section 6.2).

• We initiate the study of the initial vector estimation problem for rank modulation and propose a turbo-equalization-like decoding algorithm (Appendix 11).

## 2 Preliminaries

In this section, we briefly review the concept of permutation matrices and the code construction approach proposed in [16].

A length-$n$ permutation is a length-$n$ vector, each element of which is a distinct integer between $1$ and $n$, inclusive. Every permutation corresponds to a unique permutation matrix. A permutation matrix is an $n \times n$ binary matrix such that every row sum and every column sum equals $1$. In this paper, all permutations (and multipermutations) are represented using row vectors. Thus, if $P$ is the permutation matrix corresponding to a permutation, then that permutation equals $(1, 2, \dots, n)P$, where $(1, 2, \dots, n)$ is the identity permutation. We denote by $\Pi_n$ the set of all permutation matrices of size $n \times n$.

###### Definition 1

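(cf. [16]) Let $K$ and $n$ be positive integers. Assume that $A$ is a $K \times n^2$ matrix, that $\mathbf{b}$ is a length-$K$ column vector, and let "$\trianglelefteq$" represent a vector of "$\le$" or "$=$" relations. A set of linearly constrained permutation matrices is defined by

$$\Pi(A, \mathbf{b}, \trianglelefteq) := \{P \in \Pi_n \mid A\,\mathrm{vec}(P) \trianglelefteq \mathbf{b}\}, \tag{2.1}$$

where $\mathrm{vec}(\cdot)$ is the operation of concatenating all columns of a matrix to form a column vector.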

###### Definition 2

(cf. [16]) Assume the same setup as in Definition 1. Suppose also that a row vector $\mathbf{s} \in \mathbb{R}^n$ is given. The set of vectors given by

$$\Lambda(A, \mathbf{b}, \trianglelefteq, \mathbf{s}) := \{\mathbf{s}P \in \mathbb{R}^n \mid P \in \Pi(A, \mathbf{b}, \trianglelefteq)\} \tag{2.2}$$

is called an LP-decodable permutation code. The vector $\mathbf{s}$ is called the "initial vector", which is assumed to be known by both the encoder and the decoder. (In this paper, we always let the initial vector be a row vector. Then $\mathbf{s}P$ is again a row vector. This differs from the notation of [16], where the authors consider column vectors.)
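To make Definitions 1 and 2 concrete, the following minimal sketch enumerates a tiny linearly constrained set of permutation matrices by brute force. The choices $n = 3$, the single equality constraint "trace equals zero", the initial vector `(1, 2, 3)`, and all helper names are illustrative assumptions, not taken from [16].

```python
from itertools import permutations

def perm_matrix(p):
    """n-by-n matrix P with P[i][j] = 1 iff p[j] == i, so that (sP)_j = s[p[j]]."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def vec(P):
    """Concatenate the columns of P into one list (column-major order)."""
    return [P[i][j] for j in range(len(P[0])) for i in range(len(P))]

def satisfies(A, b, rel, v):
    """Check A v (rel) b row by row; rel entries are '=' or '<='."""
    for row, bk, rk in zip(A, b, rel):
        lhs = sum(a * vi for a, vi in zip(row, v))
        if rk == "=" and lhs != bk:
            return False
        if rk == "<=" and lhs > bk:
            return False
    return True

n = 3
s = (1, 2, 3)  # assumed initial vector
# One constraint row that sums the diagonal of P, i.e. trace(P) = 0.
A = [[1 if idx // n == idx % n else 0 for idx in range(n * n)]]
b, rel = [0], ["="]

constrained = [P for P in map(perm_matrix, permutations(range(n)))
               if satisfies(A, b, rel, vec(P))]
codebook = sorted(tuple(sum(s[i] * P[i][j] for i in range(n)) for j in range(n))
                  for P in constrained)
print(codebook)
```

Brute-force enumeration is only feasible for very small $n$; it is used here purely to illustrate how the constraint selects a subset of $\Pi_n$.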

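Note that $\mathbf{s}$ may contain duplicates, and can be a permutation of the following vector

$$\mathbf{s} = (\underbrace{t_1, t_1, \dots, t_1}_{r_1}, \underbrace{t_2, t_2, \dots, t_2}_{r_2}, \dots, \underbrace{t_m, t_m, \dots, t_m}_{r_m}), \tag{2.3}$$

where $t_i \neq t_j$ for $i \neq j$ and there are $r_i$ entries with value $t_i$. In this paper, we denote by $\mathbf{r} = (r_1, \dots, r_m)$ the multiplicity vector, and let $\mathbf{t} = (t_1, \dots, t_m)$. Due to this notation, $n = \sum_{i=1}^{m} r_i$.

Throughout the paper, we use $\mathbf{t}$ to represent a vector with distinct entries. At this point, it is easy to observe that the vector $\mathbf{s}$ can be uniquely determined by $\mathbf{t}$ and $\mathbf{r}$. Therefore, in this paper, we refer to $\mathbf{t}$ as the "initial vector" instead of $\mathbf{s}$.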

As an important remark, we note that the initial vector does not have to be a vector of integers ($\mathbf{s}$ and $\mathbf{t}$ are real-valued). One can think of initial vectors as the actual charge levels of a flash memory programmed using the rank modulation scheme. For most of this paper, we assume that $\mathbf{t}$ is fixed and known to the decoder. However, this assumption does not necessarily hold in practice. In Appendix 11, we briefly discuss some initial work that considers what to do when $\mathbf{t}$ is unknown to the decoder.

## 3 Multipermutation matrices

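In this section, we introduce the concept of multipermutation matrices. Although, as in (2.2), we can obtain a multipermutation of length $n$ by multiplying a length-$n$ initial vector by an $n \times n$ permutation matrix, this mapping is not one-to-one. Thus $|\Lambda(A, \mathbf{b}, \trianglelefteq, \mathbf{s})| \le |\Pi(A, \mathbf{b}, \trianglelefteq)|$, where the inequality can be strict if there is at least one $i$ such that $r_i \ge 2$. As a motivating example, let a multipermutation be $\mathbf{x} = (1, 2, 1, 2)$. Consider the following two permutation matrices

$$P_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad P_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}.$$

Then $\mathbf{x} = \mathbf{s}P_1 = \mathbf{s}P_2$, where $\mathbf{s} = (1, 1, 2, 2)$. In fact there are a total of four permutation matrices that can produce $\mathbf{x}$.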
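The ambiguity can be checked directly. The sketch below multiplies an initial vector by the two matrices above and counts, by brute force, how many $4 \times 4$ permutation matrices produce the same vector. The initial vector `(1, 1, 2, 2)` is an assumption chosen to be consistent with the two matrices; the helper name is hypothetical.

```python
from itertools import permutations

def row_times_matrix(s, P):
    """Row vector times matrix: (sP)_j = sum_i s[i] * P[i][j]."""
    return tuple(sum(s[i] * P[i][j] for i in range(len(s)))
                 for j in range(len(P[0])))

P1 = [[1, 0, 0, 0],
      [0, 0, 1, 0],
      [0, 1, 0, 0],
      [0, 0, 0, 1]]
P2 = [[0, 0, 1, 0],
      [1, 0, 0, 0],
      [0, 0, 0, 1],
      [0, 1, 0, 0]]
s = (1, 1, 2, 2)  # assumed initial vector with two repeated values

x1 = row_times_matrix(s, P1)
x2 = row_times_matrix(s, P2)

# Count all permutation matrices P (P[i][j] = 1 iff p[j] == i) with sP == x1.
count = 0
for p in permutations(range(4)):
    P = [[1 if p[j] == i else 0 for j in range(4)] for i in range(4)]
    if row_times_matrix(s, P) == x1:
        count += 1
print(x1, x2, count)
```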

To resolve this ambiguity, we now introduce multipermutation matrices, which are defined to be rectangular binary matrices parameterized by a multiplicity vector. Then, we discuss the advantages of using multipermutation matrices vis-a-vis permutation matrices. Finally, we show a theorem that characterizes the convex hull of multipermutation matrices, a theorem that is crucial to our code constructions and decoding algorithms.

### 3.1 Introducing multipermutation matrices

Recall that $\mathbf{r}$ is a multiplicity vector of length $m$ and $n = \sum_{i=1}^{m} r_i$. We denote by $\mathsf{M}(\mathbf{r})$ the set of all distinct multipermutations parameterized by the multiplicity vector $\mathbf{r}$. We now define a set of binary matrices that is in a one-to-one correspondence with $\mathsf{M}(\mathbf{r})$.

###### Definition 3

Given a multiplicity vector $\mathbf{r}$ of length $m$ and $n = \sum_{i=1}^{m} r_i$, we call an $m \times n$ binary matrix $X$ a multipermutation matrix parameterized by $\mathbf{r}$ if $\sum_{i=1}^{m} X_{ij} = 1$ for all $j$ and $\sum_{j=1}^{n} X_{ij} = r_i$ for all $i$. Denote by $\mathcal{M}(\mathbf{r})$ the set of all multipermutation matrices parameterized by $\mathbf{r}$.

Using this definition, it is easy to build a bijective mapping between multipermutations and multipermutation matrices. When the initial vector is $\mathbf{t}$, the mapping can be defined as follows: Let $\mathbf{x}$ denote a multipermutation. Then, it is uniquely represented by the multipermutation matrix $X$ such that $X_{ij} = 1$ if and only if $x_j = t_i$. Conversely, to obtain the multipermutation $\mathbf{x}$, one can simply calculate the product $\mathbf{t}X$.

###### Example 1

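Let the multiplicity vector be $\mathbf{r} = (2, 3, 2, 3)$, let $\mathbf{t} = (t_1, t_2, t_3, t_4)$, and let $\mathbf{x} = (t_2, t_1, t_4, t_1, t_2, t_3, t_4, t_4, t_2, t_3)$. Then the corresponding multipermutation matrix is

$$X = \begin{pmatrix} 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \end{pmatrix}.$$

The row sums of $X$ are $2, 3, 2, 3$ respectively. Further, $\mathbf{x} = \mathbf{t}X$.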
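The bijection between multipermutations and multipermutation matrices can be sketched in a few lines. The helper names `to_matrix` and `from_matrix` are hypothetical, and the concrete initial vector `(1, 2, 3, 4)` is an assumption made only so that the example prints numbers; the matrix is the one displayed above.

```python
def to_matrix(x, t):
    """Multipermutation x -> m-by-n binary matrix with X[i][j] = 1 iff x[j] == t[i]."""
    return [[1 if xj == ti else 0 for xj in x] for ti in t]

def from_matrix(X, t):
    """Recover the multipermutation as the product tX."""
    return tuple(sum(t[i] * X[i][j] for i in range(len(t)))
                 for j in range(len(X[0])))

X = [[0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 1, 0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
     [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]]
t = (1, 2, 3, 4)  # assumed initial vector with distinct entries

x = from_matrix(X, t)
row_sums = [sum(row) for row in X]        # should equal the multiplicity vector
col_sums = [sum(col) for col in zip(*X)]  # every column holds exactly one 1
print(x, row_sums, col_sums)
```

Mapping `x` back with `to_matrix(x, t)` returns the same matrix, which is the one-to-one property that the following lemma formalizes.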

###### Lemma 1

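Let $\mathbf{t}$ be an initial vector of length $m$ with distinct entries. Let $X$ and $Y$ be two multipermutation matrices parameterized by a multiplicity vector $\mathbf{r}$. Further, let $\mathbf{x} = \mathbf{t}X$ and $\mathbf{y} = \mathbf{t}Y$. Then $X = Y$ if and only if $\mathbf{x} = \mathbf{y}$.

Proof First, it is obvious that if $X = Y$ then $\mathbf{x} = \mathbf{y}$. Next, we show that if $\mathbf{x} = \mathbf{y}$ then $X = Y$. We prove this by contradiction. Assume that there exist two multipermutation matrices $X$ and $Y$ such that $X \neq Y$ and $\mathbf{x} = \mathbf{y}$. Then

$$\mathbf{x} - \mathbf{y} = \mathbf{t}(X - Y).$$

Since $X \neq Y$, there exists at least one column $j$ such that $X$ has a $1$ at the $k$-th row and $Y$ has a $1$ at the $l$-th row, where $k \neq l$. Then the $j$-th entry of $\mathbf{t}(X - Y)$ would be $t_k - t_l \neq 0$ because all entries of $\mathbf{t}$ are different. This contradicts the assumption that $\mathbf{x} = \mathbf{y}$.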

At this point, one may wonder why this one-to-one relationship matters. We now discuss three aspects in which having a one-to-one relationship between multipermutations and multipermutation matrices is beneficial.

#### 3.1.1 Reduction in the number of variables

One immediate advantage of using multipermutation matrices is that they require fewer variables to represent multipermutations. (This was the first issue discussed in the Introduction.) The multipermutation matrix corresponding to a length-$n$ multipermutation has size $m \times n$, where $m$ is the number of distinct values in the multipermutation.

This benefit can be significant when the multiplicities are large, i.e., when $m$ is much smaller than $n$. For example, a triple level cell flash memory has $8$ states per cell. If a multipermutation code is of length $n$, then one needs an $8 \times n$ multipermutation matrix to represent a codeword. The corresponding permutation matrix has size $n \times n$.

#### 3.1.2 Analyzing codes via binary matrices

Also due to the one-to-one relationship, one can use multipermutation matrices as a proxy for multipermutations. By this we mean that one can analyze properties of multipermutation codes by analyzing the associated multipermutation matrices. First, note that a set of multipermutation matrices maps to a set of multipermutations of the same cardinality. This means that one can determine the size of a multipermutation code by characterizing the cardinality of the corresponding set of multipermutation matrices. As an example, in Section 4.3, we analyze the average cardinality of two random coding ensembles. Second, one can determine distance properties of a multipermutation code via the set of multipermutation matrices. One such example is the Hamming distance, which is demonstrated in detail in the following.

#### 3.1.3 The Hamming distance

The Hamming distance between two multipermutations is defined as the number of entries in which the two vectors differ from each other. More formally, let $\mathbf{x}$ and $\mathbf{y}$ be two multipermutations; then $d_H(\mathbf{x}, \mathbf{y}) = |\{j \mid x_j \neq y_j\}|$. Due to Lemma 1, we can express the Hamming distance between two multipermutations using their corresponding multipermutation matrices.

###### Lemma 2

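Let $X$ and $Y$ be two multipermutation matrices, and let $\mathbf{t}$ be an initial vector with distinct entries. With a small abuse of notation, denote by $d_H(X, Y)$ the Hamming distance between the two matrices, which is defined by $d_H(X, Y) = |\{(i, j) \mid X_{ij} \neq Y_{ij}\}|$. Then

$$d_H(X, Y) = 2\, d_H(\mathbf{x}, \mathbf{y}),$$

where $\mathbf{x} = \mathbf{t}X$ and $\mathbf{y} = \mathbf{t}Y$. Furthermore,

$$d_H(\mathbf{x}, \mathbf{y}) = \mathrm{tr}(X^T(E - Y)),$$

where $\mathrm{tr}(\cdot)$ represents the trace of the matrix and $E$ is an $m \times n$ matrix with all entries equal to $1$. (We note that $\mathrm{tr}(X^T Y)$ is the Frobenius inner product of two equal-sized matrices.)

Proof For all $j$ such that $x_j \neq y_j$, the $j$-th column of $X$ differs from the $j$-th column of $Y$ in two entries. As a result, the distance between multipermutation matrices is double the distance between the corresponding multipermutations.

Next, $\mathrm{tr}(X^T(E - Y)) = \sum_{j=1}^{n} \sum_{i=1}^{m} X_{ij}(1 - Y_{ij})$. If $x_j = y_j$ then $\sum_{i} X_{ij}(1 - Y_{ij}) = 0$. Otherwise $\sum_{i} X_{ij}(1 - Y_{ij}) = 1$. Therefore the trace counts the columns in which $X$ and $Y$ disagree, i.e., $\mathrm{tr}(X^T(E - Y)) = d_H(\mathbf{x}, \mathbf{y})$.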
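The relations among the three quantities in Lemma 2 can be checked exhaustively on a small case. The sketch below assumes the multiplicity vector `(2, 2, 2)` and initial vector `(1, 2, 3)` for illustration; all function names are hypothetical. It compares the vector Hamming distance, the entrywise matrix Hamming distance, and the trace expression for every pair of multipermutations.

```python
from itertools import permutations

def to_matrix(x, t):
    """X[i][j] = 1 iff x[j] == t[i]."""
    return [[1 if xj == ti else 0 for xj in x] for ti in t]

def hamming_vec(x, y):
    return sum(a != b for a, b in zip(x, y))

def hamming_mat(X, Y):
    return sum(a != b for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def trace_form(X, Y):
    """tr(X^T (E - Y)) = sum over all entries of X[i][j] * (1 - Y[i][j])."""
    return sum(a * (1 - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

t = (1, 2, 3)                            # assumed initial vector
base = (1, 1, 2, 2, 3, 3)                # multiset for r = (2, 2, 2)
multiperms = sorted(set(permutations(base)))

all_ok = True
for x in multiperms:
    X = to_matrix(x, t)
    for y in multiperms:
        Y = to_matrix(y, t)
        d = hamming_vec(x, y)
        # Matrix distance is twice the vector distance; the trace equals the vector distance.
        if hamming_mat(X, Y) != 2 * d or trace_form(X, Y) != d:
            all_ok = False
print(len(multiperms), all_ok)
```

On this example the trace expression equals the vector distance, which is half of the entrywise matrix distance.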

The above two points relate to the second issue regarding the redundancy in the representation of [16], as was discussed in the Introduction.

### 3.2 Geometry of multipermutation matrices

In this section we prove an important theorem that characterizes the convex hull of all multipermutation matrices. As background, we first review the definition of doubly stochastic matrices. Then, we state the Birkhoff-von Neumann theorem for permutation matrices. Finally, we build on the Birkhoff-von Neumann theorem to prove our theorem. We refer readers to [22] and references therein for more material on doubly stochastic matrices and related topics.

###### Definition 4

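An $n \times n$ matrix $P$ is doubly stochastic if

• $P_{ij} \ge 0$ for all $i$ and $j$;

• $\sum_{i=1}^{n} P_{ij} = 1$ for all $j$ and $\sum_{j=1}^{n} P_{ij} = 1$ for all $i$.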

The set of all doubly stochastic matrices is called the Birkhoff polytope. There is a close relationship between the Birkhoff polytope and the set of permutation matrices as the following theorem formalizes:

###### Theorem 1

(Birkhoff-von Neumann Theorem, cf. [22]) The permutation matrices constitute the extreme points of the set of doubly stochastic matrices. Moreover, the set of doubly stochastic matrices is the convex hull of the permutation matrices.

This theorem is the basis for the decoding problem formulated in [16]. Namely, the LP relaxation for codes defined by Definition 2 is based on the Birkhoff polytope. In order to formulate LP decoding problems using multipermutation matrices, we need a similar theorem that characterizes the convex hull of multipermutation matrices.

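Denote by $\mathbb{M}(\mathbf{r})$ the convex hull of all multipermutation matrices parameterized by $\mathbf{r}$, i.e., $\mathbb{M}(\mathbf{r}) = \mathrm{conv}(\mathcal{M}(\mathbf{r}))$. Then, $\mathbb{M}(\mathbf{r})$ is characterized by the following theorem.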

###### Theorem 2

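Let $n = \sum_{i=1}^{m} r_i$ and let $Z$ be an $m \times n$ real matrix such that

• $\sum_{i=1}^{m} Z_{ij} = 1$ for all $j = 1, \dots, n$.

• $\sum_{j=1}^{n} Z_{ij} = r_i$ for all $i = 1, \dots, m$.

• $Z_{ij} \in [0, 1]$ for all $i$ and $j$.

Then, $Z$ is a convex combination of multipermutation matrices parameterized by $\mathbf{r}$. Conversely, any convex combination of multipermutation matrices parameterized by $\mathbf{r}$ satisfies the above conditions.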

Proof Consider a multipermutation $\mathbf{x}$ where each $x_j \in \{1, \dots, m\}$. Without loss of generality, we assume that $\mathbf{x}$ is in increasing order. We denote by $I_i$ the index set for the $i$-th symbol, i.e.,

$$I_i := \left\{ \sum_{l=1}^{i-1} r_l + 1, \dots, \sum_{l=1}^{i} r_l \right\}. \tag{3.1}$$

Then $x_j = i$ if $j \in I_i$. Let $X$ be the corresponding multipermutation matrix. Then $X$ has a block-staircase form in which $X_{ij} = 1$ if $j \in I_i$ and $X_{ij} = 0$ otherwise; that is, row $i$ consists of $r_i$ consecutive ones occupying the columns in $I_i$.

Note that all multipermutation matrices parameterized by a fixed $\mathbf{r}$ are column permutations of each other. Of course, as already pointed out, not all distinct permutations of columns yield distinct multipermutation matrices. To show that any $Z$ satisfying the three conditions of the theorem is a convex combination of multipermutation matrices, we show that there exists an $n \times n$ doubly stochastic matrix $Q$ such that $Z = XQ$. Then by Theorem 1, $Q$ can be expressed as a convex combination of permutation matrices. In other words, $Q = \sum_h \alpha_h P_h$ where $P_h \in \Pi_n$ are permutation matrices, $\alpha_h \ge 0$ for all $h$, and $\sum_h \alpha_h = 1$. Then we have

$$Z = X \sum_h \alpha_h P_h = \sum_h \alpha_h (X P_h),$$

where $X P_h$ is a column-permuted version of the matrix $X$, which is a multipermutation matrix of multiplicity $\mathbf{r}$. This implies that $Z$ is a convex combination of multipermutation matrices.

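We construct the required matrix $Q$ in the following way. For each $i$, let $\mathbf{q}_i$ be a length-$n$ column vector with entries $q_{ij} = Z_{ij}/r_i$ for $j = 1, \dots, n$. Then the matrix $Q$ is defined through

$$Q^T := [\,\underbrace{\mathbf{q}_1 | \mathbf{q}_1 | \dots}_{r_1 \text{ of them}} \,|\, \dots \,|\, \underbrace{\mathbf{q}_i | \mathbf{q}_i | \dots}_{r_i \text{ of them}} \,|\, \dots \,|\, \underbrace{\mathbf{q}_m | \mathbf{q}_m | \dots}_{r_m \text{ of them}}\,]. \tag{3.2}$$

In other words, $Q_{kj} = Z_{ij}/r_i$ for all $k \in I_i$ and $j = 1, \dots, n$. We now verify that $Z = XQ$ and that $Q$ is doubly stochastic, which by our discussion above implies that $Z$ is a convex combination of column-wise permutations of $X$.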

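1. To verify $Z = XQ$, we need to show that $Z_{ij} = \sum_{k=1}^{n} X_{ik} Q_{kj}$. Since $X$ is a binary matrix,

$$\sum_{k=1}^{n} X_{ik} Q_{kj} = \sum_{k : X_{ik} = 1} Q_{kj}.$$

In addition, since $\mathbf{x}$ is sorted, $X_{ik} = 1$ if and only if $k \in I_i$. By the definition of $Q$, $Q_{kj} = Z_{ij}/r_i$ for all $k \in I_i$. Therefore

$$\sum_{k : X_{ik} = 1} Q_{kj} = r_i \frac{Z_{ij}}{r_i} = Z_{ij}.$$

2. Next we verify that $Q$ is a doubly stochastic matrix. Since $Z_{ij} \ge 0$ for all $i, j$, we have $Q_{kj} \ge 0$ for all $k, j$. By the definition of $Q$, the sum of each row is $\frac{1}{r_i} \sum_{j=1}^{n} Z_{ij}$ for some $i$. Thus $\sum_{j=1}^{n} Q_{kj} = 1$ by the row-sum condition $\sum_{j=1}^{n} Z_{ij} = r_i$. The sum of each column is

$$\sum_{k=1}^{n} Q_{kj} = \sum_{i=1}^{m} \sum_{k \in I_i} Q_{kj} = \sum_{i=1}^{m} \sum_{k \in I_i} \frac{1}{r_i} Z_{ij} = \sum_{i=1}^{m} Z_{ij} = 1,$$

where the last equality is due to the column-sum condition $\sum_{i=1}^{m} Z_{ij} = 1$.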

To summarize, for any given real matrix $Z$ satisfying the three conditions of the theorem, we can find a doubly stochastic matrix $Q$ such that $Z = XQ$ for a particular multipermutation matrix $X$. This implies that $Z$ is a convex combination of multipermutation matrices.

The converse is easy to verify by the definition of convex combinations and therefore is omitted.
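The construction in the proof can be replayed numerically. The following sketch, which assumes the small multiplicity vector `(2, 1)` and an arbitrarily chosen fractional matrix `Z`, builds the sorted multipermutation matrix, forms the matrix with rows $Z_{ij}/r_i$ repeated $r_i$ times, and checks both the factorization and double stochasticity using exact rational arithmetic. All names are illustrative.

```python
from fractions import Fraction

def sorted_matrix(r):
    """Multipermutation matrix of the sorted multipermutation: row i has ones on I_i."""
    n = sum(r)
    X = [[0] * n for _ in r]
    start = 0
    for i, ri in enumerate(r):
        for j in range(start, start + ri):
            X[i][j] = 1
        start += ri
    return X

def build_q(Z, r):
    """Stack r_i copies of the row (Z[i][j] / r_i)_j for each i, giving an n-by-n matrix."""
    Q = []
    for i, ri in enumerate(r):
        row = [Fraction(z) / ri for z in Z[i]]
        Q.extend([list(row) for _ in range(ri)])
    return Q

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

r = (2, 1)  # assumed multiplicity vector, so m = 2 and n = 3
half = Fraction(1, 2)
# Z is the average of the matrices of (1, 1, 2) and (2, 1, 1); it satisfies the three conditions.
Z = [[half, Fraction(1), half],
     [half, Fraction(0), half]]

X = sorted_matrix(r)
Q = build_q(Z, r)
row_sums = [sum(row) for row in Q]
col_sums = [sum(col) for col in zip(*Q)]
print(matmul(X, Q) == Z, row_sums, col_sums)
```

Decomposing `Q` into permutation matrices (the Birkhoff-von Neumann step) is not implemented here; the sketch only verifies the part of the argument that is specific to multipermutation matrices.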

## 4 LP-decodable multipermutation code

### 4.1 Constructing codes using linearly constrained multipermutation matrices

Using multipermutation matrices as defined in Definition 3, we define the set of linearly constrained multipermutation matrices analogous to that in [16]. (The analogy is in the following sense: one can obtain the definitions in this section by restating the definitions in [16] using multipermutations and the convex hull $\mathbb{M}(\mathbf{r})$.)

###### Definition 5

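Let $\mathbf{r}$ be a length-$m$ multiplicity vector, and $n = \sum_{i=1}^{m} r_i$. Let $K$ be a positive integer. Assume that $A$ is a $K \times mn$ matrix, that $\mathbf{b}$ is a length-$K$ column vector, and let "$\trianglelefteq$" represent a vector of "$\le$" or "$=$" relations. A set of linearly constrained multipermutation matrices is defined as

$$\Pi^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq) := \{X \in \mathcal{M}(\mathbf{r}) \mid A\,\mathrm{vec}(X) \trianglelefteq \mathbf{b}\}, \tag{4.1}$$

where $\mathcal{M}(\mathbf{r})$ is the set of all multipermutation matrices parameterized by $\mathbf{r}$.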

###### Definition 6

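Let $\mathbf{r}$ be a length-$m$ multiplicity vector, and $n = \sum_{i=1}^{m} r_i$. Let $K$ be a positive integer. Assume that $A$ is a $K \times mn$ matrix, that $\mathbf{b}$ is a length-$K$ column vector, and let "$\trianglelefteq$" represent a vector of "$\le$" or "$=$" relations. Suppose also that $\mathbf{t} \in \mathbb{R}^m$ is given. The set of vectors given by

$$\Lambda^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq, \mathbf{t}) := \{\mathbf{t}X \in \mathbb{R}^n \mid X \in \Pi^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)\} \tag{4.2}$$

is called an LP-decodable multipermutation code.

We can relax the integer constraints and form a code polytope. Recall that $\mathbb{M}(\mathbf{r})$ is the convex hull of all multipermutation matrices parameterized by $\mathbf{r}$.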

###### Definition 7

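The polytope defined by

$$\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq) := \mathbb{M}(\mathbf{r}) \cap \{X \in \mathbb{R}^{m \times n} \mid A\,\mathrm{vec}(X) \trianglelefteq \mathbf{b}\}$$

is called the "code polytope". We note that $\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$ is a polytope because it is the intersection of a polytope with a polyhedron defined by finitely many linear constraints.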

Regarding the above definitions, we discuss some key ingredients.

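• Definition 5 defines the set of multipermutation matrices. Due to Lemma 1, this set uniquely determines a set of multipermutations. The actual codeword that is transmitted (or stored in a memory system) is also determined by the initial vector $\mathbf{t}$, which depends on the modulation scheme used in the system. Definition 6 is the set of codewords determined by $\Pi^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$ once the initial vector is taken into account.

• Definition 7 is useful for decoding. It will be discussed in detail in Section 5. As a preview, in Section 5, we will formulate two optimization problems with variables constrained by the code polytope $\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$. In both optimizations, the objective functions will be related to the initial vector $\mathbf{t}$ but the constraints will only be a function of $\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$. We emphasize that $\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$ is not parameterized by $\mathbf{t}$.

• $\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$ is defined as an intersection of $\mathbb{M}(\mathbf{r})$ with linear constraints. It is not defined as the convex hull of $\Pi^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$, which is usually hard to describe. However, the intersection that defines $\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$ may introduce fractional vertices, i.e., vertices $X$ such that $X \notin \{0, 1\}^{m \times n}$. Because of this, we call $\mathbb{P}^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq)$ a relaxation of $\mathrm{conv}(\Pi^{\mathsf{M}}(\mathbf{r}, A, \mathbf{b}, \trianglelefteq))$.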

To better explore the structure of LP-decodable multipermutation codes, we now define two specific types of linear constraints.

###### Definition 8

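Fixed-at-zero constraints: Let $\mathcal{Z}$ be a set of entries $(i, j)$. A code with a set of fixed-at-zero constraints is defined by both $X \in \mathcal{M}(\mathbf{r})$ and $X_{ij} = 0$ for all $(i, j) \in \mathcal{Z}$.

Fixed-at-equality constraints: Let $\mathcal{E}$ be a set of entry pairs $((i, j), (k, l))$. A code with a set of fixed-at-equality constraints is defined by both $X \in \mathcal{M}(\mathbf{r})$ and $X_{ij} = X_{kl}$ for all $((i, j), (k, l)) \in \mathcal{E}$.

These two types of constraints can be combined.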

We consider these two types of constraints because they are useful to define LP-decodable multipermutation and permutation codes, and because we can develop efficient decoding algorithms for these codes. For example, the pure “involution” code introduced in [16] is constructed by combining both constraints. In Section 4.2, we discuss two codes constructed using fixed-at-zero constraints. In Section 4.3, we show random coding results for these two types of codes. Last but not least, we show how to decode codes with fixed-at-zero, fixed-at-equality, or both constraints using ADMM in Section 6.2.

#### Remarks

A natural question to ask is whether the restriction to linear constraints reduces the space of possible code designs. In our previous paper [30], we show that the answer to this question is "No." This follows because it is possible to define an arbitrary codebook using linear constraints. As we show more formally in [30], one can add one linear constraint for each non-codeword, where the linear constraint requires the Hamming distance between any codeword and that non-codeword to be at least $1$. However, this approach leads to an exponential growth in the number of linear constraints. Thus, the interesting and challenging question is how to construct good codes (in terms of rate and error performance) that can be described efficiently using linear constraints. A related question that we study in [30] connects the description of LP-decodable permutation codes (as defined in Definition 2) to that of LP-decodable multipermutation codes (as defined in Definition 6). We show that codes described by Definition 6 can be restated using Definition 2 with the same number of linear constraints (the same "$K$" in Definitions 2 and 6). We refer readers to [30] for the details of these results.

### 4.2 Examples of LP-decodable multipermutation codes

We provide two examples of codes using Definition 6.

###### Example 2 (Derangement)

A permutation $\pi$ is termed a "derangement" if $\pi_i \neq i$ for all $i$. For multipermutations, we define a generalized notion of derangement as follows. Let

$$\boldsymbol{\imath} = (\underbrace{1, 1, \dots, 1}_{r_1}, \underbrace{2, 2, \dots, 2}_{r_2}, \dots, \underbrace{m, m, \dots, m}_{r_m}).$$

Let $\mathbf{x}$ be a multipermutation obtained by permuting $\boldsymbol{\imath}$. We say that $\mathbf{x}$ is a derangement if $x_j \neq \imath_j$ for all $j$.

In [16], the authors use Definition 2 to define the set of derangements by letting $\mathrm{tr}(P) = 0$, where $P$ is a permutation matrix. We now extend this construction using Definition 6 and let the linear constraints on the multipermutation matrix $X$ be

$$\sum_{j \in I_i} X_{ij} = 0 \quad \text{for all } i = 1, \dots, m, \tag{4.3}$$

where $I_i$ is defined by (3.1). Suppose the initial vector is $\mathbf{t} = (1, 2, \dots, m)$; then (4.3) implies that symbol $i$ cannot appear at the positions in $I_i$. For example, let $\mathbf{t} = (1, 2, 3)$ and $\mathbf{r} = (2, 2, 2)$. Then the allowed derangements that form the codebook are

$$\begin{aligned} &(3,3,1,1,2,2),\ (2,2,3,3,1,1),\ (2,3,1,3,2,1),\ (2,3,1,3,1,2),\ (2,3,3,1,2,1),\\ &(2,3,3,1,1,2),\ (3,2,1,3,2,1),\ (3,2,1,3,1,2),\ (3,2,3,1,2,1),\ (3,2,3,1,1,2). \end{aligned}$$

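The derangement codebook listed above can be reproduced by brute force. The sketch below assumes the multiplicity vector `(2, 2, 2)` and initial vector `(1, 2, 3)`, maps every multipermutation to its multipermutation matrix, and keeps those whose entries on each index set $I_i$ of row $i$ are all zero. Variable and function names are illustrative.

```python
from itertools import permutations

r = (2, 2, 2)   # assumed multiplicity vector
t = (1, 2, 3)   # assumed initial vector
base = tuple(ti for ti, ri in zip(t, r) for _ in range(ri))  # (1, 1, 2, 2, 3, 3)

# Index sets I_i (0-based): consecutive blocks of lengths r_1, ..., r_m.
index_sets, start = [], 0
for ri in r:
    index_sets.append(range(start, start + ri))
    start += ri

def to_matrix(x):
    """X[i][j] = 1 iff x[j] == t[i]."""
    return [[1 if xj == ti else 0 for xj in x] for ti in t]

def is_derangement(X):
    """Fixed-at-zero check: the entries of row i on the columns in I_i sum to zero."""
    return all(sum(X[i][j] for j in index_sets[i]) == 0 for i in range(len(r)))

codebook = sorted(x for x in set(permutations(base)) if is_derangement(to_matrix(x)))
for x in codebook:
    print(x)
```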
###### Example 3

In [23], Shieh and Tsai study multipermutation codes under the Chebyshev distance. The Chebyshev distance between two multipermutations $\mathbf{x}$ and $\mathbf{y}$ is defined as

$$d_\infty(\mathbf{x}, \mathbf{y}) = \max_i |x_i - y_i|. \tag{4.4}$$

We review the Shieh-Tsai (ST) code (cf. [23, Construction 1]) in Definition 9.

###### Definition 9

Let $\mathbf{r} = (r, r, \dots, r)$ be a length-$m$ vector. Let $d$ be an integer such that $d$ divides $m$. We define

$$\mathsf{C}(r, m, d) = \{\mathbf{x} \in \mathsf{M}(\mathbf{r}) \mid \forall i \in \{1, \dots, mr\},\ x_i \equiv i \bmod d\}. \tag{4.5}$$

Although not originally presented that way in [23], it is easy to verify that this code is an LP-decodable multipermutation code defined by fixed-at-zero constraints. The fixed-at-zero constraints are defined by the set $\mathcal{Z} = \{(i, j) \mid i \not\equiv j \bmod d\}$. As a concrete example, let $m = 6$, $r = 2$, and $d = 3$. Then the constraints are

$$\begin{aligned} X_{21} = X_{31} = X_{51} = X_{61} &= 0 \\ X_{12} = X_{32} = X_{42} = X_{62} &= 0 \\ &\ \,\vdots \\ X_{1,12} = X_{2,12} = X_{4,12} = X_{5,12} &= 0. \end{aligned}$$

It is shown in [23] that this code has cardinality $\left( \frac{(n/d)!}{(r!)^{m/d}} \right)^{d}$, where $n = mr$. Further, the minimum Chebyshev distance of this code is $d$. In addition, in the large-parameter regime considered in [23], the rate of the code is observed to be close to a theoretical upper bound on all codes of Chebyshev distance $d$. However, no encoding or decoding algorithms are presented in [23]. We discuss encoding and decoding algorithms for this code in Section 6.
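A small instance of the code in Definition 9 can be enumerated directly. The sketch below assumes the parameters $m = 4$, $r = 2$, $d = 2$ (smaller than the example in the text so that brute force stays cheap), filters all multipermutations by the congruence condition, and computes the code size and the minimum Chebyshev distance. The closed-form count used for comparison follows from arranging each residue class of positions independently.

```python
from itertools import permutations, combinations
from math import factorial

m, r, d = 4, 2, 2  # assumed small parameters; d divides m
base = tuple(sym for sym in range(1, m + 1) for _ in range(r))  # (1, 1, 2, 2, 3, 3, 4, 4)

def in_code(x):
    """Check x_i congruent to i modulo d, with positions i counted from 1."""
    return all((xi - (i + 1)) % d == 0 for i, xi in enumerate(x))

code = sorted({x for x in permutations(base) if in_code(x)})

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

min_dist = min(chebyshev(x, y) for x, y in combinations(code, 2))
# Each of the d residue classes holds n/d positions and m/d symbols of multiplicity r.
count_formula = (factorial(m * r // d) // factorial(r) ** (m // d)) ** d
print(len(code), count_formula, min_dist)
```

Filtering all orderings of the multiset is exponential in the length, so this only serves as a sanity check on toy parameters.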

### 4.3 The random coding ensemble

In this subsection, we study randomly constructed LP-decodable multipermutation codes. We focus on the ensembles generated either by fixed-at-zero constraints or by fixed-at-equality constraints. The randomness comes from choosing the respective constraint sets, or , uniformly at random. Unfortunately, as we show in Appendix 9.2, several results indicate that the ensemble average is not as good as ST codes, which are structured codes belonging to the ensemble. Therefore, we only briefly present our problem formulations and results in the main text and refer readers to Appendix 9 for more details.

We first introduce some additional notation. With a small abuse of notation, we denote by the set of multipermutation matrices constrained by fixed-at-zero constraints. Similarly, denote by the set of multipermutation matrices constrained by fixed-at-equality constraints. Denote by and the respective cardinalities of sets and . Furthermore, denote by the set of all possible choices of that have elements; denote by the set of all possible choices of that have elements.

Note that we do not consider duplicated constraints. In other words, all entries in (or ) are distinct from each other. This differs from the setup in [16, Sec. VI], where the authors allow repeated constraints. Consequently, the cardinalities of both types of constraints, i.e., and , are limited. For example, since there are zeros in a multipermutation matrix, should be less than or equal to ; otherwise the cardinality of the code must be zero (see footnote 5). On the other hand, fixed-at-equality constraints are constructed by entry pairs. There are ways of choosing two entries from a multipermutation matrix. Therefore .

Footnote 5: The cardinality of a code may be zero even when is small, e.g., when fixes a whole column to zero. But for distinct constraints, the code size is zero regardless of how we pick .

###### Lemma 3
$$|\mathcal{S}_z(\kappa)| = \binom{nm}{\kappa}, \qquad |\mathcal{S}_e(\iota)| = \binom{\binom{nm}{2}}{\iota}.$$

Next, we draw (resp. ) uniformly at random from the set (resp. ). As a result, for particular realizations and , and . When taking into account all possible choices, we can show the following lemma for multipermutation matrices.

###### Lemma 4

Consider a fixed multiplicity vector and a fixed multipermutation matrix . Let and be fixed parameters, then

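$$\left|\{\mathcal{Z}\in\mathcal{S}_z(\kappa) \mid X\in\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z})\}\right| = \binom{nm-n}{\kappa}.$$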

Similarly,

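$$\left|\{\mathcal{E}\in\mathcal{S}_e(\iota) \mid X\in\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{E})\}\right| = \binom{\binom{nm-n}{2}+\binom{n}{2}}{\iota}.$$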

Proof See Appendix 9.1.1.

Following the methodology adopted in [16], we prove Proposition 1, which gives the average number of multipermutation matrices that meet a randomly chosen set of either fixed-at-zero or fixed-at-equality constraints. Note that, due to Lemma 1, Proposition 1 in fact gives the average codebook size.

###### Proposition 1

Denote by the cardinality of the code. Then,

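$$\mathbb{E}\left[A(\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z}))\right] = \frac{\binom{nm-n}{\kappa}\,|\mathsf{M}(\boldsymbol{r})|}{|\mathcal{S}_z(\kappa)|},$$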

where the expectation is taken over all possible choices of . Using the same notation,

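$$\mathbb{E}\left[A(\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{E}))\right] = \frac{\binom{\binom{nm-n}{2}+\binom{n}{2}}{\iota}\,|\mathsf{M}(\boldsymbol{r})|}{|\mathcal{S}_e(\iota)|},$$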

where the expectation is taken over all possible choices of . Recall that . Further, and can be calculated using Lemma 3.

Proof See Appendix 9.1.2.
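
The fixed-at-zero half of Proposition 1 can be checked numerically on a toy instance. The sketch below is an illustration under stated assumptions: it takes the initial vector to be $(1,\dots,m)$, so that entry $(i,j)$ of the multipermutation matrix equals one exactly when position $j$ holds symbol $i$, and it uses the standard multinomial count $n!/\prod_i r_i!$ for the number of multipermutations. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial, prod


def multipermutations(r):
    """All distinct orderings of the multiset with r[i-1] copies of symbol i."""
    base = [s for s, mult in enumerate(r, start=1) for _ in range(mult)]
    return sorted(set(permutations(base)))


def expected_size_formula(r, kappa):
    """Closed form of Proposition 1 for fixed-at-zero constraints."""
    m, n = len(r), sum(r)
    num_words = factorial(n) // prod(factorial(ri) for ri in r)
    return Fraction(comb(n * m - n, kappa) * num_words, comb(n * m, kappa))


def expected_size_bruteforce(r, kappa):
    """Average codebook size over every set of kappa distinct zero constraints."""
    m, n = len(r), sum(r)
    words = multipermutations(r)
    entries = [(i, j) for i in range(1, m + 1) for j in range(n)]
    total, count = 0, 0
    for Z in combinations(entries, kappa):
        # X_ij = 0 means that position j does not hold symbol i.
        total += sum(all(x[j] != i for (i, j) in Z) for x in words)
        count += 1
    return Fraction(total, count)


print(expected_size_formula((2, 1), 2), expected_size_bruteforce((2, 1), 2))
```

Exact rational arithmetic via `Fraction` avoids floating-point comparison issues; the brute-force average grows combinatorially and is only feasible for very small $n$, $m$ and $\kappa$.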

We now study the distance properties of these codes. We are particularly interested in the Chebyshev distance of $r$-regular multipermutations, because we can then directly compare our results to the distance properties of ST codes. Let $y$ be a fixed multipermutation that may or may not be a codeword. Following the terminology used in [16], we refer to $y$ as the “origin” multipermutation; we consider the Chebyshev distance from the fixed $y$ to other codewords. We use as the initial vector.

###### Proposition 2

Let and define

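$$L_d(\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z})) := \left|\{X\in\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z}) \mid d_\infty(tX,y)\le d\}\right|.$$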

Then,

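$$\frac{\binom{mn-n}{\kappa}}{|\mathcal{S}_z(\kappa)|}\cdot\frac{(2dr+r)^n\, n!}{2^{2dr}\, n^n\, (r!)^m} \;\le\; \mathbb{E}\left[L_d(\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z}))\right] \;\le\; \frac{\binom{mn-n}{\kappa}}{|\mathcal{S}_z(\kappa)|}\cdot\frac{[(2dr+r)!]^{\frac{n}{2dr+r}}}{(r!)^m} \tag{4.6}$$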

where the expectation is taken over all possible choices of . Using the same notation,

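$$\frac{\binom{\binom{nm-n}{2}+\binom{n}{2}}{\iota}}{|\mathcal{S}_e(\iota)|}\cdot\frac{(2dr+r)^n\, n!}{2^{2dr}\, n^n\, (r!)^m} \;\le\; \mathbb{E}\left[L_d(\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{E}))\right] \;\le\; \frac{\binom{\binom{nm-n}{2}+\binom{n}{2}}{\iota}}{|\mathcal{S}_e(\iota)|}\cdot\frac{[(2dr+r)!]^{\frac{n}{2dr+r}}}{(r!)^m} \tag{4.7}$$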

where the expectation is taken over all possible choices of .

Proof See Appendix 9.1.3.

## 5 Channel model and LP decoding

In the previous section we showed how to construct codes by placing linear constraints on multipermutation matrices. Recall that in Theorem 2 we characterized the convex hull of multipermutation matrices. We now leverage this characterization to develop two linear programming decoding problems. By relaxing the ML decoding integer program, we first formulate an LP decoding problem that is suitable for arbitrary memoryless channels. The objective function of this LP is based on log-likelihood ratios and is analogous to that of LP decoding of non-binary low-density parity-check (LDPC) codes, which was introduced by Flanagan et al. in [31]. If we apply this formulation to the AWGN channel, the resulting problem is analogous to the one developed in [16]. To the best of our knowledge, the second problem we introduce has not appeared in the literature; it can be applied to channels that are not memoryless. In this problem, we relax the minimum Chebyshev distance decoding problem to a linear program by introducing an auxiliary variable.

### 5.1 LP decoding for memoryless channels

We first focus on memoryless channels where is the channel output space. Since the initial vector is assumed to contain distinct entries, the channel input space is . Without loss of generality, we assume that . Let be a codeword from an LP-decodable multipermutation code that is transmitted over a memoryless channel. Let be the received word. Then, . For this channel model, we define a function , where is a length- row vector defined by . Further, we let .

Then, ML decoding can be written as

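$$\begin{aligned}
\hat{x} &= \operatorname*{argmax}_{x\in\Lambda_{\mathsf{M}}(\boldsymbol{r},A,b,\trianglelefteq,t)} P_{\Sigma|S}(y|x)\\
&= \operatorname*{argmax}_{x\in\Lambda_{\mathsf{M}}(\boldsymbol{r},A,b,\trianglelefteq,t)} \sum_{i=1}^{n}\log P_{\Sigma|S}(y_i|x_i)\\
&\overset{(a)}{=} t\left(\operatorname*{argmin}_{X\in\Pi_{\mathsf{M}}(\boldsymbol{r},A,b,\trianglelefteq)} \sum_{i=1}^{n}\gamma(y_i)\,X^C_i\right)\\
&\overset{(b)}{=} t\left(\operatorname*{argmin}_{X\in\Pi_{\mathsf{M}}(\boldsymbol{r},A,b,\trianglelefteq)} \Gamma(y)\,\mathrm{vec}(X)\right),
\end{aligned}$$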

where is the -th column of and the transmitted codeword is . We recall that since is a multipermutation matrix, is a binary column vector with a single non-zero entry. Equality (a) comes from the fact that for each there exists an such that . Further, since , the maximization problem can be transformed to a minimization problem. Equality (b) is simply a change of notation.
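
The chain of equalities above can be illustrated by brute force on a tiny codebook. The sketch below is hedged in several ways: it assumes the per-symbol cost is $\gamma_j(y)=\log\frac{1}{P(y\,|\,t_j)}$, it uses a Gaussian likelihood purely as an example of a memoryless channel, it stacks the columns of $X$ to form $\mathrm{vec}(X)$, and every function name is hypothetical. It searches the codebook exhaustively and does not solve a linear program.

```python
import math
from itertools import permutations


def multipermutation_matrix(x, t):
    """m-by-n binary matrix with X[i][j] = 1 iff x[j] == t[i], so x = t X."""
    return [[1 if xj == ti else 0 for xj in x] for ti in t]


def vec(X):
    """Stack the columns of X into a single list (column-major order)."""
    m, n = len(X), len(X[0])
    return [X[i][j] for j in range(n) for i in range(m)]


def gamma(y_i, t, sigma):
    """Assumed cost vector: log(1 / P(y_i | t_j)) for a Gaussian channel."""
    bias = 0.5 * math.log(2 * math.pi * sigma ** 2)
    return [bias + (y_i - tj) ** 2 / (2 * sigma ** 2) for tj in t]


def decode_via_matrix(y, codebook, t, sigma):
    """Minimize Gamma(y) vec(X) over the matrices of all codewords."""
    Gamma = [g for y_i in y for g in gamma(y_i, t, sigma)]

    def cost(x):
        v = vec(multipermutation_matrix(x, t))
        return sum(g * e for g, e in zip(Gamma, v))

    return min(codebook, key=cost)


def log_likelihood(y, x, sigma):
    """Direct log-likelihood of y given the transmitted word x."""
    bias = 0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(-bias - (yi - xi) ** 2 / (2 * sigma ** 2) for yi, xi in zip(y, x))


t = (1, 2, 3)
codebook = sorted(set(permutations((1, 1, 2, 3))))  # unconstrained toy code
y = (1.2, 2.9, 0.8, 2.1)
x_hat = decode_via_matrix(y, codebook, t, sigma=0.5)
x_ml = max(codebook, key=lambda x: log_likelihood(y, x, sigma=0.5))
print(x_hat, x_ml)
```

Both searches return the same word here, which is what equalities (a) and (b) predict: the linear cost evaluated on the multipermutation matrix is the negative log-likelihood of the corresponding codeword.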

For this problem, we can relax the integer constraints to linear constraints . Then the LP decoding problem is

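$$\begin{aligned}
\text{minimize}\quad & \Gamma(y)\,\mathrm{vec}(X)\\
\text{subject to}\quad & X\in\mathbb{P}_{\mathsf{M}}(\boldsymbol{r},A,b,\trianglelefteq)
\end{aligned}\tag{5.1}$$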
###### Theorem 3

The LP decoding problem (5.1) has an ML certificate. That is, whenever LP decoding (5.1) outputs an integral solution, it is the ML solution.

Proof Suppose that is the solution of the LP decoding problem and is integral. Then is a multipermutation matrix and . Since the relaxation does not add or remove integral vertices, . Since attains the maximum of the ML decoding objective, it is the ML solution.

###### Proposition 3

LP decoding (5.1) is equivalent to ML decoding for LP-decodable multipermutation codes defined by fixed-at-zero constraints (cf. Definition 8).

Proof As before for simplicity, we denote by the code polytope of a multipermutation code subject to only fixed-at-zero constraints. In order to prove the proposition, it is sufficient to show that

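$$\mathbb{P}_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z}) = \operatorname{conv}(\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z})). \tag{5.2}$$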

If (5.2) holds, then the relaxation does not have fractional vertices and hence is tight. By the ML certificate (Theorem 3), LP decoding is thus equivalent to ML decoding. Note that it is easy to verify by Definition 7 that

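$$\mathbb{P}_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z}) \supset \operatorname{conv}(\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z})).$$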

Hence, to complete the proof, we need to show that for all , .

Since , we can express as a convex combination of multipermutation matrices in . In other words,

$$Z = \sum_{h=1}^{|\mathsf{M}(\boldsymbol{r})|} \alpha_h X_h,$$

where $X_h$ are multipermutation matrices, and the set $\{\alpha_h\}$ is a set of convex combination coefficients. We split the sum into two parts:

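$$Z = \sum_{h:\,X_h\in\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z})} \alpha_h X_h + \sum_{h:\,X_h\notin\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z})} \alpha_h X_h.$$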

Since for all , for all such that . This means that for all , which implies that

$$Z = \sum_{h:\,X_h\in\Pi_{\mathsf{M}}(\boldsymbol{r},\mathcal{Z})} \alpha_h X_h.$$

This implies that .

#### 5.1.1 The AWGN channel

In the AWGN channel, , where is the variance of the noise. Thus

$$\Gamma(y) = \phi\cdot\mathbf{1}_{1\times mn} + \theta\big(\underbrace{(y_1-t_1)^2,\dots,(y_1-t_m)^2}_{m},\ \dots,\ \underbrace{(y_n-t_1)^2,\dots,(y_n-t_m)^2}_{m}\big),$$

where $\phi$ is a constant bias and $\theta$ is a common scaling constant. Then

$$\Gamma(y)\,\mathrm{vec}(X) = n\phi + \theta\left(\sum_{i=1}^{n} y_i^2 + \sum_{i=1}^{m} r_i t_i^2 - 2\,u\,\mathrm{vec}(X)\right),$$

where . Thus

 argminXΓ(y)vec(X)=argmax