# A Novel Asymmetric Coded Placement in Combination Networks with end-user Caches

###### Abstract

The tradeoff between the user’s memory size and the worst-case download time in the combination network is studied, where a central server communicates with the users through intermediate relays, and each user has a local cache of size $\mathsf{M}$ files and is connected to a different subset of relays. The main contribution of this paper is the design of a coded caching scheme with asymmetric coded placement that leverages coordination among the relays, which was not exploited in past work. Mathematical analysis and numerical results show that the proposed schemes outperform existing schemes.

## I Introduction

Caching is an effective way to smooth out network traffic: some content is stored in the users’ memories during off-peak times so as to reduce the number of transmissions required during peak-traffic times. A caching scheme includes two phases. In the placement phase, each user stores parts of the content in its cache without knowledge of later demands. If each user directly stores some bits of the files, the placement is said to be uncoded. In the delivery phase, each user requests one file. Based on the users’ demands and cache contents, the server aims to transmit the smallest number of packets that satisfies the users’ demands, whatever those demands are.

Caching was originally studied by Maddah-Ali and Niesen (MAN) in [dvbt2fundamental] for the shared-link network, which comprises a server with $\mathsf{N}$ files, $\mathsf{K}$ users each with a cache of size $\mathsf{M}$ files, and an error-free broadcast link. Coded caching was shown to attain an additional multiplicative gain compared to conventional uncoded caching schemes. For each $\mathsf{M} = t\mathsf{N}/\mathsf{K}$, where $t$ is an integer from $0$ to $\mathsf{K}$, each file is split into $\binom{\mathsf{K}}{t}$ non-overlapping equal-size subfiles that are strategically placed into the user caches. During the delivery phase, coded multicast messages are sent through the shared link so that a single transmission simultaneously serves $t+1$ users. We say that the MAN scheme attains a coded caching gain of $g = t+1$ for $\mathsf{M} = t\mathsf{N}/\mathsf{K}$. A slight variation of the MAN scheme is known to be at most a factor of $2$ from an information theoretical outer bound [yas2].
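The MAN placement and delivery described above can be illustrated on a toy instance. The following is a minimal sketch, assuming the standard MAN construction (subfiles indexed by $t$-subsets of users, one XOR per $(t+1)$-subset); the function names and the use of 16-bit integers as stand-ins for subfiles are illustrative choices, not taken from [dvbt2fundamental].

```python
from itertools import combinations
from math import comb
import random


def man_placement(num_users, t):
    """Index subfiles by t-subsets of users; user k caches subfile T iff k is in T."""
    subsets = list(combinations(range(num_users), t))
    caches = {k: {T for T in subsets if k in T} for k in range(num_users)}
    return subsets, caches


def man_delivery(files, demands, num_users, t):
    """One XOR per (t+1)-subset S: XOR over k in S of subfile (S minus k) of file d_k."""
    messages = {}
    for S in combinations(range(num_users), t + 1):
        coded = 0
        for k in S:
            T = tuple(u for u in S if u != k)
            coded ^= files[demands[k]][T]
        messages[S] = coded
    return messages


def man_decode(user, files, demands, caches, messages):
    """Strip the cached subfiles from every message that involves `user`."""
    recovered = {}
    for S, coded in messages.items():
        if user not in S:
            continue
        value = coded
        for k in S:
            if k == user:
                continue
            T = tuple(u for u in S if u != k)
            assert T in caches[user]  # the interfering subfile is in the user's cache
            value ^= files[demands[k]][T]
        recovered[tuple(u for u in S if u != user)] = value
    return recovered


# Toy instance (assumed values): 4 users, 4 files, t = 2.
num_users, num_files, t = 4, 4, 2
rng = random.Random(0)
subsets, caches = man_placement(num_users, t)
files = [{T: rng.getrandbits(16) for T in subsets} for _ in range(num_files)]
demands = [0, 1, 2, 3]
messages = man_delivery(files, demands, num_users, t)
print("transmissions:", len(messages), "subfiles per file:", len(subsets))
```

Each transmission is useful to $t+1$ users at once, so the number of transmitted subfiles is $\binom{\mathsf{K}}{t+1}$ rather than the $\mathsf{K}\binom{\mathsf{K}-1}{t}$ subfiles an uncoded delivery would send.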

#### Combination networks

In practice, users may communicate with the central server through intermediate relays. Since it is difficult to analyze general relay networks, a symmetric network known as the combination network [cachingincom] has received significant attention recently. A combination network comprises a server with $\mathsf{N}$ files that is connected to $\mathsf{H}$ relays (without caches) through $\mathsf{H}$ orthogonal links, and each of the $\mathsf{K} = \binom{\mathsf{H}}{\mathsf{r}}$ users (with caches of size $\mathsf{M}$ files) is connected to a different subset of $\mathsf{r}$ relays through $\mathsf{r}$ orthogonal links; see Fig. 1. The goal is to design a two-phase caching scheme that minimizes the max-link load, that is, the maximum number of transmissions among all links, which is related to the download time.

Past work can be divided into two groups.

#### Past work for combination networks with uncoded placement

With MAN placement and MAN multicast message generation, the authors in [cachingincom, novelwan2017] proposed various delivery schemes. The scheme in [wan2017novelmulticase] still used MAN placement but proposed a novel way to generate and to deliver the multicast messages by leveraging the symmetries in the network topology. The Placement Delivery Array (PDA), originally proposed in [ontheplacementarrray] to reduce the sub-packetization of the MAN scheme in the shared-link model, has recently been extended in [PDA2017yan] to combination networks; when divides , the scheme achieves the same load as [Zewail2017codedcaching] but with lower sub-packetization and with uncoded placement.

The main limitation of schemes based on MAN placement is that, due to the combination network topology, the “multicast opportunities” (directly related to the overall coded caching gain) to transmit the various subfiles are different across subfiles. Hence, even if the placement is symmetric, the delivery may be asymmetric. Since worst-case performance is of interest here, asymmetric delivery schemes are not desirable and they may actually be suboptimal.

#### Past work for combination networks with coded placement

In [asymmetric2018wan] we showed that coded placement schemes can be strictly better than any possible scheme with uncoded placement. The authors in [Zewail2017codedcaching] proposed a caching scheme where an MDS code is used before (symmetric) placement so that the delivery phase for the combination network is equivalent to the delivery phase of uncoordinated shared-link networks, each serving virtual users.

Our recent results in [asymmetric2018wan] used asymmetric coded placement with an MDS precoding to further reduce the max-link load achieved by [Zewail2017codedcaching] when the cache size is large; the MDS code parameters are not the same in the two papers. The key idea in [asymmetric2018wan] is to let the users decode only those subfiles that can be transmitted with other equal-length subfiles in a single linear combination from a single relay; the main drawback is that when (i.e., and thus the cache size) is small some multicasting opportunities are “overlooked.”

#### Contributions

In this paper we design an asymmetric coded placement so that the delivery by the relays can be “coordinated” (to be made precise later). We also prove that the proposed schemes strictly lower the max-link load compared to [Zewail2017codedcaching] when . Numerical evaluations show that the proposed schemes outperform existing schemes.


## II System Model and Related Results

### II-A Notation

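We shall use the following notation convention in the study of the combination network, where a server with $\mathsf{N}$ files communicates with the $\mathsf{K} = \binom{\mathsf{H}}{\mathsf{r}}$ users through $\mathsf{H}$ intermediate relays, and each user has a local cache of size $\mathsf{M}$ files and is connected to a different subset of $\mathsf{r}$ relays. We let

$$\mathsf{K}_i := \binom{\mathsf{H}-i}{\mathsf{r}-i}, \quad i \in [0:\mathsf{r}], \tag{1}$$

where $\mathsf{K}_0 = \mathsf{K}$ is the number of users in the system, $\mathsf{K}_1$ is the number of users connected to each relay, and $\mathsf{K}_i$ represents the number of users that are simultaneously connected to $i$ relays. Our convention is that $\binom{x}{y} = 0$ if $x < 0$ or $y < 0$ or $x < y$.

The subset of users connected to relay $h$ is denoted by $\mathcal{U}_h$, and the subset of relays connected to user $k$ by $\mathcal{H}_k$. For a subset of users $\mathcal{W}$, the set of relays simultaneously connected to all the users in $\mathcal{W}$ is denoted by

$$\mathcal{H}_{\mathcal{W}} := \bigcap_{k \in \mathcal{W}} \mathcal{H}_k. \tag{2}$$

For a subset of relays $\mathcal{J}$, the set of users who are simultaneously connected to all the relays in $\mathcal{J}$ is denoted by

$$\mathcal{U}_{\mathcal{J}} := \bigcap_{h \in \mathcal{J}} \mathcal{U}_h. \tag{3}$$

Note that $|\mathcal{U}_{\mathcal{J}}| = \mathsf{K}_{|\mathcal{J}|}$. For a given integer $t$, the collection of $t$-subsets of users for which there exists at least one relay connected to all the users in the subset is denoted by

$$\mathcal{Z}_t := \big\{ \mathcal{W} \subseteq [\mathsf{K}] : |\mathcal{W}| = t, \ \mathcal{H}_{\mathcal{W}} \neq \emptyset \big\}. \tag{4}$$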

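By the inclusion-exclusion principle [combinatorics, Theorem 10.1], applied to the union over the relays of the $t$-subsets of users connected to each relay, and since any $a$ relays have exactly $\mathsf{K}_a$ users in common, where $\mathsf{H}$ is the number of relays and $\mathsf{r}$ the number of relays connected to each user,

$$|\mathcal{Z}_t| = \sum_{a=1}^{\mathsf{r}} (-1)^{a-1} \binom{\mathsf{H}}{a} \binom{\mathsf{K}_a}{t}, \tag{5}$$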

and moreover, from the definition of in (1), we have

(6) |

For the network in Fig. 1, we have

and thus, for instance, , , and contains all the -subsets of , while contains all the -subsets of with the exception of .
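The set definitions in this subsection can be checked by brute force on small networks. The sketch below is illustrative only: it writes `H` for the number of relays and `r` for the number of relays connected to each user, numbers the users in lexicographic order of their relay subsets (an assumption that may not match the labeling in Fig. 1), and compares the enumerated collection of $t$-subsets sharing a relay against an inclusion-exclusion count.

```python
from itertools import combinations
from math import comb


def build_network(H, r):
    """One user per r-subset of relays {1..H}; users numbered 1..K lexicographically."""
    relays_of_user = {
        k: set(c) for k, c in enumerate(combinations(range(1, H + 1), r), start=1)
    }
    users_of_relay = {
        h: {k for k, rel in relays_of_user.items() if h in rel}
        for h in range(1, H + 1)
    }
    return relays_of_user, users_of_relay


def num_users_sharing(H, r, i):
    """Number of users simultaneously connected to i given relays."""
    return comb(H - i, r - i) if 0 <= i <= r else 0


def subsets_sharing_a_relay(relays_of_user, t):
    """All t-subsets of users (t >= 1) with at least one common relay."""
    users = sorted(relays_of_user)
    return [
        W for W in combinations(users, t)
        if set.intersection(*(relays_of_user[k] for k in W))
    ]


def count_by_inclusion_exclusion(H, r, t):
    """Inclusion-exclusion over the number a of relays shared by the subset."""
    return sum(
        (-1) ** (a - 1) * comb(H, a) * comb(num_users_sharing(H, r, a), t)
        for a in range(1, r + 1)
    )


relays_of_user, users_of_relay = build_network(4, 2)
pairs = subsets_sharing_a_relay(relays_of_user, 2)
all_pairs = set(combinations(sorted(relays_of_user), 2))
print("users per relay:", {h: sorted(u) for h, u in users_of_relay.items()})
print("pairs with no common relay:", sorted(all_pairs - set(pairs)))
```

For 4 relays and 2 relays per user, 12 of the 15 user pairs share a relay; the 3 remaining pairs are the users attached to complementary relay subsets.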

Moreover, calligraphic symbols denote sets or collections (i.e., sets of sets), bold symbols denote vectors, and sans-serif symbols denote system parameters. We use $|\cdot|$ to represent the cardinality of a set or the absolute value of a real number; $[a:b] := \{a, a+1, \ldots, b\}$ and $[n] := [1:n]$; $\oplus$ represents bit-wise XOR.

### II-B System Model

In a combination network, a server has files, denoted by , each composed of i.i.d uniformly distributed bits. The server is connected to relays through error-free orthogonal links. The relays are connected to users through error-free orthogonal links. Each user has a local cache of size bits, for , and is connected to a distinct -subset of relays.

In the placement phase, user stores information about the files in its cache of size bits, where . The cache content of user is denoted by ; let . During the delivery phase, user requests file ; the demand vector is revealed to all nodes. Given , the server sends a message of bits to relay . Then, relay transmits a message of bits to user . User must recover its desired file from and with high probability when . The max-link load is

(7) | ||||

(8) | ||||

(9) |

where in (8) is the largest load from the server to the relays, and in (9) is the largest load from the relays to the users.

We say that a scheme with max-link load attains a coded caching gain of if

(10) | ||||

(11) |

By the cut-set bound [cachingincom] we have $g \leq \mathsf{K}_1$ (recall that $\mathsf{K}_1$ is the number of users connected to each relay).

### II-C Caching Scheme in [Zewail2017codedcaching, Theorem 1]

We state here the state-of-the-art scheme in [Zewail2017codedcaching] for the case of no cache at the relays; the scheme uses MDS-based coded placement so that the delivery from each relay is equivalent to that of a shared-link network serving virtual users, where the operations of the virtual shared-link networks are not coordinated. In particular, each file is divided into non-overlapping and equal-length pieces that are encoded by an MDS code. The -th MDS-coded symbol is denoted by and must be delivered by relay to the users in following the MAN scheme [dvbt2fundamental]. This is done as follows.

#### Placement

Fix . The MDS-coded symbol is partitioned into non-overlapping and equal-length subfiles as (recall for all ). There are in total

(12) |

User caches if from all (recall for all users), for a total of

(13) |

#### Delivery

The MAN-like multicast coded message

(14) |

is delivered from the server to relay , who then forwards it to the users in . User , thanks to its cache content and the received multicast coded messages from the relays in , recovers

(15) |

Note that there are

(16) |

multicast coded messages in (14), each of the size of a subfile, that are delivered from the server to the relays.

#### Performance

Each user eventually knows subfiles of its desired file (either cached or delivered), which suffices to recover all the subfiles of its desired file because of the MDS encoding before placement, where , and are defined in (13), (15) and (12), respectively. Since each multicast coded message in (14) is simultaneously useful for users, a coded caching gain of is achieved and the required memory size is

(17) |

where in (17) the factor is the inverse of the rate of the MDS code used before placement.

In general, the MDS code used has parameters because each user must be able to recover subfiles from the available subfiles; therefore, for a scheme where the delivery is symmetric across users and relays, we have

(18) | ||||

(19) | ||||

(20) | ||||

(21) |

where and were defined in (8) and (9), respectively; notice that represents the total number of subfiles decoded by the users and is the number of subfiles actually sent.
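The counting behind the Placement, Delivery and Performance paragraphs can be sanity-checked numerically. The sketch below rests on one reading of the description above, stated here as an assumption: each relay runs an independent MAN scheme over the users attached to it, with subfiles indexed by $(g-1)$-subsets of those users, where $g$ is the target coded caching gain. The names `H` (relays), `r` (relays per user), `g`, and all function and key names are illustrative and are not taken from [Zewail2017codedcaching].

```python
from fractions import Fraction
from math import comb


def baseline_counts(H, r, g):
    """Subfile counts for one MAN scheme per relay with parameter t = g - 1 (assumed reading)."""
    users_per_relay = comb(H - 1, r - 1)
    t = g - 1
    # Each of the H MDS-coded symbols is split into one subfile per t-subset
    # of the users attached to the corresponding relay.
    total_subfiles = H * comb(users_per_relay, t)
    # From each of its r relays, a user caches the subfiles whose index contains it...
    cached_per_user = r * comb(users_per_relay - 1, t - 1) if t >= 1 else 0
    # ...and receives during delivery the ones whose index does not contain it.
    decoded_per_user = r * comb(users_per_relay - 1, t)
    # One multicast message per g-subset of the users attached to a relay.
    multicast_messages = H * comb(users_per_relay, g)
    known = cached_per_user + decoded_per_user
    return {
        "total_subfiles": total_subfiles,
        "cached_per_user": cached_per_user,
        "decoded_per_user": decoded_per_user,
        "multicast_messages": multicast_messages,
        # Fraction of coded subfiles a user ends up knowing (the MDS code rate).
        "mds_rate": Fraction(known, total_subfiles),
        # Cache size in units of files, normalized by the number of files.
        "memory_fraction": Fraction(cached_per_user, known),
    }


counts = baseline_counts(H=4, r=2, g=2)
for name, value in counts.items():
    print(f"{name}: {value}")
```

Under this reading, each user knows a fraction `r/H` of the coded subfiles, which is consistent with the MDS precoding described in the text, and the normalized memory reduces to `(g - 1)` divided by the number of users per relay.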

#### Limitation

In [Zewail2017codedcaching], the operations at the relays are uncoordinated. Indeed, consider the network in Fig. 1 for . The scheme in [Zewail2017codedcaching] uses an MDS code, and the MDS-coded symbols and are treated as two “independent” subfiles if . For example, among the MDS subfiles , , , , and , each of length is , user 1 caches and , which requires . However, and can be treated as a single subfile known / cached by user 1. This observation is key for the design of the novel proposed schemes.

## III Main Result

In this section, we describe the proposed scheme that aims to overcome the limitation of [Zewail2017codedcaching, Theorem 1] as discussed in the previous section. We have:

###### Theorem 1.

For an combination network, a coded caching gain is achievable with a memory requirement of

(22) |

###### Proof:

We aim to achieve coded caching gain . In other words, every multicast coded message sent through the network is simultaneously useful for users and each subfile is cached by at least other users.

#### Placement

We consider the elements of defined in (4), that is, those subsets of users with cardinality (from a ground set of cardinality ) for which there exists at least one relay connected to all of them. We aim to partition each MDS-coded file into

(23) |

equal-length subfiles, i.e., where subfile is cached by the users in . Therefore, each user caches

(24) |

since each subfile is cached by users and all users cache the same number of subfiles. This placement is considered to be asymmetric because not all subfiles for of cardinality are present.

#### Delivery

We should create a multicast coded message similarly to (14) for each subset of users of the form

(25) |

however, only those are such that all users in have at least one common connected relay; in order to have a symmetric delivery scheme from the relays to the users, we aim to deliver only those multicast coded messages for and consider those for as “erased”, i.e., . Therefore, each user eventually decodes

(26) |

More precisely, for each set , we generate the MAN-like multicast message

(27) |

We then divide into non-overlapping and equal-length pieces ; the server transmits to relay , which then forwards it to users in . A user must be able to recover all the subfiles of its desired file from the subfiles that were either cached or received; this is possible if we divide each file into non-overlapping and equal-length pieces and use an MDS code to generate the subfiles before placement, where , and are defined in (24), (26) and (23), respectively.

#### Performance

### III-A Comparison between Theorem 1 and [Zewail2017codedcaching, Theorem 1]

In the following we show that our scheme in Theorem 1 is no worse than the scheme in [Zewail2017codedcaching]. In general we have:

###### Corollary 1.

For an combination network with coded caching gain , with equality if and only if .
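The comparison can be explored numerically on small networks. The sketch below is a reading of the construction in the proof of Theorem 1, not a statement of the theorem itself, and rests on these assumptions: subfiles are indexed by the $(g-1)$-subsets of users that share a relay, one multicast message is sent per $g$-subset of users that share a relay, and the normalized memory equals the number of cached subfiles divided by the number of cached plus decoded subfiles per user. The baseline value `(g - 1) / K1`, with `K1` the number of users per relay, is likewise derived from the per-relay MAN description of Section II-C. All names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def count_sharing_subsets(H, r, t):
    """Number of t-subsets of users (t >= 1) that have at least one relay in common."""
    users = [set(c) for c in combinations(range(H), r)]  # a user is its set of relays
    return sum(1 for W in combinations(users, t) if set.intersection(*W))


def proposed_memory_fraction(H, r, g):
    """Normalized memory of the asymmetric placement for gain g >= 2 (assumed reading)."""
    K = comb(H, r)
    # By symmetry every user belongs to the same number of index sets.
    cached = (g - 1) * count_sharing_subsets(H, r, g - 1) // K
    decoded = g * count_sharing_subsets(H, r, g) // K
    return Fraction(cached, cached + decoded)


def baseline_memory_fraction(H, r, g):
    """Normalized memory of the per-relay MAN baseline for gain g (assumed reading)."""
    return Fraction(g - 1, comb(H - 1, r - 1))


H, r = 4, 2
for g in range(2, comb(H - 1, r - 1) + 1):
    print(
        f"g={g}: proposed M/N = {proposed_memory_fraction(H, r, g)}, "
        f"baseline M/N = {baseline_memory_fraction(H, r, g)}"
    )
```

On the 4-relay, 2-relays-per-user network this gives a strictly smaller memory for the proposed placement at $g = 2$ and the same memory at $g = 3$, in line with the "no worse" claim above.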

### III-B Numerical Results

In Fig. 2, we compare the performance of the proposed schemes to those of the schemes with coded cache placement in [Zewail2017codedcaching] and [asymmetric2018wan]. As an outer bound, we use the same cut-set idea of [cachingincom] (which used the cut-set bound for the shared-link model originally proposed in [dvbt2fundamental]) but with the enhanced cut-set for the shared-link model in [yas2]; we denote this outer bound as . In Fig. 2, we plot the ratios (red line), (blue line), and (magenta dotted line), and where and are the achievable max-link load by the schemes in [asymmetric2018wan] and in [Zewail2017codedcaching], respectively. We plot the ratio of max-link loads as otherwise their difference would not be clearly visible on a small figure. It can be noted from Fig. 2 that the blue curve, which represents our proposed scheme in Theorem 1, is never below one, that is, it is never inferior in performance to the baseline scheme in [Zewail2017codedcaching]; however, it is strictly worse than the performance of our past work in [asymmetric2018wan] for (which is information theoretically optimal for ). Our proposed scheme in Theorem 1 is information theoretically optimal for and has the same max-link load as the scheme in [Zewail2017codedcaching].

From Fig. 2 we observe a general phenomenon: our scheme in Theorem 1 (blue line) improves on the scheme in [Zewail2017codedcaching] for small values of , while our scheme in [asymmetric2018wan] (red line) improves on the scheme in [Zewail2017codedcaching] for large values of . Part of our ongoing work is to design a scheme that combines the advantages of both Theorem 1 and [asymmetric2018wan]. In Corollary 1 we proved that Theorem 1 is equivalent to the scheme in [Zewail2017codedcaching] for ; this suggests that an improved scheme should consider the multicasting coding opportunities for groups of users or more.

Finally, numerical evaluations suggest that the ratio increases as increases. An interesting open question is thus whether any of the known achievable schemes is within a constant factor of a known outer bound.

## IV Conclusions

This paper proposed a novel asymmetric coded cache placement scheme for combination networks with end-user caches, which aims to create multicasting opportunities across relays. The proposed schemes were shown to achieve a max-link load no larger than that of the best scheme known in the literature.

## Acknowledgment

This work was supported in part by NSF 1527059 and Labex DigiCosme.