Coded Caching Schemes with Reduced Subpacketization from Linear Block Codes
Abstract
Coded caching is a technique that generalizes conventional caching and promises significant reductions in traffic over caching networks. However, the basic coded caching scheme requires that each file hosted in the server be partitioned into a large number (i.e., the subpacketization level) of nonoverlapping subfiles. From a practical perspective, this is problematic as it means that prior schemes are only applicable when the size of the files is extremely large. In this work, we propose coded caching schemes based on combinatorial structures called resolvable designs. These structures can be obtained in a natural manner from linear block codes whose generator matrices possess certain rank properties. We obtain several schemes with subpacketization levels substantially lower than the basic scheme at the cost of an increased rate. Depending on the system parameters, our approach allows us to operate at various points on the subpacketization level vs. rate tradeoff.
coded caching, resolvable designs, cyclic codes, subpacketization level
1 Introduction
Caching is a popular technique for facilitating large scale content delivery over the Internet. Traditionally, caching operates by storing popular content closer to the end users. Typically, the cache serves an end user’s file request partially (or sometimes entirely) with the remainder of the content coming from the main server. Prior work in this area [1] demonstrates that allowing coding in the cache and coded transmission from the server (referred to as coded caching) to the end users can allow for significant reductions in the number of bits transmitted from the server to the end users. This is an exciting development given the central role of caching in supporting a significant fraction of Internet traffic. In particular, reference [1] considers a scenario where a single server contains files. The server connects to users over a shared link and each user has a cache that allows it to store fraction of all the files in the server. Coded caching consists of two distinct phases: a placement phase and a delivery phase. In the placement phase, the caches of the users are populated. This phase does not depend on the user demands which are assumed to be arbitrary. In the delivery phase, the server sends a set of coded signals that are broadcast to each user such that each user’s demand is satisfied.
The original work of [1] considered the case of centralized coded caching, where the server decides the content that needs to be placed in the caches of the different users. Subsequent work considered the decentralized case where the users populate their caches by randomly choosing parts of each file while respecting the cache size constraint. Recently, there have been several papers that have examined various facets of coded caching. These include tightening known bounds on the coded caching rate [2, 3], considering issues with respect to decentralized caching [4], explicitly considering popularities of files [5, 6], network topology issues [7, 8] and synchronization issues [9, 10].
In this work, we examine another important aspect of the coded caching problem that is closely tied to its adoption in practice. It is important to note that the huge gains of coded caching require each file to be partitioned into nonoverlapping subfiles of equal size; is referred to as the subpacketization level. It can be observed that for a fixed cache size , grows exponentially with . This can be problematic in practical implementations. For instance, suppose that , with so that with a rate . In this case, it is evident that at the bare minimum, the size of each file has to be at least terabits for leveraging the gains in [1]. It is even worse in practice. The atomic unit of storage on present-day hard drives is a sector of size bytes and the trend in the disk drive industry is to move this to bytes [11]. As a result, the minimum size of each file needs to be much higher than terabits. Therefore, the scheme in [1] is not practical even for moderate values of . Furthermore, even for smaller values of , schemes with low subpacketization levels are desirable. This is because any practical scheme will require each of the subfiles to have some header information that allows for decoding at the end users. When there are a large number of subfiles, the header overhead may be non-negligible. For these same parameters () our proposed approach in this work allows us to obtain, e.g., the following operating points: (i) and , (ii) and , (iii) and . For the first point, it is evident that the subpacketization level drops by over five orders of magnitude with only a very small increase in the rate. Points (ii) and (iii) show that the proposed scheme allows us to operate at various points on the tradeoff between subpacketization level and rate.
The issue of subpacketization was first considered in the work of [12, 13] in the decentralized coded caching setting. In the centralized case it was considered in the work of [14]. They proposed a low subpacketization scheme based on placement delivery arrays. Reference [15] viewed the problem from a hypergraph perspective and presented several classes of coded caching schemes. The work of [16] has recently shown that there exist coded caching schemes where the subpacketization level grows linearly with the number of users ; however, this result only applies when the number of users is very large. We elaborate on related work in Section 2.1.
In this work, we propose low subpacketization level schemes for coded caching. Our proposed schemes leverage the properties of combinatorial structures known as resolvable designs and their natural relationship with linear block codes. Our schemes are applicable for a wide variety of parameter ranges and allow the system designer to tune the subpacketization level and the gain of the system with respect to an uncoded system. We note here that designs have also been used to obtain results in distributed data storage [17] and network coding based function computation in recent work [18, 19].
This paper is organized as follows. Section 2 discusses the background and related work and summarizes the main contributions of our work. Section 3 outlines our proposed scheme. It includes all the constructions and the essential proofs. A central object of study in our work is the class of matrices that satisfy a property that we call the consecutive column property (CCP). Section 4 overviews several constructions of matrices that satisfy this property. Several of the longer and more involved proofs of statements in Sections 3 and 4 appear in the Appendix. In Section 5 we perform an in-depth comparison of our work with existing constructions in the literature. We conclude the paper with a discussion of opportunities for future work in Section 6.
2 Background, Related Work and Summary of Contributions
We consider a scenario where the server has files each of which consists of subfiles. There are users each equipped with a cache of size subfiles. The coded caching scheme is specified by means of the placement scheme and an appropriate delivery scheme for each possible demand pattern. In this work, we use combinatorial designs [20] to specify the placement scheme in the coded caching system.
Definition 1.
A design is a pair such that

is a set of elements called points, and

is a collection of nonempty subsets of called blocks, where each block contains the same number of points.
A design is in one-to-one correspondence with an incidence matrix which is defined as follows.
Definition 2.
The incidence matrix of a design is a binary matrix of dimension , where the rows and columns correspond to the points and blocks respectively. Let and . Then,
It can be observed that the transpose of an incidence matrix also specifies a design. We will refer to this as the transposed design. In this work, we will utilize resolvable designs which are a special class of designs.
Definition 3.
A parallel class in a design is a subset of disjoint blocks from whose union is . A partition of into several parallel classes is called a resolution, and is said to be a resolvable design if has at least one resolution.
For resolvable designs, it follows that each point also appears in the same number of blocks.
Example 1.
Consider a block design specified as follows.
Its incidence matrix is given below.
It can be observed that this design is resolvable with the following parallel classes.
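The resolvability check in Definition 3 is easy to mechanize. The following minimal sketch assumes, purely for illustration, that the design consists of all 2-subsets of a 4-point set, which is consistent with the six users, four subfiles and half-file caches used in the example that follows. The names `incidence_matrix`, `is_parallel_class` and `resolution` are hypothetical helpers, not notation from this paper.

```python
from itertools import combinations

def incidence_matrix(points, blocks):
    """Rows index points, columns index blocks; entry is 1 when the point lies in the block."""
    return [[1 if p in b else 0 for b in blocks] for p in points]

def is_parallel_class(points, cls):
    """A parallel class is a set of pairwise disjoint blocks whose union is the point set."""
    covered = [p for b in cls for p in b]
    return len(covered) == len(set(covered)) and set(covered) == set(points)

# Assumed instance: all 2-subsets of a 4-point set (6 blocks, each of size 2).
points = [1, 2, 3, 4]
blocks = [frozenset(c) for c in combinations(points, 2)]

# Candidate resolution: each block is paired with its complement.
resolution = [
    [frozenset({1, 2}), frozenset({3, 4})],
    [frozenset({1, 3}), frozenset({2, 4})],
    [frozenset({1, 4}), frozenset({2, 3})],
]

# The resolution must consist of parallel classes and use every block exactly once.
is_resolution = (
    all(is_parallel_class(points, cls) for cls in resolution)
    and sorted(map(sorted, blocks)) == sorted(sorted(b) for cls in resolution for b in cls)
)
N_mat = incidence_matrix(points, blocks)
```

In this assumed instance every row of `N_mat` has the same weight, which illustrates the remark that each point of a resolvable design appears in the same number of blocks.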
In the sequel we let denote the set . We emphasize here that the original scheme of [1] can be viewed as an instance of the trivial design. For example, consider the setting when is an integer. Let and . In the scheme of [1], the users are associated with and the subfiles with . User caches subfile for if . The main message of our work is that carefully constructed resolvable designs can be used to obtain coded caching schemes with low subpacketization levels, while retaining much of the rate gains of coded caching. The basic idea is to associate the users with the blocks and the subfiles with the points of the design. The roles of the users and subfiles can also be interchanged by simply working with the transposed design.
Example 2.
Consider the resolvable design from Example 1. The blocks in correspond to six users , , , , , . Each file is partitioned into subfiles which correspond to the four points in . The cache in user , denoted is specified as . For example, .
We note here that the caching scheme is symmetric with respect to the files in the server. Furthermore, each user caches half of each file so that . Suppose that in the delivery phase user requests file where . These demands can be satisfied as follows. We pick three blocks, one each from parallel classes , , and generate the signals transmitted in the delivery phase as follows.
(1)  
The three terms in eq. (1) above correspond to blocks from different parallel classes . This equation has the all-but-one structure that was also exploited in [1], i.e., eq. (1) is such that each user caches all but one of the subfiles participating in the equation. Specifically, user contains and for all . Thus, it can decode subfile that it needs. A similar argument applies to users and . It can be verified that the other three equations also have this property. Thus, at the end of the delivery phase, each user obtains its missing subfiles.
This scheme corresponds to a subpacketization level of and a rate of . In contrast, the scheme of [1] would require a subpacketization level of with a rate of . Thus, it is evident that we gain significantly in terms of the subpacketization while sacrificing some rate gains.
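The delivery argument of Example 2 can be replayed in a few lines. This is a sketch under the assumption that the design is the collection of all 2-subsets of a 4-point set, split into three parallel classes; the variable names are hypothetical. One block is picked from each parallel class, and every choice with an empty common intersection gives one all-but-one equation.

```python
from fractions import Fraction
from itertools import product

points = {1, 2, 3, 4}
# Assumed parallel classes; each block doubles as a user and as that user's cached indices.
classes = [
    [frozenset({1, 2}), frozenset({3, 4})],
    [frozenset({1, 3}), frozenset({2, 4})],
    [frozenset({1, 4}), frozenset({2, 3})],
]

def delivery_equations(classes):
    """Return one dict per equation, mapping each participating user to the subfile it recovers."""
    equations = []
    for choice in product(*classes):
        if frozenset.intersection(*choice):
            continue  # only choices with an empty common intersection are used
        eq = {}
        for user in choice:
            others = [b for b in choice if b != user]
            (missing,) = frozenset.intersection(*others)  # exactly one point here
            eq[user] = missing
        # All-but-one structure: a user misses its own term and caches every other term.
        for user in choice:
            assert eq[user] not in user
            assert all(pt in user for other, pt in eq.items() if other != user)
        equations.append(eq)
    return equations

equations = delivery_equations(classes)
recovered = {}
for eq in equations:
    for user, pt in eq.items():
        recovered.setdefault(user, set()).add(pt)

# Every user should end up with exactly the subfile indices outside its block.
all_served = all(recovered[b] == points - b for cls in classes for b in cls)
# Each equation has the size of one subfile, i.e. 1/len(points) of a file.
rate = Fraction(len(equations), len(points))
```

Under these assumptions the sketch produces four equations over four subfiles, and `all_served` confirms that each user recovers both of its missing subfiles.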
As shown in Example 2, we can obtain a scheme by associating the users with the blocks and the subfiles with the points. In this work, we demonstrate that this basic idea can be significantly generalized and several schemes with low subpacketization levels that continue to leverage much of the rate benefits of coded caching can be obtained.
2.1 Discussion of Related Work
Coded caching has been the subject of much investigation in recent work as discussed briefly earlier on. We now overview existing literature on the topic of low subpacketization schemes for coded caching. In the original paper [1], for given problem parameters (number of users) and (cache fraction), the authors showed that when , the rate equals
when is an integer multiple of . Other points are obtained via memory sharing. Thus, in the regime when is large, the coded caching rate is approximately , which is independent of . Crucially, though, this requires the subpacketization level . It can be observed that for a fixed , grows exponentially with . This is one of the main drawbacks of the original scheme and for reasons outlined in Section 1, deploying this solution in practice may be difficult.
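To make the growth concrete, the sketch below tabulates the subpacketization level and rate of the scheme of [1]. It uses the standard expressions for that scheme, with K users, an integer t = KM/N, subpacketization C(K, t) and rate (K - t)/(1 + t). The symbol names `K` and `t` and the function names are assumptions introduced here for illustration.

```python
from fractions import Fraction
from math import comb

def man_subpacketization(K, t):
    """Number of subfiles per file in the scheme of [1], assuming t = K*M/N is an integer."""
    return comb(K, t)

def man_rate(K, t):
    """Rate K(1 - M/N) / (1 + K*M/N), rewritten in terms of t = K*M/N."""
    return Fraction(K - t, 1 + t)

# Fixed cache fraction M/N = 1/4 and a growing number of users.
table = [(K, man_subpacketization(K, K // 4), man_rate(K, K // 4)) for K in (8, 16, 32, 64)]
```

The rate column stays bounded while the subpacketization column grows roughly exponentially in K, which is the tension the remainder of this paper addresses.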
The subpacketization issue was first discussed in the work of [12, 13] in the context of decentralized caching. Specifically, [13] showed that in the decentralized setting for any subpacketization level such that the rate would scale linearly in , i.e., . Thus, much of the rate benefits of coded caching would be lost if did not scale exponentially in . Following this work, the authors in [14] introduced a technique for designing low subpacketization schemes in the centralized setting which they called placement delivery arrays. In [14], they considered the setting when or and demonstrated a scheme where the subpacketization level was exponentially smaller than that of the original scheme, while the rate was marginally higher. This scheme can be viewed as a special case of our work. We discuss these aspects in more detail in Section 5. In [15], the design of coded caching schemes was achieved through the design of hypergraphs with appropriate properties. In particular, for specific problem parameters, they were able to establish the existence of schemes where the subpacketization scaled as . Reference [21] presented results in this setting by considering strong edge coloring of bipartite graphs.
Very recently, [16] showed the existence of coded caching schemes where the subpacketization grows linearly with the number of users, but the coded caching rate grows as where . Thus, while the rate is not a constant, it does not grow linearly with either. Both [15] and [16] are interesting results that demonstrate the existence of regimes where the subpacketization scales in a manageable manner. Nevertheless, it is to be noted that these results come with several caveats. For example, the result of [16] is only valid in the regime when is very large and is unlikely to be of use for practical values of . The result of [15] has significant restrictions on the number of users, e.g., in their paper, needs to be of the form and .
2.2 Summary of Contributions
In this work, the subpacketization levels we obtain are typically exponentially smaller than those of the original scheme. They still scale exponentially in , albeit with much smaller exponents. Nevertheless, our construction has the advantage of being applicable for a large range of problem parameters. Our specific contributions include the following.

We uncover a simple and natural relationship between a linear block code and a coded caching scheme. We first show that any linear block code over and in some cases (where is not a prime or a prime power) generates a resolvable design. This design in turn specifies a coded caching scheme with users where the cache fraction . A complementary cache fraction point where where is some integer between and can also be obtained. Intermediate points can be obtained by memory sharing between these points.

We consider a class of linear block codes whose generator matrices satisfy a specific rank property. In particular, we require collections of consecutive columns to have certain rank properties. For such codes, we are able to identify an efficient delivery phase and determine the precise coded caching rate. We demonstrate that the subpacketization level is at most whereas the coded caching gain scales as with respect to an uncoded caching scheme. Thus, different choices of allow the system designer significant flexibility to choose the appropriate operating point.

We discuss several constructions of generator matrices that satisfy the required rank property. We characterize the ranges of alphabet sizes over which these matrices can be constructed. If one has a given subpacketization budget in a specific setting, we are able to find a set of schemes that fit the budget while leveraging the rate gains of coded caching.
3 Proposed low subpacketization level scheme
All our constructions of low subpacketization schemes will stem from resolvable designs (cf. Definition 3). Our overall approach is to first show that any linear block code over can be used to obtain a resolvable block design. The placement scheme obtained from this resolvable design is such that . Under certain (mild) conditions on the generator matrix we show that a delivery phase scheme can be designed that allows for a significant rate gain over the uncoded scheme while having a subpacketization level that is significantly lower than [1]. Furthermore, our scheme can be transformed into another scheme that operates at the point . Thus, intermediate values of can be obtained via memory sharing. We also discuss situations under which we can operate over modular arithmetic where is not necessarily a prime or a prime power; this allows us to obtain a larger range of parameters.
3.1 Resolvable Design Construction
Consider a linear block code over . To avoid trivialities we assume that its generator matrix does not have an all-zeros column. We collect its codewords and construct a matrix of size as follows.
(2) 
where the vector represents the th codeword of the code. Let be the point set and be the collection of all subsets for and , where
Using this construction, we can obtain the following result.
Lemma 1.
The construction procedure above results in a design where and for all and . Furthermore, the design is resolvable with parallel classes given by , for .
Proof.
Let , for , , . Note that for , we have
where . Let be such that . Consider the equation
where is fixed. For arbitrary values of , , this equation has a unique solution for , which implies that for any , and that forms a parallel class. ∎
Remark 1.
A generator matrix over where is a prime power can also be considered as a matrix over an extension field where is an integer. Thus, one can obtain a resolvable design in this case as well; the corresponding parameters can be calculated in an easy manner.
Remark 2.
We can also consider linear block codes over where is not necessarily a prime or a prime power. In this case the conditions under which a resolvable design can be obtained by forming the matrix are a little more involved. We discuss this in Lemma 4 in the Appendix.
Example 3.
Consider a linear block code over with generator matrix
Collecting the nine codewords, is constructed as follows.
Using , we generate the resolvable block design where the point set is . For instance, block is obtained by identifying the column indexes of zeros in the first row of , i.e., . Following this, we obtain
It can be observed that has a resolution (cf. Definition 3) with the following parallel classes.
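The construction of Section 3.1 can be sketched as follows: enumerate the codewords, place them as the columns of a matrix, and read off one block per (row, symbol) pair. The generator matrix `G` below is a hypothetical (4, 2) code over the integers modulo 3 chosen only for illustration; it is an assumption, not necessarily the matrix of Example 3. The function names are likewise hypothetical.

```python
from itertools import product

def codewords(G, q):
    """All q^k codewords u*G (mod q) of the code generated by the k x n matrix G."""
    k, n = len(G), len(G[0])
    return [
        tuple(sum(u[r] * G[r][c] for r in range(k)) % q for c in range(n))
        for u in product(range(q), repeat=k)
    ]

def resolvable_design(G, q):
    """classes[i][l] is the block of codeword indices j whose i-th coordinate equals l."""
    words = codewords(G, q)
    n = len(G[0])
    return [
        [frozenset(j for j, w in enumerate(words) if w[i] == l) for l in range(q)]
        for i in range(n)
    ]

# Hypothetical generator matrix with no all-zeros column (assumption for illustration).
G = [[1, 0, 1, 1],
     [0, 1, 1, 2]]
classes = resolvable_design(G, 3)
```

Each of the four coordinates yields one parallel class of three blocks, so this instance has twelve blocks on nine points, matching the counts quoted for Example 3.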
3.2 A special class of linear block codes
We now introduce a special class of linear block codes whose generator matrices satisfy specific rank properties. It turns out that resolvable designs obtained from these codes are especially suited for usage in coded caching.
Consider the generator matrix of a linear block code over . The th column of is denoted by . Let be the least positive integer such that divides (denoted by ). We let denote .
In our construction we will need to consider various collections of consecutive columns of (wraparounds over the boundaries are allowed). For this purpose, let ( is a nonnegative integer) and . Let be the submatrix of specified by the columns in , i.e., is a column in if . Next, we define the consecutive column property that is central to the rest of the discussion.
Definition 4.
consecutive column property. Consider the submatrices of specified by for . We say that satisfies the consecutive column property if all submatrices of each are full rank.
Henceforth, we abbreviate the consecutive column property as CCP.
Example 4.
In Example 3 we have and hence . Thus, and . The corresponding generator matrix satisfies the CCP as any two columns of each of the submatrices are linearly independent over .
We note here that one can also define different levels of the consecutive column property. Let , and is the least positive integer such that .
Definition 5.
consecutive column property Consider the submatrices of specified by for . We say that satisfies the consecutive column property, where if each has full rank. In other words, the columns in each are linearly independent.
As pointed out in the sequel, codes that satisfy the CCP, where will result in caching systems that have a multiplicative rate gain of over an uncoded system. Likewise, codes that satisfy the CCP will have a gain of over an uncoded system. In the remainder of the paper, we will use the term CCP to refer to the CCP if the value of is clear from the context.
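Definition 4 can be checked by brute force for small codes. The sketch below rests on several assumptions that should be read as such: columns are indexed from zero, the a-th index set consists of the k + 1 consecutive column indices starting at a(k + 1) taken modulo n, the alphabet size `p` is prime so that inverses exist, and `G` is a hypothetical generator matrix. All function names are illustrative.

```python
from itertools import combinations
from math import gcd

def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field Z_p via Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse, Python 3.8+
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(x - f * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def consecutive_index_sets(n, k):
    """Index sets of k+1 consecutive columns with wraparound (assumed zero-based)."""
    z = (k + 1) // gcd(n, k + 1)  # least positive z with (k+1) dividing n*z
    return [[(a * (k + 1) + t) % n for t in range(k + 1)] for a in range(n * z // (k + 1))]

def satisfies_ccp(G, p):
    """True if every k x k submatrix of every consecutive-column submatrix has full rank."""
    k, n = len(G), len(G[0])
    for S in consecutive_index_sets(n, k):
        for cols in combinations(S, k):
            sub = [[G[r][c] for c in cols] for r in range(k)]
            if rank_mod_p(sub, p) < k:
                return False
    return True

G = [[1, 0, 1, 1], [0, 1, 1, 2]]  # hypothetical example over Z_3
ok = satisfies_ccp(G, 3)
```

For an alphabet that is a prime power but not a prime, `rank_mod_p` would have to be replaced by arithmetic in the corresponding extension field; that case is outside this sketch.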
3.3 Usage in a coded caching scenario
A resolvable design generated from a linear block code that satisfies the CCP can be used in a coded caching scheme as follows. We associate the users with the blocks. Each subfile is associated with a point and an additional index. The placement scheme follows the natural incidence between the blocks and the points; a formal description is given in Algorithm 3.3 and illustrated further in Example 5.
Input: Resolvable design constructed from a linear block code. Let be the least positive integer such that .
1. Divide each file , for into subfiles. Thus,
2. User for caches
Output: Cache content of user denoted for .
Example 5.
Consider the resolvable design from Example 3, where we recall that . The blocks in correspond to twelve users , , , , , , , , , , , . Each file is partitioned into subfiles, each of which is denoted by , , . The cache in user , denoted is specified as . This corresponds to a coded caching system where each user caches rd of each file so that .
In general, (see Algorithm 3.3) we have users. Each file , is divided into subfiles . A subfile is cached in user where if . Therefore, each user caches a total of subfiles. As each file consists of subfiles, we have that .
It remains to show that we can design a delivery phase scheme that satisfies any possible demand pattern. Suppose that in the delivery phase user requests file where . The server responds by transmitting several equations that satisfy each user. Each equation allows users from different parallel classes to simultaneously obtain a missing subfile. Our delivery scheme is such that the set of transmitted equations can be classified into various recovery sets that correspond to appropriate collections of parallel classes. For example, in Fig. 1, and so on. It turns out that these recovery sets correspond precisely to the sets defined earlier. We illustrate this by means of the example below.
Example 6.
Consider the placement scheme specified in Example 5. Let each user request file . The recovery sets are specified by means of the recovery set bipartite graph shown in Fig. 1, e.g., corresponds to . The outgoing edges from each parallel class are labeled arbitrarily with numbers and . Our delivery scheme is such that each user recovers missing subfiles with a specific superscript from each recovery set that its corresponding parallel class participates in. For instance, a user in parallel class recovers missing subfiles with superscript from , superscript 1 from and superscript 2 from ; these superscripts are the labels of outgoing edges from in the bipartite graph.
It can be verified, e.g., that user which lies in recovers all missing subfiles with superscript from the equations below.
Each of the equations above benefits three users. They are generated simply by choosing from , any block from and the last block from so that the intersection of all these blocks is empty. The fact that these equations are useful for the problem at hand is a consequence of the CCP. The process of generating these equations can be applied to all possible recovery sets. It can be shown that this allows all users to be satisfied at the end of the procedure.
In what follows, we first show that for the recovery set it is possible to generate equations that benefit users simultaneously.
Claim 1.
Consider the resolvable design constructed as described in Section 3.1 by a linear block code that satisfies the CCP. Let for , i.e., it is the subset of parallel classes corresponding to . We emphasize that . Consider blocks (where ) that are picked from any distinct parallel classes of . Then, .
Before proving Claim 1, we discuss its application in the delivery phase. Note that the claim asserts that blocks chosen from distinct parallel classes intersect in precisely one point. Now, suppose that one picks users from distinct parallel classes, such that their intersection is empty. These blocks (equivalently, users) can participate in an equation that benefits users. In particular, each user will recover a missing subfile indexed by the intersection of the other blocks. We emphasize here that Claim 1 is at the core of our delivery phase. Of course, we need to justify that enough equations can be found that allow all users to recover all their missing subfiles. This follows from a natural counting argument that is made more formally in the subsequent discussion. The superscripts are needed for the counting argument to go through.
Proof.
Following the construction in Section 3.1, we note that a block is specified by
Now consider (where ) that are picked from distinct parallel classes of . W.l.o.g. we assume that . Let and denote the submatrix of obtained by retaining the rows in . We will show that the vector is a column in and only appears once.
To see this consider the system of equations in variables .
By the CCP, the vectors are linearly independent. Therefore this system of equations in variables has a unique solution over . The result follows. ∎
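Claim 1 can be sanity-checked numerically on a small instance. The sketch below rebuilds the design from a hypothetical (4, 2) generator matrix over the integers modulo 3, forms the consecutive index sets under an assumed zero-based convention, and records the intersection sizes observed when blocks are picked from distinct parallel classes of one index set. All names are illustrative assumptions.

```python
from itertools import combinations, product
from math import gcd

def design_from_code(G, q):
    """classes[i][l] is the set of codeword indices whose i-th coordinate equals l."""
    k, n = len(G), len(G[0])
    words = [
        tuple(sum(u[r] * G[r][c] for r in range(k)) % q for c in range(n))
        for u in product(range(q), repeat=k)
    ]
    return [
        [frozenset(j for j, w in enumerate(words) if w[i] == l) for l in range(q)]
        for i in range(n)
    ]

def intersection_sizes(classes, S, k_prime):
    """Sizes seen when intersecting k_prime blocks from distinct parallel classes indexed by S."""
    sizes = set()
    for chosen in combinations(S, k_prime):
        for blocks in product(*(classes[i] for i in chosen)):
            sizes.add(len(frozenset.intersection(*blocks)))
    return sizes

G = [[1, 0, 1, 1], [0, 1, 1, 2]]  # hypothetical generator matrix (assumption)
q, k, n = 3, 2, 4
classes = design_from_code(G, q)
z = (k + 1) // gcd(n, k + 1)
index_sets = [[(a * (k + 1) + t) % n for t in range(k + 1)] for a in range(n * z // (k + 1))]
sizes_k = [intersection_sizes(classes, S, k) for S in index_sets]
```

For this instance, any two blocks from distinct parallel classes of the same index set meet in exactly one point, which is the property the delivery phase relies on.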
Algorithm 1
Input: For , . Signal set .
1. While any user does not recover all its missing subfiles with superscript do
2. Pick blocks for all and such that /* Pick blocks from distinct parallel classes in such that their intersection is empty */
3. Let for /* Determine the missing subfile index that the user from will recover */
4. Add signal to /* User demands file . This equation allows it to recover the corresponding missing subfile index . The superscript is determined by the recovery set bipartite graph */
Output: Signal set .
We now provide an intuitive argument for the delivery phase. Recall that we form a recovery set bipartite graph (see Fig. 1 for an example) with parallel classes and recovery sets as the disjoint vertex subsets. The edges incident on each parallel class are labeled arbitrarily from . For a parallel class we denote this label by . For a given recovery set , the delivery phase proceeds by choosing blocks from distinct parallel classes in such that their intersection is empty; this provides an equation that benefits users. It turns out that the equation allows a user in parallel class to recover a missing subfile with the superscript .
The formal argument is made in Algorithm 1. For ease of notation in Algorithm 1, we denote the demand of user for by .
Claim 2.
Consider a user belonging to parallel class . The signals generated in Algorithm 1 can recover all the missing subfiles needed by with superscript .
Proof.
Let . In the arguments below, we argue that user that demands file can recover all its missing subfiles with superscript . Note that . Thus, user needs to obtain missing subfiles with superscript . Consider an iteration of the while loop where block is picked in step 2. The equation in Algorithm 1 allows it to recover where . This is because and because of Claim 1.
Next we count the number of equations that participates in. We can pick users from some distinct parallel classes in . This can be done in ways. Claim 1 ensures that the blocks so chosen intersect in a single point. Next we pick a block from the only remaining parallel class in such that the intersection of all blocks is empty. This can be done in ways. Thus, there are a total of equations in which user participates.
It remains to argue that each equation provides a distinct subfile. Towards this end, let be an index set such that . Suppose that there exist sets of blocks and such that , but . This is a contradiction since this in turn implies that , which is impossible since two blocks from the same parallel class have an empty intersection.
As the algorithm is symmetric with respect to all blocks in parallel classes belonging to , we have the required result. ∎
The overall delivery scheme repeatedly applies Algorithm 1 to each of the recovery sets.
Lemma 2.
The proposed delivery scheme terminates and allows each user’s demand to be satisfied. Furthermore the transmission rate of the server is and the subpacketization level is .
Proof.
See Appendix. ∎
The main requirement for Lemma 2 to hold is that the recovery set bipartite graph be biregular, where multiple edges between the same pair of nodes are disallowed and the degree of each parallel class is . It is not too hard to see that this follows from the definition of the recovery sets (see the proof in the Appendix for details).
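The whole delivery phase can be simulated on a small instance to see the counting argument at work. The following sketch is illustrative only. It assumes a hypothetical (4, 2) generator matrix over the integers modulo 3 that satisfies the CCP, zero-based consecutive index sets as recovery sets, edge labels assigned in order of appearance, and q^k z subfiles per file. It enumerates every equation, records which (subfile index, superscript) pair each user recovers, and reports the resulting rate.

```python
from fractions import Fraction
from itertools import product
from math import gcd

def design_from_code(G, q):
    """classes[i][l] is the set of codeword indices whose i-th coordinate equals l."""
    k, n = len(G), len(G[0])
    words = [tuple(sum(u[r] * G[r][c] for r in range(k)) % q for c in range(n))
             for u in product(range(q), repeat=k)]
    return [[frozenset(j for j, w in enumerate(words) if w[i] == l) for l in range(q)]
            for i in range(n)]

def deliver(G, q):
    k, n = len(G), len(G[0])
    classes = design_from_code(G, q)
    points = frozenset(range(q ** k))
    z = (k + 1) // gcd(n, k + 1)
    recovery_sets = [[(a * (k + 1) + t) % n for t in range(k + 1)]
                     for a in range(n * z // (k + 1))]
    # Label the edges of the recovery set bipartite graph 0..z-1 at each parallel class.
    label, seen = {}, [0] * n
    for a, S in enumerate(recovery_sets):
        for i in S:
            label[(i, a)] = seen[i]
            seen[i] += 1
    recovered = {}  # user (class, symbol) -> set of (subfile index, superscript)
    num_equations = 0
    for a, S in enumerate(recovery_sets):
        for vals in product(range(q), repeat=k + 1):
            blocks = [classes[i][l] for i, l in zip(S, vals)]
            if frozenset.intersection(*blocks):
                continue  # keep only choices with empty common intersection
            num_equations += 1
            for pos, (i, l) in enumerate(zip(S, vals)):
                others = blocks[:pos] + blocks[pos + 1:]
                (pt,) = frozenset.intersection(*others)  # single point by Claim 1
                got = recovered.setdefault((i, l), set())
                assert (pt, label[(i, a)]) not in got  # each subfile is recovered once
                got.add((pt, label[(i, a)]))
    complete = all(
        recovered[(i, l)] == {(pt, s) for pt in points - classes[i][l] for s in range(z)}
        for i in range(n) for l in range(q)
    )
    subpacketization = q ** k * z
    return num_equations, subpacketization, Fraction(num_equations, subpacketization), complete

G = [[1, 0, 1, 1], [0, 1, 1, 2]]  # hypothetical generator matrix (assumption)
result = deliver(G, 3)
```

On this instance the simulation finds 72 equations for 27 subfiles per file, i.e. a rate of 8/3 for twelve users with cache fraction 1/3, and every user recovers each missing subfile exactly once.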
In an analogous manner, if one starts with the generator matrix of a code that satisfies the CCP for , then we can obtain the following result. The details are quite similar to the discussion for the CCP and can be found in the Appendix (Section .2).
Corollary 1.
Consider a coded caching scheme obtained by forming the resolvable design obtained from a code that satisfies the CCP where . Let be the least positive integer such that . Then, a delivery scheme can be constructed such that the transmission rate is and the subpacketization level is .
3.4 Obtaining a scheme for .
The construction above works for a system where . It turns out that this can be converted into a scheme for . Thus, any convex combination of these two points can be obtained by memory sharing.
Towards this end, we note that the class of coded caching schemes considered here can be specified by an equation-subfile matrix. This is inspired by the hypergraph formulation and the placement delivery array (PDA) based schemes for coded caching in [15] and [14]. Each equation is assumed to be of the all-but-one type, i.e., it is of the form where for each , we have the property that user does not cache subfile but caches all subfiles where .
The coded caching system corresponds to a equation-subfile matrix as follows. We associate each row of with an equation and each column with a subfile. We denote the th row of by and th column of by . The value if in the th equation, user recovers subfile , otherwise, . Suppose that these equations allow each user to satisfy their demands, i.e., corresponds to a valid coded caching scheme. It is not too hard to see that the placement scheme can be obtained by examining . Namely, user caches the subfile corresponding to the th column if integer does not appear in the th column.
Example 7.
Consider a coded caching system in [1] with , and . We denote the four users as . Suppose that the equation-subfile matrix for this scheme is as specified below.
Upon examining it is evident for instance that user caches subfiles as the number does not appear in the corresponding columns. Similarly, the cache placement of the other users can be obtained. Interpreting this placement scheme in terms of the user-subfile assignment, it can be verified that the design so obtained corresponds to the transpose of the scheme considered in Example 1 (and also to the scheme of [1] for , ).
Lemma 3.
Consider a equation-subfile matrix whose entries belong to the set . It corresponds to a valid coded caching system if the following three conditions are satisfied.

There is no nonzero integer appearing more than once in each column.

There is no nonzero integer appearing more than once in each row.

If , then .
Proof.
The placement scheme is obtained as discussed earlier, i.e., user caches subfiles if integer does not appear in column . Therefore, matrix corresponds to a placement scheme.
Next we discuss the delivery scheme. Note that corresponds to an equation as follows.
where . The above equation can allow users to recover subfiles simultaneously if (a) does not cache and (b) caches all where . It is evident that does not cache owing to the placement scheme. Next, to guarantee the condition (b), we need to show that integer will not appear in column in where . Towards this end, because of Condition 2. Next, consider the nonzero entries that lie in the column but not in the row . Assume there exists an entry such that and , then , which is a contradiction to Condition 3. Finally, Condition 1 guarantees that each missing subfile is recovered only once. ∎
User caches a fraction where is the number of columns of that do not have the entry . Similarly, the transmission rate is given by .
The crucial point is that the transpose of , i.e., also corresponds to a coded caching scheme. This follows directly from the fact that also satisfies the conditions in Lemma 3. In particular, corresponds to a coded caching system with users and subfiles. In the placement phase, the cache size of is . In the delivery phase, by transmitting equations corresponding to the rows of , all missing subfiles can be recovered. Then, the transmission rate is .
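The transposition argument can be tried out on the scheme of [1] with four users. The sketch below builds an equation-subfile matrix for that scheme (rows indexed by 3-subsets of users, columns by 2-subsets, which is an assumed but standard indexing), checks the conditions of Lemma 3, and repeats the check on the transpose. Condition 3 is implemented under an assumed reading that is consistent with the proof: if the same nonzero integer appears at positions (i, j) and (i', j'), then the entries at (i, j') and (i', j) are both zero. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def man_matrix(K, t):
    """Equation-subfile matrix of the scheme of [1]: in equation T, user u recovers subfile T minus {u}."""
    subfiles = list(combinations(range(1, K + 1), t))
    col = {s: j for j, s in enumerate(subfiles)}
    rows = []
    for T in combinations(range(1, K + 1), t + 1):
        row = [0] * len(subfiles)
        for u in T:
            row[col[tuple(x for x in T if x != u)]] = u
        rows.append(row)
    return rows

def is_valid(S):
    """Check Conditions 1-3 of Lemma 3 (Condition 3 under the assumed reading above)."""
    nonzero = [(i, j, v) for i, row in enumerate(S) for j, v in enumerate(row) if v]
    for (i, j, v), (i2, j2, v2) in combinations(nonzero, 2):
        if v != v2:
            continue
        if i == i2 or j == j2:       # repeated integer in a row or in a column
            return False
        if S[i][j2] or S[i2][j]:     # Condition 3
            return False
    return True

def transpose(S):
    return [list(c) for c in zip(*S)]

def cache_fraction(S, user):
    """Fraction of columns (subfiles) in which the user's integer does not appear."""
    cols = transpose(S)
    return Fraction(sum(user not in c for c in cols), len(cols))

def rate(S):
    """Number of equations divided by the number of subfiles."""
    return Fraction(len(S), len(S[0]))

S = man_matrix(4, 2)
St = transpose(S)
```

In this instance the original matrix describes a system with cache fraction 1/2 and rate 2/3, while its transpose describes a system with cache fraction 1/4 and rate 3/2, so the roles of the number of equations and the number of subfiles are exchanged.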
Applying the above discussion in our context, consider the equation-subfile matrix corresponding to the coded caching system with , for , and . Then corresponds to a system with , , , and transmission rate . The following theorem is the main result of this paper.
Theorem 1.
Consider a linear block code over that satisfies the CCP. Th