Channels with Synchronization/Substitution Errors and Computation of Error Control Codes
Abstract
We introduce the concept of a maximal error-detecting block code, for some parameter between 0 and 1, in order to formalize the situation where a block code is close to maximal with respect to being error-detecting. Our motivation for this is that constructing a maximal error-detecting code is a computationally hard problem. We present a randomized algorithm that takes as input two positive integers, a probability value, and a specification of the errors permitted in some application, and generates an error-detecting, or error-correcting, block code having up to the requested number of codewords of the requested length. If the algorithm finds fewer codewords than requested, then those codewords constitute a code that is maximal with high probability. The error specification (also called channel) is modelled as a transducer, which allows one to model any rational combination of substitution and synchronization errors. We also present some elements of our implementation of various error-detecting properties and their associated methods. Then, we show several tests of the implemented randomized algorithm on various channels. A methodological contribution is the presentation of how various desirable error combinations can be expressed formally and processed algorithmically.
I Introduction
We consider block codes C, that is, sets of words of the same length ℓ, for some positive integer ℓ. The elements of C are called codewords or words. We use Σ to denote the alphabet used for making words.
Our typical alphabet will be the binary one, {0, 1}. We shall use the variables u, v, w to denote words over Σ (not necessarily in C). The empty word is denoted by ε.
We also consider error specifications γ, which we call combinatorial channels, or simply channels. A channel γ specifies, for each allowed input word w, the set γ(w) of all possible output words. We assume that error-free communication is always possible, so w ∈ γ(w). On the other hand, if v ∈ γ(w) and v ≠ w, then the channel has introduced errors into w. Informally, a block code is error-detecting for γ if the channel cannot turn a given codeword into a different codeword. It is error-correcting for γ if the channel cannot turn two different codewords into the same word.
In Section II, we make the above concepts mathematically precise, and show how known examples of combinatorial channels can be defined formally so that they can be used as input to algorithms. In Section III, we present two randomized algorithms: the first one decides (up to a certain degree of confidence) whether a given block code C is maximal γ-detecting for a given channel γ. The second algorithm is given a channel γ, a γ-detecting block code C (which could be empty), and an integer n, and attempts to add to C up to n new words of length ℓ, resulting in a new γ-detecting code. If fewer than n words get added, then either the new code is 95%-maximal or the chance that a randomly chosen word can be added is less than 5%. Our motivation for considering a randomized algorithm is that embedding a given γ-detecting block code into a maximal γ-detecting block code is a computationally hard problem; this is shown in Section IV. In Section V, we discuss briefly some capabilities of the new module codes.py in the open-source software package FAdo [4, 1] and we discuss some tests of the randomized algorithms on various channels. In Section VI, we discuss a few more points on channel modelling and conclude with directions for future research.
We note that, while there are various algorithms for computing error-control codes, to our knowledge these work only for specific channels and their implementations are generally not open source.
II Channels and Error Control Codes
We need a mathematical model for channels that is useful for answering algorithmic questions pertaining to error control codes. While many models of channels and codes for substitution-type errors use a rich set of mathematical structures, this is not the case for channels involving synchronization errors [13]. We believe the appropriate model for our purposes is that of a transducer. We note that transducers have been defined as early as [18], and are a powerful computational tool for processing sets of words; see [2] and pp. 41–110 of [17].
Definition 1.
A transducer t is a 5-tuple (Q, Σ, I, T, F), where Q is the set of states, Σ is the alphabet, I ⊆ Q is the set of initial states, F ⊆ Q is the set of final states, and T is the set
of transitions, each of the form (p, x/y, q), where p and q are states and x and y are words over Σ (the input and output labels). The relation R(t) realized by t is the set of word pairs (w, z) such that some path from an initial to a final state has concatenated input labels forming w and concatenated output labels forming z. A relation is called rational if it is realized by a transducer. If every input and every output label of t is in Σ ∪ {ε}, then we say that t is in standard form. The domain of the transducer is the set of words w such that (w, z) ∈ R(t), for some word z. The transducer is called input-preserving if (w, w) ∈ R(t), for all words w in the domain of t. The inverse of t, denoted by t⁻¹, is the transducer that is simply obtained by making a copy of t and changing each transition (p, x/y, q) to (p, y/x, q). Then (z, w) ∈ R(t⁻¹) if and only if (w, z) ∈ R(t).
We note that every transducer can be converted (in linear time) to one in standard form realizing the same relation.
As our objective is to model channels as transducers, we require that a transducer qualifies as a channel only if it allows error-free communication, that is, only if it is input-preserving.
Definition 2.
An error specification is an input-preserving transducer t. The (combinatorial) channel specified by t is R(t), that is, the relation realized by t. For the purposes of this paper, however, we simply identify the concept of channel with that of error specification.
A piece of notation that is useful in this work is the following, where L is any set of words:
t(L) = { z : (w, z) ∈ R(t), for some w ∈ L }.  (1)
Thus, t(L) is the set of all possible outputs of t when the input is any word from L. For example, if t is the channel that allows up to 1 symbol to be deleted or inserted in the input word, then t({000}) = {000, 00, 0000, 1000, 0100, 0010, 0001}.
Fig. 4 considers examples of channels that have been defined in past research when designing error control codes. Here these channels are shown as transducers, which can be used as inputs to algorithms for computing error control codes. For one of these channels, for instance, on input 00000 the channel can read the first two input 0's at the initial state and output 0, 0; then, still at that state, read the 3rd 0, output 1 (a substitution error), and change state; etc.
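To make the transducer view of channels concrete, the following sketch (our own illustration, not FAdo code) encodes a toy channel over {0, 1} that allows at most one substitution: state 0 means "no error yet", state 1 means "one substitution used", and both states are final. A small function then computes the set of all possible channel outputs for a given input word.

```python
# Each transition is (state, input_symbol, output_symbol, next_state).
TRANSITIONS = (
    [(0, a, a, 0) for a in "01"] +                          # copy faithfully
    [(0, a, b, 1) for a in "01" for b in "01" if a != b] +  # one substitution
    [(1, a, a, 1) for a in "01"]                            # copy after the error
)

def channel_outputs(word, transitions=TRANSITIONS, initial=0, finals=(0, 1)):
    """Return the set of all outputs the transducer can produce on `word`."""
    configs = {(initial, "")}           # reachable (state, output-so-far) pairs
    for symbol in word:
        configs = {(q2, out + b)
                   for (state, out) in configs
                   for (q1, a, b, q2) in transitions
                   if q1 == state and a == symbol}
    return {out for (state, out) in configs if state in finals}
```

For instance, channel_outputs("000") yields the four words at Hamming distance at most 1 from 000. Since every input can be copied unchanged (state 0 is final), the transducer is input-preserving, as Definition 2 requires.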
The concepts of error-detection and error-correction mentioned in the introduction are phrased below more rigorously.
Definition 3.
Let C ⊆ Σℓ be a block code and let γ be a channel. We say that C is γ-detecting if
γ(w) ∩ C ⊆ {w}, for all words w ∈ C.
We say that C is γ-correcting if
γ(w₁) ∩ γ(w₂) = ∅, for all distinct words w₁, w₂ ∈ C.
A γ-detecting block code C is called maximal γ-detecting if C ∪ {w} is not γ-detecting for any word w of length ℓ that is not in C. The concept of a maximal γ-correcting code is similar.
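For small codes, the two conditions of Definition 3 can be checked directly by brute force. The sketch below is our own illustration; `gamma` stands for any function returning the set of possible channel outputs of a word.

```python
def is_detecting(code, gamma):
    """The code is error-detecting for gamma iff the channel cannot turn
    a codeword into a *different* codeword: gamma(w) & code is within {w}."""
    return all(not (gamma(w) & (code - {w})) for w in code)

def is_correcting(code, gamma):
    """The code is error-correcting for gamma iff no two distinct
    codewords can produce a common channel output."""
    words = sorted(code)
    return all(not (gamma(words[i]) & gamma(words[j]))
               for i in range(len(words)) for j in range(i + 1, len(words)))
```

With a one-substitution channel, the code {000, 011} is error-detecting but not error-correcting, since 000 and 011 can both come out as 010; this also illustrates why detection is the weaker of the two properties.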
From a logical point of view (see Lemma 4 below), error-detection subsumes the concept of error-correction. This connection was already stated in [7], although it was not exploited there. Here we add the fact that maximal error-detection subsumes maximal error-correction. Due to this observation, in this paper we focus only on error-detecting codes.
Note: The operation '∘' between two transducers t₁ and t₂ is called composition and returns a new transducer t₂ ∘ t₁ such that (t₂ ∘ t₁)(w) = t₂(t₁(w)), for every word w; that is, (w, z) ∈ R(t₂ ∘ t₁) exactly when (w, u) ∈ R(t₁) and (u, z) ∈ R(t₂), for some word u.
Lemma 4.
Let C be a block code and γ be a channel. Then C is γ-correcting if and only if it is (γ⁻¹ ∘ γ)-detecting. Moreover, C is maximal γ-correcting if and only if it is maximal (γ⁻¹ ∘ γ)-detecting.
Proof.
The first statement is already in [7]. For the second statement, first assume that C is maximal γ-correcting and consider any word w of length ℓ that is not in C. If C ∪ {w} were (γ⁻¹ ∘ γ)-detecting then C ∪ {w} would also be γ-correcting and, hence, C would be non-maximal; a contradiction. Thus, C must be maximal (γ⁻¹ ∘ γ)-detecting. The converse can be shown analogously. ∎
The operation '∪' between any two transducers t₁ and t₂ returns the transducer obtained by simply taking the union of their five corresponding components (states, alphabet, initial states, transitions, final states) after a renaming, if necessary, of the states so that the two transducers have no states in common. Then R(t₁ ∪ t₂) = R(t₁) ∪ R(t₂).
Let γ be a channel, let C be a γ-detecting block code, and let w ∈ Σℓ. In [3], the authors show that
C ∪ {w} is γ-detecting iff w ∉ C ∪ γ(C) ∪ γ⁻¹(C).  (2)
Definition 5.
Let C be a γ-detecting block code. We say that a word w can be added into C if w ∈ Σℓ \ (C ∪ γ(C) ∪ γ⁻¹(C)).
Statement (2) above implies that
C is maximal γ-detecting iff Σℓ ⊆ C ∪ γ(C) ∪ γ⁻¹(C).  (3)
Definition 6.
The maximality index of a block code C w.r.t. a channel γ is the quantity
μ(C) = |(C ∪ γ(C) ∪ γ⁻¹(C)) ∩ Σℓ| / |Σℓ|.
Let α be a real number in [0, 1]. A γ-detecting block code C is called α-maximal γ-detecting if μ(C) ≥ α.
The maximality index of C is the proportion of the 'used up' words of length ℓ over all words of length ℓ, where a word is 'used up' if it is in C ∪ γ(C) ∪ γ⁻¹(C). One can verify the following useful lemma.
Lemma 7.
Let γ be a channel and let C be a γ-detecting block code of length ℓ.

μ(C) = 1 if and only if C is maximal γ-detecting.

Assuming that words are chosen uniformly at random from Σℓ, the maximality index μ(C) is the probability that a randomly chosen word of length ℓ cannot be added into C preserving its being γ-detecting, that is, the probability that the chosen word lies in C ∪ γ(C) ∪ γ⁻¹(C).
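For short block lengths, the maximality index can be computed by exhaustive enumeration, which is a useful sanity check on the randomized estimates discussed later. The following sketch is our own illustration (`gamma` is any output-set function, assumed here to coincide with its own inverse, as is the case for substitution channels).

```python
from itertools import product

def maximality_index(code, gamma, length, alphabet="01"):
    """Fraction of words of the given length that cannot be added to
    `code` while keeping it error-detecting for gamma (Definition 6)."""
    universe = {"".join(p) for p in product(alphabet, repeat=length)}

    def addable(w):
        if w in code:
            return False
        bigger = code | {w}
        return all(not (gamma(u) & (bigger - {u})) for u in bigger)

    blocked = sum(1 for w in universe if not addable(w))
    return blocked / len(universe)
```

For the identity channel, every word outside the code can be added, so a single-codeword code of length 2 has index 1/4; the index reaches 1 exactly for maximal codes, matching Lemma 7.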
III Generating Error Control Codes
We now turn our attention to algorithms processing channels and sets of words. A set of words is called a language, with a block code being a particular example of a language. A powerful method of representing languages is via finite automata [17]. A (finite) automaton A is a 5-tuple as in the case of a channel, but each transition has only an input label, that is, it is of the form (p, x, q) with x being one alphabet symbol or the empty word ε. The language accepted by A is denoted by L(A) and consists of all words formed by concatenating the labels in any path from an initial to a final state. The automaton is called deterministic, or a DFA for short, if it has a single initial state, there are no transitions with label ε, and there are no two distinct transitions with the same label going out of the same state. Special cases of automata are constraint systems, in which normally all states are final (pp. 1635–1764 of [16]), and trellises. A trellis is an automaton accepting a block code, and has one initial and one final state (pp. 1989–2117 of [16]). In the case of a trellis M we talk about the code represented by M, and we denote it as C(M), which is equal to L(M).
For computational complexity considerations, the size of a finite state machine (automaton or transducer) is the number of states plus the sum of the sizes of the transitions. The size of a transition is 1 plus the length of the label(s) on the transition. We assume that the alphabet is small so we do not include its size in our estimates.
An important operation between an automaton A and a transducer t, here denoted by t(A), returns an automaton that accepts the set of all possible outputs of t when the input is any word from L(A), that is, L(t(A)) = t(L(A)).
Remark 8.
We recall here the construction of t(A) from given A and t, where we assume that A contains no transition with label ε. First, if necessary, we convert t to standard form. Second, if t contains any transition whose input label is ε, then we add into A the transitions (p, ε, p), for all states p of A. Let T_A denote now the updated set of transitions of A. Then, we construct the product automaton whose states are the pairs (p, q) of a state p of A and a state q of t, and whose initial (resp. final) states are the pairs of initial (resp. final) states,
such that there is a transition ((p, q), y, (p′, q′)) exactly when there are transitions (p, x, p′) in T_A and (q, x/y, q′) in t. The above construction can be done in time O(|A| · |t|) and the size of the product is O(|A| · |t|). The required automaton t(A) is the trim version of the product, which can be computed in time linear in the size of the product. (The trim version of an automaton is the automaton resulting when we remove any states that do not occur in some path from an initial to a final state.)
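A stripped-down version of this product construction can be sketched as follows, assuming for simplicity that the automaton and the transducer are ε-free and symbol-to-symbol (the tuple representation of the machines is our own).

```python
def output_automaton(nfa, transducer):
    """Product construction of Remark 8, sketched: build an NFA accepting
    the set of transducer outputs on inputs accepted by `nfa`.
    nfa = (states, initials, finals, transitions (p, a, q));
    transducer transitions are (p, a, b, q) with input a and output b.
    Trimming is omitted: unreachable states do not affect acceptance."""
    a_states, a_init, a_fin, a_trans = nfa
    t_states, t_init, t_fin, t_trans = transducer
    trans = {((p, r), b, (q, s))
             for (p, a, q) in a_trans
             for (r, x, b, s) in t_trans if x == a}
    states = {(p, r) for p in a_states for r in t_states}
    inits = {(p, r) for p in a_init for r in t_init}
    fins = {(p, r) for p in a_fin for r in t_fin}
    return states, inits, fins, trans

def accepts(nfa, word):
    """Standard subset simulation of an epsilon-free NFA."""
    states, inits, fins, trans = nfa
    current = set(inits)
    for symbol in word:
        current = {q for (p, a, q) in trans if p in current and a == symbol}
    return bool(current & set(fins))
```

The product has O(|A| · |t|) transitions, in line with the size estimate of Remark 8.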
Next we present our randomized algorithms; we use [14] as a reference for basic concepts. We assume that we have available to use in our algorithms an ideal method pickFrom that chooses uniformly at random a word in Σℓ. A randomized algorithm with specific values for its parameters can be viewed as a random variable whose value is whatever value is returned by executing the algorithm on the specific values.
Theorem 9.
Consider the algorithm nonMax in Fig. 2, which takes as input a channel γ, a trellis M accepting a γ-detecting code, and two numbers α, δ ∈ (0, 1).

The algorithm either returns a word w such that the code C(M) ∪ {w} is γ-detecting, or it returns None.

If C(M) is not α-maximal γ-detecting, then the probability that the algorithm returns None is less than δ.

The time complexity of nonMax is
Proof.
The first statement follows from statement (2) in the previous section, as any word w returned by the algorithm is not in C ∪ γ(C) ∪ γ⁻¹(C). For the second statement, suppose that the code C = C(M) is not α-maximal γ-detecting. Let X be the random variable whose value is the value of the counter tr − 1 at the end of the execution of the randomized algorithm nonMax. Then, X counts the number of words that are in C ∪ γ(C) ∪ γ⁻¹(C) out of the n randomly chosen words. Thus X is binomial: the number of successes (words in C ∪ γ(C) ∪ γ⁻¹(C)) in n trials. So E[X] = np, where p = μ(C). By the definition of n in nonMax, we get n ≥ 1/(δ(1 − α)). Now consider the Chebyshev inequality P(|Y − E[Y]| ≥ k) ≤ Var(Y)/k², where k > 0 is arbitrary and Var(Y) is the variance of some random variable Y. For X the variance is np(1 − p), and we get
P(X = n) ≤ P(|X − np| ≥ n(1 − p)) ≤ np(1 − p)/(n(1 − p))² = p/(n(1 − p)),
where we used k = n(1 − p) and the fact that X = n implies |X − np| = n(1 − p).
Using Lemma 7 and the assumption that C is not α-maximal, we have that p < α, which implies 1 − p > 1 − α; hence, p/(1 − p) < α/(1 − α). Then
P(nonMax returns None) = P(X = n) ≤ p/(n(1 − p)) < α/(n(1 − α)) ≤ αδ < δ,
as required.
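The sampling loop of nonMax can be sketched in a few lines if we represent the code and the 'used up' set as plain Python sets rather than trellises (our own simplification of Fig. 2; for a channel given as an output-set function `gamma`, membership in the inverse image of the code is tested as gamma(w) having a codeword among its elements).

```python
import random

def non_max(code, gamma, length, n_trials=2000, alphabet="01", rng=None):
    """Sample words uniformly at random; return one that can be added to
    `code` while preserving error-detection (statement (2)), else None."""
    rng = rng or random.Random()
    used_up = set(code)
    for u in code:                      # precompute the code and its outputs
        used_up |= gamma(u)
    for _ in range(n_trials):
        w = "".join(rng.choice(alphabet) for _ in range(length))
        if w not in used_up and not (gamma(w) & set(code)):
            return w                    # w is outside the 'used up' set
    return None
```

If None is returned then, as in Theorem 9, either the code is close to maximal or we were unlucky, the latter with probability controlled by the number of trials.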
Remark 10.
We mention the important observation that one can modify the algorithm nonMax by removing the construction of the automaton accepting C ∪ γ(C) ∪ γ⁻¹(C) and replacing the 'if' line in the loop with
if (C(M) ∪ {w} is γ-detecting) return w;
While with this change the output would still be correct, the time complexity of the algorithm would increase: the test of whether the code of a given automaton is γ-detecting would have to be performed anew in each of the n iterations, instead of constructing once, outside the loop, the automaton accepting the set of non-addable words.
In Fig. 3, we present the main algorithm, called makeCode, for adding new words into a given deterministic trellis M.
Remark 11.
In some sense, algorithm makeCode generalizes to arbitrary channels the idea used in the proof of the well-known Gilbert-Varshamov bound [12] for the largest possible block code that is error-correcting, for some number of substitution errors. In that proof, a word w can be added into the code C if w is outside of the union of the "balls" γ(u), for all u ∈ C. In that case, we have that γ = γ⁻¹ and, hence, C ∪ γ(C) ∪ γ⁻¹(C) = C ∪ γ(C). The present algorithm adds new words to the constructed trellis such that each new word is outside of the "union-ball" C ∪ γ(C) ∪ γ⁻¹(C).
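For comparison, the classical greedy argument behind the Gilbert-Varshamov bound can be sketched deterministically for the substitution channel (our own illustration; Hamming balls play the role of the channel output sets).

```python
from itertools import product

def hamming(u, v):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(u, v))

def greedy_code(length, min_dist, alphabet="01"):
    """Scan all words of the given length and keep each word that lies
    outside every ball of radius min_dist - 1 around the kept words."""
    code = []
    for w in ("".join(p) for p in product(alphabet, repeat=length)):
        if all(hamming(w, c) >= min_dist for c in code):
            code.append(w)
    return code
```

For example, greedy_code(3, 3) returns the repetition code ["000", "111"]; makeCode replaces the exhaustive scan with random sampling, and the Hamming balls with the union-ball of an arbitrary transducer channel.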
Theorem 12.
Algorithm makeCode in Fig. 3 takes as input a channel γ, a deterministic trellis M of some length ℓ, and an integer n such that the code C(M) is γ-detecting, and returns a deterministic trellis M′ and a list W of words such that the following statements hold true:

C(M′) = C(M) ∪ W and C(M′) is γ-detecting;

If W has fewer than n words, then either C(M′) is 95%-maximal or the probability that a randomly chosen word from Σℓ can be added into C(M′) is less than 5%.

The algorithm runs in time .
Proof.
Let Mᵢ be the value of the trellis at the end of the i-th iteration of the while loop. The first statement follows from Theorem 9: any word w returned by nonMax is such that C(Mᵢ) ∪ {w} is γ-detecting. For the second statement, assume that, at the end of the execution, W has fewer than n words and C(M′) is not 95%-maximal. By the previous theorem, this means that the random process returned None with probability less than 5%, as required. For the third statement, as the loop in the algorithm nonMax performs a fixed number of iterations (= 2 000), the cost of nonMax is dominated by the construction of the automaton accepting C ∪ γ(C) ∪ γ⁻¹(C). The cost of adding a new word of length ℓ to the trellis is O(ℓ) and increases the size of the trellis by O(ℓ), so each Mᵢ is of size O(|M| + iℓ). Thus, the cost of the i-th iteration of the while loop in makeCode is polynomial in |Mᵢ| and the size of the channel. As there are up to n iterations, the total cost follows.
∎
Remark 13.
In the algorithm makeCode, attempting to add only one word into C(M) (the case n = 1) requires time of polynomial magnitude. This case is equivalent to testing whether C(M) is maximal γ-detecting, which is shown to be a hard decision problem in Theorem 15.
Remark 14.
In the version of the algorithm makeCode where the initial trellis is omitted (that is, the initial code is empty), the time complexity estimate simplifies accordingly. We also note that the algorithm would work with the same time complexity if the given trellis is not deterministic. In this case, however, the resulting trellis would not be (in general) deterministic either.
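Combining the pieces, a set-based sketch of makeCode (our own simplification of Fig. 3: plain sets of words instead of trellises, and a fixed number of sampling trials as in nonMax) is the following.

```python
import random

def make_code(gamma, length, max_words, n_trials=2000, alphabet="01", seed=0):
    """Repeatedly sample random words, adding each word that keeps the
    growing code error-detecting for gamma, until max_words words are
    found or n_trials consecutive samples all fail."""
    rng = random.Random(seed)
    code, used_up = set(), set()
    while len(code) < max_words:
        for _ in range(n_trials):
            w = "".join(rng.choice(alphabet) for _ in range(length))
            if w not in used_up and not (gamma(w) & code):
                break                   # w can be added (statement (2))
        else:
            return code                 # probably close to maximal already
        code.add(w)
        used_up |= {w} | gamma(w)
    return code
```

With a one-substitution channel and length 4, the returned code has pairwise Hamming distance at least 2, so it is 1-substitution error-detecting; at most 8 such codewords exist for this length.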
IV Why Not Use a Deterministic Algorithm
Our motivation for considering randomized algorithms is that the embedding problem is computationally hard: given a deterministic trellis M and a channel γ, compute (using a deterministic algorithm) a trellis that represents a maximal γ-detecting code containing C(M). By computationally hard, we mean that a decision version of the embedding problem is coNP-hard. This is shown next.
Theorem 15.
The following decision problem is coNPhard.
 Instance:

deterministic trellis M and channel γ.
 Answer:

whether C(M) is maximal γ-detecting.
Proof.
Let us call the decision problem in question P₁, and let P₂ be the problem of deciding whether a given trellis over the alphabet {0, 1} with no ε-labeled transitions accepts {0, 1}ℓ, for some ℓ. The statement is a logical consequence of the following claims.
Claim 1: P₂ is coNP-complete.
Claim 2: P₂ is polynomially reducible to P₁.
The first claim follows from the proof of the following fact on page 329 of [10]: deciding whether two given star-free regular expressions over {0, 1} are inequivalent is an NP-complete problem. Indeed, in that proof the first regular expression can be arbitrary, but the second regular expression represents the language {0, 1}ℓ, for some positive integer ℓ. Moreover, converting a star-free regular expression to an acyclic automaton with no ε-labeled transitions is a polynomial-time problem.
For the second claim, consider any trellis N with no ε-labeled transitions. We need to construct in polynomial time an instance (M, γ) of P₁ such that N accepts {0, 1}ℓ if and only if C(M) is a maximal γ-detecting block code of length ℓ. The rest of the proof consists of 5 parts: construction of a deterministic trellis M accepting words of length ℓ, construction of γ, facts about M and γ, proving that C(M) is γ-detecting, and proving that N accepts {0, 1}ℓ if and only if C(M) is maximal γ-detecting.
Construction of M: Let Δ be the alphabet {0, 1} ∪ T_N, where T_N is the set of transitions of N. The required deterministic trellis M is any deterministic trellis accepting Δℓ \ {0, 1}ℓ, that is, all words of length ℓ over Δ that contain at least one symbol in T_N.
This can be constructed, for instance, by making deterministic trellises M₁ and M₂ accepting, respectively, Δℓ and {0, 1}ℓ, and then intersecting M₁ with the complement of M₂. Note that any word in C(M) contains at least one symbol in T_N.
Construction of γ: This is of the form γ = t₁ ∪ t₂, as follows. The transducer t₁ has only one state and transitions (p, a/a, p), for all a ∈ Δ, and realizes the identity relation on Δ*. Thus, we have that t₁(w) = {w}, for all words w. The transducer t₂ is such that its set of states is that of N and its set of transitions consists of exactly the transitions (q, e/a, q′) for which e = (q, a, q′) is a transition of N.
Facts about M and γ: The following facts are helpful in the rest of the proof. Some of these facts refer to the deterministic trellis N′ resulting by omitting the output parts of the transition labels of t₂, that is, N′ has a transition (q, e, q′) exactly when t₂ has the transition (q, e/a, q′). Then, L(N′) is the domain of t₂.
F0: t₂(L(N′)) = L(N).
F1: The domain of t₂ is L(N′), a subset of C(M).
F2: If v ∈ t₂(w) then w ∈ T_N* and v ∈ {0, 1}*.
F3: t₂(C(M)) = L(N).
F4: t₂⁻¹(C(M)) = ∅.
For fact F0, note that the product construction described in Remark 8 produces in t₂(N′) exactly the transitions ((q, q), a, (q′, q′)), where (q, e/a, q′) is a transition in t₂, by matching any transition (q, e, q′) of N′ only with the transition (q, e/a, q′) of t₂. Fact F1 follows by the construction of t₂ and the definition of N′: in any accepting computation of t₂, the input labels appear in an accepting computation of N′ that uses the same sequence of states. F3 is shown as follows: as the domain of t₂ is L(N′) and L(N′) ⊆ C(M), we have that t₂(C(M)) = t₂(L(N′)), which is L(N) by F0. Fact F4 follows by noting that the domain of t₂⁻¹ is a subset of {0, 1}* but contains no words in C(M).
C(M) is γ-detecting: Let u, v ∈ C(M) be such that v ∈ γ(u). We need to show that v = u, that is, to show that v ∈ t₁(u). Indeed, if v ∈ t₂(u) then v ∈ {0, 1}* by F2, which contradicts v ∈ C(M).
N accepts {0, 1}ℓ if and only if C(M) is maximal γ-detecting: By statement (3) we have that C(M) is maximal γ-detecting if and only if Δℓ ⊆ C(M) ∪ γ(C(M)) ∪ γ⁻¹(C(M)). We have:
C(M) ∪ γ(C(M)) ∪ γ⁻¹(C(M)) = C(M) ∪ t₂(C(M)) ∪ t₂⁻¹(C(M)) = C(M) ∪ L(N) = (Δℓ \ {0, 1}ℓ) ∪ L(N).
Thus, C(M) is maximal γ-detecting if and only if {0, 1}ℓ ⊆ L(N), that is, if and only if N accepts {0, 1}ℓ, as required. ∎
V Implementation and Use
All main algorithmic tools have been implemented over the years in the Python package FAdo [4, 1, 6]. Many aspects of the new module FAdo.codes are presented in [6]. Here we present methods of that module pertaining to generating codes.
Assume that the string d1 contains a description of the transducer in FAdo format. In particular, d1 begins with the type of FAdo object being described, the final states, and the initial states (after the character *). Then, d1 contains the list of transitions, each one of the form "p x y q\n", meaning a transition from state p to state q with input label x and output label y, where '\n' is the newline character. This is shown in the following Python script.
import FAdo.codes as codes

d1 = ('@Transducer 0 2 * 0\n'
      '0 0 0 0\n0 1 1 0\n0 0 @epsilon 1\n'
      '0 1 @epsilon 1\n1 0 0 1\n1 1 1 1\n'
      '1 @epsilon 0 2\n1 @epsilon 1 2\n')
pd1 = codes.buildErrorDetectPropS(d1)
a = pd1.makeCode(100, 8, 2)
print pd1.notSatisfiesW(a)
print pd1.nonMaximalW(a, m)   # m: a trellis accepting all words of length 8

s2 = ...  # string describing the transducer sub_2, omitted here
ps2 = codes.buildErrorDetectPropS(s2)
pd1s2 = pd1 & ps2
b = pd1s2.makeCode(100, 8, 2)
The above script uses the string d1 to create the object pd1 representing the d1-detection property over the alphabet {0, 1}. Then, it constructs an automaton a representing a d1-detecting block code of length 8 with up to 100 words over the 2-symbol alphabet {0, 1}. The method notSatisfiesW(a) tests whether the code is d1-detecting and returns a witness of non-error-detection (= a pair of distinct codewords such that one is a possible channel output of the other), or (None, None); of course, in the above example it would return (None, None). The method nonMaximalW(a, m) tests whether the code is maximal d1-detecting and returns either a word that can be added to the code preserving d1-detection, or None if the code is already maximal. The object m is any automaton; here it is the trellis representing all words of length 8. This method is used only for small codes, as in general the maximality problem is algorithmically hard (recall Theorem 15), which motivated us to consider the randomized version nonMax in this paper. For any channel and trellis a, the method notSatisfiesW(a) can be made to work in time proportional to the product of the sizes of the channel and a, which is of polynomial complexity. The operation '&' combines error-detection properties. Thus, the second call to makeCode constructs a code that is both d1-detecting and sub_2-detecting (sub_2-detection = 1-substitution error-correction).
VI More on Channel Modelling and Testing
In this section, we consider further examples of channels and show how operations on channels can result in new ones. We also show the results of testing our code-generation algorithm on several different channels.
Remark 16.
We note that the definition of an error-detecting (or error-correcting) block code extends trivially to any language L, that is, one replaces in Definition 3 'block code C' with 'language L'. Let γ₁ and γ₂ be channels. By Definition 3 and using standard logical arguments, it follows that

L is γ₁-detecting and γ₂-detecting, if and only if L is (γ₁ ∪ γ₂)-detecting;

L is γ₁-detecting, if and only if it is γ₁⁻¹-detecting, if and only if it is (γ₁ ∪ γ₁⁻¹)-detecting.
The inverse of a channel is again a channel; an example is shown in Fig. 4, where recall it results by simply exchanging the order of the two words in all the labels of the transducer. By statement 2 of the above remark, the codes that are error-detecting for a channel are the same as those that are error-detecting for its inverse, and the same as those for the union of the two; this is shown in [15] as well. The method of using transducers to model channels is quite general and one can give many more examples of past channels as transducers, as well as channels not studied before. Some further examples are shown in the next figures, Fig. 4–6.
One can go beyond the classical error control properties and define certain synchronization properties via transducers. Consider the set of all overlap-free words, that is, all words w such that a proper and nonempty prefix of w cannot be a suffix of w. A block code is a solid code if no proper and nonempty prefix of a codeword can be a suffix of a codeword. For example, {0100, 1001} is not a block solid code, as 01 is a prefix of the codeword 0100 and a suffix of the codeword 1001, and 01 is nonempty and a proper prefix (shorter than the codewords). Solid codes can also be non-block codes by extending appropriately the above definition [19] (they are also called codes without overlaps in [9]). The transducer in Fig. 6 is such that a block code is a solid code if and only if it is 'error-detecting' for that transducer. We note that solid codes have an instantaneous synchronization capability (in particular, all solid codes are comma-free codes) as well as synchronization in the presence of noise [5].
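The overlap conditions above are straightforward to check programmatically for concrete codes; the following sketch (our own, not part of FAdo) tests overlap-freeness of a word and the block solid code property.

```python
def is_overlap_free(w):
    """No proper nonempty prefix of w is also a suffix of w."""
    return all(w[:k] != w[-k:] for k in range(1, len(w)))

def is_block_solid(code):
    """No proper nonempty prefix of any codeword is a suffix of any
    codeword (the two codewords may be equal)."""
    return all(u[:k] != v[-k:]
               for u in code for v in code
               for k in range(1, len(u)))
```

In particular, taking the two codewords equal shows that every codeword of a block solid code must itself be overlap-free, which is why the random generator only needs to sample overlap-free words.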
For α = 0.95 and δ = 0.01, the value of n in nonMax is 2 000. We performed several executions of the algorithm makeCode on various channels using the parameters shown below, no initial trellis, and the alphabet {0, 1}.
[Table omitted: for each choice of ℓ, n, channel, and required codeword ending (1 or 01), the entries report the smallest, median, and largest sizes of the generated codes.]
In the above table, the first column gives the values of ℓ and n and, if present and nonempty, the pattern that all codewords should end with (1 or 01). For each entry in an n = 100 row, we executed makeCode 21 times and reported the smallest, median, and largest sizes of the 21 generated codes. For n = 500, we reported the same figures by executing the algorithm 5 times. For example, an entry of the form 37,42,51 reports that the 21 generated codes had smallest size 37, median size 42, and largest size 51. The entry 64,64,64 corresponds to the systematic code of [15] whose codewords end with 01, and any of the 6-bit words can be used in positions 1–6. The entry for 2-substitution error-detection is equivalent to 1-substitution error-correction; here the Hamming code of length 7, with 16 codewords, has the maximum number of codewords for this length. Similarly, the entry for 2-synchronization-error detection is equivalent to 1-synchronization-error correction; here the Levenshtein code [8] of length 8 has 30 codewords. We recall that a maximal code is not necessarily maximum, that is, having the largest possible number of codewords for a given length and channel. It seems maximum codes are rare, but there are many random maximal ones having lower rates. The error-detecting code of [15] has a higher rate than all the random ones generated here.
For the case of block solid codes (last column of the table), we note that the function pickFrom in the algorithm nonMax has to be modified, as the randomly chosen word should be an overlap-free word.
VII Conclusions
We have presented a unified method for generating error control codes, for any rational combination of errors. The method cannot of course replace innovative code design, but should be helpful in computing various examples of codes. The implementation codes.py is available to anyone for download and use [4]. In the implementation for generating codes, we allow one to specify that generated words only come from a certain desirable subset of the words of length ℓ, which is represented by a deterministic trellis. This requires changing the function pickFrom in nonMax so that it chooses randomly words from that subset. There are a few directions for future research. One is to work on the efficiency of the implementations, possibly allowing parallel processing, so as to allow generation of block codes having longer block length. Another direction is to somehow find a way to specify that the set of generated codewords is a 'systematic' code so as to allow efficient encoding of information. A third direction is to do a systematic study of how one can map a stochastic channel, like the binary symmetric channel or one with memory, to a transducer (representing a combinatorial channel), so that the available algorithms on the transducer have a useful meaning for the stochastic channel as well.
Footnotes
 The general definition of a transducer allows two alphabets: the input and the output alphabet. Here, however, we assume that both alphabets are the same.
References
 André Almeida, Marco Almeida, José Alves, Nelma Moreira, and Rogério Reis. FAdo and GUItar: Tools for automata manipulation and visualization. In Proceedings of CIAA 2009, Sydney, Australia, volume 5642 of Lecture Notes in Computer Science, pages 65–74, 2009.
 Jean Berstel. Transductions and Context-Free Languages. B.G. Teubner, Stuttgart, 1979.
 Krystian Dudzinski and Stavros Konstantinidis. Formal descriptions of code properties: decidability, complexity, implementation. International Journal of Foundations of Computer Science, 23:1:67–85, 2012.
 FAdo. Tools for formal languages manipulation. Accessed in Jan. 2016. URL: http://fado.dcc.fc.up.pt/.
 Helmut Jürgensen and S. S. Yu. Solid codes. Elektron. Informationsverarbeit. Kybernetik., 26:563–574, 1990.
 Stavros Konstantinidis, Casey Meijer, Nelma Moreira, and Rogério Reis. Implementation of code properties via transducers. In Yo-Sub Han and Kai Salomaa, editors, Proceedings of CIAA 2016, number 9705 in Lecture Notes in Computer Science, pages 189–201, 2016. ArXiv version: Symbolic manipulation of code properties. arXiv:1504.04715v1, 2015.
 Stavros Konstantinidis and Pedro V. Silva. Maximal errordetecting capabilities of formal languages. J. Automata, Languages and Combinatorics, 13(1):55–71, 2008.
 Vladimir I. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Dokl., 10:707–710, 1966.
 Vladimir I. Levenshtein. Maximum number of words in codes without overlaps. Probl. Inform. Transmission, 6(4):355–357, 1973.
 H. Lewis and C.H. Papadimitriou. Elements of the Theory of Computation, 2nd ed. Prentice Hall, 1998.
 Zhenming Liu and Michael Mitzenmacher. Codes for deletion and insertion channels with segmented errors. In Proceedings of ISIT, Nice, France, 2007, pages 846–849, 2007.
 F. J. MacWilliams and N. J. A. Sloane. The Theory of Error-Correcting Codes. North-Holland, Amsterdam, 1977.
 Hugues Mercier, Vijay Bhargava, and Vahid Tarokh. A survey of errorcorrecting codes for channels with symbol synchronization errors. IEEE Communic. Surveys & Tutorials, 12(1):87–96, 2010.
 Michael Mitzenmacher and Eli Upfal. Probability and Computing. Cambridge Univ. Press, 2005.
 Filip Paluncic, Khaled Abdel-Ghaffar, and Hendrik Ferreira. Insertion/deletion detecting codes and the boundary problem. IEEE Trans. Information Theory, 59(9):5935–5943, 2013.
 V. S. Pless and W. C. Huffman, editors. Handbook of Coding Theory. Elsevier, 1998.
 Grzegorz Rozenberg and Arto Salomaa, editors. Handbook of Formal Languages, Vol. I. SpringerVerlag, Berlin, 1997.
 C. E. Shannon and W. Weaver. The Mathematical Theory of Communication. University of Illinois Press, Urbana, 1949.
 H. J. Shyr. Free Monoids and Languages. Hon Min Book Company, Taichung, second edition, 1991.