Minimax Optimum Clock Skew and Offset Estimators for IEEE 1588

# Minimax Optimum Clock Skew and Offset Estimators for IEEE 1588

Anantha K. Karthik, Student Member, IEEE and Rick S. Blum, IEEE Fellow This work was supported by the Department of Energy under Award DE-OE0000779, and by the National Science Foundation under Grant ECCS 1744129.Anantha K. Karthik and Rick S. Blum are with the Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, PA 18015 USA (e-mail: akk314@lehigh.edu; rblum@lehigh.edu).
###### Abstract

This paper addresses the problem of clock skew and offset estimation for the IEEE 1588 precision time protocol. Built on the classical two-way message exchange scheme, IEEE 1588 is a prominent synchronization protocol for packet switched networks. It is employed in various applications including cellular base station synchronization in 4G long-term evaluation backhaul networks, substation synchronization in electrical grid networks and industrial control. Due to the presence of random queuing delays in a packet switched network, the recovery of clock skew and offset from the received packet timestamps can be viewed as a statistical estimation problem. Recently, assuming perfect clock skew information, minimax optimum clock offset estimators were developed for IEEE 1588. Building on this work, we develop minimax optimum clock skew and offset estimators for IEEE 1588 in this paper. Simulation results indicate the proposed minimax estimators exhibit a lower mean square estimation error than the estimators available in the literature for various network scenarios.

ptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptptpt

Minimax Optimum Clock Skew and Offset Estimators for IEEE 1588

Anantha K. Karthik, Student Member, IEEE and Rick S. Blum, IEEE Fellow

00footnotetext: This work was supported by the Department of Energy under Award DE-OE0000779, and by the National Science Foundation under Grant ECCS 1744129.00footnotetext: Anantha K. Karthik and Rick S. Blum are with the Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, PA 18015 USA (e-mail: akk314@lehigh.edu; rblum@lehigh.edu).

Index Terms

IEEE 1588 Precision Time Protocol, Optimum Invariant Estimation, Minimax Estimation, Electrical Grid, Long Term Evolution.

## I Introduction

Precise synchronization of events is essential to ensure the proper functioning of a distributed network. The IEEE 1588 Precision Time Protocol (PTP) [1] is a popular time synchronization protocol for synchronizing the slave clocks to a master clock. It is cost effective and offers accuracy comparable to Global Positioning System (GPS)-based timing. PTP is utilized in various applications including electrical grid networks [2], cellular base station synchronization in 4G Long Term Evaluation (LTE) [3], substation communication networks [4] and industrial control [5]. In this paper, we will develop clock synchronization algorithms for PTP in a packet switched network.

The clock at the slave node can be modeled mathematically, as a function of the time of the master node’s clock . When the clocks of the slave and master node are synchronized, then . However, in practice these clocks are not synchronized, implying a synchronization error , that tends to grow over large time scales unless synchronization approaches are implemented. In general, the clock of the slave node is modeled as [6, 7, 8, 9, 10, 11], where and denote the relative clock skew and offset of the slave’s clock with respect to the master’s clock respectively.

A number of time synchronization protocols including PTP, Timing Protocol for Sensor Networks (TPSN) [12], tiny-sync [13], and Lightweight Time Synchronization (LTS) [14] are built on the classical two-way message exchange scheme. In these protocols, the slave node exchanges a series of synchronization packets with the master node and uses the packet timestamps to estimate and . The messages traveling between the master and slave node can encounter several intermediate switches and routers accumulating delays at each node. The main factors contributing to the overall delay are the fixed propagation and processing delays at the intermediate nodes along the network path between the master and slave node and the random queuing delays at each such node. This randomness in the overall network traversal time is referred to as Packet Delay Variation (PDV) [11], and the problem of estimating and , while combating the noisiness in the observations that occur due to PDV is called the “Clock Skew and Offset Estimation” (CSOE) problem.

Popular Probability Density Function (pdf) models available in the literature to model the PDV include Gaussian, exponential, gamma, Weibull, and log-normal [9]. The Cramer-Rao lower Bound (CRB) and the Maximum Likelihood (ML) estimate of the clock skew and offset for some of these PDV delay models were derived in [6, 8, 7]. The popular PDV delay models, however, seem unsuitable for general packet switched networks. For example, consider the scenario where PTP is used to synchronize the cellular base stations in 4G LTE networks using mobile backhaul networks. The backhaul networks are leased from commercial Internet Service Providers (ISPs), and the network is shared with other commercial and non-commercial users. The background traffic generated by these users often results in random delays for the synchronization packets. In the context of the backhaul networks, ITU-T G.8261 specification [15] provides models for modeling the background traffic. The empirical pdf of the PDV in the backhaul networks were obtained in [10], and are shown in Figure 1. Similar random delays can occur in any case where a shared network is utilized. The popular available pdf models do not closely match most of the cases in Figure 1.

Furthermore, the PDV pdf typically has limited support in a packet switched network. Hence, the CRB (the most popular lower bound in estimation theory) is not suitable for evaluating the performance of a CSOE scheme in these networks as the regularity conditions are violated [10]. Guruswamy et al. [10] addressed this issue and developed performance lower bounds for an invariant clock offset estimation scheme for PTP assuming knowledge of the clock skew. Building on their previous work of [10], Guruswamy et al. [11] developed minimax optimum clock offset estimation schemes for PTP under the squared error loss function.

Following the work of [11], we will for the first time, develop minimax optimum CSOE schemes for PTP in this paper. The problem of estimating the clock skew and offset in the presence of PDV falls under a variant of the location-scale parameter problems [16], with the unknown clock skew as the scale parameter and the unknown clock offset as the location parameter. Fixing the loss function to skew-normalized squared error loss, we use invariant decision theory (see chapter 6 of [16]) to design the optimum invariant CSOE scheme. Then, using results from [16, 17, 18, 19], we show the developed optimum invariant CSOE schemes are minimax optimum under the skew-normalized squared error loss function. Simulation results indicate the minimax optimum estimators exhibit a lower mean square estimation error than the estimators available in the literature in a variety of network scenarios, and theoretical results which prove the minimax optimum schemes must be as good or better are also provided.

Notations: We use bold upper case, bold lower case, and italic lettering to denote matrices, column vectors and scalars respectively. The notations and denote the transpose and Kronecker product respectively. stands for a -dimensional identity matrix and denotes a column vector of length with all the elements equal to . Further, denotes the set of real numbers, denotes the set of positive real numbers, denotes the set of non-negative real numbers and denotes the indicator function having the value when and when .

## Ii Signal Model and Problem Statement

In this section, we briefly describe the two-way message exchange scheme used in IEEE 1588 and present the considered problem statement. Recall that the relative clock skew and offset of the slave node with respect to the master node are denoted by and , respectively. Assuming rounds of two-way message exchanges, the following sequence of messages are exchanged between the master and slave node during the round of message exchanges (). For each , the master node initiates a two-way message exchange by sending a sync packet to the slave at time . The value of is later communicated to the slave via a follow_up message. The slave node records the time of reception of the sync message as . The slave node sends a delay_req message to the master node while recording the time of transmission as . The master records the time of arrival of the delay_req packet at time and this value is later communicated to the slave using a delay_resp packet. This procedure can be mathematically modeled as [6, 7, 8, 9]

 t2i = (t1i+dms+w1i)ϕ+δ, (1) t3i = (t4i−dsm−w2i)ϕ+δ (2)

for . In (II) and (II), and denote the fixed propagation delays in the master-to-slave forward path and slave-to-master reverse path respectively. The variables and denote the random queuing delays in the forward and reverse path respectively. Define for and for . The joint pdf of is defined as , for . In our work, we assume the queuing delays in the forward and reverse path are independent. Following [7, 11], we consider two observation models based on the amount of information available regarding the fixed path delays and :

#### Ii-1 Known fixed delay model (K-Model)

In this model, we assume complete knowledge of the fixed-path delays and . The received timestamps can be arranged in vector form as follows

 y = uϕ+δ12P, (3)

where, from (II) and (II), we have , and with and . The unknown parameters in this model are and .

#### Ii-2 Standard model (S-Model)

Freris et al. [21] provided some necessary conditions for obtaining a unique solution for the system of equations given in (II) and (II). We need to know either one of the fixed path delays (either or ), or have a prior known affine relationship between the fixed delays (see Theorem 4 in [21]). In this model, we assume a prior known affine relationship between the fixed path delays. For simplicity, we assume the fixed path delays are equal, i.e., , where represents the unknown fixed path delay in the master-slave communication path111We should mention here that the proposed estimators are also applicable when there is a prior known affine relationship between and , i.e., , where the constants and are known.. The received time stamps can be arranged in vector form as follows

 y = (hd+v)ϕ+δ12P, (4)

where with and , , and . The unknown parameters in this model are , and .

Problem Statement: In this paper, we look to develop CSOE schemes for estimating and from the received timestamps for the considered observation models.

## Iii Statistical Preliminaries

The purpose of this section is to formalize the concept of invariance by defining groups of transformations over parameter and observation spaces. To this end, we repeat several essential definitions from [16] to establish some concepts of invariant estimation theory. It is assumed throughout this section that the observed data is characterized by the pdf , which depends upon the vector of unknown parameters with the corresponding parameter space .

Suppose we are interested in estimating an unknown scalar parameter . Let denote an estimator of , denote the estimate of obtained using the estimator on , and denote the considered loss function. The performance of the estimator can be characterized by the following [17]:

1. The conditional risk of an estimator

 R(ψ,θ) = ∫RNL(ψ(x),θ)f(x|θ)dx, (5)
2. The maximum risk of an estimator

 M(ψ) = supθ∈ΘR(ψ,θ), (6)
3. The average risk of an estimator

 B(ψ,p) = ∫θ∈ΘR(ψ,θ)p(θ)dΘ, (7)

where is a prior distribution defined over .

In our work, we are primarily interested in developing minimax estimators, that is estimators that minimize the maximum risk over all possible estimators of the parameter of interest. We first present the definition of a minimax estimator from [17].

###### Definition 1 (Minimax estimators).

An estimator of is said to be a minimax estimator of for the considered loss function, if

 M(ψMinMax) = infψM(ψ)=infψsupθ∈ΘR(ψ,θ). (8)

We use the approach given in [16] (see Chapter 5) to design a minimax estimator of . We first construct the optimum invariant estimator of for a considered (invariant) loss function and then show the optimum invariant estimator is a minimax estimator of for the considered loss function.

We now present some important definitions from [16] with regards to invariant estimation theory. A measurable function is called a transformation on . If and are two transformations on , the composition of and , denoted by , is defined as for . We are now ready to define a group of transformations.

###### Definition 2 (Section 6.2.1, [16]).

A group of transformations on , denoted by , is a set of one-to-one and onto transformations which satisfy the following conditions:

• If and , then .

• If , then , the inverse transformation defined by the relation , is in .

• The identity transformation , defined by , is in .

Let denote the class of all densities for and denote a group of transformations on .

###### Definition 3 (Section 6.2.2, [16]).

The family of densities is said to be invariant under , if for every and , there exists a unique such that has density . We denote as .

###### Remark.

If is invariant under , then

 ¯G = {¯g:g∈G} (9)

is a group of transformations on [16]. We now present a simple example to illustrate these ideas.

###### Example 1.

Let and be a known density. Consider the class of densities of the form

 f(x|θ) = 1σNh(x−μ1TNσ), (10)

where (location parameter) and (scale parameter) are both unknown. From Definition , the class of such densities is invariant under the group of location-scale transformations (see Example 5, Section 6.2.1, [16]) , on , defined as

 Gaffine = {ga,b(m):ga,b(m)=am+b1N} % where a∈R+,b∈R and m∈RN, (11)

since has the density . The group, , of induced transformations on , is given by

 ¯Gaffine = {¯ga,b(μ,σ):¯ga,b(μ,σ)=(aμ+b,aσ)}. (12)

To be invariant, an estimation problem must have a loss function which is unchanged by the relevant transformations. We now present the definition of an invariant loss function.

###### Definition 4 (Section 6.2.2, [16]).

Let be invariant under the group and be an estimate of . A loss function is said to be invariant under , if for every , there exists an such that for all . We denote by .

In an invariant estimation problem, the formal structures of the statistical distributions of and are identical. Hence the invariance principle states that the estimates obtained from and , using an estimator must be related [16]. We now present the definition of an invariant estimator.

###### Definition 5 (Section 6.2.3, [16]).

Let denote an estimator of , and denote the estimate of obtained from the received observations characterized by the pdf . We say is invariant under group if for all and ,

 ψ(g(x)) = ~g(ψ(x)). (13)
###### Example (Example 1 continued).

Let and denote estimators of and , respectively and let and denote the estimates obtained from . From Definition 5, the estimators and are invariant under defined in (1), if

 ^μ(ga,b(x))=a^μ(x)+b, and ^σ(ga,b(x))=a^σ(x). (14)

for all . From Definition , the loss functions for and , defined by

 Lμ(^μ(x),[μ,σ])=(μ−^μ(x))2σ2, and Lσ(^σ(x),[μ,σ])=(σ−^σ(x))2σ2, (15)

respectively, are invariant under from (1), since

 (^μ(x)−μ)2σ2=(^μ(ga,b(x))−(aμ+b))2a2σ2, and (^σ(x)−σ)2σ2=(^σ(ga,b(x))−aσ)2a2σ2, (16)

for all from (1). The loss functions given in (III) are called the scale-normalized squared error loss.

We now present an important definition regarding the transitivity of the group of transformations on and the conditional risk of invariant estimators.

###### Definition 6 (Section 6.2.3, [16]).

A group of transformations of is said to be transitive if for any , there exists some for which .

###### Theorem 1 (Section 6.2.3, [16]).

When is transitive and the loss function is invariant, the conditional risk of an invariant estimator of , is constant for all .

###### Remark.

If the group of transformations on is transitive, and is an invariant estimator of , we have

 R(ψ,θ)=M(ψ)=B(ψ,p), (17)

for any defined over .

When is transitive, we can construct the optimum (or minimum conditional risk) invariant estimator under , when the loss function is invariant under using the theory from [16]. In this paper, we use the concepts of invariant estimation theory to design the optimum invariant CSOE schemes under the K-model (see (II-1)) and S-model (see (II-2)). As we are primarily interested in estimating and , we consider the loss functions defined by

 L1(aδ,θ) = (aδ−δ)2ϕ2, (18)

and

 L2(aϕ,θ) = (aϕ−ϕ)2ϕ2, (19)

for and , respectively222As seen in equations (II-1) and (II-2), the unknown clock skew is similar to the unknown scale parameter in Example 1 and is multiplied with the random queuing delays.. In (III) and (III), and denote estimates of and , respectively, in case of the K-model, and for the S-model333We are not interested in estimating the value of , as it is a nuisance parameter.. We then use results from [16, 19, 18, 17] to show the derived optimum invariant estimators of and are minimax for the skew-normalized squared error loss functions defined in (III) and (III), respectively.

## Iv Minimax optimum CSOE scheme under K-model

We now apply invariant decision theory to derive the optimum invariant estimator of and in the K-model. Recall from (II-1), the observations under the K-model can be represented as

 y = uϕ+δ12P, (20)

where , , and . Let denote the vector of unknown parameters. The parameter space of , denoted by , is given by

 Θ = {(ϕ,δ):ϕ∈R+,δ∈R}. (21)

From (II-1), we have , , and

 f(y|θ) = 1ϕ2Pfu(y−δ12Pϕ)=1ϕ2Pfu1(t2−δ1TPϕ)fu2(t3−δ1TPϕ), (22) = 1ϕ2Pfw1(t2−δ1TPϕ−dms1TP−t1)fw2(δ1TP−t3ϕ−dsm1TP+t4). (23)

Let denote the class of all densities for . The class of such densities is invariant under the group of location-scale transformations (see Example 5, Section 6.2.1, [16]) , on , defined as

 GKModel={ga,b(m):ga,b(m)=am+b12P} where a∈R+,b∈R and m∈R2P, (24)

since has the density . The group, , of induced transformations on defined in (IV), is given by

 ¯GKModel={¯ga,b((ϕ,δ)):¯ga,b((ϕ,δ))=(aϕ,(aδ+b))}, (25)

where and .

Let and denote estimators of and , respectively and let and denote the estimates obtained from the received data characterized by the pdf . The estimators and are invariant under from (IV) if for all

 ^δI(ga,b(y))=^δI(ay+b12P) = a^δI(y)+b, (26) ^ϕI(ga,b(y))=^ϕI(ay+b12P) = a^ϕI(y). (27)

Further, the skew-normalized loss functions defined in (III) and (III) for and , respectively, are invariant under from (IV), since

 (^δI(y)−δ)2ϕ2=(^δI(ga,b(y))−(aδ+b))2a2ϕ2, and (^ϕI(y)−ϕ)2ϕ2=(^ϕI(ga,b(y))−aϕ)2a2ϕ2 (28)

for all . We now present the minimax optimum estimators of and under the K-model.

###### Proposition 1.

The optimum (or minimum conditional risk) invariant estimators of and , denoted by and , respectively, under defined in (IV), for the skew-normalized squared error loss function defined in (III) and (III), respectively, are given by

 ^δMinRisk(y) = ∫R+∫Rδϕ3f(y|θ)dδdϕ∫R+∫R1ϕ3f(y|θ)dδdϕ, (29) ^ϕMinRisk(y) = ∫R+∫R1ϕ2f(y|θ)dδdϕ∫R+∫R1ϕ3f(y|θ)dδdϕ, (30)

where . Further, the derived optimum invariant estimators are minimax for the skew-normalized squared error loss.

###### Proof.

As defined in (IV) is transitive on , the optimum invariant estimator of under in (IV), denoted by , can be obtained by solving (See Result 3 in Section 6.6.2 of [16])

 ^δMinRisk(y)=argmin^δ∫ΘL1(^δ(y),θ)πr(θ|y)dθ=argmin^δ∫Θ(^δ(y)−δ)2ϕ2πr(θ|y)dθ, (31)

where is the posterior density of based on the right invariant prior on (see Section 6.6.1, [16])444The right invariant prior density need not be an actual density [16] (See section 6.6, page 409).. The right invariant prior for the location-scale group was derived in [16] (see Section 6.6). As from (IV) is a location-scale group, the right invariant prior density for is given by

 πr(θ) = 1ϕIR+(ϕ)IR(δ). (32)

To find , we differentiate the objective function in (IV) with respect to and set the result equal to zero (Section 2.4.1, [22]). We obtain

 ^δMinRisk(y) = ∫R+∫Rδϕ2πr(θ|y)dθ∫R+∫R1ϕ2πr(θ|y)dθ=∫R+∫Rδϕ3f(y|θ)dθ∫R+∫R1ϕ3f(y|θ)dθ. (33)

Similarly, the optimum invariant estimator of under in (IV), denoted by , can be obtained by

 ^ϕMinRisk(y) = argmin^ϕ∫Θ(^ϕ(y)−ϕ)2ϕ2πr(θ|y)dθ. (34)

Solving using the same derivative-based approach, we obtain

 ^ϕMinRisk(y) = ∫R+∫R1ϕ2f(y|θ)dδdϕ∫R+∫R1ϕ3f(y|θ)dδdϕ. (35)

When the class of densities is invariant under the location-scale group, it was shown in [18] that the optimum invariant estimator of a parameter for an invariant loss function is also a minimax estimator of the parameter for the considered loss function. As the class of densities is invariant under in (IV) (a location-scale group), and the scale invariant loss function is invariant under , the optimum invariant estimators and , are minimax optimum estimators of and , respectively, for the skew-normalized squared error loss functions given in (III) and (III), respectively. ∎

We now present an important result with regards to the mean square estimation error performance of the minimax optimum estimators when compared to ML estimators. Let and denote estimators of and , respectively. The Mean Square estimation Errors (MSEs) of and , denoted by and , respectively, are defined as

 MSE(^δ)=E{(^δ−δ)2|θ}, and MSE(^ϕ)=E{(^ϕ−ϕ)2|θ}, (36)

where denotes the expectation operator and is the vector of unknown parameters.

###### Proposition 2.

Let and denote the ML estimators of and , respectively. Under the K-model, the MSE of is always greater than or equal to the MSE of . Also, under the K-model, the MSE of is always greater than or equal to the MSE of .

###### Proof.

In the K-model, we have . Let and denote the ML estimates obtained from characterized by the pdf from (IV). We have

 ^θMLE(y)=[^ϕMLE(y),^δMLE(y)] = argmaxθlogL(θ|y), (37)

where is the likelihood function and is equal to . Let from (IV) and define . From (IV), the corresponding transformation of the parameter vector is given by . From the functional invariance of ML estimators [23] (see Chapter 7, Theorem 7.2.10), we have . So, we have the following relationship

 ^δMLE(yg)=a^δMLE(y)+b, and ^ϕMLE(yg)=a^ϕMLE(y). (38)

As this holds true for all from (IV), the ML estimators of and are invariant under as they satisfy (IV) and (IV). For the skew-normalized loss function defined in (III), we have

 R(^δMinRisk,θ) ≤ R(^δMLE,θ), (39)

since is the optimum invariant estimator under in (IV) and achieves the minimum conditional risk among all estimators that are invariant under (see Proposition 1). From (IV), we have

 ∫R2P(^δMinRisk(y)−δ)2ϕ2f(y|θ)dy ≤ ∫R2P(^δMLE(y)−δ)2ϕ2f(y|θ)dy, (40) ⟹∫R2P(^δMinRisk(y)−δ)2f(y|θ)dy ≤ ∫R2P(^δMLE(y)−δ)2f(y|θ)dy, (41) ⟹MSE(^δMinRisk) ≤ MSE(^δMLE). (42)

Following similar steps, we can show that . ∎

## V Minimax optimum CSOE scheme under S-model

We now apply invariant decision theory to derive the optimum invariant estimator of and under the S-model. Recall from (II-2), the observations under the S-model can be represented as

 y = (hd+v)ϕ+δ12P, (43)

where , , and . As the unknown fixed delay is always non-negative, we have . However, it is not possible to design invariant estimators under this constraint555When , it is not possible to construct a group of transformations for which the class of densities in the S-model is invariant under the group of transformations., so we assume , but later we see this is not a problem as we derive the minimax optimum estimator in Proposition 3. Let denote the vector of unknown parameters. The unrestricted parameter space of , denoted by , is given by

 Θ = {(ϕ,d,δ):ϕ∈R+,d∈R,δ∈R}, (44)

and the restricted parameter space of , denoted by , is given by

 Θ∗ = {(ϕ,d,δ):ϕ∈R+,d∈R+0,δ∈R}. (45)

From (II-2), we have , , and

 f(y|θ) = 1ϕ2Pfv(y−δ12Pϕ−hd)=1ϕ2Pfv1(t2−δ1TPϕ−d1TP)fv2(t3−δ1TPϕ+d1TP), (46) = 1ϕ2Pfw1(t2−δ1TPϕ−d1TP−t1)fw2(δ1TP−t3ϕ−d1TP+t4).

Let denote the class of all densities for . The class of such densities is invariant under the group of transformations , on , defined as

 GSModel = {ga,b,c(m):ga,b,c(m)=a(m+hb)+c12P}, (47)

where , since has the density . The group, , of induced transformations on defined in (V), is given by

 ¯GSModel={¯ga,b,c((ϕ,d,δ)):¯ga,b,c((ϕ,d,δ))=(aϕ,(d+b/ϕ),(aδ+c))}, (48)

where