Privacy-Preserving Nonlinear Observer Design Using Contraction Analysis
Real-time signal processing applications are increasingly focused on analyzing privacy-sensitive data obtained from individuals, and this data might need to be processed through model-based estimators to produce accurate statistics. Moreover, the models used in population dynamics studies, e.g., in epidemiology or sociology, are often necessarily nonlinear. This paper presents a design approach for nonlinear privacy-preserving model-based observers, relying on contraction analysis to give differential privacy guarantees to the individuals providing the input data. The approach is illustrated in two applications: estimation of edge formation probabilities in a dynamic social network, and syndromic surveillance relying on an epidemiological model.
The development of many recent technological systems, such as location-based services, the “Internet of Things”, or electronic biosurveillance systems, relies on the analysis of personal data originating from generally privacy-sensitive participants. In many cases, the system is only interested in producing aggregate statistics from these individual data streams, e.g., a dynamic map showing road traffic conditions or an estimate of power consumption in a neighborhood, but even though aggregation helps, significant privacy breaches cannot be ruled out a priori [1, 2, 3]. This is mainly due to the possibility of correlating the system’s output with other publicly available data. The integration of privacy-preserving mechanisms with formal guarantees into such systems would help alleviate some of the justified concerns of the participants and encourage wider adoption.
While various information-theoretic definitions of privacy exist and are potentially applicable to the real-time processing of data streams [4], we focus on the notion of differential privacy, which originates from the database and cryptography literature [5]. A differentially private mechanism publishes information about a dataset in a way that is not too sensitive to a single individual’s data. As a result, it becomes difficult to make inferences about that individual from the published output. Previous work on the design of linear filters with differential privacy guarantees includes [6, 7, 8, 9, 10]. The problem studied in this paper is that of designing privacy-preserving nonlinear model-based estimators, which to the best of our knowledge has not been studied in a general setting before.
A convenient way of achieving differential privacy for an estimator is to bound its so-called sensitivity, a form of incremental system gain between the private input signal and the published output. Various tools can be used for this purpose, and here we rely on contraction analysis, see, e.g., [11, 12, 13, 14] and the references therein.
The rest of the paper is divided as follows. Section II presents the problem statement formally, provides a brief introduction to the notion of differential privacy, and describes privacy-preserving mechanisms with input and output perturbation. In Section III we develop a type of “vanishing-input vanishing-output” property of contracting systems similar to the one presented in  but stated here for discrete-time systems. This result is then applied in Section IV to the design of differentially private observers with output perturbation. The methodology is illustrated via two examples. In Section V, we consider the problem of estimating link formation probabilities in a dynamic social network, with a nonlinear measurement model. In Section VI, we consider a nonlinear epidemiological model and design a differentially private estimator of the proportion of susceptible and infectious people in a population, assuming a syndromic data source.
Notation: In this paper, denotes the set of non-negative integers. For a linear map between finite dimensional vector spaces and equipped with the norms and respectively, we denote by its induced norm. If and both spaces are equipped with the same norm , we simply write .
II Problem Statement
II-A Observer Design
Suppose that we can measure a discrete-time signal for which we have a state-space model of the form
where are noise signals capturing the uncertainty in the model, for some , and for some . The goal is to reconstruct from an estimate of the state that we denote , i.e., we want to build a state observer, which we assume in this paper to be of the simple Luenberger-type form
where is a sequence of gain matrices to determine.
In the applications discussed later in the paper, the signal is collected from privacy-sensitive individuals, hence needs to be protected. On the other hand, the model (1), (2), i.e., the functions , , is assumed to be publicly available. The data aggregator wishes to release the signal produced by (3) publicly as well. However, since depends on the sensitive signal , we will only allow the release of an approximate version of carrying certain privacy guarantees detailed formally in the next subsection. We will later see that the gain matrices need to be carefully chosen to balance accuracy or speed of the observer on the one hand and the level of privacy offered on the other hand.
Note that we do not provide here nor use in our designs any model of the noise signals and , which are simply used as a device to explain the discrepancy between any measured signal and the signal predicted by a deterministic model.
II-B Differential Privacy
A differentially private version of the observer (3) should produce an output that is not too sensitive to certain variations associated with an individual’s data in the input signal . The formal definition of differential privacy is given in Definition 1 below. An individual’s signal could correspond to a specific component of , or could already represent a signal aggregated from many individuals. We specify first the type of variations in that we want to make hard to detect by defining a symmetric binary relation, denoted Adj, on the space of datasets of interest, here the space of signals . We consider here the following adjacency relation
where is a specified norm on , and , are given constants. In other words, we aim at providing differential privacy guarantees for transient deviations starting at any time that subsequently decrease geometrically. Note that in [6, 7] the authors consider for the design of a differentially private counter an adjacency condition where the (scalar) input signals can vary by at most one and at a single time period. In comparison, our adjacency condition (4) greatly enlarges the set of signal deviations associated to an individual for which we aim to provide guarantees.
Differentially private mechanisms necessarily randomize their outputs, so that they satisfy the following property.
Let be a space equipped with a symmetric binary relation denoted Adj, and let be a measurable space. Let . A mechanism is -differentially private for Adj if for all such that , we have
If , the mechanism is said to be -differentially private.
This definition quantifies the allowed deviation for the output distribution of a differentially private mechanism, when the variations at the input satisfy the adjacency relation. Smaller values of and correspond to stronger privacy guarantees. In this paper, the space was defined as the space of input signals , the adjacency relation considered is (4), and the output space is the space of output signals for the observer. We then wish to publish an accurate estimate of the state while satisfying the property of Definition 1 for specified values of and .
II-C Sensitivity and Basic Mechanisms
Enforcing differential privacy can be done by randomly perturbing the published output of a system, at the price of reducing its utility or quality. Hence, we are interested in evaluating as precisely as possible the amount of noise necessary to make a mechanism differentially private. For this purpose, the following quantity plays an important role.
Let be a positive integer. The -sensitivity of a system with inputs and outputs with respect to the adjacency relation Adj is defined by
where by definition for a vector-valued signal, where has components .
In practice we will be interested in the sensitivity of a system for the cases and . The basic mechanisms of Theorem 1 below (see  for proofs and references), can be used to produce differentially private signals. First, we need the following definitions. A zero-mean Laplace random variable with parameter has the pdf , and its variance is . The -function is defined as . Now for , , let and define , which can be shown to behave roughly as .
Let be a system with inputs and outputs. Then the mechanism , where all , are independent Laplace random variables with parameter , is -differentially private for Adj. If is instead a white Gaussian noise with covariance matrix , the mechanism is -differentially private.
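As a rough illustration of how the mechanisms of Theorem 1 are calibrated, the following sketch computes noise levels from a given sensitivity. It assumes the usual calibrations: a Laplace parameter b = Δ1/ε, and a Gaussian standard deviation κ(δ,ε)Δ2 with κ(δ,ε) = (K + sqrt(K² + 2ε))/(2ε) and K = Q⁻¹(δ), which is the form we take from [9]. All function names and numerical values are illustrative assumptions and do not come from the paper.

```python
import math
import random
from statistics import NormalDist


def laplace_parameter(l1_sensitivity, eps):
    """Laplace parameter b for eps-differential privacy (assumed b = Delta_1 / eps)."""
    return l1_sensitivity / eps


def kappa(delta, eps):
    """Assumed form kappa = (K + sqrt(K^2 + 2*eps)) / (2*eps), with K = Q^{-1}(delta)."""
    # Q(x) = 1 - Phi(x), hence Q^{-1}(delta) = Phi^{-1}(1 - delta).
    k = NormalDist().inv_cdf(1.0 - delta)
    return (k + math.sqrt(k * k + 2.0 * eps)) / (2.0 * eps)


def gaussian_std(l2_sensitivity, delta, eps):
    """Standard deviation of the Gaussian noise for (eps, delta)-differential privacy."""
    return kappa(delta, eps) * l2_sensitivity


def laplace_sample(b, rng):
    """Zero-mean Laplace sample with parameter b (variance 2*b^2)."""
    # The difference of two i.i.d. exponentials with mean b is Laplace(b).
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)


def perturb(outputs, noise_sampler):
    """Output perturbation: add an independent noise sample to each output."""
    return [z + noise_sampler() for z in outputs]


# Hypothetical sensitivities and privacy parameters.
rng = random.Random(0)
eps, delta = math.log(3.0), 0.05
b = laplace_parameter(0.5, eps)
sigma = gaussian_std(0.5, delta, eps)
private = perturb([0.1, 0.2, 0.3], lambda: laplace_sample(b, rng))
```

Both noise levels scale linearly with the sensitivity, which is why the rest of the design effort goes into bounding that quantity.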
II-D Input and Output Perturbation
We see that the amount of noise necessary for differential privacy with the mechanisms of Theorem 1 is proportional to or to . A very useful additional result stated here informally says that post-processing a differentially private signal without re-accessing the privacy-sensitive input signal does not change the differential privacy guarantee [9, Theorem 1]. Now in Theorem 1 the system can simply be the identity, whose - and - sensitivity for the adjacency relation (4) when is the -norm or the -norm are and respectively. This immediately gives a first possible design for our privacy-preserving observer, simply adding Laplace or Gaussian noise directly to the input signal , see Fig. 1 a). Moreover the observer can then be designed to mitigate the effect of this input noise, whose distribution is known. We call this design an input perturbation mechanism. Note also that for close to , can be significantly smaller than , so that sacrificing some in the privacy guarantee to use the -sensitivity can provide better accuracy.
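The sensitivities of the identity system can be checked numerically. The sketch below assumes that the adjacency relation (4) bounds the norm of the deviation at time k0 + j by K α^j, so that the ℓ1- and ℓ2-sensitivities are the corresponding norms of this geometric sequence. The names `K` and `alpha` and the numerical values are assumptions made for illustration.

```python
import math


def identity_l1_sensitivity(K, alpha):
    """sum_{j>=0} K * alpha**j = K / (1 - alpha)."""
    return K / (1.0 - alpha)


def identity_l2_sensitivity(K, alpha):
    """sqrt(sum_{j>=0} (K * alpha**j)**2) = K / sqrt(1 - alpha**2)."""
    return K / math.sqrt(1.0 - alpha * alpha)


def truncated_norms(K, alpha, horizon):
    """l1 and l2 norms of the largest allowed deviation K*alpha**j, j < horizon."""
    dev = [K * alpha ** j for j in range(horizon)]
    return sum(dev), math.sqrt(sum(d * d for d in dev))


K, alpha = 1.0, 0.9  # hypothetical values
d1 = identity_l1_sensitivity(K, alpha)
d2 = identity_l2_sensitivity(K, alpha)
n1, n2 = truncated_norms(K, alpha, 2000)
```

For these values the ℓ1-sensitivity is about 10 while the ℓ2-sensitivity is about 2.3, which illustrates why the Gaussian mechanism can be preferable when the decay rate is close to 1.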
The simple input perturbation mechanism is attractive and can perform well. However, it can also exhibit the following drawbacks. First, the convergence of nonlinear observers is often local, and adding noise at the input can lead to poor performance and perhaps divergence of the estimate from the true trajectory. Second, characterizing the output error due to the privacy-preserving noise requires understanding how this noise is transformed after passing through the nonlinear observer. In general, at the output the noise distribution can become multimodal and the noise non-white and non-zero-mean, creating in particular a systematic bias that can be hard to predict. An alternative is the output perturbation mechanism, shown in Fig. 1 b). In this case the privacy-preserving noise is added after the observer denoted , which from Theorem 1 requires computing the sensitivity of . We should then try to design an observer that has both good tracking performance for the state trajectory and low sensitivity, in order to reduce the output noise necessary, and we focus on this issue in the following. As shown in Fig. 1 b), we can also add a smoothing filter at the output to filter out the Laplace or Gaussian noise, although this will generally affect some transient performance measure of the overall system. We do not discuss the design of the smoothing filter in this paper.
Consider the memoryless system and the adjacency relation (4) for , so that we have a deviation at a single time period of at most between and . Consider then the Gaussian mechanism, and let’s assume . For the input perturbation scheme, the signal is differentially private when is a standard Gaussian white noise. In this case, the privacy-preserving noise at the input induces a systematic bias at the output between and equal to .
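The systematic bias created by input perturbation can be illustrated by simulation. The sketch below uses a hypothetical squaring map y ↦ y² as the memoryless nonlinearity, chosen here only for illustration. For this map the bias E[(y + σw)²] − y² equals σ² for a standard Gaussian w, whatever the value of y. The value of `sigma` is also an assumption.

```python
import random


def input_perturbation_bias(y, sigma, n_samples, rng):
    """Monte Carlo estimate of E[(y + sigma*w)^2] - y^2 for w ~ N(0, 1)."""
    total = 0.0
    for _ in range(n_samples):
        total += (y + sigma * rng.gauss(0.0, 1.0)) ** 2
    return total / n_samples - y * y


rng = random.Random(1)
sigma = 1.5  # hypothetical noise level
bias = input_perturbation_bias(y=2.0, sigma=sigma, n_samples=100_000, rng=rng)
# Analytically, the bias of the squaring map is sigma**2, independently of y.
```

With output perturbation, zero-mean noise is added after the nonlinearity, so no such bias appears in the published signal.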
III Contracting Systems
In the rest of the paper we focus on output perturbation mechanisms, as described in Fig. 1 b), and we use contraction theory to bound the sensitivity and hence compute the noise level necessary for privacy. Contraction theory has seen significant developments in the past two decades, see, e.g., [11, 14, 12, 13] and the references therein for earlier work. In this section, we present some results that we rely on later in the paper. Proofs of these results are given for completeness, since most results in this area are typically stated for continuous-time rather than discrete-time systems.
Consider a discrete-time system
Let be a nonnegative constant. The system (6) is said to be -contracting for the norm on a forward invariant set if for any and any two initial conditions , we have, for all ,
A sufficient condition for the system (6) to be -contracting for a norm on a convex forward invariant set is that
where is the Jacobian matrix of at and is the matrix norm induced by .
Consider the path , for , between the initial conditions and . This path is transported into the sequence of functions Now define the tangent vectors We obtain immediately
Then, with and ,
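The sufficient condition of Theorem 2 can be checked numerically on an example. The sketch below uses a hypothetical time-invariant map on R² (not a model from this paper) with the infinity norm, whose induced matrix norm is the maximum absolute row sum. It evaluates the Jacobian norm on a grid and then verifies that two trajectories approach each other at least at the resulting rate.

```python
import math


def step(x):
    """Hypothetical time-invariant map f on R^2 (not from the paper)."""
    x1, x2 = x
    return (0.5 * math.sin(x1) + 0.2 * x2, 0.3 * x1 + 0.4 * math.tanh(x2))


def jacobian(x):
    """Jacobian matrix of step at x."""
    x1, x2 = x
    return ((0.5 * math.cos(x1), 0.2),
            (0.3, 0.4 / math.cosh(x2) ** 2))


def induced_inf_norm(m):
    """Matrix norm induced by the vector infinity norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in m)


def inf_dist(x, z):
    """Infinity-norm distance between two points."""
    return max(abs(a - b) for a, b in zip(x, z))


# Sampled check of the condition sup_x ||F(x)|| <= rho (the grid contains the
# origin, where the norm of this particular Jacobian is largest).
grid = [-3.0 + 0.25 * j for j in range(25)]
rho = max(induced_inf_norm(jacobian((a, b))) for a in grid for b in grid)

# Two trajectories should then approach each other at least at rate rho.
x, z = (2.0, -1.0), (-1.5, 2.5)
d0 = inf_dist(x, z)
gaps = []
for k in range(1, 21):
    x, z = step(x), step(z)
    gaps.append((k, inf_dist(x, z)))
```

A grid check of this kind is only a heuristic in general. Here it is exact because the supremum of the Jacobian norm is attained at a grid point.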
For any positive definite matrix , defines a norm on . Specializing the condition of Theorem 2 to this norm, we obtain the following result.
Let be a positive definite matrix. A sufficient condition for the system (6) to be -contracting for the norm on a convex forward invariant set is that the following Linear Matrix Inequalities (LMI) are satisfied
Condition (8) for the matrix norm induced by can be rewritten where denotes the induced 2-norm of the matrix , i.e., its largest singular value, and is the positive-definite square root of . The equivalence with the LMI is immediate.
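The weighted-norm condition can be tested directly for a given Jacobian sample. The sketch below assumes the LMI takes the form FᵀPF ⪯ ρ²P, which is equivalent to the induced 2-norm of P^{1/2} F P^{-1/2} being at most ρ. It is restricted to 2×2 matrices so that positive semidefiniteness reduces to a trace and determinant test. The matrices `F` and `weighted` are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= -tol and det >= -tol


def satisfies_lmi(F, P, rho):
    """Check rho^2 * P - F^T P F >= 0 for one 2x2 Jacobian sample F."""
    FtPF = matmul(transpose(F), matmul(P, F))
    gap = [[rho * rho * P[i][j] - FtPF[i][j] for j in range(2)] for i in range(2)]
    return is_psd_2x2(gap)


F = [[0.5, 2.0], [0.0, 0.5]]          # hypothetical Jacobian sample
identity = [[1.0, 0.0], [0.0, 1.0]]   # plain Euclidean norm
weighted = [[1.0, 0.0], [0.0, 20.0]]  # hypothetical weight matrix P

euclidean_ok = satisfies_lmi(F, identity, 0.9)
weighted_ok = satisfies_lmi(F, weighted, 0.9)
```

In this example the condition fails for the Euclidean norm but holds for the weighted norm, which shows why the choice of P matters. For a state-dependent Jacobian the check must hold at every point of the invariant set.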
Contraction theory can be developed in a more general differential geometric framework [11, 13], which we do not use here however, for simplicity of exposition and also because some of the needed explicit calculations become more difficult, e.g., requiring the computation of non-trivial geodesic paths and distances.
Under conditions such as that of Theorem 2, cascades of contracting systems are again contracting [11, 12]. Consider the system (6) on equipped with the norm , and assume that it satisfies condition (8). Then, consider another system with equipped with the norm , and assume that we have the bounds
where , , are nonnegative constants, is convex and is forward invariant for the coupled system.
Let . Then
which proves (12). Note that in Theorem 3 we need to choose large enough to satisfy the condition to show that pairwise trajectories of the cascade system are effectively converging toward each other. We can now prove the following result, which will be our main tool in the following.
Consider a (contracting) system on
and the modified system
where denotes a perturbation input. Suppose that there exists such that for , and
for some constants . Finally, suppose that we have the contraction condition
If , then for , and any , we have
where , and
Following the idea in [12, Lemma 4] for example, we consider the following cascade system with
For the initial condition at , we obtain a trajectory of the unperturbed system (13), whereas for the initial condition , we obtain a trajectory of the perturbed system (14). The scalar system is -contracting. For each , the -system is -contracting by (16). Moreover, the differential of the second vector field with respect to is , which is bounded by from (15). Hence, applying the result of Theorem 3, for any the overall system is contracting with respect to the norm (where ), with rate , so
IV Differentially Private Observers with Output Perturbation
Let us now return to our initial differentially private observer design problem with output perturbation. We can rewrite the system (3) in the form For a measured signal adjacent to according to (4), we then get the observer state trajectory
where . We can now use the gain matrices to attempt to design a contractive observer (in order for to converge to ), while at the same time minimizing the “gain” of the map . The proof of the following proposition follows immediately from Theorem 4.
for some constant , where , , and is a convex forward invariant set for (3) and (17). Then for the two trajectories and of (3) corresponding to the inputs and (and assuming the same initial condition for our observer), we have for any
where , and is the time period where and start to potentially differ according to (4).
Note in the previous proposition that the choice of has an impact both on and . Increasing the gain can help decrease the contraction rate , but at the same time it increases , forcing us to increase so that . Hence in general we should look to achieve a reasonable contraction rate with the smallest gain possible, in order to reduce the overall system sensitivity (in the sense of Section II-C). We conclude this section with two corollaries of Proposition 1 providing differentially private observers with output perturbation.
From the bound of Proposition 1, we deduce that is a differentially private signal, where is a Gaussian white noise with covariance matrix and is the matrix square root of . Hence is also differentially private and we defined .
We thus have two differentially private mechanisms with output perturbation, provided we can design the matrices to verify the assumptions of Proposition 1 with the - or -norm on . The next sections provide application examples for the methodology.
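To see how the gain and the contraction rate enter the sensitivity, consider an observer written as x⁺ = φ(x) + L y with φ ρ-contracting. For two adjacent inputs whose deviation is bounded by K α^(k−k0), the distance e between the two observer trajectories satisfies e⁺ ≤ ρ e + ‖L‖ K α^(k−k0). Summing this convolution of two geometric sequences gives the ℓ1 bound ‖L‖ K / ((1 − ρ)(1 − α)). This is an elementary bound derived here for illustration, and not necessarily the constant obtained in Proposition 1. The scalar observer and all numerical values below are hypothetical.

```python
import math


def l1_sensitivity_bound(gain, K, alpha, rho):
    """Elementary bound gain*K / ((1 - rho)*(1 - alpha)) for a rho-contracting observer."""
    return gain * K / ((1.0 - rho) * (1.0 - alpha))


def run_observer(y, gain, x0=0.0):
    """Hypothetical scalar observer x+ = phi(x) + gain*y with |phi'| <= 0.5."""
    x, traj = x0, []
    for yk in y:
        traj.append(x)
        x = 0.5 * math.tanh(x) + gain * yk
    return traj


gain, K, alpha, rho, k0 = 0.8, 0.1, 0.7, 0.5, 10
y = [math.sin(0.1 * k) for k in range(300)]
# Adjacent signal with the largest allowed deviation K*alpha**(k-k0) from time k0 on.
y_adj = [yk + (K * alpha ** (k - k0) if k >= k0 else 0.0) for k, yk in enumerate(y)]

x, x_adj = run_observer(y, gain), run_observer(y_adj, gain)
observed = sum(abs(a - b) for a, b in zip(x, x_adj))
bound = l1_sensitivity_bound(gain, K, alpha, rho)
```

The observed ℓ1 distance between the two estimate trajectories stays below the bound, and the bound grows with the gain and as ρ approaches 1. This is the trade-off discussed after Proposition 1.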
V Example I: Estimating Link Formation Preferences in Dynamic Social Networks
Statistical studies of networks have intensified tremendously in recent years, with one motivating application being the emergence of online social networking communities. In this section we focus on a state-space model recently proposed in [15] to describe the dynamics of link formation in networks, called the Dynamic Stochastic Blockmodel. This model combines a linear state-space model for the underlying dynamics of the network and the stochastic blockmodel of Holland et al. [16], resulting in a nonlinear measurement equation. Examples of applications of this model include mining email and cell phone databases [15], which obviously contain privacy-sensitive data.
Consider a set of nodes. Each node corresponds to an individual and can belong to one of classes. Let be the probability of forming an edge at time between a node in class and a node in class , and let denote the vector of probabilities . For example, edges could represent email exchanges or phone conversations. Edges are assumed to be formed independently of each other according to . Let be the observed density of edges between classes and , where is the number of observed edges between classes and at time , and is the maximum possible number of edges between these two classes. For simplicity, we assume that the quantities are publicly known (for example, if the class of each node is public information), and we focus on the problem of estimating the parameters using the signals . This corresponds to the “a priori” blockmodeling setting in [16, 15]. The links formed between specific nodes constitute private information however, so directly releasing or or an estimate based on them is not allowed.
If is large enough, the authors in [15] argue from the Central Limit Theorem that an approximate model where is Gaussian is justified, so that
where is a Gaussian noise vector with diagonal covariance matrix (whose entries theoretically should depend on , but this aspect is neglected in the model). Rather than defining a dynamic model for , whose entries are constrained to be between and , let us redefine the state vector to be the so-called logit of , denoted , with entries , which are well defined for . The dynamics of is assumed to be linear
where the components of are given by the logistic function applied to each entry of , i.e.,
An Extended Kalman Filter (EKF) is proposed in [15] to estimate , but we pursue here a deterministic observer design to illustrate the ideas discussed in the previous sections. Hence, we consider an observer of the form
with a constant square gain matrix. To enforce contraction as in Proposition 1, we should choose so that where is the Jacobian of at , a square and diagonal matrix with entries with indexing the pairs . The only non-linearity in the model (21), (22) comes from the observation model (22).
To simplify the following discussion, let’s assume that is also diagonal (as in [15], where the coupling between components occurs only through the non-diagonal covariance matrix ). In this case, the systems completely decouple into scalar systems, and it is natural to choose to be diagonal as well. The observer for one of these scalar systems takes the form
where is one component of and now represents just the corresponding scalar component of the measurement vector as well. Since the state space is now , the norm is simply the absolute value. For contraction, we wish to impose the condition, for some ,
Suppose that we want to design a privacy-preserving observer for the interval , or equivalently approximately. In this interval, we have
Suppose that we have . Then we must have
In general to reduce the sensitivity we should choose a small gain , which is compatible with (26) if we choose close enough to . Indeed, setting and in Proposition 1 so that (assuming ), we can verify that the sensitivity say and thus the noise parameter b in (19) decreases monotonically as increases toward . However, performance concerns for the observer should also dictate the minimum tolerable gain (with a gain , the observer is perfectly private but is not useful).
Suppose the disturbance tolerated by the adjacency relation satisfies the bound (4) with and . That is, for the pair of classes under consideration, we want to provide a differential privacy guarantee making it hard to detect a transient variation in the number of created edges, as long as this variation represents initially at most of all the edges between classes and , and subsequently decreases geometrically at rate . Concretely if edges represent phone conversations for example, this means that if an individual in class suddenly increases his call volume with class but by an amount representing less than of all calls between and , and then reduces this temporary activity at rate , detection of this event by any means from a differentially private estimate of will necessarily have a low probability of success. If a gain say is judged to be still adequate for the application in terms of tracking performance, we can take and we get in (19). If we publish with a Laplace white noise with this parameter , we obtain an -differentially private estimator of . Figure 2 illustrates the behavior of the resulting privacy-preserving observer.
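The scalar design can be sketched numerically as follows. We assume a scalar observer x⁺ = f x + l (y − h(x)), with h the logistic function, and check the contraction rate max |f − l h′(x)| on a grid of the interval of interest. The noise parameter is then computed from the elementary ℓ1 bound l K / ((1 − ρ)(1 − α)), which is used here as a stand-in for (19) and need not coincide with it. The values of `f`, `gain`, `K`, `alpha`, `eps` and the interval [−3, 3] are hypothetical, and the contraction estimate is valid only as long as the estimates remain in that interval.

```python
import math
import random


def logistic(x):
    """Logistic function h(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def contraction_rate(f, gain, lo, hi, n=601):
    """Max over a grid of [lo, hi] of |f - gain*h'(x)|, with h the logistic function."""
    rate = 0.0
    for j in range(n):
        x = lo + (hi - lo) * j / (n - 1)
        h = logistic(x)
        rate = max(rate, abs(f - gain * h * (1.0 - h)))
    return rate


def private_estimates(y, f, gain, b, rng, x0=0.0):
    """Run x+ = f*x + gain*(y - h(x)) and release x plus Laplace(b) noise."""
    x, released = x0, []
    for yk in y:
        noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
        released.append(x + noise)
        x = f * x + gain * (yk - logistic(x))
    return released


# All numerical values below are hypothetical.
f, gain, K, alpha, eps = 1.0, 0.45, 0.005, 0.5, math.log(3.0)
rho = contraction_rate(f, gain, -3.0, 3.0)
delta_1 = gain * K / ((1.0 - rho) * (1.0 - alpha))  # elementary l1 bound
b = delta_1 / eps
rng = random.Random(0)
y = [logistic(1.0) + 0.01 * rng.gauss(0.0, 1.0) for _ in range(500)]
z = private_estimates(y, f, gain, b, rng)
```

Because h′ is small near the ends of the interval, the contraction rate is close to 1 for a small gain, so the interval on which privacy is guaranteed directly constrains the achievable noise level.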
VI Example II: Syndromic Surveillance
Syndromic surveillance systems monitor health-related data in real time in a population to facilitate early detection of epidemic outbreaks [17]. In particular, recent studies have shown a correlation between certain non-medical data, e.g., search engine queries related to a specific disease, and the proportion of individuals infected by this disease in the population [18]. Although time series analysis can be used to detect abnormal patterns in the collected data, here we focus on a model-based filtering approach [19], and develop a differentially private observer for a 2-dimensional epidemiological model.
The following SIR model of Kermack and McKendrick [20, 21] models the evolution of an epidemic in a population by dividing individuals into 3 categories: susceptible (S), i.e., individuals who might become infected if exposed; infectious (I), i.e., currently infected individuals who can transmit the infection; and recovered (R) individuals, who are immune to the infection. A simple version of the model in continuous-time includes bilinear terms and reads
Here and represent the proportion of the total population in the classes S and I. The last class R need not be included in this model because we have the constraint . The parameter is called the basic reproduction number and represents the average number of individuals infected by a sick person. The epidemic can propagate when . The parameter represents the rate at which infectious people recover and move to the class R. More details about this model can be found in [21].
Discretizing this model with sampling period , we get the discrete-time model
where we have also introduced noise signals and in the dynamics. We assume here for simplicity that we can collect syndromic data providing a noisy measurement of the proportion of infected individuals, i.e.,
where is a noise signal. We can then consider the design of an observer of the form
as well as the gain matrix and observation matrix .
where we used to simplify the notation. Defining the new variable , this can be rewritten
which, using the Schur complement, is equivalent to the family of LMIs
for all in the region where we want to prove contraction. If we can find satisfying these inequalities, we recover the observer gain matrix simply as .
Note that to minimize in Proposition 1, we should try to minimize , or equivalently minimize such that the following LMI is satisfied
However, we should also minimize , which appears in the covariance matrix of the privacy-preserving noise in Corollary 3, or equivalently minimize subject to
In the end, we choose to minimize a cost function of the form , with a coefficient appropriately tuned to balance observer gain and level of privacy-preserving noise, subject to the LMI constraints (29), (30) and (31), and or perhaps for another constant if we wish to impose a hard upper bound on the noise covariance.
Let’s assume , , in (4), and , . That is, we wish to provide a -differential privacy guarantee for maximum deviations of (see the discussion in the previous section). Although this does not yield a perfectly rigorous contraction certificate, we sample the continuous set of constraints (29) by sampling the set at the values of multiple of , to obtain a finite number of LMIs. A more rigorous approach to enforce these constraints could make use of sum-of-squares programming [22]. Following the procedure above, for the choice , , we obtain the observer gain and the covariance matrix
for the Gaussian privacy-preserving noise. A typical sample trajectory of the estimate of is shown on Fig. 3.
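The structure of this example can be sketched in a few lines. We assume the continuous-time model ds/dt = −μ R0 i s, di/dt = μ R0 i s − μ i, discretized by a forward-Euler step with period τ, and an observer made of a model copy plus output injection on the measured infectious fraction. The parameter values, the gain and the noise level `sigma` are hypothetical. In particular, the gain is not the one obtained from the LMI design, and no convergence claim is made for it.

```python
import random


def sir_step(s, i, r0, mu, tau):
    """One forward-Euler step of ds/dt = -mu*r0*i*s, di/dt = mu*r0*i*s - mu*i."""
    new_infections = tau * mu * r0 * i * s
    return s - new_infections, i + new_infections - tau * mu * i


def observer_step(s_hat, i_hat, y, gain, r0, mu, tau):
    """Model copy plus output injection, with y a measurement of the infectious fraction."""
    s_next, i_next = sir_step(s_hat, i_hat, r0, mu, tau)
    innovation = y - i_hat
    return s_next + gain[0] * innovation, i_next + gain[1] * innovation


# Hypothetical parameters and gain, not the values used in the paper.
r0, mu, tau, gain = 3.0, 0.1, 1.0, (0.0, 0.5)
rng = random.Random(0)
sigma = 0.01  # hypothetical std of the Gaussian privacy-preserving noise

s, i = 0.99, 0.01
s_hat, i_hat = 0.9, 0.05
truth, estimates, released = [], [], []
for _ in range(200):
    truth.append((s, i))
    estimates.append((s_hat, i_hat))
    # Output perturbation: noise is added to the estimate, not to the data.
    released.append((s_hat + sigma * rng.gauss(0.0, 1.0),
                     i_hat + sigma * rng.gauss(0.0, 1.0)))
    y = i  # noise-free measurement, for simplicity
    s_hat, i_hat = observer_step(s_hat, i_hat, y, gain, r0, mu, tau)
    s, i = sir_step(s, i, r0, mu, tau)
```

Only the `released` sequence would be published. The noise is added after the observer and is never fed back into the recursion.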
We have discussed input and output perturbation mechanisms to design model-based nonlinear estimators with differential privacy guarantees. In general, we wish to achieve a good contraction rate with the smallest gain possible, and in fact this idea applies to both types of mechanisms. Future work includes comparing quantitatively input and output perturbation schemes, and generalizing both by combining pre- and post-processing as illustrated on Fig. 1 b) and in [9, 10] for the linear case.
[1] A. Narayanan and V. Shmatikov, “Robust de-anonymization of large sparse datasets (how to break anonymity of the Netflix Prize dataset),” in Proceedings of the IEEE Symposium on Security and Privacy, 2008.
[2] J. A. Calandrino, A. Kilzer, A. Narayanan, E. W. Felten, and V. Shmatikov, “‘You might also like’: Privacy risks of collaborative filtering,” in Proceedings of the IEEE Symposium on Security and Privacy, Berkeley, CA, May 2011.
[3] D. H. Wilson and C. Atkeson, “Simultaneous tracking and activity recognition (STAR) using many anonymous, binary sensors,” in Pervasive Computing, ser. Lecture Notes in Computer Science, H.-W. Gellersen, R. Want, and A. Schmidt, Eds. Springer Berlin Heidelberg, 2005, vol. 3468, pp. 62–79.
[4] L. Sankar, W. Trappe, K. Ramchandran, H. V. Poor, and M. Debbah, Eds., IEEE Signal Processing Magazine, Special issue on Signal Processing for Cybersecurity and Privacy, September 2013.
[5] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Proceedings of the Third Theory of Cryptography Conference, 2006, pp. 265–284.
[6] C. Dwork, M. Naor, T. Pitassi, and G. N. Rothblum, “Differential privacy under continual observations,” in Proceedings of the ACM Symposium on the Theory of Computing (STOC), Cambridge, MA, June 2010.
[7] T.-H. H. Chan, E. Shi, and D. Song, “Private and continual release of statistics,” ACM Transactions on Information and System Security, vol. 14, no. 3, pp. 26:1–26:24, November 2011.
[8] J. Le Ny and G. J. Pappas, “Differentially private Kalman filtering,” in Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing, October 2012.
[9] ——, “Differentially private filtering,” IEEE Transactions on Automatic Control, vol. 59, no. 2, pp. 341–354, February 2014.
[10] J. Le Ny and M. Mohammady, “Differentially private MIMO filtering for event streams and spatio-temporal monitoring,” in IEEE Conference on Decision and Control, Los Angeles, CA, December 2014.
[11] W. Lohmiller and J.-J. Slotine, “On contraction analysis for non-linear systems,” Automatica, vol. 34, no. 6, pp. 683–696, 1998.
[12] E. D. Sontag, “Contractive systems with inputs,” in Perspectives in Mathematical System Theory, Control, and Signal Processing, J. Willems, S. Hara, Y. Ohta, and H. Fujioka, Eds. Springer-Verlag, 2010, pp. 217–228.
[13] F. Forni and R. Sepulchre, “A differential Lyapunov framework for contraction analysis,” IEEE Transactions on Automatic Control, vol. 59, no. 3, pp. 614–628, March 2014.
[14] D. Angeli, “A Lyapunov approach to incremental stability properties,” IEEE Transactions on Automatic Control, vol. 47, no. 3, pp. 410–421, March 2002.
[15] K. S. Xu and A. O. Hero III, “Dynamic stochastic blockmodels for time-evolving social networks,” Journal of Selected Topics in Signal Processing, vol. 8, no. 4, pp. 552–562, August 2014, Special Issue on Signal Processing for Social Networks.
[16] P. W. Holland, K. B. Laskey, and S. Leinhardt, “Stochastic blockmodels: First steps,” Social Networks, vol. 5, no. 2, pp. 109–137, 1983.
[17] A. B. Lawson and K. Kleinman, Spatial and Syndromic Surveillance for Public Health. Wiley, 2005.
[18] J. Ginsberg, M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski, and L. Brilliant, “Detecting influenza epidemics using search engine query data,” Nature, vol. 457, pp. 1012–1014, 2009.
[19] A. Skvortsov and B. Ristic, “Monitoring and prediction of an epidemic outbreak using syndromic observations,” Mathematical Biosciences, vol. 240, pp. 12–19, 2012.
[20] W. O. Kermack and A. G. McKendrick, “A contribution to the mathematical theory of epidemics,” Proceedings of the Royal Society of London Series A, vol. 115, pp. 700–721, 1927.
[21] F. Brauer, P. van den Driessche, and J. Wu, Eds., Mathematical Epidemiology, ser. Lecture Notes in Mathematics. Berlin: Springer-Verlag, 2008, vol. 1945.
[22] E. M. Aylward, P. A. Parrilo, and J.-J. E. Slotine, “Stability and robustness analysis of nonlinear systems via contraction metrics and SOS programming,” Automatica, vol. 44, no. 8, pp. 2163–2170, August 2008.