Optimal Obfuscation Mechanisms via Machine Learning


We consider the problem of obfuscating sensitive information while preserving utility, and we propose a machine-learning approach inspired by the generative adversarial networks paradigm. The idea is to set up two nets: the generator, which tries to produce an optimal obfuscation mechanism to protect the data, and the classifier, which tries to de-obfuscate the data. By letting the two nets compete against each other, the mechanism improves its degree of protection, until an equilibrium is reached. We apply our method to the case of location privacy, and we perform experiments on synthetic data and on real data from the Gowalla dataset. We evaluate the privacy of the mechanism not only by its capacity to defeat the classifier, but also in terms of the Bayes error, which represents the strongest possible adversary. We compare the privacy-utility tradeoff of our method with that of the planar Laplace mechanism used in geo-indistinguishability, showing favorable results. Like the Laplace mechanism, our system can be deployed at the user end to protect the user's location.

I Introduction

Data analytics are crucial for modern companies and, consequently, there is an enormous interest in collecting and processing all sorts of personal information. Individuals, on the other hand, are often willing to provide their data in exchange for improved services and experiences. However, there is the risk that such disclosure of personal information could be used against them. The rise of machine learning, with its capability of performing powerful analytics on massive amounts of data, has further exacerbated the risks. Several researchers have pointed out possible threats such as model inversion attacks [1] and membership inference attacks [2, 3, 4, 5].

Nonetheless, if machine learning can be a threat, it can also be a powerful means to build good privacy protection mechanisms, as we will demonstrate in this paper. We focus on mechanisms that obfuscate data by adding controlled noise. Usually the quality of service (QoS) that the user receives in exchange for his obfuscated data degrades with the amount of obfuscation, hence the challenge is to find a good trade-off between privacy and utility. Following the approach of [6], we aim at maximizing the privacy protection while preserving the desired QoS. We consider the case of location privacy, and in particular the re-identification of the user from his location, but the framework that we develop is general and can be applied to any situation in which an attacker might infer sensitive information from accessible correlated data.

Utility is typically expressed as a bound on the expected distance between the real location and the obfuscated one [6, 9, 7, 10], capturing the fact that location based services usually offer a better QoS when they receive a more accurate location. If privacy is also expressed as a linear function, then the optimal trade-off can in principle be achieved with linear programming [6, 7, 11, 12]. The limitation of this approach, however, is that it does not scale to large datasets. The problem is that the linear program needs one variable for every pair of real and obfuscated locations: each variable represents the probability of producing a given obfuscated location when the real location is a given one. Already for a medium-sized grid this amounts to more than six million variables, which is at the limit of what modern solvers can do. For a larger grid, the program has 4.5 billion variables, making it completely intractable (we could not even launch such a program due to the huge memory requirements). Furthermore, the background knowledge and the correlation between data points affect privacy and are usually difficult to determine and express formally.
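As a quick sanity check on this blow-up: the linear program needs one variable per pair of real and obfuscated cells, i.e. (n²)² variables for an n × n grid. A minimal sketch (the grid sizes below are illustrative choices, not taken from the paper):

```python
# Number of LP variables for an n-by-n grid: one variable for each
# (real cell, obfuscated cell) pair, i.e. (n^2)^2 overall.
def lp_variables(n: int) -> int:
    cells = n * n
    return cells * cells

# Illustrative grid sizes showing the quartic growth.
for n in (10, 50, 260):
    print(n, lp_variables(n))
```

A 50 × 50 grid already yields over six million variables, and around 260 × 260 the count passes 4.5 billion, which matches the orders of magnitude discussed above.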

Our position is that machine learning can help to solve this problem. Inspired by the GANs paradigm [13], we propose a system consisting of two adversarial neural networks: the generator G and the classifier C. The idea is that G generates noise so as to confuse the adversary as much as possible, within the boundaries of the utility constraints, while C takes in input the noisy locations produced by G and tries to re-identify (classify) the corresponding user. While fighting against C, G refines its strategy, until a point where it cannot improve any longer. Note that a significant difference from the standard GANs is that, in the latter, the generator has to learn to reproduce an existing distribution from samples. In our case, instead, the generator has to “invent” a distribution from scratch.

The interplay between G and C can be seen as an instance of a zero-sum Stackelberg game [6], where G is the leader, C is the follower, and the payoff function is the privacy loss. Finding the optimal point of equilibrium between G and C corresponds to solving a minimax problem on the payoff function, with G being the minimizer and C the maximizer.

A major challenge in our setting is the choice of the payoff function. A first idea would be to measure it in terms of C's capability to re-associate a location with the right user. Hence we could define the payoff as the expected success probability of C's classification. Such a function would be convex/concave with respect to the strategies of G and C respectively, so from game theory we would derive the existence of a saddle point corresponding to the optimal obfuscation-re-identification pair. The problem, however, is that it is difficult to reach the saddle point via the typical alternation between the two nets. Let us clarify this point with a simple example:

Fig. 1: Payoff tables of the games in Example 1, for various payoff functions: (a) the expected success probability of the classification; (b) the mutual information and the Bayes error.
Example 1.

Consider two users, Alice and Bob, in two distinct locations. Assume that at first G reports their true locations (no noise). Then C learns which location corresponds to Alice and which to Bob. At the next round, G will figure that, to maximize the misclassification error (given the prediction of C), it should swap the locations, i.e., report Bob's location for Alice and Alice's location for Bob. Then, on its turn, C will have to “unlearn” the previous classification and learn the new one. But then, at the next round, G will again swap the locations, bringing the situation back to the starting point, and so on, without ever reaching an equilibrium. Note that a possible equilibrium point for G would be the mixed strategy that reports each location with equal probability for both users (so that C could only make a blind guess), but G may not stop there. The problem is that it is difficult to calibrate the training of G so that it stops in proximity of the saddle point rather than continuing all the way to its relative optimum. The situation is illustrated in Fig. 1.
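The cycling described in this example can be replayed in a few lines. This is a toy sketch of best-response dynamics restricted to pure strategies (the 0/1 encoding of strategies is our own, not the paper's):

```python
# Toy replay of Example 1: two users, two locations, pure strategies
# only. G picks "report truthfully" (ID) or "swap" (SWAP); C picks a
# labeling of the reported locations. Each player best-responds to the
# other's last move, and the pair cycles forever.
ID, SWAP = 0, 1

def c_best_response(g):
    # C labels each reported location with the user that produced it;
    # with two users and two locations the optimal labeling mirrors g.
    return g

def g_best_response(c):
    # G maximizes misclassification against the current labeling c,
    # which here means swapping whenever C has caught up.
    return 1 - c

g, c = ID, None
history = []
for _ in range(6):
    c = c_best_response(g)   # C "unlearns" and relearns
    g = g_best_response(c)   # G swaps again
    history.append((g, c))
print(history)  # the pair oscillates: no pure-strategy equilibrium
```

The printed history alternates between two states indefinitely, which is exactly the non-convergent behavior that motivates the change of payoff function below.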

In order to address this issue we adopt a different target function, less sensitive to the particular labeling strategy of C. The idea is to consider not just the precision of the classification but, rather, the information contained in it. There are two main ways of formalizing this intuition: the mutual information I(X; W) and the Bayes error B(X | W), where X and W are respectively the random variables associated to the true ids and to the ids resulting from the classification (predicted ids). We recall that I(X; W) = H(X) − H(X | W), where H(X) is the entropy of X and H(X | W) is the residual entropy of X given W, while B(X | W) is the probability of error when we select the value of X with maximum a posteriori probability, given W. Mutual information and Bayes error are related by the Santhi-Vardy bound [15]:

B(X | W) ≤ 1 − 2^(−H(X | W))
If we set the payoff to be I(X; W) or B(X | W), we obtain the payoff tables illustrated in Fig. 1. Note that the minimum in the first and last columns now corresponds to a point of equilibrium for any choice of C. This is not always the case, but in general the minimum is closer to the equilibrium and makes the training of G more stable: training G for a longer time does not risk increasing the distance from the equilibrium point.
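For concreteness, these quantities can be computed directly on a small joint distribution; the sketch below (the helper functions and the example distribution are our own, not the paper's implementation) also checks the Santhi-Vardy bound numerically:

```python
import math

# p is a joint distribution over (x, w), stored as {(x, w): prob}.
def marginal(p, axis):
    m = {}
    for (x, w), v in p.items():
        k = x if axis == 0 else w
        m[k] = m.get(k, 0.0) + v
    return m

def mutual_information(p):   # I(X;W), in bits
    px, pw = marginal(p, 0), marginal(p, 1)
    return sum(v * math.log2(v / (px[x] * pw[w]))
               for (x, w), v in p.items() if v > 0)

def residual_entropy(p):     # H(X|W) = -sum p(x,w) log p(x|w)
    pw = marginal(p, 1)
    return -sum(v * math.log2(v / pw[w]) for (x, w), v in p.items() if v > 0)

def bayes_error(p):          # B(X|W) = 1 - sum_w max_x p(x, w)
    best = {}
    for (x, w), v in p.items():
        best[w] = max(best.get(w, 0.0), v)
    return 1.0 - sum(best.values())

# Example: a noisy channel between two ids (illustrative numbers).
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
b, h = bayes_error(p), residual_entropy(p)
assert b <= 1 - 2 ** (-h)    # Santhi-Vardy bound holds
print(round(mutual_information(p), 4), round(b, 4))
```

Here the Bayes error is 0.2 while the bound 1 − 2^(−H(X|W)) evaluates to roughly 0.39, so the inequality is strict in this example.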

In this paper we use the mutual information to generate the noise, but we evaluate the level of privacy also in terms of the Bayes error, which represents the probability of error of the strongest possible adversary. Both notions have been used in the literature as privacy measures; for instance, mutual information has been applied to quantify anonymity [16, 17]. The Bayes error has been considered in [17, 18, 19, 20], and indirectly, as min-entropy leakage, in [21]. Oya et al. advocate in [12] that, to guarantee a good level of location privacy, a mechanism should measure well in terms of both the Bayes error and the residual entropy (which is strictly related to mutual information). Fig. 2 anticipates some of the experimental results of Sections IV and V. We note that the performance of our mechanism is much better than that of the planar Laplace, and comparable to that of the optimal solution in all three cases in which we can determine the latter. Of course, this comparison is not completely fair, because the planar Laplace was designed to satisfy a different notion of privacy, called geo-indistinguishability [9] (see next paragraph). Our mechanism, on the contrary, does not satisfy this notion.

[Fig. 2 consists of four tables — synthetic data at low and high utility, and Gowalla data at low and high utility — each comparing the Bayes error of the Laplace mechanism, our mechanism, and the optimal one.]

Fig. 2: Bayes error on synthetic and Gowalla data, for the Laplace mechanism, our mechanism, and the optimal one, on a grid of cells. In the last table the Bayes error of the optimal mechanism is unknown: the linear program contains 4.5 billion variables, making it intractable in practice.

Other popular privacy metrics are differential privacy (DP) [22], local differential privacy (LDP) [23], and the metric-based extension of differential privacy [24], of which geo-indistinguishability is an instance. The main difference between these and the notions used in this paper is that they are worst-case measures, while ours are average ones. In other words, ours refer to the expected level of privacy over all sensitive data, while the others are concerned with the protection of each individual datum. Clearly, the latter is stronger, as proved in [25] and [26], although [27] has proved that a conditional version of mutual information corresponds to a relaxed form of differential privacy. We regard individual protection as an important issue, and we plan to investigate the possibility of generating worst-case mechanisms via ML in future work. This paper is a preliminary exploration of the applicability of ML to privacy, and as a starting point we focus on the average notions that have been considered in location privacy [6, 11, 12].

From a practical point of view, our method belongs to the local privacy category, like LDP and geo-indistinguishability, in the sense that it can be deployed at the user's end, with no need for a trusted third party. Once the training is done, the system can be used as a personal device that, each time the user needs to report his location to an LBS, generates a sanitized version of it by adding noise to the real location.

I-A Contribution

The contributions of the paper are the following:

  • We propose an approach based on adversarial nets to generate obfuscation mechanisms with a good privacy-utility tradeoff. The advantage of our method is twofold:

    • wrt linear programming methods, we can work on a continuous domain instead of a small grid;

    • wrt analytic methods (such as the Planar Laplace mechanism) our approach is data-driven, taking into account prior knowledge about the users.

  • Although our approach is inspired by the GANs paradigm, it departs significantly from it: In our case, the distribution has to be “invented” rather than “imitated”. Hence we need different techniques for evaluating a distribution. To achieve our goal, we propose a new method based on the mutual information between the supervised and the predicted class labels.

  • We show that the use of mutual information (instead of the cross entropy) for the generator is crucial for convergence. For the classifier, on the other hand, it is possible to use the cross entropy, and it is more efficient.

  • We evaluate the obfuscation mechanism produced by our method on real location data from the Gowalla dataset.

  • We compare our mechanism with the planar Laplace [9] and with the optimal mechanism, whenever it is possible to compute or determine the latter theoretically. We show that the performance of our mechanism is much better than that of the Laplace, and not so far from the optimal.

  • We have made publicly available the implementation and the experiments at https://gitlab.com/MIPAN/mipan.

I-B Related work

Optimal mechanisms, namely mechanisms providing an optimal compromise between utility and privacy, have attracted the interest of many researchers. Many of the studies so far have focused on optimization methods based on linear programming [6, 7, 11, 12]. Although they can provide exact solutions, the huge size of the corresponding linear programs limits the scalability of these methods. Our approach, in contrast, building on the efficient optimization process of neural networks (gradient descent), does not suffer from this drawback. All our experiments were done on grid sizes for which linear programming is completely intractable.

Adversarial networks to construct privacy-protection mechanisms have also been proposed by [14, 28, 29], with applications on image data (the MNIST and the GENKI datasets). The authors of [28, 29] have also developed a theoretical framework similar to ours. From the methodological point of view, the main difference is that in the implementation they use as target function the cross entropy rather than the mutual information. Hence in our setting the convergence of their method may be problematic, due to the “swapping effect” described in Example 1. We have actually experimented with the use of cross entropy as target function on our examples in Section IV, and we could not achieve convergence: the intermediate mechanisms were unstable and the level of privacy was poor. Another related paper is [30], which uses an adversarial network to produce mechanisms against attribute inference attacks. The target function is the Kullback-Leibler divergence, which, in this particular context where the distribution of the secrets is fixed, reduces to cross entropy. Hence in our setting we would get the same swapping effect explained above.

Other works that have proposed the use of minimax learning to preserve privacy are [31, 32, 33, 34]. The author of [31] introduces the notion of minimax filter as a solution to the optimization problem between privacy as expected risk and utility as distortion, and proposes various learning-based methods to approximate such a solution. The authors of [32] consider multi-party machine learning, and use adversarial training to mitigate privacy-related attacks such as party membership inference of individual records. The authors of [33] propose the minimax technique to remove private information from personal images. Their approach is to use a stochastic gradient alternate min-max optimizer, but since they express the objective in terms of cross entropy, they may incur the same problem described above, i.e., they cannot guarantee convergence. The authors of [34] consider personal images, and in particular the problem of preventing their re-identification while preserving their utility, such as the discernibility of the actions in the images. They use the angular softmax loss as objective function and do not analyze the problem of convergence, but their experimental results are impressive.

Another related line of work is the generation of synthetic data via machine learning. An example is [35], where the authors use an adversarial network to generate artificial medical records that closely resemble participants of the Systolic Blood Pressure Trial dataset. In this case, the paradigm they use is the same as the original GAN: the discriminator takes in input both the records produced by the generator and samples from the original dataset, and tries to distinguish them. The original dataset is also obfuscated with differential privacy techniques to prevent membership attacks.

One of the side contributions of our paper is a method to compute mutual information in neural networks (cf. Section III). Recently, Belghazi et al. have proposed MINE, an efficient method for the neural estimation of mutual information [36], inspired by the framework of [37] for the estimation of a general class of functions representable as f-divergences. These methods work also in the continuous case and for high-dimensional data. In our case, however, we are dealing with a discrete domain, and we can compute the mutual information directly and exactly. Another reason for developing our own method is that we need to deal with a loss function that contains not only the mutual information, but also a component representing utility, and depending on the notion of utility the result may not be an f-divergence.

Our paradigm has been inspired by the GANs [13], but it comes with some fundamental differences:

  • C is a classifier performing re-identification, while in GANs there is a discriminator trained to distinguish the real data distribution from the generated one;

  • in the GANs paradigm the generator network tries to reproduce the original data distribution to fool the discriminator. A major difference is that, in our adversarial scenario, G does not have a model distribution to refer to. The final data distribution depends only on the evolution of the two networks over time, and it is driven by the constraints imposed by the loss functions that rule the learning process;

  • we still adopt a training algorithm that alternates the training of G and of C, but, as we will show in Section III, it is different from the one adopted for GANs.

II Our setting

Symbol — Description
C — Classifier network (attacker).
G — Generator network.
X, 𝒳 — Sensitive information (random variable and domain).
Y, 𝒴 — Useful information with respect to the intended notion of utility.
Z, 𝒵 — Obfuscated information accessible to the service provider and to the attacker.
W, 𝒲 — Information inferred by the attacker.
P(x, y) — Joint probability of two random variables.
P(z | y) — Conditional probability.
P_{Z|Y} — Obfuscation mechanism.
B(· | ·) — Bayes error.
L — Utility loss induced by the obfuscation mechanism.
L* — Threshold on the utility loss.
H(·) — Entropy of a random variable.
H(· | ·) — Conditional entropy.
I(· ; ·) — Mutual information between two random variables.
TABLE I: Table of symbols

We formulate the privacy-utility optimization problem using a framework similar to that of [38]. We consider four random variables, X, Y, Z and W, ranging over the sets 𝒳, 𝒴, 𝒵 and 𝒲 respectively, with the following meaning:

  • X: the sensitive information that the user wishes to conceal,

  • Y: the useful information with respect to some service provider and the intended notion of utility,

  • Z: the information made visible to the service provider, which may be intercepted by some attacker, and

  • W: the information inferred by the attacker.

We assume a fixed joint distribution (data model) P(x, y) over the users' data. We present our framework assuming that the variables are discrete, but all results and definitions can be transferred to the continuous case, by replacing the distributions with probability density functions and the summations with integrals. For the initial definitions and results of this section 𝒳 and 𝒲 may be different sets. Starting from Section III we will assume that 𝒳 = 𝒲.

An obfuscation mechanism can be represented as a conditional probability distribution P_{Z|Y}, where P(z | y) indicates the probability that the mechanism transforms the data point y into the noisy data point z. We assume that the values of Z are the only attributes visible to the attacker and to the service provider. The goal of the defender is to optimize the data release mechanism so as to achieve a desired level of utility while minimizing the leakage of the sensitive attribute X. The goal of the attacker is to retrieve X from Z as precisely as possible. In doing so, it produces a classification W (prediction).

Note that the four random variables form a Markov chain:

X − Y − Z − W

Their joint distribution is completely determined by the data model, the obfuscation mechanism and the classification:

P(x, y, z, w) = P(x, y) P(z | y) P(w | z)

From the joint distribution we can derive the marginals, the conditional probabilities of any two variables, etc. For instance:

P(x, z) = Σ_{y,w} P(x, y, z, w),   P(x | z) = P(x, z) / P(z)

The latter distribution, P_{X|Z}, is the posterior distribution of X given Z, and plays an important role in the following sections.
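The factorization of the joint distribution can be sketched directly; the toy data model, mechanism and classifier below are our own illustrative choices:

```python
# Build the full joint from the three factors of the Markov chain
# X - Y - Z - W: a data model p(x, y), a mechanism p(z | y), and a
# classifier p(w | z). All tables are plain nested dicts.
def joint(pxy, pz_y, pw_z):
    p = {}
    for (x, y), pv in pxy.items():
        for z, qz in pz_y[y].items():
            for w, qw in pw_z[z].items():
                p[(x, y, z, w)] = pv * qz * qw
    return p

def posterior_x_given_z(p):
    # Marginalize out y and w, then normalize by p(z).
    pxz, pz = {}, {}
    for (x, y, z, w), v in p.items():
        pxz[(x, z)] = pxz.get((x, z), 0.0) + v
        pz[z] = pz.get(z, 0.0) + v
    return {(x, z): v / pz[z] for (x, z), v in pxz.items()}

pxy = {(0, 'a'): 0.5, (1, 'b'): 0.5}                     # ids and true locations
pz_y = {'a': {'a': 0.8, 'b': 0.2}, 'b': {'a': 0.2, 'b': 0.8}}  # noise
pw_z = {'a': {0: 0.8, 1: 0.2}, 'b': {0: 0.2, 1: 0.8}}          # classifier
p = joint(pxy, pz_y, pw_z)
assert abs(sum(p.values()) - 1.0) < 1e-9                 # valid distribution
print(posterior_x_given_z(p)[(0, 'a')])                  # posterior of id 0 at 'a'
```

With these numbers the posterior P(X = 0 | Z = 'a') works out to 0.8, matching a direct application of the formula above.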

II-A Quantifying utility

Concerning utility, we consider a loss function ℓ on 𝒴 × 𝒵, where ℓ(y, z) represents the utility loss caused by reporting z when the true value is y.

Definition 1 (Utility loss).

The utility loss from the original data Y to the noisy data Z, given the loss function ℓ, is defined as the expectation of ℓ:

L(Y, Z) = E[ℓ(Y, Z)] = Σ_{y,z} P(y, z) ℓ(y, z)

We will omit ℓ when it is clear from the context. Note that, given a data model, the utility loss can be expressed in terms of the mechanism P_{Z|Y}:

L(Y, Z) = Σ_y P(y) Σ_z P(z | y) ℓ(y, z)
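The expectation above can be sketched in a few lines, with an illustrative loss function of our own choosing (absolute distance on a line of cells):

```python
# Expected utility loss of a mechanism p(z | y) under a data model p(y):
#   L = sum_y p(y) * sum_z p(z|y) * loss(y, z)
def utility_loss(py, pz_y, loss):
    return sum(pv * q * loss(y, z)
               for y, pv in py.items()
               for z, q in pz_y[y].items())

# Toy example: two cells on a line, absolute distance as the loss.
py = {0: 0.5, 1: 0.5}
pz_y = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.3, 1: 0.7}}
L = utility_loss(py, pz_y, lambda y, z: abs(y - z))
print(L)  # the mechanism moves the point 30% of the time, so loss ~ 0.3
```

A mechanism belongs to the feasible set below exactly when this quantity stays under the chosen threshold.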
Our goal is to build a privacy-protection mechanism that keeps the loss below a certain threshold L*. We denote by 𝓜_L the set of such mechanisms, namely:

𝓜_L = { P_{Z|Y} : L(Y, Z) ≤ L* }

The following property is immediate:

Proposition (Convexity of 𝓜_L). The set 𝓜_L is convex and closed.

II-B Quantifying privacy as mutual information

We recall the basic information-theoretic definitions that will be used in the paper.

Entropy of X:

H(X) = − Σ_x P(x) log P(x)

Residual entropy of X given W:

H(X | W) = − Σ_{x,w} P(x, w) log P(x | w)

Mutual information between X and W:

I(X; W) = H(X) − H(X | W)

Cross entropy between the posterior P_{X|Z} and the prediction P_{W|Z}:

CE = − Σ_z P(z) Σ_x P(x | z) log P_{W|Z}(x | z)
We recall that the more correlated X and W are, the larger I(X; W) is, and vice versa. The minimum, I(X; W) = 0, is reached when X and W are independent; the maximum is reached when the value of X uniquely determines the value of W and vice versa. In contrast, the cross entropy, which represents the precision loss in the classification prediction, is not related to the correlation between X and W, but rather to the similarity between the posterior P_{X|Z} and the prediction P_{W|Z}: the more similar they are, the smaller it is. In particular, the minimum is reached when they coincide.

The privacy leakage of a mechanism with respect to an attacker C, characterized by the prediction P_{W|Z}, will be quantified by the mutual information I(X; W). This notion of privacy will be used as the objective function, rather than the more typical cross entropy. As explained in the introduction, this choice makes the training of G more stable because, in order to reduce I(X; W), G cannot simply swap around the labels of the classification learned by C: it must reduce the correlation between X and W (via suitable modifications of the mechanism), and in doing so it limits the amount of information that any adversary can infer about X from Z. We will come back to this point in more detail in subsection III-A.
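The key property, that relabeling fools an agreement-based objective while leaving the mutual information untouched, can be checked numerically. A small sketch (the distributions are our own illustrative choices; accuracy stands in here for an agreement-based objective such as cross entropy):

```python
import math

# Swapping the predicted labels collapses an agreement-based score but
# leaves I(X; W) unchanged: MI measures correlation, not agreement.
def mi(p):  # joint dict {(x, w): prob} -> I(X;W) in bits
    px, pw = {}, {}
    for (x, w), v in p.items():
        px[x] = px.get(x, 0.0) + v
        pw[w] = pw.get(w, 0.0) + v
    return sum(v * math.log2(v / (px[x] * pw[w]))
               for (x, w), v in p.items() if v > 0)

def accuracy(p):
    return sum(v for (x, w), v in p.items() if x == w)

p = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
swapped = {(x, 1 - w): v for (x, w), v in p.items()}  # relabeled prediction

print(accuracy(p), accuracy(swapped))  # 0.9 vs 0.1: agreement collapses
print(mi(p), mi(swapped))              # identical: nothing was hidden
```

This is precisely why a generator trained against mutual information cannot "win" by swapping labels, while one trained against cross entropy can.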

II-C Formulation of the game

The game that G and C play corresponds to the following minimax formulation:

min_{P_{Z|Y} ∈ 𝓜_L}  max_{P_{W|Z}}  I(X; W)

where the minimization by G is over the mechanisms P_{Z|Y} ranging in 𝓜_L, while the maximization by C is over the classifications P_{W|Z}.

Note that P_{Z|Y} can be seen as a stochastic matrix and therefore as an element of a vector space. An important property for our purposes is that the mutual information is convex with respect to P_{Z|Y}:

Proposition (Convexity of I). Given the data model P(x, y) and a classification P(w | z), the mutual information I(X; W), seen as a function of P_{Z|Y}, is convex.

The two propositions above show that this problem is well defined: for any choice of P_{W|Z}, I(X; W) has a global minimum in 𝓜_L, and no strictly-local minima.

On the use of the classifier

We note that, in principle, one could avoid the GAN paradigm, and try to achieve the optimal mechanism by solving, instead, the following minimization problem:

min_{P_{Z|Y} ∈ 𝓜_L}  I(X; Z)

where the minimization is meant, as before, over the mechanisms P_{Z|Y} ranging in 𝓜_L. This approach would have the advantage of being independent from the attacker, so one would need to reason only about the mechanism (and there would be no need for a GAN).

The main difference between I(X; W) and I(X; Z) is that the latter represents the information about X available to any adversary, not only to those trying to retrieve X by building a classifier. This fact is reflected in the following relation between the two formulations:

I(X; W) ≤ I(X; Z)

Note that, since I(X; Z) is an upper bound of our target, minimizing it imposes a limit on I(X; W).

On the other hand, there are some advantages in considering I(X; W) rather than I(X; Z): first of all, Z may have a much larger and more complicated domain than W, so performing the gradient descent on I(X; Z) could be infeasible. Second, if we are interested in considering only classification-based attacks, then I(X; W) should give a better result than I(X; Z). In this paper we focus on the former, and leave the exploration of an approach based on I(X; Z) as future work.

II-D Measuring privacy as Bayes error

As explained in the introduction, we intend to evaluate the resulting mechanism also in terms of Bayes error. Here we give the relevant definitions and properties.

Definition 2 (Bayes error).

The Bayes error of X given W is:

B(X | W) = 1 − Σ_w max_x P(x, w)

Namely, the Bayes error is the expected probability of guessing the wrong id, for an adversary that, when it sees that C produces the id w, guesses the id x that has the highest posterior probability given w.

The definition of B(X | Z) is analogous. Given a mechanism P_{Z|Y}, we regard B(X | W) as a measure of the privacy of the mechanism w.r.t. one-try [21] classification-based attacks, whereas B(X | Z) is w.r.t. any one-try attack. The following relation holds between the two notions:

B(X | Z) ≤ B(X | W)
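The relation between the two errors can be illustrated numerically: any classifier is a post-processing of Z, so the one-try attack on W can only do worse than the one on Z. A sketch with illustrative numbers of our own:

```python
# Post-processing Z into W can only make the one-try attack harder,
# so B(X|Z) <= B(X|W) for any classifier p(w | z).
def bayes_error(p):  # joint dict {(x, o): prob} over X and an observable
    best = {}
    for (x, o), v in p.items():
        best[o] = max(best.get(o, 0.0), v)
    return 1.0 - sum(best.values())

# A joint p(x, z), then a (suboptimal) classifier that blurs it.
pxz = {(0, 'a'): 0.35, (0, 'b'): 0.15, (1, 'a'): 0.10, (1, 'b'): 0.40}
pw_z = {'a': {0: 0.6, 1: 0.4}, 'b': {0: 0.4, 1: 0.6}}

pxw = {}
for (x, z), v in pxz.items():
    for w, q in pw_z[z].items():
        pxw[(x, w)] = pxw.get((x, w), 0.0) + v * q

print(bayes_error(pxz), bayes_error(pxw))  # guessing from W is harder
```

With these numbers B(X | Z) = 0.25 while B(X | W) = 0.45, consistent with the inequality.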
III Implementation in Neural Networks

In this section we describe the implementation of our adversarial game between G and C in terms of the alternate training of neural networks. The scheme of our game is illustrated in Fig. 3, where:


Fig. 3: Scheme of the adversarial nets for our setting.
  • x, y, z and w are instances of the random variables X, Y, Z and W respectively, whose meaning is described in the previous section. We assume that the domains of X and W coincide.

  • s (seed) is a randomly-generated number.

  • G(y, s) is the function learnt by the generator G, and it represents an obfuscation mechanism P(z | y). The input s provides the randomness needed to generate random noise. It is necessary because a neural network is in itself deterministic.

  • C(z) is the classification learnt by the classifier C, corresponding to P(w | z).

Data: (X, Y) // Training data
Models: G_i: generator evolution at the i-th step;
C_i: classifier evolution at the i-th step.
train(N, D): trains the network N on the data D.
G(Y): outputs a noisy version of Y.
C_0 = base classifier model
G_0 = base generator model
i = 0
while True do
       i += 1
       C_i = train(C_0, (X, G_{i−1}(Y)))  // Train class. from scratch
       GC_i = train(G_{i−1} and C_i in cascade, (X, Y))
       G_i = generator layer in GC_i
end while
Algorithm 1: Adversarial algorithm with classifier reset.

The evolution of the adversarial network is described in Algorithm 1. C_i and G_i are trained at two different moments within the same adversarial training iteration. In particular, C_i is obtained by training the base classifier against the noise generated by G_{i−1}, and G_i is obtained by fighting against C_i.

Note that in our method each C_i is trained on the output of G_{i−1}. This is a main difference with respect to the GANs paradigm, where the discriminator is trained both on the output of the generator and on samples from the target distribution generated by an external source. Another particularity of our method is that at the end of the i-th iteration, while G_i is retained for the next iteration, C_i is discarded and the classifier for iteration i+1 is reinitialized to the base one, C_0. The reason is that restarting from C_0 is more efficient than starting from the last trained classifier C_i. This is because G_i may have changed, at step i, the noise mechanism and therefore the association between Z and X expressed by the posterior P_{X|Z}. The predictions that C_i had produced during its training (trying to match the posterior previously induced by G_{i−1} as closely as possible) are not only no longer optimal: for some z's they may have become completely wrong, and starting from a wrong prediction is a drawback that slows down the learning of the new prediction. There may be several z's for which the old prediction is a good approximation of the new one to be learned, but according to our experiments the net effect is negative: the training of the new classifier is usually faster if we restart from scratch. It is worth noting that this is only a matter of efficiency, though: eventually, even if we started from C_i, the new classifier would “unlearn” the old, wrong predictions and learn the correct new ones.
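The alternation with classifier reset can be mimicked on a toy example in which exact computations stand in for network training; the binary-channel setup below is our own illustrative choice, not the paper's experiment:

```python
import math

# Toy instance of the alternation in Algorithm 1. The mechanism is a
# binary symmetric channel with flip probability q (capped by a utility
# budget Q_MAX); "training C from scratch" is replaced by computing the
# exact posterior flip rate, and G's step is a grid search against it.
Q_MAX = 0.4                     # utility constraint: expected loss <= 0.4
GRID = [i / 100 for i in range(0, 41)]

def h2(e):                      # binary entropy, in bits
    if e in (0.0, 1.0):
        return 0.0
    return -e * math.log2(e) - (1 - e) * math.log2(1 - e)

def mi_bsc(e):                  # I(X;W) for a symmetric channel, uniform X
    return 1.0 - h2(e)

q = 0.0                         # start from the identity mechanism
for _ in range(10):
    c = q                       # classifier reset: exact posterior flip rate
    # G's move: pick q' minimizing I(X;W) through the fixed classifier c.
    q = min(GRID, key=lambda qp: mi_bsc(qp * (1 - c) + (1 - qp) * c))

print(q)  # settles at the utility budget: maximal allowed noise
```

In this symmetric toy case the alternation is stable and the generator settles at the boundary of the utility constraint, which is the intuitively optimal amount of noise.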

At the end of each training iteration we evaluate the quality of the produced noise by checking the performance of the classifier network. In particular, we make sure that the noise produced by the generator affects the training, validation and test data in a similar way. In fact, if the performance were good on the training data but not on the other data, this would be a symptom of overfitting rather than an indicator of the quality of the injected noise.

We describe now in more detail some key implementation choices of our proposal.

III-A Mutual information vs cross entropy

Based on the formulation of our game (13), the alternate training of both G and C is performed using the mutual information I(X; W) as the loss function. The goal of G is to minimize it by refining the mechanism P_{Z|Y}, while C aims at maximizing it by refining the classifier P_{W|Z}.

We remark that the use of mutual information as a loss function is not standard. A more typical function for training a classifier is the cross entropy between the posterior and the prediction, which is more efficient to implement. The cross entropy is minimized when the prediction P_{W|Z} and the posterior P_{X|Z} coincide. Such an outcome would correspond to the perfect classifier, which predicts exactly the probability that a given sample z belongs to the class x. One could then think of reformulating the game in terms of the cross entropy, where C would be the minimizer (trying to infer probabilistic information about the secret x from a given observation z) and G the maximizer (trying to prevent the adversary from achieving this knowledge). However, as already observed in Example 1 in the introduction, training via the cross entropy does not allow the game to reach an equilibrium, because it takes into account only one adversarial strategy (i.e., one particular classification). Indeed, a maximum can be achieved by a G that simply swaps the associations between the labels x and the corresponding noisy locations z. This would change the posterior and therefore fool the present classifier (because the prediction would no longer be equal to the posterior), but at the next round, when C is trained on the new data, it will learn the new classification and obtain, again, the maximum information about X that can be inferred from Z. The possibility of ending up in such cyclic behavior is experimentally demonstrated in Section IV-A1. Note that this problem does not arise with mutual information, because swapping the labels does not affect I(X; Z) at all.

Since G can only change the mechanism P_{Z|Y}, the only way for G to reduce the mutual information is to reduce I(X; Z), by reducing the correlation between X and Z (W is correlated to X only via Z). This limits the information about X that can be inferred from Z, for any possible adversary, i.e., for any possible prediction P_{W|Z}, hence also for the optimal one. Still, if the domain of Z is very large, I(X; Z) cannot be reduced directly in an efficient way, and this is the reason why G needs the feedback of the optimal prediction P_{W|Z}: in contrast to I(X; Z), minimizing I(X; W) can be done effectively in neural networks via the gradient descent when the domain of X and W is “reasonably small”.

The above discussion about mutual information vs cross entropy holds for the generator G, but what about the adversary C? Namely, for a given mechanism, is it still necessary to train C to maximize I(X; W), or could we equivalently train it to minimize the cross entropy? The following result answers this question positively.

Proposition. The maximum of I(X; W) over the predictions P_{W|Z} is achieved by the prediction that coincides with the posterior P_{X|Z}, i.e., by the prediction that also minimizes the cross entropy.

Given the above result, and since minimizing the cross entropy is more efficient than maximizing I(X; W), in our implementation we have used the former for the training of C. Of course, we cannot do the same for G: as discussed above, the generator needs to be trained using the mutual information.

A consequence of the proposition in subsection III-A is that the adversary represented by C at the point of equilibrium is at least as strong as the Bayesian adversary, namely the adversary that minimizes the expected probability of error in the one-try attack (which consists in guessing a single secret x given a single observable z [21]). Indeed, from the prediction P_{W|Z} one can derive the following decision function (deterministic classifier), which assigns to any z the class with the highest predicted probability:

c(z) = argmax_x P_{W|Z}(x | z)     (15)
To state formally the optimality of h w.r.t. one-try attacks, let us recall the definition of the expected error for a generic decision function f:

E(f) = P(f(Z) ≠ S) = Σ_z P(z) (1 − P(S = f(z) | z)).
We can now state the following result, which relates the error of the attacker h (derived from the classifier at the equilibrium point) to the minimum Bayes error of any adversary against the mechanism at the equilibrium point (cf. (2) and Section II-D):

Proposition. If the classifier is trained to the equilibrium and h is defined as in (15), then the expected error E(h) coincides with the Bayes error of the mechanism.
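The relation between the argmax decision rule and the Bayes error can be illustrated on a small discrete channel. The sketch below is ours: the prior and the channel values in the usage are made up for illustration, and the function name is a placeholder.

```python
def bayes_error(prior, channel):
    """Minimum expected error of any adversary guessing the secret s
    from the observable z: the MAP (argmax) rule succeeds with
    probability sum_z max_s P(s) P(z | s), and the Bayes error is
    one minus that quantity."""
    observables = {z for dist in channel.values() for z in dist}
    success = sum(
        max(prior[s] * channel[s].get(z, 0.0) for s in prior)
        for z in observables
    )
    return 1.0 - success
```

For a binary channel that reports the secret correctly with probability 0.8, the MAP rule errs with probability 0.2; a channel whose output is independent of the secret yields the random-guessing error 0.5.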
III-B Implementing Mutual Information

In order to describe the implementation of the mutual information loss function, we consider the training on a specific batch of data. This technique is based on the idea that the whole training set can be split into smaller batches. This is useful to fit the data in memory and, since during each epoch the network is trained on all the batches, it amounts to using all the training data (provided that the data distribution in each batch is a faithful representation of the training-set distribution; otherwise the learning can be unstable).

To obtain the mutual information between the target class S and the predicted class Ŝ, we estimate the distributions P(S), P(Ŝ) and P(S, Ŝ). Then we can compute I(S; Ŝ) using (11), or equivalently via the formula:

I(S; Ŝ) = Σ_{s,ŝ} P(s, ŝ) log ( P(s, ŝ) / (P(s) P(ŝ)) ).
Let us consider a batch consisting of n samples in the context of the classification problem, and let K denote the total number of classes. In the following we denote by Y and Ŷ, respectively, the target and the prediction matrices for the batch. Namely, Y and Ŷ are n × K matrices, whose rows correspond to samples and whose columns to classes, defined as follows. Y represents the one-hot encoding of the classes: the element in row i and column j, Y_{ij}, is 1 if j is the target class of sample i, and 0 otherwise. Ŷ, on the other hand, reports the probability distribution over the classes computed by the classifier: Ŷ_{ij} is the predicted probability that sample i belongs to class j.

The estimation of P(s_j) for the given batch can be obtained by computing the frequency of class j among the samples, namely:

P̂(s_j) = (1/n) Σ_{i=1}^{n} Y_{ij}.
Similarly, P(ŝ_j) is estimated as the expected prediction of the classifier:

P̂(ŝ_j) = (1/n) Σ_{i=1}^{n} Ŷ_{ij}.
The joint distribution P(S, Ŝ) can be estimated by considering the correlation between Y and Ŷ across the samples. Indeed, the probability that sample i has target class j and predicted class k can be computed as the product Y_{ij} Ŷ_{ik}, and by summing up the contributions of all samples (where each sample contributes with weight 1/n) we obtain P̂(s_j, ŝ_k).

More precisely, for a sample i let us define the K × K matrix B^{(i)} by B^{(i)}_{jk} = Y_{ij} Ŷ_{ik}. Then we can estimate P(S, Ŝ) as:

P̂(S, Ŝ) = (1/n) Σ_{i=1}^{n} B^{(i)}.
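Putting the batch-level estimates together, the mutual information can be computed from the two matrices in a few lines. The following is a minimal NumPy sketch; the variable names `Y` and `Y_hat` (for the target and prediction matrices) and the smoothing constant `eps` are our own choices.

```python
import numpy as np

def mutual_information(Y, Y_hat, eps=1e-12):
    """Estimate I(S; S_hat) (in nats) from a batch.

    Y      : (n, K) one-hot target matrix
    Y_hat  : (n, K) predicted class distributions (rows sum to 1)
    """
    n = Y.shape[0]
    p_s = Y.mean(axis=0)          # marginal of the target class
    p_hat = Y_hat.mean(axis=0)    # marginal of the predicted class
    joint = (Y.T @ Y_hat) / n     # (1/n) sum_i B^(i), with B^(i)_jk = Y_ij * Yhat_ik
    # I = sum_jk joint * log(joint / (p_s * p_hat)); eps guards log(0)
    ratio = joint / (np.outer(p_s, p_hat) + eps)
    return float(np.sum(joint * np.log(ratio + eps)))
```

A perfect classifier on a balanced two-class batch yields I = log 2 nats, while a classifier that always outputs the uniform distribution yields I ≈ 0, which is exactly the quantity the generator tries to drive down.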
The estimation of the mutual information relies on the estimation of the probabilities, which is based on computing frequencies. Hence, in order to obtain a good estimate, the batches should be large enough to represent the true distributions well. Furthermore, if the batch size is too small, the gradient descent is unstable, since the represented distribution changes from one batch to the next. The ML literature offers standard validation techniques (such as cross validation) that provide guidelines for achieving a "good enough" estimation of the probabilities.

III-C Base models

The base model of the classifier is simply the "blank" classifier that has not learnt anything yet (i.e., the weights are initialized according to the Glorot initialization, a standard initialization technique [39]). As for the generator, we have found experimentally that it is convenient to start with a noise function that is fairly spread out. In this way the generator has more data points with non-null probability to consider, and can figure out faster in which direction to move in order to minimize the mutual information.

III-D Utility

The utility constraint is incorporated into the loss function of the generator in the following way:

ℒ_G = λ_I · I(S; Ŝ) + λ_u · U,
where λ_I and λ_u are parameters that allow us to tune the trade-off between utility and privacy. The purpose of the term U is to ensure that the constraint on utility is respected, i.e., that the obfuscation mechanism that the generator is trying to produce stays within the allowed domain. We recall that the constraint (cf. (8)) bounds the expected distortion by a threshold L. Since we need to compute the gradient of the loss, U must be differentiable. We propose to implement it via a function rate(x, y) of two arguments, which is non-negative, monotonically increasing in x, close to 0 for x < y, and growing very quickly for x > y. Hence, we define:

U = rate(E[d(X, Z)], L).
With this definition, the penalty term does not interfere with the rest of the loss when the constraint is respected, and it forces the generator to stay within the constraint, because the term grows very steeply as soon as the constraint is violated.
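A minimal sketch of such a penalty function of two arguments, assuming a softplus-based shape; the name `rate`, the steepness parameter `k`, and the exact functional form are our own choices, not given in the text:

```python
import math

def rate(x, y, k=50.0):
    """Soft threshold penalty: non-negative, increasing in x,
    close to 0 for x < y and growing steeply (slope ~k) for x > y.
    Implemented as a numerically stable softplus of k * (x - y)."""
    t = k * (x - y)
    # log1p(exp(t)) overflows for large t, where softplus(t) ~ t
    return math.log1p(math.exp(t)) if t < 30.0 else t
```

In the generator's loss, a term like `rate(expected_distortion, L)` then contributes almost nothing while the utility bound is respected and dominates the gradient as soon as it is exceeded.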

III-E On the convergence of our method

In principle, at each iteration our method relies on the ability of the generator network to improve the obfuscation mechanism produced at the previous iteration, given only the original locations and the current classifier, which are used to determine the direction of the gradient for the generator. The classifier is a particular adversary, modeled by its weights and biases. However, thanks to the fact that the main component of the generator's loss is the mutual information and not the cross entropy, the training takes into account all the attacks that would be possible from the classifier's information. We have experimentally verified that using the mutual information rather than the cross entropy indeed determines a substantial improvement in the convergence process, and the resulting mechanisms provide better privacy (for the same utility level). Again, the reason is that the cross entropy would be subject to the "swapping effect" illustrated by Example 1 in the introduction.

Another improvement in convergence is due to the fact that, as explained before, we reset the classifier to its initial weight setting at each iteration, rather than letting it evolve from the previous one.

The function that the generator has to minimize, the mutual information, is convex with respect to the noise distribution. This means that there are only global minima, although there can be many of them, all equivalent. Hence, for sufficiently small updates, the noise distribution modeled by the generator converges to one of these optima, provided that the network has enough capacity. In practice, however, the network represents a limited family of noise distributions, and instead of optimizing the noise distribution itself we optimize the weights of the network, which introduces multiple critical points in the parameter space.

Number of epochs and batch size

The convergence of the game can be quite sensitive to the number of epochs and to the batch size. We give just two hints here, referring to the literature [40] for a general discussion of their impact on learning.

First, choosing too small a batch for training might result in too strict a constraint on the utility. In fact, since the utility loss is an expectation, a larger number of samples makes it more likely that some points can be pushed beyond the threshold, taking advantage of the fact that their loss may be compensated by other data points for which the loss is small.

Second, training the classifier for too few epochs might result in too weak an adversary. On the other hand, if it is trained for a long time, we should make sure that the classification performance does not drop on the validation and test sets, because that might indicate an overfitting problem.

IV Cross Entropy vs Mutual Information: demonstration on synthetic data

In this section we perform experiments on a synthetic dataset to obtain an intuition about the behaviour of our method. The dataset is constructed with the explicit purpose of being simple, to facilitate the interpretation of the results. The main outcome of these experiments is to confirm that, as discussed in Sec. III-A, training the generator with respect to the cross entropy is not sound. Even in our simple synthetic case, training with the cross entropy as the loss function fails to converge: the generator just "moves points around", temporarily fooling the current classifier, but failing to really hide the correlation between the secrets and the reported locations.

On the other hand, training with mutual information behaves as expected: the resulting network generates noise that mixes all the classes together, making the classification problem hard for any adversary, not only for the current one. Note that the cross entropy is still used, but only for training the classifier (cf. Sec. III-A).

The dataset

We consider a simple location-privacy problem: users want to disclose their location while protecting their identities. Both the real locations and the reported locations are taken to be points in a square region centered at 5, Boulevard de Sébastopol, Paris. Each location entry is defined by a pair of coordinates normalized to [0, 1].

The synthetic dataset consists of the same number of real locations for each of the four users (classes). The locations of each user are placed around one of the vertices of a square centered at 5, Boulevard de Sébastopol, Paris (each user corresponds to a different vertex). They are randomly generated so as to form a cloud of entries around each vertex, in such a way that no location falls farther than a small bounded distance from the corresponding vertex. These sets are represented in Fig. 4 ((a) and (b), left): it is evident from the figure that the four classes are easily distinguishable; without noise, a linear classifier could predict the class of each location with no error at all.

Of the total entries of the dataset, we use a part for training and validation and the rest for testing, with the same number of entries per user in each split.
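A sketch of how such a four-cluster dataset can be generated; the half-side `a`, the cloud radius `r`, and the per-user count are placeholder values, since the exact figures are not reproduced here:

```python
import math
import random

def make_synthetic(n_per_user=1000, a=0.25, r=0.1, seed=0):
    """Four classes of points clustered around the vertices of a
    square of half-side `a` centered at the origin; every point lies
    within distance `r` of its vertex (normalized coordinates)."""
    rng = random.Random(seed)
    vertices = [(-a, -a), (-a, a), (a, -a), (a, a)]
    data, labels = [], []
    for label, (vx, vy) in enumerate(vertices):
        for _ in range(n_per_user):
            rho = r * math.sqrt(rng.random())   # uniform over a disc
            theta = rng.uniform(0.0, 2.0 * math.pi)
            data.append((vx + rho * math.cos(theta),
                         vy + rho * math.sin(theta)))
            labels.append(label)
    return data, labels
```

With no noise added, these four clouds are linearly separable, which is what makes the dataset convenient for inspecting the behaviour of the two loss functions.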

Network architecture

A relatively simple architecture is used for both networks: each consists of three fully connected hidden layers of neurons with ReLU activations. The classifier has 60, 100 and 51 hidden neurons in the first, second and third hidden layer, respectively, while the generator has 100 neurons in each hidden layer; the latter architecture has proved sufficient to learn how to reproduce the planar Laplace noise distribution with negligible loss.

Bayes error estimation

As explained in Section II, we use the Bayes error to evaluate the level of protection offered by a mechanism. To this purpose, we discretize the region into a grid, thus partitioning it into a number of disjoint cells. We consider several grid settings, with cells of different side lengths (and thus different total numbers of cells), to see how the partition affects the Bayes error.

We run experiments with different numbers of obfuscated locations (hits): for each grid setting, we generate several obfuscated hits for each original location.

Each hit falls in exactly one cell. Hence, we can estimate the probability that a hit is in cell c as:

P̂(c) = n_c / n,

where n_c is the number of hits falling in cell c and n is the total number of hits, and the probability that a hit in cell c belongs to class s as:

P̂(s | c) = n_{s,c} / n_c,

where n_{s,c} is the number of hits of class s falling in cell c. We can now estimate the Bayes error as follows:

B̂ = Σ_{c=1}^{M} P̂(c) (1 − max_s P̂(s | c)),

where M is the total number of cells.

Note that these computations are influenced by the chosen grid. In particular we have two extreme cases:

  • when the grid consists of only one cell, the Bayes error is 1 − max_s P̂(s) (here 3/4, since the four classes are balanced) for any obfuscation mechanism;

  • when the number of cells is large enough that each cell contains at most one hit, the Bayes error is 0 for any obfuscation mechanism.

In general, we expect a finer granularity to give higher discriminating power and to decrease the Bayes error, especially with methods that scatter the obfuscated locations far away.

We estimate the Bayes error on the testing data in order to evaluate how well the obfuscation mechanisms protect new data samples never seen during the training phase. Moreover, we evaluate the Bayes error on the same data we used for training and compare the results with those obtained on the testing data. We notice that, in general, the difference between the two is not large, meaning that the deployed mechanisms efficiently protect new samples as well.
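The grid-based estimate above can be sketched as follows; the helper name and the convention that hits are (x, y) points in the unit square are our own assumptions:

```python
from collections import defaultdict

def empirical_bayes_error(hits, labels, cells_per_side):
    """Empirical Bayes error over a square grid on [0, 1] x [0, 1]:
    sum over cells of P(c) * (1 - max_s P(s | c)), with all the
    probabilities estimated from per-cell class counts."""
    counts = defaultdict(lambda: defaultdict(int))  # cell -> class -> count
    for (x, y), s in zip(hits, labels):
        cell = (min(int(x * cells_per_side), cells_per_side - 1),
                min(int(y * cells_per_side), cells_per_side - 1))
        counts[cell][s] += 1
    n = len(hits)
    return sum(
        (sum(per_class.values()) / n)
        * (1.0 - max(per_class.values()) / sum(per_class.values()))
        for per_class in counts.values()
    )
```

As noted above, the estimate collapses to the two extremes: a single cell that mixes all classes evenly gives the maximal error, while a grid fine enough to isolate every hit gives 0.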

The planar Laplace mechanism

We compare our method against the planar Laplace mechanism [9], whose probability density of reporting the location z, when the true location is x, is:

D_ε(z | x) = (ε² / 2π) e^{−ε d(x, z)},

where d(x, z) is the Euclidean distance between x and z.

In order to compare the Laplace mechanism with ours, we need to tune the privacy parameter ε so that the expected distortion of the planar Laplace is the same as the upper bound L on the utility loss applied in our method. To this purpose, we recall that the expected distortion of the planar Laplace depends only on ε (not on the prior), and is given by:

E[d(X, Z)] = 2/ε.
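A standard way to sample the planar Laplace draws the radius from a Gamma(2, 1/ε) distribution and the angle uniformly; averaging the distortion over many samples then recovers the 2/ε formula above. The sketch below uses only the Python standard library; the function name is our own:

```python
import math
import random

def planar_laplace_sample(x, eps, rng=random):
    """Draw a noisy location z around the true location x = (x1, x2).

    The distance d(x, z) follows a Gamma(shape=2, scale=1/eps)
    distribution and the angle is uniform, which yields the density
    (eps^2 / 2*pi) * exp(-eps * d(x, z)) and expected distortion 2/eps."""
    r = rng.gammavariate(2.0, 1.0 / eps)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (x[0] + r * math.cos(theta), x[1] + r * math.sin(theta))
```

Inverting the formula, a given distortion bound L corresponds to choosing ε = 2/L.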
IV-A Experiment 1: relaxed utility constraint

As a first experiment, we choose for the upper bound on the expected distortion a value high enough that, in principle, we can achieve the highest possible privacy. The latter is obtained when the observed obfuscated location gives no information about the true location, i.e., when the reported location is distributed independently of the original one. In this case, the attacker can only do random guessing. Since we have four users, the Bayes error is then 3/4.

For the distortion, we take d to be the geographical distance between the true and the reported location. One way to achieve the maximum privacy is to map all locations into the middle point. To compute a sufficient bound, note that the vertices of the original locations form a square, hence each vertex lies at a fixed distance from the center; taking into account that the locations can lie a bit farther away from the corresponding vertex, any bound on the distortion larger than the sum of these two distances should be enough to obtain the maximum privacy. We set the upper bound a little higher than that, but we will see from the experiments that a much smaller value would have been sufficient.

We now need to tune the planar Laplace so that its expected distortion matches this bound; using Equation (28), the chosen bound determines the corresponding value of ε.
We have used this instance of the planar Laplace also as the starting point of our method: the initial generator is defined so as to reproduce it. For the next steps, the generator and the classifier are constructed as explained in Algorithm 1. In particular, at each iteration the generator is trained for a fixed number of epochs, with fixed batch size and learning rate and with fixed weights for the utility loss and the mutual information; the classifier is likewise trained with its own batch size, number of epochs, and learning rate.

(a) Iteration 30
(b) Iteration 40
Fig. 4: Using cross entropy for producing the noise does not make the system converge. The left sides of figures (a) and (b) show the original synthetic data without noise; the right sides show the noisy data at different iterations.




Fig. 5: Synthetic testing data. From left to right: Laplace noise, no noise, our noise produced using mutual information.
Fig. 6: Estimation of the Bayes error, as a function of the number of cells, on the original (non-obfuscated) version of the synthetic data: (a) training data, (b) testing data.

Fig. 7: Estimation of the Bayes error on synthetic data for the Laplace mechanism ("Lap") and ours ("Our"), for various numbers of cells and of obfuscated hits: (a) training data, (b) testing data. The empirical utility loss on training and testing data is reported for both the Laplace mechanism and ours. The optimal mechanism gives a Bayes error of 3/4.

Training wrt cross entropy

As discussed in Sec. III-A, training the generator with respect to the cross entropy is not sound. This is confirmed in the experiments by the fact that the generator fails to converge. Fig. 4 shows the distribution generated in two different iterations of the game. We observe that, trying to fool the classifier, the generator on the right-hand side has simply moved locations around, so that each class has been placed in a different area. This clearly confuses a classifier trained on the distribution of the left-hand side; however, the correlation between labels and locations is still evident, and a classifier trained on the new distribution can infer the labels as accurately as before.

As a consequence, after each iteration, the accuracy of the newly trained classifier is always 100%, while the Bayes error remains 0. The generator fails to converge to a distribution that effectively protects the users' privacy. We can hence conclude that the use of the cross entropy is unsound for training the generator.

Training wrt mutual information

Using now the mutual information for training the generator (while still using the more efficient cross entropy for the classifier, as explained in Sec. III-A), we observe a totally different behaviour. After each iteration the accuracy of the classifier drops, showing that the generator produces meaningful noise. After a number of iterations, the accuracy of the classifier becomes 25% on both the training and the validation set, which means that the classifier just randomly predicts one of the four classes. We conclude that the noise injection is maximally effective, since the corresponding Bayes error, 3/4, is the maximum possible. Hence we know that we can stop.

The result of our method, i.e., of applying the final generator to the testing set, is reported in Fig. 5(c). The empirical distortion is well below the limit set in (29): achieving the optimum privacy does not require the full distortion budget, because even though some locations are pushed farther away, there are also locations that end up closer, and these compensate in the utility loss (which is a linear average measure).

For comparison, the result of applying the planar Laplace to the testing set is illustrated in Fig. 5(a). The empirical distortion (i.e., the distortion computed on the sampled obfuscated locations) is in line with the theoretical distortion formulated in (31).

From Fig. 5 we can see that, while the Laplace tends to "spread out" the obfuscated locations, our method tends to concentrate them into a single point (mode collapse), i.e., the mechanism is almost deterministic. This is due to the fact that the utility constraint is sufficiently loose to allow all the noisy locations to be displaced enough to overlap in the same point. When the utility constraint is stricter, the mechanism is forced to be probabilistic (and the mode collapse does not happen anymore). For example, consider two individuals, A and B, in locations x_A and x_B respectively, and assume that the bound on the expected distortion is half the distance between them. Assume also, for simplicity, that there are no other locations available. Then the optimal solution maps x_A into x_B with probability 1/2 and into itself with probability 1/2, and vice versa for x_B. Nevertheless, we can expect that our mechanism will tend to overlap the obfuscated locations of different classes as much as the utility constraint allows. With the Laplace, on the contrary, the areas of the various classes remain pretty separated. This is reflected by the Bayes error estimation reported in Fig. 7.
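The optimality of the swap in this two-location example can be checked numerically. The sketch below (with made-up labels for the identities and locations) computes the mutual information of the 1/2-swap mechanism and of the identity (no-noise) mechanism:

```python
import math

def mutual_info(prior, channel):
    """I(S; Z) in nats for a discrete channel P(z | s)."""
    p_z = {}
    for s, p_s in prior.items():
        for z, p in channel[s].items():
            p_z[z] = p_z.get(z, 0.0) + p_s * p
    mi = 0.0
    for s, p_s in prior.items():
        for z, p in channel[s].items():
            if p > 0:
                mi += p_s * p * math.log(p / p_z[z])
    return mi

# Uniform prior over the two identities; under the swap mechanism each
# location is reported unchanged or swapped with probability 1/2.
prior = {"A": 0.5, "B": 0.5}
swap = {"A": {"xA": 0.5, "xB": 0.5}, "B": {"xA": 0.5, "xB": 0.5}}
identity = {"A": {"xA": 1.0}, "B": {"xB": 1.0}}
```

The swap mechanism leaks nothing (mutual information 0, hence the Bayes error is the random-guessing 1/2), whereas reporting the true locations leaks log 2 nats, i.e., the identity completely.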

We note that the Bayes error of the planar Laplace tends to decrease as the grid becomes finer. We believe that this is due to the fact that, with a coarse grid, there is an effect of confusion simply due to the large size of each cell. The behavior of our noise, on the contrary, is quite stable. Note that, when the grid is very coarse (