Improving VAEs’ Robustness to Adversarial Attack
Abstract
Variational autoencoders (VAEs) have recently been shown to be vulnerable to adversarial attacks, wherein they are fooled into reconstructing a chosen target image. However, how to defend against such attacks remains an open problem. We make significant advances in addressing this issue by introducing methods for producing adversarially robust VAEs. Namely, we first demonstrate that methods used to obtain disentangled latent representations produce VAEs that are more robust to these attacks. However, this robustness comes at the cost of reducing the quality of the reconstructions. We therefore further introduce a new hierarchical VAE, the Seatbelt-VAE, which can produce high-fidelity autoencoders that are also adversarially robust. We confirm the empirical capabilities of the Seatbelt-VAE on several different datasets and with current state-of-the-art VAE adversarial attack schemes.
1 Introduction
Variational autoencoders (VAEs) are a powerful approach to learning deep generative models and probabilistic autoencoders (Kingma and Welling, 2014; Rezende et al., 2014).
However, recent work has shown that they are vulnerable to adversarial attacks (Tabacof et al., 2016; Gondim-Ribeiro et al., 2018; Kos et al., 2018), wherein an adversary attempts to fool the VAE into producing reconstructions similar to a chosen target by adding distortions to the original input, as shown in Figure 1. In particular, these papers have shown that effective attacks can be made by finding local perturbations of the original input that produce latent-space representations similar to that of the adversary's target. This kind of attack can be harmful in applications where the encoder's output is used downstream, as in Xu et al. (2017); Kusner et al. (2017); Theis et al. (2017); Townsend et al. (2019); Ha and Schmidhuber (2018); Higgins et al. (2017b). Furthermore, VAEs are often themselves used as a mechanism for protecting classifiers from adversarial attack (Schott et al., 2019; Ghosh et al., 2019). As such, ensuring VAEs are robust to adversarial attack is an important endeavor.
Despite these vulnerabilities, little progress has been made in the literature on how to defend VAEs from such adversarial attacks. The aim of this paper is thus to investigate and introduce possible strategies for defense. Moreover, we seek to find ways to defend VAEs in a manner that maintains reconstruction performance.
Our first contribution towards this aim is to show that regularising the variational objective (i.e. the ELBO) during training can lead to more robust VAEs. Specifically, we leverage ideas from the disentanglement literature (Mathieu et al., 2019) to improve VAEs' robustness by learning simpler and smoother representations that are less vulnerable to attack. In particular, we show that the total correlation (TC) term used by Kim and Mnih (2018); Chen et al. (2018); Esmaeili et al. (2019) to encourage independence between the dimensions of the learned latent representations also serves as an effective regulariser for learning robust VAEs.
Though a clear improvement over the standard VAE, a severe drawback of this approach is that the gains in robustness are coupled with drops in the reconstruction performance, due to the increased regularisation. Furthermore, we find that the achievable robustness with this approach can be limited (see Figure 1) and thus potentially insufficient for particularly sensitive tasks.
To address this, we introduce a new TC-regularised hierarchical VAE: the Seatbelt-VAE. By using a richer latent space representation than the standard VAE, the Seatbelt-VAE can learn deep generative models which are not only even more robust to adversarial attacks than those just using TC regularisation, but which are also able to achieve this while providing reconstructions which are comparable to, and often even better than, the standard VAE.
To summarize, our key contributions are:

Providing insights into what makes VAEs vulnerable to attack and how we might go about defending them.

Unearthing new connections between disentanglement and robustness to adversarial attack.

A demonstration that regularised VAEs, trained with an upweighted total correlation, are significantly more robust to adversarial attacks than vanilla VAEs.

Introducing a regularised hierarchical VAE, the Seatbelt-VAE, that provides further robustness to adversarial attack while also improving reconstructions.
2 Background
2.1 Variational Autoencoders
Variational autoencoders (VAEs) are a deep extension of factor analysis suitable for high-dimensional data like images (Kingma and Welling, 2014; Rezende et al., 2014). They introduce a joint distribution over data $x$ and latent variables $z$: $p_\theta(x, z) = p_\theta(x \mid z)\,p(z)$, where $p_\theta(x \mid z)$ is an appropriate distribution given the form of the data, the parameters of which are represented by deep nets with parameters $\theta$, and $p(z) = \mathcal{N}(z; 0, I)$ is a common choice for the prior. As exact inference is intractable, one performs amortised stochastic variational inference by introducing an inference network for the latent variables, $q_\phi(z \mid x)$, which often also takes the form of a Gaussian, $\mathcal{N}(z; \mu_\phi(x), \Sigma_\phi(x))$. We can then perform gradient ascent on the evidence lower bound (ELBO)

$$\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\left[q_\phi(z \mid x) \,\middle\|\, p(z)\right]$$

w.r.t. both $\theta$ and $\phi$, using the reparameterisation trick to take gradients through Monte Carlo samples from $q_\phi(z \mid x)$.
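As a concrete illustration, the ELBO and the reparameterised sampling described above can be sketched in a few lines of NumPy. The linear encoder and decoder here are hypothetical stand-ins for the deep nets; the sketch assumes a unit-variance Gaussian likelihood, so the reconstruction term is computed up to a constant:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kl(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def elbo(x, encode, decode, n_samples=10):
    """Monte Carlo estimate of the ELBO using the reparameterisation trick."""
    mu, log_var = encode(x)
    recon = 0.0
    for _ in range(n_samples):
        eps = rng.standard_normal(mu.shape)
        z = mu + np.exp(0.5 * log_var) * eps      # reparameterised sample
        x_hat = decode(z)
        recon += -0.5 * np.sum((x - x_hat) ** 2)  # Gaussian log-likelihood, up to a constant
    return recon / n_samples - gaussian_kl(mu, log_var)

# Toy linear encoder/decoder standing in for the deep nets (hypothetical)
W = rng.standard_normal((4, 2)) * 0.1
encode = lambda x: (W.T @ x, np.full(2, -1.0))    # posterior mean and log-variance
decode = lambda z: W @ z

x = rng.standard_normal(4)
print(elbo(x, encode, decode))
```

In practice the gradient of this estimate w.r.t. the encoder and decoder parameters would be taken with automatic differentiation; the sketch only makes the structure of the objective explicit.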
2.2 Attacks on VAEs
In an adversarial attack, an agent is trying to manipulate the behaviour of some machine learning model towards a goal of their choosing, such as fooling a classifier to misclassify an image through adding a small perturbation (Akhtar and Mian, 2018; Gilmer et al., 2018). For many deep learning models, very small changes in the input, of little importance to the human eye, can produce large changes in the model’s output.
Attacks on VAEs have been proposed by Tabacof et al. (2016); Gondim-Ribeiro et al. (2018); Kos et al. (2018). Here the adversary looks to apply small input distortions that cause the reconstructions to be close to a target adversarial image. An example of this is shown in Figure 1, where a successful attack is performed on a standard VAE to turn Hugh Jackman into Anna Wintour.
The current most effective mode of attack on VAEs is known as a latent space attack (Tabacof et al., 2016; Gondim-Ribeiro et al., 2018; Kos et al., 2018). This aims to find a distorted image $x + d$ such that its posterior $q_\phi(z \mid x + d)$ is close to that of the adversary's chosen target image $x^t$, by minimising

$$\Delta(d; x, x^t) = \mathrm{KL}\left[q_\phi(z \mid x + d) \,\middle\|\, q_\phi(z \mid x^t)\right] + \lambda \lVert d \rVert_2 \qquad (1)$$

over the distortion $d$, where $\lambda$ penalises large distortions. Matching posteriors implies, in turn, that the likelihood of the target is high when conditioned on draws from the encoding of the adversarial example. It is particularly important to be robust to this attack if one is concerned with using the encoder network of a VAE as part of a downstream task.
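A minimal sketch of this latent-space attack objective can be written in NumPy for a diagonal-Gaussian encoder. The linear encoder below is a hypothetical stand-in for a trained inference network:

```python
import numpy as np

def kl_diag_gaussians(mu0, lv0, mu1, lv1):
    # KL( N(mu0, exp(lv0)) || N(mu1, exp(lv1)) ) for diagonal Gaussians
    return 0.5 * np.sum(
        lv1 - lv0 + (np.exp(lv0) + (mu0 - mu1) ** 2) / np.exp(lv1) - 1.0
    )

def latent_attack_loss(d, x, x_target, encode, lam=1.0):
    """Adversary's objective: make the posterior of the distorted input
    match that of the target, while keeping the distortion d small."""
    mu_a, lv_a = encode(x + d)
    mu_t, lv_t = encode(x_target)
    return kl_diag_gaussians(mu_a, lv_a, mu_t, lv_t) + lam * np.linalg.norm(d)

# Hypothetical linear encoder with fixed posterior variance
rng = np.random.default_rng(1)
W = rng.standard_normal((3, 2))
encode = lambda x: (W.T @ x, np.zeros(2))

x = rng.standard_normal(3)
# With x_target = x and d = 0 the loss is exactly zero
print(latent_attack_loss(np.zeros(3), x, x, encode))
```

In a real attack this loss would be minimised over `d` with a gradient-based optimiser while the VAE's parameters stay fixed.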
3 Defending VAEs
Given these approaches to attacking VAEs, the critical question is now how to defend them. This problem was not considered by these prior works.
To solve it, we first need to consider the question: what makes VAEs vulnerable to adversarial attacks? We argue that two key factors dictate whether we can perform a successful attack on a VAE: a) whether we can induce significant changes in the encoding distribution $q_\phi(z \mid x)$ through only small changes in the data $x$, and b) whether we can induce significant changes in the reconstructed images through only small changes to the latents $z$. The first of these relates to the smoothness of the encoder mapping, the latter to the smoothness of the decoder mapping.
Consider, for the sake of argument, the case where the encoder–decoder process is almost completely noiseless. Here successful reconstruction places no direct pressure for similar encodings to correspond to similar images: given sufficiently powerful networks, we can have an embedding where very small changes to $z$ imply very large changes to the reconstructed image, because there is no ambiguity in the "correct" encoding of a particular datapoint. In essence, we can have a lookup-table style behaviour, where nearby realisations of $z$ do not necessarily relate to each other and very different images can have very similar encodings.
Such a system will now be very vulnerable to adversarial attacks: small changes to the image can lead to large changes in the encoding, and small changes to the encoding can lead to large changes in the reconstruction. Our autoencoder will also tend to overfit and have gaps in the aggregate posterior, as each individual posterior $q_\phi(z \mid x_n)$ will be tightly peaked. This can then easily be exploited by an adversary.
We postulate two possible ways to avoid this undesirable behaviour. Firstly, we could try and directly regulate the networks used by the encoder and decoder to limit the capacity of the system to have small differences in images induce large differences in latents. Secondly, we can try to regulate the level of noise in the encoding to indirectly force a smoothness in the embedding. Having a noisy encoding creates uncertainty in the latent that gives rise to a particular image, forcing similar latents to correspond to similar images. In other words, we can avoid the aforementioned vulnerabilities by either ensuring our encode–decode process is sufficiently simple, or sufficiently noisy. The fact that the VAE is vulnerable to adversarial attack suggests that its standard setup does not sufficiently encourage either of these to provide an adequate defence. Introducing additional regularisation to enforce simplicity or noisiness thus provides an intriguing prospect for defending them.
Though in principle direct regularisation of the networks (e.g. through regularisation of their weights) might be a viable defence in a number of scenarios, we will, in this paper, instead focus on indirect regularisation approaches as discussed in the next section. The reason for this is that the macroscopic behaviour of the networks is difficult to control, and in particular to calibrate, through such low-level regularisation.
3.1 Disentanglement and Robustness
Recent research into disentangling VAEs (Higgins et al., 2017a; Siddharth et al., 2017; Kim and Mnih, 2018; Chen et al., 2018; Esmaeili et al., 2019; Mathieu et al., 2019) and the information bottleneck (Alemi et al., 2017, 2018) have looked to regularise the ELBO with the hope of providing more interpretable or simpler embeddings. This hints at an interesting link to robustness and raises the question: can we use methods for encouraging disentanglement to also encourage robustness?
Of particular relevance is the recent work of Mathieu et al. (2019). They introduce the notion of overlap in the embedding of a VAE and show how controlling it is critical to achieving smooth and meaningful latent representations. Overlap encapsulates both the level of uncertainty in the encoding process and also a locality of this uncertainty: to learn a smooth representation we not only need our encoder distribution to have an appropriate level of entropy, we also want the different possible encodings to be similar to each other, rather than spread out through the space.
Mathieu et al. (2019) further show that the success of many methods for disentanglement, and in particular the β-VAE (Higgins et al., 2017a), are rooted precisely in controlling this level of overlap. Controlling overlap is exactly what we need to carry out our second suggested approach to defending VAEs. We therefore propose to train more robust VAEs by using the same ELBO regularisers as employed by disentanglement methods.
A further link between disentanglement and robustness is that disentangled representations may often also be both simpler and more human-interpretable. For example, if we were hypothetically able to learn an embedding for CelebA where one of the latent variables has a clear and smooth correspondence with skin tone, then it would likely be difficult to conduct an adversarial attack that produces an image with a different skin tone without making substantial changes to this latent. Thus disentangled representations may be more robust not only because they induce simpler and smoother mappings through regularisation, but also because, if they encourage human-interpretable features, successful attacks become more difficult from the perspective of human-perceived changes to the reconstruction. After all, the definition of a successful attack is rooted in what features of an image are perceived as important to a human observer: successful attacks are those which change the qualitative nature of the reconstruction, not those which induce the largest change in individual pixels. As such, there are strong links between disentanglement and robustness through common ideas of what it means to manipulate a datapoint such as an image.
3.2 Regularising for Robustness
There are a number of different disentanglement methods that one might consider using to train robust VAEs. Perhaps the simplest would be to use a β-VAE (Higgins et al., 2017a), wherein we upweight the KL term in the VAE's ELBO by a factor β. Indeed, this is the disentanglement approach that has been shown to most directly relate to overlap, with the value of β transpiring to be directly linked to the entropy of the encoder (Mathieu et al., 2019).
However, the β-VAE is known to provide disentanglement only at the expense of substantial reductions in reconstruction quality, as the data likelihood term has, in effect, been downweighted (Kim and Mnih, 2018; Chen et al., 2018; Mathieu et al., 2019). Furthermore, the level of disentanglement it can achieve is lower than that of more recent methods (Kim and Mnih, 2018; Chen et al., 2018).
Because of these shortfalls, we instead propose to regularise through penalisation of a total-correlation (TC) term as per Kim and Mnih (2018); Chen et al. (2018). This looks to directly force independence across the different latent dimensions in the aggregate posterior $q_\phi(z)$, such that the distribution of the data in the latent space (i.e. where we draw a datapoint at random and then pass it through the encoder) factorises across dimensions. As we are upweighting the total correlation by β, we refer to this as the β-TCVAE as per Chen et al. (2018). This approach has been shown to provide improved disentanglement over the β-VAE, while also having a smaller deleterious effect on reconstruction quality.
To be more precise, the TC-decomposition of the VAE objective presented in (Hoffman and Johnson, 2016; Makhzani et al., 2016; Kim and Mnih, 2018; Chen et al., 2018; Esmaeili et al., 2019) reveals an explicit TC term for the variational posterior. The FactorVAE and β-TCVAE upweight this term, producing the variational objective (with β > 1):

$$\mathcal{L}^{\beta}(x) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\left[q_\phi(z \mid x) \,\middle\|\, p(z)\right] - (\beta - 1)\,\mathrm{KL}\left[q_\phi(z) \,\middle\|\, \textstyle\prod_j q_\phi(z_j)\right] \qquad (2)$$

where $q_\phi(z) = \frac{1}{N}\sum_{n=1}^{N} q_\phi(z \mid x_n)$ is the aggregate posterior, $j$ indexes over latent dimensions, and $\mathrm{KL}\left[q_\phi(z) \,\middle\|\, \prod_j q_\phi(z_j)\right]$ is the TC term.
Chen et al. (2018); Esmaeili et al. (2019) give a differentiable, stochastic approximation to this TC term, rendering the decomposition usable as a training objective with stochastic gradient descent. However, the approximation is a biased estimator: the TC is a nested expectation, for which unbiased, finite-variance estimators do not generally exist (Rainforth et al., 2018). As a consequence, large batch sizes are needed for the estimator to have the desired behaviour; for small batch sizes its practical behaviour mimics that of the β-VAE (Mathieu et al., 2019, Appendix C).
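The stochastic approximation above can be sketched as follows for diagonal-Gaussian posteriors, in the style of the minibatch-weighted-sampling estimator of Chen et al. (2018). The shapes and toy data are illustrative; a real implementation would operate on the posterior parameters produced by the encoder during training:

```python
import numpy as np

def logsumexp(a, axis):
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def log_gauss(z, mu, lv):
    # elementwise log N(z; mu, exp(lv))
    return -0.5 * (np.log(2 * np.pi) + lv + (z - mu) ** 2 / np.exp(lv))

def tc_estimate(z, mu, lv, dataset_size):
    """Minibatch-weighted-sampling estimate of the total correlation
    KL( q(z) || prod_j q(z_j) ), in the style of Chen et al. (2018).
    z, mu, lv: (M, D) arrays -- one posterior sample and its parameters
    per datapoint in the minibatch."""
    M, _ = z.shape
    # log q(z_m | x_k) per dimension, for every pair (m, k): shape (M, M, D)
    lp = log_gauss(z[:, None, :], mu[None, :, :], lv[None, :, :])
    # log q(z_m): joint density under the minibatch mixture
    log_qz = logsumexp(lp.sum(axis=2), axis=1) - np.log(M * dataset_size)
    # log prod_j q_j(z_m,j): product of per-dimension marginals
    log_qzj = np.sum(logsumexp(lp, axis=1) - np.log(M * dataset_size), axis=1)
    return float(np.mean(log_qz - log_qzj))

rng = np.random.default_rng(0)
M, D = 64, 4
mu = rng.standard_normal((M, D))
lv = np.full((M, D), -1.0)
z = mu + np.exp(0.5 * lv) * rng.standard_normal((M, D))
print(tc_estimate(z, mu, lv, dataset_size=10_000))
```

The bias discussed above is visible here: the estimate reweights a small minibatch to stand in for the whole aggregate posterior, which only becomes accurate for large batches.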
3.3 Adversarial Attacks on TC-Penalised VAEs
We now consider attacking these TC-penalised VAEs and demonstrate one of the key contributions of the paper: that, empirically, this form of regularisation makes adversarial attacks on VAEs via their latent space harder to carry out.
To do this, we first train them under the β-TCVAE objective (i.e. Eq (2)), jointly optimising θ and φ for a given β. Once trained, we then attack the models using the methods outlined in Section 2.2. Namely, we use an adversary that tries to find a distortion d to the input which minimises the attack loss as per Eq (1).
One possible metric for how successful such attacks have been is the converged value of the attack loss Δ. If the latent space distributions for the target image and for the distorted input match exactly, then the KL term is zero and the model has been completely fooled: reconstructions from samples from the attacked posterior would be indistinguishable from those from the target posterior. Meanwhile, the larger the converged value of the attack loss, the less similar these distributions are, and thus the more different the reconstructed image is from the adversarial target image.
We carry out these attacks for dSprites (Matthey et al., 2017), Chairs (Aubry et al., 2014), and 3D faces (Paysan et al., 2009), for a range of β and λ values. We pick values of λ following the methodology in Tabacof et al. (2016); Gondim-Ribeiro et al. (2018), and use L-BFGS-B for the gradient-based optimisation (Byrd et al., 1995). We also tried varying the dimensionality of the latent space of the model, but found it had little effect on the effectiveness of the attack.
In Figure 5 we show the effect on the attack loss of varying β, averaged over different original input-target pairs and over different values of λ. Note that the plot is logarithmic in the values of the loss. We see a clear pattern for each dataset: the loss values reached by the adversary increase as we increase β from the standard VAE (i.e. β = 1). This analysis is also borne out by visual inspection of the effectiveness of these attacks, as shown in Figure 1 and in a number of other example attacks for different datasets, β, and λ in Appendix I. We will return to give further experimental results in Section 5.
An interesting point of note in Figure 5 is that in many cases the achievable adversarial loss actually starts to decrease again if β is set too large. This is analogous to having too large an overlap when training for disentanglement as per Mathieu et al. (2019), or an overly restrictive information bottleneck (Alemi et al., 2017). The effect can be explained by thinking about what happens in the limit β → ∞. Here there is no pressure in the objective to produce good reconstructions, so the encoder simply focuses on matching the prior regardless of the input. The encoding then uses no information from the input, and the KL term in the attack loss becomes small for all possible distortions, even $d = 0$. For large but finite values of β there will still be pressure to produce good reconstructions, but this will be dominated by the TC term, which is most easily minimised by simply encoding to the prior.
4 The Seatbelt-VAE
We are now armed with the fact that penalising the total correlation in the ELBO leads to more robust VAEs. However, this TC penalisation in single-layer VAEs comes at the expense of reconstruction quality (Chen et al., 2018). Our aim is to develop a model that is robust to adversarial attack while mitigating this trade-off between robustness and sample quality.
To achieve this, we now consider instead using hierarchical VAEs (Rezende et al., 2014; Sønderby et al., 2016; Zhao et al., 2017; Maaløe et al., 2019). These are known for their superior modelling capabilities and more accurate reconstructions. As these gains stem from using more complex hierarchical latent spaces, rather than less noisy encoders, this suggests they may be able to produce better reconstructions and generative capabilities, while also remaining robust to adversarial attacks when appropriately regularised.
The simplest hierarchical extension, with a chain of conditional stochastic variables in the generative model, is the Deep Latent Gaussian Model (DLGM) of Rezende et al. (2014). Here the forward model factorises as a chain
$$p_\theta(x, z_{1:L}) = p_\theta(x \mid z_1)\, p_\theta(z_L) \prod_{i=1}^{L-1} p_\theta(z_i \mid z_{i+1}) \qquad (3)$$
where each $p_\theta(z_i \mid z_{i+1})$ is a Gaussian distribution with mean and variance parameterised by deep nets, while $p_\theta(z_L)$ is an isotropic Gaussian.
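Ancestral sampling through such a chain can be sketched in NumPy; the linear layer maps and noise scale below are hypothetical stand-ins for the deep nets of each conditional:

```python
import numpy as np

rng = np.random.default_rng(0)
L, dz, dx = 3, 2, 4

# Hypothetical linear layers standing in for the deep nets of each conditional
A = [rng.standard_normal((dz, dz)) * 0.5 for _ in range(L - 1)]
B = rng.standard_normal((dx, dz)) * 0.5

def sample_dlgm():
    """Ancestral sampling through the chain of Eq (3): z_L ~ N(0, I),
    then each z_i given z_{i+1}; the likelihood depends on z_1 only."""
    z = rng.standard_normal(dz)                    # z_L from the isotropic prior
    for A_i in reversed(A):                        # z_{L-1}, ..., z_1
        z = A_i @ z + 0.1 * rng.standard_normal(dz)
    return B @ z                                   # mean of p(x | z_1)

print(sample_dlgm().shape)
```

The key structural point for the discussion that follows is visible in the last line: only `z_1`, the bottom of the chain, reaches the likelihood.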
Unfortunately, we found that naively applying TC penalisation to DLGM-style VAEs did not confer the improved robustness we observed in single-layer VAEs. We postulate that this observed weakness is inherent to the chain factorisation of the generative model: this structure means that the data likelihood depends solely on $z_1$, the bottommost latent variable, so attackers need only manipulate $z_1$ to produce a successful attack.
To account for this, we instead propose a generative model in which the likelihood depends on all the latent variables in the chain, $z_{1:L}$, rather than just the bottom layer $z_1$. This leads to the following factorisation of the generative structure (which shares some similarity with that of BIVA (Maaløe et al., 2019))
$$p_\theta(x, z_{1:L}) = p_\theta(x \mid z_{1:L})\, p_\theta(z_L) \prod_{i=1}^{L-1} p_\theta(z_i \mid z_{i+1}) \qquad (4)$$
To construct the ELBO, we must further introduce an inference network $q_\phi(z_{1:L} \mid x)$. On the basis of simplicity, and because it produces effective empirical performance, we simply use a chain factorisation for this as per Rezende et al. (2014):
$$q_\phi(z_{1:L} \mid x) = q_\phi(z_1 \mid x) \prod_{i=2}^{L} q_\phi(z_i \mid z_{i-1}) \qquad (5)$$
where each conditional distribution takes the form of a Gaussian. Note that, marginalising out intermediate layers, we see that $q_\phi(z_L \mid x)$ is a non-Gaussian, highly flexible distribution. A summary of the dependency structure for the generative and inference networks is shown in Figure 8.
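The change relative to the DLGM sketch is small but structurally important: the decoder now sees every layer of the chain. A minimal NumPy sketch, again with hypothetical linear maps in place of deep nets:

```python
import numpy as np

rng = np.random.default_rng(0)
L, dz, dx = 3, 2, 4

A = [rng.standard_normal((dz, dz)) * 0.5 for _ in range(L - 1)]  # chain maps (hypothetical)
C = rng.standard_normal((dx, L * dz)) * 0.5                      # decoder weights on z_{1:L}

def sample_seatbelt():
    """Sampling under Eq (4): the latent chain is unchanged, but the
    likelihood is conditioned on all of z_{1:L} rather than z_1 alone."""
    zs = [rng.standard_normal(dz)]                 # z_L ~ N(0, I)
    for A_i in reversed(A):
        zs.append(A_i @ zs[-1] + 0.1 * rng.standard_normal(dz))
    z_all = np.concatenate(zs)                     # (z_L, ..., z_1) stacked
    return C @ z_all                               # mean of p(x | z_{1:L})

print(sample_seatbelt().shape)
```

Because the likelihood consumes the full stack of latents, an attacker can no longer succeed by manipulating the bottom layer alone, which is the motivation given above for this factorisation.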
To defend this model against adversarial attack, we further introduce a TC regularisation term as per the last section. We refer to the resulting model as the Seatbelt-VAE due to the protection it confers against adversarial attack. Because we find that, empirically, models of this type struggle to converge when TC penalisation is applied to either the bottommost layer or every layer, the Seatbelt-VAE only applies a TC penalisation to the topmost latent variable $z_L$. In other words, following the FactorVAE and β-TCVAE, we upweight a TC term for $z_L$ of the same form as in Eq (2) to give
$$\mathcal{L}^{\beta}_{\mathrm{Seatbelt}}(x) = \mathbb{E}_{q_\phi(z_{1:L}|x)}\left[\log \frac{p_\theta(x, z_{1:L})}{q_\phi(z_{1:L} \mid x)}\right] - (\beta - 1)\,\mathrm{KL}\left[q_\phi(z_L) \,\middle\|\, \textstyle\prod_j q_\phi(z_L^j)\right] \qquad (6)$$
where $j$ indexes over the coordinates of $z_L$.
Similar to Kim and Mnih (2018) and Chen et al. (2018), we can, in fact, reach Eq (6) by exposing this total-correlation term through an explicit decomposition of the KL (see Appendix C for a derivation). Specifically, now considering the ELBO for the whole dataset and using $\mathbb{E}_{\hat{p}(x)}[\cdot]$ to indicate the empirical average over the data, we have:
$$\mathbb{E}_{\hat{p}(x)}\!\left[\mathcal{L}^{\beta}_{\mathrm{Seatbelt}}(x)\right] = \mathbb{E}_{\hat{p}(x)}\,\mathbb{E}_{q_\phi(z_{1:L}|x)}\!\left[\log p_\theta(x \mid z_{1:L}) - \sum_i \log \frac{q_\phi(z_i \mid z_{i-1}, x)}{p_\theta(z_i \mid z_{i+1})}\right] - (\beta - 1)\,\mathrm{KL}\!\left[q_\phi(z_L) \,\middle\|\, \textstyle\prod_j q_\phi(z_L^j)\right] \qquad (7)$$

with the conventions $q_\phi(z_1 \mid z_0, x) = q_\phi(z_1 \mid x)$ and $p_\theta(z_L \mid z_{L+1}) = p_\theta(z_L)$,
where $q_\phi(z_L) = \frac{1}{N}\sum_n q_\phi(z_L \mid x_n)$ is the top-layer aggregate posterior and $i$ indexes over the latent variables in the hierarchical chain. We see that, when $L = 1$, the Seatbelt-VAE reduces to a β-TCVAE, and for $\beta = 1$ it produces a DLGM with our augmented likelihood function.
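At the level of its constituent terms, the training objective is simply the hierarchical ELBO with one extra penalty. A trivial but clarifying sketch of how the pieces combine (the numeric values below are illustrative, not from any trained model):

```python
def seatbelt_objective(recon_ll, kl_term, tc_top, beta):
    """Assemble the training objective from its pieces: the standard
    hierarchical ELBO (reconstruction log-likelihood minus the KL over
    the latent chain) plus an extra (beta - 1)-weighted TC penalty on
    the top-layer aggregate posterior."""
    return recon_ll - kl_term - (beta - 1.0) * tc_top

# With beta = 1 the penalty vanishes and we recover the plain ELBO
print(seatbelt_objective(-120.0, 15.0, 2.3, beta=1.0))
```

This makes the two limiting cases in the text easy to see: `beta = 1` removes the TC penalty entirely, while a single-layer chain reduces the remaining terms to the standard single-latent ELBO.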
As with the β-TCVAE, training using stochastic gradient ascent with minibatches of the data is complicated by the presence of aggregate posteriors which depend on the entire dataset. To deal with this, we derive a minibatch estimator that generalises the Minibatch-Weighted-Sampling estimator, proposed by Chen et al. (2018); Esmaeili et al. (2019) in the context of β-TCVAEs, to disentangled hierarchical VAEs. As discussed in Section 3.2, this estimator is inherently biased for finite batch sizes, such that large batch sizes are required to provide a good estimate of the TC. See Appendix D for further details.
4.1 Attacking the Seatbelt-VAE
In the Seatbelt-VAE the likelihood over data is conditioned on all layers, so manipulations to any layer have the potential to be significant. We focus on simultaneously attacking all layers of the Seatbelt-VAE, noting that, as shown in the Appendix, this is more effective than targeting just the top or base layers individually. Hence our adversarial objective for the Seatbelt-VAE is based on the following generalisation of that introduced in (Tabacof et al., 2016; Gondim-Ribeiro et al., 2018), attacking all the layers at the same time:
$$\Delta(d; x, x^t) = \sum_{i=1}^{L} \mathrm{KL}\left[q_\phi(z_i \mid x + d) \,\middle\|\, q_\phi(z_i \mid x^t)\right] + \lambda \lVert d \rVert_2 \qquad (8)$$

where $q_\phi(z_i \mid \cdot)$ denotes the layer-$i$ posterior under the chain factorisation of Eq (5).
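This all-layers objective is a direct sum of single-layer attack losses. A NumPy sketch, which for simplicity treats each layer's posterior as a function of the input alone (in the full model, higher layers condition on the latents below them); the two-layer linear encoder is hypothetical:

```python
import numpy as np

def kl_diag(mu0, lv0, mu1, lv1):
    # KL between diagonal Gaussians
    return 0.5 * np.sum(lv1 - lv0 + (np.exp(lv0) + (mu0 - mu1) ** 2) / np.exp(lv1) - 1.0)

def all_layer_attack_loss(d, x, x_target, encode_layers, lam=1.0):
    """Attack objective summing the posterior mismatch over every layer
    of the hierarchy, plus the usual distortion penalty."""
    loss = lam * np.linalg.norm(d)
    for q in encode_layers:                       # one (mu, log_var) map per layer
        mu_a, lv_a = q(x + d)
        mu_t, lv_t = q(x_target)
        loss += kl_diag(mu_a, lv_a, mu_t, lv_t)
    return loss

# Hypothetical two-layer encoder with linear mean maps
rng = np.random.default_rng(2)
W1, W2 = rng.standard_normal((3, 2)), rng.standard_normal((3, 2))
encode_layers = [lambda x: (W1.T @ x, np.zeros(2)),
                 lambda x: (W2.T @ x, np.zeros(2))]

x = rng.standard_normal(3)
print(all_layer_attack_loss(np.zeros(3), x, x, encode_layers))
```

The attacker must now drive every term of the sum down at once, which is the intuition for why attacking the full hierarchy is harder than attacking a single layer.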
5 Experiments
We now demonstrate that Seatbelt-VAEs confer superior robustness compared with β-TCVAEs and standard VAEs, while preserving the ability to reconstruct inputs effectively. Through this, we demonstrate that Seatbelt-VAEs are a powerful tool for learning robust deep generative models.
5.1 Methods
We first expand on our experiments in Section 3.3 and perform a battery of adversarial attacks on each of the introduced models. We randomly sample 10 input-target pairs for each dataset. As in Tabacof et al. (2016); Gondim-Ribeiro et al. (2018), for each image pair we consider 50 different values of λ, geometrically distributed. Thus each model undergoes 500 attacks for each attack mode. As before, we used L-BFGS-B for the gradient-based optimisation (Byrd et al., 1995). We perform these experiments on Chairs (Aubry et al., 2014), 3D faces (Paysan et al., 2009), and CelebA (Liu et al., 2015). Additional results for dSprites (Matthey et al., 2017) can be found in Appendices H, I, and J. We used the same encoder and decoder architectures as Chen et al. (2018) for each dataset. Details of neural network architectures and training are given in Appendix E.
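The overall attack protocol (one optimisation per penalty weight, over a geometric grid of weights) can be sketched as follows. The paper uses L-BFGS-B; this dependency-free stand-in uses plain gradient descent with finite-difference gradients, and the quadratic loss is a hypothetical placeholder for the VAE attack objective:

```python
import numpy as np

def run_attack(loss_fn, d0, steps=200, lr=0.05, eps=1e-4):
    """Minimise the attack loss over the distortion d by gradient descent
    with finite-difference gradients (a stand-in for L-BFGS-B)."""
    d = d0.copy()
    for _ in range(steps):
        g = np.zeros_like(d)
        for i in range(d.size):
            e = np.zeros_like(d)
            e[i] = eps
            g[i] = (loss_fn(d + e) - loss_fn(d - e)) / (2 * eps)
        d -= lr * g
    return d

def lambda_sweep(make_loss, dim, lambdas):
    """One attack per penalty weight, with geometrically spaced lambdas
    as in the protocol above (range here is illustrative)."""
    return {lam: run_attack(make_loss(lam), np.zeros(dim)) for lam in lambdas}

# Toy quadratic loss standing in for the VAE attack objective (hypothetical)
target_d = np.array([0.5, -0.3])
make_loss = lambda lam: (lambda d: np.sum((d - target_d) ** 2) + lam * np.linalg.norm(d))
results = lambda_sweep(make_loss, dim=2, lambdas=np.geomspace(1e-3, 1e3, 5))
```

The sweep exhibits the qualitative behaviour exploited in the experiments: for small λ the optimiser finds large distortions, while large λ pins the distortion near zero.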
We evaluate the effectiveness of adversarial attacks using the attack objective Δ as before, along with $-\log p_\theta(x^t \mid \tilde{z})$: the negative likelihood of the target image $x^t$ given the embedding $\tilde{z}$ generated by the adversary. As with Δ, higher values of this metric denote a less successful attack.
5.2 Visual Appraisal of Attacks
We first visually appraise the effectiveness of attacks on vanilla VAEs, β-TCVAEs, and Seatbelt-VAEs. As mentioned in Section 1, Figure 1 shows the results of latent space attacks on three models trained on CelebA. It is apparent that the β-TCVAE provides additional resilience to the attacks compared with the standard VAE. Furthermore, this figure shows that the Seatbelt-VAE was sufficiently robust to almost completely thwart the adversary, producing an adversarial reconstruction that still resembles the original input. Moreover, this was achieved while still producing a clearer non-adversarial reconstruction than either the VAE or β-TCVAE. See Appendix I for more examples.
One might expect that adversarial attacks targeting a single generative factor underpinning the data would be easier for the attacker. However, we find that TC-penalised models protect effectively against these attacks as well. For instance, see Appendix I.1 for plots showing an attacker attempting to rotate a dSprites heart.
5.3 Numerical Appraisal of Robustness
Having ascertained perceptually that the Seatbelt-VAE offers the strongest protection against adversarial attack, we now demonstrate this quantitatively. Figure 13 shows Δ and $-\log p_\theta(x^t \mid \tilde{z})$ over a range of datasets and β values for Seatbelt-VAEs and β-TCVAEs. This figure demonstrates that the combination of depth and high TC penalisation offers the best protection against adversarial attacks, and that the Seatbelt extension confers much greater protection than a single-layer β-TCVAE.
In Appendix H.1, we also calculate the distance between target images and adversarial outputs and show that the loss of effectiveness of adversarial attacks is not due to the degradation of reconstruction quality from increasing β. We also include results in Appendix H for "output" attacks (Gondim-Ribeiro et al., 2018), which we find to be generally less effective. Here, the attacker directly tries to reduce the L2 distance between the reconstructed output and the target image. Again, TC-penalised models, and in particular Seatbelt-VAEs, outperformed standard VAEs.
5.4 ELBO and Reconstruction Quality
Though Seatbelt-VAEs offer better protection against adversarial attack than β-TCVAEs, we also motivate their utility by way of their reconstruction quality. In Figure 17 we plot the final ELBO of the two TC-penalised models, calculated without the additional penalisation that was applied during training. We further show the effect of depth and TC penalisation on reconstructions of CelebA. Both these plots show that Seatbelt-VAEs' reconstructions are more resilient to increasing β than β-TCVAEs'. This resilience is both visually perceptible and measurable.
5.5 Noised input data
We finish by testing robustness to unstructured attacks, where we noise the inputs and evaluate the model's ability to reconstruct the original. Through this, we are evaluating their ability to denoise inputs. See Figure 20 for an illustration of the denoising properties of TC-penalised models trained on the CelebA dataset. This ability to denoise may partially explain these models' robustness to more structured attacks.
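The evaluation just described is straightforward to sketch: corrupt the input with Gaussian noise and measure how far the reconstruction lands from the clean original. The noise level, trial count, and toy "reconstructor" below are illustrative choices, not the paper's settings:

```python
import numpy as np

def denoising_error(x, reconstruct, sigma=0.25, n_trials=20, seed=0):
    """Unstructured 'attack': corrupt the input with Gaussian noise and
    measure the mean squared error between the reconstruction of the
    noisy input and the clean original."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(n_trials):
        x_noisy = x + sigma * rng.standard_normal(x.shape)
        errs.append(np.mean((reconstruct(x_noisy) - x) ** 2))
    return float(np.mean(errs))

# A perfect denoiser that always returns the clean input scores zero (illustrative)
x = np.linspace(-1.0, 1.0, 8)
print(denoising_error(x, lambda _: x))
```

In the actual experiments, `reconstruct` would be a trained model's encode-decode pass; a lower score under this metric corresponds to the stronger denoising seen for the TC-penalised models.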
6 Conclusion
We have shown that VAEs can be rendered more robust both to adversarial attacks and to noising of the inputs by adopting a TC penalisation in the evidence lower bound. This increase in robustness can be strengthened even further by using our proposed hierarchical VAE, the Seatbelt-VAE, which uses a carefully chosen generative structure in which the likelihood makes use of all the latent variables.
Designing robust VAEs is becoming pressing as they are increasingly deployed as sub-components in larger pipelines. As we have shown, methods typically used for disentangling, motivated by their ability to provide interpretable representations, also confer robustness on VAEs. Studying the beneficial effects of these methods is starting to come to the fore of research into VAEs (Kumar and Poole, 2019). We hope this work sparks further interest in the interplay between disentangling, regularisation, and model robustness.
References
Akhtar and Mian (2018). Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey. IEEE Access 6, pp. 14410–14430.
Alemi et al. (2017). Deep Variational Information Bottleneck. In ICLR.
Alemi et al. (2018). Fixing a Broken ELBO. In ICML.
Aubry et al. (2014). Seeing 3D chairs: Exemplar part-based 2D-3D alignment using a large dataset of CAD models. In CVPR, pp. 3762–3769.
Byrd et al. (1995). A Limited Memory Algorithm for Bound Constrained Optimization. SIAM J. Sci. Comput. 16(5), pp. 1190–1208.
Chen et al. (2018). Isolating Sources of Disentanglement in Variational Autoencoders. In NeurIPS.
Esmaeili et al. (2019). Structured Disentangled Representations. In AISTATS.
Ghosh et al. (2019). Resisting Adversarial Attacks Using Gaussian Mixture Variational Autoencoders. In AAAI.
Gilmer et al. (2018). Motivating the Rules of the Game for Adversarial Example Research. CoRR.
Gondim-Ribeiro et al. (2018). Adversarial Attacks on Variational Autoencoders. CoRR.
Ha and Schmidhuber (2018). World Models. In NeurIPS.
Higgins et al. (2017a). β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. In ICLR.
Higgins et al. (2017b). DARLA: Improving Zero-Shot Transfer in Reinforcement Learning. In ICML.
Hoffman and Johnson (2016). ELBO surgery: yet another way to carve up the variational evidence lower bound. In NeurIPS.
Kim and Mnih (2018). Disentangling by Factorising. In ICML.
Kingma and Welling (2014). Auto-Encoding Variational Bayes. In ICLR.
Kos et al. (2018). Adversarial Examples for Generative Models. In IEEE Security and Privacy Workshops, pp. 36–42.
Kumar and Poole (2019). On Implicit Regularization in β-VAEs. In NeurIPS Bayesian Deep Learning Workshop.
Kusner et al. (2017). Grammar Variational Autoencoder. In ICML.
Liu et al. (2015). Deep Learning Face Attributes in the Wild. In ICCV.
Maaløe et al. (2019). BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling. In NeurIPS.
Makhzani et al. (2016). Adversarial Autoencoders. In ICLR.
Mathieu et al. (2019). Disentangling Disentanglement in Variational Autoencoders. In ICML.
Matthey et al. (2017). dSprites: Disentanglement testing Sprites dataset.
Paysan et al. (2009). A 3D face model for pose and illumination invariant face recognition. In AVSS 2009, pp. 296–301.
Rainforth et al. (2018). On nesting Monte Carlo estimators. In ICML.
Rezende et al. (2014). Stochastic Backpropagation and Approximate Inference in Deep Generative Models. In ICML.
Schott et al. (2019). Towards the First Adversarially Robust Neural Network Model on MNIST. In ICLR.
Siddharth et al. (2017). Learning Disentangled Representations with Semi-Supervised Deep Generative Models. In NeurIPS.
Sønderby et al. (2016). Ladder Variational Autoencoders. In NeurIPS.
Tabacof et al. (2016). Adversarial Images for Variational Autoencoders. In NIPS Workshop on Adversarial Training.
Theis et al. (2017). Lossy Image Compression with Compressive Autoencoders. In ICLR.
Townsend et al. (2019). Practical Lossless Compression with Latent Variables using Bits Back Coding. In ICLR.
Xu et al. (2017). Variational Autoencoder for Semi-supervised Text Classification. In AAAI, pp. 3358–3364.
Zhao et al. (2017). Learning Hierarchical Features from Generative Models. In ICML.