Lagrangian Decomposition for Neural Network Verification
A fundamental component of neural network verification is the computation of bounds on the values that network outputs can take. Previous methods have either used off-the-shelf solvers, discarding the problem structure, or relaxed the problem even further, making the bounds unnecessarily loose. We propose a novel approach based on Lagrangian Decomposition. Our formulation admits an efficient supergradient ascent algorithm, as well as an improved proximal algorithm. Both algorithms offer three advantages: (i) they yield bounds that are provably at least as tight as previous dual algorithms relying on Lagrangian relaxations; (ii) they are based on operations analogous to the forward/backward passes of neural network layers and are therefore easily parallelizable, amenable to GPU implementation and able to take advantage of the convolutional structure of problems; and (iii) they allow for anytime stopping while still providing valid bounds. Empirically, we show that we obtain bounds comparable with off-the-shelf solvers in a fraction of their running time, and obtain tighter bounds in the same time as previous dual algorithms. This results in an overall speed-up when employing the bounds for formal verification.
As deep learning powered systems become more and more common, the lack of robustness of neural networks and their reputation for being “Black Boxes” are increasingly worrisome. In order to deploy them in critical scenarios where safety and robustness are prerequisites, we need techniques that can prove formal guarantees on neural network behaviour. A particularly desirable property is resistance to adversarial examples (Goodfellow2015, Szegedy2014): perturbations maliciously crafted with the intent of fooling even extremely well performing models. After several defenses were proposed and subsequently broken (Athalye2018, Uesato2018), some progress has been made in being able to formally verify whether there exist any adversarial examples in the neighbourhood of a data point (Tjeng2019, Wong2018).
Verification algorithms fall into three categories: unsound (some false properties are proven false), incomplete (some true properties are proven true), and complete (all properties are correctly verified as either true or false). A critical component of the verification systems developed so far is the computation of lower and upper bounds on the output of neural networks when their inputs are constrained to lie in a bounded set. In incomplete verification, by deriving bounds on the changes of the prediction vector under restricted perturbations, it is possible to identify safe regions of the input space. These results allow the rigorous comparison of adversarial defenses and prevent making overconfident statements about their efficacy (Wong2018). In complete verification, bounds can also be used as essential subroutines of Branch and Bound complete verifiers (Bunel2018). Finally, bounds might also be used as a training signal to guide the network towards greater robustness and more verifiability (Gowal2018b, Mirman2018, Wong2018).
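To make the notion of output bounds concrete, the following minimal sketch propagates an input box through a small ReLU network using plain interval arithmetic. This is one of the simplest (and loosest) bounding techniques, not the method proposed in this paper; the function names `interval_bounds` and `forward`, and the random example network, are assumptions introduced purely for illustration.

```python
import numpy as np

def interval_bounds(weights, biases, x_lower, x_upper):
    """Bound the outputs of an affine/ReLU network over the box [x_lower, x_upper]."""
    lower = np.asarray(x_lower, dtype=float)
    upper = np.asarray(x_upper, dtype=float)
    n_layers = len(weights)
    for k, (W, b) in enumerate(zip(weights, biases)):
        # Split the weights by sign so each bound uses the worst-case input corner.
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lower = W_pos @ lower + W_neg @ upper + b
        new_upper = W_pos @ upper + W_neg @ lower + b
        if k < n_layers - 1:
            # ReLU is monotone, so it can be applied to both bounds directly.
            new_lower = np.maximum(new_lower, 0.0)
            new_upper = np.maximum(new_upper, 0.0)
        lower, upper = new_lower, new_upper
    return lower, upper

def forward(weights, biases, x):
    """Ordinary forward pass with ReLU on every layer except the last."""
    n_layers = len(weights)
    for k, (W, b) in enumerate(zip(weights, biases)):
        x = W @ x + b
        if k < n_layers - 1:
            x = np.maximum(x, 0.0)
    return x

# Hypothetical two-layer network with inputs constrained to [-1, 1]^3.
rng = np.random.default_rng(0)
example_weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
example_biases = [rng.normal(size=4), rng.normal(size=2)]
out_lower, out_upper = interval_bounds(
    example_weights, example_biases, -np.ones(3), np.ones(3)
)
```

Every output of `forward` on a point of the input box lies between `out_lower` and `out_upper`. If, for instance, the lower bound on a classification margin is positive, the property is verified over the whole box; tighter bounding methods enlarge the set of properties that can be certified this way.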
Most previous algorithms for computing bounds are either computationally expensive (Ehlers2017) or sacrifice a lot of tightness in order to scale (Gowal2018b, Mirman2018, Wong2018). In this work, we design novel customised relaxations and their corresponding solvers for obtaining bounds on neural networks. Our approach offers the following advantages:
While previous approaches to neural network bounds (Dvijotham2018) are based on Lagrangian relaxations, we derive a new family of optimization problems for neural network bounds through Lagrangian Decomposition, which in general yields duals at least as strong as those obtained through Lagrangian relaxation (Guignard1987). We in fact prove that, in the context of ReLU networks, for any dual solution from the approach by Dvijotham2018, the bounds output by our dual are at least as tight. We demonstrate empirically that this derivation computes tighter bounds in the same time when using supergradient methods. We further improve on the performance by devising a proximal solver for the problem, which decomposes the task into a series of strongly convex subproblems. For each, we use an iterative method for which we derive optimal step sizes.
The basic steps of both the supergradient and the proximal method are linear operations similar to the ones used during the forward/backward pass of the network. As a consequence, we can leverage the convolutional structure when necessary, while standard solvers are often restricted to treating it as a general linear operation. Moreover, both methods are easily parallelizable: when computing bounds on the neural activations at a given layer, we need to solve two problems for each hidden unit of that layer (one for the upper bound and one for the lower bound). These can all be solved in parallel. In complete verification, we need to compute bounds for several different problem domains at once: we solve these problems in parallel as well. Our GPU implementation thus allows us to solve several hundreds of linear programs at once on a single GPU, a level of parallelism that would be hard to match on CPU-based systems.
Most standard linear programming based relaxations (Ehlers2017) will only return a valid bound if the problem is solved to optimality. Others, like the dual simplex method employed by off-the-shelf solvers (gurobi-custom), have a very high cost per iteration and will not yield tight bounds without incurring significant computational costs. Both methods described in this paper are anytime (terminating them before convergence still provides a valid bound), and can be interrupted at very small granularity. This is useful in the context of a subroutine for complete verifiers, as this enables the user to choose an appropriate speed versus accuracy trade-off. It also offers great versatility as an incomplete verification method.
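The decomposition idea and the anytime property can be illustrated on a toy problem that is unrelated to neural networks. Assume we want to minimise $c^\top x$ over the intersection of two boxes $A$ and $B$. Lagrangian Decomposition introduces one copy of the variable per constraint set, $x_A \in A$ and $x_B \in B$, and dualises the coupling constraint $x_A = x_B$ with multipliers $\rho$, giving the dual function $q(\rho) = \min_{x_A \in A} (c - \rho)^\top x_A + \min_{x_B \in B} \rho^\top x_B$. By weak duality, $q(\rho)$ is a valid lower bound for every $\rho$, so supergradient ascent can be stopped at any iteration. The sketch below is a hypothetical illustration under these assumptions; the names `box_min` and `decomposition_bounds`, the step-size schedule and the problem data are not taken from the paper.

```python
import numpy as np

def box_min(coef, lower, upper):
    """Minimise coef @ x over the box [lower, upper]; return (value, minimiser)."""
    x = np.where(coef >= 0, lower, upper)
    return float(coef @ x), x

def decomposition_bounds(c, box_a, box_b, num_iters=50, step0=0.5):
    """Supergradient ascent on the Lagrangian Decomposition dual of
    min c @ x  s.t.  x in box_a and x in box_b (toy problem)."""
    rho = np.zeros_like(c)
    history = []
    for t in range(num_iters):
        # The two subproblems are independent and solved in closed form.
        val_a, x_a = box_min(c - rho, *box_a)
        val_b, x_b = box_min(rho, *box_b)
        # q(rho) is a valid lower bound at every iteration (weak duality).
        history.append(val_a + val_b)
        # x_b - x_a is a supergradient of q at rho; use a diminishing step size.
        rho = rho + step0 / (t + 1) * (x_b - x_a)
    return history

# Hypothetical problem data.
c = np.array([1.0, -2.0])
box_a = (np.array([0.0, 0.0]), np.array([2.0, 2.0]))
box_b = (np.array([1.0, -1.0]), np.array([3.0, 1.0]))

bounds = decomposition_bounds(c, box_a, box_b)

# Exact optimum for comparison: the intersection of two boxes is itself a box.
p_star, _ = box_min(
    c, np.maximum(box_a[0], box_b[0]), np.minimum(box_a[1], box_b[1])
)
```

Every entry of `bounds` is at most `p_star`, and the best entry approaches it as the iterations proceed, so truncating the loop early only costs tightness, never validity. A primal method stopped early offers no such guarantee in general.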
2 Related Works
Bound computations are mainly used for formal verification methods. Some methods are complete (Cheng2017, Ehlers2017, Katz2017, Tjeng2019, Xiang2017), always returning a verdict for each problem instance. Others are incomplete, based on relaxations of the verification problem. They trade completeness for speed: while they cannot verify properties for all problem instances, they scale significantly better. Two main types of bounds have been proposed: on the one hand, some approaches (Ehlers2017, Salman2019) rely on off-the-shelf solvers to solve accurate relaxations such as PLANET (Ehlers2017), which is the best known linear-sized approximation of the problem. On the other hand, as PLANET and other more complex relaxations do not have closed form solutions, some researchers have also proposed easier to solve, looser