Most stochastic optimization methods use gradients once before discarding them. While variance reduction methods have shown that reusing past gradien…

Parallel sentence extraction is a task addressing the data sparsity problem found in multilingual natural language processing applications. We propos…

Deep networks often perform well on the data manifold on which they are trained, yet give incorrect (and often very confident) answers when evaluated…

Catastrophic forgetting is a problem faced by many machine learning models and algorithms. When trained on one task, then trained on a second task, m…

Common nonlinear activation functions used in neural networks can cause training difficulties due to the saturation behavior of the activation functi…

A central challenge to many fields of science and engineering involves minimizing non-convex error functions over continuous, high-dimensional spaces…

We propose Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork, $h$, is a neural netwo…

Quantum Cryptography uses the counter-intuitive properties of Quantum Mechanics for performing cryptographic tasks in a secure and reliable way. The …

We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simpl…

In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-o…

Brain-Machine Interfaces (BMIs) have recently emerged as a clinically viable option to restore voluntary movements after paralysis. These devices are…

Convolutional neural networks are becoming standard tools for solving object recognition and visual tasks. However, most of the design and implementa…

We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), including the different approaches such…

Survival analysis is a type of semi-supervised ranking task where the target output (the survival time) is often right-censored. Utilizing this infor…

Restricted Boltzmann Machines (RBMs) are one of the fundamental building blocks of deep learning. Approximate maximum likelihood training of RBMs typ…

Here we propose a novel model family with the objective of learning to disentangle the factors of variation in data. Our approach is based on the spi…

The softmax content-based attention mechanism has proven to be very beneficial in many applications of recurrent neural networks. Nevertheless, it suf…

We present two techniques to improve landmark localization in images from partially annotated datasets. Our primary goal is to leverage the common si…

Keyphrase extraction from documents is useful to a variety of applications such as information retrieval and document summarization. This paper prese…

We study the problem of learning representations of entities and relations in knowledge graphs for predicting missing links. The success of such a ta…
