In this section, we demonstrate the effectiveness of our proposed method in Algorithm 1 for solving (1). We compare our method with two other solvers. The first is the CVX package (Grant and Boyd, 2012), a general convex program solver which t…

We provide a simple and efficient algorithm for computing the Euclidean projection of a point onto the capped simplex---a simplex with an additional …
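As a rough illustration of the projection problem described above, here is a minimal sketch. It assumes the capped simplex is the set {x : 0 <= x_i <= 1, sum(x) = s}, which is an assumption since the definition is cut off in the excerpt. It uses plain bisection on a scalar shift, not the algorithm proposed in the paper, and the function name `project_capped_simplex` is hypothetical.

```python
def project_capped_simplex(y, s, iters=100):
    """Euclidean projection of y onto {x : 0 <= x_i <= 1, sum(x) = s}.

    Assumed set definition (not taken from the excerpt). The projection has
    the form x_i = clip(y_i - tau, 0, 1); we find tau by bisection, since
    sum(x) is non-increasing in tau.
    """
    n = len(y)
    if not 0 <= s <= n:
        raise ValueError("need 0 <= s <= len(y) for a non-empty feasible set")

    def clipped(tau):
        # Shift every coordinate by tau, then clip into [0, 1].
        return [min(1.0, max(0.0, v - tau)) for v in y]

    # At lo every coordinate clips to 1 (sum = n); at hi every one clips to 0.
    lo, hi = min(y) - 1.0, max(y)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(clipped(mid)) > s:
            lo = mid  # sum too large: shift further
        else:
            hi = mid  # sum small enough: try a smaller shift
    return clipped(0.5 * (lo + hi))


if __name__ == "__main__":
    x = project_capped_simplex([0.5, 2.0, -1.0], 1.0)
    print(x, sum(x))
```

Bisection costs O(n) per iteration and converges to floating-point precision in a fixed number of steps, so it serves as a simple reference against which a faster exact method could be checked.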

We perform a thorough investigation of the problem of LM integration in encoder-decoder based ASR models. We compare some of the most prominent past methods and a few of our own proposed methods on the medium-scale and publicly available Switchboard…

Attention-based recurrent neural encoder-decoder models present an elegant solution to the automatic speech recognition problem. This approach folds …

In this section, we study the case of noisy (stochastic) gradient updates, and the SDGD algorithm, in which the influence of the delay is quite different than in the noiseless case. Instantiating SDGD for quadratic F(w) (defined in (7)) results in t…

We provide tight finite-time convergence bounds for gradient descent and stochastic gradient descent on quadratic functions, when the gradients are d…

We develop a fully automatic image colorization system. Our approach leverages recent advances in deep networks, exploiting both low-level and semant…

We study the fair variant of the classic $k$-median problem introduced by Chierichetti et al. [2017]. In the standard $k$-median problem, given an in…

We introduce a design strategy for neural network macro-architecture based on self-similarity. Repeated application of a simple expansion rule genera…

In the classical Node-Disjoint Paths (NDP) problem, we are given an $n$-vertex graph $G=(V,E)$, and a collection $M=\{(s_1,t_1),\ldots,(s_k,t_k)\}$ o…

Recent progress on many imaging and vision tasks has been driven by the use of deep feed-forward neural networks, which are trained by propagating gr…

Modern robotics applications that involve human-robot interaction require robots to be able to communicate with humans seamlessly and effectively. Na…

End-to-end training of deep learning-based models allows for implicit learning of intermediate representations based on the final task loss. However,…

Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end tra…

Recently, several works have shown that natural modifications of the classical conditional gradient method (aka Frank-Wolfe algorithm) for constraine…

We model coherent conversation continuation via RNN-based dialogue models equipped with a dynamic attention mechanism. Our attention-RNN language mod…

Recent work has shown that speech paired with images can be used to learn semantically meaningful speech representations even without any textual sup…

Perhaps one of the most interesting takeaways from this work is that we should start considering improper learning algorithms for adversarially robust learning. Even though our improper learning rule might not be practical, our results suggest to co…

We study the question of learning an adversarially robust predictor. We show that any hypothesis class $\mathcal{H}$ with finite VC dimension is robu…

We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of $m$ stores a sampl…

We obtain a tight distribution-specific characterization of the sample complexity of large-margin classification with L2 regularization: We introduce…

We propose a neural sequence-to-sequence model for direction following, a task that is essential to realizing effective autonomous agents. Our alignm…

Previous work has shown that neural encoder-decoder speech recognition can be improved with hierarchical multitask learning, where auxiliary tasks ar…

We have shown how to leverage PPDB to learn state-of-the-art word embeddings and compositional models for paraphrase tasks. Since PPDB was created automatically from parallel corpora, our models are also built automatically. Only small amounts of an…

The Paraphrase Database (PPDB; Ganitkevitch et al., 2013) is an extensive semantic resource, consisting of a list of phrase pairs with (heuristic) co…
