Recent works have shown that on sufficiently over-parametrized neural nets, gradient descent with relatively large initialization optimizes a predict…

Penalized likelihood methods are fundamental to ultra-high dimensional variable selection. How high dimensionality such methods can handle remains la…

Sparse reduced-rank regression is an important tool to uncover meaningful dependence structure between large numbers of predictors and responses in m…

Generative Adversarial Networks (GANs) are one of the most practical strategies to learn data distributions. A popular GAN formulation is based on th…

We consider the problem of simultaneous estimation of a sequence of dependent parameters that are generated from a hidden Markov model. Based on obse…

High-dimensional sparse modeling with censored survival data is of great practical importance, as exemplified by modern applications in high-throughp…

Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural …

We establish that first-order methods avoid saddle points for almost all initializations. Our results apply to a wide variety of first-order methods, including gradient descent, block coordinate descent, mirror descent and variants thereof. The conn…

Our analysis in this paper reveals that the suggested RANK method, which exploits the general framework of model-free knockoffs introduced in [9], can asymptotically control the FDR in general high-dimensional nonlinear models with unknown covariate distr…

Power and reproducibility are key to enabling refined scientific discoveries in contemporary big data applications with general high-dimensional nonl…

A common issue for classification in scientific research and industry is the existence of imbalanced classes. When sample sizes of different classes …

This paper is concerned with the selection and estimation of fixed and random effects in linear mixed effects models. We propose a class of nonconcav…

Overloading the Library Functions: It is not difficult to see that piecewise univariate functions can be implemented in our library.
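As a hedged illustration of this idea (a minimal sketch, not the paper's actual library), a piecewise univariate function such as the ReLU can be handled by overloading arithmetic on a value type that carries a derivative alongside the value; `Dual` and `relu` below are invented names for the sketch:

```python
# Illustrative forward-mode overloading sketch; "Dual" and "relu" are
# assumed names, not part of any particular library.

class Dual:
    """Forward-mode AD value: a number paired with its derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule on the derivative component
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__

def relu(x):
    """Piecewise univariate function: branch on the primal value and
    propagate the derivative of the active piece."""
    if x.val > 0:
        return Dual(x.val, x.dot)
    return Dual(0.0, 0.0)

# d/dx [relu(x) * x] at x = 3: value 9, derivative 6
y = relu(Dual(3.0, 1.0)) * Dual(3.0, 1.0)
```

The branch is taken on the primal value, so each evaluation differentiates exactly one smooth piece of the function.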

The Cheap Gradient Principle (Griewank 2008) --- the computational cost of computing the gradient of a scalar-valued function is nearly the same (oft…
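The principle can be made concrete with a toy reverse-mode tape (an illustrative sketch under assumed names, not Griewank's construction or any specific library): the forward pass records each operation once, and a single backward sweep over the same recorded operations recovers all partial derivatives, so the gradient costs only a small constant factor more than evaluating the function itself.

```python
# Toy reverse-mode AD illustrating the Cheap Gradient Principle.
# "Var" and its methods are invented names for this sketch.

class Var:
    def __init__(self, val):
        self.val, self.grad, self._parents = val, 0.0, []

    def __mul__(self, other):
        out = Var(self.val * other.val)
        # record local partials: d(out)/d(self), d(out)/d(other)
        out._parents = [(self, other.val), (other, self.val)]
        return out

    def __add__(self, other):
        out = Var(self.val + other.val)
        out._parents = [(self, 1.0), (other, 1.0)]
        return out

    def backward(self):
        # topological order so each node is processed after all its consumers
        order, seen = [], set()
        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for parent, _ in v._parents:
                    visit(parent)
                order.append(v)
        visit(self)
        # one reverse sweep: work proportional to the number of recorded ops
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node._parents:
                parent.grad += node.grad * local

x, y = Var(2.0), Var(5.0)
f = x * y + x * x   # f = xy + x^2 = 14
f.backward()        # df/dx = y + 2x = 9, df/dy = x = 2
```

Both sweeps touch each recorded operation a constant number of times, which is the essence of the "nearly the same cost" claim.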

Heterogeneous treatment effects are the center of gravity in many modern causal inference applications. In this paper, we investigate the estimation …

Understanding how features interact with each other is of paramount importance in many scientific discoveries and contemporary applications. Yet inte…

Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime …

Most existing binary classification methods target the optimization of the overall classification risk and may fail to serve some real-world appli…

Past works have shown that, somewhat surprisingly, over-parametrization can help generalization in neural networks. Towards explaining this phenomeno…

Proof [Proof of Lemma 6] By Corollary 4, $k_* \le k'$. This implies that $R_0(\hat{\phi}_{k_*}) \ge R_0(\hat{\phi}_{k'})$. Moreover, by Lemma 5, for any $\delta_0' \in (0,1)$ and $n_0' \ge 4/(\alpha\delta_0)$,

The Neyman-Pearson (NP) paradigm in binary classification seeks classifiers that achieve a minimal type II error while enforcing the prioritized type…

Friendship formation is important to online social network sites and to society, but can suffer from informational friction. In this study, we demons…

Performing statistical inference in high-dimension is an outstanding challenge. A major source of difficulty is the absence of precise information on…
