Liu Liu
About Me
I am a Ph.D. student in the ECE Department at The University of Texas at Austin.
I am fortunate to be advised by Prof. Constantine Caramanis in the
Wireless Networking and Communications Group.
My research interests are in theoretical machine learning, optimization, and high-dimensional statistics.
I am also interested in real-world data science and have spent time at tech companies and financial firms as a research intern.
Before moving to Austin, I received my B.E. degree from the Department of Electronic Engineering at Tsinghua University, Beijing, China, in 2016.
Publications and Preprints
Robust compressed sensing using generative models
Ajil Jalal, Liu Liu, Alexandros G. Dimakis, Constantine Caramanis.
NeurIPS 2020. [arXiv Link]
The goal of compressed sensing is to estimate a high-dimensional vector from an underdetermined system of noisy linear equations. In analogy to classical compressed sensing, here we assume a generative model as a prior, that is, we assume the vector is represented by a deep generative model \(G: \mathbb{R}^k \to \mathbb{R}^n\). Classical recovery approaches such as empirical risk minimization (ERM) are guaranteed to succeed when the measurement matrix is sub-Gaussian. However, when the measurement matrix and measurements are heavy-tailed or have outliers, recovery may fail dramatically. In this paper we propose an algorithm inspired by the Median-of-Means (MOM). Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers. Theoretically, our results show that our novel MOM-based algorithm enjoys the same sample complexity guarantees as ERM under sub-Gaussian assumptions. Our experiments validate both aspects of our claims: other algorithms are indeed fragile and fail under heavy-tailed and/or corrupted data, while our approach exhibits the predicted robustness.
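The median-of-means idea can be illustrated with a minimal numpy sketch of a robust surrogate loss. The names below (G, A, y, mom_loss) are hypothetical illustration, not the paper's implementation: instead of averaging squared residuals over all measurements, we average within random batches and take the median across batches.

```python
import numpy as np

def mom_loss(z, G, A, y, num_batches=10, rng=None):
    """Median-of-means estimate of the measurement loss ||A G(z) - y||^2.

    A few heavy-tailed or corrupted measurements can dominate the plain
    average; here they influence at most a few batches, so the median of
    the batch means stays stable.
    """
    rng = np.random.default_rng(rng)
    residuals = (A @ G(z) - y) ** 2          # per-measurement squared error
    idx = rng.permutation(len(residuals))    # random batch assignment
    batches = np.array_split(residuals[idx], num_batches)
    batch_means = np.array([b.mean() for b in batches])
    return np.median(batch_means)            # robust surrogate for the mean
```

The latent code z can then be estimated by minimizing this surrogate, e.g. by differentiating through G with an autodiff framework.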
Robust Structured Statistical Estimation via Conditional Gradient Type Methods
Jiacheng Zhuo, Liu Liu, Constantine Caramanis.
arXiv preprint arXiv:2007.03572, 2020. [arXiv Link]
Structured statistical estimation problems are often solved by Conditional Gradient (CG) type methods to avoid the computationally expensive projection operation. However, existing CG type methods are not robust to data corruption. To address this, we propose to robustify CG type methods against Huber's corruption model and heavy-tailed data. First, we show that the two Pairwise CG methods are stable, i.e., do not accumulate error. Combined with robust mean gradient estimation techniques, we can therefore guarantee robustness to a wide class of problems, but now in a projection-free algorithmic framework. Next, we consider high-dimensional problems. Robust mean estimation based approaches may have an unacceptably high sample complexity. When the constraint set is an \(\ell_0\) norm ball, Iterative-Hard-Thresholding-based methods have been developed recently, yet the extension is non-trivial even for general sets with \(O(d)\) extreme points. For settings where the feasible set has \(O(\mathrm{poly}(d))\) extreme points, we develop a novel robustness method based on a new condition we call the Robust Atom Selection Condition (RASC). When RASC is satisfied, our method converges linearly with a corresponding statistical error, with a sample complexity that scales correctly in the sparsity of the problem, rather than the ambient dimension as would be required by any approach based on robust mean estimation.
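As a rough illustration of combining a robust gradient estimate with a projection-free update, here is a sketch of a vanilla Frank-Wolfe step over an \(\ell_1\) ball driven by a coordinate-wise median-of-means gradient. This is a simplification: the paper analyzes Pairwise CG variants and the RASC condition, and the function names here are hypothetical.

```python
import numpy as np

def mom_gradient(X, y, theta, num_buckets=10, rng=None):
    """Coordinate-wise median-of-means of per-sample least-squares gradients."""
    rng = np.random.default_rng(rng)
    grads = 2 * X * (X @ theta - y)[:, None]   # per-sample gradients, shape (n, d)
    idx = rng.permutation(len(y))
    buckets = np.array_split(grads[idx], num_buckets)
    bucket_means = np.stack([b.mean(axis=0) for b in buckets])
    return np.median(bucket_means, axis=0)

def robust_cg_step(theta, grad, radius, t):
    """One conditional gradient (Frank-Wolfe) step over the l1 ball."""
    i = np.argmax(np.abs(grad))                # linear minimization oracle
    vertex = np.zeros_like(theta)
    vertex[i] = -radius * np.sign(grad[i])     # extreme point of the l1 ball
    step = 2.0 / (t + 2.0)                     # standard FW step-size schedule
    return (1 - step) * theta + step * vertex
```

Each iterate stays a convex combination of extreme points, so no projection is ever needed; robustness enters only through the gradient estimator.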
Low Rank Matrix Regression under Heavy Tailed Distribution
Liu Liu, Tianyang Li, Constantine Caramanis.
Submitted, 2019.
High Dimensional Robust M-Estimation: Arbitrary Corruption and Heavy Tails
Liu Liu, Tianyang Li, Constantine Caramanis.
arXiv preprint arXiv:1901.08237, 2019. [arXiv Link] [PDF]
We consider the problem of sparsity-constrained \(M\)-estimation when both explanatory and response variables have heavy tails (bounded 4th moments), or a fraction of arbitrary corruptions. We focus on the \(k\)-sparse, high-dimensional regime where the number of variables \(d\) and the sample size \(n\) are related through \(n \sim k \log d\). We define a natural condition we call the Robust Descent Condition (RDC), and show that if a gradient estimator satisfies the RDC, then Robust Hard Thresholding (IHT using this gradient estimator) is guaranteed to obtain good statistical rates. The contribution of this paper is in showing that the RDC is a flexible enough concept to recover known results and obtain new robustness results. Specifically, new results include: (a) for \(k\)-sparse high-dimensional linear and logistic regression with heavy-tailed (bounded 4th moment) explanatory and response variables, a linear-time-computable median-of-means gradient estimator satisfies the RDC, and hence Robust Hard Thresholding is minimax optimal; (b) when instead of heavy tails we have an \(O(1/\sqrt{k}\log(nd))\)-fraction of arbitrary corruptions in explanatory and response variables, a near linear-time computable trimmed gradient estimator satisfies the RDC, and hence Robust Hard Thresholding is minimax optimal.
We demonstrate the effectiveness of our approach in sparse linear regression, logistic regression, and sparse precision matrix estimation on synthetic and real-world US equities data.
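The Robust Hard Thresholding template itself is short: take a gradient step using a robust gradient estimator, then keep only the \(k\) largest-magnitude coordinates. Below is a minimal numpy sketch under a least-squares setup; robust_iht and hard_threshold are illustrative names, and the estimator plugged in (median-of-means, trimmed mean, etc.) is whatever satisfies the RDC.

```python
import numpy as np

def hard_threshold(theta, k):
    """Keep the k largest-magnitude coordinates, zero out the rest."""
    out = np.zeros_like(theta)
    top = np.argsort(np.abs(theta))[-k:]
    out[top] = theta[top]
    return out

def robust_iht(X, y, k, robust_grad, step=0.1, iters=200):
    """Iterative Hard Thresholding driven by a robust gradient estimator.

    `robust_grad(X, y, theta)` may be a median-of-means or trimmed-mean
    gradient; the paper shows any estimator satisfying the RDC yields the
    stated statistical rates.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta = hard_threshold(theta - step * robust_grad(X, y, theta), k)
    return theta
```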
High Dimensional Robust Sparse Regression
Liu Liu, Yanyao Shen, Tianyang Li, Constantine Caramanis.
AISTATS 2020. [arXiv Link] [PDF] [Slides]
We provide a novel – and to the best of our knowledge, the first – algorithm for high-dimensional sparse regression with a constant fraction of corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse parameters with sublinear sample complexity in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using this, we provide accurate estimators: when the covariance matrix in sparse regression is the identity, our error guarantee is near information-theoretically optimal. We then deal with robust sparse regression with an unknown structured covariance matrix. We propose a filtering algorithm built on a novel randomized outlier removal technique for robust sparse mean estimation that may be of interest in its own right: the filtering algorithm is flexible enough to deal with unknown covariance, and it is order-wise more computationally efficient than the ellipsoid algorithm. Using sublinear sample complexity, our algorithm achieves the best known (and first) error guarantee.
We demonstrate its effectiveness on large-scale sparse regression problems with arbitrary corruptions.
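To convey the flavor of score-based filtering, here is a generic one-round sketch in the spirit of the robust mean estimation literature. The paper's randomized removal procedure for sparse mean estimation is considerably more refined; this simplified version scores each point by its projection onto the top eigenvector of the empirical covariance and removes high-score points with higher probability.

```python
import numpy as np

def filter_step(points, rng=None):
    """One round of score-based outlier filtering (simplified sketch).

    Corrupted points that inflate the covariance along some direction get
    large scores, so they are removed more often than clean points in
    expectation.
    """
    rng = np.random.default_rng(rng)
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, -1]                         # top eigenvector
    scores = (centered @ v) ** 2
    keep = rng.random(len(points)) > scores / scores.max()
    return points[keep]
```

Iterating such rounds until the top eigenvalue is small yields a robust mean estimate of the surviving points.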
Statistical inference using SGD
Tianyang Li, Liu Liu, Anastasios Kyrillidis, Constantine Caramanis.
AAAI 2018. [arXiv Link] [PDF]
We present a novel method for frequentist statistical inference in M-estimation problems, based on stochastic gradient descent (SGD) with a fixed step size: we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the Ornstein-Uhlenbeck process suggests that such averages are asymptotically normal. From a practical perspective, our SGD-based inference procedure is a first-order method and is well-suited for large-scale problems. To show its merits, we apply it to both synthetic and real datasets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation.
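A minimal numpy sketch of the basic primitive, assuming a least-squares objective (the function name and hyperparameters are illustrative): run fixed-step SGD and return the average of the iterates.

```python
import numpy as np

def sgd_average(X, y, step=0.05, iters=2000, rng=None):
    """Fixed-step SGD on least squares; return the average of the iterates."""
    rng = np.random.default_rng(rng)
    theta = np.zeros(X.shape[1])
    total = np.zeros_like(theta)
    for _ in range(iters):
        i = rng.integers(len(y))                 # sample one data point
        grad = 2 * X[i] * (X[i] @ theta - y[i])  # its stochastic gradient
        theta -= step * grad
        total += theta
    return total / iters

# Several independent runs (or segments of one long run) give approximately
# normal averages; their empirical spread, properly scaled, calibrates
# confidence intervals for the estimator:
# reps = np.stack([sgd_average(X, y, rng=r) for r in range(30)])
```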
Approximate Newton-based statistical inference using only stochastic gradients
Tianyang Li, Anastasios Kyrillidis, Liu Liu, Constantine Caramanis.
arXiv preprint arXiv:1805.08920, 2018. [arXiv Link] [PDF]
We present a novel inference framework for convex empirical risk minimization, using approximate stochastic Newton steps. The proposed algorithm is based on the notion of finite differences and allows the approximation of a Hessian-vector product from first-order information. In theory, our method efficiently computes the statistical error covariance in M-estimation, both for unregularized convex learning problems and high-dimensional LASSO regression, without using exact second-order information or resampling the entire data set. In practice, we demonstrate the effectiveness of our framework on large-scale machine learning problems that go even beyond convexity: as a highlight, our work can be used to detect certain adversarial attacks on neural networks.
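The finite-difference primitive the abstract refers to fits in a few lines; here is a sketch using a central difference (the particular difference scheme and names are a choice for illustration):

```python
import numpy as np

def hessian_vector_product(grad_fn, theta, v, delta=1e-5):
    """Approximate H(theta) @ v using only gradient evaluations:
    Hv ~ (grad(theta + delta * v) - grad(theta - delta * v)) / (2 * delta).
    """
    return (grad_fn(theta + delta * v) - grad_fn(theta - delta * v)) / (2 * delta)
```

Repeated Hessian-vector products of this kind let one solve Newton-type linear systems (e.g., via conjugate gradient) and estimate the error covariance without ever forming the Hessian.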
