Assistant Professor in the Statistics Department (by courtesy) at Purdue.

Lawson Building 2142-J, West Lafayette, IN 47907, phone: 765-496-6757

e-mail: jhonorio at purdue.edu

Modern machine learning (ML) problems are combinatorial and non-convex, and theoretical guarantees for such problems are quite limited. Furthermore, while quantitative guarantees (e.g., small test error) have been studied, qualitative guarantees (e.g., correctness of clustering) are mostly lacking. My long-term research goal is to uncover the general foundations of ML and optimization that drive the empirical success across many specific combinatorial and non-convex ML problems. I aim to develop a set of optimization-theoretic frameworks and tools to bridge these gaps and to further our understanding of continuous (possibly non-convex) relaxations of combinatorial problems, as well as our knowledge of non-convexity.

My aim is to generate correct, computationally efficient and statistically efficient algorithms for high-dimensional ML problems. My research group has produced breakthroughs not only on classical worst-case NP-hard problems, such as learning and inference in structured prediction, community detection and learning Bayesian networks, but also in areas of recent interest such as fairness, meta learning and federated learning. [vita]

Prior to joining Purdue, I was a postdoctoral associate at MIT CSAIL, working with Tommi Jaakkola. My Erdős number is 3: Jean Honorio → Tommi Jaakkola → Noga Alon → Paul Erdős.

*Current.* Adarsh Barik (CS PhD), Site Bai (CS PhD), Chuyang Ke (CS PhD), Hanbyul Lee (Stat PhD), Wenjie Li (Stat PhD), Deepak Maurya (CS PhD).

*Past.* Kevin Bello (CS PhD 2021), Asish Ghoshal (CS PhD 2019).

*Other co-authors.* Imon Banerjee (Stat PhD), Abi Komanduru (Eng PhD), Zitao Li (CS PhD), Huiming Xie (Stat PhD), Qiuling Xu (CS PhD).

*Other past co-authors.* Donald Adams (CS BS), Meimei Liu (Stat PhD), Raphael Meyer (CS BS), Yuki Ohnishi (Stat PhD), Keehwan Park (CS MS), Zhanyu Wang (Stat PhD), Zhaosen Wang (CS MS), Yixi Xu (Stat PhD), Xiaochen Yang (Stat PhD), Qian Zhang (Stat MS), Yilin Zheng (CS MS).

*Prospective.* Here is a note for students who are considering working with me.

Barik A.,

(Under submission.)

Invex Programs: First Order Algorithms and Their Convergence.

Barik A., Sra S.,

(Under submission.)

Support Recovery in Sparse PCA with Non-Random Missing Data. (Preprint)

Lee H., Song Q.,

(Under submission.)

Learning Against Distributional Uncertainty: On the Trade-off Between Robustness and Specificity. (Preprint)

Wang S., Wang H.,

(Under submission.)

Distributional Robustness Bounds Generalization Errors. (Preprint)

Wang S., Wang H.,

(Under submission.)

A Novel Plug-and-Play Approach for Adversarially Robust Generalization. (Preprint)

Maurya D., Barik A.,

(Under submission.)

Dual Convexified Convolutional Neural Networks. (Preprint)

Bai S., Ke C.,

(Under submission.)

Federated X-Armed Bandit. (Preprint)

Li W., Song Q.,

(Under submission.)

Provable Guarantees for Sparsity Recovery with Deterministic Missing Data Patterns. (Preprint)

Ke C.,

(Under submission.)

A Theoretical Study of the Effects of Adversarial Attacks on Sparse Regression. (Preprint)

Maurya D.,

(Under submission.)

Meta Learning for High-dimensional Ising Model Selection Using ℓ

Xie H.,

(Under submission.)

Meta Sparse Principal Component Analysis. (Preprint)

Banerjee I.,

(Under submission.)

Exact Support Recovery in Federated Regression with One-shot Communication. (Preprint)

Barik A.,

(Under submission.)

Exact Inference with Latent Variables in an Arbitrary Domain. (Preprint)

Ke C.,

(Under submission.)

Ke C.,

Remove Model Backdoors via Importance Driven Cloning.

Xu Q., Tao G.,

Provable Computational and Statistical Guarantees for Efficient Learning of Continuous-Action Graphical Games.

Barik A.,

Lee H., Song Q.,

Exact Partitioning of High-order Models with a Novel Convex Tensor Cone Relaxation.

Ke C.,

Sparse Mixed Linear Regression with Guarantees: Taming an Intractable Problem with Invex Relaxation.

Barik A.,

A Simple Unified Framework for High Dimensional Bandit Problems.

Li W., Barik A.,

On the Fundamental Limits of Exact Inference in Structured Prediction.

Lee H., Bello K.,

Exact Partitioning of High-order Planted Models with a Tensor Nuclear Norm Constraint.

Ke C.,

Provable Sample Complexity Guarantees for Learning of Continuous-Action Graphical Games with Nonparametric Utilities.

Barik A.,

Information Theoretic Limits for Standard and One-Bit Compressed Sensing with Graph-Structured Sparsity.

Barik A.,

A Thorough View of Exact Inference in Graphs from the Degree-4 Sum-of-Squares Hierarchy.

Bello K., Ke C.,

Federated Myopic Community Detection with One-shot Communication.

Ke C.,

Barik A.,

Inverse Reinforcement Learning in the Continuous Setting with Formal Guarantees.

Dexter G., Bello K.,

A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning.

Komanduru A.,

Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation.

Zhang Q., Zheng Y.,

A Le Cam Type Bound for Adversarial Learning and Applications.

Bello K., Xu Q.,

Information Theoretic Limits of Exact Recovery in Sub-hypergraph Models for Community Detection.

Liang J., Ke C.,

Information-Theoretic Bounds for Integral Estimation.

Adams D., Barik A.,

First Order Methods take Exponential Time to Converge to Global Minimizers of Non-Convex Functions.

Kesari K.,

Information-Theoretic Lower Bounds for Zero-Order Stochastic Gradient Estimation.

Alabdulkareem A.,

Regularized Loss Minimizers with Local Data Perturbation: Consistency and Data Irrecoverability.

Li Z.,

The Sample Complexity of Meta Sparse Regression.

Wang Z.,

Novel Change of Measure Inequalities with Applications to PAC-Bayesian Bounds and Monte Carlo Estimation.

Ohnishi Y.,

Randomized Deep Structured Prediction for Discourse-Level Processing.

Widmoser M., Pacheco M.,

PrivSyn: Differentially Private Data Synthesis.

Zhang Z., Wang T.,

Direct Estimation of Difference Between Structural Equation Models in High Dimensions. (Preprint)

Ghoshal A., Bello K.,

Technical report.

Bello K.,

Provable Efficient Skeleton Learning of Encodable Discrete Bayes Nets in Poly-Time and Sample Complexity.

Barik A.,

Minimax Bounds for Structured Prediction Based on Factor Graphs.

Bello K., Ghoshal A.,

Information Theoretic Sample Complexity Lower Bound for Feed-Forward Fully-Connected Deep Networks.

Yang X.,

Technical report.

Bello K.,

Learning Bayesian Networks with Low Rank Conditional Probability Tables.

Barik A.,

On the Correctness and Sample Complexity of Inverse Reinforcement Learning.

Komanduru A.,

Reconstructing a Bounded-Degree Directed Tree Using Path Queries.

Wang Z.,

Optimality Implies Kernel Sum Classifiers are Statistically Efficient.

Meyer R.,

Cost-Aware Learning for Improved Identifiability with Multiple Experiments.

Guo L.,

Bello K.,

Computationally and Statistically Efficient Learning of Causal Bayes Nets Using Path Queries.

Bello K.,

Information-Theoretic Limits for Community Detection in Network Models.

Ke C.,

Statistically and Computationally Efficient Variance Estimator for Kernel Ridge Regression.

Liu M.,

Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time. (Long presentation)

Ghoshal A.,

Learning Linear Structural Equation Models in Polynomial Time and Sample Complexity.

Ghoshal A.,

Learning Sparse Polymatrix Games in Polynomial Time and Sample Complexity.

Ghoshal A.,

On the Statistical Efficiency of Compositional Nonparametric Prediction.

Xu Y.,

The Error Probability of Random Fourier Features is Dimensionality Independent.

Li Y.,

Technical report.

Ghoshal A.,

On the Sample Complexity of Learning Graphical Games.

Information Theoretic Limits for Linear Prediction with Graph-Structured Sparsity.

Barik A.,

Learning Graphical Games from Behavioral Data: Sufficient and Necessary Conditions.

Ghoshal A.,

Information-Theoretic Limits of Bayesian Network Structure Learning.

Ghoshal A.,

Ghoshal A.,

Structured Prediction: From Gaussian Perturbations to Linear-Time Principled Algorithms.

Information-Theoretic Lower Bounds for Recovery of Diffusion Network Structures.

Park K.,

Variable Selection in Gaussian Markov Random Fields.

Invited book chapter in

Edited by Aravkin A., Deng L., Heigold G., Jebara T., Kanevski D., Wright S. (to be published in December 2016)

On the Statistical Efficiency of ℓ

Technical report. [code]

Predictive Sparse Modeling of fMRI Data for Improved Classification, Regression, and Visualization Using the k-Support Norm.

Belilovsky E., Gkirtzou K., Misyrlis M., Konova A.,

Integration of PCA with a Novel Machine Learning Method for Reparameterization and Assisted History Matching Geologically Complex Reservoirs.

Tight Bounds for the Expected Risk of Linear Classifiers and PAC-Bayes Finite-Sample Guarantees.

Classification on Brain Functional Magnetic Resonance Imaging: Dimensionality, Sample Size, Subject Variability and Noise.

Invited book chapter in

Edited by Chen C.,

Improving Interpretability of Graphical Models in fMRI Analysis via Variable-Selection.

Predicting Cross-task Behavioral Variables from fMRI Data Using the k-Support Norm.

Misyrlis M., Konova A., Blaschko M.,

Medical Image Computing and Computer-Assisted Intervention.

Methylphenidate Enhances Executive Function and Optimizes Prefrontal Function in Both Health and Cocaine Addiction.

Moeller S.,

Integration of Principal Component Analysis and Streamline Information for the History Matching of Channelized Reservoirs.

Chen C., Gao G.,

Two-Sided Exponential Concentration Bounds for Bayes Error Rate and Shannon Entropy.

fMRI Analysis of Cocaine Addiction Using k-Support Sparsity.

Gkirtzou K.,

fMRI Analysis with Sparse Weisfeiler-Lehman Graph Statistics.

Gkirtzou K.,

Medical Image Computing and Computer-Assisted Intervention,

Variable Selection for Gaussian Graphical Models.

Can a Single Brain Region Predict a Disorder?

Two-person Interaction Detection Using Body-Pose Features and Multiple Instance Learning.

Yun K.,

IEEE Computer Vision and Pattern Recognition,

Dopaminergic Involvement During Mental Fatigue in Health and Cocaine Addiction.

Moeller S., Tomasi D.,

Enhanced Midbrain Response at 6-month Follow-up in Cocaine Addiction, Association with Reduced Drug-related Choice.

Moeller S., Tomasi D., Woicik P., Maloney T., Alia-Klein N.,

Digital Analysis and Visualization of Swimming Motion.

Kirmizibayrak C.,

Digital Analysis and Visualization of Swimming Motion.

Kirmizibayrak C.,

Conference on Computer Animation and Social Agents,

Dopaminergic contribution to endogenous motivation during cognitive control breakdown.

Moeller S., Tomasi D.,

Simple Fully Automated Group Classification on Brain fMRI.

Disrupted Functional Connectivity with Dopaminergic Midbrain in Cocaine Abusers.

Tomasi D., Volkow N., Wang R.,

Oral Methylphenidate Normalizes Cingulate Activity in Cocaine Addiction During a Salient Cognitive Task.

Goldstein R., Woicik P., Maloney T., Tomasi D., Alia-Klein N., Shan J.,

Learning Brain fMRI Structure Through Sparseness and Local Constancy.

Neural Information Processing Systems,

A Functional Geometry of fMRI BOLD Signal Interactions.

Langs G., Samaras D., Paragios N.,

Neural Information Processing Systems,

Dopaminergic Response to Drug Words in Cocaine Addiction.

Goldstein R., Tomasi D., Alia-Klein N.,

Anterior Cingulate Cortex Hypoactivations to an Emotionally Salient Task in Cocaine Addiction.

Goldstein R., Alia-Klein N., Tomasi D.,

Langs G., Samaras D., Paragios N.,