
Lily H. Zhang

Machine Learning Researcher

I'm a PhD candidate at New York University advised by Professor Rajesh Ranganath. I'm broadly interested in advancing the reliability of machine learning models. This includes controllable generation and alignment of generative models, as well as out-of-distribution detection and generalization. I'm grateful to be a DeepMind Scholar, a Visiting Researcher at Facebook AI Research, and a JP Morgan Chase PhD Fellow.

I graduated from Harvard University with a bachelor's in statistics (primary) and computer science (secondary). I've worked for several machine learning start-ups and conducted LLM research at Google.

I've had the pleasure to work with Professors Kyle Cranmer (physics), Kyunghyun Cho (computer science), Don Rubin (statistics), Gary King (quantitative social science), Jukka-Pekka "JP" Onnela (biostatistics), John M. Higgins (pathology, systems biology), and Dustin Tingley (government, political science).

Feel free to contact me for research collaborations or other engagements.

Update (Fall 2024): I'm currently on the job market for research scientist or postdoc positions.

Publications

Preference Learning Algorithms Do Not Learn Preference Rankings
Angelica Chen, Sadhika Malladi, Lily H. Zhang, Xinyi Chen, Qiuyi Zhang, Rajesh Ranganath, and Kyunghyun Cho.
Advances in Neural Information Processing Systems (NeurIPS), 2024.
[Paper] [Code]

Robust Anomaly Detection for Particle Physics Using Multi-Background Representation Learning
Abhijith Gandrakota*, Lily Zhang*, Aahlad Puli, Kyle Cranmer, Jennifer Ngadiuba, Rajesh Ranganath, and Nhan Tran.
Machine Learning: Science and Technology (MLST), 2024.
[Paper]

Towards Minimal Targeted Updates of Language Models with Targeted Negative Training
Lily H. Zhang, Rajesh Ranganath, and Arya Tafvizi.
Transactions on Machine Learning Research (TMLR), 2024.
[Paper] [Code]

Don't Blame Dataset Shift! Shortcut Learning due to Gradients and Cross Entropy
Aahlad Puli, Lily H. Zhang, Yoav Wald, and Rajesh Ranganath.
Advances in Neural Information Processing Systems (NeurIPS), 2023.
[Paper]

When More is Less: Incorporating Additional Datasets Can Hurt Performance By Introducing Spurious Correlations
Rhys Compton, Lily H. Zhang, Aahlad Puli, and Rajesh Ranganath.
Machine Learning for Healthcare (MLHC), 2023.
[Paper] [Code]

Robustness to Spurious Correlations Improves Semantic Out-of-Distribution Detection
Lily H. Zhang and Rajesh Ranganath.
Association for the Advancement of Artificial Intelligence (AAAI), 2023.
[Paper] [Code]

Set Norm and Equivariant Residual Connections: Putting the Deep in Deep Sets
Lily H. Zhang*, Veronica Tozzo*, John M. Higgins, and Rajesh Ranganath.
International Conference on Machine Learning (ICML), 2022.
[Paper] [Code]

Out-of-Distribution Generalization in the Presence of Nuisance-Induced Spurious Correlations
Aahlad Puli, Lily H. Zhang, Eric Oermann, and Rajesh Ranganath.
International Conference on Learning Representations (ICLR), 2022.
[Paper] [Code]

Understanding Out-of-Distribution Detection with Deep Generative Models
Lily H. Zhang, Mark Goldstein, and Rajesh Ranganath.
International Conference on Machine Learning (ICML), 2021.
[Paper] [Talk]

Rapid Model Comparison by Amortizing Across Models
Lily H. Zhang and Michael Hughes.
Proceedings of the 2nd Symposium on Advances in Approximate Bayesian Inference (AABI), 2020.
[Paper] [Code]

Education

New York University, New York, NY. Candidate for Doctor of Philosophy in Data Science. Aug. 2020 – Summer 2025 (projected).

Harvard College, Cambridge, MA. Bachelor of Arts in Statistics and Computer Science. Magna Cum Laude with High Honors. Aug. 2013 – May 2017.

Honors & Awards

JP Morgan PhD Fellow, 2024.
Meta AI Mentorship Fellow, 2024.
DeepMind Fellow, 2020.
Phi Beta Kappa, 2017.

Patents

Graphical user interface systems for generating hierarchical data extraction training dataset.