I work on machine learning and statistical methods for data-driven design: using models learned from data to design novel objects with desired properties, such as proteins or small molecules. How can we quantify uncertainty or estimate risk when we deploy design algorithms? How can we understand the inductive biases of generative models used for design? I am particularly interested in these questions in the context of protein engineering.

Some highlighted work is below. See Google Scholar for a complete record.

* denotes equal contribution. (\(\alpha\)-\(\beta\)) denotes alphabetical ordering.

  • (\(\alpha\)-\(\beta\)) Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang, Michael I. Jordan, and Tijana Zrnic. 2023. Prediction-powered inference. Science, 382, 669–674. publication arXiv code talk

  • Clara Fannjiang, Stephen Bates, Anastasios N. Angelopoulos, Jennifer Listgarten, and Michael I. Jordan. 2022. Conformal prediction under feedback covariate shift for biomolecular design. Proceedings of the National Academy of Sciences, 119(43), e2204569119. arXiv publication PDF code bibtex talk

  • Chloe Hsu, Hunter Nisonoff, Clara Fannjiang, and Jennifer Listgarten. 2022. Learning protein fitness models from evolutionary and assay-labeled data. Nature Biotechnology, 40, 1114–1122. PDF publication bibtex

  • Clara Fannjiang and Jennifer Listgarten. 2020. Autofocused oracles for model-based design. NeurIPS. arXiv proceedings code bibtex

  • Clara Fannjiang, T. Aran Mooney, Seth Cones, David Mann, K. Alex Shorter, and Kakani Katija. 2019. Augmenting biologging with supervised machine learning to study in situ behavior of the medusa Chrysaora fuscescens. Journal of Experimental Biology, 222, jeb207654. PDF publication jellyfish footage code bibtex

  • Clara Fannjiang. 2013. Optimal arrays for compressed sensing in snapshot-mode radio interferometry. Astronomy & Astrophysics, 559, A73. PDF publication bibtex