About me
I'm a 4th-year PhD student at Brown University, advised by Lorin Crawford, and a Student Researcher at Google DeepMind on the Protein Function team in the Bay Area, CA. I work on biologically and physically inspired models, with a focus on learning meaningful, data-efficient representations for protein problems.
Previously, I was a Machine Learning intern at Prescient Design and a research intern at IBM Research. I completed my undergrad in Electrical Engineering and Computer Science at UC Berkeley in 2021.
I also spend time co-organizing a Machine Learning x Proteins seminar series and co-running after-school computational science workshops in Providence public schools with some of my wonderful peers.
I'm excited about the future of Machine Learning for scientific domains, especially in biology. My reading and work lean into the mathematical and algorithmic principles that we can use to explain structure and mechanisms in science. As I learn more about math and proteins, I'm trying to make notes and collect useful links. You can find notes here and interesting reads here. Some of the workflows that I've created as I've needed them can be found on my GitHub.
Research Projects
- Logit Subspace Diffusion for Protein Sequence Design.
With Nathan Frey, Andrew Watkins, and Jae Hyeyeon Lee.
Developed a score-based stochastic differential equation (SDE) framework that diffuses over zero-identity component simplexes of protein sequences for antibody design.
- Diffusion Models for Sequence-Structure Co-Design. arxiv
With Kevin Yang and Lorin Crawford.
Investigating the correlation between sequence and structure distributions through diffusion-based generative modeling for various downstream protein prediction tasks.
Presented at LMRL @ NeurIPS 2022. Spotlight.
- Reprogramming Pretrained Language Models for Protein Sequence Representation Learning. arxiv
With Pin-Yu Chen and Payel Das.
Introduces a representation learning framework in which large language models are reprogrammed for alternative tasks, performing well in settings with little training data.
In review at Nature Biomedical Engineering.
- Representation Learning for Molecular Property Prediction. arxiv
With Pin-Yu Chen and Payel Das.
Introduces dictionary learning to utilize learned representations from pretrained deep models for functional and structural molecular property prediction.
Presented at WiML, LMRL @ NeurIPS 2020.