I am currently a Member of Technical Staff at Anthropic working on large language
models. I was previously a Staff Research Engineer at Google DeepMind.
Before that, I was an AI Resident at Google Brain, and a Research Intern at NVIDIA
with Anima Anandkumar.
I did my bachelor's degree in computer science and mathematics at Columbia University, where I studied machine learning and robotics at the Columbia Creative Machines Lab.
I was also an intern at NASA JPL in Summer 2019.
I'm also a pianist and enjoy mountaineering and thru-hiking. I play a lot of chamber music, and I've performed at Carnegie Hall, Music Mountain, Apple Hill, and Kinhaven.
My research currently focuses on making LLMs cheaper to train and serve. I believe that, in today's scaling-oriented ML environment, the most important advances come from systems engineering rather than from algorithmic innovation. Making LLMs cheaper and more scalable also lets more people participate meaningfully in the field.
At Google, I published a number of research papers, including "Program Synthesis with Large Language Models", which explores how well large language models can synthesize programs in real programming languages, and "Structured Denoising Diffusion Models in Discrete State Spaces", which introduces a class of diffusion models for discrete data like text.
Training Google's LLMs: I also helped to train, evaluate, and serve most of Google's recent LLMs, including Gemini 1.0, Gemini 1.5, Gemini 2.0, Gemini 2.5, Gemini 3, PaLM, and PaLM 2. I led code capabilities work for several of these models and helped build core training and serving infrastructure. I was also a core contributor to Bard (now Gemini), where I led the code capabilities workstream.
Developer Tooling at Google: I contributed to many of Google's internal ML-powered developer tools, including LLM-based code completion (similar to a custom, Google-internal version of GitHub Copilot) and ML-assisted code review, which uses LLMs to predict edits that resolve code review comments for Google developers (here's a more detailed paper).
Planning with LLMs: I have done some early work exploring how large language models can solve algorithmic reasoning tasks, for instance by using an intermediate scratchpad to perform open-ended calculations in "Show Your Work: Scratchpads for Intermediate Computation with Language Models" and by chaining LLMs together into Cascades.
AI for Science: As part of the Blueshift team, I helped build the first model to exceed 90% on Hendrycks' MATH and the first model to achieve silver-medal performance at the International Mathematical Olympiad (IMO). My work on LLM inference was inspired by a desire to scale up inference-time compute for scientific reasoning.
How To Scale Your Model: I recently co-authored "How To Scale Your Model", an LLM systems textbook that explains how scaling LLM training and inference works at a systems level. I hope it encourages more researchers to study these core systems problems.
At Columbia, I was a researcher at the Columbia Creative Machines Lab, where I developed the Titan Simulation Library, a GPU-accelerated physics simulation library that is widely used in research today. I published a first-author paper on Titan at ICRA 2020. I also worked at the Columbia Plasma Physics Lab, where I published a first-author paper on stellarator coil design.
I was also the head teaching assistant for COMS 4771 Machine Learning at Columbia University, and in past years I taught MATH 3027 and 3028, Ordinary and Partial Differential Equations.
I led the aerodynamics and software teams for the 2017 Columbia Space Initiative entry that won first place in the NASA Langley aerospace design challenge. I also won the Grand Prize in the 2019 NASA Data Visualization and Storytelling Competition and presented my work at AGU 2019 to the head of NASA Science.
My work has been presented at the 2019 DARPA Lifelong Learning Machines (L2M) conference, several Gordon Research Conferences (GRC), an American Physical Society (APS) meeting, ISHW 2017, and AGU.
You can find a full list of my publications here.
In a previous life I worked on a variety of open-source projects.
Coral Programming Language: a Python compiler with gradual type inference that uses type hints to compile Python to native machine code. Written in OCaml.
Titan Simulation Library: a physics simulation library that uses NVIDIA CUDA to accelerate physics and machine learning research. Widely used within the lab for experiments and simulation.
AutoPPL: a C++ template library for high-performance probabilistic programming, supporting Metropolis-Hastings and NUTS sampling with fast compile-time probabilistic model building.
NASA GPU Data Visualization Library: while at NASA JPL, I built a set of GPU-accelerated data visualization libraries that allow real-time modeling and manipulation of large Earth science datasets.
More on GitHub: to see more projects, visit my GitHub.
I also play piano in my free time, particularly chamber music. You can listen to some of my music here on SoundCloud.
I've studied with Naydene Bowder, George Lopez, Michael Skelly, and Julia Hamos, and performed at masterclasses with Wolfram Koessel, Anne-Marie McDermott, Ray Chen, Frank Glazer, and others. At Columbia I performed with the Columbia Music Performance Program.
I was a winner of the Bay Chamber Competition, the Bangor Symphony Orchestra Concerto Competition, and the Columbia Music Performance Program Carnegie Hall Competition. I was also a finalist in the A. Ramon Rivera Competition.