I am currently a Member of Technical Staff at Anthropic working on large
language models. I was previously a Staff Research Engineer at Google DeepMind.
Before that, I was an AI Resident at Google Brain and a Research Intern at
NVIDIA working with Anima Anandkumar.
I earned my bachelor's degree in computer science and mathematics at Columbia
University, where I studied machine learning and robotics at the Columbia
Creative Machines Lab. I was also an intern at NASA JPL in the summer of 2019.
Outside of work, I'm a pianist and enjoy mountaineering and thru-hiking. I play
a lot of chamber music, and I've performed at Carnegie Hall, Music Mountain,
Apple Hill, and Kinhaven.
My research currently focuses on making LLMs cheaper to train and serve. I
believe that, in today's scaling-oriented ML environment, the most important
advances come from systems engineering rather than from algorithmic innovation.
Making LLMs cheaper and more scalable also lets more people participate
meaningfully in the field.
At Google, I published a number of research papers, including
"Program Synthesis with Large Language Models", which explores how well large
language models can synthesize programs in real programming languages, and
"Structured Denoising Diffusion Models in Discrete State Spaces", which
introduces a class of diffusion models for discrete data like text.
Training Google's LLMs: I helped train, evaluate, and serve most of Google's
recent LLMs, including Gemini 1.0, Gemini 1.5, Gemini 2.0, Gemini 2.5, Gemini 3,
PaLM, and PaLM 2. I led code capabilities work for several of these models and
helped build core training and serving infrastructure. I was also a core
contributor to Bard (now Gemini), leading its code capabilities workstream.
Developer Tooling at Google: I contributed to many of Google's internal
ML-powered developer tools, including LLM-based code completion, similar to a
custom, Google-internal version of GitHub Copilot, and ML-powered code review,
which uses LLMs to predict edits that resolve code review comments for Google
developers (here's a more detailed paper).
Planning with LLMs: I have done some early work exploring how large language
models can solve algorithmic reasoning tasks, for instance by using an
intermediate scratchpad to perform open-ended calculations in "Show Your Work:
Scratchpads for Intermediate Computation with Language Models" and by chaining
LLMs together into Cascades.
AI for Science: As part of the Blueshift team, I helped build the first model
to exceed 90% on Hendrycks' MATH and the first model to achieve a silver medal
at the International Mathematical Olympiad (IMO). My work on LLM inference was
inspired by a desire to scale up inference-time compute for scientific
reasoning.
How To Scale Your Model: I recently co-authored "How To Scale Your Model", an
LLM systems textbook of sorts that explains how scaling LLM training and
inference works at a systems level. I hope it encourages more researchers to
study these core systems problems.
At Columbia, I was a researcher at the Columbia Creative Machines Lab, where I
developed the Titan Simulation Library, a GPU-accelerated physics simulation
library that is widely used in research today. I published a first-author paper
on Titan at ICRA 2020. I also worked at the Columbia Plasma Physics Lab, where
I published a first-author paper on stellarator coil design.
I was also the head teaching assistant for COMS 4771 Machine Learning at
Columbia, and in past years I taught MATH 3027 and 3028, Ordinary and Partial
Differential Equations.
I led the aerodynamics and software teams for the 2017 Columbia Space
Initiative entry that won first place in the NASA Langley aerospace design
challenge. I also won the Grand Prize in the 2019 NASA Data Visualization and
Storytelling Competition and presented that work at AGU 2019 to the head of
NASA Science.
My work has been presented at the 2019 DARPA Lifelong Learning Machines (L2M)
conference, several Gordon Research Conferences (GRC), an American Physical
Society (APS) meeting, ISHW 2017, and AGU.
You can find a full list of my publications
here.
In a previous life, I worked on a variety of open-source projects.
Coral Programming Language: a gradually type-inferred Python compiler with
type hints that compiles Python to native machine code. Written in OCaml.
Titan Simulation Library: a physics simulation library that uses NVIDIA CUDA
to accelerate physics and machine learning research. Widely used within the lab
for experiments and simulations.
AutoPPL: a C++ template library for high-performance probabilistic
programming, supporting Metropolis-Hastings and NUTS sampling and fast
compile-time probabilistic model building.
NASA GPU Data Visualization Library: while at NASA JPL, I built a set of
GPU-accelerated data visualization libraries that allow real-time modeling and
manipulation of large Earth science datasets.
More on GitHub: to see more projects, visit my GitHub.
I also play piano in my free time, particularly chamber music. You can listen
to some of my music
here on SoundCloud.
I've studied with Naydene Bowder, George Lopez, Michael Skelly, and Julia
Hamos, and performed in masterclasses with Wolfram Koessel, Anne-Marie
McDermott, Ray Chen, Frank Glazer, and others. At Columbia, I performed with
the Columbia Music Performance Program.
I was a winner of the Bay Chamber Competition, the Bangor Symphony Orchestra
Concerto Competition, and the Columbia Music Performance Program Carnegie Hall
Competition. I was also a finalist in the A. Ramon Rivera Competition.