Eric Todd

Hello! I'm a fourth-year PhD student at Northeastern University advised by David Bau. Before my PhD, I studied Applied and Computational Mathematics (ACME) at Brigham Young University (BYU).

I'm interested in understanding the learned structure inside large neural networks and how their internal representations enable their impressive generalization capabilities.

My research interests generally include machine learning, interpretability, and deep learning as a science. I'm particularly interested in research on in-context learning (ICL) and causal abstraction in neural networks.


Selected Publications

In-Context Algebra
Eric Todd, Jannik Brinkmann, Rohit Gandikota, David Bau.
The Fourteenth International Conference on Learning Representations (ICLR), 2026.

We investigate the mechanisms that arise when transformers are trained to solve arithmetic on sequences where tokens are variables whose meaning is determined only through their interactions. While prior work has found that transformers develop geometric embeddings that mirror algebraic structure, those previous findings emerge from settings where arithmetic-valued tokens have fixed meanings. We devise a new task in which the assignment of symbols to specific algebraic elements varies from one sequence to another. Despite this challenging setup, transformers achieve near-perfect accuracy on the task and even generalize to unseen groups. We develop targeted data distributions to create causal tests of a set of hypothesized mechanisms, and we isolate three mechanisms that models consistently learn: commutative copying, where a dedicated head copies answers; identity element recognition, which distinguishes identity-containing facts; and closure-based cancellation, which tracks group membership to constrain valid answers. Complementary to the geometric representations found in fixed-symbol settings, our findings show that models develop symbolic reasoning mechanisms when trained to reason in-context with variables whose meanings are not fixed.
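
To make the task setup concrete, here is a minimal, hypothetical sketch (not the paper's data pipeline) of how such sequences could be generated: symbols are reassigned to elements of a small group, Z_5 under addition in this example, independently for every sequence, so a symbol's meaning is recoverable only from its interactions in context. The symbol vocabulary, group choice, and sequence formatting below are illustrative assumptions.

import random

SYMBOLS = list("abcde")   # hypothetical variable tokens
N = 5                     # elements of the cyclic group Z_5 (under addition mod 5)

def make_sequence(num_facts=12, seed=None):
    """Generate one sequence with its own hidden symbol-to-element assignment."""
    rng = random.Random(seed)
    # Fresh, random bijection between symbols and group elements for this sequence only.
    assign = dict(zip(SYMBOLS, rng.sample(range(N), N)))
    sym_of = {e: s for s, e in assign.items()}
    facts = []
    for _ in range(num_facts):
        x, y = rng.choice(SYMBOLS), rng.choice(SYMBOLS)
        z = (assign[x] + assign[y]) % N
        # The answer is also written as a symbol, so the hidden assignment never appears.
        facts.append(f"{x}*{y}={sym_of[z]}")
    return " , ".join(facts)

# Same symbols, different hidden assignments: meaning must be inferred from context alone.
print(make_sequence(seed=0))
print(make_sequence(seed=1))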

Function Vectors in Large Language Models
Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau.
The Twelfth International Conference on Learning Representations (ICLR), 2024.

We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer language models (LMs). Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context, i.e., they trigger execution of the task even in contexts, such as zero-shot and natural text settings, that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects across settings in middle layers. We investigate the internal structure of FVs and find that while they often contain information that encodes the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs, and find that to some extent they can be summed to create vectors that trigger new complex tasks. Our findings show that compact, causal internal vector representations of function abstractions can be explicitly extracted from LLMs.
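
As a rough illustration of the kind of intervention described above, the sketch below adds a precomputed vector into the hidden state at a middle layer of GPT-2 during a zero-shot forward pass. This is a minimal sketch under stated assumptions, not the paper's code: the layer index is an arbitrary choice, and the fv tensor is a zero placeholder standing in for a function vector that would, in practice, come from the outputs of the identified attention heads averaged over ICL prompts.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2")
tok = GPT2Tokenizer.from_pretrained("gpt2")
model.eval()

layer_idx = 6                           # hypothetical "middle" layer
fv = torch.zeros(model.config.n_embd)   # placeholder for a precomputed function vector

def add_fv(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0]
    # Add the FV to the last token's hidden state (applied at each decoding step here).
    hidden[:, -1, :] = hidden[:, -1, :] + fv
    return (hidden,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(add_fv)

prompt = "banana:"                      # zero-shot query with no ICL demonstrations
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=3)
print(tok.decode(out[0]))

handle.remove()                         # restore the unmodified model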


News

2026

January 2026

In our new preprint, In-Context Algebra, we design an in-context learning setting where tokens have no fixed meaning, allowing us to study how context alone imparts meaning. We find rich structure beyond fuzzy induction-style copying. [Twitter]

2025

October 2025

Reviewed for the ICLR 2026 Main Conference.
Had a lot of fun attending COLM 2025 in Montreal!

August 2025

Had a great time attending NEMI again this year. We had 200+ people come to Northeastern, and it was fun to see the exciting research others are working on.

July 2025

Big life update: My wife gave birth to our twins! It's been very busy, but also very fun having these two new little people at home.
Our Dual-Route Model of Induction paper was accepted at COLM 2025!

April 2025

Had a fun time attending NENLP 2025 at Yale.
In a new preprint, we find that we can separate how LLMs do verbatim copying of tokens from how they copy word meanings. This was a fun project led by Sheridan, and it helps clarify how "induction" in LLMs can also happen over abstract contextual information rather than just literal token values.

January 2025

Our NNsight and NDIF paper was accepted to ICLR 2025! I'm really excited about this framework for enabling research on large-scale AI, and about the mission of NDIF in general.
Contributed to a new review-style preprint (with many others!), Open Problems in Mechanistic Interpretability, which details what kinds of problems we're currently thinking about as interpretability researchers and also what kinds of questions we still don't know the answers to.

2024

August 2024

Our causal interpretability survey is out on arXiv. As interpretability researchers, we're still trying to understand the right level of abstraction for thinking about neural network computation, but causal methods have become a promising approach for studying it.

July 2024

Our preprint about NNsight and NDIF is out on arXiv. I'm excited about this framework for enabling access to the internal computations of large foundation models!

May 2024

Gave an invited talk on Function Vectors at the Princeton Neuroscience Institute.
Had a great time presenting our Function Vectors work at ICLR 2024.

January 2024

Our Function Vectors paper was accepted to ICLR 2024!