About Me

Hi, my name is Eric Todd. I'm a third-year PhD student at Northeastern University advised by Professor David Bau. Prior to beginning my PhD, I studied Applied and Computational Mathematics at Brigham Young University (BYU).

I'm interested in understanding the learned structure of large neural networks, and how their internal representations enable their impressive generalization capabilities.

My research interests generally include machine learning and interpretability. I'm particularly excited by generative models and their applications in natural language and computer vision.


News

August 2024
Our causal interpretability survey is out on arXiv. As interpretability researchers, we're still trying to find the right level of abstraction for thinking about neural network computation, but causal methods have emerged as a promising approach for studying it.
July 2024
Our preprint about NNsight and NDIF is out on arXiv. I'm excited about this framework for enabling access to the internal computations of large foundation models!
May 2024
Gave an invited talk on Function Vectors at the Princeton Neuroscience Institute.
May 2024
Had a great time presenting our Function Vectors work at ICLR 2024.
January 2024
Our Function Vectors paper was accepted to ICLR!

Selected Publications

NNsight and NDIF: Democratizing Access to Foundation Model Internals. Jaden Fiotto-Kaufman, Alexander R. Loftus, Eric Todd, Jannik Brinkmann, Caden Juang, Koyena Pal, Can Rager, Aaron Mueller, Samuel Marks, Arnab Sen Sharma, Francesca Lucchetti, Michael Ripa, Adam Belfki, Nikhil Prakash, Sumeet Multani, Carla Brodley, Arjun Guha, Jonathan Bell, Byron C. Wallace, David Bau. 2024.

The enormous scale of state-of-the-art foundation models has limited their accessibility to scientists, because customized experiments on large models require costly hardware and complex engineering that is impractical for most researchers. To alleviate these problems, we introduce NNsight, an open-source Python package with a simple, flexible API that can express interventions on any PyTorch model by building computation graphs. We also introduce NDIF, a collaborative research platform providing researchers access to foundation-scale LLMs via the NNsight API. Code, documentation, and tutorials are available at https://nnsight.net/.
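As a rough illustration of what that API looks like in practice, here is a minimal sketch of an NNsight trace. The model name, layer index, and prompt are illustrative choices, not taken from the paper; the authoritative tutorials are at https://nnsight.net/.

```python
# Minimal NNsight sketch: read and edit an internal activation of GPT-2.
# The model name, layer index, and prompt are illustrative choices.
from nnsight import LanguageModel

model = LanguageModel("openai-community/gpt2", device_map="auto")

with model.trace("The Eiffel Tower is in the city of"):
    # Save the hidden states output by transformer block 5.
    hidden = model.transformer.h[5].output[0].save()
    # Intervene: zero out that block's hidden states in place.
    model.transformer.h[5].output[0][:] = 0
    # Save the final logits computed under the intervention.
    logits = model.output.logits.save()

# After the trace exits, the saved values hold real tensors
# (older NNsight versions expose them via a .value attribute).
print(hidden.shape, logits.shape)
```

Per the paper, NDIF's role is to run this same kind of trace remotely on shared foundation-scale models, so the intervention code stays the same whether the model fits on local hardware or not.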

Function Vectors in Large Language Models.
Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau. In Proceedings of the 2024 International Conference on Learning Representations (ICLR 2024).

We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer language models (LMs). Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context: they trigger execution of the task even in zero-shot and natural-text settings that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects in middle layers across settings. We investigate the internal structure of FVs and find that while they often contain information encoding the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs and find that, to some extent, they can be summed to create vectors that trigger new complex tasks. Our findings show that compact, causal internal vector representations of function abstractions can be explicitly extracted from LLMs.
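The following is a hypothetical sketch (not the released code) of the two operations the abstract describes: averaging the outputs of a small set of causally important attention heads over ICL prompts to form an FV, and adding that vector to a hidden state at a middle layer to trigger the task. Random tensors stand in for real model activations, and all names, sizes, and head choices are illustrative.

```python
# Hypothetical sketch of the function-vector recipe described above,
# with random tensors standing in for real model activations.
import torch

n_prompts, d_model = 100, 1024          # illustrative sizes
top_heads = [(8, 1), (9, 6), (12, 3)]   # hypothetical (layer, head) pairs,
                                        # selected via causal mediation analysis

# head_outputs[(layer, head)]: that head's output at the final token of each
# ICL prompt demonstrating the task (random stand-ins here).
head_outputs = {lh: torch.randn(n_prompts, d_model) for lh in top_heads}

# A function vector: sum, over the selected heads, of each head's
# mean output across the ICL prompts.
fv = sum(head_outputs[lh].mean(dim=0) for lh in top_heads)

# Intervention: add the FV to the hidden state at a middle layer while the
# model processes an input that does not resemble the ICL contexts
# (e.g. a zero-shot prompt); a dummy hidden state stands in here.
hidden_state = torch.randn(d_model)
patched = hidden_state + fv  # this edit triggers execution of the task
```

In the actual experiments, the heads are chosen by their causal mediation effect and the addition is performed inside the forward pass at a middle layer, where the paper reports the strongest causal effects.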