A downloadable project

Causal reasoning is a crucial part of how we humans safely and robustly think about the world. Can we tell whether LLMs perform causal reasoning? Marius Hobbhahn and Tom Lieberum (2022, Alignment Forum) approached this question with probing. For this hackathon, we follow up on that work with a mechanistic interpretability analysis of causal reasoning in GPT-2 Small (roughly 80 million parameters) using Neel Nanda’s Easy Transformer package.
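
As a rough illustration of the setup, the sketch below loads GPT-2 Small through Easy Transformer and compares the model's next-token probabilities for a causally consistent versus an inconsistent completion. The prompt and the specific calls are assumptions based on the package's public interface (since renamed TransformerLens), not code taken from the notebook.

```python
# Minimal sketch, assuming the Easy Transformer API (now TransformerLens).
# The prompt and candidate completions are illustrative, not taken from
# the project's Ball prompts.csv.
from easy_transformer import EasyTransformer

model = EasyTransformer.from_pretrained("gpt2")  # GPT-2 Small

prompt = "Ball A rolled into Ball B, so the ball that moved was Ball"
logits, cache = model.run_with_cache(prompt)  # cache holds intermediate activations

# Compare the model's next-token probability for the causally consistent
# answer (" B") against the inconsistent one (" A").
probs = logits[0, -1].softmax(dim=-1)
for candidate in [" B", " A"]:
    token_id = model.tokenizer.encode(candidate)[0]
    print(f"{candidate!r}: {probs[token_id].item():.4f}")
```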

Download

Mechanisms of Causal Reasoning in LLMs - Ben, Jacy, Mark, Sky - Interpretability Hackathon.pdf (111 kB)
Mechanisms_of_Causal_Reasoning_in_LLMs.ipynb (164 kB)
Ball prompts.csv (1 kB)
