Mechanisms of Causal Reasoning
A downloadable project
Causal reasoning is a crucial part of how humans think about the world safely and robustly. Can we identify causal reasoning in LLMs? Marius Hobbhahn and Tom Lieberum (2022, Alignment Forum) approached this question with probing. For this hackathon, we follow up on that work with a mechanistic interpretability analysis of causal reasoning in GPT-2 Small (124 million parameters) using Neel Nanda's Easy Transformer package.
Status | Released
Category | Other
Author | Jacy Reese Anthis