dynaxx on the hidden graph: interpretability via circuit-level abstraction
5 articles in this category
This comprehensive guide explores the emerging paradigm of circuit-level abstraction for understanding neural networks, moving beyond neuron-level int...
This comprehensive guide explores the Dynaxx Mechanistic Audit, a rigorous methodology for tracing causal paths in production models. Unlike conventio...
Introduction: The Problem of Feature Superposition. When we inspect the internal representations of a deep neural network, we often find that individual...
Introduction: The Black Box is a Mirage. In my decade of working with deep learning systems, first in academic research and now leading the Dynaxx resea...
Introduction: The Black Box That Changed Everything. When I first encountered GPT-3's ability to perform a novel task from just a few examples in a prom...