Emergent Depthwise Activation Structure in Decoder-Only Transformer Language Models
The Hau Curve paper identifies a recurring tri-phasic activation geometry across decoder-only transformer models, with early-training convergence and stable depthwise landmarks. This work became the foundation for AEGIS.
- 77 models in the core-plus-boundary cohort
- 7-15%: approximate early-training emergence window in examined runs