New preprint: a simple untrained model of language in the brain
Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network
In 2021 we were surprised to find that untrained language models are already decent predictors of activity in the human language system (http://doi.org/10.1073/pnas.2105646118). Badr identified the core components underlying the alignment of untrained models: tokenization and aggregation. Building on these findings, we constructed a simple untrained network, "SUMA", with state-of-the-art alignment to brain and behavioral data -- this feature encoder provides representations that are then useful for efficient language modeling. Directly mapping our model onto the brain, these results characterize the human language system as a generic feature encoder that aggregates incoming sensory representations for downstream use. If you disagree, we hope you'll consider breaking our model (soon on Brain-Score/GitHub).
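To make the idea concrete, here is a minimal sketch of the general recipe described above -- a token embedding, a single untrained multihead attention layer, and an aggregation step. This is written in PyTorch purely for illustration; the class name, dimensions, and mean-pooling aggregation are assumptions, not the authors' actual SUMA implementation.

```python
import torch
import torch.nn as nn

class TinyUntrainedEncoder(nn.Module):
    """Illustrative untrained attention feature encoder.

    Hypothetical sketch: a shallow multihead attention layer over token
    embeddings, followed by aggregation (mean pooling) into one feature
    vector per sequence. Weights are left at random initialization.
    """

    def __init__(self, vocab_size=1000, d_model=64, n_heads=8):
        super().__init__()
        # Tokenization happens upstream (text -> token ids); we embed ids here.
        self.embed = nn.Embedding(vocab_size, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, token_ids):
        x = self.embed(token_ids)            # (batch, seq, d_model)
        attn_out, _ = self.attn(x, x, x)     # single untrained attention layer
        # Aggregation: average token representations across the sequence.
        return attn_out.mean(dim=1)          # (batch, d_model)

torch.manual_seed(0)
enc = TinyUntrainedEncoder()
with torch.no_grad():
    feats = enc(torch.randint(0, 1000, (2, 10)))
print(feats.shape)  # torch.Size([2, 64])
```

The resulting fixed-size features could then be fed to a downstream readout (e.g. a linear probe or a lightweight language-modeling head), mirroring the feature-encoder framing in the paragraph above.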
See here for social media posts:
https://x.com/bkhmsi/status/1805595986510717136
https://x.com/martin_schrimpf/status/1805599047098470793
https://x.com/GretaTuckute/status/1805676221189308491
https://x.com/ABosselut/status/1805600725537370119