RePo: Language Models with Context Re-Positioning
- Sakana AI introduces RePo, a novel architecture allowing models to dynamically reorganize input sequences based on relevance.
- The system outperforms standard encodings on noisy contexts and long-range dependencies by reshaping attention geometry.
- RePo includes open-source code and an interactive website to demonstrate flexible working memory in language models.
Sakana AI has unveiled RePo, a research breakthrough that challenges how language models process information. Traditionally, these models view text as a rigid sequence where the physical order of words dictates their importance. This forces AI to treat nearby words as more relevant than distant ones, regardless of context, often causing errors when crucial facts are buried in long documents.

Inspired by Cognitive Load Theory—the concept that brains have a limited capacity for processing new information—RePo breaks this bottleneck with a dynamic re-positioning module. Instead of a fixed index, the model assigns positions based on content relevance. This allows it to 'pull' important information closer while 'pushing' noise away, effectively reorganizing its internal workspace and reshaping its attention geometry. RePo demonstrates significantly higher robustness when dealing with noisy data and long-range dependencies that typically confuse standard LLM architectures.

By treating context as a flexible map rather than a straight line, Sakana AI is moving toward models that intelligently curate their own working memory. This evolution marks a leap for deep learning, suggesting a future where AI actively structures inputs for more efficient and accurate results.
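To make the idea concrete, here is a minimal sketch of content-based re-positioning in a single attention step. It assigns each token a new position equal to its relevance rank with respect to the query, so relevant tokens are "pulled" to small distances and noise is "pushed" away via a distance penalty on the attention logits. The function name, the rank-based scheme, and the bias term are all illustrative assumptions — this is not Sakana AI's actual RePo module.

```python
import numpy as np

def repositioned_attention(q, K, V, tau=1.0):
    """Toy content-based re-positioning before attention (hypothetical).

    Rather than using each token's fixed index as its position, tokens
    are re-positioned by relevance: the most query-similar token gets
    position 0, the least similar gets position n-1. A distance bias
    then favors the re-positioned "nearby" tokens.
    """
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)              # content relevance per token
    order = np.argsort(-scores)              # most relevant token first
    new_pos = np.empty_like(scores)
    new_pos[order] = np.arange(len(scores))  # relevance rank = new position
    bias = -new_pos / tau                    # smaller distance -> larger bias
    logits = scores + bias
    weights = np.exp(logits - logits.max())  # stable softmax
    weights /= weights.sum()
    return weights @ V, new_pos
```

In a standard encoding, `new_pos` would simply be the token's original index; here it is recomputed from content, which is the core shift the article describes.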