Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
- Microsoft Research debuts Multiplex Thinking, a soft reasoning mechanism using token-wise branch-and-merge for better efficiency.
- System aggregates multiple candidate token embeddings into single multiplex tokens to optimize reasoning via reinforcement learning.
- New method outperforms standard reasoning baselines on math benchmarks while producing shorter, higher-bandwidth token sequences.
Microsoft Research has unveiled a novel reasoning framework called Multiplex Thinking, designed to overcome the efficiency bottlenecks of standard LLM reasoning methods. While traditional models rely on long, sequential text chains to solve complex problems, this new approach mimics human intuition by maintaining a distribution over several plausible next steps at once. This shift from discrete steps to a 'soft' continuous approach lets the system weigh multiple possibilities in parallel without the latency of generating a separate chain for each one.
The core innovation lies in a 'token-wise branch-and-merge' mechanism. Instead of picking just one word at each step, the model samples multiple candidate tokens and merges their mathematical representations into a single 'multiplex token.' This allows the model to explore various reasoning paths without the massive computational cost of generating separate, long sentences for every scenario. By preserving original vocabulary priors, the system stays grounded in its training data while effectively compressing logic into fewer tokens.
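The branch-and-merge idea described above can be sketched in a few lines. The snippet below is a minimal illustration, not Microsoft's implementation: it assumes the merge is a probability-weighted convex combination of the top-k candidate token embeddings (the function name `multiplex_token` and the choice of top-k selection are assumptions for illustration).

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the vocabulary
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def multiplex_token(logits, embeddings, k=4):
    """Merge the top-k candidate token embeddings into one 'multiplex'
    embedding, weighted by their renormalized next-token probabilities.
    This is a hypothetical sketch of token-wise branch-and-merge."""
    probs = softmax(logits)
    top = np.argsort(probs)[-k:]        # branch: k most likely tokens
    w = probs[top] / probs[top].sum()   # renormalize weights over the branch set
    merged = w @ embeddings[top]        # merge: convex combination of embeddings
    return merged, top, w

# toy example: vocabulary of 10 tokens, embedding dimension 8
rng = np.random.default_rng(0)
E = rng.normal(size=(10, 8))
logits = rng.normal(size=10)
vec, branch, weights = multiplex_token(logits, E, k=3)
```

Because the weights come from the model's own next-token distribution, the merged vector stays anchored to the vocabulary priors mentioned above rather than drifting into arbitrary regions of embedding space.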
Crucially, this method is optimized through Reinforcement Learning, which helps the model learn the most effective logical paths. The system is self-adaptive: when the model is certain, it acts like a normal text generator; when uncertain, it uses multiplex tokens to represent multiple ideas compactly. Results show superior performance on math benchmarks, consistently beating strong baselines on metrics like pass@k while maintaining significantly shorter and more efficient output sequences.
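The self-adaptive behavior can be illustrated with a simple entropy gate: when the next-token distribution is sharply peaked, emit an ordinary hard token; when it is flat, fall back to a merged multiplex embedding. This is a hedged sketch under assumed mechanics (the entropy threshold `tau` and the function names are illustrative, not from the paper).

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def entropy(p):
    # Shannon entropy of the next-token distribution (in nats)
    return float(-(p * np.log(p + 1e-12)).sum())

def adaptive_step(logits, embeddings, k=4, tau=0.5):
    """Hypothetical self-adaptive step: low entropy -> one hard token,
    high entropy -> weighted merge of the top-k candidate embeddings."""
    p = softmax(logits)
    if entropy(p) < tau:                 # confident: behave like normal decoding
        tok = int(p.argmax())
        return embeddings[tok], [tok]
    top = np.argsort(p)[-k:]             # uncertain: branch over top-k tokens
    w = p[top] / p[top].sum()
    return w @ embeddings[top], [int(t) for t in top]

rng = np.random.default_rng(1)
E = rng.normal(size=(10, 8))
confident = np.full(10, -5.0)
confident[3] = 5.0                       # sharply peaked on token 3
uncertain = np.zeros(10)                 # flat distribution over all 10 tokens
v1, b1 = adaptive_step(confident, E)
v2, b2 = adaptive_step(uncertain, E, k=3)
```

In this toy setting the peaked distribution yields a single hard token while the flat one yields a three-way multiplex, mirroring the compact-when-uncertain behavior the authors describe; in the actual system the branching policy is shaped by reinforcement learning rather than a fixed threshold.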