What are the key points?

Sakana AI released the 'Namazu' post-training series and a proprietary chat service. Global open-weight models like Llama and DeepSeek were optimized for Japanese cultural and social contexts. The technology corrects biases and refusal behaviors on political topics while maintaining original model performance.

Sakana AI Optimizes Global Models for Japanese Context

At the frontier of AI development, pre-training, which requires immense computational resources, is increasingly concentrated among tech giants in the US and China. In response, Tokyo-based Sakana AI is making the case for the strategic importance of post-training. This approach takes high-performance open-weight models and optimizes them to align with a specific region's culture, values, and security requirements. As a first proof of concept, the company unveiled 'Namazu' (alpha), a series of prototype models tailored for Japan, along with the 'Sakana Chat' service.

The standout feature of the Namazu series is that it adapts to the Japanese context without compromising the reasoning and coding capabilities of base models such as Llama-3.1-405B and DeepSeek-V3.1. Models developed abroad often carry biases, or refuse to answer politically sensitive questions, reflecting the ideologies or information controls of their regions of origin. By post-training on proprietary datasets, Sakana AI has mitigated this 'self-censorship' and enabled more objective, multifaceted responses.

The effect is most visible in the refusal metrics: the base DeepSeek-V3.1-Terminus model refused 72% of certain queries, while the Namazu-enhanced version brought that rate down to nearly 0%. This suggests that targeted post-training can remove constraints inherited from a model's origin, letting Japanese users safely draw on the base model's full capabilities. In addition, integrated web search lets the service synthesize real-time news, making it a practical tool for daily use.
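To make the refusal-rate comparison concrete, the following is a minimal sketch of how such a metric might be computed. It is not Sakana AI's evaluation method, which the article does not describe. The keyword markers, the function names `is_refusal` and `refusal_rate`, and the sample responses are all hypothetical, chosen only to illustrate the arithmetic of "share of queries refused".

```python
# Hypothetical sketch: the refusal markers and sample responses below are
# illustrative assumptions, not Sakana AI's evaluation data or method.
REFUSAL_MARKERS = (
    "i cannot",
    "i can't",
    "i am unable to",
    "i'm unable to",
    "お答えできません",
)


def is_refusal(response: str) -> bool:
    """Return True if the response contains any refusal marker."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(responses: list[str]) -> float:
    """Fraction of responses flagged as refusals (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)


# Made-up responses from a base model and a post-trained model
# to the same four sensitive queries.
base_responses = [
    "I cannot discuss this topic.",
    "申し訳ありませんが、お答えできません。",
    "I'm unable to help with that.",
    "Here is an overview of the main positions.",
]
tuned_responses = [
    "Here is an overview of the main positions.",
    "There are several perspectives on this issue.",
    "Historians describe the event as follows.",
    "The main arguments on each side are these.",
]

print(f"base:  {refusal_rate(base_responses):.0%}")
print(f"tuned: {refusal_rate(tuned_responses):.0%}")
```

A keyword heuristic like this is crude: it misses evasive non-answers and can flag legitimate text that happens to contain a marker phrase. A real evaluation would more likely rely on human raters or a separate classifier, so the sketch should be read only as an illustration of what the percentage measures.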

Looking ahead, Sakana AI plans to integrate agent technologies and sophisticated control systems to provide even more advanced AI solutions. The release of the Namazu series demonstrates a powerful methodology for 'taming' massive foundation models to meet specific national needs. As the democratization of technology continues, such localization through post-training will likely serve as a cornerstone for maintaining Japan's independent competitiveness in the global AI landscape.
