PFN Launches JFBench to Master Japanese Instruction Following
- Preferred Networks (PFN) releases JFBench to evaluate professional-level Japanese instruction following performance.
- The benchmark features 174 constraints, including unique Japanese cultural nuances and complex multi-instruction requirements.
- The PLaMo 2.2 Prime model achieved performance comparable to GPT-5.1 by leveraging JFBench during the post-training phase.
Preferred Networks (PFN) has released JFBench, a new benchmark designed to accurately evaluate and improve AI performance in handling the complex linguistic structures and cultural backgrounds unique to the Japanese language.
Most existing benchmarks for measuring instruction-following performance were developed in English-speaking regions. Translated versions of these benchmarks often fail to capture the nuances of Japanese, such as the distinction between polite and casual speech or the intricate mixing of hiragana, katakana, and kanji.
JFBench covers 174 diverse constraints ranging from strict format requirements like JSON output to specific Japanese business customs, such as telephone number and address formats.
Perhaps most importantly, the benchmark evaluates whether a model can satisfy up to eight independent constraints simultaneously, rather than a single instruction at a time. This lets developers see how faithfully a model responds to the complex, multi-part demands encountered in professional environments.
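To make the idea of checking several constraints at once concrete, here is a minimal Python sketch. It is not taken from JFBench: the three checks (valid JSON, a hyphenated phone number, no katakana), the `CONSTRAINTS` table, and the `evaluate` function are all hypothetical names chosen for illustration, and the phone-number pattern is a deliberately loose assumption rather than an official format definition.

```python
import json
import re
from typing import Callable, Dict, Tuple

# Hypothetical, simplified pattern: leading 0, then hyphen-separated groups
# such as 03-1234-5678. Real Japanese numbering rules are more detailed.
PHONE_RE = re.compile(r"0\d{1,4}-\d{1,4}-\d{4}")

# Unicode Katakana block (U+30A0 to U+30FF).
KATAKANA_RE = re.compile(r"[\u30A0-\u30FF]")


def is_valid_json(text: str) -> bool:
    """Format constraint: the whole response must parse as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def has_hyphenated_phone(text: str) -> bool:
    """Locale constraint: a hyphen-separated phone number appears."""
    return PHONE_RE.search(text) is not None


def has_no_katakana(text: str) -> bool:
    """Script constraint: the response contains no katakana characters."""
    return KATAKANA_RE.search(text) is None


# Illustrative constraint table; the names are assumptions, not JFBench IDs.
CONSTRAINTS: Dict[str, Callable[[str], bool]] = {
    "json_output": is_valid_json,
    "phone_format": has_hyphenated_phone,
    "no_katakana": has_no_katakana,
}


def evaluate(
    response: str, constraints: Dict[str, Callable[[str], bool]]
) -> Tuple[Dict[str, bool], bool]:
    """Run every check; the response passes only if all constraints hold."""
    results = {name: check(response) for name, check in constraints.items()}
    return results, all(results.values())


good = '{"店名": "さくら食堂", "電話": "03-1234-5678"}'
bad = "電話は0312345678です。カフェ"
print(evaluate(good, CONSTRAINTS))
print(evaluate(bad, CONSTRAINTS))
```

The all-or-nothing pass flag reflects one plausible way to score simultaneous constraints; a benchmark could equally report the fraction of constraints satisfied.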
In this project, PFN used JFBench not only for evaluation but also as the basis of a training dataset called JFTrain.
By optimizing the post-training process, including supervised fine-tuning and Direct Preference Optimization (DPO), the company's domestically developed model, PLaMo 2.2 Prime, achieved instruction-following performance on par with the latest frontier model, GPT-5.1.
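For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python. It illustrates the general technique only and says nothing about PFN's actual training code. The pairing of a constraint-satisfying response as "chosen" with a constraint-violating one as "rejected" is an assumption made here for illustration, as are the function name `dpo_loss` and the default `beta`.

```python
import math


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed log-probabilities of each response under the policy
    being trained and under a frozen reference model.
    """
    # How much more the policy prefers the chosen response than the
    # reference model does, scaled by beta.
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # Loss is -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Assumed scenario: "chosen" satisfies the constraints, "rejected" does not.
print(dpo_loss(-12.0, -20.0, -15.0, -15.0))  # policy already prefers chosen
print(dpo_loss(-20.0, -12.0, -15.0, -15.0))  # policy prefers rejected
```

The loss shrinks as the policy assigns relatively more probability to the chosen response than the reference model does, which is what pushes the model toward outputs that follow the instructions.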
PFN has made the source code for JFBench available on GitHub to support the growth of the domestic AI development community.
Instruction-following capability is a foundational requirement for AI agents that can perform tasks autonomously. This initiative, tailored specifically to the Japanese-language environment, represents a major step toward accelerating the real-world deployment of domestic generative AI.