Polish achieves the highest accuracy in multilingual long-context AI tasks, surpassing English, Chinese, and other widely spoken languages, according to a new study by researchers from the University of Maryland and Microsoft.
The ranking comes from One Ruler to Measure Them All: Benchmarking Multilingual Long-Context Language Models, by Yekyung Kim, Jenna Russell, Marzena Karpinska, and Mohit Iyyer, affiliated with the University of Maryland and Microsoft.
The study introduces ONERULER, a benchmark for evaluating large language models across 26 languages, focusing on tasks requiring long contextual understanding.
Researchers found that Polish achieved the highest performance, while English ranked 6th.
Experiments included both open-weight and closed large language models, such as OpenAI’s o3-mini-high, and tested context lengths from 8K to 128K tokens.
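For readers unfamiliar with this style of evaluation: long-context benchmarks of this kind typically rely on "needle-in-a-haystack" retrieval tasks, in which a single fact is buried inside a large block of filler text and the model is asked to recover it. The sketch below is a hypothetical illustration of how such a test prompt can be built, not the paper's actual code; the function name and the token-to-word conversion ratio are assumptions for the example.

```python
import random

def build_niah_prompt(needle, filler_sentence, context_tokens=8000,
                      words_per_token=0.75, seed=0):
    """Build a needle-in-a-haystack prompt: repeat a filler sentence until
    the context roughly fills the target token budget, then bury the
    needle sentence at a random position inside it."""
    rng = random.Random(seed)
    # Crude tokens-to-words estimate; real benchmarks use a tokenizer.
    target_words = int(context_tokens * words_per_token)
    base = filler_sentence.split()
    words = (base * (target_words // len(base) + 1))[:target_words]
    # Insert the needle sentence as one unit at a random position.
    words.insert(rng.randrange(len(words) + 1), needle)
    context = " ".join(words)
    question = "What is the special magic number mentioned in the text above?"
    return context + "\n\n" + question

# Example: an ~8K-token haystack hiding a single fact.
prompt = build_niah_prompt(
    needle="The special magic number is 7481.",
    filler_sentence="The grass is green and the sky is blue.",
    context_tokens=8000,
)
```

Scaling `context_tokens` from 8K up to 128K, and writing the filler, needle, and question in different languages, yields the kind of monolingual and cross-lingual test conditions the study describes.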
The results also highlighted performance drops in low-resource languages, as well as fluctuations in cross-lingual scenarios where the instructions and the context appeared in different languages.
(mp)
Source: arXiv:2503.01996/Radio Poland