Polish mathematician reveals secrets behind AI test that no human could pass

Bartosz Naskręcki, PhD, source: Adam Mickiewicz University in Poznań

Scientists from the FrontierMath project have created a new mathematics exam designed to push the limits of artificial intelligence.

The test consists of 50 problems so difficult that, according to project participants, no mathematician could solve them all.

The initiative was launched after recent AI models from Google DeepMind and OpenAI demonstrated they could handle high school–level math with ease, rendering existing benchmarks obsolete. FrontierMath is coordinated by Epoch AI, with contributions from 30 experts worldwide, including Bartosz Naskręcki, PhD, of Adam Mickiewicz University in Poznań.

Naskręcki co-authored the most difficult set, Tier 4, which covers number theory, topology, combinatorics, mathematical analysis, and algebraic geometry. Current AI systems have managed to solve only four of the 50 problems.

“I was invited to prepare a problem. The answer was supposed to be a very large number, so that the model could not accidentally guess it. I put all my expert knowledge accumulated during all my years of study and work into this problem,” he said.

He added that he was instructed to design a completely new problem with no existing solution online. “It is essentially my buried scientific work. The documented solution is 13 pages of dense, mathematical text.”

The difficulty of each task is such that, in Naskręcki’s view, even a PhD-level mathematician would need at least a month to devise an approach. “I do not think there’s a mathematician in the world who could solve all 50 problems in this set,” he said.

Preparation of the benchmark took place during a two-day meeting in Berkeley, where specialists worked in topic groups. Problems were tested against advanced AI models in incognito mode, then refined to ensure they could not be solved too easily. Many proposals were discarded.

AI companies can now access the FrontierMath benchmark under controlled conditions via Epoch AI. Each model faces strict limits: for example, up to three hours of runtime and a million tokens to attempt a single problem.

So far, top-performing systems have solved only a handful of tasks. Naskręcki expects progress will be rapid. “In just two to three years, the AI will saturate this benchmark — it will provide correct answers to most of the problems. And then we can say that we have a model that is a truly good mathematician.”

But he stressed a limitation. “AI is brilliant at clever combinations of existing knowledge, but it cannot create new concepts. No current model will figure out how to prove the Riemann hypothesis. So if models can solve all the problems we prepare, the last domain left for mathematicians will be coming up with new, crazy mathematical ideas.”

The researcher compared AI’s rise to a shock. “The development of AI is a hammer that hits us over the head and forces a revolution in our thinking about work and education,” he said.

He argued that the traditional school system is outdated. “We must abandon the Prussian school model, which formed obedient soldiers who would obey every order. Now we need people who can think independently, take risks, and build something new.”

According to Naskręcki, future value will come from originality and creativity. “There will be no more resting on one’s laurels and adding details to existing theories. Mathematics will return to its roots: it will involve asking bold questions and proposing unconventional solutions.”

He added that human experience still gives researchers an edge. “Our advantage over AI lies in unique experiences: taking a walk, reading a book, seeing a play. Non-obvious connections between fields give birth to ideas that AI has no access to. Therefore, in the new reality, our greatest value will not be the correct execution of routine tasks, but rather the ability to ask questions and generate original ideas.”

PAP - Science in Poland, Ludwika Tomala (PAP)


