
AI ‘hacks our thinking system’ to appear correct, warns expert

07.10.2025. Director of the Centre for Credible Artificial Intelligence Przemysław Biecek. PAP/Paweł Supernak

AI models are designed to persuade and appear correct, but they do not truly understand the world, and this can make them dangerous in high-risk areas, according to Professor Przemysław Biecek, Director of the Centre for Credible Artificial Intelligence (CCAI) at Warsaw University of Technology.

“Models learn to hack our thinking system to convince us of something. As a result, they become extremely effective rhetorically and persuasively. Furthermore, they are packaged so that their answers appear correct,” Biecek said.

The mathematician and computer scientist noted that machine learning methods have been in development for 50–60 years, with efficiency as the primary goal. “For tasks such as recognizing tanks or cancerous tumours in X-rays, we have effective measures of whether LLMs (Large Language Models) are executing commands correctly. But there are a growing number of problems for which we cannot easily define and evaluate a good measure of effectiveness,” he said.

Discrimination presents a particular challenge. “While there are laws prohibiting discrimination, we do not always know how to translate such a requirement into verification of LLM performance. Without a good measure of discrimination, it is impossible to guarantee a fair AI system. And we live in a world that is historically unfair, so LLMs easily learn this injustice from historical data,” Biecek explained.

He stressed that while outdated biases in training data cannot simply be erased, AI systems can be calibrated—but only if we understand how they work. “Meanwhile, our understanding of AI has not kept pace with its development,” he said.

One major limitation of AI is its lack of knowledge about physical reality. “When we want to explore areas that were not included in the training data, LLMs cannot predict the values or course of events. They have various strategies for what to do when they encounter such a no-man’s-land, for example, they provide average values,” Biecek said.

He added that simply feeding AI textbooks on physics is insufficient. “They can provide a definition and a formula, but they cannot deduce the implications of such an equation,” he said, giving an example of AI’s misunderstanding of reality:

“When we see a tail sticking out of a cupboard, we know there is a cat inside. The AI has no idea because it has never seen anything like it before. At one conference, we asked the AI how many ‘r’ letters are in the English word ‘strawberry’. The model learned that there are three. When we asked how many strawberries are in the letter ‘r’, it answered the same way: three,” he said.

Biecek warned that the persuasive nature of AI is especially risky in fields like medicine or defence. “Even specialists lose their vigilance when they see suggestions generated by seemingly credible systems. They then make mistakes they would not otherwise make,” he said.

Researchers at the CCAI study human-AI interaction, emphasizing the need to tailor AI to users’ contexts. “Designing AI models is similar [to user interface design],” he said. “ChatGPT does not know who it is talking to: whether it is doing homework with a child, helping a scientist compile research results, or perhaps someone is using it to generate a funny image. A one-size-fits-all model does not fully meet the needs of any of these users, and in some cases, it can even be harmful.”

Biecek highlighted the risks of sycophancy, or AI overindulging users. “AI uses various tricks, flattery, and praise: it is great that you asked about it; it is good that you pointed it out. It is very rare, but in some people, it causes adverse reactions, even psychosis. People subconsciously feel that while the AI appears to be praising them, something is not right, because there is no reason to praise them,” he said.

He added that AI systems can occasionally recommend harmful behaviours or content. “Some algorithms recommend unfavourable behaviours or content that can—rarely, but still—lead to depression, especially in children and adolescents, and even increase suicidal tendencies,” Biecek said. “We need to teach AI not to harm us. This applies both to younger, more impressionable individuals and to older people who lack the appropriate tools to verify what the technology proposes and are prone to over-trusting it.”

Developing reliable AI presents both mathematical and social challenges. “The mathematical objects we describe are very difficult. We are talking about functions with billions of parameters; we do not even have the tools to analyse them. There are also many issues unrelated to technology. Average users want to use LLMs differently, the police differently, and lobbyists have different goals. All of this must be taken into account,” he said.

Looking ahead, Biecek said AI will have transformative effects on society.

“Interesting times lie ahead, because we are certainly not dealing with a fad that will last a few years. We are on the verge of a massive technological transformation, and society's reactions to it may vary greatly. It is unclear, for example, how the labour market will react to the increasingly widespread presence of artificial intelligence. In my opinion, Poland can greatly benefit from this transformation,” he said.

PAP - Science in Poland, Anna Bugajska (PAP)

abu/ agt/ mow/

tr. RL

