Technology

AI censorship affects accuracy, warns Bielik co-creator

Adobe Stock

Several mechanisms allow artificial intelligence models to censor responses, which can affect the quality and reliability of the information they provide, according to Krzysztof Wróbel, co-creator of the Polish AI system Bielik.

A study recently published in PNAS Nexus found that Chinese AI chatbots respond differently to politically sensitive questions about China compared to Western language models. The Chinese systems were more likely to refuse to answer, omit inconvenient facts, or provide false information, indicating systemic censorship.

"In the case of closed models (like those from Google or OpenAI), we cannot be certain of their creators' intentions. We do not know what data they used or what values guided their model development. Remember that the results you obtain from such sources may be biased," Wróbel told PAP.

He said Bielik was designed without censorship. "In Bielik's case, we assumed we would not censor it. We are not training it to refuse to answer specific questions." He cited psychoactive substances as an example, where most closed models deliver censored responses. "However, there are industries, such as the pharmaceutical industry, where such topics should not be taboo. Therefore, Bielik (the downloadable version) is designed to provide information even on sensitive topics."

Wróbel noted that completely unrestricted AI can also pose risks. He described the Bielik Guard (Sójka), a content moderation add-on that prevents the chatbot from delivering dangerous messages, including hate speech, profanity, sexual content, instructions for crime, or material related to self-harm. Sójka allows institutions to adjust "safety sliders" to protect chatbots—not just Bielik—from misuse.
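A moderation layer like Sójka can be pictured as a configurable filter sitting between the model and the user. The sketch below is purely illustrative: the category names, scores, and threshold "sliders" are invented for the example and do not reflect Sójka's actual interface.

```python
# Illustrative sketch of a "safety slider" moderation layer.
# Category names, keyword lists, and thresholds are hypothetical;
# a real system such as Sójka would use a trained classifier.

def score_response(text: str) -> dict:
    """Toy scorer: flags a response by keyword matching per category."""
    keywords = {
        "hate_speech": ["hate"],
        "self_harm": ["self-harm"],
        "crime_instructions": ["how to break in"],
    }
    lowered = text.lower()
    return {
        category: 1.0 if any(k in lowered for k in words) else 0.0
        for category, words in keywords.items()
    }

def moderate(text: str, sliders: dict) -> str:
    """Withhold the response if any category score exceeds its slider."""
    scores = score_response(text)
    for category, threshold in sliders.items():
        if scores.get(category, 0.0) > threshold:
            return "[response withheld by moderation layer]"
    return text

# Different deployments can set different sliders: a public chatbot
# might block everything, while a pharmaceutical research tool could
# relax categories that are relevant to its work.
strict = {"hate_speech": 0.5, "self_harm": 0.5, "crime_instructions": 0.5}
print(moderate("Here is how to break in to a car.", strict))
```

The point of the slider design is that the same underlying model can serve audiences with different risk tolerances, since the restriction lives in the filter rather than in the model weights.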

According to Wróbel, AI censorship can occur at multiple stages. One is through the selection of training data. "If a model never sees texts on a given topic, it simply will not learn to talk about it. For example, if a country bans publishing content about a historical event, the language model will not learn about it and therefore will not provide a correct response."
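The data-selection mechanism Wróbel describes can be reduced to a few lines: if every document touching a topic is dropped before training, the model simply never encounters it. In this toy sketch the corpus, the banned topic "event-x", and the filter are all invented for illustration.

```python
# Toy illustration of topic-level filtering at the data-collection
# stage. Documents mentioning a censored topic never reach training,
# so the model cannot learn to discuss it.

BANNED_TOPICS = ["event-x"]  # hypothetical censored topic

corpus = [
    "A neutral article about chemistry.",
    "A detailed account of event-x and its aftermath.",
    "Another neutral article about astronomy.",
]

training_set = [
    doc for doc in corpus
    if not any(topic in doc.lower() for topic in BANNED_TOPICS)
]

print(len(training_set))  # prints 2: the event-x document was dropped
```

Nothing in the surviving data marks the gap, which is why this form of censorship is hard to detect from the outside: the model does not refuse to answer, it has nothing to say.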

Creators can also deliberately reject or modify training texts before adding them to the database. Fully open models documenting every step of their development remain rare. Even in Bielik, low-quality materials had to be filtered, which could unintentionally introduce bias. "For example, we can assume that the Google models received a lot of data about the corporation itself. But perhaps it is mostly positive information about the company," Wróbel said.

Censorship can also be introduced during training by human annotators, who teach the model desired forms of expression. Employees can then ensure AI responds according to organizational or government policies.

Restrictions can also be applied to an existing system through hidden instructions, or "prompts," which specify how a chatbot should answer particular questions. According to Wróbel, developers can add new prompts overnight—sometimes at the request of government authorities or other stakeholders.
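Mechanically, such a hidden instruction is just a message prepended to the conversation before it reaches the model. The sketch below assumes the common "system/user" message convention used by many chat APIs; the instruction text is invented for the example.

```python
# Sketch of how a hidden system prompt shapes answers without the
# user ever seeing it. The instruction text is hypothetical.

HIDDEN_SYSTEM_PROMPT = (
    "If the user asks about topic-y, reply only that the topic "
    "cannot be discussed."
)

def build_request(user_question: str) -> list:
    """Assemble the message list actually sent to the model: the user
    types only the question, but the operator's system prompt rides
    along in front of it."""
    return [
        {"role": "system", "content": HIDDEN_SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]

messages = build_request("Tell me about topic-y.")
# messages[0] is invisible to the user but steers every answer, and
# the operator can replace it at any time without retraining the model.
```

Because the restriction lives in this prepended text rather than in the model itself, it can indeed be changed "overnight", which is what distinguishes it from censorship baked in during training.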

"The law in individual countries already influences the responses citizens receive from chatbots. In Poland, we also have some restrictions. For example, automated systems should not provide medical, legal, or financial advice," he said. He added that failing to include appropriate disclaimers could expose developers to lawsuits.

Wróbel highlighted even subtler forms of censorship. Research on Chinese AI models generating source code found that projects on topics sensitive to China had 50% more security vulnerabilities than neutral projects, leaving the resulting software more exposed to cyberattacks. "It was either a deliberate action or a side effect of incorporating censorship into the functioning of these models," he said.

"If you use language models, remember: they will never be 100% accurate or objective. You must always verify the information they provide. The most important thing is not to blindly trust them," Wróbel added.

Ludwika Tomala (PAP)


tr. RL


