
ChatGPT and Gemini each exhibit a distinctive writing style, akin to a human idiolect, according to new research by Dr. Karolina Rudnicka of the University of Gdańsk, published in Scientific American.
“The style is the man,” Rudnicka notes, referencing the famous 18th-century maxim “Le style c’est l’homme” by Georges-Louis Leclerc, Comte de Buffon.
Her study explores how artificial intelligence models develop distinctive linguistic habits, just as humans do.
Rudnicka, a linguist specializing in language variability and the impact of technology on communication, analyzed datasets of short texts about diabetes generated by ChatGPT and Gemini.
These texts, compiled by computer scientist Muhammad Naveed, were nearly identical in length, enabling a fair comparison of the models’ language patterns.
Using the Delta method introduced by John Burrows in 2001 for authorship attribution, the study measured the linguistic “distance” between samples.
The technique compares the frequency of common function words (like “and,” “the,” “of”) and content-specific words (such as “glucose” or “sugar”).
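The core of the Delta comparison can be sketched in a few lines of Python. Note that the function words, toy corpora, and sample sentence below are purely illustrative assumptions, not the study's actual feature set or data:

```python
import statistics
from collections import Counter

def profile(text, vocab):
    """Relative frequency of each vocabulary word in one text."""
    words = text.lower().split()
    counts = Counter(words)
    return [counts[w] / len(words) for w in vocab]

def burrows_delta(sample, reference_texts, vocab):
    """Mean absolute z-score distance between a sample and a reference corpus.

    A lower Delta means the sample is stylistically closer to the corpus.
    """
    ref_profiles = [profile(t, vocab) for t in reference_texts]
    means = [statistics.mean(col) for col in zip(*ref_profiles)]
    # Fall back to 1.0 for zero-variance features to avoid division by zero
    sds = [statistics.pstdev(col) or 1.0 for col in zip(*ref_profiles)]
    z = [(s - m) / sd for s, m, sd in zip(profile(sample, vocab), means, sds)]
    return statistics.mean(abs(v) for v in z)
```

In practice Delta is computed over the most frequent words of the combined corpus; this toy version only illustrates the principle behind distances like those reported below.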
Results show a clear stylistic split: a random 10% sample of ChatGPT texts had a linguistic distance of 0.92 to the ChatGPT corpus but 1.49 to Gemini’s. Conversely, Gemini samples had a distance of 0.84 to their own corpus and 1.45 to ChatGPT’s.
These findings indicate both models maintain consistent, but distinct, idiolects.
Rudnicka’s further analysis of “trigrams” (three-word sequences) revealed that ChatGPT’s style leans toward formal, clinical, academic language, with phrases like “individuals with diabetes,” “blood glucose levels,” and “characterized by elevated.” Gemini’s style, by contrast, is more conversational and explanatory, using expressions such as “the way for,” “the cascade of,” and “high blood sugar.”
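Extracting trigrams of the kind counted in such an analysis is straightforward; a minimal sketch (the sample sentence here is illustrative, not taken from the study's data):

```python
from collections import Counter

def trigrams(text):
    """Return all overlapping three-word sequences in a text."""
    words = text.lower().split()
    return list(zip(words, words[1:], words[2:]))

# Counting trigram frequencies across a corpus surfaces each model's
# habitual phrasings, e.g. "individuals with diabetes".
counts = Counter(trigrams("individuals with diabetes monitor blood glucose levels"))
```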
One telling difference is word choice: ChatGPT favors “glucose” over “sugar,” while Gemini uses “sugar” more than twice as often as “glucose.”
According to Rudnicka: “Choosing words such as ‘sugar’ instead of ‘glucose’ indicates a preference for simple, accessible language.”
Why do large language models develop idiolects? Rudnicka suggests the answer lies in efficiency. Once a word or phrase becomes part of a model’s repertoire during training, the model tends to reuse it and combine it with similar expressions, much as human speakers have favorite words and turns of phrase.
“This could also be a form of priming,” she adds. “Just as humans are more likely to use a word once they have heard it, AI models might prime themselves with frequently used expressions.”
The presence of distinct idiolects in AI models challenges the notion that they simply mirror or average their training data.
Instead, like humans, chatbots appear to form their own lexical and grammatical habits shaped by their training and updates.
This insight is significant amid debates about how close AI is to human-level intelligence.
For now, knowing that LLMs write in idiolects could help determine if an essay or an article was generated by a model or by a particular individual, just as you might recognize a friend's message in a group chat by their characteristic style.
Paweł Wernicki (PAP)
pmw/ bar/