Scientists from the Gdańsk University of Technology and OPI have developed Polish language models called Qra. They are the first Polish equivalent on this scale of the open tools released by Meta or Mistral AI. Qra understands content in Polish better and is better at creating coherent texts, the Gdańsk University of Technology reports.
According to the press office of the Gdańsk University of Technology, the university and the AI Lab at the National Information Processing Institute - National Research Institute (OPI) have developed Polish-language generative neural language models trained on a terabyte of text data exclusively in Polish.
'Qra is the first of its kind, and the best at modelling the Polish language, equivalent of open tools such as those from Meta or Mistral AI. Qra understands Polish content better, has an improved comprehension of the questions it is asked and produces coherent texts', we read in the release.
A computing environment dedicated to building artificial intelligence models was created at the Gdańsk University of Technology's STOS IT Competence Center, one of the most advanced IT centres in this part of Europe and home to the Kraken supercomputer.
According to the release, a cluster of 21 NVIDIA A100 80GB graphics cards was used in the process. The teams worked for about six months on preparing the environment, creating tools and models, training them (based on content from areas such as law, technology, social sciences, biomedicine, religion or sport) and testing. 'Thanks to the rich infrastructure available at STOS, the actual training process for the most complex models was shortened from years to about a month', the university reports.
The cooperation between Gdańsk Tech and OPI resulted in the creation of three models differing in complexity: Qra 1B, Qra 7B and Qra 13B. The Qra 7B and Qra 13B models achieve significantly better perplexity, a measure of the ability to model the Polish language in terms of comprehension, the lexical layer and grammar, than the original models Llama-2-7b-hf (Meta) and Mistral-7B-v0.1 (Mistral AI).
Perplexity measurement tests were performed, for example, on the set of the first 10 000 sentences from the PolEval-2018 test set, and the models were additionally tested on a set of 5 000 long and more demanding documents written in 2024.
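For readers unfamiliar with the measure, perplexity is commonly defined as the exponential of the average negative log-probability that a model assigns to each token of a text, so a lower value means the text was less "surprising" to the model. The Python sketch below illustrates that standard definition only; it is not the evaluation code used by Gdańsk Tech and OPI, and the function name and the sample probabilities are made up for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log token probabilities.

    Illustrative helper (hypothetical name): exp of the mean negative
    log-probability the model assigned to each token of the text.
    """
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    avg_neg_log_prob = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_prob)


# Made-up per-token probabilities for the same sentence under two models.
confident_model = [math.log(p) for p in (0.50, 0.40, 0.60, 0.50)]
uncertain_model = [math.log(p) for p in (0.10, 0.05, 0.20, 0.10)]

# The model that assigns higher probabilities gets the lower perplexity.
print(perplexity(confident_model) < perplexity(uncertain_model))
```

In practice the log-probabilities would come from running a language model over a test set such as the PolEval-2018 sentences mentioned above; here they are hard-coded so the example stays self-contained.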
The Qra models will constitute the basis for IT solutions to handle issues and processes that require a better understanding of the Polish language.
'At this stage, Qra is a foundation language model that can generate grammatically and stylistically correct answers in Polish. The content it creates is of very high quality, which is confirmed by the perplexity measure, among other things', we read in the release.
The team will now start tuning the models to verify their ability to classify texts, summarize them and answer questions.
The new models have been published in the dedicated OPI-Gdańsk Tech repository on the Hugging Face platform. Anyone can download the models and adapt them to their own field and to problems or tasks such as providing answers. (PAP)
pm/ bar/ kap/
tr. RL