Webinar: How a Large Language Model is Created for a Low-Resource Language
We invite you to join our next expert webinar, which will focus on advances in artificial intelligence for the Slovenian language. The event will take place online on 17 September 2025 and will offer a look behind the scenes of developing a large language model tailored to a low-resource language. The speaker will be Domen Vreš from the Faculty of Computer and Information Science, University of Ljubljana, who has long been active in research on natural language processing and artificial intelligence.
Speaker: Domen Vreš
Affiliation: Faculty of Computer and Information Science, University of Ljubljana
Title of the lecture: Advances in Artificial Intelligence for the Slovenian Language: Training and Scaling the GaMS Large Language Model
Format: Online
Date and Time: 17 September 2025 at 10:00 CEST
About the webinar
This webinar will provide an in-depth look at the development of GaMS, a large language model (LLM) tailored for the Slovenian language. We will begin by exploring the general principles of LLM training and explain why this process requires substantial computational resources and data. Participants will learn about the entire workflow, from data preparation through training to model performance evaluation.
We will then focus on scaling LLM training on EuroHPC infrastructure, specifically on the Leonardo supercomputer. This step made it possible to improve the model's performance and quality. Domen Vreš will also share insights into the technical and organisational aspects of collaborating on such a project.
Finally, we will delve into the specific challenges of developing an LLM for a low-resource language such as Slovenian. What makes this process more demanding? How can the limitations of available data be overcome? And what creative solutions did the team apply in developing the GaMS model?
A dedicated part of the webinar will address the challenges of evaluating LLMs in low-resource environments, where established benchmarks and comprehensive evaluation datasets are often lacking. The speaker will explain how this gap can be bridged through the collection and integration of human feedback, and how such data is used to further improve and fine-tune the model.
Target audience:
The webinar is suitable for artificial intelligence professionals, researchers in natural language processing, students, technology enthusiasts, and anyone interested in the development and applications of large language models for low-resource languages.
Whether you are an AI expert or simply seeking inspiration, this webinar offers a clear and engaging insight into the groundbreaking work behind the GaMS model and its significance for the future of Slovenian language processing.
Don’t miss the opportunity to participate and gain valuable insights directly from an expert at the University of Ljubljana.


