Webinar: How a Large Language Model is Created for a Low-Resource Language
We invite you to join our next expert webinar, which will focus on advances in artificial intelligence for the Slovenian language. The event will take place online in mid-September and will offer a unique opportunity to look behind the scenes of developing a large language model tailored for a low-resource language. The speaker will be Domen Vreš from the Faculty of Computer and Information Science, University of Ljubljana, who has long been engaged in research in the fields of natural language processing and artificial intelligence.
Speaker: Domen Vreš
Affiliation: Faculty of Computer and Information Science, University of Ljubljana
Title of the lecture: Advances in Artificial Intelligence for the Slovenian Language: Training and Scaling the GaMS Large Language Model
Format: Online
Date and Time: 17 September 2025 at 10:00 CEST
About the webinar
This webinar will provide an in-depth look at the development of GaMS, a large language model (LLM) tailored for the Slovenian language. We will begin by exploring the general principles of LLM training and explain why this process requires substantial computational resources and data. Participants will learn about the entire workflow, from data preparation through training to model performance evaluation.
We will then focus on scaling LLM training on EuroHPC infrastructure, specifically on the Leonardo supercomputer. Scaling up in this way made it possible to improve the model's performance and quality. Domen Vreš will also share insights into the technical and organisational aspects of collaborating on such a project.
Finally, we will delve into the specific challenges of developing an LLM for a low-resource language such as Slovenian. What makes this process more demanding? How can the limitations of available data be overcome? And what creative solutions did the team apply in developing GaMS?
A dedicated part of the webinar will address the challenges of evaluating LLMs in low-resource environments, where established benchmarks and comprehensive evaluation datasets are often lacking. The speaker will explain how this gap can be bridged through the collection and integration of human feedback, and how such data is used to further improve and fine-tune the model.
Target audience:
The webinar is suitable for artificial intelligence professionals, researchers in natural language processing, students, technology enthusiasts, and anyone interested in the development and applications of large language models for low-resource languages.
Whether you are an AI expert or simply seeking inspiration, this webinar offers a clear and engaging insight into the groundbreaking work behind the GaMS model and its significance for the future of Slovenian language processing.
Don’t miss the opportunity to participate and gain valuable insights directly from an expert at the University of Ljubljana.
