NCC for HPC at INOFEST 2024: Innovation Festival in Žilina
On September 17 and 18, 2024, the fifth edition of the innovation festival INOFEST, organized by the INOVATO association, took place in Žilina. The event served as a unique platform where experts, entrepreneurs, academics, innovators, students, representatives of the state administration, and the public could meet, strengthening cooperation within Slovakia's regional innovation ecosystems.
More than 200 participants attended the two-day program, which included lectures, workshops, discussions and networking. The main goal of the event was to contribute to the connection of companies and other actors at the national level, while creating new business opportunities and inspiration for solving current challenges in the business and social environment.
The first day's program focused on key topics such as automation, robotics, the space industry, and sustainable solutions. Speakers included František Duchoň of the National Robotics Center, who discussed Slovakia's progress in robotics, and Michaela Musilová, an astrobiologist and analog astronaut, who presented how space technologies can improve life on Earth.
Prominent space was also given to the topics of sustainability, the circular economy, and modularity; for example, Stanislav Martinec from KOMA Modular spoke about how modular construction can contribute to a sustainable future.

During the first day of INOFEST 2024, a significant part of the program was devoted to artificial intelligence under the title "Artificial intelligence as the topic of 2024". In his presentation, Libor Bešényi from XOLUTION ROBOTS focused on the implementation of AI in businesses and drew attention to the difference between effective and unnecessary deployment of AI. Martin Jančura from ITRANSYS explained that not all AI is the same and that understanding it correctly is key to success. Martin Haranta from PERBIOTIX pointed out how AI can help even small companies achieve great success. Mária Bieliková from KInIT spoke about the potential of AI to transform Slovakia's business and public sectors. The block closed with Vladimír Šucha, representative of the European Commission in Slovakia, and his presentation "Artificial intelligence in action: Catalyst of changes in the economy and society". Šucha emphasized that AI is not only a technological tool but also a catalyst for fundamental changes in the economy, education, and public services that can fundamentally affect the future of Slovakia and Europe.

The evening belonged to innovations and a social program, including a musical performance by Tomáš Bezdeda and the ceremonial presentation of the innovation award.
The second day of the festival was dedicated to inspiration from business, where successful entrepreneurs such as Artur Gevorkyan and Ľubomír Klieštik presented their stories. The conversation with Michaela Musilová attracted the attention of the younger generation in particular, as she shared her experiences from scientific missions and simulations of life on Mars.
The INOFEST 2024 program also included a special workshop attended by the National Competence Center for High Performance Computing (HPC). Lucia Demovičová presented the competence center project, which helps companies and institutions use powerful computing capacities to solve demanding tasks in the field of research and development.
During his presentation, Michal Pitoňák presented several success stories on which the NCC collaborated with various companies. These projects have shown how the use of supercomputers and computing clusters can accelerate and streamline innovation and product development in various industries.
In addition to the main program, participants had the opportunity to take part in side events, including an exhibition of innovative technologies, robotics workshops and discussions on the future of electromobility. One of the highlights of the festival was the presentation of financing options for innovative projects, where experts provided an overview of available resources, including European funds and the Horizon Europe program.
INOFEST 2024 in Žilina once again proved that Slovakia is a country where innovation, cooperation and new technologies have their place and that such a festival can be a great opportunity for building relationships, inspiration and growth in the field of innovation.
Intent Classification for Bank Chatbots through LLM Fine-Tuning
This study evaluates the application of large language models (LLMs) for intent classification within a chatbot with predetermined responses designed for banking industry websites. Specifically, the research examines the effectiveness of fine-tuning SlovakBERT compared to employing multilingual generative models, such as Llama3 8b instruct and Gemma 7b instruct, in both their pre-trained and fine-tuned versions. The findings indicate that SlovakBERT outperforms the other models in terms of in-scope accuracy and out-of-scope false positive rate, establishing it as the benchmark for this application.
The advent of digital technologies has significantly influenced customer service methodologies, with a notable shift towards integrating chatbots for handling customer support inquiries. This trend is primarily observed on business websites, where chatbots serve to facilitate customer queries pertinent to the business’s domain. These virtual assistants are instrumental in providing essential information to customers, thereby reducing the workload traditionally managed by human customer support agents.
In the realm of chatbot development, recent years have witnessed a surge in the employment of generative artificial intelligence technologies to craft customized responses. Despite this technological advancement, certain enterprises continue to favor a more structured approach to chatbot interactions. In this perspective, the content of responses is predetermined rather than generated on-the-fly, ensuring accuracy of information and adherence to the business’s branding style. The deployment of these chatbots typically involves defining specific classifications known as intents. Each intent correlates with a particular customer inquiry, guiding the chatbot to deliver an appropriate response. Consequently, a pivotal challenge within this system lies in accurately identifying the user’s intent based on their textual input to the chatbot.
Problem Description and Our Approach
This work is a joint effort of Slovak National Competence Center for High-Performance Computing and nettle, s.r.o., which is a Slovakia-based start-up focusing on natural language processing, chatbots, and voicebots. HPC resources of Devana system were utilized to handle the extensive computations required for fine-tuning LLMs. The goal is to develop a chatbot designed for an online banking service.
In frameworks as described in the introduction, a predetermined precise response is usually preferred over a generated one. Therefore, the initial development step is the identification of a domain-specific collection of intents crucial for the chatbot's operation and the formulation of corresponding responses for each intent. These chatbots are often highly sophisticated, encompassing a broad spectrum of up to a few hundred distinct intents. For every intent, developers craft various exemplary phrases that they anticipate users would articulate when inquiring about that specific intent. These phrases are pivotal in defining each intent and serve as foundational training material for the intent classification algorithm.
Our baseline proprietary intent classification model, which does not leverage any deep learning framework, achieves 67% accuracy on a real-world test dataset described in the next section. The aim of this work is to develop an intent classification model using deep learning that outperforms this baseline.
We present two different approaches for solving this task. The first one explores the application of Bidirectional Encoder Representations from Transformers (BERT), evaluating its effectiveness as the backbone for intent classification and its capacity to power precise response generation in chatbots. The second approach employs generative large language models (LLMs) with prompt engineering to identify the appropriate intent with and without fine-tuning the selected model.
Dataset
Our training dataset consists of (text, intent) pairs, wherein each text is an example query posed to the chatbot that triggers the respective intent. This dataset is meticulously curated to cover the entire spectrum of predefined intents, ensuring a sufficient volume of textual examples for each category.
In our study, we have access to a comprehensive set of intents, each accompanied by corresponding user query examples. We consider two sets of training data: a “simple” set, providing 10 to 20 examples for each intent, and a “generated” set, which encompasses 20 to 500 examples per intent, introducing a greater volume of data albeit with increased repetition of phrases within individual intents.
These compilations of data are primed for ingestion by supervised classification models. This process involves translating the set of intents into numerical labels and associating each text example with its corresponding label, followed by the actual model training.
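To make the label-encoding step concrete, the following is a minimal sketch; the example utterances and intent names are invented for illustration, since the actual dataset is proprietary:

```python
# A minimal sketch of the label-encoding step described above.
# The (text, intent) pairs and intent names are illustrative, not the
# proprietary nettle dataset.
from sklearn.preprocessing import LabelEncoder

train_pairs = [
    ("Ako si zmenim PIN kod?", "change_pin"),
    ("Chcem zablokovat kartu", "block_card"),
    ("Aky je zostatok na ucte?", "account_balance"),
]

texts, intents = zip(*train_pairs)

encoder = LabelEncoder()
labels = encoder.fit_transform(intents)  # e.g. [2, 1, 0]

# `texts` and `labels` can now be fed to any supervised classifier;
# encoder.inverse_transform maps predicted label ids back to intent names.
```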
Additionally, we utilize a test dataset comprising approximately 300 (text, intent) pairs extracted from an operational deployment of the chatbot, offering an authentic representation of real-world user interactions. All texts within this dataset are tagged with an intent by human annotators. This dataset is used for performance evaluation of our intent classification models by feeding them the text inputs and comparing the predicted intents with those annotated by humans.
All of these datasets are proprietary to nettle, s.r.o., so they cannot be discussed in more detail here.
Evaluation Process
In this article, the models are primarily evaluated based on their in-scope accuracy using a real-world test dataset comprising 300 samples. Each of these samples belongs to the in-scope intents on which the models were trained. Accuracy is calculated as the ratio of correctly classified samples to the total number of samples. For models that also provide a probability output, such as BERT, a sample is considered correctly classified only if its confidence score exceeds a specified threshold. Throughout this article, accuracy refers to this in-scope accuracy.
As a secondary metric, the models are assessed on their out-of-scope false positive rate, where a lower rate is preferable. For this evaluation, we use artificially generated out-of-scope utterances.
The model is expected either to produce a low confidence score below the threshold (for BERT) or generate an ’invalid’ label (for LLM, as detailed in their respective sections).
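A minimal sketch of both metrics, assuming model outputs of the form (predicted intent, confidence) and an illustrative threshold value:

```python
# Sketch of the two evaluation metrics described above.
# The threshold value is illustrative, not the one used in the study.
THRESHOLD = 0.8

def in_scope_accuracy(predictions, true_intents):
    """predictions: list of (predicted_intent, confidence) for in-scope samples."""
    correct = sum(
        1 for (pred, conf), true in zip(predictions, true_intents)
        if pred == true and conf >= THRESHOLD
    )
    return correct / len(true_intents)

def oos_false_positive_rate(oos_predictions):
    """oos_predictions: list of (predicted_intent, confidence) for
    out-of-scope samples. A false positive is any confident in-scope
    prediction on an out-of-scope input."""
    fp = sum(1 for _, conf in oos_predictions if conf >= THRESHOLD)
    return fp / len(oos_predictions)
```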
Approach 1: Fine-Tuning SlovakBERT

Since the data at hand is in the Slovak language, the choice of a model with Slovak understanding was inevitable. We therefore opted for SlovakBERT [5], the first publicly available large-scale Slovak masked language model.
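As a minimal sketch of this setup, intent classification with SlovakBERT can be framed as sequence classification; the hub id is the public SlovakBERT checkpoint, and the hyperparameters and intent count shown are illustrative, not the tuned configuration:

```python
# Sketch: SlovakBERT with a classification head for intent prediction.
# Hyperparameters and NUM_INTENTS are assumptions for illustration.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "gerulata/slovakbert"  # public SlovakBERT checkpoint [5]
NUM_INTENTS = 300                 # "a few hundred" intents, per the text

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, num_labels=NUM_INTENTS
)

args = TrainingArguments(
    output_dir="slovakbert-intents",
    num_train_epochs=5,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)
# The training data itself is proprietary, hence only indicated here:
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# trainer.train()
```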
Multiple experiments were undertaken by fine-tuning this model before arriving at the top-performing model. These trials included adjustments to hyperparameters, various text preprocessing techniques, and, most importantly, the choice of training data.
Given the presence of two training datasets with relevant intents (“simple” and “generated”), experiments with different ratios of samples from these datasets were conducted. The results showed that the optimal performance of the model is achieved when training on the “generated” dataset.
After the optimal dataset was chosen, further experiments were carried out, focusing on selecting the right preprocessing for the dataset. The following options were tested:
turning text to lowercase,
removing diacritics from text, and
removing punctuation from text.
Additionally, combinations of these three options were tested as well. Given that the leveraged SlovakBERT model is case-sensitive and diacritic-sensitive, all of these text transformations impact the overall performance.
Findings from the experiments revealed that the best results are obtained when the text is lowercased and both diacritics and punctuation are removed.
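A sketch of this winning preprocessing combination, using only the Python standard library:

```python
# Lowercase + strip diacritics + strip punctuation, as described above.
import string
import unicodedata

def preprocess(text: str) -> str:
    text = text.lower()
    # remove diacritics: decompose characters and drop combining marks
    text = "".join(
        ch for ch in unicodedata.normalize("NFD", text)
        if unicodedata.category(ch) != "Mn"
    )
    # remove punctuation
    return text.translate(str.maketrans("", "", string.punctuation))

print(preprocess("Ako si zmením PIN kód?"))  # -> "ako si zmenim pin kod"
```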
Another aspect investigated during the experimentation phase was the selection of layers for fine-tuning. Options to fine-tune only one quarter, one half, three quarters of the layers, and the whole model were analyzed (with variations including fine-tuning the whole model for the first few epochs and then a selected number of layers further until convergence). The outcome showed that the average improvement achieved by these adjustments to the model’s training process is statistically insignificant. Since there is a desire to keep the pipeline as simple as possible, these alterations did not take place in the final pipeline.
Every experiment trial underwent assessment three to five times to ensure statistical robustness in considering the results.
The best model produced from these experiments had an average accuracy of 77.2% with a standard deviation of 0.012.
Banking-Tailored BERT
Given that our data contains particular banking industry nomenclature, we opted to utilize a BERT model fine-tuned specifically for the banking and finance sector. However, since this model exclusively understands the English language, the data had to be translated accordingly.
For the translation, the DeepL API was employed. First, the training, validation, and test data were translated. Due to the nature of the English language and the translation, no further correction (the preprocessing discussed above) was applied to the text. Subsequently, the model's weights were fine-tuned to enhance performance.
The fine-tuned model demonstrated promising initial results, with accuracy slightly exceeding 70%. Unfortunately, further training and hyperparameter tuning did not yield better results. Other English models were tested as well, but all of them produced similar results. Using a customized English model proved insufficient to achieve superior results, primarily due to translation errors. The translation contained inaccuracies caused by the ’noisiness’ of the data, especially within the test dataset.
Approach 2: LLMs for Intent Classification
As mentioned above, in addition to fine-tuning the SlovakBERT model and other BERT-based models, the use of generative LLMs for intent classification was explored too. Specifically, instruct models were selected for their proficiency in handling instruction prompts and question-answering tasks.
Since there is no open-source instruct model trained exclusively for the Slovak language, several multilingual models were selected: Gemma 7b instruct [6] and Llama3 8b instruct [1]. For comparison, we also include results for OpenAI's closed-source gpt-3.5-turbo model under the same conditions.
Similarly to [4], we use LLM prompts with intent names and descriptions to perform zero-shot prediction. The output is expected to be the correct intent label. Since the full set of intents with their descriptions would inflate the prompt too much, we use our baseline model to preselect only the top 3 intents. Hence, the prompt data for these models was created as follows:
Each prompt includes a sentence (user’s question) in Slovak, four intent options with descriptions, and an instruction to select the most appropriate option. The first three intent options are the ones selected by the baseline model, which has a Top-3 recall of 87%. The last option is always ‘invalid’ and should be selected when neither of the first three matches the user’s question or the input intent is out-of-scope. Consequently, the highest attainable in-scope accuracy in this setting is 87%.
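An illustrative prompt template for this setup; the wording and placeholder intent descriptions are our own sketch, not the production prompts:

```python
# Hypothetical prompt template for the zero-shot setting described above.
# The first three options are filled with the baseline model's top-3
# candidates; the fourth is always 'invalid'.
PROMPT_TEMPLATE = """User's question (in Slovak): "{utterance}"

Choose the option that best matches the question:
1. {intent_1}: {description_1}
2. {intent_2}: {description_2}
3. {intent_3}: {description_3}
4. invalid: none of the above matches, or the question is out of scope

Answer only with the name of the chosen option."""
```

Because the candidates come from the baseline model (Top-3 recall of 87%), that 87% is also the ceiling on in-scope accuracy in this setting.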
Pre-trained LLM Implementation
Initially, a pre-trained LLM implementation was utilized, meaning a given instruct model was leveraged without fine-tuning on our dataset. A prompt was passed to the model in the user’s role, and the model generated an assistant’s response.
To improve the results, prompt engineering was employed too. It included subtle rephrasing of the instruction; instructing the model to answer only with the intent name, or with the number/letter of the correct option; or placing the instruction in the system’s role while the sentence and options were in the user’s role.
Despite these efforts, this approach did not yield better results than SlovakBERT’s fine-tuning. However, it helped us identify the most effective prompt formats for fine-tuning of these instruct models. Also, these steps were crucial in understanding the models’ behaviour and response pattern, which we leveraged in fine-tuning strategies of these models.
LLM Optimization through Fine-Tuning
The prompts that the pre-trained models responded to best were used for fine-tuning these models. Given that LLMs do not require extensive fine-tuning datasets, we utilized our “simple” dataset, as detailed in the Dataset section. The model was then fine-tuned to respond to the specified prompts with the appropriate label names.
Due to the size of the chosen models, parameter efficient training (PEFT) [2] strategy was employed to handle the memory and time issues. PEFT updates only a subset of parameters, while “freezing” the rest, therefore reducing the number of trainable parameters. Specifically, the Low-Rank Adaptation (LoRA) [3] approach was used.
To optimize performance, various hyperparameters were tuned too, including learning rate, batch size, lora alpha parameter of the LoRA configuration, the number of gradient accumulation steps, and chat template formulation.
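A minimal LoRA fine-tuning sketch with the Hugging Face peft library [2, 3], using Llama3 8b instruct as the example; the target modules and hyperparameter values shown are assumptions for illustration, not the tuned configuration:

```python
# Sketch: wrapping a full-precision instruct model with LoRA adapters.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16  # full size (not quantized)
)

lora_config = LoraConfig(
    r=16,                     # rank of the low-rank update matrices
    lora_alpha=32,            # scaling factor (one of the tuned hyperparameters)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```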
Optimizing language models involves high computational demands, necessitating the use of HPC resources to achieve the desired performance and efficiency. The Devana system, with each node containing 4 NVIDIA A100 GPUs with 40 GB of memory each, offers significant computational power. In our case, both models being fine-tuned fit within the memory of one GPU (full size, not quantized) with a maximum batch size of 2.
Although leveraging all 4 GPUs in a node would reduce training time and allow for a larger overall batch size (while maintaining the same batch size per device), for benchmarking purposes and to guarantee consistency and comparability of the results, we conducted all experiments using 1 GPU only.
These efforts led to some improvements in the models' performance. For Gemma 7b instruct, fine-tuning mainly reduced the number of false positives, while for Llama3 8b instruct both metrics (accuracy and the number of false positives) improved. However, neither model outperformed the capabilities of the fine-tuned SlovakBERT model.
With Gemma 7b instruct, some sets of hyperparameters resulted in high accuracy but also a high false positive rate, while others led to lower accuracy and a low false positive rate. Finding a set of hyperparameters with balanced accuracy and false positive rate was challenging. The best-performing configuration achieved an accuracy slightly over 70% with a false positive rate of 4.6%. Compared to the model's performance without fine-tuning, fine-tuning only slightly increased the accuracy but reduced the false positive rate by almost 70%.
With Llama3 8b instruct, the best-performing configuration achieved an accuracy of 75.1% with a false positive rate of 7.0%. Compared to the model’s performance without fine-tuning, fine-tuning significantly increased the accuracy and also halved the false positive rate.
Comparison with a Closed-Source Model
To benchmark our approach against a leading closed-source LLM, we conducted experiments using OpenAI's gpt-3.5-turbo. We employed identical prompt data for a fair comparison and tested both the pre-trained and fine-tuned versions of this model. Without fine-tuning, gpt-3.5-turbo achieved an accuracy of 76%, although it exhibited a considerable false positive rate. After fine-tuning, the accuracy improved to almost 80%, and the false positive rate was considerably reduced.
Results
In our initial strategy, fine-tuning the SlovakBERT model for our task, we achieved an average accuracy of 77.2% with a standard deviation of 0.012, representing an increase of 10 percentage points over the baseline model's accuracy.
Fine-tuning the banking-tailored BERT on the translated dataset yielded a final accuracy slightly under 70%, which outperforms the baseline model but does not surpass the fine-tuned SlovakBERT model.
Subsequently, we experimented with pre-trained (but not fine-tuned on our data) generative LLMs for our task. While these models showed promising capabilities, their performance was inferior to that of SlovakBERT fine-tuned for our specific task. Therefore, we proceeded to fine-tune these models, namely Gemma 7b instruct and Llama3 8b instruct.
The fine-tuned Gemma 7b instruct model demonstrated a final accuracy comparable to the banking-tailored BERT, while the fine-tuned Llama3 8b instruct performed slightly worse than the fine-tuned SlovakBERT. Despite extensive efforts to find a configuration surpassing the capabilities of the SlovakBERT model, these attempts were unsuccessful, establishing the fine-tuned SlovakBERT as our performance benchmark.
All results are displayed in Table 1, including the baseline proprietary model and a closed-source model for comparison.
Table 1: Percentage comparison of models’ in-scope accuracy and out-of-scope false positive rate.
Conclusion
The goal of this study was to find an approach leveraging a pre-trained language model (fine-tuned or not) as a backbone for a chatbot for the banking industry. The data provided for the study consisted of pairs of text and intent, where the text represents a user's (customer's) query and the intent represents the intent it triggers.
Several language models were experimented with, including SlovakBERT, a banking-tailored BERT, and the generative models Gemma 7b instruct and Llama3 8b instruct. After experimentation with datasets, fine-tuning configurations, and prompt engineering, fine-tuning SlovakBERT emerged as the best approach, yielding a final accuracy slightly above 77%, a 10 percentage point increase over the baseline model's accuracy, demonstrating its suitability for our task.
In conclusion, our study highlights the efficacy of fine-tuning pre-trained language models for developing a robust chatbot with accurate intent classification. Moving forward, leveraging these insights will be crucial for further enhancing performance and usability in real-world banking applications.
Research results were obtained with the support of the Slovak National competence centre for HPC, the EuroCC 2 project and Slovak National Supercomputing Centre under grant agreement 101101903-EuroCC 2-DIGITAL-EUROHPC-JU-2022-NCC-01.
References:
[1] AI@Meta. Llama 3 model card. 2024. URL: https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md.
[2] Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, and Sai Qian Zhang. Parameter-efficient fine-tuning for large models: A comprehensive survey, 2024. arXiv:2403.14608.
[3] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models. CoRR, abs/2106.09685, 2021. URL: https://arxiv.org/abs/2106.09685, arXiv:2106.09685.
[4] Soham Parikh, Quaizar Vohra, Prashil Tumbade, and Mitul Tiwari. Exploring zero and fewshot techniques for intent classification, 2023. URL: https://arxiv.org/abs/2305.07157, arXiv:2305.07157.
[5] Matúš Pikuliak, Štefan Grivalský, Martin Konôpka, Miroslav Blšták, Martin Tamajka, Viktor Bachratý, Marián Šimko, Pavol Balážik, Michal Trnka, and Filip Uhlárik. Slovakbert: Slovak masked language model. CoRR, abs/2109.15254, 2021. URL: https://arxiv.org/abs/2109.15254, arXiv:2109.15254.
[6] Gemma Team, Thomas Mesnard, and Cassidy Hardin et al. Gemma: Open models based on gemini research and technology, 2024. arXiv:2403.08295.
Authors
Bibiána Lajčinová – Slovak National Supercomputing Centre
Patrik Valábek – Slovak National Supercomputing Centre; Institute of Information Engineering, Automation, and Mathematics, Slovak University of Technology in Bratislava
Michal Spišiak – nettle, s.r.o., Bratislava, Slovakia
Digital Twins of Society: HPC-Powered Simulations (25 Jun) – Join us for a thought-provoking webinar exploring how artificial intelligence and multi-agent simulation technologies are helping researchers understand and predict complex societal dynamics. This session brings together leading experts in cultural cybernetics, cognitive modeling, and national-scale digital twin simulations.
Strengthening EuroCC ties: NCC Slovakia visits FCCN in Lisbon (24 Jun) – On June 24th, a representative of NCC Slovakia, Božidara Pellegrini, met with colleagues from NCC Portugal at the headquarters of FCCN – Fundação para a Ciência e a Tecnologia in Lisbon.
HPC webinar for SMEs: Examples of real use of HPC in Poland, the Czech Republic and Slovakia
An informative webinar was held on September 4, highlighting the potential of high-performance computing through real-life success stories and engaging projects implemented with the support of the National Competence Centers for HPC. In addition to examples implemented in the Slovak NCC, the webinar also presented the expertise and experience of neighboring competence centers in the Czech Republic and Poland.
Michal Pitoňák shared experiences from four successful HPC use cases, including the transfer and optimization of CFD computational workflows in the HPC environment, anomaly detection in time series to prevent gambling using deep learning, entity identification for address extraction from transcribed interviews using synthetic data, and measurement of structural parameters of capsules using AI and ML techniques. Tomáš Karásek presented examples of using artificial intelligence to solve engineering problems focused on energy and transportation. Szymon Mazurek introduced the SpeakLeash initiative, a community-driven project to develop a national large language model (LLM) ecosystem in Poland.
ARE YOU AN ENTREPRENEUR? HOW DO YOU DISCOVER THAT YOUR COMPANY IS UNDER CYBER ATTACK?
In today's digital era, when technology has become an essential part of business, cyber attacks are becoming an alarming reality for businesses of all sizes. With the increasing number of threats and the sophistication of attacks, it is imperative that companies invest in employee training and securing their systems.
As the number of digital devices and online services increases, so does the risk of cyber attacks, which can have serious consequences for individuals and organizations. Prevention and early detection are key to protecting sensitive data and reducing the risk of financial losses. Various forms of threats such as phishing, ransomware and DDoS attacks are becoming increasingly sophisticated and require a proactive approach to data protection.
THE MOST COMMON CYBER ATTACKS
Cyber attacks usually take place in several stages. The first is preparation and reconnaissance, where attackers gather information about the target. "This can be an analysis of public profiles on social networks or company websites, or finding out technical specifications and systems. After collecting the necessary information, the attackers try to gain access to the company's systems. This can include phishing emails that contain malicious links or attachments, but also exploiting vulnerabilities in software. Once attackers have access, they start stealing data, installing malware, or directly attacking servers. In the case of ransomware, the data is encrypted and the attackers demand a ransom," explains Ondrej Kreheľ, co-founder of the Qubit Conference®, which is dedicated to organizing conferences on cyber security.
DDoS attacks are aimed at overloading servers and networks, causing service outages. They can result in lost productivity and financial losses for organizations.
HOW TO RECOGNIZE AND DETECT A CYBER ATTACK
Attackers are very clever and persistent, so it is crucial for businesses to have mechanisms in place to detect and prevent cyber threats. “Employees should be regularly trained in cyber security to recognize phishing emails and other fraudulent techniques. Network activity should also be monitored and analyzed. Implementing systems to monitor network traffic can help identify unusual activity that could indicate an attempted attack. Regular updates and backups will also help. Keeping software and systems up-to-date and regularly backing up data are basic steps to protect against cyber attacks," says Ondrej Kreheľ, a top cyber security expert who works in New York.
BLACKMAIL AND RANSOM ARE NOT TABOO EVEN IN SLOVAKIA

The theft or encryption of sensitive company data and the subsequent ransom demand is nothing new in Slovakia either. "It also happens to Slovak companies, and restoring systems and getting operations running again can cost millions of dollars. So the question is not whether it will ever happen to you, but when you will become a victim of a cyber attack," says O. Kreheľ. He also reminds us that it is important to develop an incident response plan, which can help companies respond quickly to attacks and minimize their impact.
THE CONFERENCE OFFERS A WIDE RANGE OF EXPERTS
Cybersecurity is a dynamic and ever-evolving process, and for entrepreneurs and businesses, it is essential to address it. Lectures by experts and trainings with the best in the field are offered by the Qubit Conference®, which takes place on November 13-14, 2024 at the Congress & Wellness Hotel Chopok in Jasná.
Devana: Call for Standard HPC access projects 3/24
The Computing Centre of the Slovak Academy of Sciences and the National Supercomputing Centre are opening this year's second Call for Projects for Standard Access to HPC, 3/24. Project applications may be submitted continuously; there are three standard closing dates during the year, after which the submitted applications are evaluated. Applications for access are submitted through the user portal register.nscc.sk.
Standard access to high-performance computing resources is open to all areas of science and research, especially for larger-scale projects. These projects should demonstrate excellence in the respective fields and a clear potential to bring innovative solutions to current social and technological challenges. In the application, it is necessary to demonstrate the efficiency and scalability of the proposed calculation strategies and methods in the HPC environment. The necessary data on the performance and parameters of the considered algorithms and applications can be obtained within the Testing Access.
Allocations are awarded for one (1) year with the option to apply for extension, if necessary. Access is free of charge, provided that all requirements defined in the Terms of reference are met. Submitted projects are evaluated from a technical point of view by the internal team of CC SAS and SK NSCC, and the quality of the scientific and research part is always evaluated by two independent external reviewers.
Call opening date: 2.9.2024
Call closing date: 1.10.2024, 17:00 CET
Communication of allocation decision: up to 2 weeks from call closing
Start of the allocation period for awarded projects: no later than 15.10.2024
Eligible Researchers

Scientists and researchers from Slovak public universities and the Slovak Academy of Sciences, as well as from public and state administration organizations and private enterprises registered in the Slovak Republic, can apply for standard access to HPC. Access is provided exclusively for civil and non-commercial open-science research and development. Interested parties from private companies should first contact the National Competence Centre for HPC.
Awarded projects commit to the following:
Final report within 2 months from the end of the project.
Peer-reviewed and other publications in domestic and foreign scientific periodicals with acknowledgments in the pre-defined wording, reported through the user portal.
Active participation in the Slovak HPC conference organized by the coordinator of this call (poster, other contribution).
Participation in dissemination activities of the coordinator (interview, article in the HPC magazine, etc.).
Invitation to Qubit Conference® Slovakia 2024: Don't miss the unique opportunity to participate in the top event in the field of cyber security
Qubit Conference® Slovakia 2024 is approaching and we are happy to invite you to this prestigious event, which will take place on November 13 and 14, 2024 in the wonderful surroundings of the Congress & Wellness Hotel Chopok in Jasná. This is a unique opportunity for professionals from various industries to meet, share their experiences and gain the latest knowledge in the field of cyber security.
What awaits you?
The conference offers a rich and varied program that includes more than 30 renowned speakers and numerous panel discussions and training sessions. The first day will be dedicated to five key panel discussions, which will focus on the most important challenges and current topics in the field of cyber security:
The Joys and Sorrows of Everyday Cyber Security Operations: How to Manage Cyber Security's Everyday Challenges and Contingencies?
Don't be afraid of "GRC": Governance, risk and compliance - how to ensure that your organizations meet all regulatory requirements while minimizing risks?
Here's how we solved it: Cyber incident resolution case studies straight from the experts.
Taking care of people in IT from A to Z: How to effectively recruit, develop and retain top IT professionals?
Future threats: Looking at security through the window of the future - what new threats are emerging and how to prepare for them?
The first conference day will end with networking where you will find a pleasant community atmosphere, colleagues from the industry and the opportunity to establish new business cooperation. The evening culminates in a bowling tournament, which promises great fun and an opportunity for informal meetings with other participants.
Second day: Intensive training with experts
The second day of the conference will focus on practical trainings, which will be led by experienced experts. Participants have the opportunity to choose from three full-day training sessions:
Solving incidents using open source tools - This training, led by an expert from ESET, Ladislav Bač, will focus on the effective use of open source tools in solving cyber incidents.
Code Strong - mental resilience in times of chaos - Zuzana Reľovská from Wellbeing will teach you how to develop mental resilience in times of constant change and cyber threats.
We manage cyber risks quantitatively - Michal Hanus from Cyber Rangers will present methods of quantitative assessment and management of cyber risks.
These trainings will provide participants with a deeper understanding and practical skills that they can immediately apply in their daily work.
Why should you be part of the Qubit Conference® Slovakia 2024?
Qubit Conference® Slovakia 2024 is not only about lectures and trainings. It is a platform where cybersecurity experts from various industries, including finance, energy, manufacturing, telecommunications, pharmaceuticals, critical infrastructure, IT and government institutions, meet. The conference offers a unique opportunity for networking and exchanging experiences with colleagues and specialists from the entire region.
In addition, the Qubit Conference® has gradually become a key event in the field of cyber security in Central Europe, welcoming more than 200 participants every year. It's a place to share the latest technologies and strategies that are shaping the future of digital security.
Register Today!
Do not miss the opportunity to participate in this important event. Register for Qubit Conference® Slovakia 2024 as soon as possible and secure your place among cyber security experts. You can find more information about the conference and the possibility of registration on the official website of the event.
We look forward to your participation and believe that the Qubit Conference® Slovakia 2024 will bring you new knowledge, inspiration and valuable contacts that will move you and your organization forward in the field of cyber security.
NCC for HPC and IT Valley Košice Form Strategic Partnership

The National Competence Center for High-Performance Computing (HPC) has established a new strategic partnership with IT Valley Košice under the HPC Ambassador program. This collaboration aims to strengthen technological innovation and development in eastern Slovakia, contributing to the growth of the innovation ecosystem across the country.
Goals and Vision
The partnership focuses on promoting the adoption of HPC technologies among members of IT Valley Košice including companies, academic institutions, and research organizations. Through this effort, we aim to create an environment that fosters the development of talent and innovative companies capable of competing on the global stage.
IT Valley Košice strives to build a technologically advanced business environment in eastern Slovakia, and this partnership significantly contributes to establishing the region as a center of excellence for business, research, and education.
Practical Collaboration
The National Competence Center for HPC will provide relevant information, training, and services to IT Valley Košice members, while IT Valley Košice will promote these opportunities and help identify organizations ready to leverage HPC technologies. Members will gain access to top-notch support and expert consultations.
Additionally, IT Valley Košice will play a key role in facilitating knowledge and technology transfer between academia and the IT industry. Educational events, workshops, and competitions will be organized to enhance the region's innovation potential. We look forward to successful collaboration with IT Valley Košice and to projects that will support the innovative and entrepreneurial environment in Slovakia.
How High-Performance Computing Can Advance Small and Medium Enterprises

In today's rapidly changing market, it is crucial for small and medium enterprises (SMEs) to effectively leverage new technologies to stay competitive. One of the innovative solutions that can significantly advance SMEs is High-Performance Computing (HPC).
If you are interested in how HPC can improve your business, don't hesitate to join a special online event organized by the National Competence Centre for HPC. The event will take place online on September 4, 2024. Registration is mandatory.
Why You Shouldn't Miss It?
The event will focus on the possibilities of using HPC in Central Europe and provide practical information and examples of how even smaller companies can use this technology to enhance their business processes. HPC can help speed up product development, optimize manufacturing processes, improve service quality, and reduce costs, which is especially important for SMEs looking for ways to gain a competitive edge.
What Can You Look Forward To?
During the event, you will have the opportunity to hear real case studies from various industries that demonstrate how HPC has helped small and medium-sized businesses achieve their goals. You'll learn how engineering companies use HPC for simulations and optimization of design proposals or how pharmaceutical companies utilize this technology to accelerate the development of new drugs.
Collaboration and Support
Another important topic will be the collaboration between SMEs and technology centers. You will learn how these organizations can provide the necessary infrastructure and expertise that SMEs need to utilize HPC. Experts from the National Competence Centers for HPC in Central Europe will also present opportunities to access modern computing resources that would otherwise be financially inaccessible.
Register Today!
Don't miss this unique opportunity and join the event on September 4, 2024, which will take place online. It's a chance to gain valuable information for free, make new connections, and discover how HPC can take your business to the next level. Registration is open, so don't hesitate and secure your spot today!
We also have a giveaway for event participants!
Get inspired and find out how you can gain a competitive advantage in the global market with HPC!
Leveraging LLMs for Efficient Religious Text Analysis
The analysis and research of texts with religious themes have historically been the domain of philosophers, theologians, and other social sciences specialists. With the advent of artificial intelligence, such as large language models (LLMs), this task takes on new dimensions. These technologies can be leveraged to reveal various insights and nuances contained in religious texts — interpreting their symbolism and uncovering their meanings. This acceleration of the analytical process allows researchers to focus on specific aspects of texts relevant to their studies.
One possible research task in the study of texts with religious themes involves examining the works of authors affiliated with specific religious communities. By comparing their writings with the official doctrines and teachings of their denominations, researchers can gain deeper insights into the beliefs, convictions, and viewpoints of the communities shaped by the teachings and unique contributions of these influential authors.
This report proposes an approach utilizing embedding indices and LLMs for efficient analysis of texts with religious themes. The primary objective is to develop a tool for information retrieval, specifically designed to efficiently locate relevant sections within documents. The identification of discrepancies between the retrieved sections of texts from specific religious communities and the official teaching of the particular religion the community originates from is not part of this study; this task is entrusted to theological experts.
This work is a joint effort of Slovak National Competence Center for High-Performance Computing and the Faculty of Theology at Trnava University. Our goal is to develop a tool for information retrieval using LLMs to help theologians analyze religious texts more efficiently. To achieve this, we are leveraging resources of HPC system Devana to handle the computations and large datasets involved in this project.
Dataset
The texts used for the research in this study originate from the religious community known as the Nazareth Movement (commonly referred to as "Beňovci"), which began to form in the 1970s. The movement, which some scholars identify as having sect-like characteristics, is still active today, in a reduced and changed form. Its founder, Ján Augustín Beňo (1921 - 2006), was a secretly ordained Catholic priest during the totalitarian era. Beňo encouraged members of the movement to actively live their faith through daily reading of biblical texts and applying them in practice through specific resolutions. The movement spread throughout Slovakia, with small communities existing in almost every major city. It also spread to neighboring countries such as Poland, the Czech Republic, Ukraine, and Hungary. In 2000, the movement included approximately three hundred married couples, a thousand children, and 130 priests and students preparing for priesthood. The movement had three main goals: radical prevention in education, fostering priests who could act as parental figures to identify and nurture priestly vocations in children, and the production and distribution of samizdat materials needed for catechesis and evangelization.
27 documents with texts from this community are available for research. These documents, which significantly influenced the formation of the community and its ideological positions, were reproduced and distributed during the communist regime in the form of samizdats — literature banned by the communist regime. After the political upheaval, many of them were printed and distributed to the public outside the movement. Most of the analyzed documents consist of texts intended for ”morning reflections” — short meditations on biblical texts. The documents also include the founder’s comments on the teachings of the Catholic Church and selected topics related to child rearing, spiritual guidance, and catechesis for children.
Although the documents available to us contain a few duplications, this did not pose a problem for the information retrieval task and thus remains unaddressed in this report. All of the documents are written exclusively in the Slovak language.
One of the documents is annotated for test purposes by experts from the partner faculty, who have long been studying the Nazareth Movement. By annotations, we refer to text parts labeled as belonging to one of the five classes, where these classes represent five topics, namely
Directive obedience
Hierarchical upbringing
Radical adoption of life model
Human needs fulfilled only in religious community and family
Strange/Unusual/Intense
Additionally, each of these topics is supplemented with a set of queries designed to test the retrieval capabilities of our solution.
Strategy/Solution
There are multiple strategies appropriate for solving this task, including text classification, topic modelling, retrieval-augmented generation (RAG), and fine-tuning of LLMs. However, the theologians' requirement is to identify specific parts of the text for detailed analysis, necessitating the retrieval of exact wording. Therefore, a choice was made to leverage information retrieval. This approach differs from RAG, which typically incorporates both information retrieval and text generation components, by focusing solely on retrieving textual data, without the additional step of generating new content.
Information retrieval leverages LLMs to transform complex data, such as text, into a numerical representation that captures the semantic meaning and context of the input. This numerical representation, known as an embedding, can be used to conduct semantic searches by analysing the positions and proximity of embeddings within a multi-dimensional vector space. Given a query, the system retrieves relevant parts of the text by measuring the similarity between the query embedding and the text embeddings. This approach does not require any fine-tuning of the existing LLMs, so the models can be used without any modification and the workflow remains quite simple.
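As a sketch of this retrieval step, assuming chunk embeddings have been precomputed and stored as a NumPy matrix (any of the embedding models discussed below could produce them):

```python
# Cosine-similarity retrieval over precomputed chunk embeddings.
import numpy as np

def top_k_chunks(query_embedding, chunk_embeddings, chunks, k=5):
    """Return the k chunks whose embeddings are most similar to the query."""
    q = query_embedding / np.linalg.norm(query_embedding)
    m = chunk_embeddings / np.linalg.norm(chunk_embeddings, axis=1, keepdims=True)
    similarities = m @ q                       # cosine similarity per chunk
    best = np.argsort(similarities)[::-1][:k]  # indices of the k best matches
    return [(chunks[i], float(similarities[i])) for i in best]
```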
Model choice
Four pre-trained models were selected to produce vector representations of the chunked text: Slovak-BERT, the E5 model, OpenAI's text-embedding-3-small, and the BGE M3 embedding model. Their specific contributions will be discussed in the following parts of the study.
Data preprocessing
The first step of data preprocessing involved text chunking. The primary reason for this step was the religious scholars' requirement to retrieve paragraph-sized chunks; besides, documents needed to be split into smaller chunks anyway due to the limited input lengths of some LLMs. For this purpose, the Langchain library was utilized. It offers hierarchical chunking that produces overlapping chunks of a specified length (with a desired overlap) to ensure that context is preserved. Chunks with lengths of 300, 400, 500, and 700 symbols were generated. Subsequent preprocessing steps included removal of diacritics, case normalization according to the requirements of the models, and stopword removal. Stopword removal is a common practice in natural language processing tasks: while some models may benefit from the exclusion of stopwords to improve the relevancy of retrieved chunks, others may take advantage of retaining stopwords to preserve contextual information essential for understanding the text.
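A sketch of the chunking step with Langchain's recursive splitter; the overlap value and file name are our illustrative choices:

```python
# Hierarchical (recursive) chunking with overlap, as described above.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,    # symbols per chunk (300/400/500/700 were tested)
    chunk_overlap=50,  # overlap preserves context across chunk boundaries
)

with open("document.txt", encoding="utf-8") as f:
    chunks = splitter.split_text(f.read())
```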
Vector Embeddings
Vector embeddings were created from text chunks using selected pre-trained language models.
For the Slovak-BERT model, generating an embedding involves running the model without any additional layers for inference and then using the embedding of the first token, which aggregates the semantic meaning of the chunk, as the context embedding. The other models produce embeddings in the required form, so no further postprocessing was needed.
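A minimal sketch of this embedding extraction, assuming the public gerulata/slovakbert checkpoint:

```python
# First-token embedding from Slovak-BERT as the chunk representation.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gerulata/slovakbert")
model = AutoModel.from_pretrained("gerulata/slovakbert")

def embed(chunk: str) -> torch.Tensor:
    inputs = tokenizer(chunk, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[0, 0]  # embedding of the first token
```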
In the subsequent results section, the performance of all created embedding models will be analyzed and compared based on their ability to capture and represent the semantic content of the text chunks.
Results
Prior to conducting quantitative tests, all embedding indices underwent preliminary evaluation to determine the level of understanding of the Slovak language and the specific religious terminology by the selected LLMs. This preliminary evaluation involved subjective judgement of the relevance of retrieved chunks.
These tests revealed that the E5 model embeddings exhibit limited effectiveness on our data. When retrieving for a specific query, the retrieved chunks contained most of the keywords used in the query but not the context of the query. One explanation could be that this model prioritizes word-level matches over nuanced context in the Slovak language, possibly because its training data for Slovak was less extensive or less contextually rich, leading to weaker performance. However, these observations are not definitive conclusions but rather hypotheses based on current, limited results. A decision was made not to further evaluate the embedding indices leveraging E5 embeddings, given their inability to effectively capture the nuances of the religious texts.

On the other hand, the abilities of the Slovak-BERT model, based on the relatively simple RoBERTa architecture, exceeded expectations. Moreover, the performance of text-embedding-3-small and BGE M3 embeddings met expectations, as the first, subjectively evaluated test demonstrated a very good grasp of the context, proficiency in the Slovak language, and understanding of the nuances within the religious texts.
Therefore, quantitative tests were performed only on embedding indices utilizing Slovak-BERT, OpenAI’s text-embedding-3-small and BGE M3 embeddings.
Given the problem specification and the nature of test annotations, there arises a potential concern regarding the quality of the annotations. It is possible that some text parts were misclassified as there may be sections of text that belong to multiple classes. This, combined with the possibility of human error, can affect the consistency and accuracy of the annotations.
With this consideration in mind, we have opted to focus solely on recall evaluation. By recall, we mean the proportion of correctly retrieved chunks out of the total number of annotated chunks, regardless of the fraction of false positive chunks. Recall will be evaluated for every topic and for every length-specific embedding index for all selected LLMs.
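In code, this per-topic recall reduces to a simple set operation; the sketch below assumes retrieved and annotated chunks are identified by their indices:

```python
# Recall as defined above: the fraction of annotated chunks that were
# retrieved, ignoring false positives.
def topic_recall(retrieved_ids: set[int], annotated_ids: set[int]) -> float:
    if not annotated_ids:
        return 0.0
    return len(retrieved_ids & annotated_ids) / len(annotated_ids)
```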
Moreover, the provided test queries might also reflect the complexity and interpretative nature of religious studies. For example, consider the query "God's will" for the topic Directive obedience. While a careful reader understands how this query relates to the given topic, it might not be as clear to a language model. Therefore, apart from evaluating with the provided queries, another evaluation was conducted using queries acquired through contextual augmentation. Contextual/query augmentation is a prompt engineering technique for enhancing text data quality, well-documented in the research literature. It involves prompting a language model to generate a new query based on the initial query and other contextual information, in order to formulate a better query. The language model used to generate queries through this technique was GPT-3.5, and these queries will be referred to as "GPT queries" throughout the rest of the report.
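A sketch of this augmentation step, assuming the OpenAI chat API; the prompt wording is our own illustration, not the exact prompt used in the study:

```python
# Query augmentation with GPT-3.5: ask the model to reformulate a test
# query so that it captures the topic's meaning more explicitly.
from openai import OpenAI

client = OpenAI()

def augment_query(query: str, topic: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": (
                f'Rewrite the search query "{query}" for the topic "{topic}" '
                "so that it expresses the topic's meaning more explicitly. "
                "Answer with the rewritten query only."
            ),
        }],
    )
    return response.choices[0].message.content

# e.g. augment_query("God's will", "Directive obedience")
```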
Slovak-BERT embedding indices
Recall evaluation for embedding indices utilizing Slovak-BERT embeddings, for four different chunk sizes with and without stopword removal, is presented in Figure 1. The evaluation covers each topic listed in the Dataset section and includes both original queries and GPT queries.
We observe that GPT queries generally yield better results compared to the original queries, except for the last two topics, where both sets of queries produce similar results. It is also apparent that Slovak-BERT-based embeddings benefit from stopword removal in most cases. The highest recall was achieved for the third topic, Radical adoption of life model, with a chunk size of 700 symbols and removed stopwords, reaching more than 47%. In contrast, the worst results were observed for the topic Strange/Unusual/Intense, where neither the original nor the GPT queries successfully retrieved relevant parts; in some cases, none of the relevant parts were retrieved at all.
Figure 1: Recall values obtained for all topics using both original and GPT queries, across various chunk sizes of embeddings generated using the Slovak-BERT model. Embedding indices marked as +SW include stopwords, while -NoSW indicates stopwords were removed.
OpenAI’s text-embedding-3-small embedding indices
Similar to the evaluation for Slovak-BERT embedding indices, evaluation charts for embedding indices utilizing OpenAI's text-embedding-3-small embeddings are presented in Figure 2. The recall values are generally much higher than those observed with Slovak-BERT embeddings. As with the previous results, GPT queries produce better outcomes. We can observe a subtle trend in the dependency between recall and chunk size: longer chunk sizes generally yield higher recall values.
An interesting observation can be made for the topic Radical adoption of life model. When using the original queries, hardly any relevant results were retrieved. However, when using GPT queries, recall values were much higher, reaching almost 90% for chunk sizes of 700 symbols.
Regarding the removal of stopwords, its impact on embeddings varies. For topics 4 and 5, stopwords removal proves beneficial. However, for the other topics, this preprocessing step does not offer advantages.
Topics 4 and 5 exhibited the weakest performance among all topics. This may be due to the nature of the queries provided for these topics, which are quotes or full sentences, compared to the queries for other topics, which are phrases, keywords, or expressions. It appears that this model performs better with the latter type of queries. On the other hand, since the queries for topics 4 and 5 are full sentences, the embeddings benefit from stopword removal, as it probably helps in handling the context of sentence-like queries.
Topic 4 is very specific and abstract, while topic 5 is very general, making it understandable that capturing this topic in queries is challenging. The specificity of topic 4 might require more nuanced test queries, as the provided test queries probably did not contain all nuances of a given topic. Conversely, the general nature of topic 5 might benefit from a different analytical approach. Methods like Sentiment Analysis could potentially grasp the strange, unusual, or intense mood in relation to the religious themes analysed.
Figure 2: Recall values assessed for all topics using both original and GPT queries, utilizing various chunk sizes of embeddings generated with the text-embedding-3-small model. Embedding indices labeled +SW include stopwords, and those labeled -NoSW have stopwords removed.
BGE M3 embedding indices
Evaluation charts for embedding indices utilizing BGE M3 embeddings are presented in Figure 3. The recall values fall between those of Slovak-BERT and OpenAI's text-embedding-3-small embeddings. While they do not always reach the recall of OpenAI's embeddings, BGE M3 embeddings show competitive performance, which is particularly notable given that the model is open source, whereas OpenAI's embeddings are accessible only through an API, which may pose a problem for data confidentiality.
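A minimal sketch of producing dense BGE M3 embeddings with the FlagEmbedding package follows the model's published usage examples; the batch size, maximum length and placeholder chunks below are illustrative:

```python
from FlagEmbedding import BGEM3FlagModel

# Text chunks prepared earlier in the pipeline (placeholder examples).
chunks = ["Prvý úryvok textu ...", "Druhý úryvok textu ..."]

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

# encode() returns a dict; "dense_vecs" holds the dense embeddings used
# for cosine-similarity retrieval.
output = model.encode(chunks, batch_size=12, max_length=1024)
dense_vectors = output["dense_vecs"]
```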
With these embeddings, we observe the same phenomenon as with OpenAI's text-embedding-3-small embeddings: shorter, phrase-like queries work better than quote-like queries. Accordingly, recall values are higher for the first three topics.
Stopwords removal appears to be beneficial in most cases, particularly for the last two topics.
Figure 3: Recall values for all topics using original and GPT queries, with embeddings of different chunk sizes produced by the BGE M3 model. Indices labeled as +SW contain stopwords, while -NoSW indicates their removal.
Conclusion
This paper presents an approach to the analysis of text with religious themes using numerical text representations known as embeddings, generated by three selected pre-trained language models: Slovak-BERT, OpenAI's text-embedding-3-small and the BGE M3 embedding model. These models were selected after evaluating that their proficiency in the Slovak language and religious terminology was sufficient for the information retrieval task on the given set of documents.
Challenges related to the quality of test queries were addressed with a query augmentation technique. This approach helped formulate appropriate queries, resulting in more relevant retrieval of text chunks and capturing the nuances of the topics that interest the theologians.
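As an illustration of this augmentation step, one could prompt a GPT model to expand each original query; the prompt wording and the model name below are our assumptions, not necessarily those used in the study:

```python
from openai import OpenAI

client = OpenAI()

def augment_query(original_query: str, topic: str) -> str:
    """Ask a chat model to rephrase and enrich a theologian's query so it
    better matches the vocabulary of the analysed documents."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model choice
        messages=[
            {
                "role": "system",
                "content": (
                    "You rewrite search queries in Slovak. Expand the query "
                    "with synonyms and related religious terminology."
                ),
            },
            {
                "role": "user",
                "content": f"Topic: {topic}\nOriginal query: {original_query}",
            },
        ],
    )
    return response.choices[0].message.content
```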
Evaluation results proved the effectiveness of the embeddings produced by these models, particularly OpenAI's text-embedding-3-small, which exhibited strong contextual understanding and linguistic proficiency. The recall achieved with this model varied depending on the topic and the queries used, with the highest values reaching almost 90% for the topic Radical adoption of life model when using GPT queries and a chunk length of 700 symbols. Generally, text-embedding-3-small performed best with the longest chunk lengths studied, showing a trend of recall increasing with chunk length. The topic Strange/Unusual/Intense had the lowest recall, possibly due to the uncertainty in its specification.
For the Slovak-BERT embedding indices, recall values were somewhat lower, but still impressive given the relative simplicity of this language model. Better results were achieved using GPT queries, with the best recall of 47.1% for the topic Radical adoption of life model at a chunk length of 700 symbols, with embeddings created from chunks with stopwords removed. Overall, this embedding model benefited the most from the stopwords removal preprocessing step.
As for the BGE M3 embeddings, the results were impressive, achieving high recall, though not as high as OpenAI's embeddings. Considering that BGE M3 is an open-source model, these results are remarkable.
These findings highlight the potential of leveraging LLMs for specialized domains such as the analysis of texts with religious themes. Future work could explore connections between text chunks by clustering their embeddings to discover hidden associations and inspirations of the texts' authors. For the theologians, future work lies in examining the retrieved text parts to identify deviations from the official teaching of the Catholic Church, shedding light on the movement's interpretations and insights.
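As a rough sketch of that clustering idea, the chunk embeddings (here reusing the variable from the BGE M3 sketch above) could simply be grouped with k-means; the number of clusters is arbitrary:

```python
import numpy as np
from sklearn.cluster import KMeans

embeddings = np.asarray(dense_vectors)  # chunk embeddings from any of the models
kmeans = KMeans(n_clusters=8, random_state=0).fit(embeddings)

# Group chunk indices by cluster to inspect thematically related passages.
for label in range(kmeans.n_clusters):
    members = np.where(kmeans.labels_ == label)[0]
    print(f"cluster {label}: {len(members)} chunks")
```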
Acknowledgment
Research results were obtained with the support of the Slovak National competence centre for HPC, the EuroCC 2 project and Slovak National Supercomputing Centre under grant agreement 101101903-EuroCC 2-DIGITAL-EUROHPC-JU-2022-NCC-01.
Computational resources were procured in the national project National competence centre for high performance computing (project code: 311070AKF2) funded by European Regional Development Fund, EU Structural Funds Informatization of society, Operational Program Integrated Infrastructure.
Bibiána Lajčinová – Slovak National Supercomputing Centre
Jozef Žuffa – Faculty of Theology, Trnava University
Milan Urbančok – Faculty of Theology, Trnava University
Digital Twins of Society: HPC-Powered Simulations
25 Jun – Join us for a thought-provoking webinar exploring how artificial intelligence and multi-agent simulation technologies are helping researchers understand and predict complex societal dynamics. This session brings together leading experts in cultural cybernetics, cognitive modeling, and national-scale digital twin simulations.
Strengthening EuroCC ties: NCC Slovakia visits FCCN in Lisbon
24 Jun – On June 24th, a representative of NCC Slovakia, Božidara Pellegrini, met with colleagues from NCC Portugal at the headquarters of FCCN – Fundação para a Ciência e a Tecnologia in Lisbon.
The Starmus Science Festival, known for its unique combination of scientific lectures, music, and art, was held this year in May in Bratislava. The festival attracted numerous science enthusiasts who enjoyed inspiring presentations and discussions with leading scientists from around the world.
The festival was not just about passively watching lectures. Participants had the opportunity to engage directly in various scientific demonstrations and interactive activities that provided practical examples of scientific principles. The demonstration booths were popular spots where visitors could try out different experiments and technologies.
In the panel discussions, experts also addressed ethical questions and the societal impact of new technologies. These discussions provided a deeper understanding of the challenges we face in connection with rapid technological advancement.
One of the most intriguing lectures of the festival was given by Neil Lawrence, titled "What Makes Us Unique in the Age of AI." Neil Lawrence, a renowned scientist in the field of artificial intelligence, covered a wide range of topics related to our uniqueness in an era of rapid AI development. He discussed how we can preserve human values and abilities at a time when artificial intelligence is increasingly penetrating our lives. The lecture was inspiring and provided deep insights into the future of human-technology interaction.
Neil Lawrence spoke about the importance of an interdisciplinary approach in science and technological progress. His presentation emphasized how crucial collaboration between different fields is to achieving significant scientific discoveries. He pointed out that the combination of various scientific disciplines can lead to new and groundbreaking insights.
Another part of the lecture focused on the latest discoveries in space. Lawrence used visualizations and animations to explain complex concepts in a simple way, which appealed to the general audience. He also addressed the history of space exploration, illustrating his points with numerous historical photographs and videos that gave his presentation an authentic and informative character.
Another significant topic was breakthroughs in genetics and biotechnology. Lawrence explained how these new technologies have the potential to treat previously incurable diseases and improve the quality of life for many people. In discussions about artificial intelligence, he emphasized its ability to transform various sectors, including medicine, transportation, and education, and stressed the importance of ethics and responsibility in the development and implementation of AI technologies.
Another crucial point in his lecture was the advancements in renewable energy sources and sustainability. He highlighted the importance of investing in solar and wind energy and innovative technologies that can help reduce the carbon footprint. The discussion also focused on global initiatives and cooperation between different countries to address climate change. Lawrence also focused on the latest technologies in medicine, particularly AI, explaining how artificial intelligence helps doctors diagnose diseases more quickly and accurately. He also spoke about how new technologies enable personalized medicine tailored to individual patients.
Another topic was quantum technology in computers and communication. He emphasized how quantum computers can revolutionize various sectors, including medicine and finance, by allowing faster and more efficient information processing. He also discussed the importance of oceans for our planet and the need to protect them. He warned about threats like pollution and climate change that endanger the marine ecosystem and stressed the need for international cooperation in ocean protection.
Finally, he addressed the issue of space debris and its impact on future space missions. He discussed technologies and strategies being developed to address the growing problem of debris in orbit. The final part of the lecture focused on the challenges and benefits of integrating neuroscience and artificial intelligence. He discussed how AI can help understand and treat neurological disorders and how studying the brain can contribute to the development of smarter AI systems.
Lawrence also talked about ecological innovations and their potential to change the way we live and work. He discussed how new technologies can contribute to sustainable development and reduce the negative impact on the environment. He also addressed the development of space technologies and their potential to improve life on Earth. He spoke about how space research contributes to progress in areas such as material sciences, energy, and communication.
The last discussion focused on the importance of education in science and technology for future generations. Lawrence emphasized the need for investments in educational programs that promote critical thinking and innovative solutions to global problems. He also discussed virtual reality (VR) technology and its applications in education and healthcare. He explained how VR can enhance learning by providing immersive and interactive environments and how it can help patients in rehabilitation and therapy.
His closing remarks again emphasized the importance of interdisciplinary research and collaboration between scientific fields, explaining how combining different kinds of expertise can lead to innovative solutions to complex global problems such as climate change and health crises like pandemics.
Scientific and Artistic Projects
The festival also brought a discussion about the future of art and science. Collaboration between artists and scientists was showcased through various multimedia projects that demonstrated how these two worlds can come together to create innovative and inspiring works.
The Starmus festival is not only a celebration of science but also a platform for sharing knowledge and inspiration. This year's edition in Bratislava once again highlighted the importance of dialogue between science and the public. It allowed scientists, artists, and the general public to meet, discuss, and jointly seek solutions to current global challenges. We are already looking forward to the next editions and the new discoveries they will bring.
About Starmus Festival
Starmus is a festival of science, art, and music created by Garik Israelian, PhD., an astrophysicist from the Institute of Astrophysics of the Canary Islands (IAC), and Sir Brian May, PhD., an astrophysicist and the lead guitarist of the iconic rock band Queen. It consists of presentations by astronauts, cosmonauts, Nobel laureates, thinkers, and prominent figures from various scientific and musical fields. Starmus brings these exceptional people together to share their knowledge and experiences and to jointly seek answers to humanity's big questions.
Stephen Hawking Medal for Science Communication
In 2015, Stephen Hawking and Alexei Leonov, along with Brian May, created the Stephen Hawking Medal for Science Communication, awarded to individuals and teams for significant contributions to science communication. Previous recipients of the Stephen Hawking Medal include Dr. Jane Goodall, Elon Musk, Neil deGrasse Tyson, Brian Eno, Hans Zimmer, and the documentary Apollo 11.
This year's Starmus festival brought a wealth of new knowledge and inspiration that will resonate with participants and the general public for a long time. The festival once again confirmed its important role in promoting science, art, and education worldwide.