Improving scientific excellence and creativity in combating disinformation with artificial intelligence and language technologies

Project information

DisAI

Grant agreement ID: 101079164

DOI

10.3030/101079164

Project closed

EC signature date 8 July 2022

Start date 1 December 2022

End date 30 November 2025

Funded under

Widening participation and spreading excellence

Total cost

€ 1 499 750,00

EU contribution

€ 1 499 750,00

Coordinated by

KEMPELENOV INSTITUT INTELIGENTNYCH TECHNOLOGII
Slovakia

Periodic Reporting for period 1 - DisAI (Improving scientific excellence and creativity in combating disinformation with artificial intelligence and language technologies)

Reporting period: 2022-12-01 to 2025-11-30

The Slovak research and innovation ecosystem is characterised by long-standing challenges, including limited scientific excellence, weak industry–academia collaboration, low levels of internationalisation, academic inbreeding and brain drain. Limited experience with competitive grant applications has further contributed to low participation and success rates in EU funding programmes. The Kempelen Institute of Intelligent Technologies (KInIT), a widening organisation in the project, is an independent non-profit research institute founded in 2020 by a consortium of companies in response to these challenges. Throughout the project, KInIT pursued its mission of strengthening scientific excellence by bridging the public and private sectors, attracting high-quality researchers from abroad, and offering competitive opportunities for young talents to stay in, return to, or relocate to Slovakia, thereby contributing to improved national competitiveness.

Disinformation disseminated through social media and digital platforms continues to pose a significant societal challenge, contributing to polarisation and undermining democratic processes. Despite sustained efforts by the European Commission, Member States, civil society and the research community, the problem remains unresolved, with particularly strong effects in post-communist EU countries characterised by weaker institutions, lower media literacy and reduced societal resilience. In Slovakia, for example, 56% of surveyed citizens report belief in conspiracy theories or misinformation narratives, while only 30% trust news media most of the time. In this context, the project strengthened KInIT’s research capacity in AI-based approaches to disinformation analysis, in line with the Slovak Recovery and Resilience Plan and the Digital Transformation Strategy 2030.

The overall objective of the project was to enhance the scientific excellence of KInIT and the consortium partners in trustworthy AI, multimodal natural language processing, and multilingual language technologies for combating disinformation.

To strengthen research excellence at KInIT, a scientific strategy for combating disinformation using AI and language technologies was developed. Software and hardware infrastructure for AI research was enhanced, and a key dataset supporting project research activities was assembled. Research staff expertise was increased through targeted training activities and joint research work. In total, 24 KInIT researchers participated in and benefited from the project. Training activities laid strong foundations for long-term scientific development, including four scientific webinars delivered by advanced partners and a summer school on trustworthy, multilingual and multimodal AI, which attracted over 30 participants from Slovakia and abroad and involved experts from industry, including Meta and Google. In addition, a replication challenge involving 11 early-stage researchers was organised, and a Shared Task on Multilingual and Crosslingual Fact-Checked Claim Retrieval was held as part of the SemEval workshop, attracting more than 170 researchers worldwide.

Novel methods, primarily focused on claim matching, were developed, resulting in 42 scientific publications targeting top-tier NLP venues, including ACL and EMNLP. KInIT strengthened its presence at leading scientific conferences, expanded its professional network and implemented a staff exchange programme. Improvements in scientific excellence were reflected in increased industry collaboration and knowledge transfer, growth in the average h-index of participating researchers, and extensive international engagement, with over 85% of researchers participating in international mobility. Beyond the planned networking activities, the LowResNLP workshop was organised, further strengthening the scientific network and advancing one of the project’s core themes: NLP for low-resource languages.

From a research management and administration perspective, an institutional assessment of KInIT was conducted, followed by the development and implementation of an improvement plan. As a result, the research management and administration unit was upgraded, with a significantly increased share of trained administrative staff. A research support network with partner organisations was established, two workshops on research management skills were organised, and 22 research project and grant proposals were submitted. Overall, KInIT substantially expanded its network, engaging with more than 70 industrial and over 100 research partners.

To maximise visibility and impact, a dissemination and communication strategy was developed and successfully implemented, with most performance indicators meeting or exceeding planned targets. Multiple communication channels were used to reach diverse target groups. The sustainability of project results is further ensured through follow-up initiatives building directly on DisAI’s outcomes, including the DisAI-AMPLIFIED project (2024–2026) funded under Slovakia’s Recovery and Resilience Plan, and the lorAI: Low Resource Artificial Intelligence project (2025–2031), supported by the Horizon WIDERA Teaming for Excellence programme.

The project results advance the state of the art in several ways:

– MultiClaim, a multilingual dataset comprising over 200k fact-checked claims and 28k social media posts, was created and published together with a scientific paper. The dataset enabled subsequent research within the project and beyond, including by third-party researchers.
– Extensive evaluations using MultiClaim under the multilingual CBFA framework demonstrated how cross-lingual retrieval can be reliably performed and compared the effectiveness of different system configurations, addressing the three original research questions defined in the proposal.
– Additional findings show that state-of-the-art generative LLMs can be effectively integrated into multilingual claim-matching pipelines, both as re-rankers and relevance classifiers, and that auxiliary components such as OCR engines based on multilingual LLMs can substantially improve performance in cross-lingual scenarios, addressing an additional research question introduced during the project.
– A version of MultiClaim augmented with visual data demonstrated that multimodal information can be effectively leveraged in multilingual and cross-lingual claim matching, outperforming unimodal baselines. A novel architecture, FACTOR, was proposed for this purpose. The dataset was also used to study the role of multimodal data in facilitating cross-lingual knowledge transfer, further extending the state of the art. Results additionally show that generative vision–language models can be effectively used as re-rankers.
– An AutoXAI framework was introduced to support the selection of suitable explainability methods for specific model–dataset combinations, demonstrating that automated selection can identify XAI methods balancing technical fidelity and human comprehensibility.
– Systemic inequities in multilingual systems were analysed, showing substantial performance variation across languages, particularly for low-resource and non-Latin script languages. Reasoning-enabled re-ranking models mitigate these biases only partially.
