CORDIS - EU research results

AutoLLMSelect: Framework for Robust and Explainable Automated Large Language Model Selection

Objective

Large Language Models (LLMs) are gradually becoming part of academic and industrial processes because of their capacity to solve many different problems across domains. However, an open question remains: given the multitude of LLMs available, how can one select the most appropriate LLM for a specific supervised machine learning (ML) problem (with or without fine-tuning) without evaluating a large portfolio of LLMs on the labelled dataset for that problem? Evaluating a large LLM portfolio across multiple criteria incurs a high computational cost, which translates into a negative environmental impact, especially in the form of increased carbon emissions. This proposal aims to (1) publish a comprehensive analysis of LLM benchmark datasets that would facilitate robust and unbiased LLM benchmarking, (2) take the first steps towards a robust, explainable and evolving framework for automated LLM selection, based on a multi-disciplinary approach, that would reduce the cost of comparing a large LLM portfolio on ML datasets, and (3) evaluate the applicability of the framework on a use case from the field of sustainable development. Because of the high complexity of the problem, the proposal will present a proof of concept on a selected LLM portfolio, dataset portfolio, and set of performance metrics, based on the data available in public benchmarks. The framework is intended to evolve and could be extended in the future with new LLMs, benchmark datasets, ML tasks, and performance metrics, both by the project team and by the community.

Coordinator

INSTITUT JOZEF STEFAN
Net EU contribution
€ 182 717,52
Address
Jamova 39
1000 Ljubljana
Slovenia


Region
Slovenija Zahodna Slovenija Osrednjeslovenska
Activity type
Research Organisations
Total cost
No data

Partners (1)