ELOQUENT CLEF Shared Tasks for Evaluation of Generative Language Model Quality
Silo AI, Finland.
RISE Research Institutes of Sweden, Digitala system, Datavetenskap. ORCID iD: 0000-0003-3246-1664
RISE Research Institutes of Sweden, Digitala system, Datavetenskap. ORCID iD: 0000-0002-9162-6433
RISE Research Institutes of Sweden, Digitala system, Datavetenskap.
2024 (English). In: Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349, Vol. 14612 LNCS, pp. 459-465. Article in journal (Refereed). Published
Abstract [en]

ELOQUENT is a set of shared tasks for evaluating the quality and usefulness of generative language models. ELOQUENT aims to bring together some high-level quality criteria, grounded in experiences from deploying models in real-life tasks, and to formulate tests for those criteria, preferably implemented to require minimal human assessment effort and in a multilingual setting. The selected tasks for this first year of ELOQUENT are (1) probing a language model for topical competence; (2) assessing the ability of models to generate and detect hallucinations; (3) assessing the robustness of a model output given variation in the input prompts; and (4) establishing the possibility to distinguish human-generated text from machine-generated text.

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2024. Vol. 14612 LNCS, pp. 459-465
Keywords [en]
Benchmarking; CLEF; Generative language model; Human assessment; Language model; LLM; Modeling quality; Multilinguality; Quality benchmark; Quality criteria; Shared task; Computational linguistics
Identifiers
URN: urn:nbn:se:ri:diva-72876
DOI: 10.1007/978-3-031-56069-9_63
Scopus ID: 2-s2.0-85189366495
OAI: oai:DiVA.org:ri-72876
DiVA, id: diva2:1854721
Conference
46th European Conference on Information Retrieval, ECIR 2024. Glasgow, UK. 24 March 2024 through 28 March 2024
Available from: 2024-04-26 Created: 2024-04-26 Last updated: 2025-09-23 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authors

Dürlich, Luise; Gogoulou, Evangelia; Nivre, Joakim
