Using LLMs to Build a Database of Climate Extreme Impacts
Vrije Universiteit Brussel, Belgium.
RISE Research Institutes of Sweden, Digital Systems, Data Science. Swedish Centre for Impacts of Climate Extremes, Sweden. ORCID iD: 0009-0007-2792-9345
Helmholtz Centre for Environmental Research, Germany.
Uppsala University, Sweden; Swedish Centre for Impacts of Climate Extremes, Sweden.
2024 (English). In: ClimateNLP 2024 - 1st Workshop on Natural Language Processing Meets Climate Change, Proceedings of the Workshop, Association for Computational Linguistics (ACL), 2024, p. 93-110. Conference paper, Published paper (Refereed)
Abstract [en]

To better understand how extreme climate events impact society, we need to increase the availability of accurate and comprehensive information about these impacts. We propose a method for building large-scale databases of climate extreme impacts from online textual sources, using LLMs for information extraction in combination with more traditional NLP techniques to improve accuracy and consistency. We evaluate the method against a small benchmark database created by human experts and find that extraction accuracy varies for different types of information. We compare three different LLMs and find that, while the commercial GPT-4 model gives the best performance overall, the open-source models Mistral and Mixtral are competitive for some types of information.

Place, publisher, year, edition, pages
Association for Computational Linguistics (ACL), 2024. p. 93-110
Keywords [en]
Computational linguistics; Database systems; Open systems; Benchmark database; Climate event; Climate extremes; Comprehensive information; Extraction accuracy; Extreme climates; Human expert; Large-scale database; Open-source model; Performance; Data accuracy
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-76190
Scopus ID: 2-s2.0-85204502136
OAI: oai:DiVA.org:ri-76190
DiVA, id: diva2:1914202
Conference
1st Workshop on Natural Language Processing Meets Climate Change, ClimateNLP 2024. Bangkok, Thailand. 16 August 2024
Note

The research presented in this paper was supported by the Swedish Research Council (grants no. 2022-02909, 2022-03448 and 2022-06599). Ni Li is supported by the VUB Research Council in the framework of a EUTOPIA inter-university co-tutelle PhD program between the Vrije Universiteit Brussel, Belgium, and the Technische Universität Dresden, Germany. The EUTOPIA alliance is part of the European Universities Initiatives co-funded by the European Union. The experiments with the open-source LLMs were enabled by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725. We thank NAISS for providing computational resources under Project 2024/22-211.

Available from: 2024-11-18 Created: 2024-11-18 Last updated: 2025-09-23 Bibliographically approved

Authority records

Zahra, Shorouq; Görnerup, Olof; Nivre, Joakim
