Using LLMs to Build a Database of Climate Extreme Impacts
2024 (English)
In: ClimateNLP 2024 - 1st Workshop on Natural Language Processing Meets Climate Change, Proceedings of the Workshop, Association for Computational Linguistics (ACL), 2024, p. 93-110
Conference paper, Published paper (Refereed)
Abstract [en]
To better understand how extreme climate events impact society, we need to increase the availability of accurate and comprehensive information about these impacts. We propose a method for building large-scale databases of climate extreme impacts from online textual sources, using LLMs for information extraction in combination with more traditional NLP techniques to improve accuracy and consistency. We evaluate the method against a small benchmark database created by human experts and find that extraction accuracy varies for different types of information. We compare three different LLMs and find that, while the commercial GPT-4 model gives the best performance overall, the open-source models Mistral and Mixtral are competitive for some types of information.
Place, publisher, year, edition, pages
Association for Computational Linguistics (ACL), 2024. p. 93-110
Keywords [en]
Computational linguistics; Database systems; Open systems; Benchmark database; Climate event; Climate extremes; Comprehensive information; Extraction accuracy; Extreme climates; Human expert; Large-scale database; Open-source model; Performance; Data accuracy
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-76190
Scopus ID: 2-s2.0-85204502136
OAI: oai:DiVA.org:ri-76190
DiVA, id: diva2:1914202
Conference
1st Workshop on Natural Language Processing Meets Climate Change, ClimateNLP 2024. Bangkok, Thailand. 16 August 2024
Note
The research presented in this paper was supported by the Swedish Research Council (grants no. 2022-02909, 2022-03448 and 2022-06599). Ni Li is supported by the VUB Research Council in the framework of a EUTOPIA inter-university co-tutelle PhD program between the Vrije Universiteit Brussel, Belgium, and the Technische Universität Dresden, Germany. The EUTOPIA alliance is part of the European Universities Initiatives co-funded by the European Union. The experiments with the open-source LLMs were enabled by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725. We thank NAISS for providing computational resources under Project 2024/22-211.
2024-11-18, 2024-11-18, 2025-09-23. Bibliographically approved