What Causes Unemployment?: Unsupervised Causality Mining from Swedish Governmental Reports
RISE Research Institutes of Sweden, Digital Systems, Data Science. Uppsala University, Sweden. ORCID iD: 0000-0003-3246-1664
RISE Research Institutes of Sweden, Digital Systems, Data Science. Uppsala University, Sweden. ORCID iD: 0000-0002-7873-3971
Uppsala University, Sweden.
2023 (English) In: RESOURCEFUL 2023 - Proceedings of the 2nd Workshop on Resources and Representations for Under-Resourced Languages and Domains, Association for Computational Linguistics, 2023, p. 25-29. Conference paper, Published paper (Refereed)
Abstract [en]

Extracting statements about causality from text documents is a challenging task in the absence of annotated training data. We create a search system for causal statements about user-specified concepts by combining pattern matching of causal connectives with semantic similarity ranking, using a language model fine-tuned for semantic textual similarity. Preliminary experiments on a small test set from Swedish governmental reports show promising results in comparison to two simple baselines. 
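The two-stage idea in the abstract (filter sentences by causal connectives, then rank the survivors by similarity to the user's concept) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the connective list is a hypothetical English one (the paper works on Swedish text, and its actual patterns are not given in this record), and a bag-of-words cosine similarity stands in for the language model fine-tuned for semantic textual similarity.

```python
import math
import re
from collections import Counter

# ASSUMPTION: illustrative English connectives; the paper's Swedish patterns
# are not listed in this record.
CAUSAL_PATTERN = re.compile(
    r"\b(because of|because|due to|leads to|lead to|causes|caused by|results in)\b",
    re.IGNORECASE,
)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity.

    ASSUMPTION: a simple stand-in for the fine-tuned similarity model
    mentioned in the abstract.
    """
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_causal(query, sentences, top_k=3):
    """Keep sentences containing a causal connective, then rank by similarity."""
    candidates = [s for s in sentences if CAUSAL_PATTERN.search(s)]
    ranked = sorted(candidates, key=lambda s: cosine(query, s), reverse=True)
    return ranked[:top_k]


# Hypothetical example sentences, invented for illustration only.
SENTENCES = [
    "Unemployment rose because of the economic downturn.",
    "The report was published in May.",
    "Lower interest rates lead to higher housing prices.",
    "Unemployment is measured every quarter.",
]

results = search_causal("unemployment", SENTENCES)
for sentence in results:
    print(sentence)
```

In this sketch the pattern filter discards sentences that mention the concept without any causal connective, and the similarity score then orders the remaining causal statements by relevance to the query.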

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2023. p. 25-29
Keywords [en]
Computational linguistics; Search engines; Semantics; Annotated training data; Language model; Pattern matching; Search system; Semantic similarity; Similarity rankings; Swedish; Test sets; Text document; Textual similarities
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-68044
Scopus ID: 2-s2.0-85175867685
ISBN: 9781959429739 (electronic)
OAI: oai:DiVA.org:ri-68044
DiVA, id: diva2:1814236
Conference
2nd Workshop on Resources and Representations for Under-Resourced Languages and Domains, RESOURCEFUL 2023; Conference date: 22 May 2023
Note

This work was funded by Vinnova in the project 2019-02252: Datalab for results in the public sector. We thank Sven-Olof Junker, Martin Sparr, Fredrik Carlsson, Sebastian Reimann, and Gustav Finnveden for valuable discussions. The computations were enabled by resources in project UPPMAX 2020/2-2 at the Uppsala Multidisciplinary Center for Advanced Computational Science.

Available from: 2023-11-23. Created: 2023-11-23. Last updated: 2023-11-28. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Dürlich, Luise; Nivre, Joakim
