Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
Luleå University of Technology, Sweden.
RISE Research Institutes of Sweden, Digital Systems, Data Science. Luleå University of Technology, Sweden. ORCID iD: 0000-0003-4293-6408
Luleå University of Technology, Sweden.
Luleå University of Technology, Sweden.
2022 (English) In: Proceedings of the Northern Lights Deep Learning Workshop 2022, Vol. 3, Septentrio Academic Publishing, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for natural language dialogue generation have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work empirically investigates the potential for transferring such models to the Swedish language. DialoGPT, a model pre-trained on English, is adapted by training on three Swedish-language conversational datasets obtained from publicly available sources: Reddit, Familjeliv and the GDC. Perplexity (an automated intrinsic metric) and human-evaluation surveys were used to assess the performance of the fine-tuned models. We also compare the DialoGPT experiments with an attention-based seq2seq baseline model trained on the GDC dataset. The results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogues judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. The work supports the hypothesis that deep monolingual models learn some abstractions which generalize across languages. We contribute the code, datasets and model checkpoints, and host demos on the HuggingFace platform.
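The fine-tuning and evaluation pipeline summarized above can be reproduced with the HuggingFace transformers library. The following is a minimal sketch, not the authors' released code: it assumes the microsoft/DialoGPT-medium checkpoint as the base model (the size used is not stated here), toy Swedish (context, response) pairs standing in for the Reddit, Familjeliv and GDC corpora, and illustrative hyperparameters. It fine-tunes with the standard causal language-modelling objective and reports perplexity as exp(mean cross-entropy), the intrinsic metric mentioned in the abstract.

```python
# Minimal sketch (not the authors' released code) of the approach described above:
# fine-tune DialoGPT on Swedish (context, response) pairs and score perplexity.
# Base checkpoint, data and hyperparameters are assumptions for illustration.
import math
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "microsoft/DialoGPT-medium"  # assumed size; not specified in this record
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)


class DialoguePairs(Dataset):
    """Single-turn exchanges encoded as 'context <eos> response <eos>',
    the input format DialoGPT uses for dialogue modelling."""

    def __init__(self, pairs, max_len=128):
        self.items = [
            tokenizer(
                c + tokenizer.eos_token + r + tokenizer.eos_token,
                truncation=True, max_length=max_len, return_tensors="pt",
            )["input_ids"].squeeze(0)
            for c, r in pairs
        ]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        return self.items[i]


def collate(batch):
    # Pad with EOS and exclude padded positions from the loss via label -100.
    input_ids = torch.nn.utils.rnn.pad_sequence(
        batch, batch_first=True, padding_value=tokenizer.eos_token_id)
    attention_mask = torch.nn.utils.rnn.pad_sequence(
        [torch.ones_like(x) for x in batch], batch_first=True, padding_value=0)
    labels = input_ids.masked_fill(attention_mask == 0, -100)
    return input_ids, attention_mask, labels


# Toy examples standing in for the Reddit, Familjeliv and GDC corpora.
pairs = [
    ("Hej, hur mår du?", "Bara bra, tack! Och du?"),
    ("Vad gör du i helgen?", "Jag ska åka till fjällen."),
]
loader = DataLoader(DialoguePairs(pairs), batch_size=2, collate_fn=collate)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(3):  # epoch count is illustrative only
    for input_ids, attention_mask, labels in loader:
        # Causal LM loss over the full (non-padded) context + response sequence.
        loss = model(input_ids=input_ids, attention_mask=attention_mask,
                     labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Intrinsic evaluation: perplexity = exp(mean cross-entropy) on held-out text.
model.eval()
with torch.no_grad():
    ids = tokenizer("Hej, vad heter du?" + tokenizer.eos_token,
                    return_tensors="pt")["input_ids"]
    perplexity = math.exp(model(input_ids=ids, labels=ids).loss.item())
print(f"held-out perplexity: {perplexity:.1f}")
```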

Place, publisher, year, edition, pages
Septentrio Academic Publishing, 2022. Vol. 3
Keywords [en]
Conversational Systems, Chatbots, Dialogue, DialoGPT, Swedish, Language Technology (Computational Linguistics), Computer Sciences
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:ri:diva-62512
DOI: 10.7557/18.6231
OAI: oai:DiVA.org:ri-62512
DiVA, id: diva2:1730398
Conference
The Northern Lights Deep Learning Workshop 2022
Available from: 2023-01-24 Created: 2023-01-24 Last updated: 2023-06-07 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Brännvall, Rickard
