Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
Luleå University of Technology, Sweden.
RISE Research Institutes of Sweden, Digital Systems, Computer Science. Luleå University of Technology, Sweden. ORCID iD: 0000-0003-4293-6408
Luleå University of Technology, Sweden.
Luleå University of Technology, Sweden.
2022 (English). In: Vol. 3 (2022): Proceedings of the Northern Lights Deep Learning Workshop 2022, Septentrio Academic Publishing, 2022, Vol. 3. Conference paper, published paper (refereed)
Abstract [en]

Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, through an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish conversational datasets obtained from publicly available sources: Reddit, Familjeliv and the GDC. Perplexity score (an automated intrinsic metric) and human evaluation surveys were used to assess the performance of the fine-tuned models. We also compare the DialoGPT experiments with an attention-mechanism-based seq2seq baseline model trained on the GDC dataset. The results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogues judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. The work agrees with the hypothesis that deep monolingual models learn some abstractions which generalize across languages. We contribute the code, datasets and model checkpoints and host the demos on the HuggingFace platform.
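To make the approach concrete, the sketch below shows roughly how such a transfer-learning setup can be reproduced with the HuggingFace transformers library. It is a minimal illustration under stated assumptions, not the authors' released code: the file swedish_dialogue.tsv is a hypothetical placeholder for one of the Swedish conversational datasets, microsoft/DialoGPT-medium stands in for whichever DialoGPT checkpoint was used, and the hyperparameters are arbitrary. Perplexity is computed as the exponential of the mean cross-entropy loss, matching the intrinsic metric mentioned in the abstract.

```python
# Minimal sketch (not the authors' released code): fine-tune DialoGPT on Swedish
# dialogue pairs with HuggingFace transformers, then estimate perplexity as
# exp(mean cross-entropy). "swedish_dialogue.tsv" (tab-separated context/response
# pairs) is a hypothetical placeholder for one of the conversational datasets.
import math

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium").to(device)

# DialoGPT's single-turn format: context and response joined by the EOS token.
with open("swedish_dialogue.tsv", encoding="utf-8") as f:  # hypothetical path
    pairs = [line.rstrip("\n").split("\t", 1) for line in f if "\t" in line]
texts = [ctx + tokenizer.eos_token + resp + tokenizer.eos_token for ctx, resp in pairs]

enc = tokenizer(texts, truncation=True, max_length=128,
                padding="max_length", return_tensors="pt")
loader = DataLoader(TensorDataset(enc["input_ids"], enc["attention_mask"]),
                    batch_size=4, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for input_ids, attention_mask in loader:
    input_ids = input_ids.to(device)
    attention_mask = attention_mask.to(device)
    labels = input_ids.clone()
    labels[attention_mask == 0] = -100  # ignore padding in the causal LM loss
    loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Intrinsic evaluation: perplexity = exp(mean token-level cross-entropy).
# In practice this would use a held-out split; the training loader is reused
# here only to keep the sketch short.
model.eval()
losses = []
with torch.no_grad():
    for input_ids, attention_mask in loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        labels = input_ids.clone()
        labels[attention_mask == 0] = -100
        losses.append(model(input_ids=input_ids, attention_mask=attention_mask,
                            labels=labels).loss.item())
print("perplexity:", math.exp(sum(losses) / len(losses)))
```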

Place, publisher, year, edition, pages
Septentrio Academic Publishing, 2022. Vol. 3
Keywords [en]
Conversational Systems, Chatbots, Dialogue, DialoGPT, Swedish, Language Technology (Computational Linguistics), Computer Sciences
National subject category
Computer Sciences
Identifiers
URN: urn:nbn:se:ri:diva-62512
DOI: 10.7557/18.6231
OAI: oai:DiVA.org:ri-62512
DiVA, id: diva2:1730398
Conference
the Northern Lights Deep Learning Workshop 2022 
Available from: 2023-01-24 Created: 2023-01-24 Last updated: 2023-06-07 Bibliographically reviewed

Open Access in DiVA

Full text not available in DiVA

Other links

Publisher's full text

Person

Brännvall, Rickard
