Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning
2022 (English). In: Vol. 3 (2022): Proceedings of the Northern Lights Deep Learning Workshop 2022, Septentrio Academic Publishing, 2022, Vol. 3. Conference paper, Published paper (Refereed)
Abstract [en]
Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, through an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish-language conversational datasets obtained from publicly available sources: Reddit, Familjeliv and the GDC. Perplexity score (an automated intrinsic metric) and human evaluation surveys were used to assess the performance of the fine-tuned models. We also compare the DialoGPT experiments with an attention-mechanism-based seq2seq baseline model, trained on the GDC dataset. The results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogues judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. The work agrees with the hypothesis that deep monolingual models learn some abstractions which generalize across languages. We contribute the code, datasets and model checkpoints, and host the demos on the HuggingFace platform.
Place, publisher, year, edition, pages
Septentrio Academic Publishing, 2022. Vol. 3
Keywords [en]
Conversational Systems, Chatbots, Dialogue, DialoGPT, Swedish, Language Technology (Computational Linguistics), Computer Sciences
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:ri:diva-62512; DOI: 10.7557/18.6231; OAI: oai:DiVA.org:ri-62512; DiVA, id: diva2:1730398
Conference
the Northern Lights Deep Learning Workshop 2022
2023-01-24, 2023-01-24, 2023-06-07. Bibliographically approved