A Distributional Semantic Online Lexicon for Linguistic Explorations of Societies
Mid Sweden University, Sweden; University of Bergen, Norway.
University of Gothenburg, Sweden.
RISE Research Institutes of Sweden.
RISE Research Institutes of Sweden. ORCID iD: 0000-0001-5100-0535
2023 (English) In: Social science computer review, ISSN 0894-4393, E-ISSN 1552-8286, Vol. 41, no 2, p. 308-329
Article in journal (Refereed) Published
Abstract [en]

Linguistic Explorations of Societies (LES) is an interdisciplinary research project with scholars from the fields of political science, computer science, and computational linguistics. The overarching ambition of LES has been to contribute to the survey-based comparative scholarship by compiling and analyzing online text data within and between languages and countries. To this end, the project has developed an online semantic lexicon, which allows researchers to explore meanings and usages of words in online media across a substantial number of geo-coded languages. The lexicon covers data from approximately 140 language–country combinations and is, to our knowledge, the most extensive free research resource of its kind. Such a resource makes it possible to critically examine survey translations and identify discrepancies in order to modify and improve existing survey methodology, and its unique features further enable Internet researchers to study public debate online from a comparative perspective. In this article, we discuss the social scientific rationale for using online text data as a complement to survey data, and present the natural language processing–based methodology behind the lexicon including its underpinning theory and practical modeling. Finally, we engage in a critical reflection about the challenges of using online text data to gauge public opinion and political behavior across the world. © The Author(s) 2022.

Place, publisher, year, edition, pages
SAGE Publications Inc., 2023. Vol. 41, no 2, p. 308-329
Keywords [en]
comparative surveys, distributional semantics, language use, natural language processing, semantic similarities, word2vec
National Category
Communication Studies
Identifiers
URN: urn:nbn:se:ri:diva-59347
DOI: 10.1177/08944393211049774
Scopus ID: 2-s2.0-85130070813
OAI: oai:DiVA.org:ri-59347
DiVA, id: diva2:1673029
Available from: 2022-06-20 Created: 2022-06-20 Last updated: 2023-07-06 Bibliographically approved

Authority records

Sahlgren, Magnus
