Schrödinger's tree—On syntax and neural language models
Uppsala University, Sweden.
RISE Research Institutes of Sweden, Digital Systems, Data Science; Uppsala University, Sweden. ORCID iD: 0000-0002-7873-3971
2022 (English). In: Frontiers in Artificial Intelligence, E-ISSN 2624-8212, Vol. 5, article id 796788. Article in journal (Refereed), Published.
Abstract [en]

In the last half-decade, the field of natural language processing (NLP) has undergone two major transitions: the switch to neural networks as the primary modeling paradigm and the homogenization of the training regime (pre-train, then fine-tune). Amidst this process, language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities and proving to be an indispensable means of knowledge transfer downstream. Due to the otherwise opaque, black-box nature of such models, researchers have employed aspects of linguistic theory in order to characterize their behavior. Questions central to syntax—the study of the hierarchical structure of language—have factored heavily into such work, yielding invaluable insights into models' inherent biases and their ability to make human-like generalizations. In this paper, we attempt to take stock of this growing body of literature. In doing so, we observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form, as well as the conclusions they draw from their findings. To remedy this, we urge researchers to exercise careful consideration when investigating coding properties, selecting representations, and evaluating via downstream tasks. Furthermore, we outline the implications of the different types of research questions exhibited in studies on syntax, as well as the inherent pitfalls of aggregate metrics. Ultimately, we hope that our discussion adds nuance to the prospect of studying language models and paves the way for a less monolithic perspective on syntax in this context.

Place, publisher, year, edition, pages
Frontiers Media S.A., 2022. Vol. 5, article id 796788
Keywords [en]
coding properties, language models, natural language understanding, neural networks, representations, syntax
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:ri:diva-61203
DOI: 10.3389/frai.2022.796788
Scopus ID: 2-s2.0-85140983143
OAI: oai:DiVA.org:ri-61203
DiVA, id: diva2:1716716
Available from: 2022-12-06. Created: 2022-12-06. Last updated: 2022-12-06. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Nivre, Joakim
