An integration of vector-based semantic analysis and simple recurrent networks for the automatic acquisition of lexical representations from unlabeled corpora
RISE - Research Institutes of Sweden, ICT, SICS. ORCID iD: 0000-0001-5100-0535
2002 (English) Conference paper, Published paper (Refereed)
Abstract [en]

This study presents an integration of Simple Recurrent Networks to extract grammatical knowledge and Vector-Based Semantic Analysis to acquire semantic information from large corpora. Starting from a large, untagged sample of English text, we use Simple Recurrent Networks to extract morpho-syntactic vectors in an unsupervised way. These vectors are then used in place of random vectors to perform Vector-Based Semantic Analysis. In this way, we obtain rich lexical representations in the form of high-dimensional vectors that integrate morpho-syntactic and semantic information about words. Apart from incorporating data from the different levels, we discuss how these vectors can be used to account for the particularities of each different word token of a given word type. The amount of lexical knowledge acquired by the technique is evaluated both by statistical analyses comparing the information contained in the vectors with existing 'hand-crafted' lexical resources such as CELEX and WordNet, and by performance in language proficiency tests. We conclude by outlining the cognitive implications of this model and its potential use in the bootstrapping of lexical resources.

Place, publisher, year, edition, pages
2002, 1.
Identifiers
URN: urn:nbn:se:ri:diva-22530
OAI: oai:DiVA.org:ri-22530
DiVA, id: diva2:1042095
Conference
Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Language Data Workshop at LREC 2002, 1 June 2002, Las Palmas, Spain
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2025-09-23 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Person

Sahlgren, Magnus
