An integration of vector-based semantic analysis and simple recurrent networks for the automatic acquisition of lexical representations from unlabeled corpora
RISE - Research Institutes of Sweden, ICT, SICS. ORCID iD: 0000-0001-5100-0535
2002 (English) Conference paper, Published paper (Refereed)
Abstract [en]

This study presents an integration of Simple Recurrent Networks to extract grammatical knowledge and Vector-Based Semantic Analysis to acquire semantic information from large corpora. Starting from a large, untagged sample of English text, we use Simple Recurrent Networks to extract morpho-syntactic vectors in an unsupervised way. These vectors are then used in place of random vectors to perform Vector-Based Semantic Analysis. In this way, we obtain rich lexical representations in the form of high-dimensional vectors that integrate morpho-syntactic and semantic information about words. Apart from incorporating data from the different levels, we argue that these vectors can be used to account for the particularities of each different word token of a given word type. The amount of lexical knowledge acquired by the technique is evaluated both by statistical analyses comparing the information contained in the vectors with existing 'hand-crafted' lexical resources such as CELEX and WordNet, and by performance in language proficiency tests. We conclude by outlining the cognitive implications of this model and its potential use in the bootstrapping of lexical resources.
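
The vector-accumulation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy corpus, the dimensionality, and the window size are all assumptions chosen for illustration. Each word receives a sparse random index vector, and a word's context vector is the sum of the index vectors of its neighbours. Passing a different `index_vectors` dictionary (for example, vectors taken from a trained network, which is not shown here) would correspond to the substitution the abstract describes.

```python
import math
import random
from collections import defaultdict


def random_index_vectors(vocab, dim=16, nonzeros=4, seed=0):
    """Give each word a sparse ternary vector: a few +1/-1 entries, the rest 0."""
    rng = random.Random(seed)
    vectors = {}
    for word in sorted(vocab):  # sorted so the assignment is reproducible
        vec = [0.0] * dim
        for k, pos in enumerate(rng.sample(range(dim), nonzeros)):
            vec[pos] = 1.0 if k % 2 == 0 else -1.0
        vectors[word] = vec
    return vectors


def accumulate_context_vectors(tokens, index_vectors, window=2):
    """Sum the index vectors of every neighbour within `window` positions."""
    dim = len(next(iter(index_vectors.values())))
    context = defaultdict(lambda: [0.0] * dim)
    for i, word in enumerate(tokens):
        start, stop = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j == i:
                continue  # a token is not its own context
            for d, value in enumerate(index_vectors[tokens[j]]):
                context[word][d] += value
    return dict(context)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus, used only to exercise the functions.
tokens = "the cat sat on the mat the dog sat on the rug".split()
index_vectors = random_index_vectors(set(tokens))
context_vectors = accumulate_context_vectors(tokens, index_vectors)
similarity = cosine(context_vectors["cat"], context_vectors["dog"])
```

Because `accumulate_context_vectors` only reads from the dictionary it is given, the choice of index vectors is independent of the accumulation step, which is the property the sketch is meant to show.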

Place, publisher, year, edition, pages
2002, 1.
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-22530; OAI: oai:DiVA.org:ri-22530; DiVA, id: diva2:1042095
Conference
Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Language Data Workshop at LREC 2002, 1 June 2002, Las Palmas, Spain
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2018-08-21 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Sahlgren, Magnus
