Streaming word similarity mining on the cheap
RISE - Research Institutes of Sweden, ICT, SICS. (RISE AI) ORCID iD: 0000-0001-9244-4546
RISE - Research Institutes of Sweden, ICT, SICS. (RISE AI) ORCID iD: 0000-0001-8952-3542
2018 (English) Conference paper, Published paper (Other academic)
Abstract [en]

Accurately and efficiently estimating word similarities from text is fundamental in natural language processing. In this paper, we propose a fast and lightweight method for estimating similarities from streams by explicitly counting second-order co-occurrences. The method rests on the observation that words that are highly correlated with respect to such counts are also highly similar with respect to first-order co-occurrences. Using buffers of co-occurred words per word to count second-order co-occurrences, we can then estimate similarities in a single pass over data without having to do prohibitively expensive similarity calculations. We demonstrate that this approach is scalable, converges rapidly, behaves robustly under parameter changes, and that it captures word similarities on par with those given by state-of-the-art word embeddings.
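The buffer-based counting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `window` and `buffer_size` parameters, the flush-when-full rule, and the cosine-style normalization are all assumptions made here for illustration. Each word keeps a small buffer of words it has co-occurred with; when a buffer fills, every pair of distinct words in it receives one second-order count, and the buffer is cleared.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations
from math import sqrt


def second_order_counts(tokens, window=2, buffer_size=4):
    """Single pass over a token stream (hypothetical parameters, see lead-in)."""
    buffers = defaultdict(list)  # word -> buffered co-occurring words
    pair_counts = Counter()      # (a, b) with a < b -> second-order count
    word_counts = Counter()      # word -> number of pair counts it took part in

    def flush(word):
        # All distinct words in this buffer co-occurred with `word`,
        # so each pair of them shares a context: one count per pair.
        for a, b in combinations(sorted(set(buffers[word])), 2):
            pair_counts[(a, b)] += 1
            word_counts[a] += 1
            word_counts[b] += 1
        buffers[word].clear()

    recent = deque(maxlen=window)  # the last `window` tokens seen
    for token in tokens:
        for neighbour in recent:
            if neighbour == token:
                continue
            # First-order co-occurrence is symmetric: update both buffers.
            for word, context in ((token, neighbour), (neighbour, token)):
                buffers[word].append(context)
                if len(buffers[word]) >= buffer_size:
                    flush(word)
        recent.append(token)

    # Count whatever is left in partially filled buffers.
    for word in list(buffers):
        flush(word)
    return pair_counts, word_counts


def similarity(a, b, pair_counts, word_counts):
    """Assumed normalization: pair count over the geometric mean of word counts."""
    key = (a, b) if a < b else (b, a)
    denom = sqrt(word_counts[a] * word_counts[b])
    return pair_counts[key] / denom if denom else 0.0
```

Under these assumptions, no pairwise similarity computation over the full vocabulary is needed: counts are updated only for words that share a buffer, and the normalized score stays between 0 and 1 because a pair's count can never exceed either word's total.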

Place, publisher, year, edition, pages
2018.
Identifiers
URN: urn:nbn:se:ri:diva-35186
OAI: oai:DiVA.org:ri-35186
DiVA, id: diva2:1249038
Conference
Conference on Empirical Methods in Natural Language Processing (EMNLP)
Available from: 2018-09-18. Created: 2018-09-18. Last updated: 2025-02-07. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authors

Görnerup, Olof; Gillblad, Daniel
