Random indexing of multidimensional data
Luleå University of Technology, Sweden.
RISE - Research Institutes of Sweden, ICT, SICS.
RISE - Research Institutes of Sweden, ICT, SICS. ORCID iD: 0000-0001-5100-0535
2017 (English). In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116, Vol. 52, no 1, p. 267-290. Article in journal (Refereed). Published
Abstract [en]

Random indexing (RI) is a lightweight dimension reduction method, which is used, for example, to approximate vector semantic relationships in online natural language processing systems. Here we generalise RI to multidimensional arrays and therefore enable approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which is the theoretical basis also for ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments which demonstrate that a multidimensional generalisation of RI is feasible, including comparisons with ordinary RI and principal component analysis. The RI method is well suited for online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided. © 2016, The Author(s).
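
As a rough illustration of the ideas named in the abstract, the sketch below shows ordinary (one-way) random indexing only, not the multidimensional generalisation the paper introduces and not the authors' open source implementation. Each context is assigned a sparse ternary index vector, relationship weights are accumulated incrementally into fixed-size vectors, and similarity is approximated by a cosine between those vectors. All names and parameter values (`DIM`, `NONZERO`, `index_vector`, `RandomIndex`) are assumptions made for this example.

```python
import math
import random
from collections import defaultdict

DIM = 512      # assumed size of the fixed-length representation
NONZERO = 8    # assumed number of +1/-1 entries per index vector


def index_vector(key, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary index vector for `key` as {position: sign}.

    The generator is seeded with the key, so the same key always yields
    the same vector and no table of index vectors has to be stored.
    """
    rng = random.Random(key)
    positions = rng.sample(range(dim), nonzero)
    return {pos: rng.choice((-1, 1)) for pos in positions}


class RandomIndex:
    """Hypothetical one-way random index with incremental updates."""

    def __init__(self, dim=DIM, nonzero=NONZERO):
        self.dim = dim
        self.nonzero = nonzero
        self.vectors = defaultdict(lambda: [0.0] * dim)

    def update(self, item, context, weight=1.0):
        """Add `weight` for the (item, context) relationship in place."""
        vec = self.vectors[item]
        for pos, sign in index_vector(context, self.dim, self.nonzero).items():
            vec[pos] += sign * weight

    def similarity(self, a, b):
        """Cosine similarity between the accumulated vectors of a and b."""
        va, vb = self.vectors[a], self.vectors[b]
        dot = sum(x * y for x, y in zip(va, vb))
        norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
        return dot / norm if norm else 0.0


ri = RandomIndex()
for ctx in ("purrs", "fur", "pet"):
    ri.update("cat", ctx)
    ri.update("dog", ctx)
for ctx in ("engine", "wheel", "road"):
    ri.update("car", ctx)

print(ri.similarity("cat", "dog"), ri.similarity("cat", "car"))
```

Items that share contexts end up with nearly parallel vectors, while items with unrelated contexts stay close to orthogonal because independently drawn sparse index vectors rarely overlap. The storage per item stays at `DIM` numbers no matter how many contexts are streamed in.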

Place, publisher, year, edition, pages
2017. Vol. 52, no 1, p. 267-290
Keywords [en]
Data mining, Dimensionality reduction, Natural language processing, Random embeddings, Semantic similarity, Sparse coding, Streaming algorithm
National Category
Natural Sciences
Identifiers
URN: urn:nbn:se:ri:diva-30273
DOI: 10.1007/s10115-016-1012-2
Scopus ID: 2-s2.0-85001755138
OAI: oai:DiVA.org:ri-30273
DiVA, id: diva2:1130806
Available from: 2017-08-11 Created: 2017-08-11 Last updated: 2018-02-23 Bibliographically approved

Open Access in DiVA

No full text in DiVA

By author/editor
Sahlgren, Magnus