Selective compound splitting of Swedish queries for boolean combination of truncated terms
RISE., Swedish ICT, SICS.
RISE - Research Institutes of Sweden (2017-2019), ICT, SICS. ORCID iD: 0000-0001-5100-0535
RISE., Swedish ICT, SICS.
2003 (English) Conference paper, Published paper (Refereed)
Abstract [en]

In compounding languages such as Swedish, it is often necessary to split compound words when indexing documents or queries. One of the problems is that it is difficult to find constituents that express a concept similar to that expressed by the compound. The approach taken here is to expand a query with the leading constituents of the compound words. Every query term is truncated so as to increase recall, in the hope of finding other compounds that have the leading constituent as a prefix. This approach increases recall in a rather uncontrolled way, so we use a Boolean quorum-level type of search to rank documents according to both a tf-idf factor and the number of matching Boolean combinations. The Boolean combinations performed relatively well, considering that the queries were very short (at most five search terms). Also included in this paper are the results of two other methods we are currently working on in our lab: one for re-ranking search results on the basis of stylistic analysis of documents, and one for dimensionality reduction using Random Indexing.
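
The query expansion and quorum-level ranking described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy lexicon `LEADING_CONSTITUENTS`, the function names, the prefix-matching form of truncation, and the particular tf-idf formula are all assumptions made for illustration. Each query term is grouped with its leading constituent (when one is found), every term is matched as a prefix, and documents are ranked first by how many query-term groups they match (the quorum level) and then by a summed tf-idf score.

```python
import math
from collections import Counter

# Hypothetical toy lexicon of leading constituents (an assumption for this sketch).
LEADING_CONSTITUENTS = {"fotboll", "miljö"}


def leading_constituent(term, lexicon=LEADING_CONSTITUENTS, min_len=3):
    """Return the longest lexicon entry that is a proper prefix of term, or None."""
    candidates = [
        c for c in lexicon
        if len(c) >= min_len and term.startswith(c) and len(term) > len(c)
    ]
    return max(candidates, key=len) if candidates else None


def expand_query(terms):
    """Group each query term with its leading constituent, if one is found."""
    groups = []
    for term in terms:
        head = leading_constituent(term)
        groups.append([term, head] if head else [term])
    return groups


def prefix_tf(prefix, doc_counts):
    """Truncated matching: count all tokens in a document that start with prefix."""
    return sum(n for tok, n in doc_counts.items() if tok.startswith(prefix))


def rank(query_terms, docs):
    """Rank document indices by quorum level first, then by summed tf-idf."""
    groups = expand_query(query_terms)
    counts = [Counter(d.lower().split()) for d in docs]
    n_docs = len(docs)
    results = []
    for i, doc_counts in enumerate(counts):
        matched, score = 0, 0.0
        for group in groups:
            best = 0.0
            for prefix in group:
                tf = prefix_tf(prefix, doc_counts)
                if tf == 0:
                    continue
                df = sum(1 for c in counts if prefix_tf(prefix, c) > 0)
                idf = math.log(1 + n_docs / df)  # assumed idf variant
                best = max(best, tf * idf)
            if best > 0:
                matched += 1  # this query-term group is satisfied
                score += best
        if matched:
            results.append((matched, score, i))
    # Higher quorum level wins; tf-idf breaks ties within a level.
    results.sort(key=lambda r: (-r[0], -r[1], r[2]))
    return [i for _, _, i in results]
```

With a query such as `["fotbollsmatch", "miljöpolitik"]`, a document containing only `fotbollsspelare` still matches through the truncated constituent `fotboll`, but it ranks below any document that satisfies both query-term groups.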

Place, publisher, year, edition, pages
2003, 1.
National subject category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-22400 OAI: oai:DiVA.org:ri-22400 DiVA, id: diva2:1041945
Conference
Fourth CLEF workshop, August 2003, Trondheim, Norway
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-12-02 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Person

Sahlgren, Magnus
