Ändra sökning
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
RISE., Swedish ICT, SICS. IAM.
Antal upphovsmän: 32012 (Engelska)Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

It has been established that incorporating word cluster features derived from large unlabeled corpora can significantly improve prediction of linguistic structure. While previous work has focused primarily on English, we extend these results to other languages along two dimensions. First, we show that these results hold true for a number of languages across families. Second, and more interestingly, we provide an algorithm for inducing cross-lingual clusters and we show that features derived from these clusters significantly improve the accuracy of cross-lingual structure prediction. Specifically, we show that by augmenting direct-transfer systems with cross-lingual cluster features, the relative error of delexicalized dependency parsers, trained on English treebanks and transferred to foreign languages, can be reduced by up to 13%. When applying the same method to direct transfer of named-entity recognizers, we observe relative improvements of up to 26%.

Ort, förlag, år, upplaga, sidor
2012, 8.
Nationell ämneskategori
Data- och informationsvetenskap
Identifikatorer
URN: urn:nbn:se:ri:diva-24024OAI: oai:DiVA.org:ri-24024DiVA, id: diva2:1043103
Konferens
The 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2012)
Tillgänglig från: 2016-10-31 Skapad: 2016-10-31 Senast uppdaterad: 2020-12-01Bibliografiskt granskad

Open Access i DiVA

fulltext(533 kB)220 nedladdningar
Filinformation
Filnamn FULLTEXT01.pdfFilstorlek 533 kBChecksumma SHA-512
dcf70d23ab40d88e5c6da3ea1bd84b4bda8c1b748232f4ecf9c66d8c23025d628d8f4a4634aa5dd3ab80d4f191958e92c556655ce542f05419192ff8cb6edd9e
Typ fulltextMimetyp application/pdf

Av organisationen
SICS
Data- och informationsvetenskap

Sök vidare utanför DiVA

GoogleGoogle Scholar
Totalt: 220 nedladdningar
Antalet nedladdningar är summan av nedladdningar för alla fulltexter. Det kan inkludera t.ex tidigare versioner som nu inte längre är tillgängliga.

urn-nbn

Altmetricpoäng

urn-nbn
Totalt: 373 träffar
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf