On Privacy Preservation and Document-based Active Learning for Named Entity Recognition
RISE, Swedish ICT, SICS. Userware.
Number of Authors: 1
2009 (English)
Conference paper, Published paper (Refereed)
Abstract [en]

The preservation of the privacy of persons mentioned in text requires the ability to automatically recognize and identify names. Named entity recognition is a mature field and most current approaches are based on supervised machine learning techniques. Such learning requires the presence of labeled examples on which to train; training examples are usually provided to the learner in the form of annotated corpora. Creating and annotating corpora is a tedious, meticulous and error-prone process; obtaining good training examples is a hard task in itself. This paper describes the development and in-depth empirical investigation of a method, called BootMark, for bootstrapping the marking up of named entities in textual documents. Experimental results show that BootMark requires a human annotator to manually annotate fewer documents in order to produce a named entity recognizer with a given performance than would be needed if the documents forming the basis for the recognizer were randomly drawn from the same corpus. The investigation further indicates that the primary gain obtained by BootMark compared to passive learning is in terms of higher recall. Thus, it is argued, the recognizers are suitable for use in privacy preservation applications.

Place, publisher, year, edition, pages
2009, 7.
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-23588
OAI: oai:DiVA.org:ri-23588
DiVA, id: diva2:1042664
Conference
ACM First International Workshop on Privacy and Anonymity for Very Large Datasets
Projects
COMPANIONS
Note
Workshop held in conjunction with The 18th ACM Conference on Information and Knowledge Management (CIKM 2009)
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2018-01-14 Bibliographically approved

Open Access in DiVA

No full text in DiVA
