1 - 14 of 14
  • 1.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Franzén, Kristofer
    Olsson, Fredrik
    RISE, Swedish ICT, SICS.
    Using heuristics, syntax and a local dynamic dictionary for protein name tagging. 2002. Conference paper (Refereed)
    Abstract [en]

    This paper presents work on a method to detect names of proteins in running text. The detection and categorisation of named entities, such as names of people, organisations and places, in classical MUC-style information extraction tasks (Borthwick 1998) might be regarded a solved problem. But names of proteins present a slightly different challenge because of their variant structural characteristics and the specifics of the text domains in which they appear. This certainly holds true for other biological substances, and probably for many other kinds of terminology as well. We will present the different steps involved in our approach to this problem, and show how combinations of them influence recall and precision.

  • 2.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Franzén, Kristofer
    RISE, Swedish ICT, SICS.
    Olsson, Fredrik
    RISE, Swedish ICT, SICS.
    Asker, Lars
    Lidén, Per
    Exploiting Syntax when Detecting Protein Names in Text. 2002. In: Proceedings of FMI Workshop on Natural Language Processing in Biomedical Applications, 2002, 1, p. 6. Conference paper (Refereed)
    Abstract [en]

    This paper presents work on a method to detect names of proteins in running text. Our system - Yapex - uses a combination of lexical and syntactic knowledge, heuristic filters and a local dynamic dictionary. The syntactic information given by a general-purpose off-the-shelf parser supports the correct identification of the boundaries of protein names, and the local dynamic dictionary finds protein names in positions incompletely analysed by the parser. We present the different steps involved in our approach to protein tagging, and show how combinations of them influence recall and precision. We evaluate the system on a corpus of MEDLINE abstracts and compare it with the KeX system (Fukuda et al., 1998) along four different notions of correctness.

  • 3.
    Franzén, Kristofer
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Olsson, Fredrik
    RISE, Swedish ICT, SICS.
    Asker, Lars
    Lidén, Per
    Cöster, Joakim
    Protein names and how to find them. 2002. In: International Journal of Medical Informatics, ISSN 1386-5056, E-ISSN 1872-8243, Vol. 67, p. 13, p. 49-61. Article in journal (Refereed)
    Abstract [en]

    A prerequisite for all higher level information extraction tasks is the identification of unknown names in text. Today, when large corpora can consist of billions of words, it is of utmost importance to develop accurate techniques for the automatic detection, extraction and categorization of named entities in these corpora. Although named entity recognition might be regarded a solved problem in some domains, it still poses a significant challenge in others. In this work we focus on one of the more difficult tasks, the identification of protein names in text. This task presents several interesting difficulties because of the named entities' variant structural characteristics, their sometimes unclear status as names, the lack of common standards and fixed nomenclatures, and the specifics of the texts in the molecular biology domain in which they appear. We describe how we approached these and other difficulties in the implementation of Yapex, a system for the automatic identification of protein names in text. We also evaluate Yapex under four different notions of correctness and compare its performance to that of another publicly available system for protein name recognition.

  • 4.
    Gambäck, Björn
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Fourla, Athanassia
    Natural Language Processing at the School of Information Studies for Africa. 2005. Conference paper (Refereed)
    Abstract [en]

    The lack of persons trained in computational linguistic methods is a severe obstacle to making the Internet and computers accessible to people all over the world in their own languages. The paper discusses the experiences of designing and teaching an introductory course in Natural Language Processing to graduate computer science students at Addis Ababa University, Ethiopia, in order to initiate the education of computational linguists in the Horn of Africa region.

  • 5.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Authors, Genre, and Linguistic Convention. 2007. In: Proceedings from the SIGIR Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection, 2007, 1, p. 5. Conference paper (Refereed)
  • 6.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Franzén, Kristofer
    RISE, Swedish ICT, SICS.
    Where Attitudinal Expressions Get Their Attitude. 2005. In: Computing Attitude and Affect in Text, Dordrecht: Springer, 2005, 1, Vol. 20. Chapter in book (Refereed)
  • 7.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Franzén, Kristofer
    RISE, Swedish ICT, SICS.
    Clough, Paul
    Hansen, Preben
    RISE, Swedish ICT, SICS.
    Mizzaro, Stefano
    Sanderson, Mark
    Reading between the lines: attitudinal expressions in text. 2004. Conference paper (Refereed)
    Abstract [en]

    This paper describes how a proposed project will research the expression of attitude, affect, and sentiment in text in order to automatically identify and extract such expressions. The project's starting points are a set of hypotheses:
    + There are syntactic and lexical markers in text such that attitudinal information can be harvested using them;
    + Players, or discourse referents, in text are one such crucial marker for modeling topicality in general and attitudinal information flow in particular;
    + Attitudes in texts are dependent on text type and domain;
    + Attitudinal information can be applied in the development of practical tools for information access, among other application areas;
    + An extended notion of relevance will afford us an empirical evaluation model for our theories and experiments.

  • 8.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Friesek, Madlen
    Gäde, Maria
    Hansen, Preben
    RISE, Swedish ICT, SICS.
    Järvelin, Anni
    RISE, Swedish ICT, SICS.
    Lupu, Mihai
    Müller, Henning
    Petras, Vivian
    Stiller, Juliane
    Initial specification of the evaluation tasks "Use cases to bridge validation and benchmarking" PROMISE Deliverable 2.1. 2011. Other (Other academic)
    Abstract [en]

    Evaluation of multimedia and multilingual information access systems needs to be performed from a usage oriented perspective. This document outlines use cases from the three use case domains of the PROMISE project and gives some initial pointers to how their respective characteristics can be extrapolated to determine and guide evaluation activities, both with respect to benchmarking and to validation of the usage hypotheses. The use cases will be developed further during the course of the evaluation activities and workshops projected to occur in coming CLEF conferences.

  • 9.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Täckström, Oscar
    RISE, Swedish ICT, SICS.
    SICS at NTCIR-7 MOAT: constructions represented in parallel with lexical items. 2008. In: Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies, 2008, 1, p. 4, p. 237-240. Conference paper (Refereed)
    Abstract [en]

    This paper describes experiments to find attitudinal expressions in written English text. The experiments are based on an analysis of text with respect to not only the vocabulary of content terms present in it (which most other approaches use as a basis for analysis) but also on structural features of the text as represented by presence of function words (in other approaches often removed by stop lists) and by presence of constructional features (typically disregarded by most other analyses). In our analysis, following a constructional grammatical framework, structural features are treated similarly to vocabulary features. Our result gives us reason to conclude - provisionally, until more empirical verification experiments can be performed - that:
    * Linguistic structural information does help in establishing whether a sentence is opinionated or not; whereas
    * Linguistic information of this specific type does not help in distinguishing sentences of differing polarity.

  • 10.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Täckström, Oscar
    RISE, Swedish ICT, SICS.
    Sahlgren, Magnus
    RISE - Research Institutes of Sweden (2017-2019), ICT, SICS.
    Between Bags and Trees - Constructional Patterns in Text Used for Attitude Identification. 2010. Conference paper (Refereed)
    Abstract [en]

    This paper describes experiments to use non-terminological information to find attitudinal expressions in written English text. The experiments are based on an analysis of text with respect to not only the vocabulary of content terms present in it (which most other approaches use as a basis for analysis) but also with respect to presence of structural features of the text represented by constructional features (typically disregarded by most other analyses). In our analysis, following a construction grammar framework, structural features are treated as occurrences, similarly to the treatment of vocabulary features. The constructional features in play are chosen to potentially signify opinion but are not specific to negative or positive expressions. The framework is used to classify clauses, headlines, and sentences from three different shared collections of attitudinal data. We find that constructional features transfer well across different text collections and that the information couched in them integrates easily with a vocabulary based approach, yielding improvements in classification without complicating the application end of the processing framework.

  • 11.
    Lidén, Per
    Asker, Lars
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Protein name tagging for browsing support, active database cross linking, and information retrieval. 2002. Conference paper (Refereed)
    Abstract [en]

    Whereas many applications of natural language processing for molecular biology focus on protein name tagging for the purpose of high-level information extraction from large corpuses of scientific text, such as automatic identification of protein-protein interactions, high quality protein name tagging has a value in itself. The aim of this study was to design, implement, and evaluate a high-accuracy protein name tagger, and give proof-of-concept for some of the most basic applications of protein name tagging in an information retrieval setting, namely browsing support, active database cross linking, and enhanced query functionality. A combination of heuristics, dictionary look-up, syntactic analysis, and the application of a local dynamic dictionary was used to create a protein name tagger. This tagger outperforms a previously published similar system when benchmarked on a corpus of manually annotated Medline abstracts. In addition to evaluating the tagging performance, the implemented algorithm was used to add mark-up to a corpus of approximately 10000 Medline abstracts, which were indexed in a state-of-the-art information retrieval system. Indexing highlights many basic benefits of adding named entity mark-up such as protein names. One obvious benefit is that the search process is enhanced by the addition of a search field. Furthermore, the mark-up can be used for providing active hyperlinks between protein entities in presented documents and protein sequence databases, such as SwissProt, when both databases are indexed in the same information retrieval system. Efficient links can also be constructed in the opposite direction, providing high precision retrieval of documents relevant for protein entries. Fast and accurate cross linking can be obtained by using an efficient implementation of the field based approximate cosine measure, which is a simple standard information retrieval technique for document similarity searching. This poster presents methods, results, implementation details, and features of a prototype system.

  • 12.
    Olsson, Fredrik
    RISE, Swedish ICT, SICS.
    Franzén, Kristofer
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Asker, Lars
    Lidén, Per
    Notions of correctness when evaluating protein name taggers. 2002. In: Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002): Vol. 1, 2002, 1. Conference paper (Refereed)
    Abstract [en]

    This paper introduces four different notions of correctness to be used when measuring the performance of protein name taggers, each of which reflects certain characteristics of the tagger under evaluation. The discussion regarding the different notions is centered around the evaluation of two protein name taggers: Yapex, developed by the authors, and KeX, developed by Fukuda et al. (1998). For the purpose of illustrating the difference between the ways of evaluation, both taggers are applied to a corpus of 101 MEDLINE abstracts in which all occurrences of protein names have been marked up by domain experts.

  • 13.
    Sahlgren, Magnus
    RISE - Research Institutes of Sweden (2017-2019), ICT, SICS.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    SICS: Valence annotation based on seeds in word space. 2007. Conference paper (Refereed)
  • 14.
    Täckström, Oscar
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Velupillai, Sumithra
    Dalianis, Hercules
    Hassel, Martin
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Uncertainty Detection as Approximate Max-Margin Sequence Labelling. 2010. In: Proceedings of the Fourteenth Conference on Computational Natural Language Learning, 2010, 12. Conference paper (Refereed)
    Abstract [en]

    This paper reports experiments for the CoNLL 2010 shared task on learning to detect hedges and their scope in natural language text. We have addressed the experimental tasks as supervised linear maximum margin prediction problems. For sentence level hedge detection in the biological domain we use an L1-regularised binary support vector machine, while for sentence level weasel detection in the Wikipedia domain, we use an L2-regularised approach. We model the in-sentence uncertainty cue and scope detection task as an L2-regularised approximate maximum margin sequence labelling problem, using the BIO-encoding. In addition to surface level features, we use a variety of linguistic features based on a functional dependency analysis. A greedy forward selection strategy is used in exploring the large set of potential features. Our official results for Task 1 for the biological domain are 85.2 F1-score, for the Wikipedia set 55.4 F1-score. For Task 2, our official results are 2.1 for the entire task with a score of 62.5 for cue detection. After resolving errors and final bugs, our final results are for Task 1, biological: 86.0, Wikipedia: 58.2; Task 2, scopes: 39.6 and cues: 78.5.
