A novel approach to estimate proximity in a random forest: An exploratory study
RISE, Swedish ICT, Viktoria (Kooperativa System). ORCID iD: 0000-0002-1043-8773
Halmstad University, Sweden; Kaunas University of Technology, Lithuania.
2012 (English)
In: Expert systems with applications, ISSN 0957-4174, E-ISSN 1873-6793, Vol. 39, no 17, p. 13046-13050
Article in journal (Refereed) Published
Abstract [en]

A data proximity matrix is an important information source in random forest (RF) based data mining, including data clustering, visualization, outlier detection, substitution of missing values, and finding mislabeled data samples. A novel approach to estimating proximity is proposed in this work. The approach is based on measuring the distance between two terminal nodes in a decision tree. To assess the consistency (quality) of a data proximity estimate, we suggest using the proximity matrix as a kernel matrix in a support vector machine (SVM), under the assumption that a matrix of higher quality leads to higher classification accuracy. It is experimentally shown that the proposed approach improves the proximity estimate, especially when the RF is made of a small number of trees. It is also demonstrated that, for some tasks, an SVM exploiting the suggested proximity-matrix-based kernel outperforms both an SVM based on a standard radial basis function kernel and one based on the standard proximity-matrix-based kernel.
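
The contrast between the standard RF proximity and a terminal-node-distance proximity can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so everything below is an assumption made for illustration only: each leaf is encoded as its root-to-leaf path of 'L'/'R' moves, the distance between two leaves is the number of edges on the path joining them, and the hypothetical mapping exp(-scale * distance) turns that distance into a per-tree similarity. The function names and the `scale` parameter are likewise invented here and are not taken from the paper.

```python
import math


def leaf_distance(a, b):
    """Edges on the tree path between two leaves given as 'L'/'R' path strings."""
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    # Climb from each leaf up to the deepest common ancestor.
    return (len(a) - shared) + (len(b) - shared)


def standard_proximity(leaves):
    """Fraction of trees in which two samples reach the same terminal node.

    leaves[t][i] is the leaf path reached by sample i in tree t.
    """
    n_trees, n = len(leaves), len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for tree in leaves:
        for i in range(n):
            for j in range(n):
                if tree[i] == tree[j]:
                    prox[i][j] += 1.0 / n_trees
    return prox


def distance_proximity(leaves, scale=1.0):
    """Assumed variant: average exp(-scale * leaf distance) over the trees."""
    n_trees, n = len(leaves), len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for tree in leaves:
        for i in range(n):
            for j in range(n):
                d = leaf_distance(tree[i], tree[j])
                prox[i][j] += math.exp(-scale * d) / n_trees
    return prox


# Two toy trees, three samples; no pair of samples ever shares a leaf.
leaves = [
    ["LL", "LR", "R"],
    ["L", "RL", "RR"],
]
standard = standard_proximity(leaves)
graded = distance_proximity(leaves)
```

In this toy forest the standard proximity between samples 0 and 1 is zero because they never share a leaf, while the distance-based variant still assigns them a graded similarity. That is one plausible reading of why such an estimate might help when the forest has few trees. A matrix of this kind could then be tried as a precomputed kernel, for instance with scikit-learn's `SVC(kernel="precomputed")`, which is how the quality check described in the abstract might be reproduced.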

Place, publisher, year, edition, pages
2012. Vol. 39, no 17, p. 13046-13050
Keywords [en]
Data mining, Kernel matrix, Proximity matrix, Random forest, Support vector machine
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:ri:diva-27852
DOI: 10.1016/j.eswa.2012.05.094
Scopus ID: 2-s2.0-84865043451
OAI: oai:DiVA.org:ri-27852
DiVA, id: diva2:1064123
Available from: 2017-01-11 Created: 2017-01-11 Last updated: 2021-01-07 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus
http://www.sciencedirect.com/science/article/pii/S095741741200810X

Authority records

Englund, Cristofer
