BoostVHT: Boosting distributed streaming decision trees
RISE - Research Institutes of Sweden, ICT, SICS. ORCID iD: 0000-0002-8180-7521
KTH Royal Institute of Technology, Sweden.
Qatar Computing Research Institute, Qatar.
2017 (English). In: International Conference on Information and Knowledge Management, Proceedings, 2017, pp. 899-908. Conference paper, Published paper (Refereed)
Abstract [en]

Online boosting improves the accuracy of classifiers for unbounded streams of data by chaining them into an ensemble. Due to its sequential nature, boosting has proven hard to parallelize, even more so in the online setting. This paper introduces BoostVHT, a technique to parallelize online boosting algorithms. Our proposal leverages a recently developed model-parallel learning algorithm for streaming decision trees as a base learner. This design neatly separates the boosting of the model from its training. As a result, BoostVHT provides a flexible learning framework that can employ any existing online boosting algorithm while leveraging the computing power of modern parallel and distributed cluster environments. We implement our technique on Apache SAMOA, an open-source platform for mining big data streams that can be run on several distributed execution engines, and demonstrate order-of-magnitude speedups compared to the state of the art. © 2017 Copyright held by the owner/author(s).
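
To make the idea of chaining classifiers over a stream concrete, here is a minimal, sequential sketch of online boosting in the style of Oza and Russell's OzaBoost. It is not BoostVHT and involves no parallelism or Apache SAMOA code. The names `NearestMeanLearner`, `OnlineBooster`, `poisson` and `make_stream` are hypothetical, the nearest-mean base learner is a toy stand-in for a streaming decision tree, and the rate cap `max_rate` and the rule of ignoring members with error of 0.5 or more are simplifying assumptions.

```python
import math
import random
from collections import defaultdict


class NearestMeanLearner:
    """Toy incremental base learner (assumed stand-in for a streaming tree)."""

    def __init__(self):
        self.sums = {}                    # label -> per-feature weighted sums
        self.counts = defaultdict(float)  # label -> total weight seen

    def learn(self, x, y, weight=1):
        if weight <= 0:
            return
        sums = self.sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            sums[i] += weight * value
        self.counts[y] += weight

    def predict(self, x):
        best, best_dist = None, math.inf
        for label, sums in self.sums.items():
            n = self.counts[label]
            dist = sum((v - s / n) ** 2 for v, s in zip(x, sums))
            if dist < best_dist:
                best, best_dist = label, dist
        return best  # None until at least one example has been learned


def poisson(lam, rng):
    """Knuth's Poisson sampler; adequate for the small rates used here."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class OnlineBooster:
    def __init__(self, n_learners=3, seed=0, max_rate=20.0):
        self.learners = [NearestMeanLearner() for _ in range(n_learners)]
        self.correct = [0.0] * n_learners  # weight each member got right
        self.wrong = [0.0] * n_learners    # weight each member got wrong
        self.rng = random.Random(seed)
        self.max_rate = max_rate           # assumption: cap on the Poisson rate

    def learn(self, x, y):
        lam = 1.0  # example weight, passed down the chain of members
        for m, learner in enumerate(self.learners):
            learner.learn(x, y, poisson(lam, self.rng))
            if learner.predict(x) == y:
                self.correct[m] += lam
                acc = self.correct[m] / (self.correct[m] + self.wrong[m])
                lam /= 2.0 * acc           # down-weight for later members
            else:
                self.wrong[m] += lam
                err = self.wrong[m] / (self.correct[m] + self.wrong[m])
                lam /= 2.0 * err           # up-weight for later members
            lam = min(lam, self.max_rate)

    def predict(self, x):
        votes = defaultdict(float)
        for m, learner in enumerate(self.learners):
            total = self.correct[m] + self.wrong[m]
            label = learner.predict(x)
            if total == 0 or label is None:
                continue
            err = max(self.wrong[m] / total, 1e-9)
            if err >= 0.5:                 # simplification: skip weak members
                continue
            votes[label] += math.log((1 - err) / err)
        return max(votes, key=votes.get) if votes else None


def make_stream(n, seed=1):
    """Synthetic two-class stream: class 0 near (0, 0), class 1 near (5, 5)."""
    rng = random.Random(seed)
    for i in range(n):
        label = i % 2
        c = 5.0 * label
        yield (c + rng.uniform(-1, 1), c + rng.uniform(-1, 1)), label
```

Each example is seen once: a member trains on it a Poisson-distributed number of times, and the example's weight is raised or lowered before it reaches the next member. That per-example dependency between members is the sequential step the abstract describes as hard to parallelize.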

Place, publisher, year, edition, pages
2017. pp. 899-908
Keywords [en]
Boosting, Decision trees, Distributed systems, Online learning, Big data, Cluster computing, Clustering algorithms, Data mining, Distributed computer systems, Forestry, Knowledge management, Online systems, Trees (mathematics), Distributed clusters, Distributed streaming, Flexible Learning, Open source platforms, Parallel learning algorithms, Learning algorithms
Identifiers
URN: urn:nbn:se:ri:diva-33210
DOI: 10.1145/3132847.3132974
Scopus ID: 2-s2.0-85037345394
ISBN: 9781450349185 (print)
OAI: oai:DiVA.org:ri-33210
DiVA, id: diva2:1179210
Conference
26th ACM International Conference on Information and Knowledge Management, CIKM 2017, 6 November 2017 through 10 November 2017
Available from: 2018-01-31. Created: 2018-01-31. Last updated: 2018-08-13. Bibliographically approved.

Open Access in DiVA

No full text in DiVA


Person

Vasiloudis, Theodore
