Hi-WAY: Execution of scientific workflows on Hadoop YARN
Humboldt-Universität zu Berlin, Germany.
Humboldt-Universität zu Berlin, Germany.
Humboldt-Universität zu Berlin, Germany.
RISE - Research Institutes of Sweden, ICT, SICS. ORCID iD: 0000-0002-9484-6714
2017 (English). In: Advances in Database Technology - EDBT, OpenProceedings.org, 2017, p. 668-679. Conference paper, Published paper (Refereed)
Abstract [en]

Scientific workflows provide a means to model, execute, and exchange the increasingly complex analysis pipelines necessary for today’s data-driven science. However, existing scientific workflow management systems (SWfMSs) are often limited to a single workflow language and lack adequate support for large-scale data analysis. On the other hand, current distributed dataflow systems are based on a semi-structured data model, which makes integration of arbitrary tools cumbersome or forces re-implementation. We present the scientific workflow execution engine Hi-WAY, which implements a strict black-box view on tools to be integrated and data to be processed. Its generic yet powerful execution model allows Hi-WAY to execute workflows specified in a multitude of different languages. Hi-WAY compiles workflows into schedules for Hadoop YARN, harnessing its proven scalability. It allows for iterative and recursive workflow structures and optimizes performance through adaptive and data-aware scheduling. Reproducibility of workflow executions is achieved through automated setup of infrastructures and re-executable provenance traces. In this application paper we discuss limitations of current SWfMSs regarding scalable data analysis, describe the architecture of Hi-WAY, highlight its most important features, and report on several large-scale experiments from different scientific domains. © 2017, Copyright is with the authors.

Place, publisher, year, edition, pages
OpenProceedings.org, 2017. p. 668-679
National subject category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-38105
DOI: 10.5441/002/edbt.2017.87
Scopus ID: 2-s2.0-85046452463
ISBN: 9783893180738 (print)
OAI: oai:DiVA.org:ri-38105
DiVA, id: diva2:1294777
Conference
20th International Conference on Extending Database Technology, EDBT 2017, 21 March 2017 through 24 March 2017
Available from: 2019-03-08. Created: 2019-03-08. Last updated: 2023-05-22. Bibliographically approved.

Person

Dowling, Jim