Hi-WAY: Execution of scientific workflows on Hadoop YARN
Humboldt-Universität zu Berlin, Germany.
Humboldt-Universität zu Berlin, Germany.
Humboldt-Universität zu Berlin, Germany.
RISE - Research Institutes of Sweden, ICT, SICS. ORCID iD: 0000-0002-9484-6714
2017 (English)
In: Advances in Database Technology - EDBT, OpenProceedings.org, 2017, p. 668-679
Conference paper, Published paper (Refereed)
Abstract [en]

Scientific workflows provide a means to model, execute, and exchange the increasingly complex analysis pipelines necessary for today’s data-driven science. However, existing scientific workflow management systems (SWfMSs) are often limited to a single workflow language and lack adequate support for large-scale data analysis. On the other hand, current distributed dataflow systems are based on a semi-structured data model, which makes integration of arbitrary tools cumbersome or forces re-implementation. We present the scientific workflow execution engine Hi-WAY, which implements a strict black-box view on tools to be integrated and data to be processed. Its generic yet powerful execution model allows Hi-WAY to execute workflows specified in a multitude of different languages. Hi-WAY compiles workflows into schedules for Hadoop YARN, harnessing its proven scalability. It allows for iterative and recursive workflow structures and optimizes performance through adaptive and data-aware scheduling. Reproducibility of workflow executions is achieved through automated setup of infrastructures and re-executable provenance traces. In this application paper we discuss limitations of current SWfMSs regarding scalable data analysis, describe the architecture of Hi-WAY, highlight its most important features, and report on several large-scale experiments from different scientific domains. © 2017, Copyright is with the authors.

Place, publisher, year, edition, pages
OpenProceedings.org, 2017. p. 668-679
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-38105
DOI: 10.5441/002/edbt.2017.87
Scopus ID: 2-s2.0-85046452463
ISBN: 9783893180738 (print)
OAI: oai:DiVA.org:ri-38105
DiVA, id: diva2:1294777
Conference
20th International Conference on Extending Database Technology, EDBT 2017, 21 March 2017 through 24 March 2017
Available from: 2019-03-08; Created: 2019-03-08; Last updated: 2023-05-22; Bibliographically approved

Authority records

Dowling, Jim
