PonIC: Using Stratosphere to Speed Up Pig Analytics
KTH Royal Institute of Technology, Sweden.
KTH Royal Institute of Technology, Sweden.
RISE, Swedish ICT, SICS.
2013 (English). In: Lecture Notes in Computer Science, 2013, Vol. 8097, p. 279-290. Conference paper, Published paper (Refereed)
Abstract [en]

Pig, a high-level dataflow system built on top of Hadoop MapReduce, has greatly facilitated the implementation of data-intensive applications. By translating scripts into MapReduce jobs, Pig conceals Hadoop's single-input, inflexible two-stage pipeline from the user. However, these limitations remain in the backend and often result in inefficient execution. Stratosphere, a data-parallel computing framework consisting of PACT, an extension of the MapReduce programming model, and the Nephele execution engine, overcomes several limitations of Hadoop MapReduce. In this paper, we argue that Pig can benefit greatly from using Stratosphere as its backend system, gaining performance without any loss of expressiveness. We have ported Pig to run on top of Stratosphere, and we present a process for translating Pig Latin scripts into PACT programs. Our evaluation shows that, for a certain class of applications, Pig Latin scripts execute up to 8 times faster on our prototype.

Place, publisher, year, edition, pages
2013. Vol. 8097, p. 279-290
Keywords [en]
Backend system, Computing frameworks, Data parallel, Data-intensive application, Execution engine, Gain performance, Hadoop MapReduce, Map-reduce programming, Multiprocessing systems, Parallel architectures, Program translators, Mammals
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:ri:diva-48671
DOI: 10.1007/978-3-642-40047-6_30
Scopus ID: 2-s2.0-84883160941
ISBN: 9783642400469 (print)
OAI: oai:DiVA.org:ri-48671
DiVA, id: diva2:1469322
Conference
19th International Conference on Parallel Processing, Euro-Par 2013; Aachen; Germany; 26 August 2013 through 30 August 2013
Available from: 2020-09-21. Created: 2020-09-21. Last updated: 2020-12-01. Bibliographically approved.
