Change search
Link to record
Permanent link

Direct link
BETA
Publications (10 of 16) Show all publications
Castaneda Lozano, R., Carlsson, M., Blindell, G. & Schulte, C. (2019). Combinatorial register allocation and instruction scheduling. ACM Transactions on Programming Languages and Systems, 41(3), Article ID 17.
Open this publication in new window or tab >>Combinatorial register allocation and instruction scheduling
2019 (English)In: ACM Transactions on Programming Languages and Systems, ISSN 0164-0925, E-ISSN 1558-4593, Vol. 41, no 3, article id 17Article in journal (Refereed) Published
Abstract [en]

This article introduces a combinatorial optimization approach to register allocation and instruction scheduling, two central compiler problems. Combinatorial optimization has the potential to solve these problems optimally and to exploit processor-specific features readily. Our approach is the first to leverage this potential in practice: it captures the complete set of program transformations used in state-of-the-art compilers, scales to medium-sized functions of up to 1,000 instructions, and generates executable code. This level of practicality is reached by using constraint programming, a particularly suitable combinatorial optimization technique. Unison, the implementation of our approach, is open source, used in industry, and integrated with the LLVM toolchain. An extensive evaluation confirms that Unison generates better code than LLVM while scaling to medium-sized functions. The evaluation uses systematically selected benchmarks from MediaBench and SPEC CPU2006 and different processor architectures (Hexagon, ARM, MIPS). Mean estimated speedup ranges from 1.1% to 10% and mean code size reduction ranges from 1.3% to 3.8% for the different architectures. A significant part of this improvement is due to the integrated nature of the approach. Executing the generated code on Hexagon confirms that the estimated speedup results in actual speedup. Given a fixed time limit, Unison solves optimally functions of up to 946 instructions, nearly an order of magnitude larger than previous approaches. The results show that our combinatorial approach can be applied in practice to trade compilation time for code quality beyond the usual compiler optimization levels, identify improvement opportunities in heuristic algorithms, and fully exploit processor-specific features.

Place, publisher, year, edition, pages
Association for Computing Machinery, 2019
Keywords
Combinatorial optimization, Instruction scheduling, Register allocation
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-39654 (URN)10.1145/3332373 (DOI)2-s2.0-85068443848 (Scopus ID)
Note

 Funding details: Vetenskapsrådet, VR, 621-2011-6229; Funding text 1: This article is partially based on preliminary work presented at the Principles and Practice of Constraint Programming (2012) [20]; Languages, Compilers, and Tools for Embedded Systems (2014) [21]; and Compiler Construction (2016) [22]conferences. Compared to the preliminary work, this article is completely restructured and rewritten, completes the combinatorial model with rematerialization, proposes extensions to capture additional program transformations and processor-specific features, and contributes a more exhaustive evaluation. Additions to the evaluation include more benchmarks and processors, evidence of the fundamental benefit of the integrated approach, an in-depth study of scalability, and actual execution measurements. This work has been partially funded by Ericsson AB and the Swedish Research Council (VR) under grant 621-2011-6229. Authors’ addresses: R. C. Lozano, RISE SICS, Electrum 229, Kista, 164 40, Sweden, KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science, Electrum 229, Kista, 164 40, Sweden; email: roberto.castaneda@ri.se; M. Carlsson, RISE SICS, Electrum 229, Kista, 164 40, Sweden; email: mats.carlsson@ri.se; G. H. Blindell, KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science, Electrum 229, Kista, 164 40, Sweden; email: ghb@kth.se; C. Schulte, KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science, Electrum 229, Kista, 164 40, Sweden, RISE SICS, Electrum 229, Kista, 164 40, Sweden; email: cschulte@kth.se. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. © 2019 Association for Computing Machinery. 0164-0925/2019/07-ART17 $15.00 https://doi.org/10.1145/3332373

Available from: 2019-08-07 Created: 2019-08-07 Last updated: 2019-08-07Bibliographically approved
Castañedalozano, R. & Schulte, C. (2019). Survey on combinatorial register allocation and instruction scheduling. ACM Computing Surveys, 52(3), Article ID 62.
Open this publication in new window or tab >>Survey on combinatorial register allocation and instruction scheduling
2019 (English)In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 52, no 3, article id 62Article in journal (Refereed) Published
Abstract [en]

Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the past three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible at the expense of increased compilation time. This article provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization. .

Place, publisher, year, edition, pages
Association for Computing Machinery, 2019
Keywords
Combinatorial optimization, Instruction scheduling, Register allocation, Computer programming, Constraint theory, Economic and social effects, Heuristic algorithms, Program compilers, Quadratic programming, Scheduling, Assembly code, Constraint programming, Instruction level parallelism, Literature reviews, Optimal solutions, Potential benefits, Integer programming
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-39459 (URN)10.1145/3200920 (DOI)2-s2.0-85068049964 (Scopus ID)
Available from: 2019-07-08 Created: 2019-07-08 Last updated: 2019-07-08Bibliographically approved
Hjort Blindell, G., Carlsson, M., Lozano, R. C. & Schulte, C. (2017). Complete and practical universal instruction selection. ACM Transactions on Embedded Computing Systems, 16(5s), Article ID 119.
Open this publication in new window or tab >>Complete and practical universal instruction selection
2017 (English)In: ACM Transactions on Embedded Computing Systems, ISSN 1539-9087, E-ISSN 1558-3465, Vol. 16, no 5s, article id 119Article in journal (Refereed) Published
Abstract [en]

In code generation, instruction selection chooses processor instructions to implement a program under compilation where code quality crucially depends on the choice of instructions. Using methods from combinatorial optimization, this paper proposes an expressive model that integrates global instruction selection with global code motion. The model introduces (1) handling of memory computations and function calls, (2) a method for inserting additional jump instructions where necessary, (3) a dependency-based technique to ensure correct combinations of instructions, (4) value reuse to improve code quality, and (5) an objective function that reduces compilation time and increases scalability by exploiting bounding techniques. The approach is demonstrated to be complete and practical, competitive with LLVM, and potentially optimal (w.r.t. the model) for medium-sized functions. The results show that combinatorial optimization for instruction selection is well-suited to exploit the potential of modern processors in embedded systems.

Keywords
Code generation, Combinatorial optimization, Constraint programming, Instruction selection, Computer programming, Constraint theory, Embedded systems, Program processors, Bounding techniques, Function calls, Memory computations, Modern processors, Objective functions, Codes (symbols)
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-33198 (URN)10.1145/3126528 (DOI)2-s2.0-85030692980 (Scopus ID)
Available from: 2018-01-31 Created: 2018-01-31 Last updated: 2018-08-23Bibliographically approved
Corcoran, D., Andimeh, L., Ermedahl, A., Kreuger, P. & Schulte, C. (2017). Data driven selection of DRX for energy efficient 5G RAN. In: : . Paper presented at 2017 13th International Conference on Network and Service Management (CNSM). 26-30 Nov. 2017. Tokyo, Japan. Tokyo: IEEE conference proceedings
Open this publication in new window or tab >>Data driven selection of DRX for energy efficient 5G RAN
Show others...
2017 (English)Conference paper, Published paper (Refereed)
Abstract [en]

The number of connected mobile devices is increasing rapidly with more than 10 billion expected by 2022. Their total aggregate energy consumption poses a significant concern to society. The current 3gpp (3rd Generation Partnership Project) LTE/LTE-Advanced standard incorporates an energy saving technique called discontinuous reception (DRX). It is expected that 5G will use an evolved variant of this scheme. In general, the single selection of DRX parameters per device is non trivial. This paper describes how to improve energy efficiency of mobile devices by selecting DRX based on the traffic profile per device. Our particular approach uses a two phase data-driven strategy which tunes the selection of DRX parameters based on a smart fast energy model. The first phase involves the off-line selection of viable DRX combinations for a particular traffic mix. The second phase involves an on-line selection of DRX from this viable list. The method attempts to guarantee that latency is not worse than a chosen threshold. Alternatively, longer battery life for a device can be traded against increased latency. We built a lab prototype of the system to verify that the technique works and scales on a real LTE system. We also designed a sophisticated traffic generator based on actual user data traces. Complementary method verification has been made by exhaustive off-line simulations on recorded LTE network data. Our approach shows significant device energy savings, which has the aggregated potential over billions of devices to make a real contribution to green, energy efficient networks.

Place, publisher, year, edition, pages
Tokyo: IEEE conference proceedings, 2017
Keywords
Software architecture, 5G mobile communication, Adaptive systems, Energy efficiency, Green computing
National Category
Computer Sciences
Identifiers
urn:nbn:se:ri:diva-33288 (URN)10.23919/CNSM.2017.8255972 (DOI)2-s2.0-85046680815 (Scopus ID)
Conference
2017 13th International Conference on Network and Service Management (CNSM). 26-30 Nov. 2017. Tokyo, Japan
Available from: 2018-02-20 Created: 2018-02-20 Last updated: 2018-12-19Bibliographically approved
Castaneda Lozano, R., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2016). Register allocation and instruction scheduling in Unison (6ed.). In: Proceedings of the 25th International Conference on Compiler Construction: . Paper presented at 25th International Conference on Compiler Construction (CC 2016), March 17-18, 2016, Barcelona, Spain (pp. 263-264).
Open this publication in new window or tab >>Register allocation and instruction scheduling in Unison
2016 (English)In: Proceedings of the 25th International Conference on Compiler Construction, 2016, 6, p. 263-264Conference paper, Published paper (Refereed)
Abstract [en]

This paper describes Unison, a simple, flexible, and potentially optimal software tool that performs register allocation and instruction scheduling in integration using combinatorial optimization. The tool can be used as an alternative or as a complement to traditional approaches, which are fast but complex and suboptimal. Unison is most suitable whenever high-quality code is required and longer compilation times can be tolerated (such as in embedded systems or library releases), or the targeted processors are so irregular that traditional compilers fail to generate satisfactory code.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24536 (URN)10.1145/2892208.2892237 (DOI)978-1-4503-4241-4 (ISBN)
Conference
25th International Conference on Compiler Construction (CC 2016), March 17-18, 2016, Barcelona, Spain
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2019-06-25Bibliographically approved
Hjort Blindell, G., Castaneda Lozano, R., Carlsson, M. & Schulte, C. (2015). Modeling Universal Instruction Selection (7ed.). In: Principles and Practice of Constraint Programming: . Paper presented at 21st International Conference on the Principles and Practice of Constraint Programming (CP 2015), August 31 - September 4, 2015, Cork, Ireland (pp. 609-626). , 9255
Open this publication in new window or tab >>Modeling Universal Instruction Selection
2015 (English)In: Principles and Practice of Constraint Programming, 2015, 7, Vol. 9255, p. 609-626Conference paper, Published paper (Refereed)
Abstract [en]

Instruction selection implements a program under compilation by selecting processor instructions and has tremendous impact on the performance of the code generated by a compiler. This paper introduces a graph-based universal representation that unifiees data and control flow for both programs and processor instructions. The representation is the essential prerequisite for a constraint model for instruction selection introduced in this paper. The model is demonstrated to be expressive in that it supports many processor features that are out of reach of state-of-the-art approaches, such as advanced branching instructions, multiple register banks, and SIMD instructions. The resulting model can be solved for small to medium size input programs and sophisticated processor instructions and is competitive with LLVM in code quality. Model and representation are significant due to their expressiveness and their potential to be combined with models for other code generation tasks.

Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 9255
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24436 (URN)10.1007/978-3-319-23219-5_42 (DOI)978-3-319-23218-8 (ISBN)978-3-319-23219-5 (ISBN)
Conference
21st International Conference on the Principles and Practice of Constraint Programming (CP 2015), August 31 - September 4, 2015, Cork, Ireland
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2019-07-11Bibliographically approved
Castaneda Lozano, R., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2014). Combinatorial Spill Code Optimization and Ultimate Coalescing (5ed.). In: : . Paper presented at Fourteenth ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (pp. 23-32).
Open this publication in new window or tab >>Combinatorial Spill Code Optimization and Ultimate Coalescing
2014 (English)Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents a novel combinatorial model that integrates global register allocation based on ultimate coalescing, spill code optimization, register packing, and multiple register banks with instruction scheduling (including VLIW). The model exploits alternative temporaries that hold the same value as a new concept for ultimate coalescing and spill code optimization. The paper presents Unison as a code generator based on the model and advanced solving techniques using constraint programming. Thorough experiments using MediaBench and a processor (Hexagon) that are typical for embedded systems demonstrate that Unison: is robust and scalable; generates faster code than LLVM (up to 41% with a mean improvement of 7%); possibly generates optimal code (for 29% of the experiments); effortlessly supports different optimization criteria (code size on par with LLVM). Unison is significant as it addresses the same aspects as traditional code generation algorithms, yet is based on a simple integrated model and robustly can generate optimal code.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24161 (URN)10.1145/2597809.2597815 (DOI)2-s2.0-84907032397 (Scopus ID)
Conference
Fourteenth ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2019-08-15Bibliographically approved
Castaneda Lozano, R. & Schulte, C. (2014). Survey on Combinatorial Register Allocation and Instruction Scheduling (4ed.). arXiv:1409.7628 [cs.PL]
Open this publication in new window or tab >>Survey on Combinatorial Register Allocation and Instruction Scheduling
2014 (English)In: arXiv:1409.7628 [cs.PL]Article in journal (Refereed) Published
Abstract [en]

Register allocation and instruction scheduling are two central compiler back-end problems that are critical for quality. In the last two decades, combinatorial optimization has emerged as an alternative approach to traditional, heuristic algorithms for these problems. Combinatorial approaches are generally slower but more flexible than their heuristic counterparts and have the potential to generate optimal code. This paper surveys existing literature on combinatorial register allocation and instruction scheduling. The survey covers approaches that solve each problem in isolation as well as approaches that integrate both problems. The latter have the potential to generate code that is globally optimal by capturing the trade-off between conflicting register allocation and instruction scheduling decisions.

Place, publisher, year, edition, pages
arXiv, 2014 Edition: 4
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24383 (URN)10.1145/3200920 (DOI)2-s2.0-85068049964 (Scopus ID)
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2019-08-13Bibliographically approved
Castaneda Lozano, R., Hjort Blindell, G., Carlsson, M., Drejhammar, F. & Schulte, C. (2013). Constraint-based Code Generation (6ed.). In: : . Paper presented at Sixteenth International Workshop on Software and Compilers for Embedded Systems (pp. 93-95).
Open this publication in new window or tab >>Constraint-based Code Generation
Show others...
2013 (English)Conference paper, Published paper (Refereed)
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24216 (URN)
Conference
Sixteenth International Workshop on Software and Compilers for Embedded Systems
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2019-06-20Bibliographically approved
Castaneda Lozano, R., Carlsson, M., Drejhammar, F. & Schulte, C. (2012). Constraint-based Register Allocation and Instruction Scheduling (6ed.). In: : . Paper presented at Eighteenth International Conference on Principles and Practice of Constraint Programming (pp. 750-766).
Open this publication in new window or tab >>Constraint-based Register Allocation and Instruction Scheduling
2012 (English)Conference paper, Published paper (Refereed)
Abstract [en]

This paper introduces a constraint model and solving techniques for code generation in a compiler back-end. It contributes a new model for global register allocation that combines several advanced aspects: multiple register banks (subsuming spilling to memory), coalescing, and packing. The model is extended to include instruction scheduling and bundling. The paper introduces a decomposition scheme exploiting the underlying program structure and exhibiting robust behavior for functions with thousands of instructions. Evaluation shows that code quality is on par with LLVM, a state-of-the-art compiler infrastructure. The paper makes important contributions to the applicability of constraint programming as well as compiler construction: essential concepts are unified in a high-level model that can be solved by readily available modern solvers. This is a significant step towards basing code generation entirely on a high-level model and by this facilitates the construction of correct, simple, flexible, robust, and high-quality code generators.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24215 (URN)
Conference
Eighteenth International Conference on Principles and Practice of Constraint Programming
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2019-06-20Bibliographically approved
Identifiers
ORCID iD: ORCID iD iconorcid.org/0000-0002-6283-7004

Search in DiVA

Show all publications
v. 2.35.7