Change search
Link to record
Permanent link

Direct link
BETA
Castañeda Lozano, RobertoORCID iD iconorcid.org/0000-0002-2806-7333
Alternative names
Publications (10 of 10) Show all publications
Castaneda Lozano, R., Carlsson, M., Blindell, G. & Schulte, C. (2019). Combinatorial register allocation and instruction scheduling. ACM Transactions on Programming Languages and Systems, 41(3), Article ID 17.
Open this publication in new window or tab >>Combinatorial register allocation and instruction scheduling
2019 (English)In: ACM Transactions on Programming Languages and Systems, ISSN 0164-0925, E-ISSN 1558-4593, Vol. 41, no 3, article id 17Article in journal (Refereed) Published
Abstract [en]

This article introduces a combinatorial optimization approach to register allocation and instruction scheduling, two central compiler problems. Combinatorial optimization has the potential to solve these problems optimally and to exploit processor-specific features readily. Our approach is the first to leverage this potential in practice: it captures the complete set of program transformations used in state-of-the-art compilers, scales to medium-sized functions of up to 1,000 instructions, and generates executable code. This level of practicality is reached by using constraint programming, a particularly suitable combinatorial optimization technique. Unison, the implementation of our approach, is open source, used in industry, and integrated with the LLVM toolchain. An extensive evaluation confirms that Unison generates better code than LLVM while scaling to medium-sized functions. The evaluation uses systematically selected benchmarks from MediaBench and SPEC CPU2006 and different processor architectures (Hexagon, ARM, MIPS). Mean estimated speedup ranges from 1.1% to 10% and mean code size reduction ranges from 1.3% to 3.8% for the different architectures. A significant part of this improvement is due to the integrated nature of the approach. Executing the generated code on Hexagon confirms that the estimated speedup results in actual speedup. Given a fixed time limit, Unison solves optimally functions of up to 946 instructions, nearly an order of magnitude larger than previous approaches. The results show that our combinatorial approach can be applied in practice to trade compilation time for code quality beyond the usual compiler optimization levels, identify improvement opportunities in heuristic algorithms, and fully exploit processor-specific features.

Place, publisher, year, edition, pages
Association for Computing Machinery, 2019
Keywords
Combinatorial optimization, Instruction scheduling, Register allocation
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-39654 (URN)10.1145/3332373 (DOI)2-s2.0-85068443848 (Scopus ID)
Note

 Funding details: Vetenskapsrådet, VR, 621-2011-6229; Funding text 1: This article is partially based on preliminary work presented at the Principles and Practice of Constraint Programming (2012) [20]; Languages, Compilers, and Tools for Embedded Systems (2014) [21]; and Compiler Construction (2016) [22]conferences. Compared to the preliminary work, this article is completely restructured and rewritten, completes the combinatorial model with rematerialization, proposes extensions to capture additional program transformations and processor-specific features, and contributes a more exhaustive evaluation. Additions to the evaluation include more benchmarks and processors, evidence of the fundamental benefit of the integrated approach, an in-depth study of scalability, and actual execution measurements. This work has been partially funded by Ericsson AB and the Swedish Research Council (VR) under grant 621-2011-6229. Authors’ addresses: R. C. Lozano, RISE SICS, Electrum 229, Kista, 164 40, Sweden, KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science, Electrum 229, Kista, 164 40, Sweden; email: roberto.castaneda@ri.se; M. Carlsson, RISE SICS, Electrum 229, Kista, 164 40, Sweden; email: mats.carlsson@ri.se; G. H. Blindell, KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science, Electrum 229, Kista, 164 40, Sweden; email: ghb@kth.se; C. Schulte, KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science, Electrum 229, Kista, 164 40, Sweden, RISE SICS, Electrum 229, Kista, 164 40, Sweden; email: cschulte@kth.se. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. © 2019 Association for Computing Machinery. 0164-0925/2019/07-ART17 $15.00 https://doi.org/10.1145/3332373

Available from: 2019-08-07 Created: 2019-08-07 Last updated: 2020-01-24Bibliographically approved
Castaneda Lozano, R., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2016). Register allocation and instruction scheduling in Unison (6ed.). In: Proceedings of the 25th International Conference on Compiler Construction: . Paper presented at 25th International Conference on Compiler Construction (CC 2016), March 17-18, 2016, Barcelona, Spain (pp. 263-264).
Open this publication in new window or tab >>Register allocation and instruction scheduling in Unison
2016 (English)In: Proceedings of the 25th International Conference on Compiler Construction, 2016, 6, p. 263-264Conference paper, Published paper (Refereed)
Abstract [en]

This paper describes Unison, a simple, flexible, and potentially optimal software tool that performs register allocation and instruction scheduling in integration using combinatorial optimization. The tool can be used as an alternative or as a complement to traditional approaches, which are fast but complex and suboptimal. Unison is most suitable whenever high-quality code is required and longer compilation times can be tolerated (such as in embedded systems or library releases), or the targeted processors are so irregular that traditional compilers fail to generate satisfactory code.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24536 (URN)10.1145/2892208.2892237 (DOI)2-s2.0-84966560429 (Scopus ID)978-1-4503-4241-4 (ISBN)
Conference
25th International Conference on Compiler Construction (CC 2016), March 17-18, 2016, Barcelona, Spain
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-02-07Bibliographically approved
Hjort Blindell, G., Castaneda Lozano, R., Carlsson, M. & Schulte, C. (2015). Modeling Universal Instruction Selection (7ed.). In: Principles and Practice of Constraint Programming: . Paper presented at 21st International Conference on the Principles and Practice of Constraint Programming (CP 2015), August 31 - September 4, 2015, Cork, Ireland (pp. 609-626). , 9255
Open this publication in new window or tab >>Modeling Universal Instruction Selection
2015 (English)In: Principles and Practice of Constraint Programming, 2015, 7, Vol. 9255, p. 609-626Conference paper, Published paper (Refereed)
Abstract [en]

Instruction selection implements a program under compilation by selecting processor instructions and has tremendous impact on the performance of the code generated by a compiler. This paper introduces a graph-based universal representation that unifiees data and control flow for both programs and processor instructions. The representation is the essential prerequisite for a constraint model for instruction selection introduced in this paper. The model is demonstrated to be expressive in that it supports many processor features that are out of reach of state-of-the-art approaches, such as advanced branching instructions, multiple register banks, and SIMD instructions. The resulting model can be solved for small to medium size input programs and sophisticated processor instructions and is competitive with LLVM in code quality. Model and representation are significant due to their expressiveness and their potential to be combined with models for other code generation tasks.

Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 9255
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24436 (URN)10.1007/978-3-319-23219-5_42 (DOI)2-s2.0-84944558675 (Scopus ID)978-3-319-23218-8 (ISBN)978-3-319-23219-5 (ISBN)
Conference
21st International Conference on the Principles and Practice of Constraint Programming (CP 2015), August 31 - September 4, 2015, Cork, Ireland
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-31Bibliographically approved
Castaneda Lozano, R., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2014). Combinatorial Spill Code Optimization and Ultimate Coalescing (5ed.). In: : . Paper presented at Fourteenth ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (pp. 23-32).
Open this publication in new window or tab >>Combinatorial Spill Code Optimization and Ultimate Coalescing
2014 (English)Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents a novel combinatorial model that integrates global register allocation based on ultimate coalescing, spill code optimization, register packing, and multiple register banks with instruction scheduling (including VLIW). The model exploits alternative temporaries that hold the same value as a new concept for ultimate coalescing and spill code optimization. The paper presents Unison as a code generator based on the model and advanced solving techniques using constraint programming. Thorough experiments using MediaBench and a processor (Hexagon) that are typical for embedded systems demonstrate that Unison: is robust and scalable; generates faster code than LLVM (up to 41% with a mean improvement of 7%); possibly generates optimal code (for 29% of the experiments); effortlessly supports different optimization criteria (code size on par with LLVM). Unison is significant as it addresses the same aspects as traditional code generation algorithms, yet is based on a simple integrated model and robustly can generate optimal code.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24161 (URN)10.1145/2597809.2597815 (DOI)2-s2.0-84907032397 (Scopus ID)
Conference
Fourteenth ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-24Bibliographically approved
Castaneda Lozano, R. (2014). Integrated Register Allocation and Instruction Scheduling with Constraint Programming (5ed.). (Licentiate dissertation). Stockholm, Sweden: KTH Royal Institute of Technology
Open this publication in new window or tab >>Integrated Register Allocation and Instruction Scheduling with Constraint Programming
2014 (English)Licentiate thesis, monograph (Other academic)
Abstract [en]

This dissertation proposes a combinatorial model, program representations, and constraint solving techniques for integrated register allocation and instruction scheduling in compiler back-ends. In contrast to traditional compilers based on heuristics, the proposed approach generates potentially optimal code by considering all trade-offs between interdependent decisions as a single optimization problem. The combinatorial model is the first to handle a wide array of global register allocation subtasks, including spill code optimization, ultimate coalescing, register packing, and register bank assignment, as well as instruction scheduling for Very Long Instruction Word (VLIW) processors. The model is based on three novel, complementary program representations: Linear Static Single Assignment for global register allocation; copy extension for spilling, basic coalescing, and register bank assignment; and alternative temporaries for spill code optimization and ultimate coalescing. Solving techniques are proposed that exploit the program representation properties for scalability. The model, program representations, and solving techniques are implemented in Unison, a code generator that delivers potentially optimal code while scaling to medium-size functions. Thorough experiments show that Unison: generates faster code (up to 41% with a mean improvement of 7%) than LLVM (a state-of-the-art compiler) for Hexagon (a challenging VLIW processor), generates code that is competitive with LLVM for MIPS32 (a simpler RISC processor), is robust across different benchmarks such as MediaBench and SPECint 2006, scales up to medium-size functions of up to 1000 instructions, and adapts easily to different optimization criteria. The contributions of this dissertation are significant. They lead to a combinatorial approach for integrated register allocation and instruction scheduling that is, for the first time, practical (it robustly scales to medium-size functions) and effective (it yields better code than traditional heuristic approaches).

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2014 Edition: 5
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23941 (URN)
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-24Bibliographically approved
Mallach, S. & Castaneda Lozano, R. (2014). Optimal General Offset Assignment (6ed.). In: : . Paper presented at 17th International Workshop on Software and Compilers for Embedded Systems, SCOPES 2014; Schloss RheinfelsSt. Goar; Germany; 10 June 2014 through 11 June 2014 (pp. 50-59).
Open this publication in new window or tab >>Optimal General Offset Assignment
2014 (English)Conference paper, Published paper (Refereed)
Abstract [en]

We present an exact approach to the General Offset Assignment problem arising in the domain of address code generation for application specific and digital signal processors. General Offset Assignment is composed of two subproblems, namely to find a permutation of variables in memory and to select a responsible address register for each access to one of these variables. Our method is a combination of established techniques to solve both subproblems using integer linear programming. To the best of our knowledge, it is the first approach capable of solving almost all instances of the established OffsetStone benchmark set to global optimality within reasonable time. We provide a first comprehensive evaluation of the quality of several state-of-the-art heuristics relative to the optimal solutions.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24304 (URN)10.1145/2609248.2609251 (DOI)2-s2.0-84908895295 (Scopus ID)
Conference
17th International Workshop on Software and Compilers for Embedded Systems, SCOPES 2014; Schloss RheinfelsSt. Goar; Germany; 10 June 2014 through 11 June 2014
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-24Bibliographically approved
Castaneda Lozano, R. & Schulte, C. (2014). Survey on Combinatorial Register Allocation and Instruction Scheduling (4ed.). arXiv:1409.7628 [cs.PL]
Open this publication in new window or tab >>Survey on Combinatorial Register Allocation and Instruction Scheduling
2014 (English)In: arXiv:1409.7628 [cs.PL]Article in journal (Refereed) Published
Abstract [en]

Register allocation and instruction scheduling are two central compiler back-end problems that are critical for quality. In the last two decades, combinatorial optimization has emerged as an alternative approach to traditional, heuristic algorithms for these problems. Combinatorial approaches are generally slower but more flexible than their heuristic counterparts and have the potential to generate optimal code. This paper surveys existing literature on combinatorial register allocation and instruction scheduling. The survey covers approaches that solve each problem in isolation as well as approaches that integrate both problems. The latter have the potential to generate code that is globally optimal by capturing the trade-off between conflicting register allocation and instruction scheduling decisions.

Place, publisher, year, edition, pages
arXiv, 2014 Edition: 4
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24383 (URN)10.1145/3200920 (DOI)2-s2.0-85068049964 (Scopus ID)
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-24Bibliographically approved
Castaneda Lozano, R., Hjort Blindell, G., Carlsson, M., Drejhammar, F. & Schulte, C. (2013). Constraint-based Code Generation (6ed.). In: : . Paper presented at Sixteenth International Workshop on Software and Compilers for Embedded Systems (pp. 93-95).
Open this publication in new window or tab >>Constraint-based Code Generation
Show others...
2013 (English)Conference paper, Published paper (Refereed)
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24216 (URN)
Conference
Sixteenth International Workshop on Software and Compilers for Embedded Systems
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-24Bibliographically approved
Castaneda Lozano, R., Carlsson, M., Drejhammar, F. & Schulte, C. (2012). Constraint-based Register Allocation and Instruction Scheduling (6ed.). In: : . Paper presented at Eighteenth International Conference on Principles and Practice of Constraint Programming (pp. 750-766).
Open this publication in new window or tab >>Constraint-based Register Allocation and Instruction Scheduling
2012 (English)Conference paper, Published paper (Refereed)
Abstract [en]

This paper introduces a constraint model and solving techniques for code generation in a compiler back-end. It contributes a new model for global register allocation that combines several advanced aspects: multiple register banks (subsuming spilling to memory), coalescing, and packing. The model is extended to include instruction scheduling and bundling. The paper introduces a decomposition scheme exploiting the underlying program structure and exhibiting robust behavior for functions with thousands of instructions. Evaluation shows that code quality is on par with LLVM, a state-of-the-art compiler infrastructure. The paper makes important contributions to the applicability of constraint programming as well as compiler construction: essential concepts are unified in a high-level model that can be solved by readily available modern solvers. This is a significant step towards basing code generation entirely on a high-level model and by this facilitates the construction of correct, simple, flexible, robust, and high-quality code generators.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24215 (URN)
Conference
Eighteenth International Conference on Principles and Practice of Constraint Programming
Projects
Unison
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-24Bibliographically approved
Castaneda Lozano, R., Schulte, C. & Wahlberg, L. (2010). Testing Continuous Double Auctions with a Constraint-based Oracle (10ed.). In: : . Paper presented at Sixteenth International Conference on Principles and Practice of Constraint Programming (pp. 613-627). St Andrews, UK: Springer-Verlag, 6308
Open this publication in new window or tab >>Testing Continuous Double Auctions with a Constraint-based Oracle
2010 (English)Conference paper, Published paper (Refereed)
Abstract [en]

Computer trading systems are essential for today's financial markets where the trading systems' correctness is of paramount economical significance. Automated random testing is a useful technique to find bugs in these systems, but it requires an independent system to decide the correctness of the system under test (known as oracle problem). This paper introduces a constraint-based oracle for random testing of a real-world trading system. The oracle provides the expected results by generating and solving constraint models of the trading system's continuous double auction. Constraint programming is essential for the correctness of the test oracle as the logic for calculating trades can be mapped directly to constraint models. The paper shows that the generated constraint models can be solved efficiently. Most importantly, the approach is shown to be successful by finding errors in a deployed financial trading system and in its specification.

Place, publisher, year, edition, pages
St Andrews, UK: Springer-Verlag, 2010 Edition: 10
Series
Lecture Notes in Computer Science
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23761 (URN)
Conference
Sixteenth International Conference on Principles and Practice of Constraint Programming
Available from: 2016-10-31 Created: 2016-10-31 Last updated: 2020-01-24Bibliographically approved
Organisations
Identifiers
ORCID iD: ORCID iD iconorcid.org/0000-0002-2806-7333

Search in DiVA

Show all publications
v. 2.35.10