Description |
1 online resource (xvi, 187 pages) : illustrations (some color) |
Contents |
Foreword; Acknowledgement; Abstract; Kurzfassung; Contents; 1 Introduction; 1.1 Contributions; 1.2 Publications; 2 Foundations & Terminology; 2.1 Basic Block; 2.2 Control Flow Graph (CFG); 2.3 Dominance and Postdominance; 2.4 Loops; 2.5 Static Single Assignment (SSA) Form; 2.5.1 LCSSA Form (LCSSA); 2.6 Control Dependence; 2.7 Live Values; 2.8 Register Pressure; 2.9 LLVM; 2.9.1 Intermediate Representation (IR); 2.9.2 Data Types; 2.9.3 Important Instructions; 2.10 Single Instruction, Multiple Data (SIMD); 3 Overview; 3.1 Whole-Function Vectorization (WFV); 3.2 Algorithmic Challenges |
|
3.3 Performance Issues of Vectorization; 4 Related Work; 4.1 Classic Loop Vectorization; 4.2 Superword Level Parallelism (SLP); 4.3 Outer Loop Vectorization (OLV); 4.4 Auto-Vectorizing Languages; 4.4.1 OpenCL and CUDA; 4.5 SIMD Property Analyses; 4.6 Dynamic Variants; 4.7 Summary; 5 SIMD Property Analyses; 5.1 Program Representation; 5.2 SIMD Properties; 5.2.1 Uniform & Varying Values; 5.2.2 Consecutivity & Alignment; 5.2.3 Sequential & Non-Vectorizable Operations; 5.2.4 All-Instances-Active Operations; 5.2.5 Divergent Loops; 5.2.6 Divergence-Causing Blocks & Rewire Targets |
|
5.3 Analysis Framework; 5.4 Operational Semantics; 5.4.1 Lifting to Vector Semantics; 5.5 Collecting Semantics; 5.6 Vectorization Analysis; 5.6.1 Tracked Information; 5.6.2 Initial State; 5.6.3 Instance Identifier; 5.6.4 Constants; 5.6.5 Phi Functions; 5.6.6 Memory Operations; 5.6.7 Calls; 5.6.8 Cast Operations; 5.6.9 Arithmetic and Other Instructions; 5.6.10 Branch Operation; 5.6.11 Update Function for All-Active Program Points; 5.6.12 Update Function for Divergent Loops; 5.7 Soundness; 5.7.1 Local Consistency; 5.8 Improving Precision with an SMT Solver |
|
5.8.1 Expression Trees of Address Computations; 5.8.2 Translation to Presburger Arithmetic; 5.8.3 From SMT Solving Results to Code; 5.9 Rewire Target Analysis; 5.9.1 Running Example; 5.9.2 Loop Criteria; 5.9.3 Formal Definition; 5.9.4 Application in Partial CFG Linearization; 6 Whole-Function Vectorization; 6.1 Mask Generation; 6.1.1 Loop Masks; 6.1.2 Running Example; 6.1.3 Alternative for Exits Leaving Multiple Loops; 6.2 Select Generation; 6.2.1 Loop Blending; 6.2.2 Blending of Optional Loop Exit Results; 6.2.3 Running Example; 6.3 Partial CFG Linearization; 6.3.1 Running Example |
|
6.3.2 Clusters of Divergence-Causing Blocks; 6.3.3 Rewire Target Block Scheduling; 6.3.4 Computation of New Outgoing Edges; 6.3.5 Linearization; 6.3.6 Repairing SSA Form; 6.3.7 Branch Fusion; 6.4 Instruction Vectorization; 6.4.1 Broadcasting of Uniform Values; 6.4.2 Consecutive Value Optimization; 6.4.3 Merging of Sequential Results; 6.4.4 Duplication of Non-Vectorizable Operations; 6.4.5 Pumped Vectorization; 6.5 Extension for Irreducible Control Flow; 7 Dynamic Code Variants; 7.1 Uniform Values and Control Flow; 7.2 Consecutive Memory Access Operations; 7.3 Switching to Scalar Code |
Summary |
Ralf Karrenberg presents Whole-Function Vectorization (WFV), an approach that allows a compiler to automatically create code that exploits data-parallelism using SIMD instructions. Data-parallel applications such as particle simulations, stock option price estimation or video decoding require the same computations to be performed on huge amounts of data. Without WFV, one processor core executes a single instance of a data-parallel function. WFV transforms the function to execute multiple instances at once using SIMD instructions. The author describes an advanced WFV algorithm that includes a v |
Analysis |
computerwetenschappen |
|
computer sciences |
|
engineering |
|
toegepaste wiskunde |
|
applied mathematics |
|
computational science |
|
programmeertalen |
|
programming languages |
|
computergrafie |
|
computer graphics |
|
Information and Communication Technology (General) |
|
Informatie- en communicatietechnologie (algemeen) |
Notes |
Preface in English and German |
|
Online resource; title from PDF title page (SpringerLink, viewed June 17, 2015) |
Subject |
Vector processing (Computer science)
|
|
Compilers (Computer programs)
|
|
Parallel processing (Electronic computers)
|
|
COMPUTERS -- Computer Literacy.
|
|
COMPUTERS -- Computer Science.
|
|
COMPUTERS -- Data Processing.
|
|
COMPUTERS -- Hardware -- General.
|
|
COMPUTERS -- Information Technology.
|
|
COMPUTERS -- Machine Theory.
|
|
COMPUTERS -- Reference.
|
Form |
Electronic book
|
ISBN |
9783658101138 |
|
365810113X |
|
3658101121 |
|
9783658101121 |