

Details

Foreword; Acknowledgement; Abstract; Kurzfassung; Contents; 1 Introduction; 1.1 Contributions; 1.2 Publications; 2 Foundations & Terminology; 2.1 Basic Block; 2.2 Control Flow Graph (CFG); 2.3 Dominance and Postdominance; 2.4 Loops; 2.5 Static Single Assignment (SSA) Form; 2.5.1 LCSSA Form (LCSSA); 2.6 Control Dependence; 2.7 Live Values; 2.8 Register Pressure; 2.9 LLVM; 2.9.1 Intermediate Representation (IR); 2.9.2 Data Types; 2.9.3 Important Instructions; 2.10 Single Instruction, Multiple Data (SIMD); 3 Overview; 3.1 Whole-Function Vectorization (WFV); 3.2 Algorithmic Challenges

3.3 Performance Issues of Vectorization; 4 Related Work; 4.1 Classic Loop Vectorization; 4.2 Superword Level Parallelism (SLP); 4.3 Outer Loop Vectorization (OLV); 4.4 Auto-Vectorizing Languages; 4.4.1 OpenCL and CUDA; 4.5 SIMD Property Analyses; 4.6 Dynamic Variants; 4.7 Summary; 5 SIMD Property Analyses; 5.1 Program Representation; 5.2 SIMD Properties; 5.2.1 Uniform & Varying Values; 5.2.2 Consecutivity & Alignment; 5.2.3 Sequential & Non-Vectorizable Operations; 5.2.4 All-Instances-Active Operations; 5.2.5 Divergent Loops; 5.2.6 Divergence-Causing Blocks & Rewire Targets

5.3 Analysis Framework; 5.4 Operational Semantics; 5.4.1 Lifting to Vector Semantics; 5.5 Collecting Semantics; 5.6 Vectorization Analysis; 5.6.1 Tracked Information; 5.6.2 Initial State; 5.6.3 Instance Identifier; 5.6.4 Constants; 5.6.5 Phi Functions; 5.6.6 Memory Operations; 5.6.7 Calls; 5.6.8 Cast Operations; 5.6.9 Arithmetic and Other Instructions; 5.6.10 Branch Operation; 5.6.11 Update Function for All-Active Program Points; 5.6.12 Update Function for Divergent Loops; 5.7 Soundness; 5.7.1 Local Consistency; 5.8 Improving Precision with an SMT Solver

5.8.1 Expression Trees of Address Computations; 5.8.2 Translation to Presburger Arithmetic; 5.8.3 From SMT Solving Results to Code; 5.9 Rewire Target Analysis; 5.9.1 Running Example; 5.9.2 Loop Criteria; 5.9.3 Formal Definition; 5.9.4 Application in Partial CFG Linearization; 6 Whole-Function Vectorization; 6.1 Mask Generation; 6.1.1 Loop Masks; 6.1.2 Running Example; 6.1.3 Alternative for Exits Leaving Multiple Loops; 6.2 Select Generation; 6.2.1 Loop Blending; 6.2.2 Blending of Optional Loop Exit Results; 6.2.3 Running Example; 6.3 Partial CFG Linearization; 6.3.1 Running Example

6.3.2 Clusters of Divergence-Causing Blocks; 6.3.3 Rewire Target Block Scheduling; 6.3.4 Computation of New Outgoing Edges; 6.3.5 Linearization; 6.3.6 Repairing SSA Form; 6.3.7 Branch Fusion; 6.4 Instruction Vectorization; 6.4.1 Broadcasting of Uniform Values; 6.4.2 Consecutive Value Optimization; 6.4.3 Merging of Sequential Results; 6.4.4 Duplication of Non-Vectorizable Operations; 6.4.5 Pumped Vectorization; 6.5 Extension for Irreducible Control Flow; 7 Dynamic Code Variants; 7.1 Uniform Values and Control Flow; 7.2 Consecutive Memory Access Operations; 7.3 Switching to Scalar Code
