
LLVM Cookbook
Description
Alles über E-Books | Antworten auf Fragen rund um E-Books, Kopierschutz und Dateiformate finden Sie in unserem Info- & Hilfebereich.
Book DescriptionThe book is for compiler programmers who are familiar with concepts of compilers and want to indulge in understanding, exploring, and using LLVM infrastructure in a meaningful way in their work. This book is also for programmers who are not directly involved in compiler projects but are often involved in development phases where they write thousands of lines of code. With knowledge of how compilers work, they will be able to code in an optimal way and improve performance with clean code.What you will learn
Introduction to LLVM modular design and LLVM tools Write a frontend for a language Add JIT support and use frontends for different languages Learn about the LLVM Pass infrastructure and the LLVM Pass Manager Create analyses and transform optimization passes Build a LLVM TOY backend from scratch Optimize the code at SelectionDAG level and allocate registers to variables
Who this book is for
All prices
More details
Other editions
Additional editions

Persons
Content
Getting started with llvm
Steps in Writing a Frontend
Adding JIT Support and Writing Front-Ends for languages
Preparing Optimization
Implementing optimization
Target Independent Code Generator
Optimizing generated code and Register Allocation
Writing an LLVM Backend
Using LLVM for various useful projects
Vectorizing IR
Vectorization is an important optimization for compilers where we can vectorize code to execute an instruction on multiple datasets in one go. If the backend architecture supports vector registers, a broad range of data can be loaded into those vector registers, and special vector instructions can be executed on the registers.
There are two types of vectorization in LLVM-Superword-Level Parallelism (SLP) and loop vectorization. Loop vectorization deals with vectorization opportunities in a loop, while SLP vectorization deals with vectorizing straight-line code in a basic block. In this recipe, we will see how straight-line code is vectorized.
Getting ready
SLP vectorization constructs a bottom-up tree of the IR expression, and broadly compares the nodes of the tree to see whether they are similar and hence can be combined to form vectors. The file to be modified is lib/Transform/Vectorize/SLPVectorizer.cpp.
We will try to vectorize a piece of straight-line code, such as return a[0] + a[1] + a[2] + a[3].
The expression tree for the preceding type of code will be a somewhat one-sided tree. We will run a DFS to store the operands and the operators.
The IR for the preceding kind of expression will look like this:
define i32 @hadd(i32* %a) { entry: %0 = load i32* %a, align 4 %arrayidx1 = getelementptr inbounds i32* %a, i32 1 %1 = load i32* %arrayidx1, align 4 %add = add nsw i32 %0, %1 %arrayidx2 = getelementptr inbounds i32* %a, i32 2 %2 = load i32* %arrayidx2, align 4 %add3 = add nsw i32 %add, %2 %arrayidx4 = getelementptr inbounds i32* %a, i32 3 %3 = load i32* %arrayidx4, align 4 %add5 = add nsw i32 %add3, %3 ret i32 %add5 }The vectorization model follows three steps:
- Checking whether it's legal to vectorize.
- Calculating the profitability of the vectorized code over the scalarized code.
- Vectorizing the code if these two conditions are satisfied.
How to do it...
- Open the
SLPVectorizer.cppfile. A new function needs to be implemented for DFS traversal of the expression tree for the IR shown in the Getting ready section: bool matchFlatReduction(PHINode *Phi, BinaryOperator *B, const DataLayout *DL) { if (!B) return false; if (B->getType()->isVectorTy() || !B->getType()->isIntegerTy()) return false; ReductionOpcode = B->getOpcode(); ReducedValueOpcode = 0; ReduxWidth = MinVecRegSize / DL->getTypeAllocSizeInBits(B->getType()); ReductionRoot = B; ReductionPHI = Phi; if (ReduxWidth < 4) return false; if (ReductionOpcode != Instruction::Add) return false; SmallVector<BinaryOperator *, 32> Stack; ReductionOps.push_back(B); ReductionOpcode = B->getOpcode(); Stack.push_back(B); // Traversal of the tree. while (!Stack.empty()) { BinaryOperator *Bin = Stack.back(); if (Bin->getParent() != B->getParent()) return false; Value *Op0 = Bin->getOperand(0); Value *Op1 = Bin->getOperand(1); if (!Op0->hasOneUse() || !Op1->hasOneUse()) return false; BinaryOperator *Op0Bin = dyn_cast<BinaryOperator>(Op0); BinaryOperator *Op1Bin = dyn_cast<BinaryOperator>(Op1); Stack.pop_back(); // Do not handle case where both the operands are binary //operators if (Op0Bin && Op1Bin) return false; // Both the operands are not binary operator. if (!Op0Bin && !Op1Bin) { ReducedVals.push_back(Op1); ReducedVals.push_back(Op0); ReductionOps.push_back(Bin); continue; } // One of the Operand is binary operand, push that into stack // for further processing. Push the other non-binary operand //into ReducedVals. if (Op0Bin) { if (Op0Bin->getOpcode() != ReductionOpcode) return false; Stack.push_back(Op0Bin); ReducedVals.push_back(Op1); ReductionOps.push_back(Op0Bin); } if (Op1Bin) { if (Op1Bin->getOpcode() != ReductionOpcode) return false; Stack.push_back(Op1Bin); ReducedVals.push_back(Op0); ReductionOps.push_back(Op1Bin); } } SmallVector<Value *, 16> Temp; // Reverse the loads from a[3], a[2], a[1], a[0] // to a[0], a[1], a[2], a[3] for checking incremental // consecutiveness further ahead. while (!ReducedVals.empty()) Temp.push_back(ReducedVals.pop_back_val()); ReducedVals.clear(); for (unsigned i = 0, e = Temp.size(); i < e; ++i) ReducedVals.push_back(Temp[i]); return true; } - Calculate the cost of the resultant vectorized IR and conclude whether it is profitable to vectorize. In the
SLPVectorizer.cppfile, add the following lines to thegetReductionCost()function: int HAddCost = INT_MAX; // If horizontal addition pattern is identified, calculate cost. // Such horizontal additions can be modeled into combination of // shuffle sub-vectors and vector adds and one single extract element // from last resultant vector. // e.g. a[0]+a[1]+a[2]+a[3] can be modeled as // %1 = load <4 x> %0 // %2 = shuffle %1 <2, 3, undef, undef> // %3 = add <4 x> %1, %2 // %4 = shuffle %3 <1, undef, undef, undef> // %5 = add <4 x> %3, %4 // %6 = extractelement %5 <0> if (IsHAdd) { unsigned VecElem = VecTy->getVectorNumElements(); unsigned NumRedxLevel = Log2_32(VecElem); HAddCost = NumRedxLevel * (TTI->getArithmeticInstrCost(ReductionOpcode, VecTy) + TTI->getShuffleCost(TargetTransformInfo::SK_ExtractSubvector, VecTy, VecElem / 2, VecTy)) + TTI->getVectorInstrCost(Instruction::ExtractElement, VecTy, 0); } - In the same function, after calculating
PairwiseRdxCostandSplittingRdxCost, compare them withHAddCost: VecReduxCost = HAddCost < VecReduxCost ? HAddCost : VecReduxCost; - In the
vectorizeChainsInBlock()function, call thematchFlatReduction()function you just defined: // Try to vectorize horizontal reductions feeding into a return. if (ReturnInst *RI = dyn_cast<ReturnInst>(it)) if (RI->getNumOperands() != 0) if (BinaryOperator *BinOp = dyn_cast<BinaryOperator>(RI->getOperand(0))) { DEBUG(dbgs() << "SLP: Found a return to vectorize.\n"); HorizontalReduction HorRdx; IsReturn = true; if ((HorRdx.matchFlatReduction(nullptr, BinOp, DL) && HorRdx.tryToReduce(R, TTI)) || tryToVectorizePair(BinOp->getOperand(0), BinOp->getOperand(1), R)) { Changed = true; it = BB->begin(); e = BB->end(); continue; } } - Define two global flags to keep a track of horizontal reduction, which feeds into a return: static bool IsReturn = false; static bool IsHAdd = false;
- Allow the vectorization of small trees if they feed into a return. Add the following line to the
isFullyVectorizableTinyTree()function: if (VectorizableTree.size() == 1 && IsReturn && IsHAdd)return true;
How it works.
Compile the LLVM project after saving the file containing the preceding code, and run the opt tool on the example IR, as follows:
- Open the
example.llfile and paste the following IR in it: define i32 @hadd(i32* %a) { entry: %0 = load i32* %a, align 4 %arrayidx1 = getelementptr inbounds i32* %a, i32 1 %1 = load i32* %arrayidx1, align 4 %add = add nsw i32 %0, %1 %arrayidx2 = getelementptr inbounds i32* %a, i32 2 %2 = load i32* %arrayidx2, align 4 %add3 = add nsw i32 %add, %2 %arrayidx4 = getelementptr inbounds i32* %a, i32 3 %3 = load i32* %arrayidx4, align 4 %add5 = add nsw i32 %add3, %3 ret i32 %add5 } - Run the opt tool on
example.ll: $ opt -basicaa -slp-vectorizer -mtriple=aarch64-unknown-linux-gnu -mcpu=cortex-a57The output will be vectorized code, like the following:
define i32 @hadd(i32* %a) { entry: %0 = bitcast i32* %a to <4 x i32>* %1 = load <4 x i32>* %0, align 4 %rdx.shuf = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef> %bin.rdx = add <4 x i32> %1, %rdx.shuf %rdx.shuf1 = shufflevector <4 x i32> %bin.rdx, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef> %bin.rdx2 = add <4 x i32> %bin.rdx, %rdx.shuf1 %2 = extractelement <4 x i32> %bin.rdx2, i32 0 ret i32 %2 }
As observed, the code gets vectorized. The matchFlatReduction() function performs a DFS traversal of the expression and stores all the...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books – i.e., „flowing” text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a „hard” copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.
File format: PDF
Copy-Protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (only limited: Kindle).
The file format PDF always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). Unfortunately, on the small screens of e-readers or smartphones, PDFs are rather annoying, requiring too much scrolling.
This eBook uses Adobe-DRM, a „hard” copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our eBook Help page.