1
Building LLVM and Understanding the Directory Structure
The LLVM infrastructure provides a set of libraries that can be assembled to create different tools and compilers.
LLVM originally stood for Low-Level Virtual Machine. Nowadays, it is much more than that, as you will shortly learn, and people just use LLVM as a name.
Given the sheer volume of code that makes up the LLVM repository, it can be daunting to even know where to start.
In this chapter, we will give you the keys to approach and use this code base confidently. Using this knowledge, you will be able to do the following:
- Understand the different components that make up a compiler
- Build and test the LLVM project
- Navigate LLVM's directory structure and locate the implementation of different components
- Contribute to the LLVM project
This chapter covers the basics needed to get started with LLVM. If you are already familiar with the LLVM infrastructure or have followed the tutorial from the official LLVM website (https://llvm.org/docs/GettingStarted.html), you can skip it. You can, however, check the Quiz time section at the end of the chapter to see whether there is anything you may have missed.
Technical requirements
To work with the LLVM code base, you need specific tools on your system. In this section, we list the required versions of these tools for the latest major LLVM release: 20.1.0.
Later, in Identifying the right version of the tools, you will learn how to find the version of the tools required to build a specific version of LLVM, including older and newer releases and the LLVM top-of-tree (that is, the actively developed repository). Additionally, you will learn how to install them.
Without further ado, here are the versions of the tools required for LLVM 20.1.0:
| Tool | Required version |
| --- | --- |
| Git | None specified |
| C/C++ toolchain | Clang >= 5.0; Apple Clang >= 10.0; GCC >= 7.4; Visual Studio 2019 >= 16.8 |
| CMake | >= 3.20.0 |
| Ninja | None specified |
| Python | >= 3.8 |
Table 1.1: Tools required for LLVM 20.1.0
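As a rough illustration of how these minimums could be checked programmatically, the following Python sketch compares version numbers numerically rather than alphabetically (so that 3.9 is correctly treated as older than 3.20). The helper names (`parse_version`, `meets_minimum`, `version_output`, `report`) and the executable names in `MINIMUMS` are assumptions made for this example; they are not part of LLVM. The sketch also assumes that each tool prints its version when invoked with `--version`, and it only covers the Clang row of the C/C++ toolchain.

```python
import re
import shutil
import subprocess

# Minimum versions taken from Table 1.1 (LLVM 20.1.0).
# None means "none specified". The executable names are assumptions;
# Apple Clang, GCC, and Visual Studio have their own minimums (see the table).
MINIMUMS = {
    "git": None,
    "clang": "5.0",
    "cmake": "3.20.0",
    "ninja": None,
    "python3": "3.8",
}


def parse_version(text):
    """Return the first dotted number found in text as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found_text, required):
    """Check that the version found in found_text is at least `required`."""
    if required is None:
        return True  # No minimum specified for this tool.
    found = parse_version(found_text)
    if found is None:
        return False
    minimum = parse_version(required)
    # Pad the shorter tuple with zeros so that 3.8 compares equal to 3.8.0.
    width = max(len(found), len(minimum))
    found += (0,) * (width - len(found))
    minimum += (0,) * (width - len(minimum))
    return found >= minimum


def version_output(tool):
    """Return the text printed by `tool --version`, or None if tool is missing."""
    path = shutil.which(tool)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout + result.stderr


def report():
    """Print one status line per tool listed in MINIMUMS."""
    for tool, required in MINIMUMS.items():
        output = version_output(tool)
        if output is None:
            print(f"{tool}: not found")
        elif meets_minimum(output, required):
            print(f"{tool}: ok")
        else:
            print(f"{tool}: older than {required}")
```

Calling `report()` prints one status line per tool. Treat the result as a quick sanity check only; the requirements listed in the LLVM documentation remain the reference.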
Furthermore, this book comes with scripts, examples, and more that will ease your journey with learning the LLVM infrastructure. We will specifically list the relevant content in the related sections, but remember that the repository lives at https://github.com/PacktPublishing/LLVM-Code-Generation.
Getting ready for LLVM's world
In the Technical requirements section, we listed which versions of the tools you need to work with LLVM 20.1.0. However, LLVM is a lively project, and what is required today may be different from what is required tomorrow. Also, to step back a bit, you may not know why you need these tools to begin with or how to get them.
This section addresses these questions, and you will learn the following in the process:
- The purpose of each required tool
- How to check that your environment has the proper tools
- How to install the proper tools
Depending on how familiar you are with development on Linux/macOS, this setup can be tedious or a walk in the park.
Ultimately, this section aims to teach you how to go beyond a fixed release of LLVM by giving you the knowledge required to find the information you need.
If you are familiar with package managers (e.g., the apt-get command-line tool on Linux and Homebrew (https://brew.sh) on macOS), you can skip this part and install Git, Clang, CMake, Ninja, and Python directly through them. On Windows, if you do not have a package manager, the steps provided here are all manual: if you pick the Windows binary distribution of each tool, it should just work. Alternatively, on Windows, you may be better off installing these tools through the extensions of Visual Studio Code (VS Code) (https://code.visualstudio.com).
In any case, you might want to double-check which version of these tools you need by going through the Identifying the right version of the tools section.
Prerequisites
As mentioned previously, you need a set of specific tools to build the LLVM code base. This section summarizes what each of these tools does and how they work together to build the LLVM project.
The list of tools is as follows:
- Git: The software used for the version control of LLVM
- A C/C++ toolchain: The LLVM code base is written in C/C++, and as such, we will need a toolchain to build that type of code
- CMake: The software used to configure the build system
- Ninja: The software used to drive the build system
- Python: The scripting language and execution environment used for testing
Figure 1.1 illustrates how the different tools work together to build an LLVM compiler:
Figure 1.1: The essential command-line tools to build an LLVM compiler
Breaking this figure down, here are the steps it depicts:
- Git retrieves the source code.
- CMake generates the build system for a particular driver, such as Ninja, and a particular C/C++ toolchain.
- Ninja drives the build process.
- The C/C++ toolchain builds the compiler.
- Python drives the execution of the tests.
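To make these steps more concrete, here is a minimal Python sketch that lists one plausible command per step and prints them without running anything. The directory names (`llvm-project`, `build`), the `Release` build type, and the helper name `build_plan` are assumptions chosen for this illustration; the exact options you need depend on your setup. Note that the C/C++ toolchain is not invoked directly: Ninja calls it. Similarly, the Python-based test runner is started through the `check-llvm` build target.

```python
import shlex


def build_plan(source_dir="llvm-project", build_dir="build"):
    """Return (tool, command) pairs mirroring the steps of Figure 1.1.

    The directory names are hypothetical defaults for this sketch.
    """
    return [
        # Step 1: Git retrieves the source code.
        ("Git", ["git", "clone",
                 "https://github.com/llvm/llvm-project.git", source_dir]),
        # Step 2: CMake generates the build system for the Ninja driver.
        ("CMake", ["cmake", "-G", "Ninja",
                   "-S", f"{source_dir}/llvm", "-B", build_dir,
                   "-DCMAKE_BUILD_TYPE=Release"]),
        # Steps 3 and 4: Ninja drives the build and invokes the C/C++ toolchain.
        ("Ninja and the C/C++ toolchain", ["ninja", "-C", build_dir]),
        # Step 5: the check-llvm target runs the Python-based test driver.
        ("Python", ["ninja", "-C", build_dir, "check-llvm"]),
    ]


# Dry run: print each command instead of executing it.
for tool, command in build_plan():
    print(f"{tool}: {shlex.join(command)}")
```

Printing the commands instead of executing them keeps the sketch safe to run on any machine; cloning and building LLVM takes significant time and disk space.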
Identifying the right version of the tools
The required version of these tools depends on the version of LLVM you are building. For instance, see the Technical requirements section for the latest major release of LLVM, 20.1.0.
To check the required version for a specific release, check out the Getting Started page of the documentation for this release. To get there, perform the following steps:
- Go to https://releases.llvm.org/.
- Scroll down to the Download section.
- In the documentation column, click on the link named llvm or docs for the release you are interested in. For instance, release 20.1.0 should bring you to a URL such as https://releases.llvm.org/20.1.0/docs/index.html.
- Scroll down to the Documentation section.
- Click on Getting Started/Tutorials.
- Find the Software and the Host C++ Toolchain[...] sections. For instance, for release 20.1.0, the Software section lives at https://releases.llvm.org/20.1.0/docs/GettingStarted.html#software.
To find the requirements for LLVM top-of-tree, simply follow the same steps but with the release named Git. This release should have a release date of Current.
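If you prefer to jump straight to the right page, the URL pattern can be sketched in a few lines of Python. The pattern is inferred from the 20.1.0 example given in the steps above, so treat it as an assumption: older releases may organize their documentation differently. The function name `getting_started_url` and the `"top-of-tree"` keyword are hypothetical, introduced only for this example.

```python
def getting_started_url(release):
    """Return the likely URL of the Software section for an LLVM release.

    Assumption: the layout matches the one observed for release 20.1.0.
    """
    if release == "top-of-tree":
        # The official website hosts the documentation of the current sources.
        return "https://llvm.org/docs/GettingStarted.html#software"
    return f"https://releases.llvm.org/{release}/docs/GettingStarted.html#software"


print(getting_started_url("20.1.0"))
print(getting_started_url("top-of-tree"))
```

If a generated URL does not resolve, fall back to the manual steps listed above.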
You have now learned how to identify which versions of the tools you need to work with LLVM. Now, let's see how to install these versions.
Note
Ninja is the preferred driver of the build system of LLVM. However, LLVM also supports other drivers such as Makefile (the default), Xcode, and, to some extent, Bazel. Feel free to choose what works best for you.
Installing the right tools
Depending on your operating system (OS), you may already have all the necessary tools installed. You can use the following commands to check which versions of the tools are installed and whether they meet the minimum requirements that we described in the previous section:
Tool
Checking the availability
Git
git --version
C/C++ toolchain (LLVM)
clang --version
CMake
...