# The LLVM Target-Independent Code Generator
```{raw} html
```
```{contents}
:local:
```
```{warning}
This is a work in progress.
```
## Introduction
The LLVM target-independent code generator is a framework that provides a suite
of reusable components for translating the LLVM internal representation to the
machine code for a specified target---either in assembly form (suitable for a
static compiler) or in binary machine code format (usable for a JIT
compiler). The LLVM target-independent code generator consists of six main
components:
1. {ref}`Abstract target description ` interfaces which capture important properties
about various aspects of the machine, independently of how they will be used.
These interfaces are defined in `include/llvm/Target/`.
2. Classes used to represent the {ref}`code being generated ` for a target. These
classes are intended to be abstract enough to represent the machine code for
*any* target machine. These classes are defined in
`include/llvm/CodeGen/`. At this level, concepts like "constant pool
entries" and "jump tables" are explicitly exposed.
3. Classes and algorithms used to represent code at the object file level, the
{ref}`MC Layer `. These classes represent assembly level constructs like labels,
sections, and instructions. At this level, concepts like "constant pool
entries" and "jump tables" don't exist.
4. {ref}`Target-independent algorithms ` used to implement various phases of native
code generation (register allocation, scheduling, stack frame representation,
etc). This code lives in `lib/CodeGen/`.
5. {ref}`Implementations of the abstract target description interfaces ` for
particular targets. These machine descriptions make use of the components
provided by LLVM, and can optionally provide custom target-specific passes,
to build complete code generators for a specific target. Target descriptions
live in `lib/Target/`.
6. The target-independent JIT components. The LLVM JIT is completely target
independent (it uses the `TargetJITInfo` structure to interface for
target-specific issues. The code for the target-independent JIT lives in
`lib/ExecutionEngine/JIT`.
Depending on which part of the code generator you are interested in working on,
different pieces of this will be useful to you. In any case, you should be
familiar with the {ref}`target description ` and {ref}`machine code representation `
classes. If you want to add a backend for a new target, you will need to
{ref}`implement the target description ` classes for your new target and understand
the {doc}`LLVM code representation `. If you are interested in
implementing a new {ref}`code generation algorithm `, it should only depend on the
target-description and machine code representation classes, ensuring that it is
portable.
### Required components in the code generator
The two pieces of the LLVM code generator are the high-level interface to the
code generator and the set of reusable components that can be used to build
target-specific backends. The two most important interfaces (
{ref}`TargetMachine ` and {ref}`DataLayout `
) are the only ones that are required to be defined for a
backend to fit into the LLVM system, but the others must be defined if the
reusable code generator components are going to be used.
This design has two important implications. The first is that LLVM can support
completely non-traditional code generation targets. For example, the C backend
does not require register allocation, instruction selection, or any of the other
standard components provided by the system. As such, it only implements these
two interfaces, and does its own thing. Note that the C backend was removed from the
trunk in the LLVM 3.1 release. Another example of a code generator like this is a
(purely hypothetical) backend that converts LLVM to the GCC RTL form and uses
GCC to emit machine code for a target.
This design also implies that it is possible to design and implement radically
different code generators in the LLVM system that do not make use of any of the
built-in components. Doing so is not recommended at all, but could be required
for radically different targets that do not fit into the LLVM machine
description model: FPGAs for example.
(high-level design of the code generator)=
### The high-level design of the code generator
The LLVM target-independent code generator is designed to support efficient and
quality code generation for standard register-based microprocessors. Code
generation in this model is divided into the following stages:
1. {ref}`Instruction Selection ` --- This phase determines an efficient way to
express the input LLVM code in the target instruction set. This stage
produces the initial code for the program in the target instruction set, then
makes use of virtual registers in SSA form and physical registers that
represent any required register assignments due to target constraints or
calling conventions. This step turns the LLVM code into a DAG of target
instructions.
2. {ref}`Scheduling and Formation ` --- This phase takes the DAG of target
instructions produced by the instruction selection phase, determines an
ordering of the instructions, then emits the instructions as
{ref}`MachineInstr `s with that ordering. Note that we
describe this in the {ref}`instruction selection section ` because it operates on
a {ref}`SelectionDAG `.
3. {ref}`SSA-based Machine Code Optimizations ` --- This optional stage consists of a
series of machine-code optimizations that operate on the SSA-form produced by
the instruction selector. Optimizations like modulo-scheduling or peephole
optimization work here.
4. {ref}`Register Allocation ` --- The target code is transformed from an infinite
virtual register file in SSA form to the concrete register file used by the
target. This phase introduces spill code and eliminates all virtual register
references from the program.
5. {ref}`Prolog/Epilog Code Insertion ` --- Once the machine code has been generated
for the function and the amount of stack space required is known (used for
LLVM alloca's and spill slots), the prolog and epilog code for the function
can be inserted and "abstract stack location references" can be eliminated.
This stage is responsible for implementing optimizations like frame-pointer
elimination and stack packing.
6. {ref}`Late Machine Code Optimizations ` --- Optimizations that operate on "final"
machine code can go here, such as spill code scheduling and peephole
optimizations.
7. {ref}`Code Emission ` --- The final stage actually puts out the code for the
current function, either in the target assembler format or in machine
code.
The code generator is based on the assumption that the instruction selector will
use an optimal pattern matching selector to create high-quality sequences of
native instructions. Alternative code generator designs based on pattern
expansion and aggressive iterative peephole optimization are much slower. This
design permits efficient compilation (important for JIT environments) and
aggressive optimization (used when generating code offline) by allowing
components of varying levels of sophistication to be used for any step of
compilation.
In addition to these stages, target implementations can insert arbitrary
target-specific passes into the flow. For example, the X86 target uses a
special pass to handle the 80x87 floating point stack architecture. Other
targets with unusual requirements can be supported with custom passes as needed.
### Using TableGen for target description
The target description classes require a detailed description of the target
architecture. These target descriptions often have a large amount of common
information (e.g., an `add` instruction is almost identical to a `sub`
instruction). In order to allow the maximum amount of commonality to be
factored out, the LLVM code generator uses the
{doc}`TableGen/index` tool to describe big chunks of the
target machine, which allows the use of domain-specific and target-specific
abstractions to reduce the amount of repetition.
As LLVM continues to be developed and refined, we plan to move more and more of
the target description to the `.td` form. Doing so gives us a number of
advantages. The most important is that it makes it easier to port LLVM because
it reduces the amount of C++ code that has to be written, and the surface area
of the code generator that needs to be understood before someone can get
something working. Second, it makes it easier to change things. In particular,
if tables and other things are all emitted by `tblgen`, we only need a change
in one place (`tblgen`) to update all of the targets to a new interface.
(Abstract target description)=
(target description)=
## Target description classes
The LLVM target description classes (located in the `include/llvm/Target`
directory) provide an abstract description of the target machine independent of
any particular client. These classes are designed to capture the *abstract*
properties of the target (such as the instructions and registers it has), and do
not incorporate any particular pieces of code generation algorithms.
All of the target description classes (except the {ref}`DataLayout `
class) are designed to be subclassed by the concrete target
implementation, and have virtual methods implemented. To get to these
implementations, the {ref}`TargetMachine ` class
provides accessors that should be implemented by the target.
(TargetMachine)=
### The `TargetMachine` class
The `TargetMachine` class provides virtual methods that are used to access the
target-specific implementations of the various target description classes via
the `get*Info` methods (`getInstrInfo`, `getRegisterInfo`,
`getFrameInfo`, etc.). This class is designed to be specialized by a concrete
target implementation (e.g., `X86TargetMachine`) which implements the various
virtual methods. The only required target description class is the
{ref}`DataLayout ` class, but if the code
generator components are to be used, the other interfaces should be implemented
as well.
(DataLayout)=
### The `DataLayout` class
The `DataLayout` class is the only required target description class, and it
is the only class that is not extensible (you cannot derive a new class from
it). `DataLayout` specifies information about how the target lays out memory
for structures, the alignment requirements for various data types, the size of
pointers in the target, and whether the target is little-endian or
big-endian.
(TargetLowering)=
### The `TargetLowering` class
The `TargetLowering` class is used by SelectionDAG based instruction selectors
primarily to describe how LLVM code should be lowered to SelectionDAG
operations. Among other things, this class indicates:
* an initial register class to use for various `ValueType`s,
* which operations are natively supported by the target machine,
* the return type of `setcc` operations,
* the type to use for shift amounts, and
* various high-level characteristics, like whether it is profitable to turn
division by a constant into a multiplication sequence.
(TargetRegisterInfo)=
### The `TargetRegisterInfo` class
The `TargetRegisterInfo` class is used to describe the register file of the
target and any interactions between the registers.
Registers are represented in the code generator by unsigned integers. Physical
registers (those that actually exist in the target description) are unique
small numbers, and virtual registers are generally large. Note that
register `#0` is reserved as a flag value.
Each register in the processor description has an associated
`TargetRegisterDesc` entry, which provides a textual name for the register
(used for assembly output and debugging dumps) and a set of aliases (used to
indicate whether one register overlaps with another).
In addition to the per-register description, the `TargetRegisterInfo` class
exposes a set of processor-specific register classes (instances of the
`TargetRegisterClass` class). Each register class contains sets of registers
that have the same properties (for example, they are all 32-bit integer
registers). Each SSA virtual register created by the instruction selector has
an associated register class. When the register allocator runs, it replaces
virtual registers with a physical register in the set.
The target-specific implementations of these classes is auto-generated from a
{doc}`TableGen/index` description of the register file.
(TargetInstrInfo)=
### The `TargetInstrInfo` class
The `TargetInstrInfo` class is used to describe the machine instructions
supported by the target. Descriptions define things like the mnemonic for
the opcode, the number of operands, the list of implicit register uses and defs,
whether the instruction has certain target-independent properties (accesses
memory, is commutable, etc), and holds any target-specific flags.
### The `TargetFrameLowering` class
The `TargetFrameLowering` class is used to provide information about the stack
frame layout of the target. It holds the direction of stack growth, the known
stack alignment on entry to each function, and the offset to the local area.
The offset to the local area is the offset from the stack pointer on function
entry to the first location where function data (local variables, spill
locations) can be stored.
### The `TargetSubtarget` class
The `TargetSubtarget` class is used to provide information about the specific
chip set being targeted. A sub-target informs code generation of which
instructions are supported, instruction latencies and instruction execution
itinerary; i.e., which processing units are used, in what order, and for how
long.
### The `TargetJITInfo` class
The `TargetJITInfo` class exposes an abstract interface used by the
Just-In-Time code generator to perform target-specific activities, such as
emitting stubs. If a `TargetMachine` supports JIT code generation, it should
provide one of these objects through the `getJITInfo` method.
(code being generated)=
(machine code representation)=
## Machine code description classes
At the high-level, LLVM code is translated to a machine-specific representation
formed out of {ref}`MachineFunction `,
{ref}`MachineBasicBlock `, and
{ref}`MachineInstr ` instances (defined in
`include/llvm/CodeGen`). This representation is completely target agnostic,
representing instructions in their most abstract form: an opcode and a series of
operands. This representation is designed to support both an SSA representation
for machine code, as well as a register allocated, non-SSA form.
(MachineInstr)=
### The `MachineInstr` class
Target machine instructions are represented as instances of the `MachineInstr`
class. This class is an extremely abstract way of representing machine
instructions. In particular, it only keeps track of an opcode number and a set
of operands.
The opcode number is a simple unsigned integer that only has meaning to a
specific backend. All of the instructions for a target should be defined in the
`*InstrInfo.td` file for the target. The opcode enum values are auto-generated
from this description. The `MachineInstr` class does not have any information
about how to interpret the instruction (i.e., what the semantics of the
instruction are); for that you must refer to the
{ref}`TargetInstrInfo ` class.
The operands of a machine instruction can be of several different types: a
register reference, a constant integer, a basic block reference, etc. In
addition, a machine operand should be marked as a def or a use of the value
(though only registers are allowed to be defs).
By convention, the LLVM code generator orders instruction operands so that all
register definitions come before the register uses, even on architectures that
are normally printed in other orders. For example, the SPARC add instruction:
"`add %i1, %i2, %i3`" adds the "%i1", and "%i2" registers and stores the
result into the "%i3" register. In the LLVM code generator, the operands should
be stored as "`%i3, %i1, %i2`": with the destination first.
Keeping destination (definition) operands at the beginning of the operand list
has several advantages. In particular, the debugging printer will print the
instruction like this:
```llvm
%r3 = add %i1, %i2
```
Also if the first operand is a def, it is easier to {ref}`create instructions ` whose
only def is the first operand.
(create instructions)=
#### Using the `MachineInstrBuilder.h` functions
Machine instructions are created by using the `BuildMI` functions, located in
the `include/llvm/CodeGen/MachineInstrBuilder.h` file. The `BuildMI`
functions make it easy to build arbitrary machine instructions. Usage of the
`BuildMI` functions look like this:
```c++
// Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42')
// instruction and insert it at the end of the given MachineBasicBlock.
const TargetInstrInfo &TII = ...
MachineBasicBlock &MBB = ...
DebugLoc DL;
MachineInstr *MI = BuildMI(MBB, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);
// Create the same instr, but insert it before a specified iterator point.
MachineBasicBlock::iterator MBBI = ...
BuildMI(MBB, MBBI, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);
// Create a 'cmp Reg, 0' instruction, no destination reg.
MI = BuildMI(MBB, DL, TII.get(X86::CMP32ri8)).addReg(Reg).addImm(42);
// Create an 'sahf' instruction which takes no operands and stores nothing.
MI = BuildMI(MBB, DL, TII.get(X86::SAHF));
// Create a self looping branch instruction.
BuildMI(MBB, DL, TII.get(X86::JNE)).addMBB(&MBB);
```
If you need to add a definition operand (other than the optional destination
register), you must explicitly mark it as such:
```c++
MI.addReg(Reg, RegState::Define);
```
#### Fixed (preassigned) registers
One important issue that the code generator needs to be aware of is the presence
of fixed registers. In particular, there are often places in the instruction
stream where the register allocator *must* arrange for a particular value to be
in a particular register. This can occur due to limitations of the instruction
set (e.g., the X86 can only do a 32-bit divide with the `EAX`/`EDX`
registers), or external factors like calling conventions. In any case, the
instruction selector should emit code that copies a virtual register into or out
of a physical register when needed.
For example, consider this simple LLVM example:
```llvm
define i32 @test(i32 %X, i32 %Y) {
%Z = sdiv i32 %X, %Y
ret i32 %Z
}
```
The X86 instruction selector might produce this machine code for the `div` and
`ret`:
```text
;; Start of div
%EAX = mov %reg1024 ;; Copy X (in reg1024) into EAX
%reg1027 = sar %reg1024, 31
%EDX = mov %reg1027 ;; Sign extend X into EDX
idiv %reg1025 ;; Divide by Y (in reg1025)
%reg1026 = mov %EAX ;; Read the result (Z) out of EAX
;; Start of ret
%EAX = mov %reg1026 ;; 32-bit return value goes in EAX
ret
```
By the end of code generation, the register allocator would coalesce the
registers and delete the resultant identity moves producing the following
code:
```text
;; X is in EAX, Y is in ECX
mov %EAX, %EDX
sar %EDX, 31
idiv %ECX
ret
```
This approach is extremely general (if it can handle the X86 architecture, it
can handle anything!) and allows all of the target-specific knowledge about the
instruction stream to be isolated in the instruction selector. Note that
physical registers should have a short lifetime for good code generation, and
all physical registers are assumed dead on entry to and exit from basic blocks
(before register allocation). Thus, if you need a value to be live across basic
block boundaries, it *must* live in a virtual register.
#### Call-clobbered registers
Some machine instructions, like calls, clobber a large number of physical
registers. Rather than adding `` operands for all of them, it is
possible to use an `MO_RegisterMask` operand instead. The register mask
operand holds a bit mask of preserved registers, and everything else is
considered to be clobbered by the instruction.
#### Machine code in SSA form
`MachineInstr`'s are initially selected in SSA-form, and are maintained in
SSA-form until register allocation happens. For the most part, this is
trivially simple since LLVM is already in SSA form; LLVM PHI nodes become
machine code PHI nodes, and virtual registers are only allowed to have a single
definition.
After register allocation, machine code is no longer in SSA-form because there
are no virtual registers left in the code.
(MachineBasicBlock)=
### The `MachineBasicBlock` class
The `MachineBasicBlock` class contains a list of machine instructions
( {ref}`MachineInstr ` instances). It roughly
corresponds to the LLVM code input to the instruction selector, but there can be
a one-to-many mapping (i.e., one LLVM basic block can map to multiple machine
basic blocks). The `MachineBasicBlock` class has a "`getBasicBlock`" method,
which returns the LLVM basic block that it comes from.
(MachineFunction)=
### The `MachineFunction` class
The `MachineFunction` class contains a list of machine basic blocks
( {ref}`MachineBasicBlock ` instances). It
corresponds one-to-one with the LLVM function input to the instruction selector.
In addition to a list of basic blocks, the `MachineFunction` contains a
`MachineConstantPool`, a `MachineFrameInfo`, a `MachineFunctionInfo`, and
a `MachineRegisterInfo`. See `include/llvm/CodeGen/MachineFunction.h` for
more information.
### `MachineInstr Bundles`
LLVM code generator can model sequences of instructions as MachineInstr
bundles. A MI bundle can model a VLIW group / pack which contains an arbitrary
number of parallel instructions. It can also be used to model a sequential list
of instructions (potentially with data dependencies) that cannot be legally
separated (e.g., ARM Thumb2 IT blocks).
Conceptually a MI bundle is a MI with a number of other MIs nested within:
```
--------------
| Bundle | ---------
-------------- \
| ----------------
| | MI |
| ----------------
| |
| ----------------
| | MI |
| ----------------
| |
| ----------------
| | MI |
| ----------------
|
--------------
| Bundle | --------
-------------- \
| ----------------
| | MI |
| ----------------
| |
| ----------------
| | MI |
| ----------------
| |
| ...
|
--------------
| Bundle | --------
-------------- \
|
...
```
MI bundle support does not change the physical representations of
MachineBasicBlock and MachineInstr. All the MIs (including top level and nested
ones) are stored as sequential list of MIs. The "bundled" MIs are marked with
the 'InsideBundle' flag. A top-level MI with the special BUNDLE opcode is used
to represent the start of a bundle. It's legal to mix BUNDLE MIs with individual
MIs that are not inside bundles nor represent bundles.
MachineInstr passes should operate on a MI bundle as a single unit. Member
methods have been taught to correctly handle bundles and MIs inside bundles.
The MachineBasicBlock iterator has been modified to skip over bundled MIs to
enforce the bundle-as-a-single-unit concept. An alternative iterator
instr_iterator has been added to MachineBasicBlock to allow passes to iterate
over all of the MIs in a MachineBasicBlock, including those which are nested
inside bundles. The top-level BUNDLE instruction must have the correct set of
register MachineOperand's that represent the cumulative inputs and outputs of
the bundled MIs.
Packing / bundling of MachineInstrs for VLIW architectures should
generally be done as part of the register allocation super-pass. More
specifically, the pass which determines what MIs should be bundled
together should be done after code generator exits SSA form
(i.e., after two-address pass, PHI elimination, and copy coalescing).
Such bundles should be finalized (i.e., adding BUNDLE MIs and input and
output register MachineOperands) after virtual registers have been
rewritten into physical registers. This eliminates the need to add
virtual register operands to BUNDLE instructions which would
effectively double the virtual register def and use lists. Bundles may
use virtual registers and be formed in SSA form, but may not be
appropriate for all use cases.
(MC Layer)=
## The "MC" Layer
The MC Layer is used to represent and process code at the raw machine code
level, devoid of "high level" information like "constant pools", "jump tables",
"global variables" or anything like that. At this level, LLVM handles things
like label names, machine instructions, and sections in the object file. The
code in this layer is used for a number of important purposes: the tail end of
the code generator uses it to write a `.s` or `.o` file, and it is also used by the
llvm-mc tool to implement standalone machine code assemblers and disassemblers.
This section describes some of the important classes. There are also a number
of important subsystems that interact at this layer, they are described later in
this manual.
(MCStreamer)=
### The `MCStreamer` API
MCStreamer is best thought of as an assembler API. It is an abstract API which
is *implemented* in different ways (e.g., to output a `.s` file, output an ELF `.o`
file, etc) but whose API corresponds directly to what you see in a `.s` file.
MCStreamer has one method per directive, such as EmitLabel, EmitSymbolAttribute,
switchSection, emitValue (for .byte, .word), etc, which directly correspond to
assembly level directives. It also has an EmitInstruction method, which is used
to output an MCInst to the streamer.
This API is most important for two clients: the llvm-mc stand-alone assembler is
effectively a parser that parses a line, then invokes a method on MCStreamer. In
the code generator, the {ref}`Code Emission ` phase of the code generator lowers
higher level LLVM IR and Machine* constructs down to the MC layer, emitting
directives through MCStreamer.
On the implementation side of MCStreamer, there are two major implementations:
one for writing out a `.s` file (MCAsmStreamer), and one for writing out a `.o`
file (MCObjectStreamer). MCAsmStreamer is a straightforward implementation
that prints out a directive for each method (e.g., `EmitValue -> .byte`), but
MCObjectStreamer implements a full assembler.
For target-specific directives, the MCStreamer has a MCTargetStreamer instance.
Each target that needs it defines a class that inherits from it and is a lot
like MCStreamer itself: It has one method per directive and two classes that
inherit from it, a target object streamer and a target asm streamer. The target
asm streamer just prints it (`emitFnStart -> .fnstart`), and the object
streamer implements the assembler logic for it.
To make llvm use these classes, the target initialization must call
TargetRegistry::RegisterAsmStreamer and TargetRegistry::RegisterMCObjectStreamer
passing callbacks that allocate the corresponding target streamer and pass it
to createAsmStreamer or to the appropriate object streamer constructor.
### The `MCContext` class
The MCContext class is the owner of a variety of uniqued data structures at the
MC layer, including symbols, sections, etc. As such, this is the class that you
interact with to create symbols and sections. This class can not be subclassed.
### The `MCSymbol` class
The MCSymbol class represents a symbol (aka label) in the assembly file. There
are two interesting kinds of symbols: assembler temporary symbols, and normal
symbols. Assembler temporary symbols are used and processed by the assembler
but are discarded when the object file is produced. The distinction is usually
represented by adding a prefix to the label, for example "L" labels are
assembler temporary labels in MachO.
MCSymbols are created by MCContext and uniqued there. This means that MCSymbols
can be compared for pointer equivalence to find out if they are the same symbol.
Note that pointer inequality does not guarantee the labels will end up at
different addresses though. It's perfectly legal to output something like this
to the `.s` file:
```
foo:
bar:
.byte 4
```
In this case, both the foo and bar symbols will have the same address.
### The `MCSection` class
The `MCSection` class represents an object-file specific section. It is
subclassed by object file specific implementations (e.g., `MCSectionMachO`,
`MCSectionCOFF`, `MCSectionELF`) and these are created and uniqued by
MCContext. The MCStreamer has a notion of the current section, which can be
changed with the SwitchToSection method (which corresponds to a ".section"
directive in a `.s` file).
(MCInst)=
### The `MCInst` class
The `MCInst` class is a target-independent representation of an instruction.
It is a simple class (much more so than {ref}`MachineInstr `) that holds a
target-specific opcode and a vector of MCOperands. MCOperand, in turn, is a
simple discriminated union of three cases: 1) a simple immediate, 2) a target
register ID, 3) a symbolic expression (e.g., "`Lfoo-Lbar+42`") as an MCExpr.
MCInst is the common currency used to represent machine instructions at the MC
layer. It is the type used by the instruction encoder, the instruction printer,
and the type generated by the assembly parser and disassembler.
(ObjectFormats)=
### Object File Format
The MC layer's object writers support a variety of object formats. Because of
target-specific aspects of object formats each target only supports a subset of
the formats supported by the MC layer. Most targets support emitting ELF
objects. Other vendor-specific objects are generally supported only on targets
that are supported by that vendor (i.e., MachO is only supported on targets
supported by Darwin, and XCOFF is only supported on targets that support AIX).
Additionally some targets have their own object formats (i.e., DirectX, SPIR-V
and WebAssembly).
The table below captures a snapshot of object file support in LLVM:
```{table} Object File Formats
| Format | Supported Targets |
| --- | --- |
| `COFF` | AArch64, ARM, X86 |
| `DXContainer` | DirectX |
| `ELF` | AArch64, AMDGPU, ARM, AVR, BPF, CSKY, Hexagon, Lanai, LoongArch, M86k, MSP430, MIPS, PowerPC, RISCV, SPARC, SystemZ, VE, X86 |
| `GOFF` | SystemZ |
| `MachO` | AArch64, ARM, X86 |
| `SPIR-V` | SPIRV |
| `WASM` | WebAssembly |
| `XCOFF` | PowerPC |
```
(Target-independent algorithms)=
(code generation algorithm)=
## Target-independent code generation algorithms
This section documents the phases described in the
{ref}`high-level design of the code generator `.
It explains how they work and some of the rationale behind their design.
(Instruction Selection)=
(instruction selection section)=
### Instruction Selection
Instruction Selection is the process of translating LLVM code presented to the
code generator into target-specific machine instructions. There are several
well-known ways to do this in the literature. LLVM uses a SelectionDAG based
instruction selector.
Portions of the DAG instruction selector are generated from the target
description (`*.td`) files. Our goal is for the entire instruction selector
to be generated from these `.td` files, though currently there are still
things that require custom C++ code.
[GlobalISel](https://llvm.org/docs/GlobalISel/index.html) is another
instruction selection framework.
(SelectionDAG)=
#### Introduction to SelectionDAGs
The SelectionDAG provides an abstraction for code representation in a way that
is amenable to instruction selection using automatic techniques
(e.g., dynamic-programming based optimal pattern matching selectors). It is also
well-suited to other phases of code generation; in particular, instruction
scheduling (SelectionDAG's are very close to scheduling DAGs post-selection).
Additionally, the SelectionDAG provides a host representation where a large
variety of very-low-level (but target-independent) {ref}`optimizations ` may be
performed; ones which require extensive information about the instructions
efficiently supported by the target.
The SelectionDAG is a Directed-Acyclic-Graph whose nodes are instances of the
`SDNode` class. The primary payload of the `SDNode` is its operation code
(Opcode) that indicates what operation the node performs and the operands to the
operation. The various operation node types are described at the top of the
`include/llvm/CodeGen/ISDOpcodes.h` file.
Although most operations define a single value, each node in the graph may
define multiple values. For example, a combined div/rem operation will define
both the dividend and the remainder. Many other situations require multiple
values as well. Each node also has some number of operands, which are edges to
the node defining the used value. Because nodes may define multiple values,
edges are represented by instances of the `SDValue` class, which is a
`` pair, indicating the node and result value being used,
respectively. Each value produced by an `SDNode` has an associated `MVT`
(Machine Value Type) indicating what the type of the value is.
SelectionDAGs contain two different kinds of values: those that represent data
flow and those that represent control flow dependencies. Data values are simple
edges with an integer or floating point value type. Control edges are
represented as "chain" edges which are of type `MVT::Other`. These edges
provide an ordering between nodes that have side effects (such as loads, stores,
calls, returns, etc). All nodes that have side effects should take a token
chain as input and produce a new one as output. By convention, token chain
inputs are always operand #0, and chain results are always the last value
produced by an operation. However, after instruction selection, the
machine nodes have their chain after the instruction's operands, and
may be followed by glue nodes.
A SelectionDAG has designated "Entry" and "Root" nodes. The Entry node is
always a marker node with an Opcode of `ISD::EntryToken`. The Root node is
the final side-effecting node in the token chain. For example, in a single basic
block function it would be the return node.
One important concept for SelectionDAGs is the notion of a "legal" vs.
"illegal" DAG. A legal DAG for a target is one that only uses supported
operations and supported types. On a 32-bit PowerPC, for example, a DAG with a
value of type i1, i8, i16, or i64 would be illegal, as would a DAG that uses a
SREM or UREM operation. The {ref}`legalize types ` and {ref}`legalize operations ` phases
are responsible for turning an illegal DAG into a legal DAG.
(SelectionDAG-Process)=
#### SelectionDAG Instruction Selection Process
SelectionDAG-based instruction selection consists of the following steps:
1. {ref}`Build initial DAG ` --- This stage performs a simple translation from the
input LLVM code to an illegal SelectionDAG.
1. {ref}`Optimize SelectionDAG ` --- This stage performs simple optimizations on the
SelectionDAG to simplify it, and recognize meta instructions (like rotates
and `div`/`rem` pairs) for targets that support these meta operations.
This makes the resultant code more efficient and the
{ref}`select instructions from DAG