LLVM - unifying game technology

(C) 2011 Andy Thomason


 

Problems facing games compiler technology:

 

Need for standardization of build platforms

Increasing build times

 

Platform independent binary?

The holy grail?

 

Problems facing games compiler technology:

 

Desire for vectorization and parallelism

Need for integration of CPU and GPU

Need for integration of Mobile and console platforms

 

What did LLVM ever do for us?

 

LLVM is a big bag of compiler parts

Can be used as a GCC replacement

But also can be used to build more flexible systems

Most development at Apple (Chris Lattner)

BSD/MIT-style license (University of Illinois/NCSA) - enables private branching

 

Front ends

 

Imperative: C, C++, ObjC

Functional: OCaml

GPGPU: CUDA, OpenCL

Shaders: Cg, GLSL

 

Back ends

 

X86, ARM

AMDIL, NVIDIA PTX

Cell PPU/SPU

FPGA

XMOS

 

But also:

 

JIT on X86, ARM

Hence Python, Lua, JavaScript.

However, JIT overhead may outweigh the real gains.
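
As a rough illustration of the overhead point, a JIT only pays off once the time saved per call outweighs the one-off compile cost. The sketch below computes the break-even call count. The function name and the cost model are hypothetical, and any timings fed into it are assumptions rather than measurements.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical cost model, illustrative only. All times are in microseconds.
// Returns the smallest number of calls at which JIT-compiling a function
// beats interpreting it, or -1 if the JIT never pays off.
long jit_break_even_calls(double compile_cost_us,
                          double interp_call_us,
                          double jit_call_us) {
    assert(compile_cost_us >= 0 && interp_call_us >= 0 && jit_call_us >= 0);
    double saving = interp_call_us - jit_call_us;  // time gained per call
    if (saving <= 0) return -1;                    // compiled code is no faster
    // We need n * saving > compile_cost, i.e. n > compile_cost / saving.
    return static_cast<long>(std::floor(compile_cost_us / saving)) + 1;
}
```

With an assumed 1000 us compile, 5 us interpreted calls and 1 us compiled calls, the break-even point is 251 calls, so a script function that runs only a handful of times may never repay its compile.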

Overview of LLVM systems

 

Clang - C/C++/GLSL/OpenCL front end

 

LLVM core - code transformation in high-level IR

LLVM code generation - platform-dependent code emission

LLVM JIT - just-in-time compilation

LLVM link-time optimisation (LTO)

; ModuleID = '\llvm\tools\opt\opt.cpp'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:....
target triple = "i386-pc-mingw32"


define i8* @_ZnwjPv(i32 %unnamed_arg, i8* %__p) nounwind {
entry:
  %unnamed_arg_addr = alloca i32, align 4
  %__p_addr = alloca i8*, align 4
  %retval = alloca i8*
  %0 = alloca i8*
  %"alloca point" = bitcast i32 0 to i32
  store i32 %unnamed_arg, i32* %unnamed_arg_addr
  store i8* %__p, i8** %__p_addr
  %1 = load i8** %__p_addr, align 4
  store i8* %1, i8** %0, align 4
  %2 = load i8** %0, align 4
  store i8* %2, i8** %retval, align 4
  br label %return

return:                                           ; preds = %entry
  %retval1 = load i8** %retval
  ret i8* %retval1
}
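
The symbol `_ZnwjPv` in the listing is the Itanium-ABI mangled name of `operator new(unsigned int, void*)`, i.e. placement new on a 32-bit target. As a sketch of the kind of source that yields this unoptimised IR, the function below has the same shape. `placement_like` is a hypothetical stand-in name, since the real operator is supplied by `<new>`, and the null check is an addition for this sketch only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for placement operator new: the size argument is
// ignored and the caller-supplied pointer is returned unchanged.
// Compiled without optimisation, each parameter is spilled to a stack slot
// (the allocas in the listing) and reloaded before the return.
void* placement_like(std::size_t size, void* p) {
    (void)size;            // unused, like %unnamed_arg in the listing
    assert(p != nullptr);  // sanity check for this sketch; not in the listing
    return p;
}
```

Once the stack slots are promoted to registers by the optimiser, a function like this reduces to a single return of its pointer argument.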
      
 

Clang front end

 

Primarily used as a replacement for GCC or MSVC

Multilingual front end with excellent error reporting

 

Clang front end

 

Can be used as a library for dynamic compilation

Now well supported and builds the vast majority of source code

 

Clang front end

 

Can be used to integrate shaders and GPGPU into main source

Common source for iPhone, PC and consoles

 

Static analysis - finding bugs

 

Able to look more deeply than compiler for potential bugs

Potentially useful for global alias analysis

Potential use in improving build times, e.g. by reducing header file dependencies.

 

LLVM core

 

Uses a very simple intermediate representation.

All the usual code transformations - combines, loops, etc.
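
To give a feel for what a transformation pass over a simple IR does, here is a toy constant-folding "combine" over a made-up three-address IR. It is only a sketch: `Inst`, `Op` and `fold_constants` are hypothetical names, and none of this is the LLVM API.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy IR, not LLVM's: each instruction defines virtual register `dst`.
enum class Op { Const, Add, Mul };

struct Inst {
    Op op;
    int dst;   // register defined by this instruction
    int a, b;  // Const: a is the literal value; Add/Mul: source registers
};

// One forward pass: when both operands of an Add/Mul are known constants,
// rewrite the instruction in place as a Const. Returns the number folded.
int fold_constants(std::vector<Inst>& code) {
    std::map<int, int> known;  // register -> constant value
    int folded = 0;
    for (Inst& i : code) {
        assert(i.dst >= 0);  // registers are numbered from 0 in this toy IR
        if (i.op != Op::Const) {
            auto lhs = known.find(i.a);
            auto rhs = known.find(i.b);
            if (lhs == known.end() || rhs == known.end()) {
                known.erase(i.dst);  // result is not a known constant
                continue;
            }
            int value = (i.op == Op::Add) ? lhs->second + rhs->second
                                          : lhs->second * rhs->second;
            i = Inst{Op::Const, i.dst, value, 0};
            ++folded;
        }
        known[i.dst] = i.a;  // i is a Const here, so i.a holds its value
    }
    return folded;
}
```

Real passes work on a much richer IR, but the pattern is the same: walk the instructions, recognise a shape, and replace it with something cheaper.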

 

LLVM core

 

You can write extension passes such as Lua bindings and instrumentation hooks

Would allow aggressive code-size-reducing optimizations

 

LLVM code generation

 

ARM and X86 code generation have the most support

Can be used as a replacement for "cl" in PC games

Configurable target description

Has plug-in register allocation and other options

Rapidly adapts to new hardware

 

LLVM customization

 

Can add extra passes in shared libraries:

Bindings for Java and Lua

Metadata for serialization

Code profiling and debugging hooks
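
As a sketch of what a profiling hook amounts to once it is in the code, the example below shows a function as it might look after an instrumentation pass has inserted an entry hook, written out by hand. The runtime (`call_counts`, `profile_enter`) and the game function `update_physics` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical profiling runtime that inserted hooks would call into.
std::map<std::string, long>& call_counts() {
    static std::map<std::string, long> counts;
    return counts;
}

void profile_enter(const char* fn) {
    assert(fn != nullptr);
    ++call_counts()[fn];  // count one more entry into the named function
}

// Game code as it would look after instrumentation: the only change from
// the original source is the hook call at function entry.
int update_physics(int steps) {
    profile_enter("update_physics");  // hook a pass would insert
    int total = 0;
    for (int s = 0; s < steps; ++s) total += s;
    return total;
}
```

Doing this in a compiler pass rather than by hand means the hooks can be switched on for a profiling build without touching the game source.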

 

LLVM JIT

 

Most useful for dynamic languages such as scripting

No need to re-compile or stop the game

Supports X86 and ARM

Dynamic languages will become increasingly important

Dynamic C/C++ a possibility

 

LLVM in games

 

Can improve build times

Link time optimisation can avoid need for unity builds

Naturally parallel

 

LLVM in games

 

Ideal for dynamic optimisation of game loops

Can integrate GPGPU, shaders and CPU code

Works with existing debuggers and profilers - gdb, ProDG

 

The future

 

No more need for inline asm and intrinsics (there may be some hold-outs!)

Parallelization of serial code

Vectorization of scalar code throughout the program

Support for scatter-gather DMA
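
To make the vectorization point concrete, here is the sort of plain scalar loop that this future relies on the compiler to vectorize, with no inline asm or intrinsics. This is a minimal sketch: `saxpy` is just an illustrative name, and whether a given compiler actually vectorizes the loop depends on its version, flags and target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain scalar loop: y[i] = a * x[i] + y[i].
// Independent iterations over contiguous data are the shape an
// auto-vectorizer looks for; nothing here is target-specific.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xs = x.data();
    float* ys = y.data();
    for (std::size_t i = 0; i < x.size(); ++i) {
        ys[i] = a * xs[i] + ys[i];
    }
}
```

The same source can then be built for X86, ARM or any other back end, leaving the choice of vector instructions to the code generator.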

 

The future

 

Interpreter loop optimization

Extension of GPGPU and eventual phase-out of special purpose graphics hardware

Ray-tracing, Radiosity, volumetric fluid effects

Debug info per source file, not per module?

 

LLVM right now

 

Needs productization

Needs to focus on real code, not benchmarks

 

LLVM right now

 

Code size a priority

Need to build some large games

But does work

Reference

Thanks to lou at bluelou.net for the illustrations.