Branchless Programming

Published Date: 2021-3-5

A quick introduction to branchless programming

Note: Modern compiler's are quite smart, it is likely this technique might reduce performance. In order to ensure performance, please use a tool to view your assembly output and compare the performance of a naive version and the optimised version.

What is branchless programming?

Branchless programming is the art of writing programs that minimise the use of jmp instructions.

Why is this important?

Modern CPUs follow a pipeline architecture. Meaning that it can run a sequence of instructions simultaneously. The CPU will not know what instruction to execute following a jump instruction. There are mitigations to this issue, namely branch predictors but if the prediction is wrong all the work done is discarded, this is obvisouly bad for performance, since we need to start over again.

Have a look here for more information.

If we directly map if statements to assembly, we see jmp statements, it isn't just if statements that cause jmp statements to be used, but they are the most frequent. Here is a small example.

Note: Instead of mapping into x86, I decided to make up a tiny instruction set as I think this should make it more obvious.

C

if(x >= 0) {
    do_something();
}

Assembler

Small note, this is a made up assembly language.

Assume the variable x is loaded to register r0 and the constant 0 is loaded to register r1.

The instruction geq is used in this format geq <src> <v1> <v2>.

The instruction jmp_true is used in this format jmp_true <register> <label>.


; Comparision happens here
geq r2 r0 r1
jmp_true r2 .false_condition

; True Condition starts here
call do_something

; False Condition
.false_condition

There are two cases that I believe make use of branchless programming.

Cryptography (I am not entirely sure about this)
High performance code

Cryptography

The claims I have seen online are that branchless code prevent information leaks. This makes sense as information is leaked through timing attacks. I did find this. What they have written about makes sense, please note however that this is AES specific, I did not know this before now but apparently the AES instruction set increases the resistance to side channel attacks, I would imagine other cryptographic protocols are maybe more vulnerable to side channel attacks, so branchless programming would likely be criticial there.

A tiny (inefficient) example

I can't think of a better example than what whatsacreel (linked below) has. I think that example is small and shows branchless programming minimalistically.

Branched version

int min(int a, int b) {
    return a < b ? a : b;
}

Branchless version

int min(int a, int b) {
    return a*(a<b) + b*(b<=a);
}

Note: More examples are available here.