Turning lowers to uppers without a branch
In another comment on the Curiouser Case, Arnaud Bouchez pointed to an interesting optimization that is used in mORMot's UpperCopy255Buf function: a branchless, parallel uppercase conversion.
At the core of that implementation are the following lines of code:
c := src[i];
d := c or _80;
dest[i] := c - ((d - _61) and not (d - _7b)) and ((not c) and _80) shr 2;
OK, it may not be obvious at first sight what happens here. Let's break it down.
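As a rough sketch of how the expression can be read, here is a C transliteration that handles four bytes per 32-bit word. The constants K80, K61 and K7B are assumptions: the excerpt does not define _80, _61 and _7b, so the sketch takes them to be words with every byte set to $80, $61 ('a') and $7b ('z' + 1). The function names upper4 and upper_copy are hypothetical and not part of mORMot.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values: every byte of the word holds the named constant. */
#define K80 UINT32_C(0x80808080)
#define K61 UINT32_C(0x61616161) /* 'a' in each byte */
#define K7B UINT32_C(0x7b7b7b7b) /* 'z' + 1 in each byte */

/* Upper-cases the four bytes packed in c without any branch. */
static uint32_t upper4(uint32_t c)
{
    /* Force bit 7 of every byte, so the subtractions below can
       never borrow from a neighbouring byte. */
    uint32_t d = c | K80;
    /* Bit 7 survives where the low 7 bits are >= 'a'. */
    uint32_t ge_a = d - K61;
    /* Bit 7 is set where the low 7 bits are <= 'z'. */
    uint32_t le_z = ~(d - K7B);
    /* Bit 7 is set where the original byte was plain ASCII (< 0x80);
       this also masks away every bit other than bit 7. */
    uint32_t ascii = ~c & K80;
    /* 0x80 in each lowercase byte becomes 0x20, the distance
       between 'a' and 'A'. All other bytes get 0. */
    uint32_t delta = (ge_a & le_z & ascii) >> 2;
    return c - delta;
}

/* Copies len bytes from src to dest, upper-casing a..z on the way. */
static void upper_copy(char *dest, const char *src, size_t len)
{
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        uint32_t w;
        memcpy(&w, src + i, 4); /* avoids unaligned-access issues */
        w = upper4(w);
        memcpy(dest + i, &w, 4);
    }
    for (; i < len; i++) {
        /* Tail bytes: reuse the same formula on a single byte. */
        dest[i] = (char)(unsigned char)upper4((unsigned char)src[i]);
    }
}
```

With these assumptions, upper_copy(out, "Hello", 5) writes HELLO to out, while bytes outside a..z, including those with the high bit set, pass through unchanged. The order of bytes within the word does not matter, because every step works on each byte independently.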