Turning lowers to uppers without a branch

In another comment on the Curiouser Case, Arnaud Bouchez pointed to an interesting optimization that is used in mORMot’s UpperCopy255Buf function: a branchless parallel upper case conversion.

At the core of that implementation are the following lines of code:

c := src[i];
d := c or _80;
dest[i] := c - ((d - _61) and not (d - _7b)) and ((not c) and _80) shr 2;

OK, what happens here may not be obvious at first sight. Let’s break it down.
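
As a starting point, here is a minimal, self-contained sketch of the same expression applied to a single 32-bit word. The excerpt does not show how _80, _61 and _7b are declared, so the sketch assumes they are constants with one byte value repeated in every byte of the word. Upper4 is a hypothetical name for illustration, not mORMot's function, and the input is assumed to be 7-bit ASCII.

```pascal
program BranchlessUpperSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}
{$Q-}{$R-} // plain machine arithmetic, no overflow or range checks

const
  // Assumed definitions: one byte value repeated across a 32-bit word
  _80 = Cardinal($80808080); // high bit of every byte
  _61 = Cardinal($61616161); // 'a' in every byte
  _7b = Cardinal($7B7B7B7B); // 'z' + 1 in every byte

// Hypothetical helper: upper-cases four 7-bit ASCII characters at once
function Upper4(c: Cardinal): Cardinal;
var
  d: Cardinal;
begin
  // Forcing the high bit keeps every byte >= $80, so the two
  // subtractions below never borrow from a neighbouring byte
  d := c or _80;
  // (d - _61)     : high bit stays set when the character is >= 'a'
  // not (d - _7b) : high bit is set when the character is <= 'z'
  // (not c) and _80 rejects bytes that already had their high bit set
  // shr 2 turns the $80 flag into $20, the distance from 'a' to 'A'
  Result := c - ((d - _61) and not (d - _7b)) and ((not c) and _80) shr 2;
end;

var
  buf: array[0..3] of AnsiChar;
  w: Cardinal;
begin
  buf[0] := 'a';
  buf[1] := '{';
  buf[2] := 'z';
  buf[3] := '@';
  Move(buf, w, SizeOf(w));   // load four characters as one word
  w := Upper4(w);
  Move(w, buf, SizeOf(w));   // store the four converted characters
  WriteLn(buf[0], buf[1], buf[2], buf[3]); // prints A{Z@
end.
```

Only 'a' and 'z' change in the sample: '{' sits just above the lower-case range and '@' well below it, so both come out of the mask with a zero flag and are copied through untouched.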
(more…)

The Mischievous Case-Insensitive Hash

In a comment on the previous article about a case-insensitive hash code, Stefan Glienke pointed to an approach used in Spring4D’s comparers, which is a delightful hack.

Rather than converting the string to a “proper lower case”, it converts the string to an “approximate lower case” using an “or $20”, which happens to be good enough for a hash on string identifiers.
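
The idea can be sketched in a few lines. This is only an illustration under stated assumptions: ApproxLowerHash is a hypothetical name, and FNV-1a is used as a stand-in hash; it is not claimed to be what Spring4D actually computes. The only part taken from the text above is the “or $20” applied to each character before it is fed to the hash.

```pascal
program OrTwentySketch;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}
{$Q-}{$R-} // the multiplication is meant to wrap around at 32 bits

// Hypothetical helper: FNV-1a over "c or $20" instead of a true lower case
function ApproxLowerHash(const s: AnsiString): Cardinal;
var
  i: Integer;
begin
  Result := 2166136261; // FNV-1a 32-bit offset basis
  for i := 1 to Length(s) do
  begin
    // Setting bit $20 maps 'A'..'Z' ($41..$5A) onto 'a'..'z' ($61..$7A)
    // and leaves 'a'..'z' and the digits unchanged
    Result := Result xor Cardinal(Ord(s[i]) or $20);
    Result := Cardinal(Result * Cardinal(16777619)); // FNV-1a 32-bit prime
  end;
end;

begin
  // Letters differing only by case hash identically
  WriteLn(ApproxLowerHash('Hello') = ApproxLowerHash('hELLO')); // TRUE
  // The price of the approximation: '[' ($5B) also collides with '{' ($7B)
  WriteLn(ApproxLowerHash('a[b') = ApproxLowerHash('a{b'));     // TRUE
end.
```

The second line shows why the lower case is only “approximate”: a few non-letter characters are folded together as well. For a hash that is acceptable, since the equality comparison that follows a hash match still has to do a proper case-insensitive compare.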

To figure out the trick, one needs to check the ASCII table. (more…)

The Curiouser and Curiouser Case of Case-Insensitive Tweaks

Recent commits to the DWScript repository doubled the compiler performance when compiling many small scripts, as happens in the unit test suites.

This started from a first profiling run where the memory allocations around the UnicodeLowerCase function came out as the top bottlenecks.

Thing is, Pascal being a case-insensitive language, there are lots of case-insensitive comparisons, lookups, searches and hashes, and it turns out a key hash code was computed with code like

(more…)