Efficient String Building in Delphi

Previous: Benchmark Results

Why are Plain Delphi Strings doing so Well?

anonymous_bullShould not simple String concatenation fail like Schlemiel the Painter?

Two things: Copy-On-Write and FastMM. Both work hand in hand there.

Copy-On-Write is what makes Delphi String different, it gives them both the advantages of immutability and those of mutability, which are leveraged here.

FastMM on the other hand performs automatic speculative allocation for buffers that grow, and mutable strings that are appended to are just buffers that grow.

Why is TStringBuilder slow?

Some will say that is because it was just ripped from .Net or Java, where constraints are difference, and where it’s used as a trick to work around String immutability. While there is some truth is that, TWriteOnlyBlockStream is using a structurally similar approach, yet leads comfortably.

No, one of the main reasons is because the implementer of TStringBuilder ran afoul of implicit workloads:

  • most of the Append overloads work by creating a small local string and appending it, which means an exception frame, a string allocation and de-allocation each time (in contrast, the trivial implementations reuse the same local variable each time)
  • Append(String) looks innocent, but is choke-full of implicitness, just look at it in the CPU view.

Last but not least, the SetLength implementation, which is invoked from each Append call, is just not very efficient. For instance it does two checks on the value where one guard check could suffice, and it systematically enters an exception block that is only useful when growing the buffer.

So even if you pre-allocate the buffer, you still pay for most of the overhead of TStringBuilder, which is why pre-allocating doesn’t have a magical effect. Buffer growth isn’t the bottleneck (and would not be under FastMM anyway).

 

Check the followup article: Going Multi-Threaded.

10 thoughts on “Efficient String Building in Delphi

  1. Yes TTextWriter is fast, however it’s dealing with utf-8, so wouldn’t be “fair” with other methods, and the performance is very close to TWriteOnlyBlockStream anyway (slightly behind in the 10 and 10000 tests, slightly ahead in the 100 and 1000 tests, but the deltas are very minor in all cases).

  2. HVStringData performs half-way between trivial string concatenation and TStringBuilder. AFAICT it uses a strategy similar to TWriteOnlyBlockStream, but with a buffer size (Chars) way too small, so it ends up bottle-necking on reference counting (UStrClr) and the memory manager.

  3. I suspect in multithread, both TWriteOnlyBlockStream and TTextWriter.Add(aInteger) would shine, since neither the two do allocate any temporary string.

    What make TWriteOnlyBlockStream a bit more efficient is the fact that it allocates memory blocks via a linked list.

    But on the other hand, TWriteOnlyBlockStream will enforce all data to fit in memory, whereas TTextWriter is more versatile, and is able to flush its content by chunk into any external TStream – e.g. a file. For instance, we use TTextWriter for our logging features, while I would not use TWriteOnlyBlockStream for it. TTextWriter is also the base class for all our JSON generation. And I like very much the TTextWriter.CancelLastChar method: pretty useful you want to ignore a trailing ‘,’ in your content. 🙂

  4. Yep, the lack of CanceLastChar is a limitation. But data isn’t really enforced in memory, since it’s “write-only”, it can be flushed at any time to another stream or disk (the size then becomes a partial size though).

    And yes, for integers and multi-threaded scenarios, both outshine their competitors by as many CPU cores as they can grab 😉
    Also TWOBS only has an Int64 converter (since it was made for DWScript which only deals in Int64), which is why TTextWriter comes slightly on top for the 100 & 1000 iterations tests.

  5. What time measurements are you using? Using a TStringStream, 100,000 iterations consistently takes about 50 ms on my not-very-new i7 before doing any normalisation.

    for i := 1 to aCount do
    lStream.WriteString(#13#10’Eating apple #’ + IntToStr(i));
    Result := lStream.DataString;

  6. @Bruce Timings are the minimum run times of 15 runs, each run being for 100k iterations (so the 10k test is executed 10 times). Using a single WriteString (as in your snippet) rather than two (as in mine) is about 10% faster here, and is in the 50 ms range as well.
    Note that your snippet cuts the TStringStream stress in half, and leverages regular String concatenation instead for the other half.

  7. Thanks.

    The concatenation was an oversight.

    I’ll follow up by e-mail for some more details.

Comments are closed.