As a follow-up to the String Concatenation article, let’s take a look at a less trivial case: what if instead of concatenating a couple of strings, you want to concatenate a few hundred?
Sounds like a task at which TStringBuilder should excel, but one should never assume and should always measure.
Eating Lots of Apples
While some drink bottles of beer, we will eat apples instead.
Here is the Trivial version:
Result := '';
for i := 1 to NB do
  Result := Result + #13#10'Eating apple #' + IntToStr(i);
The TStringBuilder and other object versions are a bit longer, but to keep things short, I’m not reproducing the variable declarations or the try..finally for constructor/destructor.
for i := 1 to NB do
  sb.Append(#13#10'Eating apple #').Append(i);
Result := sb.ToString;
I made two variants, one without pre-allocation, and another with pre-allocated buffer (through the Capacity property).
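For reference, here is a minimal sketch of what the full TStringBuilder version could look like once the declarations and the try..finally are put back. It is not the exact benchmark code: the function name EatApples, the aCapacity parameter, the NB value and the capacity estimate are all assumptions made for illustration, and the unit names assume a Delphi version with unit scope names (XE2 or later).

```pascal
program EatApplesBuilder;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Builds the apple list with TStringBuilder.
// aCapacity > 0 pre-allocates the buffer through the Capacity property,
// aCapacity = 0 keeps the default capacity (hypothetical convention).
function EatApples(aCount, aCapacity: Integer): string;
var
  sb: TStringBuilder;
  i: Integer;
begin
  sb := TStringBuilder.Create;
  try
    if aCapacity > 0 then
      sb.Capacity := aCapacity; // pre-allocated variant
    for i := 1 to aCount do
      sb.Append(#13#10'Eating apple #').Append(i);
    Result := sb.ToString;
  finally
    sb.Free;
  end;
end;

const
  NB = 100; // hypothetical iteration count

begin
  // Without pre-allocation
  Writeln(Length(EatApples(NB, 0)));
  // With pre-allocation: 16 characters of constant text per line,
  // plus a generous guess of 8 characters for the number
  Writeln(EatApples(NB, NB * 24) = EatApples(NB, 0));
end.
```

Both variants produce the same string; the only difference is whether the buffer has to grow while appending.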
Similarly, you can have a TStringStream version (using WriteString in place of Append, and DataString in place of ToString).
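A minimal sketch of the TStringStream variant, under the same caveats: the EatApples name is hypothetical, and the stream is created with the default encoding, which is fine for this plain ASCII text but would be an assumption to revisit for other content. It uses two WriteString calls per iteration, with IntToStr handling the integer since WriteString only takes strings.

```pascal
program EatApplesStream;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Builds the apple list by writing into a TStringStream.
function EatApples(aCount: Integer): string;
var
  ss: TStringStream;
  i: Integer;
begin
  ss := TStringStream.Create('');
  try
    for i := 1 to aCount do
    begin
      ss.WriteString(#13#10'Eating apple #'); // constant part
      ss.WriteString(IntToStr(i));            // number, converted first
    end;
    Result := ss.DataString;
  finally
    ss.Free;
  end;
end;

begin
  Writeln(EatApples(3));
end.
```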
Finally, for the Format function lovers, there is a Trivial Format as well:
Result := '';
for i := 1 to NB do
  Result := Result + Format(#13#10'Eating apple #%d', [i]);
And just for the fun of it, I made a version with DWScript’s TWriteOnlyBlockStream (yes, that is a mouthful) whose code is similar to the TStringBuilder and TStringStream contenders.
Okay, ladies and gentlemen, place your bets, let the drum roll end, and let’s see the benchmark results.
TTextWriter from Arnaud Bouchez doesn’t have these problems AFAIK
Yes, TTextWriter is fast; however, it deals with UTF-8, so it wouldn’t be “fair” to compare it with the other methods, and the performance is very close to TWriteOnlyBlockStream anyway (slightly behind in the 10 and 10000 tests, slightly ahead in the 100 and 1000 tests, but the deltas are very minor in all cases).
Hi Eric,
For the fun of it, you could try to see how this one performs:
http://omnithreadlibrary.googlecode.com/svn/trunk/src/HVStringData.pas
🙂
Or even this version, and using the AppendResidentBuf for the constant strings.
http://cc.embarcadero.com/item/18276
HVStringData performs half-way between trivial string concatenation and TStringBuilder. AFAICT it uses a strategy similar to TWriteOnlyBlockStream, but with a buffer size (Chars) way too small, so it ends up bottle-necking on reference counting (UStrClr) and the memory manager.
I suspect that in multithreaded code, both TWriteOnlyBlockStream and TTextWriter.Add(aInteger) would shine, since neither of the two allocates any temporary string.
What makes TWriteOnlyBlockStream a bit more efficient is the fact that it allocates memory blocks via a linked list.
But on the other hand, TWriteOnlyBlockStream will enforce all data to fit in memory, whereas TTextWriter is more versatile, and is able to flush its content in chunks into any external TStream – e.g. a file. For instance, we use TTextWriter for our logging features, while I would not use TWriteOnlyBlockStream for it. TTextWriter is also the base class for all our JSON generation. And I like very much the TTextWriter.CancelLastChar method: pretty useful when you want to ignore a trailing ‘,’ in your content. 🙂
Yep, the lack of CancelLastChar is a limitation. But data isn’t really enforced in memory, since it’s “write-only”, it can be flushed at any time to another stream or disk (the size then becomes a partial size though).
And yes, for integers and multi-threaded scenarios, both outshine their competitors by as many CPU cores as they can grab 😉
Also TWOBS only has an Int64 converter (since it was made for DWScript which only deals in Int64), which is why TTextWriter comes slightly on top for the 100 & 1000 iterations tests.
What time measurements are you using? Using a TStringStream, 100,000 iterations consistently takes about 50 ms on my not-very-new i7 before doing any normalisation.
for i := 1 to aCount do
  lStream.WriteString(#13#10'Eating apple #' + IntToStr(i));
Result := lStream.DataString;
@Bruce Timings are the minimum run times of 15 runs, each run being for 100k iterations (so the 10k test is executed 10 times). Using a single WriteString (as in your snippet) rather than two (as in mine) is about 10% faster here, and is in the 50 ms range as well.
Note that your snippet cuts the TStringStream stress in half, and leverages regular String concatenation instead for the other half.
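The measurement scheme described above (minimum of 15 runs, each run totalling 100k iterations) could be sketched roughly as follows. This is not the actual benchmark harness: the MinRunMs and EatApples names are hypothetical, TStopwatch is just one convenient timer, and the trivial concatenation is used as a stand-in workload.

```pascal
program MinTiming;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics;

const
  RUNS = 15;                 // number of timed runs, the minimum is kept
  TOTAL_ITERATIONS = 100000; // iterations per run, whatever the test size

// Stand-in workload: the trivial concatenation version.
function EatApples(aCount: Integer): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to aCount do
    Result := Result + #13#10'Eating apple #' + IntToStr(i);
end;

// Returns the fastest run, in milliseconds. A test of aCount iterations
// is repeated TOTAL_ITERATIONS div aCount times per run, so the 10k test
// is executed 10 times per run.
function MinRunMs(aCount: Integer): Int64;
var
  run, rep: Integer;
  sw: TStopwatch;
begin
  Result := -1;
  for run := 1 to RUNS do
  begin
    sw := TStopwatch.StartNew;
    for rep := 1 to TOTAL_ITERATIONS div aCount do
      EatApples(aCount);
    sw.Stop;
    if (Result < 0) or (sw.ElapsedMilliseconds < Result) then
      Result := sw.ElapsedMilliseconds;
  end;
end;

begin
  Writeln(Format('10k test: %d ms', [MinRunMs(10000)]));
end.
```

Keeping the minimum rather than the average filters out runs disturbed by other activity on the machine.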
Thanks.
The concatenation was an oversight.
I’ll follow up by e-mail for some more details.