The limitations of Delphi’s “inline”
Sometimes, the simplest-looking code can cause the Delphi compiler to stumble.
I ran into such a case recently, and simplified it to a bare-bones version that still exhibits the issue.
DWScript includes a debugging facility, in the form of the IDebugger interface. The TdwsSimpleDebugger component implements that interface and can be used to simply surface the events.
I recently came across a post by François on FieldByName performance, and was a bit surprised by the magnitude of the speedups reported by Marco Cantu in a comment.
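For context, the classic speedup being discussed is resolving the TField once rather than calling FieldByName on every record. A minimal sketch (the dataset parameter and the 'Amount' field name are made up for illustration):

uses
   DB;

function SumAmounts(DataSet : TDataSet) : Double;
var
   fldAmount : TField;
begin
   Result := 0;
   // FieldByName performs a by-name lookup each time it is called,
   // so resolve the TField once, outside of the loop...
   fldAmount := DataSet.FieldByName('Amount');
   DataSet.First;
   while not DataSet.Eof do begin
      // ...and use the TField reference directly for each record
      Result := Result + fldAmount.AsFloat;
      DataSet.Next;
   end;
end;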
While making the rounds of “compiler magic” functions, I came across iif, the ternary operator, which for instance Prism, VB and others support. It looks like:
function iif (boolean; expressionIfTrue; expressionIfFalse) : value;
One part of the magic goes into the type-checking; the other part, which interests me here, is that in a regular function call, all parameters are evaluated once before the call is made.
For iif, either expressionIfTrue or expressionIfFalse is evaluated, but not both. This means you can have code such as:
v := iif( Assigned(myObject), myObject.VirtualGetMethod, zero );
With a regular function (without compiler magic), if myObject isn’t assigned, you would get an exception, as myObject.VirtualGetMethod would be invoked regardless. There are obviously countless other uses where the expressions have side-effects.
It occurred to me that in DWS, that “magic” is not only already available to internal “magic functions”, but that it could also easily be offered in the language and made no longer magic at all. It could just be another calling convention, in which you wouldn’t pass a value or a reference, but something akin to a “light” anonymous expression parameter.
Would it be worth it?
Such a parameter could be qualified with uses for instance (to reuse an existing keyword) rather than var or const.
function iif( bool : Boolean; uses ifTrue, ifFalse : Variant) : Variant;
begin
   if bool then
      Result := ifTrue
   else Result := ifFalse;
end;
This would declare the iif “magic” function on variants.
Nothing would limit you to invoking a uses expression only once, so for instance:
procedure PrintNFloats(n : Integer; uses needFloat : Float);
begin
   while n>0 do begin
      Print(needFloat);
      Dec(n);
   end;
end;

PrintNFloats(10, Random); // will print 10 different random numbers
And you could use the caller’s capture for side-effects, for instance by combining a var parameter and a uses expression parameter that makes use of that variable.
procedure SkipEmpty(var iter : Integer; maxIter : Integer; uses needStr : String);
begin
   while (iter<=maxIter) and (needStr='') do
      Inc(iter);
end;
...
SkipEmpty(iter, sl.Count-1, sl[iter]);    // with a TStrings
SkipEmpty(iter, High(tbl), tbl[iter]);    // with an array
Unlike with anonymous functions, the capture is thus explicitly declarative, and also strictly hierarchical (it’s only valid in functions directly called from your function). That’s a drastic limitation, so such a syntax isn’t intended for out-of-context tasks (like closures), but for local sub-tasks, which you also guarantee will stay local (something that anonymous methods can’t guarantee).
And as a final sample, in the example above, if you wanted to treat the ‘hello’ and ‘world’ strings as empty strings for SkipEmpty, you could use:
SkipEmpty(iter, sl.Count-1, iif(sl[iter] in ['hello', 'world'], '', sl[iter]) );
You could thus chain the expression parameters to introduce some non-traditional (for Delphi code) behaviors.
All in all, this could cover a variety of scenarios for default values, conditional alternatives and iterators, with much more restricted capabilities than full-blown anonymous methods, but hopefully with less scope for confusion than anonymous methods offer. Still, it would introduce the possibility of complex side-effects.
Any opinions? Should the possibility be surfaced or be kept only as an internal magic?
Post Scriptum:
As Craig Stuntz & APZ noted in the comments, this is similar to Digital Mars D’s lazy parameters, and both suggested using the “lazy” keyword in place of “uses” to match. However, lazy evokes delayed evaluation that happens only once (as in “lazy binding”, etc.), something D doesn’t seem to support AFAICT with its lazy keyword (every use of the parameter leads to an evaluation, if I’m not mistaken). “uses”, on the other hand, was meant to indicate that you can “use” the parameter’s underlying expression as many times as you want. More input welcome 🙂
One of the memory hogs when you have object trees or graphs can be good old TList and its many variants (TObjectList, TList<T>, etc.).
This is especially an issue when you have thousands of lists that typically hold very few items, or are empty.
In the DWS expression tree, for instance, there are quickly thousands of such lists, for local symbol tables, parameter lists, etc.
A TList holding a single item already costs you:
So that’s 28 bytes in two dynamically allocated blocks, and those dynamic allocations, depending on your allocation alignment settings, can cost you something like an extra dozen bytes (even with FastMM).
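If you want to check the fixed, per-instance part of that cost for your own compiler version, a quick console program (my own snippet, not from the original post) will report it; the dynamic FList block comes on top of this:

program CheckListSize;
{$APPTYPE CONSOLE}
uses
   Classes;
begin
   // InstanceSize reports the size of the TList object itself,
   // excluding the dynamically allocated items block
   Writeln('TList instance size: ', TList.InstanceSize, ' bytes');
end.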
What about the other TList variants?
Neither of these is a better candidate for memory-efficient small lists.
You can find TTightList in DWS’s dwsUtils unit.
For an empty list, the cost is 8 bytes with no dynamic memory, and for a list with a single item, still 8 bytes with no dynamic memory.
For an n-item list, the cost is 8 bytes plus one dynamic block of n*4 bytes.
To achieve that, TTightList makes use of two tricks: it doesn’t maintain a capacity field (the dynamic block always holds exactly as many items as the count), and for a single-item list it stores the item directly in the field that would otherwise point to the dynamic block.
The second trick is where we sacrifice a bit of performance, to save one dynamic allocation for lists holding a single item. Though if you benchmark the TTightList, you’ll see it holds its own fairly well against TList for the smaller item counts, which is what it was designed for.
That’s partly thanks to TList‘s own inefficiencies, and to FastMM’s in-place reallocation (on which TTightList relies, since it doesn’t maintain a capacity field).
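To make the single-item trick concrete, here is a minimal sketch of such a layout; it is my own illustration of the idea, not the actual dwsUtils code, and it leaves out bounds checking and item removal, and assumes zero-initialized fields (as when embedded in a class instance):

type
   PPtrArray = ^TPtrArray;
   TPtrArray = array [0..MaxInt div 16 - 1] of Pointer;

   TTinyList = record
      FList : Pointer;     // the single item itself, or a pointer to the items block
      FCount : Integer;
      procedure Add(item : Pointer);
      function Get(index : Integer) : Pointer;
      procedure Clear;
   end;

procedure TTinyList.Add(item : Pointer);
var
   p : PPtrArray;
begin
   case FCount of
      0 : FList := item;                                 // no allocation: the field holds the item
      1 : begin
         GetMem(p, 2*SizeOf(Pointer));                   // first allocation only when a 2nd item arrives
         p^[0] := FList;
         p^[1] := item;
         FList := p;
      end;
   else
      ReallocMem(FList, (FCount+1)*SizeOf(Pointer));     // block holds exactly Count items, no spare capacity
      PPtrArray(FList)^[FCount] := item;
   end;
   Inc(FCount);
end;

function TTinyList.Get(index : Integer) : Pointer;
begin
   if FCount = 1 then
      Result := FList
   else Result := PPtrArray(FList)^[index];
end;

procedure TTinyList.Clear;
begin
   if FCount > 1 then
      FreeMem(FList);                                    // only multi-item lists own a dynamic block
   FList := nil;
   FCount := 0;
end;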
A compiled script, a TdwsProgram, cannot be saved to a file, and will not ever be. Why is that?
This is a question that dates back to the first DWS; as it was re-asked recently, I will lay out the rationale here.
I did some quick benchmarking against PascalScript and Delphi itself.
I generated a script based on the following template:
var myVar : Integer;
begin
   myVar:=2*myVar-StrToInt(IntToStr(myVar));
end;
The assignment line appears only once, 100 times, 1000 times, etc. The result was saved to a file, and the benchmark consisted of loading the file, compiling and then running it for DWS. For PascalScript, the times are broken down into compiling, loading the bytecode output from a file, and then running that bytecode. Disk size indicates the size of the generated bytecode.
All times are in milliseconds (and have been updated, see Post-Scriptum below):
For line counts expected for typical scripts (less than 1000), compared to PascalScript, the cost of not being able to save to a bytecode is a one-time hit in the sub-15 milliseconds range, on the first run.
This illustrates why it is not really worth the trouble of maintaining a bytecode version for scripting purposes, and that is also my practical experience.
For larger scripts, it is expected that execution complexity will dwarf the compile time: the benchmark code tested here doesn’t have any loops; anything more real-life will have loops, and will likely have a greater runtime/compile-time ratio.
For reference, I tried compiling the larger line-count versions with Delphi XE, from the IDE.
What else? The DWS compiler has a higher initial setup cost than PascalScript, but as code size grows, it starts pulling ahead. That setup overhead will nevertheless bear some investigation 😉.
Once compiled, the 10x execution speed advantage of DWS over PascalScript is consistent with other informal benchmarks.
I took a quick look at the setup overhead with SamplingProfiler, and found two bottlenecks/bugs. The outcome was shaving off 3 ms from the DWS compile times, i.e. the compile times for the 1, 100 and 1000 line cases are now 0.95 ms, 2.85 ms and 19.1 ms respectively.
Passing parameters as “const” is a classic Delphi optimization trick, but the mechanisms behind that “trick” go beyond cargo-cult recipes, and may actually stumble into the “good practice” territory.
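As a minimal illustration of the trick in question (the routine is a made-up example, not from the post): for a managed type such as String, a by-value parameter forces a reference-count increment and an implicit try/finally in the callee, whereas const lets the compiler skip both.

// "const" tells the compiler the callee won't modify the string,
// so no reference-count bump and no implicit try/finally are generated;
// with a plain "s : String" parameter, both would be.
function CountSpaces(const s : String) : Integer;
var
   i : Integer;
begin
   Result := 0;
   for i := 1 to Length(s) do
      if s[i] = ' ' then
         Inc(Result);
end;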
A recurring subject when it comes to freeing objects and preventing their accidental reuse is whether you should just .Free them, thus leaving an invalid reference that should however never be used anymore when the code design is correct, or whether you should defensively FreeAndNil() them, thus leaving a nil value that will hopefully trigger AVs more often on improper usage after release.
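A minimal side-by-side illustration (TStringList is just a stand-in for any object):

uses
   SysUtils, Classes;

procedure Demo;
var
   obj : TStringList;
begin
   obj := TStringList.Create;
   obj.Free;           // obj still holds the old, now invalid, reference
   // obj.Add('x');    // might appear to work, corrupt memory, or crash unpredictably

   obj := TStringList.Create;
   FreeAndNil(obj);    // obj is set to nil before the instance is destroyed
   // obj.Add('x');    // now reliably raises an access violation near nil
end;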
SamplingProfiler has a few options to help profile a multi-threaded application, which I’ll go over here.
In the current version, those options allow identifying CPU-related bottlenecks, as in “threads taking too much CPU resources or execution time”. However, they do not provide many clues yet to pinpoint bottlenecks arising from thread synchronization issues or serialization (insufficient parallelism). Hopefully, more support for profiling multi-threaded applications will come in future versions.
Code optimization can sometimes be experienced as a lengthy process, with disruptive effects on code readability and maintainability. For effective optimization, it is crucial to focus efforts on areas where minimal work and minimal changes will have the most impact, i.e. go for the jugular.