Faster, smaller, safer

Here is a summary of recent changes for DWScript, available in the SVN version:

  • faster: compilation is now about 30% faster in situations like that benchmark, thanks to a few bug fixes (typechecks were performed multiple times) and a couple of tokenizer enhancements.
  • smaller: reduced memory usage for compiled scripts (about 15% in the infamous benchmark, which translates into an execution speedup of around 5% once a few hundred lines are involved).
  • safer: fixed an old issue with object reference cycles, which weren’t covered by the reference counting and thus could be leaked.
  • basic support for “for..in”, at the moment limited to the case of enumerations (for <element> in <enumeration>).
  • fixed compile error source locations for some issues, plus various minor fixes and enhancements.
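
As a minimal sketch of the “for..in” form mentioned above, assuming a hypothetical TColor enumeration and that Ord and IntToStr are available to the script:

```pascal
type
   TColor = (clRed, clGreen, clBlue);   // hypothetical enumeration, for illustration only

var c : TColor;

// for <element> in <enumeration>: the loop variable takes each value of the enumeration
for c in TColor do
   PrintLn(IntToStr(Ord(c)));
```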

DWScript preview 4

The last DWS preview zip was already a while back, so I posted a new preview zip of Delphi Web Script for the SVN-averse.
For the changes since the previous zip, you may want to check here, here and here, and while you’re at it, you may as well check the following:

  • reactivated the COM Connector, which allows connecting to arbitrary COM/OLE objects from within a script. More tests are required.
  • introduced initial support for the “in” operator. At the moment, it merely allows use of the “case..of” syntax in boolean expressions, with no specific optimizations, f.i.
    if (n in [4, 6..8, 10]) then ...
    if (s in ['abc', 'def']) then ...
  • open arrays can now be declared in script functions, as “const someName : array of const” parameters. They behave like arrays of variants from a type-checking point of view.
  • added basic support for default values in the shorthand notation used to register internal functions; this is used for Inc and Dec at the moment.
  • introduced constant unification, which is both a memory and a performance optimization; it’s still largely experimental at the moment.
  • various other optimizations and bug fixes.
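
To illustrate what “use of the case..of syntax in boolean expressions” means for the “in” operator above, here is a small sketch that writes the same membership test both ways. The variable and its value are arbitrary, and this is only an illustration of the described behavior, not a statement about how the compiler implements it:

```pascal
var n : Integer = 7;   // arbitrary sample value

// boolean expression using the case-like list of values and ranges
if n in [4, 6..8, 10] then
   PrintLn('in the list');

// the same test written as a case statement
case n of
   4, 6..8, 10 : PrintLn('in the list');
end;
```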

Thanks go to Alexey Kazantsev for the testing efforts!

Optimizing for memory: a tighter TList

One of the memory hogs when you have object trees or graphs can be good old TList and its many variants (TObjectList, TList<T>, etc.).

This is especially an issue when you have thousands of lists that typically hold very few items, or are empty.
In the DWS expression tree f.i. there are quickly thousands of such lists, for local symbol tables, parameter lists, etc.

How much does a TList cost in terms of memory?

A TList holding a single item already costs you:

  • 4 bytes for the field in the owner object
  • 20 bytes for the TList instance
    • 8 hidden bytes: Monitor + VMT pointer
    • 12 field bytes: data pointer + Count + Capacity
  • 4 bytes for the data

So that’s 28 bytes, with two dynamically allocated blocks, and those dynamic allocations, depending on your allocation alignment settings, can cost you something like an extra dozen bytes (even with FastMM).

What about the other TList variants?

  • TObjectList has an extra boolean field, which with alignment, costs an extra 4 bytes.
  • TList<T> has an instance size of 28 bytes, and a dynamic array storage with 8 hidden extra bytes (4 for length, 4 for the refcount).

Neither of these is a better candidate for memory-efficient small lists.

Enter the TTightList

You can find TTightList in DWS’s dwsUtils unit.
For an empty list, the cost is 8 bytes, no dynamic memory, and for a list with a single item, 8 bytes still with no dynamic memory.
For an n-item list, the cost is 8 bytes plus one dynamic block of n*4 bytes.

To achieve that, the TTightList makes use of two tricks:

  • it’s designed to be composed, and hosted as an object field
    • it’s a record-with-methods, not a class, but retains classic-looking use semantics (.Add, .IndexOf, .Clear etc.).
    • eliminates the need for a pointer to the instance in the host object
    • eliminates the dynamically allocated storage for the TTightList itself
  • if the Count is one, the array pointer itself points to the only item, rather than to a dynamically allocated block holding a pointer to the item.

The second trick is where we sacrifice a bit of performance, to save one dynamic allocation for lists holding a single item. Though if you benchmark the TTightList, you’ll see it holds its own fairly well against TList for the smaller item counts, which is what it was designed for.
That’s partly thanks to TList’s own inefficiencies, and FastMM’s in-place reallocation (on which TTightList relies, since it doesn’t maintain a capacity field).
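
The two tricks can be sketched as follows. This is a simplified illustration and not the actual TTightList code from dwsUtils: the TTinyList name, its fields and its methods are hypothetical, and the 8-byte size assumes 32-bit pointers.

```pascal
program TinyListSketch;
{$APPTYPE CONSOLE}

type
  PItemArray = ^TItemArray;
  TItemArray = array[0..MaxInt div 16 - 1] of Pointer;

  // Hypothetical record-with-methods, meant to be embedded as a field of its owner.
  TTinyList = record
    FItems : Pointer;   // Count = 1: the item itself; Count > 1: a block of item pointers
    FCount : Integer;
    procedure Add(item : Pointer);
    function Get(index : Integer) : Pointer;
    procedure Clear;
  end;

procedure TTinyList.Add(item : Pointer);
var
  single : Pointer;
begin
  case FCount of
    0 : FItems := item;   // single item stored in place, no allocation
    1 : begin
          // second item: move from in-place storage to a dynamic block
          single := FItems;
          GetMem(FItems, 2 * SizeOf(Pointer));
          PItemArray(FItems)^[0] := single;
          PItemArray(FItems)^[1] := item;
        end;
  else
    // no capacity field: grow by one slot and rely on the memory manager
    ReallocMem(FItems, (FCount + 1) * SizeOf(Pointer));
    PItemArray(FItems)^[FCount] := item;
  end;
  Inc(FCount);
end;

function TTinyList.Get(index : Integer) : Pointer;
begin
  Assert((index >= 0) and (index < FCount));
  if FCount = 1 then
    Result := FItems
  else
    Result := PItemArray(FItems)^[index];
end;

procedure TTinyList.Clear;
begin
  if FCount > 1 then
    FreeMem(FItems);   // only lists with more than one item own a block
  FItems := nil;
  FCount := 0;
end;

var
  list : TTinyList;   // global, hence zero-initialized, like a field of a new object
  a, b : Integer;
begin
  list.Add(@a);                    // still no dynamic memory
  list.Add(@b);                    // first allocation happens here
  WriteLn(list.Get(1) = @b);
  WriteLn(SizeOf(TTinyList));
  list.Clear;
end.
```

The extra branch on the count in Get is the small performance price mentioned above for saving the allocation in the single-item case.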

Why no bytecode format?

A compiled script, a TdwsProgram, cannot be saved to a file, and will not ever be. Why is that?

This question dates back to the first DWS. As it was re-asked recently, I will lay out the rationale here.

  • DWS has a very fast compiler, so there is no performance problem in compiling scripts instead of loading a binary representation that has to be de-serialized. How fast is it? See below.
  • DWS lets you define custom filters, that enable you to encrypt your scripts easily, if hiding the script source is what you were after with the bytecode.
  • The DWS compiler/parser portion is quite light (currently less than 75kB), especially compared to the size of the Delphi libraries you will be using for the runtime. You probably will not notice it in the EXE size once you expose more than a few trivial libraries.
  • Last but not least, when loading a binary representation of a script, you have to make sure all libraries are compiled into the application that loads and wants to execute the script, and that they are entirely backward-compatible with what was exposed to the script back when it was compiled. That is irrelevant when re-compiling.

How fast is the DWS compiler?

I did some quick benchmarking against PascalScript and Delphi itself.
I generated a script based on the following template:

var myvar : Integer;
begin
   myVar:=2*myvar-StrToInt(IntToStr(myvar));
end;

The assignment line being there only once, 100 times, 1000 times, etc. The result was saved to a file, and the benchmark consisted in loading the file, compiling and then running it for DWS. For PascalScript, the times are broken down into compiling, loading the bytecode output from a file, and then running that bytecode. Disk size indicates the size of the generated bytecode.
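
For reference, a script of a given line count can be produced from the template with a few lines of Delphi. This is only a sketch of one way to do it, not the generator that was actually used; the GenerateScript name and the output file name are made up for the example:

```pascal
program GenBench;
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

// Hypothetical helper: writes the template with the assignment line repeated lineCount times.
procedure GenerateScript(lineCount : Integer; const fileName : String);
var
  sl : TStringList;
  i : Integer;
begin
  sl := TStringList.Create;
  try
    sl.Add('var myvar : Integer;');
    sl.Add('begin');
    for i := 1 to lineCount do
      sl.Add('   myVar:=2*myvar-StrToInt(IntToStr(myvar));');
    sl.Add('end;');
    sl.SaveToFile(fileName);
  finally
    sl.Free;
  end;
end;

begin
  GenerateScript(1000, 'bench1000.pas');   // arbitrary file name
end.
```
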
All times are in milliseconds (and have been updated, see Post-Scriptum below):

For line counts expected for typical scripts (less than 1000), compared to PascalScript, the cost of not being able to save to a bytecode is a one-time hit in the sub-15 milliseconds range, on the first run.
This illustrates why it is not really worth the trouble maintaining a bytecode version for scripting purposes, and that is also my practical experience.

For larger scripts, it is expected the execution complexity will dwarf the compile time: the benchmark code tested here doesn’t have any loops, anything more real-life will have loops, and will likely have a greater runtime/compiletime ratio.

What of Delphi?

For reference, I tried compiling the larger line counts versions with Delphi XE, from the IDE.

  • the 100k lines case took 3 minutes 27 seconds to compile (ouch!), obviously hitting some Delphi parser or compiler limitation. Runtime was 63 ms.
  • the 10k lines case in Delphi compiled in a more reasonable 2400 ms, and ran in 4 ms (50% faster than DWS).

What else? The DWS compiler has an initial setup cost higher than PascalScript, but as code size grows, it starts pulling ahead. That setup overhead will nevertheless bear some investigation 😉.
Once compiled, the 10x execution speed ratio advantage of DWS vs PascalScript is consistent with other informal benchmarks.

Post-Scriptum

Gave a quick look at the setup overhead with SamplingProfiler, and found two bottlenecks/bugs. The outcome was the shaving off of 3 ms from the DWS compile times, i.e. the compile times for the 1, 100 and 1000 lines cases are now 0.95 ms, 2.85 ms and 19.1 ms respectively.

DWS weekly news

Here is a quick summary of what changed in the SVN since the last update:

  • array variables can now be initialized (with [] to enclose values), i.e. the code below is now supported:
      var abc : array [1..3] of String = ['a', 'b', 'c'];
      abc := [ 'hello', 'world', '!'];
  • dwsClassesLibModule now exposes a minimalistic TStringBuilder.
  • added dwsRunner to the demos, a simple command-line runner for DWScript.
  • fixes to const parameters and constant arrays, though those don’t benefit from optimizations at this point.
  • trivial infinite loops are detected and warned about (repeat or while loops with an empty body and a constant condition).
  • upgraded dwsResult and dwsHTMLFilter to use TStringBuilder internally.
  • added MaxRecursionDepth to TdwsConfig, allows limiting the maximum number of recursive calls a script is allowed to perform.
  • various fixes and minor additions.
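
As an illustration of the “trivial infinite loop” case described above (empty body, constant condition), each of the two loops below, taken on its own, is the kind of code meant; the exact wording of the warning is not shown here:

```pascal
// while loop with an empty body and a constant condition
while True do ;

// repeat loop with an empty body and a constant condition
repeat
until False;
```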

DWScript SVN: method keyword and other changes

A quick update on recent changes in the SVN:

  • introduced support for Prism-like ‘method’ keyword, which can be used as an alternative for ‘procedure’ and ‘function’ for class methods.
    procedure TMyClass.MyMethodWithNoResult;
    function TMyClass.ReturnsAString : String;

    can now alternatively be written as

    method TMyClass.MyMethodWithNoResult;
    method TMyClass.ReturnsAString : String;

    both forms are supported, though if you declared it as a ‘method’ in the class, you’ll have to implement it as a ‘method’.

  • local ‘const’ sections are now accepted in a procedure body, and you can have multiple ‘const’ and ‘var’ sections before ‘begin’.
  • type inference now accepted for variables in the ‘var’ section of a procedure body.
  • minor compiler optimizations and fixes
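
The local sections and type inference items above could look like the sketch below. The procedure and its names are made up, and the exact form of an inferred declaration inside a ‘var’ section is an assumption based on the inline syntax:

```pascal
procedure Demo;   // hypothetical procedure, for illustration only
const
   cPrefix = 'item ';
var
   count : Integer;
const
   cLimit = 3;        // a second const section before 'begin'
var
   total := 0;        // assumed syntax: type inferred from the initial value
begin
   for count := 1 to cLimit do begin
      total := total + count;
      PrintLn(cPrefix + IntToStr(total));
   end;
end;

Demo;
```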

Abstracted file system for DWS, and other news

The DWScript SVN version introduced a file system abstraction (via the dwsFileSystem unit). This is both a security fix (restrict what scripts see and what the compiler can include) and an enhancement. You can f.i. run your scripts within a virtualized file system limited to a directory, a database, or whatever. At the moment this is still quite experimental and subject to change.

In other news, only in the SVN:

  • format() function: it is seen as taking an array of variant from the script side, but on the Delphi side that array can be seen as a TVarRec array (Delphi’s array of const).
  • multiple fixes for mis-detected or improper compile-time error messages (thanks Alexey Kazantsev!)
  • unit test coverage for the core compiler and runtime units is now around 70%, still some way to go, but improving.
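
From the script side, a call to the format() function mentioned above would look something like this minimal sketch (the values are arbitrary, and the format specifiers are assumed to follow Delphi’s Format):

```pascal
var count : Integer = 3;   // arbitrary sample value

// the bracketed values form the array argument seen by Format
PrintLn(Format('%s has %d items', ['list', count]));
```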

DWScript 2.1 preview 3 – now with type inference

A DWS 2.1 preview 3 7z archive is available on the googlecode Delphi Web Script page; you can also get the same code via SVN of course.

This version fixes several reported and unreported issues, and adds support for type inference and Delphi Prism variable initialization syntax. Before you could write code like:

var myInteger : Integer = 1234;
var myObj : TSomeObject = TSomeObject.Create;

where ‘=’ was used as the assignment operator (using the same syntax as for constant declarations or default parameter values). You can now write the above using ‘:=’ as well

var myInteger : Integer := 1234;
var myObj : TSomeObject := TSomeObject.Create;

and you additionally can make use of type inference and write just

var myInteger := 1234;
var myObj = TSomeObject.Create;

Type inference is currently limited to the variable declaration, and this will probably stay that way for code clarity (opinions?). The inferred type isn’t restricted at the moment, but that will probably change (so you can disallow inferring a Variant type f.i.).

DWScript extended language features

DWScript has supported several extensions to the Delphi language since the beginning. Here are a few you may wish the Delphi compiler supported too (and not just Delphi Prism…):

  • generalized case of, which supports non-ordinal types; for instance you can write
    case myString of
      'hello': PrintLn('Hello!');
      'goodbye': PrintLn('See you!');
    end;
  • variables can be declared in-line anywhere in the script, allowing code like
    for i:=1 to 10 do begin
       var k : Integer;
       ...
    end;
  • variables are always initialized, not just global or managed variables
    procedure MyProc;
    var
       k : Integer;
    begin
       // k is guaranteed to be zero here (unlike in Delphi)
       ...
    end;
  • variables can be initialized to custom values upon declaration
    var obj : TMyObject = TMyObject.Create;

Delphi Web Script preview 2

A DWS 2.1 preview 2 7z archive is available on the googlecode DWScript page; you can also get the same code via SVN of course.

  • added support for the Delphi “Exit” syntax, which allows passing a return value as in Exit(“my return value”)
  • TConfiguration renamed to TdwsConfiguration, some DFM persistence tweaks (less verbose by default, should be backward compatible)
  • improved unit test coverage
  • minor tweaks to runtime or compile-time error messages
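
As a short sketch of the “Exit” syntax from the list above, with a made-up Sign function: Exit(value) sets the function result and leaves the function in one step.

```pascal
function Sign(v : Integer) : Integer;   // hypothetical function, for illustration only
begin
   if v < 0 then Exit(-1);   // return -1 immediately
   if v > 0 then Exit(1);    // return 1 immediately
   Result := 0;              // classic form, still valid
end;

PrintLn(IntToStr(Sign(-5)));
```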

As the unit tests “safety net” spreads, I’ll add support for more modern-era Delphi language additions.