Archive

Posts Tagged ‘Delphi’

Immutable strings… in Delphi?

May 13th, 2013

One of the “novelties” of the NextGen Delphi compiler is immutable strings, which I find quite puzzling, for lack of a better word, given that Delphi already had reference-counted copy-on-write strings, and the NextGen compiler uses reference-counted strings.

I always considered that Delphi’s String type was one of its remaining strong points, being a high-level abstraction (higher than Java’s or .Net’s String/StringBuilder dichotomy) with excellent low-level performance (on par with C/C++  character arrays).

From the recent discussions, it appears many don’t know what makes/made Delphi String so special, so here is a quick summary.

Immutability

String being immutable means you can keep a single reference across threads without trouble. That’s an advantage over C strings.

It also means that copying a string, be it for an assignment or for parameter passing, is just like passing a reference: you don’t have to duplicate the content to be sure it isn’t modified behind your back. That’s an advantage over C strings and StringBuilder.

Note that none of the above are advantages over Delphi Strings, since the copy-on-write mechanism means that Delphi Strings are effectively immutable once they’re referenced more than once.

Reference-counting vs Garbage Collection

Every time a string is assigned or passed as a parameter, its reference count has to be incremented. That increment is an atomic (locked) operation, and since it is part of memory management, it’s there whether you’re using simple reference-counting or copy-on-write.

Under a GC, no atomic lock is required, only a simple reference (pointer) has to be copied. This is very efficient, locally, but the memory management costs are just deferred to a later garbage collection phase. Since immutable strings don’t hold references to other objects, the GC for them can theoretically happen in parallel without any drawbacks (assuming the GC supports it).

So under a GC, an immutable String type makes a whole lot of sense, as implementing a copy-on-write one requires a lot of effort, and a mutable one is problematic multi-threading wise.

Copy-on-write mutability

Making reference-counted strings mutable doesn’t change any of the above, you just add one capability: when the reference count indicates there is no other reference to a string, then you can mutate it, ie. change characters, adjust its length, etc.

In other words, when the only reference to a string is a single variable locally scoped to a procedure, then it’s safe to do just about anything with it, the multi-threading issues can’t  apply until that string is referenced somewhere else.

This is both convenient and very efficient, since what the compiler does before applying a mutation can be summarized as:

if myString is "referenced somewhere else" then
   myString := make a local copy of my String
mutate myString

The local copy is of course referenced nowhere else, and thus is safe to mutate. Copy-on-write is really copy-on-mutate, as it encompasses not just changing the characters, but also resizing a string (re-allocations) and concatenations.
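As a minimal sketch of that behavior, assuming the classic desktop Delphi compiler (1-based, mutable strings), the program below compares the underlying pointers of two string variables before and after a mutation. The variable names are only illustrative.

```pascal
program CowDemo;
{$APPTYPE CONSOLE}

var
  a, b: string;
begin
  a := 'hello';
  UniqueString(a);   // make sure "a" owns a heap-allocated, reference-counted buffer
  b := a;            // no character is copied, both variables share one buffer
  Writeln(Pointer(a) = Pointer(b));   // TRUE: shared

  b[1] := 'j';       // the compiler makes "b" unique before writing the character
  Writeln(Pointer(a) = Pointer(b));   // FALSE: "b" now has its own copy
  Writeln(a, ' ', b);                 // hello jello
end.
```

The first assignment costs one reference-count increment; the copy only happens at the point of mutation, and only because "a" still references the original buffer.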

Keep in mind this is an “added-on” behavior, where you just take advantage of the memory management scheme being a reference-counting one. If you know what you’re doing and want more performance, you can even waive the COW check by using UniqueString(), which will ensure you have a local copy, and then acquiring a PChar to the string content.
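The UniqueString() plus PChar approach could look like the sketch below. UpperAsciiInPlace is a hypothetical helper written for this illustration, it only handles ASCII letters, and it assumes the classic desktop compiler.

```pascal
program UniqueStringDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: upper-cases ASCII letters in place.
procedure UpperAsciiInPlace(var s: string);
var
  p: PChar;
  i: Integer;
begin
  UniqueString(s);   // one explicit check: after this, "s" is referenced nowhere else
  p := PChar(s);     // raw pointer to the characters, no per-character COW check
  for i := 1 to Length(s) do begin
    if (p^ >= 'a') and (p^ <= 'z') then
      p^ := Char(Ord(p^) - 32);
    Inc(p);
  end;
end;

var
  original, work: string;
begin
  original := 'immutable?';
  work := original;          // shares the content with "original"
  UpperAsciiInPlace(work);
  Writeln(original);         // immutable?
  Writeln(work);             // IMMUTABLE?
end.
```

Skipping the UniqueString() call here would be a bug: writing through the PChar could then alter "original" as well, or hit read-only memory when the string is a literal.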

It can be done under a GC, but means you have to maintain a reference count or similar information since the GC doesn’t have one. Android relies a lot on copy-on-write, and that was actually one key differentiation between Dalvik VM and more classic Java VM.

Advantages of RawByteString & UTF8String over TBytes

And this will be a bit more controversial, but Copy-On-Write is also why RawByteString/UTF8String can ofttimes make a lot more sense than TBytes for binary buffers: RawByteString isn’t just reference-counted (like TBytes), it also supports copy-on-write.

This means that in a multi-threaded environment, RawByteString shares the same advantages of immutability String enjoys, and which TBytes just doesn’t enjoy, as TBytes is always mutable.
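The difference can be seen in a short, single-threaded sketch (variable names are illustrative): a write through a second TBytes reference is visible through the first one, while the same write on a RawByteString triggers a private copy.

```pascal
program BytesVsRawByteString;
{$APPTYPE CONSOLE}

uses
  SysUtils;   // TBytes

var
  b1, b2: TBytes;
  r1, r2: RawByteString;
begin
  SetLength(b1, 3);
  b1[0] := 1; b1[1] := 2; b1[2] := 3;
  b2 := b1;          // reference-counted, but no copy-on-write
  b2[0] := 9;        // writes into the shared buffer
  Writeln(b1[0]);    // 9: b1 changed behind its back

  r1 := 'abc';
  r2 := r1;          // shared as well
  r2[1] := 'x';      // copy-on-write: r2 gets its own buffer first
  Writeln(r1);       // abc
  Writeln(r2);       // xbc
end.
```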

Conclusion

String wraps up the advantages of both Java/.Net String & StringBuilder: it has both the multi-threading advantages of immutability and the capability of mutability.

Performance-wise, under a speculative memory manager (like most modern allocators), you’ll also find that merely concatenating to a String is typically just as fast as using TStringBuilder, and in several cases it’s actually faster because String benefits from compiler magic, while TStringBuilder does not (also some TStringBuilder implementations are a little weak).
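If you want to check this on your own compiler version, a rough timing sketch could look like the following. It assumes a Delphi version that ships TStringBuilder and TStopwatch (Delphi 2010 or later, with the Diagnostics unit reachable through the unit scope names); the function names and the repetition count are arbitrary choices, and the results will vary with the memory manager in use.

```pascal
program ConcatBench;
{$APPTYPE CONSOLE}

uses
  SysUtils, Diagnostics;

function ConcatWithString(count: Integer): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to count do
    Result := Result + 'ab';   // plain concatenation, handled by compiler magic
end;

function ConcatWithBuilder(count: Integer): string;
var
  i: Integer;
  sb: TStringBuilder;
begin
  sb := TStringBuilder.Create;
  try
    for i := 1 to count do
      sb.Append('ab');
    Result := sb.ToString;
  finally
    sb.Free;
  end;
end;

const
  Repetitions = 1000000;
var
  sw: TStopwatch;
  s: string;
begin
  sw := TStopwatch.StartNew;
  s := ConcatWithString(Repetitions);
  Writeln('String        : ', sw.ElapsedMilliseconds, ' ms, length ', Length(s));

  sw := TStopwatch.StartNew;
  s := ConcatWithBuilder(Repetitions);
  Writeln('TStringBuilder: ', sw.ElapsedMilliseconds, ' ms, length ', Length(s));
end.
```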

Alas some String performance was lost during the Unicode and 64bit transition, when some FastCode routines were replaced by lower performing pure-Pascal ones, and you’ll lose even more performance with TStringHelper, which introduces some algorithmically poor pure-Pascal implementations.


Gaining Visual Basic OLE super-powers

April 30th, 2013

Visual Basic in its various incarnations and off-springs has super-powers when it comes to OLE Automation, aka late-bound COM through IDispatch.

Super Powers?

For instance, when doing MS Word OLE automation, you can in VBS and VBA write things like

WordApp.Documents(1).Name

which in Delphi (and many others) has to be written as

WordApp.Documents.Item(1).Name

and it can also call some methods like

v1 = WordApp.CentimetersToPoints(2.5)
v2 = WordApp.InchesToPoints(2.5)

which Delphi, FreePascal, PowerBasic, Python, C++’s CComPtr and others are unable to call, and which instead result in an “unspecified error” without further description.

While trying to solve the above riddle, I found posts about it for various languages and frameworks, dating back to the beginning of the millennium with no solutions, just specific workarounds.

Until today that is, big thanks to David Heffernan and Hans Passant, two StackOverflow super-stars.

The reason is that some “methods” like CentimetersToPoints aren’t methods, they’re really indexed properties that you can’t call as indexed properties (ie.  WordApp.CentimetersToPoints[2.5] will fail in Delphi as well).

The Magic Potion

If you want to get the same super-power, the fix is basically to not follow the spec when calling IDispatch.Invoke, and instead of using DISPATCH_METHOD, use DISPATCH_METHOD or DISPATCH_PROPERTYGET.

In Delphi’s ComObjDispatchInvoke function that means the

end else
  if (InvKind = DISPATCH_METHOD) and (ArgCount = 0) and (Result <> nil) then
    InvKind := DISPATCH_METHOD or DISPATCH_PROPERTYGET;

should be changed to just

end else
  if (InvKind = DISPATCH_METHOD) then
    InvKind := DISPATCH_METHOD or DISPATCH_PROPERTYGET;

And the call will now go through. So all the previously mentioned languages and environments that follow the spec are doomed to fail.
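If you’d rather not patch ComObj, the same idea can be sketched as a standalone helper that calls IDispatch.Invoke directly with both flags. InvokeMethodOrGet is a hypothetical name made up for this illustration, it handles a single positional argument only, and it hasn’t been validated against every automation server, so treat it as a starting point rather than a drop-in replacement.

```pascal
unit DispatchBothFlags;

interface

uses
  Windows, ActiveX, ComObj, Variants;

// Hypothetical helper (not part of the RTL): invokes a late-bound member
// taking one argument, with DISPATCH_METHOD and DISPATCH_PROPERTYGET combined.
function InvokeMethodOrGet(const disp: IDispatch; const memberName: WideString;
  const arg: OleVariant): OleVariant;

implementation

function InvokeMethodOrGet(const disp: IDispatch; const memberName: WideString;
  const arg: OleVariant): OleVariant;
var
  namePtr: PWideChar;
  dispID: Integer;
  params: TDispParams;
  excepInfo: TExcepInfo;
  argErr: Integer;
  argCopy: OleVariant;
begin
  // Resolve the member name to its DISPID
  namePtr := PWideChar(memberName);
  OleCheck(disp.GetIDsOfNames(GUID_NULL, @namePtr, 1, LOCALE_USER_DEFAULT, @dispID));

  // A single positional argument, no named arguments
  argCopy := arg;
  FillChar(params, SizeOf(params), 0);
  params.rgvarg := @argCopy;
  params.cArgs := 1;

  FillChar(excepInfo, SizeOf(excepInfo), 0);
  argErr := 0;
  Result := Unassigned;

  // The key point: both flags at once, as Visual Basic appears to do
  OleCheck(disp.Invoke(dispID, GUID_NULL, LOCALE_USER_DEFAULT,
    DISPATCH_METHOD or DISPATCH_PROPERTYGET,
    params, @Result, @excepInfo, @argErr));
end;

end.
```

Assuming WordApp is an OleVariant obtained from CreateOleObject('Word.Application'), a call would then read InvokeMethodOrGet(IDispatch(WordApp), 'CentimetersToPoints', 2.5).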

Similarly, investigations have shown that for some OLE automation property “gets” to succeed, you need to use DISPATCH_METHOD or DISPATCH_PROPERTYGET; using just DISPATCH_PROPERTYGET will fail.

Speculations

There is a suspicious bit in the MSDN documentation for Getting and Setting properties:

 Properties are accessed in the same way as methods, except you specify DISPATCH_PROPERTYGET or DISPATCH_PROPERTYPUT instead of DISPATCH_METHOD. Some languages cannot distinguish between retrieving a property and calling a method. In this case, you should set the flags DISPATCH_PROPERTYGET or DISPATCH_METHOD.

Do you know a language that can’t distinguish between properties and methods? Visual Basic.

Given that Visual Basic was a primary user of OLE Automation, it’s likely using DISPATCH_PROPERTYGET or DISPATCH_METHOD all the time indiscriminately, so I’ll speculate that at some point, using both dispatch flags became the effective spec. I suppose Raymond Chen might be able to provide some further insight?

Bottom Line

DWScript COM Connector now has the same OLE Automation super-powers, and it’ll work without modifying the ComObj unit.

The COM connector uses its own IDispatch.Invoke wrapper, distinct from Delphi’s, which is incidentally also capable of working with Single precision floats (which requires a workaround in Delphi).


What would get you to buy a newer Delphi version?

April 26th, 2013

This is a practical poll question, what would get you to buy a newer Delphi version?
What would you like to see most and foremost, and would most have a use for?
To force you to choose, you can only pick two items!

I would buy a newer version of Delphi with...



Not in (interpreted) Kansas anymore

April 24th, 2013

For a few weeks now, an experimental JIT compiler has been available in the DWScript SVN for 32-bit code. A more detailed article into the hows and whats and WTFs will come later.

Consider this article as an extended teaser, and a sort of call for test cases, benchmarks and eyeballs on the source code.

The following benchmarks were compiled in XE2 with full optimization, no stack frames and no range checking. Absolute values aren’t meaningful, just consider them relative to one another.

Mandelbrot fractal

Mandelbrot benchmark (lower is better):

Delphi XE2-32bit: 515
Delphi XE3-64bit: 162
DWScript JIT 32bit: 281

The JIT was initially tested on the Mandelbrot benchmark, so that’s one of the cases where JITing is almost complete, with the exception of the SetPixel() call.

SciMark 2

SciMark benchmark (higher is better):

Delphi XE2-32bit: 507
Delphi XE3-64bit: 682
DWScript JIT 32bit: 215

The Delphi version uses pointers; the DWScript version was slightly updated to use dynamic arrays instead, and JITting is partial at the moment.

The benchmark involves fairly large matrices, and DWScript’s use of Variant (16 bytes) rather than Double (8 bytes) means the data no longer fits in the CPU cache, which partly accounts for the poor showing of the JIT.

Array statistics

Array statistics benchmark (lower is better):

Delphi XE2-32bit: 208
Delphi XE3-64bit: 47
DWScript JIT 32bit: 63

This test measures the execution time of the following code (fully JITted), which computes the base values for common array statistics (range, average, deviation, etc.). The Delphi 32bit compiler really suffers because of Min/Max (despite having inlined them).

// "a" is a floating point array of non-ordered values
for v in a do begin
   s := s + v;
   s2 := s2 + v*v;
   mi := Min(mi, v);
   ma := Max(ma, v);
end;
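For reference, here is a sketch of how those base values turn into the actual statistics. It is not the benchmark source: TArrayStats and ComputeStats are names invented for this example, the deviation is the population standard deviation, and the array is assumed to be non-empty.

```pascal
program ArrayStatsDemo;
{$APPTYPE CONSOLE}

uses
  Math;   // Min, Max

type
  // Hypothetical result record for this illustration
  TArrayStats = record
    Mean, StdDev, Lowest, Highest: Double;
  end;

function ComputeStats(const a: array of Double): TArrayStats;
var
  i: Integer;
  v, s, s2: Double;
begin
  // Assumes "a" holds at least one element
  s := 0;
  s2 := 0;
  Result.Lowest := a[0];
  Result.Highest := a[0];
  for i := 0 to High(a) do begin
    v := a[i];
    s := s + v;            // sum of values
    s2 := s2 + v*v;        // sum of squares
    Result.Lowest := Min(Result.Lowest, v);
    Result.Highest := Max(Result.Highest, v);
  end;
  Result.Mean := s / Length(a);
  // Population variance = E[v^2] - E[v]^2, clamped against rounding errors
  Result.StdDev := Sqrt(Max(0.0, s2 / Length(a) - Sqr(Result.Mean)));
end;

var
  stats: TArrayStats;
begin
  stats := ComputeStats([2, 4, 4, 4, 5, 5, 7, 9]);
  Writeln('mean=', stats.Mean:0:2, ' stddev=', stats.StdDev:0:2,
          ' range=', stats.Lowest:0:2, '..', stats.Highest:0:2);
  // mean=5.00 stddev=2.00 range=2.00..9.00
end.
```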

What next?

The DWScript JIT compiler relies on SSE2 to outperform the Delphi 32bit compiler. Its main current limitations are:

  • JIT centers around floating point and a limited subset of integer and Boolean operations, the rest isn’t JITted yet.
  • Function calls aren’t JITted at the moment, and neither are a variety of other statements.
  • The JIT works with the same data structures as the interpreted engine, that means script debuggers and everything else works on JITted code as if it was still interpreted, but that also means the basic data unit is still the 16 bytes Variant at the moment.
  • The JIT register allocator is currently limited to floating point (ie. no integer or pointer allocations).
  • DWScript Integer type is 64bit sized, so for 32bit values, Integer performance is lower than what Delphi 32 can do, even though the JIT can generate typically faster code for it than the Delphi 32bit compiler does for Int64.

The JIT also suffers against a 64bit compiler as there are 64bit CPU instructions (and registers) not accessible in 32bit mode, but a 64bit JIT should be able to go farther.

If you’re interested and want to help, I’m currently looking for benchmarks and test cases. If you have code that compiles in both Delphi and DWScript, particularly on integer maths (encryption, etc.) or object-oriented manipulations (graphs, trees…), that could help. You’re even allowed to have the Delphi version use pointers and other tricks, the comparison doesn’t need to be fair ;-)


DWScript happenings

April 1st, 2013

This is a belated news update, with only the highlights:

Language News

  • combined property/fields declaration is now supported (same syntax as Oxygene)
  • dynamic arrays now support a Remove method, which has the same syntax as IndexOf, and can be used to remove elements by value
  • for var xxx in array syntax is now supported, which combines a local, type-inferenced variable declaration and a for … in loop.
  • unit test coverage is now at 97% for the compiler, 91% for the whole of DWScript
  • various obscure bugs found and fixed

Script engine News

  • script engine transition from stack-based to closure-based has begun; besides internal changes, the visible impact should be improved performance for objects, records, static arrays, var and const params
  • full transition to closure-based engine (and support for anonymous methods and lambdas in the script engine) is pushed back to 2.4

Teaser News

Also by way of a teaser, here is a screenshot related to something brewing in the lab… it’s from a Delphi XE app running the Mandelbrot benchmark (timings are for several runs), and the “DWScript” in that screenshot is the script engine.

[Screenshot: dws_jit]

Note that since this screenshot was taken, performance has improved, and the pony has learned new tricks :-)

 


Delphi array constructors performance (or lack of)

February 18th, 2013

In Delphi you can initialize a dynamic array in two ways, either manually or via the Create magic constructor:

type
   TIntegerArray = array of Integer;

procedure Test;
var 
   a : TIntegerArray;
begin
   // magic constructor
   a := TIntegerArray.Create(1, 2);

   // manual creation
   SetLength(a, 2);
   a[0] := 1;
   a[1] := 2;
end;

The outcome in both cases is the same, but are all things equal?

Some array initializations are more equal than others

The first method is less verbose in code, but quite a bit less efficient. If you check the CPU view, that becomes obvious:

TestUnit.pas.32: a := TIntegerArray.Create(1, 2);
00511335 8D45F8           lea eax,[ebp-$08]
00511338 8B15F0125100     mov edx,[$005112f0]
0051133E E89576EFFF       call @DynArrayClear   // anybody knows why?
00511343 6A02             push $02
00511345 8D45F8           lea eax,[ebp-$08]
00511348 B901000000       mov ecx,$00000001
0051134D 8B15F0125100     mov edx,[$005112f0]
00511353 E87476EFFF       call @DynArraySetLength
00511358 83C404           add esp,$04
0051135B 8B45F8           mov eax,[ebp-$08]
0051135E C70001000000     mov [eax],$00000001
00511364 8B45F8           mov eax,[ebp-$08]
00511367 C7400402000000   mov [eax+$04],$00000002
0051136E 8B55F8           mov edx,[ebp-$08]
00511371 8D45FC           lea eax,[ebp-$04]
00511374 8B0DF0125100     mov ecx,[$005112f0]
0051137A E89576EFFF       call @DynArrayAsg

// Manual initialization
TestUnit.pas.35: SetLength(a, 2);
0051137F 6A02             push $02
00511381 8D45FC           lea eax,[ebp-$04]
00511384 B901000000       mov ecx,$00000001
00511389 8B15F0125100     mov edx,[$005112f0]
0051138F E83876EFFF       call @DynArraySetLength
00511394 83C404           add esp,$04
TestUnit.pas.36: a[0] := 1;
00511397 8B45FC           mov eax,[ebp-$04]
0051139A C70001000000     mov [eax],$00000001
TestUnit.pas.37: a[1] := 2;
005113A0 8B45FC           mov eax,[ebp-$04]
005113A3 C7400402000000   mov [eax+$04],$00000002

Now before you complain about the compiler’s capabilities, you’ve got to realize that the two ways of initializing a dynamic array are not equivalent:

  • the magic constructor creates an array, then assigns it, so the array variable is always in a well-defined state
  • the manual initialization mutates the array in several steps, so during the intermediate steps the array is in an unfinished state

Of course, in the limited Test procedure, the compiler could figure out the array isn’t visible from the outside, and thus use the shorter form, but that’s an optimization that would apply only to local variables.

A more generic optimization would be to have the compiler waive the temporary array when the array that is initialized isn’t referenced anywhere else (so intermediate states don’t matter), that’s possible given that dynamic arrays are reference-counted.

Overhead in detail

The final outcome is that using the Create magic constructor can incur quite a bit of overhead:

  • a DynArrayClear call (not sure why it’s there), that will release the previously assigned block of memory for the temporary array
  • a DynArraySetLength, that will allocate a new block of memory and zero it
  • a DynArrayAsg call, that will trigger the release of the memory for the existing array (if it wasn’t empty), along with a bus lock for the reference count overhead
  • extra initialization and finalization for the temporary array

In a multi-threaded application, all that extra memory management and bus locking is going to have a disproportionate impact on performance. If you test the above snippets in a multi-threaded environment, you’ll notice that when using the array constructor, execution quickly becomes single threaded, bottle-necking on the memory manager and bus locks.

The manual initialization only has a single DynArraySetLength call, and if the array is not empty, this may not result in a new block being allocated (as the existing memory block could just be resized in place). So if you initialize the same array variable more than once, the manual form can be quite cheap.
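A quick way to observe the gap on your own machine is a timing loop like the sketch below. It is single-threaded only, so it doesn’t reproduce the bus-lock contention described above; it assumes a Delphi version with TStopwatch (Delphi 2010 or later), and the procedure names and iteration count are arbitrary.

```pascal
program ArrayInitBench;
{$APPTYPE CONSOLE}

uses
  SysUtils, Diagnostics;

type
  TIntegerArray = array of Integer;

const
  Iterations = 5000000;

procedure UseConstructor;
var
  a: TIntegerArray;
  i: Integer;
begin
  for i := 1 to Iterations do
    a := TIntegerArray.Create(1, 2);   // temporary array + assignment each time
end;

procedure UseManual;
var
  a: TIntegerArray;
  i: Integer;
begin
  for i := 1 to Iterations do begin
    SetLength(a, 2);                   // same variable, same length after the first pass
    a[0] := 1;
    a[1] := 2;
  end;
end;

var
  sw: TStopwatch;
begin
  sw := TStopwatch.StartNew;
  UseConstructor;
  Writeln('magic constructor: ', sw.ElapsedMilliseconds, ' ms');

  sw := TStopwatch.StartNew;
  UseManual;
  Writeln('manual init      : ', sw.ElapsedMilliseconds, ' ms');
end.
```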

A better array initializer?

Now that I showed you the magic array Create constructor is no good, what if you still want something convenient? Well open arrays can come to the rescue:

procedure InitializeArray(var a : TIntegerArray; const values : array of Integer);
begin
   SetLength(a, Length(values));
   if Length(values)>0 then
      Move(values[0], a[0], Length(values)*SizeOf(Integer));
end;
...
InitializeArray(a, [1, 2]);

The above function won’t be as efficient as manual initialization: there is an extra function call and the values will be copied twice. However it eliminates all the extra memory management and bus locking, so it will scale much better across threads, while being compact both syntax-wise and code-wise.

Note that for managed types (String, Interface…) System.Move can’t be used: you’ll need either asm hackery or a for-to-do loop with item-by-item assignment, which will incur a performance hit and often make it non-competitive with the manual version.
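For strings, the for-to-do variant could be sketched as follows. TStringArray and InitializeStringArray are names chosen for this example; the item-by-item assignment lets the compiler maintain the reference counts that a raw Move would bypass.

```pascal
program InitStringArrayDemo;
{$APPTYPE CONSOLE}

type
  TStringArray = array of String;

// Managed-type counterpart of InitializeArray: assigns items one by one
// so that each string's reference count is properly updated.
procedure InitializeStringArray(var a: TStringArray; const values: array of String);
var
  i: Integer;
begin
  SetLength(a, Length(values));
  for i := 0 to High(values) do
    a[i] := values[i];
end;

var
  names: TStringArray;
begin
  InitializeStringArray(names, ['alpha', 'beta']);
  Writeln(Length(names), ' ', names[0], ' ', names[1]);   // 2 alpha beta
end.
```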

Need even more speed?

In the grand scheme of things however, all the above approaches suffer from the SetLength call, which is quite complex (have a look at DynArraySetLength in the System.pas unit… and weep). So if you know there is a chance the dynamic array already has the right length, you can gain in the manual version by doing

if Length(a)<>Length(values) then
   SetLength(a, Length(values));

When the SetLength call is waived, this can net you a mind-boggling speedup of more than 10x.
Ouch! Why doesn’t the RTL do that?

Well, it doesn’t do that because it can’t, as Delphi’s dynamic arrays are some kind of hybrid, half-way between a value type and a reference type, and SetLength is the keystone where all the hybridization happens (for more on the subject, see Dynamic Arrays as Reference or Value Type).
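The hybrid behavior can be illustrated with a small sketch (variable names are illustrative): element writes go through the shared reference, while SetLength, even with an unchanged length, hands back an array that is referenced nowhere else.

```pascal
program SetLengthUniqueDemo;
{$APPTYPE CONSOLE}

type
  TIntegerArray = array of Integer;

var
  a, b: TIntegerArray;
begin
  SetLength(a, 2);
  a[0] := 1;
  a[1] := 2;

  b := a;                    // reference semantics: both variables share one block
  b[0] := 9;                 // no copy-on-write for dynamic arrays
  Writeln(a[0]);             // 9

  SetLength(b, Length(b));   // same length, yet "b" now gets its own copy
  b[0] := 7;
  Writeln(a[0]);             // 9: "a" is no longer affected
  Writeln(b[0]);             // 7
end.
```

This is also why the Length() guard above is only safe when you know the array isn’t shared: skipping SetLength skips that uniqueness guarantee too.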

And FWIW, in DWScript, arrays are first-class reference types, which means they can have more capability, and their initialization syntax is also more compact, the above initialization is just:

a := [1, 2];

And if you’re using Smart Pascal and running it in Chrome V8 or node.js, well, let’s just say you’ll need to use all the above tricks for Delphi to come ahead performance-wise.


SamplingProfiler 1.8.0

January 17th, 2013

Version 1.8.0 of SamplingProfiler is now available.

The options dialog now supports Delphi XE2 & Delphi XE3 search/browse paths, other changes are only indirect fixes/improvements related to the components and libraries used in SamplingProfiler.


Thirteen features of DWScript

December 31st, 2012

After having survived the Mayan apocalypse and just before the year ends, and the US economy falls from the fiscal cliff into the pit of recession, here is a quick look at thirteen features of DWScript, so you don’t come into 2013 unaware :)

1. Supercharged memory management

ARC is trendy, but introduces cycles and weak references hell. GC is cool, but garbage collection stalls aren’t. Manual memory management is fast and deterministic, but verbose and error-prone.

Why not have them all work together at the same time instead? That’s what you get in DWScript: ARC for immediate releases in simple cases, GC to collect cycles and avoid weak references, and manual deterministic release when you need it.

2. First-class dynamic arrays

Dynamic arrays are strongly typed and first-class: they have methods to Add, Delete, etc. like a (T)List<T>.

And where automatic memory management absolves you from destructors most of the time, true dynamic arrays absolve you from requiring constructors for list and LIFO queues.

3. Type-inferenced variables

Why manually specify the type when the compiler can infer it for you and you get a strongly typed variable?

“var a := 2;” is the same as “var a : Integer; … a := 2;” would be in Delphi.

4. Combined field declaration and initialization

Also type inferenced, field default values can absolve you from having to write a constructor in trivial cases.

5. Multiple helpers per type

Helpers aka extension methods can be a powerful way to add functionality to existing types. Having a limit of one-helper-per-type hamstrings them.

6. Multi-line strings

Whatever your strings or constants are used for, sometimes they need to span more than one line. When that happens, being able to absolve yourself from concatenating an obscure #13#10 is nice. Having a syntax that will allow you to properly indent those strings without introducing spurious indentation in the string is even better.

7. Lambda syntax

This applies to SmartPascal, where closures and anonymous methods are common: by leveraging type inference and a less verbose, yet not operator-soupy syntax, you get far more compact code that is still statically and strongly typed. Enough to make one happy.

8. Custom language extensions

Sometimes it would be nice to introduce custom constructs into your source code. With DWScript’s language extension mechanism you can do just that and take over the parser for a section of code, which can be used to introduce asm or JSON capability.

9. Unicode-safe strings and iterations

Because sooner or later you’re going to end up parsing Chinese and your 16-bit-character assumptions are going to haunt you. Having Unicode-aware loops and no 8-bit or 16-bit Char type in the language avoids that.

10. Unified function pointers / delegate

Because life is too short for procedure vs procedure of object vs reference to procedure. Let the compiler figure it out.

11. Compile-time function evaluations and constant expressions

When something is constant, it’s constant, no? DWScript lets you use constant expressions to define constants, so that you can have a constant defined by just “Sin(PI/4)” rather than a magic number plus a comment to explain what that magic number is.

12. Partial classes

Partial classes allow you to support automatically-generated code more easily, including code generated from compile-time type information, or just make your classes more extensible, by design.

13. Deprecated

Feature #13 got deprecated and isn’t available anymore.

14. Being able to compile to HTML5+JavaScript

With SmartMobileStudio, make your business logic cross-platform, cross-terrain and as ubiquitous as it can get these days. Run in mobile or desktop browsers, run server-side with node.js, or client-side with node-webkit as runtime environment.


Pascal open-source projects trending… Up!

November 7th, 2012

At least according to ohloh, when measuring commits to open-source projects, the graph is below:

Food for thought for Marco Cantù, the new Delphi Product Manager? The percentage figures are back in the 2004 range, and the trend, after going downhill for years, reversed to show 3+ years of growth.

When looking at the details though, most of the recent years’ activity is on projects around FreePascal and Lazarus, whereas at the turn of the millennium, the vast majority of it was targeted at Delphi.

The above chart is expressed in terms of percentage of activity in all open-source projects, 0.4% is low, but that’s better than Visual Basic does ;-) though there arguably never was a strong Visual Basic open-source community to begin with.

Other fashionable languages (C#, Java, PHP, Ruby, etc.) are quite a bit higher, though their trends are flat or downward. Even Objective-C is quite flat, indicating Apple developers probably don’t share.

There is however one language that shows a constant, tranquil upward trend, it’s… JavaScript

I couldn’t find other languages with a similar trend, except maybe Python.


Common Unit Tests for Delphi / FreePascal

September 6th, 2012

Serg recently wrote an introduction to unit testing under Lazarus, showing how everything is there, but just that little bit “off” because of different unit names between FPCUnit and DUnit.

Not being a fan of ifdef, the prospect of having unit tests “uses” sections littered with ifdef did not attract me, so I made a little adapter unit to keep the “uses” sections clean.

It simply aliases the useful unit tests classes and units and goes like:

unit dwsXPlatformTests;

interface

uses
   Classes, SysUtils,
   {$ifdef FPC}
   fpcunit, testutils, testregistry
   {$else}
   TestFrameWork
   {$endif}
   ;

type

   {$ifdef FPC}
   TTestCase = fpcunit.TTestCase;
   {$else}
   TTestCase = TestFrameWork.TTestCase;
   {$endif}

procedure RegisterTest(const testName : String; aTest : TTestCaseClass);

implementation

// RegisterTest
//
procedure RegisterTest(const testName : String; aTest : TTestCaseClass);
begin
   {$ifdef FPC}
   testregistry.RegisterTest(aTest);
   {$else}
   TestFrameWork.RegisterTest(testName, aTest.Suite);
   {$endif}
end;

end.

With it, unit test cases can just refer to “dwsXPlatformTests”, which encapsulates the ugly ifdef’ing.
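A test unit built on the adapter might then look like the sketch below. UMyTests, TMyTests and TestAddition are hypothetical names, and the example assumes that the Check method is available on TTestCase in both frameworks (DUnit provides it, and FPCUnit offers DUnit-compatible Check methods in recent versions).

```pascal
unit UMyTests;

{$ifdef FPC}{$mode delphi}{$endif}

interface

uses
   Classes, SysUtils, dwsXPlatformTests;   // no framework-specific unit, no ifdef

type

   // Hypothetical test case, resolved to fpcunit or TestFrameWork by the adapter
   TMyTests = class (TTestCase)
      published
         procedure TestAddition;
   end;

implementation

// TestAddition
//
procedure TMyTests.TestAddition;
begin
   Check(2+2=4, 'addition should work');
end;

initialization

   RegisterTest('MyTests', TMyTests);

end.
```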
