Archive

Posts Tagged ‘DWS’

Small is Beautiful

February 6th, 2012

Small JavaScript, that is. Or how to go from 350 kB down to just… 23 kB (initially 25 kB).

Smaller JavaScript can help in up to three ways:

  • faster download: faster application installation or startup.
  • faster parsing for the browser: faster startup.
  • smaller identifiers: faster execution for non-JITting JavaScript engines.

And smaller also means you can have far more complex applications for a given size budget.

Using “Nickel Iron” as illustration

Nickel Iron is now available in the Chrome Web Store and in the Android marketplace. It’s built with Smart Mobile Studio, using the DWScript JS CodeGen.

The Pascal source for Nickel Iron is made of about 10k lines of code (most of it from the VJL), and a “normal” build results in 350 kB of JavaScript, well formatted, readable and debug-able with clear variable & class names. That’s about 50% larger than the Pascal source.

Starting from that, if you enable obfuscation, optimization for size and smart linking, the 350 kB go down to 100 kB of JavaScript source (not really readable anymore), and when that source is packaged in an Android app or sent with HTTP compression, you’ll be looking at a 25 kB file.

For comparison, jQuery is 229 kB raw, and 31 kB minified & compressed, and jQuery UI is about twice as large as jQuery. And when you’ve taken all that baggage, you haven’t done anything yet!

Obfuscation

Obfuscation isn’t just to make your code more annoying to reverse engineer: it can also help make your JavaScript smaller.

When obfuscation is active, the CodeGen will replace most identifiers with shorter versions, usually 1 to 3 characters in length.

Since JavaScript is case sensitive, each extra character added to an identifier can take 62 different values (the CodeGen reserves “$” and “_” for special uses). So obfuscated identifiers are typically short, which saves space.

JavaScript is also quite heavy on hash-table name lookups, and smaller names help make hash computations faster. On a Desktop JavaScript engine, that advantage quickly fades away as the browser hot-spots profiler decides to JIT, but on your typical Smart-phone browser, the difference can be felt.
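To put numbers on this, here is a small JavaScript sketch. It is not the CodeGen's actual naming algorithm: `countNames`, `shortName` and the alphabet order are assumptions made for illustration. It assumes the first character must be a letter (a JavaScript identifier cannot start with a digit) and that each further character can take the 62 values mentioned above. A real generator would also have to skip reserved words such as "do", "if" or "in".

```javascript
// Illustrative only: the real obfuscator's naming scheme is not described in this post.
const LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const ALNUM = LETTERS + "0123456789"; // 62 characters, "$" and "_" left out

// Number of distinct identifiers of exactly `length` characters.
function countNames(length) {
  return LETTERS.length * Math.pow(ALNUM.length, length - 1);
}

// Map a non-negative integer to the n-th short name: a, b, ..., Z, aa, ab, ...
// (reserved words are NOT filtered out in this sketch)
function shortName(index) {
  let length = 1;
  while (index >= countNames(length)) {
    index -= countNames(length);
    length++;
  }
  let name = "";
  for (let i = 1; i < length; i++) {
    name = ALNUM[index % ALNUM.length] + name;
    index = Math.floor(index / ALNUM.length);
  }
  return LETTERS[index] + name;
}

console.log(countNames(1), countNames(2), countNames(3)); // 52 3224 199888
console.log(shortName(0), shortName(51), shortName(52));  // a Z aa
```

Under these assumptions, names of 1 to 3 characters already cover a bit more than 200,000 identifiers, far more than a 10k-line program needs.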

Optimize for Size

Optimize for size triggers two mechanisms in the CodeGen:

  • a JavaScript “minifier” is run on the output, which strips away comments, useless spaces, tabs and other characters.
  • alternative code generation templates are used, which spit out less readable but smaller code. At this point, there is no choice between size and performance, only between more and less human-readable.

The minifier is applied to “asm” sections too, and performs “safe” minifications only.

Smart Linking

Just like in Delphi, smart linking will eliminate functions, methods and classes you have in the Pascal source, but never use in your program.

This is where things break away from other JavaScript libraries in terms of size. At best, they offer manual smart-linking like jQuery UI’s “Build Your Download”, or plain old plug-ins. But if you want to use those, it means you’ll be manually managing hundreds of different builds (given all the possible combinations), and will probably just be bundling useless stuff sooner or later, because life’s too short and/or time is money.

However, just like in Delphi, Smart Linking works best if your code is well decoupled, if you use dependency-injection and other light-coupling design approaches. So avoid coding spaghetti plates ;-)

At the time this article was written, the DWScript Smart-Linker limitations are:

  • virtual or interfaced methods of a class you use aren’t eliminated (same limitation as in Delphi). Update 02/08: now supported, was simpler than anticipated.
  • there is no de-virtualization just yet (same limitation as in Delphi).
  • cross-referencing functions aren’t eliminated (procA calls procB, and procB calls procA), though as this may be more of a sign of a code smell, it might just be getting a compiler “hint” rather than smart-linker support.
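One way a limitation like the last one can arise is when "used" means "referenced somewhere": two functions that only reference each other then keep each other alive. The JavaScript sketch below is not the DWScript smart-linker, just an illustration on a made-up call graph (`callGraph`, `main` and all function names are hypothetical). It contrasts a naive "is it referenced?" check with a reachability walk from the entry point.

```javascript
// Hypothetical call graph: each function lists the functions it references.
const callGraph = {
  main: ["draw"],
  draw: [],
  unusedHelper: [],
  procA: ["procB"], // procA and procB only reference each other
  procB: ["procA"],
};

// Naive check: keep anything referenced by some other function (plus the entry point).
function keepIfReferenced(graph, entry) {
  const kept = new Set([entry]);
  for (const name of Object.keys(graph)) {
    for (const callee of graph[name]) {
      if (callee !== name) kept.add(callee);
    }
  }
  return [...kept].sort();
}

// Reachability: keep only what can be reached from the entry point.
function keepIfReachable(graph, entry) {
  const kept = new Set();
  const pending = [entry];
  while (pending.length > 0) {
    const name = pending.pop();
    if (kept.has(name)) continue;
    kept.add(name);
    pending.push(...(graph[name] || []));
  }
  return [...kept].sort();
}

console.log(keepIfReferenced(callGraph, "main")); // [ 'draw', 'main', 'procA', 'procB' ]
console.log(keepIfReachable(callGraph, "main"));  // [ 'draw', 'main' ]
```

The reachability walk drops the procA/procB pair because nothing reachable from `main` ever references them.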

Finally, Pascal being declarative and statically-typed (as long as you’re not abusing RTTI/asm stuff), the Smart-Linker will be able to go further than optimizers that start from JavaScript (like Google’s Closure Compiler), which have to accommodate all potential dynamic tricks.

Tips

Meanwhile, in the DWScript SVN…

February 2nd, 2012

This summary of recent DWS changes is coming a bit late, and there is quite a bit to cover. Here is a quick, partial roundup of what changed since the last update.

Language changes:

  • Initial support for overloads (currently limited to standalone functions/procedures).
  • New operator “sar” for bitwise shift arithmetic right.
  • Delphi-like “dotted unit names” are now supported.
  • Support for classic Delphi-style local procedures declaration in units (before the “begin”).
  • Support const “blocks” in units.
  • Support for “array [TEnumeratedType] of” short declaration for static arrays.
  • New hints for unused private symbols and redundant scope specifiers.
  • New built-in functions: LastDelimiter, dynamic array’s Insert(), Min()/Max() overloads.
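For readers unfamiliar with the distinction behind “sar”: an arithmetic right shift copies the sign bit in from the left, while a logical right shift fills with zeros. JavaScript happens to expose both on 32-bit values, which makes for a quick illustration. This shows the general concept only; DWScript's integer width and exact semantics are not taken from this post.

```javascript
const value = -8;

// Arithmetic shift right: the sign bit is copied in, so the result stays negative.
const arithmetic = value >> 1;
console.log(arithmetic); // -4

// Logical shift right: zeros are shifted in, giving a large positive 32-bit value.
const logical = value >>> 1;
console.log(logical); // 2147483644
```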

Library and script engine changes:

  • TContextMap & TSymbolDictionary were renamed to TdwsSourceContextMap & TdwsSymbolDictionary.
  • The context map content has been significantly reorganized, fixed and is now more detailed.
  • TdwsSuggestions can now optionally suggest keywords too.
  • Random functions now based on XorShift, with independent random per execution.
  • New demo/sample: simple web server based on Indy with multi-threaded server-side script execution.
  • Faster DWS SynEdit highlighter.
  • Improved documentation.
  • Various minor fixes and improvements.

JavaScript & CodeGen changes:

  • Closures are now supported by the JavaScript CodeGen.
  • Added Smart-Linking capabilities to the CodeGen when program is compiled with context map & symbol dictionary. Its capabilities and limitations are currently roughly similar to Delphi’s.
  • Improved JavaScript CodeGen for various code generation cases (faster and smaller, as measured through jsperf benchmarking on various browser engines).
  • Improved JavaScript obfuscator.
  • Improved JavaScript minifier.

News

Closures in DWScript / OP4JS

January 21st, 2012

Closures, also known as “anonymous methods” in Delphi, are now supported by DWScript for the JavaScript CodeGen, with the same syntax as in Delphi:

myObject.MyEvent := procedure (Sender : TObject);
                    begin
                       ...
                    end;

There are of course some extensions that go beyond what Delphi supports ;-)

You are allowed to use a named local procedure as a closure / anonymous method, with optional capture of local variables, allowing for neater layout of code, for instance:

begin
   ...
   procedure MyLocalProc(Sender : TObject);
   begin
      ...
   end;
   ...
   myObject.MyEvent := MyLocalProc;
   ...
end;

Function pointers and closures are unified: you already did not have to distinguish between a “procedure” and a “procedure of object”, and you don’t have to distinguish a “reference to procedure” either, i.e. if you declare

type TNotifyEvent = procedure (Sender : TObject);

as long as the parameters match (and the result type for a function), the above type will accept standalone functions, object methods, interface methods, and now closures/anonymous methods (and even record methods, which are just syntax sugar for standalone functions with an implicit parameter).

PS: DWScript (as a scripting engine) will very likely evolve (in time) from a stack-based engine to a closure-based engine, so the above syntax will be supported for scripting purposes too, and not just when compiling to JavaScript.
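Since the target is JavaScript, this unification maps naturally onto plain function values. The sketch below is hand-written JavaScript, not CodeGen output; `button`, `counter` and the handler names are hypothetical. It shows a single event slot accepting a standalone function, an object method wrapped together with its instance, and a closure capturing a local variable.

```javascript
const log = [];

// Hypothetical object with a single event slot, like MyEvent above.
const button = { onClick: null };

// 1. Standalone function.
function standaloneHandler(sender) {
  log.push("standalone");
}

// 2. Object method: it needs its instance, so it is wrapped to keep the context.
const counter = {
  clicks: 0,
  handleClick(sender) {
    this.clicks++;
    log.push("method " + this.clicks);
  },
};

// 3. Closure capturing a local variable.
function makeClosureHandler() {
  let calls = 0;
  return function (sender) {
    calls++;
    log.push("closure " + calls);
  };
}

const handlers = [
  standaloneHandler,
  (sender) => counter.handleClick(sender),
  makeClosureHandler(),
];

for (const handler of handlers) {
  button.onClick = handler; // same slot, three kinds of callable
  button.onClick(button);
}

console.log(log); // [ 'standalone', 'method 1', 'closure 1' ]
```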

News

Good Practices for JavaScript “asm” sections in DWS/OP4JS

January 16th, 2012

The compiler supports writing “asm” (aka JavaScript) sections in the middle of Object Pascal. There are a few good practices and tips to keep in mind; let’s review the menu:

  1. Name conflicts and obfuscation support
  2. Do you really need an “asm” section?
  3. Don’t rely on implicit parameters structure
  4. Avoid declaring variables in “asm” sections
  5. Handling callbacks with “Variant” methods
  6. Handling callbacks in an “asm” section
  7. Current limitations

1. Name conflicts and obfuscation support

This should be a point zero actually, but the first thing to have in mind is that you are allowed in Pascal to use as names identifiers that are reserved in JavaScript. Those can be language keywords (“this”, “delete”, etc.) or common DOM objects and properties (“document”, “window”).

The compiler automatically protects you from such conflicts by transparently renaming your identifiers (currently by adding a “$”+number at the end).

Then there is the obfuscator, which will basically rename everything to short, meaningless names. That’s good for more than obfuscation: it reduces the size of the JavaScript, and improves parsing and lookup-based performance in the browser.

The consequence is that in an “asm” section, you should prefix all Pascal identifiers with an ‘@’, so the compiler can correctly compile your asm. For instance in:

var window : String;
...
asm
   @window = window.name
end;

The ‘@window’ refers to the ‘window’ string variable (which the compiler will rename), while ‘window.name’ will be compiled “as is”, as it reads the ‘name’ property of the global ‘window’ JavaScript object.
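As a rough idea of the result, the JavaScript below is a hand-written approximation of what such a section might compile to without obfuscation. The exact suffix (`window$1` here) is a guess based on the “$”+number scheme described above, and the `window` stub only exists so that the sketch also runs outside a browser.

```javascript
// Stub standing in for the browser's global "window" object (assumption for this sketch).
var window = { name: "main-frame" };

// Pascal: var window : String;  -> renamed to avoid the clash (the suffix is a guess)
var window$1 = "";

// Pascal asm: @window = window.name
window$1 = window.name;

console.log(window$1); // main-frame
```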

2. Do you really need an “asm” section?

Though for some weird cases you might (like this gem), there are many cases in which you don’t need “asm”, as the language supports a “Variant” type which is a raw JavaScript object, and upon which you can call methods, read properties directly or via indexes.

For instance, with v a Variant, the following code:

v := v.getNext();
v['hello'] := v.space + 'world';

will get compiled (almost) straight into

v = v.getNext();
v['hello'] = v.space + 'world';

When using Variant, you don’t have strong compile-time checks (it’s just you vs JavaScript), and property and function names are case-sensitive, so use them with care. This is similar in syntax and essence to using OLE Variants in Delphi.

On the other hand, you have compiler support, and you get automatic casts when assigning a variant to a strong type (Integer, String, etc.), and you also get name conflict protection & obfuscation support without having to ‘@’ your Pascal references.

3. Don’t rely on implicit parameters structure

Because they may change in future compiler revisions!

For instance, methods are currently invoked with an implicit “Self” parameter, followed by the other parameters, so currently “arguments[0]” is Self, and everything else comes after that. But don’t rely on it.

Future compiler revisions may change that parameter’s name, may obfuscate it, may remove it entirely in favor of an implicit “this”, may inline your function, etc.

So if you need explicit parameters, declare them, if you’re in a method and need to access the object (Self), use “@Self”, if you need to access a field of the current object use “@Self.FieldName”, etc. That will keep working.

4. Avoid declaring variables in “asm” sections

Declare them in the parent function/method instead, and reference them with the ‘@’ prefix.

There are three main reasons for that: the first is that doing so means they’ll be case-insensitive, the second is that it will allow the obfuscator to obfuscate them, and the third is that you’ll get compiler warnings if you declare a variable but do not use it (or if you forgot to @-prefix it).

So don’t write that:

asm
   var myTemp;
   myTemp = ...whatever...;
   ...
end;

But write this instead:

var myTemp : Variant;
...
asm
   @myTemp = ...whatever...;
   ...
end;

5. Handling callbacks with “Variant” methods

A common occurrence is to register a callback to a JavaScript object. When that object is hosted in a Variant, that’s fairly simple to achieve:

procedure DoImageLoaded;
begin
   ...
end;
...
var myImage : Variant; // will refer to an image object
...
myImage.onload(@DoImageLoaded);

There we use the ‘@’ operator Pascal-side, to make it explicit that we want a function pointer, and not a call to the function. The ‘@’ isn’t necessary when the function is declared Pascal-side, as the compiler can figure it out, but when invoking a Variant method, it doesn’t know the parameter types.

Note that since function pointers are unified, you can get a function pointer from an object method or an interface method in the same fashion:

myImage.onload(@myObject.DoImageLoaded);
myImage.onload(@myInterface.DoImageLoaded);

6. Handling callbacks in an “asm” section

If you want to register the callback in an “asm” section, the situation is a little more complex, as “@myObject.myMethod” will refer to the function prototype, outside of its context. It means it’s okay for standalone functions or procedures, but may not do what you’re expecting for object or interface methods.

The solution is to acquire the function pointer outside of the “asm” section:

var myCallback : Variant;
...
myCallback := @myObject.DoImageLoaded;
asm
   @myImage.onload(@myCallback);
end;
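Why the bare method reference misbehaves can be illustrated in plain JavaScript. This sketch assumes, as section 3 describes for the current compiler, that a method is a function receiving the instance as an implicit first parameter. The names `TMyObject$DoImageLoaded`, `myObject` and `fakeImage` are invented for illustration and are not actual CodeGen output.

```javascript
// Hypothetical compiled form of a method: the instance arrives as first parameter.
function TMyObject$DoImageLoaded(Self) {
  return "loaded by " + (Self ? Self.Name : "nobody");
}

const myObject = { Name: "myObject" };

// Stand-in for an image object: stores a callback and later calls it without arguments.
const fakeImage = {
  callback: null,
  onload(fn) { this.callback = fn; },
  fireLoad() { return this.callback(); },
};

// Registering the bare function loses the instance it belonged to.
fakeImage.onload(TMyObject$DoImageLoaded);
console.log(fakeImage.fireLoad()); // loaded by nobody

// A function pointer acquired together with its context behaves like this wrapper.
const myCallback = () => TMyObject$DoImageLoaded(myObject);
fakeImage.onload(myCallback);
console.log(fakeImage.fireLoad()); // loaded by myObject
```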

7. Current limitations

Currently the parser for “asm” sections doesn’t really understand JavaScript:

  • it’s still treating JS as a weird invalid form of Pascal, and notably {} define comments for it, so it will pass whatever is inside curlies “as is”, and will annoyingly ignore @ signs inside curlies.
  • some weird (but valid JS) operator combos may throw off the parser; if that happens, place that code in between curlies, and post a bug report.

Hopefully in time, there will be a proper JS parser, but currently the focus is more on the Pascal side, and “asm” sections are intended for handling corner cases more than as a main workhorse.

Tips

Christmas present for SynEdit users

December 26th, 2011

Just committed a few enhancements to the SynEdit SVN:

  • much improved performance for long & large files, still not quite notepad++-class just yet, but my profiler tells me there are many juicy low-hanging candies left :-)
  • improvements to the TSynGeneralSyn highlighter (single & double quote mode, token callback, and a few other niceties)
  • DWScript syntax highlighter has been moved to the SynEdit SVN, the copy in the DWScript SVN will no longer be the primary reference

edit 27/12: committed another optimization, AFAICT, when working on large files, SynEdit is now faster than the Delphi IDE code editor and *way* faster than Scintilla/notepad++ 5.9.6.2 (with and without a syntax highlighter like DWS’s)

News

DWS news + OP4JS aka SmartMobileStudio

December 23rd, 2011
A quick news roundup before Christmas.
 

OP4JS Alpha aka SmartMobileStudio is in the wild

We’ve now sent “Smart Mobile Studio” Alpha version to 50 testers.

Did you miss the beta invite?

Visit www.SmartMobileStudio.com to participate.

SmartMobileStudio leverages DWScript’s JavaScript CodeGen.

My first test app with the alpha was a clock; check it out in your iOS or Android browser, or in the Android market. Source is included in the alpha. I will beautify it later on :-)

I’ve been playing with another one: head over to YouTube to see a small video. You can also get the apk (47 kB), but beware, it’s basic, ugly, and definitely early alpha. But it’s coded in Pascal!

Below is a snippet of the source code (using DWS inline implementations for illustration and compactness purposes, most of OP4JS is written in the more classic interface/implementation style), it’s a snip of the root class of the mini-engine of the game (yes, virtual methods are supported):

type
   TEntity = class
      X, Y : Float;

      function Progress : Boolean; virtual;
      begin
         // does nothing by default
      end;

      constructor Create(aX, aY : Float);
      begin
         X := aX;
         Y := aY;
      end;

      function Dist2(entity : TEntity) : Float;
      begin
         Result := Sqr(X-entity.X)+Sqr(Y-entity.Y);
      end;
   end;
 

Other recent changes to the DWScript SVN

  • Added sample/simple IndyWebServer demo, implements basic “pascal server pages” and demonstrates how to use DWS in a multi-threaded environment. Makes use of RTTI Environment class to expose WebRequest & WebResponse. Expect more details in a future post.
  • TTerminatorThread has been replaced by TGuardianThread, which can “guard” multiple executions
  • Dotted unit names are now supported
  • Random no longer uses the Delphi RTL but XorShift
  • unit name symbols are now included in the Symbol Dictionary
  • include references are now included in the Symbol Dictionary
  • TdwsSuggestions can now optionally suggest reserved words (begin, procedure, etc.)
  • fixes for Inc() & Dec() when operating on references with side-effects.
  • improved several error messages related to parameter passing.
  • other misc. fixes and optimizations, more unit tests.

News

Zero-based Strings indexes?

December 15th, 2011

In a now infamous and enormous thread I won’t name, Allen Bauer dropped a bomb:

<bomb>Oh, and strings may become immutable and 0-based ;-)…</bomb>

Currently Oxygene has zero-based strings, I was considering it for DWScript too, but the backward-compatibility implications are a bit too huge (yes, we and customers have many years of accumulated DWS code), and the kind of issues triggered by that are hard to track/fix/warn about… or are they?

One evolution that is looming (at least for DWS, can’t speak for Delphi) is having methods on base-types too. Since these would be new methods, with no legacy, a zero-based convention could be introduced there, f.i.:

sub := Copy(str, 1, 10); // legacy, 1-based
sub := str.Copy(0, 10); // new, 0-based

As time passes, the functions would be marked as “deprecated”, and code migrated over to methods incrementally. The interim time would of course be a messy mix of zero and 1-based conventions… not very desirable, but certainly preferable to breaking code in non-obvious ways.

One hard case (without easy compiler-warnings) that would remain would be that of indexed character access, like “str[i]”. I can think of only one safe way around that one: not having a default array property. That could however be leveraged to gain something, f.i.:
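The off-by-one relationship between the two conventions is easy to state in code. In this JavaScript sketch, `copyLegacy` and `copyZeroBased` are hypothetical stand-ins for `Copy(str, index, count)` and the proposed `str.Copy(index, count)`; they only model the indexing for valid indices, not the full behavior of the Pascal routines.

```javascript
// Models Copy(str, index, count) with a 1-based index.
function copyLegacy(text, index, count) {
  return text.slice(index - 1, index - 1 + count);
}

// Models the proposed str.Copy(index, count) with a 0-based index.
function copyZeroBased(text, index, count) {
  return text.slice(index, index + count);
}

const greeting = "Hello, DWScript!";
console.log(copyLegacy(greeting, 1, 5));    // Hello
console.log(copyZeroBased(greeting, 0, 5)); // Hello
console.log(copyZeroBased(greeting, 1, 5)); // ello,
```

The last line shows the kind of silent breakage at stake: the same arguments give a different, still plausible-looking result under the other convention.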

char16 := str.Char16[i]; // equivalent to old str[i]
code := str.Code16[i]; // equivalent to old Ord(str[i])
charStr := str.Char[i]; // new, retrieves the whole character (one or more Char)
codePoint := str.Code[i]; // new, retrieves the whole unicode codepoint

The Xxx16 versions would return a WordChar, equivalent to a current Char, and only capable of holding a character from the BMP. The Xxx versions would return a String (a whole Unicode character/codepoint) or a UTF-32 code.
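JavaScript strings are also sequences of 16-bit code units, so they offer a convenient way to see the distinction being proposed. The sketch below uses standard JavaScript methods as rough analogues (`charCodeAt` for `Code16`, `codePointAt` for `Code`); the mapping to the proposed DWScript properties is an assumption, and whether their index would count code units or whole characters is left open above.

```javascript
const sample = "a\u{1F600}b"; // "a", one character outside the BMP, "b"

// 16-bit view: the non-BMP character occupies two code units (a surrogate pair).
console.log(sample.length);                      // 4
console.log(sample.charCodeAt(1).toString(16));  // d83d
console.log(sample.charCodeAt(2).toString(16));  // de00

// Whole code point view, starting at the same code unit index.
console.log(sample.codePointAt(1).toString(16)); // 1f600

// Iterating by code point yields three characters.
console.log([...sample].length);                 // 3
```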

Comments? Other Ideas?

Ideas

Pimp your random numbers with XorShift!

December 13th, 2011

A 64bit XorShift is now used to generate random numbers in DWScript, and there is now a separate random number generator per-execution, which is auto-randomized when an execution is created.

Previously, the RTL random generator was used. This was “okay” when you had only one script using random numbers at a time, but multiple scripts running at the same time would interfere (Randomize calls would affect each other, f.i.), and Random isn’t really thread-safe.

Performance of XorShift is roughly comparable to the Delphi RTL’s linear congruential generator, but with much better statistical random properties and a very long period, without the overhead of a Mersenne Twister. For those interested in the mathematical details, see the “XorShift RNGs” paper by G. Marsaglia.

As an illustration of the improved random properties, consider filling a bitmap with “random” RGB colors for each pixel:

var x, y : Integer;
for x := 0 to bmp.Width-1 do
   for y := 0 to bmp.Height-1 do
      bmp.Pixel[x, y] := RandomInt($1000000);

Using the Delphi built-in Random, you’ll get something like the image below (generated at 512×512, then halved and downgraded to 4bpp for web consumption)

[Image: Delphi RTL Random]

Oooh… the horizontal scratch lines! Not so random after all… I don’t know if the Delphi LCG is as biased as RANDU, but visibly, it is probably not something you want to rely upon too much.

And now, the same but with the XorShift implementation now used in DWS:

[Image: DWScript XorShift Random]

The XorShift implementation is very simple, fast, and doesn’t require much memory: a single 64bit value is enough to get good random numbers; use two if you want longer periods that won’t have a chance to loop before the universe ends.
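For illustration, here is a minimal JavaScript sketch of a 64-bit XorShift step with a single 64-bit state, using the widely used 13/7/17 shift triple. It is not DWScript's actual implementation: the shift constants DWScript uses, its seeding, and the `makeXorShift64` / `randomInt` names are all assumptions. BigInt is used because plain JavaScript numbers cannot hold 64-bit integers exactly.

```javascript
const MASK64 = (1n << 64n) - 1n;

// Each call returns an independent generator with its own state,
// mirroring the "one generator per execution" idea.
function makeXorShift64(seed) {
  let state = BigInt(seed) & MASK64;
  if (state === 0n) state = 1n; // an all-zero state would stay zero forever

  function next() {
    state ^= (state << 13n) & MASK64;
    state ^= state >> 7n;
    state ^= (state << 17n) & MASK64;
    return state;
  }

  // Integer in [0, range). The modulo introduces a slight bias, acceptable for a sketch.
  function randomInt(range) {
    return Number(next() % BigInt(range));
  }

  return { next, randomInt };
}

const rngA = makeXorShift64(1);
const rngB = makeXorShift64(1);
console.log(rngA.next());               // 1082269761n
console.log(rngB.randomInt(0x1000000)); // rngB has its own state, unaffected by rngA
```

Keeping the state inside the closure means two generators never interfere with each other, which is the property the per-execution design is after.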

Last but not least, 64bit XorShift may be fast in 32bit binaries, but it practically walks on water in 64bit binaries ;-)

News

DWS news roundup

December 3rd, 2011

Here is a quick summary of recent additions to DWScript:

  • exit, break and continue are now reserved keywords (and highlighted as such). Previously they weren’t (as in Delphi), but having variables or functions named that way just “breaks” (pun intended) those language features (as it does in Delphi).
  • new TdwsRTTIEnvironment, fields and properties of a class exposed this way are directly accessible in scripts, more details on that one in a future post.
  • support passing constant arrays to dynamic array parameters (automatic conversion).
  • improved the language documentation.
  • added ability for custom extension of Abs() special function (now supported by TComplex).
  • added Clamp() floating-point function.
  • added Gcd(), Lcm(), IsPrime() and LeastFactor() integer functions.
  • fixed an issue that prevented conditional directives from being supported in all portions of the code, they can now properly be used anywhere.
  • JavaScript codegen optimization for variable symbol lookup.
  • minor tokenizer speedup, compile speed should now be close to 200k LOC/sec*.
  • more test cases, minor fixes.

*: FWIW since the old benchmark, compile and execution performance almost tripled and memory requirements were cut by approx 30%. At the same time the language became quite a bit richer.

News

Compile-time evaluations, ‘&’ prefix, internal changes

November 17th, 2011

Here is a summary of recent changes for DWScript in the SVN:

  • support for compile-time evaluation of constant records and static arrays.
  • support for the ‘&’ escape prefix to allow using reserved words as identifiers.
  • fledgling math extensions (TComplex, TVector & TMatrix), currently they are still incomplete, slow and experimental.
  • added fledgling language doc in the wiki, redactors and help welcome!
  • several units were split: dwsSymbols to dwsUnitSymbols, dwsExprs to dwsInfo, and dwsFunctions to dwsMagicExprs. Advanced script users may have to update some of their “uses” clauses.
  • system & internal units are now “static” by default, this is mostly relevant if you were compiling lots of simple scripts, as it saves about 40 kB per script and reduces compile times for short scripts to a matter of nanoseconds.

As a consequence of the above, the unit tests suite now runs quite a bit faster… actually DUnit’s TreeView updates are now the bottleneck, despite hundreds of scripts getting compiled and executed at each step. Time to add more unit tests I guess ;-)

Vector and Matrix support has actually been a “fuzzy” goal for DWS for a long time, and is only now getting implemented. The plan is to eventually have them use SIMD, and possibly be part of a future OpenCL CodeGen target.

News