Here is a small poll to help me decide in which direction to go about dynamic arrays in DWScript. The poll is at the bottom of the post, to encourage reading before voting 😉
The Problem
In Delphi, fixed-size arrays behave as value types, while dynamic arrays behave as reference types. This can be illustrated by:
type
  TFixedSizeArray = array [0..9] of Integer;
  TDynamicArray = array of Integer;
...
procedure SomeProc(fixed : TFixedSizeArray; dynamic : TDynamicArray);
begin
  fixed[0]:=2;
  dynamic[0]:=2;
end;
...
var
  f : TFixedSizeArray;
  d : TDynamicArray;
begin
  f[0]:=1;
  SetLength(d, 10);
  d[0]:=1;
  SomeProc(f, d);
  // at this point f[0] is still 1, but d[0] is now 2
end;
However if you change SomeProc to
procedure SomeProc(fixed : TFixedSizeArray; dynamic : TDynamicArray);
begin
  fixed[0]:=2;
  SetLength(dynamic, 20);
  dynamic[0]:=2;
end;
then d[0] will be unchanged, as the SetLength() call will have spawned a different dynamic array. If you want to resize a dynamic array, you have to pass it as a var parameter, while if you only want to change the items, you don’t…
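The var-parameter variant can be sketched as follows. This is a minimal Delphi illustration; ResizeInPlace is a hypothetical name, not part of the example above.

```pascal
program VarParamSketch;
{$APPTYPE CONSOLE}

type
  TDynamicArray = array of Integer;

// Hypothetical helper: takes the array as a var parameter,
// so the resize is visible to the caller.
procedure ResizeInPlace(var dynamic : TDynamicArray);
begin
  SetLength(dynamic, 20);  // the caller's variable follows the reallocation
  dynamic[0] := 2;
end;

var
  d : TDynamicArray;
begin
  SetLength(d, 10);
  d[0] := 1;
  ResizeInPlace(d);
  // d now has 20 elements and d[0] is 2
  WriteLn(Length(d), ' ', d[0]);
end.
```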
With current Delphi syntax, dynamic arrays are schizophrenic: they are passed by reference, like a TObject would be, but resized as value types.
In a way, dynamic arrays borrow the String type’s SetLength() syntax and behavior, but String is passed in a form that mimics a value type. In the example above, it would behave like TFixedSizeArray, i.e. if you modify a String that wasn’t passed as var in SomeProc(), the original variable won’t be changed.
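A small sketch of the String case, for comparison. ModifyString is a hypothetical name; the behavior shown relies on Delphi’s copy-on-write for strings.

```pascal
program StringParamSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: the string is passed by value (no var).
procedure ModifyString(s : String);
begin
  s[1] := 'X';  // copy-on-write: the local s becomes a unique copy first
end;

var
  t : String;
begin
  t := 'abc';
  ModifyString(t);
  // t is still 'abc', just as f[0] was unchanged for the fixed-size array
  WriteLn(t);
end.
```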
The Options
To rationalize the situation, there are two options:
- make dynamic arrays a value type, with a copy-on-write optimization similar to String’s to minimize unnecessary copies. A side benefit is that dynamic arrays would behave like fixed-size arrays (all would be value types); a downside is a performance hit on write accesses (for the copy-on-write mechanism).
- make SetLength() treat dynamic arrays as a first-class reference type, i.e. have it work as dynamic.SetLength() would. If you need a unique copy of a dynamic array, you would use a dedicated function like Copy() rather than piggyback on SetLength().
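For reference, this is how sharing versus an explicit Copy() already looks in current Delphi, which is roughly the distinction the second option would make the rule. It is only a sketch; the variable names are illustrative.

```pascal
program CopySketch;
{$APPTYPE CONSOLE}

type
  TDynamicArray = array of Integer;

var
  a, shared, snapshot : TDynamicArray;
begin
  SetLength(a, 3);
  a[0] := 1;

  shared := a;          // plain assignment: both refer to the same array
  snapshot := Copy(a);  // explicit copy: an independent array

  a[0] := 2;

  WriteLn(shared[0]);   // 2, the write through a is visible here
  WriteLn(snapshot[0]); // 1, the copy is unaffected
end.
```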
So, what would feel more natural to you?
Dynamic arrays would be better as...
- a value type, like non-dynamic arrays, minimize the surprise factor (38%, 22 Votes)
- a full self-respecting reference type, more practical and efficient (62%, 36 Votes)
Total Voters: 58
I don’t want to vote but it seems to me that following the lead set by Delphi will reduce confusion amongst developers.
@David Heffernan
IME many existing Delphi developers are already confused by the Delphi behavior. Borlanders themselves experienced a few bugs related to dynamic arrays and SetLength in their days, so I don’t think Delphi has set any “lead” in that area; it’s more a case of a “mislead”.
In many modern languages, you are pushed towards collections instead of arrays (ie. reference types), though the Delphi approach of “behaves like a value type” introduced by String has its own merits (as a solution to the mutability/immutability dilemma).
Hi;
Maybe this video about dynamic arrays and types could help…
http://www.youtube.com/watch?v=Q67ur8GlFtQ
Elton
It’s not really schizophrenic.
Calling SetLength() on a dynamic array is equivalent to allocating a new chunk of memory for a new TObject reference to be held in the parameter value originally passed.
If you were to do that in your 2nd version of SomeProc() to a TObject then you would see the same thing: the local version has a (potentially) new reference to some new area of memory, but the calling proc still has the old reference to the original chunk of memory.
i.e. the behaviour is entirely consistent with a reference type.
The only difference between TObject and dynamic arrays here is that you rarely (if ever – more likely NEVER) reallocate memory for a given TObject reference in the way that changing the length of a dynamic array does.
I agree with David – arguing that making DWScript less confusing would reduce overall confusion neglects the fact that by being different the lessons learned in one “flavour” of “Delphi” Pascal in this area would not be portable to the other.
Can we have a 3rd option on the poll: Be consistent with Delphi ?
🙂
But yes, have a specific Copy() function for creating a true copy.
@Jolyon Smith
Also I would note that the *apparent* schizophrenia I think stems more from the fact that fixed length and dynamic length arrays are impossible to distinguish from one another at the point of use, so it’s easy to forget/overlook the differences in behaviour.
@Jolyon Smith
Yes, SetLength() is a wrapper around a ReallocMem(), hence the behavior.
In practice, dynamic arrays behave more like dynamically allocated static-sized arrays that can’t be resized without creating new copies (and where making a unique copy involves SetLength…).
Resizing is a rather central aspect of a dynamic array, while, as you point out, it’s irrelevant to TObject.
IME, because of the above behavior, the only “safe” dynamic arrays in Delphi production code are those tucked away as private properties in a class, that are never passed around as result or parameter. FWIW, it’s also what makes TBytes problematic as a general purpose binary buffer.
As for consistency with Delphi, given that the current syntax is inconsistent (with respect to regular static-sized arrays, f.i.), I’m not sure there is much point in being consistent with it.
Other flavors of Delphi and other modern languages went for a syntax that makes the dynamically allocated static array nature explicit, but typically still recommend going for collections (i.e. true reference types, where you can resize a List and have existing references point to the new resized List). Thus, beyond use as internal building blocks wrapping a ReallocMem, I’m not sure there is much point to dynamic arrays in Delphi.
If dynamic arrays syntax is clarified on the other hand, it opens up interesting syntax extensions in the direction of array comprehension, which with the current Delphi approach would just be full of traps and gotchas.
@Elton
Your video reminds me of the interesting side-effects in terms of memory management that can happen if you have dynamic arrays of objects, rather than just dynamic arrays of basic types, records or ref-counted types.
@Elton
Hey! Nice name. (:
@Eric
I really agree that it is a schizophrenic behavior. But it is a known schizophrenic behavior. Well, not so known… but yet…
If you asked me what I would like Delphi to do, I’d probably say that static-array-like behavior is the best, because otherwise the var keyword makes no sense at all.
But when you ask me about the way that DWScript will work, then I must take a breath and check: how do other Pascal compilers work?
So I am not sure how to vote… 🙁
@EMB
The var keyword still makes sense if you want to return a different array, f.i.
It also makes sense if you introduce the “new” keyword to create a new dynamic array explicitly, and the “new” keyword is incidentally the route taken by Prism (as Java & C# before it).
In Delphi SetLength() serves as “new”, “copy” and “resize” all wrapped into the same function, and the exact behavior it will have is extremely contextual.
Static array behavior has its own merits too, hence the poll.
@Eric
To me, your example of the use of “var” doesn’t look schizophrenic, but reminds me of OCD (obsessive–compulsive disorder)… What I mean is, if you can change the values and length without the var keyword, then why would you need it to return a different array?
Well, I like the static array behavior because it is instinctive once you learn how the other “normal” variables work. It is easy to explain too, like “if you want to change the values of the parameters, put var, else don’t”.
I know that underneath it all there are pointers and how the compiler takes care of the memory (copy? don’t copy? create new object?)… but I am talking about the way a language looks to the programmer, especially beginners. And well, compilers and actual processors are not my strong suit.
I am waiting to see how this will go… Also let me take this opportunity to congratulate you. Your work is great. Thanks.
@EMB.
Think of it as having the same behavior for a dynamic array as you would have for a TList, but having the compiler understand the syntax directly. In essence, a pure reference dynamic array is similar to having direct language support for something like a TList.
@Eric
The confusion is already here, whether you like it or not. I can’t see Delphi ever changing in this regard. There’s too much code out there and such a major breaking change would hurt too many people.
So if you opt to go a different route then I really don’t see how that would lessen the confusion levels.
Anyway, it’s your product, your call, this is just my own personal opinion.
@David Heffernan
I don’t think cleaning up that behavior would affect existing code anywhere that much. Speaking for our 1.5 million + codebase here and the open-source projects I know enough about, they would be unaffected, apart from a couple of asm source blocks maybe.
I personally wouldn’t be surprised to see the Delphi behavior changed, or the Delphi SetLength method deprecated in the future. The only code at risk would be code that passes raw dynamic array buffers around (like TBytes), and then only in cases of misuse.