Zero-based Strings indexes?

In a now infamous and enormous thread I won’t name, Allen Bauer dropped a bomb:

<bomb>Oh, and strings may become immutable and 0-based ;-)…</bomb>

Oxygene currently has zero-based strings. I was considering the same for DWScript, but the backward-compatibility implications are too large (yes, we and our customers have many years of accumulated DWS code), and the kinds of issues such a change triggers are hard to track, fix or warn about… or are they?

One evolution that is looming (at least for DWS, I can’t speak for Delphi) is having methods on base types too. Since these would be new methods with no legacy, a zero-based convention could be introduced there, for instance:

sub := Copy(str, 1, 10); // legacy, 1-based
sub := str.Copy(0, 10); // new, 0-based
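
To make the relation between the two conventions concrete, here is a minimal sketch in Delphi-style Object Pascal. CopyFrom0 is a hypothetical free function standing in for the proposed str.Copy method (neither the name nor the signature comes from DWScript or Delphi); it only shifts the index by one and delegates to the existing 1-based Copy.

```pascal
program ZeroBasedCopySketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

// Hypothetical helper (an assumption, not an existing RTL or DWScript function):
// a 0-based counterpart of the legacy 1-based Copy.
function CopyFrom0(const s: string; index0, count: Integer): string;
begin
  // Shift the 0-based index to the 1-based convention and delegate.
  Result := Copy(s, index0 + 1, count);
end;

var
  str: string;
begin
  str := 'zero-based';
  // Both conventions must select the same characters.
  Assert(Copy(str, 1, 4) = 'zero');       // legacy, 1-based
  Assert(CopyFrom0(str, 0, 4) = 'zero');  // new, 0-based
  Assert(CopyFrom0(str, 5, 5) = 'based');
  WriteLn(CopyFrom0(str, 5, 5));
end.
```

The point of the sketch is that the two forms differ only by an offset, so a call site that names the function or method makes the convention visible to the compiler, which a bare index cannot do.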

As time passes, the functions would be marked as “deprecated”, and code migrated over to methods incrementally. The interim period would of course be a messy mix of zero-based and 1-based conventions… not very desirable, but certainly preferable to breaking code in non-obvious ways.

One hard case (without easy compiler warnings) that would remain is indexed character access, like “str[i]”. I can think of only one safe way around that one: not having a default array property. That could however be leveraged to gain something, for instance:

char16 := str.Char16[i]; // equivalent to old str[i]
code := str.Code16[i]; // equivalent to old Ord(str[i])
charStr := str.Char[i]; // new, retrieves the whole character (one or more chars)
codePoint := str.Code[i]; // new, retrieves the whole Unicode codepoint

The Xxx16 versions would return a WordChar, equivalent to the current Char and only capable of holding a character from the BMP. The Xxx versions would return either a String (a whole Unicode character/codepoint) or a UTF-32 code.
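
The difference between the two families shows up as soon as a character lies outside the BMP. The sketch below uses hypothetical free functions Code16At and CodeAt as stand-ins for the proposed Code16 and Code properties. It assumes that i is a 1-based UTF-16 code unit index, as with the legacy str[i]; the proposal above does not say what i would count for the Xxx versions, so that part is an assumption.

```pascal
program CodePointSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

// Hypothetical stand-in for str.Code16[i]: the raw UTF-16 code unit
// at the (assumed 1-based) index i.
function Code16At(const s: UnicodeString; i: Integer): Word;
begin
  Result := Ord(s[i]);
end;

// Hypothetical stand-in for str.Code[i]: the whole code point that
// starts at code unit i, combining a surrogate pair when there is one.
function CodeAt(const s: UnicodeString; i: Integer): Cardinal;
var
  lead, trail: Cardinal;
begin
  lead := Ord(s[i]);
  Result := lead;
  // A lead surrogate followed by a trail surrogate encodes one code point.
  if (lead >= $D800) and (lead <= $DBFF) and (i < Length(s)) then
  begin
    trail := Ord(s[i + 1]);
    if (trail >= $DC00) and (trail <= $DFFF) then
      Result := $10000 + ((lead - $D800) shl 10) + (trail - $DC00);
  end;
end;

var
  str: UnicodeString;
begin
  // 'a', then U+1F600 as the surrogate pair D83D DE00, then 'b'
  str := UnicodeString('a') + WideChar($D83D) + WideChar($DE00) + 'b';
  Assert(Length(str) = 4);               // four code units, three characters
  Assert(Code16At(str, 2) = $D83D);      // only half of the character
  Assert(CodeAt(str, 2) = $1F600);       // the whole code point
  Assert(CodeAt(str, 1) = Ord('a'));     // BMP characters are unchanged
end.
```

With a BMP-only string the two accessors agree, which is why code relying on str[i] tends to work until the first surrogate pair turns up.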

Comments? Other Ideas?

7 thoughts on “Zero-based Strings indexes?”

  1. @Serg
    Immutability is quite simple: you just have to forbid per-character write access. It’s rather painless to add in DWS, and can be safely enforced at compile-time (unlike a base index change).

    AFAIK, there are no benefits to immutability when using a copy-on-write mechanism, but there are when using a GC. So having immutable strings could be a hint that Delphi will get a GC…

  2. @Rudy Velthuis
    Strings use copy-on-write, which some argue is more advanced and practical for concurrency management: whole sets of classes in Java are built around it to show performance advantages in multi-threaded tasks, databases make heavy use of it, etc.

    In practice COW gives as many advantages in terms of concurrency as immutability does, with fewer duplication issues.

    For instance, in the link you provide, the author touts the threading capabilities of immutability, and then immediately falls back on a mutable class (StringBuilder) that just does NOT support threading (except through locking)…

    Copy-on-write solutions in f.i. Erlang, the above-mentioned Java classes and Delphi Strings offer an elegant solution that works well in threading situations without gotchas, unlike the String/StringBuilder duo.

  3. @Eric
    I agree with you that immutability is not for solving concurrency issues.

    But I do not think Delphi will get a GC. I saw Allen Bauer write somewhere that he prefers to deallocate an object where it is allocated. In other words, he prefers to manage objects the way C and C++ do.

  4. Never go to that post to see the context of Allen Bauer’s quote… It is a time-consuming “now infamous and enormous thread”. You’ve been warned!!!
