July 27th, 2012

Locational naming

I'm finally getting around to a language feature I've been meaning to add for a long time now. The design is pretty straightforward, but I'd like to get some opinions before settling on the exact syntax.

I'll start by describing the feature at a high level, then I'll get into the syntax details.

First the motivation. Object naming in larger TADS games gets a little tedious, mostly because all object names are global. For a common object like a table, it'd be nice to be able to name it something simple like 'table', but we usually can't because we might have a few other tables scattered around the game. The usual way I deal with this is to choose fairly long names for items, doing something like combining the room name and the item name: kitchenTable, say, or iceCavePedestal. That solves the problem of keeping names unique, but at the cost of making the names hard to read and tedious to type.

(Note that anonymous objects were motivated by this same problem, and went a long way toward solving it. Anonymous objects neatly deal with the very common case of self-contained objects that no one else needs to refer to directly, such as decorations and components. But anonymity doesn't work when we need to refer to an object from code belonging to another object. That happens often enough that the object naming nuisance is still with us.)

Given this pattern of ad hoc object naming by location, how could we make it more automatic, and also improve readability and reduce our keyboard workload?

The compiler already has some awareness of the game world's containment structure, thanks to the '+' syntax for object definitions. That syntax lets us mirror the containment structure of the game world in the lexical structure of the source code, in a fairly natural way. The idea behind the new feature is to take this compile-time containment structure and use it to partition the object namespace. Rather than having to *explicitly* use a name like kitchenTable, it would be nicer to be able to call it simply 'table', and let the compiler remember that it's defined within the kitchen. If we also have a table defined in the parlor, it would be nice to be able to call it simply 'table' as well, and let the compiler sort out which is which based on context, and when there's not enough context, based on some new syntax that lets us tell the compiler which one we're talking about.

The details of the new feature:

1. Object names can be repeated, as long as a name is used only once at a given containment level.

kitchen: Room;
+ table: Surface;
++ box: Container;

parlor: Room;
+ table: Surface;
++ box: Container;

2. Within an object definition, the naming context is established by the object's location. Within a parlor method, a reference to 'table' is the parlor table; within a kitchen method, a reference to 'table' is the kitchen table. This extends inwards and outwards, so that 'table' means the parlor table in code belonging to the parlor itself, to the parlor table, or to the box on the parlor table.

3. When the location context doesn't resolve an ambiguous name, we can explicitly name the object path. For example, in the code for livingRoom, outside of the kitchen and parlor namespaces, if we want to refer to one of those tables, we have to call it 'kitchen.table' or 'parlor.table'.

4. Similarly, if we want to refer to something outside of the current context namespace, we can use an explicit path. E.g., within a kitchen method, we can refer to parlor.table.

5. Paths only have to qualify things as far as they're unique, so we can refer to parlor.box, for example - we don't have to write the full path to parlor.table.box.

6. Globally unique object names never have to be qualified when referenced. If there's only one 'box' object defined in the game, there's never a need to use a location path to refer to it, no matter what the context - 'box' can only mean that one object. This is crucial because it means that existing code is seamlessly compatible with the new naming system. All existing code necessarily uses a unique name for each object, since reuse was always an error before, so all object references in existing code are already unique without any location qualifiers.
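To make the six rules concrete, here's a minimal sketch of one way the lookup could work. It's written in Python purely as a model, since the TADS syntax isn't settled; the names Obj, resolve, and AmbiguousName are hypothetical, and the search order (the context object and everything beneath it first, then each enclosing object in turn, then the whole game) is an assumption about how "inwards and outwards" might be implemented, not a description of the compiler. It also assumes that rule 1 means "once per direct container".

```python
class AmbiguousName(Exception):
    """Raised when a name matches more than one object in the same scope."""


class Obj:
    """One node in the compile-time containment tree (hypothetical model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            # Rule 1 (as assumed here): a name is used once per direct container.
            if any(c.name == name for c in parent.children):
                raise ValueError(f"duplicate {name!r} in {parent.path()}")
            parent.children.append(self)

    def path(self):
        """Fully qualified path such as 'parlor.table.box'."""
        parts, node = [], self
        while node.parent is not None:  # stop at the unnamed global root
            parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts)) or "<global>"

    def descendants(self):
        for child in self.children:
            yield child
            yield from child.descendants()


def unique(candidates, name, where):
    """Return the single candidate called `name`, or None if there is none."""
    matches = [o for o in candidates if o.name == name]
    if len(matches) > 1:
        raise AmbiguousName(f"{name!r} is ambiguous within {where.path()}")
    return matches[0] if matches else None


def resolve(context, dotted):
    """Resolve a possibly qualified name as seen from `context`."""
    first, *rest = dotted.split(".")
    # Rules 2 and 6: try the context's own subtree, then each enclosing scope.
    scope, target = context, None
    while scope is not None and target is None:
        target = unique([scope, *scope.descendants()], first, scope)
        scope = scope.parent
    if target is None:
        raise NameError(f"no object named {first!r}")
    # Rules 3 to 5: each later element only has to be unique below the previous one.
    for part in rest:
        found = unique(target.descendants(), part, target)
        if found is None:
            raise NameError(f"no {part!r} inside {target.path()}")
        target = found
    return target


# The example world from rule 1, plus a livingRoom with no table of its own.
root = Obj(None)  # unnamed global scope
kitchen = Obj("kitchen", root)
kitchen_table = Obj("table", kitchen)
kitchen_box = Obj("box", kitchen_table)
parlor = Obj("parlor", root)
parlor_table = Obj("table", parlor)
parlor_box = Obj("box", parlor_table)
living_room = Obj("livingRoom", root)

print(resolve(parlor_box, "table").path())        # prints parlor.table
print(resolve(living_room, "parlor.box").path())  # prints parlor.table.box
```

In this model, resolve(living_room, "table") raises AmbiguousName, which corresponds to the case in rule 3 where the code has to say 'kitchen.table' or 'parlor.table'.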

Now, on to my syntax questions.

When I started thinking about this feature, it seemed natural to use "." as the location path operator. This overloads the "." symbol, which of course is also used for property evaluation, but there's no ambiguity because of the rule that a given symbol can be an object name or a property name, but not both. If an object name is on the right side of a dot, there's only one possible meaning. There's also an excellent precedent for this within the C++ language family, in that Java uses "." as the package namespace scoping operator, which I think is a fairly close parallel.

If we relied on C++ alone as our syntax model, we'd probably choose "::" as the operator, since C++ uses that as its namespace scoping operator. The advantage of using "::" instead of "." for TADS location naming is that there's no ambiguity in the syntax - anyone looking at a piece of code would be able to tell that they're looking at an object location path, without having to know anything about the object names involved. With ".", there's no semantic ambiguity, but there's syntactic ambiguity - you have to know how the names are defined to know whether a given "." means property evaluation or location scoping. So a casual reader looking at a piece of code without having studied the overall context might misinterpret it.

Of the two, "." or "::", I like "." better aesthetically. I do see some value in the clarity of using separate operators for scoping vs property evaluation, but "." just looks a little cleaner to me somehow.

The wrench in the works, though, is that we need a global scoping operator, and I don't think "." will work for that. The global scoping operator is needed for situations like this:

kitchen: Room;
+ table: Surface;
++ box: Container;

box: Container;

We have a box within the kitchen, and a box out on its own at the top of the location tree. If we refer to just 'box' anywhere within the kitchen object tree, the context rule will always give us kitchen.table.box. If for some reason we really want to refer to that top-level box instead within the kitchen context, we need a global scoping operator - we need a way to say "the outermost 'box'". In C++, we'd write ::box. I don't think there's an equivalent with Java packages - or, rather, I think the equivalent in Java is that you simply have to move that outermost 'box' into a namespace if you want to be able to refer to it from within another namespace. I'm pretty sure I want to have an explicit outer scoping operator, though, mostly because it makes things a little easier to isolate when writing extensions and libraries.
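The shadowing problem can be shown with a small model. This is a rough sketch in Python, not TADS: objects are listed by their full containment paths, the lookup function and its search order are assumptions, and the "::" prefix is borrowed from C++ only to have some way of writing "global scope" while the real syntax is undecided. For simplicity it handles only single-element global references like ::box.

```python
# Hypothetical flat model: every object is listed by its full containment path.
OBJECTS = ["kitchen", "kitchen.table", "kitchen.table.box", "box"]


def leaf(path):
    """Last element of a path, i.e. the object's own name."""
    return path.rsplit(".", 1)[-1]


def lookup(context, name):
    """Resolve `name` as seen from the object at path `context`."""
    if name.startswith("::"):
        # Explicit global scope: only a top-level object can match.
        bare = name[2:]
        return bare if bare in OBJECTS else None
    scope = context
    while True:
        # Candidates are the scope object itself plus everything beneath it;
        # the empty scope stands for the whole game.
        found = [p for p in OBJECTS
                 if leaf(p) == name
                 and (not scope or p == scope or p.startswith(scope + "."))]
        if len(found) == 1:
            return found[0]
        if found:
            raise LookupError(f"{name!r} is ambiguous: {found}")
        if not scope:
            return None
        scope = scope.rpartition(".")[0]  # step out to the enclosing object


print(lookup("kitchen", "box"))    # prints kitchen.table.box
print(lookup("kitchen", "::box"))  # prints box
```

From anywhere inside the kitchen tree, the context search finds kitchen.table.box before it ever reaches the top level, so the plain name can never reach the outer box; only the explicit global form does.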

Here are the options I see:

1. ::box, kitchen.box - use :: as the global scope operator only, and use . as the scope path operator. The upside is that most paths will contain only .'s because most paths won't need to shoot out to global scope, and when they do, it'll mostly be single element paths like ::box. The downside is the inconsistent representation of what are essentially two facets of the same operator; multi-element global paths like ::kitchen.box make this especially apparent.

2. ::box, kitchen::box - use :: as the scoping operator in all cases. It's consistent, but I find paths like kitchen::box somewhat less aesthetically pleasing than kitchen.box.

3. kitchen.box - use . as the infix operator, and don't allow reusing an object name at global scope. There's no need for a global scope operator in this case because all ambiguous names are inside unique root objects.

The more I think about it, the more this strikes me as a reasonable and maybe even desirable restriction. Allowing reuse at the global level seems like a recipe for confusion when mixing code from multiple sources, like libraries and extensions.

4. outer.box, kitchen.box - use a keyword to represent the global scope, such as outer, global, unnamed, or anonymous. In principle I kind of like this approach, but in practice I haven't been able to come up with a keyword that I like for it.

Any thoughts are welcome.