TADS 3 System Development


July 17th, 2010

IF UI Concepts


After reading a recent thread in the intfiction forum, which was a spinoff from an active discussion in Emily Short's blog, I'm thinking about ways to make the IF UI more ... uh, modern.

HTML TADS provides some nice tools with which to approach this goal. I don't think the feature set is anywhere near complete, but things like banners and clickable hyperlinks are clearly a good start.

The problem, of course, is that the player needs an HTML TADS interpreter. Windows users can download a game as a .exe, so no interpreter is needed. Mac users have to download and install an extra component (CocoaTADS), which is less desirable simply because they're not all going to be willing to do it.

So one question that pops into my beebee-sized brain is, might we be able to hope that someday Workbench will be able to compile a game to a native OS X app?

Another question is, are there features that could usefully be added to HTML TADS in order to create a UI that was, in some way, less command-prompt-centric? I'm not sure what those features might be; it's an open-ended question.

I'm not talking about point-and-click graphic adventures. I'm a text guy. What I'm musing about are ways to make the experience of a text game easier and more inviting. Clickable words in a room description are one example; I'm sure there are others that I haven't thought of.


May 27th, 2010

This time I'm getting into some inner details of the networking layer, so my usual disclaimer: to write a game you won't have to know how any of this works.  This is mostly for those interested in how this all works on the inside.

As I've been explaining, the http server that's feeding HTML/XML to the player's browser is part of the game itself, not an external Apache server or anything like that.  This is important because it means that a game's process lifecycle is the same as what we're accustomed to with writing conventional TADS games: the game and all its objects stay in memory throughout the game session.  Meaning that as an author you don't have to overhaul your whole programming approach to fit the transactional programming model more typical of web servers, where nothing stays in memory between requests.

But if you're already familiar with that web model, this raises a bootstrapping question.  If you're thinking of regular web programming, and you're hearing me say the game is an http server, you're probably imagining that our server machine will need to have a bunch of "t3run mygame.t3" processes started up at boot time, idly waiting for people to connect.  That's basically the way a normal web server runs - you set up an Apache daemon (httpd)  to launch at boot time, bind to port 80, and start listening for connections.  If the game is a web server, does this mean that it has to start at boot time and bind to port 80?  And if so, what happens if two people want to play at the same time?  Do we need one game on port 80, another on port 81, etc?  Obviously the answer had better be no, and it is.

In fact, there is no "t3rund" that runs continuously or gets launched at boot time.  Instead, t3run processes are spun up on demand.  For a stand-alone configuration, where everything runs on the player's PC (just like a conventional TADS game today), the technical details are pretty simple:

- the player double-clicks mygame.t3 on the desktop
- the OS launches "t3run mygame.t3"
- t3run loads mygame.t3
- mygame.t3 does the 'srv = new HTTPServer' setup work
- mygame.t3 calls connectWebUI(srv)
- connectWebUI opens the local browser and navigates to the game server
- the game server returns the initial HTML page for the game

The magic function is connectWebUI().  This is another net built-in, whose function is to connect the newly launched game server with the user interface.  This function is the key abstraction in making stand-alone vs Web play seamless to the game (and the game programmer).  The interpreter knows how the user is connecting, based on how the interpreter was invoked (and based on its capabilities - HTML TADS will continue to be for local play only, for example).  When the interpreter is invoked in stand-alone mode, which is the case when it's invoked from the desktop shell by a .t3 file double-click, it knows that "connect to the UI" means to open a browser window locally and connect it to the game's http server.  And remember that when I say "browser window" here, I really mean a customized TADS application frame with an HTML widget inside.  Although on some systems it might really be just a simple browser invocation - either way works; it's just a matter of how seamless it looks.

So from the user's perspective, they double-click a .t3 file, and a game window appears on the desktop.  Exactly like running a conventional game with the current system.

In the Web play configuration, everything is exactly the same from the game's perspective, but of course the user's experience is a little different, as is the underlying implementation.  Even in the Web play configuration, t3run is spun up on demand, so the question is, how do we demand it?  Clearly we need some kind of web server running.  I keep saying there's no Apache involved, but that's not entirely true; it is true that we don't need Apache involved once t3run is launched, but we do need some kind of conventional web server to do the launch in the first place.  So on a machine that hosts t3run, we'll need to set up a regular web server, and install a couple of php scripts alongside the t3run executable.  These scripts will be part of the TADS release, and shouldn't need per-system customization.  Web server installation should thus be very straightforward; it'll probably amount to some file copies and a little .htaccess editing.  What these scripts do is handle an HTTP request that's directed to the "launch t3run" URL on the server, by launching a t3run process instance, communicating with the new process to get its TCP port, and replying to the HTTP request with a redirect to the newly established t3run server.  The sequence of events is like so:

- the game's author puts a hyperlink on her home page
- the player finds the game page and clicks on the hyperlink
- the browser navigates to the web page in the link
- Apache runs the t3run launcher php script
- t3run starts up and loads the game .t3
- the .t3 program does the 'new HTTPServer()' stuff
- the .t3 calls connectWebUI(srv)
- connectWebUI communicates the port info back to the php script
- the php script sends a '301' redirect reply with the t3run server URL
- the browser navigates to the redirect URL
- the game server returns the initial HTML page for the game
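The port handoff in the middle of that sequence can be sketched in a few lines. This is a rough Python model, not the actual php launcher or t3run code: the names bind_ephemeral and redirect_response are made up for illustration, and it assumes the launcher needs nothing more than the host and port to build the redirect.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind a listening socket to port 0 so the OS assigns a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))          # port 0 means "pick any free port"
    sock.listen()
    port = sock.getsockname()[1]  # the port the OS actually assigned
    return sock, port


def redirect_response(host, port):
    """Build the raw HTTP redirect the launcher would send to the browser."""
    location = f"http://{host}:{port}/"
    return (
        "HTTP/1.1 301 Moved Permanently\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    game_socket, game_port = bind_ephemeral()
    print(redirect_response("127.0.0.1", game_port))
    game_socket.close()
```

Because each session asks the OS for its own port, two sessions started at the same time never have to agree on a port number in advance.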

Note that the first step, where you put a link on your home page, will probably be the only thing you need to "install" as an author.  That link will probably point to a generic, shared t3run server on some other machine (as we got into in the comment discussion for my previous post "Network services part 2").  So you probably won't need to host your own t3run server, although you'll certainly be able to if you have a hosting plan that's capable of it.

From the game's perspective, there's absolutely no difference between this and the local configuration - the game creates an HTTPServer object and connects it to the UI via connectWebUI().  From the player's perspective, it's also extremely simple: click on a link, the game comes up in the browser.   From the server's perspective, configuration is easy (just some packaged php scripts to install); t3run instances are created on demand, and bind to OS-assigned random ports, so the server machine can run as many simultaneous game sessions as its CPU and memory will permit.  TADS games tend to be pretty light on CPU averaged over time - the sense-path stuff can make them gorge on cycles in brief bursts, but overall they spend most of their time waiting for user input.  The limiting factor for scaling is probably memory; I'd guess TADS 3 games are mid-sized in web app terms.  I expect a minimal 128 MB VPS could easily handle 10-15 simultaneous sessions for a good-sized game.  I think I'm being conservative with that estimate; we'll see when I get a little further along with the implementation.

May 20th, 2010

I don't have any big updates to report right now; I'm just slogging away at getting a core Web UI built on top of the network plumbing I described last time.  In the process I've been adding to the http infrastructure as I run into things useful for the web UI, so it's in that "last 20% that takes 80% of the time" stage right now.

For example, I know some people will be concerned about the security implications of making network connections possible in an IF system, so one of the things I've added is a "network safety" option setting, basically parallel to the "file safety" options.  People coming to TADS from the neo-BSG universe who know about the apocalyptic dangers of hooking up even an RS-232 connection between two computers will be able to shut everything off if they so choose.

The Web UI part is far from done, but I know how it's *going* to work, so I can at least give you an outline of the design.  The basic idea is that the game program acts as a Web server, and serves up the UI through an HTTP connection.  When I run through this design with most people, they scratch their head and try to figure out where the Apache server or whatever fits in.  They picture an HTML page with the game's initial text sitting on a hosting server somewhere, and an Apache server serving up that page to the client browser.  Then somehow you get the game involved in sending text across a socket mumble mumble...

So the important thing to understand is that it's all self-contained in the game program.  The game itself is the Web server.  There's no separate Apache server and no separate HTML page to be installed on a hosting service; all of the HTML comes out of the game via its integrated HTTP server.  The HTML might be static or it might be entirely generated - in practice, it'll be a mix.  Static HTML pages, along with static JPEGs, PNGs, etc., can be served directly out of the resource bundle.  (One of the helper classes I've already built is a little "web server plug-in" (plugging in to the in-game web server) that serves resource files - so with a couple of lines of object declaration you can tell your game server to serve up any subset of your bundled resources.)  This is important when it comes to deploying games, because it means that a .t3 file is a complete Web deployment package, just like it's a complete download deployment package today.  That is, to publish your game to the Web, you just upload the .t3 file to a suitable location.  Depending on what I can arrange in terms of engine hosting, it might turn out that the .t3 file can be just about anywhere, so that a generic engine on my site can load your game from your site (or the IF Archive) via a URL.

Let's go a little more into the technical details of how the UI actually works.  The first step is that we have to load some kind of HTML into the browser.  This initial page can be anything you want it to be - that's one of the great benefits of having the whole thing integrated in the game engine - but there'll be a core Web UI start page in the library that sets up an initial window layout a la HTML TADS, with the browser interior devoted to an HTML transcript window.  So when you launch the game, the browser connects to the game server and requests that core UI start page, and the game server reads the page out of its resource bundle and sends it back to the client:

Browser -> Game:  GET ""
Game -> Browser: <html>...core UI page contents...</html>

The browser then loads up this page.  The default Web UI page has some on-load javascript that sets up the transcript window and sends an asynchronous XML request to the server for an "event".  This is known as an AJAX request - Asynchronous Javascript and XML - and it's the technology that sites with dynamic content use to get server updates without reloading the page.  For example, when you're typing a search string into Google, and it shows its list of suggestions as you type, that's AJAX at work: the page is sending your keystrokes to the Google server, the server sends back matching search terms, and the browser shows the reply data in a popup under the search box.

Browser -> Game:  GET

The game engine receives this request and holds onto it.  When there's something interesting to tell the UI, such as when there's some text to display in the transcript window or it's time to ask the user for a command line, the game sends back a reply with the event information.

Game -> Browser:  <?xml version="1.0"?><event><writeText>Welcome to TADS Online!</writeText></event>

The browser parses the XML and sends it back to the Javascript that sent the request in the first place.  The javascript processes it and carries out whatever the event says to do, in this case adding text to the transcript window.  The javascript finishes by queuing up the next event request.

Browser -> Game:  GET

When the game finishes displaying the intro text and wants to ask for input, it replies to the last getEvent with an input-line event.  Or if it wants a character, it sends an input-char event.  Etc.   The core UI page handles an input-line event by displaying an input editor and letting the user type in a command line.  (One of the little details that's taken some time is tweaking that command-line handler to look seamless on the different browsers.  This kind of thing is fiddly, so it's good to have in a reusable library.)  The user types in the command and presses enter...

Browser -> Game:  GET

The game server gets the input line text, sends an acknowledgment to the client, and passes the text back to the game for processing via the normal parser mechanisms.

Game -> Browser: <?xml version="1.0"?><inputLine><ok/></inputLine>

The game processes the text as normal, sends its response, and asks for another input line.   (Recall that we still have a getEvent request from the browser pending - that's what we're replying to with our first item below.)

Game -> Browser:  <?xml version="1.0"?><event><writeText>box: Taken.&lt;br&gt;ball: Taken.&lt;p&gt;&gt;</writeText></event>
Browser -> Game:  GET
Game -> Browser: <?xml version="1.0"?><inputLine><ok/></inputLine>

So you can see how the basic flow works.  It's one thing to see on paper, but quite another to see it actually working - I'm at that point in the lab, and hopefully I'll have something to demonstrate publicly before too long.
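For anyone who wants to poke at the message format, here's a small Python sketch of building and reading a writeText event like the ones in the trace above. It's only a model of the wire format shown in this post: the function names are hypothetical, and the real library does this in TADS on the game side and Javascript on the browser side.

```python
import xml.etree.ElementTree as ET


def make_write_text_event(text):
    """Wrap transcript text in an <event><writeText> message.

    ElementTree escapes <, > and & in the text, which is why HTML
    markup travels as &lt;br&gt; rather than as real XML elements.
    """
    event = ET.Element("event")
    ET.SubElement(event, "writeText").text = text
    return '<?xml version="1.0"?>' + ET.tostring(event, encoding="unicode")


def parse_event(xml_text):
    """Return (event_kind, payload_text) for a single-item event."""
    root = ET.fromstring(xml_text)
    item = root[0]                     # first child names the event kind
    return item.tag, item.text or ""


if __name__ == "__main__":
    message = make_write_text_event("box: Taken.<br>ball: Taken.<p>>")
    print(message)
    print(parse_event(message))
```

The round trip hands the original HTML fragment back unchanged, so the page script can drop it straight into the transcript window.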

There are a couple of big-picture things I want to point out.  First is that, as with the network-level plumbing, games won't have to deal first-hand with any of this, because the core Web UI library will take care of it all.  Games that just want the traditional transcript-and-banners type of UI will get a traditional print/inputLine/inputChar API for that.  The second thing is that games that want more control can tweak, revamp, or replace that library layer and write as close to the "metal" as they want, right down to the HTTP conversation with the browser, and right down to every line of HTML and Javascript that goes across the wire.  This is an opening up of the UI technology that's comparable to the evolution of the parser and execution engine going from TADS 2 to TADS 3 - what was once wrapped up tight inside the VM is moving out into the daylight of the library.  Granted, the "metal" in this case is an interpreted language (javascript/html dom) running inside a sandboxed app frame (a browser), but it's a heck of a leap in power nonetheless.

Next time: the experienced Web programmers among us have probably noticed that I'm leaving out a key piece of the puzzle for Web deployment, which is how you hook a browser up to this thing in the first place.  The answer isn't "run it on port 80", as that would never be scalable - every game would need its own IP address, and the world is already short on those as it is!  I had to work out the same details in an earlier unrelated project, so I have a proven solution.  The solution is also conducive to multi-player and collaborative systems, which is in fact how it was used originally.

May 10th, 2010

I've finally had a chance to do some of the foundational work for the networked TADS I've mentioned a couple of times.  This initial work involves the low-level plumbing for TCP/IP and HTTP support, and that's now basically done.  I think this makes TADS the first IF-specific language in which you can write a Web server.

There's an example at the end of this message showing a very basic, but complete, Web server written with the new infrastructure.  This server simply accepts file path requests and sends back the requested files.  Apache it's not, but it should give you an idea of how the new infrastructure works at the nuts-and-bolts level.

The example uses a couple of new intrinsic classes: HTTPServer, which encapsulates the code that binds to a port and listens for incoming connections; and HTTPRequest, which represents an HTTP request from a connected client.  An HTTPServer object contains a background thread that runs autonomously once started, so starting a Web server is simply a matter of creating the object.  An HTTPRequest is where most of your program interaction takes place.  This object contains the information the client sent with the request (and provides a method to parse a query-style URL string, with "?"  parameters), and has methods that let you send a reply, including headers, a status code, and the content body.

To keep the example short and to the point, I've left the loadFile() function as an exercise to the reader.   That's not a new intrinsic; it's just some mundane file I/O code written with the existing File class.  I'm not leaving out anything interesting there.

Now, I said this example gives you an idea of the nuts-and-bolts level of this stuff, but don't worry that game programming is going to turn into Web server programming.  It's not.  All of this HTTPxxx stuff will be buried in the library; most games won't have to even know it's there.  The library will provide a UI-event-oriented API on top of its network code, so most games will see a UI API that looks a lot like the current one.  But games that do want to get down into the networking details will be able to, which will give them a degree of control that's fairly unprecedented (in IF systems).  I can imagine using this to create extensions for things like collaborative play and multi-player games.

But the main motivation and initial use for this is obviously single-player, browser-based games that simply replicate the current playing experience in a browser across a network.  I'll talk more later about how we build that on top of this foundation.

If you know anything about network servers, you're probably wondering how a single-threaded system like TADS is going to handle a multi-threaded job like HTTP serving.  In fact, the new infrastructure is multi-threaded: an HTTPServer starts an autonomous thread to listen for new connections, and that thread launches session threads when clients connect.  This is all hidden away in the native code, though.  The byte-code program remains blissfully single-threaded.  Eric Eve should be particularly happy about this, because it means he won't be adding sections to the Getting Started about mutexes and race conditions.  (I don't think that would really help the infamous TADS 3 learning curve.)  The cross-thread coordination is achieved via a message queue; you'll notice in the example that the Web server is basically an event loop that reads messages from the new netEvent() API.  This design allows for efficient handling of the low-level networking, by assigning a separate thread to handle the network I/O on each connection, while keeping game code single-threaded.

So here's the example: presenting the world's first IF-language Web server (as far as I know)...

    /* set up the server */
    local ip = getLocalIP();
    local l = new HTTPServer(ip, nil, nil);
    "HTTP server listening on <<ip>>:<<l.getPortNum()>><br>";

    /* handle network events */
    for (;;)
    {
        try
        {
            /* get the next network event */
            local evt = netEvent();

            /* see what we have */
            if (evt.evType == NetEvRequest
                 && evt.evRequest.ofKind(HTTPRequest))
            {
                /* get the request, and parse the resource name */
                local req = evt.evRequest;
                local res = req.parseQuery()[1].substr(2);

                /* try opening the file */
                local f = loadFile(res);
                if (f != nil)
                {
                    /* found a file - send the contents to the client */
                    req.sendReply(f.body, f.mimeType);
                }
                else
                {
                    /* not found - send a 404 error */
                    req.sendReply('File not found', 'text/plain', 404);
                }
            }
        }
        catch (SocketDisconnectException sdx)
        {
            /*
             *   ignore these - they just mean that the client closed its
             *   connection before we could send the reply; their loss
             */
        }
        catch (NetException nx)
        {
            /* handler body not preserved; assume we stop serving here */
            break;
        }
    }

    "Shutting down the HTTP server...<br>";

May 8th, 2010

Jesse Welton sent me some email (in lieu of posting here) suggesting that the name for this feature could be better, specifically that it should emphasize the way these variables are 'inherited' in callees. I tend to agree.

His first suggestion was "dynamic environment variables", which isn't bad, but it could be a little confusing because of Unix environment variables. He also came up with "context variables" and "named dynamic variables".

A couple of ideas of my own: "public locals" (an oxymoron), "public arguments" (sounds like something you ought to deal with in marriage counseling), "super locals", "super arguments" (sounds like it involves throwing chairs on daytime television).

What we really need is a standard term of art for "a function and all of its callees", but I can't think of one.

I think I'm warming up to "context variables" - it's concise and it pretty well captures what they do.

Any thoughts on these, or other ideas?

May 3rd, 2010

Named arguments


Another new feature in the next update is something I'm calling named arguments.  It might not be obvious at first glance, but this is designed to be a convenience feature, to address a particular inconvenience that's always afflicted TADS programming.

Let's start with the mechanism - I'll get to the motivation in a bit.  The basic idea is that when you call a function or method, you can include one or more "named" arguments.  These are given explicit names in the call, with the syntax "name: value".

  doSomething(a: 1, b: 2);

The callee uses similar syntax to specify receiving the named values; they just leave off the value part.

  doSomething(a:, b:)
  {
    "This is doSomething: a=<<a>>, b=<<b>>\n";
  }

Since the arguments are named, you can put them in whatever order you want and get the same result: doSomething(b:2, a:1) is the same as doSomething(a:1, b:2).  You can freely mix named and positional arguments: doSomething(a: 1, b:2, 3, 4).

This sort of thing exists in several mainstream programming languages.  The main motivation in other languages is that it makes code more self-documenting by spelling out which arguments are tied to which parameter names, which is especially useful in functions that take gobs of arguments, such as your typical Win32 API.  But the TADS feature has two other important details that make it very different from the usual named argument mechanism.

First, a callee doesn't have to declare all of the named arguments a caller sent.  You can define doSomething as just doSomething(), and it'll work fine - no error from the compiler or run-time.  Second, named arguments "pass through" callees to their grand-callees, and their great-grand-callees, etc.  In other words, if doSomething() calls doSomethingElse(), and doSomethingElse() calls doAnotherThing(), you can do this:

  doAnotherThing(a:, b:) { "Hey! I inherited a=<<a>> and b=<<b>>!"; }

Perl is the only language I know off-hand with anything similar.  It has dynamic local variable scoping such that a callee inherits the local variables of its caller, grand-caller, etc.  Perl essentially treats the call stack the way C-like languages treat lexical scope.  TADS named arguments are basically like that, but only named arguments - ordinary local variable scoping continues to be C-like.

This probably seems a little bizarre and out of the blue, but it's the first decent solution I've been able to come up with to a problem that's vexed me since the TADS 2 days.  (At least, I think it's a decent solution.  We'll see if anyone else agrees.)  The problem is peculiar to systems like TADS where you primarily write a program by extending an existing class framework.  If class frameworks for GUI programming had caught on like they were supposed to, C++ might have tackled this problem by now; or maybe if C++ had tackled the problem class frameworks would have caught on.  But I digress.

As I said, you program in TADS largely by subclassing library classes and overriding methods.  If you want to describe a room, you create a Room instance/subclass, and you override the desc method.  When it's time to describe the room, the library calls your override.  It's an inversion of the traditional relationship between software libraries and their users: traditionally, the user code calls the library.  Here, the library calls user code probably about as much as vice versa.

So here's the basic problem I was trying to solve: the library often has a whole bunch of context information when calling one of these overridable methods; how much do you pass to the callee as arguments?  For example, when generating a room description, the library knows who's looking, how far away they are, what the light levels are like, what's in scope, etc.  Some or all of this information is sometimes useful in writing a room description; but mostly it's not.  Much of the time you just want to write out a simple static message.  It would be a huge pain if you had to write something like

  desc(pov, distance, brightness, scopelist, verb) { "It's a fairly boring empty room."; }

every time.  I mentioned that this problem goes back to TADS 2: those of us who used TADS 2 will recall that every verb handler had to have an actor parameter, and sometimes an other-object parameter:

  verDoPutIn(actor, io) = ...

That was tedious, which is why it was a top priority in Adv3 to find some other way to pass that command context information.  In Adv3, the solution was global variables - gActor, gIobj, etc.  And in fact that's the pattern that Adv3 uses with a fair degree of consistency for this kind of situation.  Some other examples are callFromPOV() and callWithSenseContext().  There's a lot of this kind of thing in the library:

  local oldFoo = gFoo;
  gFoo = newFoo;
  try { /* ...call the code that needs the new context... */ }
  finally { gFoo = oldFoo; }

It's not pretty, but it works.  Apart from the inelegance, though, it has a couple of serious downsides.  One is that it's such a pain to write all that code that there almost has to be a dedicated library routine each time the pattern is used, which bloats the library and is somewhat limiting to user code.  Another problem is that it's ad hoc and unstructured - there's no formal way of knowing if a global variable is valid at any particular time, so you might accidentally use an old value.  Third, this ad hoc approach doesn't protect against other code clobbering your globals unexpectedly - using the save-try-restore pattern ensures that you play nice and don't clobber other code's globals, but it doesn't protect you.

The named argument scheme is basically a structured replacement for that save-try-restore pattern.  It's plainly a lot more concise.  To me, at least, it fixes the inelegance problem; the syntax is straightforward and a fairly natural extension of the existing call syntax, and I think it's much easier to see at a glance what's going on than with the save-try-restore business.  Since the mechanism is uniform, you don't have to worry about other code playing nice - other code basically has no choice but to play nice, and can't unwittingly clobber your variables.  It also solves the problem of knowing if a variable is valid at any given time: the system will throw an error if a callee declares a named argument that doesn't exist anywhere in the call stack.  That lets you tell immediately that you're doing something wrong, without having to accidentally discover it via a nil deref or (worse) a stale value.

I don't intend to retrofit this into Adv3, even though there are a lot of places it would be a great improvement.  However, I am using this throughout the "Mercury" library I've mentioned here.  That'll be the real proving ground for it, and I expect that it'll be a big part of achieving the Mercury goals of being easier to learn and use.

April 30th, 2010

My previous post about dynamic features was about the new DynamicFunc object, which lets you create new code on the fly out of source code strings.  This time I'm going to expand on that a little.

First, you might wonder where macros fit in.  It's important that they do fit in somehow, since they're a big part of the adv3 library as well as the base system library.  The DynamicFunc class handles macros in a fairly straightforward way, by letting you specify them.  You simply provide a LookupTable in a particular format: each key is a macro name, and each value is a specially formatted list that contains the details of the macro's definition - basically a digested version of a #define directive.  Now, you can construct one of these macro tables yourself, which lets you not only compile code on the fly, but also create your own macro definitions on the fly for the code you compile.  But most of the time all you want is the original global macros for the main program, and happily, that's now available via reflection.  The Compiler object makes this completely transparent by plugging in the global macros as the default macro table for each compilation.  But it's worth knowing that you can override this and create your own macros if you want.

Second, you might recall from an earlier posting that the system's reflection services now give you access to local variables in the active call stack.  A very cool feature of DynamicFunc is that it can take one or more of these local variable tables as a parameter when compiling new source code, and put those locals in scope for the new code.  The effect is a dynamic-code analogy to the lexical scope access that anonymous functions have.  A static anonymous function can access locals from lexically enclosing scopes - that is, from the source text that surrounds the anonymous function's definition.  The parallel for dynamic functions is that they can access locals from enclosing stack scopes - i.e., the current function and callers of the current function.  The parallel continues: this kind of access effectively "detaches" locals from their stack frames, so that the dynamic function can continue to access enclosing locals even after the calling functions have returned, just as anonymous functions can continue accessing lexically enclosing locals after those callers have returned.  As with many of these dynamic abilities, this is especially of interest to library utility writers, because it opens lots of interesting possibilities for library functions that interpret expressions written in string form.
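Python happens to have a close cousin of this idea, which makes for an easy illustration. The sketch below is an analogy only, not the DynamicFunc API: compile_expr is a made-up helper, and it takes a snapshot copy of the caller's locals, which is simpler than the live "detached" access described above.

```python
def compile_expr(source, local_vars):
    """Compile an expression string that can see the given locals.

    Assumption: a snapshot copy of the locals is good enough for the
    illustration; later changes to the originals would not be seen.
    """
    code = compile(source, "<dynamic>", "eval")
    captured = dict(local_vars)

    def dynamic_func():
        # No builtins and no globals: only the captured locals are in scope.
        return eval(code, {"__builtins__": {}}, captured)

    return dynamic_func


def make_price_check():
    price = 12
    quantity = 3
    # Hand our own locals to the dynamically compiled expression.
    return compile_expr("price * quantity", locals())


if __name__ == "__main__":
    check = make_price_check()   # make_price_check has already returned...
    print(check())               # ...but its locals are still reachable
```

The point of the demo is the last two lines: the expression was written as a string, yet it still reads variables from a function call that has already finished.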

That mostly wraps it up for the new dynamic coding features.  I'm pretty excited about them; I think they're going to open up a whole new range of possibilities for extensions.  DynamicFunc is the big news, but I think once the system is out people will come up with lots of interesting ideas for combining it with the other new and existing dynamic features.

April 23rd, 2010

There are a couple of nice find-and-replace enhancements in the next release that probably won't seem huge at first glance but are really nice to have in practice.  These new features apply to both String.findReplace() and rexReplace().

First, a tiny one: the 'flags' argument is now optional, and ReplaceAll is the default if the argument is omitted.  This is a small thing, but 90% of the time you just want to do a global replacement, so this cuts down the typing in the common case.

Second, you'll be able to conduct a whole series of replacements in one shot, by specifying a list of patterns and a corresponding list of replacements.  For example, if you wanted to do a bunch of mappings of HTML to plain text, you could do something like this:

  s = s.findReplace(['<p>', '<br>', '<b>', '<i>'], ['\b', '\n', '', '']);

Each element of the pattern list corresponds to an element of the replacement list, so <p> is replaced by \b, <br> is replaced by \n, etc.

The algorithm is pretty robust, by the way, in contrast to similar features in certain other languages (I'm looking at you, PHP).  Replacements are by default carried out "in parallel", meaning that the function repeatedly looks for the leftmost match to any of the patterns, replaces it, and then proceeds with the remainder of the string.  The particularly important feature is that replaced text isn't re-scanned in this algorithm, which makes it genuinely different from doing a series of replacements, the way you had to in the past:

 s = s.findReplace('<p>', '\b').findReplace('<br>', '\n').findReplace('<b>', '').findReplace('<i>', '');

In this particular case this doesn't really matter, apart from the efficiency gain of not having to go through four separate searches and construct four separate strings.  But it does matter in cases where the replacement text from one pattern happens to contain search text from a later pattern.  Consider this:

 s = s.findReplace('a', 'b').findReplace('b', 'c');

That'll turn any 'a' in the original string into a 'c' in the final string, since the intermediate replacement of 'a' with 'b' will be re-scanned, and the second scan will change the b's to c's.  In contrast, this will simply turn a's to b's, and *existing* b's to c's:

 s = s.findReplace(['a', 'b'], ['b', 'c']);

In case you actually do want to do the replacement serially, there's a flag that says to do that.
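
To make the parallel-versus-serial difference concrete, here's a rough model of the two strategies in Python.  This is not TADS code and not the actual implementation; the function names parallel_replace and serial_replace are made up for illustration, and the tie-breaking rule (the earlier pattern in the list wins when two patterns match at the same position) is an assumption.

```python
def parallel_replace(s, patterns, replacements):
    """One left-to-right pass; text that has been replaced is never re-scanned."""
    out = []
    pos = 0
    while pos < len(s):
        # Find the leftmost match among all patterns, starting at pos.
        best = None
        for pat, rep in zip(patterns, replacements):
            if not pat:
                continue  # skip empty patterns so the scan always advances
            idx = s.find(pat, pos)
            # Assumption: on a tie, the earlier pattern in the list wins.
            if idx != -1 and (best is None or idx < best[0]):
                best = (idx, pat, rep)
        if best is None:
            break
        idx, pat, rep = best
        out.append(s[pos:idx])   # untouched text before the match
        out.append(rep)          # the replacement, which is not scanned again
        pos = idx + len(pat)     # resume after the matched text
    out.append(s[pos:])
    return ''.join(out)


def serial_replace(s, patterns, replacements):
    """The old approach: each pattern gets its own full pass over the result."""
    for pat, rep in zip(patterns, replacements):
        s = s.replace(pat, rep)
    return s


print(parallel_replace('aabb', ['a', 'b'], ['b', 'c']))  # bbcc
print(serial_replace('aabb', ['a', 'b'], ['b', 'c']))    # cccc
```

The serial version turns everything into c's because the b's produced by the first pass get picked up by the second; the parallel version never looks at its own output.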

The third feature is that regular (non-regex) replacements can be done with case-insensitive matching, and case-insensitive matches (regular and regex) can "follow the case" of the match.  Following the case means that the replacement text will be converted to match the upper/lower case pattern of the matched text.  For example, if the pattern is 'hello' and you do a case-insensitive match on 'HELLO', the replacement text will be converted to all caps; if it matches 'hello', the result will be lower-case; if it's 'Hello', the result will have an initial capital.

  s = 'This is this test of THIS function'.findReplace('this', 'that', ReplaceIgnoreCase | ReplaceFollowCase);

The result will be 'That is that test of THAT function'.
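
The case-following rule can be sketched in a few lines of Python.  Again, this is only a model of the three cases described above, not the TADS implementation: follow_case and replace_follow_case are hypothetical names, and leaving the replacement unchanged for mixed-case matches such as 'hELLo' is an assumption, since that case isn't covered above.

```python
import re


def follow_case(matched, replacement):
    """Recase the replacement to mirror the case pattern of the matched text."""
    if matched.islower():
        return replacement.lower()
    if matched.isupper():
        return replacement.upper()
    if matched[:1].isupper() and matched[1:].islower():
        # Initial capital only, e.g. 'Hello'
        return replacement[:1].upper() + replacement[1:].lower()
    # Assumption: any other mixed-case pattern leaves the replacement as given.
    return replacement


def replace_follow_case(s, pattern, replacement):
    """Case-insensitive literal replacement that follows the case of each match."""
    return re.sub(re.escape(pattern),
                  lambda m: follow_case(m.group(0), replacement),
                  s, flags=re.IGNORECASE)


print(replace_follow_case('This is this test of THIS function', 'this', 'that'))
# That is that test of THAT function
```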

The fourth feature is that the replacement text can now be given as a callback function rather than as simple text.  This is surprisingly powerful, and once you get the knack of it, it can really simplify code.  I've already applied it to a number of library functions that formerly had big gnarly loops that stepped through strings a character at a time; in some cases it gets big loops down to a line or two.

Here's an example that converts a string to title case.  The looping equivalent is rather tedious to write, but with a callback it's pretty easy:

   /* 'str' is the string to convert; this fragment belongs inside a function */
   local r = new function(match, idx)
   {
       /* don't capitalize certain small words, except at the beginning */
       if (idx > 1 && ['a', 'an', 'of', 'the', 'to'].indexOf(match.toLower()) != nil)
           return match;

       /* capitalize the first letter */
       return match.substr(1, 1).toUpper() + match.substr(2);
   };
   return rexReplace('%<(<alphanum>+)%>', str, r);
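
For comparison, readers who know Python may recognize the same callback-as-replacement pattern from re.sub().  The sketch below mirrors the title-case example under a few assumptions: the name title_case is made up, Python's \w+ stands in for the alphanumeric word pattern, and m.start() > 0 plays the role of idx > 1 because Python offsets are 0-based.

```python
import re

SMALL_WORDS = {'a', 'an', 'of', 'the', 'to'}


def title_case(text):
    def repl(m):
        word = m.group(0)
        # Leave small words alone unless they start the string.
        if m.start() > 0 and word.lower() in SMALL_WORDS:
            return word
        # Capitalize the first letter, keep the rest as written.
        return word[:1].upper() + word[1:]

    # The callback is invoked once per word match, left to right.
    return re.sub(r'\b\w+\b', repl, text)


print(title_case('the lord of the rings'))  # The Lord of the Rings
```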

April 20th, 2010

The really big new dynamic feature in the next release is dynamic compilation.  This is the ability to take a run-time string - something the user types in, something you read from a file, or something you construct within the program using string operators - and run it through the compiler as though it had been part of the source code all along, producing a function you can execute.

This capability will be familiar to anyone who's used Lisp, JavaScript, or another mostly-interpreted language.  C++ and other mostly-compiled languages don't tend to offer anything similar.  Up until now, TADS has been about halfway between those poles, but this new feature pushes TADS fully into the dynamic camp.

Dynamic compilation is quite easy to use.  You just run a string through the new Compiler object:

  local x = 'function(x) { return x*x; }';
  local f = Compiler.compile(x);
  local sq = f(10);

The string you give to the Compiler.compile() method uses almost the same syntax you'd use to write a function in the program's main source code.  The only difference is that the function has no name; instead, we just use the "function" keyword to introduce the argument list.  Instead of invoking the dynamic code by name, you invoke it via the object that Compiler.compile() returns.  This is a new type of intrinsic object called a DynamicFunc, which behaves in most respects like an anonymous function object.  To invoke it, we simply call the object as though it were a function, as shown in the last line above.

A DynamicFunc can be used anywhere an ordinary function pointer or anonymous function can be used.  A really interesting implication is that you can use a dynamic function with setMethod().  So not only can you create new methods of existing objects, but you can also create new methods out of thin air, without anticipating what they might look like when you write the program code.

  hallway.setMethod(&desc, Compiler.compile('method() { "The hall is very nice!"; }'));

Note that we've used the "method" keyword in place of "function".  The difference is that using "method" tells the compiler that you expect a "self" and related values to be available when the code is executed, because you intend to use the code as an object method.

Dynamic compilation has some other interesting capabilities that we'll talk more about next time.

April 15th, 2010

A few more convenience-oriented features for the next release...

- You'll be able to use << >> expressions in single-quoted strings.  This is the logical analog of what they do in d-strings: the compiler builds a concatenation expression that's evaluated when you evaluate the string.  For example, 'You are in <<me.location.name>>.' effectively turns into the expression 'You are in ' + me.location.name + '.'.

The thing that made it a no-brainer to add this was that I ran into a bug in the compiler where it was already interpreting << sequences in s-strings.  This always led to an error, so in the past it was basically impossible to write an s-string containing that notation.  That in turn means there's no compatibility danger with the new syntax.

- There's new syntax for optional arguments.  If you define a function or method like this:

   func(a, b?, c?) { return a+b+c; }

it means that 'a' is required, but 'b' and 'c' are optional: you can call this as func(1), func(1,2), or func(1,2,3).  If you leave out b and/or c, they get default values of nil (but you can also check argcount to see whether they're nil because they were missing or because they were explicitly specified as nil).  If you want to specify a default other than nil, use this syntax:

  func(a, b=2, c=3) { return a+b+c; }

You could formerly get a similar effect with "..." functions, but this is more convenient for cases where you have a fixed number of optional arguments, because it gives the extra arguments names, and it checks that you haven't specified too many arguments.

- New short-hand syntax lets you define a pre-populated lookup table very quickly and easily.  The syntax is based on the regular list syntax, but rather than just giving a list of elements, you give a list of key/value pairs:

  local tab = ['one' -> 1, 'two' -> 2, 'three' -> 3];

- A couple of String improvements.  The new splice() method does fairly complex edits in one go, by replacing a substring with a new substring.  You can append nil to a string and get back the same string.
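
As a rough picture of what a splice-style edit does, here's a small Python model.  The helper name and its parameter list (a 1-based starting index, a count of characters to delete, and the text to insert) are assumptions chosen for illustration; they are not taken from the TADS method's actual signature.

```python
def splice(s, idx, del_len, ins=''):
    """Delete del_len characters starting at 1-based position idx, then insert ins there."""
    start = idx - 1  # convert the 1-based index to Python's 0-based offset
    return s[:start] + ins + s[start + del_len:]


print(splice('hello world', 7, 5, 'there'))  # hello there
```

Deleting zero characters gives a pure insertion, and inserting an empty string gives a pure deletion, which is why one operation can cover several kinds of edit.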
