Embedding Scheme for a game mission scripting DSL

31 August, 2014

Motivation

For the next version of The Spatials I wanted to improve the small missions you get when you visit a new alien world. Art and content updates are already underway with new environments like space stations or caves. But a big part of the experience is the “what to do”, the goals the game gives to you and the guidance to perform them.

The Spatials is already very heavy on random generation. The galaxy and star systems are random, the planet surfaces are random, etc. In version 1.0 the missions were also random for the “bounty” planets outside of the campaign.

For version 2.0 we are removing the campaign and instead we are going to give every planet a hand crafted mission, which will automatically adapt to the random surface and features of the planet. This is not a small task. Every random galaxy has 150 to 200 planets. Fully scripting 200 missions, plus giving the code the ability to adapt to any possible generated surface, is a daunting task. The scripting thus has to be as simple and expressive as possible, and its interpreter smart enough to not need much hand-holding on how to translate it to the in-game structures.

The prototype DSL

Before I even considered the “how”, I played with the “what”. What would be a simple, almost purely declarative script with some structure? The first sketches looked like this:

camp crash-camp
camp pirate-camp near crash-camp

actor ship kind "Crashed spaceship" camp crash-camp
actor robot kind "Robot" camp crate-camp
actor pirates kind "Pirate" n 5 camp pirate-camp

dialog ship
    "The black box indicates this ship was an unmanned transport drone. There's a radio beacon."
    option "Follow the beacon"

spawn ship pirates
mark-goal ship
wait-for ship option "Follow the beacon"

From this I identified two clear parts to the script: a declarative prelude, laying out all the actors and data which are featured in the mission. And at the end a not-so-declarative part, which encodes some state changes in the map (“spawn”, “mark-goal”) and makes the game logic wait for some action that requires player interaction (“wait-for”).

Why Lisp/Scheme?

At this point The Spatials was a 100% C++ application. I toyed at some point with scripting engines but it wasn’t until this point that I truly identified the need for external scripting. It had to be the combination of a very expressive language, very fast to iterate (including in-game reloads), and most importantly, that made the structure and expression of the program part of its own data. It had to be Lisp!

I’ve never programmed professionally with Lisp or any of its dialects until now, but I’ve toyed a bit with Clojure. I’ve found Lisp fascinating for years and I am very glad I finally took the plunge in a real project. Here are some reasons on why I picked Lisp/Scheme:

Your code is just structured data, and actually it’s only code if you choose to point the evaluator to it.
On-the-spot DSL are trivially to implement with macros, including sub-specialized DSL that are only valid inside some expressions.
The implementations of basic Lisp interpreters are easy to read and understand. Lisp only really “clicked” for me when I read the C source of TinyScheme.
The “small Lisps” are truly small, unlike, say, Lua (Update: I am wrong here. Lua IS truly small, smaller than some of the Lisps I considered).
I’ve been looking for an excuse to learn Lisp for real, and this was my chance.

So I required a small, embeddable Lisp interpreter, that ran on Mac OSX, Windows (compiled with MSVC) and iOS. In the process of choosing an implementation I tested a few interpreters.

___ The “midweights” ___

Heavyweights like Racket were out of the picture. I also tested Chicken Scheme and Gambit. Although there are some success stories of their support for iOS, they were large and heavy enough that not having iOS as a primary supported platform meant a lot of extra work to get them working (including multi-hour compilations because of bugs in Clang). Basically these implementations are fully featured software platforms, not just a Scheme interpreter. They felt more like I was embedding my game into them than the opposite. I needed to go smaller!

___ femtolisp ___

My first, and favorite choice, was femtolisp, by @JeffBezanson of Julia fame. femtolisp was a very easy to understand implementation and had some optimizations common in bigger interpreters like using a single machine word for packing integers plus pointers. It also had native hash tables, full utf8 support, a cvalues module in the style of Python, etc. It was really promising but it failed in the portability department since it hardcoded dependencies like UNIX sockets which are not available under MSVC.

___ femtolisp (Julia version) ___

Julia uses femtolisp to implement its parser, and its source tree contains a full copy of it, which supports MSVC! Unfortunately a new hardcoded dependency appeared in this version, libuv. Not wanting to add yet another source blob to my project, specially one that touched low level platform APIs, I decided to move on from femtolisp.

___ libparen ___

libparen is basically a s-expression parser with evaluation support. No GC (uses std::shared_ptr), code takes ten minutes to read and fully understand, virtually zero dependencies other than an up to date C++ compiler. Unfortunately I found a bug that was compiler dependent (which I reported) and it didn’t give me much faith on how tested/deployed it has been. But with some development it looks like a worthy successor to TinyScheme for the title of simplest (usable) Lisp interpreter.

___ S7 Scheme ___

At this point it was very clear that I was going to choose TinyScheme, but before finally going into it, I tested its bigger brother, S7 Scheme. I found the C API of S7 WAY better than TinyScheme. Plus it’s a much bigger and battle tested project, with features only present in the mid-to-big Schemes like a full numeric tower (fortunately optional). In the end I didn’t pick it up because of the absurd compilation time under Clang: the source code is a single 2MB C file, which Clang doesn’t like very much.

___ TinyScheme ___

TinyScheme is the granddaddy of small Lisps. It’s based on an older project, MiniScheme, dating back to the 80s. Single C file, GC, starts immediately, minimal platform dependencies (just stdio/stdlib), and support for MSVC meant it was the perfect choice in the end.

__ Bridging C++ and TinyScheme __

C++, unlike C, is a way too big and weird animal to expose everything into an embedded interpreter. More importantly it lacks any reflection system. Since my goal was to make a declarative DSL to fill my C++ instances with data the very first step was to make some reflection/marshalling system available to TinyScheme. And this means making compromises. I can’t even begin to think how to systematically express in Scheme, much less preproc-/ compile-/ running-time reflect something like this madness.

Fortunately (or unfortunately), like in many C++ projects, I already made a choice a year ago on what I considered a sane subset of C++, with the express purpose of making an automated as possible reflection layer for my classes and instances. I cut and cut and cut, and I succeeded, at a huge cost (no collection-owned instances for example, so way less RAII goodness than I would like). The upshot was that from the beginning I had full serialization support for my game model, and a few months later I even managed to make an unholy C preproc macro that allowed me to do single-point definition and reflection for class members, forgetting forever about the typical problem of adding a class member and not updating your marshaller for the class.

This existing reflection support, initially meant for saving games, was perfectly suited for the task at hand. The first (and basically only) FFI calls I expose to TinyScheme are functions to create instances of a named C++ class and set and get their members. Those FFI calls perform typechecking and bark when required. For pointers I use another system already in place that can track any reflective C++ class with an integer handle. This is how the FFI calls look like:

(define effect-create-building (game-make-persisting "Effect"))
(game-set! effect-create-building "createBuildingKind" "BuildingKind_ServiceDecorRocket")
(game-set! effect-create-building "createBuildingKindTag" tag-building)
(game-set! effect-create-building "createBuildingRect.w" 1)
(game-set! effect-create-building "createBuildingRect.h" 1)
(game-set! effect-create-building "assingPivotByTag" tag-building)
(game-set! effect-create-building "assignPivot" pivot)

Here a new C++ instance of the class Effect is created, and then the code sets some of its members. The returned value from game-make-persisting, and others vars like pivot, are just an integer handle that is automatically transformed into an instance pointer by the FFI call, thanks to the type info encoded in the reflector for the class.

__ The final DSL__

Now that I decided to base my DSL on Lisp syntax I had to update my prototype into a language that could be interpreted directly by TinyScheme. With the FFI system it was just a matter of encoding the declarations of the DSL into the appropriately setup C++ instances. In the C++ game model everything is an object, including seemingly “procedural” things like the act of creating a building in some map position. That’s what the previous FFI code sample did, and in the DSL is expressed as just:

(actor ship building "BuildingKind_ServiceDecorRocket" camp crash-camp)
...
(mission
    (spawn ship)
    ...

The spawn call doesn’t just prepare the Effect instance, it also deals with translating the declarative fields in the actor into the correct parameters, generates unique tags to keep track of the created building inside goal conditions, creates and attaches the required instances if the object as an associated dialog, etc. Here is a snippet of the Scheme code for one of the internal helpers of spawn:

(define (spawn-building tag sym kind camp-sym camp-point data)
    (let* ((effect (game-make-persisting "Effect"))
        (dialog-data (assv-v 'dialog data)))
        (game-set! effect "createBuildingKind" (game-get-named-persisting kind))
        (game-set! effect "createBuildingKindTag" tag)
        (game-set! effect "createBuildingRect.x" (car camp-point))
        (game-set! effect "createBuildingRect.y" (cadr camp-point))
        (game-set! effect "createBuildingRect.w" 1)
        (game-set! effect "createBuildingRect.h" 1)
        (fill-spawnSpill effect camp-sym)
        (cond (dialog-data
                (game-set! effect "assingPivotByTag" tag)
                (game-set! effect "assignPivot" (make-pivot (assv-v 'dialog data)))))
        (add-effect effect)))

I make extensive use of association lists due to the lack of native hash tables. Since the interpreter only runs while loading a planet I didn’t care much about performance. The DSL-to-FFI calls code is around 300 lines of newbie Scheme code.

Here’s a fully working example of the final DSL:

(camp crash-camp)
(camp crate-camp near crash-camp)
(camp pirate-camp near crash-camp)

(actor ship building "BuildingKind_ServiceDecorRocket" camp crash-camp)
(actor crate building "BuildingKind_PirateCrates" camp crate-camp)
(actor robot building "BuildingKind_Bust01" camp crate-camp position crate)
(actor pirates agent "TEST" n 5 camp pirate-camp)

(dialog ship 
    0 (body "The black box indicates this ship was an unmaned transport drone with a classified cargo crate, which emits a beacon. There's also a nearby strange heap of debris, which doesn't appear to belong to the crashed ship."
        a "Follow the beacon" a->page 1 a->tag choosen-beacon)
    1 (body "The crate emits a beacon, follow it"))

(dialog crate 0 (body "The crate is shaking!" a "Poke it carefully" a->tag poked-crate a->page #f) )

(dialog robot
    0 (body "You come to rescue me! Now I can stop hiding! I was in that crashed transport ship and survived the landing. Those damn pirates stole my leg and, uh, they are kinda intimidating. Can you get it back?"
        a "Sure" a->page 1 a->tag accept-robot)
    1 (body "Thanks! They are just over there.")
    2 (body "Perfect! Now I can call a Lemurian taxi and get outta here."))

(mission
    (spawn ship crate pirates)
    (mark ship)
    (->step "Investigate the ship" (choose-option 'choosen-beacon))
    
    (mark-only crate)
    (->step "Investigate the crate" (choose-option 'poked-crate))
    
    (despawn crate)
    (spawn robot)
    (mark-only robot)
    (->step "Talk to the robot" (choose-option 'accept-robot))
    
    (mark-only pirates)
    (->step "Kill the pirates" (kill pirates))
    
    (mark-only robot)
    (page-to robot 2)
    (->step "Talk to the robot" (open-dialog robot))

    (end)
)

Like all abstractions it is a bit leaky, but still worth it. Creating the same mission in C++ would have needed more than an order of magnitude extra lines of code. With some boilerplate it could have been reduced somewhat, but writing a hundred or more little missions would still be a nightmare. Now it’s just an approachable and fun problem!