JIT-ing a stack machine (with SLJIT)

(bullno1.com)

37 points | by bullno1 6 days ago

9 comments

  • alexisread 2 days ago

    In my spare time I'm looking at JIT from a different perspective.

    https://github.com/dan4thewin/FreeForth2/tree/master

    This is a Forth with a few tricks, chiefly using flow control instead of a compilation switch flag. That, together with always compiling into an eval buffer before execution and the use of macros, lets you unroll a function/word/expression before execution, which makes it fast.

    Macros can be used to do stack caching (though it doesn't here) and cross compilation etc.

    Lastly, FreeForth caches the top two stack items in registers, so at compile time it can avoid emitting any code for swap by renaming the registers instead.

    This is all quite a different approach and somewhat language-specific. Just wanted to highlight the variety, as uxn is not actually that far from Forth and yet has such a different approach.
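
    The swap-by-renaming idea can be modelled in a few lines. This is only a rough Python sketch, not FreeForth's actual code: the register names r0/r1, the pseudo-instructions, and the Compiler and run helpers are all made up for illustration, and it assumes an x86-style two-operand "sub dst, src".

```python
# Toy model of compile-time stack caching with register renaming.
# Assumption: the top two stack items live in two registers ("r0", "r1");
# everything deeper lives on a memory stack. All names are hypothetical.

class Compiler:
    def __init__(self):
        self.tos = "r0"   # register currently holding the top of stack
        self.nos = "r1"   # register currently holding the next item
        self.code = []    # emitted pseudo-instructions

    def swap(self):
        # No instruction is emitted: only the compiler's bookkeeping changes.
        self.tos, self.nos = self.nos, self.tos

    def sub(self):
        # Forth "-" computes NOS - TOS. Compute it into the NOS register,
        # rename that register to TOS, then refill the freed register
        # from the memory stack.
        self.code.append(f"sub {self.nos}, {self.tos}")
        self.tos, self.nos = self.nos, self.tos
        self.code.append(f"pop {self.nos}")


def run(code, regs, memory):
    """Execute the pseudo-instructions to check the renaming is sound."""
    for ins in code:
        op, args = ins.split(" ", 1)
        if op == "sub":
            dst, src = args.split(", ")
            regs[dst] -= regs[src]
        elif op == "pop":
            regs[args] = memory.pop()
        else:
            raise ValueError(f"unknown instruction: {ins}")
    return regs


c = Compiler()
c.swap()          # emits nothing
c.sub()
print(c.code)     # ['sub r0, r1', 'pop r1']

# Stack before: ... 99 10 3 (TOS = 3 in r0, NOS = 10 in r1).
# "swap -" should leave 3 - 10 = -7 on top.
regs = run(c.code, {"r0": 3, "r1": 10}, [99])
print(regs[c.tos])  # -7
```

    Compiling "swap -" emits exactly as many instructions as a plain "-", only with the operand registers exchanged, which is the whole point of doing the swap in the compiler's bookkeeping.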

  • ivankra 2 days ago

    > The initial naive application didn’t even yield much gains. Only after a bunch of optimizations that it really shines: a 30-46% speedup compared to the computed goto interpreter.

    Looks like quite a lot of complexity for such a gain. 30-40% is roughly what context threading would buy you [1]. It takes relatively little code to implement: you only generate real machine code for jumps and conditional branches, and for every other opcode you just emit a call to the interpreter's handler. Reportedly, it took Apple just 4k LOC to ship its first JIT like that in JavaScriptCore [2].

    Also, if you haven't seen it, musttail + preserve_none is a cool new dispatch technique to get more mileage out of plain C/C++ before turning to hand-coded assembly/JIT [3]. A step up from computed goto.

    [1] https://webdocs.cs.ualberta.ca/~amaral/cascon/CDP05/slides/C...

    [2] https://webkit.org/blog/214/introducing-squirrelfish-extreme...

    [3] https://godbolt.org/z/TPozdErM5

    • blakepelton 2 days ago

      I wonder how tricks that rely on compiler extensions (e.g., computed goto, musttail, and preserve_none) compare against the weval transform? The weval transform involves a small language extension backed by a larger change to the compiler implementation.

      I suppose the downside of the weval transform is that it is only helpful for interpreters, whereas the other extensions could have other use cases.

      Academic paper about weval: https://dl.acm.org/doi/pdf/10.1145/3729259

      My summary of that paper: https://danglingpointers.substack.com/p/partial-evaluation-w...

      • ivankra 2 days ago

        Well, runtime/warmup cost seems like one obvious downside to me - weval would add some non-trivial compilation overhead to your interpreter (unrolling of the interpreter loop, dead code elimination, optimizing across opcode boundaries - probably a major source of the speedup). Great if you have the time to precompile your script - you only have to pay those costs once. It also helps if your host language's runtime ships with an optimizing compiler/JIT you can piggyback on (a WASM runtime in the weval paper, the JVM in Graal's case) - these things take space. But sometimes you might just have a huge pile of code that's not hot enough to be worth optimizing, and you would be better off with a basic interpreter (which can benefit from computed gotos or tail-call dispatch with zero runtime overhead). Octane's CodeLoad or TypeScript benchmarks are such examples - GraalJS does pretty poorly there.

      • naasking 2 days ago

        Partial evaluation subsumes a lot of other compiler optimizations, like constant folding, inlining and dead code elimination, so it wouldn't just find application with interpreters.
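
        To make that concrete, here is a minimal Python sketch of a partial evaluator for a tiny expression language. The tuple-based AST and the peval function are invented for this example and have nothing to do with weval's implementation. It shows constant folding and dead code elimination falling out of one specialization pass; inlining is not modelled.

```python
# Expressions are nested tuples (a hypothetical mini-language):
#   ("const", n), ("var", name), ("add", a, b), ("mul", a, b),
#   ("if", cond, then_branch, else_branch)

def peval(expr, known):
    """Specialize expr with respect to the variables whose values are known."""
    tag = expr[0]
    if tag == "const":
        return expr
    if tag == "var":
        name = expr[1]
        # A known variable becomes a constant; an unknown one stays residual.
        return ("const", known[name]) if name in known else expr
    if tag in ("add", "mul"):
        a = peval(expr[1], known)
        b = peval(expr[2], known)
        if a[0] == "const" and b[0] == "const":
            # Constant folding: both operands are known at specialization time.
            value = a[1] + b[1] if tag == "add" else a[1] * b[1]
            return ("const", value)
        return (tag, a, b)
    if tag == "if":
        cond = peval(expr[1], known)
        if cond[0] == "const":
            # Dead code elimination: only the taken branch survives.
            return peval(expr[2] if cond[1] else expr[3], known)
        return ("if", cond, peval(expr[2], known), peval(expr[3], known))
    raise ValueError(f"unknown expression tag: {tag}")


program = ("if", ("var", "flag"),
           ("add", ("var", "x"), ("mul", ("const", 2), ("const", 3))),
           ("var", "y"))

print(peval(program, {"flag": 1}))          # ('add', ('var', 'x'), ('const', 6))
print(peval(program, {"flag": 0}))          # ('var', 'y')
print(peval(program, {"flag": 1, "x": 4}))  # ('const', 10)
```

        With flag known, the untaken branch disappears and 2 * 3 is folded to 6, without a separate pass written for either optimization.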

  • Rochus 5 days ago

    Very interesting article, thanks for sharing. I'm still considering using SLJIT for my Micron interpreter (https://github.com/rochus-keller/Micron), which is a stack machine as well; but given the relatively low speed-up I still doubt whether it's worthwhile. It should then also support some kind of debugger (not only for JIT development, but for the user of the jitted language), which is apparently not yet supported by SLJIT.

  • rurban 2 days ago

    That's why I hate stack machines so much. Look at Lua for a proper register machine.
