The Lone Lisp Heap

(matheusmoreira.com)

88 points | by stevekemp a day ago ago

27 comments

  • PaulHoule a day ago ago

    You can learn a lot developing a language and runtime but you will reach a point when you'll realize you can go back and do it all better.

    • matheusmoreira a day ago ago

      No doubt. The second attempt is always better. My initial plan was to write a complete first implementation in C so that it's always possible to bootstrap the language, then write a compiler inside the lisp itself, or write a Rust version. Hope I somehow manage to do it all before I die of old age.

      • exe34 a day ago ago

        Aha I've been semi-vibe-coding a scheme for esp32c3 and linux at the same time, and forgoing the libc too, so baremetal c. I went with a slightly awkward approach for allocation, with heap being seen as pages, and within pages, fixed size objects of size 8, 16 or 32 bytes. A pair is two words, and either bitpacked or pointer to another object. I'm not far in, vaguely following the Peter Michaux approach.

      • zabzonk a day ago ago

        > The second attempt is always better.

        Um, see The Second System Effect.

        https://en.wikipedia.org/wiki/Second-system_effect

        • matheusmoreira a day ago ago

          Lone itself is a second system: it's the spiritual successor of liblinux. I suppose the scope did increase... I'll try to be careful.

        • mpyne a day ago ago

          In fairness, with the waterfall methodology that pervaded back then, the "first" system you shipped was actually the second. "Build one to throwaway; you will, anyhow".

          • db48x a day ago ago

            Actual waterfall development was far more iterative than most people realize. If you want a primary source, I recommend Sunbust and Luminary by Don Eyles. It recounts the development of the software for the Apollo lunar lander.

    • cultofmetatron a day ago ago

      this is my favorite thing about agentic coding. its super easy to build a v1 and get the system to a point where it does what you want which means you learn what you need for the v2 much faster

    • mhb a day ago ago

      Also building furniture.

  • NetMageSCW a day ago ago

    Flashbacks to when I implemented a new storage system for a Lisp/Prolog system while porting the core code from Pascal to C. We stuck with linked list of pages so we could keep pointers and gain some speed over array indexing for every object access.

    • matheusmoreira a day ago ago

      That's cool! Can you post more details? In my benchmarks the contiguous heap dramatically improved cache utilization. How did you obtain higher performance with linked lists?

  • throwaway58670 a day ago ago

    Such a delight to read a technical article, and about something outside the hype cycle.

    I also liked the color scheme.

    • matheusmoreira 20 hours ago ago

      Thank you! I'm glad you enjoyed it. I'm already writing the next one!

  • Someone a day ago ago

    FTA: “Lone is a lisp interpreter written in freestanding C. There is no dynamic memory allocation in freestanding C. There is no such thing as malloc. There is no libc. There is only me and the code. If I wanted malloc, I would have to write it myself.

    […]

    mmap is awesome but Linux's got something even more awesome: mremap, which makes it almost trivial to manage the page-based heap.“*

    ⇒ they are using “freestanding” in a non-standard way. Usually, it means running on bare metal without an OS (https://en.wikipedia.org/wiki/Standalone_program: “A standalone program, also known as a freestanding program, is a computer program that does not load any external module, library function or program and that is designed to boot with the bootstrap procedure of the target processor – it runs on bare metal”

    • matheusmoreira a day ago ago

      Freestanding as in freestanding C. Lone passes the -ffreestanding -nostdlib flags. No external modules, libraries, functions or programs are loaded by the interpreter.

      It's true that lone doesn't run on bare metal. I chose Linux for pragmatic and philosophical reasons. I'm not sure lone would ever have printed hello world if I hadn't chosen to build on top of Linux, which is also the only kernel with a stable binary interface. No other system lets you avoid the libc.

    • shakna a day ago ago

      They're using "freestanding" as in "-ffreestanding" [0].

      "C built to compile with freestanding mode", rather than "C built to run bare metal".

      [0] https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html#in...

  • adelmotsjr 16 hours ago ago

    As a Brazilian, i'll let my nationalism escape a little bit to say that it is awesome to see such an incredible work done by a fellow brasileiro. This gives me the same feeling of awesomeness that I get when I see José Valim's work on Elixir.

    I only wish I could be as diligent and focused to do a thing like this, because I think it would be so much fun. I bought the Crafting Interpreters book a while ago, but still haven't got time to actually go through with it.

    Well, maybe one day...

    • matheusmoreira 13 hours ago ago

      Thanks, that's high praise! I highly recommend Crafting Interpreters. Bob Nystrom is among my favorite authors. You should read his blog too.

    • quibono 12 hours ago ago

      If I'm not mistaken there's also the author of Starlette / FastAPI

  • dullcrisp a day ago ago

    I don’t know too much about this, but strictly based on the name I assumed that you should use a heap. Is that not right?

    • matheusmoreira a day ago ago

      Do you mean heap, the data structure?

      https://en.wikipedia.org/wiki/Heap_(data_structure)

      That is indeed nowhere to be found in the article. Heap is also used to refer to dynamically allocated memory, which used to be implemented via a heap segment.

      https://en.wikipedia.org/wiki/C_dynamic_memory_allocation

      https://en.wikipedia.org/wiki/Data_segment#Heap

      • dullcrisp a day ago ago

        Yeah upon looking it up it appears to have nothing to do with the data structure.

        When they were doing a linear scan to find a large enough free block I was waiting for a heap to come into the picture, but apparently it’s not done that way.

        • matheusmoreira a day ago ago

          I think the term "heap" is used in the sense of "pile" which evokes the image of a somewhat disorganized heap of things you can grab in any order. Contrasts with the disciplined LIFO stack.

          Maybe actual heaps were used to implement this at some point. To be perfectly honest, I'm not sure.

          • adrian_b 19 hours ago ago

            The first use in computing of the word "heap", which refers to a data structure used for priority queues or for sorting, i.e. a kind of binary tree, was in an article published in June 1964, about the algorithm "Heapsort" (by John William Joseph Williams).

            The second use in computing of the word "heap", which has no relationship whatsoever with the other meaning, occurred first in the report about the programming language ALGOL 68, which was published in December 1968 by Adriaan van Wijngaarden et al.

            In ALGOL 68, "heap" is used as a keyword, to indicate that a new variable must be allocated in the heap, instead of being allocated in the stack, which is the default.

            So "heap" in the second sense, which is used by you and traditionally in UNIX, was coined to oppose "stack", as not being constrained by a LIFO allocate/free policy.

            The word stack had been popularized by another Dutch, Edsger W. Dijkstra, in May 1960, who explained how to use a stack and a stack pointer for implementing the blocks of ALGOL 60. So both "stack" and "heap", in the senses related to memory management, come from Dutch computer scientists, and they have been used first in connection with the languages ALGOL 60 and, respectively, ALGOL 68.

            Before ALGOL 68, the language IBM PL/I had used since December 1964 the terms "automatic storage" instead of "stack" (the source of the C keyword "auto") and "controlled storage" instead of "heap".

            John McCarthy published the description of the garbage collector used by LISP in April 1960, but he did not use any special word for the heap, because in LISP that was the only kind of memory used for data, so unlike in PL/I or ALGOL 68 there was no need to distinguish it from memory areas that were managed in a different way.

          • dullcrisp a day ago ago

            That’s the answer I found too. With pictures even https://stackoverflow.com/a/662285

  • a day ago ago
    [deleted]
  • mbrock a day ago ago

    [dead]