Writing Portable ARM64 Assembly (2023)

(ariadne.space)

59 points | by luu 4 days ago

27 comments

  • t-3 a day ago

    > You will also need to be aware of minor differences between the Darwin ABI and other platform ABIs. A notable example is that the x18 register is reserved by the Darwin ABI and is explicitly zeroed on context switches in some cases. This register is also reserved on Android, but not on GNU/Linux or Alpine.

    x18 is "the platform register", reserved for the OS. The ISA manual says not to touch it unless you know what you're doing. Also, I don't know for certain, but I could believe that Android and non-Googly Linux use different ABIs (though probably not, since pretty much everyone uses the same ABI on aarch64 from what I've seen). But surely Alpine is Linux and has the same ABI as other Linuxes.

    • Joker_vD a day ago

      You know, it always rubs me wrong when I'm reading an ISA manual and it tells me how I am supposed to use general-purpose registers. Why do ISA designers even believe they're in a place to design the user-level ABIs? Like, sure, you've hardwired BL and RET to use x30, that's fine. But every other register? If I want to pass return values in x21 and x23, that's none of your business.

  • rurban 21 hours ago

    I have so many ifdef __APPLE__ hacks in my assembler/compiler. And they were much harder to fix when I had no Apple Silicon available, only GitHub Actions and some models and gas sources.

    But basically it's similar to ELF vs. COFF. Windows also uses several modern alignments and a shadow stack to make life easier. But ARM has so many more scratch registers, which is much more fun.

  • steve1977 a day ago

    > It just requires being aware of a few differences between the Mach-O and ELF ABIs, as well as knowing what Apple-specific syntax extensions to avoid.

    And completely ignoring PE and Windows on ARM.

    • makerofthings a day ago

      I have been very successfully ignoring Windows on Arm since it first appeared :)

      • steve1977 a day ago

        I understand the sentiment ;) But IMHO the title of the article is still a bit misleading or incomplete.

      • commandlinefan a day ago

        I mean, so has Microsoft, so...

        • steve1977 a day ago

          I like to complain about Microsoft as much as anyone, but this is simply not true. At least not since the "second coming of Windows on ARM".

  • mockbuild 20 hours ago

    ORPort is not reachable in x86_64 arm overlay network.

  • crest a day ago

    How is this code portable to other platforms if it assumes that clang implies macOS?

  • RomanVoropaev a day ago

    [flagged]

  • pjmlp a day ago

    Assembly was never portable, or do you think 6502 code on an Apple would work out of the box on a Game Boy or C64?

    • whobre a day ago

      In 1981, one could write a Z80 assembly program for CP/M and it would run on thousands of different computer models.

      • pjmlp a day ago

        As someone that was alive back then already, only if you never touched the hardware directly outside the CPU.

        Even the PC clones didn't have anything like portable assembly if you ventured outside the 10h and 21h interrupts.

        • whobre a day ago

          Right. I am saying there is a difference between portable and non-portable assembly code. If you interacted with the machine via the call 05h interface, it was portable. If you accessed the computer’s video memory buffer directly, it wasn’t.

          • MomsAVoxell a day ago

            Good portable assembly would stub the system stuff off anyway, and once that was done for the CPU class in focus, it was very possible to have a thin HAL and write portable code. A great many successful products of the era were written in pure assembly this way.

            In any case, you could also get high performance multiplatform video/io assembly libraries on the market, soon enough, back in the day .. it begat a lot of Delphi units too, I seem to recall ..

        • MomsAVoxell a day ago

          I did it many times on CP/M and also DOS.

          Never ‘touching the hardware’ was attainable for a great many assembly programs.

          You could do a lot with 10h and 21h on DOS.

          • pjmlp a day ago

            Yes, not much for games though.

      • steve1977 a day ago

        of different CP/M computer models though, no?

    • tieze a day ago

      Would be tricky to have your 6502 code running on a Game Boy as it has a Z80/8080.

      • pjmlp a day ago

        Should have said NES.

    • foldr a day ago

      The article is perfectly clear:

      > The good news is that it is very easy to write assembly which targets Apple’s computers as well as the other 64-bit ARM devices running operating systems other than Darwin.

      • pjmlp a day ago

        As long as nothing outside the CPU itself gets used.

        • foldr a day ago

          Obviously. Who could possibly be writing ARM assembly code who is not also aware that system calls, etc. will vary across platforms?

          Sometimes you do seem to make negative comments just for the sake of it.

          • pjmlp a day ago

            So it isn't portable after all...

            • mdp2021 a day ago

              There are portable (or not) software applications, portable algorithm code, and non-portable algorithm code.

              Assembly-based software for the Motorola 68000 /and/ the Amiga will not run on a 68000 Mac. A "polynomial based packed fastSin()" subroutine written using the XMM x64 registers will work on most x86 CPUs. The same written for the ZMM registers will not work on a number of x86 CPUs.

              Surely 40 years ago we could easily assume talking about specific implementations of software applications; clearly today we see problems as being optimizable on some classes of platforms (e.g. the vectors in ARM work differently from the vectors in x64 CPUs).

            • foldr a day ago

              If you can understand what someone means when they talk about a “small elephant”, then you can understand what they mean when they talk about “portable assembly”. In this case, the relevant point is that you can write ARM64 assembly routines that do useful work (e.g. optimized matrix multiplication, or something like that) in such a way that they’ll work correctly on a number of different ARM64 platforms.