I find the goals of explicitness and maintainability to be really pitched to my current taste. From a quick view it looks like the syntax is approaching a local maximum for conforming to expectations and not sacrificing the explicitness sought.
As the developer, where do you land on meta-programming for the language? I applaud the straight up nature of ‘the battery will never be included’ and the reminder to consider the possibility of a feature being a library instead of a syntax or language feature. I certainly don’t think meta-programming is essential, but the ability can contribute to the ease of use for library code.
And I’ll ask now since it always comes up: where does Mach stand on ‘advanced’ type theory uses for ‘low-level’ programming? I noticed the admonition that safety is the developer’s job, which is sure to bring some ‘heat’ from the memory-safety-is-table-stakes crowd. In light of that, where does Mach stand regarding ways to ensure ‘safety’?
I played around with an entire branch of the language that used a Zig-style metaprogramming type system and it ended up just getting in the way more than it helped.
The only thing I may consider in the future is something along the lines of a VERY weak macro system, but that's a thoroughly explored and well-known footCANNON that I'm not keen on implementing. I do have designs in mind if it ever becomes essential, but for the foreseeable future, any sort of metaprogramming is practically off the table.
On safety... That's a big one. I'm absolutely prepared to take smoke for that all day and I know I will be. Memory management is one of those problems that every programmer has to deal with at some point in their career, whether it's fighting a GC, fighting a borrow checker, or fighting poorly written code. I decided to go "back to basics" and encourage programmers to fight their own knowledge of the system and to be more aware of the code they are writing. Yes, that means that it can be more difficult to write code that is... "safe"... but when it comes down to it, Rust and C are both compiled to CPU instructions that have no concept of memory management intrinsically. Taking the load off the developer has the side effect of removing total control with it. In my opinion, it's a situation of throwing the baby out with the bath water for a benefit that can be fully achieved through best practices and proper understanding of the code you're writing. Does that help clarify my stance? I hope it does lol
Glad you took the time to explore the project! Thank you for that!
I am also working on a programming language with that no-hidden-behavior philosophy at the forefront (along with a few other things), but radically different on everything else, so I may have a few design choices to question.
I think your conception of cleverness is basically too wide. I agree that any new programming language today should stop people from C-style "cleverness". But there are languages where the clever programs are actually the pretty ones. For example, the borrow checker in Rust encourages good code design (though it could still be better). And most importantly, algebraic types constitute a radical improvement over classic C-style code. You may think that they hide what's happening, as they have a hidden tag and could be built from C structs and C unions. But after thinking a lot about it, I've concluded that thinking about the tag is actually not thinking about the program's behavior. There is a difference between the program you want to write and what needs to happen on the machine to compute it. I could probably write a lot about the solution I found to reconcile both views, but my project is a lot newer, so I can't point you to anything yet. Still, I think that by wanting a more grounded language, you may be dismissing too many ideas that would actually encourage programs to be clear and transparent without doing anything automatically by magic (and that would, at the same time, reject incorrect programs that C can express but that machines cannot execute).
Btw, I understand that I am probably not the target of your language, so I won't expand much further unless you want.
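To make the "hidden tag" point concrete, here is a minimal C sketch of a sum type built by hand from a struct and a union, as the comment describes. The `Shape` type and its helper names are hypothetical illustrations, not anything taken from Mach or from the commenter's language.

```c
#include <assert.h>

/* The tag that an algebraic type would normally keep out of sight. */
typedef enum { SHAPE_CIRCLE, SHAPE_RECT } ShapeTag;

/* A hand-rolled sum type: an explicit tag plus a union of payloads.
   All names here are hypothetical and for illustration only. */
typedef struct {
    ShapeTag tag;
    union {
        struct { double radius; } circle;
        struct { double width, height; } rect;
    } as;
} Shape;

Shape shape_circle(double radius) {
    Shape s = { .tag = SHAPE_CIRCLE };
    s.as.circle.radius = radius;
    return s;
}

Shape shape_rect(double width, double height) {
    Shape s = { .tag = SHAPE_RECT };
    s.as.rect.width = width;
    s.as.rect.height = height;
    return s;
}

/* Every consumer has to switch on the tag by hand. Nothing in the
   language stops code from reading the wrong union member. */
double shape_area(Shape s) {
    switch (s.tag) {
    case SHAPE_CIRCLE:
        return 3.141592653589793 * s.as.circle.radius * s.as.circle.radius;
    case SHAPE_RECT:
        return s.as.rect.width * s.as.rect.height;
    }
    assert(0 && "unknown tag");
    return 0.0;
}
```

Calling `shape_area(shape_rect(2.0, 3.0))` yields `6.0`. The bookkeeping around `tag` is the part a language with built-in algebraic types would generate and check for you.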
You may not be the target, but that does not invalidate that point. My personal view of what is and isn't hidden behavior is definitely something that will rub a LOT of people the wrong way. That being said though, this project has grown out of the scope of "fun personal toy" and is intended to be embraced by other developers. If those developers don't want to use the language, then that could be viewed as a fault of the language.
This entire system was designed by ONLY myself up to this point, so I've effectively had blinders on the entire time. I personally like the drastic simplicity of the language, but if it gets in the way of the majority of the users who want to work with the language, then at a very minimum, a conversation should be had about the pain points. Finding that balance between being appropriately hard-headed and willing to adopt features I may not have originally intended to adopt is a difficult line to walk.
This is an awesome achievement. I am not so sure how you made it 4x slower than C, since it strikes me that you have a 1:1 mapping to asm; perhaps vector benchmarks are unfair for this kind of comparison.
To those who criticize the use of LLMs in projects like yours, I demur: where there are productivity gains to be had, LLMs can make small niche FOSS projects like yours more viable and less drudgery.
I prefer a concise language btw:
    say “hello, raku”;
    say (0, 1, *+* ... *)[10];
This is more by way of saying let 1000 flowers bloom. (The main strength of Raku is its built-in Grammars https://slangify.org)
Thanks! The speed is almost entirely a dramatic optimization difference. From what I can tell, most of that difference comes from vectorization (like you stated) and other things like target-specific machine code optimization happening post-codegen.
Mach will eventually get to that point for most targets (or as close as possible). I stated 4x slower as the worst case I've observed so far. In some cases, we were only about 1.3x slower which is fantastic IMO.
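As a rough illustration of the vectorization gap mentioned above, the C loop below is the kind of code that compilers such as clang or gcc will usually auto-vectorize at their higher optimization levels. A straightforward code generator would instead emit one scalar multiply and add per element. The `saxpy` function is a generic textbook example, not one of the benchmarks referred to in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for every element.
   `restrict` promises the compiler that x and y do not overlap, which
   makes it easier to process several elements per SIMD instruction. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y) {
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

The source is identical either way. The speed difference comes from how many elements each emitted instruction handles.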
LLMs were critical for me to be able to do this. I did not understand how compilers worked when I started this project. Once I was able to understand it, having an LLM churn out the enormous amounts of code was the only way for me to ever have been able to practically release a working codebase in a reasonable timeframe. It leaves a bad taste in people's mouths (even mine), but it's an invaluable tool that simply can't be overlooked these days. To say Mach was "vibe coded" is completely incorrect. To say it was largely "implemented" (typed out) by an AI is far closer to reality. I plan to use that tool for the rest of my career.
That's okay if you prefer something else! Mach isn't made to be the "please everyone" language! I'm glad you took the time to check it out and shared your opinion with me :)
I haven't ever made a low level language.
That said, it seems pretty damned impressive to me that mach is only four times slower than C, particularly since you've only worked on it for two years.
I like the syntax. The example code and a couple files in src I looked at were all easy to read.
The performance is definitely something I'm happy with at this early stage. A few optimizations get you a LONG way. I'm glad to hear that you found it easy to read (that was a main goal of the design)!
Thanks for actually reading code :)
Achieving that without the behemoth of LLVM is really impressive! On the other hand, there are Lisps, with all their dynamic wishy-washiness, that achieve similar performance.
Quite impressive to develop so much without turning to LLVM. But it’s unclear to me from the docs the value prop of why developers would want to use Mach instead of Go or C.
That's a good point. I was thinking about that specific thing this morning (how the documentation doesn't fill it out properly).
Here's an easy comparison list for you. Mach vs:
- C: More modern and consistent syntax with a dramatically simplified tooling system (no 3rd party tooling required out of the box, even for dependencies).
- Go: More control over the "nitty gritty" stuff that Go abstracts away too frequently. No garbage collector. Honestly, I love Go and lots of inspiration was taken directly from it, so I have no beef with the language lol.
- Rust: Dramatically simpler language to use. Flat out. Rust's tooling and ecosystem are the best bar none... but then you have to write code in Rust ;)
- Zig: Simpler version of a similar language philosophy, but without the "features" that can often become footguns, like the metaprogramming system Zig uses for types. Zig is a great example of a language done right, but I do personally feel that it attempts to make some things "do more" than they should. I'm also not a fan of the syntax personally lol
What about safety? Does the language allow shooting yourself in the leg? Does it have safe/unsafe code separation?
You can shoot your legs all day in Mach. There is no such thing as unsafe code, just poorly designed code.
The stdlib has patterns like Zig's allocator design as well as Option and Result types that are used frequently throughout the stdlib itself and other projects written in Mach. Achieving safety is very possible, and entirely optional.
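For readers unfamiliar with those two patterns, here is a small C sketch of an explicitly passed allocator combined with a Result-style return value. It is not Mach code and does not reflect Mach's actual stdlib API. Every name in it (`Allocator`, `BufResult`, `buf_new`, and so on) is a hypothetical stand-in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator interface: callers pass it in explicitly, so no
   function allocates behind the caller's back. */
typedef struct {
    void *(*acquire)(void *ctx, size_t size);
    void (*release)(void *ctx, void *ptr);
    void *ctx;
} Allocator;

static void *heap_acquire(void *ctx, size_t size) { (void)ctx; return malloc(size); }
static void heap_release(void *ctx, void *ptr) { (void)ctx; free(ptr); }
static void *fail_acquire(void *ctx, size_t size) { (void)ctx; (void)size; return NULL; }
static void noop_release(void *ctx, void *ptr) { (void)ctx; (void)ptr; }

/* Allocator backed by malloc/free. */
Allocator heap_allocator(void) {
    Allocator a = { heap_acquire, heap_release, NULL };
    return a;
}

/* Allocator that always fails, handy for exercising error paths. */
Allocator failing_allocator(void) {
    Allocator a = { fail_acquire, noop_release, NULL };
    return a;
}

typedef enum { BUF_OK, BUF_ERR_EMPTY, BUF_ERR_NO_MEMORY } BufStatus;

/* Result-style return value: `status` acts as the tag, and `data` is only
   meaningful when status == BUF_OK. */
typedef struct {
    BufStatus status;
    int *data;
} BufResult;

/* Allocates a zeroed buffer of `len` ints through the given allocator.
   (Overflow of len * sizeof(int) is not checked in this sketch.) */
BufResult buf_new(Allocator a, size_t len) {
    BufResult r = { BUF_ERR_EMPTY, NULL };
    assert(a.acquire != NULL);
    if (len == 0) return r;
    int *p = a.acquire(a.ctx, len * sizeof *p);
    if (p == NULL) { r.status = BUF_ERR_NO_MEMORY; return r; }
    for (size_t i = 0; i < len; i++) p[i] = 0;
    r.status = BUF_OK;
    r.data = p;
    return r;
}
```

The caller decides where memory comes from and is forced to look at `status` before touching `data`. Swapping in `failing_allocator()` shows the out-of-memory path without needing to exhaust real memory. Nothing here is enforced by the compiler, which matches the "possible, and entirely optional" framing above.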
> There is no such thing as unsafe code, just poorly designed code.
That's incorrect. In many programming languages there is a clear separation between safe and unsafe code, via special unsafe blocks or something similar. Languages without such separation are always unsafe or (rarely) always safe.
So, I assume Mach is fully unsafe, like C is.
Fully self-hosted without any external dependencies is incredibly impressive. Amazing work!
Thank you. Took a long... long time to get it to even this stage, and there's so much more left to do.
Can you elaborate more on NO external dependencies?
To build mach from the archived bootstrap compiler, you need clang. That's it. git to clone the repo and a C compiler. That goes all the way through the self hosted compiler. Mach doesn't rely on LLVM, doesn't link to libc, doesn't use system linkers for object file management, etc. Everything is "in house".
This looks really nice, great work so far! I see a macOS backend is still in development. I'd like to try the language out on macOS, so if I find the time I'll try to pitch in!
Please do! I've barely touched macOS since I don't have the ability to properly test it from my end. If you want to take a crack at it, feel free to work on some of the open issues related to it!
> Contributors: @claude
Is it yet another LLM-generated project with little human-written code?
Partially, yes, but not in the way you're thinking.
LLMs have done a significant amount of the "typing", but I was behind the wheel at every stage of development and wrote almost the entirety of the original compiler designs by hand.
There was no point in development when I said "Claude, write me a programming language". I simply would not have been able to finish this project in my lifetime by myself without the assistance of LLMs, both for the scale and speed at which they can operate and as an accelerated learning tool for myself. It's been a great boon to this project and I have zero regrets :)
See also https://wikipedia.org/wiki/Mach_(kernel) /s
Yes LOL. This was a little bit of an oversight on my part from back in the day. Mach can produce Mach-O binaries (no affiliation) that run on the Mach kernel (no affiliation). I see it as more of a funny coincidence than an impending lawsuit, though. Maybe one day we'll change the name if a single letter ever becomes free XD
It is going to be awfully confusing if Apple starts using Mach the language.