Ada has some really good ideas which it's a shame never took off or got used outside of the safety critical community that mostly used it. The ability to make number types that were limited in their range is really useful for certain classes of bugs. SPARK Ada was a relatively easy subset to learn and apply to start developing software that was SIL 4 compliant.
I can't help but feel that we just went through a huge period of growth at all costs and now there is a desire to return, after 30 years of anything goes, to trying to make software that is safer again. It would be nice to build better languages based on all the safety lessons learned over the decades; the good ideas keep getting lost in obscure languages and forgotten.
I know quite a few people in the safety/aviation domain who rather dislike subranges, as they insert run-time checks that are not easily traceable to source code, thus escaping the trifecta of requirements/tests/source-code (which all must be traceable/covered by each other).
Weirdly, when going through the higher assurance levels in aviation, defensive programming becomes more costly, because it complicates the satisfaction of assurance objectives. SQLite (whose test suite reaches MC/DC coverage, the most rigorous coverage criterion asked for in aviation) has a nice paragraph on the friction between MC/DC and defensive programming:
Ideally, a compiler can statically prove that values stay within the range; it's no different than proving that values of an enumeration type are valid. The only places where a check is needed are conversions from other types, which are explicit and traceable.
let a: u8 is 0..100 = 1;
let b: u8 is 0..100 = 2;
let c = a + b;
The type of c could be u8 in 0..200. If you have holes in the middle, same applies. Which means that if you want to make c u8 between 0..100 you'd have to explicitly clamp/convert/request that, which would have to be a runtime check.
In your example we have enough information to know that the addition is safe. In SPARK, if that were a function with a and b as arguments, for instance, and you don't know what's being passed in you make it a pre-condition. Then it moves the burden of proof to the caller to ensure that the call is safe.
But obviously the result of a + b is [0..200], so an explicit cast, or an assertion, or a call to clamp() is needed if we want to put it back into a [0..100].
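To make the widening concrete, here is a minimal Rust sketch. Stable Rust cannot compute the result bounds from the operand bounds in the type system, so the 0..200 result type is written out by hand; `Bounded`, `Percent`, `Sum`, `add` and `clamp_to_percent` are all made-up names for illustration, not an existing library.

```rust
/// Hypothetical range-limited integer: the bounds live in the type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounded<const MIN: i32, const MAX: i32>(i32);

impl<const MIN: i32, const MAX: i32> Bounded<MIN, MAX> {
    /// Conversion from a plain integer is the one place a runtime check is needed.
    fn new(v: i32) -> Option<Self> {
        if (MIN..=MAX).contains(&v) {
            Some(Self(v))
        } else {
            None
        }
    }

    fn get(self) -> i32 {
        self.0
    }
}

type Percent = Bounded<0, 100>;
type Sum = Bounded<0, 200>;

/// Two values in 0..=100 always add up to a value in 0..=200,
/// so this function has no failure path.
fn add(a: Percent, b: Percent) -> Sum {
    Bounded::<0, 200>(a.get() + b.get())
}

/// Going back to 0..=100 is the explicit step; clamping is one possible policy.
fn clamp_to_percent(s: Sum) -> Percent {
    Bounded::<0, 100>(s.get().clamp(0, 100))
}
```

The point of the sketch is only that the narrowing step is visible in the source, which is where a traceable check (or a clamp) would live.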
Comptime constant expression evaluation, as in your example, may suffice for the compiler to be able to prove that the result lies in the bounds of the type.
If you're using SPARK, it'll catch at compile time any possibility that the value would not fit within that constraint. Otherwise it'll throw an exception (Constraint_Error) at runtime for you to catch.
Your example also gets evaluated at comptime. For more complex cases I wouldn't be able to tell you, I'm not the compiler :) For example, this gets checked:
let ageFails = (200 + 2).Age
Error: 202 can't be converted to Age
If it cannot statically prove it at comptime, it will crash at runtime during the type conversion operation, e.g.:
import std/strutils
stdout.write("What's your age: ")
let age = stdin.readLine().parseInt().Age
Then, when you run it:
$ nim r main.nim
What's your age: 999
Error: unhandled exception: value out of range: 999 notin 0 .. 200 [RangeDefect]
Exactly this. Fails at runtime. Consider rather a different example: say the programmer thought the age were constrained to 110 years. Now, as soon as a person is aged 111, the program crashes. Stupid mistake by a programmer assumption turns into a program crash.
Why would you want this?
I mean, we've recently discussed on HN how most sorting algorithms have a bug for using ints to index into arrays when they should be using (at least) size_t. Yet, for most cases, it's ok, because you only hit the limit rarely. Why would you want to further constrain the field, would it not just be the source of additional bugs?
Once the program is operating outside of the bounds of the programmer’s assumptions, it’s in an undefined state that may cause a crash to happen at a later point in time in a totally different place.
Making the crash happen at the same time and space as the error means you don’t have to trace a later crash back to the root cause.
This makes your system much easier to debug at the expense of causing some crashes that other systems might not have. A worthy trade off in the right context.
Out of bounds exception is ok to crash the program. User input error is not ok to crash the program.
I could go into many more examples but I hope I am understood. I think these hard-coded definitions of ranges at compile time cause far more issues than they solve.
Let's take a completely different example: size of a field in a database for a surname. How much is enough? Turns out 128 varchars is not enough, so now they've set it to 2048 (not a project I work(ed) on, but am familiar with). Guess what? Not in our data set, but theoretically, even that is not enough.
> Out of bounds exception is ok to crash the program. User input error is not ok to crash the program.
So you validate user input, we've known how to do that for decades. This is a non-issue. You won't crash the program if you require temperatures to be between 0 and 1000 K and a user puts in 1001, you'll reject the user input.
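For what it's worth, a rough Rust sketch of that "reject, don't crash" idea could look like the following. The 0 to 1000 K limits come from the example above; `parse_temperature` and the error messages are invented for illustration.

```rust
/// Allowed temperature range in kelvin, taken from the example above.
const MIN_K: u32 = 0;
const MAX_K: u32 = 1000;

/// Hypothetical validator: turns raw text into a temperature,
/// or into an error message that can be shown to the user.
fn parse_temperature(input: &str) -> Result<u32, String> {
    let text = input.trim();
    // Non-numeric (or negative) input is rejected here, not later.
    let value: u32 = text
        .parse()
        .map_err(|_| format!("'{}' is not a non-negative whole number", text))?;
    // The range check produces an error value instead of a crash.
    if (MIN_K..=MAX_K).contains(&value) {
        Ok(value)
    } else {
        Err(format!("{} K is outside {}..={} K", value, MIN_K, MAX_K))
    }
}
```

The caller has to handle the `Err` branch before it ever gets a temperature, so 1001 ends up as a message to the user rather than an exception.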
If that user input crashes your program, you're not a very good programmer, or it's a very early prototype.
I think, if I am following things correctly, you will find that there's a limit to the "validate user input" argument - especially when you think of scenarios where multiple pieces of user input are gathered together and then have mathematical operations applied to them.
eg. If the constraint is 0..200, and the user inputs one value that is being multiplied by our constant, it's trivial to ensure the user input is less than the range maximum divided by our constant.
However, if we are having to multiply by a second, third... and so on.. piece of user input, we get to the position where we have to divide our currently held value by a piece of user input, check that the next piece of user input isn't higher, and then work from there (this assumes that the division hasn't caused an exception, which we will need to ensure doesn't happen.. eg if we have a divide by zero going on)
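One way to avoid the divide-and-compare bookkeeping, sketched here in Rust under the assumption that the inputs are non-negative integers: multiply in a wider type with overflow detection and check the 0..200 constraint once on the final result. `bounded_product` and `MAX` are made-up names.

```rust
/// Upper bound of the constrained range from the example (0..=200).
const MAX: u64 = 200;

/// Multiplies all factors in a wide unsigned type.
/// Overflow of the wide type is detected at each step;
/// the 0..=MAX range is only checked on the final product.
fn bounded_product(factors: &[u64]) -> Option<u64> {
    let mut acc: u64 = 1;
    for &f in factors {
        // checked_mul returns None on overflow, which `?` propagates.
        acc = acc.checked_mul(f)?;
    }
    if acc <= MAX {
        Some(acc)
    } else {
        None
    }
}
```

No division is involved, so the divide-by-zero worry goes away; a zero factor simply yields a product of zero, which is in range.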
I mean, yeah. If you do bad math you'll get bad results and potentially crashes. I was responding to someone who was nonsensically ignoring that we validate user input rather than blindly putting it into a variable. Your comment seems like a non sequitur in this discussion. It's not like the risk you describe is unique to range constrained integer types, which is what was being discussed. It can happen with i32 and i64, too, if you write bad code.
Hmm, I was really pointing at the fact that once you get past a couple of pieces of user input, all the validation in the world isn't going to save you from the range constraints.
Assuming you want a good faith conversation, then the idea that there's bad math involved seems a bit ludicrous
> Stupid mistake by a programmer assumption turns into a program crash.
I guess you can just catch the exception in Ada? In Rust you might instead manually check the age validity and return Err if it's out of range. Then you need to handle the Err. It's the same thing in the end.
> Why would you want to further constrain the field
You would only do that if it's a hard requirement (this is the problem with contrived examples, they make no sense). And in that case you would also have to implement some checks in Rust.
Exactly, but how do you catch the exception? One exception catch to catch them all, or do you have to distinguish the types?
And yes... error handle on the input and you'd be fine. How would you write code that is cognizant enough to catch outofrange for every +1 done on the field? Seriously, the production code then devolves into copying the value into something else, where operations don't cause unexpected exceptions. Which is a workaround for a silly restriction that should not reside in runtime level.
Also, I would be very interested to learn the case for hard requirement for a range.
In almost all the cases I have seen, it eventually breaks out of confinement. So, it has to be handled sensibly. And, again, in my experience, if it's built into constraints, it invariably is not handled properly.
Consider the size of the time step in a numerical integrator of some chemical reaction equation, if it gets too big the prediction will be wrong and your chemical plant could explode.
So time steps that are too big cannot be used, but constant-sized steps are wasteful. It seems good to know the integrator can never quietly be wrong, even if you have to pay the price that the integrator could crash.
Ada, or at least GNAT, also supports compile-time dimensional analysis (unit checking). I may be biased, because I mostly work with engineering applications, but I still do not understand why in other languages it is delegated to 3rd party libraries.
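For comparison, this is roughly what the hand-rolled (or third-party library) version looks like in Rust: one newtype per unit and an operator impl for each allowed combination. The type names and `average_speed` are invented for this sketch; it only covers the single case of length divided by time.

```rust
use std::ops::Div;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Seconds(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct MetersPerSecond(f64);

/// Only length / time is defined; Meters / Meters or Meters + Seconds
/// would fail to compile because no such impl exists.
impl Div<Seconds> for Meters {
    type Output = MetersPerSecond;

    fn div(self, rhs: Seconds) -> MetersPerSecond {
        MetersPerSecond(self.0 / rhs.0)
    }
}

fn average_speed(distance: Meters, elapsed: Seconds) -> MetersPerSecond {
    distance / elapsed
}
```

Swapping the arguments, `average_speed(Seconds(8.0), Meters(100.0))`, is a type error. The cost is that every unit combination has to be spelled out by hand or generated by a library.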
FWIW, physical dimensions like meters were the original apples-to-oranges type system that pre-dates all modern notions of things beyond arithmetic. I'm a little surprised it wasn't added to early FORTRAN. In a different timeline, maybe. :)
I think what is in "the" "stdlib" or not is a tricky question. For most general/general purpose languages, it can be pretty hard to know even the probability distribution of use cases. So, it's important to keep multiple/broad perspectives in mind as your "I may be biased" disclaimer. I don't like the modern (well, it kind of started with CTAN where the micros seemed meant more for copy-paste and then CPAN where it was not meant for that) trend toward dozens to hundreds of micro-dependencies, either, though. I think Python, Node/JS, and Rust are all known for this.
Yes, we re-invent the wheel. The more time you spend writing software for a living, the more you will see the wheel re-invented. But Ada and Rust are safe under different definitions of safety. I view Rust as having a narrower, but very important, notion of safety, executed with brutal focus. Ada's definition of safety is broader, but better suited to a small subset of use cases.
I write Rust at work. I learned Ada in the early 1990s as the language of software engineering. Back then a lot of the argument against Ada was it was too big, complex, and slowed down development too much. (Not to mention the validating Ada 83 compiler I used cost about $20,000 a seat in today's money). I think the world finally caught up with Ada and we're recognizing that we need languages every bit as big and complex, like Rust, to handle issues like safe, concurrent programming.
> The ability to make number types that were limited in their range is really useful for certain classes of bugs.
This is a feature I use a lot in C++. It is not part of the standard library but it is trivial to programmatically generate range-restricted numeric types in modern C++. Some safety checks can even be done at compile-time instead of runtime.
It should be a standard feature in programming languages.
I've never come across any range restricting constructions in C++ projects in the wild before. It truly is a shame, I think it's something more programmers should be aware of and use. Eliminating all bounds checking and passing that job to the compiler is pretty killer and eliminates whole classes of bugs.
I would guess that Ada is simply more known. Keep in mind that tech exploded in the past ~3.5 decades whereas those languages are much older and lost the popularity contest. If you ask most people about older languages, the replies other than the obvious C and (kind of wrong but well) C++ are getting thin really quickly. COBOL, Ada, Fortran, and Lisp are probably what people are aware of the most, but other than that?
The first five languages I learned back in the 70s: FORTRAN, Pascal, PL/I, SNOBOL, APL. Then I was an Ada and Icon programmer in the 80s. In the 90s, it was C/C++ and I just never had the enthusiasm for it.
Icon (which came from SNOBOL) is one of the few programming languages I consider to embody truly new ideas. (Lisp, Forth, and Prolog are others that come to mind.)
Icon is an amazing language and I wish it was better known.
I found Pascal more readable as a budding programmer. Later on, C's ability to just get out of the way to program what I wanted trumped the Pascal's verbosity and opinionatedness.
I admit that the terseness of the syntax of C can be off-putting. Still, it's just syntax; I am sorry you were dissuaded by it.
I dabbled in some of them during some periods when I took a break from work. And also some, during work, in my free time at home.
Pike, ElastiC (not a typo), Icon, Rebol (and later Red), Forth, Lisp, and a few others that I don't remember now.
Not all of those are from the same period, either.
Heck, I can even include Python and Ruby in the list, because I started using them (at different times, with Python being first) much before they became popular.
In 2005-2010, the most interesting language (in this direction) at my college was Haskell. I don't think any other language like Ada was being taught.
Turbo Pascal could check ranges on assignment with the {$R+} directive, and Delphi could check arithmetic overflow with {$Q+}. Of course, nobody wanted to waste the cycles to turn those on :)
I would argue that was one of the reasons why those languages lost.
I distinctly remember arguments for functions working on array of 10. Oh, you want array of 12? Copy-paste the function to make it array of 12. What a load of BS.
It took Pascal years to drop that constraint, but by then C had already won.
I never ever wanted the compiler or runtime to check a subrange of ints. Ever. Overflow as program crash would be better, which I do find useful, but arbitrary ranges chosen by programmer? No thanks. To make matters worse, those are checked even by intermediate results.
I realize this opinion is based only on my experience, so I would appreciate a counterexample where it is a benefit (and yes, I worked on production code written in Pascal, French variant even, and after migrating it to C it was hilariously more readable and maintainable).
It still results in overflow and while you are right that it's UB by the standard, it's still pretty certain what will happen on a particular platform with a particular compiler :)
No, optimizing compilers don't translate overflow to platform-specific behavior for signed integers - since it's UB they'll freely make arithmetic or logic assumptions that can result in behavior that can't really be humanly predicted without examining the generated machine code.
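For contrast, Rust makes the programmer pick the overflow behaviour by name. A tiny sketch using the standard integer methods (`overflow_behaviours` is just a wrapper invented for this example):

```rust
/// Returns the result of `x + 1` under three explicit overflow policies:
/// checked (None on overflow), wrapping (two's complement wrap-around),
/// and saturating (clamp at the numeric limit).
fn overflow_behaviours(x: i32) -> (Option<i32>, i32, i32) {
    (x.checked_add(1), x.wrapping_add(1), x.saturating_add(1))
}
```

Each method has one defined result for every input, so there is nothing for the optimizer to assume away.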
In my personal experience it's not just safety. Reliability of what is produced is also a big part.
IME, being able to express constraints in a type system lends itself to producing better quality code. A simple example from my experience with Rust and Go is mutex handling: Rust just won't let you leak a guard handle, while Go happily lets you run into a deadlock.
It doesn’t really compete in the same space as Ada or Rust, but C# has range attributes that are similar. The only downside is that you have to manually call the validation function unless you are using something like ASP.NET that does it automatically at certain times.
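A short Rust sketch of the guard behaviour, for anyone who hasn't seen it: the lock is tied to the lifetime of the `MutexGuard`, so it is released when the guard goes out of scope without an explicit unlock call. `count_in_parallel` is a made-up function for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each increment a shared counter
/// `per_thread` times, and returns the final count.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The data is only reachable through the guard.
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                    // The guard is dropped at the end of each iteration,
                    // which releases the lock.
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

There is no way to touch the counter without holding the lock, and no unlock call to forget on an early return.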
> Ada has some really good ideas which it's a shame never took off or got used outside of the safety critical community that mostly used it. The ability to make number types that were limited in their range is really useful for certain classes of bugs.
As pjmlp says in a sibling comment, Pascal had this feature, from the beginning, IIRC, or from an early version - even before the first Turbo Pascal version.
30+ years ago I was programming in Ada, and I feel the same way and have been repeatedly disappointed. Maybe this time around things will be different.
The author indicates some obvious differences, including the fact that Ada has a formal spec and rust doesn't -- rustc seems to be both in flux as well as the reference implementation. This might matter if you're writing a new compiler or analyzer.
But the most obvious difference, and maybe most important to a user, was left unstated: the adoption and ecosystem such as tooling, libraries, and community.
Ada may have a storied success history in aerospace and life safety, etc, and it might have an okay standard lib which is fine for AOC problems and maybe embedded bit poking cases in which case it makes sense to compare to Rust. But if you're going to sit down for a real world project, ie distributed system or OS component, interfacing with modern data formats, protocols, IDEs, people, etc is going to influence your choice on day one.
Rust has now a donated spec that was provided by Ferrocene. This spec style was influenced by the Ada spec. It is available publicly now on https://rust-lang.github.io/fls/ .
This is part of the effort of Ferrocene to provide a safety certificate compiler. And they are already available now.
This is only meaningful if Rust compiler devs give any guarantees about never breaking the spec and always being able to compile code that adheres to this spec.
That's not how it works for most language standards, though. Most language standards are prescriptive, while Rust is descriptive.
Usually the standard comes first, compiler vendors implement it, and between releases of the spec the language is fixed. Using Ada as an example, there was Ada 95 and Ada 2005, but between 95 and 2005 there was only Ada 95. There was no in-progress version, the compiler vendors weren't making changes to the language, and an Ada 95 compiler today compiles the same language as an Ada 95 compiler 30 years ago.
Looking at the changelog for the Rust spec (https://rust-lang.github.io/fls/changelog.html), it's just the changelog of the language as each compiler version is released, and there doesn't seem to be any intention of supporting previous versions. Would there be any point in an alternative compiler implementing "1.77.0" of the Rust spec?
And the alternative compiler implementation can't start implementing a compiler for version n+1 of the spec until that version of rustc is released because "the spec" is just "whatever rustc does", making the spec kind of pointless.
> Usually the standard comes first, compiler vendors implement it, and between releases of the spec the language is fixed.
This is not how C or C++ were standardized, nor most computer standards in the first place. Usually, vendors implement something, and then they come together to agree upon a standard second.
When updating standards, sometimes things are put in the standard before any implementations, but that's generally considered an antipattern for larger designs. You want real-world evaluation of the usefulness of something before it's been standardized.
Because otherwise the spec is just words on paper, and the standard is just "whatever the compiler does is what it is supposed to do". The spec codifies the intentions of the creators separately from the implementation.
In rust, there is currently only one compiler so it seems like there's no problem
Rust doesn’t have quite as strong compatibility guarantees. For example, it’s not considered a NC-breaking change to add new methods to standard library types, even though this can make method resolution ambiguous for programs that had their own definitions of methods with the same name. A C++ implementation claiming to support C++11 wouldn’t do that, they’d use ifdefs to gate off the new declarations when compiling in C++11 mode.
You have to squint fairly hard to get here for any of the major C++ compilers.
I guess maybe someone like Sean Baxter will know the extent to which, in theory, you can discern the guts of C++ by reading the ISO document (or, more practically, the freely available PDF drafts, essentially nobody reads the actual document, no not even Microsoft bothers to spend $$$ to buy an essentially identical PDF)
My guess would be that it's at least helpful, but nowhere close to enough.
And that's ignoring the fact that the popular implementations do not implement any particular ISO standard, in each case their target is just C++ in some more general sense, they might offer "version" switches, but they explicitly do not promise to implement the actual versions of the ISO C++ programming language standard denoted by those versions.
Neither the Rust nor the Ada spec is formal, in the sense of being consumable by a theorem prover. AFAIK for SPARK Ada, there are of course assumptions about the language semantics built into SPARK, but these are nowhere coherently written down in a truly formal (as in machine-readable) spec.
There's a formally verified C compiler, IIRC the frontend isn't, but if you define the language to the structs that are in the inputs to whatever is formally verified I guess whatever C like dialect of a language it implements must be.
I'm sure the programmers of the flight control software safely transporting 1 billion people per year see your "real world project" and reply with something like "yes, if you are writing software where the outputs don't matter very much, our processes are excessive" :p
This write-up shows that while Ada may have some cultural and type-related disadvantages compared to Rust, Ada seems to generally win the readability contest.
What is missing from the comparison is compiler speed - Ada was once seen as a complex language, but that may not be the case if compared against Rust.
In any case, thanks for the post, it made me want to try using Ada for a real project.
As far as I'm aware, Ada has a much more expressive type system, and not by a hair. By miles. Being able to define custom bounds-checked ordinals, being able to index arrays with any enumerable type. Defining custom arithmetic operators for types. Adding compile-time and runtime checks to types with pre/post conditions, iteration variants, predicates, etc. Discriminant records. Record representation clauses.
On strings in Ada vs Rust. Ada's design predates Unicode (early 1980s vs 1991), so Ada String is basically char array whereas Rust string is a Unicode text type. This explains why you can index into Ada Strings, which are arrays of bytes, but not into Rust strings, which are UTF8 encoded buffers that should be treated as text. Likely the Rust implementation could have used a byte array here.
Worse, the built-in Unicode strings are arrays of Unicode scalars, effectively UTF-32 in the general case. There's no proper way to write UTF-8 string literals AFAIK, you need to convert them from arrays of 8, 16 or 32 bit characters depending on the literal.
How is the internal representation an issue?
Java strings are UTF-16 internally, and it doesn't matter how you write your code or what the targeted format is.
It's an issue because there's nothing internal about the representation in Ada: They're regular arrays of Character/Wide_Character/Wide_Wide_Character, and string literals have different type depending on the width required to represent it as such.
Also, string representations very much matter if you're coding with even the slightest amount of mechanical sympathy.
I mean you can index into Rust's strings, it's just that you probably don't want that:
"Clown"[2..5] // is "own"
Notice that's a range, Rust's string slice type doesn't consider itself just an array (as the Ada type is) and so we can't just provide an integer index, the index is a range of integers to specify where our sub-string should begin and end. If we specify the middle of a Unicode character then the code panics - don't do that.
Yes, since AoC always uses ASCII it will typically make sense to use &[u8] (the reference to a slice of bytes) and indeed the str::as_bytes method literally gives you that byte slice if you realise that's what you actually needed.
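A small sketch of both options, using only standard `str` methods. `safe_slice` and `byte_at` are invented wrappers; `str::get` is the non-panicking counterpart to the `[2..5]` indexing above.

```rust
/// Returns the sub-slice if both ends fall on character boundaries
/// (and inside the string), otherwise None instead of panicking.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

/// Plain integer indexing works on the byte view of the string.
fn byte_at(s: &str, i: usize) -> Option<u8> {
    s.as_bytes().get(i).copied()
}
```

With "Clöwn", where the ö takes two bytes in UTF-8, `safe_slice("Clöwn", 3, 6)` returns `None` because byte 3 is in the middle of the ö, whereas the `[3..6]` form would panic.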
I found it kind of odd that the author says Rust doesn't support concurrent programming out of the box. He links to another comment which points out you don't need Tokio for async (true enough), but even that aside async isn't the only way to achieve concurrency. Threads are built right into the language, and are easier to use than async. The only time they wouldn't be a good choice is if you anticipate needing to spawn so many threads that it causes resource issues, which very few programs will.
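As an illustration of the standard-library-only route, here is a sketch that splits a sum across scoped threads. `parallel_sum` is a made-up function, and the chunking policy is just one simple choice.

```rust
use std::thread;

/// Sums `data` using up to `parts` OS threads from the standard library.
fn parallel_sum(data: &[u64], parts: usize) -> u64 {
    let parts = parts.max(1);
    // Ceiling division; at least 1 because chunks(0) would panic.
    let chunk = ((data.len() + parts - 1) / parts).max(1);

    // Scoped threads may borrow `data` because the scope joins
    // every thread before it returns.
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().sum::<u64>()))
            .collect();

        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}
```

No runtime, no executor, no extra crates: just `std::thread`.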
How does the cancellation story differ between threads and async in Rust? Or vs async in other languages?
There's no inherent reason they should be different, but in my experience (in C++, Python, C#) cancellation is much better in async than with simple threads and blocking calls. It's near impossible to have organised socket shutdown in many languages with blocking calls, assuming a standard read thread + write thread per socket. Often the only reliable way to interrupt a socket thread is to close the socket, which may not be what you want, and in principle can leave you vulnerable to file handle reuse bugs.
Async cancellation is, depending on the language, somewhere between hard but achievable (already an improvement) and fabulous. With Trio [1] you even get the guarantee that cancelled socket operations are either completed or have no effect.
Did this work any better in Rust threads / blocking calls? My uneducated understanding is that things are actually worse in async than other languages because there's no way to catch and handle cancellations (unlike e.g. Python which uses exceptions for that).
I'm also guessing things are no better in Ada but very happy to hear about that too.
Ok I could be super wrong here, but I think that's not true.
Dropping a future does not cancel a concurrently running (tokio::spawn) task. It will also not magically stop an asynchronous I/O call, it just won't block/switch from your code anymore while that continues to execute. If you have created a future but not hit .await or tokio::spawn or any of the futures:: queue handlers, then it also won't cancel it, it just won't begin it.
Cancellation of a running task from outside that task actually does require explicit cancelling calls IIRC.
Spawn is kind of a special case where it's documented that the future will be moved to the background and polled without the caller needing to do anything with the future it returns. The vast majority of futures are lazy and will not do work unless explicitly polled, which means the usual way of cancelling is to just stop polling (e.g. by awaiting the future created when joining something with a timeout; either the timeout happens before the other future completes, or the other future finishes and the timeout no longer gets polled). Dropping the future isn't technically a requirement, but in practice it's usually what will happen because there's no reason to keep around a future you'll never poll again, so most of the patterns that exist for constructing a future that finishes when you don't need it anymore rather than manually cancelling will implicitly drop any future that won't get used again (like in the join example above, where the call to `join` will take ownership of both futures and not return either of them, therefore dropping whichever one hasn't finished when returning).
That's a rare exception, and just a design choice of this particular library function. It had to intentionally implement a workaround integrated with the async runtime to survive normal cancellation. (BTW, the anti-cancellation workaround isn't compatible with Rust's temporary references, which can be painfully restrictive. When people say Rust's async sucks, they often actually mean `tokio::spawn()` made their life miserable).
Regular futures don't behave like this. They're passive, and can't force their owner to keep polling them, and can't prevent their owner from dropping them.
When a Future is dropped, it has only one chance to immediately do something before all of its memory is obliterated, and all of its inputs are invalidated. In practice, this requires immediately aborting all the work, as doing anything else would be either impossible (risking use-after-free bugs), or require special workarounds (e.g. io_uring can't work with the bare Future API, and requires an external drop-surviving buffer pool).
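The passive nature of futures can be shown without any runtime by polling one by hand. This is a sketch only: `NoopWake`, `NeverReady` and `lazy_then_dropped` are invented names, and a real executor would do the polling and waking.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future that never completes. It counts how often it is
/// polled and how often it is dropped.
struct NeverReady {
    polls: Arc<AtomicUsize>,
    drops: Arc<AtomicUsize>,
}

impl Future for NeverReady {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls.fetch_add(1, Ordering::SeqCst);
        Poll::Pending
    }
}

impl Drop for NeverReady {
    fn drop(&mut self) {
        // The single chance to react to cancellation.
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns (polls before anyone polled, polls after one manual poll,
/// drops after the owner dropped the future).
fn lazy_then_dropped() -> (usize, usize, usize) {
    let polls = Arc::new(AtomicUsize::new(0));
    let drops = Arc::new(AtomicUsize::new(0));
    let mut fut = Box::pin(NeverReady {
        polls: Arc::clone(&polls),
        drops: Arc::clone(&drops),
    });

    // Creating the future did no work.
    let before = polls.load(Ordering::SeqCst);

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let _ = fut.as_mut().poll(&mut cx);
    let after = polls.load(Ordering::SeqCst);

    // The owner stops polling and drops it; the future cannot object.
    drop(fut);
    (before, after, drops.load(Ordering::SeqCst))
}
```

Nothing happens until the owner polls, and once the owner drops the future the only code that still runs is its `Drop` impl.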
Rain showed that not all may be as simple as it seems to do it correctly.
In her presentation on async cancellation in Rust, she spoke pretty extensively on cancel safety and correctness, and I would recommend giving it a watch or read.
Yeah that's what I'm talking about ... Cancellation where the cancelled object can't handle the cancellation, call other async operations and even (very rarely) suppress it, isn't "real" cancellation to me, having seen how essential this is.
> There's no inherent reason they should be different
There is... They're totally different things.
And yeah Rust thread cancellation is pretty much the same as in any other language - awkward to impossible. That's a fundamental feature of threads though; nothing to do with Rust.
There's no explicit cancel, but there's trivial one shot cancellation messages that you can handle on the thread side. It's perfectly fine, honestly, and how I've been doing it forever.
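A minimal sketch of that pattern with a standard-library channel. `run_until_cancelled`, the 1 ms of simulated work and the 20 ms delay are all arbitrary choices for illustration.

```rust
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;

/// Starts a worker, lets it run briefly, then sends a one-shot
/// cancellation message. Returns how many work units the worker finished.
fn run_until_cancelled() -> u64 {
    let (cancel_tx, cancel_rx) = mpsc::channel::<()>();

    let worker = thread::spawn(move || {
        let mut iterations = 0u64;
        loop {
            // One unit of (simulated) work.
            iterations += 1;
            thread::sleep(Duration::from_millis(1));

            // Cooperative check between work units.
            match cancel_rx.try_recv() {
                Err(TryRecvError::Empty) => continue,
                // Either an explicit cancel or the sender went away.
                Ok(()) | Err(TryRecvError::Disconnected) => break,
            }
        }
        iterations
    });

    thread::sleep(Duration::from_millis(20));
    cancel_tx.send(()).expect("worker is still listening");
    worker.join().expect("worker panicked")
}
```

The worker only notices the message between work units, so this cannot interrupt a single long computation or a long blocking call.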
I would call that clean shutdown more than cancellation. You can't cancel a long computation, or std::thread::sleep(). Though tbf that's sort of true of async too.
To be clear about what I meant: I was saying that, in principle, it would be possible to design a language or even a library where all interruptible operations (at least timers and networking) can be cancelled from other threads. This can be done using a cancellation token mechanism which avoids even starting an operation on an already cancelled token, in a way that avoids races (such as you might get from a naive check of the token before starting the operation) if another thread cancels this one just as the operation is starting.
Now I've set (and possibly moved) the goalposts, I can prove my point: C# already does this! You can use async across multiple threads and cancellation happens with cancellation tokens that are thread safe. Having a version where interruptable calls are blocking rather than async (in the language sense) would actually be easier to implement (using the same async-capable APIs under the hood e.g., IOCP on Windows).
Well sure, there's nothing to stop you writing a "standard library" that exposes that interface. The default one doesn't though. I expect there are platforms that Rust supports that don't have interruptible timers and networking (whereas C# initially only supported Windows).
I wonder where the cut-off is where a work stealing scheduler like Tokio's is noticeably faster than just making a bunch of threads to do work, and then where the hard cut-off is that making threads will cause serious problems rather than just being slower because we don't steal.
It might be quite small, as I found for Maps (if we're putting 5 things in the map then we can just do the very dumbest thing which I call `VecMap` and that's fine, but if it's 25 things the VecMap is a little worse than any actual hash table, and if it's 100 things the VecMap is laughably terrible) but it might be quite large, even say 10x number of cores might be just fine without stealing.
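For readers wondering what a "very dumbest thing" map might look like, here is a guess at the idea: a vector of pairs with linear search. This is my own sketch under that assumption, not the `VecMap` the comment measured.

```rust
/// Map backed by an unsorted vector of (key, value) pairs.
/// Lookups and inserts are O(n), with no hashing and little overhead.
struct VecMap<K, V> {
    entries: Vec<(K, V)>,
}

impl<K: PartialEq, V> VecMap<K, V> {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Inserts or replaces; returns the previous value for the key, if any.
    fn insert(&mut self, key: K, value: V) -> Option<V> {
        for (k, v) in self.entries.iter_mut() {
            if *k == key {
                return Some(std::mem::replace(v, value));
            }
        }
        self.entries.push((key, value));
        None
    }

    /// Linear scan for the key.
    fn get(&self, key: &K) -> Option<&V> {
        self.entries.iter().find(|(k, _)| k == key).map(|(_, v)| v)
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Every operation walks the whole vector in the worst case, which is consistent with it being fine at 5 entries and poor at 100.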
Threads as they are conventionally considered are inadequate. Operating systems should offer something along the lines of scheduler activations[0]: a low-level mechanism that represents individual cores being scheduled/allocated to programs. Async is responsive simply because it conforms to the (asynchronous) nature of hardware events. Similarly, threads are most performant if leveraged according to the usage of hardware cores. A program that spawns 100 threads on a system with 10 physical cores is just going to have threads interrupting each other for no reason; each core can only do so much work in a time frame, whether it's running 1 thread or 10. The most performant/efficient abstraction is a state machine[1] per core. However, for some loss of performance and (arguable) ease of development, threads can be used on top of scheduler activations[2]. Async on top of threads is just the worst of both worlds. Think in terms of the hardware resources and events (memory accesses too), and the abstractions write themselves.
You may be correct in theory, though in practice the reason to use async over threads is often that the crate you want to use is async. Reqwest is a good example: it cannot be used without Tokio. Ureq exists and works fine. I've done a fairly high-level application project where I tried to avoid all async, and at some point it started to feel like swimming upstream.
Or in cases where the platform doesn't support threads easily - WASM and embedded (Embassy). Tbh I think that's a better motivation for using async than the usual "but what if 100k people visit my anime blog all at once?"
Interesting that Ada has an open source compiler. For whatever reason when I looked at it years ago I thought it was proprietary compilers only so I never really looked at it again. Maybe I’ll look again now.
GNAT has been around for 30 years. There were some limitations with (one version of?) it due to it being released without the GPL runtime exception, which meant linking against its runtime technically required your programs to also be released under GPL. That hasn't been an issue for a very long time either, though.
GNAT has been around since the 90s, based on GCC. My university did some work on the compiler and used it for some undergrad courses like real-time programming. IIRC there was an attempt to use Ada for intro to programming courses, but I think they chose Pascal and then eventually C++.
Tangentially related, one of the more interesting projects I've seen in the 3D printing space recently is Prunt. It's a printer control board and firmware, with the latter being developed in Ada.
I found the inclusion of arrays indexed on arbitrary types in the feature table as a benefit of Ada surprising. That sounds like a dictionary type, which is in the standard library of nearly every popular language. Rust includes two.
I think they're focused very much specifically on the built-in array type. Presumably Ada is allowed to say eggs is an array and the index has type BirdSpecies so eggs[Robin] and eggs[Seagull] work but eggs[5] is nonsense.
Rust is OK with you having a type which implements Index<BirdSpecies> and if eggs is an instance of that type it's OK to ask for eggs[Robin] while eggs[5] won't compile, but Rust won't give you an "array" with this property, you'd have to make your own.
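A minimal sketch of the "make your own" version in Rust, assuming a three-variant `BirdSpecies` enum and a wrapper type I'm calling `EggCounts` (both names are illustrative):

```rust
use std::ops::{Index, IndexMut};

#[derive(Clone, Copy, Debug, PartialEq)]
enum BirdSpecies {
    Robin,
    Seagull,
    Wren,
}

/// One slot per species, stored contiguously like a plain array.
struct EggCounts([u32; 3]);

impl Index<BirdSpecies> for EggCounts {
    type Output = u32;
    fn index(&self, species: BirdSpecies) -> &u32 {
        // The enum discriminants 0, 1, 2 double as array positions.
        &self.0[species as usize]
    }
}

impl IndexMut<BirdSpecies> for EggCounts {
    fn index_mut(&mut self, species: BirdSpecies) -> &mut u32 {
        &mut self.0[species as usize]
    }
}

fn main() {
    let mut eggs = EggCounts([0; 3]);
    eggs[BirdSpecies::Robin] = 4;
    eggs[BirdSpecies::Seagull] += 2;
    assert_eq!(eggs[BirdSpecies::Robin], 4);
    assert_eq!(eggs[BirdSpecies::Wren], 0);
    // eggs[5] would not compile: EggCounts only implements Index<BirdSpecies>.
}
```

The array length and the variant count have to be kept in sync by hand here, which is exactly the bookkeeping a built-in feature would do for you.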
My guess is that this makes more sense in a language where user defined types are allowed to be a subset of say a basic integer type, which I know Ada has and Rust as yet does not. If you can do that, then array[MyCustomType] is very useful.
I call out user-defined types specifically because Rust's NonZeroI16 - the 16-bit integers except zero - is compiler-only internal magic. If you want a MultipleOfSixU32 or even U8ButNotThreeForSomeReason, that's not "allowed", and so you'd need nightly Rust and an explicit "I don't care that this isn't supported" compiler-only feature flag in your source. I want to change this so that anybody can make the IntegersFiveThroughTwelveU8 or whatever, and there is non-zero interest in that happening, but I'd have said the exact same thing two years ago so...
I really don't understand this so I hope you won't mind explaining it. If I would have the type U8ButNotThreeForSomeReason wouldn't that need a check at runtime to make sure you are not assigning 3 to it?
At runtime it depends. If we're using arbitrary outside integers which might be three, we're obliged to check yes, nothing is for free. But perhaps we're mostly or entirely working with numbers we know a priori are never three.
NonZero<T> has a "constructor" named new() which returns Option<NonZero<T>>, so that None means nope, this value isn't allowed because it's zero. But unwrapping or expecting an Option is a const operation, so NonZeroI8::new(9).expect("Nine is not zero") will compile and produce a constant that the type system knows isn't zero.
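As a small illustration of the compile-time case versus the runtime case, here is a sketch using std's `NonZeroI8`. It spells the unwrap as a `match` so it also builds on older stable compilers; on recent toolchains I believe `.expect(...)` works in const context as described above.

```rust
use std::num::NonZeroI8;

// Evaluated at compile time: a zero argument here would fail the build
// rather than panic at runtime.
const NINE: NonZeroI8 = match NonZeroI8::new(9) {
    Some(n) => n,
    None => panic!("nine is not zero"),
};

fn main() {
    assert_eq!(NINE.get(), 9);

    // A value arriving from outside still needs the runtime check.
    let from_input: i8 = 0;
    assert!(NonZeroI8::new(from_input).is_none());
}
```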
Three in particular does seem like a weird choice, I want Balanced<signed integer> types such as BalancedI8 which is the 8-bit integers including zero, -100 and +100 but crucially not including -128 which is annoying but often not needed. A more general system is envisioned in "Pattern Types". How much more general? Well, I think proponents who want lots of generality need to help deliver that.
Option<U8ButNotThreeForSomeReason> would have a size of 2 bytes (1 for the discriminant, 1 for the value) whereas Option<NonZeroU8> has a size of only 1 byte, thanks to some special sauce in the compiler that you can't use for your own types. This is the only "magic" around NonZero<T> that I know of, though.
You can make an enum, with all 255 values spelled out, and then write lots of boilerplate, whereupon Option<U8ButNotThreeForSomeReason> is also a single byte in stable Rust today, no problem.
That's kind of silly for 255 values, and while I suspect it would work, it's clearly not a reasonable design for 16 bits, let alone 32 bits, where I suspect the compiler will reject this wholesale.
Another trick you can do, which will also work just fine for bigger types is called the "XOR trick". You store a NonZero<T> but all your adaptor code XORs with your single not-allowed value, in this case 3 and this is fairly cheap on a modern CPU because it's an ALU operation, no memory fetches except that XOR instruction, so often there's no change to bulk instruction throughput. This works because only 3 XOR 3 == 0, other values will all have bits jiggled but remain valid.
Because your type's storage is the same size, you get all the same optimisations and so once again Option<U8ButNotThreeForSomeReason> is a single byte.
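Here is a minimal sketch of the XOR trick on stable Rust, assuming a shortened name `U8ButNotThree` and `new`/`get` as the adaptor methods (both are my choices for illustration):

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

const FORBIDDEN: u8 = 3;

/// Stores `value ^ 3`, which is zero only when `value == 3`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct U8ButNotThree(NonZeroU8);

impl U8ButNotThree {
    fn new(value: u8) -> Option<Self> {
        NonZeroU8::new(value ^ FORBIDDEN).map(U8ButNotThree)
    }

    fn get(self) -> u8 {
        // XOR with the same constant undoes the encoding.
        self.0.get() ^ FORBIDDEN
    }
}

fn main() {
    assert!(U8ButNotThree::new(3).is_none());
    assert_eq!(U8ButNotThree::new(0).unwrap().get(), 0);
    assert_eq!(U8ButNotThree::new(255).unwrap().get(), 255);
    // The niche (stored zero) is reused for None, so no extra byte is needed.
    assert_eq!(size_of::<Option<U8ButNotThree>>(), 1);
}
```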
The feature of being able to use a discrete range as an array index is very helpful when you have a dense map (most keys will be used) or you also want to be able to iterate over a sequential block of memory (better performance than a dictionary will generally give you, since they don't usually play well with caches).
The Americans with Disabilities Act doesn't cover subtyping the index type into an array. Ada, the language, does though.
EDIT: Seems I'm getting downvoted, do people not know that ADA is not the name of the programming language? It's Ada, as in Ada Lovelace, whose name was also not generally SHOUTED as ADA.
There does seem to be a strain of weird language fanatics who insist all programming language names must be shouted, so they'll write RUST and ADA, and presumably JAVA and PYTHON, maybe it's an education thing, maybe they're stuck in an environment with uppercase only like a 1960s FORTRAN programmer ?
Similarly, and less explicably, are people who program in "C" and don't understand when people mention the oddity. Do people not see quotes? Do they just add them and not realize?
Very nice text. As I am very sceptical of Rust, I appreciate the comparison to a language I know better. I was not aware that there is no formal spec for Rust. Isn't that a problem if rustc makes changes?
1. There is a spec now, Ferrocene donated theirs, and the project is currently integrating it
2. The team takes backwards compatibility seriously, and uses tests to help ensure the lack of breakage. This includes tests like “compile the code being used by Linux” and “compile most open source Rust projects” to show there’s no regressions.
Ferrocene is a specification but it’s not a formal specification. [Minirust](https://github.com/minirust/minirust) is the closest thing we have to a formal spec but it’s very much a work-in-progress.
It's a good enough spec to let rustc work with safety critical software, so while something like minirust is great, it's not necessary, just something that's nice to have.
For implementers of third-party compilers, researchers of the Rust programming language, and programmers who write unsafe code, this is indeed a problem. It's bad.
For the designers of Rust, "no formal specification" allows them to make changes as long as it is not breaking. It's good.
Medical or military domains often require the software stack/tooling to be certified following certain rules. I know most of these certifications are bogus, but how is that handled with Rust?
Ferrocene provides a certified compiler (based on a spec they've written documenting how it behaves) which is usable for many use cases, but it obviously depends on what exactly your specific domain needs.
I'd love to use Ada as my primary static language if it had broader support. It's in my opinion the best compiled language with strong static typing. Although it has gained traction with Alire, it unfortunately doesn't have enough 3rd party support for my needs yet.
You can generate bindings using gcc -c -fdump-ada-spec <header.h>. They typically work well enough without needing additional tweaks, but if it's more involved you can ask Claude to make a shell script that generates bindings for whatever C library you want to use, and it works reasonably well.
In my opinion, don't make thick bindings for your C libraries. It just makes it harder to use them.
For example I don't really like the OpenGL thick bindings for Ada because using them is so wildly different than the C examples that I can't really figure out how to do what I want to do.
It's just everything and the kitchen sink, I'm afraid, from a NATS server and client library over cross-platform GUIs including mobile, common file format reading and writing like Excel, Word, and PDF to post-quantum cryptography. I'm unfortunately in the business of Lego-brick software development using well-maintained 3rd-party libraries whenever possible. It's the same with CommonLisp, I love it but in the end I'd have to write too many things on my own to be productive in it.
I envy people who can write foundational, self-contained software. It's so elegant.
Ada has some really good ideas which it's a shame never took off or got used outside of the safety-critical community that mostly used it. The ability to make number types that were limited in their range is really useful for certain classes of bugs. SPARK Ada was a relatively easy subset to learn and apply to start to develop software that was SIL 4 compliant.
I can't help but feel that we just went through a huge period of growth at all costs and now there is a desire to return, after 30 years of anything goes, to trying to make software that is safer again. It would be nice to start to build better languages based on all the safety learnings over the decades; the good ideas keep getting lost in obscure languages and forgotten about.
Nim was inspired by Ada & Modula, and has subranges [1]:
Then at compile time:
[1] https://nim-lang.org/docs/tut1.html#advanced-types-subranges
I know quite some people in the safety/aviation domain that kind of dislike the subranges, as it inserts run-time checks that are not easily traceable to source code, thus escaping the trifecta of requirements/tests/source-code (which all must be traceable/covered by each other).
Weirdly, when going through the higher assurance levels in aviation, defensive programming becomes more costly, because it complicates the satisfaction of assurance objectives. SQLite (whose test suite reaches MC/DC coverage, which is the most rigorous coverage criterion asked for in aviation) has a nice paragraph on the friction between MC/DC and defensive programming:
https://www.sqlite.org/testing.html#tension_between_fuzz_tes...
Ideally, a compiler can statically prove that values stay within the range; it's no different than proving that values of an enumeration type are valid. The only places where a check is needed are conversions from other types, which are explicit and traceable.
But if the number type’s value can change at runtime as long as it stays within the range, then it may not always be possible to check at compile time.
If you have
let a: u8 is 0..100 = 1;
let b: u8 is 0..100 = 2;
let c = a + b;
The type of c could be u8 in 0..200. If you have holes in the middle, same applies. Which means that if you want to make c u8 between 0..100 you'd have to explicitly clamp/convert/request that, which would have to be a runtime check.
In your example we have enough information to know that the addition is safe. In SPARK, if that were a function with a and b as arguments, for instance, and you don't know what's being passed in you make it a pre-condition. Then it moves the burden of proof to the caller to ensure that the call is safe.
But obviously the result of a + b is [0..200], so an explicit cast, or an assertion, or a call to clamp() is needed if we want to put it back into a [0..100].
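Rust has no ranged integers today, so the following is only a hand-rolled sketch of the idea: a hypothetical `Percent` newtype for 0..=100 whose sum is returned as a plain `u8` (known to be at most 200), with an explicit checked conversion to get back into the narrow range.

```rust
/// A value known to lie in 0..=100.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Percent(u8);

impl Percent {
    /// The only way in: an explicit, visible range check.
    fn new(v: u8) -> Option<Percent> {
        if v <= 100 { Some(Percent(v)) } else { None }
    }

    /// Two values in 0..=100 sum to at most 200, which fits in u8,
    /// so this addition cannot overflow.
    fn add_wide(self, other: Percent) -> u8 {
        self.0 + other.0
    }
}

fn main() {
    let a = Percent::new(1).unwrap();
    let b = Percent::new(2).unwrap();
    let c = a.add_wide(b);
    assert_eq!(Percent::new(c), Some(Percent(3)));

    // 100 + 100 is a valid wide result but not a valid Percent.
    let full = Percent::new(100).unwrap();
    let big = full.add_wide(full);
    assert_eq!(big, 200);
    assert_eq!(Percent::new(big), None);
}
```

A language with real range types could track the 0..=200 bound in the type instead of leaving it in a doc comment.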
Comptime constant expression evaluation, as in your example, may suffice for the compiler to be able to prove that the result lies in the bounds of the type.
Can you help me understand the context in which this would be far more beneficial from having a validation function, like this in Java:
It’s a question of compile time versus runtime.
Isn’t this just Design By Contract from Eiffel just in another form?
How does this work for dynamic casting? Say like if an age was submitted from a form?
I assume it’s a runtime error or does the compiler force you to handle this?
If you're using SPARK, it'll catch at compile time if there's ever a possibility that it would fail to fit within that condition. Otherwise it'll raise an exception (Constraint_Error) at runtime for you to catch.
What happens when you add 200+1 in a situation where the compiler cannot statically prove that this is 201?
Your example also gets evaluated at comptime. For more complex cases I wouldn't be able to tell you, I'm not the compiler :) For example, this gets checked:
If it cannot statically prove it at comptime, it will crash at runtime during the type conversion operation, e.g.:
Then, when you run it:
Exactly this. Fails at runtime. Consider rather a different example: say the programmer thought the age were constrained to 110 years. Now, as soon as a person is aged 111, the program crashes. Stupid mistake by a programmer assumption turns into a program crash.
Why would you want this?
I mean, we've recently discussed on HN how most sorting algorithms have a bug for using ints to index into arrays when they should be using (at least) size_t. Yet, for most cases, it's ok, because you only hit the limit rarely. Why would you want to further constrain the field, would it not just be the source of additional bugs?
Once the program is operating outside of the bounds of the programmer's assumptions, it’s in an undefined state that may cause a crash to happen at a later point in time in a totally different place.
Making the crash happen at the same time and space as the error means you don’t have to trace a later crash back to the root cause.
This makes your system much easier to debug at the expense of causing some crashes that other systems might not have. A worthy trade off in the right context.
Out of bounds exception is ok to crash the program. User input error is not ok to crash the program.
I could go into many more examples but I hope I am understood. I think these hard-coded definitions of ranges at compile time cause far more issues than they solve.
Let's take a completely different example: size of a field in a database for a surname. How much is enough? Turns out 128 varchars is not enough, so now they've set it to 2048 (not a project I work(ed) on, but am familiar with). Guess what? Not in our data set, but theoretically, even that is not enough.
> Out of bounds exception is ok to crash the program. User input error is not ok to crash the program.
So you validate user input, we've known how to do that for decades. This is a non-issue. You won't crash the program if you require temperatures to be between 0 and 1000 K and a user puts in 1001, you'll reject the user input.
If that user input crashes your program, you're not a very good programmer, or it's a very early prototype.
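A minimal sketch of what "reject the user input" can look like in Rust, assuming a hypothetical `Kelvin` newtype with the 0..=1000 bound from the example above:

```rust
#[derive(Debug, PartialEq)]
struct Kelvin(u16);

#[derive(Debug, PartialEq)]
enum InputError {
    NotANumber,
    OutOfRange(u32),
}

/// Parses untrusted text into a temperature, rejecting bad input
/// with an error value instead of crashing.
fn parse_kelvin(text: &str) -> Result<Kelvin, InputError> {
    let raw: u32 = text.trim().parse().map_err(|_| InputError::NotANumber)?;
    if raw <= 1000 {
        Ok(Kelvin(raw as u16)) // cast is lossless: raw is at most 1000
    } else {
        Err(InputError::OutOfRange(raw))
    }
}

fn main() {
    assert_eq!(parse_kelvin("300"), Ok(Kelvin(300)));
    assert_eq!(parse_kelvin("1001"), Err(InputError::OutOfRange(1001)));
    assert_eq!(parse_kelvin("warm"), Err(InputError::NotANumber));
}
```

The range check happens once at the boundary; everything downstream that takes a `Kelvin` can rely on it.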
I think, if I am following things correctly, you will find that there's a limit to the "validate user input" argument - especially when you think of scenarios where multiple pieces of user input are gathered together and then have mathematical operations applied to them.
eg. If the constraint is 0..200, and the user inputs one value that is being multiplied by our constant, it's trivial to ensure the user input is less than the range maximum divided by our constant.
However, if we are having to multiply by a second, third... and so on.. piece of user input, we get to the position where we have to divide our currently held value by a piece of user input, check that the next piece of user input isn't higher, and then work from there (this assumes that the division hasn't caused an exception, which we will need to ensure doesn't happen.. eg if we have a divide by zero going on)
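For what it's worth, the divide-the-budget bookkeeping can be sidestepped. This sketch swaps in a different technique: multiply with overflow detection and re-check the range after every step. The 0..=200 bound comes from the example above; `bounded_product` is a made-up name.

```rust
const MAX: u32 = 200;

/// Multiplies `constant` by each input in turn, returning None as soon as
/// the running product overflows u32 or leaves the 0..=MAX range.
fn bounded_product(constant: u32, inputs: &[u32]) -> Option<u32> {
    let mut acc = constant;
    if acc > MAX {
        return None;
    }
    for &x in inputs {
        acc = acc.checked_mul(x)?; // None on arithmetic overflow
        if acc > MAX {
            return None;
        }
    }
    Some(acc)
}

fn main() {
    assert_eq!(bounded_product(2, &[10, 5]), Some(100));
    assert_eq!(bounded_product(2, &[10, 11]), None);
    assert_eq!(bounded_product(2, &[u32::MAX]), None);
}
```

No division is involved, so the divide-by-zero worry does not arise; a zero input simply yields a product of zero.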
I mean, yeah. If you do bad math you'll get bad results and potentially crashes. I was responding to someone who was nonsensically ignoring that we validate user input rather than blindly putting it into a variable. Your comment seems like a non sequitur in this discussion. It's not like the risk you describe is unique to range constrained integer types, which is what was being discussed. It can happen with i32 and i64, too, if you write bad code.
Hmm, I was really pointing at the fact that once you get past a couple of pieces of user input, all the validation in the world isn't going to save you from the range constraints.
Assuming you want a good faith conversation, then the idea that there's bad math involved seems a bit ludicrous
> Why would you want this?
Logic errors should be visible so they can be fixed?
> Stupid mistake by a programmer assumption turns into a program crash.
I guess you can just catch the exception in Ada? In Rust you might instead manually check the age validity and return Err if it's out of range. Then you need to handle the Err. It's the same thing in the end.
> Why would you want to further constrain the field
You would only do that if it's a hard requirement (this is the problem with contrived examples, they make no sense). And in that case you would also have to implement some checks in Rust.
Exactly, but how do you catch the exception? One exception catch to catch them all, or do you have to distinguish the types?
And yes... error handle on the input and you'd be fine. How would you write code that is cognizant enough to catch outofrange for every +1 done on the field? Seriously, the production code then devolves into copying the value into something else, where operations don't cause unexpected exceptions. Which is a workaround for a silly restriction that should not reside in runtime level.
Also, I would be very interested to learn the case for hard requirement for a range.
In almost all the cases I have seen it eventually breaks out of confinement. So, it has to be handled sensibly. And, again, in my experience, if it's built into constraints, it invariably is not handled properly.
Consider the size of the time step in a numerical integrator of some chemical reaction equation, if it gets too big the prediction will be wrong and your chemical plant could explode.
So time steps that are too big cannot be used, but constant-sized steps are wasteful. It seems good to know the integrator can never quietly be wrong, even if you have to pay the price that the integrator could crash.
Ada, or at least GNAT, also supports compile-time dimensional analysis (unit checking). I may be biased, because I mostly work with engineering applications, but I still do not understand why in other languages it is delegated to 3rd party libraries.
https://docs.adacore.com/gnat_ugn-docs/html/gnat_ugn/gnat_ug...
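For comparison, this is roughly what the hand-rolled or library approach looks like in Rust: a sketch with three illustrative unit types and a single `Div` impl. It is not GNAT's dimensionality system, and real crates generalise this far beyond one operator.

```rust
use std::ops::Div;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Meters(f64);

#[derive(Clone, Copy, Debug, PartialEq)]
struct Seconds(f64);

#[derive(Clone, Copy, Debug, PartialEq)]
struct MetersPerSecond(f64);

// Only length / time is defined, and it yields a speed.
impl Div<Seconds> for Meters {
    type Output = MetersPerSecond;
    fn div(self, rhs: Seconds) -> MetersPerSecond {
        MetersPerSecond(self.0 / rhs.0)
    }
}

fn main() {
    let speed = Meters(100.0) / Seconds(8.0);
    assert_eq!(speed, MetersPerSecond(12.5));
    // Seconds(1.0) / Meters(1.0) would not compile: no such impl exists.
}
```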
Nim (https://nim-lang.org), mentioned elsethread Re: numeric ranges like Ada, only needs a library for this: https://github.com/SciNim/Unchained
FWIW, physical dimensions like meters were the original apples-to-oranges type system that pre-dates all modern notions of things beyond arithmetic. I'm a little surprised it wasn't added to early FORTRAN. In a different timeline, maybe. :)
I think what is in "the" "stdlib" or not is a tricky question. For most general/general purpose languages, it can be pretty hard to know even the probability distribution of use cases. So, it's important to keep multiple/broad perspectives in mind as your "I may be biased" disclaimer. I don't like the modern (well, it kind of started with CTAN where the micros seemed meant more for copy-paste and then CPAN where it was not meant for that) trend toward dozens to hundreds of micro-dependencies, either, though. I think Python, Node/JS, and Rust are all known for this.
F# can do this too.
https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref...
Yes, we re-invent the wheel. The more time you spend writing software for a living, the more you will see the wheel re-invented. But Ada and Rust are safe under different definitions of safety. I view Rust as having a narrower definition of safety, but a very important one, executed with brutal focus, while Ada's definition of safety is broader, but better suited to a small subset of use cases.
I write Rust at work. I learned Ada in the early 1990s as the language of software engineering. Back then a lot of the argument against Ada was it was too big, complex, and slowed down development too much. (Not to mention the validating Ada 83 compiler I used cost about $20,000 a seat in today's money). I think the world finally caught up with Ada and we're recognizing that we need languages every bit as big and complex, like Rust, to handle issues like safe, concurrent programming.
> The ability to make number types that were limited in their range is really useful for certain classes of bugs.
This is a feature I use a lot in C++. It is not part of the standard library but it is trivial to programmatically generate range-restricted numeric types in modern C++. Some safety checks can even be done at compile-time instead of runtime.
It should be a standard feature in programming languages.
I've never come across any range restricting constructions in C++ projects in the wild before. It truly is a shame, I think it's something more programmers should be aware of and use. Eliminating all bounds checking and passing that job to the compiler is pretty killer and eliminates whole classes of bugs.
> The ability to make number types that were limited in their range is really useful for certain classes of bugs.
Yes! I would kill to get Ada's number range feature in Rust!
It is worked on under the term "pattern types" mainly by Oli oli-obk Scherer I think, who has an Ada background.
Can't tell you what the current state is but this should give you the keywords to find out.
Also, here is a talk Oli gave in the Ada track at FOSDEM this year: https://hachyderm.io/@oli/113970047617836816
That feature is actually from Pascal, and Modula-2, before making its way into Ada.
For some strange reason people always relate to Ada for it.
I would guess that Ada is simply more known. Keep in mind that tech exploded in the past ~3.5 decades whereas those languages are much older and lost the popularity contest. If you ask most people about older languages, the replies other than the obvious C and (kind of wrong but well) C++ are getting thin really quickly. COBOL, Ada, Fortran, and Lisp are probably what people are aware of the most, but other than that?
You've forgotten about BASIC, SNOBOL, APL, Forth, and PL/1. There were many interesting programming languages back then. Good times!
The first five languages I learned back in the 70s: FORTRAN, Pascal, PL/I, SNOBOL, APL. Then I was an Ada and Icon programmer in the 80s. In the 90s, it was C/C++ and I just never had the enthusiasm for it.
Icon (which came from SNOBOL) is one of the few programming languages I consider to embody truly new ideas. (Lisp, Forth, and Prolog are others that come to mind.)
Icon is an amazing language and I wish it was better known.
I found Pascal more readable as a budding programmer. Later on, C's ability to just get out of the way to program what I wanted trumped Pascal's verbosity and opinionatedness.
I admit that the terseness of the syntax of C can be off-putting. Still, it's just syntax; I am sorry you were dissuaded by it.
True.
I dabbled in some of them during some periods when I took a break from work. And also some, during work, in my free time at home.
Pike, ElastiC (not a typo), Icon, Rebol (and later Red), Forth, Lisp, and a few others that I don't remember now.
Not all of those are from the same period, either.
Heck, I can even include Python and Ruby in the list, because I started using them (at different times, with Python being first) much before they became popular.
For me it's because I learned Ada in college.
18 year old me couldn't appreciate how beautiful a language it is but in my 40s I finally do.
2000-2005 College was teaching Ada?
2005-2010, the most interesting language (in this direction) at my college was Haskell. I don't think any other language like Ada was being taught.
Yes, I learned it in a course that surveyed a bunch of different languages like Ada, Standard ML, and Assembly
Ada is sometimes taught as part of a survey course in Programming Languages. That’s how I learned a bit about it.
Turbo Pascal could check ranges on assignment with the {$R+} directive, and Delphi could check arithmetic overflow with {$Q+}. Of course, nobody wanted to waste the cycles to turn those on :)
Most Pascal compilers could do that actually.
Yeah not wanting to waste cycles is how we ended up with the current system languages, while Electron gets used all over the place.
I would argue that was one of the reasons why those languages lost.
I distinctly remember arguments for functions working on array of 10. Oh, you want array of 12? Copy-paste the function to make it array of 12. What a load of BS.
It took Pascal years to drop that constraint, but by then C had already won.
I never ever wanted the compiler or runtime to check a subrange of ints. Ever. Overflow as program crash would be better, which I do find useful, but arbitrary ranges chosen by programmer? No thanks. To make matters worse, those are checked even by intermediate results.
I realize this opinion is based only on my experience, so I would appreciate a counterexample where it is a benefit (and yes, I worked on production code written in Pascal, French variant even, and migrating it to C was hilariously more readable and maintainable).
Thankfully instead of overflow, C gets you the freedom of UB based optimizations.
Funny :)
It still results in overflow and while you are right that it's UB by the standard, it's still pretty certain what will happen on a particular platform with a particular compiler :)
No, optimizing compilers don't translate overflow to platform-specific behavior for signed integers - since it's UB they'll freely make arithmetic or logic assumptions that can result in behavior that can't really be humanly predicted without examining the generated machine code.
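As a point of contrast, Rust's integer types let you state the overflow behaviour you want through explicit std methods. A small sketch (`next_index` is just an illustrative helper):

```rust
/// Advances an index, reporting overflow instead of relying on wraparound.
fn next_index(i: i32) -> Option<i32> {
    i.checked_add(1)
}

fn main() {
    assert_eq!(next_index(41), Some(42));
    assert_eq!(next_index(i32::MAX), None);

    // The other behaviours are spelled out by name rather than left to chance.
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN);
    assert_eq!(i32::MAX.saturating_add(1), i32::MAX);
}
```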
compile time user config checking?
Sorry? That's not possible...
There is an RFC but I guess the work stopped.
As a sibling comment[0] mentioned, pattern types are actively being worked on.
[0] https://news.ycombinator.com/item?id=45474777
Oh. I thought it stalled since there was a long time without activity.
If I am not wrong, you could do a zero-cost abstraction in C++ and use user-defined literals if you wish for nice syntax.
In my personal experience it's not just safety. Reliability of what is produced is also a big part.
IME, being able to express constraints in a type system lends itself to producing better quality code. A simple example from my experience with Rust and Go is mutex handling: Rust just won't let you leak a guard handle, while Go happily lets you run into a deadlock.
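A minimal sketch of the usual Rust pattern being alluded to, using std's `Mutex`: the lock is released when the guard goes out of scope, so there is no separate unlock call to forget. (`bump` is an illustrative helper; this shows the common case, not a proof that deadlocks are impossible.)

```rust
use std::sync::Mutex;

/// Increments the counter and returns the new value.
fn bump(counter: &Mutex<u32>) -> u32 {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    let value = *guard;
    value
    // `guard` is dropped here, which releases the lock.
}

fn main() {
    let counter = Mutex::new(0);
    assert_eq!(bump(&counter), 1);
    assert_eq!(bump(&counter), 2);
    // The lock is free again after each call returns.
    assert!(counter.try_lock().is_ok());
}
```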
It doesn’t really compete in the same space as Ada or Rust but C# has range attributes that are similar, the only downside is you have to manually call the validation function unless you are using something like ASP.NET that does it automatically at certain times.
>Ada has some really good ideas which its a shame never took off or got used outside of the safety critical community that mostly used it. The ability to make number types that were limited in their range is really useful for certain classes of bugs.
As pjmlp says in a sibling comment, Pascal had this feature, from the beginning, IIRC, or from an early version - even before the first Turbo Pascal version.
30+ years ago I was programming in Ada, and I feel the same way and have been repeatedly disappointed. Maybe this time around things will be different.
The author indicates some obvious differences, including the fact that Ada has a formal spec and rust doesn't -- rustc seems to be both in flux as well as the reference implementation. This might matter if you're writing a new compiler or analyzer.
But the most obvious difference, and maybe most important to a user, was left unstated: the adoption and ecosystem such as tooling, libraries, and community.
Ada may have a storied success history in aerospace and life safety, etc, and it might have an okay standard lib which is fine for AOC problems and maybe embedded bit poking cases in which case it makes sense to compare to Rust. But if you're going to sit down for a real world project, ie distributed system or OS component, interfacing with modern data formats, protocols, IDEs, people, etc is going to influence your choice on day one.
Rust has now a donated spec that was provided by Ferrocene. This spec style was influenced by the Ada spec. It is available publicly now on https://rust-lang.github.io/fls/ .
This is part of Ferrocene's effort to provide a safety-certified compiler, which is already available now.
This is only meaningful if Rust compiler devs give any guarantees about never breaking the spec and always being able to compile code that adheres to this spec.
Why so?
Specs for other languages are also for a specific version/snapshot.
It's also a specific version of a compiler that gets certified, not a compiler in perpetuity, no matter what language.
That's not how it works for most language standards, though. Most language standards are prescriptive, while Rust is descriptive.
Usually the standard comes first, compiler vendors implement it, and between releases of the spec the language is fixed. Using Ada as an example, there was Ada 95 and Ada 2005, but between 95 and 2005 there was only Ada 95. There was no in-progress version, the compiler vendors weren't making changes to the language, and an Ada 95 compiler today compiles the same language as an Ada 95 compiler 30 years ago.
Looking at the changelog for the Rust spec (https://rust-lang.github.io/fls/changelog.html), it's just the changelog of the language as each compiler version is released, and there doesn't seem to be any intention of supporting previous versions. Would there be any point in an alternative compiler implementing "1.77.0" of the Rust spec?
And the alternative compiler implementation can't start implementing a compiler for version n+1 of the spec until that version of rustc is released because "the spec" is just "whatever rustc does", making the spec kind of pointless.
> Usually the standard comes first, compiler vendors implement it, and between releases of the spec the language is fixed.
This is not how C or C++ were standardized, nor most computer standards in the first place. Usually, vendors implement something, and then they come together to agree upon a standard second.
When updating standards, sometimes things are put in the standard before any implementations, but that's generally considered an antipattern for larger designs. You want real-world evaluation of the usefulness of something before it's been standardized.
Because otherwise the spec is just words on paper, and the standard is just "whatever the compiler does is what it is supposed to do". The spec codifies the intentions of the creators separately from the implementation.
In Rust, there is currently only one compiler, so it seems like there's no problem.
How is this different from the existing situation that Rust remains compatible since Rust 1.0 over a decade ago?
Rust doesn’t have quite as strong compatibility guarantees. For example, it’s not considered a breaking change to add new methods to standard library types, even though this can make method resolution ambiguous for programs that had their own definitions of methods with the same name. A C++ implementation claiming to support C++11 wouldn’t do that; they’d use ifdefs to gate off the new declarations when compiling in C++11 mode.
By that criterion there's no meaningful C++ compiler/spec.
How so? There are compiler-agnostic C++ specs, and compiler devs try to be compatible with it.
What the GP is suggesting is that the Rust compiler should be written and then a spec should be codified after the fact (I guess just for fun?).
> compiler devs try to be compatible with it.
You have to squint fairly hard to get here for any of the major C++ compilers.
I guess maybe someone like Sean Baxter will know the extent to which, in theory, you can discern the guts of C++ by reading the ISO document (or, more practically, the freely available PDF drafts, essentially nobody reads the actual document, no not even Microsoft bothers to spend $$$ to buy an essentially identical PDF)
My guess would be that it's at least helpful, but nowhere close to enough.
And that's ignoring the fact that the popular implementations do not implement any particular ISO standard, in each case their target is just C++ in some more general sense, they might offer "version" switches, but they explicitly do not promise to implement the actual versions of the ISO C++ programming language standard denoted by those versions.
Neither the Rust nor the Ada spec is formal, in the sense of consumable by a theorem prover. AFAIK for SPARK Ada, there are of course assumptions about the language semantics built into SPARK, but these are nowhere coherently written down in a truly formal (as in machine-readable) spec.
There was also Larch/Ada [0], which was a formally proved subset of Ada, developed for NASA [1].
[0] https://apps.dtic.mil/sti/tr/pdf/ADA249418.pdf
[1] https://ntrs.nasa.gov/citations/19960000030
What even is the most complex programming language with a fully machine-checkable spec? Are there actually practically useful ones? I know of none.
One candidate is ATS [1].
Another, https://cakeml.org/
[1]: https://en.wikipedia.org/wiki/ATS_(programming_language)
There's a formally verified C compiler. IIRC the frontend isn't verified, but if you define the language as the structures that form the input to whatever part is formally verified, I guess the C-like dialect it implements must be one.
I'm sure the programmers of the flight control software safely transporting 1 billion people per year see your "real world project" and reply with something like "yes, if you are writing software where the outputs don't matter very much, our processes are excessive" :p
This write-up shows that while Ada may have some cultural and type-related disadvantages compared to Rust, Ada seems to generally win the readability contest.
What is missing from the comparison is compiler speed - Ada was once seen as a complex language, but that may not be the case if compared against Rust.
In any case, thanks for the post, it made me want to try using Ada for a real project.
What exactly is a "type-related disadvantage"?
As far as I'm aware, Ada has a much more expressive type system, and not by a hair. By miles. Being able to define custom bounds-checked ordinals, being able to index arrays with any enumerable type. Defining custom arithmetic operators for types. Adding compile-time and run-time type checks to types with pre/post conditions, iteration variants, predicates, etc... Discriminant records. Record representation clauses.
I'm not sure what disadvantages exist.
On strings in Ada vs Rust. Ada's design predates Unicode (early 1980s vs 1991), so Ada String is basically char array whereas Rust string is a Unicode text type. This explains why you can index into Ada Strings, which are arrays of bytes, but not into Rust strings, which are UTF8 encoded buffers that should be treated as text. Likely the Rust implementation could have used a byte array here.
> Ada String is basically char array
Worse, the built-in Unicode strings are arrays of Unicode scalars, effectively UTF-32 in the general case. There's no proper way to write UTF-8 string literals AFAIK, you need to convert them from arrays of 8, 16 or 32 bit characters depending on the literal.
How is the internal representation an issue? Java strings are UTF-16 internally, and it doesn't matter how you write your code or what the targeted format is.
It's an issue because there's nothing internal about the representation in Ada: They're regular arrays of Character/Wide_Character/Wide_Wide_Character, and string literals have different type depending on the width required to represent it as such.
Also, string representations very much matter if you're coding with even the slightest amount of mechanical sympathy.
I mean you can index into Rust's strings, it's just that you probably don't want that:
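A minimal sketch of the kind of indexing meant here, assuming a hypothetical five-character word whose third character takes two bytes in UTF-8. The helper name `tail` is made up for illustration; `str::get` is the non-panicking counterpart of the `[..]` syntax.

```rust
/// Returns the sub-string starting at byte offset `start`, or `None`
/// if `start` is out of bounds or falls inside a multi-byte character.
fn tail(word: &str, start: usize) -> Option<&str> {
    word.get(start..)
}

fn main() {
    // "naïve": n (byte 0), a (byte 1), ï (bytes 2 and 3), v (byte 4), e (byte 5).
    let word = "na\u{ef}ve";

    // Indexing takes a range of byte offsets, not a single integer.
    let end = &word[4..];
    println!("{end}");

    // Byte 3 is in the middle of 'ï'; `&word[3..]` would panic,
    // while `get` reports the problem as `None`.
    println!("{:?}", tail(word, 3));

    // If bytes are what you actually want, ask for them explicitly.
    let first_byte: u8 = word.as_bytes()[0];
    println!("{first_byte}");
}
```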
Notice that's a range, Rust's string slice type doesn't consider itself just an array (as the Ada type is) and so we can't just provide an integer index, the index is a range of integers to specify where our sub-string should begin and end. If we specify the middle of a Unicode character then the code panics - don't do that.
Yes, since AoC always uses ASCII it will typically make sense to use &[u8] (the reference to a slice of bytes) and indeed the str::as_bytes method literally gives you that byte slice if you realise that's what you actually needed.
I found it kind of odd that the author says Rust doesn't support concurrent programming out of the box. He links to another comment which points out you don't need Tokio for async (true enough), but even that aside async isn't the only way to achieve concurrency. Threads are built right into the language, and are easier to use than async. The only time they wouldn't be a good choice is if you anticipate needing to spawn so many threads that it causes resource issues, which very few programs will.
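As a rough illustration of the point about threads, here is a sketch that uses only the standard library, with no async runtime. The function name `parallel_sum` and the input data are invented for the example.

```rust
use std::thread;

/// Sums each chunk on its own OS thread, then adds up the partial sums.
fn parallel_sum(chunks: Vec<Vec<u64>>) -> u64 {
    let handles: Vec<thread::JoinHandle<u64>> = chunks
        .into_iter()
        // `move` transfers ownership of each chunk into its thread.
        .map(|chunk| thread::spawn(move || chunk.iter().sum::<u64>()))
        .collect();

    handles
        .into_iter()
        // `join` blocks until the thread finishes and yields its result.
        .map(|handle| handle.join().expect("worker thread panicked"))
        .sum()
}

fn main() {
    let total = parallel_sum(vec![vec![1, 2, 3], vec![4, 5], vec![6]]);
    println!("{total}");
}
```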
(Honest question from non Rustacean.)
How does the cancellation story differ between threads and async in Rust? Or vs async in other languages?
There's no inherent reason they should be different, but in my experience (in C++, Python, C#) cancellation is much better in async than with simple threads and blocking calls. It's near impossible to have organised socket shutdown in many languages with blocking calls, assuming a standard read thread + write thread per socket. Often the only reliable way to interrupt a socket thread is to close the socket, which may not be what you want, and in principle can leave you vulnerable to file handle reuse bugs.
Async cancellation is, depending on the language, somewhere between hard but achievable (already an improvement) and fabulous. With Trio [1] you even get the guarantee that non-compared socket operations are either completed or have no effect.
Did this work any better in Rust threads / blocking calls? My uneducated understanding is that things are actually worse in async than other languages because there's no way to catch and handle cancellations (unlike e.g. Python which uses exceptions for that).
I'm also guessing things are no better in Ada but very happy to hear about that too.
Cancellation in rust async is almost too easy, all you need to do is drop the future.
If you need cleanup, that still needs to be handled manually. Hopefully the async Drop trait lands soon.
Ok I could be super wrong here, but I think that's not true.
Dropping a future does not cancel a concurrently running (tokio::spawn) task. It will also not magically stop an asynchronous I/O call, it just won't block/switch from your code anymore while that continues to execute. If you have created a future but not hit .await or tokio::spawn or any of the futures:: queue handlers, then it also won't cancel it; it just won't begin it.
Cancellation of a running task from outside that task actually does require explicit cancelling calls IIRC.
Edit: here, try this https://cybernetist.com/2024/04/19/rust-tokio-task-cancellat...
Spawn is kind of a special case where it's documented that the future will be moved to the background and polled without the caller needing to do anything with the future it returns. The vast majority of futures are lazy and will not do work unless explicitly polled, which means the usual way of cancelling is to just stop polling (e.g. by awaiting the future created when joining something with a timeout; either the timeout happens before the other future completes, or the other future finishes and the timeout no longer gets polled). Dropping the future isn't technically a requirement, but in practice it's usually what will happen because there's no reason to keep around a future you'll never poll again, so most of the patterns that exist for constructing a future that finishes when you don't need it anymore rather than manually cancelling will implicitly drop any future that won't get used again (like in the join example above, where the call to `join` will take ownership of both futures and not return either of them, therefore dropping whichever one hasn't finished when returning).
That's a rare exception, and just a design choice of this particular library function. It had to intentionally implement a workaround integrated with the async runtime to survive normal cancellation. (BTW, the anti-cancellation workaround isn't compatible with Rust's temporary references, which can be painfully restrictive. When people say Rust's async sucks, they often actually mean `tokio::spawn()` made their life miserable).
Regular futures don't behave like this. They're passive, and can't force their owner to keep polling them, and can't prevent their owner from dropping them.
When a Future is dropped, it has only one chance to immediately do something before all of its memory is obliterated, and all of its inputs are invalidated. In practice, this requires immediately aborting all the work, as doing anything else would be either impossible (risking use-after-free bugs), or require special workarounds (e.g. io_uring can't work with the bare Future API, and requires an external drop-surviving buffer pool).
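The drop-as-cancellation behaviour described above can be sketched without any runtime. This assumes a hand-rolled no-op waker and a future that never completes; `NoopWaker`, `Never`, `SetOnDrop`, `work` and `poll_once_then_drop` are all hypothetical names for the illustration, not part of any library.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll a future by hand.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// A future that is never ready, standing in for a slow operation.
struct Never;
impl Future for Never {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

/// Records that it was dropped; this is the only "cleanup hook" available.
struct SetOnDrop(Arc<AtomicBool>);
impl Drop for SetOnDrop {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

async fn work(dropped: Arc<AtomicBool>) -> u32 {
    let _guard = SetOnDrop(dropped);
    Never.await; // the future is suspended here when it gets dropped
    42 // never reached
}

/// Polls `work` once, drops it, and reports (was pending, guard was dropped).
fn poll_once_then_drop() -> (bool, bool) {
    let dropped = Arc::new(AtomicBool::new(false));
    let mut fut = Box::pin(work(dropped.clone()));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let was_pending = fut.as_mut().poll(&mut cx).is_pending();
    drop(fut); // "cancellation": the suspended state, guard included, is destroyed
    (was_pending, dropped.load(Ordering::SeqCst))
}

fn main() {
    println!("{:?}", poll_once_then_drop());
}
```

Note that the cancelled code gets no chance to run further async operations; only synchronous `Drop` implementations execute.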
Rain showed that not all may be as simple as it seems to do it correctly.
In her presentation on async cancellation in Rust, she spoke pretty extensively on cancel safety and correctness, and I would recommend giving it a watch or read.
https://sunshowers.io/posts/cancelling-async-rust/
Yeah that's what I'm talking about ... Cancellation where the cancelled object can't handle the cancellation, call other async operations and even (very rarely) suppress it, isn't "real" cancellation to me, having seen how essential this is.
> There's no inherent reason they should be different
There is... They're totally different things.
And yeah Rust thread cancellation is pretty much the same as in any other language - awkward to impossible. That's a fundamental feature of threads though; nothing to do with Rust.
There's no explicit cancel, but there's trivial one shot cancellation messages that you can handle on the thread side. It's perfectly fine, honestly, and how I've been doing it forever.
I would call that clean shutdown more than cancellation. You can't cancel a long computation, or std::thread::sleep(). Though tbf that's sort of true of async too.
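For reference, the message-based approach discussed in the two comments above might look roughly like this. It is a sketch using a standard-library channel; the function name and the "unit of work" are placeholders, and as noted it only takes effect when the worker gets around to checking.

```rust
use std::sync::mpsc::{self, TryRecvError};
use std::thread;

/// Starts a worker, tells it to stop, and returns how many steps it completed.
fn run_until_told_to_stop() -> u64 {
    let (stop_tx, stop_rx) = mpsc::channel::<()>();

    let worker = thread::spawn(move || {
        let mut steps = 0u64;
        loop {
            // The worker cooperates by checking for a stop message between steps.
            match stop_rx.try_recv() {
                Ok(()) | Err(TryRecvError::Disconnected) => break,
                Err(TryRecvError::Empty) => {
                    steps += 1; // placeholder for one unit of real work
                    thread::yield_now();
                }
            }
        }
        steps
    });

    // One-shot "cancellation": send a single message, then wait for the thread.
    stop_tx.send(()).expect("worker exited before being told to stop");
    worker.join().expect("worker thread panicked")
}

fn main() {
    println!("worker ran {} steps", run_until_told_to_stop());
}
```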
To be clear about what I meant: I was saying that, in principle, it would be possible to design a language or even a library where all interruptible operations (at least timers and networking) can be cancelled from other threads. This can be done using a cancellation token mechanism which avoids even starting the operation of an already cancelled token, in a way that avoids races (as you might imagine from a naive check of a token before starting the operation) if another thread cancels this one just as the operation is starting.
Now I've set (and possibly moved) the goalposts, I can prove my point: C# already does this! You can use async across multiple threads and cancellation happens with cancellation tokens that are thread safe. Having a version where interruptible calls are blocking rather than async (in the language sense) would actually be easier to implement (using the same async-capable APIs under the hood e.g., IOCP on Windows).
Well sure, there's nothing to stop you writing a "standard library" that exposes that interface. The default one doesn't though. I expect there are platforms that Rust supports that don't have interruptible timers and networking (whereas C# initially only supported Windows).
I wonder where the cut-off is where a work stealing scheduler like Tokio's is noticeably faster than just making a bunch of threads to do work, and then where the hard cut-off is that making threads will cause serious problems rather than just being slower because we don't steal.
It might be quite small, as I found for Maps (if we're putting 5 things in the map then we can just do the very dumbest thing which I call `VecMap` and that's fine, but if it's 25 things the VecMap is a little worse than any actual hash table, and if it's 100 things the VecMap is laughably terrible) but it might be quite large, even say 10x number of cores might be just fine without stealing.
The general rule is that if you need to wait faster use async, and if you need to process faster use threads.
Another way of thinking about this is whether you want to optimize your workload for throughput or latency. It's almost never a binary choice, though.
Threads as they are conventionally considered are inadequate. Operating systems should offer something along the lines of scheduler activations[0]: a low-level mechanism that represents individual cores being scheduled/allocated to programs. Async is responsive simply because it conforms to the (asynchronous) nature of hardware events. Similarly, threads are most performant if leveraged according to the usage of hardware cores. A program that spawns 100 threads on a system with 10 physical cores is just going to have threads interrupting each other for no reason; each core can only do so much work in a time frame, whether it's running 1 thread or 10. The most performant/efficient abstraction is a state machine[1] per core. However, for some loss of performance and (arguable) ease of development, threads can be used on top of scheduler activations[2]. Async on top of threads is just the worst of both worlds. Think in terms of the hardware resources and events (memory accesses too), and the abstractions write themselves.
[0] https://en.wikipedia.org/wiki/Scheduler_activations, https://dl.acm.org/doi/10.1145/121132.121151 | Akin to thread-per-core
[1] Stackless coroutines and event-driven programming
[2] User-level virtual/green threads today, plus responsiveness to blocking I/O events
Can you say more about what you mean by wait faster? Is it as in, enqueue many things faster?
You may be correct in theory though in practice the reason to use Async over threads is often because the crate you want to use is async. Reqwest is a good example, it cannot be used without Tokio. Ureq exists and works fine. I've done a fairly high level application project where I tried to avoid all async and at some point it started to feel like swimming upstream.
Or in cases where the platform doesn't support threads easily - WASM and embedded (Embassy). Tbh I think that's a better motivation for using async than the usual "but what if 100k people visit my anime blog all at once?"
Interesting that Ada has an open source compiler. For whatever reason when I looked at it years ago I thought it was proprietary compilers only so I never really looked at it again. Maybe I’ll look again now.
GNAT has been around for 30 years. There were some limitations with (one version of?) it due to it being released without the GPL runtime exception, which meant linking against its runtime technically required your programs to also be released under GPL. That hasn't been an issue for a very long time either, though.
GNAT has been around since the 90s, based on GCC. My university did some work on the compiler and used it for some undergrad courses like real-time programming. IIRC there was an attempt to use Ada for intro to programming courses, but I think they chose Pascal and then eventually C++.
Tangentially related, one of the more interesting projects I've seen in the 3D printing space recently is Prunt. It's a printer control board and firmware, with the latter being developed in Ada.
https://prunt3d.com/
https://github.com/Prunt3D/prunt
It's kind of an esoteric choice, but struck me as "ya know, that's really not a bad fit in concept."
I wrote about some of the reason for choosing it here: https://news.ycombinator.com/item?id=42319962
In Case Study 2, near the end it says "if the client may need to know SIDE_LENGTH, then you can add a function to return the value"
Which yeah, you can do that but it's a constant so you can also more literally write (in the implementation just like that function):
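A sketch of what exposing the constant directly might look like in Rust, assuming a hypothetical `Board` type; only the name `SIDE_LENGTH` comes from the article, and its value here is invented.

```rust
pub struct Board {
    cells: Vec<u8>,
}

impl Board {
    /// Public associated constant: clients read it directly, no getter needed.
    pub const SIDE_LENGTH: usize = 9; // value chosen only for the example

    pub fn new() -> Self {
        Board {
            cells: vec![0; Self::SIDE_LENGTH * Self::SIDE_LENGTH],
        }
    }

    pub fn cell_count(&self) -> usize {
        self.cells.len()
    }
}

fn main() {
    let board = Board::new();
    println!("{} x {} = {}", Board::SIDE_LENGTH, Board::SIDE_LENGTH, board.cell_count());
}
```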
I’d disagree that both languages encourage stack-centric programming idioms. Ada encourages static allocation instead.
I found the inclusion of arrays indexed on arbitrary types in the feature table as a benefit of Ada surprising. That sounds like a dictionary type, which is in the standard library of nearly every popular language. Rust includes two.
I think they're focused very much specifically on the built-in array type. Presumably Ada is allowed to say eggs is an array and the index has type BirdSpecies so eggs[Robin] and eggs[Seagull] work but eggs[5] is nonsense.
Rust is OK with you having a type which implements Index<BirdSpecies> and if eggs is an instance of that type it's OK to ask for eggs[Robin] while eggs[5] won't compile, but Rust won't give you an "array" with this property, you'd have to make your own.
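A minimal sketch of the "make your own" route: a wrapper around a fixed-size array that implements `Index<BirdSpecies>`. The `EggCounts` type and the third variant are made up for the example.

```rust
use std::ops::{Index, IndexMut};

#[derive(Clone, Copy)]
enum BirdSpecies {
    Robin,
    Seagull,
    Wren,
}

/// Dense storage: one slot per species, laid out contiguously.
struct EggCounts([u32; 3]);

impl Index<BirdSpecies> for EggCounts {
    type Output = u32;
    fn index(&self, species: BirdSpecies) -> &u32 {
        // The enum discriminant (0, 1, 2) doubles as the array offset.
        &self.0[species as usize]
    }
}

impl IndexMut<BirdSpecies> for EggCounts {
    fn index_mut(&mut self, species: BirdSpecies) -> &mut u32 {
        &mut self.0[species as usize]
    }
}

fn main() {
    let mut eggs = EggCounts([0; 3]);
    eggs[BirdSpecies::Robin] = 4;
    eggs[BirdSpecies::Seagull] += 2;
    eggs[BirdSpecies::Wren] += 1;
    // eggs[5] = 1; // does not compile: no `Index<integer>` impl for EggCounts
    println!("{} {}", eggs[BirdSpecies::Robin], eggs[BirdSpecies::Seagull]);
}
```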
My guess is that this makes more sense in a language where user defined types are allowed to be a subset of say a basic integer type, which I know Ada has and Rust as yet does not. If you can do that, then array[MyCustomType] is very useful.
I call out specifically User Defined types because, Rust's NonZeroI16 - the 16-bit integers except zero - is compiler-only internal magic, if you want a MultipleOfSixU32 or even U8ButNotThreeForSomeReason that's not "allowed" and so you'd need nightly Rust and an explicit "I don't care that this isn't supported" compiler-only feature flag in your source. I want to change this so that anybody can make the IntegersFiveThroughTwelveU8 or whatever, and there is non zero interest in that happening, but I'd have said the exact same thing two years ago so...
I really don't understand this so I hope you won't mind explaining it. If I would have the type U8ButNotThreeForSomeReason wouldn't that need a check at runtime to make sure you are not assigning 3 to it?
At runtime it depends. If we're using arbitrary outside integers which might be three, we're obliged to check yes, nothing is for free. But perhaps we're mostly or entirely working with numbers we know a priori are never three.
NonZero<T> has a "constructor" named new() which returns Option<NonZero<T>> so that None means nope this value isn't allowed because it's zero. But unwrapping or expecting an Option is constant, so NonZeroI8::new(9).expect("Nine is not zero") will compile and produce a constant that the type system knows isn't zero.
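Roughly, the two cases look like this. The constant is built with a `match` so the sketch also compiles on somewhat older toolchains; on recent compilers the `.expect(...)` form quoted above is, as far as I know, accepted in a `const` too. `NINE` and `checked` are illustrative names.

```rust
use std::num::NonZeroI8;

/// Evaluated at compile time: if the value were zero, the build would fail.
const NINE: NonZeroI8 = match NonZeroI8::new(9) {
    Some(n) => n,
    None => panic!("Nine is not zero"),
};

/// For values only known at run time, the check cannot be avoided.
fn checked(raw: i8) -> Option<NonZeroI8> {
    NonZeroI8::new(raw)
}

fn main() {
    println!("{}", NINE.get());
    println!("{:?}", checked(0));
    println!("{:?}", checked(-5));
}
```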
Three in particular does seem like a weird choice, I want Balanced<signed integer> types such as BalancedI8 which is the 8-bit integers including zero, -100 and +100 but crucially not including -128 which is annoying but often not needed. A more general system is envisioned in "Pattern Types". How much more general? Well, I think proponents who want lots of generality need to help deliver that.
Option<U8ButNotThreeForSomeReason> would have a size of 2 bytes (1 for the discriminant, 1 for the value) whereas Option<NonZeroU8> has a size of only 1 byte, thanks to some special sauce in the compiler that you can't use for your own types. This is the only "magic" around NonZero<T> that I know of, though.
You can make an enum, with all 255 values spelled out, and then write lots of boilerplate, whereupon Option<U8ButNotThreeForSomeReason> is also a single byte in stable Rust today, no problem.
That's kind of silly for 255 values, and while I suspect it would work clearly not a reasonable design for 16-bits let alone 32-bits where I suspect the compiler will reject this wholesale.
Another trick you can do, which will also work just fine for bigger types is called the "XOR trick". You store a NonZero<T> but all your adaptor code XORs with your single not-allowed value, in this case 3 and this is fairly cheap on a modern CPU because it's an ALU operation, no memory fetches except that XOR instruction, so often there's no change to bulk instruction throughput. This works because only 3 XOR 3 == 0, other values will all have bits jiggled but remain valid.
Because your type's storage is the same size, you get all the same optimisations and so once again Option<U8ButNotThreeForSomeReason> is a single byte.
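The XOR trick can be sketched as follows, assuming the forbidden value 3 from the example above. The type name follows the thread; the `new`/`get` method names and the `FORBIDDEN` constant are my own choices.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Holds any u8 except 3. Internally stores `value ^ 3`, which is zero
/// exactly when `value == 3`, so `NonZeroU8` rejects that one input.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
#[repr(transparent)]
pub struct U8ButNotThree(NonZeroU8);

impl U8ButNotThree {
    const FORBIDDEN: u8 = 3;

    pub fn new(value: u8) -> Option<Self> {
        NonZeroU8::new(value ^ Self::FORBIDDEN).map(Self)
    }

    pub fn get(self) -> u8 {
        // XOR-ing with the same constant again recovers the original value.
        self.0.get() ^ Self::FORBIDDEN
    }
}

fn main() {
    println!("{:?}", U8ButNotThree::new(3));
    println!("{:?}", U8ButNotThree::new(7).map(U8ButNotThree::get));
    // The unused zero bit pattern is what `Option` uses to represent `None`.
    println!("{}", size_of::<Option<U8ButNotThree>>());
}
```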
Ada also has hash maps and sets.
http://www.ada-auth.org/standards/22rm/html/RM-TOC.html - See section A.18 on Containers.
The feature of being able to use a discrete range as an array index is very helpful when you have a dense map (most keys will be used) or you also want to be able to iterate over a sequential block of memory (better performance than a dictionary will generally give you, since they don't usually play well with caches).
Thanks for the clarification. I can imagine that being a useful optimization on occasion.
It's not a dictionary, that's a totally different data structure.
In ADA you can subtype the index type into an array, i.e. constraining the size of the allowed values.
The Americans with Disabilities Act doesn't cover subtyping the index type into an array. Ada, the language, does though.
EDIT: Seems I'm getting downvoted, do people not know that ADA is not the name of the programming language? It's Ada, as in Ada Lovelace, whose name was also not generally SHOUTED as ADA.
There does seem to be a strain of weird language fanatics who insist all programming language names must be shouted, so they'll write RUST and ADA, and presumably JAVA and PYTHON. Maybe it's an education thing, maybe they're stuck in an environment with uppercase only like a 1960s FORTRAN programmer?
Maybe who cares?
I have found a strong correlation between people who say JAVA and country of origin. And thus have assumed it's an education thing.
Similarly, and less explicably, are people who program in "C" and don't understand when people mention the oddity. Do people not see quotes? Do they just add them and not realize?
It's funny because Fortran's official name now is Fortran, not FORTRAN.
Very nice text. As I am very sceptical of Rust, I appreciate the comparison to a language I know better. I was not aware that there is no formal spec for Rust. Isn't that a problem if rustc makes changes?
A few things:
1. There is a spec now, Ferrocene donated theirs, and the project is currently integrating it
2. The team takes backwards compatibility seriously, and uses tests to help ensure the lack of breakage. This includes tests like “compile the code being used by Linux” and “compile most open source Rust projects” to show there are no regressions.
Ferrocene is a specification but it’s not a formal specification. [Minirust](https://github.com/minirust/minirust) is the closest thing we have to a formal spec but it’s very much a work-in-progress.
It's a good enough spec to let rustc work with safety critical software, so while something like minirust is great, it's not necessary, just something that's nice to have.
Isn’t the Ada spec also not a formal spec?
It depends on who you are.
For implementers of third-party compilers, researchers of the Rust programming language, and programmers who write unsafe code, this is indeed a problem. It's bad.
For the designers of Rust, "no formal specification" allows them to make changes as long as it is not breaking. It's good.
Medical or military projects often require the software stack/tooling to be certified following certain rules. I know most of these certifications are bogus, but how is that handled with Rust?
Ferrocene provides a certified compiler (based on a spec they've written documenting how it behaves) which is usable for many use cases, but it obviously depends on what exactly your specific domain needs.
I'd love to use Ada as my primary static language if it had broader support. It's in my opinion the best compiled language with strong static typing. Although it has gained traction with Alire, it unfortunately doesn't have enough 3rd party support for my needs yet.
You can generate bindings using gcc -c -fdump-ada-spec <header.h>. They typically work well enough without needing additional tweaks, but if it's more involved you can ask Claude to make a shell script that generates bindings for whatever C library you wanted to use, and it works reasonably well.
In my opinion, don't make thick bindings for your C libraries. It just makes it harder to use them.
For example I don't really like the OpenGL thick bindings for Ada because using them is so wildly different than the C examples that I can't really figure out how to do what I want to do.
What 3rd party things would you like to see?
It's just everything and the kitchen sink, I'm afraid, from a NATS server and client library over cross-platform GUIs including mobile, common file format reading and writing like Excel, Word, and PDF to post-quantum cryptography. I'm unfortunately in the business of Lego-brick software development using well-maintained 3rd-party libraries whenever possible. It's the same with CommonLisp, I love it but in the end I'd have to write too many things on my own to be productive in it.
I envy people who can write foundational, self-contained software. It's so elegant.
You have Alire crates for generating Excel and PDF streams/files. Of course you want everything else ;-).
The word for types depending on a value is dependent typing. Eg lists of size N, numbers in a range, are all what you call dependent types.
Idris (which cosmetically looks like Haskell), Lean, and a bunch of other languages have this feature.
https://en.wikipedia.org/wiki/Dependent_type