Async Python Is Secretly Deterministic

(dbos.dev)

77 points | by KraftyOne 4 days ago

37 comments

  • 12_throw_away 4 days ago

    No, deterministic scheduling is not a property of async Python.

    Yes, the stdlib asyncio event loop does have deterministic scheduling, but that's an implementation detail and I would not rely on it for anything critical. Other event loops - for instance trio [1] - explicitly randomize startup order so that you won't accidentally write code that relies on it.

    [1] https://github.com/python-trio/trio/issues/32
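    For context, the behavior in question looks like this on CPython's stdlib asyncio (a minimal sketch; as noted, other loops such as trio deliberately don't guarantee it):

```python
import asyncio

async def demo():
    order = []

    async def worker(name):
        # Record the order in which each task first gets to run.
        order.append(name)

    # On the stdlib event loop, tasks created back-to-back are
    # started FIFO, in creation order.
    tasks = [asyncio.create_task(worker(n)) for n in ("a", "b", "c")]
    await asyncio.gather(*tasks)
    return order

print(asyncio.run(demo()))  # ['a', 'b', 'c'] on CPython's asyncio
```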

    • StableAlkyne 4 days ago

      > but that's an implementation detail

      That sounds familiar...

      https://stackoverflow.com/questions/39980323/are-dictionarie...

    • KraftyOne 4 days ago

      It's been a stable (and documented) behavior of the Python standard library for almost a decade now. It's possible it may change--nothing is ever set in stone--but that would be a large change in Python that would come with plenty of warning and time for adjustment.

      • 9dev 4 days ago

        And then one day, Astral creates a new Python implementation in Rust or something that is way faster and all the rage, but does this particular thing differently than CPython. Whoops, you can’t use that runtime, because you now have cursed parts in your codebase that produce nondeterministic behaviour you can’t really find a reason for.

        • stuartjohnson12 3 days ago

          And then all the serverless platforms will start using Astral's new Rust-based runtime to reduce cold starts, and in theory it's identical, except half the packages now don't work, it's very hard to anticipate which ones will and which won't, and behold! You have achieved Deno.

        • ubercore 3 days ago

          That's a bit what it felt like when I was learning Rust async.

          I get it, but "ecosystems" of async runtimes have a pretty big cost.

        • wavemode 3 days ago

          If I know anything about the Python community, that new runtime would simply never gain significant traction, due to the incompatibility.

        • LtWorf 3 days ago

          If the python core team cared about not breaking things I wouldn't need to run my tests on all versions of python.

      • farsa 3 days ago

        Well, in my early days programming Python I wrote a lot(!!) of code assuming non-concurrent execution, and some of that code will break in the future with GIL removal. Hopefully the Python devs keep these important changes as opt-ins.

    • mort96 3 days ago

      How do you differentiate between something that "happens to work due to an implementation detail" and a "proper feature that's specified to work" in a language without a specification?

      • nhumrich 3 days ago

        In a language without a spec? You don't. But Python has a very strong spec.

      • BrenBarn 3 days ago

        There's still documentation.

    • game_the0ry 3 days ago

      I just realized how little I know about how async event loops work.

    • OptionOfT 3 days ago

      The good old Workflow XKCD matches this perfectly: https://xkcd.com/1172/

  • jpollock 4 days ago

    That's deterministic dispatch; as soon as it forks or communicates, it is non-deterministic again?

    Don't you need something like a network clock to get deterministic replay?

    It can't use immediate return on replay, or else the order will change.

    This makes me twitchy. The dependencies should be better modelled, and idempotency used instead of logging and caching.

  • whinvik 4 days ago

    Is this guaranteed by the async specification? Or is this just current behavior which could be changed in a future update? Feels like a brittle dependency if it's not part of the spec.

    • KraftyOne 4 days ago

      It's documented behavior for the low-level API (e.g. asyncio.call_soon https://docs.python.org/3/library/asyncio-eventloop.html#asy...). More broadly, this has been a stable behavior of the Python standard library for almost a decade now. If it does change, that would be a huge behavioral change that would come with plenty of warning and time for adjustment.
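      The documented FIFO guarantee for the low-level API is easy to observe directly (a minimal sketch against the stdlib loop):

```python
import asyncio

def fifo_demo():
    # call_soon is documented to run callbacks in the order
    # they were registered (FIFO).
    loop = asyncio.new_event_loop()
    calls = []
    for i in range(3):
        loop.call_soon(calls.append, i)
    loop.call_soon(loop.stop)
    loop.run_forever()
    loop.close()
    return calls

print(fifo_demo())  # [0, 1, 2]
```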

      • btilly 4 days ago

        In my experience, developers who rely on precise and relatively obscure corner cases tend to assume that they are more stable than they later prove to be. I've been that developer, and I've been burned because of it.

        Even more painfully, I've been the maintenance programmer who was burned because some OTHER programmer trusted such a feature. And then it was my job to figure out the hidden assumption after it broke, long after the original programmer was gone. You know the old saying that you have to be twice as clever to debug code as you were to write it? Debugging another person's clever and poorly commented tricks is no fun!

        I'd therefore trust this feature a lot less than you appear to. I'd be tempted to instead wrap the existing loop with a new loop to which I can add instrumentation etc. It's more work. But if it breaks, it will be clear why it broke.
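        A sketch of that wrapping idea, assuming a subclass of the stdlib selector loop (the class name and the recording attribute are illustrative):

```python
import asyncio

class InstrumentedLoop(asyncio.SelectorEventLoop):
    """Illustrative wrapper: records every callback scheduled via
    call_soon, so if an ordering assumption ever breaks, there is
    a trace explaining why."""

    def __init__(self):
        super().__init__()
        self.scheduled = []  # callback names, in scheduling order

    def call_soon(self, callback, *args, **kwargs):
        self.scheduled.append(getattr(callback, "__name__", repr(callback)))
        return super().call_soon(callback, *args, **kwargs)

async def main():
    await asyncio.sleep(0)

loop = InstrumentedLoop()
try:
    loop.run_until_complete(main())
finally:
    loop.close()
print(len(loop.scheduled) > 0)  # True: even a trivial run schedules callbacks
```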

      • whinvik 3 days ago

        That gives me slightly more confidence but only slightly.

        For example, what happens if I use a different async backend like uvloop or trio?

  • arn3n 4 days ago

    While it's not production-ready, I've been happily surprised by this functionality when building with it. I love my interpreters to be deterministic, or, when random, to be explicitly seeded. It makes debugging much easier when I can rerun the same program multiple times and expect identical results.

    • frizlab 4 days ago

      Interestingly, I think things that should not be deterministic should actually be forced not to be.

      Swift, for instance, will explicitly make iterating on a dictionary non-deterministic (by randomizing the iteration), in order to catch weird bugs early if a client relies (knowingly or not) on the specific order of the dictionary's elements.

      • lilyball 4 days ago

        This claim sounds vaguely familiar to me (though the documentation on Dictionary doesn't state any reason why the iteration order is unpredictable). The more common reason for languages to have unstable hash-table iteration orders is as a consequence of protecting against hash flooding: malicious input crafted so that all keys hash to the same bucket (iteration order depends on bucket order).

        • frizlab 3 days ago

          Oh yeah you’re right, apparently the main reason was to avoid hash-flooding attacks[1].

          I do seem to remember there was a claim regarding the fact that it also prevented a certain class of errors (that I mentioned earlier), but I cannot find the source again, so it might just be my memory playing tricks on me.

          [1] https://forums.swift.org/t/psa-the-stdlib-now-uses-randomly-...

      • saidinesh5 4 days ago

        One more reason for randomizing hash table iteration is to prevent denial-of-service attacks:

        https://lukasmartinelli.ch/web/2014/11/17/php-dos-attack-rev...

  • annexrichmond 3 days ago

    This reminds me of this great talk from Temporal about how they built their Python SDK by creating a distributed deterministic event loop on top of asyncio. [1]

    [1] https://www.youtube.com/watch?v=wEbUzMYlAAI

  • lexicality 4 days ago

    > This makes it possible to write simple code that’s both concurrent and safe.

    Yeah, great, my hello world program is deterministic.

    What happens when you introduce I/O? Is every network call deterministic? Can you depend on reading a file taking the same amount of time and being woken up by the scheduler in the same order every time?

    • PufPufPuf 4 days ago

      This is about durable execution -- being able to resume execution "from the middle", which is often done by executing from the beginning but skipping external calls. The second time around, the I/O is exactly replayed from stored values, and the "deterministic" part refers only to the async scheduler, which behaves the same as long as the results are the same.

      Coincidentally I have been experimenting with something very similar in JavaScript in the past and there the scheduler also has the same property.
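      A minimal sketch of that record/replay pattern (all names are illustrative, not any actual SDK's API):

```python
import asyncio

class Recorder:
    """First run: execute each external call for real and log its result.
    Replay: return logged results in order instead of re-executing I/O,
    so the workflow re-runs deterministically from stored values."""

    def __init__(self, log=None):
        self.replaying = log is not None
        self.log = log if log is not None else []
        self.pos = 0

    async def step(self, func, *args):
        if self.replaying:
            result = self.log[self.pos]  # skip the real call on replay
            self.pos += 1
            return result
        result = await func(*args)
        self.log.append(result)
        return result

async def fetch(x):
    return x * 2  # stand-in for a real network call

async def workflow(rec):
    a = await rec.step(fetch, 1)
    b = await rec.step(fetch, 2)
    return a + b

rec = Recorder()
first = asyncio.run(workflow(rec))                 # records [2, 4]
replay = asyncio.run(workflow(Recorder(rec.log)))  # replays, no "I/O"
print(first, replay)  # 6 6
```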

    • TeMPOraL 3 days ago

      No, but determinism reduces the number of stones you need to turn over when debugging hairy problems, such as your program occasionally returning different results for the same inputs. You may not have control over the timing of I/O operations or the order of external events (including the OS scheduler), but at least you know that your side of the invocation/response is, in isolation, behaving predictably.

      • nomel 3 days ago

        I take the opposite approach for this sort of thing, since I would much rather flip and remove the stones: I explicitly randomize the order of containers during development and testing, and always in my unit tests, so depending on order can't be a problem. No luck required!
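        That approach can be sketched in a few lines (the helper name is made up):

```python
import random

def shuffled_items(d, seed=None):
    """Return dict items in a randomized order so code under test can't
    silently depend on insertion order; pass a seed to reproduce a failure."""
    items = list(d.items())
    random.Random(seed).shuffle(items)
    return items

config = {"a": 1, "b": 2, "c": 3}
# Process in random order during tests: any hidden order-dependence
# surfaces as a (reproducible, once seeded) failure instead of lurking.
total = sum(v for _, v in shuffled_items(config))
print(total)  # 6 regardless of iteration order
```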

        • TeMPOraL 3 days ago

          You want both. More specifically, you want to be in control of which one you're actually doing.

          Randomization is great at avoiding erroneous dependencies on spurious cause-and-effect chains. Determinism is needed to ensure the cause-and-effect chains that are core to the problem actually work.

          • nomel 9 hours ago

            I don't understand.

            Determinism isn't required unless it's required.

            If it's not required, then you must plan for it NOT being deterministic, with any accidental determinism being ignored (to be safe, forcefully so with an intentional randomization/delays within the library). If it is required, then my random input should always (from the tests perspective) come out the same as I put it in.

            If possible, force the corner case if the corner case is a concern. That's the purpose of testing. If there's a concern with timing, force bad timing with random delays. The alternative is relying on luck. I try to make my code as unlucky as possible, during development/testing.

    • KraftyOne 4 days ago

      That's the cool thing about this behavior--it doesn't matter how complex your program is, your async functions start in the same order they're called (though after that, they may interleave and finish in any order).

      • lexicality 4 days ago

        Only for tasks that are created in synchronous code. If you start two tasks that each make a web request and then start a new task with the result of that request, you will immediately lose ordering.

        • KraftyOne 4 days ago

          Yes, this only applies for tasks created from the same (sync or async) function. If tasks are creating other tasks, anything is possible.
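          A small sketch of how the ordering gets lost once completion times are involved (sleep stands in for the web request):

```python
import asyncio
import random

async def chained():
    follow_ups = []

    async def fetch(name):
        await asyncio.sleep(random.random() / 100)  # simulated network latency
        # The follow-up work runs when the "request" completes, so its
        # order tracks completion order, not the original call order.
        follow_ups.append(name)

    await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))
    return follow_ups

result = asyncio.run(chained())
print(sorted(result))  # ['a', 'b', 'c']; the unsorted order varies run to run
```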
