Multiversion Python Thoughts

(lucumr.pocoo.org)

86 points | by divbzero 11 hours ago ago

73 comments

  • CraigJPerry 6 hours ago ago

    Should this be facilitated? Should this work really be done? I’m thinking not.

    If I have a special case and I need to do this today, I’m not blocked from doing so (I’ll vendor one of the dependencies and change the name) - certainly a pain to do, especially if I have to go change imports in a c module as part of the dep but achievable and not a blocker for a source available dependency.

    However, if this becomes easily possible, well why shouldn’t I use it?

    The net result is MORE complexity in python packaging. More overhead for infra tools to accommodate.

    • baq 4 hours ago ago

      You’ve never found yourself in a dependency resolution situation where there are no solutions to requirements. You need multiple versions of the same package in such cases.

      The alternative is just cheating: ignore requirements, install packages and hope for the best. Alas hope is not a process.

      • zokier 4 hours ago ago

        No, the alternative is to just fix the code so that the conflict is removed. This is why open source is so powerful, you are empowered to fix stuff instead of piling up tower of workarounds.

        • HelloNurse 2 hours ago ago

          You can fix your code, but indirect dependencies (you use A and B, both depend on C, but different versions of C) cannot be handled well.

          In C, C++, maybe Java, you would at least be able to link A and B with their own private copies of C to avoid conflicts reliably with standard mechanisms rather than unreliably with clever magical tools.

      • dikei 4 hours ago ago

        The usual way is to put conflicting versions in optional-dependencies, and then build one target for each conflicting set of deps. That'd work fine if the code path of one target doesn't touch the others, which is often the case.

        You'd obviously need to have tests for both targets, possibly using a flexible test runner like `nox` to set up a separate test env for each target.
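        A hedged sketch of this approach as a pyproject.toml fragment; the project name, dependency name, and extra names are all made up:

```toml
# Hypothetical pyproject.toml: one extra per conflicting dependency set.
[project]
name = "mytool"
version = "0.1.0"

[project.optional-dependencies]
somelib-v1 = ["somelib>=1,<2"]
somelib-v2 = ["somelib>=2,<3"]
```

        Users (and CI targets) then pick exactly one extra at install time, e.g. pip install mytool[somelib-v2].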

      • theelous3 4 hours ago ago

        I've never found myself in this situation where it can't be solved by being thoughtful and taking the time to improve my code.

        • leetrout 4 hours ago ago

          These issues crop up in dependencies more than your code. Then you have to vendor one of the deps and edit it (and hopefully ship that back upstream and hope the maintainer will merge it).

      • immibis 4 hours ago ago

        You're viewing this from the perspective of a dependency resolution engine. From the perspective of a software engineer, the solution is to find better-behaved dependencies.

    • ForHackernews 4 hours ago ago

      I think it could be very beneficial for enabling backwards compatibility:

          import version1
          import version2
      
          try:
            version2.load_data(input)
          except ValidationError:
            version1.load_data(input)
      
      
       I'm sure people will abuse it, but the idea doesn't seem terrible on the face of it to me.
      • CraigJPerry 4 hours ago ago

        Could you not do that today?

            import version2
        
            try:
              version2.load_data(input)
            except ValidationError:
              import version1
              version1.load_data(input)
        
        (Assuming version in name like this example version2 or lib2 etc.)
        • ForHackernews an hour ago ago

          Only if the library is renamed for each new version, which seems a bit impractical.

      • Drakim 4 hours ago ago

        This assumes that you will get an actual error throw and not merely incorrect behavior.

  • andrewchambers 7 hours ago ago

    I feel like most of these problems just disappear if people would follow the same naming scheme sqlite3 uses. Just put the major version in the name and most tools just work out of the box with multiple versions.

    • the_mitsuhiko 7 hours ago ago

      That’s why I called the library “jinja2”. It was a new major version. But people over the years really did not like it and it did not catch on much.

      • TZubiri 6 hours ago ago

        Agree. The issue was breaking backwards compatibility by releasing a new major under the same name.

      • smitty1e 7 hours ago ago

        Once more, the wisdom of "Explicit is better than implicit" shines.

        Instead, we jump through hoops with our hair on fire to manage complexity.

        People.

      • rbanffy 5 hours ago ago

        Really, people should just update their software to use the newer libraries and fix whatever breaks. If you want to use functions from version 2, you should port the rest of the code to version 2.

    • bobnamob 7 hours ago ago

      Of course coming with the _major_ caveat that the interface and behaviour[1] of the dependency is absolutely stable for the entirety of the "major version".

      I'm an advocate for this style of library/dependency development, unfortunately in my experience the average dependency doesn't have the discipline to pull it off.

      [1] https://www.hyrumslaw.com

      • zo1 3 hours ago ago

        I don't want to "pull this off", nor do I want to expend the time/energy to do so. We're not Microsoft with effectively infinite budgets, nor are we Elasticsearch/Grafana Orgs with metric oodles of developer-hours, and 50k+ github star mindshares-worth of evangelists behind us to document and tutorialize every tiny little feature in a cookbook or doc website.

        Code changes, and it's kinda silly to expect interfaces to be locked in place as that'll stifle development for even small-ish features. Does that mean every minor version or commit will change fundamental or large parts of the codebase? Probably not, but it's a sliding scale and people seriously need to find something better to do than writing Yet Another Python Package Manager.

        I use the term "we" loosely here ofc in the context of this mini-rant.

    • benrutter 6 hours ago ago

      I think this works ok if your library is something like Django or Pandas that people are building their project around. But it makes things exponentially more complex for libraries like pyarrow or fsspec that are normally subdependencies.

      Imagine trying to do things like import pyarrow14, then if that failed try pyarrow13, etc. Additionally, python doesn't have a good way of saying "I need one of the following different libraries as a dependency".
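      A sketch of that fallback dance, assuming hypothetical version-suffixed package names (pyarrow14/pyarrow13 do not actually exist on PyPI):

```python
# Hypothetical: if packages shipped with the major version in the
# module name, consumers would need fallback chains like this.
try:
    import pyarrow14 as pyarrow  # made-up name: pyarrow, major version 14
except ImportError:
    try:
        import pyarrow13 as pyarrow  # next-oldest major we support
    except ImportError:
        pyarrow = None  # no supported version installed

if pyarrow is None:
    print("no supported pyarrow major installed")
```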

      • andrewchambers 6 hours ago ago

        I like the way python handled this situation - python3 for when you care, python for when you don't.

    • greener_grass 7 hours ago ago

      This is Rich Hickey's suggestion too

      https://www.youtube.com/watch?v=oyLBGkS5ICk

    • dist-epoch 4 hours ago ago

      sqlite3 was released 20 years ago, in 2004. I'm not convinced all code written in 2004 which used sqlite3 would still work today against the latest version.

      • aragilar 4 hours ago ago

        If stuff was removed, then I would have expected them to bump the ABI? As they haven't done that, I would actually assume it would work absent evidence to the contrary?

  • xrd 3 hours ago ago

    Another commenter here mentioned "More overhead for infra tools to accommodate." I agree.

    I am using python as my main language these days, coming from JS and C++ and a bit of rust. The biggest problem I face is that the tools (editors, mainly) don't support the basic packaging tools.

    I use venv for everything, but when I try to use something else, I almost always get bitten.

    For example, asdf. I tried to use this tool. So awesome! It works great from the command line. But, when I try to use Zed, it cannot figure out what to do, and I cannot find references in the Zed github repository on the right way to set up pyproject.toml.

    And, emacs. Will uv work within emacs? Each of these packaging tools (and I'm thinking about the long history of nvm (node version manager), brew, and everything else) makes different assumptions about the right way to modify the path variable, or create aliases, or use shims (the definition of which varies with each tool) and I'm sure I'm missing other details.

    Does uv do the right thing mostly? I will say my experiences with python and the tooling has been more frustrating than the tools for JS. I use pnpm and it just works, and I understand the benefits. And, I can survive with just npm and yarn. But, to me, it is saying a lot that the python tooling feels more broken than JS. I mean, I lived through the webpack years, and I'm still using JS and have a generally favorable opinion of it.

    • salomonk_mur 2 hours ago ago

      Yes, uv generally does the right thing.

      I hold the same opinion as you. Python packaging is awful. But uv managed to just make it work.

    • codethief 2 hours ago ago

      Could you elaborate on the issues you've been seeing in Zed? I've been using asdf as a version manager for Python for quite some time and haven't really had any issues.

      • xrd 2 hours ago ago

        You are using asdf and zed successfully? I'm so glad to hear this.

        I removed asdf because I could not get it to recognize the pyproject.toml file. This is working in harmony for you?

        With Zed and a venv (from python -m venv .venv for example), Zed properly recognized installed packages and provided type hinting and docs, but when I switched to asdf it did not seem to work. But, I was new to asdf and perhaps was using it incorrectly.

        I was always assuming that when I'm in the command line, running asdf to use the right python works because the path is correctly established. But, when I run zed, it launches without the path setup step, and things went badly. I'm just speculating, but I could not get type hinting and didn't know how to fix it.

  • orf 7 hours ago ago

    This comes up every now and again, and there are two fairly simple examples that I think show the complexities:

    One:

    Library A takes a callback function, and catches “request.HttpError” when invoking that callback.

    The callback throws an exception from a differing version of “request”, which is missing an attribute that the exception handler requires.

    What happens? How?

    Two:

    Library A has a function that returns a “request.Response” object.

    Library B has a function that accepts a “request.Response” object, and performs “isinstance”/type equality on the object.

    Library A and library B have differing and incompatible dependencies on “request”.

    What version of the request object is sent to library B from library A, and how does “isinstance”/“type” interact with it?

    Both of these revolve around class identities. In Python they are intrinsically linked to the defining module. Either you break this invariant and have two incompatible/different types have the same identity and introduce all kinds of bugs, or you don’t and also introduce all kinds of bugs - “yes, this is a request.Response object, but this method doesn’t exist on this request.Response object”, or “yes this is someone’s request.Response object, but it’s not your request.Response object”

    Getting different module imports to succeed is more than possible, getting them to work together is another thing entirely.
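    A minimal stdlib-only sketch of the identity problem: executing the same class definition twice, as if it came from two installed versions of one library, yields two distinct types. The module and class names here are made up.

```python
import types

SOURCE = "class Response: pass"

def load_copy(name):
    # Execute the same source in a fresh module object, simulating a
    # second installed copy of the same library.
    mod = types.ModuleType(name)
    exec(SOURCE, mod.__dict__)
    return mod

requests_v1 = load_copy("requests_v1")
requests_v2 = load_copy("requests_v2")

resp = requests_v1.Response()                  # "library A" returns its Response
print(isinstance(resp, requests_v2.Response))  # "library B" checks its own: False
```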

    One solution to this is the concept of visibility, which in Python is famously “not really a thing”. It’s safe to use incompatible versions of a library as long as the types are not visible - I.e no method returns a request.Response object, so the module is essentially a completely private implementation detail. This is how Rust handles this, I think.

    However, this is obviously fucked by exceptions, so it seems pretty intractable.

    • the_mitsuhiko 7 hours ago ago

      That is not any different in Python than it is in Rust, Go or JavaScript. Yes: it's a problem if those different dependency versions shine through via public APIs. However there are plenty of cases where the dependency you are using in a specific version is isolated within your own internal API.

      I think if Python were to want to go down this path it should be isolated to explicit migration cases for specific libraries that want to opt themselves into multi-version resolution. I think it would let pretty core libraries in the ecosystem move in backwards-incompatible ways much more smoothly than they can today.

      • orf 6 hours ago ago

        The problem with this is exceptions: they easily allow dependencies to escape, be that via uncaught exceptions or wrapped ones.

        Go and JavaScript have type systems and idioms far more amenable to this kind of thing (interfaces for Go, no real type identity + reliance on structural typing for JS) and rely a lot less on the kind of reflection common in Python (identity, class, etc).

        I guess there are some use cases for this, I just feel that the lack of ability to enforce visibility combined with the “rock and a hard place” identity trade-off limits the practical usefulness.

        • the_mitsuhiko 4 hours ago ago

          > The problem with this is exceptions: they easily allow dependencies to escape, be that via uncaught exceptions or wrapped ones.

          Sure, but that just means your dependency was not really internal. Errors are API too.

        • simiones 6 hours ago ago

          Exceptions are no different from Go's error types (and in general interface types in any language) from this point of view. If moduleA is doing something like `errors.Is(err, ModuleBError)` on an error that was returned from moduleC which uses a different version of moduleB, you'll get the same issue.

          • orf 5 hours ago ago

            That’s interesting - is it common to do this instead of casting to an interface?

            It seems a lot more impactful with Python due to type equality being core to how exceptions are handled, even if there are similarities.

            • simiones 5 hours ago ago

              Well, the most common is of course `if err != nil`, which is unaffected. But on the very rare occasions that someone is actually handling errors in Go, `errors.Is` and `errors.As` are recommended over plain casts since they correctly handle composite exceptions.

              Say a function returns `fmt.Errorf("Error while doing intermediate operation: %w", lowerLevelErr)`, where `lowerLevelErr` is `ModuleBError`. Then, if you do `if _, ok := err.(ModuleBError); ok {...}`, this will return false; but if you do `if errors.Is(err, ModuleBError)`, you will get the expected true.

              Regardless, the core problem would be the same: if your code can handle moduleB v1.5 errors but it's receiving moduleB v1.7 errors, then it may not be able to handle them. This same thing happens with error values, Exceptions, and in fact any other case of two different implementations returned under the same interface.

              You even have this problem with C-style integer error codes: say in version 1.5, whenever you try to open a path that is not recognized, you return the int 404. But in 1.7, you return 404 for a missing file, but 407 if it's a missing dir. Any code that is checking for err > 0 will keep working exactly as well, but code which was checking for code 404 to fix missing dir paths is now broken, even though the types are all exactly the same.

      • kelnos 5 hours ago ago

        I think the issue is more that in Python you could get confusing runtime failures. In Rust, it will fail to compile if you're trying to mix two different major versions of a dependency like that. I'm fine with the latter, but the former is unacceptable.

        • Macha 3 hours ago ago

          It is still painful in Rust, I remember at one stage in a project having a bunch of helper functions to convert structs from one version of nalgebra to another as my game engine (Amethyst at the time, I think?) used one version and the ncollide library used other, and both exposed nalgebra in their public interfaces.

        • dwattttt 4 hours ago ago

          You can have multiple incompatible dependency versions imported by the one crate; you have to give them different names when declaring them, but it works fine (I just tested it).

          It follows the approach of "objects from one version of the library are not compatible with objects from the other version" mentioned above, and results in a compile-time error (a potentially confusing type error, although the error message might call out that there's multiple library versions involved).

    • snatchpiesinger 6 hours ago ago

      You can have private and public dependencies. Private dependencies are the ones that don't show up in your interface at all. That is, you don't return it, you don't throw it or catch it (other than passing through), you don't take callbacks that have it in their signature, etc... You can use private dependencies for the implementation.

      It should be safe to use multiple versions of the same library, as long as they are used as private dependencies of unrelated dependencies. It would require some tooling support to do it safely:

      1. Being able to declare dependencies are "private" or "public".

      2. Tooling to check that you don't use private dependencies in your interfaces. This requires type annotations to gain some confidence, but even then, exceptions are a problem that is hard to check for (in Python that is).

      In compiled languages there are additional complications, like exported symbols. It is solvable in some controlled circumstances, but it's best to just not have this problem.

      • orf 6 hours ago ago

        > you don't throw it or catch it

        Herein lies the issue: in this context exceptions can be thought of as the same as returns. So you actually need to catch/handle all possible exceptions in order to not leak private types.

        Also what does “except requests.HttpError” do in an outer context? It checks the class of an exception - so either it doesn’t catch some other modules version of requests.HttpError (confusion, invariants broken) or it does (confusion, invariants broken).
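        A short sketch of that `except` behavior, with made-up module names: the same exception class loaded twice matches only its own copy.

```python
import types

SOURCE = "class HttpError(Exception): pass"

def load_copy(name):
    # Execute the same source in a fresh module, simulating a second
    # installed version of the same library.
    mod = types.ModuleType(name)
    exec(SOURCE, mod.__dict__)
    return mod

requests_v1 = load_copy("requests_v1")
requests_v2 = load_copy("requests_v2")

try:
    raise requests_v2.HttpError("boom")  # raised by "some other module"
except requests_v1.HttpError:
    caught_by = "v1 handler"             # never runs: different class identity
except requests_v2.HttpError:
    caught_by = "v2 handler"

print(caught_by)  # v2 handler
```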

        • snatchpiesinger an hour ago ago

          It's fine as long as you catch all exceptions, and only produce ones that you document. Your users aren't supposed to know that you used `requests` at all.
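          A hedged sketch of that discipline; every name here is hypothetical:

```python
# Catch everything at the module boundary and raise only your own
# documented exception type, so callers never see the private
# dependency's exception classes.

class ClientError(Exception):
    """The only exception this module is documented to raise."""

def _private_http_get(url):
    # Stand-in for a call into a private dependency that may raise
    # its own exception types.
    raise ValueError(f"low-level failure fetching {url}")

def fetch(url):
    try:
        return _private_http_get(url)
    except Exception as exc:
        # The original exception is still preserved on __cause__
        # for debugging, without being part of the public API.
        raise ClientError(f"fetch of {url} failed") from exc
```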

          • orf an hour ago ago

            Sure, but who does this? And the typical pattern is to wrap exceptions, giving you access to the inner exception if you need more context.

            The requests HTTP exception contains the request and response object. Wrapping that would be a huge pain and a lot of code.

  • eqvinox 38 minutes ago ago

    The absence of any references to how other ecosystems do this (e.g. the oldest widely in use: ELF soversion) strikes me as a massive oversight. There's 40-50 years of history of people trying to do close cousins of this.

  • pid-1 5 hours ago ago

    I have done the following in the past:

    1. pip install libfoo==1.x.x

    2. pip install libfoo==2.x.x --target ~/libs/libfoo_v2 # vendor libfoo v2

    3.

        import sys
        import os

        # Import the vendored v2 first, then drop it from the module
        # cache so the regular v1 can still be imported afterwards.
        original_sys_path = sys.path.copy()
        sys.path.insert(0, os.path.expanduser('~/libs/libfoo_v2'))
        import libfoo as libfoo_v2
        del sys.modules['libfoo']
        sys.path = original_sys_path

        import libfoo  # v1, from the normal site-packages

    There are caveats of course (the ~ has to be expanded, and sys.modules must not already hold a libfoo entry). But it works for simple cases.

  • aragilar 4 hours ago ago

    What is the difference between this and the approach taken by eggs/buildout (other than the time difference, and so some changes in APIs)? My impression (having never used buildout) was that handling multiple versions made debugging hard, and that there were lots of random issues unless you did everything correctly (and that setuptools and its variants switched away from the options to use multi-versioning because it was effectively a foot-gun).

  • dikei 6 hours ago ago

    One way is to publish 2 variants of your packages: one with the major version number appended to the package name and one without the version number. Users who need to install multiple versions can use the first variant, while users who just want to follow the latest can use the second.

  • rbanffy 5 hours ago ago

    One of the reasons I made pip-chill is to create an incentive to not bother with version numbers and just make your software always run against the latest versions. If something breaks, fix it. If it's too hard to do it, maybe you depend on too many things. Leftpad feelings here.

    Having your software depend on two different versions of a library is just asking for more pain.

    BTW, I still need to fix it to run on 3.12+ in a neat way. For now, it runs, but I don't like it.

    • theelous3 4 hours ago ago

      This is a fun idea. I always do this currently with some non-core dependencies like black, linters, or dependencies I control the versioning of. I can't imagine I'd use this on libraries that provide real functionality in prod code though. If you have an incident popping off and you find out your dependency resolution goblin has reared its little head at the same time, bad day ahead.

    • 4 hours ago ago
      [deleted]
  • physicsguy 7 hours ago ago

    Does this not fall over in the circumstance of linking against a C library (not specifically a Python extension) as many Python libraries do?

    E.g. I write “library”, of which v1 depends on somelib.so.1.0.0 and v2 depends on somelib.so.2.0.0

    If somelib has some symbols clashing in the names this can cause real problems!

    • the_mitsuhiko 7 hours ago ago

      They don’t show up in the global symbol namespace usually so it’s fine. It’s only an issue for some libraries that load globally so that one library can reference another c library.

    • snatchpiesinger 6 hours ago ago

      Python dlopens binary wheels with RTLD_LOCAL on Linux, and I assume it does the equivalent on Windows.

      There were issues relatively recently with -ffast-math binary wheels in Python packages, as some versions of gcc generate a global constructor with that option that messes with the floating point environment, affecting the whole process regardless of symbol namespaces. It's mostly just an insanity of this option and gcc behavior though.

      • dist-epoch 4 hours ago ago

        Windows has a different design where symbols are not merged: the .dll name is part of the resolution, so two different .dlls can export the same symbol.

  • benavn 4 hours ago ago

    It is kind of the same problem as for shared libraries. In the GNU universe the cleanest solution is to have multiple soversions. Transferred to jinja it would be jinja.so.1 and jinja.so.2.

    Maintaining fine grained symbol versioning is a pain and a massive amount of work for the package maintainer:

    https://invisible-island.net/ncurses/ncurses-mapsyms.html

    Honestly, multiple installed versions like jinja1 and jinja2 sounds best to me.

  • i-use-nixos-btw 4 hours ago ago

    This would introduce more problems than it solves 99% of the time. The 1% of the time, it could be very handy.

    I haven’t used UV, but it says that it manages python as well as packages - I’m guessing like conda, python-venv, and of course nix does.

    If the C api is an issue, it sounds like you have control over it if you need it. You manage the python distribution, so could it be patched?

    This way it feels like you’d be able to establish not just what is being imported, but what is importing it - then redirect through a package router and grab the one you want.

    This may be particularly useful if you’re loading in shared libraries, because that is already a dumpster fire in python, and I imagine loading in different versions of the same thing would be quite awkward as-is.

  • thingsgg 7 hours ago ago

    It's one thing to see if something like this is possible from a technical standpoint, but whether this is desirable for the ecosystem as a whole is a different question. I would argue that allowing multiple versions of packages in the dependency tree is bad. It removes incentives for maintainers to adhere to sane versioning standards like semver, and also the incentive to keep dependencies updated, because resolution will never be impossible for package users. Instead, they will get subtle bugs due to unknowingly relying on multiple versions of the same package in their application that are incompatible with each other.

    For lack of a better word, the single package version forces the ecosystem to keep "moving": if you want your package to be continued to be used, you better make sure it works with reasonably recent other packages from the ecosystem.

    • the_mitsuhiko 7 hours ago ago

      > It removes incentives for maintainers to adhere to sane versioning standards like semver

      Semver does not matter in this way. The issue with having a singular resolution — semver or not — is that you can only move your entire dependency tree at once. If you have a very core library then you are locked in unless you can move the entire ecosystem up which is incredibly hard.

    • greatgib 6 hours ago ago

      Indeed, in my opinion it is the best way to end up in a cluster mess like the nodejs/npm ecosystem...

      And a very real issue is that young developers no longer know how to develop while limiting dependencies to the strict minimum. You have some projects with hundreds of dependencies without any real reason other than laziness or always using the new shiny thing.

      • jollyllama 3 hours ago ago

        Indeed, I blame npm for normalizing this kind of thing. It's no surprise that frontend devs wouldn't understand why it's bad, but Python devs should know better.

  • 2 hours ago ago
    [deleted]
  • TZubiri 6 hours ago ago

    Pip, Venv, poetry, pipenv, now uv

    If you are still struggling with this in 2024, you are missing the actual challenges of the world.

    • dikei 6 hours ago ago

      Just like JS has npm, yarn, pnpm, bun...

      It seems the more users a language has, the more dev tools get written for it.

      • TZubiri 6 hours ago ago

        Maybe if you don't import stuff like left-pad and actually write some code, you wouldn't have to write a dissertation on package management.

        • dikei 6 hours ago ago

          Actually, I prefer spending my time writing right-pad

          • TZubiri an hour ago ago

            Maybe we could abstract directions away.

            d-pad(pad_character,direction,string)

            Of course this would be an internal dependency of both left-pad and right-pad.
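            A tongue-in-cheek sketch of the proposed d-pad; since padding needs a target width, a width parameter is added to the signature given above, and every name here is hypothetical:

```python
# Hypothetical shared dependency of left-pad and right-pad.
def d_pad(pad_character, direction, string, width):
    """Pad `string` to `width` on the given side."""
    if direction == "left":
        return string.rjust(width, pad_character)
    if direction == "right":
        return string.ljust(width, pad_character)
    raise ValueError(f"unknown direction: {direction}")

# left-pad and right-pad become thin wrappers.
def left_pad(s, width, ch=" "):
    return d_pad(ch, "left", s, width)

def right_pad(s, width, ch=" "):
    return d_pad(ch, "right", s, width)
```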

    • the_mitsuhiko 4 hours ago ago

      Note that multiversion support was one of the goals that I had when I built rye. It's also something I want to see if it's possible to do with uv. Multiversion support is entirely orthogonal to how you are installing packages.

    • simiones 6 hours ago ago

      So the fact that there are 5 different tools that (attempt to?) fix this problem is a sign that it is not a problem, from your point of view?

      • TZubiri 6 hours ago ago

        It's a sign developers get stuck in paper bags. Same thing with text editors, orms, frameworks.

        Imagine 2050, flying cars, talking robots, teleportation, and John Developer is going to be releasing solution 57 to a problem that was solved in the 90s

        • dist-epoch 4 hours ago ago

          What is the 90s solution to this problem? autoconf, apt, yum... oops, here we go again with multiple standards.

    • globular-toast 5 hours ago ago

      Agreed, but who is struggling with it? Do you think tool development should just stop because it's not an "actual challenge"?

      • zo1 3 hours ago ago

        Not OP, but I honestly think that these kinds of devs are just way too opinionated and don't want to accept the 95% valid existing solution. In a healthy ecosystem, people can try and experiment and if something is awesome then people would naturally switch to it over time.

        Sadly what happens now is that everyone under the sun tries to evangelise and "create content" for these new tools so much that the natural filter mechanisms don't work. Doubly-so because tools like Google effectively created a new fitness function for peoples' behavior that incentivizes just plain old content creation (whatever weird form it may take, including new libraries being created + promoted).

        • maleldil 2 hours ago ago

          IMO, there's a direct line of improvement between pipenv, poetry, (briefly) rye and now uv. I think the ecosystem is improving over time and will eventually coalesce around a majority platform. I like uv, but I'm unsure if that will be the final product.

          Beyond just tool names, it's also important to realise that there has been a significant movement from the Python development team to standardise aspects of tooling. Tools like Poetry and uv weren't possible a few years ago before there was pyproject.toml to unify a bunch of separate things, for example.

          • TZubiri an hour ago ago

            I don't know, I don't care. I just use pip. If I need to virtualize I just do so at the OS or kernel level.

            At any rate, I take care of all of my python installs by not downloading a gazillion random packages online. If I ever reach the situation where package 517 depends on package 208 and package 598 depends on a different version of package 208, I'll just pull out the Flammenwerfer and trash the whole thing before it reproduces.