Building a CI/CD Pipeline Runner from Scratch in Python

(muhammadraza.me)

64 points | by mr_o47 4 days ago

23 comments

  • systemerror 2 days ago

    Why does an air-gapped environment require rolling your own CI/CD solution? There are plenty of examples of air-gapped Jenkins and/or Argo Workflows. Was this just an educational exercise?

    • verdverm a day ago

      Jenkins sucks but is insanely reliable.

      Argo Workflows does not live up to what they advertise; it is so much more complex to set up and then build workflows for. Helm + Argo is pain (both use the same template delimiters...)

      • bigstrat2003 a day ago

        Jenkins, like many tools with extreme flexibility, sucks as much as you make it suck. You can pretty easily turn Jenkins into a monstrosity that makes everyone afraid to ever try to update it. On the other hand, you can also have a pretty solid setup that is easy to work on. The trouble is that the tool itself doesn't guide you much to the good path, so unless you've seen a pleasant Jenkins instance before you're likely to have a worse time than necessary.

          • IshKebab a day ago

            Are you sure? Because the last time I used Jenkins it actively sucked. The interface was a total mess, and it didn't surface results in any useful way.

            • blackjack_ a day ago

              What particular issues do you have with it? My company uses it at scale (dozens of different instances, hundreds of workers, thousands of pipelines) to support thousands of applications, and we are reasonably happy with it. The DSL is incredibly helpful at scale. IaC is incredibly helpful at scale. It requires a good amount of upkeep, but all things underpinning large amounts of infrastructure require a good amount of upkeep.

              • verdverm a day ago

                We've minimized our usage of the DSL; there is no way for devs to debug it without pushing commits, and it means you have to implement much of your CI logic twice (once for local dev, once for the CI system).

                IMO, CI should be running the same commands humans would run (or could, in production settings). Thus our Jenkins pipelines became a bunch of DSL boilerplate wrapped around make commands. The other nice thing about this is that it prepares you for easier migrations to a new CI system.

          • ownagefool 18 hours ago

            Jenkins has pros and cons.

            It's one of the few CI tools where you can test your pipeline without committing it. You also have controls such as only pulling the pipeline from trunk, again, something that wasn't always available elsewhere.

            However, it can also be a complete footgun if you're not fairly savvy. Pipeline security isn't something every developer groks.

          • fowlie a day ago

            When was the last time you used Jenkins? I don't get the hate, not only from you but from lots of people on the internet. What makes Jenkins stand out IMO is the community and the core maintainers; they are perhaps moving slowly, but they are moving in the right directions. The interface looks really nice now, and they've done a lot of UX improvements lately.

            • larkost 13 hours ago

              I haven't used Jenkins in a few years, so some of this might have changed, but in working with it I saw that Jenkins has a few fundamental flaws that I don't see the maintainers working to change:

              1. There is no central database to coordinate things. Rather it tries to manage serialization of important bits to/from XML for a lot of things, for a lot of concurrent processes. If you ever think you can manage concurrency better than MySQL/Postgres, you should examine your assumptions.

              2. In part because of the dance-of-the-XMLs, when a lot of things are running at the same time Jenkins slows to a crawl, so you are limited in the number of worker nodes. At my last company that used Jenkins, they instituted rules to keep below 100 worker nodes (and usually fewer than that) per Jenkins instance. This led to fleets of Jenkins servers (and even a Jenkins server to build Jenkins servers as a service), and lots of wasted time for worker nodes.

              3. "Everything is a plugin" sounds great, but it winds up with lots of plugins that don't necessarily work with each other, often in subtle ways. In the community this wound up with blessed sets of plugins that most people used, and then you gambled with a few others you felt you needed. Part of this problem is the choice of XMLs-as-database, but it goes farther than that.

              4. The server/client protocol works by shipping serialized Java processes to the client, which runs them and reserializes the process to ship back at the end, rather than using something like RPC. This winds up being very fragile (e.g., communication breaks were a constant problem), makes troubleshooting a pain, and prevents you from doing things like restarting the node in the middle of a job (so you usually have Jenkins work on a Launchpad, and have a separate device-under-test).

              Some of these could be worked on, but there seemed to be no desire in the community to make the large changes that would be required. In fact there seemed to be pride in all of these decisions, as if they were bold ideas that somehow made things better.

            • verdverm a day ago

              Both the old and new interfaces to Jenkins are riddled with bugs, and work seems to be in maintenance mode, across the plugin ecosystem too.

              If you are talking about Jenkins X, that is a different story; it's basically a rewrite for Kubernetes. I haven't talked to anyone actually using it. If you go k8s, you are far more likely to go Argo.

    • piker a day ago

      It seems like a simple CI/CD system in an air-gapped environment might be simpler to implement than to (1) learn and (2) onboard an off-the-shelf solution, when your air-gapped requirement limits your ability to leverage the off-the-shelf ecosystem.

    • mr_o47 a day ago

      This was more like an educational exercise

      • esafak a day ago

        Since you're exercising, you can take it to the next level where you don't specify the next step but the inputs to each task, allowing you to infer the DAG and implement caching...
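
        A minimal sketch of that idea in Python (the task names, the inputs/outputs spec format, and both helper functions are invented for illustration, not from the article): dependencies fall out of the declared inputs and outputs, and hashing a task's spec together with its input hashes gives a cache key.

```python
import hashlib
import json

# Hypothetical pipeline spec: each task declares only what it reads
# ("inputs") and what it writes ("outputs"), never what it depends on.
tasks = {
    "compile": {"inputs": ["src"], "outputs": ["bin"]},
    "test": {"inputs": ["bin"], "outputs": ["report"]},
    "package": {"inputs": ["bin", "report"], "outputs": ["dist"]},
}

def infer_dag(tasks):
    # A task depends on whichever task produces one of its inputs,
    # so the DAG is derived rather than hand-written.
    producer = {out: name for name, t in tasks.items() for out in t["outputs"]}
    return {name: {producer[i] for i in t["inputs"] if i in producer}
            for name, t in tasks.items()}

def cache_key(name, task, input_hashes):
    # Content-address a task by its spec plus the hashes of its inputs;
    # if the key is unchanged, a cached result can be reused.
    payload = json.dumps(
        {"task": name, "spec": task,
         "inputs": [input_hashes[i] for i in sorted(task["inputs"])]},
        sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

dag = infer_dag(tasks)
# dag == {"compile": set(), "test": {"compile"}, "package": {"compile", "test"}}
```

        The derived dict feeds straight into a topological sort, and comparing each task's cache key against the previous run decides whether it can be skipped.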

        • verdverm a day ago

          You can do this with cue/flow, though I haven't turned it into a full CI system. The building blocks are there.

          • mr_o47 a day ago

            Never heard of cue/flow, will definitely check it out.

  • max-privatevoid a day ago

    Why use Docker as a build job execution engine? It seems terribly unsuited for this.

    • ramon156 19 hours ago

      > terribly unsuited

      Care to elaborate? If you already deploy in Docker, then wouldn't this be nice?

        • max-privatevoid 10 hours ago

        Docker is unusable for build tools that use namespaces (of which Docker itself is one), unless you use privileged mode and throw away much more security than you'd need to. Docker images are difficult to reproduce with conventional Docker tools, and using a non-reproducible base image for your build environment seems like a rather bad idea.

    • mr_o47 a day ago

      It's widely used among DevOps engineers, hence I picked Docker, as it makes things easier to understand.

  • halfcat a day ago

    > We need to:

    > Build a dependency graph (which jobs need which other jobs)

    > Execute jobs in topological order (respecting dependencies)

    For what it’s worth, Python has graphlib.TopologicalSorter in the standard library that can do this, including grouping tasks that can be run in parallel:

    https://docs.python.org/3/library/graphlib.html
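
    For instance, a short sketch (the job names are invented; the API is the documented stdlib one):

```python
from graphlib import TopologicalSorter

# Map each job to the set of jobs it depends on.
graph = {
    "build": set(),
    "lint": set(),
    "test": {"build"},
    "deploy": {"test", "lint"},
}

ts = TopologicalSorter(graph)
ts.prepare()
batches = []
while ts.is_active():
    ready = ts.get_ready()         # all jobs whose dependencies are satisfied
    batches.append(sorted(ready))  # each batch could run in parallel
    ts.done(*ready)

# batches == [["build", "lint"], ["test"], ["deploy"]]
```

    It also raises CycleError at prepare() time if the graph has a cycle, which a pipeline runner would otherwise have to detect itself.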

    • skylurk 11 hours ago

      One of the best real "batteries" added in recent years.
