Your CI pipeline isn't ready for AI

(blog.morgante.net)

22 points | by morgante 14 hours ago

6 comments

  • pocketarc 20 minutes ago

    > More troublingly, performance has not improved in CI at the same pace as on developer machines—it’s usually a lot slower to build our app in CI than it is to do it locally on my M1 laptop.

    While some of the other comments around optimizing CI pipelines are solid, this whole thing seems to be due to having CI running on servers that are -worse- than a laptop. Isn't that wild? Servers weaker than laptops. Not even desktops or workstations. LAPTOPS.

    And they are, because they're just cloud instances. And most cloud instances... are not fast.

    Consider that you could run your CI runner on an M1 laptop if you so choose. Setting up a self-hosted GH Actions runner (for example) is quite straightforward. It doesn't even need to be an internet-facing machine; it can be a spare machine sitting at home or in the office. $600 will get you a Mac mini with an M2 CPU and a super-fast SSD; everything will build faster than it ever could on any generic CI build server.

  • skeptrune 9 hours ago

    It's incredibly frustrating that LLMs still aren't useful for automating CI and IaC configs despite all the hype.

    • firesteelrain 5 hours ago

      Not sure what you mean. ChatGPT does a very good job of generating GitLab YAML and Terraform HCL.

  • mike_hearn 3 hours ago

    There are some quick wins available for improving CI times and reliability. I use some of these myself and they do ease the pain. I have a company that develops a tool that is itself a build system and that runs complex, intensive builds as part of its testing process, so CI times are something I keep an eye on. These tips are mostly useful for JVM/.NET projects, I think. We use self-managed TeamCity, which makes this stuff easy.

    1. Preserve checkout/build directories between builds. In other words, don't do clean builds. Let your build system do incremental builds and use its dependency caches as it would when running locally. This means not running builds in Docker containers, for instance (unless you take steps to keep them running).

    2. Make sure your servers run behind caching HTTP proxies, so that if you do need to trigger a clean build, downloads are properly cached and optimized.

    3. Run builds on Macs! Yes, they are now much faster than other machines, so if you can afford them and your codebase is portable enough, throw them into the mix and let high-priority changes run on them instead of on slower Linux VMs. Apple silicon machines are a bit too new to be reaching obsolescence, but if you do have employees who give up "old" ARM machines, turn those into CI workers.

    4. Ensure all build machines have fast SSDs.

    5. Use dedicated machines for build workers, i.e. not cloud VMs, which are often over-subscribed. Or use a cloud, like Oracle's [1], that's good value for money and doesn't over-subscribe VMs. Dedicated machines in the big clouds can be expensive, but you can get cheaper, smaller machines elsewhere. Or just buy hardware and wire it up yourself in an office. It's not important for build machines to be HA. You always have the option of mixing machines and adding cloud VMs too if your load suddenly increases.

    6. Use a build system that understands build graphs properly (i.e. not Maven) and modularize the codebase well. Most build systems can't eliminate redundant unit testing within a module, but can do so between modules, so finer grained modules + incremental builds can reduce the number of tests that are run for a given change.

    7. Be judicious about which tests are run on every change. Do you really need to run a full-blown end-to-end test on every commit? Probably not.
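
    As a rough illustration of point 6, here is a minimal sketch of how a graph-aware build system might decide which modules need their tests re-run after a change. The module names and the `DEPENDS_ON` table are hypothetical, and the `affected_modules` helper is only an assumption about the general approach, not how any particular build tool implements it.

```python
from collections import defaultdict, deque

# Hypothetical module graph: each module maps to the modules it depends on.
DEPENDS_ON = {
    "core": [],
    "storage": ["core"],
    "api": ["core", "storage"],
    "cli": ["api"],
    "docs": [],
}


def affected_modules(changed, depends_on):
    """Return the changed modules plus everything that transitively depends on them."""
    # Invert the graph so we can walk from a module to its direct dependents.
    dependents = defaultdict(set)
    for module, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(module)

    # Breadth-first walk outward from the changed modules.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        module = queue.popleft()
        for dependent in dependents[module]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


if __name__ == "__main__":
    # A change to "storage" requires testing storage, api, and cli,
    # while "core" and "docs" can be skipped.
    print(sorted(affected_modules({"storage"}, DEPENDS_ON)))
```

    With finer-grained modules, the affected set for a typical change gets smaller, which is where the savings come from. Tests within a single affected module are still all re-run in this sketch.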

    Test times are definitely an area where we need some more fundamental R&D though. Integration testing is the highest value testing but it's also the type of test build systems struggle the most to optimize out, as figuring out what might have been broken by a change is too hard.

    [1] Disclosure: I do some work for Oracle Labs, but I think this statement is true regardless.

  • fire_lake 8 hours ago

    Weird article. Bazel does exactly what the author wants. And it seems unrelated to AI.

  • jdlshore 8 hours ago

    Not really about AI, but instead a complaint about the difficulty of optimizing build pipelines.