Building Robust Helm Charts

(willmunn.xyz)

69 points | by will_munn 3 days ago

13 comments

  • amw-zero a day ago

    There is no such thing as a robust helm chart. Null was not the billion dollar mistake: templating languages were.

  • LunicLynx a day ago

    I like Helm charts but find it very difficult to work confidently on them, mainly because of YAML and probably because I'm not using the right tools.

    So to some degree I wonder: what tools are other people using to get a better experience with this?

    • misnome a day ago

      We use the rendered manifest pattern. The chart gets rendered into a single YAML file, which gets checked into its own branch and PR. That way, any changes can be easily inspected before merging, and you can work with confidence that e.g. changing a setting or upgrading isn't going to change ALL of the objects. It's also extremely easy to (trustingly) roll back to previous states.

      The only downside is that you can't really prune excess objects with this method. We're pushed to use Argo for deployment, which I don't really gel with, but I trust it to apply the YAML, and at the very least it highlights when objects need to be removed.
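
      A rough sketch of the render step, assuming GitHub Actions (action versions, chart path, and branch names are illustrative; opening the PR from the rendered branch is left out):

        # .github/workflows/render.yaml (illustrative)
        name: render-manifests
        on:
          push:
            branches: [main]
        jobs:
          render:
            runs-on: ubuntu-latest
            permissions:
              contents: write
            steps:
              - uses: actions/checkout@v4
              - uses: azure/setup-helm@v4
              # Render the whole chart into one reviewable YAML file...
              - run: helm template my-app ./chart -f values.yaml > rendered.yaml
              # ...and push it to a dedicated branch so changes show up as a plain diff.
              - run: |
                  git config user.name "render-bot"
                  git config user.email "render-bot@example.com"
                  git checkout -B rendered
                  git add rendered.yaml
                  git commit -m "render manifests" || true
                  git push -f origin rendered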

    • bigstrat2003 a day ago

      Honestly my take on Helm charts is to keep them as simple as possible. All the complicated stuff you see in public charts people publish? Yeah, stay far, far away from that. Our Helm charts at my job are 95% plain YAML files, with an occasional variable insertion to handle cases where you need different hostnames (etc.) based on the environment being deployed to. They are a pleasure to work with because they are so simple.
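
      For a sense of scale, that means templates that are almost entirely literal YAML, something like this sketch (the chart name and the hostname value key are made up):

        # templates/ingress.yaml
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        metadata:
          name: my-app
        spec:
          rules:
            # the one per-environment value that actually varies
            - host: {{ .Values.hostname }}
              http:
                paths:
                  - path: /
                    pathType: Prefix
                    backend:
                      service:
                        name: my-app
                        port:
                          number: 80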

      Even some of the examples in TFA (like the optional persistent storage one) are IMO way more complex than what you should use Helm for. At that point you're better off using a programming language to generate the YAML based on some kind of state object you create. It's way too error prone to do that stuff with YAML templating imo.
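
      For context, the kind of optional-resource templating being criticized looks roughly like this sketch (not the article's exact example; the persistence value names are illustrative):

        # templates/pvc.yaml
        {{- if .Values.persistence.enabled }}
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: {{ .Release.Name }}-data
        spec:
          accessModes:
            - {{ .Values.persistence.accessMode | default "ReadWriteOnce" }}
          resources:
            requests:
              storage: {{ .Values.persistence.size | default "8Gi" }}
        {{- end }}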

    • rjzzleep a day ago

      KCL fixes a lot of issues, but it doesn't seem to be gaining any traction. And it's not the unusable mess that ksonnet or jsonnet are.

      https://www.kcl-lang.io/

    • thecopy a day ago

      I also get an uneasy feeling about the "values.yaml" side of things: it often feels underspecified and like a black box.

    • gouggoug a day ago

      I wish timoni[1] would take off.

      It’s based on Cue and doesn’t rely on templating.

      [1] timoni.sh

      • tuananh a day ago

        A few years ago, everyone thought Cue was gonna replace YAML.

        Even Dagger (a big Cue believer) deprecated their Cue SDK back in 2023.

  • madduci 20 hours ago

    I use Terraform with the Kubernetes Provider, which is also actively developed by HashiCorp itself.

    Templating / injection of values has been much better: it skips the Helm templating madness and relies on a set of tools that allow linting, security scans, doc generation, and unit tests, and it establishes clear dependencies within Terraform thanks to the graph model.

    Helm Charts are a nice idea, but mistakes can happen really easily

    • zbentley 17 hours ago

      This is the way. Remove Helm and Argo from your IaC entirely and manage as much as possible via Terraform with the hashicorp/kubernetes provider. It's simpler (fewer tools), and you also get:

      - Clarity re: destruction of obsoleted/destroyed resources (rather than kubectl's "won't do it", Helm's "it depends on ten settings", and Argo's "I'll try my best but YMMV").

      - Control over apply ordering if the k8s/tf default doesn't do it for you.

      - Resource control as granular (or not, if you just want to write big multi-resource "kubernetes_manifest" blocks) as you want. You can move around, case-by-case, on the spectrum between "templated raw YAML copied from somewhere else" and "individual resources with (somewhat) strong typing/schema-awareness in code". As a bonus, if you do it fully granularly, there's no indirection via YAML happening at all, just per-resource Kubernetes API calls.

      - A coherent story for moving ownership/grouping of k8s resources between different logical groups of stuff via terraform import/moved blocks.

      - Vastly more accurate proposed-changes diff than Argo, Helm, or even Kubernetes itself can provide: Terraform's core execution model is plan-as-canonical-changelist, while k8s/helm/argo added noop/proposed diffs as ancillary features of variable quality.

      - The ability to mix in management of non-k8s resources (AWS/GCP/Azure/etc. stuff that k8s resources talk to), which is often simpler than deploying complex Kubernetes controllers that manage those same external resources. Controllers are great if you need lots of complex or self-serve management of external resources, but if you are only ever managing e.g. load balancers in one way in a few places, a big controller might be overkill versus doing it by hand.

      The only big drawback of this approach is with CRDs. There's no way to have Terraform that deploys CRDs in the same plan as Terraform that refers to resources of those CRDs' types--not even if you conditionally "count = 0" deactivate management of the CRD resources based on variables or whatnot. To cope with this, you either have to get very good at targeted plan/applies (yuck), or plan/apply multiple Terraform modules in order (which is simple and a good practice, but results in more code and can be unwieldy at first).

      All the other drawbacks I've heard to doing it this way are pretty silly, and boil down to:

      1. "but everyone uses Argo/Helm!" Okay, lots of people smoke cigarettes too--and if you're deploying charts complex enough that you're having to get into the weeds with 'em, you've already gotten enough familiarity to easily port them into kubernetes-provider HCL anyway.

      2. "I don't like Terraform/HCL". You do you, I guess, but 90% of the reasons people hate it boil down to either "you're using Terraform like it's 2016 and a lot of massive improvements were released circa 2018-2020", or "the Terraform model forces you to be rigorous and explicit rather than approximate and terse you're mad about it".

      Relatedly, I was not impressed with the hashicorp/helm provider and routinely push for folks to go back to the regular Kubernetes provider instead. Architecturally the Helm provider is bad (let's indirect the already-too-complex templating constructs through another templating language! What could go wrong?), and its implementation is also not great--getting diagnostics/log output is harder than it should be, whether old resources are destroyed/replaced/updated-in-place is left up to Helm itself in complex ways that break with the usual Terraform assumptions, and getting meaningful diffs is tricky (the "manifest" provider experiment exists but is experimental for a reason and often causes terraform crashes--not just erroneous diff output).

      • madduci 6 hours ago

        And you can have policy as code, which is a big bonus.

        +1 for multi-module apply, for CRDs and infrastructure components that must be there before they can be used by other resources

  • tuananh a day ago

    We built an Argo CD preview that renders a diff for MRs.

    Reviewing an MR that upgrades a Helm chart version is a lot less scary.
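
    The setup described is Argo CD-based; as a rough stand-in, here is a minimal GitLab CI sketch that renders the chart at the MR and target-branch revisions with plain helm template and diffs them (image tag, chart path, and names are illustrative):

      # .gitlab-ci.yml (sketch)
      helm-diff:
        image: alpine/helm:3.14.0
        rules:
          - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
        before_script:
          - apk add --no-cache git
        script:
          # Render the chart as proposed in the MR...
          - helm template my-app ./chart -f values.yaml > /tmp/new.yaml
          # ...and as it exists on the target branch...
          - git fetch origin "$CI_MERGE_REQUEST_TARGET_BRANCH_NAME"
          - git worktree add --detach /tmp/old FETCH_HEAD
          - helm template my-app /tmp/old/chart -f /tmp/old/values.yaml > /tmp/old.yaml
          # ...then show reviewers exactly which rendered objects would change.
          - diff -u /tmp/old.yaml /tmp/new.yaml || true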

  • 3 days ago
    [deleted]