H2JVM – A Haskell Library for Writing JVM Bytecode

(discourse.haskell.org)

48 points | by rowbin 3 days ago

16 comments

  • J-Kuhn a day ago

    At a quick glance, it seems to be missing some hook to resolve class hierarchies - which is needed when merging stacks with different types. Consider

        var foo = expr ? new Foo() : new Bar();
    
    For such an expression, some superclass or superinterface for both Foo and Bar has to be chosen.
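
    A minimal Java sketch of what such a hook has to compute, assuming both types are already loaded as classes. `CommonSuper`, `Base`, `Foo`, and `Bar` are hypothetical names and not part of H2JVM; ASM exposes a comparable hook as `ClassWriter.getCommonSuperClass`. The sketch walks up from the first type until it reaches a class the second is assignable to, and falls back to `Object` when nothing closer is found (for example with unrelated interfaces).

    ```java
    class CommonSuper {
        // Hypothetical hierarchy standing in for Foo and Bar from the example.
        static class Base {}
        static class Foo extends Base {}
        static class Bar extends Base {}

        // Nearest superclass of `a` that `b` can be assigned to.
        // Only meant for reference types; interfaces fall back to Object
        // unless `b` already implements `a`.
        static Class<?> commonSuperclass(Class<?> a, Class<?> b) {
            Class<?> candidate = a;
            while (candidate != null && !candidate.isAssignableFrom(b)) {
                candidate = candidate.getSuperclass();
            }
            return candidate != null ? candidate : Object.class;
        }

        public static void main(String[] args) {
            // Type of: expr ? new Foo() : new Bar()
            System.out.println(commonSuperclass(Foo.class, Bar.class).getSimpleName()); // prints "Base"
        }
    }
    ```

    A bytecode library cannot answer this question on its own when the classes are not on its classpath, which is why it is usually left to a user-supplied callback.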

    What I am curious about is whether the Java Classfile API[1] provides a good model that could work in Haskell as well. I always had the impression that it was heavily influenced by functional programming; for example, it uses "lifting transforms"[2].

    PS: Good work on the label resolving part - this model is also used by the Java Classfile API and before it, ASM.

    [1]: https://docs.oracle.com/en/java/javase/26/docs/api/java.base...
    [2]: https://docs.oracle.com/en/java/javase/26/docs/api/java.base...
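
    The label model mentioned above can be sketched as follows. This is a simplified illustration, not H2JVM's or ASM's actual API: `LabelDemo`, `Label`, and `Fixup` are hypothetical names, and each operand occupies one list slot instead of the two-byte encoding a real class file uses. A jump to a label that has not been placed yet emits a placeholder, and a final pass patches in the offset relative to the jump instruction.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    class LabelDemo {
        // A jump target whose position is unknown until place() is called.
        static final class Label {
            int position = -1;
        }

        // A pending jump: index of the jump opcode and the label it targets.
        static final class Fixup {
            final int jumpIndex;
            final Label target;

            Fixup(int jumpIndex, Label target) {
                this.jumpIndex = jumpIndex;
                this.target = target;
            }
        }

        private final List<Integer> code = new ArrayList<>();
        private final List<Fixup> fixups = new ArrayList<>();

        void emit(int opcode) {
            code.add(opcode);
        }

        // Emit a jump with a placeholder operand to be patched later.
        void jump(int opcode, Label target) {
            fixups.add(new Fixup(code.size(), target));
            code.add(opcode);
            code.add(0); // placeholder for the branch offset
        }

        // Bind the label to the position of the next emitted instruction.
        void place(Label label) {
            label.position = code.size();
        }

        // Patch every placeholder with an offset relative to its jump opcode.
        List<Integer> resolve() {
            for (Fixup f : fixups) {
                if (f.target.position < 0) {
                    throw new IllegalStateException("label was never placed");
                }
                code.set(f.jumpIndex + 1, f.target.position - f.jumpIndex);
            }
            return code;
        }

        public static void main(String[] args) {
            LabelDemo asm = new LabelDemo();
            Label end = new Label();
            asm.emit(1);        // index 0
            asm.jump(167, end); // index 1 (167 is the JVM goto opcode), operand at index 2
            asm.emit(2);        // index 3, skipped by the jump
            asm.place(end);     // end = index 4
            asm.emit(3);        // index 4
            System.out.println(asm.resolve()); // prints [1, 167, 3, 2, 3]
        }
    }
    ```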

  • amarant a day ago

    This... confuses me profoundly. I've worked with Java my entire career, and it's mostly a pretty decent language, imo. I think my biggest gripe with Java is the JVM. It's limiting and doesn't really provide any value (the proposed value is portability, but we always run apps in Docker containers anyway, so what is it really doing for us?).

    I (kinda) get why someone might want to write Haskell rather than Java, but I'm just not sure why you would want to run Haskell on the JVM?

    • cogman10 a day ago

      Wow, I think I have almost the exact opposite opinion here.

      Java is an ok language, but what really makes it shine is the JVM. It's one of the fastest VMs out there and is one of the most customizable ones as well. For example, pretty much all other languages with a GC have just a GC and that's it. Java allows you to pick and choose your GC based on the workload.
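
      As a small illustration of the collector choice: HotSpot selects a GC at startup with flags such as `-XX:+UseSerialGC`, `-XX:+UseParallelGC`, `-XX:+UseG1GC`, or `-XX:+UseZGC`. The sketch below (the class name `ActiveGc` is hypothetical) uses the standard `java.lang.management` API to list which collectors the running JVM actually uses; the names printed depend on the JVM and the flags passed.

      ```java
      import java.lang.management.GarbageCollectorMXBean;
      import java.lang.management.ManagementFactory;
      import java.util.ArrayList;
      import java.util.List;

      class ActiveGc {
          // Names of the garbage collectors registered in the running JVM.
          static List<String> collectorNames() {
              List<String> names = new ArrayList<>();
              for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                  names.add(gc.getName());
              }
              return names;
          }

          public static void main(String[] args) {
              // Run with e.g. `java -XX:+UseParallelGC ActiveGc` and compare the output.
              System.out.println(collectorNames());
          }
      }
      ```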

      It is one of the least limiting VMs out there because any knob you might want to tune, can be tuned. It's a huge value add.

      I think the only part of the JVM that's not great is that objects are bulky and there are no value classes, which ultimately means every struct-like object you want can carry a pretty hefty price in terms of memory. But otherwise, it's basically best in class for everything.

      • amarant 11 hours ago

        I do agree that as far as VMs go, the JVM is not bad. I guess my recent fascination with Rust is kinda shining through here, but I've begun questioning the need for any sort of VM: even the best ones add overhead, and since in practice we always deploy to a very specific environment anyway (when was the last time you deployed a non-dockerized JVM app?), why not just build native executables and skip the overhead?

        I dunno, I feel like I'm deploying a VM in a VM in a VM, and at some point they cost more than they taste.

        • cogman10 9 hours ago

          > even the best ones add overhead

          Nope. Quite the opposite. You can precompile Java to a static runtime using the likes of Graal or even Android. That actually makes these applications slower, not faster.

          Part of this comes down to the Java language design. For Java, there is a lot of dynamic dispatch involved. In Rust terms, it's as if almost every parameter was a `Box<dyn Foo>`. In Java, it's pretty natural to have a method like `void foo(List<Bar> baz){}` which can be called with any concrete list type. In fact, compiled down, this actually looks like `void foo(List baz) {}` in the bytecode.
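
          The erasure described above can be observed with plain reflection. In this sketch, `ErasureDemo` and `Bar` are hypothetical names; the method descriptor that the JVM dispatches on only mentions the raw `List` type, so the lookup by `List.class` succeeds and the reported parameter type carries no `Bar`.

          ```java
          import java.lang.reflect.Method;
          import java.util.List;

          class ErasureDemo {
              static class Bar {}

              // Declared with a generic parameter in source.
              static void foo(List<Bar> baz) {}

              // Returns the parameter type as recorded in the method descriptor.
              static String erasedParameterType() {
                  try {
                      Method m = ErasureDemo.class.getDeclaredMethod("foo", List.class);
                      return m.getParameterTypes()[0].getName();
                  } catch (NoSuchMethodException e) {
                      throw new IllegalStateException(e);
                  }
              }

              public static void main(String[] args) {
                  System.out.println(erasedParameterType()); // prints "java.util.List"
              }
          }
          ```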

          The JVM is able to capture runtime information and realize "Oh, `foo` is always called with an `ArrayList`. And that `ArrayList` always emits a `Bar` element." That allows it to optimize and directly call the `ArrayList` methods rather than having to always do a "Ok, determine the type, look up the method table, call the method", which is exactly what Rust has to do with a `Box<dyn Foo>` signature. To do something similar with Rust you have to do a more complex PGO compilation.

          Now, don't get me wrong, Rust is smart. That's why they've designed the language such that `foo: Box<dyn Foo>` just isn't as ergonomic as `foo: &Foo`. The language pushes you to use the concrete structs when possible and to avoid doing dynamic dispatch. It supports it, but it requires a lot more ritual.

          • amarant 8 hours ago

            Weeeelll, I'm definitely getting into dubious nitpick territory, but I'm not sure we can draw the conclusion that the JVM has negative overhead just because Java happens to be slower without it. That's mostly because Java wasn't designed to run without the JVM and, while you can do it, it kinda sucks. I think you have shown that Java is rather out of its element without the JVM, which is fair enough.

            Comparing performance with a performance-focused systems language such as Rust wouldn't be fair either, and the borrow checker is definitely cheating compared to GC, so this is not good grounds to conclude that the JVM has a positive overhead.

            I think probably Go offers the fairer comparison, but my experience with it is so limited I can't really be sure. But unless I'm mistaken it's also GC'd, like Java, and runs without a VM, which is the difference we're trying to isolate.

            Ergonomics-wise, my understanding is they're comparable too, but like I said, I have very little experience with Go.

            The performance analyses I've seen comparing the two seem to indicate a 10-15% performance increase in Go compared to Java, depending mostly on which report you read.

            It's definitely not clear-cut, and I probably wouldn't put much weight on such an argument if it didn't align so well with my intuition. I'm not free of bias. Maybe I just want something different after nearly 15 years of Java? I mean I definitely do, but maybe it's affecting my judgment more than I think?

            But that's performance: the other limiting factor is that you need to install Java to run Java apps (unless you go against nature and build sluggish native Java, but we covered that). I find I fairly often just don't want to have that requirement. Native executables have their charm, I find.

            • cogman10 7 hours ago

              Would it surprise you to learn that Rust does the same thing that Java does?

              The main difference is that Rust drives the VM all the way to the point of generating machine code while Java generates the machine code at runtime.

              Rust translates the syntax to a high-level intermediate representation (HIR), then a mid-level one (MIR), then to LLVM IR, and finally it lets LLVM translate the IR to machine code. The way LLVM uses "VM" is exactly the same way Java is using "VM".

              JavaScript is similar. In fact, V8, the engine that powers Node and Chrome, was initially written by Java HotSpot developers.

              The current performance initiatives that Python and Ruby are taking are doing exactly what the JVM and JavaScript do. In fact, the PyPy JIT and LuaJIT are learning from and implementing what the JVM does. It's a proven mechanism for getting more performance and better optimizations.

              Even GCC does the same thing under the hood.

              It really is clear cut, more than you might expect.

    • tikhonj a day ago

      This project isn't for running Haskell on the JVM, it's for writing a compiler that produces JVM bytecode. You'd use it if you wanted to implement your own JVM language in Haskell or maybe if you wanted to have some kind of JVM-backed domain-specific language embedded in Haskell.

    • pron a day ago

      The reason most "serious" or important software is written for the JVM these days is that it gives you an unparalleled combination of performance, productivity, and observability. There's almost no competition if these things are what you need. The question isn't so much why pick Java among the alternatives; it's that there are hardly any alternatives.

      • AlotOfReading a day ago

        You have to put on some very narrow lenses to argue that "most" serious software is written for the JVM. Operating systems, compilers, browsers, databases, planes, cars, the transaction processing for at least one major payment processor, major cloud services, etc. predominantly use other languages.

        I don't think there's any single language that most serious software is written in.

        • pron a day ago

          C, C++, and C# are obviously also major players in "serious software", but you can estimate the volume of software through the number of people involved (e.g. https://www.devjobsscanner.com/blog/top-8-most-demanded-prog...). If Java doesn't have an outright majority, it has an obvious plurality. And again, there aren't many alternatives.

    • vips7L a day ago

      An interesting take.

      You get lots of things for free when targeting JVM bytecode: GCs, JITs, and interop with one of the largest and most battle-tested ecosystems.

    • rienbdj a day ago

      Pretty good performance for low effort is a big win.

    • a day ago
      [deleted]
  • internet_points a day ago

    See also https://github.com/Frege/frege and https://github.com/mchav/froid (though both are kind of dead, I guess?)

    • J-Kuhn a day ago

      There is also Idris 2 for JVM (https://github.com/mmhelloworld/idris-jvm)

      Frege targets Java source code, which is then compiled by javac - the downside of that approach is that you cannot preserve the line numbers for debug information.