Code World Model

(github.com)

14 points | by tosh a day ago

5 comments

  • 2001zhaozhao a day ago

    It would be interesting to see whether they have an updated model trained with this technique. According to the paper it scored well on release (65.8% on SWE-bench), but by now it no longer scores competitively against the latest generation of open coding models (e.g. Devstral Small 2).

    I wonder whether other labs have implemented something similar to this approach. Perhaps code world modeling isn't actually necessary (relative to other, simpler techniques) to achieve the kind of deep environment understanding that the paper touts as important for improving agentic coding performance.

    • general_reveal a day ago

      Serious question: how do we know these benchmark suites are any good?

  • chid a day ago

    Given the high bar to entry (a GPU with 160GB of VRAM), is there anything practical one can use this for?

    • omneity a day ago

      At 32B parameters, the model can run in under 20GB of VRAM with Q4 quantization (minimal loss of quality), or in about 80GB unquantized at full fidelity. The quoted 160GB is for the authors' recommended evaluation settings.
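      Back-of-the-envelope, counting weights only (the bytes-per-weight figures here are approximations, and KV cache plus runtime overhead add more on top, which is roughly where the 80GB unquantized number comes from):

        # Rough weight-memory estimate for a 32B-parameter model.
        # Bytes-per-weight values are approximate (Q4_K_M is ~4.8 bits/weight).
        params = 32e9
        bytes_per_weight = {"bf16": 2.0, "q8_0": 1.0, "q4_k_m": 0.6}
        for name, b in bytes_per_weight.items():
            print(f"{name}: ~{params * b / 2**30:.0f} GiB")
        # bf16: ~60 GiB, q8_0: ~30 GiB, q4_k_m: ~18 GiB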

      There are a few pre-quantized options[0], or you can quantize it yourself with llama.cpp[1]. You can run the resulting GGUF with llama.cpp's `llama-cli` or `llama-server`, with LM Studio, or with Ollama; a minimal Python sketch follows the links below.

      0: https://huggingface.co/models?search=cwm%20q4%20gguf

      1: https://huggingface.co/spaces/ggml-org/gguf-my-repo
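
      As a minimal sketch, here's how you might load a Q4 GGUF from Python with the llama-cpp-python bindings (the model filename is hypothetical; substitute whatever you download from [0] or produce via [1]):

        # pip install llama-cpp-python
        from llama_cpp import Llama

        llm = Llama(
            model_path="./cwm-32b-q4_k_m.gguf",  # hypothetical local path
            n_gpu_layers=-1,  # offload all layers to the GPU if they fit
            n_ctx=8192,       # context window; larger values cost more VRAM
        )

        out = llm("Write a Python function that reverses a linked list.",
                  max_tokens=256)
        print(out["choices"][0]["text"])

      If the full set of layers doesn't fit, dialing n_gpu_layers down splits the model between GPU and CPU at the cost of speed.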

      • chid 13 hours ago

        I see, still a fair bit more VRAM than I have access to. Thanks for sharing that information.