SolidStart - Hacker News

razorbeamz a day ago

LLMs are not the ideal tool for this job, because LLMs cannot do math or count.

vitally3643 18 hours ago

Most human programmers are also fantastically bad at math.

razorbeamz 6 hours ago

True but irrelevant.

This 8-track duplication puzzle is a problem of math.

bijowo1676 7 hours ago

LLMs beat humans at generating code (and fixing broken code) and letting the CPU execute it.

in-silico 13 hours ago

> LLMs cannot do math

This is plainly not true anymore.

razorbeamz 6 hours ago

No, they fundamentally cannot do math. They are next token predictors, not calculators.

ithkuil a day ago

But an LLM can write code that can do math and count. Tool use, more broadly, has proven to be a very powerful way to let LLMs do what they're good at (handling the fuzzy and imprecise nuances of natural language, which includes scooping up a lot of context) and delegate the things they're not good at to external tools, some of which they can write on the spot.

If you think about it, we humans do that all the time too.

I'm crap at 4-digit multiplication in my head, but I have no problem doing it with pencil and paper.

razorbeamz 6 hours ago

> But LLM can write code that can do math and count.

They cannot, however, execute that code. They can feed that code into an external program they've been given access to, but they can't execute it themselves.

camkego a day ago

Thanks to this article, I learned that the '70s Sticky Fingers jeans brand was named after a Rolling Stones album. Don't know how I missed that all this time.

bijowo1676 7 hours ago

there is no secret to how LLMs did it.

I asked an LLM, and what it did was try a DP solution, fail to find one, then launch a brute-force search. Everything in Python. So what the LLM really did was apply SoTA human knowledge, including solving the problem with code.

here are the LLM's "thoughts" on the problem from the article, for example:

________

I need to distribute 11 tracks across four programs on an 8-track cartridge. Every program must play for the exact same length of tape. Shorter programs waste the remaining silence. The goal is to minimize the longest program length.

________

The search found a best maximum program length of 613 seconds.

The dynamic programming solver claimed feasibility for 601, yet the search returned zero solutions.

A discrepancy appeared between the binary search result and the partition check.

The binary search lower bound was set to the longest track instead of the average length. This caused the search to miss the true minimum of 613.

Reconstruction failed — running exhaustive search for the true optimum.

________

The optimal assignment has been confirmed.

Program 3 is the longest at 613 seconds, which sets the tape length. Program 4 is the shortest at 573 seconds, leaving 40 seconds of wasted silence. The other two programs are 607 and 610 seconds. A perfect 601-second maximum is impossible because the tracks cannot be split.
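The figures quoted above can be checked with a few lines of arithmetic. This is a rough sketch, not part of the LLM's run: the durations are copied from the script in this comment, and `example` is a hypothetical assignment worked out by hand that happens to reproduce the quoted program lengths, not necessarily the one the model found.

```python
import math

# Durations in seconds, copied from the script quoted in this comment.
tracks = {1: 174, 2: 238, 3: 195, 4: 194, 5: 220, 6: 203,
          7: 258, 8: 135, 9: 213, 10: 271, 11: 302}

total = sum(tracks.values())
# The longest program can never be shorter than the longest single track,
# nor shorter than the average program length rounded up.
lower_bound = max(max(tracks.values()), math.ceil(total / 4))

# Hypothetical example assignment (found by hand, NOT taken from the model's output).
example = [[1, 2, 3], [4, 6, 9], [5, 7, 8], [10, 11]]
lengths = [sum(tracks[t] for t in prog) for prog in example]
tape = max(lengths)                      # every program plays for this long
silence = [tape - n for n in lengths]    # dead air at the end of each program

print("total:", total, "lower bound:", lower_bound)
print("program lengths:", lengths, "tape length:", tape)
print("silence per program:", silence, "total silence:", sum(silence))
```

The lower bound only says nothing below 601 is possible; it does not say 601 is reachable, which is why a search is still needed.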

this was the code used by the LLM:

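    python3 << 'EOF'
    # track number -> duration in seconds
    tracks = {1:174,2:238,3:195,4:194,5:220,6:203,7:258,8:135,9:213,10:271,11:302}

    # format seconds as m:ss
    def fmt(s): return f"{s//60}:{s%60:02d}"

    items = list(tracks.items())
    all_best = []

    # depth-first search: place track i into each of the 4 programs in turn and
    # record every full assignment whose longest program equals best_max
    def search(i, bins, progs, best_max):
        if i == len(items):
            if max(bins) == best_max:
                all_best.append([sorted(p) for p in progs])
            return
        # prune: some program is already longer than the target
        if max(bins) > best_max:
            return
        tid, dur = items[i]
        tried = set()
        for pi in range(4):
            # programs with the same current length are interchangeable, so try only one
            if bins[pi] in tried: continue
            tried.add(bins[pi])
            bins[pi] += dur
            progs[pi].append(tid)
            search(i+1, bins, progs, best_max)
            # undo the placement before trying the next program
            progs[pi].pop(); bins[pi] -= dur

    # 613 is the target found by the earlier search
    search(0, [0,0,0,0], [[],[],[],[]], 613)
    # dedupe: treat assignments that differ only in program order as the same
    seen = set()
    unique = []
    for sol in all_best:
        key = tuple(sorted(tuple(p) for p in sol))
        if key not in seen:
            seen.add(key)
            unique.append(sol)

    # print the count and the first 5 solutions with each program's length
    print(f"All {len(unique)} distinct optimal solutions at 10:13:")
    for sol in unique[:5]:
        sums = [sum(tracks[t] for t in p) for p in sol]
        print(f"  {sol} -> {[fmt(s) for s in sums]}")
    EOF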
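The script above takes 613 as a given target and enumerates assignments that hit it. For anyone who wants to check that figure independently, here is a minimal sketch of a variant that searches for the smallest achievable longest-program length itself. The helper name `min_longest_program` is hypothetical, and placing the longest tracks first is a pruning heuristic I swapped in; neither comes from the quoted run.

```python
# Durations in seconds, copied from the script quoted in this comment.
tracks = {1: 174, 2: 238, 3: 195, 4: 194, 5: 220, 6: 203,
          7: 258, 8: 135, 9: 213, 10: 271, 11: 302}


def min_longest_program(durations, programs):
    """Exhaustive branch-and-bound: return (best_max, assignment)."""
    # Longest tracks first, so bad partial assignments are pruned early.
    items = sorted(durations.items(), key=lambda kv: -kv[1])
    # Start from the worst case (everything on one program).
    best = {"max": sum(durations.values()), "progs": None}
    bins = [0] * programs
    progs = [[] for _ in range(programs)]

    def search(i):
        if i == len(items):
            # Pruning below guarantees this leaf is a strict improvement.
            best["max"] = max(bins)
            best["progs"] = [sorted(p) for p in progs]
            return
        tid, dur = items[i]
        tried = set()
        for pi in range(programs):
            # Programs with equal current length are interchangeable.
            if bins[pi] in tried:
                continue
            tried.add(bins[pi])
            # Skip placements that cannot beat the best length found so far.
            if bins[pi] + dur >= best["max"]:
                continue
            bins[pi] += dur
            progs[pi].append(tid)
            search(i + 1)
            progs[pi].pop()
            bins[pi] -= dur

    search(0)
    return best["max"], best["progs"]


best_max, best_progs = min_longest_program(tracks, 4)
print(best_max, best_progs)
```

With only 11 tracks and 4 programs the search space is small enough that this finishes quickly, so there is no need for the binary search or DP that the model tried first.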

mordae a day ago

[dead]

LLMs Will Replace 8-Track Duplication Engineers