23 comments

  • aabhay 6 hours ago

    The ability to use automatic verification + synthetic data is basically common knowledge among practitioners. But all these organizations have also explored endlessly the different ways to overfit on such data, and the conclusion is the same -- the current model architecture seems to plateau when it comes to multi-step logical reasoning. You either drift too far from your common-knowledge pre-training, or you never come up with the right steps in instances where there's a vast design space.

    Think about it -- why has nobody been able to make an LLM play Go better than AlphaZero while still retaining language capabilities? Such a model would certainly have orders of magnitude more parameters than AlphaZero does.

    • danielmarkbruce 5 hours ago

      AlphaZero is a system including models and search capabilities. This isn't a great example.

      • aabhay 5 hours ago

        AlphaZero is not an LLM though? It's primarily a convolutional network with some additional fully connected layers that guide an MCTS at inference time.
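
        For anyone unfamiliar: the policy head supplies a prior over moves and the value head scores positions, while the search accumulates visit statistics. A toy sketch of the PUCT selection rule at the heart of that search (names and the exploration constant are illustrative, not DeepMind's actual code):

            import math

            # Toy PUCT selection as in AlphaZero-style MCTS. The policy
            # network supplies a prior P(s,a); visit counts N and value
            # sums W accumulate over simulations.
            class Edge:
                def __init__(self, prior):
                    self.prior = prior      # P(s,a) from the policy head
                    self.visits = 0         # N(s,a)
                    self.value_sum = 0.0    # W(s,a)

                def q(self):                # Q(s,a), mean action value
                    return self.value_sum / self.visits if self.visits else 0.0

            def select_action(edges, c_puct=1.5):
                total = sum(e.visits for e in edges.values())
                # argmax over Q(s,a) + c * P(s,a) * sqrt(N(s)) / (1 + N(s,a))
                return max(
                    edges.items(),
                    key=lambda kv: kv[1].q()
                    + c_puct * kv[1].prior * math.sqrt(total + 1) / (1 + kv[1].visits),
                )[0]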

        • danielmarkbruce 5 hours ago

          That too! It's definitely not an LLM. It would be a bad architecture choice (almost certainly...). For math an LLM would be a good choice: arbitrary-length input, sequential data, and it's unclear where/which symbols to pay the most attention to. Math notation is a human-built language just like English.

          Search is a great tool for AI, but you have to have a reasonable search space (like chess, Go, or poker). I'm not close enough to Lean to understand if it can do something like that (I believe not), but math doesn't feel like a thing where you can reasonably "search" next steps, for most problems.

  • _flux 6 hours ago

    This must be one of the best applications for LLMs, since you can always automatically verify the results and reject them otherwise, right?
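
    Something like this rejection loop, where generate_proof and lean_accepts are hypothetical stand-ins for the model call and the proof checker:

        # Hypothetical generate-and-verify loop: sample candidate proofs,
        # keep only what the checker accepts, reject everything else.
        def prove(statement, generate_proof, lean_accepts, budget=64):
            for _ in range(budget):
                candidate = generate_proof(statement)   # LLM sample
                if lean_accepts(statement, candidate):  # machine-checked
                    return candidate                    # verified proof
            return None                                 # every sample rejected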

    • aithrowawaycomm 6 hours ago

      The dependence on huge amounts of synthetic data makes it hard for me to see an application for mathematicians. The theorems being proved here are incredibly boring, more like calculation-heavy homework problems from a professor who is mean to freshmen. (This is not the same thing as what Terence Tao has been looking at, which is more human language -> Lean "translation.") There is a basic problem: new and interesting mathematics does not have the data for ANNs to learn from, whereas the mathematics here is very old and there are thousands of examples online. To be extremely pedantic, the authors didn't show DeepSeek was good at proofs, but rather at undergraduate competition math problems. I am assuming some of that transfers... but I am a huge critic of LLMs and even so I constantly overestimate them!

      However, I could see this being useful for verified software development, which usually involves a huge number of tedious and uninteresting lemmas, and where the sheer size of the proofs becomes exhausting. Having an LLM check all 2^8 configurations of something seems worthwhile.
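
      Concretely, I mean checks of this shape (the property here is a stand-in, not something from the paper):

          from itertools import product

          # Stand-in lemma over 8 boolean flags: (a and b) implies a.
          def lemma(a, b, *rest):
              return not (a and b) or a

          # Exhaustively check all 2^8 = 256 configurations.
          assert all(lemma(*cfg) for cfg in product([False, True], repeat=8))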

    • danielmarkbruce 5 hours ago

      Any situation involving the ability to simulate (weather models, bio models, physics models, and so on) is potentially a good application. It's still hard to make a good model that approximates an extremely complex function.

    • yuhfdr 6 hours ago

      Same with code generation.

      But generating useless code, or proofs, just to discard them is hardly a consequence- and externality-free effort.

      • kkzz99 6 hours ago

        But you can scale LLM generation far more easily than human researchers.

        • yuhfdr 6 hours ago

          Sloppyjoes always assume the best case for LLMs and the worst case for their counterexample of choice.

          10 trillion LLMs powered by a Dyson sphere that still output slop are still worse than one unappreciated postdoc.

          • bluechair 4 hours ago

            That was funny even if I don’t agree with the sentiment.

  • maxrmk an hour ago

    There's a newer version of this model that takes a really cool RL-based approach: https://arxiv.org/pdf/2408.08152

  • Havoc 5 hours ago

    Surprised by the negativity here. A 7B-class model roughly doubles GPT-4's score and everyone goes "meh"?!?

    • lmeyerov 4 hours ago

      I was excited by that headline, but I didn't get a clear and fair sense of the comparison -- e.g., how heavily prompt-engineered the baseline was, or whether it was a false comparison altogether.

      There was not enough detail to determine what was ultimately useful and truly better, if anything, so the lessons learned are near useless. Likewise, I could not tell how useful the fine-tuning was and why, versus basic other tricks that would avoid all this complexity. The work seems good, but I found almost no scientific value in the experimentation and reporting, so there is little here I can comment on the way I normally would. We focus more on the coding and analysis side, logical QA on fuzzier questions, so I am genuinely curious, supportive, and informed, but I left frustrated and wanting my time back.

    • Workaccount2 4 hours ago

      The attitude that HN on the whole has towards AI is what you would expect towards an existential threat to a very high-paying career.

      • danpalmer 29 minutes ago

        I've seen broadly two attitudes: a refusal to accept AI as an existential threat to a high-paying career, and a belief that AI is an existential threat to a high-paying career.

        As with most things in life, the truth is likely somewhere boring in the middle.

  • whyowhy3484939 5 hours ago

    "Suppose you try to construct a coherent, ordered, natural world with no resource other than repeated exposure to things, and the formation of certain associative bonds. Oh, please!"

    This is Prof. Robinson on Kantian philosophy (check out the Oxford podcasts, by the way), and the quote is meant to imply that building a coherent world out of raw sensory data and statistics alone is completely and utterly impractical, if not outright impossible. While I don't think he meant to refer to any kind of AI, in my mind it also aptly describes the general method of deep-learning neural networks: repeated exposure to find correlation.

    How does one find order through association alone? With AI this is not an academic problem anymore; it has become practical. Kant says it is impossible, not just unlikely.

    The Kantian project and the various core issues it tries to address seem readily applicable to AI research, yet I see very little mention of it. Perhaps I am just dumb though. Building a mind capable of taming tremendous sensory flux needs, at the very least, to take note of the (many) fundamental issues he raised, issues I feel are not at all trivial to set aside. I feel we are stuck in Hume's empiricist reasoning and have yet to graduate to Kant and beyond.

    Are we now somehow convinced yet again that causality and reasoning will, in fact, spontaneously emerge out of pure chaos after all? Didn't we settle the impossibility of this a few hundred years ago?

    • viraptor 4 hours ago

      The philosophy angle is interesting of course, but are any of those claims proven true? Why would someone stop trying to achieve something just because Kant's view of the world says it's impossible? Philosophies come and go and get refined over time. Meanwhile you only need to find one edge case where they don't apply the way Kant imagined it. Or find an area where the claim is moot in practice because you achieved all your goals anyway.

      • whyowhy3484939 3 hours ago

        I can appreciate this very practical stance and I naturally urge all my technical colleagues to persist in their struggle for pragmatic victory, but I can't help but voice some concerns that at the very least may lead to an illuminating response capable of at long last disabusing me of my critical notions. You may imagine similar concerns would perhaps arise in you if massive societal resources were to be invested in finding the "next biggest number" because a multitude of people have decided math can't be trusted or some loophole is thought to be found through empirical effort alone.

        Hume's reasoning on this particular issue, and I am taking liberties here, boils down to the idea that anything can cause anything and there is no necessary connection between anything. At least, no connection we would be able to gather with our senses. The causal, necessary connection between one billiard ball causing a second ball to move is not to be found anywhere in the raw sensory data itself. You will not find a "third element", the "causal relationship", anywhere. There is just raw sensory data: one ball coming from the left, two balls beside each other, and then one ball moving away to the right. The idea that one ball caused the other to move is made up. It is fiction. It is, at best, a habit of the mind to find those sorts of correlations and label them as "causal". I dare you to find a flaw in that argument. As convincing as it is, it is pretty damning for any enterprise that wants to call itself scientific or even rational. Nothing we will experience, nothing we will ever think up, no matter how sophisticated, will, on a fundamental level, ever amount to anything more than "more or less probable".

        This famously awoke Kant from his "dogmatic slumber". Luckily for him he found some problems in Hume's argument, and again I am taking liberties, because to entertain even the idea of an external world filled with objects like billiard balls presupposes the existence of tiny, slightly important things like, oh I don't know, time and space themselves. Hume, where do you pull these from? You can look in raw sensory data for evidence of time and space for a long time and, like looking for causality, you'll come up empty-handed, unless, and here is the point, you bring those notions with you and "wear those glasses", so to speak. You massage the data so it will fit the spatio-temporal domain, and now you can start making sense of it, and not a figurative second sooner.

        There are all sorts of parallels here with problems in AI (IMO). Neural networks are asked to infer concepts like time, space and causality by just looking at a lot of data and I can't help but be skeptical of success. The interesting thing to me here is that AI has made these dry and academic philosophical debates practical and useful. Hume talks about billiard balls, but it is easy to convert this into ML lingo by considering, say, some excitation of an array of artificial neurons that is followed by another configuration of excitation. What is their connection? How will you ever unearth "causality" from these completely unconnected events? Nothing about this problem has changed its nature at all in the past few hundred years.

        If "causality" or "necessary connection" is too abstract for your taste, consider that to, say, have any type of memory mechanism at all you have to have some sort of sensory apparatus - say, a neuron or some multitude of them - that is capable of holding, say, event A and some unit of time later, event B and can then connect those two by assigning a probability of some kind between them. Is there any other way? Can you build memory without using a mechanism vaguely of this kind? But notice you are bringing the notion of the temporal to the data instead of the other way around. Nothing about event A or event B can tell you what the nature of time is. You bring it inside your sensor which has a "before" and "after" slot. Kant would say "Aha! There it is. You could not find anything in the data so you had to preprocess it in order to make it intelligible", but he would do it in dense, long-winded, inscrutable German. (He'd probably make fun of you without you knowing it as well.)

        It is through the nature of this, in our case temporal, sensor that any kind of temporal connection can be made, not through the data itself. That is quite something, and I am having a hard time refuting this line of reasoning. If you need more than space, time and causality, you can consider the problem of "substance": how will you keep track of an object that alters its appearance? How do you "know" that some entity is merely changing appearance, by, say, changing clothes or moving through a dark spot and thus being dimly lit all of a sudden, but is "essentially" the same? What's this "essentially"? How much of a sensory impression can change before it is a "different entity"? This problem has the same character as the temporal and causal problems. The data itself will not be illuminating unless you bring "substance glasses" with you.

        Strong AI might be found implementing Kantian category sensors like Unity, Plurality, Causality, Substance, etc. A guy can dream, right?

    • baanist 2 hours ago

      I doubt you are going to change the mind of anyone on this who is already convinced that computers can think and reason and that all that's required is the right sequence of numbers. Moreover, the internet is awash with bots and AIs working on behalf of governments to spread political and economic propaganda. HN has moderation to avoid it, but the moderators are human, so they can't foil all attempts or notice all AIs and bots.

    • ilaksh 3 hours ago

      Science came along and made much of philosophy obsolete, especially the parts that are now directly addressed by science and engineering projects.

      For your comment to make sense it would have to include some context of computer science and artificial intelligence research, a significant amount of which is directly applicable to the topic you mention.

      It's not that you are dumb; you are just being willfully ignorant. I'm not an expert on machine learning, but I do know enough to say confidently that there is a lot of research about building world models from sensory data.

      • whyowhy3484939 3 hours ago

        Oh, I agree on the ignorant part, but you can believe me it is not willed. I would very much like to be educated on this, because nothing I have read - yet - touches these fundamental problems with a ten-foot pole.

        I have elaborated in a sibling comment. I do not dare to ask you to consider it because it is quite loquacious, but on the off chance you want to indulge in some armchair philosophy, you can join me in my pedantry and enlighten a lost soul.

  • tikkun 7 hours ago

    [Submitted on 23 May 2024]