Text2CAD: Generating Sequential CAD Designs from Text Prompts

(sadilkhan.github.io)

105 points | by RafelMri 13 hours ago

65 comments

  • mwill 3 hours ago

    I see multiple comments arguing that using a CAD package is only easier and faster if you already know how to use a CAD package, and this is a 'better' UI for people who don't have those skills... but in that scenario, are you not then just trading the fixed upfront time investment to learn the basics of CAD for ongoing inefficiency and difficulty every time you want to model something?

    For a user of a UI like this, there comes a point where their time would have been better spent learning a CAD package.

    Another layer is that if you are modelling something that has to be machined or built in real life, you have to keep an eye on how it will physically exist throughout the entire process: the stock it will be machined from or the materials it will be built with. Thinking in terms of CAD workflows helps with this greatly in my experience. The operations shown in the demo are not only easier to perform in a CAD package than to describe in English to an LLM, but also the easiest part of the job (except maybe if you are designing strictly for 3D printing).

    • dingnuts 2 hours ago

      > For a user of a UI like this, there comes a point where their time would have been better spent learning a CAD package.

      Some users will think of too few designs in a lifetime to make learning any specific software worthwhile. For these users, the LLM's inefficiencies are worth the trade-off.

      • throwgfgfd25 an hour ago

        But the thing is, those users are unlikely to have thought of anything novel simply because they are not designers: if the tool is going to be successful then what they want is likely to be in the training set and easily googleable.

        This whole idea seems contingent on imagining a situation where a non-CAD user has an idea for a truly novel physical object, has extensive geometry skills, and can describe that object in some magical level of detail that doesn't involve any terms of art from the CAD domain.

        It's not a very likely scenario. And the energy put into tools to support this scenario would be better spent improving searchability of the data that is going to go into the training set, and simple tools to allow objects in the training set to be combined (such as those offered by TinkerCad or Microsoft's 3D builder).

        It's also prone to the risk that the LLM gets something wrong: makes a part that is prone to failure, or will actually destroy a CNC, or be unprintable by a 3D printer, etc.

        • godelski 36 minutes ago

          > It's also prone to the risk that the LLM gets something wrong: makes a part that is prone to failure, or will actually destroy a CNC, or be unprintable by a 3D printer, etc.

          You can destroy 3D printers too... especially if you get the bright idea of generating G-code... and one might reasonably get this idea, since so many factors matter, like the settings (but it can easily result in things like ASA poisoning...)

          It's real "too clever by a quarter" thinking.

      • iancmceachern 18 minutes ago

        But this is just so dismissive of the whole profession(s).

        It's like saying I only want 2 sculptures in my garden, so we should make a thing that sculpts like Michelangelo because I don't want to learn to sculpt for only two statues.

        This is why we have civilization, trade. We can each specialize in and master one thing, then trade our surplus for others'.

        Why wouldn't you hire someone who can do it in an hour?

      • groby_b 23 minutes ago

        These users can download plenty of designs for the objects they think of - you don't often create truly unique things, most stuff already exists.

        More importantly, creating a 3D model without understanding mechanical properties is a meaningless exercise. Go ahead, ask anybody who has built things about their first attempts - and this approach means they will always be first attempts.

  • sgnelson 6 minutes ago

    Because a lot of people are saying "CAD is only useful if you know how to use it." One word:

    Tinkercad. https://www.tinkercad.com/

    I teach children how to use it. It takes them about 15 minutes to pick it up. Are you going to design a car with it? No. But if you look at all the Text2Cad programs so far, they won't either. If you need to design something for 3d printing, Tinkercad, in terms of ease of use/simplicity, is hard to beat. And it's free to use.

    I've played with other Text2Cad projects, and I've yet to get anything out of them other than the most simple shapes. Frankly, what's the point if it can't make an object that's useful or meets my requirements? It takes way too much time writing out paragraphs to get even the most basic of brackets built. I remember that the Zoo app very much likes typical CAD language, which means you're going to need to be able to speak "CAD" anyway when trying to create a successful part (think: constraints in typical CAD).

    What these LLM programs really need to focus on (imo) is writing OpenSCAD (and similar "programmable CAD") code, which is going to at least give the user a better starting place, as well as the ability to edit their parts. I think one of the biggest constraints here is that, unlike most popular programming languages, there aren't millions of lines of code and documentation for the LLM to train on.
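
    To make the "programmable CAD" point concrete, here is a rough sketch of what an editable starting place might look like: a small Python function that writes OpenSCAD source for a flat plate with two mounting holes. The function name `bracket_scad` and all of its parameters are made up for illustration, and nothing here comes from Text2CAD or any of the tools mentioned in the thread.

```python
def bracket_scad(width=40.0, depth=20.0, thickness=3.0, hole_d=4.2):
    """Return OpenSCAD source for a flat plate with two mounting holes.

    Hypothetical helper for illustration; dimensions are in millimetres.
    """
    inset = depth / 2  # each hole centre sits this far in from a short edge
    holes = "\n".join(
        # Start the cylinder 1 mm below the plate and make it 2 mm taller
        # than the plate so the cut passes cleanly through both faces.
        f"    translate([{x}, {depth / 2}, -1]) "
        f"cylinder(h = {thickness + 2}, d = {hole_d}, $fn = 32);"
        for x in (inset, width - inset)
    )
    return (
        "difference() {\n"
        f"    cube([{width}, {depth}, {thickness}]);\n"
        f"{holes}\n"
        "}\n"
    )


if __name__ == "__main__":
    # Changing one argument regenerates the whole part.
    print(bracket_scad(width=60.0))
```

    The output is plain text, so a user can paste it into OpenSCAD and then change a single number by hand, which is the editability a mesh-only result would not give them.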

    I think the research is neat, but for now, it seems like a solution in search of a problem.

  • echoangle 8 hours ago

    Anyone who has ever done CAD knows that a picture is worth a thousand words. Describing a 3D object you want in words is much more effort to get right than drawing a simple sketch. Wouldn’t it be better to have image input for an application like this?

    • godelski 3 hours ago

      That's exactly what I was thinking. I can't describe what I want because I don't know it yet. It comes to life as I design.

      I was an engineer in a former life but still do a fair amount of printing. But when I design parts there aren't even ways I __could__ know what I want beforehand. As I build I realize that I made wrong assumptions, but also not enough of them, that there are better ways to do things, that I can solve other problems, that I didn't think about how things would interact together, that I could modify things to be better for the manufacturing process (this is such a big one in 3D printing and so many online files get this VERY wrong. But it is a hard skill to learn), and so many other things. In part this is because I've had more time to think, but there's more to it when you see the thing "coming alive". This is much the same way I code, though I guess that's not common.

      I'm not sure if they'll address these problems, but I think anyone working in this space should make sure they also spend a lot of time in CAD themselves. It isn't clear to me that any of the authors do (looking at their websites. The main author mentions interest in AI-CAD but only has work from this year. Get this man a 3D printer). It's quite possible that they do, but they look like they've been computer scientists their whole career and that is probably not enough to understand the intricacies of the problems they're trying to solve. There's a classic problem in CS where people get that you can learn a lot of things quickly, but miss that getting the nuance and mastery takes time, and that you should talk to experts. The first part is useful because it gives you the language to talk to experts, not because it makes you a replacement for one.

      • throwgfgfd25 2 hours ago

        > In part this is because I've had more time to think, but there's more to it, when you see the thing "coming alive".

        Yep. I have two printed prototypes of different approaches to a mechanism on my desk that only exist because of months of staring at CAD in the evenings, learning new things, doing research.

        They are not radical (they may be slightly novel in places; I have never seen 3D printed mechanisms like them).

        I don't know if I could describe them in words at all, but if I could, it would only be because I worked through them in CAD in the first place.

        For anything other than a trivial object I just can't see how you'd even come up with the words without having worked through the design -- what, on paper in 2D in pencil? After doing the maths? That's CAD in reverse.

        • godelski 2 hours ago

          Yeah words and images (especially sketches) are fuzzy. I think we tend to think they are more precise than they are because we are so good at communicating, but often this is only after having a relationship with the other person. It is easy to ignore the frustration and frequency of miscommunication and blame it on other things, like your manager being dumb. When in fact, both might be true.

          There are definitely things I think I could describe in words, but without a doubt they could be communicated faster by sketching. There are more complicated things where I think it would just be faster to CAD up the damn thing. It's like math (or code). The language(s) are precise and annoying because of that precision, but they're still the easiest way to do the things we want to do, which is why we use them. Natural language's flexibility is great for abstraction and big ideas but not so great when it comes to precision. Things get very wordy very fast when you get into the details. And I'm sure everyone knows the value of arguing with your friend or coworker over those tiny things, even if it doesn't seem important. If you don't, you probably need to work on teams more often or make more friends lol

          • throwgfgfd25 2 hours ago

            > And I'm sure everyone knows the value of arguing with your friend or coworker over those tiny things, even if it doesn't seem important.

            Not to mention that in this case, this disagreement over the meaning of ultra-fine detail will be happening with an LLM, which does not really understand the words.

            • godelski 2 hours ago

              I'm an ML researcher and I seriously do not understand how people are avoiding the stupid loops. Like where I tell the LLM all the conditions, what works and what doesn't work, and then it tells me to do the thing that I just said doesn't work (while at the beginning of the response it even acknowledges this!). So then I say "x doesn't work, here's the output" and then it says "sorry for the confusion, you're right. Instead let's <insert bunch of useless words> then do x" where x is the same thing...

              I can't be the only one, right? I feel like I'm being gaslit lol

    • jsheard 8 hours ago

      Yeah but when all you have is an LLM hammer, everything looks like a text2text nail.

      • CSMastermind 5 hours ago

        LLMs are fundamentally one-dimensional, which works fine when you're generating next tokens for text because that's a 1D problem.

        I do wonder how much progress we could make on a problem like this with a 3D transformer architecture.

        • abotsis 5 hours ago

          I’m not sure I follow this. Isn’t an LLM’s dimensionality measured by how many parameters the model supports, i.e. tens of billions in some cases? If I understand it correctly, then, the model is already evaluating things in lots of dimensions and reducing them down to 1, as you say, in the case of text and 2 dimensions in image generation, so 3 should be pretty straightforward.

          • btbuildem 4 hours ago

            I think they're referring to the dimensionality of the input / output space, not the intermediate internal representation.

        • btbuildem 4 hours ago

          The neat thing is, you can rasterize 1D space into 2D, 3D and so on. Trick as old as analog TV signal processing.
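
          A minimal sketch of that raster-scan idea, assuming a fixed grid size agreed in advance: a flat 1D stream of values is laid out row by row and slice by slice into a 3D voxel grid. The helper names `unflatten`, `flatten` and `to_voxels` are invented for this example; it says nothing about how an LLM would actually emit geometry.

```python
def unflatten(index, shape):
    """Map a 1D raster-scan index to (z, y, x) for a (depth, height, width) grid."""
    depth, height, width = shape
    if not 0 <= index < depth * height * width:
        raise IndexError("index outside the grid")
    z, rest = divmod(index, height * width)  # which slice
    y, x = divmod(rest, width)               # which row, which column
    return z, y, x


def flatten(coord, shape):
    """Inverse of unflatten: (z, y, x) back to the 1D raster-scan index."""
    z, y, x = coord
    _, height, width = shape
    return (z * height + y) * width + x


def to_voxels(stream, shape):
    """Lay a flat sequence of values out as nested lists indexed [z][y][x]."""
    depth, height, width = shape
    if len(stream) != depth * height * width:
        raise ValueError("stream length does not match the grid size")
    grid = [[[None] * width for _ in range(height)] for _ in range(depth)]
    for index, value in enumerate(stream):
        z, y, x = unflatten(index, shape)
        grid[z][y][x] = value
    return grid
```

          Note that the result is a voxel grid rather than a parametric model, so it is a different kind of output from the feature history a mainstream CAD package works with.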

          • throwgfgfd25 3 hours ago

            If I am understanding you right... I don't think this gets you anywhere useful.

            Even if you could do what you're suggesting with an LLM (I have my doubts) this result would be a mesh or 3D pixel grid or something, yes?

            This is terrible for interoperability and it's the opposite of what mainstream CAD packages do.

    • DonnyV 6 hours ago

      Yeah, a way to turn a flat sketch into a 3D model would be a better way to do this.

  • jareklupinski 5 hours ago

    98% of the total time it takes me to design something usually involves deleting entire designs because they get to a point where something is unfeasible, has invalid geometry, or goes against all manufacturers' guidelines (will this shape work with a draft angle? etc.)

    this is usually before i start adding intricacies such as shells, fillets, and other features, which do take a lot of effort, but making those by hand is more of the 'art' side of the process anyway

    anything that gets me through the first 98% is welcome :)

    • JofArnold 2 hours ago

      > usually involves deleting entire designs

      Yes, for me that's usually around the point I've got a huge and complex assembly with all the motion wired up. Right-click -> Duplicate -> Rename "MyProject V2".

      I would also save a huge amount of time this way.

    • throwgfgfd25 4 hours ago

      > anything that gets me through the first 98% is welcome :)

      Only if the result of the LLM is a good, well-architected parametric CAD model you can adjust, right?

      • jareklupinski 3 hours ago

        that seems like a little much to expect from an LLM; the average CAD file in my experience has not been well-architected :)

        as long as the output is something my manufacturer can understand (downloadable mesh: STL/STEP/etc (they dont take links)), the tool did its job for me

        i would probably start the final model from scratch no matter what the output was, so i can match my chosen manufacturer's tolerances/design rules/optimizations, and to give breathing room for my quirks/workflow (i like to design subtractively, some people design additively)

  • q3k 11 hours ago

    Describing what I want to design seems like more effort than actually just sketching things out in a decent CAD package.

    • serf 10 hours ago

      only if you're familiar with the CAD package.

      presumably a big benefit here is that all it really takes is English and geometry knowledge.

      • Palomides 6 hours ago

        thinking in terms of composing 3d objects and their positions is 90% of doing CAD already, if you can do that, you can reproduce any of the objects in OP with 15 minutes of learning the tool

        seriously, I think people overestimate how hard basic CAD work is

        • throwgfgfd25 4 hours ago

          > seriously, I think people overestimate how hard basic CAD work is

          I think this is one of those things that programmers overestimate worse than non-programmers, too. To the point that they reject CAD UIs too early and get themselves stuck in often rather limiting code-CAD environments, because they never get to learn how parametric GUI CAD works.

          This belief that only code can be intuitively parametric is obviously not something that non-programmers suffer from.

          I think code-CAD has many benefits (though the idea of the various LLM-to-OpenSCAD tools out there makes me shudder; this is the worst possible combination of obscure knowledge-bases).

          But just a trivial amount of time learning even FreeCAD (the least-intuitive CAD package, pretty unambiguously) unlocks so much potential.

        • iancmceachern 5 hours ago

          Yeah, if you architect your part right, which takes experience.

          For folks that don't believe us, check out "speed modeling".

          • throwgfgfd25 4 hours ago

            It does take experience but mostly it takes a little analysis.

            I'm still really quite green at CAD; I've come to it late in life and I only make relatively simple things, perhaps; only simple mechanisms.

            But when I look at people getting stuck and asking for help in CAD groups it so often comes down to the knock-on effects of very early mistakes, like squandering the benefits of the base planes by choosing the wrong initial orientation, muddling through with primitives when an extrusion of a sketch would do the job, or making a series of complex circular pockets when a single revolve could have done it better.

            Basic familiarity with a few principles and their expression in CAD, and a little study of existing objects gets you a long way.

            It interests me that programmers are willing to learn the expressive nuances of individual languages or libraries or methodologies, but as soon as it comes to GUI CAD they dismiss the whole thing as too hard or too obscure. The excitement around LLM text-to-CAD seems emblematic of this; as if the wrong conclusions about GUI interfaces have been drawn from bad experiences of dev GUIs.

            • iancmceachern 3 hours ago

              Totally

              Asking me or folks that do what I do to create a hardware product that way would be like asking a sculptor to sit down and write NC code for a marble router to cut out the sculpture they see in their mind's eye.

              When we're using these tools, Solidworks in my case, we're not just clicking with the mouse. We're typing complex commands, creating and using variables, creating scripts and macros, and we often use gaming mice with many buttons mapped to complex hotkeys, etc.

              The graphics interface is just part of it: the part where it shows us 3D geometry so we can put it into our mind's eye, and a way to tell the computer what geometry we are talking about before we execute work using what I described above.

              Most people don't get it, if you do you do. You do.

      • zppln 6 hours ago

        So then you're left with a CAD file for a CAD package you're not familiar with..?

  • btbuildem 4 hours ago

    I see a lot of comments here to the effect "but it's easier to just do it in CAD in the first place" -- those are experts speaking. Imagine being a newcomer to CAD: the learning curve is quite steep. Compared to mastering the complex UI and workflows of a CAD program, a text-based approach seems much easier.

    I can see this being incorporated into existing software as an alternate workflow path.

    What caught my curiosity here is the "sequential" qualifier. One massive weakness of all the AI content generation schemes is the lack of editability -- you get what you get, and attempts to refine the initial results are middling at best. This seems like it allows the user to build a more complex scene from multiple prompts -- likely meaning you can go back and edit some of the prompts to tweak the building blocks, and edit the overall scene. Interesting!
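
    One way to picture that kind of editability is a construction history that can be replayed after changing a single step. The sketch below assumes, purely for illustration, that "sequential" means a list of sketch-and-extrude steps; the `ExtrudeRect` class and `total_volume` function are hypothetical and are not taken from the Text2CAD project.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExtrudeRect:
    """One step: sketch a width x height rectangle, then extrude it by depth."""
    width: float
    height: float
    depth: float
    cut: bool = False  # True removes material instead of adding it

    def signed_volume(self):
        volume = self.width * self.height * self.depth
        return -volume if self.cut else volume


def total_volume(steps):
    """Replay the sequence. Assumes every cut lies fully inside existing
    material and that added bodies do not overlap each other."""
    return sum(step.signed_volume() for step in steps)


history = [
    ExtrudeRect(40, 20, 5),            # base plate
    ExtrudeRect(10, 10, 5, cut=True),  # square pocket through the plate
]

# Edit only the first step, then replay; the pocket step is reused unchanged.
edited = [replace(history[0], width=60), history[1]]
```

    Because each step keeps its own parameters, widening the plate does not require regenerating the pocket, which is the property a one-shot generated mesh lacks.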

    • throwgfgfd25 3 hours ago

      > I see a lot of comments here to the effect "but it's easier to just do it in CAD in the first place" -- those are experts speaking.

      This is what I think, and yet I am the longest time away from being an expert.

      It's easier because CAD tools unlock CAD thinking and empower your brain to actually do design, as well as helping you see problems you didn't anticipate.

      This whole area -- LLM to CAD -- is one of the most misguided applications for generative AI (beyond "generative design" as it was understood in the pre-LLM/pre-GAN era, where it was usually used for FEM/topology optimisation)

      There are already enormous libraries of freely available basic CAD models for real-world objects; any beginner would be much better off simply learning how to merge them. And any tool aimed at beginners would be better off assisting that process (TinkerCad does, for example)

      And if a beginner has a truly novel object to make, an LLM is not going to have the training set data to make it. Nor is the beginner likely to have the CAD knowledge (words, expressions) to describe it.

      For this to be of use to a beginner you do have to imagine quite a niche kind of beginner: one who is expert in descriptive language and advanced geometry. Those people would be better off learning some sort of CAD environment; indeed they are the niche that is least likely to be driven insane by the limitations of OpenSCAD.

  • Aurornis 5 hours ago

    Cool in theory, but I can’t imagine describing anything other than the most basic CAD designs verbally. Text is not a good medium for designs beyond the most basic shapes and modifications.

    • CSMastermind 5 hours ago

      I work with software that displays 3D models of real products.

      If I could feed in the manufacturer's spec sheet along with maybe some pictures of the items and marketing copy, then get a 3D model out, it would save me a ton of money.

  • l5870uoo9y 9 hours ago

    That's something artificial intelligence is really good at: fitting data into a formalized format. I'm working on something similar[0]; here the goal is not CAD design but charts. This is also something that has only become possible with later versions of AI, such as GPT-4.

    [0]: https://www.datavisualizer.ai/

  • loughnane 8 hours ago

    This would be most useful if it could generate objects that could then easily be tweaked, ideally parametrically.

    I think with a long preamble in the prompt (you are an expert at designing injection molded parts, and so on) you could get something pretty useful.

  • qazxcvbnmlp 5 hours ago

    So much negativity here. It’s interesting to see new technical possibilities. Congratulations to the team that made this.

    • throwgfgfd25 3 hours ago

      It's not negativity so much as informed dissent.

  • avodonosov 9 hours ago

    What format do they use for the resulting CAD designs? And in general, what are the best open CAD formats and applications today?

  • maCDzP 4 hours ago

    Let’s see where this goes in 1-2 years. There is a ton of money in the CAD industry.

  • isoprophlex 8 hours ago

    I wonder how effective a solution leveraging LLMs would be in producing NC code... maybe a numerically precise method is still a hard requirement there.

    • iancmceachern 5 hours ago

      We don't need an LLM for that, and it would be very risky to trust one not to crash a $250k machine tool with a $5k part in it, which could kill the operator if something goes wrong.

      • therouwboat 2 hours ago

        There is an AI assistant for Mastercam and other CAM software. It promises to make 80% finished programs, but the parts look really simple, like something you do in the first week of CNC training.

  • timonoko 12 hours ago

    Gemini clearly understands OpenSCAD primitives. "Make this object hollow", etc.

    But "Make me a teapot" seems to aim for a teapot-looking object, which cannot hold tea. But this is not a failure per se.

    • timonoko 9 hours ago

      Making a functional teapot takes about 10 queries. You just have to make the pot and spout hollow and move the handle outside.

      A totally functional teapot production solution already. All you need to do is automate the display function and STL production, and any granny can print her own teapot without understanding anything about computers and 3D printers.

      • jsheard 9 hours ago

        Rest in peace granny, taken from us by a cocktail of plastics, bacteria, mold and tea. Maybe stating the obvious but 3D printed parts usually aren't food safe unless you post-process them with an appropriate non-porous coating.

        • andai 8 hours ago

          What's your favorite non-porous teapot coating?

          • timonoko 8 hours ago

            No coating needed. All you need is to re-flow the ABS. Feels like pottery after this simple operation: https://youtu.be/iLaGJwCCz-E?si=a1gccp8EI9sGzt-e

            • echoangle 7 hours ago

              Do you really want almost-boiling-water in ABS? I wouldn't do that.

              • timonoko 6 hours ago

                Reflow happens at 200°C, outgassing any 3D-related harmful stuff.

                Equal to injection-molded ABS cutlery.

                • echoangle 6 hours ago

                  Maybe I'm too careful but I wouldn't use plastic cutlery on hot food either. Also, the glass transition temperature of ABS is only 105 °C, so it's probably not too strong when filled with 100 °C water.

                  • timonoko 6 hours ago

                    I have already made all kinds of tests. Only restriction was that you cannot make open flame cooking pot, because it starts combusting too easily.

                    • echoangle 6 hours ago

                      How did you test the water quality after storing it in an ABS container? How would you notice if something from the ABS dissolved in the water?

      • waciki 9 hours ago

        Those grannies are already making teapots, it's called pottery.

        • throwgfgfd25 3 hours ago

          More than a thousand generations of grannies, at that. Additive manufacturing from 30,000 years ago!

      • jaakl 6 hours ago

        Does it respond to http queries?

  • kkfx 10 hours ago

    The idea is very nice, but as with all the LLM-backed stuff I see, the results are poor, meaning it takes more effort to get good results out of the LLM than to do the work with tools that aren't LLM-wrapped or LLM-backed. It's the same for Copilot with code, and so on.

    It's still an interesting application that might have a better future if it concentrated on simulations (Salomé MECA, CodeASTER etc.), since for these systems it's more tedious to design the simulation by hand than to just sketch a 3D part with a modern parametric CAD.

  • magic_man 6 hours ago

    Can you import this into SolidWorks or AutoCAD? I think it would be super useful to have the basic model and then tweak it.

    • throwgfgfd25 3 hours ago

      This is one of those classic LLM scenarios that doesn't make sense.

      If you can tweak the model in a CAD package, you can quite possibly make it in CAD in the first place more quickly than you can manage several rounds of descriptions with an LLM.

      It's like the whole "I'd like to get an LLM to write a song and then I'd adjust it" thing. No musician needs this. If you can truly fine-tune a song, writing one is easy. And if you aren't a musician, you won't get good results fine-tuning a song.

  • sanchezxg an hour ago

    Alright

  • westurner 6 hours ago

    From https://news.ycombinator.com/item?id=40131766 re: LLM minifigs and parametric CAD parts libraries:

    > Is there a blenderGPT-like tool trained on build123d Python models?

    > ai-game-development tools lists a few CAD LLM apps like blenderGPT and blender-GPT: https://github.com/Yuan-ManX/ai-game-development-tools#3d-mo...