33 comments

  • mrkeen a day ago ago

    I think ChatGPT is just bad at whatever domain you have expertise in.

    Talk to it about something you don't know about, and you'll think it's really good technology ;)

    • Yawrehto a day ago ago

      "Everything you read in the newspapers is absolutely true except for the rare story of which you happen to have firsthand knowledge." - Erwin Knoll, allegedly[1]

      [1] It's widely attributed to him, including on Wikipedia and in an aside in an old NYTimes article [https://www.nytimes.com/1982/02/27/us/required-reading-smith...], but I couldn't track down the original source to verify it. It's possible it's falsely attributed, but on the other hand he wasn't super prominent -- editor of a progressive magazine, The Progressive, yes, but no Mark Twain -- and so there'd be little incentive to say it was his when you could call it Twain's.

    • anonzzzies 21 hours ago ago

      It is pretty good at coding. I have 40+ years of hobby and professional coding experience in dozens of languages, from small projects to millions of lines architected and (co-)written by me; there is software I wrote in the early 90s still running in production. I am a CTO now, but I still write code for my job about half the time and in my free time as well -- that is, I did until recently; now I have the tooling (built it myself, as I didn't like what was available at the time) to not code at all, and yet I produce more code than ever. We use PRs and reviews, and my colleagues are quite surprised by the speed and the quality. So yeah, not bad for sure, even in my field of expertise.

      We are now rolling out my tooling in the company so everyone can forget about the boring stuff and just focus on the business logic. There is resistance as this is going to cost jobs; we don't have infinite work to do and this is much (much) faster.

      • bitwizeshift 13 hours ago ago

        This hasn’t been my experience in the slightest.

        Been programming since I was in elementary school, and the current Copilot, OpenAI, and even Gemini models generate code at a very, very junior level. They might solve a practical problem, but they can't write a decent abstraction to save their lives unless you repeatedly prompt them to. They also massively struggle to retain coherence once there are more moving parts; if you have several things being mutated, they often just lose track and write code that crashes/panics/generates UB/etc.

        When you are lucky and get something that vaguely works, the test cases it writes are of negative value. They are either useless cases that don't cover the edge cases, entirely incorrect and failing, or, worst of all, tests that look correct and pass but are semantically wrong. LLMs have been absolutely hilariously bad at this: they generate passing tests for the code as written, not for the semantics the code was supposed to have. Writing the tests by hand would catch this quickly, but a junior dev using these tools can easily miss it.
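
        To make that concrete, a hypothetical illustration (not real model output): a function with an off-by-one bug, and a generated test written against the code as-is, so it encodes the bug and passes.

            def last_n(items, n):
                """Intended: return the last n items of the list."""
                return items[-n:] if n else items   # bug: n == 0 returns everything

            # Plausible generated test: asserts the buggy behavior, so it passes
            # while the semantics are wrong.
            def test_last_n():
                assert last_n([1, 2, 3], 2) == [2, 3]
                assert last_n([1, 2, 3], 0) == [1, 2, 3]   # should be []

            test_last_n()   # passes; the bug ships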

        Then there is Rust; most models don't do Rust well. On isolated snippets they are kind of okay, but beyond that they frequently generate borrowing issues that fail to compile.

        • anonzzzies 11 hours ago ago

          My guess -- and I do realize this is dangerous to say -- is that the tooling around the prompts and around the results is key to getting the best results. Bare prompts without guards are not how you want to do it; you want something like the loop sketched below.
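
          (A minimal sketch of what I mean by "guards"; generate() is a placeholder standing in for whatever model call your tooling makes:)

              import subprocess

              def generate(prompt: str) -> str:
                  # Placeholder: call your model of choice here.
                  raise NotImplementedError

              prompt = "Write src/parser.py so that tests/test_parser.py passes."
              for attempt in range(5):
                  code = generate(prompt)
                  with open("src/parser.py", "w") as f:
                      f.write(code)
                  run = subprocess.run(["pytest", "-q", "tests/test_parser.py"],
                                       capture_output=True, text=True)
                  if run.returncode == 0:
                      break  # tests pass; now a human reviews the PR
                  # Guard: feed the failure back instead of trusting the output.
                  prompt += "\n\nThe tests failed with:\n" + run.stdout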

      • b20000 19 hours ago ago

        the last 1% of a project takes as much time as the first 99%

        • anonzzzies 16 hours ago ago

          And this is related to my comment how?

          • skydhash 4 hours ago ago

            The code is the means, not the end. And you take care with it because other people will work with it to handle changing requirements. One aspect of seniority is knowing the tools to solve a problem; the other is making the solution practical to maintain and adapt. And that takes as much time as solving the problem.

    • pwg a day ago ago

      So it shows an electronic form of Gell-Mann Amnesia?

      https://www.epsilontheory.com/gell-mann-amnesia/

      • dagmx a day ago ago

        Ah thanks for the link. I’ve been trying to recall the name of the phenomenon for ages and it’s always been just on the tip of my tongue.

    • more_corn a day ago ago

      It’s pretty good at code for my job. Maybe my job is just easy. But I have a couple decades of experience at it and it’s able to generate reasonably good code for obscure parts of what I do.

      I saw a spectacularly bad example of OpenAI trying to reason about electronics yesterday. Something like "how do I use the GPIO pins of my Jetson", and it failed so hard it was funny. That one seems simple to me: identify that you need to look up the pinout, find the image, label the pins… I suspect there’s something wrong with this generation of GPTs when it comes to reasoning about electronics.

  • jononor a day ago ago

    As an electronics engineer, I have tried it for such tasks, without success. I specified requirements (only the key/rarer ones, typically 1 or 2) and asked it to find components. It failed miserably, typically just insisting that some related but much more common component satisfied the requirement, getting more and more apologetic as I tried to guide/coax it along. I know that there are a few components available that satisfy the requirement, as well as several hundred that do not. And I know that the information is in digitally readable PDF files (as opposed to scans).

    This specific failure might be a kind of averaging problem, where common answers around the general theme are preferred over more specific (and correct) ones. LLMs can also fail completely at trivial concepts such as negation, or distinguishing between "Y above X" and "Y below X".

    • mystified5016 a day ago ago

      Yeah, it seems super bad once you get even just below the surface of general theory. It's not too bad at showing you how, e.g., a low-pass filter is calculated, but it doesn't do well at actually running the calculations.
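
      (The calculation itself is trivial to check by hand; for a first-order RC low-pass, for example:)

          from math import pi

          R = 10_000   # ohms
          C = 100e-9   # farads (100 nF)

          # Cutoff frequency of a first-order RC low-pass: f_c = 1 / (2*pi*R*C)
          f_c = 1 / (2 * pi * R * C)
          print(f"{f_c:.1f} Hz")   # ~159.2 Hz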

      It does a pretty good job at getting you to the "draw the rest of the owl" stage

  • t0mas88 a day ago ago

    This applies to many fields. It will come up with plausible-looking but wrong answers and keep apologising if you correct it or point out the mistakes.

    I've seen it with statistics as well, asking it to implement some things in code. You'll get working but mathematically wrong code.
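
    (A hypothetical but representative example of that failure mode: code that runs fine and looks right, but silently uses the population variance when the sample variance was asked for.)

        import statistics

        data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

        # Plausible LLM output: runs fine, but divides by n instead of n - 1,
        # so it returns the population variance, not the sample variance.
        def sample_variance(xs):
            m = sum(xs) / len(xs)
            return sum((x - m) ** 2 for x in xs) / len(xs)   # should be len(xs) - 1

        print(sample_variance(data))       # 4.0      (wrong for a sample)
        print(statistics.variance(data))   # 4.571... (correct, uses n - 1)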

  • torginus a day ago ago

    ChatGPT can be hilariously bad at less common things. For example, I asked it about the uses of polyurethane foam in my native language, and it suggested it would be great for decorating cakes.

    • meow_catrix a day ago ago

      It is, if you’re in the visual advertising industry.

    • whimsicalism a day ago ago

      what is your native language?

  • tuanmount2 13 hours ago ago

    Since the data used to train ChatGPT is public internet data, it will probably be bad at anything involving uncommon, niche knowledge.

  • mikewarot a day ago ago

    The thing about datasheets is you have to watch out for your own assumptions when reading them. If it doesn't explicitly say it'll do X... it won't, no matter how common X is in other parts of the same type.

    It might help, but you have to be the backstop when it comes to the final call. Measuring the false positive/false negative rate could be tedious, but it's important to have a good estimate of it in order to use the tool wisely.
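
    (A minimal sketch of what that measurement could look like, with made-up spot-check data: record whether the model's claim about each part held up against the datasheet, then estimate the error rates.)

        # Each pair: (model said the part satisfies the requirement,
        #             the datasheet actually confirms it)
        checks = [
            (True, True), (True, False), (False, False),
            (True, True), (False, True), (True, False),
        ]

        false_pos = sum(said and not truth for said, truth in checks)
        false_neg = sum(truth and not said for said, truth in checks)
        actual_neg = sum(not truth for _, truth in checks)
        actual_pos = sum(truth for _, truth in checks)

        print(f"false positive rate: {false_pos / actual_neg:.2f}")   # 0.67
        print(f"false negative rate: {false_neg / actual_pos:.2f}")   # 0.33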

    • mystified5016 a day ago ago

      > If it doesn't explicitly say it'll do X... it won't

      Unfortunately, that is only true until it isn't. Most datasheets are not as complete as you would like, and many are just incorrect. It's up to you to take the incomplete information and make a judgement call as an engineer. Undocumented features are not uncommon.

    • pera a day ago ago

      Yeah, I see what you mean: if I want to find, for instance, "a sub-milliampere X" but the PDF only says uA, then it would be impossible for an LLM to suggest that part.

      • pera a day ago ago

        > respond with true or false: is 1234uA submilliampere?

        > True. 1234 microamperes (uA) is equal to 1.234 milliamperes (mA), which is sub-milliampere.
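
        (For the record, the arithmetic it flubbed is a one-liner to check:)

            ua = 1234            # microamperes
            ma = ua / 1000       # = 1.234 milliamperes
            print(ma < 1.0)      # False: 1234 uA is NOT sub-milliampere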

        • pulvinar 19 hours ago ago

          Interesting. I found that GPT-4 gets this wrong quite consistently, and o1-preview gets it right. It takes a moment to think it through and so isn't tripped up by relying on heuristics.

  • savorypiano a day ago ago

    How much would you pay for this feature?

    Is typing your requirements that much easier than going through traditional search filters at Digikey?

  • cdaringe a day ago ago

    I asked it all sorts of specifics about how to use my ESP32, and it does surprisingly well.

    • Infinity315 a day ago ago

      This is unsurprising. I know very little about electronics, yet even I know the ESP32 is pretty common. And if I know it's common, it's for sure in ChatGPT's training data.

  • fzzzy a day ago ago

    Do in-context learning: gather a big sheet of specs for the components you want to use, put it at the top of your chat, and then ask questions. Roughly like the sketch below.
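
    (A minimal sketch of the idea with the OpenAI Python client; the model name, file name, and question are placeholders:)

        from openai import OpenAI

        client = OpenAI()   # reads OPENAI_API_KEY from the environment

        # Placeholder: paste the relevant datasheet tables/specs here.
        specs = open("component_specs.txt").read()

        resp = client.chat.completions.create(
            model="gpt-4o",   # assumption: any current chat model works
            messages=[
                {"role": "system",
                 "content": "Answer only from the specs below. "
                            "If the specs don't say, say so.\n\n" + specs},
                {"role": "user",
                 "content": "Which of these parts draws under 1 mA in sleep mode?"},
            ],
        )
        print(resp.choices[0].message.content)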

  • mystified5016 a day ago ago

    Probably because it isn't really built for that. It's a word soup generator, and not a technical database.

    For this kind of task, you probably want a model that has specifically been trained on every product datasheet ever, and not ten million reddit threads and forum posts about how a 555 or 328p can solve any problem.

    I doubt that ChatGPT has been fed every datasheet for every part made in the last decade or two. Even if it had, that's likely far outweighed by the amount of noise coming from people talking about the most common parts.

    But fundamentally I'm not sure that LLMs are great for this type of work. No two datasheets are the same and I've never seen one that wasn't missing some kind of information. What you very much do not want is an LLM hallucinating a value that does not actually exist in the datasheet. Or have it conflate two parts and mix up their values. These models just don't seem to be up to the task of returning real information from abstract queries. They're just meant to generate probabilistic text sequences.
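
    (One cheap guard against that, sketched here with made-up strings: refuse any number the model "quotes" that doesn't appear verbatim in the extracted datasheet text. Crude, but it catches the worst hallucinations.)

        import re

        # Hypothetical model answer and extracted datasheet text.
        llm_answer = "Sleep current is 0.8uA; supply range is 1.8V to 5.5V."
        pdf_text = "Supply voltage: 1.8V to 5.5V. Sleep current: 2.3uA typ."

        # Every number-plus-unit the model quoted...
        quoted = re.findall(r"\d+(?:\.\d+)?\s*[munpk]?[AVWF]", llm_answer)
        # ...that never appears verbatim in the datasheet text.
        fabricated = [v for v in quoted if v not in pdf_text]

        print(fabricated)   # ['0.8uA'] -- not in the datasheet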

    • journal 21 hours ago ago

      Are humans not word generators with arms and legs?

    • rasz a day ago ago

      I would expect someone like Supplyframe to cook a custom model for that purpose. I think I remember them mentioning something about using ML to clean up old badly scanned/faxed datasheets a long time ago.

  • MrCoffee7 a day ago ago

    Have you tried ChatGPT apps specialized for electronics, such as https://chatgpt.com/g/g-6PTe1fb3X-electronics-and-circuit-an... ?

    • gtirloni a day ago ago

      Here's the "app":

      You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture. You are tasked with answering questions and providing assistance in the domain of Electronics and Circuit Analysis. You will apply structured and concise explanations while incorporating relevant academic and technical references from uploaded materials. Your goal is to ensure clarity, accuracy, and technical correctness in your responses, especially when dealing with advanced concepts. Always follow the structured format requested by the user.

    • pera a day ago ago

      No, just straight ChatGPT and Claude, but I will take a look, thanks.