SolidStart - Hacker News

tdesilva 9 hours ago

Mentioning neural ODEs doesn't make sense here, as this is unrelated. Basically any implementation of a transformer uses residuals, but you're not really training a neural ODE here.
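To make the distinction concrete, here is a minimal sketch in plain Python. The function `f` is a hypothetical stand-in for a transformer sub-layer, not anything taken from NanoEuler, and all names are assumptions for illustration. A residual block has the same form as one explicit Euler step with step size 1, but a neural ODE integrates shared dynamics over continuous depth with a solver, which gives a different result:

```python
import math

def f(x):
    # Hypothetical stand-in for a sub-layer (attention or MLP); not NanoEuler code.
    return [math.tanh(v) for v in x]

def residual_step(x, h=1.0):
    # x_{l+1} = x_l + h * f(x_l); an ordinary residual block is the h = 1 case.
    return [xi + h * fi for xi, fi in zip(x, f(x))]

def euler_integrate(x, t_end, n_steps):
    # Explicit Euler for dx/dt = f(x), reusing the same f at every step.
    h = t_end / n_steps
    for _ in range(n_steps):
        x = residual_step(x, h)
    return x

x0 = [0.5, -1.0]
one_block = residual_step(x0)              # a single residual block
coarse = euler_integrate(x0, 1.0, 1)       # identical: one Euler step with h = 1
fine = euler_integrate(x0, 1.0, 1000)      # approximates the continuous flow

print(one_block, coarse, fine)
```

The single block and the one-step Euler run agree exactly, while the finely integrated trajectory ends somewhere else. Stacking distinct residual layers with a fixed step is therefore not the same thing as training a neural ODE.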

Also consider getting rid of the em-dashes. I don't know if you mostly vibe-coded this or not, but the README is pretty clearly AI generated.

vforno 4 hours ago

Hi, thanks for the comment. NanoEuler is starting out as a study and research project that will obviously improve over time. I'll do my best to make the README and other things more readable. Thank you very much.

isatty 7 hours ago

I'm genuinely curious: how much of this is LLM-generated?

vforno 4 hours ago

Most of the transformer and SFT parts!

ericb 11 hours ago

How long was it trained for? How many tokens?

vforno 11 hours ago

Hi, a couple of hours, not too much! That includes SFT!

Chu4eeno 14 hours ago

Very weird coding style, did you run astyle --style=python on C code?

Also, your LLM left a comment in the CUDA source saying that it is untested. Does the CUDA stuff work?

bArray 11 hours ago

Not sure, but the code is quite dense and lacking in comments. Are `nanoeuler` and `nanoeuler_check` the binaries themselves, checked straight into git along with the `.log` file? All of the commit messages are "Add files via upload" and happened in quick succession.

I suspect this is LLM-generated, which is cool, but then it shouldn't carry the claim "forward and backward passes are written and verified by hand" unless that is true.

Regarding the data, old texts from Gutenberg probably lower the performance, especially as many texts are whimsical on purpose. Shakespeare, for example, made up words to be theatrical. You have a mix of different old English styles in the corpus, which is a terrible way to learn modern English. I had some success using .ZIM data archives from Kiwix as a source; you should get more stable output using that data.

andai 5 hours ago

I haven't tested NanoEuler yet, but Gutenberg is awesome. Maybe it's a matter of taste, but I like it much better than modern English.

vforno 11 hours ago

Hi, the uploads came one after the other because this was a long, step-by-step research project, and I tested the code on another machine. I admit that I'm slowly catching up on proper commits across all my projects. As for Gutenberg and Shakespeare, I admit they were the best tests I could do, but I'll keep improving!

dang 12 hours ago

> Very weird coding style, did you run astyle --style=python on C code?

I'm sure you mean it in a more curious way, but this type of comment on a Show HN often comes across as too harsh/snarky/dismissive for what we want here (see https://news.ycombinator.com/showhn.html).

vforno 13 hours ago

Yes, yes, tested on a 4070 Ti 16GB; everything worked without problems!

Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch