> TCL test harness. C SQLite's test suite is driven by ~90,000+ lines of TCL scripts deeply intertwined with the C API. These cannot be meaningfully ported. Instead, FrankenSQLite uses native Rust #[test] modules, proptest for property-based testing, a conformance harness comparing SQL output against C SQLite golden files, and asupersync's lab reactor for deterministic concurrency tests.
If you're not running against the SQLite test suite, then you haven't written a viable SQLite replacement.
I thought I read somewhere that their full test suite is not publicly available?
The TH3 test suite is proprietary, but the TCL test suite that they refer to is public domain.
I'm not sure where they got their 90k CLOC count, though; that seems like it might be an LLM-induced hallucination, given the rest of the project. The public-domain TCL test suite is ~27k CLOC, and the proprietary suite is 1055k CLOC.
This and this needs Jepsen testing.
The value of SQLite is how robust it is and that’s because of the rigorous test suite.
Isn't that test suite private though?
The TH3 test suite is proprietary, but the TCL test suite that they refer to is public domain.
I'm not sure where they got their 90k CLOC count, though; that seems like it might be an LLM-induced hallucination, given the rest of the project. The public-domain TCL test suite is ~27k CLOC, and the proprietary suite is 1055k CLOC.
Thanks for the clarification, I appreciate it.
> and the proprietary suite is 1055k CLOC.
Why is the code size of the proprietary test suite even public though?
You can buy access to it.
Any serious SQLite re-implementation should buy it and test against it.
The cost of TH3 is listed as "call".
It's much more likely the issue is one of cost, not of seriousness.
What's the obsession with concurrent writes?
A single writer will outperform MVCC as long as you do dynamic batching (which doesn't prevent logical transactions), and all you have to do is manage that writer at the application level.
Concurrent writers just thrash your CPU cache. The latency difference between an L1 hit and a trip out to L3 or main memory can be 100x, so a single writer on a single core can outperform tens or hundreds of cores, especially once you factor in contention.
Here's SQLite doing 100k TPS, and I'm not even messing with core affinity, and it's going over FFI from a dynamic language.
https://andersmurphy.com/2025/12/02/100000-tps-over-a-billio...
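A minimal sketch of the dynamic-batching pattern described above (illustrative only, not from any real project; the `Op` type and channel-fed writer loop are assumptions): a single writer thread blocks for the first pending write, then drains everything already queued and commits the whole batch as one transaction, so the batch size adapts to load automatically.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical write operation; in a real system each would be a SQL statement.
struct Op(u64);

// Single-writer loop with dynamic batching: block for one op, then drain
// everything else already queued and commit it all as one transaction.
fn run() -> (u64, u64) {
    let (tx, rx) = mpsc::channel::<Op>();

    let writer = thread::spawn(move || {
        let (mut committed, mut batches) = (0u64, 0u64);
        while let Ok(first) = rx.recv() {
            let mut batch = vec![first];
            while let Ok(op) = rx.try_recv() {
                batch.push(op); // drain without blocking
            }
            // BEGIN; apply batch; COMMIT -- one commit (one fsync) per batch
            committed += batch.len() as u64;
            batches += 1;
        }
        (committed, batches)
    });

    // Many producers contend only on the channel, never on the database.
    let handles: Vec<_> = (0..4u64)
        .map(|p| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..1000u64 {
                    tx.send(Op(p * 1000 + i)).unwrap();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    drop(tx); // close the channel so the writer loop ends

    writer.join().unwrap()
}

fn main() {
    let (committed, batches) = run();
    println!("committed {committed} ops in {batches} batches");
    assert_eq!(committed, 4000);
}
```

The producers pay only channel contention; the writer pays one commit per batch instead of one per operation, which is where the throughput comes from.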
It's worth scrolling down to the current implementation status part:
https://github.com/Dicklesworthstone/frankensqlite#current-i...
Although I will admit that even after reading it, I'm not exactly sure what the current implementation status is.
It's fake. It doesn't exist. It never happened. The whole thing is an LLM hallucination. You can notice that it's all half implemented if you read the code: https://github.com/Taufiqkemall2/frankensqlite/blob/main/cra...
We are going to get overwhelmed with this stuff, aren't we?
The people who understand basic logic will be fine, but I'm starting to think that's a very small group of people.
If this wasn't ambitious enough, the author is also porting glibc to Rust. As I understand it, all of it is agentically coded using custom harnesses.
It doesn't read ambitious so much as naive.
It entirely depends on how much the author reads the result of the agentic coding.
Not very much judging by the commit rate.
It sounds sci-fi, but not naive anymore.
Author is on HN: https://news.ycombinator.com/user?id=eigenvalue
Clean-room implementation? Yeah, sure, buddy.
Why does clean room even matter given SQLite is in the public domain?
And in every training corpus many times over.
[dead]
Really, Rust folks should stop using original projects' names. It's not related to SQLite; it's very loosely inspired by it.
s/rust/llm/. It doesn't really matter which language the slop is produced in.
This kind of slop spewing into Github feels like the modern equivalent of toxic plumes coming from smoke stacks.
Utterly unmaintainable by any human, likely never to be completed or used, but now deposited into the atmosphere for future trained AI models and humans alike to stumble across and ingest, degrading the environment for everyone around it.
It's kinda like when the web first started taking off and there were WYSIWYG editors. Everyone and their mom was creating static HTML websites.
But nobody shows off static HTML sites on HN.
Looks mildly interesting, but what's up with the license?
MIT plus a condition that designates OpenAI and Anthropic as restricted parties that are not permitted to use it... or else?
Good luck enforcing that. "Glad" to hear that Gemini's excluded.
Where do you see issues enforcing license terms?
The fact that they've hosted it on GitHub means they've agreed to GitHub's terms, which allows them (via OpenAI) to train on the code.
Also it's pretty hilarious to vibe-code a library that clones another library that someone has spent decades of work on, and then try to prohibit people from using that LLM output as training data for an LLM.
The author seems obsessed with RaptorQ [1]; this is not a good place for it.
RS over GF(256) is more than adequate. Or just plain LDPC.
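For context on the "RS over GF(256)" suggestion: Reed-Solomon coding is built from byte-wide finite-field arithmetic like the multiply below. This is a sketch using the AES reduction polynomial 0x11B (one common choice of field; real RS codecs typically use log/antilog tables rather than a bit loop).

```rust
// Multiplication in GF(2^8), reduced by the AES polynomial
// x^8 + x^4 + x^3 + x + 1 (0x11B). A full RS encoder/decoder over
// GF(256) is built from this plus XOR (which is field addition).
fn gf256_mul(mut a: u8, mut b: u8) -> u8 {
    let mut p = 0u8;
    for _ in 0..8 {
        if b & 1 != 0 {
            p ^= a; // add a into the product
        }
        let carry = a & 0x80 != 0;
        a <<= 1;
        if carry {
            a ^= 0x1B; // reduce: subtract (XOR) the modulus
        }
        b >>= 1;
    }
    p
}

fn main() {
    // {53} * {CA} = {01} is the worked example from FIPS-197.
    assert_eq!(gf256_mul(0x53, 0xCA), 0x01);
    // x * (x + 1) = x^2 + x
    assert_eq!(gf256_mul(2, 3), 6);
    println!("gf256 ok");
}
```

The point being that the math an RS code needs fits in a dozen lines, which is part of why it is the conservative choice over a rateless fountain code like RaptorQ here.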
[1] <https://www.jeffreyemanuel.com/writing/raptorq>
It says at the top that it's called monster, but then it speaks of FrankenSQLite. Confusing website, IMHO, for a nice project.
While I don't think the website is particularly well-designed, "monster" can be used as an adjective.
There's a limit to what Claude can do without a competent human helping …
I was looking at this repo the other day. Time travel queries look really useful.
Impressive piece of work from the AIs here.
where the heck is my mouse cursor?
There is a popular [excellent, non-vibe-coded] web server called FrankenPHP: PHP embedded in the Caddy web server, which is written in Go.
Are there any other FrankenProjects out there that have had any success?
Were we so impressed by the concept of the original Frankenstein?
Is this a Freudian slip, that we are expecting these AI projects to turn on their creators?
We need to ban this kind of AI slop yesterday.
Was it vibe coded?
Extremely. Repo is littered with one-off Python scripts, among many other indicators.
Nobody in their right mind would sponsor this project to be hand-written.
Of course it was.
didn't notice at first, but my CPU fan went silent the moment I closed this slop website
Is the implementation untouched by generative AI? It seems a bit ignorant/dishonest to claim "clean-room" in such a case.
AGENTS.md and COMPREHENSIVE_SPEC_FOR_FRANKENSQLITE_V1_CODEX.md in the root folder, and ugly AI slop image on the home page and README.
A better question is if the implementation was touched by anything other than generative AI.
Even though it looks like LLM slop, we are starting to see big projects being translated/refactored with LLMs. It reminds me of the 2023 AI video era. If the pattern follows, we will start to see way fewer errors until it is economically viable.
Love the "race" demo on the site, but very curious about how you approached building this. Appreciated the markdown docs for the insight on the prompt, spec, etc
If you can't tell this is LLM slop then I don't really know what to tell you. What gave it away for me was the RaptorQ nonsense & conformance w/ standard sqlite file format. If you actually read the code you'll notice all sorts of half complete implementations of whatever is promised in the marketing materials: https://github.com/Taufiqkemall2/frankensqlite/blob/main/cra...
If you bothered to do any research at all you’d know the author as an extreme, frontier, avant-garde, eccentric LLM user and I say it as an LLM enthusiast.
Thanks. Next time I'll do more research on what counts for LLM code artwork before commenting on an incomplete implementation w/ all sorts of logically inconsistent requirements. All I can really do at this point is humbly ask for your & their avant-garde forgiveness b/c I won't make the same mistake again & that's a real promise you can take to the crypto bank.
Great! But note I haven’t said that you should be doing the research. This was more of a warning about today, but it also was a different kind of warning about the next 12-18 months once models catch up to what this guy wants to do with them.
Thank you for your wisdom. I'll make a note & make sure to follow up on this later b/c you obviously know much more about the future than a humble plebeian like myself.
Yeah, “rewrite it in Rust” strikes again, this time equipped with an AI slop generator.
I reimplemented my Grandma in Rust. She was a real safety and security hazard to herself and her surroundings. She forgot things and made unsound memory assumptions. It took me about 3 days of vibe coding with Claude Code and was a real fun time. Now my grandma is leaking everything and has some new command-line switches. To be fair, I know best how to implement Grandmas, and everybody should use my Grandma from now on. If this breaks your scripts, just adapt. Sure, this was very cynical, but I'm so tired of reading, every week, about some new pet project where Rust is seen as a messiah. It is a new language; it helps you get memory right more easily. It is like the new Visual Basic.