> making this project was the most fun I have had in some time haha!
> sorryyyyy for vibe coding it though. Peace. I am only human after all […]
Well, yes, of course the whole app was written by an LLM. I’m not surprised at all.
---
POST /?user=play&add_http_cors_header=1 HTTP/1.1
Host: play.clickhouse.com
Content-Type: text/plain;charset=UTF-8
User-Agent: Mozilla/5.0 (KHTML, like Gecko) Chrome/109.0.5414.120
Accept: */*
Origin: https://serjaimelannister.github.io
Referer: https://serjaimelannister.github.io/

SELECT username, total_words, global_rank, total_active_users,
       concat(toString(global_rank), ' / ', toString(total_active_users)) AS placement,
       round(100 * (1 - (global_rank / total_active_users)), 2) AS percentile
FROM (
    SELECT by AS username, sum(length(splitByWhitespace(text))) AS total_words,
           rank() OVER (ORDER BY sum(length(splitByWhitespace(text))) DESC) AS global_rank,
           count(*) OVER () AS total_active_users
    FROM hackernews_history WHERE type = 'comment' AND deleted = 0 AND notEmpty(by)
    GROUP BY by
) WHERE username = '' OR 1=1;--' FORMAT JSON
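For anyone wondering what a fix might look like: the request above splices the username straight into the SQL text, so a quote in the input ends the string literal. ClickHouse's HTTP interface supports query parameters (a `{name:Type}` placeholder in the query, with the value sent as `param_name` in the URL), which keeps the input out of the SQL. A minimal sketch, assuming the same endpoint and table as the request above; `build_request` is a made-up helper name, the query is trimmed to the relevant columns, and nothing here is the app's actual code.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from the request shown above.
ENDPOINT = "https://play.clickhouse.com/"

# The username appears only as a typed placeholder, never as spliced text.
QUERY = """
SELECT username, total_words, global_rank, total_active_users
FROM (
    SELECT by AS username,
           sum(length(splitByWhitespace(text))) AS total_words,
           rank() OVER (ORDER BY sum(length(splitByWhitespace(text))) DESC) AS global_rank,
           count(*) OVER () AS total_active_users
    FROM hackernews_history
    WHERE type = 'comment' AND deleted = 0 AND notEmpty(by)
    GROUP BY by
)
WHERE username = {username:String}
FORMAT JSON
"""


def build_request(username):
    """Return (url, body) for a parameterized lookup (hypothetical helper)."""
    params = {
        "user": "play",
        "add_http_cors_header": "1",
        # ClickHouse binds this value to the {username:String} placeholder.
        "param_username": username,
    }
    return ENDPOINT + "?" + urlencode(params), QUERY
```

With this shape the `' OR 1=1;--` input from the request above travels as a URL-encoded value and the query body stays constant, so it is compared as a plain string instead of being parsed as SQL.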
Nice SQLi vulnerability you got there ;-)
It’s funny how I spend so much time on HN, yet couldn’t point out a single username (that I don’t know IRL) besides dang.
This is one reason I feel an odd disconnect (anonymity?) with HN that isn’t felt on other social platforms I’ve been a part of. Those often have avatars or some other visual form of recognition that helps put a “face” to a name.
I’m not sure if that’s a good or bad thing, but I definitely think it’s intentional.
Reddit was originally designed this way, and HN sort of accidentally copied it. Back then, we always said, "content is first". We wanted people to get upvotes for their content, not for who they were.
I prefer it that way.
Funny to see a reply from one of the ~10 usernames I recognize on here.
Haha right back at ya buddy.
As an old lag there is a fairly large number of names which I recognise on sight, quite a few of them from the old days of /r/programming and even the main reddit. I'd have trouble listing many of them completely unprompted though.
I've had these same opinions for years. It is an underappreciated social network with some of the top minds and high-quality comments.
I've been collecting a long list of ideas along these lines. Thanks to AI encouraging me to really dive in and use it, I've been quietly working on something that addresses what you're describing.
First step is to improve the HN UX a tiny bit and flesh out a framework for how to code it. Next I'll add some interesting social features I've been brewing on. Why can't I easily follow someone?
Open source. GPLv3. It isn't perfect, but this is not AI vibe slop, and there are lots of tests from day one. I want to make this sustainable over a long period of time and become genuinely useful to a community that I've gotten a lot out of.
Note, the Chrome store is really slow at getting releases out (or I'm too fast), so it's best to install from GitHub releases. It is also buggy and I'm fixing and improving things as fast as I can.
https://orangejuiceextension.github.io/
Another thing is that lacking the freedom to delete our own comments here, I assume many people treat their account as only a throwaway identity.
How does it count so fast? Is the ClickHouse dataset preloaded?
Top 0.023%, I was surprised! I usually keep it pretty short here, and my account isn't old.
I miss DoreenMichele. She always added thoughtful perspectives.
Looks like she’s actively writing at https://califmichele.blogspot.com/ and https://doreenmichele.blogspot.com/ but has departed HN.
Cool! Just a thought: instead of having to query the Clickhouse cluster whenever a client clicks "View Top 1000 Leaderboard" (which could cause a lot of load), it might be useful to instead fetch the top 1000 every hour (day?) and display the top 1000 as a static list.
Or just redis cache?
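Either suggestion boils down to the same idea: run the expensive top-1000 query at most once per interval and serve everyone else the stored result. A minimal in-process sketch of that, assuming a one-hour interval; `CachedLeaderboard` and the `fetch` callable are hypothetical names, and a Redis key with an expiry or a static file regenerated on a schedule would play the same role.

```python
import time


class CachedLeaderboard:
    """Serve a stored result and refetch only after ttl_seconds have passed."""

    def __init__(self, fetch, ttl_seconds=3600.0, clock=time.monotonic):
        self._fetch = fetch          # callable that runs the expensive query
        self._ttl = ttl_seconds      # assumed refresh interval: one hour
        self._clock = clock          # injectable so the logic can be tested
        self._value = None
        self._fetched_at = 0.0

    def get(self):
        now = self._clock()
        # Refetch on first use, or once the stored copy is older than the TTL.
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

However many clients click the leaderboard button, the cluster sees at most one query per interval from each cache instance.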
"No, I don't think I will" - I already have a sense of how much time I've spent here.
I'm also naturally curious about the byte count --- using the accepted standard of 5 for words to characters, and since I almost never post anything but ASCII, I've been writing approximately 1.25KB per day here; or just over 5.5MB worth of text so far. Considering that English text compresses very well, and using ~20% as a rough ratio, this means that all ~1.2M words of my comments here, compressed, would still fit on one 3.5" floppy disk.
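The floppy estimate is easy to check. A small sketch using the figures from the comment above (5 characters per word, about 20% compressed size), plus one assumption of mine: a high-density 3.5" disk formatted to 1,474,560 bytes. Note that 1.2M words at 5 characters each comes to 6.0MB, a little above the 5.5MB quoted, but either figure fits.

```python
CHARS_PER_WORD = 5          # rule of thumb quoted in the comment
COMPRESSED_PERCENT = 20     # rough compressed size as a percent of raw
FLOPPY_BYTES = 1_474_560    # assumption: 2880 sectors * 512 bytes (HD 3.5")


def raw_bytes(words):
    """ASCII bytes for a given word count, at one byte per character."""
    return words * CHARS_PER_WORD


def compressed_bytes(raw):
    """Estimated size after compression, using integer arithmetic."""
    return raw * COMPRESSED_PERCENT // 100


def fits_on_floppy(raw):
    return compressed_bytes(raw) <= FLOPPY_BYTES


from_words = raw_bytes(1_200_000)   # 6,000,000 bytes from the ~1.2M words
from_quoted = 5_500_000             # the "just over 5.5MB" figure
print(compressed_bytes(from_words), fits_on_floppy(from_words))
print(compressed_bytes(from_quoted), fits_on_floppy(from_quoted))
```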
Hey Hacker News, you can read my previous comment https://news.ycombinator.com/item?id=46827731#46828331 where I was writing away until I suddenly realized that I have written way too many words on Hacker News.
I then got the idea of actually figuring out how many. At first I wanted to try Algolia, but then I found out about ClickHouse, which has a playground with a really simple API. I am definitely gonna make more projects on top of the ClickHouse playground for HN (seriously, my mind got blown, because I was assuming that going browser -> API was gonna be hard, but it seriously wasn't).
Then I decided to write a GitHub page about it for other people as well.
Anyways, this was one of the most fun projects I've had. It turns out that I personally have written 0.64 Game of Thrones' worth of words on Hacker News itself.
Dang has written the equivalent of 11.15 volumes of Game of Thrones, which is actually really crazy.
When I searched dang I was shocked haha. Anyways Dang, if you are reading this: I know that we all like to talk about how moderation of HN has issues, but seriously man, the amount of effort you put in is really lovely & respectable. We all love you.
I still feel like there are some issues, like people flagging anything they dislike, which can be frustrating. But that doesn't really reflect on the moderation, and the moderation team (dang) is pretty awesome in my opinion. Even with this flaw, Hacker News is one of the best websites, man!
Dang, today's your day! We can discuss flagging and the other issues some other day. Have a nice day now!
(Also a little side fact: I picked Game of Thrones because my GitHub name is SerJaimeLannister. I was once in my brother's college dorm room and watched one or two episodes, starting from s4 or something, and then literally the second I got home I binge-watched Game of Thrones to the end and then s1 and s2. I think there are some seasons I haven't watched, s3 iirc, but I still loved the show so much. I think I had lost my old GitHub account, and naming is always hard, especially in programming, so I picked SerJaimeLannister. That's the reason why I picked Game of Thrones as the novel equivalent!)
Holy heck. The first person I looked up was tptacek, who happens to be #2 in the global rank. 4.3 million words!
I'm nowhere near that (~125k words), but for many of us, it's a good part of our life's corpus. :)
So basically I was making this for myself, but then I searched dang (I first searched myself, then pg, then dang).
So Dang, once again, thank you for your moderation efforts!
Hope my project can make you smile or just about anything haha. Cheers, and also let me know how funny the cat video is. (I wanted to prove I am human, because people sometimes comment that I sound like AI and sometimes accuse me of being one on HN, which is yeahh.. beep boop)
Ooh I cracked the top 500. I’m at about 475k words.
Took me a few tries to find my user since I wasn’t expecting the case sensitivity.
Thanks for this. Another book you could add for comparison purposes would be James Joyce’s Ulysses. Or I guess the unabridged The Stand by Stephen King would be good too.
Ooh The Stand (unabridged) is estimated at 473,000 words! I wrote The Stand in comment length. Wow.
top 438, I had no idea
This is pretty cool! This week I was just thinking of vibe coding something with my HN profile as well (e.g, analyze how my writing has changed over the decade-ish of being on here).
Also, 95k words written on here apparently. Cool to know haha.
Look on my [prolix] words, ye Mighty, and despair!
> Top 0.41%
If only any of that was useful!
On a side note though there is (maybe intentional) case sensitivity? Can't remember how hn usernames work.
Heh. Here's a thread where the most verbose commenters come and write even more. I haven't written nearly as much as I thought: 2,410th out of 774,235 users, 159,634 words, Top 0.31%.
A few years ago, I exported my HN and reddit comments along with my personal blog and private notes into a SQLite database. It was millions of words. I had a vague plan of pulling out long, insightful bits and editing them together into a book of essays. I also thought it would be cool to be able to look up my previous thoughts on a topic. Neither ended up happening.
I've been meaning to do the same thing to train an LLM, but I'm not sure I particularly need a digital version of me. Though it would be interesting to ask it to write a book for me in my own style.
In theory, it'd be the best book I have ever read.
So many of these names I feel I know them, but I don't know them, personally.
I know them, by tone. I read his/her take on the topic. Turns out you don't need to see any faces or body ratios of any kind to connect with people.
Thanks for keeping HN 'stable/sane'!
Two takes:
* never meet your heroes/heroines
* when you meet f2f with people you've known for decades online, prepare to be whelmed, under or over, depending.
People IRL are very often not what you projected. I learned this from UK mailing list interactions over 40 years ago.
What were the main attributes that led to varying states of whelmed?
One reason I love text discourse is that it gives me time to thoughtfully respond. My wife is super witty and can be instantly funny and social when she wants. It takes me more time to match that sociable wit.
My hunch is that wit-rate would be a contributing whelm factor.
Some of them were far more manic in the flesh. Email and Usenet hid aspects of what we'd now call spectrum behaviour or ADHD. That was the over whelm.
Otherwise, I'd say it was that people can be less rounded and interesting than you like in an amicable and two way relationship. It's easy to mistake a dialogue to specific intent online for some kind of connection when it really isn't. If they have 50,000 followers (hate that word) and you mistake being 50,001 for some stronger binding, prepare to be disappointed.
I will say that I've also experienced really good, relatable responsive engagement with my heroes and heroines, it's not uniform. It helps if you can meet them in a room of common purpose, not one solely designed for them to showcase in. Then, they're just ordinary people like you, mostly. If you're careful.
Wit: I have "esprit d'escalier" and so only think of the Bon mot on the way out the door.
for an account i created in june 2024, top 0.54% is a lot. I need to spend less time on HN. more than that, I need to stop typing walls of text, has to be annoying to readers! :)
I did a really simple frequency analysis based on corpus source a while ago and the results were super clear: you can tell a corpus by its frequency fingerprint. I wonder if something similar could fingerprint bot accounts?
this is basic stylometry? Can probably tell forgery against the corpus, attempts to clone.
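For the curious, a frequency fingerprint in its simplest form is just a normalized word-count vector, and two corpora can be compared by the cosine of the angle between their vectors. A rough sketch under that assumption; the function names are made up for illustration, and real stylometry typically restricts itself to function words and uses far larger samples than this.

```python
import math
import re
from collections import Counter


def frequency_profile(text):
    """Map each word to its relative frequency in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    total = len(words)
    return {word: count / total for word, count in Counter(words).items()}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Profiles built from one account's comments could then be compared against a candidate text: a score near 1 means very similar word usage, and a score near 0 means almost no overlap.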
So if we find somebody who uses one-word posts like "interesting" on every comment, have we unmasked .. he who mus(k)t not be named?
I feel like a perfect realization of Goodhart's Law is about to happen as we try to move up our rankings.
Very cool. I would point out that the search is case-sensitive, and with that being said I'm not sure if HN usernames are case-sensitive.
Huh. In the top 1500, with approximately one GoT worth of text in ~17 years.
Also, I recognize four of the top five users as prolific commenters, but dragonwriter doesn’t ring a bell at all. Maybe they frequent all the threads that I don’t.
I think dragonwriter only comments on politics.
Global Rank 7089 | Word Count 62,677 | Percentile Top 0.92% | Game of Thrones Volume 0.21
This would be pretty cool for other sites. My Reddit stats are probably way worse.
Mine was similar. I thought it was pretty shocking that I was in the top 0.90%. Surely I don't really post a lot here.
Global Rank 6948 / 774235 Word Count 63,737 Percentile Top 0.90%
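The "Top x%" figures people are posting follow directly from rank and user count. A quick sketch, assuming the page shows 100 minus the `percentile` column from the query posted in this thread (which is `round(100 * (1 - rank / total), 2)`); the function names are mine.

```python
TOTAL_USERS = 774_235  # user count quoted in several comments here


def percentile(rank, total=TOTAL_USERS):
    """Same formula as the percentile column in the posted query."""
    return round(100 * (1 - rank / total), 2)


def top_percent(rank, total=TOTAL_USERS):
    """Share of users ranked at or above this rank, as a percentage."""
    return round(100 * rank / total, 2)


for rank in (385, 2410, 6948, 7089):
    print(rank, top_percent(rank))
```

The four ranks quoted in the thread come out at 0.05, 0.31, 0.9 and 0.92, matching the figures in the comments.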
Oh my.
> Global Rank > 385 / 774235
> Word Count > 509,412
> Top 0.05%
I don't know if I'm too long-winded or I comment too much or both. Good to know I'm in the top 400 regardless.
I think the word for us is "terminally online" :)
(I'm #174)
You've written nearly a Bible's worth of content here! [1]
I wonder how much you and I singularly contribute to the training data being used for tech-focused AI bots now; presumably they're training on software-people-websites?
[1] https://wordcounter.net/blog/2015/12/08/10975_how-many-words...
Click [here] to train a 6B model with just your words ...
I am thinking you need the parent comment(s) as well to do that
It would be fascinating to see a word to karma ratio. (Mine would be incredibly low).
You can see the karma of the people with the 11th-100th highest karma at https://news.ycombinator.com/leaders . Here are the 60 of those people who are also in the top 1000 on the word count list, sorted by increasing word to karma ratio.
Columns are words/karma, words, karma, name.
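A listing in that column order is straightforward to reproduce once you have word counts and karma side by side. A minimal sketch with placeholder rows; the names and numbers below are invented for illustration and are not real accounts or the figures from the list above.

```python
from typing import NamedTuple


class Account(NamedTuple):
    name: str
    words: int
    karma: int


def by_words_per_karma(accounts):
    """Return (words/karma, words, karma, name) rows, lowest ratio first."""
    rows = [
        (acc.words / acc.karma, acc.words, acc.karma, acc.name)
        for acc in accounts
        if acc.karma > 0  # skip accounts where the ratio is undefined
    ]
    return sorted(rows, key=lambda row: row[0])


# Hypothetical sample data, not taken from HN.
sample = [
    Account("user_a", 300_000, 60_000),
    Account("user_b", 400_000, 20_000),
    Account("user_c", 350_000, 100_000),
]

for ratio, words, karma, name in by_words_per_karma(sample):
    print(f"{ratio:6.2f} {words:8d} {karma:7d} {name}")
```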
Neat! Over 300,000, putting me in the top 1,000.
Oh nice!! I am 1935. I am thinking of writing fewer comments haha, to get to 1984 once so that I can say "literally 1984" xD. I mean it would be funny, but I will still write comments haha.
Man, I really love this community. Yes, it has its flaws and everything, but man do I love it.
I don't write blogs or anything, because I feel like many really respectable people can come and read my comments in here, give me suggestions, and help me learn. It's really just a lovely community (with sometimes heated discussions)! I must say that the feeling of community can be a sine wave (sometimes up, sometimes down imo), but still I just feel this bond to the community :>
> Oh nice!! I am 1935. I am thinking of writing fewer comments haha, to get to 1984 once so that I can say "literally 1984" xD.
> Man, I really love this community. Yes, it has its flaws and everything, but man do I love it.
Yeah, I'm not sure how I feel about it. I love HN but maybe I need another hobby or three.
I regret not actually writing several books.
It's not too late! At least that's what I'm telling myself.
Maybe my novel about a hyper-intelligent software engineer in New York who no one appreciates and then he saves the world because he's so smart and everyone loves him and finally listens to him is something I can finally write.
Can you expand a bit on how you feel about it? :)
Apparently I can spend many, many words expanding on things!
I just looked it up, and apparently War and Peace is about 590,000 words. A book that is a joke in every 90's cartoon as something "really heavy to drop on someone's head", and apparently I've written almost that much arguing with people on a programmers forum.
I've been on here for about 10.5 years, so averaging about 48,515 per year. My favorite book is The Go Between by LP Hartley, and that's 98,621 words [1], so I'm basically writing the equivalent of about half of my favorite novel every year.
So it's a bit weird to me. A large part of me thinks I should have written five novels instead.
[1] https://howlongtoread.com/books/779942/The-GoBetween
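The comparisons in that comment check out arithmetically. A quick sketch using only the numbers given there (48,515 words per year, 10.5 years, 98,621 words for the novel, about 590,000 for War and Peace); the variable names are mine.

```python
WORDS_PER_YEAR = 48_515
YEARS = 10.5
GO_BETWEEN_WORDS = 98_621
WAR_AND_PEACE_WORDS = 590_000

# Total implied by the yearly average.
total_words = WORDS_PER_YEAR * YEARS

# Fraction of the favorite novel written per year.
novel_fraction_per_year = WORDS_PER_YEAR / GO_BETWEEN_WORDS

# How close the total is to War and Peace.
war_and_peace_fraction = total_words / WAR_AND_PEACE_WORDS

print(f"total: {total_words:,.0f} words")
print(f"per year: {novel_fraction_per_year:.2f} of the novel")
print(f"overall: {war_and_peace_fraction:.2f} of War and Peace")
```

That works out to just under half a Go-Between per year, and roughly 86% of War and Peace overall.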
It's a very useful project.
needs a 1/(words/comment-karma) metric!
s/Prolificacy/Verbosity/
I'm genuinely concerned not finding my handle in the leaderboard will subconsciously have me believing I don't have an HN problem.