For some reason, in the early days of the web, email seemed like a logical choice to get input from users.
The first time I tried to have users fill out a form, I sent them an .exe file containing a Windows application that showed the form and saved the replies to a file. In the email, I asked users to send that file back to me. But no matter how I worded the email, 50% of users sent me back the .exe file instead.
That problem was what triggered me to learn about server side code and databases.
And when that worked, it hit me: I could make a form that asked users about their favorite bands and suggested new bands to them right away. That way, the system would learn about all the bands of the world on its own and get better and better at suggesting music. This is how Gnoosic [1] was born. Later I adapted it for movies and called that Gnovies [2], and for literature and called that Gnooks [3].
All 3 are still alive and keep learning every day:
[1] https://www.gnoosic.com
[2] https://www.gnovies.com
[3] https://www.gnooks.com
I really really love this, but it would benefit tremendously from one simple, but maybe not easy to implement, feature: movie posters. For someone like me who watches A LOT of movies, it’s sometimes hard to remember a movie based on its title alone, and seeing the official movie poster would make it so much easier.
But I guess implementing this would take away a lot of the current implementation's simplicity? Are these available without violating copyright left and right? Is there even a straightforward way to map movie titles to, say, their primary IMDb poster image?
Wow, is this awesome! I put in: Hugh Masekela, John Prine, and Ripe. It gave GREAT recommendations!
Oliver Mtukudzi, Melting Palms
I've never heard of either. Enjoy both. Sent the link to my son. Brilliant! Thank you.
As with many such sites I've used in the past, there are some issues with the data. Gnooks suggested one author, then the same author, but with a typo in the name. Gnoosic suggested Procol Harum, then it suggested A Whiter Shade of Pale (which is the title of the song most associated with Procol Harum).
Yes, since Gnod learns everything on its own, typos are inevitable.
But over time, it learns that two names can mean the same author.
You can help it by voting for typos here:
https://www.gnooks.com/vote_typos
If Gnoosic thinks a name is a band while it actually is a song, reporting the song as a typo of the band helps:
https://www.gnoosic.com/vote_typos
And of course, if your input form gives the user any help while they are filling it out, you can run into a problem where users choose ABBA more often than Aerosmith or Argent (more often than they otherwise would have) just because ABBA is presented first alphabetically.
I got around this issue by not giving any help: the user was presented with a form with blanks for five artist-and-song-name pairs. That resulted in a crazy amount of typos, which I fixed "by hand" using a pretty lame interface, since I wasn't a good enough programmer to create a better way in 1996. I spent a LOT of time correcting typos.
Hey, cool to meet someone who went down a similar avenue. BubbleRings also looks like a real fun startup.
It used to be a form without typeaheads for years on Gnoosic too.
I never manually fixed typos, but I have put quite some work into making the system figure them out on its own. For example by asking people "Hey, you entered 'The Beatled' which sounds a lot like the popular band 'The Beatles'. Could it be you meant 'The Beatles'?" and then offering buttons for yes/no etc.
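A minimal sketch of that kind of fuzzy matching, using Python's stdlib difflib (the band list and cutoff here are made up for illustration, not Gnod's actual code):

```python
from difflib import get_close_matches

# Hypothetical list of bands the system already knows well.
known_bands = ["The Beatles", "The Rolling Stones", "Pink Floyd"]

def suggest_correction(entry, cutoff=0.8):
    """Return a known band closely resembling the user's entry, or None."""
    matches = get_close_matches(entry, known_bands, n=1, cutoff=cutoff)
    # Only ask the user when the match is close but not identical.
    if matches and matches[0] != entry:
        return matches[0]
    return None

print(suggest_correction("The Beatled"))  # -> The Beatles
```

The yes/no buttons would then confirm or reject the suggested match before merging the two names.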
Thanks - just tried gnovies and got a great recommendation.
I'm surprised that something I'd never heard of appears to work that well.
I have been enjoying these for years, thank you for wonderful services!
I used gnoosic just now and discovered Melody Gardot. Wow. Thanks!
Oh, that’s a great discovery to make. Enjoy! And if you’re into interesting backstories on musicians, make sure to check out how and why she makes music, it’s both tragic and very peculiar.
Thanks for pointing that out. Read her wikipedia page. What a life.
www.gnoosic.com is gorgeous! I can't wait to try it out in detail.
Let me get my post for this thread finished then I'll try to look you up.
Email was once the user interface for all remote services. Bitnet had nodes that responded to commands and performed operations that sent the results in email. For instance, you could send an email to "TRICKLE@TREARN" on Bitnet with a subject line: "GET ftp.funet.fi /pub/something" and the Trickle service at TREARN node would download the file over Internet, split it in chunks and would send it to you over Bitnet, so you'd effectively have FTP capability on Bitnet just with email.
I had written a user database called "Hitbase" (a very primitive Facebook) on a Fidonet network that responded to Netmail messages sent to a given node and sent the responses to the requesting address. That was in the '90s, before the Internet was accessible from homes.
My high school got email access for students in 1997 (New Zealand). We had to pay per megabyte for web browsing. There were services you could email URLs to, that would email you back a text rendering of the page. So I used that.
They had a fairly smart UI for following links. They would appear as footnotes and IIRC you could just hit reply, type the footnote number, and then send.
Email is still the best thing about the internet. I know it can be unwieldy if you don't spend the time to figure out a good strategy for dealing with it, and for that reason there will always be those that hate it. But I'm constantly wishing the services I use made better use of email for notifications or even as a user interface for the thing.
I remember those early systems, and was in touch with Upendra Shardanand and Pattie Maes at the time. Other early systems ca
As music recommendation was already being done, I developed MORSE, short for MOvie Recommendation SystEm, shortly after Ringo appeared. Like Ringo and Firefly, it was a collaborative filtering system, i.e. it worked by comparing how similar your tastes were to the tastes of other users, and took no account of other information (e.g. genre, director, cast). As it was a purely statistical algorithm, I didn't call it, or other collaborative filtering systems, AI. It was different to symbolic AI (which I was previously working on, in Prolog and Common Lisp), didn't use neural networks, and wasn't Nouvelle AI (actually the oldest approach to AI) either. I wrote it in C (it had to run fast and was just processing numbers) and used CGI (Common Gateway Interface) to collect data and give recommendations on the WWW.
In a nutshell, to predict the rating for a film a user hasn't seen yet, it plotted the ratings given by other users for that film against how well their ratings correlated with the user's, found the best-fitting straight line through them, and extrapolated it, estimating the rating of a hypothetical user whose tastes exactly matched the user's. It also calculated the error on this estimate, which it took into account when giving recommendations. Other collaborative filtering systems used simpler algorithms that ignored the ratings of users whose tastes were different. When I used those simpler algorithms on the same data, recommendation accuracy got worse.
MORSE was released on the BT Labs web site in 1995. It survived a few years there, but was later taken off the server. As BT weren't going further with it, I asked if the source code could be released. This was agreed, but it wasn't on any machine, and they couldn't find the backup tape. The algorithm is described in detail here: https://fmjlang.co.uk/morse/morse.pdf and more general information is here: https://fmjlang.co.uk/morse/MORSE.html
My algorithm was pretty similar to yours I’m guessing. (See my other long post here.) I described mine to a friend one time and he called it “toothpick AI”.
I think many AI startup founders and engineers should look more into the past rather than imagine how the future may look.
I think there’s a lot of alpha in classic RFCs.
Interesting. In the early 90s, lots of protein servers, i.e. the predecessors of AlphaFold et al., were also using email as UI.
You'd submit query sequences as an email, and get an email back with predictions.
The input format has not changed, still FASTA.
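For reference, FASTA is just a ">" header line followed by the sequence, e.g. (toy sequence, made up for illustration):

```
>query1
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
```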
I created one of the systems that competed with Ringo / Firefly. It was a great experience, and a long story that I hope to write up fully some day. A short summary:
I had The Similarities Engine up on the very early web for a couple of years. I joined some guys and we created a startup with a good bit of angel financing. It failed, but at the last minute, Firefly bought out the tech and code that we had developed, so our investors got a little back of what they put in.
Now I met my wife when I travelled to Poland to work with the programmers that were developing our product, and she is the best thing that ever happened to me, so I don't have ALL regrets about how the startup failed, but let me tell you about the big regret that I do have.
The code that was written in Poland was never used for anything, as far as I know. The only thing of value that Firefly got out of acquiring us was the patent I wrote on my recommendation algorithm. (Yeah yeah, go ahead and give me a bunch of grief here for applying for, and being granted, a software patent. It was 1997, a different world, for one thing.)
Anyway, long story short, it all comes down to one 20-minute phone call with my CEO, where I had to decide whether to take my patent back and kill the deal where NetAngels would sell to Firefly, or give up my patent in the hope that Firefly stock would be worth something. Firefly didn't succeed and soon after sold to Microsoft for not much. But if I had killed the deal, I really think my then-fiancée and I would not have gotten married, so I think I chose wisely. However...
That patent of mine. It didn't look like much at the time maybe. But what happened is, over time, the Internet came to it. Big time. The patent describes what is basically THE basic collaborative filtering algorithm, and it issued. Microsoft never did anything with it as far as I know, they just kept it in a drawer so nobody could ever use it against them.
So here's the thing. Recently I asked ChatGPT and Claude how much my patent would have been worth, if I had held on to it. If you have regrets in your life about business deals, if you had a few early shares of Google, or some early bitcoin or something but you sold early, let me tell you, I have you beat. I'll let you see the numbers yourself--type this query into your favorite AI (and maybe do me a favor and post a comment here about what numbers it tells you):
A single individual created patent US 5,749,081. He sold it when it had barely issued. How much money do you think he could have made from the patent, if he had held on to it and effectively monetized it, as the early Internet grew to have so many companies using collaborative filtering systems to sell products to users and make money?
"If he had effectively monetized it" is very loose language, especially if it tells you that, had you spent (say) $10M, it could be worth $500M. But you don't monetize individual patents; you build software.
https://patents.google.com/patent/US5749081A/en
I don’t build software, not for many years. And I have monetized individual patents.
Folks, we are just having fun here. Making fun of me. Not for making HN comments that aren't as well written as you think they should be, but for having created, owned, and then let slip through my hands a patent that ChatGPT, Claude, and Gemini say may have been worth a billion dollars. So what if the estimate is off by 100x and I only let $10M slip through my fingers. It is still ridiculous and wild. But seriously, who cares about the language I use to describe the thing.
> I asked ChatGPT and Claude how much my patent would have been worth
LLMs, on their own, can't really math.
Did you try it?
Alice: "So I asked my Magic 8-Ball and it said..."
Bob: "You shouldn't do that. Those don't function in a reliable way."
Alice: "... But have you tried one?"
__________
The fact that you're appealing to the faux-authority of chatbots suggests you didn't do anything to verify the what-if prediction as plausible. If you had, that process would've given you something much more convincing to use.
I can’t come to bed yet hon. Someone on the internet is wrong!
You can find out a little more about The Similarities Engine here, if you are interested:
https://www.whiteis.com/similarities-engine
If I'm understanding correctly, this was not AI in any sense.
This was collaborative filtering. Which is great and useful, but it's just simple statistics. You can write it in a SQL query a few lines long.
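To illustrate how small "simple statistics" can be, here is a toy co-occurrence recommender in Python (made-up data; the claimed SQL version would be a GROUP BY over the same idea):

```python
from collections import Counter

# Hypothetical "users who liked X also liked Y" data.
likes = {
    "alice": {"ABBA", "Aerosmith"},
    "bob":   {"ABBA", "Argent"},
    "carol": {"ABBA", "Aerosmith", "Queen"},
}

def recommend(band):
    """Rank other bands by how often they co-occur with `band`."""
    counts = Counter()
    for bands in likes.values():
        if band in bands:
            counts.update(bands - {band})
    return [b for b, _ in counts.most_common()]

print(recommend("ABBA"))  # Aerosmith ranks first (2 co-occurrences)
```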
Or you can run more robust models using SVD (singular value decomposition) to reduce dimensionality and enable a form of statistical inference. I can't tell if Ringo/Firefly were using that or not. (If you have enough users, and a relatively limited set of objects like musical artists, you don't need this.)
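A hedged sketch of that SVD variant (tiny made-up rating matrix; real systems also have to handle missing ratings, which this glosses over):

```python
import numpy as np

# Hypothetical user x artist rating matrix (rows: users, cols: artists).
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
])

# Keep only the top-k singular components to smooth out noise,
# then use the low-rank reconstruction as the predicted ratings.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```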
But nobody calls these AI -- not now and not at the time. They're very clearly in the realm of statistics.
So the article is fun, but not everything needs to jump on the AI train, c'mon.
that was the worst article I ever read
This might be the optimal approach for running a slow inference model locally, and if we treat LLMs like compilers, it makes sense. Overnight compilation for complex codebases is still the normal thing to do, so what if LLM code generation (about the one task it seems really good at) were run overnight the same way? That is, your workflow would be: look at what the LLM generated the previous night, make a bunch of annotations and suggestions, and at the end of the day submit everything you did to the LLM as an 'overnight generation' task.
Lol "You emailed that you like sci-fi. We bet you'll like Alien, Bladerunner, and Close Encounters of the Third Kind!" Truly mind-blowing tech right there. How did they ever pull it off!