> Since emails are sent from the individual’s email account, they are already verified.
This is not how email works, though.
This.
I wonder if it is a generation gap thing. The young folks these days have probably used only Gmail, Proton or one of these big email services that abstract away all the technical details of sending and receiving emails. Without some visibility into how emails are composed and sent, they might never have learned that email headers are not a definitive source of truth: they are set by the sender and can contain anything.
98% of email users of any generation don't have the first clue how the protocol works.
Eh, nice times, when you could type an email just by telnetting to port 25...
I've certainly sent thousands of emails this way. It was a simpler time.
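For anyone who never did the telnet thing, here is a minimal sketch of the point about headers, using Python's standard `email` package. The addresses and the `build_message` helper are made up for illustration; nothing is sent, the message is only composed.

```python
from email.message import EmailMessage


def build_message(claimed_from: str, to: str, subject: str, body: str) -> EmailMessage:
    """Compose a message; the From header is whatever the caller says it is."""
    msg = EmailMessage()
    msg["From"] = claimed_from  # free text: nothing here proves mailbox ownership
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


# Hypothetical addresses on reserved example domains.
msg = build_message(
    "ceo@example.com",
    "someone@example.org",
    "Hello",
    "Nothing checked this From line when the message was composed.",
)
print(msg["From"])
```

Whether such a message is accepted is up to the receiving side (SPF, DKIM, DMARC); composing it requires no permission at all.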
+1. Even if they validate DKIM/SPF plus alignment (aka DMARC), that would only verify the domain. The receiver has no way to verify the local part; the sending server needs to be trusted to do proper auth.
How is it not? For all but some old and insecure or fairly exotic setups, DKIM/DMARC validates the sender server is authorised for that domain and the server's account-based outbound filtering validates it was sent by the owner of that mailbox.
If the sending server doesn't do DKIM, it's fundamentally broken, move your email somewhere else. If the sending server lets any user send with an arbitrary local part, that's either intended and desired, or also fundamentally broken. If there are other senders registered on the domain with valid DKIM and you can't trust them, you have bigger problems.
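To make the "domain only" part concrete, here is a rough sketch of a DMARC-style alignment check between the From address and the DKIM `d=` domain. It assumes the DKIM signature itself has already been verified elsewhere, and `org_domain` is a naive stand-in (last two labels) for a real Public Suffix List lookup; both function names are hypothetical.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    Assumption for illustration only; real implementations consult the
    Public Suffix List (this gets e.g. example.co.uk wrong).
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def dkim_aligned(from_addr: str, dkim_d: str, strict: bool = False) -> bool:
    """Return True if the From domain aligns with the DKIM signing domain."""
    # Only the part after the last "@" is used; the local part is ignored.
    from_domain = from_addr.rpartition("@")[2].lower()
    signing_domain = dkim_d.lower()
    if strict:
        return from_domain == signing_domain
    return org_domain(from_domain) == org_domain(signing_domain)


print(dkim_aligned("alice@example.com", "example.com"))
print(dkim_aligned("bob@example.com", "example.com"))
```

Note that `alice` and `bob` get the same answer: the receiver's check never looks at the local part, so who may send as which mailbox is entirely the sending server's policy.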
> If the sending server doesn't do DKIM, it's fundamentally broken,
No, it just won't get very good deliverability, because everything it talks to is now fundamentally broken.
DKIM shouldn't exist. It was a bad idea from day one.
It adds very little real anti-spam value over SPF, but the worse part is exactly the model you describe. DKIM was a largely undiscussed, back-door change to the attributability and repudiability of email, and at the same time the two-tiered model it created is far, far less effective or usable than just end-to-end signing messages at the MUA.
DKIM isn't an antispam measure, it's an anti-impersonation measure. With DKIM, you can't impersonate a domain, which means you can trust that any email you get from an email provider was sent in accordance with that provider's security policy. In most cases, that policy is "one user owns one localpart and they can only send from it if they have their password". In cases where it's not, this is intentional and known by their users.
If you as a user can't trust your email server, you've already lost, no matter if something is authorized by an outbound email or a click on an inbound link. If your mail server is evil or hacked, it can steal your OTP token or activation link just as easily as it can send an email in your name.
Yes, end to end authentication is definitely better, but this isn't what people are discussing here. With enforced DKIM, "send me an email" has a nearly identical security profile to "I've emailed you a link, click on it". Both are inferior to end-to-end crypto.
It’s wild when I read a professional looking website like this and Conscious Digital misspells their own org name as “Consious Digital” in the first paragraph. I’m glad they’re fighting against email spam but it just raises all sorts of red flags in my mind, or at least it used to.
Funnily enough, these days it indicates the article was written by a human. I had a dev join my team who made a few typos, and it gave me a chuckle, as it’s a whole class of mistake I hadn’t seen in a while.
The "required login" pattern is particularly a problem. I seem to have namesakes around the US and UK that use my email address as their own when signing up for various services (mobile phone services, Shopify, Uber, various banks and investment firms, landscaper services, real estate services, home and car insurance, car repair shops, even Silver Daddies!!).
I can't open an issue (to ask the service to remove my email) without logging in to an account I don't have control over.
I don't want to use "forgot my password", because I don't want my IP address to be associated with a login to the account, because in some cases (particularly Shopify), the services were obviously used for fraud.
> I can't open an issue (to ask the service to remove my email) without logging in to an account I don't have control over.
> I don't want to use "forgot my password", because I don't want my IP address to be associated with a login to the account
As a fellow victim of worldwide technically-illiterate namesakes, I used to do this using the Tor Browser until I got a paid VPN service, which is what I use now. Out of sheer paranoia, I always use a secondary browser profile with an extension that fakes the User-Agent.
I was pretty early to Gmail: I paid $5 for an invite to the beta and secured my first(.)last@gmail.com. But now I pay for my own domain and my own hosted email just to avoid any collisions.
In the U.S., requiring a login (or any information other than your email address) to opt out is against the law. Additionally, you cannot require any steps other than "sending a reply electronic mail message or visiting a single Internet Web page."
I once wrote to the FTC for guidance as to whether or not this included requiring unsubscribers to solve a CAPTCHA, disable adblockers, or enable JavaScript, but did not get a response. I believe the law is plain with regard to this, but a lot of companies seem to be willing to risk it.
See: https://www.ecfr.gov/current/title-16/chapter-I/subchapter-C...
Archive link:
https://web.archive.org/web/20251009081648/https://conscious...
That wasn't working for me, but this one was: https://archive.ph/QCMjJ
That site doesn't seem to support pages loading either.
edit: I feel their pain - I've spent the past week fighting AI scrapers on multiple sites hitting routes that somehow bypass Cloudflare's cache. Thousands of requests per minute, often to URLs that have never even existed. Baidu and OpenAI, I'm looking at you.
Are they hitting non-existent pages? I had IP addresses scanning my personal server, including hitting pages that don't exist. I had fail2ban running already, so I just turned on the nginx filters (and had to modify the regexes a bit to get them working). I turned on the recidive jail too. It's been working great.
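For readers who haven't used it, the general idea behind that kind of filter can be sketched in a few lines of Python. This is not fail2ban's code or configuration, just an illustration of "too many 404s from one IP inside a time window means a ban"; the class name and thresholds are invented.

```python
from collections import defaultdict, deque


class NotFoundTracker:
    """Flags an IP once it produces too many 404s inside a time window."""

    def __init__(self, max_hits: int = 5, window_seconds: float = 60.0):
        self.max_hits = max_hits          # hypothetical threshold
        self.window = window_seconds      # hypothetical window length
        self.hits = defaultdict(deque)    # ip -> timestamps of recent 404s
        self.banned = set()

    def record(self, ip: str, status: int, now: float) -> bool:
        """Record one response; return True if the IP is banned afterwards."""
        if status == 404 and ip not in self.banned:
            recent = self.hits[ip]
            recent.append(now)
            # Drop 404s that have fallen out of the window.
            while recent and now - recent[0] > self.window:
                recent.popleft()
            if len(recent) >= self.max_hits:
                self.banned.add(ip)
        return ip in self.banned


tracker = NotFoundTracker(max_hits=3, window_seconds=10.0)
for t in (0.0, 1.0, 2.0):
    banned = tracker.record("198.51.100.9", 404, t)
print(banned)
```

A real setup would read timestamps from the access log and expire bans after a while; a recidive-style rule would add a longer ban for IPs that keep getting banned.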
There is currently some AI scraper that uses residential IP addresses and a variety of techniques to conceal itself that likes downloading Swagger generated docs over… and over… and over.
Plus hitting the endpoints for authentication that return 403 over and over.
My N100 mini PC can serve over 20k requests per second with nginx (well, it could, if not for the gigabit NIC limiting it). Actually, IIRC it can (again, modulo uplink) do more like 40k rps for 404s or 304s.
Might be worth checking if they are appending random query strings to force cache misses. Usually you can normalize the request at the edge to strip those out and protect the origin.
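A rough sketch of that normalization, in plain Python rather than any particular CDN's rule language: keep only the query parameters the site actually uses and sort them, so that cache-busting junk collapses onto one cache key. The allow-list and the `normalize_url` name are assumptions for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list: the only parameters this site's pages depend on.
ALLOWED_PARAMS = {"page", "q"}


def normalize_url(url: str) -> str:
    """Drop unknown query parameters and sort the rest for a stable cache key."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in ALLOWED_PARAMS
    )
    # The fragment is discarded; it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(normalize_url("https://example.com/docs?cb=8f3a1&page=2"))
print(normalize_url("https://example.com/docs?page=2&zz=77191"))
```

Both calls produce the same URL, so the second request can be answered from cache instead of reaching the origin.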
> often to URLs that have never even existed
Oh you're so deterministic.
IP blocking Asia took my abusive scans down 95%.
I also do not have a robots.txt, so Google doesn't index.
Got some scanners that left a message about how to index or deindex, but that was like 3 lines total in my log (that's not abusive).
But yeah, blocking the whole of Asia stopped soooo much of the net-shit.
> I also do not have a robots.txt so google doesnt index.
That doesn't sound right. I don't have a robots.txt either, but Google indexes everything for me.
https://news.ycombinator.com/item?id=46681454
I think this is a recent change.
All the comments there seem to suggest that there has been no change and that robots.txt isn't required.
How did you block Asia? Cloudflare or something else?
You can block at your gateway/router. Lots of places have country IP ranges[1], and there are even more or less frequently updated lists of 'malicious' IP ranges[2]. Some gateway providers include 'block by country' and/or 'download blocklists automatically' as a feature.
[1] e.g. https://github.com/ipverse/geo-ip-blocks
[2] e.g. https://github.com/bitwire-it/ipblocklist
You can download weekly IP blocks of regions.
I import them into iptables and wholesale block them all.
I don't deal with eastdakota's pile of shit.
Why are "thousands" of requests noticeable in any way? Webservers are so powerful nowadays.
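If you want to sanity-check a downloaded range list before feeding it to a firewall, something like the following works with just the standard library. It is a sketch that assumes one CIDR per line with optional `#` comments; the ranges shown are reserved documentation blocks, not a real country list, and the helper names are made up.

```python
import ipaddress


def load_networks(lines):
    """Parse one CIDR per line, ignoring blank lines and '#' comments."""
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(ip: str, networks) -> bool:
    """Return True if the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


# Placeholder list using documentation ranges (assumed file format).
sample = [
    "# example blocklist",
    "203.0.113.0/24",
    "198.51.100.0/24  # second range",
    "2001:db8::/32",
]
nets = load_networks(sample)
print(is_blocked("203.0.113.7", nets))
print(is_blocked("192.0.2.1", nets))
```

For large lists the linear scan gets slow; the firewall itself (for example an ipset or nftables set) is the better place for the actual matching.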
Small, cheap VPSs that are ideal for running a small niche-interest blog or forum will easily fall over if they suddenly get thousands of requests in a short time.
Look at how many sites still get "HN hugged" (formerly known as "slashdotted").
I remember my first project posted to HN was hosted on a router with 32MB of RAM and a puny MIPS CPU; despite hitting the front page, it did not crash.
At this point, I have to assume that most software is too inefficient to be exposed to the Internet, and that becomes obvious with any real load.
While true, it's also true that it was (presumably) able to run and serve its intended audience until the scrapers came along.
It's not just one scraper.
So, they're trying to be an online privacy service for users, but they require that companies work the way THEY want the companies to operate. This is not a serious organization I need to care about as a user or a service provider. They're just setting themselves up for failure by requiring the world around them to change.
Their detailed explanation of compliance issues in the space is interesting and enlightening.
You know what? Fuck what "companies" want.
If you get a clear notice that a user wants you to delete something, you act on it. It doesn't matter if it was sent by carrier pigeon. Can't automate it? Tough doo-doo. Interferes with your business model? Change your model or close.
You are 100% entitled to feel that way, but if they have a process that automatically deletes all of your data for you and you don't want to use it, don't complain.
https://archive.ph/QCMjJ if it helps
The irony of a site about AI opt-outs getting hammered by AI scrapers is almost too on the nose.
trollbridge's point about scrapers using residential IPs and targeting authentication endpoints matches what we've seen. The scrapers have gotten sophisticated. They're not just crawling, they're probing.
The economics are broken. Running a small site used to cost almost nothing. Now you need to either pay for CDN/protection or spend time playing whack-a-mole with bad actors.
ronsor hosting a front-page HN project on 32MB RAM is impressive and also highlights how much bloat we've normalized. The scraper problem is real, but so is the software efficiency problem.