It's generally good advice, but I don't see that Safe Browsing did anything wrong in this case. First, it sounds like they actually were briefly hosting phishing sites:
> All sites on statichost.eu get a SITE-NAME.statichost.eu domain, and during the weekend there was an influx of phishing sites.
Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.
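To make the mechanics concrete, here's a minimal sketch (my own illustration, not anything from the article) of how PSL-aware tooling decides what the "registrable domain" is, using the third-party tldextract Python package; the hostnames are only examples:

    # Rough sketch, assuming `pip install tldextract`; tldextract ships with a
    # snapshot of the public suffix list and uses it to split hostnames.
    import tldextract

    # Include the PSL's "private" section, where entries like github.io live.
    extract = tldextract.TLDExtract(include_psl_private_domains=True)

    # A domain that is not on the PSL: every customer subdomain collapses to
    # the same registrable domain, so they all share one reputation.
    print(extract("phish.statichost.eu").registered_domain)
    # -> statichost.eu

    # github.io is on the PSL, so each user subdomain counts as its own site.
    print(extract("some-user.github.io").registered_domain)
    # -> some-user.github.io

Reputation systems like Safe Browsing work at roughly that granularity, which is why a PSL entry (or a separate domain) limits the blast radius.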
From my reading, Safe Browsing did its job correctly in this case, and they restored the site quickly once the threat was removed.
I'm not saying that Google or Safe Browsing in particular did anything wrong per se. My point is primarily that Google has too much power over the internet. I know that what actually happened in this case comes down to me not putting enough effort into fending off bad guys.
The new separate domain is pending inclusion in the PSL, yes.
Edit: the "effort" I'm talking about above refers to more real-time moderation of content.
> My point is primarily that Google has too much power over the internet.
That is probably true, but in this case I think most people would think that they used that power for good.
It was inconvenient for you and the legitimate parts of what was hosted on your domain, but it was blocking genuine phishing content that was also hosted on your domain.
Any employee worth their salary in this area would have told the site's operator about this beforehand, which could have avoided this incident. Hell, even ChatGPT could tell you that by now. The word that comes to mind is incompetence on someone's part, though I don't know the details of who exactly was the incompetent one in this situation. Thankfully, they've learned a lesson and ideally won't make the same mistake again going forward.
I disagree, as a professional in this field for over a decade.
For this to be a legitimately backed statement, professionals would have needed to know about the PSL. That condition is largely unmet.
For it to be met, there would need to be documentation in the form of RFCs and whitepapers from industry working groups. That didn't happen.
M3AAWG only has two blog post mentions, and those came only after the great layoffs of 2023, and they say only that it's being used by volunteers and needs support. No discussion of the organization, what it's being used for, process/due process, etc.
The PSL wholly lacks the outreach to professionals needed for such a statement to be true.
I mean, it's a very big field, and it's easy enough for me to armchair quarterback and call it a skill issue without being vulnerable and putting my own credentials into question. There's a whole big world of things to know about making and running websites, and I'll readily admit I don't know everything. I don't do a lot of CSS or website SEO or run ad campaigns, so someone experienced there will run circles around me.
Putting user generated content on its own domain is more on the security side of things to know about running a website, and our industry doesn't regulate who's allowed to build websites. Everyone's got their own set of different best practices.
Regardless of the exact date that GitHub moved which kinds of user generated content (UGC) over to which domain/domains, I do expect a curious webdev in 2025 to have used GitHub and, at some point in their web browsing career, to have wondered what's up with stuff coming from e.g. raw.githubusercontent.com enough to ask Google about it. They should have walked away with the idea that GitHub puts UGC on a separate domain intentionally for security reasons, even if they never hear mention of the PSL or how exactly it works and is implemented. The /r/webdev post you'll find links to a GitHub blog post that gives a lot of detail as to why they did that, and it doesn't mention the PSL once.
It's fair to point out the PSL isn't common knowledge. I would agree that it isn't. I don't think it's necessary, however. All it takes is being a user of GitHub and a modicum of curiosity. I expect anyone who calls themselves a webdev in 2025 to be able to explain to me what git and GitHub are and why they're different. They don't need to know where git came from, but I don't think I'm being unreasonable in asking that much. From there, I expect someone to be able to make up an answer in an interview as to why there's a raw.githubusercontent.com and mumble something about security, even if they can't give specific details about cookies and phishing and how that all works.
It's possible I'm being unreasonable here, but I don't think I am. This isn't knowledge that takes attending W3C meetings about web browser standards to have come across. Regardless of whether I am or not, though, everyone who's come across this thread should now know that UGC goes on its own domain, even if they can't give details as to why.
I agree this isn't knowledge that takes a lot. The problem is these companies don't explain why they do what they do; in fact, a lot of security stuff along these lines has historically been tight-lipped, secrecy-bound stuff. You can wonder, but the answer isn't out there unless you know an insider willing to break a broadly worded NDA (not gonna happen, and some are quite broad).
The idea of segmenting certain types of traffic onto different domains isn't that new. For example, segmenting mail servers into subdomains by marketing or transactional traffic was done as far back as 2010, but it wasn't explained in whitepapers until around 2016 or 2017, by which point there was already irrefutable evidence that reputational systems had been put in place and that their rules damaged people running small email servers, who were illegitimately blocked from delivery for years with no recourse or disclosure, just imposed cost.
Once they published the whitepapers on that, professionals were on board, because the papers specified what they were looking for and how it should function. Basic engineering stuff that people who manage and build these systems need to know in order to interoperate.
These things need professional outreach that standardizes them in some form or another. That's not a one-off blog post, imo, and it must fully specify function, requirements, feedback mechanisms, and expectations of how it's supposed to work; basic engineering stuff.
The PSL is just the same thing all over again. Big Tech silently starts doing something that directly imposes costs on others, and they don't say what they are doing. Then, when it becomes too costly, they try to offload it onto others by calling for support; though if they only go halfway, in a blog post buried in noise, they are only looking for plausible deniability.
The benefit in doing this is in anti-competitive behavior.
Incidentally, while separating subdomains for email servers has been standard practice for a while now, these companies recently changed the reputational weights for things again, and they aren't talking. Now it's the whole domain as a single reputational namespace, not just breakage at the subdomain (bb.aa.com.). No outreach on that as far as I've seen.
There are ways to do things correctly, and then there are ways to do things anti-competitively and coercively. The incentives matched to the outcomes point to which one that happens to be.
How you do something is more important than that you did something in these cases.
If you as a company don't do professional outreach about such changes or standards, and you arbitrarily choose to require something that isn't properly disclosed, punishing everyone that hasn't received disclosure, that in my mind is a fair and reasonable case for either gross negligence (for general intent to prove malice) or tortious interference with third-party companies' businesses.
The question you mentioned about asking in an interview (iirc) was actually asked in an Ignite interview, but it was cut out of the recordings later, and the answer was that they can't talk about what other departments are doing. They may have followed up on that elsewhere, but I never saw anything related to it.
It is critically important to know the reasons why things are structured a certain way or happen; in order to be able to interoperate. This is and has been known and repeated many times since the adoption of OSI & TCP in the 80s/90s with regards to interoperability of systems.
Blindly copying what others do is a recipe for disaster and isn't justifiable in terms of cost, and competent professionals don't roll the dice like that on large projects of that caliber of expense.
This stuff isn't straightforward either. Like knowing where the reputational namespace stops, what the ramp-up time (dm/dt) is for volume metrics to warm up a server at each provider, and the objective indicators of going above that arbitrarily designed rate (hint: non-deterministic hidden states). If it takes an insider who knows a month to warm a new server up perfectly without reputational consequences, that's extra cost imposed on the company by that platform (whom you are competing against for email services).
No disclosure means starting over every time trying to guess at what they are doing, and having breakage later when they change things.
> reddit...
A lot of professionals no longer use reddit because it's a bot-filled echo chamber that wastes valuable time.
Moderators there often remove posts regularly for simple disagreement, conflicts of interest, or to remove access to detailed solutions or methodology.
For an example of all that's wrong there, look to the CodingBootCamp subreddit. There's a moderator there who has, in all probability, been using a bot to destroy a competitor's reputation and harass them for years, attacking the owners and execs and going so far as to harass and stalk their children, all while violating the Moderator Code of Conduct. Crazy and toxic stuff.
---
You can't ever meet professional standards if you don't communicate or properly disclose interop requirements when complex systems are involved.
"Google does good thing, therefore Google has too much power over the internet" is not a convincing point to make.
This safety feature saves a nontrivial number of people from life-changing mistakes. Yes, we publishers have to take extra care. Hard to see a negative here.
I respectfully disagree with your premise. In this specific case, yes, "Google does good thing" in a sense. That is not why I'm saying Google has too much power. "Too much" is relative and whether they do good or bad debatable, of course, but it's hard to argue that they don't have a gigantic influence on the whole internet, no? :)
Helping people avoid potentially devastating mistakes is of course a good thing.
What point are you trying to make here? You hosted phishing sites on your primary domain, which was then flagged as unsafe. You chose not to use the tools that would have marked those sites as belonging to individual users, and the system worked as designed.
Where'd you see/hear that? It hasn't been my experience at least - but maybe I've just been lucky or undercounting the sites.
There are required steps to follow but none are "have x users" or "see a lot of spam". It's mostly "follow proper DNS steps and guidelines in the given format" with a little "show you're doing this for the intended reason rather than to circumvent something the PSL is not meant for/for something the public can't get to anyways" (e.g. tricking rate limits, internal only or single user personal sites) added on top.
"Projects that are smaller in scale or are temporary or seasonal in nature will likely be declined. Examples of this might be private-use, sandbox, test, lab, beta, or other exploratory nature changes or requests. It should be expected that despite whatever site or service referred a requestor to seek addition of their domain(s) to the list, projects not serving more then thousands of users are quite likely to be declined."
Maybe the rules have changed, or maybe you were lucky? :)
Is it? Companies like Google coddle users instead of teaching them how to browse smarter and detect phishing for themselves. Google wants people to stay ignorant so they can squeeze them for money instead of phishers.
How does Google get money out of people in that case? As a corporation, Google contributes greatly to the education sector and also profits greatly, so it seems like they're pro-education to me, and are merely making the best of a bad situation. But I'd love to hear how Google extracts money from the people they've protected from phishing schemes in some secret way that I haven't considered. I do happen to have Google stock in my portfolio, though, so maybe that indicts my entire comment for you.
This is a fine mentality when it takes a certain amount of "Internet street smarts" (a term used in the article) to access the internet - at least beyond AOL etc.
But over half of the world has internet access, mostly via Chrome (largely via Android inclusion). At least some frontline protection (that can be turned off) is warranted when you need to cater to at least the millions of people who just started accessing the internet today, and the billions who don't/can't/won't put the effort in to learn those "Internet street smarts".
How does flagging a domain that was actively hosting phishing sites demonstrate that Google has too much power? They do, but this is a terrible example, undermining any point you are trying to make.
The thing about Google is that they regularly get this stuff wrong, and there is no recourse when they do.
I think most people working in tech know the extent to which Google can screw over a business when they make a mistake, but the gravity of the situation becomes much clearer when it actually happens to you.
This time it's a phishing website, but what if the same happens five years down the line because of an unflattering page about a megalomaniac US politician?
Then that would be an example of a system having failed and one that needs to change. Instead, this is an example of a hosting company complaining about the consequences of skipping some of the basic, well-documented safety and security practices that help to isolate domains for all sorts of reasons, from reputation to little things like user cookies.
This article shows an example of this process working as intended though.
The user's site was hosting phishing material. Google showed the site owner what was wrong, provided concrete steps to remedy the situation, and removed the warning within a few hours of being notified that it was resolved.
Google's support sucks in other ways, but this particular example went very smoothly.
There are two aspects to the Internet: the technical and the social.
In the social, there is always someone with most of the power (distributed power is an unstable equilibrium), and it's incumbent upon us, the web developers, to know the current status quo.
Back in the day, if you weren't testing on IE6 you weren't serving a critical mass of your potential users. Nowadays, the nameplates have changed but the same principles hold.
The social side wasn't always dominated by a single power; that only began with the later social networks, not the early ones. And now people are retreating to smaller communities anyway.
Testing on IE6 wasn't the requirement; testing on all browsers was. IE shipped by default on Windows and basically forced its way into the browser conversation with an incomplete browser.
I don't mean social as in social network. I mean that people have always been a key aspect of the technology and how it practically works.
Yes, yes, IE6 shipped by default on Windows. And therefore if you wanted a website that worked, you tested against IE6. Otherwise people would try to use your website, it wouldn't work, and they wouldn't blame the browser, they would blame your website.
Those social aspects introduce a bunch of not necessarily written rules that you just have to know and learn as you develop for the web.
> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged.
NO, Google should be "mindful" (I know companies are not people but w/e) of the power it unfortunately has. Also, Cloudflare. All my homies hate Cloudflare.
... by using the agreed-upon tool to track domains that treat themselves as TLDs for third-party content: the public suffix list. Microsoft Edge and Firefox also use the PSL and their mechanisms for protecting users would be similarly suspicious that attacks originating from statichost.eu were originating from the owners of that domain and not some third-party that happened to independently control foo.statichost.eu.
Getting on the public suffix list is easier said than done [1]. They can simply say no if they feel like it and are making sure to be able to keep said rights as a "project" vs a "business," [2] which has its pros and cons.
> Getting on the public suffix list is easier said than done [1].
Can you elaborate on this? I didn't see anything in either link that would indicate unreasonable challenges. The PSL naturally has a series of validation requirements, but I haven't heard of any undue shenanigans.
Is it great that such vital infrastructure is held together by a ragtag band of unpaid volunteers? No; but that's hardly unique in this space.
> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.
A centralized list, where you have to apply to be included and it's up to someone else to decide whether you will be allowed in? How is this what they went for: "You want to specify some rules around how subdomains should be treated? Sure, name EVERY domain that this applies to."
Why not just something like https://example.com/.well-known/suffixes.dat at the main domain or whatever? Regardless of the particulars, this feels like it should have been an RFC and a standard that avoids such centralization.
There was an IETF working group that was working on a more distributed alternative based on a DNS record (so you could make statements in the DNS about common administrative control of subdomains, or lack of such common control, and other related issues). I believe the working group concluded its work without successfully creating a standard for this, though.
Yes, it's generally good advice to keep user content on a separate domain.
That said, there are a number of IT professionals that aren't aware of the PSL as these are largely initiatives that didn't exist prior to 2023 and don't get a lot of advertisement, or even a requirement. They largely just started being used silently by big players which itself presents issues.
There are hundreds if not thousands of industry whitepapers, and afaik there are only one or two places it's mentioned in industry working groups, and those were in blog posts, not whitepapers (at M3AAWG). There's no real documentation of the organization, what it's for, and how it should be used in any of the working group whitepapers. Just that it is being used and needs support; not something professionals would pay attention to, imo.
> Second, they should be using the public suffix list
This is flawed reasoning as is. It's hard to claim this with a basis when professionals don't know about it, a small subset just arbitrarily started doing it, and it seems more like false justification after the fact for throwing the baby out with the bathwater.
Security is everyone's responsibility, and Google could have narrowly tailored the block to the offending hostnames instead of blocking the top level. They didn't do that, and worse, that behavior could even have been automated in a way that extended the process and gave a notice period to the top-level provider before it started hitting everyone's devices. They also didn't do that, apparently.
Regardless, no single entity should be able to dictate what other people perceive or see arbitrarily from their devices (without a choice; opt-in) but that is what they've designed these systems to do.
Enumerating badness doesn't work. Worse, say the domain names get reassigned to another unrelated customer.
Those people are different people, but they are still blocked, as happens with small mail servers quite often. Who is responsible when someone who hasn't engaged in phishing is arbitrarily punished without due process? Who is to say that Google isn't doing this purposefully to retain their monopolies for services they also provide?
It's a perilous, tortuous path where trust cannot be given, because they've violated that trust in the past and have little credibility, with all net incentives towards their own profit at the expense of others. They are even willing to regularly break the law, and have never been held to account for it (e.g. the Google Maps Wi-Fi wiretapping).
Hanlon's razor was intended as a joke, but there are people who use it literally and inappropriately to deceitfully take advantage of others.
Gross negligence coupled with some form of loss is sufficient for general intent which makes the associated actions malicious/malice.
Throwing out the baby with the bathwater, without telling anyone or giving warning, is gross negligence.
I'm not sure what to tell you. I'm a professional with nearly two decades of experience in this industry, and I don't read any white papers. I read web publications like Smashing Magazine or CSS Tricks, and more specifically authors like Paul Irish, Jake Archibald, Josh Comeau, and Roman Komarov. Developers who talk about the latest features and standards, and best practices to adopt.
The view that professionals in this industry exclusively participate in academic circles runs counter to my experience. Unless you're following the latest AI buzz, most people are not spending their time on arXiv.
The PSL is surely an imperfect solution, but it's solving a problem for the moment. Ideally a more permanent DNS-based solution would be implemented to replace it. Though some system akin to SSL certificates would be necessary to provide an element of third-party trust, as bad actors could otherwise abuse it to segment malicious activity on their own domains.
If you're opposed to Safe Browsing as a whole, both Chromium and Firefox allow you to disable that feature. However, making it an opt-in would essentially turn off an important security feature for billions of users. This would result in a far greater influx of phishing attacks and the spread of malware. I can understand being opposed to such a filter from an idealistic perspective, but practically speaking, it would do far more harm than good.
You seem to have not understood what I said, conflating academia with whitepapers and then constructing the rest on an improper foundation from there.
Whitepapers aren't the sole domain of academia. What we are talking about aren't hosted on Arxiv. We are talking about industry working groups.
The M3AAWG working group and the CA/Browser Forum publish RFCs and whitepapers that professionals in this area read regularly.
There's been insufficient or no professional outreach about the PSL. You can't just do things at large players without disclosure for interop, because you harm others by doing so, neglecting the fallout that the lack of disclosure imposes on everyone else impacted within your sphere of influence, which, for a company running the second most popular browser, is global.
When you do so without first doing certain reasonable and expected things (of any professional organization), you are being grossly negligent. That is sufficient to prove general intent for malice in many cases; a reasonable person in such circumstances should have known better.
This paves the way for proving tortious or vexatious interference with a contract, which is a tort and punishable by law when brought against the entity.
> The PSL is surely an imperfect solution, but it's solving a problem for the moment.
It is not, because the disclosure needed for interop hasn't happened properly, and in such circumstances that predictably creates a mountain of problems without visibility; a timebomb/poison pill where crisis arises later from the brittle structure, following shock doctrine and utilizing the snowball effect (a common tactic of the corrupt and deceiver alike).
Your entire line of reasoning is critically flawed. You presume trust is important to this, and that such systems require trust, but trust doesn't have anything to do with the reputational metrics that the systems we are talking about use to impose cost. Apples to oranges.
You can't enumerate badness. Lots of professionals know this. Historic reputational blacklists also punish the innocent after the fact when they are not properly disclosed or engineered for due process. A permanent record deprives anyone of using a blacklisted entry after it changes hands from the criminal to some unsuspecting person.
Your reasoning specifically frames a false dichotomy about security. It follows almost identically the same reasoning the Nazis used (ref at the bottom).
No one is arguing that Safe Browsing and other mechanisms are useful as mitigation, but they are temporary solutions that must be disclosed to a detailed level that allows interoperability to become possible.
If you only tell your friends, and impose those draconian costs on everyone else, you are abusing your privileged position of trust for personal gain (a form of corruption), and causing harm on others even if you can't see it.
Chrome does not have an opt-out. You have to re-compile the browser from scratch to turn those subsystems off. Same with Firefox. That is not allowing you to disable that feature since users aren't reasonably expected to be able to recompile their software to change a setting.
There is no idealism/pacifism here. I'm strictly being pragmatic.
You neglect the harm you don't directly see in the costs imposed on businesses. Second-, third-, and n-th-order effects must be considered but have not been (and this consideration must necessarily grow with the scope/scale of impact).
There are a few areas where doing such blind things may directly threaten existential matters (e.g. food production, where failures of logistics lead to shortages, which whipsaw into chaos). It won't happen immediately, and we live in an increasingly brittle but still somewhat resilient society, but it will happen eventually if such harm is adopted and allowed as standard practice; though the method is indirect, the scope starts off large.
If you only look at the small part of the cycle of dynamics that favors the argument you set in motion, ignoring everything else, that is called cherry-picking, also commonly known as the fallacy of isolation.
Practically speaking, that line of reasoning is without foundational support and unsound. It's important to properly discern and reason about things as they actually exist in reality.
Competent professionalism is not an idealistic perspective. The harm naturally comes when one doesn't meet well established professional requirements. When the rule of law fails to hold destructive people to account for their actions; that's a three-alarm fire as a warning sign of impending societal collapse. The harms of which are incalculable.
Ref:
"Of course the people don't want war. But after all, it's the leaders of the country who determine the policy, and it's always a simple matter to drag the people along whether it's a democracy, a fascist dictatorship, or a parliament, or a communist dictatorship. Voice or no voice, the people can always be brought to the bidding of the leaders.
(Your implications follow this part closely): That is easy. All you have to do is tell them they are being attacked, and denounce the pacifists for lack of patriotism, and exposing the country to greater danger."
Putting user content on another domain and adding that domain to the public suffix list is good advice.
So good, in fact, that it should have been known to an infrastructure provider in the first place. There's a lot of vitriol here that is ultimately misplaced away from the author's own ignorance.
The PSL is something you find out about after it goes wrong.
It's a weird thing, to be honest, a Github repo mentioned nowhere in any standards that browsers use to treat some subdomains differently.
Information like this doesn't just manifest itself into your brain once you start hosting stuff, and if I hadn't known about its existence I wouldn't have thought to look for a project like this either. I certainly wouldn't have expected it to be both open for everyone and built into every modern internet-capable computer or anti malware service.
To be fair I’ve been in the space for close to 20 years now, worked on some of the largest sites and this is the first I’m hearing of the public suffix list.
I don't know about LiveJournal, but I don't believe you can host any interactive content on substack (without hacking substack at least). You can't sign up and host a phishing site, for instance.
User-uploaded content (which does pose a risk) is all hosted on substackcdn.com.
The PSL is more for "anyone can host anything in a subdomain of any domain on this list" rather than "this domain contains user-generated content". If you're allowing people to host raw HTML and JS then the PSL is the right place to go, but if you're just offering a user post/comment section feature, you're probably better off getting an early alert if someone has managed to breach your security and hacked your system into hosting phishing.
The public suffix list interferes with cookies. So on a service like LiveJournal, where you want users logged in across all subdomains, it's not an option.
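To see why, here's a hedged sketch (hypothetical hostname, not anything LiveJournal actually does): a shared login needs a cookie scoped to the parent domain, and browsers refuse exactly that for names on the PSL, the same way they refuse Domain=com.

    # Works while example-host.com is an ordinary registrable domain:
    Set-Cookie: session=abc123; Domain=example-host.com; Secure; HttpOnly

    # If example-host.com were added to the public suffix list, browsers
    # would reject the Domain attribute above, leaving only host-only
    # cookies set per subdomain.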
Exactly, this has been documented knowledge for many years now, even decades. Github and other large providers of user-generated content have public-facing documentation on the risks and ways to mitigate them. Any hosting provider that chooses to ignore those practices is putting themselves, and their customers, at risk.
> There's a lot of vitriol here that is ultimately misplaced away from the author's own ignorance.
For what it's worth, this makes it sound like you think the vitriol should be aimed at the author's ignorance rather than the circumstances which led to it, presuming you meant the latter.
I do think the author's ignorance was a bigger problem--both in the sense of he should have known better and also in the sense that the PSL needs to be more discoverable--than anything Google('s automated systems) did.
However, I'm now reflecting on what I said as "be careful what you wish for", because the comments on this HN post have done a complete 180 since I wrote it, to the point of turning into a pile-on in the opposite direction.
> also in the sense that the PSL needs to be more discoverable
Well, this is a problem that caused the author's ignorance but you present it as though it's the other way around. That's primarily what I meant. Not really disagreeing with "should have known better", mostly in the sense that user-generated content is a huge yellow flag.
The good news is, once known, a lesson like this is hard to forget.
The PSL is one of those load-bearing pieces of web infrastructure that is esoteric and thanklessly maintained. Maybe there ought to be a better way, both in the sense of a direct alternative (like DNS), and in the sense of a better security model.
There’s some value in the public suffix list being shared, with mild sanity checking before accepting entries: it maintains a distinction between site (which includes all subdomains) and origin (which doesn’t). Safe Browsing wants to block sites, but if you can designate your domain a public suffix without oversight, you can bypass that so that it will only manage to block your subdomains individually (until they adjust their heuristics to something much more complicated and less reliable than what we have now).
The thing is, for users, having a separate domain wouldn't have made any difference without the PSL. And you cannot get on there before you're big enough - which I'd say is roughly at the same time as you start grabbing the attention of scammers.
Well, you're responding to him, so questions or suggestions are probably better than speculation.
My comment about vitriol was more directed at the HN commenters than Eric himself. Really, I think a discussion about web infrastructure is more interesting than a hatefest on Google. Thankfully, the balance seems to have shifted since I posted my top-level comment.
> Well, you're responding to him, so questions or suggestions are probably better than speculation.
I suspect the author is unaware of their other blindspots. It's not 2001 anymore. Holding yourself out as a hosting provider comes with some baseline expectations.
Since there's a lot of discussion about the Public Suffix list, let me point out that it's not just a webform where you can add any domain. There's a whole approval process where one very important criterion is that the domain to be added has a large enough user base. When you have a large enough user base, you generally have scammers as well. That's what happened here.
It basically goes: growing user base -> growing amount of malicious content -> ability to submit domain to PSL. In that order, more or less.
In terms of security, for me, there's no issue with being on the same domain as my users. My cookies are scoped to my own subdomain, and HTTPS only. For me, being blocked was the only problem, one that I can honestly admit was way bigger than I thought.
What sort of size would be needed to get on there?
My open source project has some daily users, but not thousands. Plenty to attract malicious content, though I think a lot of people are sending it to themselves (like onto a malware analysis VM that is firewalled off, so they look for a public website to do the transfer), and even then the content is only on the site for a few hours. After >10 years of hosting this, someone seems to have fed a page into a virus scanner and now I'm getting blocks left and right with no end in sight. I'd be happy to give every user a unique subdomain instead of short links on the main domain, and then put the root on the PSL, if that's what solves this.
Based on what I've seen, there's no way to get that project into the PSL. I would recommend you to have the content available at projectcontent.com if the main site is project.com, though. :)
As a CISO I am happy with many of the protections that Google creates. They are in a unique position, and probably the only ones to be able to do it.
However, I think the issue is that with great power comes great responsibility.
They are better than most organisations, and working with many constraints that we cannot always imagine.
But several times a week we get a false "this mail is phishing" incident, where a mail from a customer or prospect is put in "Spam" with a red security banner saying it contains "dangerous links". Generally it is caused by domain reputation issues that block all mail that uses an e-mail scanning product. These products wrap URLs so they can scan them when the mail is read, and thus when they fail to detect a virus they become de facto purveyors of viruses, and their entire domain is tagged as dangerous.
I raised this with Google in May (!) and have been exchanging mail on a nearly daily basis: pointing out a new security product that has been blacklisted, explaining the situation to a new agent, etc.
Not only does this mean that they are training our staff that security warnings are generally false, but it means we are missing important mail from prospects and customers. Our customers are generally huge corporations, missing a mail for us is not like missing one mail for a B2C outfit.
So far the issue is not resolved (we are in Oct now!) and recently they have stopped responding. I appreciate our organisation is not the US Government, but still, we pay upwards of 20K$ / year for "Google Workspace Enterprise" accounts. I guess I was expecting something more.
If someone within Google reads this: you need to fix this.
I'm old. I've been doing security for a very long time. Started back in the 1990s. Here's what I have learned over the last 30 years...
Half (or more) of security alerts/warnings are false positives. Whether it's the vulnerability scanner complaining about some non-existent issue (based on the version of Apache alone... which was backported by the package maintainer), or an AI report generated by interns at Deloitte fresh out of college, or someone reporting www.example.com to Google Safe Browsing as malicious, etc. At least half of the things they report are wrong.
You sort of have to have a clue (technically) and know what you are doing to weed through all the bullshit. Tools that block access based on these things do more harm than good.
What this post might be missing is that it’s not just Google that can block your website. A whole variety of actors can, and any service that can host user-generated content, not just html (a single image is enough), is at risk, but really, any service is at risk. I’ve had to deal with many such cases: ISPs mistakenly blocking large IP prefixes, DPI software killing the traffic, random antivirus software blocking your JS chunk because of a hash collision, even small single-town ISPs sinkholing your domain because of auto-reports, and many more.
In the author’s case, he was at least able to reproduce the issues. In many cases, though, the problem is scoped to a small geographic region, but for large internet services, even small towns still mean thousands of people reaching out to support while the issue can’t be seen on the overall traffic graph.
The easiest steps you can take to be able to react to those issues are:
1. Set up NEL logging [1] that goes to completely separate infrastructure (see the header sketch just after this list),
2. Use RIPE Atlas and similar services in the hope of reproducing the issue and grabbing a traceroute.
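For what it's worth, a minimal sketch of the response headers involved (the collector hostname is made up; the point is that it lives on infrastructure that shares nothing with the monitored site):

    Report-To: {"group": "network-errors", "max_age": 2592000, "endpoints": [{"url": "https://nel-collector.example.net/reports"}]}
    NEL: {"report_to": "network-errors", "max_age": 2592000, "include_subdomains": true}

Browsers that support NEL will then POST connection failures (DNS, TCP, TLS, HTTP errors) to that endpoint even when your main site is unreachable.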
I’ve even attempted to create a hosted service for collecting NEL logs, but it seemed to be far too niche.
I don't see how a separate domain would solve the main issue here. If something on that separate domain was flagged, it would still affect all user content on that domain. If your business is about serving such user content, the main service of your business would be down, even though your main domain would still be up.
You are right, it would still affect all users. Until the pending PSL inclusion is complete, that is. But it now separates my own resources, such as the website and dashboard of statichost.eu from that.
A separate domain may not prevent users' content from being blocked, but it may prevent blocking of the administrative interfaces, which would help affected customers get to their content, and the service could more easily put up a banner advising users of the situation, etc.
I second this for personal sites. Having run forums and chan sites without a CDN, I found that not only is this true, it is 100% automated. The timing of the emails to VPS providers/registrars matches the times their scripts would crawl my sites, submit illicit content, screenshot it, and automatically submit the screenshots to the VPS/server/registrar providers. That was incentive enough for me to take my sites private / semi-private. I would move them to .onion nodes but that's just too slow for me. I have my own theories as to what groups are running these scripts to push people to CDNs, but no smoking gun.
Corporations are a little safer. They have mutually binding contracts with multiple internet service providers and dedicated circuits. They have binding contracts with DNS registrars. Having been on the receiving end of abuse@ they notify over phone and email giving plenty of time to figure out what is going on. I've never seen corporate circuits get nuked for such shenanigans.
Any services successfully offloading UGC to other moderated platforms? E.g. developer tools relying on GitHub instead of storing source/assets in the service itself, and Microsoft can take care of most moderation needs. But are there consumer apps that do things like this?
But something has definitely changed over the past few years. Back in the days, it felt completely normal for individuals to spin up and run their own forums. Small communities built and maintained by regular people. How many new truly independent, individual-run forums can you name today? Hardly any. Instead we keep hearing about long-time community sites shutting down because individuals can no longer handle the risks of user content. I've seen so many posts right here on HN announcing those kinds of shutdowns.
I feel like, yes, forums are being closed because they have migrated to the likes of Discord.
I have mixed opinions about Discord and, if I can be honest, I have mixed opinions about forums as well.
My opinion is to take things like forums and transfer them over to things like XMPP/(IRC?)/(Signal?)/(Matrix, most preferred).
There are bridges as well for Matrix <-> IRC if that's something that interests you; there are bridges for everything, but I prefer Matrix with Cinny, and I generally think that due to its decentralized nature it might be better than centralized forums.
>How many new truly independent, individual-run forums can you name today?
Almost none, but it's due to a lot of complicated factors and not just the direct risk of user content.
Take moderation of content that won't get you banned by your ISP. It sucks. Nobody in their right mind would want to do it. There are countless bots and trolls that are going to flood your forums for whatever cause they champion.
Then there are DDoS floods because you pissed off said bots and trolls. This can make the forums unaffordable and piss off your ISP.
But even if nothing goes wrong, popularity is a risk in itself. In the past there was stuff like the Slashdot effect where your site would go down for a while. But now if your small site became popular on tiktok for some reason 20 million people could show up. Even if your site can stand up to that, how will you moderate it? How will you pay for the bandwidth?
Oh, and will you get any advertisers because of said user content? How are you going to pay for the site?
Oh, also you're competing with massive sites for eyeballs, how are you going to get actual users?
Is it consolidation of services? Waaaaay back in the day, imageboards like 4chan were "one complaint away from being shut down" but 24-hours later they'd be up again on another rag-tag hosting provider. Nowadays it's like one complaint to cloudflare or AWS and the site is dead dead.
You're equally just one fake report to an automated system away from having your account shut down. So, yes, your actions have consequences, but more worrying to me is the ability of someone with a grudge to cause consequences for you as well.
You say "we" like it is the population of internet users. They have no choice in this other than to use whatever sites are available. It is the megaEvilCorps that are doing it to us. They start with a novel idea that is rewarded by lots of users. They then decide to weaponize their site against us to become money printing machines. They then use that money to buy up any competition which artificially limits the end user's choices. WE aren't doing shit to ourselves. Profit seeking megaEvilCorps are doing it to us.
On top of the megaEvilCorps are evilScammyHackers that have made the internet a dangerous place. So entrepreneurial minded folks came up with some cool things to help protect users and site owners from these evilScammyHackers. Problem is, it takes scaled services to do it which again takes money which naturally limits those that are able to provide those services. This is again not we doing anything to ourselves.
If you mean we as a species, then sure, but that's a really stretched definition of we.
Not sure who changed the HN headline, but I appreciate the change. Especially since the concept in the headline is buried at the bottom of the post.
Post author is throwing a lot of sand at Google for a process that has (a) been around for, what, over a decade now and (b) works. The fact of the matter is this hosting provider was too open, several users of the provider used it to put up content intended to attack users, and as far as Google (or anyone else on the web is concerned) the TLD is where the buck stops for that kind of behavior. This is one of the reasons why you host user-generated content off your TLD, and several providers have gotten the memo; it is unfortunate statichost.eu had not yet.
I'm sorry this domain admin had to learn an industry lesson the hard way, but at least they won't forget it.
Author here. I understand that my post and what I'm trying to say is unclear. And that there are too many different aspects to all this.
What I'm trying to say in the post specifically about Google is that I personally think that they have too much power. They can and will shut down a whole domain for four billion users. That is too much power no matter the intentions, in my opinion. I can agree that the intentions are good and that the net effect is positive on the whole, though.
On the "different aspects" side of things, I'm not sure I agree with the _works_ claim you make. I guess it depends on what your definition of works is, but having a blacklist as you tool to fight bad guys is not something that works very well in my opinion. Yes, specifically my own assets would not have been impacted, had I used a separate domain earlier. But the point still stands.
The fact that it took so long to move user content off the main domain is of course on me. I'm taking some heat here for saying this is more important than one (including me) might think. But nonetheless, let it be a lesson for those of you out there who think that moving that forum / upload functionality / wiki / CMS to its own domain (not subdomain) can be done tomorrow instead of today.
It can happen to anyone and cause a reputational risk. Once upon a time $workplace had a Zoho Form that would be blacklisted by Google Safe Browsing or Microsoft Edge for arbitrary periods of time, presumably because someone used Zoho to make a phishing site, leading to some very confused calls.
Sounds like a very convenient mistake to make against your competitors? It does not sound believable that they would not know what Zoho was, or not know that it makes no sense to flag the whole Zoho domain.
In GitHub's case, I think it was also because a lot of security boundaries were on the top-level domain, which let x.github.com potentially grab cookies of y.github.com or, worse, github.com itself.
Don't forget the `githubusercontent.com` domain, which is specifically used to host risky, user-generated content, and fully documented in https://docs.github.com/en/authentication/keeping-your-accou... (using an open source component that other companies could also use, if they were interested in similar levels of security)
I am a solo developer. I recently created a new web app for a client. Google has marked it as phishing, so they can't use it. Obviously I can't do anything about it except report the error and wait. I'm worried that if I move it to a new domain, that one will get marked as well. Not sure what to do TBH.
No, however it does include a Microsoft Entra / Azure AD / Microsoft 365 login for that client's tenant. It is also a newly registered domain, so I can understand why it looks suspicious. The most frustrating thing is that this is all a machine, i.e. no one I can speak to, nothing I can do to fix it. My fate has been decided by an algorithm.
Google has some sort of internal flag for determining that origins are different on some platforms. We don't get a complete takedown of Neocities every time there's a spam site reported. It is likely that they were not on that list, but perhaps they have been manually added to whatever that internal list is at this point.
The public suffix list (https://publicsuffix.org/) is good and if I were to start from scratch I would do it that way (with a different root domain) but it's not absolutely required, the search engines can and do make exceptions that don't just exclusively use the PSL, but you'll hit a few bumps in the road before that gets established.
Ultimately Google needs to have a search engine that isn't full of crap, so moving user content to a root domain on the PSL that is infested with phishing attacks isn't going to save you. You need to do prolific and active moderation to root out this activity or you'll just be right back on their shit list. Google could certainly improve this process by providing better tooling (a Safe Browsing report/response API would be extremely helpful) but ultimately the burden is on platforms to weed out malicious activity and prevent it from happening, and it's a 24/7 job.
BTW the PSL is a great example of the XKCD "one critical person doing thankless unpaid work" comic, unless that has changed in recent years. I am a strong advocate of having the PSL management become an annual fee driven structure (https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...), the maintainer deserves compensation for his work and requiring the fee will allow the many abandoned domains on the list to drop off of it.
If you're not using separate domains then I hope you don't have any kind of sensitive information stored in cookies. You can't rely on the path restrictions for cookies because it's easily bypassed.
Strict cookies crossing root to subdomains would be a major security bug in browsers. It's always been a (valid) theoretical concern but it's never happened on a large scale to the point I've had to address it. There is likely regression testing on all the major browsers that will catch a situation where this happens.
If youtube.com doesn't end up on the Safe Browsing blacklist because of phishing videos, but your own website can easily end up there, it's a pretty clear case of Google abusing their power.
YouTube doesn't allow you to put your credentials into a text box and hit send. Google Sites, on the other hand, does pose a risk, but they'll likely be treated the same as any other domain on the PSL.
In my experience, safe browsing does theoretically allow you to report scams and phishing in terms of user generated content, but it won't apply unless there's an actual interactive web page on the other end of the link.
There is the occasional false positive but many good sites that end up on that list are there because their WordPress plugin got hacked and somewhere on their site they are actually hosting malware.
I've contacted the owners of hacked websites hosting phishing and malware content several times, and most of the time I've been accused of being the actual hacker or I've been told that I'm lying. I've given up trying to be the good guy and report the websites to Google and Microsoft these days to protect the innocent.
Google's lack of transparency about which exact URLs are hosting bad material does play a role there.
YouTube hosts millions of videos telling people that they are the government/your bank and that you should move money/contact a scam center/buy cryptocurrency. Even worse is the fact you can pay to turn these videos into ads that will roll in front of other videos.
On the whole of YouTube, it's a tiny sliver of a percentage, but because YouTube has grown too large to moderate, it's still hosting these videos.
If Google applied the same rules they apply to the safe browsing list, they'd probably get YouTube flagged multiple times a week.
You are right, of course. I'm not sure if those of you who disagree with me think that Safe Browsing did its job (which it did!), that Safe Browsing is a good thing (which it maybe is, but which I slightly disagree with), or that it's ok that Google monitors everything everyone does.
The last point is actually the one I'm trying to make.
There should be a concept, sort of an inverse of tragedy of the commons, for the positive feedback loop of many users providing big data to a company that can use that data to benefit many users.
From spam blocking that builds heuristics fed by the spam people manually flag in Gmail, to Safe Browsing using attacks on users' Chrome as a signal, to their voice recognition engine leapfrogging the industry standard a few years back because they trained it on the low-quality signal from GOOG411 calls, Google keeps building product by harvesting user data... And users keep signing up because the resulting product is good.
This puts a lot of power in their hands but I don't think it's default bad... If it becomes bad, users leave and Google starts to lose their quality signal, so they're heavily incentivized to provide features users want to retain them.
This does make it hard to compete with them. In the US at least, antitrust enforcement has generally been about user harm, not market harm. If a company has de facto control but customers aren't getting screwed, that's fine, because ultimately the customer matters (and nobody else is owed a shot at being a Google).
There is. It is called a Ponzi scheme, and it's illegal, but in most cases it has become indirect enough in its consequences, without proper guardrails/accountability, that it's now allowed by most publicly traded businesses today (through clever deception).
Generally, it involves three phases:
1st: Front-loaded benefits in CapEx funding meeting customer/investor expectations regardless of cost.
2nd: Inflection point of momentum where CapEx falls off, a brief period where income meets costs.
3rd: Enshittification - momentum/acceleration reverses to the negative, failure of services as the system is continually hollowed out and cost exceeds income.
This is seen in the S-growth or S-adoption curves in business, which started to become visible towards the late 1970s and have progressively increased exponentially thereafter.
Most companies jettison (sell off or merge) or close down services before they hit the 3rd stage, where the service can objectively be seen as unprofitable by associated investors. The ones that don't are state-funded apparatus.
This concept drives almost everything we see today in modern society, and in the market there are parallels and indirect consequences fully described back in the 1950s by Mises with regard to money-printing regardless of its form (i.e. debt that is not reserve backed (Basel III), synthetic shares, paper warrants (Comex), bonds (with the hold-to-maturity reporting loophole), flier miles, credit card rewards, etc).
The structure and its flaws remain foundationally intractable. This is how you profit and grow bigger off destroying the market. Eventually consolidation leaves state apparatus in place of a market.
No market can compete with slave labor, which is what state-funded apparatus use indirectly through money-printing/currency debasement. It's not considered a tax, and it's not given willingly. It's extracted labor.
Those that have lived through these times see the drastic reduction of options in available products, which have naturally sieved down to the point where shortages are now regularly occurring (for those with a discerning eye). There are a lot of moving factors, but these are well-known structures with inevitable trends, at least in certain circles.
In seriousness, the totality of Socio-economic collapse is more probable than a lot of other potential futures, as a result of this. Collapse has happened many times throughout history in relation to money-printing.
Always before, we were not in ecological overshoot for our population, let alone in this state for two full generations. Catton/Malthus paint a grim picture of the outcomes, but no one of action pays attention to these things. It's all largely drowned out by the noise of bots.
It's hard to get that point because you're conflating two different stories.
Folks around here are generally uneasy about tracking in general too, but remove big brother monitoring from Safe Browsing and this story could still be the same: whole domain blacklisted by Google, only due to manual reporting instead.
"Oh, but a human reviewer would've known `*.statichost.eu` isn't managed by us"—not in a lot of cases, not really.
Sure, and sorry for being so unclear. The point of my post was meant to be a) Google has this enormous cannon, is this "right"? And b) they will use it to kill anything bigger than a mosquito.
But you're right, complaining about big tech surveillance didn't help with making that point at all.
> But you're right, complaining about big tech surveillance didn't help with making that point at all.
I disagree. Everyone with a brain is thinking it. It's important to address what your audience may be thinking, especially given the other factors here which I've mentioned in other responses related to gross negligence.
The technical capability exists to narrowly define blacklists, and they chose a grossly negligent route (baby with the bathwater) instead, without providing notice.
Generally because Facebook polices Facebook (imperfectly, but the effort is demonstrated) and the damage radius is limited to Facebook users mostly. As long as the easiest way to avoid damage from the Facebook domain is "Don't use Facebook," the larger Internet doesn't need a mechanism to police it.
If Facebook became a trap that frequently hosted malware to strangers, the rest of the net would begin to interpret it as damage and route around it.
I’ve got a random subdomain hosting a little internal tool. About twice a year, Google Safe Browsing decides it’s phishing and flags it. Sometimes they flag the whole domain for good measure.
Search Console always points to my internal login page, which isn’t public and definitely isn’t phishing.
They clear it quickly when I appeal, and since it’s just for me, I’ve mostly stopped worrying about it.
I encountered something similar. I have `*.domain.tld` pointed to an internal IP address, and over the past few years it happened a few times where some subdomain would be flagged as dangerous by Google Safe Browsing.
Internal IP addresses in public DNS are sometimes used to do things like DNS rebind attacks. It's possible that's tripping up their detection mechanism.
My workaround is to use an IPv6 ULA for my publicly hosted private IP addresses, which is extremely unlikely to ever be reused by a bad actor.
Many phishing attacks originate from Google's own domains: Gmail users phishing others, scammy YouTube videos, scammy comments with links to scams, scammy ads to fake banking pages, etc. But Google would never be hypocritical, never!!
That is a great point. When I see these sites I'm always seeing a dozen red flags, and maybe the biggest one is that it's showing a "NatWest" banking site or something and is hosted on "portal-abc.statichost.eu". But the whole point is of course saving users from coming to harm, and if it did - great!
As a result, some ISPs apparently block the domain. Why is it listed? I have no idea. There are no ads, there is no user content, and I've never sent any email from the domain. I've tried contacting spamhaus, but they instantly closed the ticket with a nonsensical response to "contact my IT department" and then blocked further communication. (Oddly enough, my personal blog does not have an IT department.)
Just like it's slowly become quasi-impossible for an individual to host their own email, I fear the same may happen with independent websites.
From reading that, my guess would be that the IP address you got from your hosting provider had some spammy history before you started hosting your blog on it.
Either that or your DNS provider hosts a lot of spam.
> As a result, some ISPs apparently block the domain
This is the infuriating part. I get that someone buying cheap hosting may end up with an IP address that used to send spam, but spam lists are not reliable indicators of website security.
Overzealous security products are a blight on the internet. I'd be less annoyed at them if they weren't so trivial to bypass as a hacker with access to a stolen credit card.
Many commenters are implying that there is a security issue here and that I'm putting everyone in danger. That is, quite frankly, a pretty absurd claim to just casually make. I'm of course very curious to hear more details on what the security risk here would actually be.
Do you think I'm reading/writing sensitive data to/from subdomain-wide cookies?
Also, yes, the PSL is a great tool to mitigate (in practice eliminate) the problem of cross-domain cookies between mutually untrusting parties. But getting on that list is non-trivial and they (voluntary maintainers) even explicitly state that you can forget getting on there before your service is big enough.
I am not implying you're putting "everyone" in danger. I'm merely implying that you're putting your own service in danger by allowing clients to act as what looks like a trusted subdomain, such as controlpanel.statichost.eu, secure.statichost.eu, or Unicode lookalikes of www.
Ok, I see. You mean the possibility of users impersonating statichost.eu itself. That is actually a good point, and the exact reason why user subdomains are required to have a dash in them. Edit: Also, only ASCII is allowed. :)
I guess control-panel.statichost.eu is still possible, of course, but that already seems like a pretty long shot.
Anyone who can upload HTML pages to subdomain.domain.com can read and write cookies for *.domain.com, unless you declare yourself a public suffix and enough time has passed for all the major browsers to have updated themselves.
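To make that concrete, here's a minimal sketch (with hypothetical hostnames) of what any tenant page can do while it shares a registrable domain with everyone else; unless the parent domain is a registered public suffix, the browser accepts a `Domain=` cookie like this and sends it to every sibling subdomain:

```typescript
// Runs on a hostile tenant's page, e.g. https://mallory.example-host.com
// (hypothetical hostnames). Because example-host.com is not a public suffix,
// the browser accepts a cookie scoped to the whole parent domain...
document.cookie = "session=attacker-chosen; Domain=example-host.com; Path=/; Secure";

// ...and will send it to every other *.example-host.com site. A page on
// victim.example-host.com (or the host's own dashboard on that domain) then
// sees the planted value mixed in with its own cookies:
console.log(document.cookie); // "... session=attacker-chosen ..."
```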
I've seen web hosts in the wild who could have their control panel sessions trivially stolen by any customer site. Reported the problem to two different companies. One responded fairly quickly, but the other one took several years to take any action. They eventually moved customers to a separate domain, so the control panel is now safe. But customers can still execute session fixation attacks against one another.
(Author here) This is all true. The main assumption from my part is that anything remotely important or even sensitive should be and is hosted on a domain that is _not_ companysubdomain.domain.com but instead www.company.com.
I don't like nor trust google, but "Use your own judgement and hard-earned Internet street smarts" doesn't work either, because the median internet user does not have anything resembling internet street smarts.
Still not sure why it's legal for Google to slander companies like this. They often have no proof or it's a false positive, meanwhile they're screaming about how malicious you are.
I'm not asking about this specific case. There are plenty of examples of Google wrongly accusing others of being malicious with massive business impact
Seems like a reasonable trade-off; I mean, six hours is not the worst thing in the world. What if you were hosting something mission-critical? Were you?
> To be fair, many or even most sites on the Google Safe Browsing blacklist are probably unworthy. But I’m pretty sure this was not the first false positive.
The bigger issue is that the internet needs governance. And, in the absence of regulation, someone has stepped in and done it in a way that the author didn't like.
Perhaps we could start by requiring that Google provide ways to contact a living, breathing human. (Not an AI bot that they claim is equivalent.)
Why do you assume that the living, breathing human hired by theGoogs will be competent at handling all of the crazy that will be flung at them by the living, breathing human on the other end of the line? One single person cannot handle that. Naturally, you need a team of living, breathing humans. You might even have them in triage-level groups like level 1 support, level 2 support, and so on, where each level is a more trained/experienced living, breathing human. Eventually, you'll have an entire department of people of varying degrees of skill. Oh, wait, I'm sorry, I thought it was the year 2000.
Hopefully, this helps you understand why your living, breathing human is such a farcical idea for theGoogs to consider.
Playing devil's advocate: who else was going to step into that role? Who would have the clout to be trusted? The Googs would want to do something just as a self-protecting action, and that evolved into a self-aggrandizing sense of empowerment, that they might not be the protector we need but the one we deserved.
I don't even know what you're asking, or how that's the question you got from my comment. Clearly, no, I don't think a human and a bot are the same. I'm saying that evilCorp is not going to pay for human support staff in the year 2025 when the company is pushing its AI/LLM chatbot as a major part of who they are. If the chatbot company doesn't use its own chatbot, why would anyone else? Of course they are not going to pay for humans.
How does any of that lead to your asking if I think humans === bots?
That may be, and we certainly don't need anyone explaining Google's position - we already know what that is. Nobody here actually cares what Google wants, we're expressing what we want. Nerds have helped Google enough with free marketing and goodwill, Google's reputation being tarnished can only help us not hurt us.
Honestly, this is extremely basic stuff in hosting, not only due to Safe Browsing, but also—and more importantly—cookie safety, etc. If a hosting provider didn't know (already bad enough) and turns to whining after being hit, then
> Static site hosting you can trust
is more like amateur hour static site hosting you can’t trust. Sorry.
The thing is, you cannot just add any domain to the PSL. You need a significant number of users before they will include your domain. Until recently, there really was no point in even submitting, since the domain would have been rejected as too small. An increase in user base, an increase in malicious content, and the ability to add your domain to the PSL all happen more or less simultaneously.
I'm also trusting my users not to expose their cookies for the whole *.statichost.eu domain. And all "production" sites use a custom domain, which avoids all of this anyway.
There are well-documented solutions to this that don't rely on the PSL. Choosing to ignore all of that advice while hosting user content is a very irresponsible choice, at best.
So the problem here is that Alice on alice.statichost.page might set a cookie for the `.statichost.page` domain if she's careless (which is sometimes the case with Alice). This cookie can then be read by Mallory on mallory.statichost.eu. Or the other way around, if Mallory wants to try to trick Alice into reading his cookie. How this can be prevented without the PSL is something I'm very interested to hear more about.
I have the same issue. Think of my site as WeTransfer, but instead of only files, you can also use it as a link shortener or pastebin. Abuse works the same as on every other site or service: I do spot checks and users can report content. This was fine until uBlock Origin decided the website was malicious, per one of the lists that is default-enabled for everyone.
That list doesn't have a clear way to get off of it. I would be happy to give them the heads up that their users are complaining about a website being broken, but there is no such thing, neither for users nor for me. In looking around, there are many "sources" that allegedly independently decided around the same day that my site needs to not work anymore, so now there's a dozen parties I need to talk to and more popping up the further you look. Netcraft started sending complaints to the registrar (which got the whole domain put on hold), some other list said they sent abuse to the IP space owner (my ISP), public resolvers have started delisting the domain (pretending "there is no such domain" by returning NXDOMAIN), as well as the mentioned adblockers.
There's only one person who hasn't been contacted: the owner. I could actually do something about the abusive content...
It's like the intended path is that users start to complain "your site doesn't work" (works for me, wdym?) and you need to figure out what software they're using, what DNS resolver, what antivirus, what browser, whether a DoH provider is enabled... to find out who it might be that's breaking the site. People don't know how many blocklists they're using, and the blocklists don't give a shit if you're not a brand name they recognize. That's the only difference between my site and a place like GitHub: if I report "github.com hosts malware", nobody thinks "oh, we need to nuke that malicious site asap!"
I'd broaden the submitted post to say that it's not only Google with too much power: these blocklists have no notification mechanism or general recourse method either. It's a whack-a-mole situation which, as an open source site with no profit model (intentionally so), I will never win. Big tech is what wins. I don't know if these lists do a trademark registration check or how they decide who's okay and who's not, but I suspect it's simply a brand name thing and your reviewer needs to know you.
> Luckily, Google provided me with a helpful list of the offending sites
Google is doing better than most others with that feature. Most "intelligence providers", which other blocklists like e.g. Quad9 use, are secretive about why they're listing you, or never even respond at all.
I have recently had the pleasure of speaking with Google senior leadership involved in the Safe Browsing product on the topic of getting my SaaS product placed on their "naughty list." The platform was down for six or so hours due to a false positive hit for phishing.
I have read A LOT of blogs/rants/incidents on social media about startups, small businesses, and individuals getting screwed by large companies in similar capacities. I am VERY sympathetic to those cries into the sky, shaking fists at clouds, knowing very well we are all very small and how the large providers seem to not care. With that in mind, I am not blind to the privilege my organization has to rope in Google to discuss root causes for incidents.
I am writing about it here because I believe most people will never be able to pull a key Google stakeholder into a 40 minute video call to deeply discuss the RCAs. The details of the discussion are probably protected by NDA so I'll be speaking in general terms.
Google has a product called Web Risk (https://cloud.google.com/web-risk/docs/overview), I hear it's mostly used by Google Enterprise customers in regulated verticals and some large social media orgs. Web Risk protects the employees of these enterprise organizations by analyzing URLs for indicators of risk, such as phishing, brand impersonation, etc.
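For readers who haven't seen it, Web Risk's Lookup API is essentially "ask Google whether this URL is on a threat list". A rough sketch of such a check is below; the endpoint, parameters, and threat type names reflect my reading of the public docs and should be treated as assumptions to verify, not a definitive integration:

```typescript
// Hedged sketch of a Web Risk Lookup API call (uris.search). Endpoint, query
// parameters, and threat type names are assumptions based on the public docs;
// verify against current documentation before relying on this.
const apiKey = process.env.WEB_RISK_API_KEY ?? ""; // assumes key-based access
const uri = "https://acme.productname.example.com/login"; // hypothetical branded SSO URL

const params = new URLSearchParams({ uri, key: apiKey });
for (const t of ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"]) {
  params.append("threatTypes", t); // repeated parameter for multiple threat types
}

const resp = await fetch(`https://webrisk.googleapis.com/v1/uris:search?${params}`);
const body = await resp.json();
// An empty response body means "no match"; otherwise body.threat describes the hit.
console.log(body.threat ?? "no match");
```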
My SaaS platform is well established and caters mostly to large enterprise. I provide enterprise customers with optional branded SSO landing pages. Customers can either use sign-in from the branded site (SP-initiated) or redirect from their own internal identity provider to sign-in (IdP-initiated). The SSO site branding is directed by the customer, think along the lines of what Microsoft does for Entra ID branded sign-in pages. Company logo(s), name, visual styling, and other verbiage may be included. The branded/vanity FQDN is (company).productname.mydomain.com.
You may be able to see where I'm headed at this point... Why was my domain blocked? For suspected phishing.
A mutual enterprise customer was subscribed to Google's Web Risk. When their employees navigated to their SSO site, Google scanned it. Numerous heuristics flagged the branded SSO site as phishing and we were blocked by Safe Browsing across all major web browsers (Safari, Chrome, Firefox, Edge, and probably others). Google told us that had our customer put the SSO site on their Web Risk allow-list, we wouldn't have been blocked.
I'm no spring chicken, I cannot rely nor expect a customer to do that, so I pressed for more which led to a lengthy conversation on risk and the seemingly, from my perspective, arbitrary decisions made by a black box without any sort of feedback loop.
I was provided a tiny bit of insight into the heuristic secret sauce, which led to prescribed guidance on what could be done to greatly reduce the risk of getting a false positive phishing flag again. Those specifics I assume I cannot detail here; however, the overall gist of it is domain reputation. Google was unable to positively ascertain my domain's reputation.
My recommendation is for those of you out there in the inter-tubes who have experienced false positive Safe Browsing blocks, think about what you can do to increase your domain's public reputation. Also, get a GCP account so if you do get blocked, you can open a ticket from that portal. I was told it would be escalated to the appropriate team and be actioned on within 10-15 minutes.
Another day, another IT company learning the hard way about the public suffix list, or well-known URIs, or some other well-documented-but-niche security technology.
I love that IT is a field where there's no required formal education track and you can succeed with any learning path, but we definitely need some better way to make sure new devs are learning about some of these gotchas.
The PSA is good, the article is meh. There is too much misdirected anger towards Google here, IMO. I agree it sucks to be the false positive, but it'd suck even more to unknowingly be part of a phishing campaign.
On top of that, it is also recommended to serve user content from another domain for security reasons. It's much easier to avoid entire classes of exploits this way. For the site admins: treat it as a learning experience instead of lashing out on goog. In the long run you'll be better off, having learned a good lesson.
I’m sure I don’t know ALL the "security best practices that have been around for 20+ years" and this is perfectly fine as long as I’m able to react quickly. See also https://xkcd.com/1053/.
It's fine if you personally didn't know that. But if I'm paying for a service, I expect the provider to understand basic security best practices that have been industry standard for 20+ years. And if they don't, they should be hiring people who do.
XKCD 1053 is not a valid excuse for what amounts to negligence in a production service.
Author here. What kind of security negligence are you referring to? What would be a specific attack vector that I left open?
Regarding the PSL - and I can't believe I'm writing this again: you cannot get on there before your service is big enough and "the request authentically merits such widespread inclusion"[1]. So it's kind of a chicken and egg situation.
Regarding the best practice of hosting user content on a separate domain: this has basically two implications:
1. Cookie scope of my own assets (e.g. dashboard), which one should limit in any case and which I'm of course doing. So this is not an issue.
2. Blacklisting, which is what all of this has been about. I did pay the price here. This has nothing to do with security, though.
I'm sorry to be so frank, but you don't know anything about me or my security practices and your claim of negligence is extremely unfounded.
Eric, I think it's appropriate to mention, and I'd like to point out, the lack of any real documentation (at a professional level) related to the PSL in the professional working groups touching on these things (i.e. M3AAWG).
There are only two blog posts at M3AAWG, from 2023, noting that it had been used silently (apparently for years) but was calling for support. I would think that if it were an industry-recognized initiative, it would have the appropriate documents/whitepapers published on it in the industry working group tasked with these things. These people are supposed to be engineers, after all. AFAIK this hasn't happened, aside from a brief after-action with requests for support, which is highly problematic.
When there is no professional outreach (via working group or trade group), it's really hard to say that this isn't just gross negligence on Google's part. M3AAWG has hundreds if not thousands of whitepapers, each hundreds of pages. A single blog post or two that mention it insufficiently won't rationally negate a claim of gross negligence.
Why do I mention gross negligence? When coupled with loss, it is sufficient in many cases to support a finding of 'malice' without specific intent (i.e. general intent), especially when such an entity has little or no credibility, yet that is overshadowed by power/authority that is undeserved. Deceitful people who reasonably should know the consequences will go bad often purposefully structure things toward general intent to avoid legal complications, and the legal system has evolved around that. I am not a lawyer; this paraphrase about gross negligence/general intent/malice did come from a lawyer, but it's not meant or intended for use as legal advice in paraphrase form, so the standard IANAL disclaimer applies. If that is needed, consult a qualified professional for a specific distinction on this.
The company is more than technically capable of narrowly defining blacklists and providing due process and appropriate noticing requirements.
The situation raises questions of tortious interference, and whether the PSL is being used as an anti-competitive moat to keep competitors out of the market by arbitrarily imposing additional costs on them that are asymmetric to the costs such companies have with their own competing services (as an oligopoly/monopoly).
PSA: Submitted title was "PSA: Always use a separate domain for user content". We've changed it per https://news.ycombinator.com/newsguidelines.html. Might be worth knowing for context.
It's fair to point out the PSL isn't common knowledge. I would agree that it isn't. I don't think it's necessary, however. All it takes is being a user of GitHub and a modicum of curiosity. I expect anyone who calls themselves a webdev in 2025 to be able to explain to me what git and GitHub are and why they're different. They don't need to know where git came from, but I don't think I'm being unreasonable in asking that much. From there, I expect someone to be able to make up an answer as to why there's a raw.githubusercontent.com during an interview and mumble something about security, even if they can't give specific details about cookies and phishing and how that all works.
It's possible I'm being unreasonable here, but I don't think I am. This isn't knowledge that takes attending W3C meetings about web browser standards to come across. Regardless of whether I am or not, though, everyone who's come across this thread should now know that UGC goes on its own domain, even if they can't give details as to why.
I agree this isn't knowledge that takes a lot. The problem is these companies don't explain why they do what they do, in fact a lot of security stuff along these lines in the past has been tight-lipped secrecy bound stuff. You can wonder, but the answer isn't out there unless you know an insider willing to break a broadly worded NDA (not gonna happen, and some are quite broad).
The idea of segmenting certain types of traffic to different domains isn't that new. For example, segmenting mail servers into subdomains by marketing or transactional type was done as far back as 2010, but it wasn't explained in whitepapers until around 2016 or 2017, by which point there was already irrefutable evidence that reputational systems had been put in place, and that the undisclosed rules had damaged people running small email servers who were illegitimately blocked from delivery for years with no recourse or disclosure, just imposed cost.
Once they published the whitepapers on that, professionals were on board because they specified what they were looking for, and how it should function. Basic Engineering stuff that people who manage and build these systems need to know to interoperate.
These things need professional outreach that standardizes them in some form or another, which is not a one-off blog post IMO, and that must fully specify function, requirements, feedback mechanisms, and expectations of how it's supposed to work; basic engineering stuff.
The PSL is just the same thing all over again. Big Tech just starts doing something silently that directly imposes cost on others, and they don't say what they are doing. Then, when it becomes too costly, they try to offload it to others by calling for support, though if they only go halfway in a blog post buried in noise, they are only looking for plausible deniability.
The benefit in doing this is in anti-competitive behavior.
Incidentally, while separating subdomains for email servers has been standard practice for a while now, recently these companies once again changed the reputational weights for things, and they aren't talking. Now it's the whole domain as a single reputational namespace, not just breakage at the subdomain (bb.aa.com.). No outreach on that as far as I've seen.
There are ways to do things correctly, and then there are ways to do things anti-competitively and coercively. The incentives matched to the outcomes point to which one that happens to be.
How you do something is more important than that you did something in these cases.
If you as a company don't do professional outreach about such changes or standards, and you arbitrarily choose to require something that isn't properly disclosed, punishing everyone that hasn't received disclosure, then that in my mind is a fair and reasonable case for either gross negligence (as general intent to prove malice) or tortious interference with third-party companies' businesses.
That question which you mentioned about asking in an interview (iirc) was actually asked in an Ignite interview, but was cut out from the recordings later, and the answer was we can't talk about what other departments are doing. They may have followed-up on that elsewhere but I never saw anything related to it.
It is critically important to know the reasons why things are structured a certain way or happen; in order to be able to interoperate. This is and has been known and repeated many times since the adoption of OSI & TCP in the 80s/90s with regards to interoperability of systems.
Blindly copying what others do is a recipe for disaster and isn't justifiable in terms of cost, and competent professionals don't roll the dice like that on large projects of that caliber of expense.
This stuff isn't straightforward either. Like knowing where the reputational namespace stops, what the ramp-up time (dm/dt) is for volume metrics to warm up a server at each provider, and the objective indicators of when you go above that arbitrarily designed rate. (Hint: non-deterministic hidden states.) If it takes a month to perfectly warm up a new server without reputational consequences for an insider who knows, that's extra cost imposed on the company by that platform (whom you are competing against for email services).
No disclosure means starting over every time trying to guess at what they are doing, and having breakage later when they change things.
> reddit...
A lot of professionals no longer use Reddit because it's a bot-filled echo chamber that wastes valuable time.
Moderators there often remove posts regularly for simple disagreement, conflicts of interest, or to remove access to detailed solutions or methodology.
For an example of all that's wrong there, look at the CodingBootCamp subreddit. There's a moderator there who has, in all probability, been using a bot to destroy a competitor's reputation and harass them for years, attacking the owners and execs, and going so far as to harass and stalk their children, all while violating the Moderator Code of Conduct. Crazy and toxic stuff.
You can't ever meet professional standards if you don't communicate or properly disclose interop requirements when complex systems are involved.
"Google does good thing, therefore Google has too much power over the internet" is not a convincing point to make.
This safety feature saves a nontrivial number of people from life-changing mistakes. Yes we publishers have to take extra care. Hard to see a negative here.
I respectfully disagree with your premise. In this specific case, yes, "Google does good thing" in a sense. That is not why I'm saying Google has too much power. "Too much" is relative and whether they do good or bad debatable, of course, but it's hard to argue that they don't have a gigantic influence on the whole internet, no? :)
Helping people avoid potentially devastating mistakes is of course a good thing.
What point are you trying to make here? You hosted phishing sites on your primary domain, which was then flagged as unsafe. You chose not to use the tools that would have marked those sites as belonging to individual users, and the system worked as designed.
Please note that this tool (PSL) is not available until you have a significant user base. Which probably means a significant amount of spam as well.
Where'd you see/hear that? It hasn't been my experience at least - but maybe I've just been lucky or undercounting the sites.
There are required steps to follow but none are "have x users" or "see a lot of spam". It's mostly "follow proper DNS steps and guidelines in the given format" with a little "show you're doing this for the intended reason rather than to circumvent something the PSL is not meant for/for something the public can't get to anyways" (e.g. tricking rate limits, internal only or single user personal sites) added on top.
https://github.com/publicsuffix/list/wiki/Guidelines#validat...
"Projects that are smaller in scale or are temporary or seasonal in nature will likely be declined. Examples of this might be private-use, sandbox, test, lab, beta, or other exploratory nature changes or requests. It should be expected that despite whatever site or service referred a requestor to seek addition of their domain(s) to the list, projects not serving more then thousands of users are quite likely to be declined."
Maybe the rules have changed, or maybe you were lucky? :)
Ah yeah, looks like it was added in 2022 https://github.com/publicsuffix/list/wiki/Guidelines/_compar...
Thanks for the note!
You're not wrong. You just picked a poor example which illustrates the opposite of the point you're making.
Fair enough! :)
> but it's hard to argue that they don't have a gigantic influence on the whole internet, no? :)
Then don't relate this to safe browsing. What is the connection?
You could have just written a one liner. Google has too much power. This has nothing to do with safe-browsing.
In fact you could write...
- USA/China/EU etc has too much power..
You use the word relative in another reply..
Same way.. My employer has relatively too much power...
Is it? Companies like Google coddle users instead of teaching them how to browse smarter and detect phishing for themselves. Google wants people to stay ignorant so that it's Google squeezing them for money instead of the phishers.
How does Google get money out of people in that case? As a corporation, Google contributes greatly to the education sector and also profits greatly, so it seems like they're pro-education to me, and are merely making the best of a bad situation. But I'd love to hear how Google extracts money from the people they've protected from phishing schemes in some secret way that I haven't considered. I do happen to have Google stock in my portfolio, so maybe that indicts my entire comment for you.
This is a fine mentality when it takes a certain amount of "Internet street smarts" (a term used in the article) to access the internet - at least beyond AOL etc.
But over half of the world has internet access, mostly via Chrome (largely via Android inclusion). At least some frontline protection (that can be turned off) is warranted when you need to cater to at least the millions of people who just started accessing the internet today, and the billions who don't/can't/won't put the effort in to learn those "Internet street smarts".
How does flagging a domain that was actively hosting phishing sites demonstrate that Google has too much power? They do, but this is a terrible example, undermining any point you are trying to make.
The thing about Google is that they regularly get this stuff wrong, and there is no recourse when they do.
I think most people working in tech know the extent to which Google can screw over a business when they make a mistake, but the gravity of the situation becomes much clearer when it actually happens to you.
This time it's a phishing website, but what if the same happens five years down the line because of an unflattering page about a megalomaniac US politician?
Then that would be an example of a system having failed and one that needs to change. Instead, this is an example of a hosting company complaining about the consequences of skipping some of the basic, well-documented safety and security practices that help to isolate domains for all sorts of reasons, from reputation to little things like user cookies.
This article shows an example of this process working as intended though.
The user's site was hosting phishing material. Google showed the site owner what was wrong, provided concrete steps to remedy the situation, and removed the warning within a few hours of being notified that it was resolved.
Google's support sucks in other ways, but this particular example went very smoothly.
> Oh my god, my site was unavailable for 7 hours because I hosted phishing!
Won't someone please think of the website operator?
Maybe "Google can have a large impact" is a more accurate way of putting it than "power".
There are two aspects to the Internet: the technical and the social.
In the social, there is always someone with most of the power (distributed power is an unstable equilibrium), and it's incumbent upon us, the web developers, to know the current status quo.
Back in the day, if you weren't testing on IE6 you weren't serving a critical mass of your potential users. Nowadays, the nameplates have changed but the same principles hold.
The social side wasn't always dominated by a single power; that only began with the later social networks, not the early ones. And now people are retreating to smaller communities anyway.
Testing on IE6 wasn't the requirement; testing on all browsers was. IE shipped by default on Windows and basically forced its way into the browser conversation with an incomplete browser.
I don't mean social as in social network. I mean that people have always been a key aspect of the technology and how it practically works.
Yes, yes, IE6 shipped by default on Windows. And therefore, if you wanted a website that worked, you tested against IE6. Otherwise people would try to use your website, it wouldn't work, and they wouldn't blame the browser, they would blame your website.
Those social aspects introduce a bunch of not necessarily written rules that you just have to know and learn as you develop for the web.
> Google has too much power over the internet.
In this case they did use it for good cause. Yes, alternatively you could have prevented the whole thing from happening if you cared about customers.
Exactly.
> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged.
NO, Google should be "mindful" (I know companies are not people but w/e) of the power it unfortunately has. Also, Cloudflare. All my homies hate Cloudflare.
It is mindful.
... by using the agreed-upon tool to track domains that treat themselves as TLDs for third-party content: the public suffix list. Microsoft Edge and Firefox also use the PSL and their mechanisms for protecting users would be similarly suspicious that attacks originating from statichost.eu were originating from the owners of that domain and not some third-party that happened to independently control foo.statichost.eu.
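Concretely, "using the PSL" mostly means collapsing a hostname to its registrable domain (the public suffix plus one label) before attaching reputation or cookie scope to it. Here's a deliberately tiny sketch of that idea with a hand-picked rule set standing in for the real list; it skips the wildcard/exception rules and edge cases of the published algorithm, so it's illustrative only:

```typescript
// Toy subset of public-suffix rules; the real list has thousands of entries
// plus wildcard ("*.foo") and exception ("!bar.foo") rules, and implementations
// follow the PSL's published matching algorithm rather than this sketch.
const publicSuffixes = new Set(["com", "eu", "co.uk", "github.io", "statichost.eu"]);

// Registrable domain = longest matching public suffix plus one more label.
// This is the unit that reputation systems tend to treat as "one site".
function registrableDomain(hostname: string): string | null {
  const labels = hostname.toLowerCase().split(".");
  for (let i = 1; i < labels.length; i++) {
    const candidateSuffix = labels.slice(i).join(".");
    if (publicSuffixes.has(candidateSuffix)) {
      return labels.slice(i - 1).join(".");
    }
  }
  return null; // no rule matched in this toy set
}

// With "statichost.eu" listed, each customer is its own reputation unit;
// remove it from the set and everything collapses onto the parent domain.
console.log(registrableDomain("evil.customer.statichost.eu")); // "customer.statichost.eu"
console.log(registrableDomain("foo.example.co.uk"));           // "example.co.uk"
```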
Getting on the public suffix list is easier said than done [1]. They can simply say no if they feel like it and are making sure to be able to keep said rights as a "project" vs a "business," [2] which has its pros and cons.
[1] https://github.com/publicsuffix/list/blob/main/public_suffix...
[2] https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...
> Getting on the public suffix list is easier said than done [1].
Can you elaborate on this? I didn't see anything in either link that would indicate unreasonable challenges. The PSL naturally has a series of validation requirements, but I haven't heard of any undue shenanigans.
Is it great that such vital infrastructure is held together by a ragtag band of unpaid volunteers? No; but that's hardly unique in this space.
> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.
How is this kinda not insane? https://publicsuffix.org/list/public_suffix_list.dat
A centralized list, where you have to apply to be included and it's up to someone else to decide whether you will be allowed in? How is this what they went for: "You want to specify some rules around how subdomains should be treated? Sure, name EVERY domain that this applies to."
Why not just something like https://example.com/.well-known/suffixes.dat at the main domain or whatever? Regardless of the particulars, this feels like it should have been an RFC and a standard that avoids such centralization.
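Purely as an illustration of that idea (nothing like this is standardized; the path and format below are invented, and the next reply explains where IETF work on a DNS-based equivalent ended up), a client-side fetch of such a self-declared suffix file might look like:

```typescript
// Hypothetical only: ".well-known/suffixes.dat" is the commenter's proposal,
// not a real standard, and the line-based format here is simply borrowed from
// the real PSL's conventions for illustration.
async function fetchSelfDeclaredSuffixes(domain: string): Promise<string[]> {
  const resp = await fetch(`https://${domain}/.well-known/suffixes.dat`);
  if (!resp.ok) return []; // no declaration: treat the whole domain as one site
  const text = await resp.text();
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("//")); // skip comments/blanks
}

// The obvious catch, raised in the replies: the domain is vouching for itself,
// so a malicious operator could use this to shard its own bad reputation.
```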
There was an IETF working group that was working on a more distributed alternative based on a DNS record (so you could make statements in the DNS about common administrative control of subdomains, or lack of such common control, and other related issues). I believe the working group concluded its work without successfully creating a standard for this, though.
The problem is that you then have to trust the site's own statement about whether its subdomains are independent.
Yes, it's generally good advice to keep user content on a separate domain.
That said, there are a number of IT professionals who aren't aware of the PSL, as these are largely initiatives that didn't exist prior to 2023 and don't get a lot of advertisement, let alone being a formal requirement. They largely just started being used silently by the big players, which itself presents issues.
There are hundreds if not thousands of whitepapers in the industry, and AFAIK there are only one or two places it's mentioned in industry working groups, and those were blog posts, not whitepapers (at M3AAWG). There's no real documentation of the organization, what it's for, or how it should be used in any of the working group whitepapers. Just that it is being used and needs support; not something professionals would pay attention to, IMO.
> Second, they should be using the public suffix list
This is flawed reasoning as is. It's hard to claim this with a basis when professionals don't know about it, a small subset just arbitrarily started doing this, and it seems more like a false after-the-fact justification for throwing the baby out with the bathwater.
Security is everyone's responsibility, and Google could have narrowly tailored the block to the offending domain names instead of blocking the top level. They didn't do that. Worse, that behavior could even have been automated in a way that extended the process with a notice period to the top-level provider before it started hitting everyone's devices. They apparently didn't do that either.
Regardless, no single entity should be able to dictate what other people perceive or see arbitrarily from their devices (without a choice; opt-in) but that is what they've designed these systems to do.
Enumerating badness doesn't work. Worse, say the domain names get reassigned to another unrelated customer.
Those people are different people, but they are still blocked, as happens with small mail servers quite often. Who is responsible when someone who hasn't engaged in phishing is being arbitrarily punished without due process? Who is to say that Google isn't doing this purposefully to retain their monopolies for services they also provide?
It's a perilous, tortuous path where trust cannot be given, because they've violated that trust in the past and have little credibility, with all net incentives pointing toward their own profit at the expense of others. They are even willing to regularly break the law, and have never been held to account for it (i.e. the Google Maps Wi-Fi wiretapping).
Hanlon's razor was intended as a joke, but there are people who use it literally and inappropriately to deceitfully take advantage of others.
Gross negligence coupled with some form of loss is sufficient for general intent which makes the associated actions malicious/malice.
Throwing out the baby with the bathwater, without telling anyone and without warning, is gross negligence.
I'm not sure what to tell you. I'm a professional with nearly two decades of experience in this industry, and I don't read any white papers. I read web publications like Smashing Magazine or CSS Tricks, and more specifically authors like Paul Irish, Jake Archibald, Josh Comeau, and Roman Komarov. Developers who talk about the latest features and standards, and best practices to adopt.
The view that professionals in this industry exclusively participate in academic circles runs counter to my experience. Unless you're following the latest AI buzz, most people are not spending their time on arXiv.
The PSL is surely an imperfect solution, but it's solving a problem for the moment. Ideally a more permanent DNS-based solution would be implemented to replace it. Though some system akin to SSL certificates would be necessary to provide an element of third-party trust, as bad actors could otherwise abuse it to segment malicious activity on their own domains.
If you're opposed to Safe Browsing as a whole, both Chromium and Firefox allow you to disable that feature. However, making it an opt-in would essentially turn off an important security feature for billions of users. This would result in a far greater influx of phishing attacks and the spread of malware. I can understand being opposed to such a filter from an idealistic perspective, but practically speaking, it would do far more harm than good.
You seem to have not understood what I said, conflating whitepapers with academia, and then constructed the rest on an improper foundation from there.
Whitepapers aren't the sole domain of academia. What we are talking about aren't hosted on Arxiv. We are talking about industry working groups.
The M3AAWG working group and the CA/Browser Forum publish RFCs and whitepapers that professionals in this area do read regularly.
There's been insufficient or no professional outreach about the PSL. You can't just do things as a large player without disclosure for interop, because you harm others by doing so, neglecting the fallout that the lack of disclosure imposes on everyone else impacted within your sphere of influence, which, for a company running the second most popular browser, is global.
When you do so without first doing certain reasonable and expected things (of any professional organization), you are being grossly negligent. This is sufficient to prove general intent for malice in many cases, a reasonable person in such circumstances should have known better.
This paves the way for proving tortious or vexatious interference with a contract, which is a tort and actionable at law when brought against the entity.
> The PSL is surely an imperfect solution, but it's solving a problem for the moment.
It is not, because the disclosure hasn't happened properly for interop, and in such circumstances it predictably creates a mountain of problems without visibility; a timebomb/poison pill where a crisis arises later from the brittle structure, following shock doctrine and utilizing the snowball effect (a common tactic of the corrupt and deceivers alike).
Your entire line of reasoning which you constructed is critically flawed. You presume trust is important to this, and that such systems require trust, but trust doesn't have anything to do with the reputational metrics which the systems we are talking about are using to impose cost. Apples to Oranges.
You can't enumerate badness. Lots of professionals know this. Historic reputational blacklists also punish the innocent after the fact when they are not properly disclosed or engineered for due process. A permanent record deprives anyone of using a blacklisted entry after it changes hands from the criminal to some unsuspecting person.
Your reasoning specifically frames a false dichotomy about security. This follows almost identically the same reasoning the Nazis used (ref at the bottom).
No one is arguing that Safe Browsing and other mechanisms are useful as mitigation, but they are temporary solutions that must be disclosed to a detailed level that allows interoperability to become possible.
If you only tell your friends, and impose those draconian costs on everyone else, you are abusing your privileged position of trust for personal gain (a form of corruption), and causing harm on others even if you can't see it.
Chrome does not have an opt-out. You have to re-compile the browser from scratch to turn those subsystems off. Same with Firefox. That is not allowing you to disable that feature since users aren't reasonably expected to be able to recompile their software to change a setting.
There is no idealism/pacifism here. I'm strictly being pragmatic.
You neglect the harm you don't directly see, in the costs imposed on businesses. Second-, third-, and nth-order effects must be considered but have not been (and this consideration must necessarily grow with the scope/scale of impact).
There are a few areas where doing such blind things may directly threaten existential matters (i.e. food production, where failures of logistics lead to shortages, which whipsaw into chaos). It won't happen immediately, and we live in an increasingly brittle but still somewhat resilient society, but it will happen eventually if such harm is adopted and allowed as standard practice; though the method is indirect, the scope starts off large.
If you only look through a lens at a small part of the cycle of the dynamics that favors your argument which you set in motion, ignoring everything else; that is called cherry-picking or also commonly known as the fallacy of isolation.
Practically speaking, that line of reasoning is without foundational support and unsound. It's important to properly discern and reason about things as they actually exist in reality.
Competent professionalism is not an idealistic perspective. The harm naturally comes when one doesn't meet well established professional requirements. When the rule of law fails to hold destructive people to account for their actions; that's a three-alarm fire as a warning sign of impending societal collapse. The harms of which are incalculable.
Ref: "Of course the people don't want war. But after all, it's the leaders of the country who determine the policy, and it's always a simple matter to drag the people along whether it's a democracy, a fascist dictatorship, or a parliament, or a communist dictatorship. Voice or no voice, the people can always be brought to the bidding of the leaders.
(Your implications follow this part closely): That is easy. All you have to do is tell them they are being attacked, and denounce the pacifists for lack of patriotism, and exposing the country to greater danger."
Putting user content on another domain and adding that domain to the public suffix list is good advice.
So good, in fact, that it should have been known to an infrastructure provider in the first place. There's a lot of vitriol here that is ultimately misplaced away from the author's own ignorance.
The PSL is something you find out about after it goes wrong.
It's a weird thing, to be honest: a GitHub repo, mentioned nowhere in any standard, that browsers use to treat some subdomains differently.
Information like this doesn't just manifest itself into your brain once you start hosting stuff, and if I hadn't known about its existence I wouldn't have thought to look for a project like this either. I certainly wouldn't have expected it to be both open for everyone and built into every modern internet-capable computer or anti malware service.
To be pedantic, the GitHub repo is not the source of truth, this is:
https://publicsuffix.org/list/public_suffix_list.dat
It even says so in the file itself. If Microsoft goes up in flames, they can switch to another repository provider without affecting the SoT.
If you don't know what you're doing and as a result bad things happen, that's on you.
I don't have a lot of sympathy for people who allow phishing sites suffering reputational consequences.
To be fair I’ve been in the space for close to 20 years now, worked on some of the largest sites and this is the first I’m hearing of the public suffix list.
Maybe it was effective from obscurity?
For something that you think is a de-facto standard, public suffix list seems kinda raw to me for now.
I checked it for two popular public suffixes that came to mind: 'livejournal.com' and 'substack.com'. Both weren't there.
Maybe I'm mistaken, it's not a bug and these suffixes shouldn't be included, but I can't think of the reason why.
I don't know about LiveJournal, but I don't believe you can host any interactive content on substack (without hacking substack at least). You can't sign up and host a phishing site, for instance.
User-uploaded content (which does pose a risk) is all hosted on substackcdn.com.
The PSL is more for "anyone can host anything in a subdomain of any domain on this list" rather than "this domain contains user-generated content". If you're allowing people to host raw HTML and JS then the PSL is the right place to go, but if you're just offering a user post/comment section feature, you're probably better off getting an early alert if someone has managed to breach your security and hacked your system into hosting phishing.
The public suffix list interferes with cookies. So on a service like livejournal, where you want users logged in across all subdomains, it's not an option
Exactly, this has been documented knowledge for many years now, even decades. Github and other large providers of user-generated content have public-facing documentation on the risks and ways to mitigate them. Any hosting provider that chooses to ignore those practices is putting themselves, and their customers, at risk.
> There's a lot of vitriol here that is ultimately misplaced away from the author's own ignorance.
For what it's worth, this makes it sound like you think the vitriol should be aimed at the author's ignorance rather than the circumstances which led to it, presuming you meant the latter.
I do think the author's ignorance was a bigger problem--both in the sense that he should have known better and in the sense that the PSL needs to be more discoverable--than anything Google('s automated systems) did.
However, I'm now reflecting on what I said as "be careful what you wish for", because the comments on this HN post have done a complete 180 since I wrote it, to the point of turning into a pile-on in the opposite direction.
> also in the sense that the PSL needs to be more discoverable
Well, this is a problem that caused the author's ignorance but you present it as though it's the other way around. That's primarily what I meant. Not really disagreeing with "should have known better", mostly in the sense that user-generated content is a huge yellow flag.
This is of course true! It just takes an incident like this to get ones head out of ones ass and actually do it. :)
The good news is, once known, a lesson like this is hard to forget.
The PSL is one of those load-bearing pieces of web infrastructure that is esoteric and thanklessly maintained. Maybe there ought to be a better way, both in the sense of a direct alternative (like DNS), and in the sense of a better security model.
There’s some value in the public suffix list being shared, with mild sanity checking before accepting entries: it maintains a distinction between site (which includes all subdomains) and origin (which doesn’t). Safe Browsing wants to block sites, but if you can designate your domain a public suffix without oversight, you can bypass that so that it will only manage to block your subdomains individually (until they adjust their heuristics to something much more complicated and less reliable than what we have now).
This is the kind of thing that customers rely on you to do _before_ it causes an incident.
The thing is, for users, having a separate domain wouldn't have made any difference without the PSL. And you cannot get on there before you're big enough - which I'd say is roughly at the same time as you start grabbing the attention of scammers.
One can only imagine the other beginner mistakes made by this operator.
Well, you're responding to him, so questions or suggestions are probably better than speculation.
My comment about vitriol was more directed at the HN commenters than Eric himself. Really, I think a discussion about web infrastructure is more interesting than a hatefest on Google. Thankfully, the balance seems to have shifted since I posted my top-level comment.
> Well, you're responding to him, so questions or suggestions are probably better than speculation.
I suspect the author is unaware of their other blindspots. It's not 2001 anymore. Holding yourself out as a hosting provider comes with some baseline expectations.
> baseline expectations
Do you have more details? That sounds interesting.
Everyone learns somehow.
Since there's a lot of discussion about the Public Suffix list, let me point out that it's not just a webform where you can add any domain. There's a whole approval process where one very important criterion is that the domain to be added has a large enough user base. When you have a large enough user base, you generally have scammers as well. That's what happened here.
It basically goes: growing user base -> growing amount of malicious content -> ability to submit domain to PSL. In that order, more or less.
In terms of security, for me, there's no issue with being on the same domain as my users. My cookies are scoped to my own subdomain, and HTTPS only. For me, being blocked was the only problem, one that I can honestly admit was way bigger than I thought.
Hence, the PSA. :)
What sort of size would be needed to get on there?
My open source project has some daily users, but not thousands. That's plenty to attract malicious content. I think a lot of people are sending it to themselves, though (like onto a malware analysis VM that is firewalled off, so they look for a public website to do the transfer), and even then the content is only on the site for a few hours. After more than 10 years of hosting this, someone seems to have fed a page into a virus scanner, and now I'm getting blocks left and right with no end in sight. I'd be happy to give every user a unique subdomain instead of short links on the main domain, and then put the root on the PSL, if that's what solves this.
> [..] projects not serving more then (sic) thousands of users are quite likely to be declined.
from PSL's GitHub repo's wiki [0].
[0]: https://github.com/publicsuffix/list/wiki/Guidelines#validat...
Based on what I've seen, there's no way to get that project into the PSL. I would still recommend hosting the content at projectcontent.com if the main site is project.com, though. :)
> My cookies are scoped to my own subdomain
If you mean with the `Domain` attribute, that's not really sufficient. You need to use the `__Host-` prefix.
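For anyone curious what that looks like in practice, here is a minimal sketch assuming a Flask app (the route name and cookie value are made-up placeholders, not anything from the services discussed here). A cookie whose name starts with `__Host-` is only accepted by browsers if it is `Secure`, has `Path=/`, and has no `Domain` attribute, so it stays host-only and cannot be set or overridden from a sibling subdomain:

```python
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/login")
def login():
    resp = make_response("logged in")
    # Host-only, HTTPS-only session cookie: invisible to sibling subdomains.
    resp.set_cookie(
        "__Host-session",
        "opaque-session-id",   # placeholder value
        secure=True,
        httponly=True,
        samesite="Lax",
        path="/",
        # deliberately no domain= argument; browsers reject __Host- cookies
        # that carry a Domain attribute
    )
    return resp
```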
As a CISO I am happy with many of the protections that Google creates. They are in a unique position, and probably the only ones to be able to do it.
However, I think the issue is that with great power comes great responsibility.
They are better than most organisations, and working with many constraints that we cannot always imagine.
But several times a week we get a false "this mail is phishing" incident, where a mail from a customer or prospect is put in "Spam", with a red security banner saying it contains "dangerous links". Generally it is caused by domain reputation issues that block all mail that uses an e-mail scanning product. These products wrap URLs so they can scan them when the mail is read, and thus when they fail to detect a virus, they become de facto purveyors of viruses, and their entire domain is tagged as dangerous.
I raised this with Google in May (!) and have been exchanging mail on a nearly daily basis: pointing out a new security product that has been blacklisted, explaining the situation to a new agent, etc.
Not only does this mean that they are training our staff that security warnings are generally false, but it means we are missing important mail from prospects and customers. Our customers are generally huge corporations, missing a mail for us is not like missing one mail for a B2C outfit.
So far the issue is not resolved (we are in October now!) and recently they have stopped responding. I appreciate our organisation is not the US Government, but still, we pay upwards of $20K/year for "Google Workspace Enterprise" accounts. I guess I was expecting something more.
If someone within Google reads this: you need to fix this.
I'm old. I've been doing security for a very long time. Started back in the 1990s. Here's what I have learned over the last 30 years...
Half (or more) of security alerts/warnings are false positives. Whether it's the vulnerability scanner complaining about some non-existent issues (based on the version of Apache alone... which was backported by the package maintainer), or an AI report generated by interns at Deloitte fresh out of college, or someone reporting www.example.com to Google Safe Browsing as malicious, etc. At least half of the things they report on are wrong.
You sort of have to have a clue (technically) and know what you are doing to weed through all the bullshit. Tools that block access, based on these things do more harm than good.
What this post might be missing is that it’s not just Google that can block your website. A whole variety of actors can, and any service that can host user-generated content, not just html (a single image is enough), is at risk, but really, any service is at risk. I’ve had to deal with many such cases: ISPs mistakenly blocking large IP prefixes, DPI software killing the traffic, random antivirus software blocking your JS chunk because of a hash collision, even small single-town ISPs sinkholing your domain because of auto-reports, and many more.
In the author’s case, he was at least able to reproduce the issues. In many cases, though, the problem is scoped to a small geographic region, but for large internet services, even small towns still mean thousands of people reaching out to support while the issue can’t be seen on the overall traffic graph.
The easiest steps you can take to be able to react to those issues are: 1. Set up NEL logging [1] that goes to completely separate infrastructure (a sketch of the headers is below), 2. Use RIPE Atlas and similar services in the hope of reproducing the issue and grabbing a traceroute.
I’ve even attempted to create a hosted service for collecting NEL logs, but it seemed to be far too niche.
[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Net...
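To make step 1 concrete, here is a minimal sketch of the two response headers involved, assuming a Flask app; the collector URL is a placeholder, and the whole point is that it lives on completely separate infrastructure so reports still arrive when the main site is blocked or unreachable:

```python
import json
from flask import Flask

app = Flask(__name__)

@app.after_request
def add_nel_headers(resp):
    # Tell browsers where to send reports (endpoint URL is a placeholder).
    resp.headers["Report-To"] = json.dumps({
        "group": "network-errors",
        "max_age": 2592000,
        "endpoints": [{"url": "https://nel-collector.example.net/reports"}],
    })
    # Opt this origin (and its subdomains) into network error logging.
    resp.headers["NEL"] = json.dumps({
        "report_to": "network-errors",
        "max_age": 2592000,
        "include_subdomains": True,
    })
    return resp
```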
I don't see how a separate domain would solve the main issue here. If something on that separate domain was flagged, it would still affect all user content on that domain. If your business is about serving such user content, the main service of your business would be down, even though your main domain would still be up.
You are right, it would still affect all users. Until the pending PSL inclusion is complete, that is. But it now separates my own resources, such as the website and dashboard of statichost.eu, from that.
A separate domain may not prevent users' content from being blocked, but it may prevent the administrative interfaces from being blocked. That would help affected customers get at their content, and the service could more easily put up a banner advising users of the situation, etc.
It feels like unless you're one of the big social media companies, accepting user content is slowly becoming a larger and larger risk.
It always was. You're one upload and a complaint to your ISP/Google/AWS/MS away from having your account terminated.
I second this for personal sites. Having run forums and chan sites without a CDN, I found that not only is this true, it is 100% automated. The timing of emails to VPS providers/registrars matches the times their scripts would crawl my sites, submit illicit content, screenshot it, and automatically submit the screenshots to the VPS/server/registrar providers. That was incentive enough for me to take my sites private/semi-private. I would move them to .onion nodes but that's just too slow for me. I have my own theories as to what groups are running these scripts to push people to CDNs, but no smoking gun.
Corporations are a little safer. They have mutually binding contracts with multiple internet service providers and dedicated circuits. They have binding contracts with DNS registrars. Having been on the receiving end of abuse@ they notify over phone and email giving plenty of time to figure out what is going on. I've never seen corporate circuits get nuked for such shenanigans.
Any services successfully offloading UGC to other moderated platforms? E.g. developer tools relying on GitHub instead of storing source/assets in the service itself, and Microsoft can take care of most moderation needs. But are there consumer apps that do things like this?
I think Imgur and Disqus are good examples of that; there are probably quite a few.
But something has definitely changed over the past few years. Back in the days, it felt completely normal for individuals to spin up and run their own forums. Small communities built and maintained by regular people. How many new truly independent, individual-run forums can you name today? Hardly any. Instead we keep hearing about long-time community sites shutting down because individuals can no longer handle the risks of user content. I've seen so many posts right here on HN announcing those kinds of shutdowns.
I feel like, yes, forums are being closed because they have migrated to the likes of Discord.
I have mixed opinions about Discord and, to be honest, I have mixed opinions about forums as well.
My opinion is that things like forums should be moved over to things like XMPP / IRC(?) / Signal(?) / Matrix (most preferred).
There are bridges for Matrix <-> IRC if that is something that interests you; there are bridges for everything. But I prefer Matrix with Cinny, and I generally think that, due to its decentralized nature, it might be better than centralized forums as well.
>How many new truly independent, individual-run forums can you name today?
Almost none, but it's due to a lot of complicated factors and not just the direct risk of user content.
Take moderation of content that won't get you banned by your ISP. It sucks. Nobody in their right mind would want to do it. There are countless bots and trolls that are going to flood your forums for whatever cause they champion.
Then there are DDoS floods because you pissed off said bots and trolls. This can make the forums unaffordable and piss off your ISP.
But even if nothing goes wrong, popularity is a risk in itself. In the past there was stuff like the Slashdot effect where your site would go down for a while. But now if your small site became popular on tiktok for some reason 20 million people could show up. Even if your site can stand up to that, how will you moderate it? How will you pay for the bandwidth?
Oh, and will you get any advertisers because of said user content? How are you going to pay for the site?
Oh, also you're competing with massive sites for eyeballs, how are you going to get actual users?
Is it consolidation of services? Waaaaay back in the day, imageboards like 4chan were "one complaint away from being shut down", but 24 hours later they'd be up again on another rag-tag hosting provider. Nowadays it's like one complaint to Cloudflare or AWS and the site is dead dead.
You're equally just one fake report to an automated system away from having your account shut down. So, yes, your actions have consequences, but more worrying to me is the ability of someone with a grudge to cause consequences for you as well.
This is a direct consequence of centralization of services. We're doing this to ourselves.
You say "we" like it is the population of internet users. They have no choice in this other than to use whatever sites are available. It is the megaEvilCorps that are doing it to us. They start with a novel idea that is rewarded by lots of users. They then decide to weaponize their site against us to become money printing machines. They then use that money to buy up any competition which artificially limits the end user's choices. WE aren't doing shit to ourselves. Profit seeking megaEvilCorps are doing it to us.
On top of the megaEvilCorps are evilScammyHackers that have made the internet a dangerous place. So entrepreneurial-minded folks came up with some cool things to help protect users and site owners from these evilScammyHackers. Problem is, it takes scaled services to do it, which again takes money, which naturally limits those that are able to provide those services. Again, this is not us doing anything to ourselves.
If you mean "we" as a species, then sure, but that's a really stretched definition of we.
It seems like you wanted to ask a question but avoided using question marks so there isn't much I can do here.
Not sure who changed the HN headline, but I appreciate the change. Especially since the concept in the headline is buried at the bottom of the post.
Post author is throwing a lot of sand at Google for a process that has (a) been around for, what, over a decade now and (b) works. The fact of the matter is this hosting provider was too open, several users of the provider used it to put up content intended to attack users, and as far as Google (or anyone else on the web) is concerned, the apex domain is where the buck stops for that kind of behavior. This is one of the reasons why you host user-generated content off your primary domain, and several providers have gotten the memo; it is unfortunate statichost.eu had not yet.
I'm sorry this domain admin had to learn an industry lesson the hard way, but at least they won't forget it.
Author here. I understand that my post and what I'm trying to say is unclear. And that there are too many different aspects to all this.
What I'm trying to say in the post specifically about Google is that I personally think that they have too much power. They can and will shut down a whole domain for four billion users. That is too much power no matter the intentions, in my opinion. I can agree that the intentions are good and that the net effect is positive on the whole, though.
On the "different aspects" side of things, I'm not sure I agree with the _works_ claim you make. I guess it depends on what your definition of works is, but having a blacklist as you tool to fight bad guys is not something that works very well in my opinion. Yes, specifically my own assets would not have been impacted, had I used a separate domain earlier. But the point still stands.
The fact that it took so long to move user content off the main domain is of course on me. I'm taking some heat here for saying this is more important than one (including me) might think. But nonetheless, let it be a lesson for those of you out there who think that moving that forum / upload functionality / wiki / CMS to its own domain (not subdomain) can be done tomorrow instead of today.
> In order to limit the impact of similar issues in the future, all sites on statichost.eu are now created with a statichost.page domain instead.
This read like a dark twist in a horror novel - the .page tld is controlled by Google!
https://get.page/
Thank you for this hacker-minded and sharp comment! <3 Seriously, not all comments on here are as fun to read for me as the author and fellow hacker.
And for what it's worth, it feels great to actually pay for something Google provides!
One way to build trust.
It can happen to anyone and cause a reputational risk. Once upon a time $workplace had a Zoho Form that would be blacklisted by Google Safe Browsing or Microsoft Edge for arbitrary periods of time, presumably because someone used Zoho to make a phishing site, leading to some very confused calls.
Sounds like a very convenient mistake to make against your competitors? It is not believable that they would not know what Zoho is, or that they would not realize that flagging the entire Zoho domain makes no sense.
GitHub discovered the same thing a long, long time ago, which is why you now have the github.io domain.
In GitHub's case, I think it was also because a lot of security boundaries were drawn at the domain level, which let x.github.com potentially grab cookies of y.github.com, or worse, github.com itself.
https://news.ycombinator.com/item?id=5500612
Don't forget the `githubusercontent.com` domain, which is specifically used to host risky, user-generated content, and fully documented in https://docs.github.com/en/authentication/keeping-your-accou... (using an open source component that other companies could also use, if they were interested in similar levels of security)
I am a solo developer. I recently created a new web app for a client. Google has marked it as phishing, so they can't use it. Obviously I can't do anything about it except report the error and wait. I'm worried that if I move it to a new domain, that one will get marked as well. Not sure what to do TBH.
Is it phishing?
No, however it does include a Microsoft Entra / Azure AD / Microsoft 365 login for that client's tenant. It is also a newly registered domain, so I can understand why it looks suspicious. The most frustrating thing is that this is all a machine, i.e. there's no one I can speak to, nothing I can do to fix it. My fate has been decided by an algorithm.
Google has some sort of internal flag for treating subdomains as separate origins on some platforms. We don't get a complete takedown of Neocities every time there's a spam site reported. It is likely that they were not on that list, but perhaps have been manually added to whatever that internal list is at this point.
The public suffix list (https://publicsuffix.org/) is good, and if I were to start from scratch I would do it that way (with a different root domain), but it's not absolutely required. The search engines can and do make exceptions that don't just exclusively use the PSL, but you'll hit a few bumps in the road before that gets established.
Ultimately Google needs to have a search engine that isn't full of crap, so moving user content to a root domain on the PSL that is infested with phishing attacks isn't going to save you. You need to do prolific and active moderation to root out this activity or you'll just be right back on their shit list. Google could certainly improve this process by providing better tooling (a Safe Browsing report/response API would be extremely helpful), but ultimately the burden is on platforms to weed out malicious activity and prevent it from happening, and it's a 24/7 job.
BTW the PSL is a great example of the XKCD "one critical person doing thankless unpaid work" comic, unless that has changed in recent years. I am a strong advocate of having the PSL management become an annual fee driven structure (https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...), the maintainer deserves compensation for his work and requiring the fee will allow the many abandoned domains on the list to drop off of it.
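As an aside, for anyone wondering how software actually consults the PSL: most languages have a library that bundles a copy of the list. Here is a rough sketch using the third-party Python package tldextract (my choice of library, just one of several that do this; the hostnames are only examples, and the output depends on how current the bundled list snapshot is):

```python
import tldextract  # pip install tldextract; bundles a snapshot of the PSL

for host in ["alice.github.io", "alice.statichost.eu"]:
    ext = tldextract.extract(host)
    # github.io is on the PSL, so each user site is its own "registrable
    # domain" (its own cookie/site boundary). A domain that is not on the
    # list collapses to the apex, which is why one bad subdomain can drag
    # the whole domain down.
    print(host, "->", ext.registered_domain, "(public suffix:", ext.suffix, ")")
```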
If you're not using separate domains then I hope you don't have any kind of sensitive information stored in cookies. You can't rely on the path restrictions for cookies because it's easily bypassed.
You can set cookies that strictly stay on the root domain and don't cross to subdomain origins, and vice versa (https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Coo...). We've been doing this for 12 years without issue.
Strict cookies crossing root to subdomains would be a major security bug in browsers. It's always been a (valid) theoretical concern but it's never happened on a large scale to the point I've had to address it. There is likely regression testing on all the major browsers that will catch a situation where this happens.
Hosts phishing sites, gets blocked by anti phishing mechanism. Works as expected from my point of view.
Get yourself on public suffix list or get better moderation. But of course just moaning about bad google is easier.
If youtube.com doesn't end up on the Safe Browsing blacklist because of phishing videos, but your own website can easily end up there, it's a pretty clear case of Google abusing their power.
YouTube doesn't allow you to put your credentials into a text box and hit send. Google Sites, on the other hand, does pose a risk, but it will likely be treated the same as any other domain on the PSL.
In my experience, Safe Browsing does theoretically allow you to report scams and phishing in user-generated content, but it won't apply unless there's an actual interactive web page on the other end of the link.
There is the occasional false positive but many good sites that end up on that list are there because their WordPress plugin got hacked and somewhere on their site they are actually hosting malware.
I've contacted the owners of hacked websites hosting phishing and malware content several times, and most of the time I've been accused of being the actual hacker or I've been told that I'm lying. I've given up trying to be the good guy and report the websites to Google and Microsoft these days to protect the innocent.
Google's lack of transparency about which exact URLs are hosting bad material does play a role there.
What is a phishing video?
YouTube hosts millions of videos telling people that they are the government/your bank and that you should move money/contact a scam center/buy cryptocurrency. Even worse is the fact you can pay to turn these videos into ads that will roll in front of other videos.
On the whole of YouTube, it's a tiny sliver of a percentage, but because YouTube has grown too large to moderate, it's still hosting these videos.
If Google applied the same rules they apply to the safe browsing list, they'd probably get YouTube flagged multiple times a week.
You are right, of course. I'm not sure if those of you who disagree with me think that Safe Browsing did its job (which it did!), that Safe Browsing is a good thing (which it maybe is, but which I slightly disagree with), or that it's ok that Google monitors everything everyone does.
The last point is actually the one I'm trying to make.
There should be a concept, sort of an inverse of tragedy of the commons, for the positive feedback loop of many users providing big data to a company that can use that data to benefit many users.
From spam blocking that builds heuristics fed by the spam people manually flag in Gmail, to Safe Browsing using attacks on users' Chrome installs as a signal, to their voice recognition engine leapfrogging the industry standard a few years back because they trained it on the low-quality signal from GOOG411 calls, Google keeps building products by harvesting user data... and users keep signing up because the resulting products are good.
This puts a lot of power in their hands but I don't think it's default bad... If it becomes bad, users leave and Google starts to lose their quality signal, so they're heavily incentivized to provide features users want to retain them.
This does make it hard to compete with them. In the US at least, antitrust enforcement has generally been about user harm, not market harm. If a company has de facto control but customers aren't getting screwed, that's considered fine, because ultimately the customer matters (and nobody else is owed a shot at being a Google).
> There should be a concept....
There is. It is called a Ponzi scheme, and it's illegal, but in most cases it has become indirect enough in its consequences, without proper guard rails or accountability, that it's now tolerated by most publicly traded businesses today (through clever deception).
Generally, it involves three phases:
1st: Front-loaded benefits in CapEx funding meeting customer/investor expectations regardless of cost.
2nd: Inflection point of momentum where CapEx falls off, a brief period where income meets costs.
3rd: Enshittification - momentum/acceleration reverses to the negative, failure of services as the system is continually hollowed out and cost exceeds income.
This is seen in the S-growth or S-adoption curves in business, which started to become visible towards the late 1970s and have grown progressively steeper ever since.
Most companies jettison (sell/off or merge) or close down services before they hit the 3rd stage where the service objectively can be seen as unprofitable by associated investors. The ones that don't are state-funded apparatus.
This concept drives almost everything we see today in modern society, and in the market there are parallels and indirect consequences fully described back in the 1950s by Mises with regard to money-printing regardless of its form (i.e. debt that is not reserve-backed (Basel III), synthetic shares, paper warrants (Comex), bonds (with the hold-to-maturity reporting loophole), flyer miles, credit card rewards, etc.).
The structure and its flaws remain foundationally intractable. This is how you profit and grow bigger off destroying the market. Eventually consolidation leaves state apparatus in place of a market.
No market can compete with slave labor, which is what state-funded apparatus use indirectly through money-printing/currency debasement. It's not considered a tax, and it's not given willingly. It's extracted labor.
Those who have lived through these times see the drastic reduction of options in available products, which have naturally sieved to the point where shortages are now regularly occurring (for those with a discerning eye). There are a lot of moving factors, but the structures and their inevitable trends are well known, at least in certain circles.
In seriousness, the totality of Socio-economic collapse is more probable than a lot of other potential futures, as a result of this. Collapse has happened many times throughout history in relation to money-printing.
Always before, we were not in ecological overshoot for our population, let alone having been in this state for two full generations. Catton and Malthus paint a grim picture of the outcomes, but no one of action pays attention to these things. It's all largely drowned out by the noise of bots.
It's hard to get that point because you're conflating two different stories.
Folks around here are generally uneasy about tracking in general too, but remove big brother monitoring from Safe Browsing and this story could still be the same: whole domain blacklisted by Google, only due to manual reporting instead.
"Oh, but a human reviewer would've known `*.statichost.eu` isn't managed by us"—not in a lot of cases, not really.
Sure, and sorry for being so unclear. The point of my post was meant to be a) Google has this enormous cannon, is this "right"? And b) they will use it to kill anything bigger than a mosquito.
But you're right, complaining about big tech surveillance didn't help with making that point at all.
> But you're right, complaining about big tech surveillance didn't help with making that point at all.
I disagree. Everyone with a brain is thinking it. It's important to address what your audience may be thinking, especially given the other factors in this which I've mentioned in other responses related to gross negligence.
Technical capability exists to narrowly define blacklists, and they chose a gross negligence route (baby with bathwater), without providing notice.
You are right, but then again, nobody flags Facebook because of the scamming taking place on some Facebook pages.
Generally because Facebook polices Facebook (imperfectly, but the effort is demonstrated) and the damage radius is limited to Facebook users mostly. As long as the easiest way to avoid damage from the Facebook domain is "Don't use Facebook," the larger Internet doesn't need a mechanism to police it.
If Facebook became a trap that frequently hosted malware to strangers, the rest of the net would begin to interpret it as damage and route around it.
"Might makes right" as they say.
There is no real way a normal person even can flag facebook.
I’ve got a random subdomain hosting a little internal tool. About twice a year, Google Safe Browsing decides it’s phishing and flags it. Sometimes they flag the whole domain for good measure.
Search Console always points to my internal login page, which isn’t public and definitely isn’t phishing.
They clear it quickly when I appeal, and since it’s just for me, I’ve mostly stopped worrying about it.
I encountered something similar. I have `*.domain.tld` pointed to an internal IP address, and over the past few years it happened a few times where some subdomain would be flagged as dangerous by Google Safe Browsing.
Internal IP addresses in public DNS are sometimes used to do things like DNS rebind attacks. It's possible that's tripping up their detection mechanism.
My workaround is to use an IPv6 ULA for my publicly hosted private IP addresses, which is extremely unlikely to ever be reused by a bad actor.
Google services simply behaved the way I would expect them to here. Who knows... they may even have saved some users from coming to harm.
Many phishing attacks originate from Google's own domains: Gmail users phishing others, scammy YouTube videos, scammy comments with links to scams, scammy ads to fake banking pages, etc, etc. But Google would never be hypocritical, never!!
That is a great point. When I see these sites I'm always seeing a dozen red flags, and maybe the biggest one is that it's showing a "NatWest" banking site or something and is hosted on "portal-abc.statichost.eu". But the whole point is of course saving users from coming to harm, and if it did - great!
I was curious how other browsers handle this. Apparently Safari and Firefox delegate to Google.
https://www.apple.com/legal/privacy/data/en/safari/
https://support.mozilla.org/en-US/kb/how-does-phishing-and-m...
Microsoft seems to do its own thing for Edge, though.
https://learn.microsoft.com/en-us/deployedge/microsoft-edge-...
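For operators who would rather not wait for users to report breakage, Google's Safe Browsing Lookup API (v4) lets you check URLs against the same lists the browsers consult. A rough sketch; the API key and the URL being checked are placeholders, and this only covers the lookup side, not reporting or appeals:

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder; created in a Google Cloud project
ENDPOINT = f"https://safebrowsing.googleapis.com/v4/threatMatches:find?key={API_KEY}"

body = {
    "client": {"clientId": "example-monitor", "clientVersion": "1.0"},
    "threatInfo": {
        "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
        "platformTypes": ["ANY_PLATFORM"],
        "threatEntryTypes": ["URL"],
        "threatEntries": [{"url": "https://some-user.example-host.eu/"}],
    },
}

resp = requests.post(ENDPOINT, json=body, timeout=10)
# An empty JSON object means no match; a "matches" array means the URL is listed.
print(resp.json())
```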
This is a bit of a tangent, but the whole concept of "domain reputation" can be infuriating. For example, my blog has been marked as suspicious by spamhaus.org: https://check.spamhaus.org/results?query=dynomight.net
As a result, some ISPs apparently block the domain. Why is it listed? I have no idea. There are no ads, there is no user content, and I've never sent any email from the domain. I've tried contacting spamhaus, but they instantly closed the ticket with a nonsensical response to "contact my IT department" and then blocked further communication. (Oddly enough, my personal blog does not have an IT department.)
Just like it's slowly become quasi-impossible for an individual to host their own email, I fear the same may happen with independent websites.
From reading that, my guess would be that the IP you got from your hosting provider had some spammy history before you started hosting your blog on it.
Either that or your DNS provider hosts a lot of spam.
Hmmm, I use https://njal.la/ for DNS. Could spamhaus really just auto-mark every njalla user as suspicious?
Yeah, possibly. Privacy related services are often used by spammers.
> As a result, some ISPs apparently block the domain
This is the infuriating part. I get that someone buying cheap hosting may end up with an IP address that used to send spam, but spam lists are not reliable indicators of website security.
Overzealous security products are a blight on the internet. I'd be less annoyed at them if they weren't so trivial to bypass as a hacker with access to a stolen credit card.
So… you were hosting user-generated content on the same domain as your website, without using the PSL, and you blamed G when things went south?
By putting UGC on the same domain you also put your own security at risk, so they basically did you a favor…
Many commenters are implying that there is a security issue here, and that I'm putting everyone in danger. That is quite frankly a pretty absurd claim to just casually make. I'm of course very curious to hear more details on what the security risk here actually would be?
Do you think I'm reading/writing sensitive data to/from subdomain-wide cookies?
Also, yes, the PSL is a great tool to mitigate (in practice eliminate) the problem of cross-domain cookies between mutually untrusting parties. But getting on that list is non-trivial and they (voluntary maintainers) even explicitly state that you can forget getting on there before your service is big enough.
I am not implying you’re putting “everyone” in danger. I’m merely implying that you’re putting your own service in danger by allowing clients to act like a trusted subdomain like controlpanel.statichost.eu, .secure, or Unicode similarities of www.
Ok, I see. You mean the possibility of users impersonating statichost.eu itself. That is actually a good point, and the exact reason why user subdomains are required to have a dash in them. Edit: Also, only ASCII is allowed. :)
I guess control-panel.statichost.eu is still possible, of course, but that already seems like a pretty long shot.
It's also good from a security perspective.
Anyone who can upload HTML pages to subdomain.domain.com can read and write cookies for *.domain.com, unless you declare yourself a public suffix and enough time has passed for all the major browsers to have updated themselves.
I've seen web hosts in the wild who could have their control panel sessions trivially stolen by any customer site. Reported the problem to two different companies. One responded fairly quickly, but the other one took several years to take any action. They eventually moved customers to a separate domain, so the control panel is now safe. But customers can still execute session fixation attacks against one another.
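To make the risk concrete, here is an illustrative sketch with made-up hostnames (not a description of any particular provider): any customer site served from customer.example-host.com can return a Set-Cookie for the whole .example-host.com zone, and unless example-host.com is a public suffix, browsers will accept it. If the control panel at panel.example-host.com trusts a cookie with the same name, the customer has just planted (fixated) a session identifier in the victim's browser:

```python
from flask import Flask, make_response

# Imagine this app is a customer's site served at customer.example-host.com.
app = Flask(__name__)

@app.route("/")
def innocent_looking_page():
    resp = make_response("totally innocent customer page")
    resp.set_cookie(
        "panel_session",             # same cookie name the control panel uses (assumption)
        "attacker-chosen-value",     # the fixated session id
        domain=".example-host.com",  # a subdomain may set a cookie for its parent domain
        path="/",
    )
    return resp
```

The `__Host-` prefix mentioned earlier is the defence on the provider's side: a `__Host-panel_session` cookie cannot be shadowed this way, because prefixed cookies carrying a Domain attribute are rejected outright by browsers.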
(Author here) This is all true. The main assumption from my part is that anything remotely important or even sensitive should be and is hosted on a domain that is _not_ companysubdomain.domain.com but instead www.company.com.
Super interesting article! We were facing the exact same problem last Monday. I also wrote a couple of words about the incident: https://muetsch.io/how-google-accidentally-took-us-off-the-i....
Does anyone know if adding our domains to the Public Suffix List will prevent incidents like this?
I don't like nor trust google, but "Use your own judgement and hard-earned Internet street smarts" doesn't work either, because the median internet user does not have anything resembling internet street smarts.
Still not sure why it's legal for Google to slander companies like this. They often have no proof or it's a false positive, meanwhile they're screaming about how malicious you are.
My site, a library for building services with the GOV.UK Design System, is currently blocked too.
Despite being a paying Google Workspace customer, I can't get in touch with anyone who can help.
https://github.com/x-govuk/govuk-components/
Notably this post did not examine whether any of the sites it was hosting on this domain was malicious/misleading.
I'm not asking about this specific case. There are plenty of examples of Google wrongly accusing others of being malicious with massive business impact
Good question, it probably shouldn't be legal. Enforcing the law on these behemoths is another problem, though... :)
Because Google has absolutely nothing to lose and you do, besides that they can outlast anybody except for nation states in court.
How does this answer the question of legality?
Questions of legality are answered by a judge, not by a forum
Seems like a reasonable trade-off; I mean, six hours is not the worst thing in the world. What if you were hosting something mission-critical? Were you?
> To be fair, many or even most sites on the Google Safe Browsing blacklist are probably unworthy. But I’m pretty sure this was not the first false positive.
The bigger issue is that the internet needs governance. And, in the absence of regulation, someone has stepped in and done it in a way that the author didn't like.
Perhaps we could start by requiring that Google provide ways to contact a living, breathing human. (Not an AI bot that they claim is equivalent.)
Why do you assume that the living, breathing human hired by theGoogs will be competent at handling all of the crazy that will be flung at them by the living, breathing human on the other end of the line? One single person cannot handle that. Naturally, you need a team of living, breathing humans. You might even have them in triage-level groups like level 1 support, level 2 support and so on, where each level is a more trained/experienced living, breathing human. Eventually, you'll have an entire department of people of varying degrees of skill. Oh, wait, I'm sorry, I thought it was the year 2000.
Hopefully, this helps you understand why your living, breathing human is such a farcical idea for theGoogs to consider.
Well, Google did self-appoint itself the "internet police," and the general job of the police is to deal with screwballs.
So you can't take one part of the responsibility and abdicate the other part!
Playing devil's advocate: who else was going to step into that role? Who would have the clout to be trusted? The Googs would want to do something just as a self-protecting action, and it evolved into a self-aggrandizing sense of empowerment: they might not be the protector we need, but the one we deserved.
You really think talking to a human and a bot is the same?
I don't even know what you're asking, or how that's the question you take from my comment. Clearly, no, I don't think a human and a bot are the same. I'm saying that evilCorp is not going to pay for a human support staff in the year 2025 when the company is pushing its AI/LLM chatbot as a major part of who they are. If the chatbot company doesn't use its own chatbot, why would anyone else? Of course they are not going to pay for humans.
How does any of that lead to your asking if I think humans === bots?
That may be, and we certainly don't need anyone explaining Google's position - we already know what that is. Nobody here actually cares what Google wants, we're expressing what we want. Nerds have helped Google enough with free marketing and goodwill, Google's reputation being tarnished can only help us not hurt us.
Honestly, this is extremely basic stuff in hosting, not only due to Safe Browsing, but also, and more importantly, cookie safety, etc. If a hosting provider didn't know (already bad enough) and turns to whining after being hit, then
> Static site hosting you can trust
is more like amateur hour static site hosting you can’t trust. Sorry.
The thing is, you cannot just add any domain to the PSL. You need a significant amount of users before they will include your domain. Before recently, there really was no point in even submitting, since the domain would have been rejected as too small. An increase in user base, increase in malicious content and the ability to add your domain to the PSL all happen sort of simultaneously.
I'm also trusting my users not to expose their cookies for the whole *.statichost.eu domain. And all "production" sites use a custom domain anyway, which avoids all of this.
There are well-documented solutions to this that don't rely on the PSL. Choosing to ignore all of that advice while hosting user content is a very irresponsible choice, at best.
So the problem here is that Alice on alice.statichost.page might set a cookie for the `.statichost.page` domain if she's careless (which is sometimes the case with Alice). This cookie can then be read by Mallory on mallory.statichost.eu. Or the other way around, if Mallory wants to try to trick Alice into reading his cookie. How this can be prevented without the PSL is something I'm very interested to hear more about.
If user1.statichost.page gets blacklisted now will it affect user2.statichost.page as well?
Yes, unless they submit statichost.page to the public suffix list.
I have the same issue. Think of my site as WeTransfer, but instead of only files, you can also use it as a link shortener or pastebin. Abuse works the same as on every other site or service: I do spot checks and users can report content. This was fine until uBlock Origin decided the website was malicious, per one of the lists that is default-enabled for everyone
That list doesn't have a clear way to get off of it. I would be happy to give them the heads up that their users are complaining about a website being broken, but there is no such thing, neither for users nor for me. In looking around, there's many "sources" that allegedly independently decided around the same day that my site needs to not work anymore, so now there's a dozen parties I need to talk to and more popping up the further you look. Netcraft started sending complaints to the registrar (which got the whole domain put on hold), some other list said they sent abuse to the IP space owner (my ISP), public resolvers have started delisting the domain (pretending "there is no such domain" by returning NXDOMAIN), as well as the mentioned adblockers
There's only one person who hasn't been contacted: the owner. I could actually do something about the abusive content...
It's like the intended path is that users start to complain "your site doesn't work" (works for me, wdym?) and you need to figure out what software it is they're using, what DNS resolver they use, what antivirus, what browser, whether a DoH provider is enabled... to find out who it might be that's breaking the site. People don't know how many blocklists they're using, and the blocklists don't give a shit if you're not a brand name they recognize. That's the only difference between my site and a place like GitHub: if I report "github.com hosts malware", nobody thinks "oh, we need to nuke that malicious site asap!"
I'd broaden the submitted post to say that it's not only Google with too much power, but these blocklists have no notification mechanism or general recourse method. It's a whack-a-mole situation which, as an open source site with no profit model (intentionally so), I will never win. Big tech is what wins. Idk if these lists do a trademark registration check or how they decide who's okay and who's not, but I suspect it's simply a brand name thing and your reviewer needs to know you
> Luckily, Google provided me with a helpful list of the offending sites
Google is doing better than most others with that feature. Most "intelligence providers", which other blocklists like e.g. Quad9 uses, are secretive about why they're listing you, or never even respond at all
I have recently had the pleasure of speaking with Google senior leadership involved in the Safe Browsing product on the topic of getting my SaaS product placed on their "naughty list." The platform was down for six or so hours due to a false positive hit for phishing.
I have read A LOT of blogs/rants/incidents on social media about startups, small businesses, and individuals getting screwed by large companies in similar capacities. I am VERY sympathetic to those cries into the sky, shaking fists at clouds, knowing very well we are all very small and how the large providers seem to not care. With that in mind, I am not blind to the privilege my organization has to rope in Google to discuss root causes for incidents.
I am writing about it here because I believe most people will never be able to pull a key Google stakeholder into a 40 minute video call to deeply discuss the RCAs. The details of the discussion are probably protected by NDA so I'll be speaking in general terms.
Google has a product called Web Risk (https://cloud.google.com/web-risk/docs/overview), I hear it's mostly used by Google Enterprise customers in regulated verticals and some large social media orgs. Web Risk protects the employees of these enterprise organizations by analyzing URLs for indicators of risk, such as phishing, brand impersonation, etc.
My SaaS platform is well established and caters mostly to large enterprise. I provide enterprise customers with optional branded SSO landing pages. Customers can either use sign-in from the branded site (SP-initiated) or redirect from their own internal identity provider to sign-in (IdP-initiated). The SSO site branding is directed by the customer, think along the lines of what Microsoft does for Entra ID branded sign-in pages. Company logo(s), name, visual styling, and other verbiage may be included. The branded/vanity FQDN is (company).productname.mydomain.com.
You may be able to see where I'm headed at this point... Why was my domain blocked? For suspected phishing.
A mutual enterprise customer was subscribed to Google's Web Risk. When their employees navigated to their SSO site, Google scanned it. Numerous heuristics flagged the branded SSO site as phishing and we were blocked by Safe Browsing across all major web browsers (Safari, Chrome, Firefox, Edge, and probably others). Google told us that had our customer put the SSO site on their Web Risk allow-list, we wouldn't have been blocked.
I'm no spring chicken, I cannot rely nor expect a customer to do that, so I pressed for more which led to a lengthy conversation on risk and the seemingly, from my perspective, arbitrary decisions made by a black box without any sort of feedback loop.
I was provided a tiny bit of insight into the heuristic secret sauce, which led to prescribed guidance on what could be done to greatly reduce the risk of getting false positive flag for phishing again. Those specifics I assume I cannot detail here, however the overall gist of it is domain reputation. Google was unable to positively ascertain my domain's reputation.
My recommendation is for those of you out there in the inter-tubes who have experienced false positive Safe Browsing blocks, think about what you can do to increase your domain's public reputation. Also, get a GCP account so if you do get blocked, you can open a ticket from that portal. I was told it would be escalated to the appropriate team and be actioned on within 10-15 minutes.
Another day, another IT company learning the hard way about the public suffix list, or well-known URIs, or some other well-documented-but-niche security technology.
I love that IT is a field where there's no required formal education track and you can succeed with any learning path, but we definitely need some better way to make sure new devs are learning about some of these gotchas.
Wow, €19 for web hosting. I pay like €8 for a whole VPS. Crazy.
How dare Google tag my website unsafe just because I'm hosting a bunch of phishing sites!
Like, I get that Google has a lot of power, but you'd think they would use a case where Google was actually in the wrong.
The PSA is good, the article is meh. There is too much misdirected anger towards google here, IMO. I agree it sucks to be the false positive, but it'd also suck more to unknowingly be part of phishing campaigns and not know.
On top of that, it is also recommended to serve user content from another domain for security reasons. It's much easier to avoid entire classes of exploits this way. For the site admins: treat it as a learning experience instead of lashing out at goog. In the long run you'll be better off, having learned a good lesson.
Exactly! For a web dev in 2025 to still not know security best practices that have been around for 20+ years is a failure on the part of the dev.
I’m sure I don’t know ALL the "security best practices that have been around for 20+ years" and this is perfectly fine as long as I’m able to react quickly. See also https://xkcd.com/1053/.
It's fine if you personally didn't know that. But if I'm paying for a service, I expect the provider to understand basic security best practices that have been industry standard for 20+ years. And if they don't, they should be hiring people who do.
XKCD 1053 is not a valid excuse for what amounts to negligence in a production service.
Author here. What kind of security negligence are you referring to? What would be a specific attack vector that I left open?
Regarding the PSL - and I can't believe I'm writing this again: you cannot get on there before your service is big enough and "the request authentically merits such widespread inclusion"[1]. So it's kind of a chicken and egg situation.
Regarding the best practice of hosting user content on a separate domain: this has basically two implications: 1. Cookie scope of my own assets (e.g. dashboard), which one should limit in any case and which I'm of course doing. So this is not an issue. 2. Blacklisting, which is what all of this has been about. I did pay the price here. This has nothing to do with security, though.
I'm sorry to be so frank, but you don't know anything about me or my security practices and your claim of negligence is extremely unfounded.
[1] https://github.com/publicsuffix/list/wiki/Guidelines#validat...
Eric, I think it appropriate to point out the lack of any real documentation (at a professional level) related to the PSL in the professional working groups touching on these things (i.e. M3AAWG).
There are only two blog posts on M3AAWG, from 2023, where it had been used silently (apparently for years) but was calling for support. I would think that if it were an industry-recognized initiative, it would have the appropriate documents/whitepapers published on it in the industry working group tasked with these things. These people are supposed to be engineers, after all. AFAIK this hasn't happened, aside from a brief after-action with requests for support, which is highly problematic.
When there is no professional outreach (via working group or trade group), it's really hard to say that this isn't just gross negligence on Google's part. M3AAWG has hundreds if not thousands of whitepapers, each hundreds of pages. A single blog post or two that mention it insufficiently won't rationally negate this claim supporting gross negligence.
Why do I mention gross negligence? When coupled with loss, it is sufficient in many cases to support a finding of 'malice' without specific intent (i.e. general intent), especially when such an entity has little or no credibility but is overshadowed by power/authority that is undeserved. Deceitful people who reasonably should know the consequences will go bad often purposefully structure towards general intent to avoid legal complications, and the legal system has evolved around that. I am not a lawyer; this paraphrase about gross negligence/general intent/malice did come from a lawyer, but it is not meant or intended for use as legal advice in paraphrase form, so the standard IANAL disclaimer applies. If that is needed, consult a qualified professional for a specific distinction on this.
The company is more than technically capable of narrowly defining blacklists and providing due process and appropriate noticing requirements.
The situation raises questions of tortious interference, and of whether the PSL is being used as an anti-competitive moat to keep competitors out of the market by arbitrarily imposing additional costs on them, costs that are asymmetric to what the incumbent bears for its own competing services (as an oligopoly/monopoly).