We did a large-scale study of this phenomenon recently: https://www.cs.bu.edu/faculty/crovella/paper-archive/wung-if...
Across a broad sample of typo domains of major sites, most registered domains aren’t actually reachable, implying they are registered for defensive, legitimate, or unrelated purposes. Interestingly, the typo space on major sites is actually very sparsely registered (2% at edit distance 1), meaning that typosquatting may actually be underexploited.
>Interestingly, the typo space on major sites is actually very sparsely registered (2% at edit distance 1), meaning that typosquatting may actually be underexploited.
Anecdotally, the autosuggestions and improved browsing history recommendations may mean this is way less lucrative than it used to be.
Also, anyone doing search like behaviour in their address bar is far more likely to see a knowledge panel style reply for prominent websites vs the 10 blue link format of historical search engine results, which may have included the nefarious domains.
I'd leap to say that because of this, users find their intended domain by using natural language far more than they used to.
I would argue that it is 100% searching in the address bar. Mobile has trained people to do that, and search results are usually solid enough to take you to the right place.
Yeah, I'd lean towards a high % also- it would take some time to prove it.
Also, homograph attacks are likely much less of a thing for the above reasons.
A possible explanation why typos for major sites are sparsely registered could be that the domain industry has put a lot of focus over the last decade on addressing malicious registrations, and many registrars that focus on the market segment of large companies sell products that monitor for malicious registrations, with a legal response in case one pops up. It also seems that bulk registrars have gotten better filters to reduce malicious registrations, which is a service some security companies offer to registrars. In theory it should be considerably more difficult today for a malicious actor to go to a major registrar and buy an obviously trademark-infringing domain for a major site.
Domain/trademark monitoring also directly competes with defensive registrations. Often it is a question of whether you want to pay the lawyers/monitoring service, a large number of registration/renewal fees, or both.
My guess is also that not all typos are equal. There should be a stricter edit metric limited to 1-keystroke-away edits (that is: delete, swap, or add/replace with a key one position away) instead of pure Levenshtein. For example, Fqcebook is a more likely typo than Fjcebook, but they are both edit-1.
Someone should make a qwertyshtein() function.
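A minimal sketch of what such a function could look like, in Python. Everything here is an assumption rather than anything from the paper: the name `qwertyshtein`, the staggered US QWERTY adjacency model, and the cost scheme (1 for a delete, a swap, or a replace/add involving a neighbouring key, and an arbitrary penalty `FAR = 2` for a replace/add with a distant key).

```python
# Hypothetical keyboard-aware edit distance; layout and costs are assumptions.
ROWS = ["1234567890-", "qwertyuiop", "asdfghjkl", "zxcvbnm"]
POS = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}
FAR = 2  # assumed cost of replacing/adding a key that is not a neighbour


def adjacent(a: str, b: str) -> bool:
    """True if a and b are neighbouring keys on a staggered US QWERTY layout."""
    if a not in POS or b not in POS:
        return False
    (r1, c1), (r2, c2) = POS[a], POS[b]
    if r1 == r2:
        return abs(c1 - c2) == 1
    if r1 > r2:  # make (r1, c1) the upper key
        (r1, c1), (r2, c2) = (r2, c2), (r1, c1)
    # Each key touches the keys one row below it at columns c-1 and c.
    return r2 - r1 == 1 and c2 in (c1 - 1, c1)


def ins_cost(a: str, i: int, ch: str) -> int:
    """Cost of typing an extra ch between a[i-1] and a[i]."""
    nearby = a[max(i - 1, 0):i + 1]  # the intended letters on either side
    return 1 if any(ch == k or adjacent(ch, k) for k in nearby) else FAR


def qwertyshtein(intended: str, typed: str) -> int:
    a, b = intended.lower(), typed.lower()
    n, m = len(a), len(b)
    # d[i][j] = cheapest way to turn a[:i] into b[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i  # deletions always cost 1
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost(a, 0, b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                sub = 0
            else:
                sub = 1 if adjacent(a[i - 1], b[j - 1]) else FAR
            best = min(
                d[i - 1][j] + 1,                          # delete
                d[i][j - 1] + ins_cost(a, i, b[j - 1]),   # add
                d[i - 1][j - 1] + sub,                    # replace (or match)
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                best = min(best, d[i - 2][j - 2] + 1)     # swap adjacent letters
            d[i][j] = best
    return d[n][m]


print(qwertyshtein("facebook", "Fqcebook"))  # q sits next to a
print(qwertyshtein("facebook", "Fjcebook"))  # j is nowhere near a
```

Under these assumed costs Fqcebook scores 1 and Fjcebook scores 2, so a filter of "score 1 or less" would keep the first and drop the second. A real version would want per-layout tables (phone keyboards differ from desktop ones) and probably fractional weights instead of a flat penalty.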
If I understand correctly from the paper what qualifies as an edit distance of 1 is pure Levenshtein distance-1 right?
Just curious because while the edit-1 space can be fairly big, I’d assume all edits have very different probabilities. So the squatted domains probably skew to a higher-probability edit. By that I mean mostly keyboard edit typos, e.g. on a phone: the “cwt” typo is more likely than “cpt” for “cat” because of a and w keyboard proximity. Wonder what the squatting rate is when you filter for edits within one keystroke, for example (this only really changes the add and replace types of edits, not delete or swap).
> Interestingly, the typo space on major sites is actually very sparsely registered (2% at edit distance 1)
It seems to me that "edit distance 1" still describes some very implausible typos.
Yeah corner and comer is an edit distance of 2 but perhaps more lucrative than corner and corker, as a bad example.
I saw rnicrosoft in use the other day, somewhere.
Yes, Levenshtein in that case gives too big an exploration space. A keyboard edit distance would probably work better. Delete and swap still count as 1, but replace and add should be limited to, say, keys at most one position away.
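To get a feel for how much that rule shrinks the space, here is a rough Python sketch that enumerates edit-1 candidates both ways. The adjacency model (staggered QWERTY, letters only) and the helper names `neighbours` and `edit1` are made up for illustration; the paper may well define its candidate set differently.

```python
import string

ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ALPHABET = string.ascii_lowercase


def neighbours(ch):
    """Letters next to ch on a staggered QWERTY layout (assumed model)."""
    for r, row in enumerate(ROWS):
        c = row.find(ch)
        if c < 0:
            continue
        near = {row[k] for k in (c - 1, c + 1) if 0 <= k < len(row)}
        if r > 0:  # row above: columns c and c+1
            above = ROWS[r - 1]
            near |= {above[k] for k in (c, c + 1) if k < len(above)}
        if r < len(ROWS) - 1:  # row below: columns c-1 and c
            below = ROWS[r + 1]
            near |= {below[k] for k in (c - 1, c) if 0 <= k < len(below)}
        return near
    return set()


def edit1(label, keyboard_only=False):
    """Edit-1 typos of a lowercase label: delete, swap, replace, add."""
    out = set()
    for i in range(len(label) + 1):
        head, tail = label[:i], label[i:]
        if tail:
            out.add(head + tail[1:])                        # delete
            subs = neighbours(tail[0]) if keyboard_only else ALPHABET
            out |= {head + ch + tail[1:] for ch in subs}    # replace
        if len(tail) > 1:
            out.add(head + tail[1] + tail[0] + tail[2:])    # swap
        if keyboard_only:
            # add: only keys next to (or the same as) a letter on either side
            adds = set()
            for k in label[max(i - 1, 0):i + 1]:
                adds |= neighbours(k) | {k}
        else:
            adds = ALPHABET
        out |= {head + ch + tail for ch in adds}            # add
    out.discard(label)
    return out


plain = edit1("facebook")
keyboard = edit1("facebook", keyboard_only=True)
print(len(plain), len(keyboard))
print("fqcebook" in keyboard, "fjcebook" in keyboard)
```

The keyboard-filtered set is a strict subset of the plain one, so a registration rate measured over it would use a smaller and arguably more realistic denominator. Digits and hyphens are left out to keep the sketch short.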
"... meaning that typosquatting may actually be underexploited."
Missing from the paper is an examination of web user behaviour
Over time, so-called "direct navigation" where the domain name, e.g., example.com, was typed into the browser address bar, has declined. By the time Google terminated "Adsense for domains" in 2012 IMO it had managed to systematically subsume most of the traffic and associated revenue from the typosquatting/domain parking racket
https://web.archive.org/web/20250320184725if_/https://domain...
With the introduction of the so-called "omnibar" or "omnibox" in Firefox^1 and Chrome, typographical errors in domain names are submitted as "searches" to a company that sells ad services. For example, Safari, Firefox, and Chrome all send search traffic to Google, LLC. From the DoJ antitrust litigation we know that Google has been paying ridiculously large sums of money to various companies for this traffic
1. Firefox originally called this the "awesome bar"
https://web.archive.org/web/20250927011424if_/https://www.cn...
Not to mention increasingly common user practice of direct navigation to a search engine webpage, e.g., google.com, then searching for the desired website, e.g., example.com
As everyone knows, one company, in some cases through acquisitions and/or anticompetitive conduct, came to control 1. search, 2. "the web browser", 3. online advertising services on the open web, 4. operating systems (mobile, "chromebook"), ...
If parked domains only get traffic from "direct navigation",^2 then it stands to reason that such traffic has declined as it has been increasingly captured by advertising-sponsored "default browsers" and, ultimately, Google. IMO, it makes sense that domain parking as a means of delivering ads and generating revenue would give way to these domains becoming unregistered or registered to malware distributors or the like
What are the registration histories for the unregistered edit distance 1 typosquatting domains? Consider the number that are "currently unregistered" versus "never before registered"
2. Perhaps the registrants are using other ways to send traffic to these domains
>“In large scale experiments, we found that over 90% of the time, visitors to a parked domain would be directed to illegal content, scams, scareware and anti-virus software subscriptions, or malware, as the ‘click’ was sold from the parking company to advertisers, who often resold that traffic to yet another party,” Infoblox researchers wrote in a paper published today.
Hey, same thing happens with my Google search results, what a coincidence!
Yeah, maybe it's not over 90% of the time. But I wonder if a study has been done on what the percentage is just for search ads.
Their definition of parked domain is a bit odd, with "expired" domain names and "typosquatting" domains. I work at a registrar, and the vast majority of parked domains for us are domains owned by customers who register alternative versions, campaigns, products, and misspellings of their primary domain. Parked in that sense means an almost empty zone with occasionally a default landing page, sometimes as a paid DNS service at the registrar and sometimes as a free service (there are still registration and renewal fees).
Putting a redirect onto such a domain would be a major bad-faith act by the registrar and a reason to avoid that registrar at all costs. The customer is the owner of that name, has their name attached as the registrant, and generally holds some legal risk while doing so. It also goes directly against the primary reason why the customers bought the domains in the first place.
The ones that hold advertisements fall into two specific cases. One is "expired" domains, which are not actually expired but where the registrar holds on to them in the hope that the old or a new customer will buy them for an extra cost. The other is names which a customer or the registrar itself bought as an investment in the hope of auctioning them off. That kind of behavior was historically frowned upon but is fairly common practice for a smaller number of domains. Usually you don't put redirects on those, since you want to expose the fact that the domain is for sale.
So I am very confused about where they got their 90% number from, but then I would not call typosquatting domains parked if they are registered by a malicious actor and used for a scam on their own servers (or hacked servers, as it may be).
I park mine by having no IP address, MX record is "0 ." meaning it does not receive email, the SPF record is "v=spf1 -all" and DMARC is a strict reject, CAA is 0 issue ";", BIMI is "v=BIMI1; l=; a=;". I do the same for wildcard DNS. There's probably more I should add.
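For anyone who wants to copy that setup, a rough Python sketch that renders those records as zone-file lines. The helper name `parked_zone`, the TTL, the `default._bimi` selector, and the exact DMARC tags chosen to stand in for "strict reject" are my assumptions; the record values themselves are the ones quoted above.

```python
def parked_zone(domain: str, ttl: int = 3600) -> list[str]:
    """Zone-file lines for a name that should neither host nor mail.

    Deliberately emits no A/AAAA records, so the name resolves to nothing.
    """
    records = []
    # Apex and wildcard get the same "no mail here" treatment.
    for name in (f"{domain}.", f"*.{domain}."):
        records += [
            (name, "MX", "0 ."),             # null MX: accepts no mail
            (name, "TXT", '"v=spf1 -all"'),  # SPF: nothing may send as this name
        ]
    records += [
        # Assumed tags for "strict reject": reject policy, strict alignment.
        (f"_dmarc.{domain}.", "TXT",
         '"v=DMARC1; p=reject; sp=reject; adkim=s; aspf=s"'),
        (f"{domain}.", "CAA", '0 issue ";"'),  # no CA may issue certificates
        # Empty BIMI record under the assumed default selector.
        (f"default._bimi.{domain}.", "TXT", '"v=BIMI1; l=; a=;"'),
    ]
    return [f"{name} {ttl} IN {rtype} {data}" for name, rtype, data in records]


for line in parked_zone("example.com"):
    print(line)
```

Check the output against your DNS provider's syntax before pasting it anywhere; some control panels want the quotes or trailing dots handled differently.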
Indeed, this is a common practice in the broader data. It seems the linked article is filtering to resolvable+hosted domains, a subset of overall domain parking.
Yup. That's why I am suggesting to stop that practice and just remove the IP rather than trusting the landing page someone else maintains. Or if one would like to give bots something to do point it to a multicast address or perhaps MoD/US Military address.
The m3aawg has a parked domain guide - https://www.m3aawg.org/sites/default/files/m3aawg_parked_dom...
I appreciate that but I will always follow the Bender Domain Parking Standard [1].
[1] - https://mirror.newsdump.org/domain_parking_standard.txt
We've unfortunately come a long (bad) way from the innocuous "backpack girl" parking pages.
For a refresher: https://i.kym-cdn.com/entries/icons/original/000/033/037/gir...
> For a refresher
I've never seen that image before. :/
More background: https://knowyourmeme.com/memes/people/parked-domain-girl
I remember her!
A similar trend I've noticed in the US in recent years is that misdialing toll-free numbers (or even correctly dialing an apparently expired number), originally "area code" 800, since expanded to include 888, 877, 866, 855, and 844, will lead to a scam or advertising connection.
This is one of numerous trustworthiness attacks on general public-switched telephone network (PSTN) use, which I suspect will lead to increased abandonment of that system. If we can trust neither incoming nor outgoing calls to connect to a trustworthy counterparty, people will tend to prefer systems where they can.
(This is on top of privacy and security issues with PSTN, including data exfiltration by operators, and potential for wiretapping and intercepting voice, texts, and data.)
I owned facebook.ky, as a goof, for about 2 weeks 10+ years ago before Facebook claimed it from me. Wild to me that huge banks don’t have a team whose responsibility it is to watch for and seize scam domains
Facebook[1], Google, etc. all use (or used to use) MarkMonitor, which offers domain squatting monitoring as a service[2] and uses the Uniform Domain Name Dispute Resolution Policy to remove offending domains that violate their trademarks. These services are quite expensive, from my understanding.
[1] It appears Facebook now utilizes their own internal registry.
[2] https://www.markmonitor.com/domain-dispute-recovery-solution...
I've seen this on some of the domains speculatively registered by companies hoping to sell them for a fortune. Pick a dictionary word, or just a short (3 or 4 letter) domain name. If it's not actually in use, somebody has registered it and would love to sell it for some stupid amount. In the meantime, I guess they pay the fees by renting to scammers...
I really wish the domain registrars would prohibit speculation, but there's money to be made, so...
Hopefully “direct navigation” does not become a boogeyman like “side loading” has.
Especially when the alternative is "type the company name into google" where the top 3 results are ads and they've previously been seen to stick malware distribution sites above the legitimate company pages
This was happening for months with blender in 2022/2023, previously collected links about it here: https://news.ycombinator.com/item?id=34917701
Yesterday I received spam with link on https://storage.googleapis.com/ that redirected to some parked domain.
The bit about the gmai.com mailserver is disturbing. One would imagine there are many other typo squatters with a similar setup.
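I just checked. At least it's not answering on 25 to receive all that free typo mail. Same for gmali.com. But they could spoof the gmail login page. Not finding out.
PORT STATE SERVICE
80/tcp open http
443/tcp open https
8080/tcp open http-proxy
You're looking in the wrong place. They don't need to be listening for mail on the machine behind the A/AAAA records for the domain, because they have an MX record indicating that mail should be delivered elsewhere:
$ dig MX gmai.com +short
1 mail.h-email.net.
Port 25 is very rare these days, as it implies the possibility of unencrypted traffic; legitimate SMTP traffic uses port 587. That said, I checked a couple of the hosts that that name resolves to, and they all listen for both SMTP and secure SMTP traffic:
$ nmap -p 25,587 mail.h-email.net
Starting Nmap 7.95 ( https://nmap.org ) at 2025-12-18 16:31 UTC
Nmap scan report for mail.h-email.net (165.227.159.144)
Host is up (0.093s latency).
Other addresses for mail.h-email.net (not scanned): 91.107.214.206 165.227.156.49 167.235.143.33 5.75.171.74 5.161.194.135 178.62.199.248 5.161.98.212 162.55.164.116 49.13.4.90
rDNS record for 165.227.159.144: mail2.h-email.net
PORT STATE SERVICE
25/tcp open smtp
587/tcp open submission
mail.h-email.net is a Spamhaus spamtrap.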
As far as I've been able to research, these typosquatting domain traps started at the same time as the Spamhaus CSS blacklist, which was actually a company called Deteque.
If the MX has a large number of Hetzner IPs as mailservers, then it's probably Spamhaus.
Port 25 is only uncommon for client submission, but prevalent for MTA>MTA traffic.
Can we have a land value tax for domains?
We have one; that's the registration fee.
This just happened to me a month ago. I was waiting for an unused domain to expire. The domain was hosted on Epik (which I think is a trashy company but w/e).
About a month before expiration it somehow got renewed for 10 years, which is weird because it was not available ... and is now hosting a "get-rich-quick" scam that pretends to be a genuine Petro Canada campaign.
> About a month before expiration it somehow got renewed for 10 years, which is weird because it was not available
I've seen some domain registrars auctioning off domains during the last 2-4 weeks before they expire. If nobody buys it, then it actually expires and is then released.
Which registrars? I would want to avoid those.
At the end of the day, no matter your domain, ICANN can just take it for their VC bros. Happened to a friend of mine that owned a pretty novel domain name that a certain social media company wanted. He refused to sell. ICANN and his registrar just transferred it out from under him. Gone. See ya.
Wow. In light of this it's amazing that Mr. Nissan (RIP) and later his heirs managed to not only retain control of nissan.com, but regain it after it was stolen years after his passing.
Money talks
There's a difference between trademark issues and your registrar auctioning off the name
Unless you prove you had the trademark first and they did it anyway.
Out of curiosity, what was the domain?
You gotta name the domain!
I'd rather not face the ire of ICANN, sorry.
I know better. They read this site. They know that all it takes is some company to issue some trademark litigation and they fold. No basis, no question, just here you go.
what would they do to you...?
Oh you sweet summer child, you haven’t met their lawyers…
well, condescension aside, literally what would they do? there's nothing remotely illegal about posting the name of a site in a forum. and here you are trying to get me to be as scared as you are about posting a basic fact in a forum and why would I be?