Not sure if the laws have changed, but over here in my country a few copies of every published book had to be sent to our "national library" for archival.
Ironic considering it seems to be a change in law that spurred this action in the first place.
If it's worth saving from a societal standpoint, maybe a third option of funding and maintaining a public archive could be taken. Wild idea, we can tax the faceless corporation to pay for it.
You really want mandatory data retention laws? Think about the side effects of this.
First of all, any such regulation is a regressive tax on small businesses. Small companies will find it harder to comply than large ones. The cost to Google would be trivial but for a small startup it might kill them, especially if retaining data isn't important to their business.
Secondly, there are privacy implications. It's sometimes good when data is purged.
For political advertisement via the internet? I absolutely want mandatory data retention and transparency. It must be clear what was published using which targeting criteria by whom and when. 100%. Our societies are in grave danger.
In general, this point is absolutely correct, and regulatory capture as a mechanism to stifle smaller competitors is shockingly common.
But in the case of requiring brokers of political advertising to maintain transparency about the reach of that advertising - that seems far more palatable and far more in the public interest. If you want to play in that specific sandbox, you owe accountability to the public at a level where dynamics across election cycles can be analyzed.
Of course, all this is just a thought exercise, since the background of the original post is that Google is removing its archive because its response to the EU regulatory environment has been to pull out of the political ads market entirely on a go-forward basis. Regulations did not require it to maintain any historical archives, apparently, and so the natural consequence would be that Google had no reason to air its historical dirty laundry with no benefit to them at all.
All communications are speech in one way, but whether this should be considered equivalent (from a regulatory perspective) to an individual's speech was not apparent to 4 out of 9 Supreme Court justices in the US in 2010 during Citizens United - and, certainly, this opinion does not bind (or speak for) the entire world's description of speech.
Pretty sure GP was being sarcastic. But in any case, there's no reason we can't recognize that Google and other such massively-influential companies are hugely different from a small business and act accordingly.
I think your comment exemplifies why people have an issue with "just regulate it" because there are endless nitpicks and carve-outs that seem arbitrary and will likely have unintended consequences. It's easy to go "then just do this" but in reality the government and private sector can only deal with so much from an enforcement and compliance perspective.
Saying "businesses over a certain size must comply" and "data must be anonymised" are not endless nitpicks, they're simple rules that can be and are regularly enforced the world over. I think your comment exemplifies why people have so much distaste for the corporate sphere and its disingenuous ideology in general.
We need to start working on the premise that large corporations are different beasts than small businesses. I mean as a people of the world as a whole.
There is a tipping point somewhere and that is definitely up for conversation but we need to pick a point and start making sure regulation hits where it does good.
Frankly, the outcomes of both "regulate it" and "don't regulate it" have already both been captured by the biggest offenders to use as they wish.
Say you build a hobby website for your photos and allow people to comment on them. Boom, now you are responsible for keeping archives for other people posting there and cannot take your site down. Why do you think this is correct?
This is about political advertisements, not about a hobby website for photos. As a society, we need to hold those that influence and track us, to be responsible and transparent.
Guys, this isn't Twitter, we don't have to be obtuse just to ramp up engagement.
Right or wrong (evidently wrong), the common assumption has historically been that the Internet giants like Google assumed the mantle of facilitating, and to a lesser degree, preserving, the digital commons. Having your own backups and general data practices is still going to be the best strategy, but I don't think it's fair or good faith to act like everyone who got bit by this and similar instances is just an idiot.
> Guys, this isn't Twitter, we don't have to be obtuse just to ramp up engagement.
I agree, but by the same notion jumping to the conclusion that this was a bad faith move from Google is overlooking the fact that the ad transparency site is still up and working for other countries.
This only impacts EU countries, even though most of the comments have assumed the entire ad archive is gone (meaning they didn’t even skim the article). A true good faith curiosity perspective would be to wonder why it’s the EU specifically.
I’m willing to bet the reaction here would be different (not from everyone, but in aggregate) if the headline was “Sourceforge just erased years of free software history” or “Google Scholar just erased years of scientific history” because they’d just taken down all old repos or search results for old papers without any notice.
> “Google Scholar just erased years of scientific history”
Sure, the tone would be different, but anyone who was totally shocked by Google pulling a service is definitely an idiot. How much worse could Google's reputation be at this point? It's Google. They pull stuff.
That it happened on any particular day? Yeah, that is surprising. What are the odds. Could have been any day of any year.
That it happened? Not a surprise. If it matters to you, you should have done something.
I think it's not about individual, but rather collective idiocy. It's much more convenient to believe the false impressions that big tech is trying to instill than to listen to nerds reminding you that the cloud is just other people's computers.
Agree totally. I don’t agree with the requirement that we naively treat all corporations of all sizes from 5 people to 500,000 by the same rules. When your profits are socialized across the entire world you have special obligations. Sorry not sorry. How to correctly specify those obligations to avoid unintended consequences is a separate matter, maybe not even possible but it’s an orthogonal question.
> the common assumption has historically been that the Internet giants like Google assumed the mantle of facilitating, and to a lesser degree, preserving, the digital commons
A role Google was happy to fill for so long. We shouldn't forget that, and we shouldn't let them simply throw away the responsibilities they endeavoured to undertake, just because it's no longer beneficial for them.
Platforms like Google, Meta, Xitter, and so on, have an incentive to save data, because mining that data is how they make money. So they save as much as they can for as long as they can. But if a mine is tapped out, it gets closed.
The EU has new regulations on political ad transparency and targeting coming in this year, with likely fines for non-compliance.
I imagine many of these old ads do not comply with the new rules so Google removed everything just to eliminate the risk of a fine or enforcement action.
If these are important, people shouldn't rely on the ad agency archiving them.
> I imagine many of these old ads do not comply with the new rules so Google removed everything just to eliminate the risk of a fine or enforcement action.
The EU does not regulate keeping historical records though. Google deleting them is almost suspicious because we can't imagine a good reason they'd go out of their way to have someone spend time on deleting information.
You're right about expectations from an Ad company though. Imagine people using their browser or phones thinking "privacy".
> The EU does not regulate keeping historical records though.
My experience dealing with GDPR and other EU regulations across several companies is that the laws are very vague in their wording. We encountered a lot of scenarios where the law was just vague enough that our lawyers advised us to avoid anything that could be interpreted as infringing. The penalties assigned in some of these laws are indicated as a percentage of global profits, so we would play it safe to avoid any possibility of some EU politician trying to score political points by getting headlines about a big fine they extracted from a tech company.
I don’t know about their political advertising laws specifically, but I would not be the least bit surprised if they deleted these ads to be 100% safe in dodging potential fines under vague laws.
> because we can't imagine a good reason they'd go out of their way to have someone spend time on deleting information.
Note that they didn’t remove the archive for non-EU countries. They only did this in the EU. By this logic, they spent extra effort exclusively doing this for the EU while keeping it for other countries. That suggests to me that some EU specific reason is in play.
I’m not sure what you’re saying exactly: you had a bad lawyer or you operate in a place where the spirit of the law is not known?
> That suggests to me that some EU specific reason
That was my point as well - Google is being shady for no apparent reason. I guess we’ll never know, or they’ll release some clickbaity statement like Apple’s recent “commentary” on the DMA. Baddies gonna do bad things.
Find me surprised the EU also has rules about this and everything else.
No wonder we are getting left behind. We literally pay thousands of people out of our taxpayer money, to make life difficult for anyone that wants to do business. That’s all those bureaucrats in Brussels literally do: come up with yet more rules.
You really should inform yourself about what exactly "we" are paying, whether it is "taxpayer" money, and what it gets used for. You'd be surprised how many of the things around you that you take for granted are also driven by those same regulations.
Sounds like you live in a country where elections were not manipulated by social media or at least not to the extent that they had to be annulled. Dark money and/or foreign nations with limitless money buying up digital adspace everywhere is absolutely a problem in the EU and I'm very glad we have some legislation against it now.
Everyone here is commenting on the fact that you cannot depend on a big company to store backups for you. This is generally fair, but the company in question specifically has the following mission statement:
> Google's mission is to organise the world's information and make it universally accessible and useful
At the very least, Google has some responsibility in helping out web archivists...
Yeah hindsight is gonna be 20-20 as always, but in general if it's not on your hard drive, you have very little control over if and when it is deleted.
First thing you should do if you find useful data is to archive it somehow.
IMHO it's not, but they built that ad transparency site specifically for that purpose, so it's noteworthy that they've suddenly and silently reversed course.
The archive and record is still up. The article is about how EU countries are not available in the tool.
Speculation is that some EU regulation might overlap with the site: Maybe it’s technically against some law to share that customer information or perhaps the upcoming laws about political advertising don’t have a carve-out for historical ads, so any such archive might be infringing.
If they were out to eliminate all transparency they’d just shut the site down, but they didn’t.
I think you could make an argument that any hyperscaler that has used its economy-of-scale superpower to obliterate as much competition as the giants have (thinking Google, Meta, et al.) and has gone to such lengths to become the center-point of its given market has at least a vague notion of an ethical obligation to like... not be monsters. Which is of course antithetical to the "eat the world" ethos: one would observe that few things once eaten are better for the experience.
Like, maybe it's just me, but if you as a corporate entity are going to subsume all your competition under the weight of your size and ability to lose money on tons of things so you can make even more on the others, I think it's a pretty fair ask in turn for you to not... I dunno, arbitrarily delete a shit ton of useful data for no reason?
"But the ad archives were introduced 7 years ago for a reason - in no small part because of the chaos of the Brexit and Trump 2016 votes, and our own advocacy here in Ireland about interference in the 2018 8th amendment referendum.
They were introduced to allow for scrutiny of campaigns, and also to provide a historical record so we could go back and look at what had been promised, and what had been spent, and to see if this lined up with what happened later.
This erasure of our political past feels dangerous, for scrutiny, for accountability, for shared memory, for enforcement of our rules - for our democracy."
It's a rather reasonable demand that they should keep a record of the political campaigns and psyops they make their dough from, at least if nuking Alphabet from orbit is not an option.
The groups that have the power to demand they do this are the same ones involved in the chaos, and they have no onus to ensure that Google keeps this data around. And as much as the US flip-flops in political power, it can go from "they must keep this data" one year to "they must get rid of it" the next.
No, the only reasonable demand in a world that has already embraced insanity is that you keep the data and share it with others. Why not get together with others of the same mindset and form voluntary entities that keep this data safe?
Trusting the government to keep the audit data on itself won't always work, and the 4th estate of the large media has been bought off by large companies that are much more apt to work in the government's interests to make as much profit as possible at the cost of democracy.
YouTube offers free hosting and is widely available. Presumably that’s why it was chosen to host these videos. Other people cannot duplicate YouTube. VCs financed YouTube to build a free service with network effects that allow them to turn around and charge rent on the one hand and exercise capricious control on the other. It’s not in the public interest to allow these billionaires to retain control of these companies. Their control over these rent seeking enterprises directly conflicts with the public interest. This incident, erasing political ads, is an egregious example of private entities using their wealth to silence essential public discourse.
This is a content-free dismissal. The existence of an effective monopoly in records retrieval for previously published data does not affect your ability to collect presently published data.
I think this is quite a bit more complex when you’re talking about products that operate as common carriers. The historical social contract we have with these services is that we treat them as the digital commons.
Clearly that has changed, but you’ll have to forgive folks for treating the internet as it has been in the ~20 years previous to the current chaos.
The billionaire owners of these systems are the entitled ones. They profit by using VC money to offer something for free, then turning around to exploit network effects and lock-in so they can charge the working class various forms of rent for what they’ve built.
This thread is full of comments that miss the fact that political ads are not uploaded to Google for free. Advertising is 99% of the funding that sustains the platform and makes Google oodles of money, and when it no longer suits them (read: doesn't generate profit and might be risky), they're dropping the content.
It'd be one thing if it was 20 petabytes of cat videos, but this is content that Google was literally paid to serve to people.
And 'broad'cast advertising is a different beast from hyper-targeted advertising that nobody but the intended recipient sees. In the latter case, political advertising archives offer insight into otherwise hidden advertising.
There must be more to this story given that the political ad archive is still available for many other non-EU countries:
> Now when you try to click on "political ads" you get re-directed to a page asking you to select from a small number of countries - the US, of course, UK, India, Australia, Brazil, Israel - but not one EU country (see below):
Is there some sort of EU data retention law at play here?
Is the historical aspect of the archive coincidental? It seems like Google archived ads, and those ads provided historical context. How does this information get supplemented by other historical data? I wonder if it will be relevant to know that an ad in the US talked about people eating their pets. I'm not trying to be sarcastic; it is a real part of history.
This article reports on the removal of access to the ad archive. Nowhere does it claim the onus is on Google to maintain this data. Most of the top-level comments here are arguing against a straw man.
I maintain my own link meta archive, because I am aware of link rot and I know that my own copy will stay. I also know that the Internet Archive exists, and that it is painfully slow.
I want to be able to at least browse headlines and titles about Jeffrey Epstein after 20 years, when they start erasing all history about him.
> This erasure of our political past feels dangerous, for scrutiny, for accountability, for shared memory, for enforcement of our rules - for our democracy.
My goodness, I wonder how long it took them to think this statement up? I imagine they revised it again and again asking the editor "does this sound scary enough?" to the point it bears little resemblance to reality.
Google didn't delete your political history, it deleted its own. You lost something that was given for free, thus could be taken away at any time.
Apparently it was so important that it would be "dangerous" if it went away but the author couldn't put in a minimal amount of effort to make a copy of it.
Maybe get your government to do this, instead of expecting some random company to do it for free indefinitely? This article could well be entitled "Google were the only people bothering to record our history."
How could the government do this? At best they could make an archive of the data Google chooses to publish but to do it themselves would require mandating Google to share the data with them to begin with.
> You asked "how could the government do this"? Come on, bro.
And you gave a generic link showing that they could share "data" in general. Not this kind of data.
Like "come on bro" both Ireland and the EU already have generic data portals...
> The linked article explains where they were getting the data before.
Yes, from the company who chose to share the data. There's no way to get the data from inside the company. In order for the government to publish the data they need to rely on the very same private company to supply them with it indefinitely, don't they? Again, "come on bro"...
Unfortunately people cannot 'get' their government to do much - and certainly not something as specific as this, given that electoral leverage is a blunt and broad tool indeed. Also this is an EU level issue, not one pertaining to a national government.
Missing in this analysis is that advertisers did not, in fact, put ads on Google for free. While it's surely not part of the terms of service that Google is going to "back up" your ads indefinitely, the calculus employed in this thread is very, very wrong.
Google made huge profits on political ads and when it no longer suits them to spend paltry disk space on them, it's dropping them like a hot rock.
Stop pretending like Google was somehow offering a platform that people were freeloading on. Rather, these political ads were 100% the funding that sustains the platform and makes them and all their shareholders money.
These businesses built what have become the commons, which they retain control over through mechanisms like network effects, which allow them to make unaccountable decisions on the one hand and profit from rentierism on the other. It would probably be in people’s interest for their governments to seize these assets and socialize them as public utilities.
Disappointing, but I don't see any foul play from Google like the headline kind of insinuates. Anyway, their product support is notoriously fickle, so why would you expect anything more of them? TLDR: a policy change at Google made the author's hobby more difficult or impossible because he didn't make any contingencies. Specifically, they deleted political ad history for the EU.
Where was the insinuation of foul play? I re-read the article with your criticism in mind, but it seems that they are basically just saying "this happened" and "these are the negative consequences."
The phrase "erased history" isn’t morally neutral. It carries a tone of judgment, even suggests censorship. If the author had wanted to appear neutral, they would have used "deleted" or "removed."
Ah! I read the article twice but hadn't considered the headline. I guess some editor wanted it sexed up a bit before it was published, the article itself is quite matter of fact.
This specific case is about people who have no idea they are interested in the data today (likely because they don't yet know it exists), needing it for research tomorrow. It's about how we should go about archiving public data rather than about whether somebody did a good job on their personal backups. And, if a company presents a data set as an accessible store of public information, what is a reasonable timeline for removing that service, especially when it is trivial for said company to maintain it for a reasonable sunset period. This is obviously extremely important information, and rug-pulling so quickly is borderline malicious.
Even if it were about personal information, I would still find your comment short-sighted, because companies position themselves as reliable, and people rely on them because of this. Even if it's not always smart to go out in a bad neighborhood late at night in a very short skirt, there are moments when pointing it out borders on pathological.
You say this like someone was talking about their own data, not someone else's. The OP is talking about a Google dataset of money spent on Google ads over time, not one the OP created or produced in any way. Do we even call it a "backup" when we're talking about preserving a third-party's data (possibly without their consent, possibly in a way they would consider a violation of their ToS)?
Still, it's certainly true, and observable, that we cannot trust Google to preserve anything. One reason some people don't realize this is that Google has spent years marketing itself as a trustworthy steward with an interest in doing things for the good of society. Another reason people don't do something about it, about every single Google dataset or media corpus that they might regret disappearing -- is that it's expensive and sometimes challenging to do so, Google sometimes tries to prevent it as a ToS violation, and nobody is funding it (or at least there is certainly not enough funding to preserve all of Google's datasets).
There are ways to address these things, but calling the people who notice the problem idiots is probably not one that will lead to addressing them.
I would suggest a slight modification to your idea, in that people should build local repositories for their stuff, and use the mega corp for whatever they are good at NOW, TODAY and keep a smirk ready for when the mega corp blunders off and the whole recovery is click, click, done.
cause if there is one thing the web is good at, it's signing you up for the newest flash, later today.
Didn't Google say that their mission is to organize the world's data? They definitely created the expectation and no matter what the t&c say, people will hold Google to the standards the company itself created via its marketing.
It’s no one else’s job unless you pay them, which is you taking a backup.
And when I say “a backup” I mean multiple backups: different media, different locations, different backup mechanisms, depending on how much you really don’t want to lose the data.
You seem to have misunderstood what is going on here.
The data here is Google's data. They sell ads to political parties. They then make details of those ad sales available to researchers to let them understand how political ad spending is happening in different countries.
This researcher is complaining that Google appear to have changed their mind about making that data available.
The audacity of people to think others should store data for them indefinitely is unbelievable. You had years to back this up if you really cared about it, and most probably don’t. Cry more.
And history isn’t “erased”. It still happened. It’s up to you to remember it.
To people interested in archiving this, it's still in the 7 days time history in BigQuery Public Dataset:
For instance:
  SELECT
    country
  FROM
    `bigquery-public-data.google_political_ads.geo_spend`
    FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 6 DAY)
  GROUP BY country;
Could someone get Archive Team on this? https://archiveteam.org
I dumped all the active tables (advertiser_declared_stats, advertiser_geo_spend, advertiser_stats, advertiser_weekly_spend, geo_spend, creative_stats) in CSV format. I'd prefer if someone did an archive where they are confident in the integrity of the dumped data but at least that version exists.
Please upload those to archive.org.
https://archive.org/details/bigquery-public-data.google_poli...
/r/DataHoarder as well
https://www.reddit.com/r/DataHoarder/
What does "7 days time" mean in this context?
https://cloud.google.com/bigquery/docs/access-historical-dat...
BigQuery supports a sort of "time travel" / point-in-time querying. By default, and for those datasets, it's enabled for 7 days, which means you can query data as it was up to 7 days ago.
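As a rough illustration of that window (a sketch, not something verified against these tables): the 6-day offset is just a value that stays inside the 7-day limit, and the alias `t` and the column names `row_count` and `content_fingerprint` are names picked here. Counting rows and XOR-ing a per-row hash at one point in time gives two numbers that can be noted down next to a dump:

```sql
-- Row count and an order-independent fingerprint of geo_spend
-- as the table looked 6 days ago (inside the 7-day time travel window).
SELECT
  COUNT(*) AS row_count,
  BIT_XOR(FARM_FINGERPRINT(TO_JSON_STRING(t))) AS content_fingerprint
FROM
  `bigquery-public-data.google_political_ads.geo_spend` AS t
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 6 DAY);
```

Because the offset is relative to CURRENT_TIMESTAMP(), the result can change between runs; a fixed TIMESTAMP literal would pin it, as long as it is still within the window.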
If someone can post instructions, I’d be happy to follow them and make a copy as well.
For what it's worth, my own procedure:
1. Create a GCS bucket
2. Open https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1s...
3. For each table under google_political_ads, run the following query: SELECT * FROM `bigquery-public-data.google_political_ads.<TABLE>` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 6 DAY);
4. Export as CSV in GCS
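Steps 3 and 4 can probably be combined into a single statement with BigQuery's EXPORT DATA. This is only a sketch under assumptions: `my-archive-bucket` is a hypothetical bucket name (the bucket from step 1), geo_spend stands in for <TABLE>, and CSV export only works if the table has no nested or repeated columns, which I have not checked for every table here.

```sql
-- Export the 6-day-old version of one table straight to GCS as CSV.
-- The URI needs exactly one wildcard so BigQuery can shard large outputs.
EXPORT DATA OPTIONS (
  uri = 'gs://my-archive-bucket/google_political_ads/geo_spend/*.csv',  -- hypothetical bucket
  format = 'CSV',
  header = true,
  overwrite = true
) AS
SELECT
  *
FROM
  `bigquery-public-data.google_political_ads.geo_spend`
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 6 DAY);
```

Repeat with the table name and the URI prefix changed for each table you want to keep.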
Another procedure that is probably better but requires BigTable is:
1. Open https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1s...
2. For each table, click Snapshot and set Snapshot time to Sep 23 (Sep 24 works as well)
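The Snapshot button should have a SQL equivalent in CREATE SNAPSHOT TABLE. A sketch, assuming a hypothetical destination dataset `my_project.my_archive` that you own (as far as I know it has to be in the same location as the source table), and using a relative offset instead of a specific date:

```sql
-- Snapshot one public table, as it was 6 days ago, into your own dataset.
-- `my_project.my_archive` is a placeholder; replace it with your project and dataset.
CREATE SNAPSHOT TABLE IF NOT EXISTS
  `my_project.my_archive.geo_spend_snapshot`
CLONE
  `bigquery-public-data.google_political_ads.geo_spend`
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 6 DAY);
```

A snapshot keeps the schema and types intact, which a CSV dump does not, but it stays inside BigQuery until you export it.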
Ohh, Google’s BQ query bills are about to go through the roof!
[flagged]
What a snarky comment. As I said just one comment below (https://news.ycombinator.com/item?id=45413138), I made my own backup. But as someone who's not actively using BigQuery, I'm not confident in the data integrity so I'd prefer if someone confident in their abilities could make an archive as well.
<https://news.ycombinator.com/item?id=9972849>
Archivist here. Google is not an archive. Neither is Tumblr or Flickr or any other platform that might delete your content at any time. They're companies and it's their job to make money. This is why my profession exists. We don't make money, which is why we're not well funded, but we have a whole lot of training, technical knowledge, and professional ethics around saving information and making it accessible. If you want to preserve your records, talk to an archivist because you can't assume some faceless corporation will do it for you.
I commend you for your work and I think it's incredibly important, and I fully agree with what you've posted here.
However, this is still a noteworthy story because they aren't complaining about their own data being deleted. It's all data history for political ads, and its whole point of existing was for transparency (it's even in the URL of the Google site). This is a reversal of an almost 10-year-old policy.
The data is not yet deleted but will be in 2 days, would you be interested in archiving it?
https://news.ycombinator.com/item?id=45412855
Yup. Thanks for your work.
If we want to preserve something, then it's up to us, to ensure that it's preserved.
If we pay someone else (like you) to do it, then we expect them to preserve it, but not if we aren't paying for it.
That said, preserving stuff; even electronic stuff, is a challenge.
I think the reasonable position here is that it's within Google's rights if they want to take data down, but at least give us warning and an archive of the data. If they'd said "we're going to take all this data down in two months but here's an archive of all of it if anyone wants to download it", I think very few people would have a problem with it.
> Google is not an archive.
Agree. They're a company with (I assume) a PR department that continues to allow the company to make some really bad choices that continue to erode their reputation.
I don’t think it’s PR’s job to tell the company what to do. It’s their job to spin what the company does in a way that’s beneficial to it.
That's an interesting point. But then I might have said instead that Google, the company, is sure making their PR department's jobs much more difficult.
Faceless corporations should be forced to do so via the law!
I'm reasonably confident you literally mean the corporations have to do the archiving since the article is about a corporation not doing it. Philosophically you just picked a random subgroup in society to do the archiving. If we're going to pick people who have to do this by law, why not force the archivists to do it? They've already got the skills and experience. Probably willing to do it voluntarily if there is some money in it from the government, but I suppose if we're committed to forcing people to do it there can be some sort of taskmaster to drag them back to the archives if they try to sneak out early.
It's kinda interesting, since many countries already have taxpayer paid archivers.
Not sure if the laws have changed, but every book published over here in my country needed to send a few copies to our "national library" for archival.
edit: https://www.nuk.uni-lj.si/informacije/obvezni-izvod-fizicni-... (yeah.. google translate it)
Ironic considering it seems to be a change in law that spurred this action in the first place.
If it's worth saving from a societal standpoint, maybe a third option of funding and maintaining a public archive could be taken. Wild idea, we can tax the faceless corporation to pay for it.
You really want mandatory data retention laws? Think about the side effects of this.
First of all, any such regulation is a regressive tax on small businesses. Small companies will find it harder to comply than large ones. The cost to Google would be trivial but for a small startup it might kill them, especially if retaining data isn't important to their business.
Secondly, there are privacy implications. It's sometimes good when data is purged.
For political advertisement via the internet? I absolutely want mandatory data retention and transparency. It must be clear what was published using which targeting criteria by whom and when. 100%. Our societies are in grave danger.
In general, this point is absolutely correct, and regulatory capture as a mechanism to stifle smaller competitors is shockingly common.
But in the case of requiring brokers of political advertising to maintain transparency about the reach of that advertising - that seems far more palatable and far more in the public interest. If you want to play in that specific sandbox, you owe accountability to the public at a level where dynamics across election cycles can be analyzed.
Of course, all this is just a thought exercise, since the background of the original post is that Google is removing its archive because its response to the EU regulatory environment has been to pull out of the political ads market entirely on a go-forward basis. Regulations did not require it to maintain any historical archives, apparently, and so the natural consequence would be that Google had no reason to air its historical dirty laundry with no benefit to them at all.
"Brokers of political advertising" is just speech.
All communications are speech in one way, but whether this should be considered equivalent (from a regulatory perspective) to an individual's speech was not apparent to 4 out of 9 Supreme Court justices in the US in 2010 during Citizens United - and, certainly, this opinion does not bind (or speak for) the entire world's description of speech.
"just speech" is just speech too, right?
I seem to be missing the point here. Are you claiming that this term can not be applied categorically in a realistic fashion? I think that's wrong.
I agree with what you said here.
I do think that with a slight modification, OP's statement can be improved in a practical way.
"Regulations that allow consumers to export their data should be more comprehensive and standardized."
I'll also note that many of these big services do allow you to export user data fairly easily.
What are the privacy implications in a database of advertising, an activity specifically intended to make information as public as possible?
Pretty sure GP was being sarcastic. But in any case, there's no reason we can't recognize that Google and other such massively-influential companies are hugely different from a small business and act accordingly.
>First of all, any such regulation is a regressive tax on small businesses.
your first argument is that it harms small businesses
it's really not an issue to set up laws such that small businesses do not have to follow them. the DMA is a perfect example
your second argument is that there are privacy implications
okay then require the data to be anonymised
I think your comment exemplifies why people have an issue with "just regulate it" because there are endless nitpicks and carve-outs that seem arbitrary and will likely have unintended consequences. It's easy to go "then just do this" but in reality the government and private sector can only deal with so much from an enforcement and compliance perspective.
saying "businesses over a certain size must comply" and "data must be anonymised" are not endless nitpicks, they're simple rules that can be and are regularly enforced the world over. I think your comment exemplifies why people have so much distaste for the corporate sphere and its disingenuous ideology in general
We need to start working on the premise that large corporations are different beasts than small businesses. I mean as a people of the world as a whole.
There is a tipping point somewhere and that is definitely up for conversation but we need to pick a point and start making sure regulation hits where it does good.
Frankly, the outcomes of both "regulate it" and "don't regulate it" have already both been captured by the biggest offenders to use as they wish.
Say you build a hobby website for your photos and allow people to comment on them. Boom, now you are responsible for keeping archives for other people posting there and cannot take your site down. Why do you think this is correct?
This is about political advertisements, not about a hobby website for photos. As a society, we need to hold those that influence and track us, to be responsible and transparent.
Guys, this isn't Twitter, we don't have to be obtuse just to ramp up engagement.
Right or wrong (evidently wrong), the common assumption has historically been that the Internet giants like Google assumed the mantle of facilitating, and to a lesser degree, preserving, the digital commons. Having your own backups and general data practices is still going to be the best strategy, but I don't think it's fair or good faith to act like everyone who got bit by this and similar instances is just an idiot.
> Guys, this isn't Twitter, we don't have to be obtuse just to ramp up engagement.
I agree, but by the same notion jumping to the conclusion that this was a bad faith move from Google is overlooking the fact that the ad transparency site is still up and working for other countries.
This only impacts EU countries, even though most of the comments have assumed the entire ad archive is gone (meaning they didn’t even skim the article). A true good faith curiosity perspective would be to wonder why it’s the EU specifically.
I’m willing to bet the reaction here would be different (not from everyone, but in aggregate) if the headline was “Sourceforge just erased years of free software history” or “Google Scholar just erased years of scientific history” because they’d just taken down all old repos or search results for old papers without any notice.
> “Google Scholar just erased years of scientific history”
Sure the tone would be different, but anyone who was totally shocked by Google pulling a service is definitely an idiot. How much worse could Google's reputation be at this point. It's Google. They pull stuff.
That it happened on any particular day? Yeah, that is surprising. What are the odds. Could have been any day of any year.
That it happened? Not a surprise. If it matters to you, you should have done something.
I think it's not about individual, but rather collective idiocy. It's much more convenient to believe false impressions that big tech is trying to instill rather than listening to nerds reminding you that cloud is just other people's computer.
Agree totally. I don’t agree with the requirement that we naively treat all corporations of all sizes from 5 people to 500,000 by the same rules. When your profits are socialized across the entire world you have special obligations. Sorry not sorry. How to correctly specify those obligations to avoid unintended consequences is a separate matter, maybe not even possible but it’s an orthogonal question.
> When your profits are socialized across the entire world
What does this mean?
> the common assumption has historically been that the Internet giants like Google assumed the mantle of facilitating, and to a lesser degree, preserving, the digital commons
A role Google was happy to fill for so long. We shouldn't forget that, and we shouldn't let them simply throw away the responsibilities they endeavoured to undertake, just because it's no longer beneficial for them.
Platforms like Google, Meta, Xitter, and so on, have an incentive to save data, because mining that data is how they make money. So they save as much as they can for as long as they can. But if a mine is tapped out, it gets closed.
The EU has new regulations on political ad transparency and targeting coming in this year with likely fines for non compliance.
I imagine many of these old ads do not comply with the new rules so Google removed everything just to eliminate the risk of a fine or enforcement action.
If these are important, people shouldn't rely on the ad agency archiving them.
> I imagine many of these old ads do not comply with the new rules so Google removed everything just to eliminate the risk of a fine or enforcement action.
The EU does not regulate keeping historical records though. Google deleting them is almost suspicious because we can't imagine a good reason they'd go out of their way to have someone spend time on deleting information.
You're right about expectations from an Ad company though. Imagine people using their browser or phones thinking "privacy".
> The EU does not regulate keeping historical records though.
My experience dealing with GDPR and other EU regulations across several companies is that the laws are very vague in their wording. We encountered a lot of scenarios where the law was just vague enough that our lawyers advised us to avoid anything that could be interpreted as infringing. The penalties assigned in some of these laws are indicated as a percentage of global profits, so we would play it safe to avoid any possibility of some EU politician trying to score political points by getting headlines about a big fine they extracted from a tech company.
I don’t know about their political advertising laws specifically, but I would not be the least bit surprised if they deleted these ads to be 100% safe in dodging potential fines under vague laws.
> because we can't imagine a good reason they'd go out of their way to have someone spend time on deleting information.
Note that they didn’t remove the archive for non-EU countries. They only did this in the EU. By this logic, they spent extra effort exclusively doing this for the EU while keeping it for other countries. That suggests to me that some EU specific reason is in play.
I’m not sure what you’re saying exactly: you had a bad lawyer or you operate in a place where the spirit of the law is not known?
> That suggests to me that some EU specific reason
That was my point as well - Google is being shady for no apparent reason. I guess we’ll never know, or they’ll release some clickbaity statement like Apple’s recent “commentary” on the DMA. Baddies gonna do bad things.
I would think to reduce the risk of looking bad too, when people can see what was done on their ad network.
The archive is still up for non-EU countries.
They didn’t delete the entire archive. Only something specific to the EU, potentially due to some upcoming regulations there.
Find me surprised the EU also has rules about this and everything else.
No wonder we are getting left behind. We literally pay thousands of people out of our taxpayer money, to make life difficult for anyone that wants to do business. That’s all those bureaucrats in Brussels literally do: come up with yet more rules.
You really should inform yourself what "we" are paying exactly, is it "taxpayer" money and what it gets used for. You'd be surprised at the things around you that you take for granted are also driven by those same regulations.
Sounds like you live in a country where elections were not manipulated by social media or at least not to the extent that they had to be annulled. Dark money and/or foreign nations with limitless money buying up digital adspace everywhere is absolutely a problem in the EU and I'm very glad we have some legislation against it now.
Protecting the free market is important for everyone doing business.
The neo liberal dream of completely unconstrained markets leads to only monopolies and no competition.
Left behind? Aren't we the happiest people in the world?
Everyone here is commenting on the fact that you cannot depend on a big company to store backups for you. This is generally fair, but the company in question specifically has the following mission statement:
> Google's mission is to organise the world's information and make it universally accessible and useful
At the very least, google has some responsibility in helping out web archivists...
"Don't be evil"
Yeah hindsight is gonna be 20-20 as always, but in general if it's not on your hard drive, you have very little control over if and when it is deleted.
First thing you should do if you find useful data is to archive it somehow.
Why is it on Google to store political history?
IMHO it's not, but they built that ad transparency site specifically for that purpose, so it's noteworthy that they've suddenly and silently reversed course.
To be clear, the site is still up. Archives are still available for other countries.
When they took money to influence political history, I think the least they can do is keep a record of who paid them to say what.
The archive and record is still up. The article is about how EU countries are not available in the tool.
Speculation is that some EU regulation might overlap with the site: Maybe it’s technically against some law to share that customer information or perhaps the upcoming laws about political advertising don’t have a carve-out for historical ads, so any such archive might be infringing.
If they were out to eliminate all transparency they’d just shut the site down, but they didn’t.
I think you could make an argument that any hyperscaler that has used its economy of scale superpower to obliterate as much competition as the giants have (thinking Google, Meta, et al) and has gone to such lengths to become the center-point of its given market has at least a vague notion of an ethical obligation to like... not be monsters. Which is of course antithetical to the "eat the world" ethos: one would observe that few things once eaten are better for the experience.
Like, maybe it's just me, but if you as a corporate entity are going to subsume all your competition under the weight of your size and ability to lose money on tons of things so you can make even more on the other, I think it's a pretty fair ask in turn for them to not... I dunno, arbitrarily delete a shit ton of useful data for no reason?
Sadly it's not just you, but it's wrong. They didn't have as part of the agreement to show some ads a clause for keeping the data for eternity.
Well, if an organization is so central to the operation of our society, maybe they need to not be privately owned then.
They don't need to be. It's just a strange coincidence that all the talented innovators aren't working for the public sector.
"But the ad archives were introduced 7 years ago for a reason - in no small part because of the chaos of the Brexit and Trump 2016 votes, and our own advocacy here in Ireland about interference in the 2018 8th amendment referendum.
They were introduced to allow for scrutiny of campaigns, and also to provide a historical record so we could go back and look at what had been promised, and what had been spent, and to see if this lined up with what happened later.
This erasure of our political past feels dangerous, for scrutiny, for accountability, for shared memory, for enforcement of our rules - for our democracy."
It's a rather reasonable demand that they should keep a record of the political campaigns and psyops they make their dough from, at least if nuking Alphabet from orbit is not an option.
Reasonable demand from whom?
The groups that have the power to demand they do this are the same ones involved in the chaos and have no onus to ensure that google keeps this data around, and as much as the US flip flops in political power it can go from they must keep this data one year to they must get rid of it the next.
No, the only reasonable demand in a world that has already embraced insanity is that you keep the data and share it with others. Why not get together with others of the same mindset and form voluntary entities that keep this data safe.
Trusting the government to keep the audit data on itself won't always work, and the 4th estate of the large media has been bought off by large companies that are much more apt to work in the governments interests to make as much profit as possible at the cost of democracy.
I wrote it so clearly it's from me.
Is this your view on corporations having to keep and make public their accounting records as well? If not, what's the difference?
Why is what is effectively the digital commons privately owned and we are okay with that?
Private owners went to the effort of collecting it. Other people can start doing so at any time.
YouTube offers free hosting and is widely available. Presumably that’s why it was chosen to host these videos. Other people cannot duplicate YouTube. VCs financed YouTube to build a free service with network effects that allow them to turn around and charge rent on the one hand and exercise capricious control on the other. It’s not in the public interest to allow these billionaires to retain control of these companies. Their control over these rent seeking enterprises directly conflicts with the public interest. This incident, erasing political ads, is an egregious example of private entities using their wealth to silence essential public discourse.
Monopolies are not easy to crack.
This is a content-free dismissal. The existence of an effective monopoly in records retrieval for previously published data does not affect your ability to collect presently published data.
It's not the digital commons, mostly because that's not a thing. The EU could've kept the ads they uploaded, but they didn't.
Arbitrarily labeling a private, multi-billion dollar infrastructure as a "digital commons" reeks of hubris and entitlement.
I think this is quite a bit more complex when you’re talking about products that operate as common carriers. The historical social contract we have with these services is that we treat them as the digital commons.
Clearly that has changed, but you’ll have to forgive folks for treating the internet as it has been in the ~20 years previous to the current chaos.
The billionaire owners of these systems are the entitled ones. They profit by using VC money to offer something for free, then turning around to exploit network effects and lock-in so they can charge the working class various forms of rent for what they’ve built.
This thread is full of comments that miss the fact that political ads are not uploaded to Google for free. Advertising is 99% of the funding that sustains the platform and makes Google oodles of money, and when it no longer suits them (read: doesn't generate profit and might be risky), they're dropping the content.
It'd be one thing if it was 20 petabytes of cat videos, but this is content that Google was literally paid to serve to people.
The whole discussion is cosmically ironic.
TV ads are not free either. But I don't think many people expect TV stations to archive every single ad forever.
Stadium ads are not free either. But (ditto).
We're talking political advertising, not all ads.
And 'broad'cast advertising is a different beast from hyper-targeted advertising that nobody but the intended recipient sees. In the latter case, political advertising archives offer insight into otherwise hidden advertising.
There must be more to this story given that the political ad archive is still available for many other non-EU countries:
> Now when you try to click on "political ads" you get re-directed to a page asking you to select from a small number of countries - the US, of course, UK, India, Australia, Brazil, Israel - but not one EU country (see below):
Is there some sort of EU data retention law at play here?
Is the historical aspect of the archive coincidental? It seems like Google archived ads; and those ads provided historical context. How does this information get supplemented by other historical data? I wonder if it will be relevant to know that an ad in the US talked about people eating their pets, not trying to be sarcastic it is a real part of history.
I mostly agree with comments here saying this is kind of silly to malign Google for.
I'm grateful for the piece regardless if only to inform me that the service exists (and, well — now doesn't for some countries).
This article reports on the removal of access to the ad archive. Nowhere does it claim the onus is on Google to maintain this data. Most of the top-level comments here are arguing against a straw man.
They wouldn’t. You just can’t access it anymore. And there is a reason for that.
I maintain my own link meta archive, because I know that it will stay and I am aware of link rot. I also know that the Internet Archive exists, and that it works painfully slowly.
I want to be able to at least browse headlines and titles about Jeffrey Epstein after 20 years, when they start erasing all history about him.
Links:
https://github.com/rumca-js/RSS-Link-Database-2025 - Year 2025
https://github.com/rumca-js/RSS-Link-Database-2024 - Year 2024
https://github.com/rumca-js/RSS-Link-Database-2023 - Year 2023
https://github.com/rumca-js/RSS-Link-Database-2022 - Year 2022
https://github.com/rumca-js/RSS-Link-Database-2021 - Year 2021
It would be foolish to let Google (or any other company) be YOUR archivist
no way, something on the internet disappeared?
It’s not Google’s responsibility to keep an archive of data a few people find important.
If they find it important to keep those ads, they should get funding for their project and store them themselves. Ideally without further burdening taxpayers.
This erasure of our political past feels dangerous, for scrutiny, for accountability, for shared memory, for enforcement of our rules - for our democracy.
My goodness, I wonder how long it took them to think this statement up? I imagine they revised it again and again asking the editor "does this sound scary enough?" to the point it bears little resemblance to reality.
Google didn't delete your political history, it deleted its own. You lost something that was given for free, thus could be taken away at any time.
Apparently it was so important that it would be "dangerous" if it went away but the author couldn't put in a minimal amount of effort to make a copy of it.
Maybe get your government to do this, instead of expecting some random company to do it for free indefinitely? This article could well be entitled "Google were the only people bothering to record our history."
How could the government do this? At best they could make an archive of the data Google chooses to publish but to do it themselves would require mandating Google to share the data with them to begin with.
The USA has https://data.gov. The EU could make something similar.
>but to do it themselves would require mandating Google to share the data with them to begin with
The data was already public
> The USA has https://data.gov
Could you link specifically to the datasets about political advertising reach available via that website? I get no results
> The data was already public
Where?
>Could you link specifically to the datasets about political advertising reach available via that website? I get no results
You asked "how could the government do this"? Come on, bro.
>Where?
The linked article explains where they were getting the data before.
> You asked "how could the government do this"? Come on, bro.
And you gave a generic link showing that they could share "data" in general. Not this kind of data.
Like "come on bro" both Ireland and the EU already have generic data portals...
> The linked article explains where they were getting the data before.
Yes, from the company who chose to share the data. There's no way to get the data from inside the company. In order for the government to publish the data they need to rely on the very same private company to supply them with it indefinitely, don't they? Again, "come on bro"...
Unfortunately people cannot 'get' their government to do much - and certainly not something as specific as this. Given that electoral leverage is a blunt and broad tool indeed. Also this is an EU level issue, not one pertaining to a national government.
Missing in this analysis is that advertisers did not, in fact, put ads on Google for free. While it's surely not part of the terms of service that Google is going to "back up" your ads indefinitely, the calculus employed in this thread is very, very wrong.
Google made huge profits on political ads and when it no longer suits them to spend paltry disk space on them, it's dropping them like a hot rock.
Stop pretending like Google was somehow offering a platform that people were freeloading on. Rather, these political ads were 100% the funding that sustains the platform and makes them and all their shareholders money.
These businesses built what have become the commons, which they retain control over through mechanisms like network effects, which allow them to make unaccountable decisions on the one hand and profit from rentierism on the other. It would probably be in people’s interest for their governments to seize these assets and socialize them as public utilities.
Then I suggest electing a government that's interested in doing what's in the people's interest.
Big tech effectively owns the government, at least in the US. If people really want to fix this mess we need mass strikes.
This is a very naive comment IMHO. There are very few governments in the world, if any, which could be trusted with something like that.
Meh, if your government then decides to make the country great again, it'll be deleting stuff that it thinks is not great...
Disappointing, but I don't see any foul play from Google like the headline kind of insinuates. Anyway, their product support is notoriously fickle, so why would you expect anything more of them? TLDR: a policy change at Google made the author's hobby more difficult or impossible because he didn't make any contingencies. Specifically, they deleted political ad history for the EU.
Where was the insinuation of foul play? I re-read the article with your criticism in mind, but it seems that they are basically just saying "this happened" and "these are the negative consequences."
The phrase "erased history" isn’t morally neutral. It carries a tone of judgment, even suggests censorship. If the author had wanted to appear neutral, they would have used "deleted" or "removed."
Ah! I read the article twice but hadn't considered the headline. I guess some editor wanted it sexed up a bit before it was published, the article itself is quite matter of fact.
Neutrality, and the appearance of neutrality, are different things.
This specific case is about people who have no idea they are interested in the data today (likely because they don't yet know it exists), needing it for research tomorrow. It's about how we should go about archiving public data rather than about whether somebody did a good job on their personal backups. And, if a company presents a data set as an accessible store of public information, what is a reasonable timeline for removing that service, especially when it is trivial for said company to maintain it for a reasonable sunset period. This is obviously extremely important information, and rug-pulling so quickly is borderline malicious.
Even if it were about personal information, I would still find your comment short-sighted, because companies position themselves as reliable, and people rely on them because of this. Even if it's not always smart to go out in a bad neighborhood late at night in a very short skirt, there are moments when pointing it out borders on pathological.
You say this like someone was talking about their own data, not someone else's. The OP is talking about a Google dataset of money spent on Google ads over time, not one the OP created or produced in any way. Do we even call it a "backup" when we're talking about preserving a third-party's data (possibly without their consent, possibly in a way they would consider a violation of their ToS)?
Still, certainly it's true as observable that we cannot trust Google to preserve anything. One reason some people don't realize this is because Google has spent years marketing itself as a trustworthy steward with an interest in doing things for the good of society. Another reason people don't do something about it, about every single Google dataset or media corpus that they might regret disappearing -- is because it's expensive and sometimes challenging to do so, Google sometimes tries to prevent it as a ToS violation, and nobody is funding it (or at least there is certainly not enough funding to preserve all of Google's datasets).
There are ways to address these things, but dismissing the people who notice the problem as idiots is probably not one that will lead to addressing them.
The post is important because the more people who see it, the more people get reminded not to rely on Google without backups.
I would suggest a slight modification to your idea: people should build local repositories for their stuff, use the mega corp for whatever it is good at NOW, TODAY, and keep a smirk ready for when the mega corp blunders off and the whole recovery is click, click, done, because if there is one thing the web is good at, it's signing you up for the newest flash later today.
[flagged]
https://www.seuros.com/blog/aws-deleted-my-10-year-account-w...
Didn't Google say that their mission is to organize the world's data? They definitely created the expectation, and no matter what the T&C say, people will hold Google to the standards the company itself created via its marketing.
I blame companies for being hostile towards backups. An encrypted database I can't access without the app is not it.
Why can't it be the fault of both parties?
Because you’re responsible for your own data.
It’s no one else’s job unless you pay them, which is you taking a backup.
And when I say “a backup” I mean multiple backups different media different locations, different backup mechanisms depending on how much you really don’t want to lose the data.
You seem to have misunderstood what is going on here.
The data here is Google's data. They sell ads to political parties. They then make details of those ad sales available to researchers to let them understand how political ad spending is happening in different countries.
This researcher is complaining that Google appear to have changed their mind about making that data available.
A very good backup has regular restore tests and the 3-2-1 system. An awesome backup has very regular restore tests and the 3-2-1-1 system.
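For anyone unfamiliar with the shorthand, here is a small sketch of the 3-2-1 rule as it is usually stated: at least three copies of the data (counting the original), on at least two different kinds of media, with at least one copy offsite. The extra "1" in 3-2-1-1 is commonly read as one offline or immutable copy; that reading is my assumption, and the `Copy` type and field names are made up for illustration. It says nothing about restore tests, which you still have to do by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str            # e.g. "ssd", "tape", "object-storage"
    offsite: bool          # stored away from the primary location
    offline: bool = False  # air-gapped or immutable copy


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Three copies, two distinct media, at least one offsite."""
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


def satisfies_3_2_1_1(copies: list[Copy]) -> bool:
    """3-2-1 plus at least one offline or immutable copy."""
    return satisfies_3_2_1(copies) and any(c.offline for c in copies)


if __name__ == "__main__":
    plan = [
        Copy("laptop", "ssd", offsite=False),
        Copy("nas", "hdd", offsite=False),
        Copy("cloud bucket", "object-storage", offsite=True),
    ]
    print(satisfies_3_2_1(plan))    # three copies, three media, one offsite
    print(satisfies_3_2_1_1(plan))  # no offline copy in this plan
```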
Don’t forget the backup system that backs up the AWS data to AWS using AWS backup systems that’s also a good approach.
Or the decision not to backup your source code because “it’s on GitHub”. That’s also a ripper backup approach.
That seems like an extremely poor approach. What if someone at Amazon accidentally deletes your account?
> Why can't it be the fault of both parties?
It can be, but it isn't in this case.
[flagged]
The audacity of people to think others should store data for them indefinitely is unbelievable. You had years to back this up if you really cared about it, and most probably don’t. Cry more.
And history isn’t “erased”. It still happened. It’s up to you to remember it.
But they stored Google links! /s