Too much efficiency makes everything worse (2022)

(sohl-dickstein.github.io)

725 points | by feyman_r 19 hours ago ago

307 comments

  • refibrillator 15 hours ago ago

    I recognize the author Jascha as an incredibly brilliant ML researcher, formerly at Google Brain and now at Anthropic.

    Among his notable accomplishments, he and coauthors mathematically characterized the propagation of signals through deep neural networks using techniques from physics and statistics (mean field theory and free probability theory), leading to arguably some of the most profound yet under-appreciated theoretical and experimental results in ML of the past decade. For example, see “dynamical isometry” [1] and the evolution of those ideas, which were instrumental in achieving convergence in very deep transformer models [2].

    After reading this post and the examples given, in my eyes there is no question that this guy has an extraordinary intuition for optimization, spanning beyond the boundaries of ML and across the fabric of modern society.

    We ought to recognize his technical background and raise this discussion above quibbles about semantics and definitions.

    Let’s address the heart of his message, the very human and empathetic call to action that stands in the shadow of rapid technological progress:

    > If you are a scientist looking for research ideas which are pro-social, and have the potential to create a whole new field, you should consider building formal (mathematical) bridges between results on overfitting in machine learning, and problems in economics, political science, management science, operations research, and elsewhere.

    [1] Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks

    http://proceedings.mlr.press/v80/xiao18a/xiao18a.pdf

    [2] ReZero is All You Need: Fast Convergence at Large Depth

    https://arxiv.org/pdf/2003.04887

    • tablatom 13 hours ago ago

      Interesting timing for me! Just a couple of days ago I discovered the work of biologist Olivier Hamant who has been raising exactly this issue. His main thesis is that very high performance (which he defines as efficacy towards a known goal plus efficiency) and very high robustness (the ability to withstand large fluctuations in the system) are physically incompatible. Examples abound in nature. Contrary to common perception evolution does not optimise for high performance but high robustness. Giving priority to performance may have made sense in a world of abundant resources, but we are now facing a very different period where instability is the norm. We must (and will be forced to) backtrack on performance in order to become robust. It’s the freshest and most interesting take on the poly-crisis that I’ve seen in a long time.

      https://books.google.co.uk/books/about/Tracts_N_50_Antidote_...

      • vlovich123 9 hours ago ago

        I don’t think it’s smart to proactively backtrack without being very careful. One thing that’s needed is for corporate death to be allowed to occur. Right now the downsides of risky behavior get bailed out once the risk is large enough: the companies that fail aren’t robust and the ones that don’t fail are, but bailouts let non-robust companies keep going. Otherwise “robustness” is a property without a measure, which means you’ll get robustness theater, where actions are taken in the name of being robust but at best make no actual difference and at worst make things worse.

        As for society itself being robust, it’s a much harder property. Being robust is nice but no one actually wants to live in a metered society where there’s insufficient resources - they’d generally rather kill for resources greedily and let others fail without helping them. That’s why socialized healthcare struggles - while it guarantees a minimum of care for everybody, the care provided has longer wait times and most people are not willing to wait their turn.

        • wongarsu 7 hours ago ago

          In a free market economy we shouldn't demand robustness, we should create a system that promotes and rewards robustness. A strict commitment against bail-outs would certainly be part of that. Companies (and private people) can decide to lower their risk exposure (at the cost of efficiency/profit) or take out insurance against risks. And if they go the insurance route they have to assess how likely their insurance is to go insolvent at the next insurance event. That's how you reward those that are actually resilient.

          Healthcare is more complicated. It can never work as an efficient free market since nobody goes comparison shopping for the hospital with the best value-for-money when they have a car crash. That's why socialized healthcare achieves much better results per dollar spent. But it's often hamstrung by attempts at efficiency.

          I think a better societal example is disaster relief: helping people back up after they have been hit by a hurricane is the humane thing to do, but how much is that encouraging people to settle in high risk areas with insufficient precautions?

          • CuriouslyC 7 hours ago ago

            That solution is never going to work when black swan events occur on the order of every 5-10 years and executive vision is focused on the next quarter with little concern paid to anything outside the next 2-3 years. Nobody is going to want to give up short term performance to mitigate risks that probably won't manifest until after they've left for a better job.

            • LorenPechtel an hour ago ago

              Yeah, that's the real problem. Too much efficiency in the short term.

              My idea on working around this: for any business with actively traded stock there is a salary cap, say $1m/yr *per year*. You want to pay that guy $10m/yr? No, you pay him $1m and he gets 9 sets of shares that are worth $1m now, but they will be delivered one a year. Next year, same thing, you give him $1m, one set of shares from the previous year is delivered to him, he's got 9 new sets coming. So long as you have such shares forthcoming you are not permitted to engage in any trade where you would gain from the stock going down. If you do so inadvertently (say, investing in a fund that shorts the stock) any income from that is taxed at 100%.

              The idea is to make your top people care about the long term prospects of the company, not merely the prospects of their area for whatever time they're in charge of it.
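
              To make the schedule concrete, here is a toy sketch using the $1m cap / $10m package numbers above (illustrative Python, numbers purely hypothetical):

                # Hypothetical deferred-pay schedule: a $10m/yr package under a
                # $1m/yr cash cap becomes $1m cash plus nine share grants, each
                # worth $1m at grant time, delivered one per year afterwards.
                CASH_CAP = 1_000_000
                PACKAGE = 10_000_000
                DEFERRED_GRANTS = (PACKAGE - CASH_CAP) // CASH_CAP  # 9 grants per year

                outstanding = []  # years remaining until each undelivered grant vests
                for year in range(1, 13):
                    outstanding = [y - 1 for y in outstanding]          # age existing grants
                    delivered = sum(1 for y in outstanding if y == 0)   # grants that came due
                    outstanding = [y for y in outstanding if y > 0]
                    outstanding.extend(range(1, DEFERRED_GRANTS + 1))   # this year's 9 new grants
                    print(f"year {year:2d}: cash $1m, grants delivered {delivered}, "
                          f"grants still at risk {len(outstanding)}")

                # After ~10 years this reaches a steady state: 9 grants delivered per year
                # (so total comp is back to ~$10m/yr at grant-time prices) while 45
                # undelivered grants stay tied to the stock price - the long-term
                # exposure the proposal is aiming for.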

            • kortilla 5 hours ago ago

              That solution is how it already works for the vast majority of companies in the US.

              “Too big to fail” is a meme that only applied to a tiny handful of companies during the financial crisis. Take a look at SVB for how fast a stalwart huge bank can implode with zero fucks given by the government.

              • collingreen 4 hours ago ago

                By "zero fucks given by the government" do you mean the government got involved, effectively bought the bank, and took responsibility for 100% of deposits (most of which were the balances of startups, ie venture capital investments)?

              • jppope 4 hours ago ago

                Pretty sure Boeing should have failed 3 times by my count.

            • wongarsu 6 hours ago ago

              5-10 years is a perfectly normal investment horizon, and in the end investors are the ones electing the CEO and setting goals and rewards for the executive. If betting on the long term is a winning strategy companies absolutely have the means to do that. But right now it usually isn't.

            • WillPostForFood 4 hours ago ago

              Businesses won't plan long term or for black swan events if they don't have to; it is rational not to if they know a bailout is coming.

              • lukeschlather an hour ago ago

                Businesses won't plan for black swan events when the people operating them have enough other wealth that the death of the company doesn't pose a serious problem for them. When CEOs make enough in a year to retire, there's no need to worry about a potential catastrophic failure next year.

          • bumby 7 hours ago ago

            >private people) can decide to lower their risk exposure

            I think the complexities of modern societies make it too difficult to measure this risk adequately. We just don’t have the bandwidth to think about the second-and-third order effects for every social/financial interaction we encounter. And people are generally very poor at estimating high-consequence/low-probability events. This means people will often take very outsized risks without realizing it; when bad things happen it creates an unstable society. I don’t think we’ve evolved to personally manage all the risks in a large, complex society, and farming those risks out to institutions seems to be the current way most societies have decided to mitigate them.

            • Zach_the_Lizard 4 hours ago ago

              >...farming those risks out to institutions seems to be the current way most societies have decided to mitigate those risks

              Unfortunately, those institutions --be they governments, insurance companies, UL Labs, banks, venture capitalists, etc.--also need to be vetted.

              Even when staffed with impeccably well credentialed and otherwise highly capable people, their conclusions may be drawn using a different risk framework than your own.

              The risk that they mitigate may even be the risk that you won't vote for them, give them money, etc.

              There is also the risk of having too little risk, which can be no less a catastrophe than too much risk. The balloon may not pop, but it may never be filled.

              • bumby 3 hours ago ago

                I don’t think anyone reasonable is advocating blind faith in institutions (possibly with the exception of religious institutions). They need to be transparent and also strive to reflect the values (risk and otherwise) of their constituents.

          • nradov 6 hours ago ago

            Patients have time to shop for most healthcare services. Only a small fraction of healthcare spending is for emergencies. The highest cost stuff is mostly elective procedures. If you need a colonoscopy or hip replacement then you have time to shop around.

            Socialized healthcare has its advantages and is probably more cost effective on average. But we also see affluent Canadians coming to the USA as medical tourists and paying cash for MRI scans in order to avoid the queues back home.

          • whatshisface 7 hours ago ago

            I don't see why people can't comparison shop for hospitals before they get in a car crash. Unless I am literally unconscious I would go to the hospital in my area that I trust the most, and I have plans for which urgent care, clinics and hospitals I would take someone else to if they needed a driver.

            In fact I think a pretty small fraction of patients arrive at the ER unconscious.

            • RandomLensman 7 hours ago ago

              How would you develop the "trust" and why would it be correct? How would you diagnose yourself or others before selecting a hospital if different hospitals are trusted for different things? How do you balance urgency against different trust levels if the hospitals are not all the same distance away?

              • mikeyouse 7 hours ago ago

                It also ignores that huge swaths of the country have no choice at all and the only hospital within a hundred miles is only viable due to huge Federal subsidies. We’ve been helping a close family member navigate that scenario and sure, he could vote with his dollars, but it would involve a three hour drive to a neighboring state for an 80-yr old. I’d rather just enforce minimum quality standards on everyone like most other civilized countries do, rather than relying on “the free market”, which so far in my experience has just led to PE goliaths swallowing entire health systems to focus on bill collection and union busting.

                • nradov 5 hours ago ago

                  CMS does enforce minimum clinical quality standards on hospitals (at least those that accept Medicare). The problems in areas without meaningful competition tend to be more around shortages of qualified practitioners, high prices, and abusive billing policies.

                • whatshisface 6 hours ago ago

                  I can't imagine anyone would object to minimum quality standards for anything receiving federal subsidies.

                  • collingreen 4 hours ago ago

                    edit: didn't realize I was feeding a troll. Feel free to ignore.

                    I expect the objections are in how quality is measured and enforced.

                    It reminds me of the education system in the US - most people (Project 2025 aside) think it's good to have a public education system; having a pipeline of skilled workers makes it easier to build an economy filled with a diverse set of businesses.

                    However, the attacks start to fly when there is disagreement about who should be allowed to teach, how they should be measured, and how they should be paid.

                    • whatshisface 3 hours ago ago

                      Settling everyone's differences about rural medical subsidies might be a good stepping stone to an NHS.

              • whatshisface 6 hours ago ago

                You could ask the same questions about grocery shopping or buying a PC.

                • RandomLensman 6 hours ago ago

                  A mis-assessment there might be far less consequential and those also do not require a medical diagnosis before making a decision where to go.

                  • whatshisface 4 hours ago ago

                    I've never needed a medical diagnosis to decide between calling my GP and going to an urgent care. It's just a bit surreal to hear everyone else say my ordinary survival skills are impossible and more than could be asked of anybody!

                    • RandomLensman 3 hours ago ago

                      You said you have a hospital selected that you trust (by whatever your metric is). Hospitals tend not to be equally good at all things, so trust should probably be differentiated - how do you then assess yourself as a patient to decide where to go? And if you do not differentiate the trust any further than a single hospital, regardless of what the issue is: why is that sufficient?

                      I think it is fine to have some preferences for a hospital, but not sure how much benefit that confers outside of some narrow situations.

                      • whatshisface 3 hours ago ago

                        Simply replace hospital with any other service, take your own answers and then translate them back. In economic terms, I researched medical facilities until the expected marginal benefit of the information fell below the marginal cost. There are a lot of reasons to reform the US healthcare system, but you can't argue that consumer choice is too complex to be realized.

                        • RandomLensman 3 hours ago ago

                          I don't know the US system well enough to say much about it.

                • unethical_ban 4 hours ago ago

                  Just to be clear: You're asserting that the average citizen

                    * has the same capacity to research an unknown number of medical procedures and the doctors performing them as they do researching onion prices or CPU specs
                  
                    * faces a similar scale of consequences when failing to properly analyze medical procedures as they do when they fail to properly price-compare onions or PC services
                  
                    * has the same freedom of choice to "purchase their preference" in an emergency, life-threatening situation as they have when shopping for PCs or groceries

                  • whatshisface 4 hours ago ago

                    Dietary and metabolic problems are an epidemic that outweighs malpractice in terms of quality and quantity of life by more than two orders of magnitude - so yes, I am saying people face "shopping problems" of life or death magnitude every day.

        • WalterBright 8 hours ago ago

          The usual cycle for business in a free market is it appears young and fresh, lacking any parasites. It grows rapidly, displacing existing mature businesses. Then, it accumulates bureaucracy and parasites, becoming less and less efficient, strangled by bloat and inability to adapt, and slides into bankruptcy, replaced by the next generation of new businesses. The remains of the business are then reallocated to the next generation of businesses.

          (This is quite unlike the common view that businesses inevitably grow to take over the world.)

          I.e. business is much like a living organism.

          Problems set in when the government bails out failing businesses.

          Even worse are government "businesses". They are not allowed to fail, and the inefficiencies, parasites, corruption, grow and grow. When can you remember a government agency being abolished? Eventually, the government will collapse.

          • LorenPechtel 29 minutes ago ago

            Not just government agencies. Everybody wants their finger in the pie to justify their job. And every politician wants to do things their voters like.

            I'm thinking of a reasonably recent article I saw that was talking about helping people navigate the 30+ assistance programs they might be eligible for. There's your problem right there--there should not be 30+ programs doing approximately the same thing! That's an awful lot of duplication of effort.

            Or look at what happens with business licenses. Two things I see:

            1) They want their $ from entities that shouldn't really be "businesses" in the first place. Around here an awful lot of licensed professionals have to have a "business" license--never mind that the nature of their work means they're inside some other entity that actually is reasonable to license. And that means a sales tax registration which has an annual minimum that such people almost certainly will never reach. (Sales tax includes use tax--but it's their office that actually engages in such transactions.)

            2) Businesses that perform their work on-site have to have business licenses for every license area of the metropolitan area they work in. Hey, guys, get together and define the superset of the rules of your area and allow someone to get a license that covers the whole area based on that superset.

            The Republicans are "right" in that we have far too many regulations. But they are very wrong in wanting to take an axe to them--most of the rules are individually sensible (and when they produce nonsense, it's often in situations where it's not worthwhile to special-case); the real problem is a horrible amount of duplicated effort and fingers in the pie. It's not chopping that's needed, it's organization.

          • normie3000 8 hours ago ago

            > When can you remember a government agency being abolished?

            In the UK the last I specifically remember is DFID, which shut down in 2020.

          • vlovich123 5 hours ago ago

            > Even worse are government "businesses". They are not allowed to fail, and the inefficiencies, parasites, corruption, grow and grow. When can you remember a government agency being abolished?

            In Commonwealth countries and the UK itself there are plenty of businesses called “crown corporations” which are owned by the government. A change in attitudes towards more liberalism led governments to deregulate and sell off bits and pieces, or entire corporations. Here are some Canadian examples:

            https://www.cbc.ca/news/politics/canada-post-it-innovapost-s...

            https://policyoptions.irpp.org/magazines/march-2024/mulroney...

            America is a relatively young country and has very peculiar philosophies sometimes not found in the rest of the world. Be very careful extrapolating an American perspective abroad or as capturing some elemental truth of the universe.

        • yuliyp 3 hours ago ago

          The problem with this is the principal-agent problem. The owners of the business don't want it to fail. The people working there want to make money. They generally live their lives and enjoy what money they make before the chickens come home to roost. It can be hard for the owners to realize the business is fragile before that fragility becomes apparent. In the meantime the people running the business have made a bunch of money, and potentially jumped to other jobs or retired or died.

          And the owners could have sold when the business was propped up by unknown fragility.

          Human lives are too short for these kinds of feedback loops to be all that effective.

        • fireflash38 8 hours ago ago

          One of the primary reasons people bail out companies is the knock-on effects. People losing jobs, etc. If society itself were robust enough to cover for people in those situations, we could let companies fail far more.

          • ghaff 8 hours ago ago

            There's a sentiment on here often that, even if a company has been essentially blown up by technology or market change, they should have transformed themselves to adapt. But that implies they probably needed to rototill their workforce in any case. At some point, you're probably better off just declaring bankruptcy and starting fresh or letting someone else do so.

          • karmonhardan 3 hours ago ago

            People lose jobs anyway, from the knock-on effects of the bailout. The bailout is more about controlling who loses jobs.

          • nradov 6 hours ago ago

            True, but for some companies there are also national security concerns. If we lose the domestic supply chain for certain items then that limits our freedom of action and leaves us vulnerable to supply disruptions.

            • AdrianB1 2 hours ago ago

              If you depend on a single company to supply certain items, you have a big problem already. Pouring money in that company will mostly help the executive bonuses, not the national security.

        • RandomLensman 7 hours ago ago

          Socialized healthcare seems to kind of work in many developed economies - where does it struggle and by what metrics regarding the health outcomes?

          • AdrianB1 2 hours ago ago

            I would love to see a good proof that it works; all the discussions, rumors, and anecdotal evidence suggest the contrary. I am open to learning the truth, with hard numbers.

            Very long waiting times are the first thing that comes to mind regarding such failures, with the UK and Canada in the top spots. It is not uncommon to die while waiting 1-2 years for a consultation to get diagnosed.

            • RandomLensman an hour ago ago

              I don't think you'd see those kinds of waiting times in Germany, for example (but Germany is also at the high end of healthcare spending as a % of GDP).

      • Terr_ 10 hours ago ago

        That reminds me of a study on "lazy" ants as a reserve/replacement labor force. [0]

        Maximizing efficiency in the short-term is not the same as maximizing survival in the long term.

        [0] https://www.sciencedaily.com/releases/2017/09/170908205356.h...

      • naasking 4 hours ago ago

        > Contrary to common perception evolution does not optimise for high performance but high robustness.

        It does both, e.g. if the environment is stable then fitness correlates with efficiency; if the environment is unstable, it correlates with robustness.

      • jfim 11 hours ago ago

        We've seen this during the COVID pandemic supply chain disruptions as well, where just in time supply chain management doesn't work as expected when operating in an abnormal environment.

        • LorenPechtel 12 minutes ago ago

          It's not just Covid. Look at the medical world. Generic products compete on price and there is little profit margin--not enough to warrant overprovisioning against problems. And meeting FDA requirements for new activities means new players can't just jump in the game. (And we sometimes see this done maliciously--control all active production of something and shove the price through the roof.) One factory has a problem and there can be huge problems downstream as a result.

          The only solution I see is for the FDA to include supply reliability in its determination of whether a system is acceptable.

        • soulofmischief 10 hours ago ago

          I'd always thought this conclusion was just a given.

          Highly optimized systems take full advantage of their environment and rely on a high degree of predictability in order to avoid redundant operations.

          These systems minimize the free energy in the system, and so very little free energy is available to counteract new forces introduced to the environment which act on the system.

          You'll find parallels in countless domains, since the very basis for learning and stabilization of a system revolves around becoming more or less sensitive to a given stimulus. Examples could be attention, supply chain economics, institutions, etc.

        • jimkleiber 8 hours ago ago

          I was gonna come here to say that, especially how there was a shortage of toilet paper. I remember reading it was because factories were so efficient that when people started using the toilet at home instead of the office, it was hard to switch the factories from making commercial to residential toilet paper. I think someone even made the pun of paper-thin margins.

      • bmsan 4 hours ago ago

        Unedited bullet points on a related topic (same prefixes are linear, different prefixes connect to the others, but I haven't decided where yet):

        >capital concentration increases

        >expectations for what capital owners can do with money increases

        >expectations exceed available capital

        >investment returns must increase (race to the top)

        >cooperation among capital owners must increase to get better returns

        >capital owning group begins to self-select and become less diverse, if this wasn't already caused by the background/personality required to accrue capital

        >investment theory converges on a handful of "winning" ventures

        >because this is where capital is flowing, workers are forced to divert to these ventures

        >competition increases, hyperspecialization increases

        >expertise in and sophistication of other areas begins to decline, causing quality decline, garnering less investment; feedback loop

        -----

        *debt cannibalizes future productivity

        -----

        )diversity in capital ownership and management increases likelihood of diversity in investment venture target

        )increased competition, increased likelihood that ventures will cover needs, decreased likelihood of overweighting in one area/overproduction

        )solution: capital redistribution. Perhaps globally

      • JeremyNT 4 hours ago ago

        The "slack" is important in an unstable environment because it allows for reallocation of resources without causing a system to fail.

        It's tempting to minimize waste, but excess capacity is required to adapt if things are evolving quickly.

      • rglullis 11 hours ago ago

        > We must (and will be forced to) backtrack on performance in order to become robust.

        This is something that Nassim Taleb and the people working on https://realworldrisk.com/ have been saying for decades already.

      • __MatrixMan__ 6 hours ago ago

        I don't know about the poly-crisis, but this does feel timely.

        I know I'd tolerate a digital experience of far lower fidelity (fewer pixels, for instance, or even giving up GUIs altogether) if I could get it in a way that doesn't break every time some far away person farts near a cloud console: A trade of performance for robustness.

      • 3abiton 7 hours ago ago

        The acceleration of knowledge is producing so much content that real gems pass by unnoticed. Thanks for pitching in!

      • bumby 7 hours ago ago

        There are also a lot of engineering examples where the goal is to optimize for reliability. I think the most common domain is marine platforms where it is prohibitively expensive to induct and repair (you have to send a team out by helicopter, for example).

        • nradov 6 hours ago ago

          And yet most large merchant ships are designed with a single engine, propeller, and rudder to optimize for cost instead of reliability. We have seen some spectacular failures of that approach recently, although it probably still makes sense in aggregate.

          A major mechanical casualty beyond what the crew can repair usually means a tow to a shipyard. Flying more engineers in by helicopter would seldom help, and often isn't feasible.

          • LorenPechtel 10 minutes ago ago

            The bridge that collapsed wasn't due to a single engine, propeller or rudder. It was due to a single electrical system. One intermittent electrical issue left the ship basically helpless even though all propulsion and steering was undamaged.

          • bumby 5 hours ago ago

            This is true, but different than the maritime platforms I was talking about. The ones that tend to focus on reliability-centered optimization are platforms used for drilling, not transport. Even then, you will see instances where they decide to optimize for cost/schedule (eg Deepwater Horizon). IMO, that is a company-cultural issue.

            Btw- reliability optimization doesn’t necessarily mean it is optimized to not fail. They are optimized to fail within some predetermined risk level. What that risk level should be is an entirely different discussion.

      • maxerickson 9 hours ago ago

        > Giving priority to performance may have made sense in a world of abundant resources, but we are now facing a very different period where instability is the norm.

        Why do you think this?

      • nradov 6 hours ago ago

        To a first approximation, humans have never lived in a world of abundant resources. That has mostly only applied to a minority of affluent people in developed countries. But resource abundance continues to improve on average worldwide.

      • bbor 7 hours ago ago

          His main thesis is that very high performance (which he defines as efficacy towards a known goal plus efficiency) and very high robustness (the ability to withstand large fluctuations in the system) are physically incompatible.
        
        …what about humans? We’re far more efficacious than any other animal, and far more capable of behavioral adaptation.

        Plus, isn’t “physically impossible” a computer science argument, not a biological one? Unless we’re using the OG “physis”==“nature”, I guess

    • thomasahle 13 hours ago ago

      I love the idea of ReZero, basically using a trainable parameter, alpha, in residual layers like this:

        Deep Network                  | x_{i+1} = F(x_i)
        Residual Network              | x_{i+1} = x_i + F(x_i)
        Deep Network + Norm           | x_{i+1} = Norm(F(x_i))
        Residual Network + Pre-Norm   | x_{i+1} = x_i + F(Norm(x_i))
        Residual Network + Post-Norm  | x_{i+1} = Norm(x_i + F(x_i))
        ReZero                        | x_{i+1} = x_i + α_i F(x_i)
      
      However, I haven't actually seen this used in practice. The papers we have on Gemma and Llama all still seem to be using layer norms.

      Am I missing something?
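
      For reference, the last row translates to something like the following - a minimal sketch of a ReZero-style block (PyTorch-like, class name is mine), with α initialized to zero so each block starts out as the identity:

        import torch
        import torch.nn as nn

        class ReZeroBlock(nn.Module):
            """x_{i+1} = x_i + alpha_i * F(x_i), with alpha_i starting at 0."""
            def __init__(self, dim):
                super().__init__()
                self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
                self.alpha = nn.Parameter(torch.zeros(1))  # the trainable residual gate

            def forward(self, x):
                # At init alpha == 0, so this is exactly the identity; the network
                # "grows into" each layer as alpha moves away from zero during training.
                return x + self.alpha * self.f(x)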

      • immibis 7 hours ago ago

        Isn't this already part of F?

        • aoeusnth1 5 hours ago ago

          Your sound system has a volume dial to turn up and down the gain of the track even though you could get the same effect by re-recording the track at a higher volume; isn’t that curious?

        • thomasahle 5 hours ago ago

          I should add that alpha is initialized to 0.

    • lubujackson 14 hours ago ago

      The exciting thing about this idea is that if you can correlate, say, economics with the workings of ML, then a computer program that you can run, revise, and alter can directly give you measurable data about these complex system interactions - interactions that have mostly existed as a platonic idea, since reality is too nuanced and multifaceted to validate the concepts formally.

      The implication is that there is some subset of logic sitting below economics that is provable and exact. That is a powerful idea worth pursuing!

      • nerdponx 13 hours ago ago

        This idea has been pursued several times in the past, and it always ends up producing lots of interesting academic results and no practical conclusions.

        It's certainly an interesting perspective on the development of complex systems. The idea that an economy can be somehow overfitted to its own incentives and constraints I don't think is entirely new, cf the Beer Game. But as a general concept, it's certainly not something that usually finds its way into policy discussion, beyond some very specific talk about reshoring of certain critical industries.

        However, I think the most important benefit of this perspective is going to be providing yet another counterargument against the Austrian economics death cult.

        • ahartmetz 12 hours ago ago

          It seems to me that something similar to Adam Smith happened to the Austrians: their ideas have been cherry-picked. According to German Wikipedia, their main things were / are a focus on individual preferences, marginal utility, and a rejection of mathematical modeling(!)

          There was also something about lower state expenditures (...taxes...) giving better results for the people - that's the one that seems to be very popular with rich people for some reason. Go figure.

          • jampekka 11 hours ago ago

            Austrian economics also rejects empirical assessment of its claims. Instead, universal truths are derived "logically" (formal logic banned, though) from "obviously true" axioms using a method called praxeology.

            It seems a lot like Scientology: the more you learn about it, the more bizarre it gets. And of course it's used to extract a lot of money for few benefactors.

            • naasking 2 hours ago ago

              Unlike scientology, Austrian economics made some important contributions to mainstream economic understanding.

      • amelius an hour ago ago

        Well if you can turn chatgpt into an intelligent actor in a simulated economy, and are able to run it at scale, I bet you can get some valuable insights.

    • mrfox321 7 hours ago ago

      More importantly, he invented diffusion models:

      http://proceedings.mlr.press/v37/sohl-dickstein15.pdf

    • salawat 15 hours ago ago

      >> If you are a scientist looking for research ideas which are pro-social, and have the potential to create a whole new field, you should consider building formal (mathematical) bridges between results on overfitting in machine learning, and problems in economics, political science, management science, operations research, and elsewhere.

      Translation to laymen: ML is being analogized to the mathematical structure of signaling between entities and institutions in society.

      The mathematician proposes that a problem which plagues one (overfitting in ML, the phenomenon by which a neural network's ability to generalize is degraded by overtraining, so that the functions it can emulate become tightly coupled to the training data) must plague the other.

      In short, there must be a breakdown point at which overdevelopment of societal systems or signaling between them makes things simply worse.

      I personally think all one need do is look at what would happen if every system were perfectly complied with to see we may already be well beyond that breakpoint in several industrial verticals.
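
      A tiny numerical illustration of the overfitting half of that analogy (toy data and numbers of my own choosing): as model capacity grows, the fit to the data you measured keeps improving while the fit to held-out data gets worse.

        import numpy as np

        rng = np.random.default_rng(0)

        # Noisy samples of a simple underlying function
        x = np.linspace(0, 1, 40)
        y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)

        # Every other point is held out as a "test set"
        x_train, x_test = x[::2], x[1::2]
        y_train, y_test = y[::2], y[1::2]

        for degree in (1, 3, 9, 15):
            coeffs = np.polyfit(x_train, y_train, degree)
            train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
            test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
            print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

        # Training error falls monotonically with capacity; held-out error typically
        # bottoms out and then climbs - the proxy decouples from the thing you care about.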

    • LarsDu88 15 hours ago ago

      Adding to my reading list!

  • whizzter 4 hours ago ago

    This has become a societal problem in Sweden during the past 20 or so years.

    1: Healthcare efficiency is measured by "completed tasks" per primary care doctor, so the apparatus is optimized for them handling simple cases: they often do some superficial checking and either send you home with some statistically correct medicine (aspirin/antibiotics) or punt the case to a specialized doctor if it appears to be something more complicated.

    The problem is that since there are now fewer of them (efficient!), they're more or less assembly-line workers and have totally lost the personal "touch" with patients that would give them an indication when something is wrong. Thus cancers, etc. are very often diagnosed too late, so even if specialized cancer care is better, it's often too late to do anything anyhow.

    2: The railway system was privatized. Considering the amount of cargo shipped it's probably been a huge success, but the system is plagued by delays because there is too little slack in the schedule to let late trains catch up, or even to do more than basic maintenance (leading to bigger issues).

    • EasyMark 2 hours ago ago

      I wish these were the biggest problems facing US train and healthcare industries.

  • t_mann 14 hours ago ago

    The argument rides on the well-known Goodhart's law (when a measure becomes a target, it ceases to be a good measure). However, it only puts it down to measurement problems, as in, we can't measure the things we really care about, so we optimize some proxies.

    That, in my view, is a far too reductionist view of the problem. The problem isn't just about measurement, it's about human behavior. Unlike particles, humans will actively seek to exploit any control system you've set up. This problem goes much deeper than just not being able to measure "peace, love, puppies" well. There's a similar adage called Campbell's law [0] that I think captures this better than the classic formulation of Goodhart's law:

    The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.

    The mitigants proposed (regularization, early stopping) address this indirectly at best and at worst may introduce new quirks that can be exploited through undesired behavior.

    [0] https://en.wikipedia.org/wiki/Campbell%27s_law
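
    For concreteness, the mitigants mentioned above look roughly like this in a toy gradient-descent setting (made-up data and hyperparameters): an L2 penalty for regularization, plus a held-out set to trigger early stopping.

      import numpy as np

      rng = np.random.default_rng(1)

      # Over-parameterized toy problem: 100 features, only 30 training examples,
      # and only the first 5 features actually matter.
      X_train, X_val = rng.normal(size=(30, 100)), rng.normal(size=(30, 100))
      true_w = np.zeros(100)
      true_w[:5] = 1.0
      y_train = X_train @ true_w + rng.normal(scale=0.5, size=30)
      y_val = X_val @ true_w + rng.normal(scale=0.5, size=30)

      w = np.zeros(100)
      lr, l2 = 0.01, 0.1              # step size and L2 (ridge) penalty strength
      best_val, patience = np.inf, 0

      for step in range(5000):
          grad = X_train.T @ (X_train @ w - y_train) / len(y_train) + l2 * w
          w -= lr * grad
          val_mse = np.mean((X_val @ w - y_val) ** 2)
          # Early stopping: halt when the held-out error stops improving, even
          # though the training error would keep shrinking if we continued.
          if val_mse < best_val - 1e-6:
              best_val, patience = val_mse, 0
          else:
              patience += 1
              if patience > 100:
                  print(f"stopped at step {step}, best held-out MSE {best_val:.3f}")
                  break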

    • Edman274 3 hours ago ago

      > Unlike particles, humans will actively seek to exploit any control system you've set up.

      Well, agents will. If you created a genetic algorithm for an AI agent whose reward function was the amount of dead cobras it got from Delhi, I feel like you'd quickly find that the best performing agent was the one that started breeding cobras. In the human case and in the AI case the reward function has been hacked. In the AI case we decide that the reward function wasn't designed well, but in the human case we decide that the agents are sneaky petes who have a low moral character and "exploited" the system.
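
      A deliberately silly sketch of that dynamic (every number invented): score agents purely on cobras delivered, and selection drifts toward the breeding strategy even though the stated goal was fewer cobras.

        import random

        random.seed(0)

        def hunt():            # wild cobras are scarce
            return random.randint(0, 3)

        def breed_and_kill():  # farmed cobras are not
            return random.randint(20, 30)

        # Start with mostly honest hunters
        population = [hunt] * 18 + [breed_and_kill] * 2

        for gen in range(6):
            # Reward: dead cobras delivered. Keep the top half, duplicate them.
            ranked = sorted(population, key=lambda policy: policy(), reverse=True)
            population = ranked[:len(ranked) // 2] * 2
            breeders = sum(p is breed_and_kill for p in population)
            print(f"generation {gen}: {breeders}/{len(population)} agents breed cobras")

        # The proxy (cobras delivered) climbs every generation while the actual goal
        # (fewer cobras in Delhi) gets worse - same reward hacking, different substrate.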

      • phainopepla2 an hour ago ago

        We have good reason to treat the humans as sneaky in your example, because they understand the spirit of the control system, and exploit the letter of it. The AI only understands the letter.

    • layer8 8 hours ago ago

      > Unlike particles, humans will actively seek to exploit any control system you've set up.

      But that’s only possible because the control system doesn’t exactly (and only) control what we want it to control. The control system is only an imperfect proxy for what we really want, in a very similar way as the measure in Goodhart’s law.

      Another variation of that is the law of unintended consequences [0]. There is probably a generalized computational or complex-systems version of it that we haven’t discovered yet.

      [0] https://www.sas.upenn.edu/~haroldfs/540/handouts/french/unin...

      • etiam 6 hours ago ago

        Disagree. Even if it were striving to regulate exactly the right thing in the first place, most of these issues occur in systems where no single actor could be expected to exert complete control, and the system could well be vulnerable anyway.

        Start working with a nice, clean, fully relevant system, end up modelling that plus the whole range of adversarial perturbations from agents of pretty high complexity.

        • layer8 5 hours ago ago

          I don’t exactly see how this is different from what I was describing.

    • EasyMark 2 hours ago ago

      I think a big portion of that is that humans don’t like to be viewed only as numbers, and will rebel against and manipulate any system you try to put the thumbscrews to them with. So to me the quote rings golden and isn’t fallible to much of an extent.

    • netcan 13 hours ago ago

      This is true, these "laws" are approximations and imperfect reductions.

      Which one is useful or descriptive will depend on the specific example.

      Optimizing ML VS Optimizing a social media algorithm VS using standardized testing to optimize education systems.

      There is no perfect abstraction that applies to these different scenarios precisely. We don't need that precision. We just need the subsequent intuition about where these things will go wrong.

      • onethought 10 hours ago ago

        I missed the citation on his education point. Has someone proved that “teaching to the test” leads to lower educational outcomes than not having tests?

        • schrectacular 3 hours ago ago

          Not a citation, but I believe it's a mediocritizing measure. For some teachers and some students, teaching to the test is probably better. I suspect more heavily concentrated in the bottom 50% of each group. For a subset of great teachers and great students, it's a detriment.

        • netcan 6 hours ago ago

          IDK. I don't think we can actually have a discussion about education where all statements are supported by indisputable evidence.

          There do happen to be citations for this question but I doubt any really clears an "indisputable evidence" standard. That's the nature of the field. Even if the whole discussion was evidence based and dotted with citations, we'd still be working with a lot of intuition and speculation.

        • ismailmaj 9 hours ago ago

          I saw some professors share as little as possible about their tests to make sure we truly understood the material; that sounds to me like a real-life use of a train/test split. It’s not far-fetched to think they employed this technique because teaching to the test didn’t work well by itself.

  • netcan 8 hours ago ago

    Another example of this approximate law is in exercise physiology.

    To a normal person, there are a lot of good proxy indicators of fitness. You could train sprinting. You could hop up and down. Squat. Clean and jerk.. etc.

    Running faster, hopping higher, squatting heavier... all indicators of increasing fitness... and success of your fitness training.

    Two points:

    1 - The more general your training methodology, the more meaningful the indicators. Ie, if your fitness measure is "can I push a car uphill," and your training method is sprinting and swimming... pushing a heavier car is a really strong indicator of success. If your training method is "practice pushing a car," then an equivalent improvement does not indicate equivalent improvement in fitness.

    2- As an athlete (say, clean and jerk) becomes more specialized... improvements in performance become less indicative of general fitness. Going from zero to "recreational weightlifter" involves getting generally stronger and musclier. Going from college to Olympic level... that typically involves highly specialized fitness attributes that don't cross into other endeavors.

    Another metaphor might be "base vs peak" fitness, from sports. Accidentally training for (unsustainable) peak performance is another over-optimization pitfall. It can happen when someone blindly follows "line go up." Illusory optimizations are actually just trapping you in a local maximum.

    I think there are a lot of analogies here to biology, but also ML optimization and social phenomenon.

    • bob1029 8 hours ago ago

      Clean & jerk is one of those movements that I would almost consider "complete". Especially if you are going to mix in variants of the squat.

      Not sure these are the best example. I don't know of anyone who can C&J more than their body weight for multiple repetitions who isn't also an absolute terminator at most other meaningful aspects of human fitness.

      Human body is one machine. Hormonal responses are global. Endurance/strength is a spectrum but the whole body goes along for the ride.

      • netcan 6 hours ago ago

        Perhaps. And you could probably test this but I would gamble that the principle still applies. IE, these weightlifters are probably also very capable (eg) shotputters because of all that weightlifting. But also... their shotput, sprinting and other tangential abilities probably peak at some point. From then on, they are mostly just getting stronger at clean and jerk.

        > Hormonal responses are global. Endurance/strength is a spectrum but the whole body goes along for the ride.

        This is true, and that is why most exercise is a general good for most people, and has similar physiological effects. However, at some point "specialization" (term of art), kicks in. At that point, a bigger clean and jerk no longer equates to a longer shot put.

        Fwiw... This isn't a point about exercise or how to exercise. Most people aren't that specialized or advanced in a sport and the ones who are have coaches. My point is that the phenomenon speculated to be broad in this post applies (I suspect) to physiology. Probably quite broadly. It's just easy to think about it in terms of sports because "training & optimization" directly apply.

        • travisjungroth 4 hours ago ago

          I agree with your overall point, but also the person you’re replying to. I think that clean and jerk may be the example that least supports your argument. If I had to optimize an athlete for one movement and then test them on 20, C&J would probably be my pick. Bench press would be lower down the list.

          This isn’t just nit picking exercises here. There are some measures to optimize for that lead to broader performance. They tend to be more complex and test all components of the system.

      • rootusrootus an hour ago ago

        IMO deadlift is another good one. If you could only do a single lift, I think you could make a pretty good case that it should be deadlift. Works damn near every muscle in your body, and gets your heart going pretty good too.

    • naasking 2 hours ago ago

      I think that's more an indication that "general fitness" is not a rigorous metric. It's fine as a vague notion of "physical ability" up to a point, and past that it loses meaning because improvements in ability are task-specific and no longer translate to other tasks.

  • thomassmith65 3 hours ago ago

    I noticed an example of this rule at my local hardware superstore.

    Around a decade ago, the store installed anti-theft cages.

    At first they only kept high-dollar items in the cages. It was a bit of an inconvenience, but not so bad. If a customer is dropping $200+ on some fancy power tool, he or she likely doesn't mind waiting five minutes.

    But a few years later, there was a change - almost certainly a 'data-driven' change: suddenly there was no discernible logic to which items they caged and which they left uncaged. Now a $500 diagnostics tool is as likely to sit open on a shelf as a $5 light bulb is to be kept under lock and key.

    Presumably the change is a result of sorting a database by 'shrinkage' - they lock up the items that cumulatively lose the hardware store the most money, due to theft.

    But the result is (a) the store atmosphere reads as "so profit-driven they don't trust the customers not to steal a box of toothpicks" and (b) it's often not worth it for customers to shop there due to the waiting around for an attendant to unlock the cage.

    I doubt the optimization helped their bottom-line, even if it has prevented the theft of some $3 bars of soap.

    • tshaddox 3 hours ago ago

      It’s much more convenient to buy from Amazon than to try to find someone to unlock a glass case at the pharmacy. Especially since any pharmacy with glass cases for basic items will also be understaffed.

    • crazygringo 3 hours ago ago

      > they lock up the items that cumulatively lose the hardware store the most money, due to theft.

      > I doubt the optimization helped their bottom-line

      These seem to be in direct contradiction, unless you really think people have stopped going there because of it, to such an extent it outweighs the thefts. Especially when, if they stop going there, the competing local hardware superstore is probably doing the exact same thing. And remember, retail margins aren't usually huge -- for every item stolen, how many more do they need to sell to recoup the loss? Even if some people go to Amazon instead, it can still be worth it to avoid the theft.

      It's much more likely that it has indeed had the biggest impact on reducing theft, and that your "no discernible logic" simply reflects a lack of experience with these things -- theft often isn't about item value, but rather about reliable resellability. A single expensive niche power tool takes a long time to resell; laundry detergent and razor blades can be unloaded in quantity the same day. People go through detergent and razors a lot faster than light bulbs.

      I understand you dislike the inconvenience. But I really think you should be blaming the thieves or the factors behind theft, not the stores.

      • thomassmith65 2 hours ago ago

        I doubt the optimization helped their bottom-line; I do not know it.

        It is possible for a business to make money without customers actually liking the company: hey, it works for some of the FA*NG companies!

        That said, there is something that feels 'off' about management obsessing over shrinkage to the point that the shopping experience begins to suck. It's not a truckstop or a drug store in a bad area... it's a hardware superstore.

        With too much data, some manager can fixate on $3 screwdriver thievery and not think about the bigger picture: like shoppers finding the store to be a pain in the ass, and therefore no longer an attractive place to buy expensive riding-lawnmowers and floor jacks.

        A store can quantify lower sales figures, but it may not be obvious that the lower sales were related to the choice of 'caged vs uncaged' inventory.

        But again, I do not know. I only suspect.

  • orcim 5 hours ago ago

    It's an effect that exists, but the examples aren't accurate.

    An overemphasis on grades isn't from wanting to educate the population; obesity isn't from prioritizing nutrient-rich food; and increased inequality isn't from wanting to distribute resources based on the needs of society.

    Living a well-lived life through culture, cooking, or exercise doesn't make you more susceptible to sensationalism, addiction, or gambling. It's a lack of stimulus that makes you reach for those things.

    You can argue that academia enables rankings, industrial food production enables producing empty calories, and economic development enables greater inequality. But that isn't causation.

    It also isn't a side effect when significant resources specifically go into promoting the ideas that education is a private matter best used to educate the elite, that businesses aren't responsible for the externalities they cause, and that resources should be privately controlled.

    In many ways, it is far easier to have more public education, heavily tax substances like sugar, and redistribute wealth than it is to do anything else. That just isn't the goal. It used to be hard to get a good education, good food, and a good standard of living. And it still is. For the same reasons.

  • bilsbie 7 hours ago ago

    This is why I don’t like focusing on GDP. I think a quarterly poll on life satisfaction and optimism would be a better measure.

    If you’re curious about GDP: if my car breaks and I get it fixed, that adds to GDP.

    If a parent stays home to raise kids, that lowers GDP. If I clean my own house that lowers GDP. Etc.

    Unemployment is another crude metric. Are these jobs people want, or do they feel forced to work bad jobs?

    • jebarker 6 hours ago ago

      I'm not really disagreeing (as GDP is a crude measure), but rather thinking out loud. I don't think my individual life satisfaction and optimism should be influenced by nation-state economics to the extent that that's what they're optimizing for. The job of my government is to create the conditions for security, prosperity and opportunity without oppressing the rest of the world or destroying the planet. But it's up to me to find a satisfying life within that and that is possible within drastically different economic and social structures. Similarly, there's probably not a set of conditions that gives universal satisfaction to all citizens, so what summary statistics of life satisfaction and optimism do we optimize for?

    • vladms 4 hours ago ago

      I find it ironic that we are talking about ML, where we have vectors of thousands of quantities, and then we go and measure social/economic stuff with one (or a couple of) numbers.

      The general discourse (news, politicians, forums, etc.) over a couple of measures will always be highly simplifying. The discourse over thousands of measures will be too complex to communicate easily.

      I hope that at some point most people will implicitly acknowledge that the fewer the number of measures, the more probable it is that it's a simplification that hides stuff (ex: "X is a billionaire, means he's smart"; "country X has high GDP, means it's better than country Y with less GDP" and so forth).

      • durumu 4 hours ago ago

        > I hope that at some point most people will acknowledge implicitly that the fewer the number of measures the more probable is that it is a simplification that hides stuff.

        But the larger the number of measures, the more free variables you have. Which makes it easier to overfit, either accidentally or maliciously.

    • klysm 6 hours ago ago

      The point is it doesn’t matter what you measure

    • swed420 7 hours ago ago

      Agreed, and that goes for capitalism at large.

      Here's a rough outline for one proposed alternative to capitalism and the failed central planning alternatives of the past:

      https://jacobin.com/2019/03/sam-gindin-socialist-planning-mo...

      Some relevant snippets:

      > Though planning and worker control are the cornerstones of socialism, overly ambitious planning (the Soviet case) and overly autonomous workplaces (the Yugoslav case) have both failed as models of socialism. Nor do moderate reforms to those models, whether imagined or applied, inspire. With all-encompassing planning neither effective nor desirable, and decentralization to workplace collectives resulting in structures too economically fragmented to identify the social interest and too politically fragmented to influence the plan, the challenge is: what transformations in the state, the plan, workplaces, and the relations among them might solve this quandary?

      > The operating units of both capitalism and socialism are workplaces. Under capitalism, these are part of competing units of capital, the primary structures that give capitalism its name. With socialism’s exclusion of such private units of self-expansion, the workplace collectives are instead embedded in pragmatically constituted “sectors,” defined loosely in terms of common technologies, outputs, services, or simply past history. These sectors are, in effect, the most important units of economic planning and have generally been housed within state ministries or departments such as Mining, Machinery, Health Care, Education, or Transportation Services. These powerful ministries consolidate the centralized power of the state and its central planning board. Whether or not this institutional setup tries to favor workers’ needs, it doesn’t bring the worker control championed by socialists. Adding liberal political freedoms (transparency, free press, freedom of association, habeas corpus, contested elections) would certainly be positive; it might even be argued that liberal institutions should flourish best on the egalitarian soil of socialism. But as in capitalism, such liberal freedoms are too thin to check centralized economic power. As for workplace collectives, they are too fragmented to fill the void. Moreover, as noted earlier, directives from above or competitive market pressures limit substantive worker control even within the collectives.

      > A radical innovation this invites is the devolution of the ministries’ planning authority and capacities out of the state and into civil society. The former ministries would then be reorganized as “sectoral councils” — structures constitutionally sanctioned but standing outside the state and governed by worker representatives elected from each workplace in the respective sector. The central planning board would still allocate funds to each sector according to national priorities, but the consolidation of workplace power at sectoral levels would have two dramatic consequences. First, unlike liberal reforms or pressures from fragmented workplaces, such a shift in the balance of power between the state and workers (the plan and worker collectives) carries the material potential for workers to modify if not curb the power that the social oligarchy has by virtue of its material influence over the planning apparatus, from information gathering through to implementation as well as the privileges they gain for themselves. Second, the sectoral councils would have the capacity, and authority from the workplaces in their jurisdiction, to deal with the “market problem” in ways more consistent with socialism.

      > Key here is a particular balance between incentives, which increase inequality, and an egalitarian bias in investment. As noted earlier, the surpluses earned by each workplace collective can be used to increase their communal or individual consumption, but those surpluses cannot be used for reinvestment. Nationwide priorities are established at the level of the central plan through democratic processes and pressures (more on this later) and these are translated into investment allocations by sector. The sector councils then distribute funds for investment among the workplace collectives they oversee. But unlike market-based decisions, the dominant criteria are not to favor those workplaces that have been most productive, serving to reproduce permanent and growing disparities among workplaces. Rather, the investment strategy is based on bringing the productivity of goods or services of the weaker collectives closer to the best performers (as well as other social criteria like absorbing new entrants into the workforce and supporting development in certain communities or regions).

      ...

      > No one paid greater economic homage to capitalism than the authors of The Communist Manifesto, marveling that capitalism “accomplished wonders far surpassing Egyptian pyramids, Roman aqueducts, and Gothic cathedrals.” Yet far from seeing this as representing the pinnacle of history, Marx and Engels identified this as speaking to a new and broader possibility: capitalism was “the first to show what man’s activity can bring about.” The task was to build on this potential by explicitly socializing and reorganizing the productive forces.

      > In contrast, for Hayek and his earlier mentor von Mises, capitalism was the teleological climax of society, the historical end point of humanity’s tendency to barter. Hayek considered it a truism that without private property and no labor and capital markets, there would be no way of accessing the latent knowledge of the population, and without pervasive access to such information, any economy would sputter, drift, and waste talent and resources. Von Mises, after his argument that socialism was essentially impossible was decisively swept aside, turned his focus on capitalism’s genius for entrepreneurship and the dynamic efficiency and constant innovation that it brought.

      > Despite Hayek’s claims, it is in fact capitalism that systematically blocks the sharing of information. A corollary of private property and profit maximization is that information is a competitive asset that must be hidden from others. For socialism, on the other hand, the active sharing of information is essential to its functioning, something institutionalized in the responsibilities of the sectoral councils. Further, the myopic individualism of Hayek’s position ignores, as Hilary Wainwright has so powerfully argued, the wisdom that comes from informal collective dialogue, often occurring outside of markets in discussions and debates among groups and movements addressing their work and communities.

  • remram 17 hours ago ago

    Those are great points! Another related law is from queuing theory: waiting time goes to infinity when utilization approaches 100%. You need your processes/machines/engineers to have some slack otherwise some tasks will wait forever.
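
    For a rough illustration of how sharply this bites, here is a tiny sketch (assuming the textbook M/M/1 queue; the rates are made up):

      # Mean time a job spends in an M/M/1 queueing system: W = 1 / (mu - lambda).
      # As utilization rho = lambda / mu approaches 1, W blows up.
      service_rate = 1.0  # jobs completed per unit time (mu)
      for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
          arrival_rate = rho * service_rate  # lambda
          mean_time_in_system = 1.0 / (service_rate - arrival_rate)
          print(f"utilization {rho:.0%}: mean time in system = {mean_time_in_system:.0f}")

    Going from 50% to 99% utilization multiplies the mean time in system by about 50x, even though throughput looks better on paper.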

    • toasterlovin 16 hours ago ago

      I’m remembering reading once that cities are incredibly efficient in how they use resources (compared to the suburbs and rural areas, I guess), and, in light of your comment about waiting time, I’m now realizing why they’re so unpleasant: constant resource contention.

      • nuancebydefault 11 hours ago ago

        On the other hand, in cities people are queueing up and talking at the bakery counter. While people in the suburbs are listening to the radio while driving to the bakery. I guess you choose to live where you feel most comfortable.

      • naming_the_user 15 hours ago ago

        Amusingly this is something that I see as being a huge divide in rural and urban politics.

        Yes, it’s inefficient. Yes, some people want that!

        • fragmede 15 hours ago ago

          Right. Living is not an optimization problem.

          • badpun 14 hours ago ago

            At least not until the oil and other essential stuff runs out.

            • tambourine_man 9 hours ago ago

              Our problem is not that we are running out of stuff, but that we’re drowning in it.

          • exe34 14 hours ago ago

            what it means to not optimise though is that some people end up better off and many others are worse off.

            • wizzwizz4 5 hours ago ago

              And what it means to optimise is also that some people end up better off and many others are worse off.

              • exe34 3 hours ago ago

                Yes, the point is to find a balance so that the first number is maximised.

          • bmicraft 9 hours ago ago

            Sorry to put it so bluntly, but you're basically saying:

            "I don't care it the climate's fucked, I want to live away from civilization and drive 100 miles a day everywhere"

            Of course we shouldn't hyper-optimize everything, but the sooner people realize that our environment depends on not everyone getting exactly what they want whenever they want it, the better. Living in a (walkable) city is just one such concession towards the environment we ought to make, even if we don't "want" to.

            • naming_the_user 9 hours ago ago

              Or we could just compete with each other for resources as we have since forever. I’d rather do that than have no choice but to live in Kowloon.

              Just whack an externality tax on fossil fuels and things like cutting down wilderness, job done.

              • xerox13ster 5 hours ago ago

                Or we can stop acting like there’s only two options: living in wide-open fields with a clear horizon or the fucking walled city of Kowloon.

                Also, you have the mindset of a typical anti-social coastal elite who thinks “oh no big deal we can just raise the cost of living for all the poor rural types by sticking on a tax because I want to go LARP as a Victorian manor lord. And people don’t bend to my every whim immediately or live exactly like me so I want to be in total control of the 50 miles around me.”

            • llm_trw 9 hours ago ago

              If you think cities don't fuck the climate just as much as suburbs do I have a well you can carry water 40 flights of stairs from.

            • fragmede 2 hours ago ago

              That’s not remotely what I’m saying. I live in a city and don’t drive most days because I can walk and take public transit and there’s never any parking. What I’m saying is that in the bigger picture, approaching life as a set of problems to be optimized is the wrong way to approach life.

      • Jensson 14 hours ago ago

        The efficiency results in abundance not possible in less dense areas, you are waiting for things that are simply not available elsewhere.

        • nerdponx 13 hours ago ago

          Sort of. Compare doing laundry at the laundromat to doing laundry in your basement.

          • mgfist 7 hours ago ago

            They meant things like bars, restaurants, sports stadiums, concerts, plays. Things that require sufficient density to make economic sense.

            • kortilla 5 hours ago ago

              LA has multiples of all of those and is nearly entirely suburbs

    • georgeburdell 16 hours ago ago

      Yep, I used to work in a factory. Target utilization at planning time was 80%. If you over-predict your utilization, you waste money. If you under-predict, a giant queue of “not important” stuff starts to develop

      • scott_w 14 hours ago ago

        This reminds me of something my mother told me she aimed for when she ran her catering businesses: she always wanted 1 serving of pie leftover at the end of every day.

        If she had 0, she ran the risk of turning customers away and losing money. Any more than 1 is excess waste. Having just 1 meant she’d served every possible customer and only “wasted” 1 slice.

        • rzzzt 13 hours ago ago

          And then you can eat the pie as a reward.

        • immibis 7 hours ago ago

          Customers don't want to buy the last one.

      • eru 15 hours ago ago

        For some scenarios that's fine, and you can slash the queue whenever necessary.

        Eg at Google (this was ten years ago or so), we could always spend leftover networking capacity on syncing a tiny bit faster and more often between our data centres. And that would improve users' experience slightly, but it's also not something that builds up a backlog.

        At a factory, you could always have some idle workers sweep the floor a bit more often. (Just a silly example, but there are probably some tasks like that?)

        • 082349872349872 10 hours ago ago

          Unlike merchantmen, naval vessels were crewed at a level allowing for substantial attrition (bad attrition would be casualties; good attrition would be prize crews); I believe they traditionally (pace Churchill) had many, many activities which were incidental to force projection (eg polishing the brightwork) but could be used to occupy all hands.

          • eru 6 hours ago ago

            Yes. And, well, you can also always train more. Especially in the age of sail.

    • eru 17 hours ago ago

      You can add a measure of robustness to your optimization criteria. You can explicitly optimise for having enough slack in your utilisation to handle these unforeseen circumstances.

      For example, you can assign priorities to the loads on your systems, so that you can shed lower priority loads to create some slack for emergencies, without having to run your system idle during lulls.

      I get what the article is trying to say, but they shouldn't write off optimisation as easily as that.
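
      To make the load-shedding idea concrete, a toy sketch (the capacity and task names are made up):

        # Keep at most `capacity` pending tasks and drop the lowest-priority
        # ones, instead of letting the queue (and latency) grow without bound.
        def submit(pending, capacity, priority, task):
            pending = sorted(pending + [(priority, task)], reverse=True)  # highest priority first
            return pending[:capacity], pending[capacity:]  # (kept, shed)

        pending = []
        for priority, task in [(1, "batch report"), (9, "checkout"), (5, "email digest"), (2, "log rotation")]:
            pending, shed = submit(pending, capacity=2, priority=priority, task=task)
            for _, dropped in shed:
                print("shedding low-priority task:", dropped)
        print("still pending:", [t for _, t in pending])

      The high-priority work never waits behind the low-priority work; the low-priority work is explicitly dropped (and, per the above, ideally priced or compensated accordingly).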

      • remram 2 hours ago ago

        A task "shed" is one delivered with infinite latency. If that's fine for you then the theorem doesn't hurt you, do what's best for your domain. It's just something to be aware of.

      • hinkley 16 hours ago ago

        The problem is that people who agree to a task being low priority still expect it to be done in nine months and all of a sudden they become high priority if that doesn’t happen.

        So you’re fixing the micro economics of the queue but not the macro. Queues still suck when they fill up, even if they fill with last minute jobs.

        • eru 15 hours ago ago

          This totally depends on the system in question and what the agreements with your users are.

          Eg if you are running video conferencing software, and all of a sudden you are having bandwidth problems, you typically first want to drop some finer details in the video, and then you want to drop the audio feed.

          In any case, if you dropped something, you leave it dropped, instead of picking it back up again a few seconds later. People don't care about past frames.

          (However, queuing instead of outright dropping can still make sense in this scenario, for any information that's younger than what human reaction times can perceive.)

          Similarly in your scenario, you'd want to explicitly communicate to people what the expectations are. Perhaps you give out deep discounts for tasks that can be dropped (that's what eg some electricity providers do), or you can give people 'insurance' where they get some monetary compensation if their task gets dropped. (You'd want to be careful how you design such a scheme, to avoid perverse incentives. But it's all doable.)

          > So you’re fixing the micro economics of the queue but not the macro. Queues still suck when they fill up, even if they fill with last minute jobs.

          I don't know, I had pretty positive experiences so far when eg I got bumped off a flight due to overbooking. The airline offered decent compensation.

          Overbooking and bumping people off _improves_ the macro situation: despite the occasional compensation you have to pay when everyone who booked unexpectedly shows up, overbooking still makes the airline extra money, and via competition this is transformed into lower ticket prices. Many people love lower airfares, and have shown a strong revealed preference of putting up with a lot of the stuff that eg RyanAir pulls as long as they get cheap tickets.

    • vishnugupta 15 hours ago ago

      I feel that a 100% efficient system is not resilient. Even minor disruptions in subsystems lead to major breakdowns.

      There’s no room to absorb shocks. We saw a drastic version of this during the COVID-19 induced supply chain collapse. Car manufacturers had built such near-100% just-in-time manufacturing that they couldn’t absorb chip shortages, and it took them years to get back up.

      It also leaves no room for experimentation. Any experiment can only happen outside the system, not from within it.

      • kelseyfrog 13 hours ago ago

        This coincides with my headcanon cause of the business cycle.

        1. Firms compete

        2. Firms either increase their efficiency or die

        3. Efficient firms are more susceptible to shocks

        4. Firm shutdowns and closures are themselves shocks

        5. Eventually the system reaches a critical point where the aggregate susceptibility is higher than the aggregate of shocks that will be generated by shutdowns and closures

        6. Any external shock will cause a cascade

        There's essentially a "commons" where firms trade susceptibility for efficiency. Or in other words, susceptibility is pooled while the rewards for efficiency are separate.
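
        A deterministic toy sketch of that mechanism (all numbers invented): firms hold buffers against shocks, competition squeezes the buffers, and each failure is itself a small shock to the survivors.

          def cascade(buffers, shock, spillover):
              failed, total = 0, shock
              alive = sorted(buffers)
              while alive and alive[0] <= total:
                  alive.pop(0)        # the weakest surviving firm fails
                  failed += 1
                  total += spillover  # its failure shocks everyone else a little
              return failed

          buffers = [0.5 + 0.05 * i for i in range(100)]  # heterogeneous firms
          for squeeze in (1.0, 0.7, 0.4):  # competition trades buffer for efficiency
              n = cascade([b * squeeze for b in buffers], shock=1.0, spillover=0.02)
              print(f"buffers squeezed to x{squeeze}: {n} of 100 firms fail")

        The same unit shock takes out a minority of firms when buffers are fat, and the whole population once the buffers have been competed down.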

        • NeoTar 12 hours ago ago

          It sounds similar to how animal/plant species often work.

          A species will specialise for a niche, and outcompete a generalist. But when conditions change, the generalist can adapt and the specialist suffers.

        • Eddy_Viscosity2 8 hours ago ago

          But in practice we see that:

          1. Firms compete

          2. Some firms get ahead

          3. Accrued advantages to being ahead amplify

          4. A small number of firms dominate

          5. New competition is bought or crushed

          6. Dominant firms become less efficient in a competition-free environment

          • kelseyfrog 3 hours ago ago

            They aren't mutually exclusive. And, not xor.

        • yannis 10 hours ago ago

          Good analysis, but one also needs to look at the definition of `efficiency`: what is your definition of efficiency in this context?

          • kelseyfrog 5 hours ago ago

            The ability to do more with fewer resources. Profit is a great starting point when answering, "What is efficiency to a firm?"

      • tacitusarc 15 hours ago ago

        There is a fundamental tension between efficiency and resilience, you are completely correct. And yea, it’s a systems problem, not limited to tech.

        There is an odd corollary, which is that capitalistic systems, which reward efficiency gains and apply downward pressure to incentivize efficiency, deal with the resilience problem by creating entirely new subsystems rather than having more robust subsystems, which is fundamentally inefficient.

        • hyperadvanced 14 hours ago ago

          This is exactly the subthread of this conversation I’m interested in.

          Is what you’re saying that capitalism breaks down resilience problems into efficiency problems?

          I think that’s an extremely motivating line of thinking, but I’ll have to do some head scratching to figure out exactly what to make of it. On one hand, I think capitalism is really good at resilience problems (efficient markets breed resilience, there’s always an incentive to solve a market inefficiency), on the other (or perhaps in light of that) I’m not so sure those two concepts are so dialectically opposed

          • tacitusarc 2 hours ago ago

            To understand the effects, we first have to take a step back and recognize that efficiency and resiliency problems are both subsets of optimization problems. Efficiency is concerned with maximizing the ratio of outputs to inputs, and resiliency is concerned with minimizing risk.

            The fundamental tension arises because risk mitigation increases input costs. Over a given time horizon, there is an optimal amount of risk mitigation that will result in maximum aggregate profit (output minus input, not necessarily monetary). The longer the time horizon, the more additional risk mitigation is required, to prevent things like ruin risk.

            But here’s the rub: competition reduces the time horizon to “very very short” because it drives down the output value. So in a highly competitive market, we see companies ignore resiliency (they cannot afford to invest in it) and instead they get lucky until they don’t (another force at work here is lack of skin in the game). The market deals with this by replacing them with another firm that has not yet been subject to the ruinous risks of the previous firm. This cycle repeats again and again.

            Most resilient firms have some amount of monopolistic stickiness that allows them to invest more in resiliency, but it is also easy to look at those firms and see they are highly inefficient.

            The point is that the cycle of firms has a cost, and it is not a trivial one: capital gets reallocated, businesses as legal entities are created, sold, and destroyed, contracts have to be figured out again, supply chains are disrupted, etc. Often, the most efficient outcome for the system is if the firms had been more resilient.

            So there is an inefficient Nash equilibrium present in those sort of competitive markets.

      • maximus-decimus 12 hours ago ago

        I mean, car companies also just straight out cancelled their chip orders because they initially thought people would stop buying cars during COVID.

    • I_AM_A_SMURF 16 hours ago ago

      That tracks. I worked at a lot of places/teams where anything but a P0 was something that would never be done.

      • hinkley 16 hours ago ago

        Solution: everything is a P0!

        • jaggederest 15 hours ago ago

          Then you just get Little's law, which is not usually what people want. Preemption is usually considered pretty important... Much like preemptory tasks.

          • hinkley 15 hours ago ago

            No what you get is alcoholism. It was sarcasm.

            • jaggederest 15 hours ago ago

              Porque no los dos? The purpose of a beverage is what it does.

    • robertclaus 16 hours ago ago

      Interesting. My gut reaction is that this is true in reverse: infinite wait time leads to 100% utilization. However, I feel like you can also have 100% utilization with any queue length if input=output. Is that theory just a result of a first order approximation or am I missing something?

      • Aeolun 15 hours ago ago

        I think it comes from tasks not taking an equal amount of time, coming in at random, and not having similar priorities.

      • immibis 8 hours ago ago

        The average queue length is still infinity. Whatever the queue length happens to be at the start, it will stay there, and it could be any positive number up to infinity.

        Besides, angels can't really balance on pinheads.

    • amelius an hour ago ago

      Slack __or__ lower priority tasks.

    • appendix-rock 10 hours ago ago

      For some it may go without saying, but for the uninitiated, y’all should be reading https://en.wikipedia.org/wiki/The_Goal_(novel)

  • redsparrow 7 hours ago ago

    This makes me think of going to chain restaurants. Everything has been focus-grouped and optimized and feels exactly like an overfit proxy for an enjoyable meal. I feel like I'm in a bald-faced machine that is optimized to generate profit from my visit. The fact that it's a restaurant feels almost incidental.

    "HI! My name is Tracy! I'm going to be your server this evening!" as she flawlessly writes her name upside down in crayon on the paper tablecloth. Woah. I think this place needs to re-calibrate their flair.

  • LarsDu88 16 hours ago ago

    I was trying to remember where I had heard this author's name before.

    Invented the first generative diffusion model in 2015. https://arxiv.org/abs/1503.03585

    • Arech 13 hours ago ago

      And for me it was this ingenious 2019 paper co-authored by Stephan Hoyer and Sam Greydanus on doing structural optimization by employing a (constrained) neural network as a storage/modifier/tuner of the physical model describing the structure to optimize: https://arxiv.org/abs/1909.04240 Super interesting approach and very well written paper.

  • usaphp 16 hours ago ago

    I think it also applies when managers try to overoptimize the work process; in the end creative people lose interest and work becomes unbearable... a little chaos is necessary in a workplace/life imo...

    • hinkley 16 hours ago ago

      I kill my desire to work on a lot of side projects by trying to over optimize the parts I’m not going to like doing. I should just do the yucky parts and get past them. But at least nobody is paying me to spiral.

  • mch82 an hour ago ago

    Really interesting to learn about the ML perspective of the cost of localized efficiency. Local efficiency can also make things worse from a queueing theory perspective. Optimizing a process step that feeds a system bottleneck can cause queues to pile up, decreasing system-level productivity. Automating production of waste forces downstream processes to deal with added waste.

  • jrochkind1 8 hours ago ago

    Calling this the "strong version of Goodhart's law" was immediately brain-expanding for me.

    I have been thinking about Goodhart's law a lot, but realized I had been leaning toward focusing on human reaction to the metric as the cause of it; this reminded me it's actually fundamentally about the fact that any metric is inherently not an exact representation of the quality you wish to represent.

    And that this may, as OP argues, make Goodhart's law fundamental to pretty much any metric used as a target, independently of how well-intentioned the actors are. It's not a result of human laziness or greed or competing interests; it's an epistemic (?) result of the necessary imperfection of metric validity.

    This makes some of OP's more contentious "Inject noise into the system" and "early stopping" ideas more interesting even for social targets.

    "The more our social systems break due to the strong version of Goodhart's law, the less we will be able to take the concerted rational action required to fix them."

    Well, that's terrifying.

  • raister 17 hours ago ago

    This reminds me of Eli Goldratt's quote: "Tell me how you measure me, I will tell you how I behave."

    • tirant 12 hours ago ago

      Parallel to Munger’s “Show me the incentives and I will show you the outcome” which I think all of us have or will realize for ourselves at some point in life.

    • whack 17 hours ago ago

      Corollary: "If you do not measure me, I will not behave"

      • ryandv 16 hours ago ago

        This is coming very close to denying the antecedent, one of the most basic formal logical fallacies.

      • hinkley 16 hours ago ago

        No, I’m gonna do what I want to do. If you hire good people “what they want to do” is going to be what they think is right. Which may or may not be.

      • eventuallylive 17 hours ago ago

        Strictly speaking this is not the contrapositive and therefore the proof is yet to be seen. A sound corollary: "If I do not behave, it is because you did not measure me."

        • lotsofpulp 17 hours ago ago

          Is a contrapositive a corollary? P implies Q is logically equivalent to Not Q implies Not P.

          A corollary would be some other relation that can be deduced as a result of P implies Q, not simply a restatement of P implies Q.

          (Using the discrete math definition of imply, not the colloquial definition of imply).

          • moefh 16 hours ago ago

            Yes, a corollary can be just the contrapositive of something you just proved. Sometimes it's even more trivial, like a special case of a general theorem you proved.

            A very common use is to re-state something so it's in the exact form of something you said you'd prove. Another common case is to highlight a nice incidental result that's a bit outside the path towards the main result -- for example, it immediately follows (perhaps logically equivalent to) something that's been proven, but it's dressed in a way that catches the attention of someone who's just skimming.

  • kzz102 5 hours ago ago

    I think of efficiency as one example where naive economic thinking has poisoned common sense. Economists view inefficiency as a problem: because a healthy economy is efficient, inefficiency must be unhealthy. Any inefficient market is a "market failure". Efficiency is also the primary way a manager can add value. But the problem is that efficiency assumes the existence of metrics, and is indeed counterproductive if your metrics are wrong.

    • marcosdumay 5 hours ago ago

      > Efficiency is also the primary way a manager can add value.

      That's not right. The primary task of management is alignment.

      • kzz102 4 hours ago ago

        > That's not right. The primary task of management is alignment.

        Fair enough.. at least they think they can add value by improving efficiency.

        • marcosdumay 3 hours ago ago

          It's a way they can add value. It's far from their primary way, and it's a task that is not primarily done by management.

          But yes, there are plenty of managers that focus on it.

  • cb321 8 hours ago ago

    I do like it when researchers try to connect the deep ideas within their work to broader, more general systems, but caution is warranted alongside the optimism. This article is the kind of formal analogy that inspired/drove much of the marketing appeal of the Santa Fe Institute back in the 1980s. It's honestly always pretty fun, but the devil is, as usual, in the details here (as is usual in making anything "work", such as self-organized criticality [1], which if you enjoyed this article you will also probably enjoy!).

    As just one example to make this point more concrete (LOL), the article mentions uncritically that "more complex ecosystems are more stable", but over half a century ago in 1973 Robert May wrote a book called "Stability and Complexity in Model Ecosystems" [2] explaining (very accessibly!) how this is untrue for the easiest ideas of "complex" and "stable". In more human terms, some ideas of "complex" & "stable" can lead you astray, as has been appearing in the relatively nice HN commentary on this article here.
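
    For anyone curious, May's random-matrix point is easy to see numerically; a quick sketch (the parameters are mine, not May's):

      # A random "community matrix" is locally stable iff the largest real part
      # of its eigenvalues is negative. Adding species (S) with connectance (C)
      # held fixed pushes it past zero: more "complexity", less stability.
      import numpy as np

      rng = np.random.default_rng(0)

      def max_real_eigenvalue(n_species, connectance, strength=0.25):
          m = rng.standard_normal((n_species, n_species)) * strength
          m *= rng.random((n_species, n_species)) < connectance  # sparse interactions
          np.fill_diagonal(m, -1.0)                              # self-regulation
          return np.linalg.eigvals(m).real.max()

      for s in (10, 50, 200):
          print(f"{s:>3} species: max Re(eig) = {max_real_eigenvalue(s, connectance=0.3):+.2f}")

    As the species count grows with everything else held fixed, the largest eigenvalue drifts upward and eventually crosses zero, which is May's point that bigger, more connected random ecosystems are typically less stable, not more.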

    Perhaps less shallowly, things go off the rails fast once you have both multiple metrics (meaning no "objective Objective") and competing & intelligent agents (meaning the system itself has a kind of intrinsic complexity, often swept under the rug by a simplistic thinking that "people are all the same"). I think this whole topic folds itself into "Humanity Complete" (after NP-complete.. a kind of infectious cluster of Wicked Problems [3]) like trust/delegation do [4].

    [1] https://en.wikipedia.org/wiki/Self-organized_criticality

    [2] https://press.princeton.edu/books/paperback/9780691088617/st...

    [3] https://en.wikipedia.org/wiki/Wicked_problem

    [4] https://en.wikipedia.org/wiki/Demarcation_problem

  • otherme123 5 hours ago ago

    > Goal: Distribution of labor and resources based upon the needs of society

    > Proxy: Capitalism

    > Strong version of Goodhart's law leads to: Massive wealth disparities (with incomes ranging from hundreds of dollars per year to hundreds of dollars per second), with more than a billion people living in poverty

    Please, show me a point in all human history, pre-capitalism, when we had less than 90% of the global population living in poverty. Yes, there are 1 billion people (out of 8 billion) living in poverty today. But there were 2 billion (of 4.5 billion total) living in poverty as recently as 1980 (https://www.weforum.org/agenda/2016/01/poverty-the-past-pres...).

    Poverty has been steadily going down (https://www.weforum.org/agenda/2016/01/poverty-the-past-pres...) since we have had data. The first countries to get rid of recurrent famines were the same ones that first adopted capitalism. They were also the countries whose populations started having higher expectations than just living another day.

    Paraphrasing Churchill about democracy, "[capitalism] is the worst economic system except for all the other systems that have been tried from time to time".

    • jaco6 5 hours ago ago

      There is no such thing as capitalism. “The first countries to get rid of recurrent famine” were those that began using the steam engine and field enclosure. These are not “capitalism”, they are “capital”, in the jargon of the economist Marx, or, in modern parlance, technology. All of the parts of the world that are not starving are not starving due to their adoption of a variety of technologies, both mechanical and bureaucratic: fertilizer, machine tractors, and centralized governance using telephone lines and now internet. When people claim “capitalism” ended starvation, they ignore places like China and Russia which also ended famine despite adopting state ownership of farms. There was indeed famine during the transition period, but Russia ended famine by the 50s and China ended famine by the 70s, long before many “capitalist” countries in the 3rd world. That’s because the world isn’t about the phony field of “economics,” it’s about technology. Likewise, the other advancements of society that are claimed to be associated with “capitalism”—the end of smallpox and other old high-casualty diseases, the development of reliable air and trans-oceanic transport, instant global communications: all are due to the efforts of scientists working in labs mostly funded by governments and wealthy patrons, not “capitalism.” Capitalist enterprises almost never take major innovative risks. Even the latest glorious advancement that will doubtless be claimed by “capitalism,” Wikipedia-scraping chat bots, was developed by a non-profit.

      Everything is about technology—stop letting economists drag you into stupid, poorly formed debates using undefined terms like “capitalism.”

      • d0gsg0w00f 4 hours ago ago

        Technology is a large factor but not the core driver. It's individual human motivation that drives the efficiency of the system. When you centralize the function of human motivation into a governing body like socialism aims to do, the goal is that the central body can optimize the system. However, the system is too complex and always misses something that was never predicted. When you "outsource" those motive drivers back to the people with capitalism you let the system optimize itself.

  • smokel 14 hours ago ago

    I was listening to an episode of the "inControl" podcast [1], in which Ben Recht suggested that overfitting is not always well understood.

    Perhaps it is interesting to read his blogpost "Machine Learning has a validity problem" alongside this article.

    [1] https://www.incontrolpodcast.com/

    [2] https://archives.argmin.net/2022/03/15/external-validity/

  • projektfu 16 hours ago ago

    And that's leaving out Jevons paradox, where increasing efficiency in the use of some scarce resource sometimes/often increases its consumption, by making the unit price of the dependent thing affordable and increasing its demand. For example, gasoline has limited demand if it requires ten liters to go one km, but very high demand at 1 L/10km, even at the same price per liter.

    • hinkley 15 hours ago ago

      When people know the answer is always “no” they save their energy to plea for stuff they really can’t do without. You start saying yes and they’ll ask for more.

      The trick is as always to find out the XY problem. What they really need may be way easier for you to implement than what they actually asked for.

      • eru 15 hours ago ago

        Sometimes you can just embrace it, instead of looking for tricks.

        If you are in the business of selling any product or service, then it's great that finding a way to make it cheaper also generates more demand for you.

        • hinkley 15 hours ago ago

          I’m confused, because the “not trick” I’m talking about is the boondoggle created by giving people exactly what they ask for, making nobody happy and jamming up your throughput in the process.

          • eru 15 hours ago ago

            To be specific: if you can find a way to make fridges for half the previous cost, and you can sell them for three quarters the previous price, you don't want to talk people out of buying more fridges. In fact, them buying vastly more fridges is exactly what you want.

            • its_bbq 12 hours ago ago

              And not necessarily the long term result anybody wants

              • eru 9 hours ago ago

                Same happened with Walkmans, desktop computers, mobile phones, etc.

                It's pretty normal that people want less of stuff when it's expensive, and more when it's cheap.

      • projektfu 5 hours ago ago

        Yeah, but it is also second-order effects where the efficient use of a resource opens it up for more uses as well as for more exploitation. Perhaps this is most visible with farmland. Efficient use of water (center-pivot sprinkler) causes much more land to be arable, causing more use of that same water as well, depleting aquifers.

  • dooglius 17 hours ago ago

    Overfitting may be a special case of Goodhart's Law, but I don't think Goodhart's Law in general is the same as overfitting, so I don't think the conclusion is well-supported in general; there may be plenty of instances of proxy measures that do not have issues.

    I'll also quibble with the example of obesity: the proxy isn't nutrient-rich food, but rather the evaluation function of human taste buds (e.g. sugar detection). The problem is the abundance of food that is very nutrient-poor but stimulating to taste buds. If the food that's widely available were nutrient-rich, it's questionable whether we would have an obesity epidemic.

    • feyman_r 17 hours ago ago

      We realize now or at least in recent past, the value of true nutrient-rich food or a balanced diet.

      Carbohydrate abundance was likely important in moving people out of hunger and poverty, but excesses of the same kind of diet are reflected in obesity.

      My guess is that the calorie-per-gram-per-dollar of carbohydrates is still lower than that of fat and protein.

  • gond 11 hours ago ago

    “Though this phenomenon is often discussed, it doesn't seem to be named. Let's call it the strong version of Goodhart's law”

    I wonder why the author called it that when this seems to me clearly derived from Ross Ashby's law of Requisite Variety [1], predating Goodhart by 20 years. As I see it, it is not even necessary to read more meaning into Goodhart's law than there actually is. Requisite Variety is sufficient. Going by his resume, I strongly assume the author knows this.

    Russell Ackoff, building on countless others, put it into two sentences for which others needed two volumes:

    “The behaviour of a system is never equal to the behaviour of its parts. - It is the product of their interactions.“

    [1] https://en.m.wikipedia.org/wiki/Variety_(cybernetics)

    • appendix-rock 10 hours ago ago

      Love myself some cybernetics! All engineers are doing themselves a disservice by sitting here writing smooth-brained rants about “dumb MBAs making my job hard” instead of reading up on this field and understanding the true complexities that are inherent in people working together.

      • gond 9 hours ago ago

        Agreed. I wonder if this is still the aftermath of the chasm which resulted when Marvin Minsky et al. disowned Cybernetics, took some parts out of it and gave it a shiny new name.

        Especially Systems Theory in its second manifestation (Maturana, Luhmann, von Förster, Glasersfeld - and Ackoff) is extremely powerful, deep and, for reasons beyond me, totally overlooked.

        Have to say tho, most MBA‘s I encountered sadly never ever heard of Cybernetics or Systems Theory. :-(

        • pradn 6 hours ago ago

          How has Systems Theory changed how you think? Is there a good book you recommend on the topic? Thank you!

          • gond 24 minutes ago ago

            I made a mistake here, leading astray: it's second-order cybernetics, not second-order systems theory. I melded the two in my head; the reason is that (Social) Systems Theory according to Niklas Luhmann [1] incorporates several parts of second-order cybernetics (most of the people mentioned before show up there) and blends it all together to a point where I have difficulty distinguishing the separate parts.

            This is all not 'new' in the literal sense. Luhmann died in the late 90s. I tried to come up with a spot-on book by Luhmann, but that led to nothing. It's all spread out (papers and books) as far as I know. There is a (transcribed) lecture given in 1992, which may come close, but I couldn't find a translated version of it.

            One of the parts which hit home for me was the introduction of a differentiated form of distinctiveness, in turn partly enabled by the introduction of an observer, and observation in general. Heinz von Foerster gave an often cited sentence which I think fits the part, although Luhmann gurus will probably moan about my simplification: "Objectivity is the delusion that observations could be made without an observer."

            Apart from that, sorry for not being more helpful here.

            [1] https://en.wikipedia.org/wiki/Niklas_Luhmann

  • chrisweekly 5 hours ago ago

    Efficiency tends to come at the cost of adaptability. Don't put it on rails if it needs a steering wheel. So many enterprises suffer from extreme rigidity - often caused by optimizations that lead to local maxima.

  • ezekiel68 4 hours ago ago

    This certainly tugs at all the right levers of the intuition. Not sure how to "buck the trend" in any established organization/regime to adjust expectations according to the theory. Looks like this might need to be demonstrated in the wild at a new concern or as a turnaround job, where the leaders could have a strong influence on the culture (similar to how Jim Keller steered AMD and is now steering TensTorrent).

  • lynguist 11 hours ago ago

    I would claim in a completely informal way that the optimal degree of utilization is ln(2)=0.693, around 70%.

    This stems from the optimal load of self-balancing trees.

    A little bit of slack is always useful to deal with the unforeseen.

    And even a lot of slack is useful (though not always, as it is costly), as it enables doing things that a dedicated resource cannot do.

    On the other hand, no slack at all (so running at above 70%) makes a system inflexible and unresilient.

    I would argue for this in any circumstance, be it military, be it public transit, be it funding, be it allocation of resources for a particular task.

    • Tade0 10 hours ago ago

      I would put it at

        1 - e^(-1) ~= 0.6321
      
      As e^x is a commonly occurring curve and at that point its derivative goes below 1, meaning from that point on it's diminishing returns.
      • LUmBULtERA 10 hours ago ago

        I was thinking about how this train of thought could connect to our daily lives. As a family with a toddler, if we fill up too much of our schedule/time in a day, a perturbation to the schedule can break everything. If instead we fill up 63-70% of the schedule and build in Flex Time, we're good!

  • tpoacher 15 hours ago ago

    There was no need to invent a new law named "strong version", it already exists: Campbell's law.

    The subtle difference between the two being exactly what the author describes: Goodhart's law states that metrics eventually don't work, Campbell's law states that, worse still, eventually they tend to backfire.

  • slashdave 8 hours ago ago

    I am surprised that the author left out another mitigation: building solutions (models) that are constructed to be more transferable (amenable to out-of-domain data). For example, in machine learning, using physics-informed models. Perhaps this is simply a sign that the author is a proponent of generic deep learning.

    • yldedly 5 hours ago ago

      Most people in ML, even if they are very proficient, don't understand why models should generalize out of domain. They just don't think about it.

  • zmmmmm 12 hours ago ago

    Maybe I'm misunderstanding this but this doesn't seem like an accurate explanation of overfitting:

    > In machine learning (ML), overfitting is a pervasive phenomenon. We want to train an ML model to achieve some goal. We can't directly fit the model to the goal, so we instead train the model using some proxy which is similar to the goal

    One of the pernicious aspects of overfitting is that it occurs even if you can perfectly represent your goal via a training metric. In fact it's sometimes even worse, as an incorrect training metric can indirectly help regularise the outcome.

    • practal 11 hours ago ago

      You might be misunderstanding here what the "goal" is. Your training metric is just another approximation of the goal, and it is almost never perfect. If it is perfect, you cannot overfit, by definition.
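
      A minimal sketch of what I mean (the "goal" here is just a sine curve I made up; the proxy is the loss on a small noisy sample of it):

        import numpy as np

        rng = np.random.default_rng(0)
        x_train = np.linspace(0, 1, 10)
        y_train = np.sin(2 * np.pi * x_train) + 0.2 * rng.standard_normal(10)  # proxy data
        x_goal = np.linspace(0, 1, 500)
        y_goal = np.sin(2 * np.pi * x_goal)                                    # the actual goal

        for degree in (1, 3, 9):
            coeffs = np.polyfit(x_train, y_train, degree)  # optimize the proxy harder and harder
            proxy_loss = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
            goal_loss = np.mean((np.polyval(coeffs, x_goal) - y_goal) ** 2)
            print(f"degree {degree}: proxy loss {proxy_loss:.4f}, goal loss {goal_loss:.4f}")

      The highest-degree fit drives the proxy loss toward zero while the goal loss stops improving or gets worse: the training metric was only ever an imperfect stand-in for the goal, and past some point optimizing it harder diverges from the goal. That is the overfitting the article is talking about.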

  • efitz 2 hours ago ago

    I am skeptical of the analogy to overfitting, although I understand where the author is coming from and agree with the sentiment.

    The basic problem is stupid simple. Optimizing a process for one specific output necessarily un-optimizes for everything else.

    Right now much of commerce and labor in the United States is over-optimized, to the detriment of humans, because tech businesses are optimizing for specific outcomes (productivity, revenue, etc) in a way that ignores the negative impacts on the humans involved.

    The optimizations always turn into human goals, eg my manager needs to optimize for productivity if they want a bonus (or not get optimized out themselves), which means they need to measure or estimate or judge or guess each of their employees’ productivity, and stupid MBA shit like Jack Welch’s “fire the lowest 10% every year” results in horrible human outcomes.

    Sure there are people who need to be fired, but making it an optimization exercise enshittified it.

    Same for customer service. Amazon wants to optimize revenue. Customer service and returns are expensive. Return too many things? You’re fired as a customer.

    Call your mobile provider’s customer service too often? Fired.

    Plus let’s not staff customer service with people empowered to do, well, service. Let’s let IVRs and hold times keep the volumes low.

    All anecdotes but you’ve experienced something similar often enough to know it is the rule, not the exception, and it’s all due to over-optimization.

  • whatever1 15 hours ago ago

    When we optimize we typically have a specific scenario in our head. With the proper tools one can probably make the mathematically optimal decisions to deal with this exact scenario.

    However: 1) this exact scenario will likely never materialize, and 2) you have no good quantification of the scenario anyway due to noise/biases in measurements.

    So now you have optimized for something very specific, and nature throws you something slightly different, and you are completely screwed because your optimized solution is not flexible at all.

    That is why a more “suboptimal” approach is typically better and why our stupid brains outperform super fancy computers and algorithms in planning.

  • leoc 8 hours ago ago

    Related: "Dodo-Lean" by Darrell Mann https://www.darrellmann.com/dodo-lean/ , about systems which have been optimised into fragility.

  • tikkun 9 hours ago ago

    Makes me think of: some of Taleb's ideas about just-in-time manufacturing (no slack eg covid supply shocks)

    https://www.lesswrong.com/posts/yLLkWMDbC9ZNKbjDG/slack

    Also, can't recall it but a long time ago I read a piece about how scheduling a system to 60% of its max capacity is generally about right, to allow for expected but unexpected variations (also makes me think of the concept of stochastic process control and how we can figure out the level of expected unexpected variations, which could give us an even better sense of what %-of-capacity to run a system at)

  • naitgacem 11 hours ago ago

    Upon reading the title at first glance, I thought this was going to be about how "efficient" computers are nowadays, such as MacBooks and the like, which started this efficient-computer trend in recent times. And they are, but as a result computers are all worse off for it. I mean soldered RAM and everything being a system on a chip.

    • throwuxiytayq 11 hours ago ago

      The existence of the Framework Laptop proves that this is largely an imaginary tradeoff, or at least one badly taken by Apple.

      • naitgacem 11 hours ago ago

        Unfortunately once one company does something and gets away with it, or makes even more money, everyone will follow.

        I was looking at Thinkpads and was somewhat shocked to see they started doing that too!

        • CooCooCaCha 5 hours ago ago

          This is also why the libertarian solution to bad work environments “just leave” doesn’t work at scale.

    • appendix-rock 10 hours ago ago

      That wouldn’t be worthy of the front page of HN. “I don’t like the current tradeoffs that laptop manufacturers make” has been talked through to absolute death. It’s the opposite of interesting.

      • throwuxiytayq 7 hours ago ago

        If anything, there should be more space for this topic in our collective consciousness.

  • leeoniya 5 hours ago ago

    "Efficiency trades off against resiliency"

    https://blog.nelhage.com/post/efficiency-vs-resiliency/

  • o-o- 8 hours ago ago

    > FTA: This same counterintuitive relationship between efficiency and outcome occurs in machine learning.

    The "examples abound, in politics, economics, health, science, and many other fields" isn't a relationship between efficiency and outcome, but rather measuring and efficiency, or measuring and outcome. I think a better analogy is Heissenberg's uncertainty principle – the more you measure the more you (negatively) affect the environment you're measuring.

  • hedora 18 hours ago ago

    I don’t think the author understands what efficiency measures.

    All of the examples involve a bad proxy metric, or the flawed assumption that spending less improves the ratio of price to performance.

    • mirekrusin 15 hours ago ago

      The argument is that regardless of what metric is chosen, it'll create diminishing returns followed by negative returns.

      What it means is that the objective can't be static - for example, once satiated, you need to pick a different one to keep improving globally. Or do something else that moves the goalposts.

    • feyman_r 17 hours ago ago

      My take was that initially the metric is appropriate, but then with overfitting, it’s not enough.

      It eventually becomes a bad proxy metric.

    • atoav 17 hours ago ago

      > [..] it signifies the level of performance that uses the least amount of inputs to achieve the highest amount of output. It often specifically comprises the capability of a specific application of effort to produce a specific outcome with a minimum amount or quantity of waste, expense, or unnecessary effort.

      to quote wikipedia quoting Sickles, R., and Zelenyuk, V. (2019). "Measurement of Productivity and Efficiency: Theory and Practice". Cambridge: Cambridge University Press.

      Offering that criticism without clarifying what efficiency measures in your opinion doesn't allow us to follow your viewpoint without us just taking your word for it. Needless to say this isn't considered good style in a discourse.

      A 100 percent "efficient" system can be one that is overfitted to certain metrics, and it is the typical deadly sin of management to confuse metrics with reality and miss that their great numbers hollow out anything that makes a system work well and reliably, because guess what: having 1 critical employee and working them like a mule is good when things work, but bad when they suddenly don't, because that second employee you thought was fat that could be cut was your fallback. In that case your metric of efficiency was slightly increased while another, less easy to quantify (and therefore often non-existent) metric of resilience went down significantly. This means if your goal was having an efficient and resilient company, but your metric only measured the former, guess what.

      The same is true in engineering, where you can optimize your system so much to fit your expected problem that one slight deviation within the problem now stops the whole thing from working altogether (an F1 racing car when part of the track turns out to be a sucky dirt road). Highly optimized systems are highly optimized towards one particular situation and thus less flexible.

      Or in biology, where everybody ought to know that mixed woods are more resilient to storms and pests, while having great side effects for the health of the ecosystem, yet in pure economic terms it is easy to convince yourself the added efficiency of a monoculture is worth it economically, because all you look at is revenue, while ignoring multiple other metrics that impact reality.

    • brilee 17 hours ago ago

      Accusing these examples of involving "bad proxy metric" is identical to the no true scotsman fallacy.

    • hinkley 15 hours ago ago

      Might need to read some Goldratt. We generally don’t understand efficiency that well.

    • mppm 14 hours ago ago

      Yeah, every single example listed looks like gaming of bad metrics. Framing it as overfitting is unproductive, IMHO, and discounts the essentially adversarial context. It also discounts the stupidity of equating "efficiency" with a high score on a simple metric. Reality has a Surprising Amount of Detail, and all that.

      • nottorp 5 hours ago ago

        Gaming of metrics, not of bad metrics. The point is all metrics will become bad because they will be gamed.

  • numbol 9 hours ago ago

    There is a book on this topic, "Why Greatness Cannot Be Planned" https://link.springer.com/book/10.1007/978-3-319-15524-1 There are many youtube videos where Ken explain those ideas, this one for example https://www.youtube.com/watch?v=y2I4E_UINRo

  • godelski 15 hours ago ago

    I find this article a bit odd, considering what the author is an expert in: generative imagery. It's the exact problem he discusses, the lack of a target that is measurable. Defining art is well known to be ineffable, yet it is often agreed upon. For thousands of years we've been trying to define what good art means.

    But you do not get good art by early stopping, you do not get it by injecting noise, you do not get it by regularization. All these do help and are essential to our modeling processes, but we are still quite far. We have better proxies than FID but they all have major problems and none even come close (even when combined).

    We've gotten very good at AI art but we've still got a long way to go. Everyone can take a photo, but not everyone is a photographer, and it takes great skill and expertise to take such masterpieces. Yet there are masters of the craft. Sure, AI might be better than you at art, but that doesn't mean it's close to a master, as unintuitive as this sounds. This is because skill isn't linear. The details start to dominate as you become an expert. A few things might be necessary to be good, but a million things need be considered in mastery. Because mastery is the art of subtlety. But this article, it sounds like everything is a nail. We don't have the methods yet and my fear is that we don't want to look (there are of course many pursuing this course, but it is very unpopular and not well received. Scale is all you need is quite exciting, but lacking sufficient complexity, which even Sutton admits to be necessary). It's my fear that we get so caught up in excitement that we become blind to our limitations. Because it's knowing those limitations that gives us direction to improve. When every critique is seen as spoiling the fun of the party, we'll never be able to have anything better. I'm not trying to stop the party; in fact, I'm worried it'll stop.

    • ahartmetz 14 hours ago ago

      > But this article, it sounds like everything is a nail

      In the process, acting somewhat like a generalization of the problem it describes: overly precise and narrow approaches to "improve" ineffable qualities. But the author seems to understand that - he comments on the absurdity of some direct transfers of ML methods to real world problems. I think he just added a bunch of not necessarily well solvable, but particularly suffering from "overfitting", example problems. It's a food for thought article, not a grand proposal.

    • mirekrusin 15 hours ago ago

      I think he agrees more with you than you think.

      Evolution also picked it up as "satiation" - eating ice cream feels good, but you can't keep eating one per minute; same with pretty much everything.

      In art it means not hijacking everything for some local maximum.

      • godelski 14 hours ago ago

        I think you're probably right tbh. But I do think this point could be stressed a bit more. Especially when we're talking about how easy it is to trick ourselves into thinking we're doing what's good enough.

  • idunnoman1222 6 hours ago ago

    The statement overfits its own idea. Testing students is not an example of efficiency.

  • RadiozRadioz 13 hours ago ago

    This is more a meta comment about the blog itself (as is customary for HN): I like the blog, there has been a lot of work put into it, so it makes me sad that it's hosted on GitHub pages using a subdomain of GitHub.io. When the day comes that GitHub inevitably kills/ruins Pages, because it _will_ happen, there is no question, the links to this blog will be stuck forever pointing to this dead subdomain that the author has no control over. We just have to hope that the replacement blog is findable via search engines, and hope that comments are enabled wherever the pages link is referenced so that new people can find the blog. An unfortunate mess that is definitely going to happen, entirely Microsoft's fault.

    • 0x1ceb00da 3 hours ago ago

      If he hosts his site on his own server, it will go down a few months after his death.

    • mdp2021 13 hours ago ago

      Strengthening the importance of the Archive (the Wayback).

  • teleforce 8 hours ago ago

    There is a big difference between efficiency and effectiveness, and all systems should focus on the latter rather than the former, whether AI-based or not.

    There's a reason why the best-seller of self-help book for several decades now is the book by Stephen Covey entitled "The 7 Habits of Highly Effective People" not "Efficient People".

  • RandomLensman 11 hours ago ago

    Efficiency is usually easier than effectiveness, so it gets optimised for much more, and that spills over into the results and outcomes, of course.

  • amai 8 hours ago ago

    Related is the book by Tom DeMarco (2002): Slack

    https://herbertlui.net/slack-tom-demarco-summary/

    “People under time pressure don’t think faster!”

  • ocean_moist 16 hours ago ago

    Metrics are ambiguous because they are abstractions of success and miss context. If you want a pretty little number, it doesn’t come without cost/missing information.

    I don’t know if this phenomenon is aptly characterized as “too much efficiency”.

  • js8 9 hours ago ago

    I have a pet theory that the state-planned economies failed not because they were inefficient (as neoclassical economics claims), but rather because they attempted to be too efficient. They tried to exactly calculate which producer needs what inputs, what they should produce and when, and a little deviation from the plan caused big cascading failures.

    The free market is actually less efficient than direct control, but it is correspondingly more robust. This is evidenced in big companies, which also sometimes try to control things in the name of "efficiency" and end up being quite inefficient, and in small companies, which often compete and duplicate effort.

    The optimum (I hesitate to call it that because it's not well-defined; it is in some sense a society's choice) seems to be somewhere in the middle - you need a decent amount of central direction (almost all private companies have that) and redundancy (provided by investment funds on the free market).

    (As an aside, despite being a democratic socialist, I don't believe democracy matters that much for economic development, though it is desirable from a moral perspective. You can have a lot of economic development under authoritarian rule; there are examples on both sides, and most private companies are themselves small authoritarian fiefdoms.)

    • nosianu 9 hours ago ago

      On this side track to the discussion, I think that a major factor there was the location of decision making vs. where the action took place. Too much control was taken away from the location where it was needed.

      Centralization does that, in general, not just in those countries.

      There is a reason octopuses have sub-brains in their arms, and that some of our reflexes are controlled from neurons in the spine and not from all the way up in the brain, and why small army units have some autonomy.

    • mglz 9 hours ago ago

      > I have a pet theory that the state-planned economies failed not because they were inefficient

      Well, a lot of it was corruption. A sufficient level of corruption can destroy almost any system, even if it had a well-meaning leader at the top.

  • baq 9 hours ago ago

    See also antifragility: https://en.wikipedia.org/wiki/Antifragility

    In short, efficiency is fragile. If you want your thing to be stronger after a shock (instead of falling apart), you must design it to be antifragile.

    Note: it's hard to build antifragile physical things or software, but processes and organizations are easier. ML models can be antifragile if they're constantly updating.
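
    A small sketch of the "constantly updating" part (a toy example of adaptivity rather than antifragility in the strict sense; the drifting signal, the update rule, and all numbers are my own choices, not from the article): a model fitted once keeps being wrong after the world shifts, while an online, exponentially weighted estimate recovers.

      import random

      random.seed(1)

      def stream(n=2000, shift_at=1000):
          # Observations whose true mean jumps from 0 to 5 halfway through.
          for t in range(n):
              mean = 0.0 if t < shift_at else 5.0
              yield t, random.gauss(mean, 1.0)

      frozen = None      # fitted once on the first 500 points, never updated again
      online = 0.0       # exponentially weighted running mean, updated every step
      alpha = 0.05
      history = []

      for t, x in stream():
          if t < 500:
              history.append(x)
          elif frozen is None:
              frozen = sum(history) / len(history)   # one-off "training"
          online = (1 - alpha) * online + alpha * x
          if t in (999, 1999):
              print(f"t={t}: frozen estimate {frozen:.2f}, online estimate {online:.2f}")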

    • throw_pm23 9 hours ago ago

      Yes, extending the idea more broadly to society, health, politics, etc. is pretty much what Taleb has been doing.

  • Angostura 12 hours ago ago

    Really interesting article. Got me pondering the extent to which the peacock’s tail is an example of overfitting and Goodhart's law.

    The female peacock is using the male peacock’s tail as a proxy for fitness - with beautiful consequences, but the males with the largest, showiest tails are clearly less fit, and more prone to predation.

    • llm_trw 12 hours ago ago

      That's the point. That they are alive shows that their innate fitness, minus the tail, is higher than that of another living peacock without such a tail.

      Or put another way: someone who wins the Olympic 100m sprint while hopping on one leg is a better runner than everyone else in the race by a wide margin.

  • Angostura 12 hours ago ago

    Really interesting article. Got me pondering the extent to which the peacock’s tail is an example of overfitting and Goodhart's law.

    The female peacock is using the male peacock’s tail as a proxy for fitness - with beautiful consequences, but the males with the largest, showiest tails are clearly less fit.

    • RandomLensman 11 hours ago ago

      There is research on costly signalling and evolution.

  • sillyLLM 8 hours ago ago

    It seems that this is related to the Tim Harford book Messy: The Power of Disorder to Transform Our Lives, but that book is not about deep learning.

  • rowanG077 18 hours ago ago

    I don't think it's unintuitive at all. 100% optimized means 100% without slack. No slack means any hitch at all will destroy you.

    • fallous 17 hours ago ago

      Indeed, the more efficient you become the more brittle you will be. You must depend upon the present being static and the future being perfectly predictable based on the events of the past. The present and the future don't merely need to be dependable within your own domain but also in the entire world.

      The flexibility necessary to succeed in a real world requires a certain level of inefficiency.

      • femto 17 hours ago ago

        Interestingly, the same effect shows up in communications systems. The more efficient an error correction code (ie. the closer it approaches the Shannon Bound), the more catastrophically it fails when the channel capacity is reached. The "perfect" code delivers no errors up until the Shannon bound then meaningless garble (50% error rate) beyond the Shannon Bound.

        My point is that error correction codes have a precise mathematical definition and have been deeply studied. Maybe there is a general principle at work in the wider world, and it is amenable to a precise proof and analysis? (My guess is that mileage may be made by applying Information Theory, as used to analyse error correcting codes.)
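
        A toy sketch of that threshold behaviour (illustrative only; the function names, the rate-1/5 comparison, and all parameter values below are made-up choices, not a claim about any particular code): the binary-symmetric-channel capacity says whether any code of a given rate can still work, and it flips sharply at a threshold, while an "inefficient" repetition code of the same rate degrades gracefully as the channel worsens.

          import math
          import random

          def bsc_capacity(p):
              # Capacity (bits per channel use) of a binary symmetric channel
              # with crossover probability p: C = 1 - H(p).
              if p in (0.0, 1.0):
                  return 1.0
              return 1.0 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

          def repetition_block_error(p, n=5, trials=20000):
              # Monte Carlo block-error rate of an n-fold repetition code
              # decoded by majority vote over a BSC(p).
              errors = sum(
                  sum(random.random() < p for _ in range(n)) > n // 2
                  for _ in range(trials)
              )
              return errors / trials

          for p in (0.05, 0.11, 0.20, 0.30, 0.45):
              c = bsc_capacity(p)
              print(f"p={p:.2f}  capacity={c:.3f}  "
                    f"any rate-1/5 code feasible: {c > 0.2}  "
                    f"5x repetition block errors: {repetition_block_error(p):.3f}")

        The repetition code's error rate creeps up smoothly with p, whereas the feasibility of a capacity-approaching rate-1/5 code goes from "yes" to "no" in one step between p=0.20 and p=0.30.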

      • HappMacDonald 17 hours ago ago

        I have heard this same criticism leveled at global supply chains as of the supply shocks of the early 2020s such as COVID, Ever Given, etc.

        • fallous 17 hours ago ago

          Yes, just-in-time supply chain systems often become over-efficient and brittle... usually because each link in the chain assumes that someone else is taking on the burden of inefficiency by having excess inventory in order to absorb shocks to the system.

    • refurb 14 hours ago ago

      That would assume your only target measure is efficiency, which would be a silly thing to target to the exclusion of everything else.

    • hinkley 15 hours ago ago

      “Hidebound”

  • shae 7 hours ago ago

    Gerrymandering is overfitting. Mitigation: randomize the actual shape of a district when the votes are counted.

  • mooktakim 11 hours ago ago

    Teaching is a terrible example. Teaching is actually more efficient when it is decentralised, as teachers can adapt to the local environment and to changes. With centralisation you get a bad feedback loop.

  • eru 17 hours ago ago

    Just add some measure of robustness to your optimization criterion. That includes having some slack for unforeseen circumstances.
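
    A minimal sketch of what that can look like (a toy inventory example; the cost numbers, function names, and distributions are made up for illustration): the objective with a robustness term charges for shortfalls caused by rare demand spikes, so the optimum keeps some slack instead of zero buffer.

      import random

      random.seed(0)

      def expected_cost(buffer, holding=1.0, stockout=25.0, trials=5000):
          # Average cost of carrying `buffer` spare units when demand
          # occasionally spikes above the forecast.
          total = 0.0
          for _ in range(trials):
              spike = max(0.0, random.gauss(0.0, 3.0))   # unforeseen extra demand
              shortfall = max(0.0, spike - buffer)
              total += holding * buffer + stockout * shortfall
          return total / trials

      # "Pure efficiency" objective: holding cost only, so the answer is zero slack.
      lean = min(range(0, 15), key=lambda b: 1.0 * b)
      # Objective with a robustness term: slack is part of what gets optimised.
      robust = min(range(0, 15), key=expected_cost)
      print("buffer with no robustness term:", lean)
      print("buffer with robustness term:   ", robust)

    The point is simply that the slack stops looking like waste once the cost of unforeseen circumstances is inside the criterion rather than outside it.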

    • sahmeepee 12 hours ago ago

      I thought that was the purpose of adding noise (in the mitigations).

      • eru 6 hours ago ago

        Noise is one possible way, but not the only one.

    • paganel 13 hours ago ago

      And then you optimize around the slack, and we're back to step 1.

      • eru 6 hours ago ago

        The slack is part of your optimisation criteria.

  • ansc 12 hours ago ago

    Kinda surprised not to see anyone mention Jacques Ellul and his The Technological Society, which largely revolves around this. "Technological" here does not refer to technology in the narrow sense, but to Ellul's broader notion of "technique".

  • shahules 17 hours ago ago

    Can't agree with you more my friend. Another point on a philosophical level is efficiency or optimization in life, which always focuses on tangible aspects and ignores the greater intangible aspects of life.

  • submeta 10 hours ago ago

    Reminds me of a quote from Donald Knuth: "Premature optimization is the root of all evil."

  • failrate 18 hours ago ago

    "If you do not build the slack into the system, the system will take the slack out of you."

  • inglor_cz 12 hours ago ago

    When it comes to his "Mitigation: Inject noise into the system" proposal: I would be happy to experiment with some sortition in our political systems, citizens' assemblies et cetera.

    Randomly chosen deliberative bodies could keep some of the stupid polarization in check, especially since your chances of being chosen twice into the same body are infinitesimal (a rough sketch of the odds is at the end of this comment).

    https://en.wikipedia.org/wiki/Sortition

    We tend to consider "democracy" as fundamentally equivalent to "free and fair elections", but sortition would be another democratic mechanism that could complement our voting systems. Arguably more democratic, as you need money and a support structure to have a chance to win an election.
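
    A rough back-of-the-envelope sketch of those odds (the population and assembly sizes below are invented purely for illustration): with assemblies of 500 drawn uniformly from 200 million eligible citizens, landing in two particular assemblies is on the order of one in 160 billion.

      # Back-of-the-envelope only; both numbers are made up for illustration.
      eligible = 200_000_000   # eligible citizens
      assembly = 500           # members drawn per assembly

      p_once = assembly / eligible
      p_twice = p_once ** 2    # probability of landing in two particular assemblies

      print(f"drawn into one given assembly:  1 in {1 / p_once:,.0f}")
      print(f"drawn into two given assemblies: 1 in {1 / p_twice:,.0f}")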

  • zug_zug 8 hours ago ago

    I like the connection between Goodhart's law and overfitting. However, these examples are a reach:

    ---
    Goal: Healthy population
    Proxy: Access to nutrient-rich food
    Strong version of Goodhart's law leads to: Obesity epidemic

    I'm not sure I believe this one. Exactly whose target is "access to nutrient-rich food", and how would removing that target fix the US obesity epidemic? Is "nutrient-rich" a euphemism for high-calorie? My understanding is that there are plenty of places with high-nutrient food but different norms and much better health (e.g. Japan).

    We can and do measure population health (including obesity); this isn't a proxy for an unmeasurable thing.

    ---
    Goal: Leaders that act in the best interests of the population
    Proxy: Leaders that have the most support in the population
    Strong version of Goodhart's law leads to: Leaders whose expertise and passions center narrowly around manipulating public opinion at the expense of social outcomes

    Is this really a case of "overfitting from too much data"? Or is this just a case of "some things are hard to predict?" Or even, "it's hard to give politicians incentives." It'd be interesting if we gave presidents huge prizes if the country was better 20 years after they left office.

    ---
    Goal: An informed, thoughtful, and involved populace
    Proxy: The ease with which people can share and find ideas
    Strong version of Goodhart's law leads to: Filter bubbles, conspiracy theories, parasitic memes, escalated tribalism

    Is "the goal" really a thoughtful populace? Because every individual's goal is pleasure, and the companies' goal is selling ads. So I don't know who's working on that goal.

  • boredhedgehog 13 hours ago ago

    If a citizen recognizes or intuits this to be a deep-seated problem of the political process, and if the only concrete influence this citizen can exert on the political process is choosing one of several proposed representatives, it seems rational to choose the most irrational, volatile, chaotic and unpredictable candidate.

    The ideal choice would be a random number generator, but lacking that, he would want to inject the greatest dose of entropy available into the system.

  • dawnofdusk 17 hours ago ago

    The author is a very sharp individual but is there a reason he insists on labelling overfitting as a phenomenon from machine learning instead of from classical statistics?

    • HappMacDonald 17 hours ago ago

      It might simply be that he didn't trace the etymology back that far.

      If it turned out that the term actually started in tailoring before statistics really got its feet under it (which I absolutely cannot say that it did, just that, extrapolating backwards, it sounds like a reasonable guess), then it wouldn't speak poorly of you if you hadn't also known that.

      • dawnofdusk 17 hours ago ago

        The author is an academic; it is important to give proper credit for ideas within reason. Same reason I call F = ma the law of Newton and not the law of my high school physics teacher, even though I learned it first from him.

        The reason I have this quibble is because the author says things like

        >you should consider building formal (mathematical) bridges between results on overfitting in machine learning, and problems in economics, political science, management science, operations research, and elsewhere

        If we are appropriately modest and acknowledge the fact that overfitting is well-studied by statisticians (although, obviously not in the context of deep neural networks), it seems kind of ridiculous to make statements like, economists and political scientists should consider using statistics?

    • feyman_r 17 hours ago ago

      The blog is mainly about ML - I don’t think the author alluded to overfitting having originated in that space; they just said it’s used extensively.

    • chefandy 17 hours ago ago

      They don't say "classical statistics," but I don't see any implication that the phenomenon was born from machine learning, even if they say it's a common problem within machine learning. Maybe I missed it? They do mention modelling their conception of overfitting around Goodhart's Law, noting its origin in economics.

  • kazinator 14 hours ago ago

    Goal: efficient applications

    Proxy: minimizing execution time of hot loops

    Strong version of Goodhart's law leads to: applications get incredibly bloated and unresponsive

  • hcfman 13 hours ago ago

    Well, the use of the phrase "too much" already implies less than optimal. A self-fulfilling prophecy by definition?

  • fwungy 2 hours ago ago

    Efficiency means minimizing the use of the costliest components. It's installing fault points into a system on purpose.

    Robust systems minimize fault points. Efficient systems come at the cost of robustness, and vice versa given a fixed definition of what is being conserved, i.e. costs or energy.

    For example, a four-cylinder engine that gets 15 mpg will have a longer life than one that gets 30 mpg, given the same cost.

  • ponow 9 hours ago ago

    > Distribution of labor and resources based upon the needs of society

    Not a goal for me, and not for evolution. Survival, health, prosperity, thriving and complexity rate higher. Not everyone makes it.

    • immibis 7 hours ago ago

      Thank you for acknowledging that your goal is to kill people. Now that that has been acknowledged, we can all ignore whatever else you have to say.

  • Trasmatta 17 hours ago ago

    From a social / emotional / spiritual/ humanistic perspective, this is what I see in the "productivity" and "wellness" spaces.

    "Ahh, if only I hyperoptimize all aspects of my existence, then I will achieve inner peace. I just need to be more efficient with my time and goals. Just one more meditation. One more gratitude exercise. If only I could be consistent with my habits, then I would be happy."

    I've come to see these things as a hindrance to true emotional processing, which is what I think many of us actually need. Or at least it's what I need - maybe I'm just projecting onto everyone else.

    • nradov 17 hours ago ago

      Some of us are trying to optimize for things other than happiness. An occasional bit of happiness can be a nice side effect of certain types of optimization but happiness isn't a reasonable goal to focus on by itself.

      • tananan 11 hours ago ago

        Happiness is a valid goal. If one perceives it’s not reasonable to expect it, then you may arrive at this conclusion. But imo that’s because we short-circuit happiness to sources of pleasure that we see aren’t that reliable.

        Hell, even this settling for happiness as a side-product is a result of the judgement that this is the best we can do regarding the goal of happiness.

      • Jerrrrrrry 16 hours ago ago

        Everyone wants to be happy, and we can't all be right, right?

  • aswanson 6 hours ago ago

    You can also be too efficient in your career/life. You can take the "Inject noise into the system" as injecting positive randomness into your associations with people and ideas. If something seems slightly interesting but off your beaten track, learn more about it.

  • AnimeLife 9 hours ago ago

    Very interesting article. I don't get, though, why for hotspot partitions they didn't use a cache like Redis.

  • skramzy 10 hours ago ago

    What gets measured gets optimized

  • Animats 17 hours ago ago

    Important subject, so-so blog post. This idea deserves further development.

    The author seems to be discussing optimizing for the wrong metric. That's not a problem of too much efficiency.

    Excessive efficiency problems are different. They come from optimizing real output at the expense of robustness. Just-in-time systems have that flaw. Price/performance is great until there's some disruption, then it's terrible for a while.

    Overfitting is another real problem, but again, a different one. Overfitting is when you try to model something with too complex a model and end up just encoding the original data in the model, which then has no predictive power.

    Optimizing for the wrong metric, and what to do about it, is an important issue. This note calls out that problem but then goes off in another direction.

    • satyanash 17 hours ago ago

      > Optimising for the wrong metric, and what to do about it, is an important issue.

      All metrics are wrong, some metrics are useful. Finding the useful one and then recognising when it ceases to become useful is the hard problem.

    • stoperaticless 13 hours ago ago

      Very good characterisation of close, but distinct concepts. (a map of a domain)

      If we squint a little, focus on close/far-away instead of same/distinct and s/metric/model/g (because usage of a metric implies a model), we can see how close these things can be.

      Optimizing for the wrong metric - becomes “using a wrong model”.

      Excessive efficiency - is partially "using a wrong model", or maybe "good model != perfect model". We start with a good enough model, but after a certain threshold we get to experience the difference between "good enough" and "perfect" (apparently we care about redundancy, but it was not part of our model; so we were using a wrong model)

      Overfitting is “finding the wrong model” (I wanted a model for the whole population, got a model only for a sample)

      ..or if we squint even more and go meta.. overfitting is part of “good model != perfect (meta)model” of modeling. (using sample data is good enough, but not perfect)

      P.S. I liked the article. Choice of the title - not so much.

      P.P.S. Simplicity of a model is part of meta-model.

  • chriscappuccio 5 hours ago ago

    While this intuitively seems like a good idea, his real-life examples are severely lacking. This gets interesting where the rubber hits the road: when we look at the causes and effects of what is being optimized for versus what is actually happening, and dig deeply into improving that scenario.

  • abernard1 13 hours ago ago

    The author identifies problems with a system measuring targets, but then all the proposals are about increasing the power and control of the system.

    Perhaps the answer—as hippy sounding as it is—is to reduce the control of the system outright. Instead of adding more measures, more controls, which are susceptible to the prejudices of control, we let the system fall where it may.

    This, to me, is a classic post of an academic understanding the failures of a system (and people like themselves in control of said system) but then not allowing the mitigation mechanisms of alternate systems to take its place.

    This is one of the reasons I come to HN: to view the prime instigators of big-M Modern failure and their inability to recognize their contributions to that problem.

    • CatWChainsaw 3 hours ago ago

      Loosening of control is exactly the answer, but the world pathology is currently that money/power/control are all unalloyed virtues to be pursued at any cost, so we'll have to wait for a global implosion before any of the Certified Geniuses on this site, or anywhere else, consider an alternative approach.

  • alexashka 14 hours ago ago

    Does being a super efficient AI researcher make everything worse?

  • futuramaconarma 9 hours ago ago

    Lol. Nothing to do with efficiency, just humans recognizing incentives and acting in self interest.

  • layer8 8 hours ago ago

    I propose to denote the worsening trajectory by the term “enshittification”.

  • bdjsiqoocwk 14 hours ago ago

    In practice this is just Goodhart's law itself. It's not distinct. In Goodhart's law

    > when a measure becomes a target, it ceases to be a good measure

    If you ask someone "could you give me an example" you will see that in the example the measure that becomes a target is already a proxy. Even the example that the author presents, the school that cares a lot about testing its students... How does the school test its students? With exams. But that's already a proxy for testing students' knowledge...

    But overall excellent article.

  • knodi 15 hours ago ago

    Also it leads to a rigid system that is too inflexible to deal with unknowns.

  • bbor 17 hours ago ago

    IMO the theory at the start of the post is well written and almost there, but it needs to more substantively engage with the relevant philosophical concepts. As a result, the title "efficiency is bad!" is incorrect in my opinion.

    That said, the post is still valuable and would work much better with a framing closer to "some analogies between statistical analysis and public policy" -- the rest of the post (all the political recommendations) is honestly really solid, even if I don't see a lot of the particular examples' connections to their analogous ML approaches. The creativity is impressive, and overall I think it's a productive, thought-provoking exercise. Thanks for posting OP!

    Now, for any fellow pedants, the philosophical critique:

      more efficient centralized tracking of student progress by standardized testing
    
    The bad part of standardized testing isn't at all that it's "too efficient", it's that it doesn't measure all the educational outcomes we desire. That's just regular ol' flawed metrics.

      This same counterintuitive relationship between efficiency and outcome occurs in machine learning, where it is called overfitting.
    
    Again, overfitting isn't an example of a model being too efficacious, much less too efficient (which IMO is, in technical contexts, a measure of speed/resource consumption and not related to accuracy in the first place).

    Overfitting on your dataset just means that you built a (virtual/non-actual) model that doesn't express the underlying (virtual) pattern you're concerned with, but rather a subset of that pattern. That's not even a problem necessarily, if you know what subset you've expressed -- words like "under"/"too close" come into play when it's a random or otherwise meaningless subset.

      I'm not allowed to train my model on the test dataset though (that would be cheating), so I instead train the model on a proxy dataset, called the training dataset.
    
    I'd say that both the training and test sets are actualized expressions of your targeted virtual pattern. 100% training accuracy means little if it breaks in online, real-world use.

      When a measure becomes a target, if it is effectively optimized, then the thing it is designed to measure will grow worse.
    
    I'd take this as proof that what we're really talking about here is efficacy, not efficiency. This is cute and much better than the opening/title, but my critique above tells me that this is just a wordy rephrasing of "different things have differences". That certainly backs up their claim that the proposed law is universal, at least!
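
    To make the "expressing only a subset of the pattern" point concrete, a minimal sketch (a toy example assuming numpy; the sine curve, noise level, and polynomial degrees are arbitrary choices of mine): the high-degree fit nails the ten training points and falls apart on held-out data.

      import numpy as np

      rng = np.random.default_rng(0)

      def sample(n):
          # The underlying (virtual) pattern: a sine curve observed with noise.
          x = rng.uniform(0.0, 1.0, n)
          return x, np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)

      x_train, y_train = sample(10)    # the proxy we actually optimise against
      x_test, y_test = sample(200)     # a stand-in for real-world use

      for degree in (1, 3, 9):
          coeffs = np.polyfit(x_train, y_train, degree)
          train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
          test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
          print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")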

  • 015a 4 hours ago ago

    "Efficiency" meaning, given some input cost, reducing the loss of applying that cost, toward some measured outcomes. High efficiency implies something about each of those three stages, none of which are reasonable to apply in all situations:

    1. That the only input to the system is cost/money (or proxies of that, like compensated human time). Put another way: That the model you're working with is perfectly liquid, and you don't need to worry about fundamental supply constraints.

    2. That the loss is truly loss, and that there aren't knock-on effects from it, which might range from generally beneficial and good to actually being somewhat responsible for the output metric, in which case your model is measuring the wrong thing.

    3. That the output metric correctly and holistically proxies for the real-world outcomes you desire.

    Using the example from the article on standardized testing: A school administration might make an efficiency argument by comparing dollars spent to standardized test scores.

    * Dollars isn't the only input to this system, however; two major ones also include the quality of teachers and home life of the students. Increasing the spend of the system might do nothing to standardized test scores if these two qualities also can't be improved (you might make the argument that increasing dollars attracts better teachers, and there's some truth to this, but generally (even in tech) these two things just aren't strongly correlated; many organizations have forgotten what it even means to be "good at your job" and how to screen for quality in interviews. When organizations lose that, no amount of money can generate good hires because the litmus test doing the hiring is bad).

    * "Loss" in this system might be the increase of funding without seeing proportionally increasing test scores; which does not account for spending money in extracurriculars like music, art, and sports; all generally desirable things we believe money should be spent on (isn't it interesting that we call these things "extra"curriculars?).

    * Even if a school administration can apply this model to increase test scores, increasing test scores might not be an outcome anyone really wants. As the article says, all that guarantees is a generation of great test-takers. Increasing college acceptance rates? We've guaranteed a generation of debtors and bad degrees. Turns out, it's impossible to proxy for the real-world thing you want in a way that can be measured on a societal level.

    All of this is really just a symptom of the "financialization of everything", which has been talked about endlessly. Particularly relevant to this discussion: society has broadly forgotten what the word "service" means. Public transit in your city must be a capitalistic enterprise with its own internally positive efficiency metric, because the broader positive impact that the transit network has on the people and businesses in the city, and thus on municipal tax income, is too complex to account for within a more unified economic model.

  • nobrains 11 hours ago ago

    It would be nice if we on HN could crowdsource some good KPIs/proxies for the goals mentioned in the article.

    These ones:

    Goal: Educate children well
    Proxy: Measure student and school performance on standardized tests
    Strong version of Goodhart's law leads to: Schools narrowly focus on teaching students to answer questions like those on the test, at the expense of the underlying skills the test is intended to measure

    ---

    Goal: Rapid progress in science
    Proxy: Pay researchers a cash bonus for every publication
    Strong version of Goodhart's law leads to: Publication of incorrect or incremental results, collusion between reviewers and authors, research paper mills

    ---

    Goal: A well-lived life
    Proxy: Maximize the reward pathway in the brain
    Strong version of Goodhart's law leads to: Substance addiction, gambling addiction, days lost to doomscrolling Twitter

    ---

    Goal: Healthy population
    Proxy: Access to nutrient-rich food
    Strong version of Goodhart's law leads to: Obesity epidemic

    ---

    Goal: Leaders that act in the best interests of the population
    Proxy: Leaders that have the most support in the population
    Strong version of Goodhart's law leads to: Leaders whose expertise and passions center narrowly around manipulating public opinion at the expense of social outcomes

    ---

    Goal: An informed, thoughtful, and involved populace
    Proxy: The ease with which people can share and find ideas
    Strong version of Goodhart's law leads to: Filter bubbles, conspiracy theories, parasitic memes, escalated tribalism

    ---

    Goal: Distribution of labor and resources based upon the needs of society
    Proxy: Capitalism
    Strong version of Goodhart's law leads to: Massive wealth disparities (with incomes ranging from hundreds of dollars per year to hundreds of dollars per second), with more than a billion people living in poverty

    ---

    I will start:

    Goal: Leaders that act in the best interests of the population

    Good proxy: Mandate that local leaders can only send their kids to the schools in their precinct. They can only take their families to the hospitals in their precincts.

  • tonymet 5 hours ago ago

    The author is right that we rely on metrics too much. But he's biased against capitalism and his proposed cure is more socialism. What's actually lacking is wisdom and integrity.

  • curious-tech-12 19 hours ago ago

    perfect reminder that when you focus too hard on the proxy, you might win the battle and lose the war

    • HappMacDonald 17 hours ago ago

      Sounds like Goodhart's law: "When a measure becomes a target, it ceases to be a good measure"