Incentive Cascades
What do grade inflation, environmental review scores, and philanthropic evaluation all have in common?
In a recent episode of Conversations with Tyler, John Arnold discussed his foundation's approach to philanthropy. Like organizations such as GiveWell, Arnold Ventures focuses on “evidence-based philanthropy,” meaning they want to study and fund interventions that are actually effective, not ones that merely sound good on paper.
About 15 minutes into the conversation, Arnold makes the following disconcerting observation about the philanthropic evaluation pipeline:
[T]here was this realization we had that everybody in the chain was incentivized to find the positive result, whether it was the funder, the academic, the university, the journal, or the popular press.
Everybody wanted to find and publicize the positive result, and so there were a lot of positive results that were coming out that were either done with low bars of integrity and quality, or at the extreme end — and this is relatively rare — but bordering on absolute fraud in the industry.
It's unsurprising that there are bad incentives in philanthropy. There are bad incentives everywhere! But Arnold's point is more subtle. He's noticing that everybody in this evidence-based-philanthropy chain has incentives that point in the same direction. The researchers studying the interventions want positive results so they can publish fancy papers, the evaluation agencies want positive results so they can claim a positive impact, and the donors want positive results so they feel good about where their money is going.
The only people who are not incentivized to get positive results are the recipients of the aid themselves. No one in the village where the infamous PlayPump was installed was tempted by a system that would break down and result in grandmas pushing a heavy merry-go-round to get water.1 But the recipients are typically not a part of the evaluation pipeline, since it’s their outcomes we’re monitoring.
We might call Arnold’s observation an example of an incentive cascade: incentive alignment among all the actors in a decision-making loop. This, surely, is not a new observation. But incentive cascade is a nice concept-handle for a problem that is annoyingly common. Consider:
Grade inflation: Kids want better grades. Parents want their kids to have better grades. Teachers want to give good grades because they are evaluated by both the students and the school on their students’ performance. Schools want their students to have good grades so they get good jobs after graduation and donate to the school. And employers want their new hires to have good grades so they can boast about employees who graduated with various Latin adjectives attached to their names. Is it surprising that the median grade at Harvard is A-?
ESG scores: Environmental, social, and governance (ESG) scores are given to companies by third-party rating agencies. Companies with good ESG scores get access to ESG-focused investment capital (e.g., BlackRock). Companies, therefore, want good ESG scores. Rating agencies want clients, so they're incentivized to give good ESG scores. And investors want more investment opportunities, so they want more companies to have good ESG scores. Under this scheme, will ESG scores accurately reflect *mumbles something indistinct about whatever ESG ostensibly measures*?
About half of academic publishing: Academics want to publish eye-catching, counterintuitive studies. Academic journals want to publish eye-catching, counterintuitive studies. Science journalists want to report on eye-catching, counterintuitive studies. Peer review is supposed to be critical of eye-catching, counterintuitive studies, but peer review consists of individual reviewers who (1) often know whose study they're reviewing, (2) would rather spend time doing their own research than reviewing someone else's work, and (3) know that if the standards for getting something published aren't that high, then their own work has a higher chance of getting published. Astonishingly, sometimes peer review does still work. But often it doesn't, and the result is whatever psychology has been up to for the last 30 years.
Incentive cascades are produced by aligned incentives across multiple actors. So, to break incentive cascades, you need to introduce an actor whose incentives point in another direction. You need to introduce adversaries, or at least disinterested third parties. Ideally you introduce actors with an incentive to be as truth-seeking as possible. (Cue Nassim Taleb yelling "skin in the game!")
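The dynamic above can be caricatured in a few lines of code. This is a hypothetical toy model, not anything from Arnold's discussion: each actor in an evaluation chain shades the reported effect slightly upward, so bias compounds with chain length, while a single adversarial actor whose incentive is to catch overstatement keeps the drift bounded.

```python
def aligned_chain(true_effect, n_actors, nudge=0.2):
    """Every actor reports a slightly rosier number than it received."""
    estimate = true_effect
    for _ in range(n_actors):
        estimate += nudge  # each link's incentive points the same way
    return estimate

def chain_with_adversary(true_effect, n_actors, nudge=0.2):
    """Same chain, but one mid-chain actor is incentivized to find
    overstatement and strips the accumulated bias."""
    estimate = true_effect
    for i in range(n_actors):
        if i == n_actors // 2:
            estimate = true_effect  # adversarial reviewer resets to truth
        else:
            estimate += nudge
    return estimate

# In the aligned chain, error grows linearly with the number of actors;
# with one adversary, it can never exceed a couple of links' worth of nudge.
print(aligned_chain(1.0, 5))
print(chain_with_adversary(1.0, 5))
```

The point of the sketch is only that a cascade's error scales with the length of the chain, whereas one misaligned actor is enough to cap it.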
Here are some systems which deliberately avoid incentive cascades by introducing friction:
The legal system. The defense and prosecution are deliberately pitted against one another, and both are incentivized to win the case.
Open source software. Each individual developer wants to contribute to the repository. All other developers, especially the maintainers, want the contributed code to work, and to solve a particular problem. What has evolved is a robust set of norms and protocols (issues, pull requests, unit tests, code review) to ensure that the code which makes it into the repo is useful and as bug-free as possible.
The other half of academic publishing. OK, there are parts of academia that still work well. If an academic knows that any claims will be heavily scrutinized and that replication will be attempted quickly (e.g., in math, physics, biology, chemistry), then the incentives of each academic are pitted against the others'.
Capitalism?! Buyers and sellers obviously have different incentives. This ensures that prices, in a well-functioning market, accurately reflect supply and demand. Market distortions, on the other hand, can lead to incentive cascades. (Imagine if the government decided to guarantee the student loans provided by private organizations. That would be wild.)
Incentive cascades exist on a spectrum. Perfectly aligned (or perfectly opposed) incentives don’t exist in practice. There’s always some friction in the system—not every student has an A+, not every company has an ESG score of 100%, and people eventually noticed that PlayPumps were a terrible idea. But the idealization of an incentive cascade is a useful tool for diagnosing why some institutions work better than others.
Thanks to Cam for comments.
1. Check out this report by UNICEF on the results of PlayPumps. See the “disadvantages” section.