Theoretical and empirical research highlights the role of punishment in promoting collaborative efforts1, 2, 3, 4, 5. However, both the emergence and the stability of costly punishment are problematic issues. It is not clear how punishers can invade a society of defectors by social learning or natural selection, or how second-order free-riders (who contribute to the joint effort but not to the sanctions) can be prevented from drifting into a coercion-based regime and subverting cooperation. Here we compare the prevailing model of peer-punishment6, 7, 8 with pool-punishment, which consists in committing resources, before the collaborative effort, to prepare sanctions against free-riders. Pool-punishment facilitates the sanctioning of second-order free-riders, because these are exposed even if everyone contributes to the common good. In the absence of such second-order punishment, peer-punishers do better than pool-punishers; but with second-order punishment, the situation is reversed. Efficiency is traded for stability. Neither other-regarding tendencies or preferences for reciprocity and equity, nor group selection or prescriptions from higher authorities, are necessary for the emergence and stability of rudimentary forms of sanctioning institutions regulating common pool resources and enforcing collaborative efforts.
Many economic experiments on ‘public goods games’ (PGGs) have shown that a substantial fraction of players are willing to incur costs to impose fines on exploiters, that is, those who do not contribute to the joint effort1, 2, 3, 4, 5, 6, 7, 8. As a consequence, the threat of punishment looms credibly enough to increase the average level of pro-social contributions. However, the sanctioning system is itself a public good. Thus, punishers are often seen as altruistic, because others benefit from their costly efforts9, 10, 11, 12, 13. Conversely, those who refrain from punishing exploiters are ‘second-order free-riders’. Among self-interested agents, second-order free-riding should spread and ultimately cause the collapse of cooperation.
A solution is to punish second-order free-riders also14. But such ‘second-order punishment’ risks being subverted by third-order free-riders in turn, leading to infinite regress. Moreover, if everyone contributes to the public good, second-order free-riders will not be spotted. Their number can grow through neutral drift, ultimately allowing defectors to invade with impunity. We show how a simple mechanism can overcome this problem.
There exist a variety of sanctioning systems. Most experiments on public goods with punishment have considered peer-punishment: after the PGG, individuals can impose fines on exploiters, at a cost to themselves. Interestingly, the first experiment on public goods with punishment15 considered a different mechanism: players decide whether to contribute to a ‘punishment pool’ before contributing to the public goods. This can be viewed as a first step towards an institutionalized mechanism for punishing exploiters, and compared with the self-financed contract enforcement games in Governing the Commons16. It is like paying towards a police force, whereas peer-punishers take law enforcement into their own hands.
Peer- and pool-punishment are both expensive ways to impose negative incentives on free-riders. In many economic experiments, the increase in cooperation is more than matched by the costs of punishment, and an overall reduction of total pay-off is observed8, 9. Because the costs of pool-punishment arise even when there are no exploiters to be punished, it seems even more socially expensive than peer-punishment. However, the issue of second-order punishment favours pool-punishment. If everyone contributes to the public good, then peer-punishers are not distinguishable from second-order free-riders. By contrast, pool-punishers declare themselves beforehand. We may expect that pool-punishment leads more easily to a second-order punishment regime and, hence, to more stability.
Because sanctioning institutions, as known from social history, usually forbid individuals to take the law into their own hands, it is also worthwhile to investigate the competition between peer- and pool-punishment. A model based on evolutionary game theory shows that both peer- and pool-punishment can emerge, if participation in the joint effort is optional rather than compulsory. Pool-punishment requires second-order punishment, whereas peer-punishment is little affected by it. Both sanctioning mechanisms can evolve if players simply imitate whatever yields the highest pay-off. If peer-punishers compete with pool-punishers, all depends on second-order punishment. Without it, the population is dominated by peer-punishers. With it, pool-punishers take over, although the average income is thereby reduced.
A ‘punishment fund’ can be viewed as a rudimentary institution to uphold the common interest. Many small-scale societies use this principle, for instance by hiring an enforcer. In Governing the Commons16, several examples of self-financed contract enforcement are described. They concern the provisioning and the appropriation of common resources, for instance high mountain meadows (the ‘commons’), irrigation systems or inshore fisheries. Our model shows that individuals can spontaneously adopt a self-governing institution to monitor contributions and sanction free-riders. It needs no top-down prescriptions from higher authorities, nor great feats of planning: trial and error, and the imitation of successful examples, can lead to a social contract among individuals guided by self-interest.
To model a PGG, we assume that if N ≥ 2 individuals participate in the interaction, each can decide whether to contribute a fixed amount, c > 0, to the common pool. This amount will be multiplied by a factor of r > 1 and then divided among the N − 1 other players. If all contribute, they obtain (r − 1)c each. Because contributors do not benefit from their own contribution, self-interested players ought to contribute nothing. If all do this, their pay-off will be zero. This reveals a social dilemma.
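The pay-off rule just described can be written out as a short sketch. The function name `pgg_payoffs` and the default values of `c` and `r` are illustrative assumptions, not taken from the paper; the only rule encoded is that each contribution is multiplied by r and shared among the N − 1 other players.

```python
def pgg_payoffs(contributes, c=1.0, r=3.0):
    """Return each player's pay-off in one public goods game.

    contributes: list of booleans, one per participant (N >= 2).
    Each contribution c is multiplied by r and shared equally among
    the N - 1 *other* players, so contributors get nothing back from
    their own contribution.
    """
    n = len(contributes)
    if n < 2:
        raise ValueError("a public goods game needs at least two players")
    total = sum(contributes)  # number of contributors in the group
    payoffs = []
    for gives in contributes:
        others = total - (1 if gives else 0)  # contributors other than this player
        received = r * c * others / (n - 1)
        payoffs.append(received - (c if gives else 0.0))
    return payoffs


# All five contribute: everyone earns (r - 1) * c.
print(pgg_payoffs([True] * 5))
# Nobody contributes: everyone earns zero.
print(pgg_payoffs([False] * 5))
# One free-rider among four contributors earns more than they do.
print(pgg_payoffs([True, True, True, True, False]))
```

In the last call the free-rider out-earns every contributor, which is the social dilemma in miniature.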
Pool-punishers not only contribute c to the PGG, but also, beforehand, an amount, G, to a punishment pool. Free-riders will be fined an amount, B·Nv, proportional to the number, Nv, of pool-punishers. In the case of second-order punishment, second-order free-riders will be fined the same amount. Peer-punishers contribute c to the PGG, and after the game impose a fine, β, on each free-rider in their group, at a cost, γ, per fine. If Nw peer-punishers are in the group, each defector pays a total fine of β·Nw. In the case of second-order punishment, second-order defectors are treated just like defectors.
Let us assume that the game is not compulsory11, 17. Some players may abstain from the joint enterprise. They can do something else instead, and earn a pay-off, σ, independent of what others are doing. If only one player is willing to engage in the joint effort, there will be no PGG and the solitary would-be participant also earns σ.
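The rules for the five strategies can be combined into one sketch of the pay-offs within a single sampled group. Several details here are assumptions made for illustration and are not stated in the text: the function name and default parameter values, the sharing of contributions among the other participants only (non-participants are excluded), and the convention that peer-punishers can apply second-order punishment only when at least one defector is present in the group to reveal who does not punish.

```python
def group_payoffs(group, c=1.0, r=3.0, sigma=1.0, G=0.5, B=0.5,
                  beta=0.5, gamma=0.3, second_order=False):
    """Pay-off of every player in one sampled group.

    group: list of strategy labels, each one of
      'X' contributes, does not punish    'Y' defects
      'Z' does not participate            'V' pool-punisher
      'W' peer-punisher
    """
    players = [s for s in group if s != 'Z']
    if len(players) < 2:
        # No joint enterprise takes place: everyone earns sigma.
        return [sigma] * len(group)

    n_x = players.count('X')
    n_y = players.count('Y')
    n_v = players.count('V')
    n_w = players.count('W')
    n_contrib = len(players) - n_y

    # Assumption: peer-punishers spot second-order free-riders only
    # when a defector is present to be punished.
    peer_second = second_order and n_y > 0

    payoffs = []
    for s in group:
        if s == 'Z':
            payoffs.append(sigma)
            continue
        gives = 1 if s != 'Y' else 0
        p = r * c * (n_contrib - gives) / (len(players) - 1) - c * gives
        if s == 'V':
            p -= G  # paid beforehand, whether or not anyone defects
        # Fines from the punishment pool: B per pool-punisher.
        if s == 'Y' or (second_order and s in ('X', 'W')):
            p -= B * n_v
        # Fines from peer-punishers: beta per peer-punisher.
        if s == 'Y' or (peer_second and s in ('X', 'V')):
            p -= beta * n_w
        # Cost to a peer-punisher: gamma per player fined.
        if s == 'W':
            targets = n_y + ((n_x + n_v) if peer_second else 0)
            p -= gamma * targets
        payoffs.append(p)
    return payoffs


print(group_payoffs(['V'] * 5))                      # all pool-punishers
print(group_payoffs(['W'] * 5))                      # all peer-punishers
print(group_payoffs(['V', 'V', 'V', 'V', 'Y']))      # one defector is fined
print(group_payoffs(['Z', 'Z', 'Z', 'Z', 'X']))      # lone participant earns sigma
```

With these defaults, a group of pool-punishers earns (r − 1)c − G each while a group of peer-punishers earns (r − 1)c each, which illustrates why pool-punishment is the more expensive arrangement when nobody defects.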
Let M denote the population size; X the number of players who participate in the PGG and contribute, but do not punish; Y the number of defectors, who participate but contribute neither to the PGG nor to the sanctions; Z the number of non-participants; V the number of pool-punishers; and W the number of peer-punishers. Random samples of N individuals are faced with the opportunity of a joint enterprise. Social learning leads to preferential copying of successful strategies. We obtain their long-run frequencies by numerical simulations (compare with Figs 1, 2 and 3). In a limiting case, we obtain analytic results (Supplementary Information) that we now describe.
Let us first neglect peer-punishment, and assume that the pay-off, σ, for non-participants lies between zero (obtained if all free-ride) and (r − 1)c − G (obtained if all contribute to the PGG and the punishment pool). The inequality
0 < σ < (r − 1)c − G    (1)
highlights that participating in the joint enterprise is a venture that succeeds if most participants contribute and fails if most do not.
In the absence of second-order punishment, the long-run frequencies in the (X, Y, Z, V) subpopulations are (2, 2, 2, 1)/7 and little cooperation is achieved. With second-order punishment, the corresponding long-run frequencies are (0, 0, 0, 1). The population is dominated by pool-punishers enforcing cooperation. If the game is compulsory (that is, Z = 0), the population consists of free-riders only.
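These frequencies can be reproduced with a small exact calculation. The sketch below assumes the limit of rare exploration and strong imitation, in which the population is almost always homogeneous and jumps between the pure states X, Y, Z and V whenever a single mutant takes over. The take-over probabilities in `rho` are assumptions chosen to match the qualitative arguments in the text (for example, a lone participant among non-participants is initially neutral, giving probability 1/2); they are not copied from the Supplementary Information, and all names are hypothetical.

```python
from fractions import Fraction


def stationary(states, rho):
    """Stationary distribution of the chain on homogeneous populations.

    rho[(a, b)] is the assumed probability that a single b-mutant takes
    over an all-a population; missing pairs count as 0.
    """
    k = len(states)
    # Each of the k - 1 other strategies is equally likely to appear.
    P = [[Fraction(0)] * k for _ in range(k)]
    for i, a in enumerate(states):
        for j, b in enumerate(states):
            if i != j:
                P[i][j] = Fraction(rho.get((a, b), 0)) / (k - 1)
        P[i][i] = 1 - sum(P[i])

    # Solve pi (P - I) = 0 for the first k - 1 states, plus sum(pi) = 1.
    rows = [[P[i][j] - (1 if i == j else 0) for i in range(k)] + [Fraction(0)]
            for j in range(k - 1)]
    rows.append([Fraction(1)] * k + [Fraction(1)])

    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [x / lead for x in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]


states = ['X', 'Y', 'Z', 'V']
half = Fraction(1, 2)

# Assumed take-over probabilities without second-order punishment.
rho_plain = {('X', 'Y'): 1, ('Y', 'Z'): 1,
             ('Z', 'X'): half, ('Z', 'V'): half,
             ('V', 'X'): 1}
# With second-order punishment, X can no longer invade V.
rho_second = {k: v for k, v in rho_plain.items() if k != ('V', 'X')}

print(stationary(states, rho_plain))   # fractions 2/7, 2/7, 2/7, 1/7
print(stationary(states, rho_second))  # fractions 0, 0, 0, 1
```

The only difference between the two runs is the entry for X invading V, which shows how second-order punishment turns the all-V state into an absorbing one under these assumptions.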
Alternatively, if we neglect pool-punishment, and assume that
0 < σ < (r − 1)c    (2)
the long-run frequencies in the (X, Y, Z, W) subpopulations are (2, 2, 2, M + 2)/(M + 8) and punishers prevail, with or without second-order punishment. Again, if the game is compulsory, only free-riders survive in the long run.
In the competition between peer- and pool-punishers without second-order punishment, peer-punishers win. The long-run frequencies in the (X, Y, Z, V, W) subpopulations are (6, 6, 4, 1, 3M + 6)/(3M + 23). With second-order punishment, pool-punishers win, and the corresponding frequencies are (0, 0, 0, 1, 0).
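The closed-form frequencies involving peer-punishers can be checked by verifying that they are left unchanged by one step of the chain on homogeneous populations. As a sketch under stated assumptions: the limit of rare exploration and strong imitation is used, neutral take-overs between X and W are given probability 1/M, and the other take-over probabilities in `assumed_rho` are illustrative values consistent with the arguments in the text rather than figures taken from the Supplementary Information. All function names are hypothetical.

```python
from fractions import Fraction

STATES = ['X', 'Y', 'Z', 'V', 'W']


def assumed_rho(M, second_order=False):
    """Assumed probability that a single mutant (second label) takes over
    a homogeneous population of residents (first label)."""
    half, drift = Fraction(1, 2), Fraction(1, M)
    rho = {
        ('X', 'Y'): 1, ('Y', 'Z'): 1,
        ('Z', 'X'): half, ('Z', 'V'): half, ('Z', 'W'): half,
        # X and W are indistinguishable when nobody defects: neutral drift.
        ('X', 'W'): drift, ('W', 'X'): drift,
    }
    if not second_order:
        # Without second-order punishment, pool-punishers are exploited
        # by anyone who contributes without paying into the pool.
        rho[('V', 'X')] = 1
        rho[('V', 'W')] = 1
    return rho


def transition_matrix(states, rho):
    k = len(states)
    P = [[Fraction(0)] * k for _ in range(k)]
    for i, a in enumerate(states):
        for j, b in enumerate(states):
            if i != j:
                P[i][j] = Fraction(rho.get((a, b), 0)) / (k - 1)
        P[i][i] = 1 - sum(P[i])
    return P


def is_stationary(pi, P):
    """True if pi sums to one and satisfies pi P = pi exactly."""
    k = len(pi)
    return sum(pi) == 1 and all(
        sum(pi[i] * P[i][j] for i in range(k)) == pi[j] for j in range(k))


def check(M):
    peer_only = ['X', 'Y', 'Z', 'W']
    pi_peer = [Fraction(v, M + 8) for v in (2, 2, 2, M + 2)]
    pi_both = [Fraction(v, 3 * M + 23) for v in (6, 6, 4, 1, 3 * M + 6)]
    pi_second = [Fraction(v) for v in (0, 0, 0, 1, 0)]
    return (
        is_stationary(pi_peer, transition_matrix(peer_only, assumed_rho(M))),
        is_stationary(pi_both, transition_matrix(STATES, assumed_rho(M))),
        is_stationary(pi_second,
                      transition_matrix(STATES, assumed_rho(M, True))),
    )


for M in (10, 100, 1000):
    print(M, check(M))
```

Because the share of peer-punishers grows with M in both closed forms, they dominate in large populations under these assumptions unless second-order punishment makes the all-V state absorbing.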
Repression of free-riding is a basic theme for several major transitions in evolution18, and can lead to evolutionarily stable strategies allocating part of the contribution towards suppressing competition19. In human societies, sanctions are ubiquitous4, 16, 20, 21. Peer-punishment emerges more easily than pool-punishment, because it requires no second-order punishment, and inequality (2) is weaker than inequality (1). But with second-order sanctions, pool-punishment out-competes peer-punishment, despite being socially expensive. Both types of punishment only emerge, in our model, if players can opt out of the joint enterprise. This restricts the range of applications22, 23. However, there is considerable evidence that cooperation can increase, if participation is voluntary rather than compulsory24, 25, 26 (see Supplementary Information for an intuitive explanation).
Many early experiments on public goods with punishment terminated after six or ten rounds, and although punishment usually increased the propensity to cooperate, the overall income was often less than without punishment2, 8, 9. But if the number of rounds is sufficiently large, cooperation becomes common3. As long as players avoid antisocial punishment of contributors5 (a feature not included in our model), peer-punishment becomes cost free. Pool-punishment entails fixed costs and thus is less efficient. However, peer-punishment is ill-suited for second-order punishment, as has also been observed empirically27. Pool-punishment is more conducive to second-order punishment. A sanctioning institution should view anyone not contributing to its upkeep as a defector and resort to second-order punishment. Adding second-order punishment may add to the cost of sanctioning, but as long as inequality (1) holds, the results are unaffected.
Experimental PGGs allowing players to opt, from round to round, between treatments with or without peer-punishment28, or to vote on whether to forbid antisocial punishment29, suggest intermediary stages towards pool-punishment. Further steps towards endogenous institution formation are analysed in refs 23, 30. We considered players motivated entirely by self-interest, and did not assume preferences for reciprocity or equity21. This obviously does not mean that such preferences do not exist. Their emergence may actually have been favoured by the prevalence of sanctioning institutions over thousands of years.
We left out many important issues, such as quorum-sensing and signalling, reputation and opportunism, repeated interactions and graduated punishment, and did not specify how pool-punishment is actually set up. Our model is minimalistic, but allows proof of principle. Origins of institutions are notoriously difficult to trace, but we have shown that they can emerge spontaneously among self-interested individuals.
We apply evolutionary game theory to populations of fixed size, M, and variable composition, X, Y, Z, V and W (the numbers of players using the five strategies for the optional PGG with peer- or pool-punishment). We compute the pay-offs obtained by players using these strategies. The pay-off differences define the probabilities that the strategies are copied through social learning, as a function of a parameter, s ≥ 0, measuring ‘imitation strength’. Together with an ‘exploration rate’, μ ≥ 0, which specifies the propensity to switch randomly to another strategy, this defines a stochastic process describing the evolution of the frequencies X, Y, Z, V and W. We compute their stationary distributions (which correspond to the relative frequencies in the long run) both numerically and, in a limiting case, analytically, and check these values by individual-based simulations. This allows us to compare the evolution of any subset of the five strategies under social learning. For further details, see Supplementary Information.
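One elementary step of such a social-learning process can be sketched as follows. The exact update rule is specified in the Supplementary Information and is not reproduced here; this sketch assumes a pairwise-comparison rule in which a player copies a randomly chosen model with a probability given by a logistic function of the pay-off difference, scaled by the imitation strength s. The callback `payoff_of` and all function names are hypothetical placeholders.

```python
import math
import random


def imitation_probability(payoff_self, payoff_model, s):
    """Probability of copying the model (assumed logistic rule).

    s = 0 gives a coin toss; large s means the better strategy is
    almost always copied.
    """
    x = s * (payoff_model - payoff_self)
    # Numerically stable evaluation of 1 / (1 + exp(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def learning_step(population, payoff_of, s, mu, strategies, rng=random):
    """Update one randomly chosen player in place.

    population: list of strategy labels of length M
    payoff_of:  hypothetical callback (strategy, population) -> pay-off
    mu:         exploration rate, the chance of a random switch
    """
    i = rng.randrange(len(population))
    if rng.random() < mu:
        # Exploration: switch to a different strategy chosen at random.
        others = [t for t in strategies if t != population[i]]
        population[i] = rng.choice(others)
        return
    # Imitation: compare with a randomly chosen model player.
    j = rng.randrange(len(population))
    p = imitation_probability(payoff_of(population[i], population),
                              payoff_of(population[j], population), s)
    if rng.random() < p:
        population[i] = population[j]


# Toy usage with a constant pay-off for every strategy.
rng = random.Random(0)
pop = ['X'] * 10
for _ in range(100):
    learning_step(pop, lambda strategy, population: 0.0,
                  s=1.0, mu=0.1, strategies=list('XYZVW'), rng=rng)
print(pop)
```

Repeating this step many times and recording how often each strategy is present gives an individual-based estimate of the long-run frequencies, with the pay-off callback supplied by whichever sampling scheme is adopted.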
- Cooperation and punishment in public good experiments. Am. Econ. Rev. 90, 980–994 (2000)
- The efficient interaction of indirect reciprocity and costly punishment. Nature 444, 718–723 (2006)
- The long-run benefits of punishment. Science 322, 1510–1512 (2008)
- Costly punishment across human societies. Science 312, 1767–1770 (2006)
- Antisocial punishment across societies. Science 319, 1362–1367 (2008)
- Altruistic punishment in humans. Nature 415, 137–140 (2002)
- Cooperation and punishment, especially in humans. Am. Nat. 164, 753–764 (2004)
- The economics of altruistic punishment and the maintenance of cooperation. Proc. R. Soc. B 275, 871–878 (2008)
- Detrimental effects of sanctions on human altruism. Nature 422, 137–140 (2003)
- The evolution of altruistic punishment. Proc. Natl Acad. Sci. USA 100, 3531–3535 (2003)
- Altruistic punishment and the origin of cooperation. Proc. Natl Acad. Sci. USA 102, 7047–7049 (2005)
- The evolution of altruism and punishment: role of the selfish punisher. J. Theor. Biol. 240, 475–488 (2006)
- Strong reciprocity or strong ferocity? A population genetic view of the evolution of altruistic punishment. Am. Nat. 170, 21–36 (2007)
- Punishment allows the evolution of cooperation (or anything else) in sizable groups. Ethol. Sociobiol. 13, 171–195 (1992)
- The provision of a sanctioning system as a public good. J. Pers. Soc. Psychol. 51, 110–116 (1986)
- Governing the Commons: The Evolution of Institutions for Collective Action (Cambridge Univ. Press, 1990)
- Sigmund, K. Via freedom to coercion: the emergence of costly punishment. Science 316, 1905–1907 (2007)
- The Major Transitions in Evolution (Oxford Univ. Press, 1997)
- Mutual policing and repression of competition in the evolution of cooperative groups. Nature 377, 520–522 (1995)
- Games, Groups, and the Global Good (Springer, 2009)
- In The Drama of the Commons (eds Ostrom, E. et al.) 157–191 (National Academy, 2002)
- When does optional participation allow the evolution of cooperation? Proc. R. Soc. B 276, 1167–1174 (2009)
- Coordinated punishment of defectors sustains cooperation and can proliferate when rare. Science 328, 617–620 (2010)
- Social welfare, cooperator’s advantage, and the option of not playing the game. Am. Sociol. Rev. 58, 787–800 (1993)
- Volunteering as a Red Queen mechanism for cooperation. Science 296, 1129–1132 (2002)
- Volunteering leads to rock–paper–scissors dynamics in a public goods game. Nature 425, 390–393 (2003)
- Second order punishment in one-shot prisoner’s dilemma. Int. J. Psychol. 39, 329–334 (2004)
- The competitive advantage of sanctioning institutions. Science 312, 108–111 (2006)
- Who to punish? Individual decisions and majority rule in mitigating the free rider problem. Eur. Econ. Rev. 53, 495–511 (2009)
- Institution formation in public goods games. Am. Econ. Rev. 99, 1335–1355 (2009)
K.S. acknowledges TECT I-104 G15, A.T. thanks the Emmy Noether programme of the DFG and C.H. thanks NSERC (Canada).