The consumer welfare standard can’t be saved with more theory. The problem is how it works in practice, and solving that means changing the burden of proof in antitrust cases.

The Stigler Center’s 2023 Antitrust and Competition conference seeks to answer the question: what lies beyond the consumer welfare standard? In advance of the discussions, ProMarket is publishing a series of papers on the infamous consumer welfare standard. This piece is part of that debate.


What standard should drive antitrust enforcement? Unsurprisingly, given that it’s the motivating question for this conference, this is quite difficult to answer. Antitrust covers the whole economy, meaning that any standard needs to be broad enough to handle diverse problems across very different industries. Yet, because antitrust regulators wield significant power, the standard must also be specific enough to allow for predictability and accountability—otherwise, interest groups could hijack the policy to increase their economic power and extract rents, suppressing rather than promoting competition. Furthermore, any public policy needs a guiding principle, and antitrust is no exception.

In light of these criteria, we’ll argue that the consumer welfare standard is not the right standard for antitrust — and moreover that it is not a few tweaks and updates away from serving that role. The central problem with today’s standard is the burden of proof: antitrust enforcers are forced to demonstrate their case in the face of an avalanche of modelling by consultants, hired by companies that almost by definition have endless resources. Adjusting the consumer welfare standard to, say, better account for innovation won’t fix that. Therefore, in our view, stronger rebuttable structural presumptions in merger control are a better path forward.

What good antitrust policy looks like

In our view, there are at least five properties that an ideal antitrust standard should have:

  1. Grounded in science: The standard should be based on the best scientific research on the state of competition in markets, and its proposers should demonstrate and regularly check that the standard results in material welfare benefits to society.
  2. Focused on “competition” problems: Antitrust policy ultimately promotes market competition, and an antitrust standard should acknowledge that. The central question for such a standard, therefore, is whether increasing competition specifically helps advance societal welfare.
  3. Accepts political priorities: Antitrust is not a self-sufficient island but a public policy tool, part of a broader regulatory toolbox to influence market behavior. If democratically elected governments decide—for example, through the passage of laws—that regulators should maximize a certain welfare notion (say, actual consumer welfare rather than total welfare), antitrust policy should defer to this decision. The same goes for enforcement priorities. Determining the broad boundaries of competition policy entails important trade-offs between economic efficiency and equity or the distribution of resources, and elected officials have more legitimacy to make these broad trade-offs than antitrust officials.
  4. Actionable, accountable and predictable: The standard that guides enforcement decisions must be actionable. It must generate evidence that allows interested parties, and society more broadly, to assess whether it is accomplishing its goal (accountable). And it should enable private parties to understand what the standard implies for their behavior without too much uncertainty (predictable). Being actionable, accountable, and predictable does not mean that regulators retain no discretion. Part of accountability lies in whether enforcers properly used their discretion when deciding whether and how to prosecute cases, subject to ex-post review by legislators and society more broadly.
  5. Acknowledges real-world limitations: Finally, the standard should acknowledge the skewed and distorted incentive structure in antitrust enforcement. Enforcers are under-resourced almost by definition—they are constantly facing the largest companies in their respective economies. They also operate in environments with large information asymmetries: firms know the rationale for their business plans and the competitive realities of their markets far better than enforcers do. Rents exist in many markets, especially in the concentrated markets that are the prime target of antitrust enforcers, and incumbent firms have the resources and incentives to preserve those rents at almost any cost. The standard for our times must therefore encourage companies to reveal private information, and it must acknowledge the increasingly reliable evidence that concentration and market power are producing negative effects on society. That evidence suggests that companies should face a higher burden to show that their actions are legitimate.

The limits of the consumer welfare standard

As many other articles in this series show, the current interpretation and application of the consumer welfare standard do not fare well against the criteria set out above. First, the standard typically tries to assess impacts on consumer surplus under Marshallian demand, a criterion that has been criticized in modern welfare economics and in law. Second, its practical application has largely been confined to static price effects in the context of particular models of horizontal product differentiation, which are not really connected to our modern understanding of how markets and market players behave. Additionally, the few ex-post assessments of policy interventions that do exist indicate that its application has not led to lower prices in actual cases (principally mergers, as very few dominance cases are brought). We note that the growing use of ever more complex economic models in antitrust over the past thirty years has coincided with a decline in enforcement activity, alongside industry trends toward higher margins and greater concentration, as well as evidence of reduced industry dynamism.

This brings us to the third point above, namely that the current debate is happening largely because antitrust practice has become disconnected from wider society, which is reclaiming a voice in a discussion too often left to a select few “experts”. As for being actionable (the fourth point), we have already noted that the current welfare standard does have this property, but only when it comes to static price effects. It is also predictable in one direction only, in the sense of having produced systematic underenforcement, which is particularly worrying under the current, more concentrated distribution of market power. And fifth, it ignores real-world limitations. Enforcers do not have the resources to fight many battles under the current standard, they are being spammed by economic consultancies, and they must satisfy increasingly strict standards of proof that give companies no incentive to reveal private information. The result has been a steady decline in enforcement even as market concentration has increased. This is recognized by the many proposals and actual reforms to strengthen antitrust enforcement now taking place around the world.

Why the consumer welfare standard can’t be rescued

A classic response to these criticisms is that the consumer welfare standard is “flexible” enough to go beyond static price effects. We disagree, at least as far as practice is concerned. Let us start, though, with the classic response. Economists are constantly developing more sophisticated tools that could accommodate factors such as innovation, quality, or consumer choice. Indeed, some (limited) precedents exist: for example, there is an increasing push to incorporate innovation considerations into merger control through some form of economic modelling. One of us was directly involved with an “innovation theory of harm” in merger control that the European Commission used in the agrochemical industry in 2017 and 2018 (e.g., Dow/DuPont, Bayer/Monsanto). The Commission was successful in those cases, but received so much criticism and pushback that it decided, de facto, not to engage with the theory again. Too many resources were burnt, and too many people got scars—an experience not to be repeated.

This shows the major limitations of the current approach to antitrust. A more “flexible” consumer welfare standard that maintains current burdens of proof will likely mean ever more complex models, which are even more vulnerable to the doubt cast by an industry of expert consultants. This occurs in a context where judges are even less able to understand what these models mean in practice, leaving them more susceptible to external influence and leading to bad decisions that misapply economic theory even in reasonably settled areas. All this while enforcers are so under-resourced that they must triage intensely and challenge only the most egregious cases. Spamming and overwhelming antitrust authorities is an effective strategy, on both sides of the Atlantic, to decrease enforcement. If anything, the current consumer welfare standard is purposefully built to exploit (or, at least, totally ignores) real-world limitations in policymaking. Using more “sophisticated” economics to accommodate an expanded version of the current welfare standard, while interesting in theory, is more likely to produce problems in practice.

A modest proposal for a change: Stronger rebuttable structural presumptions in merger control

In this short proposal, we discuss mergers (though some ideas could be adapted to dominance cases involving near-monopolists). Currently, enforcement resources are not well employed. In traditional merger control, the regulator must, first and foremost, find market power in a well-defined relevant market (with the definition of what economic evidence qualifies a market as “well-defined” becoming stricter by the year). This has created a real obsession in the profession with market definition. From the companies’ perspective, the game is clear: if the relevant market is found to be large enough, a case will never even start—courts have many times rejected merger challenges on the basis of absurd, firm-driven relevant market definitions. A regulator has no choice but to spend precious time and staff on this step. Because resources are finite, very little is left for the actual analysis of the competitive impacts of the merger. It is also up to the enforcer to then show that the merger is likely to produce negative effects on the market (in tests that are often disconnected from the statutory standards). All this while the clock is ticking against the enforcer, and in an environment where regulators often depend on the merging parties to provide the information necessary for their analysis. Unsurprisingly, only a few battles can be fought in this setting. As regulators have become increasingly overwhelmed, the historical solution has been to sweep the dirt under the carpet by raising merger notification thresholds, allowing harmful mergers to avoid scrutiny. Hardly a good outcome for society.

Stronger rebuttable structural presumptions in merger control, acknowledging that size and market structure matter, have the power to reverse this downward spiral. The largest firms are those that, at least prima facie, are most likely to have market power. They are also the firms where the risk of harm to markets and other societal interests is greatest, precisely because of their size. Hence, the presumption would be that a merger involving a firm above a certain size threshold is likely to lead to adverse effects on competition. A merger or acquisition involving that firm would not be allowed unless the parties demonstrate that it will likely lead to significant efficiencies that will be shared with consumers.

A structural proposal is grounded in robust economics. In the context of horizontal mergers, every merger is bad for consumers in the absence of efficiencies, a result that is uncontroversial but has been strangely lost in the antitrust discussion. (Practical antitrust enforcement often invokes a presumption of efficiency for any vertical merger, even those involving large firms with market power. That presumption simply does not exist as a matter of economics.)
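
To make that textbook result concrete, here is a minimal illustration of our own (not drawn from the cited literature): a symmetric Cournot market with linear demand, in which merging two of three identical firms raises price and lowers consumer surplus whenever the merger brings no cost savings. All numbers are purely illustrative.

```python
# Illustrative sketch (not from the cited papers): symmetric Cournot market
# with linear inverse demand P = a - Q and constant marginal cost c.
# With n identical firms, the equilibrium price is (a + n*c) / (n + 1), so a
# merger that reduces n without lowering c raises price and cuts consumer
# surplus -- the "no efficiencies, no consumer benefit" result in the text.

def cournot_price(a: float, c: float, n: int) -> float:
    """Equilibrium price with n symmetric Cournot competitors."""
    return (a + n * c) / (n + 1)

def consumer_surplus(a: float, price: float) -> float:
    """Consumer surplus under linear demand Q = a - P."""
    quantity = a - price
    return 0.5 * quantity ** 2

a, c = 100.0, 20.0                   # illustrative demand intercept and cost

p_pre = cournot_price(a, c, n=3)     # 40.0 with three firms
p_post = cournot_price(a, c, n=2)    # ~46.7 after a merger with no cost savings

print(p_pre, p_post)                                             # price rises
print(consumer_surplus(a, p_pre), consumer_surplus(a, p_post))   # surplus falls
```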

Regarding efficiencies, some commentators will certainly argue that a structural presumption would miss the efficiencies generated by blocked mergers. This is not true, for at least two reasons. First, there is no empirical evidence that mergers involving large firms recurrently generate efficiencies, while there is evidence that current merger control has failed to prevent price increases and other types of harm. This makes sense: the largest firms may well have already exhausted reasonable economies of scale. In their merger chapter in the IO Handbook, Asker and Nocke review about 30 merger retrospective studies across many industries (both concentrated and unconcentrated). Most found price increases. Only one study finds evidence of long-run efficiencies, namely Focarelli and Panetta (2003), which uses data on Italian banks in the 1990s, when the number of national banks decreased from 78 to 56. Even in narrowly defined markets (provinces), the HHI was low enough to avoid any scrutiny under the DOJ guidelines. In other words, their finding has little to do with the concentrated markets where a structural presumption would apply.

Second, firms may still demonstrate such efficiencies in their rebuttal. For that, the firm would have to show that the merger is necessary to achieve such efficiencies (that is, there are no less anticompetitive actions available to reach the same effect) and demonstrate that consumers will also benefit directly from the merger.

Let us rebut the old criticism from the 1970s that size is positively related to efficiency, and that size should therefore not be punished. A structural presumption for mergers does not punish size. If firms want to grow organically, they can do so. Nor does it impose a line-of-business restriction: incumbent firms can still invest in other markets, which is pro-competitive. The presumption only says that large firms are more likely to have market power, wherever that market power comes from, and that those firms should prima facie “make” rather than “buy.”

Still, some will argue that this proposal ultimately means that every merger will be blocked, leading to large welfare losses through missed synergies. This is not true. First, as mentioned above, the presumption would apply only to very large firms, and could be supplemented by safe harbors for very small firms. Second, and more importantly, the “rebuttable” part safeguards the transactions with the largest gains. The firms in question have better access to the information, time, and economic resources needed to demonstrate that their actions benefit society. If they cannot prove it, then most likely no one else can. This presumption therefore also creates strong incentives for firms to share proprietary data and diminish information asymmetries. Instead of simply working to block regulators’ activities, merging parties would now be induced to explain the merits of their cases.

Structural indicators

In order to implement a structural presumption, one needs to derive structural indicators to use as thresholds for the presumption to kick in. This is where the standard of proof can be lowered to give the regulator an effective tool. While every metric will be imperfect, one can imagine an array of indicators such as turnover, market capitalization, market share in a broadly defined market, and so forth. We do not intend to be exhaustive here, but rather to propose ideas to start a discussion. Broadly speaking, indicators can belong to three categories: firm-level characteristics, transaction-level characteristics, and industry-level characteristics.

The first group (firm-level characteristics) includes notions such as a firm’s turnover, market capitalization, and the like. It has been used to designate, for instance, digital gatekeepers under the European Digital Markets Act (which uses thresholds based on market capitalization, turnover, and monthly active users over a three-year window). Margins are another important category that has received increasing attention. This is because margins—the ability to sustain prices above marginal cost—are the textbook definition of market power (without expressing any normative judgment on the source of such power). Evidence of high margins, especially if sustained over time, is direct evidence of the existence and exercise of market power. All these indicators would apply across the general economy, and they should be built and used more routinely by antitrust enforcers to follow market dynamics (for example, through recurrent use of the FTC’s now almost defunct powers under Section 6(b) of the FTC Act).
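
To illustrate how such firm-level indicators could feed a screening rule, here is a minimal sketch. The firm data, thresholds, and decision rule are hypothetical placeholders; the margin calculation simply operationalizes the “prices above marginal cost” notion mentioned above.

```python
# Minimal sketch of firm-level screening indicators (hypothetical data and
# thresholds). The Lerner-style margin (price - marginal cost) / price is the
# textbook measure of market power referenced in the text.

from dataclasses import dataclass

@dataclass
class Firm:
    name: str
    turnover_eur_bn: float       # annual turnover
    market_cap_eur_bn: float     # market capitalization
    price: float                 # average price of the main product
    marginal_cost: float         # estimated marginal cost

# Hypothetical thresholds for the presumption to kick in -- placeholders only.
TURNOVER_THRESHOLD = 10.0     # EUR bn
MARKET_CAP_THRESHOLD = 75.0   # EUR bn
MARGIN_THRESHOLD = 0.4        # sustained margin above 40%

def margin(firm: Firm) -> float:
    """Margin over price: the textbook indicator of market power."""
    return (firm.price - firm.marginal_cost) / firm.price

def presumption_applies(firm: Firm) -> bool:
    """True if any firm-level indicator exceeds its (hypothetical) threshold."""
    return (
        firm.turnover_eur_bn >= TURNOVER_THRESHOLD
        or firm.market_cap_eur_bn >= MARKET_CAP_THRESHOLD
        or margin(firm) >= MARGIN_THRESHOLD
    )

acquirer = Firm("AcquirerCo", turnover_eur_bn=42.0, market_cap_eur_bn=310.0,
                price=100.0, marginal_cost=55.0)
print(margin(acquirer))               # 0.45
print(presumption_applies(acquirer))  # True
```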

The second group (transaction-level characteristics) focuses on the acquisition price. If this price is “large”, it may be a prima facie signal that the acquirer is building anticompetitive pre-emption rents into its valuation. Fumagalli et al. (2023) first develop a theory that supports such a structural presumption, and then devise an empirical test to screen for problematic acquisitions. To do the latter, they select a sample of target companies comparable to the acquisition target under consideration. For each of them, they proxy firm value with the stock market value before the acquisition (or with other measures, such as venture capital valuations, if available). This gives a proxy for the value of the target firm net of the market power effect generated by the acquisition. They use this information on the comparable set of target firms to identify the “outside option” of a viable target. A formal statistical test can then compare the acquisition price with this outside option. They also propose an additional screen to identify which high-price takeovers will likely cause a larger welfare loss. The idea is that high-price acquisitions that cause a larger increase in market power should come with a larger premium over target firms’ market value (if the premium instead reflects merger-specific efficiencies, those would be assessed under the rebuttable part).
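
The logic of such a price screen can be sketched as follows. This is a stylized illustration rather than the formal test in Fumagalli et al. (2023): it simply flags an offer that sits far above the distribution of pre-acquisition values of comparable targets, and the cut-off rule and all figures are assumptions.

```python
# Stylized illustration of a transaction-level price screen (not the formal
# Fumagalli et al. 2023 test). The offered price is compared with the
# distribution of pre-acquisition values of comparable targets, used here as
# a rough proxy for the target's "outside option".

from statistics import mean, stdev

def price_screen(offer_price: float, comparable_values: list[float],
                 n_sd: float = 2.0) -> bool:
    """Flag the deal if the offer exceeds the comparables' mean value by more
    than n_sd standard deviations (an illustrative cut-off)."""
    mu, sigma = mean(comparable_values), stdev(comparable_values)
    return offer_price > mu + n_sd * sigma

# Pre-acquisition market values of comparable targets (illustrative, in $m).
comparables = [180, 210, 195, 240, 205, 220, 190, 215]

print(price_screen(offer_price=600, comparable_values=comparables))  # True
print(price_screen(offer_price=230, comparable_values=comparables))  # False
```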

A similar proposal exists to apply an economic goodwill threshold test. This test looks at a target’s net tangible assets as a proportion of the transaction value. The part of the difference between net tangible assets and transaction value that is not explained by intangible assets such as intellectual property rights can represent the gains the acquirer expects to realize from its strengthened competitive position. The economic goodwill test thus reflects the logic driving start-up acquisitions and represents a useful innovation in competition law enforcement. These are two examples of methods that provide a benchmark for determining what counts as a “large” transaction triggering additional presumptions. Ultimately, there is no “correct” determination for this number: it is a normative decision for legislators and competition authorities to make in consultation with society.
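
A back-of-the-envelope version of such a goodwill screen might look like the sketch below; the threshold and deal figures are purely hypothetical, and the treatment of identifiable intangibles is our own simplifying assumption.

```python
# Sketch of an economic goodwill screen (hypothetical threshold and figures).
# The share of the transaction value not explained by net tangible assets or
# identifiable intangible assets is treated as "economic goodwill", which may
# reflect the acquirer's expected gains from a strengthened competitive position.

def goodwill_share(transaction_value: float,
                   net_tangible_assets: float,
                   identifiable_intangibles: float = 0.0) -> float:
    goodwill = transaction_value - net_tangible_assets - identifiable_intangibles
    return goodwill / transaction_value

GOODWILL_THRESHOLD = 0.75  # placeholder: a normative choice for legislators

share = goodwill_share(transaction_value=1_000.0, net_tangible_assets=120.0,
                       identifiable_intangibles=80.0)
print(share)                        # 0.8
print(share >= GOODWILL_THRESHOLD)  # True: the presumption would kick in
```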

The third group of indicators (industry-level characteristics) could be structured in at least two different ways. The first is to define companies’ shares at the industry level. Databases based on, for example, NAICS codes or Orbis are routinely employed by researchers describing industry trends. Koltay et al. (2022) show how to employ these methods on a time series of European industries. They argue that the data show that, over the past 20 years, market power problems have emerged in European industries where the share of the four largest firms exceeds 50%. They do not propose this as a threshold for a structural presumption, but the method could be adapted to look for one.
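
As a simple illustration of this first industry-level approach, the sketch below computes a four-firm concentration ratio (CR4) from firm revenues grouped by industry code and flags industries above the 50% mark discussed by Koltay et al. (2022); the data, industry code, and flagging rule are illustrative only.

```python
# Sketch: four-firm concentration ratio (CR4) at the industry level, computed
# from firm revenue data. Industry codes, revenues, and the 50% flag are
# illustrative, in the spirit of the Koltay et al. (2022) exercise.

from collections import defaultdict

def cr4(firm_revenues: list[float]) -> float:
    """Share of industry revenue held by the four largest firms."""
    total = sum(firm_revenues)
    top4 = sum(sorted(firm_revenues, reverse=True)[:4])
    return top4 / total

# firm -> (industry code, revenue), illustrative values
firms = {
    "A": ("C26", 40.0), "B": ("C26", 30.0), "C": ("C26", 15.0),
    "D": ("C26", 10.0), "E": ("C26", 5.0),
}

by_industry = defaultdict(list)
for industry, revenue in firms.values():
    by_industry[industry].append(revenue)

for industry, revenues in by_industry.items():
    share = cr4(revenues)
    print(industry, round(share, 2), "flag" if share > 0.5 else "ok")  # C26 0.95 flag
```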

Another option is to build indicators based on current HHI methods, but focusing, for the purposes of the presumption, on HHI levels (rather than increments) and starting at much lower thresholds than those currently employed (which are not based on any particular economic theory and are extremely permissive of mergers). A note is warranted here on the recent, important contribution by Nocke and Whinston (2022). While they do argue that, in their model, it is the change in the HHI rather than its level that is more informative, they obtain this result under some important assumptions. First, in their empirical application, the level of the HHI is uninformative only conditional on using the increment. If one runs their regressions on the HHI level alone, their scatter plots already suggest that the required efficiencies would need to be higher in more concentrated markets, as there is a positive correlation between the change in the HHI and its level. Second, and interestingly, the paper dedicates a whole section to screening based on HHI levels, showing that it would make sense when an authority’s objective reflects a desire to prevent significant consumer harm. Indeed, the idea that increases in concentration lead to greater and greater increases in price is one intuitive argument for being concerned with the post-merger level of the HHI. Formally, the reduction in consumer surplus when efficiencies fall short of the level that would leave consumer surplus unchanged is larger the smaller the number of firms. The key force driving this effect is that, with fewer rival firms, non-merging firms replace less of any reduction in the merging firms’ supply.
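
The arithmetic behind the level-versus-increment distinction is easy to illustrate. The sketch below computes pre- and post-merger HHIs for a hypothetical market; with shares held fixed, the increment from merging two firms with shares s1 and s2 equals 2·s1·s2, and a low level-based threshold (such as the 1,000 figure discussed below) would catch the deal regardless of how the increment is assessed. Shares and thresholds are illustrative only.

```python
# Sketch: HHI level vs. increment for a hypothetical merger. With market
# shares held fixed, merging firms 1 and 2 raises the HHI by 2 * s1 * s2.
# Shares and thresholds are illustrative only.

def hhi(shares_pct: list[float]) -> float:
    """HHI on the 0-10,000 scale, with shares expressed in percent."""
    return sum(s ** 2 for s in shares_pct)

pre_merger = [30.0, 25.0, 20.0, 15.0, 10.0]
post_merger = [30.0 + 25.0, 20.0, 15.0, 10.0]   # firms 1 and 2 combine

level_pre = hhi(pre_merger)            # 2250
level_post = hhi(post_merger)          # 3750
increment = level_post - level_pre     # 1500 == 2 * 30 * 25

# A level-based presumption (e.g. post-merger HHI above a low threshold such
# as 1000) would catch this deal regardless of how the increment is assessed.
print(level_pre, level_post, increment)
print(level_post > 1000)  # True
```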

The use of HHIs to trigger the presumption, however, has a significant flaw: the index is always derived from market shares computed in specific cases, and this is precisely where the current system has failed, with its obsessive and narrow definition of relevant markets. One alternative could be to accept HHI determinations but have the presumption kick in at much lower levels of concentration (say, an HHI of 1,000 defines a concentrated market), diminishing the importance of market share determinations and relying on the other thresholds based on firm and transaction characteristics to do most of the work. The HHI level then becomes more akin to a safe harbor for harmless transactions than a screen for bad mergers.

Reviving Industry Studies: A proposal that emerges naturally from our discussion is that all the indicators mentioned above (which can be supplemented by others, such as profitability and firm entry and exit within industries) could, and perhaps should, be collected on an ongoing basis, rather than being debated case by case. Indeed, antitrust authorities should have “industry” units that collect regular statistical evidence on the competitive state of the markets they oversee, as the FTC used to do in the past. Distributions of market concentration and market power could be collected and compared across industries, allowing the thresholds to be refined based on evidence. Case teams involved in merger assessments could then obtain the structural indicators from the industry unit and add their transaction-specific indicators, rather than having to build their cases almost from scratch for every transaction. The industry unit could draw on a variety of expertise, including economists, statisticians, and industry experts. Among other things, this arrangement would have the added benefit of ensuring consistency in data collection and analysis.

Whether this will represent real change depends on how the specific proposal is structured and on the standards of proof that are adopted. The main risk with structural presumptions is that they produce a whack-a-mole situation and simply shift the economics battle from harm calculation to relevant market definition (though this is always a key battle anyway). As mentioned above, one possibility is that: (i) the market share definitions and intervention thresholds are based on easy-to-collect structural parameters (one could also use a weighted average across different definitions that likewise triggers the presumption), and (ii) the strength of the evidence required for rebuttal grows with the market share or the level of the structural parameter, as sketched below.
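
A minimal sketch of how points (i) and (ii) could fit together follows; the indicators, weights, composite trigger, and evidentiary tiers are all placeholders for what would ultimately be normative choices by legislators and enforcers.

```python
# Sketch of points (i) and (ii): a weighted average across several structural
# indicators triggers the presumption, and the evidentiary bar for rebuttal
# rises with that score. Weights, cut-offs, and tiers are placeholders.

WEIGHTS = {"market_share": 0.5, "turnover_score": 0.3, "margin_score": 0.2}
TRIGGER = 0.4   # composite score above which the presumption applies

def composite_score(indicators: dict[str, float]) -> float:
    """Weighted average of normalized indicators (each scaled to 0-1)."""
    return sum(WEIGHTS[k] * indicators[k] for k in WEIGHTS)

def rebuttal_tier(score: float) -> str:
    """Illustrative sliding scale: the higher the score, the stronger the
    evidence the merging parties must provide."""
    if score < TRIGGER:
        return "no presumption"
    if score < 0.6:
        return "documented, merger-specific efficiencies"
    return "verified efficiencies plus demonstrated pass-through to consumers"

deal = {"market_share": 0.55, "turnover_score": 0.8, "margin_score": 0.7}
score = composite_score(deal)
print(rebuttal_tier(score))  # score ~0.66 -> strongest rebuttal tier
```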

This would be relatively straightforward for industries such as pharmaceuticals, but it would be more challenging for more fluid industries, such as digital markets. Even there, however, thresholds may be easily established for core segments, such as general search, app stores or cloud computing. In any event, the regulatory resources freed in the analysis of traditional industries could lead to better scrutiny of highly dynamic markets.

Designing such standards requires making trade-offs. As such, the general guidelines would be best adopted by legislators—with some margin of discretion for regulators (as was done with the DMA in Europe, for example). However, most changes in antitrust policy in the US have taken place through administrative and judicial shifts, in almost open opposition to the text of the laws. Therefore, significant room exists for regulatory action that implements part of these presumptions even within the current legislative framework (or, at least, within the current text of the laws).

The specific thresholds ultimately chosen should also be widely debated before adoption (including the methodologies used to calculate them) and widely publicized after adoption. They should also be accompanied by requirements that regulators revisit them periodically to ensure that they remain reflective of market conditions.

These procedural safeguards help guarantee that the enforcement of antitrust law is grounded in science, accepts political priorities, and is actionable, accountable, and predictable: a major improvement over the current consumer welfare standard.