What are the effects of COVID-19 on mortality? Individual-level causes of death and population-level estimates of causal impact


How many people have died from COVID-19? What is the impact of COVID-19 on mortality in a population? Can we use excess mortality to estimate the effects of COVID-19?

In this text I will explain why the answers to the first two questions need not be the same. That is, the sum of cases where COVID-19 has been determined to be the direct[1] cause of death need not equal the population-level estimate of the causal impact of COVID-19. When measurement of the individual-level causes of death is imperfect, using excess mortality (observed minus expected deaths) to measure the impact of COVID-19 leads to an underestimate of the number of individual cases where COVID-19 was the direct cause of death.


The major assumption on which the argument rests is that some of the people who have died from COVID-19 would have died from other causes within a specified, relatively short time-frame (say, within the month). It seems very reasonable to assume that at least some of the victims of COVID-19 would have succumbed to other causes of death. This is especially easy to imagine given that COVID-19 disproportionately kills the very old, and that the ultimate causes of death it provokes – respiratory problems, lung failure, etc. – are shared with other common diseases with high mortality among the older population, such as the flu.

Defining individual and population-level causal effects

With this crucial assumption in mind, we can construct the following simple table. Cell A contains the people who would have survived if they had not caught the Coronavirus, but caught it and died. Cell B contains the people who caught the Coronavirus and died, but would have died from other causes even if they had not caught the virus[2]. Cell C contains the people who caught the virus and survived, and would have survived even if they had not caught it. Cell D contains the people who would have died if they had not caught the virus, but caught it and survived. Cell C is of no interest for the current argument, and for now we can assume that cases in Cell D are implausible (although this might change if we consider indirect effects of the pandemic and the policy measures it provoked; for now, we ignore such indirect effects). Cell E contains the people who did not catch the virus and survived (also not interesting for the argument). Cell F contains the people who did not catch the virus and died from other causes. As a matter of definition, total mortality within a period is A + B + F.

                                          Caught Coronavirus   Caught Coronavirus   Did not catch
                                          and died             and survived         Coronavirus
 Would have survived without the virus    A                    C                    E
 Would have died even without the virus   B                    D                    F

The number of individual-level deaths directly caused by COVID-19 that can be observed is the sum of cells A + B. Without further assumptions and specialized knowledge, we cannot estimate what share of that total would have died anyway (cell B). For now, just assume that this share is positive; that is, such cases exist. The population-level causal impact of COVID-19 is A, or, in words, those who have died from COVID-19 minus those who would have died from other causes within the same period. The population-level causal effect is defined counterfactually. Again, without further assumptions about the ratio of B to A, the population-level causal impact of COVID-19 is not identifiable. An important conclusion is that the population-level causal impact of COVID-19 on mortality does not necessarily equal the sum of the individual cases where COVID-19 was the cause of death.

Scenario I: perfect measures of individual-level causes of death

Assume for the moment that all individual cases where COVID-19 was the cause of death are observed and recorded. Under this assumption, what does excess mortality measure? Excess mortality is defined as the difference between the observed (O) and predicted (P) number of deaths within a period, with the prediction (expectation) coming from historical averages, statistical models or anything else[3]. Under our definitions, the observed mortality O in a period contains groups A + B + F. So the difference between observed O and predicted P gives A, or the number of people who have died from COVID-19 but would have survived otherwise. Therefore, excess mortality identifies the population-level causal impact of COVID-19 (see also the figure below).
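The accounting in this scenario can be illustrated with a toy calculation. The cell counts below are made-up numbers chosen purely for illustration, and the prediction P is assumed to be perfect, so it equals the deaths that would have occurred without the virus (B + F):

```python
# Hypothetical cell counts (illustrative assumptions, not real data)
A = 100   # died of COVID-19, would have survived otherwise
B = 30    # died of COVID-19, but would have died of other causes anyway
F = 1000  # did not catch the virus, died of other causes

O = A + B + F  # observed deaths in the period
P = B + F      # expected deaths absent the virus (assuming perfect prediction)

excess = O - P        # excess mortality
covid_deaths = A + B  # individual-level deaths caused by COVID-19

print(excess)        # 100 -> recovers A, the population-level effect
print(covid_deaths)  # 130 -> exceeds the excess by B
```

The excess recovers A exactly, while the recorded individual-level deaths exceed it by B, which is the implication discussed next.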

One implication of this line of reasoning is that under perfect measurement of individual-level causes of death, and a positive number of people who would have died from other causes if they had not died from COVID-19 (cell B), the sum of the cases where COVID-19 was recorded as a cause of death should exceed the excess in observed mortality O – P. (See the situation in France where this might be happening.)

Scenario II: imperfect measures of individual-level causes of death

Let’s consider now a more realistic scenario where determining and recording the individual causes of death is imperfect. Under this assumption, the observed number of deaths in a period still contains O = A + B + F. Excess mortality O – P still identifies the population-level effect A. However, this is not the number of deaths directly caused by COVID-19, which also includes those who would have died anyway (B): a category that is already included in the prediction about mortality during this period[4].

In other words, excess mortality underestimates the sum of individual cases where COVID-19 is the direct cause of death. The size of the underestimate depends on the share of people who would have died from other causes but died from COVID-19 instead: the larger this share, the larger the underestimate. To put it bluntly, COVID-19 kills more people than excess mortality suggests. This is because the expected number of deaths, on which the calculation of excess mortality depends, already contains the people who would have died from other causes but were killed by the virus.
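How much excess mortality undercounts can be sketched with made-up numbers: holding A fixed, the share of true COVID-19 deaths that excess mortality recovers shrinks as B grows.

```python
A = 100  # deaths that would not have occurred without the virus
for B in (0, 25, 100, 300):      # deaths that would have occurred anyway
    true_covid_deaths = A + B    # individual-level cause-of-death count
    excess = A                   # what excess mortality identifies
    share_recovered = excess / true_covid_deaths
    print(B, share_recovered)    # 1.0, 0.8, 0.5, 0.25
```

With B equal to A, excess mortality captures only half of the individual-level COVID-19 deaths.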


These are the main conclusions from the analysis:

  1. The sum of individual-level cases where COVID-19 was the direct cause of death is not the same as the population-level causal impact of the virus.
  2. Excess mortality provides a valid estimate of the population-level causal impact.
  3. When measurement of the individual causes of death is imperfect, excess mortality provides an underestimate of the sum of individual cases where COVID-19 was the cause of death.
  4. With perfect measurement of the individual causes of death, excess mortality should be lower than the sum of the individual cases where COVID-19 was the cause of death.


[1] I suspect some will object that the coronavirus and COVID-19 are never the direct causes of death but only provoke other diseases that ultimately kill people. This is irrelevant for the argument: I use ‘COVID-19 as a direct cause of death’ as a shortcut for a death that was caused by COVID-19 provoking some other condition that ultimately kills.

[2] Formally, for people in cell B, COVID-19 is a sufficient but not necessary condition for dying within a certain period. For people in cell A, COVID-19 is both necessary and sufficient. Because of the counterfactual definition of the population-level effect, it only tracks cases where the cause was both necessary and sufficient.

[3] In reality, the models used to predict and estimate the expected mortality are imperfect and incorporate considerable uncertainties. These uncertainties compound the estimation problems discussed in the text, but the problems will exist even if the expected mortality was predicted perfectly.

[4] Extending the analysis to include indirect effects of COVID-19 and the policy responses it led to is interesting and important, but very challenging. There are multiple plausible mechanisms for indirect effects, some of which would act to decrease mortality (e.g. less pollution, fewer traffic accidents, fewer crime-related murders, etc.) and some of which would act to increase mortality (e.g. due to stress, not seeking medical attention on time, postponed medical operations, increases in domestic violence, self-medication gone wrong, etc.). The time horizon of the estimation becomes even more important, as some of these mechanisms need more time to exert their effects (e.g. reduced pollution). Once we admit indirect effects, the calculation of the direct population-level effect of COVID-19 from excess mortality data becomes impossible without some assumptions about the share and net effect of the indirect mechanisms, and the estimation of the sum of individual-level effects becomes even more muddled.

Government positions from party-level Manifesto data (with R)

In empirical research in political science and public policy, we often need estimates of the political positions of governments (cabinets) and the salience of different issues for different governments (cabinets). Data on policy positions and issue salience is available, but typically at the level of political parties. One prominent source of data for issue salience and positions is the Manifesto Corpus, a database of the electoral manifestos of political parties. To ease the aggregation of government positions and salience from party-level Manifesto data, I developed a set of functions in R that accomplish just that, combining the Manifesto data with data on the duration and composition of governments from ParlGov.

To see how the functions work, read this detailed tutorial.

You can access all the functions at the dedicated GitHub repository. And you can contribute to this project by forking the code on GitHub. If you have questions or suggestions, get in touch.


Discretion is Fractal

Last week, I made a presentation at the Leiden University conference ‘Political Legitimacy and the Paradox of Regulation’ under the admittedly esoteric title ‘Discretion is Fractal’. Despite the title, my point is actually quite simple: one cannot continue to model, conceptualize and measure (administrative or legal) discretion as a linear phenomenon, because of the nested structure of legal norms, which exhibits self-similarity at different levels of observation. And, yes, this means that law is fractal, too. In the same way there is no definite answer to the question ‘how long is the coast of Britain’, there can be no answer to the question which legal code provides for more discretion, unless a common yardstick and level of observation is used (which requires an analytic reconstruction of the structure of the legal norms).
The presentation tries to unpack some of the implications of the fractal nature of legal norms and proposes an alternative strategy for measuring discretion. Here is a pdf of the presentation which I hope makes some sense on its own.

In defense of description

John Gerring has a new article in the British Journal of Political Science [ungated here] which attempts to restore description to its rightful place as a respectable occupation for political scientists. Description has indeed been relegated to the sidelines in favour of causal inference during the last 50 years, and Gerring does a great job explaining why this is wrong. But he also points out why description is inherently more difficult than causal analysis:

‘Descriptive inference, by contrast, is centred on a judgment about what is important, substantively speaking, and how to describe it. To describe something is to assert its ultimate value. Not surprisingly, judgments about matters of substantive rationality are usually more contested than judgments about matters of instrumental rationality, and this offers an important clue to the predicament of descriptive inference.’ (p.740)

Required reading.

Weighted variance and weighted coefficient of variation

Often we want to compare the variability of a variable in different contexts – say, the variability of unemployment in different countries over time, or the variability of height in two populations, etc. The most commonly used measures of variability are the variance and the standard deviation (which is just the square root of the variance). However, for some types of data these measures are not entirely appropriate. For example, when data is generated by a Poisson process (e.g. when you have counts of rare events), the mean equals the variance by definition. Clearly, comparing the variability of two Poisson distributions using the variance or the standard deviation would not work if the means of these populations differ. A common and easy fix is to use the coefficient of variation instead, which is simply the standard deviation divided by the mean. So far, so good.
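As a quick illustration of why the coefficient of variation helps (the counts below are made up, with the second sample simply ten times the first): the standard deviations differ by a factor of ten, but the coefficients of variation coincide.

```python
from statistics import mean, stdev

a = [2, 4, 6, 8]      # e.g. monthly counts in a small population
b = [20, 40, 60, 80]  # the same pattern, ten times larger

# The standard deviations differ by a factor of ten...
print(stdev(a), stdev(b))

# ...but the coefficient of variation (sd / mean) is identical
cv_a = stdev(a) / mean(a)
cv_b = stdev(b) / mean(b)
print(cv_a, cv_b)
```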

Things get tricky, however, when we want to calculate the weighted coefficient of variation. The weighted mean is just the mean, but with some data points contributing more than others. For example, the mean of 0.4 and 0.8 is 0.6. If we assign the weight 0.9 to the first observation [0.4] and 0.1 to the second [0.8], the weighted mean is (0.9*0.4+0.1*0.8)/1, which equals 0.44. You would guess that we can compute the weighted variance by analogy, and you would be wrong.
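A quick check of this arithmetic (nothing beyond the numbers in the text):

```python
x = [0.4, 0.8]
w = [0.9, 0.1]

# Weighted mean: each observation contributes in proportion to its weight
weighted_mean = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
print(round(weighted_mean, 2))  # 0.44
```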

For example, the sample variance is given by [Wikipedia]:

s^2 = sum( (x_i - mean(x))^2 ) / (n - 1)

or in our example ((0.4-0.6)^2+(0.8-0.6)^2) / (2-1), which equals 0.08. But the weighted sample variance cannot be computed by simply adding the weights to this formula, as in (0.9*(0.4-0.6)^2+0.1*(0.8-0.6)^2) / (2-1). The formula for the unbiased weighted variance is different [Wikipedia]:

s_w^2 = ( V1 / (V1^2 - V2) ) * sum( w_i * (x_i - m_w)^2 )

where m_w is the weighted mean, V1 is the sum of the weights and V2 is the sum of the squared weights.
The next steps are straightforward: the weighted standard deviation is the square root of the above, and the weighted coefficient of variation is the weighted standard deviation divided by the weighted mean.
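These steps fit in a few lines of code. The sketch below is a Python transcription of the unbiased formula (the function names are mine); the post also links to an R function by Gavin Simpson for the same calculation.

```python
def weighted_variance(x, w):
    """Unbiased weighted sample variance:
    V1 / (V1^2 - V2) * sum of w_i * (x_i - weighted mean)^2,
    where V1 = sum of weights and V2 = sum of squared weights."""
    v1 = sum(w)
    v2 = sum(wi ** 2 for wi in w)
    mu = sum(wi * xi for wi, xi in zip(w, x)) / v1
    return (v1 / (v1 ** 2 - v2)) * sum(wi * (xi - mu) ** 2 for wi, xi in zip(w, x))

def weighted_cv(x, w):
    """Weighted coefficient of variation: weighted sd / weighted mean."""
    mu = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
    return weighted_variance(x, w) ** 0.5 / mu

# With equal weights the formula reduces to the ordinary sample variance
print(round(weighted_variance([0.4, 0.8], [1, 1]), 4))          # 0.08

# With unequal weights the naive plug-in formula no longer applies
print(round(weighted_variance([0.4, 0.8], [0.9, 0.1]), 4))      # 0.08
print(round(weighted_variance([0.38, 0.42], [0.5, 0.5]), 4))    # 0.0008
```

Note that the equal-weights check recovers the sample variance of {0.4, 0.8} computed above.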

Although there is nothing new here, I thought it’s a good idea to put it together because it appears to be causing some confusion. For example, in the latest issue of European Union Politics you can find the article ‘Measuring common standards and equal responsibility-sharing in EU asylum outcome data’ by a team of scientists from LSE. On page 74, you can read that:

The weighted variance [of the set p={0.38, 0.42} with weights W={0.50, 0.50}] equals 0.5(0.38-0.40)^2+0.5(0.42-0.40)^2 = 0.0004.

As explained above, this is not generally correct, unless the biased (population) rather than the unbiased (sample) weighted variance is meant. When calculated properly, the weighted variance turns out to be 0.0008. Here you can find the function Gavin Simpson has provided for calculating the weighted variance in R and try for yourself.

P.S. To be clear, the weighted variance issue is not central to the argument of the article cited above, but it is significant, as the authors discuss at length the methodology for estimating variability in data and introduce the so-called Coffey-Feingold-Broomberg measure of variability, which they deem more appropriate for proportions.

P.P.S. On the internet there is yet more confusion: for example, this document (which pops up high in the Google results) has yet another formula, shown in a slightly different form here as well.

Disclaimer. I have a forthcoming paper on the same topic (asylum policy) as the EUP article mentioned above.