# What are the effects of COVID-19 on mortality? Individual-level causes of death and population-level estimates of causal impact

Introduction

How many people have died from COVID-19? What is the impact of COVID-19 on mortality in a population? Can we use excess mortality to estimate the effects of COVID-19?

In this text I will explain why the answer to the first two questions need not be the same. That is, the sum of cases where COVID-19 has been determined to be the direct[1] cause of death need not be the same as the population-level estimate about the causal impact of COVID-19. When measurement of the individual-level causes of death is imperfect, using excess mortality (observed minus expected) to measure the impact of COVID-19 leads to an underestimate of the number of individual cases where COVID-19 has been the direct cause of death.

Assumptions

The major assumption on which the argument rests is that some of the people who have died from COVID-19 would have died from other causes within a specified, relatively short time-frame (say, within the month). It seems very reasonable to assume that at least some of the victims of COVID-19 would have succumbed to other causes of death. This is especially easy to imagine given that COVID-19 disproportionately kills the very old and that the ultimate causes of death that it provokes – respiratory problems, lung failure, etc. – are shared with other common diseases with high mortality among the older population, such as the flu.

Defining individual and population-level causal effects

With this crucial assumption in mind, we can construct the following simple table. Cell A contains the people who would have survived if they had not caught the Coronavirus, but they caught it and died. Cell B contains the people who caught the Coronavirus and died, but would have died from other causes even if they had not caught the virus[2]. Cell C contains the people who caught the virus and survived and would have survived even if they had not caught the virus. Cell D contains the people who would have died if they had not caught the virus, but they did and survived. Cell C is of no interest for the current argument, and for now we can assume that cases in Cell D are implausible (although this might change if we consider indirect effects of the pandemic and the policy measures it provoked; for now, we ignore such indirect effects). Cell E is people who did not catch the virus and survived (also not interesting for the argument). Cell F is people who did not catch the virus and died from other causes. As a matter of definition, total mortality within a period is A + B + F.

The number of individual-level deaths directly caused by COVID-19 that can be observed is the sum of cells A + B. Without further assumptions and specialized knowledge, we cannot estimate the share of cases that would have died anyways from the total. For now, just assume that this is positive; that is, such cases exist. The population-level causal impact of COVID-19 is A, or, in words, those that have died from COVID-19 minus those that would have died from other causes within the same period. The population-level causal effect is defined counterfactually. Again, without further assumptions about the ratio of B to A, the population-level causal impact of COVID-19 is not identifiable. An important conclusion that we reach is that the population-level causal impact of COVID-19 on mortality does not necessarily sum up to the sum of the individual cases where COVID-19 was the cause of death.

Scenario I: perfect measures of individual-level causes of death

Assume for the moment that all individual cases where COVID-19 was the cause of death are observed and recorded. Under this assumption, what does excess mortality measure? Excess mortality is defined as the difference between the observed (O) and predicted (P) number of deaths within a period, with the prediction (expectation) coming from historical averages, statistical models or anything else[3]. Under our definitions, the observed mortality O in a period contains groups A + B + F. The predicted mortality P contains the people who would have died in the absence of the virus, that is, groups B + F. So the difference between observed O and predicted P gives A, or the number of people that have died from COVID-19, but would have survived otherwise. Therefore, excess mortality identifies the population-level causal impact of COVID-19 (see also the figure below).

One implication of this line of reasoning is that under perfect measurement of individual-level causes of death and a positive number of people who would have died from other causes if they had not died from COVID-19 (cell B), the sum of the cases where COVID-19 was recorded as a cause of death should exceed the excess in observed mortality O – P. (See the situation in France where this might be happening.)
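The bookkeeping behind Scenario I can be made concrete with a small sketch. The counts are invented purely for illustration, and the sketch assumes that the prediction P is perfect, so that it equals the number of people who would have died without the virus (cells B and F).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cells:
    """Hypothetical counts for the cells that matter in the argument."""
    a: int  # caught the virus and died; would have survived otherwise
    b: int  # caught the virus and died; would have died anyway
    f: int  # did not catch the virus; died from other causes

    @property
    def observed(self) -> int:
        # Total mortality in the period: O = A + B + F
        return self.a + self.b + self.f

    @property
    def predicted(self) -> int:
        # Assumption: a perfect prediction counts everyone who would have
        # died without the virus, P = B + F
        return self.b + self.f

    @property
    def excess(self) -> int:
        # Excess mortality O - P, which reduces to A
        return self.observed - self.predicted

    @property
    def direct_covid_deaths(self) -> int:
        # Deaths with COVID-19 as the direct cause: A + B
        return self.a + self.b


cells = Cells(a=700, b=300, f=9000)  # made-up numbers
print("observed O:", cells.observed)
print("predicted P:", cells.predicted)
print("excess O - P:", cells.excess)
print("direct COVID-19 deaths A + B:", cells.direct_covid_deaths)
```

With these made-up counts, the recorded COVID-19 deaths (A + B) exceed the excess mortality (A) by exactly the size of cell B.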

Scenario II: imperfect measures of individual-level causes of death

Let’s now consider a more realistic scenario where determining and recording the individual causes of death is imperfect. Under this assumption, the observed number of deaths in a period still contains O = A + B + F. Excess mortality O – P still identifies the population-level effect A. However, this is not the number of deaths directly caused by COVID-19, which includes those that would have died anyways (B): a category that is already included in the prediction about mortality during this period[4].

In other words, excess mortality underestimates the sum of individual cases where COVID-19 is the direct cause of death. The amount of underestimation depends on how large the share of people who would have died from other causes but died from COVID-19 is. The larger the share, the larger the underestimation. To put it bluntly, COVID-19 kills more people than excess mortality suggests. This is because the expected number of deaths, on which the calculation of excess mortality depends, contains a share of people that would have died from other causes, but were killed by the virus.
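A short sketch of how the size of the underestimation grows with the share of cell B. The function name and all numbers are hypothetical, and the sketch assumes a perfect prediction P = B + F, so that excess mortality equals A.

```python
def direct_deaths_gap(a: float, b: float) -> dict:
    """Compare excess mortality with true direct COVID-19 deaths.

    a: deaths that would not have happened without the virus (cell A)
    b: deaths of people who would have died anyway in the period (cell B)
    Assumes a perfect prediction P = B + F, so excess mortality equals A,
    and requires a + b > 0.
    """
    true_direct = a + b
    excess = a
    return {
        "true_direct": true_direct,
        "excess": excess,
        "missed": true_direct - excess,  # equals B
        "missed_share": (true_direct - excess) / true_direct,
    }


# Hold the true number of direct deaths at 1000 and vary the share of cell B.
for share_b in (0.0, 0.1, 0.3, 0.5):
    b = 1000 * share_b
    result = direct_deaths_gap(a=1000 - b, b=b)
    print(f"share of B = {share_b:.1f}: excess = {result['excess']:.0f}, "
          f"missed = {result['missed']:.0f}")
```

When cell B is empty, excess mortality recovers all direct deaths; as the share of B rises, the number of direct deaths that excess mortality misses rises with it.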

Conclusions

These are the main conclusions from the analysis:

1. The sum of individual-level cases where COVID-19 was the direct cause of death is not the same as the population-level causal impact of the virus.
2. Excess mortality provides a valid estimate of the population-level causal impact.
3. When measurement of the individual causes of death is imperfect, excess mortality provides an underestimate of the sum of individual cases where COVID-19 was the cause of death.
4. With perfect measurement of the individual causes of death, excess mortality should be lower than the sum of the individual cases where COVID-19 was the cause of death.

Notes:

[1] I suspect some will object that the coronavirus and COVID-19 are never the direct causes of death but only provoke other diseases that ultimately kill people. This is irrelevant for the argument: I use ‘COVID-19 as a direct cause of death’ as a shortcut for a death that was caused by COVID-19 provoking some other condition that ultimately kills.

[2] Formally, for people in cell B, COVID-19 is a sufficient but not necessary condition for dying within a certain period. For people in cell A, COVID-19 is both necessary and sufficient. Because of the counterfactual definition of the population-level effect, it only tracks cases where the cause was both necessary and sufficient.

[3] In reality, the models used to predict and estimate the expected mortality are imperfect and incorporate considerable uncertainties. These uncertainties compound the estimation problems discussed in the text, but the problems would exist even if the expected mortality were predicted perfectly.

[4] Extending the analysis to include indirect effects of COVID-19 and the policy responses it led to is interesting and important but very challenging. There are multiple plausible mechanisms for indirect effects, some of which would act to decrease mortality (e.g. less pollution, fewer traffic accidents, fewer crime-related murders, etc.) and some of which would act to increase mortality (e.g. due to stress, not seeking medical attention on time, postponed medical operations, increases in domestic violence, self-medication gone wrong, etc.). The time horizon of the estimation becomes even more important as some of these mechanisms need more time to exert their effects (e.g. reduced pollution). Once we admit indirect effects, the calculation of the direct population-level effect of COVID-19 from excess mortality data becomes impossible without some assumptions about the share and net effect of the indirect mechanisms, and the estimation of the sum of individual-level effects becomes even more muddled.

# The problem with scope conditions

tl;dr: Posing arbitrary scope conditions to causal arguments leads to the same problem as subgroup analysis: the ‘results’ are too often just random noise.

Ingo Rohlfing has a very nice post on the importance of specifying what you mean by ‘context’ when you say that a causal relationship depends on the context. In sum, the argument is that ‘context’ can mean two rather different things: (1) scope conditions, so that the causal relationship might (or might not) work differently in a different context, or (2) moderating variables, so that the causal relationship should work differently in a different context, defined by different values of the moderating variables. So we better be explicit which of these two interpretations we endorse when we write that a causal relationship is context-dependent.

This is an important point. But the argument also exposes the structural similarity between scope conditions and moderating variables. Once we recognize this similarity, it is a small step to discover an even bigger issue lurking in the background: posing arbitrary scope conditions leads to the same problem as arbitrary subgroup analysis; namely, we mistake random noise for real relationships in the data.

The problem with subgroup analysis is well-known: We start with a population in which we find no association between two variables. And then we try different subgroups of the original population until we find one where the association between the two variables is ‘significant’. Even when a ‘real’ relationship between the variables does not exist at all, when we try enough subgroups, sooner or later we will get ‘lucky’ and discover a subgroup for which the relationship will look too strong to be due to chance. But it will be just that. (If you are still not persuaded, see the classic XKCD post below that makes the problem rather obvious.)
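The subgroup problem is easy to reproduce in a small simulation. The sketch below generates pairs of variables that are independent by construction and counts how many subgroups nevertheless show a ‘significant’ correlation. The function names, subgroup sizes, and the use of the Fisher z approximation for the p-value are my choices for illustration.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r via the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_lucky_subgroups(n_subgroups, subgroup_size, alpha=0.05, seed=1):
    """Count subgroups with a 'significant' correlation in pure noise."""
    rng = random.Random(seed)
    lucky = 0
    for _ in range(n_subgroups):
        # x and y are independent by construction: no real relationship
        x = [rng.gauss(0, 1) for _ in range(subgroup_size)]
        y = [rng.gauss(0, 1) for _ in range(subgroup_size)]
        if approx_p_value(pearson_r(x, y), subgroup_size) < alpha:
            lucky += 1
    return lucky


# Chance that at least one of 20 independent subgroups looks significant
print(1 - 0.95 ** 20)
# Number of 'significant' subgroups out of 1000 when nothing is going on
print(count_lucky_subgroups(n_subgroups=1000, subgroup_size=50))
```

With a 5% threshold, roughly one subgroup in twenty will look ‘significant’ by chance alone, so trying twenty independent subgroups gives about a 64% chance of at least one spurious finding.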

How are scope conditions similar? Well, we start with a subgroup of a population for which we find evidence for a strong, systematic relationship between some variables. Next, we try to extend the research to the broader population or to different subgroups, where we find no relationship. Then we conclude that the original relationship is context-dependent and suggest some scope conditions that define the context. But, essentially, we have committed the same mistake as the researcher trying out different subgroups before he or she gets ‘lucky’: it’s only that we have been ‘lucky’ on the first try!

When we find that a relationship holds in group A, but not in group B, a common response is to say that the relationship depends on some background scope conditions that are present in A but not in B. But, it is probably more likely that the original result for group A has been a fluke in the first place. After all, a theory that there is no relationship is more parsimonious than a theory that there is a relationship that is context-dependent (at least when we start from assumptions that not everything is connected to everything else by default).

Of course, in some cases, there will be good reasons to conclude that there are scope conditions to a previously-established association or causal relationship. Similarly, in some cases there are certain subgroups in which a relationship holds, while not in others or in the general population. The point is that failing to find a relationship in a new context should make us more sceptical whether the original finding itself was not just a result of chance. Hence, before, or in parallel to, searching for scope conditions, we should go back to the original study and try to ascertain whether the original finding still holds by collecting additional evidence or interpreting the existing evidence with a more sceptical prior.

The search for scope conditions should also be theory-driven, the same way the selection of subgroups should be driven by theoretical considerations. A scope condition would be more likely to be real, if it has been anticipated by theory and explicitly hypothesized as such before seeing the new data. Otherwise, it is too easy to capitalize on chance and elevate any random difference between groups (countries, time periods, etc.) as a scope condition of a descriptive or causal relationship.

While the problem with subgroup analysis is discussed mostly in statistical research, the problem with scope conditions is even more relevant for qualitative, small-N research than for large-N studies. This is because small-N research often proceeds from a single case study, where some relationships are found, to new cases, where often these relationships are not found, with the conclusion typically being that the originally-discovered relationships are real but context-dependent. That could be the case, but it could also be that there are no systematic relationships in any of these cases at all.

I feel that if qualitative researchers disagree with my diagnosis of the problem with scope conditions, it will be because they often start from very different ontological assumptions about how the social world works. As mentioned above, my analysis holds only if we assume that the multitude of variables characterizing our world are not systematically related, unless we find evidence that they are. But many qualitative researchers seem to assume that everything is connected to everything else, unless we find evidence that it is not. Starting from such a strongly deterministic worldview, posing scope conditions when we fail to extend a result makes more sense. But then so would any subgroup analysis that finds a ‘significant’ relationship, and we seem to agree that this is wrong, at least in the context of statistical work.
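How much the starting assumptions matter can be illustrated with a toy Bayesian calculation. The sketch compares three hypotheses after observing a ‘significant’ result in group A and a null result in group B. The hypothesis labels, the priors, the false-positive rate, and the power are all assumptions chosen for illustration, not estimates from any study.

```python
def posterior_after_failed_extension(priors, alpha=0.05, power=0.8):
    """Posterior over three hypotheses after 'significant in A, null in B'.

    priors: dict with keys 'none', 'everywhere', 'only_in_A' (sum to 1).
    alpha:  false-positive rate of each study.
    power:  chance of detecting a relationship that really exists.
    All numbers are illustrative assumptions, not estimates.
    """
    likelihood = {
        # No relationship anywhere: A was a false positive, B a true negative
        "none": alpha * (1 - alpha),
        # Relationship in both groups: A detected it, B missed it
        "everywhere": power * (1 - power),
        # Scope condition: real in A, absent in B
        "only_in_A": power * (1 - alpha),
    }
    unnormalized = {h: priors[h] * likelihood[h] for h in likelihood}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# A sceptical prior: relationships, and scope conditions especially, are rare
sceptical = {"none": 0.90, "everywhere": 0.08, "only_in_A": 0.02}
# A prior closer to 'everything is connected to everything else'
connected = {"none": 0.20, "everywhere": 0.40, "only_in_A": 0.40}

for name, priors in (("sceptical", sceptical), ("connected", connected)):
    post = posterior_after_failed_extension(priors)
    print(name, {h: round(p, 2) for h, p in post.items()})
```

Under the sceptical prior, ‘no relationship anywhere’ remains the most probable reading of the two results; under the ‘connected’ prior, the scope-condition reading dominates. The same evidence supports different conclusions depending on the ontological starting point.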

To conclude, unless you commit to a strongly deterministic ontology where everything is connected to everything else by default, be careful when posing scope conditions to rationalize a failure to find a previously-established relationship in a different context. Instead, question whether the original result itself still holds. Only then search for more complex explanations that bring in scope conditions or moderating variables.

# Government positions from party-level Manifesto data (with R)

In empirical research in political science and public policy, we often need estimates of the political positions of governments (cabinets) and the salience of different issues for different governments (cabinets). Data on policy positions and issue salience is available, but typically at the level of political parties. One prominent source of data for issue salience and positions is the Manifesto Corpus, a database of the electoral manifestos of political parties. To ease the aggregation of government positions and salience from party-level Manifesto data, I developed a set of functions in R that accomplish just that, combining the Manifesto data with data on the duration and composition of governments from ParlGov.
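To give a flavour of what such an aggregation involves, here is a minimal sketch in Python rather than R. It is not the code from the repository: the function name and the data are hypothetical, and weighting party positions by seat share is one common choice that I assume here for illustration.

```python
def government_position(parties):
    """Seat-weighted mean position of the parties in a cabinet.

    parties: list of (position, seats) tuples for the cabinet parties.
    """
    total_seats = sum(seats for _, seats in parties)
    if total_seats <= 0:
        raise ValueError("cabinet parties must hold at least one seat")
    return sum(position * seats for position, seats in parties) / total_seats


# Hypothetical two-party coalition: (left-right position, seats)
coalition = [(10.0, 60), (-5.0, 30)]
print(government_position(coalition))
```

The larger coalition partner pulls the government position towards its own; other weighting schemes (for example, by ministerial portfolios) would give different results.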

To see how the functions work, read this detailed tutorial.

You can access all the functions at the dedicated GitHub repository. And you can contribute to this project by forking the code on GitHub. If you have questions or suggestions, get in touch.

Enjoy!

# Immigration and voting for the radical right in Andalusia

I wrote a short text for the European Politics and Policy (EUROPP) blog on the link between immigration presence and voting for Vox, a relatively young radical right party, in the Spanish region of Andalusia. Full text is here; see also this post from 2015 about a similar link with Euroscepticism in the UK. The most important graph is below. Here is an excerpt:

To sum up, the available empirical evidence suggests that the relative size of the non-Western foreign-born population at the municipal level is positively, and rather strongly, related to the share of votes cast for Vox, the first Spanish radical right party to get in parliament since the end of Franco’s regime. Immigration might be responsible to a considerable extent for the resurgence of the radical right in Andalusia.

# The political geography of human development

The research I did for the previous post on the inadequacy of the widely-used term ‘Global South’ led me to some surprising results about the political geography of development.

Although the relationship between latitude and human development is not linear, distance from the equator turned out to have a rather strong, although far from deterministic and not necessarily causal, link with a country’s development level, as measured by its Human Development Index (HDI). Even more remarkably, once we include indicators (dummy variables) for islands and landlocked countries, and interactions between these and distance from the equator, we can account for more than 55% of the variance in HDI (2017). In other words, with three simple geographic variables and their interactions we can ‘explain’ more than half of the variation in the level of development of all countries in the world today. Wow! The plot below (pdf) shows these relationships.

In case you are wondering whether this result is driven by many small countries with tiny populations, it is not. When we run a weighted linear regression with population size as the weight, the adjusted R-squared of the model remains (just) above 0.50. On a sidenote, including dummies for (former) communist countries and current European Union (EU) member states pushes the R-squared above 0.60. Communist regime or legacy is associated with significantly lower HDI, net of the geographic variables, and EU membership is associated with significantly higher HDI.
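The structure of the model can be sketched as follows. This is not the R script used for the figures: it is a standard-library Python illustration, the country data are made up, and the function names are hypothetical. It shows the design (distance from the equator, island and landlocked dummies, and their interactions with distance) and how population weights enter the fit.

```python
def design_row(distance, island, landlocked):
    """Intercept, three geographic variables, and the two interactions."""
    return [1.0, distance, island, landlocked,
            distance * island, distance * landlocked]


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def weighted_ols(rows, y, weights=None):
    """Return (coefficients, R-squared) for (weighted) least squares."""
    weights = weights or [1.0] * len(y)
    k = len(rows[0])
    # Normal equations: (X'WX) b = X'Wy
    xtx = [[sum(w * row[i] * row[j] for row, w in zip(rows, weights))
            for j in range(k)] for i in range(k)]
    xty = [sum(w * row[i] * yi for row, yi, w in zip(rows, y, weights))
           for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(coef, row)) for row in rows]
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    ss_res = sum(w * (yi - fi) ** 2 for w, yi, fi in zip(weights, y, fitted))
    ss_tot = sum(w * (yi - mean_y) ** 2 for w, yi in zip(weights, y))
    return coef, 1 - ss_res / ss_tot


# Made-up countries: (distance from equator, island, landlocked, HDI, population)
toy = [
    (5, 0, 0, 0.50, 30), (20, 0, 0, 0.62, 80), (45, 0, 0, 0.85, 60),
    (10, 1, 0, 0.70, 1), (35, 1, 0, 0.88, 5), (55, 1, 0, 0.93, 0.3),
    (8, 0, 1, 0.42, 12), (30, 0, 1, 0.60, 7), (48, 0, 1, 0.90, 9),
]
rows = [design_row(d, i, l) for d, i, l, _, _ in toy]
hdi = [h for _, _, _, h, _ in toy]
population = [p for _, _, _, _, p in toy]

coef, r2 = weighted_ols(rows, hdi)                            # unweighted fit
coef_w, r2_w = weighted_ols(rows, hdi, weights=population)    # population-weighted
print(round(r2, 3), round(r2_w, 3))
```

The numbers printed for the toy data say nothing about real countries; the point is only that the interactions let the slope on distance differ for islands and landlocked countries, and that weighting by population changes how much each country counts in the fit. The sketch reports the plain R-squared, not the adjusted one.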

The next question to consider is whether the relationship between geography and development has grown weaker or stronger over time. There are many plausible ideas we might have about the influence of globalization, the spread of information and communication technologies, wars, and financial crises on the links between geography and development. When we look at the data, however, it turns out that the strength of the link has remained roughly the same since 1990. Wow! Despite all the global social and political transformations over the past 30 years, geography still plays the same, rather large role in constraining and enabling human development. The gif below shows the same plots for 1990, 2000, 2010, and 2017. While overall development grows over time, the relationship with distance from the equator remains roughly the same, as indicated by the slopes of the linear regression lines.

Note that the way the HDI is constructed makes changes in development over time not quite comparable (the index is capped at 1.0, so if you are an already highly developed country, there is not much scope to improve your index further). Also, the sample of countries for which data is available is smaller in 1990 (N=144) than in 2017 (N=191).

Since we mentioned population size, let’s consider the link between the population size of a country and its level of HDI. Are small countries more successful? Does it pay off to be a large state? Maybe countries with populations that are neither too big nor too small perform best?

As the plot below (pdf) shows, there is no clear relationship between population size and HDI. The linear regression line slopes slightly downwards but the ‘effect’ is not significant and it is not really linear. The loess fit meanders up and down without a clear pattern. It turns out there is no sweet spot for population size when it comes to human development. Small populations can be just as good, and just as bad, as bigger ones. There are tiny states that are successful, and ones that do pretty badly. The same goes for mid-sized, big, and enormous countries (not in terms of area, but population).

This lack of relationship is quite remarkable, but there is another surprise when we look at the change in development between 2000 and 2017. As the plot below (pdf) shows, more populous countries have been more successful in improving their HDI over the past 18 years. It is not a huge difference, but given the overall small scale of the observed changes, it is significant and important.

To sum up, while in general population size is not related to development, during the past two decades more populous countries have been more successful in improving their development index. This is of course good news, as it means that more people live longer, study longer, and enjoy higher standards of living.

For now, this concludes my exploits in political geography, which turned out to harbor more insights than I expected, even though I have only explored a total of five variables. If you want to continue from here on your own, the `R` script for the figures is here and the datafile is here.