Category: Measurement

What are the effects of COVID-19 on mortality? Individual-level causes of death and population-level estimates of causal impact

Introduction

How many people have died from COVID-19? What is the impact of COVID-19 on mortality in a population? Can we use excess mortality to estimate the effects of COVID-19? In this text I will explain why the answers to the first two questions need not be the same. That is, the sum of cases where COVID-19 has been determined to be the direct[1] cause of death need not equal the population-level estimate of the causal impact of COVID-19. When measurement of the individual-level causes of death is imperfect, using excess mortality (observed minus expected) to measure the impact of COVID-19 leads to an underestimate of the number of individual cases where COVID-19 has been the direct cause of death.

Assumptions

The major assumption on which the argument rests is that some of the people who have died from COVID-19 would have died from other causes within a specified, relatively short time frame (say, within the month). It seems very reasonable to assume that at least some of the victims of COVID-19 would have succumbed to other causes of death. This is especially easy to imagine given that COVID-19 disproportionately kills the very old, and that the ultimate causes of death it provokes – respiratory problems, lung failure, etc. – are shared with other common diseases with high mortality among the older population, such as the flu.

Defining individual and population-level causal effects

With this crucial assumption in mind, we can construct the following simple table. Cell…
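The arithmetic behind the argument can be sketched with entirely hypothetical numbers (the figures below are illustrative only, not estimates of anything):

```python
# Toy illustration of why excess mortality can understate the number of
# individual deaths directly caused by COVID-19. All numbers are hypothetical.

covid_deaths = 100           # deaths where COVID-19 was the direct cause
would_have_died_anyway = 30  # of those, deaths that would have occurred
                             # anyway (from other causes) within the month
expected_mortality = 1000    # baseline deaths expected absent COVID-19

# Observed deaths: the baseline still occurs (30 of those people simply die
# from COVID-19 instead of another cause), plus the genuinely extra deaths.
observed_mortality = expected_mortality + (covid_deaths - would_have_died_anyway)

excess_mortality = observed_mortality - expected_mortality
print(excess_mortality)  # 70  <- population-level causal impact
print(covid_deaths)      # 100 <- individual-level direct causes of death
```

Under the assumption that some victims would have died anyway, the excess (70) is necessarily smaller than the count of individual COVID-19 deaths (100).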

Government positions from party-level Manifesto data (with R)

In empirical research in political science and public policy, we often need estimates of the political positions of governments (cabinets) and the salience of different issues for different governments (cabinets). Data on policy positions and issue salience is available, but typically at the level of political parties. One prominent source of data for issue salience and positions is the Manifesto Corpus, a database of the electoral manifestos of political parties. To ease the aggregation of government positions and salience from party-level Manifesto data, I developed a set of functions in R that accomplish just that, combining the Manifesto data with data on the duration and composition of governments from ParlGov. To see how the functions work, read this detailed tutorial. You can access all the functions at the dedicated GitHub repository. And you can contribute to this project by forking the code on GitHub. If you have questions or suggestions, get in touch. Enjoy!
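The core of such an aggregation – here sketched in Python with made-up parties, positions, and seat counts rather than the actual Manifesto or ParlGov variables – is a seat-weighted mean of the coalition parties' positions:

```python
# Minimal sketch of aggregating a cabinet-level position from party-level
# data as a seat-weighted mean. Parties, positions, and seat counts are
# hypothetical; the real functions work with Manifesto and ParlGov data.

cabinet = [
    {"party": "A", "position": -0.5, "seats": 120},
    {"party": "B", "position": 0.3, "seats": 40},
]

total_seats = sum(p["seats"] for p in cabinet)
gov_position = sum(p["position"] * p["seats"] for p in cabinet) / total_seats
print(round(gov_position, 3))  # -0.3
```

Issue salience can be aggregated the same way, substituting each party's salience score for its position.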

Discretion is Fractal

Last week, I made a presentation at the Leiden University conference ‘Political Legitimacy and the Paradox of Regulation’ under the admittedly esoteric title ‘Discretion is Fractal’. Despite the title, my point is actually quite simple: one cannot continue to model, conceptualize and measure (administrative or legal) discretion as a linear phenomenon because of the nested structure of legal norms which exhibits self-similarity at different levels of observation. And, yes, this means that law is fractal, too. In the same way there is no definite answer to the question ‘how long is the coast of Britain‘, there can be no answer to the question which legal code provides for more discretion, unless a common yardstick and level of observation is used (which requires an analytic reconstruction of the structure of the legal norms). The presentation tries to unpack some of the implications of the fractal nature of legal norms and proposes an alternative strategy for measuring discretion. Here is a pdf of the presentation which I hope makes some sense on its own.

In defense of description

John Gerring has a new article in the British Journal of Political Science [ungated here] which attempts to restore description to its rightful place as a respectable occupation for political scientists. Description has indeed been relegated to the sidelines at the expense of causal inference during the last 50 years, and Gerring does a great job in explaining why this is wrong. But he also points out why description is inherently more difficult than causal analysis: ‘Descriptive inference, by contrast, is centred on a judgment about what is important, substantively speaking, and how to describe it. To describe something is to assert its ultimate value. Not surprisingly, judgments about matters of substantive rationality are usually more contested than judgments about matters of instrumental rationality, and this offers an important clue to the predicament of descriptive inference.’ (p.740) Required reading.

Weighted variance and weighted coefficient of variation

Often we want to compare the variability of a variable in different contexts – say, the variability of unemployment in different countries over time, or the variability of height in two populations, etc. The most often used measures of variability are the variance and the standard deviation (which is just the square root of the variance). However, for some types of data these measures are not entirely appropriate. For example, when data is generated by a Poisson process (e.g. when you have counts of rare events), the mean equals the variance by definition. Clearly, comparing the variability of two Poisson distributions using the variance or the standard deviation would not work if the means of these populations differ. A common and easy fix is to use the coefficient of variation instead, which is simply the standard deviation divided by the mean. So far, so good. Things get tricky, however, when we want to calculate the weighted coefficient of variation. The weighted mean is just the mean, but some data points contribute more than others. For example, the mean of 0.4 and 0.8 is 0.6. If we assign the weight 0.9 to the first observation [0.4] and 0.1 to the second [0.8], the weighted mean is (0.9*0.4 + 0.1*0.8)/1, which equals 0.44. You would guess that we can compute the weighted variance by analogy, and you would be wrong. For example, the sample variance of {0.4, 0.8} is given by [Wikipedia] s^2 = sum((x_i - mean)^2) / (n - 1), or in our example ((0.4-0.6)^2 + (0.8-0.6)^2) / (2-1), which equals 0.08. But the weighted sample variance cannot be computed by…
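The calculation can be sketched in Python using the numbers from the text. Note a hedge: the bias correction used below, sum(w) - sum(w^2)/sum(w), is the standard one for reliability (normalized) weights; frequency weights call for a different denominator.

```python
# Weighted mean, weighted sample variance (reliability-weights correction),
# and weighted coefficient of variation, using the two observations above.

x = [0.4, 0.8]
w = [0.9, 0.1]

sw = sum(w)
wmean = sum(wi * xi for wi, xi in zip(w, x)) / sw
print(round(wmean, 2))  # 0.44

# Naive analogy to the weighted mean would divide the weighted sum of
# squared deviations by (n - 1); the appropriate denominator for
# reliability weights is instead sum(w) - sum(w^2)/sum(w).
num = sum(wi * (xi - wmean) ** 2 for wi, xi in zip(w, x))
wvar = num / (sw - sum(wi ** 2 for wi in w) / sw)
wsd = wvar ** 0.5

# Weighted coefficient of variation: weighted sd over weighted mean.
wcv = wsd / wmean
print(round(wvar, 4))  # 0.08
```

With equal weights, the formula collapses back to the ordinary sample variance, which is a useful sanity check for any implementation.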