# What are the effects of COVID-19 on mortality? Individual-level causes of death and population-level estimates of causal impact

Introduction

How many people have died from COVID-19? What is the impact of COVID-19 on mortality in a population? Can we use excess mortality to estimate the effects of COVID-19?

In this text I will explain why the answer to the first two questions need not be the same. That is, the sum of cases where COVID-19 has been determined to be the direct[1] cause of death need not be the same as the population-level estimate of the causal impact of COVID-19. When measurement of the individual-level causes of death is imperfect, using excess mortality (observed minus expected) to measure the impact of COVID-19 leads to an underestimate of the number of individual cases where COVID-19 has been the direct cause of death.

Assumptions

The major assumption on which the argument rests is that some of the people who have died from COVID-19 would have died from other causes, within a specified relatively short time-frame (say, within the month). It seems very reasonable to assume that at least some of the victims of COVID-19 would have succumbed to other causes of death. This is especially easy to imagine given that COVID-19 disproportionately kills the very old, and that the ultimate causes of death that it provokes – respiratory problems, lung failure, etc. – are shared with other common diseases with high mortality among the older population, such as the flu.

Defining individual and population-level causal effects

With this crucial assumption in mind, we can construct the following simple table. Cell A contains the people who would have survived if they had not caught the Coronavirus, but they caught it and died. Cell B contains the people who caught the Coronavirus and died, but would have died from other causes even if they had not caught the virus[2]. Cell C contains the people who caught the virus and survived, and would have survived even if they had not caught it. Cell D contains the people who would have died if they had not caught the virus, but they did and survived. Cell C is of no interest for the current argument, and for now we can assume that cases in Cell D are implausible (although this might change if we consider indirect effects of the pandemic and the policy measures it provoked; for now, we ignore such indirect effects). Cell E contains the people who did not catch the virus and survived (also not interesting for the argument). Cell F contains the people who did not catch the virus and died from other causes. As a matter of definition, total mortality within a period is A + B + F.

The number of individual-level deaths directly caused by COVID-19 that can be observed is the sum of cells A + B. Without further assumptions and specialized knowledge, we cannot estimate the share of this total who would have died anyway (cell B). For now, just assume that this share is positive; that is, such cases exist. The population-level causal impact of COVID-19 is A, or, in words, those who have died from COVID-19 minus those who would have died from other causes within the same period. The population-level causal effect is defined counterfactually. Again, without further assumptions about the ratio of B to A, the population-level causal impact of COVID-19 is not identifiable. An important conclusion is that the population-level causal impact of COVID-19 on mortality need not equal the sum of the individual cases where COVID-19 was the cause of death.

Scenario I: perfect measures of individual-level causes of death

Assume for the moment that all individual cases where COVID-19 was the cause of death are observed and recorded. Under this assumption, what does excess mortality measure? Excess mortality is defined as the difference between the observed (O) and predicted (P) number of deaths within a period, with the prediction (expectation) coming from historical averages, statistical models or anything else[3]. Under our definitions, the observed mortality O in a period contains groups A + B + F. So the difference between observed O and predicted P gives A, or the number of people who have died from COVID-19 but would have survived otherwise. Therefore, excess mortality identifies the population-level causal impact of COVID-19 (see also the figure below).

One implication of this line of reasoning is that under perfect measurement of the individual-level causes of death, and with a positive number of people who would have died from other causes if they had not died from COVID-19 (cell B), the sum of the cases where COVID-19 was recorded as a cause of death should exceed the excess in observed mortality O – P. (See the situation in France where this might be happening.)
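The accounting behind this scenario can be sketched with a few lines of code. The numbers below are invented purely for illustration, and the sketch assumes that the prediction P equals exactly the deaths that would have occurred without the virus (B + F):

```python
# Hypothetical cell counts (illustration only, not real data).
A = 1000  # died from COVID-19, would have survived otherwise
B = 300   # died from COVID-19, but would have died from other causes anyway
F = 5000  # did not catch the virus, died from other causes

O = A + B + F  # observed deaths in the period: O = A + B + F
P = B + F      # expected deaths absent COVID-19 (B is part of the baseline)

excess = O - P                 # excess mortality
recorded_covid_deaths = A + B  # perfect individual-level measurement

print(excess)                  # 1000 -> the population-level causal impact (A)
print(recorded_covid_deaths)   # 1300 -> exceeds excess mortality by exactly B
```

The gap between the two printed numbers is cell B, which is why recorded COVID-19 deaths should exceed excess mortality under perfect measurement.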

Scenario II: imperfect measures of individual-level causes of death

Let’s consider now a more realistic scenario in which determining and recording the individual causes of death is imperfect. Under this assumption, the observed number of deaths in a period still contains O = A + B + F. Excess mortality O – P still identifies the population-level effect A. However, this is not the number of deaths directly caused by COVID-19, which also includes those who would have died anyway (B): a category that is already included in the prediction about mortality during this period [4].

In other words, excess mortality underestimates the sum of individual cases where COVID-19 is the direct cause of death. The size of the underestimate depends on the share of people who died from COVID-19 but would have died from other causes anyway: the larger this share, the larger the underestimate. To put it bluntly, COVID-19 kills more people than excess mortality suggests. This is because the expected number of deaths, on which the calculation of excess mortality depends, includes a share of people who would have died from other causes but were killed by the virus.
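The claim that the underestimate grows with the share of cell B can be made explicit in a small sketch. The function name and the shares are hypothetical, chosen only to illustrate the arithmetic:

```python
def underestimate(total_covid_deaths, share_would_have_died_anyway):
    """How far excess mortality falls short of total COVID-19 deaths.

    Excess mortality identifies only A; the shortfall equals B.
    """
    B = total_covid_deaths * share_would_have_died_anyway
    A = total_covid_deaths - B
    excess = A  # excess mortality identifies only the counterfactual deaths
    return total_covid_deaths - excess  # equals B

# The larger the share of people who would have died anyway,
# the larger the gap between COVID-19 deaths and excess mortality.
for share in (0.1, 0.3, 0.5):
    print(share, underestimate(1000, share))
```

The loop prints a shortfall of 100, 300 and 500 deaths respectively, one-for-one with cell B.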

Conclusions

These are the main conclusions from the analysis:

1. The sum of individual-level cases where COVID-19 was the direct cause of death is not the same as the population-level causal impact of the virus.
2. Excess mortality provides a valid estimate of the population-level causal impact.
3. When measurement of the individual causes of death is imperfect, excess mortality provides an underestimate of the sum of individual cases where COVID-19 was the cause of death.
4. With perfect measurement of the individual causes of death, the excess in mortality should be lower than the sum of the individual cases where COVID-19 was the cause of death.

Notes:

[1] I suspect some will object that the coronavirus and COVID-19 are never the direct causes of death but only provoke other diseases that ultimately kill people. This is irrelevant for the argument: I use ‘COVID-19 as a direct cause of death’ as a shortcut for a death that was caused by COVID-19 provoking some other condition that ultimately kills.

[2] Formally, for people in cell B, COVID-19 is a sufficient but not necessary condition for dying within a certain period. For people in cell A, COVID-19 is both necessary and sufficient. Because of the counterfactual definition of the population-level effect, it only tracks cases where the cause was both necessary and sufficient.

[3] In reality, the models used to predict and estimate the expected mortality are imperfect and incorporate considerable uncertainties. These uncertainties compound the estimation problems discussed in the text, but the problems will exist even if the expected mortality was predicted perfectly.

[4] Extending the analysis to include indirect effects of COVID-19 and the policy responses it led to is interesting and important but very challenging. There are multiple plausible mechanisms for indirect effects, some of which would act to decrease mortality (e.g. less pollution, fewer traffic accidents, fewer crime-related murders, etc.) and some of which would act to increase mortality (e.g. due to stress, not seeking medical attention on time, postponed medical operations, increases in domestic violence, self-medication gone wrong, etc.). The time horizon of the estimation becomes even more important as some of these mechanisms need more time to exert their effects (e.g. reduced pollution). Once we admit indirect effects, the calculation of the direct population-level effect of COVID-19 from excess mortality data becomes impossible without some assumptions about the share and net effect of the indirect mechanisms, and the estimation of the sum of individual-level effects becomes even more muddled.

# The problem with scope conditions

tl;dr: Posing arbitrary scope conditions to causal arguments leads to the same problem as subgroup analysis: the ‘results’ are too often just random noise.

Ingo Rohlfing has a very nice post on the importance of specifying what you mean by ‘context’ when you say that a causal relationship depends on the context. In sum, the argument is that ‘context’ can mean two rather different things: (1) scope conditions, so that the causal relationship might (or might not) work differently in a different context, or (2) moderating variables, so that the causal relationship should work differently in a different context, defined by different values of the moderating variables. So we had better be explicit about which of these two interpretations we endorse when we write that a causal relationship is context-dependent.

This is an important point. But the argument also exposes the structural similarity between scope conditions and moderating variables. Once we recognize this similarity, it is a small step to discover an even bigger issue lurking in the background: posing arbitrary scope conditions leads to the same problem as arbitrary subgroup analysis; namely, we mistake random noise for real relationships in the data.

The problem with subgroup analysis is well-known: We start with a population in which we find no association between two variables. And then we try different subgroups of the original population until we find one where the association between the two variables is ‘significant’. Even when a ‘real’ relationship between the variables does not exist at all, when we try enough subgroups, sooner or later we will get ‘lucky’ and discover a subgroup for which the relationship will look too strong to be due to chance. But it will be just that. (If you are still not persuaded, see the classic XKCD post below that makes the problem rather obvious.)
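The mechanics of this problem are easy to reproduce. The following sketch, with invented variable names and made-up numbers, generates two variables with no relationship whatsoever and then tests for an association in twenty arbitrary subgroups; with a 5% significance threshold, we expect roughly one spurious ‘finding’ by chance alone:

```python
import math
import random

random.seed(42)  # fixed seed so the simulation is reproducible

def p_value(group_x, group_y):
    """Two-sided p-value for a difference in means (normal approximation)."""
    nx, ny = len(group_x), len(group_y)
    mx, my = sum(group_x) / nx, sum(group_y) / ny
    vx = sum((v - mx) ** 2 for v in group_x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in group_y) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    return math.erfc(abs(z) / math.sqrt(2))

# A binary 'exposure' and a pure-noise outcome: NO real relationship exists.
n = 2000
exposure = [random.random() < 0.5 for _ in range(n)]
outcome = [random.gauss(0, 1) for _ in range(n)]
# 20 arbitrary subgroups, e.g. defined by an irrelevant covariate.
subgroup = [random.randrange(20) for _ in range(n)]

significant = []
for g in range(20):
    treated = [outcome[i] for i in range(n) if subgroup[i] == g and exposure[i]]
    control = [outcome[i] for i in range(n) if subgroup[i] == g and not exposure[i]]
    if p_value(treated, control) < 0.05:
        significant.append(g)

print(significant)  # the subgroups where noise looks 'significant'
```

Whichever subgroups end up in the list, the ‘relationships’ found there are noise by construction, which is exactly the trap that post-hoc subgroup hunting (and, by analogy, post-hoc scope conditions) walks into.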

How are scope conditions similar? Well, we start with a subgroup of a population for which we find evidence for a strong, systematic relationship between some variables. Next, we try to extend the research to the broader population or to different subgroups, where we find no relationship. Then we conclude that the original relationship is context-dependent and suggest some scope conditions that define the context. But, essentially, we have committed the same mistake as the researcher trying out different subgroups before he or she gets ‘lucky’: it’s only that we have been ‘lucky’ on the first try!

When we find that a relationship holds in group A, but not in group B, a common response is to say that the relationship depends on some background scope conditions that are present in A but not in B. But it is probably more likely that the original result for group A was a fluke in the first place. After all, a theory that there is no relationship is more parsimonious than a theory that there is a relationship that is context-dependent (at least when we start from the assumption that not everything is connected to everything else by default).

Of course, in some cases there will be good reasons to conclude that there are scope conditions to a previously-established association or causal relationship. Similarly, in some cases there are certain subgroups in which a relationship holds, while it does not hold in others or in the general population. The point is that failing to find a relationship in a new context should make us more sceptical about whether the original finding itself was just a result of chance. Hence, before, or in parallel to, searching for scope conditions, we should go back to the original study and try to ascertain whether the original finding still holds, by collecting additional evidence or interpreting the existing evidence with a more sceptical prior.

The search for scope conditions should also be theory-driven, the same way the selection of subgroups should be driven by theoretical considerations. A scope condition would be more likely to be real, if it has been anticipated by theory and explicitly hypothesized as such before seeing the new data. Otherwise, it is too easy to capitalize on chance and elevate any random difference between groups (countries, time periods, etc.) as a scope condition of a descriptive or causal relationship.

While the problem with subgroup analysis is discussed mostly in statistical research, the problem with scope conditions is even more relevant for qualitative, small-N research than for large-N studies. This is because small-N research often proceeds from a single case study, where some relationships are found, to new cases, where often these relationships are not found, with the conclusion typically being that the originally-discovered relationships are real but context-dependent. That could be the case, but it could also be that there are no systematic relationships in any of these cases at all.

I feel that if qualitative researchers disagree with my diagnosis of the problem with scope conditions, it will be because they often start from very different ontological assumptions about how the social world works. As mentioned above, my analysis holds only if we assume that the multitude of variables characterizing our world are not systematically related, unless we find evidence that they are. But many qualitative researchers seem to assume that everything is connected to everything else, unless we find evidence that it is not. Starting from such a strongly deterministic worldview, posing scope conditions when we fail to extend a result makes more sense. But then so would any subgroup analysis that finds a ‘significant’ relationship, and we seem to agree that this is wrong, at least in the context of statistical work.

To conclude, unless you commit to a strongly deterministic ontology where everything is connected to everything else by default, be careful when posing scope conditions to rationalize a failure to find a previously-established relationship in a different context. Instead, question whether the original result itself still holds. Only then search for more complex explanations that bring in scope conditions or moderating variables.

# More on QCA solution types and causal analysis

Following up my post on QCA solution types and their appropriateness for causal analysis, Eva Thomann was kind enough to provide a reply. I am posting it here in its entirety:

## Why I still don’t prefer parsimonious solutions (Eva Thomann)

Thank you very much, Dimiter, for issuing this blog debate and inviting me to reply. In your blog post, you outline why, absent counterevidence, you find it justified to reject applied Qualitative Comparative Analysis (QCA) paper submissions that do not use the parsimonious solution. I think I agree with some but not all of your points. Let me start by clarifying a few things.

It’s good to see that we all seem to agree that “no single criterion in isolation should be used to reject manuscripts during anonymous peer review”. The reviewer practice addressed in the COMPASSS statement is a bad practice. Highlighting this bad reviewer practice is the sole purpose of this statement. Conversely, the COMPASSS statement does not take sides when it comes to preferring specific solution types over others. The statement also does not imply anything about the frequency of this reviewer practice – this part of your post is pure speculation. Personally, I have heard people complaining about getting papers rejected for promoting or using conservative (QCA-CS), intermediate (QCA-IS) and parsimonious solutions (QCA-PS) with about the same frequency. But it is of course impossible for COMPASSS to get a representative picture of this phenomenon.

The term “empirically valid” refers to the, to my best knowledge, entirely undisputed fact that all solution types are (at least) based on the information contained in the empirical data. The disputed question is how we can or should go “beyond the facts” in causally valid ways when deriving QCA solutions.

Having said this, I will take off my “hat” as a member of the COMPASSS steering committee and contribute a few points to this debate. These points represent my own personal view and not that of COMPASSS or any of its bodies. I write as someone who uses QCA sometimes in her research and teaches it, too. Since I am not a methodologist, I won’t talk about fundamental issues of ontology and causality. I hope others will jump in on that.

Point of clarification 2: There is no point in personalizing this debate

In your comment you frequently refer to “the COMPASSS people”. But I find that pointless: COMPASSS hosts a broad variety of methodologists, users, practitioners, developers and teachers with different viewpoints and of different “colours and shapes”, some closer to “case-based” research, others closer to statistical/analytical research. Amongst others, Michael Baumgartner, whom you mention, is himself a member of the advisory board, and he has had methodological debates with his co-authors as well. Just because we can procedurally agree on a bad reviewer practice, it neither means we substantively agree on everything, nor does it imply that we disagree. History has amply shown how unproductive it can be for scientific progress when debates like these become personalized. Thus, if I could make a wish to you and everyone else engaging in this debate, it would be to talk about arguments rather than specific people. In what follows I will therefore refer to different approaches instead, except when referring to specific scholarly publications.

Point of clarification 3: There is more than one perspective on the validity of different solutions

As to your earlier point, which you essentially repeat here, that “but if two solutions produce different causal recipes, e.g. (1) AB -> E and (2) ABC -> E, it cannot be that both (1) and (2) are valid”, my answer is: it depends on what you mean by “valid”.

It is common to look at QCA results as subset relations, here: statements of sufficiency. In a paper that is forthcoming in Sociological Methods & Research, Martino Maggetti and I call this the “approach emphasizing substantive interpretability”. From this perspective, the forward arrow “->” reads “is sufficient for”, and 1) in fact implies 2). Sufficiency means that X (here: AB) is a subset of Y (here: E). ABC is a subset of AB and hence it is also a subset of E, if AB is a subset of E. Logic dictates that any subset of a sufficient condition is also sufficient. Both are valid – they describe the sufficiency patterns in the data (and sometimes, some remainders) with different degrees of complexity.
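The set-relational point can be checked mechanically. Here is a toy sketch with invented case identifiers, where sufficiency is modelled as the subset relation described above:

```python
# Toy illustration: any subset of a sufficient condition is itself sufficient.
# The sets contain hypothetical case identifiers, invented for this example.
E = {"c1", "c2", "c3", "c4"}   # cases exhibiting the outcome E
AB = {"c1", "c2", "c3"}        # cases where both A and B hold
ABC = {"c1", "c2"}             # cases where A, B and C all hold

def sufficient(x, y):
    """X is sufficient for Y in the set-relational sense: X is a subset of Y."""
    return x <= y

assert ABC <= AB            # adding a condition can only shrink the set
assert sufficient(AB, E)    # (1) AB -> E holds in this data
assert sufficient(ABC, E)   # so (2) ABC -> E follows automatically
print("1) implies 2)")
```

This is just the subset logic of the “substantive interpretability” reading; it says nothing by itself about causal relevance, which is exactly where the redundancy-free approach diverges.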

Scholars promoting an “approach emphasizing redundancy-free models” agree with that, if we speak of mere (monotonic) subset relations. Yet they require QCA solutions to be minimal statements of causal relevance. From this perspective, the arrow (it then is <->, see below) reads “is causally relevant for”, and if 1) is true then 2) cannot be true: 2) additionally grants causal relevance to C, but in 1) we said only AB are causally relevant. As a causal statement, we can think of 2) as claiming more than 1).

To proponents of the approach emphasizing substantive interpretability (and I am one of them), it all boils down to the question:

“Can something be incorrect that follows logically and inevitably from a correct statement?”

Their brains shout:

“No, of course it can’t!”

I am making an informed guess here: this fact is so blatantly obvious to most people well-versed in set theory that it does not require a formal reply.

For everyone else, it is important to understand that in order to follow the reasoning you are proposing in your comment, you have to buy into a whole set of assumptions that underlie the method promoted in the publication you are referring to (Baumgartner 2015), called Coincidence Analysis or CNA. Let me illustrate this.

Point of clarification 4: QCA is not CNA

In fact, one cannot accept 2) if 1) is true in the special case when the condition “AB” is both minimally sufficient and contained in a minimally necessary condition for an outcome – which is also the situation you refer to (in your point 3). We have to replace the forward arrow “->” with “<->”. In such a situation, the X set and the Y set are equivalent. Of course, if AB and E are equivalent, then ABC and E are not equivalent at the same time. In reality, this – simultaneous necessity and sufficiency – is a rare scenario that requires a solution to be maximally parsimonious and to have both a high consistency (indicating sufficiency) AND a very high coverage (indicating necessity).

But QCA – as opposed to CNA – is designed to assess necessary conditions and/or sufficient conditions. They don’t have to be both. As soon as we are speaking of a condition that is sufficient but not necessary (or not part of a necessary condition), then, if 1) is correct, 2) also has to be correct. You are acknowledging this when saying that “if A is sufficient for E, AB is also sufficient, for any arbitrary B”.

I will leave it to the methodologists to clarify whether it is ontologically desirable to empirically analyse sufficient but not necessary (or necessary but not sufficient) conditions. As a political scientist, I find it theoretically and empirically interesting. I believe this is in the tradition of much comparative political research. It is clear, and you seem to agree, that what we find to be correct entirely depends on how we define “correct” – there’s a danger of circularity here. At this point, it has to be pointed out that CNA is not QCA. Both are innovative, elegant and intriguing methods with their own pros and cons. I am personally quite fascinated by CNA and would like to see more applications of it, but I am not convinced that we can or need to transfer its assumptions to QCA.

What I like about the recent publications advocating an approach emphasizing redundancy-free models is that they highlight that not all conditions contained in QCA solutions may be causally interpretable, if only we knew the true data-generating process (DGP). That points to the general question of causal arguments made with QCA if there is limited diversity, which has received ample scholarly attention for already quite a while.

Point of agreement 1: We need a cumulative process of rational critique

You argue that “the point about non-parsimonious solutions deriving faulty causal inferences seems settled, at least until there is a published response that rebukes it”. But QCA scholars have long highlighted issues of implausible and untenable counterfactuals entailed in parsimonious solutions (e.g. here, here, here, here, here, here and here). None of the published articles advocating redundancy-free models has so far made concrete attempts to rebuke these arguments. Following your line of reasoning, the points made by previous scholarship about parsimonious solutions deriving faulty causal inferences equally seems settled, at least until there is a published response that rebukes these points.

Indeed, advocates of redundancy-free models seem to either dismiss the relevance of counterfactuals altogether because CNA, so it is argued, does not rely on counterfactuals to derive solutions; or they argue that in the presence of limited diversity all solutions rely on counterfactuals. (Wouldn’t it be contradictory to argue both?) I personally would agree with the latter point. There can be no doubt that QCA (as opposed, perhaps, to CNA) is a set-theoretic, truth-table-based method that, in the presence of limited diversity, involves counterfactuals. Newer algorithms (such as eQMC, used in the QCA package for R) no longer actively “rely on” remainders for minimization, and they exclude difficult and untenable counterfactuals rather than including tenable and “easy” counterfactuals. But the reason why QCA involves counterfactuals remains that intermediate and parsimonious QCA solutions involve configurations of conditions some of which are empirically observed, while others (the counterfactuals) are not. There can be only one conclusion: the question of whether these counterfactuals are valid requires our keen attention.

Where does that leave us? To me, all that certainly does not mean that “the reliance on counterfactuals cannot be used to arbitrate this debate”. It means that different scholars have highlighted different issues relating to the validity of all solution types. None of these points has been conclusively rebuked so far. That, of course, leaves users in an intricate situation. They should not be punished for consistently and correctly following protocols proposed by methodologists of one or another approach.

Point of agreement 2: In the presence of limited diversity, QCA solutions can err in different directions

Parsimonious solutions are by no means unaffected by the problem that limited empirical diversity challenges our confidence in inferences. Indeed, we should be careful not to overlook that they err, too. As Ingo Rohlfing has pointed out, the question in which direction we want to err is a different question from the one about which solution is correct. The answer to the former question probably depends.

Let us return to the above example and assume that we have a truth table characterized by limited diversity. We get a conservative solution

(CS) ABC -> E,

and a parsimonious solution

(PS) A -> E.

Let us further assume that we know (which in reality we never do) that the true DGP is

(DGP) AB -> E.

Neither CS nor PS gives us the true DGP. To recap: to scholars emphasizing redundancy-free models, PS is “correct” because they define as “correct” a solution that does not contain causally irrelevant conditions. But note that PS here is also incomplete: the true result in this example is that, in order to observe the outcome E, A alone is not enough; it has to combine with B. Claiming that A alone is enough involves a counterfactual that could well be untenable. But the evidence alone does not allow us to conclude that B is irrelevant for E. It is usually only by making this type of oversimplification that parsimonious solutions reach the high coverage values required to be “causally interpretable” under an approach emphasizing redundancy-free models.
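This example can be made concrete in a short sketch. The observed rows below are invented, and the DGP is assumed to be E = A AND B as in the text; the point is that under limited diversity (rows with A present but B absent are never observed), A alone looks sufficient in the data even though the parsimonious claim A -> E fails in the full logical space:

```python
from itertools import product

# Assumed true data-generating process (known here only for illustration).
def dgp(a, b, c):
    return a and b  # E occurs exactly when A AND B hold

# Observed rows (limited diversity): the combination A=1, B=0 never appears.
observed = [(1, 1, 1), (1, 1, 0), (0, 1, 1), (0, 0, 0)]
table = {row: dgp(*row) for row in observed}

# Within the observed data, A alone looks sufficient for E ...
a_looks_sufficient = all(e for (a, b, c), e in table.items() if a)
print(a_looks_sufficient)  # True

# ... but over the full logical space the parsimonious claim A -> E fails:
counterexamples = [(a, b, c) for a, b, c in product((0, 1), repeat=3)
                   if a and not dgp(a, b, c)]
print(counterexamples)  # the unobserved rows with A=1, B=0 and no outcome
```

The counterexamples are exactly the unobserved configurations, which is why the parsimonious solution’s claim rests on counterfactuals about rows the data never shows.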

To anyone with some basic training in QCA, this should raise some serious questions: isn’t one of the core assumptions of QCA that we cannot interpret the single conditions in its results in isolation, because they unfold their effect only in combination with other conditions? How, then, does QCA-PS fare when assessed against this assumption? I have not read a conclusive answer to this question yet.

Baumgartner and Thiem (2017) point out that with imperfect data, no method can be expected to deliver complete results. That may well be, but in QCA we deal with two types of completeness: complete AND-configurations, and the inclusion of all substitutable paths or “causal recipes” combined by the logical OR. In order to interpret a QCA solution as a sufficient condition, I want to be reasonably sure that the respective AND-configuration in fact reliably implies the outcome (even if it omits other configurations that may not have been observed in my data). Using this criterion, QCA-PS arguably fares worst (it most often misses out on causally relevant factors) and QCA-CS fares best (though it most often also still includes causally irrelevant factors).

To be sure, QCA-PS is sufficient for the outcome in the dataset under question. But I am unsure how I have to read it: “either X implies Y, or I did not observe X”? Or “X is causally relevant for Y in the data under question, but I don’t know if it suffices on its own”? There may well be specific situations in which all we want to know is whether some conditions are causally relevant subsets of sufficient conditions or not. But I find it misleading to claim that this is the only legitimate or even the main research interest of studies using QCA. I can think of many situations, such as public health crises or enforcing EU law, in which reliably achieving or preventing an outcome would have priority.

Let me be clear. The problem we are talking about is really neither QCA nor some solution type. The elephant in the room is essentially that observational data are rarely perfect and do not obey the laws of logic. But is QCA-PS really the best, or the only, or at all a way out of this problem?

Point of agreement 3: There are promising and less promising strategies for causal assessment

The technical moment of QCA shares with statistical techniques that it is simply a cross-case comparison of data-set observations. As such, of course it also shares with other methods the limited possibility for directly deriving causal inferences from observational data. Most QCA scholars would therefore be very cautious to interpret QCA results causally when using observational data and in the presence of limited diversity. Obviously, set relation does not equal causation. How then, could a specific minimization algorithm alone plausibly facilitate causal interpretability?

QCA (as opposed to CNA) was always designed to be a multi-method approach. This means that the inferences of the cross-case comparison are not just interpreted as such, but strengthened and complemented with additional insights, usually theoretical, conceptual and case knowledge. Or, as Ragin (2008: 173) puts it:

“Social research (…) is built upon a foundation of substantive and theoretical knowledge, not just methodological technique”.

This way, we can combine the advantages offered by different methods and sources. Used in a formalized way, the combination of QCA with process tracing can even help to disentangle causally relevant from causally irrelevant conditions. This, of course, does not preclude the possibility that some solution types may lend themselves more to causal interpretation than others. It does suggest, though, that focusing on specific solution types alone is an ill-suited strategy for making valid causal assessments.

Point of disagreement: Nobody assumes that “everything matters”

Allow me to disagree that an approach emphasizing substantive interpretability assumes “everything is relevant”. Of course that is nonsense. As with any other social science method I know, the researcher first screens the literature and the field in order to identify potentially relevant explanatory factors. The logic of truth table analysis (as opposed to CNA?) is then to start out with the subset of these previously identified conditions that themselves consistently are a subset of the outcome set, and then to search for evidence that they are irrelevant. This is not even an assumption, and it is very far from being “everything”.

In my view it makes sense to have a division of labour: users follow protocols, methodologists foster methodological innovation and progress. I hope the above has made it clear that we are in the midst of a, in my view, welcome and needed debate about what “correctness” and “validity” mean in the QCA context. I find it useful to think of this as a diversity of approaches to QCA. It is important that researchers reflect about the ontology that underlies their work, but we should avoid making premature conclusions as well.

Currently (but I may be proven wrong) I am thinking that each solution type has its merits and limitations. We can’t eliminate limited diversity, but we can use different solution types for different purposes. For example, if policymakers seek to avoid investing public money in potentially irrelevant measures, the parsimonious solution could be best. If they are interested in creating situations that are certain to produce an outcome (e.g. disease prevention), then the conservative solution is best and the parsimonious solution very risky. If we have strong theoretical knowledge or prior evidence available for counterfactual reasoning, intermediate solutions are best. And so on. From this perspective, it is good that we can refer to different solution types with QCA. It forces researchers to think consciously about what the goal of their analysis is, and how it can be adequately reached. It prevents them from just mechanically running some algorithm on their data.

All of the above is why I agree with the COMPASSS statement that …

“the current state of the art is characterized by discussions between leading methodologists about these questions, rather than by definitive and conclusive answers. It is therefore premature to conclude that one solution type can generally be accepted or rejected as “correct”, as opposed to other solution types”.

# QCA solution types and causal analysis

Qualitative Comparative Analysis (QCA) is a relatively young research methodology that has frequently come under attack from all corners, often for the wrong reasons. But there is a significant controversy brewing within the community of people using set-theoretic methods (of which QCA is one example) as well.

Recently, COMPASSS – a prominent network of scholars interested in QCA – issued a Statement on Rejecting Article Submissions because of QCA Solution Type. In this statement they ‘express the concern … about the practice of some anonymous reviewers to reject manuscripts during peer review for the sole, or primary, reason that the given study chooses one solution type over another’. The ‘solution type’ refers to the procedure used to minimize the ‘truth tables’ which collect the empirical data in QCA (and other set-theoretic) research when there are unobserved combinations of conditions (factors, variables) in the data. Essentially, when there are unobserved combinations (which is practically always the case), the solution type, together with the minimization algorithm, determines the inference you draw from the data.

I have not been involved in drawing up the statement (and I am not a member of COMPASSS), and I have not reviewed any articles using QCA recently, so I am not directly involved in this controversy on either side. At the same time, I have been interested in QCA and related methodologies for a while now, I have covered their basics in my textbook on research design, and I remain intrigued both by their promise and their limitations. So I feel like sharing my thoughts on the matter, even if others might have much more experience with QCA.

(1a) First, let me say that no matter what one thinks about the appropriateness of a solution type, no single criterion in isolation should be used to reject manuscripts during anonymous peer review. The reviewer’s recommendation should reflect not only the methodology used, but the original research goal and the types of inferences being made. I can only assume that for COMPASSS to issue such a statement, the problem has been one of systematic rejection due to this one single reason. This is worrisome because the peer review process does not offer possibilities for response, let alone for debate of methodological issues.

(1b) At the same time, if the method that is used is not appropriate for the research goal and does not support the inferences advanced in the manuscript, then rejection is warranted and no further justification is needed.

(2a) So it all depends on whether a single solution type should be used in all QCA analyses. In principle, the answer to this question is ‘No’. There are three main types of solutions (parsimonious, complex, and intermediate), and each can be appropriate in different circumstances.

(2b) However, when it comes to causal analysis, my answer is ‘Yes. Only the parsimonious solution should be used to make causal inferences.’ My answer is based on Michael Baumgartner’s analysis (see the 2015 version here), and I will explain why I find it persuasive below. So, if manuscripts make causal claims based on non-parsimonious solution types, I would see that as sufficient grounds for rejection (or rather for revision), unless the authors explicitly subscribe to a very peculiar social ontology in which everything has causal relevance for an outcome unless we have evidence to the contrary (I will explain this below). In my view, the standard ontology is that no factor has causal relevance for an outcome unless we have evidence that it does.

To sum up so far, the COMPASSS people might be right in general, but for the important class of causal analyses they are wrong.

(3) Why should only the parsimonious solution be used to make causal claims? In short, because the relations of necessity and sufficiency are monotonic (so that if A is sufficient for E, AB is also sufficient, for any arbitrary B). Imagine a causal structure in which the presence of A is necessary and sufficient for the presence of E and B is irrelevant. Further imagine that we only have two empirical observations {ABE} and {aBe} (small letters denote the absence of the condition/outcome). This data is incomplete, as we have no information on what happens under the logically possible configurations {Ab} and {ab} (these would be logical remainders in the truth table), so we have to use some further rules (i.e. a solution type) to derive a formula. The complex solution is AB->E (the presence of both A and B is necessary and sufficient for the outcome E to occur). This solution type assumes that we cannot ignore B: since we have no data on what happens when it is absent, it is prudent to assume that B matters and keep it in the resulting formula (the causal recipe). However, this formula and the conclusion it leads to are wrong, because we posited above that B is irrelevant. The parsimonious solution is A->E (the presence of A is necessary and sufficient for the presence of E). This solution eliminates B on the assumption that it does not matter, since this yields a more parsimonious formula. This is the correct inference in our example.
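The difference between the two solution types can be made concrete with a small sketch. This is not the actual Quine-McCluskey algorithm or the API of any QCA software package, just a brute-force illustration of how the two solutions treat the logical remainders in the example above:

```python
# The example's truth table: rows are (A, B) configurations.
positive   = {(1, 1)}            # observed: A=1, B=1, E occurred  ({ABE})
negative   = {(0, 1)}            # observed: A=0, B=1, E absent    ({aBe})
remainders = {(1, 0), (0, 0)}    # unobserved configurations {Ab} and {ab}

NAMES = ('A', 'B')

def covers(implicant, row):
    """True if the partial assignment (e.g. {'A': 1}) matches the row."""
    return all(row[NAMES.index(k)] == v for k, v in implicant.items())

def solution(treat_remainders_as_dont_care):
    # Candidate implicants, fewest literals (most parsimonious) first.
    candidates = [{'A': 1}, {'B': 1}, {'A': 1, 'B': 1}]
    for imp in candidates:
        if any(covers(imp, r) for r in negative):
            continue  # an implicant must never cover a negative case
        if not treat_remainders_as_dont_care and \
           any(covers(imp, r) for r in remainders):
            continue  # the complex solution refuses to use remainders
        if all(covers(imp, p) for p in positive):
            return imp

print(solution(True))   # parsimonious: {'A': 1}         i.e. A -> E
print(solution(False))  # complex:      {'A': 1, 'B': 1} i.e. AB -> E
```

The only difference between the two calls is whether the unobserved remainders may be covered by an implicant; that single choice flips the conclusion about the causal relevance of B.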

Our example is simple but in no way contrived (Baumgartner has other, more complex examples in the paper). In fact, we can add any number of factors to A and as long as they do not vary across our cases they will appear to be components of the outcome formula (causal recipe). That is, any random factor can be made to appear as causally relevant in the presence of limited diversity in the cases being studied. In the limit, we can make every aspect of a case appear to be causally relevant for an outcome, if we do not have cases combining factors in a way that makes it possible to disprove their illusory relevance.

One could say that this caution is only fair. But it would only be appropriate in a world where everything matters for everything else unless some empirical cases point to the opposite. Such a worldview (ontology) is rare among social scientists, and I have never seen it openly endorsed. Note that the problem is not that we imply a separate, independent effect of B on E: worse, the solution implies that B must be present for the effect of A to obtain.

To sum up so far, only the parsimonious solution type can provide causal inference from QCA data under a standard social ontology, because of the monotonicity of the relationships of necessity and sufficiency.

(4) So what is the response of the COMPASSS people to this analysis? In fact, I do not know. To the best of my knowledge, there has been no published response or critique to Michael Baumgartner’s article. In the Statement, the following arguments are given:

(a) ‘The field of QCA and set-theoretic methods is not quite standardized.’ ‘The field is currently witnessing an ongoing and welcome methodological debate about the correctness of different solution types.’ ‘the current state of the art is characterized by discussions between leading methodologists about these questions’.

All this might be the case, but the point about non-parsimonious solutions deriving faulty causal inferences seems settled, at least until there is a published response that rebuts it. There might be debate, but I have not seen a published response to Baumgartner’s analysis or any other persuasive argument why he is wrong on this point in particular.

(b) ‘users applying [QCA and other set-theoretic] methods who refer to established protocols of good practice must not be made responsible for the fact that, currently, several protocols are being promoted’. Well, users cannot be made responsible, but if the protocols they follow are faulty, their manuscripts cannot be accepted as the analyses would be wrong.

(5) So despite offering no reasons why non-parsimonious solutions are appropriate for causal analysis contra Baumgartner, COMPASSS’ statement finishes with ‘all solutions are empirically valid’.

I am not sure what this means. All solutions cannot be empirically valid, as they can point to contradictory conclusions: either A->E or AB->E; either B is causally relevant or it is not. Technically, any solution might be valid in light of a set of background assumptions, research goals and analytic procedures (in the sense that both 2+2=4 and 2+2=5 are valid under some assumptions). But that’s the crux of the matter: if one has causal goals and uses a non-parsimonious solution, then the solution is only valid if one assumes that in the social world everything causes and conditions everything else unless proven otherwise.

To conclude, if a group of researchers have been systematically sabotaging the work of other scholars for the sole fact of using a certain solution type, that’s bad. But if they have been rejecting manuscripts that have used non-parsimonious solutions to derive causal inferences without clear commitments to an ‘everything matters’ worldview, that seems OK to me, in light of the (published) methodological state of the art.

P.S. The issue of counterfactuals enters this debate quite often.
(a) But in his 2015 analysis Baumgartner does not invoke his/a regularity theory of causality. All he needs for the analysis is a notion of a cause as a difference-maker, which in my understanding is compatible with a counterfactual understanding of causality. So any rejection of his argument against non-parsimonious solutions cannot be derived from differences between regularity and counterfactual notions of causality.
(b) Baumgartner notes that the parsimonious solution sometimes requires one to make counterfactuals about impossible states of the world. With this critique he motivates abandoning the Quine-McCluskey Boolean minimization procedure (in the framework of which one must choose the parsimonious, complex, or intermediate solution type) altogether and adopting the coincidence analysis framework, which has the parsimonious solution ‘built in’ to its algorithms. But this is not a critique against counterfactuals as such.
(c) At the same time, the complex solution also relies on assumption-based counterfactuals, namely that a factor matters unless shown otherwise. So the reliance on counterfactuals cannot be used to arbitrate this debate.

For further discussion of these issues, see Thiem and Baumgartner 2016, Ingo Rohlfing’s blog post (with a response in the comments by Michael Baumgartner), Schneider 2016, and the Standards of Good Practice in QCA.

[addendum 31/08/2017] Michael Baumgartner and Alrik Thiem have published a reply to the COMPASSS statement in which they write: ‘We endorse the prerogative of journal editors and reviewers to favor rejection if they come to the conclusion that a manuscript does not merit publication because of its choice of an unsuitable solution type.’ And they urge for more debate.

# Explanation and the quest for ‘significant’ relationships. Part II

In Part I, I argued that the search for and discovery of statistically significant relationships does not amount to explanation, and is often misplaced in the social sciences because the variables which are purported to have effects on the outcome cannot be manipulated.

Just to make sure that my message is not misinterpreted – I am not arguing for a fixation on maximizing R-squared and other measures of model fit in statistical work, instead of the current focus on the size and significance of individual coefficients. R-squared has been rightly criticized as a standard of how good a model is** (see for example here). But I am not aware of any other measure or standard that can convincingly compare the explanatory potential of different models in different contexts. Predictive success might be one way to go, but prediction is something altogether different from explanation.

I don’t expect much to change in the future with regard to the problem I outlined. In practice, all one could hope for is some clarity on the part of researchers about whether their objective is to explain (account for) an outcome or to find significant effects. The standards for evaluating progress towards the former objective (model fit, predictive success, ‘coverage’ in the QCA sense) should be different from the standards for the latter (statistical and practical significance, and the practical possibility to manipulate the exogenous variables).

Take the so-called garbage-can regressions, for example. These are models with tens of variables, all of which are interpreted causally if they reach the magic 5% significance level. The futility of this approach is matched only by its popularity in political science and public administration research. If the research objective is to explore a causal relationship, one had better focus on that variable and include covariates only if they are suspected to be correlated with both the outcome and the main independent variable of interest. Including everything else that happens to be within easy reach not only leads to inefficiency in the estimation; the significance of these covariates should not be interpreted causally at all. On the other hand, if the objective is to comprehensively explain (account for) a certain phenomenon, then including as many variables as possible might be warranted, but then the significance of individual variables is of little interest.
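A quick simulation illustrates the danger. All numbers here are illustrative assumptions: we regress a pure-noise outcome on 50 pure-noise covariates and count how many clear the 5% threshold by chance alone:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k, reps = 1000, 50, 100     # observations, noise covariates, replications
false_positives = 0

for _ in range(reps):
    X = rng.standard_normal((n, k))
    y = rng.standard_normal(n)                 # unrelated to every covariate
    X1 = np.column_stack([np.ones(n), X])      # add an intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sigma2 = resid @ resid / (n - k - 1)       # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X1.T @ X1)))
    t = beta[1:] / se[1:]                      # t-stats, intercept dropped
    false_positives += int(np.sum(np.abs(t) > 1.96))

rate = false_positives / (reps * k)
print(f"share of noise covariates 'significant' at 5%: {rate:.3f}")
```

By construction, roughly one in twenty of these entirely irrelevant covariates comes out “significant” in any given run; with fifty variables in the model, a garbage-can regression is almost guaranteed a few spuriously significant coefficients to interpret.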

The goal of research is important when choosing the research design and the analytic approach. Different standards apply to explanation, the discovery of causal effects, and prediction.

**Just one small example from my current work: a model with one dependent and one exogenous time-series variable in levels, with a lagged dependent variable included on the right-hand side of the equation, produces an R-squared of 0.93. The same model in first differences has an R-squared of 0.03, while the regression coefficient of the exogenous variable remains significant in both models. So we can ‘explain’ more than 90% of the variation in the first case by reference to the past values of the outcome. Does this amount to an explanation in any meaningful sense? I guess that depends on the context. Does it provide any leverage to the researcher to manipulate the outcome? Not at all.
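The same pattern is easy to reproduce with simulated data (a stand-in for the series described above, not the actual data): a highly persistent outcome with a genuine but modest effect of an exogenous variable x.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    # near-unit-root persistence plus a real but small effect of x
    y[t] = 0.98 * y[t - 1] + 0.2 * x[t] + rng.standard_normal()

def r_squared(cols, target):
    X = np.column_stack([np.ones(len(target))] + cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    tss = ((target - target.mean()) ** 2).sum()
    return 1 - (resid ** 2).sum() / tss

# Levels, with the lagged dependent variable on the right-hand side:
r2_levels = r_squared([y[:-1], x[1:]], y[1:])
# The same relationship in first differences:
r2_diff = r_squared([np.diff(x)], np.diff(y))
print(round(r2_levels, 2), round(r2_diff, 2))  # high vs. near zero
```

The levels regression scores an R-squared above 0.9 almost entirely because the lagged outcome ‘explains’ itself; in first differences the fit collapses even though the effect of x is unchanged.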

# Is unit homogeneity a sufficient assumption for causal inference?

Is unit homogeneity a sufficient condition (assumption) for causal inference from observational data?

Re-reading King, Keohane and Verba’s bible on research design [lovingly known to all exposed as KKV], I think they regard unit homogeneity and conditional independence as alternative assumptions for causal inference. For example: “we provide an overview here of what is required in terms of the two possible assumptions that enable us to get around the fundamental problem [of causal inference]” (p.91, emphasis mine). However, I don’t see how unit homogeneity on its own can rule out endogeneity (establish the direction of causality). In my understanding, endogeneity is automatically ruled out under conditional independence, but not under unit homogeneity (“Two units are homogeneous when the expected values of the dependent variables from each unit are the same when our explanatory variable takes on a particular value” [p.91]).

Going back to Holland’s seminal article which provides the basis of KKV’s approach, we can confirm that unit homogeneity is listed as a sufficient condition for inference (p.948). But Holland divides variables into pre-exposure and post-exposure before he even gets to discuss any of the additional assumptions, so reverse causality is ruled out altogether. Hence, in Holland’s context unit homogeneity can indeed be regarded as sufficient, but in my opinion in KKV’s context unit homogeneity needs to be coupled with some condition (temporal precedence for example) to ascertain the causal direction when making inferences from data.
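Holland’s identification argument can be sketched in potential-outcomes terms with hypothetical numbers (the units and values below are invented for illustration):

```python
# Two units assumed homogeneous: same potential outcome under treatment,
# same potential outcome under control.
Y_if_treated = {'u1': 7, 'u2': 7}
Y_if_control = {'u1': 4, 'u2': 4}

# The fundamental problem of causal inference: we observe only one
# potential outcome per unit.
observed = {'u1': Y_if_treated['u1'],   # u1 was exposed to the treatment
            'u2': Y_if_control['u2']}   # u2 was not

# Under unit homogeneity the cross-unit difference identifies the effect:
effect = observed['u1'] - observed['u2']
print(effect)  # 3, the same as Y_if_treated['u1'] - Y_if_control['u1']

# Note: labelling one variable the 'treatment' and the other the 'outcome'
# is done by assumption here; nothing in the homogeneity condition itself
# rules out that causality runs the other way.
```

The sketch makes the point in the text visible: unit homogeneity licenses the cross-unit comparison, but the causal direction is presupposed by the pre-/post-exposure labelling, not delivered by the assumption.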

The point is minor but can create confusion when presenting unit homogeneity and conditional independence side by side as alternative assumptions for inference.