The problem with scope conditions

tl;dr: Posing arbitrary scope conditions to causal arguments leads to the same problem as subgroup analysis: the ‘results’ are too often just random noise.

Ingo Rohlfing has a very nice post on the importance of specifying what you mean by ‘context’ when you say that a causal relationship depends on the context. In sum, the argument is that ‘context’ can mean two rather different things: (1) scope conditions, so that the causal relationship might (or might not) work differently in a different context, or (2) moderating variables, so that the causal relationship should work differently in a different context, defined by different values of the moderating variables. So we had better be explicit about which of these two interpretations we endorse when we write that a causal relationship is context-dependent.

This is an important point. But the argument also exposes the structural similarity between scope conditions and moderating variables. Once we recognize this similarity, it is a small step to discover an even bigger issue lurking in the background: posing arbitrary scope conditions leads to the same problem as arbitrary subgroup analysis; namely, we mistake random noise for real relationships in the data.

The problem with subgroup analysis is well-known: We start with a population in which we find no association between two variables. And then we try different subgroups of the original population until we find one where the association between the two variables is ‘significant’. Even when a ‘real’ relationship between the variables does not exist at all, when we try enough subgroups, sooner or later we will get ‘lucky’ and discover a subgroup for which the relationship will look too strong to be due to chance. But it will be just that. (If you are still not persuaded, see the classic XKCD post below that makes the problem rather obvious.)
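To put a rough number on how quickly we get ‘lucky’, here is a minimal sketch. It assumes (my simplification, not something the argument depends on) that the subgroups are tested independently, that there is no real relationship in any of them, and that ‘significant’ means the conventional 5% threshold. The function name is just for illustration.

```python
def chance_of_false_positive(n_subgroups, alpha=0.05):
    """Probability that at least one of n independent tests comes out
    'significant' purely by chance, when no real relationship exists."""
    return 1 - (1 - alpha) ** n_subgroups


# How the risk grows with the number of subgroups we are willing to try
for k in (1, 5, 20, 50):
    print(k, round(chance_of_false_positive(k), 3))
```

With twenty subgroups the chance of at least one spurious ‘finding’ is already well above one half. Overlapping or correlated subgroups would change the exact numbers but not the direction of the problem.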

How are scope conditions similar? Well, we start with a subgroup of a population for which we find evidence for a strong, systematic relationship between some variables. Next, we try to extend the research to the broader population or to different subgroups, where we find no relationship. Then we conclude that the original relationship is context-dependent and suggest some scope conditions that define the context. But, essentially, we have committed the same mistake as the researcher trying out different subgroups before he or she gets ‘lucky’: it’s only that we have been ‘lucky’ on the first try!

When we find that a relationship holds in group A, but not in group B, a common response is to say that the relationship depends on some background scope conditions that are present in A but not in B. But it is probably more likely that the original result for group A was a fluke in the first place. After all, a theory that there is no relationship is more parsimonious than a theory that there is a relationship that is context-dependent (at least when we start from the assumption that not everything is connected to everything else by default).

Of course, in some cases, there will be good reasons to conclude that there are scope conditions to a previously-established association or causal relationship. Similarly, in some cases there are certain subgroups in which a relationship holds, while not in others or in the general population. The point is that failing to find a relationship in a new context should make us more sceptical about whether the original finding itself was anything more than a result of chance. Hence, before, or in parallel to, searching for scope conditions, we should go back to the original study and try to ascertain whether the original finding still holds by collecting additional evidence or interpreting the existing evidence with a more sceptical prior.

The search for scope conditions should also be theory-driven, the same way the selection of subgroups should be driven by theoretical considerations. A scope condition would be more likely to be real, if it has been anticipated by theory and explicitly hypothesized as such before seeing the new data. Otherwise, it is too easy to capitalize on chance and elevate any random difference between groups (countries, time periods, etc.) as a scope condition of a descriptive or causal relationship.

While the problem with subgroup analysis is discussed mostly in statistical research, the problem with scope conditions is even more relevant for qualitative, small-N research than for large-N studies. This is because small-N research often proceeds from a single case study, where some relationships are found, to new cases, where often these relationships are not found, with the conclusion typically being that the originally-discovered relationships are real but context-dependent. That could be the case, but it could also be that there are no systematic relationships in any of these cases at all.

I feel that if qualitative researchers disagree with my diagnosis of the problem with scope conditions, it will be because they often start from very different ontological assumptions about how the social world works. As mentioned above, my analysis holds only if we assume that the multitude of variables characterizing our world are not systematically related, unless we find evidence that they are. But many qualitative researchers seem to assume that everything is connected to everything else, unless we find evidence that it is not. Starting from such a strongly deterministic worldview, posing scope conditions when we fail to extend a result makes more sense. But then so would any subgroup analysis that finds a ‘significant’ relationship, and we seem to agree that this is wrong, at least in the context of statistical work.
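The role of the starting assumptions can be made concrete with a toy Bayesian calculation. Everything numeric here is an assumption made purely for illustration: a 5% false-positive rate, 80% power to detect a real relationship, three stylized hypotheses, and two made-up sets of prior beliefs (a ‘sceptical’ one and an ‘everything is connected’ one). The evidence is the familiar pattern: a ‘significant’ relationship in group A and none in group B.

```python
def posterior(priors, alpha=0.05, power=0.8):
    """Posterior over three stylized hypotheses after observing a
    'significant' result in group A and a non-significant one in group B."""
    # Likelihood of that evidence pattern under each hypothesis
    likelihood = {
        "no relationship": alpha * (1 - alpha),        # fluke in A, nothing in B
        "relationship everywhere": power * (1 - power),  # hit in A, miss in B
        "relationship only in A": power * (1 - alpha),   # hit in A, nothing in B
    }
    unnormalized = {h: priors[h] * likelihood[h] for h in likelihood}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical priors, chosen only to contrast the two worldviews
sceptical = {"no relationship": 0.90,
             "relationship everywhere": 0.05,
             "relationship only in A": 0.05}
connected = {"no relationship": 0.10,
             "relationship everywhere": 0.45,
             "relationship only in A": 0.45}

for label, priors in (("sceptical", sceptical), ("connected", connected)):
    print(label, {h: round(p, 2) for h, p in posterior(priors).items()})
```

Under the sceptical prior, ‘no relationship’ comes out as the single most probable reading of the evidence; under the ‘connected’ prior, the scope-condition reading dominates. The same data support different conclusions depending on where we start, which is the disagreement described above.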

To conclude, unless you commit to a strongly deterministic ontology where everything is connected to everything else by default, be careful when posing scope conditions to rationalize a failure to find a previously-established relationship in a different context. Instead, question whether the original result itself still holds. Only then search for more complex explanations that bring in scope conditions or moderating variables.

Intuitions about case selection are often wrong

Imagine the following simple setup: there are two switches (X and Z) and a lamp (Y). Both switches and the lamp are ‘On’. You want to know what switch X does, but you have only one try to manipulate the switches. Which one would you choose to switch off: X, Z or it doesn’t matter?

These are the results of the quick Twitter poll I did on the question:

Clearly, almost half of the respondents think it doesn’t matter, switching X is the second choice, and only 2 out of 15 would switch Z to learn what X does. Yet, it is by pressing Z that we have the best chance of learning something about the effect of X. This seems quite counter-intuitive, so let me explain.

First, let’s clarify the assumptions embedded in the setup: (A1) both switches and the lamp can be either ‘On’ [1] or ‘Off’ [0]; (A2) the lamp is controlled only by these switches; there is nothing outside the system that controls its output; (A3) X and Z can work individually or in combination (so that the lamp is ‘On’ only if both switches are ‘On’ simultaneously).

Now let’s represent the information we have in a table:

Switch X Switch Z Lamp Y
1 1 1
0 0 0

We are allowed to make one experiment in the setup (press only one switch). In other words, we can add an observation for one more row of the table. Which one should it be?

Well, let’s see what happens if we switch off X (let’s call this strategy S1). There are two possible outcomes: either the lamp stays on (S1a) or it goes off (S1b).

In the first case (represented as the second line in the table below) we can conclude that X is not necessary for the lamp to be ‘On’, but we do not know whether X can switch on the lamp on its own (whether it is sufficient to do so).

Switch X Switch Z Lamp Y
1 1 1
0 1 1
0 0 0

If the lamp goes off when we press X, we know that X is necessary for the outcome but we do not know whether X can turn on the lamp on its own or only in combination with Z.

Switch X Switch Z Lamp Y
1 1 1
0 1 0
0 0 0

To sum up, by pressing X we learn either that (S1a) X is not necessary or that (S1b) X matters but we do not know whether on its own or only in combination with Z.


Now, let’s see what happens if we press Z (strategy S2). Again either the lamp stays on (S2a) or it goes off (S2b).

Under the first scenario, we learn that X is sufficient to turn on the lamp.

Switch X Switch Z Lamp Y
1 1 1
1 0 1
0 0 0

Under the second scenario, we learn that X is not sufficient to turn on the light. It is still possible that it is necessary for turning on the lamp in combination with Z.

Switch X Switch Z Lamp Y
1 1 1
1 0 0
0 0 0

To sum up, by pressing Z we learn either that (S2a) X can turn on the lamp or (S2b) that it cannot turn on the lamp on its own but is possibly necessary in combination with Z. 

Comparing the two sets of inferences, I think it is clear that the second one is much more informative. By pressing Z we learn either that we can turn on the lamp by pressing X or that we cannot unless Z is ‘On’. By pressing X we learn next to nothing: either we are still in the dark about whether X works on its own to turn on the lamp (sorry for the pun), or we learn that X matters but still do not know whether we also need Z to be ‘On’.

If you are still unconvinced, the following table summarizes all inferences under all strategies and contingencies about each of the possible effects (X, Z, and the interaction XZ):

Strategy    X works on its own    Z works on its own    Only XZ works
S1a         ?                     True                  False
S1b         ?                     False                 ?
S2a         True                  ?                     False
S2b         False                 ?                     ?

It should be obvious now that we are better off by pressing Z to learn about the effect of X.
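For those who prefer to check such things mechanically, the summary table can be reproduced by brute force. The sketch below is my own reading of assumptions A1-A3: it takes the lamp to follow one of four candidate mechanisms (X alone, Z alone, either switch, or both together), keeps the mechanisms consistent with what we observe after the experiment, and reports what can then be said about each claim. The names `MECHANISMS`, `CLAIMS` and `infer` are made up for this illustration.

```python
# Candidate mechanisms for the lamp (an assumption of this sketch).
# All four agree with the starting facts: (1, 1) -> On and (0, 0) -> Off.
MECHANISMS = {
    "X alone": lambda x, z: x,
    "Z alone": lambda x, z: z,
    "X or Z": lambda x, z: x or z,
    "X and Z": lambda x, z: x and z,
}

# The three claims from the summary table, as tests on a mechanism
CLAIMS = {
    "X works on its own": lambda f: f(1, 0) == 1,
    "Z works on its own": lambda f: f(0, 1) == 1,
    "Only XZ works": lambda f: f(1, 0) == 0 and f(0, 1) == 0,
}


def infer(x, z, y):
    """What we know about each claim after seeing lamp state y
    with the switches set to (x, z)."""
    remaining = [f for f in MECHANISMS.values() if f(x, z) == y]
    verdicts = {}
    for claim, holds in CLAIMS.items():
        results = {holds(f) for f in remaining}
        # Undecided if the surviving mechanisms disagree about the claim
        verdicts[claim] = "?" if len(results) > 1 else str(results.pop())
    return verdicts


# (X, Z, Y) observed under each strategy and outcome
SCENARIOS = {"S1a": (0, 1, 1), "S1b": (0, 1, 0),
             "S2a": (1, 0, 1), "S2b": (1, 0, 0)}

for name, observation in SCENARIOS.items():
    print(name, infer(*observation))
```

Only the two S2 outcomes settle the claim about X working on its own; both S1 outcomes leave it as ‘?’.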

Good, but what’s the relevance of this little game? Well, the game resembles a research design situation in which we have one observation (case), we have the resources to add only one more, and we have to select which observation to make. In other words, the game is about case selection.

For example, we observe a case with a rare outcome – say, successful regional integration. We suspect that two factors are at play, both of which are present in the case – say, high trade volume within the integrating bloc and democratic form of government for all units. And we wanna probe the effect of trade volume in particular. In that case, the analysis above suggests that we should choose a case that has the same volume of trade but a non-democratic form of government, rather than a case which has low volume of trade and democratic form of government.

This result is counter-intuitive, so let’s spell out why. First, note that we are interested in the effect of X (the effect of the switch and of trade volume) and not in explaining Y (how to turn on the lamp or how regional integration comes about). This is a subtle difference in interpretation, but one that is crucial for the analysis. Second, note that we are more interested in the effect of X than in the effect of Z, although both are potential causes of Y. If both X and Z are of equal interest, then obviously it doesn’t matter which of the two observations we make. Third, the result hinges on the assumption that there is nothing other than X or Z (or their interaction) that matters for Y. Once we admit other possible causal variables in the set-up, then we are no longer better off switching Z to learn the effect of X.

Sooooo, don’t take this little game as general advice on case selection. But it definitely shows that when it comes to research design our intuitions cannot always be trusted.

P.S. One assumption on which the analysis does not depend is binary effects and outcomes: it works equally well with probabilistic effects that are additive or multiplicative (involving an interaction). 


Unit of analysis vs. Unit of observation

Having graded another batch of 40 student research proposals, I find that the distinction between ‘unit of analysis’ and ‘unit of observation’ proves to be, yet again, one of the trickiest for the students to master.

After several years of experience, I think I have a good grasp of the difference between the two, but it obviously remains a challenge to explain it to students. King, Keohane and Verba (1994) [KKV] introduce the difference in the context of descriptive inference, where it serves the argument that what goes under the heading of a ‘case study’ often actually has many observations (p.52, see also 116-117). But, admittedly, the book is somewhat unclear about the distinction and unambiguous definitions are not provided.

In my understanding, the unit of analysis (a case) is at the level at which you pitch the conclusions. The unit of observation is at the level at which you collect the data. So, the unit of observation and the unit of analysis can be the same but they need not be. In the context of quantitative research, units of observation could be students and units of analysis classes, if classes are compared. Or students can be both the units of observation and analysis if students are compared. Or students can be the units of analysis and grades the units of observation if several observations (grades) are available per student. So it all depends on the design. Simply put, the unit of observation is the row in the data table but the unit of analysis can be at a higher level of aggregation.
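A tiny example may help students see the ‘row in the data table’ point. The data below are entirely made up: each row is one student (the unit of observation), while the comparison we care about is between classes (the unit of analysis). The function name `class_means` is just for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: one row per student, the unit of observation
rows = [
    {"student": "s1", "class": "A", "grade": 7.0},
    {"student": "s2", "class": "A", "grade": 8.0},
    {"student": "s3", "class": "B", "grade": 6.0},
    {"student": "s4", "class": "B", "grade": 6.5},
    {"student": "s5", "class": "B", "grade": 5.5},
]


def class_means(rows):
    """Aggregate student-level rows to the class level, the unit of analysis."""
    grades_by_class = defaultdict(list)
    for row in rows:
        grades_by_class[row["class"]].append(row["grade"])
    return {name: mean(grades) for name, grades in grades_by_class.items()}


# Five observations, but only two units of analysis to compare
print(len(rows), class_means(rows))
```

If we compared the students themselves instead, the same five rows would be both the units of observation and the units of analysis, and no aggregation step would be needed.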

In the context of qualitative research, it is more difficult to draw the difference between the two, also because the difference between analysis and observation is in general less clear-cut. In some sense, the same unit (case) traced over time provides distinct observations but I am not sure to what extent these snap-shots would be regarded as distinct ‘observations’ by qualitative researchers. 

But more importantly, I am starting to feel that the distinction between units of analysis and units of observation creates more confusion than clarity. For the purposes of research design instruction, we would be better off if the term ‘case’ did not exist at all so we could simply speak about observations (single observation vs. single case study, observation selection vs. case selection, etc.). Of course, language policing never works so we seem to be stuck in an unfortunate but unavoidable ambiguity.

Overview of the process and design of public administration research in Prezi

Here is the result of my attempt to use Prezi during the last presentation for the class on Research Design in Public Administration. I tried to use Prezi’s functionality to provide in a novel form the same main lessons I have been emphasizing during the six weeks (yes, it is a short course). Some of the stuff is obviously an over-simplification but the purpose is to focus on the big picture and draw the various threads of the course together.

Prezi seems fun but I have two small complaints: (1) the handheld device I use to change PowerPoint slides from a distance doesn’t work with Prezi, and (2) I can’t find a way to make stuff (dis)appear à la PowerPoint without zooming in and out.

Is unit homogeneity a sufficient assumption for causal inference?

Is unit homogeneity a sufficient condition (assumption) for causal inference from observational data?

Re-reading King, Keohane and Verba’s bible on research design [lovingly known to all exposed as KKV] I think they regard unit homogeneity and conditional independence as alternative assumptions for causal inference. For example: “we provide an overview here of what is required in terms of the two possible assumptions that enable us to get around the fundamental problem [of causal inference]” (p.91, emphasis mine). However, I don’t see how unit homogeneity on its own can rule out endogeneity (establish the direction of causality). In my understanding, endogeneity is automatically ruled out with conditional independence, but not with unit homogeneity (“Two units are homogeneous when the expected values of the dependent variables from each unit are the same when our explanatory variables takes on a particular value” [p.91]).

Going back to Holland’s seminal article which provides the basis of KKV’s approach, we can confirm that unit homogeneity is listed as a sufficient condition for inference (p.948). But Holland divides variables into pre-exposure and post-exposure before he even gets to discuss any of the additional assumptions, so reverse causality is ruled out altogether. Hence, in Holland’s context unit homogeneity can indeed be regarded as sufficient, but in my opinion in KKV’s context unit homogeneity needs to be coupled with some condition (temporal precedence for example) to ascertain the causal direction when making inferences from data.

The point is minor but can create confusion when presenting unit homogeneity and conditional independence side by side as alternative assumptions for inference.

Course on Research Design

I am teaching again the Research Design class for the MSc in Public Administration at Leiden University. It is a rather challenging course since the background of the students is so diverse (from Religious Studies to Psychology to International Relations) and because most of the students have very little training in, and a certain dislike for, any formal method of data analysis.

Here is the course outline that we prepared (with my colleague Brendan Carroll). All comments and suggestions are more than welcome.