Books on public policy

This is a list of recommended books on public policy. It covers introductory textbooks, more advanced texts such as handbooks, and books on more specific topics within the field of public policy analysis.

Introductory textbooks:

Knill, C., & Tosun, J. (2012). Public policy: A new introduction. Macmillan International Higher Education.

Howlett, M., Ramesh, M., & Perl, A. (2009). Studying public policy: Policy cycles and policy subsystems (Third Edition). Oxford: Oxford University Press.

John, P. (2013). Analyzing public policy. Routledge.

Hill, M., & Varone, F. (2014). The public policy process. Routledge.

Cairney, P. (2011). Understanding public policy: Theories and issues. Macmillan International Higher Education.

Howlett, M. (2019). Designing public policies: Principles and instruments. Routledge.

Advanced books:

Goodin, R. E., Moran, M., & Rein, M. (2006). The Oxford handbook of public policy (Vol. 6). Oxford Handbooks.

Sabatier, P. A., & Weible, C. M. (Eds.). (2014). Theories of the policy process. Westview Press.

Baumgartner, F. R., & Jones, B. D. (2015). The politics of information: Problem definition and the course of public policy in America. University of Chicago Press.

Jones, B. D., & Baumgartner, F. R. (2005). The politics of attention: How government prioritizes problems. University of Chicago Press.

Writing for public policy:

Smith, C., & Pasqualoni, M. (2019). Writing Public Policy: A Practical Guide to Communicating in the Policy Making Process. Oxford University Press.

The problem with scope conditions

tl;dr: Posing arbitrary scope conditions to causal arguments leads to the same problem as subgroup analysis: the ‘results’ are too often just random noise.

Ingo Rohlfing has a very nice post on the importance of specifying what you mean by ‘context’ when you say that a causal relationship depends on the context. In sum, the argument is that ‘context’ can mean two rather different things: (1) scope conditions, so that the causal relationship might (or might not) work differently in a different context, or (2) moderating variables, so that the causal relationship should work differently in a different context, defined by different values of the moderating variables. So we had better be explicit about which of these two interpretations we endorse when we write that a causal relationship is context-dependent.

This is an important point. But the argument also exposes the structural similarity between scope conditions and moderating variables. Once we recognize this similarity, it is a small step to discover an even bigger issue lurking in the background: posing arbitrary scope conditions leads to the same problem as arbitrary subgroup analysis; namely, we mistake random noise for real relationships in the data.

The problem with subgroup analysis is well-known: We start with a population in which we find no association between two variables. And then we try different subgroups of the original population until we find one where the association between the two variables is ‘significant’. Even when a ‘real’ relationship between the variables does not exist at all, when we try enough subgroups, sooner or later we will get ‘lucky’ and discover a subgroup for which the relationship will look too strong to be due to chance. But it will be just that. (If you are still not persuaded, see the classic XKCD post below that makes the problem rather obvious.)
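To make the subgroup problem concrete, here is a minimal simulation sketch in Python. It is not part of the original argument: the subgroup size of 50, the 5% threshold, the Fisher z approximation used as the significance test, and all function names are assumptions chosen for illustration. The two variables are generated independently, so every ‘significant’ subgroup is a false positive by construction.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def looks_significant(x, y, alpha=0.05):
    """Approximate two-sided test of 'no correlation' via the Fisher z-transformation."""
    z = math.atanh(pearson_r(x, y)) * math.sqrt(len(x) - 3)
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def share_with_lucky_subgroup(n_subgroups, n_per_group=50, n_sims=500, seed=1):
    """Share of simulated studies in which at least one subgroup looks 'significant'.

    x and y are drawn independently, so there is no real relationship to find.
    """
    rng = random.Random(seed)
    lucky = 0
    for _ in range(n_sims):
        for _ in range(n_subgroups):
            x = [rng.gauss(0, 1) for _ in range(n_per_group)]
            y = [rng.gauss(0, 1) for _ in range(n_per_group)]
            if looks_significant(x, y):
                lucky += 1
                break  # one 'lucky' subgroup is enough to report a 'finding'
    return lucky / n_sims


for k in (1, 5, 20):
    print(k, "subgroups tried:", share_with_lucky_subgroup(k))
```

If the subgroup tests are independent, the chance of at least one false positive among k subgroups is 1 - 0.95^k, which is about 5% for one subgroup, 23% for five, and 64% for twenty. The simulated shares should land close to these values.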

How are scope conditions similar? Well, we start with a subgroup of a population for which we find evidence for a strong, systematic relationship between some variables. Next, we try to extend the research to the broader population or to different subgroups, where we find no relationship. Then we conclude that the original relationship is context-dependent and suggest some scope conditions that define the context. But, essentially, we have committed the same mistake as the researcher trying out different subgroups before he or she gets ‘lucky’: it’s only that we have been ‘lucky’ on the first try!

When we find that a relationship holds in group A, but not in group B, a common response is to say that the relationship depends on some background scope conditions that are present in A but not in B. But it is probably more likely that the original result for group A was a fluke in the first place. After all, a theory that there is no relationship is more parsimonious than a theory that there is a relationship that is context-dependent (at least when we start from the assumption that not everything is connected to everything else by default).

Of course, in some cases, there will be good reasons to conclude that there are scope conditions to a previously-established association or causal relationship. Similarly, in some cases there are certain subgroups in which a relationship holds, while not in others or in the general population. The point is that failing to find a relationship in a new context should make us more sceptical about whether the original finding itself was anything more than a result of chance. Hence, before, or in parallel to, searching for scope conditions, we should go back to the original study and try to ascertain whether the original finding still holds by collecting additional evidence or interpreting the existing evidence with a more sceptical prior.

The search for scope conditions should also be theory-driven, the same way the selection of subgroups should be driven by theoretical considerations. A scope condition would be more likely to be real, if it has been anticipated by theory and explicitly hypothesized as such before seeing the new data. Otherwise, it is too easy to capitalize on chance and elevate any random difference between groups (countries, time periods, etc.) as a scope condition of a descriptive or causal relationship.

While the problem with subgroup analysis is discussed mostly in statistical research, the problem with scope conditions is even more relevant for qualitative, small-N research than for large-N studies. This is because small-N research often proceeds from a single case study, where some relationships are found, to new cases, where often these relationships are not found, with the conclusion typically being that the originally-discovered relationships are real but context-dependent. That could be the case, but it could also be that there are no systematic relationships in any of these cases at all.

I feel that if qualitative researchers disagree with my diagnosis of the problem with scope conditions, it will be because they often start from very different ontological assumptions about how the social world works. As mentioned above, my analysis holds only if we assume that the multitude of variables characterizing our world are not systematically related, unless we find evidence that they are. But many qualitative researchers seem to assume that everything is connected to everything else, unless we find evidence that it is not. Starting from such a strongly deterministic worldview, posing scope conditions when we fail to extend a result makes more sense. But then so would any subgroup analysis that finds a ‘significant’ relationship, and we seem to agree that this is wrong, at least in the context of statistical work.

To conclude, unless you commit to a strongly deterministic ontology where everything is connected to everything else by default, be careful when posing scope conditions to rationalize a failure to find a previously-established relationship in a different context. Instead, question whether the original result itself still holds. Only then search for more complex explanations that bring in scope conditions or moderating variables.

Government positions from party-level Manifesto data (with R)

In empirical research in political science and public policy, we often need estimates of the political positions of governments (cabinets) and the salience of different issues for different governments (cabinets). Data on policy positions and issue salience is available, but typically at the level of political parties. One prominent source of data for issue salience and positions is the Manifesto Corpus, a database of the electoral manifestos of political parties. To ease the aggregation of government positions and salience from party-level Manifesto data, I developed a set of functions in R that accomplish just that, combining the Manifesto data with data on the duration and composition of governments from ParlGov.

To see how the functions work, read this detailed tutorial.

You can access all the functions at the dedicated GitHub repository. And you can contribute to this project by forking the code on GitHub. If you have questions or suggestions, get in touch.
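To give a feel for what such an aggregation involves, here is a minimal sketch in Python. It is not the code from the repository, and it does not use the real Manifesto or ParlGov data: it only illustrates one simple aggregation rule, a seat-weighted average of party positions, and the parties, seats, positions and the function name are all made up for illustration.

```python
def government_position(parties):
    """Seat-weighted mean of the positions of the parties in one cabinet."""
    total_seats = sum(p["seats"] for p in parties)
    if total_seats == 0:
        raise ValueError("cabinet parties hold no seats")
    return sum(p["position"] * p["seats"] for p in parties) / total_seats


# Hypothetical three-party cabinet (not real data)
cabinet = [
    {"party": "A", "seats": 60, "position": 10.0},
    {"party": "B", "seats": 30, "position": -5.0},
    {"party": "C", "seats": 10, "position": 20.0},
]

print(government_position(cabinet))
```

Other weighting schemes (for example, by share of cabinet portfolios, or no weighting at all) are equally easy to plug in, and the choice can matter for the resulting estimates.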

Enjoy!

Immigration and voting for the radical right in Andalusia

I wrote a short text for the European Politics and Policy (EUROPP) blog on the link between the presence of immigrants and voting for Vox, a relatively young radical right party, in the Spanish region of Andalusia. The full text is here; see also this post from 2015 about a similar link with Euroscepticism in the UK. The most important graph is below. Here is an excerpt:

 
To sum up, the available empirical evidence suggests that the relative size of the non-Western foreign-born population at the municipal level is positively, and rather strongly, related to the share of votes cast for Vox, the first Spanish radical right party to get in parliament since the end of Franco’s regime. Immigration might be responsible to a considerable extent for the resurgence of the radical right in Andalusia.

The political geography of human development

The research I did for the previous post on the inadequacy of the widely-used term ‘Global South’ led me to some surprising results about the political geography of development.

Although the relationship between latitude and human development is not linear, distance from the equator turned out to have a rather strong, although far from deterministic and not necessarily causal, link with a country’s development level, as measured by its Human Development Index (HDI). Even more remarkably, once we include indicators (dummy variables) for islands and landlocked countries, and interactions between these and distance from the equator, we can account for more than 55% of the variance in HDI (2017). In other words, with three simple geographic variables and their interactions we can ‘explain’ more than half of the variation in the level of development of all countries in the world today. Wow! The plot below (pdf) shows these relationships.

 

 

In case you are wondering whether this result is driven by many small countries with tiny populations, it is not. When we run a weighted linear regression with population size as the weight, the adjusted R-squared of the model still remains (just above) 0.50. On a side note, including dummies for (former) communist countries and current European Union (EU) member states pushes the R-squared above 0.60. Communist regime or legacy is associated with significantly lower HDI, net of the geographic variables, and EU membership is associated with significantly higher HDI.
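For readers who want to see the structure of such a model spelled out, here is a minimal sketch in Python (the original figures were made in R). It runs on synthetic data, not on the HDI data, so the R-squared values it prints say nothing about the real world. The coefficients used to generate the data, the sample size, and all function names are assumptions for illustration only.

```python
import math
import random


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting.

    Fails with ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y, weights=None):
    """(Weighted) least squares via the normal equations; returns (coefficients, R-squared)."""
    w = weights or [1.0] * len(y)
    k = len(rows[0])
    xtx = [[sum(wi * r[i] * r[j] for wi, r in zip(w, rows)) for j in range(k)] for i in range(k)]
    xty = [sum(wi * r[i] * yi for wi, r, yi in zip(w, rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    ss_res = sum(wi * (yi - fi) ** 2 for wi, yi, fi in zip(w, y, fitted))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return beta, 1 - ss_res / ss_tot


def design_row(distance, island, landlocked):
    """Intercept, three main effects, and the two interactions with distance."""
    return [1.0, distance, island, landlocked, distance * island, distance * landlocked]


# Synthetic 'countries' (assumed data-generating process, not real data)
rng = random.Random(42)
rows, hdi, population = [], [], []
for _ in range(150):
    distance = rng.uniform(0, 65)
    island = 1.0 if rng.random() < 0.2 else 0.0
    landlocked = 1.0 if island == 0.0 and rng.random() < 0.25 else 0.0
    rows.append(design_row(distance, island, landlocked))
    hdi.append(0.55 + 0.005 * distance + 0.05 * island - 0.08 * landlocked + rng.gauss(0, 0.08))
    population.append(math.exp(rng.gauss(16, 1.5)))

_, r2_unweighted = ols(rows, hdi)
_, r2_weighted = ols(rows, hdi, weights=population)
print("R-squared, unweighted:", round(r2_unweighted, 2))
print("R-squared, population-weighted:", round(r2_weighted, 2))
```

The weighted version simply lets each observation count in proportion to its population, both in the fit and in the R-squared, which is the idea behind checking whether small countries drive the result.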

The next question to consider is whether the relationship between geography and development has grown weaker or stronger over time. There are many plausible ideas we might have about the influence of globalization, the spread of information and communication technologies, wars, and financial crises on the links between geography and development. When we look at the data, however, it turns out that the strength of the link has remained roughly the same since 1990. Wow! Despite all the global social and political transformations over the past 30 years, geography still plays the same, rather large role in constraining and enabling human development. The gif below shows the same plots for 1990, 2000, 2010, and 2017. While overall development grows over time, the relationship with distance from the equator remains roughly the same, as indicated by the slopes of the linear regression lines.

 

 

Note that the way the HDI is constructed makes changes in development over time not quite comparable (the index is capped at 1.0, so if you are an already highly developed country, there is not much scope to improve your index further). Also, the sample of countries for which there is available data is smaller in 1990 (N=144) than in 2017 (N=191).

Since we mentioned population size, let’s consider the link between the population size of a country and its level of HDI. Are small countries more successful? Does it pay off to be a large state? Maybe countries with populations that are neither too big nor too small perform best?

As the plot below (pdf) shows, there is no clear relationship between population size and HDI. The linear regression line slopes slightly downwards but the ‘effect’ is not significant and it is not really linear. The loess fit meanders up and down without a clear pattern. It turns out there is no sweet spot for population size when it comes to human development. Small populations can be just as good, and just as bad, as bigger ones. There are tiny states that are successful, and ones that do pretty badly. The same goes for mid-sized, big, and enormous countries (not in terms of area, but population).

 

 

This lack of relationship is quite remarkable, but there is another surprise when we look at the change in development between 2000 and 2017. As the plot below (pdf) shows, more populous countries have been more successful in improving their HDI over the past 18 years. It is not a huge difference, but given the overall small scale of the observed changes, it is significant and important.

 

 

To sum up, while in general population size is not related to development, during the past two decades more populous countries have been more successful in improving their development index. This is of course good news, as it means that more people live longer, study longer, and enjoy higher standards of living.

For now, this concludes my exploits in political geography, which turned out to harbor more insights than I expected, even though I have only explored a total of five variables. If you want to continue from here on your own, the R script for the figures is here and the datafile is here.

The ‘Global South’ is a terrible term. Don’t use it!

The Rise of the ‘Global South’

The ‘Global South’ and ‘Global North’ are increasingly popular terms used to categorize the countries of the world. According to Wikipedia, the term ‘Global South’ originated in postcolonial studies, and was first used in 1969. The Google N-gram chart below shows the rise of the ‘Global South’ term from 1980 till 2008, but the rise is even more impressive afterwards.

Nowadays, the Global South is used as a shortcut to anything from poor and less-developed to oppressed and powerless. Despite this vagueness, the term is prominent in serious academic publications, and it even features in the names of otherwise reputable institutions. But, its popularity notwithstanding, the ‘Global South’ is a terrible term. Here is why.

 

There is no Global South

The Global South/Global North terms are inaccurate and misleading. First, they are descriptively inaccurate, even when they refer to general notions such as (economic) development. Second, they are homogenizing, obscuring important differences between countries supposedly part of the Global South and North groups. In this respect, these terms are no better than alternatives that they are trying to replace, such as ‘the West’ or the ‘Third World’. Third, the Global South/Global North terms imply a geographic determinism that is wrong and demotivational. Poor countries are not doomed to be poor, because they happen to be in the South, and their geographic position is not a verdict on their developmental prospects.

 

The Global South/Global North terms are inaccurate and misleading

Let me show you just how bad these terms are. I focus on human development, broadly defined and measured by the United Nations’ Human Development Index (HDI). The HDI tracks life expectancy, education, and standard of living, so it captures more than purely economic aspects of development.

The chart below plots the geographic latitude of a country’s capital against the country’s HDI score for 2017. (Click on the image for a larger size or download a higher resolution pdf.) It is quite clear that a straight line from South to North is a poor description of the relationship between geographic latitude and human development. The correlation between the two is 0.48. A linear regression of HDI on latitude returns a positive coefficient, and the R-squared is 0.23. But, as is obvious from the plot, the relationship is not linear. In fact, some of the southern-most countries on the planet, such as Australia and New Zealand, but also Chile and Argentina, are in the top ranks of human development. The best summary of the relationship between HDI and latitude is curvilinear, as indicated by the Loess (nonparametric local regression) fit.
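The correlation and the R-squared reported above are two views of the same number: in a linear regression with a single predictor, the R-squared equals the squared correlation, and 0.48 squared is 0.2304, or 0.23 after rounding. The short Python sketch below checks this identity on a handful of made-up (latitude, score) pairs. The values and the function name are illustrative assumptions, not the HDI data.

```python
def r_and_r_squared(x, y):
    """Return the Pearson correlation and the R-squared of a regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    r = sxy / (sxx * syy) ** 0.5

    # Least-squares line and the share of variance it accounts for
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - ss_res / syy
    return r, r_squared


# Made-up values, for illustration only
latitude = [-35.0, -20.0, -5.0, 10.0, 25.0, 40.0, 55.0]
score = [0.85, 0.60, 0.50, 0.55, 0.70, 0.80, 0.92]

r, r_squared = r_and_r_squared(latitude, score)
print("r squared:", r ** 2)
print("R-squared:", r_squared)
print("0.48 squared:", 0.48 ** 2)
```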

 

 

 

You can say that we always knew that and the Global South was meant to refer to ‘distance from the equator’ rather than to absolute latitude. But, first, this is rather offensive to people in New Zealand, Australia, South Africa and the southern part of South America. And, second, there is still far from a deterministic relationship between human development and geographic position, as measured by distance from the equator. The next plot (click on the image for a larger size, download a pdf version here) shows exactly that. Now, overall, the relationship is stronger: the correlation is 0.64. And after around the 10th degree, it is also rather linear, as indicated by the match between the linear regression line and the Loess fit. Still, there is important heterogeneity within the South/close to equator and North/far from equator countries. Singapore’s HDI is almost as high as that of Sweden, despite the two being on the opposite ends of the geographic scale. Ecuador’s HDI is just above Ukraine’s, although the former is more than 50 degrees closer to the equator than the latter. Gabon’s HDI is higher than Moldova’s, despite Gabon being 46 degrees further south than Moldova.
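Why does the switch from latitude to distance from the equator strengthen the correlation? Distance from the equator is just the absolute value of latitude, which folds the two hemispheres on top of each other and turns a roughly V-shaped pattern into a roughly straight one. The Python sketch below shows the effect on made-up numbers. The (latitude, score) pairs are invented to have a V shape and are not the HDI data, and the helper name is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up V-shaped data, for illustration only (negative latitude = south)
latitudes = [-40, -30, -15, -5, 0, 5, 15, 30, 45, 60]
scores = [0.90, 0.80, 0.62, 0.52, 0.50, 0.53, 0.60, 0.78, 0.88, 0.95]

# Distance from the equator ignores the hemisphere
distances = [abs(lat) for lat in latitudes]

r_latitude = pearson_r(latitudes, scores)
r_distance = pearson_r(distances, scores)
print("correlation with latitude:", round(r_latitude, 2))
print("correlation with distance from the equator:", round(r_distance, 2))
```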

 

 

This is not to deny that there is a link between geographic position and human development. By the standards of social science, this is a rather strong correlation and fairly smooth relationship. It is remarkable that no country more than 35 degrees from the equator has an HDI lower than 0.65 (but this excludes North Korea, for which there is no HDI data provided by the UN). But there is still important diversity in human development at different geographic zones. Moreover, the correlation between geographic position and development need not be causal, let alone deterministic.

There are good arguments to be made that geography shapes and constrains the economic and social development of nations. My personal favorite is Jared Diamond’s idea that Eurasia’s continental spread along an East-West axis made it easier for food innovations and agricultural technology to diffuse, compared to America’s continental spread along a North-South axis. But geography is not a verdict for development, as plenty of nations have demonstrated. Yet, the Global South/Global North categories suggest otherwise.

 

What to use instead?

OK, so the Global South/Global North are bad words, but what to use instead? There is no obvious substitute that is more descriptively accurate, less homogenizing and less suggestive of (geographic) determinism. But then don’t use any categorization that is so general and coarse. There is a good reason why there is no appropriate alternative term: the countries of the world are too diverse to fit into two boxes: one for South and one for North, one for developed and one for non-developed, one for powerful, and one for oppressed.

Be specific about what the term is referring to, and be concrete about the set of countries that is covered. If you mean the 20 poorest countries in the world, say the 20 poorest countries in the world, not countries of the Global South. If you mean technologically underdeveloped countries, say that and not countries of the Third World. If you mean rich, former colonial powers from Western Europe, say that and not the Global North. It takes a few more words, but it is more accurate and less misleading.

It is a bit ironic that the Global South/Global North terms are most popular among scholars and activists who are extremely sensitive about the power of words to shape public discourses, homogenize diverse populations, and support narratives that take on a life of their own, influencing politics and public policy. If that’s the case, it makes it even more imperative to avoid terms that are inaccurate, homogenizing and misleading on a global scale.

If you want to look at the data yourself, the R script for the figures is here and the datafile is here.