Category: Data visualization

Excess mortality in the Netherlands in 2020

What has been the impact of COVID-19 on mortality in the Netherlands? Using the methods described here, I estimated excess mortality in the country during 2020. The results are not pretty: around 15,000 additional deaths, a 10% increase over the expected mortality for the year, with 25% of the excess not captured by official records of COVID-19-related deaths. The analysis features comparisons of excess mortality over the past 10 years, as well as an exploration of 2020 excess mortality across age and gender. Read it here. You can also check the data and code (in R).

Modeling mortality

To grasp the true impact of COVID-19 on our societies, we need to know the effect of the pandemic on mortality. In other words, we need to know how many deaths can be attributed to the virus, directly and indirectly. It is already popular to visualize mortality in order to gauge the impact of the pandemic in different countries. You might have seen at least some of these graphs and websites: FT, Economist, Our World in Data, CBS, EFTA, CDC, EUROSTAT, and EUROMOMO. But estimating the impact of COVID-19 on mortality is also controversial, with people either misunderstanding or distrusting the way in which the impact is measured and assessed. That’s why I put together a step-by-step guide to estimating the impact of COVID-19 on mortality. In the guide, I build a large number of statistical models that we can use to predict expected mortality in 2020. The complexity of the models ranges from the simplest, based only on weekly averages from past years, to what is currently the state of the art. But this is not all. I also review the predictive performance of all of these models, so that we know which ones work best. I run the models on publicly available data from the Netherlands, I use only the open-source software R, and I share the code, so anyone can check, replicate and extend the exercise. The guide is available here: http://dimiter.eu/Visualizations_files/nlmortality/Modeling-Mortality.html I hope this guide will provide some transparency about how expected mortality is and can be estimated…
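To make the simplest of these models concrete, here is a minimal sketch of a weekly-average baseline: expected deaths in a given week are the mean of deaths in the same week of past years, and excess mortality is observed minus expected. The guide itself uses R; this sketch is in Python, the function names are hypothetical, and the numbers are made-up toy values rather than Dutch data. The mean absolute error function is one possible way to compare predictive performance on a held-out year, not necessarily the metric used in the guide.

```python
from collections import defaultdict


def weekly_baseline(history):
    """Mean deaths per calendar week over past years.

    history: iterable of (year, week, deaths) tuples.
    Returns {week: expected deaths}.
    """
    by_week = defaultdict(list)
    for _year, week, deaths in history:
        by_week[week].append(deaths)
    return {week: sum(values) / len(values) for week, values in by_week.items()}


def excess_mortality(observed, baseline):
    """Observed minus expected deaths, per week.

    observed: {week: deaths} for the year of interest.
    """
    return {week: deaths - baseline[week] for week, deaths in observed.items()}


def mean_absolute_error(observed, baseline):
    """Average absolute prediction error of the baseline on a held-out year."""
    errors = [abs(deaths - baseline[week]) for week, deaths in observed.items()]
    return sum(errors) / len(errors)


# Toy numbers for illustration only (NOT real mortality data).
history = [
    (2018, 1, 3000), (2019, 1, 3200),
    (2018, 2, 3100), (2019, 2, 3300),
]
observed_2020 = {1: 3150, 2: 4000}

baseline = weekly_baseline(history)
excess = excess_mortality(observed_2020, baseline)
total_excess = sum(excess.values())
relative_excess = total_excess / sum(baseline[w] for w in observed_2020)

print(baseline)
print(excess)
print(total_excess, round(100 * relative_excess, 1))
```

More elaborate models replace `weekly_baseline` with something that also accounts for trends, seasonality or population changes, while the excess calculation stays the same.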

The political geography of human development

The research I did for the previous post on the inadequacy of the widely-used term ‘Global South’ led me to some surprising results about the political geography of development. Although the relationship between latitude and human development is not linear, distance from the equator turned out to have a rather strong, although far from deterministic and not necessarily causal, link with a country’s development level, as measured by its Human Development Index (HDI). Even more remarkably, once we include indicators (dummy variables) for islands and landlocked countries, and interactions between these and distance from the equator, we can account for more than 55% of the variance in HDI (2017). In other words, with three simple geographic variables and their interactions we can ‘explain’ more than half of the variation in the level of development of all countries in the world today. Wow! The plot below (pdf) shows these relationships. In case you are wondering whether this result is driven by many small countries with tiny populations, it is not. When we run a weighted linear regression with population size as the weight, the adjusted R-squared of the model remains just above 0.50. On a side note, including dummies for (former) communist countries and current European Union (EU) member states pushes the R-squared above 0.60. Communist regime or legacy is associated with significantly lower HDI, net of the geographic variables, and EU membership is associated with significantly higher HDI. The next question to consider is whether the relationship between…
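For readers who want to see the structure of such a model, here is a minimal sketch of a linear regression with two dummies, their interactions with distance from the equator, and optional population weights. It is written in plain Python as an illustration, not as the code behind the post. All names are hypothetical, the nine "countries" are invented toy values, and the weighted adjusted R-squared uses one common convention among several.

```python
def design_row(dist, island, landlocked):
    """Intercept, main effects, and interactions with distance from the equator."""
    return [1.0, dist, island, landlocked, dist * island, dist * landlocked]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_ols(rows, y, w=None):
    """Weighted least squares via the normal equations.

    Returns (coefficients, R-squared, adjusted R-squared).
    With w=None every observation gets weight 1 (ordinary least squares).
    """
    w = w or [1.0] * len(y)
    k = len(rows[0])
    xtwx = [[sum(wi * r[i] * r[j] for r, wi in zip(rows, w)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(wi * r[i] * yi for r, yi, wi in zip(rows, y, w)) for i in range(k)]
    beta = solve(xtwx, xtwy)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    ss_res = sum(wi * (yi - fi) ** 2 for wi, yi, fi in zip(w, y, fitted))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    n = len(y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)  # k counts the intercept
    return beta, r2, adj_r2


# Invented toy data: (distance from equator, island, landlocked, HDI, population in millions).
toy = [
    (5, 0, 0, 0.55, 50), (25, 0, 0, 0.70, 120), (50, 0, 0, 0.90, 60),
    (10, 1, 0, 0.70, 2), (35, 1, 0, 0.85, 5), (55, 1, 0, 0.93, 0.3),
    (8, 0, 1, 0.45, 15), (30, 0, 1, 0.60, 10), (47, 0, 1, 0.92, 9),
]
rows = [design_row(d, i, l) for d, i, l, _, _ in toy]
hdi = [h for _, _, _, h, _ in toy]
pop = [p for _, _, _, _, p in toy]

print(weighted_ols(rows, hdi))        # unweighted fit
print(weighted_ols(rows, hdi, pop))   # population-weighted fit
```

Adding further dummies, such as communist legacy or EU membership, only means appending columns in `design_row`.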

Books on data visualization

Here is a compilation of new and classic books on data visualization:

Scott Murray (2017) Interactive Data Visualization for the Web
Elijah Meeks (2017) D3.js in Action: Data Visualization with JavaScript
Alberto Cairo (2016) The Truthful Art: Data, Charts, and Maps for Communication
Andy Kirk (2016) Data Visualization
David McCandless (2014) Knowledge is Beautiful
Edward Tufte (2006) Beautiful Evidence
Edward Tufte (2001) The Visual Display of Quantitative Information
Edward Tufte (1997) Visual Explanations: Images and Quantities, Evidence and Narrative
Edward Tufte (1990) Envisioning Information

Olympic medals, economic power and population size

With the 2016 Rio Olympic games officially over, we can obsess over the final medal table as much as we like, without the distraction of having to actually watch any sports. One of the basic questions to ponder about the medal table is to what extent Olympic glory is determined by the wealth, economic power and population size of the countries. Many news outlets quickly calculated the ratios of the 2016 medal count to economic power and population size per country and presented the rankings of ‘medals won per billion of GDP’ and ‘medals won per million of population’ (for example here and here). But while these rankings are fun, they give us little idea about the relationships between economic power and population size, on the one hand, and Olympic success, on the other. Obviously, there are no deterministic links, but there could still be systematic relationships. So let’s see.

Data

I pulled from the Internet the total number of medals won at the 2016 Olympic games and assigned each country a score in the following way: each country got 5 points for a gold medal, 3 points for silver, and 1 point for bronze. (Different transformations of medals into points are of course possible.) To measure wealth and economic power, I got the GDP (at purchasing power parity) estimates for 2015 provided by the International Monetary Fund, complemented by data from the CIA Factbook (both sets of numbers available here). For population size, I used the Wikipedia list available…
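The scoring rule described above (5 points for gold, 3 for silver, 1 for bronze) can be sketched in a few lines. The function names are hypothetical, and the medal counts, GDP and population below are invented for illustration, not taken from the 2016 medal table.

```python
# Points per medal type, as described in the post.
POINTS = {"gold": 5, "silver": 3, "bronze": 1}


def medal_score(gold, silver, bronze):
    """Weighted medal score: 5 per gold, 3 per silver, 1 per bronze."""
    return POINTS["gold"] * gold + POINTS["silver"] * silver + POINTS["bronze"] * bronze


def score_per_unit(score, denominator):
    """Score relative to a size measure, e.g. GDP in billions or population in millions."""
    return score / denominator


# Invented example country (NOT real 2016 data).
example = {"gold": 10, "silver": 5, "bronze": 8}
gdp_billions = 2000.0
population_millions = 50.0

score = medal_score(**example)
print(score)
print(score_per_unit(score, gdp_billions))
print(score_per_unit(score, population_millions))
```

Changing the values in `POINTS` is all it takes to try a different transformation of medals into points.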

Visualizing asylum statistics

Note: of potential interest to R users for the dynamic Google chart generated via googleVis in R and discussed towards the end of the post. Here you can go directly to the graph.

An emergency refugee center, opened in September 2013 in an abandoned school in Sofia, Bulgaria. Photo by Alessandro Penso, Italy, OnOff Picture. First prize at World Press Photo 2013 in the category General News (Single).

The tragic lives of asylum-seekers make for moving stories and powerful photos. When individual tragedies are aggregated into abstract statistics, the message gets harder to sell. Yet, statistics are arguably more relevant for policy and provide for a deeper understanding, if not as much empathy, than individual stories. In this post, I will offer a few graphs that present some of the major trends and patterns in the numbers of asylum applications and asylum recognition rates in Europe over the last twelve years. I focus on two issues: which European countries take the brunt of the asylum flows, and the link between the application share that each country gets and its asylum recognition rate.

Asylum applications and recognition rates

Before delving into the details, let’s look at the big picture first. Each year between 2001 and 2012, 370,000 people on average have applied for asylum protection in one of the member states of the European Union (plus Norway and Switzerland). As can be seen from Figure 1, the number fluctuates between 250,000 and 500,000 per year, and there is no clear trend. Altogether, during this 12-year period, approximately 4.5 million…
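The two quantities this post relates, a country's share of applications and its recognition rate, can be computed as in the sketch below. The excerpt does not spell out the definitions, so both are assumptions here: the share is a country's applications divided by the total across countries, and the recognition rate is positive decisions divided by all decisions. The country labels and counts are invented.

```python
def application_shares(applications):
    """Each country's share of all applications (assumed definition).

    applications: {country: number of applications}.
    """
    total = sum(applications.values())
    return {country: n / total for country, n in applications.items()}


def recognition_rate(positive_decisions, total_decisions):
    """Positive decisions as a fraction of all decisions (assumed definition)."""
    return positive_decisions / total_decisions


# Invented counts for three hypothetical countries (NOT real asylum statistics).
applications = {"A": 60000, "B": 30000, "C": 10000}
decisions = {"A": (12000, 40000), "B": (9000, 20000), "C": (1000, 8000)}

shares = application_shares(applications)
rates = {c: recognition_rate(pos, tot) for c, (pos, tot) in decisions.items()}

for country in applications:
    print(country, round(shares[country], 2), round(rates[country], 3))

# Consistency check on the figures quoted in the post:
# 12 years at an average of 370,000 applications per year.
print(12 * 370_000)
```

The last line gives 4,440,000, which is consistent with the "approximately 4.5 million" total quoted above.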