

June 03, 2016

Taxpayer-funded inaccuracies: another data story

Bad design decisions may cause confusion or even mislead. This is another example of a bad color scale selection that leads to false conclusions when interpreting information on a choropleth map.


Elías de la Rosa


Ever since the inception of mankind, decisions have been made for a variety of reasons, e.g. leisure, work or sheer survival. And, as humans are prone to error, wrong decisions are made too: choosing the wrong partner for a venture, taking a longer route on the daily commute or simply regretting a dish ordered at a restaurant.

Most of the time, wrong decisions are made on the basis of incomplete, inaccurate, non-existent or even false information (e.g. the Iraq wars). Nonetheless, there are times when data are accurate per se but improperly presented. Data from government institutions are no exception, as you will see in the next picture.

Number of convicts admitted to penitentiary centers for common-jurisdiction (fuero común) offenses in Mexico, by state. Year: 2014. Source: http://www3.inegi.org.mx/sistemas/mexicocifras/default.aspx

There are a couple of things that should be highlighted here:

  • Color scale: although the color scale runs from an almost blood-like red to a sky blue, it does so in a discrete (i.e. in predefined ranges) and irregular way, leading to:
      • Confusion: a state near the lower bound of a color range and another state near the upper bound of the same range are portrayed with the same color.
      • More confusion: because the colors do not vary in proportion to the number of convicts, the scale is less intuitive and harder to interpret.
  • Lack of context: even though the number of convicts in each state is accurate, it leads to merely informative insights rather than actionable ones. This is because it is common sense that states with larger populations will tend to have more convicts.
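The first kind of confusion can be reproduced with a minimal sketch. The class breaks, color names and counts below are hypothetical, chosen only for illustration; they are not the ones used on the map.

```python
from bisect import bisect_right

# Hypothetical upper bounds of irregular color classes (NOT the map's real breaks).
BREAKS = [500, 2000, 3000, 10000]
# One color per class, plus one for values above the last break (names are illustrative).
COLORS = ["sky blue", "light blue", "orange", "light red", "dark red"]


def binned_color(convicts):
    """Return the color class a discrete legend would assign to a count."""
    return COLORS[bisect_right(BREAKS, convicts)]


# Made-up counts: two far apart but inside the same class, one just below it.
just_below, lower_edge, upper_edge = 2900, 3100, 9900

for count in (just_below, lower_edge, upper_edge):
    print(count, "->", binned_color(count))
```

Under these assumed breaks, the two counts that are 6,800 apart share a color, while the two that are only 200 apart do not.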

The consequences of this map being released to the public may be:

  • Inability to interpret the map correctly (or to interpret it at all);
  • Opportunity for policy makers to turn a blind eye to marginal states and overstate government action in the most populated ones.

To counter these issues, the following actions could be taken:

  • Use relative numbers instead of absolute ones. Using, for example, the number of convicts per 100,000 inhabitants gives a more realistic picture of the convict situation in each state, leading to better interpretations and, thus, better decisions.
  • Use a continuous color scale instead of a discrete palette. A continuous color scale (e.g., a sequential one that runs from light to dark) will make the map more useful, as the color of each state will be directly related to its number of convicts.
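Both suggestions can be sketched in a few lines of Python. The state names, counts and populations are made up for illustration, and a simple light-to-dark red ramp stands in for whatever continuous palette is eventually chosen; the helper names `rate_per_100k` and `continuous_color` are hypothetical.

```python
def rate_per_100k(convicts, population):
    """Convicts per 100,000 inhabitants."""
    return convicts * 100_000 / population


def continuous_color(value, vmin, vmax,
                     start=(255, 245, 240), end=(103, 0, 13)):
    """Linearly interpolate an RGB color between start (low) and end (high)."""
    t = 0.0 if vmax == vmin else (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))


# Hypothetical figures: (convicts, population). Not INEGI data.
states = {
    "State A": (9_000, 15_000_000),
    "State B": (1_200, 800_000),
}

rates = {name: rate_per_100k(c, p) for name, (c, p) in states.items()}
rmin, rmax = min(rates.values()), max(rates.values())
colors = {name: continuous_color(r, rmin, rmax) for name, r in rates.items()}

for name in states:
    print(name, rates[name], colors[name])
```

In this made-up case State A has far more convicts in absolute terms, yet State B has the higher rate per 100,000 inhabitants and would therefore be drawn in the darker shade.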



Elías de la Rosa


Data Science enthusiast. Always making Data Science easy to understand for all kinds of people.