Why statisticians should welcome the Data Revolution

Here are two ideas of the moment:

  • Monitoring the post-2015 Sustainable Development Goals (SDGs) requires a huge investment in an international framework that delivers globally compatible statistics.
  • Sustainable development requires a huge investment in data collection capacity so that decision makers at national and sub-national levels have access to usable information from sustainable sources.

They are not the same thing. Herein lies the problem with Morten Jerven’s critique of the UN Independent Expert Advisory Group report on the Data Revolution.

Statistics and data are not the same. Statistics are derived from data. The more (reliable) data that exists, the fewer extrapolations statisticians need to make, and the more they can focus on quality and consistency within and across datasets.

Statistics are derived from two data sources: surveys and administrative data. Surveys collect statistically designed samples of data. Administrative systems count everything. Many things can only be counted through a survey – household income, for example. Many others should be counted as part of the day-to-day running of public administration.

Take maternal mortality.

  • Everyone agrees that the ideal way to measure causes of death is through a civil registration and vital statistics (CRVS) system
  • Everyone agrees that this is the long-term sustainable goal
  • Many countries do not yet have CRVS systems
  • Some have them in development and are currently not able to produce reliable data
  • Others may believe they are producing good data, but statisticians can give valid reasons why the data is not yet reliable
  • Everyone agrees that statistics from survey data fill the gap.

Between 2015 and 2030 the SDG monitoring framework requires that consistent, compatible statistics on maternal mortality be produced from the best data source available. The Data Revolution effectively calls for all countries to have operational, credible CRVS systems in place by 2030. The challenge is how to merge these two objectives. What happens if a CRVS system in a country ‘goes live’ in 2020? Can an indicator be rebased mid-stream? The costs are huge and resources are limited. Can a treasury afford to run quality surveys AND build quality administrative systems at the same time?

Much of Jerven’s review of A World that Counts expresses frustration with what he sees as the intellectual sloppiness of the document. This is a useful exercise, but it shouldn’t distract us from bigger issues. Most people accept that the ‘Data Revolution’ has become a brand: useful shorthand for something that isn’t just about data and may not really be a revolution.

We would argue that Development Initiatives’ work, joining up data to produce information that is both accessible to and usable by decision makers at sub-national level, goes beyond being a ‘statement of belief and hope’. A critique that belittles the ‘enthusiasm’ of those daring to aspire to count the ‘invisible’ and the dispossessed ignores the political economy of the issue.

This is a deeply political issue. How will existing resources be used to put in place sustainable infrastructure that delivers sustainable information for effective decision making? How will new resources be best harnessed to improve the chances of meeting this target? A successful outcome will yield good statistics. But getting there is going to require, for want of a better word, a data revolution. Or, perhaps more accurately, as the draft PARIS21 roadmap puts it, 139 separate data revolutions in 139 countries.

The more we can count, the better we can calculate.