• Discussion paper
  • 9 February 2015

Opening public data to improve government transparency in Nepal


Prakash Neupane is the ambassador of the Open Knowledge network local group in Nepal – a group of technology enthusiasts who meet regularly to discuss issues around access to information and open data.

This profile is part of a series looking at how data is being accessed and used in developing countries

Embracing their role as a data intermediary, the group develops technical solutions to open up data around social issues, such as government accountability. Recognising that a lack of open budget data was one of the barriers preventing effective civil society engagement with government accountability, Open Knowledge Nepal undertook a project to open up government budget and spending data.

 

Prakash Neupane, ambassador of the Open Knowledge network local group in Nepal. Photo credit: Open Knowledge

Prakash’s demand for data and information

“Budget data is something that matters very much to people because any adjustment in it has a direct impact on individual household incomes and living standards.”

By acting as an intermediary and opening the data, Open Knowledge Nepal is making budget information more accessible to civil society. Prakash recognises the importance of publishing budget data in a format that civil society groups can share and use to monitor spending against allocated budgets. He believes that greater access to budget information will improve civil society’s ability to monitor funding at local, national and sector level, and to hold government to account. Prakash feels this will create more opportunities for civil society to engage in budget decisions, which will help the government to formulate budgets shaped by citizens’ needs.

Examples of Prakash’s data use

  • Transforming data: Open Knowledge Nepal held a two-day event – the ‘Data Spending Party’ – in which coders, journalists and students extracted budget and spending data relating to one locality (the Kathmandu Municipality). This involved using computer software to ‘scrape’ the data and transform it into an open format. The Data Spending Party equipped a range of potential data intermediaries with new data literacy skills. A second data liberation event took the open agenda further, building on the experience of the first event to ‘scrape’ national budget data. The national budget data was then made available on a public web platform for easy access, analysis and use.
  • Analysis to reveal trends in spending over time: Once the data had been scraped from the PDFs and cleaned in Microsoft Excel, it could be sorted and filtered for analysis. The analysis revealed some interesting trends. For example, spending on infrastructure development increased significantly over time, a trend that would have been difficult to identify in the data’s original format.
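
The kind of sorting and filtering described above can be sketched in a few lines of Python. This is only an illustration, not the group's actual workflow (the source says the cleaning was done in Microsoft Excel). The row layout, the sector names and every figure below are invented for the example.

```python
from collections import defaultdict

# Hypothetical cleaned rows: (fiscal_year, sector, amount).
# All figures are invented for illustration; they are not real budget data.
rows = [
    ("2010", "infrastructure", 100.0),
    ("2010", "education", 80.0),
    ("2011", "infrastructure", 130.0),
    ("2011", "education", 82.0),
    ("2012", "infrastructure", 170.0),
    ("2012", "education", 81.0),
]


def totals_by_sector(rows):
    """Sum amounts per sector and year: {sector: {year: total}}."""
    totals = defaultdict(dict)
    for year, sector, amount in rows:
        totals[sector][year] = totals[sector].get(year, 0.0) + amount
    return totals


def rises_every_year(series):
    """True if the yearly totals increase from each year to the next."""
    values = [series[year] for year in sorted(series)]
    return all(earlier < later for earlier, later in zip(values, values[1:]))


totals = totals_by_sector(rows)
rising = sorted(s for s, series in totals.items() if rises_every_year(series))
print(rising)
```

Once the data sits in rows like these rather than in a PDF, a question such as "which sectors grew every year?" becomes a short filter instead of a manual read-through.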

Challenges to Prakash having better information

Because the team initially lacked experience in opening data, they had to learn the process through trial and error, which was time-consuming. They also had to spend time researching appropriate software packages for scraping the data. They have already reused the skills they learned during their first attempts and are now better equipped for future efforts.

The type of Nepali font in which the data was originally published did not meet the Unicode Standard (a character coding system that facilitates the worldwide interchange of text). This caused a number of issues. For example, the text could not be easily translated into other languages and could only be read on computers with the Nepali font installed, which meant that the data was not open to everyone. The group decided that it was important to publish the data in English so that it could reach a wider audience and be used by more people, but this process was time-consuming.
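
A quick way to tell whether extracted text is Unicode Nepali is to check how many of its characters fall in the Devanagari block of the Unicode Standard (U+0900 to U+097F). The sketch below assumes the non-Unicode font stored Nepali text as Latin-range code points that only look like Nepali when the font is installed. The source does not name the font or describe its encoding, so the legacy sample string is purely hypothetical.

```python
# Devanagari block in the Unicode Standard.
DEVANAGARI_START = 0x0900
DEVANAGARI_END = 0x097F


def devanagari_share(text):
    """Fraction of non-space characters that are Unicode Devanagari."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    hits = sum(1 for ch in chars if DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END)
    return hits / len(chars)


unicode_sample = "नेपाल"  # "Nepal" written in Unicode Devanagari
legacy_sample = "g]kfn"   # hypothetical legacy-font text stored as Latin code points

print(devanagari_share(unicode_sample))
print(devanagari_share(legacy_sample))
```

Text that scores near zero but is supposed to be Nepali is a sign that it depends on a specific font and will need converting or, as the group chose, translating before it is published.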

One of the biggest issues with the data itself was the inconsistent use of fonts in the original government PDF files. This distorted the order of the dataset once it was extracted, preventing analysis and redistribution. As a consequence, the team had to go through each extracted file and correct it before it could be shared. The sheer quantity of data made this another significant challenge for Prakash and his team.
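
Manual correction of this kind can be narrowed down by first flagging the rows that look wrong. The sketch below is one possible check, not the team's documented method. It assumes a three-column layout (code, description, amount), which is a hypothetical structure chosen for the example, and the sample rows are invented.

```python
EXPECTED_COLUMNS = 3  # assumed layout: code, description, amount


def is_number(value):
    """True if the value parses as a number once thousands separators are removed."""
    try:
        float(value.replace(",", ""))
        return True
    except ValueError:
        return False


def flag_suspect_rows(rows):
    """Return the indexes of rows with the wrong column count or a non-numeric amount."""
    suspects = []
    for index, row in enumerate(rows):
        if len(row) != EXPECTED_COLUMNS or not is_number(row[-1]):
            suspects.append(index)
    return suspects


# Invented sample rows: the second has its columns out of order,
# and the third has lost a column during extraction.
extracted_rows = [
    ["101", "Road maintenance", "1,200"],
    ["102", "950", "Bridge repair"],
    ["103", "School grants"],
]

suspect_indexes = flag_suspect_rows(extracted_rows)
print(suspect_indexes)
```

A check like this does not repair anything, but it lets reviewers go straight to the rows that need attention instead of reading every extracted file in full.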
