• Discussion paper
  • 15 December 2016

Metadata for open data portals

This paper investigates how open data portals share their metadata and explores the most prevalent underlying metadata standards used.

Author

Beata Lisowska

A recent Open Data Institute (ODI) summit in London featured a number of talks in which a range of stakeholders discussed open data: how important it is, how it unleashes the true potential of data, what it means, what possibilities it offers, and where the future of open data lies. Open data should be accessible to all, and usable and sharable by all; as such, it is a key tool for advancing sustainable development and supporting good governance.

However, despite more data being published in open formats, data scientists, journalists and analysts are often left with the daunting and time-consuming task of not only finding relevant data and discovering new datasets but, most importantly, understanding the data before any analysis can be done. The information needed to do so should be found in the metadata that accompanies the published data.

Metadata is, in essence, structured information that makes it easier to retrieve, use or manage an information resource. In practice, metadata describes a dataset and its structure, and helps users discover it. The information usually includes such basic elements as the title, who published the dataset, when it was published, how often it is updated and what license is associated with it. These are classed as ‘descriptive metadata’, as opposed to ‘structural metadata’, which describes, for example, page layout or an object’s components and their relationships (such as chapters or tables in a book).
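As an illustration only, the basic descriptive elements listed above could be represented as a simple record. The sketch below assumes hypothetical field names (title, publisher, published, update_frequency, license) and example values; it does not follow any particular metadata standard.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class DatasetMetadata:
    """Descriptive metadata for one dataset (field names are illustrative)."""
    title: str
    publisher: str
    published: date
    update_frequency: str
    license: str


def missing_fields(record: DatasetMetadata) -> list:
    """Return the names of fields that were left empty."""
    return [name for name, value in asdict(record).items() if value in ("", None)]


# Example values are invented for illustration.
record = DatasetMetadata(
    title="Example dataset",
    publisher="Example Agency",
    published=date(2016, 12, 15),
    update_frequency="annual",
    license="",  # licence not stated by the publisher
)

incomplete = missing_fields(record)
```

A check of this kind shows why even basic descriptive metadata matters: a dataset with no stated license, for example, cannot be safely reused however easy it is to find.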

This paper investigates how open data portals share their metadata and explores the most prevalent underlying metadata standards used. It seeks to understand to what extent the metadata standards used by the predominant open data platforms are interoperable. Interoperable metadata enables datasets to be discoverable, reusable and searchable across portals rather than ‘siloed’ within them, a capability known as federated search.
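To make the idea of federated search concrete, the following is a minimal sketch, assuming two hypothetical portals that describe the same elements under different field names. The portal names, field names, crosswalk table and records are all invented for illustration and do not reflect any real platform's schema or API.

```python
# Hypothetical crosswalks: portal-specific field name -> shared field name.
CROSSWALKS = {
    "portal_a": {"title": "title", "organization": "publisher", "license_id": "license"},
    "portal_b": {"name": "title", "publisher": "publisher", "rights": "license"},
}


def normalise(portal, record):
    """Translate one portal-specific record into the shared field names."""
    mapping = CROSSWALKS[portal]
    return {shared: record[local] for local, shared in mapping.items() if local in record}


def federated_search(catalogues, term):
    """Search dataset titles across every portal using the shared field names."""
    term = term.lower()
    hits = []
    for portal, records in catalogues.items():
        for record in records:
            common = normalise(portal, record)
            if term in common.get("title", "").lower():
                hits.append((portal, common))
    return hits


# Invented example catalogues.
catalogues = {
    "portal_a": [
        {"title": "Health spending 2015", "organization": "Ministry A", "license_id": "cc-by"},
    ],
    "portal_b": [
        {"name": "Regional health facilities", "publisher": "Agency B", "rights": "odc-by"},
        {"name": "Road network", "publisher": "Agency B", "rights": "odc-by"},
    ],
}

results = federated_search(catalogues, "health")
```

The search only works across both portals because each record is first mapped to common field names. Where portals already share a metadata standard, a crosswalk of this kind becomes unnecessary, which is the practical benefit of interoperability.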

This discussion paper was written as a part of the Joined-up Data Standards project, a joint initiative between Development Initiatives and Publish What You Fund.