Summary of recommendations
In the rest of this report, we present the status quo in CSO usage of government spending data in the form of case studies on successful data-driven projects.
In this chapter, we highlight the common difficulties faced by these CSOs and the opportunities they see for improvements to the usability of fiscal data, as well as our recommendations for how those difficulties can be overcome and those improvements achieved. A fuller exposition of our recommendations can be found in the report’s conclusion.
More and better data needed
There is strong demand not only for more data about public revenues and expenditures but also for data that is of higher quality and more usable. While many CSOs consulted for this study engaged in sophisticated analysis and uncovered subtle connections, the bulk of their work, measured in person-hours, consisted of merely collecting data and refining it.
Our first and most important recommendation is therefore that CSOs push their governments to proactively release data on the full range of public finances in a machine-readable and accessible form. A later chapter of the report provides our guidelines for financial data.
Types of data to demand
Many countries still do not make data available on important areas of public spending. There is a demand for data in each of the following categories:
- Spending (crucially at transaction level)
- Procurement and contracts
- Contextual information (e.g. demographics, geodata, targets & outputs)
As one interviewee remarked, it does not take much to render data unusable. It is important that data publishers take steps to ensure that published spending data actually contributes to citizen engagement with public finances. These steps include:
- releasing data proactively
- releasing data regularly and in a timely fashion
- making data available at international (e.g. EU farm subsidies), national, and local levels
- ensuring consistency of data (e.g. consistent identifiers for companies)
- publishing reference data, code sheets, and metadata
- publishing data with an open license to promote reuse
Publishing data in unstructured and non-machine-readable formats wastes time and prevents many projects from getting off the ground. Data should be published in a form that is transparent to computational processing. CSOs should push governments to:
- publish data in a machine-readable format (no PDFs, Word documents, or HTML tables)
- provide a bulk download option: no CAPTCHA codes, download limits, etc.
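To illustrate why machine-readable publication matters, the following is a minimal sketch of how little work is needed once transaction-level data is available as CSV. The sample data, the column names (`supplier_id`, `amount`, and so on), and the function `total_by_supplier` are hypothetical and invented for illustration; they do not describe any particular government's dataset.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample of transaction-level spending data (invented for
# illustration; real datasets will use different columns and identifiers).
SAMPLE_CSV = """date,department,supplier_id,supplier_name,amount
2013-01-15,Health,SUP001,Acme Medical Ltd,1200.50
2013-01-20,Health,SUP002,Beta Supplies,300.00
2013-02-03,Transport,SUP001,Acme Medical Ltd,99.50
"""


def total_by_supplier(csv_text):
    """Sum transaction amounts per supplier identifier."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Decimal avoids floating-point rounding errors on currency values.
        totals[row["supplier_id"]] += Decimal(row["amount"])
    return dict(totals)


if __name__ == "__main__":
    for supplier_id, total in sorted(total_by_supplier(SAMPLE_CSV).items()):
        print(supplier_id, total)
```

The same figures published as a PDF or scanned document would first have to be extracted and checked by hand or with error-prone tools. Note also that the aggregation relies on a consistent supplier identifier rather than on free-text supplier names.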
Opportunities for knowledge-sharing and engagement
More needs to be done to promote publishing standards and best practices between countries. CSOs consulted for this study readily identified countries with exemplary publishing practices (e.g. Slovakia’s procurement data and the UK’s transaction-level spending data). While we acknowledge that there is still work to be done in both cases, country practices like these should be recognised as being at the forefront of open data policies and used as examples to help civil society initiatives demand more from their own countries.
Examples of successful cooperation between CSOs and government have shown that CSO engagement with government can lead to greater financial transparency and better data. The Supervizor project in Slovenia, for example, was driven by a combination of a strong independent anti-corruption commission and access to pro bono development resources to prototype and develop the models. CSOs will play a crucial role in the future of fiscal transparency, not least through their engagement with governments.
Finally, there is a major opportunity for transparency advocates and open data groups to work together. These two communities bring different focuses and areas of expertise, the combination of which may be very powerful. Transparency organisations bring knowledge of government policy and contextual experience that technical experts often lack; conversely, open data hackers understand data processing and the programmer community better than most policy experts. This opportunity has been partly obscured by superficial disagreements over terminology that could be clarified by greater explicitness around key terms like “open”.
Civil society groups around the world could benefit from training and support in key areas. While the pool of skills is hugely diverse across the CSO community, key parts of the data pipeline have consistently proven to be problems that steal time from activism and other parts of an organisation’s work.
We see strong potential in offering focused training on these key needs:
- Web scraping
- Liberating and cleaning PDFs
- FOI request skills
This last need, FOI request skills, should be emphasized: “technical” training is not the only training necessary. Despite recent increases in proactive data publication, FOI requests remain of prime importance for getting hold of information on public money, and CSOs interested in public spending require training in their effective use.