Spending Data Handbook – How do Civil Society Organisations Wrangle Spending Data?
Over the last few months, we have been travelling the globe in an effort to work out which organisations are mapping the ebbs and flows of government money, and how they do it. In a series of interviews, currently being published via the OpenSpending blog, we’ve highlighted the issues which Civil Society Organisations (CSOs) face when working with government financial data.
The first aim has been to establish a front along which Civil Society Organisations can unite, and to help them realise that they are not lone voices demanding better, more current data.
The story so far…
Our first step was to start building a Working Group to bring together people who work in similar sectors around the globe and to stimulate discussion around the topic of Spending Data. Members are already exchanging interesting projects from around the world, and topics such as the creation of standards and an exchange of expectations and practices are being mooted. The group is growing in size and is open to those working in the field of open spending data.
The second aim was to find a way, via training, technology or otherwise, to tackle the challenges which the CSOs we spoke to highlighted. The most common problems that these organisations have been encountering include dodgy data, which often changes in structure and format from year to year; incorrect data formats (the most common offender, as usual, being PDFs); jargon or codes in data which had to be painstakingly decoded; and many more. We were also delighted to see that CSOs were curious, were getting more ambitious with data, and wanted to know more about how they could work with it most effectively. They asked questions such as: “How do I present my data better?”, “How do I speed up getting data from websites?” or “We’ve been thinking about geocoding publicly-funded projects so we can put them on a map, do you know of anyone who has done this successfully?”.
What’s next? – The Spending Data Handbook.
The range of advocacy topics tackled by these groups is so diverse (from gender budgeting to checking that promised infrastructure in a local town actually gets built) that it would be impossible to address all of the data wrangling skills in one book. Everyone needs different levels of data, from tiny, ward-level datasets up to national budgets. However, there are some overarching principles which apply universally to working with government financial data.
These overarching topics are what we aim to cover in the Spending Data Handbook. Like the Open Data Handbook, it will be available as an open educational resource on the internet and for training sessions. More on the content below…
What questions will the Spending Data Handbook cover?
Based on the interviews we conducted, we've drawn up a suggested list of topics for what could go into the handbook. We'd love to <a href="http://lists.okfn.org/mailman/listinfo/openspending">hear from you</a> with your suggestions and modifications.
Part One: An introduction to Open Financial Data
How can CSOs be more effective in their use of data? What should they be asking governments for?
- How can they be more effective in requesting (and keeping hold of) meaningful information from government?
- Which phases of the budget/procurement cycle do they need to demand data from?
- What technical formats are ideal for re-use and interpretation?
- What transparency rules need to be in place to enforce publication?
How can they get (and keep hold of) data?
- How can they make backups of the data that has been published?
- How can they extract data from sources on the web?
Part Two: Technical primer for data work
How can data be analysed and interpreted?
- Which phases of the budget/procurement cycle produce which kind of data? Which tools do you need to work with these different types of data?
- What different data formats can be used – and how?
- What is the difference between structured and unstructured digital data?
- How can unstructured data be re-structured?
- What are PDF files and how can information from them be extracted?
- How can you convert between different structured formats?
How can data be cleaned up and brought into a more uniform format?
- How can data be augmented to allow for more meaningful interpretation?
- How can government classifications (codesheets) be applied?
- How can geographic information be included?
- How can information about vendors/suppliers be included?
How can data be delivered to the public?
- What are databases and query languages? How can they be used?
- How can you summarize large sets of data?
- How can data be presented in an accessible and meaningful way?
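To give a flavour of the kind of material Part Two might contain, here is a minimal sketch of one of the questions above: applying a government codesheet to spending records so that opaque classification codes become readable labels. It is an illustration only, not content from the handbook. The codes, labels, amounts and the function names `load_codesheet` and `apply_codesheet` are all made up for the example, and it assumes both the codesheet and the spending data are available as CSV.

```python
import csv
import io

# Hypothetical codesheet: maps classification codes to readable labels.
CODESHEET_CSV = """code,label
0701,Health: medical products
0912,Education: primary schools
"""

# Hypothetical spending records that only carry the raw code.
SPENDING_CSV = """code,amount
0701,1200.50
0912,830.00
9999,15.25
"""


def load_codesheet(text):
    """Read the codesheet into a dict of code -> label.

    Codes are kept as strings so that leading zeros are not lost.
    """
    reader = csv.DictReader(io.StringIO(text))
    return {row["code"]: row["label"] for row in reader}


def apply_codesheet(spending_text, codesheet, unknown="UNKNOWN"):
    """Attach a label to every spending row; flag codes missing from the codesheet."""
    rows = []
    for row in csv.DictReader(io.StringIO(spending_text)):
        rows.append({
            "code": row["code"],
            "label": codesheet.get(row["code"], unknown),
            "amount": float(row["amount"]),
        })
    return rows


labelled = apply_codesheet(SPENDING_CSV, load_codesheet(CODESHEET_CSV))
for row in labelled:
    print(row["code"], row["label"], row["amount"])
```

Rows whose code does not appear in the codesheet are marked rather than dropped, so gaps in the classification stay visible instead of silently disappearing from totals.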
So now, we need your feedback. Are you an NGO working on these issues? Are there any additional topics you feel people in your organisation would like to know more about? Are the suggested topics useful? You can get in touch with us via the OpenSpending mailing list or submit a response via this form.
We hope to hold a sprint on the Handbook around November, but the book will remain a work in progress. Many of the issues raised by NGOs in our research won’t make it into this first version, but we hope that, as organisations become more ambitious in working with their data, we’ll add tips and tricks for advanced wrangling to future versions.
Further interviews from conversations with CSOs will continue to be published on the OpenSpending blog. Stay tuned for updates.