Opening the EU procurement databaseOutdated Content Warning: This content refers to an older version of OpenSpending. See here for information about the next version of OpenSpending and ways to contribute.
On 2 May civic coders and journalists from across Europe from Norway to Slovenia met up in Brussels for a Hack Day supported by Knight Mozilla Open News to dig into the EU procurement register, Tender Electronic Daily. Here are a few of the highlights from the day and where we’ve been able to take the TED data since the Hack Day.
Scraping and parsing of TED into OpenTED
Several people got together to build and improve the parsing of data already scraped. The past years has seen several attempts to scrape and parse TED, but past scraping projects have been missing the granular parsing needed in order to make the data useful for investigations.
You can find the latest updates here - we’d love your help: https://github.com/opented/opented
User stories - ideas for data journalism on procurement data
During the Hack Day a handful of data journalists sat down to line up questions, which should be investigated by journalists. “If the TED data could speak, what should we ask?”
####Key questions to ask the TED data: - Where in the EU are highways most expensive to build? - What are the best ways to expose cartel formations - What companies are running food services at public school (and what’s on the menu)? - Who is the biggest weapons salesman in the EU? - Who are the biggest public transport providers - Arriva vs. Connexion? - What’s the relationship between procurement process (time) and what companies gets the contract? - How many contracts go outside the EU? - Do the governments purchase from big or small suppliers? - How can we predict when a project will go over budget? - What sector is the least competitive (CPV-code vs. average bid)? - Are there companies within certain sectors that get a suspicous large amount of contracts?
####Using other data with procurement data from TED - Reconcile with company data from Opencorporates - Match contracts to actual spending data eg. UK government spending - Ask Freedom of Information requests to the procuring authority for entire contract (see examples of such successful FOI requests here and here - Get to understand what the contracts are actually about with the official taxonomy of the CPV-code
A first look at the TED data
The full TED data set is now available, thanks to the hack team and in particular Friedrich Lindenberg. You can access the TED data as CSV files EU wide or on a country by country basis. You can also get to know the data format without downloading it, as we’ve uploaded the data for Czech Republic (2013) as a sample.
We hope that many of you will dig into it, and examine the contracts.
####Reviewing the data quality
At a community call on May 15, community members decided to begin a quality review of the data available in TED.
We’re asking you to help contributing to this review. If you’d like to help, please get in touch! In particular the review will examine missing amounts (non mandatory fields), missing contract fields and inconsistent data formats. The quality review can be really useful in order to get a clear picture of the state of the TED-data and help assess how useful it will be for journalists and researchers. We’d be particular interested to hear from those of you who are already working on procurement data in EU countries.
Thanks to Julia Keseru from Sunlight Foundation and Ronny Patz Transparency International, Brussels for offering input on this.
###Should OpenSpending include procurement data?
Procurement data is increasingly becoming available in more countries such as Senegal, Ukraine and Australia, where transactional spending data remains unavailble. By definition procurements are clearly different from transactional spending data. While “contract awards” should indicate a final price, they might not be final, as ammendments regularly occur in contracts and did anyone ever hear of an ICT-project, which went over budget? In such a case chances are that you as a journalist or watchdog would prefer to have access to the actual spending data, but if this is not available procurement data can still help you get closer to your story. Contracts might also lack clarity as to when payments are to actually occur. Finally procurement data will most often only include contracts above certain thresholds, which makes significant shares of actual contractor spending unaccessible.
I could be relevant for OpenSpending to add procurement data next to bugdet and spending data, because several countries with available procurement data, do not provide access to the ideal option of transactional spending data. This is for instance the case in most EU member states where procurements from OpenTED today is likely offering the most granular data on public spending.
Do you think OpenSpending should begin adding more procurement data? Let us know!