The They Buy For You (TBFY) project concluded on 31 December 2020. To mark the occasion, we look back at three years of hard work, take stock and discuss the future of procurement analytics to unlock data value chains. Our conversation is with Till Christopher Lech (SINTEF Digital), Elena Simperl (King’s College London), Oscar Corcho (Universidad Politécnica de Madrid) and Ian Makgill (Spend Network).
1) How has TBFY made an impact? What, in your view, are the most significant achievements of the project?
Till Christopher Lech (TCL): First of all, we’ve contributed to putting the importance of data-driven analytics in the public sector, and especially within public procurement, on the European agenda. We’re still just scratching the surface when it comes to applying data-driven methods in government, so projects like TBFY are important since they open up new perspectives on how data value chains and analytics can be used for the good of society.
Prof Elena Simperl (ES): TBFY has achieved impact in several ways. First, by working with public administrations at local and national levels to help them improve how they publish and use procurement-related data. Second, by developing technologies and solutions that buyers in the public and private sectors can use and adapt to become more transparent, make markets more competitive, and reduce waste and fraud. Finally, by developing the TBFY knowledge graph, a data resource designed to facilitate integration across multiple data silos and to add context and domain knowledge to machine-learning-based procurement analytics.
Ian Makgill (IM): In the first instance, our research and analysis have shown that there are significant shortcomings in the quality of data being published in both above-threshold and below-threshold notices. Our work has allowed us to highlight where buyers are failing to publish specific records and where data is lacking. This, in turn, has led some administrations to take a fresh look at the way procurement data is published and to set up projects to improve data quality.
Prof Oscar Corcho (OC): I will focus on two specific areas where I think that TBFY has made a relevant impact. I am really happy about the advances made by the city of Zaragoza in the publication and visualization of open procurement data, which can serve as an example for other public administrations of how to publish procurement data in a structured manner and present it to their citizens for transparency purposes and to facilitate data reuse.
Another area where we have been able to generate impact is by providing services that allow those working on public procurement inside public administrations to find similar tenders in their own language or in another language, using our multilingual document similarity service.
2) How has TBFY contributed to realising new data value chains for Europe?
TCL: The main achievement here was the creation of the TBFY knowledge graph, a data structure linking tenders to contracts and suppliers. This data model combines the Open Contracting Data Standard with re-used company data models from another EU-funded project. This model will be available along with the tools we developed, as well as the ingested procurement data from TBFY for others to use for research and development, building applications, training their own models and many other purposes.
ES: TBFY developed several data value chains in the business cases – in fact, many of these business cases realise their own data value chains, drawing upon data and insight from various sources. TBFY helps them achieve this, most critically, through the TBFY knowledge graph. The term knowledge graph refers to a set of technologies and standards to organise and publish data specifically for use in data value chains – that is, in combination with other datasets. The approach we followed was directly tailored to the needs of procurement stakeholders but could be applied to other domains, from transport to the environment.
IM: By taking a more thorough approach to collating and processing data from a wider range of sources, TBFY has been able to explore new opportunities from the data. For example, the business cases have created new products, two of which are in production and are already profitable. Closer analysis of the data showed new opportunities that came from augmenting the data and improving methods of access.
OC: Several data value chains have been generated in the project. I really appreciate the work that has been done in connecting procurement data, provided in a structured format by public administrations and requiring some additional processing, with company data coming from a separate company database, and how all this has come together in the TBFY knowledge graph.
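To make the idea of a graph "linking tenders to contracts and suppliers" a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class `ProcurementGraph`, the predicate names `hasContract` and `awardedTo`, and all identifiers are hypothetical and are not taken from the TBFY ontology, the Open Contracting Data Standard or the project's APIs.

```python
from collections import defaultdict


class ProcurementGraph:
    """Toy in-memory store of (subject, predicate, object) triples.

    Hypothetical illustration only; not the TBFY data model.
    """

    def __init__(self):
        # Maps (subject, predicate) to the list of objects linked to it.
        self._edges = defaultdict(list)

    def add(self, subject, predicate, obj):
        """Record one link, e.g. a tender that has a contract."""
        self._edges[(subject, predicate)].append(obj)

    def objects(self, subject, predicate):
        """Return all objects linked from subject via predicate."""
        return list(self._edges.get((subject, predicate), []))

    def suppliers_for_tender(self, tender):
        """Follow tender -> contract -> supplier, dropping duplicates."""
        suppliers = []
        for contract in self.objects(tender, "hasContract"):
            for supplier in self.objects(contract, "awardedTo"):
                if supplier not in suppliers:
                    suppliers.append(supplier)
        return suppliers


# Made-up example data: one tender, two contracts, two suppliers.
graph = ProcurementGraph()
graph.add("tender:T1", "hasContract", "contract:C1")
graph.add("tender:T1", "hasContract", "contract:C2")
graph.add("contract:C1", "awardedTo", "supplier:AcmeLtd")
graph.add("contract:C2", "awardedTo", "supplier:BetaAS")
graph.add("contract:C2", "awardedTo", "supplier:AcmeLtd")

print(graph.suppliers_for_tender("tender:T1"))
```

Because every link is an explicit edge, a question such as "which suppliers won work under this tender?" becomes a short traversal, and data from another source can be attached by adding further edges rather than reshaping tables.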
3) What did you enjoy most about the project?
TCL: As Project Coordinator, I enjoyed collaborating with a very dedicated consortium. We had excellent research and technology partners as well as business case partners in the public and private sector that were really keen on taking our results and innovations into use.
ES: I loved taking on a new challenge. Government procurement is a hugely important area, as shown not least during the Covid-19 pandemic. Developing procurement intelligence solutions that meet people’s needs is a substantial undertaking, not just from a technology point of view, but also from a human point of view. TBFY acknowledged this – my team led a work package that supported our business cases in understanding the human-data interaction elements of the procurement solutions they build. I think this is essential to ensure the dashboards they build are fit for purpose.
IM: Friends for life and all that! Seriously, I loved working with some of the brightest people in Europe who share the same passions and problems as ourselves. It has been an honour to work with you all. I’m just sorry that lockdown means we won’t be able to meet over a table and break bread as the project ends.
OC: Besides working with a really nice team, the most enjoyable experiences for me have been the co-design activities with public administrations like those in the city council of Zaragoza, where we have designed how citizens and civil servants should be able to access public procurement data thanks to the services that we have been working on.
4) What were some of the unexpected challenges you’ve encountered on the way?
TCL: Well, I guess it’s fair to say that no-one saw the Covid crisis coming. Having said that, EU projects are used to working in a distributed manner, using video meetings and conference calls on a regular basis. However, the situation made it difficult to get access to decision-makers in the public and private sectors, as they had other things on their minds than introducing data-driven methods in their procurement processes, even though they really should have, since the Covid situation posed new challenges for procurement. We document this in the impact assessment analysis provided by our project partner Spend Network.
ES: The availability and quality of the data sources we used. To some degree, this wasn’t entirely unexpected, though it’s somewhat disappointing to see that well into the 21st century we’re still advocating for key public sector datasets to be released for others to use. I also realised that the whole area of human factors in machine learning and data analytics is a huge challenge that deserves its own project – every single piece of procurement intelligence we built is used by specific user groups and we’re only starting to understand what that means in terms of algorithm design and user experience.
IM: We knew the data quality was going to be bad, but oh my. It is really a problem. I didn’t think we could allow $4trn of spending to be this poorly documented but it appears we can.
OC: Data quality, data quality, data quality. This is always an issue in this type of project, which requires curating and integrating data from many heterogeneous sources, and I was not expecting TheyBuyForYou to be different. So I probably should not call this an unexpected challenge.
5) TBFY has brought together ten organisations from five different countries. What did you like best about working in a cross-European team? What could have been better?
TCL: At SINTEF, we have a long tradition of working in international R&D projects, especially within the context of the European framework programmes. These projects give us the opportunity to collaborate with top European R&D communities, thus expanding our own expertise. Coming from a smaller European country, we need to compete internationally in order to keep up.
ES: I’ve been part of many European teams like TheyBuyForYou and it’s always very rewarding to learn about new contexts, build a common vision and work towards it, and get to know more about other countries. Sadly we could do less of that in 2020 because of the pandemic, but by that point, we were already a well-formed team, which continued to perform remotely.
IM: The breadth of skills in the project and the diversity of local contexts were really insightful, but most of all it was discovering that the same problems exist in every administration. That’s good news because it means that solving problems in Norway will help to solve problems in Spain and beyond. What could have been better… Did we mention data quality?
OC: As in many other EU collaborative projects where I have been involved, it was always great to exchange all our points of view in terms of the technology that has been used for different parts of our service stack, the data models, and our approaches for defining them, the way in which business models have been created, etc. Particularly, I have really enjoyed learning a lot about how we all shared the same interests in making public procurement as open as possible (no matter whether from public administrations, research institutions and universities, or private companies).
6) How has the open data / big data / procurement landscape changed over the past three years? What lies ahead?
TCL: During the last three years, the European Commission has issued several pieces of legislation as part of its data strategy, ranging from the Public Sector Information Directive to the Data Governance Act, the latter published just recently. Of course, it will take a while until these are in full effect, as is the case for other measures such as eForms for tenders, but I choose to be optimistic that we will see better infrastructures for accessing high-value public data sets (such as procurement data) in the future.
ES: As outlined in the excellent report released by our colleagues from Spend Network, there have been changes in some of the KPIs tracked during the project, but not all developments were positive. Our hope is that the knowledge graph we built will be adopted more widely, as the main advantage of this solution (compared to other ways to publish data) is in its resilience to changes in the domain, and in its ability to adapt to new datasets and vocabularies. What is important for buyers and providers of procurement intelligence solutions is to understand that this approach – whether using the TBFY knowledge graph or a knowledge graph of their own making – is hugely beneficial. I hope to see more examples of procurement knowledge graphs in the future across Europe.
IM: New NLP models such as Google’s BERT and GPT-3 create significant opportunities for dynamic content creation, better classification, and deep learning across languages. Exploiting these technologies can radically improve data quality, but only if the source data is made accessible in a timely and complete manner.
OC: The open data landscape has continued evolving over these years. I can see more maturity in open data publishing and a better understanding of the open data landscape by public administrations. Several years ago, talking about open data was still difficult in many contexts. Now it is taken for granted by most public administrations, and it is a pleasure to see civil servants and public workers seriously doing their best to open more data.
7) What are the stakes in public procurement and open data over the years to come? How could future research projects build on TBFY?
TCL: In TheyBuyForYou, we’ve provided the building blocks for implementing data-driven methods in the decision-making process for public procurers and suppliers. Obviously, access to high-quality data will be key in order to unlock the potential of big data analytics. Initiatives such as TheyBuyForYou will remain important in the future in order to push data owners and data producers towards opening up their data silos. This is why our deliverable on procurement data publication (the ten recommendations) is a very important outcome of the project.
ES: The stakes were and remain super high, as we’re talking about a topic that is critical to how governments operate and how they are perceived by citizens and businesses. The open data ecosystem has made huge progress over the past ten years or so, but critical datasets are still to be released in the public domain. The costs of these delays are significant.
For future research and innovation, there are many options – I think one would need to do more knowledge graph work to make sure that it evolves seamlessly, and that people can use it effectively. For instance, there are millions of open government datasets available online with direct relevance to the TBFY knowledge graph – interlinking them would add value to downstream applications. I also think there is a piece out there on storytelling with procurement data – that is, building tools and approaches for data journalists, buyers, and policymakers to explore the data and make decisions with confidence.
IM: We are working with $4trn of spending, which is roughly equivalent to 11% of EU GDP. Spending that money effectively could be the difference between growing effective, low carbon economies and a slow recovery from the Covid-19 recession that looms. Better procurement data could be used to grow jobs, improve competition, reduce costs and save carbon, but we need the data to do this, to understand what works, and to evidence failure. It would be an extraordinary oversight if Europe failed to grasp the potential that procurement has for reshaping our future societies.
OC: We have produced a set of 10 recommendations for improving the way in which public procurement data is released by public administrations. A really good read for those interested in continuous improvement in this respect.
As for other projects to build on TBFY, our data is continuously updated, and our APIs are available for those interested in consuming the wealth of our knowledge graph. I would like to see more work on deriving more and more insights from the data that we have made available. Another point that I think will be relevant for the future is better exploitation, through AI techniques, of the wealth of knowledge that is “hidden” in the texts of the procurement notices. Definitely some clear next steps for the future.
If you’d like to chat with Ian about your procurement data needs, please get in touch here.