“Data collaboratives,” an emerging form of partnership in which participants exchange data for the public good, have huge potential to benefit society and improve artificial intelligence.
NEW YORK – After Hurricane Katrina struck New Orleans in 2005, the direct-mail marketing company Valassis shared its database with emergency agencies and volunteers to help improve aid delivery. In Santiago, Chile, analysts from Universidad del Desarrollo, ISI Foundation, UNICEF, and the GovLab collaborated with Telefónica, the city’s largest mobile operator, to study gender-based mobility patterns in order to design a more equitable transportation policy. And as part of the Yale University Open Data Access project, health-care companies Johnson & Johnson, Medtronic, and SI-BONE give researchers access to previously walled-off data from 333 clinical trials, opening the door to possible innovations in medicine.
These are just three examples of “data collaboratives,” an emerging form of partnership in which participants exchange data for the public good. Such tie-ups typically involve public bodies using data from corporations and other private-sector entities to benefit society. But data collaboratives can help companies, too—pharmaceutical firms share data on biomarkers to accelerate their own drug-research efforts, for example. Data-sharing initiatives also have huge potential to improve artificial intelligence (AI). But they must be designed responsibly and take data-privacy concerns into account.
Understanding the societal and business case for data collaboratives, as well as the forms they can take, is critical to gaining a deeper appreciation of the potential and limitations of such ventures. The GovLab has identified over 150 data collaboratives spanning continents and sectors; they include companies such as Air France, Zillow, and Facebook. Our research suggests that such partnerships can create value in three main ways.
For starters, data collaboratives can improve situational and causal analysis. Their unique collections of data help government officials better understand issues such as traffic problems or financial inequality, and design more agile and focused evidence-based policies to address them.
Moreover, such data exchanges enhance decision-makers’ predictive capacity. Today’s vast stores of public and private data can yield powerful insights into future developments and thus help policymakers plan and implement more effective measures.
Finally, and most important, data collaboratives can make AI more robust, accurate, and responsive. Although analysts suggest AI will be at the center of twenty-first-century governance, its output is only as good as the underlying models. And the sophistication and accuracy of the models generally depend on the quality, depth, complexity, and diversity of data underpinning them. Data collaboratives can thus play a vital role in building better AI models by breaking down silos and aggregating data from new and alternative sources.
Public-private data collaborations have great potential to benefit society. Policymakers analyzing traffic patterns or economic development in cities could make their models more accurate by using call-detail records generated by telecom providers, for example. And researchers could enhance their climate-prediction models by adding data from commercial satellite operators. Data exchanges could be equally useful for the private sector, helping companies to boost their brand reputation, channel their research and development spending more effectively, increase profits, and identify new risks and opportunities.
Yet for all the progress and promise, data collaboration is still a nascent field, and we are only starting to understand its benefits and potential drawbacks. Our approach at the GovLab emphasizes the mutual benefit of collaboration and aims to build trust between data suppliers and users.
As part of this process, we have begun designing an institutional framework that places responsible data collaboration at the heart of public- and private-sector entities’ operations. This includes identifying chief data stewards in these organizations to lead the design and implementation of systematic, sustainable, and ethical collaborative efforts. The aim is to build a network of individuals from the private and public sectors promoting data stewardship.
Given heightened concerns over data privacy and misuse (the so-called techlash), some will understandably be wary of data-sharing initiatives. We are mindful of these legitimate worries, and of the reasons for the more general erosion of public trust. But we also believe that building rigorous frameworks and more systemic approaches to data collaboration is the best way to address such concerns.
Data collaboratives bring together otherwise siloed data and dispersed expertise, helping to match supply and demand for such information. Well-designed initiatives ensure that the appropriate institutions and individuals use data responsibly to maximize the potential of innovative social policies. And accelerating the growth of data collaboratives is crucial to the further development of AI.
Sharing data involves risks, but it also has the potential to transform the way we are governed. By harnessing the power of data collaboratives, governments can develop smarter policies that improve people’s lives.
Stefaan G. Verhulst is Co-Founder and Chief Research and Development Officer of the GovLab at New York University.