Today, the world of public health research changed for ever. Or so I hope. The institutions that fund most health research in developing countries (and a good deal of research in rich countries too) have finally launched an assault on Data Hugging Disorder. They are pushing the scientists they fund to put any data they collect in the shared scientific domain.
The broadside against the culture of data-hoarding that dominates in public health is published today in a joint statement on data sharing signed by 17 institutions, including the three biggest funders of public health research globally: the U.S. National Institutes of Health, the Bill and Melinda Gates Foundation and the Wellcome Trust. Other signatories include the World Bank, the U.S. Centers for Disease Control, and national research councils from the UK, France, Germany, Canada, Australia and New Zealand. (Here’s the full list.)
I’d like to have had a more bombastic introduction: “From this day forth, all researchers who are paid by taxpayers or tax-exempt charities will share the data they collect with the world.” I implied as much in a comment in the Guardian. Actually, the statement doesn’t go anything like that far. Indeed it’s pretty fluffy, couched in terms of principles and goals rather than requirements. But having helped draft the fluff, and been part of nearly three years’ worth of discussions leading up to it, I think it is a damned good start. Perhaps surprisingly, many of the key institutional players would have liked to go much further, but they were shouted down by their legal departments.
So the statement doesn’t actually commit institutions to do anything concrete. But implicit in its goals are important changes to the culture that makes us researchers so mean with our data. These mega-funders say they aim to reward us for publishing data, not just papers. They aim to support data management, so that data can be shared practically. Data management has always been the most neglected and undervalued part of the research enterprise in public health (something I’ve ranted about before in print (pdf) and on air). I’m looking forward to researchers rewriting their budgets so that the funders can put their money where their joint statement is. And they aim to make sure that the scientists in developing countries who do a lot of the grunt work of collecting data of interest to global health do not get “scooped” by data munchkins sitting in Seattle or Geneva with squigabytes of processing power and a constant electricity supply. The equity principle was firmly stressed in a commentary about data sharing published in The Lancet today to go with the Joint Statement.
Some researchers will feel queasy about sharing their data; it is hard not to feel ownership of your samples when, night after night after exhausting night, you’ve driven your motorbike at 4.30 am through the entrails of the red light district in the rainy season to get them to the lab in good order. But the truth is that “my” samples, and the data they produce, are first and foremost owned by the people who gave them to me, and data tied up on my hard drive waiting for me to get around to writing that third paper about the study (when I’ve finished my next grant application and the IRB paperwork for my current study) are not doing those people any good at all. The other thing that we’re all worried about is that other people will get to see how filthy our data really are. But that’s surely a reason to let more light in, not less.
So today’s statement, fluffy though it is, is a cause for major celebration. The funders say they are putting together working groups to start developing the infrastructures and data standards we need, as well as to change incentive structures so that universities as well as funders support data sharing. It’s up to us researchers to muscle in to those working groups, to make sure that an Open Data world works for us as well as for our paymasters and, most importantly, the people who we prod, poke and bleed in our studies.