
Data Sharing And Replication In The Sciences
February 16, 2016


Perhaps one of the most fascinating yet underreported stories about data and the sciences thus far in 2016 was an editorial published last month in the highly prestigious New England Journal of Medicine criticizing the growing trend toward data sharing in the sciences. Its publication was juxtaposed with Vice President Biden’s remarks last month that one of the great obstacles to medical advances is that so much of the data generated by medical research remains “trapped in silos, preventing faster progress and greater reach to patients.” Yet data sharing and replication remain hotly contested topics in the sciences, provoking substantial conversation.

Written by the Editor-in-Chief and one of the Deputy Editors, the New England Journal of Medicine editorial took sharp aim at data sharing, worrying that “a new class of research person will emerge — people who had nothing to do with the design and execution of the study but use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited” and leading to a system “taken over by what some researchers have characterized as ‘research parasites’.”

Remarkably, the Editor-in-Chief of what is widely considered medicine’s oldest and most prestigious journal calls those who use open data “research parasites” who are “stealing from the research productivity” of those who make their data available. Yet, perhaps most strikingly in the era of replication, he notes that a principal concern is that data analysts might “use the data to try to disprove” the data collectors’ own work or ideas. In short, he expresses concern that open data will allow researchers to disprove or question published papers.

In sharp contrast to the norms of many fields, he further makes the demand that whenever an open dataset is used, its creators should be granted “coauthorship to acknowledge both the group that proposed the new idea and the investigative group that accrued the data that allowed it to be tested.”

In the face of the ensuing backlash, the Editor-in-Chief issued a second editorial clarifying the remarks and reasserting the journal’s “commit[ment] to data sharing in the setting of clinical trials.” Yet he did not reverse his push for collaborative publication, repeating that “by working in collaboration … everyone will gain” and that he “believe[s] that we will all benefit most if this is done collaboratively.” He also doubled down on the criticisms of sharing expressed in the original editorial, stating “we spoke to clinical trialists around the world. Many were concerned that data sharing would require them to commit scarce resources with little direct benefit” and that “some of them spoke pejoratively in describing data scientists who analyze the data of others.”