By Shannon Bohle
In 2013, Dr. John Holdren, the director of the White House Office of Science and Technology Policy (OSTP), issued a science policy memorandum requiring free public access to the published journal articles of scientific and medical researchers in the United States funded by federal agencies with more than $100 million in research and development expenditures. The memorandum was written in response to an online petition on the “We the People” website called “Require free access over the Internet to scientific journal articles arising from taxpayer-funded research.” The petition, which was created on 13 May 2012, stated:
We believe in the power of the Internet to foster innovation, research, and education. Requiring the published results of taxpayer-funded research to be posted on the Internet in human and machine readable form would provide access to patients and caregivers, students and their teachers, researchers, entrepreneurs, and other taxpayers who paid for the research. Expanding access would speed the research process and increase the return on our investment in scientific research. The highly successful Public Access Policy of the National Institutes of Health proves that this can be done without disrupting the research process, and we urge President Obama to act now to implement open access policies for all federal agencies that fund scientific research.
In the official response, Holdren stated, “Americans should have easy access to the results of research they help support.” Specifically, the memorandum directed “federal agencies…with more than $100 million in research and development expenditures to develop plans to make the results of federally-funded research publicly available free of charge within 12 months after original publication.” In particular, “the memorandum requires that agencies start to address the need to improve upon the management and sharing of scientific data produced with federal funding.”
Part of the problem with sharing articles and data, one that has been prominently featured in the scientific, medical, and mathematics literature, is that there are serious problems underlying the research itself: even supplementary, selective data sets are sometimes insufficient to back the claims presented in published papers. The process of peer review is a time-honored tradition, but as science becomes more data-driven and reliant upon “big data,” the evidence necessary for the acceptance of papers is not available, forcing reviewers to rely on their professional judgment rather than evidence. In point of fact, “90 percent of all the data in the world has been generated over the last 2 years.”
One popular saying is that it is important to “trust but verify” when it comes to science and medical research, and no one would agree more readily than a statistician that statistical results can be manipulated to argue that the data support a particular viewpoint or conclusion. What seems to happen in a majority of cases is that scientists make honest mistakes in drawing conclusions, owing either to actual errors in conducting experiments or to greater-than-acceptable margins of error when analyzing their results. The mandate, which covered making articles freely available to the public as well as some of the supporting data sets, did not go far enough in addressing the problem of reproducibility in the science and medical fields. At a subsequent public meeting, “Public Access to Federally-Supported Research and Development Data and Publications,” hosted on 16-17 May 2013 in Washington, DC, by the National Academies to discuss the changes in the mandate, two librarians, Micah Altman, Director of Research and Head Scientist for the Program on Information Science at MIT Libraries, and R. Michael Tanner, of the Association of Research Libraries, took particular note of the omission of one form of critical supporting documentation: laboratory notebooks. The 22 February 2013 memorandum itself stated, “For purposes of this memorandum, data is defined, consistent with OMB circular A-110, as the digital recorded factual material commonly accepted in the scientific community as necessary to validate research findings including data sets used to support scholarly publications, but does not include laboratory notebooks, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as laboratory specimens” (emphasis is mine).
I mentioned in a previous blog post how an article in The Economist “pointed out serious flaws in today’s non-reproducible science journal articles and the role that virtual laboratory notebooks could play in monitoring experimental studies”:
Some government funding agencies, including America’s National Institutes of Health, which dish out $30 billion on research each year, are working out how best to encourage replication … Ideally, research protocols should be registered in advance and monitored in virtual notebooks … A rule of thumb among biotechnology venture-capitalists is that half of published research cannot be replicated. Even that may be optimistic. Last year researchers at one biotech firm, Amgen, found they could reproduce just six of 53 “landmark” studies in cancer research. Earlier, a group at Bayer, a drug company, managed to repeat just a quarter of 67 similarly important papers. A leading computer scientist frets that three-quarters of papers in his subfield are bunk. In 2000-10 roughly 80,000 patients took part in clinical trials based on research that was later retracted because of mistakes or improprieties.
Similar articles discussing false, misleading, or simply non-reproducible claims have appeared over time in some of the most prestigious scientific journals, including Nature and Science, as well as the Annals of Applied Statistics. These studies point out that, with an estimated $1.5 trillion (USD) spent globally on research and development, the problem is one with serious financial impacts. There have been numerous reports of retractions; indeed, “the number of retractions due to error has grown over five-fold since 1990.” Articles published in these journals come from all over the world, so this is not just a U.S. problem but a worldwide one. The same is true of the ability to claim intellectual property rights over new processes and products through patenting: it is not limited to U.S. companies, but extends to companies from all over the world wishing to enter the U.S. marketplace by filing a U.S. patent online with the United States Patent and Trademark Office (USPTO).
One major problem when journal articles are published and patents are reviewed is that the laboratory notebooks are rarely consulted. Historically, it is usually only in cases where a litigant challenges a patent that laboratory notebooks have held sway. However, the fact that the companies, rather than the USPTO, have held the notebooks in their possession introduces the possibility of tampering with the evidence should the notebooks be required during litigation. It seems to me a much better idea for a neutral third party, the USPTO, to hold a digital copy of these notebooks in case the need arises to consult them. Of course, there is a need to protect the intellectual property rights of patent holders for the duration of the patent. Nevertheless, should the product or process come under official scrutiny, the notebooks should be available for consultation. Similarly, because they were produced with the tax dollars of the American people, these notebooks (with a few exceptions wherein material should be restricted) should, after the life of the patent (usually 17 to 20 years), become open and freely accessible to all as part of the public domain and be sustained as part of a scientific and medical heritage preservation effort.
Of course, not all laboratory notebooks are part of research for which a provisional or regular patent would be filed. Other specialized data repositories exist, and perhaps these will be expanded to encompass notebooks as well. But it is my belief that a centralized national depository like the USPTO is a good start toward remedying the problem of reproducibility when it comes to discoveries that will impact the public’s health, the national (and perhaps international) economies dependent upon innovations in science, medicine, and technology, as well as the investments of shareholders in these companies.
With this in mind, I open up commentary, and invite readers to sign this online petition:
“Mandate Open Access to Digital Copies of Lab Notebooks Created Through Publicly Funded Research Leading to a US Patent.” Access to notebooks improves the processes of patenting, inventing and preserving U.S. scientific and medical history. In 2013, OSTP mandated open access for federally funded research articles and data, but excluded notebooks. This petition requests expansion of the mandate. When federally funded research results in a (provisional) patent application, a digital copy of searchable, full-text notebooks should be required. Why? Without notebooks, recent studies were unable to reproduce journal findings, resulting in serious economic and health implications for products and processes. Notebooks are evidence in patent litigation, so funding USPTO storage prevents fraud. After the life of the patent, notebooks should become public domain with an exclusion allowing transfer of classified materials to NARA. To sign the petition, visit http://wh.gov/l5gv0 before 16 February 2014. (Enter a Zip Code, if appropriate).
Screenshot credit: “We the People”.
Shannon Bohle, BA, MLIS, CDS (Cantab), FRAS, AHIP is a professional medical librarian and writes a blog, “Scientific and Medical Libraries,” which is presented in association with Nature.com.