Waving the pirate flag: The future of science publishing is black?

Preface: I feel like I am too scatter-brained to blog as regularly as I would like to. When I have an idea it is often easier to put it on Twitter (@Aging_Scientist) instead of writing a post. The problem with blogging is that once I get obsessed with an idea, I turn into a perfectionist and start writing these long and rambling posts. But they are never "good enough" or "quite finished". Not sure how often I will blog from now on.

The era of sci-hub

Evidently it is desirable to access the scientific literature efficiently and cheaply ("open access"), because papers are written to be shared. As researchers we want to benefit the public and make our voice heard. Three types of open access (OA) schemes are well established by now (ref. 5). Green is the publication of manuscripts in archives. Gold is the placement of publications in journals that guarantee free access. Despite high expectations, neither approach has managed to overtake regular pay-per-view and institutional subscription-based models ("paywall"). Black describes a service called sci-hub, which copies and saves almost all biomedical papers that are being published. This “black” approach is set to revolutionize publishing, but why?

The Problem: literature is too expensive
Access to the biomedical literature is too expensive. No library can afford a subscription to all journals so, even if you are affiliated with a university or two, you cannot access all articles you need for free. What is worse, some authors publish up-to-date methods and reviews in book chapters, which are much harder to access than papers. It is estimated that scientists have access to about 50% of the papers they need for free but for books the number may be as low as 2% (14).

1. This is highly problematic for students, who tend to have negative cash flows during their studies. If you went to university you probably know the problem. Moreover, many students writing undergrad papers just read abstracts and those who actually try to access the literature get punished twice, because it takes time to read in the first place, and then it also costs time and money to access papers. As a knock-on effect, difficult access to the literature provides the perverse incentive to water down undergrad courses and to cheat on writing assignments.

Of course, I had the integrity to do proper research, so I spent hundreds of €s to access inferior DRM copies through library-loan services and fired off hundreds of mails begging authors to send me their pdf copies. Since I was flooding researchers' overflowing inboxes, I was often enough, and justifiably, met with the scornful: "Who are you and why should I share (my time) with you?"

Don't get me started on the psychological and practical effects this has on your workflow and motivation, when after an exhaustive search you find an important paper and you then have to wait days or weeks before you can read it, and in the worst case you may just end up with an angry email for all your effort.

2. It doesn't get better as you progress in your career. The higher up you progress, as a PhD student or Postdoc, the more you are supposed to read and the less time you have to do so (and to go begging for access). Nevertheless academic pay remains anemic on an hourly basis, and yet you are supposed to pay out of pocket? Not cite pertinent literature because you can’t afford it? Is that a viable solution?

3. As an interested lay-person who is not affiliated with a university anymore or never was, accessing the literature is an exercise in futility. It would cost thousands of dollars to write a review paper, for example (3), if you were taking a break in-between studies. Evidently being a science blogger, just a critical reader or a patient advocate becomes impossible. There are also many people working in health-related jobs who may have the ability and interest to read science papers, but never do for lack of access (e.g. psychotherapists and nursing staff).

4. If you work at a company, accessing the literature is very costly, putting startups and companies in poorer countries at a disadvantage and thus harming innovation. Small companies do not have the necessary economies of scale and bargaining power to afford a subscription to a journal bundle. This puts a disproportionate strain on chemists and engineers, who tend to pursue a career outside of academia more often, and as it turns out chemists are avid users of sci-hub.

5. The above are still best case examples. Millions of people studying and working in poor countries cannot dream of wasting as much money on papers as we do. They are simply doomed to do inferior science, remaining completely unaware of large swaths of literature.

Virtually everyone is being harmed by the current subscription based model because it poisons the scientific workflow which is highly reliant on motivation, continuity and speed. But perhaps it is unavoidable for some reason?

The greed of publishers?
Buying a single article would cost you 30-40$, but the true income publishers receive per article is likely lower. However, it is impossible to estimate due to the heterogeneous revenue streams, since access to most articles is bought through bulk subscriptions by libraries and universities. Our knowledge of the actual pricing is limited due to non-disclosure agreements signed by the involved parties (Van Noorden 2013). Such secrecy is antithetical to the free market model and suggests that publishers have something to hide. Inelastic demand and secrecy have allowed abusive pricing policies to fester. Universities pay hundreds of thousands or millions for journal access to Elsevier, Springer, and Wiley, but the costs are not related to the estimated journal use (10).

Nothing is a stronger testament to the shady behavior of publishers than their legacy archives. Some articles are so old that they only exist in print, and hence they need to be uploaded online as a scan or OCR. Since these articles are very old, they should have generated enough revenue for the publishers and should have "paid for themselves". Nevertheless publishers like to charge extra and make old archives particularly hard to access, even though the cost of digitizing an article must be in the single or double digits.

Publishers continue to fight against open-access (OA) (%): "Traditional green OA in institutional repositories has been struggling with getting researchers to upload, despite the fact that most major universities now have such repositories in place (Eisen, 2015). The researchers just do not seem to bother with the little extra work involved and many are ignorant of the possibilities. The leading subject repositories, arXiv and PMC are doing better, but only cover some fields of science. And publishers have tightened embargo rules for self-archiving, making green OA less attractive...Elsevier has sent out takedown requests to Academia.edu (Howard, 2013)."

Double-charging. Many publishers force both the readers and authors to pay for their work (6): "We've also known since 2006 that most (75%) conventional or non-open access journals do charge author-side fees, on top of reader-side subscription fees."

Opinion polls show just how broken publishing is. We are scientists and science-lovers, not criminals, remember? Yet even among never-users of sci-hub about 80% approve of this service (7).

The True Cost of Publishing
Publishing a standard research article involves a lot of volunteer work. The authors have to do most of the research (obviously) and also conform to the guidelines, formatting and recommendations of each journal. After submitting an article to a journal for consideration, reviewers who work for free judge the quality of an article. The only true cost for the publisher is to pre-screen articles to check if they adhere to standards, initial quality review (i.e. editorial review), supervising and managing peer-review (again editorial work), re-formatting, type-setting and then publishing it online and maintaining the servers. (Let's assume for the sake of simplicity we want to publish a simple article and so no trees need to be killed and no additional value is added through Op-Eds and journalistic work.)

The final cost is dependent on the quality and selectivity of the journal. The more papers a journal rejects, the more editorial work is needed and the longer it will take to manage peer-review (which itself is done for free, as already mentioned). Let's be clear here: publishing is not free and someone has to pay for it, so the actual question is whether regular publishers are extracting an unfair price through rent-seeking and monopolistic behavior.

There are two models of publishing in practice. “Subscription based” when publishing is free to the author and later the costs are recouped by selling the article (around 80 or 90% of all journals) and “open-access” where the authors or funders pay all the costs of publishing and thereafter the article is free. How much does it cost?

The open-access publishers are the only transparent outfits so we have to ask them. For example, PeerJ, Hindawi and Ubiquity Press claim it is possible to publish peer-reviewed articles for 200-300$ per article (Van Noorden 2013), whereas at one of the most famous and prestigious open-access journals, PLoS ONE, it is 1350$. These costs are quite low considering that the parent company operates in San Francisco and overpays its executives (2). In contrast, unreviewed pre-prints can be published on arXiv, commonly used in physics, and this may cost as little as 10$ per article. Physicists use a form of post-publication review, which is not popular in the biomedical sciences, so this is out of the question for now. Nevertheless the data show that the cost of publishing is quite low even if we consider peer-review.

Smart online-only systems could further drive down the costs. Peer review after publishing might be the best solution, but other avenues could help to keep the current model cost-effective. If we publish the names of peer-reviewers, thereby rewarding good work, we could make one of the referees responsible for proof-reading on top of scientific assessment. Perhaps young scientists could fill this role. Paper pre-screening can be easily outsourced as well. Plenty of ideas exist.

Enter sci-hub -- the end of traditional publishing
The Robin Hood of publishing, available online at “sci-hub[dot]something”, is quickly burning through domain names that keep getting blocked, like .cc, .ac, .tw, etc., but so far it has remained available no matter the efforts of publishers. Why do publishers and many others take offense at Alexandra Elbakyan's devil spawn?

Criticisms leveled against Sci-hub 
Does sci-hub really benefit researchers from poor countries?
Publishers claim the following (4): "Ironically, traditional academic publishers and libraries have arguably provided far greater access to research in developing countries than Sci-Hub—by making content available through initiatives such as the International Network for the Availability of Scientific Publications and the eJournals Delivery"

This must be wrong by definition, because sci-hub offers free access to over 99% of all published literature as a first approximation (~100% for free is always more than a fraction of 100% as provided by publisher initiatives). However, let us look at the numbers. If we assume that accessing an article costs a researcher in a poor country 1$ instead of 30$, which is the retail price, through a combination of opportunity costs, library loans and other routes of access, then how much does sci-hub benefit them? Over a period of 6 months sci-hub served 28 million papers (Bohannon 2016), worth 28 million$ at that rate. Many of these papers were accessed in poor countries. Thus the calculated subsidy to Indian researchers is about 2 million$, Chinese researchers gained 2.3 million$, Brazilian, Egyptian, Tunisian and Indonesian researchers about 0.5 million$ each, Russians 1 million$ and Iranians about 2.5 million$. Doubling the six-month total gives the yearly figure: by our conservative estimate, sci-hub saves about 20 million$ per year for researchers in poor countries. The number balloons to 200 million$ if we assume the researchers would otherwise have to pay 10$ per article.
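For readers who want to check the arithmetic, here is a minimal sketch of the estimate. It assumes the 0.5 million$ figure applies to each of Brazil, Egypt, Tunisia and Indonesia separately (otherwise the total does not reach roughly 20 million$), and it reads the six-month download counts directly off the dollar figures at the assumed 1$ per article. The variable and function names are made up for illustration.

```python
# Six-month download counts in millions, read off the dollar figures in the
# text at the assumed cost of 1$ per article.
# Assumption: the 0.5 million figure is per country, not a combined total.
SIX_MONTH_DOWNLOADS_MILLIONS = {
    "India": 2.0,
    "China": 2.3,
    "Brazil": 0.5,
    "Egypt": 0.5,
    "Tunisia": 0.5,
    "Indonesia": 0.5,
    "Russia": 1.0,
    "Iran": 2.5,
}


def annual_savings_millions(cost_per_article, downloads=SIX_MONTH_DOWNLOADS_MILLIONS):
    """Yearly savings in millions of dollars for the listed countries.

    cost_per_article is what a researcher would otherwise pay (in $);
    the six-month total is doubled to cover a full year.
    """
    six_month_total = sum(downloads.values())
    return 2 * six_month_total * cost_per_article


if __name__ == "__main__":
    for cost in (1, 10):
        savings = annual_savings_millions(cost)
        print(f"At {cost}$ per article: about {savings:.0f} million$ per year")
```

With these inputs the six-month total is 9.8 million downloads, which is where the rounded 20 million$ and 200 million$ per year come from.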

If that is difficult to understand, perhaps a more concrete example will help. Did the publishers help me when I was struggling for money? Was I supposed to pay from my pension savings? Where are they to help other individuals who are even less fortunate than I was? (3) How do they enable access in moderately poor countries like Poland not covered by their developing nations initiatives?

Rich people are using sci-hub, therefore it is evil: (4) "many of the site’s downloads go to affluent academic hubs, in areas such as New York City and Silicon Valley, that are already well-served with institutional access to scientific journals."

However, this ignores the problem I already mentioned above. The subscription-based model is so outmoded and cumbersome that it poisons our workflows. It is no surprise that some scientists prefer to use sci-hub even for papers they might be able to access otherwise. Only about 50% of papers are being published as open access work (5). Affluent scientists would pay a reasonable price if there was a literature service that is consistent with high productivity workflows (Netflix for papers).

Professional societies will suffer or die due to sci-hub (4). This is of course nonsense. There is no reason why for-profit publishing should cross-subsidize professional societies. If they provide tangible benefits to working scientists or universities then the respective players can pay them through membership fees.

Sci-hub articles may not be up-to-date. While this could be true, the platform outperforms in other respects, e.g. ease of use and currency (15).

Scientific publishing is doomed as it is being undermined (11): "nothing comes for free. Maintaining a quality publishing program and hosting that content on a robust and reliable platform is only sustainable when libraries, organizations, and individual researchers pay to subscribe to it"
But of course this is a red herring, since publishers do not charge the fair market rate to begin with (10) and alternative models have shown their structures to be bloated.

It's stealing. Sure, but who cares? The law is there to serve us and not the other way around. Ignoring the legal implications, most people agree that it is ethical to steal (7) in order to avoid greater harm, which in this case is lack of literature access by the very people we have tasked with developing cures for our ailments and diseases!

"The content Sci-Hub shares ‘for free’ was published by mostly commercial entities, after uncoerced submission of work authored and reviewed by academics. The published content is the result of legal publishing agreements between academic authors and publishers. Said academic authors consciously signed contracts with publishers granting their agreement to paywall and (in most cases) transfer all rights to said publishers" (13)
First of all, it is clearly and painfully wrong that our work is "uncoerced". Publish or perish is a slogan for a reason. Little about our work in the lab is uncoerced, but we labour in this broken system because we love humanity and science. Mental health crises and suicides by PhD students attest to how much coercion and other implicit violence there is in academia. Everyone's struggling in academia; do you think we have the mental energy to fight publishers at the end of the day? Of course we should also seek to reform publishing in a legal way if possible, but for now there is no alternative to sci-hub. Second, whether the government, as the ultimate arbiter of what is permitted, should accept the rent-seeking transfer of rights to publishers is another can of worms, but it is my personal belief that the government has the right to renege on deals which have been made under economic and professional duress.

Enough with the naysayers.

Perhaps we should listen to unbiased sources about the problems with sci-hub (5): "The biggest effect of black OA could in fact be in diluting the popularity of green OA channels, in combination with publishers tightening the embargo rules, which in institutional repositories tend to follow."

This is likely correct. Sci-hub will broaden overall access to the literature while weakening the more cumbersome open access routes. Eventually, however, legislators may realize that mandating 100% gold and/or green open access is the only route to kill sci-hub. This is where I see the future. Once the US and EU implement rules, the rest of the world should follow. Until then sci-hub is here to stay. On the other hand, there is evidence that sci-hub improves the bargaining power of institutions (8, 12). Publishers, make us a better deal than before. You refuse? Oh, that is all too sad. Our users will have to google your articles, and the unspoken threat is that people will turn to sci-hub, forming a usage habit. "Time is not on Elsevier’s side," as the authors note (8).

Project DEAL highlights the power of the state (or of a nation-wide consortium) and the improved bargaining power of institutions in the era of sci-hub. German institutions are now fighting to get a fair and nation-wide subscription deal from Elsevier.

The future is black and green
Only freely available articles will enable more efficient workflows in science, education of and control by the public and access by economically disadvantaged individuals. PLOS and other open-access publishers set out to weaken if not kill commercial publishing but they failed (2):

"When we started PLOS the only way we had to make money was through [article processing charges, APCs], but if I had my druthers we’d all just post papers online in a centralized server funded and run by a coalition of governments and funders, and scientists would use lightweight software to peer review published papers and organize the literature in useful ways. And no money would be exchanged in the process. I’m glad that PLOS is stable and has shown the world that the APC model can work, but I hope that we can soon move beyond it to a very different system."
I do not think sci-hub is the end goal, but it might be a necessary evil to catalyze change, since open access publishing failed. If for-profit publishers start dying off and we do not have enough journals, then a non-profit arXiv-like solution can emerge. Green open-access has been limited because it was implemented as an opt-in service (1): "More than 60% of journals already allow authors to self-archive...Most of the others ask authors to wait for a time ... before they archive their papers. However, the vast majority of authors don't self-archive their manuscripts unless prompted by university or funder mandates."

Nevertheless manuscript-archiving might be the future since (6): "green open access is compatible with publishing in non-open access journals, which means that green open access mandates can respect author freedom to publish where they please."
As I said, if green or any other open access kills traditional publishers then several governmental and non-profit alternatives will step up. So far, almost everyone working in science has felt the benefit of sci-hub either directly or indirectly, while all the problems remain speculative.

References and further reading
1. Van Noorden, Richard. "The true cost of science publishing." Nature 495.7442 (2013): 426-429.

2. On pastrami and the business of PLOS By MICHAEL EISEN | Published: MARCH 20, 2016 http://www.michaeleisen.org/blog/?p=1883

3. Bohannon, John. "Who's downloading pirated papers? Everyone." Science 352.6285 (2016): 508-512.

5. Björk, Bo‐Christer. "Gold, green, and black open access." Learned Publishing 30.2 (2017): 173-175.

7. In survey, most give thumbs-up to pirated papers. http://www.sciencemag.org/news/2016/05/survey-most-give-thumbs-pirated-papers

9. Driving Them Up The (Pay)Wall – Sci-Hub and the Disruption of the Academic Publishing Industry. https://spicyip.com/2017/07/driving-them-up-the-paywall-sci-hub-and-the-disruption-of-the-academic-publishing-industry.html

10. How much did your university pay for your journals? By John Bohannon, Jun. 16, 2014, 4:15 PM. http://www.sciencemag.org/news/2014/06/how-much-did-your-university-pay-your-journals

11. Sci-Hub’s “Free” Articles Are Anything but Free: There’s a price to pay for accessing illegally downloaded material from the pirated-content website. By KATHY PRETZ, 14 September 2016. http://theinstitute.ieee.org/blogs/blog/scihubs-free-articles-are-anything-but-free

13. Ernesto Priego, Signal, Not Solution: Notes on Why SciHub Is Not Opening Access, The Winnower 3:e145624.49417, 2016, DOI: 10.15200/winn.145624.49417

14. Green, T. (2017). We’ve failed: Pirate black open access is trumping green and gold and we must change our approach. Learned Publishing, 30(4). DOI:10.1002/leap.1116, http://doi.org/10.1002/leap.1116

15. "Open access: All human knowledge is there—so why can’t everybody access it?". GLYN MOODY - 6/17/2016, 5:30 PM.