Intuitively, it’s easy to see how piracy could have both positive and negative effects on sales. It can complement sales by exposing the products to a wider audience, including people who would otherwise not even have thought about purchasing them. On the other hand, if a product is available for free online, some would-be buyers may not go through with the purchase.
Research on this topic seems to suffer from small sample sizes. Another problem is that of causality: sales and piracy can each be expected to affect the other.
The research that has been done generally concludes that complementarity trumps substitution: free online availability increases sales. One major caveat is that these papers measured the effect not of piracy but of voluntary online publication, i.e. cases where the authors or publishers made their content freely available online.
Voluntary publication differs from piracy in four main ways: (i) when downloading, users are exposed to ads urging them to buy the product, (ii) consumers may think of the publisher/author more positively because of the free sharing, (iii) information about legal free downloads may spread more quickly, and (iv) pirated versions may have lower quality, which induces fewer consumers to actually buy the product.
In a nice new working paper, Hardy, Krawczyk and Tyrowicz (2014) run an experiment that tries to address the issues with previous research. First off, they do study piracy’s effects on sales. Second, they have a relatively large sample size of 246 books. Third, they control for seasonality in their research design.
The basic idea is simple: the researchers got 11 major Polish publishers to participate in their experiments with various books. Books were assigned to either a treatment or control group. Books in the former were removed from piracy sites with no exceptions, whereas books in the latter were not. In a nutshell, this is their experiment: simple but effective.
A few other important details worth mentioning are the following. First, books covered a large range of segments as the figure below illustrates (based on Table 1 in the paper).
Second, books were not assigned to the treatment and control groups randomly. Instead, they were matched in terms of their similarity so that more similar books were matched together. Then within these matches, roughly speaking one book got assigned to the treatment group, one to the control group. So both groups were composed of similar types of books. Using the Mahalanobis distance, similarity was measured across various dimensions including projected sales (by the publisher), date of publication, cover, number of pages and so on.
Third, the experiment ran for a whole year (starting in October 2012) so seasonality was not a concern.
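To make the matching step described above more concrete, here is a minimal sketch of pairing books by Mahalanobis distance and then splitting each pair between the two groups. It is not the authors’ code: the greedy pairing rule, the coin flip within each pair, the function names and the made-up feature matrix are all assumptions for illustration.

```python
import numpy as np

def mahalanobis_pairs(X):
    """Greedily pair rows of X, closest pair (in Mahalanobis distance) first."""
    X = np.asarray(X, dtype=float)
    # Inverse covariance of the features (pinv tolerates collinear columns).
    vi = np.linalg.pinv(np.cov(X, rowvar=False))
    diff = X[:, None, :] - X[None, :, :]
    # Squared Mahalanobis distance between every pair of books.
    d2 = np.einsum("ijk,kl,ijl->ij", diff, vi, diff)
    np.fill_diagonal(d2, np.inf)  # a book cannot be matched with itself
    unmatched = set(range(len(X)))
    pairs = []
    while len(unmatched) >= 2:
        idx = sorted(unmatched)
        sub = d2[np.ix_(idx, idx)]
        a, b = np.unravel_index(np.argmin(sub), sub.shape)
        i, j = idx[a], idx[b]
        pairs.append((i, j))
        unmatched -= {i, j}
    return pairs

def assign_within_pairs(pairs, rng):
    """Assumption: a coin flip decides which book of each pair is treated."""
    treatment, control = [], []
    for i, j in pairs:
        if rng.random() < 0.5:
            i, j = j, i
        treatment.append(i)
        control.append(j)
    return treatment, control

# Made-up features standing in for projected sales, page count and so on.
rng = np.random.default_rng(0)
books = rng.normal(size=(10, 3))
pairs = mahalanobis_pairs(books)
treatment, control = assign_within_pairs(pairs, rng)
```

The point of scaling by the inverse covariance is that features measured in very different units (pages versus projected copies sold) contribute comparably to the notion of “similar”.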
Let’s move on to the results of the experiment. The authors start out by confirming that the treatment was effective. Treated books were indeed much less likely to be available online. The difference is statistically significant. The figure below illustrates this.
Three research assistants were also asked to try to find various books from the sample online. They found on average 57% of the books they were assigned in the control group, but only 32% of the books they were asked to find in the treatment group. Again, this is statistically significant. The authors therefore conclude that the treatment was effective: treated books were considerably harder (if not impossible) to obtain illegally.
Books in the control group had 5% higher sales, but the various statistical tests and models the authors run find no significant effect of piracy on sales. For instance, the cumulative distribution functions of sales in the two groups are almost identical as shown below (CT = control group, ET = treatment group).
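For readers who want to see what comparing two cumulative distribution functions involves, here is a small sketch that builds empirical CDFs and measures the largest vertical gap between them (the two-sample Kolmogorov-Smirnov statistic). The post does not say which tests the authors used, so treat this as one common option rather than their method; the sales figures are synthetic draws, not data from the paper.

```python
import numpy as np

def ecdf(sample, grid):
    """Share of observations in `sample` that are <= each grid point."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, grid, side="right") / sample.size

def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    grid = np.sort(np.concatenate([a, b]))
    return float(np.max(np.abs(ecdf(a, grid) - ecdf(b, grid))))

# Synthetic sales drawn from the same distribution for both groups.
rng = np.random.default_rng(1)
control_sales = rng.lognormal(mean=6.0, sigma=1.0, size=120)
treated_sales = rng.lognormal(mean=6.0, sigma=1.0, size=120)
gap = ks_statistic(control_sales, treated_sales)
```

A gap near 0 means the two curves lie almost on top of each other; a gap of 1 means the samples do not overlap at all.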
Next, some regressions were run to control for the amount of time for which an illegal copy was available online, whether an ebook version of the book exists (this could result in better quality pirated versions), and the number of copies available. These regressions also show no effect of piracy on sales. Controlling for segment (fiction, nonfiction, etc.) does not change the result.
An additional concern may be that, say, niche titles’ sales were decreased by piracy while best sellers’ weren’t (or the other way around). The two effects might then cancel each other out, or the effect might only appear in a small segment and so not be significant in the full sample. This hypothesis is not supported by the data either. Running a quantile regression shows no significant effect of piracy on sales at any of the sales quantiles considered (the 10th, 25th, 50th, 75th and 90th percentiles).
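The idea behind the quantile check can be sketched in a simplified form. With a treatment dummy as the only regressor, the quantile regression coefficient on the dummy amounts to the gap between the two groups’ sample quantiles, so the sketch below just computes those gaps and a percentile bootstrap interval around each. The paper’s regressions presumably include further controls, so this is a stripped-down illustration with hypothetical function names and synthetic data, not a replication.

```python
import numpy as np

QUANTILES = (0.10, 0.25, 0.50, 0.75, 0.90)

def quantile_gaps(treated, control, quantiles=QUANTILES):
    """Treated-minus-control difference at each sales quantile."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    return {q: float(np.quantile(treated, q) - np.quantile(control, q))
            for q in quantiles}

def bootstrap_interval(treated, control, q, rng, draws=2000):
    """95% percentile bootstrap interval for the gap at quantile q."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    gaps = np.empty(draws)
    for d in range(draws):
        # Resample each group with replacement and recompute the gap.
        t = rng.choice(treated, size=treated.size, replace=True)
        c = rng.choice(control, size=control.size, replace=True)
        gaps[d] = np.quantile(t, q) - np.quantile(c, q)
    lo, hi = np.quantile(gaps, [0.025, 0.975])
    return float(lo), float(hi)

# Synthetic sales, identical distribution in both groups by construction.
rng = np.random.default_rng(2)
control_sales = rng.lognormal(mean=6.0, sigma=1.0, size=120)
treated_sales = rng.lognormal(mean=6.0, sigma=1.0, size=120)
gaps = quantile_gaps(treated_sales, control_sales)
median_interval = bootstrap_interval(treated_sales, control_sales, 0.50, rng)
```

If the interval at a given quantile comfortably contains zero, there is no evidence of an effect in that part of the sales distribution, which is the pattern the post reports for all five quantiles.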
So what explains these results? It could be that piracy has both positive and negative effects and the two just happen to cancel each other out. It’s also possible that piracy would have negative effects, but given the preference for paper (and better quality), even illegal downloaders just go and buy the paper version.
Heterogeneous effects by genre can also be behind the results. Maybe in some genres that were underrepresented in the study, piracy does have significant (positive or negative) effects. Although no significant effect by genre was found, the sample size was small for many genres.
Substitution is another possible culprit. For instance, if a consumer can’t find a treated book online, they may just look for another similar book that is available online (because it is in the control group or not part of the study). This, however, may be more relevant in some genres (say academic books) than others. And since no heterogeneity by genre was found, it’s unlikely that these effects drive the overall results.
Overall, this study fits a general pattern in piracy research: the lack of impact of piracy on music, DVD or movie ticket sales. We can now add books to this list.
Now of course, in no way is this paper (or any of the previous ones) alone conclusive. But as you put together these bits of evidence, you see that all of them point in the same general direction. And that is somewhat more convincing.