How long a TV show stays on the air is determined mostly by how many viewers tune in each night to watch it, so popular TV shows tend to be successful ones. But are these popular shows also critically acclaimed? Do people generally like high quality shows?
I investigate these questions in this post using a sample of TV shows that were aired on US national television channels (ABC, CBS, FOX, NBC or the CW) and ended or were canceled between the 2005/2006 and 2014/2015 seasons.
To measure TV show success or popularity, I turn to two variables: the number of seasons and the number of episodes a show had. To measure critical acclaim or quality, I use the show’s rating on IMDb. While an IMDb rating may not be the best proxy for critical acclaim, I would argue that it is mostly people who care a lot about TV shows who end up rating a show there, and those people, I’d say, have higher standards than the average TV viewer.
As a first approach, let us divide our sample of shows into two groups: those with high (above-median) ratings and those with low (below-median) ratings. The boxplots below summarize how these two groups differ in terms of number of seasons and episodes.
The bold horizontal line inside the boxes is the median number of seasons and episodes, respectively. The first thing we can see is that the median number of seasons barely differs between the low and high rating shows (left panel). On the other hand, high rating shows seem to have a higher median number of episodes (right panel). In general, the differences are small. And there is a lot of overlap in the number of seasons and episodes across the two groups.
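For concreteness, the median split can be sketched in a few lines of Python. The show data here are made up purely for illustration; the real values were scraped from IMDb.

```python
from statistics import median

# Hypothetical (rating, seasons, episodes) triples -- illustrative only,
# not the scraped IMDb data.
shows = [
    (6.1, 1, 13), (8.3, 7, 156), (7.2, 2, 35),
    (5.8, 1, 8),  (8.9, 9, 201), (7.9, 5, 110),
]

cutoff = median(rating for rating, _, _ in shows)
high = [s for s in shows if s[0] > cutoff]   # above-median ratings
low = [s for s in shows if s[0] <= cutoff]   # at or below the median

# Compare the groups on the two "success" measures.
median_seasons_high = median(s[1] for s in high)
median_seasons_low = median(s[1] for s in low)
median_episodes_high = median(s[2] for s in high)
median_episodes_low = median(s[2] for s in low)
```

The boxplots are just a visual summary of exactly this kind of group-wise comparison.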
We can also construct density plots for the two groups. The next two plots can be interpreted as follows: the higher the blue or red areas are at a certain number of seasons or episodes, the more shows with high (blue) or low (red) ratings have that many seasons/episodes.
We can, for instance, see in the plot below that low rating shows (red) tend to be somewhat more numerous at a low number of seasons (left side of the plot), while high rating shows more often have around 8 or more seasons (blue).
The above figure shows that for seasons as well, we have some positive relationship between ratings and success. The blue density is above the red one at higher numbers of seasons.
This relationship is much more pronounced for the number of episodes, as can be expected from the boxplots above. The density plot below shows this. Low rating shows have a low number of episodes (and high rating shows, a high number of episodes) much more commonly.
One may speculate as to why this is. It is perhaps possible that low quality shows can often run for many seasons, but it is mostly truly high quality shows that both run for many seasons and get extra episodes (beyond the usual 22 per season).
Let us now use scatterplots with fitted regression lines to check whether the positive relationship we identified in the plots above is substantial or negligible.
For seasons, we can see a very weak relationship, as shown below.
The blue line is almost horizontal, especially if we take the shaded 95% confidence interval around it into account. This means that at best there is a very weak positive relationship between seasons and ratings.
For episodes, the relationship is much clearer, and, as the plot below shows, we can reasonably expect it to be statistically significant as well.
A final interesting plot compares the effect of rating on the number of episodes broken down by genre. We can see that the relationship seems much stronger for comedies. The stronger relationship for comedies might be explained by the “extra episodes” hypothesis outlined above.
So suppose it is high quality (i.e. high rating) shows that get orders for extra episodes (beyond 22). Such extra orders might be much easier to make for comedies, as generally sitcom plots do not stretch into multiple episodes. It is much harder for a drama to add an additional episode on short notice.
Finally, let us look at the relationship in a more statistically robust manner. Estimates from various models (see details at the end of the post in the Technical appendix) indicate that an additional IMDb rating point is associated with an additional .35 to .50 seasons. The preferred model (negative binomial) predicts an additional .45 seasons. There is virtually no difference between comedies and dramas.
The effect of ratings is significant in the preferred model at the 10% level, but barely (p = .0996).
For episodes, we have a much more robust relationship, as expected. An additional rating point is predicted to land an additional 11 episodes. Or in an alternative model, an additional rating point is predicted to increase the number of episodes over the lifetime of a show by around 23%. Genre has no significant effect, despite what the last plot may suggest.
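As a sanity check on the log-model interpretation: the post does not report the underlying coefficient, but a 23% increase per rating point corresponds to a coefficient of ln(1.23) ≈ 0.21 on rating in the log-episodes specification. The 0.21 value below is back-calculated for illustration, not taken from the actual model output.

```python
import math

# Back out the implied log-episodes coefficient from the reported
# ~23% effect per rating point (an illustration, not the actual estimate).
beta = math.log(1.23)

# In a log-linear model, one extra rating point multiplies the expected
# episode count by exp(beta); expressed as a percentage increase:
pct_increase = (math.exp(beta) - 1) * 100
```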
The relationship between ratings and episodes is quite strong, significant at least at 5% in all cases.
To sum up, this analysis indicates that higher quality shows (as measured by IMDb rating) do tend to get more air time on average. This is especially true when we look at the number of episodes as opposed to the number of seasons. Thus many bad shows may last many seasons, but fans of good shows are more likely to be rewarded with a higher number of episodes per season.
The relationship is thus there, but then again it is not all that large quantitatively speaking. One would need at least 2 additional rating points in order for a show to have an additional season, and that is if we use our highest estimate. We obtain a similar result if we look at the estimates for episodes instead. An additional rating point delivers 11 episodes, which is about half a season. So one would need two more rating points to get a whole extra season.
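The back-of-the-envelope arithmetic behind the "two rating points per extra season" claim, using only the estimates quoted above:

```python
# Estimates reported in the post.
seasons_per_point = 0.50      # highest seasons estimate per rating point
episodes_per_point = 11       # episodes estimate per rating point
episodes_per_season = 22      # the usual US network season length

# Rating points needed for one extra season, computed two ways.
points_via_seasons = 1 / seasons_per_point
points_via_episodes = episodes_per_season / episodes_per_point
```

Both routes land on the same answer: roughly two full rating points per extra season.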
So my verdict is that better shows get more air time statistically speaking, but the picture is not so bright if we look at quantitative effects. Nevertheless, this analysis has a rather optimistic conclusion: higher quality shows do tend to be somewhat more popular.
Technical appendix

Models. To examine the relationship between ratings and seasons/episodes, I used several specifications. The dependent variables were number of seasons, number of episodes and log number of episodes. The independent variables were rating and a genre dummy (comedy or drama). I estimated models both with and without genre.
For seasons, I tried basic OLS specifications, but given that seasons can only take low integer values (i.e. it is count data), I looked at count data models as well. First, I estimated Poisson models, but the dispersion test indicated in all cases that there is overdispersion in the data. So I moved on to negative binomial models. The preferred specification is a negative binomial model with both rating and genre as independent variables.
In this specification, rating is significant at 10%, genre isn’t. The coefficient of rating is around .45.
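The overdispersion problem that ruled out the Poisson models can be illustrated with a crude variance-to-mean check (the formal dispersion test is more involved; the counts below are made up, not the scraped data):

```python
from statistics import mean, variance

# Hypothetical season counts: many one-season shows plus a few long
# runners -- the kind of skew that produces overdispersion.
seasons = [1, 1, 1, 2, 2, 3, 1, 7, 9, 1, 5, 2, 1, 10, 4]

m = mean(seasons)
v = variance(seasons)  # sample variance

# A Poisson model assumes variance == mean, so this ratio should be near 1.
# A ratio well above 1 signals overdispersion, which is what pushes the
# analysis toward a negative binomial model instead.
dispersion_ratio = v / m
```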
For episodes, I tried OLS specifications. Rating is always significant (in the worst case at 5%), genre never is. As mentioned above, both the untransformed and the log-transformed episodes variables were tried.
Data. To conduct this analysis, data was collected (scraped) from IMDb using R. The analysis is 100% reproducible (or so I like to think), and the code is available here. The linked file also contains the final data set in .csv format.