Whereby I try to convince you that all other ratings systems are the lose, and Goodfilms’ is the win. Let’s begin.
First, an observation
Your favourite films probably didn’t win Best Picture.
In fact, I wonder how many Best Pictures you would happily watch over and over again? Critical acclaim doesn’t guarantee that you’ll love a film, and a film doesn’t have to be objectively brilliant for you to list it among your favourites.
Moreover, some films that were shocking or groundbreaking for their time might not seem that remarkable now, and some that were panned when they were released may achieve popularity and recognition much later.
The point is this - your enjoyment of a film is different to its critical reception. There’s something else at work, and this post is about determining what that is.
5-star ratings systems don’t work
A few years ago, YouTube blogged about the ineffectiveness of their 5-star ratings system, referring to this graph:
The number of 5-star ratings greatly outnumbered all others, with 1-star a distant second. They decided to switch to a thumbs-up/thumbs-down system, which by all reports is working well. Amazon also uses a 5-star system and similarly skewed ratings are often seen. In fact, they now define 4+ stars as ‘favorable’ and 1-3 stars as ‘critical’. The reviews of Transformers 2 are a particularly good example:
For films, a yes/no ratings system, like that employed by Rotten Tomatoes, may be the best way to derive a single, aggregate score. But it relies on a huge number of anonymous ratings, and collects the least possible amount of information from each person.
I think this is a step in the wrong direction.
When considering your immediate friends, you want to know the detail and variety of their opinions. There’s a difference between your friend liking a movie and loving it, between not really caring for a film and actively warning you off it. Goodfilms focuses on these opinions, and so an aggregate score isn’t good enough. Mind you, that doesn’t mean we’re happy with the usual 5-star rating system. Oh my word no.
There’s another problem with the traditional star system - quantization. Consider the following graph of a staggering 10 million movie ratings from the MovieLens project:
There’s a clear bias toward integer ratings here - only 20% of all ratings are half-stars. And there’s no obvious reason why that should be the case. The majority of ratings are 3 and 4 stars, so why are there so few 3.5-star ratings?
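If you’d like to check the integer bias in a ratings dump of your own, here’s a minimal Python sketch. The function name `half_star_share` and the sample list are made up for illustration; this is not the MovieLens data or the code behind the graph, just one way you might count half-star ratings.

```python
def half_star_share(ratings):
    """Return the fraction of ratings that land on a half-star (e.g. 3.5)."""
    if not ratings:
        raise ValueError("need at least one rating")
    # Doubling a rating makes half-stars odd (3.5 -> 7) and whole stars even (3.0 -> 6).
    half = sum(1 for r in ratings if (r * 2) % 2 == 1)
    return half / len(ratings)


# Hypothetical sample, invented for this example (not real MovieLens ratings).
sample = [3.0, 4.0, 3.5, 5.0, 4.0, 3.0, 2.0, 4.5, 4.0, 3.0]
share = half_star_share(sample)
print(f"{share:.0%} of ratings are half-stars")
```

If people were using the half-star steps as freely as the whole ones, you’d expect a share much closer to half.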
Goodfilms uses a continuous scale instead, and it completely avoids the integer bias observed in the MovieLens dataset:
A continuous scale might be an improvement, but we haven’t addressed the key problem - there’s more to rating films than objective quality. But what is it?
Going beyond ‘quality’
We’ve come up with a concept that we think captures this neatly: rewatchability.
Yes, I’ll admit that that’s not a real word. But it’s a good short-hand for the question we ask when you rate a film on Goodfilms:
If you were to watch it again, how much would you enjoy it?
Now, when grappling with what score to give a film like Transformers 2, you don’t have to decide between ‘Transformers Rock!’ and ‘I wish I could have transformed into a gun to shoot myself’, like the Amazon example above. Now you can freely admit that it’s a long, loud, confusing mess of a film and you like it anyway:
On the other end of the scale, I think Black Swan is a terrific film, but I’d sooner pull my nails out than watch it again.
Rewatchability turns out to be an excellent indicator of what films you’d recommend. What better recommendation than a movie you’d happily watch again and again? And would you really suggest a film to a friend if you yourself wouldn’t enjoy a repeat viewing?
We think this is a much better way to rate films, and from the first chunk of data we’re starting to see some fascinating results. Here’s a sneak peek at what we’ve got so far, displaying ‘quality’ along the x-axis and ‘rewatchability’ along the y-axis:
As you can see, both axes have their outliers, and overall there’s a correlation between quality and rewatchability, which you’d expect. Some films are showing themselves to be deeply contradictory, with others showing a surprising consensus. And there’s a tendency toward higher scores on quality in general, while rewatchability is more evenly distributed.
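For the curious, here’s a rough Python sketch of how you might measure that correlation and spot the most contradictory film in a set of two-axis ratings. Everything in it is an assumption for illustration: the film names and scores are invented, and `pearson` is a plain textbook Pearson correlation, not necessarily what we run on the real Goodfilms data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one axis never varies")
    return cov / (spread_x * spread_y)


# Hypothetical (film, quality, rewatchability) ratings, invented for this example.
ratings = [
    ("Film A", 4.5, 1.5),  # great film, never again
    ("Film B", 2.0, 4.0),  # a mess, but a fun one
    ("Film C", 4.0, 4.5),
    ("Film D", 1.5, 1.0),
    ("Film E", 3.5, 3.0),
    ("Film F", 3.0, 3.5),
]

quality = [q for _, q, _ in ratings]
rewatch = [w for _, _, w in ratings]
r = pearson(quality, rewatch)

# The film whose two scores disagree the most.
most_split = max(ratings, key=lambda film: abs(film[1] - film[2]))

print(f"correlation: {r:.2f}")
print(f"most contradictory: {most_split[0]}")
```

A value near 1 would mean the two axes are telling you the same thing. The interesting films are the ones that drag it down.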
There’ll be much more detailed posts in the future, but for now - that’s our ratings system, and how we came to it. We think it’s the total win, and we hope you’ll agree. So sign up if you haven’t already, and get rating!