Thumbs down on Netflix's ditching of 5-star ratings for a thumbs-based system
March 19, 2017
The Netflix video streaming and DVD service announced Thursday that it is switching from a 5-star rating system to a simpler thumbs up / thumbs down system. I've been a Netflix user (and fan) for many years, and love their personalized ratings predictions. I have often used their model in presentations and brainstorming involving other services that could benefit from that kind of personalization. I think this change is a bad idea.
I called the Netflix Help Center (866-579-7172), and the customer service representative I spoke with told me they were eager to receive feedback on this topic, especially feedback that specifies why users are in favor or not in favor of the proposed change. I shared several reasons why I thought it was a bad idea and I want to share those reasons here, in the hope it may encourage other Netflix users - especially those who share my view that it's a bad idea - to contact Netflix and provide feedback.
Granularity is Good
The main objection I have to the proposed change is that I make careful distinctions in both the ratings I give to movies I have seen and on the personalized predicted ratings Netflix offers for movies I have not yet seen. I probably watch, on average, 2 hours of TV a week, 1 DVD every month, 1 movie in a theater every 3 months. I hardly ever watch video content on Netflix, YouTube or other streaming sources. So I'm probably an outlier on several dimensions.
That said, on the rare occasions when I do want to watch a movie - at home or in a theater - I will only watch a movie for which the personalized predicted Netflix rating is at least 4.0. Since I know the personalized prediction accuracy is dependent (in part) on my own ratings, I am very careful in how I rate movies. I use the following interpretations for the 5-star scale:
- 5 stars: a movie I liked so much I've seen it several times and/or would enjoy seeing again
- 4 stars: a movie I liked a lot, but am not interested in seeing again
- 3 stars: a movie I liked, but would probably have preferred to spend my time watching something else
- 2 stars: a movie I didn't like, and probably didn't watch much of
- 1 star: I don't know if I've ever seen a 1-star movie, and certainly don't want to ever see one
While some people may find it easier to give a thumbs up or thumbs down rating (which I will refer to hereafter as a thumbs-based rating), I would find it more difficult. I envision the following mapping from my 5-star schema to thumbs-based ratings:
- Thumbs up for a 4-star or 5-star movie
- No rating for a 3-star movie
- Thumbs down for a 1-star or 2-star movie
Given that I rarely see a 2-star movie, I would probably only be giving thumbs up ratings in the proposed new scheme, and predict that the lower volume of ratings combined with the lower granularity of ratings would result in less accurate Netflix rating predictions.
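As a rough illustration of the mapping above, here is a minimal Python sketch. The function name `stars_to_thumbs` and the sample ratings list are hypothetical and made up for this example; this is not how Netflix converts or stores ratings. It only shows how both the number of ratings and the number of distinct rating values shrink under this particular mapping.

```python
from typing import Optional


def stars_to_thumbs(stars: int) -> Optional[str]:
    """Map a 1-5 star rating to 'up', 'down', or None (no rating given)."""
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    if stars >= 4:
        return "up"    # 4-star or 5-star movie
    if stars == 3:
        return None    # 3-star movie: no thumbs rating at all
    return "down"      # 1-star or 2-star movie


# Hypothetical ratings history, invented for illustration only.
my_star_ratings = [5, 4, 4, 3, 3, 5, 4, 3, 2, 4]

thumbs = [stars_to_thumbs(s) for s in my_star_ratings]
submitted = [t for t in thumbs if t is not None]

# Compare volume and granularity before and after the mapping.
print("ratings submitted:", len(my_star_ratings), "->", len(submitted))
print("distinct values:", len(set(my_star_ratings)), "->", len(set(submitted)))
```

With this made-up history, the 3-star movies drop out entirely and the distinction between 4-star and 5-star movies disappears, which is the loss of volume and granularity described above.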
Quality vs. Quantity
Speaking of quantity, the Verge article reported that Netflix saw a 200% increase in the number of ratings among the test group who used thumbs up or thumbs down, compared to the number of ratings from the group that used the 5-star system.
The article doesn't report on the change in the number of users who submit ratings using thumbs up or thumbs down, nor is it clear whether a specific control group was used in the experiment. Based on their marvelously detailed posts in the Netflix tech blog, especially the posts on their recommender systems, I suspect they were very careful in the way they designed the experiment. Perhaps more details will eventually be reported there.
The article also doesn't report on the quality of the recommendations under the thumbs-based rating system. More is not necessarily better, and it is not clear what kind of impact the increased quantity had on the perceived quality of predictions based on the new system.
Given that the average U.S. adult consumes 5.5 hours of TV, movies, games, and other video content per day, I suspect most users are less discriminating than I am with respect to what they will watch. It may be that the quality of recommendations using the new system serves high-volume - or even average-volume - video consumers as well as or better than it would serve low-volume video consumers. But if my supposition that higher-volume video consumers are less discriminating is correct, then the increase in quality may not have much impact on the amount consumed. And since Netflix charges flat monthly rates, those of us who consume relatively little video content are paying just as much as those who consume large amounts. If the recommendation quality declines for someone like me, who consumes little content, and the quantity of video I consume similarly declines, I am more likely to discontinue the service than a high-volume consumer who might consume less if the quality of recommendations is not as good (due to fewer ratings). But if they are already consuming a large quantity of video, I don't understand what problem Netflix is trying to address.
Returns on Investments
The article draws an analogy between Netflix ratings and Spotify thumbs-based ratings, which I think is an inappropriate comparison point. I use both the Spotify and Pandora streaming music services (in fact, I'm a paid subscriber to both, because I hate commercials in any medium), but rating a song that lasts a few minutes is very different - in my view - from rating a movie that lasts a few hours. I'm much more willing to provide a finer granularity rating (e.g., on a 5-star scale) for an experience that will last hours vs. minutes.
I think a better comparison point would be Yelp, which uses 5-star ratings for restaurants and other service providers. I'm willing to provide ratings on a 5-star scale for restaurants, because it represents a more significant investment of time (and money). I would even consider TripAdvisor, an online service for reviews and ratings of hotels and other destinations and activities associated with traveling, a better comparison point than Spotify, as pl
Personalized Ratings for All
In fact, I think both Yelp and TripAdvisor could benefit from adopting the potentially-soon-to-be-former Netflix personalized rating scheme. I am growing weary of wading through reviews of restaurants on Yelp from people who rant about the bartender not paying attention to them, or a special event dinner that went awry, or from anyone whose tastes in restaurants differ from mine. I would love it if Yelp would offer a personalized rating, or at least let me read reviews from people like me.
TripAdvisor ratings have become almost useless to me. It appears that many hotels are carpet-bombing guests with email invitations to review their stay, and the result seems to be that many places now have an overwhelming abundance of reviews from people who have only posted one review. I consider most newbie reviews nearly useless, both because they tend to be short and uninformative, and because there is no way to know what kind of other places the person has reviewed, so I can't tell how much the reviewer is like me.
I could rant further on the decline of both of these services - which I once found far more useful - but I will let it go (for now). I wanted to compose this post because throughout all the years I've been a Netflix user, the service has only gotten better (as I gave it more ratings upon which to make recommendations), and I'd hate to see yet another beloved rating, review and recommender service decline.
If you feel similarly, I urge you to call Netflix soon, as they are reportedly planning to roll out the new thumbs-based rating system in April.