Posted by: biblioglobal | June 10, 2014

How good are Goodreads ratings?

Or more precisely, how well does the average rating of a book on Goodreads predict my own rating of that book?

I joined Goodreads not too long after starting this blog, mostly as a way of keeping a list of possible books to read for my book-from-every-country project. My initial assumption, like that of a lot of people, I think, was that the star ratings Goodreads gives to books wouldn’t really tell me anything about whether I would like a book. After all, those ratings are averages across so many people with such different tastes. (And surely many of those people have bad taste!) Plus I’m skeptical even of my own ability to quantify my opinion of a book on a scale from 1 to 5, which is why I never include ratings in my blog.

Being a science-type person though, I decided to test it out. I blinded myself to Goodreads ratings for the time that it took me to read 50 books from my Goodreads list. These were mostly books from my book-from-every-country project or other ‘global’ type reading. I rated those books periodically, trying to follow the Goodreads scheme of ‘It was okay’, ‘I liked it’, ‘It was amazing’, etc. When I got done with 50 books, I took a look at the data.

Here’s a graph of the results:

[Figure: GoodreadsRatings]

It turns out that there is actually a statistically significant correlation (linear regression, two-tailed p = 0.03) between the average Goodreads rating and my own opinion! It’s not exactly a tight relationship though. The R^2 value of 0.09 (a correlation coefficient of r = 0.3) means that variation in the Goodreads rating only accounts for about 9% of the variation in my rating. Still, there’s enough of an effect that in the future if I were trying to choose between two books that looked interesting to me and one had a 3.5 rating on Goodreads and the other had a 4.0 rating, I’d go with the 4.0. I wouldn’t not read a book just because it had a 3.5 rating though.
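
For anyone who wants to run the same kind of check on their own ratings, here is a minimal sketch in Python of how the correlation and R^2 could be computed. The numbers are made up for illustration (they are not the 50 books from this post), and the name `pearson_r` is just a label chosen for the example.

```python
from math import sqrt

# Made-up illustrative numbers, NOT the data from this post.
site_avg = [3.4, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2]   # average rating on the site
my_rating = [3, 2, 4, 3, 5, 3, 4, 5]                  # my own 1-5 star rating


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(site_avg, my_rating)
r_squared = r ** 2  # share of the variation in my ratings explained by a straight-line fit

print(f"r = {r:.2f}, R^2 = {r_squared:.2f}")
```

For a simple regression with one predictor, R^2 is just the square of r, which is why an R^2 of 0.09 corresponds to r = 0.3. This sketch does not compute a p-value; a statistics package (for example `scipy.stats.linregress`) reports one alongside the slope and r.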

It’s notable that all of the Goodreads ratings for books I read fell into a fairly small range, from The Tiger’s Wife rated 3.36 (Why on earth is it rated that low?!) to Persepolis and Poor Economics which tied at 4.20 (totally deserved!). I think that range is probably typical of most reasonably successful books*. If my reading was a random sampling of the books on Goodreads, including really obscure, badly written books (not that all obscure books are badly written!), I suspect that Goodreads ratings would have better predictive power. (Anyone willing to read 50 completely random books to test this? I think I’ll pass.)

The book that comes out as most under-rated on Goodreads (as compared to my own opinion) is Ali and Nino by Kurban Said. It seems that a movie version is in the works. It will be interesting to see if that boosts the ratings. The book that according to me is most over-rated is Bonsai by Alejandro Zambra. I think that one comes down to personal taste. It’s apparently quite a good book. It just didn’t happen to appeal to me at all.

Well, what about other sites with book ratings? I decided to look at the ratings on Amazon and LibraryThing for comparison. One thing I immediately noticed was that the ratings on Amazon were higher than on the other two sites. (The data also look boxier because the Amazon ratings are rounded to the nearest tenth.)

[Figure: AmazonRatings]

Maybe the higher ratings are related to Amazon being a shopping site? Maybe you’re more likely to bother to rate books you really liked there? Once you account for the higher ratings though, Amazon ratings were about as successful at predicting my ratings as Goodreads’ were.

Here’s LibraryThing:

[Figure: LibraryThingRatings]

Interestingly, LibraryThing seems to do a somewhat better job of predicting my opinion than either of the other two. That surprised me because in most cases the books had fewer ratings on LibraryThing than on Goodreads. I wonder whether the community on LibraryThing might have tastes more similar to mine? (Or it could just be chance. I haven’t actually done the statistical test to determine if LibraryThing was significantly better than Goodreads.)
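
One way such a comparison could be done is a paired bootstrap: resample the books with replacement, recompute both correlations on each resample, and look at the spread of the difference. The sketch below is only that, a sketch under stated assumptions. The data are made up (not the ratings from this post), the function names are hypothetical, and a bootstrap with a few dozen books gives only a rough answer.

```python
import random
from math import sqrt

# Made-up illustrative numbers, NOT the data from this post.
my_rating = [3, 2, 4, 3, 5, 3, 4, 5, 2, 4]
goodreads = [3.4, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 3.5, 3.9]
librarything = [3.5, 3.3, 3.9, 3.6, 4.2, 3.7, 4.0, 4.3, 3.4, 4.0]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def bootstrap_diff_ci(mine, site_a, site_b, n_boot=2000, seed=0):
    """Approximate 95% percentile interval for r(mine, site_b) - r(mine, site_a).

    The same resampled books are used for both sites, because both sets of
    site ratings are compared against the same personal ratings.
    """
    rng = random.Random(seed)
    n = len(mine)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        m = [mine[i] for i in idx]
        a = [site_a[i] for i in idx]
        b = [site_b[i] for i in idx]
        try:
            diffs.append(pearson_r(m, b) - pearson_r(m, a))
        except ZeroDivisionError:
            # A resample with no variation has no defined correlation; skip it.
            continue
    diffs.sort()
    lo = diffs[n_boot * 25 // 1000]
    hi = diffs[n_boot * 975 // 1000 - 1]
    return lo, hi


lo, hi = bootstrap_diff_ci(my_rating, goodreads, librarything)
print(f"95% interval for the difference in r: [{lo:.2f}, {hi:.2f}]")
```

If the interval comfortably excludes zero, that would suggest the second site really does track your ratings better; if it straddles zero, the apparent advantage could easily be chance.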

So, to conclude, Goodreads isn’t bad, but LibraryThing might be better (for me anyway).

* In the course of writing this post, I came across a Goodreads list of books with ratings over 4.5. At first glance, two types of book were highly represented at the top of this list.

  1. Books in series. My guess is that if you aren’t that excited by Robert Jordan, you probably aren’t going to make it to book #14. Heck, Robert Jordan himself didn’t make it to book 14.
  2. Calvin and Hobbes. Everyone loves Calvin and Hobbes.

 

Responses

  1. Ratings are useless for me; I just ignore them on Goodreads.

    • Yes, me too. I rate almost everything 4 stars because I mostly really like what I read, since I’m good at choosing books I’ll like. I reserve 5 stars for really amazing books like James Joyce’s Ulysses so – ahem – I think it’s a bit odd when I see limp romances rated 5 stars.
      But I enjoy the conversations on GR, and I love sticky-beaking at what my friends are reading (yes, Stu, that’s you!)

      • Lisa, that’s an interesting approach. It’s likely that you would see less correlation if your ratings don’t vary much. When I was doing this, I did try to make sure that I was giving ratings across the spectrum.

        There are absolutely a lot of other benefits to GR! They’re just harder to quantify…

    • Stu, that’s what I thought initially too. But it turns out that for me anyway the ratings do actually tell me something about what I will think of the book. I really don’t think you can tell without actually looking at the data.

  2. For me as long as the rating is not alarmingly low, I wouldn’t care too much about it. Most books on my to-read list on goodreads have ratings of 3.3 and above anyway. Just put Ali and Nino on my list.
    ps: would love to connect on goodreads with you https://www.goodreads.com/user/show/238527-dioni-mee

    • Found you! I’m Biblioglobal on Goodreads, same as here.

      I hope you enjoy Ali and Nino. I’ll be interested to see whether you think it is under-rated or not!

  3. I somewhat rely on Goodreads and Amazon ratings to choose my books. The book conversation as well.

    • The book conversation is definitely useful too! Sometimes I know that what people say they don’t like about a book is not something that will bother me. I think ‘somewhat rely’ is the approach I will use in the future- take the average rating into account, but also a lot of other factors.

  4. I use the Amazon ratings. For books that are around or below the 3.5 level, I go straight to the 1 star reviews. These help me make a determination. Sometimes people are upset because their book arrived late, which is obviously a poor reason to give the book a 1 star rating. Some people have other issues with the book which I wouldn’t necessarily take issue with. So basically, if I determine that the reason they gave 1 star is not something I would give 1 star for, then I go ahead with the purchase.

    • Sounds like a sensible approach to me!

  5. Wow, I love that you did so much research on this! I am not very statistically minded (yeah, I’m an engineer, go figure that out). But this is just brilliant! I don’t look at GR ratings unless it is a new-to-me author/book and I have heard nothing about it. I don’t look at LT or Amazon ratings for books. I do use Amazon ratings heavily for shopping.

    • It’s always fun to have data to play with! For me, that’s the best part of doing science. Plus the fact that sometimes the results are unexpected…

  6. Wow, this is really interesting. I just moved out of Goodreads but I’ve recently signed up for Leafmarks and LibraryThing. I usually don’t bother with the average ratings but one can’t help sometimes feeling intrigued by 4+ average ratings of obscure books. Also, LibraryThing has a reputation for having keener users, so this is probably one reason your opinions are closer to it. 😉

    • I’m feeling like those 4+ books might be worth paying attention to! (At least if they have a reasonable number of ratings and not just a handful.)

      I’m seriously considering giving LibraryThing a try. In addition to the ratings results, I really like the fact that it isn’t covered with ads. Are you happy with LibraryThing so far? Did you get a paid account?

      • I would have done a permanent move if it were free. I’m still trying it out to see if the paid account is worth it. Here are some of my observations:

        1. Cataloguing books is great and easy. You could edit any of the fields without having to be a librarian, which means the edits are just yours. Very important if you are obsessed with details and if you have rare editions.
        2. There’s a lot of info and statistics. I think you will totally enjoy the latter.
        3. The layout of the site is not too pretty. It feels a little cluttered and it looks like a forum.
        4. The social aspect is not very interactive. For some reason, I can’t see the people who like my reviews.
        5. The groups section needs some tweaking.

        If I get used to it, I think I would enjoy it there. There are many decent reviews and the lack of GIFs gives the reviews section a professional appeal. Long comment is long. 😀

    • Thanks for your long comment! I think the layout/appearance is what initially made LibraryThing seem less attractive to me, but that’s not something I find particularly important. I think I’ll play around with a free account for a bit.

  7. Ah, all those statistics courses being put to non traditional use 🙂

    • Yeah, I didn’t do a statistical comparison between Goodreads and LibraryThing to test whether LibraryThing is actually significantly better though 😦

  8. This is all really interesting. I think you’re right about Amazon, if someone buys a book there’s a higher chance they know they’ll like it (I think), but with Amazon, you’re really only going to get those that really like a book or don’t like a book unlike (I think) Goodreads or LibraryThing where readers are there to look at their own catalog/collection.

    This only partially makes sense in my head so hopefully it makes sense to you.

    • Makes sense to me. I have seen suggestions that Amazon ratings are more likely to be extreme values. It certainly seems plausible. Of course, one would have to look at the data to be sure!

  9. Well, this is very interesting. Like you, I am skeptical of my ability to summarize my thoughts on a book into a certain number of stars. Even when I give a rating, such as when I’m on Goodreads (which I don’t tend as carefully as I’d like), it feels arbitrary to me. I don’t know why I give a 4 instead of a 3.

    • For me, I think it depends a lot on the book. Looking back through my ratings, there are some I feel confident of and others that I have doubts about. Was this really a 4 star book? Or should it have been a 3 star book? Then there are books like The Bone People, where I thought 2/3 of the book was amazing and I hated the other 1/3 with a fiery passion. There’s no way to convey that in a single rating.

  10. I love this! This is the exact kind of thing I always want to do except that since I stopped being an academic biologist and started working in aquarium animal husbandry instead I’ve pretty much forgotten how to do stats. I’ve now read a lot of obscure books with one or two ratings on Goodreads if you want to use my data to test your theory. I suspect it may not pan out though because sometimes the people rating the obscure books know the author (or are possibly related to him/her) so you get a lot of really obscure books with a single five-star rating. I have also noticed that the “harder” a book is the lower its average goodreads rating (though this is my perception and I have not done any statistical analysis). So Harry Potter and the Hunger Games have higher averages than, say, Great Expectations and Anna Karenina.

    • I would love to see what the correlation looks like for someone else! The statistics part is easy once you have a spreadsheet with a column for the average rating and a column for your rating. Getting the list into a spreadsheet in the first place is a little more work. If you go to the list view of the books you’ve read in Goodreads and then select print, you can copy and paste most of the information into a spreadsheet (make sure average rating is one of the columns). However, your rating won’t transfer correctly because it is shown as a graphic rather than a number. So you have to manually go through and put your ratings in. If you make a spreadsheet using Google Sheets, you could share the link with me via a message on Goodreads and I would be happy to do the statistics part!

      One solution to the obscure books problem would be to set a minimum number of ratings that a book must have for you to include it. Then you would need to fill in another column for whether the book meets the threshold or not. Then I could run the stats with and without rare books.

      Aquarium animal husbandry sounds like a fun job!

      • Sorry I haven’t followed up on this comment (baby-induced time warp). This sounds like the kind of thing I would love to sink hours into, so I think I’m going to start on the spreadsheet. I’m going to set a couple of ratings thresholds (I think 10, 100, and 1000) to see if it makes any difference. Once it’s done I’ll share it with you but no worries if you don’t have time to do the stats, I won’t hold you to it (especially as I have no idea how long it will take me to actually finish the spreadsheet).

      • No worries, that’s perfectly understandable. And congratulations!

      • Haha, thanks. She’s 9 months old now, but I’m still using her as an excuse. My memory is definitely not what it was–don’t know if it’s hormones or sleep deprivation, but I’m still waiting for my cognitive faculties to return to pre-baby state. Anyway, I finished the spreadsheet and sent you the link on goodreads. This is fun!
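
The spreadsheet approach discussed in the comments above (one column for the site's average rating, one for your own rating, plus a minimum-number-of-ratings threshold to set aside very obscure books) could be sketched roughly like this in Python. Everything here is an assumption for illustration: the rows are made up, the column names `avg_rating`, `num_ratings` and `my_rating` are hypothetical, and a real export would be read from a file instead of an inline string.

```python
import csv
import io
from math import sqrt

# Made-up rows standing in for an exported spreadsheet; column names are hypothetical.
SHEET = """title,avg_rating,num_ratings,my_rating
Book A,3.41,15230,3
Book B,3.95,48210,4
Book C,4.20,2310,5
Book D,3.60,870,2
Book E,5.00,1,3
Book F,4.50,2,2
Book G,3.80,140,4
Book H,4.05,56,4
Book I,3.30,12,3
Book J,4.75,4,3
"""


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def load_rows(text):
    """Parse the CSV text into a list of dicts with numeric fields."""
    return [
        {
            "title": row["title"],
            "avg": float(row["avg_rating"]),
            "n": int(row["num_ratings"]),
            "mine": int(row["my_rating"]),
        }
        for row in csv.DictReader(io.StringIO(text))
    ]


def correlation_above(rows, min_ratings):
    """Return (books kept, r) using only books with at least min_ratings ratings."""
    kept = [r for r in rows if r["n"] >= min_ratings]
    if len(kept) < 3:
        return len(kept), None  # too few books for a meaningful correlation
    return len(kept), pearson_r([r["avg"] for r in kept], [r["mine"] for r in kept])


rows = load_rows(SHEET)
for threshold in (0, 10, 100, 1000):
    count, r = correlation_above(rows, threshold)
    label = "n/a" if r is None else f"{r:.2f}"
    print(f"min ratings {threshold:>4}: {count} books, r = {label}")
```

Running the correlation at several thresholds shows whether books with only a handful of ratings are dragging the relationship around. With this few books per group the numbers would be very noisy, so this only illustrates the mechanics.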

