If we know a reader’s genre preference, what can we infer about their reading behavior? Let’s take a look at data from Goodreads to find out.
Goodreads is a social website that allows users to rate and review the books they’ve read.
You can find the code for this analysis on GitHub. The file reader_and_genre_preference.ipynb contains a write-up with code.
The Data
The dataset consists of 1,065 users with 557,654 ratings between them. The most prolific user rated 5,492 books; the least active rated 74. The mean number of ratings per user is 523, with a standard deviation of 538. The users were sampled from raters of 14 titles across seven genres. For each genre, one popular and one moderately popular title was selected, with popularity determined by the number of ratings on the Goodreads website as of January 3, 2018. (See the titles in the Notes section below.) Users with fewer than 70 ratings were discarded.
Determining the genres of rated books was a bottleneck in the data collection process, so instead of looking at every book a user rated, 145 books were sampled from the ratings of each user. If a user rated fewer than 145 books, all of their books were included. Nonfiction titles and titles of indeterminate genre were discarded, leaving a pool of 138,516 fiction titles. The mean sample size of fiction titles per user was 130 with a standard deviation of 22. The average standard error was 0.023 with an average standard deviation of 0.013.
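The per-user sampling step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name `sample_user_books`, the `seed` parameter, and the use of simple random sampling without replacement are all assumptions.

```python
import random

SAMPLE_SIZE = 145  # per-user cap stated in the text


def sample_user_books(rated_books, k=SAMPLE_SIZE, seed=None):
    """Return up to k books drawn without replacement from one user's ratings.

    Hypothetical helper: the original write-up does not say how the
    sample was drawn, so uniform random sampling is assumed here.
    """
    books = list(rated_books)
    if len(books) <= k:
        # Users with k or fewer rated books keep all of them.
        return books
    rng = random.Random(seed)
    return rng.sample(books, k)


# Usage: a user with 500 ratings is cut down to 145 books.
sampled = sample_user_books(range(500), seed=0)
print(len(sampled))
```

Genre filtering (dropping nonfiction and indeterminate titles) would happen after this step, which is why the mean fiction sample per user ends up below 145.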
Genre | Number of Readers |
---|---|
General Fiction | 289 |
Romance | 160 |
Fantasy | 236 |
Science Fiction | 67 |
Horror | 17 |
Mystery / Suspense | 125 |
Young Adult | 171 |
How Strong is Genre Preference?
To answer this, let’s delve into the data. Let’s separate readers by their preferred genres and see what they’re reading as a group.
Chi-square goodness-of-fit results
Genre | Result | Genre | Result |
---|---|---|---|
Gen Fiction: | χ²(6) = 126.97, p < 0.001 | Romance: | χ²(6) = 97.95, p < 0.001 |
Fantasy: | χ²(6) = 107.72, p < 0.001 | Sci-Fi: | χ²(6) = 125.10, p < 0.001 |
Horror: | χ²(6) = 51.45, p < 0.001 | Mystery: | χ²(6) = 74.65, p < 0.001 |
YA: | χ²(6) = 124.87, p < 0.001 | | |
A chi-square test of goodness-of-fit was performed for each group to determine whether the genres were equally preferred. The results above suggest that readers within groups showed significant preference for specific genres.
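For readers who want to see the mechanics, here is a minimal sketch of a chi-square goodness-of-fit test against equal expected counts, using only the standard library. The observed counts are made up for illustration and are not taken from the dataset, and the function names are hypothetical. The p-value helper uses a closed form that only holds for even degrees of freedom (seven genres give df = 6); in practice a library routine such as `scipy.stats.chisquare` would be the usual choice.

```python
import math


def chi_square_gof(observed):
    """Chi-square statistic and degrees of freedom, assuming every
    category is equally likely under the null hypothesis."""
    total = sum(observed)
    expected = total / len(observed)
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, len(observed) - 1


def chi2_sf_even_df(stat, df):
    """Upper-tail probability of the chi-square distribution.

    Uses the closed form exp(-x/2) * sum_{i<df/2} (x/2)^i / i!,
    which is valid only when df is even.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = stat / 2
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


# Hypothetical counts of books read per genre by one group of readers.
observed = [50, 10, 12, 8, 5, 9, 6]
stat, df = chi_square_gof(observed)
p_value = chi2_sf_even_df(stat, df)
print(f"chi2({df}) = {stat:.2f}, p = {p_value:.3g}")
```

A small p-value here only says the genres are not read in equal proportions; it does not say which genre drives the difference.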
Most of these groups look pretty similar. Roughly 50% of the titles they read are within their preferred genre and they read a smattering of everything else. Romance readers are notable for their loyalty; roughly 65% of the books read in that group are romance. It’s interesting that romance readers seem to turn to fantasy and young adult when reading outside of the genre. In a future analysis, I’ll dive into subgenres. Is romance a subgenre of the fantasy and young adult read by this group?
Romance readers don’t seem to read much horror or sci-fi, and horror and sci-fi readers reciprocate the disinterest. What does that mean for horror and sci-fi books with strong romantic elements? Do they appeal to romance readers?
It’s worth noting that fantasy seems to be a second favorite of every group. Mystery readers show a preference for fiction and fiction readers give a small edge to mystery, but even those groups seem to enjoy fantasy. Do they enjoy the same types of fantasy? That’s certainly worth more digging.
Young adult fares well outside of its core readership. But I think it’s been known since the Harry Potter craze that these books aren’t exclusively for kids.
Let’s take a look at the popularity of each genre outside of its group.
Super Loyalists
So far, we’ve looked at behavior averaged over groups. Do the patterns hold when we look at the members of each group who are most loyal to their genre? Let’s look at the top 25% most genre-loyal readers.
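One way to pick out the top 25% could look like the sketch below. It assumes that loyalty is measured as the share of a reader's sampled books that fall in their preferred genre and that the cut is made within each genre group; the write-up does not spell out either detail, and the names `top_loyalists` and `loyalty_by_user` are hypothetical.

```python
import math


def top_loyalists(loyalty_by_user, fraction=0.25):
    """Return the users with the highest loyalty scores.

    loyalty_by_user maps a user id to the share (0..1) of that user's
    sampled books that belong to their preferred genre (an assumed
    definition of loyalty).
    """
    ranked = sorted(loyalty_by_user.items(), key=lambda kv: kv[1], reverse=True)
    # Round up so that small groups still contribute at least one reader.
    n_keep = max(1, math.ceil(len(ranked) * fraction))
    return [user for user, _ in ranked[:n_keep]]


# Usage with made-up loyalty shares for four romance readers.
romance_group = {"a": 0.9, "b": 0.5, "c": 0.7, "d": 0.3}
print(top_loyalists(romance_group))
```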
Again, romance readers stand out. Unlike every other group, their loyalists read almost nothing else! Young adult edges out fantasy among these readers, but both numbers are tiny. The loyalists of other genres look a lot like their less loyal counterparts, just more loyal.
What’s Missing?
We’ve looked at the behavior of readers grouped by preferred genre. We’ve even looked at the behavior of the most genre-loyal readers. Even setting aside the fact that we’ve raised more questions than we’ve answered, something is missing here. We’ve ignored a key part of our data that might further illuminate reading behavior. Have you figured it out? Right. Exactly. We didn’t look at ratings. How do groups of readers rate books outside of their preferred genre? Will we find that horror novel ratings amongst romance readers are stellar, suggesting that romance readers are extra picky about reading horror and only pick up ones they’re virtually guaranteed to like? Or will we find that horror novel ratings amongst romance readers are abysmal, suggesting that each horror novel read is a cautionary tale not to read another? I can’t wait to find out.
Coming soon…
Notes
Determining a Book’s Genre
Goodreads allows users to place books on named shelves such as Romance, Fantasy, and Urban Fantasy. I determined the genre of each book based on the shelves it was placed on and the number of users who placed it there.
The following keywords took first precedence:
Nonfiction, Comics, Graphic Novels, Childrens
If a book appeared on any of these shelves, it was excluded.
The following keywords took second precedence:
Romance, Category Romance, Fantasy, Science Fiction, Horror, Mystery, Young Adult
If a book was placed on one of these shelves, it was assigned the corresponding genre. If the book was placed on more than one of these shelves, the shelf with the greatest number of users took precedence (e.g. if 500 users place it in Romance and 1,000 users place it in Mystery, the book is assigned to Mystery).
The following keywords took third precedence:
Fiction, Classics, Literary Fiction, Womens Fiction
Any book placed on one of these shelves without an accompanying genre was categorized as General Fiction.
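The three precedence levels above can be sketched as a single function. This is an illustration under stated assumptions rather than the notebook's code: shelf names are matched exactly (the original may have matched keywords more loosely), Category Romance is folded into Romance, and books that match nothing return `None` to mark them as indeterminate. The name `assign_genre` and the input shape (a dict of shelf name to user count) are hypothetical.

```python
EXCLUDED_SHELVES = {"Nonfiction", "Comics", "Graphic Novels", "Childrens"}
GENRE_SHELVES = {
    "Romance", "Category Romance", "Fantasy", "Science Fiction",
    "Horror", "Mystery", "Young Adult",
}
GENERAL_SHELVES = {"Fiction", "Classics", "Literary Fiction", "Womens Fiction"}


def assign_genre(shelves):
    """Assign a genre from a mapping of shelf name -> number of users.

    Returns None when the book is excluded or its genre is indeterminate.
    """
    # First precedence: any excluded shelf removes the book.
    if any(name in EXCLUDED_SHELVES for name in shelves):
        return None

    # Second precedence: the genre shelf with the most users wins.
    genre_counts = {n: c for n, c in shelves.items() if n in GENRE_SHELVES}
    if genre_counts:
        top = max(genre_counts, key=genre_counts.get)
        # Assumption: Category Romance counts as Romance.
        return "Romance" if top == "Category Romance" else top

    # Third precedence: general shelves with no accompanying genre.
    if any(name in GENERAL_SHELVES for name in shelves):
        return "General Fiction"

    return None


# Usage: the example from the text, 500 Romance users vs 1,000 Mystery users.
print(assign_genre({"Romance": 500, "Mystery": 1000}))
```

Ties between genre shelves are not addressed in the write-up; in this sketch `max` simply keeps the first shelf it encounters with the top count.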
Sampling Users
Below are the titles from which the users were sampled with number of ratings on January 3, 2018:
Title | Genre | Number of Ratings |
---|---|---|
Gone Girl | Mystery | 1,696,469 |
Never Let You Go | Mystery | 9,742 |
The Magician's Land | Fantasy | 46,812 |
The Stone Sky | Fantasy | 11,141 |
Ready Player One | Science Fiction | 481,302 |
Apex | Science Fiction | 4,893 |
Allegiant | Young Adult | 683,200 |
Little Monsters | Young Adult | 2,351 |
Vision in White | Romance | 110,358 |
Temporary | Romance | 1,220 |
Freedom | Fiction | 133,538 |
My Name is Lucy Barton | Fiction | 73,632 |
Sleeping Beauties | Horror | 21,094 |
Into the Drowning Deep | Horror | 1,835 |