What's In a Name?: An Analysis
Fans, scholars, and social critics are charting the relationship between traditionally published fiction, fan fiction, and the creators thereof. Henry Jenkins argues that “[r]ather than talking about media producers and consumers as occupying separate roles, we might now see them as participants who interact with each other according to a new set of rules that none of us fully understands” (3). While scholars understand this relationship in differing ways, Lesley Goodman argues that “fan interpretation privileges the coherence of the [published text’s] fictional universe while downplaying the authority of the text and insisting that the author is not dead, but a failure and a disappointment” (663). This is nowhere more true than in the Harry Potter fandom.
​
Fans have a complicated relationship with Harry Potter author J. K. Rowling, especially as she continues to give information about her fictional world outside the texts. Since the end of the book series in 2007, there has been a steadily growing frustration with how Rowling adds details to her universe outside the texts themselves, starting with the announcement that Dumbledore is gay directly after the final novel came out. This frustration erupted into a full-scale internet battle at the beginning of 2019, as memes and social media posts railed against Rowling’s continued additions: some fans dislike the new details, while others believe they should appear in the texts themselves rather than simply being tweeted. While fan complaints, especially those around Rowling, do seem to point to fans’ resistance against Rowling and her authority, fans continue to write within Rowling’s world. One would assume that the Harry Potter texts would thus influence fan fiction writers, but we wanted to explore what this relationship might actually be.
​
For this project, we use text mining to see how fan fiction writers are influenced by Rowling’s texts. We mine Rowling’s seven novels and 450 pieces of fan fiction published between 2009 and 2017 on Archive of Our Own (AO3--see here for more information on how these corpora were gathered), using an algorithm called Word2Vec to discover which words are closest to a target word (see more here on our process). For our target words, we have chosen "Harry", "Ron", and "Hermione". We are looking for trends in how the Golden Trio are discussed in Rowling’s works or the fan fiction they inspire.
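As a rough illustration of what "closest to a target word" means, the sketch below ranks words by cosine similarity, the measure commonly used to compare Word2Vec vectors. It does not train a model, and it is not the project's actual pipeline: the three-dimensional vectors, the `most_similar` helper, and the `topn` parameter are all invented for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(target, vectors, topn=200):
    """Return up to `topn` (word, score) pairs, most similar first."""
    target_vec = vectors[target]
    scores = [
        (word, cosine(target_vec, vec))
        for word, vec in vectors.items()
        if word != target  # a word is never listed as its own neighbor
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]

# Hypothetical toy vectors; a trained model would supply real ones.
toy_vectors = {
    "Harry": [0.9, 0.1, 0.3],
    "Ron": [0.8, 0.2, 0.4],
    "wand": [0.1, 0.9, 0.2],
    "ran": [0.2, 0.1, 0.9],
}

neighbors = most_similar("Harry", toy_vectors, topn=2)
for word, score in neighbors:
    print(word, round(score, 3))
```

With real data, the vectors would come from a model trained on each book or each year of fan fiction, and `topn` would be 200 to match the lists discussed here.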
Hypothesis and Findings
Going into this project, our hypothesis was:
​
Over time, Harry Potter fan fiction will look less and less like the Harry Potter novels.
​
We assumed that, since fans have had increasingly negative reactions to Rowling, their fan fiction would become less similar to her work. As we began to focus on the similarity between words, we found that, at the most basic level:
​
Over time, this particular corpus of Harry Potter fan fiction began to look less and less like...anything.
​
As you can see in the graphs below (for more on how to read these graphs, see here), the average proximity level for the top 200 most similar words for each of "Harry", "Ron", and "Hermione" in the Harry Potter novels, written by a single author, always remains above .999, even in the fifth book, which takes a rather sizable dip.
The average for a year of fan fiction, on the other hand, never gets above .74 ("Ron" and "Hermione" in 2009), and steadily decreases to a low of .38 ("Ron" in 2017). In other words, unsurprisingly, a single author has clearer trends than many authors.
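One way to read "average proximity level" is as the mean similarity score across a character's top 200 neighbors. The sketch below computes that mean under this assumption; the function name `average_top_similarity` and the toy scores are hypothetical, and the project may have aggregated its scores differently.

```python
def average_top_similarity(scored_words, topn=200):
    """Mean similarity score over the `topn` highest-scoring words."""
    top = sorted(scored_words, key=lambda pair: pair[1], reverse=True)[:topn]
    if not top:
        raise ValueError("no scored words to average")
    return sum(score for _, score in top) / len(top)

# Hypothetical (word, similarity) pairs for one character in one year.
toy_scores = [("Ron", 0.9), ("Hermione", 0.7), ("door", 0.5), ("quickly", 0.1)]

# Averaging only the two closest words: (0.9 + 0.7) / 2.
top_two_average = average_top_similarity(toy_scores, topn=2)
print(round(top_two_average, 2))
```

A value near 1 means the closest words are used almost interchangeably with the target; lower values mean even the nearest neighbors are used quite differently.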
However, what this data does suggest is that, over the first nine years of AO3, the fan fiction became more diverse, with voices talking about these characters in increasingly different ways. Scholars, such as Abigail De Kosnik, have noted the increased variety of fan fiction as more and more people take to sites like AO3, but this suggests that not only are there more writers but that their material is measurably different. They are not all saying the same thing or talking about these characters in the same way. As more voices come into this conversation, they are not trying to mimic one person (such as Rowling) but rather adding their own unique voice to the growing world of Harry Potter.
​
With all this diversity, there are still a few trends we found, specifically focused around names. Here, we lay out three clear trends we can see between the books and our fan fiction corpora:
1. Names as a Sign of Humanity
One of the most interesting things we found is how often the character names we chose to study related to other character names in general. Our Word2Vec program measures how words (in this case, "Harry", "Ron", and "Hermione") are used in similar ways to other words. To avoid confusion, we will use “characters” to talk about our chosen focus words of "Harry", "Ron", and "Hermione" and “names” to reference the words found on their lists. One would assume that character names would act most similarly to other character names since they function as the same part of speech. Yet, in the first book of the Harry Potter series, only 8% of the top 200 most similar words were names, while nouns formed a much larger proportion of those lists, as you can see in these pie charts:
What we found interesting about the low percentage of names in the top 200 most-similar-word lists is the suggestion that these characters don’t act like other characters. They operate more like things than they do like people, suggesting an objectification particularly unusual for protagonists.
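The percentage of names in a list can be sketched as a simple count against a set of known character names. The set `known_names`, the helper `name_percentage`, and the ten-word toy list are assumptions for illustration; this page does not say how the names in each real list were identified.

```python
def name_percentage(similar_words, known_names):
    """Percentage of words in the list that are character names."""
    if not similar_words:
        raise ValueError("the list of similar words is empty")
    hits = sum(1 for word in similar_words if word in known_names)
    return 100 * hits / len(similar_words)

# Hypothetical name set and a shortened stand-in for a top-200 list.
known_names = {"Ron", "Hermione", "Hagrid", "Malfoy"}
toy_list = [
    "Ron", "wand", "door", "Hagrid", "looked",
    "table", "cloak", "Hermione", "broom", "quickly",
]

share = name_percentage(toy_list, known_names)
print(share)  # 3 names out of 10 words
```

On a full 200-word list, the 8% reported for the first book would correspond to 16 names.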
​
That being said, the percentage of names in the top 200 most similar words rose steadily throughout the series, especially after the third book, as can be seen in this graph:
The growth in name similarity matches the subtle switch in the Harry Potter series from children’s literature to Young Adult (YA) literature, as Harry and his friends moved from childhood into their teen years, and the series became decidedly darker. Roberta Trites argues that one of the major differences between children’s literature and YA is the shift from understanding the power of the individual to the social connections that control our world:
​
“In books that younger children read...much of the action focuses on one child who learns to feel more secure in the confines of her or his immediate environment, usually represented by family and home. Children’s literature often affirms the child’s sense of Self and her or his personal power...But in the adolescent novel, protagonists must learn about the social forces that have made them what they are. They learn to negotiate the levels of power that exist in the myriad social institutions within which they must function, including family; school; the church; government; social constructions of sexuality, gender, race, class; and cultural mores surrounding death.” (3)
​
In other words, children’s literature has a focus on the self and how that self is separate from the rest of the social world; YA has a focus on how characters fit into and must navigate that social world. YA thus depicts personal agency, but an agency that must be negotiated through other people, communities, and social forces. We suggest that the shift to YA explains why our Golden Trio becomes steadily more similar to other names. They are being placed more and more within their community, relating to others as they become agents in their own right who are comparable to other agents. In other words, instead of being seen as objects, which the heavy presence of nouns suggests may be the case for the younger Harry, Ron, and Hermione, they are now seen as people, much like the other people who populate the series.
​
When we turn to the fan fiction data, names continue to become larger and larger portions of the top 200 most similar words. Here are the percentages of names within the top 200 most similar words from the books again, on a reformatted chart that matches the scope of the chart for the fan fiction beside it:
As you may notice, the fan fiction, across the board, has a higher percentage of names in each character’s top 200 most-similar-words lists than the book lists, continuing to rise over time with a spike at 2013. There are a few different reasons why this may be, specifically the volume of characters that exists in fan fiction that might not exist in the Harry Potter novels. Regardless, this data suggests that fan fiction writers treat Harry, Ron, and Hermione more like other characters than they do like objects, verbs, adjectives, etc. For fan-fiction writers, these characters are people, just like all the other characters that inhabit their stories. They act like people and not like things or actions. This may suggest that these fan-fiction writers fundamentally understand their characters, who they are and even what they are, differently than Rowling did.
2. Which Name?
Within the rising percentage of names, we found one fascinating trend regarding whether first or last names appeared in the data sets. For example, “Malfoy” appears in most of the book datasets for Harry, Ron, and Hermione, as do “Potter”, “Weasley”, and “Snape”:
There are two things of note with these names:
​
First, look at how Potter and Weasley rank on Harry’s and Ron’s charts respectively:
For both, their last name ranks fairly high, between 21 and 92 for Harry and between 29 and 106 for Ron. In other words, their last names do not operate the same way their first names do, as we can see in that their last names are outranked by many other names and words. To be fair, Harry’s parents are discussed often and Ron’s whole family are a large part of the books, especially later on, perhaps skewing how these last names work.
​
Yet, if we turn to a character such as Hermione, whose family is almost never mentioned, we find her last name appearing in the top 200 most similar words in the books only once (in Order of the Phoenix at rank 97). Though, to be fair, Granger may simply be used less often because most characters call girls in Harry Potter by their first, rather than last, names. There are a few exceptions, such as the ever-delightful Draco, who often calls Hermione “Granger”. The fact that a character who is the main, if not sole, referent of her last name still shows little similarity with it suggests that there is not a clear connection between a character and their last name. In effect, Harry and Potter are two different characters; they act differently, they are described differently, they are addressed differently.
​
Rowling actually points to this difference between first and last names briefly in Half-Blood Prince. When Hagrid is upset with Harry, Ron, and Hermione, he calls Harry “Potter”, to which Harry asks “Since when have you called me ‘Potter’?” (228). Snape almost always calls Harry “Potter” as does Draco Malfoy, but those close to him call him “Harry”. The two names functionally operate differently as Potter is used scornfully or at least apathetically while Harry is used in camaraderie.
​
The second thing of note is that these last names almost never appear in the top 200 most similar words for these three characters in the fan fiction (except for Snape in Harry's lists). Whereas Granger rarely ranked similarly to Hermione in the books, in the fan fiction the boys’ last names also disappear from their most similar words, suggesting that the second character, the one based in a last name used only by acquaintances or enemies, disappears from this fan fiction. At the same time, “Draco” and “Severus” begin to appear, especially for Harry:
Whereas Draco appears only once in the top 200 most similar words in the books for Harry (in Half-Blood Prince at rank 126), and Severus never appears, they both become prominent names in Harry’s proximity in fan fiction. In fact, while Snape still appears fairly consistently, both Draco and Severus have a higher similarity index and higher rank than Snape. In other words, as Harry’s and Ron’s second, less-familiar character disappears, Draco and Severus become more familiar, suggesting a shift from enemies to allies, friends, or lovers (depending on the fan fiction).
3. Variable Harry
A final trend became apparent in both building the graphs for the earlier arguments and in constructing the data and graphs for the interactive graphs page: the change from books to fan fiction works differently for Harry than it does for Ron and Hermione. We can see this in two ways.
​
First, in the books, Harry, Ron, and Hermione have approximately the same percentage of names in their top 200 most similar words. In one book Harry might have slightly more, in another Ron, but they stay fairly consistent:
But, if we look at the percentage of names by year for the fan fiction, Harry has a significantly lower percentage of names, sometimes only half the percentage of Ron and Hermione, in every year except 2016:
While Harry’s percentage of names does grow, as discussed in the first finding, it does so at a significantly slower rate than the other two and remains lower than theirs in nearly every year.
​
The second way we can see the difference between Harry and Ron and Hermione is in which names consistently reappear across the years of fan fiction. As we explain in full here, we created two lists of names for each character: 1) those who appeared in at least four of the top 200 most-similar-word lists from the books for that chosen character; 2) those who appeared in at least five of the top 200 most-similar-word lists from the fan fiction for that chosen character. In the books, the same fourteen names met the criteria for all three characters: Dumbledore, Fred, Fudge, Ginny, Hagrid, Lupin, Malfoy, Neville, Percy, Potter, Snape, Weasley, and the other two characters whom we studied (e.g., Harry had Ron and Hermione).
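The "appeared in at least four (or five) lists" criterion can be sketched as a count of how many lists each name occurs in. The helper `recurring_names`, its `min_lists` parameter, and the three short lists below are hypothetical; the real input would be the names extracted from each book's or each year's top 200 list.

```python
from collections import Counter

def recurring_names(lists_of_names, min_lists):
    """Names that occur in at least `min_lists` of the given lists."""
    counts = Counter()
    for names in lists_of_names:
        # set() ensures a name counts once per list, even if repeated in it
        counts.update(set(names))
    return sorted(name for name, n in counts.items() if n >= min_lists)

# Hypothetical name lists for one character across three books or years.
toy_lists = [
    ["Ron", "Hagrid", "Malfoy"],
    ["Ron", "Malfoy", "Neville"],
    ["Ron", "Ginny", "Hagrid"],
]

consistent = recurring_names(toy_lists, min_lists=2)
print(consistent)
```

For the books the threshold would be `min_lists=4` across seven lists, and for the fan fiction `min_lists=5` across nine yearly lists.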
​
The fan fiction lists, on the other hand, looked very different. Ron and Hermione had 34 and 32 names that met the criteria, respectively, while Harry had 17, half of Ron’s consistently appearing names. In other words, not only did Harry have a lower percentage of names overall, he also had a significantly lower number of repeating names.
​
We argue that these two patterns suggest that Harry, in this fan fiction, is a more flexible character for fan-fiction writers than Ron or Hermione. One could argue that these two patterns explain each other: Harry has fewer names in his lists overall, so of course he has fewer consistently appearing names. However, the data suggests that Harry is similar to a higher percentage of unique characters than Ron or Hermione, no matter how many names appear on his list. While Ron and Hermione have more names appear in their top 200 most-similar-word lists, many of those names are repeats. Harry, on the other hand, has fewer repeats but a higher percentage of unique names. Specifically, if we remove all duplicate names (keeping one instance of each name mentioned) from all nine years of top 200 most-similar-word lists, Harry has 382 unique names, or 65.98% of his total number of names (579, with the duplicates). Hermione has more unique names, 397, but she has so many instances of names, 710 total, that her percentage is a lower 55.92%. Ron has only 381 unique names, even though he has 699 names total, meaning that only 54.51% of his names are unique. Put differently, even though Ron has 120 more instances of names in his lists, he has one fewer unique name than Harry, and Hermione has so many repeated names that it lowers the percentage of unique names. This means that while names might not appear often in Harry’s lists, those that do are more diverse.
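The percentages above follow directly from the counts given in the text, as this short check shows. The counts are the ones reported here; the helper `unique_share` and the dictionary layout are introduced only for this illustration.

```python
def unique_share(unique_count, total_count):
    """Unique names as a percentage of all name instances, to 2 decimals."""
    return round(100 * unique_count / total_count, 2)

# (unique names, total name instances) as reported in the text above.
reported_counts = {
    "Harry": (382, 579),
    "Hermione": (397, 710),
    "Ron": (381, 699),
}

shares = {
    character: unique_share(unique, total)
    for character, (unique, total) in reported_counts.items()
}
for character, share in shares.items():
    print(character, share)
```

Starting from raw lists instead of counts, the same figure would be `len(set(names))` divided by `len(names)` for each character's combined nine-year list.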
​
In short, Harry interacts with and is similar to more unique characters in the fan fiction than Ron and Hermione are. This is perhaps unsurprising; Harry, as the main character, appears in more situations with more characters than Ron and Hermione, who, while central to the novels, are not the protagonists. Harry is thus seen as more flexible in these fan fictions than Ron or Hermione.
Conclusion
We have only begun to scratch the surface of this data. We have found interesting patterns while compiling this site and attempting to make it accessible to others. But there’s still so much to find. So, we have created a page on which you can play with our data and see what you find. We believe that people outside the scholarly community, especially the fans who inspired and were the basis of this work, should be able to access this work. Brittany Kelley argues that scholars “have a responsibility to enact an ethics of goodwill that balances the concerns of fans with those of scholarly development.” Fans should be able to see how their writing and their communities work and should be able to contribute to the scholarly conversation about them. The comparison between published author and fan-fiction author leaves much to be explored, and our hope in creating this site is to open up this conversation to fan and scholar alike.