February 23, 2018
New research highlights changing gender roles and declining representation in English-language fiction
by Simon Reichley
As Kat Eschner wrote last week in Smithsonian, a trio of researchers trained in English literature and information sciences have published an article in the Journal of Cultural Analytics that provides new insight into the historical development of authorship and gender roles in English-language fiction over the last several hundred years.
The research project was conducted by Ted Underwood at the University of Illinois and David Bamman of UC Berkeley, with assistance from Sabrina Lee, a graduate student at Illinois. To better understand the historical development and significance of gender representations in fiction, the researchers examined metadata harvested from 104,000 novels published between 1803 and 2009. As Eschner points out, no human could read this many books in a lifetime, and so the team turned to an algorithmic method developed by Underwood and Bamman, called BookNLP.
This technology allows researchers to scan texts, identify and group characters by name, assign them genders based on honorifics and other conventions, and then identify words, actions, and things associated with each character. By subjecting all 104,000 novels to this procedure, and then clustering the resulting data by year, the team was able to compile a detailed historical model of what kinds of characters appeared in fiction over time, how those characters were represented, and what they did.
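To make the procedure above concrete, here is a minimal sketch in Python of two of its steps: guessing a character's gender from an honorific, and grouping associated words by character. This is not BookNLP's actual code or API. The honorific lists and the function names `infer_gender` and `group_by_character` are hypothetical, and the sketch assumes the (character, word) pairs have already been extracted from the text.

```python
from collections import defaultdict

# Hypothetical honorific lists, for illustration only (not BookNLP's rules).
FEMININE = {"mrs", "miss", "ms", "lady", "madame"}
MASCULINE = {"mr", "sir", "lord", "master"}


def infer_gender(name):
    """Guess a gender label from a leading honorific, else 'unknown'."""
    parts = name.split()
    if not parts:
        return "unknown"
    first = parts[0].rstrip(".").lower()
    if first in FEMININE:
        return "f"
    if first in MASCULINE:
        return "m"
    return "unknown"


def group_by_character(mentions):
    """Collect the words attached to each character name.

    `mentions` is a list of (character_name, word) pairs, assumed to
    have been produced by an earlier text-processing step.
    """
    words_by_name = defaultdict(list)
    for name, word in mentions:
        words_by_name[name].append(word)
    return {
        name: {"gender": infer_gender(name), "words": words}
        for name, words in words_by_name.items()
    }


# Invented example data, not drawn from the study.
mentions = [
    ("Mrs. Dalloway", "bought"),
    ("Mr. Darcy", "bowed"),
    ("Mrs. Dalloway", "flowers"),
    ("Ishmael", "sailed"),
]
profiles = group_by_character(mentions)
```

A character with no honorific, like "Ishmael" here, stays "unknown" in this sketch; the "other conventions" mentioned above are what a real system would rely on for such cases.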
The researchers identified two somewhat contradictory trends. First, “gender divisions between characters have become less sharply marked over the last 170 years,” which is to say that our representations of men and women have become less rigid, less confining. Second, “there is an eye-opening, under-discussed decline in the proportion of fiction actually written by women,” accompanied by a marked decline in the number of women and girls represented in fiction. As the authors put it, “While gender roles were becoming more flexible, the space actually allotted to (real, and fictional) women on the shelves of libraries was contracting sharply.”
This tension is most apparent in the period between 1850 and 1950; in only a hundred years, women’s share of authorship falls from nearly fifty percent to less than twenty-five. As the authors note, this staggering drop in representation is seldom mentioned in, say, introductions to popular college anthologies, or in other discussions of literature from that period.
The other important claim that the authors make is backed by a slightly more technical methodology, enabled by machine learning.
The brief explanation is that Underwood, Bamman, and Lee (henceforth Unbamlee) took their identified characters, tagged each as a man or woman, and then presented them to an algorithm, along with a cloud of descriptions, actions, and objects associated with each character. They then asked the algorithm to form a model of masculine and feminine characterization, and to predict whether a character it hadn’t yet seen was a man or a woman. The algorithm was then graded on its accuracy, which the researchers tracked over time.
According to the researchers, “If the model turns out to be accurate, we can conclude that practices of characterization were powerfully organized by a binary conception of gender… If the model becomes less accurate, we can conclude that gender was becoming a less pervasive organizing structure — or at least, that gender was being expressed in ways that no longer aligned with the binary division between he and she.”
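For readers who want to see the shape of that experiment, here is a minimal Python sketch: train a classifier on characters labeled by gender, then measure how often it guesses correctly on characters it has not seen. This post does not say which classifier the researchers used, so a small naive Bayes model is swapped in here purely for brevity. The function names and the toy data are invented for illustration.

```python
import math
from collections import Counter

LABELS = ("f", "m")


def train(examples):
    """Fit a naive Bayes model on (words, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for words, label in examples:
        word_counts[label].update(words)
        label_counts[label] += 1
    vocab = set(word_counts["f"]) | set(word_counts["m"])
    return word_counts, label_counts, vocab


def predict(model, words):
    """Return the label with the higher smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in LABELS:
        # Smoothed log prior plus add-one smoothed log likelihoods.
        score = math.log((label_counts[label] + 1) / (total + 2))
        denom = sum(word_counts[label].values()) + len(vocab) + 1
        for word in words:
            score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, held_out):
    """Share of unseen characters whose label the model guesses correctly."""
    correct = sum(predict(model, words) == label for words, label in held_out)
    return correct / len(held_out)


# Invented toy data: each character is a bag of associated words.
seen = [
    (["smiled", "dress", "wept"], "f"),
    (["dress", "kitchen"], "f"),
    (["rode", "pistol"], "m"),
    (["pistol", "shouted"], "m"),
]
unseen = [(["dress"], "f"), (["pistol"], "m")]
model = train(seen)
score = accuracy(model, unseen)
```

Repeating this for the characters of each period would give one accuracy score per period. Under the researchers' reading, a score falling toward 0.5 (a coin flip) would mean the words attached to characters carry less information about gender.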
And in fact the data does suggest that, over time, characterization became less “powerfully organized by a binary conception of gender.” While certain subsets of language become more strongly predictive as we move into the twentieth century (e.g. language describing the body and clothing), others become significantly less so (e.g. language describing “domestic space and subjectivity”). And while this, combined with the simultaneous decline in women’s authorship, might suggest that fiction by women is more generally rigid in its descriptions of gender, the reverse actually appears to be true. As the authors put it:
Gender differences seem to be drawn more starkly in stories written by men. It is not immediately clear why, although one might guess that the explanation is related to the underrepresentation of women in their imaginative worlds. The lone woman in a Western or detective story may tend to be depicted very strongly as “The Woman.”
Unbamlee (warned you) draw several interesting conclusions from their research. Many boil down to: “This is the beginning of an interesting line of inquiry, on which we are not prepared to comment further.” But one seems particularly insightful, and provides a welcome corrective to myopic tendencies that plague many disciplines:
If the proportion of novels written by women declined because men moved into a formerly feminine genre, while women were taking advantage of new opportunities elsewhere (for instance, writing nonfiction), then it would appear that genres themselves were becoming less strongly gendered. If that were true, it might make intuitive sense that gender roles inside fiction would be blurring at the same time. Indeed, it might not be a paradox at all that women writers left the genre where they had previously been segregated at the same time as gender roles were growing more flexible. This would seem paradoxical to literary critics only because we have a special professional interest in fiction, tempting us to view it as a barometer of equality generally — which it may not be at all.
Simon Reichley is the rights and operations manager at Melville House.