The following is an edited version of a talk, “Blake and Big Data”, given at the English Literature in the World: From Manuscript to Digital ¦ New Pathways conference at the University of Lisbon, 9 May, 2018. It is very much a work in progress, exploring some circumstances in which quantitative approaches to literary data may help us understand aspects of the reception of Blake’s works, in this case the history of references to Blake’s poem “And did those feet”, which was set to music as “Jerusalem” by Charles Hubert Parry in 1916. Originally, the talk was intended to cover a wider range of data sets I have started to accumulate with reference to William Blake (some of which would have more fully justified the epithet of “big data”, whatever that may be).
The stimulus for both the talk and this post has been the work I’ve undertaken over the past year on a history of the Blake-Parry hymn, stretching from Blake’s original composition of the stanzas included in the Preface to Milton a Poem to the EU Referendum in 2016, with a focus on the century since Parry set Blake’s words to music. While working on the book, I kept a spreadsheet with references collated from written texts and audio recordings in particular, eventually amassing a dataset comprising some 600 entries. The data collected offers a sufficient series of examples to make me think differently about ways of reading the hymn, and this post is intended as a preliminary working through of some of the theoretical issues surrounding the employment of digital techniques in the field of reception studies and digital humanities.
Any discussion of quantitative methods with regard to Blake’s work carries an intrinsic warning, for Blake himself admonished readers against an over-reliance on what he called “Druidical Mathematical Proportion of Length Bredth Highth” (Milton 4.27, E98). As we shall see later, an important reaction against recent statistical analyses has included what are often loosely dubbed “romantic” oppositions: more often than not this is intended as a derogatory term, but as a Romanticist I believe there are some valid criticisms against a reliance on quantitative methods (as opposed to, say, subjective phenomenological readings) that should always be borne in mind. My own use of statistical analyses is intended as a practical method that – in what are actually very limited circumstances – may help us build a picture of some aspects of the reception of Blake’s work. Blake scholars have relied on datasets for the best part of a century now: Geoffrey Keynes’s 1921 A Bibliography of William Blake included a list of Blake publications, which was then supplemented and superseded in 1969 by G. E. Bentley’s Blake Books and its various supplements in book form and as articles in Blake, An Illustrated Quarterly. Recently, I have been writing much more about settings of Blake to music, and Donald Fitch’s 1990 book, Blake Set to Music, has become an indispensable reference work.
The subtitle of the talk was “Literary data as a challenge to literary theory”, invoking a text that has long been important to my own reception work, Hans Robert Jauss’s essay “Literary History as a Challenge to Literary Theory” (the original German text of which was published in 1970 and then translated into English in 1982). Jauss was writing at a time when periodization of literature was (rightly) falling into decline, but his own approach – which overlapped with elements of what would become fashionably known as New Historicism, as well as the materialist techniques of figures such as Jürgen Habermas – was a significant step in reconsidering how an audience’s reception of literary texts changed as the “horizon of expectations” evolved over time. Jauss offers a particularly compelling example of this with regard to the diverging receptions of Ernest-Aimé Feydeau, who published his literary sensation Fanny in 1857, the same year as Gustave Flaubert’s Madame Bovary. As Jauss observes, Fanny went through thirteen editions in one year while Flaubert’s formal innovations initially found little success. Though Madame Bovary had few admirers at first, however, they were tenacious, passing on their passion for Flaubert to each new generation so that eventually it was Fanny which came to seem the outmoded novel.
Today, we have a fairly simple way to test Jauss’s hypothesis, which certainly seems correct on an intuitive level. Google’s Ngram Viewer, which as of 2015 had scanned more than 5 million texts, allows a rapid search of certain phrases. Entering the search terms Ernest Feydeau and Gustave Flaubert certainly seems to support Jauss’s explanation of audience reception of the two authors:
As can be seen above, during the 1860s and early 1870s, it is Feydeau who is referenced more, and yet from 1875 this situation reverses so that, some twenty years after the publication of Madame Bovary and Fanny, it is Flaubert who eclipses the reputation of his friend as Feydeau lapses into obscurity by the end of the century. It should be noted, however, that Jauss’s hypothesis requires a degree of refinement, particularly when compared to the data from the French corpus:
Jauss’s reading, which suggests a transformation of the horizon of expectations so that the bestseller Feydeau is overtaken by the formal experimenter Flaubert, does not seem to apply here: almost from the very beginning Flaubert appears to match Feydeau, although as in the English corpus there is an explosion of references from the mid-1870s onwards. It should be noted immediately that the above charts, which indicate references to both authors in various journals and books, are no indication of sales, and so this measure of popularity is not included. It is very likely that the trial of Flaubert and the publishers of La Revue de Paris, which serialised Madame Bovary, meant that there were many more references to the author than could be expected from the number of actual readers, but this is a hypothesis that is difficult to test and – something of a running feature throughout this blog post – indicates how cautious we must be when employing quantitative techniques.
An entirely non-cautious (and increasingly notorious) example of the appeal of Big Data came from Chris Anderson in 2008 in an article for Wired entitled, “The End of Theory”. In it he observed that:
At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later… Petabytes allow us to say: “Correlation is enough.”
Anderson, who frequently makes grandiose statements in order to attract attention, has been refuted carefully and methodically by scientific researchers such as Sabina Leonelli, who demonstrates how Big Data is almost inevitably a highly selected phenomenon with results drawn from social, political and economic factors, and Fulvio Mazzochi, who shows how petabytes of data enhance the testing of hypotheses rather than replace them.
This post, then, has no real intention of arguing that the end of theory is nigh after Anderson, although some of my work in recent years has been much more influenced by that of Franco Moretti, who made a particularly forceful argument for rethinking methodologies in the digital humanities nearly twenty years ago now in his spectacularly titled “The Slaughterhouse of Literature”:
But of course there is a problem here. Knowing two hundred novels is already difficult. Twenty thousand? How can we do it, what does knowledge mean in this new scenario? One thing for sure: it cannot mean the very close reading of very few texts – secularized theology, really (‘canon’!) – that has radiated from the cheerful town of New Haven over the whole field of literary studies. A larger literary history requires other skills: sampling; statistics; work with series, titles, concordances, incipits – and perhaps also the ‘trees’ that I discuss in this essay. (Reprinted in Distant Reading, 2013, p.208)
In Graphs, Maps, Trees, Moretti argues that the use of quantitative methods allows us, by viewing “fewer elements” (i.e. individual texts), to have a “sharper sense of their overall interconnection”. Actually, a fairly careful rereading of Graphs, Maps, Trees for this conference led me to have a greater appreciation for what are, actually, quite moderate claims by Moretti: unlike Anderson, he is not attempting to make grandiose claims for the end of literary theory but seeking to demonstrate some noticeable trends within literary history. That said, his use of evolutionary theory as a way “to think about very large systems” has led towards a degree of “scientism”, a false application of scientific method in the humanities where, frankly, it is harder to replicate and generalise data – even more so than in the social sciences. A more extreme version of this is, for me, to be found in the work of Joseph Carroll who, in papers such as “Three Scenarios for Literary Darwinism” (2010) seeks to excise the vagaries of postmodernism from literary theory.
The tendency towards scientism in the work of theorists such as Moretti has been cogently critiqued by Tom Eyers, who argues that the tendency towards neo-positivism in Moretti (and also Stephen Ramsay’s influential Reading Machines) results in an “uncritical positivism at the very moment that [it] affirms an apparently critical historicism.” I particularly like Eyers’ critique because he shows an awareness of many of the advantages of the digital humanities, whether preserving decaying archives or deploying new data mining techniques within scholarship, while distancing himself both from broadly neo-Romantic, uncritically aestheticist objections to digital humanities and the equally uncritical techno-evangelism. I do not necessarily subscribe to his adoption of Althusser as a model for a new “speculative” formalism that can synthesise history and form, but he makes many pertinent observations regarding Moretti’s process that have influenced my own thinking, most notably the warning against assuming a uniform model of literary consumption to generate data from distant reading. Individual subjectivity never disappears, and Moretti’s taboo against close reading has been especially unhelpful to my own analyses of “Jerusalem”, where it is precisely the phenomenological, individual, subjective interpretation of the text that has produced a significant bifurcation in the reception history of the text in terms of political reception by left and right.
Actually, my reservations regarding Moretti’s model stem less from what he does explicitly in works such as Graphs, Maps, Trees and Distant Reading than from the reductive tendency that emerges in so-called “literary Darwinism”. While a potentially contentious response towards this would, in my opinion, follow Deleuze’s consideration of empiricism (after Henri Bergson and Alfred North Whitehead) as the conditions for the production of novelty rather than a reflection of the “real” world, untangling that important thread will take this blog post in a much more convoluted direction. Here I shall simply observe a tendency in some of the social sciences, including communication studies, to employ “postpositivist” methodologies. As Allen, Titsworth and Hunt observe in their handbook on Quantitative Research in Communication:
A key component of the scientific method is verification and absolutism – that through replication, theories become “verified” and accepted as universally true. Although application of the scientific method to the study of communication and other social sciences was very popular at one time, more contemporary theory embraces a postpositivist approach that does not rely on absolute truth. From the postpositivist perspective, theories are assumed to be good descriptions of human behaviour, but exceptions are expected because of unique circumstances and the tendency for some unpredictability to be present in any situation. (p.8)
As such, a postpositivist approach to the data I am using to describe some of the reception of the Blake-Parry hymn “Jerusalem” follows this understanding: the data considered below is far from complete and exceptions are to be expected. It is a tool for a heuristics of understanding rather than any attempt at a complete hermeneutics.
Methods for collecting data
One thing became absolutely clear when preparing for the paper in Lisbon: although I have generally tended not to use quantitative techniques in my own work (one exception being for a chapter in William Blake and the Digital Humanities), I have worked with a considerable number of students in the fields of Journalism and Media Studies, both at undergraduate and postgraduate level; as such, sorting through my data demonstrated a number of flaws in my methods for collecting data. Mainly this was due to the fact that I had not initially intended to produce any form of quantitative analysis, and the desire to do so emerged from the number of references to the Blake-Parry hymn which showed definite patterns in some areas. As such, there are a number of limitations in the method for collecting data which ultimately affect the analysis which follows.
My principal methods of data collection were threefold: serendipity, that is by reading through any number of books and listening to recordings that I knew referenced the hymn; more systematically, using Google’s Ngram Viewer to examine the digitised collection of some five million texts; finally, by using online music databases such as Allmusic and Discogs, which include 20 million and 150 million texts respectively. While the number of texts included in the Ngram Viewer is considerable, this should be placed against a corpus of 25 million books scanned as part of Google Books (which itself is only a small portion of an estimated 130 million titles worldwide as of 2010).
The method of data collection, essentially arising from an extended bibliography, was not planned in as structured a way as I would have intended had quantitative analysis been envisaged from the very beginning; nonetheless, it represents the most comprehensive collection of data for this topic ever collated. The work is not yet complete – there are, for example, some suspicious gaps in periods such as the 1940s that make me believe that more works remain to be found. In addition, I would like to collate references in news media to the hymn, although preliminary work I have undertaken here indicates that I will have to do a lot more cleaning of data (when a newspaper refers to “Jerusalem”, it’s usually the city rather than the hymn).
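The kind of cleaning that news references would need can be sketched roughly as follows. This is only a minimal illustration, not the method used for the data set: the cue-word lists and the function name `classify_mention` are hypothetical, and a real list of cues would have to be built from the newspaper corpus itself.

```python
import re

# Hypothetical cue words, invented for illustration only.
HYMN_CUES = {"hymn", "blake", "parry", "sung", "sang", "anthem", "proms", "choir"}
CITY_CUES = {"israel", "palestinian", "embassy", "city", "temple", "east", "west"}


def classify_mention(sentence: str) -> str:
    """Label a sentence mentioning 'Jerusalem' as 'hymn', 'city' or 'unclear'."""
    # Lower-case the sentence and keep alphabetic words only.
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    if "jerusalem" not in words:
        return "no mention"
    hymn_score = len(words & HYMN_CUES)
    city_score = len(words & CITY_CUES)
    if hymn_score > city_score:
        return "hymn"
    if city_score > hymn_score:
        return "city"
    # Ties (including no cues at all) are left for manual checking.
    return "unclear"


print(classify_mention("The crowd sang Jerusalem at the Proms."))
print(classify_mention("Talks resumed in East Jerusalem on Monday."))
```

Anything labelled "unclear" would still need to be read by hand, which is where most of the cleaning effort is likely to lie.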
Bearing in mind the above limitations, the final data set nonetheless provides some interesting correlations that can be visualised in a number of ways, beginning with a simple scatter plot that shows the frequency of instances referencing the hymn since Alexander Gilchrist’s publication of the Life of William Blake in 1863.
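The counting behind a frequency plot of this kind is simple enough to sketch. The following assumes that each spreadsheet entry carries a year; the function name `instances_per_year` is hypothetical and the sample years are invented toy values, not figures from the data set.

```python
from collections import Counter


def instances_per_year(years, start, end):
    """Count entries per year, keeping years with no entries as explicit zeros."""
    counts = Counter(years)
    return {year: counts.get(year, 0) for year in range(start, end + 1)}


# Invented toy entries for illustration only.
sample_years = [1916, 1916, 1918, 1920, 1920, 1920]
per_year = instances_per_year(sample_years, 1916, 1920)
print(per_year)
```

Keeping the empty years as zeros matters for everything that follows: a year with no references is itself a data point, and dropping it would inflate any average.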
Unsurprisingly, the chart above shows Blake’s poem/Parry’s hymn being referenced more frequently as time progresses, but we should be wary of rushing to two conclusions that would establish causal relations between the data shown here and the reception of the Blake-Parry hymn.
First of all, the distribution of frequency data would appear to demonstrate an exponential growth which appears to begin some time around the 1990s, but it is perhaps more likely that the eventual shape will be closer to an S-curve, with a saturation of references in the selected media occurring in the twenty-first century. Following from this, the temptation is to discuss the above frequency data in terms of the popularity of “Jerusalem”, but this cannot be demonstrated causally from the data despite the apparent simplicity of a correlation between recorded frequencies over time.
Consider the following graph:
This chart, taken from the Church of England’s Statistics for Mission 2016, shows a fairly familiar trajectory of long-term decline in the Anglican church. Whereas nearly 7 percent of the population defined itself as Anglican in 1960, that figure had dropped to less than 2 percent in 2016, and regular church attendance had dropped from around 3.5 percent to slightly more than 1 percent between 1968 and 2016. Of course, because the population of the UK has increased during that time, it would still be possible for this decline to be matched by a growth in absolute numbers, but by 2016 the actual number of church goers had dropped to below one million. The reason why this is significant to a discussion of “Jerusalem” is that CofE churches use the hymnal Hymns Ancient and Modern, which includes “Jerusalem”: there is no statistical data collected on how often particular hymns are sung at church, but it is not an entirely unreasonable assumption that in one area at least – singing in church – the Blake-Parry hymn is less popular now (or at least performed less often) than it was some fifty years ago.
Because my research on the reception of “Jerusalem” traces its use across certain types of media (books, audio recordings, television and film in particular), it cannot begin to answer whether the hymn is more or less popular in absolute terms, only that it is more prevalent within those media in the twenty-first century than it was during the twentieth century. Certainly the hymn is sung at public events such as cricket matches and Last Night of the Proms, so it may indeed be more popular in absolute terms, but I have not collected the data to verify this. Nonetheless, within the data set that I do have, some interesting points in its reception history are thrown into relief. Thus, for example, while I expected a surge of instances in 1976 during the Queen’s silver jubilee (and there was, indeed, a small rise in occurrences), the greater frequency is actually during 1973, mainly due to a slight flurry in audio recordings including that by Emerson Lake and Palmer on their album Brain Salad Surgery. There is, however, no obvious correlation between this increase and external events, unlike the more dramatic surge in frequency during 2011 (32 instances) and 2012 (29 instances), where “Jerusalem” was clearly recorded and performed more regularly because of the royal wedding of William and Kate Middleton and the Olympic ceremonies/diamond jubilee the following year. Similarly, a spike in 2000 was due to the selection of the hymn as the official song for Euro 2000 by Fat Les, with the track being included on a number of compilation albums that year.
There has, then, been a greater media use of “Jerusalem” in the twenty-first century, but this has also been a period of greater deviation between the number of instances each year as the following chart demonstrates:
Each of these three fifteen-year periods demonstrates that the median for instances of “Jerusalem” increases considerably. In the decade and a half after Parry first set Blake’s poem to music, the median was one appearance a year, representing the fact that while occasionally it appeared in some format more than once there were also years when it did not appear at all. By the 1970s, this was no longer the case, although the median had only risen slightly to 3 occurrences each year. In the first years of the twenty-first century, by contrast, the median is 16 instances a year with a much wider range between the various data points.
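A median for a given period can be sketched as below. The helper `period_median` is hypothetical and the counts are invented toy values rather than the figures reported here; the one design choice worth noting is that years with no appearances are counted as zeros, which is what allows a median as low as one appearance a year.

```python
from statistics import median


def period_median(per_year, start, end):
    """Median instances per year over start..end, treating absent years as zero."""
    return median(per_year.get(year, 0) for year in range(start, end + 1))


# Invented toy counts: three years with appearances, two (1917, 1919) without.
toy_counts = {1916: 2, 1918: 1, 1920: 3}

print(period_median(toy_counts, 1916, 1920))  # zeros included
print(median(toy_counts.values()))            # zeros ignored, for comparison
```

With the empty years included, the toy median is 1; ignoring them raises it to 2, which shows how much the treatment of silent years shapes the summary figure.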
The following three charts illustrate similar points in a slightly different fashion, showing the distribution curves for instances of the lyric “And did those feet”/“Jerusalem” in three different sets. In the first, covering the entire period from 1863 to 2016 (a population where N=150 because in this data set there are a few years for which no data was collected), the mean is 3.84 with a standard deviation of 6.137. What is significant about these numbers is that, across 153 years, the number of instances in the media of references to the text is very low because, for more than half a century, I was not able to find any reference to the text. If we focus on the century from 1916-2016 (a population where N=98), the mean of instances is higher at 5.69 and the standard deviation or spread of numbers has increased to 7.9. Turning finally to 1970-2016, the first date selected because it is during this decade that we see the first spike in references to the Blake-Parry hymn, the mean has increased substantially to 10.02 and the standard deviation now stands at 7.778. Further concentration on smaller slices of later time periods would intensify this trend – a higher mean and a wider spread of values around it as the number of references to the hymn fluctuates greatly from year to year.
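For readers who wish to reproduce summary figures of this kind, a minimal sketch follows. It assumes that years with no data collected are recorded as `None` (and so excluded from N), while years searched without result are recorded as zero; it also assumes a population rather than a sample standard deviation, which may not match the software used for the figures above. The function name `describe` and the counts are invented for illustration.

```python
from statistics import mean, median, pstdev


def describe(counts):
    """Summarise yearly counts; None marks a year where no data was collected."""
    observed = [c for c in counts if c is not None]
    return {
        "n": len(observed),        # years actually observed
        "mean": mean(observed),
        "sd": pstdev(observed),    # population standard deviation (an assumption)
        "median": median(observed),
    }


# Invented toy series: two silent years, one missing year, one busy year.
summary = describe([0, 0, 1, None, 2, 7])
print(summary)
```

In the toy series the mean (2) sits above the median (1) and the standard deviation exceeds the mean, the same pattern of a few busy years pulling the average up that the full 1863-2016 figures show.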
Again, it is important to read such statistics carefully. “Jerusalem” is more prevalent in certain media instances, but once more this neither proves nor disproves the supposed popularity or otherwise of the hymn. The three histograms above, however, do demonstrate that the data is skewed when viewing the distribution curve for the period 1863-2016 in particular: essentially, there are more years during the nineteenth century when there is no reference to Blake’s poem than when it is alluded to, demonstrating very much that this is a text that comes into its own in the twentieth century.
One thing that does become evident from the data I have collected is that the driving force behind this increased media saturation is audio recording, as the following two charts demonstrate:
The majority of instances where “Jerusalem” occurs are in audio form (whether live performance – only noted rarely in my statistics and not including regular events such as Last Night of the Proms – or, more commonly, audio recordings). While music comprises more than half the instances within my data set, before the 1970s audio recordings at least are rare, and it is during the CD revolution of the 1990s that instances of “Jerusalem” appear most often, participating in the wider renaissance of classical music brought about by the innovation of the CD. Indeed, it is possible that a final tailing off of those instances could reflect the decline of the CD in recent years, although this correlation cannot be proven and, in any case, could be reasonably expected to have occurred earlier in the preceding decade. In general, however, the data collected does seem to indicate that at least partially the wider media reception of “Jerusalem” corresponded to a transformation in audio recording technologies: the hymn became part of the backing track for the nation because, as with so much other music, innovations in technology meant that it was easier to produce and distribute.
This data, visualised in different ways, does point to a similar conclusion: that “Jerusalem” has been more widely distributed across media formats as the century since Parry set it to music has progressed, and that this growth has been driven by audio recordings. I won’t lie, such conclusions are hardly earth-shattering and would have been guessed as “common sense” by any number of commentators, but it is useful to see the evidence demonstrating such a clear trend. Two other examples also demonstrate the value – and the limitation – of such augmented reading, one of which actually shaped my own understanding of the reception of the hymn and another of which indicates the danger of false positivism when employing quantitative methods.
The first set of charts also deals with the categorisation of music as follows:
This first chart – drawing largely on self-identified categories of recordings (whether emphasising a choir, pop music, a military brass band etc.) – is an effective way of seeing immediately some of the ways in which those recordings of the hymn have been categorised. It is an exercise in taxonomy which, while hardly surprising in some respects – the vast majority of instances are orchestral or choral arrangements – does indicate a few interesting examples, one of which I shall follow up below. The one point to make about this visualisation is that it obviously does not help with tracking instances across time: in many cases, this is not especially relevant, but occasionally – as in the categories of sport and music for royal occasions – it disguises the fact that such uses are very recent (largely post-2000) and thus indicate changing attitudes towards/uses of “Jerusalem”.
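One way of recovering the time dimension that a simple category chart hides is to tabulate category against decade, and to note the earliest year in which each category appears. The sketch below assumes each entry is a (year, category) pair; the function names, category labels and entries are all invented for illustration.

```python
from collections import Counter


def tabulate_by_category_and_decade(entries):
    """Count entries for each (category, decade) pair."""
    table = Counter()
    for year, category in entries:
        decade = (year // 10) * 10
        table[(category, decade)] += 1
    return table


def first_year_by_category(entries):
    """Earliest year in which each category appears."""
    first = {}
    for year, category in entries:
        if category not in first or year < first[category]:
            first[category] = year
    return first


# Invented toy entries, not drawn from the actual data set.
toy_entries = [
    (1973, "rock"), (1995, "choral"), (1998, "choral"),
    (2003, "sport"), (2011, "royal occasion"), (1958, "choral"),
]
table = tabulate_by_category_and_decade(toy_entries)
first = first_year_by_category(toy_entries)
print(table[("choral", 1990)], first["sport"])
```

A table of this kind makes it immediately visible when a category, such as sport in the toy entries, has no instances before a given date.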
The interesting example, which for me is illustrative of how such quantitative analysis actually affected my reading of a text, is that of matrimonial recordings.
In and of itself, this doesn’t appear to be an especially interesting chart: between 2004 and 2011 there were fourteen instances of “Jerusalem” being included on wedding compilations. However, this simple data changed one section of my book to a significant degree: there are absolutely no examples of the hymn being included on compilations for this purpose before 2000 that I can find, although I still need to check that there are none after 2011. This is a surprising example of changing uses of the hymn – which personally I trace to the release of Four Weddings and a Funeral in 1994 (“Jerusalem” is sung at the first wedding in the film) with some newspaper references in the late 90s and early 2000s. The spike in 2011 is around the royal wedding of William and Kate Middleton and, if there truly are no further incidents (which I doubt) perhaps represents an oversaturation of the hymn at such services.
The final example deals with one of the most evocative phrases from Blake’s poem – “dark Satanic mills”. The chart below indicates the frequency since 1900 where the phrase has been used separately from the hymn to illustrate some aspect of society or other thought:
For some time, I have been rather adamant that Blake’s phrase has nothing to do with the industrial revolution and, in my opinion, is only tenuously connected with the Albion Flour Mills constructed in Southwark which burned down in 1791. Yet it becomes clear that, after some tentative references in the 1910s (the first instance I can find of the phrase outside of simple repetition within the poem as a whole), the phrase really begins to gain currency from the 1950s onwards. I am not confident enough in my data to be sure that the dip in the 1970s is genuine, but certainly from the 1980s onwards the phrase becomes embedded in popular culture – both in Britain and internationally – as one used to invoke the worst excesses of industrialisation and mechanisation. Of the smaller number of instances where it is used to refer to something else, a significant proportion arise from scholars pointing out that it does not refer to the industrial revolution.
This is another example of what Jauss refers to as the changing “horizon of expectations”: as the phrase “dark satanic mills” is used more frequently to refer to industrialisation, so more people refer to it the same way. Admittedly, alternative uses have also increased (some of these directly oppositional) but in the main part this is a case where the meaning of the phrase has definitely changed since Blake wrote down those words. While I disagree with this usage in many respects as not that which Blake intended, I am also interested in the spread of the term: while it does not represent the author’s original meaning, it has a much more effective use or exchange value as a term describing the industrial revolution. When people use those three words, they call up a period in history extremely effectively and the phrase serves as a microcosm of the ways in which the poem as a whole has been transformed throughout its reception history.
The conclusions of my research at this stage are still fairly tentative. Regarding the value of quantitative analysis, in some cases it demonstrates the obvious (that instances of “Jerusalem” increase as time progresses, and that this really is a twentieth- and twenty-first century text, with its reception doubtlessly driven by Parry’s setting the hymn to music). Even in those cases, it may be of use – for example in terms of showing how prevalent the phrase “dark satanic mills” becomes in the latter part of the twentieth century – and in other circumstances it offered me patterns that I was not expecting, such as the usage of the hymn in wedding services from the early 2000s onwards.
To me it is obvious that more work needs to be done: I consider my data set fairly representative of the hymn, but am not yet fully confident that it offers a suitable population sample throughout the full twentieth century, and as such I cannot say whether certain gaps (most notably in the 1940s) are significant or the result of my flawed methods of collecting data. Nonetheless, some of the evidence that is emerging is compelling to me and this is a project that I wish to continue. The next steps are to ensure that the data set as it currently stands is as complete as possible, while also considering the option to include other media references from news sources.
It should also be noted that the data here has been analysed in a largely descriptive fashion. While I would like to answer certain questions, for example whether a person’s political stance predisposes them to listen to “Jerusalem”, I cannot answer this in anything but an anecdotal way. As Allen, Titsworth and Hunt observe, quantitative analysis is very good at answering questions as to what is happening, but not why. To begin to find solutions to these and other questions would require a mixed methodology incorporating qualitative approaches.
Regardless of certain specific gaps in the data discussed here, there is a more general conclusion that I believe can be drawn already, and that is how quantitative analysis compels us to reconsider the text in new ways. Before continuing on this line, it is very much worth remembering the following admonition by Blake taken from his annotations to the works of Joshua Reynolds: “To Generalize is to be an Idiot To Particularize is the Alone Distinction of Merit–General Knowledges are those Knowledges that Idiots possess” (E641). I have been very cautious in some of my own generalisations, and I am critical of the positivist assumptions in some approaches to digital humanities which assume that data reveals truth to us. Likewise, although I can understand why Moretti argues against close reading, the majority of my own work on “Jerusalem” consists of some 90,000 words of close reading of four quatrains, what I consider to be one of the most important works in England in recent decades.
But when we survey data as a whole, contracting and expanding our senses as Blake describes the Eternals in The [First] Book of Urizen (E71), then we can see different forms and have a sharper sense of the interconnection between those forms, as Moretti suggests. For example, while the vast majority of musical recordings are classical, for most of them the significant difference in musical terms is whether they use Elgar’s arrangement or Parry’s: that difference is noticeable, but most other elements of the recording are not. As such, it is the collation of musical settings into different genres and branded formats that becomes important, indicating whether the music is being aimed at a sporting, military, traditional or more easy-listening audience. This is where “distant reading” comes into its own.
In such cases, quantitative analysis of “Jerusalem” does, I would argue, become useful (with such usefulness always being recognised as limited). Alongside the task of hermeneutics, of interpreting the text, it provides a form of literary heuristics, indicating the parameters within which the text operates among a wider audience. It cannot be used to tell us what the hymn means for its various audiences, but it does offer in broad terms some insights into how the text comes to be used in different times and circumstances.
Allen, Mike, Titsworth, Scott, and Hunt, Stephen K., Quantitative Research in Communication, Sage, 2009.
Anderson, Chris, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”, Wired, 2008.
Carroll, Joseph, “Three Scenarios for Literary Darwinism”, New Literary History, 41.1, 2010, pp.53-67.
Church of England, The, “Statistics for Mission, 2016”, Research and Statistics, 2016.
Eyers, Tom, “The Perils of the ‘Digital Humanities’: New Positivisms and the Fate of Literary Theory”, Postmodern Culture, 2015.
Jauss, Hans Robert, Towards an Aesthetic of Reception, trans. Timothy Bahti, University of Minnesota Press, 1982.
Leonelli, Sabina, “What difference does quantity make? On the epistemology of Big Data in biology”, Big Data & Society, 2014.
Mazzochi, Fulvio, “Could Big Data be the end of theory in science?”, Science and Society, 2015.
Moretti, Franco, Graphs, Maps, Trees, Verso, 2007.
Moretti, Franco, Distant Reading, Verso, 2013.