‘I know no speck so troublesome as self’: Finding Middlemarch through Corpus Linguistics

Dr Rosalind White (@DrRosalindWhite on Twitter), research associate at the University of Birmingham’s Centre for Corpus Research and on #FindingMiddlemarch at Royal Holloway, University of London, proposes a way into George Eliot’s Middlemarch using corpus linguistics.

In this blog post, I’d like to explore how corpus linguistic tools can be used to illuminate the semantic texture of George Eliot’s writing. I make use of the CLiC Web App (Mahlberg et al. 2020). From the outset, George Eliot frames Middlemarch: A Study of Provincial Life (1871) as a pseudo-scientific sample of the complex human dynamics that can be found in a provincial town. Using the metaphor of an optical microscope, the narrator vows to concentrate ‘all the light [they] can command’ on ‘unravelling certain human lots and seeing how they [are] woven and interwoven’ (102).

Even with a microscope directed on a water-drop we find ourselves making interpretations which turn out to be rather coarse; for whereas under a weak lens you may seem to see a creature exhibiting an active voracity […] a stronger lens reveals to you certain tiniest hairlets which make vortices for these victims while the swallower waits passively at his receipt of custom.

A ‘strong lens applied to Mrs Cadwallader’s match-making’, Eliot suggests, will show a parallel ‘play of minute causes’ (41). This metaphorical microscope reappears throughout the narrative. Ineffectual intellectual Edward Casaubon, who ‘dreams in footnotes’, is so single-minded that Mrs Cadwallader suspects a drop of his blood under a slide would reveal ‘all semicolons and parentheses’ (49). Ego, Eliot implies, can act as a ‘tiny speck very close to our vision’ that ‘blot[s] out the glory of the world and leave[s] only a margin by which we see the blot’, for there is ‘no speck so troublesome as self’ (589).

Fig. 1 Middlemarch (Edinburgh and London: William Blackwood & Sons, 1871-72) with original green decorated wrapper. Fig. 2 Middlemarch, Fair Copy, Manuscript Add MS 34034, British Library.

Corpus linguistics can be seen as a form of “distant reading” (Moretti 2013, Mahlberg & Wiegand 2020, Froehlich 2018). It is a method that allows us to momentarily look past the haziness of our own subjectivity and obtain a panoramic perspective by making the large-scale aspects of literature more visible. Digital tools like CLiC can be used to help us map themes, characterisation and cultural trends across a given corpus. 

Reconciling quantitative data with qualitative analysis is a method in keeping with George Eliot’s characteristic use of empathy to transport her readers beyond the margins of their own subjectivity. As Eliot’s narrator famously puts it:

That element of tragedy which lies in the very fact of frequency, has not yet wrought itself into the coarse emotion of mankind; and perhaps our frames could hardly bear much of it. If we had a keen vision and feeling of all ordinary human life, it would be like hearing the grass grow and the squirrel’s heartbeat, and we should die of that roar which lies on the other side of silence. As it is, the quickest of us walk about well wadded with stupidity. [emphasis mine]

Middlemarch is a novel crammed with characters blighted by a narrowness of vision, from the myopic Dorothea who ‘always see what nobody else sees’ but ‘never see what is quite plain’ (23) to Dr Lydgate, blinded by ambition to the ‘hampering threadlike pressure of small social conditions’ (132). It is therefore a text uniquely receptive to the use of such “distant” methods.

Provincialism on the Page

Using CLiC to run a concordance on the word ‘Middlemarch’, we can observe that the town itself is frequently employed as an adjective — from ‘Middlemarch habits’, ‘Middlemarch politics’ and ‘Middlemarch gossip’ to more peculiar phrases like ‘in a Middlemarch light’, ‘Middlemarch phraseology’ and ‘the limits of Middlemarch perception’. That something as idiosyncratic as one’s perception could be considered quintessentially ‘Middlemarch’ speaks to the sheer embeddedness of Eliot’s characters in the culture of their town. Even more mundane articles (like ‘Middlemarch medicine’ or ‘Middlemarch lodgings’) are presented as wholly inextricable from their locality.


Fig. 3 A sample of concordance lines highlighting collocates to the left of ‘Middlemarch’ that denote departure, via CLiC v. 2.1.2., collocates highlighted with the KWICGrouper.

Figure 3 is a concordance for the word Middlemarch, i.e., the word is displayed in the centre with a certain amount of context on the left and the right. Interestingly, Middlemarch repeatedly occurs alongside collocates that denote departure like leave, leaving, left, quit, and quitting (collocates are words that occur repeatedly on the left or right of a search word). This is despite the fact that (barring the epilogue ‘finale’ and the honeymoon trip to Rome) there is no point at which a character expressly departs from the town.
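For readers who want to try the idea outside CLiC, a concordance of this kind can be approximated in a few lines of Python. This is only a minimal sketch, not how CLiC itself works: the function names `kwic` and `has_departure_collocate`, the context width, the three-word span and the sample sentences are all assumptions made for illustration (the sample text is invented, not quoted from the novel).

```python
import re

# The departure collocates named in the discussion of Figure 3.
DEPARTURE = {"leave", "leaving", "left", "quit", "quitting"}

def kwic(text, node, width=40):
    """Return (left, match, right) context triples for each occurrence of `node`."""
    pattern = r"\b" + re.escape(node) + r"\b"
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines

def has_departure_collocate(left_context, span=3):
    """True if one of the last `span` words of the left context denotes departure."""
    words = re.findall(r"[a-z']+", left_context.lower())
    return any(w in DEPARTURE for w in words[-span:])

# Invented sample text, for illustration only.
sample = ("He meant to leave Middlemarch soon. The Middlemarch gossip went on. "
          "She spoke of quitting Middlemarch altogether.")

hits = kwic(sample, "Middlemarch")
flags = [has_departure_collocate(left) for left, _, _ in hits]

for (left, node, right), flag in zip(hits, flags):
    marker = "*" if flag else " "
    print(f"{marker} {left:>40} [{node}] {right}")
```

Lines marked with an asterisk are those where a departure word sits within three words to the left of the search term, a rough stand-in for the grouping that the KWICGrouper performs interactively.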



Fig. 4 Percentage of Coventry women in 1851 born less than 10km away. Just 10.44% of women and 11.71% of men in Coventry were born more than 50km away from their current residence; in contrast, as many as 73.49% of women and 71.51% of men in Coventry resided less than 10km away from their birthplace. Data via PopulationsPast.Org.

In her twenties, George Eliot compared her own provincial existence to the ‘walled-in world’ of a David Wilkie genre painting (Eliot 2010: 76). In later life as an author, however, she effectively reframes the humdrum particulars of provincial life by bringing them into sharper focus. As Ruth Livesey has put it, Eliot’s mode of realism is laced with a radical ‘insistence that a picture of the commonplace world is entitled to full colour’ (Livesey 2020: 11). Using the nineteenth-century reference corpus provided by CLiC to generate a list of key words, this ‘luminous detail’ (Livesey 2020: 1) is immediately apparent. Amongst character names like Dodo and “plot”-related words like vote or medical, more domestic words like furniture appear at a higher frequency. There are 47 instances of furniture in Middlemarch, whereas Charles Dickens’ Great Expectations, for example, uses furniture 7 times (147.78 per million words vs 37.81 per million words). Solely from the term furniture, a miniature narrative emerges in which honest, hardworking characters like Mr Farebrother or the Garths are unconcerned by the state of their furniture (lines 1-4) while those with ‘spots of commonness’ like Dr Lydgate agonise over the subject (lines 5-8).
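The per-million figures are a simple normalisation that makes counts from texts of different lengths comparable: the raw count is divided by the total number of words in the text and multiplied by one million. The sketch below shows the arithmetic. The two token totals are assumptions back-calculated from the rates quoted above, not figures taken from CLiC, and the function name `per_million` is my own.

```python
def per_million(count, total_words):
    """Normalise a raw frequency to a rate per million words."""
    return count / total_words * 1_000_000

# ASSUMED token totals, back-calculated from the quoted rates
# (47 / 147.78 per million and 7 / 37.81 per million); not CLiC's own figures.
MIDDLEMARCH_TOKENS = 318_040
GREAT_EXPECTATIONS_TOKENS = 185_136

mm_rate = per_million(47, MIDDLEMARCH_TOKENS)
ge_rate = per_million(7, GREAT_EXPECTATIONS_TOKENS)

print(f"Middlemarch: {mm_rate:.2f} per million words")
print(f"Great Expectations: {ge_rate:.2f} per million words")
print(f"Ratio: {mm_rate / ge_rate:.1f}x")
```

On these assumptions, furniture is roughly four times as frequent in Middlemarch as in Great Expectations once text length is taken into account, a larger gap than the raw counts of 47 and 7 would suggest if the two novels were the same length.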



Fig. 5 A sample of 8 of the 47 concordance lines for furniture in Middlemarch, via CLiC v. 2.1.2.

Also notable is the word light, which is used in both a literal and metaphorical sense over a hundred times: from the ‘wondrous modulations of light and shadow’ on an old, thatched roof ‘full of mossy hills and valleys’ to Will Ladislaw’s impish smile described as ‘a gush of inward light’.

Tracing the Trappings of Gender

Examining concordance lines that include female pronouns like she, her, and hers, in comparison to the male pronouns he, him, and his, the tethering of women to a certain sphere of existence is immediately noticeable. There are 22 instances of her on the left of marriage, but only 12 examples of the cluster ‘his marriage’. The phrase ‘her husband’ is used 153 times (largely in reference to the internal ruminations of Rosamond and Dorothea), but ‘his wife’ is used only 62 times. Significantly, the cluster ‘her husband’ frequently appears in reference to a female character carefully observing, or even micro-managing, the emotional state of her husband. As Figure 6 shows, Dorothea’s own emotions are inextricably wedded to those of her husband: from her delight ‘at seeing her husband less weary than usual’ (line 10), to her private vow to banish the morning’s gloom ‘if she could see her husband glad at her presence’ (line 7).
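Cluster counts such as ‘her husband’ versus ‘his wife’ can be reproduced, at least roughly, by counting adjacent word pairs. The following is a minimal sketch under stated assumptions: the tokeniser is deliberately crude, the names `tokenize` and `cluster_counts` are hypothetical, the sample sentences are invented rather than quoted, and pairs are allowed to span sentence boundaries, so the totals it gives for a full novel may differ slightly from CLiC’s.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())

def cluster_counts(text, n=2):
    """Count every run of n adjacent tokens (an n-word cluster)."""
    tokens = tokenize(text)
    return Counter(zip(*(tokens[i:] for i in range(n))))

# Invented sample text, for illustration only.
sample = ("She watched her husband closely. Her husband was silent, "
          "and his wife waited. She thought her husband weary.")

counts = cluster_counts(sample)
print("her husband:", counts[("her", "husband")])
print("his wife:", counts[("his", "wife")])
```

Run over the full text of the novel, the same two lookups would give the kind of comparison reported above, which can then be checked against the concordance lines themselves.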


Fig. 6 Concordance lines generated by ‘her husband’, via CLiC v. 2.1.2. Note the many references to a wife observing her husband’s emotions.

There is a parallel discrepancy in the adjectives used by various characters to refer to the opposite sex: plain, single, and foolish all collocate exclusively with ‘women’, while professional, medical, intellectual and clever collocate exclusively with ‘men’. 

There is a visible difference between ‘money’ collocating with male pronouns vs female pronouns. His repeatedly occurs in the first position to the left of ‘money’, but her only does so on one occasion. Her is on one occasion separated by the word ‘superfluous’ (line 3), coming in as a modifier of money, and the only example of her appearing to the immediate left of money refers to Dorothea speculating on how she may help a future husband. Upon closer inspection, every example of money collocating with a female pronoun also refers to a female character’s husband, father, or brother: from Mary handing her hard-earned money over to Caleb Garth (line 8) to Rosamond and Lydgate’s joint financial difficulties (line 7). (For more on corpus methods to describe the gendered world in nineteenth-century fiction, see also Cermakova & Mahlberg 2022 & forthcoming.)
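The distinction drawn here between a pronoun in the first position to the left of ‘money’ and one separated from it by a modifier can be made explicit by counting collocates at a fixed position. The sketch below is illustrative only: `left_collocates` is a hypothetical helper, the tokenisation is crude, and the sample sentence is invented rather than taken from the novel.

```python
import re
from collections import Counter

def left_collocates(tokens, node, position=1):
    """Count the words found `position` tokens to the left of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node and i - position >= 0:
            counts[tokens[i - position]] += 1
    return counts

# Invented sample text, for illustration only.
sample = ("He counted his money. She gave her superfluous money to his brother, "
          "and his money was gone.")
tokens = re.findall(r"[a-z']+", sample.lower())

first_left = left_collocates(tokens, "money", position=1)   # immediately left of 'money'
second_left = left_collocates(tokens, "money", position=2)  # one word further left

print("L1:", dict(first_left))
print("L2:", dict(second_left))
```

In the invented sample, his occupies the first position to the left twice, while her appears only at the second position, behind the modifier ‘superfluous’; this is the same kind of pattern that the concordance lines in Figure 7 make visible.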


Fig. 7 A comparison of male and female pronouns collocating with ‘money’, via CLiC v. 2.1.2.

Finding Empathy & Authorial Allegiance

To conclude, I’d like to draw attention to the use of corpus linguistics as a skeleton key that quickly provides us with access into a character’s mind. Instances in which the word but presents to the immediate left of a character’s name can easily be used for this purpose (Fig. 8). In line 1, for example, ‘but’ directs us to the romance taking place purely in Rosamond’s mind: where ‘every look and word’ is the subject of ‘eager meditation’.


Fig. 8 Concordance lines in which ‘but’ occurs to the immediate left of a character’s name, via CLiC v. 2.1.2.

This also serves as a way to track authorial allegiance to various characters (Dorothea, Lydgate and Rosamond appear in this configuration the most frequently). Emotional epiphanies, self-reflection, and inward anxieties are all expressed under these conditions.


Fig. 9 Characters in Middlemarch that collocate with ‘poor’, via CLiC v. 2.1.2.

The word felt (occurring to the immediate right of a character’s name) can be tracked in a similar way, as can the word poor (to the immediate left).  Dorothea, Rosamond and Lydgate accrue the most empathy in this manner, closely followed by Casaubon and Fred Vincy. Interestingly, despite the fact they collocate frequently with felt and but, neither Mr. Bulstrode nor Will Ladislaw collocate with poor. Moreover, both Lydgate and Casaubon collocate considerably less with poor than they do with felt or but. As is evident in Figure 9, the word poor is often used by Eliot as a direct appeal to the reader for empathy (lines 10-11 & 14-17). Conscious, perhaps, of the fact that male readers might find it more difficult to connect with women, it is a stylistic device that Eliot seems to keep in reserve for her female characters (Rosamond and Dorothea collocate with poor 17 and 16 times respectively).


Fig. 10 A line graph tracking the rate at which characters collocate to the left of ‘felt’ or to the right of ‘poor’ or ‘but’.

Concluding Remarks

Scholars of nineteenth-century literature have long regarded “close reading” as the cornerstone of literary analysis (such methods rest on the belief that one can extract the intrinsic themes of a text by zooming in on certain passages). This post has, I hope, demonstrated what can be gained from reconciling “close” and “distant” reading methods through corpus tools. Middlemarch is a famously tightly woven novel that subtly knits together the narratives of multiple disparate individuals into a collective whole. It is a novel that is best observed at multiple scales; for as Eliot herself noted in The Mill on the Floss, ‘there is nothing petty to the mind that has a large vision of relations.’


This post is part of the AHRC-funded project ‘Finding Middlemarch in Coventry 2021-22’ led by Professor Ruth Livesey (Royal Holloway, University of London) and Professor Redell Olsen (Royal Holloway, University of London). The project will culminate in ‘Of that Roar Which…’, an experimental short film by Redell Olsen.

Tickets for ‘The Great Middlemarch Mystery’, an immersive multi-location theatre experience researched and co-developed by Professor Livesey and produced by Dash Arts, are available now! The play will take place in Coventry’s Cathedral Quarter from Thursday 7 to Sunday 10 April 2022.

Join the conversation via our blog on Twitter with #FindingMiddlemarch 

Please cite this post as follows: White, R. (2022) ‘I know no speck so troublesome as self’: Finding Middlemarch through Corpus Linguistics [Blog post]. CLiC Fiction Blog, University of Birmingham. Retrieved from [https://blog.bham.ac.uk/clic-dickens/2022/03/31/finding-middlemarch-through-corpus-linguistics/]

Provincialism at Large: Sukanya Banerjee 13/2/2020

We’re delighted to welcome Prof. Sukanya Banerjee for the first of our 2020 Provincialism at Large events. Prof. Banerjee (University of Wisconsin-Milwaukee) will be giving a masterclass and evening lecture as a joint event with Royal Holloway Centre for Victorian Studies.

Sukanya Banerjee is the author of Becoming Imperial Citizens: Indians in the Late-Victorian Empire (2010, awarded the NVSA Sonya Rudikoff Prize for best first book in Victorian Studies), and the co-editor of New Routes for Diaspora Studies (2012). Her articles have also appeared in journals such as Victorian Studies, Victorian Literature and Culture, and Diaspora: A Journal of Transnational Studies.

The masterclass is entitled ‘“Nation”, “Home” and “Empire” in Victorian Studies’, and will be taking place on Thursday 13th February at 2-3pm in Room IN029 (International Building, Egham campus). The talk will be taking place on the same day at 6-7.30pm in the Moore Annexe Lecture Theatre (Egham campus).

Readings for the masterclass (pdfs below): Burton, A. (1997), Who Needs the Nation? Interrogating ‘British’ History. Journal of Historical Sociology, 10: 227-248. doi:10.1111/1467-6443.00039

Banerjee, Sukanya. “Transimperial.” Victorian Literature and Culture 46, no. 3-4 (2018): 925–28. doi:10.1017/S1060150318001195

Register for free tickets via Eventbrite: 

Masterclass: https://www.eventbrite.co.uk/e/masterclass-nation-home-and-empire-in-victorian-studies-tickets-90520779087.

Evening talk: https://www.eventbrite.co.uk/e/sukanya-banerjee-title-tbc-tickets-90521188311.

Lecture 6pm: Colonial Economies, Consumer Loyalty, and the Transimperial

Can the expansive demands of the free market be reconciled with more sedimented yearnings for the “local,” the “provincial”? In revisiting this classic and abiding question, this talk studies the relation between late-nineteenth-century India and Britain, considering how an idiom of consumer loyalty negotiates the tense relation between free trade and incipient notions of territoriality and nationalism.

She Lives! George Eliot 2019

O May I join the choir invisible  
Of those immortal dead who live again  
In minds made better by their presence: live  
In pulses stirr’d to generosity,  
In deeds of daring rectitude, in scorn 
For miserable aims that end with self,  
In thoughts sublime that pierce the night like stars,  
And with their mild persistence urge man’s search  
To vaster issues.  

In “The Choir Invisible” George Eliot gave an indication of what she hoped her literary afterlife might look like. I joined Professor Ruth Livesey on the AHRC “Provincialism” project in September and since then I’ve been considering where George Eliot lives in 21st century cultural discourse. Over the past couple of months, I have been using the George Eliot collections at Nuneaton Library and Nuneaton Museum and Art Gallery, who are partners on the project, to delve into the nature and trajectory of Eliot’s literary fame. One of the outputs of my post is to produce a scholarly article on this work, but I thought on this, the bicentenary of Eliot’s birth, I might share some gems from the Nuneaton collections and some reflections on the research so far. So, where does George Eliot live in 2019?

Eliot lives in the legacy of material culture she left in her wake. From the exquisite collection of her works at Nuneaton Library to local and national monuments, Eliot has been commemorated in ways large and small since her death in 1880. Public monuments include the George Eliot granite obelisk in the George Eliot Memorial Gardens in Nuneaton; the statue in Nuneaton town centre (erected in 1986); and the memorial stone in Poets’ Corner at Westminster Abbey, unveiled to coincide with the centenary of her death in 1980. But Eliot lives too in smaller ways in historical collections of local ephemera: in postcards of local places she uses in her works, in bookmarks, commemorative envelope covers, a Royal Mail postage stamp, and in souvenir programmes of the week-long celebration in Nuneaton to mark the centenary of her birth in 1919. Tracking local newspaper reporting about Eliot since her first published fiction appeared in Blackwood’s Magazine in 1857 reveals a recurrent concern, after her death, that she should not be forgotten. This concern was reflected early on in the words on the George Eliot memorial obelisk in Nuneaton: “Lest we forget”. The raising of funds for a permanent memorial was the stated aim behind the events of the 1919 Eliot celebrations in Nuneaton and was finally brought to fruition through the work of the George Eliot Fellowship fifty years after its founding.

George Eliot. Scenes of Clerical Life with illustrations by Hugh Thomson. London: MacMillan, 1906.
Courtesy of Warwickshire County Council, George Eliot Collection, Nuneaton Library.
George Eliot. Adam Bede. London: Walter Scott, 1901.
Courtesy of Warwickshire County Council, George Eliot Collection, Nuneaton Library.
George Eliot. Silas Marner with illustrations by Hugh Thomson. London: MacMillan and Co., 1907.
Courtesy of Warwickshire County Council, George Eliot Collection, Nuneaton Library.

Eliot also lives in objects and artefacts. I was delighted to find she had a Royal Holloway connection, having attended Maths classes at Bedford College in 1850-1851.

Mary Anne Evans’ name in the Bedford College Register of Students 1849-1870.
RHCAR/130/1 Archives Royal Holloway, University of London

I have been particularly drawn, however, to the small, everyday, commonplace Eliot artefacts contained in the collections at Nuneaton Museum, such as a simple receipt, signed by “Marian Lewes”, which records the income she received from the trustees of her father’s estate on 8 December 1857. The receipt demonstrates the right Eliot claimed to name herself, but what makes the receipt even more remarkable is that it is addressed by the trustees to “Mrs Marian Lewes”. It is a visceral example of Eliot’s breath-taking self-determination. When she signed the receipt, her first fictional writing, Amos Barton, had just appeared in Blackwood’s Magazine and her literary incognito was still intact.

Fragile pages of blotting paper, also in the Nuneaton Museum collection, are extraordinary survivors of Eliot’s workaday life as a writer.

Receipt signed by Marian Lewes in 1857 for income received from the trustees of her father’s estate.
Nuneaton Museum and Art Gallery. U/2/1980/11. Courtesy of Nuneaton Museum and Art Gallery.
Tools of the trade: George Eliot’s blotter. Nuneaton Museum and Art Gallery U/1/1974/2.
Courtesy of Nuneaton Museum and Art Gallery

An empty mourning envelope, date stamped two days after Eliot’s death, addressed to her brother Isaac Evans at Griff, their childhood home, was arresting, bringing their relationship, bookended by rejection and loss, into sharp focus.

Mourning envelope addressed to Isaac Evans at Griff. Nuneaton Museum and Art Gallery U/2/1980/5
Courtesy of Nuneaton Museum and Art Gallery.

Eliot lives, of course, in academic discourse, in scholarly endeavour and debate, but she has always lived in popular culture too: in film, radio and TV, in theatrical performances, public readings, pageants, even musicals. During the week-long centenary celebrations in 1919, 2000 people took part in an open-air theatrical performance of “The George Eliot Centenary Pastoral Play” written by A. Farmer. This year the GE 2019 website testifies to a rich array of local, national and international events, readings, panel discussions, tours and theatre performances in her honour.

Eliot lives at the points of intersection between academia and popular culture. Two events yesterday at Senate House (George Eliot at 200) and the British Library (What’s So Great About George Eliot?) highlighted the impact of Eliot on the personal and professional lives of contemporary writers, broadcasters, actors, direct family descendants, George Eliot Fellowship members and Eliot scholars.

Eliot lives in mass digital repositories. During this research the scrapbook, a collection of ephemera, newspaper clippings, photographs, pamphlets and event invitations, drawn together over time, has come to resonate with me in a very 21st-century way. This is not purely because scrapbooks are a physical medium I have been consulting at Nuneaton Library, but because I have been searching for Eliot in mass digital repositories, seeking out fleeting references from forgotten, seemingly unimportant sources which only become significant when brought collectively before the eye: Eliot, there is no doubt, lives in the digital age.

Eliot lives, as she most hoped she might, in people, in expressions of humanity and in the richness of life seen through multiple perspectives. She is the subject but perhaps too the method underpinning the multiple perspectives of Gillian Wearing’s ground-breaking film “Everything is Connected: George Eliot’s Life”, available on BBC iPlayer.

During my time in Nuneaton I have seen George Eliot’s legacy in the quiet, unfailing kindness that the staff of Nuneaton Library and Nuneaton Museum and Art Gallery show to all who enter their doors. Eliot may have yet to reach the dizzying global profile of Dickens, but her legacy lives.

Copyright in all the images displayed in this blog belongs to the copyright owners identified beneath each image. They are used here courtesy of Nuneaton Library, Nuneaton Museum and Art Gallery, and Royal Holloway, University of London Archives.

George Eliot at Large

Although this project focuses on Eliot’s role in rethinking provincialism and her localisation in North Warwickshire, that work of hers was only made possible by Eliot’s outward-looking world view and experience of European intellectual culture.

If you are interested in following up Eliot’s encounters with the German world of ideas in person (as opposed to the translation work that absorbed her in her Coventry years) take a look at Bob Muscutt’s blog George Eliot in Weimar – highly informative and rich with research.

Seminars: Provincialism At Large

During 2019-20 the project will be hosting a series of speakers exploring the idea of provincialism, regionalism, and scale across the nineteenth century. The seminars are a collaboration between Royal Holloway Centre for Victorian Studies and Centre for Geohumanities and explore the concepts underlying the project through interdisciplinary conversations, ‘at large’ across the circulating imperial networks of nineteenth-century provincial thinking. We are very grateful to the organisers of the long-running ‘Landscape Surgery’ programme in the Department of Geography for their support in programming this year.

Seminars Autumn Term 2019: 9/10/2019; 19/11/2019 in conjunction with Landscape Surgery

Josephine McDonagh (University of Chicago): ‘Provincialism, Multilingualism and the Novel: early nineteenth-century migration to South America and Jane Eyre’

Wednesday 9th October  2019 2-4pm, 11 Bedford Square 1-03

Suggested reading: McDonagh, ‘Rethinking Provincialism in Mid-Nineteenth-Century Fiction: Our Village to Villette’, Victorian Studies 55 (2013): 399-424. doi:10.2979/victorianstudies.55.3.399 http://www.jstor.org/stable/10.2979/victorianstudies.55.3.399 or email ruth.livesey@rhul.ac.uk.

Katrina Navickas (University of Hertfordshire):  ‘Customary rights, property and contested belongings in English commons and village greens, 1795-1965’.

Tuesday 19th November 2019  2-4pm 11 Bedford Square 1.01

Abstract: This paper examines contested customary rights and landownership of commons and village greens in England in the 19th and 20th centuries. The 1965 Commons Registration Act sought to map definitively the extent of common land and associated rights in England and Wales, but it was a flawed piece of legislation. Its implementation revealed the widespread difficulties of defining a common, its rights and its ownership, many of which have still not been resolved today. Some of those disputes stretched back into the 19th century and earlier. This paper takes as its focus the case studies of the 1795 court case about the village green of Steeple Bumpstead, Essex, and the contested ownership history of Wisley Common, Surrey, from the 19th century through to the 1965 legislation and the present day. It feeds into current academic and popular debates about land reform and legislation. What do such cases tell us about local and regional identities, and popular ideas of the commons and common rights? Why did people still claim common rights in 1965, and today?

Suggested further reading: Fitch vs Rawling, Fitch & Chatteris, 4 Feb. 1795, English Law Review, 126, p. 614-618