If the question, “What do we want the university of the digital future to be?” seems like an impossibly ambitious frame of reference for the question of what we want literary studies in the digital future to be, let’s pause to locate the rhetoric of that question more specifically. Posed by Alan Liu in a keynote lecture for the Texas Institute for Literary and Textual Studies (TILTS) in March 2011, the question was at the top of a branching tree of increasingly precise questions: “What is the ecology of academic, governmental, philanthropic, for-profit, corporate, and other stakeholders in providing digital education?” (31:17); “Will the students of digital education be economic individuals or citizens?” (32:45); “What will be the labor model of the university of the digital future?” (33:34); “What will be the intellectual property model of the university of the digital future?” (34:13).1 This lecture—delivered in the shadow of an acute economic crisis in higher education—proposed that we remember and reexamine how, as a condition of possibility for humanistic study, the university needs to exist as an economic and ideological set of commitments, and it further proposed that we take seriously the profound changes that might be necessary in those commitments if the university is to continue to exist in some form in a future that is crucially conditioned both by the digital and by the new economy of scarcity. Literary and textual studies, in other words—and digital literary studies as a matter of course—would take place within an institutional and intellectual context that is undergoing radical change, under conditions of extreme competitive pressure. As Liu noted, quoting from Forbes magazine, “The Internet is about to do to America’s universities and colleges what it’s done to media and entertainment—profoundly upend them. And improve them” (25:40).
During that same period, the newly founded 4Humanities site asserted a close linkage between the crisis in the humanities at large and the intervention that the specifically digital humanities might make in that crisis.2 The site’s mission statement cites the decline in public and private support for the humanities and warns that the resulting cuts in funding place at risk all that the humanities contribute to our understanding of “the wisdom of the past, awareness of other cultures in the present, and imagination of innovative and fair futures.” It proposes that the digital humanities may prove an important ally to help the traditional humanities “communicate with, and adapt to, contemporary society” (“Mission”).
This alliance marks a crucial shift. The digital humanities has been animated, in its deepest intellectual roots, by the language of change. The “revolution” envisioned by the early theorists of hypertext and electronic modes of authorship suggested a radical restructuring of textuality, authorship, and readership and a significant democratization of the institutions of publication, with potentially dramatic political consequences (see, e.g., Lanham). But in more recent years the language of revolution has not retained its critical bite or its pervasiveness. In its place we can see a softer kind of incremental progressive vision in, for instance, the grant-proposal rhetoric that promises to “broaden access,” improve educational outcomes, and create “new ways of thinking” without upsetting fundamental institutional structures. The language of “innovation” situates these changes in a specifically technological frame of reference, emphasizing the production of “newness” as an ipso facto value. In the shift from revolution to innovation we can see a subtle concomitant shift in the implied framework of responsibility created by the vision of change. In calling for a revolution—in describing what such a revolution could entail—theorists and advocates like Jay David Bolter stipulate a set of responsibilities that are pervasively social: the responsibility of individual authors and readers to act as citizens of a newly democratized “writing space,” the responsibility of publishers and institutions to accommodate such spaces through changes in academic policy, the responsibility of professional communities to take seriously the consequences of such changes for the power and authority structures in such communities. The language of innovation, by contrast, implies a much more narrowly economic set of relations in which responsibility has been replaced by opportunity as a model of interaction. Innovation posits a competitive marketplace of ideas, in which newer (better) tools and ideas will be adopted because of the improvement they offer, following considerations of cost and benefit.
However, the alliance proposed by 4Humanities—and the questions posed in Liu’s TILTS keynote—urges us to recognize that the digital humanities must engage with change not (or not only) as a progressive narrative of steady, technology-driven improvement, but as a more agonistic and uncertain intervention in, or adaptation to, a process of change that is being driven by institutional and economic forces. Turning our attention in this way to changes in human practice—considered less as efficiencies or opportunities arising from innovation than as adaptations to complex reconfigurations of the sociotechnical landscape—shows us a set of responsibilities and consequences that are not fully represented in the facile parlance of innovation, or even in the more consequential language of revolution (now somewhat denatured through repeated use). Several such shifts in obligation have been identified in the essays in this volume. For individuals, important responsibilities arise from the changes to scholarly editing that Susan Schreibman describes in her contribution. Increased transparency of process requires readers to take seriously the significance of editorial processes in mediating the texts they consume, and the relocation and dispersal of editorial authority requires readers to play an active and informed role in that mediation. Similarly, the development of editions that perform a set of textual possibilities (as Schreibman describes Clement’s Versioning Machine edition of In Transition: Selected Poems by the Baroness Elsa von Freytag-Loringhoven) rather than produce a stable editorial product calls for a reader whose use of textual resources can accommodate this play as well. If the traditional edition serves as a grounding so that literary study can proceed without having to first establish its own textual basis, the performative edition asks the literary scholar to work within a much more complex textual universe, as well as to forgo certain kinds of interpretive closure.
These examples from digital editing are paralleled by others from textual analysis and data modeling; the analytic power we gain from algorithmic processes on the one hand, and from carefully modeled data on the other, arises from the ways in which those algorithms and models represent (more or less successfully) a strong understanding of the text under examination. The introduction of this layer of formally expressed knowledge into the scholarly ecology creates a burden of responsibility to understand how that layer works and what it is saying, or at least to take its existence seriously. William Kretzschmar’s essay in this collection offers a case in point, illustrating the levels of decision making and complex mathematics undergirding the analysis that in turn permits assessments of things like the randomness or clustering of distribution of some specific geographically distributed feature such as linguistic idioms. As he notes,
The making of any model is a deliberate act of the maker, in part a reflection of the maker’s theoretical foundations and assumptions about what is represented. The more explicitly these ideas are formulated and made known, the more usable the model will be for others besides the maker—and vice versa.
The modeling process described here is not simply an articulation of method but also an act of engagement with users of the data, who are expected to work with the model in ways that echo the kinds of performativity noted above. The reciprocity of that engagement—the ways in which the user’s grasp of the model through reuse of the data constitutes both an acknowledgment of the model’s value and a potential critique of it—puts the modeler and the user into a much more active relationship, one that is also potentially more fraught and risky, as Kretzschmar’s “and vice versa” suggests. In a similar way, Charles Cooney, Glenn Roe, and Mark Olsen’s discussion of text mining and textbases makes clear that the text-mining techniques and algorithms described all incorporate expert decision making into their design and application; these are not systems that can operate blindly. If the quality and appropriateness of these systems have a bearing on the scholarly outcomes they yield, then it follows that as consumers of that scholarship we need to be able and willing to understand what is at stake.
These examples demonstrate a familiar point about the ways in which we need to be critical consumers of our statistical and technical tools, but I invoke them as well for another, cognate purpose: to articulate the place of the individual mind in the increasingly large-scale logic of digital systems. This is not an exercise in nostalgia but rather an exploration of a peculiar kind of leverage that is being exerted on the humanities—and also the digital humanities—by the changes in the academy, the subject of Liu’s TILTS lecture. The recession-era push to do more with less provides motivation for both defensive, retreating shifts (such as elimination of specialized departments and increased class size) and opportunistic ones (such as the creation of online educational programs that arguably serve to expand access and increase educational opportunities even while they help reduce costs). But both shifts reduce the visibility of the individual—reduce the proportionality, we might say, of the individual to the system. This is true whether we are considering the teacher (now an intellectual focal point for an expanding set of educational relationships that might number in the hundreds or thousands per course) or the student (now a proportionally smaller participant in larger and larger classrooms or online learning communities). It is important to note that both retreat and opportunity operate in the same way here: they accept the same structural premises, namely that the individual must be placed in ever greater subordination to a system of interconnections and that the efficiency and scale of the educational operation are the primary measures of its success—a developmental direction mapped out and justified by the logic of industrial technology.
This push is happening in the same academic culture that is seeking to find ways of building and engaging with larger aggregations of data: the “what do you do with a million books?” challenge that has motivated tremendous rhetorical and economic resources in the years since Gregory Crane’s article of that title. As above, the discussions of methodology that have accompanied the “big data” turn in digital humanities have stressed the comparative insignificance of any individual item in the research landscape; in his introduction to Graphs, Maps, Trees, Franco Moretti observes of literary study that “a field this large cannot be understood by stitching together separate bits of knowledge about individual cases, because it isn’t a sum of individual cases: it’s a collective system, that should be grasped as such, as a whole” (4). The individual text recedes here to take its place as a point in one of his book’s eponymous visualizations, through which we can see “a field . . . a collective system” that by implication carries more cultural and intellectual weight. Clearly in Moretti the idea of subordination does not carry the same apparent moral significance as it does in the case of human students. Nonetheless, with the earlier example in mind, our reading of Moretti should be alert to the many potential valences of the relation between individuals and systems: for instance, a “collective” logic through which the meaning of individual cases is most effectively realized or a “system” logic of industrial management in which individual distinctiveness and locality are set aside because they cannot be thought.
If the individual in this ecology is not the text but the scholar, we see an interesting shift of perspective. To return for a moment to the classroom at scale of distance education: when situated in that virtual classroom space, the individual student seems from the perspective of the educational vector (whose intellectual hub is traditionally imagined as the teacher) to be infinitesimally small. But if we reverse the perspective, as if reversing the telescope end for end, the student who was situated in the vast periphery of a subordinatively large number of fellow students is now seen as the hub of an enablingly large number of educational vectors: the educational abundance of the Web. Similarly, Moretti’s subordination of the individual text to the “collective system” situates the scholar—the “we” who “grasp” that system—in what is by implication a much more empowering geometry: not at one end of a single line of close reading that attaches us to a single text but at the center of an information nebula. But for that geometry to be empowering, our transactions with that nebula must successfully convert its informational scale into something we can think, and the tools and modeling tactics we use to accomplish that conversion must be understood as tools for transacting a vast disproportion, for effecting a gigantic informational change of state. Digital literary study must thus consider, as a central problem, the empowerments and disempowerments contingent on its use of tools, not because they are tools, but rather because of the questions they raise about how we are situated in relation to our objects and methods of study. The human scholar of literary studies must be present in the inquiry at its end points—as the initiator of questions and consumer of answers—and also inside the process, inside the tools, as they mediate between us and the field we are seeking to grasp.
The essays in this volume register this disproportion, this leverage, and engage with it in complex ways. In exploring tools and techniques that offer insight over large bodies of data—GIS, visualization, text mining—they chart in detail the intellectual gains to be had and suggest that an important strand of the research agenda for digital literary studies is to flesh out more fully the potential of such tools through detailed application. But their concluding reflections offer a repeating dialectical figure in which the individual mind is positioned in relation to these systems. In this volume, David Hoover observes, “Computer-assisted textual analysis is neither a panacea nor a substitute for sound literary judgment, but its ability to refine, support, and augment that judgment makes it an important analytic method for literary studies in the digital age.” Schreibman explicitly names a “dialectic between what we might consider the more traditional and intuitive bases of literary interpretation” and the “disambiguating premise of stylometrics, attribution studies, and other statistical methodologies common to computational and algorithmic processing” from which emerge “new forms of analysis, meaning, and insights.” Cooney, Roe, and Olsen also speak of a “dialectic between technological progress and critical inquiry.” Other essays frame the relation more cautiously, opening up an explicit space of value to be occupied by the human scholar. As Stéfan Sinclair, Stan Ruecker, and Milena Radzikowska put it,
It is important for the scholar to know enough about the visualization tools to understand that the interpretive work is being guided and biased by the data and software. Failing that, we need to have methodologies that are sufficiently well tested and understood for scholars to be able to use the tools with confidence. The question remains whether humanistic inquiry lends itself well to well-trodden methodologies when originality and idiosyncrasy are the norm.
And Tanya Clement concludes her essay by invoking the “messiness and doubt” that make “the complexities of concepts like race, gender, class, and culture most immediately relevant.” In this language of “dialectic” and “ambiguity,” I suggest, we may see a second important strand of the research agenda, centering on the nature of the “literary” as another way of asking what we mean by “the human.”
The science of statistics arose as a way of managing scale by distancing the perceiver from the complexity of individual objects (for instance, individual human beings), creating surrogates through which we can know the world as information to be managed. These surrogates are data points, uniform formal packages whose properties are determined by the needs of the knower rather than the exigencies of the object of scrutiny. There is a brutality in this process that exists quite apart from any specific political motivations; the logic of surrogacy turns us away from the individual human narrative through which we could experience pathos, causal explanation, the quiddity of human life. It puts us in possession of a structural understanding, but it thereby also relocates our minds into that structural space so that the appropriate objects of knowledge are trends, relations, aggregations, species, systems.
When does the individual matter? Literary study has in the past made a practice of dealing with individual narratives: the genesis of a novel, the motivations of an author in relation to a specific work, the semantic ecology of a poem. It has also sought to generalize from these narratives to produce an understanding of history, of genre, of regionality. Digital literary study likewise seeks to move from explanatory narratives that apply to individuals to explanatory narratives that apply to populations. This is a move that seeks to relocate the scene of our knowledge, so that by knowing about more things (for instance, more novels) and by grounding our statements in more data, we thereby in virtue of that data appear to know things about culture as a whole. We have, in tools like text mining and visualization, an unparalleled capacity to gain a very large-scale view of culture—of popular fiction, of linguistic variation, and so forth—as a set of observable phenomena that emerge from the aggregation of data points about individual instances. But what has not been clearly articulated is the role—if any—that an individual instance plays in validating or shaping those larger conclusions.
I offer two examples by way of illustration. The first is the tolerance for error in large digital humanities data sets. It is a commonplace of digitization practice that very high rates of accuracy (in transcription, in optical character recognition [OCR]) are unachievable (because of cost) and, in large-scale digitization efforts, unnecessary. Because the statistical techniques underlying text mining and text analysis operate on aggregations, individual textual errors typically have a statistically insignificant impact on the outcome of the analysis; similarly, when searching a large text corpus for instances of a given word, the user outcome will be only infinitesimally affected by the omission of a single search hit resulting from a transcription or OCR error. However, interesting cases are sometimes cited in which a pattern of error (such as the mistranscription of long s as f) is substantial enough to skew the results. The threshold of significance for error in the aggregate is thus one crucial point; Geoffrey Nunberg’s complaints about the quality of the metadata in Google Books constituted a claim about the basic usability of that data because of the pervasiveness of error. But another crucial point has to do with edge cases and what we can learn from large-scale collections: while the phenomena that are statistically predominant (and comparatively unaffected by small error rates) are naturally of great interest, we may also want to be able to study rarer phenomena in such large collections, and these will be much more subject to interference or obscuration by even small errors. Methods of analysis that look at infrequent rather than frequent words will also need to take the impact of individual errors into account.
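The asymmetry described here can be made concrete with a small sketch. The toy corpus, the word choices, and the function names below are hypothetical, invented only for illustration; the point is simply that one stray error barely changes the count of a frequent word, removes a large share of a rare word's occurrences, and that a systematic misreading of long s as f can erase a rare word altogether.

```python
from collections import Counter


def relative_error(true_count, observed_count):
    """Share of a word's true occurrences that the noisy count gets wrong."""
    return abs(true_count - observed_count) / true_count


def misread_long_s(token):
    """Simulate a systematic error: every non-final 's' is read as 'f'."""
    if len(token) < 2:
        return token
    return token[:-1].replace("s", "f") + token[-1]


# Hypothetical toy corpus: "the" is frequent, "sagacity" is rare.
clean = ["the"] * 1000 + ["sagacity"] * 4

# Isolated errors: one damaged instance of each word.
noisy = ["the"] * 999 + ["tbe"] + ["sagacity"] * 3 + ["fagacity"]

clean_counts = Counter(clean)
noisy_counts = Counter(noisy)

frequent_loss = relative_error(clean_counts["the"], noisy_counts["the"])
rare_loss = relative_error(clean_counts["sagacity"], noisy_counts["sagacity"])

# Patterned error: the long-s misreading applied to every token.
systematic_counts = Counter(misread_long_s(t) for t in clean)

print(f"frequent word: {frequent_loss:.1%} of instances lost")
print(f"rare word:     {rare_loss:.1%} of instances lost")
print("rare word after systematic error:", systematic_counts["sagacity"])
```

A search for the rare word in the systematically damaged corpus would return nothing, while a search for the frequent word would be essentially unaffected; which of these outcomes matters depends on the question being asked of the collection.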
More interesting than error, though, is the question of how we understand the significance of individual instances in constituting the overall phenomenon being studied. If all words in a text are not equally relevant carriers of meaning—for instance, words occurring within paratextual features such as running heads, indexes, and advertisements or within nonauthorial features such as editorial notes and quoted materials might clearly be excluded from certain kinds of textual analysis—then as our analysis becomes more fine-grained we may find that we need a clearer account of which textual instances are actually contributing to the truth-value of our large-scale observations.
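One way to picture such an account is to imagine each word carrying a label for the structural zone in which it occurs, so that an analysis can state which zones it counts. The sketch below assumes this kind of labeling already exists; the zone names, the sample tokens, and the function are hypothetical, and a real corpus would derive the zones from its own markup.

```python
from collections import Counter

# Hypothetical zone labels for paratextual and nonauthorial material.
EXCLUDED_ZONES = frozenset(
    {"running_head", "index", "advertisement", "editorial_note", "quotation"}
)


def count_words(tagged_tokens, excluded_zones=frozenset()):
    """Count words, skipping any token whose zone is in excluded_zones."""
    return Counter(
        word for word, zone in tagged_tokens if zone not in excluded_zones
    )


# Invented sample: (word, zone) pairs.
tagged = [
    ("whale", "body"),
    ("whale", "running_head"),
    ("whale", "running_head"),
    ("sea", "body"),
    ("whale", "index"),
    ("sea", "quotation"),
]

all_counts = count_words(tagged)
body_counts = count_words(tagged, EXCLUDED_ZONES)

print("counting everything:", dict(all_counts))
print("authorial body only:", dict(body_counts))
```

In this invented sample the running heads and index account for most occurrences of one word, so the large-scale figure changes substantially depending on which instances are allowed to contribute to it.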
In the place of the qualitative divide between the individually irrelevant instance and the aggregation (in which the instance is treated almost like a mathematical infinitesimal), we will need a geometry in which the instance and the aggregation both appear. If we think of the statistical aggregation—the large corpus, the visualization, the database—as a coordinate plane populated by data points, each of which carries its tiny payload of information (metadata, word frequency, demographics, and so forth), then what I am concerned with here is reintroducing a z-axis: in effect, a vector of connection between each data point and its source in the world, whether that is a text or an artwork or a human being or a linguistic transaction. A recent example of traversal of this z-axis appears in an article by Daniel Cohen and Fred Gibbs entitled “A Conversation with Data: Prospecting Victorian Words and Ideas,” in which the authors propose an approach that moves between “distant” analysis and a closer look at individual instances to consider things like the structural location and collocation of specific words. Although the examples they give are still more weighted toward the “distant” side of things, the explicitness of their point that large-scale analysis is complementary to detailed exegesis is important. Put this way, the point may seem banal, but the fact that it seemed to them necessary and useful to argue for it suggests that, at least in the current cultural moment, there is a presumption against the need for the “close” view, the z-axis.
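As a rough illustration of what such a z-axis might look like in a data model, the following sketch gives every data point a reference back to its source, so that an aggregate view can be traversed down to the individual instances behind it. Everything here is an assumption made for the example: the class names, the fields, the sample values, and the idea of tracing the points that sit far from the mean. It is not drawn from Cohen and Gibbs's method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Source:
    """The thing in the world that a data point stands in for."""
    title: str
    locator: str  # hypothetical: a page, line, or archive reference


@dataclass(frozen=True)
class DataPoint:
    year: int          # x: position on the plane
    frequency: float   # y: the measured value
    source: Source     # z: the connection back to the instance


def aggregate(points):
    """The 'distant' view: one number summarizing the whole plane."""
    return mean(p.frequency for p in points)


def trace_outliers(points, threshold):
    """The 'close' view: sources of points far from the aggregate."""
    average = aggregate(points)
    return [p.source for p in points if abs(p.frequency - average) > threshold]


# Invented sample data.
points = [
    DataPoint(1850, 0.10, Source("Novel A", "ch. 3")),
    DataPoint(1860, 0.12, Source("Novel B", "ch. 1")),
    DataPoint(1870, 0.50, Source("Novel C", "ch. 7")),
]

to_reread = trace_outliers(points, threshold=0.2)
for source in to_reread:
    print(f"reread {source.title}, {source.locator}")
```

The design choice worth noticing is only that the source reference travels with the data point rather than being discarded at the moment of aggregation, which is what keeps the return to the individual text possible.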
There may be good methodological reasons for insisting on this traversal; for one thing, as we’ve seen, the accuracy of the connection between the plane of the aggregation and the individual sources for its data can’t be taken for granted (and indeed there is a great deal more to be said and learned about methods of ascertaining that accuracy and about whether and how it matters). For another, if our interest in literary study still concerns individual works of literature (however we may define that term), at some point we need to turn our interpretive attention back to these, following whatever insights our look at the broader field may have yielded. But there are also reasons I would like to characterize as moral: in training ourselves and our students to understand the relation between individual cases and broader trends, we train ourselves to treat that same structure responsibly in other arenas—in other words, to understand that trends and populations are consequential narratives of mastery to which we may not be entitled. An earnest contribution to the 4Humanities site on 23 August 2012 makes this point especially vivid: writing about measures taken by the Australian government to prevent asylum seekers from crossing the Indian Ocean, Debjani Ganguly makes clear that part of the problem is precisely the way these refugees are perceptible to and manageable by the observing world: as inputs to a bureaucratic process (the “interminable wait in a queue in Malaysia to get legally processed”), as “a statistical abstraction,” as a mass too large to be absorbed by the countries where they seek asylum. Ganguly proposes that the role of the humanities is to “advance where policy retreats” and, most potently (and pedagogically), to counter these abstractions with a minutely realized view of the human source: “We preserve records of previous habitations. We understand and assemble pasts, presents and futures anew. We imagine spaces and topographies before and after they acquire materiality. We discover facts, affects, metaphors and images.”
Can digital literary studies provide a model for how humanists might work at scale without losing sight of why our study matters? To complement this question, here is another: can digital literary studies provide a compelling and distinctive account of interpretation—of how literary scholars might work digitally “close up” in ways that take advantage of the detailed modeling that digital representation affords? How can digital methods help us reexamine the representativeness and the distinctiveness of the individual text in the context of the vast cultural landscape these methods help us grasp?
2. The site describes itself as “a platform and resource for advocacy of the humanities, drawing on the technologies, new-media expertise, and ideas of the international digital humanities community.” It further notes that “[t]he humanities are in trouble today, and digital methods have an important role to play in effectively showing the public why the humanities need to be part of any vision of a future society.”
“About.” 4Humanities: Advocating for the Humanities. Ed. Alan Liu, Geoffrey Rockwell, Stéfan Sinclair, and Melissa Terras. 4Humanities, 2012. Web. 11 Feb. 2013. <http://humanistica.ualberta.ca/about/>.
Bolter, Jay David. Writing Space: Computers, Hypertext, and the Remediation of Print. Hillsdale: Erlbaum, 1991. Print.
Cohen, Daniel, and Fred Gibbs. “A Conversation with Data: Prospecting Victorian Words and Ideas.” Victorian Studies 54.1 (2011): 69–77. Web. 21 Sept. 2012. <http://muse.jhu.edu/journals/victorian_studies/v054/54.1.gibbs.html>.
Crane, Gregory. “What Do You Do with a Million Books?” D-Lib 12.3 (2006): n. pag. Web. <http://dx.doi.org/10.1045/march2006-crane>.
Ganguly, Debjani. “Keeping the Human Condition.” 4Humanities: Advocating for the Humanities. Ed. Alan Liu, Geoffrey Rockwell, Stéfan Sinclair, and Melissa Terras. 4Humanities, 23 Aug. 2012. Web. 12 Nov. 2012. <http://humanistica.ualberta.ca/2012/08/debjani-ganguly-keeping-the-human-condition/>.
Lanham, Richard. The Electronic Word: Democracy, Technology, and the Arts. Chicago: U of Chicago P, 1993. Print.
Liu, Alan. “The University in the Digital Age: The Big Questions.” The Digital and the Humanities. U of Texas, Austin, 10 Mar. 2011. Web. 20 Sept. 2012. Video of address.
“Mission.” 4Humanities: Advocating for the Humanities. Ed. Alan Liu, Geoffrey Rockwell, Stéfan Sinclair, and Melissa Terras. 4Humanities, 2012. Web. 20 Sept. 2012. <http://humanistica.ualberta.ca/mission/>.
Moretti, Franco. Graphs, Maps, Trees: Abstract Models for a Literary History. London: Verso, 2005. Print.
Nunberg, Geoffrey. “Google’s Book Search: A Disaster for Scholars.” The Chronicle Review. The Chronicle of Higher Educ., 31 Aug. 2009. Web. 16 Nov. 2012. <http://chronicle.com/article/Googles-Book-Search-A/48245/>.