Five or 10 years ago the excitement surrounding digital books was all about access: imagine being able to call up a copy of the 1851 British edition of The Whale (later Moby Dick).
It didn’t take long for readers to discover that scrolling through 1,000 pages was an eye-numbing task. Naturally, industry rose to the challenge, and the “electronic ink” inside devices from Sony (Reader) and Amazon (Kindle) is good enough that I now can imagine curling up by the fireplace with a good e-book. But even the most technically enthused remain skeptical about the value added, and those of us given to nostalgia are not likely to forgo the tactile pleasure of turning a page.
So if not to read, why are digital books so doggone valuable? Search would seem the obvious answer. In a digital world containing billions of words, search is king.
But search can be ineffectual. A query for “whale” in Google Books produces 19,600 hits. Narrowing the search by typing in “whale” and “god” results in a more manageable 1,830 hits. In the top-ranked books is the very promising Complete Literary Guide to the Bible. But it is promptly followed by less promising links to Introductory Sociology: Order and Change in Society and On the Bright Side, I’m Now the Girlfriend of a Sex God: Further Confessions.
The lesson? There is a lot of useless straw in the haystack. Unless I know, more or less, what to look for (say, a quotation only partially remembered), searching for research purposes, as opposed to browsing for discovery purposes, is not so practical.
More interesting and exciting than the mere searching of digital texts is the ability to leverage computation to process and analyze textual data. The folks at Google recently added a “Popular Passage” feature to their book interface, which reveals a list of quotations from Moby Dick, for example, that are also present in other books in the archive. From here it is easy to see how Moby Dick’s wake sends ripples through the pond of prose.
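To make the idea concrete, here is a minimal sketch of how shared passages might be detected, assuming a simple approach that compares overlapping five-word sequences. This is not Google's method, which the article does not describe; the function names `shingles` and `shared_passages` and the two sample "books" are hypothetical, invented only for illustration.

```python
import re

def shingles(text, n=5):
    """Return the set of n-word sequences found in text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_passages(source, others, n=5):
    """Map each title in others to the n-word sequences it shares with source."""
    source_shingles = shingles(source, n)
    result = {}
    for title, text in others.items():
        common = source_shingles & shingles(text, n)
        if common:
            result[title] = common
    return result

# Toy data: a snippet of the novel's opening and two made-up books.
moby = ("Call me Ishmael. Some years ago, never mind how long precisely, "
        "having little or no money in my purse")
library = {
    "Hypothetical essay": 'The novel opens with "Call me Ishmael. Some years ago," '
                          "and then wanders off to sea.",
    "Hypothetical cookbook": "Whisk the eggs and fold in the flour.",
}

hits = shared_passages(moby, library, n=5)
for title, passages in hits.items():
    print(title, sorted(passages))
```

Only the essay is reported, because it alone repeats a run of at least five consecutive words from the source. A real archive would need something far more economical than comparing every pair of books, but the principle of matching word sequences is the same.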
This is what I call macroanalysis: a method of literary text analysis akin to macroeconomics. Literary macroanalysis does not concern itself with single books but rather with an entire “economy” of texts. English professor Franco Moretti has theorized along similar lines with what he terms “distant reading,” an alternative to the close reading of texts that steps back in order to consider all the texts of a given genre or period, not just the handful that have come to represent the period. A computer-based macroanalysis of digital books offers a way of realizing this goal.
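As one illustration of what such an aggregate view could look like, the sketch below tallies how often a single word appears per decade across a small collection, relative to all the words published in that decade. It is only a toy under stated assumptions: the function name `relative_frequency_by_decade`, the three-text corpus and its dates are all hypothetical, and a real macroanalysis would involve far more than word counts.

```python
import re
from collections import defaultdict

def relative_frequency_by_decade(corpus, term):
    """Return {decade: share of all words in that decade equal to term}.

    corpus is an iterable of (publication_year, text) pairs.
    """
    totals = defaultdict(int)  # total words seen per decade
    hits = defaultdict(int)    # occurrences of term per decade
    for year, text in corpus:
        decade = (year // 10) * 10
        words = re.findall(r"[a-z']+", text.lower())
        totals[decade] += len(words)
        hits[decade] += words.count(term)
    return {d: hits[d] / totals[d] for d in sorted(totals) if totals[d]}

# Hypothetical miniature corpus: (year, text) pairs standing in for whole books.
corpus = [
    (1851, "the whale the sea the ship"),
    (1853, "a whale and a scrivener"),
    (1871, "the town the marriage the reform"),
]

freqs = relative_frequency_by_decade(corpus, "whale")
print(freqs)
```

The point of the design is that no single book is read at all; each text contributes only to the totals for its decade, which is the sense in which the analysis looks at an economy of texts and not at individual works.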
This sort of analysis motivates the workshop Literary Studies and the Digital Library: Beyond Search and Access, sponsored by the Stanford Humanities Center. The participants hope to reveal new kinds of literary information and thus develop a better understanding of literature as a system, as an aggregate.
In the past, scholars had few digital texts to analyze. Today we have millions. But barriers remain. To be useful to students of literature, who want to better understand literary history and how literary style evolves, these books cannot simply be dumped into a massive online bucket. Classification (and, thus, the librarian) is necessary so that we can sort out the romances from the self-help guides, the novels from the legal tracts. And the digital surrogates need to be accurate.
Electronic texts, digital libraries and computation offer us unique ways of researching, analyzing and understanding the literary record. This is a moment of extraordinary opportunity, an opportunity to reimagine our objectives, our methodologies and perhaps even the very subject of our literary scholarship. Exactly how we will do this remains to be seen.