Thursday, September 24, 2009

The future of books

As a bookseller and a (future) librarian, I'm deeply concerned with the future of books. Books are convenient, functional, relatively inexpensive, portable, semi-permanent and often beautiful physical objects, and I'm convinced that for those reasons the codex will be with us for many decades to come. (Image by Philippe Kurlapski.)

But there are immense pressures building to shift books into digital formats. And as you might guess, huge international media and consumer electronics conglomerates aren't doing this for your convenience. They're pushing for electronic books to replace print because digital licenses give them more control.

The Copyright Act of 1976 gives book owners rights--fair use and the first sale rule--which limit the restrictions copyright holders can place on the use of copyrighted material. Fair use means that limited portions of a copyrighted work can be reproduced and re-used without permission for purposes such as education, commentary and parody; the first sale rule means that once you own a legitimately obtained copy of a copyrighted work you can generally give it away or sell it to someone else.

But in the digital realm, licenses, electronic protection measures, and proprietary software allow publishers to prevent fair use or transfer. Now, you can't copy even a small portion of your e-book; once you're done with it, you can't resell it or even give it away. And even your right to keep a copy of a work you've paid for can be revoked at any time by the seller.

Big Brother, thy name is Amazon. This became rudely apparent this summer to owners of Amazon's Kindle e-book reader who purchased certain editions of George Orwell's novels Animal Farm and 1984. As reported by Brad Stone in the New York Times, when Amazon discovered that the vendor of the titles did not actually own the U.S. rights (the books are in the public domain in Canada but not the U.S.), it remotely deleted the books from its customers' Kindles. Amazon's Kindle terms of service state that customers are buying the "right to keep a permanent copy of the applicable Digital Content"--but the 1984 incident shows how hollow that promise is. The irony that the books involved were Orwell's dystopian visions of totalitarianism is almost too delicious: in 1984, of course, Winston Smith's job at the Ministry of Truth is to delete records of the past that don't conform to current Party orthodoxy by dropping them down the "memory hole" where they will vanish forever.

Stone quotes Kindle owner Bruce Schneier, chief security technology officer for British Telecom: “It illustrates how few rights you have when you buy an e-book from Amazon...I can’t lend people books and I can’t sell books that I’ve already read, and now it turns out that I can’t even count on still having my books tomorrow.”

It wasn't just the texts that were deleted. Kindle allows readers to make their own notes, highlighting and annotations to electronic texts, and those were disappeared along with the e-books themselves. Stone writes of high-school student Justin Gawronski, who had a summer assignment to read 1984 and woke up to find that all of the annotations he'd made to his Kindle version of the book were gone. Gawronski said, "They didn’t just take a book back, they stole my work." Two months later, as reported by Miguel Helft in the New York Times, Amazon offered to replace readers' copies of the deleted Orwell works, including their annotations--an offer that probably comes a bit late for Justin Gawronski.

And as Harvard professor Jonathan Zittrain points out in Helft's article, Amazon's ability to remotely delete Kindle content raises the specter of texts you've purchased being permanently deleted or altered in the future in response to lawsuits or government action. Only with Kindle editions, there's no need for anything so crude as an incinerator--a few keystrokes will suffice.

So is the Kindle any good? In a recent New Yorker piece, novelist Nicholson Baker compared the experience of reading a book on his Kindle 2 to driving "a white 1982 Impala with blown shocks." He details the Kindle's shortcomings--its clunky design, the low contrast of its screen, the "black flash" that occurs every time it displays a new page, the lack of page numbers (instead each section of text has a "location range"), the poor reproduction of illustrations such as photographs, tables and charts. Worst of all, Baker writes that a comparison of the print, online and Kindle versions of the New York Times reveals that the Kindle version is missing content: not only photographs, but "its Web-site links, its listed names of contributing reporters, and almost all captioned pie charts, diagrams, weather maps, crossword puzzles, summary sports scores, financial data....Sometimes whole articles and op-ed contributions aren’t there." In a check of the July 8th and 9th editions of the Times, Baker found five articles that are entirely missing from the Kindle versions. (Image by ShakataGaNai.)

Speaking of monopolies... Presumably, better-designed e-book readers will solve many of the Kindle's issues. And books and other texts that are created specifically for e-book readers, rather than converted from print versions, will offer photographs, tables, charts, and other illustrations that are clearer and more legible. But for the foreseeable future the vast majority of electronic texts will remain digitized print versions.

The largest print-to-digital conversion project is Google Books, and its history to date is not reassuring. For one thing, with any Google service there are significant privacy concerns. For another, Google now has monopoly control over a huge collection of digitized works taken from library collections. As Anthony Grafton writes in the online New Yorker,

"The out-of-print books Google has digitized come from nonprofit institutions that built their collections as a public good....These public treasures will now be monetized for the benefit of a private corporation. True, Google will give every public and university library one terminal where readers can access its entire collection. But these machines won’t be able to download or print texts--and you can imagine the lines. Libraries that want full access to all the books in Google will have to pay for the privilege, and for every download."

Finding the book you want isn't so easy, though. As UC Berkeley's Geoffrey Nunberg details in a recent article in the Chronicle of Higher Education, Google Books' metadata is a mess. Nunberg found that titles were garbled ("Moby-Dick: or the White Wall"), authors mismatched to books ("Madame Bovary By Henry James"), publications misdated (Robert Shelton's No Direction Home: The Life And Music Of Bob Dylan was dated 1899), subject headings misassigned (Jill Watts's biography Mae West: An Icon In Black And White was in the Religion category--although she might indeed be a holy figure to some), and text links misdirected (the link for the 1818 tract Theorie de l'Univers took Nunberg instead to Barbara Taylor Bradford's 1983 novel Voice of the Heart). A 1995 book about the web browser Mosaic was given the publication date of 1939 and attributed to Sigmund Freud.

These examples point to two main problems with the way Google creates metadata. The first is that instead of paying for the thorough, detailed, and accurate metadata painstakingly created by librarians for their collections, Google is clearly relying on automatic assignment of metadata via computer scanning. Nunberg finds that an 1890 guidebook, London of To-Day, was given the publication date of 1774 (a very different "to-day" than 1890) because the first pages contained an ad for a clothing manufacturer founded in 1774. The medieval studies journal Speculum was assigned to the subject category Health and Fitness, because the computer didn't understand the difference between the Latin word for mirror and the medical instrument. These are the sorts of errors that arise from trying to automate a process that requires human judgment.

The second problem is that instead of using library classification systems, such as call numbers and the Library of Congress Subject Headings, Google has chosen to use the Book Industry Standards and Communications (BISAC) system. The BISAC system was designed for bookstores containing thousands or perhaps tens of thousands of titles, not for research collections of millions of volumes. BISAC's idiosyncrasies get magnified by the sheer size of the Google Books collection. Call numbers permit distinctions between and groupings of similar works; for example, when you find a book in an online library catalog, most allow you to browse a virtual "shelf" to examine all the books that are classified in nearby call number ranges. But the BISAC categories are too crude to be a useful browsing tool when the collection consists of millions of volumes.

In the absence of usable metadata, the only efficient way to access Google Books is through text searching. That doesn't work very well if you are looking for a specific edition of a work (since the text will be identical in each), or are trying to investigate the literature of a particular period. Nunberg doesn't say so, but it's clear that Google needs to hire some librarians to sort out its metadata.

Meanwhile, libraries are eliminating books. David Abel of the Boston Globe reports that the administrators of Cushing Academy, a Massachusetts prep school, have decided to eliminate the books from the school's library:

"In place of the stacks, they are spending $42,000 on three large flat-screen TVs that will project data from the Internet and $20,000 on special laptop-friendly study carrels. Where the reference desk was, they are building a $50,000 coffee shop that will include a $12,000 cappuccino machine.

"And to replace those old pulpy devices that have transmitted information since Johannes Gutenberg invented the printing press in the 1400s, they have spent $10,000 to buy 18 electronic readers made by Amazon.com and Sony. Administrators plan to distribute the readers, which they’re stocking with digital material, to students looking to spend more time with literature.

"Those who don’t have access to the electronic readers will be expected to do their research and peruse many assigned texts on their computers."

Perhaps this isn't an issue for students whose families are rich enough to send them to Cushing (which costs nearly $43,000 a year for boarding students), but 18 e-book readers seem laughably inadequate as a replacement for an entire library. Students doing research and perusing assigned texts will evidently have to purchase their own copies to do so. (Clicking on the catalog link on Cushing's Fisher-Watkins Library page led only to an error message the night I tried it.)

Of course, many textbook publishers are moving into e-book formats because digital protection measures prevent students from sharing or selling their used textbooks. Now every student must buy their books new and only new, which means greater profits for publishers. (Image by Mark Wilson for the Boston Globe.)

The good old days. Which is what book publishing has been about for centuries. As Richard Nash writes in his review of Ted Striphas' The Late Age of Print (Columbia University Press, 2009), e-books are just the latest strategy in publishers' long-term struggle to control consumers by "inducing demand" and "creating artificial scarcity." Restrictive digital licenses are "the apotheosis of the publishing industry’s capacity to restrict a reader’s ability to do what they want with their books." In other words, we shouldn't mourn a genteel, non-commercial literary culture that is largely illusory. But if we want to retain fair use and first sale rights, we must organize.

Update 24 September 2009: The New York Times' Miguel Helft is reporting that the settlement between Google, the Authors Guild and the Association of American Publishers is being renegotiated. The settlement in the copyright infringement case brought by the authors' and publishers' groups against Google was reached last year, but is now being revised to address privacy and antitrust issues raised by many concerned individuals and groups, including a coalition of authors and publishers represented by the American Civil Liberties Union, UC Berkeley's Samuelson Clinic, and the Electronic Frontier Foundation.


  1. Your points are well taken but for the egregious omission of the positive aspects of a conversion to electronic format. Don't get me wrong. I love the codex, and I hope it is with us a long, long time. But I have an e-reader (not a Kindle), and I love it. I am someone who reads 25 books at a time, and with my e-reader I can carry them all (along with hundreds of other books and all of the texts I teach) wherever I go. E-texts are less expensive. They take no space. They are searchable in ways that are far, far more efficient than codices. Cutting and pasting is a wonderful way of taking "notes." Moreover, the defects of Google Books you point out are all defects that can be remedied, and they likely will be. But what is remarkable about Google Books is that there is suddenly available to me on the shores of an inland sea far from what the elite consider cultural centers an enormous store of writing and knowledge I never even knew existed. And now I can read it!

    Things change. The codex was once a revolutionary medium. So was papyrus. And with every change things are lost, but things are gained too.

  2. Peter, you make some excellent points. E-books are obviously here to stay, and they can have some advantages over print--as you point out, you can carry 25 (or more) at a time in a device no bigger than an iPhone, and text searching can be easier than using an index (although text searching has its own pitfalls). Digitization does make vast libraries accessible to anyone with an Internet connection, and (as I say in the post) e-books will undoubtedly improve.

    But I think you're mistaking my purpose in writing this post. I'm not saying (to paraphrase Orwell's Animal Farm) "Codex good, e-books bad." Instead, I'm pointing out that large corporations are pushing the e-book format for reasons that are in their commercial interests. And chief among those interests is control of content. If we as readers want to retain our rights to fair use, resale, privacy, and even ownership in the rush to e-book adoption, and if we want e-books to be more functional than print rather than less, we need to raise awareness of issues both legal and technological.

    That's why I end the post with the admonition not to mourn some lost, nonexistent paradise of print, but to organize. If you follow the link on that word, you'll see that it leads to the site for the Electronic Frontier Foundation, which is one of many groups fighting legal battles on behalf of (in their words) "free speech, privacy, innovation, and consumer rights" in the digital realm.

    Thanks for your comment!

  3. I am responding to the beginning of your post, regarding fair use and transfer. As an aspiring author, I do not object to the fact that purchasing a copy of a new work electronically doesn't give someone the right to give it away. I was on a college campus from 1998-2001, and then 2004, and saw the massive Napster-ing of the music industry. Someone could download a copy of a favorite song and send it to hundreds of persons on an e-mail list before you could say "Jiminy Cricket." As much as it frustrates me when I do research, I appreciate that Google Books severely limits access to in-print titles - "snippet view." As a bookseller, you probably appreciate the impact that widely available used copies have on an author's royalties - "free" digital copies would have an even greater impact.

    I think there are legitimate reasons for there to be different fair use policies for electronic versus print copies. A copy of a book can be given to only one person at a time; an electronic copy can be sent to hundreds in an instant. As for your comments about Amazon and Google, you may be right about those.

  4. David, you make good points about the ease of dissemination of electronic copies, and how the ability to freely copy texts might affect authors' ability to be fairly compensated for their work.

    But the way things stand currently isn't fair to readers. Often an e-book purchased in one format isn't compatible with more than one brand of e-book reader, which can make it difficult for even a single reader to access texts that they have purchased. If your Kindle conks out, for example, you can't migrate your purchases to a Sony e-book reader.

    And currently magazine and newspaper subscriptions purchased for a Kindle don't even carry over to a Kindle 2--in other words, despite having paid for a subscription, a reader loses access to all back issues if they upgrade their Kindle to a newer model.

    Surely there is a more reasonable compromise that would allow users to make a limited number of copies of purchased electronic content to preserve their access (or yes, to give a copy to a friend).

    As for the impact of used-book sales on authors' royalties, that's the subject for another post. Suffice it to say that used books currently account for fewer than 10% of book purchases. I've also observed a certain inelasticity of demand for both used and new books--that is, many people looking for a used copy of a title will not buy a new copy if a used one isn't available, and (perhaps more surprisingly) vice versa. So every used book purchase does not result in a lost new book sale--far from it. But as I say, that's a discussion for another time.

    Thanks for your comment!