A historical perspective on content

In Europe before the 1450s, books were precious, rare objects and were usually copied by hand over a period of months or years. Johannes Gutenberg and his printing press changed the economics of information distribution. The result of this change was less expensive books, greater literacy, and a challenge to those in power, who benefited from restricting information. Today, the rise of the Internet has eliminated distribution costs as a barrier to entering the publishing market. With minimal equipment, anyone can publish a blog or book, record and distribute a podcast, or deliver video content. What do these changes mean for technical communication? And what lessons can we learn from the changes that took place more than 500 years ago?

In the last 20 years, the economics of information have shifted toward the author and away from the publishers (or gatekeepers):

  • It’s possible to record high-quality audio and video with inexpensive equipment
  • The Internet provides numerous publishing platforms (Blogger, WordPress, YouTube, Lulu, Amazon, iTunes, and so on)

The possibilities are endless: books, ebooks, PDF files, web content, screencasts, podcasts, digital videos, wikis, and more. But which of these platforms will succeed?

The text cycle

To understand the implications of digital publishing, it’s helpful to break down the process of information development. Terje Hillesund developed a text cycle [1] with the following phases:

  • Writing (authoring)
  • Production
  • Storing
  • Representation
  • Distribution
  • Reading (consumption)

Traditional storytelling combines all of these phases into a single event: one person at the campfire telling a story while the audience listens.

Written language separates distribution from consumption. The author no longer needs to deliver the story in person, because written content can be moved from one location to another and read later.

The printing press introduces further separation of the phases by disconnecting production (formerly hand-copying) from distribution. It becomes possible to produce a page once and create many, many copies of that page.

Digital content allows further separation. Physical distribution is no longer required, and the representation (formatting) of the text is separated from the production (markup) and potentially from the storing (content management system).
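
The separation of storing from representation can be illustrated with a minimal Python sketch. The topic structure and the two renderer functions are hypothetical and chosen only for illustration; they are not part of Hillesund's model or of any particular publishing tool. The stored content carries structure but no formatting, and each renderer produces a different representation from the same source.

```python
from html import escape

# Hypothetical stored content: structure only, no formatting decisions.
TOPIC = {
    "title": "Replacing the filter",
    "steps": ["Switch off the unit.", "Remove the cover.", "Insert the new filter."],
}


def render_html(topic):
    """Represent the stored topic as an HTML fragment."""
    items = "".join(f"<li>{escape(step)}</li>" for step in topic["steps"])
    return f"<h1>{escape(topic['title'])}</h1><ol>{items}</ol>"


def render_plain(topic):
    """Represent the same stored topic as plain text."""
    lines = [topic["title"], "=" * len(topic["title"])]
    lines += [f"{n}. {step}" for n, step in enumerate(topic["steps"], start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    # One stored source, two representations.
    print(render_html(TOPIC))
    print(render_plain(TOPIC))
```

Adding a third output format means adding a third renderer. The stored topic does not change, which is the economic point of the separation.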


Quality versus cost

The printing press, which made inexpensive books possible, did require a compromise in quality. Hand-crafted, hand-copied books, with their carpet pages, intricate capital letters, and unique illustrations (often customized for the person who commissioned the book) were works of art.

The earliest printed books were hand-illuminated after the printing process, but this added effort gave way quickly to mass-produced books. The ability to produce books faster and cheaper was more compelling than the increased quality resulting from extra manual work.

Before the printing press, the act of copying the book also created the formatting. With the printing press, the formatting was done in a separate typesetting step, and it was then possible to create a large number of copies from a single formatting effort.

Today, the publishing world sits at a very similar inflection point. The rise of electronic publishing along with the ability to separate authoring from formatting is analogous to the rise of printing and the ability to separate formatting from distribution. Just before the printing press (1450), approximately 50,000 books existed in Europe. Within 50 years, that number rose to 12 million. [2]


Figure: The rise of books in Europe after the printing press

What are the implications for technical content?

The rules of publishing, which were relatively static for hundreds of years, are now changing by the day. Consider that iPad tablet publishing did not exist until 2010. The Kindle reader (2007) drives a brand new ebook business. We can expect to see increases in publishing velocity, volume, and versioning requirements. And based on the way that printing evolved, we can expect that economic considerations will determine which innovations succeed.

With this in mind, expect the following developments:

Streamlined publishing workflows
Given the proliferation of output formats, the publishing workflow must be automated. Labor-intensive final production work will likely disappear. Like hand-illumination, these tasks add quality, but they obstruct efficiency. For technical content, efficiency will outweigh perfect kerning, copy-fitting, and other design niceties.
Data-driven, user-customizable graphics
Highly designed infographics and other complex images will remain the domain of the professional author for now. To reduce the cost of maintaining (and especially translating) these graphics, authors must use layers and carefully separate the core graphic elements from the labels that require translation. There is room, however, for growth in graphics that readers can manipulate or create. If we make the data available to our readers, they can choose how to display the information (bar graph or pie chart?), filter the information displayed on the chart, and control the colors and the fonts used in the chart. Google Analytics and many web-based application dashboards provide users with ways to manipulate data. Technical communication needs to make better use of these types of technologies and provide flexible ways to render information. Instead of focusing on controlling the presentation of graphical information, we can build information applications that the reader can control.
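
As a rough sketch of reader-controlled graphics, the Python example below publishes a small data set and lets the reader decide how to filter and display it. The data values and the function names (`filter_data`, `as_bars`, `as_shares`) are assumptions made up for this illustration. Text output stands in for a real chart so that the example stays self-contained.

```python
# Hypothetical data set that the author publishes instead of a finished image.
DOWNLOADS = {"PDF": 120, "EPUB": 80, "HTML": 200}


def filter_data(data, minimum=0):
    """Keep only the categories at or above the reader's chosen threshold."""
    return {label: value for label, value in data.items() if value >= minimum}


def as_bars(data, width=20, mark="#"):
    """Render the data as horizontal text bars scaled to the largest value."""
    if not data:
        return []
    largest = max(data.values()) or 1  # avoid dividing by zero
    return [
        f"{label:<5}{mark * round(value * width / largest)} {value}"
        for label, value in data.items()
    ]


def as_shares(data):
    """Render the data as percentage shares, the view a pie chart would give."""
    total = sum(data.values()) or 1  # avoid dividing by zero
    return [f"{label}: {value / total:.0%}" for label, value in data.items()]


if __name__ == "__main__":
    # The reader, not the author, picks the filter, the view, and the styling.
    chosen = filter_data(DOWNLOADS, minimum=100)
    print("\n".join(as_bars(chosen, mark="*")))
    print("\n".join(as_shares(DOWNLOADS)))
```

The author maintains only the data. Labels live in the data rather than in an image, so translating them does not require reworking a graphic.
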
Limited use of audio and video
If we apply Hillesund’s text cycle to audio and video, we can see why audio and video are not (yet) going to take over from text. The components of the audio and video development cycles are not yet separated as clearly as the text development components. In particular, when audio or video is recorded, the content storage and representation are tied together. These two facets need to be separated to provide for really inexpensive (and therefore widespread) usage. A basic example where storage and representation are separated is text-to-speech functionality, which has the ability to render audio in a voice chosen by the end user, rather than in the audio track laid down by the author. But the vast majority of audio files use sound recordings, where the content is inextricably tied together with the delivery. There are similar issues with video. Exceptions are screencasts and digital animation, where the source files have layers and timelines, which content creators can manipulate as needed. But today, we do not have the same degree of separation of content and formatting for audio and video as we do in text and graphics. We can’t slice apart audio and video the same way that we manipulate text.

Velocity, volume, and versioning

Velocity, volume, and versioning are the three Vs that drive the economics of information:

  • Velocity: the speed at which new information is created and delivered
  • Volume: the amount of content that needs to be created and delivered
  • Versioning: the content variations that need to be supported for end users

The requirements for the three Vs are pushing organizations to fully automate their workflows to eliminate delays in information delivery.

Velocity and volume are also implicated in the rise of topic-based authoring. When authors work at the topic level, it’s easier to move authors from project to project and therefore put additional people to work on high-priority projects. This is much more difficult in narrative or book-based content.

Like velocity and volume, versioning requirements are increasing. Instead of creating a few manageable versions of content, technical communicators are being asked to support products that have dozens or hundreds of variations. With that many versions, the only reasonable solution is to deliver all of the content and then filter it based on a user’s profile. This requires an excellent understanding of the product and (again) complete automation of the rendering process.

High-end versioning probably means that the content objects need traceability—they need to be connected to the corresponding product functions, so that the system can include the appropriate information for each user.
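
The filtering and traceability described above can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the `ContentObject` class, the feature names, and the rule that an object is shown only when the user's profile includes every feature it is traced to. A production system would more likely rely on the conditional-processing features of its content management or publishing toolchain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentObject:
    """A content object traced to the product features it documents."""
    topic_id: str
    text: str
    features: frozenset = frozenset()  # empty set: applies to every variant


# Hypothetical content set covering all product variations.
CONTENT = [
    ContentObject("intro", "Welcome to the product."),
    ContentObject("wifi-setup", "Connect to a wireless network.", frozenset({"wifi"})),
    ContentObject("gps-setup", "Calibrate the GPS receiver.", frozenset({"gps"})),
    ContentObject("sync", "Sync location data over Wi-Fi.", frozenset({"wifi", "gps"})),
]


def filter_for_profile(content, user_features):
    """Return the objects whose required features are all in the user's profile."""
    owned = frozenset(user_features)
    return [obj for obj in content if obj.features <= owned]


if __name__ == "__main__":
    # A user whose product variant has Wi-Fi but no GPS.
    for obj in filter_for_profile(CONTENT, {"wifi"}):
        print(obj.topic_id, "-", obj.text)
```

Because each object declares the features it depends on, supporting a new product variation means defining a new profile rather than authoring a new version of the document.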

It’s worth noting (again) that the three Vs apply mainly to text and, to a lesser extent, to graphics.

Search and navigation

Information is valuable only if users can access it. For books, we have standard conventions: a table of contents at the beginning of the book, an index at the end, and page numbers for navigation. We also know that a book in English is read from left to right. Chapter title pages, headings, and captions for figures and tables are all instantly recognizable because we have been exposed to them since primary school.

For newer electronic information products, search and navigation are even more critical—“flipping through the book” is really not viable online—but the user experience is not yet unified. The behavior of an EPUB file depends on the capabilities of the reader in which it is being displayed.


Figure: Content displayed in the iBooks app on an iPad tablet

The same EPUB file renders differently on a NOOK reader.

Figure: Content displayed on a NOOK reader

Adobe Digital Editions provides yet another variation.


Figure: Content displayed in Adobe Digital Editions

For content creators, this introduces a lot of headaches. For example, consider an interactive, multimedia-rich ebook. In this context, what is the equivalent of a page number? What if the rendering and the page numbers are different on different devices?

In addition to navigational issues, there are search challenges.

How will users find the information that they need? Search provides a partial answer, but even the most carefully crafted search string may result in an overwhelming list of results. Search with filters (faceted search) and social search (results that are influenced by the searcher’s social network) can make results more manageable.


Figure: Faceted search (left) and social search (right)
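
To show how facets make a result list more manageable, here is a minimal Python sketch of faceted search over a tiny in-memory index. The documents, the facet names (`product`, `format`, `lang`), and the matching rule (every query word must appear in the title) are assumptions for illustration only. Real search engines use proper indexing and ranking, and social search is not covered here.

```python
from collections import Counter

# Hypothetical index entries: a title plus the facets a reader can filter on.
DOCS = [
    {"title": "Install the printer driver", "product": "printer", "format": "html", "lang": "en"},
    {"title": "Install the scanner driver", "product": "scanner", "format": "pdf", "lang": "en"},
    {"title": "Printer driver troubleshooting", "product": "printer", "format": "pdf", "lang": "en"},
    {"title": "Druckertreiber installieren", "product": "printer", "format": "html", "lang": "de"},
]


def search(docs, query, **facets):
    """Match every query word in the title, then narrow by the selected facets."""
    words = query.lower().split()
    hits = [d for d in docs if all(w in d["title"].lower() for w in words)]
    return [d for d in hits if all(d.get(k) == v for k, v in facets.items())]


def facet_counts(hits, facet):
    """Count the hits under each value of one facet, as shown beside the results."""
    return Counter(d[facet] for d in hits)


if __name__ == "__main__":
    hits = search(DOCS, "driver")
    print(len(hits), "hits;", dict(facet_counts(hits, "product")))
    for doc in search(DOCS, "driver", product="printer", format="pdf"):
        print(doc["title"])
```

The counts tell the reader how many results each filter would leave before they click it, which is what turns an overwhelming list into a short one.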

[1] In this context, “text” includes graphics and other content types.

