Content modeling


Content modeling is a critical portion of the implementation. In this phase, you identify your organization’s requirements, develop a taxonomy (classification system) that meets those requirements, and consider where metadata should be allowed or required.


Do not overlook the importance of metadata. Metadata is critical to making your content manageable—with or without a content management system.


The content model specifies how information is organized. In unstructured authoring tools, such as FrameMaker and Word, the content model is usually specified in a template and a style guide and is enforced by technical editors. In structured authoring tools, the content model is enforced by the software.
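To make "enforced by the software" concrete, here is a minimal sketch of a validator that checks a document against a content model. The `CONTENT_MODEL` rules and the element names are hypothetical and chosen only for illustration; real structured authoring tools typically rely on a DTD or schema rather than hand-written code like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: for each element, the children it requires and allows.
CONTENT_MODEL = {
    "topic": {"required": ["title"], "allowed": ["title", "shortdesc", "body"]},
    "body": {"required": [], "allowed": ["p", "note"]},
}


def validate(element, path=""):
    """Return a list of human-readable violations of CONTENT_MODEL."""
    here = f"{path}/{element.tag}"
    rules = CONTENT_MODEL.get(element.tag)
    children = list(element)
    errors = []
    if rules is None:
        # Elements without an entry are treated as text-only leaves.
        if children:
            errors.append(f"{here}: no child elements allowed")
        return errors
    child_tags = [child.tag for child in children]
    for required in rules["required"]:
        if required not in child_tags:
            errors.append(f"{here}: missing required <{required}>")
    for child in children:
        if child.tag not in rules["allowed"]:
            errors.append(f"{here}: <{child.tag}> is not allowed here")
        else:
            errors.extend(validate(child, here))
    return errors


good = ET.fromstring("<topic><title>Install</title><body><p>Run setup.</p></body></topic>")
bad = ET.fromstring("<topic><body><table/></body></topic>")

print(validate(good))  # no violations expected
for problem in validate(bad):
    print(problem)
```

In an unstructured workflow, a technical editor performs the equivalent of `validate` by hand against the template and style guide.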

As you define the content model, you must balance precision and simplicity. Defining with precision leads to large, complex content models, whereas keeping the content model as simple as possible makes it more usable. Other workflow components, especially content management systems or single-sourcing plans, may limit how you define your content model and what metadata you create. For example, using HTML5 as your content model is easy because everything is predefined, but it has limitations, such as no provisions for content reuse. A custom content model can match your requirements exactly, but building it is a lot of work. Fairly early in the project, you need to decide whether an existing content model (such as an XML standard) is an acceptable fit for your content or whether you need to create your own.

Content development for technical information is in the middle of a difficult transition from unstructured to structured content, generally based on the Extensible Markup Language (XML). XML offers significant advantages, such as the ability to enforce consistent structure and the ability to generate a variety of output formats automatically. However, establishing an XML-based content development environment is technically challenging and often expensive. Several XML standards are available that support technical content. The Darwin Information Typing Architecture (DITA) offers a framework for modular, topic-based content with heavy reuse, and is being adopted by many software companies. Some organizations build custom XML content models so that they can design a model to their exact specifications; others decide that DITA is good enough for their purposes.
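As a rough illustration of generating several output formats from one structured source, the sketch below renders the same small XML topic as HTML and as plain text. The sample topic and the functions `to_html` and `to_text` are assumptions made for this example. They are not part of DITA or of any publishing toolchain, which would normally use stylesheets and a build pipeline instead.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical single-source topic, loosely patterned on a topic-based model.
SOURCE = (
    "<topic><title>Reset the device</title>"
    "<body><p>Hold the power button.</p><p>Wait for the light.</p></body></topic>"
)


def to_html(topic):
    """Render the topic as a heading followed by HTML paragraphs."""
    parts = [f"<h1>{escape(topic.findtext('title', default=''))}</h1>"]
    parts += [f"<p>{escape(p.text or '')}</p>" for p in topic.iterfind("body/p")]
    return "\n".join(parts)


def to_text(topic):
    """Render the same topic as plain text with an underlined title."""
    title = topic.findtext("title", default="")
    lines = [title, "=" * len(title)]
    lines += [p.text or "" for p in topic.iterfind("body/p")]
    return "\n".join(lines)


topic = ET.fromstring(SOURCE)
print(to_html(topic))
print()
print(to_text(topic))
```

Because both renderers read the same consistent structure, adding another output format means adding another function, not editing the content.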

Factors to consider in evaluating whether unstructured or structured content (and within that, DITA) is more cost-effective for your organization include technical requirements, cultural fit, licensing costs, implementation effort, flexibility, and the size of the organization.


Scriptorium’s unofficial rule of thumb is that an organization has a business case for structured content (based on localization cost savings) if it has:

  • Ten or more writers
  • Four or more languages
  • More than 2,000 pages of content per language
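The rule of thumb above can be written as a quick check. This sketch assumes that all three thresholds must be met together, which the list implies but does not state outright; the function and parameter names are made up for illustration.

```python
def has_structured_content_business_case(writers, languages, pages_per_language):
    """Apply the rule of thumb, assuming all three thresholds must hold."""
    return (
        writers >= 10                  # ten or more writers
        and languages >= 4             # four or more languages
        and pages_per_language > 2000  # more than 2,000 pages per language
    )


print(has_structured_content_business_case(12, 5, 3500))
print(has_structured_content_business_case(6, 8, 10000))
```

Treat the result as a starting point for discussion, not a verdict. The rule is unofficial and is based only on localization savings.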

Be careful when using existing content as a starting point for your content model. The advantage is that you can build out a content model that supports everything you are currently doing. The disadvantage is that what you are currently doing may not be what you should be doing. Therefore, you need to take a hard look at the existing content and consider how well it meets your requirements. Also, think about how requirements might change in the future.

At the end of the content modeling phase, you will have a detailed document that describes the proposed content model and explains your decision to use (or not to use) a standard. You may want to include a flowchart or hierarchical tree diagram that explains the structure. The delivery medium is less important than the ideas conveyed.

Excerpt of a structure analysis flowchart (simplified model of DITA structure)
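If a drawn diagram is not practical, a hierarchical tree can also be produced as indented text. The sketch below prints a heavily simplified model loosely patterned on a DITA topic. The `MODEL` structure and the `render_tree` function are hypothetical, and the element list is far from the full DITA structure.

```python
# Hypothetical, heavily simplified model: (element name, list of child nodes).
MODEL = ("topic", [
    ("title", []),
    ("shortdesc", []),
    ("body", [
        ("p", []),
        ("ul", [
            ("li", []),
        ]),
    ]),
])


def render_tree(node, depth=0):
    """Return one line per element, indented two spaces per nesting level."""
    name, children = node
    lines = ["  " * depth + name]
    for child in children:
        lines.extend(render_tree(child, depth + 1))
    return lines


print("\n".join(render_tree(MODEL)))
```

A text outline like this is easy to paste into the content model document and to compare between revisions.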

After review and approval of the content model, you can begin to build out that content model in the authoring and publishing environment. Remember that content modeling changes get more and more expensive as you get farther into the project.

