Really Strategies Acquires SaaS XML Content Management Platform DocZone.com

Content management any way you want it with RSuite or DocZone

I am pleased to announce that Really Strategies has acquired SaaS XML content management platform DocZone.com. You can read the press release here. This is an exciting day for the team at Really Strategies for several reasons:

  1. DocZone.com is known for its DITA-based SaaS content management solution with a Fortune 500 client base.
  2. Together, our teams form the most experienced content management engineering team of any product vendor.

  3. We are now better positioned to serve publishers and technical publishers on a global basis.

  4. We now serve over 100 publishing companies, media companies, and technical publishing organizations. 

  5. DocZone and RSuite provide the market with a wealth of deployment options: SaaS, hosted, deployed, or build your own (using RSuite Engine). We feel this breadth of solution offerings is unique and is what differentiates us from the competition.

We are excited by the addition of DocZone.com and look to continue building on our past successes.

When your XML is my XML

A number of times through my years at Really Strategies I’ve been asked to give an “intro to XML” type of presentation to a non-technical audience (usually business people or editors).  I generally include a slide of some sort that says, “Your XML is not always my XML” to try to make the point that not all XML is the same.  Many who are not technical and new to XML don’t initially know this.  They think there is one XML format, not many flavors defined by many DTDs and schemas.

My slides usually contain some examples such as using <p> or <para> to indicate a paragraph.  Or illustrating how different standards such as NLM, NITF, and RSS capture the author of the piece.   Or pointing to three XML schemas for marking up recipe content.  Where you say <tomato> I say <tomato>.

This is where DITA comes in to help. With DITA, you may have your XML and I may have mine, but at the very least I should be able to understand the basics of your XML. I should be able to cut through your custom markup and still make some sense of your content. DITA specializations allow for extending the base model, but DITA-aware tools can still process specialized DITA content. Contrast that with the fact that if we modify NLM XML, it is no longer valid NLM XML.
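
As a rough illustration of how this works: every DITA element carries a class attribute (normally supplied as a default by the DTD or schema rather than typed by authors) that lists the element's ancestry back to a base DITA type. The sketch below assumes a hypothetical "recipe" specialization; the element names recipe, recipebody, and ingredient and the module name recipe are made up for illustration and are not part of any standard.

```xml
<!-- Hypothetical specialization: element and module names are illustrative only. -->
<!-- The class values are written out here; in practice the DTD or schema defaults them. -->
<recipe id="gazpacho" class="- topic/topic recipe/recipe ">
  <title class="- topic/title ">Gazpacho</title>
  <recipebody class="- topic/body recipe/recipebody ">
    <!-- A generic DITA-aware tool sees "topic/p" and can treat this as a paragraph -->
    <ingredient class="- topic/p recipe/ingredient ">Six ripe tomatoes</ingredient>
  </recipebody>
</recipe>
```

A tool that has never heard of ingredient can still fall back to the topic/p token and process the element as an ordinary paragraph, which is the sense in which I can understand the basics of your XML.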

And if the DITA For Publishers project takes off, we should be able to get to a time when there is a core set of domain-specific specializations, so at least journal publishers or magazine publishers or whoever will use a common set of specializations. There will always be publisher-specific specializations, but interchange (and sharing across domains) becomes easier.

Other standards won’t become irrelevant, but we should begin to think of them as delivery targets (which is how many of them were initially designed).  Delivering NewsML feeds to your aggregators. Submitting NLM content to PubMed.   Building an ePub format for eReaders.  But using DITA behind that offers greater flexibility in doing what you want to do with the content and using DITA-aware tools that can process any DITA content.

And if we achieve that vision, your XML can still be your XML, and my XML will be my XML, but underneath it all we will be able to more easily understand, share, interchange and process all of it.

DITA For Publishers: New Community Project

Publishers are starting to take DITA very seriously, and Really Strategies has been at the forefront of that trend: as champions of Publishing requirements on the DITA Technical Committee and within the larger DITA community, as practitioners developing solutions and approaches for applying DITA to Publishing business problems, and as tool developers creating software solutions that support the Publishing use of DITA.

Out of the work that we've done over the last couple of years we have developed a number of basic Publishing-specific DITA components that are completely generic. We also came to see that for Publishers to realize the maximum value from their use of DITA there would need to be a common starting point that Publishers could leverage, avoiding the need to re-invent things everyone needs. Eventually Publishers will need formal representation in the DITA standardization process, once there is sufficient Publishing community involvement.

To that end Really Strategies has sponsored the creation of a new community-based, open-source project: DITA For Publishers (dita4publishers.sourceforge.net).

The DITA For Publishers project is intended to provide a set of Publishing-specific DITA element types tailored to the task of representing typical Publishing documents, such as commercial fiction and non-fiction, magazines and other types of periodicals, travel and nature guides, and so on. These are documents that have fundamentally different content requirements and business processes compared to the technical documents to which DITA has traditionally been applied.

The DITA For Publishers project is still very new but it already provides some useful pieces any Publisher would need for a DITA-based XML system:

  • A Publishing-specific DITA map type: pubmap, designed to enable representation of all types of Publishing documents, including documents with arbitrary or idiosyncratic content organization
  • Basic Publishing-specific topic types for articles, book parts, book chapters, generic subsections, and sidebars. These topic types enable the natural and intuitive representation of most existing publications within a DITA context.
  • Publishing-specific support domain (mix-in element types) for representing the sort of arbitrary formatting requests that are an unavoidable reality of Publishing.
  • Basic extensions to the DITA Open Toolkit to support the Publishing-specific element types.
  • A new EPub-creating plugin to the DITA Open Toolkit that enables the creation of reader-ready electronic books from DITA-based content, with specific support for publication maps.
  • Sample Publishing-type documents that demonstrate how to use the DITA For Publishers element types. The first such sample is The Wonderful Wizard of Oz from Project Gutenberg, marked up as a pubmap and a set of chapter topics.
  • How-to information on how to apply DITA and DITA-based technology to common Publishing document types and business problems.

The materials are packaged for download and ready to be used with the latest versions of the DITA Open Toolkit and DITA-aware editors.

DITA For Publishers is a community project, which means it needs and depends on and welcomes involvement and contributions from the entire community of Publishers. Immediate needs for the project are:

  • Statements of requirements from Publishers: What information structuring challenges do you have that you would need or expect a DITA-based solution to solve?
  • Sample Publishing documents that can be used to test and demonstrate the DITA For Publishers specializations and supporting tools (see my free data conversion offer below).
  • Implementation support: there is always a need for programmers to contribute to the development of generic support components (transforms, etc.).

For Publishing document samples, my general offer is:

If you, the Publisher, will provide:

  • Electronic source and final form of one or more Publishing documents
  • An appropriate non-copyright license, such as a Creative Commons non-commercial license, for those documents as served by the DITA For Publishers project through the project's Web site (dita4publishers.sourceforge.net), so that the project can make the source and rendered forms freely available for at least non-commercial use.

I will provide:

  • Conversion of the content to DITA-based XML using the DITA For Publishers markup as appropriate
  • The resulting XML back to you, the Publisher, with your original copyright retained, for you to do with as you will.
  • Such renditions as I can produce with the tools at hand (e.g., HTML, EPub, PDF using XSL-FO)

In the unlikely event I get inundated with samples, I retain the right to cry "uncle".

Note that this is an offer of free data conversion at the small cost of providing a non-exclusive, non-commercial-use license for the content. The value to the DITA For Publishers project is the chance to develop both a larger body of illustrative examples and practical experience with representing Publishing documents.

DITA North America Trip Report

I recently attended the DITA North America conference in St. Petersburg, Florida. I was there to present a paper on our experience using the DITA Learning and Training module to develop an XML solution for test preparation publications.

As my focus for DITA, and that of Really Strategies, is how it applies to the needs of Publishers, I was pleased to see three different presentations (including my own) on using DITA for learning content.

Robin Sloan of PTC/Arbortext presented on how PTC uses DITA for their own training materials. What they did predated the development of the Learning and Training module and they contributed important design aspects to the module. Robin's main message was that they were able to implement a DITA-based process without too much difficulty and achieved significant savings and process improvements by doing so.

Patrick Quinlan and Alisha Carter of Citrix Systems presented on their experience moving from a DTP-based system for developing product training to a DITA-based system. They focused on the business drivers (saving process time and reducing translation costs) and the social aspects of the transition. They were able to implement the technology and meet their first-round time and cost saving goals. They talked in some detail about how they managed the rollout and the training of their training development team so as to both grow support from the ground up and avoid backlash from trying to do too much too fast.

In my paper I focused on how the use of DITA in general, and the Learning and Training module in particular, allowed me to develop an XML solution in much less time than a traditional solution would have taken. Because the system I was developing had to enable publishing of the test prep publications as they are, I had to work out ways to capture all the arbitrary formatting and unavoidable variance inherent in these sorts of publications. An interesting aspect of this project was that the client did not ask for a DITA solution; they asked for an XML solution, and I realized that a DITA-based solution was the shortest path to the best solution.

While only three talks out of three days of talks doesn't seem like much, it represents a significant increase in the amount of discussion being given to non-tech-doc applications of DITA and reflects what I'm seeing as an increasingly rapid adoption of DITA by Publishers of various sorts.

The new Learning and Training module has lots of obvious value for Publishers creating any sort of educational material but Publishers are starting to understand that DITA offers a lot of value as a base for any XML solution, whatever it is. For example, I am currently working with a professional association to develop a DITA-based solution for publishing magazines and books of all sorts. In this case, they specified the use of DITA to us, based on the sound recommendation of another consultant (the Rockley Group).

It would not surprise me if next year DITA North America includes, if not a Publishing track, at least one day focused on Publishing applications of DITA....

DITA Keyref Example: Links from Glossary Entries

DITA 1.2, currently in the final stages of development by the OASIS DITA Technical Committee, provides a number of important new features. Of the new features in DITA 1.2, it can be argued that the key reference (keyref) feature is the most important. The keyref feature provides the ability to do indirect, context-dependent linking, something that is required in any application that supports re-use of content across multiple publications.

Because keyref is so important and because it also has inherent, unavoidable complexity, I will be posting short examples of how keyref can be used to solve specific business problems. This is the first in an occasional series of such examples.

This example shows one particular application of the keyref feature to a real-world problem faced by one of Really Strategies' clients. The data and the business requirements are real.

This is a real-world example of using the new DITA 1.2 keyref feature to make existing content with topic-to-topic cross references reusable where, without keyref, it is not.

The scenario comes from real publications: test preparation manuals for primary education standardized tests. For this client, one business goal was to define an XML solution that produced, as much as possible, publications that reflected their current, pre-XML presentation practices.

Each test prep manual publication consists of a set of lessons, pointed to by topicrefs within a publication-specific map.

Each publication also includes a glossary, where each glossary entry contains a cross reference to the lesson or lessons in which the term is defined or explained. With DITA 1.1, these glossary entries use a normal <xref> element to point from the glossary entry to the topic for the appropriate lesson:
<glossentry id="action">
  <glossterm>action</glossterm>
  <glossdef>what a character does in a story
    (<xref href="../lessons/lesson_12.xml"/>)</glossdef>
</glossentry>

This works to the degree that the cross reference will resolve to topic lesson_12.xml but it can only ever resolve to lesson_12.xml. This lesson is specific to a particular test prep manual, e.g., Texas TAKS Language Arts Grade 7.

However, the business requirement is that the glossary entries be re-usable across different publications. With the cross reference, the topic cannot directly be re-used in different publications (different maps) because the cross reference is a direct topic-to-topic reference: regardless of the map that uses the glossary entry, it will always point to lesson_12.xml.

In DITA 1.1 the only DITA-defined options are:
  1. Have publication-specific glossary entries that conref the definition text and contain publication-specific cross references.
  2. Replace the cross references with reltable links.

Option 1, per-publication glossary entries with conref, requires many publication-specific glossary entry topics in addition to the base glossary entries and the conrefs themselves. It works, but it adds complication and duplicated data. It also requires explicit per-publication authoring of the cross references.
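
A minimal sketch of what Option 1 might look like, to show where the duplication comes from. The file name common/action.xml, the id action_TX, and the choice to wrap the shared text in a <ph> element are my assumptions for illustration, not the client's actual markup.

```xml
<!-- Base entry (hypothetical file common/action.xml): holds the re-usable definition text -->
<glossentry id="action">
  <glossterm>action</glossterm>
  <glossdef><ph id="def">what a character does in a story</ph></glossdef>
</glossentry>

<!-- Texas-specific entry (separate file): pulls in the shared text by conref
     and adds its own publication-specific cross reference -->
<glossentry id="action_TX">
  <glossterm>action</glossterm>
  <glossdef><ph conref="../common/action.xml#action/def"/>
    (<xref href="../lessons/lesson_12.xml"/>)</glossdef>
</glossentry>
```

Every additional publication needs another entry like the second one, for every term in the glossary.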

Option 2, reltable links, works, but either produces a presentation result that is not consistent with the pre-DITA publication practice (that is, the legacy presentation that we are attempting to replicate as closely as possible) or requires custom processing to make the presentation result look like a normal cross reference. The relationship tables eliminate the need for per-publication glossary entries, but do require separate per-publication authoring of the relationship tables, one or more rows for each glossary entry.
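
For comparison, a sketch of the kind of relationship table row Option 2 would require in each publication's map. The glossary path is a hypothetical placeholder, and the lesson path is assumed from the Texas example.

```xml
<reltable>
  <relrow>
    <!-- Source: the shared glossary entry (hypothetical path) -->
    <relcell>
      <topicref href="ELA/glossary/action.xml"/>
    </relcell>
    <!-- Target: the lesson that explains the term in this publication -->
    <relcell>
      <topicref href="ELA/TX_ELA_G7/lessons/lesson_12.xml"/>
    </relcell>
  </relrow>
  <!-- ...one or more rows like this for every glossary entry... -->
</reltable>
```

The link this generates is normally rendered as a related link rather than as an inline cross reference inside the definition, which is where the presentation mismatch comes from.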

With DITA 1.2, there is a third solution that avoids both the need for per-publication glossary entries and the need for per-map relationship tables: keyref.

The keyref feature provides an indirect addressing mechanism that allows a single reference to resolve to different concrete targets in different maps. Keys are defined on <topicref> elements. Linking elements, such as <xref>, refer to the keys using the new keyref= attribute rather than href=.

In the case of the glossary entries for the test prep manual, it is a business rule that every glossary entry have at least one corresponding lesson in which the glossary entry is defined or explained. This means that glossary entries can blindly point to the key for the lesson for the entry without knowing which lesson that will be in a given map. Within a publication-specific map, the topicref for each lesson simply defines the keys for those terms it defines or explains.

Using this approach, the glossary entry is modified to replace the href= on the <xref> element with a keyref=:
<glossentry id="action">
  <glossterm>action</glossterm>
  <glossdef>what a character does in a story
    (<xref keyref="action_Lesson"/>)</glossdef>
</glossentry>

Note that instead of pointing to a key like "lesson_12", the key names the term itself, "action_Lesson", reflecting the requirement that there must be a lesson for the term "action". In fact, you can take the uttering of the key as a demand that there be a lesson that defines "action." At a minimum, a keyref-aware processor will report any keys that can't be resolved, providing a built-in completeness check on the correlation of glossary entries to lessons.

Within the map for a given publication, the topicrefs to lesson topics are modified to add the keys for the terms they define or explain:
...
<topicref navtitle="Chapter 2. Literary Elements">
  <topicref
    navtitle="Character"
    href="ELA/TX_ELA_G7/lessons/lesson_13.xml"
    keys="action_Lesson"/>
  <topicref
    navtitle="Setting"
    href="ELA/TX_ELA_G7/lessons/lesson_14.xml"
    keys="clause_Lesson setting_Lesson"/>
  ...

Note that a given topicref can define any number of keys. In the case of the lessons, a given lesson might define or explain a number of different terms.

Because keys are defined within maps, a given key reference can resolve to different targets when resolved in the context of different maps.

For example, having created the Texas TAKS Language Arts Grade 7 test prep publication, we decided to create the Ohio Grade 8 Language Arts test prep publication. It's the same subject so most, if not all, of the glossary terms are appropriate, but the specific lessons will be different for this new publication.

By defining the same keys in the map for the Ohio publication, the same glossary entries can be re-used without modification in the new publication. The new map might look something like:
...
<topicref navtitle="Chapter 3. Literary Devices">
  <topicref
    navtitle="Plot"
    href="ELA/OH_ELA_G8/lessons/lesson_21.xml"
    keys="clause_Lesson"/>
  <topicref
    navtitle="Character"
    href="ELA/OH_ELA_G8/lessons/lesson_22.xml"
    keys="action_Lesson"/>
  <topicref
    navtitle="Setting"
    href="ELA/OH_ELA_G8/lessons/lesson_23.xml"
    keys="setting_Lesson"/>
  ...

Here, the same keys are defined, but linked to different lesson topics.

Now there is exactly one glossary entry for each term, with no need to have publication-specific versions of each glossary entry. The cross references are just normal cross references, so they require no special processing. Likewise, there is no need for relationship tables.
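
To make the re-use concrete, here is a minimal sketch of how the single glossary entry might be pulled into both publication maps. The glossary path ELA/glossary/action.xml and the "Glossary" grouping topicref are assumptions for illustration; the lesson paths and keys are taken from the map fragments above.

```xml
<!-- Texas map (fragment): binds action_Lesson to the Texas lesson -->
<topicref navtitle="Character"
  href="ELA/TX_ELA_G7/lessons/lesson_13.xml"
  keys="action_Lesson"/>
<topicref navtitle="Glossary">
  <!-- Hypothetical path to the one shared glossary entry -->
  <topicref href="ELA/glossary/action.xml"/>
</topicref>

<!-- Ohio map (fragment): same key, different lesson, same glossary file -->
<topicref navtitle="Character"
  href="ELA/OH_ELA_G8/lessons/lesson_22.xml"
  keys="action_Lesson"/>
<topicref navtitle="Glossary">
  <topicref href="ELA/glossary/action.xml"/>
</topicref>
```

The glossary file is byte-for-byte the same in both publications; only the key binding in each map differs.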

The author of the publication-specific map still has to figure out which terms are defined in which lessons in the publication and declare the keys appropriately, but that work has to be done regardless: the only variable is how the results of the analysis are captured in the DITA markup.

For this particular use case, the keyref feature provides the simplest solution in terms of data complexity, while also allowing the use of xref and so preserving the legacy practice this client needs.

DITA: It's Just XML

I've been talking for a couple of years now about the value of the DITA standard (http://dita.xml.org/standard) for publishers. I've also implemented several DITA-based applications for publishers in the last few years.

But there still seems to be a lot of confusion and misinformation about DITA, judging by comments I got at the recent RSuiteCMS User Conference. One person came up and said "DITA looks like it would really apply to our content but another vendor told us 'DITA is for TechDoc'", to which I replied, "Not at all, DITA is very likely a very good fit for your application."

In fact DITA is completely and compellingly applicable to almost any situation where XML is used for authoring and managing documents intended for consumption by humans, e.g., books, journals, reports, Web pages, etc.

But it can be hard to see that amid all of the stuff that gets said about DITA.

DITA is a sophisticated application architecture with lots of very useful features. People coming to DITA or promoting it, especially in the TechDoc world, tend to focus on the most sophisticated features because they're focusing on business problems for which those features are intended, such as managing large bodies of small re-used information modules across information for many products (for example, mobile phone manuals). That's cool stuff, but it's also pretty complex. It's no surprise that people see in-depth discussions of DITA maps and re-use strategies and localization best practice and say "hold the phone, I just want to get my traditional documents into XML I can understand--I don't need all this fancy stuff."

I'm here to say: you're probably right, you don't need all that whizbang stuff (today), but don't be so quick to reject DITA as a potential solution base.

If you ignore all of the features of DITA that get the technology guys like me excited, you start to see that DITA has two important aspects that tend to get overlooked:

  1. At its core DITA is very simple and can be easily applied to simple XML applications that just need to represent things like books and magazine articles.
  2. DITA's unique extensibility architecture makes it a much better business value than any comparable XML alternative.

There's lots to say about this, but the short version is that, because of unique features of the DITA architecture, it is both as easy as it could possibly be to develop custom DITA-based XML document types and as easy as it could possibly be to implement authoring and processing for documents using those document types. And the cost will tend to go down over time as more and more XML-aware products and tools become fully DITA-aware.

That is, DITA's unique features, in particular the "specialization" feature, have the effect of keeping both the initial implementation cost of a DITA-based solution and the ongoing, long-term cost as low as it can be for any XML application.

This is a significant benefit irrespective of any other technical benefit DITA might provide in terms of cool features: it means you can, for example, move to or experiment with an XML-based process with a minimum of initial implementation cost, simply because DITA makes it so easy to do. At the same time, even though you're using a standard XML application, you can still create markup that is as optimized for your specific documents and business processes as you care to make it, anywhere from not at all to every tag being specific to you, all within the framework of the DITA standard, without making interchange or processing harder or more expensive.

For example, you could easily define a DITA-based document type intended to represent entire books (or more likely, entire chapters) as single XML files. This is about the simplest way to usefully apply XML to publishing applications and is a pretty typical starting point. The DITA standard completely supports this straightforward use of DITA, even though it's not using DITA maps or any other of the features that people tend to focus on.
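
As a rough sketch of how little is needed, here is a hypothetical chapter held in a single file using nothing but base DITA topic markup. The ids, titles, and text are placeholders; a real solution would more likely use specialized element names such as a chapter type.

```xml
<!-- A hypothetical chapter in one XML file, using only base DITA topic markup -->
<topic id="chapter-01">
  <title>Chapter 1. An Example Chapter</title>
  <body>
    <p>Opening paragraphs of the chapter go here.</p>
  </body>
  <!-- Subsections are simply nested topics; no map is involved -->
  <topic id="chapter-01-section-01">
    <title>A Subsection</title>
    <body>
      <p>Subsection content goes here.</p>
    </body>
  </topic>
</topic>
```

Nothing in this file requires maps, conref, or keys, yet it remains ordinary DITA that those features can be applied to later.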

Having done this you would have exactly what you would have gotten had you created a document type from scratch or used a standard like DocBook or NLM as your base, except that you would have gotten a bunch of stuff essentially for free:

  • A lot of commercial tools would "just work" with your content with little or no additional configuration, regardless of how custom your tags are.
  • You spent a lot less on the initial implementation than you would have spent on any other way of getting to the same place (especially if you wanted fairly customized markup).
  • You are ready to start using any of the more-sophisticated DITA features you need at any time, as it makes business sense to do so, but not before.

In short, you will have an XML application that, on its surface, is just another XML application like any other, and if the fact that it happens to be DITA-based is not interesting or helpful, you don't need to mention it. It's just XML.

But if its being DITA is useful, then anyone who already understands DITA knows a whole lot about your XML application, just because it is DITA-based. And when you find that you do in fact have a compelling business requirement to make your content more modular or create new products simply by creating new maps over existing content, you're set to do it with minimum additional effort, without having to have stepped up to that level of complexity as a cost of entry.

As somebody who is both a technologist interested in doing cool things with information and pushing the state of the art and a service provider who wants to provide the most appropriate, highest-value solution to my clients (which means appropriate features for now at the lowest cost without sacrificing future flexibility or building in hidden costs), I find this aspect of DITA very exciting.

I see DITA as a way to make XML much more realistically accessible to enterprises, large and small, that have compelling business reasons to use XML but for whom the traditional (pre-DITA) cost was often prohibitive, or at least daunting. In a time when economic pressures are simultaneously requiring Publishers to innovate and squeezing the budgets used to deploy that innovation, I see DITA, simply in terms of its economy, as a powerful tool that Publishers can apply, even for what may seem to be simple problems.

If anyone says to you "DITA is too complicated for your needs" or "DITA is only for TechDoc" you tell them to talk to me, because they are misinformed.

DITA is just XML, plain and simple.

Preview of DITA Learning Specialization in Action

Almost exactly two years ago I posted here with enthusiasm for the idea of using DITA to produce topic-based learning objects. Well, yesterday I participated in an OASIS DITA Learning and Training Content Specialization Subcommittee conference call and caught an early glimpse of the group's work in action.

During the call John Hunt demonstrated DITA to SCORM content and manifest publishing using the Open Toolkit. After publishing, he packaged the content and loaded it into the SCORM conformance test suite without errors. Next he showed functional Learning Management System navigation, discussed best practices for integrating relational topic links, and for a finale demonstrated dynamic sequencing based on pre-assessment test results. Nice! For those who have not been keeping track, support for the DITA Learning and Training Content Specialization will be included in the forthcoming DITA 1.2 release.

So while last time around I could only speculate on the natural fit between DITA and SCORM (I am currently fond of saying they go together like peanut butter and chocolate), now I can quote a colleague with confidence and say "it just works!"

DITA For Publishing: DITA Project Gutenberg Samples

As a side effect of the new DITA2InDesign project, I have started converting more or less random publications from Project Gutenberg into DITA, both to provide some non-trivial, non-technical-document samples in DITA and to demonstrate different approaches to using specific DITA features for specific kinds of content.

The source for the samples is in the DITA2InDesign source code repository on SourceForge. The HTML and PDF renderings from the DITA XML source are served from the DITA2InDesign project Web site: DITA Project Gutenberg Samples. These have been rendered using the out-of-the-box DITA Open Toolkit HTML and PDF2 processors (although the PDF2 processor has been customized to use different fonts from the default Arial).

Once the DITA2InDesign process is working, these documents will serve as test cases for that process as well: they are representative, in size and content characteristics, of what modern publications of similar types would be like when managed as DITA-based XML content.

All the Project Gutenberg documents are either in the public domain or were donated by the copyright owners to Project Gutenberg. If anyone reading this post has a publication that they think would be an interesting candidate for DITA representation, and would be willing to donate the source to the DITA2InDesign project for non-commercial use (that is, the donor can retain the copyright and impose any derivative use restrictions they want as long as the material is licensed for viewing and non-commercial use in its DITA form) then I will happily convert the document to DITA. As for the Gutenberg samples, I can't promise an optimal conversion but I can promise a complete and correct conversion. [Note what I'm offering here: essentially free consulting for the price of giving away access rights (but not ownership) to one publication. Of course, this offer is on a first-come, first-served, time-available, while-supplies-last basis.]

Some things that could be done fairly easily with these DITA documents but that are not currently provided for in off-the-shelf tools include:

  • Generating eBooks in various standard and proprietary formats (OEBPS, Sony Reader, Mobipocket, etc.)
  • Generating digital talking books in NIMAS format
  • Generating Web deliverables tailored for mobile delivery
  • Generating a Wiki-style interactive Web site from the DITA source

In addition, this source is all ripe for additional metadata classification. For example, the entries in the Encyclopaedia Britannica sample should all have explicit subject keywords as part of the topics' metadata.
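
A minimal sketch of what such classification could look like using the standard DITA prolog. The topic id, title, and keyword values are placeholders, not taken from the actual sample.

```xml
<topic id="entry-example">
  <title>Example Entry</title>
  <prolog>
    <metadata>
      <!-- Explicit subject keywords for this entry (placeholder values) -->
      <keywords>
        <keyword>example subject one</keyword>
        <keyword>example subject two</keyword>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <p>Entry text goes here.</p>
  </body>
</topic>
```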

The DITA Project Gutenberg samples have the same unrestricted use licenses as the original data on the Project Gutenberg site, so feel free to use these samples for whatever you want. In particular, these make useful test and demonstration data sets for DITA-aware products.

Enjoy.

Call for Participation: DITA 2 InDesign Plug-In

Really Strategies is supporting the creation of an open-source, community-developed DITA-to-InDesign plug-in for the DITA Open Toolkit. We are donating a small amount of existing code (some early XML-to-InDesign transform experiments) and development effort over the weeks and months to come, as well as our existing expertise and experience with both DITA processing and getting arbitrary XML into InDesign.

The project is managed on SourceForge as the DITA2InDesign project.

There's nothing much there now: we're just getting started with development and are actively soliciting contributions from others in the community. See the project's Web site for details on the project and how you can help us move it forward.

Our goal with this project is to help make it easier for Publishers, in particular, to take immediate advantage of DITA, or at least experiment with it with a minimum of up-front effort, by fostering the creation of a print production tool chain that uses tools both familiar to Publishers and capable of meeting Publishers' typographic and composition requirements.

With DITA today you can create printed output using the XSL-FO-based plug-in. That plug-in is adequate for technical documents and, with a little effort, you can customize and extend it to reflect corporate branding and specific page layouts.

However, the inherent limitations in the XSL-FO standard and its available free and commercial implementations make it incapable of producing the more sophisticated layouts required by most commercial publications and more heavily-designed technical documents. Thus the need for something like the DITA2InDesign plug-in.

The goal is for the DITA2InDesign plug-in to help bridge the gap and make it as easy as possible to use InDesign with DITA-based content.

NOTE: While the plug-in will go a long way toward automating the layout of DITA-based content with InDesign, it won't be able to do everything. There will always be a class of documents that require more automated layout sophistication than the plug-in could hope to provide. For those documents, the Typefi product offers a very attractive solution. Typefi provides very sophisticated automation features for rendering XML content into InDesign layouts. While one doesn't exist today, it should be fairly easy to create a generic DITA-to-Typefi "CXML" process that would allow you to use existing Typefi-based InDesign layouts with any DITA-based content.

Live DITA Application: FASB U.S. GAAP Codification

The work of all accountants doing commercial accounting in the U.S. is governed by the Generally Accepted Accounting Principles (GAAP), created and maintained by the Financial Accounting Standards Board, a member-supported organization mandated by the U.S. Congress.

Historically the GAAP has been created as a mishmash of different documents and supporting interpretation and commentary. There was no single organizing schema or source. In short, it was essentially impossible to determine whether or not you had found everything relevant to a given accounting issue.

To address this problem, the FASB decided to create a new all-encompassing classification taxonomy for the GAAP and codify all existing GAAP standards under this taxonomy. This project has been going on for over four years and has resulted in the Accounting Standards Codification, or ASC. The ASC content is currently undergoing an extended period of public review and is available through the FASB ASC Web site: http://asc.fasb.org/home.

While the ASC taxonomy itself was a major achievement, the codification activity was a daunting editorial process in which all the existing standards content had to be re-authored in a new form that directly reflects the taxonomy. To support this activity the FASB decided to use an XML-based system, which should come as no surprise.

But beyond that, the FASB realized several important things:

  • The GAAP content is highly modular
  • The GAAP content can be organized in many different useful ways depending on how it is being used:
    • By subject
    • By industry
    • By business process
    • By what's of immediate interest to a particular person researching a problem or set of problems.
  • The GAAP content requires rich metadata to enable accurate search and retrieval as well as binding to the new ASC taxonomy
  • Licensees of the content will want the XML source and will want to be able to use it with as little effort and expense as possible
  • The FASB does not have huge budgets for XML application development and implementation yet needs non-trivial systems for authoring and managing the GAAP content through its editorial processes as well as for delivery through the authoritative FASB Web site.

Given the foregoing, the FASB realized that a more traditional XML application, while possible, would not necessarily be optimal, would likely be prohibitively expensive, and would not meet licensees' requirements for ease of use of the XML content.

However, a DITA-based application would satisfy all these requirements. David Prather at FASB realized that the GAAP content could be modeled quite handily using DITA with some GAAP-specific specializations.
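To make the idea of specialization a little more concrete, here is a minimal Python sketch of how a generic tool can recognize specialized elements. In DITA, every element carries a class attribute that lists its ancestry from the base type to the specialized type, so a processor that has never seen the specialized names can fall back to the base ones. The element names below (gaapSection, gaapBody, gaapPara) and the "gaap" module name are invented for illustration and are not the FASB's actual markup. In real documents the class values are normally supplied as defaults by the DTD or schema; they are written out explicitly here so the sample parses on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical specialized topic. The gaap* names are assumptions made up for
# this sketch; only the "- topic/..." base tokens come from standard DITA.
SAMPLE = """\
<gaapSection class="- topic/topic gaap/gaapSection ">
  <title class="- topic/title ">Overview</title>
  <gaapBody class="- topic/body gaap/gaapBody ">
    <gaapPara class="- topic/p gaap/gaapPara ">Some paragraph text.</gaapPara>
  </gaapBody>
</gaapSection>
"""


def base_type(element):
    """Return the most general 'module/name' token of a DITA class value."""
    tokens = element.get("class", "").split()
    # tokens[0] is the "-" or "+" prefix; tokens[1] is the base type.
    return tokens[1] if len(tokens) > 1 else None


def generalize(element):
    """Rename every element in place to its base DITA element name."""
    for el in element.iter():
        base = base_type(el)
        if base:
            el.tag = base.split("/", 1)[1]
    return element


root = generalize(ET.fromstring(SAMPLE))
tags = [el.tag for el in root.iter()]
print(tags)
```

After generalization the sample is expressed entirely in base topic markup, which is roughly why off-the-shelf DITA tools can do something useful with content they were never configured for.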

David worked out a clever way to use DITA maps to manage the organization and packaging of the codified GAAP content and hired me to design and implement the necessary GAAP-specific specializations, as well as to convert the data from the XML format they had used for the initial codification editorial work. The FASB selected Ovitas to implement a new editorial support CMS as well as the dynamic delivery system used to serve the ASC content through the FASB Web site.
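As a rough illustration of why maps suit content that needs to be organized in several ways at once, the Python sketch below builds two small DITA maps that point at the same topic files. This is not the FASB's actual map design, which is not described here. The map titles and file names are hypothetical, and only the standard map, title, and topicref elements with the href attribute are used.

```python
import xml.etree.ElementTree as ET


def build_map(title, hrefs):
    """Build a bare-bones DITA map that references the given topic files."""
    ditamap = ET.Element("map")
    ET.SubElement(ditamap, "title").text = title
    for href in hrefs:
        ET.SubElement(ditamap, "topicref", href=href)
    return ditamap


def referenced_topics(ditamap):
    """List the href of every topicref in the map, in document order."""
    return [ref.get("href") for ref in ditamap.iter("topicref")]


# Hypothetical topic files: the same content organized two different ways.
by_subject = build_map("By subject", ["revenue.dita", "leases.dita", "inventory.dita"])
by_industry = build_map("Airlines", ["leases.dita", "revenue.dita"])

# Topics that appear in both organizations without being copied.
shared = sorted(set(referenced_topics(by_subject)) & set(referenced_topics(by_industry)))

print(ET.tostring(by_industry, encoding="unicode"))
print(shared)
```

The topics themselves are never duplicated. Each map is just another view over the same modular content, so adding a new organization means adding a new map.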

The project went remarkably quickly--we had working DITA specializations defined and in place in a matter of weeks and the models required only minor refinement as the system implementation progressed, mostly stemming from new understandings of the underlying content as the codification editorial process approached completion. The CMS and Web site implementation went equally smoothly (remarkably so in my experience building such systems).

Because we could use the free DITA Open Toolkit to generate HTML sufficient for internal review of the codified content, we didn't need to invest any time or money in acquiring or building rendering support just for internal Q/A of the DITA content, a significant savings. Essentially, it allowed one part-time consultant, me, to do what would in the past have required a team of three or four consultants months of work to implement. By the same token, we were able to use the off-the-shelf DITA support in XML editors like Arbortext Editor and OxygenXML, removing the need to invest in document-type-specific editor configurations and customizations, again saving weeks or months of consultant time. I think I spent about two days coming up to speed on how to configure Arbortext Editor to work with specialized DITA document types and about half a day creating the necessary configurations (it's essentially a copy-and-modify process that I can now do in minutes).

Likewise, the Toolkit means that licensees can do *something* with the ASC content immediately, as well as giving them a solid base from which to develop whatever internal processes they need. Large publishers with existing XML infrastructure can of course apply that, but smaller publishers with little or no XML infrastructure can still take immediate advantage of the ASC XML source.

The content on the FASB ASC Web site is served dynamically from a slightly sanitized version of the DITA source--it is not static HTML pages generated from the DITA source.

The FASB ASC application is a working example of how the unique features of DITA XML applications significantly lower the cost of building this type of system while enabling significant value for the DITA-based content itself.

One interesting side effect of this system is that most, if not all, of the FASB's licensees, which include all the big-name publishers and many smaller ones, will end up with both DITA-supporting internal systems and internal DITA expertise that can then be quickly and easily applied to any other DITA-based content, regardless of its markup details or subject domain. That seems pretty interesting to me....

About this Blog

This blog is produced by the consultants and analysts from Really Strategies, a content solutions and services provider.