DITA For Publishing: DITA Project Gutenberg Samples
As a side effect of the new DITA2InDesign project, I have started converting more or less random publications from Project Gutenberg into DITA, both to provide some non-trivial, non-technical-document samples in DITA and to demonstrate different approaches to using specific DITA features for specific kinds of content.
The source for the samples is in the DITA2InDesign source code repository on SourceForge. The HTML and PDF renderings from the DITA XML source are served from the DITA2InDesign project Web site: DITA Project Gutenberg Samples. These have been rendered using the out-of-the-box DITA Open Toolkit HTML and PDF2 processors (although the PDF2 processor has been customized to use different fonts from the default Arial).
Once the DITA2InDesign process is working, these documents will serve as test cases for that process as well: they are representative, in size and content characteristics, of what modern publications of similar types would look like when managed as DITA-based XML content.
All the Project Gutenberg documents are either in the public domain or were donated by the copyright owners to Project Gutenberg. If anyone reading this post has a publication that they think would be an interesting candidate for DITA representation, and would be willing to donate the source to the DITA2InDesign project for non-commercial use (that is, the donor can retain the copyright and impose any derivative use restrictions they want as long as the material is licensed for viewing and non-commercial use in its DITA form), then I will happily convert the document to DITA. As with the Gutenberg samples, I can't promise an optimal conversion, but I can promise a complete and correct one. [Note what I'm offering here: essentially free consulting for the price of giving away access rights (but not ownership) to one publication. Of course, this offer is on a first-come, first-served, time-available, while-supplies-last basis.]
Some things that could be done fairly easily with these DITA documents but that are not currently provided for in off-the-shelf tools include:
- Generating eBooks in various standard and proprietary formats (OEBPS, Sony Reader, Mobipocket, etc.)
- Generating digital talking books in NIMAS format
- Generating Web deliverables tailored for mobile delivery
- Generating a Wiki-style interactive Web site from the DITA source
In addition, this source is all ripe for additional metadata classification. For example, the entries in the Encyclopaedia Britannica sample should all have explicit subject keywords as part of the topics' metadata.
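As a rough illustration of what that classification could look like, here is a minimal Python sketch that adds subject keywords to a topic's prolog using the standard DITA `prolog/metadata/keywords/keyword` structure. The function name `add_subject_keywords`, the sample topic, and the keyword values are all invented for illustration and are not taken from the actual samples. The sketch assumes a generic `topic` with a `body` element, does not enforce the full DITA content-model ordering inside an existing `prolog` or `metadata`, and drops any DOCTYPE declaration when it reserializes, so treat it as a starting point rather than a finished tool.

```python
import xml.etree.ElementTree as ET


def add_subject_keywords(topic_xml: str, keywords: list[str]) -> str:
    """Return topic_xml with each keyword present under prolog/metadata/keywords.

    Assumption: a generic <topic> whose body element is named <body>.
    """
    topic = ET.fromstring(topic_xml)

    prolog = topic.find("prolog")
    if prolog is None:
        prolog = ET.Element("prolog")
        # The prolog belongs before the body, so insert it there;
        # if there is no body, append it at the end.
        children = list(topic)
        index = next(
            (i for i, child in enumerate(children) if child.tag == "body"),
            len(children),
        )
        topic.insert(index, prolog)

    metadata = prolog.find("metadata")
    if metadata is None:
        metadata = ET.SubElement(prolog, "metadata")

    keyword_list = metadata.find("keywords")
    if keyword_list is None:
        keyword_list = ET.SubElement(metadata, "keywords")

    # Skip keywords that are already present so the function can be re-run.
    existing = {kw.text for kw in keyword_list.findall("keyword")}
    for word in keywords:
        if word not in existing:
            ET.SubElement(keyword_list, "keyword").text = word
            existing.add(word)

    return ET.tostring(topic, encoding="unicode")


# Hypothetical encyclopedia-style entry, not copied from the real samples.
SAMPLE_TOPIC = (
    '<topic id="sample-entry">'
    "<title>Sample entry</title>"
    "<body><p>Entry text goes here.</p></body>"
    "</topic>"
)

if __name__ == "__main__":
    print(add_subject_keywords(SAMPLE_TOPIC, ["mathematics", "instruments"]))
```

Any output from a script like this should still be validated against the DITA DTDs or schemas before it is treated as a conforming topic.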
The DITA Project Gutenberg samples have the same unrestricted use licenses as the original data on the Project Gutenberg site, so feel free to use these samples for whatever you want. In particular, these make useful test and demonstration data sets for DITA-aware products.
Enjoy.

