Most publishing companies have one of those folks on staff who is intimate with the content: someone who knows all the images that were used in a previous edition, or which drug monographs couldn't fit into the printed product in time for publication. I used to be one of those people. Ask me about the ghost words embedded in Taber's Cyclopedic Dictionary, 18th edition.* Even with a photographic memory, today's proliferation of content makes this skill nearly impossible. I also like to bring up the lottery-scenario risk: "what happens if Jim in Production wins the lottery and all that knowledge leaves your organization?"
To effectively manage content, organizations need a handle on what they have. Publishers using a Word document system simply can't be agile in today's environment. Think about a document sitting on some file server, with all its attendant assets—images, charts, chapters and paragraphs—buried within it, and the only way to know what content is in there is for someone in your organization to remember that it’s there.
Without enriching your assets with metadata and storing them in a repository that allows you to search and find content relating to a specific topic—say, tennis elbow or the Higgs boson—you could be duplicating work recreating assets you already own, wasting time searching for those assets, and missing huge revenue opportunities to sell content granularly as a custom bundle or a focused derivative e-product.
At this year’s MarkLogic World conference, Nature Publishing Group (an RSuite CMS customer) explained how they support what I would call ‘virtual journals’. There are very specific segments of the scientific world that could not possibly justify the creation of a full-blown journal. But then you realize: ‘Hey, we have this very large repository of existing journals, with some articles across all of them that appeal to this market, and if we gather these articles up from all these other journals, we’ve got enough content to be of interest to this marketplace.’ Suddenly you have the option to create an online-only product (for example), with very low internal costs, that is of specific interest to a niche market that previously was too small to be worth going after. It’s a long-tail concept, but without applying metadata consistently and systematically, this simply couldn't happen.
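The virtual-journal idea hinges entirely on consistent subject metadata. As a minimal sketch of the mechanics (the article records, field names, and threshold below are hypothetical illustrations, not NPG's or RSuite's actual data model), assembling a niche bundle is just a metadata query across journals:

```python
# Hypothetical article records; in a real repository these would come from
# a metadata-aware search against the CMS, not an in-memory list.
articles = [
    {"id": "a1", "journal": "Journal A", "subjects": ["proteomics"]},
    {"id": "a2", "journal": "Journal B", "subjects": ["proteomics", "oncology"]},
    {"id": "a3", "journal": "Journal C", "subjects": ["oncology"]},
]

def assemble_virtual_journal(articles, subject, minimum=2):
    """Collect articles tagged with a niche subject across all journals.

    Returns the candidate bundle only if there is enough content to
    justify a product for that niche market; otherwise returns None.
    """
    bundle = [a for a in articles if subject in a["subjects"]]
    return bundle if len(bundle) >= minimum else None

bundle = assemble_virtual_journal(articles, "proteomics")
```

The point is not the code, which is trivial, but the precondition: without the `subjects` tags applied consistently across every journal, there is no query to run and no bundle to sell.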
Metadata isn't magic and it really isn't all that complicated—you need the proper tools, workflow, and people in your organization. And once you have that set up, the fun begins—new product development, automated distribution to new licensing channels, multi-channel output.
Download our latest white paper and learn how publishers are increasing revenue with strategic content management, including metadata enrichment. The free white paper includes two case studies from Human Kinetics Publishers and Elsevier Health Sciences.
*While I no longer work at that publishing company, I won't ever tell!
At the 2012 RSuite User Conference, Lisa Bos, CTO at RSI Content Solutions, will present "How Metadata Management is the Key Component of Findability:"
The origin of most online customer experiences is search—it’s a natural and productive first step. In publishing, making content findable and visible means managing metadata. If you don’t have metadata on your content, you won’t find it efficiently. This presentation details how metadata management is the foundation of new product development, content discovery and re-use, and content enrichment. Lisa references use cases from customers who harness RSuite’s metadata management tools as the key to unlocking content from siloed departments and repositories.
RSuite is a CMS for publishers that manages the entire content life cycle.
The RSuite User Conference is a forum for publishing and media professionals interested in learning how strategic content management and an XML-based workflow help deliver content in any format, to any channel, at any time.
You are invited to join hundreds of publishing professionals for this 1-day event.
Thanks to the 2012 sponsors!
RSuite is exhibiting at booth #34 from June 1st through the 3rd at The Society for Scholarly Publishing (#SSP)!
Schedule your time with us today and see how publishers have done the following things with RSuite:
- reduced book production time-to-market by 8 weeks
- automated aggregation and distribution of journal articles to licensing clients
- increased website traffic by more than 35%
- and much more
Tweet about us at #SSP using the #RSuite hashtag.
I've been working in content management for more than ten years, and thinking back over that time I realized that the dream I naïvely espoused in the late 90s, of a true, standards-based, central repository for all of an organization's assets, still hasn't become a reality except in the most narrow of applications. When I used to write and teach XML classes, I was sure that open markup standards were going to revolutionize the way we created and managed assets. Around 2003 I started to become a bit disillusioned with my vision of content utopia. By 2008 I had all but thrown in the towel. Despite herculean efforts, content kept worming its way into proprietary, tactical-level production systems and often was never seen or heard from again, a victim of the "fire and forget" publishing approaches common before the rise of the Internet.
Fortunately, just as I had resigned myself to living in a world of content silos, new strategic ways of managing content started to emerge that rekindled my ideals. The idea is more modest than the grandiose vision of pure standards I once embraced, but it offers a new, more practical approach that can survive in the real world.
Rather than insist that every asset be centralized in a consistent, preferably open, format, practicality may dictate that we instead work to build a centralized asset repository that shares common representations for all assets. The actual bits and bytes making up the asset (Word documents, InDesign files, photos, videos, etc.) can still be developed and stored in traditional systems where applicable, but a new system takes on the responsibility of cataloging relevant features and details about the asset in a centralized repository. So instead of insisting that every asset be physically managed in a central repository, we make the much more modest, and realistic, demand that all assets expose relevant, common data and metadata in a consistent format through a centralized system. This distinction means that rather than try to replace the tactical systems we use to create, manage, or distribute content, we develop a parallel, complementary content management strategy that reflects the data in those systems and presents a common, consistent view of each asset regardless of type.
So an image file may exist as a TIFF or PSD in a production system or on some hard drive somewhere, but the centralized repository maintains a record for that image with all of its relevant metadata and a standard rendition readily accessible to any system (e.g., PNG or JPG in thumbnail and applicable preview formats). For a lot of applications, these centralized, lighter-weight representations of content are enough to create new products without returning to the source systems. For example, if I want to rapidly re-use images or stories on a new microsite, I don't have to track down all of the content in its silos; instead I rely on these common representations to collect the assets together and send them into my Web CMS for the new microsite. Formats, conversions, and so forth can be provided to the central system either through traditional manual conversion or, preferably, through automated mechanisms built into existing content workflows.
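A sketch may make the shape of such a record concrete. This is a hypothetical, minimal model of my own devising (the field names, URI schemes, and example values are illustrative, not any particular CMS's schema): the original binary stays in its home system, and the central repository holds a pointer, the metadata, and the lightweight renditions.

```python
from dataclasses import dataclass, field

@dataclass
class AssetRecord:
    """Central-repository record for an asset that physically lives elsewhere.

    The original bits (e.g. a TIFF in a production system) stay where they
    are; the repository keeps a pointer to them, plus metadata and
    commonly readable renditions (thumbnail, preview) any system can use.
    """
    asset_id: str
    source_uri: str       # pointer back to the original file or system
    source_format: str    # e.g. "TIFF", "PSD", "INDD"
    metadata: dict = field(default_factory=dict)
    renditions: dict = field(default_factory=dict)  # rendition name -> URI

record = AssetRecord(
    asset_id="img-001",
    source_uri="file://production/photos/elbow.tif",
    source_format="TIFF",
    metadata={"subject": "tennis elbow", "rights": "owned"},
    renditions={"thumbnail": "repo://img-001/thumb.png",
                "preview": "repo://img-001/preview.jpg"},
)
```

The design choice worth noting is that the record is useful even when the original system is unreachable: the microsite example above only needs the metadata and the renditions, never the TIFF itself.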
This sort of approach was attempted using search technologies at one time, but search lacked the depth of content management required not just to find an asset but to use and transform it. It gave us the ability to view content, but no tools to do anything once we saw it. Search remains important, but a real central repository needs usable representations of content that can be managed, transformed, and distributed as assets in their own right. This requires a full content management system.
So my new vision of a centralized asset repository is not the be-all, end-all "do everything" system that becomes impossible to design and build; it's a "do some things" central system that maintains consistent, common formats that can be readily transformed and transmitted, and that becomes an organization's strategic content reserve. It can quickly answer questions like "what assets do we have about Egypt?" and serve as a baseline for those assets, so that once found they can be used in our various tactical systems.
To build such a thing, consistent representations are needed. For data standards, we of course start with XML. When only a binary will do, we maintain accurate pointers to the original assets and create appropriate renditions of the binaries for uses like the central repository's user interface. Even if some re-work is required, the assets are at least already under active management.
The RSuite Content Management System happens to be a great foundation for building shared, managed, centralized repositories of content. The system is flexible, built on a native XML database, with a metadata model that can not only leverage existing metadata but also be extended in arbitrary ways to adapt to evolving requirements. It is built on open standards and is a good corporate citizen, ready to interoperate with existing systems. The native XML database and pointer-management features ensure that consistent representations are available. This approach creates a solid foundation for a strategic, centralized asset repository.
Part of my role as Product Manager for Really Strategies will be to focus on the ways our existing clients have adopted XML-based content management. I'll report their successes in building these content repositories here on the blog.
Does your organization have a vision for managing content strategically? It’d be great to hear how others are working to address this challenge.
This Salon interview with Geoffrey Nunberg about Google Books' unfortunate use of metadata is fascinating as an illustration of why a publisher implementing a CMS should focus as much (maybe more) on metadata as on anything else. Bad metadata leads to all sorts of problems, and unfortunately it's a self-reinforcing problem - bad leads to worse as users repeat mistakes, act on inaccurate search results, and ultimately come to distrust the system. By "focus on metadata" I mean publishers implementing CMS should take care in:
- modeling metadata
- creating controlled lists and taxonomies
- designing automated and manual tools for assigning metadata
- developing automated validation tools to ensure quality
- developing search that leverages metadata
- designing user interfaces that make metadata easily visible in various contexts (browse, edit, search results, ...) to encourage consistent usage and metadata correction/entry whenever it's convenient for the user
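The validation item on that list is the cheapest to automate. As a minimal sketch (the vocabulary and function below are hypothetical; a real taxonomy would be far larger and likely hierarchical, e.g. SKOS-based, but the gatekeeping idea is the same), a check against a controlled list catches the drift that produces Google Books-style metadata rot:

```python
# A tiny controlled vocabulary standing in for a real taxonomy.
CONTROLLED_SUBJECTS = {"sports medicine", "orthopedics", "cardiology"}

def validate_subjects(assigned, vocabulary=CONTROLLED_SUBJECTS):
    """Return the assigned terms that are NOT in the controlled vocabulary.

    An empty result means the metadata passes this check; anything else
    should be flagged for correction before the record enters the system,
    rather than silently accepted and repeated by later users.
    """
    return sorted(set(assigned) - vocabulary)

errors = validate_subjects(["orthopedics", "tennis"])
```

Running this kind of check at ingest is one way to break the self-reinforcing cycle described above: bad terms get stopped at the door instead of propagating through search results.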
Here's Nunberg's original article in the Chronicle of Higher Education from August 2009 and a related blog post. This topic is obviously fascinating at face value as well - as it relates to the usefulness of Google Books for different usages by different users with different expectations. The comments to Nunberg's article/blog posts illustrate effectively that smart, well-intentioned people strongly disagree on the value of metadata or of particular types of metadata as compared to the benefits of "simply" making content available through fulltext search. This basic disagreement often shows up during design projects for RSuite CMS implementation. Leaders within a publisher need to reach agreement about which metadata will truly be of value internally and to readers and about which types of usage are most important to support. They also need to determine the cost/benefit ratio (metadata is often relatively expensive to do right). If they can't reach such agreements, then it's also unlikely they will consistently and usefully build and leverage tools for metadata in the first place - thus leading to a self-fulfilling prophecy on the part of the fulltext-instead-of-metadata advocates.
Of course, there's also a role here for the technology vendor like Really Strategies - we need to make it as easy as possible for publishers to take the steps on the bulleted list at the top of this post, so that the human effort required to make metadata really valuable is also really efficient.