This is number 3 in the Technology Predictions for 2009 on CMS Watch.
"With social computing coming to the fore, it's never been more obvious that everyone does not, and will never, categorize things in the same way. It doesn't even matter what's correct anymore (well, it does to me, but I'm not about to spend my days stopping people from tagging a map of Botswana with the word "Ohio.") While I'll never agree with David Weinberger's assertion that "everything is miscellaneous" (a taxonomist's least-favorite word), I will assert that the days of the traditional, definitive, and single-hierarchy taxonomy are long behind us.
Enter the varied and multi-faceted application of metadata, experienced as people would like to experience it. In the search world, Endeca popularized it, now it's a commodity. You should be able to get to information the way you want, which may be different from your colleague's approach. We still need controlled vocabularies. We still need to tag content. Text mining and auto-tagging software is gradually improving, and extracted terms can be applied as metadata. But that metadata needs to be a lot more fluid, cloud-like, and by no means fixed in a single hierarchy. And even if it doesn't make sense to you that that map of Botswana is tagged with the word "Ohio" -- it probably makes perfect sense to someone. One person's chaos is another person's perfect path to findability."
Opinions?
In my opinion, this definitely depends upon the context in which the user is browsing/searching content. Personalized metadata tagging is a way of tagging and identifying content that has already been discovered. However, unified controlled vocabularies are necessary for undiscovered content.
Another perspective is that taxonomies and ontologies provide value because of their relationships. Sure, a map of Botswana could be tagged with "Ohio" because it is relevant to a particular user, but that only provides discoverability through searches that use the word "Ohio". This is the difference between keyword tagging and concept tagging.
Ontologies allow more abstract searches. Take, for example, the case where the map of Botswana is tagged with the keyword "Botswana" or "Africa". Because it is of type "map" in an ontology, and the ontology relates places to one another, search terms such as "Kalahari Desert" and "Limpopo River" should also produce this result. You could also make the argument that search terms which match the culture of Botswana should also produce this result.
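To make that concrete, here is a minimal sketch of concept-based search, not a real ontology engine. The BROADER map, the document ids, and the expand and search functions are all hypothetical names invented for illustration, and the place relationships are simplified assumptions. The idea is that a search term is expanded to its broader concepts before matching, so a map tagged only with "Botswana" is still found by a search for "Kalahari Desert".

```python
# Hypothetical toy ontology: each concept lists the broader concepts it
# belongs to. All names and relationships are illustrative assumptions.
BROADER = {
    "Kalahari Desert": {"Botswana"},
    "Limpopo River": {"Botswana"},
    "Botswana": {"Africa"},
}

# Documents tagged with ontology concepts rather than free keywords.
DOCUMENTS = {
    "map-of-botswana": {"type": "map", "concepts": {"Botswana"}},
    "ohio-road-atlas": {"type": "map", "concepts": {"Ohio"}},
}


def expand(concept):
    """Return the concept plus every broader concept reachable from it."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(BROADER.get(current, ()))
    return seen


def search(term, doc_type=None):
    """Find documents tagged with the term or any of its broader concepts."""
    wanted = expand(term)
    return sorted(
        doc_id
        for doc_id, doc in DOCUMENTS.items()
        if doc["concepts"] & wanted
        and (doc_type is None or doc["type"] == doc_type)
    )


if __name__ == "__main__":
    print(search("Kalahari Desert"))
    print(search("Ohio", doc_type="map"))
```

A flat keyword index would miss the first query entirely, because the string "Kalahari Desert" never appears among the map's tags. The relationships in the ontology are what connect the two.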
It is not possible to allow flat metadata word tagging and then to layer in relationships, because the English language is riddled with multi-use words and context is of critical importance.
In my opinion, the multi-tiered relationships that taxonomies and ontologies allow users to leverage in search are difficult to replace. The biggest challenge, and the reason why publishers have sought alternatives, is building reliable and unified taxonomies that have enough topical coverage. There are several ways to address this, but it is clearly an ongoing challenge.
Posted by: Michael Puscar | December 29, 2008 at 12:18 PM
Some of the new tech might imply that the reliance on words only as the pointed end of search is being supplemented by searching on sounds or images.
One example is the app for an iPhone that allows "listening" to a song and being taken to a site with ? (the information about that song? the next place that the band is playing? the closest place to buy the record?)
The other example is the app that allows a photo to be taken of a product, then being taken to a website that ? (more details? cross sell? instructions for use?)
While sounds and images have always been a critical way that humans search in the physical world, it's only now that they can be computerized.
Is there work being done on an ontology of music and art?
Posted by: Michael Josefowicz | December 29, 2008 at 01:19 PM
Michael P - Not being an expert (some of my colleagues are, but not me), I would say that a mix of both is required - especially for professional content (physicians, lawyers, accountants, etc). It would be neat if individuals could maintain their own "portable" tag set, so that if Ohio is meaningful to me (related to a picture of Botswana) I can save that without confusing the larger searching public.
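One way to picture the "portable" tag set is a rough sketch like the following, assuming a single store that keeps shared tags and per-user tags apart. The TagStore class, its method names, and the user names are hypothetical and do not describe any existing product. Personal tags only widen the results for the person who owns them.

```python
from collections import defaultdict


class TagStore:
    """Toy store separating shared (controlled) tags from per-user tags."""

    def __init__(self):
        # shared: tag -> item ids visible to everyone
        self.shared = defaultdict(set)
        # personal: user -> tag -> item ids visible only to that user
        self.personal = defaultdict(lambda: defaultdict(set))

    def tag_shared(self, item, tag):
        self.shared[tag].add(item)

    def tag_personal(self, user, item, tag):
        self.personal[user][tag].add(item)

    def search(self, tag, user=None):
        """Shared matches, plus the caller's own personal matches if any."""
        results = set(self.shared.get(tag, ()))
        if user is not None and user in self.personal:
            results |= self.personal[user].get(tag, set())
        return sorted(results)


if __name__ == "__main__":
    store = TagStore()
    store.tag_shared("map-of-botswana", "Botswana")
    store.tag_personal("ann", "map-of-botswana", "Ohio")
    print(store.search("Ohio"))              # general public
    print(store.search("Ohio", user="ann"))  # the tag's owner
```

Keeping the two layers separate means my "Ohio" tag never shows up in anyone else's search, while the controlled vocabulary stays the same for everybody.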
Michael J - I love Shazam (the song recognition software that I have on my iPhone). I will see if anyone knows of anything being done with music and art ontologies. It would be so cool if we could actually store a visual or audio pattern as a tag and recognize it where it occurs.
Posted by: annvmichael | December 29, 2008 at 01:29 PM
I agree that simple-minded applications of taxonomies are outmoded. In the same breath I'd argue that increasingly sophisticated tagging and text-mining technologies are best applied in parallel with taxonomies. I see the two as being good at different things.
Taxonomies are analytical tools for representing the content and (via additional processing) structure of a set of resources. Social tagging is good at providing cues to the temporal and contextual significance of a given resource. Automatic extraction of terms provides a hybrid of content and contextual information, providing a snapshot of a resource based on a vocabulary current at the time of authorship.
Taxonomies, particularly when applied by a subject matter expert, are excellent educational tools. Parallel implementations of taxonomy, thesauri and NLP tools can teach a vocabulary while also facilitating discovery -- an experience that will improve my current and future search results.
A taxonomy can also provide an invaluable bridge between languages. Although machine translation is improving, a translated taxonomy does a much better job of capturing domain-specific and nuanced terms. (For instance, see the National Agricultural Library Thesaurus (NALT), a robust and constantly updated -- and backward compatible -- vocabulary in English *and* Spanish.)
And, taxonomies can be an invaluable resource when building browse tools. While building a usable visualization of a large repository remains non-trivial, an applied taxonomy offers tested and tried structures to visualize.
On the topic of taxonomies for art, see Iconclass.
Posted by: Andrea Laue | December 29, 2008 at 04:05 PM
Thanks Andrea!
Posted by: annvmichael | December 29, 2008 at 08:41 PM
I agree that we need room for both user-supplied tags and more formal taxonomies/ontologies. Why? Because they provide different findability affordances.
We in Massachusetts have been working to consolidate the hundreds of standalone websites into a single "portal" - http://www.mass.gov: hundreds of thousands of web pages and download documents, with millions of users trying to find particular information using different approaches. Our primary topic-tree navigation is well used, most likely by people who may not have a precise idea of what they are looking for. Search is heavily used, and I would guess that this is heavily represented by people who are looking for something very specific, based on prior knowledge.
As our collection grows (and our budget doesn't), augmenting search with user-contributed keywords and phrases looks very attractive, as does mapping search terms and results to formal structures to enrich the search experience and perhaps to replace some of our manual navigation-building. The most opportunities open up when you embrace metadata from as many places as you can get it.
Posted by: Sarah Bourne | January 02, 2009 at 01:32 PM
Thanks Sarah -
You bring up a great point about scalability in the face of limited budgets.
Even if not faced with limited budgets, it seems an impossible task to internally retain the ability to tag the continuously and rapidly growing amount of information that needs it.
Happy New Year!
Ann
Posted by: annvmichael | January 02, 2009 at 01:36 PM