-=( In Between )=-

Scholarly Online Publishing, Open Access and Library Related Technology
  
 

Dynamic metadata

 
By ellermann at Tue, 2006-03-28 20:26 | metadata | search | statistics

The problem

Libraries use metadata to facilitate access to their holdings. Catalogues are built around metadata. The metadata are used to describe and identify the resources. Now that more and more documents are online, indexing techniques are used rather successfully to enhance the search and locating capabilities for information seekers.

But there is a problem here, a fundamental one. Metadata and indexes are tied to an individual document. What is not described are the contexts in which a document can be used meaningfully. If I am setting up a website for lay people about, say, nanotechnology, and I want to reuse as much as possible of the literature written for non-experts, how do I find a good set of documents? If the metadata included a field indicating the targeted readers, it would be easy, but it is rare to find a metadataset that includes such information. Sure, it could be added. But if, in another scenario, I wanted to find documents useful for an undergraduate course in information science, which articles published in online journals could I use? Again, the required level of expertise of the readers could be added to the metadataset, but that is rarely done. And what if I want to write and illustrate a handbook on some subject: where can I find a set of nice and suitable pictures?

You get the drift. Not all possible contexts of use can be encoded in a metadataset. For a start, practical reasons speak against it: just consider the immense effort needed to describe a document fully! But perhaps it is not possible even in theory, for would you be able to define beforehand all the possible contexts of use?

One reason we see such an explosion of new metadatasets is probably that whenever someone detects a new context of use for documents, they want to encode the relevant fields in a metadataset of their own. In the educational field especially we witness the birth of many new and complex metadatasets. After all, incomplete metadata means loss of information, for what can't be found barely exists.

There is something rotten in the state of metadataland.

As far as I can see, what is wrong is the notion that metadata are a one-shot deal. The metadata are made available at the same time as the document and are seen as static. If we adopt a more dynamic view, of metadata that can be changed in some methodical way, either by users or by professionals, then we'd have metadata that learn to adapt to users. Only a minimal set might be needed to start with, if we are willing to have it changed all the time.

Solutions

I know of three principal ways to avoid this dilemma.

Have users add metadata explicitly

If users are allowed to annotate documents, that is, to describe them in terms they deem relevant, these annotations can help other, similar users to find relevant materials more easily. Annotations, folksonomies, and tags (a new craze in weblogland too) can all be used here. Users might also add explicit links to other documents.
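
As a rough illustration of explicit user metadata, here is a minimal Python sketch of a tag store. The class name `TagStore`, its methods, and the document and user identifiers are all hypothetical, and a real system would persist the data rather than keep it in memory.

```python
from collections import defaultdict


class TagStore:
    """Hypothetical in-memory store for tags that users attach to documents."""

    def __init__(self):
        # document id -> tag -> set of users who applied that tag
        self._tags_by_doc = defaultdict(lambda: defaultdict(set))
        # tag -> set of document ids carrying it
        self._docs_by_tag = defaultdict(set)

    def add_tag(self, doc_id, user, tag):
        """Record that `user` described `doc_id` with `tag`."""
        tag = tag.strip().lower()  # normalise so that spelling variants in case merge
        self._tags_by_doc[doc_id][tag].add(user)
        self._docs_by_tag[tag].add(doc_id)

    def find(self, tag):
        """Return documents carrying `tag`, those tagged by most users first."""
        tag = tag.strip().lower()
        docs = self._docs_by_tag.get(tag, set())
        return sorted(docs, key=lambda d: (-len(self._tags_by_doc[d][tag]), d))


store = TagStore()
store.add_tag("doc-1", "anna", "For Lay Readers")
store.add_tag("doc-2", "anna", "for lay readers")
store.add_tag("doc-2", "ben", "for lay readers")
print(store.find("for lay readers"))
```

The point of the sketch is that the "targeted readers" field never has to be designed in advance: it appears once users start supplying it, and the number of users agreeing on a tag gives a crude ranking.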

Have users add metadata implicitly

When documents are retrieved, usage statistics can be gathered and stored in the metadata. This might even lead to statistics about the co-occurrence of documents, making it possible to suggest alternative, possibly relevant documents to the information seekers.
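
The co-occurrence idea can be sketched in a few lines of Python. This assumes that retrieval logs can be grouped into sessions, each a list of document ids retrieved by one user in one visit; the function names `cooccurrence` and `suggest` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(sessions):
    """Count how often each pair of documents is retrieved in the same session."""
    counts = Counter()
    for docs in sessions:
        # sorted(set(...)) removes repeats and gives each pair one canonical order
        for a, b in combinations(sorted(set(docs)), 2):
            counts[(a, b)] += 1
    return counts


def suggest(doc_id, counts, limit=3):
    """Return the documents most often retrieved together with `doc_id`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == doc_id:
            related[b] += n
        elif b == doc_id:
            related[a] += n
    ranked = sorted(related.items(), key=lambda item: (-item[1], item[0]))
    return [doc for doc, _ in ranked[:limit]]


sessions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "d"]]
counts = cooccurrence(sessions)
print(suggest("a", counts))
```

Raw counts favour documents that are popular in general; a real system would probably normalise them, but that refinement is left out here.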

Use automatic extraction of terms

This can be done from the documents as well as from the annotations. Incorporate the extracted terms, link them to a thesaurus, or an ontology, and so make a more semantically oriented search possible. User input makes this a dynamic process too.
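
A very small Python sketch of this step follows, assuming plain frequency counting as the extraction method and a simple lookup table standing in for the thesaurus. The names `STOPWORDS`, `THESAURUS` and `extract_terms`, and the entries in both tables, are hypothetical; real term extraction and ontology linking are considerably more involved.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "are", "on", "with"}

# Hypothetical thesaurus: variant term -> preferred term.
THESAURUS = {
    "nanotech": "nanotechnology",
    "nano-technology": "nanotechnology",
    "students": "undergraduates",
}


def extract_terms(text, limit=5):
    """Return the most frequent terms in `text`, mapped to preferred thesaurus terms."""
    words = re.findall(r"[a-z][a-z-]*", text.lower())
    counts = Counter(THESAURUS.get(w, w) for w in words if w not in STOPWORDS)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:limit]]


text = "Nanotech for students: an introduction to nano-technology."
print(extract_terms(text, limit=3))
```

The same function can be run over user annotations as well as over the documents themselves, so the extracted terms change as the annotations accumulate.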
  