-=( In Between )=- Scholarly Online Publishing, Open Access and Library Related Technology
Author: Henk Ellermann
library systems
It took me some time to realize, but what OAI-ORE really seems to be about is the boundary problem: the fact that groups of objects cannot be properly identified in the basic web architecture. OAI-ORE allows one to identify aggregates. On top of that, it offers a means to describe the relations between the aggregated objects. It allows one to define boundaries between groups of objects. Of course, any web page can contain a number of links to other pages and documents, but those links are not typed, meaning that it is hard to distinguish, say, navigational links from links to objects. OAI-ORE may not only be a way to solve the grouping problem (enhanced publications); it may give web archiving a great boost too. At present, relatively complex software like HTTrack or Heritrix is used to define relevant groups heuristically, but OAI-ORE's resource maps could provide a good hint at what should be archived. OAI-ORE also highlights an often underestimated problem. The transition from the normal library to a digital one needs to be based on descriptions of individual items (and not, as is common in the library field, on the expression or manifestation level), and these items need to be grouped. Whether OAI-ORE solves all problems remains to be seen. One of the things that may need reworking is the flexibility of OAI-ORE resource maps. As far as I can see now, all the possible relations between documents in an aggregate need to be predefined, and I am not sure that this is flexible enough.
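To make the boundary idea concrete, here is a minimal sketch in Python. It is not the real OAI-ORE serialization: the `ResourceMap` class, the `split_links` helper, the example.org URIs and the `ex:isSupplementTo` relation are all assumptions made up for illustration. Only the relation name `ore:aggregates` is borrowed from the ORE vocabulary. The sketch shows how an archiver could use a resource map to separate the links that belong to an aggregate from purely navigational ones.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceMap:
    """Toy stand-in for a resource map: a URI, the aggregate it describes,
    and a list of typed (subject, relation, object) statements."""
    uri: str
    aggregation: str
    triples: list = field(default_factory=list)

    def add(self, subject, relation, obj):
        self.triples.append((subject, relation, obj))

    def aggregated(self):
        # Everything the aggregate explicitly claims as a member.
        return [o for s, r, o in self.triples
                if s == self.aggregation and r == "ore:aggregates"]


def split_links(rmap, page_links):
    """Split links found on a page into members of the aggregate
    (worth archiving together) and everything else (navigation)."""
    inside = set(rmap.aggregated())
    to_archive = [u for u in page_links if u in inside]
    navigational = [u for u in page_links if u not in inside]
    return to_archive, navigational


# Hypothetical identifiers, for illustration only.
AGG = "https://example.org/aggregation/1"
ARTICLE = "https://example.org/files/article.pdf"
DATASET = "https://example.org/files/dataset.csv"

rmap = ResourceMap(uri="https://example.org/rem/1", aggregation=AGG)
rmap.add(AGG, "ore:aggregates", ARTICLE)
rmap.add(AGG, "ore:aggregates", DATASET)
# A typed relation between two members (the relation name is invented).
rmap.add(DATASET, "ex:isSupplementTo", ARTICLE)

page_links = [ARTICLE, "https://example.org/home",
              DATASET, "https://example.org/contact"]
to_archive, navigational = split_links(rmap, page_links)
print(to_archive)
print(navigational)
```

A crawler that only sees `page_links` has to guess which links matter; with the resource map the boundary is stated explicitly, which is the hint for web archiving described above.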
In a nice article called "Size isn't everything", Leslie Carr and Tim Brody tried to measure the success of a repository. What is refreshing about their approach is that they do not focus solely on repository size. A repository is successful when it is used. And use here refers not to the number of downloads or views, but to the number of uploads. When the staff use the repository on a regular basis to disseminate their work, the repository is successful, irrespective of its size. Irregular deposits arise, for instance, when a large number of items are uploaded on only a few days (batches). This probably means that the maintainers of a repository found a large set of documents. That is fine of course, but it is not in itself a sign of a healthy repository. Besides the regularity of uploads, the scope of the repository is an important variable too. If it is an institutional repository, the content should reflect contributions from all the disciplines. Measures for "health", "scope" and "size" can be combined, and this leads to a list of successful repositories, a list that can be extracted from this paper. On a personal note, it is good to see that our Groningen repository is considered to be a healthy repository and therefore belongs to the top 20 of successful repositories.
Comment
Obviously, it is very difficult to "design" objective and adequate measures. Carr and Brody make no hard claims here. More research is needed and further refinements have to be incorporated. What does surprise me a bit, though, is that repository usage (by reading users) is not taken into account. Even a large and healthy repository with a wide scope can be totally useless if no one comes to look at it. Uploads and downloads should both be considered. Also, in this paper, all items are equal. Whether the items in the repository consist of both metadata and documents, or just metadata, is not taken into account.
The question of whether there is added value in having an item in a repository is not considered either. If the item is already accessible elsewhere, or if an item consists only of metadata that is already available from other (re-)sources, does it have any real value? Incorporating these aspects was clearly not the goal Carr and Brody set for themselves. But an adequate measure of success should, in the end, take these things into account.
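The difference between a large repository and a regularly used one can be illustrated with a small Python sketch. This is not the measure Carr and Brody actually use; the function `deposit_profile`, its field names and the sample data are assumptions chosen only to show how batch uploads and steady deposits can be told apart.

```python
from collections import Counter
from datetime import date


def deposit_profile(deposit_dates, period_days):
    """Summarise deposits over a period of `period_days` days.

    size                : total number of deposited items
    active_day_share    : share of days with at least one deposit
    largest_batch_share : share of all items deposited on the busiest day
    """
    per_day = Counter(deposit_dates)
    total = sum(per_day.values())
    if total == 0:
        return {"size": 0, "active_day_share": 0.0, "largest_batch_share": 0.0}
    return {
        "size": total,
        "active_day_share": len(per_day) / period_days,
        "largest_batch_share": max(per_day.values()) / total,
    }


# Invented data: one repository filled by a single batch,
# one filled by a deposit every day of a 30-day month.
batch = [date(2007, 1, 5)] * 98 + [date(2007, 1, 12), date(2007, 1, 20)]
steady = [date(2007, 1, d) for d in range(1, 31)]

batch_profile = deposit_profile(batch, period_days=30)
steady_profile = deposit_profile(steady, period_days=30)
print(batch_profile)
print(steady_profile)
```

In this toy data the batch repository is more than three times larger, but almost all of its content arrived on one day, while the steady repository sees a deposit every day. A download count per item could be added in the same way to cover the reading side that the comment above asks for.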
By ellermann at Mon, 2007-08-13 14:42 | Library 2.0 | library systems
LibraryThing, the collective and virtual bookshelf, has recently been connected to a number of Dutch libraries and online booksellers (including Bol, Bruna, the KB, the KBR and the Rijksmuseum Research Library). Dutch users can now find Dutch titles much more easily than they used to. This increases the number of Dutch titles considerably and makes it easier to fill your online catalog. Needless to say, all the other features of LibraryThing become more useful for Dutch users too.
By ellermann at Wed, 2007-05-02 13:15 | Library 2.0 | library systems
The latest addition to social virtual worlds is SecondLife. In their least modest moments, its creators and users have described it as the sequel to the World Wide Web: Web 3.0, as it were. It is only natural then for a librarian to ask whether SecondLife (SL) has the potential to become ThirdLibrary. I have used SL for some time now and tried to pinpoint the features that could make it useful for the library. There are a few. It is, compared to the previous generations of social virtual environments, a rich environment. It is possible to create your own objects and surroundings, from chairs via buildings to complete cities or islands. In these environments one can, well.... chat. But there is a bit more. One can create objects of all kinds, including art, music streams or interactive objects, using the scripting language LSL. These objects can be sold "in world" or used as advertisements for outside goods. There is a solid foundation for an online economy and online copyrights, because a creator can determine whether an object can be copied or modified by others; the creator can also set a price before an object is transferred from one owner to another. The copyright enforcement is still strict and rigid and does not allow for the flexibility of, say, a Creative Commons license, which would be beneficial for the (re-)use of scholarly works in SL. Assuming for a moment that this world became so popular that most academics and students joined SL, these features are, to say the least, promising. Build an online library, populate it with books, reference librarians, search engines and links to sources in and outside of SL (that is, the web), and what do you have? A library that can be accessed by anyone, everywhere, and all the time. A dream? I am sorry to say that such dreams cannot be dreamed yet. First of all, SL is not as popular as some would have it.
At any given time there are currently around 30,000 users online (reaching 40,000 at peak hours these days). It goes without saying that that is not enough for a thriving academic community, not even if all users were academics (which God forbid, but that is another issue). But the misery doesn't stop there. SL is, to put it mildly, extremely bad at handling text. In SL itself the only useful way to present text on some self-made surface is by using pictures. That is like faxing a book, but without the convenience of a fax machine! Sure, there are note cards to present text to others, but the layout in those is basically the layout of an ASCII text, and note cards do not integrate into the environment. It is also not possible to create a text in SL by any means other than typing it. So much for self-designed interactive forms, and the like! A solution would be to allow certain objects to act like web clients, but there is no sign that this will happen soon. The interfacing with the web is also rather bad. There is one command in the scripting language that allows you to read web documents into SL, but besides being rather slow it has a mere 2K limit on the amount of data that can be transferred. There is also a function to start up websites in your browser, and there are a few rather clumsy commands that allow you to work with XML-RPC, but only, yes only, in SL! And even there it has its quirks. Taken together, these functions are simply too limited. And transferring pictures from the web to SL? Forget it, it can't be done without great, great effort. You can upload pictures from your disk, but for that you have to pay for each pic (OK, only a very small amount, but still!). The scripting language itself needs a redesign too. Although not without power, it is a very clumsy language. It has no support for multidimensional arrays, and data storage is limited to only a few KB per script. It is also extremely awkward to maintain a library of functions.
There are ways to circumvent these limitations, but you had better not have a day job when you do that. In short: SL shows great promise, but its promise has NOT been realized yet. Even worse, there is so much focus on performance, and perhaps on letting the user pay for the storage and CPU time, that essential functionality will probably not be implemented in the near future. And essential here means good interfacing with the web and decent handling of larger texts. Sometimes nice results are obtained (there are, for example, RSS readers in SL), but a library needs more and far better text handling capabilities. Libraries and SL are not a happy marriage yet, and I think that the ball is now in the court of the developers and creators of SL. They have done a great job, but have yet to take the necessary steps to make SL really useful for librarians and academics. "Chat only" is not enough. (Mind you, it is rather nice to sit by a crackling virtual campfire with people from all over the world, talking philosophy or "what great books have you read lately"...) :)
By ellermann at Sun, 2006-09-24 11:01 | library systems
It may be old news to most of you, but I am still catching up. While doing that I found a position paper by the DFG (Deutsche Forschungsgemeinschaft) on 'Scientific Library Services and Information Systems: Funding priorities through to 2015'. The document is available in both German and English. It is an important document, I believe. It seems to aim at giving libraries a 'cornerstone' position in the e-science of the near future. It makes it quite clear that an immense effort is needed to change the library. Cataloguing needs to change; connecting library systems to other systems is a new challenge; and the handling of metadata needs careful consideration. The document contains an 'action plan'. The objective is the implementation, in Germany, of an integrated '... digital environment for the provision of scientific information in all disciplines and subjects by 2015'. So far so good, or even great... The action plan itself is, I think, a bit disappointing. Sure, immense efforts are needed to digitize materials, repositories have to be networked, better metadata standards are needed, a German 'Cream of Science' would be nice, as well as toolboxes for electronic publishing - who could object? The reason to be just a bit disappointed, however, is that very little thought is given to 'architectural' issues. What is needed for the integration of information from a variety of systems? Also: the document (and I hope I am wrong here) breathes a 'top down' mentality: a national structure is needed, portals will be set up. Isn't it better to think deeply about how to formalize (and standardize) the notions of integration and then develop tools with which others (scientists, librarians) can make services that are of real interest to scholars and scientists? I am not saying that national efforts are useless, but they should be, I think, aimed at creating conditions for other people to develop services.
But in the action plan I see no provisions to set up boards that define and maintain protocols for information exchange and no intention to define, say, a minimal metadata standard for interoperability of library systems. I also see no plan to develop tools that can be used, by others, to build new services. Well, perhaps if they find out that we are all very interested?
I started working in the library scene in March 2002. Before that I only used libraries, occasionally. It's been a little over 4 years now, and what has happened, really? Well, not much. If I sound overly pessimistic, come and prove me wrong. Not that there has been a lack of activity. I have been at numerous meetings with people from all over the world. I have done my share in starting and running projects. I am now heading a digital library department, working as hard as I possibly can to be innovative in the library field (in between - which incidentally explains the name of this weblog - a disproportionate amount of "chores"). Forgive me, but every now and then I look back and ask myself what has actually been accomplished. The only fair answer: not that much! Sure, there is OpenURL and there is OAI-PMH, there are repositories, the two/three bright lights in the library landscape... but they were already there when I started! And while they are indispensable tools for any library of the future, they have only partly addressed the real issues. Right from the beginning it became apparent to me that there are only two themes worth working on, meaning that tackling them could really make the library scene "future proof". These themes are Open Access and the integration of the huge amounts of information we store for our users. Integration requires interoperability, and that requires that we uniquely identify the most important "objects" in bibliographic records. Authors, papers, institutions and possibly the users need to be ID-ed. In that particular sense, protocols for information exchange, search and linking are of secondary importance only. While a large number of new tools have appeared on the scene, like RSS feeds, semantic-web-like technologies and social software, not much has been done to redefine library information in local systems in order to make it globally useful.
The foundations that are so badly needed, Open Access (including Open Computation) and integration, are at best only very partially realized. Sure, there are attempts, but attention is not always focussed on what truly matters; too often it goes to what looks nice. Library 2.0 will not be there as long as there is no Open Access and no foundation for interoperability. So what might be the reason that so little has really happened? I think it has to do with the way libraries cooperate. If they do cooperate, it is in the form of projects, whether or not there is an umbrella organization that initiates those projects. The projects are very often "just" service oriented, meaning that a service is built using the existing infrastructure as a base. Also, the projects are often run under some ridiculously tight project management scheme. An alternative would be to have those umbrella organizations take a real look at what is going on in libraries, pick out what is relevant, and then support it - in whatever way the people who started the initiative in the first place see fit. What should be a matter of reward and reinforcement has become a matter of taking over control and demanding that others obey deadlines and other crap like that. There should be an idea of what IS relevant, especially of what might be more relevant than "quick wins" and services built on shaky foundations. There should be reinforcement and reward. There is hardly any real "drive" behind things, except for meeting a deadline. The library scene is simply not academic enough. Perhaps also, libraries are not quite the place to be. Maybe it is better to leave the problems to centres of innovation (in universities or in corporations). They might actually be able to set up a supportive and open structure that allows good people to do good work, and yes... LETS them work. Sorry, but once in a while I must wonder what it is all good for. I'll get back in shape, no sweat... well, perhaps when the weather cools down a bit! :)
By ellermann at Mon, 2006-05-08 18:32 | Library 2.0 | library systems
The digital library should strive towards an openly accessible, modifiable knowledge base where each object is identified, where changes to objects are rigidly described, and upon which an extensible set of similarity operators can be defined. Cough.... If I can steal a bit of your time, I'd like to add some sense to these words in the rest of this entry. It is an attempt at formulating a mission statement for those working on the digital library of the future. It seems hard to find, or formulate, a mission for the digital library. Many people have said very valuable things about it. Lynne Brindley's seven strategic choices are excellent from a managerial perspective, but they are not overly helpful for those who are actually putting their feet in the clay (the developers). Sure, users should be involved and all that, but for what? Others mention stuff like "integration" and "interoperability". Obviously, we have to make our systems interoperable, but should we really take existing systems as the "primitives" for our work? Web 2.0 is about making relatively small tools that empower (groups of) users and that are easy to handle for users, user groups and (amateur) developers alike. And again interoperability seems to be the key here. But "ease of use", "integration" and "interoperability" are just the means to accomplish... yes, to accomplish what exactly? How do we find our way amidst the roadmaps, the buzzwords, the ambitions, the many urges to innovate? What are the key concepts? I have been thinking a lot lately about a simple mission statement that could guide our daily work. For the time being the opening line of this entry is a very first and admittedly feeble attempt at that mission statement. Sadly, it is not a simple one. The general aim is simple enough, namely to provide everyone with functional access to scholarly and scientific information.
And "everyone" here includes machines and even non-academics. Well, there is one element at least. Open Access has to be a primary goal.
Ergo, we should see the existing literature not as a set of PDFs, but as a knowledge base, obviously without the top-down design procedures that usually accompany this concept. Licklider was, again, right when he wrote that gem, Libraries of the Future. Interoperability and integration just mean that all the different collections of scholarly information have to become one functional heap, or knowledge base. So what are the basic structuring elements in this heap? For one, each contribution to this heap should be retrievable with a uniform addressing scheme. Each object needs an active identifier, meaning that the identifier can be used to locate the object. So here we have another clue: make everything identifiable. Obviously, if only for safekeeping, there will be many copies of an object, but these exact copies, and only these, should be completely interchangeable (the rest has to be described in terms of provenance). Enter the FRBR discussion about items, manifestations, and the like. But identification alone is not sufficient. We also need to find what we don't know already. Or, perhaps better, each object should make itself findable (yes, another good read is Peter Morville's Ambient Findability: What We Find Changes Who We Become). In practice, and given the current state of technology, it means that we have to strive for storing documents in an XML format or else wrap them up in a set of programmatic interfaces, so as to make them "smart". Also, it means that each element in such a document is identifiable. Resolvers have to be built to turn identifiers into locations. The resolvers might point back to an online proxy of a physical document (i.e. metadata), so full digitization may not be required. The rest follows from what we think are important similarity operators (or relations).
For instance, if we consider documents (co-)written by one author as similar, we surely need to identify authors too, or else similarity will be a needlessly fuzzy relation. The same goes for documents written by authors working for a given institute. If we consider papers read by one and the same reader as similar, we need to identify the readers too. Identification is of the essence for many useful similarity measures. User behavior is only one way to group documents; another is by peer review or any other way of making quality judgements. For the latter, user identification is needed too. We could also try to find similarity by some algorithm. These algorithms, perhaps I should call them agents, work on the knowledge base (night and day) to find subsets that are relevant for a particular user. As soon as we start to tackle similarity, we need users. User-centered design is very important here. But to see the importance of identification, no user is needed: just architects of the digital library of the future. And so... back to the opening statement. I hope this made some sense; if not, tell me!
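The three ingredients of the opening statement (identified objects, resolvers, and an extensible set of similarity operators) can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Record` class, the `doc:`, `author:` and `reader:` identifier schemes, the example.org locations and the operator names are invented and do not refer to any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """An identified object plus the identified people related to it."""
    identifier: str
    author_ids: frozenset
    reader_ids: frozenset


# Resolver table: one identifier, possibly many interchangeable copies.
LOCATIONS = {
    "doc:1": ["https://repo-a.example.org/1.xml",
              "https://mirror.example.org/1.xml"],
    "doc:2": ["https://repo-b.example.org/2.xml"],
    "doc:3": ["https://repo-a.example.org/3.xml"],
}


def resolve(identifier):
    """Turn an identifier into zero or more locations."""
    return LOCATIONS.get(identifier, [])


# Registry that keeps the set of similarity operators extensible.
SIMILARITY_OPERATORS = {}


def similarity_operator(name):
    def register(fn):
        SIMILARITY_OPERATORS[name] = fn
        return fn
    return register


@similarity_operator("shared_author")
def shared_author(a, b):
    # Only a crisp relation because authors carry identifiers.
    return bool(a.author_ids & b.author_ids)


@similarity_operator("shared_reader")
def shared_reader(a, b):
    return bool(a.reader_ids & b.reader_ids)


def similar_to(record, records, operator):
    """Identifiers of all other records that `operator` relates to `record`."""
    op = SIMILARITY_OPERATORS[operator]
    return [r.identifier for r in records
            if r.identifier != record.identifier and op(record, r)]


r1 = Record("doc:1", frozenset({"author:a1"}), frozenset({"reader:u1"}))
r2 = Record("doc:2", frozenset({"author:a1", "author:a2"}), frozenset({"reader:u2"}))
r3 = Record("doc:3", frozenset({"author:a3"}), frozenset({"reader:u1"}))
records = [r1, r2, r3]

print(similar_to(r1, records, "shared_author"))
print(similar_to(r1, records, "shared_reader"))
print(resolve("doc:1"))
```

A new notion of similarity (same institute, same reviewer, the output of some agent) is added by registering one more function, without touching the records or the resolver. Without identifiers for authors and readers, the two operators above would have to fall back on fuzzy name matching.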
Library 2.0 is about a library that reaches out to people over the internet to offer tools to readers and writers to increase their knowledge. A book is a tool, a restaurant can be a tool too, same with a PDF or an RSS feed. Part of the Library 2.0 movement is about translating: translating old terms and concepts into new ones. Language, as we all know, is not neutral. It can have a huge impact on the way we structure our world. We used to talk about books and journals and lending them out, storing them, etc. Now we talk about information and about tools to manage information. That is a big step. The new language is not an exercise in vanity, the new language is necessary simply because ICT has changed the world. Yes the world has changed, so the language has to change, too. And the language then changes the world in turn, it goes both ways. What has altered, and what is captured in the new Library 2.0 speak, is the fact that the informational landscape caters for both people and information objects and allows for a very close interaction between them. Back in the old days there were buildings, books, and people. When a person entered a building, the building did not change (much). When a book was read by a person, the contents did not change. Objects and actions were "persistent" and kept their given "essence". These days, informational objects (like books and websites and E-journals) can change. People can annotate those objects, add to them and, as is the case with Wikis for example, really change them. Linking and editing lift the barriers between informational objects (and the people working with them). Of course, much of yesterday's consistency is still present in the present fluid world, but we have the option of ignoring it. That is where Library 2.0 speak enters the stage. It's an instrument to cope with the change and guide further developments. 
Social software to interact with people, collaboratories, and indeed, a set of very simple tools are a result of this new language coping with a new world. These tools can easily be adapted to certain needs; witness for instance the many uses of RSS feeds. And it sure makes sense to refer to them as being in a permanent beta stage. The old library systems excel in stability, reliability and clear functionality. The new systems - although it is much more appropriate to talk about tools - are adaptable and simple. A basic requirement for these tools is not their inherent stability, but that they operate in a well structured informational environment. The latter is still lacking, in part: the protocols and standards for information identification, exchange and storage are still an unstable set and not really fit to deal with the many interrelations that might exist between informational objects and online representations of people. The good thing about Library 2.0 speak is that it focuses us on what is still lacking. It is a focus on (the need for) interoperable tools, a focus that was lacking to a large degree in Library 1.0 speak. Library 2.0 speak has given us (an awareness of the alternative use of) XML, RDF, chat, complex objects, collaboratories, e-mail, repositories, OpenURL, RSS, and much more. Let's change our language, let's talk Library 2.0!
By ellermann at Mon, 2006-04-10 22:38 | library systems | repository
A new version of Fedora is now available. Their website says:
By ellermann at Sat, 2006-04-08 09:40 | library systems
Karen Calhoun has written a report with the title The Changing Nature of the Catalog and its Integration with Other Discovery Tools. It's about what to do with the catalog now that users make massive use of Google and other information services for search. Assuming that the catalog is here to stay, Karen Calhoun looks for strategies to revitalize it. The catalog, as she too acknowledges, is now a method to present online information about the content and the location of physical holdings. To me, but apparently not to Karen Calhoun, this implies that as long as there is still information off-web, the catalog will have a function, but that when all the information is on-web, it is pretty close to useless; when the full text is online there are other methods to manage the information. Why then should we revitalize the catalog? Calhoun works from the assumption that we should, but gives no good answer to this question. So, I'd like to propose a strategy that does not fit in with Calhoun's assumptions (although she mentions divestment as a theoretical option). I call it the strategy of a graceful exit. Its foundation is to limit the use of catalogs to presenting metadata about physical holdings. Indeed, no inclusion of digitally available material in the catalog. This gives the catalog a clear role and accepts the fact that for online material better search tools can be provided. Furthermore, as another part of the graceful exit strategy: open up the catalog to other information services. Allow other providers to freely use the catalog by adding information retrieval protocols (SRU, Z39.50, OAI-PMH). This strategy makes it possible for the information to spill over gradually from the catalog to other systems that have a more secure future than the catalog itself.
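As a rough idea of what "opening up" means for another provider, the Python sketch below builds an OAI-PMH harvesting request for a catalog. The base URL `https://catalog.example.org/oai` and the set name `physical-holdings` are hypothetical; `verb=ListRecords`, `metadataPrefix`, `from` and `set` are request parameters defined by the OAI-PMH protocol, and `oai_dc` is the Dublin Core format every OAI-PMH repository has to support. The sketch only constructs the URL and makes no network call.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    from_date limits the harvest to records changed since that date,
    which is what lets another system follow the catalog incrementally.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical catalog endpoint and set name, for illustration only.
url = list_records_url("https://catalog.example.org/oai",
                       from_date="2006-01-01",
                       set_spec="physical-holdings")
print(url)
```

An external service that repeats such a request with a moving `from` date gradually builds its own copy of the catalog's metadata, which is the spill-over the graceful exit strategy aims for.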