C5.28 Portal prototype functional for machine access to Metadatabase

ViTaL, the Virtual Taxonomic Library, is a component of the EDIT infrastructure. It is being created as part of WP5.3, which is concerned with the provision of bibliographic tools for the taxonomic community.

The services provided by ViTaL have been discussed in detail in previously published project documents. The diagram below has been extracted from document D5.15 ViTaL draft system design. It provides a useful overview of the system architecture.

This report documents the pilot launch of the software components making up the bibliographic aggregation and search application, known collectively as Falx, which is in development at the Natural History Museum (London). The pilot went live on 26 September 2008.

Functionality

For the purposes of the pilot, Falx harvests the bibliographies of EDIT Scratchpad sites using their BibTeX download facility. The harvest currently totals over 9,000 records.
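As an illustration of what handling a BibTeX download involves, the sketch below turns simple BibTeX entries into Python dictionaries. It is not the Falx harvester: the function name parse_bibtex, the dictionary keys entry_type and cite_key, and the sample entry are all invented for this example, and the parser only copes with brace-delimited, single-line field values without nested braces.

```python
import re

# One entry: @type{key, ...fields...} with the closing brace on its own line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# One field: name = {value}. Nested braces inside a value are NOT handled.
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{(.*?)\}")


def parse_bibtex(text):
    """Return a list of dicts, one per BibTeX entry found in `text`."""
    records = []
    for entry_type, cite_key, body in ENTRY_RE.findall(text):
        # "entry_type" and "cite_key" are hypothetical key names chosen so
        # they do not collide with real BibTeX fields such as "type".
        record = {"entry_type": entry_type.lower(), "cite_key": cite_key}
        for name, value in FIELD_RE.findall(body):
            record[name.lower()] = value.strip()
        records.append(record)
    return records


# Invented sample entry, for illustration only.
SAMPLE = """@article{example2008,
  author = {Example, A.},
  title = {Carbon storage in an invented example},
  year = {2008}
}
"""

if __name__ == "__main__":
    for record in parse_bibtex(SAMPLE):
        print(record)
```

A production harvester would more likely rely on a full BibTeX parser, since real-world files contain nested braces, quoted values and string macros that this sketch ignores.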

The bibliographic data is stored in a MySQL database (the meta-database). This content is then used to build a full-text searchable index using Apache Lucene, a Java search and indexing library. Finally, the index is exposed to the web for both human and machine use at the URL http://taxonlib.org/bibsearch.
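The idea behind a full-text index can be shown with a toy inverted index. The sketch below is written in Python and does not use Lucene or reproduce the Falx code; the records stand in for rows read from the meta-database and are invented, as are the function names. It maps each word to the records containing it and supports a trailing * wildcard of the kind used in the example searches in this report.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map every token to the set of record ids whose fields contain it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for value in record.values():
            for token in tokenize(str(value)):
                index[token].add(record_id)
    return index


def search_index(index, term):
    """Return matching record ids; a trailing '*' matches any token prefix."""
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        hits = set()
        for token, ids in index.items():
            if token.startswith(prefix):
                hits |= ids
        return hits
    return set(index.get(term, set()))


# Invented records standing in for rows of the meta-database.
RECORDS = {
    1: {"title": "Carbon storage in an invented example", "year": "2008"},
    2: {"title": "Carbonate sediments, another invented example", "year": "2007"},
}

if __name__ == "__main__":
    index = build_index(RECORDS)
    print(sorted(search_index(index, "carbon*")))
```

A real index adds much more on top of this, such as relevance ranking, per-field searching and on-disk storage, which is the reason for using an established library.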

Interfaces

Human access is via a web search page, which includes search instructions, links to further information, and the search form itself. Search results can be sorted by author name, publication name, title, year and source name.

The results of an example search can be seen here: http://taxonlib.org/bibsearch?term=carbon*

An OpenURL link is given for each item, which allows the user to look for full text content using the ViTaL OpenURL resolution service, based on Ex Libris SFX.
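To show roughly how such a link can be assembled, the sketch below builds an OpenURL-style query string from a bibliographic record. Several things here are assumptions: the resolver address is a placeholder, not the ViTaL resolver; the record keys (author_last, journal, start_page and so on) are invented; and the parameter names follow the commonly used OpenURL 0.1 key/value convention, which may differ from what Falx actually emits.

```python
from urllib.parse import urlencode

# Placeholder resolver address (assumption); substitute the real resolver URL.
RESOLVER_BASE = "http://resolver.example.org/sfx"


def openurl_for(record):
    """Build an OpenURL 0.1-style link from a record dict (hypothetical keys)."""
    fields = {
        "genre": "article",
        "aulast": record.get("author_last"),
        "atitle": record.get("title"),
        "title": record.get("journal"),
        "date": record.get("year"),
        "volume": record.get("volume"),
        "spage": record.get("start_page"),
    }
    # Leave out anything the record does not supply.
    query = {key: value for key, value in fields.items() if value}
    return RESOLVER_BASE + "?" + urlencode(query)


if __name__ == "__main__":
    print(openurl_for({"title": "Carbon storage", "year": "2008"}))
```

The resolver, not the search application, decides which full-text sources to offer, so the link only needs to carry enough metadata to identify the item.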

Machine access is achieved by adding a query string parameter to the search URL: format=json. For example: http://taxonlib.org/bibsearch?term=carbon*&r=2&n=all&format=json

JSON (JavaScript Object Notation) is a lightweight data-interchange format, and is the preferred format for data exchange with the EDIT CDM.
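A client could call the machine interface along the following lines. This is a minimal sketch using only the Python standard library. The base URL and the term, r, n and format parameters are taken from the example URL in this report; the function names are invented, the meaning of r and n is not described here so they are passed through unchanged, and the structure of the JSON response is not documented here so it is returned without interpretation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://taxonlib.org/bibsearch"


def build_search_url(term, fmt="json", **extra):
    """Return the search URL for `term`, asking for output in format `fmt`."""
    params = {"term": term, **extra, "format": fmt}
    # safe="*" keeps the wildcard unescaped, matching the example URL.
    return BASE_URL + "?" + urlencode(params, safe="*")


def fetch_results(term, **extra):
    """Fetch a search and decode the JSON body.

    The response layout is an unknown here, so the decoded object is
    returned as-is rather than unpacked into specific fields.
    """
    with urlopen(build_search_url(term, **extra), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Building the URL needs no network access.
    print(build_search_url("carbon*", r=2, n="all"))
```

Keeping URL construction separate from the network call makes the request easy to inspect or log before it is sent.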