
MARiMbA

MARiMbA is a command-line tool, designed with librarians in mind, to transform MARC (MAchine-Readable Cataloging) records to RDF, following Linked Data best practices [1][2][3].

The tool supports the whole mapping and transformation process from MARC metadata to RDFS/OWL vocabularies. It is a tool aimed at facilitating the Linked Data generation process and at allowing librarians to carry out the RDF generation without any technical support. In order to achieve this, MARiMbA has the following features:

  • The tool works with MARC authority and bibliographic formats.
  • All work is done using spreadsheets. There is no need to learn any additional mapping or transformation language (e.g. XSLT).
  • The tool analyses MARC input records in order to generate easy-to-use mapping templates. These templates are designed to facilitate decision-making, error discovery and the evaluation of the whole transformation process.
  • It allows the user to use any vocabulary formalized as RDFS/OWL.
  • It includes a minimal configuration file that allows the user to adjust some features of the process. However, the tool is preconfigured to be used out of the box, following the FRBR model (Functional Requirements for Bibliographic Records).
  • It includes a lightweight SPARQL server (Fuseki) that allows the user to perform queries against the generated data with no extra configuration or data loading.

MARiMbA has been successfully used to transform around 7 million MARC 21 records from the Spanish National Library, which produced around 60 million RDF triples. The resulting data are available via SPARQL at http://datos.bne.es/sparql. Additionally, an RDF resource example can be found at http://datos.bne.es/resource/XX1718747.
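As an illustration, the public endpoint can be queried over the standard SPARQL protocol, which accepts the query text in a `query` URL parameter. The following is a minimal Python sketch and is not part of MARiMbA: the helper name `build_sparql_url` and the sample query are assumptions made for this example, and it only builds the request URL without sending it.

```python
from urllib.parse import urlencode

ENDPOINT = "http://datos.bne.es/sparql"  # endpoint named in the text

# Generic query: list a few properties and values of the example resource.
QUERY = """
SELECT ?p ?o
WHERE { <http://datos.bne.es/resource/XX1718747> ?p ?o }
LIMIT 10
""".strip()

def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a SPARQL-protocol GET URL (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query})

url = build_sparql_url(ENDPOINT, QUERY)
print(url)
```

To run the query, the resulting URL could be opened with `urllib.request.urlopen`, for example; whether the endpoint is reachable and which result formats it returns are not verified here.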

How to use it?

You need:

  • MARC records (authority and/or bibliographic) in ISO 2709 format
  • Java 1.5 or newer on the path (check with java -version if you're not sure)
  • Spreadsheet editor (OpenOffice, LibreOffice, Microsoft Excel, etc.)
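If you are not sure whether a file is in ISO 2709 format, the record leader gives a quick hint: each record starts with a 24-character leader whose first five characters hold the record length as decimal digits, and each record ends with the record terminator byte 0x1D. A minimal Python sketch of such a check follows. The function name `looks_like_iso2709` and the sample bytes are hypothetical and are not part of MARiMbA; the check is a heuristic on the first record only.

```python
RECORD_TERMINATOR = b"\x1d"  # ISO 2709 record terminator
LEADER_LENGTH = 24           # fixed leader size in bytes

def looks_like_iso2709(data: bytes) -> bool:
    """Heuristic check on the first record of a byte string."""
    if len(data) < LEADER_LENGTH:
        return False
    length_field = data[:5]  # leader positions 00-04: record length
    if not length_field.isdigit():
        return False
    record_length = int(length_field)
    if record_length < LEADER_LENGTH or record_length > len(data):
        return False
    # The declared length should end exactly on a record terminator.
    return data[record_length - 1:record_length] == RECORD_TERMINATOR

# Hypothetical empty record: leader, empty directory (0x1E), terminator (0x1D).
sample = b"00026nam a2200025   4500" + b"\x1e\x1d"
print(looks_like_iso2709(sample))
```

On a real file, the same function could be applied to the bytes read in binary mode, e.g. `open(path, "rb").read()`.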

Steps:

  1. Save/move MARC files to the data folder. The files containing bibliographic records go to the data/bibliographic folder, authority ones go to data/authority. The tool transforms as many files as you like.
  2. Execute the following command to generate the mapping template spreadsheets: "marimba --generatemappings -a -b". This generates three spreadsheets: classificationMapping, annotationMapping and relationsMapping. It also creates an extra spreadsheet, called alias, that can be used to define short names for the URIs of RDF classes and properties. By default, the spreadsheets are placed in the mappings folder.
  3. Using the aforementioned spreadsheets, establish mappings between MARC metadata and RDF classes and properties. Each spreadsheet has a clearly defined function:
    • classificationMapping: it is used to assign a specific RDF class or type to the MARC records that present a certain combination of fields/subfields.
    • annotationMapping: it maps MARC subfields to RDF properties.
    • relationsMapping: it establishes a relationship (using an RDFS property or an OWL object property) between two resources that present a certain subfield variation.
  4. Save/move vocabulary files to the models folder. To do that, you just need to download the RDF files of the vocabularies you have used, or export them if you are using an ontology editor (e.g. NeOn Toolkit or Protégé).
  5. Run the following command to generate RDF: "marimba --generaterdf -a -b --writeresultado.rdf"
  6. You can inspect the results using SPARQL and the lightweight Fuseki server that MARiMbA includes by running: "run-marimba-server"

Go to http://localhost:3030/ and start running your SPARQL queries to see the transformed data.
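To make the role of the annotation mapping in step 3 more concrete, the idea of mapping MARC subfields to RDF properties can be sketched in a few lines of Python. This is only a conceptual illustration, not MARiMbA's implementation or its spreadsheet layout: the mapping dictionary, the sample record, the function `annotate`, the example.org subject URI and the choice of Dublin Core properties are all assumptions made for the example.

```python
# Hypothetical mapping: (MARC field, subfield) -> RDF property URI.
ANNOTATION_MAPPING = {
    ("245", "a"): "http://purl.org/dc/terms/title",
    ("260", "c"): "http://purl.org/dc/terms/date",
}

# Hypothetical record, already parsed into (field, subfield, value) tuples.
RECORD = [
    ("245", "a", "Don Quijote de la Mancha"),
    ("260", "c", "1605"),
    ("300", "a", "2 v."),  # no mapping defined, so it is skipped
]

def annotate(subject_uri, record, mapping):
    """Return N-Triples lines for every subfield that has a mapping."""
    lines = []
    for field, subfield, value in record:
        prop = mapping.get((field, subfield))
        if prop is None:
            continue
        # Escape backslashes and double quotes for the N-Triples literal.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_uri}> <{prop}> "{literal}" .')
    return lines

triples = annotate("http://example.org/resource/1", RECORD, ANNOTATION_MAPPING)
print("\n".join(triples))
```

In the tool itself this decision is recorded in the annotationMapping spreadsheet rather than in code; the dictionary above merely stands in for that table.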

Interested?

The tool will be available in early 2012. However, if you would like more information or would like to try it, please contact us.

 


Created under Creative Commons License - 2010 OEG.