MarcOnt/RDFTranslator

RDF Translator

RDF Transformations Made Easy


Goals

We needed something as powerful and as simple as XSLT for the [wiki:Articles/ECDL2005Demo MarcOnt Mediation Service]. We could not use:

  • TRIPLE (it was not working),
  • Sesame Inferencing (it lacked features such as regular expressions and string-processing functions),
  • FLORA-2 (integrating FLORA into our project was harder than writing our own tool).

So I decided to spend under 10 hours on it, and this implementation is the result :)

Architecture

Overview

The architecture is modeled on existing XSLT tools and kept as simple as possible. A transformation object is generated from an XML-based transformation definition. It takes an RDF model in N-TRIPLES format from a file and produces a Jena Model that can be serialized to RDF/XML, N3 and other formats.


XML Schema definition

Image:Xmlschema.png

How Does It Work?

A translation definition consists of rules. Each rule consists of a list of premises (triple-matching rules) and consequents (triple-generating templates). A premise consists of a subject, a predicate and an object, each defined as the URI of a resource or the text value of a literal to be matched, or as a regular expression over either.

Literal values can be typed with the rdf:datatype attribute and internationalized with the xml:lang attribute. It is assumed that xml:lang is used only when rdf:datatype is empty or equal to http://www.w3.org/2001/XMLSchema#string.

The string value defining a URI, a text value or a regular expression can be enriched with variables with

 {$variableName} 

syntax or function calls with

 {marcont:functionName(arg1,arg2)}

syntax.
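
As a rough illustration of the two placeholder forms above, the following sketch locates `{$variableName}` and `{marcont:functionName(...)}` placeholders with a regular expression and expands them. The class name `PlaceholderSketch`, the `expand` method and the callback used to evaluate functions are hypothetical and are not taken from RDFT; the actual parser may work quite differently.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RDFT: expands {$name} and {marcont:fn(args)} placeholders.
final class PlaceholderSketch {
    // Group 1: variable name; group 2: function name; group 3: raw argument list.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{\\$(\\w+)\\}|\\{marcont:(\\w+)\\(([^)]*)\\)\\}");

    static String expand(String template, Map<String, String> vars,
                         BiFunction<String, String, String> functions) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value;
            if (m.group(1) != null) {
                // Variable placeholder: look the name up in the current bindings.
                value = vars.get(m.group(1));
                if (value == null) {
                    throw new IllegalArgumentException("Unbound variable: $" + m.group(1));
                }
            } else {
                // Function placeholder: delegate to the caller-supplied evaluator.
                value = functions.apply(m.group(2), m.group(3));
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = Collections.singletonMap("id", "rec42");
        // The evaluator here only echoes the call; a real one would run the function.
        System.out.println(expand("http://example.org/{$id}", vars,
                                  (fn, a) -> fn + "(" + a + ")"));
    }
}
```

Passing the function evaluator as a callback keeps the sketch independent of how RDFT registers its functions, which the text above does not specify.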

Each premise that has been matched in the rule being processed generates three variables:

  $PSn
  $PPn
  $POn

where n is the position number of the premise in the rule definition. These variables can then be used in subsequent premises and consequents.

So far, the supported functions are:

  • marcont:generateId('namespace:'), which generates a resource URI within the given namespace;
  • marcont:clone($resource, 'namespace:'), which generates a resource URI within the given namespace, but with the ID extracted from the resource represented by the $resource variable.

Each time all premises are matched, a bunch of triples is generated from the consequent templates. Each template is constructed the same way as a premise, except that regular expressions are not allowed here.

Each consequent that has been generated in the rule being processed adds three variables for subsequent processing:

  $CSn
  $CPn
  $COn
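
The sketch below shows one way the consequent step could work: each template is filled in from the current bindings, and the generated triple is then exposed as $CSn, $CPn and $COn to the templates that follow. The class `ConsequentSketch`, the representation of triples as `String[3]` arrays and the numbering of consequents from 1 are assumptions made for illustration; function calls are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not RDFT code: instantiates consequent templates from bindings.
final class ConsequentSketch {
    // Templates and triples are String[3]: subject, predicate, object.
    static List<String[]> generate(List<String[]> templates, Map<String, String> bindings) {
        Map<String, String> vars = new HashMap<>(bindings);
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i < templates.size(); i++) {
            String[] tpl = templates.get(i);
            String[] triple = { fill(tpl[0], vars), fill(tpl[1], vars), fill(tpl[2], vars) };
            int n = i + 1; // assumption: consequents are numbered from 1
            vars.put("CS" + n, triple[0]);
            vars.put("CP" + n, triple[1]);
            vars.put("CO" + n, triple[2]);
            out.add(triple);
        }
        return out;
    }

    // Replaces every {$NAME} occurrence with the bound value (plain text, no regex).
    private static String fill(String template, Map<String, String> vars) {
        String s = template;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            s = s.replace("{$" + e.getKey() + "}", e.getValue());
        }
        return s;
    }

    public static void main(String[] args) {
        Map<String, String> bindings = new HashMap<>();
        bindings.put("PS1", "urn:a");
        bindings.put("PO1", "Bob");
        List<String[]> templates = Arrays.asList(
            new String[] { "{$PS1}", "foaf:name", "{$PO1}" },
            new String[] { "{$CS1}", "rdf:type", "foaf:Person" });
        for (String[] t : generate(templates, bindings)) {
            System.out.println(String.join(" ", t));
        }
    }
}
```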


If the premises matched successfully and rule calls are defined, the rule calls are invoked one by one until the last one returns. So far there is no validation of rule calls, so be careful not to create loops in these recursive calls. It is possible to pass variables to the rule being called. The values of these variables can be enriched with variables and functions, just like premise and consequent values.

After the triples are generated, the last premise is matched again. If it succeeds, another round of triple generation is performed. If it fails, the previous premise is matched again, and so on, until all possible combinations of premise matches are exhausted.
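
The matching order described above amounts to a depth-first search with backtracking over the premises. The following is a minimal sketch of that idea, not the RDFT implementation: `BacktrackingSketch`, the `String[3]` representation of triples and premises, and the numbering of premises from 1 are assumptions. Each premise position is treated as a regular expression in which `{$NAME}` is replaced by the quoted value bound earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch, not RDFT code: enumerates every combination of premise matches.
final class BacktrackingSketch {
    // Premises and triples are String[3]: subject, predicate, object.
    static List<Map<String, String>> matchAll(List<String[]> premises, List<String[]> triples) {
        List<Map<String, String>> results = new ArrayList<>();
        search(premises, triples, 0, new HashMap<>(), results);
        return results;
    }

    private static void search(List<String[]> premises, List<String[]> triples, int index,
                               Map<String, String> vars, List<Map<String, String>> results) {
        if (index == premises.size()) {
            // All premises matched: this is where consequents would be generated.
            results.add(new HashMap<>(vars));
            return;
        }
        String[] premise = premises.get(index);
        int n = index + 1; // assumption: premises are numbered from 1
        for (String[] t : triples) {
            if (fits(premise[0], t[0], vars) && fits(premise[1], t[1], vars)
                    && fits(premise[2], t[2], vars)) {
                Map<String, String> next = new HashMap<>(vars);
                next.put("PS" + n, t[0]);
                next.put("PP" + n, t[1]);
                next.put("PO" + n, t[2]);
                search(premises, triples, index + 1, next, results);
            }
            // Falling through to the next triple is the "match this premise again" step.
        }
    }

    // Substitutes bound variables (quoted, so they match literally), then applies the regex.
    private static boolean fits(String pattern, String value, Map<String, String> vars) {
        String regex = pattern;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            regex = regex.replace("{$" + e.getKey() + "}", Pattern.quote(e.getValue()));
        }
        return Pattern.matches(regex, value);
    }

    public static void main(String[] args) {
        List<String[]> triples = Arrays.asList(
            new String[] { "a", "knows", "b" },
            new String[] { "b", "name", "Bob" });
        List<String[]> premises = Arrays.asList(
            new String[] { ".*", "knows", ".*" },
            new String[] { "{$PO1}", "name", ".*" });
        System.out.println(matchAll(premises, triples));
    }
}
```

When the loop over triples for the last premise is exhausted, the recursion returns to the loop of the previous premise, which mirrors the "previous premise is matched again" behaviour in the text.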


Rule processing continues until a rule with the terminate attribute is reached. Then the process is complete.

Transformation Processing Flow

Image:Transformationflow.png

Installation

In general you will need the RDF Translator (RDFT) library and the libraries this package depends on:

   * antlr.jar            
   * concurrent.jar  
   * icu4j.jar              
   * jena.jar   
   * log4j-1.2.7.jar         
   * rdft.jar        
   * xml-apis.jar
   * commons-logging.jar  
   * jakarta-oro-2.0.5.jar  
   * junit.jar  
   * rdf-api-2001-01-19.jar  
   * xercesImpl.jar  
   * xpp3.jar

You can find all of them in the tar-gzipped archive of RDF Translator. See the example below for how to use this tool.

Example

transformation definition

For a sample transformation from the MARC-RDF ontology to the MarcOnt ontology see: [wiki:RDFTranslator/SampleTransformationRules Sample Translation Rules] or [wiki:RDFTranslator/SampleRDFRules RDF Rules] (when using rules in RDF format, make sure that the <rdf:RDF> tag is within the first 1 kB of the file).

input data

For a sample MARC-RDF based source model see: [wiki:RDFTranslator/RDFSource MARC-RDF Source]

execution

  /*
   * Code snippet from org.marcont.rdftranslator.Translate
   * (needs java.io.File and java.io.FileInputStream)
   */
  // --- load transformation description from file (args[0])
  FileInputStream fis = new FileInputStream(new File(args[0]));
  // --- create transformation
  Translation t = TranslationFactory.createTranslation(fis);
  // --- execute transformation on the source model loaded from file (args[1])
  t.execute(new FileInputStream(new File(args[1])));
  // --- print the source model, then the result model, as N-TRIPLES
  System.out.println(" ---- source -----");
  t.getSrcModel().write(System.out, "N-TRIPLE");
  System.out.println(" ---- source -----");
  System.out.println(" ---- results -----");
  t.getDestModel().write(System.out, "N-TRIPLE");
  System.out.println(" ---- results -----");


result of transformation

For a sample MarcOnt based result model see: MarcOnt Result


RDF Translator Predefined Functions

FN_CLONE

  clone($var, 'namespace:') - clones the resource whose URI is $var into the new 'namespace:'.
  The namespace of $var must be registered in Translation.namespaces.
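
To make the idea concrete, here is a small sketch of what cloning a URI into another namespace could look like. Everything in it is an assumption for illustration: the class `CloneSketch`, the `cloneInto` method, and the use of a prefix-to-URI map as a stand-in for Translation.namespaces, whose real type is not documented here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not RDFT code: moves a resource ID from one namespace to another.
final class CloneSketch {
    // namespaces maps a prefix such as "marc:" to its namespace URI (assumed layout).
    static String cloneInto(String uri, String targetPrefix, Map<String, String> namespaces) {
        String target = namespaces.get(targetPrefix);
        if (target == null) {
            throw new IllegalArgumentException("Unknown target namespace: " + targetPrefix);
        }
        // Pick the longest registered namespace that the URI starts with.
        String source = null;
        for (String ns : namespaces.values()) {
            if (uri.startsWith(ns) && (source == null || ns.length() > source.length())) {
                source = ns;
            }
        }
        if (source == null) {
            throw new IllegalArgumentException("Namespace of " + uri + " is not registered");
        }
        // Keep the local ID, swap the namespace.
        return target + uri.substring(source.length());
    }

    public static void main(String[] args) {
        Map<String, String> namespaces = new HashMap<>();
        namespaces.put("marc:", "http://example.org/marc#");
        namespaces.put("marcont:", "http://example.org/marcont#");
        System.out.println(cloneInto("http://example.org/marc#rec42", "marcont:", namespaces));
    }
}
```

The failure on an unregistered source namespace reflects the registration requirement stated above; how RDFT itself reports that case is not known.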

FN_GENERATEID

  generateId() - generates a random, number-based URI
  generateId('foaf:') - generates a URI in the foaf: namespace
  generateId('namespace:', Object ... args) - generates a random URI in the given namespace based on sha1sum(args).
   Args can be variables, but not regexps.
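
A hash-based identifier of this kind might be produced as in the sketch below. It is an illustration only: `GenerateIdSketch`, the way arguments are fed into SHA-1 (UTF-8 strings separated by a zero byte), the hexadecimal encoding, and the UUID fallback when no arguments are given are all assumptions, not the RDFT implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical sketch, not RDFT code: builds an ID from a namespace and a SHA-1 digest.
final class GenerateIdSketch {
    static String generateId(String namespace, Object... args) {
        if (args.length == 0) {
            // No arguments to hash: fall back to a random ID (assumed behaviour).
            return namespace + UUID.randomUUID();
        }
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            for (Object arg : args) {
                sha1.update(String.valueOf(arg).getBytes(StandardCharsets.UTF_8));
                sha1.update((byte) 0); // separator, so ("ab","c") and ("a","bc") differ
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest()) {
                hex.append(String.format("%02x", b));
            }
            return namespace + hex;
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(generateId("foaf:", "Bob", "urn:a"));
    }
}
```

With this approach the same arguments always produce the same ID, which is useful when one rule needs to refer to a resource created by another.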

FN_ITERATOR

   iterator('rdf:', $seq_ID) - generates numbered nodes, with a separately increasing number for each Seq collection
   
   output: e.g. 'rdf:_1'
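
One plausible way to get such numbering is a counter kept per sequence ID, as sketched here. The class `IteratorSketch` and its `next` method are hypothetical names, and the assumption that counting starts at 1 for every new sequence is based only on the 'rdf:_1' example above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not RDFT code: numbers members separately for each Seq.
final class IteratorSketch {
    private final Map<String, Integer> counters = new HashMap<>();

    // Returns namespace + "_" + k, where k counts calls made so far for this seqId.
    String next(String namespace, String seqId) {
        int k = counters.merge(seqId, 1, Integer::sum);
        return namespace + "_" + k;
    }

    public static void main(String[] args) {
        IteratorSketch it = new IteratorSketch();
        System.out.println(it.next("rdf:", "seqA"));
        System.out.println(it.next("rdf:", "seqA"));
        System.out.println(it.next("rdf:", "seqB"));
    }
}
```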

FN_BNODE

    output: generates a bnode
   


Open Issues

Mapping Tool Use Case Diagram

Image:Mapptooluc.jpg

Image:jira.jpg Go to [1] to submit your ideas or report bugs you have found.

Related pages

If you are interested in examples of data used by RDFTranslator you can visit these useful sites:


Corrib cluster project is supported by Enterprise Ireland under Grant No. ILP/05/203, Science Foundation Ireland under Grant No. SFI/02/CE1/I131.
Hosted at DERI, NUI Galway.