Didaskon/IKHarvester/Mapping

From Corrib Clan Wiki

Attribute Mapping Rules

IKHarvester manages informal knowledge. Data harvesting means collecting data from SSIS (in general, online communities) and saving it to the informal knowledge repository. The repository stores this metadata as RDF triples, from which Learning Objects (LOs) described according to the LOM standard are created and provided to Didaskon.

Defining the mapping rules between resources' attributes, their semantic representations (predicates), and LOM attributes was crucial for further development. Many properties describe a resource. Semantic RDF feeds are very helpful since they already provide a mapping from attributes to predicates. Because they also give a lot of information that is unnecessary from a learning perspective, their output must be filtered during LO composition.

In this section, I describe the attribute mappings for each resource type IKHarvester supports at the moment (blog posts, wiki articles, and JeromeDL resources); the results of my research are presented here.

Blog Posts

Metadata for blog posts is delivered by SIOC data exporters. A blog that supports SIOC contains some additional information in a link tag with rel="meta" (inside the head tag) of the HTML code. For my blog, which is available at http://dobrzanski.net, it looks as follows:

  <link rel="meta" type="application/rdf+xml" title="SIOC" href="http://dobrzanski.net/index.php?sioc_type=site" />

The href attribute value is the URL of the RDF representation of the data on the current page. Its value changes while browsing the blog, so it always points to up-to-date RDF output. In general, the output consists of some information about the blog itself and its posts. Given the URL of the SIOC data for a post, IKHarvester uses the exporter to obtain the RDF graph, which is saved to the informal knowledge repository. When it is asked to deliver data, it collects the RDF statements from the repository and transforms them so that they describe the post in a way compatible with the LOM standard. Since some of the metadata is not crucial for e-Learning purposes, it is filtered out while creating the LO manifest.
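The discovery step described above can be sketched in Python. This is only an illustration, not IKHarvester's actual code: the class and function names are hypothetical, and it assumes the SIOC link is advertised exactly as in the example (rel="meta", type="application/rdf+xml", title="SIOC").

```python
from html.parser import HTMLParser


class SiocLinkFinder(HTMLParser):
    """Collects href values of <link> tags that advertise SIOC RDF data."""

    def __init__(self):
        super().__init__()
        self.sioc_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        # Assumption: the exporter marks its link exactly as in the example above.
        if (a.get("rel") == "meta"
                and a.get("type") == "application/rdf+xml"
                and a.get("title") == "SIOC"
                and a.get("href")):
            self.sioc_urls.append(a["href"])


def find_sioc_urls(html):
    """Return the URLs of all SIOC exports advertised in an HTML page."""
    finder = SiocLinkFinder()
    finder.feed(html)
    return finder.sioc_urls


page = """<html><head>
<link rel="meta" type="application/rdf+xml" title="SIOC" href="http://dobrzanski.net/index.php?sioc_type=site" />
</head><body></body></html>"""
print(find_sioc_urls(page))
```

The returned URL is what would then be fetched to obtain the RDF graph; fetching and storing the graph are left out of the sketch.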

In the following table, I present how posts' attributes (first column) are mapped to SIOC ontology predicates (second column) and then to LOM attributes (third column). Some of the LOM attributes, which cannot be collected from the SIOC exporter output, are set to default values. Attributes labeled with a star (*) can occur more than once.

Attribute | Predicate | LOM
- | sioc:Post | Educational.LearningResourceType="BlogPost"
URI | - | Technical.Location & General.Identifier.Catalog="URI" & General.Identifier.Entry & Meta-Metadata.Identifier.Catalog="URI" & Meta-Metadata.Identifier.Entry
title | dc:title | General.Identifier.Title
creator | sioc:has_creator | Lifecycle.Contribute.Role="Author" & Lifecycle.Contribute.Entity="Personal info." & Lifecycle.Contribute.Date="Date of creation" & Meta-Metadata.Contribute.Role="Author" & Meta-Metadata.Contribute.Entity="Personal info." & Meta-Metadata.Contribute.Date="Date"
creation date | dcterms:link | Lifecycle.version="Date"
description | sioc:content | General.Description & Educational.Description & Classification.Description
rich content (HTML) | content:encoded | -
topic* | sioc:topic | General.Keyword & Classification.Keyword
reply* | sioc:has_reply | Annotation.Entity="About author" & Annotation.Date="Date" & Annotation.Description="Content"
external link* | sioc:links_to | Relation.Kind="references" & Relation.Resource.Identifier.Catalog="URI" & Relation.Resource.Identifier.Entry & Relation.Resource.Description="references"
language | - | General.Language & Educational.Language & Meta-Metadata.Language
- | - | Educational.InteractivityType="expositive"
- | - | Educational.InteractivityLevel="medium"
- | - | Educational.SemanticDensity="medium"
- | - | Educational.IntendedEndUserRole="learner"
- | - | Educational.Context="school" & Educational.Context="higher education" & Educational.Context="training" & Educational.Context="other"
- | - | Educational.Difficulty="easy"
- | - | Rights.Cost="no"
- | - | Rights.CopyrightAndOtherRestrictions="no"
- | - | General.Structure="atomic"
- | - | General.AggregationLevel="1"
- | - | MetaMetadata.MetadataSchema="LOMv1.0"
- | - | Technical.Requirement.OrComposite... .Type="operating system", .Name="multi-os"; .Type="browser", .Name="any"
- | - | LifeCycle.Status="revised"
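
One way to read the table above is as a lookup from predicate to LOM attributes, plus a set of defaults. The following Python sketch shows that reading for a few rows only; the names PREDICATE_TO_LOM, LOM_DEFAULTS, and to_lom are hypothetical, and statements are simplified to (predicate, value) pairs rather than full RDF triples.

```python
from collections import defaultdict

# Subset of the blog-post table: predicate -> LOM attributes it fills.
PREDICATE_TO_LOM = {
    "dc:title": ["General.Identifier.Title"],
    "sioc:content": ["General.Description",
                     "Educational.Description",
                     "Classification.Description"],
    "sioc:topic": ["General.Keyword", "Classification.Keyword"],
}

# Subset of the default rows (those with no attribute and no predicate).
LOM_DEFAULTS = {
    "Educational.LearningResourceType": "BlogPost",
    "Educational.InteractivityType": "expositive",
    "Educational.Difficulty": "easy",
    "Rights.Cost": "no",
}


def to_lom(statements):
    """Turn (predicate, value) pairs into a LOM attribute -> values dict."""
    lom = defaultdict(list)
    for predicate, value in statements:
        # Predicates without a LOM target (e.g. content:encoded) are filtered out.
        for attribute in PREDICATE_TO_LOM.get(predicate, []):
            lom[attribute].append(value)
    # Defaults only fill attributes that the harvested data did not set.
    for attribute, value in LOM_DEFAULTS.items():
        lom.setdefault(attribute, [value])
    return dict(lom)


post = [
    ("dc:title", "Hello SIOC"),
    ("sioc:topic", "rdf"),
    ("sioc:topic", "e-learning"),
    ("content:encoded", "<p>Hello</p>"),
]
lom = to_lom(post)
print(lom["General.Keyword"])
```

Keeping values in lists reflects the starred rows, which can occur more than once.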


Wiki Articles

One of the functional requirements says IKHarvester must collect data from semantic and non-semantic wikis that are based on the MediaWiki engine. Much information about the concept described in an article from a semantic wiki can be obtained from the RDF feed; these are relations and attributes. However, harvesting should also be performed for non-semantic wikis, like Wikipedia. It turns out their pages carry quite a lot of semantics: parts such as the title, content, and categories are placed inside sections with formalized identifiers. Thus, scraping the page yields a lot of crucial information. In fact, I perform scraping for both semantic and non-semantic wikis.
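
A minimal Python sketch of such scraping is given below. The source does not name the identifiers, so the ids firstHeading (page title) and catlinks (category box), as emitted by common MediaWiki skins, are assumptions, as is the rule that category links contain "Category:" in their href. The class name WikiScraper is hypothetical.

```python
from html.parser import HTMLParser


class WikiScraper(HTMLParser):
    """Pulls the title and category names out of a MediaWiki-style page."""

    VOID = {"br", "img", "hr", "input", "meta", "link"}  # tags with no end tag

    def __init__(self):
        super().__init__()
        self.title = ""
        self.categories = []
        self._region = None    # "title" or "cats" while inside a tracked element
        self._depth = 0        # nesting depth inside the tracked element
        self._in_link = False  # True while inside a category link

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        a = dict(attrs)
        if self._region:
            self._depth += 1
            if (self._region == "cats" and tag == "a"
                    and "Category:" in (a.get("href") or "")):
                self._in_link = True
                self.categories.append("")
            return
        if a.get("id") == "firstHeading":      # assumed id of the title heading
            self._region, self._depth = "title", 1
        elif a.get("id") == "catlinks":        # assumed id of the category box
            self._region, self._depth = "cats", 1

    def handle_endtag(self, tag):
        if tag in self.VOID or not self._region:
            return
        if tag == "a":
            self._in_link = False
        self._depth -= 1
        if self._depth == 0:
            self._region = None

    def handle_data(self, data):
        if self._region == "title":
            self.title += data
        elif self._region == "cats" and self._in_link:
            self.categories[-1] += data


sample = """<html><body>
<h1 id="firstHeading">Semantic Web</h1>
<div id="bodyContent"><p>Some text.</p></div>
<div id="catlinks"><a href="/wiki/Special:Categories">Categories</a>:
<span><a href="/wiki/Category:RDF">RDF</a></span> |
<span><a href="/wiki/Category:Metadata">Metadata</a></span></div>
</body></html>"""
scraper = WikiScraper()
scraper.feed(sample)
print(scraper.title, scraper.categories)
```

The same approach works for semantic and non-semantic wikis, since it relies only on the page markup and not on the RDF feed.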

In the following table, I present how the attributes of wiki articles (first column) are mapped to SIOC ontology predicates (second column) and then to LOM attributes (third column). Some of the LOM attributes are set to default values based on what the LOM standard proposes. Attributes labeled with a star (*) can occur more than once; those with two stars (**) are served by RDF feeds and can occur multiple times as well.


Attribute | Predicate | LOM
- | sioc:WikiArticle | Educational.LearningResourceType="WikiArticle"
URI | - | Technical.Location & General.Identifier.Catalog="URI" & General.Identifier.Entry & Meta-Metadata.Identifier.Catalog="URI" & Meta-Metadata.Identifier.Entry
title | dc:title | General.Identifier.Title
last modification date | dcterms:link | Lifecycle.version="Date"
description | sioc:content | General.Description & Educational.Description & Classification.Description
rich content (HTML) | content:encoded | -
category* | sioc:topic | General.Keyword & Classification.Keyword
external link* | sioc:links_to | Relation.Kind="references" & Relation.Resource.Identifier.Catalog="URI" & Relation.Resource.Identifier.Entry & Relation.Resource.Description="references"
relation** | relation:xxx | Relation.Kind=xxx & Relation.Resource.Identifier.Catalog="URI" & Relation.Resource.Identifier.Entry & Relation.Resource.Description=xxx
attribute** | attribute:xxx | Relation.Kind="has attribute" & Relation.Resource.Identifier.Catalog="URI" & Relation.Resource.Identifier.Entry & Relation.Resource.Description="has attribute"
language | - | General.Language & Educational.Language & Meta-Metadata.Language
- | - | Educational.InteractivityType="expositive"
- | - | Educational.InteractivityLevel="medium"
- | - | Educational.SemanticDensity="medium"
- | - | Educational.IntendedEndUserRole="learner"
- | - | Educational.Context="school" & Educational.Context="higher education" & Educational.Context="training" & Educational.Context="other"
- | - | Educational.Difficulty="medium"
- | - | Rights.Cost="no"
- | - | Rights.CopyrightAndOtherRestrictions="no"
- | - | General.Structure="atomic"
- | - | General.AggregationLevel="1"
- | - | MetaMetadata.MetadataSchema="LOMv1.0"
- | - | Technical.Requirement.OrComposite... .Type="operating system", .Name="multi-os"; .Type="browser", .Name="any"
- | - | LifeCycle.Status="revised"
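
The relation** and attribute** rows use xxx as a placeholder for the name that follows the prefix. A small Python sketch of how such a predicate could be expanded into the LOM Relation attributes is shown below; the function name relation_to_lom and the example predicates are hypothetical.

```python
def relation_to_lom(predicate, target):
    """Expand a relation:xxx or attribute:xxx predicate into LOM Relation fields."""
    prefix, _, name = predicate.partition(":")
    if prefix == "relation" and name:
        kind = name                 # Relation.Kind=xxx in the table
    elif prefix == "attribute" and name:
        kind = "has attribute"      # fixed value for every attribute:xxx
    else:
        raise ValueError(f"not a relation or attribute predicate: {predicate!r}")
    return {
        "Relation.Kind": kind,
        "Relation.Resource.Identifier.Catalog": "URI",
        "Relation.Resource.Identifier.Entry": target,
        "Relation.Resource.Description": kind,
    }


print(relation_to_lom("relation:located_in", "http://example.org/wiki/Galway"))
```

Since these rows can occur multiple times, a caller would collect one such dictionary per statement found in the RDF feed.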


JeromeDL resources

JeromeDL provides information for resources in a few representations. I have chosen the MarcOnt ontology supported by JeromeDL. Both give accurate descriptions; the only remaining task is to filter them and decide how they should be mapped to LOM attributes. The results of the research in that field are presented in the following table.


Attribute | Predicate | LOM
- | jeromedl:Book | Educational.LearningResourceType="JeromeDLResource"
URI | - | Technical.Location & General.Identifier.Catalog="URI" & General.Identifier.Entry & Meta-Metadata.Identifier.Catalog="URI" & Meta-Metadata.Identifier.Entry
title | marcont:hasTitles | General.Identifier.Title
creator | marcont:hasCreator | Lifecycle.Contribute.Role="Author" & Lifecycle.Contribute.Entity="Personal info." & Lifecycle.Contribute.Date="Date of creation" & Meta-Metadata.Contribute.Role="Author" & Meta-Metadata.Contribute.Entity="Personal info." & Meta-Metadata.Contribute.Date="Date"
abstract | jeromedl:abstract | General.Description & Educational.Description & Classification.Description
keyword* | marcont:hasKeyword | General.Keyword & Classification.Keyword
bookType | jeromedl:bookType | Educational.LearningResourceType
digitalType | jeromedl:digitalType | Technical.Format
protectionType | jeromedl:protectionType | Rights.Copyright=XXX & Rights.Cost
language | - | General.Language & Educational.Language & Meta-Metadata.Language
supervisor | xmarcont:supervisor | Lifecycle.Contribute.Role="Supervisor" & Lifecycle.Contribute.Entity="Personal info." & Meta-Metadata.Contribute.Role="Supervisor" & Meta-Metadata.Contribute.Entity="Personal info."
consultant | xmarcont:consultant | Lifecycle.Contribute.Role="Consultant" & Lifecycle.Contribute.Entity="Personal info." & Meta-Metadata.Contribute.Role="Consultant" & Meta-Metadata.Contribute.Entity="Personal info."
uploader | jeromedl:uploader | Lifecycle.Contribute.Role="Uploader" & Lifecycle.Contribute.Entity="Personal info." & Meta-Metadata.Contribute.Role="Uploader" & Meta-Metadata.Contribute.Entity="Personal info."
- | - | Educational.InteractivityType="expositive"
- | - | Educational.InteractivityLevel="medium"
- | - | Educational.SemanticDensity="medium"
- | - | Educational.IntendedEndUserRole="learner"
- | - | Educational.Context="school" & Educational.Context="higher education" & Educational.Context="training" & Educational.Context="other"
- | - | Educational.Difficulty="medium"
- | - | General.Structure="atomic"
- | - | General.AggregationLevel="1"
- | - | MetaMetadata.MetadataSchema="LOMv1.0"
- | - | Technical.Requirement.OrComposite... .Type="operating system", .Name="multi-os"; .Type="browser", .Name="any"
- | - | LifeCycle.Status="revised"
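
The creator, supervisor, consultant, and uploader rows all follow the same pattern: each person produces one Contribute entry under Lifecycle and one under Meta-Metadata, and only the creator row carries a date. The Python sketch below illustrates that pattern; ROLE_BY_PREDICATE and contribute_entries are hypothetical names, and the person and date values are made up for the example.

```python
# Predicate -> LOM role, taken from the four contributor rows of the table.
ROLE_BY_PREDICATE = {
    "marcont:hasCreator": "Author",
    "xmarcont:supervisor": "Supervisor",
    "xmarcont:consultant": "Consultant",
    "jeromedl:uploader": "Uploader",
}


def contribute_entries(predicate, person, date=None):
    """Build the Lifecycle and Meta-Metadata Contribute entries for one person."""
    role = ROLE_BY_PREDICATE[predicate]
    entries = []
    for section in ("Lifecycle", "Meta-Metadata"):
        entry = {
            f"{section}.Contribute.Role": role,
            f"{section}.Contribute.Entity": person,
        }
        # In the table only the creator row lists a Contribute.Date.
        if role == "Author" and date is not None:
            entry[f"{section}.Contribute.Date"] = date
        entries.append(entry)
    return entries


for entry in contribute_entries("marcont:hasCreator", "Jane Doe", "2007-01-15"):
    print(entry)
print(contribute_entries("xmarcont:supervisor", "John Smith"))
```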

Corrib cluster project is supported by Enterprise Ireland under Grant No. ILP/05/203, Science Foundation Ireland under Grant No. SFI/02/CE1/I131.
Hosted at DERI, NUI Galway.