Didaskon/IKHarvester/Mapping
From Corrib Clan Wiki
Attribute Mapping Rules
IKHarvester aims at managing informal knowledge. Data harvesting means collecting data from SSIS (in general, online communities) and saving it to the informal knowledge repository. The repository stores this metadata as RDF triples, from which Learning Objects (LOs) described according to the LOM standard are created and provided to Didaskon.
Defining the mapping rules between resources' attributes, their semantic representations (predicates), and LOM attributes was crucial for further development. Many properties describe a resource. Semantic RDF feeds are very helpful since they already map attributes to predicates. However, because they also carry a lot of information that is unnecessary from a learning perspective, their output must be filtered during LO composition.
In this section, I describe the attribute mappings for each resource type IKHarvester currently supports (blog posts, wiki articles, and JeromeDL resources); the results of my research are presented here.
Blog Posts
Metadata for blog posts is delivered by SIOC data exporters. A blog that supports SIOC carries some additional information in a link tag with rel="meta" (inside the head tag) of its HTML code. For my blog, which is available at http://dobrzanski.net, it looks as follows:
<link rel="meta" type="application/rdf+xml" title="SIOC" href="http://dobrzanski.net/index.php?sioc_type=site" />
The href attribute value is the URL of the RDF representation of the data on the current page. Its value changes as the blog is browsed, so it always points to up-to-date RDF output. In general, the output consists of some information about the blog itself and its posts. Given the URL of the SIOC data for a post, IKHarvester uses the exporter to obtain the RDF graph, which is saved to the informal knowledge repository. When IKHarvester is asked to deliver data, it collects the RDF statements from the repository and transforms them so that they describe the post in a way compatible with the LOM standard. Since some of the metadata is not crucial for e-Learning purposes, it is filtered out while the LO manifest is created.
In the following table, I present how posts' attributes (first column) are mapped to SIOC ontology predicates (second column) and then to LOM attributes (third column). Some LOM attributes cannot be collected from the SIOC exporter output and are set to default values. Attributes labeled with a star (*) can occur more than once.
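The discovery step described above can be sketched in a few lines of Python. This is only an illustration, not IKHarvester's actual code: the class name `SiocLinkFinder` and the function `find_sioc_urls` are hypothetical, and the matching rule (rel="meta", type="application/rdf+xml", title="SIOC") is taken from the single example link shown earlier.

```python
from html.parser import HTMLParser


class SiocLinkFinder(HTMLParser):
    """Collect href values of <link> elements that advertise SIOC RDF data."""

    def __init__(self):
        super().__init__()
        self.sioc_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values may be None for bare attributes, so normalise them.
        a = {name: (value or "") for name, value in attrs}
        if (a.get("rel", "").lower() == "meta"
                and a.get("type", "").lower() == "application/rdf+xml"
                and a.get("title", "").upper() == "SIOC"
                and a.get("href")):
            self.sioc_urls.append(a["href"])


def find_sioc_urls(html):
    """Return the SIOC exporter URLs announced in an HTML page."""
    finder = SiocLinkFinder()
    finder.feed(html)
    return finder.sioc_urls


page = """<html><head>
<link rel="meta" type="application/rdf+xml" title="SIOC"
      href="http://dobrzanski.net/index.php?sioc_type=site" />
</head><body></body></html>"""

print(find_sioc_urls(page))
```

The returned URL is what a harvester would then request to obtain the RDF graph; fetching and storing the graph is left out here because it depends on the repository in use.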
| Attribute | Predicate | LOM |
|---|---|---|
| - | sioc:Post | Educational.LearningResourceType="BlogPost" |
| URI | - | Technical.Location & General.Identifier.Catalog="URI" |
| title | dc:title | General.Identifier.Title |
| creator | sioc:has_creator | Lifecycle.Contribute.Role="Author" & Lifecycle.Contribute.Entity="Personal info." |
| creation date | dcterms:link | Lifecycle.version="Date" |
| description | sioc:content | General.Description & Educational.Description |
| rich content (HTML) | content:encoded | - |
| topic* | sioc:topic | General.Keyword & Classification.Keyword |
| reply* | sioc:has_reply | Annotation.Entity="About author" & Annotation.Date="Date" |
| external link* | sioc:links_to | Relation.Kind="references" & Relation.Resource.Identifier.Catalog="URI" |
| language | - | General.Language & Educational.Language |
| - | - | Educational.InteractivityType="expositive" |
| - | - | Educational.InteractivityLevel="medium" |
| - | - | Educational.SemanticDensity="medium" |
| - | - | Educational.IntendedEndUserRole="learner" |
| - | - | Educational.Context="school" & Educational.Context="higher education" |
| - | - | Educational.Difficulty="easy" |
| - | - | Rights.Cost="no" |
| - | - | Rights.CopyrightAndOtherRestrictions="no" |
| - | - | General.Structure="atomic" |
| - | - | General.AggregationLevel="1" |
| - | - | MetaMetadata.MetadataSchema="LOMv1.0" |
| - | - | Technical.Requirement.OrComposite... .Type="operating system" |
| - | - | LifeCycle.Status="revised" |
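To make the table concrete, here is a minimal sketch of how such rules could be applied to harvested statements. It assumes the statements have already been reduced to (predicate, value) pairs; the names `PREDICATE_TO_LOM`, `LOM_DEFAULTS`, and `to_lom` are hypothetical, and only a few rows of the table are copied in.

```python
# Predicate -> LOM attributes, copied from a few rows of the table above.
PREDICATE_TO_LOM = {
    "dc:title": ["General.Identifier.Title"],
    "sioc:content": ["General.Description", "Educational.Description"],
    "sioc:topic": ["General.Keyword", "Classification.Keyword"],
}

# Defaults that the exporter output cannot supply (a subset of the table).
LOM_DEFAULTS = {
    "Educational.LearningResourceType": ["BlogPost"],
    "Educational.Difficulty": ["easy"],
    "Rights.Cost": ["no"],
}


def to_lom(statements):
    """Turn (predicate, value) pairs into a dict of LOM attribute -> values."""
    lom = {key: list(values) for key, values in LOM_DEFAULTS.items()}
    for predicate, value in statements:
        # Predicates without a LOM target (e.g. content:encoded) are dropped,
        # which is the filtering step mentioned in the text.
        for attribute in PREDICATE_TO_LOM.get(predicate, []):
            lom.setdefault(attribute, []).append(value)
    return lom


# Hypothetical statements for one post.
post = [
    ("dc:title", "Mapping rules"),
    ("sioc:topic", "rdf"),
    ("sioc:topic", "lom"),
    ("content:encoded", "<p>Mapping rules</p>"),
]

for attribute, values in sorted(to_lom(post).items()):
    print(attribute, values)
```

Keeping the rules as plain data rather than code means a new resource type only needs a new pair of dictionaries.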
Wiki Articles
One of the functional requirements says IKHarvester must collect data from semantic and non-semantic wikis based on the MediaWiki engine. Much information about the concept described in a semantic wiki article can be obtained from the RDF feed: its relations and attributes. However, harvesting should also be performed for non-semantic wikis, like Wikipedia. It turns out that their pages still carry quite a lot of semantics: parts such as the title, content, and categories are placed inside sections with formalized identifiers. Thus, scraping the page yields a lot of crucial information. In fact, I perform scraping for both semantic and non-semantic wikis.
In the following table, I present how the attributes of wiki articles (first column) are mapped to SIOC ontology predicates (second column) and then to LOM attributes (third column). Some of the LOM attributes are set to default values, based on what the LOM standard proposes. Attributes labeled with a star (*) can occur more than once; those with two stars (**) are provided by RDF feeds and can also occur multiple times.
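As an illustration of scraping by identifier, the sketch below pulls the title and the categories out of a page. It is not the scraper used by IKHarvester. The element ids `firstHeading` and `catlinks` are assumptions based on common MediaWiki skins, the rule that category links contain "Category:" in their href is also an assumption (it differs on localized wikis), and the sample page is made up.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class WikiScraper(HTMLParser):
    """Extract the article title and category names by element id."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.categories = []
        self._region = None    # "title" or "categories" while inside one
        self._depth = 0        # nesting depth inside the current region
        self._in_link = False  # True while inside a category link

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attributes = dict(attrs)
        if self._region:
            self._depth += 1
            href = attributes.get("href") or ""
            if self._region == "categories" and tag == "a" and "Category:" in href:
                self._in_link = True
                self.categories.append("")
            return
        if attributes.get("id") == "firstHeading":
            self._region, self._depth = "title", 1
        elif attributes.get("id") == "catlinks":
            self._region, self._depth = "categories", 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._region:
            return
        if tag == "a":
            self._in_link = False
        self._depth -= 1
        if self._depth == 0:
            self._region = None

    def handle_data(self, data):
        if self._region == "title":
            self.title += data
        elif self._in_link:
            self.categories[-1] += data


def scrape(html):
    """Return (title, categories) for one wiki page."""
    scraper = WikiScraper()
    scraper.feed(html)
    return scraper.title.strip(), scraper.categories


sample = """<html><body>
<h1 id="firstHeading">Semantic <i>Web</i></h1>
<div id="bodyContent"><p>Some text.<br></p></div>
<div id="catlinks"><a href="/wiki/Special:Categories">Categories</a>:
<ul><li><a href="/wiki/Category:Metadata">Metadata</a></li>
<li><a href="/wiki/Category:Knowledge_representation">Knowledge representation</a></li></ul></div>
</body></html>"""

print(scrape(sample))
```

The depth counter assumes reasonably well-formed markup; a production scraper would more likely rely on a tolerant HTML library.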
| Attribute | Predicate | LOM |
|---|---|---|
| - | sioc:WikiArticle | Educational.LearningResourceType="WikiArticle" |
| URI | - | Technical.Location & General.Identifier.Catalog="URI" |
| title | dc:title | General.Identifier.Title |
| last modification date | dcterms:link | Lifecycle.version="Date" |
| description | sioc:content | General.Description & Educational.Description |
| rich content (HTML) | content:encoded | - |
| category* | sioc:topic | General.Keyword & Classification.Keyword |
| external link* | sioc:links_to | Relation.Kind="references" & Relation.Resource.Identifier.Catalog="URI" |
| relation** | relation:xxx | Relation.Kind=xxx & Relation.Resource.Identifier.Catalog="URI" |
| attribute** | attribute:xxx | Relation.Kind="has attribute" & Relation.Resource.Identifier.Catalog="URI" |
| language | - | General.Language & Educational.Language |
| - | - | Educational.InteractivityType="expositive" |
| - | - | Educational.InteractivityLevel="medium" |
| - | - | Educational.SemanticDensity="medium" |
| - | - | Educational.IntendedEndUserRole="learner" |
| - | - | Educational.Context="school" & Educational.Context="higher education" |
| - | - | Educational.Difficulty="medium" |
| - | - | Rights.Cost="no" |
| - | - | Rights.CopyrightAndOtherRestrictions="no" |
| - | - | General.Structure="atomic" |
| - | - | General.AggregationLevel="1" |
| - | - | MetaMetadata.MetadataSchema="LOMv1.0" |
| - | - | Technical.Requirement.OrComposite... .Type="operating system" |
| - | - | LifeCycle.Status="revised" |
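The rows for relation**, attribute**, and external link* all produce LOM Relation entries and differ only in how Relation.Kind is chosen. A small sketch of that rule follows, assuming statements arrive as (predicate, target URI) pairs. The function name `relation_entries` is hypothetical, and the `Relation.Resource.Identifier.Entry` key is an assumption that follows LOM's Catalog/Entry pairing; the table itself lists only the Catalog part.

```python
def relation_entries(statements):
    """Build LOM Relation entries from (predicate, target URI) pairs."""
    entries = []
    for predicate, target in statements:
        prefix, _, local_name = predicate.partition(":")
        if prefix == "relation":
            kind = local_name        # Relation.Kind=xxx, taken from relation:xxx
        elif prefix == "attribute":
            kind = "has attribute"   # one fixed kind for every attribute:xxx
        elif predicate == "sioc:links_to":
            kind = "references"
        else:
            continue                 # not a relation-like predicate
        entries.append({
            "Relation.Kind": kind,
            "Relation.Resource.Identifier.Catalog": "URI",
            # Assumed slot for the target; not listed in the table above.
            "Relation.Resource.Identifier.Entry": target,
        })
    return entries


# Hypothetical statements from a semantic wiki's RDF feed.
feed = [
    ("relation:part_of", "http://example.org/wiki/Europe"),
    ("attribute:population", "http://example.org/wiki/Population"),
    ("sioc:links_to", "http://example.org/external"),
    ("dc:title", "Ireland"),
]

for entry in relation_entries(feed):
    print(entry)
```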
JeromeDL Resources
JeromeDL exposes information about its resources in a few representations. I have chosen the MarcOnt ontology, which JeromeDL supports. Both give accurate descriptions; the only remaining task is to filter them and decide how they should be mapped to LOM attributes. The results of the research in that field are presented in the following table.
| Attribute | Predicate | LOM |
|---|---|---|
| - | jeromedl:Book | Educational.LearningResourceType="JeromeDLResource" |
| URI | - | Technical.Location & General.Identifier.Catalog="URI" |
| title | marcont:hasTitles | General.Identifier.Title |
| creator | marcont:hasCreator | Lifecycle.Contribute.Role="Author" & Lifecycle.Contribute.Entity="Personal info." |
| abstract | jeromedl:abstract | General.Description & Educational.Description |
| keyword* | marcont:hasKeyword | General.Keyword & Classification.Keyword |
| bookType | jeromedl:bookType | Educational.LearningResourceType |
| digitalType | jeromedl:digitalType | Technical.Format |
| protectionType | jeromedl:protectionType | Rights.Copyright=XXX & Rights.Cost |
| language | - | General.Language & Educational.Language |
| supervisor | xmarcont:supervisor | Lifecycle.Contribute.Role="Supervisor" & Lifecycle.Contribute.Entity="Personal info." |
| consultant | xmarcont:consultant | Lifecycle.Contribute.Role="Consultant" & Lifecycle.Contribute.Entity="Personal info." |
| uploader | jeromedl:uploader | Lifecycle.Contribute.Role="Uploader" & Lifecycle.Contribute.Entity="Personal info." |
| - | - | Educational.InteractivityType="expositive" |
| - | - | Educational.InteractivityLevel="medium" |
| - | - | Educational.SemanticDensity="medium" |
| - | - | Educational.IntendedEndUserRole="learner" |
| - | - | Educational.Context="school" & Educational.Context="higher education" |
| - | - | Educational.Difficulty="medium" |
| - | - | General.Structure="atomic" |
| - | - | General.AggregationLevel="1" |
| - | - | MetaMetadata.MetadataSchema="LOMv1.0" |
| - | - | Technical.Requirement.OrComposite... .Type="operating system" |
| - | - | LifeCycle.Status="revised" |
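Four rows of this table (creator, supervisor, consultant, uploader) map different predicates to the same pair of LOM attributes and differ only in the role. The sketch below shows one way to express that, assuming statements arrive as (predicate, person) pairs; `CONTRIBUTOR_ROLES` and `contributions` are hypothetical names, and the people in the example are invented.

```python
# Predicate -> value of Lifecycle.Contribute.Role, as listed in the table.
CONTRIBUTOR_ROLES = {
    "marcont:hasCreator": "Author",
    "xmarcont:supervisor": "Supervisor",
    "xmarcont:consultant": "Consultant",
    "jeromedl:uploader": "Uploader",
}


def contributions(statements):
    """Build one Lifecycle.Contribute entry per contributor statement."""
    return [
        {
            "Lifecycle.Contribute.Role": CONTRIBUTOR_ROLES[predicate],
            "Lifecycle.Contribute.Entity": person,
        }
        for predicate, person in statements
        if predicate in CONTRIBUTOR_ROLES
    ]


# Hypothetical statements for one JeromeDL resource.
resource = [
    ("marcont:hasCreator", "A. Author"),
    ("xmarcont:supervisor", "S. Supervisor"),
    ("marcont:hasKeyword", "e-Learning"),
]

for entry in contributions(resource):
    print(entry)
```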



