Generate a project RDF Schema from the RDF Schema Template
Note
To find out more watch the Tutorial on Expanding the SPHN RDF Schema
Target Audience
This document is mainly intended for project data managers and researchers who are interested in generating their project-specific RDF Schema. It provides guidance on how to create a project-specific RDF Schema based on the SPHN RDF Schema. Information on how to modify and extend the SPHN RDF Schema to fit the needs of the project is also given.
Figure 1: Process on how to use and modify the SPHN Dataset for the project specific needs.
1. Project-specific schema creation
To facilitate the steps in creating a project-specific schema, the DCC provides the RDF Schema Template with pre-filled elements accessible here.
This template contains:
- the SPHN RDF Schema imported (as Direct Imports) and the related external resources imported (as Indirect Imports)
- adequate imports of RDF libraries used in the context of SPHN (e.g. http://purl.org/dc/terms/)
- pre-filled metadata (annotations) for the project-specific schema to be updated by the projects.
Please use this file to create your project-specific schema.
1.1 Create a project-specific schema in Protégé
Get the RDF Schema Template file provided by the DCC from Git into Protégé and follow these steps to update the schema information:
1. First open the template file: File –> Open
2. Make sure to link to the adequate SPHN RDF Schema and external terminologies when requested to import them (the catalog.xml file provided in Git facilitates the import; instructions are available in the README file)
3. Save this project with the project name: File –> Save As –> Select the format (recommended: Turtle syntax, OWL/XML Syntax)
4. Select the location to save to and name the project accordingly (e.g. psss_schema, frailty_schema).
1.1.1 Update the ontology IRI
A schema released by a project, which extends the SPHN RDF Schema, should have its own ontology IRI (namespace) defined. The ontology IRI, also called base prefix, will be used by both data providers (to annotate data) and data users (to query for the relevant classes/properties).
The convention to follow for defining this ontology IRI is: https://biomedit.ch/rdf/sphn-ontology/ + <name of the project> + / or # (e.g., for the PSSS project, the ontology IRI can be: https://biomedit.ch/rdf/sphn-ontology/psss/).
In addition to the ontology IRI, a version IRI must be generated and provided by the project for each published release of their RDF schema. The version IRI must be in the form of: <ontologyIRI> + <year> + / + <version> + / (e.g. https://biomedit.ch/rdf/sphn-ontology/psss/2021/3/ for the third release of the PSSS RDF Schema in 2021).
The version IRI of a project called PSSS would be reflected in an RDF Turtle file as follows:
@prefix : <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
<https://biomedit.ch/rdf/sphn-ontology/psss/>
    owl:versionIRI <https://biomedit.ch/rdf/sphn-ontology/psss/2021/3/> .
In the template loaded, the ontology IRI and the ontology version IRI must be updated in the Active Ontology, section Ontology Header, following the conventions cited above: simply change the text “PROJECT-NAME” to the actual project name.
1.1.2 Update annotations
Below the Ontology header section are the annotations holding the metadata about the project’s schema:
- the title (dc:title) should be a project-specific title (e.g. ‘the PSSS RDF Schema’)
- the short comment (dc:description) should be a short sentence reflecting the content of the project’s schema
- the license of the project (dcterms:license), which should be the same as the SPHN licensing
Make sure to update the title and the description by changing the “PROJECT-NAME” to the actual project name. The license does not need any changes.
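As a rough illustration, the updated annotations might look like the following in Turtle for a project called PSSS. This is a sketch only: the dc: prefix binding, the description text and the license IRI are assumptions made for the example; in practice the license value already present in the template should be kept.

```turtle
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix dc:      <http://purl.org/dc/elements/1.1/> .   # assumed binding for dc:
@prefix dcterms: <http://purl.org/dc/terms/> .

# Ontology header of the project-specific schema (PSSS used as an example)
<https://biomedit.ch/rdf/sphn-ontology/psss/>
    a owl:Ontology ;
    # Title and description with "PROJECT-NAME" replaced by the project name
    dc:title "The PSSS RDF Schema" ;
    dc:description "The RDF Schema describing the concepts used in the PSSS project" ;
    # Placeholder IRI: keep the license value that ships with the template
    dcterms:license <https://example.org/license-from-template> .
```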
1.1.3 Information about Imports
In the template, the SPHN RDF Schema is already imported, which results in the following statement in the project-specific Turtle file (here, an example with the PSSS project):
@prefix : <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
<https://biomedit.ch/rdf/sphn-ontology/psss/>
    owl:versionIRI <https://biomedit.ch/rdf/sphn-ontology/psss/2021/3/> ;
    owl:imports <https://biomedit.ch/rdf/sphn-ontology/sphn/2021/1/> .
Note
owl:imports means that the contents of another OWL ontology (here, the SPHN RDF Schema) are imported into the current one (here, the PSSS RDF Schema).
More information can be found at: https://www.w3.org/TR/owl-ref/#imports-def.
If you wish to import any other ontology in the project, follow these steps:
1. In Ontology imports, click the + sign next to Direct Imports
2. Choose Import an ontology contained in a local file, then Continue
3. Select the ontology to import with Browse, then Continue, and finally Finish.
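After such an import, the ontology header would be expected to carry one more owl:imports statement. The sketch below assumes a hypothetical additional ontology at https://example.org/other-ontology/1.0/; the IRI actually written depends on the ontology IRI declared in the imported file.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

<https://biomedit.ch/rdf/sphn-ontology/psss/>
    a owl:Ontology ;
    owl:versionIRI <https://biomedit.ch/rdf/sphn-ontology/psss/2021/3/> ;
    # Import already present in the template
    owl:imports <https://biomedit.ch/rdf/sphn-ontology/sphn/2021/1/> ,
    # Hypothetical additional direct import added by the project
                <https://example.org/other-ontology/1.0/> .
```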
1.1.4 Add the project schema prefix
In the tab Ontology Prefixes, make sure to update the value of the base prefix (usually the first line, which has an empty prefix) by changing the text ‘PROJECT-NAME’ to the actual name.
Then make sure to add the schema prefix of the project, where the Prefix would be the project name and the Value would be the project ontology IRI, for better readability in the .ttl or .owl file.
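In a Turtle serialization, these two prefix entries would typically appear as shown in this sketch for the PSSS project. The sphn: namespace is taken from the BodyHeight example in this document; the psss:Encounter declaration is included only to show how the prefixes are used.

```turtle
# Base (empty) prefix and named project prefix, both bound to the project ontology IRI
@prefix :     <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix psss: <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix sphn: <https://biomedit.ch/rdf/sphn-ontology/sphn#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

# psss:Encounter and :Encounter expand to the same IRI:
# https://biomedit.ch/rdf/sphn-ontology/psss/Encounter
psss:Encounter a owl:Class .
```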
2. New concepts and modification of existing ones
We encourage you to design a new concept or modify an existing concept according to the Guiding principles for concept design.
First, create a root class (<PROJECT-NAME>Concept), a root data property (<PROJECT-NAME>AttributeDatatype) and a root object property (<PROJECT-NAME>AttributeObject) for the project-specific ontology, under which all the classes and properties specific to the project will be defined as sub-elements.
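For a project called PSSS, these three root elements could be declared roughly as follows. The labels are illustrative, and whether the roots should additionally be linked to their SPHN counterparts is not specified here, so no such link is shown.

```turtle
@prefix psss: <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Root class: parent of all project-specific classes
psss:PSSSConcept a owl:Class ;
    rdfs:label "PSSS Concept" .

# Root data property: parent of all project-specific datatype properties
psss:PSSSAttributeDatatype a owl:DatatypeProperty ;
    rdfs:label "PSSS attribute datatype" .

# Root object property: parent of all project-specific object properties
psss:PSSSAttributeObject a owl:ObjectProperty ;
    rdfs:label "PSSS attribute object" .
```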
2.1 Extension or modification of existing SPHN concepts
Extending or modifying an existing SPHN concept can mean adding composedOfs, adding an alternative semantic standard, or extending an existing value set. There are various reasons for extensions, e.g. implementation of a new standard in the applicable jurisdiction, change in availability of biomedical data, new needs of research projects, or expanded medical knowledge.
Note
There exist three SPHN concepts that have a special meaning in the processing: SubjectPseudoIdentifier, DataProviderInstitute and AdministrativeCase.
Any extension or modification of these concepts might result in invalid pipelines.
Please inform the DCC if you want to modify these concepts.
It may happen that you find the concept in the SPHN Dataset for the data you need, but a piece of information is missing. For example, you need data for a specific measurement, e.g. Body Temperature, and different measurement methods for measuring the Body Temperature matter for your research question. The specific measurement Body Temperature is represented in the SPHN Dataset as a concept. However, the measurement method with the appropriate value set is not yet defined as a composedOf.
In this case you can extend the SPHN concept with the additional composedOf in your project-specific Dataset.
Note
Please inform the DCC about this extension. It might be relevant to other projects as well and the DCC can coordinate an extension to the SPHN Dataset if needed.
| | | description | type |
|---|---|---|---|
| concept | Body Temperature | body temperature of the individual | |
| composedOf | temperature | measured temperature | quantitative |
| composedOf | datetime | datetime of measurement | temporal |
| composedOf | body site | body site of measurement | Body Site |
| composedOf | unit | unit in which the temperature is expressed | Unit |
| composedOf | method | method used to measure the temperature | Measurement Method |
For the example above, the next step would be to define your value set or subset for the new composedOf. If you choose SNOMED CT as the controlled vocabulary to express your values for the method of Body Temperature measurements, you can define a subset as all descendants of the SNOMED CT concept 56342008 | Temperature taking (procedure) |.
| | | description | type | value set or subset |
|---|---|---|---|---|
| composedOf | method | method used to measure the temperature | Measurement Method | child of: 56342008 \| Temperature taking (procedure) \| |
2.2 Implementation of changes in RDF
Figure 2: Process on how to use and modify the SPHN RDF Schema for the project specific needs.
This section displays information about the way a project should update the SPHN RDF Schema depending on the modification to perform.
2.2.1 Modifying an existing class
A project modifying an existing class of the SPHN RDF Schema in any way (minor edit or change breaking compatibility) must provide the modified class with their project prefix. This implies that a new class is generated by the project, with the same name but a different prefix (e.g. a modification in the class sphn:Encounter by the PSSS project would become psss:Encounter).
In Protégé, a new class must be created in the project ontology with the same name, but its IRI will be based on the project ontology IRI (e.g. https://biomedit.ch/rdf/sphn-ontology/psss/Encounter).
Note
If we follow the example provided, real data following the PSSS ontology must then provide the Encounter data elements based on the definition of the PSSS project. Therefore, the prefix used (and the IRI) will always be psss:Encounter (and https://biomedit.ch/rdf/sphn-ontology/psss/Encounter).
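A minimal Turtle sketch of such a project-specific class, together with one data instance typed with it, is given below. The root class psss:PSSSConcept follows the naming pattern from section 2; the comment text and the resource IRI https://example.org/resource/encounter-1 are hypothetical.

```turtle
@prefix psss: <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Project-specific redefinition of Encounter: same name, project prefix
psss:Encounter a owl:Class ;
    rdfs:subClassOf psss:PSSSConcept ;
    rdfs:label "Encounter" ;
    rdfs:comment "encounter as defined by the PSSS project" .

# Data following the PSSS schema is typed with psss:Encounter, not sphn:Encounter
<https://example.org/resource/encounter-1> a psss:Encounter .
```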
2.2.2 Modifying an existing property
Any change affecting a property from the SPHN RDF Schema must result in the creation of a new property with the project ontology IRI.
For example, the DCC has defined a material type liquid property for the concept Biosample: sphn:hasMaterialTypeLiquid, with a restricted list of possible values. The project PSSS decides to narrow down the list of possible values for this material type liquid property. The PSSS project must then define their own psss:hasMaterialTypeLiquid property. In this psss:hasMaterialTypeLiquid property, the value set will be restricted to only the values allowed by the PSSS project.
Note
Value set restrictions are encoded as owl:Restriction (see section Constraints added to properties) since version 2022.1 of the SPHN RDF Schema.
If a project would like to reuse a property in another context (meaning to describe metadata of another class), a new property must be created following the conventions defined in the section About the SPHN RDF properties.
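The declaration of the project-specific property from the example could look like the sketch below. It assumes that the property is an object property and that it sits under the project root psss:PSSSAttributeObject; the narrowed value set itself is not shown.

```turtle
@prefix psss: <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Project-specific counterpart of sphn:hasMaterialTypeLiquid
psss:hasMaterialTypeLiquid a owl:ObjectProperty ;
    rdfs:subPropertyOf psss:PSSSAttributeObject ;   # assumed project root property
    rdfs:label "has material type liquid" ;
    rdfs:comment "material type liquid of the biosample, limited to the values allowed by the PSSS project" .
```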
2.2.3 Creating a new property to an existing class
Adding a new property to an existing class can lead to two different scenarios.
1. If the property does not change the meaning of the class, the project can define their property with their prefix associated to the SPHN class as shown in the example below:
sphn:Encounter (class)
psss:hasServiceType (new property)
The project should submit the change request of adding the new property into the concept to the DCC. If the change is evaluated to be of general importance, the DCC would adapt the concept accordingly in the next release of the SPHN RDF Schema. This would result in the following:
sphn:Encounter
sphn:hasServiceType
2. If the property changes the meaning of the class and breaks compatibility, a new class must be created with the project prefix (following the recommendations from the section 2.2.1 Modifying an existing class) and the property would be defined for this new class:
psss:Encounter
psss:hasEndDate
For more guidance on knowing whether a property eventually breaks the meaning of a class or if a specific change needs the creation of a project-specific class/property, do not hesitate to contact the DCC (dcc@sib.swiss).
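The two scenarios above can be sketched in Turtle as follows. Expressing the association between property and class with rdfs:domain, the property types, and the xsd:date range are assumptions made for illustration only.

```turtle
@prefix psss: <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix sphn: <https://biomedit.ch/rdf/sphn-ontology/sphn#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Scenario 1: new project property attached to the unchanged SPHN class
psss:hasServiceType a owl:ObjectProperty ;
    rdfs:domain sphn:Encounter ;
    rdfs:label "has service type" .

# Scenario 2: the property changes the meaning of the class,
# so both the class and the property carry the project prefix
psss:Encounter a owl:Class ;
    rdfs:label "Encounter" .

psss:hasEndDate a owl:DatatypeProperty ;
    rdfs:domain psss:Encounter ;
    rdfs:range xsd:date ;   # assumed datatype
    rdfs:label "has end date" .
```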
2.3 Meaning binding to controlled vocabulary
For the meaning binding you can use any controlled vocabulary that is appropriate for your concept. If you use SNOMED CT or LOINC, the SNOMED CT Browser and the LOINC Browser are valuable tools to find appropriate SNOMED CT concepts and LOINC codes for the meaning binding. To use the LOINC Browser, you would need to create a free LOINC account. There are good practices for meaning binding in SNOMED CT. Appropriate training is provided by SNOMED International on the elearning platform. Further, please refer to the guiding principles for Controlled vocabulary. If you need help with the meaning binding, please contact the DCC (dcc@sib.swiss).
The integration of meaning binding to RDF classes is represented by owl:equivalentClass.
The example below shows that the LOINC code 8302-2 is an equivalent class of the SPHN class BodyHeight:
### https://biomedit.ch/rdf/sphn-ontology/sphn#BodyHeight
sphn:BodyHeight rdf:type owl:Class ;
owl:equivalentClass <https://loinc.org/rdf/8302-2> ;
rdfs:subClassOf sphn:Measurement ;
rdfs:comment "height of the individual" ;
rdfs:label "Body Height" .
To annotate an equivalent class through Protégé, please follow these instructions:
1. in the Class hierarchy section, select the class of interest
2. in the Description section, click on the + sign next to Equivalent To
3. in the pop-up window that appears, go to the tab Class expression editor
4. in the text field, type the label of the equivalent class (for autocomplete, press Tab)
Note
The external terminologies (e.g., SNOMED CT, LOINC, GENO and SO) must be provided in the ontology space in order to be able to find and connect the equivalent classes.
Classes composed of multiple words are better found via autocomplete when an apostrophe is entered at the beginning in the Class expression editor text field.
3. Valuesets as individuals in the RDF schema
Valuesets can be defined by the project in order to set and limit the possible values for a certain property (see section Standards and value sets).
Each possible value needs to be created as an individual in RDF (owl:NamedIndividual).
These individuals are then grouped into the same valueset, represented with a specific class.
This class is then set as being the range of the property, meaning that the individuals linked to that class are the possible values for that property.
Creating values as individuals and linking a set of values to a property requires the following steps in Protégé:
1. Create an individual for each value:
   - Select tab Individuals,
   - Click on Add individual,
   - Write the name of the individual to generate the IRI,
   - Add a label for each individual created.
2. If not done already, create a ValueSet class to group all sets of values.
3. Create a class which should be a sub-class of ValueSet. The IRI of the class should follow the convention: <DomainClassName>_<propertyName> where ‘DomainClassName’ is the Domain of the property.
4. Select the class created, then:
   - Click on the + sign next to Instances,
   - Select the individuals that are linked to this ‘valueset class’ (multiple individuals can be selected with Ctrl+Click),
   - Click OK,
   - Now all individuals of a valueset are connected to a specific valueset class.
5. The valueset class can now be added in the owl:Restriction of the class with the property allowing these values:
   - Select the class,
   - Click on the + sign next to SubClass Of,
   - Under Class expression editor write the owl:Restriction with the following pattern: property-name + some + valueset-class
   - Click OK.
For example, the class DiagnosticRadiologicExamination has the property hasMethod, which has six possible values (PET CT, CT, MRI, PET, SPECT, X-ray).
These six values are created one by one as individuals.
The class DiagnosticRadiologicExamination_method is then generated as a subclass of ValueSet. The six individuals are added as instances of the class DiagnosticRadiologicExamination_method.
The class DiagnosticRadiologicExamination_method is set as a value restriction on the class DiagnosticRadiologicExamination for the property hasMethod (as shown below in a .ttl format).
sphn:DiagnosticRadiologicExamination
    rdfs:subClassOf [ rdf:type owl:Restriction ;
        owl:onProperty sphn:hasMethod ;
        owl:someValuesFrom sphn:DiagnosticRadiologicExamination_method
    ] .
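The individuals and the valueset class referenced by this restriction could be declared along the following lines. Only two of the six values are shown, and the local names of the individuals (sphn:CT, sphn:MRI) are hypothetical and do not necessarily match the IRIs used in the released SPHN RDF Schema.

```turtle
@prefix sphn: <https://biomedit.ch/rdf/sphn-ontology/sphn#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Class grouping all valueset classes
sphn:ValueSet a owl:Class .

# Valueset class for the property hasMethod of DiagnosticRadiologicExamination
sphn:DiagnosticRadiologicExamination_method a owl:Class ;
    rdfs:subClassOf sphn:ValueSet .

# One named individual per allowed value, each an instance of the valueset class
sphn:CT a owl:NamedIndividual , sphn:DiagnosticRadiologicExamination_method ;
    rdfs:label "CT" .

sphn:MRI a owl:NamedIndividual , sphn:DiagnosticRadiologicExamination_method ;
    rdfs:label "MRI" .
```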
4. Best practices when generating the RDF
When creating a new class or a new property, following best practices increases to some extent the consistency and the readability of the schema. Here are a few recommendations:
- use Pascal case notation for classes (e.g. AdministrativeGender) and Camel case notation for data and object properties (e.g. hasEndDateTime) when creating the IRIs
- data and object properties should follow the convention given in the section About the SPHN RDF properties
- for all classes and properties, generate a label (rdfs:label) with spaces in between words for better readability (e.g. hasEndDateTime would have as label has end date time)
- for all classes and properties, create a description (rdfs:comment) that explains the meaning of the class or property in an understandable and unambiguous sentence
- choose an appropriate controlled vocabulary (meaning binding) to represent your class through the use of owl:equivalentClass (see section Controlled vocabulary for the guiding principles for the meaning binding to external terminologies, e.g., SNOMED CT, LOINC, GENO or SO).
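Applied to a project called PSSS, the naming, label and description recommendations could look like this sketch. The element names reuse the examples above under a project prefix, and the comment texts are illustrative; the meaning binding via owl:equivalentClass is left out because it requires choosing an actual external code.

```turtle
@prefix psss: <https://biomedit.ch/rdf/sphn-ontology/psss/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Class IRI in Pascal case, label with spaces, one-sentence description
psss:AdministrativeGender a owl:Class ;
    rdfs:label "Administrative Gender" ;
    rdfs:comment "gender of the individual as recorded for administrative purposes" .

# Property IRI in Camel case, label with spaces, one-sentence description
psss:hasEndDateTime a owl:DatatypeProperty ;
    rdfs:label "has end date time" ;
    rdfs:comment "date and time at which the described event ended" .
```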
5. Visualizing the project-specific schema
Once the project-specific RDF Schema is created, it can be visualized with the PyLODE-based SPHN Schema Visualization Tool (see https://git.dcc.sib.swiss/sphn-semantic-framework/sphn-ontology-documentation-visualization). The tool is used to generate human-readable HTML documents for RDF schemas. It takes given ontologies and terminologies as input, manipulates and merges them into a single preprocessed schema and then generates an HTML document.
Merged ontologies/terminologies:
SPHN schema
Project-specific schemas (optional)
SNOMED (labels only)
LOINC (labels only)
CHOP (labels only)
The HTML document is structured as follows: it starts with some general information about the schema (URI, version, etc.) and then is divided into five main sections. Each section gives detailed information about the addressed schema components. The end of the HTML document provides information about namespaces and some legends.
Classes: The list of classes defined in the schema contains the sections shown in the Table below:
| Section | Description |
|---|---|
| URI | URI |
| Description | short description about the class |
| Schema representation | image containing the class schema and its outgoing properties and metadata |
| Meaning binding (Equivalent-class) | Link to equivalent class (e.g. SNOMED CT, LOINC, GENO or SO class) |
| Parents | Link to super-classes |
| Children (Sub-classes) | Link to sub-classes |
| Property (in the domain of) | List of properties where the class is listed in the domain with given cardinalities, class or datatype information and restriction information (Yes/No) |
| Restrictions | details about the restrictions applied on properties in the context of the class (e.g. specified SNOMED codes) |
| Notes | Notes for specified properties (allowed coding system or recommended values) |
| Used in (In the range of) | List of properties where the class is listed in the range |
Object Properties: provides the list of object properties defined in the schema with their URI, description, super-properties, domain(s) and range.
Datatype Properties: provides the list of datatype properties defined in the schema with their URI, description, super-properties, domain(s) and data type.
Annotation Properties: provides the list of annotation properties with their URI and description (if provided).
Named Individuals: provides the list of named individuals with their URI and the class in which they appear.
To provide a project-specific RDF Schema to the PyLODE-based SPHN Schema Visualization Tool, follow the instructions provided in the README - User Guide. The generated HTML file can then be shared by the project members with anyone who wishes to visualize the project-specific schema in a browser.
6. Validating the project-specific schema
Validation is possible with the SHACLer tool.
7. Reporting back to DCC
The DCC welcomes any feedback to the SPHN Dataset and to the SPHN RDF Schema to improve these specifications. If you have any specific change requests to the SPHN Dataset, or to the SPHN RDF Schema, please submit them by email to dcc@sib.swiss. For any change requests to the SPHN Dataset, please include the concept(s) or the composedOf(s), which are affected by the change request, the version of the Dataset, a description of the rationale behind the change request, and your proposal including suggested changes in a table structure following the SPHN Dataset design.