SPHN RDF Schema
Scope of the SPHN RDF Schema
The SPHN RDF Schema provides an interoperable framework for conveying information and storing health-related data from SPHN-related projects using Semantic Web technologies including RDF (see background section on RDF). The schema facilitates the integration of existing external resources within the SPHN Semantic Interoperability Framework. The SPHN RDF Schema is based on the SPHN Dataset (available at: https://sphn.ch/document/sphn-dataset/) and transforms its elements into a formal structure (see Figure 1).
This documentation provides an overview of the content of the SPHN RDF Schema and applies to version 2023.2 of the schema. A visual representation of the schema can be found at: https://biomedit.ch/rdf/sphn-ontology/sphn.
Figure 1. Core elements defined in the SPHN dataset and their translation into RDF.
Concepts of the SPHN dataset are translated into classes (owl:Class), and concepts’ compositions become either object properties (owl:ObjectProperty) or data properties (owl:DatatypeProperty), depending on the type of the value.
Table 1 summarizes these properties.
Value set: a set of values valid for a concept, as provided in the SPHN dataset. Value sets defined in the dataset are represented as individuals.
Meaning binding: the binding to external terminologies (e.g., SNOMED CT, LOINC) provided in the SPHN dataset. It is represented as one or more equivalent classes of an SPHN class (owl:equivalentClass).
Meaning bindings provided in the dataset for ‘composedOf’ are not taken into account in the RDF schema because, semantically, a property in RDF (the ‘composedOf’ statement from the dataset) cannot be equivalent to a class (the meaning binding provided in the dataset).
Technical specification of the schema
SPHN RDF namespace
The namespace of the SPHN RDF Schema may be divided into two parts:
The SPHN ontology IRI: https://biomedit.ch/rdf/sphn-ontology/sphn. The ontology IRI remains fixed and must be defined in the SPHN RDF Schema. The ontology IRI can be considered as the “base prefix” and will be used by both data providers (to annotate data) and data users (to query for the relevant classes/properties).
The SPHN versioned IRI: https://biomedit.ch/rdf/sphn-ontology/sphn/2023/2. It is provided by the DCC with each published release of the SPHN RDF Schema and makes it possible to distinguish between versions of the schema. Projects use the versioned IRI to refer to a specific version of the SPHN RDF Schema; it is therefore included in the header of all datasets generated using that schema version.
The SPHN RDF Schema has dereferenceable links, meaning that its content (classes, properties, individuals) is accessible and resolvable on the web by following the link.
Versioning of the schema
Each release of the SPHN RDF Schema has a version associated with it.
The version, indicated by the tag
owl:versionIRI in the SPHN header information,
contains the year of release and the release number for that year
(e.g. https://biomedit.ch/rdf/sphn-ontology/sphn/2023/2 corresponds to the second release
of the SPHN RDF Schema in 2023). The published ontology IRI will always point to the
latest version IRI of the SPHN RDF Schema.
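The versioning pattern described above (ontology IRI, then year, then release number within that year) can be sketched in Python. This is only an illustration: the function names are hypothetical and are not part of any SPHN tooling, and the sketch assumes a four-digit year.

```python
import re
from typing import Tuple

# Fixed ontology IRI, as given in the namespace section.
ONTOLOGY_IRI = "https://biomedit.ch/rdf/sphn-ontology/sphn"


def versioned_iri(year: int, release: int) -> str:
    """Build a versioned IRI from a release year and a release number."""
    return f"{ONTOLOGY_IRI}/{year}/{release}"


def parse_versioned_iri(iri: str) -> Tuple[int, int]:
    """Split a versioned IRI into (year, release number)."""
    match = re.fullmatch(re.escape(ONTOLOGY_IRI) + r"/(\d{4})/(\d+)", iri)
    if match is None:
        raise ValueError(f"not an SPHN versioned IRI: {iri}")
    return int(match.group(1)), int(match.group(2))


print(parse_versioned_iri("https://biomedit.ch/rdf/sphn-ontology/sphn/2023/2"))  # (2023, 2)
```

A check of this kind could be used to confirm which schema version a dataset header refers to before processing it.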
Official releases of the SPHN RDF Schema are:
SPHN header information
The header of the SPHN RDF Schema contains the following information:
The title of the schema (dc:title)
A short description of the content of the schema (dc:description)
The license defining the terms and conditions of usage of the schema (dcterms:license)
The version of the schema (owl:versionIRI)
The external terminologies to be imported (owl:imports)
The previously released version of the schema (owl:priorVersion)
Below is the header information of the Turtle file:
<https://biomedit.ch/rdf/sphn-ontology/sphn> a owl:Ontology ;
    dc:description "The RDF schema describing concepts defined in the official SPHN dataset" ;
    dc:rights "© Copyright 2023, Personalized Health Informatics Group (PHI), SIB Swiss Institute of Bioinformatics" ;
    dc:title "The SPHN RDF Schema" ;
    dcterms:license <https://creativecommons.org/licenses/by/4.0/> ;
    owl:imports <http://snomed.info/sct/900000000000207008/version/20221231>,
        <https://biomedit.ch/rdf/sphn-resource/atc/2023/1>,
        <https://biomedit.ch/rdf/sphn-resource/chop/2023/4>,
        sphn-geno:20220810,
        sphn-hgnc:20221107,
        <https://biomedit.ch/rdf/sphn-resource/icd-10-gm/2023/3>,
        <https://biomedit.ch/rdf/sphn-resource/loinc/2.74/1>,
        sphn-so:20211122,
        <https://biomedit.ch/rdf/sphn-resource/ucum/2023/1> ;
    owl:priorVersion <https://biomedit.ch/rdf/sphn-ontology/sphn/2023/1> ;
    owl:versionIRI <https://biomedit.ch/rdf/sphn-ontology/sphn/2023/2> .
The schema header provides information about the version of the external terminologies used in the SPHN RDF Schema 2023.2.
The IRI of SNOMED CT is not within the biomedit.ch domain since the Turtle file is generated automatically using the Snomed OWL Toolkit.
About the SPHN RDF classes
Naming convention for classes
The classes defined in the SPHN RDF Schema come from the concepts defined in the SPHN dataset (column ‘general concept name’): one concept corresponds to one class. The unique identifier of a class is the concatenation of the words forming the concept name, written in UpperCamelCase. For example, the concept Radiotherapy Procedure is defined as the class sphn:RadiotherapyProcedure in RDF. A class necessarily contains the following information:
rdfs:label, corresponding to the class name with a space between words for better readability,
rdfs:comment, containing the description of the class.
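The class naming convention can be sketched as a small Python helper. This is a minimal illustration under stated assumptions: the function names are hypothetical, words are assumed to be separated by whitespace only, and capitals already present inside a word (as in an acronym) are kept as they are.

```python
def class_local_name(concept_name: str) -> str:
    """Concatenate the words of a concept name in UpperCamelCase."""
    words = concept_name.split()
    # Upper-case only the first letter so that acronyms keep their capitals.
    return "".join(word[:1].upper() + word[1:] for word in words)


def class_label(concept_name: str) -> str:
    """Readable label: the same words, separated by single spaces."""
    return " ".join(concept_name.split())


print(class_local_name("Radiotherapy Procedure"))  # RadiotherapyProcedure
print(class_label("Radiotherapy  Procedure"))      # Radiotherapy Procedure
```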
A class may have other annotations which are detailed in the section Annotations for enriching knowledge content.
In addition, a few classes have a meaning binding associated with a SNOMED CT, LOINC, GENO or SO concept or code. This meaning binding is represented in the schema with the annotation owl:equivalentClass.
Additional class definitions in the SPHN RDF Schema
Besides the concepts defined in the SPHN Dataset, four classes have been generated to represent specific types of metadata in RDF:
Terminology class, which groups classes and individuals of external resources (e.g. SNOMED CT, ATC, CHOP) used within the SPHN context so that they can be referred to as possible values (e.g. the SNOMED CT code 703117000 is a possible value for the SPHN gender property), equivalent classes (e.g. the LOINC code 8867-4 is an equivalent class of the SPHN HeartRate class), or even individuals (e.g. UCUM units are individuals that can be directly referred to as property values).
Figure 2. SPHN RDF tree.
Terminology is the parent class of ATC, CHOP, ICD-10-GM, LOINC, SNOMED CT and UCUM
that are imported as external terminologies. Note in the figure above that UCUM does not have an arrow
to further expand sub-classes because UCUM elements are defined as individuals and not as classes.
ValueSet class, which groups classes defining specific SPHN instances of possible values to use for certain object properties. The convention used for naming a ValueSet is <Class>_<composedOf general concept name>. The exception is the Comparator value set class, whose name was simplified in 2023.2 since the same set of values is used in more than one context.
DataRelease class, which stamps the extraction date of a dataset together with information related to the version of the SPHN or project dataset (see the section on versioning of the data for more information).
Deprecated class, which groups classes from a previous release of the schema that are no longer used in the current version.
Some hierarchies are defined in the SPHN Dataset in the column
parent and interpreted in the RDF schema.
For instance, the class
Diagnosis is a parent of
The rules of property inheritance are applied: all properties annotated on a parent class
are automatically inherited by its child classes. Therefore, in the
inherited properties, only the parent class is explicitly stated.
The only exception is the
sphn:hasCode property of the
Diagnosis which is specified as
sphn:hasMorphologyCode for the
About the SPHN RDF properties
Concepts in the SPHN Dataset contain compositions (i.e. information about a concept) that are translated in the SPHN RDF Schema as object properties (relationships between individuals) or data properties (relationships between an individual and a literal value).
The IRIs of properties are named after the general concept name column in the SPHN Dataset.
The convention used is to define properties with
has + <general concept name> as showcased in
the tables provided in the next subsections (Table 2 for object properties and Table 3 for data properties).
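The has + <general concept name> convention can be sketched in Python as follows. The helper name is hypothetical, and the lower-case input names are assumed forms of the dataset's general concept names; the resulting identifiers (hasLabTest, hasBodySite, hasMorphologyCode) are the ones used elsewhere in this documentation.

```python
def property_local_name(general_concept_name: str) -> str:
    """Prefix 'has' to the UpperCamelCase form of a general concept name."""
    words = general_concept_name.split()
    return "has" + "".join(word[:1].upper() + word[1:] for word in words)


for name in ("lab test", "body site", "morphology code"):
    print(name, "->", property_local_name(name))
```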
Object properties in the SPHN RDF Schema can define relationships between:
Resources from the clinical data (a given instance of
BodySite is connected to a patient’s
Resources and elements collected from external ontologies (the
FOPH Diagnosis is identified by a specific
Four additional object properties are generated in the RDF schema; they are neither represented in nor required by the SPHN Dataset:
hasSubjectPseudoIdentifier, connecting a piece of information to the patient identifier,
hasDataProviderInstitute, connecting a piece of information to the data provider,
hasAdministrativeCase, connecting a piece of information to the administrative case,
hasExtractionDateTime, providing information about the time of extraction of the data.
Data properties in the SPHN RDF Schema point to literal values of given SPHN concepts.
Constraints added to properties
owl:Restriction is a particular type of class description that places value
and cardinality constraints on a property.
There can be multiple constraints added to narrow down and/or enrich a concept.
Value constraints are meant to restrict the possible types of values of a property.
Cardinality constraints restrict the number of instances of a property.
The rules for inheritance of restrictions are applied: any restriction annotated on a property is automatically inherited by its subproperties. That is, if a subclass (inheriting the parent class’ property) has a subproperty of that property, the restrictions on this subproperty must stay within the restrictions on the parent property. For example, if the parent property has cardinality 0:1, the subproperty may have cardinality 0:0, 0:1 or 1:1.
Each instance of a subproperty is also an instance of the parent property.
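The rule that a subproperty's cardinality must stay within the parent property's cardinality can be checked with a short Python sketch. The representation is an assumption made for illustration: a cardinality is a (min, max) pair, and max is None when the number of values is unbounded.

```python
from typing import Optional, Tuple

# (min, max); max is None when there is no upper bound.
Cardinality = Tuple[int, Optional[int]]


def is_valid_refinement(parent: Cardinality, child: Cardinality) -> bool:
    """Return True if the child cardinality lies within the parent cardinality."""
    parent_min, parent_max = parent
    child_min, child_max = child
    # The child range itself must be well-formed.
    if child_max is not None and child_min > child_max:
        return False
    # The child may not allow fewer values than the parent requires.
    if child_min < parent_min:
        return False
    # An unbounded parent accepts any upper bound.
    if parent_max is None:
        return True
    # A bounded parent needs a bounded child that does not exceed it.
    return child_max is not None and child_max <= parent_max


parent = (0, 1)
for child in [(0, 0), (0, 1), (1, 1), (0, 2), (0, None)]:
    print(child, is_valid_refinement(parent, child))
```

With the parent cardinality 0:1, only 0:0, 0:1 and 1:1 are accepted, which matches the example in the text.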
The range of a property usually encodes the type of value allowed for this property. However, in some cases, it was not possible to straightforwardly encode additional information needed to account for the full context. In these cases, value constraints are used to ensure a clean modeling of the data.
For instance, the Code of a Unit given to an OxygenSaturation must be %. However, it is not possible to state on the property hasCode itself that it can only have % as a value, since hasCode can be used in other contexts. To solve this issue, from version 2022.1 of the SPHN RDF Schema, value constraints (owl:Restriction) have been integrated in these particular cases.
Coming back to the example of the oxygen saturation’s unit constraint, the following constraint (in .ttl format) can be built:
sphn:OxygenSaturation rdf:type owl:Class ;
    rdfs:subClassOf [ rdf:type owl:Restriction ;
        owl:onProperty sphn:hasQuantity ;
        owl:someValuesFrom [ rdf:type owl:Restriction ;
            owl:onProperty sphn:hasUnit ;
            owl:someValuesFrom [ rdf:type owl:Restriction ;
                owl:onProperty sphn:hasCode ;
                owl:hasValue ucum:percent ] ] ] .
The code above can be read line by line as:
sphn:OxygenSaturation is a class.
It has the following restriction statement: when sphn:hasQuantity is used for describing the sphn:OxygenSaturation, only some specific values are allowed, which is in reality a second constraint statement.
A constraint is applied to the property sphn:hasUnit, which again has only specific values allowed, leading to a third constraint statement.
A constraint is now applied to the property sphn:hasCode, which allows only the following value: ucum:percent.
In other words, when annotating the quantity of an OxygenSaturation, the Code representing the Unit must be the instance ucum:percent coming from the UCUM notation (see Figure 3). No other values are allowed.
The constraints are represented in a nested structure that follows the properties’
path as they are encoded in the SPHN RDF Schema.
Figure 3. Example of property path from OxygenSaturation to the possible code of the Unit.
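The nested constraint can be read as a property path (sphn:hasQuantity, then sphn:hasUnit, then sphn:hasCode) ending in a required value. The Python sketch below follows that path through a record. It is a simplification for illustration only: the nested-dictionary representation with a single value per property is an assumption, the function names are hypothetical, and ucum:mmHg is a made-up identifier used only as a counter-example.

```python
from typing import Any, Optional, Sequence

# Property path and required value, taken from the oxygen saturation example.
PATH = ("sphn:hasQuantity", "sphn:hasUnit", "sphn:hasCode")
REQUIRED_VALUE = "ucum:percent"


def follow_path(resource: Any, path: Sequence[str]) -> Optional[Any]:
    """Walk a nested dict along the property path; None if a step is missing."""
    node = resource
    for prop in path:
        if not isinstance(node, dict) or prop not in node:
            return None
        node = node[prop]
    return node


def satisfies_unit_constraint(oxygen_saturation: dict) -> bool:
    """True if the unit code reached through the path is the required one."""
    return follow_path(oxygen_saturation, PATH) == REQUIRED_VALUE


valid = {"sphn:hasQuantity": {"sphn:hasUnit": {"sphn:hasCode": "ucum:percent"}}}
invalid = {"sphn:hasQuantity": {"sphn:hasUnit": {"sphn:hasCode": "ucum:mmHg"}}}
print(satisfies_unit_constraint(valid))    # True
print(satisfies_unit_constraint(invalid))  # False
```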
Cardinality constraints have been implemented to restrict the number of values an instance of a class may have for specific properties. The owl:minCardinality and owl:maxCardinality notation has been used. For instance, a Lab Result may have zero or one Lab Test connected:
owl:intersectionOf (
    [ rdf:type owl:Restriction ;
      owl:onProperty sphn:hasLabTest ;
      owl:minCardinality "0"^^xsd:nonNegativeInteger ]
    [ rdf:type owl:Restriction ;
      owl:onProperty sphn:hasLabTest ;
      owl:maxCardinality "1"^^xsd:nonNegativeInteger ]
) ;
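What such a pair of minimum and maximum cardinality restrictions means for data can be sketched in Python as a simple count check. The function name and the list-of-values representation are assumptions for illustration; the identifiers ex:test1 and ex:test2 are hypothetical.

```python
from typing import Optional, Sequence


def within_cardinality(values: Sequence[str],
                       min_count: int = 0,
                       max_count: Optional[int] = None) -> bool:
    """True if the number of values lies between min_count and max_count."""
    count = len(values)
    return count >= min_count and (max_count is None or count <= max_count)


# A Lab Result may be connected to zero or one Lab Test (0:1).
print(within_cardinality([], 0, 1))                        # True
print(within_cardinality(["ex:test1"], 0, 1))              # True
print(within_cardinality(["ex:test1", "ex:test2"], 0, 1))  # False
```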
Both value and cardinality constraints are visualized in the PyLODE documentation in the definitions of the classes (https://biomedit.ch/rdf/sphn-ontology/sphn/2023/2).
Figure 4. Example of cardinality constraints shown in the SPHN RDF visualization for Lab Result (with PyLODE).
Annotations for enriching knowledge content
In some classes, there is a need to annotate specific information provided in the SPHN Dataset. These additional annotations relate to exceptions in the way property restrictions should be understood. In addition, a new SPHN annotation was created to ease connecting elements from one version of the SPHN RDF Schema to another.
skos:definition for subclass values not allowed
SNOMED CT provides a hierarchical structure of its codes.
When values are given as SNOMED CT codes, it implicitly means that,
by default, any children of the provided code are valid results when taking
into account the inheritance rules defined in the Semantic Web: every member of a child class
is also a member of its parent class. The child class is simply a more specific element of the parent class.
Sometimes SPHN makes restrictions and only allows for explicitly stated codes
to be valid in a defined value set.
This information is encoded as a
skos:definition with the following text
provided for each property:
sphn:<property> subclasses not allowed.
For instance, values for defining the administrative gender of a patient are strictly limited
to three values: identifies as male gender,
identifies as female gender, and
other. This value set is provided as an owl:Restriction:
[ rdf:type owl:Restriction ; owl:onProperty sphn:hasCode ; owl:someValuesFrom [ rdf:type owl:Class ; owl:unionOf ( snomed:446151000124109 snomed:446141000124107 snomed:74964007 ) ] ]
In addition, the following annotation, given as information in the schema, states that only these values are permitted and no children of these codes are allowed:
sphn:AdministrativeGender skos:definition "sphn:hasCode subclasses not allowed" .
This information is used in the SHACLer for strictly limiting the set of values of a
property and is provided in the PyLODE visualization of the SPHN RDF Schema to indicate
in the restriction field that child terms are not allowed. Here is the same
hasCode property example represented in PyLODE:
Figure 5. Example of the interpretation of a skos:definition constraint in the SPHN RDF visualization (with PyLODE). The skos:definition indicates that the child terms of the given value set are not allowed to be provided.
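The effect of the "subclasses not allowed" annotation on validation can be sketched in Python. This is an illustration under stated assumptions, not the SHACLer's implementation: the hierarchy is reduced to a single-parent lookup table (SNOMED CT concepts can have several parents), and ex:ChildOfOther is a hypothetical child code invented for the example.

```python
from typing import Dict, Set

# Value set from the administrative gender example.
VALUE_SET = {
    "snomed:446151000124109",
    "snomed:446141000124107",
    "snomed:74964007",
}

# Hypothetical child -> parent table; ex:ChildOfOther does not exist in SNOMED CT.
PARENT_OF = {"ex:ChildOfOther": "snomed:74964007"}


def is_allowed(code: str, value_set: Set[str], parent_of: Dict[str, str],
               subclasses_allowed: bool) -> bool:
    """Check a code against a value set, optionally accepting descendants."""
    if code in value_set:
        return True
    if not subclasses_allowed:
        return False
    # Walk up the hierarchy until a listed code is found or the chain ends.
    visited = set()
    current = code
    while current in parent_of and current not in visited:
        visited.add(current)
        current = parent_of[current]
        if current in value_set:
            return True
    return False


print(is_allowed("ex:ChildOfOther", VALUE_SET, PARENT_OF, subclasses_allowed=True))   # True
print(is_allowed("ex:ChildOfOther", VALUE_SET, PARENT_OF, subclasses_allowed=False))  # False
```

With the default Semantic Web reading the child code is accepted; once subclasses are not allowed, only the three explicitly listed codes pass.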
skos:note for permitted standards and extendable value sets
The SPHN dataset in some cases defines the external standards (or terminologies) that can be used
for a given property in a certain context. This information is encoded in the schema as a skos:note.
For instance, the code of a
ProblemCondition may come from ICPC or another standard.
This information is depicted as follows in .ttl:
sphn:ProblemCondition skos:note "sphn:hasCode allowed coding system: ICPC or other" .
The SPHN dataset in some cases also defines extendable value sets.
Extendable value sets establish properties with recommended or example values,
where the data provider has the option of extending them with other values as they see fit.
This knowledge is represented as follows in RDF: the owl:Restriction given for this property does not list the single values provided by the dataset but points to a higher-level class in the hierarchy of the coding system, enabling other values to be used. The values recommended in the SPHN dataset are provided with the annotation skos:note.
For instance, the list of codes about body sites where the oxygen saturation can be measured
is an extendable value set in SPHN. The owl:Restriction only allows Code values that are children of
snomed:123037004 | body structure (body structure) |,
as represented below in .ttl:
[ rdf:type owl:Restriction ; owl:onProperty sphn:hasCode ; owl:someValuesFrom <http://snomed.info/id/123037004> ]
The specific values recommended in the SPHN dataset to be used are:
snomed:29707007 | Toe structure (body structure) |,
snomed:7569003 | Finger structure (body structure) |,
snomed:48800003 | Ear lobule structure (body structure) |.
These values are indicated in the following note annotated on the sphn:OxygenSaturation class:
sphn:OxygenSaturation skos:note "sphn:hasBodySite/sphn:hasCode recommended values [snomed:29707007; snomed:7569003; snomed:48800003]" .
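Because the recommended values are stored as free text in the note, a consumer has to parse them. The Python sketch below extracts the property path and the list of codes from a note in the format shown above. The regular expression and the function name are assumptions based only on that one example; other notes may be phrased differently.

```python
import re
from typing import List, Optional, Tuple

# Matches: "<property path> recommended values [<code>; <code>; ...]"
NOTE_PATTERN = re.compile(r"^(?P<path>\S+) recommended values \[(?P<values>[^\]]*)\]$")


def parse_recommended_values(note: str) -> Optional[Tuple[List[str], List[str]]]:
    """Return (property path, recommended codes), or None for other notes."""
    match = NOTE_PATTERN.match(note.strip())
    if match is None:
        return None
    path = match.group("path").split("/")
    values = [v.strip() for v in match.group("values").split(";") if v.strip()]
    return path, values


note = ("sphn:hasBodySite/sphn:hasCode recommended values "
        "[snomed:29707007; snomed:7569003; snomed:48800003]")
print(parse_recommended_values(note))
```

Notes of a different kind, such as the "allowed coding system" note, do not match the pattern and return None.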
sphn:replaces for reference of previous classes and properties
Changes in classes or properties between different versions of the SPHN RDF Schema are tracked directly in RDF with the sphn:replaces annotation. For instance, in SPHN 2022.1, the property sphn:hasActiveIngredient replaced the property sphn:hasDrugActiveIngredientSubstance defined in SPHN 2021.2. The following statement is written in the SPHN 2022.1 RDF schema:
sphn:hasActiveIngredient rdf:type owl:ObjectProperty ; sphn:replaces sphn:hasDrugActiveIngredientSubstance .
This annotation is currently used in the migration path tool to generate a CSV file of the differences between two versions of SPHN RDF Schema.
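How sphn:replaces statements could be turned into a CSV of differences is sketched below in Python. This is not the migration path tool itself: its actual output format is not described here, so the column names (old_term, new_term), the function name and the input representation (pairs of new and replaced terms) are all assumptions made for illustration.

```python
import csv
import io
from typing import Iterable, Tuple


def replacements_to_csv(replacements: Iterable[Tuple[str, str]]) -> str:
    """Write (new term, replaced term) pairs as CSV rows of old_term,new_term."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["old_term", "new_term"])
    for new_term, old_term in sorted(replacements):
        writer.writerow([old_term, new_term])
    return buffer.getvalue()


# One pair per "new sphn:replaces old" statement.
pairs = [("sphn:hasActiveIngredient", "sphn:hasDrugActiveIngredientSubstance")]
print(replacements_to_csv(pairs))
```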
After the release of each new version of the SPHN Dataset, the corresponding SPHN RDF Schema is reviewed by the SPHN Data Coordination Center (DCC) in close collaboration with the IT experts of the Swiss University Hospitals and revised as necessary.
Availability and usage rights
The SPHN RDF Schema is available at https://git.dcc.sib.swiss/sphn-semantic-framework/sphn-ontology/ and can be visualized at https://biomedit.ch/rdf/sphn-ontology/sphn.
External terminologies are accessible through the Terminology Service.
If you need further information, please contact the SPHN Data Coordination Center (DCC) at firstname.lastname@example.org.
The SPHN RDF Schema is under the CC BY 4.0 License.
Touré, V., Krauss, P., Gnodtke, K., Buchhorn, J., Unni, D., Horki, P., Raisaro, J.L., Kalt, K., Teixeira, D., Crameri, K. and Österle, S. (2023). FAIRification of health-related data using semantic web technologies in the Swiss Personalized Health Network. Scientific Data, 10(1), p.127 (doi: 10.1038/s41597-023-02028-y)