Introduction

Note

To find out more about the SPHN Semantic Strategy and the SPHN Ecosystem, watch the seminars on Linked data in SPHN and SPHN Data Ecosystem for FAIR Data.

SPHN Semantic Strategy

The Swiss Personalised Health Network (SPHN) is an initiative of the Swiss government that aims to provide a framework for exchanging health-related data in an interoperable way for research. Except for some coding systems used for patient billing and accounting, there are currently only a few nationally adopted and implemented standards for medical information in hospitals. To enable the use of health-related data from clinical routine and other sources for research, SPHN has developed a semantic interoperability strategy [Gaudet-Blavignac et al. 2021]. The resulting framework is based on the following three-pillar strategy:

  • Pillar 1: Semantic representation

  • Pillar 2: Data transport and storage

  • Pillar 3: Use cases

DCC semantic interoperability

Figure 1. Semantic interoperability strategy of SPHN.

The Data Problem and Stakeholder Roles

One of the main aims of SPHN is to give biomedical researchers access to the variety of data required for personalized health research. Data is available from various sources and in multiple formats. For instance, health-related data is produced and collected in hospitals, genomics data is generated in sequencing facilities, and citizen data is collected via mobile devices. Because these sources use different formats, it is difficult, if not impossible, for researchers to make proper use of the data in a research project. SPHN is therefore a collaborative effort that includes both data providers (hospitals, health-care providers, etc.) and researchers (data users), who produce and reuse data in a coordinated manner. In this documentation, the following stakeholder roles are defined:

Data Provider (alias ‘data producer’):

  • Clinical Data Manager (in a hospital): a person who maintains data (typically in a Clinical Data Warehouse) and makes it available to Researchers for further use.

Members of a scientific project (alias ‘data consumers’):

  • Project Data Manager: technical expert who prepares or extends data for Researchers and specific scientific projects.

  • Researcher (user): a biomedical researcher who needs to access and analyze biomedical data.

  • Project Leader: responsible for a specific research project.

The collaborative effort between data producers (typically in hospitals) and research projects is expressed in the SPHN Data Ecosystem.

SPHN Data Ecosystem for FAIR Data

SPHN has promoted the development of the SPHN Ecosystem [Österle et al. 2021], which brings together multiple components to allow the exchange and reuse of data related to humans:

  • The basic principle is that a Data Provider produces data in a format that Researchers can “understand” and use. For this purpose, the Data Coordination Center (DCC) defined the SPHN Dataset, which presents a high-level “data model” and describes the meaning of the data (semantics). For example, a Data Provider may need to provide data on patients who have an allergy. In more detail, the SPHN Dataset semantically defines medical and health-related concepts (terms) used in health research in Switzerland (Pillar 1). Note that “dataset” here refers to the metadata of the actual health data to be used, i.e. attributes or field names defined in a clinical health record.

  • The high-level “data model” and metadata specified in the SPHN Dataset are then represented in a common format, namely RDF (Resource Description Framework) (Pillar 2). The result is the SPHN RDF schema, which indicates the concepts and rules to follow when generating structured clinical datasets that adhere to the FAIR principles.

  • To facilitate the use and integration of external (national and international) standard terminologies and classifications, a Terminology Service [Krauss et al. 2021] has been put in place to automatically transform health-related data into RDF formats and make them accessible to both data providers and data users (researchers), who need them at different steps of the presented Ecosystem.

  • Schema and semantic extension: Scientific projects can extend the semantics of the SPHN schema by adding project-related information. This extension process is facilitated by a Template RDF schema, which can be used as a starting point to extend the SPHN RDF schema into a project-specific RDF schema. The Project Data Manager of a scientific project can then send the project-specific RDF schema to the data providers, who integrate this new schema into their pipelines for transforming clinical data warehouse data and generate RDF data files that comply with the given RDF schema.

  • Quality assurance: Any new data generated by a Clinical Data Manager needs to be checked for quality and, in particular, for whether it applies the corresponding schema correctly. To this end, data can be validated with the quality assurance framework, which is mainly composed of the SHACLer and the SPARQLer tools.

  • Data reuse: Once data has been validated and has passed the quality assurance checks, Researchers can explore and analyze it as needed.

SPHN Ecosystem

Figure 2. Simplified overview of SPHN Ecosystem.
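
The idea behind the Ecosystem steps above (data expressed as statements about concepts, a schema that states what each concept must carry, and a validation step before reuse) can be illustrated with a small toy sketch. This is an illustration only and not how the SPHN tooling works: real SPHN data is expressed in RDF and validated with the SHACLer and SPARQLer tools. All identifiers below (the `ex:` prefix, `ex:Allergy`, the property names, and the `validate` function) are hypothetical and are not taken from the SPHN RDF schema.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object statement, the basic unit of RDF-style data."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers for illustration; NOT part of the SPHN RDF schema.
RDF_TYPE = "rdf:type"
ALLERGY = "ex:Allergy"

# Toy "schema": for each concept, the properties every instance must provide.
REQUIRED_PROPERTIES = {
    ALLERGY: {"ex:hasSubjectPseudoIdentifier", "ex:hasSubstanceCode"},
}


def validate(triples: list[Triple]) -> list[str]:
    """Return one message per required property missing from a typed instance."""
    # Collect the predicates used by each subject.
    predicates_by_subject: dict[str, set[str]] = {}
    for t in triples:
        predicates_by_subject.setdefault(t.subject, set()).add(t.predicate)

    problems = []
    for t in triples:
        # Only instances whose type is covered by the toy schema are checked.
        if t.predicate == RDF_TYPE and t.obj in REQUIRED_PROPERTIES:
            missing = REQUIRED_PROPERTIES[t.obj] - predicates_by_subject[t.subject]
            for prop in sorted(missing):
                problems.append(f"{t.subject} is missing {prop}")
    return problems


# Two allergy records; the second lacks the substance code.
data = [
    Triple("ex:allergy1", RDF_TYPE, ALLERGY),
    Triple("ex:allergy1", "ex:hasSubjectPseudoIdentifier", "ex:patient42"),
    Triple("ex:allergy1", "ex:hasSubstanceCode", "ex:peanut"),
    Triple("ex:allergy2", RDF_TYPE, ALLERGY),
    Triple("ex:allergy2", "ex:hasSubjectPseudoIdentifier", "ex:patient7"),
]

for message in validate(data):
    print(message)
```

In this sketch, the second record is reported because it does not provide the substance code required by the toy schema. In the actual Ecosystem, constraints of this kind would be expressed against the project-specific RDF schema rather than in a Python dictionary.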