HOCH implementation of SPHN
HOCH (Health Ostschweiz) has implemented a modular SPHN data pipeline that builds on an existing hospital integration infrastructure and a dedicated staging environment, deliberately abstracting from individual source systems to ensure long-term maintainability and publication suitability. Operational data from key clinical systems is consolidated via an integration engine into a central raw staging database on Microsoft SQL Server. From there, dbt (data build tool) populates harmonised staging tables and SPHN-aligned delivery tables. For each SPHN concept, the corresponding delivery table is exported as a CSV file and ingested via the SPHN Connector CSV API, which generates RDF for the SPHN Minimal Dataset. The resulting RDF is retrieved via the Connector API and transferred to the BioMedIT Network using Sett.
Figure 1: End-to-end SPHN data flow at HOCH. From hospital source systems via the integration layer and dbt-based transformations to RDF generation and secure transfer to the BioMedIT network.
The overall design separates data acquisition, transformation, terminology mapping, de-identification and delivery into clearly defined layers. Source extraction is implemented in a modular way, allowing individual domains to be re-pointed to alternative or more suitable source systems over time without requiring changes in downstream transformations or the SPHN Connector configuration. This supports iterative development of mappings and concepts, transparent quality control and the ability to gradually extend coverage while maintaining a stable production pipeline for SPHN submissions.
1. Development and Deployment Lifecycle
The SPHN implementation at HOCH follows a staged lifecycle with separate test and production environments. All SPHN-related configuration and mapping changes are validated end-to-end in the test environment before being promoted to production, ensuring stable and reproducible official data deliveries while keeping operational details minimal.
2. Data Extraction and Population Management
Data extraction is implemented using an existing integration engine, which acts as the central hub for SPHN-related data flows. Source data from clinical, administrative and laboratory systems is extracted and written into a raw staging database. Raw staging tables remain largely source-oriented, but core identifiers, timestamps and organisational units are normalised. Population management is cohort-driven: for each SPHN project, a cohort definition specifies the set of eligible patients and cases that fulfil consent and inclusion criteria. All downstream transformations and exports are restricted to this cohort.
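The cohort restriction can be illustrated with a minimal Python sketch. The production pipeline applies this logic in the database layer. The field names (patient_id, case_id) and the sample rows here are hypothetical and serve only to show the filtering principle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortEntry:
    """One eligible patient/case pair of a project cohort (hypothetical structure)."""
    patient_id: str
    case_id: str


def restrict_to_cohort(rows, cohort):
    """Keep only rows whose (patient_id, case_id) pair belongs to the cohort."""
    allowed = {(entry.patient_id, entry.case_id) for entry in cohort}
    return [row for row in rows if (row["patient_id"], row["case_id"]) in allowed]


# Illustrative data only.
cohort = [CohortEntry("P1", "C10"), CohortEntry("P2", "C20")]
lab_rows = [
    {"patient_id": "P1", "case_id": "C10", "test": "hemoglobin"},
    {"patient_id": "P1", "case_id": "C11", "test": "sodium"},   # case not in cohort
    {"patient_id": "P3", "case_id": "C30", "test": "glucose"},  # patient not in cohort
]
restricted = restrict_to_cohort(lab_rows, cohort)
```

Applying the filter once, upstream of all concept-specific logic, means that no downstream model has to re-implement consent or inclusion checks.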
3. Data Processing and Transformation
Figure 2: The Medallion Architecture. The overall transformation design follows the Medallion Architecture, separating source-conformant raw staging, harmonised staging and SPHN-aligned delivery layers.
The transformation of source-oriented data into the SPHN target schema follows a multi-layer architecture implemented with dbt:
A raw staging layer preserving source semantics.
A harmonised staging layer aligning identifiers, timestamps and structures across domains.
A delivery layer aligned to the SPHN Connector’s per-concept CSV specifications.
This layered approach enables modular development, testing and reuse of shared components across SPHN concepts.
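The three layers can be sketched as a chain of small functions. This is an illustration in Python, not the dbt models used at HOCH. All column names (PatNr, Datum, Analyse, Wert) and the delivery layout are assumptions made for this example.

```python
from datetime import datetime


def harmonise(raw_row):
    """Harmonised staging: align identifier format and timestamp representation."""
    return {
        "patient_id": raw_row["PatNr"].strip().upper(),
        # Assumed source format: day.month.year hour:minute
        "measured_at": datetime.strptime(raw_row["Datum"], "%d.%m.%Y %H:%M").isoformat(),
        "local_code": raw_row["Analyse"],
        "value": raw_row["Wert"],
    }


def to_delivery(harmonised_row):
    """Delivery layer: select and rename columns for a hypothetical per-concept layout."""
    return {
        "patient_id": harmonised_row["patient_id"],
        "datetime": harmonised_row["measured_at"],
        "code": harmonised_row["local_code"],
        "value": harmonised_row["value"],
    }


# A raw staging row that still carries source semantics (illustrative values).
raw = {"PatNr": " p001 ", "Datum": "03.02.2024 14:30", "Analyse": "HB", "Wert": "13.5"}
delivery = to_delivery(harmonise(raw))
```

Because each layer only depends on the output of the previous one, a change in a source system affects only the harmonisation step, not the delivery layout.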
3.1 De-identification
De-identification is applied before data is made available to the SPHN Connector. Identifiers are pseudonymised deterministically per project, and additional measures are applied where required to reduce identifiability. De-identification logic is implemented in the dbt layer to remain transparent and reproducible.
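One common way to obtain deterministic, project-specific pseudonyms is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library. The text does not state which algorithm HOCH uses, so the choice of HMAC, the key values and the identifier format are all assumptions for illustration.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, project_key: bytes) -> str:
    """Return a deterministic, keyed pseudonym for one identifier."""
    digest = hmac.new(project_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder keys; real keys would be secret and managed per project.
key_project_a = b"example-key-project-a"
key_project_b = b"example-key-project-b"

pseudonym_a = pseudonymise("PAT-0001", key_project_a)
pseudonym_a_again = pseudonymise("PAT-0001", key_project_a)
pseudonym_b = pseudonymise("PAT-0001", key_project_b)
```

The same identifier always yields the same pseudonym within a project, so records remain linkable across deliveries, while different project keys yield unrelated pseudonyms for the same patient.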
3.2 Terminology Mapping
Terminology mapping is a core activity in the HOCH SPHN implementation. Laboratory, diagnosis, procedure, medication and assessment data are mapped to the corresponding SPHN-relevant code systems using curated mapping tables and transformation logic. Mappings are maintained separately from transformation logic to support traceability and iterative refinement.
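Keeping mappings as data rather than as code can be sketched as follows. The local codes (HB, NA, XYZ) are hypothetical, and the two LOINC codes are shown only as examples of target codes. In the pipeline described here, the mapping tables live in the database and are joined in dbt.

```python
# Curated mapping table: local code -> (target code system, target code).
LAB_CODE_MAPPING = {
    "HB": ("LOINC", "718-7"),    # hemoglobin, illustrative entry
    "NA": ("LOINC", "2951-2"),   # sodium, illustrative entry
}


def map_codes(rows, mapping):
    """Split rows into mapped rows and local codes without a mapping entry."""
    mapped, unmapped = [], set()
    for row in rows:
        target = mapping.get(row["local_code"])
        if target is None:
            unmapped.add(row["local_code"])
            continue
        system, code = target
        mapped.append({**row, "code_system": system, "code": code})
    return mapped, sorted(unmapped)


rows = [
    {"local_code": "HB", "value": "13.5"},
    {"local_code": "XYZ", "value": "1.0"},  # no mapping yet
]
mapped_rows, unmapped_codes = map_codes(rows, LAB_CODE_MAPPING)
```

Reporting unmapped codes separately, instead of silently dropping them, gives the mapping team a worklist for iterative refinement.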
Figure 3: Terminology mapping strategy. Local catalogues and codes are mapped to standard terminologies and SPHN value sets using curated mapping tables and dbt transformations.
3.3 Concept Factory
The concept factory at HOCH is realised as a structured set of dbt models that transform harmonised staging data into SPHN-aligned delivery tables. For each SPHN concept, the required attributes are selected, mapped, de-identified and materialised in a delivery table matching the SPHN Connector CSV specification. These tables are exported and ingested via the SPHN Connector. The dbt project forms a directed acyclic graph of models, making dependencies explicit and enabling reuse of shared components.
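The dependency graph can be made concrete with a small sketch that derives a valid build order using Python's standard graphlib module. dbt resolves this order itself from model references. The model names below are hypothetical and only illustrate how shared staging models feed several concept models.

```python
from graphlib import TopologicalSorter

# Each model maps to the set of models it depends on (hypothetical names).
dependencies = {
    "stg_lab_harmonised": {"raw_lab"},
    "stg_patient_harmonised": {"raw_patient"},
    "map_lab_codes": set(),
    "dlv_lab_result": {"stg_lab_harmonised", "stg_patient_harmonised", "map_lab_codes"},
    "dlv_birth_date": {"stg_patient_harmonised"},
}

# static_order() yields every model after all of its dependencies.
build_order = list(TopologicalSorter(dependencies).static_order())
```

In this example, stg_patient_harmonised is built once and reused by both delivery models, which is the kind of reuse the concept factory aims for.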
Figure 4: Concept factory design. Design of the concept factory, highlighting the reuse of shared dbt components and rules across multiple SPHN concept models.
3.4 Validation
Validation is performed at multiple levels, including dbt tests in relational layers and SHACL-based validation in the SPHN Connector. Validation findings are iteratively fed back into transformation logic and mapping tables.
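The relational checks can be pictured as simple predicates over a delivery table. The sketch below mimics two checks that dbt offers as generic tests (not-null and uniqueness) in plain Python. The table content and column names are assumptions for illustration, and SHACL validation in the SPHN Connector is not reproduced here.

```python
from collections import Counter


def not_null_failures(rows, column):
    """Return rows where the column is missing or empty."""
    return [row for row in rows if row.get(column) in (None, "")]


def unique_failures(rows, column):
    """Return values that occur more than once in the column."""
    counts = Counter(row[column] for row in rows)
    return [value for value, count in counts.items() if count > 1]


# Illustrative delivery rows with two deliberate defects.
delivery_rows = [
    {"record_id": "R1", "code": "718-7"},
    {"record_id": "R2", "code": ""},       # empty code
    {"record_id": "R2", "code": "2951-2"}, # duplicate record_id
]
missing_codes = not_null_failures(delivery_rows, "code")
duplicate_ids = unique_failures(delivery_rows, "record_id")
```

Catching such defects in the relational layer keeps the feedback loop short, since the issue can be fixed in a mapping table or model before any RDF is generated.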
4. Data Submission to BioMedIT
Once validated, per-concept CSV files are ingested into the SPHN Connector, transformed into RDF and transferred to the BioMedIT Network using the standard SPHN procedure. Official submissions are performed from the production environment.
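The per-concept CSV export step can be sketched with Python's csv module. The column list below is hypothetical, since the actual layout is defined by the SPHN Connector CSV specification for each concept. The upload to the Connector and the transfer to BioMedIT are intentionally left out.

```python
import csv
import io


def export_concept_csv(rows, columns):
    """Serialise delivery rows to CSV text with a fixed column order."""
    buffer = io.StringIO(newline="")
    # extrasaction="ignore" drops helper columns that are not part of the layout.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical delivery rows; "internal_note" is not part of the export layout.
rows = [
    {"patient_id": "P001", "code": "718-7", "value": "13.5", "internal_note": "x"},
    {"patient_id": "P002", "code": "2951-2", "value": "140", "internal_note": "y"},
]
csv_text = export_concept_csv(rows, ["patient_id", "code", "value"])
```

Fixing the column order in one place keeps the exported files stable across runs, which makes differences between deliveries easier to review.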
Figure 5: Data submission workflow. From validated delivery tables and per-concept CSV files through RDF generation and SHACL validation in the SPHN Connector to encrypted transfer via Sett.
5. Platform Architecture
The HOCH SPHN implementation is embedded into the existing hospital IT landscape and adds a dedicated SPHN layer on top of established integration and database services. Core principles are strict network segmentation, role-based access control and the use of standard SPHN components without functional modification. The SPHN Connector is deployed using the standard container-based distribution, and data is exchanged exclusively through defined APIs and file-based interfaces, ensuring a clear separation between hospital-internal systems and the BioMedIT network.
Figure 6: Platform architecture. Architecture and security zoning of the SPHN components within the hospital network and the BioMedIT environment.