RDF Quads

RDF Quads are an extension of RDF that associates a graph name with each triple (i.e. a subject, a predicate, and an object). This supports the representation of contextual information by organizing and distinguishing sets of triples in a given RDF graph, as shown in Figure 1.

Figure 1. Mock example of a single RDF Triple about a CareHandling subject with a predicate hasTypeCode pointing to an Outpatient procedure object. The same triple is embedded in an RDF Quad with the named graph patient27, indicating that the triple belongs to patient27.

Formats

Different data formats exist to encode RDF Quads. We focus on two of them: N-Quads and TriG.

N-Quads format

N-Quads is an extension of N-Triples that enables the representation of named graphs in a line-based syntax. Each line contains a single Quad, composed of a subject, a predicate, an object, and the graph name, as shown in the example below. For readability, the example abbreviates some terms with the a keyword and the sphn: and snomed: prefixes; strict N-Quads requires every IRI to be written in full within angle brackets.

<https://biomedit.ch/rdf/sphn-resource/CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e> a sphn:CareHandling <http://patient_27.ch/> .
<https://biomedit.ch/rdf/sphn-resource/CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e> sphn:hasSourceSystem <https://biomedit.ch/rdf/sphn-resource/CHE-108_904_325-sphn-SourceSystem-57e8b884-5c8d-4580-8ad9-8e7de77861ba> <http://patient_27.ch/> .
<https://biomedit.ch/rdf/sphn-resource/CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e> sphn:hasTypeCode <https://biomedit.ch/rdf/sphn-resource/CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e-sphn-Code-SNOMED-CT-371883000> <http://patient_27.ch/> .
<https://biomedit.ch/rdf/sphn-resource/CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e-sphn-Code-SNOMED-CT-371883000> a snomed:371883000 <http://patient_27.ch/> .
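Because each line carries a complete statement, N-Quads can be processed one line at a time. The following Python sketch splits lines into their four components and groups the triples by graph name. It assumes that every term is an IRI written in full within angle brackets (no literals, blank nodes, or prefixed names), so it is not a complete N-Quads parser. The example.org IRIs, the second patient graph, and the function name group_by_graph are made up for illustration.

```python
import re
from collections import defaultdict

# Matches one IRI in angle brackets; literals and blank nodes are not handled.
IRI = re.compile(r"<([^<>\s]*)>")

def group_by_graph(nquads_text):
    """Group (subject, predicate, object) triples by their graph name."""
    graphs = defaultdict(list)
    for line in nquads_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        terms = IRI.findall(line)
        if len(terms) != 4 or not line.endswith("."):
            raise ValueError(f"not an IRI-only quad: {line!r}")
        subject, predicate, obj, graph = terms
        graphs[graph].append((subject, predicate, obj))
    return dict(graphs)

# Hypothetical data: two quads in one patient graph, one quad in another.
SAMPLE = """
<http://example.org/care1> <http://example.org/hasTypeCode> <http://example.org/code1> <http://patient_27.ch/> .
<http://example.org/care1> <http://example.org/hasSourceSystem> <http://example.org/sys1> <http://patient_27.ch/> .
<http://example.org/care2> <http://example.org/hasTypeCode> <http://example.org/code1> <http://patient_28.ch/> .
"""

grouped = group_by_graph(SAMPLE)
for graph, triples in grouped.items():
    print(graph, len(triples))
```

For real data, an RDF library with a full N-Quads parser is the safer choice; the sketch only illustrates that the fourth term on each line is the graph name.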

TriG format

TriG is an extension of the Turtle format which enables the representation of named graphs. Unlike N-Quads, TriG makes use of curly brackets to denote the triples that belong to a named graph. In the example below we have a set of triples that belong to a patient represented in TriG:

<http://patient_27.ch/> {
    resource:CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e a sphn:CareHandling ;
        sphn:hasSourceSystem resource:CHE-108_904_325-sphn-SourceSystem-57e8b884-5c8d-4580-8ad9-8e7de77861ba ;
        sphn:hasTypeCode resource:CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e-sphn-Code-SNOMED-CT-371883000 .

    resource:CHE-108_904_325-sphn-CareHandling-191b7a4f-281e-417b-88c4-919efcc70f3e-sphn-Code-SNOMED-CT-371883000 a snomed:371883000 .
}
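To illustrate how the curly-bracket grouping relates to a flat list of quads, the following Python sketch writes triples that are already grouped by graph name as TriG blocks. It is a simplified illustration: it assumes IRI-only terms, writes every IRI in full instead of using prefixes, and the function name to_trig and the example.org IRIs are hypothetical.

```python
def to_trig(graphs):
    """Serialize {graph IRI: [(s, p, o), ...]} as TriG blocks with full IRIs."""
    blocks = []
    for graph, triples in graphs.items():
        lines = [f"    <{s}> <{p}> <{o}> ." for s, p, o in triples]
        # One block per named graph: the graph name, then its triples in braces.
        blocks.append(f"<{graph}> {{\n" + "\n".join(lines) + "\n}")
    return "\n\n".join(blocks)

# Hypothetical data for a single patient graph.
example = {
    "http://patient_27.ch/": [
        ("http://example.org/care1", "http://example.org/hasTypeCode", "http://example.org/code1"),
    ]
}

trig_text = to_trig(example)
print(trig_text)
```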

Comparison of N-Quads and TriG

Comparative table of N-Quads and TriG

| Format               | N-Quads                                | TriG                                                    |
|----------------------|----------------------------------------|---------------------------------------------------------|
| Syntax               | Quadruple-based (one Quad per line)    | Triple-based (curly brackets indicate the named graph)  |
| File extension       | .nq                                    | .trig                                                   |
| Machine-processable  | yes (can be slightly faster than TriG) | yes                                                     |
| Human-readable       | yes (but with some overhead)           | yes (easier than N-Quads)                               |
| Compressed file size | smaller than TriG                      | a little larger than N-Quads                            |

In SPHN, we recommend TriG as the format to describe RDF Quads. However, projects are free to request either of these two formats when receiving data.

Quads in SPHN

In SPHN, RDF Quads are used to cluster the data (triples) related to a specific patient, both when the data is processed for delivery and when it is stored in triplestores. Quads are generated with the SPHN Connector.

Using RDF Quads makes it easier to handle patients in the different SPHN projects, considering that data generated with the SPHN Connector is patient-specific (one RDF file per patient).

For data providers, this streamlines updates when the data of specific patients changes: a single patient can be re-uploaded instead of the full set of patients. This greatly reduces the time needed to load patients into the database (especially in the context of the Data Exploration and Analysis System, DEAS; more information on this resource will come in the course of 2024). It also helps when data is delivered recurrently in specific projects (handled with the SPHN Connector).

For data users, this simplifies and speeds up querying by allowing them to concentrate on the data of the patients they’re interested in. A modified patient can be reloaded in the triplestore in two steps: (1) delete the patient’s named graph already present in the database, and (2) import the patient’s new named graph, which contains all the data of that patient including the modified information. When the modification only adds new information about the patient, the data user can simply load the new information under the same named graph IRI, as encoded in the TriG or N-Quads file. In this case, the patient’s named graph does not need to be deleted first.
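The two update scenarios can also be expressed as SPARQL Update requests rather than file imports. The Python sketch below only builds the request text; it does not send anything to a triplestore. It assumes IRI-only triples (no literals or blank nodes), and the helper names iri_term, insert_data, and replace_graph as well as the example.org IRIs are hypothetical.

```python
def iri_term(iri):
    """Wrap an IRI in angle brackets, rejecting characters not allowed in IRIs."""
    if any(ch in iri for ch in '<>"{}|^`\\ \n\t'):
        raise ValueError(f"invalid character in IRI: {iri!r}")
    return f"<{iri}>"

def insert_data(graph_iri, triples):
    """Build an INSERT DATA request that adds triples to one named graph."""
    body = "\n".join(
        f"    {iri_term(s)} {iri_term(p)} {iri_term(o)} ." for s, p, o in triples
    )
    return f"INSERT DATA {{\n  GRAPH {iri_term(graph_iri)} {{\n{body}\n  }}\n}}"

def replace_graph(graph_iri, triples):
    """Build a request that first empties the named graph, then reloads it."""
    # SILENT keeps the request from failing if the graph does not exist yet.
    clear = f"CLEAR SILENT GRAPH {iri_term(graph_iri)} ;\n"
    return clear + insert_data(graph_iri, triples)

# Hypothetical patient data.
patient = "http://patient_27.ch/"
data = [("http://example.org/care1", "http://example.org/hasTypeCode", "http://example.org/code1")]

addition_only = insert_data(patient, data)   # new information only: no deletion
full_reload = replace_graph(patient, data)   # modified patient: clear, then insert
print(full_reload)
```

Sending both operations in one request (separated by a semicolon) keeps the deletion and the reload together; whether they are applied atomically depends on the triplestore.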

Quads can also help restrict access to particular graph statements (i.e. data elements) to specific users. For instance, in GraphDB, access to a specific named graph can be denied to certain users (see https://graphdb.ontotext.com/documentation/10.5/quad-based-access-control.html). This may be of interest within or between SPHN projects.

Querying with Quads

Patient information

How to query for information about a specific patient?

The following queries return all the triples (i.e. all the data) associated with a given patient’s named graph (i.e. that relate to that patient).

Option 1:

SELECT *
WHERE {
    GRAPH <http://patient_27.ch/> {
        ?s ?p ?o .
    }
}

Option 2:

SELECT *
FROM NAMED <http://patient_27.ch/>
WHERE {
    GRAPH ?g {
        ?s ?p ?o .
    }
}

Deletion of patients

How to easily delete a patient with the named graph http://patient_27.ch/ (for example, when a patient has revoked consent)?

Option 1:

DELETE {
    GRAPH <http://patient_27.ch/> { ?s ?p ?o }
}
WHERE {
    # Match inside the patient's named graph, not the default graph
    GRAPH <http://patient_27.ch/> { ?s ?p ?o }
}

Option 2:

CLEAR GRAPH <http://patient_27.ch/>