Exploratory data analysis with triplestores

Exploiting the data in RDF

Loading and visualization with GraphDB

RDF data can be loaded into triplestores, and some triplestores can also visualize it. Here, we demonstrate how to load and visualize data in GraphDB. GraphDB’s documentation gives a good overview of the options for loading RDF data into GraphDB.

Step 1: Create and configure a repository

First, you have to create a new repository that will hold the data:

Create a new repository

Fill in the necessary information:

Fill in the necessary information

Select the repository on the top right before importing data into the created repository:

First select the repository on the top right, then import data into the repository

Enable the Autocomplete setting to make searching in the tool easier:

Enabling the Autocomplete setting in this repository eases the search

Step 2: Import data

There are several options for loading data into GraphDB.

Option A: Import from a text snippet

In this example we will import data via a text snippet that is copied into the web front-end. In the Import/RDF menu, on the User data tab, select Import RDF text snippet.

Import RDF text snippet

Copy and paste the following text into the text field:

@prefix allergies: <http://sib.swiss/allergies/> .
@prefix patients: <http://sib.swiss/fictivePatients/> .
@prefix substances: <http://sib.swiss/substances/> .
@prefix sib: <http://sib.swiss/> .
@prefix sphn: <https://biomedit.ch/rdf/sphn-ontology/sphn#> .
@prefix snomed: <http://snomed.info/id/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# types
patients:anonymous1 rdf:type sphn:SubjectPseudoIdentifier .
sib:hospital1 rdf:type sphn:DataProviderInstitute .
allergies:allergy1 rdf:type sphn:Allergy .
substances:peanuts1 rdf:type snomed:762952008 .

# relations to the allergy
allergies:allergy1 sphn:hasSubjectPseudoIdentifier patients:anonymous1 .
allergies:allergy1 sphn:hasDataProviderInstitute sib:hospital1 .
allergies:allergy1 sphn:hasSubstance substances:peanuts1 .

This shows the import from a text snippet

Accept the default settings:

We accept the default settings

A message appears showing the successful import:

Message showing the successful import
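The same kind of import can also be scripted instead of pasted into the web front-end. GraphDB implements the RDF4J REST protocol, in which RDF data sent with POST to the statements endpoint of a repository is added to that repository. The following is a minimal sketch that only builds such a request. The base URL `http://localhost:7200`, the repository id `demo`, and the function names are assumptions made for illustration; authentication is not handled, and the embedded Turtle is a shortened version of the snippet above.

```python
import urllib.request

# Assumed values: adjust to your own GraphDB instance and repository.
GRAPHDB_URL = "http://localhost:7200"
REPOSITORY_ID = "demo"

# Shortened version of the tutorial snippet (one triple only).
TURTLE_SNIPPET = """\
@prefix allergies: <http://sib.swiss/allergies/> .
@prefix sphn: <https://biomedit.ch/rdf/sphn-ontology/sphn#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

allergies:allergy1 rdf:type sphn:Allergy .
"""


def build_import_request(turtle, base_url=GRAPHDB_URL, repository=REPOSITORY_ID):
    """Build a POST request that adds Turtle data to a repository."""
    return urllib.request.Request(
        url=f"{base_url}/repositories/{repository}/statements",
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


def send(request):
    """Send the request; needs a reachable GraphDB instance."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `send(build_import_request(TURTLE_SNIPPET))` against a running instance should return a 2xx status on success; `urllib` raises an `HTTPError` otherwise.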

Option B: Import from server files

If enabled on your GraphDB instance, data in a dedicated folder on the GraphDB server is exposed in the user interface. To list and load these files and folders, navigate to the Import/RDF menu, select the Server files tab, and import selected or all files.

Data import via server files

When prompted, accept all default settings (as above).

A message appears showing the successful import:

Message showing the successful import
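Whichever import option is used, the result can also be checked outside the user interface. The RDF4J REST protocol that GraphDB implements exposes a size endpoint that returns the number of statements in a repository as plain text. The sketch below builds that request and parses the response; the base URL, the repository id `demo`, and the function names are assumptions for illustration. As a point of comparison, the snippet from Option A contains seven triples, although the number reported for a given repository may differ depending on what else it holds and how it is configured.

```python
import urllib.request

# Assumed values: adjust to your own GraphDB instance and repository.
GRAPHDB_URL = "http://localhost:7200"
REPOSITORY_ID = "demo"


def build_size_request(base_url=GRAPHDB_URL, repository=REPOSITORY_ID):
    """Build a GET request for the statement count of a repository."""
    return urllib.request.Request(
        url=f"{base_url}/repositories/{repository}/size",
        method="GET",
    )


def parse_size(body):
    """Turn the plain-text response body (bytes) into an integer."""
    return int(body.decode("utf-8").strip())


def fetch_size(request):
    """Send the request; needs a reachable GraphDB instance."""
    with urllib.request.urlopen(request) as response:
        return parse_size(response.read())
```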

Option C: Import via the preload command

For large datasets, GraphDB’s preload tool offers better performance than importing via the user interface. The preload command must be executed directly on the GraphDB server, so please get in touch with your instance’s system administrators.

Monitoring resources while importing data

System resources, such as memory or CPU consumption, can be monitored via Monitor/Resources:

Resource monitoring

This can be helpful to debug issues with excessive resource consumption, especially while importing large datasets.

Step 3: Visualize the graph

Search for specific data of interest that was imported (here, allergy1):

Searching for the Concept previously imported

GraphDB can visualize the data as a graph. Here, the imported resource allergy1 that was searched for is shown with its first-hop neighbours:

Visualization of the imported data, centred on the searched resource with its first-hop neighbours
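A rough textual counterpart to this view is a SPARQL query that lists every triple in which allergy1 appears as subject or as object. The sketch below builds such a query request following the SPARQL 1.1 Protocol (a form-encoded POST to the repository endpoint) and parses the JSON result format. The base URL, the repository id `demo`, and the function names are assumptions for illustration; the query is not what GraphDB runs internally for its visual graph.

```python
import json
import urllib.parse
import urllib.request

# Assumed values: adjust to your own GraphDB instance and repository.
GRAPHDB_URL = "http://localhost:7200"
REPOSITORY_ID = "demo"

# First-hop neighbours: triples leaving from or pointing to allergy1.
NEIGHBOUR_QUERY = """\
PREFIX allergies: <http://sib.swiss/allergies/>
SELECT ?direction ?predicate ?neighbour WHERE {
  { allergies:allergy1 ?predicate ?neighbour . BIND("outgoing" AS ?direction) }
  UNION
  { ?neighbour ?predicate allergies:allergy1 . BIND("incoming" AS ?direction) }
}
"""


def build_query_request(query, base_url=GRAPHDB_URL, repository=REPOSITORY_ID):
    """Build a SPARQL query request as a form-encoded POST."""
    return urllib.request.Request(
        url=f"{base_url}/repositories/{repository}",
        data=urllib.parse.urlencode({"query": query}).encode("utf-8"),
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def parse_neighbours(json_text):
    """Extract (direction, predicate, neighbour) tuples from SPARQL JSON results."""
    bindings = json.loads(json_text)["results"]["bindings"]
    return [
        (row["direction"]["value"], row["predicate"]["value"], row["neighbour"]["value"])
        for row in bindings
    ]


def fetch_neighbours(request):
    """Send the request; needs a reachable GraphDB instance."""
    with urllib.request.urlopen(request) as response:
        return parse_neighbours(response.read().decode("utf-8"))
```

For the snippet imported above, the outgoing rows would be expected to include the rdf:type statement and the three sphn relations of allergy1; no triple in the snippet has allergy1 as its object.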