Building an RDF browser using Elasticsearch, Kibi and node red
François Belleau, 2010 Biohacker
For years we have been building data stores exposed as linked data. One main obstacle to semantic web adoption by researchers is the lack of user-friendly software for analysing life science data in RDF. The goal of this project is to illustrate the potential of the popular data analysis tools Kibana and Elasticsearch for that task.
In 2010, I had the privilege of participating in the Tokyo Biohackathon. At the time, the Bio2RDF Virtuoso triplestore was in its early days and I was a fan of this emerging technology. My main contribution then was to promote the semantic web and explain to biohackers its advantages over traditional approaches. OpenLink's Virtuoso software has been central to many successful projects. At that time UniProt, Ensembl, PubChem, Reactome and GO did not have their own SPARQL endpoints; Bio2RDF's mission was to expose the major databases as linked data. Seven years later, linked data production stores in the life sciences are a reality, and our community has been very effective at promoting them. Despite these successes, the lack of a proper linked data browser designed for end users remains an obstacle to browsing semantic web (SW) triplestores.
Now that so many semantic web projects have gained maturity (identifiers.org, schema.org, JSON-LD), many of the initial problems have been solved. Remember Tim Berners-Lee's Tabulator software: it was a great idea but never evolved into production software. We all used the Virtuoso Facet Browser to explore and build our projects but, frankly, it has never been a tool suited to end users.
The BioMart project (http://www.biomart.org/) has built a solid community of end users. It took the SRS community to the next level with open source software that was adopted by major data providers (HGNC, MGI, Reactome, Ensembl, etc.). An attempt was made to create a SPARQL-engine-like interface over the data store, but it was not a success, and this relational database project could not come close to the agility of a triplestore. Its major advantage was a really cool user interface that was the same across different web sites. From my personal experience, there is still no comparable user interface that helps end users adopt triplestores on a daily basis. Providing one is still my goal.
As a data analyst, I use the ELK stack (Elasticsearch, Logstash, Kibana) daily to do my job. Here, I propose to explore the potential of this open source software and to illustrate that the linked data browser for life scientists may already exist.
Finally, I will use the Kibi stack from the Siren company instead of Kibana, because of its built-in plugins and relational capabilities.
The first thing to be done was installing the tools
- https://support.siren.solutions/support/solutions/folders/17000068159
- https://nodered.org/docs/getting-started/installation
Data extracted with Node-RED from the DisGeNET SPARQL endpoint and the Wikidata LDF API.
Data extracted with Node-RED from the Bio2RDF and Ontobee SPARQL endpoints.
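The extraction itself was done with Node-RED flows, which are not reproduced here. As a rough illustration of the underlying idea only, the Python sketch below flattens a SPARQL JSON result set into flat documents and serializes them in the newline-delimited format accepted by the Elasticsearch `_bulk` API. The function names, the index name `genes` and the sample data are hypothetical, and older Elasticsearch releases may also expect a `_type` in each action line.

```python
import json


def bindings_to_docs(sparql_json):
    """Flatten a SPARQL JSON results object into one flat dict per result row."""
    docs = []
    for row in sparql_json["results"]["bindings"]:
        # Each cell looks like {"type": "uri", "value": "..."}; keep only the value.
        docs.append({var: cell["value"] for var, cell in row.items()})
    return docs


def to_bulk_ndjson(docs, index_name, id_field=None):
    """Serialize documents as an Elasticsearch _bulk request body (NDJSON)."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index_name}}
        if id_field and id_field in doc:
            # Reusing the resource URI as document id makes re-indexing idempotent.
            action["index"]["_id"] = doc[id_field]
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Made-up sample shaped like a SPARQL SELECT response (not real data).
sample = {
    "head": {"vars": ["gene", "label"]},
    "results": {
        "bindings": [
            {
                "gene": {"type": "uri", "value": "http://example.org/gene/1"},
                "label": {"type": "literal", "value": "example gene"},
            }
        ]
    },
}

docs = bindings_to_docs(sample)
print(to_bulk_ndjson(docs, "genes", id_field="gene"), end="")
```

The resulting text could then be sent to a cluster's `_bulk` endpoint; once indexed, the flat documents can be explored in Kibi or Kibana like any other index.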
The demo web site
http://vps146209.vps.ovh.ca/goto/cec8f5d474926905c6b5ae2aaad5f3ba