Inferring Web Citations using Social Data and SPARQL Rules - Matthew Rowe
The document discusses using SPARQL rules to infer web citations from social data about individuals. It describes generating seed data by extracting profiles from social networks and linking them. Rules are built from the seed data by adding triples and creating new rules for inverse functional properties. The rules are applied to web resources to infer citations with high precision but low recall, outperforming humans for individuals with low web presence. Future work aims to overcome limitations of the seed data and enable learning from identifications.
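As a hedged illustration of the inverse-functional-property rules mentioned above (a sketch in their spirit, not the paper's actual rule set), a SPARQL CONSTRUCT can infer that two resources sharing a value for an IFP such as foaf:mbox_sha1sum denote the same individual:

# Illustrative sketch only: an IFP-style rule as a SPARQL CONSTRUCT.
# foaf:mbox_sha1sum is a standard inverse functional property; the
# paper's actual rules, built from seed data, may differ.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.org/a> foaf:mbox_sha1sum "ab12" .
<http://example.org/b> foaf:mbox_sha1sum "ab12" .
""", format="turtle")

rule = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
CONSTRUCT { ?x owl:sameAs ?y }
WHERE { ?x foaf:mbox_sha1sum ?v . ?y foaf:mbox_sha1sum ?v . FILTER(?x != ?y) }
"""
for triple in g.query(rule):
    print(triple)  # inferred owl:sameAs links between the two profiles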
Araport is a database and web portal that aims to be a "one-stop shop" for Arabidopsis thaliana data. It contains genome annotations, gene expression data, variants from 1001 genomes, and tools for viewing, analyzing and curating this data. The core components are ThaleMine, a data warehouse, JBrowse, a genome browser, and tools for community curation of gene models. Araport integrates multiple external public datasets and provides programmatic access via web services and APIs.
Presentation at the NEH-Funded Linked Ancient World Data Institute, ISAW/NYU, New York, May 2012. Discusses the use of RDF and linked data to represent geographic information and relationships between resources.
Integrated omics analysis pipeline for model organism with Cytoscape - Kozo Nishida
This document presents an integrated omics analysis pipeline for model organisms using Cytoscape. The pipeline aims to be reproducible and modifiable in a single IPython notebook environment. It controls Cytoscape using cyREST for network analysis and imports pathway data from KEGG using the KEGGscape app. Examples demonstrate mapping differentially expressed genes and drug targets in E. coli to KEGG pathways. The pipeline can integrate other pathway databases and its utility functions will be packaged for wider use.
This document discusses Bio2RDF, a project that converts life science databases into RDF and makes them accessible via SPARQL endpoints. It provides background on the need for data integration, describes how Bio2RDF was implemented including the conversion process and architecture, and outlines future goals like adding more datasets and developing new services.
Getting the best of Linked Data and Property Graphs: rdf2neo and the KnetMine... - Rothamsted Research, UK
Graph-based modelling is becoming more popular, in the sciences and elsewhere, as a flexible and powerful way to exploit data to power world-changing digital applications. Compared to the initial vision of the Semantic Web, knowledge graphs and graph databases are becoming a practical and computationally less formal way to manage graph data. On the other hand, linked data based on Semantic Web standards is a complementary, rather than alternative, approach to dealing with these data, since it still provides a common way to represent and exchange information. In this paper we introduce rdf2neo, a tool to populate Neo4j databases starting from RDF data sets, based on a configurable mapping between the two. By employing agrigenomics-related real use cases, we show how such a mapping can allow for a hybrid approach to the management of networked knowledge, taking advantage of the best of both RDF and property graphs.
Sharing data with lightweight data standards, such as schema.org and bioschemas. The Knetminer case, an application for the agrifood domain and molecular biology.
Presented at Open Data Sicilia (#ODS2021)
Presentation at the EMBL-EBI Industry RDF meeting - Johannes Keizer
The document discusses how AGROVOC, AGRIS, and the CIARD RING leverage RDF vocabularies and technologies to improve data interoperability. It provides examples of how AGRIS retrieves information on its centers through SPARQL queries of the RING, and how data in AGRIS is associated with RING URIs for centers to allow retrieving records by center. The RING is an openly accessible RDF store of datasets described using DCAT, accessible via its SPARQL endpoint.
ICAR 2015
Workshop 10 (TUESDAY, JULY 7, 2015, 4:30-6:00 PM)
The Arabidopsis information portal for users and developers
Blake Meyers (University of Delaware)
A Community Collaborator Perspective: Case study 2 - Small RNA DBs
Producing, publishing and consuming linked data - CSHALS 2013 - François Belleau
This document discusses lessons learned from the Bio2RDF project for producing, publishing, and consuming linked data. It outlines three key lessons: 1) How to efficiently produce RDF using existing ETL tools like Talend to transform data formats into RDF triples; 2) How to publish linked data by designing URI patterns, offering SPARQL endpoints and associated tools, and registering data in public registries; 3) How to consume SPARQL endpoints by building semantic mashups using workflows to integrate data from multiple endpoints and then querying the mashup to answer questions.
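To make lesson 3 concrete, here is a minimal sketch of consuming a public SPARQL endpoint from Python with SPARQLWrapper; the endpoint and query are illustrative stand-ins, not the talk's exact examples:

# Minimal sketch of consuming a SPARQL endpoint (lesson 3).
# Endpoint URL and query are illustrative assumptions only.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery("""
SELECT ?s ?label
WHERE { ?s rdfs:label ?label }
LIMIT 5
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()

for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["label"]["value"])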
The document describes the Gryphon Framework, which aims to simplify the integration of ontologies and relational databases. It discusses how Gryphon uses a GAV approach to virtually mediate SPARQL queries by rewriting them for local ontologies and databases. The architecture and five-step integration process are illustrated with an example using bibliographic data sources.
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat... - Araport
The PMR database is a community resource for the deposition and analysis of metabolomics data and related transcriptomics data. PMR currently houses metabolomics data from over 25 species of eukaryotes. In this talk, we introduce PMR's RESTful web APIs for data sharing and demonstrate their application in research, using Araport to provide Arabidopsis metabolomics data.
The document discusses search and how it works under the hood. It begins with an overview of common search problems and limitations. It then demonstrates how search works by indexing documents into an inverted index of tokens and associated document references. Key steps include analyzing text by splitting, downcasing, and removing stopwords, and then storing the token postings in the index. Search queries can then be executed by looking up token postings in the index. A Ruby example class demonstrates indexing sample documents and searching the generated index.
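The text mentions a Ruby example class; here is an equivalent minimal sketch in Python (an illustration of the same indexing idea, not the document's actual code):

# Minimal inverted index: analyze text, store token -> posting set,
# then answer queries by lookup, mirroring the steps described above.
STOPWORDS = {"the", "a", "an", "and", "of", "to"}

def analyze(text):
    # Split, downcase, and remove stopwords.
    return [t for t in text.lower().split() if t not in STOPWORDS]

index = {}

def add_document(doc_id, text):
    for token in analyze(text):
        index.setdefault(token, set()).add(doc_id)

def search(query):
    # Intersect the posting sets of all query tokens.
    postings = [index.get(t, set()) for t in analyze(query)]
    return set.intersection(*postings) if postings else set()

add_document(1, "The quick brown fox")
add_document(2, "The quick blue hare")
print(search("quick fox"))  # -> {1}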
Tripal within the Arabidopsis Information Portal - PAG XXIII - Vivek Krishnakumar
Araport plans to implement a Chado-backed data warehouse, fronted by Tripal, serving as our core database. It will be used to track multiple versions of genome annotation (TAIR10, Araport11, etc.), evidentiary data (used by our annotation update pipeline), metadata such as publications collated from multiple sources like TAIR, NCBI PubMed and UniProtKB (curated and unreviewed), and stock/germplasm data linked to AGI loci via their associated polymorphisms.
HRGRN: enabling graph search and integrative analysis of Arabidopsis signalin... - Araport
The biological networks controlling plant signal transduction, metabolism and gene regulation are composed not only of genes, RNA, proteins and compounds but also of the complicated interactions among them. Yet, even in the most thoroughly studied model plant, Arabidopsis thaliana, knowledge of these interactions is scattered across the literature and various public databases. Thus, making new scientific discoveries by exploring these complex and heterogeneous data remains a challenging task for biologists.
We developed a graph-search-empowered platform named HRGRN to search known and, more importantly, discover novel relationships among genes in Arabidopsis biological networks. HRGRN includes over 51,000 “nodes” representing genes, proteins, small RNAs, and compounds, and approximately 150,000 “edges” classified into nine types of interactions, including protein-protein interactions, compound-protein interactions, transcription factors (TFs) and their downstream target genes, small RNAs and their target genes, kinases and their downstream target genes, transporters and substrates, substrate/product compounds and enzymes, and gene pairs with similar expression patterns (providing deep insight into gene-gene relationships), to comprehensively model the complex interactions between nodes.
HRGRN allows users to discover novel interactions between genes and/or pathways and to build sub-networks from user-specified seed nodes by searching the comprehensive collection of interactions stored in its back-end graph databases using graph traversal algorithms. The HRGRN database is freely available at http://plantgrn.noble.org/hrgrn/. Currently, we are collaborating with the Araport team to develop REST-like web services that expose HRGRN's graph search functions to the Araport system.
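As a sketch of how such REST-like services might be consumed (the endpoint path and parameter names below are hypothetical, not HRGRN's published API):

# Hypothetical sketch of calling a REST-like graph-search service.
# The path and parameter names are assumptions for illustration;
# consult the HRGRN/Araport documentation for the real interface.
import requests

BASE = "http://plantgrn.noble.org/hrgrn"  # project home from the abstract

def find_subnetwork(seed_gene, max_hops=2):
    resp = requests.get(
        f"{BASE}/api/subnetwork",  # hypothetical path
        params={"seed": seed_gene, "hops": max_hops},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Example call with a hypothetical AGI locus identifier:
# network = find_subnetwork("AT1G01010")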
Araport is a one-stop community platform for Arabidopsis thaliana data integration, sharing, and analysis. It contains gene reports, expression data, sequences, variants, and community-contributed data tracks and modules. Key features include the ThaleMine gene search and analysis tool, JBrowse genome browser with over 100 tracks, and regularly updated Araport11 genome annotation. The platform is built and maintained by a collaboration between academic institutions and is intended to support open data sharing across the Arabidopsis research community.
JBrowse within the Arabidopsis Information Portal - PAG XXIII - Vivek Krishnakumar
Araport integrates the JBrowse visualization software from GMOD. To support diverse sets of locally and remotely sourced tracks, the “ComboTrackSelector” JBrowse plugin was developed; it partitions metadata-rich tracks into the “Faceted” selector while using the default “Hierarchical” selector for everything else.
A dynamic sequence viewer add-on, “SeqLighter”, was developed using the BioJS framework (http://biojs.net/). It is configured to offer end-users the ability to view the genomic sequence underlying gene models (genic regions plus customizable flanking regions), highlight sub-features (such as UTRs, exons, introns and start/stop codons) and export the annotated output in various formats (SVG, PNG, JPEG).
The document summarizes an open genomic data project called OpenFlyData that links and integrates gene expression data from multiple sources using semantic web technologies. It describes how RDF and SPARQL are used to query linked data from sources like FlyBase, BDGP and FlyTED. It also discusses applications built on top of the linked data as well as performance and challenges of the system.
Introduction to Research Objects - Collaborations Workshop 2015, Oxford - matthewgamble
Introduction to Research Objects - http://www.researchobject.org. Presented at the Software Sustainability Institute's Collaborations Workshop 2015, University of Oxford, March 2015
The Araport project aims to integrate a wide range of Arabidopsis thaliana data types through a federated data approach and web services. It will provide researchers with tools to expose their own data via the Araport interface. Some key data types Araport will integrate include genes, proteins, pathways, orthologs, germplasm, phenotypes, and interactions. Data will come from sources like TAIR, NCBI, UniProt, and community contributors. Users will be able to access data through interfaces like JBrowse and ThaleMine, and developers can create custom analysis apps. The richness of data in Araport depends on community participation.
On the development and distribution of R packages - Tom Mens
In this presentation at IWSECO-WEA 2015 (Dubrovnik, Croatia, 8 September 2015) we present the ecosystem of software packages for R, one of the most popular environments for statistical computing today. We empirically study how R packages are developed and distributed on different repositories: CRAN, BioConductor, R-Forge and GitHub. We also explore the role and size of each repository, the inter-repository dependencies, and how these repositories grow over time. With this analysis, we provide a deeper insight into the extent and the evolution of the R package ecosystem.
Describing Scientific Datasets: The HCLS Community Profile - Alasdair Gray
Big Data presents an exciting opportunity to pursue large-scale analyses over collections of data in order to uncover valuable insights across a myriad of fields and disciplines. Yet, as more and more data is made available, researchers are finding it increasingly difficult to discover and reuse these data. One problem is that data are insufficiently described to understand what they are or how they were produced. A second issue is that no single vocabulary provides all key metadata fields required to support basic scientific use cases. A third issue is that data catalogs and data repositories all use different metadata standards, if they use any standard at all, and this prevents easy search and aggregation of data. Therefore, we need a community profile to indicate what the essential metadata are and the manner in which to express them.
The W3C Health Care and Life Sciences Interest Group has developed such a community profile that defines the required properties to provide high-quality dataset descriptions that support finding, understanding, and reusing scientific data, i.e. making the data FAIR (Findable, Accessible, Interoperable and Re-usable – http://datafairport.org). The specification reuses many notions and vocabulary terms from Dublin Core, DCAT and VoID, with provenance and versioning information provided by PROV-O and PAV. The community profile is based around a three-tier model: the summary description captures catalogue-style metadata about the dataset; each version of the dataset is described separately, as are the various distribution formats of those versions. The resulting community profile is generic and applicable to a wide variety of scientific data.
Tools are being developed to help with the creation and validation of these descriptions. Several datasets including those from Bio2RDF, EBI and IntegBio are already moving to release descriptions conforming to the community profile.
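A minimal sketch of what a description following this three-tier model might look like, built with rdflib from the vocabularies named above (Dublin Core, DCAT, PAV); the URIs and the exact property selection are illustrative, not a verbatim HCLS example:

# Illustrative three-tier dataset description in the spirit of the
# HCLS profile: summary, version, and distribution levels. URIs and
# property choices are assumptions, not the full profile.
from rdflib import Graph, Namespace, URIRef, Literal

DCT = Namespace("http://purl.org/dc/terms/")
DCAT = Namespace("http://www.w3.org/ns/dcat#")
PAV = Namespace("http://purl.org/pav/")

g = Graph()
summary = URIRef("http://example.org/dataset/demo")      # summary level
version = URIRef("http://example.org/dataset/demo/v1")   # version level
dist = URIRef("http://example.org/dataset/demo/v1.ttl")  # distribution

g.add((summary, DCT.title, Literal("Example dataset (summary)")))
g.add((version, DCT.isVersionOf, summary))
g.add((version, PAV.version, Literal("1")))
g.add((version, DCAT.distribution, dist))
g.add((dist, DCAT.mediaType, Literal("text/turtle")))

print(g.serialize(format="turtle"))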
Developing Apps: Exposing Your Data Through Araport - Matthew Vaughn
This document discusses developing apps and exposing data through Araport, an open platform for plant science data and tools. It provides an overview of why researchers should contribute to Araport, how to create web services and apps, commonly asked questions, and available resources. Key steps to creating a web service include implementing REST APIs, describing data sources in metadata, and testing and sharing the service. Developing a science app involves using app templates, consuming Araport APIs to build interactive tools, and publishing apps for others.
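As a hedged sketch of the "implement a REST API" step (a generic Python microservice for illustration, not Araport's actual service framework):

# Generic sketch of a small REST service exposing a data source, in
# the spirit of the steps above. Route and payload shape are
# illustrative assumptions, not Araport's API.
from flask import Flask, jsonify

app = Flask(__name__)

# Toy in-memory "data source" keyed by a hypothetical gene locus.
GENES = {"AT1G01010": {"symbol": "NAC001", "chromosome": "1"}}

@app.route("/genes/<locus>")
def get_gene(locus):
    record = GENES.get(locus)
    if record is None:
        return jsonify({"error": "not found"}), 404
    return jsonify({"locus": locus, **record})

if __name__ == "__main__":
    app.run(port=5000)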
Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S... - BigDataEverywhere
Paco Nathan, Director of Community Evangelism at Databricks
Apache Spark is intended as a fast and powerful general purpose engine for processing Hadoop data. Spark supports combinations of batch processing, streaming, SQL, ML, Graph, etc., for applications written in Scala, Java, Python, Clojure, and R, among others. In this talk, I'll explore how Spark fits into the Big Data landscape. In addition, I'll describe other systems with which Spark pairs nicely, and will also explain why Spark is needed for the work ahead.
URI Disambiguation in the Context of Linked Data - butest
The document discusses URI disambiguation in linked data repositories. It notes that a single entity often has multiple URIs both within and across repositories, leading to inconsistencies. It examines approaches to author disambiguation and discusses results of disambiguating authors in the DBLP dataset, finding many authors were incorrectly merged. It also notes issues of inconsistent owl:sameAs linkage in DBpedia. The document proposes solutions like consistent reference services and OKKAM to help manage coreference and improve consistency across linked data.
Apache Spark: the next big thing? - StampedeCon 2014 - StampedeCon
Apache Spark: the next big thing? - StampedeCon 2014
Steven Borrelli
It’s been called the leading candidate to replace Hadoop MapReduce. Apache Spark uses fast in-memory processing and a simpler programming model to speed up analytics and has become one of the hottest technologies in Big Data.
In this talk we’ll discuss:
What is Apache Spark and what is it good for?
Spark’s Resilient Distributed Datasets
Spark integration with Hadoop, Hive and other tools
Real-time processing using Spark Streaming
The Spark shell and API
Machine Learning and Graph processing on Spark
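A minimal PySpark sketch of the in-memory, RDD-based programming model the talk covers (illustrative only, not the speaker's code):

# Minimal RDD sketch: load data, transform it lazily, cache it in
# memory for repeated queries, then run an action.
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("rdd-sketch").setMaster("local[*]")
sc = SparkContext(conf=conf)

lines = sc.parallelize(["spark is fast", "spark is simple", "hadoop is older"])
words = lines.flatMap(lambda line: line.split())
counts = words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)
counts.cache()  # keep in memory for reuse across actions

print(counts.collect())  # e.g. [('spark', 2), ('is', 3), ...]
sc.stop()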
This document discusses how semantic web technologies like RDF and SPARQL can help navigate complex bioinformatics databases. It describes a three step method for building a semantic mashup: 1) transform data from sources into RDF, 2) load the RDF into a triplestore, and 3) explore and query the dataset. As an example, it details how Bio2RDF transformed various database cross-reference resources into RDF and loaded them into Virtuoso to answer questions about namespace usage.
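A compact sketch of the three-step method with rdflib standing in for a full triplestore (Bio2RDF used Virtuoso at scale):

# Three-step mashup sketch: (1) transform source data to RDF,
# (2) load it into a store (an in-memory rdflib Graph here, standing
# in for a triplestore like Virtuoso), (3) explore and query it.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")

# Step 1: transform tabular cross-reference data into triples (toy input).
rows = [("geneA", "kegg"), ("geneB", "uniprot"), ("geneC", "kegg")]
g = Graph()
for gene, ns in rows:
    g.add((EX[gene], EX.xrefNamespace, Literal(ns)))

# Step 2: the graph is our loaded store in this sketch.

# Step 3: query namespace usage, as in the Bio2RDF example above.
q = """
PREFIX ex: <http://example.org/>
SELECT ?ns (COUNT(?g) AS ?n)
WHERE { ?g ex:xrefNamespace ?ns }
GROUP BY ?ns
"""
for row in g.query(q):
    print(row.ns, row.n)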
A presentation about the potential of SPARQL querying using Virtuoso over 65 million triples about the human and mouse genomes from http://bio2rdf.org.
GDG Meets U event - Big data & Wikidata - no lies codelab - CAMELIA BOBAN
This document discusses using SPARQL to query RDF data from DBpedia. It provides an overview of key concepts like RDF triples, SPARQL, and the Apache Jena framework. It also includes a sample SPARQL query to retrieve cities in Abruzzo, Italy with a population over 50,000. Resources and prefixes for working with DBpedia, Wikidata, and other linked data sets are listed.
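The sample query might look like the following hedged reconstruction (via SPARQLWrapper and the public DBpedia endpoint; the document's exact query may differ):

# Hedged reconstruction of the described query: cities in Abruzzo,
# Italy with population over 50,000, from the public DBpedia endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?city ?pop WHERE {
  ?city dbo:region dbr:Abruzzo ;
        dbo:populationTotal ?pop .
  FILTER(?pop > 50000)
}
ORDER BY DESC(?pop)
""")
sparql.setReturnFormat(JSON)
for b in sparql.query().convert()["results"]["bindings"]:
    print(b["city"]["value"], b["pop"]["value"])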
This document discusses various approaches for building applications that consume linked data from multiple datasets on the web. It describes characteristics of linked data applications and generic applications like linked data browsers and search engines. It also covers domain-specific applications, faceted browsers, SPARQL endpoints, and techniques for accessing and querying linked data including follow-up queries, querying local caches, crawling data, federated query processing, and on-the-fly dereferencing of URIs. The advantages and disadvantages of each technique are discussed.
The document discusses representing data in the Resource Description Framework (RDF). It describes how relational data can be represented as RDF triples with rows becoming subjects, columns becoming properties, and values becoming objects. It also discusses using URIs instead of internal IDs and names to allow data integration. The document then covers serializing RDF data in different formats like RDF/XML, N-Triples, N3, and Turtle and describes syntax for representing literals, language tags, and abbreviating subject and predicate pairs.
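A small rdflib sketch of the serialization point: the same language-tagged triple rendered in two of the formats named above:

# One graph, two serializations: N-Triples (one triple per line) and
# Turtle (prefixed, abbreviated), as discussed above.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)
g.add((EX.row1, EX.name, Literal("Alice", lang="en")))  # language tag

print(g.serialize(format="nt"))
print(g.serialize(format="turtle"))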
JPA Week3 Entity Mapping / Hexagonal Architecture - Covenant Ko
The document discusses Hexagonal Architecture and its principles. It explains that the core domain layer should not depend on other layers like the data layer. It provides examples of package structures for Hexagonal Architecture and sample code that separates ports and adapters. Case studies are presented on how companies have implemented Hexagonal Architecture for microservices and APIs.
Nicholas Schiller presented on using APIs to customize library services. He demonstrated how to build a web application using the WorldCat Search API that automatically adds Boolean search terms to a user's query and formats the results. The application was built with PHP for server-side scripting, HTML5 for interface design, and jQuery Mobile to optimize for different devices. The presentation provided examples of APIs, guidelines for API projects, and resources for further learning about APIs and programming.
Tech. session: Interoperability and Data FAIRness emerges from a novel combi... - Mark Wilkinson
My presentation to OAI10 - CERN - UNIGE Workshop on Innovations in Scholarly Communication, 21-23 June 2017
University of Geneva.
https://indico.cern.ch/event/405949/contributions/2487823/
A description of the FAIR Accessor and FAIR Projector technologies: REST-compliant approaches to publishing FAIR Metadata and FAIR Data (respectively)
Spanish Ministerio de Economía y Competitividad TIN2014-55993-R
RDFa (Resource Description Framework in Attributes) is a method for embedding RDF metadata within HTML documents. It allows metadata like titles, descriptions and URLs to be added to HTML pages in a way that is readable both by humans and machines. The summary describes how RDFa works by defining things with URIs and assigning them properties and values as triples. It also mentions the RDFa distiller tool that can extract the RDF metadata from HTML pages marked up with RDFa.
RDFa Introductory Course Session 2/4 How RDFa - Platypus
RDFa (Resource Description Framework in Attributes) is a method for embedding RDF metadata within HTML documents. It allows metadata like titles, descriptions and URLs to be added to HTML pages in a way that is readable both by humans and machines. The summary describes how RDFa works by defining resources with URIs and properties, and how this extracted data can be distilled and validated using various RDFa tools on the W3C website.
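An illustrative fragment of the kind of RDFa markup described (vocabulary and values are invented for the example; a distiller such as the W3C tool mentioned above would extract the triples shown in the comments):

# Illustrative RDFa-annotated HTML held in a Python string: the
# 'property' attributes attach machine-readable triples to the
# human-readable content. Vocabulary and values are invented.
html = """
<div xmlns:dc="http://purl.org/dc/terms/" about="http://example.org/page">
  <h1 property="dc:title">An Example Page</h1>
  <p property="dc:description">A page described with RDFa.</p>
</div>
"""
# An RDFa processor would extract triples such as:
# <http://example.org/page> dc:title "An Example Page" .
# <http://example.org/page> dc:description "A page described with RDFa." .
print(html)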
This document discusses how AGROVOC, AGRIS, and the CIARD RING leverage RDF vocabularies and technologies to enable data interoperability. It provides examples of how SPARQL queries can be used to retrieve and link related data across these systems, such as querying AGRIS for center descriptions using their RING URIs, or retrieving bibliographic records for a specific AGRIS center from the AGRIS endpoint. The RING is presented as a public SPARQL endpoint containing linked dataset metadata that uses standards like DCAT and SKOS to describe resources and concepts to facilitate machine-to-machine interactions between systems.
IBC FAIR Data Prototype Implementation slideshow - Mark Wilkinson
Discussion about ways of achieving FAIRness of both metadata and data. Brute force approaches, and more elegant "projection" approaches are shown.
Relevant papers are at:
doi: 10.7717/peerj-cs.110 (https://peerj.com/articles/cs-110/)
doi: 10.3389/fpls.2016.00641 (https://doi.org/10.3389/fpls.2016.00641)
Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R
Finding knowledge, data and answers on the Semantic Web - ebiquity
Web search engines like Google have made us all smarter by providing ready access to the world's knowledge whenever we need to look up a fact, learn about a topic or evaluate opinions. The W3C's Semantic Web effort aims to make such knowledge more accessible to computer programs by publishing it in machine understandable form.
As the volume of Semantic Web data grows software agents will need their own search engines to help them find the relevant and trustworthy knowledge they need to perform their tasks. We will discuss the general issues underlying the indexing and retrieval of RDF based information and describe Swoogle, a crawler based search engine whose index contains information on over a million RDF documents.
We will illustrate its use in several Semantic Web related research projects at UMBC, including a distributed platform for constructing end-to-end use cases that demonstrate the Semantic Web's utility for integrating scientific data. We describe ELVIS (the Ecosystem Location Visualization and Information System), a suite of tools for constructing food webs for a given location, and Triple Shop, a SPARQL query interface which searches the Semantic Web for data relevant to a given query. ELVIS functionality is exposed as a collection of web services, and all input and output data are expressed in OWL, thereby enabling its integration with Triple Shop and other Semantic Web resources.
A practical guide on how to query and visualize Linked Open Data with eea.daviz Plone add-on.
In this presentation you will get an introduction to Linked Open Data and where it is applied. We will see how to query this large open data cloud over the web with the language SPARQL. We will then go through real examples and create interactive and live data visualizations with full data tracebility using eea.sparql and eea.daviz.
Presented at the PLOG2013 conference http://www.coactivate.org/projects/plog2013
The document provides an introduction to the Semantic Web, including:
- The Semantic Web extends the current web by giving information well-defined meaning so computers and people can better work together.
- It aims to make data easier for machines to publish, share, find and understand through smarter data rather than just smarter machines.
- Examples of Semantic Web applications include Bio2RDF, which provides structured data about genes, and the BBC publishing semantic metadata about musical artists.
Presentation given at the CILIP Cataloguing and Indexing Group Conference 2014 "The Impact of Metadata" #cig14 on Monday 8 September 2014 at the University of Kent, Canterbury.
The document discusses the motivation for developing Semantic Automated Discovery and Integration (SADI) services as a way to represent important information that cannot be represented directly on the Semantic Web, such as data from analytical algorithms and statistical analyses. It presents SADI as a design pattern for making web services interoperable with the Semantic Web by explicitly labeling the relationships between entities.
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati... - Mark Wilkinson
This slide deck accompanies the manuscript "Interoperability and FAIRness through a novel combination of Web technologies", submitted to PeerJ Computer Science: https://doi.org/10.7287/peerj.preprints.2522v1
It describes the output of the "Skunkworks" FAIR implementation group, who were tasked with building a prototype infrastructure that would fulfill the FAIR Principles for scholarly data publishing. We show how a novel combination of the Linked Data Platform, RDF Mapping Language (RML) and Triple Pattern Fragments (TPF) can be combined to create a scholarly publishing infrastructure that is markedly interoperable, at both the metadata and the data level.
This slide deck (or something close) will be presented at the Dutch Techcenter for Life Sciences Partners Workshop, November 4, 2016.
Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R
Cool Informatics Tools and Services for Biomedical Research - David Ruau
This document provides an overview of bioinformatics tools and services for analyzing big data in biomedical research. It discusses traditional bioinformatics tools, analyzing genomic data from microarrays and next-generation sequencing without and with code, interpreting results using protein interaction networks and pathways, tools for data storage, cleaning and visualization, and making research reproducible. Galaxy, R, and programming are presented as useful for automated, reproducible analysis of large genomic datasets.
FlyWeb is a project that integrates biological data from multiple sources using Semantic Web technologies. It allows users to search for gene expression images, sequences, publications and other data about genes. The summary includes:
- FlyWeb integrates data from sources like FlyBase, BDGP and FlyTED about Drosophila genes, linking gene names, expressions images, sequences and publications.
- It uses Semantic Web tools to create a unified application, accessing data through SPARQL queries to different SPARQL endpoints for each source.
- Challenges include mapping different gene name vocabularies and improving performance of case-insensitive text searches in SPARQL. Future work aims to add more data sources and
As of Drupal 7 we'll have RDFa markup in core. In this session I will:
- explain the implications of this and why it matters
- give a short introduction to the Semantic Web, RDF, RDFa and SPARQL in human language
- give a short overview of the RDF modules that are available in contrib
- talk about some of the potential use cases of all these magical technologies
This document describes the Bio2RDF project, which aims to integrate biological data from multiple sources using Semantic Web technologies. It proposes applying linked data principles and semantic graph ranking methods to provide an integrated search interface for querying post-genomic knowledge about human and mouse. The results section describes the initial Bio2RDF knowledge map integrating data from 30 sources, with statistics on its coverage. A demo query about Paget disease is also presented to illustrate searching the data using SPARQL.
This document summarizes a project to convert biomedical databases like Reactome, CHEBI, UniProt and GO into JSON-LD format and load them into Elasticsearch for full-text search and exploration in Siren Investigate. Key databases were extracted via APIs, converted to JSON-LD using Elasticsearch pipelines, and loaded into Elasticsearch. Visualizations and a relational data model were then created in Siren Investigate to allow faceted browsing and exploration of relationships between datasets. The project demonstrated an effective method for integrating and exploring life science knowledge graphs. Future work includes the Kibio.science project to apply these techniques on their own infrastructure.
This document summarizes a project that aims to expose linked open data from the KaBOB project as JSON-LD documents accessible via REST services using Elasticsearch. The goals are to experiment with Elasticsearch as a triplestore, create a mashup like KaBOB, and explore and visualize linked data with Kibana. Key steps include loading KaBOB ontology and data sources into Elasticsearch, transforming RDF to JSON-LD with Talend ETL, building REST services with Talend ESB, and exploring results with Kibana visualization dashboard.
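A minimal sketch of the "JSON-LD into Elasticsearch" step with the official Python client (index name and document shape are assumptions for illustration):

# Sketch: index a JSON-LD document into Elasticsearch so it can be
# searched full-text and explored in Kibana. Index name and document
# contents are illustrative assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

doc = {
    "@context": {"name": "http://schema.org/name"},
    "@id": "http://example.org/protein/P04637",
    "name": "Cellular tumor antigen p53",
}

es.index(index="kabob", id=doc["@id"], document=doc)
hits = es.search(index="kabob", query={"match": {"name": "p53"}})
print(hits["hits"]["total"])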
Building mashups from Linked Data using Bio2RDF's Talend components - François Belleau, Vincent Emonet, Arnaud Droit, Centre de Biologie Computationnelle, Centre de recherche du CHUQ
Producing, Publishing and Consuming Linked Data: Three lessons from the Bio2RD... - François Belleau
The document discusses three lessons learned from the Bio2RDF project about producing, publishing, and consuming linked data. Lesson 1 is that data transformation to RDF is an ETL task best done using frameworks like Talend. Lesson 2 is to publish semantic data using triplestores like Virtuoso and make SPARQL endpoints publicly available. Lesson 3 is that semantic data sources can be consumed in various ways, including SPARQL queries, HTTP requests, and SOAP services.
This document outlines how to use Bio2RDF with a Virtuoso server. It introduces the data integration problem in bioinformatics and describes what Bio2RDF Atlas is. It then provides an overview of semantic web technologies before explaining Bio2RDF's approach. The document then details how to install Virtuoso, load N3 databases into the graph, and query the graph with SPARQL and text search.
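A hedged sketch of the final step, querying a local Virtuoso instance (http://localhost:8890/sparql is Virtuoso's default endpoint; bif:contains is Virtuoso's full-text search extension):

# Sketch: SPARQL with Virtuoso full-text search against a local
# server. The text-match term is an arbitrary example.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:8890/sparql")
sparql.setQuery("""
SELECT ?s ?o
WHERE {
  ?s ?p ?o .
  ?o bif:contains "'kinase'" .
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)
for b in sparql.query().convert()["results"]["bindings"]:
    print(b["s"]["value"], b["o"]["value"])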
Bio2RDF: A Semantic Web Atlas of post genomic knowledge about Human and Mouse - François Belleau
This document introduces Bio2RDF, a semantic web atlas that integrates post-genomic data about human and mouse to make the data more accessible and linkable. It outlines the problem of data integration across multiple databases, proposes using semantic web and linked data approaches to solve this issue, and presents initial results from Bio2RDF including the first knowledge map and semantic ranking capabilities. It also demonstrates SPARQL querying and outlines future work and conclusions.
This document discusses the Bio2RDF project, which aims to convert life sciences data from various sources and formats like XML, KGML, and CSV into the RDF format. It notes that there are too many knowledge sources in different formats for scientists to easily integrate. The document proposes adopting RDF and converting popular knowledge sources into RDF as a community effort through the Bio2RDF project. It describes RDF and the Protege ontology editor, showing examples of loading ontologies and pathways converted to RDF into Protege for browsing and visualization. The Bio2RDF website is presented as a central repository for RDF conversion tools and files.
Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge System - François Belleau
1) The document discusses the Bio2RDF project, which aims to integrate bioinformatics knowledge from multiple sources using semantic web technologies and RDF.
2) Bio2RDF converts bioinformatics documents into RDF format, normalizes URIs, and loads the RDF triples into a triplestore to enable complex queries across integrated data.
3) As an example, the document demonstrates a SPARQL query to find genes related to Parkinson's disease that are involved in pathways according to the Kegg database.
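A hedged sketch of what such a query might look like (the endpoint, predicates and filters follow Bio2RDF's general style but are assumptions, not the presentation's exact demo):

# Hedged sketch of the query described in (3): genes linked to
# Parkinson's disease that participate in KEGG pathways. Predicate
# choices and filters are illustrative assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://bio2rdf.org/sparql")
sparql.setQuery("""
SELECT DISTINCT ?gene ?pathway
WHERE {
  ?disease rdfs:label ?dl .
  FILTER(CONTAINS(LCASE(STR(?dl)), "parkinson"))
  ?gene ?geneToDisease ?disease .   # gene associated with the disease
  ?gene ?geneToPathway ?pathway .   # gene participating in a pathway
  FILTER(CONTAINS(STR(?pathway), "kegg"))
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()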
The proposed solution: Bio2RDF solves the problem of data integration in bioinformatics by applying the Semantic Web approach based on RDF, OWL and SPARQL technologies.
Web of data subway map from W3C: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/#(1)
17. "Wouldn't it be great if you were able to organize all this information based on your own terms, instead of based on the application you use to access the information ?” Ramanathan V. Guha RDF initiator http://cgi.netscape.com/columns/techvision/innovators_rg.html
The same in RDF/XML:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exterms="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <exterms:creation-date>August 16, 1999</exterms:creation-date>
  </rdf:Description>
</rdf:RDF>
The same in N-Triples:

<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" .
It is a technology stack: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
It is a distributed architecture: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
Linked Data cloud evolution (figures: the linked data cloud in May 2007 vs. March 2009): http://linkeddata.org/ http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics
Bio2RDF has 3 mirror sites: http://cu.bio2rdf.org/, http://qut.bio2rdf.org/ and http://quebec.bio2rdf.org/
Main REST services: describe a resource by a dereferenceable URI (http://bio2rdf.org/ns:id) and global services over federated endpoints (http://bio2rdf.org/links/ns:id).
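A minimal sketch of dereferencing such a URI from Python (the namespace:identifier pair is an illustrative instance of the ns:id pattern):

# Sketch: dereference a Bio2RDF resource following the ns:id pattern
# above; "geneid:4157" is an illustrative namespace:id pair, and the
# Accept header asks for an RDF serialization.
import requests

uri = "http://bio2rdf.org/geneid:4157"
resp = requests.get(uri, headers={"Accept": "application/rdf+xml"}, timeout=30)
print(resp.status_code, resp.headers.get("Content-Type"))
print(resp.text[:500])  # first few hundred characters of the description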
The mashup principle: to answer a complex question we first need to build a specific database, a mashup, to which we submit the appropriate query.
Cognoscope, new definition: a Cognoscope is an instrument to explore and collect topics from the Linked Data cloud of SPARQL endpoints. It permits querying over a distributed network of knowledge resources.
The magnifying effect depends on the density of links between resources (entity links), which is a by-product of human intellectual activity in the social network.
The filtering effect is based on the inherent semantics of the RDF graph, described using types and predicates.
Cognoscope function: how can we submit a complex query over the network of SPARQL endpoints? By using a workflow that fetches from individual SPARQL endpoints; we use the workflow to build the mashup.
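One way to realize such a workflow in a single query is SPARQL 1.1 federation with the SERVICE keyword (a generic sketch; endpoint URLs and the join pattern are illustrative assumptions, and the Cognoscope tooling may orchestrate endpoints differently):

# Generic sketch: one SPARQL 1.1 query joining a local mashup store
# with a remote endpoint via SERVICE.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:8890/sparql")  # local store
sparql.setQuery("""
SELECT ?gene ?label WHERE {
  ?gene a <http://example.org/Gene> .      # local data (assumed class)
  SERVICE <https://dbpedia.org/sparql> {    # remote enrichment
    ?gene rdfs:label ?label .
  }
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()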