The Impact of Data Caching on Query Execution for Linked Data

Olaf Hartig
http://olafhartig.de/foaf.rdf#olaf
@olafhartig

Database and Information Systems Research Group
Humboldt-Universität zu Berlin
Can we query the Web of Data as if it were a single, giant database?


    SELECT DISTINCT ?i ?label
    WHERE {
      ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ;
            foaf:topic_interest ?i .
      OPTIONAL {
        ?i rdfs:label ?label
        FILTER( LANG(?label)="en" || LANG(?label)="" )
      }
    }
    ORDER BY ?label




Our approach: Link Traversal Based Query Execution [ISWC'09]
Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data       2
Main Idea
 ●   Intertwine query evaluation with traversal of data links
 ●   We alternate between:
     ●   Evaluate parts of the query (triple patterns) on a continuously augmented set of data
     ●   Look up URIs in intermediate solutions and add retrieved data to the query-local dataset
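The alternation described on this slide can be sketched in a few lines of Python. This is a minimal, sequential illustration and not the implementation from [ISWC'09]: the `WEB` dictionary is a hypothetical stand-in for real URI lookups, the project URI `http://example.org/AlicesPrj` and the name `Example Project` are made up (the slides elide both), and predicates are written as plain labels, as in the slide figures.

```python
# Hypothetical stand-in for the Web: URI -> triples returned when that URI is looked up.
# The project URI and the project name are invented for this sketch.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "project", "http://example.org/AlicesPrj")],
    "http://example.org/AlicesPrj": [("http://example.org/AlicesPrj", "name", "Example Project")],
}

# The example query from the slides, as a list of triple patterns.
QUERY = [
    ("http://bob.name", "knows", "?acq"),
    ("?acq", "project", "?prj"),
    ("?prj", "name", "?prjName"),
]


def is_var(term):
    return term.startswith("?")


def is_uri(term):
    return term.startswith("http://")


def look_up(uri, dataset, looked_up):
    """Look up a URI at most once and add the retrieved triples to the query-local dataset."""
    if uri in looked_up:
        return
    looked_up.add(uri)
    dataset.update(WEB.get(uri, []))


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None if impossible."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def execute(patterns):
    dataset, looked_up = set(), set()
    # Starting points: URIs mentioned in the query itself.
    for pattern in patterns:
        for term in pattern:
            if is_uri(term):
                look_up(term, dataset, looked_up)
    solutions = [{}]
    for pattern in patterns:
        # Step 1: evaluate one triple pattern on the current query-local dataset.
        solutions = [
            extended
            for binding in solutions
            for triple in list(dataset)
            if (extended := match(pattern, triple, binding)) is not None
        ]
        # Step 2: look up URIs from the intermediate solutions and augment the dataset.
        for binding in solutions:
            for value in binding.values():
                if is_uri(value):
                    look_up(value, dataset, looked_up)
    return solutions, looked_up


solutions, looked_up = execute(QUERY)
print(solutions)
print(sorted(looked_up))
```

The sketch evaluates one triple pattern at a time and only then performs lookups; a real engine would interleave both more finely and would have to handle failing or slow lookups, which is left out here.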
[Figure: the example query, drawn as a graph pattern next to the query-local dataset. The query consists of three triple patterns:]

    <http://bob.name>  knows    ?acq .
    ?acq               project  ?prj .
    ?prj               name     ?prjName .
[Animation: the URI http://bob.name from the query is looked up, and the retrieved data is added to the query-local dataset. The final frame of the lookup carries the label “Descriptor object”.]
[Animation: the query-local dataset now contains the triple

    http://bob.name  knows  http://alice.name

Evaluating the first triple pattern on it yields the intermediate solution:]

    ?acq
    http://alice.name
[Animation: the URI http://alice.name from the intermediate solution (?acq) is looked up, and the retrieved data is added to the query-local dataset.]
[Animation: the query-local dataset now contains the triple

    http://alice.name  project  http://.../AlicesPrj

Evaluating the second triple pattern extends the intermediate solution:]

    ?acq                 ?prj
    http://alice.name    http://.../AlicesPrj
[Animation: the third triple pattern yields a further intermediate solution:]

    ?prj                   ?prjName
    http://.../AlicesPrj   “…”

[Combining it with the previous intermediate solution gives the query result:]

    ?acq                 ?prj                   ?prjName
    http://alice.name    http://.../AlicesPrj   “…”
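The last two animation frames combine two tables of intermediate solutions on their shared variable `?prj`. A minimal sketch of such a join follows, assuming solutions are represented as Python dictionaries from variable names to values. The project URIs and names below are hypothetical placeholders, since the slides elide them, and the second row of `prj_name` is added only to show that incompatible rows are dropped.

```python
def compatible(left, right):
    """Two solution mappings are compatible if they agree on every shared variable."""
    return all(left[var] == right[var] for var in left.keys() & right.keys())


def join(left_solutions, right_solutions):
    """Merge every compatible pair of solution mappings."""
    return [
        {**left, **right}
        for left in left_solutions
        for right in right_solutions
        if compatible(left, right)
    ]


PRJ = "http://example.org/AlicesPrj"  # hypothetical; the slides elide the real URI

acq_prj = [{"?acq": "http://alice.name", "?prj": PRJ}]
prj_name = [
    {"?prj": PRJ, "?prjName": "Example Project"},                    # hypothetical name
    {"?prj": "http://example.org/OtherPrj", "?prjName": "Unrelated"},  # not joinable
]

combined = join(acq_prj, prj_name)
print(combined)
```

Only the row that agrees on `?prj` survives, which mirrors the single result row with the columns `?acq`, `?prj` and `?prjName` on the slide.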
Characteristics
 ●   Link traversal based query execution:
     ●   Evaluation on a continuously augmented dataset
     ●   Discovery of potentially relevant data during execution
     ●   Discovery driven by intermediate solutions

 ●   Main advantage:
     ●   No need to know all data sources in advance

 ●   Limitations:
     ●   Query has to contain a URI as a starting point
     ●   Ignores data that is not reachable* by the query execution

 * formal definition in [LDOW'11a]
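Both limitations can be illustrated with a small reachability sketch. It uses a deliberately simplified notion of reachability (follow every URI mentioned in retrieved data, starting from the URIs in the query), which is an assumption of this example and not the formal definition from [LDOW'11a]. The `WEB` dictionary and the URI `http://carol.example` are hypothetical.

```python
from collections import deque

# Hypothetical stand-in for the Web: URI -> triples returned when that URI is looked up.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "knows", "http://bob.name")],
    # Carol links to Bob, but nothing links to Carol.
    "http://carol.example": [("http://carol.example", "knows", "http://bob.name")],
}


def reachable(seed_uris):
    """Return the URIs whose data can be discovered by following links from the seeds."""
    seen = set()
    queue = deque(seed_uris)
    while queue:
        uri = queue.popleft()
        if uri in seen or uri not in WEB:
            continue
        seen.add(uri)
        for triple in WEB[uri]:
            # Every URI mentioned in retrieved data becomes a candidate for lookup.
            queue.extend(term for term in triple if term.startswith("http://"))
    return seen


from_bob = reachable(["http://bob.name"])
from_nothing = reachable([])
print(sorted(from_bob))
print(from_nothing)
```

Starting from `http://bob.name`, Carol's data is never discovered because no retrieved triple mentions her URI, and without any URI in the query there is nothing to look up at all.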
The Issue
[Figure: a second example query, drawn as a graph pattern next to the query-local dataset. The query consists of three triple patterns:]

    <http://bob.name>  knows     ?acq .
    ?acq               interest  ?i .
    ?i                 label     ?iLabel .
[Animation: the URI http://bob.name is looked up; the query-local dataset then contains the triple

    http://bob.name  knows  http://alice.name

A result table with the columns ?acq, ?i and ?iLabel appears; the extracted text shows no rows for it.]
The Issue
 Query
                 ?acq interest
                                              ?i
               s
            ow




                                          label
          kn




    http://bob.name
                                       ?iLabel

                                                                                 query-local
                                                                                  dataset


                                         Query
       http://bob.name
                                     ?prjName
           s
        ow




                                           me
      kn




                                        na




   ?acq                                                                       query-local
               project            ?prj
                                                                               dataset

Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data                    27
Reusing the Query-Local Dataset
 [Figure, built up over several slides: the second query
  (http://bob.name knows ?acq; ?acq project ?prj; ?prj name ?prjName) is
  evaluated over the query-local dataset of the first query. That dataset
  already contains (http://bob.name knows http://alice.name), which yields
  the binding ?acq = http://alice.name.]

Hypothesis




        Re-using the query-local dataset (a.k.a. data caching)
                            may benefit
             query performance + result completeness
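
A minimal sketch of the reuse idea, under simplifying assumptions: the Web is simulated by a hypothetical `WEB` dictionary, predicates are plain strings, and the class and function names are illustrative rather than taken from the author's implementation. It only illustrates the saved URI lookups, not the additional results.

```python
# Hypothetical stand-in for the Web: dereferencing a URI yields RDF triples.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [
        ("http://alice.name", "interest", "http://example.org/topic/ld"),
        ("http://alice.name", "project", "http://example.org/prj/x"),
    ],
    "http://example.org/topic/ld": [
        ("http://example.org/topic/ld", "label", "Linked Data")],
    "http://example.org/prj/x": [
        ("http://example.org/prj/x", "name", "Project X")],
}


class QueryLocalDataset:
    """Set of triples that grows as URIs are looked up."""

    def __init__(self):
        self.triples = set()
        self.retrieved = set()   # URIs that were already dereferenced
        self.lookups = 0         # number of simulated HTTP lookups

    def look_up(self, uri):
        if uri in self.retrieved:
            return               # data is already in the dataset
        self.retrieved.add(uri)
        self.lookups += 1
        self.triples.update(WEB.get(uri, ()))


def match(bound, triple, solution):
    """Return the solution extended by this triple, or None if it does not match."""
    extended = dict(solution)
    for pattern_term, value in zip(bound, triple):
        if pattern_term.startswith("?"):
            if extended.setdefault(pattern_term, value) != value:
                return None
        elif pattern_term != value:
            return None
    return extended


def execute(patterns, dataset):
    """Evaluate triple patterns one by one, looking up URIs as they become known."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            bound = tuple(solution.get(term, term) for term in pattern)
            for term in bound:
                if term.startswith("http://"):
                    dataset.look_up(term)
            for triple in dataset.triples:
                candidate = match(bound, triple, solution)
                if candidate is not None:
                    extended.append(candidate)
        solutions = extended
    return solutions


QUERY_1 = [("http://bob.name", "knows", "?acq"),
           ("?acq", "interest", "?i"),
           ("?i", "label", "?iLabel")]
QUERY_2 = [("http://bob.name", "knows", "?acq"),
           ("?acq", "project", "?prj"),
           ("?prj", "name", "?prjName")]

shared = QueryLocalDataset()
result_1 = execute(QUERY_1, shared)
lookups_after_first = shared.lookups
result_2 = execute(QUERY_2, shared)   # reuses the data retrieved for QUERY_1
```

Running `QUERY_2` on the shared dataset only has to look up the project URI, whereas a fresh `QueryLocalDataset()` would look up all three URIs again.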




Contributions
 ●   Systematic analysis of the impact of data caching
     ●   Theoretical foundation*
     ●   Conceptual analysis*
     ●   Empirical evaluation of the potential impact

                                                     * see [LDOW'11a]




 ●   Out of scope: Caching strategies (replacement, invalidation)


Experiment – Scenario



                                                                              ●   Information about the
                                                                                  distributed social
                                                                                  network of FOAF
                                                                                  profiles
                                                                              ●   5 types of queries
                                                                              ●   Experiment Setup:
                                                                                  ●   20 persons
                                                                                  ●   Sequential use
                                                                                  ➔   100 queries
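
The setup can be sketched as follows. The person URIs are hypothetical placeholders, and the person-major ordering (all five query types for one person, then the next person) is an assumption, inferred only from the consecutive query numbers 61 to 65 that the result slides show for a single person.

```python
# The five query types, named as on the result slides.
TEMPLATES = ["ContactInfo", "UnsetProps", "2ndDegree1", "2ndDegree2", "Incoming"]


def build_sequence(person_uris, templates=TEMPLATES):
    """Return (template, person) pairs: every template for each person in turn."""
    return [(template, person)
            for person in person_uris
            for template in templates]


# Hypothetical URIs standing in for the 20 FOAF profiles of the experiment.
persons = [f"http://example.org/person/{n}" for n in range(1, 21)]
sequence = build_sequence(persons)   # 20 persons x 5 query types
```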

Experiment – Single Query
 ●   no reuse experiment:
     ●   No data caching
 ●   reuse per query experiment:
     ●   Reuse of query-local dataset for 3 executions of each query
     ●   Third execution measured
 [Figure, built up over several slides: bar charts comparing "no reuse" and
  "reuse per query" for ContactInfoDanBri (Query No. 61), UnsetPropsDanBri
  (Query No. 62), 2ndDegree1DanBri (Query No. 63), 2ndDegree2DanBri
  (Query No. 64) and IncomingDanBri (Query No. 65).
  Left panel: number of query results (axis from 0 to 60).
  Right panel: query execution time in seconds (logarithmic axis from 0.01 to 100).]

Experiment – Complete Sequence
 ●   reuse all queries experiment:
     ●   Reuse of the query-local dataset for the complete
         sequence of all 100 queries
 [Figure, built up over several slides: bar charts comparing "no reuse",
  "reuse per query" and "reuse all queries" for ContactInfoDanBri
  (Query No. 61), UnsetPropsDanBri (Query No. 62), 2ndDegree1DanBri
  (Query No. 63), 2ndDegree2DanBri (Query No. 64) and IncomingDanBri
  (Query No. 65).
  Left panel: number of query results (axis from 0 to 60).
  Right panel: query execution time in seconds (logarithmic axis from 0.01 to 100).]

Outlook



 ●   Requirements of a data cache:
     ●   Replacement mechanism
     ●   Coherency mechanism




Cache Replacement
 ●   Cache full → remove descriptor objects
 ●   Replacement strategy
     ●   Primary goal: maximize hit rate
     ●   Recency-based
     ●   Frequency-based
     ●   Function-based
     ●   Randomized
 ●   Replacement process
     ●   Watermarks: high and low
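
As an illustration, a recency-based strategy with high and low watermarks could look like the sketch below. Measuring cache size in triples and the class name `WatermarkLRUCache` are assumptions made for this example; the slides do not prescribe either.

```python
from collections import OrderedDict


class WatermarkLRUCache:
    """Recency-based cache of descriptor objects with high/low watermarks.

    Eviction starts once the size exceeds the high watermark and continues
    until the size is at or below the low watermark.
    """

    def __init__(self, high_watermark, low_watermark):
        if not 0 <= low_watermark < high_watermark:
            raise ValueError("need 0 <= low_watermark < high_watermark")
        self.high = high_watermark
        self.low = low_watermark
        self._objects = OrderedDict()   # URI -> triples, least recently used first
        self.size = 0                   # current size, counted in triples

    def get(self, uri):
        if uri not in self._objects:
            return None                 # cache miss
        self._objects.move_to_end(uri)  # mark as most recently used
        return self._objects[uri]

    def put(self, uri, triples):
        if uri in self._objects:
            self.size -= len(self._objects.pop(uri))
        self._objects[uri] = triples
        self.size += len(triples)
        if self.size > self.high:
            self._evict()

    def _evict(self):
        # Remove least recently used descriptor objects first.
        while self.size > self.low and self._objects:
            _, triples = self._objects.popitem(last=False)
            self.size -= len(triples)
```

Using two watermarks means a single eviction run frees a batch of space, so the replacement process does not have to run on every insertion.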



Studying Cache Replacement?

                      “Web cache replacement in its general
                       form seems to be a solved topic.”
                                           S. Podlipnig and L. Böszörményi: Survey of
                                           Web Cache Replacement Strategies, 2003

 ●   6 quad indexes* in main memory
     ●   Size grows linearly with the number of quads
 ●   Example (after reuse all queries experiment, 100 queries):
     ●   905 descriptor objects, 745,756 triples overall
     ●   ca. 103 MB
 ➔   Available main memory is hardly a limiting factor

                                   * as introduced in [LDOW'11b]
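
A back-of-the-envelope reading of these numbers, assuming that "ca. 103 MB" means decimal megabytes and that the linear growth stated above continues to hold (both are assumptions of this sketch):

```python
DESCRIPTOR_OBJECTS = 905
TRIPLES = 745_756
CACHE_BYTES = 103 * 10**6   # "ca. 103 MB", read as decimal megabytes (assumption)

# Average size of a descriptor object and average memory footprint per triple.
triples_per_object = TRIPLES / DESCRIPTOR_OBJECTS
bytes_per_triple = CACHE_BYTES / TRIPLES


def triples_that_fit(memory_bytes):
    """Rough number of triples that fit into the given amount of memory,
    extrapolating linearly from the measured example."""
    return int(memory_bytes // bytes_per_triple)
```

For instance, `triples_that_fit(8 * 10**9)` gives a rough capacity estimate for a hypothetical machine with 8 GB of memory available to the cache.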
Cache Coherency
 ●   Data items in the cache may become inconsistent
 ●   Strong cache consistency
     ●   Server validation
     ●   Client validation
 ●   Weak cache consistency
     ●   Time to live (TTL)
     ●   Adaptive TTL




Client Validation
 ●   Polling every time
 ●   Enables strong cache consistency
 ●   Conditional GET
     ●   Request with If-Modified-Since header
     ●   Possible response: 304 Not Modified
 ●   Not supported by most Linked Data servers
     ●   Experiment based on the CKAN catalog of linked datasets
     ●   Supported for only 41 out of 154 example resources (26.6%) from 110 datasets
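
A conditional GET can be sketched with Python's standard library as shown below. The function name `revalidate` and the injectable `opener` parameter are choices made for this example; `urllib` reports a 304 response as an `HTTPError`, which the sketch turns into a "not modified" result.

```python
import urllib.error
import urllib.request


def revalidate(url, cached_last_modified, opener=urllib.request.urlopen):
    """Check a cached object with a conditional GET.

    Returns (modified, body, last_modified). If the server answers
    304 Not Modified, the cached copy is still valid and body is None.
    """
    request = urllib.request.Request(
        url, headers={"If-Modified-Since": cached_last_modified})
    try:
        with opener(request) as response:
            # 200 OK: the object changed (or the server ignored the condition).
            return True, response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return False, None, cached_last_modified
        raise
```

A server that does not support validation simply answers 200 with the full document every time, so polling it saves nothing over an ordinary GET.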


Time to Live (TTL)
 ●   TTL field: lifetime estimate for each object
 ●   Supported by HTTP response headers:
     ●   Expires (provided for 37.0% of the tested resources)
     ●   Cache-Control: max-age (provided for 37.7% of the tested resources)
 ●   When TTL elapses, object is invalid
 ●   Accessing an invalid object → retrieve object again
     ●   Conditional GET (supported for 26.6% of the tested resources)
 ●   Alternative (due to lack of support in Linked Data servers):
     ●   Assume a default TTL for each object
     ●   Ordinary GET
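
A minimal sketch of deriving a TTL from the response headers, with a fallback to a default TTL. The default of one hour and the function names are hypothetical; the headers are assumed to arrive as a plain dictionary with canonical header names, and `retrieved_at` must be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

DEFAULT_TTL = timedelta(hours=1)   # hypothetical default, not from the slides


def ttl_from_headers(headers, retrieved_at, default_ttl=DEFAULT_TTL):
    """Derive a TTL from Cache-Control: max-age or Expires, else use the default."""
    # max-age takes precedence over Expires, as specified for HTTP caching.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))
    expires = headers.get("Expires")
    if expires:
        try:
            remaining = parsedate_to_datetime(expires) - retrieved_at
            return max(remaining, timedelta(0))
        except (TypeError, ValueError):
            pass   # unparsable date: fall through to the default
    return default_ttl


def is_valid(retrieved_at, ttl, now):
    """A cached object is valid until its TTL has elapsed."""
    return now < retrieved_at + ttl
```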

Adaptive TTL
 ●   Assumption:
     ●   The older an object, the less likely it is to be modified
 ●   TTL is a percentage of the age:
     ●   Threshold = 10% ; age = 30 days → TTL = 3 days
     ●   Last verification: yesterday → invalidation in 2 days
 ●   HTTP-based implementation:                                               35.1%
     ●   Calculation of age: use Last-Modified response header
     ●   Verification with conditional GET
 ●   Alternative (due to lack of support in Linked Data servers):
     ●   Assume Last-Modified is time of first retrieval
     ●   Verification by comparing a response to the current version
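
The arithmetic of the example above can be written down as a short sketch. The function names are illustrative, and the age is taken at the time of the last verification, which is an assumption about how the slide's numbers are meant.

```python
from datetime import datetime, timedelta


def adaptive_ttl(last_modified, verified_at, threshold_percent=10):
    """TTL as a percentage of the object's age at the time of verification."""
    age = verified_at - last_modified
    return age * threshold_percent / 100


def invalidation_time(last_modified, verified_at, threshold_percent=10):
    """Point in time at which the cached object becomes invalid."""
    return verified_at + adaptive_ttl(last_modified, verified_at, threshold_percent)


# Example from the slide: the object was 30 days old when verified yesterday.
now = datetime(2011, 3, 29)                        # arbitrary "today"
verified_at = now - timedelta(days=1)
last_modified = verified_at - timedelta(days=30)

ttl = adaptive_ttl(last_modified, verified_at)     # 10% of 30 days
remaining = invalidation_time(last_modified, verified_at) - now
```

If a server sends no Last-Modified header, the time of first retrieval can be passed as `last_modified`, in line with the alternative listed above.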
Summary
 ●   Systematic analysis of the impact of data caching
     ●   Theoretical foundation
     ●   Conceptual analysis
     ●   Empirical evaluation
 ●   Main findings:
     ●   Additional results possible (for semantically similar queries)
     ●   Impact on performance may be positive but also negative
 ●   Future work:
     ●   Analysis of caching strategies in our context
     ●   Main issue: invalidation


Backup Slides




Contributions
 ●   Theoretical foundation (extension of the original definition)
     ●   Reachability by a $D_{seed}$-initialized execution of a BGP query $b$
     ●   $D_{seed}$-dependent solution for a BGP query $b$
     ●   Reachability $R(B)$ for a serial execution of $B = b_1, \ldots, b_n$
     ➔   Each solution for $b_{cur}$ is also an $R(B)$-dependent solution for $b_{cur}$
 ●   Conceptual analysis of the impact of data caching
     ●   Performance factor: $p(b_{cur}, B) = c(b_{cur}, [\,]) - c(b_{cur}, B)$
     ●   Serendipity factor: $s(b_{cur}, B) = b(b_{cur}, B) - b(b_{cur}, [\,])$
 ●   Empirical verification of the potential impact

 ●   Out of scope: Caching strategies (replacement, invalidation)
Query Template Contact
 SELECT * WHERE {
    <PERSON> foaf:knows ?p .

    OPTIONAL { ?p foaf:name ?name }
    OPTIONAL { ?p foaf:firstName ?firstName }
    OPTIONAL { ?p foaf:givenName ?givenName }
    OPTIONAL { ?p foaf:givenname ?givenname }
    OPTIONAL { ?p foaf:familyName ?familyName }
    OPTIONAL { ?p foaf:family_name ?family_name }
    OPTIONAL { ?p foaf:lastName ?lastName }
    OPTIONAL { ?p foaf:surname ?surname }

    OPTIONAL { ?p foaf:birthday ?birthday }

    OPTIONAL { ?p foaf:img ?img }

    OPTIONAL { ?p foaf:phone ?phone }
    OPTIONAL { ?p foaf:aimChatID ?aimChatID }
    OPTIONAL { ?p foaf:icqChatID ?icqChatID }
    OPTIONAL { ?p foaf:jabberID ?jabberID }
    OPTIONAL { ?p foaf:msnChatID ?msnChatID }
    OPTIONAL { ?p foaf:skypeID ?skypeID }
    OPTIONAL { ?p foaf:yahooChatID ?yahooChatID }
 }

Query Template UnsetProps
 SELECT DISTINCT ?result ?resultLabel WHERE
 {
    ?result rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> .
    ?result rdfs:domain foaf:Person .

       OPTIONAL { <PERSON> ?result ?var0 }
       FILTER ( !bound(?var0) )

       <PERSON> foaf:knows ?var2 .
       ?var2 ?result ?var3 .
       ?result rdfs:label ?resultLabel .
       ?result vs:term_status ?var1 .
 }
 ORDER BY ?var1




Query Template Incoming
 SELECT DISTINCT ?result WHERE
 {
   ?result foaf:knows <PERSON> .

       OPTIONAL
       {
         ?result foaf:knows ?var1 .
         FILTER ( <PERSON> = ?var1 )
         <PERSON> foaf:knows ?result .
       }
       FILTER ( !bound(?var1) )
 }


Query Template 2ndDegree1
 SELECT DISTINCT ?result WHERE
 {
    <PERSON> foaf:knows ?p1 .
    <PERSON> foaf:knows ?p2 .
    FILTER ( ?p1 != ?p2 )

       ?p1 foaf:knows ?result .
       FILTER ( <PERSON> != ?result )
       ?p2 foaf:knows ?result .

       OPTIONAL {
          <PERSON> ?knows ?result .
          FILTER ( ?knows = foaf:knows )
       }
       FILTER ( !bound(?knows) )
 }
Query Template 2ndDegree2
 SELECT DISTINCT ?result WHERE
 {
    <PERSON> foaf:knows ?p1 .
    <PERSON> foaf:knows ?p2 .
    FILTER ( ?p1 != ?p2 )

       ?result foaf:knows ?p1 .
       FILTER ( <PERSON> != ?result )
       ?result foaf:knows ?p2 .

       OPTIONAL {
          <PERSON> ?knows ?result .
          FILTER ( ?knows = foaf:knows )
       }
       FILTER ( !bound(?knows) )
 }
Experiment – Single Query
 Experiment          Avg.¹ number of          Avg.¹ hit rate      Avg.¹ query
                     query results                                execution time
                     (std. dev.)              (std. dev.)         (std. dev.)
 no reuse            27.37 (140.49)           0.849 (0.205)       64.95 s (124.50)
 reuse per query     26.71 (148.77)           1 (0)               0.02 s (0.07)

 ¹ Averaged over all 100 queries

 ●   In the ideal case for $B_{upper} = [b_{cur}, b_{cur}]$:
     ●   $p_{upper}(b_{cur}, B_{upper}) = c(b_{cur}, [\,]) - c(b_{cur}, B_{upper}) = c(b_{cur}, [\,])$
     ●   $s_{upper}(b_{cur}, B_{upper}) = b(b_{cur}, B_{upper}) - b(b_{cur}, [\,]) = 0$
 ●   Summary (measurement errors aside):
     ●   Same number of query results
     ●   Significant improvements in query performance

Experiment – Complete Sequence
 Experiment          Avg.¹ number of          Avg.¹ hit rate      Avg.¹ query
                     query results                                execution time
                     (std. dev.)              (std. dev.)         (std. dev.)
 no reuse            27.37 (140.49)           0.849 (0.205)       64.95 s (124.50)
 reuse per query     26.71 (148.77)           1 (0)               0.02 s (0.07)
 reuse all queries   44.87 (178.36)           0.991 (0.053)       37.91 s (112.94)
 reuse all queries   118.18 (867.07)          0.992 (0.016)       20.61 s (216.61)
 (random orders)

 ¹ Averaged over all 100 queries

 ●   Summary:
     ●   Data cache may provide for additional query results
     ●   Impact on performance may be positive but also negative
 ●   Executing the query sequence in a random order results in
     measurements similar to the given order.
These slides have been created by Olaf Hartig
http://olafhartig.de

This work is licensed under a
Creative Commons Attribution-Share Alike 3.0 License
(http://creativecommons.org/licenses/by-sa/3.0/)




Web and Graphics Designing Training in Rajpura
Erginous Technology
 
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à GenèveUiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPathCommunity
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 
Heap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and DeletionHeap, Types of Heap, Insertion and Deletion
Heap, Types of Heap, Insertion and Deletion
Jaydeep Kale
 
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptxWebinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
MSP360
 
The Future of Cisco Cloud Security: Innovations and AI Integration
The Future of Cisco Cloud Security: Innovations and AI IntegrationThe Future of Cisco Cloud Security: Innovations and AI Integration
The Future of Cisco Cloud Security: Innovations and AI Integration
Re-solution Data Ltd
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 

The Impact of Data Caching on Query Execution for Linked Data

  • 1. The Impact of Data Caching on Query Execution for Linked Data Olaf Hartig http://olafhartig.de/foaf.rdf#olaf @olafhartig Database and Information Systems Research Group Humboldt-Universität zu Berlin
  • 2. Can we query the Web of Data as if it were a single, giant database? SELECT DISTINCT ?i ?label WHERE { ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ; foaf:topic_interest ?i . OPTIONAL { ?i rdfs:label ?label FILTER( LANG(?label)="en" || LANG(?label)="") } } ORDER BY ?label Our approach: Link Traversal Based Query Execution [ISWC'09] Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data 2
  • 3. Main Idea ● Intertwine query evaluation with traversal of data links ● We alternate between: ● Evaluate parts of the query (triple patterns) on a continuously augmented set of data ● Look up URIs in intermediate solutions and add retrieved data to the query-local dataset
  • 4. Main Idea (example) ● Query: http://bob.name knows ?acq ; ?acq project ?prj ; ?prj name ?prjName
  • 5–8. Main Idea (example) ● Look up the URI http://bob.name ● Add the retrieved data (“descriptor object”) to the query-local dataset
  • 9–11. Main Idea (example) ● Query-local dataset contains the triple ( http://bob.name , knows , http://alice.name ) ● Evaluating the triple pattern ( http://bob.name , knows , ?acq ) yields the intermediate solution ?acq = http://alice.name
  • 12–16. Main Idea (example) ● Look up the URI http://alice.name from the intermediate solution ● Add the retrieved data to the query-local dataset
  • 17–19. Main Idea (example) ● Query-local dataset contains the triple ( http://alice.name , project , http://.../AlicesPrj ) ● Evaluating the triple pattern ( ?acq , project , ?prj ) yields the intermediate solution ?acq = http://alice.name , ?prj = http://.../AlicesPrj
  • 20–21. Main Idea (example) ● Evaluating the triple pattern ( ?prj , name , ?prjName ) yields ?prj = http://.../AlicesPrj , ?prjName = “…“ ● Resulting solution: ?acq = http://alice.name , ?prj = http://.../AlicesPrj , ?prjName = “…“
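The walk-through above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation described in the talk: the "Web" is a hypothetical in-memory dictionary `WEB`, the URI `http://example.org/AlicesPrj` and the project name stand in for values that are elided on the slides, and the helpers `match`, `look_up`, and `execute` are invented for this example.

```python
# Toy "Web of Data": looking up a URI returns the triples published under it.
# Only http://bob.name and http://alice.name appear on the slides; the project
# URI and the project name are hypothetical placeholders.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "project", "http://example.org/AlicesPrj")],
    "http://example.org/AlicesPrj": [("http://example.org/AlicesPrj", "name", "Alice's project")],
}

QUERY = [
    ("http://bob.name", "knows", "?acq"),
    ("?acq", "project", "?prj"),
    ("?prj", "name", "?prjName"),
]


def is_var(term):
    return term.startswith("?")


def is_uri(term):
    return term.startswith("http://")


def match(pattern, triple, solution):
    """Extend `solution` so that `pattern` matches `triple`, or return None."""
    extended = dict(solution)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check an existing binding for consistency.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def look_up(uri, dataset, seen):
    """Dereference `uri` at most once and add its triples to the dataset."""
    if uri not in seen:
        seen.add(uri)
        dataset.update(WEB.get(uri, []))


def execute(patterns):
    dataset, seen = set(), set()      # query-local dataset
    for pattern in patterns:          # seed with the URIs mentioned in the query
        for term in pattern:
            if is_uri(term):
                look_up(term, dataset, seen)
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            # Traverse links: look up URIs bound in the intermediate solution.
            for value in solution.values():
                if is_uri(value):
                    look_up(value, dataset, seen)
            # Evaluate the triple pattern on the augmented dataset.
            for triple in list(dataset):
                extended = match(pattern, triple, solution)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


print(execute(QUERY))
```

The sketch evaluates the triple patterns strictly one after another; it only illustrates how lookups and pattern matching alternate, and it ignores concerns such as parallel lookups or result ordering.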
  • 22. Characteristics ● Link traversal based query execution: ● Evaluation on a continuously augmented dataset ● Discovery of potentially relevant data during execution ● Discovery driven by intermediate solutions ● Main advantage: ● No need to know all data sources in advance ● Limitations: ● Query has to contain a URI as a starting point ● Ignores data that is not reachable* by the query execution * formal definition in [LDOW'11a]
  • 23–26. The Issue ● Query: http://bob.name knows ?acq ; ?acq interest ?i ; ?i label ?iLabel ● Look up the URI http://bob.name ● Query-local dataset contains the triple ( http://bob.name , knows , http://alice.name ) ● Solutions are computed for ?acq , ?i , ?iLabel
  • 27. The Issue ● Each of the two queries (the interest query and the project query) is executed on its own query-local dataset
  • 28–30. Reusing the Query-Local Dataset ● Both queries (the interest query and the project query) are executed on the same query-local dataset ● The dataset already contains the triple ( http://bob.name , knows , http://alice.name ) ● Intermediate solution: ?acq = http://alice.name
  • 31. Hypothesis ● Re-using the query-local dataset (a.k.a. data caching) may benefit query performance and result completeness
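A small sketch may make the performance side of the hypothesis concrete. It counts how many URI lookups actually have to be retrieved when two queries run with separate datasets versus one shared dataset. Everything here is a hypothetical illustration: the toy `WEB`, the two `example.org` URIs, and the fixed lookup sequences (which stand in for the URIs each query would dereference) are assumptions, not data from the talk.

```python
# Hypothetical toy Web; only bob and alice appear on the slides.
WEB = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [
        ("http://alice.name", "project", "http://example.org/AlicesPrj"),
        ("http://alice.name", "interest", "http://example.org/LinkedData"),
    ],
    "http://example.org/AlicesPrj": [],
    "http://example.org/LinkedData": [],
}

# URIs each query is assumed to look up, in order.
PROJECT_QUERY_LOOKUPS = ["http://bob.name", "http://alice.name", "http://example.org/AlicesPrj"]
INTEREST_QUERY_LOOKUPS = ["http://bob.name", "http://alice.name", "http://example.org/LinkedData"]


def look_up(uri, cache, stats):
    """Return the triples for `uri`, retrieving them only on a cache miss."""
    if uri in cache:
        stats["hits"] += 1
    else:
        stats["misses"] += 1
        cache[uri] = WEB.get(uri, [])
    return cache[uri]


def run_sequence(shared):
    """Run both queries, with one shared cache or a fresh cache per query."""
    stats = {"hits": 0, "misses": 0}
    cache = {}
    for lookups in (PROJECT_QUERY_LOOKUPS, INTEREST_QUERY_LOOKUPS):
        if not shared:
            cache = {}                # each query starts from an empty dataset
        for uri in lookups:
            look_up(uri, cache, stats)
    return stats


print("separate datasets:", run_sequence(shared=False))
print("shared dataset:   ", run_sequence(shared=True))
```

With the shared dataset, the second query finds the data for `http://bob.name` and `http://alice.name` already present, so only four of the six lookups require retrieval.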
  • 32. Contributions ● Systematic analysis of the impact of data caching ● Theoretical foundation* ● Conceptual analysis* ● Empirical evaluation of the potential impact * see [LDOW'11a] ● Out of scope: Caching strategies (replacement, invalidation)
  • 33. Experiment – Scenario ● Information about the distributed social network of FOAF profiles ● 5 types of queries ● Experiment Setup: ● 20 persons ● Sequential use ➔ 100 queries
  • 34–38. Experiment – Single Query ● no reuse experiment: ● No data caching ● reuse per query experiment: ● Reuse of query-local dataset for 3 executions of each query ● Third execution measured
      [Figure: number of query results and query execution time (in seconds) for the no reuse and reuse per query experiments, shown for ContactInfoDanBri (Query No. 61), UnsetPropsDanBri (Query No. 62), 2ndDegree1DanBri (Query No. 63), 2ndDegree2DanBri (Query No. 64), IncomingDanBri (Query No. 65)]
  • 39–43. Experiment – Complete Sequence ● reuse all queries experiment: ● Reuse of the query-local dataset for the complete sequence of all 100 queries
      [Figure: number of query results and query execution time (in seconds) for the no reuse, reuse per query, and reuse all queries experiments, shown for ContactInfoDanBri (Query No. 61), UnsetPropsDanBri (Query No. 62), 2ndDegree1DanBri (Query No. 63), 2ndDegree2DanBri (Query No. 64), IncomingDanBri (Query No. 65)]
  • 44. Outlook ● Requirements of a data cache: ● Replacement mechanism ● Coherency mechanism
  • 45. Cache Replacement ● Cache full → remove descriptor objects ● Replacement strategy ● Primary goal: maximize hit rate ● Recency-based ● Frequency-based ● Function-based ● Randomized ● Replacement process ● Watermarks: high and low
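One way to read the replacement process on this slide is sketched below: a recency-based (LRU) cache of descriptor objects that starts evicting once its size exceeds a high watermark and stops when it is back at the low watermark. This is an illustrative sketch under assumptions, not the mechanism of any particular system: the class name `DescriptorCache` is invented, and the watermarks are assumed to count descriptor objects rather than bytes.

```python
from collections import OrderedDict


class DescriptorCache:
    """Recency-based cache with high and low watermarks (illustrative sketch)."""

    def __init__(self, low, high):
        if not 0 <= low <= high:
            raise ValueError("watermarks must satisfy 0 <= low <= high")
        self.low, self.high = low, high
        self._items = OrderedDict()   # least recently used entry comes first

    def get(self, uri):
        """Return the cached descriptor object, or None on a miss."""
        if uri not in self._items:
            return None
        self._items.move_to_end(uri)  # mark as most recently used
        return self._items[uri]

    def put(self, uri, descriptor):
        self._items[uri] = descriptor
        self._items.move_to_end(uri)
        # Replacement process: triggered above the high watermark,
        # continues until the size is down to the low watermark.
        if len(self._items) > self.high:
            while len(self._items) > self.low:
                self._items.popitem(last=False)

    def uris(self):
        return list(self._items)


cache = DescriptorCache(low=2, high=4)
for uri in ["a", "b", "c", "d"]:
    cache.put(uri, object())
cache.get("a")                        # "a" becomes the most recently used entry
cache.put("e", object())              # size 5 exceeds the high watermark
print(cache.uris())
```

Evicting in batches down to the low watermark avoids running the replacement process on every single insertion once the cache is full. A frequency-based or function-based strategy would only change which entries `put` removes.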
  • 46–48. Studying Cache Replacement? “Web cache replacement in its general form seems to be a solved topic.” S. Podlipnig and L. Böszörmenyi: Survey of Web Cache Replacement Strategies, 2003 ● 6 quad indexes* in main memory ● Size grows linearly with the number of quads ● Example (after reuse all queries experiment, 100 queries): ● 905 descriptor objects, overall number of 745,756 triples ● ca. 103 MB ➔ Available main memory is hardly a limit * as introduced in [LDOW'11b]
  • 49. Cache Coherency ● Data items in the cache may become inconsistent ● Strong cache consistency ● Server validation ● Client validation ● Weak cache consistency ● Time to live (TTL) ● Adaptive TTL
  • 50–51. Client Validation ● Polling every time ● Enables strong cache consistency ● Conditional GET ● Request with If-Modified-Since header ● Possible response: 304 Not Modified ● Not supported by most Linked Data servers ● Experiment based on the CKAN catalog of linked datasets ● 41 out of 154 example resources (26.6%) from 110 datasets
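Client validation with a conditional GET could look roughly like the following sketch, which uses Python's standard `urllib`. The function name `revalidate`, its return shape, and the `Accept` value are assumptions made for this example. `urllib` reports a 304 response as an `HTTPError`, which the sketch interprets as "cached copy is still current".

```python
import urllib.error
import urllib.request


def revalidate(uri, cached_body, cached_last_modified, opener=urllib.request.urlopen):
    """Validate a cached object; return (body, last_modified, was_modified).

    `opener` is injectable so the function can be exercised without network
    access; by default it is urllib.request.urlopen.
    """
    request = urllib.request.Request(
        uri,
        headers={
            "If-Modified-Since": cached_last_modified,  # HTTP date string
            "Accept": "application/rdf+xml",            # assumed RDF media type
        },
    )
    try:
        with opener(request) as response:
            # 200 OK: the server sent a (possibly unchanged) full representation.
            body = response.read()
            last_modified = response.headers.get("Last-Modified", cached_last_modified)
            return body, last_modified, True
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # 304 Not Modified: keep using the cached copy.
            return cached_body, cached_last_modified, False
        raise
```

A server that does not support conditional requests simply answers with 200 and the full document, so against such servers this degrades to an ordinary GET on every validation.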
  • 52–53. Time to Live (TTL) ● TTL field: lifetime estimate for each object ● Supported by HTTP response headers: ● Expires (37.0%) ● Cache-Control: max-age (37.7%) ● When TTL elapses, object is invalid ● Accessing an invalid object → retrieve object again ● Conditional GET (26.6%) ● Alternative (due to lack of support in Linked Data servers): ● Assume a default TTL for each object ● Ordinary GET
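The TTL rules on this slide can be sketched as a small function that derives an expiry time from the response headers and falls back to a default TTL when neither header is usable. The one-hour `DEFAULT_TTL` and the names `expiry_time` and `is_valid` are assumptions for illustration; the precedence of `max-age` over `Expires` follows the HTTP caching rules. Headers are passed as a plain dictionary, so lookups are case-sensitive in this sketch.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

DEFAULT_TTL = timedelta(hours=1)      # hypothetical default, not from the talk


def expiry_time(headers, retrieved_at, default_ttl=DEFAULT_TTL):
    """Return the point in time at which a retrieved object becomes invalid."""
    # Cache-Control: max-age takes precedence over Expires.
    found = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if found:
        return retrieved_at + timedelta(seconds=int(found.group(1)))
    expires = headers.get("Expires")
    if expires:
        try:
            return parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            pass                      # unparsable date: fall through to default
    # Alternative from the slide: assume a default TTL for each object.
    return retrieved_at + default_ttl


def is_valid(expires_at, now):
    return now < expires_at


retrieved = datetime(2011, 1, 1, 12, 0, tzinfo=timezone.utc)
print(expiry_time({"Cache-Control": "public, max-age=600"}, retrieved))
print(expiry_time({"Expires": "Sat, 01 Jan 2011 13:30:00 GMT"}, retrieved))
print(expiry_time({}, retrieved))
```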
  • 54–55. Adaptive TTL ● Assumption: ● The older an object, the less likely it is to be modified ● TTL is a percentage of the age: ● Threshold = 10% ; age = 30 days → TTL = 3 days ● Last verification: yesterday → invalidation in 2 days ● HTTP-based implementation (35.1%): ● Calculation of age: use Last-Modified response header ● Verification with conditional GET ● Alternative (due to lack of support in Linked Data servers): ● Assume Last-Modified is time of first retrieval ● Verification by comparing a response to the current version
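The arithmetic of the slide's example can be reproduced with a short sketch. The function names and the concrete dates are made up for illustration; the age is assumed to be measured at the time of the last verification.

```python
from datetime import datetime, timedelta


def adaptive_ttl(last_modified, verified_at, threshold=0.10):
    """TTL as a fraction (threshold) of the object's age at verification time."""
    age = verified_at - last_modified
    return age * threshold


def invalidation_time(last_modified, verified_at, threshold=0.10):
    """Point in time at which the cached object has to be verified again."""
    return verified_at + adaptive_ttl(last_modified, verified_at, threshold)


# Example from the slide: threshold 10%, age 30 days, verified yesterday.
last_modified = datetime(2011, 1, 1)
verified_at = last_modified + timedelta(days=30)
now = verified_at + timedelta(days=1)

print(adaptive_ttl(last_modified, verified_at))             # TTL of 3 days
print(invalidation_time(last_modified, verified_at) - now)  # 2 days remaining
```

If a server does not send `Last-Modified`, the slide's alternative amounts to passing the time of first retrieval as `last_modified`.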
  • 56. Summary ● Systematic analysis of the impact of data caching ● Theoretical foundation ● Conceptual analysis ● Empirical evaluation ● Main findings: ● Additional results possible (for semantically similar queries) ● Impact on performance may be positive but also negative ● Future work: ● Analysis of caching strategies in our context ● Main issue: invalidation
  • 57. Backup Slides
  • 58. Contributions ● Theoretical foundation (extension of the original definition) ● Reachability by a D_seed-initialized execution of a BGP query b ● D_seed-dependent solution for a BGP query b ● Reachability R(B) for a serial execution of B = b_1 , … , b_n ➔ Each solution for b_cur is also an R(B)-dependent solution for b_cur ● Conceptual analysis of the impact of data caching ● Performance factor: p( b_cur , B ) = c( b_cur , [ ] ) – c( b_cur , B ) ● Serendipity factor: s( b_cur , B ) = b( b_cur , B ) – b( b_cur , [ ] ) ● Empirical verification of the potential impact ● Out of scope: Caching strategies (replacement, invalidation)
  • 59. Query Template Contact SELECT * WHERE { <PERSON> foaf:knows ?p . OPTIONAL { ?p foaf:name ?name } OPTIONAL { ?p foaf:firstName ?firstName } OPTIONAL { ?p foaf:givenName ?givenName } OPTIONAL { ?p foaf:givenname ?givenname } OPTIONAL { ?p foaf:familyName ?familyName } OPTIONAL { ?p foaf:family_name ?family_name } OPTIONAL { ?p foaf:lastName ?lastName } OPTIONAL { ?p foaf:surname ?surname } OPTIONAL { ?p foaf:birthday ?birthday } OPTIONAL { ?p foaf:img ?img } OPTIONAL { ?p foaf:phone ?phone } OPTIONAL { ?p foaf:aimChatID ?aimChatID } OPTIONAL { ?p foaf:icqChatID ?icqChatID } OPTIONAL { ?p foaf:jabberID ?jabberID } OPTIONAL { ?p foaf:msnChatID ?msnChatID } OPTIONAL { ?p foaf:skypeID ?skypeID } OPTIONAL { ?p foaf:yahooChatID ?yahooChatID } } Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data 59
  • 60. Query Template UnsetProps SELECT DISTINCT ?result ?resultLabel WHERE { ?result rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> . ?result rdfs:domain foaf:Person . OPTIONAL { <PERSON> ?result ?var0 } FILTER ( !bound(?var0) ) <PERSON> foaf:knows ?var2 . ?var2 ?result ?var3 . ?result rdfs:label ?resultLabel . ?result vs:term_status ?var1 . } ORDER BY ?var1 Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data 60
  • 61. Query Template Incoming SELECT DISTINCT ?result WHERE { ?result foaf:knows <PERSON> . OPTIONAL { ?result foaf:knows ?var1 . FILTER ( <PERSON> = ?var1 ) <PERSON> foaf:knows ?result . } FILTER ( !bound(?var1) ) } Olaf Hartig - The Impact of Data Caching on Query Execution for Linked Data 61
  • 62. Query Template 2ndDegree1
      SELECT DISTINCT ?result WHERE {
        <PERSON> foaf:knows ?p1 .
        <PERSON> foaf:knows ?p2 .
        FILTER ( ?p1 != ?p2 )
        ?p1 foaf:knows ?result .
        FILTER ( <PERSON> != ?result )
        ?p2 foaf:knows ?result .
        OPTIONAL {
          <PERSON> ?knows ?result .
          FILTER ( ?knows = foaf:knows )
        }
        FILTER ( !bound(?knows) )
      }
  • 63. Query Template 2ndDegree2
      SELECT DISTINCT ?result WHERE {
        <PERSON> foaf:knows ?p1 .
        <PERSON> foaf:knows ?p2 .
        FILTER ( ?p1 != ?p2 )
        ?result foaf:knows ?p1 .
        FILTER ( <PERSON> != ?result )
        ?result foaf:knows ?p2 .
        OPTIONAL {
          <PERSON> ?knows ?result .
          FILTER ( ?knows = foaf:knows )
        }
        FILTER ( !bound(?knows) )
      }
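Each query template above contains a `<PERSON>` placeholder that has to be replaced by a person URI before the query can be executed. The slides do not show how the templates were instantiated, so the following is only a sketch of one plausible way to do it with plain string substitution. The helper `instantiate` and the example.org URI are hypothetical, and prefix declarations are omitted, as on the slides.

```python
# The Incoming template, copied from slide 61 (prefix declarations omitted).
INCOMING_TEMPLATE = """\
SELECT DISTINCT ?result WHERE {
  ?result foaf:knows <PERSON> .
  OPTIONAL {
    ?result foaf:knows ?var1 .
    FILTER ( <PERSON> = ?var1 )
    <PERSON> foaf:knows ?result .
  }
  FILTER ( !bound(?var1) )
}
"""


def instantiate(template: str, person_uri: str) -> str:
    """Replace every <PERSON> placeholder with the URI in angle brackets."""
    # Reject characters that may not appear inside a SPARQL IRI reference,
    # so that a malformed URI cannot break out of the <...> delimiters.
    if any(ch in person_uri for ch in '<>" {}|\\^`'):
        raise ValueError(f"not a safe IRI: {person_uri!r}")
    return template.replace("<PERSON>", f"<{person_uri}>")


# Hypothetical person URI, used only for illustration.
query = instantiate(INCOMING_TEMPLATE, "http://example.org/people/alice#me")
print(query)
```

Generating many queries from a template then amounts to calling `instantiate` once per person URI.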
  • 64. Experiment – Single Query
      Experiment         Avg.¹ number of        Average¹           Avg.¹ query
                         Query Results          Hit Rate           Execution Time
                         (std.dev.)             (std.dev.)         (std.dev.)
      no reuse           27.37 (140.49)         0.849 (0.205)      64.95 s (124.50)
      reuse per query    26.71 (148.77)         1 (0)              0.02 s (0.07)
      ¹ Averaged over all 100 queries
      ● In the ideal case for B_upper = [ b_cur , b_cur ]:
          ● p_upper(b_cur, B_upper) = c(b_cur, [ ]) - c(b_cur, B_upper) = c(b_cur, [ ])
          ● s_upper(b_cur, B_upper) = b(b_cur, B_upper) - b(b_cur, [ ]) = 0
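The hit rate column and the ideal case above can be illustrated with a toy cache that counts how many URI look-ups are answered without a fetch. This is a sketch, not the system used in the experiments: the `DataCache` class, the `fake_fetch` function, and the look-up sequence are hypothetical, and the hit rate is assumed to be the fraction of look-ups served from already retrieved data, a definition the slide does not spell out.

```python
from typing import Callable, Dict, Iterable, List


class DataCache:
    """Maps a URI to the data retrieved for it and counts hits and misses."""

    def __init__(self, fetch: Callable[[str], List[str]]) -> None:
        self._fetch = fetch
        self._store: Dict[str, List[str]] = {}
        self.hits = 0
        self.misses = 0

    def look_up(self, uri: str) -> List[str]:
        if uri in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[uri] = self._fetch(uri)  # only misses cause a fetch
        return self._store[uri]

    def reset_counters(self) -> None:
        self.hits = 0
        self.misses = 0

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def run_query(cache: DataCache, uris: Iterable[str]) -> None:
    """Stand-in for one query execution: it only performs the URI look-ups."""
    for uri in uris:
        cache.look_up(uri)


fetched: List[str] = []


def fake_fetch(uri: str) -> List[str]:
    fetched.append(uri)  # record every simulated retrieval
    return [f"data for {uri}"]


lookups = ["ex:a", "ex:b", "ex:a", "ex:c"]  # hypothetical look-up sequence
cache = DataCache(fake_fetch)

run_query(cache, lookups)            # first execution, empty cache
first_rate = cache.hit_rate()
first_fetches = len(fetched)

cache.reset_counters()
run_query(cache, lookups)            # same query again, cache kept
second_rate = cache.hit_rate()
second_fetches = len(fetched) - first_fetches

print(first_rate, second_rate)       # 0.25 1.0
print(first_fetches, second_fetches) # 3 0
```

The second run needs no fetches at all, which mirrors the ideal case in which the whole cost of the cold execution is saved while the set of results stays the same.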
  • 65. Experiment – Single Query
      Experiment         Avg.¹ number of        Average¹           Avg.¹ query
                         Query Results          Hit Rate           Execution Time
                         (std.dev.)             (std.dev.)         (std.dev.)
      no reuse           27.37 (140.49)         0.849 (0.205)      64.95 s (124.50)
      reuse per query    26.71 (148.77)         1 (0)              0.02 s (0.07)
      ¹ Averaged over all 100 queries
      ● Summary (measurement errors aside):
          ● Same number of query results
          ● Significant improvements in query performance
  • 66. Experiment – Complete Sequence
      Experiment           Avg.¹ number of        Average¹           Avg.¹ query
                           Query Results          Hit Rate           Execution Time
                           (std.dev.)             (std.dev.)         (std.dev.)
      no reuse             27.37 (140.49)         0.849 (0.205)      64.95 s (124.50)
      reuse per query      26.71 (148.77)         1 (0)              0.02 s (0.07)
      reuse all queries    44.87 (178.36)         0.991 (0.053)      37.91 s (112.94)
      ¹ Averaged over all 100 queries
      ● Summary:
          ● Data cache may provide for additional query results
          ● Impact on performance may be positive but also negative
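As a rough worked example, the averages in the table above can be plugged into the two factor definitions. This assumes that cost is measured as query execution time and benefit as the number of query results, which the slides suggest but do not state. Because both rows are averaged over the same 100 queries, the difference of the averages equals the average of the per-query differences.

```latex
\begin{align*}
\bar{p} &= \bar{c}(b_{cur}, [\,]) - \bar{c}(b_{cur}, B)
         = 64.95\,\mathrm{s} - 37.91\,\mathrm{s} = 27.04\,\mathrm{s} \\
\bar{s} &= \bar{b}(b_{cur}, B) - \bar{b}(b_{cur}, [\,])
         = 44.87 - 27.37 = 17.50
\end{align*}
```

On average, reusing the cache across all queries saved about 27 s per query and yielded about 17.5 additional results. Given the large standard deviations, individual queries can deviate strongly from these averages, including in the negative direction noted in the summary.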
  • 67. Experiment – Complete Sequence
      Experiment           Avg.¹ number of        Average¹           Avg.¹ query
                           Query Results          Hit Rate           Execution Time
                           (std.dev.)             (std.dev.)         (std.dev.)
      no reuse             27.37 (140.49)         0.849 (0.205)      64.95 s (124.50)
      reuse per query      26.71 (148.77)         1 (0)              0.02 s (0.07)
      reuse all queries    44.87 (178.36)         0.991 (0.053)      37.91 s (112.94)
      reuse all queries    118.18 (867.07)        0.992 (0.016)      20.61 s (216.61)
      (random orders)
      ● Executing the query sequence in a random order results in measurements similar to the given order.
  • 68. These slides have been created by Olaf Hartig (http://olafhartig.de). This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/).