Declare Your Language: Dynamic Semantics - Eelco Visser
This document provides an overview of DynSem, a domain-specific language for specifying the dynamic semantics of programming languages. It discusses DynSem's design goals: to be concise, modular, executable, portable, and high-performance. The document then provides an example DynSem specification for the semantics of a simple language with features like arithmetic, booleans, functions, and mutable variables, and describes how DynSem specifications are organized into modules that can be composed to define the semantics of a complete language.
Specifying compatible sharing in data structures - Asankhaya Sharma
Automated verification of programs that utilize data structures with intrinsic sharing is a challenging problem. We develop an extension to separation logic that can reason about aliasing in heaps using a notion of compatible sharing. Compatible sharing can model a variety of fine grained sharing and aliasing scenarios with concise specifications. Given these specifications, our entailment procedure enables fully automated verification of a number of challenging programs manipulating data structures with non-trivial sharing. We benchmarked our prototype with examples derived from practical algorithms found in systems code, such as those using threaded trees and overlaid data structures.
NSC #2 - D2 06 - Richard Johnson - SAGEly Advice - NoSuchCon
The document discusses automated testing techniques for software, including fuzzing and concolic testing. Fuzzing involves generating random inputs to exercise a program, while concolic testing uses symbolic execution to track data flows and observe how program logic is influenced by inputs. Concolic testing can generate inputs that cover more program states but requires instrumenting the code to analyze execution.
The document discusses mutating and testing tests. It introduces the concept of mutation analysis, where tests are evaluated by seeding real bugs into the code and checking if the tests detect these bugs. The document provides an example of applying mutation analysis to a factorial function code and test cases. It finds bugs in the test cases like weak oracles and missing test inputs. The document also compares the Pit mutation testing tool with the Descartes tool, finding Descartes generates fewer but coarser-grained mutants, making it more scalable for large projects.
Java is a cross-platform language originally developed by James Gosling at Sun Microsystems. It enables writing programs for many operating systems using a C/C++-like syntax but with a simpler object model and fewer low-level facilities. Java programs are compiled to bytecode that can run on any Java Virtual Machine (JVM) regardless of computer architecture. Common Java development tools include Eclipse and NetBeans integrated development environments.
In this chapter we will discuss exceptions in object-oriented programming and in Java in particular. We will learn how to handle exceptions using the try-catch construct, how to pass them to the calling methods and how to throw standard or our own exceptions using the throw construct.
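To make those constructs concrete, here is a minimal, self-contained Java sketch of try-catch, throws, and a user-defined exception (class and method names are our own illustration, not taken from the chapter):

```java
// Minimal sketch of try-catch, throws, and a custom exception.
public class ExceptionDemo {

    // A user-defined exception type.
    static class InvalidAgeException extends Exception {
        InvalidAgeException(String message) {
            super(message);
        }
    }

    // Throws a standard exception for one case and our own for another,
    // passing the checked one up to the caller via the throws clause.
    static void checkAge(int age) throws InvalidAgeException {
        if (age < 0) {
            throw new IllegalArgumentException("Age cannot be negative: " + age);
        }
        if (age < 18) {
            throw new InvalidAgeException("Must be at least 18, got " + age);
        }
    }

    public static void main(String[] args) {
        for (int age : new int[] {25, 10, -3}) {
            try {
                checkAge(age);
                System.out.println(age + ": accepted");
            } catch (InvalidAgeException e) {        // our own exception
                System.out.println("Rejected: " + e.getMessage());
            } catch (IllegalArgumentException e) {   // standard exception
                System.out.println("Bad input: " + e.getMessage());
            }
        }
    }
}
```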
This document provides an overview of Java collections basics, including arrays, lists, strings, sets, and maps. It defines each type of collection and provides examples of how to use them. Arrays allow storing fixed-length sequences of elements and can be accessed by index. Lists are like resizable arrays that allow adding, removing and inserting elements using the ArrayList class. Strings represent character sequences and provide various methods for manipulation and comparison. Sets store unique elements using HashSet or TreeSet. Maps store key-value pairs using HashMap or TreeMap.
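A compact Java sketch touching each collection type mentioned above (values and names are illustrative, not from the document):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CollectionsDemo {
    public static void main(String[] args) {
        int[] numbers = {3, 1, 4, 1, 5};             // fixed-length array, indexed access
        System.out.println(numbers[2]);              // 4

        List<String> list = new ArrayList<>();       // resizable list
        list.add("alpha");
        list.add("beta");
        list.remove("alpha");

        String s = "hello".toUpperCase();            // strings offer manipulation methods
        System.out.println(s);                       // HELLO

        Set<Integer> unique = new HashSet<>();       // stores unique elements only
        unique.add(1);
        unique.add(1);                               // duplicate is ignored
        System.out.println(unique.size());           // 1

        Map<String, Integer> ages = new HashMap<>(); // key-value pairs
        ages.put("Ann", 30);
        System.out.println(ages.get("Ann"));         // 30
    }
}
```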
This document outlines Unit 2 of a course on algorithmic problem solving. It covers the problem solving process, algorithms, control structures like sequence, selection, and repetition, and examples of writing pseudocode algorithms. It also explains Euclid's algorithm for finding the greatest common divisor of two integers through step-by-step examples and pseudocode.
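For concreteness, Euclid's algorithm as described here can be written in a few lines of Java (our rendering of the standard algorithm, not code from the unit):

```java
public class Euclid {
    // Euclid's algorithm: repeatedly replace the pair (a, b)
    // by (b, a mod b) until the remainder is zero.
    static int gcd(int a, int b) {
        while (b != 0) {
            int remainder = a % b;
            a = b;
            b = remainder;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18)); // 6: 48 % 18 = 12, 18 % 12 = 6, 12 % 6 = 0
    }
}
```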
In this chapter we will learn about arrays as a way to work with sequences of elements of the same type. We will explain what arrays are, how we declare, create, instantiate and use them. We will examine one-dimensional and multidimensional arrays. We will learn different ways to iterate through the array, read from the standard input and write to the standard output. We will give many example exercises, which can be solved using arrays and we will show how useful they really are.
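A minimal Java sketch of the operations this chapter covers: declaring and creating arrays, reading from standard input, iterating two ways, and writing to standard output (our own example):

```java
import java.util.Scanner;

public class ArraysDemo {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);

        // Read the length, then the elements, from standard input.
        int n = in.nextInt();
        int[] values = new int[n];             // one-dimensional array
        for (int i = 0; i < n; i++) {
            values[i] = in.nextInt();
        }

        // Two ways to iterate: indexed for and for-each.
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        for (int v : values) {
            System.out.print(v + " ");
        }
        System.out.println("\nsum = " + sum);

        int[][] grid = new int[2][3];          // multidimensional array
        grid[1][2] = 7;
        System.out.println(grid[1][2]);        // 7
    }
}
```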
MuVM: Higher Order Mutation Analysis Virtual Machine for C - Susumu Tokumoto
Mutation analysis is a method for evaluating the effectiveness of a test suite by seeding faults artificially and measuring the fraction of seeded faults detected by the test suite. The major limitation of mutation analysis is its lengthy execution time because it involves generating, compiling and running large numbers of mutated programs, called mutants. Our tool MuVM achieves a significant runtime improvement by performing higher order mutation analysis using four techniques, metamutation, mutation on virtual machine, higher order split-stream execution, and online adaptation technique. In order to obtain the same behavior as mutating the source code directly, metamutation preserves the mutation location information which may potentially be lost during bitcode compilation and optimization. Mutation on a virtual machine reduces the compilation and testing cost by compiling a program once and invoking a process once. Higher order split-stream execution also reduces the testing cost by executing common parts of the mutants together and splitting the execution at a seeded fault. Online adaptation technique reduces the number of generated mutants by omitting infeasible mutants. Our comparative experiments indicate that our tool is significantly superior to an existing tool, an existing technique (mutation schema generation), and no-split-stream execution in higher order mutation.
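The "fraction of seeded faults detected" is conventionally reported as the mutation score; in standard notation (ours, not quoted from the paper):

```latex
\mathrm{MS}(T) \;=\; \frac{\#\,\text{mutants killed by test suite } T}{\#\,\text{generated mutants} \;-\; \#\,\text{equivalent mutants}}
```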
EdSketch: Execution-Driven Sketching for Java - Lisa Hua
Sketching is a relatively recent approach to program synthesis, which has shown much promise. The key idea in sketching is to allow users to write partial programs that have "holes" and provide test harnesses or reference implementations, and let synthesis tools create program fragments to fill the holes such that the resulting complete program has the desired functionality. Traditional solutions to the sketching problem perform a translation to SAT and employ counterexample-guided inductive synthesis (CEGIS). While effective for a range of programs, when applied to real applications, such translation-based approaches have a key limitation: they require either translating all relevant libraries that are invoked directly or indirectly by the given sketch – which can lead to impractical SAT problems – or creating models of those libraries – which can require much manual effort.
This paper introduces execution-driven sketching, a novel approach for synthesis of Java programs using a backtracking search that is commonly employed in software model checkers. The key novelty of our work is to introduce effective pruning strategies to efficiently explore the actual program behaviors in the presence of libraries and to provide a practical solution to sketching small parts of real-world applications, which may use complex constructs of modern languages, such as reflection or native calls. Our tool EdSketch embodies our approach in two forms: a stateful search based on the Java PathFinder model checker; and a stateless search based on re-execution inspired by the VeriSoft model checker. Experimental results show that EdSketch's performance compares well with the well-known SAT-based Sketch system for a range of small but complex programs, and moreover, that EdSketch can complete some sketches that require handling complex constructs.
This is a rapid-fire talk that outlines some of the issues around multithreaded code, some of the tools you can use to test it, and how best to spend your resources doing so.
The document provides examples of various Java programming concepts like displaying messages, using control structures like if-else, for loops, methods, constructors, access specifiers, static variables and more. It shows how to write simple Java programs to print messages, integers, use conditional and looping statements. It also explains concepts like default and parameterized constructors, static and non-static methods, different access specifiers and their usage. The examples help learn how different Java features can be used to develop programs with classes, objects and methods.
19. Java data structures, algorithms and complexity - Intro C# Book
In this chapter we will compare the data structures we have learned so far by the performance (execution speed) of their basic operations (addition, search, deletion, etc.). We will give specific tips on which data structures to use in which situations.
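As a hedged illustration of the kind of trade-off such a comparison surfaces, the Java sketch below contrasts membership tests in an ArrayList (linear scan, O(n)) and a HashSet (hash lookup, O(1) on average); the collection size and the crude timing approach are our own choices, not the chapter's:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LookupComparison {
    public static void main(String[] args) {
        int n = 1_000_000;
        List<Integer> list = new ArrayList<>();
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
            set.add(i);
        }

        long t0 = System.nanoTime();
        boolean inList = list.contains(n - 1);   // O(n): scans the whole list
        long t1 = System.nanoTime();
        boolean inSet = set.contains(n - 1);     // O(1) average: one hash lookup
        long t2 = System.nanoTime();

        System.out.printf("list: %b in %d us%n", inList, (t1 - t0) / 1_000);
        System.out.printf("set:  %b in %d us%n", inSet, (t2 - t1) / 1_000);
    }
}
```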
TMPA-2017: Generating Cost Aware Covering Arrays For Free - Iosif Itkin
TMPA-2017: Tools and Methods of Program Analysis
3-4 March, 2017, Hotel Holiday Inn Moscow Vinogradovo, Moscow
Generating Cost Aware Covering Arrays For Free
Mustafa Kemal Tas, Hanefi Mercan, Gülşen Demiröz, Kamer Kaya, Cemal Yilmaz, Sabanci University
For video follow the link: https://youtu.be/Wkdd4A0rRjE
Would you like to know more?
Visit our website:
www.tmpaconf.org
www.exactprosystems.com/events/tmpa
Follow us:
https://www.linkedin.com/company/exactpro-systems-llc?trk=biz-companies-cym
https://twitter.com/exactpro
Here we are going to take a look at how to use the for loop, foreach loop and while loop. We are also going to learn how to use and invoke methods and how to define classes in the Java programming language.
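A small self-contained Java example showing all three loop forms plus a method definition and invocation (entirely illustrative):

```java
public class LoopsDemo {
    // A method with a parameter and a return value.
    static int square(int x) {
        return x * x;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};

        for (int i = 0; i < data.length; i++) {   // classic for loop
            System.out.println(square(data[i]));
        }

        for (int value : data) {                  // foreach loop
            System.out.println(value);
        }

        int i = 0;
        while (i < data.length) {                 // while loop
            System.out.println(data[i]);
            i++;
        }
    }
}
```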
This document discusses parameters and return values in Java methods. It provides examples of parameterized methods that take int, double, and String parameters. These include methods that draw lines and boxes of stars of varying lengths/sizes. The document also covers returning values from methods, including examples using Math class methods. Common errors around not storing return values are discussed.
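A hedged sketch in the same spirit: a parameterized star-drawing method and a value-returning method, including the "discarded return value" mistake the document warns about (names and sizes are ours):

```java
public class StarFigures {
    // Parameterized method: the caller chooses the line length.
    static void drawLine(int length) {
        for (int i = 0; i < length; i++) {
            System.out.print("*");
        }
        System.out.println();
    }

    // Returns a value; Math.sqrt is itself a value-returning Math method.
    static double hypotenuse(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    public static void main(String[] args) {
        drawLine(5);                        // *****
        drawLine(3);                        // ***
        hypotenuse(3, 4);                   // common error: return value discarded
        double c = hypotenuse(3, 4);        // correct: store the result
        System.out.println(c);              // 5.0
    }
}
```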
This document discusses steady-state error analysis for unity feedback systems. It defines the position, velocity, and acceleration static error constants (Kp, Kv, Ka), which determine the steady-state error for step, ramp, and parabolic inputs respectively. The system type, defined by the number of pure integrations in the forward path (n = 0 for Type 0, n = 1 for Type 1, etc.), determines which of these errors are zero, finite, or infinite. For a unit step input the steady-state error is 1/(1 + Kp); for a unit ramp input it is 1/Kv; and for a unit parabolic input it is 1/Ka.
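In standard textbook form (a reconstruction, since the summary is cut off mid-sentence), the error constants and corresponding steady-state errors for a unity feedback system with open-loop transfer function G(s) are:

```latex
K_p = \lim_{s \to 0} G(s), \qquad
K_v = \lim_{s \to 0} s\,G(s), \qquad
K_a = \lim_{s \to 0} s^{2} G(s);
\qquad
e_{ss}^{\text{step}} = \frac{1}{1+K_p}, \quad
e_{ss}^{\text{ramp}} = \frac{1}{K_v}, \quad
e_{ss}^{\text{parabola}} = \frac{1}{K_a}.
```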
40+ examples of user defined methods in Java with explanation - Harish Gyanani
The document discusses user defined methods in Java and provides examples of simple programs that utilize methods. Some key points covered include:
- Advantages of using methods to organize and reuse code.
- Examples of methods that perform basic operations like addition, average calculation, data type conversions, etc.
- Passing arrays to methods and returning arrays from methods.
- Differences between functions and methods in Java.
- Overloading methods by changing the number and type of arguments.
- Implementing methods to calculate operations like LCM, HCF, prime number checks, etc. (a combined sketch follows this list).
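The sketch below illustrates a few of the listed items in one place: HCF/LCM, a prime check, and simple overloading (our own minimal versions, not the document's 40+ examples):

```java
public class NumberMethods {
    // HCF (greatest common divisor) via Euclid's algorithm.
    static int hcf(int a, int b) {
        return b == 0 ? a : hcf(b, a % b);
    }

    // LCM expressed through the HCF: lcm(a, b) * hcf(a, b) == a * b.
    static int lcm(int a, int b) {
        return a / hcf(a, b) * b;
    }

    // Simple trial-division primality check.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Overloading: same name, different parameter types.
    static double average(int a, int b)       { return (a + b) / 2.0; }
    static double average(double a, double b) { return (a + b) / 2.0; }

    public static void main(String[] args) {
        System.out.println(hcf(12, 18));      // 6
        System.out.println(lcm(4, 6));        // 12
        System.out.println(isPrime(13));      // true
        System.out.println(average(3, 4));    // 3.5
    }
}
```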
Compressed Sensing using Generative Model - kenluck2001
This document summarizes Kenneth Emeka Odoh's presentation on using generative models for compressed sensing. It introduces compressed sensing and describes how generative models like variational autoencoders (VAEs) and generative adversarial networks (GANs) can be used to solve the underdetermined linear systems that arise in compressed sensing. The document also compares VAEs to standard autoencoders and discusses how generative models may require fewer training examples due to regularization imposing structure on the problem.
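In the usual formulation (our notation, not necessarily the presentation's), compressed sensing observes y = Ax with far fewer measurements than unknowns; the generative-model approach replaces the sparsity prior with the range of a trained generator G and solves, roughly:

```latex
\hat{z} \;=\; \arg\min_{z} \,\lVert A\,G(z) - y \rVert_2^2, \qquad \hat{x} = G(\hat{z}),
```

where G is the decoder of a VAE or the generator of a GAN.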
TMPA-2015: Implementing the MetaVCG Approach in the C-light System - Iosif Itkin
Alexei Promsky, Dmitry Kondtratyev, A.P. Ershov Institute of Informatics Systems, Novosibirsk
12 - 14 November 2015
Tools and Methods of Program Analysis in St. Petersburg
This presentation describes the results of the following paper, published in the journal Information and Software Technology:
TITLE: A Large Scale Empirical Comparison of State-of-the-art Search-based Test Case Generators
AUTHORS: Annibale Panichella, Fitsum Kifetew, Paolo Tonella
ABSTRACT: Context: Replication studies and experiments form an important foundation in advancing scientific research. While their prevalence in Software Engineering is increasing, there is still more to be done. Objective: This article aims to extend our previous replication study on search-based test generation techniques by performing a large-scale empirical comparison with further techniques from the state of the art. Method: We designed a comprehensive experimental study involving six techniques, a benchmark composed of 180 non-trivial Java classes, and a total of 21,600 independent executions. Metrics regarding the effectiveness and efficiency of the techniques were collected and analyzed by means of statistical methods. Results: Our empirical study shows that single-target approaches are generally outperformed by multi-target approaches, while within the multi-target approaches, DynaMOSA/MOSA, which are based on many-objective optimization, outperform the others, in particular for complex classes. Conclusion: The results obtained from our large-scale empirical investigation confirm what has been reported in previous studies, while also highlighting striking differences and novel observations. Future studies, on different benchmarks and considering additional techniques, could further reinforce and extend our findings.
Recent years have seen the emergence of several static analysis techniques for reasoning about programs. This talk presents several major classes of techniques and tools that implement these techniques. Part of the presentation will be a demonstration of the tools.
Dr. Subash Shankar is an Associate Professor in the Computer Science department at Hunter College, CUNY. Prior to joining CUNY, he received a PhD from the University of Minnesota and was a postdoctoral fellow in the model checking group at Carnegie Mellon University. Dr. Shankar also has over 10 years of industrial experience, mostly in the areas of formal methods and tools for analyzing hardware and software systems.
- An interface in Java is a blueprint of a class that defines static constants and abstract methods. Interfaces are used to achieve abstraction and multiple inheritance in Java. A class implements an interface by providing method bodies for the abstract methods defined in the interface. Interfaces allow for loose coupling between classes.
- The main reasons to use interfaces are for abstraction, to support multiple inheritance functionality, and to achieve loose coupling between classes. An interface is declared using the interface keyword and contains only method signatures, not method bodies. A class implementing an interface must implement all of its methods.
- Interfaces can extend other interfaces, requiring implementing classes to provide implementations for all methods in the inherited interfaces as well. Object cloning uses the clone() method together with the Cloneable marker interface. A minimal sketch of these interface mechanics follows.
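A minimal Java sketch of the points above: an interface containing only signatures, an interface extending another, and a class that must implement all inherited methods (names are illustrative):

```java
// Minimal sketch of interface declaration, implementation, and extension.
interface Shape {
    double area();                         // abstract method: signature only
}

interface Printable extends Shape {        // an interface extending another
    void print();
}

public class InterfaceDemo implements Printable {
    private final double side;

    InterfaceDemo(double side) {
        this.side = side;
    }

    @Override
    public double area() {                 // must implement methods from Shape...
        return side * side;
    }

    @Override
    public void print() {                  // ...and from Printable
        System.out.println("area = " + area());
    }

    public static void main(String[] args) {
        Printable p = new InterfaceDemo(3.0);  // loose coupling: code to the interface
        p.print();                             // area = 9.0
    }
}
```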
The document discusses Java operators, arrays, packages, and Javadoc comments. It includes code examples demonstrating the use of arithmetic, unary, and relational operators. It also shows how to declare and initialize single and multi-dimensional arrays. Examples of user-defined packages and importing packages are provided. Finally, it briefly explains what Javadoc comments are and their purpose.
Automated Repair of Feature Interaction Failures in Automated Driving Systems - Lionel Briand
The document describes an approach called ARIEL for automatically repairing feature interaction failures in automated driving systems. ARIEL uses a customized genetic programming algorithm that leverages fault localization weighted by test failure severity and generates patches through threshold changes and condition reordering. An evaluation on two industrial case studies found ARIEL outperformed baselines in fully repairing the systems within 16 hours. Domain experts judged the synthesized patches to be valid, understandable, useful and superior to what could be produced manually.
This document provides an overview of a tutorial on event processing under uncertainty. It begins with an introduction that discusses how most real-world data contains some level of uncertainty and an illustrative example of using event processing to detect crimes from uncertain video surveillance and citizen reports. It then outlines the topics to be covered, including representing and modeling different types of uncertainty, and extending event processing techniques to handle uncertain event data.
Debs 2010 context based computing tutorial - Opher Etzion
The document discusses context-aware computing and its utilization in event-based systems. It provides an agenda for a tutorial that covers context in general, context categories including temporal, spatial, state and segmentation contexts, and context implementation in practice and event processing. The document also provides examples of different types of contexts and how context can be composed from multiple contexts.
This document provides an agenda for a tutorial on context aware computing and its use in event-based systems. The tutorial will be presented at DEBS 2010 on July 12, 2010. It will cover context in general, context in depth, and context in practice. Specifically, it will introduce context in pragmatics and language. It will also discuss roles of contexts in computing and event processing. Additionally, it will cover temporal, segmentation, spatial, and state contexts as well as context composition. Finally, it will discuss operational semantics of context and implementations of context in event processing and other computing areas.
Context refers to information that can characterize an entity's situation, such as a user's location, identity, activity, time, nearby objects and people. Context-aware computing uses sensors to automatically collect context and adapt applications and services to the user's context. There is a need for context-aware computing because computers lack the implicit contextual cues that humans use in communication. Context-aware applications can make interactions between humans and computers more natural by tailoring information and services to the user's changing context.
The document discusses algorithms complexity and data structures efficiency. It covers topics like asymptotic notation to analyze algorithm complexity, different time and memory complexities like constant, logarithmic, linear, quadratic, and exponential. Examples are provided to illustrate complexity calculations for algorithms with operations on arrays, matrices and recursion. Choosing the right data structures depends on the algorithm complexity and problem requirements.
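A typical instance of the complexity calculations described (our own illustration): a single pass over an array is O(n), while two nested passes perform n^2 iterations and are O(n^2).

```java
public class ComplexityDemo {
    // O(n): one pass over the array.
    static long sum(int[] a) {
        long s = 0;
        for (int x : a) s += x;
        return s;
    }

    // O(n^2): n iterations of an inner loop that itself runs n times.
    static long pairSum(int[] a) {
        long s = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a.length; j++) {
                s += a[i] + a[j];
            }
        }
        return s;
    }

    public static void main(String[] args) {
        int[] a = new int[1000];
        java.util.Arrays.fill(a, 1);
        System.out.println(sum(a));      // 1000, after 1000 iterations
        System.out.println(pairSum(a));  // 2000000, after 1000 * 1000 iterations
    }
}
```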
This document discusses algorithmic research problems and different types of algorithms used to solve them. It begins by defining an algorithm and providing examples of common algorithm types like search, sorting, shortest path, and more. It then covers different types of algorithmic problems like polynomial problems, which can be solved in polynomial time by polynomial algorithms, and NP-hard or combinatorial problems, which typically require exponential algorithms. Several examples are given of problems that fall into each category. The document also discusses how problem complexity is analyzed and how it relates to the algorithm types.
The document describes a multistage system for monitoring the technical condition of aircraft gas turbine engines. It proposes using soft computing methods like fuzzy logic and neural networks at early stages when data is limited or uncertain. At later stages, it uses statistical analysis methods like analyzing changes in skewness, kurtosis, and correlation coefficients of engine parameters. Fuzzy regression models are developed using these techniques to estimate engine condition. The system provides staged estimation of engine technical conditions from initial to later stages of operation.
This document discusses pointcuts and static analysis in aspect-oriented programming. It provides an example of using aspects to ensure thread safety in Swing by wrapping method calls in invokeLater. It proposes representing pointcuts as relational queries over a program representation, and rewriting pointcuts as Datalog queries for static analysis. Representing programs and pointcuts relationally in this way enables precise static analysis of crosscutting concerns.
The document describes an algorithm for detecting R-peaks in an electrocardiogram (ECG) signal using MATLAB. It involves several steps: (1) removing low frequency components from the ECG signal using FFT, (2) finding local maxima using a windowed filter, (3) removing small values and storing significant peaks, (4) adjusting the filter size and repeating steps 2-3. The algorithm is demonstrated on two ECG data samples, showing the processed signal and detected peaks at each step. Finally, the document explains how to implement the algorithm in a neural network using the MATLAB Neural Network Toolbox.
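The demo itself is MATLAB, but the windowed local-maxima step (steps 2-3) translates directly; here is a hedged Java sketch, with the window size and threshold as illustrative parameters rather than the document's values:

```java
import java.util.ArrayList;
import java.util.List;

public class PeakDetector {
    // Keep an index i if signal[i] is the maximum of its surrounding window
    // and exceeds a threshold (dropping the "small values" of step 3).
    static List<Integer> findPeaks(double[] signal, int halfWindow, double threshold) {
        List<Integer> peaks = new ArrayList<>();
        for (int i = 0; i < signal.length; i++) {
            if (signal[i] < threshold) continue;
            boolean isMax = true;
            int lo = Math.max(0, i - halfWindow);
            int hi = Math.min(signal.length - 1, i + halfWindow);
            for (int j = lo; j <= hi && isMax; j++) {
                if (signal[j] > signal[i]) isMax = false;
            }
            if (isMax) peaks.add(i);
        }
        return peaks;
    }

    public static void main(String[] args) {
        double[] ecg = {0.1, 0.2, 1.5, 0.3, 0.1, 0.2, 1.7, 0.2, 0.1};
        System.out.println(findPeaks(ecg, 2, 1.0));  // [2, 6]
    }
}
```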
This R code document contains code for implementing the Expectation-Maximization (EM) algorithm for Gaussian mixture models with 1, 2, and 3 clusters of data. The code includes functions for the EM steps, starting values, and plotting the results. It applies the EM algorithm to real datasets with 1 and 2 dimensions and to a simulated 3 cluster dataset.
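For reference, the two steps the R functions iterate are, in standard Gaussian-mixture notation (a textbook reconstruction, not copied from the code):

```latex
\text{E-step:}\quad
\gamma_{ik} = \frac{\pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
                   {\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)};
\qquad
\text{M-step:}\quad
\pi_k = \frac{1}{N}\sum_{i}\gamma_{ik},\quad
\mu_k = \frac{\sum_i \gamma_{ik}\,x_i}{\sum_i \gamma_{ik}},\quad
\Sigma_k = \frac{\sum_i \gamma_{ik}\,(x_i-\mu_k)(x_i-\mu_k)^{\top}}{\sum_i \gamma_{ik}}.
```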
This document proposes an approach to assist developers in verifying the deployment of big data analytics applications on Hadoop clouds. The approach involves three main steps: 1) log abstraction reduces the size of logs by grouping similar log lines, 2) log linking provides context by linking logs with the same task IDs, and 3) sequence simplification deals with repeated logs by removing duplicate events. This helps address issues like the large amount of log data and lack of context when verifying applications at scale in cloud environments.
The document discusses Bluespec, a hardware description language that combines features of Haskell and SystemVerilog assertions (SVA). Bluespec models all state explicitly using guarded atomic actions on state. Behavior is expressed as rules with guards and actions. Assertions in Bluespec are compiled into finite state machines and checked concurrently as rules. The document provides an example of using Bluespec to write functional and performance assertions for a cache controller design.
Complete and Interpretable Conformance Checking of Business Processes - Marlon Dumas
This document presents a new approach for conformance checking of business processes that identifies all differences between a process model and an event log. It generates natural language statements to describe each difference. The approach works by translating the model and log into prime event structures and extracting mismatches by comparing their partially synchronized product. It can identify seven elementary mismatch patterns to characterize deviations. The approach was implemented in a standalone Java tool and evaluated on a real-life process with over 150,000 event traces.
This document defines algorithms and how to analyze them. It begins by defining an algorithm informally as a tool for solving computational problems and formally in terms of mathematical models of computation. It then discusses analyzing algorithms to determine their performance characteristics like running time. In particular, it focuses on analyzing the time or space complexity of algorithms, usually expressed as functions of the input size. The document provides examples and conventions for analyzing simple algorithms like linear search. It emphasizes analyzing asymptotic worst-case complexity to compare algorithms and determine how efficiently they solve problems even for large inputs.
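Linear search is the document's running example; a direct Java version with its cost noted in comments (standard algorithm, our code):

```java
public class LinearSearch {
    // Returns the index of key in a, or -1 if absent.
    // Worst case: n comparisons (key absent or in the last slot);
    // average: about n/2 comparisons for a successful search.
    static int linearSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {7, 3, 9, 3, 5};
        System.out.println(linearSearch(a, 9));   // 2
        System.out.println(linearSearch(a, 4));   // -1, after scanning all 5 elements
    }
}
```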
Implementation of pre-compensation fuzzy for a cascade PID controller ... - Alexander Decker
1) The document describes a fuzzy logic-based pre-compensation scheme for a PID controller to improve its performance for processes with nonlinear dynamics.
2) A fuzzy pre-compensator uses error and change in error as inputs to generate a correction term that is added to the controller command signal.
3) 47 fuzzy rules are used in the pre-compensator to modify the command signal and compensate for overshoot and undershoot based on the current error and error change.
4) Simulation results show that the fuzzy pre-compensated PID controller provides better response and disturbance rejection compared to a conventional PID controller.
This document discusses detecting R-peaks in an electrocardiogram (ECG) signal using MATLAB. It describes the basic task of ECG processing as R-peak detection and some challenges like irregular peaks and breathing noise. The key steps are presented as removing low frequencies, applying a window filter twice to detect peaks, and optimizing the filter window size. Code examples are provided to demonstrate the processing pipeline on two ECG samples, showing the original signal and results of each step. The document concludes by instructing the reader to type "ecgdemo" in the MATLAB command window to run the code.
1. AlphaZero uses self-play reinforcement learning to train a neural network to evaluate board positions and select moves. It trains offline by playing games against itself, using the results to iteratively improve its network.
2. During online play, AlphaZero uses Monte Carlo tree search with the neural network to select moves. It evaluates many random simulations of possible future games to a certain depth, using the network to approximate values beyond that depth.
3. The success of AlphaZero is due to skillfully combining known reinforcement learning techniques like self-play training, neural network function approximation, and Monte Carlo tree search with powerful computational resources.
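The move-selection step in point 2 is commonly written as the PUCT rule (standard form from the AlphaZero literature, not quoted from this document):

```latex
a^{*} = \arg\max_{a}\left[\, Q(s,a) \;+\; c_{\text{puct}}\; P(s,a)\,\frac{\sqrt{\sum_b N(s,b)}}{1 + N(s,a)} \right],
```

where P(s,a) is the network's move prior, Q(s,a) the mean simulation value, and N the visit counts accumulated during the search.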
Real Time System Identification of Speech Signal Using TMS320C6713 - IOSRJVSP
In this paper, real-time system identification is implemented on the TMS320C6713 for a speech signal, using the standard IEEE sentence (SP23) from the NOIZEUS database with different types of real-world noise at different SNR levels taken from the AURORA database. Performance is measured in terms of signal-to-noise ratio (SNR) improvement. The implementation is done in a "C" program for the least mean square (LMS) and recursive least square (RLS) algorithms and processed on the DSP processor with Code Composer Studio.
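For context, the LMS update implemented in such "C" programs has the standard form (generic adaptive-filter notation, not taken from the paper):

```latex
e(n) = d(n) - \mathbf{w}^{\top}(n)\,\mathbf{x}(n), \qquad
\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, e(n)\,\mathbf{x}(n),
```

where $\mu$ is the step size; RLS replaces the fixed step with a recursively updated gain, trading extra computation per sample for faster convergence.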
Brief introduction to Algorithm analysis - Anantha Ramu
The slides explain these concepts:
1. What is Asymptotic analysis
2. Why do we need it
3. Examples of Notation
4. What are the various kinds of Asymptotic analysis
5. How to compute Big O Notation (a worked instance follows this list)
6. Big Oh examples
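A worked instance of item 5 (our example): for f(n) = 3n^2 + 5n + 7,

```latex
f(n) = 3n^{2} + 5n + 7 \;\le\; 3n^{2} + 5n^{2} + 7n^{2} = 15n^{2} \quad (n \ge 1)
\;\Longrightarrow\; f(n) = O(n^{2}),
```

with witness constants c = 15 and n_0 = 1.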
The document discusses fundamentals of analyzing algorithm efficiency, including:
- Measuring an algorithm's time efficiency based on input size and number of basic operations.
- Using asymptotic notations like O, Ω, Θ to classify algorithms by order of growth.
- Analyzing worst-case, best-case, and average-case efficiencies.
- Setting up recurrence relations to analyze recursive algorithms like merge sort (shown below).
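That merge sort recurrence and its standard solution (textbook result, not specific to this document):

```latex
T(n) = 2\,T\!\left(\frac{n}{2}\right) + \Theta(n), \qquad T(1) = \Theta(1)
\;\;\Longrightarrow\;\; T(n) = \Theta(n \log n).
```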
DEBS 2019 tutorial: correctness and consistency of event-based systems - Opher Etzion
The tutorial includes: temporal correctness, tuning up the semantics of event-based applications, data consistency, and validation and verification of event-based systems
SW architectures 2018 on microservices and EDA - Opher Etzion
This document discusses microservices and event-driven architectures. It provides an example of a "smart road" system using these approaches. Events are used to communicate between isolated microservices to handle tasks like vehicle identification, traffic rate calculation, billing, and detecting stolen vehicles. The document also discusses using events to handle consistency requirements when processing salary increases across budget-checking and other microservices. Events allow microservices to remain independent while coping with global consistency needs through approaches like compensating transactions triggered by event notifications.
ER 2017 tutorial - On Paradoxes, Autonomous Systems and dilemmas - Opher Etzion
This document discusses paradoxes, autonomous systems, and the dilemmas they present. It covers paradoxes like Russell's paradox and Zeno's paradox to illustrate logical contradictions. It then outlines how autonomous systems work by sensing their environment, making sense of inputs, decision-making, and acting. Examples of current and future applications in healthcare, industrial, and military robotics are provided. However, it also notes the dangers of causal inference from correlation and sources of uncertainty. Deep learning allows autonomous systems to self-learn but lacks transparency. This could enable remote killing and raises issues around social equality if advanced systems surpass human capabilities.
This presentation, given at the un-conference on technology and art, is a first glance at new research on using event-driven technology to create a new type of literature.
Has Internet of Things really happened? - Opher Etzion
The document discusses challenges facing widespread adoption of the Internet of Things (IoT). While sensor technology exists, each sensor operates in isolation and multi-sensor systems are difficult to construct. Other challenges include data quality issues, lack of ubiquitous networks, integration difficulties, need for more sensor innovation, and inadequate security for the status quo. Standardization efforts are underway but democratizing use through sensor integration, personalization, and pervasive applications remains challenging.
On the personalization of event-based systems - Opher Etzion
The document discusses personalizing event-based systems to better assist elderly individuals. It proposes sensors around the home that can detect events like an unlocked door, falling, or vocal distress to alert family members. While the technology exists, it needs to be more personalized, affordable, and simple to use. The research requires a multi-disciplinary effort across technology, human factors, economics, and specific domains. Personalizing such systems to individuals and simplifying the technology is seen as key to widespread adoption.
On Internet of Everything and Personalization. Talk in INTEROP 2014 - Opher Etzion
This document discusses the Internet of Everything (IoE) and personalization. It begins by defining the IoE as connecting people and machines through technologies like sensors. Examples are given of how sensors can detect situations and trigger actions. Challenges to the IoE include data quality issues, lack of integration between sensors, and complexity that limits widespread use. The document argues for standards and simplifying models to make IoE technologies more accessible and customizable to individual needs and situations. Security, privacy and the democratization of IoE technologies are also addressed.
Introduction to the institute of technological empowerment - Opher Etzion
This short presentation was given on July 9, 2014, at a meeting with people interested in the Institute of Technology Empowerment and provides a brief introduction to the institute's goals and activities.
DEBS 2014 tutorial on the Internet of Everything - Opher Etzion
This document provides an overview of the Internet of Everything (IoE) and event processing. It discusses how IoE is an extension of concepts like the Internet of Things (IoT) and how everything will become connected. The document outlines how IoE will impact different areas like healthcare, agriculture, cities, retail, homes and more. It also presents a futuristic view where driverless cars and automated personal assistants become common due to advances in sensors and connectivity.
The Internet of Things and some introduction to the Technological Empowerment... - Opher Etzion
This document outlines the vision for the Technological Empowerment Institute, which aims to empower people and businesses in developing areas through smart sensor-based systems for situation awareness. The institute will have three legs: a multi-disciplinary research leg, an implementation leg through partnerships, and an education leg. Some example target applications mentioned are assistance for elderly independent living through situational awareness alerts, supply chain monitoring, agriculture environmental control, and water quality monitoring. The overall goal is to help realize the power of event-driven systems to create a better world.
ER 2013 tutorial: modeling the event driven world - Opher Etzion
This document discusses event-driven thinking and modeling compared to traditional request-driven computing. Some key points made:
1) Event-driven applications follow the "4D" paradigm of detect, derive, decide, do to achieve awareness and reaction, while traditional systems are more request-driven.
2) Event-driven logic is sensitive to the timing and order of events, rather than just the event occurrence. Temporal considerations are important.
3) State handling is more complex with events, as logic may depend on patterns spanning unmatched past events.
Existing event processing languages still require technical skills and do not fully address modeling at the business level in an intuitive way. Purely computational approaches are not enough.
Debs 2013 tutorial: Why is event-driven thinking different from traditional ... - Opher Etzion
This document discusses the differences between traditional and event-driven thinking in computing. It begins with a brief history of event processing, noting its origins in academic projects in the early 2000s and increased adoption in recent years driven by big data and IoT trends. The document then outlines some key differences in event-driven thinking, such as its focus on reacting immediately to events rather than responding to periodic queries. It provides an example of using events to detect suspicious banking activity in real-time. Finally, it discusses challenges in expressing event-driven logic and temporal considerations using traditional request-driven approaches.
This document proposes using event processing as a key to immortality. It discusses recent medical research using sensors and actuators to monitor and treat diseases at the cellular level. The goal is to create an "ultimate automatic physician" that can detect any sickness, react proactively or reactively, and make treatment decisions using event processing networks as its "brain". The challenge presented is using such techniques to significantly extend the human lifespan.
The presentation introduced a basic proactive model for taking timely actions to optimize outcomes given anticipated unplanned events. It uses event patterns to predict events and their timing with uncertainty, then determines the optimal action over time by considering action costs and impacts. Several application scenarios were discussed. While demonstrating feasibility, further work is needed to address challenges like real-time optimization for other cases, improved forecasting models, usability, and scalability.
Proactive event-driven computing is a new paradigm that aims to take proactive actions based on forecasted events before problems occur. It involves detecting events, forecasting future events, making real-time decisions, and taking proactive actions. The document discusses challenges in creating proactive solutions and outlines key aspects of proactive computing including uncertainty handling, real-time decision making under uncertainty, and learning patterns and causal relationships.
Event processing systems have evolved from early reactive systems to now support predictive capabilities using techniques like machine learning. Key challenges for these systems include handling uncertain and incomplete event data, performing predictive event processing through graphical model techniques, leveraging machine learning to automate pattern discovery and modeling, and evolving from purely reactive systems to proactively take actions based on predictions. Addressing these challenges will help make event processing systems more intelligent and adaptive.
Debs 2011 tutorial on non-functional properties of event processing - Opher Etzion
The document discusses various non-functional properties of event processing systems including performance, scalability, availability, usability, and security considerations. It covers topics such as performance benchmarks and indicators, approaches to scaling systems both vertically and horizontally, high availability techniques using redundancy and duplication, usability factors like learnability and satisfaction, and validation methods for ensuring correctness.
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API - UiPathCommunity
Join this UiPath Community Berlin meetup to explore the Orchestrator API, Swagger interface, and the Test Manager API. Learn how to leverage these tools to streamline automation, enhance testing, and integrate more efficiently with UiPath. Perfect for developers, testers, and automation enthusiasts!
📕 Agenda
Welcome & Introductions
Orchestrator API Overview
Exploring the Swagger Interface
Test Manager API Highlights
Streamlining Automation & Testing with APIs (Demo)
Q&A and Open Discussion
👉 Join our UiPath Community Berlin chapter: https://community.uipath.com/berlin/
This session streamed live on April 29, 2025, 18:00 CET.
Check out all our upcoming UiPath Community sessions at https://community.uipath.com/events/.
Technology Trends in 2025: AI and Big Data Analytics - InData Labs
At InData Labs, we have been keeping an ear to the ground, looking out for AI-enabled digital transformation trends coming our way in 2025. Our report will provide a look into the technology landscape of the future, including:
-Artificial Intelligence Market Overview
-Strategies for AI Adoption in 2025
-Anticipated drivers of AI adoption and transformative technologies
-Benefits of AI and Big data for your business
-Tips on how to prepare your business for innovation
-AI and data privacy: Strategies for securing data privacy in AI models, etc.
Download your free copy now and implement the key findings to improve your business.
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025 - BookNet Canada
Book industry standards are evolving rapidly. In the first part of this session, we’ll share an overview of key developments from 2024 and the early months of 2025. Then, BookNet’s resident standards expert, Tom Richardson, and CEO, Lauren Stewart, have a forward-looking conversation about what’s next.
Link to recording, transcript, and accompanying resource: https://bnctechforum.ca/sessions/standardsgoals-for-2025-standards-certification-roundup/
Presented by BookNet Canada on May 6, 2025 with support from the Department of Canadian Heritage.
Dev Dives: Automate and orchestrate your processes with UiPath Maestro - UiPathCommunity
This session is designed to equip developers with the skills needed to build mission-critical, end-to-end processes that seamlessly orchestrate agents, people, and robots.
📕 Here's what you can expect:
- Modeling: Build end-to-end processes using BPMN.
- Implementing: Integrate agentic tasks, RPA, APIs, and advanced decisioning into processes.
- Operating: Control process instances with rewind, replay, pause, and stop functions.
- Monitoring: Use dashboards and embedded analytics for real-time insights into process instances.
This webinar is a must-attend for developers looking to enhance their agentic automation skills and orchestrate robust, mission-critical processes.
👨🏫 Speaker:
Andrei Vintila, Principal Product Manager @UiPath
This session streamed live on April 29, 2025, 16:00 CET.
Check out all our upcoming Dev Dives sessions at https://community.uipath.com/dev-dives-automation-developer-2025/.
Linux Support for SMARC: How Toradex Empowers Embedded Developers - Toradex
Toradex brings robust Linux support to SMARC (Smart Mobility Architecture), ensuring high performance and long-term reliability for embedded applications. Here’s how:
• Optimized Torizon OS & Yocto Support – Toradex provides Torizon OS, a Debian-based easy-to-use platform, and Yocto BSPs for customized Linux images on SMARC modules.
• Seamless Integration with i.MX 8M Plus and i.MX 95 – Toradex SMARC solutions leverage NXP's i.MX 8M Plus and i.MX 95 SoCs, delivering power efficiency and AI-ready performance.
• Secure and Reliable – With Secure Boot, over-the-air (OTA) updates, and LTS kernel support, Toradex ensures industrial-grade security and longevity.
• Containerized Workflows for AI & IoT – Support for Docker, ROS, and real-time Linux enables scalable AI, ML, and IoT applications.
• Strong Ecosystem & Developer Support – Toradex offers comprehensive documentation, developer tools, and dedicated support, accelerating time-to-market.
With Toradex’s Linux support for SMARC, developers get a scalable, secure, and high-performance solution for industrial, medical, and AI-driven applications.
Do you have a specific project or application in mind where you're considering SMARC? We can help with a free compatibility check and support a quick time-to-market.
For more information: https://www.toradex.com/computer-on-modules/smarc-arm-family
HCL Nomad Web – Best Practices and Administration of Multiuser Environments - panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-nomad-web-best-practices-und-verwaltung-von-multiuser-umgebungen/
HCL Nomad Web is celebrated as the next generation of the HCL Notes client and offers numerous advantages, such as eliminating the need for packaging, distribution, and installation. Nomad Web client updates are installed "automatically" in the background, which significantly reduces the administrative effort compared to traditional HCL Notes clients. However, troubleshooting in Nomad Web presents unique challenges compared to the Notes client.
Join Christoph and Marc as they demonstrate how the troubleshooting process in HCL Nomad Web can be simplified to ensure a smooth and efficient user experience.
In this webinar, we will explore effective strategies for diagnosing and resolving common problems in HCL Nomad Web, including
- Accessing the console
- Locating and interpreting log files
- Accessing the data folder in the browser's cache (using OPFS)
- Understanding the differences between single-user and multi-user scenarios
- Using the Client Clocking feature
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul
Artificial intelligence is changing how businesses operate. Companies are using AI agents to automate tasks, reduce time spent on repetitive work, and focus more on high-value activities. Noah Loul, an AI strategist and entrepreneur, has helped dozens of companies streamline their operations using smart automation. He believes AI agents aren't just tools—they're workers that take on repeatable tasks so your human team can focus on what matters. If you want to reduce time waste and increase output, AI agents are the next move.
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxAnoop Ashok
In today's fast-paced retail environment, efficiency is key. Every minute counts, and every penny matters. One tool that can significantly boost your store's efficiency is a well-executed planogram. These visual merchandising blueprints not only enhance store layouts but also save time and money in the process.
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfAbi john
Analyze the growth of meme coins from mere online jokes to potential assets in the digital economy. Explore the community, culture, and utility that are elevating them to a new era in cryptocurrency.
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...Alan Dix
Talk at the final event of Data Fusion Dynamics: A Collaborative UK-Saudi Initiative in Cybersecurity and Artificial Intelligence funded by the British Council UK-Saudi Challenge Fund 2024, Cardiff Metropolitan University, 29th April 2025
https://alandix.com/academic/talks/CMet2025-AI-Changes-Everything/
Is AI just another technology, or does it fundamentally change the way we live and think?
Every technology has a direct impact with micro-ethical consequences, some good, some bad. However more profound are the ways in which some technologies reshape the very fabric of society with macro-ethical impacts. The invention of the stirrup revolutionised mounted combat, but as a side effect gave rise to the feudal system, which still shapes politics today. The internal combustion engine offers personal freedom and creates pollution, but has also transformed the nature of urban planning and international trade. When we look at AI the micro-ethical issues, such as bias, are most obvious, but the macro-ethical challenges may be greater.
At a micro-ethical level AI has the potential to deepen social, ethnic and gender bias, issues I have warned about since the early 1990s! It is also being used increasingly on the battlefield. However, it also offers amazing opportunities in health and education, as the recent Nobel prizes for the developers of AlphaFold illustrate. More radically, the need to encode ethics acts as a mirror to surface essential ethical problems and conflicts.
At the macro-ethical level, by the early 2000s digital technology had already begun to undermine sovereignty (e.g. gambling), market economics (through network effects and emergent monopolies), and the very meaning of money. Modern AI is the child of big data, big computation and ultimately big business, intensifying the inherent tendency of digital technology to concentrate power. AI is already unravelling the fundamentals of the social, political and economic world around us, but this is a world that needs radical reimagining to overcome the global environmental and human challenges that confront us. Our challenge is whether to let the threads fall as they may, or to use them to weave a better future.
AI and Data Privacy in 2025: Global TrendsInData Labs
In this infographic, we explore how businesses can implement effective governance frameworks to address AI data privacy. Understanding this landscape is crucial for developing effective strategies that ensure compliance, safeguard customer trust, and leverage AI responsibly. Equip yourself with insights that can drive informed decision-making and position your organization for success in the future of data privacy.
This infographic contains:
-AI and data privacy: Key findings
-Statistics on AI data privacy in today's world
-Tips on how to overcome data privacy challenges
-Benefits of AI data security investments.
Keep up-to-date on how AI is reshaping privacy standards and what this entails for both individuals and organizations.
Big Data Analytics Quick Research Guide by Arthur MorganArthur Morgan
This is a Quick Research Guide (QRG).
QRGs include the following:
- A brief, high-level overview of the QRG topic.
- A milestone timeline for the QRG topic.
- Links to various free online resource materials to provide a deeper dive into the QRG topic.
- Conclusion and a recommendation for at least two books available in the SJPL system on the QRG topic.
QRGs planned for the series:
- Artificial Intelligence QRG
- Quantum Computing QRG
- Big Data Analytics QRG
- Spacecraft Guidance, Navigation & Control QRG (coming 2026)
- UK Home Computing & The Birth of ARM QRG (coming 2027)
Any questions or comments?
- Please contact Arthur Morgan at [email protected].
100% human made.
Artificial Intelligence is providing benefits in many areas of work within the heritage sector, from image analysis to idea generation and new research tools. However, it is more critical than ever for people, with analogue intelligence, to ensure the integrity and ethical use of AI. Including real people can improve the use of AI by identifying potential biases, cross-checking results, refining workflows, and providing contextual relevance to AI-driven results.
News about the impact of AI often paints a rosy picture. In practice, there are many potential pitfalls. This presentation discusses these issues and looks at the role of analogue intelligence and analogue interfaces in providing the best results to our audiences. How do we deal with factually incorrect results? How do we get content generated that better reflects the diversity of our communities? What roles are there for physical, in-person experiences in the digital world?
2. Motivation
Previous performance studies indicate that there is a major performance degradation as application complexity increases; the goal of this work is to optimize complex scenarios.
- Adi A., Etzion O. Amit – the situation manager. The VLDB Journal – The International Journal on Very Large Databases, Volume 13, Issue 2, 2004.
- Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260.
4. An example of a complex scenario
A process has 16 steps (E1, E2, E3, …, E15, E16) that have to be executed in a predefined order; termination of each step creates an event with a status code (SC). The process is reported as committed when the 16 steps have completed in the correct order (sequence pattern) and the pattern assertion is satisfied. The assertion may look like: E1.SC == E2.SC or E3.SC < 4. For this scenario we achieved more than a tenfold decrease in latency, or more than a 20% increase in throughput.
5. Pattern Rewriting Approach
The goal: create an equivalent pattern that provides better performance. Rewriting techniques exist in other domains such as rule systems and SQL queries; due to the inherent complexity of event processing patterns there are some unique challenges. Two example rewritings:
- Subsumption of common logic: seq(E1,E2,E3,E4) and seq(E1,E2,E3,E5,E6) share the prefix seq(E1,E2,E3), which can be extracted into a derived event DE, giving seq(DE,E4) and seq(DE,E5,E6).
- Splitting for parallel execution: all(E1,E2,E3,E4) can be rewritten as all(E1,E2) and all(E3,E4), producing derived events DE1 and DE2 that are combined by all(DE1,DE2).
A minimal sketch of the basic split operation follows.
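To make the basic split concrete, here is a minimal sketch in Python (our own illustration, not the authors' implementation; `Pattern` and `split_seq` are hypothetical names):

from dataclasses import dataclass

@dataclass
class Pattern:
    kind: str           # "seq" or "all"
    participants: list  # ordered event types, e.g. ["E1", "E2", "E3", "E4"]
    emits: str          # name of the derived event this pattern produces

def split_seq(p: Pattern, k: int, derived: str = "DE"):
    """Split seq(E1..EN) after position k into two chained patterns:
    seq(E1..Ek) -> DE and seq(DE, Ek+1..EN)."""
    assert p.kind == "seq" and 0 < k < len(p.participants)
    left = Pattern("seq", p.participants[:k], emits=derived)
    right = Pattern("seq", [derived] + p.participants[k:], emits=p.emits)
    return left, right

left, right = split_seq(Pattern("seq", ["E1", "E2", "E3", "E4"], emits="OUT"), k=3)
print(left.participants, "->", left.emits)    # ['E1', 'E2', 'E3'] -> DE
print(right.participants, "->", right.emits)  # ['DE', 'E4'] -> OUT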
6. Challenges: Assertion Split
A pattern assertion (PA) is a predicate that the event collection needs to satisfy for the pattern to be matched. When a pattern is split, its assertion must be split as well, and two requirements arise:
- The direct connection of the two patterns implies an "AND" operator between PA' and PA''. For example, seq(E1,E2,E3) with pattern assertion E1.SC == E2.SC OR E3.SC < 4 cannot simply be split into seq(E1,E2) with PA' = (E1.SC == E2.SC) and seq(DE,E3) with PA'' = (E3.SC < 4), because the OR between the two parts would effectively become an AND.
- The assertion should be separable in terms of its variables. For example, seq(E1,E2,E3) with pattern assertion E1.SC == E3.SC AND E2.SC = 0 can be split into seq(E1,E3) with PA' = (E1.SC == E3.SC) and seq(DE,E2) with PA'' = (E2.SC = 0).
7. Assertion Split – Solution
1. Convert the pattern assertion expression into conjunctive normal form (CNF).
2. Identify independent participant sub-groups by generating the assertion-variable dependency graph.
3. The maximal number of independent partitions implies the finest granulation of the assertion expression.
Example: (E1.SC > E2.SC) AND (E4.SC > E5.SC) AND NOT((E5.SC==E6.SC) AND (E3.SC==77)) becomes, in CNF, (E1.SC > E2.SC) AND (E4.SC > E5.SC) AND (NOT(E5.SC==E6.SC) OR NOT(E3.SC==77)); the shared variable E5 links the last two clauses, so the independent groups are {E1, E2} and {E3, E4, E5, E6}.
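As a rough illustration of steps 2-3, the following sketch (our own; it assumes the assertion is already in CNF and models each conjunct only by the set of event variables it mentions) finds the independent participant sub-groups with a small union-find:

# Sketch: find independent participant sub-groups of a CNF assertion.
# Variables that share a conjunct (literal or OR-clause) end up in one group.

def independent_groups(conjuncts):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for vars_in_clause in conjuncts:
        vs = list(vars_in_clause)
        for v in vs[1:]:
            union(vs[0], v)

    groups = {}
    for v in parent:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())

# CNF of the slide's example:
# (E1.SC > E2.SC) AND (E4.SC > E5.SC) AND (NOT(E5.SC==E6.SC) OR NOT(E3.SC==77))
cnf = [{"E1", "E2"}, {"E4", "E5"}, {"E5", "E6", "E3"}]
print(independent_groups(cnf))  # two groups: {E1, E2} and {E3, E4, E5, E6}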
8. Pattern Matching – Policies
Pattern: seq(PG, ATM-W) within 10 minutes. Three policies govern matching (illustrated in the slide on a timeline of PG and ATM-W instances):
- Instance selection policy: which of the candidate instances (e.g., PG1 or PG2) participates in a detection.
- Cardinality policy: after the first detection, are additional detections reported?
- Consumption policy: on the first detection, are the participating instances consumed?
9. Challenges: Policies Mapping
A naïve pattern split, keeping the original policies in the rewritten version, will result in incorrect matching. For example, splitting seq(E1,E2,E3) with policies {single, last, …} into seq(E1,E2) and seq(DE,E3), each with the same policies {single, last, …}, moves the detection point: in the slide's timeline of blood-pressure-measure events, the original pattern selects the last instance (e2.2) before e3.1 arrives, while the rewritten left part already detects on e2.1 and the later instance e2.2 is missed.
10. Policies Mapping – Solution
Mapping of policies in the rewritten alternative (f2' + f2''), based on the original pattern (f1), for seq(E1,E2,E3) rewritten as seq(E1,E2) (f2') and seq(DE,E3) (f2''):

policy             | original (f1) | rewritten (f2') | rewritten (f2'')
Cardinality        | single        | unrestricted    | single
Instance selection | last          | last            | last
Consumption        | -             | reuse           | -

For the same split with unrestricted cardinality, plus pattern assertion extensions:

policy             | original (f1) | rewritten (f2') | rewritten (f2'')
Cardinality        | unrestricted  | unrestricted    | unrestricted
Instance selection | last          | each            | last
Consumption        | consume       | reuse           | consume
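One possible way to operationalize these tables (purely illustrative; the policy names follow the slide, the encoding is ours) is as a lookup from the original pattern's policy triple to the policies of the two rewritten parts:

# Sketch: the slide-10 policy mappings encoded as data.
# Keys: (cardinality, instance_selection, consumption) of the original f1;
# values: the corresponding policy triples for the left (f2') and right (f2'') parts.
POLICY_MAP = {
    ("single", "last", None): {
        "f2_left":  ("unrestricted", "last", "reuse"),
        "f2_right": ("single", "last", None),
    },
    ("unrestricted", "last", "consume"): {
        "f2_left":  ("unrestricted", "each", "reuse"),
        "f2_right": ("unrestricted", "last", "consume"),
    },
}

print(POLICY_MAP[("single", "last", None)]["f2_left"])
# ('unrestricted', 'last', 'reuse')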
11. Equivalence assurance
Denotational semantics approach: an event processing pattern is a function f mapping the pattern's input (participant set, PS) into its output (matching set, MS). For the split of seq(E1,…,EN) with PA and Policies into seq(E1,…,EK) with PA', Policies' (f2', producing DE) and seq(DE,EK+1,…,EN) with PA'', Policies'' (f2''), we formally demonstrate that for the same PS both alternatives produce the identical MS:

f1(PS, PA, Policies) == f2''( f2'(PS', PA', Policies') ∪ PS'', PA'', Policies'' ), where PS = PS' ∪ PS''.
12. Throughput vs. Latency Tradeoff
Pattern throughput is the average rate of events the pattern can process. Detecting-event latency is the delay between the last input event causing a pattern detection and the detection itself, which results in the derivation of an output event.
Example: seq(E1,E2,E3) produces derived event DE.
Detecting-event latency = DE.detection_time - E3.detection_time, where DE.detection_time is the time DE was detected by the system and E3.detection_time is the time E3 arrived at the system.
For seq(E1,…,EN), the unsplit pattern corresponds to lazy evaluation and favors throughput at the cost of latency; the rewriting into seq(E1,…,EN-2) and seq(DE,EN-1,EN) corresponds to eager evaluation and favors latency at the cost of throughput.
13. Bi-objective Performance Optimization
Define a bi-objective performance function and assign a scalar weight to each objective to be optimized: weight α to the detecting-event latency (lt) and the complementary weight (1-α) to the pattern throughput (th). Minimize a goal function of the form:

g = α*lt + C*(1-α)*(1/th)

We use a simulation-based approach to select the optimal rewriting alternative (minimizing the goal function g): for a set of rewriting alternatives A = {A1, …, AK}, find argmin_Ai(g).
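A minimal sketch of the selection step, assuming the constant C and the weight α are configured by the developer and the (latency, throughput) pairs come from simulation (the sample values below are taken from the results table on the next slide):

# Sketch: pick the rewriting alternative minimizing
# g = alpha*lt + C*(1 - alpha)*(1/th); C and alpha are placeholders.

def goal(lt_ms: float, th_eps: float, alpha: float, C: float = 10_000.0) -> float:
    return alpha * lt_ms + C * (1 - alpha) / th_eps

# alternative -> (detecting-event latency in ms, throughput in events/s)
alternatives = {
    "0:8 (base)": (260, 140),
    "4:4":        (95, 165),
    "7:1":        (15, 95),
}

alpha = 0.5  # equal weight on the latency and throughput terms
best = min(alternatives, key=lambda a: goal(*alternatives[a], alpha=alpha))
print(best)  # with these numbers and alpha=0.5: 7:1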
14. Experimental Results
Simulation results for seq(E1, …, E16), split in pairs:

rewriting # | split (eager : lazy)               | throughput (event/s) | detecting-event latency (ms)
1           | 0 : 8 (the base pattern)           | 140                  | 260
2           | 1 : 7 (not in the Pareto frontier) | 110                  | 189
3           | 2 : 6                              | 142                  | 174
4           | 3 : 5                              | 155                  | 147
5           | 4 : 4                              | 165                  | 95
6           | 5 : 3 (max throughput)             | 172                  | 63
7           | 6 : 2                              | 163                  | 32
8           | 7 : 1 (min latency)                | 95                   | 15
15. Future Work
We presented a pattern rewriting framework for event processing optimization, with more than tenfold performance improvement between the original pattern and its rewritten alternative. Future research and practical activities include:
- Investigation of additional rewritings: using patterns of the same type (e.g., for the all pattern), and additional methods for rewriting (e.g., seq using all and filter agents).
- Elaborating an algorithm for event processing network rewriting.
- Exploring a heuristic-based approach for selection of the rewriting alternative of the sequence pattern.
Editor's Notes
#3: This was a bit of background on event processing; now let's explain the motivation for this optimization work. Performance studies of event processing systems demonstrate that there is a significant performance degradation as application complexity increases. On the left we can see that there is a major decrease in system throughput as we make the scenario more complex – starting with simple filtering, going through aggregations, and eventually pattern detection, which has the worst performance with respect to system throughput. On the right we can see another example of a system performing well on standby world (...) but running into worse throughput (the upper one) and latency (the bottom one) as scenario complexity increases.
#6: In this work we define a framework for pattern rewriting, in order to obtain an optimized yet logically equivalent alternative. The rewriting capability exists in most declarative languages, and we've explored using this technique for event processing patterns. What can such a rewriting be used for? We can see two examples here: the first one extracts a piece of common logic from two patterns, reusing its result twice and saving CPU time for the whole system; the second example demonstrates how a pattern of type all (a conjunction of several events where the order is not important) can be rewritten into three all patterns, such that the first two can be executed in parallel, in case we are running in a distributed environment or on a multi-core platform. Our basic technique is splitting a single pattern into two patterns of the same type, where the first one produces a derived event which in turn is used as an input to the second one. The split of a pattern into two (somewhat independent) parts gives rise to some challenges, which I detail in the next couple of slides.
#7: As specified here, a pattern assertion is the condition events need to meet in order to satisfy the pattern. As an example, in the speculative customer scenario we are looking for six events, in this order, where each pair indicates buy and sell transactions of the same stock. So the pattern assertion will have this form … When splitting the single pattern with six participants into two consecutive patterns with, let's say, two and four participants respectively, we also need to split the pattern assertion that originally refers to all six participants in a single statement. The most intuitive thing to do is to separate the assertion into two parts, where the first one refers to the first pattern's participants and the other part to the second pattern's. The intuition is pretty much correct; however, we need to make sure that the assertion after the split satisfies two requirements: (1) both parts of the split condition are connected by the "AND" operator – due to the nature of this connection, both assertion parts will necessarily be satisfied by the final detection, and (2) the assertion split generates two independent parts in terms of participants (we cannot have the same participant on both the left and the right side of the split). These two requirements make the pattern assertion split a non-trivial task, and an informal algorithm for it is presented on the next slide…
#8: First we convert the pattern assertion, which is an expression with mathematical and logical operators, into CNF, in order to generate an expression with sub-expressions connected by "AND", which is required by the split. We can see an example here – the original expression is not in CNF and this one is, since NOT now appears before a literal and not before a complex clause. Now we can visually separate the assertion into three parts; however, it is still incorrect, because we have the event type E5 appearing both in this clause and in this one as well, which creates a dependency between these two clauses. In order to go ahead and identify independent components, we need to analyze the dependencies between the clauses of the rewritten assertion, and this can be done by generating a variable dependency graph, as demonstrated. Two variables are connected if they appear in the same literal or if they appear in the same OR clause. Unconnected components of such a graph imply the finest granulation of the pattern assertion and, in turn, the maximal number of independent sub-patterns in the rewritten version. Note that the presented technique is valid for both all and sequence patterns; however, for the sequence pattern it is only valid if the order of events after the assertion split is preserved, since the occurrence time is of importance.
#10: Pattern policies, which we mentioned at the beginning of this presentation, pose another challenge for rewriting. A naïve pattern split preserving the original policies does not provide a sufficient solution, as illustrated on this slide - … The reason for the incorrect matching resides in the fact that the rewritten pattern version, with its two separate parts, lost the synchronization that existed inherently in the original version, due to the fact that it was one single piece. The solution to this anomaly is to modify the pattern policies of the rewritten version (of its left part, actually) so that it works harder but produces sufficient input for the right-hand pattern, which eventually compensates for the lack of synchronization caused by the split. In this example our intuition about the first pattern in the rewritten version is that it should report matching sets in an unrestricted manner rather than a single one, in order to finally report the one that is actually the last one before the entire detection; and since we would like to get the entire picture on this last and correct report, we should keep instances and not consume them on particular detections before the critical last one. The next slide demonstrates the policies mapping of the original and the rewritten version that leads to the correct detection and the desired solution from our perspective.
#11: The upper table demonstrates the mapping done for the rewriting we talked about on the previous slide. While this example is quite a simple one (due to the single cardinality policy), the bottom table demonstrates the mapping of a more complex case, where we report matching sets in an unrestricted manner, i.e. multiple times within a single time window. As you can see, the last instance selection policy is not sufficient any more; we need to use "each" instead, and consumption switches to reuse as in the previous case. Due to the multiple reports, the reuse policy used here raises a problem, since we don't need to keep instances in the left pattern of the rewritten version forever, but only until they are consumed by the final detection. The extended pattern assertion ensures, among other things, that instances that participated in the final matching set are not reported any more. The policies mapping and pattern assertion here are the schematic solution; there can be several implementations following this schema.
#12: We supported our intuition about rewriting correctness with a formal proof, where we formally defined the pattern types (sequence, all) and the patterns' split, and demonstrated with formal statements that the original alternative is equivalent to the rewritten one. Using denotational semantics, we show that for the same input the original and the rewritten pattern forms produce identical output. Once we know how to split sequence and all patterns, we can leverage this technique to balance different performance objectives of the sequence pattern, as demonstrated on the next slide.
#13: Pattern throughput is … Pattern latency is … We can see that in the blue alternative we gain pretty good throughput, since we only do work on EN instance arrival. All the rest are pushed into the pattern's internal state, which makes the throughput of this original pattern very high. However, this high throughput comes at the cost of latency, since the amount of work we have to perform on EN instance arrival is big. To summarize, the left alternative provides good throughput but quite poor latency. The green rewriting, where as much weight as possible is on the left side and only the necessary remainder is on the right side, provides much better latency, since we perform preprocessing (as much as possible) in advance; the DE represents the result of this preprocessing, and only a small delta is executed upon EN instance arrival. This rewriting provides good latency along with poor throughput; the latter is caused by the hard work this pattern is required to perform, including the generation of the derived events and their transmission. It's easy to notice that the throughput versus latency tradeoff of the sequence pattern mirrors the eager versus lazy evaluation of the pattern, similar to the different evaluation modes of database queries.
#14: Once we know how throughput and latency are affected by different rewritings, the event processing application developer can configure their preference regarding the balance between these two. We can assign weight α to pattern latency and the complementary weight (1-α) to pattern throughput, and find the rewriting alternative minimizing our combined goal function g using a simulation-based approach. Experimental results are presented on the next slide.
#15: On this slide we demonstrate the results of the basic experiment, where a sequence pattern with 8 participants was split in three different manners; for example, in the second line two participants (a pair) were left on this side and six participants were moved to the right side – this is denoted by 1:3. And the last rewriting left as few participants as possible on the right side (a pair), with all the rest on the left side – this is 3:1. Not surprisingly, the first line demonstrates the best throughput and the worst latency, and the last line the opposite: the worst throughput and the best latency. You can see that the second line in this table is dominated by the third line both in the sense of throughput and latency, so the remaining three results actually create a set of Pareto-optimal solutions. The second table shows values of the g function based on the values assigned to α in the goal function. We can see that, following the previous observation regarding the second line, it is never chosen as an optimal solution.
#16: We believe that the potential of this work leaves much for further exploratory and practical activities. Among our future plans are …