IBM Watson – IBM Streams
© 2018 IBM Corporation
IBM Streams V4.3
SPL Event-Time Processing
Victor Dogaru
IBM Streams Development
Please note
▪ IBM’s statements regarding its plans, directions, and intent are subject to change
or withdrawal without notice and at IBM’s sole discretion.
▪ Information regarding potential future products is intended to outline our general
product direction and it should not be relied on in making a purchasing decision.
▪ The information mentioned regarding potential future products is not a commitment, promise,
or legal obligation to deliver any material, code or functionality. Information about potential
future products may not be incorporated into any contract.
▪ The development, release, and timing of any future features or functionality described for our
products remains at our sole discretion.
▪ Performance is based on measurements and projections using standard IBM benchmarks in
a controlled environment. The actual throughput or performance that any user will experience
will vary depending upon many factors, including considerations such as the amount of
multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and
the workload processed. Therefore, no assurance can be given that an individual user will
achieve results similar to those stated here.
Overview
▪ Use Case: How does this help Streams developers and users?
▪ Out of order streams and late data
▪ SPL Watermarks
▪ SPL event-time language definitions
– @eventTime annotation
– TimeInterval window
▪ SPL TimeInterval window and window panes
▪ SPL event-time functions
▪ Support for Java and C++ operators
Use Case
Streams application for monitoring device data
▪ The user has to write an application which
– Ingests timestamped events
– Calculates metrics of interest every 20 minutes, for events which occurred from 10:00 to 11:00,
10:20 to 11:20, 10:40 to 11:40, etc.
– Updates calculations if late events arrive after metrics were calculated
– Discards events if they arrive later than 6 hours after metrics were calculated
Use Case
Solution
▪ Designate SPL attribute for the event timestamp
– Add attribute to stream schemas from the data ingest point downstream
▪ Insert Aggregate operator with an event-time window that groups tuples into intervals based on
their event timestamp
▪ Designate the operator which generates watermarks (usually at data ingest point)
– Watermarks provide a time base for event-time streams
– The Streams runtime and operator logic ensure that the tuple order relative to watermarks is preserved
– For each operator, if the inputs are not late, then the output should not be late with respect to the watermark value
▪ As watermarks reach the event-time window, they trigger:
– Calculation of aggregate metrics
– Updates in case of late events
– Eviction for data beyond the discarding age horizon
But What About Out-of-Order Streams and Late Data?
▪ Event time is the time that an event happened in the real world
– Event-time timestamp is carried with the tuple
▪ Processing (or system) time is the time measured by a machine that processes the event
– Processing time is the machine time when the tuple is being processed
▪ Events are streamed out of order because of variable delay prior to data ingestion and within
Streams
– Some event producers are not always connected (sensors, mobile devices, etc.)
– Some event producers locally buffer data and emit their events in bursts
– Events and Tuples travel on different network paths
– Backpressure and queuing delays from the stream operators
Watermarks
▪ Watermarks provide a measure of event time progress in a data stream
– For an input stream and a Watermark with value X, all tuples with event time less than X have been received
– For an output stream and a Watermark with value X, all tuples with event time less than X have been submitted
▪ A Watermark is only an estimate of completeness
– Events with timestamps earlier than X may arrive after the Watermark X. These are late data.
[Figure: a stream of tuples (event times 11, 12, 13) interleaved with watermarks WM 10, WM 14, and WM 15. Legend: Tuple, Late Tuple, WM = Watermark]
▪ The IBM Streams runtime broadcasts Watermarks downstream
– Ensures tuple order is maintained with respect to Watermarks
– Tuples derived from non-late inputs should not be submitted late
▪ A new “currentWatermark” operator custom metric displays the current watermark value in
milliseconds
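The definition of late data above can be illustrated with a small Python sketch. It is a model of the rule, not the Streams runtime: the item encoding (`"tuple"` / `"wm"` pairs), the function name `mark_late`, and the arrival order of the sample stream are assumptions made for this illustration.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical stream items: ("tuple", event_time) or ("wm", watermark_value).
Item = Tuple[str, int]


def mark_late(stream: Iterable[Item]) -> List[Tuple[int, bool]]:
    """Return (event_time, is_late) for every tuple, in arrival order."""
    current_wm: Optional[int] = None  # no watermark seen yet
    result = []
    for kind, value in stream:
        if kind == "wm":
            # Assumption: the watermark never moves backwards.
            if current_wm is None or value > current_wm:
                current_wm = value
        else:
            # A tuple is late if its event time is earlier than the
            # watermark that has already been received.
            is_late = current_wm is not None and value < current_wm
            result.append((value, is_late))
    return result


# Sample arrival order, chosen for illustration only.
stream = [("wm", 10), ("tuple", 11), ("tuple", 12), ("wm", 14),
          ("tuple", 12), ("tuple", 13), ("wm", 15)]
print(mark_late(stream))
# [(11, False), (12, False), (12, True), (13, True)]
```

The two tuples that arrive after WM 14 carry event times below 14, so they count as late even though the stream still delivers them.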
SPL Event-time Language Definitions
@eventTime annotation
– Attribute name, resolution
– Watermark generation
Event-time stream schemas contain the event-time attribute
TimeInterval window
– Calculates aggregates for defined time intervals
// Event-time source: 'et' carries the event time; watermarks trail the
// largest event time seen by 5 seconds
@eventTime(eventTimeAttribute=et, lag=5.0, minimumGap=0.075)
stream<timestamp et, ...> Events = TCPSource()
{ ... }
. . .
// Event-time graph: downstream operators keep 'et' in their output schema
stream<..., timestamp et, ...> B = MyOperator(A) {}
. . .
// Aggregate over event-time window: 1-hour intervals created every
// 20 minutes; late tuples are accepted for 6 hours after a pane completes
stream<timestamp et, ...> Out = Aggregate(In) {
    window In : timeInterval, intervalDuration(3600.0),
                creationPeriod(1200.0), discardAge(21600.0),
                partitioned;
    param partitionBy : a, b, c;
    output Out :
        timeStart = windowBegin(),
        timeEnd = windowEnd(),
        ...
}
@eventTime Annotation
@eventTime(eventTimeAttribute=et, resolution=Nanoseconds, lag=5.0, minimumGap=0.075)
▪ Indicates that the annotated operator and all the downstream operators which are connected
via event-time streams participate in an event-time graph
– Connectivity extends only downstream
– Event-time ends at a sink or at an operator which does not output the event-time attribute
▪ Annotation elements
– eventTimeAttribute : name of the tuple attribute which represents the event time of the tuple
– Supported types: timestamp, uint64, int64
– resolution : time units of the event-time attribute values -- Milliseconds, Microseconds, Nanoseconds
– lag : duration in seconds between the maximum event-time of submitted tuples and the value of the watermark
– minimumGap : minimum event-time duration in seconds between subsequent watermarks
▪ The operator's watermark is set to WM = max(event time of processed tuples) - lag
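A minimal Python sketch of how `lag` and `minimumGap` could interact, based only on the definitions above. The function name `generate_watermarks` is hypothetical, and two details are assumptions rather than documented behavior: the first watermark is emitted with the first tuple, and a new watermark is emitted as soon as it is at least `minimumGap` ahead of the previous one.

```python
from typing import Iterable, List, Optional


def generate_watermarks(event_times: Iterable[float], lag: float,
                        minimum_gap: float) -> List[float]:
    """Return the watermark values emitted for a sequence of event times (seconds)."""
    max_event_time: Optional[float] = None
    last_wm: Optional[float] = None
    emitted: List[float] = []
    for et in event_times:
        # Track the largest event time processed so far.
        if max_event_time is None or et > max_event_time:
            max_event_time = et
        candidate = max_event_time - lag  # WM = max(event time) - lag
        # Emit only when the candidate is at least minimum_gap ahead
        # of the previously emitted watermark.
        if last_wm is None or candidate - last_wm >= minimum_gap:
            emitted.append(candidate)
            last_wm = candidate
    return emitted


print(generate_watermarks([100.0, 100.5, 99.0, 102.0], lag=5.0, minimum_gap=1.0))
# [95.0, 97.0]
```

The out-of-order tuple at 99.0 does not move the watermark, and the tuple at 100.5 advances the candidate by only 0.5 seconds, which is below the gap used in this example.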
TimeInterval Window
window In : timeInterval, intervalDuration(3600.0), creationPeriod(1200.0), intervalOffset(1800.0), discardAge(21600.0)
▪ Window options
– timeInterval : the window kind -- tuples are placed into panes which correspond to equal intervals in the event-time domain
– intervalDuration : duration between the lower and upper interval endpoints
– creationPeriod : duration between the lower endpoint of consecutive intervals
– discardAge : duration between the point in time when a window pane becomes complete and the point in time
when the pane closes and does not accept late tuples. Panes are discarded after they close.
– intervalOffset : point in time value which coincides with an interval start time
▪ Window panes partition the event time domain into intervals of the form:
[N * creationPeriod + intervalOffset, N * creationPeriod + intervalDuration + intervalOffset)
▪ Value 0 represents the Unix epoch: 1970-01-01T00:00:00Z
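The interval formula above can be worked through in Python. This is a sketch of the arithmetic only; the function name `panes_for` is hypothetical, and the sample event time is expressed in seconds since midnight of an epoch-aligned day to keep the numbers readable.

```python
import math
from typing import List, Tuple


def panes_for(event_time: float, interval_duration: float,
              creation_period: float, interval_offset: float = 0.0
              ) -> List[Tuple[float, float]]:
    """Return every [start, end) pane interval that contains event_time."""
    # Largest N whose pane starts at or before event_time.
    n = math.floor((event_time - interval_offset) / creation_period)
    panes = []
    while True:
        start = n * creation_period + interval_offset
        end = start + interval_duration
        if end <= event_time:
            break  # this pane and all earlier ones end before the event
        panes.append((start, end))
        n -= 1
    return panes[::-1]  # oldest pane first


# 10:50 with 1-hour intervals created every 20 minutes.
print(panes_for(10 * 3600 + 50 * 60, 3600.0, 1200.0))
# [(36000.0, 39600.0), (37200.0, 40800.0), (38400.0, 42000.0)]
```

With intervalDuration(3600.0) and creationPeriod(1200.0), an event at 10:50 falls into three overlapping panes: 10:00 to 11:00, 10:20 to 11:20, and 10:40 to 11:40.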
TimeInterval Window Panes
▪ TimeInterval Window manages a collection of window panes
▪ Each pane stores tuples for a fixed event-time interval
▪ A pane triggers when the Watermark reaches the upper endpoint of its interval
▪ A pane closes and is discarded once it has been complete for the ‘discardAge’ duration
▪ System creates new panes as specified by the ‘creationPeriod’
Example
– When Tuple(13:55) is received: Tuple is assigned to Pane D
– On Watermark(14:00): Pane D is complete and triggers, Pane A closes and gets discarded
– When late Tuple(12:45) is received: Tuple is assigned to Pane C, Pane C triggers (on the next Watermark)
[Figure: timeInterval, intervalDuration(60.0), discardAge(180.0). An event-time axis from 10:00 to 14:00 is divided into panes A, B, C, and D. Arriving tuples 13:55 and 12:45 and the watermark WM 14:00 are shown.]
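The example above can be replayed with a toy Python model of the pane lifecycle. This is a sketch under stated assumptions, not the Streams windowing library: the class `PaneWindow` and the helper `hm` are hypothetical, panes are non-overlapping (no creationPeriod is given in the example), and the values 60 and 180 are treated as minutes so that they line up with the hour-wide panes in the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


def hm(hours: int, minutes: int) -> int:
    """Minutes since midnight."""
    return hours * 60 + minutes


@dataclass
class PaneWindow:
    interval_duration: int
    discard_age: int
    watermark: int = -1
    panes: Dict[int, List[int]] = field(default_factory=dict)  # pane start -> event times
    dirty: Set[int] = field(default_factory=set)  # panes holding unreported tuples
    log: List[Tuple[str, int]] = field(default_factory=list)

    def on_tuple(self, event_time: int) -> None:
        start = (event_time // self.interval_duration) * self.interval_duration
        end = start + self.interval_duration
        if end + self.discard_age <= self.watermark:
            self.log.append(("dropped", event_time))  # pane already closed
            return
        self.panes.setdefault(start, []).append(event_time)
        self.dirty.add(start)

    def on_watermark(self, wm: int) -> None:
        self.watermark = max(self.watermark, wm)
        for start in sorted(self.panes):
            end = start + self.interval_duration
            # Trigger complete panes that hold new or late tuples.
            if end <= self.watermark and start in self.dirty:
                self.dirty.discard(start)
                self.log.append(("triggered", start))
            # Close panes that have been complete for discard_age.
            if end + self.discard_age <= self.watermark:
                del self.panes[start]
                self.log.append(("closed", start))


w = PaneWindow(interval_duration=60, discard_age=180)
for t in (hm(10, 30), hm(11, 30), hm(12, 30)):  # one tuple each in A, B, C
    w.on_tuple(t)
w.on_watermark(hm(13, 0))   # A, B, C trigger
w.on_tuple(hm(13, 55))      # assigned to pane D
w.on_watermark(hm(14, 0))   # A closes, D triggers
w.on_tuple(hm(12, 45))      # late tuple, assigned to pane C
w.on_watermark(hm(14, 5))   # C triggers again
w.on_tuple(hm(10, 15))      # pane A is closed, tuple is dropped
print(w.log)
```

In the printed log, pane starts are minutes since midnight: 600 is pane A (10:00), 720 is pane C (12:00), and 780 is pane D (13:00).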
SPL Event-time Functions
Window intervals
timestamp windowBegin();
timestamp windowEnd();
Event-time transformations
<tuple T> timestamp getEventTime(T t);
timestamp toTimestamp(uint64 ticks, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
timestamp toTimestamp(int64 ticks, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
int64 int64TicksFromTimestamp(timestamp ts, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
uint64 uint64TicksFromTimestamp(timestamp ts, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
Window pane status
public uint64 paneIndex();
Sys.PaneTiming paneTiming();
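The tick conversions listed above can be modeled in Python to show what the resolution argument does. This is an illustration of the arithmetic, not the SPL implementation: a timestamp is modeled as a (seconds, nanoseconds) pair, the function names are hypothetical Python counterparts, and truncation of sub-tick precision is an assumption.

```python
from typing import Tuple

# Ticks per second for each resolution named in the signatures.
TICKS_PER_SECOND = {
    "Milliseconds": 10**3,
    "Microseconds": 10**6,
    "Nanoseconds": 10**9,
}


def to_timestamp(ticks: int, resolution: str) -> Tuple[int, int]:
    """Convert ticks since the epoch to a (seconds, nanoseconds) pair."""
    per_second = TICKS_PER_SECOND[resolution]
    seconds, remainder = divmod(ticks, per_second)
    nanoseconds = remainder * (10**9 // per_second)
    return seconds, nanoseconds


def ticks_from_timestamp(ts: Tuple[int, int], resolution: str) -> int:
    """Convert (seconds, nanoseconds) to ticks, truncating finer precision."""
    per_second = TICKS_PER_SECOND[resolution]
    seconds, nanoseconds = ts
    return seconds * per_second + nanoseconds // (10**9 // per_second)


ts = to_timestamp(1500, "Milliseconds")
print(ts)                                        # (1, 500000000)
print(ticks_from_timestamp(ts, "Microseconds"))  # 1500000
```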
Support for primitive Java and C++ Operators
▪ New C++ windowing library classes for TimeInterval window
▪ Java and C++ primitive operators can explicitly set the operator’s Watermark value, or let the
system set it for them
Thank You
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
Flip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptxFlip flop presenation-Presented By Mubahir khan.pptx
Flip flop presenation-Presented By Mubahir khan.pptx
mubashirkhan45461
 
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptxPerencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
Perencanaan Pengendalian-Proyek-Konstruksi-MS-PROJECT.pptx
PareaRusan
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjksPpt. Nikhil.pptxnshwuudgcudisisshvehsjks
Ppt. Nikhil.pptxnshwuudgcudisisshvehsjks
panchariyasahil
 
Data Science Courses in India iim skills
Data Science Courses in India iim skillsData Science Courses in India iim skills
Data Science Courses in India iim skills
dharnathakur29
 
Ad

SPL Event-Time Processing in IBM Streams V4.3

  • 1. IBM Watson – IBM Streams © 2018 IBM Corporation
    IBM Streams V4.3
    SPL Event-Time Processing
    Victor Dogaru
    IBM Streams Development
  • 2. IBM Watson – IBM Streams IBM Confidential © 2018 IBM Corporation
    Please note
    ▪ IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice and at IBM’s sole discretion.
    ▪ Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.
    ▪ The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract.
    ▪ The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
    ▪ Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
  • 3. Overview
    ▪ Use Case: How does this help Streams developers and users?
    ▪ Out of order streams and late data
    ▪ SPL Watermarks
    ▪ SPL event-time language definitions
      – @eventTime annotation
      – TimeInterval window
    ▪ SPL TimeInterval window and window panes
    ▪ SPL event-time functions
    ▪ Support for Java and C++ operators
  • 4. Use Case
    Streams application for monitoring device data
    ▪ The user has to write an application which
      – Ingests timestamped events
      – Calculates metrics of interest every 20 minutes, for events which occurred from 10:00 to 11:00, 10:20 to 11:20, 10:40 to 11:40, etc.
      – Updates calculations if late events arrive after metrics were calculated
      – Discards events if they arrive later than 6 hours after metrics were calculated
  • 5. Use Case Solution
    ▪ Designate an SPL attribute for the event timestamp
      – Add the attribute to stream schemas from the data ingest point downstream
    ▪ Insert an Aggregate operator with an event-time window that groups tuples into intervals based on their event timestamp
    ▪ Designate the operator which generates watermarks (usually at the data ingest point)
      – Watermarks provide a time base for event-time streams
      – The Streams runtime and operator logic ensure that the tuple order relative to watermarks is preserved
      – For each operator, if inputs are not late then output should not be late with respect to the watermark value
    ▪ As watermarks reach the event-time window, they trigger:
      – Calculation of aggregate metrics
      – Updates in case of late events
      – Eviction for data beyond the discarding age horizon
  • 6. But How About Out-of-Order Streams and Late Data?
    ▪ Event time is the time that an event happened in the real world
      – Event-time timestamp is carried with the tuple
    ▪ Processing (or system) time is the time measured by a machine that processes the event
      – Processing time is the machine time when the tuple is being processed
    ▪ Events are streamed out of order because of variable delay prior to data ingestion and within Streams
      – Some event producers are not always connected (sensors, mobile devices, etc.)
      – Some event producers locally buffer data and emit their events in bursts
      – Events and Tuples travel on different network paths
      – Backpressure and queuing delays from the stream operators
  • 7. Watermarks
    ▪ Watermarks provide a measure of event time progress in a data stream
      – For an input stream and a Watermark with value X, all tuples with event time less than X have been received
      – For an output stream and a Watermark with value X, all tuples with event time less than X have been submitted
    ▪ A Watermark is only an estimate of completeness
      – Events with timestamps earlier than X may arrive after the Watermark X. These are late data.
    [Figure: a stream of tuples interleaved with Watermarks; legend: Tuple, Late Tuple, WM = Watermark]
    ▪ The IBM Streams runtime broadcasts Watermarks downstream
      – Ensures tuple order is maintained with respect to Watermarks
      – Tuples derived from non-late inputs should not be submitted late
    ▪ A new “currentWatermark” operator custom metric displays the current watermark value in milliseconds
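The definition of late data on the Watermarks slide can be sketched as a small standalone Python model. This is not the Streams runtime or an SPL API: the `("tuple", t)` / `("wm", x)` encoding, the function name `find_late_tuples`, and the sample stream are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: ("tuple", event_time) or ("wm", watermark_value).
Item = Tuple[str, int]


def find_late_tuples(stream: List[Item]) -> List[int]:
    """Return event times of tuples that arrive after a Watermark X
    even though their event time is less than X."""
    current_wm: Optional[int] = None
    late: List[int] = []
    for kind, value in stream:
        if kind == "wm":
            # In this model the watermark only moves forward.
            if current_wm is None or value > current_wm:
                current_wm = value
        elif current_wm is not None and value < current_wm:
            late.append(value)
    return late


# Made-up sample stream: the tuple with event time 10 arrives after WM 14.
stream = [("tuple", 12), ("tuple", 13), ("wm", 12), ("tuple", 14),
          ("wm", 14), ("tuple", 10), ("tuple", 15)]
print(find_late_tuples(stream))
```

A tuple whose event time equals the current watermark is treated as on time here, since the slide only speaks of event times less than X.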
  • 8. SPL Event-time Language Definitions
    @eventTime annotation
      – Attribute name, resolution
      – Watermark generation
    Event-time stream schemas contain the event-time attribute
    TimeInterval window
      – Calculates aggregates for defined time intervals

        // Event-time source
        @eventTime(eventTimeAttribute=et, lag=5.0, minimumGap=0.075)
        stream<timestamp et, ...> Events = TCPSource() { ... }
        . . .
        // Event-time graph
        stream<..., timestamp et, ...> B = MyOperator(A) {}
        . . .
        // Aggregate over event-time window
        stream<timestamp et, ...> Out = Aggregate(In) {
          window In : timeInterval, intervalDuration(3600.0), creationPeriod(1200.0),
                      discardAge(21600.0), partitioned;
          param partitionBy : a, b, c;
          output Out : timeStart = windowBegin(), timeEnd = windowEnd(), ...
        }
  • 9. @eventTime Annotation

        @eventTime(eventTimeAttribute=et, resolution=Nanoseconds, lag=5.0, minimumGap=0.075)

    ▪ Indicates that the annotated operator and all the downstream operators which are connected via event-time streams participate in an event-time graph
      – Connectivity extends only downstream
      – Event-time ends at a sink or at an operator which does not output the event-time attribute
    ▪ Annotation elements
      – eventTimeAttribute : name of the tuple attribute which represents the event time of the tuple
        – Supported types: timestamp, uint64, int64
      – resolution : time units of the event-time attribute values: Milliseconds, Microseconds, Nanoseconds
      – lag : duration in seconds between the maximum event-time of submitted tuples and the value of the watermark
      – minimumGap : minimum event-time duration in seconds between subsequent watermarks
    ▪ The operator's watermark is set to WM = max(event-time of processed tuples) - lag
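The watermark rule WM = max(event-time of processed tuples) - lag, combined with minimumGap, can be modeled in a few lines of Python. This is a sketch under assumptions, not the Streams implementation: the class name `WatermarkGenerator`, its method, and the exact policy of emitting a watermark only once it is at least minimumGap ahead of the previous one are illustrative guesses based on the element descriptions in the slide.

```python
from typing import Optional


class WatermarkGenerator:
    """Toy model of lag / minimumGap; all times are in seconds."""

    def __init__(self, lag: float, minimum_gap: float) -> None:
        self.lag = lag
        self.minimum_gap = minimum_gap
        self.max_event_time: Optional[float] = None
        self.last_emitted: Optional[float] = None

    def on_tuple(self, event_time: float) -> Optional[float]:
        """Process one tuple; return a new watermark value, or None."""
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        candidate = self.max_event_time - self.lag
        # Assumed policy: suppress watermarks closer than minimum_gap
        # (in event time) to the previously emitted one.
        if (self.last_emitted is None
                or candidate - self.last_emitted >= self.minimum_gap):
            self.last_emitted = candidate
            return candidate
        return None


gen = WatermarkGenerator(lag=5.0, minimum_gap=0.075)
emitted = [gen.on_tuple(t) for t in (100.0, 100.0625, 100.125, 99.0)]
print(emitted)  # [95.0, None, 95.125, None]
```

The out-of-order tuple at 99.0 does not move the watermark backwards, because only the maximum event time seen so far enters the formula.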
  • 10. TimeInterval Window

        window In : timeInterval, intervalDuration(3600.0), creationPeriod(1200.0),
                    intervalOffset(1800.0), discardAge(21600.0)

    ▪ Window options
      – timeInterval : the window kind; tuples are placed into panes which correspond to equal intervals in the event-time domain
      – intervalDuration : duration between the lower and upper interval endpoints
      – creationPeriod : duration between the lower endpoint of consecutive intervals
      – discardAge : duration between the point in time when a window pane becomes complete and the point in time when the pane closes and does not accept late tuples. Panes are discarded after they close.
      – intervalOffset : point in time value which coincides with an interval start time
    ▪ Window panes partition the event time domain into intervals of the form:
        [N * creationPeriod + intervalOffset, N * creationPeriod + intervalDuration + intervalOffset)
    ▪ Value 0 represents the Unix epoch: 1970-01-01T00:00:00Z UTC
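The interval formula above determines which panes a tuple belongs to. The following Python sketch solves it for N, using integer seconds to avoid rounding issues. The function name `panes_for` and the sample event time (10:45 on the epoch day) are assumptions for illustration; the parameter values come from the slides.

```python
from typing import List, Tuple


def panes_for(event_time: int, duration: int, period: int,
              offset: int = 0) -> List[Tuple[int, int]]:
    """Return every half-open interval [start, end) of the form
    [N*period + offset, N*period + offset + duration) containing event_time."""
    # Largest N whose lower endpoint is <= event_time.
    n_max = (event_time - offset) // period
    # Smallest N whose upper endpoint is still > event_time.
    n_min = (event_time - offset - duration) // period + 1
    return [(n * period + offset, n * period + offset + duration)
            for n in range(n_min, n_max + 1)]


def hms(h: int, m: int) -> int:
    """Seconds since the epoch for a time of day on 1970-01-01."""
    return h * 3600 + m * 60


# Use case from slide 4: one-hour intervals created every 20 minutes.
for start, end in panes_for(hms(10, 45), duration=3600, period=1200):
    print(start, end)
```

With intervalDuration 3600 and creationPeriod 1200, an event at 10:45 falls into three overlapping panes: 10:00 to 11:00, 10:20 to 11:20, and 10:40 to 11:40, matching the intervals listed in the use case.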
  • 11. TimeInterval Window Panes
    ▪ TimeInterval Window manages a collection of window panes
    ▪ Each pane stores tuples for a fixed event-time interval
    ▪ Panes trigger when Watermark reaches the top of the interval
    ▪ Panes close and get discarded when they get older than the ‘discardAge’
    ▪ System creates new panes as specified by the ‘creationPeriod’
    Example
      – When Tuple(13:55) is received: Tuple is assigned to Pane D
      – On Watermark(14:00): Pane D is complete and triggers, Pane A closes and gets discarded
      – When late Tuple(12:45) is received: Tuple is assigned to Pane C, Pane C triggers (on the next Watermark)
    [Figure: window configured as timeInterval, intervalDuration(60.0), discardAge(180.0); panes A, B, C, D cover the consecutive intervals starting at 10:00, 11:00, 12:00 and 13:00; arriving tuples 13:55 and 12:45; Watermark (WM) 14:00]
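The pane example can be replayed with a toy Python model. It is a simplification under stated assumptions, not the Streams windowing library: the names `Pane`, `PaneModel` and `hm` are hypothetical, times are counted in minutes to match the figure, new pane creation is not modeled, and the final Watermark at 14:05 is invented to show the late tuple's pane triggering again.

```python
from dataclasses import dataclass
from typing import Dict, List


def hm(text: str) -> int:
    """Convert "HH:MM" to minutes since midnight."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


@dataclass
class Pane:
    name: str
    start: int           # inclusive lower endpoint, in minutes
    end: int             # exclusive upper endpoint, in minutes
    dirty: bool = False  # True when tuples arrived since the last trigger


class PaneModel:
    def __init__(self, panes: List[Pane], discard_age: int) -> None:
        self.panes: Dict[str, Pane] = {p.name: p for p in panes}
        self.discard_age = discard_age
        self.log: List[str] = []

    def on_tuple(self, event_time: int) -> None:
        for pane in self.panes.values():
            if pane.start <= event_time < pane.end:
                pane.dirty = True
                self.log.append(f"assign {pane.name}")
                return
        self.log.append("drop")  # no open pane covers this event time

    def on_watermark(self, wm: int) -> None:
        for name in sorted(self.panes):
            pane = self.panes[name]
            if wm >= pane.end + self.discard_age:
                # Older than discardAge: the pane closes and is discarded.
                del self.panes[name]
                self.log.append(f"close {name}")
            elif wm >= pane.end and pane.dirty:
                # Complete pane with unreported tuples: (re)trigger it.
                pane.dirty = False
                self.log.append(f"trigger {name}")


panes = [Pane(n, hm(s), hm(s) + 60) for n, s in
         [("A", "10:00"), ("B", "11:00"), ("C", "12:00"), ("D", "13:00")]]
model = PaneModel(panes, discard_age=180)
model.on_tuple(hm("13:55"))      # assigned to Pane D
model.on_watermark(hm("14:00"))  # Pane A closes, Pane D triggers
model.on_tuple(hm("12:45"))      # late tuple, assigned to Pane C
model.on_watermark(hm("14:05"))  # hypothetical next Watermark: Pane C triggers
print(model.log)
```

Closing a pane when the watermark is greater than or equal to the interval end plus discardAge is chosen so that Pane A closes exactly at Watermark(14:00), as in the slide's example.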
  • 12. SPL Event-time Functions
    Window intervals
        timestamp windowBegin();
        timestamp windowEnd();
    Event-time transformations
        <tuple T> timestamp getEventTime(T t);
        timestamp toTimestamp(uint64 ticks, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
        timestamp toTimestamp(int64 ticks, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
        int64 int64TicksFromTimestamp(timestamp ts, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
        uint64 uint64TicksFromTimestamp(timestamp ts, enum {Milliseconds, Microseconds, Nanoseconds} resolution);
    Window pane status
        public uint64 paneIndex();
        Sys.PaneTiming paneTiming();
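The tick conversions listed above translate between integer tick counts at a given resolution and timestamps. The Python sketch below models the arithmetic only. It is not the SPL implementation: representing a timestamp as a `(seconds, nanoseconds)` pair, the function names, and truncation toward the coarser resolution are all assumptions.

```python
from typing import Tuple

NANOS_PER_SECOND = 1_000_000_000
TICKS_PER_SECOND = {
    "Milliseconds": 1_000,
    "Microseconds": 1_000_000,
    "Nanoseconds": 1_000_000_000,
}

# Hypothetical stand-in for an SPL timestamp: (seconds, nanoseconds).
Timestamp = Tuple[int, int]


def to_timestamp(ticks: int, resolution: str) -> Timestamp:
    """Split a tick count into whole seconds and a nanosecond remainder."""
    per_second = TICKS_PER_SECOND[resolution]
    seconds, remainder = divmod(ticks, per_second)
    return seconds, remainder * (NANOS_PER_SECOND // per_second)


def ticks_from_timestamp(ts: Timestamp, resolution: str) -> int:
    """Inverse conversion; sub-tick nanoseconds are truncated (assumed)."""
    per_second = TICKS_PER_SECOND[resolution]
    seconds, nanos = ts
    return seconds * per_second + nanos // (NANOS_PER_SECOND // per_second)


ts = to_timestamp(1500, "Milliseconds")
print(ts)                                        # (1, 500000000)
print(ticks_from_timestamp(ts, "Microseconds"))  # 1500000
```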
  • 13. Support for primitive Java and C++ Operators
    ▪ New C++ windowing library classes for TimeInterval window
    ▪ Java and C++ primitive operators can explicitly set the operator’s Watermark value, or let the system set it for them
  • 14. Thank You