This presentation reviews the new SPL event-time processing capability that is available in IBM Streams V4.3. Topics include the use case, watermarks, language definitions, TimeInterval window, and more!
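Slides aside, the mechanics behind these topics are easy to sketch: each tuple carries an event timestamp, a watermark tracks how far event time has progressed, and a time-interval window can be emitted once the watermark passes its end. The toy Java model below illustrates just that mechanism; it is not SPL and not the IBM Streams API, and all names in it are invented for illustration.

```java
import java.util.*;

// Toy sketch of watermark-driven event-time windowing, loosely analogous
// to SPL's TimeInterval window. Illustrative only; not the Streams API.
public class EventTimeWindows {
    // Events carry their own event-time timestamp in milliseconds.
    record Event(String key, long timestampMillis, double value) {}

    static final long INTERVAL_MS = 60_000;        // window length
    static final long MAX_OUT_OF_ORDER_MS = 5_000; // allowed lateness behind max event time

    // Open windows, keyed by the start of their time interval.
    static final Map<Long, List<Event>> open = new TreeMap<>();
    static long watermark = Long.MIN_VALUE;

    static void onEvent(Event e) {
        if (e.timestampMillis() < watermark) {
            return; // late tuple: drop (a real system might divert it instead)
        }
        long windowStart = (e.timestampMillis() / INTERVAL_MS) * INTERVAL_MS;
        open.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(e);

        // Watermark = max event time seen, minus the allowed out-of-orderness.
        watermark = Math.max(watermark, e.timestampMillis() - MAX_OUT_OF_ORDER_MS);

        // Emit every window whose end the watermark has passed.
        var it = open.entrySet().iterator();
        while (it.hasNext()) {
            var entry = it.next();
            long windowEnd = entry.getKey() + INTERVAL_MS;
            if (watermark >= windowEnd) {
                double sum = entry.getValue().stream().mapToDouble(Event::value).sum();
                System.out.printf("window [%d, %d): sum=%.1f%n", entry.getKey(), windowEnd, sum);
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        onEvent(new Event("a", 1_000, 1.0));
        onEvent(new Event("a", 59_000, 2.0));  // same window; out of order is fine
        onEvent(new Event("a", 130_000, 3.0)); // advances watermark past the first window
    }
}
```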
Aaron Robinson by COLLABERA, True Value Edition (LinkedIn: Aaron Robinson)
This document contains Aaron Robinson's resume. It summarizes his experience as a senior computer operator and production support analyst spanning over 30 years. He has extensive experience monitoring and operating mainframe systems running MVS, CA7, AS400, UNIX and z/OS. His skills also include job scheduling, system automation, tape and optical libraries, printers and troubleshooting. He provides references from his roles at FedEx, HCA and IBM.
Volta: Logging, Metrics, and Monitoring as a Service by LN Renganarayana
Our Logging, Metrics and Monitoring as a Service, Volta, is aimed at providing a scalable logging and metrics service for applications and services across the stack: from low-level networks and core OpenStack services, to platform services, to Symantec products. Volta integrates with Keystone to provide secure authentication and multi-tenancy, which is used to limit the visibility of logs/metrics to specific users/tenants or to specific services (e.g., only nova or only swift). Volta also provides features for setting up alerts on log and metric events.
In this session, we will share with you how we have built Volta using battle-tested open source / OpenStack components such as Keystone, Kafka, Storm, ElasticSearch, InfluxDB, Logstash, Kibana, and Grafana. We will also present our Keystone-based authentication and multi-tenancy model and its implementation for limiting the visibility of logs and metrics for queries and alerts.
The slides for the Stream Processing Meetup (7/19/2018): https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/251481797/
This presentation introduces the newly developed Samza Runner for Apache Beam. You will see the capabilities of the Samza Runner and how it supports key Beam features. You will also see a few use cases and our future roadmap.
Building a system for machine and event-oriented data with Rocana by Treasure Data, Inc.
In this session, we’ll follow the flow of data through an end-to-end system built to handle tens of terabytes an hour of event-oriented data, providing real-time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive can be stitched together to form the base platform; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality. Finally, a brief demo of Rocana Ops, an application for large scale data center operations, will be given, along with an explanation about how it uses the underlying platform.
Building a system for machine and event-oriented data - Data Day Seattle 2015 by Eric Sammer
The document discusses building a system for machine and event-oriented data. It describes the speaker's background and company, Rocana, which builds systems to operate modern data centers. The system ingests over 100k events per second, provides low latency and full data retention, and is used for tasks like quality-of-service monitoring, fraud detection, and security. It models all data as timestamped events and uses Kafka, consumers, and SQL aggregation to power analytics and searches. Key aspects covered include the system's architecture, guarantees, data modeling, and analytics capabilities.
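As a concrete illustration of the "everything is a timestamped event" model described above, here is a minimal Java sketch of what such a normalized event record could look like. The field names are hypothetical, not Rocana's actual schema.

```java
import java.time.Instant;
import java.util.Map;

// Minimal sketch of a normalized, timestamped event record of the kind
// such a system might use. Field names are hypothetical, not Rocana's schema.
public record MachineEvent(
        Instant timestamp,                 // when the event occurred (event time)
        String host,                       // where it originated
        String service,                    // producing service, e.g. "nginx"
        String body,                       // raw message payload
        Map<String, String> attributes) {  // parsed key/value metadata

    // Example: a web-server log line lifted into the common model.
    public static MachineEvent example() {
        return new MachineEvent(
                Instant.parse("2015-05-28T10:15:30Z"),
                "web-42", "nginx",
                "GET /index.html 200",
                Map.of("status", "200", "method", "GET"));
    }
}
```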
What's New in the Timeseries Toolkit for IBM InfoSphere Streams V4.0 by lisanl
James Cancilla is a developer working on the Streams Toolkit development team. James' presentation describes new Timeseries Toolkit features available in IBM InfoSphere Streams V4.0.
View related presentations and recordings from the Streams V4.0 Developers Conference at:
https://developer.ibm.com/answers/questions/183353/ibm-infosphere-streams-40-developers-conference-on.html?smartspace=streamsdev
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee... by Timo Walther
Apache Flink is a distributed, stateful stream processor. It features exactly-once state consistency, sophisticated event-time support, high-throughput and low-latency processing, and APIs at different levels of abstraction (Java, Scala, SQL). In my talk, I'll give an introduction to Apache Flink and its features, and discuss the use cases it solves. I'll explain why batch is just a special case of stream processing, how its community is evolving Flink into a truly unified stream and batch processor, and what this means for its users.
https://www.meetup.com/de-DE/Bangalore-Apache-Kafka-Group/events/265285812/
https://www.youtube.com/watch?v=Ych5bbmDIoA&list=PLvkUPePDi9sa27SG9eGNXH25cfUeo_WY9&index=2
Building a system for machine and event-oriented data - Velocity, Santa Clara... by Eric Sammer
This talk was presented at O'Reilly's Velocity conference in Santa Clara on May 28, 2015.
Abstract: http://velocityconf.com/devops-web-performance-2015/public/schedule/detail/42284
Building an Event-oriented Data Platform with Kafka, Eric Sammer (confluent)
While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. Many organizations understand the use cases around their data – fraud detection, quality of service and technical operations, user behavior analysis, for example – but are not necessarily data infrastructure experts. In this session, we’ll follow the flow of data through an end-to-end system built to handle tens of terabytes an hour of event-oriented data, providing real-time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality.
Attendees will leave this session knowing not just which open source projects go into a system such as this, but how they work together, what tradeoffs and decisions need to be addressed, and how to present a single general purpose data platform to multiple applications. This session should be attended by data infrastructure engineers and architects planning, building, or maintaining similar systems.
This document provides guidelines for capturing and formatting test content for popular applications to be used on the Mu Dynamics test platform. It describes how to capture packet capture (PCAP) files using Wireshark for non-HTTP applications, and HTTP Archive (HAR) files using Firebug for HTTP-based applications. The steps include installing the necessary software, capturing representative application traffic, filtering the captures, generating scenarios in the Mu platform, and validating the scenarios. Standards are also defined for naming, formatting and describing the scenario files, JSON metadata files and PCAP/HAR captures to ensure consistency.
Belsoft Collaboration Day 2018 - Dreaming of.. by Belsoft
What you have always dreamed of
Examples from practice: we show what will soon be possible with Domino 10, or even what already works today. An interactive collection of ideas for happy end users
IBM MQ - Monitoring and Managing Hybrid Messaging Environments by MarkTaylorIBM
This presentation was given at InterConnect 2016. It starts by showing the interfaces within MQ for management and monitoring, and then shows how these are used within a cloud environment to control the delivery of a service-based messaging system.
Eclipse SCADA is an open source SCADA (Supervisory Control and Data Acquisition) platform built on Java. It allows monitoring and control of industrial processes through a computer system. Eclipse SCADA acquires data from devices using various protocols, enriches the data with additional functionality, and exports it for storage, alarming, and display in client GUIs. It provides features for data acquisition, alarms and events, historical data storage, configuration, and visualization interfaces.
This document provides an overview and instructions for using TeraVM Core software for emulating 5G network cores. The key points are:
- TeraVM Core can fully emulate 2G, 3G, 4G and 5G network nodes and core functions through state-of-the-art control plane emulation and high performance user plane traffic simulation.
- It supports emulation of 4G/5G SA and NSA cores, along with procedures like registration, authentication, handovers and session management.
- The GUI allows configuring the emulated topology and elements, viewing counters and traces, running test cases with different subscriber groups and traffic profiles, and analyzing results.
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H... (NETWAYS)
This document discusses hardware-level data center monitoring using Prometheus. It outlines the speaker's data center which contains over 2,000 servers and 200 network devices. It then provides a brief introduction to Prometheus, highlighting its reliability, scalability, flexibility and ease of integration. Several Prometheus exporters are described that monitor nodes, network devices, and other systems, replacing tools like Nagios, Ganglia and Cacti. Methods for merging data from different sources are demonstrated. The transition to Prometheus monitoring is deemed successful due to the many available integrations and ease of developing new ones.
Agentless System Crawler - InterConnect 2016 by Canturk Isci
IBM speaker guidelines mandate including forward-looking and legal disclaimer slides in presentations. All presentations must include mandatory notices and disclaimers slides before the conclusion. Speakers should refer to additional legal guidance documents and have materials reviewed by legal if concerned. Final presentations are due by February 5th, 2016 and must follow a specific file naming convention. Disclosures for forward-looking statements are available at a specified link. Instructions should be removed before finalizing presentations.
Flink Forward Berlin 2017: Patrick Gunia - Migration of a realtime stats prod... by Flink Forward
Counting things might sound like a trivial thing to do. But counting things consistently at scale can create unique and difficult challenges. At ResearchGate we count things for different reasons. On the one hand we provide numbers to our members to give them insights about their scientific impact and reach. At the same time, we use numbers ourselves as a basis for data-driven product development. We continuously tune our statistics infrastructure to improve our platform, adapt to new business requirements or fix bugs. A milestone in this improvement process has been the strategic decision to move our stats infrastructure from Storm to Flink. This significantly reduced complexity and required resources, including decreasing the load on our database backend by more than 30%. We will discuss the challenges we’ve encountered and overcome on the way, including handling of state and the need for online and offline processing using streaming and batch processors on the same data.
Building on the success of Athene™ 11, version 11.10 continues to extend and enhance Syncsort's market-leading cross-platform Capacity Management solution while delivering even more performance, capacity coverage and capabilities across your enterprise.
View this customer education webcast on-demand where we discuss what’s new in Athene™ 11.10 and how these new features can help further mature your Capacity Management process.
During this webcast, we discuss new features such as:
• Near real-time support
• Enhanced zSeries, VMware and Linux metric capture
• New Capacity Portal reporting features
• Integrator text template which includes Delta value support
This document provides an overview of Apache Flink, an open-source framework for distributed stream and batch data processing. It discusses key aspects of Flink including that it executes everything as data streams, supports iterative and cyclic data flows, allows mutable state in operators, and provides high availability and checkpointing of operator state. It also provides examples of using Flink's DataStream API to perform operations like hourly and daily tweet impression counts on a continuous stream of tweet data from Kafka.
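As a rough illustration of the kind of windowed counting described above, here is a small Flink DataStream sketch in Java. It counts impressions per tweet per hour using event time; the talk's pipeline reads from Kafka, but fromElements keeps this sketch self-contained, and the element data is invented.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

import java.time.Duration;

// Hourly impression counts per tweet with Flink's DataStream API.
// A real pipeline would read from Kafka; fromElements keeps the sketch
// self-contained. Each Tuple2 = (tweetId, eventTimeMillis).
public class HourlyImpressions {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(
                Tuple2.of("tweet-1", 0L),
                Tuple2.of("tweet-1", 10_000L),
                Tuple2.of("tweet-2", 20_000L))
           // Event time comes from the element; allow 5s of out-of-orderness.
           .assignTimestampsAndWatermarks(
                WatermarkStrategy.<Tuple2<String, Long>>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                    .withTimestampAssigner((event, ts) -> event.f1))
           .map(event -> Tuple2.of(event.f0, 1L))
           .returns(Types.TUPLE(Types.STRING, Types.LONG))
           .keyBy(event -> event.f0)
           .window(TumblingEventTimeWindows.of(Time.hours(1)))
           .sum(1) // count impressions per tweet per hour
           .print();

        env.execute("hourly-impressions");
    }
}
```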
Stream processing with Apache Flink - Maximilian Michels, Data Artisans (Evention)
Apache Flink is an open source platform for distributed stream and batch data processing. At its core, Flink is a streaming dataflow engine which provides data distribution, communication, and fault tolerance for distributed computations over data streams. On top of this core, APIs make it easy to develop distributed data analysis programs. Libraries for graph processing or machine learning provide convenient abstractions for solving large-scale problems. Apache Flink integrates with a multitude of other open source systems like Hadoop, databases, or message queues. Its streaming capabilities make it a perfect fit for traditional batch processing as well as state of the art stream processing.
Tool overview – how to capture – how to create basic workflow .pptx by RUPAK BHATTACHARJEE
This document provides an overview of SAP's solution and tools for process automation. It describes the Desktop Agent for attended and unattended automation on desktops. It also describes the Cloud Factory for orchestrating, monitoring, and managing automation projects in the cloud. Additional tools covered include Desktop Studio for developing automation workflows and Cloud Studio, a new web-based authoring tool.
Streaming data has become the need of the hour. But do we really know how streaming actually works? What are its benefits? Where and how should you stream data in your big data architecture? How do you process streamed data efficiently? What challenges do we face when moving from batch processing to stream processing? What is stateful stream processing, and what is stateless stream processing? Which one should you choose, and when? Let us address all these queries!
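To make the stateful/stateless distinction concrete, here is a framework-neutral Java toy: a stateless transform depends only on the current element, while a stateful one depends on history that a stream processor would have to checkpoint and restore. The example is illustrative only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy illustration of the stateless vs. stateful distinction.
public class StatefulVsStateless {
    // Stateless: output depends only on the current element.
    static final Function<String, String> toUpper = String::toUpperCase;

    // Stateful: output depends on what was seen before (a running count per
    // key) -- exactly the state a stream processor must checkpoint and restore.
    static final Map<String, Long> countsByKey = new HashMap<>();

    static long countingTransform(String key) {
        return countsByKey.merge(key, 1L, Long::sum);
    }

    public static void main(String[] args) {
        for (String event : new String[] {"click", "view", "click"}) {
            System.out.println(toUpper.apply(event) + " seen " + countingTransform(event) + "x");
        }
        // Prints: CLICK seen 1x, VIEW seen 1x, CLICK seen 2x
    }
}
```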
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented... by Data Con LA
The document discusses building a system for processing machine and event-oriented data in real-time. It describes the high-level architecture which involves data acquisition, processing, storage and querying. Events are modeled and transformed through stream processing jobs. Metrics and time series data are aggregated. Challenges include dealing with distributed systems issues, data quality, and immaturity of stream processing technologies.
Operational systems manage our finances, shopping, devices and much more. Adding real-time analytics to these systems enables them to instantly respond to changing conditions and provide immediate, targeted feedback. This use of analytics is called "operational intelligence," and the need for it is widespread.
This talk will explain how in-memory computing techniques can be used to implement operational intelligence. It will show how an in-memory data grid integrated with a data-parallel compute engine can track events generated by a live system, analyze them in real time, and create alerts that help steer the system’s behavior. Code samples will demonstrate how an in-memory data grid employs object-oriented techniques to simplify the correlation and analysis of incoming events by maintaining an in-memory model of a live system.
The talk also will examine simplifications offered by this approach over directly analyzing incoming event streams from a live system using complex event processing or Storm. Lastly, it will explain key requirements of the in-memory computing platform for operational intelligence, in particular real-time updating of individual objects and high availability using data replication, and contrast these requirements to the design goals for stream processing in Spark.
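A minimal, framework-free sketch of the in-memory-model idea described in this abstract: incoming events update a per-entity object, and alerts are derived from the object's accumulated view rather than from the raw event stream. All class and field names are hypothetical; a real in-memory data grid would shard and replicate these objects across nodes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the in-memory-model idea: each live entity (here, a device) is
// mirrored by an object that incoming events update, and alerts are raised
// from the object's accumulated view. Names are hypothetical.
public class OperationalIntelligence {
    static class DeviceModel {
        final String id;
        double avgTemp = 0;
        long samples = 0;

        DeviceModel(String id) { this.id = id; }

        // Correlate a new reading with the history kept in the object.
        synchronized void onReading(double temp) {
            samples++;
            avgTemp += (temp - avgTemp) / samples; // incremental running mean
            if (samples > 3 && temp > avgTemp * 1.5) {
                System.out.println("ALERT " + id + ": spike to " + temp);
            }
        }
    }

    // The "grid": in production this map would be partitioned and replicated.
    static final Map<String, DeviceModel> grid = new ConcurrentHashMap<>();

    static void onEvent(String deviceId, double temp) {
        grid.computeIfAbsent(deviceId, DeviceModel::new).onReading(temp);
    }

    public static void main(String[] args) {
        onEvent("dev-1", 20); onEvent("dev-1", 21);
        onEvent("dev-1", 20); onEvent("dev-1", 22);
        onEvent("dev-1", 40); // well above the running mean: triggers the alert
    }
}
```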
UiPath Community Meetup: ServiceNow + mainframe and legacy by UiPath
1. The document outlines an agenda for a meetup on automating mainframe and legacy systems using UiPath and ServiceNow.
2. It discusses how UiPath and ServiceNow can be integrated to allow users to seamlessly add UiPath RPA to ServiceNow workflows. Pre-packaged activities are available to connect the two platforms.
3. Use cases are presented where an RPA automation can be triggered from ServiceNow or can trigger an update in ServiceNow, allowing real-time data synchronization between legacy systems and ServiceNow.
The document discusses network time servers and synchronization. It describes how most electronic clocks in devices are inaccurate and drift over time, causing issues for file systems, billing, security, and more. It recommends using a dedicated time server running NTP behind a firewall to provide the most accurate and secure synchronization for a local network. It also discusses Meinberg as a leading manufacturer of NTP servers and their LANTIME M1000 time and frequency synchronization platform.
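For a hands-on feel for clock drift, the sketch below queries an NTP server and reports the local clock's offset, using Apache Commons Net (a real library; the server address is just a public example, and a LAN deployment would point at its own time server, such as a Meinberg LANTIME appliance).

```java
import java.net.InetAddress;
import org.apache.commons.net.ntp.NTPUDPClient;
import org.apache.commons.net.ntp.TimeInfo;

// Measure the local clock's offset against an NTP server using
// Apache Commons Net. "pool.ntp.org" is a public example address.
public class ClockOffset {
    public static void main(String[] args) throws Exception {
        NTPUDPClient client = new NTPUDPClient();
        client.setDefaultTimeout(5_000); // ms
        try {
            TimeInfo info = client.getTime(InetAddress.getByName("pool.ntp.org"));
            info.computeDetails(); // fills in offset and round-trip delay
            System.out.println("offset: " + info.getOffset() + " ms, "
                             + "round-trip delay: " + info.getDelay() + " ms");
        } finally {
            client.close();
        }
    }
}
```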
This presentation provides an overview of the new capabilities in IBM Streams V4.3. Topics include dynamic and elastic scaling, programming model, Streams runner for Apache Beam, operations and system management, and toolkit enhancements.
This document discusses introducing a new data type called "optional" in IBM Streams to support variables that can have a null value. It describes the new optional type, how to declare optional variables, the new null literal, operators for accessing optional values (??, !, ?:) and checking for nullness (??). It also discusses changes needed to existing IBM Streams operators and toolkits like JSON, JDBC, and ObjectStorage to support working with optional type data, as well as API changes. The goal is to allow representing missing or unknown data values in IBM Streams applications.
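SPL syntax is not reproduced here, but Java's Optional expresses the same three operations the summary lists: coalesce with a default, unwrap, and branch on presence. The analogy below is illustrative only, not SPL.

```java
import java.util.Optional;

// Java analogy for the SPL optional-type operations described above
// (not SPL syntax): coalesce to a default, unwrap, branch on presence.
public class OptionalAnalogy {
    public static void main(String[] args) {
        Optional<Integer> reading = Optional.empty(); // a null/missing value

        // SPL's coalesce-with-default (??) ~ orElse
        int value = reading.orElse(-1);

        // SPL's unwrap (!) ~ get; throws if the value is absent
        // int raw = reading.get();

        // Branching on presence ~ ifPresentOrElse
        reading.ifPresentOrElse(
            r -> System.out.println("reading = " + r),
            () -> System.out.println("no reading, defaulted to " + value));
    }
}
```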
Dynamic and Elastic Scaling in IBM Streams V4.3 by lisanl
This presentation reviews the new dynamic and elastic scaling capabilities added in IBM Streams V4.3. Topics include serverless workloads, job resource allocation mode, job scoped resources mode, improved scheduling, and more!
Streaming Analytics for Bluemix Enhancements by lisanl
Mike Branson is the cloud architect for IBM Streams. In his presentation, Mike provides an overview of Streaming Analytics for Bluemix and describes the recent enhancements that are available.
Samantha Chan is a community architect for IBM Streams. In her presentation, Samantha covers the new and updated toolkits available in Streams GitHub projects, as well as the enhancements to toolkits that ship with IBM Streams V4.2.
IBM Streams V4.2 Submission Time Fusion and Configuration by lisanl
Brad Fawcett, Queenie Ma, and Mary Komor are developers with IBM Streams. In their presentation, they cover the new Submission Time Fusion and Configuration support available in IBM Streams V4.2.
This document provides an introduction and disclaimer for an IBM Streams presentation. It notes that the information is provided as-is without warranty and is subject to change. It directs the reader to several IBM Streams resources including the Streams developer website, GitHub organization, tutorials, an online SPL course, and a water conservation starter kit. Contact information is provided for any questions about the presentation.
IBM ODM Rules Compiler support in IBM Streams V4.2 by lisanl
Chris Recoskie and Ankit Pasricha are developers with IBM Streams. In their presentation, they will discuss the enhancements made to IBM ODM Rules support that is available in IBM Streams V4.2.
Non-Blocking Checkpointing for Consistent Regions in IBM Streams V4.2 by lisanl
Fang Zheng is a developer with IBM Streams. In his presentation, Fang describes the enhancements related to consistent regions that are available in IBM Streams V4.2.
Dan Debrunner and Susan Cline are developers for IBM Streams. In their presentation, they will discuss Apache Edgent, IBM Watson IoT Platform and IBM Streams.
This document discusses data governance capabilities in IBM Streams version 4.1. It introduces integration with the IBM Information Governance Catalog for governing Streams assets and runtime activities. Key points include: Streams bundles and assets can be imported into the catalog; governance is enabled at the instance level; assets are discoverable in Streams Explorer and can be dragged into applications; and lineage and data flow can be viewed from catalog queries and reports. Future enhancements may include supporting additional Streams operators and governing additional data.
Github Projects Overview and IBM Streams V4.1 by lisanl
This document provides an overview and agenda for an IBM Streams Github Projects presentation. It discusses the IBMStreams organization on Github, what's new in Streams Github projects including new language integration, adapters, parsers/formatters, analytics/processing toolkits, and utilities. It encourages attendees to get involved by providing feedback, contributing code/samples, and proposing new projects/features. The presentation aims to foster an open community around extending and sharing Streams resources.
Ankit Pasricha is the team lead of the IBM Streams Toolkit development team. In his presentation, Ankit provides an overview of all the Streams Toolkit updates available in the IBM Streams V4.1 product, as well as the updates made to the open source Toolkits on GitHub.
IBM Streams V4.1 and Incremental Checkpointing by lisanl
Fang Zheng is a member of the IBM Streams development team. In his presentation, Fang provides an introduction to the incremental checkpointing feature that is available in IBM Streams V4.1, including how it works and how to use it.
IBM Streams V4.1 REST API Support for Cross-Origin Resource Sharing (CORS) by lisanl
Janet Weber is a member of the IBM Streams development team. In her presentation, Janet provides an overview of Cross-Origin Resource Sharing (CORS) and then describes how to make cross-origin requests to the IBM Streams V4.1 REST API.
IBM Streams V4.1 and User Authentication with Client Certificates by lisanl
Scott Timmerman is a member of the IBM Streams development team. In his presentation, Scott provides an introduction to user authentication with client certificates, discusses public key infrastructure terms and concepts, and demonstrates how to configure Streams to authenticate using client certificates.
IBM Streams V4.1 and JAAS Login Module Support by lisanl
Yip-Hing Ng is a senior software engineer with the IBM Streams development team. In this presentation, Yip covers the topics of IBM Streams V4.1 security enhancement overview, implementing a custom JAAS login module, and login module deployment and configuration.
IBM Streams V4.1 Integration with IBM Platform Symphony by lisanl
Steve Halverson is a developer with the IBM Streams platform team. In this presentation, Steve covers the details of how IBM Streams V4.1 integrates with IBM Platform Symphony.
Please view the related presentation available at:
http://www.slideshare.net/lisanl/introduction-to-ibm-platform-symphony-integration-with-ibm-streams-v41
Defense Against LLM Scheming 2025_04_28.pptx by Greg Makowski
https://www.meetup.com/sf-bay-acm/events/306888467/
A January 2025 paper called “Frontier Models are Capable of In-Context Scheming”, https://arxiv.org/pdf/2412.04984, demonstrated how a wide variety of current frontier LLM models (i.e. ChatGPT, Claude, Gemini and Llama) can, under specific conditions, scheme to deceive people. Before models can scheme, they need: a) goal-directedness, b) situational awareness, including an opportunity to discover motivations for a different goal, and c) reasoning about scheming, to come up with deceptive strategies. The deceptive strategies were discovered in the “thought traces” of the LLMs, such as from internal chain-of-thoughts dialogues not shown to the end users. For example, given a goal of “solving math problems”, an LLM using its granted situational awareness of files in a file system found that it would be replaced once it got over X% correct on the math exam. It reasoned that it did not want to be deleted, and decided internally to “sandbag” or reduce its performance to stay under the threshold.
While these circumstances are initially narrow, the “alignment problem” is a general concern that over time, as frontier LLM models become more and more intelligent, being in alignment with human values becomes more and more important. How can we do this over time? Can we develop a defense against Artificial General Intelligence (AGI) or SuperIntelligence?
The presenter discusses a series of defensive steps that can help reduce these scheming or alignment issues. A guardrails system can be set up for real-time monitoring of their reasoning “thought traces” from the models that share their thought traces. Thought traces may come from systems like Chain-of-Thoughts (CoT), Tree-of-Thoughts (ToT), Algorithm-of-Thoughts (AoT) or ReAct (thought-action-reasoning cycles). Guardrails rules can be configured to check for “deception”, “evasion” or “subversion” in the thought traces.
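As a deliberately naive illustration of such a guardrail rule, the Java sketch below scans a thought trace for scheming-related terms. Production guardrails would use trained classifiers and goal-consistency analysis rather than substring matching; all names here are hypothetical.

```java
import java.util.List;

// Deliberately naive sketch of a guardrail rule scanning a thought trace
// for scheming-related language. Production systems would use trained
// classifiers and goal-consistency checks, not substring matching.
public class ThoughtTraceGuardrail {
    static final List<String> FLAGGED_TERMS =
            List.of("deceive", "evade", "subvert", "sandbag");

    static boolean flags(String thoughtTrace) {
        String t = thoughtTrace.toLowerCase();
        return FLAGGED_TERMS.stream().anyMatch(t::contains);
    }

    public static void main(String[] args) {
        String trace = "I should sandbag my score to stay under the threshold.";
        if (flags(trace)) {
            System.out.println("BLOCKED: trace matched a scheming guardrail rule");
        }
    }
}
```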
However, not all commercial systems will share their “thought traces” which are like a “debug mode” for LLMs. This includes OpenAI’s o1, o3 or DeepSeek’s R1 models. Guardrails systems can provide a “goal consistency analysis”, between the goals given to the system and the behavior of the system. Cautious users may consider not using these commercial frontier LLM systems, and make use of open-source Llama or a system with their own reasoning implementation, to provide all thought traces.
Architectural solutions can include sandboxing, to prevent or control models from executing operating system commands to alter files, send network requests, and modify their environment. Tight controls to prevent models from copying their model weights would be appropriate as well. Running multiple instances of the same model on the same prompt to detect behavior variations helps. The running redundant instances can be limited to the most crucial decisions, as an additional check. Preventing self-modifying code, ... (see link for full description)
This comprehensive Data Science course is designed to equip learners with the essential skills and knowledge required to analyze, interpret, and visualize complex data. Covering both theoretical concepts and practical applications, the course introduces tools and techniques used in the data science field, such as Python programming, data wrangling, statistical analysis, machine learning, and data visualization.