From cache to in-memory data grid: introduction to Hazelcast
Taras Matyashovsky
This presentation:
* covers the basics of caching and popular cache types
* explains the evolution from a simple cache to a distributed cache, and from a distributed cache to an IMDG
* does not describe the use of NoSQL solutions for caching
* is not intended as a product comparison or as a promotion of Hazelcast as the best solution
3. about me
core developer at hazelcast
holds a bsc. in computer engineering
started programming some time ago, then it turned into a career
lives in beautiful istanbul
interested in distributed systems
4. distributed computing
use of a bunch of computers to solve a computational problem
the problem is divided into multiple tasks, which are solved by one or more computers
computers communicate with each other by sending messages
5. in-memory data grids
middleware software
shared-nothing architecture
manages objects in RAM across distributed servers
ability to scale
provides fault tolerance
6. why use an imdg?
performance - ram is faster than disk
flexibility - rich set of data structures
operations - easy to scale and maintain
7. other imdg solutions
oracle coherence
ibm extremescale
vmware gemfire
gigaspaces
redhat infinispan
gridgain
terracotta
11. a company
hazelcast enterprise edition
management center
enterprise support
training / consulting
offices in istanbul (r&d), palo alto (hq) and london
12. an open-source project
leading open-source in-memory data grid
dead simple distributed programming
easy way to scale applications
simple api
built with ♥ in istanbul
13. use cases
scaling your application
sharing data across a cluster
partitioning data
sending/receiving messages
load balancing
session replication
parallel task processing on multiple machines
… (a couple of these are sketched in code below)
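To make the "scaling" and "sharing data across a cluster" use cases concrete, here is a minimal sketch. It assumes a Hazelcast 3.x-era jar (the API generation this deck describes) on the classpath; the class, map, and key names are invented for illustration. Members started with default configuration discover each other and form a cluster, and a map obtained from one member is readable from all of them.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.util.Map;

public class ClusterDemo {
    public static void main(String[] args) {
        // Start two members in one JVM; across machines they would
        // discover each other via multicast by default.
        HazelcastInstance node1 = Hazelcast.newHazelcastInstance();
        HazelcastInstance node2 = Hazelcast.newHazelcastInstance();

        // A distributed map: entries put through one member are
        // visible from any other member of the cluster.
        Map<String, String> sessions = node1.getMap("sessions");
        sessions.put("user-42", "logged-in");

        // Same data, read through the second member.
        System.out.println(node2.getMap("sessions").get("user-42"));

        Hazelcast.shutdownAll();
    }
}
```

Adding capacity is just starting more members: data is repartitioned onto the new node automatically, which is what "easy way to scale applications" means in practice.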
14. how does it differ ?
apache licensed open source
lightweight w/o any dependency
ease of use and more fun !
15. who uses ?
fact: every ~0.4 second a hazelcast node is started around the world
a lot of developers :)
17. what is ?
distributed impl. of Java Collections
dynamic clustering, backup and failover
transaction support (two-phase, XA)
distributed execution framework
map/reduce api
distributed queries
native Java, C#, C++ clients
(distributed queries and the execution framework are sketched below)
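A hedged sketch of two of the listed features, again assuming the 3.x-era API (SqlPredicate and IExecutorService are from that generation); map contents and the task are illustrative, not from the deck.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IExecutorService;
import com.hazelcast.core.IMap;
import com.hazelcast.query.SqlPredicate;

import java.io.Serializable;
import java.util.Collection;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;

public class FeaturesDemo {
    // A task for the distributed execution framework must be
    // serializable so it can be shipped to another member.
    static class ClockTask implements Callable<Long>, Serializable {
        public Long call() {
            return System.currentTimeMillis();
        }
    }

    public static void main(String[] args) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Distributed query: filter map values cluster-wide with a predicate;
        // "this" refers to the entry value itself.
        IMap<String, Integer> ages = hz.getMap("ages");
        ages.put("alice", 34);
        ages.put("bob", 27);
        Collection<Integer> over30 = ages.values(new SqlPredicate("this > 30"));
        System.out.println(over30); // [34]

        // Distributed executor: run the task on some member of the cluster.
        IExecutorService executor = hz.getExecutorService("default");
        Future<Long> remoteTime = executor.submit(new ClockTask());
        System.out.println(remoteTime.get());

        Hazelcast.shutdownAll();
    }
}
```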
33. service provider interface (SPI)
roll your own services
extend hazelcast based on your needs !
hierarchical lock service
priority queue
scheduled executor service
distributed actors
anything you can think of !
check out the SPI section of the hazelcast documentation (a skeleton is sketched below)
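For flavor, a rough skeleton of a custom SPI service, assuming the 3.x SPI interfaces (ManagedService and RemoteService). The service name is hypothetical, and the proxy and operation classes a working service needs are omitted; this is a sketch of the shape, not a complete implementation.

```java
import com.hazelcast.core.DistributedObject;
import com.hazelcast.spi.ManagedService;
import com.hazelcast.spi.NodeEngine;
import com.hazelcast.spi.RemoteService;

import java.util.Properties;

// Skeleton of a custom SPI service. A real service also provides a proxy
// (the user-facing distributed object) plus operations that members execute.
public class CounterService implements ManagedService, RemoteService {
    public static final String NAME = "hz:impl:counterService"; // hypothetical name

    private NodeEngine nodeEngine;

    @Override
    public void init(NodeEngine nodeEngine, Properties properties) {
        this.nodeEngine = nodeEngine; // called once when the member starts
    }

    @Override
    public void shutdown(boolean terminate) { }

    @Override
    public void reset() { }

    @Override
    public DistributedObject createDistributedObject(String objectName) {
        // Would return the proxy backing hz.getDistributedObject(NAME, objectName).
        throw new UnsupportedOperationException("proxy omitted in this sketch");
    }

    @Override
    public void destroyDistributedObject(String objectName) { }
}
```

The class would then be registered under the services section of hazelcast.xml so that each member instantiates it at startup and hands out its distributed objects on request.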