Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
We built an application based on the principles of CQRS and Event Sourcing using Cassandra and Spark. During the project we encountered a number of challenges and problems with Cassandra and the Spark Connector.
In this talk we want to outline a few of those problems and our actions to solve them. While some problems are specific to CQRS and Event Sourcing applications most of them are use case independent.
About the Speakers
Matthias Niehoff IT-Consultant, codecentric AG
works as an IT-Consultant at codecentric AG in Germany. His focus is on big data & streaming applications with Apache Cassandra & Apache Spark. Yet he does not lose track of other tools in the area of big data. Matthias shares his experiences on conferences, meetups and usergroups.
Stephan Kepser Senior IT Consultant and Data Architect, codecentric AG
Dr. Stephan Kepser is an expert on cloud computing and big data. He wrote a couple of journal articles and blog posts on subjects of both fields. His interests reach from legal questions to questions of architecture and design of cloud computing and big data systems to technical details of NoSQL databases.
A brief history of Instagram's adoption cycle of the open source distributed database Apache Cassandra, in addition to details about it's use case and implementation. This was presented at the San Francisco Cassandra Meetup at the Disqus HQ in August 2013.
Apache Cassandra operations have the reputation to be quite simple against single datacenter clusters and / or low volume clusters but they become way more complex against high latency multi-datacenter clusters: basic operations such as repair, compaction or hints delivery can have dramatic consequences even on a healthy cluster.
In this presentation, Julien will go through Cassandra operations in details: bootstrapping new nodes and / or datacenter, repair strategies, compaction strategies, GC tuning, OS tuning, large batch of data removal and Apache Cassandra upgrade strategy.
Julien will give you tips and techniques on how to anticipate issues inherent to multi-datacenter cluster: how and what to monitor, hardware and network considerations as well as data model and application level bad design / anti-patterns that can affect your multi-datacenter cluster performances.
Apache Cassandra operations have the reputation to be simple on single datacenter deployments and / or low volume clusters but they become way more complex on high latency multi-datacenter clusters with high volume and / or high throughout: basic Apache Cassandra operations such as repairs, compactions or hints delivery can have dramatic consequences even on a healthy high latency multi-datacenter cluster.
In this presentation, Julien will go through Apache Cassandra mutli-datacenter concepts first then show multi-datacenter operations essentials in details: bootstrapping new nodes and / or datacenter, repairs strategy, Java GC tuning, OS tuning, Apache Cassandra configuration and monitoring.
Based on his 3 years experience managing a multi-datacenter cluster against Apache Cassandra 2.0, 2.1, 2.2 and 3.0, Julien will give you tips on how to anticipate and prevent / mitigate issues related to basic Apache Cassandra operations with a multi-datacenter cluster.
About the Speaker
Julien Anguenot VP Software Engineering, iland Internet Solutions, Corp
Julien currently serves as iland's Vice President of Software Engineering. Prior to joining iland, Mr. Anguenot held tech leadership positions at several open source content management vendors and tech startups in Europe and in the U.S. Julien is a long time Open Source software advocate, contributor and speaker: Zope, ZODB, Nuxeo contributor, Zope and OpenStack foundations member, his talks includes Apache Con, Cassandra summit, OpenStack summit, The WWW Conference or still EuroPython.
Cassandra and Spark: Optimizing for Data LocalityRussell Spitzer
This document discusses how the Spark Cassandra Connector optimizes for data locality when performing analytics on Cassandra data using Spark. It does this by using the partition keys and token ranges to create Spark partitions that correspond to the data distribution across the Cassandra nodes, allowing work to be done locally to each data node without moving data across the network. This improves performance and avoids the costs of data shuffling.
Apache Cassandra operations have the reputation to be simple on single datacenter deployments and / or low volume clusters but they become way more complex on high latency multi-datacenter clusters with high volume and / or high throughout: basic Apache Cassandra operations such as repairs, compactions or hints delivery can have dramatic consequences even on a healthy high latency multi-datacenter cluster.
In this presentation, Julien will go through Apache Cassandra mutli-datacenter concepts first then show multi-datacenter operations essentials in details: bootstrapping new nodes and / or datacenter, repairs strategy, Java GC tuning, OS tuning, Apache Cassandra configuration and monitoring.
Based on his 3 years experience managing a multi-datacenter cluster against Apache Cassandra 2.0, 2.1, 2.2 and 3.0, Julien will give you tips on how to anticipate and prevent / mitigate issues related to basic Apache Cassandra operations with a multi-datacenter cluster.
Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...DataStax
With the addition of vnodes (Virtual Nodes), Cassandra users were able to gain a few benefits as a result of streaming when it came to bootstrapping and decommissioning nodes. On the flip side, having to route requests on larger clusters became a lot more intensive of a workload for all nodes that were then forced to act coordinator nodes. By setting up a tier of proxy nodes, we were able to have our cluster of 50 nodes perform with a 300% improvement on average in a mixed workload environment. This is an explanation of what we did, how we did it, and why it works.
About the Speaker
Eric Lubow CTO, SimpleReach
Eric Lubow is CTO of SimpleReach, where he builds highly-scalable distributed systems for processing analytics data. Eric is also a DataStax MVP for Cassandra, and co-author of Practical Cassandra. In his spare time, Eric is a skydiver, motorcycle rider, mixed martial artist, and dog dad.
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
Customizing JVM settings for the needs of an application can be a tricky business, especially when running externally developed software such as Cassandra. In this talk I will share our experiences and the procedure that we have used to test and validate changes with Java tuning. We'll explore with two recent experiences: changes and monitoring of G1 garbage collection, and moving buffer objects off the heap.
For the talk, I'll discuss our tuning process at Knewton. I will share some of the challenges that we faced while identifying what we expected to learn. I'll discuss how we isolated and minimized variables across tests, the importance of the duration of these tests, and how we try to separate correlation from causation. I will demonstrate how to use and interpret the results of the custom scripts that we were driven to develop to gain visibility into our G1GC processes; these scripts will be open sourced.
About the Speaker
Carlos Monroy Senior Software Engineer, Knewton
Carlos Monroy is a senior engineer on the database team at Knewton, an education company that created an adaptive learning platform. Carlos has been developing software professionally since 1998. His experience holding multiple roles on the software lifecycle provides him a wholistic approach. Having used over a half dozen relational database engines, he has recently come over to the NoSQL side, first working with HBase and for the last three years Cassandra.
C* Summit 2013: Cassandra at Instagram by Rick BransonDataStax Academy
Speaker: Rick Branson, Infrastructure Engineer at Instagram
Cassandra is a critical part of Instagram's large scale site infrastructure that supports more than 100 million active users. This talk is a practical deep dive into data models, systems architecture, and challenges encountered during the implementation process.
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...DataStax
Traditionally, machines were statically partitioned across the different services at Uber. In an effort to increase the machine utilization, Uber has recently started transitioning most of its services, including the storage services, to run on top of Mesos. This presentation will describe the initial experience building and operating a framework for running Cassandra on top of Mesos running across multiple datacenters at Uber. This framework automates several Cassandra operations such as node repairs, addition of new nodes and backup/restore. It improves efficiency by co-locating CPU-intensive services as well as multiple Cassandra nodes on the same Mesos agent. It handles failure and restart of Mesos agents by using persistent volumes and dynamic reservations. This talk includes statistics about the number of Cassandra clusters in production, time taken to start a new cluster, add a new node, detect a node failure; and the observed Cassandra query throughput and latency.
About the Speaker
Abhishek Verma Software Engineer, Uber
Dr. Abhishek Verma is currently working on running Cassandra on top of Mesos at Uber. Prior to this, he worked on BorgMaster at Google and was the first author of the Borg paper published in Eurosys 2015. He received an MS in 2010 and a PhD in 2012 in Computer Science from the University of Illinois at Urbana-Champaign, during which he authored more than 20 publications in conferences, journals and books and presented tens of talks.
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
Making sure your Data Model will work on the production cluster after 6 months as well as it does on your laptop is an important skill. It's one that we use every day with our clients at The Last Pickle, and one that relies on tools like the cassandra-stress. Knowing how the data model will perform under stress once it has been loaded with data can prevent expensive re-writes late in the project.
In this talk Christopher Batey, Consultant at The Last Pickle, will shed some light on how to use the cassandra-stress tool to test your own schema, graph the results and even how to extend the tool for your own use cases. While this may be called premature optimisation for a RDBS, a successful Cassandra project depends on it's data model.
About the Speaker
Christopher Batey Consultant / Software Engineer, The Last Pickle
Christopher (@chbatey) is a part time consultant at The Last Pickle where he works with clients to help them succeed with Apache Cassandra as well as a freelance software engineer working in London. Likes: Scala, Haskell, Java, the JVM, Akka, distributed databases, XP, TDD, Pairing. Hates: Untested software, code ownership. You can checkout his blog at: http://www.batey.info
Introduction to Cassandra: Replication and ConsistencyBenjamin Black
A short introduction to replication and consistency in the Cassandra distributed database. Delivered April 28th, 2010 at the Seattle Scalability Meetup.
Cassandra is the dominant data store used at Netflix and it's health is critical to many of its services. In this talk we will share details of the recent redesign of our health monitoring system and how we leveraged a reactive stream processing system to give us a real-time view our entire fleet while dramatically improving accuracy and reducing false alarms in our alerting.
About the Speaker
Jason Cacciatore Senior Software Engineer, Netflix
Jason Cacciatore is a Senior Software Engineer at Netflix, where he's been working for the past several years. He's interested in stateful distributed systems and has a diverse background in technology. In his spare time he enjoys spending time with his wife and two sons, reading non-fiction, and watching Netflix documentaries.
Brisk - Truly peer-to-peer hadoop.
Brisk is an open-source Hadoop & Hive distribution that uses Apache Cassandra for its core services and storage. Brisk makes it possible to run Hadoop MapReduce on top of CassandraFS, an HDFS-compatible storage layer. By replacing HDFS with CassandraFS, users leverage MapReduce jobs on Cassandra’s peer-to-peer, fault-tolerant and scalable architecture.
With CassandraFS all nodes are peers. Data files can be loaded through any node in the cluster and any node can serve as the JobTracker for MapReduce jobs. Hive MetaStore is stored & accessed as just another column family (table) on the distributed data store. Brisk makes Hadoop truly peer-to-peer.
We demonstrate visualisation & monitoring of Brisk using OpsCenter. The operational simplicity of cassandra’s multi-datacenter & multi-region aware replication makes Brisk well-suited for a rich set of Applications and usecases. And by being able to store and isolate hdfs & online data within the same data cluster, Brisk makes analytics possible without ETL!
LA Scalability Talk, Mahalo
May 31.2011
Advanced Apache Cassandra Operations with JMXzznate
Nodetool is a command line interface for managing a Cassandra node. It provides commands for node administration, cluster inspection, table operations and more. The nodetool info command displays node-specific information such as status, load, memory usage and cache details. The nodetool compactionstats command shows compaction status including active tasks and progress. The nodetool tablestats command displays statistics for a specific table including read/write counts, space usage, cache usage and latency.
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Spark Summit
This document discusses how the Spark Cassandra Connector optimizes for data locality when performing analytics on Cassandra data using Spark. It does this by using the partition keys and token ranges to create Spark partitions that correspond to the data distribution across the Cassandra nodes, allowing work to be done locally to each data node without moving data across the network. This improves performance and avoids the costs of data shuffling.
This document discusses bulk loading and unloading data into and from Cassandra. It describes using CQL INSERT statements via Java drivers or CQLSH COPY FROM for loading, as well as using SSTable files via sstableloader or custom code. For unloading, it recommends using parallel CQL SELECT queries by splitting the token range across multiple connections. Testing showed Java asynchronous INSERTs to be the fastest loading method in most cases, while sstableloader requires all nodes be online. Batching INSERTs can improve throughput but increases latency.
Cassandra Community Webinar: Apache Cassandra InternalsDataStax
Apache Cassandra solves many interesting problems to provide a scalable, distributed, fault tolerant database. Cluster wide operations track node membership, direct requests and implement consistency guarantees. At the node level, the Log Structured storage engine provides high performance reads and writes. All of this is implemented in a Java code base that has greatly matured over the past few years.
In this webinar Aaron Morton will step through read and write requests, automatic processes and manual maintenance tasks. He will also discuss the general approach to solving the problem and drill down to the code responsible for implementation.
Speaker: Aaron Morton, Apache Cassandra Committer
Aaron Morton is a Freelance Developer based in New Zealand, and a Committer on the Apache Cassandra project. In 2010 he gave up the RDBMS world for the scale and reliability of Cassandra. He now spends his time advancing the Cassandra project and helping others get the best out of it.
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsJulien Anguenot
iland has built a global data warehouse across multiple data centers, collecting and aggregating data from core cloud services including compute, storage and network as well as chargeback and compliance. iland's warehouse brings actionable intelligence that customers can use to manipulate resources, analyze trends, define alerts and share information.
In this session, we would like to present the lessons learned around Cassandra, both at the development and operations level, but also the technology and architecture we put in action on top of Cassandra such as Redis, syslog-ng, RabbitMQ, Java EE, etc.
Finally, we would like to share insights on how we are currently extending our platform with Spark and Kafka and what our motivations are.
The document discusses Spark exceptions and errors related to shuffling data between nodes. It notes that tasks can fail due to out of memory errors or files being closed prematurely. It also provides explanations of Spark's shuffle operations and how data is written and merged across nodes during shuffles.
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)Spark Summit
Typesafe has launched Spark support for Mesosphere's Data Center Operating System (DCOS). Typesafe engineers are contributing to the Mesos support for Spark and Typesafe will provide commercial support for Spark development and production deployment on Mesos. Mesos' flexibility allows many frameworks like Spark to run on top of it. This document discusses Spark on Mesos in coarse-grained and fine-grained modes and some features coming soon like dynamic allocation and constraints.
Real-time data analytics with Cassandra at ilandJulien Anguenot
This presentation, from the 2014 Datastax Cassandra Tech Day in Houston, TX, is a quick overview on how iland Internet Solutions leverages Cassandra as its sole data storage platform for performance metrics as well as configuration across multiple-data centers in the US, EU and Asia.
iland exposes these metrics to its customers trough its ECS portal application which is covering a wealth of functionality including offering visibility into resource consumption, billing, performance, alerts, the impact of change and other key areas. The platform also provides predictive analytics that help companies monitor performance, achieve consistency and anticipate growth requirements.
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...DataStax
A solid backup strategy is a DBA's bread and butter. Cassandra's nodetool snapshot makes it easy to back up the SSTable files, but there remains the question of where to put them and how. Knewton's backup strategy uses Ansible for distributed backups and stores them in S3.
Unfortunately, it's all too easy to store backups that are essentially useless due to the absence of a coherent restoration strategy. This problem proved much more difficult and nuanced than taking the backups themselves. I will discuss Knewton's restoration strategy, which again leverages Ansible, yet I will focus on general principles and pitfalls to be avoided. In particular, restores necessitated modifying our backup strategy to generate cluster-wide metadata that is critical for a smooth automated restoration. Such pitfalls indicate that a restore-focused backup design leads to faster and more deterministic recovery.
About the Speaker
Joshua Wickman Database Engineer, Knewton
Dr. Joshua Wickman is currently part of the database team at Knewton, a NYC tech company focused on adaptive learning. He earned his PhD at the University of Delaware in 2012, where he studied particle physics models of the early universe. After a brief stint teaching college physics, he entered the New York tech industry in 2014 working with NoSQL, first with MongoDB and then Cassandra. He was certified in Cassandra at his first Cassandra Summit in 2015.
A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...DataStax
Successfully running Apache Cassandra in production often means knowing what configuration settings to change and which ones to leave as default. Over the years the cassandra.yaml file has grown to provide a number of settings that can improve stability and performance. While the file contains plenty of helpful comments, there is more to be said about the settings and when to change them.
In this talk Edward Capriolo, Consultant at The Last Pickle, will break down the parameters in the configuration files. Looking at those that are essential to getting started, those that impact performance, those that improve availability, the exotic ones, and the ones that should not be played with. This talk is ideal for someone someone setting up Cassandra for the first time up to people with deployments in productions and wondering what the more exotic configuration options do.
About the Speaker
Edward Capriolo Consultant, The Last Pickle
Long time Apache Cassandra user, big data enthusiast.
SF ElasticSearch Meetup 2013.04.06 - MonitoringSushant Shankar
Using monitoring tools Zabbix for systems-level monitoring of ElasticSearch and SPM (http://sematext.com/spm/elasticsearch-performance-monitoring/index.html) for ElasticSearch-specific monitoring. Using these tools was crucial was optimizing index building performance as well as query performance. Some general tips for index building and query performance.
Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...DataStax
With the addition of vnodes (Virtual Nodes), Cassandra users were able to gain a few benefits as a result of streaming when it came to bootstrapping and decommissioning nodes. On the flip side, having to route requests on larger clusters became a lot more intensive of a workload for all nodes that were then forced to act coordinator nodes. By setting up a tier of proxy nodes, we were able to have our cluster of 50 nodes perform with a 300% improvement on average in a mixed workload environment. This is an explanation of what we did, how we did it, and why it works.
About the Speaker
Eric Lubow CTO, SimpleReach
Eric Lubow is CTO of SimpleReach, where he builds highly-scalable distributed systems for processing analytics data. Eric is also a DataStax MVP for Cassandra, and co-author of Practical Cassandra. In his spare time, Eric is a skydiver, motorcycle rider, mixed martial artist, and dog dad.
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
Customizing JVM settings for the needs of an application can be a tricky business, especially when running externally developed software such as Cassandra. In this talk I will share our experiences and the procedure that we have used to test and validate changes with Java tuning. We'll explore with two recent experiences: changes and monitoring of G1 garbage collection, and moving buffer objects off the heap.
For the talk, I'll discuss our tuning process at Knewton. I will share some of the challenges that we faced while identifying what we expected to learn. I'll discuss how we isolated and minimized variables across tests, the importance of the duration of these tests, and how we try to separate correlation from causation. I will demonstrate how to use and interpret the results of the custom scripts that we were driven to develop to gain visibility into our G1GC processes; these scripts will be open sourced.
About the Speaker
Carlos Monroy Senior Software Engineer, Knewton
Carlos Monroy is a senior engineer on the database team at Knewton, an education company that created an adaptive learning platform. Carlos has been developing software professionally since 1998. His experience holding multiple roles on the software lifecycle provides him a wholistic approach. Having used over a half dozen relational database engines, he has recently come over to the NoSQL side, first working with HBase and for the last three years Cassandra.
C* Summit 2013: Cassandra at Instagram by Rick BransonDataStax Academy
Speaker: Rick Branson, Infrastructure Engineer at Instagram
Cassandra is a critical part of Instagram's large scale site infrastructure that supports more than 100 million active users. This talk is a practical deep dive into data models, systems architecture, and challenges encountered during the implementation process.
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...DataStax
Traditionally, machines were statically partitioned across the different services at Uber. In an effort to increase the machine utilization, Uber has recently started transitioning most of its services, including the storage services, to run on top of Mesos. This presentation will describe the initial experience building and operating a framework for running Cassandra on top of Mesos running across multiple datacenters at Uber. This framework automates several Cassandra operations such as node repairs, addition of new nodes and backup/restore. It improves efficiency by co-locating CPU-intensive services as well as multiple Cassandra nodes on the same Mesos agent. It handles failure and restart of Mesos agents by using persistent volumes and dynamic reservations. This talk includes statistics about the number of Cassandra clusters in production, time taken to start a new cluster, add a new node, detect a node failure; and the observed Cassandra query throughput and latency.
About the Speaker
Abhishek Verma Software Engineer, Uber
Dr. Abhishek Verma is currently working on running Cassandra on top of Mesos at Uber. Prior to this, he worked on BorgMaster at Google and was the first author of the Borg paper published in Eurosys 2015. He received an MS in 2010 and a PhD in 2012 in Computer Science from the University of Illinois at Urbana-Champaign, during which he authored more than 20 publications in conferences, journals and books and presented tens of talks.
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
Making sure your Data Model will work on the production cluster after 6 months as well as it does on your laptop is an important skill. It's one that we use every day with our clients at The Last Pickle, and one that relies on tools like the cassandra-stress. Knowing how the data model will perform under stress once it has been loaded with data can prevent expensive re-writes late in the project.
In this talk Christopher Batey, Consultant at The Last Pickle, will shed some light on how to use the cassandra-stress tool to test your own schema, graph the results and even how to extend the tool for your own use cases. While this may be called premature optimisation for a RDBS, a successful Cassandra project depends on it's data model.
About the Speaker
Christopher Batey Consultant / Software Engineer, The Last Pickle
Christopher (@chbatey) is a part time consultant at The Last Pickle where he works with clients to help them succeed with Apache Cassandra as well as a freelance software engineer working in London. Likes: Scala, Haskell, Java, the JVM, Akka, distributed databases, XP, TDD, Pairing. Hates: Untested software, code ownership. You can checkout his blog at: http://www.batey.info
Introduction to Cassandra: Replication and ConsistencyBenjamin Black
A short introduction to replication and consistency in the Cassandra distributed database. Delivered April 28th, 2010 at the Seattle Scalability Meetup.
Cassandra is the dominant data store used at Netflix and it's health is critical to many of its services. In this talk we will share details of the recent redesign of our health monitoring system and how we leveraged a reactive stream processing system to give us a real-time view our entire fleet while dramatically improving accuracy and reducing false alarms in our alerting.
About the Speaker
Jason Cacciatore Senior Software Engineer, Netflix
Jason Cacciatore is a Senior Software Engineer at Netflix, where he's been working for the past several years. He's interested in stateful distributed systems and has a diverse background in technology. In his spare time he enjoys spending time with his wife and two sons, reading non-fiction, and watching Netflix documentaries.
Brisk - Truly peer-to-peer hadoop.
Brisk is an open-source Hadoop & Hive distribution that uses Apache Cassandra for its core services and storage. Brisk makes it possible to run Hadoop MapReduce on top of CassandraFS, an HDFS-compatible storage layer. By replacing HDFS with CassandraFS, users leverage MapReduce jobs on Cassandra’s peer-to-peer, fault-tolerant and scalable architecture.
With CassandraFS all nodes are peers. Data files can be loaded through any node in the cluster and any node can serve as the JobTracker for MapReduce jobs. Hive MetaStore is stored & accessed as just another column family (table) on the distributed data store. Brisk makes Hadoop truly peer-to-peer.
We demonstrate visualisation & monitoring of Brisk using OpsCenter. The operational simplicity of cassandra’s multi-datacenter & multi-region aware replication makes Brisk well-suited for a rich set of Applications and usecases. And by being able to store and isolate hdfs & online data within the same data cluster, Brisk makes analytics possible without ETL!
LA Scalability Talk, Mahalo
May 31.2011
Advanced Apache Cassandra Operations with JMXzznate
Nodetool is a command line interface for managing a Cassandra node. It provides commands for node administration, cluster inspection, table operations and more. The nodetool info command displays node-specific information such as status, load, memory usage and cache details. The nodetool compactionstats command shows compaction status including active tasks and progress. The nodetool tablestats command displays statistics for a specific table including read/write counts, space usage, cache usage and latency.
Cassandra and Spark: Optimizing for Data Locality-(Russell Spitzer, DataStax)Spark Summit
This document discusses how the Spark Cassandra Connector optimizes for data locality when performing analytics on Cassandra data using Spark. It does this by using the partition keys and token ranges to create Spark partitions that correspond to the data distribution across the Cassandra nodes, allowing work to be done locally to each data node without moving data across the network. This improves performance and avoids the costs of data shuffling.
This document discusses bulk loading and unloading data into and from Cassandra. It describes using CQL INSERT statements via Java drivers or CQLSH COPY FROM for loading, as well as using SSTable files via sstableloader or custom code. For unloading, it recommends using parallel CQL SELECT queries by splitting the token range across multiple connections. Testing showed Java asynchronous INSERTs to be the fastest loading method in most cases, while sstableloader requires all nodes be online. Batching INSERTs can improve throughput but increases latency.
Cassandra Community Webinar: Apache Cassandra InternalsDataStax
Apache Cassandra solves many interesting problems to provide a scalable, distributed, fault tolerant database. Cluster wide operations track node membership, direct requests and implement consistency guarantees. At the node level, the Log Structured storage engine provides high performance reads and writes. All of this is implemented in a Java code base that has greatly matured over the past few years.
In this webinar Aaron Morton will step through read and write requests, automatic processes and manual maintenance tasks. He will also discuss the general approach to solving the problem and drill down to the code responsible for implementation.
Speaker: Aaron Morton, Apache Cassandra Committer
Aaron Morton is a Freelance Developer based in New Zealand, and a Committer on the Apache Cassandra project. In 2010 he gave up the RDBMS world for the scale and reliability of Cassandra. He now spends his time advancing the Cassandra project and helping others get the best out of it.
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsJulien Anguenot
iland has built a global data warehouse across multiple data centers, collecting and aggregating data from core cloud services including compute, storage and network as well as chargeback and compliance. iland's warehouse brings actionable intelligence that customers can use to manipulate resources, analyze trends, define alerts and share information.
In this session, we would like to present the lessons learned around Cassandra, both at the development and operations level, but also the technology and architecture we put in action on top of Cassandra such as Redis, syslog-ng, RabbitMQ, Java EE, etc.
Finally, we would like to share insights on how we are currently extending our platform with Spark and Kafka and what our motivations are.
The document discusses Spark exceptions and errors related to shuffling data between nodes. It notes that tasks can fail due to out of memory errors or files being closed prematurely. It also provides explanations of Spark's shuffle operations and how data is written and merged across nodes during shuffles.
Spark on Mesos-A Deep Dive-(Dean Wampler and Tim Chen, Typesafe and Mesosphere)Spark Summit
Typesafe has launched Spark support for Mesosphere's Data Center Operating System (DCOS). Typesafe engineers are contributing to the Mesos support for Spark and Typesafe will provide commercial support for Spark development and production deployment on Mesos. Mesos' flexibility allows many frameworks like Spark to run on top of it. This document discusses Spark on Mesos in coarse-grained and fine-grained modes and some features coming soon like dynamic allocation and constraints.
Real-time data analytics with Cassandra at ilandJulien Anguenot
This presentation, from the 2014 Datastax Cassandra Tech Day in Houston, TX, is a quick overview on how iland Internet Solutions leverages Cassandra as its sole data storage platform for performance metrics as well as configuration across multiple-data centers in the US, EU and Asia.
iland exposes these metrics to its customers trough its ECS portal application which is covering a wealth of functionality including offering visibility into resource consumption, billing, performance, alerts, the impact of change and other key areas. The platform also provides predictive analytics that help companies monitor performance, achieve consistency and anticipate growth requirements.
Cassandra Backups and Restorations Using Ansible (Joshua Wickman, Knewton) | ...DataStax
A solid backup strategy is a DBA's bread and butter. Cassandra's nodetool snapshot makes it easy to back up the SSTable files, but there remains the question of where to put them and how. Knewton's backup strategy uses Ansible for distributed backups and stores them in S3.
Unfortunately, it's all too easy to store backups that are essentially useless due to the absence of a coherent restoration strategy. This problem proved much more difficult and nuanced than taking the backups themselves. I will discuss Knewton's restoration strategy, which again leverages Ansible, yet I will focus on general principles and pitfalls to be avoided. In particular, restores necessitated modifying our backup strategy to generate cluster-wide metadata that is critical for a smooth automated restoration. Such pitfalls indicate that a restore-focused backup design leads to faster and more deterministic recovery.
About the Speaker
Joshua Wickman Database Engineer, Knewton
Dr. Joshua Wickman is currently part of the database team at Knewton, a NYC tech company focused on adaptive learning. He earned his PhD at the University of Delaware in 2012, where he studied particle physics models of the early universe. After a brief stint teaching college physics, he entered the New York tech industry in 2014 working with NoSQL, first with MongoDB and then Cassandra. He was certified in Cassandra at his first Cassandra Summit in 2015.
A Detailed Look At cassandra.yaml (Edward Capriolo, The Last Pickle) | Cassan...DataStax
Successfully running Apache Cassandra in production often means knowing what configuration settings to change and which ones to leave as default. Over the years the cassandra.yaml file has grown to provide a number of settings that can improve stability and performance. While the file contains plenty of helpful comments, there is more to be said about the settings and when to change them.
In this talk Edward Capriolo, Consultant at The Last Pickle, will break down the parameters in the configuration files. Looking at those that are essential to getting started, those that impact performance, those that improve availability, the exotic ones, and the ones that should not be played with. This talk is ideal for someone someone setting up Cassandra for the first time up to people with deployments in productions and wondering what the more exotic configuration options do.
About the Speaker
Edward Capriolo Consultant, The Last Pickle
Long time Apache Cassandra user, big data enthusiast.
SF ElasticSearch Meetup 2013.04.06 - MonitoringSushant Shankar
Using monitoring tools Zabbix for systems-level monitoring of ElasticSearch and SPM (http://sematext.com/spm/elasticsearch-performance-monitoring/index.html) for ElasticSearch-specific monitoring. Using these tools was crucial was optimizing index building performance as well as query performance. Some general tips for index building and query performance.
Logging. Everyone does it. Many don't know why they do it. It is often considered a boring chore. A chore that is done by habit rather than for a purpose. But it doesn't have to be! Learn how to build a powerful, scalable open source logging environment with LogStash.
The document discusses Netflix's use of Elasticsearch for querying log events. It describes how Netflix evolved from storing logs in files to using Elasticsearch to enable interactive exploration of billions of log events. It also summarizes some of Netflix's best practices for running Elasticsearch at scale, such as automatic sharding and replication, flexible schemas, and extensive monitoring.
This document discusses using MapReduce with Cassandra. It describes how writing to Cassandra from MapReduce has always been possible, while reading was enabled starting with Cassandra 0.6.x. Using MapReduce with Cassandra provides analytics capabilities and avoids single points of failure compared to MapReduce with HBase. The document covers setup and configuration considerations like locality, and provides examples of a separate cluster approach and hybrid cluster approach. It also outlines future work like improving output to Cassandra and adding Hive support.
The document summarizes a presentation on integrating Oracle Real Application Clusters (RAC) with Oracle GoldenGate 12c. The presentation covers:
- Configuring an ASM Cluster File System (ACFS) for the shared storage needed by GoldenGate in a RAC environment.
- Installing Oracle GoldenGate 12c and configuring it to use the ACFS.
- Creating an application VIP and registering GoldenGate with the Oracle Grid Infrastructure bundled agent to enable automated startup and failover of GoldenGate processes on RAC nodes.
- Demonstrations of stopping GoldenGate on one node and verifying failover to the other node.
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Docker, Inc.
Docker provides a standardized way to build, ship, and run Linux containers. It uses Linux kernel features like namespaces and cgroups to isolate containers and make them lightweight. Docker allows building container images using Dockerfiles and sharing them via public or private registries. Images can be pulled and run anywhere. Docker aims to make containers easy to use and commoditize the container technology provided by Linux containers (LXC).
Lightweight Virtualization with Linux Containers and Docker | YaC 2013dotCloud
This document provides an overview of lightweight virtualization using Linux containers and Docker. It begins by explaining the problems of deploying applications across different environments and targets, and how containers can help solve this issue similarly to how shipping containers standardized cargo transportation. It then discusses what Linux containers are, how they provide isolation using namespaces and cgroups. It introduces Docker and how it builds on containers to further simplify deployment by allowing images to be easily built, shared, and run anywhere through standard formats and tools.
This document discusses the process of rebalancing in Voldemort. It begins by outlining the high-level steps taken, including getting the current and target cluster states, planning partition movements in batches, changing cluster metadata and rebalancing states, migrating data with redundancy checks, and rolling back changes if failures occur. Key aspects like maintaining consistency through proxying requests and handling failure scenarios are also summarized.
Building a serverless company on AWS lambda and Serverless frameworkLuciano Mammino
Planet9energy.com is a new electricity company building a sophisticated analytics and energy trading platform for the UK market. Since the earliest draft of the platform, we took the unconventional decision to go serverless and build the product on top of AWS Lambda and the Serverless framework using Node.js. In this talk, I want to discuss why we took this radical decision, what are the pros and cons of this approach and what are the main issues we faced as a tech team in our design and development experience. We will discuss how normal things like testing and deployment need to be re-thought to work on a serverless fashion but also the benefits of (almost) infinite self-scalability and the peace of mind of not having to manage hundreds of servers. Finally, we will underline how Node.js seems to fit naturally in this scenario and how it makes developing serverless applications extremely convenient.
Technologies:
Backend
Frontend
Application architecture
Javascript
cloud computing
Kettunen, miaubiz fuzzing at scale and in styleDefconRussia
This document summarizes information about fuzzing and discusses bugs found through fuzzing browsers like Chromium and Firefox in 2012. It describes AddressSanitizer output that reveals crashes, mentions 50 duplicate bugs found by two fuzzing groups, and outlines tips for smarter fuzzing like generating inputs based on specifications and minimizing reproducible test cases.
The document discusses the tools and practices used by a Ruby development team, including using RVM for managing Ruby versions and gemsets, Postgres.app for the database, Pow for local development, Git for version control, GitHub pull requests for code reviews, CircleCI for continuous integration and deployment to Heroku, Capistrano or Mina for deployment automation, and services like Rollbar and HipChat for error tracking and communication. Consistent coding styles, Sublime Text settings, and code quality practices like testing and reviews are also recommended.
This document discusses wxPerl, which provides bindings for the wxWidgets GUI toolkit to allow building graphical user interfaces in Perl that are cross-platform. It covers why desktop apps are still useful, why Perl is a good choice, an overview of the wxPerl distribution and architecture, techniques for cross-platform development such as version control and automation, challenges of installation on different platforms, ideas for "Perlinizing" or improving the wxPerl modules, testing approaches, and packaging/deployment strategies.
Sally and Leo use infrastructure as code practices like Cucumber, ServerSpec, Vagrant, and Ansible to automate the provisioning and configuration of a web server. They write behavior tests in Cucumber and infrastructure tests in ServerSpec. Vagrant is used to provision a virtual machine, and Ansible configures the server. By tying the tests to the provisioning code, they can now repeatedly build servers that are known to meet requirements.
The document discusses methods for sharding MySQL databases. It begins with an introduction to sharding and the different types of sharding methods. It then provides details on building a large database cluster using the Palomino Cluster Tool, which utilizes configuration management tools like Ansible, Chef and Puppet. The document concludes with a section on administering the large database cluster using the open source tool Jetpants.
The document discusses using Vagrant and cloud platforms like GCP to develop and deploy applications from development to production. It introduces Vagrant as a tool for setting up and managing development environments and shows how to use Vagrant with FreeBSD. It then demonstrates provisioning a FreeBSD VM on GCP and discusses identity and access management on the cloud platform. The document aims to provide an overview of using Vagrant for development and cloud platforms like GCP for production deployments.
Критика "библиотечного" подхода в разработке под Android. UA Mobile 2016.UA Mobile
The document criticizes the "library" approach to Android development and advocates for a more minimal and performant approach. It discusses several common libraries used in Android development such as ORMs, EventBus, and RxAndroid. It argues that many libraries are not optimized for Android, can have performance issues, and promote coupling between components. The document recommends choosing libraries carefully based on your needs, testing library performance, and favoring simpler solutions over complex libraries when possible to follow Android development best practices around performance and resource usage.
Adventures in Thread-per-Core Async with Redpanda and SeastarScyllaDB
Thread-per-core programming models are well known in software domains where latency is important. Pinning application threads to physical cores and handling data in a core-affine way can yield impressive results, but what is it like to build a large scale general purpose software product within these constraints?
Redpanda is a high performance persistent stream engine, built on the Seastar framework. This session will describe practical experience of building high performance systems with C++20 in an asynchronous runtime, the unexpected simplicity that can come from strictly mapping data to cores, and explore the challenges & tradeoffs in adopting a thread-per-core architecture.
This document discusses deployment strategies for Ruby on Rails applications. It covers common components of the "Rails stack" including databases, caching, application servers and load balancers. Popular options are mentioned like Nginx, Memcached, Mongrel and Capistrano. Performance optimization techniques are also summarized, such as caching, background processing and memory usage strategies. The document concludes with an overview of common deployment processes and challenges.
Analyze Virtual Machine Overhead Compared to Bare Metal with TracingScyllaDB
The document compares the performance of running a MySQL database benchmark (Sysbench) on virtual machines versus bare metal machines. On Fedora, the benchmark achieved 6-7% higher transactions per second, queries per second, and lower latency when run on the bare metal host compared to the virtual machine guest. Similarly, on Debian, the benchmark achieved significantly higher transactions per second (over 500 vs under 80) and lower latency when run on bare metal. Tracing tools like trace-cmd can be used to analyze the additional overhead introduced by the virtualization layer.
Linux Support for SMARC: How Toradex Empowers Embedded DevelopersToradex
Toradex brings robust Linux support to SMARC (Smart Mobility Architecture), ensuring high performance and long-term reliability for embedded applications. Here’s how:
• Optimized Torizon OS & Yocto Support – Toradex provides Torizon OS, a Debian-based easy-to-use platform, and Yocto BSPs for customized Linux images on SMARC modules.
• Seamless Integration with i.MX 8M Plus and i.MX 95 – Toradex SMARC solutions leverage NXP’s i.MX 8 M Plus and i.MX 95 SoCs, delivering power efficiency and AI-ready performance.
• Secure and Reliable – With Secure Boot, over-the-air (OTA) updates, and LTS kernel support, Toradex ensures industrial-grade security and longevity.
• Containerized Workflows for AI & IoT – Support for Docker, ROS, and real-time Linux enables scalable AI, ML, and IoT applications.
• Strong Ecosystem & Developer Support – Toradex offers comprehensive documentation, developer tools, and dedicated support, accelerating time-to-market.
With Toradex’s Linux support for SMARC, developers get a scalable, secure, and high-performance solution for industrial, medical, and AI-driven applications.
Do you have a specific project or application in mind where you're considering SMARC? We can help with Free Compatibility Check and help you with quick time-to-market
For more information: https://www.toradex.com/computer-on-modules/smarc-arm-family
Generative Artificial Intelligence (GenAI) in BusinessDr. Tathagat Varma
My talk for the Indian School of Business (ISB) Emerging Leaders Program Cohort 9. In this talk, I discussed key issues around adoption of GenAI in business - benefits, opportunities and limitations. I also discussed how my research on Theory of Cognitive Chasms helps address some of these issues
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Aqusag Technologies
In late April 2025, a significant portion of Europe, particularly Spain, Portugal, and parts of southern France, experienced widespread, rolling power outages that continue to affect millions of residents, businesses, and infrastructure systems.
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPathCommunity
Join this UiPath Community Berlin meetup to explore the Orchestrator API, Swagger interface, and the Test Manager API. Learn how to leverage these tools to streamline automation, enhance testing, and integrate more efficiently with UiPath. Perfect for developers, testers, and automation enthusiasts!
📕 Agenda
Welcome & Introductions
Orchestrator API Overview
Exploring the Swagger Interface
Test Manager API Highlights
Streamlining Automation & Testing with APIs (Demo)
Q&A and Open Discussion
Perfect for developers, testers, and automation enthusiasts!
👉 Join our UiPath Community Berlin chapter: https://community.uipath.com/berlin/
This session streamed live on April 29, 2025, 18:00 CET.
Check out all our upcoming UiPath Community sessions at https://community.uipath.com/events/.
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell
With expertise in data architecture, performance tracking, and revenue forecasting, Andrew Marnell plays a vital role in aligning business strategies with data insights. Andrew Marnell’s ability to lead cross-functional teams ensures businesses achieve sustainable growth and operational excellence.
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfSoftware Company
Explore the benefits and features of advanced logistics management software for businesses in Riyadh. This guide delves into the latest technologies, from real-time tracking and route optimization to warehouse management and inventory control, helping businesses streamline their logistics operations and reduce costs. Learn how implementing the right software solution can enhance efficiency, improve customer satisfaction, and provide a competitive edge in the growing logistics sector of Riyadh.
Spark is a powerhouse for large datasets, but when it comes to smaller data workloads, its overhead can sometimes slow things down. What if you could achieve high performance and efficiency without the need for Spark?
At S&P Global Commodity Insights, having a complete view of global energy and commodities markets enables customers to make data-driven decisions with confidence and create long-term, sustainable value. 🌍
Explore delta-rs + CDC and how these open-source innovations power lightweight, high-performance data applications beyond Spark! 🚀
Quantum Computing Quick Research Guide by Arthur MorganArthur Morgan
This is a Quick Research Guide (QRG).
QRGs include the following:
- A brief, high-level overview of the QRG topic.
- A milestone timeline for the QRG topic.
- Links to various free online resource materials to provide a deeper dive into the QRG topic.
- Conclusion and a recommendation for at least two books available in the SJPL system on the QRG topic.
QRGs planned for the series:
- Artificial Intelligence QRG
- Quantum Computing QRG
- Big Data Analytics QRG
- Spacecraft Guidance, Navigation & Control QRG (coming 2026)
- UK Home Computing & The Birth of ARM QRG (coming 2027)
Any questions or comments?
- Please contact Arthur Morgan at [email protected].
100% human made.
Big Data Analytics Quick Research Guide by Arthur MorganArthur Morgan
This is a Quick Research Guide (QRG).
QRGs include the following:
- A brief, high-level overview of the QRG topic.
- A milestone timeline for the QRG topic.
- Links to various free online resource materials to provide a deeper dive into the QRG topic.
- Conclusion and a recommendation for at least two books available in the SJPL system on the QRG topic.
QRGs planned for the series:
- Artificial Intelligence QRG
- Quantum Computing QRG
- Big Data Analytics QRG
- Spacecraft Guidance, Navigation & Control QRG (coming 2026)
- UK Home Computing & The Birth of ARM QRG (coming 2027)
Any questions or comments?
- Please contact Arthur Morgan at [email protected].
100% human made.
What is Model Context Protocol(MCP) - The new technology for communication bw...Vishnu Singh Chundawat
The MCP (Model Context Protocol) is a framework designed to manage context and interaction within complex systems. This SlideShare presentation will provide a detailed overview of the MCP Model, its applications, and how it plays a crucial role in improving communication and decision-making in distributed systems. We will explore the key concepts behind the protocol, including the importance of context, data management, and how this model enhances system adaptability and responsiveness. Ideal for software developers, system architects, and IT professionals, this presentation will offer valuable insights into how the MCP Model can streamline workflows, improve efficiency, and create more intuitive systems for a wide range of use cases.
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxshyamraj55
We’re bringing the TDX energy to our community with 2 power-packed sessions:
🛠️ Workshop: MuleSoft for Agentforce
Explore the new version of our hands-on workshop featuring the latest Topic Center and API Catalog updates.
📄 Talk: Power Up Document Processing
Dive into smart automation with MuleSoft IDP, NLP, and Einstein AI for intelligent document workflows.
How Can I use the AI Hype in my Business Context?Daniel Lehner
𝙄𝙨 𝘼𝙄 𝙟𝙪𝙨𝙩 𝙝𝙮𝙥𝙚? 𝙊𝙧 𝙞𝙨 𝙞𝙩 𝙩𝙝𝙚 𝙜𝙖𝙢𝙚 𝙘𝙝𝙖𝙣𝙜𝙚𝙧 𝙮𝙤𝙪𝙧 𝙗𝙪𝙨𝙞𝙣𝙚𝙨𝙨 𝙣𝙚𝙚𝙙𝙨?
Everyone’s talking about AI but is anyone really using it to create real value?
Most companies want to leverage AI. Few know 𝗵𝗼𝘄.
✅ What exactly should you ask to find real AI opportunities?
✅ Which AI techniques actually fit your business?
✅ Is your data even ready for AI?
If you’re not sure, you’re not alone. This is a condensed version of the slides I presented at a Linkedin webinar for Tecnovy on 28.04.2025.
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfAbi john
Analyze the growth of meme coins from mere online jokes to potential assets in the digital economy. Explore the community, culture, and utility as they elevate themselves to a new era in cryptocurrency.
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025BookNet Canada
Book industry standards are evolving rapidly. In the first part of this session, we’ll share an overview of key developments from 2024 and the early months of 2025. Then, BookNet’s resident standards expert, Tom Richardson, and CEO, Lauren Stewart, have a forward-looking conversation about what’s next.
Link to recording, transcript, and accompanying resource: https://bnctechforum.ca/sessions/standardsgoals-for-2025-standards-certification-roundup/
Presented by BookNet Canada on May 6, 2025 with support from the Department of Canadian Heritage.
Dev Dives: Automate and orchestrate your processes with UiPath MaestroUiPathCommunity
This session is designed to equip developers with the skills needed to build mission-critical, end-to-end processes that seamlessly orchestrate agents, people, and robots.
📕 Here's what you can expect:
- Modeling: Build end-to-end processes using BPMN.
- Implementing: Integrate agentic tasks, RPA, APIs, and advanced decisioning into processes.
- Operating: Control process instances with rewind, replay, pause, and stop functions.
- Monitoring: Use dashboards and embedded analytics for real-time insights into process instances.
This webinar is a must-attend for developers looking to enhance their agentic automation skills and orchestrate robust, mission-critical processes.
👨🏫 Speaker:
Andrei Vintila, Principal Product Manager @UiPath
This session streamed live on April 29, 2025, 16:00 CET.
Check out all our upcoming Dev Dives sessions at https://community.uipath.com/dev-dives-automation-developer-2025/.
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...SOFTTECHHUB
I started my online journey with several hosting services before stumbling upon Ai EngineHost. At first, the idea of paying one fee and getting lifetime access seemed too good to pass up. The platform is built on reliable US-based servers, ensuring your projects run at high speeds and remain safe. Let me take you step by step through its benefits and features as I explain why this hosting solution is a perfect fit for digital entrepreneurs.
2. This is NOT strictly a
Cassandra talk.
♫ There's no earthly way of knowing ♫
3. This is an infrastructure talk.
♫ How your infrastructure's growing. ♫
4. Startups move fast.
Priorities change.
Infrastructure needs to be able to
pivot, too.
♫ Who knows where business is going.
Or which way the data's flowing. ♫
5. When you scale up,
so do your problems.
♫ Drives imploding?
IO plateauing? ♫
6. Not to mention unexpected
disasters.
We lost a whole data center during
Hurricane Sandy.
♫ Is a hurricane a'blowing? ♫
7. How do you keep up with growth?
♫ There's no earthly way of knowing ♫
8. How do you deal with failure?
♫ Are the status LEDs a 'glowing?
Is the server reaper mowing? ♫
9. How do you deal with too much
success?
♫ Yes! The danger must be growing
For the data keeps on flowing. ♫
10. What do you do?
♫ And they're certainly not showing
any signs that they are slowing! ♫
12. ♫ Come with me
And you'll be
In a world of systems automation ♫
13. ♫ Take a look
And you’ll see
Into my Chef lucubrations
So login, Install, begin
With the Chef cookbook of my creation
What you'll see might require
Explanation ♫
14. ♫ If you want to view paradise
Simply go to Github and view it
Pull requests welcome, go to it
Want to change the code
A merge will do it ♫
https://github.com/linkedin/glu/
https://github.com/octo/collectd/
https://github.com/opscode/chef/
https://github.com/saltstack/salt/
https://github.com/outbrain/onering/
https://github.com/nmilford/chef-cassandra/
https://github.com/rabbitmq/rabbitmq-server/
15. def discover_cassandra_schema
require 'cassandra-cql'
schema = {}
server = "#{node[:ipaddress]}:#{node[:Cassandra][:rpc_port]}"
db = CassandraCQL::Database.new("#{server}") rescue nil
if db
db.keyspaces.collect{|s| schema[s.name] =
s.column_families.collect{|cfname, cfobj| cfname } }
schema.delete("system")
schema.delete("OpsCenter")
return schema ♫ There is no life I know
end To compare with writing automation
return nil Write it once
end You’ll be free♫
17. ♫ If you want to scale past a petabyte
Just install Chef, Salt and Graphite
If you want to sleep the whole night
Automate the world
It will be all right♫
18. ♫ There is no life I know
To compare with writing automation
Write it once
You’ll be free ♫
20. The
Automation
Factory
A Journey from Bare Metal
to Active Cassandra Node
[email protected]
blog.milford.io
twitter.com/NathanMilford
github.com/nmilford
22. 2 Years Later
● 80 billion impressions a month.
● 4 clusters for disparate
use-cases, more in planning.
● 73 Cassandra nodes
across 3 data centers.
23. Mo' Servers,
Mo' Problems
We got multiple cages of servers.
So... yeah... you can see where
automation might help :)
24. Automation Attack Plan
●
Provisioning!
●
Orchestration! ●
Command and Control!
●
Config Management! ● Monitoring and Alerting!
25. Provisioning
●
Started with Cobbler (which is Awesome!)
●
High performance infrastructures are snowflakes,
can get out of hand fast.
●
No tool that worked completely, end to end, the
tool won't write itself.
26. We Built Our Own: Onering
Note: I am only a moderate Lord of the Rings Fan, and the guy who did most of the work on it, Gary Hetzel, is a
Star Trek fan. We are not responsible for any LotR puns.
https://github.com/outbrain/onering/
27. Onering: Provisioning &
Orchestration
●
Initiates/manages provisioning
and inventory.
●
Acts as an orchestration layer in
our automation.
●
Keeps all metadata, which is
searchable.
●
Has a CLI tool and REST API to
work with.
●
Acts as our single point of truth
& final authority on state.
29. Onering Provisioning Workflow
➔
Developers put in machine requests by role for
quarterly order.
➔
Machines show up, get racked and powered on.
➔
Machines boot into the Razor microkernel and report to
Onering.
➔
Appropriate nodes get kickstarted & bootstrapped into
roles specified.
➔
Additional nodes sit idle in 'allocatable' state.
➔
Once OS is installed, configuration is handed off to...
30. Config Management: Chef
●
Onering bootstraps into a Chef run.
●
Chef installs all the system stuff.
●
Chef sets up Java and tunes the OS how we like.
●
Chef runs the Cassandra Cookbook.
include_recipe "java"
package "apache-cassandra1" do
action :install
end
template "/etc/cassandra/conf/cassandra.yaml" do
owner "cassandra"
group "cassandra"
mode "0755"
source "cassandra.yaml.erb"
end
https://github.com/opscode/chef/
31. Cassandra Cookbook does it all!
●
Builds/mounts disks.
●
Handles multiple clusters,
different versions.
●
Generates configs (in some
cases automatically based
on hardware profile).
●
Connects to local instance
and gets the schema.
●
Generates collectd config
and maintenance script.
●
Schedules maintenance.
https://github.com/nmilford/chef-cassandra
32. Glu: Continuous Deployment
● Not related to getting a C* node
to production, but it's how we get
apps there.
● Built at Linkedin.
● Onering talks to it!
●
Holds deployment metadata.
●
Maven Builds an RPM, dumps to a repo.
●
Glu-Agent yum installs it and performs checks.
https://github.com/linkedin/glu
35. Command & Control:
Distributed commands:
salt '*ny*' cassandra.column_families
salt 'cass*' cassandra.compactionstats
salt '*stg*' cassandra.info
salt 'cass1.ny.*' cassandra.keyspaces
salt -E 'cass1-(stg|prod)' cassandra.netstats
salt '*' cassandra.tpstats
Scary commands:
salt '*' --batch-size 25% service.restart cassandra
salt '*' -b2 cmd.run "nodetool -h $(hostname) -p 7199 snapshot"
We actually wrap salt in Onering to provide AAA, as well to allow use of Onering
metadata for node targeting.
https://github.com/saltstack/salt
37. Common Monitoring & Events Bus
●
A single infrastructure-wide bus for systems
data:
– Metrics
– Events
– Metadata
●
Collectd as systems agent.
●
RabbitMQ as message bus.
●
Graphite as metrics endpoint.
●
Working on an events mechanism.
●
Each layer should be interchangeable.
38. Collectd
●
Been around forever.
●
Had to rebuild the JMX plugin to not use OpenJDK.
●
Easy to write plugins and extend.
●
Writes to RabbitMQ out of the box.
●
Easy to templatize config for Chef.
<% @node[:Cassandra][:Keyspaces].each do |ks| -%>
<% ks[1].each do |cf| -%>
Collect "<%= ks[0] %>.<%= cf %>"
Collect "KeyCache.<%= ks[0] %>.<%= cf %>"
Collect "RowCache.<%= ks[0] %>.<%= cf %>"
<% end -%>
<% end -%>
https://github.com/octo/collectd
39. RabbitMQ
●
Lots of apps support AMPQ.
●
Shovel plugin for multi-site.
●
Pretty stable.
●
I'm not mad at it.
https://github.com/rabbitmq/rabbitmq-server
40. Graphite
●
Plays well with RabbitMQ.
●
Easy to get metrics into.
●
Scads of functions.
●
Easy to get meaningful data out of.
https://launchpad.net/graphite
43. Alerting: Nagios Self Serve
●
Uses Onering for new node discovery.
●
Developers add their own alerts based off of
Graphite data.
●
Ops get fewer alerts and are not a bottleneck.
●
Devs are more engaged.
●
Everyone is happy.