This is the Docker Logging & Monitoring workshop from DockerCon 2018 Europe. We cover the native tools in Docker, deploying an ELK stack, and a Prometheus stack with cAdvisor, node-exporter, and Grafana.
In this presentation, Alex Brett will show how Citrix has constructed a Test-as-a-Service environment which is used by the wider XenServer engineering team, highlighting the benefits the approach provides, together with an introduction to the (recently open sourced) XenRT automation framework which powers it, and discuss how this could be applied within the Xen Project community.
Nagios Conference 2011 - Nicholas Scott - Nagios Performance Tuning - Nagios
Nicholas Scott's presentation on tuning Nagios performance. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
This document discusses performance management for websites and applications. It outlines several ways to monitor and improve performance, including:
1) Monitor performance before and after releasing new code using tools like PHP profilers, page profilers, and load testing to identify issues.
2) Monitor performance in production by tracking metrics like response times, error rates, server uptime, memory usage, and database queries.
3) Improve performance through techniques like caching, using a content delivery network, optimizing database and code, and adding redundancy to prevent downtime. Prevention and testing are emphasized as the best approaches to performance management.
Migrating your application to the cloud is extremely simple, on paper. The hard truth is that the only way to know for sure how it will behave is to test carefully. Capturing a benchmark on premises is hard enough, but benchmarking in the cloud can get really complicated because of the restrictions in PaaS environments and the lack of tools.
Join me in this session and learn how to capture a production workload, replay it against your cloud database, and compare the performance. I will walk you through the methodology and the tools to bring your database to the cloud without batting an eye.
By Gianluca Sartori
Building a High Performance Analytics Platform - Santanu Dey
The document discusses using flash memory to build a high performance data platform. It notes that flash memory is faster than disk storage and cheaper than RAM. The platform utilizes NVMe flash drives connected via PCIe for high speed performance. This allows it to provide in-memory database speeds at the cost and density of solid state drives. It can scale independently by adding compute nodes or storage nodes. The platform offers a unified database for both real-time and analytical workloads through common APIs.
Speed Up Your Existing Relational Databases with Hazelcast and Speedment - Hazelcast
In this webinar
How do you get data from your existing relational databases changed by third party applications into your Hazelcast maps? How do you accomplish this if you have several databases, located on different sites, that need to be aggregated into a global Hazelcast map? How is it possible to reflect data from a relational database that has ten thousand updates per second or more?
Speedment’s SQL Reflector makes it possible to integrate your existing relational data with continuous updates of Hazelcast data-maps in real-time. In this webinar, we will show a couple of real-world cases where database applications are speeded up using Hazelcast maps fed by Speedment. We will also demonstrate how easily your existing database can be “reverse engineered” by the Speedment software that automatically creates efficient Java POJOs that can be used directly by Hazelcast.
We’ll cover these topics:
-Joint solution case studies
-Demo
-Live Q&A
Presenter:
Per-Åke Minborg, CTO at Speedment
Per-Åke Minborg is founder and CTO at Speedment AB. He is a passionate Java developer, dedicated to open source software and an expert in finding new ways of solving problems: the harder the problem, the better. As a result, he has 15+ US patent applications and invention disclosures. He has a deep understanding of in-memory databases, high-performance solutions, cloud technologies and concurrent programming. He has previously served as CTO and founder of Chilirec and the Phone Pages. Per-Åke has an M.Sc. in Electrical Engineering from Chalmers University of Technology and several years of studies in computer science and computer security at university and PhD level.
"What database can tell about application issues? What application can tell a... - Fwdays
This document summarizes common database and application issues including:
- SNAT port exhaustion caused by too many outgoing connections for the available ports
- Blocking queries that prevent other operations from executing concurrently
- Index strategies, such as index seeks, which improve query performance compared with scans
Potential solutions include: connection pooling, scaling out app instances, using private endpoints, and modifying queries, transactions and indexes to reduce blocking.
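Connection pooling, the first of the solutions listed above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the talk: `ConnectionPool`, `factory`, and `connection` are hypothetical names, and the pool does nothing more than cap the number of open connections so that the application reuses them instead of opening a new outbound connection (and consuming a new SNAT port) per request.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool that hands out reusable connections."""

    def __init__(self, factory, size):
        # Create all connections up front; `factory` is any zero-argument
        # callable that returns a new connection object.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Block until a connection is free instead of opening a new one.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the caller raised.
            self._idle.put(conn)
```

A caller would wrap each unit of work in `with pool.connection() as conn:`. However many requests are served, the number of outbound connections never exceeds `size`.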
Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016 - Zabbix
For the last two years I've been working in Cambridge (US) at the Novartis Institute for Biomedical Research (NIBR) on a project supporting HPC cluster infrastructure and users. We're using Zabbix for HPC cluster monitoring (more than 1000 nodes, 10000+ cores, GPU cores, etc). In this presentation we will cover interesting use cases of Zabbix for an HPC cluster, as this is not regular infrastructure monitoring. We will talk about some challenges we have in HPC monitoring and how Zabbix helps us work with scientists, and present some solutions that might be interesting for the Zabbix community.
Securing Databases with Dynamic Credentials and HashiCorp Vault - Mitchell Pronschinske
Dynamic credentials and secrets—meaning credentials that are automatically rotated over a reasonable period of time—are crucial to a strong security posture. Without them, an attacker could move around in your network for months or years with valid credentials.
Frequently rotating credentials can be a major hassle, but HashiCorp Vault is changing that. In this solutions engineering hangout session, Thomas Kula, an SE at HashiCorp will demo how to use Vault to deliver dynamic database credentials in an easy, automated manner.
Zabbix Conference LatAm 2016 - Rodrigo Mohr - Challenges on Large Env with Or... - Zabbix
Scalability on a large environment can be a challenge on many different aspects involving customization of monitors, performance and reporting. The goal of this presentation is to share the experience we had at Dell, monitoring a big number of servers in an environment with constant changes, lots of custom monitors and new servers configured every week. We will present, from our 3 years of experience with Zabbix and Oracle, which positive/negative aspects we have taken from the configuration parameters we used, involving strong use of User Macros, optimization of Database Queries, Table Partitioning and Automation.
Nagios Conference 2011 - Jeff Sly - Case Study Nagios @ Nu Skin - Nagios
Jeff Sly's presentation on using Nagios XI to consolidate multiple monitoring products. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
Best Practices? That’s like asking how long is a piece of string! While every environment is different, there are however a number of configurations, tweaks and methods that can be of great benefit for your Nagios XI environment. This talk will cover a variety of Best Practice topics for Nagios XI ranging from flexible object configurations through to back end performance enhancements.
Nagios Conference 2011 - Kimbrough Henley - Using Nagios To Monitor ServiceDesk - Nagios
Kimbrough Henley's presentation on monitoring ServiceDesk with Nagios. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [CON3671] - Kyle Hailey
The document discusses analyzing I/O performance and summarizing lessons learned. It describes common tools used to measure I/O like moats.sh, strace, and ioh.sh. It also summarizes the top 10 anomalies encountered like caching effects, shared drives, connection limits, I/O request consolidation and fragmentation over NFS, and tiered storage migration. Solutions provided focus on avoiding caching, isolating workloads, proper sizing of NFS parameters, and direct I/O.
Nagios Conference 2011 - Nate Broderick - Nagios XI Large Implementation Tips... - Nagios
Nate Broderick's presentation on Nagios XI large implementation tips and tricks. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
OpenStack Summit Vancouver: Lessons learned on upgrades - Frédéric Lepied
When deploying OpenStack in production at any scale, upgrade support is one of the requirements for a successful deployment. Without upgrade management, a deployment will have bugs and security issues from day 1. In the longer term, it will also miss the latest features that OpenStack offers.
Konstantin Yakovlev - Event Analysis Toolset | ZabConf2016 - Zabbix
During outages in an environment of 10k+ hosts, NOC and Operations teams may face hundreds of alerts while performing root cause analysis, remediation, or escalation, and must at the same time log resolution progress to the Incident Management system for audit purposes.
This presentation will describe RingCentral's approach to Incident and Problem Management in a large Zabbix-monitored cloud.
Co-authors of the presentation: Dmitry Shchemelinin, Ph.D., Sr. Director of Operations, RingCentral, USA.
Structured Container Delivery by Oscar Renalias, Accenture - Docker, Inc.
With tools like Docker Toolbox, the entry barrier to Docker and containers is rather low. However, it takes a lot more to design, build and run an entire container platform, at scale, for production applications.
This talk will focus on why it is important to have a well-defined reference model for building container platforms that guides container engineers and architects through the process of identifying platform concerns, patterns, components as well as the interactions between them in order to deliver a set of platform capabilities (service discovery, load balancing, security, and others) to support containerized applications using existing tooling.
As part of this session we will also see how a container architecture has enabled real projects in their delivery of container platforms.
1. A major airline experienced an outage when engineers performed a planned database failover that unexpectedly caused all check-in kiosks and IVR servers to go down after 3 hours, costing the company hundreds of thousands in delays and expenses.
2. An online retailer crashed on launch day when their new system was overwhelmed by high traffic, with sessions consuming resources due to session replication on every request. They added throttling and blocking to stabilize the system.
3. Zero downtime deployments can be achieved by expanding capacity, doing rolling upgrades of each server one by one, and finally cleaning up old components once complete.
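The three steps in the last item (expand capacity, upgrade servers one by one, clean up) can be sketched as a small simulation. This is an illustrative sketch only: `Server`, `rolling_upgrade`, and the in-memory pool are hypothetical stand-ins for a real load balancer and deployment tooling, which the summary does not describe.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    version: str
    in_rotation: bool = True


def rolling_upgrade(pool, new_version):
    """Upgrade every server in `pool` in place.

    Returns the lowest number of in-rotation servers seen during the
    upgrade, so callers can check that capacity never dropped.
    """
    original_size = len(pool)
    min_serving = original_size

    # Step 1: expand capacity with a spare already on the new version.
    pool.append(Server(name="spare", version=new_version))

    # Step 2: take the original servers out of rotation one at a time.
    for server in pool[:original_size]:
        server.in_rotation = False
        serving = sum(s.in_rotation for s in pool)
        min_serving = min(min_serving, serving)
        server.version = new_version
        server.in_rotation = True

    # Step 3: clean up the temporary spare (it is the last element).
    pool.pop()
    return min_serving
```

Because the spare is added before any server is drained, the number of servers in rotation never falls below the original pool size, which is the property that makes the upgrade invisible to users.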
With more than 140 million users, KakaoTalk is the most popular mobile messaging platform in South Korea. The team at daumkakao has been using OpenStack with the intention of transforming the current legacy infrastructure into a scale-out cloud to build and offer new services for its users. In this session, we'd like to share our experiences with the OpenStack community, specifically in regards to meeting our needs for networking with Neutron. OpenStack Neutron offers a lot of methods to implement networking for VMs and containers. For production operations, VM migration can be a common activity to manage resources and improve uptime. It's not hard using shared storage like Ceph, but network settings, such as IP addresses, need to be preserved. With a shared storage environment, an image can be attached anywhere inside of a data center, but a service IP for a virtual machine is a different story. And when you don't use floating IPs, keeping the same IP across a data center-wide set of VLANs is a hard job. To maintain a virtual machine's IP settings and balance IPs between VLANs, we tried several options including overlay, SDN, and NFV technologies. In the end we came to use a route-only network for our virtual machine networks, leveraging technology like Quagga for RIP, OSPF, and BGP integrated with Neutron.
Many system administrators are responsible for monitoring numerous servers that have different critical roles and often run applications 24/7. Properly planning and selecting monitoring software is important to determine what and how to monitor servers effectively based on requirements, while considering scalability, cost, and support. Nagios is an open-source option that is well-documented, customizable through plugins, and easy to install, but configuration requires using templates and hierarchies to manage alerts and check intervals for large server pools.
Everything You Need to Know About Docker and Storage by Ryan Wallner, ClusterHQ - Docker, Inc.
In this talk, we will provide a 10,000-ft. overview of the key concepts, architectures, and common deployment scenarios for stateful services. We will cover Docker volumes and the storage options available in the community, including ClusterHQ’s Flocker volume manager. After getting the lay of the land, we'll see these concepts in action, starting by deploying a database container on a single node with UCP, Flocker and VolumeHub. Then, using the features of Docker Swarm and Flocker, we will allow Swarm to automatically reschedule the stateful service, with Flocker moving its volume when the node fails, giving us an HA containerized database.
Google Cloud Platform monitoring with Zabbix - Max Kuzkin
This presentation describes how to configure Zabbix (https://zabbix.com/) to monitor Google Cloud Platform events through its Monitoring API, using the gcpmetrics (https://github.com/odin-public/gcpmetrics/) command line tool.
The document discusses Cloudify, an open source platform for deploying, managing, and scaling complex multi-tier applications on cloud infrastructures. It introduces key concepts of Cloudify including topologies defined using TOSCA, workflows written in Python, policies defined in YAML, and how Cloudify ties various automation tools together across the deployment continuum. The document also provides demonstrations of uploading a blueprint to Cloudify and installing an application using workflows, and discusses how Cloudify collects logs, metrics and handles events during workflow execution.
Accelerate Development with Virtual Data - Kyle Hailey
This document summarizes best practices for application development using data virtualization to remove data as a constraint. It discusses how data management currently does not scale with agile development and is a major bottleneck. The solution presented is using a data virtualization appliance to create thin clones from production data for development, QA, and test environments. This allows for self-service provisioning of environments and parallel development. It provides use cases showing how virtual data improves development throughput, shifts testing left to find bugs earlier, and enables continuous delivery of features to production.
A map for DevOps on Microsoft Stack - MS DevSummit - Giulio Vian
This document provides an overview of DevOps on the Microsoft stack. It discusses three ways of implementing DevOps: 1) Flowing work from idea to production using tools like GitHub, Azure Boards, Azure DevOps Server, and infrastructure as code. 2) Gathering feedback using observability tools like Application Insights and alerting. 3) Fostering communication, documentation, learning and fun through tools like GitHub Pages, Teams, LinkedIn Learning and DevTest Labs. The document recommends resources for learning more about DevOps and the Microsoft stack.
This document discusses the importance of comprehensive monitoring for containerized applications and infrastructure. It recommends monitoring at multiple levels, from the host and operating system to the network, orchestration layer, and applications themselves. The document outlines common failure models for each of these levels. It also provides an example of how detailed metrics and monitoring could have helped identify issues that caused an e-commerce site outage. The document concludes by recommending best practices for monitoring, such as starting small, avoiding alert fatigue, and testing monitoring systems.
How to accelerate docker adoption with a simple and powerful user experience - Docker, Inc.
1) Societe Generale aims to accelerate Docker adoption by providing a simple and powerful user experience. They plan to increase their container usage from 2000 to 15,000 containers.
2) They aim to achieve this growth while improving security, quality of service, and reducing VM costs. Their challenge is providing these improvements while maintaining a good user experience.
3) Docker Universal Control Plane (UCP) is used to provide a production cluster with logical isolation and central administration. This achieves multi-tenancy, security/compliance checks, and self-service onboarding.
Application Performance Troubleshooting 1x1 - Part 2 - Noch mehr Schweine und... - rschuppe
Application Performance doesn't come easy. How to find the root cause of performance issues in modern and complex applications? All you have is a complaining user to start with?
In this presentation (mainly in German, but understandable for English speakers) I reprise the fundamentals of troubleshooting and present some new examples of how to tackle issues.
Follow up presentation to "Performance Trouble Shooting 101 - Schweine, Schlangen und Papierschnitte"
Prometheus and Docker (Docker Galway, November 2015) - Brian Brazil
Brian Brazil is an engineer passionate about reliable systems who has worked at Google SRE and Boxever. He discusses Prometheus, an open source monitoring system he helped create. Prometheus offers inclusive monitoring of services, is manageable and reliable, integrates easily with other tools, and provides powerful querying and dashboards. It is efficient, scalable, and helps provide visibility into systems through its data model and labeling.
Best Practices for Becoming an Exceptional Postgres DBA - EDB
Drawing from our teams who support hundreds of Postgres instances and production database systems for customers worldwide, this presentation provides real-world best practices from the nation's top DBAs. Learn top-notch monitoring and maintenance practices, get resource planning advice that can help prevent, resolve, or eliminate common issues, learn top database tuning tricks for increasing system performance, and ultimately gain greater insight into how to improve your effectiveness as a DBA.
Common Pitfalls of Functional Programming and How to Avoid Them: A Mobile Gam..., by gree_tech
This material was presented at CUFP 2013.
Functional programming is already an established technology in many areas. However, the lack of skilled developers has been a challenging hurdle in the adoption of such languages. It is easy for an inexperienced programmer to fall into the many traps of functional programming, resulting in a loss of productivity and bad software quality. Resource leaks caused by Haskell's lazy evaluation, for instance, are only the tip of the iceberg. Knowledge sharing and a mature tool-assisted development process are ways to avoid such pitfalls. At GREE, one of the largest mobile gaming companies, we use Haskell and Scala to develop major components of our platform, such as a distributed NoSQL solution, or an image storage infrastructure. However, only 11 programmers use functional programming in their daily tasks. In this talk, we will describe some unexpected functional programming issues we ran into, how we solved them and how we hope to avoid them in the future. We have developed a system testing framework to enhance regression testing, spent lots of time documenting pitfalls and introduced technical reviews. Recently, we even started holding lunchtime presentations about functional programming in order to attract beginners and prevent them from falling into the same traps.
Proactive ops for container orchestration environments, by Docker, Inc.
This document discusses different approaches to monitoring systems from manual and reactive to proactive monitoring using container orchestration tools. It provides examples of metrics to monitor at the host/hardware, networking, application, and orchestration layers. The document emphasizes applying the principles of observability including structured logging, events and tracing with metadata, and monitoring the monitoring systems themselves. Speakers provide best practices around failure prediction, understanding failure modes, and using chaos engineering to build system resilience.
Early Software Development through Palladium Emulation, by Raghav Nayak
1) The document discusses using emulation to enable early software development for a complex multicore system-on-chip (SoC) design based on Freescale's Layerscape architecture.
2) It describes the challenges of integrating hardware and validating software early in the design cycle. The methodology used emulation to parallelize hardware and software design activities.
3) Key benefits of the emulation approach included enabling more complex software testing earlier, building confidence in the device design, and having boot and operating system code ready before tape out. This allowed issues to be found and addressed early.
This session introduces tools that can help you analyze and troubleshoot performance with SharePoint 2013. This session presents tools like perfmon, Fiddler, Visual Round Trip Analyzer, IIS LogParser, Developer Dashboard and of course we create Web and Load Tests in Visual Studio 2013.
At the end we also take a look at some of the tips and best practices to improve performance on SharePoint 2013.
(ATS6-PLAT07) Managing AEP in an enterprise environment, by BIOVIA
Deployments can range from personal laptop usage to large enterprise environments. The installer allows both interactive and unattended installations. Key folders include Users for individual data, Jobs for temporary execution data, Shared Public for shared resources, and XMLDB for the database. Logs record job executions, authentication events, and errors. Tools like DbUtil allow backup/restore of data, pkgutil creates packages for application delivery, and regress enables test automation. Planning folder locations and maintenance is important for managing resources in an enterprise environment.
The document discusses building an enterprise integration platform on Azure using Terraform. It summarizes the challenges of traditional on-premise integration platforms like BizTalk and how Azure services can address these. It then demonstrates how to define Azure infrastructure as code using Terraform to automate the provisioning of an integration platform across environments in under 45 minutes. The document concludes by discussing how Azure DevOps pipelines can be used to manage deployments and ensure consistency.
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog, by Redis Labs
Think you have big data? What about high availability requirements? At DataDog we process billions of data points every day including metrics and events, as we help the world monitor their applications and infrastructure. Being the world’s monitoring system is a big responsibility, and thanks to Redis we are up to the task. Join us as we discuss how the DataDog team monitors and scales Redis to power our SaaS based monitoring offering. We will discuss our usage and deployment patterns, as well as dive into monitoring best practices for production Redis workloads.
This document describes Cerberus, an open source test automation tool developed by La Redoute. Cerberus allows centralized management of test cases across multiple technologies like web, mobile, and APIs. It supports features like a step library, test automation, execution reporting, and integration with other tools. The document also provides examples of how Cerberus is used at La Redoute for regression testing websites in multiple languages and environments. It maintains over 3,500 regression tests that execute twice daily. Cerberus can also be used for functional monitoring of websites by regularly executing test cases and monitoring performance metrics.
DevOps Fest 2020. immutable infrastructure as code. True story., by Vlad Fedosov
This document discusses the journey of transitioning infrastructure management at Namecheap to an immutable infrastructure as code model using tools like Terraform, Docker, and Jenkins. Key points include taking over a project from an outsourcing company, setting up immutable infrastructure with infrastructure as code, configuring CI/CD pipelines as code in Jenkins, and lessons learned around testing, chaos engineering, and encouraging team feedback. The overall goals were to make infrastructure hard to break, easy to repair, and easy to modify.
Adding Value in the Cloud with Performance Test, by Rodolfo Kohn
This document discusses the importance of performance testing cloud applications and outlines best practices for defining performance requirements, testing methodology, and identifying issues. It provides examples of performance problems found in databases, applications, operating systems, and networks. The key goals of performance testing are to understand system behavior under load, find bottlenecks and hidden bugs, and verify that requirements are met.
The document discusses YouView's efforts to support HTML5 development for their platform. It outlines their process for allowing local development on production devices, tests of popular JavaScript frameworks for performance, and best practices for optimizing loading times such as lazy loading, modular code, and build tools. Test results showed underscore.js had the best performance and loading code in modular bundles helped speed up loading times by 2x.
Learn how to improve the performance of your Cognos environment. We cover hardware and server specifics, architecture setup, dispatcher tuning, report specific tuning including the Interactive Performance Assistant and more. See the recording and download this deck: https://senturus.com/resources/cognos-analytics-performance-tuning/
Senturus offers a full spectrum of services for business analytics. Our Knowledge Center has hundreds of free live and recorded webinars, blog posts, demos and unbiased product reviews available on our website at: https://senturus.com/resources/
Building Real World Applications using Windows Azure - Scott Guthrie, 2nd Dec..., by Vikas Sahni
This document discusses key development patterns and practices for building cloud applications, covering topics in two parts. Part 1 discusses automating everything, source control, continuous integration and delivery, web development best practices, enterprise identity integration, and data storage options. Part 2 covers data partitioning strategies, unstructured blob storage, designing to survive failures, monitoring and telemetry, transient fault handling, distributed caching, and queue-centric work patterns. The document emphasizes the importance of these patterns for building scalable and resilient cloud solutions.
This document discusses key development patterns and practices for building cloud applications, covering topics in two parts. Part 1 discusses automating everything, source control, continuous integration and delivery, web development best practices, identity integration, and data storage options. Part 2 covers data partitioning strategies, unstructured blob storage, designing to survive failures, monitoring and telemetry, transient fault handling, distributed caching, and queue-centric work patterns. The document emphasizes leveraging these cloud patterns to build scalable and resilient cloud solutions.
56K.Cloud is a future technologies company that focuses on developer enablement across public and private cloud, IoT, and machine learning. They provide training, workshops, and consulting services for application development on public cloud platforms, IoT deployments, and more. Their team includes experts in DevOps engineering, cloud architecture, containers, infrastructure automation, and embedded systems. For customers, 56K.Cloud follows an engagement process that includes discovery, definition, and implementation modules to define the scope of work and implement solutions in an agile manner.
This document provides an overview of Docker and cloud native training presented by Brian Christner of 56K.Cloud. It includes an agenda for Docker labs, common IT struggles Docker can address, and 56K.Cloud's consulting and training services. It discusses concepts like containers, microservices, DevOps, infrastructure as code, and cloud migration. It also includes sections on Docker architecture, networking, volumes, logging, and monitoring tools. Case studies and examples are provided to demonstrate how Docker delivers speed, agility, and cost savings for application development.
The ExpertsLive EU session covering how we are trying to push Web of Things (WOT) standards to devices so we can bring more web technologies and DevOps into the IoT world. In this presentation, we covered how Docker is enabling hardware development to become more rapid much like software development is now.
The document discusses serverless computing and OpenFaaS, an open source serverless framework. It introduces serverless functions and how OpenFaaS allows developers to easily write and deploy stateless functions. The document provides examples of how functions can be chained together and invoked asynchronously. It also shares several case studies of organizations using OpenFaaS and announces the launch of OpenFaaS Cloud.
This document provides an overview of Docker and Kubernetes by Brian Christner, Co-Founder of 56K.Cloud. It includes background on Brian and his expertise in containers, cloud, and engineering. It then discusses concepts like cloud containers, containerization, microservices, and DevOps as well as 56K.Cloud's services in areas like reference architectures and managed services. The rest of the document focuses on Docker and Kubernetes, providing brief histories and explaining what they are. It includes demos of Docker Desktop, Kubernetes in Docker Desktop, and Kubernetes. The document emphasizes how Docker and Kubernetes enable portability, agility, and control for applications in the cloud.
This document provides an overview and agenda for a Docker and cloud native training. It introduces Brian Christner as the trainer and his background. It then covers various cloud native topics that will be discussed including containers, microservices, DevOps, and orchestration. The remainder of the document demonstrates Docker concepts hands-on and discusses container architecture, portability, and monitoring. It also briefly explores future directions like serverless and concludes by providing additional Docker resources.
This presentation was presented to the Fachhochschule Bern. The course was part of the Master program and we covered the topics of Cloud Native & Docker
This presentation covers the Cloud Native standards for building applications and reinforces how important Docker & Containers are when building Cloud Native Applications/Architecture. Next, we cover how to use Docker to build a Serverless infrastructure.
A technical deep dive into Docker, Docker's benefits, the difference between VMs and containers, DevOps & Docker, and the future of Docker with Serverless.
Docker - Build, Ship and Run Any App, Anywhere Hollywood edition, by Brian Christner
Since I presented this inside a movie theater, I reference a few movies. This talk covers the why of Docker from a business-case perspective, the VM vs. containers argument, and Swisscom use cases with Docker.
How to monitor a Docker Swarm with Prometheus, Google cAdvisor & Node Exporter while sending alerts to Slack. This provides background on monitoring, some best practices and the landscape of containers at the moment.
An overview of Docker's new 1.12 release. This slide deck covers the new features as well as a demo, including commands to start up your own Docker Swarm on either Docker for Mac or Docker for Windows and deploy a Docker Service.
The presentation was delivered via a container running RevealJS, which is where some of the formatting issues in SlideShare come from.
Do you know the performance of your containers or Docker Hosts? I will show you how to get up and running quickly with 2 different Open Source Docker Monitoring solutions. We will quickly cover Docker Stats as the basis and discover how Google cAdvisor gathers metrics for our 2 solutions. We will then build upon this basis to build a Docker Monitoring solution with cAdvisor+InfluxDB+Grafana and then cAdvisor+Prometheus and create dashboards based on the gathered monitoring metrics with Grafana and Prometheus.
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf, by Abi john
Analyze the growth of meme coins from mere online jokes to potential assets in the digital economy. Explore the community, culture, and utility as they elevate themselves to a new era in cryptocurrency.
This is the keynote of the Into the Box conference, highlighting the release of the BoxLang JVM language, its key enhancements, and its vision for the future.
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien..., by Noah Loul
Artificial intelligence is changing how businesses operate. Companies are using AI agents to automate tasks, reduce time spent on repetitive work, and focus more on high-value activities. Noah Loul, an AI strategist and entrepreneur, has helped dozens of companies streamline their operations using smart automation. He believes AI agents aren't just tools—they're workers that take on repeatable tasks so your human team can focus on what matters. If you want to reduce time waste and increase output, AI agents are the next move.
How Can I use the AI Hype in my Business Context?, by Daniel Lehner
Is AI just hype? Or is it the game changer your business needs?
Everyone’s talking about AI but is anyone really using it to create real value?
Most companies want to leverage AI. Few know how.
✅ What exactly should you ask to find real AI opportunities?
✅ Which AI techniques actually fit your business?
✅ Is your data even ready for AI?
If you’re not sure, you’re not alone. This is a condensed version of the slides I presented at a LinkedIn webinar for Tecnovy on 28.04.2025.
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive, by ScyllaDB
Want to learn practical tips for designing systems that can scale efficiently without compromising speed?
Join us for a workshop where we’ll address these challenges head-on and explore how to architect low-latency systems using Rust. During this free interactive workshop oriented for developers, engineers, and architects, we’ll cover how Rust’s unique language features and the Tokio async runtime enable high-performance application development.
As you explore key principles of designing low-latency systems with Rust, you will learn how to:
- Create and compile a real-world app with Rust
- Connect the application to ScyllaDB (NoSQL data store)
- Negotiate tradeoffs related to data modeling and querying
- Manage and monitor the database for consistently low latencies
Quantum Computing Quick Research Guide by Arthur Morgan, by Arthur Morgan
This is a Quick Research Guide (QRG).
QRGs include the following:
- A brief, high-level overview of the QRG topic.
- A milestone timeline for the QRG topic.
- Links to various free online resource materials to provide a deeper dive into the QRG topic.
- Conclusion and a recommendation for at least two books available in the SJPL system on the QRG topic.
QRGs planned for the series:
- Artificial Intelligence QRG
- Quantum Computing QRG
- Big Data Analytics QRG
- Spacecraft Guidance, Navigation & Control QRG (coming 2026)
- UK Home Computing & The Birth of ARM QRG (coming 2027)
Any questions or comments?
- Please contact Arthur Morgan at [email protected].
100% human made.
What is Model Context Protocol(MCP) - The new technology for communication bw..., by Vishnu Singh Chundawat
The MCP (Model Context Protocol) is a framework designed to manage context and interaction within complex systems. This SlideShare presentation will provide a detailed overview of the MCP Model, its applications, and how it plays a crucial role in improving communication and decision-making in distributed systems. We will explore the key concepts behind the protocol, including the importance of context, data management, and how this model enhances system adaptability and responsiveness. Ideal for software developers, system architects, and IT professionals, this presentation will offer valuable insights into how the MCP Model can streamline workflows, improve efficiency, and create more intuitive systems for a wide range of use cases.
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API, by UiPathCommunity
Join this UiPath Community Berlin meetup to explore the Orchestrator API, Swagger interface, and the Test Manager API. Learn how to leverage these tools to streamline automation, enhance testing, and integrate more efficiently with UiPath. Perfect for developers, testers, and automation enthusiasts!
📕 Agenda
Welcome & Introductions
Orchestrator API Overview
Exploring the Swagger Interface
Test Manager API Highlights
Streamlining Automation & Testing with APIs (Demo)
Q&A and Open Discussion
👉 Join our UiPath Community Berlin chapter: https://community.uipath.com/berlin/
This session streamed live on April 29, 2025, 18:00 CET.
Check out all our upcoming UiPath Community sessions at https://community.uipath.com/events/.
Artificial Intelligence is providing benefits in many areas of work within the heritage sector, from image analysis, to ideas generation, and new research tools. However, it is more critical than ever for people, with analogue intelligence, to ensure the integrity and ethical use of AI. Including real people can improve the use of AI by identifying potential biases, cross-checking results, refining workflows, and providing contextual relevance to AI-driven results.
News about the impact of AI often paints a rosy picture. In practice, there are many potential pitfalls. This presentation discusses these issues and looks at the role of analogue intelligence and analogue interfaces in providing the best results to our audiences. How do we deal with factually incorrect results? How do we get content generated that better reflects the diversity of our communities? What roles are there for physical, in-person experiences in the digital world?
TrsLabs - Fintech Product & Business Consulting, by Trs Labs
Hybrid Growth Mandate Model with TrsLabs
Strategic investments, inorganic growth, and business model pivoting are critical activities that businesses don't undertake every day. In cases like this, it may benefit your business to choose a temporary external consultant.
An unbiased plan, driven by clear-cut deliverables and market dynamics and free from the influence of your internal office equations, empowers business leaders to make the right choices.
Getting things done within a budget and within a timeframe is key to growing a business, no matter whether you are a start-up or a big company.
Talk to us & Unlock the competitive advantage
Book industry standards are evolving rapidly. In the first part of this session, we’ll share an overview of key developments from 2024 and the early months of 2025. Then, BookNet’s resident standards expert, Tom Richardson, and CEO, Lauren Stewart, have a forward-looking conversation about what’s next.
Link to recording, presentation slides, and accompanying resource: https://bnctechforum.ca/sessions/standardsgoals-for-2025-standards-certification-roundup/
Presented by BookNet Canada on May 6, 2025 with support from the Department of Canadian Heritage.
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf, by Software Company
Explore the benefits and features of advanced logistics management software for businesses in Riyadh. This guide delves into the latest technologies, from real-time tracking and route optimization to warehouse management and inventory control, helping businesses streamline their logistics operations and reduce costs. Learn how implementing the right software solution can enhance efficiency, improve customer satisfaction, and provide a competitive edge in the growing logistics sector of Riyadh.
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker..., by TrustArc
Most consumers believe they’re making informed decisions about their personal data—adjusting privacy settings, blocking trackers, and opting out where they can. However, our new research reveals that while awareness is high, taking meaningful action is still lacking. On the corporate side, many organizations report strong policies for managing third-party data and consumer consent yet fall short when it comes to consistency, accountability and transparency.
This session will explore the research findings from TrustArc’s Privacy Pulse Survey, examining consumer attitudes toward personal data collection and practical suggestions for corporate practices around purchasing third-party data.
Attendees will learn:
- Consumer awareness around data brokers and what consumers are doing to limit data collection
- How businesses assess third-party vendors and their consent management operations
- Where business preparedness needs improvement
- What these trends mean for the future of privacy governance and public trust
This discussion is essential for privacy, risk, and compliance professionals who want to ground their strategies in current data and prepare for what’s next in the privacy landscape.
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025, by BookNet Canada
Book industry standards are evolving rapidly. In the first part of this session, we’ll share an overview of key developments from 2024 and the early months of 2025. Then, BookNet’s resident standards expert, Tom Richardson, and CEO, Lauren Stewart, have a forward-looking conversation about what’s next.
Link to recording, transcript, and accompanying resource: https://bnctechforum.ca/sessions/standardsgoals-for-2025-standards-certification-roundup/
Presented by BookNet Canada on May 6, 2025 with support from the Department of Canadian Heritage.
11. Website Down?
• Total downtime: just under 4 minutes
• 502 error messages total: 12,000
• People affected by the 502 error who did not get their bargain: 400
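To put the outage figures in perspective, here is a minimal arithmetic sketch. It assumes the downtime was a full 4 minutes (the slide says "just under") and measures availability over a 30-day month, which is an assumption rather than something the slide states. The function name `outage_summary` is purely illustrative.

```python
def outage_summary(downtime_s, error_count, period_s):
    """Average error rate during an outage and availability over a period."""
    return {
        # Errors served per second while the site was down.
        "errors_per_s": error_count / downtime_s,
        # Share of the period during which the site was up, in percent.
        "availability_pct": 100 * (1 - downtime_s / period_s),
    }


# Slide figures: ~4 minutes of downtime, 12,000 HTTP 502 responses.
# Assumption: availability is measured over a 30-day month.
summary = outage_summary(
    downtime_s=4 * 60,
    error_count=12_000,
    period_s=30 * 24 * 3600,
)
print(summary)
```

Under these assumptions the site served about 50 errors per second during the outage, while monthly availability still rounds to roughly 99.99%, which illustrates why availability alone can hide a painful incident.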
14. Users Care About 3 Things
● Availability - Is my system online (yes/no)?
● Latency - Does it take a long time to access application x, y, z?
● Reliability - Can the user rely on using the application?
15. Brain Based Tools
• We can track 8 objects on average
• 4 Moving Objects
• Build Dashboards & Tools accordingly
17. SRE (Site Reliability Engineering)
SRE treats operations as if it were a software problem.
“Hope is not a strategy.” (traditional SRE saying)
www.google.com/sre
19. R.E.D (Microservice level)
(Request) Rate: the number of requests per second your services are serving.
(Request) Errors: the number of failed requests per second.
(Request) Duration: the distribution of the amount of time each request takes.
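As a rough illustration of the R.E.D signals, the sketch below derives rate, errors, and duration percentiles from a list of request records for one time window. The `Request` record, the `red_summary` and `percentile` helpers, and the choice to count HTTP 5xx responses as failures are all assumptions made for this example, not something taken from the slides.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # time spent serving the request, in seconds
    status: int        # HTTP status code returned


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list; None if it is empty."""
    if not sorted_values:
        return None
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def red_summary(requests, window_s):
    """Rate, Errors and Duration for the requests seen in one window."""
    durations = sorted(r.duration_s for r in requests)
    # Assumption: a request counts as failed when the status is 5xx.
    failed = sum(1 for r in requests if r.status >= 500)
    return {
        "rate_per_s": len(requests) / window_s,
        "errors_per_s": failed / window_s,
        "p50_s": percentile(durations, 50),
        "p95_s": percentile(durations, 95),
    }


# Ten requests in a 5-second window, two of them answered with 502.
sample = [Request(i / 10, 502 if i > 8 else 200) for i in range(1, 11)]
print(red_summary(sample, window_s=5))
```

Reporting percentiles rather than a single average keeps slow outliers visible, which is the point of tracking duration as a distribution.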
20. U.S.E (Low Level / Infrastructure)
For every resource, check Utilization, Saturation, and Errors
Resource: all physical server functional components (CPUs, disks, busses, ...)
Utilization: the average time that the resource was busy servicing work
Saturation: the degree to which the resource has extra work which it can't service, often queued
Errors: the count of error events
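The U.S.E check can be sketched for a single resource as follows. This is a minimal example that assumes the resource is sampled periodically. The `Sample` fields and the `use_summary` helper are hypothetical names, and using the average queue length as the saturation figure is one possible choice among several.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    busy: bool         # was the resource servicing work when sampled?
    queue_length: int  # work items waiting that it could not service yet
    errors: int        # error events since the previous sample


def use_summary(samples):
    """Utilization, Saturation and Errors for one resource."""
    n = len(samples)
    if n == 0:
        return {"utilization": None, "saturation": None, "errors": 0}
    return {
        # Fraction of samples in which the resource was busy.
        "utilization": sum(1 for s in samples if s.busy) / n,
        # Assumption: saturation is reported as the average queue length.
        "saturation": sum(s.queue_length for s in samples) / n,
        # Plain count of error events over the whole window.
        "errors": sum(s.errors for s in samples),
    }


disk = [Sample(True, 0, 0), Sample(True, 2, 1), Sample(False, 0, 0), Sample(True, 2, 0)]
print(use_summary(disk))
```

Running the same function over every resource (CPU, disk, network interface) gives the per-resource checklist the slide describes.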
21. Black box vs. White box Monitoring
Black Box Monitoring (External App Metrics): HTTP, Ping, etc.
White Box Monitoring (Internal App Metrics): app metrics, requests, responses, process times
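The difference between the two views can be sketched with a tiny local HTTP service. The black-box probe only sees what an outside client sees (status and latency), while the white-box counter lives inside the application. Everything here (`Handler`, `internal_metrics`, `black_box_probe`, `demo`) is a hypothetical example built on the Python standard library, not the tooling used in the workshop.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# White box: a counter kept inside the application itself.
internal_metrics = {"requests_total": 0}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        internal_metrics["requests_total"] += 1
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default access log on stderr


def black_box_probe(url, timeout=2.0):
    """Probe from the outside: only status and latency are visible."""
    # Ignore proxy environment variables so the local demo stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    try:
        with opener.open(url, timeout=timeout) as resp:
            status = resp.status
    except OSError:  # connection refused, timeout, HTTP error, ...
        status = None
    latency = time.monotonic() - start
    return {"up": status == 200, "status": status, "latency_s": latency}


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        return black_box_probe(url), dict(internal_metrics)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

In practice the two views complement each other: the probe tells you whether users can reach the service, and the internal counters help explain why when they cannot.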
50. Agenda
- [ X ] Introduction
- [ X ] Operations Overview
- [ X ] Logging Workshop
- [ X ] Monitoring Workshop
- [ ] Best Practices & Recap
51. Best Practices
• Start small & increment
• Don’t over-alert yourself
• Set Resource Limits
• Aim for actionable Information
• Run separate from Workload
• Test for Failures
• Know your Failure Models