The document discusses migrating a Novell GroupWise environment from NetWare to Linux. It provides an overview of both manual and wizard-guided migration approaches. The manual process involves installing agents, copying data, and configuring the agents on the Linux server. The wizard aims to simplify the process but provides less control. Key steps covered include pre-migration planning, post office and domain migration, and post-migration configuration changes.
- The document describes issues that can cause the Xen hypervisor tool "xenwatch" to stall when destroying or creating virtual machines (domUs).
- One cause is leftover "inflight packets" in the network backend driver that prevent xenwatch threads from stopping. Resetting the network interface can help.
- Other potential causes involve idle block tags being unavailable or persistent grant pages remaining mapped due to storage or filesystem issues.
- The proposed solution is to run a dedicated xenwatch kernel thread per domU to avoid locking issues and allow independent processing of events.
The document discusses various strategies for backing up and recovering PostgreSQL databases. It begins by introducing the speaker and their background. It then covers the objectives of business continuity planning. The main types of backups discussed are logical backups using pg_dump and physical/file-system level backups. Advantages and disadvantages of each approach are provided. Validation of backups through continuous testing of restores is emphasized. Automating the backup process through scripting and configuration management tools is presented as a best practice. Specific tools discussed include pg_basebackup, ZFS snapshots for file-system level backups, Barman and WAL-E for third party managed backups, and examples of custom in-house solutions.
This document discusses PostgreSQL backups and disaster recovery. It covers the need for different types of backups like logical and physical backups. It discusses how to store backups and automate the backup process. The document also covers how to validate backups are working properly and tools that can be used. It emphasizes that both logical and physical backups are important to have for different recovery scenarios. Automation is recommended to manage the complex backup processes.
If you’re using SAN in your Power Systems environment without taking advantage of FlashCopy, we have one question for you: Why not?
FlashCopy takes a quick snapshot of your data at a particular point in time, then POOF! Your data is available for backup or use on another partition for high availability, disaster recovery, or even to create a test environment for your developers.
Join IBM i expert Chuck Stupca, IBM emeritus, as he explains how FlashCopy works and how best to take advantage of its unique features. We’ll also discuss ways that it helps you build a better backup strategy for your IBM i environment:
• Making a backup copy of production for tape-based saves
• Providing test environments from your production data in seconds
• Comparing FlashCopy to a save-while-active backup
This document provides an overview of scale-out NAS systems and file systems. It discusses the differences between parallel storage and scale-out NAS, and describes IBM's GPFS and NFS v4.1. It also covers emerging distributed erasure coding technologies and how they are replacing traditional RAID. The document analyzes EMC's Isilon S-series and X-series scale-out NAS clusters, which use erasure coding, and NetApp's FAS 3200 series, which uses a NAS head architecture. It concludes with a comparison of data protection features in EMC's OneFS and NetApp's Data ONTAP operating systems.
The document discusses database cloning challenges and solutions for thin provision cloning using various technologies like Oracle CloneDB, EMC BCV, SRDF, VMware snapshots, ZFS, and NetApp FlexClone. It also covers database virtualization solutions like Oracle SMU and Delphix that provide self-service virtual databases to developers by sharing blocks between clones for faster provisioning and development cycles. Case studies described how virtualization can accelerate development by providing frequent fresh clones of the source database.
Database virtualization technologies allow for cloning database instances while sharing data. This avoids consuming large amounts of storage for full copies. Technologies like CloneDB, Oracle ZFS Storage Appliance, Delphix, and Data Director create clone instances that only store changed data, sharing read-only data from snapshots. They provide benefits like faster provisioning of clones, reduced storage usage, and easier testing and development.
XPDDS18: Design and Implementation of Automotive Virtualization Based on Xen... – The Linux Foundation
This talk presents a production-ready automotive virtualization solution with Xen. The key requirements we focus on are super-fast startup and recovery from failure, static virtual machine creation with dedicated resources, and performance-effective graphics rendering. To reduce boot time, we optimize the Xen startup procedure by effectively initializing the Xen heap and VM memory and by booting multiple VMs concurrently. We provide a fast recovery mechanism by re-implementing the VM reset feature. We also develop a highly optimized graphics API-forwarding mechanism supporting OpenGL ES APIs up to v3.2. The pass rate of the Khronos CTS in a guest OS is comparable to Domain0's. Our experiments show that our virtualization solution provides reasonable performance for ARM-based automotive systems (hypervisor booting: less than 70 ms; graphics performance: about 96% of Domain0).
XPDDS18: Real Time in XEN on ARM - Andrii Anisov, EPAM Systems Inc. – The Linux Foundation
Currently, several initiatives promote the XEN hypervisor in the automotive area as the basis of complex virtualized systems. To support those initiatives and enter the automotive world, XEN should meet at least two requirements: it should be appropriately certified and be able to host a security domain. Leaving the certification topic aside, here we focus on XEN's security-domain hosting capability, particularly on keeping RT guarantees for that specific domain.
This talk presents an investigation into the XEN hypervisor's applicability to building a multi-OS system with real-time guarantees kept for one of the hosted OSes.
During this presentation, the following topics are outlined:
- experimental setup
- experimental use-cases and their motivation
- received results and discovered issues
- solutions and mitigation measures for discovered issues
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne... – VMworld
This document provides an overview and best practices for storage technologies. It discusses factors that affect storage performance like interconnect bandwidth versus IOPS and command sizing. It covers tiering strategies and when auto-tiering may not be effective. It also discusses SSDs versus spinning disks, large VMDK and VMFS support, thin provisioning at the VM and LUN level, and architecting storage for failure including individual component failure, temporary and permanent site loss. It provides examples of how to implement a low-cost disaster recovery site using inexpensive hardware.
The document discusses tuning MySQL server settings for performance. Some key points covered include:
- Settings are workload-specific and depend on factors like storage engine, OS, hardware. Tuning involves getting a few settings right rather than maximizing all settings.
- Monitoring tools like SHOW STATUS, SHOW INNODB STATUS, and OS tools can help evaluate performance and identify tuning opportunities.
- Memory allocation and settings like innodb_buffer_pool_size, key_buffer_size, query_cache_size are important to configure based on the workload and available memory.
PostgreSQL Portland Performance Practice Project - Database Test 2 Howto – Mark Wong
Fourth presentation in a speaker series sponsored by the Portland State University Computer Science Department. The series covers PostgreSQL performance with an OLTP (on-line transaction processing) workload called Database Test 2 (DBT-2). This presentation is a set of examples to go along with the live presentation given on March 12, 2009.
Slides for JavaOne 2015 talk by Brendan Gregg, Netflix (video/audio, of some sort, hopefully pending: follow @brendangregg on twitter for updates). Description: "At Netflix we dreamed of one visualization to show all CPU consumers: Java methods, GC, JVM internals, system libraries, and the kernel. With the help of Oracle this is now possible on x86 systems using system profilers (eg, Linux perf_events) and the new JDK option -XX:+PreserveFramePointer. This lets us create Java mixed-mode CPU flame graphs, exposing all CPU consumers. We can also use system profilers to analyze memory page faults, TCP events, storage I/O, and scheduler events, also with Java method context. This talk describes the background for this work, instructions generating Java mixed-mode flame graphs, and examples from our use at Netflix where Java on x86 is the primary platform for the Netflix cloud."
This document discusses Intel's high performance storage solution using Lustre file systems. It provides an overview of Lustre, how it can interface with various Intel technologies like SSDs, networking fabrics and processors. It also summarizes the key features of Lustre including its scalability, POSIX compatibility, shared namespace and how all clients can access data. Specific examples are given around using OpenZFS as the backend storage for Lustre and how technologies like L2ARC can improve performance. Monitoring and management tools for Lustre file systems are also highlighted.
Pg_prefaulter is a tool that helps eliminate replication lag and reduce startup times. It works by prefaulting WAL files on the follower nodes before the regular replication process applies the WAL. This is done by parsing the WAL files on the primary using pg_xlogdump to determine which database relations (tables, indexes) need to be prefaulted. Pg_prefaulter then issues prefetch system calls in parallel to warm the OS caches and disk buffers for those relations, improving performance of the downstream replication and recovery processes.
1. The document provides procedures for backing up a production server, including cloning the production server to create a backup server and daily and monthly backup procedures.
2. Key steps in cloning the production server include shutting down databases and applications, copying files to the backup server, and running scripts to configure the database and applications.
3. Daily backups involve making copies of updated database files from the production server to the backup server and reconfiguring the database. Monthly backups also update applications if any patches were applied.
The document outlines best practices for routinely backing up a production server to ensure data is not lost by maintaining a cloned backup server and regularly updating databases and applications.
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, Arm – The Linux Foundation
The Arm architecture allows for a wide variety of cache configurations, levels and features. This enables building systems that will optimally fit power/area budgets set for the target application.
A consequence of this is that architecturally compliant software has to cater for a much wider range of behaviors than on other architectures. While most software uses cache instructions that don't need special treatment in a virtualized environment, some will want to directly manage a given cache using set/way instructions and will introduce challenges for the hypervisor to handle them.
This talk will give an overview of how caches behave in the Arm architecture, especially in the context of virtualization. It will then describe the problem of using set/way instructions in a virtualized environment. We will also discuss the modifications required in Xen to handle those instructions.
Kernel Recipes 2017: Using Linux perf at Netflix – Brendan Gregg
This document discusses using the Linux perf profiling tool at Netflix. It begins with an overview of why Netflix needs Linux profiling to understand CPU usage quickly and completely. It then provides an introduction to the perf tool, covering its basic workflow and commands. The document discusses profiling CPU usage with perf, including potential issues like JIT runtimes and missing symbols. It provides several examples of perf commands for listing, counting, and recording events. The overall summary is that perf allows Netflix to quickly and accurately profile CPU usage across the entire software stack, from applications to libraries to the kernel, to optimize performance.
It is no accident that Xen software powers some of the largest Clouds in existence. From its outset, the Xen Project was intended to enable what we now call Cloud Computing. This session will explore how the Xen Architecture addresses the needs of the Cloud in ways which facilitate security, throughput, and agility. It will also cover some of the hot new developments of the Xen Project.
XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins... – The Linux Foundation
While virtualization technologies like Xen have been around for a long time, it is only in recent years that they have started to be targeted as viable systems for implementing middlebox processing (e.g., firewalls, NATs). But can they provide this functionality while yielding the high performance expected from hardware-based middlebox offerings? In this talk Joao Martins will introduce ClickOS, a tiny, MiniOS-based virtual machine tailored for network processing. In addition to the VM itself, Joao Martins will describe performance improvements done to the entire Xen I/O pipe. Finally, Joao Martins will discuss an evaluation showing that ClickOS can be instantiated in 30 msecs, can process traffic at 10Gb/s for almost all packet sizes, introduces delay of 40 microseconds and can run middleboxes at rates of 5 Mp/s.
[CCP Games] Versioning Everything with Perforce – Perforce
How far can you take versioning everything in and around your product? How far should you take it? This man says that you should take it all the way—and he has hard-won lessons from the MMO game development and operation of EVE Online and Dust 514 to share with you.
Linux Performance Analysis: New Tools and Old Secrets – Brendan Gregg
Talk for USENIX/LISA2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many high to low level tools to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit, and are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these are solving issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
Improving Your Heroku App Performance with Asset CDN and Unicorn – Simon Bagreev
This document summarizes tips for optimizing the performance of Rails applications using asset CDNs and the Unicorn web server. It discusses using Amazon S3 and CloudFront for caching and delivering assets to improve load times. It also explains how to configure Unicorn to handle requests concurrently across worker processes to better utilize dyno resources on Heroku. Benchmark tests show these approaches reduced load times and increased the number of concurrent requests applications can handle.
This talk discusses Linux profiling using perf_events (also called "perf") based on Netflix's use of it. It covers how to use perf to get CPU profiling working and overcome common issues. The speaker will give a tour of perf_events features and show how Netflix uses it to analyze performance across their massive Amazon EC2 Linux cloud. They rely on tools like perf for customer satisfaction, cost optimization, and developing open source tools like NetflixOSS. Key aspects covered include why profiling is needed, a crash course on perf, CPU profiling workflows, and common "gotchas" to address like missing stacks, symbols, or profiling certain languages and events.
The 4.5 release is not a minor "point" update: it is one of the most feature-rich releases in the project's history. It contains several important additions. Most notably, the new Xen PVH virtualization mode now supports running as dom0, alongside enhanced support for Remus, significant ARM architecture updates, security improvements, real-time scheduling, support for Intel Cache Monitoring Technology (CMT), and improvements for automotive and embedded use-cases. Other enhancements include additional support for FreeBSD, systemd support, additional libvirt support, the release of Mirage OS 2.0, and more.
Besides giving an overview of Xen 4.5, we will explain the project's roadmap process and share what's ahead for 2015: such as improved OpenStack integration and hotpatching (applying security fixes without the need to reboot).
Systems Performance: Enterprise and the Cloud – Brendan Gregg
My talk for BayLISA, Oct 2013, launching the Systems Performance book. Operating system performance analysis and tuning leads to a better end-user experience and lower costs, especially for cloud computing environments that pay by the operating system instance. This book covers concepts, strategy, tools and tuning for Unix operating systems, with a focus on Linux- and Solaris-based systems. The book covers the latest tools and techniques, including static and dynamic tracing, to get the most out of your systems.
The document discusses high performance infrastructure for Server Density which includes 150 servers that have been running since June 2009 and migrated from MySQL to MongoDB. It stores 25TB of data per month. Key aspects of performance discussed are using fast networks like 10 Gigabit Ethernet on AWS, ensuring high memory, using SSDs over spinning disks for performance, and factors like replication lag based on location. The document also compares options like using cloud, dedicated servers, or colocation and discusses monitoring, backups, dealing with outages, and other operational aspects.
• We are sleeping well when our mobile starts ringing and ringing. The message: DISASTER! In this session (slides only) we do NOT talk about potential disasters (such as BCM); we talk about: what now? A new version of my old, well-known session, updated for all the changes that have happened in the DBA world over the last two to three years.
• So, from the ground to the sky and beyond: everything for surviving a disaster. Which tasks should have been finished BEFORE? Does it matter whether SQL is virtual or physical? We talk about systems, databases, people, encryption, passwords, certificates and users.
• In this session (with a few demos) I'll show which parts of our SQL Server environment are critical and how to be prepared for disaster. In some documents I'll show you how to be BEST prepared.
A session from a series of conferences during Data Relay (formerly SQL Relay) 2018 in Newcastle, Leeds, Birmingham, Reading and Bristol. It contains only the slides from the talk (no videos included).
The document discusses using data virtualization to address the constraint of data in DevOps workflows. It describes how traditional database cloning methods are inefficient and consume significant resources. The solution presented uses thin cloning technology to take snapshots of production databases and provide virtual copies for development, QA, and other environments. This allows for unlimited, self-service virtual databases that reduce bottlenecks and waiting times compared to physical copies.
The document discusses best practices for preparing for and surviving a disaster involving IT systems. It emphasizes the importance of being prepared through thorough backup and recovery procedures. Key aspects of preparation include having documented procedures for backup and restore of SQL and SharePoint environments, understanding roles and responsibilities, maintaining service level agreements, keeping an encrypted envelope of credentials, and ensuring necessary hardware, software, and support contracts are accounted for. The overall message is that with proper planning through documented policies and procedures, the impact of a disaster can be minimized.
The document discusses best practices for preventing and recovering from disasters affecting IT systems. It emphasizes the importance of being prepared through regular backups, testing restores, clear documentation of backup and restore procedures, and defined roles and responsibilities. Key recommendations include performing backups to separate storage regularly; testing restores from backups; having a disaster recovery plan, procedures, and environment ready; and ensuring appropriate staff are assigned roles to respond to an outage. The overall message is that the best way to survive a disaster is through preparation, including backups, documentation, training and assigning roles.
The document discusses using capacity planning and performance analysis to improve system performance. It describes two case studies: capacity planning for an Oracle RAC database and performance analysis of a SQL Server application on HP blades. For the Oracle case, different platform options were evaluated and optimized configurations identified. For SQL Server, enabling AWE resolved soft paging and reduced response times by improving memory usage. The lessons highlight challenges in using performance tools and the need for better fault detection and data presentation.
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [CON3671] – Kyle Hailey
The document discusses analyzing I/O performance and summarizing lessons learned. It describes common tools used to measure I/O like moats.sh, strace, and ioh.sh. It also summarizes the top 10 anomalies encountered like caching effects, shared drives, connection limits, I/O request consolidation and fragmentation over NFS, and tiered storage migration. Solutions provided focus on avoiding caching, isolating workloads, proper sizing of NFS parameters, and direct I/O.
OOUG - Oracle Performance Tuning with AAS – Kyle Hailey
The document provides information about Kyle Hailey and his work in Oracle performance tuning. It includes details about his career history working with Oracle from 1990 to present, including roles in support, porting software versions, benchmarking, and performance tuning. Graphics and clear visualizations are emphasized as important tools for effectively communicating complex technical information and identifying problems. The goal is stated as simplifying database tuning information to empower database administrators.
Denver DevOps: Enabling DevOps with Data Virtualization – Kyle Hailey
This document discusses how data constraints can limit DevOps efforts and proposes a solution using virtual data and thin cloning. It notes that moving and copying production data is challenging due to storage, personnel and time requirements. This typically results in bottlenecks, long wait times for environments, code check-ins and production bugs. The solution presented is to use a data virtualization platform that can take thin clones of production data using file system snapshots, compress the data and share it across environments through a centralized cache. This allows self-service provisioning of database environments and accelerates DevOps processes.
This document provides information about Kyle Hailey's background and expertise in Oracle performance tuning. It discusses his experience working with Oracle since 1990 and his focus on simplifying performance information for DBAs. It promotes tools like OEM ASHMON/SASH and the DB Optimizer for interactively exploring performance data in a clear, understandable way.
Nimble Storage is developing flash-enabled storage solutions including an accelerator appliance and storage server. The accelerator increases storage cache by 100x and reduces latency by 25x compared to disk-only solutions. It is targeted at the $15 billion networked storage market. Nimble's technology utilizes a new flash-optimized file system and compression to provide compelling price/performance advantages over competitors. The company is led by experienced engineers from Data Domain, NetApp, and other storage firms.
DevOps, Databases and The Phoenix Project UGF4042 from OOW14 – Kyle Hailey
This document discusses using a virtual data platform to address data constraints in IT. It begins by explaining how data flooding infrastructure strains IT resources and costs companies huge sums. Most companies are unaware of these data costs. The solution presented is a virtual data appliance that can clone database environments from snapshots to provision dev/test environments quickly without consuming large amounts of physical storage. Key benefits outlined include unlimited, full-sized, self-service database environments for development and QA teams as well as fast rollback capabilities and A/B testing for QA.
KoprowskiT - SQLBITS X - 2am a disaster just began – Tobias Koprowski
A document outlines best practices for surviving a disaster involving SQL Server infrastructure. It recommends being well prepared with regular backups stored offsite, documented restore procedures, clear roles and responsibilities, and service level agreements defining acceptable downtimes. Key aspects of preparation include backups, restore testing, documentation, contact lists, hardware and software inventory, passwords, encryption keys, defined teams, and keeping management informed. The overall message is that with proper planning, a disaster can be survived by following the best practice of being prepared.
Open Source Data Backup, or: How to Sleep Better at Night (OSCON 2005)
1. Open Source Data Backup, or:
How To Sleep Better At Night
Fran Fabrizio
Senior Systems Administrator
Dept. of Computer and Information Sciences
University of Alabama at Birmingham
O’Reilly Open Source Convention, August 1-5, 2005
2. Open Source Data Backup (OSCON 2005), Slide 2
Talk Overview
• Introduction to Amanda
– What it is, how it thinks
• Amanda In Action
– Real world examples
• Configuring Amanda
– Quick look at the config files
• Introduction to Bacula
– What it is, how it thinks
• Bacula vs. Amanda
– Major differences
• For More Info
3. Open Source Data Backup (OSCON 2005), Slide 3
Introduction To Amanda
What is Amanda?
How does it work?
What is its philosophy?
4. Open Source Data Backup (OSCON 2005), Slide 4
What is Amanda?
Amanda is the Advanced Maryland Automatic Network Disk Archiver. It has the following features:
• Cross-Platform
• Scalable
• Automated
• Flexible
• Robust
5. Open Source Data Backup (OSCON 2005), Slide 5
What is Amanda? (Cont)
• A set of CLI utilities written in C
• Its own protocols on top of TCP and UDP
• Client and server components that partner to
stream and store data for backup and recovery
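As a concrete illustration of those protocols in practice (not from the slides), the classic client-side setup registers amandad with inetd on UDP port 10080; the user and paths here are assumptions that vary by install:

  # /etc/services entry for the Amanda client
  amanda  10080/udp
  # /etc/inetd.conf entry: inetd starts amandad on demand (path is an assumption)
  amanda  dgram  udp  wait  amanda  /usr/local/libexec/amandad  amandad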
6. Open Source Data Backup (OSCON 2005), Slide 6
Sample Topology
(diagram) An Amanda server, with a holding disk and a tape drive attached, runs backups for an OS X client, a Solaris client, and a Linux client, each running amandad, plus a Windows client reached through Samba.
7. Open Source Data Backup (OSCON 2005), Slide 7
Typical Sequence of Events
(diagram) On the Amanda server, amdump launches planner; planner requests an estimate from each client's amandad and collects the results. planner then passes the schedule to driver, which spawns the dumper(s). Each dumper requests a backup from the client's amandad, receives the image, and writes it to the holding disk or directly to taper; taper flushes the holding disk to tape. Many clients are serviced in parallel.
8. Open Source Data Backup (OSCON 2005), Slide 8
How Does It Work?
• gtar and dump
• Uses standard backup levels
• Manages tapes
• Balances resources
• Supports compression and encryption
• Degrades gracefully
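To make the gtar/dump choice and the compression support above concrete, here is a minimal sketch of an amanda.conf dumptype; the name comp-user-tar is illustrative, not from the slides:

  define dumptype comp-user-tar {
      program "GNUTAR"        # back up with gtar rather than dump
      compress client fast    # compress on the client before sending
      priority medium
  }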
9. Open Source Data Backup (OSCON 2005), Slide 9
Some Amanda Terminology
• Dump cycle
– How often do you want a full backup?
• Disklist Entry / DLE / Target
– Something (a partition or filesystem) you want to back up
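For example, each DLE is one line in the disklist file, naming a host, a target, and the dumptype that governs how it is backed up (hostnames are hypothetical; comp-user-tar is the sketch dumptype from the previous example):

  # disklist: one DLE per line
  florida.example.edu   /etc    comp-user-tar
  alabama.example.edu   /home   comp-user-tar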
10. Open Source Data Backup (OSCON 2005), Slide 10
Amanda's Philosophy
• “You tell me how often you want a full backup,
and I'll worry about everything else”
• At least one full backup of each DLE per cycle
• You don't get to say when full backups happen
• Sounds scary, but usually sufficient
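In amanda.conf, that philosophy reduces to a few scheduling directives; a minimal sketch with illustrative values:

  dumpcycle 7 days      # at least one full backup of each DLE every 7 days
  runspercycle 7        # number of amdump runs in each dump cycle
  tapecycle 15 tapes    # tapes in rotation before Amanda overwrites one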
11. Open Source Data Backup (OSCON 2005), Slide 11
Real World Examples – CIS @ UAB
• Before....
(And this isn't even the whole dirty truth)
12. Open Source Data Backup (OSCON 2005), Slide 12
Real World Examples – CIS @ UAB
• So we bought...
Great! Only
one problem...
13. Open Source Data Backup (OSCON 2005), Slide 13
Real World Examples – CIS @ UAB
• This hardware is very expensive
• I work for a public university...
• ....in Alabama ;-)
• We had no more money
• Amanda saved the day
• Usage at CIS
– 43 filesystems/partitions on 14 Linux, Solaris and
Windows clients
– ~ 60GB of data per night
– Fully automated, requiring about 2 hours of attention so
far this year
14. Open Source Data Backup (OSCON 2005), Slide 14
Real World Examples - Others
• Other Examples
– Much Larger
• One user wrote about a 700GB nightly dump
– Much Smaller
• single-client systems
– Vtape setups
• virtual tapes on disk
• periodically burn to DVD
15. Open Source Data Backup (OSCON 2005), Slide 15
Typical Amanda Daily Operation
• At 2:30pm, the amcheck utility runs via cron job and
informs me via email if there are any problems
• At 2:00am, the amdump utility runs to kick off the
Amanda backup process (a sample crontab follows)
• Sometime in the morning, amdump sends an email
summarizing last night's activity.
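A minimal sketch of the corresponding crontab for the amanda user. The
install path is an assumption (Amanda's utilities often land in
/usr/local/sbin); -m tells amcheck to mail its findings:
# amanda user's crontab (paths illustrative)
30 14 * * * /usr/local/sbin/amcheck -m CIS
0 2 * * * /usr/local/sbin/amdump CIS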
16. Open Source Data Backup (OSCON 2005), Slide 16
Sample amdump Email Output
These dumps were to tape CIS-004.
The next tape Amanda expects to use is: CIS-005.
STATISTICS:
Total Full Daily
-------- -------- --------
Estimate Time (hrs:min) 1:00
Run Time (hrs:min) 2:57
Dump Time (hrs:min) 3:24 2:02 1:22
Output Size (meg) 20957.1 17678.4 3278.8
Original Size (meg) 38473.8 31408.7 7065.1
Avg Compressed Size (%) 54.5 56.3 46.4 (level:#disks ...)
Filesystems Dumped 41 11 30 (1:28 3:1 4:1)
Avg Dump Rate (k/s) 1753.8 2474.3 682.5
Tape Time (hrs:min) 0:21 0:15 0:05
Tape Size (meg) 20957.2 17678.4 3278.8
Tape Used (%) 4.3 3.6 0.7 (level:#disks ...)
Filesystems Taped 41 11 30 (1:28 3:1 4:1)
Avg Tp Write Rate (k/s) 17361.9 19931.6 10242.2
USAGE BY TAPE:
Label Time Size % Nb
CIS-004 0:21 20957.2 4.3 41
17. Open Source Data Backup (OSCON 2005), Slide 17
Sample amdump Email Output
(Continued)
NOTES:
planner: Full dump of virginia:/hc promoted from 4 days ahead.
planner: Full dump of florida:/var promoted from 4 days ahead.
planner: Full dump of alabama:/etc promoted from 4 days ahead.
planner: Full dump of florida:/etc promoted from 1 day ahead.
planner: Full dump of alabama:/home promoted from 1 day ahead.
planner: Full dump of georgia:/home promoted from 4 days ahead.
planner: Full dump of illinois:/home promoted from 4 days ahead.
planner: Full dump of newyork:/home promoted from 4 days ahead.
planner: Full dump of newjersey:/ promoted from 4 days ahead.
taper: tape CIS-004 kb 21462176 fm 41 [OK]
DUMP SUMMARY:
DUMPER STATS TAPER STATS
HOSTNAME DISK L ORIG-KB OUT-KB COMP% MMM:SS KB/s MMM:SS KB/s
-------------------------- --------------------------------- ------------
missouri.hpcl.c -xport/home 1 24671 13205 53.5 3:42 59.6 0:01 8848.1
missouri.hpcl.c /export/opt 1 1055 49 4.6 0:44 1.1 0:01 42.8
missouri.hpcl.c /var/mail 1 1538303 828724 53.9 13:17 1040.4 1:29 9345.3
nevada.cis.ua / 4 2271760 881675 38.8 5:09 2850.4 1:08 13033.1
ohio.cis.u /home 1 1540 247 16.0 0:01 401.8 0:04 69.7
maine.cis. /home 1 82640 59078 71.5 0:14 4300.2 0:08 7671.2
florida.cis.u /etc 0 5200 1613 31.0 0:03 625.5 0:01 1186.9
florida.cis.u /he 1 8780 614 7.0 0:26 23.4 0:01 512.8
florida.cis.u /hf 1 16440 2066 12.6 2:02 16.9 0:02 1249.4
etc....
(brought to you by Amanda version 2.4.4p2)
18. Open Source Data Backup (OSCON 2005), Slide 18
Amanda's planner In Action
(Flattened decision flowchart, reconstructed.) Let's assume that last
night Amanda performed a level 1 backup of clientA's /home DLE:
1. planner asks clientA: “Please estimate level 0, level 1 and level 2
backups for /home.”
2. clientA returns the info.
3. planner asks: “Will promoting this to a level 0 (full) dump lead to
more balance over the dump cycle?”
– Yes: schedules a level 0 dump
– No: planner asks: “Will we save a significant amount of tape space
by going to a level 2 incremental instead of level 1?”
– Yes: schedules a level 2 dump
– No: schedules a level 1 dump
19. Open Source Data Backup (OSCON 2005), Slide 19
More about planner
• Guarantees one full dump per cycle
• If the tape is too small for the run's data, it will
delay some of the dumps in the least disruptive
way
• Looks at past dumps to determine optimal
balancing behavior
• Tries to stay as close to level 0 as possible to
reduce the need for multiple tapes during a restore
(see the amadmin example below)
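One way to watch this balancing in practice is amadmin's balance
subcommand, using the CIS config name from elsewhere in this talk (output
omitted here):
-bash-2.05b$ amadmin CIS balance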
20. Open Source Data Backup (OSCON 2005), Slide 20
Restoring One Or A Few Files
Problem: User 'bryant' requests that you restore his INBOX to its state on 2004-06-25.
Solution: Use the amrecover utility.
[root@alabama /tmp]# amrecover -C CIS -s amanda.cis.uab.edu -t amanda.cis.uab.edu -d /dev/nst0
AMRECOVER Version 2.4.4p1. Contacting server on amanda.cis.uab.edu ...
220 keep AMANDA index server (2.4.4p2) ready.
200 Access OK
Setting restore date to today (2004-06-28)
200 Working date set to 2004-06-28.
Scanning /dumps/amanda...
20040622: found Amanda directory.
200 Config set to CIS.
200 Dump host set to alabama.cis.uab.edu.
amrecover> setdisk /var/spool/mail
200 Disk set to /var/spool/mail.
amrecover> setdate 2004-06-25
200 Working date set to 2004-06-25.
21. Open Source Data Backup (OSCON 2005), Slide 21
Restoring One Or A Few Files (Cont.)
amrecover> ls
[...]
2004-06-25 brockhw
2004-06-25 brownta
2004-06-25 bryant
2004-06-25 byrdv
[...]
amrecover> add bryant
Added /bryant
amrecover> extract
Extracting files using tape drive /dev/nst0 on host amanda.cis.uab.edu.
The following tapes are needed: CIS-024
Restoring files into directory /tmp
Continue [?/Y/n]? Y
Extracting files using tape drive /dev/nst0 on host amanda.cis.uab.edu.
Load tape CIS-024 now
Continue [?/Y/n/s/t]? Y
./bryant
amrecover> quit
200 Good bye.
[root@alabama /tmp]# ls -l bryant
-rw------- 1 bryant disk 14533946 Jun 24 19:10 bryant
22. Open Source Data Backup (OSCON 2005), Slide 22
Restoring An Entire Backup Target
Problem: The disk holding /etc on a critical server has failed
Solution: Use the amrestore utility.
[root@amanda testrestore]# amadmin CIS info alabama.cis.uab.edu '/etc$'
Current info for alabama.cis.uab.edu /etc:
Stats: dump rates (kps), Full: 481.0, 320.0, 350.0
Incremental: 19.0, 13.0, 11.0
compressed size, Full: 20.6%, 20.6%, 20.6%
Incremental: 5.4%, 5.4%, 5.4%
Dumps: lev datestmp tape file origK compK secs
0 20040623 CIS-022 23 18690 3849 8
1 20040628 CIS-027 10 700 38 0
[root@amanda testrestore]# su - amanda
-bash-2.05b$ amtape CIS label CIS-022
amtape: scanning for tape with label CIS-022
amtape: slot 26: date 20040623 label CIS-022 (exact label match)
amtape: label CIS-022 is now loaded.
-bash-2.05b$ exit
23. Open Source Data Backup (OSCON 2005), Slide 23
Restoring An Entire Backup Target (Cont.)
[root@amanda testrestore]# amrestore -p /dev/nst0 alabama.cis.uab.edu /etc > etc.0.tar
amrestore: 0: skipping start of tape: date 20040623 label CIS-022
amrestore: 1: skipping amanda.cis.uab.edu.__texas_dfs_home_undergrad.private.20040623.1
amrestore: 2: skipping florida.cis.uab.edu._etc.20040623.1
amrestore: 3: skipping alabama.cis.uab.edu._root.20040623.1
[...]
amrestore: 21: skipping missouri.cis.uab.edu._.20040623.0
amrestore: 22: skipping oregon.cis.uab.edu._home.20040623.1
amrestore: 23: restoring alabama.cis.uab.edu._etc.20040623.0
amrestore: 24: reached end of information
[root@amanda testrestore]# ls -l
total 18716
-rw-r--r-- 1 root root 19138560 Jun 28 14:40 etc.0.tar
24. Open Source Data Backup (OSCON 2005), Slide 24
Other Ways To Restore Files
• Amanda not available?
[root@amanda testrestore]# mt -f /dev/nst0 rewind
[root@amanda testrestore]# dd if=/dev/nst0 bs=32k count=1
AMANDA: TAPESTART DATE 20040623 TAPE CIS-022
1+0 records in
1+0 records out
[root@amanda testrestore]# mt -f /dev/nst0 fsf 1
[root@amanda testrestore]# dd if=/dev/nst0 bs=32k count=1
AMANDA: FILE 20040623 amanda.cis.uab.edu
//texas/dfs/home/undergrad.public lev 1 comp .gz program /usr/bin/smbclient
To restore, position tape at start of file and run:
dd if=<tape> bs=32k skip=1 | /bin/gzip -dc | /usr/bin/smbclient -f... -
1+0 records in
1+0 records out
25. Open Source Data Backup (OSCON 2005), Slide 25
When Things Go Wrong
• Data backup is a complex interaction between a
lot of players, and things -will- go wrong...
– Hosts will be down, or away (laptops)
– Tapes will go bad
– You'll change a password and then forget to tell Amanda
you changed it
– Your holding disk might be too small
• Amanda tries to be proactive by running amcheck
and giving you time to fix problems
26. Open Source Data Backup (OSCON 2005), Slide 26
Sample amcheck Problem Report
• Host down
Amanda Tape Server Host Check
-----------------------------
Holding disk /dumps/amanda: 60967688 KB disk space available, that's plenty
amcheck-server: slot 5: date 20040520 label CIS-018 (exact label match)
NOTE: skipping tape-writable test
Tape CIS-018 label ok
Server check took 175.627 seconds
Amanda Backup Client Hosts Check
--------------------------------
WARNING: vermont.cis.uab.edu: selfcheck request timed out. Host down?
Client check: 13 hosts checked in 30.210 seconds, 1 problem found
(brought to you by Amanda 2.4.4p2)
27. Open Source Data Backup (OSCON 2005), Slide 27
More amcheck Sample Messages
NOTE: index dir
/usr/local/etc/amanda/CIS/index/amanda.cis.uab.edu/__texas_dfs_home_undergrad.private: does not exist
ERROR: georgia.cis.uab.edu: [access as amanda not allowed from
amanda@amanda] amandahostsauth failed
ERROR: virginia.cis.uab.edu: [dir /etc needs 64KB, only has 5KB]
WARNING: holding disk /dumps/amanda: only 50254708 KB free (52428800 KB
requested)
amcheck-server: slot 19: rewinding tape: No medium found
amcheck-server: slot 19: date 20040330 label CIS-027 (active tape)
amcheck-server: fatal slot 20: slot 20 move failed
ERROR: label CIS-016 or new tape not found in rack
(expecting tape CIS-016 or a new tape)
amcheck-server: could not get changer info: could not read result from
"/usr/local/libexec/chg-scsi"
WARNING: skipping tape test because amdump or amflush seem to be running
WARNING: if they are not, you must run amcleanup
28. Open Source Data Backup (OSCON 2005), Slide 28
amcheck Is The Best Thing Since Sliced Bread
• Cron this script to email you every day before you
leave the office
• Run this script manually any time you touch a
config file
• Run this script manually any time you add or
change a client
• It will save you hours of troubleshooting
• One of the best features of Amanda
29. Open Source Data Backup (OSCON 2005), Slide 29
amcheck Can't Do Everything
• amcheck cannot catch problems that arise after it
runs
• amcheck doesn't check for everything
• Amanda's other utilities are really good at telling
you why they could not do their job
30. Open Source Data Backup (OSCON 2005), Slide 30
Revisiting The amdump Email Report
• Very rarely is the email report as uneventful as
the one presented earlier. It will typically include
something like this:
These dumps were to tape CIS-004.
The next tape Amanda expects to use is: CIS-005.
FAILURE AND STRANGE DUMP SUMMARY:
alabama.cis.u /usr lev 1 STRANGE
STATISTICS:
Total Full Daily
-------- -------- --------
Estimate Time (hrs:min) 1:00
Run Time (hrs:min) 2:57
etc....
31. Open Source Data Backup (OSCON 2005), Slide 31
Revisiting The amdump Email Report (Cont.)
• And then later on, it will explain itself....
FAILED AND STRANGE DUMP DETAILS:
/-- alabama.cis.u /usr lev 1 STRANGE
sendbackup: start [alabama.cis.uab.edu:/usr level 1]
sendbackup: info BACKUP=/bin/gtar
sendbackup: info RECOVER_CMD=/bin/gzip -dc |/bin/gtar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? gtar: ./local/majordomo-1.94.5/log/mdlog: file changed as we
read it
| Total bytes written: 207923200 (198MB, 3.1MB/s)
sendbackup: size 203050
sendbackup: end
--------
Amanda is telling us a file was in use as it tried to
grab it. This is usually harmless and constitutes
99% of STRANGE results.
32. Open Source Data Backup (OSCON 2005), Slide 32
Revisiting The amdump Email Report (Cont.)
• Sometimes, there are more serious failures
FAILURE AND STRANGE DUMP SUMMARY:
amanda.cis.u //texas/dfs/simnetxpcd lev 0 FAILED [disk
//texas/dfs/simnetxpcd, all estimate failed]
amanda.cis.u //texas/dfs/classfiles lev 0 FAILED [disk
//texas/dfs/classfiles, all estimate failed]
amanda.cis.u //texas/dfs/officefiles lev 0 FAILED [disk
//texas/dfs/officefiles, all estimate failed]
amanda.cis.u //texas/dfs/scripts lev 0 FAILED [disk //texas/dfs/scripts,
all estimate failed]
amanda.cis.u //texas/dfs/home/graduate.private lev 0 FAILED [disk
//texas/dfs/home/graduate.private, all estimate failed]
amanda.cis.u //texas/dfs/home/graduate.public lev 0 FAILED [disk
//texas/dfs/home/graduate.public, all estimate failed]
amanda.cis.u //texas/dfs/home/undergrad.private lev 0 FAILED [disk
//texas/dfs/home/undergrad.private, all estimate failed]
This was because I changed the password to the SAMBA
share that Amanda was using to back up a Windows server.
The email clued me in and the problem was resolved
quickly.
33. Open Source Data Backup (OSCON 2005), Slide 33
Handling Tape Failures
• Amanda keeps going and stores as much as
possible in the holding disk
• You can then use amflush to flush data to tape
-bash-2.05b$ amflush CIS
Scanning /dumps/amanda...
20040516: found Amanda directory.
20040517: found Amanda directory.
Multiple Amanda directories, please pick one by letter:
A. 20040516
B. 20040517
Select directories to flush [A..B]: [ALL]
Today is: 20040517
Flushing dumps in 20040516, 20040517 using tape changer "chg-scsi".
Expecting tape CIS-015 or a new tape. (The last dumps were to tape CIS-014)
Are you sure you want to do this [yN]? y
Running in background, you can log off now.
You'll get mail when amflush is finished.
34. Open Source Data Backup (OSCON 2005), Slide 34
Amanda Prerequisites
• A server that is mostly idle during the times that
you want to do your backups
• Enough disk space for a suitable holding disk
• GNU tar
• Samba (for Windows clients)
• A large capacity tape drive (typically)
• gnuplot (used by the amplot performance-plotting utility)
35. Open Source Data Backup (OSCON 2005), Slide 35
Configuration Roadmap
• Set up server
– Create amanda user and assign to group with permission
to use the tape and changer devices (e.g. 'disk' on Linux)
– Gather info on tape and changer devices - the mt and
mtx utilities are handy here
– Open ports (10080, 10082, 10083) and set up services
(amanda, amandaidx, amidxtape)
– Configure amanda.conf & changer (e.g. chg-scsi.conf)
• Set up clients
– Create amanda user
– Config ports, services, files, directories
– Allow access (typically .amandahosts; see the sketch below)
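A sketch of those client-side pieces, assuming a classic inetd/xinetd
setup; the ports and service names are the standard Amanda ones, and the
hostname is the server used throughout this talk:
# /etc/services entries
amanda 10080/udp
amandaidx 10082/tcp
amidxtape 10083/tcp
# ~amanda/.amandahosts on each client: allow the server's amanda user in
amanda.cis.uab.edu amanda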
36. Open Source Data Backup (OSCON 2005), Slide 36
Things To Think About Before
Proceeding
• What should my cycle be?
– Once per night? Every three nights?
• If your cycle seems convoluted, try using two:
– One for daily backups, once per night, one week cycle
– One for archives, always full dumps, run manually when
you need it
• Find the right balance for the cycle length
– Short cycles eat up lots of resources doing full dumps
– Long cycles can be a pain to restore from; a single
restore might need four tapes, for example
• How many tapes to use?
– How far back into the past do you want to go?
37. Open Source Data Backup (OSCON 2005), Slide 37
amanda.conf
• Amanda's main configuration file
• Many options
• Well documented
• Cannot begin to cover everything here, definitely
read the documentation first!
• This is where you define your cycle's parameters
(length, number of runs, number of tapes, etc...)
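A minimal sketch of the cycle-related parameters; the values below are
illustrative, not the actual CIS settings:
org "CIS" # name used on report subject lines
dumpcycle 7 days # at least one full dump of each DLE per cycle
runspercycle 7 # number of amdump runs per cycle (nightly)
tapecycle 25 tapes # tapes in rotation before Amanda reuses one
runtapes 1 # tapes used per run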
38. Open Source Data Backup (OSCON 2005), Slide 38
amanda.conf - dumptypes
• Different rules for each DLE
define dumptype root-tar {
    global
    program "GNUTAR"
    comment "root partitions dumped with tar"
    compress none
    index yes
    exclude list "/usr/local/lib/amanda/exclude.gtar"
    priority low
}
define dumptype comp-high {
    global
    comment "very important partitions on fast machines"
    compress client best
    priority high
}
39. Open Source Data Backup (OSCON 2005), Slide 39
disklist
• Defining your DLEs / targets
# File format is:
#
# hostname diskdev dumptype [spindle [interface]]
# the tape server itself
amanda.cis.uab.edu / comp-root-tar-exclude-holdingdisk
# the directory server
newjersey.cis.uab.edu / comp-root-tar
# the file server
virginia.cis.uab.edu /mz comp-user-tar
virginia.cis.uab.edu /root comp-root-tar
virginia.cis.uab.edu /usr comp-user-tar
virginia.cis.uab.edu /hc comp-root-tar
40. Open Source Data Backup (OSCON 2005), Slide 40
Configuring Tape Changers
• Amanda has a generic interface to tape changers
• Tape changer configuration is stored in chg-scsi.conf
• chg-scsi is one of many changer scripts that come
with Amanda; chg-scsi.conf is its configuration file
• This is where you tell Amanda how many drives
you have, which tapes go to which drives, barcode
support, etc...
41. Open Source Data Backup (OSCON 2005), Slide 41
Tape Drive Configuration Heads-Up
• There are many choices of changer configuration
scripts (chg-scsi, chg-multi, chg-mtx, chg-manual,
chg-disk, etc...)
• You may have to use one even if you don't have
an actual changer (chg-manual, chg-disk)
• Many options, many chances for confusion
• Please read docs/TAPE.CHANGERS and other
sources of information (listed at end of talk)
42. Open Source Data Backup (OSCON 2005), Slide 42
Configuring Your Tape Collection
• Use amlabel to label new tapes
– amlabel CIS CIS-000 slot 0
• Then use amtape to build the tapelist
– amtape CIS update
amtape: scanning all 30 slots in tape-changer rack:
slot 26: date 20040623 label CIS-022
etc...
44. Open Source Data Backup (OSCON 2005), Slide 44
Special Case Configurations
• Windows Clients
– Use Samba to back up Windows clients
– Configure the Amanda server or other Unix client with
Samba shares
– Then in the disklist point to that server and share
• amanda.cis.uab.edu //texas/dfs/officefiles comp-user-tar
• Firewalls
– --with-tcpportrange=40000,40030 (something > 1024)
– --with-udpportrange=920,940 (something < 1024)
– or, iptables has Amanda connection-tracking support you
can enable (a configure sketch follows)
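A sketch of how those options fit into a source build; the user and group
choices are assumptions matching the setup described earlier:
./configure --with-user=amanda --with-group=disk \
    --with-tcpportrange=40000,40030 \
    --with-udpportrange=920,940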
45. Open Source Data Backup (OSCON 2005), Slide 45
Introduction To Bacula
What is Bacula?
How is it different?
46. Open Source Data Backup (OSCON 2005), Slide 46
What is Bacula?
• “The Network Backup Tool for Linux, Unix, Mac
and Windows.”
• Another open-source project that aims to provide
a robust network-based, multiplatform backup
solution
• Newer than Amanda (started in 2000)
“It comes in the night and sucks the essence from your computers.”
- Kern Sibbald
47. Open Source Data Backup (OSCON 2005), Slide 47
Bacula Features
• Modular, scalable components
• Its own protocols on top of TCP
• Client and server components that partner to
stream and store data for backup and recovery
• Clean component separation - all communication
between them goes over the network
• Threaded rather than multiple processes
• Excellent documentation
48. Open Source Data Backup (OSCON 2005), Slide 48
How is Bacula Different than Amanda?
• It can span a single backup across multiple volumes
• You may find it easier to set up
• There are both command line and GUI
configuration tools available
• Scheduler gives you more control over what jobs
run at which times
• It will reuse a tape on multiple nights until it is
full
• Support for automated restores from bare metal
• Native support for Windows (no Samba/NFS)
• SQL database support
49. Open Source Data Backup (OSCON 2005), Slide 49
Sample Bacula Topology
[Flattened diagram: an Admin Console talks to the Director Daemon, which
uses a Database Server for its catalog; File Daemons on Windows, Linux,
Unix and OS X clients stream data to Storage Daemons, which write to a
tape device, a tape changer, or a disk device.]
50. Open Source Data Backup (OSCON 2005), Slide 50
Component Roles
• The Director manages all scheduling and job
creation. It is via an administration interface
talking to the Director that the backup
administrator controls the backup process.
• The Storage Daemon is responsible for writing
data out to disk/tape/changer
• The Database keeps the catalog of what has been
backed up and where
• The File Daemon streams the data to be backed up
from the client to the Storage Daemon
51. Open Source Data Backup (OSCON 2005), Slide 51
Prerequisites
• Bacula currently works with SQLite, MySQL and
PostgreSQL
• GNU C++ 2.95 or higher to compile
• Other software may be necessary depending on
configuration. If burning DVDs, you need the
dvd+rw-tools. If using the GUI console, you need
recent GNOME and GTK+ libs.
52. Open Source Data Backup (OSCON 2005), Slide 52
Configuration Overview
• Bacula is configured via a series of config files
– bacula-dir.conf, bacula-fd.conf, bacula-sd.conf,
console.conf
• Example snippets follow below; complete samples are
linked at the end of this talk
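A minimal sketch of a Schedule and Job resource in bacula-dir.conf,
loosely following the sample configs that ship with Bacula; all resource
names here are illustrative:
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05
  Run = Incremental mon-sat at 23:05
}
Job {
  Name = "BackupAlabama"
  Type = Backup
  Client = alabama-fd       # a File Daemon defined in a Client resource
  FileSet = "Full Set"      # defined in a FileSet resource
  Schedule = "WeeklyCycle"
  Storage = TapeDrive       # defined in a Storage resource
  Pool = Default
  Messages = Standard
}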
53. Open Source Data Backup (OSCON 2005), Slide 53
Storage Organization
• Bacula organizes tapes (volumes) into pools. It
will use one volume up until it is full, and then
move on to the next. You can take more control if
you want to use a new tape each night
• Similar to Amanda, each Bacula volume gets a
unique label and is added to a pool. However,
unlike Amanda, there is no set rotation. Bacula
will use one volume until it is full, then look for
another, etc...
• You can use multiple pools to ensure that a new tape
is used each day, e.g. set up a Monday pool, a
Tuesday pool, etc... (see the sketch below)
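A sketch of the per-day pool idea as a bacula-dir.conf Pool resource;
the retention and recycling values are illustrative:
Pool {
  Name = Monday
  Pool Type = Backup
  Recycle = yes             # reuse volumes once their data has expired
  AutoPrune = yes           # prune expired jobs/files automatically
  Volume Retention = 6 days # so next Monday the volume can be rewritten
}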
54. Open Source Data Backup (OSCON 2005), Slide 54
Pre and Post Job Scripts
• Bacula's File Daemon has the ability to run a
script before and after a job
• This can be used to shut down a database in order
to take a safe backup of it, for example
• Use the bacula-fd.conf directives “Run Before
Job” and “Run After Job”
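A sketch of the database example. In the Bacula versions I know, these
directives live in the Director's Job resource, with the ClientRun*
variants running the script on the client; the script paths here are
purely illustrative:
Job {
  Name = "BackupDatabase"
  # ...other Job directives as in the earlier sketch...
  ClientRunBeforeJob = "/usr/local/sbin/db-quiesce.sh"
  ClientRunAfterJob = "/usr/local/sbin/db-resume.sh"
}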
55. Open Source Data Backup (OSCON 2005), Slide 55
Take Home Points...
• Amanda is very robust, highly scalable, almost
infinitely configurable, and can very likely handle
your data backup situation
• If you find Amanda limiting or do not agree with
the scheduling philosophy, Bacula may be for
you. Momentum -may- be headed to Bacula.
• Once you get either of these configured, you can
trust it and move on with your life. They work,
are battle-tested, and are as reliable as, if not
more reliable than, the expensive commercial products.
• Test your backup and recovery system and
strategy early and often!!
56. Open Source Data Backup (OSCON 2005), Slide 56
For More Information
• http://www.amanda.org/
• Documentation
– README, INSTALL, docs/* (esp. FAQ and
TAPE.CHANGERS), example/*
– The AMANDA section in O'Reilly's “Unix Backup and
Recovery”
– man pages
• http://www.bacula.org/
– Documentation - Tutorial, Quick Start, User Guide
• The user communities are wonderful
– Sign up for [email protected] at
http://www.amanda.org/support/mailinglists.php
– Sign up for [email protected] at
http://lists.sourceforge.net/lists/listinfo/bacula-users
57. Open Source Data Backup (OSCON 2005), Slide 57
My Information
• My email is [email protected]
• This presentation is available at
http://www.cis.uab.edu/fran/
• More Amanda and Bacula information can be
found at the above URL, including:
– A document detailing every step of my Amanda
configuration, along with complete sample amanda.conf
and chg-scsi.conf
– Notes on configuring Samba to back up Windows shares
– Sample Bacula configuration files for each component
– Notes on strategies for setting up volumes and pools for
Bacula
58. Open Source Data Backup (OSCON 2005), Slide 58
The End - Thank You!