In 40 minutes the audience will learn a variety of ways to make postgresql database suddenly go out of memory on a box with half a terabyte of RAM.
Developer's and DBA's best practices for preventing this will also be discussed, as well as a bit of Postgres and Linux memory management internals.
The paperback version is available on lulu.com there http://goo.gl/fraa8o
This is the first volume of the postgresql database administration book. The book covers the steps for installing, configuring and administering a PostgreSQL 9.3 on Linux debian. The book covers the logical and physical aspect of PostgreSQL. Two chapters are dedicated to the backup/restore topic.
The document provides an overview of PostgreSQL performance tuning. It discusses caching, query processing internals, and optimization of storage and memory usage. Specific topics covered include the PostgreSQL configuration parameters for tuning shared buffers, work memory, and free space map settings.
This document discusses PostgreSQL statistics and how to use them effectively. It provides an overview of various PostgreSQL statistics sources like views, functions and third-party tools. It then demonstrates how to analyze specific statistics like those for databases, tables, indexes, replication and query activity to identify anomalies, optimize performance and troubleshoot issues.
This document provides an agenda and background information for a presentation on PostgreSQL. The agenda includes topics such as practical use of PostgreSQL, features, replication, and how to get started. The background section discusses the history and development of PostgreSQL, including its origins from INGRES and POSTGRES projects. It also introduces the PostgreSQL Global Development Team.
This document discusses streaming replication in PostgreSQL. It covers how streaming replication works, including the write-ahead log and replication processes. It also discusses setting up replication between a primary and standby server, including configuring the servers and verifying replication is working properly. Monitoring replication is discussed along with views and functions for checking replication status. Maintenance tasks like adding or removing standbys and pausing replication are also mentioned.
1. Log structured merge trees store data in multiple levels with different storage speeds and costs, requiring data to periodically merge across levels.
2. This structure allows fast writes by storing new data in faster levels before merging to slower levels, and efficient reads by querying multiple levels and merging results.
3. The merging process involves loading, sorting, and rewriting levels to consolidate and propagate deletions and updates between levels.
MariaDB 10.0 introduces domain-based parallel replication which allows transactions in different domains to execute concurrently on replicas. This can result in out-of-order transaction commit. MariaDB 10.1 adds optimistic parallel replication which maintains commit order. The document discusses various parallel replication techniques in MySQL and MariaDB including schema-based replication in MySQL 5.6 and logical clock replication in MySQL 5.7. It provides performance benchmarks of these techniques from Booking.com's database environments.
Devrim Gunduz gives a presentation on Write-Ahead Logging (WAL) in PostgreSQL. WAL logs all transactions to files called write-ahead logs (WAL files) before changes are written to data files. This allows for crash recovery by replaying WAL files. WAL files are used for replication, backup, and point-in-time recovery (PITR) by replaying WAL files to restore the database to a previous state. Checkpoints write all dirty shared buffers to disk and update the pg_control file with the checkpoint location.
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Databricks
The document discusses optimizations made to Spark SQL performance when working with Parquet files at ByteDance. It describes how Spark originally reads Parquet files and identifies two main areas for optimization: Parquet filter pushdown and the Parquet reader. For filter pushdown, sorting columns improved statistics and reduced data reads by 30%. For the reader, splitting it to first filter then read other columns prevented loading unnecessary data. These changes improved Spark SQL performance at ByteDance without changing jobs.
There are many ways to run high availability with PostgreSQL. Here, we present a template for you to create your own customized, high-availability solution using Python and for maximum accessibility, a distributed configuration store like ZooKeeper or etcd.
This is the presentation on ASH that I did with Graham Wood at RMOUG 2014 and that represents the final best effort to capture essential and advanced ASH content as started in a presentation Uri Shaft and I gave at a small conference in Denmark sometime in 2012 perhaps. The presentation is also available publicly through the RMOUG website, so I felt at liberty to post it myself here. If it disappears it would likely be because I have been asked to remove it by Oracle.
This document discusses Bloom filters and how they are used to optimize hash joins in Oracle databases. It provides examples of how hash joins work in serial and parallel execution plans. Bloom filters allow the cost-based optimizer to "filter early" in hash joins by preventing unnecessary row transportation between parallel query slave sets. The document explains the build and probe phases of hash joins and how Bloom filters can filter rows before they flow through parent execution stages, improving performance.
The document discusses the Performance Schema in MySQL. It provides an overview of what the Performance Schema is and how it can be used to monitor events within a MySQL server. It also describes how to configure the Performance Schema by setting up actors, objects, instruments, consumers and threads to control what is monitored. Finally, it explains how to initialize the Performance Schema by truncating existing summary tables before collecting new performance data.
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015PostgreSQL-Consulting
This document discusses how PostgreSQL works with disks and provides recommendations for disk subsystem monitoring, hardware selection, and configuration tuning to optimize performance. It explains that PostgreSQL relies on disk I/O for reading pages, writing the write-ahead log (WAL), and checkpointing. It recommends monitoring disk utilization, IOPS, latency, and I/O wait. The document also provides tips for choosing hardware like SSDs or RAID configurations and configuring the operating system, file systems, and PostgreSQL to improve performance.
Meta/Facebook's database serving social workloads is running on top of MyRocks (MySQL on RocksDB). This means our performance and reliability depends a lot on RocksDB. Not just MyRocks, but also we have other important systems running on top of RocksDB. We have learned many lessons from operating and debugging RocksDB at scale.
In this session, we will offer an overview of RocksDB, key differences from InnoDB, and share a few interesting lessons learned from production.
PostgreSQL is designed to be easily extensible. For this reason, extensions loaded into the database can function just like features that are built in. In this session, we will learn more about PostgreSQL extension framework, how are they built, look at some popular extensions, management of these extensions in your deployments.
Automatic Storage Management (ASM) provides a simple way to manage Oracle database files across disk storage. ASM uses disk groups and metadata to distribute data extents across disks for redundancy. Rebalancing operations redistribute extents to maintain even distribution as disks are added or removed, and the estimated time for rebalancing can be found in V$ASM_OPERATION. ASM supports different redundancy levels including external, normal, and high redundancy.
This document provides an introduction and overview of PostgreSQL, including its history, features, installation, usage and SQL capabilities. It describes how to create and manipulate databases, tables, views, and how to insert, query, update and delete data. It also covers transaction management, functions, constraints and other advanced topics.
MaxScale is a database proxy that provides high availability and scalability for MariaDB servers. It can be used to configure load balancing of read/write connections, auto failover/switchover/rejoin using MariaDB GTID replication. Keepalived can be used with MaxScale to provide high availability by monitoring MaxScale and failing over if needed. The document provides details on setting up MariaDB replication with GTID, installing and configuring MaxScale and Keepalived. It also describes testing the auto failover functionality.
How to set up orchestrator to manage thousands of MySQL serversSimon J Mudd
This document discusses how to scale Orchestrator to manage thousands of MySQL servers. It describes how Booking.com uses Orchestrator to manage their thousands of MySQL servers. As the number of monitored servers increases, integration with internal infrastructure is needed, Orchestrator performance must be optimized, and high availability and wider user access features are added. The document provides examples of configuration settings and special considerations needed to effectively use Orchestrator at large scale.
MariaDB MaxScale is a database proxy that provides scalability, high availability, and data streaming capabilities for MariaDB and MySQL databases. It acts as a load balancer and router to distribute queries across database servers. MaxScale supports services like read/write splitting, query caching, and security features like selective data masking. It can monitor replication lag and route queries accordingly. MaxScale uses a plugin architecture and its core remains stateless to provide flexibility and high performance.
This document provides an overview of Linux containers and Docker containers. It begins with definitions of containers and their advantages over virtual machines. It describes early implementations of containers like chroot, Jails, and Zones. It explains the underlying kernel technologies like cgroups and namespaces that enable Linux containers. It provides instructions for using LXC and systemd-nspawn to deploy basic containers. It then focuses on Docker containers, covering installation, images, volumes, and best practices for running applications like PostgreSQL in Docker containers.
Deploying MariaDB databases with containers at Nokia NetworksMariaDB plc
Nokia is focused on providing software and products that facilitate rapid development, deployment and scaling of products and services to customers. The Common Software Foundation (CSF) within Nokia develops and supports product reuse by multiple applications within Nokia, including MariaDB. Their focus over the last year has been to develop a containerized MariaDB solution supporting multiple architectures, including both clustering and primary/secondary replication with MariaDB MaxScale. In this talk, Rick Lane discusses this journey of these containerized solutions from development to customer trials, including problems encountered and solutions.
Netflix’s Big Data Platform team manages data warehouse in Amazon S3 with over 60 petabytes of data and writes hundreds of terabytes of data every day. With a data warehouse at this scale, it is a constant challenge to keep improving performance. This talk will focus on Iceberg, a new table metadata format that is designed for managing huge tables backed by S3 storage. Iceberg decreases job planning time from minutes to under a second, while also isolating reads from writes to guarantee jobs always use consistent table snapshots.
In this session, you'll learn:
• Some background about big data at Netflix
• Why Iceberg is needed and the drawbacks of the current tables used by Spark and Hive
• How Iceberg maintains table metadata to make queries fast and reliable
• The benefits of Iceberg's design and how it is changing the way Netflix manages its data warehouse
• How you can get started using Iceberg
Speaker
Ryan Blue, Software Engineer, Netflix
This document provides an overview of pgCenter, an open source tool for monitoring and managing PostgreSQL databases. It summarizes pgCenter's main features, which include displaying statistics on databases, tables, indexes and functions; monitoring long running queries and statements; managing connections to multiple PostgreSQL instances; and performing administrative tasks like viewing logs, editing configuration files, and canceling queries. Use cases and examples of how pgCenter can help optimize PostgreSQL performance are also provided.
Devrim Gunduz gives a presentation on Write-Ahead Logging (WAL) in PostgreSQL. WAL logs all transactions to files called write-ahead logs (WAL files) before changes are written to data files. This allows for crash recovery by replaying WAL files. WAL files are used for replication, backup, and point-in-time recovery (PITR) by replaying WAL files to restore the database to a previous state. Checkpoints write all dirty shared buffers to disk and update the pg_control file with the checkpoint location.
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Databricks
The document discusses optimizations made to Spark SQL performance when working with Parquet files at ByteDance. It describes how Spark originally reads Parquet files and identifies two main areas for optimization: Parquet filter pushdown and the Parquet reader. For filter pushdown, sorting columns improved statistics and reduced data reads by 30%. For the reader, splitting it to first filter then read other columns prevented loading unnecessary data. These changes improved Spark SQL performance at ByteDance without changing jobs.
There are many ways to run high availability with PostgreSQL. Here, we present a template for you to create your own customized, high-availability solution using Python and for maximum accessibility, a distributed configuration store like ZooKeeper or etcd.
This is the presentation on ASH that I did with Graham Wood at RMOUG 2014 and that represents the final best effort to capture essential and advanced ASH content as started in a presentation Uri Shaft and I gave at a small conference in Denmark sometime in 2012 perhaps. The presentation is also available publicly through the RMOUG website, so I felt at liberty to post it myself here. If it disappears it would likely be because I have been asked to remove it by Oracle.
This document discusses Bloom filters and how they are used to optimize hash joins in Oracle databases. It provides examples of how hash joins work in serial and parallel execution plans. Bloom filters allow the cost-based optimizer to "filter early" in hash joins by preventing unnecessary row transportation between parallel query slave sets. The document explains the build and probe phases of hash joins and how Bloom filters can filter rows before they flow through parent execution stages, improving performance.
The document discusses the Performance Schema in MySQL. It provides an overview of what the Performance Schema is and how it can be used to monitor events within a MySQL server. It also describes how to configure the Performance Schema by setting up actors, objects, instruments, consumers and threads to control what is monitored. Finally, it explains how to initialize the Performance Schema by truncating existing summary tables before collecting new performance data.
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015PostgreSQL-Consulting
This document discusses how PostgreSQL works with disks and provides recommendations for disk subsystem monitoring, hardware selection, and configuration tuning to optimize performance. It explains that PostgreSQL relies on disk I/O for reading pages, writing the write-ahead log (WAL), and checkpointing. It recommends monitoring disk utilization, IOPS, latency, and I/O wait. The document also provides tips for choosing hardware like SSDs or RAID configurations and configuring the operating system, file systems, and PostgreSQL to improve performance.
Meta/Facebook's database serving social workloads is running on top of MyRocks (MySQL on RocksDB). This means our performance and reliability depends a lot on RocksDB. Not just MyRocks, but also we have other important systems running on top of RocksDB. We have learned many lessons from operating and debugging RocksDB at scale.
In this session, we will offer an overview of RocksDB, key differences from InnoDB, and share a few interesting lessons learned from production.
PostgreSQL is designed to be easily extensible. For this reason, extensions loaded into the database can function just like features that are built in. In this session, we will learn more about PostgreSQL extension framework, how are they built, look at some popular extensions, management of these extensions in your deployments.
Automatic Storage Management (ASM) provides a simple way to manage Oracle database files across disk storage. ASM uses disk groups and metadata to distribute data extents across disks for redundancy. Rebalancing operations redistribute extents to maintain even distribution as disks are added or removed, and the estimated time for rebalancing can be found in V$ASM_OPERATION. ASM supports different redundancy levels including external, normal, and high redundancy.
This document provides an introduction and overview of PostgreSQL, including its history, features, installation, usage and SQL capabilities. It describes how to create and manipulate databases, tables, views, and how to insert, query, update and delete data. It also covers transaction management, functions, constraints and other advanced topics.
MaxScale is a database proxy that provides high availability and scalability for MariaDB servers. It can be used to configure load balancing of read/write connections, auto failover/switchover/rejoin using MariaDB GTID replication. Keepalived can be used with MaxScale to provide high availability by monitoring MaxScale and failing over if needed. The document provides details on setting up MariaDB replication with GTID, installing and configuring MaxScale and Keepalived. It also describes testing the auto failover functionality.
How to set up orchestrator to manage thousands of MySQL serversSimon J Mudd
This document discusses how to scale Orchestrator to manage thousands of MySQL servers. It describes how Booking.com uses Orchestrator to manage their thousands of MySQL servers. As the number of monitored servers increases, integration with internal infrastructure is needed, Orchestrator performance must be optimized, and high availability and wider user access features are added. The document provides examples of configuration settings and special considerations needed to effectively use Orchestrator at large scale.
MariaDB MaxScale is a database proxy that provides scalability, high availability, and data streaming capabilities for MariaDB and MySQL databases. It acts as a load balancer and router to distribute queries across database servers. MaxScale supports services like read/write splitting, query caching, and security features like selective data masking. It can monitor replication lag and route queries accordingly. MaxScale uses a plugin architecture and its core remains stateless to provide flexibility and high performance.
This document provides an overview of Linux containers and Docker containers. It begins with definitions of containers and their advantages over virtual machines. It describes early implementations of containers like chroot, Jails, and Zones. It explains the underlying kernel technologies like cgroups and namespaces that enable Linux containers. It provides instructions for using LXC and systemd-nspawn to deploy basic containers. It then focuses on Docker containers, covering installation, images, volumes, and best practices for running applications like PostgreSQL in Docker containers.
Deploying MariaDB databases with containers at Nokia NetworksMariaDB plc
Nokia is focused on providing software and products that facilitate rapid development, deployment and scaling of products and services to customers. The Common Software Foundation (CSF) within Nokia develops and supports product reuse by multiple applications within Nokia, including MariaDB. Their focus over the last year has been to develop a containerized MariaDB solution supporting multiple architectures, including both clustering and primary/secondary replication with MariaDB MaxScale. In this talk, Rick Lane discusses this journey of these containerized solutions from development to customer trials, including problems encountered and solutions.
Netflix’s Big Data Platform team manages data warehouse in Amazon S3 with over 60 petabytes of data and writes hundreds of terabytes of data every day. With a data warehouse at this scale, it is a constant challenge to keep improving performance. This talk will focus on Iceberg, a new table metadata format that is designed for managing huge tables backed by S3 storage. Iceberg decreases job planning time from minutes to under a second, while also isolating reads from writes to guarantee jobs always use consistent table snapshots.
In this session, you'll learn:
• Some background about big data at Netflix
• Why Iceberg is needed and the drawbacks of the current tables used by Spark and Hive
• How Iceberg maintains table metadata to make queries fast and reliable
• The benefits of Iceberg's design and how it is changing the way Netflix manages its data warehouse
• How you can get started using Iceberg
Speaker
Ryan Blue, Software Engineer, Netflix
This document provides an overview of pgCenter, an open source tool for monitoring and managing PostgreSQL databases. It summarizes pgCenter's main features, which include displaying statistics on databases, tables, indexes and functions; monitoring long running queries and statements; managing connections to multiple PostgreSQL instances; and performing administrative tasks like viewing logs, editing configuration files, and canceling queries. Use cases and examples of how pgCenter can help optimize PostgreSQL performance are also provided.
Lightweight locks (LWLocks) in PostgreSQL provide mutually exclusive access to shared memory structures. They support both shared and exclusive locking modes. The LWLocks framework uses wait queues, semaphores, and spinlocks to efficiently manage acquiring and releasing locks. Dynamic monitoring of LWLock events is possible through special builds that incorporate statistics collection.
Nine Circles of Inferno or Explaining the PostgreSQL VacuumAlexey Lesovsky
The document describes the nine circles of the PostgreSQL vacuum process. Circle I discusses the postmaster process, which initializes shared memory and launches the autovacuum launcher and worker processes. Circle II focuses on the autovacuum launcher, which manages worker processes and determines when to initiate vacuuming for different databases. Circle III returns to the postmaster process and how it launches autovacuum workers. Circle IV discusses what occurs within an autovacuum worker process after it is launched, including initializing, signaling the launcher, scanning relations, and updating databases. Circle V delves into processing a single database by an autovacuum worker.
This document provides an overview of troubleshooting streaming replication in PostgreSQL. It begins with introductions to write-ahead logging and replication internals. Common troubleshooting tools are then described, including built-in views and functions as well as third-party tools. Finally, specific troubleshooting cases are discussed such as replication lag, WAL bloat, recovery conflicts, and high CPU recovery usage. Throughout, examples are provided of how to detect and diagnose issues using the various tools.
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL-Consulting
This talk is prepared as a bunch of slides, where each slide describes a really bad way people can screw up their PostgreSQL database and provides a weight - how frequently I saw that kind of problem. Right before the talk I will reshuffle the deck to draw ten random slides and explain you why such practices are bad and how to avoid running into them.
PostgreSQL Performance Problems: Monitoring and AlertingGrant Fritchey
PostgreSQL can be difficult to troubleshoot when the pressure is on without the right knowledge and tools. Knowing where to find the information you need to improve performance is central to your ability to act quickly and solve problems. In this training, we'll discuss the various query statistic views and log information that's available in PostgreSQL so that you can solve problems quickly. Along the way, we'll highlight a handful of open-source and paid tools that can help you track data over time and provide better alerting capabilities so that you know about problems before they become critical.
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas FittlCitus Data
Your PostgreSQL database is one of the most important pieces of your architecture. What should you really watch out for, send reports on and alert on? We’ll discuss how query performance statistics can be made accessible to application developers, critical entries one should monitor in the PostgreSQL log files, how to collect EXPLAIN plans at scale, how to watch over autovacuum and VACUUM operations, and how to flag issues based on schema statistics.
Monitoring Postgres at Scale | PostgresConf US 2018 | Lukas FittlCitus Data
Your PostgreSQL database is one of the most important pieces of your architecture - yet the level of introspection available in Postgres is often hard to work with. Its easy to get very detailed information, but what should you really watch out for, send reports on and alert on?
In this talk we'll discuss how query performance statistics can be made accessible to application developers, critical entries one should monitor in the PostgreSQL log files, how to collect EXPLAIN plans at scale, how to watch over autovacuum and VACUUM operations, and how to flag issues based on schema statistics.
We'll also talk a bit about monitoring multi-server setups, first going into high availability and read standbys, logical replication, and then reviewing how monitoring looks like for sharded databases like Citus.
The talk will primarily describe free/open-source tools and statistics views readily available from within Postgres.
This document discusses using PostgreSQL statistics to optimize performance. It describes various statistics sources like pg_stat_database, pg_stat_bgwriter, and pg_stat_replication that provide information on operations, caching, and replication lag. It also provides examples of using these sources to identify issues like long transactions, temporary file growth, and replication delays.
Deep dive into PostgreSQL internal statistics / Алексей Лесовский (PostgreSQL...Ontico
СУБД PostgreSQL — это огромный механизм, который состоит из множества подсистем, чья работа определяет производительность PostgreSQL. В процессе эксплуатации обеспечивается сбор статистики и информации о работе компонентов, что позволяет оценить эффективность PostgreSQL и принять меры для повышения производительности. Однако, этой информации очень много и представлена она в достаточно упрощенном виде. Обработка этой информации и ее интерпретация порой совсем нетривиальная задача, а зоопарк инструментов и утилит запросто поставит в тупик даже продвинутого DBA.
В докладе речь пойдет о подсистеме сбора статистики, о том какая информация доступна для оценки эффективности PostgreSQL, как её получить, не прибегая к зоопарку инструментов. Как интерпретировать и использовать полученную информацию, как найти узкие места, устранить их и повысить производительность PostgreSQL.
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLSeveralnines
To operate PostgreSQL efficiently, you need to have insight into database performance and make sure it is at optimal levels.
With that in mind, we dive into monitoring PostgreSQL for performance in this webinar replay.
PostgreSQL offers many metrics through various status overviews and commands, but which ones really matter to you? How do you trend and alert on them? What is the meaning behind the metrics? And what are some of the most common causes for performance problems in production?
We discuss this and more in ordinary, plain DBA language. We also have a look at some of the tools available for PostgreSQL monitoring and trending; and we’ll show you how to leverage ClusterControl’s PostgreSQL metrics, dashboards, custom alerting and other features to track and optimize the performance of your system.
AGENDA
- PostgreSQL architecture overview
- Performance problems in production
- Common causes
- Key PostgreSQL metrics and their meaning
- Tuning for performance
- Performance monitoring tools
- Impact of monitoring on performance
- How to use ClusterControl to identify performance issues
- Demo
SPEAKER
Sebastian Insausti, Support Engineer at Severalnines, has loved technology since his childhood, when he did his first computer course (Windows 3.11). And from that moment he was decided on what his profession would be. He has since built up experience with MySQL, PostgreSQL, HAProxy, WAF (ModSecurity), Linux (RedHat, CentOS, OL, Ubuntu server), Monitoring (Nagios), Networking and Virtualization (VMWare, Proxmox, Hyper-V, RHEV).
Prior to joining Severalnines, Sebastian worked as a consultant to state companies in security, database replication and high availability scenarios. He’s also a speaker and has given a few talks locally on InnoDB Cluster and MySQL Enterprise together with an Oracle team. Previous to that, he worked for a Mexican company as chief of sysadmin department as well as for a local ISP (Internet Service Provider), where he managed customers' servers and connectivity.
This webinar builds upon a related blog post by Sebastian: https://severalnines.com/blog/performance-cheat-sheet-postgresql.
PGConf APAC 2018 - Monitoring PostgreSQL at ScalePGConf APAC
Speaker: Lukas Fittl
Your PostgreSQL database is one of the most important pieces of your architecture - yet the level of introspection available in Postgres is often hard to work with. Its easy to get very detailed information, but what should you really watch out for, send reports on and alert on?
In this talk we'll discuss how query performance statistics can be made accessible to application developers, critical entries one should monitor in the PostgreSQL log files, how to collect EXPLAIN plans at scale, how to watch over autovacuum and VACUUM operations, and how to flag issues based on schema statistics.
We'll also talk a bit about monitoring multi-server setups, first going into high availability and read standbys, logical replication, and then reviewing how monitoring looks like for sharded databases like Citus.
The talk will primarily describe free/open-source tools and statistics views readily available from within Postgres.
PgCenter is a tool for monitoring and troubleshooting PostgreSQL. It provides a graphical interface to view key performance metrics and statuses. Some of its main features include displaying server health, load, memory and disk usage, statement performance, replication status and more. It aims to help PostgreSQL administrators quickly check the health of their databases and identify potential problems.
Covered Database Maintenance & Performance and Concurrency :
1. PostgreSQL Tuning and Performance
2. Find and Tune Slow Running Queries
3. Collecting regular statistics from pg_stat* views
4. Finding out what makes SQL slow
5. Speeding up queries without rewriting them
6. Discovering why a query is not using an index
7. Forcing a query to use an index
8. EXPLAIN and SQL Execution
9. Workload Analysis
Introduction to PostgreSQL for System AdministratorsJignesh Shah
This document provides an introduction and overview of PostgreSQL for system administrators. It covers why to use open source databases like PostgreSQL, a quick start guide to installing and configuring PostgreSQL, PostgreSQL internals like file system layout and processes, and monitoring PostgreSQL performance using tools like iostat and top. The presentation is aimed at helping system administrators get started with managing PostgreSQL databases.
This document discusses using window functions in PostgreSQL to analyze sales data. It shows a table with movie titles, categories, and total sales. Window functions can be used to calculate things like running totals, ranks, and more to analyze the sales data across rows.
This document discusses how Solaris features like ZFS, Zones, and DTrace can be leveraged to improve PostgreSQL performance and capabilities. ZFS allows for cheap, space-efficient snapshots of databases that can be quickly cloned and used without locks. Zones enable point-in-time copies of entire databases to run independently with low overhead. DTrace provides powerful tracing of PostgreSQL queries, I/O, and system activity for troubleshooting performance issues. Tools from OmniTI Labs further exploit these Solaris features for PostgreSQL monitoring, testing, and disaster recovery. In summary, combining PostgreSQL with the advanced virtualization and instrumentation of Solaris can significantly enhance database capabilities and operations.
This document discusses how Solaris features like ZFS, Zones, and DTrace can be leveraged to improve PostgreSQL performance and capabilities. ZFS allows for cheap, space-efficient snapshots of databases that can be quickly cloned and used without locks. Zones enable point-in-time copies of entire databases to run independently with low overhead. DTrace provides powerful instrumentation that allows deeply monitoring PostgreSQL queries, I/O, and system interactions. Tools from OmniTI Labs further exploit these Solaris features for PostgreSQL monitoring, testing, and disaster recovery. In summary, combining PostgreSQL with the advanced virtualization and tracing tools in Solaris can significantly enhance database capabilities and operations.
This document discusses advanced Postgres monitoring. It begins with an introduction of the speaker and an agenda for the discussion. It then covers selection criteria for monitoring solutions, compares open source and SAAS monitoring options, and provides examples of collecting specific Postgres metrics using CollectD. It also discusses alerting, handling monitoring changes, and being prepared to respond to incidents outside of normal hours.
Mark Wong
pg_proctab is a collection of PostgreSQL stored functions that provide access to the operating system process table using SQL. We'll show you which functions are available and where they collect the data, and give examples of their use to collect processor and I/O statistics on SQL queries.
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
pg_proctab is a collection of PostgreSQL stored functions that provide access to the operating system process table using SQL. We'll show you which functions are available and where they collect the data, and give examples of their use to collect processor and I/O statistics on SQL queries.
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...Citus Data
I’m a Postgres person. Period. After talking to many Rails developers about their application performance, I realized many performance issues can be solved by understanding your database a bit better. So I thought I’d share the statistics Postgres captures for you and how you can use them to find slow queries, un-used indexes, or tables which are not getting vacuumed correctly. This talk will cover Postgres tools and tips for the above, including pgstatstatements, useful catalog tables, and recently added Postgres features such as CREATE STATISTICS.
As the popularity of PostgreSQL continues to soar, many companies are exploring ways of migrating their application database over. At Redgate Software, we recently added PostgreSQL as an optional data store for SQL Monitor, our flagship monitoring application, after nearly 18 years of being backed exclusively by SQL Server. Knowing that others will be taking this journey in the near future, we'd like to discuss what we learned. In this training, we'll discuss the planning that needs to take place before a migration begins, including datatype changes, PostgreSQL configuration modifications, and query differences. This will be a mix of slides and demo from our own learnings, as well as those of some clients we've helped along the way.
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
pc_proctab is a collection of PostgreSQL stored functions that allow you to access the operating system process table using SQL. See examples on how to use these stored functions to collect processor and I/O statistics on SQL statements run against the database.
The document discusses using pg_proctab, a PostgreSQL extension that provides functions to query operating system process and statistics tables from within PostgreSQL. It demonstrates how to use pg_proctab to monitor CPU and memory usage, I/O, and other process-level metrics for queries. The document also shows how to generate custom reports on database activity and performance by taking snapshots before and after queries and analyzing the differences.
Отладка и устранение проблем в PostgreSQL Streaming Replication.Alexey Lesovsky
Потоковая репликация, которая появилась в 2010 году, стала одной из прорывных фич постгреса и в настоящее время практически ни одна инсталляция не обходится без использования потоковой репликации. Она надежна, легка в настройке, нетребовательна к ресурсам. Однако при всех своих положительных качествах, при её эксплуатации могут возникать различные проблемы и неприятные ситуации. Для диагностики и решения проблем, связанных с потоковой репликацией, есть множество инструментов, как встроенных в PostgreSQL, так и сторонних.
В этом докладе я сделаю обзор доступных инструментов и расскажу, как с помощью этих средств диагностировать различные типы проблем и как устранять их. Рассматривая методы решения, мы также рассмотрим проблемы, которые возникают при эксплуатации потоковой репликации.
Доклад будет полезен DBA и системным администраторам.
1. A PostgreSQL database outage occurred at GitLab on January 31st due to a combination of factors including an increase in load, replication lag, and the deletion of the database directory.
2. Lessons learned include monitoring replication, using tools like pg_basebackup properly, and having backups and disaster recovery processes in place.
3. Recommended preventative measures include setting sane configuration values, automated testing of backups, assigning an owner for data durability, and improving documentation.
This presentation discusses optimizing Linux systems for PostgreSQL databases. Linux is a good choice for databases due to its active development, features, stability, and community support. The presentation covers optimizing various system resources like CPU scheduling, memory, storage I/O, and power management to improve database performance. Specific topics include disabling transparent huge pages, tuning block I/O schedulers, and selecting appropriate scaling governors. The overall message is that Linux can be adapted for database workloads through testing and iterative changes.
This document provides an overview of pgCenter, a tool for managing and monitoring PostgreSQL databases. It describes pgCenter's interface which displays system metrics, PostgreSQL statistics and additional information. The interface shows values for items like CPU and memory usage, database connections, autovacuum operations, and query information. PgCenter provides a quick way to view real-time PostgreSQL and server performance metrics.
The document provides configuration instructions and guidelines for setting up streaming replication between a PostgreSQL master and standby server, including setting parameter values for wal_level, max_wal_senders, wal_keep_segments, creating a dedicated replication role, using pg_basebackup to initialize the standby, and various recovery target options to control the standby's behavior. It also discusses synchronous replication using replication slots and monitoring the replication process on both the master and standby servers.
Slides from Secon'2015 - Software Developers Conference. Penza, Russia.
The database is an essential element of any project. The database must be stable and provide good performance. If you plan to use PostgreSQL in your project, you will run into question the choice of operating system. Linux is one of the most popular operating system today. The combination of flexibility and stability makes Linux a good candidate as a platform for PostgreSQL. However, the default settings are suitable for a wide range of workloads. In this report, I will talk about what settings should pay attention and how they affect the performance of PostgreSQL. Which of these settings are more important, and what - no. How do the PostgreSQL more predictable and stable under normal circumstances or in cases of increasing load.
How to Configure Vendor Management in Lunch App of Odoo 18Celine George
The Vendor management in the Lunch app of Odoo 18 is the central hub for managing all aspects of the restaurants or caterers that provide food for your employees.
Human Anatomy and Physiology II Unit 3 B pharm Sem 2
Respiratory system
Anatomy of respiratory system with special reference to anatomy
of lungs, mechanism of respiration, regulation of respiration
Lung Volumes and capacities transport of respiratory gases,
artificial respiration, and resuscitation methods
Urinary system
Anatomy of urinary tract with special reference to anatomy of
kidney and nephrons, functions of kidney and urinary tract,
physiology of urine formation, micturition reflex and role of
kidneys in acid base balance, role of RAS in kidney and
disorders of kidney
Artificial intelligence Presented by JM.jmansha170
AI (Artificial Intelligence) :
"AI is the ability of machines to mimic human intelligence, such as learning, decision-making, and problem-solving."
Important Points about AI:
1. Learning – AI can learn from data (Machine Learning).
2. Automation – It helps automate repetitive tasks.
3. Decision Making – AI can analyze and make decisions faster than humans.
4. Natural Language Processing (NLP) – AI can understand and generate human language.
5. Vision & Recognition – AI can recognize images, faces, and patterns.
6. Used In – Healthcare, finance, robotics, education, and more.
Owner By:
Name : Junaid Mansha
Work : Web Developer and Graphics Designer
Contact us : +92 322 2291672
Email : [email protected]
Energy Balances Of Oecd Countries 2011 Iea Statistics 1st Edition Oecdrazelitouali
Energy Balances Of Oecd Countries 2011 Iea Statistics 1st Edition Oecd
Energy Balances Of Oecd Countries 2011 Iea Statistics 1st Edition Oecd
Energy Balances Of Oecd Countries 2011 Iea Statistics 1st Edition Oecd
HOW YOU DOIN'?
Cool, cool, cool...
Because that's what she said after THE QUIZ CLUB OF PSGCAS' TV SHOW quiz.
Grab your popcorn and be seated.
QM: THARUN S A
BCom Accounting and Finance (2023-26)
THE QUIZ CLUB OF PSGCAS.
This presentation was provided by Jennifer Gibson of Dryad, during the first session of our 2025 NISO training series "Secrets to Changing Behavior in Scholarly Communications." Session One was held June 5, 2025.
How to Manage & Create a New Department in Odoo 18 EmployeeCeline George
In Odoo 18's Employee module, organizing your workforce into departments enhances management and reporting efficiency. Departments are a crucial organizational unit within the Employee module.
Unit- 4 Biostatistics & Research Methodology.pdfKRUTIKA CHANNE
Blocking and confounding (when a third variable, or confounder, influences both the exposure and the outcome) system for Two-level factorials (a type of experimental design where each factor (independent variable) is investigated at only two levels, typically denoted as "high" and "low" or "+1" and "-1")
Regression modeling (statistical model that estimates the relationship between one dependent variable and one or more independent variables using a line): Hypothesis testing in Simple and Multiple regression models
Introduction to Practical components of Industrial and Clinical Trials Problems: Statistical Analysis Using Excel, SPSS, MINITAB®️, DESIGN OF EXPERIMENTS, R - Online Statistical Software to Industrial and Clinical trial approach
Parenting Teens: Supporting Trust, resilience and independencePooky Knightsmith
For more information about my speaking and training work, visit: https://www.pookyknightsmith.com/speaking/
SESSION OVERVIEW:
Parenting Teens: Supporting Trust, Resilience & Independence
The teenage years bring new challenges—for teens and for you. In this practical session, we’ll explore how to support your teen through emotional ups and downs, growing independence, and the pressures of school and social life.
You’ll gain insights into the teenage brain and why boundary-pushing is part of healthy development, along with tools to keep communication open, build trust, and support emotional resilience. Expect honest ideas, relatable examples, and space to connect with other parents.
By the end of this session, you will:
• Understand how teenage brain development affects behaviour and emotions
• Learn ways to keep communication open and supportive
• Explore tools to help your teen manage stress and bounce back from setbacks
• Reflect on how to encourage independence while staying connected
• Discover simple strategies to support emotional wellbeing
• Share experiences and ideas with other parents
Analysis of Quantitative Data Parametric and non-parametric tests.pptxShrutidhara2
This presentation covers the following points--
Parametric Tests
• Testing the Significance of the Difference between Means
• Analysis of Variance (ANOVA) - One way and Two way
• Analysis of Co-variance (One-way)
Non-Parametric Tests:
• Chi-Square test
• Sign test
• Median test
• Sum of Rank test
• Mann-Whitney U-test
Moreover, it includes a comparison of parametric and non-parametric tests, a comparison of one-way ANOVA, two-way ANOVA, and one-way ANCOVA.
How to Create Quotation Templates Sequence in Odoo 18 SalesCeline George
In this slide, we’ll discuss on how to create quotation templates sequence in Odoo 18 Sales. Odoo 18 Sales offers a variety of quotation templates that can be used to create different types of sales documents.
How to Manage Upselling of Subscriptions in Odoo 18Celine George
Subscriptions in Odoo 18 are designed to auto-renew indefinitely, ensuring continuous service for customers. However, businesses often need flexibility to adjust pricing or quantities based on evolving customer needs.
2. What is PostgreSQL activity statistics.
How to use statistics effectively.
How to solve problems with statistics.
http://goo.gl/uDuSvs
Agenda
5. $ ps hf -u postgres -o cmd
/usr/pgsql-9.5/bin/postgres -D /var/lib/pgsql/9.5/data
_ postgres: logger process
_ postgres: checkpointer process
_ postgres: writer process
_ postgres: wal writer process
_ postgres: autovacuum launcher process
_ postgres: stats collector process
_ postgres: postgres pgbench [local] idle in transaction
_ postgres: postgres pgbench [local] idle
_ postgres: postgres pgbench [local] UPDATE
_ postgres: postgres pgbench [local] UPDATE waiting
_ postgres: postgres pgbench [local] UPDATE
Black box
6. Write Ahead Log
Shared
Buffers
Buffers IO Autovacuum Workers
Autovacuum Launcher
Background Workers
Indexes IO
Query Execution
Query Planning
Client Backends Postmaster
Relations IO
Logger Process Stats Collector
Logical
Replication
WAL Sender
Process
Archiver
Process
Background
Writer
Checkpointer
Process
Network Storage
Recovery Process
WAL Receiver Process
Tables/Indexes Data Files
Where PostgreSQL spends its time
7. Too much information (more than 100 counters in 9.5).
Statistics are provided as an online counters.
No history (but reset functions are available).
No native handy stat tools in PostgreSQL.
A lot of 3rd party tools and programs.
Problems
8. Too much information (more than 100 counters in 9.5).
Statistics are provided as an online counters.
No history (but reset functions are available).
No native handy stat tools in PostgreSQL.
A lot of 3rd party tools and programs.
Important to use stats directly from PostgreSQL.
Basic SQL skills are required.
Problems
9. Counters in shared memory.
Functions.
Builtin views.
Official extensions in contribs package.
Unofficial extensions.
Statistics sources
13. $ select * from pg_stat_database;
...
blks_read | 7978770895
blks_hit | 9683551077519
...
$ select
sum(blks_hit)*100/sum(blks_hit+blks_read) as hit_ratio
from pg_stat_database;
More is better, and not less than 90%
Cache hit ratio
21. $ select * from pg_stat_replication;
...
sent_location | 1691/EEE65900
write_location | 1691/EEE65900
flush_location | 1691/EEE65900
replay_location | 1691/EEE658D0
...
1692/EEE65900 — location in transaction log (WAL)
All values are equal = ideal
Replication lag
22. Lag causes:
Networking
Storage
CPU
How many bytes written in WAL
$ select
pg_xlog_location_diff(pg_current_xlog_location(),'0/00000000');
Replication lag in bytes
$ select
client_addr,
pg_xlog_location_diff(pg_current_xlog_location(), replay_location)
from pg_stat_replication;
Replication lag in seconds
$ select
extract(epoch from now() - pg_last_xact_replay_timestamp());
Replication lag
23. $ select
client_addr as client,
pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),sent_location)) as pending,
pg_size_pretty(pg_xlog_location_diff(sent_location,write_location)) as write,
pg_size_pretty(pg_xlog_location_diff(write_location,flush_location)) as flush,
pg_size_pretty(pg_xlog_location_diff(flush_location,replay_location)) as replay,
pg _size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),replay_location)) as total
from pg_stat_replication;
client | pending | network | written | flushed | total
-----------+----------+----------+---------+------------+------------
127.0.0.1 | 0 bytes | 0 bytes | 0 bytes | 48 bytes | 48 bytes
10.1.0.8 | 12 GB | 30 MB | 0 bytes | 156 kB | 12 GB
10.2.0.6 | 0 bytes | 48 bytes | 0 bytes | 551 MB | 552 MB
Replication lag
28. $ select
s.relname,
pg_size_pretty(pg_relation_size(relid)),
coalesce(n_tup_ins,0) + 2 * coalesce(n_tup_upd,0) -
coalesce(n_tup_hot_upd,0) + coalesce(n_tup_del,0) AS total_writes,
(coalesce(n_tup_hot_upd,0)::float * 100 / (case when n_tup_upd > 0
then n_tup_upd else 1 end)::float)::numeric(10,2) AS hot_rate,
(select v[1] FROM regexp_matches(reloptions::text,E'fillfactor=(d+)') as
r(v) limit 1) AS fillfactor
from pg_stat_all_tables s
join pg_class c ON c.oid=relid
order by total_writes desc limit 50;
What is Heap-Only Tuples?
HOT does not cause index update.
HOT is only for non-indexed columns.
Big n_tup_hot_upd = good.
How to increase n_tup_hot_upd?
Write activity
32. $ select * from pg_stat_all_indexes where idx_scan = 0;
-[ RECORD 1 ]-+------------------------------------------
relid | 98242
indexrelid | 55732253
schemaname | public
relname | products
indexrelname | products_special2_idx
idx_scan | 0
idx_tup_read | 0
idx_tup_fetch | 0
pg_stat_all_indexes
33. $ select * from pg_stat_all_indexes where idx_scan = 0;
...
indexrelname | products_special2_idx
idx_scan | 0 0 = bad
...
Unused indexes are bad.
Uses storage.
Slow down UPDATE, DELETE, INSERT operations.
Extra work for VACUUM.
Unused indexes
34. $ select * from pg_stat_all_indexes where idx_scan = 0;
...
indexrelname | products_special2_idx
idx_scan | 0 0 = bad
...
Unused indexes are bad.
Uses storage.
Slow down UPDATE, DELETE, INSERT operations.
Extra work for VACUUM.
https://goo.gl/0qXDjl
http://goo.gl/5QxTm4
Unused indexes
38. $ select * from pg_stat_activity;
...
backend_start | 2015-10-14 15:18:03.01039+00
xact_start | 2015-10-14 15:21:15.336325+00
query_start | 2015-10-14 15:21:30.336325+00
state_change | 2015-10-14 15:21:30.33635+00
...
Long queries and xacts
39. $ select * from pg_stat_activity;
...
backend_start | 2015-10-14 15:18:03.01039+00
xact_start | 2015-10-14 15:21:15.336325+00
query_start | 2015-10-14 15:21:30.336325+00
state_change | 2015-10-14 15:21:30.33635+00
...
$ select
client_addr, usename, datname,
clock_timestamp() - xact_start as xact_age,
clock_timestamp() - query_start as query_age,
query
from pg_stat_activity order by xact_start, query_start;
Long queries and xacts
40. $ select * from pg_stat_activity;
...
backend_start | 2015-10-14 15:18:03.01039+00
xact_start | 2015-10-14 15:21:15.336325+00
query_start | 2015-10-14 15:21:30.336325+00
state_change | 2015-10-14 15:21:30.33635+00
...
$ select
client_addr, usename, datname,
clock_timestamp() - xact_start as xact_age,
clock_timestamp() - query_start as query_age,
query
from pg_stat_activity order by xact_start, query_start;
clock_timestamp() for calculating query or transaction age.
Long queries: remember, terminate, optimize.
Long queries and xacts
41. $ select * from pg_stat_activity where state in
('idle in transaction', 'idle in transaction (aborted)';
...
xact_start | 2015-10-14 15:21:21.128192+00
query_start | 2015-10-14 15:21:30.336325+00
state_change | 2015-10-14 15:21:30.33635+00
state | idle in transaction
...
Bad xacts
42. $ select * from pg_stat_activity where state in
('idle in transaction', 'idle in transaction (aborted)';
...
xact_start | 2015-10-14 15:21:21.128192+00
query_start | 2015-10-14 15:21:30.336325+00
state_change | 2015-10-14 15:21:30.33635+00
state | idle in transaction
...
idle in transaction, idle in transaction (aborted) = bad
Warning value: > 5
clock_timestamp() for calculate xact age.
Bad xacts: remember, terminate, optimize app.
Bad xacts
43. $ select * from pg_stat_activity where waiting;
...
xact_start | 2015-10-14 15:21:21.128192+00
query_start | 2015-10-14 15:21:30.336325+00
state_change | 2015-10-14 15:21:30.33635+00
waiting | t
...
Waiting clients
44. $ select * from pg_stat_activity where waiting;
...
xact_start | 2015-10-14 15:21:21.128192+00
query_start | 2015-10-14 15:21:30.336325+00
state_change | 2015-10-14 15:21:30.33635+00
waiting | t
...
waiting = true = bad.
clock_timestamp() for calculating query or xact age.
Enable log_lock_waits GUC, examine server logs.
Use pg_locks for searching blocking query or xact.
Waiting queries: remember, terminate, optimize app.
Waiting clients
47. $ select * from pg_stat_statements;
...
query | SELECT "id" FROM run_plan_xact(?)
calls | 11165832
total_time | 11743325.6880088
rows | 11165832
blk_read_time | 495425.535999976
blk_write_time | 0
Statements average time in ms
$ select (sum(total_time) / sum(calls))::numeric(6,3)
from pg_stat_statements;
The most writing (to shared_buffers) queries
$ select query, shared_blks_dirtied
from pg_stat_statements
where shared_blks_dirtied > 0 order by 2 desc;
pg_stat_statements
48. query total time: 15:43:07 (14.9%, CPU: 18.2%, IO: 9.0%)
сalls: 476 (0.00%) rows: 476,000
avg_time: 118881.54ms (IO: 21.2%)
user: app_user db: ustats
query: SELECT
filepath, type, deviceuid
FROM imvevents
WHERE
state = ?::eventstate
AND servertime BETWEEN $1 AND $2
ORDER BY servertime DESC LIMIT $3 OFFSET $4
https://goo.gl/6025wZ
Query reports
49. query total time: 15:43:07 (14.9%, CPU: 18.2%, IO: 9.0%)
сalls: 476 (0.00%) rows: 476,000
avg_time: 118881.54ms (IO: 21.2%)
user: app_user db: ustats
query: SELECT
filepath, type, deviceuid
FROM imvevents
WHERE
state = ?::eventstate
AND servertime BETWEEN $1 AND $2
ORDER BY servertime DESC LIMIT $3 OFFSET $4
Use sum() for calculating totals.
Calculate queries «contribution» in totals.
Resource usage (CPU, IO).
Query reports
50. pg_statio_all_tables, pg_statio_all_indexes.
pg_stat_user_functions.
Size functions - df *size*
pgstattuple (in official contribs package)
● Bloat estimation for tables and indexes.
● Estimation time depends on table (or index) size.
pg_buffercache (in official contribs package)
● Shared buffers inspection.
● Heavy performance impact (buffers lock).
Behind this talk
51. pgfincore (3rd party module)
● Low-level operations with tables using mincore().
● OS page cache inspection.
pg_stat_kcache (3rd party module)
● Using getrusage() before and after query.
● CPU usage and real filesystem operations stats.
● Requires pg_stat_statements and postgresql >= 9.4.
● No performance impact.
Behind this talk
52. ● The ability to use statistics is useful.
● Statistics are not difficult.
● Statistics help to answer the questions.
● Do experiments.
Resume