Maintaining Terabytes
   with Postgres

          Selena Deckelmann
     Emma, Inc - http://myemma.com
 PostgreSQL Global Development Group
http://tech.myemma.com
   @emmaemailtech
Environment at Emma

• 1.6 TB, 1 cluster, Version 8.2 (RAID10)
• 1.1 TB, 2 clusters, Version 8.3 (RAID10)
• 8.4, 9.0 Dev
• Putting 9.0 into production (May 2011)
• pgpool, Redis, RabbitMQ, NFS
Other stats

• daily peaks: ~3000 commits per second
• average writes: 4 MBps
• average reads: 8 MBps
• From benchmarks we’ve done, load is
  pushing the limits of our hardware.
I say all of this with
         love.
Huge catalogs

• 409,994 tables
• Minor mistake in parent table definitions
 • Parent table updates take 30+ minutes
not null default nextval('important_sequence'::regclass)

                            vs

not null default nextval('important_sequence'::text)
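The difference matters at this scale: the regclass cast binds the default to the sequence's OID when the table is defined, while the text cast is re-resolved by name at call time, and the mismatched form made parent-table DDL across hundreds of thousands of children painfully slow. A hedged way to audit which form each default actually uses (query is a sketch; it assumes nothing beyond the system catalogs):

```sql
-- Sketch: list column defaults that call nextval(), showing which
-- cast form ('...'::regclass vs '...'::text) each one was stored with
SELECT c.relname,
       a.attname,
       pg_get_expr(d.adbin, d.adrelid) AS default_expr
FROM pg_attrdef d
JOIN pg_class c     ON c.oid = d.adrelid
JOIN pg_attribute a ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE pg_get_expr(d.adbin, d.adrelid) LIKE 'nextval%';
```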
Huge catalogs

• Bloat in the catalog
 • User-provoked ALTER TABLE
 • VACUUM FULL of catalog takes 2+ hrs
Huge catalogs suck

• 9,019,868 total data points for table stats
• 4,550,770 total data points for index stats
• Stats collection is slow
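The object counts behind those figures can be reproduced from the stats views; the slide's totals are "data points" — roughly one tracked table or index multiplied by the per-object counters the collector keeps for each:

```sql
-- Rough sketch: how many objects the stats collector is tracking.
-- Multiply by the per-object counter columns to approximate the
-- "total data points" figures above.
SELECT (SELECT count(*) FROM pg_stat_all_tables)  AS tracked_tables,
       (SELECT count(*) FROM pg_stat_all_indexes) AS tracked_indexes;
```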
Disk Management

• $PGDATA:
 • pg_tblspc (TABLESPACES)
 • pg_xlog
 • global/pg_stats
 • wal for warm standby
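To see where each tablespace under pg_tblspc actually lives on disk, a version-specific sketch (spclocation is a pg_tablespace column in the 8.x/9.0 releases this deck covers; 9.2 and later replaced it with the pg_tablespace_location() function):

```sql
-- Where each tablespace lives on disk (8.x / 9.0-era catalogs)
SELECT spcname, spclocation FROM pg_tablespace;
```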
Problems we worked through
  with big Postgres schemas

• Bloat
• Backups
• System resource exhaustion
• Minor upgrades
• Major upgrades
• Transaction wraparound
Bloat Causes

• Frequent UPDATE patterns
• Frequent DELETEs without VACUUM
 • a terabyte of dead tuples
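Before reaching for the full bloat query, a quick first look at where dead tuples are piling up (n_dead_tup is available from 8.3 on):

```sql
-- Dead tuples accumulated per table, worst offenders first
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```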
Bloat query

SELECT
   schemaname, tablename, reltuples::bigint, relpages::bigint, otta,
   ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,
   CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,
   CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
   CASE WHEN relpages < otta THEN '0 bytes'::text ELSE (bs*(relpages-otta))::bigint || ' bytes' END AS wastedsize,
   iname, ituples::bigint, ipages::bigint, iotta,
   ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,
   CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,
   CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,
   CASE WHEN ipages < iotta THEN '0 bytes' ELSE (bs*(ipages-iotta))::bigint || ' bytes' END AS wastedisize
 FROM (
   SELECT
     schemaname, tablename, cc.reltuples, cc.relpages, bs,
     CEIL((cc.reltuples*((datahdr+ma-
        (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
     COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
     COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta
   FROM (
     SELECT
        ma,bs,schemaname,tablename,
        (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
        (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
     FROM (
        SELECT
          schemaname, tablename, hdr, ma, bs,
          SUM((1-null_frac)*avg_width) AS datawidth,
          MAX(null_frac) AS maxfracsum,
          hdr+(
            SELECT 1+count(*)/8
            FROM pg_stats s2
            WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
          ) AS nullhdr
        FROM pg_stats s, (
          SELECT
            (SELECT current_setting('block_size')::numeric) AS bs,
            CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
            CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
          FROM (SELECT version() AS v) AS foo
        ) AS constants
        GROUP BY 1,2,3,4,5
     ) AS foo
   ) AS rs
   JOIN pg_class cc ON cc.relname = rs.tablename
   JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
   LEFT JOIN pg_index i ON indrelid = cc.oid
   LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
 ) AS sml
 WHERE tablename = 'addr'
 ORDER BY wastedbytes DESC LIMIT 1;




            Use check_postgres.pl
https://github.com/bucardo/check_postgres/
Fixing bloat
• Wrote scripts to clean things up
 • VACUUM (for small amounts)
 • CLUSTER
 • TRUNCATE (data loss!)
 • Or most extreme: DROP/CREATE
• And then ran the scripts.
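The options above, from least to most disruptive, as a sketch (table and index names are hypothetical; CLUSTER ... USING is 8.3+ syntax):

```sql
-- Least disruptive: mark dead space reusable; no exclusive lock
VACUUM addr;

-- Rewrites the table in index order; takes an exclusive lock
CLUSTER addr USING addr_pkey;

-- Instant, but discards every row (data loss!)
TRUNCATE addr_staging;

-- Most extreme: drop and recreate the object entirely
DROP TABLE addr_staging;
```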
Backups


• pg_dump takes longer and longer

    backup   |    duration
 ------------+------------------
  2009-11-22 | 02:44:36.821475
  2009-11-23 | 02:46:20.003507
  2009-11-24 | 02:47:06.260705
  2009-12-06 | 07:13:04.174964
  2009-12-13 | 05:00:01.082676
  2009-12-20 | 06:24:49.433043
  2009-12-27 | 05:35:20.551477
  2010-01-03 | 07:36:49.651492
  2010-01-10 | 05:55:02.396163
  2010-01-17 | 07:32:33.277559
  2010-01-24 | 06:22:46.522319
  2010-01-31 | 10:48:13.060888
  2010-02-07 | 21:21:47.77618
  2010-02-14 | 14:32:04.638267
  2010-02-21 | 11:34:42.353244
  2010-02-28 | 11:13:02.102345
Backups

• pg_dump fails
 • patching pg_dump for SELECT ... LIMIT
 • Crank down shared_buffers
 • or...
http://seeifixedit.com/view/there-i-fixed-it/45
Install 32-bit Postgres and libraries on a 64-bit system.

    Install 64-bit Postgres/libs of the same version.

Copy “hot backup” from 32-bit sys over to 64-bit sys.

Run pg_dump from 64-bit version on 32-bit Postgres.
PSA

• Warm standby is not a backup
 • Hot backup instances
 • “You don’t have valid backups, you have
    valid restores.” (thanks @sarahnovotny)
 • Necessity is the mother of invention...
Ship WAL from Solaris x86 -> Linux
           It did work!
Running out of inodes
• UFS on Solaris
    “The only way to add more inodes to a UFS
    filesystem is:
    1. destroy the filesystem and create a new
    filesystem with a higher inode density
    2. enlarge the filesystem - growfs man page”
•   Solution 0: Delete files.

•   Solution 1: Sharding and bigger FS on Linux

•   Solution 2: ext4 (soon!)
Running out of
available file descriptors

• Too many open files by the database
• Pooling - pgpool-II or pgbouncer?
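Each backend holds file descriptors open for every relation it touches, so the backend count is the first thing to check against the configured ceiling. A minimal sketch:

```sql
-- Backends in use vs the configured connection ceiling
SELECT count(*) AS backends,
       current_setting('max_connections') AS max_connections
FROM pg_stat_activity;
```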
Minor upgrades


• Stop/start database
• CHECKPOINT before shutdown
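CHECKPOINT is a plain SQL command, not a function; issuing it just before a planned restart leaves the shutdown checkpoint with little left to flush:

```sql
-- Flush dirty buffers ahead of the restart so the shutdown checkpoint
-- (and any subsequent recovery) has less work to do
CHECKPOINT;
```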
Major Version upgrades

• Too much downtime to dump/restore
 • Write tools to migrate data
 • Trigger-based replication
 •  pg_upgrade
Transaction
wraparound avoidance
• autovacuum triggers are too small
 • 200,000 transactions (2 days)
 • Watch age(datfrozenxid)
 • Increase autovacuum_freeze_max_age
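The age check above is a one-liner against pg_database; autovacuum forces an anti-wraparound vacuum as this value approaches autovacuum_freeze_max_age:

```sql
-- Each database's distance from transaction-ID wraparound,
-- closest to the limit first
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```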
Managing terabytes: When Postgres gets big
Thanks!

• We’re hiring! - selena@myemma.com
• Emma’s Tech Blog: http://tech.myemma.com
• My blog: http://chesnok.com
• http://twitter.com/selenamarie
