Understanding PostgreSQL LWLocks



                  Jignesh Shah
           Staff Engineer, VMware Inc

              PgCon 2011 - Ottawa
About Myself

§  Joined VMware in 2010
     q  PostgreSQL performance on vSphere
§  Previously at Sun Microsystems from 2000-2010
     q  Database performance on Solaris/Sun systems
§  Work with the PostgreSQL performance community
     q  Scaling, bottlenecks using various workloads
§  My blog: http://jkshah.blogspot.com


                           © 2011 VMware Inc                             2
Content

v What are PostgreSQL LWLocks?
v Understanding modes of LWLock
v LWLocks API
v Architecture of the LWLocks framework
v Internals of the LWLock wait list
v Ways to monitor the events related to LWLocks
v Top LWLocks

What are PostgreSQL LWLocks?

§  LWLocks: the lightweight locks of PostgreSQL
§  Primary intent
     q  Mutually exclusive access to shared memory structures
§  Offer Shared and Exclusive modes
§  Not to be confused with PostgreSQL's regular (heavyweight) locks
§  PostgreSQL locks depend on LWLocks to protect their shared state
§  Implemented with SpinLockAcquire/SpinLockRelease and
    PGSemaphoreLock/PGSemaphoreUnlock

Understanding Modes of LWLocks

ExclusiveLock
§  For a particular lockid, only 1 exclusive lock can be held at a time
   (no shared locks and no other exclusive lock for the lockid)

SharedLock
•  For a particular lockid, 1 or more shared locks can be held at a time
   (no exclusive lock for the lockid)

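The two modes boil down to a small compatibility rule. The following is a minimal sketch in C, assuming a hypothetical `ModeLock` struct that only counts holders; the type and function names are illustrative and are not PostgreSQL's own (PostgreSQL names the modes `LW_EXCLUSIVE` and `LW_SHARED`).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode names for this sketch only. */
typedef enum { MODE_EXCLUSIVE, MODE_SHARED } ModeKind;

/* Hypothetical holder bookkeeping for a single lockid. */
typedef struct {
    int exclusive_holders;   /* 0 or 1 */
    int shared_holders;      /* 0 or more */
} ModeLock;

/* Returns true if a request in `mode` could be granted right now. */
static bool mode_can_grant(const ModeLock *lock, ModeKind mode)
{
    /* At most one exclusive holder, and never alongside shared holders. */
    assert(lock->exclusive_holders == 0 || lock->exclusive_holders == 1);
    assert(!(lock->exclusive_holders == 1 && lock->shared_holders > 0));

    if (mode == MODE_EXCLUSIVE)
        return lock->exclusive_holders == 0 && lock->shared_holders == 0;

    /* Shared requests only conflict with an exclusive holder. */
    return lock->exclusive_holders == 0;
}
```

With three shared holders, for example, `mode_can_grant` accepts another shared request and rejects an exclusive one.
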
LWLocks API

§  LWLockAssign(void)
       §  Gets a dynamic LWLock number (mostly at module load)
§  LWLockAcquire(LWLockId lockid, LWLockMode mode)
       §  Waits until the lock is acquired in EXCLUSIVE or SHARED mode
§  LWLockConditionalAcquire(LWLockId lockid, LWLockMode mode)
       §  Returns false if the lock is not available (non-blocking)
§  LWLockRelease(LWLockId lockid)
       §  Releases the lock
§  LWLockReleaseAll(void)
       §  Used after an ERROR to clean up
§  LWLockHeldByMe(LWLockId lockid)
       §  Debug support test

Architecture Flow of LWLocks

LWLockConditionalAcquire
•  Takes the lock if it is available, else returns false (non-blocking)

LWLockAcquire
•  Takes the lock if it is available, or else puts the caller in the wait queue (FIFO)
   and goes to sleep on a semaphore (and loops)
•  Only returns once the lock has been acquired (blocking)

LWLockRelease
•  Releases the lock and, if there are no more shared locks remaining, wakes up
   the next process waiting in the wait queue

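The flow above can be sketched as a single-threaded toy model in C. Everything here is an assumption made for illustration: the `FlowLock` struct, the `flow_*` function names, and the use of a plain waiter counter in place of the real FIFO queue, spinlock and semaphore. It is not PostgreSQL's implementation.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { FLOW_EXCLUSIVE, FLOW_SHARED } FlowMode;

/* Hypothetical lock state; the FIFO queue is reduced to a counter. */
typedef struct {
    bool exclusive_held;
    int  shared_held;
    int  waiters;
} FlowLock;

/* Non-blocking: take the lock if available, else return false. */
static bool flow_conditional_acquire(FlowLock *lock, FlowMode mode)
{
    if (lock->exclusive_held)
        return false;
    if (mode == FLOW_EXCLUSIVE) {
        if (lock->shared_held > 0)
            return false;
        lock->exclusive_held = true;
    } else {
        lock->shared_held++;
    }
    return true;
}

/* Blocking path, modeled without sleeping: on failure the caller is
 * counted as queued (a real backend would sleep and retry). */
static bool flow_acquire_or_enqueue(FlowLock *lock, FlowMode mode)
{
    if (flow_conditional_acquire(lock, mode))
        return true;
    lock->waiters++;
    return false;
}

/* Releases one hold; returns true if waiters should now be woken,
 * i.e. the lock is completely free and someone is queued. */
static bool flow_release(FlowLock *lock, FlowMode mode)
{
    if (mode == FLOW_EXCLUSIVE) {
        assert(lock->exclusive_held);
        lock->exclusive_held = false;
    } else {
        assert(lock->shared_held > 0);
        lock->shared_held--;
    }
    return !lock->exclusive_held && lock->shared_held == 0 && lock->waiters > 0;
}
```

In this model, two shared holders followed by a queued exclusive request lead to a wakeup only when the second shared holder releases.
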
Wait Queue List

•  The wait queue is protected by a mutex, which is accessed using spinlocks
•  When the lock is released and no shared holders remain, the process
   releasing the lock wakes the next waiter as follows
        •  If the next process in the wait queue is waiting for EXCLUSIVE
           mode, only that process is removed from the queue and woken up
        •  Otherwise the next process is waiting for SHARED mode, so as many
           consecutive SHARED-mode processes as possible are removed from
           the queue and woken up

Lock Acquisition Internals

•  Spins on the 'lock' mutex
•  Lock available?
        •  If not: adds itself to the wait queue list, releases the 'lock' mutex,
           sleeps on its process wait semaphore, then loops
        •  If yes: releases the 'lock' mutex

Lock Release Internals

•  Spins on the 'lock' mutex
•  Releases the lock
•  Shared locks still held?
        •  If not: wakes up the next process in the FIFO
        •  If that process is Shared, wakes up all sequential shared waiters
•  Releases the 'lock' mutex

LWLocks Wait Queue List FIFO Example

•  Consider something like an 80-20 Shared/Exclusive ratio

     S S S S S E S E S S S

     It will first wake up the 5 shared waiters at the head, then wait until all
     of them release the lock before waking up the process asking for
     Exclusive
     It will then wake only that exclusive one and, once it releases the
     lock, wake up the next 1 shared one

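The wakeup order in this example can be reproduced with a short C sketch. It assumes the queue is written head-first as a string of 'S' and 'E' characters (here "SSSSSESESSS", matching the description of five shared waiters ahead of the first exclusive one); the function names are hypothetical and the real code works on a linked list of processes instead.

```c
#include <assert.h>
#include <stddef.h>

/* Number of waiters woken together from the head of the queue:
 * one exclusive waiter, or a run of consecutive shared waiters.
 * The queue must contain only 'S' and 'E'. */
static size_t wake_batch_size(const char *queue)
{
    size_t n = 0;

    if (queue[0] == '\0')
        return 0;
    if (queue[0] == 'E')
        return 1;
    while (queue[n] == 'S')
        n++;
    return n;
}

/* Splits the whole queue into successive wake batches, assuming no new
 * waiters arrive. Stores batch sizes and returns how many batches. */
static size_t wake_schedule(const char *queue, size_t *sizes, size_t max_batches)
{
    size_t batches = 0;

    while (*queue != '\0' && batches < max_batches) {
        size_t n = wake_batch_size(queue);
        assert(n > 0);   /* fails on characters other than 'S' or 'E' */
        sizes[batches++] = n;
        queue += n;
    }
    return batches;
}
```

For "SSSSSESESSS" the schedule has five batches of sizes 5, 1, 1, 1 and 3: each exclusive waiter is woken alone, and each run of shared waiters is woken together.
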
Observations about LWLocks Wait Queue List

•  All operations on the wait queue list are serialized
        •  Not scalable on SMP architectures
•  Currently only FIFO is supported
•  No restriction on shared wakeups

LWLocks – Defined in lwlock.h
typedef enum LWLockId
{
    BufFreelistLock,            ShmemIndexLock,         OidGenLock,
    XidGenLock,                 ProcArrayLock,          SInvalReadLock,
    SInvalWriteLock,            WALInsertLock,          WALWriteLock,
    ControlFileLock,            CheckpointLock,         CLogControlLock,
    SubtransControlLock,        MultiXactGenLock,       MultiXactOffsetControlLock,
    MultiXactMemberControlLock, RelCacheInitLock,       BgWriterCommLock,
    TwoPhaseStateLock,          TablespaceCreateLock,   BtreeVacuumLock,
    AddinShmemInitLock,         AutovacuumLock,         AutovacuumScheduleLock,
    SyncScanLock,               RelationMappingLock,    AsyncCtlLock,
    AsyncQueueLock,             SerializableXactHashLock, SerializableFinishedListLock,
    SerializablePredicateLockListLock, OldSerXidLock,   SyncRepLock,
    /* Individual lock IDs end here */
    FirstBufMappingLock,
    FirstLockMgrLock = FirstBufMappingLock + NUM_BUFFER_PARTITIONS, /* 16 */
    FirstPredicateLockMgrLock = FirstLockMgrLock + NUM_LOCK_PARTITIONS,
    /* must be last except for MaxDynamicLWLock: */
    NumFixedLWLocks = FirstPredicateLockMgrLock + NUM_PREDICATELOCK_PARTITIONS,
    MaxDynamicLWLock = 1000000000
} LWLockId;

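The tail of the enum lays out three ranges of partition locks after the individually named ones. A hedged sketch of that arithmetic in C follows. It assumes 33 individually named locks (counted from the listing), takes the 16 for NUM_BUFFER_PARTITIONS from the slide's comment, and assumes the other two partition counts are also 16; all `ID_*` and `Id*` names are hypothetical stand-ins.

```c
#include <assert.h>

/* Assumed values; only the buffer partition count is annotated on the slide. */
#define ID_NUM_INDIVIDUAL_LOCKS          33
#define ID_NUM_BUFFER_PARTITIONS         16
#define ID_NUM_LOCK_PARTITIONS           16
#define ID_NUM_PREDICATELOCK_PARTITIONS  16

/* Same chaining as the end of the LWLockId enum. */
enum {
    IdFirstBufMappingLock       = ID_NUM_INDIVIDUAL_LOCKS,
    IdFirstLockMgrLock          = IdFirstBufMappingLock + ID_NUM_BUFFER_PARTITIONS,
    IdFirstPredicateLockMgrLock = IdFirstLockMgrLock + ID_NUM_LOCK_PARTITIONS,
    IdNumFixedLWLocks           = IdFirstPredicateLockMgrLock + ID_NUM_PREDICATELOCK_PARTITIONS
};

typedef enum {
    ID_RANGE_INDIVIDUAL,
    ID_RANGE_BUF_MAPPING,
    ID_RANGE_LOCK_MGR,
    ID_RANGE_PREDICATE_LOCK_MGR
} IdRange;

/* Classifies a fixed lock ID by the range it falls into. */
static IdRange id_range_of(int lockid)
{
    assert(lockid >= 0 && lockid < IdNumFixedLWLocks);

    if (lockid < IdFirstBufMappingLock)
        return ID_RANGE_INDIVIDUAL;
    if (lockid < IdFirstLockMgrLock)
        return ID_RANGE_BUF_MAPPING;
    if (lockid < IdFirstPredicateLockMgrLock)
        return ID_RANGE_LOCK_MGR;
    return ID_RANGE_PREDICATE_LOCK_MGR;
}
```

Under these assumptions the ranges start at 33, 49 and 65, with 81 fixed locks in total. The numeric IDs seen in a trace depend on the enum of the build being traced, so these offsets only apply to a build with exactly this listing.
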
Monitoring

LWLOCK_STAT
•  Special build with LWLOCK_STAT defined

LOCK_DEBUG
•  Puts debug messages in the server log

DYNAMIC TRACING (DTrace, SystemTap)
•  Recommended
•  Useful for finding hot locks
•  Not for production use, because it can make the server unstable

Dynamic Tracing for LWLocks

•  postgresql-lwlock-wait-start(lockid, mode)
•  postgresql-lwlock-wait-done(lockid, mode)
•  postgresql-lwlock-acquire(lockid, mode)
•  postgresql-lwlock-condacquire(lockid, mode)
•  postgresql-lwlock-condacquire-fail(lockid, mode)
•  postgresql-lwlock-release(lockid)

Example Monitoring using SystemTap (DBT2)

       LOCKNAME   LWID   M W/A       COUNT      SUM-TIME(us) MAX-TIME(us) AVG-TIME(us)
  WALInsertLock      7 Ex     W      14013           2746505       3955        195
   WALWriteLock      8 Ex     W      10006          25508653     286749       2549
    LockMgrLock     55 Ex     W       2035            203429       3323         99
    LockMgrLock     45 Ex     W        932             54297       2860         58
    LockMgrLock     54 Ex     W        673             24362       1062         36
  ProcArrayLock      4 Ex     W        515             15907        666         30
  ProcArrayLock      4 Sh     W        176              6064         97         34
    LockMgrLock     56 Ex     W        171              5826        376         34
CLogControlLock     11 Sh     W        111             22490       6127        202
    LockMgrLock     57 Ex     W        101              5524       1326         54
    LockMgrLock     59 Ex     W         79              2883        347         36
CLogControlLock     11 Ex     W         58              8543       4439        147
    LockMgrLock     49 Ex     W         57              1848         76         32
    LockMgrLock     47 Ex     W         57              3166       1468         55
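
The COUNT, SUM-TIME, MAX-TIME and AVG-TIME columns can be derived from pairs of wait-start and wait-done timestamps. The following C sketch shows one way to do that bookkeeping; the `WaitStats` struct and function names are hypothetical, and it assumes the tracing script reports both timestamps in microseconds for each wait. The averages in the table are consistent with integer division of SUM-TIME by COUNT (for WALInsertLock, 2746505 / 14013 gives 195).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-lock, per-mode accumulator. */
typedef struct {
    uint64_t count;    /* number of completed waits */
    uint64_t sum_us;   /* total time spent waiting */
    uint64_t max_us;   /* longest single wait */
} WaitStats;

/* Records one wait, given its wait-start and wait-done timestamps. */
static void wait_stats_add(WaitStats *s, uint64_t start_us, uint64_t done_us)
{
    assert(done_us >= start_us);
    uint64_t waited = done_us - start_us;

    s->count++;
    s->sum_us += waited;
    if (waited > s->max_us)
        s->max_us = waited;
}

/* Average wait in microseconds, truncated; 0 if nothing was recorded. */
static uint64_t wait_stats_avg_us(const WaitStats *s)
{
    return s->count == 0 ? 0 : s->sum_us / s->count;
}
```

Sorting such accumulators by count or by total wait time gives a "top locks" listing like the one above.
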




Top Locks - WALWriteLock

•  Protects writes to the WAL
•  Acquired when WAL records are flushed to disk
•  Acquired when a WAL log switch occurs
•  Improve the underlying storage of pg_xlog
•  synchronous_commit=off helps indirectly (commits do not wait for the
   flush to disk)
•  full_page_writes=off also helps reduce the stress (but is not
   recommended, since resiliency goes down)

Top Locks - WALInsertLock

•  Protects the WAL buffers
•  Increasing wal_buffers may help, though only to a certain extent
•  synchronous_commit=off will lead to increased pressure on this
   lock (not a bad thing)
•  However, not much can be done once it becomes a big problem,
   short of new commits to the code
•  full_page_writes=off certainly helps (again not recommended,
   since it reduces the resiliency of the database against write
   errors)

Top Locks - ProcArrayLock

•  Protects the ProcArray structure
•  It used to be that every transaction acquired this lock in exclusive
   mode before commit, causing it to be a top lock
•  Fixed in 9.0

Top Locks - SInvalReadLock

•  Protects the sinval array
•  Readers take SInvalReadLock in "Shared" mode
•  SICleanupQueue and other array-wide updates take SInvalReadLock in
   "Exclusive" mode to lock out all readers
•  Long wait times to acquire SInvalReadLock generally result
   when the shared buffer pool is being stressed
•  Increase shared_buffers in postgresql.conf to correspond to the
   active data size

Top Locks - CLogControlLock

•  Protects the CLogControl structure
•  Generally not a problem
•  If it shows up in the top lists, check:
        •  $PGDATA/pg_clog should be on a buffered file system

Example Monitoring using SystemTap (Sysbench simple read)

     LOCKNAME   LWID   M W/A       COUNT     SUM-TIME(us) MAX-TIME(us) AVG-TIME(us)
  LockMgrLock    45 Ex     W      85343        469682510      13152       5503
  LockMgrLock    57 Ex     W      57547         30903727       8313        537
  LockMgrLock    44 Ex     W        390            34061       1670         87
  LockMgrLock    59 Ex     W        375            41570       2032        110
  LockMgrLock    56 Ex     W        361            39685       1889        109
  LockMgrLock    47 Ex     W        344            24548       1564         71
  LockMgrLock    54 Ex     W        335            67770       2319        202
  LockMgrLock    50 Ex     W        325            44213       1690        136
  LockMgrLock    49 Ex     W        325            39280       1475        120
  LockMgrLock    55 Ex     W        323            39448       1584        122
  LockMgrLock    48 Ex     W        323            26982       1669         83




Top Locks - LockMgrLocks

•  Protect relations
•  A set of about 16 lock partitions by default handles all relations
•  Each relation is part of only one partition (irrespective of size)

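The "one partition per relation" point follows from choosing the partition with a hash of the lock's identity. A minimal C sketch of that kind of mapping is shown below; the `part_*` names are hypothetical, the hash value is taken as given, and the base lock ID is passed in as a parameter because it depends on the LWLockId enum of the build in question.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default number of lock partitions, per the slide. */
#define PART_NUM_LOCK_PARTITIONS 16u

/* Maps a hash code to a partition index in [0, 16). */
static uint32_t part_index(uint32_t hashcode)
{
    return hashcode % PART_NUM_LOCK_PARTITIONS;
}

/* Maps a hash code to the LWLock ID guarding its partition, given the
 * ID of the first lock manager partition lock. */
static uint32_t part_lwlock_id(uint32_t first_lock_mgr_lock, uint32_t hashcode)
{
    return first_lock_mgr_lock + part_index(hashcode);
}
```

Because the mapping depends only on the hash, a heavily used relation always lands on the same partition lock, which is consistent with a couple of LockMgrLock IDs dominating a trace while the rest stay quiet.
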
Other Locks - BufMappingLocks

•  Protect regions of buffers
•  A set of about 16 buffer regions by default covers the whole
   buffer pool
•  Only taken for shared access

Problems with LWLock

•  The overall system gravitates to certain top locks
•  A single mutex protects adding to and depleting the wait queue
        •  Not SMP scalable
        •  Chance for optimization out there
        •  Performance is limited to the serialized rate at which the
           locks are processed

Questions / More Information

v Email: jshah@vmware.com
v Learn more about PostgreSQL
    q  http://www.postgresql.org
v Blog: http://jkshah.blogspot.com


Understanding PostgreSQL LW Locks

  • 1. Understanding PostgreSQL LWLocks Jignesh Shah Staff Engineer, VMware Inc PgCon 2011 - Ottawa
  • 2. About MySelf §  Joined  VMware  in    2010     q   PostgreSQL  performance  on  vSphere   §  Previously  at  Sun  Microsystems  from  2000-­‐2010   q  Database  Performance  on  Solaris/Sun  Systems   §  Work  with    PostgreSQL  Performance  Community     q  Scaling,  BoIlenecks  using  various  workloads   §  My  Blog:  hIp://jkshah.blogspot.com         © 2011 VMware Inc 2
  • 3. Content v What  are  PostgreSQL  LWLocks  ?   v Understanding  modes  of  LWLock   v LWLocks  API   v Architecture  of  the  LWLocks  Framework     v   Internals  of  LWLock  Wait  List     v Ways  to  monitors  the  events  related  to  LWLocks   v Top  LWLocks.               © 2011 VMware Inc 3
  • 4. What are PostgreSQL LWLocks? §  LWLocks  –  Light  Weight  Locks  of  PostgreSQL   §  Primary  Intent   q  Mutually  Exclusive  access  to  shared  memory  structures   §  Offers  Shared  and  Exclusive  mode   §  Not  to  be  confused  with  the  PostgreSQL  Locks     §  PostgreSQL  Locks  depends  on  LWLocks  to  protect  its  shared   state   §  Uses  SpinLockAcquire/SpinLockRelease  and   PGSemaphoreLock/PGSemaphoreUnlock   © 2011 VMware Inc 4
  • 5. Understanding Modes of LWLocks ExclusiveLock   §    For  a  parcular  lockid    there  can  be  only  1  exclusivelock  held     (no  shared  locks  and  no  other  exclusive  lock  for  the  lockid)       SharedLock   •  For  a  parcular  lockid  there  can  be  1  or  more  SharedLock  held   (no  exclusive  lock  for  the  lockid)           © 2011 VMware Inc 5
  • 6. LWLocks API §  LWLockAssign(void)   §  Get  a  dynamic  LWLock  number  (mostly  at  module  load)   §  LWLockAcquire(LWLockId  lockid,  LWLockMode  mode)   §  Waits  ll  Lock  is  acquired    in  EXCLUSIVE  or  SHARED  mode   §  LWLockCondionalAcquire(LWLockId  lockid,  LWLockMode  mode)   §  Returns  False  if  it  not  available  (Non-­‐blocking)   §  LWLockRelease(LWLockId  lockid)   §  Releases  the  lock   §  LWLockReleaseAll  (void)   §  Used  aaer  an  ERROR  to  cleanup   §  LWLockHeldByMe  (LWLockId  lockid)   §  Debug  support  test       © 2011 VMware Inc 6
  • 7. Architecture Flow of LWLocks LWLockCondionalAcquire     •  Easy  lock  it  if  available  else  return  false  (Non-­‐blocking)     LWLockAcquire     •  Lock    if  available  or  else  put  me  in    wait  queue  (FIFO)  and    got     to  sleep  on  a  semaphore  (and  loop)   •  Only  return  when  lock  is  available  (Blocking)   LWLockRelease   •  Release  the  Lock  and  if  there  are  no  more  shared  locks   remaining  then  wake  up  the  next  process  waing  in  the  wait   queue         © 2011 VMware Inc 7
  • 8. Wait Queue List •  Wait  Queue  is  protected  by  a  mutex  which  is  accessed  using     SpinLocks   •  When  the  lock  is  released  and  there  are  no  more  shared  locks     pending  then  the  process  releasing  the  lock  will  wake  the  next     waiter  as  follows   •  If  the  next  process  in  wait  queue  is  waing  for  an   EXCLUSIVE  mode  ,  only  that  process  is  removed  from   the  queue  and  wakes  them  up   •  Or  since  the  next  process  is  waing  for  SHARED  mode,  it   will  try  to    remove    as  many  consecuve  SHARED  mode   processes  and  wake  them  up   © 2011 VMware Inc 8
  • 9. Lock Acquisition Internals     Spins  on    ‘lock’  mutex   Sleep  on    its  process   wait  semaphore       Lock   •  If  Not  Add  to  Wait  Queue  List     Available   •  Release  ‘lock’  mutex     Release    ‘lock’  mutex   © 2011 VMware Inc 9
  • 10. Lock Release Internals     Spins  on    ‘lock’  mutex       Release  Lock       •  Wake  Up    process  next  in  FIFO     Shared   •  If  it  is  Shared  wake  up  all   sequenal  shared  waiters Locks  ?   Release    ‘lock’  mutex   © 2011 VMware Inc 10
  • 11. LWLocks Wait Queue List FIFO Example •  Consider  something  80-­‐20  Shared  /Exclusive  rao           S S S S E S E S S S S     It  will  wake  up  first  5  shared  locks  then  wait  ll  all  of  them   releases  the  lock  before  waking  up  the  process  asking   for  Exclusive   It  will  do  only  exclusive  one  then  and  one  it  releases  the   lock,  wakes  up  the  next  1  shared  one     © 2011 VMware Inc 11
  • 12. Observations about LWLocks Wait Queue List •  All  Operaons  on  Wait  Queue  List  are  serialized       •  Not  scalable  on  SMP  architecture   •  Currently  only  FIFO  supported       •  No  restricon  on  shared  wakeups         © 2011 VMware Inc 12
  • 13. LWLocks – Defined in lwlock.h typedef enum LWLockId { BufFreelistLock, ShmemIndexLock, OidGenLock,   XidGenLock, SInvalWriteLock, ProcArrayLock, WALInsertLock, SInvalReadLock, WALWriteLock, ControlFileLock, CheckpointLock, CLogControlLock,   SubtransControlLock, MultiXactGenLock, MultiXactOffsetControlLock, MultiXactMemberControlLock, RelCacheInitLock, BgWriterCommLock,   TwoPhaseStateLock, TablespaceCreateLock, BtreeVacuumLock, AddinShmemInitLock, AutovacuumLock, AutovacuumScheduleLock, SyncScanLock, RelationMappingLock, AsyncCtlLock, AsyncQueueLock, SerializableXactHashLock, SerializableFinishedListLock, SerializablePredicateLockListLock, OldSerXidLock, SyncRepLock, /* Individual lock IDs end here */ FirstBufMappingLock, FirstLockMgrLock = FirstBufMappingLock + NUM_BUFFER_PARTITIONS, /*16 */ FirstPredicateLockMgrLock = FirstLockMgrLock + NUM_LOCK_PARTITIONS, /* must be last except for MaxDynamicLWLock: */ NumFixedLWLocks = FirstPredicateLockMgrLock + NUM_PREDICATELOCK_PARTITIONS, MaxDynamicLWLock = 1000000000 } LWLockId; © 2011 VMware Inc 13
  • 14. Monitoring LWLOCK_STAT   •  Special  build  with  LWLOCK_STAT  defined     LOCK_DEBUG     •  Puts  debug  messages  in  system  alert  log     DYNAMIC  TRACING  (DTrace,  SystemTap)     •  Recommended   •  Useful  to  find  hot  locks   •  Not  for  producon  use,  because  it  can  make  the  server   unstable   © 2011 VMware Inc 14
  • 15. Dynamic Tracing for LWLocks
      • postgresql-lwlock-wait-start(lockid, mode)
      • postgresql-lwlock-wait-done(lockid, mode)
      • postgresql-lwlock-acquire(lockid, mode)
      • postgresql-lwlock-condacquire(lockid, mode)
      • postgresql-lwlock-condacquire-fail(lockid, mode)
      • postgresql-lwlock-release(lockid)
  • 16. Example Monitoring using Systemtap (DBT2)
      LOCKNAME         LWID  M   W/A  COUNT  SUM-TIME(us)  MAX-TIME(us)  AVG-TIME(us)
      WALInsertLock       7  Ex  W    14013       2746505          3955           195
      WALWriteLock        8  Ex  W    10006      25508653        286749          2549
      LockMgrLock        55  Ex  W     2035        203429          3323            99
      LockMgrLock        45  Ex  W      932         54297          2860            58
      LockMgrLock        54  Ex  W      673         24362          1062            36
      ProcArrayLock       4  Ex  W      515         15907           666            30
      ProcArrayLock       4  Sh  W      176          6064            97            34
      LockMgrLock        56  Ex  W      171          5826           376            34
      CLogControlLock    11  Sh  W      111         22490          6127           202
      LockMgrLock        57  Ex  W      101          5524          1326            54
      LockMgrLock        59  Ex  W       79          2883           347            36
      CLogControlLock    11  Ex  W       58          8543          4439           147
      LockMgrLock        49  Ex  W       57          1848            76            32
      LockMgrLock        47  Ex  W       57          3166          1468            55
  • 17. Top Locks - WALWriteLock
      • Protects writes to the WAL
      • Acquired when WAL records are flushed to disk
      • Acquired when a WAL log switch occurs
      • Improving the underlying storage of pg_xlog helps
      • synchronous_commit=off helps indirectly (commits do not wait for the flush to disk)
      • full_page_writes=off also helps reduce the stress (but is not recommended, since resiliency goes down)
  • 18. Top Locks - WALInsertLock
      • Protects WAL buffers
      • Increasing wal_buffers may help, though only to a certain extent
      • synchronous_commit=off leads to increased pressure on this lock (not a bad thing)
      • However, once it becomes a big problem, not much can be done without new commits
      • full_page_writes=off certainly helps (again not recommended, since it reduces the resiliency of the database against write errors)
  • 19. Top Locks - ProcArrayLock
      • Protects the ProcArray structure
      • Every transaction used to acquire this lock in exclusive mode before commit, which made it a top lock
      • Fixed in 9.0
  • 20. Top Locks - SInvalReadLock
      • Protects the sinval array
      • Readers take SInvalReadLock in "Shared" mode
      • SICleanupQueue and other array-wide updates take SInvalReadLock in "Exclusive" mode to lock out all readers
      • Long wait times to acquire SInvalReadLock generally result when the shared buffer pool is being stressed
      • Increase shared_buffers in postgresql.conf to correspond to the active data size
  • 21. Top Locks - CLogControlLock
      • Protects the CLogControl structure
      • Generally not a problem
      • If it shows up in the top lists, check:
          • $PGDATA/pg_clog should be on a buffered file system
  • 22. Example Monitoring using Systemtap (Sysbench simple read)
      LOCKNAME     LWID  M   W/A  COUNT  SUM-TIME(us)  MAX-TIME(us)  AVG-TIME(us)
      LockMgrLock    45  Ex  W    85343     469682510         13152          5503
      LockMgrLock    57  Ex  W    57547      30903727          8313           537
      LockMgrLock    44  Ex  W      390         34061          1670            87
      LockMgrLock    59  Ex  W      375         41570          2032           110
      LockMgrLock    56  Ex  W      361         39685          1889           109
      LockMgrLock    47  Ex  W      344         24548          1564            71
      LockMgrLock    54  Ex  W      335         67770          2319           202
      LockMgrLock    50  Ex  W      325         44213          1690           136
      LockMgrLock    49  Ex  W      325         39280          1475           120
      LockMgrLock    55  Ex  W      323         39448          1584           122
      LockMgrLock    48  Ex  W      323         26982          1669            83
  • 23. Top Locks - LockMgrLocks
      • Protects relations
      • A set of 16 lock partitions (by default) handles all relations
      • Each relation belongs to exactly one partition (irrespective of its size)
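
Because each relation maps to a single partition, all lock traffic for one hot relation lands on a single LockMgrLock. The sketch below illustrates the idea in C: a hash code for the locked object is reduced modulo the number of partitions and added to the first LockMgrLock ID. The value 44 for the first ID is an assumption inferred from the LockMgrLock IDs (44 to 59) in the SystemTap outputs in this deck, and the function names are hypothetical; how the hash code itself is computed is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LOCK_PARTITIONS 16  /* default number of lock partitions */
#define FIRST_LOCK_MGR_LOCK 44  /* assumed: lowest LockMgrLock ID seen in the tables */

/* Partition number (0..15) for a given lock hash code. */
int lock_partition(uint32_t hashcode)
{
    return (int) (hashcode % NUM_LOCK_PARTITIONS);
}

/* LWLock ID that protects the partition the hash code falls into. */
int lock_partition_lwlock_id(uint32_t hashcode)
{
    int id = FIRST_LOCK_MGR_LOCK + lock_partition(hashcode);

    /* The result always stays within the 16 partition locks. */
    assert(id >= FIRST_LOCK_MGR_LOCK &&
           id < FIRST_LOCK_MGR_LOCK + NUM_LOCK_PARTITIONS);
    return id;
}
```

Under these assumptions, a relation whose hash code is 45 always resolves to lock ID 57, no matter how many backends touch it, which is why a workload concentrated on one or two relations shows one or two LockMgrLock IDs far ahead of the rest.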
  • 24. Other Locks - BufMappingLocks
      • Protects regions of buffers
      • A set of 16 buffer regions (by default) handles the whole buffer pool
      • Only taken for shared access
  • 25. Problems with LWLock
      • The overall system gravitates to certain top locks
      • A single mutex protects both adding to and depleting the wait queue
      • Not SMP scalable
      • There is room for optimization
      • Performance is limited to the serialized rate at which the locks are processed
  • 26. Questions / More Information
      • Email: [email protected]
      • Learn more about PostgreSQL
          • http://www.postgresql.org
      • Blog: http://jkshah.blogspot.com