Before this fix, the following query:
SET GLOBAL offline_mode = ON
could cause the server to crash.
The root cause is actually complex, as described below.
1)
A session A is connected and performing network I/O,
typically waiting for the next command to execute.
The performance schema socket instrumentation is enabled,
so that pfs_start_socket_wait() / pfs_end_socket_wait()
are executed.
2)
A session B executes SET GLOBAL offline_mode = ON,
which terminates session A.
In particular, session B forcefully closes the socket used by session A.
3)
Session A and session B are different threads, but they both
execute performance schema instrumented code against the same socket.
4)
Because a socket is "owned" by a thread,
the instrumentation in pfs_start/end_socket_wait()
uses the same PFS_thread (of thread A) in both thread A and B.
This leads to race conditions when using member m_events_waits_current.
5)
Because PFS_thread::m_events_waits_current can be corrupted by race conditions,
the m_events_waits_current pointer can point outside of the waits array.
Using this pointer to populate current waits can damage other members
of the PFS_thread structure, most notably LF_HASH pins.
6)
Upon thread disconnect, using a corrupted LF_HASH pin when calling
lf_hash_put_pins leads to a crash.
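
Steps 4 and 5 can be illustrated with a minimal sketch. It is not the server code: ThreadState, its fields, and replay_lost_update() are hypothetical stand-ins, and the sketch assumes that starting a wait advances a cursor and ending a wait moves it back. It replays, in a single thread, one interleaving of two threads sharing the same per-thread state, and shows the cursor ending up outside the waits array.

```cpp
#include <cassert>

// Hypothetical stand-in for PFS_thread: a small array of wait records
// plus a cursor into it. The real structure is far larger.
constexpr int WAIT_STACK_SIZE = 4;

struct ThreadState {
  int waits[WAIT_STACK_SIZE] = {};
  int current = 0;  // stand-in for m_events_waits_current, kept as an index
};

// True when the cursor still designates a slot inside the waits array.
bool in_bounds(const ThreadState &t) {
  return t.current >= 0 && t.current < WAIT_STACK_SIZE;
}

// Replays one bad interleaving of threads A and B, both using the state
// of the socket owner. Runs sequentially, so the outcome is deterministic.
int replay_lost_update() {
  ThreadState owner;

  // "start wait" is modeled as: read the cursor, then write cursor + 1.
  int seen_by_a = owner.current;  // A reads 0
  int seen_by_b = owner.current;  // B reads 0
  owner.current = seen_by_a + 1;  // A writes 1
  owner.current = seen_by_b + 1;  // B writes 1, so A's increment is lost

  // "end wait" is modeled as: decrement the cursor.
  owner.current -= 1;  // A ends its wait, cursor is 0
  owner.current -= 1;  // B ends its wait, cursor is -1

  return owner.current;  // now outside [0, WAIT_STACK_SIZE)
}
```

In this model, any later write through the cursor would land outside the waits array, which is how neighbouring members could be overwritten.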
---
The fix for this issue is to use the current thread, not the socket owner,
in the performance schema socket instrumentation.
Also, asserts have been added to detect similar failures.
With the asserts, the original issue,
which was spurious and only occurred rarely,
is now detected systematically.
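
The idea behind the fix can be sketched as follows, under stated assumptions: WaitStack, Socket, start_socket_wait(), end_socket_wait(), and run_two_threads_on_one_socket() are hypothetical names for illustration, not the performance schema API. Each thread records waits in its own thread_local state, even when it operates on a socket owned by another thread, and asserts check that the cursor stays in range.

```cpp
#include <cassert>
#include <thread>

constexpr int MAX_WAIT_DEPTH = 4;

// Hypothetical per-thread wait bookkeeping.
struct WaitStack {
  int depth = 0;
};

// One instance per thread: the "current thread" state.
thread_local WaitStack current_thread_waits;

// Hypothetical socket handle; the instrumentation never follows it
// back to an owning thread.
struct Socket {
  int fd;
};

void start_socket_wait(Socket & /*socket*/) {
  WaitStack &t = current_thread_waits;  // current thread, not socket owner
  assert(t.depth >= 0 && t.depth < MAX_WAIT_DEPTH);  // cursor must be in range
  ++t.depth;
}

void end_socket_wait(Socket & /*socket*/) {
  WaitStack &t = current_thread_waits;
  assert(t.depth > 0);  // every end must match a start on this thread
  --t.depth;
}

// Two threads instrument the same socket concurrently. Each one touches
// only its own WaitStack, so there is no shared cursor to race on.
bool run_two_threads_on_one_socket() {
  Socket shared{42};
  bool a_ok = false;
  bool b_ok = false;
  auto work = [&shared](bool *ok) {
    for (int i = 0; i < 100000; ++i) {
      start_socket_wait(shared);
      end_socket_wait(shared);
    }
    *ok = (current_thread_waits.depth == 0);
  };
  std::thread a(work, &a_ok);
  std::thread b(work, &b_ok);
  a.join();
  b.join();
  return a_ok && b_ok;
}
```

If the instrumentation went back to using shared state, the asserts would be expected to fire as soon as the cursor left its valid range, instead of the failure surfacing later at disconnect.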