Atomic Ops
Bitmask Operations
David S. Miller
local_t is very similar to atomic_t. If the counter is per CPU and only
updated by one CPU, local_t is probably more appropriate. Please see
Documentation/local_ops.txt for the semantics of local_t.
The first operations to implement for atomic_t's are the initializers and
plain reads.
The initializer is atomic in that the return values of the atomic operations
are guaranteed to be correct reflecting the initialized value if the
initializer is used before runtime. If the initializer is used at runtime, a
proper implicit or explicit read memory barrier is needed before reading the
value with atomic_read from another thread.
The atomic_set() interface can be used at runtime, as in:

	k = kmalloc(sizeof(*k), GFP_KERNEL);
	if (!k)
		return -ENOMEM;
	atomic_set(&k->counter, 0);

The setting is atomic in that the return values of the atomic operations by
all threads are guaranteed to be correct reflecting either the value that has
been set with this operation or set with another operation. A proper implicit
or explicit memory barrier is needed before the value set with the operation
is guaranteed to be readable with atomic_read from another thread.
Next, we have:

	#define atomic_read(v)	((v)->counter)

which simply reads the counter value currently visible to the calling thread.
The read is atomic in that the return value is guaranteed to be one of the
values initialized or modified with the interface operations if a proper
implicit or explicit memory barrier is used after possible runtime
initialization by any other thread and the value is modified only with the
interface operations. atomic_read does not guarantee that the runtime
initialization by any other thread is visible yet, so the user of the
interface must take care of that with a proper implicit or explicit memory
barrier.
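
The visibility rules above can be approximated in userspace with C11
atomics. The following is a minimal sketch, not kernel code: every
sketch_* name is hypothetical, relaxed loads and stores stand in for
atomic_read() and atomic_set(), and a release/acquire pair on a published
pointer stands in for the "proper implicit or explicit memory barrier".

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's atomic_t. */
typedef struct { atomic_int counter; } sketch_atomic_t;

/* Atomic store with no ordering implied, like atomic_set(). */
static inline void sketch_atomic_set(sketch_atomic_t *v, int i)
{
	atomic_store_explicit(&v->counter, i, memory_order_relaxed);
}

/* Atomic load with no ordering implied, like atomic_read(). */
static inline int sketch_atomic_read(sketch_atomic_t *v)
{
	return atomic_load_explicit(&v->counter, memory_order_relaxed);
}

struct sketch_foo { sketch_atomic_t counter; };

/* Shared pointer through which the object is handed to readers. */
static struct sketch_foo *_Atomic sketch_published;

/* Writer: initialize first, then publish with release ordering. */
static void sketch_publish(struct sketch_foo *k)
{
	sketch_atomic_set(&k->counter, 0);
	atomic_store_explicit(&sketch_published, k, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above, so a
 * reader that sees the pointer also sees the initialized counter.
 * Returns -1 when nothing has been published yet. */
static int sketch_read_published(void)
{
	struct sketch_foo *k =
		atomic_load_explicit(&sketch_published, memory_order_acquire);

	return k ? sketch_atomic_read(&k->counter) : -1;
}
```

Without the release/acquire pair, a reader could observe the pointer and
still read a stale counter, which is the hazard described above.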
Some architectures may choose to use the volatile keyword, barriers, or inline
assembly to guarantee some degree of immediacy for atomic_read() and
atomic_set(). This is not uniformly guaranteed, and may change in the future,
so all users of atomic_t should treat atomic_read() and atomic_set() as simple
C statements that may be reordered or optimized away entirely by the compiler
or processor, and explicitly invoke the appropriate compiler and/or memory
barrier for each use case. Failure to do so will result in code that may
suddenly break when used with different architectures or compiler
optimizations, or even changes in unrelated code which changes how the
compiler optimizes the section accessing atomic_t variables.
For example, consider the following code:

	while (a > 0)
		do_something();

If the compiler can prove that do_something() does not store to the
variable a, then the compiler is within its rights transforming this to
the following:

	tmp = a;
	if (tmp > 0)
		for (;;)
			do_something();

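If you don't want the compiler to do this (and you probably don't), then
you should force a fresh load on each pass through the loop, for example
by testing ACCESS_ONCE(a) in the loop condition or by placing a barrier()
call in the loop body.

For another example, consider the following code:

	tmp_a = a;
	do_something_with(tmp_a);
	do_something_else_with(tmp_a);
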
If the compiler can prove that do_something_with() does not store to the
variable a, then the compiler is within its rights to manufacture an
additional load as follows:
	tmp_a = a;
	do_something_with(tmp_a);
	tmp_a = a;
	do_something_else_with(tmp_a);

This could fatally confuse your code if it expected the same value
to be passed to do_something_with() and do_something_else_with().

You can prevent the compiler from manufacturing the extra load as follows:

	tmp_a = ACCESS_ONCE(a);
	do_something_with(tmp_a);
	do_something_else_with(tmp_a);
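
For experimenting outside the kernel, the single-load idiom can be
approximated with a volatile cast. This is only a sketch: the macro
SKETCH_ACCESS_ONCE and the other sketch_* names are hypothetical, and the
macro relies on the GNU __typeof__ extension (GCC and Clang).

```c
/* Userspace approximation of ACCESS_ONCE(): the volatile cast forces
 * the compiler to emit exactly one access for this expression. */
#define SKETCH_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int sketch_a;		/* hypothetical shared variable */
static int sketch_seen[2];	/* values handed to the two consumers */

static void sketch_record(int slot, int val)
{
	sketch_seen[slot] = val;
}

static void sketch_use_snapshot(void)
{
	/* One load; both consumers are guaranteed the same value. */
	int tmp_a = SKETCH_ACCESS_ONCE(sketch_a);

	sketch_record(0, tmp_a);
	sketch_record(1, tmp_a);
}
```

The cast constrains only the compiler. It does not order the access
against other memory operations on the processor.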
For a final example, consider the following code, assuming that the
variable a is set at boot time before the second CPU is brought online
and never changed later, so that memory barriers are not needed:
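	if (a)
		b = 9;
	else
		b = 42;

The compiler is within its rights to manufacture an additional store
by transforming the above code into the following:

	b = 42;
	if (a)
		b = 9;

This could come as a fatal surprise to other code running concurrently
that expected b never to have the value 42 when a is nonzero. To prevent
the compiler from doing this, write something like:

	if (a)
		ACCESS_ONCE(b) = 9;
	else
		ACCESS_ONCE(b) = 42;
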
Don't even -think- about doing this without proper use of memory barriers,
locks, or atomic operations if variable a can change at runtime!
Now, we move onto the atomic operation interfaces typically implemented with
the help of assembly code.
The four routines atomic_add(), atomic_sub(), atomic_inc(), and
atomic_dec() add and subtract integral values to/from the given
atomic_t value. The first two routines take an explicit integer by
which to make the adjustment, whereas the latter two use an implicit
adjustment value of "1".
One very important aspect of these four routines is that they DO NOT
require any explicit memory barriers. They need only perform the
atomic_t counter update in an SMP safe manner.
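
As a rough userspace analogue, "atomic but without barriers" corresponds
to C11 read-modify-write operations with relaxed ordering. The sketch
below assumes that mapping; the sketch_* names are hypothetical and this
is not how any particular architecture implements the kernel routines.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's atomic_t. */
typedef struct { atomic_int counter; } sketch_atomic_t;

/* Explicit adjustment: SMP-safe update, no ordering implied. */
static inline void sketch_atomic_add(int i, sketch_atomic_t *v)
{
	atomic_fetch_add_explicit(&v->counter, i, memory_order_relaxed);
}

static inline void sketch_atomic_sub(int i, sketch_atomic_t *v)
{
	atomic_fetch_sub_explicit(&v->counter, i, memory_order_relaxed);
}

/* Implicit adjustment value of 1. */
static inline void sketch_atomic_inc(sketch_atomic_t *v)
{
	sketch_atomic_add(1, v);
}

static inline void sketch_atomic_dec(sketch_atomic_t *v)
{
	sketch_atomic_sub(1, v);
}
```

None of these helpers returns a value, which mirrors why the text above
places no barrier requirement on them.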
Next, we have:
Next:
Then:
The semantics for atomic_cmpxchg are the same as those defined for 'cas'
below.
Finally:
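	void smp_mb__before_atomic_dec(void);
	void smp_mb__after_atomic_dec(void);
	void smp_mb__before_atomic_inc(void);
	void smp_mb__after_atomic_inc(void);

For example, smp_mb__before_atomic_dec() can be used like so:

	obj->dead = 1;
	smp_mb__before_atomic_dec();
	atomic_dec(&obj->ref_count);

The barrier orders the "obj->dead = 1;" store before the atomic counter
decrement.
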
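
A userspace approximation of this pattern, assuming a C11 sequentially
consistent fence as a stand-in for smp_mb__before_atomic_dec() and a
relaxed decrement as a stand-in for atomic_dec(). The struct and function
names are hypothetical, and "dead" is made atomic here only so that the
sketch is free of data races under the C11 model.

```c
#include <stdatomic.h>

struct sketch_obj {
	atomic_int dead;
	atomic_int ref_count;
};

static void sketch_mark_dead_and_put(struct sketch_obj *obj)
{
	atomic_store_explicit(&obj->dead, 1, memory_order_relaxed);

	/* Stand-in for smp_mb__before_atomic_dec(): a full fence, so the
	 * store above is ordered before the decrement below. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Stand-in for atomic_dec(): no ordering of its own. */
	atomic_fetch_sub_explicit(&obj->ref_count, 1, memory_order_relaxed);
}
```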
A missing memory barrier in the cases where they are required by the
atomic_t implementation above can have disastrous results. Here is
an example, which follows a pattern occurring frequently in the Linux
kernel. It is the use of atomic counters to implement reference
counting, and it works such that once the counter falls to zero it can
be guaranteed that no other entity can be accessing the object:

	void obj_poke(void)
	{
		struct obj *obj;

		spin_lock(&global_list_lock);
		obj = obj_list_peek(&global_list);
		if (obj)
			atomic_inc(&obj->refcnt);	/* take our own reference */
		spin_unlock(&global_list_lock);

		if (obj) {
			obj->ops->poke(obj);
			if (atomic_dec_and_test(&obj->refcnt))
				obj_destroy(obj);
		}
	}

	void obj_timeout(struct obj *obj)
	{
		obj_list_del(obj);

		if (atomic_dec_and_test(&obj->refcnt))
			obj_destroy(obj);
	}

Given the above scheme, the obj->active update done by the obj list
deletion must be visible to other processors before the atomic counter
decrement is performed.
Otherwise, the counter could fall to zero, yet obj->active would still
be set, thus triggering the assertion in obj_destroy(). The error
sequence looks like this:

	cpu 0				cpu 1
	obj_poke()			obj_timeout()
	obj = obj_list_peek();
	... gains ref to obj, refcnt=2
					obj_list_del(obj);
					obj->active = 0 ...
					... visibility delayed ...
					atomic_dec_and_test()
					... refcnt drops to 1 ...
	atomic_dec_and_test()
	... refcount drops to 0 ...
	obj_destroy()
	BUG() triggers since obj->active
	still seen as one
					obj->active update visibility occurs

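
The ordering requirement can be sketched in userspace with C11 atomics.
This is an approximation under stated assumptions: all sketch_* names are
hypothetical, and acquire/release ordering on the decrement is the usual
C11 reference-counting idiom, which is weaker than the full barriers the
kernel interface requires on both sides of atomic_dec_and_test().

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_obj {
	atomic_int active;
	atomic_int refcnt;
};

/* Returns true when the count reached zero. The release half makes
 * earlier writes visible; the acquire half lets the final dropper
 * observe the writes of every earlier dropper. */
static bool sketch_dec_and_test(atomic_int *cnt)
{
	return atomic_fetch_sub_explicit(cnt, 1, memory_order_acq_rel) == 1;
}

/* Timeout path: clears "active" (as the list deletion would), then
 * drops its reference. Returns the "active" value seen at destroy
 * time, or -1 if this caller did not drop the last reference. */
static int sketch_timeout_put(struct sketch_obj *obj)
{
	atomic_store_explicit(&obj->active, 0, memory_order_relaxed);
	if (sketch_dec_and_test(&obj->refcnt))
		return atomic_load_explicit(&obj->active, memory_order_relaxed);
	return -1;
}

/* Poke path: drops the reference taken while peeking at the list. */
static int sketch_poke_put(struct sketch_obj *obj)
{
	if (sketch_dec_and_test(&obj->refcnt))
		return atomic_load_explicit(&obj->active, memory_order_relaxed);
	return -1;
}
```

Whichever path drops the last reference is then guaranteed to see
active == 0, which is the property the failing sequence above lacks.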
We will now cover the atomic bitmask operations. You will find that
their SMP and memory barrier semantics are similar in shape and scope
to the atomic_t ops above.
The routines set_bit(), clear_bit(), and change_bit() set, clear, and
change, respectively, the bit number indicated by "nr" on the bit mask
pointed to by "ADDR".
They must execute atomically, yet there are no implicit memory barrier
semantics required of these interfaces.
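
A userspace sketch of these three operations, assuming C11 relaxed
read-modify-write operations as the analogue of "atomic, but with no
implicit memory barrier". The sketch_* names and the SKETCH_BITS_PER_LONG
macro are hypothetical.

```c
#include <limits.h>
#include <stdatomic.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Mask selecting bit "nr" within its word. */
#define SKETCH_MASK(nr) (1UL << ((nr) % SKETCH_BITS_PER_LONG))

/* Index of the word that holds bit "nr". */
#define SKETCH_WORD(nr) ((nr) / SKETCH_BITS_PER_LONG)

static inline void sketch_set_bit(unsigned long nr, atomic_ulong *addr)
{
	atomic_fetch_or_explicit(&addr[SKETCH_WORD(nr)], SKETCH_MASK(nr),
				 memory_order_relaxed);
}

static inline void sketch_clear_bit(unsigned long nr, atomic_ulong *addr)
{
	atomic_fetch_and_explicit(&addr[SKETCH_WORD(nr)], ~SKETCH_MASK(nr),
				  memory_order_relaxed);
}

static inline void sketch_change_bit(unsigned long nr, atomic_ulong *addr)
{
	atomic_fetch_xor_explicit(&addr[SKETCH_WORD(nr)], SKETCH_MASK(nr),
				  memory_order_relaxed);
}
```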
The routines test_and_set_bit(), test_and_clear_bit(), and
test_and_change_bit() behave like the above, except that they return a
boolean which indicates whether the changed bit was set _BEFORE_ the
atomic bit operation.
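It is important that the return value be a strict boolean, i.e. "0" or
"1", rather than the raw result of masking the old word. For one thing,
a masked result gets truncated to int in many code paths using these
interfaces, so on 64-bit, if the bit is set in the upper 32 bits, testers
will never see it.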
One great example of where this problem crops up is the thread_info
flag operations. Routines such as test_and_set_ti_thread_flag() chop
the return value into an int. There are other places where things
like this occur as well.
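
The truncation hazard can be demonstrated with a small userspace sketch.
Both functions are hypothetical illustrations built on C11 atomics; the
"broken" variant narrows the masked word to 32 bits, standing in for a
caller that stores the result in an int.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Correct: returns a strict boolean telling whether the bit was set
 * before the operation. Assumes nr < 64. */
static bool sketch_test_and_set_bit(unsigned int nr, _Atomic uint64_t *word)
{
	uint64_t mask = UINT64_C(1) << nr;
	uint64_t old = atomic_fetch_or_explicit(word, mask,
						memory_order_seq_cst);

	return (old & mask) != 0;
}

/* Broken: returns the masked word narrowed to 32 bits, so any bit in
 * the upper half of the word always reads back as zero. */
static unsigned int sketch_test_and_set_bit_broken(unsigned int nr,
						   _Atomic uint64_t *word)
{
	uint64_t mask = UINT64_C(1) << nr;
	uint64_t old = atomic_fetch_or_explicit(word, mask,
						memory_order_seq_cst);

	return (unsigned int)(old & mask);
}
```

With bit 40 already set, the first function reports true while the second
reports 0, even though both performed the same atomic operation.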
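Because they return a value, these routines must provide explicit memory
barrier semantics around their execution. For example:

	obj->dead = 1;
	if (test_and_set_bit(0, &obj->flags))
		/* ... */;
	obj->killed = 1;

Here "obj->dead = 1;" must be visible before the atomic memory operation
done by test_and_set_bit(), which in turn must be visible before
"obj->killed = 1;".
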
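If explicit memory barriers are required around clear_bit() (which does
not return a value, and thus need not provide memory barrier semantics),
two interfaces are provided:

	void smp_mb__before_clear_bit(void);
	void smp_mb__after_clear_bit(void);
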
They are used as follows, and are akin to their atomic_t operation
brothers:
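
A userspace approximation of the usage pattern, assuming C11 sequentially
consistent fences as stand-ins for smp_mb__before_clear_bit() and
smp_mb__after_clear_bit(), and a relaxed atomic AND as a stand-in for
clear_bit(). The struct, its fields, and the function name are
hypothetical.

```c
#include <stdatomic.h>

struct sketch_obj {
	atomic_int ready;
	atomic_ulong flags;
};

static void sketch_clear_busy(struct sketch_obj *obj)
{
	atomic_store_explicit(&obj->ready, 1, memory_order_relaxed);

	/* Stand-in for smp_mb__before_clear_bit(): all memory operations
	 * before this point are ordered before the bit clear. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Stand-in for clear_bit(0, &obj->flags): no ordering of its own. */
	atomic_fetch_and_explicit(&obj->flags, ~1UL, memory_order_relaxed);

	/* Stand-in for smp_mb__after_clear_bit(): the bit clear is ordered
	 * before all subsequent memory operations. */
	atomic_thread_fence(memory_order_seq_cst);
}
```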
There are two special bitops with lock barrier semantics (acquire/release,
same as spinlocks): test_and_set_bit_lock() and clear_bit_unlock(). These
operate in the same way as their non-_lock/unlock postfixed variants,
except that they provide acquire and release semantics, respectively. This
means they can be used for bit_spin_trylock and bit_spin_unlock type
operations without specifying any more barriers.
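
A minimal userspace sketch of a bit lock built this way, assuming C11
acquire and release orderings as the analogue of the lock barrier
semantics described above. The function names are hypothetical and the
sketch assumes nr is smaller than the number of bits in an unsigned long.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Try to take the lock bit. Returns true on success. The acquire
 * ordering keeps the critical section from moving above the lock. */
static bool sketch_bit_trylock(unsigned int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;
	unsigned long old =
		atomic_fetch_or_explicit(word, mask, memory_order_acquire);

	return !(old & mask);
}

/* Drop the lock bit. The release ordering keeps the critical section
 * from moving below the unlock. */
static void sketch_bit_unlock(unsigned int nr, atomic_ulong *word)
{
	atomic_fetch_and_explicit(word, ~(1UL << nr), memory_order_release);
}
```

No extra fence is needed around either call, which is the point of the
_lock and _unlock variants.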
The routines xchg() and cmpxchg() need exactly the same memory barriers
as the atomic and bit operations returning values.
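For example, an atomic counter increment built on cas might loop like
this:

	while (1) {
		old = *counter;
		new = old + 1;
		if (cas(counter, old, new) == old)
			break;
	}
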
	went_to_zero = 0;
	while (1) {
		old = atomic_read(atomic);
		new = old - 1;
		if (new == 0) {
			went_to_zero = 1;
			spin_lock(lock);
		}
		ret = cas(atomic, old, new);
		if (ret == old)
			break;
		if (went_to_zero) {
			spin_unlock(lock);
			went_to_zero = 0;
		}
	}

	return went_to_zero;
}

Note that this also means that for the case where the counter
is not dropping to zero, there are no memory ordering
requirements.
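
The decrement-and-lock loop can be tried in userspace with a C11
compare-and-swap. This is a sketch under stated assumptions: the sketch_*
names are hypothetical, an atomic_flag spinlock stands in for the kernel
spinlock, and the compare-and-swap conservatively uses acquire/release
ordering on every success even though only the drop to zero needs it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Tiny test-and-set spinlock standing in for a kernel spinlock. */
struct sketch_lock { atomic_flag held; };

static void sketch_lock_acquire(struct sketch_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		;	/* spin until the previous holder releases */
}

static void sketch_lock_release(struct sketch_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Decrement *cnt. Returns true, with the lock held, only when the
 * counter dropped to zero; otherwise returns false without the lock. */
static bool sketch_dec_and_lock(atomic_long *cnt, struct sketch_lock *lock)
{
	long old = atomic_load_explicit(cnt, memory_order_relaxed);

	for (;;) {
		bool to_zero = (old - 1 == 0);

		if (to_zero)
			sketch_lock_acquire(lock);

		/* On failure, "old" is refreshed with the current value. */
		if (atomic_compare_exchange_strong_explicit(cnt, &old, old - 1,
							    memory_order_acq_rel,
							    memory_order_relaxed))
			return to_zero;

		/* Lost the race: undo the speculative lock and retry. */
		if (to_zero)
			sketch_lock_release(lock);
	}
}
```

Taking the lock before the compare-and-swap means that no other thread
can observe a zero counter without the lock already being held.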