Lesson 1: Spin locks

The most basic primitive for locking is the spinlock.

static DEFINE_SPINLOCK(xxx_lock);

	unsigned long flags;

	spin_lock_irqsave(&xxx_lock, flags);
	... critical section here ..
	spin_unlock_irqrestore(&xxx_lock, flags);

The above is always safe. It will disable interrupts _locally_, but the
spinlock itself will guarantee the global lock, so it will guarantee that
there is only one thread-of-control within the region(s) protected by that
lock. This works well even under UP (uniprocessor), where the above
sequence is essentially the same as doing

	unsigned long flags;

	save_flags(flags); cli();
	 ... critical section ...
	restore_flags(flags);

so the code does _not_ need to worry about UP vs SMP issues: the spinlocks
work correctly under both (and spinlocks are actually more efficient on
architectures that allow doing the "save_flags + cli" in one operation).
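
As a rough illustration of why the sequence saves and restores the flags
instead of blindly re-enabling interrupts, here is a small userspace model.
It is only a sketch: model_irqs_on, model_irq_save() and
model_irq_restore() are made-up names standing in for the CPU's interrupt
flag and the save_flags/cli/restore_flags steps, and nothing in it touches
real interrupts.

```c
#include <stdbool.h>

/* Hypothetical stand-in for this CPU's interrupt-enable flag. */
static bool model_irqs_on = true;

/* Models "save_flags(flags); cli();": remember the old state, then disable. */
unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_on ? 1UL : 0UL;

	model_irqs_on = false;
	return flags;
}

/* Models "restore_flags(flags);": put back whatever state was saved. */
void model_irq_restore(unsigned long flags)
{
	model_irqs_on = (flags != 0);
}

/*
 * Nest one save/restore section inside another and report whether
 * interrupts are still off after the inner section ends.
 */
bool nested_save_restore_stays_off(void)
{
	unsigned long outer = model_irq_save();
	unsigned long inner = model_irq_save();
	bool still_off;

	model_irq_restore(inner);	/* restores "off", not "on" */
	still_off = !model_irqs_on;
	model_irq_restore(outer);	/* restores the original "on" */
	return still_off;
}

/* The same nesting with a bare cli()/sti() pair modelled as plain stores. */
bool nested_cli_sti_stays_off(void)
{
	bool still_off;

	model_irqs_on = false;		/* outer cli() */
	model_irqs_on = false;		/* inner cli() */
	model_irqs_on = true;		/* inner sti(): on again too early */
	still_off = !model_irqs_on;
	model_irqs_on = true;		/* outer sti() */
	return still_off;
}
```

In this model the save/restore pair nests cleanly, while the inner sti()
turns interrupts back on in the middle of the outer section.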

   NOTE! Implications of spin_locks for memory are further described in:

     Documentation/memory-barriers.txt
       (5) LOCK operations.
       (6) UNLOCK operations.

The above is usually pretty simple (you usually need and want only one
spinlock for most things - using more than one spinlock can make things a
lot more complex and even slower, and is usually worth it only for
sequences that you _know_ need to be split up: avoid it at all costs if you
aren't sure). HOWEVER, it _does_ mean that if you have some code that does

	cli();
	.. critical section ..
	sti();

and another sequence that does

	spin_lock_irqsave(&xxx_lock, flags);
	.. critical section ..
	spin_unlock_irqrestore(&xxx_lock, flags);

then they are NOT mutually exclusive, and the critical regions can happen
at the same time on two different CPUs. That's fine per se, but the
critical regions had better be critical for different things (i.e. they
can't stomp on each other).

The above is a problem mainly if you end up mixing code - for example the
routines in ll_rw_block() tend to use cli/sti to protect the atomicity of
their actions, and if a driver uses spinlocks instead then you should
think about issues like the above.

This is the only really hard part about spinlocks: once you start
using spinlocks they tend to expand to areas you might not have noticed
before, because you have to make sure the spinlocks correctly protect the
shared data structures _everywhere_ they are used. The spinlocks are most
easily added to places that are completely independent of other code (for
example, internal driver data structures that nobody else ever touches).

   NOTE! The spin-lock is safe only when you _also_ use the lock itself
   to do locking across CPUs, which implies that EVERYTHING that
   touches a shared variable has to agree about the spinlock they want
   to use.
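
The point about agreeing on the lock can be sketched in userspace with a
toy spinlock built on C11 atomics. This is an illustration only, not the
kernel's spinlock_t: struct toy_spinlock, the toy_spin_*() helpers and
the two lock variables are all hypothetical names. A try-lock is used so
that the check is deterministic and never actually spins.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy userspace spinlock; NOT the kernel's spinlock_t. */
struct toy_spinlock {
	atomic_flag locked;
};

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was free and is now held by the caller. */
bool toy_spin_trylock(struct toy_spinlock *lock)
{
	return !atomic_flag_test_and_set_explicit(&lock->locked,
						  memory_order_acquire);
}

void toy_spin_lock(struct toy_spinlock *lock)
{
	while (!toy_spin_trylock(lock))
		;	/* spin until the holder releases the lock */
}

void toy_spin_unlock(struct toy_spinlock *lock)
{
	atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

static struct toy_spinlock list_lock = TOY_SPINLOCK_INIT;
static struct toy_spinlock other_lock = TOY_SPINLOCK_INIT;

/*
 * One code path holds list_lock. Report whether a second code path,
 * which uses second_path_lock, could enter its critical section at
 * the same time.
 */
bool second_path_can_enter(struct toy_spinlock *second_path_lock)
{
	bool entered;

	toy_spin_lock(&list_lock);	/* first path is inside */
	entered = toy_spin_trylock(second_path_lock);
	if (entered)
		toy_spin_unlock(second_path_lock);
	toy_spin_unlock(&list_lock);
	return entered;
}
```

Calling second_path_can_enter(&list_lock) reports that the second path is
kept out, while second_path_can_enter(&other_lock) reports that it gets
in: two paths that pick different locks do not exclude each other at all.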

----

Lesson 2: reader-writer spinlocks.

If your data accesses naturally consist mostly of reads from the shared
variables, the reader-writer lock (rw_lock) versions of the spinlocks are
sometimes useful. They allow multiple readers to be in the same critical
region at once, but anybody who wants to change the variables has to get
an exclusive write lock.

   NOTE! reader-writer locks require more atomic memory operations than
   simple spinlocks.  Unless the reader critical section is long, you
   are better off just using spinlocks.

The routines look the same as above:

   static DEFINE_RWLOCK(xxx_lock);

	unsigned long flags;

	read_lock_irqsave(&xxx_lock, flags);
	.. critical section that only reads the info ...
	read_unlock_irqrestore(&xxx_lock, flags);

	write_lock_irqsave(&xxx_lock, flags);
	.. read and write exclusive access to the info ...
	write_unlock_irqrestore(&xxx_lock, flags);

The above kind of lock may be useful for complex data structures like
linked lists, especially searching for entries without changing the list
itself.  The read lock allows many concurrent readers.  Anything that
_changes_ the list will have to get the write lock.

   NOTE! RCU is better for list traversal, but requires careful
   attention to design detail (see Documentation/RCU/listRCU.txt).

Also, you cannot "upgrade" a read-lock to a write-lock, so if you at _any_
time need to do any changes (even if you don't do it every time), you have
to get the write-lock at the very beginning.
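
To see why an upgrade cannot work, consider a toy reader-writer lock kept
in a single counter. This is a userspace sketch under assumed semantics
(a positive count is the number of readers, -1 is a writer, 0 is free),
not the kernel's rwlock_t, and every name in it is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy rwlock: count > 0 is that many readers, -1 a writer, 0 free. */
struct toy_rwlock {
	atomic_int count;
};

bool toy_read_trylock(struct toy_rwlock *rw)
{
	int old = atomic_load(&rw->count);

	while (old >= 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&rw->count, &old, old + 1))
			return true;
	}
	return false;	/* a writer holds the lock */
}

void toy_read_unlock(struct toy_rwlock *rw)
{
	atomic_fetch_sub(&rw->count, 1);
}

/* The write lock can only be taken when nobody at all holds the lock. */
bool toy_write_trylock(struct toy_rwlock *rw)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&rw->count, &expected, -1);
}

void toy_write_unlock(struct toy_rwlock *rw)
{
	atomic_store(&rw->count, 0);
}

/* Two readers may be inside the critical region at once. */
bool two_readers_coexist(void)
{
	struct toy_rwlock rw;
	bool first, second;

	atomic_init(&rw.count, 0);
	first = toy_read_trylock(&rw);
	second = toy_read_trylock(&rw);
	if (second)
		toy_read_unlock(&rw);
	if (first)
		toy_read_unlock(&rw);
	return first && second;
}

/* A read-lock holder asks for the write lock without dropping its own. */
bool upgrade_succeeds(void)
{
	struct toy_rwlock rw;
	bool upgraded;

	atomic_init(&rw.count, 0);
	(void)toy_read_trylock(&rw);		/* succeeds: lock was free */
	upgraded = toy_write_trylock(&rw);	/* needs 0, but sees 1 */
	if (upgraded)
		toy_write_unlock(&rw);
	else
		toy_read_unlock(&rw);
	return upgraded;
}
```

The would-be upgrader's own read count is what keeps the write lock
unavailable, so a spinning write_lock() in that position would wait on
itself forever.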

   NOTE! We are working hard to remove reader-writer spinlocks in most
   cases, so please don't add a new one without consensus.  (Instead, see
   Documentation/RCU/rcu.txt for complete information.)

----

Lesson 3: spinlocks revisited.

The single spin-lock primitives above are by no means the only ones. They
are the most safe ones, and the ones that work under all circumstances,
but partly _because_ they are safe they are also fairly slow. They are
much faster than a generic global cli/sti pair, but slower than they'd
need to be, because they do have to disable interrupts (which is just a
single instruction on an x86, but it's an expensive one - and on other
architectures it can be worse).

If you have a case where you have to protect a data structure across
several CPUs and you want to use spinlocks, you can potentially use
cheaper versions of the spinlocks. IFF you know that the spinlocks are
never used in interrupt handlers, you can use the non-irq versions:

	spin_lock(&lock);
	...
	spin_unlock(&lock);

(and the equivalent read-write versions too, of course). The spinlock will
guarantee the same kind of exclusive access, and it will be much faster.
This is useful if you know that the data in question is only ever
manipulated from a "process context", i.e. no interrupts involved.

The reason you mustn't use these versions if you have interrupts that
play with the spinlock is that you can get deadlocks:

	spin_lock(&lock);
	...
		<- interrupt comes in:
			spin_lock(&lock);

where an interrupt tries to lock an already locked variable. This is ok if
the other interrupt happens on another CPU, but it is _not_ ok if the
interrupt happens on the same CPU that already holds the lock, because the
lock will obviously never be released (because the interrupt is waiting
for the lock, and the lock-holder is interrupted by the interrupt and will
not continue until the interrupt has been processed). 
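
The scenario can be walked through with a single-CPU userspace model. It
is a sketch under loud assumptions: the "interrupt" is just a function
call made while the lock is held, cpu_irqs_on stands in for the local
interrupt flag, and the handler uses a try-lock so that the demo
terminates where a real spin_lock() would spin forever. All names are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static bool cpu_irqs_on = true;		/* this CPU's interrupt-enable flag */
static bool irq_pending;		/* raised while interrupts were off */
static bool handler_stuck;		/* handler found the lock already held */

/*
 * Model handler. It uses a try-lock so that the demo terminates; a real
 * spin_lock() here would spin forever, since the holder cannot run.
 */
void demo_irq_handler(void)
{
	if (atomic_flag_test_and_set(&demo_lock)) {
		handler_stuck = true;
		return;
	}
	atomic_flag_clear(&demo_lock);
}

/* Deliver an interrupt on this CPU now, or hold it until re-enabled. */
void demo_raise_irq(void)
{
	if (cpu_irqs_on)
		demo_irq_handler();
	else
		irq_pending = true;
}

/* Restore the saved interrupt state and deliver anything held off. */
void demo_irq_restore(bool saved)
{
	cpu_irqs_on = saved;
	if (cpu_irqs_on && irq_pending) {
		irq_pending = false;
		demo_irq_handler();
	}
}

/* spin_lock()-style: the lock is taken with interrupts still enabled. */
bool plain_lock_gets_stuck(void)
{
	handler_stuck = false;
	(void)atomic_flag_test_and_set(&demo_lock);	/* lock (was free) */
	demo_raise_irq();				/* interrupt, same CPU */
	atomic_flag_clear(&demo_lock);			/* unlock */
	return handler_stuck;
}

/* spin_lock_irqsave()-style: local interrupts are off while holding. */
bool irqsave_lock_gets_stuck(void)
{
	bool saved = cpu_irqs_on;

	handler_stuck = false;
	cpu_irqs_on = false;
	(void)atomic_flag_test_and_set(&demo_lock);	/* lock */
	demo_raise_irq();				/* held off for now */
	atomic_flag_clear(&demo_lock);			/* unlock */
	demo_irq_restore(saved);			/* handler runs here */
	return handler_stuck;
}
```

With the plain lock the handler finds the lock held by the very code it
interrupted; with the irqsave-style sequence the handler only runs after
the lock has been dropped.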

(This is also the reason why the irq-versions of the spinlocks only need
to disable the _local_ interrupts - it's ok to use spinlocks in interrupts
on other CPUs, because an interrupt on another CPU doesn't interrupt the
CPU that holds the lock, so the lock-holder can continue and eventually
release the lock).

Note that you can be clever with read-write locks and interrupts. For
example, if you know that the interrupt only ever gets a read-lock, then
you can use a non-irq version of read locks everywhere - because they
don't block on each other (and thus there is no dead-lock wrt interrupts).
But when you do the write-lock, you have to use the irq-safe version.
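
A single-CPU model makes the asymmetry visible. This is only a sketch:
the lock is a plain counter (positive for readers, -1 for a writer, 0 for
free), which is enough because everything here runs on one simulated CPU,
and the "interrupt" is an ordinary function call. The names are
hypothetical.

```c
#include <stdbool.h>

/* Single-CPU model: > 0 is that many readers, -1 a writer, 0 free. */
static int rw_count;
static bool reader_irq_stuck;

/* Model interrupt handler that only ever takes the read lock. */
void reader_irq_handler(void)
{
	if (rw_count < 0) {
		/* A writer holds it: a real read_lock() would spin forever. */
		reader_irq_stuck = true;
		return;
	}
	rw_count++;	/* read_lock() */
	/* ... look at the shared data ... */
	rw_count--;	/* read_unlock() */
}

/* Process context holds a non-irq read lock when the interrupt arrives. */
bool irq_during_read_lock_gets_stuck(void)
{
	reader_irq_stuck = false;
	rw_count++;		/* read_lock(), interrupts left enabled */
	reader_irq_handler();	/* interrupt on the same CPU */
	rw_count--;		/* read_unlock() */
	return reader_irq_stuck;
}

/* Process context holds a non-irq WRITE lock when the interrupt arrives. */
bool irq_during_plain_write_lock_gets_stuck(void)
{
	reader_irq_stuck = false;
	rw_count = -1;		/* write_lock() without disabling interrupts */
	reader_irq_handler();	/* interrupt on the same CPU */
	rw_count = 0;		/* write_unlock() */
	return reader_irq_stuck;
}
```

The reader in the handler simply joins the existing reader, but it gets
stuck behind a writer on its own CPU, which is why the write side needs
the irq-safe version.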

For an example of being clever with rw-locks, see the "waitqueue_lock" 
handling in kernel/sched.c - nothing ever _changes_ a wait-queue from
within an interrupt, they only read the queue in order to know whom to
wake up. So read-locks are safe (which is good: they are very common
indeed), while write-locks need to protect themselves against interrupts.

		Linus

----

Reference information:

For dynamic initialization, use spin_lock_init() or rwlock_init() as
appropriate:

   spinlock_t xxx_lock;
   rwlock_t xxx_rw_lock;

   static int __init xxx_init(void)
   {
	spin_lock_init(&xxx_lock);
	rwlock_init(&xxx_rw_lock);
	...
   }

   module_init(xxx_init);

For static initialization, use DEFINE_SPINLOCK() / DEFINE_RWLOCK() or
__SPIN_LOCK_UNLOCKED() / __RW_LOCK_UNLOCKED() as appropriate.

SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated because they
interfere with lockdep state tracking.

Most of the time, you can simply turn:
	static spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED;
into:
	static DEFINE_SPINLOCK(xxx_lock);

Static structure member variables go from:

	struct foo bar = {
		.lock	=	SPIN_LOCK_UNLOCKED,
	};

to:

	struct foo bar = {
		.lock	=	__SPIN_LOCK_UNLOCKED(bar.lock),
	};

Declarations of static rw_locks undergo a similar transformation.