diff options
Diffstat (limited to 'kernel/Documentation/RCU')
-rw-r--r-- | kernel/Documentation/RCU/RTFP.txt | 2 | ||||
-rw-r--r-- | kernel/Documentation/RCU/arrayRCU.txt | 20 | ||||
-rw-r--r-- | kernel/Documentation/RCU/lockdep.txt | 10 | ||||
-rw-r--r-- | kernel/Documentation/RCU/rcu_dereference.txt | 40 | ||||
-rw-r--r-- | kernel/Documentation/RCU/stallwarn.txt | 36 | ||||
-rw-r--r-- | kernel/Documentation/RCU/torture.txt | 39 | ||||
-rw-r--r-- | kernel/Documentation/RCU/trace.txt | 68 | ||||
-rw-r--r-- | kernel/Documentation/RCU/whatisRCU.txt | 14 |
8 files changed, 107 insertions, 122 deletions
diff --git a/kernel/Documentation/RCU/RTFP.txt b/kernel/Documentation/RCU/RTFP.txt index f29bcbc46..370ca006d 100644 --- a/kernel/Documentation/RCU/RTFP.txt +++ b/kernel/Documentation/RCU/RTFP.txt @@ -1496,7 +1496,7 @@ Canis Rufus and Zoicon5 and Anome and Hal Eisen" ,month="July" ,day="8" ,year="2006" -,note="\url{http://en.wikipedia.org/wiki/Read-copy-update}" +,note="\url{https://en.wikipedia.org/wiki/Read-copy-update}" ,annotation={ Wikipedia RCU page as of July 8 2006. [Viewed August 21, 2006] diff --git a/kernel/Documentation/RCU/arrayRCU.txt b/kernel/Documentation/RCU/arrayRCU.txt index 453ebe695..f05a9afb2 100644 --- a/kernel/Documentation/RCU/arrayRCU.txt +++ b/kernel/Documentation/RCU/arrayRCU.txt @@ -10,7 +10,19 @@ also be used to protect arrays. Three situations are as follows: 3. Resizeable Arrays -Each of these situations are discussed below. +Each of these three situations involves an RCU-protected pointer to an +array that is separately indexed. It might be tempting to consider use +of RCU to instead protect the index into an array, however, this use +case is -not- supported. The problem with RCU-protected indexes into +arrays is that compilers can play way too many optimization games with +integers, which means that the rules governing handling of these indexes +are far more trouble than they are worth. If RCU-protected indexes into +arrays prove to be particularly valuable (which they have not thus far), +explicit cooperation from the compiler will be required to permit them +to be safely used. + +That aside, each of the three RCU-protected pointer situations are +described in the following sections. Situation 1: Hash Tables @@ -36,9 +48,9 @@ Quick Quiz: Why is it so important that updates be rare when Situation 3: Resizeable Arrays Use of RCU for resizeable arrays is demonstrated by the grow_ary() -function used by the System V IPC code. The array is used to map from -semaphore, message-queue, and shared-memory IDs to the data structure -that represents the corresponding IPC construct. The grow_ary() +function formerly used by the System V IPC code. The array is used +to map from semaphore, message-queue, and shared-memory IDs to the data +structure that represents the corresponding IPC construct. The grow_ary() function does not acquire any locks; instead its caller must hold the ids->sem semaphore. diff --git a/kernel/Documentation/RCU/lockdep.txt b/kernel/Documentation/RCU/lockdep.txt index cd83d2348..da51d3068 100644 --- a/kernel/Documentation/RCU/lockdep.txt +++ b/kernel/Documentation/RCU/lockdep.txt @@ -47,11 +47,6 @@ checking of rcu_dereference() primitives: Use explicit check expression "c" along with srcu_read_lock_held()(). This is useful in code that is invoked by both SRCU readers and updaters. - rcu_dereference_index_check(p, c): - Use explicit check expression "c", but the caller - must supply one of the rcu_read_lock_held() functions. - This is useful in code that uses RCU-protected arrays - that is invoked by both RCU readers and updaters. rcu_dereference_raw(p): Don't check. (Use sparingly, if at all.) rcu_dereference_protected(p, c): @@ -64,11 +59,6 @@ checking of rcu_dereference() primitives: but retain the compiler constraints that prevent duplicating or coalescsing. This is useful when when testing the value of the pointer itself, for example, against NULL. - rcu_access_index(idx): - Return the value of the index and omit all barriers, but - retain the compiler constraints that prevent duplicating - or coalescsing. This is useful when when testing the - value of the index itself, for example, against -1. The rcu_dereference_check() check expression can be any boolean expression, but would normally include a lockdep expression. However, diff --git a/kernel/Documentation/RCU/rcu_dereference.txt b/kernel/Documentation/RCU/rcu_dereference.txt index ceb05da5a..c0bf2441a 100644 --- a/kernel/Documentation/RCU/rcu_dereference.txt +++ b/kernel/Documentation/RCU/rcu_dereference.txt @@ -25,21 +25,10 @@ o You must use one of the rcu_dereference() family of primitives for an example where the compiler can in fact deduce the exact value of the pointer, and thus cause misordering. -o Do not use single-element RCU-protected arrays. The compiler - is within its right to assume that the value of an index into - such an array must necessarily evaluate to zero. The compiler - could then substitute the constant zero for the computation, so - that the array index no longer depended on the value returned - by rcu_dereference(). If the array index no longer depends - on rcu_dereference(), then both the compiler and the CPU - are within their rights to order the array access before the - rcu_dereference(), which can cause the array access to return - garbage. - o Avoid cancellation when using the "+" and "-" infix arithmetic operators. For example, for a given variable "x", avoid "(x-x)". There are similar arithmetic pitfalls from other - arithmetic operatiors, such as "(x*0)", "(x/(x+1))" or "(x%1)". + arithmetic operators, such as "(x*0)", "(x/(x+1))" or "(x%1)". The compiler is within its rights to substitute zero for all of these expressions, so that subsequent accesses no longer depend on the rcu_dereference(), again possibly resulting in bugs due @@ -76,14 +65,15 @@ o Do not use the results from the boolean "&&" and "||" when dereferencing. For example, the following (rather improbable) code is buggy: - int a[2]; - int index; - int force_zero_index = 1; + int *p; + int *q; ... - r1 = rcu_dereference(i1) - r2 = a[r1 && force_zero_index]; /* BUGGY!!! */ + p = rcu_dereference(gp) + q = &global_q; + q += p != &oom_p1 && p != &oom_p2; + r1 = *q; /* BUGGY!!! */ The reason this is buggy is that "&&" and "||" are often compiled using branches. While weak-memory machines such as ARM or PowerPC @@ -94,14 +84,15 @@ o Do not use the results from relational operators ("==", "!=", ">", ">=", "<", or "<=") when dereferencing. For example, the following (quite strange) code is buggy: - int a[2]; - int index; - int flip_index = 0; + int *p; + int *q; ... - r1 = rcu_dereference(i1) - r2 = a[r1 != flip_index]; /* BUGGY!!! */ + p = rcu_dereference(gp) + q = &global_q; + q += p > &oom_p; + r1 = *q; /* BUGGY!!! */ As before, the reason this is buggy is that relational operators are often compiled using branches. And as before, although @@ -193,6 +184,11 @@ o Be very careful about comparing pointers obtained from pointer. Note that the volatile cast in rcu_dereference() will normally prevent the compiler from knowing too much. + However, please note that if the compiler knows that the + pointer takes on only one of two values, a not-equal + comparison will provide exactly the information that the + compiler needs to deduce the value of the pointer. + o Disable any value-speculation optimizations that your compiler might provide, especially if you are making use of feedback-based optimizations that take data collected from prior runs. Such diff --git a/kernel/Documentation/RCU/stallwarn.txt b/kernel/Documentation/RCU/stallwarn.txt index b57c0c1cd..0f7fb4298 100644 --- a/kernel/Documentation/RCU/stallwarn.txt +++ b/kernel/Documentation/RCU/stallwarn.txt @@ -26,12 +26,6 @@ CONFIG_RCU_CPU_STALL_TIMEOUT Stall-warning messages may be enabled and disabled completely via /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress. -CONFIG_RCU_CPU_STALL_INFO - - This kernel configuration parameter causes the stall warning to - print out additional per-CPU diagnostic information, including - information on scheduling-clock ticks and RCU's idle-CPU tracking. - RCU_STALL_DELAY_DELTA Although the lockdep facility is extremely useful, it does add @@ -101,15 +95,13 @@ interact. Please note that it is not possible to entirely eliminate this sort of false positive without resorting to things like stop_machine(), which is overkill for this sort of problem. -If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set, -more information is printed with the stall-warning message, for example: +Recent kernels will print a long form of the stall-warning message: INFO: rcu_preempt detected stall on CPU 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543 (t=65000 jiffies) -In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is -printed: +In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed: INFO: rcu_preempt detected stall on CPU 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D @@ -171,6 +163,23 @@ message will be about three times the interval between the beginning of the stall and the first message. +Stall Warnings for Expedited Grace Periods + +If an expedited grace period detects a stall, it will place a message +like the following in dmesg: + + INFO: rcu_sched detected expedited stalls on CPUs: { 1 2 6 } 26009 jiffies s: 1043 + +This indicates that CPUs 1, 2, and 6 have failed to respond to a +reschedule IPI, that the expedited grace period has been going on for +26,009 jiffies, and that the expedited grace-period sequence counter is +1043. The fact that this last value is odd indicates that an expedited +grace period is in flight. + +It is entirely possible to see stall warnings from normal and from +expedited grace periods at about the same time from the same run. + + What Causes RCU CPU Stall Warnings? So your kernel printed an RCU CPU stall warning. The next question is @@ -196,6 +205,13 @@ o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the behavior, you might need to replace some of the cond_resched() calls with calls to cond_resched_rcu_qs(). +o Booting Linux using a console connection that is too slow to + keep up with the boot-time console-message rate. For example, + a 115Kbaud serial console can be -way- too slow to keep up + with boot-time message rates, and will frequently result in + RCU CPU stall warning messages. Especially if you have added + debug printk()s. + o Anything that prevents RCU's grace-period kthreads from running. This can result in the "All QSes seen" console-log message. This message will include information on when the kthread last diff --git a/kernel/Documentation/RCU/torture.txt b/kernel/Documentation/RCU/torture.txt index dac02a621..118e7c176 100644 --- a/kernel/Documentation/RCU/torture.txt +++ b/kernel/Documentation/RCU/torture.txt @@ -166,40 +166,27 @@ test_no_idle_hz Whether or not to test the ability of RCU to operate in torture_type The type of RCU to test, with string values as follows: - "rcu": rcu_read_lock(), rcu_read_unlock() and call_rcu(). - - "rcu_sync": rcu_read_lock(), rcu_read_unlock(), and - synchronize_rcu(). - - "rcu_expedited": rcu_read_lock(), rcu_read_unlock(), and - synchronize_rcu_expedited(). + "rcu": rcu_read_lock(), rcu_read_unlock() and call_rcu(), + along with expedited, synchronous, and polling + variants. "rcu_bh": rcu_read_lock_bh(), rcu_read_unlock_bh(), and - call_rcu_bh(). - - "rcu_bh_sync": rcu_read_lock_bh(), rcu_read_unlock_bh(), - and synchronize_rcu_bh(). + call_rcu_bh(), along with expedited and synchronous + variants. - "rcu_bh_expedited": rcu_read_lock_bh(), rcu_read_unlock_bh(), - and synchronize_rcu_bh_expedited(). + "rcu_busted": This tests an intentionally incorrect version + of RCU in order to help test rcutorture itself. "srcu": srcu_read_lock(), srcu_read_unlock() and - call_srcu(). - - "srcu_sync": srcu_read_lock(), srcu_read_unlock() and - synchronize_srcu(). - - "srcu_expedited": srcu_read_lock(), srcu_read_unlock() and - synchronize_srcu_expedited(). + call_srcu(), along with expedited and + synchronous variants. "sched": preempt_disable(), preempt_enable(), and - call_rcu_sched(). - - "sched_sync": preempt_disable(), preempt_enable(), and - synchronize_sched(). + call_rcu_sched(), along with expedited, + synchronous, and polling variants. - "sched_expedited": preempt_disable(), preempt_enable(), and - synchronize_sched_expedited(). + "tasks": voluntary context switch and call_rcu_tasks(), + along with expedited and synchronous variants. Defaults to "rcu". diff --git a/kernel/Documentation/RCU/trace.txt b/kernel/Documentation/RCU/trace.txt index 08651da15..ec6998b1b 100644 --- a/kernel/Documentation/RCU/trace.txt +++ b/kernel/Documentation/RCU/trace.txt @@ -56,14 +56,14 @@ rcuboost: The output of "cat rcu/rcu_preempt/rcudata" looks as follows: - 0!c=30455 g=30456 pq=1/0 qp=1 dt=126535/140000000000000/0 df=2002 of=4 ql=0/0 qs=N... b=10 ci=74572 nci=0 co=1131 ca=716 - 1!c=30719 g=30720 pq=1/0 qp=0 dt=132007/140000000000000/0 df=1874 of=10 ql=0/0 qs=N... b=10 ci=123209 nci=0 co=685 ca=982 - 2!c=30150 g=30151 pq=1/1 qp=1 dt=138537/140000000000000/0 df=1707 of=8 ql=0/0 qs=N... b=10 ci=80132 nci=0 co=1328 ca=1458 - 3 c=31249 g=31250 pq=1/1 qp=0 dt=107255/140000000000000/0 df=1749 of=6 ql=0/450 qs=NRW. b=10 ci=151700 nci=0 co=509 ca=622 - 4!c=29502 g=29503 pq=1/0 qp=1 dt=83647/140000000000000/0 df=965 of=5 ql=0/0 qs=N... b=10 ci=65643 nci=0 co=1373 ca=1521 - 5 c=31201 g=31202 pq=1/0 qp=1 dt=70422/0/0 df=535 of=7 ql=0/0 qs=.... b=10 ci=58500 nci=0 co=764 ca=698 - 6!c=30253 g=30254 pq=1/0 qp=1 dt=95363/140000000000000/0 df=780 of=5 ql=0/0 qs=N... b=10 ci=100607 nci=0 co=1414 ca=1353 - 7 c=31178 g=31178 pq=1/0 qp=0 dt=91536/0/0 df=547 of=4 ql=0/0 qs=.... b=10 ci=109819 nci=0 co=1115 ca=969 + 0!c=30455 g=30456 cnq=1/0:1 dt=126535/140000000000000/0 df=2002 of=4 ql=0/0 qs=N... b=10 ci=74572 nci=0 co=1131 ca=716 + 1!c=30719 g=30720 cnq=1/0:0 dt=132007/140000000000000/0 df=1874 of=10 ql=0/0 qs=N... b=10 ci=123209 nci=0 co=685 ca=982 + 2!c=30150 g=30151 cnq=1/1:1 dt=138537/140000000000000/0 df=1707 of=8 ql=0/0 qs=N... b=10 ci=80132 nci=0 co=1328 ca=1458 + 3 c=31249 g=31250 cnq=1/1:0 dt=107255/140000000000000/0 df=1749 of=6 ql=0/450 qs=NRW. b=10 ci=151700 nci=0 co=509 ca=622 + 4!c=29502 g=29503 cnq=1/0:1 dt=83647/140000000000000/0 df=965 of=5 ql=0/0 qs=N... b=10 ci=65643 nci=0 co=1373 ca=1521 + 5 c=31201 g=31202 cnq=1/0:1 dt=70422/0/0 df=535 of=7 ql=0/0 qs=.... b=10 ci=58500 nci=0 co=764 ca=698 + 6!c=30253 g=30254 cnq=1/0:1 dt=95363/140000000000000/0 df=780 of=5 ql=0/0 qs=N... b=10 ci=100607 nci=0 co=1414 ca=1353 + 7 c=31178 g=31178 cnq=1/0:0 dt=91536/0/0 df=547 of=4 ql=0/0 qs=.... b=10 ci=109819 nci=0 co=1115 ca=969 This file has one line per CPU, or eight for this 8-CPU system. The fields are as follows: @@ -188,14 +188,14 @@ o "ca" is the number of RCU callbacks that have been adopted by this Kernels compiled with CONFIG_RCU_BOOST=y display the following from /debug/rcu/rcu_preempt/rcudata: - 0!c=12865 g=12866 pq=1/0 qp=1 dt=83113/140000000000000/0 df=288 of=11 ql=0/0 qs=N... kt=0/O ktl=944 b=10 ci=60709 nci=0 co=748 ca=871 - 1 c=14407 g=14408 pq=1/0 qp=0 dt=100679/140000000000000/0 df=378 of=7 ql=0/119 qs=NRW. kt=0/W ktl=9b6 b=10 ci=109740 nci=0 co=589 ca=485 - 2 c=14407 g=14408 pq=1/0 qp=0 dt=105486/0/0 df=90 of=9 ql=0/89 qs=NRW. kt=0/W ktl=c0c b=10 ci=83113 nci=0 co=533 ca=490 - 3 c=14407 g=14408 pq=1/0 qp=0 dt=107138/0/0 df=142 of=8 ql=0/188 qs=NRW. kt=0/W ktl=b96 b=10 ci=121114 nci=0 co=426 ca=290 - 4 c=14405 g=14406 pq=1/0 qp=1 dt=50238/0/0 df=706 of=7 ql=0/0 qs=.... kt=0/W ktl=812 b=10 ci=34929 nci=0 co=643 ca=114 - 5!c=14168 g=14169 pq=1/0 qp=0 dt=45465/140000000000000/0 df=161 of=11 ql=0/0 qs=N... kt=0/O ktl=b4d b=10 ci=47712 nci=0 co=677 ca=722 - 6 c=14404 g=14405 pq=1/0 qp=0 dt=59454/0/0 df=94 of=6 ql=0/0 qs=.... kt=0/W ktl=e57 b=10 ci=55597 nci=0 co=701 ca=811 - 7 c=14407 g=14408 pq=1/0 qp=1 dt=68850/0/0 df=31 of=8 ql=0/0 qs=.... kt=0/W ktl=14bd b=10 ci=77475 nci=0 co=508 ca=1042 + 0!c=12865 g=12866 cnq=1/0:1 dt=83113/140000000000000/0 df=288 of=11 ql=0/0 qs=N... kt=0/O ktl=944 b=10 ci=60709 nci=0 co=748 ca=871 + 1 c=14407 g=14408 cnq=1/0:0 dt=100679/140000000000000/0 df=378 of=7 ql=0/119 qs=NRW. kt=0/W ktl=9b6 b=10 ci=109740 nci=0 co=589 ca=485 + 2 c=14407 g=14408 cnq=1/0:0 dt=105486/0/0 df=90 of=9 ql=0/89 qs=NRW. kt=0/W ktl=c0c b=10 ci=83113 nci=0 co=533 ca=490 + 3 c=14407 g=14408 cnq=1/0:0 dt=107138/0/0 df=142 of=8 ql=0/188 qs=NRW. kt=0/W ktl=b96 b=10 ci=121114 nci=0 co=426 ca=290 + 4 c=14405 g=14406 cnq=1/0:1 dt=50238/0/0 df=706 of=7 ql=0/0 qs=.... kt=0/W ktl=812 b=10 ci=34929 nci=0 co=643 ca=114 + 5!c=14168 g=14169 cnq=1/0:0 dt=45465/140000000000000/0 df=161 of=11 ql=0/0 qs=N... kt=0/O ktl=b4d b=10 ci=47712 nci=0 co=677 ca=722 + 6 c=14404 g=14405 cnq=1/0:0 dt=59454/0/0 df=94 of=6 ql=0/0 qs=.... kt=0/W ktl=e57 b=10 ci=55597 nci=0 co=701 ca=811 + 7 c=14407 g=14408 cnq=1/0:1 dt=68850/0/0 df=31 of=8 ql=0/0 qs=.... kt=0/W ktl=14bd b=10 ci=77475 nci=0 co=508 ca=1042 This is similar to the output discussed above, but contains the following additional fields: @@ -237,42 +237,26 @@ o "ktl" is the low-order 16 bits (in hexadecimal) of the count of The output of "cat rcu/rcu_preempt/rcuexp" looks as follows: -s=21872 d=21872 w=0 tf=0 wd1=0 wd2=0 n=0 sc=21872 dt=21872 dl=0 dx=21872 +s=21872 wd0=0 wd1=0 wd2=0 wd3=5 n=0 enq=0 sc=21872 These fields are as follows: -o "s" is the starting sequence number. +o "s" is the sequence number, with an odd number indicating that + an expedited grace period is in progress. -o "d" is the ending sequence number. When the starting and ending - numbers differ, there is an expedited grace period in progress. - -o "w" is the number of times that the sequence numbers have been - in danger of wrapping. - -o "tf" is the number of times that contention has resulted in a - failure to begin an expedited grace period. - -o "wd1" and "wd2" are the number of times that an attempt to - start an expedited grace period found that someone else had - completed an expedited grace period that satisfies the +o "wd0", "wd1", "wd2", and "wd3" are the number of times that an + attempt to start an expedited grace period found that someone + else had completed an expedited grace period that satisfies the attempted request. "Our work is done." -o "n" is number of times that contention was so great that - the request was demoted from an expedited grace period to - a normal grace period. +o "n" is number of times that a concurrent CPU-hotplug operation + forced a fallback to a normal grace period. + +o "enq" is the number of quiescent states still outstanding. o "sc" is the number of times that the attempt to start a new expedited grace period succeeded. -o "dt" is the number of times that we attempted to update - the "d" counter. - -o "dl" is the number of times that we failed to update the "d" - counter. - -o "dx" is the number of times that we succeeded in updating - the "d" counter. - The output of "cat rcu/rcu_preempt/rcugp" looks as follows: diff --git a/kernel/Documentation/RCU/whatisRCU.txt b/kernel/Documentation/RCU/whatisRCU.txt index 88dfce182..dc49c6712 100644 --- a/kernel/Documentation/RCU/whatisRCU.txt +++ b/kernel/Documentation/RCU/whatisRCU.txt @@ -256,7 +256,9 @@ rcu_dereference() If you are going to be fetching multiple fields from the RCU-protected structure, using the local variable is of course preferred. Repeated rcu_dereference() calls look - ugly and incur unnecessary overhead on Alpha CPUs. + ugly, do not guarantee that the same pointer will be returned + if an update happened while in the critical section, and incur + unnecessary overhead on Alpha CPUs. Note that the value returned by rcu_dereference() is valid only within the enclosing RCU read-side critical section. @@ -362,7 +364,7 @@ uses of RCU may be found in listRCU.txt, arrayRCU.txt, and NMI-RCU.txt. }; DEFINE_SPINLOCK(foo_mutex); - struct foo *gbl_foo; + struct foo __rcu *gbl_foo; /* * Create a new struct foo that is the same as the one currently @@ -384,7 +386,7 @@ uses of RCU may be found in listRCU.txt, arrayRCU.txt, and NMI-RCU.txt. new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL); spin_lock(&foo_mutex); - old_fp = gbl_foo; + old_fp = rcu_dereference_protected(gbl_foo, lockdep_is_held(&foo_mutex)); *new_fp = *old_fp; new_fp->a = new_a; rcu_assign_pointer(gbl_foo, new_fp); @@ -485,7 +487,7 @@ The foo_update_a() function might then be written as follows: new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL); spin_lock(&foo_mutex); - old_fp = gbl_foo; + old_fp = rcu_dereference_protected(gbl_foo, lockdep_is_held(&foo_mutex)); *new_fp = *old_fp; new_fp->a = new_a; rcu_assign_pointer(gbl_foo, new_fp); @@ -879,11 +881,9 @@ SRCU: Initialization/cleanup All: lockdep-checked RCU-protected pointer access - rcu_access_index rcu_access_pointer - rcu_dereference_index_check rcu_dereference_raw - rcu_lockdep_assert + RCU_LOCKDEP_WARN rcu_sleep_check RCU_NONIDLE |