DragonFly BSD
DragonFly kernel List (threaded) for 2013-05
[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]

Deadlock while switching from bfq scheduler to another policy


From: tripun goel <tripun@xxxxxxxxx>
Date: Fri, 3 May 2013 18:02:33 +0530

Hi all,
It is a known bug, also documented in the code, that a deadlock can occur
when teardown and the helper thread are on the same CPU [1] (see
bfq_teardown() in bfq.c).
I found a possible cause and have an amateur solution for it.
When teardown sends a kill message [2] to the helper thread,
helper_msg_kill is called. It adds the kill message to the queue, behind
the other messages serialized by LWKT, and returns to the teardown
function. The helper thread will only receive the kill message once it has
executed all the messages queued before it.
After making this call, teardown destroys the message cache, so the helper
thread never receives the kill message and keeps executing, hence the
deadlock.
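
The ordering described above can be sketched in userspace. This is only a
toy FIFO, not the LWKT message port code: demo_queue, demo_send and
demo_drain_until_kill are hypothetical names made up for illustration. It
shows that a kill message appended to the queue is seen only after every
message queued ahead of it has been handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds; the real helper thread has its own set. */
enum demo_msg { DEMO_WORK, DEMO_KILL };

#define DEMO_QUEUE_CAP 32

/* Toy FIFO standing in for the serialized message queue. */
struct demo_queue {
    enum demo_msg slots[DEMO_QUEUE_CAP];
    size_t head;   /* next slot to read */
    size_t tail;   /* next slot to write */
};

/* Append a message at the tail; the sender returns immediately. */
static void demo_send(struct demo_queue *q, enum demo_msg m)
{
    assert(q->tail < DEMO_QUEUE_CAP);
    q->slots[q->tail++] = m;
}

/*
 * Handle messages in FIFO order. Returns the number of work messages
 * handled before the kill message was seen, or -1 if the queue ran
 * empty without a kill message.
 */
static int demo_drain_until_kill(struct demo_queue *q)
{
    int handled = 0;

    while (q->head < q->tail) {
        enum demo_msg m = q->slots[q->head++];

        if (m == DEMO_KILL)
            return handled;
        handled++;   /* a real handler would touch the message cache here */
    }
    return -1;
}
```

With three work messages queued before the kill message, the drain
function handles all three before it sees the kill. Anything the sender
frees in that window is still in use by the handler.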

Solution:
Either we kill the helper thread directly, without adding a message to the
LWKT queue, by using a global variable (which is usually a bad idea),
e.g.
if (kill == 1)  /* kill is a global variable, initialized to zero and set
                   to 1 by teardown */
        break;

or
we use a spinlock that makes teardown wait until the helper thread has
finished reading its messages, and only then release it.
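
A minimal userspace sketch of the second idea, assuming POSIX threads
instead of LWKT and a mutex plus condition variable instead of a kernel
spinlock. All names (struct helper, helper_main, send_msg, teardown_demo)
are hypothetical and not taken from bfq.c. Teardown queues the kill
message, then blocks until the helper reports that it has drained the
queue and exited, and only then frees the cache.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

enum helper_msg { HMSG_WORK, HMSG_KILL };

#define QCAP 16

struct helper {
    pthread_mutex_t lock;
    pthread_cond_t  cv;        /* signals "queue changed" and "helper exited" */
    enum helper_msg q[QCAP];
    int             head, tail;
    bool            exited;
    int             handled;
    int            *cache;     /* stands in for the message cache */
};

static void *helper_main(void *arg)
{
    struct helper *h = arg;

    pthread_mutex_lock(&h->lock);
    for (;;) {
        while (h->head == h->tail)            /* wait for a message */
            pthread_cond_wait(&h->cv, &h->lock);
        enum helper_msg m = h->q[h->head++ % QCAP];
        if (m == HMSG_KILL)
            break;
        h->cache[0]++;                        /* cache must still be alive here */
        h->handled++;
    }
    h->exited = true;                         /* acknowledge the kill */
    pthread_cond_broadcast(&h->cv);
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

static void send_msg(struct helper *h, enum helper_msg m)
{
    pthread_mutex_lock(&h->lock);
    h->q[h->tail++ % QCAP] = m;
    pthread_cond_broadcast(&h->cv);
    pthread_mutex_unlock(&h->lock);
}

/* Returns the number of work messages handled, or -1 on error. */
static int teardown_demo(int pending)
{
    struct helper h = { .head = 0 };
    pthread_t tid;

    if (pending < 0 || pending >= QCAP)       /* keep room for the kill message */
        return -1;
    h.cache = calloc(1, sizeof(*h.cache));
    if (h.cache == NULL)
        return -1;
    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.cv, NULL);
    if (pthread_create(&tid, NULL, helper_main, &h) != 0) {
        free(h.cache);
        return -1;
    }

    for (int i = 0; i < pending; i++)
        send_msg(&h, HMSG_WORK);
    send_msg(&h, HMSG_KILL);

    /* The handshake: do not free the cache until the helper has exited. */
    pthread_mutex_lock(&h.lock);
    while (!h.exited)
        pthread_cond_wait(&h.cv, &h.lock);
    pthread_mutex_unlock(&h.lock);
    pthread_join(tid, NULL);                  /* reap the thread */

    int handled = h.handled;
    free(h.cache);
    pthread_cond_destroy(&h.cv);
    pthread_mutex_destroy(&h.lock);
    return handled;
}
```

Calling teardown_demo(5) queues five work messages and one kill message
and returns once the helper has processed all of them. Whether an
equivalent wait is safe inside the kernel when both threads share a CPU
is a separate question that this sketch does not answer.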

Comments please..
Cheers
Tripun

References
[1] http://nxr.netbsd.org/xref/src-dragonflybsd/sys/kern/dsched/bfq/bfq.c
[2] http://nxr.netbsd.org/xref/src-dragonflybsd/sys/kern/dsched/bfq/bfq_helper_thread.c
