<div dir="ltr">I've noticed a very rare bug caused by this patch. A simplified explanation of what happens:<div><br></div><div style>CPU 1: Job A completes execution in userspace.</div>CPU 2: Job A is preempted and requeued ON THE RELEASE QUEUE.<div style>
CPU 1: complete_job() is called on Job A in the ensuing call to schedule().</div><div style>CPU 1: Job A is removed from the ready queue under if (is_queued(t)) unlink(). Because Job A is in the release queue rather than the ready queue, the system crashes.</div>
<div style><br></div><div style>I was only able to trigger this pattern of execution on the first job after a synchronous release with CONFIG_SCHED_CPU_AFFINITY set; I don't know why. I've pushed another patch, prop/budget-bug-fix-fix, to GitHub, which prevents requeues while a task's completed flag is set.</div>
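<div style><br></div><div style>To make the sequence above concrete, here is a rough userspace toy model of it. This is only a sketch, not LITMUS^RT code: every toy_* name is hypothetical, and it assumes that is_queued() reports membership in either queue while the removal path expects the ready queue. The guard flag stands in for the completed-flag check.</div>

```c
#include <stdbool.h>

/* Toy model only: all toy_* names are hypothetical, not LITMUS^RT code. */
enum toy_queue { TOY_NONE, TOY_READY, TOY_RELEASE };

struct toy_job {
    enum toy_queue queue;   /* which queue currently holds the job */
    bool completed;         /* set when the job finishes in userspace */
};

/* Assumption: like is_queued(), true if the job sits in ANY queue. */
static bool toy_is_queued(const struct toy_job *j)
{
    return j->queue != TOY_NONE;
}

/*
 * Removal that assumes the job is in the ready queue.
 * Returns 0 on success, -1 where the real system would crash.
 */
static int toy_remove_from_ready(struct toy_job *j)
{
    if (j->queue != TOY_READY)
        return -1;          /* job actually lives in another queue */
    j->queue = TOY_NONE;
    return 0;
}

/* Preemption path: requeue the job unless the guard forbids it. */
static void toy_preempt(struct toy_job *j, bool guard_completed)
{
    if (guard_completed && j->completed)
        return;             /* guarded behaviour: skip the requeue */
    j->queue = TOY_RELEASE; /* as in the report: lands on the release queue */
}

/* Replays the reported steps; returns the result of the removal attempt. */
int toy_replay(bool guard_completed)
{
    struct toy_job a = { TOY_NONE, false };

    a.completed = true;                 /* CPU 1: job completes in userspace */
    toy_preempt(&a, guard_completed);   /* CPU 2: preemption and requeue */

    if (toy_is_queued(&a))              /* CPU 1: completion path in schedule() */
        return toy_remove_from_ready(&a);
    return 0;
}
```

<div style>In this model, toy_replay(false) hits the mismatched removal and returns -1, while toy_replay(true) skips the requeue and returns 0.</div>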
<div style><div><br></div><div>commit eb3ec58872e6ca6074b67d55f1e3ca363499d6af</div><div>Author: Jonathan Herman <<a href="mailto:hermanjl@cs.unc.edu">hermanjl@cs.unc.edu</a>></div><div>Date: Wed Feb 20 16:58:50 2013 -0500</div>
<div><br></div><div> Don't requeue jobs which have their completed flag set.</div><div><br></div></div><div style><div>diff --git a/include/litmus/budget.h b/include/litmus/budget.h</div><div>index 33344ee..59e9869 100644</div>
<div>--- a/include/litmus/budget.h</div><div>+++ b/include/litmus/budget.h</div><div>@@ -29,7 +29,7 @@ static inline int requeue_preempted_job(struct task_struct* t)</div><div> /* Add task to ready queue only if not subject to budget enforcement or</div>
<div> * if the job has budget remaining. t may be NULL.</div><div> */</div><div>- return t && (!budget_exhausted(t) || !budget_enforced(t));</div><div>+ return t && (!budget_exhausted(t) || !budget_enforced(t)) && !is_completed(t);</div>
<div> }</div><div> </div><div> #endif</div><div><br></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Jun 15, 2012 at 4:28 AM, Björn Brandenburg <span dir="ltr"><<a href="mailto:bbb@mpi-sws.org" target="_blank">bbb@mpi-sws.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>
On Jun 14, 2012, at 5:21 PM, Giovani Gracioli wrote:<br>
<br>
><br>
> just recompiled the kernel with the modifications and the bug is fixed.<br>
<br>
</div>Excellent, thanks for confirming this. I'll merge the patch into staging.<br>
<span class="HOEnZb"><font color="#888888"><br>
- Björn<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Jonathan Herman<br>Department of Computer Science at UNC Chapel Hill
</div>