<html><head><meta http-equiv="Content-Type" content="text/html charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Feb 21, 2013, at 1:18 PM, Jonathan Herman <<a href="mailto:hermanjl@cs.unc.edu">hermanjl@cs.unc.edu</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">My patch is still on github and applies to all plugins AFAIK. I think it would address your situation.</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Feb 21, 2013 at 11:21 AM, Glenn Elliott <span dir="ltr"><<a href="mailto:gelliott@cs.unc.edu" target="_blank">gelliott@cs.unc.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; position: static; z-index: auto; "><div class="HOEnZb"><div class="h5"><br>
On Feb 21, 2013, at 7:31 AM, Björn Brandenburg <<a href="mailto:bbb@mpi-sws.org">bbb@mpi-sws.org</a>> wrote:<br>
<br>
><br>
> On Feb 20, 2013, at 11:00 PM, Jonathan Herman <<a href="mailto:hermanjl@cs.unc.edu">hermanjl@cs.unc.edu</a>> wrote:<br>
><br>
>> I've noticed a very rare bug caused by this patch. A simplified explanation of what happens:<br>
>><br>
>> CPU 1 Job A completes execution in userspace.<br>
>> CPU 2 Job A is preempted, and requeued ON THE RELEASE QUEUE.<br>
>> CPU 1 Job A has complete_job() called in the ensuing call to schedule().<br>
>> CPU 1 Job A is removed() from the ready queue under if(is_queued(t)) unlink(). Because Job A is in the release queue, and not the ready queue, the system crashes.<br>
>><br>
>> I was only able to create this pattern of execution on the first job after a synchronous release with CONFIG_SCHED_CPU_AFFINITY set. I don't know why. I've pushed another patch, prop/budget-bug-fix-fix, to github which prevents requeues while a task's completed flag is set.<br>
>><br>
>> commit eb3ec58872e6ca6074b67d55f1e3ca363499d6af<br>
>> Author: Jonathan Herman <<a href="mailto:hermanjl@cs.unc.edu">hermanjl@cs.unc.edu</a>><br>
>> Date: Wed Feb 20 16:58:50 2013 -0500<br>
>><br>
>> Don't requeue jobs which have their completed flag set.<br>
>><br>
>> diff --git a/include/litmus/budget.h b/include/litmus/budget.h<br>
>> index 33344ee..59e9869 100644<br>
>> --- a/include/litmus/budget.h<br>
>> +++ b/include/litmus/budget.h<br>
>> @@ -29,7 +29,7 @@ static inline int requeue_preempted_job(struct task_struct* t)<br>
>> /* Add task to ready queue only if not subject to budget enforcement or<br>
>> * if the job has budget remaining. t may be NULL.<br>
>> */<br>
>> - return t && (!budget_exhausted(t) || !budget_enforced(t));<br>
>> + return t && (!budget_exhausted(t) || !budget_enforced(t)) && !is_completed(t);<br>
>> }<br>
>><br>
>> #endif<br>
><br>
><br>
> Just wondering: is this the same issue that Glenn reported, and for which I posted a potential C-EDF-specific patch a couple of weeks ago?<br>
><br>
> - Björn<br>
<br>
</div></div>Speaking of the patch that Björn provided me a few weeks ago, I had reported that it didn't work. I tracked down the problem. My version of C-EDF supports locking protocols and it uses functions similar to GSN-EDF's set/clear_priority_inheritance() (Mainline C-EDF doesn't support locking protocols). When I apply Björn's C-EDF patch, I also have to update the clear_priority_inheritance() function to replace gsnedf_job_arrival() with:<br>
<br>
/* maintains original functionality of cedf_job_arrival() */<br>
if (is_released(t, litmus_clock())) { cedf_job_arrival(t); } else { add_release(&cluster->domain, t); }<br>
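<br>
To make the intent of that branch concrete, here is a minimal userspace sketch. It is not the kernel code: struct task, the in_ready/in_release flags, job_arrival(), the one-argument add_release(), and requeue_after_inheritance_change() are hypothetical stand-ins I made up for illustration, and only the released-vs-not-released decision mirrors the snippet above.<br>

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's time type. */
typedef unsigned long long lt_t;

/* Hypothetical, simplified task: only what the branch needs. */
struct task {
    lt_t release;    /* release time of the current job */
    int  in_ready;   /* set when the task is handed to the ready path */
    int  in_release; /* set when the task is put on the release queue */
};

/* Stand-in for is_released(): has the job's release time passed? */
static int is_released(const struct task *t, lt_t now)
{
    return t->release <= now;
}

/* Stand-in for cedf_job_arrival(): treat the job as eligible to run. */
static void job_arrival(struct task *t)
{
    t->in_ready = 1;
}

/* Stand-in for add_release() (the real call also takes a domain). */
static void add_release(struct task *t)
{
    t->in_release = 1;
}

/* The branch itself: released jobs take the arrival path,
 * not-yet-released jobs go back on the release queue. */
static void requeue_after_inheritance_change(struct task *t, lt_t now)
{
    if (is_released(t, now))
        job_arrival(t);
    else
        add_release(t);
}
```

With this sketch, a task whose release time is in the past ends up with in_ready set, and a task whose release time is still in the future ends up with in_release set; it never lands on both.<br>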
<br>
So I am getting a little bit confused about the patches Björn and Jonathan have been discussing. Questions:<br>
1) Jonathan-- Do you still have a patch, or has it already been included? If it already has been merged somewhere, is it in mainline or staging?<br>
2) Björn-- With regard to the patch you provided me earlier, as Jonathan said, does this need to be applied to other plugins as well? You had said that you have overhauled rt_domain in private patches. Do you plan on mainlining these? If so, should we just wait and not push temporary code?<br>
<br>
Thanks,<br>
Glenn</blockquote></div></div></blockquote></div><br><div>I tried out Jonathan's budget patch in place of Björn's patch that changed requeue(). While Jonathan's patch may fix a bug, it does not appear to fix my bug. The system hangs on task set release.</div><div><br></div><div>-Glenn</div><div><br></div></body></html>