[LITMUS^RT] running a task as its execution time

Glenn Elliott gelliott at cs.unc.edu
Thu Feb 21 17:21:32 CET 2013


On Feb 21, 2013, at 7:31 AM, Björn Brandenburg <bbb at mpi-sws.org> wrote:

> 
> On Feb 20, 2013, at 11:00 PM, Jonathan Herman <hermanjl at cs.unc.edu> wrote:
> 
>> I've noticed a very rare bug caused by this patch. A simplified explanation of what happens:
>> 
>> CPU 1 Job A completes execution in userspace.
>> CPU 2 Job A is preempted, and requeued ON THE RELEASE QUEUE.
>> CPU 1 Job A has complete_job() called in the ensuing call to schedule().
>> CPU 1 Job A is removed() from the ready queue under if(is_queued(t)) unlink(). Because Job A is in the release queue, and not the ready queue, the system crashes.
>> 
>> I was only able to create this pattern of execution on the first job after a synchronous release with CONFIG_SCHED_CPU_AFFINITY set. I don't know why. I've pushed another patch, prop/budget-bug-fix-fix, to github which prevents requeues while a task's completed flag is set.
>> 
>> commit eb3ec58872e6ca6074b67d55f1e3ca363499d6af
>> Author: Jonathan Herman <hermanjl at cs.unc.edu>
>> Date:   Wed Feb 20 16:58:50 2013 -0500
>> 
>>    Don't requeue jobs which have their completed flag set.
>> 
>> diff --git a/include/litmus/budget.h b/include/litmus/budget.h
>> index 33344ee..59e9869 100644
>> --- a/include/litmus/budget.h
>> +++ b/include/litmus/budget.h
>> @@ -29,7 +29,7 @@ static inline int requeue_preempted_job(struct task_struct* t)
>>        /* Add task to ready queue only if not subject to budget enforcement or
>>         * if the job has budget remaining. t may be NULL.
>>         */
>> -       return t && (!budget_exhausted(t) || !budget_enforced(t));
>> +       return t && (!budget_exhausted(t) || !budget_enforced(t)) && !is_completed(t);
>> }
>> 
>> #endif
> 
> 
> Just wondering: is this the same issue that Glenn reported, and for which I posted a potential C-EDF-specific patch a couple of weeks ago?
> 
> - Björn

Speaking of the patch that Björn provided me a few weeks ago, I had reported that it didn't work.  I tracked down the problem.  My version of C-EDF supports locking protocols and it uses functions similar to GSN-EDF's set/clear_priority_inheritance() (Mainline C-EDF doesn't support locking protocols).  When I apply Björn's C-EDF patch, I also have to update the clear_priority_inheritance() function to replace gsnedf_job_arrival() with:

/* maintains original functionality of cedf_job_arrival() */
if (is_released(t, litmus_clock()))
	cedf_job_arrival(t);
else
	add_release(&cluster->domain, t);
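
To spell out the intent of that snippet, here is a minimal user-space sketch, not kernel code. The struct, its flag fields, the helper bodies, and the name requeue_after_inheritance() are hypothetical stand-ins made up for illustration; I am also assuming that is_released() simply compares the job's release time against the current time. Only the dispatch itself (released jobs go through job arrival, unreleased jobs go back to the release queue) mirrors the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's time type. */
typedef unsigned long long lt_t;

/* Hypothetical stand-in for a task; NOT struct task_struct. */
struct task {
	lt_t release;    /* release time of the task's current job */
	int  in_ready;   /* set when the sketch hands the job to job arrival */
	int  in_release; /* set when the sketch puts the job on the release queue */
};

/* Assumed semantics: a job counts as released once its release time has passed. */
static int is_released(const struct task *t, lt_t now)
{
	return t->release <= now;
}

/* Stubs that only record which queue the job would have ended up on. */
static void job_arrival(struct task *t) { t->in_ready = 1; }
static void add_release(struct task *t) { t->in_release = 1; }

/* The dispatch: never hand an unreleased job to the job-arrival path. */
static void requeue_after_inheritance(struct task *t, lt_t now)
{
	assert(t != NULL);
	if (is_released(t, now))
		job_arrival(t);
	else
		add_release(t);
}
```

In this sketch, a task whose job is released at time 100 ends up with in_release set when requeued at time 50, and with in_ready set when requeued at time 100 or later.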

So I am getting a little bit confused about the patches Björn and Jonathan have been discussing.  Questions:
1) Jonathan-- Do you still have a patch, or has it already been included?  If it already has been merged somewhere, is it in mainline or staging?
2) Björn-- With regard to the patch you provided me earlier, as Jonathan said, does this need to be applied to other plugins as well?  You had said that you have overhauled rt_domain in private patches.  Do you plan on mainlining these?  If so, should we just wait and not push temporary code?

Thanks,
Glenn





