[LITMUS^RT] Supporting Job Aborts

Jonathan Herman hermanjl at cs.unc.edu
Fri Sep 7 19:29:59 CEST 2012


For Glenn's current issue, setitimer may give the correct result, but I
can imagine other situations where it wouldn't work. Suppose some kind
of slack stealing is in effect and the current task overruns its
budget, but the overrun can be absorbed by the available slack. In that
case we don't want the timer to go off and make the task abort its job.
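
To make the concern concrete, here is a minimal sketch of the
setitimer(2) approach in plain POSIX C. The names run_job, spin_job and
short_job are hypothetical and not part of LITMUS^RT or liblitmus; the
sketch also assumes the job can safely be abandoned at any point. Note
that budget_us is fixed by userspace when the job starts and the
scheduler is never consulted, so the timer would fire even if slack
could have covered the overrun.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for sigaction, sigsetjmp, setitimer */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Jump target used to abandon the current job from the signal handler. */
static sigjmp_buf abort_point;

static void on_budget_timer(int sig)
{
    (void)sig;
    siglongjmp(abort_point, 1);
}

/* Arm ITIMER_VIRTUAL as a one-shot timer; 0 disarms it.
 * The timer only counts user-mode CPU time of the process. */
static int set_budget_timer(long budget_us)
{
    struct itimerval it;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_sec = budget_us / 1000000;
    it.it_value.tv_usec = budget_us % 1000000;
    return setitimer(ITIMER_VIRTUAL, &it, NULL);
}

/* Hypothetical helper: run one job under a userspace-chosen budget.
 * Returns 0 if the job completed, 1 if it was aborted, -1 on error. */
int run_job(void (*job)(void), long budget_us)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_budget_timer;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGVTALRM, &sa, NULL) != 0)
        return -1;

    if (sigsetjmp(abort_point, 1) != 0) {
        /* SIGVTALRM fired: the job is abandoned here. Application
         * state would have to be rolled back before the next job. */
        return 1;
    }
    if (set_budget_timer(budget_us) != 0)
        return -1;
    job();
    set_budget_timer(0);    /* job finished in time: disarm */
    return 0;
}

/* Hypothetical jobs for illustration. */
void short_job(void) { }
void spin_job(void)
{
    volatile unsigned long x = 0;
    for (;;)
        x++;                /* overruns any finite budget */
}
```

Calling run_job(short_job, 50000) completes normally, while
run_job(spin_job, 50000) is cut off after roughly 50 ms of user-mode
execution, regardless of what the scheduler would have allowed.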

I think only the kernel should be trusted to know when a task has or
has not exhausted its budget and needs to abort.

On Fri, Sep 7, 2012 at 12:18 PM, Björn Brandenburg <bbb at mpi-sws.org> wrote:
>
> On Sep 7, 2012, at 12:34 AM, Glenn Elliott wrote:
>
>> I am working on adding support to Litmus to allow jobs to be aborted.  A simple use case: Abort a job on budget exhaustion.
>>
>> Suppose we have a task T with consecutive jobs J_1 and J_2.  Further suppose J_1 exhausts its budget before completing.  Litmus currently handles budget exhaustion in one of two ways: (1) NO_ENFORCEMENT - J_1 is allowed to continue execution.  (2) QUANTUM/PRECISE_ENFORCEMENT - J_1 is preempted and does not resume until its budget is replenished.  In either case, all of the work of J_1 must be completed before the work for J_2 can start.  This stinks in applications where J_1 has no utility if it cannot be completed until after the budget has been replenished.  It can also put pressure on later jobs, which must complete within their effectively reduced budgets.
>
> Hi Glenn,
>
> the kernel already supports sending signals in response to crossing pre-determined execution time boundaries with setitimer(2).
>
>         http://linux.die.net/man/2/setitimer
>
> I think what you want is ITIMER_VIRTUAL, which, according to the man page, "decrements only when the process is executing, and delivers SIGVTALRM upon expiration." That *should* still work with LITMUS^RT.  Why don't you use that instead? Just set it at the beginning of the job to expire shortly before the budget is exhausted; this leaves you some time to handle the signal and roll back the state just in time for the next job.
>
> Fundamentally, from my point of view PRECISE/QUANTUM_ENFORCEMENT protect against *uncooperative* tasks, whereas you seek to provide feedback to *cooperative* tasks.
>
> - Björn
>
>



-- 
Jonathan Herman
Department of Computer Science at UNC Chapel Hill



