[LITMUS^RT] understanding PREEMPT_ACTIVE
Glenn Elliott
gelliott at cs.unc.edu
Fri Mar 7 20:18:36 CET 2014
On Mar 7, 2014, at 1:45 PM, Glenn Elliott <gelliott at cs.unc.edu> wrote:
> Hi Everyone,
>
> Can anyone explain more about the use of Linux's preempt_count flag PREEMPT_ACTIVE? This is the best resource that I have found, but it leaves much to be desired in terms of detail: http://lkml.iu.edu//hypermail/linux/kernel/0403.2/0784.html
>
> As I understand it, when PREEMPT_ACTIVE is set in a task’s preempt_count field, the kernel is allowed to preempt the task, even if the task’s state is not TASK_RUNNING.
>
> Why do I care about this flag? Here’s my situation.
>
> On processor P0, I have a task T that is in a TASK_UNINTERRUPTIBLE state. P0 is about to call wake_up_process(T). Note, and this is important, that no scheduler locks are held—only interrupts are disabled on P0.* Here’s the general structure of the code:
>
> local_irq_save(…);
> …
> get_scheduler_lock_and_other_locks();
> enqueue T for waking
> free_scheduler_lock_and_other_locks();
> …
> wake_up_process(T); // the scheduler lock will be re-acquired within this function call
> local_irq_restore(…);
>
> On processor P1, the budget of task T has been detected as exhausted (in my scheduler, a task’s budget can drain even when the task is not scheduled/linked). From within a budget timer interrupt handler, I use the scheduler’s job_completion() function to refresh the budget of, and set a new deadline for, T. It is safe to change T’s deadline since P1 holds the appropriate scheduler lock.
>
> At the end of my scheduler’s job_completion(), it executes code that looks like this (https://github.com/LITMUS-RT/litmus-rt/blob/master/litmus/sched_gsn_edf.c#L364):
>
> if(is_running(t)) { job_arrival(t); }
>
> This makes sense. We want to make the task eligible to run if it is runnable. Is my task T running? The is_running() macro says it is, even though T’s state is TASK_UNINTERRUPTIBLE, because is_running() expands to:
>
> ((t)->state == TASK_RUNNING || task_thread_info(t)->preempt_count & PREEMPT_ACTIVE)
>
> So what happens to my task T? Ultimately, P0 and P1 both link T to different CPUs. These operations are serialized by the scheduler lock, but clearly, this is still wrong. (Thankfully, the bug is caught by link_task_to_cpu(). Example: https://github.com/LITMUS-RT/litmus-rt/blob/master/litmus/sched_gsn_edf.c#L181)
>
> So what’s the fix? I think that I’d like P1 to still refresh T’s budget and set its new deadline. I believe this is safe since P1 holds the scheduler lock. However, I’d like P0 to be the processor to wake up T and link it to a processor. I could add a new flag (protected by the scheduler lock) to tell P1 that T is queued up for waking on a remote processor, but this seems messy. Would it be safe to replace
>
> if(is_running(t)) { job_arrival(t); }
>
> with
>
> if(t->state == TASK_RUNNING) { job_arrival(t); }
>
> ?
>
> Under vanilla Litmus, I believe job_completion() is always called by the task itself within schedule(), but only if the task is not self-suspending. This would mean that t->state would always equal TASK_RUNNING, correct? Thus, my proposed change would not affect normal/pre-existing code paths.
>
> Thanks,
> Glenn
>
> * This behavior on P0 is to deal with a nasty lock dependency problem. Without getting into the details, I hoist wake_up_process() out of these lock critical sections. I hate this, but it’s how my code works at the moment. Implementing nested inheritance, with dynamic group locks (especially with priority-ordered lock queues), with unusual budget enforcement mechanisms = coding nightmare. It would take a significant effort to design and implement a cleaner solution, but (1) I don’t have the time, and (2) it wouldn’t be very fruitful without proper container/server support in Litmus. Thus, I am hoping to find a technically correct solution to my above problem, even if it is a kludge.
It looks like I was wrong about t->state not being TASK_RUNNING. This line of code changes the task state to TASK_RUNNING before the task is woken up:
https://github.com/LITMUS-RT/litmus-rt/blob/master/kernel/sched/litmus.c#L192
Thus, my proposed “fix” is not enough. Why do we change the task state prior to calling litmus->task_wake_up()? Could this be done after the scheduler lock has been acquired? The comments say that we need to change the task state, but they don’t say _why_.
Thanks,
Glenn
More information about the litmus-dev mailing list