[LITMUS^RT] understanding PREEMPT_ACTIVE
Glenn Elliott
gelliott at cs.unc.edu
Fri Mar 7 19:45:47 CET 2014
Hi Everyone,
Can anyone explain more about the use of Linux's PREEMPT_ACTIVE flag in preempt_count? This is the best resource I have found, but it leaves much to be desired in terms of detail: http://lkml.iu.edu//hypermail/linux/kernel/0403.2/0784.html
As I understand it, PREEMPT_ACTIVE is set in a task’s preempt_count when the task is involuntarily preempted after it has set its state to something other than TASK_RUNNING, but before it has voluntarily called schedule(). The scheduler then treats the task as still runnable, even though its state is not TASK_RUNNING.
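For context, here is the pattern I believe PREEMPT_ACTIVE exists to support (a paraphrase from my reading of the kernel, not verbatim source; condition_holds() stands in for whatever the task is waiting on). A task preparing to block does something like:

set_current_state(TASK_UNINTERRUPTIBLE); // state is now != TASK_RUNNING
// An involuntary preemption at this point enters schedule() with
// PREEMPT_ACTIVE set in preempt_count.
if (!condition_holds())
        schedule(); // voluntary sleep: PREEMPT_ACTIVE is clear here
__set_current_state(TASK_RUNNING);

and schedule() uses the flag to tell a genuine sleep apart from a preemption that landed mid-suspension, roughly:

if (prev->state && !(preempt_count() & PREEMPT_ACTIVE))
        deactivate_task(rq, prev, DEQUEUE_SLEEP); // genuine sleep: dequeue
// else: preempted while going to sleep; prev stays on the runqueue

So a task with PREEMPT_ACTIVE set is still runnable from the scheduler's point of view, even though its state field says otherwise.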
Why do I care about this flag? Here’s my situation.
On processor P0, I have a task T that is in a TASK_UNINTERRUPTIBLE state. P0 is about to call wake_up_process(T). Note, and this is important, that no scheduler locks are held—only interrupts are disabled on P0.* Here’s the general structure of the code:
local_irq_save(…);
// ...
get_scheduler_lock_and_other_locks();
// enqueue T for waking
free_scheduler_lock_and_other_locks();
// ...
wake_up_process(T); // the scheduler lock will be re-acquired within this function call
local_irq_restore(…);
On processor P1, task T’s budget has been detected as exhausted (in my scheduler, a task’s budget can drain even when the task is not scheduled/linked). From within a budget timer interrupt handler, I call the scheduler’s job_completion() function to refresh T’s budget and set a new deadline for it. It is safe to change T’s deadline since P1 holds the appropriate scheduler lock.
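For concreteness, the structure of P1’s path is roughly as follows (a sketch: struct budget_timer and budget_timer_fired() are illustrative names, not vanilla Litmus; the lock and the job_completion() signature are borrowed from sched_gsn_edf.c):

struct budget_timer {                  // illustrative wrapper type
        struct hrtimer timer;
        struct task_struct *task;      // the task whose budget drains
};

static enum hrtimer_restart budget_timer_fired(struct hrtimer *timer)
{
        struct budget_timer *bt = container_of(timer, struct budget_timer, timer);
        unsigned long flags;

        // P1 takes the same scheduler lock that P0 dropped before calling
        // wake_up_process(), so the budget/deadline update is serialized.
        raw_spin_lock_irqsave(&gsnedf_lock, flags);
        job_completion(bt->task, 1);   // forced: refresh budget, set new deadline
        raw_spin_unlock_irqrestore(&gsnedf_lock, flags);

        return HRTIMER_NORESTART;
}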
At the end of my scheduler’s job_completion(), it executes code that looks like this (https://github.com/LITMUS-RT/litmus-rt/blob/master/litmus/sched_gsn_edf.c#L364):
if(is_running(t)) { job_arrival(t); }
This makes sense: we want to make the task eligible to run if it is runnable. But is my task T running? The is_running() macro says it is, even though T’s state is TASK_UNINTERRUPTIBLE, because the macro expands to:
((t)->state == TASK_RUNNING || task_thread_info(t)->preempt_count & PREEMPT_ACTIVE)
So what happens to my task T? Ultimately, P0 and P1 both link T to different CPUs. These operations are serialized by the scheduler lock, but clearly, this is still wrong. (Thankfully, the bug is caught by link_task_to_cpu(). Example: https://github.com/LITMUS-RT/litmus-rt/blob/master/litmus/sched_gsn_edf.c#L181)
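To make the race concrete, here is one plausible interleaving (a sketch; the CPU assignments are illustrative):

//  P0                                     P1
//  -----------------------------------    -----------------------------------
//  T sets state = TASK_UNINTERRUPTIBLE
//  T is preempted mid-suspension,
//    so PREEMPT_ACTIVE is set for T
//  local_irq_save(); take locks;
//  enqueue T for waking; drop locks
//                                         budget timer fires for T
//                                         take scheduler lock
//                                         job_completion(T):
//                                           is_running(T) is true (PREEMPT_ACTIVE),
//                                           so job_arrival(T) links T to CPU A
//                                         drop scheduler lock
//  wake_up_process(T): the plugin's
//  wake-up path links T to CPU B
//  -> caught by the BUG_ON in
//     link_task_to_cpu()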
So what’s the fix? I’d still like P1 to refresh T’s budget and set its new deadline; I believe this is safe since P1 holds the scheduler lock. However, I’d like P0 to be the processor that wakes T and links it to a CPU. I could add a new flag (protected by the scheduler lock) to tell P1 that T is already queued up for waking on a remote processor, but this seems messy. Would it be safe to replace
if(is_running(t)) { job_arrival(t); }
with
if(t->state == TASK_RUNNING) { job_arrival(t); }
?
Under vanilla Litmus, I believe job_completion() is always called on behalf of the task itself from within schedule(), and only when the task is not self-suspending. This would mean that t->state always equals TASK_RUNNING at that point, correct? If so, my proposed change would not affect normal/pre-existing code paths.
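For reference, the decision logic in gsnedf_schedule() in the linked file reads roughly like this (paraphrased and condensed):

exists      = entry->scheduled != NULL;
blocks      = exists && !is_running(entry->scheduled);
out_of_time = exists && budget_enforced(entry->scheduled)
                     && budget_exhausted(entry->scheduled);
np          = exists && is_np(entry->scheduled);
sleep       = exists && get_rt_flags(entry->scheduled) == RT_F_SLEEP;
// ...
// A preemptable task that exhausts its budget or wants to sleep
// completes here; the !blocks test skips self-suspending tasks.
if (!np && (out_of_time || sleep) && !blocks)
        job_completion(entry->scheduled, !sleep);

(Note that blocks is itself computed with is_running(), so this path also leans on the PREEMPT_ACTIVE semantics.)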
Thanks,
Glenn
* This behavior on P0 is to deal with a nasty lock-dependency problem. Without getting into the details, I hoist wake_up_process() out of these critical sections. I hate this, but it’s how my code works at the moment. Implementing nested inheritance, with dynamic group locks (especially with priority-ordered lock queues), with unusual budget enforcement mechanisms = coding nightmare. It would take a significant effort to design and implement a cleaner solution, but (1) I don’t have the time, and (2) it wouldn’t be very fruitful without proper container/server support in Litmus. Thus, I am hoping to find a technically correct solution to the above problem, even if it is a kludge.