[LITMUS^RT] bug in overhead tracing

Mon Apr 14 21:48:21 CEST 2014

Hi Everyone,

I’m afraid that I’ve found a pretty bad bug in Litmus when overhead tracing is enabled.  The bug is pretty easy to reproduce: with a kernel compiled for overhead tracing, just set up a real-time task that loops on sched_yield().  Note that sched_yield() is on the code path for exiting np-sections in liblitmus.

BUG: The TS_SYSCALL_IN_END macro inadvertently enables interrupts when it should not.

Code path:
1) [user] sched_yield()
2) sys_sched_yield() disables interrupts and acquires the run-queue lock. (https://github.com/LITMUS-RT/litmus-rt/blob/master/kernel/sched/core.c#L4447)
3) sys_sched_yield() calls sched_class->yield_task();
4) [for SCHED_LITMUS tasks] yield_task_litmus() calls TS_SYSCALL_IN_END. (https://github.com/LITMUS-RT/litmus-rt/blob/master/kernel/sched/litmus.c#L216)
5) TS_SYSCALL_IN_END re-enables interrupts unconditionally: https://github.com/LITMUS-RT/litmus-rt/blob/master/include/litmus/litmus.h#L314
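
To make the code path above concrete, here is a small userspace model of it. This is only a sketch, not the kernel code: the interrupt flag and the rq lock are plain booleans, and every model_* name is hypothetical. The only thing it assumes about the probe is what is described above, namely that it disables interrupts and then re-enables them unconditionally.

```c
#include <stdbool.h>

/* Userspace model of one CPU: an interrupt flag and the run-queue lock. */
static bool irqs_enabled = true;
static bool rq_locked = false;
static bool hazard_seen = false; /* set if irqs come on while rq lock is held */

static void model_irq_disable(void)
{
    irqs_enabled = false;
}

static void model_irq_enable(void)
{
    irqs_enabled = true;
    if (rq_locked)
        hazard_seen = true; /* a tick could now try to take the rq lock */
}

/* Hypothetical stand-in for the probe: unconditional disable/enable. */
static void model_ts_syscall_in_end(void)
{
    model_irq_disable();
    /* ... irq-related accounting would happen here ... */
    model_irq_enable();
}

/* Step 4: the LITMUS yield hook invokes the probe. */
static void model_yield_task_litmus(void)
{
    model_ts_syscall_in_end();
}

/* Steps 2-5: returns true if irqs were enabled while the rq lock was held. */
static bool model_sys_sched_yield(void)
{
    hazard_seen = false;
    model_irq_disable();       /* step 2: irqs off ... */
    rq_locked = true;          /* ... and rq lock taken */
    model_yield_task_litmus(); /* steps 3-5 */
    rq_locked = false;         /* unlock, then irqs back on */
    model_irq_enable();
    return hazard_seen;
}
```

In this model, model_sys_sched_yield() reports the hazard on every call, because the probe turns the flag back on while rq_locked is still set.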

Ouch!  My system was locking up because the tick interrupt was being handled while the rq lock (from step 2, above) was still held (the tick interrupt handler acquires the rq lock so it can update scheduling statistics).  Anyway, the CPU deadlocked on itself.

What’s the fix?  I see three options:
1) We give up on instrumenting sched_yield.  We just delete TS_SYSCALL* from yield_task_litmus().
2) We push the TS_SYSCALL* probes up into sys_sched_yield().
3) We make TS_SYSCALL_IN_END interrupt-flag aware.  An easy fix is to avoid the disable/enable interrupt code if interrupts are already disabled (code branch).  Another fix is just to save/disable/restore the current interrupt flags.  However, is there a more elegant solution?  Do we need to do any irq-related accounting in TS_SYSCALL_IN_END if interrupts are already disabled?  This code was developed at MPI, so I defer to their expertise.
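
The two variants of option 3 can be sketched in the same kind of userspace model. Again, this is an illustration under assumptions, not a patch: the sketch_* names are hypothetical, the flag is a plain boolean, and the accounting is reduced to a counter. In the kernel, the branch variant would presumably test irqs_disabled(), and the save/restore variant would use local_irq_save()/local_irq_restore().

```c
#include <stdbool.h>

/* Userspace model of the CPU interrupt flag. */
static bool cpu_irqs_on = true;
static unsigned long probe_count; /* stands in for the probe's accounting */

/* Models local_irq_save(): remember the old state, then disable. */
static bool sketch_irq_save(void)
{
    bool flags = cpu_irqs_on;
    cpu_irqs_on = false;
    return flags;
}

/* Models local_irq_restore(): put back whatever state was saved. */
static void sketch_irq_restore(bool flags)
{
    cpu_irqs_on = flags;
}

/* Variant A: skip the disable/enable pair if irqs are already off. */
static inline void sketch_probe_branch(void)
{
    if (!cpu_irqs_on) {
        probe_count++;
        return;
    }
    cpu_irqs_on = false;
    probe_count++;
    cpu_irqs_on = true;
}

/* Variant B: save, disable, and restore the caller's irq state. */
static inline void sketch_probe_saved(void)
{
    bool flags = sketch_irq_save();
    probe_count++;
    sketch_irq_restore(flags);
}

/* Helper: run a probe from a given irq state, return the state afterwards. */
static bool irq_state_after(void (*probe)(void), bool initial)
{
    cpu_irqs_on = initial;
    probe();
    return cpu_irqs_on;
}
```

Both variants leave the interrupt flag exactly as the caller had it, which is the property yield_task_litmus() needs. Variant B has no branch, but it always pays for the save/restore.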

Regardless of what we decide, I would like to see these IRQ-tracing macros converted into inline functions with normal function-like names.  This bug was particularly difficult to diagnose (it took me four days) because there is logic hidden in TS_SYSCALL_IN_END.  I overlooked this macro in my debugging because TS_* macros are normally enabled/disabled by feather-trace at runtime.  However, TS_SYSCALL_IN_END is nothing like a normal TS_* overhead-tracing macro: it still does work even if I don’t trace SYSCALL overheads.

Anyway, I’d like to hear back from someone at MPI on a suggested fix.  I’d be happy to put the patch together.

Thanks,
Glenn