[LITMUS^RT] Questions about Litmus

Björn Brandenburg bbb at mpi-sws.org
Wed Dec 9 16:14:57 CET 2015


Dear Shuai, 

thanks for your interest in LITMUS^RT. 

> On 08 Dec 2015, at 09:02, Shuai Zhao <zs673 at york.ac.uk> wrote:
> 
> 1. I've been reading your litmus report (a very long one) and notice you said 
> 
> " In our experience, handling suspending and resuming processes is a common source of errors. A process may suspend and be resumed again almost immediately thereafter before a context switch can occur (e.g., this is the case for many page faults, and especially those due to copy-on-write semantics). In the past, novice developers have commonly disregarded the possibility of such “quickly resuming” processes, which may result in ready processes failing to be enqueued in a ready queue (in which case they are never scheduled again and become “stuck”) or in being wrongly enqueued twice (which crashes the kernel eventually). Besides migration support (see Section 3.3.4 below), races related to short suspensions have in our experience been the most common cause of crashes in LITMUSRT. "
> 
> And this is exactly the problem that I am now experiencing. 
> 
> I tried to implement the MrsP protocol under the P-FP scheduler and developed a user-space test program to exercise it. However, the program sometimes gets stuck because the lock holder is blocked but never rescheduled (mostly when it tries to print via printf).

I’m not sure I understand. Are you saying there’s a bug in the core P-FP plugin related to suspensions, or are you saying your *modified* plugin has a difficult time handling short suspensions?

There are no known suspension-related bugs in any of the plugins. If you come across a problem, a test case that triggers the bug in an unmodified LITMUS^RT kernel would be greatly appreciated. At the very least, we’ll need some traces to understand what’s going on.
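For reference, races of the kind described in the report are usually handled in a plugin’s wake-up path. Here is a minimal sketch of the pattern; the names state_lock, is_scheduled_on_cpu(), is_queued(), requeue() and check_for_preemption() are placeholders for whatever state and helpers your plugin keeps, not the actual LITMUS^RT API:

static raw_spinlock_t state_lock; /* placeholder for the plugin's ready-queue lock */

static void demo_task_wake_up(struct task_struct *t)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&state_lock, flags);

	if (is_scheduled_on_cpu(t)) {
		/*
		 * The task resumed before we ever switched away from it:
		 * the schedule() callback will see that it is runnable
		 * again, so it must NOT be requeued here, or it ends up
		 * in the ready queue twice.
		 */
	} else if (!is_queued(t)) {
		/*
		 * Normal case: the task was fully suspended; make it
		 * eligible for scheduling again. Forgetting this branch
		 * is how tasks get "stuck" forever.
		 */
		requeue(t);
		check_for_preemption();
	}

	raw_spin_unlock_irqrestore(&state_lock, flags);
}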

> I wonder how to prevent such races from happening? Will printf be problematic if used in a LITMUS^RT real-time thread with a very short period (say 1 ms)?

A correct plugin will be able to support arbitrary self-suspension behavior. You can use printf() however you like. We do not place restrictions on userspace tasks. (Or, rather, if we do, you should get an error code and not a crash or hang.)
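For example, a short-period task that calls printf() in every job is perfectly fine. Below is a minimal sketch using the usual liblitmus calls (init_litmus(), set_rt_task_param(), task_mode(), sleep_next_period()); the parameter field names and the partition-migration call may differ slightly depending on your liblitmus version, so check the headers:

#include <stdio.h>
#include <litmus.h>

int main(void)
{
	struct rt_task param;
	int i;

	init_litmus();

	init_rt_task_param(&param);
	param.exec_cost = 200000;   /* 0.2 ms, in ns      */
	param.period    = 1000000;  /* 1 ms, in ns        */
	param.cpu       = 0;        /* partition for P-FP */
	param.priority  = LITMUS_HIGHEST_PRIORITY;

	/* Bind to the partition before becoming a real-time task
	 * (named be_migrate_to_cpu() in recent liblitmus versions). */
	be_migrate_to_cpu(param.cpu);

	set_rt_task_param(gettid(), &param);
	task_mode(LITMUS_RT_TASK);

	for (i = 0; i < 1000; i++) {
		/* printf() may page-fault or briefly block inside the C
		 * library; a correct plugin must tolerate this
		 * self-suspension. */
		printf("job %d\n", i);
		sleep_next_period();
	}

	task_mode(BACKGROUND_TASK);
	return 0;
}

If a loop like this hangs on an unmodified kernel, then that program plus the corresponding traces would be exactly the kind of test case we need.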

> 
> 2. MrsP defines that if a lock holder is preempted, it can migrate to any processor that has a waiting (spinning) task on it. Yet as I am trying to implement MrsP under P-FP, there is no support for choosing the most suitable CPU among all the CPUs the preempted task can migrate to. Do you have any suggestions for building a mini migration routine for P-FP, or are there facilities in either Linux or LITMUS^RT that I could use to achieve this?

You can take a look at the existing migration support code, but I’m afraid that implementing the MBWI / MrsP / OMIP / MC-IPC migration rules can be rather tricky. I’m working on a cleaned-up version of the MC-IPC implementation, which may be interesting to you as a source of inspiration, but I need a few more days to finish it. (And I’m currently not getting any time to work on LITMUS^RT.)
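Just to sketch the selection step in isolation (this is not the MC-IPC or MrsP code; mrsp_semaphore, mrsp_waiter, spinning_waiters and local_higher_prio_running() are placeholders for whatever state your implementation keeps):

struct mrsp_waiter {
	struct task_struct	*task;
	int			cpu;  /* partition on which this waiter spins */
	struct list_head	list;
};

static int pick_migration_target(struct mrsp_semaphore *sem,
				 struct task_struct *holder)
{
	struct mrsp_waiter *w;

	/*
	 * Pick a CPU with a spinning waiter on which the migrated holder
	 * would actually get to run (i.e., nothing of higher priority than
	 * the waiter is running there). A real implementation must hold
	 * the semaphore's lock while doing this and re-check after the
	 * migration, since the waiter may have stopped spinning meanwhile.
	 */
	list_for_each_entry(w, &sem->spinning_waiters, list) {
		if (w->cpu != task_cpu(holder) &&
		    !local_higher_prio_running(w->cpu, w->task))
			return w->cpu;
	}

	return NO_CPU; /* nowhere better to go: stay put and wait to be scheduled */
}

The hard part is not picking the target, but making the actual migration race-free with respect to concurrent preemptions, wake-ups, and waiters ceasing to spin; that is where the existing migration support code is worth studying.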

> 
> I also notice that sched_setaffinity is disabled by LITMUS^RT as it could introduce unbounded priority inversion.

Not sure how you get to unbounded priority inversion. We simply don’t support arbitrary affinities in LITMUS^RT yet. The plugins control task-to-processor placement; each plugin is free to interpret (or ignore) a task's affinity as it sees fit.

> However, I wonder how I can use this function again in LITMUS^RT tasks if I do not care about the drawbacks it carries.

You can implement your own plugin that respects affinities, or extend one of the existing plugins to add affinity support.
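The affinity check itself can be as simple as an admit_task() callback along these lines (just a sketch; double-check the callback signature and the task-parameter field names against your kernel version):

static long demo_admit_task(struct task_struct *tsk)
{
	int cpu = tsk->rt_param.task_params.cpu;

	/* Only admit the task if its assigned partition is contained in
	 * its Linux affinity mask. */
	if (!cpumask_test_cpu(cpu, &tsk->cpus_allowed))
		return -EINVAL;

	return 0;
}

The more interesting question is what the plugin should do when the affinity mask changes at runtime; the simplest policy is to reject such changes while the task is in real-time mode.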

Regards,
Björn
