[LITMUS^RT] RFC: Impl. of Queue Locks in Litmus

Glenn Elliott gelliott at cs.unc.edu
Sun Oct 6 00:51:15 CEST 2013


Hi Everyone,

There was some discussion/confusion here at UNC a few weeks ago about whether Litmus uses queue locks or ticket locks.  A little investigation confirmed that we do indeed use Linux's ticket locks.  As Björn's dissertation shows, queue locks are generally preferable to ticket locks.  I've uploaded a work-in-progress branch (wip-queue-locks) to github that introduces an implementation of queue locks.  The implementation is based on the userspace implementation Björn wrote for his dissertation, with a few modifications to make it work in kernelspace.  I'd like to solicit comments on it from those who are experienced with memory barriers and atomics.
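For those who haven't looked at the branch yet, the overall shape is a standard MCS-style queue lock.  Here is a rough, generic sketch (made-up names; this is *not* the code from mcsplock.h, which differs in details such as node storage, the volatile qualifiers, and exact barrier placement), mainly so the barrier questions below have some context:

    #include <linux/cache.h>      /* ____cacheline_aligned */
    #include <linux/compiler.h>   /* ACCESS_ONCE */
    #include <asm/processor.h>    /* cpu_relax */
    #include <asm/cmpxchg.h>      /* xchg, cmpxchg */
    #include <asm/barrier.h>      /* smp_mb */

    struct mcs_node {
    	struct mcs_node *next;
    	int locked;               /* 1 => lock has been handed to us */
    } ____cacheline_aligned;

    struct mcs_lock {
    	struct mcs_node *tail;    /* NULL when the lock is free */
    };

    static inline void mcs_acquire(struct mcs_lock *lock, struct mcs_node *node)
    {
    	struct mcs_node *prev;

    	node->next = NULL;
    	node->locked = 0;

    	/* Atomically append ourselves to the tail of the queue.
    	 * xchg() implies a full memory barrier. */
    	prev = xchg(&lock->tail, node);
    	if (!prev)
    		return;           /* queue was empty: we hold the lock */

    	/* Publish our node to the predecessor, then spin locally. */
    	prev->next = node;
    	while (!ACCESS_ONCE(node->locked))
    		cpu_relax();
    	smp_mb();                 /* keep the critical section after the handoff */
    }

    static inline void mcs_release(struct mcs_lock *lock, struct mcs_node *node)
    {
    	if (!ACCESS_ONCE(node->next)) {
    		/* No visible successor: try to mark the lock free. */
    		if (cmpxchg(&lock->tail, node, NULL) == node)
    			return;
    		/* A successor is between its xchg() and its prev->next store. */
    		while (!ACCESS_ONCE(node->next))
    			cpu_relax();
    	}
    	smp_mb();                 /* critical section visible before the handoff */
    	ACCESS_ONCE(node->next)->locked = 1;
    }

The point of the local spin on node->locked is that each waiter only polls a cache line it owns; the handoff touches at most one remote line.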

A few questions:
1) Did I use smp_mb() correctly?  Documentation/atomic_ops.txt says that atomic_cmpxchg() is not a memory barrier, so I added my own.
2) Is there a way to get rid of "volatile" in mcspnode_t?  I'm fine with "volatile" being there, but Documentation/volatile-considered-harmful.txt says we should avoid it if possible.
3) The space complexity of this implementation is pretty awful.  Each queue lock takes (m+1) cache lines.  On a 32-core Intel platform, that's nearly half a page of data (each cache line is 64 bytes).  Of course, the whole point of a queue lock is to ensure that each CPU never touches more than two or three lines, so it's not as though a queue lock is going to wipe out the cache.  Nevertheless, does anyone have ideas for reducing the queue lock's memory footprint while still benefiting from O(1) remote accesses?  (A sketch of the per-lock layout I have in mind follows this list.)
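Concretely, the per-lock footprint I'm describing looks roughly like this (hypothetical declaration, reusing the sketch's node type above; the actual layout in mcsplock.h may differ):

    /* m cache-line-aligned queue nodes (one per CPU) plus one line for the
     * tail pointer gives the (m+1) lines mentioned above. */
    struct mcsp_lock {
    	struct mcs_node nodes[NR_CPUS];               /* m lines: one node per CPU */
    	struct mcs_node *tail ____cacheline_aligned;  /* +1 line for the tail      */
    };

With m = 32 and 64-byte lines, that is 33 * 64 = 2112 bytes per lock, i.e. about half of a 4 KB page, which is where the figure above comes from.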

The branch also introduces "litmus_spinlock_t".  When CONFIG_LITMUS_SPINLOCK is set, litmus_spinlock_t maps to the queue lock type; otherwise it maps to raw_spinlock_t.  I've tested queue locks in G-EDF under heavy load: all real-time tasks ran to completion, and there were no panics or warnings, so that is encouraging.
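The wiring is roughly the following (a sketch with hypothetical wrapper names, not the branch's actual macros; it also assumes preemption is disabled across the critical section so that smp_processor_id() is stable):

    #ifdef CONFIG_LITMUS_SPINLOCK
    typedef struct mcsp_lock litmus_spinlock_t;
    /* Each CPU uses its own queue node, so the node argument can be hidden. */
    #define litmus_spin_lock(l)	mcs_acquire((l), &(l)->nodes[smp_processor_id()])
    #define litmus_spin_unlock(l)	mcs_release((l), &(l)->nodes[smp_processor_id()])
    #else
    typedef raw_spinlock_t litmus_spinlock_t;
    #define litmus_spin_lock(l)	raw_spin_lock(l)
    #define litmus_spin_unlock(l)	raw_spin_unlock(l)
    #endif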

In addition to soliciting comments on the implementation, I also plan to ask some of our new Litmus developers at UNC to profile scheduling overheads on a 24-core, globally scheduled system to see whether there is any real benefit to using queue locks in this context.  I think we'll look at worst-case scheduling overheads and maybe cache performance counters.  Are there any other metrics we should consider?

Here's a link to the queue lock implementation on github: https://github.com/LITMUS-RT/litmus-rt/blob/wip-queue-locks/include/litmus/mcsplock.h

Thanks,
Glenn


