[LITMUS^RT] prop/sched-domains

Mon Feb 24 20:20:56 CET 2014

On 20 Feb 2014, at 03:23, Glenn Elliott <gelliott at cs.unc.edu> wrote:

> 
> On Feb 5, 2014, at 4:34 PM, Björn Brandenburg <bbb at mpi-sws.org> wrote:
> 
>> 
>> On 05 Feb 2014, at 14:23, Glenn Elliott <gelliott at cs.unc.edu> wrote:
>> 
>>> It seems to me that your concerns focus mainly on breaking liblitmus’s API.  We can keep the old functions and tag them as deprecated.  Under these patches, the clusterSize field would just be ignored.
>> 
>> No, I’m opposed to leaving around deprecated functions. The goal is to reduce the size of the code base that we need to maintain, not to increase it. ;-)
>> 
>>> 
>>>>> Why can’t the scheduler just migrate the task automatically? More frustratingly, the plugins silently reject the task in admit_task(); nothing gets printk()’ed.
>>>> 
>>>> It could. I’m happy to merge well-tested patches that do this on a plugin-by-plugin basis. However, I wouldn’t remove this as a general rule. We have some plugins where we’d like to keep enforcing that restriction.
>>> 
>>> I can put together a simple patch that puts a printk() on the failure path.
>> 
>> Sounds good, thanks.
>> 
>>> 
>>>> By the way, plugins should not print anything on a *successful* task admission. Systems configured to redirect printk() to the serial port (e.g., our 64-core server that’s hooked up to RAC) can incur ridiculous latencies when talking to the serial port driver. For Felipe's OSPERT’13 paper, we had to remove some printk()s from the non-error path because the serial port driver gave us interrupt latencies in the millisecond range (when flushing printk()s, it apparently spins with interrupts off, waiting for the other side to ack writes).
>>>> 
>>>> Thanks,
>>>> Björn
>>> 
>>> 
>>> I’m flexible with regard to liblitmus API changes/compatibility.  The case I am trying to make is that plugins should explicitly report their cluster configurations, rather than leave it up to userspace code (or scripts) to deduce them.  The deduction is hard: you have to look at which CPU is the release master (if any), look at cache configurations in /sys, examine the clustering configuration in /proc/litmus/plugins/<plugin>/cluster, and understand the inner workings of the plugin in question.  The /proc/litmus interface I am proposing is cleaner and far more foolproof.
>> 
>> 
>> Well, I wasn’t thinking of providing a general script that can deduce all this. Rather, I usually write per-experiment scripts that just happen to have hardcoded the cluster configurations that I’m using for that paper or machine. It’s easy to get up and running, and easy to tweak as the need arises. It’s admittedly not a shrink-wrapped, user-friendly way of doing things. Your patches are of course much easier to work with for novices. I’m happy to merge them if you think they’ve been tested enough.
>> 
>> Thanks,
>> Björn
> 
> 
> 
> Hi Björn,
> 
> I’m pretty happy with the litmus-rt patch.  I went ahead and added another patch to liblitmus/prop/sched-domains to add back the old migration functions.  These functions merely wrap the new ones, however.  I tested the patches in KVM and ludwig.  I think these are ready to be merged into staging, if everyone is in agreement.
> 
> Thanks,
> Glenn

Hi Glenn,

I’ve merged both the kernel and liblitmus branches into the respective staging branches. I had to do a fair bit of rebasing and also had to resolve some merge conflicts. Could you please check that I didn’t break anything in the process? It compiles fine on my system, but it would be good if you could take a look and give it a spin.

Thanks,
Björn
