[LITMUS^RT] new patches and new branch

Glenn Elliott gelliott at cs.unc.edu
Wed Jun 26 17:43:01 CEST 2013


On Jun 26, 2013, at 7:08 AM, Björn Brandenburg <bbb at mpi-sws.org> wrote:

> On Jun 25, 2013, at 4:52 PM, Glenn Elliott <gelliott at cs.unc.edu> wrote:
> 
>> 
>> Questions about the GSN-EDF link changes: Do you have any empirical data on how this changes average case preemption/migration costs in comparison to the affinity-aware linking? Does it make sense to avoid an IPI if it means migrating across a CPU socket?  It seems we should be chasing after improved average-case behavior here because I don't believe this latest patch changes schedulability analysis formulas---just possibly the values we plug in.
> 
> The "local CPU first" shortcut serves to reduce scheduling latency in the average case (unless the system is highly utilized), and also affects observed maxima in the case of having one task per core. The latter is the workload run by cyclictest, which we used to compare scheduling latency across LITMUS^RT, Linux, and PREEMPT_RT.
> 
> 	https://www.mpi-sws.org/~bbb/papers/pdf/ospert13.pdf
> 
> The affinity-aware linking doesn't help to improve scheduling latency. Conversely, I don't think the "local CPU first" shortcut significantly affects preemption / migration costs. In any case, we could make it optional with a CONFIG option.

Interesting results in the OSPERT paper!  I guess it confirms what we suspected about Litmus's latencies; I'm glad they're getting better, though.  I haven't read the paper in detail yet, but did you test Litmus with "threadirqs" set in the kernel boot parameters?  With interrupt handlers running as kernel threads, Litmus tasks blocked on an I/O interrupt could suffer an unbounded priority inversion...

With respect to the "local CPU first" shortcut, do you suppose we're looking at an application-dependent trade-off?  We can either reduce a small fixed cost (scheduling latency) or reduce occasional large costs (migrating tasks with large WSSs).  Suppose we have two available CPUs when we need to schedule task T: (1) a local CPU, and (2) a remote CPU with which T has cache affinity.  Suppose (1) is an arbitrary distance X, w.r.t. the memory hierarchy, from (2).  If we wish to minimize task/job response time, then I would expect the following:

1) For a task with a small WSS, it is best to schedule on the local CPU.
2) For a task with a large WSS, it is best to schedule on the remote CPU.
3) For a task with a medium WSS, the best CPU depends on X.

Hardware-specific and application-specific characteristics would define what small/medium/large mean.
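To make that concrete, here's a rough sketch of the kind of linking decision I have in mind.  Everything in it is invented for illustration: the thresholds, the WSS hint, and the helper's arguments don't exist in Litmus today.

/* Hypothetical sketch only -- none of these names exist in Litmus.
 * Choose between the local CPU and the remote CPU with which task T
 * has cache affinity, based on T's (self-reported) working-set size. */

#define WSS_SMALL_BYTES  (32 * 1024)        /* made-up: fits comfortably in L1/L2 */
#define WSS_LARGE_BYTES  (4 * 1024 * 1024)  /* made-up: exceeds the shared cache  */

static int choose_target_cpu(unsigned long wss_bytes, int local_cpu,
                             int remote_cpu, int cpus_share_cache)
{
	if (wss_bytes <= WSS_SMALL_BYTES)
		return local_cpu;   /* (1) small WSS: skip the IPI, link locally */
	if (wss_bytes >= WSS_LARGE_BYTES)
		return remote_cpu;  /* (2) large WSS: cache affinity dominates   */
	/* (3) medium WSS: if the two CPUs share a cache level (X is small),
	 * the working set stays warm either way, so the cheaper local choice
	 * wins; otherwise keep T near its cache footprint. */
	return cpus_share_cache ? local_cpu : remote_cpu;
}

Whether the latency saved in case (1) actually outweighs the cache penalty obviously hinges on where those thresholds sit on a given machine.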

I guess what I'm trying to say is that I see no way for Litmus to offer a good one-size-fits-all solution offline, so I agree that a compile-time option is best.  That said, it would be interesting if Litmus could go through some sort of "self-tuning" phase at boot.  Applications could then inform Litmus of their WSS (a rough user-space sketch is below), and Litmus could work through the three scenarios above to make better scheduling decisions.  Still, I wonder if there would be any real payoff to all this.  Perhaps it would just add a lot of complexity to occasionally save a few microseconds.
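Just to sketch what I mean by applications informing Litmus of their WSS: imagine a (currently nonexistent) wss field in struct rt_task, set alongside the usual liblitmus task parameters.  Something like:

/* User-space sketch, liblitmus-style.  The wss field is made up; the
 * rest follows the usual rt_task setup, zeroing fields we don't use. */
#include <string.h>
#include <litmus.h>

static int announce_wss(void)
{
	struct rt_task param;

	memset(&param, 0, sizeof(param));
	param.exec_cost = ms2ns(10);
	param.period    = ms2ns(100);
	param.wss       = 2 * 1024 * 1024;  /* hypothetical hint: ~2 MiB working set */

	return set_rt_task_param(gettid(), &param);
}

The plugin could then fold that hint into its linking decision (as in the earlier sketch), or ignore it entirely if the self-tuning data says it doesn't pay off.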

>> Also, going forward, how do you feel about committing to keeping C-EDF in sync with GSN-EDF?  I'd be happy to port any patches from GSN-EDF to C-EDF since I appear to be its primary user.
> 
> Sure. They should generally be in sync.  In this case, we were using G-EDF and porting it to C-EDF just wasn't a priority yet. (Rebasing cleanly on top of 3.0 took a couple of days.)

Great! Glad to hear it.

Thanks,
Glenn


