[LITMUS^RT] Help with Understanding Scheduling Results
Meng Xu
xumengpanda at gmail.com
Mon Mar 16 20:48:56 CET 2015
Hi Geoffrey,
2015-03-16 15:07 GMT-04:00 Geoffrey Tran <gtran at isi.edu>:
> Hi Björn,
>
> Thanks for your reply. I've answered your questions below. I hope the formatting did not get disrupted.
>
>>----- Original Message -----
>>From: "Björn Brandenburg" <bbb at mpi-sws.org>
>>To: "Geoffrey Tran" <gtran at isi.edu>, litmus-dev at lists.litmus-rt.org
>>Cc: "Mikyung Kang" <mkkang at isi.edu>, "Stephen Crago" <crago at isi.edu>, "John Walters" <jwalters at isi.edu>
>>Sent: Thursday, March 12, 2015 5:39:09 PM
>>Subject: Re: [LITMUS^RT] Help with Understanding Scheduling Results
>>
>>
>>> On 12 Mar 2015, at 23:06, Geoffrey Tran <gtran at isi.edu> wrote:
>>>
>>> I have been working with LITMUS-RT, but have come across some weird results
>>> and was hoping someone could help clarify it please.
>>
>>Hi Geoffrey,
>>
>>thanks for your interest in LITMUS^RT. I’ve not observed any similar behavior; it indeed looks pretty strange. Some questions/comments inline below.
>>
>>>
>>> The machine used is a 16-core system. 8 cores are dedicated to the host,
>>> while up to the remaining 8 cores are used to run a Xen virtual machine.
>>> Just in case more background is needed, Xen is a popular hypervisor. The
>>> RTDS scheduler is a new real-time scheduler that originated in the RT-Xen
>>> project and is now part of upstream Xen. The scheduler gives each VCPU a
>>> guaranteed capacity based on the input parameters.
>>>
>>> LITMUS-RT v2014.2 is installed in the guest. The hypervisor is using the
>>> rtds scheduler, with each VCPU receiving the full allocation (Budget: 10000,
>>> Period: 10000).
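(Side note: assuming these values are in the usual microsecond units, Budget = Period = 10000 means each VCPU is reserved a full physical CPU. A trivial sketch of that reservation math, for illustration only:)

#include <stdio.h>

/* Illustration only: with RTDS, each VCPU is reserved budget/period of a
 * physical CPU per scheduling period. If the values above are in the usual
 * microsecond units, Budget = Period = 10000 (10 ms / 10 ms) reserves a
 * full physical core for every VCPU. */
int main(void)
{
    double budget_us = 10000.0;
    double period_us = 10000.0;

    printf("per-VCPU reservation: %.0f%% of a physical CPU\n",
           100.0 * budget_us / period_us);
    return 0;
}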
>
>>Which LITMUS^RT scheduler are you using?
>
> We are using the GSN-EDF scheduler.
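(Side note for anyone reproducing the setup: as far as I know, the active plugin is normally selected with liblitmus's setsched tool, which essentially just writes the plugin name to /proc/litmus/active_plugin. A minimal C sketch of the same idea, assuming a standard LITMUS^RT build:)

#include <stdio.h>

/* Sketch: select the GSN-EDF plugin. Roughly equivalent to running
 * `setsched GSN-EDF`; switching only succeeds while no real-time
 * tasks are currently running. */
int main(void)
{
    FILE *f = fopen("/proc/litmus/active_plugin", "w");
    if (!f) {
        perror("fopen /proc/litmus/active_plugin");
        return 1;
    }
    fprintf(f, "GSN-EDF");
    return fclose(f) == 0 ? 0 : 1;
}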
>
>>Can you reproduce the issue with LITMUS^RT running on bare metal?
>
> When running on bare metal, the issue is not observed: all jobs of every task
> meet their deadlines.
>
>>>
>>> The issue is as follows: when running two simple tasks, each with less than 50%
>>> single-core utilization (Cost=200ms, Period=500ms), jobs still miss their
>>> deadlines in certain cases.
>>> - When we use only 2 VCPUs for the guest, all jobs meet their deadlines
>>> - When we use 3 or more VCPUs, deadline misses are observed
>>
>>Do you use budget enforcement?
>
> No, there is no budget enforcement. This could explain the "rollover" effect,
> i.e., why the missed jobs come in back-to-back batches, but not why the first
> job in a batch missed its deadline in the first place, right?
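(For context, a minimal liblitmus sketch, modeled on the base_task.c example, of how a task like the ones above would be set up, including the budget policy; field names may differ slightly across liblitmus versions, so please double-check against yours. With NO_ENFORCEMENT, an overrunning job simply keeps executing into later periods, which fits the batched misses in the statistics below:)

#include <litmus.h>

/* Roughly 200 ms of work per job would go here. */
static void one_job(void)
{
}

int main(void)
{
    struct rt_task param;

    if (init_litmus() != 0)
        return 1;

    init_rt_task_param(&param);
    param.exec_cost         = ms2ns(200);   /* COST   */
    param.period            = ms2ns(500);   /* PERIOD */
    param.relative_deadline = ms2ns(500);   /* implicit deadline */
    /* NO_ENFORCEMENT lets a job run past exec_cost;
     * PRECISE_ENFORCEMENT would throttle it at exactly exec_cost. */
    param.budget_policy = NO_ENFORCEMENT;

    if (set_rt_task_param(gettid(), &param) < 0)
        return 1;
    if (task_mode(LITMUS_RT_TASK) != 0)
        return 1;

    for (int i = 0; i < 25; i++) {  /* roughly as many jobs as in the table */
        one_job();
        sleep_next_period();
    }

    task_mode(BACKGROUND_TASK);
    return 0;
}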
>
>>
>>> The results from st_job_stats and an excerpt of the logs for the 3 VCPU case
>>> are below:
>>> # Task, Job, Period, Response, DL Miss?, Lateness, Tardiness
>>> # task NAME=myapp PID=14442 COST=250000000 PERIOD=500000000 CPU=0
>>> 14442, 2, 500000000, 32592, 0, -499967408, 0
>>> 14442, 3, 500000000, 61805, 0, -499938195, 0
>>> 14442, 4, 500000000, 291581439, 0, -208418561, 0
>>> 14442, 5, 500000000, 251964671, 0, -248035329, 0
>>> 14442, 6, 500000000, 252424182, 0, -247575818, 0
>>> 14442, 7, 500000000, 251938074, 0, -248061926, 0
>>> 14442, 8, 500000000, 252145862, 0, -247854138, 0
>>> 14442, 9, 500000000, 251845811, 0, -248154189, 0
>>> 14442, 10, 500000000, 257706935, 0, -242293065, 0
>>> 14442, 11, 500000000, 251850581, 0, -248149419, 0
>>> 14442, 12, 500000000, 252553597, 0, -247446403, 0
>>> 14442, 13, 500000000, 251765063, 0, -248234937, 0
>>> 14442, 14, 500000000, 252902538, 0, -247097462, 0
>>> 14442, 15, 500000000, 1009091185, 1, 509091185, 509091185
>>> 14442, 16, 500000000, 760966632, 1, 260966632, 260966632
>>> 14442, 17, 500000000, 512866266, 1, 12866266, 12866266
>>> 14442, 18, 500000000, 264818921, 0, -235181079, 0
>>> 14442, 19, 500000000, 253024397, 0, -246975603, 0
>>> 14442, 20, 500000000, 252785150, 0, -247214850, 0
>>> 14442, 21, 500000000, 252466946, 0, -247533054, 0
>>> 14442, 22, 500000000, 1459862887, 1, 959862887, 959862887
>>> 14442, 23, 500000000, 1211903080, 1, 711903080, 711903080
>>> 14442, 24, 500000000, 963919848, 1, 463919848, 463919848
>>> # task NAME=myapp PID=14443 COST=200000000 PERIOD=500000000 CPU=0
>>> 14443, 2, 500000000, 58150, 0, -499941850, 0
>>> 14443, 3, 500000000, 3202178, 0, -496797822, 0
>>> 14443, 4, 500000000, 201662924, 0, -298337076, 0
>>> 14443, 5, 500000000, 213828161, 0, -286171839, 0
>>> 14443, 6, 500000000, 202532002, 0, -297467998, 0
>>> 14443, 7, 500000000, 961643647, 1, 461643647, 461643647
>>> 14443, 8, 500000000, 663707479, 1, 163707479, 163707479
>>> 14443, 9, 500000000, 365603701, 0, -134396299, 0
>>> 14443, 10, 500000000, 201910605, 0, -298089395, 0
>>> 14443, 11, 500000000, 209025099, 0, -290974901, 0
>>> 14443, 12, 500000000, 210602663, 0, -289397337, 0
>>> 14443, 13, 500000000, 247544048, 0, -252455952, 0
>>> 14443, 14, 500000000, 459680759, 0, -40319241, 0
>>> 14443, 15, 500000000, 202763167, 0, -297236833, 0
>>> 14443, 16, 500000000, 202368570, 0, -297631430, 0
>>> 14443, 17, 500000000, 201857551, 0, -298142449, 0
>>> 14443, 18, 500000000, 201999263, 0, -298000737, 0
>>> 14443, 19, 500000000, 958909178, 1, 458909178, 458909178
>>> 14443, 20, 500000000, 660591085, 1, 160591085, 160591085
>>> 14443, 21, 500000000, 362025653, 0, -137974347, 0
>>> 14443, 22, 500000000, 202565748, 0, -297434252, 0
>>> 14443, 23, 500000000, 202456104, 0, -297543896, 0
>>> 14443, 24, 500000000, 202528628, 0, -297471372, 0
>>>
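(For readers of the tables above: with implicit deadlines the columns are related by lateness = response - period and tardiness = max(0, lateness), all in nanoseconds. A trivial check against job 15 of PID 14442:)

#include <stdio.h>

/* Sanity check of the st_job_stats columns, assuming implicit deadlines
 * (relative deadline == period): lateness = response - period,
 * tardiness = max(0, lateness). Values are in nanoseconds. */
int main(void)
{
    long long period   = 500000000LL;   /* PERIOD of PID 14442     */
    long long response = 1009091185LL;  /* response time of job 15 */

    long long lateness  = response - period;
    long long tardiness = lateness > 0 ? lateness : 0;

    /* Prints "509091185 509091185", matching job 15's row above. */
    printf("%lld %lld\n", lateness, tardiness);
    return 0;
}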
>>> .....
>>> [ 328638446] RELEASE 14443/14 on CPU 0 328638946.00ms
>>> [ 328638446] RELEASE 14442/14 on CPU 1 328638946.00ms
>>> [ 328638446] SWITCH_TO 14442/14 on CPU 0
>>> [ 328638698] SWITCH_FROM 14442/14 on CPU 0
>>> [ 328638698] COMPLETION 14442/14 on CPU 0
>>> [ 328638704] SWITCH_TO 14443/14 on CPU 1
>>> [ 328638905] SWITCH_FROM 14443/14 on CPU 1
>>> [ 328638905] COMPLETION 14443/14 on CPU 1
>>> [ 328638946] RELEASE 14442/15 on CPU 0 328639446.00ms
>>> [ 328638946] RELEASE 14443/15 on CPU 1 328639446.00ms
>>> [ 328638946] SWITCH_TO 14443/15 on CPU 0
>>> [ 328639148] SWITCH_FROM 14443/15 on CPU 0
>>> [ 328639148] COMPLETION 14443/15 on CPU 0
>>> [ 328639446] RELEASE 14443/16 on CPU 0 328639946.00ms
>>> [ 328639446] RELEASE 14442/16 on CPU 1 328639946.00ms
>>
>>Random observation: the next release of 14443 happens on CPU 0, whereas 14442 is still being released on CPU 1. This means CPU 1 must have set a timer for the release to occur. Usually, it shouldn’t do that unless it either (a) processed a completion or (b) prevented a budget overrun. Either event should have left a trace record (I think). Are you sure the trace is complete (i.e., no missing writes)?
>>
>
> Yes, as far as I know, there were no reported failed writes at the completion of the tracing.
>
>>
>>> [ 328639446] SWITCH_TO 14443/16 on CPU 0
>>> [ 328639648] SWITCH_FROM 14443/16 on CPU 0
>>> [ 328639648] COMPLETION 14443/16 on CPU 0
>>> [ 328639703] SWITCH_TO 14442/15 on CPU 1
>>> [ 328639946] RELEASE 14442/17 on CPU 1 328640446.00ms
>>> [ 328639946] RELEASE 14443/17 on CPU 0 328640446.00ms
>>> [ 328639946] SWITCH_TO 14443/17 on CPU 0
>>> [ 328639955] SWITCH_FROM 14442/15 on CPU 1
>>> [ 328639955] COMPLETION 14442/15 on CPU 1
>>> [ 328639955] SWITCH_TO 14442/16 on CPU 1
>>> [ 328640147] SWITCH_FROM 14443/17 on CPU 0
>>> [ 328640147] COMPLETION 14443/17 on CPU 0
>>> [ 328640206] SWITCH_FROM 14442/16 on CPU 1
>>> [ 328640206] COMPLETION 14442/16 on CPU 1
>>> [ 328640206] SWITCH_TO 14442/17 on CPU 1
>>> .....
>>>
>>>
>>> It is interesting to note that when both jobs numbered 15 are released, the job for
>>> PID 14443 is allowed to run. However, the job for PID 14442 does not run until the
>>> next period!
>>>
>>> It has also been observed that running the two tasks on a guest with 8 VCPUs shows the
>>> same deadline misses. But by artificially loading the system with 6 instances of "yes >
>>> /dev/null", all jobs meet their deadlines.
>>
>>Do these instances of yes run as LITMUS^RT tasks or just as regular SCHED_OTHER tasks?
>
> The instances of yes run as regular tasks. I just launched them from the shell without any calls to liblitmus.
>
>>
>>Also, can you provide matching output of the debug trace log (the TRACE() stream)?
>
> I've attached two text documents containing the output from st_job_stats followed by the
> trace logs you've requested. One is for the bare-metal case, one is for the VM. Sorry,
> we don't have the logs for that exact run available, but this is from a run with the
> same parameters. The bare-metal case is a bit cluttered because preemption tracing is
> currently enabled on that system.
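(Side note on capturing the TRACE() stream: as far as I know, on kernels built with debug tracing it is exported at /dev/litmus/log, so running "cat /dev/litmus/log > debug.txt" during the experiment is enough; the sketch below does the same thing programmatically.)

#include <stdio.h>

/* Sketch: copy the TRACE() debug stream to a file. Assumes a LITMUS^RT
 * kernel with debug tracing enabled, which exports the stream as
 * /dev/litmus/log. Stop it with Ctrl-C once the experiment is done. */
int main(void)
{
    FILE *in  = fopen("/dev/litmus/log", "r");
    FILE *out = fopen("debug-trace.txt", "w");
    int c;

    if (!in || !out) {
        perror("fopen");
        return 1;
    }
    while ((c = fgetc(in)) != EOF)
        fputc(c, out);

    fclose(out);
    fclose(in);
    return 0;
}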
>
>>
>>I’ve not used LITMUS^RT on Xen. I don’t mean to blame Xen, but it would make it easier for us if you could reproduce this behavior on bare metal.
>
> Again, thanks for your help and comments so far!
>
> To provide a little bit of context, we're virtualizing in this case because it
> looks to be a very interesting area. The weird part is that the above issues do
> not occur on bare metal. It is very odd because we do not observe this on KVM
> either.
How did you configure KVM? I'm not sure whether KVM uses a real-time
scheduler or not. If not, it will not be an apples-to-apples comparison.
(I'm not trying to defend Xen; I'm just trying to figure out what is
going wrong.)
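For what it's worth, on KVM each VCPU is just an ordinary host thread
scheduled by CFS unless you explicitly give it a real-time priority,
e.g. with chrt or something like the hedged sketch below (the thread id
is hypothetical). That is another reason the KVM and Xen/RTDS setups may
not be directly comparable.

#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Sketch: give a KVM vCPU thread a real-time priority on the host, so it
 * is not time-shared under CFS. The thread id below is hypothetical; the
 * real one can be found under /proc/<qemu-pid>/task/. */
int main(void)
{
    pid_t vcpu_tid = 12345;  /* hypothetical vCPU thread id */
    struct sched_param sp = { .sched_priority = 50 };

    if (sched_setscheduler(vcpu_tid, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        return 1;
    }
    return 0;
}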
>>> The issue is as follows: when running two simple tasks, each with less than 50%
>>> single-core utilization (Cost=200ms, Period=500ms), jobs still miss their
>>> deadlines in certain cases.
>>> - When we use only 2 VCPUs for the guest, all jobs meet their deadlines
>>> - When we use 3 or more VCPUs, deadline misses are observed
The weird thing is why allocating more resources to the VM makes a
schedulable task set unschedulable. This looks like a scheduling
anomaly.
Is it possible that the release timer is armed on VCPU 1, but VCPU 1
is scheduled out right before the timer fires? That would delay the
release. In addition, if you dedicate a physical CPU to each VCPU and
do not allow that CPU to run any other VCPUs, does the same weird
situation still happen?
Thanks,
Meng
--
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania