[LITMUS^RT] Help with Understanding Scheduling Results
Meng Xu
xumengpanda at gmail.com
Thu Mar 19 14:04:14 CET 2015
2015-03-19 2:48 GMT-04:00 Geoffrey Tran <gtran at isi.edu>:
> Hi Meng,
>
> Sorry for the late reply.
>
> I didn't do the setup for the KVM instance, so I can't comment on its characteristics. I will try to find out, though. I agree that if it isn't an RT scheduler, we cannot make a fair comparison. However, I mentioned it as an example of a virtualized system where the problem does not occur.
Another way to compare Xen and KVM is to use the credit or credit2
scheduler on Xen and see whether the scheduling anomaly still occurs.
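A rough sketch of that experiment (I'm writing the commands from memory,
so double-check them against your Xen version):

    # check which scheduler the hypervisor is currently running
    xl info | grep xen_scheduler

    # credit is the default; to try credit2, add sched=credit2 to the
    # Xen boot command line (the xen.gz entry in grub) and reboot, then
    # confirm with the same xl info command

Then rerun exactly the same task set in the guest and check whether the
deadline misses with 3 or more VCPUs still show up.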
Thanks,
Meng
>
> It is indeed very strange that allocating more resources to the VM makes the jobs start missing their deadlines. I did try pinning the VCPUs to their own dedicated PCPUs, and I still see the same results.
>
> Thanks,
> Geoffrey
>
> ----- Original Message -----
> From: "Meng Xu" <xumengpanda at gmail.com>
> To: "Geoffrey Tran" <gtran at isi.edu>, litmus-dev at lists.litmus-rt.org
> Cc: "Björn Brandenburg" <bbb at mpi-sws.org>, "Mikyung Kang" <mkkang at isi.edu>, "Stephen Crago" <crago at isi.edu>, "John Walters" <jwalters at isi.edu>
> Sent: Monday, March 16, 2015 9:48:56 AM
> Subject: Re: [LITMUS^RT] Help with Understanding Scheduling Results
>
> Hi Geoffrey,
>
> 2015-03-16 15:07 GMT-04:00 Geoffrey Tran <gtran at isi.edu>:
>> Hi Björn,
>>
>> Thanks for your reply. I've answered your questions below. I hope the formatting did not get disrupted.
>>
>>>----- Original Message -----
>>>From: "Björn Brandenburg" <bbb at mpi-sws.org>
>>>To: "Geoffrey Tran" <gtran at isi.edu>, litmus-dev at lists.litmus-rt.org
>>>Cc: "Mikyung Kang" <mkkang at isi.edu>, "Stephen Crago" <crago at isi.edu>, "John Walters" <jwalters at isi.edu>
>>>Sent: Thursday, March 12, 2015 5:39:09 PM
>>>Subject: Re: [LITMUS^RT] Help with Understanding Scheduling Results
>>>
>>>
>>>> On 12 Mar 2015, at 23:06, Geoffrey Tran <gtran at isi.edu> wrote:
>>>>
>>>> I have been working with LITMUS-RT, but have come across some weird results
>>>> and was hoping someone could help clarify them, please.
>>>
>>>Hi Geoffrey,
>>>
>>>thanks for your interest in LITMUS^RT. I’ve not observed any similar behavior; it indeed looks pretty strange. Some questions/comments inline below.
>>>
>>>>
>>>> The machine used is a 16-core system. 8 cores are dedicated to the host,
>>>> while up to 8 of the remaining cores are used to run a Xen virtual machine.
>>>> Just in case more background is needed, Xen is a popular hypervisor. The
>>>> RTDS scheduler is a new real-time scheduler that originated as the RT-Xen
>>>> project and is now part of upstream Xen. The scheduler guarantees each
>>>> VCPU a share of PCPU capacity based on its budget and period parameters.
>>>>
>>>> LITMUS-RT v2014.2 is installed in the guest. The hypervisor is using the
>>>> rtds scheduler, with each VCPU receiving the full allocation (Budget: 10000,
>>>> Period: 10000).
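>>>> (For reference, per-domain RTDS parameters of this form are set with the
>>>> xl tool, roughly as follows; the domain name is a placeholder and the
>>>> exact flags should be checked against the xl man page:
>>>>
>>>>   # budget and period are in microseconds; 10000/10000 = full allocation
>>>>   xl sched-rtds -d <guest-domain> -p 10000 -b 10000
>>>>   xl sched-rtds -d <guest-domain>     # read the parameters back
>>>> )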
>>
>>>Which LITMUS^RT scheduler are you using?
>>
>> We are using the GSN-EDF scheduler.
>>
>>>Can you reproduce the issue with LITMUS^RT running on bare metal?
>>
>> When running on bare metal, the issue is not observed. All jobs of every task
>> meet their deadlines.
>>
>>>>
>>>> The issue is as follows: when running two simple tasks, each with less than
>>>> 50% single-core utilization (Cost=200ms, Period=500ms), jobs still miss
>>>> their deadlines in certain cases:
>>>> - When we use only 2 VCPUs for the guest, all jobs meet their deadlines
>>>> - When we use 3 or more VCPUs, deadline misses are observed
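>>>> (An equivalent synthetic workload with the stock liblitmus tools would be
>>>> roughly the following, with the costs/periods taken from the job stats
>>>> below:
>>>>
>>>>   setsched GSN-EDF          # select the plugin inside the guest
>>>>   rtspin -w 250 500 60 &    # 250ms cost, 500ms period, run for 60s
>>>>   rtspin -w 200 500 60 &    # 200ms cost, 500ms period, run for 60s
>>>>   release_ts                # synchronously release the waiting tasks
>>>> )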
>>>
>>>Do you use budget enforcement?
>>
>> No, there is no budget enforcement. That could explain the "rollover", i.e.,
>> why the missed jobs come in batches, but not why the first job in each batch
>> missed, right?
>>
>>>
>>>> The results from st_job_stats and an excerpt of the logs for the 3 VCPU case
>>>> are below:
>>>> # Task, Job, Period, Response, DL Miss?, Lateness, Tardiness
>>>> # task NAME=myapp PID=14442 COST=250000000 PERIOD=500000000 CPU=0
>>>> 14442, 2, 500000000, 32592, 0, -499967408, 0
>>>> 14442, 3, 500000000, 61805, 0, -499938195, 0
>>>> 14442, 4, 500000000, 291581439, 0, -208418561, 0
>>>> 14442, 5, 500000000, 251964671, 0, -248035329, 0
>>>> 14442, 6, 500000000, 252424182, 0, -247575818, 0
>>>> 14442, 7, 500000000, 251938074, 0, -248061926, 0
>>>> 14442, 8, 500000000, 252145862, 0, -247854138, 0
>>>> 14442, 9, 500000000, 251845811, 0, -248154189, 0
>>>> 14442, 10, 500000000, 257706935, 0, -242293065, 0
>>>> 14442, 11, 500000000, 251850581, 0, -248149419, 0
>>>> 14442, 12, 500000000, 252553597, 0, -247446403, 0
>>>> 14442, 13, 500000000, 251765063, 0, -248234937, 0
>>>> 14442, 14, 500000000, 252902538, 0, -247097462, 0
>>>> 14442, 15, 500000000, 1009091185, 1, 509091185, 509091185
>>>> 14442, 16, 500000000, 760966632, 1, 260966632, 260966632
>>>> 14442, 17, 500000000, 512866266, 1, 12866266, 12866266
>>>> 14442, 18, 500000000, 264818921, 0, -235181079, 0
>>>> 14442, 19, 500000000, 253024397, 0, -246975603, 0
>>>> 14442, 20, 500000000, 252785150, 0, -247214850, 0
>>>> 14442, 21, 500000000, 252466946, 0, -247533054, 0
>>>> 14442, 22, 500000000, 1459862887, 1, 959862887, 959862887
>>>> 14442, 23, 500000000, 1211903080, 1, 711903080, 711903080
>>>> 14442, 24, 500000000, 963919848, 1, 463919848, 463919848
>>>> # task NAME=myapp PID=14443 COST=200000000 PERIOD=500000000 CPU=0
>>>> 14443, 2, 500000000, 58150, 0, -499941850, 0
>>>> 14443, 3, 500000000, 3202178, 0, -496797822, 0
>>>> 14443, 4, 500000000, 201662924, 0, -298337076, 0
>>>> 14443, 5, 500000000, 213828161, 0, -286171839, 0
>>>> 14443, 6, 500000000, 202532002, 0, -297467998, 0
>>>> 14443, 7, 500000000, 961643647, 1, 461643647, 461643647
>>>> 14443, 8, 500000000, 663707479, 1, 163707479, 163707479
>>>> 14443, 9, 500000000, 365603701, 0, -134396299, 0
>>>> 14443, 10, 500000000, 201910605, 0, -298089395, 0
>>>> 14443, 11, 500000000, 209025099, 0, -290974901, 0
>>>> 14443, 12, 500000000, 210602663, 0, -289397337, 0
>>>> 14443, 13, 500000000, 247544048, 0, -252455952, 0
>>>> 14443, 14, 500000000, 459680759, 0, -40319241, 0
>>>> 14443, 15, 500000000, 202763167, 0, -297236833, 0
>>>> 14443, 16, 500000000, 202368570, 0, -297631430, 0
>>>> 14443, 17, 500000000, 201857551, 0, -298142449, 0
>>>> 14443, 18, 500000000, 201999263, 0, -298000737, 0
>>>> 14443, 19, 500000000, 958909178, 1, 458909178, 458909178
>>>> 14443, 20, 500000000, 660591085, 1, 160591085, 160591085
>>>> 14443, 21, 500000000, 362025653, 0, -137974347, 0
>>>> 14443, 22, 500000000, 202565748, 0, -297434252, 0
>>>> 14443, 23, 500000000, 202456104, 0, -297543896, 0
>>>> 14443, 24, 500000000, 202528628, 0, -297471372, 0
>>>>
>>>> .....
>>>> [ 328638446] RELEASE 14443/14 on CPU 0 328638946.00ms
>>>> [ 328638446] RELEASE 14442/14 on CPU 1 328638946.00ms
>>>> [ 328638446] SWITCH_TO 14442/14 on CPU 0
>>>> [ 328638698] SWITCH_FROM 14442/14 on CPU 0
>>>> [ 328638698] COMPLETION 14442/14 on CPU 0
>>>> [ 328638704] SWITCH_TO 14443/14 on CPU 1
>>>> [ 328638905] SWITCH_FROM 14443/14 on CPU 1
>>>> [ 328638905] COMPLETION 14443/14 on CPU 1
>>>> [ 328638946] RELEASE 14442/15 on CPU 0 328639446.00ms
>>>> [ 328638946] RELEASE 14443/15 on CPU 1 328639446.00ms
>>>> [ 328638946] SWITCH_TO 14443/15 on CPU 0
>>>> [ 328639148] SWITCH_FROM 14443/15 on CPU 0
>>>> [ 328639148] COMPLETION 14443/15 on CPU 0
>>>> [ 328639446] RELEASE 14443/16 on CPU 0 328639946.00ms
>>>> [ 328639446] RELEASE 14442/16 on CPU 1 328639946.00ms
>>>
>>>Random observation: the next release of 14443 happens on CPU 0, whereas 14442 is still being released on CPU 1. This means CPU 1 must have set a timer for the release to occur. Usually, it shouldn’t do that unless it either (a) processed a completion or (b) prevented a budget overrun. Either event should have left a trace record (I think). Are you sure the trace is complete (i.e., no missing writes)?
>>>
>>
>> Yes, as far as I know, there were no reported failed writes at the completion of the tracing.
>>
>>>
>>>> [ 328639446] SWITCH_TO 14443/16 on CPU 0
>>>> [ 328639648] SWITCH_FROM 14443/16 on CPU 0
>>>> [ 328639648] COMPLETION 14443/16 on CPU 0
>>>> [ 328639703] SWITCH_TO 14442/15 on CPU 1
>>>> [ 328639946] RELEASE 14442/17 on CPU 1 328640446.00ms
>>>> [ 328639946] RELEASE 14443/17 on CPU 0 328640446.00ms
>>>> [ 328639946] SWITCH_TO 14443/17 on CPU 0
>>>> [ 328639955] SWITCH_FROM 14442/15 on CPU 1
>>>> [ 328639955] COMPLETION 14442/15 on CPU 1
>>>> [ 328639955] SWITCH_TO 14442/16 on CPU 1
>>>> [ 328640147] SWITCH_FROM 14443/17 on CPU 0
>>>> [ 328640147] COMPLETION 14443/17 on CPU 0
>>>> [ 328640206] SWITCH_FROM 14442/16 on CPU 1
>>>> [ 328640206] COMPLETION 14442/16 on CPU 1
>>>> [ 328640206] SWITCH_TO 14442/17 on CPU 1
>>>> .....
>>>>
>>>>
>>>> It is interesting to note that when both jobs 15 are released, the job for PID 14443
>>>> is allowed to run. However, for PID 14442, the job does not run until the next period!
>>>>
>>>> We have also observed that running the two tasks on a guest with 8 VCPUs leads to the
>>>> same deadline misses. But by artificially loading the system with 6 instances of "yes >
>>>> /dev/null", all jobs meet their deadlines.
>>>
>>>Do these instances of yes run as LITMUS^RT tasks or just as regular SCHED_OTHER tasks?
>>
>> The instances of yes run as regular tasks. I just launched them from the shell without any calls to liblitmus.
>>
>>>
>>>Also, can you provide matching output of the debug trace log (the TRACE() stream)?
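>>> The TRACE() stream can be captured by reading /dev/litmus/log while the
>>> experiment runs (assuming debug tracing is enabled in the kernel config),
>>> e.g.:
>>>
>>>   cat /dev/litmus/log > debug-log.txt &
>>>   # ... run the experiment ...
>>>   kill %1    # stop reading the debug log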
>>
>> I've attached two text documents containing the output from st_job_stats
>> followed by the trace logs you've requested. One is for the bare-metal case,
>> and one is for the VM. Sorry, we don't have the logs for that exact run
>> available, but this is from a run with the same parameters. The bare-metal
>> case is a bit cluttered because the current system has preemption tracing
>> enabled.
>>
>>>
>>>I’ve not used LITMUS^RT on Xen. I don’t mean to blame Xen, but it would make it easier for us if you could reproduce this behavior on bare metal.
>>
>> Again, thanks for your help and comments so far!
>>
>> To provide a little bit of context, we're virtualizing in this case because it
>> looks to be a very interesting area. The weirdness that we are observing is
>> that the above issues do not occur on bare metal. It is very odd because we
>> do not observe this on KVM either.
>
> How did you configure the KVM instance? I'm not sure whether KVM uses a
> real-time scheduler or not. If not, it will not be an apples-to-apples
> comparison. (I'm not trying to defend Xen; I'm just trying to figure out
> what is going wrong.)
>
>>>> The issue is as follows: when running two simple tasks, each with less than
>>>> 50% single-core utilization (Cost=200ms, Period=500ms), jobs still miss
>>>> their deadlines in certain cases:
>>>> - When we use only 2 VCPUs for the guest, all jobs meet their deadlines
>>>> - When we use 3 or more VCPUs, deadline misses are observed
>
> The weird thing is that allocating more resources to the VM makes a
> schedulable task set unschedulable. This looks like a scheduling
> anomaly.
>
> Is it possible that the release timer is armed on VCPU 1, but VCPU 1
> is scheduled out right before the timer fires? That would explain the
> release delay. In addition, if you dedicate a PCPU to each VCPU and do
> not allow that PCPU to run any other VCPUs, does the same weird
> situation still happen?
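> For the pinning, something along these lines should work (the guest domain
> name is a placeholder; adjust the PCPU numbers to your topology):
>
>   xl vcpu-pin Domain-0 all 0-7    # keep dom0's VCPUs on PCPUs 0-7
>   xl vcpu-pin <guest> 0 8         # guest VCPU 0 -> PCPU 8
>   xl vcpu-pin <guest> 1 9         # guest VCPU 1 -> PCPU 9
>   xl vcpu-pin <guest> 2 10        # guest VCPU 2 -> PCPU 10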
>
> Thanks,
>
> Meng
>
>
>
> --
>
>
> -----------
> Meng Xu
> PhD Student in Computer and Information Science
> University of Pennsylvania
--
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania