[LITMUS^RT] Help with Understanding Scheduling Results
Geoffrey Tran
gtran at isi.edu
Thu Apr 2 21:16:33 CEST 2015
Hi Björn,
Thank you again for your detailed reply!
For the logs, I have disabled CONFIG_PREEMPT_STATE_TRACE since it resulted in very large logs. Did you mean that I should turn this option back on
before collecting logs? I just wanted to clear that up before I go and recompile the kernel to disable the affinity code.
One of my colleagues did test with the affinity code disabled, and while it did improve the scheduling results for this two-task case, it
did not solve the issue with a larger task set. However, that image is no longer available and no logs were stored, so I will
redo the experiment after receiving your comments on the log settings.
The in-depth explanation of the log is again helpful. I am not sure whether Xen always delivers all IPIs.
When you say to enable SCHED_STATE tracing, do you mean the tracing option mentioned above? Everything else in the LITMUS^RT->Tracing
menu of the kernel config is enabled.
Thanks,
Geoffrey
----- Original Message -----
From: "Björn Brandenburg" <bbb at mpi-sws.org>
To: "Geoffrey Tran" <gtran at isi.edu>, "Glenn Elliott" <gelliott at cs.unc.edu>
Cc: "Meng Xu" <xumengpanda at gmail.com>, litmus-dev at lists.litmus-rt.org, "Mikyung Kang" <mkkang at isi.edu>, "Stephen Crago" <crago at isi.edu>, "John Walters" <jwalters at isi.edu>
Sent: Thursday, April 2, 2015 9:48:04 AM
Subject: Re: [LITMUS^RT] Help with Understanding Scheduling Results
> On 01 Apr 2015, at 04:22, Geoffrey Tran <gtran at isi.edu> wrote:
>
> Hi Meng,
>
> Here is the information I received about the KVM setup:
> <each VM>
> 8 cores
> 100% budget
> raw qemu image
> vda virtio
> with the following kernel: Linux localhost 3.4.43-WR5.0.1.10_standard #1 SMP PREEMPT Sat Dec 21 16:28:51 EST 2013 x86_64 GNU/Linux
>
> Also, we observe similar anomalies with the credit scheduler.
>
>
>
> Hello again Björn,
>
> Would you happen to have any more comments or suggestions we could try?
>
> Thank you very much,
> Geoffrey
Hi Geoffrey,
(CC'ing Glenn, who knows the affinity code best)
Sorry for the long pause. I finally had some time to look at the traces.
First, it looks like your bm_log.txt trace contains ONLY messages related to SCHED_STATE tracing, which is not very useful.
Second, could you please retrace what happens under Xen with CONFIG_SCHED_CPU_AFFINITY disabled? Something goes wrong around the affinity check. I'm not quite sure what's happening, but let's try disabling that code to narrow down the issue.
The first job to miss a deadline is myapp/1535:15, so let's look at that.
The predecessor myapp/1535:14 completes before its deadline; the task is hence added to the release queue.
278 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:442]: (myapp/1535:14) blocks:0 out_of_time:0 np:0 sleep:1 preempt:0 state:0 sig:0
279 P1 [job_completion at litmus/sched_gsn_edf.c:364]: (myapp/1535:14) job_completion().
280 P1 [__add_release at litmus/rt_domain.c:348]: (myapp/1535:15) add_release(), rel=620115000000
281 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:490]: (myapp/1535:15) scheduled_on = NO_CPU
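In case it helps to see the mechanism in one place, here is a toy, user-space rendering of that completion path. The types and helper logic are made up (this is not the actual sched_gsn_edf.c/jobs.c code); only the rel= value comes from the trace, while the 500 ms period and the completion time are assumptions for illustration.

/* Toy illustration of why job 15 lands in the release queue at line 280:
 * job 14 completes early, the job parameters are advanced by one period,
 * and since the new release time still lies in the future, the task is
 * parked in the release queue instead of the ready queue. */
#include <stdio.h>

typedef unsigned long long ns_t;

struct toy_job {
    const char *task;
    int no;
    ns_t release;
    ns_t period;
};

static void complete_and_advance(struct toy_job *j, ns_t now)
{
    printf("(%s:%d) job_completion().\n", j->task, j->no);

    /* advance to the next job: new job number and release time */
    j->no++;
    j->release += j->period;

    if (j->release > now)
        /* not eligible yet -> release queue, to be released by a timer */
        printf("(%s:%d) add_release(), rel=%llu\n",
               j->task, j->no, j->release);
    else
        /* already eligible -> straight into the ready queue */
        printf("(%s:%d) requeued as ready\n", j->task, j->no);
}

int main(void)
{
    struct toy_job j = { "myapp/1535", 14,
                         620115000000ULL - 500000000ULL,  /* release of job 14 */
                         500000000ULL };                  /* assumed 500 ms period */
    complete_and_advance(&j, 619700000000ULL);            /* assumed completion time */
    return 0;
}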
Some time later, the release interrupt is handled by CPU 1. It links myapp/1536:15 to itself...
286 P1 [check_for_preemptions at litmus/sched_gsn_edf.c:302]: (myapp/1536:15) linking to local CPU 1 to avoid IPI
... and then initially tries to link myapp/1535:15 to CPU 4, which, however, we never see in the trace.
287 P1 [check_for_preemptions at litmus/sched_gsn_edf.c:314]: check_for_preemptions: attempting to link task 1535 to 4
Then the affinity code kicks in and determines that CPU 0 is much closer to CPU 1, which scheduled the task last.
288 P1 [gsnedf_get_nearest_available_cpu at litmus/sched_gsn_edf.c:275]: P0 is closest available CPU to P1
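For anyone following along who hasn't read the affinity code, the idea is roughly the following. This is a self-contained toy with an invented topology and invented availability data, not gsnedf_get_nearest_available_cpu() itself: among the CPUs that could take the job, pick the one with the smallest cache distance to the CPU that scheduled the task last.

/* Toy stand-in for the nearest-available-CPU heuristic; topology and
 * availability below are made up for illustration only. */
#include <stdio.h>

#define NCPUS  8
#define NO_CPU (-1)

/* 0 = same core, 1 = shared L2 (sibling pair), 2 = shared L3, 3 = off-chip */
static int cpu_distance(int a, int b)
{
    if (a == b)         return 0;
    if (a / 2 == b / 2) return 1;
    if (a / 4 == b / 4) return 2;
    return 3;
}

/* which CPUs are idle or running lower-priority work (invented) */
static const int available[NCPUS] = { 1, 0, 0, 0, 1, 0, 0, 0 };

static int nearest_available_cpu(int last_cpu)
{
    int best = NO_CPU, best_dist = 4;

    for (int cpu = 0; cpu < NCPUS; cpu++) {
        int d = cpu_distance(last_cpu, cpu);
        if (available[cpu] && d < best_dist) {
            best = cpu;
            best_dist = d;
        }
    }
    return best;
}

int main(void)
{
    /* the task last ran on P1; with P0 and P4 free, P0 wins */
    printf("P%d is closest available CPU to P1\n", nearest_available_cpu(1));
    return 0;
}

With both P0 and P4 idle in this toy setup, P0 wins because it shares more of the cache hierarchy with P1, which matches the message at line 288.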
So far so good. Now myapp/1535:15 gets linked to P0. However, we never hear from P0 in the trace until much later.
289 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:445]: (swapper/1/0:0) will be preempted by myapp/1536
290 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:485]: (myapp/1536:15) scheduled_on = P1
291 P1 [litmus_schedule at kernel/sched/litmus.c:51]: (myapp/1536:15) migrate from 0
292 P1 [litmus_schedule at kernel/sched/litmus.c:65]: (myapp/1536:15) stack_in_use=-1
293 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:442]: (myapp/1536:15) blocks:0 out_of_time:0 np:0 sleep:1 preempt:0 state:0 sig:0
294 P1 [job_completion at litmus/sched_gsn_edf.c:364]: (myapp/1536:15) job_completion().
295 P1 [__add_release at litmus/rt_domain.c:348]: (myapp/1536:16) add_release(), rel=620615000000
296 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:490]: (myapp/1536:16) scheduled_on = NO_CPU
297 P1 [check_for_preemptions at litmus/sched_gsn_edf.c:302]: (myapp/1536:16) linking to local CPU 1 to avoid IPI
298 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:445]: (swapper/1/0:0) will be preempted by myapp/1536
299 P1 [gsnedf_schedule at litmus/sched_gsn_edf.c:485]: (myapp/1536:16) scheduled_on = P1
300 P0 [gsnedf_schedule at litmus/sched_gsn_edf.c:445]: (swapper/0/0:0) will be preempted by myapp/1535
301 P0 [gsnedf_schedule at litmus/sched_gsn_edf.c:485]: (myapp/1535:15) scheduled_on = P0
At this point, the *next* job of myapp/1536 has already been released; that is, P0 "was out to lunch" for an entire period, and so naturally myapp/1535 has a couple of tardy jobs while it catches up.
The more I think about it, the less I believe this has anything to do with the affinity code (but try disabling it anyway). Rather, this could be explained by a "lost" IPI, in which case P0 simply idled until it received an interrupt for some other reason. Are you sure Xen always delivers all IPIs?
By the way, this would also explain why adding 'yes' background tasks helps: they prevent the CPUs from going to sleep and frequently cause I/O, which triggers the scheduler, so a lost IPI wouldn't be noticed.
(Alternatively, it could also be the case that P0 was not scheduled by the hypervisor for an extended amount of time, but I'm sure you've checked this already.)
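To make the suspected failure mode concrete, here is a tiny stand-alone model of the lost-IPI scenario (nothing to do with real Xen or LITMUS^RT internals; the 500 ms gap is inferred from the rel= values above): an idle CPU only re-runs its scheduler when some interrupt actually reaches it, so if the one reschedule IPI is dropped, P0 does nothing until the next event it does receive.

/* Toy model: the first event (the reschedule IPI sent when job 15 is
 * linked to P0) is dropped, so P0 keeps idling until the next interrupt
 * it does receive -- assumed here to be the release interrupt one
 * period (500 ms) later. */
#include <stdio.h>

struct event {
    long t_ms;          /* time after the release of job 15 */
    const char *what;
    int delivered;      /* did the interrupt actually reach P0? */
};

int main(void)
{
    const struct event events[] = {
        {   0, "reschedule IPI from P1 (job 15 linked to P0)", 0 },
        { 500, "release interrupt for job 16",                 1 },
    };

    for (unsigned i = 0; i < sizeof events / sizeof events[0]; i++) {
        if (!events[i].delivered) {
            printf("t=%4ld ms: %s -- lost, P0 keeps idling\n",
                   events[i].t_ms, events[i].what);
            continue;
        }
        printf("t=%4ld ms: %s -- P0 finally reschedules and picks up "
               "job 15, %ld ms after its release\n",
               events[i].t_ms, events[i].what, events[i].t_ms);
        break;
    }
    return 0;
}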
To further investigate this issue, it would help to enable SCHED_STATE tracing in the kernel running under Xen. I'm happy to take another look once you've collected further traces.
Thanks,
Björn