[LITMUS^RT] prop/sched-domains
Glenn Elliott
gelliott at cs.unc.edu
Tue Feb 25 23:37:43 CET 2014
On Feb 25, 2014, at 3:31 AM, Björn Brandenburg <bbb at mpi-sws.org> wrote:
> Hi Glenn,
>
> I’m seeing warnings that seem to be related to the sched-domain patch. This happens when I activate the P-FP plugin and requires LOCKDEP to be active.
>
> litmus-rt login: Switching to LITMUS^RT plugin P-FP.
> ------------[ cut here ]------------
> WARNING: at kernel/lockdep.c:2740 lockdep_trace_alloc+0xec/0x100()
> DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
> CPU: 0 PID: 7 Comm: migration/0 Not tainted 3.10.5-litmus2013.1+ #1122
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> ffffffff81682f05 ffff88007a125aa8 ffffffff8150e94e ffff88007a125ae8
> ffffffff81033f76 ffff88007a125ad8 0000000000000046 ffff88007a0db210
> 00000000000080d0 00000000000080d0 00000000000000b2 ffff88007a125b48
> Call Trace:
> [<ffffffff8150e94e>] dump_stack+0x19/0x1b
> [<ffffffff81033f76>] warn_slowpath_common+0x66/0x90
> [<ffffffff81034041>] warn_slowpath_fmt+0x41/0x50
> [<ffffffff8107f56c>] lockdep_trace_alloc+0xec/0x100
> [<ffffffff810fa82c>] __kmalloc+0x4c/0x1d0
> [<ffffffff81155ae2>] ? __proc_create+0xb2/0x120
> [<ffffffff81155ae2>] __proc_create+0xb2/0x120
> [<ffffffff811565bc>] proc_create_data+0x8c/0xc0
> [<ffffffff81265e8a>] activate_domain_proc+0xca/0x150
> [<ffffffff812646a1>] do_plugin_switch+0x61/0xc0
> [<ffffffff8108c4a2>] stop_machine_cpu_stop+0xb2/0x150
> [<ffffffff8108c3f0>] ? queue_stop_cpus_work+0x160/0x160
> [<ffffffff8108c00d>] cpu_stopper_thread+0x8d/0x170
> [<ffffffff81512625>] ? _raw_spin_unlock_irqrestore+0x65/0x80
> [<ffffffff81060aae>] smpboot_thread_fn+0x1ce/0x310
> [<ffffffff810608e0>] ? smpboot_register_percpu_thread+0xe0/0xe0
> [<ffffffff810583a6>] kthread+0xd6/0xe0
> [<ffffffff81066175>] ? schedule_tail+0xd5/0x210
> [<ffffffff810582d0>] ? __init_kthread_worker+0x70/0x70
> [<ffffffff8151346c>] ret_from_fork+0x7c/0xb0
> [<ffffffff810582d0>] ? __init_kthread_worker+0x70/0x70
> ---[ end trace 2f7c7abc2a2e4492 ]---
>
> It seems the code is somehow calling kmalloc(), which can sleep, with interrupts disabled. I don’t think this is related to rebasing the patch. Can you please have a look?
>
> Thanks,
> Björn
Arg! I used GFP_ATOMIC in all of my code, not realizing Linux’s proc_create_data() would call plain kmalloc(). Here are two possible solutions:
1) Use a kworker to set up the proc files after successful activation.
2) Use a kworker to perform plugin activation.
I have run into a similar problem in my GPUSync branch. There, I need plugin deactivation to wait/join on interrupt handling threads that are signaled to exit/shutdown. In GPUSync, an echo to /proc/litmus/active_plugin triggers an event on a kworker, which invokes the real activation routine. Problem solved. However, plugin activation becomes asynchronous: you have to wait for some epsilon after activating a plugin to know if it was successful. That’s not very nice. I don’t know how else to approach the problem, though.
Back to my sched-domains patch: normally I would advocate solution #1, since plugin activation remains synchronous. However, since I’ve run into this problem before, I am willing to entertain solution #2. I think we can defend against race conditions by serializing all plugin activations/deactivations on the same worker thread (perhaps we set up a reserved kworker just for litmus).
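To make the serialization idea concrete, here is a rough user-space model using pthreads. It is not LITMUS^RT or kernel code. It only illustrates that funnelling every switch request through one dedicated worker makes the switches run one at a time, in submission order, and in a context that is allowed to block. All names (request_switch, do_switch, stop_worker, run_demo, the fixed-size queue) are hypothetical and chosen for illustration only.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_REQS 16   /* sketch only: total requests accepted, not a ring buffer */

static const char *queue[MAX_REQS];   /* pending plugin names (FIFO) */
static int head, tail;                /* protected by lock */
static int shutting_down;             /* protected by lock */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

static const char *active_plugin = "Linux";
static char history[256];             /* order in which switches actually ran */

/* Runs only on the worker thread, so it could block or allocate freely. */
static void do_switch(const char *plugin)
{
    size_t used = strlen(history);

    active_plugin = plugin;
    snprintf(history + used, sizeof(history) - used, "%s;", plugin);
}

/* The single worker: takes requests one at a time, in FIFO order. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        const char *plugin;

        pthread_mutex_lock(&lock);
        while (head == tail && !shutting_down)
            pthread_cond_wait(&nonempty, &lock);
        if (head == tail) {           /* queue drained and shutdown requested */
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        plugin = queue[head++];
        pthread_mutex_unlock(&lock);

        do_switch(plugin);            /* performed outside the lock */
    }
}

/* Asynchronous request: returns 0 once queued, -1 if the queue is full.
 * The switch itself happens later, on the worker thread. */
static int request_switch(const char *plugin)
{
    int ret = -1;

    pthread_mutex_lock(&lock);
    if (tail < MAX_REQS) {
        queue[tail++] = plugin;
        pthread_cond_signal(&nonempty);
        ret = 0;
    }
    pthread_mutex_unlock(&lock);
    return ret;
}

/* Let the worker finish all queued requests, then wait for it to exit. */
static void stop_worker(pthread_t t)
{
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
    pthread_join(t, NULL);
}

/* Usage: queue three switches, drain them, and return the order they ran in. */
static const char *run_demo(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return NULL;
    request_switch("P-FP");
    request_switch("Linux");
    request_switch("GSN-EDF");
    stop_worker(t);
    return history;
}
```

Note that request_switch() returns as soon as the request is queued, which is exactly the asynchrony drawback described above: the caller learns the outcome only after the worker has processed the request.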
Questions for the LITMUS^RT community: Does anyone have an opinion one way or the other? Has atomic plugin activation caused you problems in the past?
Thanks,
Glenn