[LITMUS^RT] Budget Consumed By Jobs In Ready Queue

Jeremy Erickson jerickso at cs.unc.edu
Thu Aug 23 21:37:50 CEST 2012


I think my attachments are getting scrubbed.  Script:

#!/bin/bash

export PATH="/root/feather-trace-tools/:$PATH"
export LIBLITMUSDIR="/root/liblitmus"
$LIBLITMUSDIR/setsched GSN-EDF
set -e

# lets you see TRACE() output
cat /dev/litmus/log > /root/traces/litmus_log &

# you need to put your plugin name here
#/home/jerickso/litmus/liblitmus2010/setsched EDF-SS

# start a bunch of RTSPIN instances
$LIBLITMUSDIR/rtspin -w 5 10 100 &
sleep 1
$LIBLITMUSDIR/rtspin -w 5 10 100 &
sleep 1
$LIBLITMUSDIR/rtspin -w 18 20 100 &
sleep 1

# release the task system
(sleep 3 && $LIBLITMUSDIR/release_ts) &

# start tracing, assumes your trace nodes are specified correctly
# the in st_trace wrapper script
st_trace test

# at this point, st_trace is running until you control+D it

# at this point, st_trace has ended

# put the location where you want your traces moved to here
mv *.bin /root/traces/

# kill the other things we've started in the background
killall cat
killall rtspin



Stack trace:

[   98.878799] ------------[ cut here ]------------
[   98.879011] kernel BUG at kernel/../litmus/sched_litmus.c:12!
[   98.879011] invalid opcode: 0000 [#1] PREEMPT SMP
[   98.879011] CPU 0
[   98.879011] Pid: 1399, comm: rtspin Not tainted 3.0.0-litmuslitmus-3.0+
#46
[   98.879011] RIP: 0010:[<ffffffff81024bd8>]  [<ffffffff81024bd8>]
update_time_litmus.isra.118+0x48/0x50
[   98.879011] RSP: 0000:ffff880074b3bdf8  EFLAGS: 00010013
[   98.879011] RAX: 0000001705a180d5 RBX: ffff880074be3dc0 RCX:
000000170596b8e9
[   98.879011] RDX: ffff88007917ef60 RSI: ffff880074be3dc0 RDI:
ffff88007cc17430
[   98.879011] RBP: ffff880074b3bdf8 R08: ffffffffffcc000d R09:
0000000000000001
[   98.879011] R10: 0000000000000000 R11: 000000000000000d R12:
ffff880074be4040
[   98.879011] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[   98.879011] FS:  00007fb003f14720(0000) GS:ffff88007cc00000(0000)
knlGS:0000000000000000
[   98.879011] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   98.879011] CR2: 00007f58ce648000 CR3: 000000007918e000 CR4:
00000000000006b0
[   98.879011] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[   98.879011] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[   98.879011] Process rtspin (pid: 1399, threadinfo ffff880074b3a000, task
ffff880074be3dc0)
[   98.879011] Stack:
[   98.879011]  ffff880074b3be10 ffffffff81024bf4 ffff88007cc16c00
ffff880074b3bf70
[   98.879011]  ffffffff8142a16d 0000000000000000 ffff880000000004
ffffffff815642b7
[   98.879011]  ffff880000000004 ffffffff815642b7 ffff880074be4340
ffffffff8142d98d
[   98.879011] Call Trace:
[   98.879011]  [<ffffffff81024bf4>] pre_schedule_litmus+0x14/0x30
[   98.879011]  [<ffffffff8142a16d>] schedule+0x26d/0xcf0
[   98.879011]  [<ffffffff8142d98d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
[   98.879011]  [<ffffffff81099bee>] ? might_fault+0x4e/0xa0
[   98.879011]  [<ffffffff8142e501>] ? retint_careful+0xb/0x2e
[   98.879011]  [<ffffffff8105eecd>] ? trace_hardirqs_on_caller+0x13d/0x180
[   98.879011]  [<ffffffff8122810e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   98.879011]  [<ffffffff8142e508>] retint_careful+0x12/0x2e
[   98.879011] Code: ff 75 28 48 29 c8 ba 00 00 00 00 5d 48 0f 48 c2 48 01
86 d8 0e 00 00 48 01 86 90 00 00 00 48 8b 07 48 89 86 88 00 00 00 c3 0f 0b
<0f> 0b 66 0f 1f 44 00 00 55 48 81 c7 30 08 00 00 48 89 e5 53 48
[   98.879011] RIP  [<ffffffff81024bd8>]
update_time_litmus.isra.118+0x48/0x50
[   98.879011]  RSP <ffff880074b3bdf8>
[   98.879011] ---[ end trace 83cb1420d7776445 ]---
[   98.879011] note: rtspin[1399] exited with preempt_count 2
[   98.879011] BUG: sleeping function called from invalid context at
kernel/rwsem.c:21
[   98.879011] in_atomic(): 1, irqs_disabled(): 1, pid: 1399, name: rtspin
[   98.879011] INFO: lockdep is turned off.
[   98.879011] irq event stamp: 4196
[   98.879011] hardirqs last  enabled at (4195): [<ffffffff8106eaa2>]
rcu_note_context_switch+0xa2/0x230
[   98.879011] hardirqs last disabled at (4196): [<ffffffff8142d862>]
_raw_spin_lock_irq+0x12/0x50
[   98.879011] softirqs last  enabled at (4072): [<ffffffff81035940>]
__do_softirq+0x100/0x150
[   98.879011] softirqs last disabled at (4055): [<ffffffff8142facc>]
call_softirq+0x1c/0x26
[   98.879011] Pid: 1399, comm: rtspin Tainted: G      D
3.0.0-litmuslitmus-3.0+ #46
[   98.879011] Call Trace:
[   98.879011]  [<ffffffff8105c720>] ? print_irqtrace_events+0xd0/0xe0
[   98.879011]  [<ffffffff8102ca3d>] __might_sleep+0xed/0x120
[   98.879011]  [<ffffffff8142cccf>] down_read+0x1f/0x60
[   98.879011]  [<ffffffff81067ec6>] acct_collect+0x46/0x1b0
[   98.879011]  [<ffffffff81033a40>] do_exit+0x5d0/0x7b0
[   98.879011]  [<ffffffff810317d1>] ? kmsg_dump+0xf1/0x120
[   98.879011]  [<ffffffff810050b6>] oops_end+0x76/0xb0
[   98.879011]  [<ffffffff81005233>] die+0x53/0x80
[   98.879011]  [<ffffffff81001fd4>] do_trap+0xc4/0x170
[   98.879011]  [<ffffffff810022b0>] do_invalid_op+0x90/0xb0
[   98.879011]  [<ffffffff81024bd8>] ? update_time_litmus.isra.118+0x48/0x50
[   98.879011]  [<ffffffff8122814d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[   98.879011]  [<ffffffff8142e4f4>] ? restore_args+0x30/0x30
[   98.879011]  [<ffffffff8142f855>] invalid_op+0x15/0x20
[   98.879011]  [<ffffffff81024bd8>] ? update_time_litmus.isra.118+0x48/0x50
[   98.879011]  [<ffffffff8142a083>] ? schedule+0x183/0xcf0
[   98.879011]  [<ffffffff81024bf4>] pre_schedule_litmus+0x14/0x30
[   98.879011]  [<ffffffff8142a16d>] schedule+0x26d/0xcf0
[   98.879011]  [<ffffffff8142d98d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
[   98.879011]  [<ffffffff81099bee>] ? might_fault+0x4e/0xa0
[   98.879011]  [<ffffffff8142e501>] ? retint_careful+0xb/0x2e
[   98.879011]  [<ffffffff8105eecd>] ? trace_hardirqs_on_caller+0x13d/0x180
[   98.879011]  [<ffffffff8122810e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   98.879011]  [<ffffffff8142e508>] retint_careful+0x12/0x2e

-Jeremy Erickson

On Thu, Aug 23, 2012 at 3:31 PM, Jeremy Erickson <jerickso at cs.unc.edu>wrote:

> In a plugin I was debugging, I discovered problems caused by jobs that had
> not depleted their budget when put into the ready queue, but that were out
> of budget when taken out.  I decided to test what was happening by adding a
> "BUG_ON(is_queued(p));" statement to the update_time_litmus function in
> sched_litmus.c, which did indeed trigger for my plugin.  I think that's the
> source of my problems, although I'm not sure why it's happening.
>
> I discovered, however, that the same BUG_ON statement is triggered in
> 2012.2 with the standard GSN-EDF plugin.  That seems incorrect to me - a
> job should only be in the release or ready queue if it is not executing, in
> which case it should not be consuming any budget.  Is that a bug in
> LITMUS^RT?  I used the attached test script and got the attached stack
> trace pretty consistently on KVM.
>
> -Jeremy Erickson
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.litmus-rt.org/pipermail/litmus-dev/attachments/20120823/2bc0e4b8/attachment.html>


More information about the litmus-dev mailing list