[LITMUS^RT] running rtspin and calculate deadline miss ratio? -- Sisu

Wed Apr 17 19:55:20 CEST 2013

A Python example of this process can be found here:
https://github.com/hermanjl/experiment-scripts/blob/master/parse/sched.py
Note that the script does not maintain a heap to reorder out-of-order
records; it just ignores them.

On Wed, Apr 17, 2013 at 1:45 PM, Glenn Elliott <gelliott at cs.unc.edu> wrote:

> Just for clarification about out-of-order records: records are in a
> partially sorted order.  You only need a sorted lookahead buffer of a few
> hundred records to be reasonably sure that your records are in order.
>
>
> On Apr 17, 2013, at 1:43 PM, Glenn Elliott <gelliott at cs.unc.edu> wrote:
>
> Hi Sisu,
>
> I believe you should compute deadline misses by analyzing sched_trace
> logs.  There is a character device /dev/litmus/sched_trace# for each CPU.  See
> "Recording Scheduling Traces" here:
> https://wiki.litmus-rt.org/litmus/Tracing
>
> You will have to write your own tool to combine the binary recordings for
> each CPU (each is timestamped).  The easiest way to do this is:
> 1) mmap() each sched_trace file into your analysis application.
> 2) Treat each mmap()'ed file as a giant C-array of "struct
> st_event_record".  You may want to #include sched_trace.h from
> litmus-rt/include/litmus to get the struct definitions.
> 3) Records can appear out of order, so for each stream, take the first
> 50-or-so records and put them into a single timestamp-ordered min-heap.
> 4) Process each record one at a time by popping the first record on the
> min-heap.  Keep the min-heap full by moving more records from the arrays.
>
> (An alternative approach is to qsort() each array (or just the first
> X-elements of the array), and process the record with the smallest
> timestamp at the heads of the record streams.)
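
Steps 3 and 4 above can be sketched in Python roughly as follows. This is
only a sketch under assumptions: each per-CPU stream is taken to be already
decoded into (timestamp, payload) tuples, and the name merge_streams and the
tuple layout are made up for illustration. They are not part of LITMUS^RT or
of the script linked at the top of this message.

```python
import heapq
from itertools import count


def merge_streams(streams, lookahead=50):
    """Yield (timestamp, payload) records from several partially sorted
    streams in timestamp order, using one shared lookahead min-heap.

    `streams` is a list of iterables of (timestamp, payload) tuples
    (hypothetical layout; one iterable per CPU trace).
    """
    iters = [iter(s) for s in streams]
    heap = []
    seq = count()  # tie-breaker so payloads are never compared

    def refill(idx, n=1):
        # Move up to n more records from stream idx into the heap.
        for _ in range(n):
            rec = next(iters[idx], None)
            if rec is None:
                return  # stream exhausted
            ts, payload = rec
            heapq.heappush(heap, (ts, next(seq), idx, payload))

    # Step 3: preload the first `lookahead` records of every stream.
    for idx in range(len(iters)):
        refill(idx, lookahead)

    # Step 4: pop the earliest record, then top up from the same stream.
    while heap:
        ts, _, idx, payload = heapq.heappop(heap)
        yield ts, payload
        refill(idx)
```

The output is only fully sorted if no record is displaced by more than the
lookahead window; a larger lookahead trades memory for more confidence.
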
>
> Detecting deadline misses is then pretty straightforward once you've got
> this set up.  A job can be uniquely identified by its TID and job number.
> Every job has one release record, which includes a deadline.  Correlate
> this release record (by matching <TID, Job#>) to a unique completion
> record.  A deadline miss has occurred if the completion time is later than
> the deadline.
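
The matching described above could look like this in Python. It is a sketch,
not the actual sched_trace record format: the dict keys ("type", "tid",
"job", "deadline", "when") and the function name are hypothetical, and the
records are assumed to be decoded and already in timestamp order.

```python
def deadline_miss_ratio(records):
    """Return (missed, completed, ratio) for a stream of decoded records.

    Each record is a dict with hypothetical keys:
      release:    {"type": "release", "tid", "job", "deadline"}
      completion: {"type": "completion", "tid", "job", "when"}
    Times are assumed to share one unit (e.g. nanoseconds).
    """
    deadlines = {}  # (tid, job) -> absolute deadline from the release record
    completed = 0
    missed = 0
    for rec in records:
        key = (rec["tid"], rec["job"])
        if rec["type"] == "release":
            deadlines[key] = rec["deadline"]
        elif rec["type"] == "completion":
            deadline = deadlines.pop(key, None)
            if deadline is None:
                continue  # no matching release seen; skip this job
            completed += 1
            if rec["when"] > deadline:
                missed += 1  # completed after its deadline
    ratio = missed / completed if completed else 0.0
    return missed, completed, ratio
```

Jobs that are released but never complete within the trace are not counted
here; whether they should count as misses depends on the experiment.
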
>
> Some tips:
> (1) The user processes' notion of a job can differ from the kernel's
> notion of a job if you use budget enforcement.  Accounting for this
> requires more complex processing of the sched_trace data.  I have some
> patches in the works that make this correlation easier to do, but they're
> not ready for prime-time.
> (2) If your system is severely overutilized, the regular tasks that
> read the sched_trace buffers and write them to disk can be starved.  This
> can cause the sched_trace ring buffers to overflow, resulting in the loss
> of tracing data.  You can control the size of the sched_trace buffers at
> Litmus compilation time (see CONFIG_SCHED_TASK_TRACE_SHIFT).  However, the
> kernel may refuse to compile if CONFIG_SCHED_TASK_TRACE_SHIFT and NR_CPUS
> lead to too much sched_trace buffer space---the binary kernel image becomes
> too big.  You have to hack other aspects of the kernel to get larger
> buffers, but it is tricky.  Let me know if you run into this problem.
> (3) (This holds for ft_tracing as well.)  I find it easier to dump logs to
> shared memory (RAM disk) during tracing because this causes fewer
> overheads.  I then copy the trace data out of RAM and write it to disk
> after experimentation is over.  Ubuntu has a RAM disk already set up for
> you at /dev/shm.  Just dump data to /dev/shm and read back the files later.
>  Of course, your system has to have sufficient RAM to make this work.
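
For tip (3), the copy-out step after the experiment might be scripted along
these lines. This is a sketch: the directory layout, the "*.bin" file
pattern, and the function name are assumptions made for illustration, not
conventions of the LITMUS^RT tools.

```python
import shutil
from pathlib import Path


def archive_traces(ram_dir, disk_dir, pattern="*.bin"):
    """Move trace files from a RAM-backed directory (e.g. under /dev/shm)
    to persistent storage, and return the list of new paths."""
    dest = Path(disk_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(Path(ram_dir).glob(pattern)):
        target = dest / src.name
        # shutil.move copies across filesystems and then removes the
        # source, which frees the RAM held by the file.
        shutil.move(str(src), str(target))
        moved.append(target)
    return moved
```

Run it only once tracing has stopped, so the disk writes do not add
overhead to the measured workload.
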
>
> -Glenn
>
>
> On Apr 17, 2013, at 12:47 PM, Sisu Xi <xisisu at gmail.com> wrote:
>
> Hi, all:
>
> Is there any tutorial on running multiple rt tasks (say, rtspin) for some
> time and calculating the deadline miss ratio, like the ones you presented
> in the paper?
>
> I ran a single rtspin task with a WCET of 5 and a period of 10 for 100
> seconds. It produced no output.
>
> I can trace the task execution by reading /dev/litmus/log, which shows:
>
> 208107 P2: rt: adding rtspin/1357 (5000000, 10000000, 10000000)
> rel=1088752672478 to ready queue at 1088754177964
> 208108 P2: check_for_preemptions: attempting to link task 1357 to 1
> 208110 P2: (rtspin/1357:2350) blocks:0 out_of_time:0 np:0 sleep:1
> preempt:0 state:0 sig:0
> 208111 P2: (rtspin/1357:2350) job_completion().
> 208112 P2: rt: adding rtspin/1357 (5000000, 10000000, 10000000)
> rel=1088762672478 to ready queue at 1088764173889
> 208113 P2: check_for_preemptions: attempting to link task 1357 to 1
> 208115 P2: (rtspin/1357:2351) blocks:0 out_of_time:0 np:0 sleep:1
> preempt:0 state:0 sig:0
> 208116 P2: (rtspin/1357:2351) job_completion().
>
> I assume 1357 is the PID, and the number following it (2350, 2351, etc.)
> is the job ID. However, I don't know the exact job release time and job
> completion time, so I don't know whether this job missed its deadline or
> not.
>
> How do you guys trace the deadline miss ratio?
>
> Thanks very much!
>
> Sisu
>
>
>
> --
> Sisu Xi, PhD Candidate
>
> http://www.cse.wustl.edu/~xis/
> Department of Computer Science and Engineering
> Campus Box 1045
> Washington University in St. Louis
> One Brookings Drive
> St. Louis, MO 63130
> _______________________________________________
> litmus-dev mailing list
> litmus-dev at lists.litmus-rt.org
> https://lists.litmus-rt.org/listinfo/litmus-dev
>
>

-- 
Jonathan Herman
Department of Computer Science at UNC Chapel Hill