[LITMUS^RT] Missing st_trace records

Wed Nov 5 22:59:52 CET 2014

Thanks, Björn. After switching from rtspin to rt_launch, I no longer see
any missing records, without changing anything else.

I have 3 simple questions about the st_job_stats data. Any comments are
welcome!

*** Example: 8*(Period, WCET)=8*(200,180)ms on 8 Cores (both bare-metal and
VM cases) [8 "same" tasks using rt_launch]

Using st_job_stats, I could see records with the following columns:
[Task, Job, Period, Response, DL Miss?, Lateness, Tardiness]

(1) Some files report the correct period (200ms), but others report a
period of 0, as shown below. Does this mean that PID#13162 is not
schedulable and only PID#13166 is? The Lateness and DL Miss? columns of
PID#13162, however, show no deadline misses.

# task NAME=<unknown> PID=13162 COST=0 PERIOD=0 CPU=-1
 13162,     2,          0,  180031469,        0,  -19968531,          0
 13162,     3,          0,  180026058,        0,  -19973942,          0
 13162,     4,          0,  180029476,        0,  -19970524,          0
 13162,     5,          0,  180027542,        0,  -19972458,          0
....

# task NAME=rt_launch PID=13166 COST=180000000 PERIOD=200000000 CPU=0
 13166,     2,  200000000,  180019319,        0,  -19980681,          0
 13166,     3,  200000000,  180022003,        0,  -19977997,          0
 13166,     4,  200000000,  180022586,        0,  -19977414,          0
 13166,     5,  200000000,  180021609,        0,  -19978391,          0
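
One way to cross-check the PERIOD=0 rows is to recover the deadline each
row was judged against. The sketch below assumes that st_job_stats computes
Lateness as Response minus the relative deadline (an assumption, not
something confirmed in this thread); implied_deadline is a hypothetical
helper name. For the rows above, both PIDs yield 200000000 ns, so the
lateness values of PID#13162 are consistent with a 200ms deadline even
though its Period column is 0.

```python
# Rows copied from the st_job_stats output above.
# Columns: Task, Job, Period, Response, DL Miss?, Lateness, Tardiness (ns)
ROWS = """\
13162, 2, 0, 180031469, 0, -19968531, 0
13162, 3, 0, 180026058, 0, -19973942, 0
13162, 4, 0, 180029476, 0, -19970524, 0
13162, 5, 0, 180027542, 0, -19972458, 0
13166, 2, 200000000, 180019319, 0, -19980681, 0
13166, 3, 200000000, 180022003, 0, -19977997, 0
13166, 4, 200000000, 180022586, 0, -19977414, 0
13166, 5, 200000000, 180021609, 0, -19978391, 0
"""


def implied_deadline(row):
    """Return (pid, response - lateness).

    Assumption: lateness = response - relative deadline.
    """
    fields = [int(f) for f in row.split(",")]
    pid, _job, _period, response, _miss, lateness, _tardiness = fields
    return pid, response - lateness


# Collect the distinct implied deadlines per PID.
deadlines = {}
for line in ROWS.splitlines():
    if not line.strip() or line.lstrip().startswith("#"):
        continue
    pid, deadline = implied_deadline(line)
    deadlines.setdefault(pid, set()).add(deadline)

for pid, values in sorted(deadlines.items()):
    print(pid, sorted(values))
```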

(2) When I counted the lines (the total number of jobs) for each PID, in
some runs every task has exactly the same number of jobs, but in others the
number differs slightly among the 8 tasks, as shown below. Is this
expected? No records are missing from these totals; some tasks simply have
1 or 2 more jobs. Is that possible?

116  116  115  115  114  115  114  114
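
For reference, per-task job counts like these can be gathered with
something like the sketch below. It assumes the st_job_stats output has
been saved as text with comma-separated rows and '#' header lines, as in
the excerpts under question (1); count_jobs is a hypothetical helper, and
the sample rows are only that excerpt, not the full trace.

```python
from collections import Counter


def count_jobs(lines):
    """Count job records per PID, skipping '#' headers and non-data lines."""
    counts = Counter()
    for line in lines:
        fields = line.split(",")
        # A data row has exactly the 7 columns printed by st_job_stats above.
        if line.lstrip().startswith("#") or len(fields) != 7:
            continue
        counts[int(fields[0])] += 1
    return counts


sample = [
    "# task NAME=rt_launch PID=13166 COST=180000000 PERIOD=200000000 CPU=0",
    " 13166,     2,  200000000,  180019319,        0,  -19980681,          0",
    " 13166,     3,  200000000,  180022003,        0,  -19977997,          0",
    "....",
]
print(count_jobs(sample))

# Spread of the per-task totals reported above.
totals = [116, 116, 115, 115, 114, 115, 114, 114]
print(max(totals) - min(totals))
```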

(3) I want to repeat the test case 20 times and then average the
schedulability. In either case (whether or not the period=0 jobs are
counted as scheduled jobs), I see a lot of inter-run variation, as shown
below. Is this expected? Are you able to get consistent trace records
(a consistent fraction of schedulable task sets) every time?

1.00 1.00 1.00 1.00 1.00 .13 1.00 1.00 1.00 .13 .13 1.00 .25 .13 .13 .13
.13 1.00 .25 1.00
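
The averaging step over these 20 runs can be sketched as follows. The
values are copied from the list above; reading .13 and .25 as roughly 1/8
and 2/8 (one or two of the 8 tasks) is only a guess. The mean comes out at
about 0.62, with 11 of the 20 runs at 1.00.

```python
from statistics import mean, pstdev

# Per-run fractions copied from the list above (20 runs).
runs = [1.00, 1.00, 1.00, 1.00, 1.00, 0.13, 1.00, 1.00, 1.00, 0.13,
        0.13, 1.00, 0.25, 0.13, 0.13, 0.13, 0.13, 1.00, 0.25, 1.00]

# Number of runs in which the reported fraction is 1.00.
full = sum(1 for r in runs if r == 1.00)

print(f"runs: {len(runs)}, runs at 1.00: {full}")
print(f"mean: {mean(runs):.4f}, population std dev: {pstdev(runs):.4f}")
```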

Could you please comment on these 3 questions, or even just 1 of them?
Thanks in advance for your help!

Mikyung

On Fri, Sep 19, 2014 at 4:21 AM, Björn Brandenburg <bbb at mpi-sws.org> wrote:

>
> On 18 Sep 2014, at 06:29, Mikyung Kang <mkkang01 at gmail.com> wrote:
> >
> > I'm trying to get whole tracing information of RT task sets using
> > LITMUS-RT Version 2014.1.
> >
> > * System has 8 Cores, no hyper-threading, 16G memory
> > * Tested both Bare-metal case and Virtualization case (Xen): similar
> >   result
> > * Ubuntu 12.04 (Linux 3.10.5)
> > * Generated 10 tasks w/ Utilization=[1.0, 8.0] using rtspin
> > * Run 10 seconds using GSN-EDF scheduler
> >
> > When I spawned only 1 task (Period=100ms, WCET=10ms) during 10 seconds,
> > all records are being saved into .bin file correctly w/o missing records.
> > But with more than 1 task, many records are always missing.
>
> This sounds like something is broken. Even with 8x(10, 100) tasks you
> should have no tracing problems at all as there should be more than enough
> time for st_trace to catch up. Your system must be overutilized somehow.
>
> >
> > To avoid record-loss, I tried the following options based on the thread:
> > https://lists.litmus-rt.org/pipermail/litmus-dev/2013/000480.html.
> >
> > * Kernel config: CONFIG_SCHED_TASK_TRACE_SHIFT=13 (up to 8K events)
> > * Used /dev/shm/* instead of disk for the binary record file
> > * Removed unnecessary events for the calculation of deadline miss ratio
> >   (switch_to/from, block, resume, action, np_enter/exit)
> > * Current KERNEL_IMAGE_SIZE 512*1024*1024
> >
> > Then, around 4K events are being saved into one task-assigned core
> > (st-*.bin).
> > When I got the information through st_job_stats, I could see that the
> > number of recorded events per task is very different even though tasks have
> > the same period.
> > Moreover, usually 5~20% records are being missed for each task set, even
> > though utilization is very low. Sometimes, more than that.
>
> This indicates that your system suffers from intervals of overload during
> which the tracing tools are starved. Are you sure this happens already with
> only two tasks?
>
> >
> > Is this expected record-loss ratio using st_trace tool?
>
> No, unless the st_trace tool is being starved there should be no records
> lost.
>
> > What should I check more? Is there any other way to reduce/remove
> > record-loss?
>
> You can try editing litmus/Kconfig to raise the limit for
> CONFIG_SCHED_TASK_TRACE_SHIFT. You can also try running st_trace as a
> real-time task (with rt_launch).
>
> - Björn