[LITMUS^RT] Missing st_trace records

Björn Brandenburg bbb at mpi-sws.org
Fri Sep 19 10:21:01 CEST 2014


On 18 Sep 2014, at 06:29, Mikyung Kang <mkkang01 at gmail.com> wrote:
> 
> I'm trying to collect complete tracing information for RT task sets using LITMUS-RT version 2014.1.
> 
> * System has 8 cores, no hyper-threading, 16 GB memory
> * Tested both bare metal and virtualized (Xen): similar results
> * Ubuntu 12.04 (Linux 3.10.5)
> * Generated 10 tasks w/ Utilization=[1.0, 8.0] using rtspin
> * Ran for 10 seconds using the GSN-EDF scheduler
> 
> When I spawned only one task (period = 100 ms, WCET = 10 ms) for 10 seconds, all records were saved into the .bin file correctly, with none missing.
> But with more than one task, many records are always missing.

This sounds like something is broken. Even with eight tasks of (WCET, period) = (10 ms, 100 ms) you should have no tracing problems at all, as there should be more than enough time for st_trace to catch up. Your system must be overutilized somehow.
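
As a rough sanity check of that claim, the following Python sketch adds up the utilization of eight (10, 100) tasks and compares it with the 8 cores from the report. The helper name total_utilization is made up for illustration and is not part of LITMUS^RT or liblitmus.

```python
def total_utilization(tasks):
    """Sum of WCET/period over all tasks; tasks are (wcet_ms, period_ms) pairs."""
    return sum(wcet / period for wcet, period in tasks)


# Eight tasks with WCET 10 ms and period 100 ms, as in the reply above.
tasks = [(10, 100)] * 8
num_cpus = 8  # core count taken from the original report

u = total_utilization(tasks)
print(f"total utilization: {u:.2f} of {num_cpus} CPUs")
print(f"idle capacity:     {num_cpus - u:.2f} CPUs")
```

The task set uses well under one CPU's worth of capacity in total, so on paper st_trace has plenty of idle time to drain its buffers. Lost records therefore point to load that this simple model does not capture.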

> 
> To avoid record-loss, I tried the following options based on the thread: https://lists.litmus-rt.org/pipermail/litmus-dev/2013/000480.html.
> 
> * Kernel config: CONFIG_SCHED_TASK_TRACE_SHIFT=13 (up to 8K events)
> * Used /dev/shm/* instead of disk for the binary record file
> * Removed unnecessary events for the calculation of deadline miss ratio (switch_to/from, block, resume, action, np_enter/exit)
> * Current KERNEL_IMAGE_SIZE 512*1024*1024
> 
> Then, around 4K events are being saved into one task-assigned core (st-*.bin).
> When I looked at the output of st_job_stats, I could see that the number of recorded events per task varies widely even though the tasks have the same period.
> Moreover, usually 5-20% of the records are missing for each task set, even though utilization is very low. Sometimes it is more than that.

This indicates that your system suffers from intervals of overload during which the tracing tools are starved. Are you sure this happens already with only two tasks?
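
To check whether the loss already shows up with only two tasks, one option is to compare the number of jobs recorded per task against the number expected from the period and run length. The Python sketch below does this arithmetic. The function names and the per-task counts are hypothetical placeholders, not actual st_job_stats output, and the expected count is approximate because it ignores release offsets.

```python
def expected_jobs(duration_ms, period_ms):
    """Approximate number of jobs a periodic task releases during the run."""
    return duration_ms // period_ms


def loss_ratio(recorded, expected):
    """Fraction of the expected job records that are missing from the trace."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return max(0.0, 1.0 - recorded / expected)


# 10-second run with a 100 ms period, as in the original report.
expected = expected_jobs(10_000, 100)

# Hypothetical per-task job counts; substitute the counts from your own trace.
recorded_per_task = {"task-1": 100, "task-2": 93, "task-3": 85}

for name, recorded in recorded_per_task.items():
    ratio = loss_ratio(recorded, expected)
    print(f"{name}: {recorded}/{expected} jobs recorded, loss {ratio:.0%}")
```

Running this for task sets of increasing size should show at which point records start to go missing.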

> 
> Is this the expected record-loss ratio when using the st_trace tool?

No. Unless the st_trace tool is being starved, no records should be lost.

> What else should I check? Is there any other way to reduce or eliminate record loss?

You can try editing litmus/Kconfig to raise the limit for CONFIG_SCHED_TASK_TRACE_SHIFT. You can also try running st_trace as a real-time task (with rt_launch).
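
To get a feel for how much a larger shift helps, the Python sketch below estimates how long st_trace could be starved before a buffer overflows. It assumes that the buffer holds 2^SHIFT records (consistent with "13 = up to 8K events" quoted above) and that all events land in a single buffer, which is a pessimistic simplification. The figure of 4 records per job is a made-up placeholder, and the function names are illustrative only.

```python
def buffer_capacity(shift):
    """Records that fit in one trace buffer, assuming capacity = 2**shift."""
    return 1 << shift


def starvation_tolerance_s(shift, events_per_second):
    """Seconds of st_trace starvation a buffer can absorb at a given event rate."""
    return buffer_capacity(shift) / events_per_second


# Hypothetical workload: 10 tasks, 10 jobs per second each (100 ms period),
# and an assumed 4 trace records per job.
events_per_second = 10 * 10 * 4

for shift in (13, 14, 15):
    capacity = buffer_capacity(shift)
    tolerance = starvation_tolerance_s(shift, events_per_second)
    print(f"shift={shift}: {capacity} records, ~{tolerance:.1f} s of starvation tolerated")
```

Each increment of the shift doubles the buffer and hence the tolerated starvation interval, at the cost of more kernel memory.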

- Björn




