[LITMUS^RT] Question about the overhead measurement in LITMUS and potential memory leak?

Fri Oct 2 10:07:08 CEST 2015

> On 02 Oct 2015, at 04:32, Meng Xu <mengxu at seas.upenn.edu> wrote:
> 
> I'm Meng Xu, a PhD student at the PRECISE lab at the University of Pennsylvania.
> 
> I'm trying to measure the context-switch overhead with Feather-Trace on a Freescale i.MX6 ARM board. The board has 1GB of RAM and 4 cores.
> I followed the description at https://wiki.litmus-rt.org/litmus/Tracing.
> I measured the context-switch overhead with the following commands:
> 
> ST_TRACE_PATH=/dev/shm
> ST_TRACE_NAME=GSN-EDF
> OH_EVENTS="CXS_START CXS_END"
> ftcat /dev/litmus/ft_cpu_trace0 ${OH_EVENTS} > ${ST_TRACE_PATH}/oh-${ST_TRACE_NAME}-0.bin &
> ftcat /dev/litmus/ft_cpu_trace1 ${OH_EVENTS} > ${ST_TRACE_PATH}/oh-${ST_TRACE_NAME}-1.bin &
> ftcat /dev/litmus/ft_cpu_trace2 ${OH_EVENTS} > ${ST_TRACE_PATH}/oh-${ST_TRACE_NAME}-2.bin &
> ftcat /dev/litmus/ft_cpu_trace3 ${OH_EVENTS} > ${ST_TRACE_PATH}/oh-${ST_TRACE_NAME}-3.bin &
> 
> I ran an rtspin RT task under the GSN-EDF scheduler for 20 seconds and then combined the bin files into one file, "oh-GSN-EDF-all.bin".
> When I parse that file with the ft2csv command, it reports no complete events.
> 
> # ft2csv CXS oh-GSN-EDF-all.bin
> Total       :      11936
> Skipped     :          1
> Avoided     :          0
> Complete    :          0
> Incomplete  :       4941
> Non RT      :          0
> Interleaved :          0
> Interrupted :       1033
> 
> I tried to increase the buffer size of the overhead trace to a larger value, but it cannot be larger than 1024MB since RAM size is only 1024MB. 
> /dev/shm has size 512MB.
> 
> I also suspect there may be a memory leak in Feather-Trace.
> When I boot the system, there is about 800MB of free memory.
> After I run Feather-Trace, the free memory drops to 538MB. (ftcat has already been killed at this point.)

Well, Feather-Trace does allocate large buffers. I didn’t think they were leaking, but if you find evidence to the contrary I’d appreciate a patch. Keep in mind that free memory is expected to decrease to near-zero during normal operation of Linux due to the buffer cache.
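
One way to tell a leak from cache is to look at how /proc/meminfo splits the total, rather than at the free figure alone. The following is a minimal sketch, not part of the LITMUS^RT tools; the function name meminfo_summary is made up for illustration. It prints MemTotal, MemFree, Buffers, Cached, and Shmem in MiB. Note that files written to /dev/shm live in tmpfs, so they occupy RAM (counted under Shmem and Cached) until they are deleted.

```shell
#!/bin/sh
# Hypothetical helper (not a LITMUS^RT tool): summarise a meminfo-format
# file in MiB. Usage: meminfo_summary [file]   (default: /proc/meminfo)
meminfo_summary() {
    awk '
        # Input lines look like: MemFree:   551936 kB
        $1 == "MemTotal:" || $1 == "MemFree:" || $1 == "Buffers:" ||
        $1 == "Cached:"   || $1 == "Shmem:" {
            printf "%-10s %6d MiB\n", $1, $2 / 1024
        }
    ' "${1:-/proc/meminfo}"
}

# Print the summary for the running system, if /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    meminfo_summary
fi
```

Comparing the output right after boot, after tracing, and again after deleting the trace files from /dev/shm should show whether the missing memory is simply cache or tmpfs contents; `df -h /dev/shm` reports the tmpfs share directly. Memory that stays unaccounted for after that would be better evidence of a leak.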

> The commit I used is 7f051e3fa168eb60386b0e8d970551c06696befb, which was committed by Bjorn on Jun 12th, 2014.
> 
> My questions are:
> Did I misconfigure something that causes no complete events to be recorded?

No, I don’t see anything obviously wrong.

> Do you have any suggestions on how I can get complete events?
> 
> I tried running ftcat at the highest priority under the FIFO scheduler, but that didn't solve the problem.
> I suspected that ftcat was not consuming the buffer fast enough, so that old events were overwritten by new ones. However, I only run one rtspin task, which leaves at least 3 full cores idle to consume the buffer.

If ftcat is starved, then you get some incomplete records, but typically not 100% incomplete records. This must be something else.

Have you looked at the trace with ftdump? What does it look like? 

- Björn
