This is really, really nice. I'll give it a couple of days for everyone to check it out, then probably merge<div>into staging. It has inspired another question: should we move sched_trace towards this infrastructure?</div>


<div><br></div><div>I need to add visualization for container scheduling into something so that I can practically debug</div><div>my implementation. The unit-trace visualization code is a tad obtuse and I was not looking forward to</div>


<div>adding container support. The code for kernelshark seems modularized and slick. I would much rather</div><div>add code to this. I could add visualization for releases / deadlines / blocking etc fairly easily.</div><div>


<br></div><div>Other / future work (Glenn's interrupts, Chris's memory management) on litmus would benefit from an</div><div>easily extendable tracing framework. I don't want to extend unit-trace if we'll have to abandon it for</div>


<div>tracepoints anyway.</div><div><br></div><div>Chris, Glenn, Mac, and I are pro abandoning unit-trace for kernel visualization. Bjoern and Andrea, what do</div><div>you think about this? Going forward, I would see us dropping unit-trace for kernel visualization, but could</div>


<div>we replace sched_trace entirely in the long term? Would we want to?</div><div><br></div><div>For those that didn't get a chance to play with it, this also supports dynamically enabling / disabling events</div><div>


as well as a task-centric view of system events, so that you can list rt-spin processes and see how they are</div><div>behaving.</div><div><br><div class="gmail_quote">On Tue, Feb 14, 2012 at 2:59 PM, Andrea Bastoni <span dir="ltr"><<a href="mailto:bastoni@cs.unc.edu" target="_blank">bastoni@cs.unc.edu</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div></div><div>On 02/14/2012 12:05 AM, Glenn Elliott wrote:<br>

><br>

> On Feb 11, 2012, at 4:17 PM, Andrea Bastoni wrote:<br>

><br>

>> Hi all,<br>

>><br>

>> I've managed to expand and polish a patch that I've had around for a<br>

>> while. It basically enables the same sched_trace_XXX() functions that we<br>

>> currently use to trace scheduling events, but it does so using kernel-style<br>

>> events (/sys/kernel/debug/tracing/ etc.).<br>

>><br>

>> So, why another tracing infrastructure:<br>

>> - Litmus tracepoints can be recorded and analyzed together (single<br>

>>  time reference) with all other kernel tracing events (e.g.,<br>

>>  sched:sched_switch, etc.). It's easier to correlate the effects<br>

>>  of kernel events on litmus tasks.<br>

>><br>

>> - It enables a quick way to visualize and process schedule traces<br>

>>  using the trace-cmd utility and the kernelshark visualizer.<br>

>>  Kernelshark lacks unit-trace's schedule-correctness checks, but<br>

>>  it enables a fast view of schedule traces and it has several<br>

>>  filtering options (for all kernel events, not only Litmus').<br>

>><br>

>> Attached (I hope the ML won't filter images ;)) you can find the visualization<br>

>> of a simple set of rtspin tasks. Particularly, getting the trace of a single<br>

>> task is straightforward using trace-cmd:<br>

>><br>

>> # trace-cmd record -e sched:sched_switch -e litmus:* ./rtspin -p 0 50 100 2<br>

>><br>

>> and to visualize it:<br>

>><br>

>> # kernelshark trace.dat<br>

>><br>

>> trace-cmd can be fetched here:<br>

>><br>

>> git://<a href="http://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git" target="_blank">git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git</a><br>

>><br>

>> (kernelshark is just the "make gui" of trace-cmd; trace-cmd and kernelshark<br>

>> have a lot more features than simple filtering and visualization; hopefully it<br>

>> should be a good help for debugging.)<br>

>><br>

>> The patch is on "wip-tracepoints" on main repository and jupiter.<br>

>><br>

>> Info on trace-cmd, kernelshark, and ftrace is available here:<br>

>><br>

>> <a href="http://lwn.net/Articles/341902/" target="_blank">http://lwn.net/Articles/341902/</a><br>

>> <a href="http://lwn.net/Articles/425583/" target="_blank">http://lwn.net/Articles/425583/</a><br>

>> <a href="http://rostedt.homelinux.com/kernelshark/" target="_blank">http://rostedt.homelinux.com/kernelshark/</a><br>

>> <a href="http://lwn.net/Articles/365835/" target="_blank">http://lwn.net/Articles/365835/</a><br>

>> <a href="http://lwn.net/Articles/366796/" target="_blank">http://lwn.net/Articles/366796/</a><br>

><br>

><br>

> I saw these tracing tools at RTLWS this year and thought it would be nice to<br>

> leverage the OS tracing and visualization tools. The validation methods of<br>

> unit-trace are nice, but have fallen out of use. Unit-trace is mostly used for<br>

> visual inspection/validation and I think kernelshark is probably more robust<br>

> than unit-trace, right?<br>

<br>

</div></div>Umm, I think the major strength of this approach is that it's easier to<br>

correlate (also visually) Linux tasks and Litmus tasks. It also enables a quick<br>

way to visualize schedule traces, but ATM:<br>

<br>

- unit-trace schedule plots are prettier! :)<br>

When you visualize plots with kernelshark you also get (if you don't disable<br>

them) all the "spam" from other events/tracing points.<br>

<br>

- unit-trace can automatically check for deadline misses<br>

<div><br>

> Questions:<br>

> (1) I guess this would completely remove the feather-trace under-pinnings to sched_trace in favor of this?<br>

<br>

</div>Nope, as I said in a previous email, it adds to sched_trace_XXX(). You can have<br>

both enabled, both disabled, or one enabled and the other disabled. The defines<br>

in [include/litmus/sched_trace.h] do the enable/disable trick.<br>

<div><br>

> (2) How might this affect the analysis tools we use in sched_trace.git?  Can<br>

> we merely update to new struct formats, or is it more complicated than that?<br>

<br>

</div>Umm, you're always more than welcome to update them if you want! :) I don't see<br>

problems in using both methods. It's always nice to have Litmus-only traces<br>

without all the spam that can be generated by kernel function tracers. (You can<br>

play with "./trace-cmd record -e all /bin/ls" to get an idea of how many events<br>

will be recorded... and you're just tracing events, not all the functions!)<br>

<div><br>

> (3) How big is the buffer used by the Linux tracing?  Using<br>

> feather-trace-based tracing, I've seen dropped events in systems that are<br>

> temporarily overutilized.  This is because ft-trace gets starved for CPU<br>

> time.  I've made the sched_trace buffers huge to counter this, but this "fix"<br>

> doesn't always work.  Would Linux tracing make dropped events more or less<br>

> likely?  What recourse do we have if we find that events are being dropped?<br>

<br>

</div>[snip]<br>

<div>> Info on trace-cmd, kernelshark, and ftrace is available here:<br>

><br>

</div>[snip]<br>

> <a href="http://lwn.net/Articles/366796/" target="_blank">http://lwn.net/Articles/366796/</a><br>

<br>

Try increasing buffer_size_kb; starting/stopping the trace from within the kernel may also help.<br>
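<br>
A rough sketch of those knobs, assuming the tracing directory is the /sys/kernel/debug/tracing/ path mentioned earlier and that the Litmus events show up under an events/litmus/ subdirectory (inferred from the "litmus:*" event name, not checked against the patch). The function names and the TRACING_DIR variable are made up for illustration, writing to these files normally requires root, and the tracing_on file is only the user-space counterpart of stopping the trace from inside the kernel:<br>
<pre>
```shell
# Hypothetical helpers, not part of trace-cmd or Litmus.
# TRACING_DIR defaults to the debugfs path from the thread; override it
# to try the functions against a scratch directory.
TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

# Set the per-CPU ring buffer size, in kilobytes.
set_buffer_kb() {
    echo "$1" > "$TRACING_DIR/buffer_size_kb"
}

# Enable (1) or disable (0) all events of the litmus subsystem.
# Assumes the events are registered under events/litmus/.
set_litmus_events() {
    echo "$1" > "$TRACING_DIR/events/litmus/enable"
}

# Resume (1) or pause (0) writing into the ring buffer.
set_tracing() {
    echo "$1" > "$TRACING_DIR/tracing_on"
}
```
</pre>
For example, "set_buffer_kb 8192" followed by "set_litmus_events 1" before launching the task set would enlarge each CPU's buffer and switch the Litmus events on. Whether that is enough to avoid drops on an overutilized system would still need to be measured.<br>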

<br>

Thanks,<br>

<font color="#888888">- Andrea<br>

</font><div><div></div><div><br>

<br>

> -Glenn<br>

><br>

><br>

> _______________________________________________<br>

> litmus-dev mailing list<br>

> <a href="mailto:litmus-dev@lists.litmus-rt.org" target="_blank">litmus-dev@lists.litmus-rt.org</a><br>

> <a href="https://lists.litmus-rt.org/listinfo/litmus-dev" target="_blank">https://lists.litmus-rt.org/listinfo/litmus-dev</a><br>

><br>

<br>

<br>


</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Jonathan Herman<br>Department of Computer Science at UNC Chapel Hill<br>

</div>