[LITMUS^RT] RFC: kernel-style events for Litmus^RT

Jonathan Herman hermanjl at cs.unc.edu
Thu Feb 16 02:55:54 CET 2012


I agree, I should have separated those questions out.

On Wed, Feb 15, 2012 at 8:40 PM, Mac Mollison <mollison at cs.unc.edu> wrote:

> I think there are really two separate questions here (please let us
> know, Jonathan, if you agree or not):
>
> (1) "Should we switch from sched-trace to the Linux kernel tracing
> infrastructure used by kerneltrace instead of maintaining both side by
> side?"
>
> That, I have no opinion on.
>
> FYI, unit-trace has little to no bearing on this decision, because it
> would be easy to write a new unit-trace frontend that can parse the
> same trace files as kernelshark. I wrote a new frontend to parse trace
> files from my userspace scheduler, and it didn't take long.
>
> (2) "Is it a good idea to be adding new visuazliation functionality to
> kernelshark instead of unit-trace? i.e. where do we want to spend our
> effort in terms of developing visualization tools?"
>
> I concur that it is worth trying to extend kernelshark. You're going
> to get much more bang for your buck that way, as opposed to working
> with the extremely obtuse unit-trace visualizer code.
>
> Just in case that ultimately proves to be problematic, you could always
> switch back to the unit-trace visualizer. By then there may be a new,
> maintainable, extensible unit-trace visualizer anyway, because I think
> I'll have to create something like that for my userspace scheduling
> work.
>
> - Mac
>
>
> On Wed, 15 Feb 2012 18:36:15 -0500
> Jonathan Herman <hermanjl at cs.unc.edu> wrote:
>
> > This is really, really nice. I'll give it a couple of days for everyone
> > to check it out and then probably merge it into staging. It has
> > inspired another question: should we move sched_trace towards this
> > infrastructure?
> >
> > I need to add visualization for container scheduling into something so
> > that I can practically debug my implementation. The unit-trace
> > visualization code is a tad obtuse, and I was not looking forward to
> > adding container support. The code for kernelshark seems modularized
> > and slick; I would much rather add code there. I could add
> > visualization for releases / deadlines / blocking etc. fairly easily.
> >
> > Other / future work on Litmus (Glenn's interrupts, Chris's memory
> > management) would benefit from an easily extensible tracing framework.
> > I don't want to extend unit-trace if we'll have to abandon it for
> > tracepoints anyway.
> >
> > Chris, Glenn, Mac, and I are in favor of abandoning unit-trace for
> > kernel visualization. Bjoern and Andrea, what do you think about this?
> > Going forward, I see us dropping unit-trace for kernel visualization,
> > but could we replace sched_trace entirely in the long term? Would we
> > want to?
> >
> > For those who didn't get a chance to play with it, this also supports
> > dynamically enabling / disabling events, as well as a task-centric
> > view of system events, so that you can list rtspin processes and see
> > how they are behaving.
> >
> > On Tue, Feb 14, 2012 at 2:59 PM, Andrea Bastoni <bastoni at cs.unc.edu>
> > wrote:
> >
> > > On 02/14/2012 12:05 AM, Glenn Elliott wrote:
> > > >
> > > > On Feb 11, 2012, at 4:17 PM, Andrea Bastoni wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I've managed to expand and polish a patch that I've had around
> > > >> for a while. It basically enables the same sched_trace_XXX()
> > > >> functions that we currently use to trace scheduling events, but it
> > > >> does so using kernel-style events (/sys/kernel/debug/tracing/ etc.).
> > > >>
> > > >> So, why another tracing infrastructure:
> > > >> - Litmus tracepoints can be recorded and analyzed together
> > > >>  (single time reference) with all other kernel tracing events
> > > >>  (e.g., sched:sched_switch, etc.). This makes it easier to
> > > >>  correlate the effects of kernel events on Litmus tasks.
> > > >>
> > > >> - It enables a quick way to visualize and process schedule traces
> > > >>  using the trace-cmd utility and the kernelshark visualizer.
> > > >>  Kernelshark lacks unit-trace's schedule-correctness checks, but
> > > >>  it gives a fast view of schedule traces and has several
> > > >>  filtering options (for all kernel events, not only Litmus').
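> > > >>
> > > >> For reference, an event defined in this style looks roughly like
> > > >> the sketch below. This is just to give the idea -- the fields and
> > > >> the get_release()/get_deadline() accessors are illustrative; see
> > > >> the patch for the actual definitions:
> > > >>
> > > >> /* Sketch of a kernel-style Litmus tracepoint (illustrative). */
> > > >> #undef TRACE_SYSTEM
> > > >> #define TRACE_SYSTEM litmus
> > > >>
> > > >> #if !defined(_TRACE_LITMUS_H) || defined(TRACE_HEADER_MULTI_READ)
> > > >> #define _TRACE_LITMUS_H
> > > >>
> > > >> #include <linux/sched.h>
> > > >> #include <linux/tracepoint.h>
> > > >>
> > > >> TRACE_EVENT(litmus_task_release,
> > > >>         TP_PROTO(struct task_struct *t),
> > > >>         TP_ARGS(t),
> > > >>         TP_STRUCT__entry(
> > > >>                 __field(pid_t, pid)
> > > >>                 __field(unsigned long long, release)
> > > >>                 __field(unsigned long long, deadline)
> > > >>         ),
> > > >>         TP_fast_assign(
> > > >>                 __entry->pid      = t->pid;
> > > >>                 __entry->release  = get_release(t);
> > > >>                 __entry->deadline = get_deadline(t);
> > > >>         ),
> > > >>         TP_printk("release(%d): rel=%llu dl=%llu",
> > > >>                 __entry->pid, __entry->release, __entry->deadline)
> > > >> );
> > > >>
> > > >> #endif /* _TRACE_LITMUS_H */
> > > >>
> > > >> /* This must stay outside the include guard. */
> > > >> #include <trace/define_trace.h>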
> > > >>
> > > >> Attached (I hope the ML won't filter images ;)) you can find the
> > > >> visualization of a simple set of rtspin tasks. In particular,
> > > >> getting the trace of a single task is straightforward using
> > > >> trace-cmd:
> > > >>
> > > >> # trace-cmd record -e sched:sched_switch -e litmus:* ./rtspin -p 0 50 100 2
> > > >>
> > > >> and to visualize it:
> > > >>
> > > >> # kernelshark trace.dat
> > > >>
> > > >> trace-cmd can be fetched from:
> > > >>
> > > >> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
> > > >>
> > > >> (kernelshark is just the "make gui" target of trace-cmd; trace-cmd
> > > >> and kernelshark have a lot more features than simple filtering and
> > > >> visualization; hopefully they should be a good help for debugging.)
> > > >>
> > > >> The patch is on "wip-tracepoints" on the main repository and on jupiter.
> > > >>
> > > >> Info on trace-cmd, kernelshark, and ftrace is available here:
> > > >>
> > > >> http://lwn.net/Articles/341902/
> > > >> http://lwn.net/Articles/425583/
> > > >> http://rostedt.homelinux.com/kernelshark/
> > > >> http://lwn.net/Articles/365835/
> > > >> http://lwn.net/Articles/366796/
> > > >
> > > >
> > > > I saw these tracing tools at RTLWS this year and thought it would
> > > > be nice to leverage the OS tracing and visualization tools. The
> > > > validation methods of unit-trace are nice, but have fallen out of
> > > > use. Unit-trace is mostly used for visual inspection/validation,
> > > > and I think kernelshark is probably more robust than unit-trace,
> > > > right?
> > >
> > > Umm, I think the major strength of this approach is that it's easier
> > > to correlate (also visually) Linux tasks and Litmus tasks. It also
> > > enables a quick way to visualize schedule traces, but ATM:
> > >
> > > - unit-trace schedule plots are prettier! :)
> > > When you visualize plots with kernelshark you also get (if you don't
> > > disable them) all the "spam" from other events/tracepoints.
> > >
> > > - unit-trace can automatically check for deadline misses
> > >
> > > > Questions:
> > > > (1) I guess this would completely remove the feather-trace
> > > > underpinnings of sched_trace in favor of this?
> > >
> > > Nope, as I said in a previous email, it adds to sched_trace_XXX().
> > > You can have both enabled, both disabled, or one enabled and the
> > > other disabled. The defines in [include/litmus/sched_trace.h] do the
> > > enable/disable trick.
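> > >
> > > Conceptually it's something like the sketch below -- only to give
> > > the idea of the dispatch; the ST_FT_RELEASE/ST_TP_RELEASE helpers
> > > and the tracepoint config name are made up here, the real macros in
> > > sched_trace.h differ:
> > >
> > > /* Each sched_trace_XXX() expands into whichever backends are
> > >  * compiled in, so feather-trace and tracepoints can coexist. */
> > > #ifdef CONFIG_SCHED_TASK_TRACE
> > > /* feather-trace backend: id/callback pair, as before */
> > > #define ST_FT_RELEASE(t) \
> > >         ft_event1(ST_RELEASE_ID, do_sched_trace_task_release, t)
> > > #else
> > > #define ST_FT_RELEASE(t) do { } while (0)
> > > #endif
> > >
> > > #ifdef CONFIG_SCHED_LITMUS_TRACEPOINT
> > > /* new backend: the tracepoint generated by TRACE_EVENT() */
> > > #define ST_TP_RELEASE(t) trace_litmus_task_release(t)
> > > #else
> > > #define ST_TP_RELEASE(t) do { } while (0)
> > > #endif
> > >
> > > #define sched_trace_task_release(t) \
> > >         do { ST_FT_RELEASE(t); ST_TP_RELEASE(t); } while (0)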
> > >
> > > > (2) How might this affect the analysis tools we use in
> > > > sched_trace.git? Can we merely update to new struct formats, or is
> > > > it more complicated than that?
> > >
> > > Umm, you're always more than welcome to update them if you want! :)
> > > I don't see problems in using both methods. It's always nice to have
> > > Litmus-only traces without all the spam that can be generated by
> > > kernel function tracers. (You can play with "./trace-cmd record -e
> > > all /bin/ls" to get an idea of how many events will be recorded...
> > > and you're just tracing events, not all the functions!)
> > >
> > > > (3) How big is the buffer used by the Linux tracing? Using
> > > > feather-trace-based tracing, I've seen dropped events in systems
> > > > that are temporarily overutilized. This is because ft-trace gets
> > > > starved for CPU time. I've made the sched_trace buffers huge to
> > > > counter this, but this "fix" doesn't always work. Would Linux
> > > > tracing make dropped events more or less likely? What recourse do
> > > > we have if we find that events are being dropped?
> > >
> > > [snip]
> > > > Info on trace-cmd, kernelshark, and ftrace are available here:
> > > >
> > > [snip]
> > > > http://lwn.net/Articles/366796/
> > >
> > > Increasing buffer_size_kb should help; and perhaps starting/stopping
> > > the trace from within the kernel may also work.
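> > >
> > > I.e., the ring buffer can be enlarged via
> > > /sys/kernel/debug/tracing/buffer_size_kb, and if events still get
> > > dropped, something along the lines of the sketch below could freeze
> > > the buffer so the events leading up to the problem survive for
> > > post-mortem inspection. tracing_off() is the stock in-kernel helper;
> > > the deadline-miss hook is purely illustrative, not an existing
> > > Litmus callback:
> > >
> > > #include <linux/kernel.h>   /* tracing_off() */
> > > #include <linux/sched.h>
> > >
> > > static void litmus_on_deadline_miss(struct task_struct *t)
> > > {
> > >         /* Stop the ftrace ring buffer; re-enable it later with
> > >          * "echo 1 > /sys/kernel/debug/tracing/tracing_on". */
> > >         tracing_off();
> > >         printk(KERN_WARNING "litmus: trace frozen, task %d missed "
> > >                "a deadline\n", t->pid);
> > > }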
> > >
> > > Thanks,
> > > - Andrea
> > >
> > >
> > > > -Glenn



-- 
Jonathan Herman
Department of Computer Science at UNC Chapel Hill