[LITMUS^RT] Feather-Trace scalability & TSC calibration patches

Björn Brandenburg bbb at mpi-sws.org
Tue Jan 14 10:56:21 CET 2014


Hi everyone,

here's another set of patches extracted from our RTAS'14 branch. When tracing LITMUS^RT on a 64-core platform, we ran into two issues:

	1) Feather-Trace's global timestamp buffer became a major scalability bottleneck, and

	2) the TSCs were not synchronized across cores, but differed by a constant offset.

(1) is a serious problem because it distorts the measurements (i.e., the overhead of Feather-Trace itself becomes much larger than the overhead that it is supposed to measure). (2) is not a problem for most measurements (CPU-local measurements are not affected by cross-CPU skew), but measuring IPI latencies becomes difficult if TSCs do not share a common time zero.

The following patches address (1) by changing Feather-Trace to record all timestamps into processor-local trace buffers, and provide a workaround for (2) by adding some benchmarking code that determines the offset between any two cores, which can then be used to patch up measurements to refer to a common time base.

	https://github.com/LITMUS-RT/litmus-rt/commits/wip-ft-pcpu
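
To illustrate the workaround for (2), here is a rough sketch of how known per-core offsets could be used to patch up a cross-CPU measurement. This is not the code from the patches: the offset table, its values, NUM_CPUS, and both function names are made up for illustration, and I'm assuming the offsets are expressed relative to CPU 0 and stay constant.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS 4 /* hypothetical; kept small for the example */

/* Hypothetical calibration result, in cycles:
 * tsc_offset[c] = (TSC value on CPU c) - (TSC value on CPU 0),
 * both read at the same instant. The values are invented. */
static const int64_t tsc_offset[NUM_CPUS] = { 0, 1500, -320, 48 };

/* Map a timestamp recorded on 'cpu' onto CPU 0's time base. */
static uint64_t to_common_base(uint64_t tsc, unsigned int cpu)
{
	assert(cpu < NUM_CPUS);
	return (uint64_t)((int64_t)tsc - tsc_offset[cpu]);
}

/* Cross-CPU latency in cycles, e.g. from sending an IPI on send_cpu
 * to handling it on recv_cpu, after removing the constant skew. */
static int64_t ipi_latency(uint64_t send_tsc, unsigned int send_cpu,
			   uint64_t recv_tsc, unsigned int recv_cpu)
{
	return (int64_t)(to_common_base(recv_tsc, recv_cpu) -
			 to_common_base(send_tsc, send_cpu));
}
```

With these made-up offsets, a send stamped 101500 on CPU 1 and a receive stamped 100680 on CPU 2 give a raw difference of -820 cycles, which is obviously nonsense, whereas ipi_latency() reports 1000 cycles.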

The corresponding userspace patches can be found here:

	https://github.com/LITMUS-RT/feather-trace-tools/commits/wip-ft-pcpu

This of course breaks userspace scripts in all sorts of ways because /litmus/dev/ft_trace0 goes away, new devices appear, etc. Nonetheless, I believe the pain is worth it because otherwise we won't be able to obtain meaningful measurements on large multicore platforms. I'd appreciate help with testing and feedback.

Thanks,
Björn




