[LITMUS^RT] RFP: Jumbo Ring Buffers

Mon Apr 22 23:33:26 CEST 2013

As we've begun to use Litmus in more application-based studies, we've been running into situations where the trace ring buffers overflow.  This is caused by periods of system overutilization.  I've had a hard time with this problem in the past, and Sisu Xi ran into it last week.

We can increase ring buffer sizes to help mitigate overflow.  This works up to a point, but it becomes untenable as CPU core counts increase.  This is because the statically allocated ring buffers contribute to the kernel image size, which itself has a maximum (KERNEL_IMAGE_SIZE).  Since every CPU gets its own buffer, the maximum per-CPU ring buffer size gets smaller as core counts increase.  We are considering research on manycore chips with core counts over 240.  These chips are available today, so hitting KERNEL_IMAGE_SIZE may be a very real problem.
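
To put a rough number on this, here is a back-of-the-envelope sketch.  It assumes KERNEL_IMAGE_SIZE is 512 MiB (what I believe x86-64 uses; check your tree) and that the per-CPU buffers simply split whatever image space the rest of the kernel leaves over.  The function name and the other_image_bytes parameter are made up for illustration; nothing like this exists in Litmus.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 512 MiB image limit; verify against your architecture. */
#define ASSUMED_KERNEL_IMAGE_SIZE (512ULL * 1024 * 1024)

/*
 * Hypothetical helper: upper bound on the size of each CPU's static
 * ring buffer, given the image limit, the bytes already used by the
 * rest of the kernel image, and the number of CPUs.
 */
uint64_t max_per_cpu_buffer_bytes(uint64_t image_limit,
                                  uint64_t other_image_bytes,
                                  unsigned int ncpus)
{
    assert(ncpus > 0);
    if (other_image_bytes >= image_limit)
        return 0; /* no room left for static buffers at all */
    return (image_limit - other_image_bytes) / ncpus;
}
```

Under these assumptions, and even pretending the rest of the kernel takes no space, 4 CPUs could get at most 128 MiB each, while 240 CPUs could get only a little over 2 MiB each.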

Bottom line: Statically allocated ring buffers stink.  They are prone to overflow, and our ability to mitigate the problem shrinks as core counts grow.

My test platform has over 64GB of memory.  I would be happy to give 32GB over to tracing if it were necessary for my experiments.

Is there a way to use dynamically allocated page-locked buffers instead of static ones?  Are we limited by the kernel address space mechanisms?  If so, can we hijack user high page allocation for our needs?  I suppose we would lose boot-time tracing capabilities, but I think we could probably live with that.  Comments?  Ideas?

-Glenn