[LITMUS^RT] prop/cpu-sbinheap (based on 2014.2)

Glenn Elliott gelliott at cs.unc.edu
Fri Jun 20 19:04:32 CEST 2014


Hi Felipe,

Thank you for taking the time to perform a detailed test!  These numbers are in line with what I’ve observed here.  There are more ways that sbinheap could be optimized: function inlining through macros and per-cpu-based arrays.  However, I think time might be better spent on other improvements.
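
For concreteness, here is roughly what those two optimizations would look like (illustrative only; the names below are hypothetical, not code from the patch set):

    /* Inline the heap's index math via macros instead of function calls. */
    #define SBINHEAP_PARENT(i)  (((i) - 1) / 2)
    #define SBINHEAP_LEFT(i)    (2 * (i) + 1)
    #define SBINHEAP_RIGHT(i)   (2 * (i) + 2)

    /* Keep each CPU's heap node in its per-CPU area instead of one
     * kmalloc'ed block, saving a pointer indirection on the hot path. */
    DEFINE_PER_CPU(struct sbinheap_node, cpu_heap_node);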

Since there is no noticeable performance benefit, I defer to Björn on whether or not to merge the sbinheap data structure and, optionally, the changes to the schedulers to use it.  Although I feel this code has been well tested, that doesn’t mean it won’t become a maintenance burden in the future.

-Glenn


On Jun 17, 2014, at 8:13 AM, Felipe Cerqueira <felipeqcerqueira at gmail.com> wrote:

> Hi Glenn,
> 
> I got some overhead graphs from our 64-core machine, running up to 640 tasks.
> 
> "noheap" is the master branch and sbinheap is the "prop/sbinheap" branch. The difference between the two was not significant.
> The results are almost equal. sbinheap was slightly worse, but just because of measurement noise/small number of samples.
> 
> In general, the queue implementation doesn't seem to affect the overheads much. Otherwise, we would also see larger differences in the results
> of the RTSS'09 paper for the different heap implementations. Even though the SCHED and RELEASE operations make heavy use of the task queue,
> I think the costs are mostly due to lock and cache contention on the global state, since it is accessed by all processors simultaneously.
> Moreover, our machine has 32 + 32 CPUs connected by an interconnect, which makes cache coherence more costly.
> 
> Perhaps for larger task counts the new heap would make a difference, but contention would also increase, so I think it would still dominate the overheads.
> Still, replacing the heavy binomial heap with this binary heap makes sense. And it's working fine :)
> 
> Thanks,
> Felipe
> 
> 
> 2014-06-12 10:51 GMT+02:00 Björn Brandenburg <bbb at mpi-sws.org>:
> 
> On 10 Jun 2014, at 20:16, Glenn Elliott <gelliott at cs.unc.edu> wrote:
> 
> > A while back there was some discussion of replacing GSN-EDF's and C-EDF’s use of bheap (the binomial heap data structure) for ordering CPUs.  These schedulers use bheap to organize CPUs into a min-heap, ordered by priority of the currently scheduled task.  This lets the scheduler quickly determine which CPU to preempt (i.e., the lowest-priority CPU).  bheap is a powerful data structure, but it is heavier-weight than is needed to just order CPUs—we don’t need bheap's quick merging.
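> >
> > To make that concrete: the CPUs sit in a min-heap keyed by the priority of each CPU's currently scheduled task, so the preemption candidate is always at the root.  Roughly (a sketch with hypothetical names, not the actual GSN-EDF code):
> >
> >     /* Sketch: decide whether a newly released/pending job should
> >      * preempt anything.  The heap root is the CPU running the
> >      * lowest-priority job (an idle CPU has the lowest priority). */
> >     static void check_for_preemption(struct task_struct *pending)
> >     {
> >             cpu_entry_t *lowest = lowest_prio_cpu(); /* peek at the root */
> >
> >             if (edf_higher_prio(pending, lowest->scheduled))
> >                     preempt(lowest); /* kick that CPU to reschedule */
> >     }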
> >
> > Some time ago, I proposed replacing bheap with the binheap data structure (include/litmus/binheap.h).  However, binheap was also thought to be too heavyweight, since it supports arbitrarily large heaps.  We don’t need this flexibility to order CPUs, since the maximum size of the heap is known at compile time (i.e., NR_CPUS).
> >
> > To make some of my GPUSync locking protocols more efficient, I developed an “sbinheap” (static binary heap) data structure.  With the exception of the initialization routines, sbinheap uses the same API as the binheap data structure.  sbinheap builds the binary heap on a supplied contiguous array of heap nodes (either statically allocated or kmalloc’ed at runtime) using basic binary-tree-in-an-array indexing.  I’ve used sbinheap to track the top M highest-priority released jobs, to track the top M blocked tasks in a locking protocol, and so on.  I’ve also used it to order CPUs in the GSN-EDF and C-EDF schedulers.
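> >
> > For reference, the heart of sbinheap is just the classic binary-tree-in-an-array layout; a minimal sketch (hypothetical names, not the actual include/litmus/sbinheap.h API):
> >
> >     /* Node i's parent lives at (i-1)/2 and its children at 2i+1 and
> >      * 2i+2.  The backing array is supplied by the caller and never
> >      * grows, so no allocation happens after initialization. */
> >     struct sbinheap {
> >             struct sbinheap_node *buf; /* caller-supplied array */
> >             size_t size;               /* nodes currently in the heap */
> >             size_t max_size;           /* fixed capacity, e.g. NR_CPUS */
> >     };
> >
> >     static void sift_up(struct sbinheap *h, size_t i,
> >             int (*less)(struct sbinheap_node *a, struct sbinheap_node *b))
> >     {
> >             while (i > 0 && less(&h->buf[i], &h->buf[(i - 1) / 2])) {
> >                     swap(h->buf[i], h->buf[(i - 1) / 2]); /* kernel swap() */
> >                     i = (i - 1) / 2;
> >             }
> >     }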
> >
> > I thought sbinheap might be useful for others, so I’ve proposed a set of changes in the “prop/cpu-sbinheap” patch set.  This branch includes patches to add the sbinheap data structure and update GSN-EDF and C-EDF to use sbinheap instead of bheap to order CPUs.
> 
> Thanks a lot for the patches. On first glance, they look great.
> 
> > It might be hard to measure the performance benefits of sbinheap vs. bheap when the heaps are small.  Would someone at MPI be willing to quickly measure GSN-EDF scheduling overheads on their 64-core system under LITMUS^RT 2014.2, with and without my patch?
> 
> The machine is currently in use for another project, but we'll try to give it a run in the coming days.
> 
> > Locking protocol overheads may also be reduced.  I will do the same on one of our UNC systems.
> 
> Sounds good. Let's compare notes once the tests have completed.
> 
> Thanks,
> Björn
> 
> 
> [Attachments: RELEASE_noheap.pdf, RELEASE_sbinheap.pdf, SCHED_noheap.pdf, SCHED_sbinheap.pdf]


