[LITMUS^RT] How can I set total number of cores on G-EDF algorithm

Glenn Elliott gelliott at cs.unc.edu
Tue Dec 24 02:18:31 CET 2013


On Dec 23, 2013, at 5:15 AM, Luo, Zheng <luozheng at wustl.edu> wrote:

> 
> On Dec 23, 2013, at 2:14, Davide Compagnin <dcompagn at gmail.com> wrote:
> 
>> Hi Luo, a possibility is to change the max number of processors in the kernel configuration file.
>> 
>> On 22 Dec 2013 at 18:59, "Luo, Zheng" <luozheng at wustl.edu> wrote:
>> Hi everyone,
>> 
>> I want to run global EDF on an arbitrary number of cores. I have tried several methods, but none of them works for me, so I could really use some help now.
>> 
>> 1***
>> First, I tried C-EDF: I thought I could group the cores used in the experiment into one cluster and the unused cores into another cluster. The documentation on the website shows how to split the CPUs using L1, L2, L3, and ALL. I chose the parameter ALL and ran the experiment after switching plugins with ./setsched C-EDF. In base_mt_task.c, I added:
>> 
>> CALL( init_rt_thread() );
>> int cluster = 0;
>> int cluster_sz = 4; 
>> be_migrate_to_cluster(cluster, cluster_sz);
>> param.cpu   = cluster_to_first_cpu(cluster, cluster_sz);
>> CALL( set_rt_task_param(gettid(), &param) );
>> 
>> I tried different values for cluster and cluster_sz, and each time I changed them I followed the procedure described on the website: the C-EDF plugin must be reloaded (for example by switching to the Linux plugin and back to C-EDF). However, when I check CPU utilization with top -1, all 8 cores are being used. I am very confused here and don't know what I did wrong. I have attached a screenshot and my base_mt_task.c.
>> 
>> 2***
>> Second, I tried to use G-EDF after disabling some cores in Linux. I used:
>> sudo echo "0" > /sys/devices/system/node/node0/cpu1/online
>> sudo echo "0" > /sys/devices/system/node/node0/cpu3/online
>> Those commands did disable the CPU cores, but G-EDF does not seem to work correctly afterwards. The execution order of the rt_threads does not follow G-EDF; some threads with later deadlines even finish early. I am really confused here.
>> 
>> 3***
>> Third, I used the native Linux user-space interface to restrict the set of cores:
>> cpu_set_t mask;
>> CPU_ZERO(&mask);
>> for (unsigned i = first_core; i <= last_core; ++i) {
>>     CPU_SET(i, &mask);
>> }
>> sched_setaffinity(0, sizeof(mask), &mask);
>> Then I ran the experiment and got the same result as with method #1; G-EDF seems to ignore the CPU mask.
>> 
>> I tried all three methods, but none of them worked. Is there a better way to specify the number of cores on which to run the experiment? I could really use some help. Thank you very much; I look forward to your reply.
>> 
>> 
>> Zheng Luo
>> 
>> _______________________________________________
>> litmus-dev mailing list
>> litmus-dev at lists.litmus-rt.org
>> https://lists.litmus-rt.org/listinfo/litmus-dev
>> 
> 
> 
> 
> Hello Davide,
> 
> Thank you very much for the reply. That is one possibility. However, changing the maximum number of processors in the kernel takes a lot of time: I change the number of cores very often between experiments, and every change would require recompiling and reinstalling the kernel. Is there a way to do this without recompiling and reinstalling the kernel?
> 
> 
> Zheng Luo

Hi Zheng,

You can: (1) compile the Linux kernel with NR_CPUS equal to the number of CPUs in the system, and then (2) use the kernel boot parameter “maxcpus”, set to any value less than or equal to NR_CPUS.  You can specify kernel boot parameters via your bootloader configuration (e.g., GRUB).  There are many tutorials for this online.
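
For instance, assuming GRUB 2, you could limit the system to four CPUs by adding maxcpus=4 to the kernel command line in /etc/default/grub and running update-grub before rebooting (the exact file and update command depend on your distribution):

    GRUB_CMDLINE_LINUX_DEFAULT="quiet maxcpus=4"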

Of course, using kernel boot parameters will require you to reboot your machine every time you want to change the number of CPUs.  The approach also has some other deficiencies:
1) It’s hard to reason about how CPUs are connected to each other via cache.
2) You may still only have one release master CPU.

If any of the following statements are true, then you’re going to have to write a new Litmus scheduler plugin (or modify/extend an existing one):
1) You need more than one non-real-time CPU (and you can’t accomplish this by having empty clusters in C-EDF).
2) You can’t reboot between scheduler reconfigurations.

Going back to your original question, I think I may see some mistakes in your reasoning for item 1***.  C-EDF with “ALL” is equivalent to G-EDF.  C-EDF with L1 is equivalent to partitioned scheduling (on most systems when hyperthreading is disabled, at least).  C-EDF with L2 means that clusters are created around shared L2 caches.  For example, suppose you have CPUs 0 through 3.  CPU 0 and CPU 1 share an L2.  CPU 2 and CPU 3 share a distinct L2.  C-EDF with L2 would create two clusters: {CPU0, CPU1} and {CPU2, CPU3}.  You can extend this logic to clustering around L3 (this usually means grouping CPUs by CPU sockets—if your system has one socket, L3 clustering is probably equivalent to ALL).  NOTE: On some systems, the L2 is also private to each CPU (e.g., every Nehalem and later CPU by Intel).  In this case, L2 clustering would also be equivalent to partitioned scheduling.  C-EDF on a single-socket post-Nehalem Intel platform is boring: Your configurations are equivalent to partitioned EDF or global EDF, with nothing interesting in-between.
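
For reference, the cluster level is selected through the C-EDF plugin’s /proc entry (if I remember the path correctly, /proc/litmus/plugins/C-EDF/cluster), and, as you already noted, the plugin has to be reloaded for a change to take effect.  Roughly (as root):

    setsched Linux
    echo L2 > /proc/litmus/plugins/C-EDF/cluster   # or L1, L3, ALL
    setsched C-EDF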

Regarding liblitmus’s be_migrate_to_cluster(), I believe that I have found a bug on systems where there is more than one level of shared caches (the system may also need to be multi-socket).  Such a CPU would be Intel’s older Dunnington Xeon, where pairs of CPUs share an L2 and groups of six CPUs share an L3.  However, only the L3 is shared in Intel’s Nehalem and later CPUs, so I think you’re probably safe.  If C-EDF doesn’t barf on you when a task enters real-time mode, I think you’re all right.  Unfortunately, a good fix will be pretty involved—lots of cache probing is necessary—and I won’t have the time to develop a fix until after January.  I would be overjoyed if someone took a crack at it themselves.  Contact me if you’re interested.
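
In the meantime, a quick way to see whether a clustering setup is being rejected is to check the liblitmus return codes explicitly and abort on the first failure.  Here is a minimal sketch based on the snippet from your base_mt_task.c (the exec_cost/period values are placeholders, and bail_out() is the helper from the example skeleton):

    struct rt_task param;
    int cluster = 0, cluster_sz = 4;

    /* start from a zeroed parameter struct and fill in placeholder values */
    memset(&param, 0, sizeof(param));
    param.exec_cost = ms2ns(10);   /* placeholder budget */
    param.period    = ms2ns(100);  /* placeholder period */

    if (init_rt_thread() != 0)
        bail_out("init_rt_thread() failed");
    /* migrate to the target cluster and pin the task to a CPU inside it */
    if (be_migrate_to_cluster(cluster, cluster_sz) != 0)
        bail_out("be_migrate_to_cluster() failed");
    param.cpu = cluster_to_first_cpu(cluster, cluster_sz);
    if (set_rt_task_param(gettid(), &param) != 0)
        bail_out("set_rt_task_param() failed");
    /* this is where the kernel would "barf" on a bad cluster configuration */
    if (task_mode(LITMUS_RT_TASK) != 0)
        bail_out("task_mode() failed -- the kernel rejected the task");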

-Glenn