[LITMUS^RT] How can I set total number of cores on G-EDF algorithm

Luo, Zheng luozheng at wustl.edu
Wed Dec 25 00:07:09 CET 2013


From: litmus-dev-bounces at lists.litmus-rt.org on behalf of Glenn Elliott <gelliott at cs.unc.edu>

Sent: Monday, December 23, 2013 6:18 PM
To: litmus-dev at lists.litmus-rt.org
Subject: Re: [LITMUS^RT] How can I set total number of cores on G-EDF algorithm


On Dec 23, 2013, at 5:15 AM, Luo, Zheng <luozheng at wustl.edu> wrote:


On Dec 23, 2013, at 2:14, Davide Compagnin <dcompagn at gmail.com> wrote:


Hi Luo, a possibility is to change the max number of processors in the kernel configuration file.

On 22 Dec 2013 at 18:59, "Luo, Zheng" <luozheng at wustl.edu> wrote:
Hi everyone,

I want to run global EDF on an arbitrary number of cores. I have tried several methods, but none of them works for me, so I could really use some help now.

1***
First, I tried C-EDF. My idea was to group the cores that will be used in the experiment into one cluster and the unused cores into another. However, the documentation on the website only shows how to divide the cores using L1, L2, L3, and ALL. I chose the parameter ALL and ran the experiment after selecting the plugin with ./setsched C-EDF. In base_mt_task.c, I added:

CALL( init_rt_thread() );

/* Target the first cluster of a clustering with 4 CPUs per cluster. */
int cluster = 0;
int cluster_sz = 4;

/* Migrate to that cluster and record one of its CPUs in the task parameters. */
be_migrate_to_cluster(cluster, cluster_sz);
param.cpu   = cluster_to_first_cpu(cluster, cluster_sz);
CALL( set_rt_task_param(gettid(), &param) );

I tried different values for cluster and cluster_sz, and whenever I changed them I followed the procedure described on the website: the C-EDF plugin must be reloaded (for example by switching to the Linux plugin and back to C-EDF; see the sketch below). Yet when I check CPU utilization with top -1, all 8 cores are being used. I am very confused here and don't know what I did wrong. I have attached a screenshot as well as my base_mt_task.c.
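(Concretely, the reload cycle can be scripted. To the best of my knowledge, setsched simply writes the plugin name into /proc/litmus/active_plugin, so a minimal C sketch of the Linux-and-back cycle, assuming that proc interface, root privileges, and no active real-time tasks, would be:)

/* Sketch: switch the active LITMUS^RT plugin by writing its name to
 * /proc/litmus/active_plugin (the interface that setsched wraps). */
#include <stdio.h>

static int set_plugin(const char *name)
{
    FILE *f = fopen("/proc/litmus/active_plugin", "w");
    if (!f) {
        perror("/proc/litmus/active_plugin");
        return -1;
    }
    fputs(name, f);               /* plugin name only, e.g. "Linux" or "C-EDF" */
    return fclose(f) ? -1 : 0;    /* the write is committed on close */
}

int main(void)
{
    /* Switch away and back so that C-EDF re-reads its cluster configuration. */
    if (set_plugin("Linux") || set_plugin("C-EDF"))
        return 1;
    return 0;
}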

2***
Second, I tried to use G-EDF after disabling some cores in the Linux system. I used:
sudo echo "0" > /sys/devices/system/node/node0/cpu1/online
sudo echo "0" > /sys/devices/system/node/node0/cpu3/online
Those commands did disable the CPU cores, but G-EDF does not seem to work correctly afterwards: the execution order of the rt_threads does not follow G-EDF, and some threads with later deadlines even finish early. I am really confused here.
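(The same hotplug operation can also be driven through the canonical /sys/devices/system/cpu/cpuN/online files; a minimal C sketch, assuming a kernel with CPU hotplug support and root privileges, would be:)

/* Sketch: take CPUs offline via sysfs CPU hotplug (requires CONFIG_HOTPLUG_CPU
 * and root privileges; CPU 0 usually cannot be taken offline). */
#include <stdio.h>

static int set_cpu_online(int cpu, int online)
{
    char path[64];
    FILE *f;

    snprintf(path, sizeof(path), "/sys/devices/system/cpu/cpu%d/online", cpu);
    f = fopen(path, "w");
    if (!f) {
        perror(path);
        return -1;
    }
    fprintf(f, "%d\n", online ? 1 : 0);
    return fclose(f) ? -1 : 0;
}

int main(void)
{
    /* Mirror the shell commands above: take CPUs 1 and 3 offline. */
    set_cpu_online(1, 0);
    set_cpu_online(3, 0);
    return 0;
}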

3***
Third, I tried the native Linux user-space API to restrict the set of cores, using:

#define _GNU_SOURCE
#include <sched.h>

/* Restrict this process to CPUs first_core..last_core (inclusive). */
cpu_set_t mask;
CPU_ZERO(&mask);
for (unsigned i = first_core; i <= last_core; ++i) {
    CPU_SET(i, &mask);
}
sched_setaffinity(0, sizeof(mask), &mask);

Then I ran the experiment and got the same result as with method #1; G-EDF appears to ignore the CPU affinity mask.

I tried those three methods, but none of them worked. Is there a better way to specify the number of cores to use for the experiment? I could really use some help now. Thank you very much; I am looking forward to your reply.


Zheng Luo

_______________________________________________
litmus-dev mailing list
litmus-dev at lists.litmus-rt.org
https://lists.litmus-rt.org/listinfo/litmus-dev




Hello Davide,

Thank you very much for the reply. That is one possibility. However, changing the maximum number of processors in the kernel takes a lot of time: I change the number of cores very often when running experiments, and every change to the kernel configuration requires recompiling and reinstalling the kernel. Is there any way to do this without recompiling and reinstalling the kernel?


Zheng Luo

Hi Zheng,

You can: (1) compile the Linux kernel with NR_CPUS equal to the number of system CPUs, and then (2) use the kernel boot parameter “maxcpus” and set any value less than or equal to NR_CPUS. You can specify kernel boot parameters via your bootloader configuration (e.g., GRUB). There are many tutorials for this online.
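(After rebooting, it is easy to double-check that “maxcpus” took effect; here is a minimal sketch using only standard sysconf() queries:)

/* Sketch: report how many CPUs the kernel knows about vs. how many it
 * actually brought online (e.g., after booting with maxcpus=N). */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long configured = sysconf(_SC_NPROCESSORS_CONF);
    long online     = sysconf(_SC_NPROCESSORS_ONLN);

    printf("CPUs configured: %ld\n", configured);
    printf("CPUs online:     %ld\n", online);
    return 0;
}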

Of course, using kernel boot parameters will require you to reboot your machine every time you want to change the number of CPUs. The approach also has some other deficiencies:
1) It’s hard to reason about how CPUs are connected to each other via cache.
2) You may still only have one release master CPU.

If any of the following statements are true, then you’re going to have to write a new Litmus scheduler plugin (or modify/extend an existing one):
1) Need more than one non-real-time CPU (and you can’t accomplish this by having empty clusters in C-EDF).
2) Can’t reboot between scheduler reconfigurations.

Going back to your original question, I think I may see some mistakes in your reasoning for item 1***.  C-EDF with “ALL” is equivalent to G-EDF.  C-EDF with L1 is equivalent to partitioned scheduling (on most systems when hyperthreading is disabled, at least).  C-EDF with L2 means that clusters are created around shared L2 caches.  For example, suppose you have CPUs 0 through 3.  CPU 0 and CPU 1 share an L2.  CPU 2 and CPU 3 share a distinct L2.  C-EDF with L2 would create two clusters: {CPU0, CPU1} and {CPU2, CPU3}.  You can extend this logic to clustering around L3 (this usually means grouping CPUs by CPU sockets—if your system has one socket, L3 clustering is probably equivalent to ALL).  NOTE: On some systems, the L2 is also private to each CPU (e.g., every Nehalem and later CPU by Intel).  In this case, L2 clustering would also be equivalent to partitioned scheduling.  C-EDF on a single-socket post-Nehalem Intel platform is boring: Your configurations are equivalent to partitioned EDF or global EDF, with nothing interesting in-between.
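(To see which clusters C-EDF would form on a particular machine, it helps to inspect the shared-cache topology that Linux exports through sysfs. The following is only a sketch, assuming the usual /sys/devices/system/cpu/cpuN/cache/indexM layout:)

/* Sketch: for each CPU, print which CPUs share each of its cache levels,
 * based on the sysfs cache topology. Note that L1 usually appears twice
 * (separate data and instruction caches). */
#include <stdio.h>
#include <string.h>

/* Read a single line from a sysfs file; returns 0 on success. */
static int read_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (!f || !fgets(buf, len, f)) {
        if (f)
            fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

int main(void)
{
    char path[128], level[16], shared[256];

    for (int cpu = 0; ; cpu++) {
        int found = 0;
        for (int idx = 0; ; idx++) {
            snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/cache/index%d/level",
                     cpu, idx);
            if (read_line(path, level, sizeof(level)))
                break;  /* no more cache indices for this CPU */
            snprintf(path, sizeof(path),
                     "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
                     cpu, idx);
            if (read_line(path, shared, sizeof(shared)))
                break;
            printf("cpu%d: L%s shared with CPUs %s\n", cpu, level, shared);
            found = 1;
        }
        if (!found)
            break;  /* cpu%d does not exist (or exports no cache info) */
    }
    return 0;
}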

Regarding liblitmus’s be_migrate_to_cluster(), I believe that I have found a bug on systems where there is more than one level of shared caches (the system may also need to be multi-socket).  Such a CPU would be Intel’s older Dunnington Xeon, where pairs of CPUs share an L2 and groups of six CPUs share an L3.  However, only the L3 is shared in Intel’s Nehalem and later CPUs, so I think you’re probably safe.  If C-EDF doesn’t barf on you when a task enters real-time mode, I think you’re all right.  Unfortunately, a good fix will be pretty involved—lots of cache probing is necessary—and I won’t have the time to develop a fix until after January.  I would be overjoyed if someone took a crack at it themselves.  Contact me if you’re interested.

-Glenn



Hi Glenn,

Thank you very much for the reply; that is very helpful. I can use the kernel boot parameter “maxcpus” and set any value less than or equal to NR_CPUS. Even though that approach has deficiencies, it at least works.

Zheng Luo

