<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>On Feb 26, 2012, at 9:40 PM, Jonathan Herman wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div><div>I will be adding container support to litmus over the next</div><div>year. Exciting. My goals are to support a hierarchical container</div><div>scheme where each container can have its own scheduling policy and a</div>


<div>single container can be run on multiple CPUs simultaneously.</div></div><div><br></div><div>This is my idea and I would love it to be critiqued.</div><div><br></div><div><b>-- Interface --</b></div><div>Tasks should be able to join and leave containers dynamically, but I</div>


<div>don't intend to support (initially) dynamic creation of a container</div><div>hierarchy. Instead, plugins will configure their own containers.</div><div><br></div><div>I intend to completely copy the container proc interface from</div>


<div>Linux. The interface looks something like this:</div><div> - &lt;plugin&gt;/&lt;container-id&gt;*/tasks</div><div>Tasks are added and removed from containers by echoing their IDs into</div><div>the tasks file of each container.</div>


<div><br></div><div>For example, a CEDF plugin with containers would work as follows:</div><div>1. The scheduling plugin CODE would create a container for each</div><div>cluster, and the container framework would automatically expose them</div>


<div>under proc:</div><div> - CEDF/0/tasks</div><div> - CEDF/1/tasks</div><div> - CEDF/2/tasks</div><div> - CEDF/3/tasks</div><div>2. The scheduling plugin USER would create tasks and echo their PIDs</div><div>into these containers as he/she sees fit.</div></blockquote><div><br></div><div>Having to use fwrite() to join a container seems a bit heavy-handed.  Why the departure from the mechanism used to specify CPU partitioning?  Perhaps a system call could return to the user a description of the container hierarchy.  A task could then traverse this hierarchy and join the container with a given identifier.  I would also appreciate an in-kernel interface for migrating tasks between containers.</div><div><br></div><div>What happens when a scheduled task changes containers?</div><br><blockquote type="cite"><div><b>-- The framework --</b></div><div>The framework I'm showing is designed with the following goals in mind:</div><div>* Eventual support for less ad-hoc slack stealing</div><div>* Not breaking locking code too much</div>


<div>* Trying to minimize the amount of extra data being passed around</div><div>* Keeping as much container code out of the plugins as possible</div><div><br></div><div>The basic idea is to have rt_tasks become the fundamental schedulable</div>


<div>entity in Litmus. An rt_task can correspond to a struct task_struct,</div><div>as it does now, or to an rt_cpu. I want to use these rt_tasks to abstract</div><div>out the code needed to manage container hierarchies. Per-plugin</div>


<div>scheduling code should not have to manage containers at all after initialization.</div><div><br></div><div>An rt_cpu is a virtual processor which can run other tasks. It can</div><div>have a task which is @linked to it, and it optionally enforces budget</div>


<div>with an @timer. The virtual processor is run when its corresponding</div><div>rt_task, or @server, is selected to run. When the rt_cpu is selected</div><div>to run, it chooses a task to execute by asking its corresponding</div>


<div>@container.</div><div><br></div><div><font face="'courier new', monospace">struct rt_cpu {</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">    </span>unsigned int cpu; /* Perhaps an RT_GLOBAL macro corresponds to</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                        </span>   * a wandering global virtual processor?</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                   </span>   */</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span>struct rt_task *server; /* 0xdeadbeef for no server maybe?</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                                </span> * I'm thinking of doing something</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                                </span> * special for servers which have</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                         </span> * full utilization of a processor,</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                                </span> * as servers in the BASE container</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                               </span> * will.</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                                </span> */</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">       </span>struct rt_task *linked; /* What is logically running */</font></div>


<div><font face="'courier new', monospace"><br></font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">    </span>struct enforcement_timer timer;</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span>struct bheap_node *hn; /* For membership in heaps */</font></div><div><font face="'courier new', monospace"><br>


</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">       </span>struct rt_container *container; /* Clearly necessary */</font></div><div><font face="'courier new', monospace">};</font></div>


<div><br></div><div>An rt_container struct is a group of tasks scheduled together. The container</div><div>can run tasks when one or more @procs are selected to run. When a</div><div>container can run a task, it selects the next task to run using a</div>


<div>@policy.</div><div><br></div><div><font face="'courier new', monospace">struct rt_container {</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre"> </span>/* Potentially one of these for each CPU */</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span>struct rt_cpu *procs;</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">     </span>cpumask_t cpus; /* Or perhaps num_cpus? I want O(1) access to</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                        </span> * partitioned CPUs, but a container may also</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                     </span> * have multiple global rt_cpus. I am not sure</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                        </span> * how to accomplish O(1) access with global</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                      </span> * rt_cpus. Ideas? I'll try to think of something.</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">                        </span> */</font></div><div><font face="'courier new', monospace"><br></font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">   </span>/* To create the container hierarchy */</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span>struct rt_container *parent;</font></div><div><font face="'courier new', monospace"><br></font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span>/* The per-container method for scheduling container tasks */</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">      </span>struct rt_policy *policy;</font></div>


<div><font face="'courier new', monospace"><br></font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">    </span>/* Metadata kept separate from the rest of the container</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span> * because it is not used often. E.g. a task list, proc</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">   </span> * entries, or a container name would be stored in this.</font></div>


<div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">        </span> */</font></div><div><font face="'courier new', monospace"><span class="Apple-tab-span" style="white-space:pre">       </span>struct rt_cont_data *data;</font></div>


<div><font face="'courier new', monospace">};</font></div></blockquote><div><br></div><div>Is a given policy instance static?  That is, is a single G-EDF policy instance shared between containers, or are there several distinct G-EDF policy instances, one per container?</div><div><br></div><div>In general, I like this interface.  It appears clean to me.</div></div><br><div>-Glenn</div><div><br></div></body></html>