[LITMUS^RT] Container implementation idea
Glenn Elliott
gelliott at cs.unc.edu
Mon Feb 27 19:23:25 CET 2012
On Feb 26, 2012, at 9:40 PM, Jonathan Herman wrote:
> I will be adding container support to litmus over the next
> year. Exciting. My goals are to support a hierarchical container
> scheme where each container can have its own scheduling policy and a
> single container can be run on multiple CPUs simultaneously.
>
> This is my idea and I would love it to be critiqued.
>
> -- Interface --
> Tasks should be able to join and leave containers dynamically, but I
> don't intend to support (initially) dynamic creation of a container
> hierarchy. Instead, plugins will configure their own containers.
>
> I intend to completely copy the container proc interface from
> Linux. The interface looks something like this:
> - <plugin>/<container-id>*/tasks
> Tasks are added and removed from containers by echoing their IDs into
> the tasks file of each container.
>
> For example, a CEDF plugin with containers would work as follows:
> 1. The scheduling plugin CODE would create a container for each
> cluster, and the container framework would automatically expose them
> under proc:
> - CEDF/0/tasks
> - CEDF/1/tasks
> - CEDF/2/tasks
> - CEDF/3/tasks
> 2. The scheduling plugin USER would create tasks, and echo their PIDs
> into these containers as he/she sees fit.
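If I'm reading the proposed interface right, joining a cluster from userspace would amount to something like the sketch below. The /proc/litmus/... prefix is my guess; the proposal only gives the relative CEDF/<id>/tasks layout.

/* Hypothetical userspace join, assuming the per-container files end up
 * somewhere like /proc/litmus/<plugin>/<id>/tasks -- the exact prefix
 * is my assumption, not part of the proposal. */
#include <stdio.h>
#include <unistd.h>

static int join_via_proc(const char *tasks_file, pid_t pid)
{
        FILE *f = fopen(tasks_file, "w");
        if (!f)
                return -1;
        fprintf(f, "%d\n", (int) pid);  /* PID written as ASCII text */
        return fclose(f);
}

int main(void)
{
        /* e.g., place the calling task in C-EDF cluster 0 */
        return join_via_proc("/proc/litmus/CEDF/0/tasks", getpid()) ? 1 : 0;
}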
Having to use fwrite() to join a container seems a bit heavy-handed. Why the departure from the mechanism used to specify CPU partitioning? Perhaps a system call could return to the user a description of the container hierarchy; a task could then traverse this hierarchy and join the container with a given identifier. I would also appreciate an in-kernel interface for migrating tasks between containers.
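Something along these lines is what I have in mind. Every name and field below is a placeholder for discussion, not a concrete proposal; neither call exists in LITMUS^RT today.

#include <sys/types.h>

/* Purely illustrative: a flat description of the container hierarchy. */
struct cont_descr {
        int  id;          /* container identifier */
        int  parent_id;   /* -1 for a root container */
        char policy[16];  /* e.g. "GEDF" */
};

/* Copy up to 'max' descriptors into 'tree'; return the number copied. */
extern int get_container_hierarchy(struct cont_descr *tree, int max);

/* Move 'pid' (0 == the calling task) into container 'id'. */
extern int join_container(pid_t pid, int id);

/* A task could then walk the description and join by identifier: */
static int join_by_id(int wanted)
{
        struct cont_descr tree[64];
        int i, n = get_container_hierarchy(tree, 64);

        for (i = 0; i < n; i++)
                if (tree[i].id == wanted)
                        return join_container(0, wanted);
        return -1;  /* no such container */
}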
What happens when a scheduled task changes containers?
> -- The framework --
> The framework I'm showing is designed with the following thoughts:
> * Eventual support for less ad-hoc slack stealing
> * Not breaking locking code too much
> * Trying to minimize amount of extra data being passed around
> * Keeping as much container code out of the plugins as possible
>
> The basic idea is to have rt_tasks become the fundamental schedulable
> entity in Litmus. An rt_task can correspond to a struct task_struct,
> as they do now, or an rt_cpu. I want to use these rt_tasks to abstract
> out the code needed to manage container hierarchies. Per-plugin
> scheduling code should not have to manage containers at all after initialization.
>
> An rt_cpu is a virtual processor which can run other tasks. It can
> have a task which is @linked to it, and it optionally enforces budget
> with an @timer. The virtual processor is run when its corresponding
> rt_task, or @server, is selected to run. When the rt_cpu is selected
> to run, it chooses a task to execute by asking its corresponding
> @container.
>
> struct rt_cpu {
> unsigned int cpu; /* Perhaps an RT_GLOBAL macro corresponds to
> * a wandering global virtual processor?
> */
> struct rt_task *server; /* 0xdeadbeef for no server maybe?
> * I'm thinking of doing something
> * special for servers which have
> * full utilization of a processor,
> * as servers in the BASE container
> * will.
> */
> struct rt_task *linked; /* What is logically running */
>
> struct enforcement_timer timer;
> struct bheap_node *hn; /* For membership in heaps */
>
> struct rt_container *container; /* Clearly necessary */
> };
>
> An rt_container struct is a group of tasks scheduled together. The container
> can run tasks when one or more @procs are selected to run. When a
> container can run a task, it selects the next task to run using a
> @policy.
>
> struct rt_container {
> /* Potentially one of these for each CPU */
> struct rt_cpu *procs;
> cpumask_t cpus; /* Or perhaps num_cpus? I want O(1) access to
> * partitioned CPUs, but a container may also
> * have multiple global rt_cpus. I am not sure
> * how to accomplish O(1) access with global
> * rt_cpus. Ideas? I'll try to think of something.
> */
>
> /* To create the container hierarchy */
> struct rt_container *parent;
>
> /* The per-container method for scheduling the container's tasks */
> struct rt_policy *policy;
>
> /* Metadata kept separate from the rest of the container
> * because it is not used often. E.g. a task list, proc
> * entries, or a container name would be stored in this.
> */
> struct rt_cont_data *data;
> };
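Just to check my reading of the control flow: once an rt_cpu's @server is selected to run on a physical CPU, I picture the dispatch going roughly like this. pick_next() is my placeholder for whatever callback rt_policy ends up exposing; this is my paraphrase, not code from the proposal.

/* My paraphrase of the dispatch path, not proposed code. */
static struct rt_task *rt_cpu_pick_next(struct rt_cpu *vcpu)
{
        struct rt_container *c = vcpu->container;

        /* the container's policy chooses among the container's tasks */
        struct rt_task *next = c->policy->pick_next(c, vcpu);

        vcpu->linked = next;
        /* presumably budget enforcement is armed here via vcpu->timer */
        return next;
}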
Is a given policy instance static? That is, is a single G-EDF policy instance shared between containers, or are there several distinct G-EDF policy instances, one per container?
In general, I like this interface. It appears clean to me.
-Glenn