[LITMUS^RT] Job Quantity Issue

Geoffrey Tran gtran at isi.edu
Fri Aug 28 21:40:14 CEST 2015


Hi Björn,

Thanks for your reply.  I've answered your questions below.

1) LITMUS-RT 2014.2, with Xen patch

2) GSN-EDF with no modifications to the plugin

3) KVM

4) Each call to the job() function decrements the count, so the task
terminates when count == 0 (the loop structure is sketched after these
answers)

5) I guess I was confusing what LITMUS^RT sees as a job with the
"job" calls in the task.
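
(For reference, the task's main loop is roughly the following; this is a
simplified sketch based on the base_task.c skeleton, not the exact code:)

	do {
		sleep_next_period();	/* wait for the next budget replenishment */
		count = job();		/* do one "job"; job() decrements the counter */
	} while (count != 0);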

Thanks for this explanation.  It is helpful.  However, I have a few
follow-up questions, please.

- Shouldn't a forced job completion be counted as a missed deadline?
If this is what's happening in the second case, it looks like
st_job_stats isn't reporting it, possibly due to how it determines
whether a job missed its deadline.

- Would you be able to recommend a way to make the userspace jobs
line up with the kernel jobs? My goal is to run a known number of
jobs in the task, have that count match the jobs in the trace, and
be able to detect the missed deadlines in the benchmark at runtime.

For instance, if the task determines that it needs to call job() 10
times and iterations 3 and 8 miss their deadlines, then it should be
able to detect that at runtime, and the trace logs should also show
10 jobs, with the third and eighth jobs missing their deadlines.
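
A rough sketch of what I have in mind, using the get_job_no() /
wait_for_job_release() pattern you describe below (do_work() and
NUM_INVOCATIONS are just placeholders for my benchmark):

	unsigned int before, after, overruns = 0;
	int i;

	for (i = 0; i < NUM_INVOCATIONS; i++) {
		get_job_no(&before);
		do_work();			/* one userspace "job" */
		get_job_no(&after);
		if (after != before)
			/* the kernel job number advanced mid-invocation,
			 * i.e. the budget was exhausted at least once
			 * (a forced completion) */
			overruns++;
		wait_for_job_release(before + 1);
	}

Would comparing the kernel job number before and after each invocation
like this be a reasonable way to detect the overruns at runtime?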

Thank you again for your patience and help,
Geoffrey

----- Original Message -----
> From: "Björn Brandenburg" <bbb at mpi-sws.org>
> To: "Geoffrey Tran" <gtran at isi.edu>, litmus-dev at lists.litmus-rt.org
> Sent: Friday, August 28, 2015 2:00:35 AM
> Subject: Re: [LITMUS^RT] Job Quantity Issue
> 
> 
> > On 28 Aug 2015, at 03:16, Geoffrey Tran <gtran at isi.edu> wrote:
> > 
> > I was hoping to please get assistance with the following problem. It
> > is somewhat related to the previous messages at:
> > https://lists.litmus-rt.org/pipermail/litmus-dev/2015/001107.html
> > 
> > I have written a simple application based off of base_task.c from
> > liblitmus.  However, there is some strange behaviour.  First of
> > all, by the time the first job is run, the job number is around
> > 4 or 5.
> > 
> > The second problem is that the number of jobs that show up in
> > the traces is non-deterministic by a large range.  Below I
> > show two outputs from st_job_stats, one where it behaves
> > somewhat as expected, and another where it does not. According to
> > the input parameters, there should be 10 jobs, with a WCET of
> > 1ms, period of 100ms. However, the issue does show up at other
> > parameters also.
> 
> 
> Hi Geoffrey,
> 
> let’s see if we can figure it out. A couple of questions:
> 
> 1) Which version of LITMUS^RT is this?
> 
> 2) Which plugin are you using? Do you have local modifications?
> 
> 3) Which hardware platform? Native, para-virtualized Xen, or some full system
> emulator (e.g., QEMU)?
> 
> 4) How do you determine when to shut down the task? Your pseudocode says
> “while job count != 0”, which means it shouldn’t terminate until your job
> counter wraps around?
> 
> 5) Why do you expect exactly ten jobs in the traces? I’m not sure I
> understand your setup correctly. If you expect occasional budget overruns
> under precise enforcement, of course the number of “jobs” (= budget
> replenishments) is going to vary, depending on whether or not you overran a
> budget.
> 
> Say you have a budget of 10ms, a period of 100ms. For simplicity, let’s
> assume your task is the only real-time task in the system. Your task is
> invoked at time 0. Your task requires 11ms to complete the first invocation
> (i.e., the first iteration of the “job” loop). When done, your task calls
> sleep_next_period().  Under precise enforcement, the following is going to
> happen:
> 
> a) During [0, 10), the first 10ms of budget are going to be consumed.
> 
> b) Precise enforcement kicks in at 10ms, realizing a budget overrun. You get
> a “forced” job completion record in the sched_trace data and the task
> becomes ineligible for execution until its budget is replenished.
> 
> c) At time 100ms (i.e., after the period has elapsed), the budget is
> replenished. This is recorded as a new job release in the sched_trace
> stream. The kernel has no idea what you consider to be a “job” in your
> application; from the point of view of the kernel, one budget allocation ==
> one job.
> 
> d) At time 101ms, the task completes processing the first invocation and
> calls sleep_next_period().
> 
> e) The kernel processes the sleep_next_period() system call by discarding the
> rest of the current allocation (9ms in this case) and by marking the task as
> ineligible to execute until the next budget replenishment. This is recorded
> as a job completion record with the “forced” field set to zero.
> 
> 	https://github.com/LITMUS-RT/litmus-rt/blob/master/include/litmus/sched_trace.h#L55
> 
> f) At time 200ms, the budget is replenished and the task can process the
> second invocation. Note that the kernel’s notion of “job” and the task’s
> invocation count now disagree: due to precise budget enforcement, the task
> required two “jobs” (= budget allocations) to complete one invocation.
> 
> In other words, precise enforcement encapsulates each task in a server (in
> the sporadic server / CBS sense). The kernel tracks **server jobs** and has
> no insight into what userspace logically considers to be one invocation.
> 
> If your task has more work to do, and if it is using precise enforcement, it
> should not call sleep_next_period() unless you really, really want to
> discard your remaining budget. Note that you can also use the conditional call
> wait_for_job_release(job_no) instead of sleep_next_period():
> 
> 	https://github.com/LITMUS-RT/liblitmus/blob/master/include/litmus.h#L232
> 
> For example:
> 
> 	while (task should not exit):
> 		get_job_no(&cur_job_no)
> 		do_work()
> 		wait_for_job_release(cur_job_no + 1)
> 
> This will act like sleep_next_period() if you do NOT overrun your budget, but
> it does nothing (i.e., returns immediately) if you had to tap into the next
> job’s budget already.
> 
> I hope this helps to explain what’s happening. Please let us know if this
> solves the problem.
> 
> Best regards,
> Björn
> 
> 
> 
>



