[LITMUS^RT] RT-litmus behaviour under real-workloads

Thu Feb 15 10:10:00 CET 2018

Unfortunately, I am forced to use this release because it is the closest
to kernel 3.10, which has stable support for the Odroid-XU4. I am also
using other tools that require the 3.10 kernel.

The benchmarks and what they do are not the issue. Most of them perform
complex mathematical operations. A few, however, involve I/O, such as
JPEG decoding/encoding. Are there any issues if tasks perform I/O and are
not CPU-bound while in LITMUS^RT RT_mode? I read somewhere that this is
discouraged; is that true?

It is worth noting that I am using ft_tools 2016.1 against litmus 2014.2.
The reason is that ft_tools 2014.2 did not include st-trace-schedule,
st-draw and st-job-stats. I assume this version mismatch is the culprit.
If st-trace-schedule and st-job-stats are not available for 2014.2, what
reliable methods can I use to debug my schedules? I only need to confirm
that they work correctly before I move on to taking measurements.

Running without forced completions crashes the system: the screen goes
black and unresponsive, and a reboot is required afterward. I still have
no idea why. Has anyone seen this before?

Thanks

On 15 February 2018 at 03:48, Björn Brandenburg <bbb at mpi-sws.org> wrote:

>
> > On 15. Feb 2018, at 04:47, Ashraf E. Suyyagh <mrsuyyagh at gmail.com>
> wrote:
> >
> > First of all, my apologies for the long post, but needed to cover the
> setup and results in details. I am forced to use litmus 2014.2 on a 3.10
> kernel.
>
> This version is ancient. You really should upgrade to the latest release.
> We don’t have the resources to support old versions, sorry.
>
> > A sample command is running the bitcnts benchmark, the period is four
> times the WCET. Therefore, the implicit deadline is far away.
> > sudo ./bitcnts -p 1 -w 1502 6010 30 -- 1125000  &
>
> Since I don’t know your code, I don’t really know what this does.
>
> > I have run each of the benchmarks and they do execute periodically. So
> no problem in launching the benchmarks and I verified that my tasks are
> launching fine with correct results/outputs. However, the problems are the
> follows:
> >       • When running st-trace-schedule and collecting results by st-job
> stats for one task, ACET values are odd. In many cases, they are 0, in
> others extremely high values which makes no sense (e.g. 3677604500000) (see
> file result_01
>
> Yes, this makes no sense. Either the trace files are corrupt, or there’s
> some version mismatch, or something else fishy is going on. If the ACET
> values are garbage or implausible (which I haven’t seen before), then the
> trace is certainly not reliable.
>
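One quick way to see how much of a job-stats file is implausible is to
flag rows whose ACET cannot be right. The sketch below is only an
illustration: it assumes the eleven-column, comma-separated layout shown
in the output quoted in this thread, and the function name
flag_implausible_jobs is made up for this example. It treats a row as
suspicious when the ACET is zero although the response time is not, or
when the ACET is larger than the response time (a job cannot execute for
longer than it takes to respond).

```python
# Column order taken from the st-job-stats header quoted in this thread.
COLUMNS = ["task", "job", "period", "response", "dl_miss", "lateness",
           "tardiness", "forced", "acet", "preemptions", "migrations"]


def flag_implausible_jobs(text):
    """Return (task, job, reasons) for rows whose ACET cannot be right."""
    flagged = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        values = [int(field) for field in line.split(",")]
        if len(values) != len(COLUMNS):
            continue  # not a job record in the assumed layout
        row = dict(zip(COLUMNS, values))
        reasons = []
        if row["acet"] == 0 and row["response"] > 0:
            reasons.append("ACET is zero although the response time is not")
        if row["acet"] > row["response"]:
            reasons.append("ACET exceeds the response time")
        if reasons:
            flagged.append((row["task"], row["job"], reasons))
    return flagged


# Two job records copied from the output quoted in this thread.
SAMPLE = """\
# task NAME=bitcnts PID=4052 COST=1502000000 PERIOD=1715032704 CPU=1
  4052,     2, 1715032704, 1443401363,        0, -4566598637, 0, 0, 0, 0, 0
  4052,     3, 1715032704, 1446788641,        0, -4563211359, 0, 0, 0, 0, 0
"""

for task, job, reasons in flag_implausible_jobs(SAMPLE):
    print(task, job, "; ".join(reasons))
```

Both sample rows are flagged because their ACET is zero. A check like
this does not explain why the values are wrong; it only shows how
widespread the problem is in a given trace.
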
> >       • This issue is a frequent case with most benchmarks, I will list
> one as an example. In a task with a WCET of 107ms and a period of 500ms
> running for 30 seconds, the results are odd. The ACET shows as before, 0.
> And there are lots of forced jobs in a pattern of 1,1,0,1,1,0 ... etc or
> 0,1,0,10,1 ...etc. Why would the task be forced? The execution time is less
> than the WCET and the deadline is four times the WCET. There is no way I
> can imagine that the job would exceed its deadline. (see file result_02).
>
> Forced completions have nothing to do with deadlines.
>
> In any case, I would recommend to run without forced completions,
> especially when trying to debug scheduler behavior.
>
> > Do note, in a previous project we have run those benchmarks tens of
> thousands of times each and we have a solid idea of how long they execute
> on average.
>
> Note the words ‘on average’. But anyway this information is likely bogus
> because the trace is either corrupt or being wrongly parsed.
>
> >       • Is the period capped? I have a task which is run as " sudo
> ./bitcnts -p 1 -w 1502 6010 30 -- 1125000  &  ". However, the trace shows a
> period of 1715 instead of 6010
>
> There’s no period capping in LITMUS^RT. This just shows that the trace is
> not being parsed correctly.
>
> > # Task,   Job,     Period,   Response, DL Miss?,   Lateness,  Tardiness, Forced?,       ACET,  Preemptions,   Migrations
> > # task NAME=bitcnts PID=4052 COST=1502000000 PERIOD=1715032704 CPU=1
> >   4052,     2, 1715032704, 1443401363,        0, -4566598637,          0,       0,          0,            0,            0
> >   4052,     3, 1715032704, 1446788641,        0, -4563211359,          0,       0,          0,            0,            0
> >
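The numbers in the quoted output can be checked by hand. The sketch
below assumes that the -w arguments (1502 and 6010) are in milliseconds
and that the trace reports nanoseconds, which matches COST=1502000000.
Under that assumption, the reported PERIOD of 1715032704 is exactly
6010 ms in nanoseconds reduced modulo 2^32, and the reported lateness
equals the response time minus the full 6010 ms. This is consistent with
the period being truncated to 32 bits somewhere between the kernel and
the parsed output; where that happens is not established here. The
helper name wrap_u32 is made up for this example.

```python
NS_PER_MS = 1_000_000


def wrap_u32(value_ns):
    """Return value_ns as it would appear in an unsigned 32-bit field."""
    return value_ns & 0xFFFFFFFF


# Parameters from the command line quoted above (assumed to be in ms).
cost_ns = 1502 * NS_PER_MS
period_ns = 6010 * NS_PER_MS

print(wrap_u32(cost_ns))    # 1502000000: fits in 32 bits, so unchanged
print(wrap_u32(period_ns))  # 1715032704: the PERIOD shown in the trace

# Lateness = response time - relative deadline (implicit deadline = period).
for response_ns in (1443401363, 1446788641):
    print(response_ns - period_ns)  # -4566598637, then -4563211359
```

The lateness values match the untruncated period, so only the period
field itself appears to be affected.
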
>
> Please try to reproduce this problem — bogus actual execution times and
> bogus trace information — on the latest LITMUS^RT version and on an x86
> platform. (Presumably your test tasks are portable, so you should be able
> to run this easily on the latest version.) If the problem persists, please
> post an example that reproduces the issue. Otherwise, I’m afraid we won’t
> be able to look into this in more detail.
>
> - Björn