[LITMUS^RT] Running LITMUS-RT on ARM64
Meng Xu
xumengpanda at gmail.com
Mon Aug 28 20:59:42 CEST 2017
On Mon, Aug 28, 2017 at 11:24 AM, Björn Brandenburg <bbb at mpi-sws.org> wrote:
>
>
> > On 28. Aug 2017, at 16:28, Andrii Anisov <andrii_anisov at epam.com> wrote:
> >
> >
> > On 28.08.17 16:57, Björn Brandenburg wrote:
> >> Perhaps it is related to virtualized time? Can you reproduce the effect on bare metal (i.e., on native LITMUS^RT without any virtualization layer)?
> > I think it is not about virtualized time.
> > I could explain task execution times stretching by the nature of VCPUs in a virtualized system: VCPUs are themselves scheduled, so the VCPU running a task may be scheduled off a physical CPU and later scheduled back in, which increases the task's observed execution time. But I cannot explain an execution time reduction.
>
> Good point. However, looking at the trace, something funny appears to be going on with the system timing: the runtime of each job seems to depend on which of the tasks is scheduled first, which it obviously shouldn’t.
>
> > Anyway I should also learn about rtspin internals and sort out what was said to me here[1].
> >
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2017-08/msg02351.html
>
> Thanks for the pointer. Meng’s explanation hints at a problem related to timekeeping, which of course can be circumvented by simply executing a fixed amount of work, but his description of rtspin is not quite right.
>
> rtspin does NOT compare time stamps relative to wall clock time. Instead, it queries the kernel for the amount of CPU time that the process has consumed and simulates the workload based on that. See here:
>
> https://github.com/LITMUS-RT/liblitmus/blob/master/bin/rtspin.c#L217
>
> The cputime() function used here is defined in src/clocks.c and uses CLOCK_THREAD_CPUTIME_ID.
>
> https://github.com/LITMUS-RT/liblitmus/blob/master/src/clocks.c#L14
>
> So if an rtspin process gets preempted, it will NOT count the time that it is preempted as execution time. Here’s the “done spinning” condition that clearly shows this:
>
> https://github.com/LITMUS-RT/liblitmus/blob/master/bin/rtspin.c#L211
>
> I’m pretty sure this works correctly on bare metal. I’m attaching a trace of two rtspin instances launched as follows:
>
> setsched P-FP
> rtspin -w -p 1 -o 2 -q 5 -s 1 1 20 10 &
> rtspin -w -p 1 -q 10 -s 1 5 10 10 &
>
> In the trace you can see that the WCET=5ms task exhibits a response time of 6ms whenever it is preempted by the WCET=1ms task. Also note how little execution-time variation the jobs are exhibiting.
>
> Andrii, can you please check that this basic example works as expected?
>
> Meng, in your experience, is rtspin showing strange behaviour related to preemptions if LITMUS^RT is being virtualized? This would suggest that LITMUS^RT’s execution time tracking is broken under (para-)virtualization…
We didn't use rtspin in our LITMUS^RT experiments in the virtualized
environment, and we do not invoke any system calls in the payload of a
job. Instead, we measure a piece of code so that it runs for 1 ms on a
100%-utilization VCPU in the virtualized environment and use it as the
payload. To get a task with x ms WCET, we then run that piece of code
x times, as sketched below.
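For illustration, the payload is roughly of the following form. This is
a sketch with made-up names (loop_1ms, job_body, ITERATIONS_PER_MS are
illustrative, not our actual experiment code); the calibration constant
is measured beforehand on the target VCPU and is platform-dependent.

/* Sketch of the fixed-work payload (illustrative names). */

/* Calibration constant: number of iterations that take about 1 ms of
 * pure computation on a 100%-utilization VCPU; measured offline, the
 * value below is only a placeholder. */
#define ITERATIONS_PER_MS 500000UL

static volatile unsigned long sink;

/* Busy loop calibrated to roughly 1 ms of computation. */
static void loop_1ms(void)
{
    unsigned long i;
    for (i = 0; i < ITERATIONS_PER_MS; i++)
        sink += i;
}

/* Job payload for a task with wcet_ms milliseconds of WCET:
 * run the calibrated 1 ms loop wcet_ms times. */
static void job_body(unsigned int wcet_ms)
{
    unsigned int x;
    for (x = 0; x < wcet_ms; x++)
        loop_1ms();
}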
As to the clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) function: I agree
that the workload in each job is fixed "if"
clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) really provides the
"execution time" of the thread.
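For reference, the cputime()-based spin condition Björn points to boils
down to something like this. This is a rough sketch in the same spirit,
not the actual rtspin/liblitmus code:

#include <time.h>

/* Return the CPU time consumed by the calling thread, in seconds. */
static double thread_cputime(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

/* Spin until exec_time seconds of thread CPU time have been consumed.
 * Time spent preempted does not advance CLOCK_THREAD_CPUTIME_ID, so it
 * is not counted as execution time here. */
static void spin_for(double exec_time)
{
    double start = thread_cputime();
    while (thread_cputime() - start < exec_time)
        ; /* burn cycles */
}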
However, after searching around the internet, it seems to me that
clock_gettime() may not provide an accurate measurement of the
thread's execution time.
According to: https://linux.die.net/man/3/clock_gettime,
[Quote]
"The CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clocks are
realized on many platforms using timers from the CPUs (TSC on i386,
AR.ITC on Itanium). These registers may differ between CPUs and as a
consequence these clocks may return bogus results if a process is
migrated to another CPU."
[/Quote]
Since VCPUs are scheduled across physical CPUs, is it possible that
clock_gettime() provides an inaccurate value?
I haven't looked into the implementation of clock_gettime(); digging
into those details may help explain the behaviour.
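One quick sanity check that might help (a sketch, not something we have
run, assuming an otherwise idle guest with the test pinned to a single
VCPU): busy-loop for a while and compare how much CLOCK_THREAD_CPUTIME_ID
and CLOCK_MONOTONIC advance. Without preemption the two should advance
at nearly the same rate; a large or erratic gap would point at the
guest's time accounting rather than at rtspin.

#include <stdio.h>
#include <time.h>

static double now(clockid_t clk)
{
    struct timespec ts;
    clock_gettime(clk, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void)
{
    double cpu0  = now(CLOCK_THREAD_CPUTIME_ID);
    double wall0 = now(CLOCK_MONOTONIC);

    /* Busy-loop for roughly one second of wall-clock time. */
    while (now(CLOCK_MONOTONIC) - wall0 < 1.0)
        ;

    printf("thread CPU time: %f s, wall time: %f s\n",
           now(CLOCK_THREAD_CPUTIME_ID) - cpu0,
           now(CLOCK_MONOTONIC) - wall0);
    return 0;
}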
Best,
Meng
-----------
Meng Xu
PhD Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/