[LITMUS^RT] Running LITMUS-RT on ARM64

Tue Aug 29 16:23:43 CEST 2017

On Tue, Aug 29, 2017 at 10:07 AM, Andrii Anisov <andrii_anisov at epam.com> wrote:
> Hello Björn,
>
>
> On 29.08.17 00:49, Björn Brandenburg wrote:
>>
>> Unlikely. To my understanding, Linux uses cycle counters on ARM64, and
>> these cycle counters are synchronized across cores.
>>
>> I do not know how Linux’s time-tracking code interacts with Xen’s
>> virtualization of the cycle counters, in either native or para-virtualized
>> mode.
>>
>> In particular, what happens when a vCPU is preempted?
>
> It is transparent to the guest domain. To the guest kernel it looks like
> additional time spent on the task.
>
>> To the guest kernel, the currently scheduled guest process “is using the
>> vCPU” (even though it is not making progress since the vCPU itself is not
>> scheduled), so most likely it will be charged for that execution time.
>
> Yes, it is.
>
>> As a result, rtspin would be getting confused regarding how much of its
>> simulated workload is already complete.
>
> That is the case.
>
>> 1) rtspin works just fine on bare metal as intended. It was never designed
>> for virtualized environments.
>>
>> 2) rtspin should not be used as a benchmarking tool under virtualization.
>> You need a real workload, or at least something that carries out a known
>> amount of work.
>
> I'll implement one.
>
>> 3) It’s not just rtspin that will be confused — likely all of LITMUS^RT’s
>> budget tracking and enforcement code will not work as expected under
>> virtualization (when given vCPUs with non-100% utilization). This is because
>> the scheduler is not aware of any times during which a vCPU does not receive
>> service, so tasks will be charged for bogus execution time.
>
> This sounds really bad.

I wonder whether the Linux developers designed it this way on purpose.
If so, they must have weighed some tradeoffs; if not, it could lead to
problems.
BTW, I remember that someone from Oracle did some benchmarking a long
time ago. (I can't find the link right now; I'll share it if I find
it.) There was an interesting observation: sometimes a program runs
faster in a VM than on bare metal, because the scheduler makes
different scheduling decisions in the VM than it does natively. Maybe
the budget accounting issue in the VM is one of the factors behind
that observation?
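
To make the accounting issue from point 3 concrete, here is a minimal
sketch in Python. It is a toy model, not LITMUS^RT or Xen code: the
function names, the 10 ms budget, and the "4 ms on, 4 ms off" vCPU
schedule are all made up for illustration. It assumes, as described
above, that the guest charges a task for every unit of elapsed time,
including time during which the vCPU was not scheduled.

```python
def service_received(run_intervals, start, end):
    """Time within [start, end) during which the vCPU actually held a
    physical CPU. run_intervals is a list of (t0, t1) pairs."""
    total = 0
    for t0, t1 in run_intervals:
        overlap = min(t1, end) - max(t0, start)
        if overlap > 0:
            total += overlap
    return total


def guest_view(run_intervals, release, budget):
    """Model a guest that is unaware of vCPU preemption: it charges the
    task for all elapsed time, so the budget is considered exhausted
    exactly `budget` time units after release."""
    exhausted_at = release + budget
    charged = budget
    actual = service_received(run_intervals, release, exhausted_at)
    return exhausted_at, charged, actual


# Hypothetical hypervisor schedule (times in ms): the vCPU runs for
# 4 ms out of every 8 ms, i.e., 50% utilization.
run_intervals = [(t, t + 4) for t in range(0, 40, 8)]

exhausted_at, charged, actual = guest_view(run_intervals, release=0, budget=10)
print("budget exhausted at t =", exhausted_at)
print("charged:", charged, "ms; actually served:", actual, "ms")
print("bogus charge:", charged - actual, "ms")
```

In this toy schedule the task is charged its full 10 ms budget but
receives only 6 ms of real service; the remaining 4 ms is the "bogus
execution time" mentioned above. The same gap would confuse any
workload that decides how much work it has completed from elapsed time
alone.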

>
> Guys, I'll dig into this stuff. Will get back with results.

Thanks,

Meng

-- 
-----------
Meng Xu
PhD Candidate in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/