[LITMUS^RT] litmus-dev Digest, Vol 49, Issue 4

Shuai Zhao zs673 at york.ac.uk
Thu Mar 17 00:22:20 CET 2016


Hi All

Thank you for helping.

1. It happens on both the generic Linux kernel and LITMUS^RT, so I suppose
it is an issue with generic Linux or with my machine. But I thought you
might know the answer, as you also deal with clocks in LITMUS^RT.

2. It seems that NTP is not responsible: I tried CLOCK_MONOTONIC_RAW as
well, but got the same results.

3. Yes, I bind the thread to core 1 for its whole lifetime, but I still get
the same results.

More info: I tried the same test program on a laptop and the results are
the opposite. The difference between the thread CPU timer and the wall
clock still doubles, but the wall-clock interval is always greater, whereas
on my desktop the thread CPU-time interval is greater. Really, really
weird. The laptop's nominal CPU clock speed is 2000 MHz, but it actually
runs at 1996 MHz.

I attached the program so you can also try it on your side. Compile with
gcc -g -Wall -O2 clock.c -lrt -lpthread. The program takes a parameter that
sets how many times the thread loops.

Thank you again for your help.

Best wishes
Shuai

On 16 March 2016 at 19:14, <litmus-dev-request at lists.litmus-rt.org> wrote:

> Send litmus-dev mailing list submissions to
>         litmus-dev at lists.litmus-rt.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.litmus-rt.org/listinfo/litmus-dev
> or, via email, send a message with subject or body 'help' to
>         litmus-dev-request at lists.litmus-rt.org
>
> You can reach the person managing the list at
>         litmus-dev-owner at lists.litmus-rt.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of litmus-dev digest..."
>
>
> Today's Topics:
>
>    1. Re: Interesting thread cpu time VS wall clock (Björn Brandenburg)
>    2. Re: Interesting thread cpu time VS wall clock (Glenn Elliott)
>    3. Re: Interesting thread cpu time VS wall clock (Manohar Vanga)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 16 Mar 2016 19:21:14 +0100
> From: Björn Brandenburg <bbb at mpi-sws.org>
> To: litmus-dev at lists.litmus-rt.org
> Subject: Re: [LITMUS^RT] Interesting thread cpu time VS wall clock
> Message-ID: <2802045C-31A7-436E-9356-A5F09F34EB6D at mpi-sws.org>
> Content-Type: text/plain; charset=utf-8
>
>
> > On 16 Mar 2016, at 18:40, Shuai Zhao <zs673 at york.ac.uk> wrote:
> >
> > Hi
> >
> > I found an interesting thing while trying to get the response time of
> tasks: the CPU time of a task (CLOCK_THREAD_CPUTIME_ID) accumulates faster
> than the wall clock (CLOCK_REALTIME)!
> >
> > I noticed that you use the same technique in rtspin.c, so I guess that
> you may know the answer.
> >
> > I know that this question is related to generic Linux, or maybe to the
> CPU clock speed, rather than to LITMUS^RT, but would you please have a
> look and help me?
> >
> > The CPU I use is an i7-6700K at 4.0 GHz, but the clock speed is actually
> 4006 MHz when I cat /proc/cpuinfo.
> >
> > Here is the code:
> >
> > /************* CPU cycle consuming for 3 threads *************/
> > #define NUMS 500
> > static int num[NUMS];
> > static int loop_once(void) {
> >       int i, j = 0;
> >       for (i = 0; i < NUMS; i++)
> >               j += num[i]++;
> >       return j;
> > }
> > /************* CPU cycle consuming END *************/
> >
> > void* run(void *tcontext) {
> >       struct timespec start, end, start1, end1, sleep = {0,10};
> >       int i;
> >
> >       exit_program = 0;
> >
> >       do {
> >               nanosleep(&sleep, NULL);
> >
> >               clock_gettime(CLOCK_REALTIME, &start);
> >               clock_gettime(CLOCK_THREAD_CPUTIME_ID, &start1);
> >
> >
> >               for(i = 0; i < loop_times; i++)
> >                       loop_once();
> >
> >
> >               clock_gettime(CLOCK_THREAD_CPUTIME_ID, &end1);
> >               clock_gettime(CLOCK_REALTIME, &end);
> >
> >
> >               response_time[0][count_first] = (end.tv_sec * 1000000000 +
> end.tv_nsec) - (start.tv_sec * 1000000000 + start.tv_nsec);
> >
> >               response_time[1][count_first] = (end1.tv_sec * 1000000000
> + end1.tv_nsec) - (start1.tv_sec * 1000000000 + start1.tv_nsec);
> >
> >
> >               /*finish the program */
> >               count_first++;
> >               if (count_first >= 10000)
> >                       exit_program = 1;
> >       } while (exit_program != 1);
> >
> >       return NULL;
> > }
> >
> > I basically copied this code from rtspin.c. In the program I take time
> stamps from the wall clock and the thread CPU clock, then let the thread
> loop a given number of times. Then I take time stamps again and check the
> difference between the two clocks (wallclock - threadcpuclock).
> >
> > I expected the wall-clock interval to always be bigger than the thread
> CPU-clock interval, since there will be timer or reschedule interrupts.
> >
> > But on my machine I got surprising results: as loop_times increases, the
> thread CPU clock runs ahead of the wall clock, and I got results like this
> (clock 0 is the wall clock and clock 1 is the thread CPU clock):
> >
> > <IMG_0257.JPG>
> >
> > So we can see that the thread CPU clock runs ahead of the wall clock by
> 33191 nanoseconds when we loop 100000 times, and I am shocked by this
> result.
> >
> > If the wall-clock interval were greater it would make sense, but here we
> have a negative difference.
> >
> > One possibility is that the wall clock and the CPU time are accounted by
> different timers in the kernel. But they should all be updated based on
> HZ, which is 1000 in my case. So I cannot explain why the CPU time on my
> computer counts faster than the wall clock.
> >
> > Thank you in advance for helping, and I look forward to your reply.
> >
> > Best wishes
> > Shuai
> >
>
> Dear Shuai,
>
> I've not seen this before. Does it happen only with LITMUS^RT, or also
> with a vanilla Linux kernel?
>
> - Björn
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 16 Mar 2016 11:53:08 -0700
> From: Glenn Elliott <gelliott at cs.unc.edu>
> To: litmus-dev at lists.litmus-rt.org
> Subject: Re: [LITMUS^RT] Interesting thread cpu time VS wall clock
> Message-ID: <67257061-AB20-4094-AA1B-0D65329390B4 at cs.unc.edu>
> Content-Type: text/plain;       charset=utf-8
>
>
>
> > On Mar 16, 2016, at 10:40 AM, Shuai Zhao <zs673 at york.ac.uk> wrote:
> >
> > [...]
>
> Hi Shuai,
>
> Could NTP or PTP be responsible?  Seems unlikely, but can you rule it out?
>
> -Glenn
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 16 Mar 2016 20:14:25 +0100
> From: Manohar Vanga <mvanga at mpi-sws.org>
> To: litmus-dev at lists.litmus-rt.org
> Subject: Re: [LITMUS^RT] Interesting thread cpu time VS wall clock
> Message-ID:
>         <
> CAEktxaEHLP_8FjSxPRo8rbg3SqOp0_x5ijucMZ_OYGF09Fy8kQ at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hey Shuai,
>
> Are you pinning the thread to a single core? If not, can you check this
> (e.g. using the taskset command)? I ask because we've seen timestamp
> counter variations across CPUs in that range (tens of microseconds) on
> larger server machines in the past. Although it is unlikely to be so
> prominent on a smaller desktop like the one you're using, it might be
> worth ruling out.
>
> Thanks!
>
> On Wed, Mar 16, 2016 at 6:40 PM, Shuai Zhao <zs673 at york.ac.uk> wrote:
>
> > [...]
>
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: IMG_0257.JPG
> Type: image/jpeg
> Size: 2858542 bytes
> Desc: not available
> URL: <
> http://lists.litmus-rt.org/pipermail/litmus-dev/attachments/20160316/ed57c089/attachment.jpe
> >
>
> ------------------------------
>
> _______________________________________________
> litmus-dev mailing list
> litmus-dev at lists.litmus-rt.org
> https://lists.litmus-rt.org/listinfo/litmus-dev
>
>
> End of litmus-dev Digest, Vol 49, Issue 4
> *****************************************
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clock.c
Type: text/x-csrc
Size: 3469 bytes
Desc: not available
URL: <http://lists.litmus-rt.org/pipermail/litmus-dev/attachments/20160316/5b6a4de3/attachment.c>


More information about the litmus-dev mailing list