[LITMUS^RT] linked but never scheduled

Glenn Elliott gelliott at cs.unc.edu
Sat Sep 15 18:17:14 CEST 2012


On Sep 14, 2012, at 8:21 PM, Jonathan Herman <hermanjl at cs.unc.edu> wrote:

> 
> On Fri, Sep 14, 2012 at 7:57 PM, Glenn Elliott <gelliott at cs.unc.edu> wrote:
>> Well, I am stuck in a strange case where a task is linked but never
>> scheduled (G-EDF):
>> 
>> 3903 P0 [check_for_preemptions at litmus/sched_gsn_edf.c:365]:
>> check_for_preemptions: attempting to link task 1795 to 0
>> 3904 P0 [gsnedf_get_nearest_available_cpu at litmus/sched_gsn_edf.c:347]: Could
>> not find an available CPU close to P0
>> 3905 P0 [__add_ready at litmus/rt_domain.c:312]: rt: adding aux_threads/1799
>> (0, 4611686018427387903, 4611686018427387903) [inh_task: (nil)/0 (0, 0 0)]
>> rel=59812157846 to ready queue at 60312058176
>> 3907 P0 [link_task_to_cpu at litmus/sched_gsn_edf.c:218]: (aux_threads/1799:1)
>> linked = aux_threads/1795
>> 3908 P0 [link_task_to_cpu at litmus/sched_gsn_edf.c:219]: (aux_threads/1799:1)
>> entry->linked = aux_threads/1799
>> 3909 P0 [link_task_to_cpu at litmus/sched_gsn_edf.c:261]: (aux_threads/1795:2)
>> linked to 0.
>> 3914 P1 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1796:1) tick
>> 3915 P2 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1797:1) tick
>> 3916 P3 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1798:1) tick
>> 3917 P0 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1799:1) tick
>> 3918 P1 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1796:1) tick
>> 3919 P2 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1797:1) tick
>> 3920 P3 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1798:1) tick
>> 3921 P0 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1799:1) tick
>> 3922 P1 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1796:1) tick
>> 3923 P2 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1797:1) tick
>> 3924 P3 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1798:1) tick
>> 3925 P0 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1799:1) tick
>> 3926 P1 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1796:1) tick
>> 3927 P2 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1797:1) tick
>> 3928 P3 [gsnedf_tick at litmus/sched_gsn_edf.c:472]: (aux_threads/1798:1) tick
>> 
>> 
>> I updated gsnedf_tick to print the currently scheduled real-time task.  Task
>> 1795 should be running on P0 instead of 1799.
>> 
>> Has anyone ever seen something like this before?
>> 
>> Thanks,
>> Glenn
>> 
>> p.s. Ignore the crazy period/relative_deadline for 1799.  Those are just
>> place holders.  1799 is really just a worker thread that has a statically
>> low priority and should only run when the system is idle (or according to an
>> inherited priority).
>> 
>> p.p.s. I think there is a bug in litmus.c::litmus_fork().  is_realtime() is
>> returning false for forked children of a real-time task (expected), so the
>> fork-copied rt_param state is not reset.  Thus, the child gets the same
>> rt_params as the parent.  This is probably harmless if we're sure to re-init
>> any children that transition to real-time.  However, for the sake of
>> completeness, I wonder if we should execute "reinit_litmus_state(p,0);" for
>> all children.  I am doing some work where tasks are forced to become real-time
>> from within the kernel (worker threads), and this possible bug bit me.
>> 
>> 
> 
> I would bet cashmoney that your task is not preemptable. Your
> check_for_preempt calls preempt()->preempt_if_preemptable after
> linking the task, which calls litmus_reschedule if the task is not
> preemptable. litmus_reschedule ALWAYS hits line 98 of your preempt.c
> and traces state.
> 
> -- 
> Jonathan Herman
> Department of Computer Science at UNC Chapel Hill


I think something else could be going wrong.  I believe the task is preemptible.  litmus_reschedule() is indeed being called.  Here's the code, with some trace statements added in.  I've put the code path that is being followed in bold.

 66 void litmus_reschedule(int cpu)
 67 {
 68     int picked_transition_ok = 0;
 69     int scheduled_transition_ok = 0;
 70 
 71     /* The (remote) CPU could be in any state. */
 72 
 73     /* The critical states are TASK_PICKED and TASK_SCHEDULED, as the CPU
 74      * is not aware of the need to reschedule at this point. */
 75 
 76     /* is a context switch in progress? */
 77     if (cpu_is_in_sched_state(cpu, TASK_PICKED)) {
 78         picked_transition_ok = sched_state_transition_on(
 79             cpu, TASK_PICKED, PICKED_WRONG_TASK);
 80 
 81         TRACE_CUR("cpu %d: picked_transition_ok = %d\n", cpu, picked_transition_ok);
 82     }
 83     else {
 84         TRACE_CUR("cpu %d: picked_transition_ok = 0 (static)\n", cpu);
 85     }
 86 
 87     if (!picked_transition_ok &&
 88         cpu_is_in_sched_state(cpu, TASK_SCHEDULED)) {
 89         /* We either raced with the end of the context switch, or the
 90          * CPU was in TASK_SCHEDULED anyway. */
 91         scheduled_transition_ok = sched_state_transition_on(
 92             cpu, TASK_SCHEDULED, SHOULD_SCHEDULE);
 93         TRACE_CUR("cpu %d: scheduled_transition_ok = %d\n", cpu, scheduled_transition_ok);
 94     }
 95     else {
 96         TRACE_CUR("cpu %d: scheduled_transition_ok = 0 (static)\n", cpu);
 97     }
 98 
 99     /* If the CPU was in state TASK_SCHEDULED, then we need to cause the
100      * scheduler to be invoked. */
101     if (scheduled_transition_ok) {
102         if (smp_processor_id() == cpu) {
103             set_tsk_need_resched(current);
104         }
105         else {
106             smp_send_reschedule(cpu);
107         }
108     }
109 
110     TRACE_STATE("%s picked-ok:%d sched-ok:%d\n",
111             __FUNCTION__,
112             picked_transition_ok,
113             scheduled_transition_ok);
114 }

Trace output:
3956 P1 [link_task_to_cpu at litmus/sched_gsn_edf.c:218]: (aux_threads/1820:1) linked = aux_threads/1816
3957 P1 [link_task_to_cpu at litmus/sched_gsn_edf.c:219]: (aux_threads/1820:1) entry->linked = aux_threads/1820
3958 P1 [link_task_to_cpu at litmus/sched_gsn_edf.c:261]: (aux_threads/1816:2) linked to 1. 
3962 P1 [preempt_if_preemptable at litmus/sched_plugin.c:36]: (aux_threads/1820:1) preempt_if_preemptable: aux_threads/1820
3963 P1 [preempt_if_preemptable at litmus/sched_plugin.c:45]: (aux_threads/1820:1) preempt local cpu.
3964 P1 [preempt_if_preemptable at litmus/sched_plugin.c:71]: (aux_threads/1820:1) calling litmus_reschedule()
3965 P1 [litmus_reschedule at litmus/preempt.c:81]: (aux_threads/1820:1) cpu 1: picked_transition_ok = 1
3966 P1 [litmus_reschedule at litmus/preempt.c:96]: (aux_threads/1820:1) cpu 1: scheduled_transition_ok = 0 (static)
3968 P1 [gsnedf_tick at litmus/sched_gsn_edf.c:474]: (aux_threads/1820:1) tick 67358153479
3972 P1 [gsnedf_tick at litmus/sched_gsn_edf.c:474]: (aux_threads/1820:1) tick 67359153359



So it looks like my linked task is not scheduled because the CPU state is TASK_PICKED and picked_transition_ok is set to 1.  That leaves scheduled_transition_ok at 0, so neither set_tsk_need_resched() nor smp_send_reschedule() is called.  I've never worked with this part of LITMUS^RT before, so I don't know what is "normal."  Is there supposed to be some sort of deferred safety net that schedules the linked task when picked_transition_ok == 1, and is it possibly not being invoked?
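
To make the question concrete, here is a minimal user-space model of the branch logic in the listing above.  It is only a sketch under stated assumptions: the model_* names and the resched_requested flag are made up for illustration, the enum values are arbitrary, and a C11 compare-and-swap stands in for sched_state_transition_on(), whose real implementation is not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Only the states named in the listing; numeric values are arbitrary. */
enum model_sched_state {
    TASK_SCHEDULED,
    SHOULD_SCHEDULE,
    TASK_PICKED,
    PICKED_WRONG_TASK,
};

/* Hypothetical per-CPU record used by this model only. */
struct model_cpu {
    _Atomic int state;
    /* Stands in for set_tsk_need_resched() / smp_send_reschedule(). */
    bool resched_requested;
};

static void model_cpu_init(struct model_cpu *c, int state)
{
    atomic_init(&c->state, state);
    c->resched_requested = false;
}

/* Assumption: the real transition helper behaves like a compare-and-swap. */
static bool model_transition(struct model_cpu *c, int from, int to)
{
    int expected = from;
    return atomic_compare_exchange_strong(&c->state, &expected, to);
}

/* Mirrors the control flow of litmus_reschedule() in the listing. */
static void model_litmus_reschedule(struct model_cpu *c)
{
    bool picked_ok = false;
    bool scheduled_ok = false;

    /* Context switch in progress: mark the pick as stale. */
    if (atomic_load(&c->state) == TASK_PICKED)
        picked_ok = model_transition(c, TASK_PICKED, PICKED_WRONG_TASK);

    /* Otherwise, if a task is already scheduled, ask for a reschedule. */
    if (!picked_ok && atomic_load(&c->state) == TASK_SCHEDULED)
        scheduled_ok = model_transition(c, TASK_SCHEDULED, SHOULD_SCHEDULE);

    /* The second branch is skipped whenever the first one succeeds. */
    assert(!(picked_ok && scheduled_ok));

    /* Only the TASK_SCHEDULED path requests a reschedule. */
    if (scheduled_ok)
        c->resched_requested = true;
}
```

Starting the model in TASK_PICKED ends in PICKED_WRONG_TASK with no reschedule requested, which matches the trace.  Starting it in TASK_SCHEDULED ends in SHOULD_SCHEDULE with a request.  So in the first case, whatever reacts to PICKED_WRONG_TASK presumably has to live on the CPU's own scheduling path rather than in litmus_reschedule() itself.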

-Glenn

