[LITMUS^RT] non-rt sync releases (and race condition in do_release_ts() found?)
Björn Brandenburg
bbb at mpi-sws.org
Thu Jan 10 16:46:34 CET 2013
On Jan 10, 2013, at 12:26 AM, Glenn Elliott <gelliott at cs.unc.edu> wrote:
> By the way, I noticed that the sync release code was revised in the latest version of Litmus, and I think that I may have identified a race condition that can lead to a bad pointer dereference:
>
> 1) Task 1 does do_wait_for_ts_release().
> 2) Task 2 does do_wait_for_ts_release().
>
> ** At this point, each task has a list node in the task_release_list. **
>
> 3) do_release_ts() is called by Task 3.
> 4) Task 3 wakes up Task 1.
> 5) Task 1 resumes and exits do_wait_for_ts_release().
>
> ** Task 1's list node is popped from Task 1's stack. **
>
> 6) Task 1 makes some function call, pushing data onto its stack.
> 7) Task 3 attempts to iterate to the next list node in task_release_list. *CRASH* Task 3's pointer (pos) to Task 1's list node is no longer valid.
>
> I hit a crash in KVM where the ts_release_wait pointer, wait, is dereferenced (inside the list_for_each loop) in do_release_ts(). However, I couldn't easily reproduce the crash. I think we probably need to be using list_for_each_safe() in do_release_ts().
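
For illustration, the traversal pattern suggested above can be sketched in userspace C. This is only a sketch under stated assumptions, not the actual LITMUS^RT patch: the list macros are simplified stand-ins for the kernel's <linux/list.h>, and struct waiter, wake_and_invalidate(), release_all(), and demo_release() are hypothetical names. The reuse of Task 1's stack is simulated by overwriting the node's pointers right after the "wakeup".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's intrusive list (assumption). */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Caches the successor in n before the loop body runs, so the body
 * may invalidate *pos without breaking the traversal. */
#define list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical waiter record; in the scenario above it would live on
 * the waiting task's stack. */
struct waiter {
	struct list_head list;
};

/* Simulates waking a task that immediately returns and reuses its
 * stack: the list node's pointers become garbage (here: NULL). */
static void wake_and_invalidate(struct waiter *w)
{
	w->list.next = NULL;
	w->list.prev = NULL;
}

/* Wakes every waiter and returns how many were woken. */
static int release_all(struct list_head *head)
{
	struct list_head *pos, *safe;
	int count = 0;

	list_for_each_safe(pos, safe, head) {
		struct waiter *w = list_entry(pos, struct waiter, list);

		count++;
		wake_and_invalidate(w); /* w must not be touched after this */
	}

	/* Every node is stale now, so reset the head to an empty list. */
	head->next = head;
	head->prev = head;
	return count;
}

/* Two waiters, as in steps 1) and 2) above. */
static int demo_release(void)
{
	struct list_head head = { &head, &head };
	struct waiter t1, t2;
	int n;

	list_add_tail(&t1.list, &head);
	list_add_tail(&t2.list, &head);
	n = release_all(&head);
	assert(head.next == &head && head.prev == &head);
	return n;
}
```

With a plain list_for_each(), the step from Task 1's node to the next one would read the overwritten next pointer. In this sketch the safe variant works only because the cached successor belongs to a waiter that has not been woken yet; whether that assumption holds in the kernel code depends on how wakeups and list removal are synchronized there.
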
I've hit the same bug and pushed a fix to https://github.com/LITMUS-RT/litmus-rt/commits/prop/misc-fixes.
I've also included a patch that reimplements the plugin switching code, which caused lockups on my machine.
If there are no objections, I'll merge these patches into staging.
- Björn