Describe the bug
The new implementation of prvSelectHighestPriorityTask for SMP uses vListInsertEnd to insert the current TCB at the end of the ready task list. However, vListInsertEnd doesn't actually insert an element at the end of a list. It only adds the element such that it is the last one returned by repeated calls to listGET_OWNER_OF_NEXT_ENTRY before the walk starts repeating. Effectively, vListInsertEnd inserts the node right before the list's current pxIndex node.
In testing with a personal project and stepping through with a debugger, the pxIndex of the ready list at first seems to be the tail element of the list (the one before the xListEnd element). Over time, however, as tasks are removed from and added to the ready list, the pxIndex element appears to migrate to the top of the list. Once it reaches the top, vListInsertEnd actually ends up inserting the current task's TCB node at the front of the ready list!
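To make the mechanism above concrete, here is a minimal sketch of the list behaviour. It is not the FreeRTOS source: MiniNode, MiniList and the mini_* functions are hypothetical stand-ins for ListItem_t, List_t, vListInsertEnd and listGET_OWNER_OF_NEXT_ENTRY, reduced to the pointer handling that matters here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for FreeRTOS ListItem_t / List_t. */
typedef struct MiniNode {
    struct MiniNode *next;
    struct MiniNode *prev;
    char owner;               /* stands in for pvOwner (the task's TCB) */
} MiniNode;

typedef struct {
    MiniNode end;             /* sentinel node, plays the role of xListEnd */
    MiniNode *index;          /* walk cursor, plays the role of pxIndex */
} MiniList;

/* An empty list: the sentinel points at itself and the cursor sits on it. */
void mini_init(MiniList *l) {
    l->end.next = &l->end;
    l->end.prev = &l->end;
    l->end.owner = '\0';
    l->index = &l->end;
}

/* Same linking idea as vListInsertEnd: the new node is placed
 * immediately BEFORE the cursor, not before the sentinel. */
void mini_insert_end(MiniList *l, MiniNode *n) {
    assert(l != NULL && n != NULL);
    MiniNode *idx = l->index;
    n->next = idx;
    n->prev = idx->prev;
    idx->prev->next = n;
    idx->prev = n;
}

/* Same idea as listGET_OWNER_OF_NEXT_ENTRY: advance the cursor,
 * skipping over the sentinel, and return the owner found there. */
char mini_next_owner(MiniList *l) {
    l->index = l->index->next;
    if (l->index == &l->end) {
        l->index = l->end.next;
    }
    return l->index->owner;
}

/* What a search that starts from the head of the list sees first. */
char mini_head_owner(const MiniList *l) {
    return l->end.next->owner;
}
```

In this model, inserting A, B and C into a fresh list gives the head-to-tail order A, B, C. After one call to mini_next_owner the cursor sits on A, the head. A subsequent mini_insert_end of D links D in front of A, so mini_head_owner returns D while C is still the node before the sentinel.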
The fix would be to use listGET_OWNER_OF_NEXT_ENTRY to iterate through the list, instead of starting from the head element.
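As a rough illustration of that idea (not a patch against the kernel), the sketch below walks a simplified list starting from the cursor instead of from the head. RrNode, RrList, the runnable flag and the rr_* functions are all hypothetical names made up for this example; the real scheduler has more conditions to check (core affinity, run state, and so on).

```c
#include <stddef.h>

/* Hypothetical, simplified ready list: a sentinel plus a walk cursor. */
typedef struct RrNode {
    struct RrNode *next;
    struct RrNode *prev;
    char owner;               /* stands in for the task's TCB */
    int runnable;             /* stands in for "this core may run the task" */
} RrNode;

typedef struct {
    RrNode end;               /* sentinel, like xListEnd */
    RrNode *index;            /* cursor, like pxIndex */
    size_t count;             /* number of real nodes in the list */
} RrList;

void rr_init(RrList *l) {
    l->end.next = &l->end;
    l->end.prev = &l->end;
    l->end.owner = '\0';
    l->end.runnable = 0;
    l->index = &l->end;
    l->count = 0;
}

/* Insert right before the cursor, as vListInsertEnd does. */
void rr_insert_end(RrList *l, RrNode *n) {
    RrNode *idx = l->index;
    n->next = idx;
    n->prev = idx->prev;
    idx->prev->next = n;
    idx->prev = n;
    l->count++;
}

/* Visit each node at most once, starting AFTER the cursor and wrapping
 * past the sentinel. Returns the first runnable node and leaves the
 * cursor on it, or returns NULL if nothing is runnable. */
RrNode *rr_pick_from_cursor(RrList *l) {
    RrNode *n = l->index;
    for (size_t i = 0; i < l->count; i++) {
        n = n->next;
        if (n == &l->end) {
            n = n->next;      /* skip the sentinel */
        }
        if (n->runnable) {
            l->index = n;
            return n;
        }
    }
    return NULL;
}
```

Because the walk is relative to the cursor, a node inserted just before the cursor is reached only after every other node has been offered a turn, regardless of where the cursor sits in the physical list. With A, B, C in the list, the cursor on A, and D inserted before A, successive picks in this model yield B, C, D and then A.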
Target
Development board: Raspberry Pi Pico W (rp2040)
Instruction Set Architecture: ARM Cortex-M0+
IDE and version: pico-sdk, ninja, vim, cmake, crossdev generated arm-none-eabi-gcc
Toolchain and version: arm-none-eabi-gcc (Gentoo 13.2.1_p20240113-r1 p12) 13.2.1 20240113
Host
Host OS: Gentoo Linux
Version: Unstable (rolling release), kernel version 6.7.0
To Reproduce
I don't have a generic reproducer, since it strongly depends on scheduler and task interaction. Even reproducing it on my device is almost like trying to reproduce a race condition, and any slowdown from gdb conditionals renders the issue impossible to reproduce.
My project is set up to mock an HID USB device using TinyUSB. I have a task dedicated to USB handling, a CLI task, a task mocking controller input, and a watchdog task. By dumping the list of active tasks, it looks like the pico-sdk also has a few other tasks running in the background:
I configured all 4 of my tasks to have a core affinity so they only use core 2.
I triggered the issue by repeatedly asking the CLI task to print my debug status info, which calls uxTaskGetSystemState to get the system state. It can take anywhere from a few seconds to almost a minute of spamming requests (as a human, typing s and enter to trigger the CLI output) before the bug appears.
What I observe is that, once the bug is triggered, the scheduler consistently schedules the current task, starving all others. That makes sense if pxIndex is the head node of the ready list: the current task's node gets added before the pxIndex node, so it becomes the new head node.
Expected behavior
No resource starvation on the core the bug triggers in.
Screenshots
N/A
Additional context
See this FreeRTOS forum post for a discussion and all of my findings about the issue.
I can open a pull request with an attempted fix, but I have no idea how to go about preparing unit tests and coverage, or how to do proper regression testing with FreeRTOS.