Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix data race when rendering an LCD display #507

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

juhdanad
Copy link

@juhdanad juhdanad commented Oct 6, 2022

Summary

The example code for the LCD display uses two threads: one for simulating the AVR chip, and one for OpenGL. The state of the LCD display is stored as some flags in a uint16_t variable. When the OpenGL thread modifies a flag, it may accidentally undo a simultaneous write operation in the AVR thread, causing malfunction of the display.

In this PR I add mutex support to the LCD chip simulation. The pthread library is only included in the OpenGL-related parts of the code, so it is still possible to compile the LCD simulation itself without the threading library.

The introduction of this mutex does not noticeably affect performance.

The problem

The LCD display is stored in a struct hd44780_t. This struct has a 2 byte integer member which contains various bit flags. One flag is for the internal status: whether the chip is reading the first or second nibble of the current byte. The other flag id for rendering: whether the content of the LCD screen needs to be redrawn. Since the access to this variable is not guarded, the OpenGL thread, while clearing the redraw flag, can also overwrite the other flag if it is changed by the simulation thread at the same time. After this, the LCD chip will interpret the incoming nibbles in the wrong order.

Reproduction: The first commit of this PR adds a new LCD fps testing program to the examples. Check out this commit, build the project, and execute

cd /examples/board_hd44780/obj-x86_64-linux-gnu
./charlcd.elf ../atmega48_fps_test.axf > /dev/null

A counter starts counting on the LCD screen:
lcd-error-fixed
After some time (less than a minute on my laptop), garbage data starts to appear:
lcd-error

Solution

I added a new global mutex to the LCD OpenGL utilities. The both the drawing thread and the AVR thread uses this. The hd44780_t struct got two new callback members, one for locking and one for unlocking. This callback mechanism allows to use the struct without mutexes, if not needed. Finally, I split the flags of the LCD into two groups: there is a group of internal flags, which are not protected by the mutex (these flags are not needed for drawing), and there is a protected group. This splitting results in less mutex locking.

I tested the performance of this solution with the fps testing program. Both the old and new code displayed about 260000 frames in 1:54 (for this test it is important to redirect the output to /dev/null, otherwise the simulation will slow down).

Other possible ways to solve the problem

  • Only read data in the OpenGL thread: this works, but adding mutex support to the LCD display makes other applications besides rendering possible. Also, reading memory that is currently written to is still a data race.
  • Lock the whole AVR struct, execute some instructions, then unlock: this would also work, but it would introduce locking when not necessary. However, if there is a need for locking for some other reason (for example, forwarding key presses to the AVR thread), it may be a better option to use a single, global lock for everything. The current LCD simulator sometimes stops with a segmentation fault because of this, so a global lock will be needed in the future anyways.
  • Use a callback in the OpenGL thread too: this would mean that the OpenGL code can be compiled without the pthread library. In my opinion this would make the code more difficult to understand. I do not think that requiring pthread is a big problem if OpenGL is used.
  • Store a mutex directly in the hd44780_t struct: this would mean that either pthread is required for compiling this component, or that it is guarded by a preprocessor macro.

@juhdanad
Copy link
Author

juhdanad commented Oct 6, 2022

Sorry for force-pushing, I had to fix a minor mistake in the first commit, and I had to modify the first commit itself as it is required for testing the PR. I hope I was fast enough to not cause any trouble.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant