
Add benchmarking performance test and run script #75

Merged: rjwills28 merged 16 commits into main from add_performance_test_script on Jun 28, 2023

Conversation

rjwills28 (Collaborator)

I have added the Python performance script (i.e. a websocket client) that creates N subscriptions to Coniql and measures memory usage, CPU and the number of missed updates. I have also added a bash script that takes care of creating the EPICS db, starting the IOC, starting Coniql and then running M instances of the Python websocket client (to simulate multiple clients connecting to a single Coniql instance).

Results from the Python client script get appended to a file so that you can compare different runs.

The number of PVs to subscribe to, number of samples to collect and number of websocket clients are all configurable.
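
For context, a stripped-down version of what each websocket client does might look like the sketch below. It assumes Coniql is serving GraphQL subscriptions over the older graphql-ws subprotocol at ws://localhost:8080/ws and uses a subscribeChannel query; the endpoint, subprotocol and query shape are illustrative assumptions rather than details taken from the script in this PR.

```python
# Minimal sketch of a single subscription client (illustrative assumptions:
# endpoint ws://localhost:8080/ws, "graphql-ws" subprotocol, subscribeChannel query).
import asyncio
import json

import websockets


async def subscribe(pv: str, n_samples: int) -> int:
    """Subscribe to one PV via Coniql and count updates until n_samples arrive."""
    received = 0
    async with websockets.connect(
        "ws://localhost:8080/ws", subprotocols=["graphql-ws"]
    ) as ws:
        # graphql-ws handshake, then start one subscription for the given PV.
        await ws.send(json.dumps({"type": "connection_init", "payload": {}}))
        query = (
            'subscription { subscribeChannel(id: "ca://%s") { value { float } } }' % pv
        )
        await ws.send(
            json.dumps({"id": "1", "type": "start", "payload": {"query": query}})
        )
        # Count "data" messages; anything else (ack, keep-alive) is ignored.
        while received < n_samples:
            msg = json.loads(await ws.recv())
            if msg.get("type") == "data":
                received += 1
    return received


if __name__ == "__main__":
    asyncio.run(subscribe("TEST:REC0", 1000))
```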

Do we want this version of the performance tests merged in as part of Coniql? They might be useful for individual developers to run locally when making changes. Otherwise the Python client can be used for the automated performance tests and we just use the bash script as a template.

@AlexanderWells-diamond (Contributor) left a comment

Looks good and will definitely be helpful for locally checking for significant performance differences when changing Coniql. We should also be able to re-use large parts of this for a container-based performance test setup.

How long do these tests take for you to run? From my understanding the default is to create a bunch of records that add 1 every 0.1 seconds, and then the client waits for 1000 updates, which should take about 100 seconds. I'm seeing test runs take 5+ minutes, so I'm confused about where the time is being used.

There seems to be a crash when you try to increase the number of PVs in use above 1000.
I'm not sure what the root of the issue is, but the last thing in the console is:

Exception in thread Thread-175:
Traceback (most recent call last):
-> Starting subscription: TEST:REC1806
  File "/usr/lib64/python3.8/threading.py", line 932, in _bootstrap_inner
  File "/usr/lib64/python3.8/threading.py", line 870, in run
  File "benchmark/coniql_performance_test.py", line 219, in coniql_subscription
  File "/usr/lib64/python3.8/asyncio/runners.py", line 44, in run
  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
  File "benchmark/coniql_performance_test.py", line 69, in subscribe
  File "/home/eyh46967/dev/coniql/venv_test1/lib64/python3.8/site-packages/websockets/imports.py", line 77, in __getattr__
  File "/home/eyh46967/dev/coniql/venv_test1/lib64/python3.8/site-packages/websockets/imports.py", line 28, in import_name
AttributeError: module 'websockets.legacy.client' has no attribute 'connect'
libgcc_s.so.1 must be installed for pthread_cancel to work
bash: line 1: 561763 Aborted                 (core dumped) python benchmark/coniql_performance_test.py -n 2000 -s 1000 -p 2 -f benchmark/performance_test_results_NClients_1.txt

I've got a few feedback points. This is an unsorted list and some of these points may not be worth acting upon.

  • Who is the intended reader for the results file? It's fine for human readability; I'm just wondering if we want a slightly more computer-friendly format, e.g. CSV or JSON (see the sketch after this list).
  • Can there be a "total runtime" added to the end of the script?
  • Is there a way to estimate the remaining time? I notice that increasing the number of clients seems to exponentially increase the runtime, which makes it hard to know if it's still running correctly or whether something might have broken.
  • I note that the print statements in the Python file don't seem to appear anywhere - it looks like that printing is never passed back to the stdout of the original shell. This is another thing that would make debugging issues somewhat tricky, especially as I'm assuming stderr is also not being passed along.
    • A bit more investigation revealed that this is actually an issue with VSCode's built-in terminal - it seems that it doesn't actually display the new tabs; they're created in the background somewhere. They still run, we just don't see the terminals and so we don't see the output. Do you know if there's a way to detect this from inside the shell?
  • It's tricky to spot errors in the spawned tabs, as they close at the end of the run. This is easy to reproduce if you try running the script outside of the Coniql venv - there will be errors in the terminals, but if you wait too long they close and the information is lost.
  • Can you make it so that, even if the Python script exits unexpectedly, it will still try to write out whatever results it has collected? It's an irritation that we can just lose all data in some cases (see the sketch after this list).
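
One possible shape for the last point (and for the CSV/JSON question in the first bullet) is sketched below: collect results into a dict and always attempt to write them out, here as JSON, even if the run is interrupted. Function and file names are illustrative, not taken from the PR.

```python
# Sketch: always write out whatever results exist, even on an unexpected exit.
# collect_samples() is a stand-in for the real subscription/measurement loop.
import json
import random
import time


def collect_samples(results: dict, n_samples: int = 10) -> None:
    # Placeholder for the real work; simulates collecting a few samples.
    for _ in range(n_samples):
        time.sleep(0.01)
        results["samples_collected"] += 1
        results["cpu_percent"].append(random.uniform(0, 100))


def run_benchmark(results_file: str = "results.json") -> None:
    results = {"samples_collected": 0, "missed_updates": 0, "cpu_percent": []}
    try:
        collect_samples(results)
    finally:
        # Runs even if collect_samples raises, so partial data is not lost.
        with open(results_file, "w") as f:
            json.dump(results, f, indent=2)


if __name__ == "__main__":
    run_benchmark()
```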

Inline review comments (since resolved) on benchmark/run_performance_test.sh and benchmark/coniql_performance_test.py
@AlexanderWells-diamond (Contributor)

Another point to consider: we still have the old Python and JavaScript benchmarking scripts (along with some docs on how to run them). Do we want to get rid of them? Should we add this script to the docs?

@rjwills28 (Collaborator, Author)

Thanks for the detailed review. I'll look at addressing the comments and adding the extra features to the scripts.
I must admit I had not tried running with 1000 PVs. I saw my CPU hit 100% when running with 200 PVs, so that was as far as I went. I assumed that the performance tests will be run against a single instance of Coniql, which means it will have limits, so we might have to think about what a reasonable number is to stress the system and still get meaningful results.

Quick comment on the runtime of the performance test: it runs until it has captured N samples (e.g. 1000 in your example). When you increase the number of PVs, you increase the CPU usage. Once you hit 100% CPU, Coniql suffers and starts to miss PV updates. This means it takes longer to collect 1000 samples because more updates have been missed, if that makes sense.
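
As a rough back-of-envelope model of that effect (my numbers, not measurements from this PR): if records update every 0.1 s and a fraction of updates is missed, the time to collect a fixed number of samples scales as follows.

```python
# Illustrative estimate only: runtime grows as the fraction of missed updates rises.
def estimated_runtime_s(n_samples: int, update_period_s: float, miss_fraction: float) -> float:
    return n_samples * update_period_s / (1.0 - miss_fraction)


print(estimated_runtime_s(1000, 0.1, 0.0))   # 100 s with no missed updates
print(estimated_runtime_s(1000, 0.1, 0.67))  # ~303 s when two thirds of updates are missed
```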

In terms of the 'old' benchmarking scripts, I haven't looked at them myself, but if we don't think they will be used once we have these tests running then I think we should remove them. Instructions on how to run these new scripts should then be added to the documentation.

codecov bot commented Jun 22, 2023

Codecov Report

Merging #75 (df500bd) into main (8f9c7de) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main      #75   +/-   ##
=======================================
  Coverage   93.30%   93.30%           
=======================================
  Files          10       10           
  Lines         807      807           
=======================================
  Hits          753      753           
  Misses         54       54           


@rjwills28 (Collaborator, Author)

Addressing some of the general comments:

> Who is the intended reader for the results file? It's fine for human readability, I'm just wondering if we want a slightly more computer-friendly format e.g. csv, json.

I'm not sure about this one. I assume that we would want to look at the results and compare different runs ourselves. My intention was to write the results to a file so that they can be extracted and presented in some way, but I'm not sure how you would plan to do that. Perhaps in that case it would be better if they were just raw numbers that were then interpreted by some other tool?

> Can there be a "total runtime" added to the end of the script?

I have added the total runtime to the end of the script.

> Is there a way to estimate the remaining time? I notice that increasing the number of clients seems to exponentially increase the runtime which makes it hard to know if it's still running correctly or whether something might have broken

I have added some progress monitoring to the Python script and also estimate the remaining time based on how long it has already taken to collect x samples.
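
For illustration, a simple estimator along these lines (not necessarily what the script does) can extrapolate from the collection rate observed so far:

```python
# Illustrative remaining-time estimate based on the sample collection rate so far.
import time


class ProgressEstimator:
    def __init__(self, total_samples: int) -> None:
        self.total = total_samples
        self.start = time.monotonic()

    def remaining_s(self, collected: int) -> float:
        if collected == 0:
            return float("inf")  # no data yet, cannot estimate
        elapsed = time.monotonic() - self.start
        rate = collected / elapsed  # samples per second so far
        return (self.total - collected) / rate
```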

> I note that the print statements in the Python file don't seem to appear anywhere - it looks like that printing is never passed back to the stdout of the original shell? This is another thing that would make debugging issues somewhat tricky, especially as I'm assuming stderr is also not being passed along.

I have configured it so that the stdout & stderr of EPICS, Coniql and the performance test are written to log files so that we can always grab the output from the latest run. These will be saved under /benchmark/logs.

> It's tricky to spot errors in the spawned tabs, as they close at the end of the run. This is easy to recreate if you try running the script outside of the Coniql venv - there will be errors in the terminals but if you wait too long they close and lose the information

See above comment.

> Can you make it so that, even if the Python script exits unexpectedly, it will still try to write out whatever results it has collected? It's an irritation that we can just lose all data in some cases

Do you mean even if it doesn't complete? I have added a try-except block around the subscriptions, so even if they fail we should still reach the results write-out code.

@AlexanderWells-diamond (Contributor) left a comment

Looking better and better. Very nearly there. A couple of minor things, and an issue with the progress.txt file, to resolve.

As an aside, I'm wondering if it wouldn't be better/easier at this point to have done everything inside a Python script. Using the subprocess module would, I think, significantly ease some of the headaches of reading output from files, although a rewrite at this stage probably isn't worth it. I'll see what I want/need to do to get this running well in Kubernetes before suggesting anything more significant.
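
For what it's worth, a sketch of that idea, reusing the benchmark/coniql_performance_test.py command line shown in the console output above; the number of clients, flag values and log paths here are illustrative assumptions:

```python
# Sketch: spawn the benchmark clients with subprocess and capture their output
# in per-client log files instead of separate terminal tabs.
import subprocess
from pathlib import Path

LOG_DIR = Path("benchmark/logs")
LOG_DIR.mkdir(parents=True, exist_ok=True)

procs = []
for i in range(2):  # number of simulated websocket clients (illustrative)
    log = open(LOG_DIR / f"client_{i}.log", "w")
    procs.append(
        (
            subprocess.Popen(
                [
                    "python", "benchmark/coniql_performance_test.py",
                    "-n", "100", "-s", "1000",
                    "-f", f"benchmark/performance_test_results_NClients_{i}.txt",
                ],
                stdout=log,
                stderr=subprocess.STDOUT,  # stderr goes to the same log file
            ),
            log,
        )
    )

for proc, log in procs:
    proc.wait()
    log.close()
```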

Inline review comments (since resolved) on benchmark/coniql_performance_test.py and benchmark/run_performance_test.sh
@rjwills28 (Collaborator, Author)

> As an aside, I'm wondering if it wouldn't be better/easier at this point to have done everything inside a Python script.

At some point the idea of using Jenkins to run these tests was thrown around and so I went down the bash route. Just let me know if you think we do need to refactor.

@AlexanderWells-diamond (Contributor) left a comment

Looking good!

rjwills28 merged commit 1b38af5 into main on Jun 28, 2023
18 checks passed
@rjwills28 (Collaborator, Author)

Thanks @AlexanderWells-diamond for all the help and suggestions on this PR.

rjwills28 deleted the add_performance_test_script branch on June 28, 2023 15:23