A simple WorkerPool (without auto-scaling) is merged in Frappe: frappe/frappe#21482
Provides RQ worker pool with dynamic scaling 🚀
WARNING: Experimental POC. Don't run this anywhere near production sites.
There are roughly two metrics that need to be optimized while deciding on worker count:
- Memory usage - More # of workers => more memory usage.
- Average wait time for jobs ~ How responsive the system is.
Both of these are at odds with each other as a responsive system would require more workers i.e. more memory usage. We need to dynamically spawn workers while still control what parameters to optimize for.
This app provides a command to start worker pool instead of single worker.
Example:
First remove all bench worker
processes from process list and supervisor conf and replace them with equivalent commands for worker pool.
bench worker-pool --min-workers=2 --max-workers=5 --scaling-period=5 --utilization-threshold=0.5
This command will:
- Spawn two workers and start working
- Every 5 seconds:
- Check how much time workers spent in last 30 seconds
- If it was more than
--utilization-threshold
i.e. 50% then increase one worker. - If it was less than half of threshold i.e. 25% then it will decrease one worker.
- If it was within 25-50% range worker pool stays as is.
Test it out by simulating fake workload from bench console:
# A function that just sleeps
from frappe.core.doctype.rq_job.test_rq_job import test_func
while True:
import time
time.sleep(0.5)
frappe.enqueue(test_func, sleep=1)
- This will enqueue 2 jobs every second that consume 1 second each
- So roughly we will end up spewing 4-5 workers at which point workload and workers are balanced according to set parameters.
- If you stop enqueuing new jobs, overtime it will drop back to 2 workers again.
- To Monitor this in realtime go to
RQ Worker
doctype and setup auto-refresh:setInterval(() => {cur_list.refresh()}, 1000)
If you visualize this is roughly how it will look:
Because WorkerPool forks workers from the master process it can utilize shared memory much better. A worker pool of 8 workers consumes only 1/3 of memory compared to 8 individual workers.
--utilization-threshold
controls responsiveness vs efficiency. Low threshold means highly responsive system but very low efficiency and vice versa.- Some weird edge cases are handled weirdly.
- Extremely long running jobs which might not increase utilization if they started before scaling window but didn't end yet.
- Utilization in time bucket is computed by remembering old utilization. This isn't accurate at all and requires rework.
- Scaling up and down only happens in unit of 1 worker.
- Scale down happens at half the threshold for scale up. If you're familiar with idea of table doubling or resizeable arrays, this is similar concept to avoid repeated scale up/downs which are of no use.
MIT