Type: Pull Request Review
Status: Waiting on Author
Affects versions: None
Fix versions: None
Configuration Pull Request
This PR updates the `workers` formula to improve readability. The formula used to read `(cpus - 1) * mul + mul`, where `cpus` is the number of CPUs and `mul` is a multiplier whose value is set in the YAML variables. Expanding this formula simplifies it to `cpus * mul`:
(cpus - 1) * mul + mul
(mul * cpus) - mul + mul
mul * cpus
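As a quick sanity check of the simplification above, here is a minimal sketch that compares the two formulas over a range of inputs. The function names and the tested ranges are made up for illustration and are not part of this PR.

```python
def workers_old(cpus: int, mul: int) -> int:
    """Formula before this PR."""
    return (cpus - 1) * mul + mul


def workers_new(cpus: int, mul: int) -> int:
    """Simplified formula introduced by this PR."""
    return cpus * mul


# Collect every (cpus, mul) pair where the two formulas disagree.
# The ranges are arbitrary illustrative bounds.
mismatches = [
    (cpus, mul)
    for cpus in range(1, 65)
    for mul in range(1, 17)
    if workers_old(cpus, mul) != workers_new(cpus, mul)
]

print(mismatches)  # [] (the formulas agree for every pair tested)
```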
This PR also adds three more gunicorn config variables:
- [threads](http://docs.gunicorn.org/en/stable/settings.html#threads): This allows us to use multiple threads for each worker, making it possible to run more workers while keeping gunicorn's memory footprint low.
- [worker_class](http://docs.gunicorn.org/en/stable/settings.html#worker-class): This selects the type of worker gunicorn runs. If `threads` is set to more than 1 with the default `sync` worker class, gunicorn uses `gthread` instead.
- [max_requests_jitter](http://docs.gunicorn.org/en/stable/settings.html#max-requests-jitter): This adds a random jitter to `max_requests`, the number of requests a worker handles before gunicorn restarts it. Restarting workers periodically helps contain any memory leak in the application, and the jitter staggers those restarts.
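For reference, a gunicorn config file using these settings might look roughly like the sketch below. Gunicorn config files are plain Python and the setting names are gunicorn's own, but the file name and every value here are hypothetical placeholders, not the defaults shipped in this PR.

```python
# gunicorn.conf.py (hypothetical file name; all values are illustrative only)
import multiprocessing

# Hypothetical multiplier; in this PR the real value comes from the YAML variables.
mul = 2
cpus = multiprocessing.cpu_count()

# Simplified workers formula from this PR.
workers = cpus * mul

# Optional: run several threads per worker using the threaded worker class.
threads = 2
worker_class = "gthread"

# Restart each worker after roughly this many requests; the jitter
# staggers the restarts so workers do not all recycle together.
max_requests = 1000
max_requests_jitter = 100
```

Leaving `threads` and `worker_class` unset keeps gunicorn's default `sync` workers.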
We have experimented with different numbers of workers, as well as different values for the maximum number of requests before gunicorn restarts a worker. Running around 14 workers has helped us make better use of all the resources in different cases. This is why we suggest adding gunicorn's `threads` and `worker_class` settings, so that other users can run more workers while keeping gunicorn's memory footprint almost the same. Using 14 `sync` workers (the default worker type, without threads) also works well, so using `threads` and `worker_class` is completely optional.
We've also added the `max_requests_jitter` setting so users can randomize how many requests each worker handles before gunicorn restarts it. Restarting workers this way has allowed us to avoid memory leaks or high memory consumption by one or more workers in different cases. We usually use `max_requests`, but recent gunicorn versions introduced `max_requests_jitter`, which adds a random number of requests between 0 and this value to each worker's `max_requests` limit. This avoids having all workers restart at the same time, which would make the application sluggish.
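To illustrate the staggering effect, here is a small simulation of per-worker restart thresholds. It assumes the behavior described in the gunicorn docs, where each worker's limit is `max_requests` plus `randint(0, max_requests_jitter)`. The helper name, the seed, and the values are hypothetical and chosen only for this sketch.

```python
import random


def restart_threshold(max_requests: int, max_requests_jitter: int,
                      rng: random.Random) -> int:
    """Requests a single worker serves before it is restarted (illustrative)."""
    # Assumption: the jitter is drawn uniformly from 0..max_requests_jitter
    # and added to max_requests, as the gunicorn docs describe.
    return max_requests + rng.randint(0, max_requests_jitter)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# One threshold per worker, using the 14 workers mentioned above.
thresholds = [restart_threshold(1000, 100, rng) for _ in range(14)]

# Every value falls between 1000 and 1100, so restarts are spread out
# rather than all triggering at exactly 1000 requests.
print(sorted(thresholds))
```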
Make sure that the following steps are done before merging:
- [ ] A DevOps team member has approved the PR if it is code shared across multiple services and you don't own all of the services.
- [x] Are you adding any new default values that need to be overridden when this change goes live? If so:
  - [ ] Update the appropriate internal repo (be sure to update for all our environments)
  - [ ] If you are updating a secure value rather than an internal one, file a DEVOPS ticket with details.
- [ ] Add an entry to the CHANGELOG.
- [ ] If you are making a complicated change, have you performed the proper testing specified on the [Ops Ansible Testing Checklist](https://openedx.atlassian.net/wiki/display/EdxOps/Ops+Ansible+Testing+Checklist)? Adding a new variable does not require the full list (although testing on a sandbox is a great idea to ensure it links with your downstream code changes).
- [ ] Think about how this change will affect Open edX operators. Have you updated the wiki page for the next Open edX release?