The Jenkins `terminate-instances` job for olive minos was failing whenever the SQS queue was empty (e.g. [job#449675](https://admin.edx-flatu.org:8080/job/terminate-instances-that-have-been-verified-for-retirement-prod-olivex/449675/console)), but succeeding whenever the queue contained messages (e.g. [job#449676](https://admin.edx-flatu.org:8080/job/terminate-instances-that-have-been-verified-for-retirement-prod-olivex/449676/console)).
From the errors reported in the failed job, we determined that, when the queue is empty, the [boto http_socket_timeout of 3 sec](https://github.com/edx/configuration/blob/cf4d221d384ed396a3008c31487f385544ef08e7/playbooks/roles/aws/templates/boto.cfg.j2#L2) was being hit before the [configured queue wait timeout of 10 sec](https://github.com/edx-olive/configuration/blob/a16cd2c9c6fd20fc604259ca4c89e1dda0fa0f9d/util/vpc-tools/asg_lifcycle_watcher.py#L38) elapsed, so the request timed out on the client side before the queue read could return.
Since the `boto.cfg` file is part of the `aws` role and is therefore shared by many services, the best solution here was to decrease the queue wait to match the upstream [1 second timeout](https://github.com/edx/configuration/blob/cf4d221d384ed396a3008c31487f385544ef08e7/util/vpc-tools/asg_lifcycle_watcher.py#L35) rather than increase the boto timeout.
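The constraint behind this fix can be sketched in a few lines of Python. This is only an illustration, assuming the 3 sec and 10 sec values quoted above; `long_poll_fits` and `HTTP_SOCKET_TIMEOUT_SEC` are hypothetical names and do not appear in `asg_lifcycle_watcher.py`.

```python
# Hypothetical helper for illustration only; not part of asg_lifcycle_watcher.py.

# http_socket_timeout from boto.cfg, as quoted in this description.
HTTP_SOCKET_TIMEOUT_SEC = 3


def long_poll_fits(wait_time_sec, socket_timeout_sec=HTTP_SOCKET_TIMEOUT_SEC):
    """Return True if an empty-queue read can finish before the socket times out.

    When the queue is empty, a long poll stays open until the wait time
    expires, so the wait must be shorter than the client's socket timeout.
    """
    return wait_time_sec < socket_timeout_sec


print(long_poll_fits(10))  # previous queue wait of 10 sec
print(long_poll_fits(1))   # upstream queue wait of 1 sec
```

With the values above, only the 1 second wait stays under the shared socket timeout, which is why the queue wait was changed and `boto.cfg` was left alone.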
To verify this fix, I:

1. Configured the Jenkins job to run with the branch from this PR.
2. Watched jobs succeed when the queue is empty, see [job#449691](https://admin.edx-flatu.org:8080/job/terminate-instances-that-have-been-verified-for-retirement-prod-olivex/449691/console).
3. Watched jobs succeed when the queue contains messages about instances needing retirement, see [job#449715](https://admin.edx-flatu.org:8080/job/terminate-instances-that-have-been-verified-for-retirement-prod-olivex/449715/console).

*Author Notes & Concerns*
Upstream reduced the queue wait timeout as part of upgrading minos to boto3; however, issues with using this upgrade on olive caused that change to be reverted.
We'll need to investigate these issues more thoroughly to maintain this configuration repo moving forward.
We are also working on changes to remove the need for OpenCraft to merge configuration changes to the `edx:configuration/olive` fork/branch, so we don't have to pester you about this stuff in future.
[ ] @itsjeyd
[ ] @coryleeio
Configuration Pull Request
Make sure that the following steps are done before merging:
[ ] A DevOps team member has approved the PR if it is code shared across multiple services and you don't own all of the services.
[ ] Are you adding any new default values that need to be overridden when this change goes live? If so:
[ ] Update the appropriate internal repo (be sure to update for all our environments)
[ ] If you are updating a secure value rather than an internal one, file a DEVOPS ticket with details.
[ ] Add an entry to the CHANGELOG.
[ ] If you are making a complicated change, have you performed the proper testing specified on the [Ops Ansible Testing Checklist](https://openedx.atlassian.net/wiki/display/EdxOps/Ops+Ansible+Testing+Checklist)? Adding a new variable does not require the full list (although testing on a sandbox is a great idea to ensure it links with your downstream code changes).