DevOps occasionally gets alerts about untagged EC2 instances in the testeng AWS account. We try to make sure that all of our instances are tagged, but some still manage to spin up without a tag. I think these are artifacts of the Packer build process (in our build-packer-ami job on Jenkins). Packer launches a temporary instance from a vanilla Ubuntu AMI, provisions it with Ansible, and then saves the result as an AMI for later use. If the provisioning fails, Packer cleans up the temporary instance. However, if the job is aborted or the Packer process itself errors out, the cleanup never occurs, and the instance is never tagged, since Packer only tags successfully built AMIs. These workers can stay running indefinitely.
The Janitor job should clean up untagged workers in the testeng account. Perhaps set some sort of rule, like:
IMPORTANT: make sure that this rule is ONLY in effect for the testeng account. Do not kill anything in other accounts.
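As a rough illustration of what such a rule could look like, here is a minimal Python sketch. It assumes instance records shaped like the entries boto3 returns from describe_instances (a "Tags" list that is absent on untagged instances, a "State" dict, and a timezone-aware "LaunchTime"). The account ID, the two-hour threshold, and the function name are hypothetical placeholders, not values from this ticket or from the existing Janitor job.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical placeholders: substitute the real testeng account ID and
# whatever grace period the team agrees on.
TESTENG_ACCOUNT_ID = "000000000000"
MAX_UNTAGGED_AGE = timedelta(hours=2)


def should_terminate(instance, account_id, now):
    """Return True only for old, running, untagged instances in testeng."""
    # Safety check first: never act on any account other than testeng.
    if account_id != TESTENG_ACCOUNT_ID:
        return False
    # Any tag at all means the instance is out of scope for this rule.
    if instance.get("Tags"):
        return False
    # Skip instances that are already stopping, stopped, or terminated.
    if instance.get("State", {}).get("Name") != "running":
        return False
    # Give in-progress Packer builds time to finish before cleaning up.
    return now - instance["LaunchTime"] >= MAX_UNTAGGED_AGE
```

The account check comes first so that a misconfigured job returns False for everything outside testeng before any other condition is evaluated. The grace period is there because a Packer build that is still running also looks like an untagged instance.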
To determine the source of these untagged instances, I did the following:
To find all untagged instances, run the following command:
Checked the AWS console for the launch time (shown in EST)
Checked https://build.testeng.edx.org/job/build-packer-ami/ for jobs that failed or were aborted around this time (but remember, Jenkins shows time in UTC). The beginning of the build log should have the instance-id for the temporary worker.
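One possible way to script the investigation above is sketched below in Python with boto3. This is an assumption about tooling, not necessarily the command used for this ticket. It lists instances that have no tags and reports each launch time in UTC, so it can be compared directly against Jenkins build times without converting from EST. The function names and the default region are hypothetical, and credentials for the testeng account are assumed to be configured already.

```python
from datetime import datetime, timezone


def untagged_instances(reservations):
    """Yield (instance_id, launch_time_utc) for every instance with no tags."""
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            # describe_instances omits the "Tags" key for untagged instances.
            if not instance.get("Tags"):
                launch = instance["LaunchTime"].astimezone(timezone.utc)
                yield instance["InstanceId"], launch


def fetch_reservations(region_name="us-east-1"):
    """Yield reservations from EC2. The default region is an assumption."""
    import boto3  # third-party; imported here so the filter above has no dependency

    ec2 = boto3.client("ec2", region_name=region_name)
    # Paginate so accounts with many instances are covered completely.
    for page in ec2.get_paginator("describe_instances").paginate():
        yield from page["Reservations"]
```

Looping over untagged_instances(fetch_reservations()) and printing each pair gives the instance IDs to search for at the beginning of the build-packer-ami build logs.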