Repo Health Job User Guide
What is Repo Health Job?
The repo health job is a script that runs the repo health checks against every repository in the given organizations. After running the checks on a repository, it generates a summary YAML file for that repository.
The job can also combine all the YAML files into a single CSV file to generate a repo health dashboard and push the data to a specified Google spreadsheet, which makes it easier to monitor and review changes at any time.
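The combining step can be pictured with a minimal Python sketch. It is not the job's actual implementation: it assumes each per-repository summary has already been parsed from YAML into a nested dict, and the function names, the `repo_name` column, and the dotted column naming are all hypothetical.

```python
import csv
import io


def flatten(summary, prefix=""):
    """Flatten nested check results into dotted column names."""
    flat = {}
    for key, value in summary.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{column}."))
        else:
            flat[column] = value
    return flat


def summaries_to_csv(summaries):
    """Build one CSV document with a row per repository.

    `summaries` maps a repository name to its already-parsed summary dict.
    """
    rows = {repo: flatten(data) for repo, data in summaries.items()}
    # Use the union of all columns so a repository missing a check still fits.
    columns = sorted({col for row in rows.values() for col in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["repo_name", *columns],  # "repo_name" is a hypothetical column
        restval="",                          # blank cell when a check is absent
        lineterminator="\n",
    )
    writer.writeheader()
    for repo in sorted(rows):
        writer.writerow({"repo_name": repo, **rows[repo]})
    return buffer.getvalue()
```

Taking the union of columns keeps the sheet rectangular even when a check did not run for some repositories, which is what a dashboard spreadsheet generally needs.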
What is it used for?
Right now, 2U has set up a scheduled workflow, built from the provided template workflow, that is triggered daily and updates the repo health dashboard with the latest data about all the edx and openedx repositories.
The repo health dashboard is currently used by the tech-arch-bom team to plan and carry out repository upgrades across organizations.
Why did we move from Jenkins to GitHub Actions?
Previously, the repo health job ran as a scheduled Jenkins job. To make it easier for Axim and other community organizations to run the job against their own organizations and repositories, it has been moved to GitHub Actions workflows.
How does the job currently work?
Right now, the components of the repo health job workflow work as follows:
The reusable repo health job workflow lives in the openedx/.github repo, and any organization can reference it from any repo using the provided template.
The reusable workflow triggers a bash script, located in the openedx/edx-repo-health repository, that runs the repo health checks.
The script executes all the repo_health_checks present in that repo against every repository of the given organizations.
How can you set up the repo health job for your organization?
To set up the repo health job to run against your organization, follow the steps below:
Set up the scheduled workflow
Create a workflow file in your desired repository by copying the template workflow file.
This workflow will be triggered on your chosen schedule to parse the repositories and update the data.
Set up the repo health secrets
To run the workflow successfully, you'll need to add the following secrets to the GitHub repository that hosts the workflow:
READTHEDOCS_API_KEY: API key for readthedocs access. Needed to run the docs check against repositories.
REPO_HEALTH_GOOGLE_CREDS_FILE: Link to the credentials file for the Google spreadsheet, if you need the job to push the CSV to a Google spreadsheet.
REPO_HEALTH_BOT_TOKEN: GitHub token with read access to all repositories and write access to the target repo where you want the YAML files and CSV report to be stored.
REPO_HEALTH_BOT_EMAIL: A unique email associated with your repo health bot. This email will be used to commit the generated YAML files and CSV report to the target repository.
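A quick pre-flight check can catch an unset secret before a long run starts. The Python sketch below is only an illustration: it assumes the workflow exposes each secret to the process as an environment variable with the same name, which this guide does not state, and the helper name `missing_secrets` is hypothetical.

```python
import os

# Secret names taken from the list above.
SECRET_NAMES = [
    "READTHEDOCS_API_KEY",
    "REPO_HEALTH_GOOGLE_CREDS_FILE",
    "REPO_HEALTH_BOT_TOKEN",
    "REPO_HEALTH_BOT_EMAIL",
]


def missing_secrets(environ=None):
    """Return the secret names that are unset or blank.

    Assumes secrets are exposed as same-named environment variables.
    """
    environ = os.environ if environ is None else environ
    return [name for name in SECRET_NAMES if not environ.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing secrets: " + ", ".join(missing))
```

Whether a missing value should stop the run depends on your setup. For example, the Google credentials only matter if you want the CSV pushed to a spreadsheet.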
Provide the needed arguments to the scheduled workflow
The scheduled workflow needs the following input parameters to run all the checks and generate the desired reports:
ORG_NAMES: Space-separated list of organization names whose repositories should be parsed, e.g. 'openedx edx . . .'
EDX_REPO_HEALTH_BRANCH: Branch of the openedx/edx-repo-health repo to check out. This can be used to run custom checks against repositories if needed.
ONLY_CHECK_THIS_REPOSITORY: If you only want to run repo health on one repository, set this to the org/name of the desired repository.
REPORT_DATE: The date for which repo health data is required (format: YYYY-MM-DD). Pass this argument if you want to parse the repositories' data for a specific date.
REPO_HEALTH_OWNERSHIP_SPREADSHEET_URL: URL of the Google spreadsheet to populate with the data.
REPO_HEALTH_REPOS_WORKSHEET_ID: ID for the Google spreadsheet to populate with the data.
TARGET_REPO_TO_STORE_REPORTS: Target repo to store the CSV reports and results, e.g. org/repo-name
REPOS_TO_IGNORE: Space-separated list of repositories to be ignored, e.g. 'repo1 repo2 . . .'
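To make the expected input formats concrete, here is a small Python sketch of how such values could be normalized and applied. It is an assumption-laden illustration, not the job's real parsing code: the function names are hypothetical, and it assumes that REPOS_TO_IGNORE holds bare repository names and that ONLY_CHECK_THIS_REPOSITORY overrides the other filters.

```python
from datetime import datetime


def parse_inputs(org_names, repos_to_ignore="", report_date="", only_check=""):
    """Normalize the workflow inputs into Python values (illustrative only)."""
    orgs = org_names.split()
    if not orgs:
        raise ValueError("ORG_NAMES must list at least one organization")

    # strptime raises ValueError if the value is not a year-month-day date.
    date = datetime.strptime(report_date, "%Y-%m-%d").date() if report_date else None

    if only_check:
        org, _, name = only_check.partition("/")
        if not org or not name or "/" in name:
            raise ValueError("ONLY_CHECK_THIS_REPOSITORY must look like org/name")

    return {
        "orgs": orgs,
        "ignored": set(repos_to_ignore.split()),
        "report_date": date,
        "only_check": only_check,
    }


def should_check(org, repo, inputs):
    """Decide whether a repository is in scope (assumed semantics)."""
    if inputs["only_check"]:
        return f"{org}/{repo}" == inputs["only_check"]
    return org in inputs["orgs"] and repo not in inputs["ignored"]
```

For example, `parse_inputs("openedx edx", repos_to_ignore="repo1 repo2")` yields two organizations and an ignore set of two names, and `should_check("edx", "repo1", ...)` then returns `False`.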