The GitHub web UI search tool, along with its paging, is not ideal for getting a list of repositories containing a given string. The following techniques can help.
The following examples show how to get a list of edx repositories containing the string "edx-drf-extensions".
The PyGithub script below has the advantage of handling paging in addition to sorting and filtering results. Because the script pages automatically, it may hit the search rate limit; in that case it prints a warning and the results are incomplete.
In a virtualenv, pip install PyGithub:
# Also see https://pygithub.readthedocs.io/en/latest/introduction.html
pip install PyGithub
Use a simple script like the following:
#!/usr/bin/python
from github import Github

# Set this to a personal access token.
# - Select "repo" for the oauth scopes.
# See https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/
g = Github('REPLACE_WITH_YOUR_ACCESS_TOKEN')

repositories = set()

# Note: Gets rate limited and fails if too many hits
content_files = g.search_code(query='org:edx edx-drf-extensions')
for content in content_files:
    repositories.add(content.repository.full_name)
    rate_limit = g.get_rate_limit()
    if rate_limit.search.remaining == 0:
        print('WARNING: Rate limit on searching was reached. Results are incomplete.')
        break

for repo in sorted(repositories):
    print(repo)

rate_limit = g.get_rate_limit()
print('Search rate limit:')
print(rate_limit.search)
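The script above stops as soon as the search quota is exhausted, so its output can be incomplete. One possible alternative is to pause until the quota resets and then carry on. The helper below is only a sketch of that idea: `seconds_until_reset` is a hypothetical name, not part of PyGithub, and it assumes the reset time is available as a `datetime` (for example from the `reset` attribute of the search rate-limit object, whose timezone handling may differ between PyGithub versions).

```python
from datetime import datetime, timezone


def seconds_until_reset(reset_at, now=None, margin=1.0):
    """Return how many seconds to sleep before the rate limit resets.

    reset_at: datetime of the reset; a naive value is ASSUMED to be UTC.
    now:      current time (defaults to the real clock); injectable for tests.
    margin:   extra seconds added as a safety buffer (an arbitrary choice).
    """
    if reset_at.tzinfo is None:
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    elif now.tzinfo is None:
        now = now.replace(tzinfo=timezone.utc)

    remaining = (reset_at - now).total_seconds()
    if remaining <= 0:
        # The reset time has already passed, so there is nothing to wait for.
        return 0.0
    return remaining + margin
```

Inside the loop, something like `time.sleep(seconds_until_reset(rate_limit.search.reset))` could take the place of the `break`, trading a longer run for complete results.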
From the command line, use the following:
# Supply username to search private repos
curl --user "REPLACE_WITH_GITHUB_USERNAME" "https://api.github.com/search/code?q=edx-drf-extensions+org:edx"
curl --user "REPLACE_WITH_GITHUB_USERNAME" "https://api.github.com/search/code?q=REPLACE_WITH_SEARCH_TERM+org:edx"

# Skip username to quickly search public repos
curl "https://api.github.com/search/code?q=edx-drf-extensions+org:edx"

# Check the "last" link in the headers to see how many pages of results there are.
curl -sI "https://api.github.com/search/code?q=edx-drf-extensions+org:edx" | grep 'rel="last"'

# Or just add "&page=2", etc., and see if you get results.
# Quote the URL so the shell does not treat "&" as a background operator.
curl "https://api.github.com/search/code?q=edx-drf-extensions+org:edx&page=2"
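The `grep 'rel="last"'` step above prints the raw `Link` header, which still has to be read by eye. As a rough sketch, the page count could be pulled out programmatically. The function name `last_page_from_link_header` is hypothetical, and the sample header used with it is illustrative, not captured output.

```python
import re
from urllib.parse import parse_qs, urlparse


def last_page_from_link_header(link_header):
    """Return the page number of the rel="last" link, or None if absent."""
    # Each entry in a Link header looks like: <URL>; rel="name"
    for match in re.finditer(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header):
        url, rel = match.group(1), match.group(2)
        if rel == "last":
            pages = parse_qs(urlparse(url).query).get("page")
            if pages:
                return int(pages[0])
    return None
```

For example, a header whose `rel="last"` URL ends in `&page=34` yields `34`; a response with a single page of results has no `rel="last"` entry, so the function returns `None`.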
If you have jq installed (e.g. brew install jq), you can get a sorted/filtered list using the following:
# Pipe results to jq to get a filtered list of repositories
curl -s "https://api.github.com/search/code?q=edx-drf-extensions+org:edx" 2>&1 | jq "[.items[].repository.full_name] | unique"
# Note: Add '--user "REPLACE_WITH_GITHUB_USERNAME"' like above to search private repos.
# Note: Remember to get additional pages of results if there are any (see above).
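The jq filter above works on one page of results at a time. When there are several pages, the saved JSON responses can be merged in a few lines of Python. This is a minimal sketch: `unique_repo_names` is a hypothetical helper, and it assumes each response body has the same `items[].repository.full_name` shape that the jq filter relies on.

```python
import json


def unique_repo_names(*response_bodies):
    """Return sorted, de-duplicated repository names from JSON result pages.

    Each argument is the raw JSON text of one search response page.
    """
    names = set()
    for body in response_bodies:
        data = json.loads(body)
        # A page without an "items" key contributes nothing.
        for item in data.get("items", []):
            names.add(item["repository"]["full_name"])
    return sorted(names)
```

Passing the bodies of page 1 and page 2 together gives a single sorted list, which mirrors what `unique` does in jq for a single page.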
Using the GitHub API from the browser is a quick-and-dirty approach, with some limitations.
NOTE: This only searches public repos. You also need to remember to page through the results if there is more than one page.
Enter the following filter in jq-play:
.items[].repository.full_name