The GitHub web UI search tool, with its paging, is poorly suited to getting a list of repositories that contain a given string. The following techniques can help.

The following examples show how to get a list of edx repositories containing the string "edx-drf-extensions".

GitHub API from Browser

Using the GitHub API from the browser searches public repos only. To search private repos you have access to, use one of the other techniques below.


  1. In the browser, use a URL like one of the following:
    1. https://api.github.com/search/code?q=edx-drf-extensions+org:edx
    2. https://api.github.com/search/code?q=REPLACE_WITH_SEARCH_TERM+org:edx
  2. Copy the search output into the JSON field in https://jqplay.org/
  3. Enter the following filter in jq-play:

    .items[].repository.full_name
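
If pasting into jq-play is inconvenient, the same extraction can be sketched in Python. This is only an illustration: it assumes the response body has the shape the jq filter above relies on (an "items" list whose entries each have a "repository" with a "full_name"), and the repository names in the sample are hypothetical. Unlike the plain jq filter, it also de-duplicates and sorts the names.

```python
import json


def repo_names(search_json):
    """Return the sorted, unique repository names in a code search response."""
    response = json.loads(search_json)
    # Each hit in "items" names the repository it was found in; a set
    # collapses repositories that matched in more than one file.
    names = {item["repository"]["full_name"] for item in response.get("items", [])}
    return sorted(names)


# Hypothetical, heavily trimmed response body, for illustration only.
sample = json.dumps({
    "total_count": 3,
    "items": [
        {"path": "requirements/base.txt", "repository": {"full_name": "edx/example-repo-b"}},
        {"path": "setup.py", "repository": {"full_name": "edx/example-repo-a"}},
        {"path": "requirements/test.txt", "repository": {"full_name": "edx/example-repo-b"}},
    ],
})

print(repo_names(sample))  # ['edx/example-repo-a', 'edx/example-repo-b']
```

To use it on real output, save the browser response to a file and pass the file contents to repo_names().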

GitHub API from Command Line

  1. From the command-line, use the following:

    # Supply username to search private repos.
    # curl prompts for a password; enter a personal access token.
    # The URL is quoted so the shell does not interpret the "?".
    curl --user "REPLACE_WITH_GITHUB_USERNAME" "https://api.github.com/search/code?q=edx-drf-extensions+org:edx"
    curl --user "REPLACE_WITH_GITHUB_USERNAME" "https://api.github.com/search/code?q=REPLACE_WITH_SEARCH_TERM+org:edx"
    
    # Skip username to quickly search public repos
    curl "https://api.github.com/search/code?q=edx-drf-extensions+org:edx"
  2. If you have jq installed (e.g. brew install jq), you can get a sorted/filtered list using the following:

    # Pipe results to jq to get a filtered list of repositories
    curl -s "https://api.github.com/search/code?q=edx-drf-extensions+org:edx" 2>&1 | jq "[.items[].repository.full_name] | unique"
    
    # Note: Add '--user "REPLACE_WITH_GITHUB_USERNAME"' like above to search private repos.
  3. Or, use jq-play online to filter the output:
    1. Copy the search output into the JSON field in https://jqplay.org/
    2. Enter the following filter in jq-play:

      .items[].repository.full_name
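
When REPLACE_WITH_SEARCH_TERM contains spaces or other special characters, the query string needs URL encoding. The following is a minimal sketch, using only the Python standard library, of how the same request could be built. The helper names build_search_url and build_request are hypothetical, and the "token" Authorization header is an assumption about how you choose to authenticate. Nothing is sent over the network here.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://api.github.com/search/code"


def build_search_url(term, org="edx"):
    """Build the code search URL for a term, limited to one organization."""
    # urlencode percent-encodes special characters and turns the space
    # between the term and the org: qualifier into "+".
    return SEARCH_URL + "?" + urlencode({"q": f"{term} org:{org}"})


def build_request(term, token=None, org="edx"):
    """Build (but do not send) a request; a token allows private repo search."""
    request = Request(build_search_url(term, org))
    if token:
        # Assumption: authenticate with a personal access token.
        request.add_header("Authorization", f"token {token}")
    return request


print(build_search_url("edx-drf-extensions"))
# https://api.github.com/search/code?q=edx-drf-extensions+org%3Aedx
```

Passing the result of build_request() to urllib.request.urlopen() would send the search; the encoded colon (%3A) is equivalent to the literal colon in the curl examples.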

GitHub API from Python

This script has the advantage of sorting and filtering results to the unique set of repositories.


  1. In a virtualenv, pip install PyGithub:

    # Also see https://pygithub.readthedocs.io/en/latest/introduction.html
    pip install PyGithub
  2. Use a simple script like the following:

#!/usr/bin/python
from github import Github

# Set this to a personal access token.
# - Select "repo" for the OAuth scopes.
# See https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/
g = Github('REPLACE_WITH_YOUR_ACCESS_TOKEN')

repositories = set()

# Note: The search is rate limited and fails if there are too many hits.
content_files = g.search_code(query='org:edx edx-drf-extensions')
for content in content_files:
    repositories.add(content.repository.full_name)
    # Stop early once the search rate limit is exhausted.
    rate_limit = g.get_rate_limit()
    if rate_limit.search.remaining == 0:
        print('Rate limit on searching was reached.')
        break

# Print the unique repositories in sorted order.
for repo in sorted(repositories):
    print(repo)

rate_limit = g.get_rate_limit()
print('Search rate limit:')
print(rate_limit.search)
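
Because the script can fail when a search has too many hits, one option is to cap how many results are consumed. The sketch below is an assumption-laden illustration, not part of the script above: collect_repositories and fake_hit are hypothetical helpers, and the stand-in objects only mimic the content.repository.full_name attribute the script reads. It assumes the result list is fetched lazily as it is iterated.

```python
from itertools import islice
from types import SimpleNamespace


def collect_repositories(content_files, max_results=100):
    """Return sorted unique repository names from at most max_results hits."""
    names = set()
    # islice stops consuming the iterable after max_results items, so a
    # lazily fetched result list is not asked for anything beyond that.
    for content in islice(content_files, max_results):
        names.add(content.repository.full_name)
    return sorted(names)


def fake_hit(full_name):
    """Hypothetical stand-in for a search result, for illustration only."""
    return SimpleNamespace(repository=SimpleNamespace(full_name=full_name))


hits = [
    fake_hit("edx/example-repo-b"),
    fake_hit("edx/example-repo-a"),
    fake_hit("edx/example-repo-b"),
    fake_hit("edx/example-repo-c"),
]

# Only the first three hits are read, so example-repo-c is never seen.
print(collect_repositories(hits, max_results=3))
```

With the script above, the call would look like collect_repositories(g.search_code(query='org:edx edx-drf-extensions')). The trade-off is that a low cap can miss repositories that only appear in later results.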
