Analytics Data API - Paginated Course Summaries
Abstract
We want to move filtering, sorting, pagination, and aggregation of course summaries from client-side Insights to within the Analytics Data API.
Background
There is an undocumented "Course Summaries" endpoint in the Analytics Data API at:
GET /api/v0/course_summaries/?course_ids=<course_id_0>...<course_id_n>
and:
POST /api/v0/course_summaries/
{
'course_ids': [ ... ],
...,
}
Both methods return metadata about the enrollment count, enrollment delta, and start/end dates of the courses whose IDs are passed in. The POST method is identical to the GET method except that its arguments are passed in the request body, so the number of course IDs is not limited by URL length restrictions.
This endpoint is called by Insights on the server side, and its data is sent to the client in a template. Because the API has no notion of user course access, Insights must ensure that only the course summaries to which the user has access are sent to the client. It does so in one of two ways:
- If the user has access to fewer than 500 courses, course_ids is set to the IDs of those courses. So, the returned course summaries are only the ones that the user has access to.
- Otherwise, course_ids is not passed in, resulting in all course summaries being returned. Insights must then filter out the course summaries that the user doesn't have access to.
Note: There are two separate filtering methods because, beyond a certain number of course IDs, processing all the IDs and loading the matching course summaries becomes slower than simply loading every course summary.
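The two access-filtering paths described above can be sketched as follows. This is a minimal illustration, not the actual Insights code: `summaries_for_user`, `fetch_summaries`, the constant name, and the `course_id` field on each summary are all hypothetical names chosen for the example. Only the threshold of 500 comes from the text.

```python
from typing import Callable, Iterable, Optional

# Threshold taken from the text above; the constant name is hypothetical.
MAX_COURSE_IDS_TO_PASS = 500


def summaries_for_user(
    accessible_ids: Iterable[str],
    fetch_summaries: Callable[[Optional[list]], list],
) -> list:
    """Return only the course summaries the user may see.

    `fetch_summaries` is a hypothetical stand-in for the call to the
    course summaries endpoint: it takes a list of course IDs, or None
    to request every summary.
    """
    allowed = set(accessible_ids)
    if len(allowed) < MAX_COURSE_IDS_TO_PASS:
        # Few courses: pass the IDs and let the API do the filtering.
        return fetch_summaries(sorted(allowed))
    # Many courses: fetch everything, then filter in Insights.
    # Assumes each summary dict carries a "course_id" key.
    return [s for s in fetch_summaries(None) if s["course_id"] in allowed]
```

Note that the second branch is where the duplication comes from: Insights has to repeat filtering logic that the API already applies in the first branch.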
Once the data reaches the client, it is filtered, sorted, and paginated using Backgrid. Although the initial page load is slow, the loaded table is very responsive and snappy. Additionally, the data is aggregated to show some overall statistics at the top of the page.
Finally, there is a "Download CSV" link that writes all the course summary data (unfiltered, sorted by course display name) to a CSV file for the user to download. This is done entirely client-side.
Problems with the current design
- Allowing/requiring the API to return EVERY course summary is slow, taxing on the API server, and not scalable
- Causes >5s load time on the Insights course listing page (the page users see right after logging in)
- Dave, paraphrased: "Sure, it works now... but what if we get 10,000 courses? Or 50,000? Allowing an endpoint to return that much data is not good"
- Having two separate methods of course-access filtering is confusing and requires duplicating code between Insights and the API
Solution
Paginate the API response. This necessarily means that filtering, sorting, and aggregation will also be done in the API. We will also have to add a new Insights API endpoint that mirrors the Analytics Data API course summaries endpoint and uses it as a backend, in order to make the data available to the Insights client-side course listing page.
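To illustrate why pagination pulls filtering and sorting into the API, here is a rough sketch of the order of operations: a page is only meaningful as a slice of the already filtered and sorted result set. The function name, parameter names, default page size, and the `course_id` field are assumptions made for this example, not part of the proposed API.

```python
from typing import Optional


def paginate_summaries(
    summaries: list,
    page: int = 1,
    page_size: int = 100,
    sort_key: str = "course_id",
    descending: bool = False,
    text_search: Optional[str] = None,
) -> dict:
    """Filter, sort, then slice a list of course summary dicts."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    rows = summaries
    if text_search:
        # Hypothetical filter: case-insensitive match on the course ID.
        needle = text_search.lower()
        rows = [r for r in rows if needle in r["course_id"].lower()]
    # Sorting must happen before slicing, or page boundaries are wrong.
    rows = sorted(rows, key=lambda r: r[sort_key], reverse=descending)
    start = (page - 1) * page_size
    return {
        "count": len(rows),  # total matches, not just this page
        "page": page,
        "results": rows[start:start + page_size],
    }
```

The same reasoning applies to aggregation: a client that holds only one page cannot compute statistics across all courses, so those numbers have to come from the server as well.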
TODO: Add in `fields` and `exclude` parameters to course summaries endpoints
Insights: Course Index View
http(s)://<insights_host>/courses/#?<query_string>
Method | Description | Query Parameters | Statuses |
---|---|---|---|
GET |
|
For defaults, put enumeration-style key-value pairs in the URL |
TODO: Look into what happens for bad query params. Stick with current functionality. Want to keep same URL scheme and page behavior to avoid breaking URLs |
Insights: Course Summaries API
http(s)://<insights_host>/api/course_summaries/v1/course_summaries/?<query_string>
Method | Description | Query Parameters | Return Values | Statuses |
---|---|---|---|---|
GET | Get paginated list of course summaries |
|
|
|
Analytics Data API: Course Aggregate Data
http(s)://<data_api_host>/api/v1/course_aggregate_data/?<query_string>
Method | Description | Query Parameters | Return Value | Access |
---|---|---|---|---|
GET | Get aggregate data about a set of courses |
|
| Same as Insights Course Summaries API |
POST | Same as GET, but number of course IDs is not restricted by URL length | Same as above, but comma-separated lists are JSON arrays of strings | Same as above | Same as above |
Analytics Data API: Course Summaries
http(s)://<data_api_host>/api/v1/course_summaries/?<query_string>
Method | Description | Query Parameters | Return Value | Access |
---|---|---|---|---|
GET | Get a paginated list of course summaries, with optional filtering and sorting |
| Same as Insights Course Summaries API, but
| Same as Insights Course Summaries API |
POST | Same as GET, but number of course IDs is not restricted by URL length | Same as above, but comma-separated lists are JSON arrays of strings | Same as above, except next/previous URLs are not included | Same as above |
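The difference between the GET and POST encodings in the tables above (comma-separated lists versus JSON arrays of strings) might look like the following on the calling side. This is a sketch under assumptions: the helper names are hypothetical, `page` is used only as a placeholder parameter, and the exact set of accepted parameters is not specified here.

```python
import json
from urllib.parse import urlencode


def build_get_query(params: dict) -> str:
    """Encode list values as comma-separated strings for a GET query."""
    flat = {
        key: ",".join(value) if isinstance(value, list) else value
        for key, value in params.items()
    }
    return urlencode(flat)


def build_post_body(params: dict) -> str:
    """Encode the same parameters as JSON; lists stay arrays of strings."""
    return json.dumps(params)
```

For example, `build_get_query({"course_ids": ["a/b/c", "d/e/f"]})` yields a single URL-encoded `course_ids` value, while `build_post_body` keeps the two IDs as separate array elements, which is what lets the POST form avoid URL length limits.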