Learner Analytics API

Introduction

The Learner Analytics API provides data about how particular learners are engaging in particular courses. Its primary use case is the /wiki/spaces/AN/pages/30376815 project for Q2 2016. dfriedmanR and Dennis Jen are designing the spec with support from the /wiki/spaces/AN/overview and Engineering organization.

This spec follows the edX REST API Conventions and is inspired by the /wiki/spaces/LEARNER/pages/19662983 and /wiki/spaces/LEARNER/pages/27557954.

The current API version is v0.

Open Questions

  • What should be the format for dates?
  • What should be the default date range?
  • What should we return when we don't have data for the provided date range, or if the date range is too small or too big?

 

Here be dragons. This spec is an in-progress design.

Access

We plan to implement the Learner Analytics API on the Data API. It will follow the existing authentication and authorization workflow: clients must have a valid API key, which permits them to access all data from the API. This means that clients must be trusted to authenticate and authorize users themselves. For example, for the Learner Analytics app, Insights will act as an auth layer between the user and the Data API.

We are considering implementing direct access to the API using JWTs for authorization once the necessary infrastructure lands.
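
As a rough sketch of the key-based access described above, a trusted client might attach its API key to every request as shown below. The "Authorization: Token <key>" header scheme, the host name, the placeholder key, and the build_request helper are all assumptions made for illustration; this spec does not define them.

```python
from urllib.request import Request

# Hypothetical values: the host and key are placeholders, not part of the spec.
DATA_API_BASE = "https://openedx-data-api-instance.org"
API_KEY = "replace-with-a-real-key"


def build_request(path: str, api_key: str = API_KEY) -> Request:
    """Build (but do not send) a GET request carrying the client's API key.

    The "Token" scheme is an assumption; use whatever scheme the Data API
    deployment actually requires.
    """
    return Request(
        DATA_API_BASE + path,
        headers={
            "Authorization": f"Token {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


request = build_request("/api/v0/learners/?course_id=my/cool/course")
print(request.get_method(), request.full_url)
```

Because the key grants access to all data, a client such as Insights would need to check the end user's course permissions before issuing a request like this one.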

Endpoints

Notes:

  • This is a read-only API.
  • If an HTTP method on a particular endpoint is undocumented, it is unsupported.
  • JSON is the default Content-Type. Other types are accepted where noted.
  • Errors will be returned with developer messages according to the edX REST API Conventions spec.

Learner Detail

Data API: /api/v0/learners/<username>/?course_id=<course_id>

Insights: /api/learner_analytics/v0/learners/<username>/?course_id=<course_id>

Method: GET

Description: Get a particular student's data for a particular course.

Query Parameters:

  • course_id - The course within which user data is requested.

Return Value:

  • 200 - OK: User is authenticated and authorized.
  • 400 - Bad Request: course_id not provided.
  • 401 - Unauthorized: Requesting user is not authenticated; has no session.
  • 403 - Forbidden: Requesting user is authenticated but isn't course/org staff or an instructor.
  • 404 - Not Found: Requested user or course doesn't exist, or requested user doesn't belong to course.

Access: User must be course or org staff or an instructor.

Example Response

{
    "last_updated": "2015-11-09",
    "username": "dan-friedman",
    "name": "Daniel Friedman",
    "email": "dfriedman@edx.org",
    "account_url": "https://openedx-platform-instance.org/api/user/v1/accounts/dan-friedman",
    "enrollment_mode": "honor",
    "enrollment_date": "2015-10-26 18:13:32.656640",
    "cohort": null,
    "segments": ["has_potential"],
    "engagements": {
        "date_range": {
            "start": "2015-10-19",
            "end": "2015-10-26"
        },
        "discussion_contributions": 24,
        "problems_attempted": 138,
        "problems_completed": 76,
        "videos_viewed": 26,
        "problem_attempts_per_completed": 4.5
    }
}
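
The following sketch shows one way a client might build the Learner Detail path and read a few fields from the response. The learner_detail_path helper is hypothetical, and percent-encoding the course ID is an assumption on our part (the examples in this spec show it unencoded); the JSON is a trimmed copy of the example response above.

```python
import json
from urllib.parse import quote, urlencode


def learner_detail_path(username: str, course_id: str) -> str:
    """Return the Data API path and query string for one learner in one course."""
    # safe="" makes quote() escape "/" too, so a username cannot change the path.
    user = quote(username, safe="")
    # urlencode() percent-encodes the course ID, including any "/" it contains.
    query = urlencode({"course_id": course_id})
    return f"/api/v0/learners/{user}/?{query}"


# Trimmed copy of the example response.
body = json.loads("""
{
    "username": "dan-friedman",
    "cohort": null,
    "segments": ["has_potential"],
    "engagements": {
        "date_range": {"start": "2015-10-19", "end": "2015-10-26"},
        "videos_viewed": 26
    }
}
""")

print(learner_detail_path(body["username"], "my/cool/course"))
engagements = body["engagements"]
print(engagements["date_range"]["start"], engagements["videos_viewed"])
# JSON null becomes None, so a learner without a cohort needs no special casing.
print(body["cohort"] is None)
```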

Learner List

Data API: /api/v0/learners/?course_id=<course_id>

Insights: /api/learner_analytics/v0/learners/?course_id=<course_id>

Method: GET

Description: Get a paginated list of student data for a particular course. By default, students are returned in increasing alphabetical order by username.

Query Parameters:

  • Filters:

    • course_id - The course within which user data is requested.
    • text_search - Limits the learners returned to those whose names, usernames, or email addresses match the search terms.
    • segments - Comma-separated list of /wiki/spaces/AN/pages/44073840. Segments are "OR"ed together: each returned user must belong to at least one of the supplied segments.
    • ignore_segments - Comma-separated list of /wiki/spaces/AN/pages/44073840 to which returned users must NOT belong. Cannot be combined with the segments parameter.
    • cohort - The cohort to which all returned users must belong.
    • enrollment_mode - The enrollment track to which all returned users must belong.
  • Pagination:

    • page_size - Number of learner records to return per page.
    • page - Page number to retrieve.
    • order_by - Field by which results are sorted.
    • sort_order - Either "asc" or "desc", indicating the desired sort direction.

HTTP Headers:

  • Accept - application/json or text/csv

Return Value:

  • 200 - OK: User is authenticated and authorized.
  • 400 - Bad Request: Any of the following:

    • course_id not provided.
    • segments and ignore_segments used together.
    • Values for order_by, page_size, or page are invalid.
  • 401 - Unauthorized: Requesting user is not authenticated; has no session.
  • 403 - Forbidden: Requesting user is authenticated but isn't course/org staff or an instructor.
  • 404 - Not Found: Any of the following:

    • Requested course doesn't exist.
    • Requested segments don't exist.
    • Requested cohort doesn't exist.
    • Requested enrollment track doesn't exist.

Access: User must be course or org staff or an instructor.
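
To illustrate how the filter and pagination parameters combine, here is a minimal sketch of a client-side query builder. The learner_list_query function and its argument names are hypothetical; the checks simply mirror the 400 conditions listed above so that an invalid request can be caught before it is sent.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def learner_list_query(
    course_id: str,
    segments: Sequence[str] = (),
    ignore_segments: Sequence[str] = (),
    text_search: Optional[str] = None,
    cohort: Optional[str] = None,
    enrollment_mode: Optional[str] = None,
    page: Optional[int] = None,
    page_size: Optional[int] = None,
    order_by: Optional[str] = None,
    sort_order: Optional[str] = None,
) -> str:
    """Return a URL-encoded query string for the Learner List endpoint."""
    if not course_id:
        raise ValueError("course_id is required")
    if segments and ignore_segments:
        raise ValueError("segments and ignore_segments cannot be combined")
    if sort_order not in (None, "asc", "desc"):
        raise ValueError('sort_order must be "asc" or "desc"')
    params = {
        "course_id": course_id,
        "text_search": text_search,
        # Both segment filters are sent as comma-separated lists.
        "segments": ",".join(segments) or None,
        "ignore_segments": ",".join(ignore_segments) or None,
        "cohort": cohort,
        "enrollment_mode": enrollment_mode,
        "page": page,
        "page_size": page_size,
        "order_by": order_by,
        "sort_order": sort_order,
    }
    # Drop unset parameters so the server applies its own defaults.
    return urlencode({k: v for k, v in params.items() if v is not None})


print(
    learner_list_query(
        "my/cool/course",
        segments=["disengaging"],
        page=4,
        order_by="problems_attempted",
        sort_order="desc",
    )
)
```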

 

 

 

When sorting by problem_attempts_per_completed, we apply a secondary sort on an opaque attempt_ratio_order field in order to generate a meaningful sort order. This lets us distinguish learners whose ratios would otherwise tie or be undefined, such as n/0 versus (n+1)/0, or i/j versus (i*x)/(j*x).
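
The sketch below only illustrates why a secondary sort key is needed; it does not describe how attempt_ratio_order is actually computed, which this spec leaves opaque. The "attempts" and "completed" field names, the use of infinity for a zero denominator, and the choice of the raw attempt count as the tiebreaker are all assumptions.

```python
import math


def ratio(attempts: int, completed: int) -> float:
    """Attempts per completed problem; treated as infinite when nothing is completed."""
    return attempts / completed if completed else math.inf


def sort_key(learner: dict) -> tuple:
    # Primary key: the ratio. Secondary key (hypothetical): the attempt count,
    # which separates 3/0 from 4/0 and 2/1 from 4/2.
    return (ratio(learner["attempts"], learner["completed"]), learner["attempts"])


rows = [
    {"username": "d", "attempts": 4, "completed": 0},
    {"username": "b", "attempts": 4, "completed": 2},
    {"username": "c", "attempts": 3, "completed": 0},
    {"username": "a", "attempts": 2, "completed": 1},
]

ordered = [row["username"] for row in sorted(rows, key=sort_key)]
print(ordered)
```

Without the secondary key, "a" and "b" (both 2.0) and "c" and "d" (both undefined) would compare as equal, so their relative order would depend only on input order.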

 

 

 

Example Response

{
    "count": 99,
    "next": "https://openedx-data-api-instance.org/api/v0/learners/?course_id=<course_id>&page=2",
    "previous": null,
    "results": [
        {
            "last_updated": "2015-11-09",
            "username": "dan-friedman",
            "name": "Daniel Friedman",
            "email": "dfriedman@edx.org",
            "account_url": "https://openedx-platform-instance.org/api/user/v1/accounts/dan-friedman",
            "enrollment_mode": "honor",
            "enrollment_date": "2015-10-26 18:13:32.656640",
            "cohort": null,
            "segments": ["has_potential"],
            "engagements": {
                "date_range": {
                    "start": "2015-10-19",
                    "end": "2015-10-26"
                },
                "discussion_contributions": 24,
                "problems_attempted": 138,
                "problems_completed": 76,
                "videos_viewed": 26,
                "problem_attempts_per_completed": 4.5
            }
        }
    ]
}
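
Since the list response carries "next" and "previous" links, a client can walk the full roster by following "next" until it is null. The sketch below assumes a caller-supplied fetch_json function that performs the HTTP request and returns the decoded body; that function, the iter_learners helper, and the canned pages are hypothetical and stand in for real network calls.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_learners(first_url: str, fetch_json: Callable[[str], Dict]) -> Iterator[Dict]:
    """Yield every learner record, following "next" links until there are none."""
    url: Optional[str] = first_url
    while url:
        page = fetch_json(url)
        yield from page["results"]
        url = page.get("next")


# Stand-in for real HTTP calls: two canned pages keyed by URL (made-up data).
FAKE_PAGES = {
    "/api/v0/learners/?course_id=demo": {
        "count": 3,
        "next": "/api/v0/learners/?course_id=demo&page=2",
        "previous": None,
        "results": [{"username": "alice"}, {"username": "bob"}],
    },
    "/api/v0/learners/?course_id=demo&page=2": {
        "count": 3,
        "next": None,
        "previous": "/api/v0/learners/?course_id=demo",
        "results": [{"username": "carol"}],
    },
}

usernames = [
    learner["username"]
    for learner in iter_learners("/api/v0/learners/?course_id=demo", FAKE_PAGES.__getitem__)
]
print(usernames)
```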

Engagement Timeline

Data API: /api/v0/engagement_timelines/<username>/?course_id=<course_id>

Insights: /api/learner_analytics/v0/engagement_timelines/<username>/?course_id=<course_id>

Method: GET

Description: Get a list of day-by-day engagement data for a given learner in a particular course.

Query Parameters:

  • Filters:

    • course_id - The course within which user data is requested.

HTTP Headers:

  • Accept - application/json or text/csv

Return Value:

  • 200 - OK: User is authenticated and authorized.
  • 400 - Bad Request: Any of the following:

    • course_id not provided.
  • 401 - Unauthorized: Requesting user is not authenticated; has no session.
  • 403 - Forbidden: Requesting user is authenticated but isn't course/org staff or an instructor.
  • 404 - Not Found: Any of the following:

    • Requested course doesn't exist.
    • Requested user doesn't exist.
    • Requested user doesn't belong to course.

Access: User must be course or org staff or an instructor.

Example Response

{
    "days": [
        {
            "date": "2015-10-19",
            "discussion_contributions": 1,
            "problems_attempted": 5,
            "problems_completed": 2,
            "videos_viewed": 7
        },
        {
            "date": "2015-10-20",
            "discussion_contributions": 5,
            "problems_attempted": 0,
            "problems_completed": 0,
            "videos_viewed": 1
        },
        { ... }
    ]
}
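
As a small illustration of consuming the timeline, the sketch below sums each metric across the two fully specified days of the example response. It assumes every key in a day entry other than "date" is a numeric metric, which holds for the example but is not guaranteed by this spec.

```python
from collections import Counter

# The two complete day entries from the example response.
timeline = {
    "days": [
        {
            "date": "2015-10-19",
            "discussion_contributions": 1,
            "problems_attempted": 5,
            "problems_completed": 2,
            "videos_viewed": 7,
        },
        {
            "date": "2015-10-20",
            "discussion_contributions": 5,
            "problems_attempted": 0,
            "problems_completed": 0,
            "videos_viewed": 1,
        },
    ]
}

totals = Counter()
for day in timeline["days"]:
    for metric, value in day.items():
        if metric != "date":  # assumption: all other keys are numeric metrics
            totals[metric] += value

print(dict(totals))
```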

Course Metadata

Data API: /api/v0/course_learner_metadata/<course_id>/

Insights: /api/learner_analytics/v0/course_learner_metadata/<course_id>/

Method: GET

Description: Get metadata on learners within a course. Includes data on segments, cohorts, enrollment modes, and an engagement rubric.

Return Value:

  • 200 - OK: User is authenticated and authorized.
  • 401 - Unauthorized: Requesting user is not authenticated; has no session.
  • 403 - Forbidden: Requesting user is authenticated but isn't course/org staff or an instructor.
  • 404 - Not Found: Any of the following:

    • Requested course doesn't exist.
    • Requested user doesn't exist.
    • Requested user doesn't belong to course.

Access: User must be course or org staff or an instructor.

Example Response

{
    "cohorts": {
        "Cohort A": 1000, "Cohort B": 900
    },
    "segments": {
        "low_grades": 4,
        "struggling": 12,
        "disengaging": 12,
        "inactive": 9
    },
    "enrollment_modes": {
        "honor": 2000,
        "verified": 100
    },
    "engagement_ranges": {
        "date_range": {
            "start": "2015-10-19",
            "end": "2015-10-26"
        },
        "problems_attempted": {
            "below_average": [0, 5],
            "average": [5, 15.2],
            "above_average": [15.2, null]
        },
        "problems_completed": {
            "below_average": [0, 5],
            "average": [5, 15.2],
            "above_average": null
        },
        "problem_attempts_per_completed": {
            "below_average": null,
            "average": [0, null],
            "above_average": null
        },
        "discussion_contributions": { ... }
    }
}

Note that a given metric (e.g. problems_attempted) may not have engagement ranges for all of "below_average", "average", and "above_average". For courses with less available data, some ranges may be null. The "average" range will always exist.
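
A client that highlights roster values needs to map each value onto one of these ranges while tolerating missing ones. The sketch below is one possible approach, not a definitive rule: the classify helper is hypothetical, and it assumes that a null bound means "unbounded" and that ranges include their lower bound and exclude their upper bound. The spec does not state either point.

```python
from typing import Dict, List, Optional

# A range is either null (absent) or a [low, high] pair whose bounds may be null.
Range = Optional[List[Optional[float]]]


def classify(value: float, ranges: Dict[str, Range]) -> Optional[str]:
    """Return the label of the first range containing value, or None if none does."""
    for label in ("below_average", "average", "above_average"):
        bounds = ranges.get(label)
        if bounds is None:
            continue  # this range is not available for the course
        low, high = bounds
        # Assumption: lower bound inclusive, upper bound exclusive, null = unbounded.
        if (low is None or value >= low) and (high is None or value < high):
            return label
    return None


problems_attempted = {
    "below_average": [0, 5],
    "average": [5, 15.2],
    "above_average": [15.2, None],
}
attempts_per_completed = {
    "below_average": None,
    "average": [0, None],
    "above_average": None,
}

print(classify(3, problems_attempted))
print(classify(20, problems_attempted))
print(classify(4.5, attempts_per_completed))
```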

Test Cases

To make sure the API will do what we need, this section lists the data we'll need in the app alongside the API call(s) that provide it.

Data needed: Given a username, get "profile" info:

  • email
  • full name
  • cohort
  • track
  • profile photo
  • date enrolled
  • age, gender, level of education
  • country
  • bio blurb

API call(s):

GET /api/v0/learners/dan-friedman/?course_id=my/cool/course

GET https://openedx-platform-instance.org/api/user/v1/accounts/dan-friedman

Data needed: Given a username, get activity summaries over some time period (e.g. in a scrollable graph), for the previous 3 weeks:

  • number of videos watched, per-day
  • number of problems attempted, per-day
  • number of problems completed, per-day
  • total number of problem attempts, per-day

API call(s):

GET /api/v0/engagement_timelines/dan-friedman/?course_id=my/cool/course&start_date=2015-10-6&end_date=2015-10-27

Data needed: Same data as above, between 10/2/2015 and 11/3/2015.

API call(s):

GET /api/v0/engagement_timelines/dan-friedman/?course_id=my/cool/course&start_date=2015-10-2&end_date=2015-11-3

Data needed: Given a username, get some summary stats:

  • number of videos watched in last 7 days

API call(s):

GET /api/v0/learners/dan-friedman/?course_id=my/cool/course&start_date=2015-10-20&end_date=2015-10-27

Data needed: Base roster: get page 1 of the list, with the default sort and no filters applied. Get back a list of:

  • username
  • user email (NOTE: may need to display this if the email is what matches search)
  • full name
  • discussion activity in last week
  • problems attempted in last week
  • attempts per problem completed
  • videos watched

API call(s):

GET /api/v0/learners/?course_id=my/cool/course

Data needed: Roster highlighting: what to highlight? For each metric in the table, what are the thresholds between three regions: "great", "normal", "concerning" (names can be changed)?

API call(s):

GET /api/v0/course_learner_metadata/my/cool/course/

(thresholds should be contained within the "engagement_ranges" sub-document)

Data needed: Roster with filters, search criteria:

  • Return page 4 of roster sorted descending by problems attempted in last week, with filters applied: ["search:Armadillo", "disengaging", "no videos watched", "track:honor", "cohort:BestAnimals"]

API call(s):

GET /api/v0/learners/?course_id=my/cool/course&page=4&order_by=problems_attempted&sort_order=desc&text_search=Armadillo&segments=disengaging&enrollment_mode=honor&cohort=BestAnimals

Data needed: (Maybe) Search result highlighting – when the user searched for "Armadillo", can we show them what matched for each result?

Data needed: (Maybe) Search type-ahead: get a mini-set of results for "arma" – just username, full name, email.

Data needed: (Maybe) Get course averages for the various stats, to display as context with user's activity.

Data needed: [Future] What actionable tidbits have we computed about this student, to base messages on? (TBD; if it's a simple function of just the data on the page, could compute in-app)