XQueue/XQWatcher Architecture
- 1 TL;DR
- 2 Role
- 3 Design
- 3.1 Dependencies
- 3.2 Technology stack
- 3.3 Summary diagram
- 3.4 Authentication
- 3.5 API Endpoints
- 3.5.1 LMS-facing
- 3.5.2 External task processor -facing
- 3.5.3 Shared
- 4 Producers
- 5 Consumers
- 5.1 Pull
- 5.2 Push (Removed)
- 6 XQueue-Watcher
- 7 Future state / wish list
- 7.1 LMS
- 7.2 XQueue Core
- 7.3 Task processors
- 8 Resources
- 8.1 Source code
- 8.2 Related documentation
- 8.3 Diagram source
TL;DR
XQueue is a simple micro-service that sits between LMS and external task processing services (generally, “pull graders”).
LMS submits grading requests to XQueue. XQueue holds on to each submission until a relevant pull grader asynchronously picks it up, executes it, and pushes the result back to XQueue. XQueue then pushes the result back to LMS.
XQWatcher is edX’s canonical implementation of a pull grader, although other pull grader implementations can and do exist.
Role
XQueue is an asynchronous task processing service that currently handles three types of LMS tasks, in order of most to least utilized:
1. Grading requests for CodeResponse problems
2. Input validation requests for MatlabInput problems (similar to grading requests, except these aren't actually graded)
3. PDF certificate generation requests for the Certificates app (deprecated - in the process of removing)
Its primary use case by far, and the one for which it was designed, is code grading or "autograding" (#1). In general, each CodeResponse problem will have a different grading backend and subsequently a separate queue, so it's possible for XQueue to end up with many "queues under management."
Design
Dependencies
XQueue, of course, receives all of its tasks from the LMS, and sends all results back to the LMS.
XQueue also requires that, for each active task queue, some service exists that is generally available to process and remove tasks from the queue. In the event of a task consumer outage, XQueue will happily hold onto tasks indefinitely, allowing task consumers to resume processing once they're back online. That said, there is the obvious risk of XQueue filling up its queue capacity either during consumer downtime or if a consumer is simply too slow.
Technology stack
Under the hood, XQueue is a Python Django web application with whatever relational database (e.g. MySQL) and file storage backends (e.g. Amazon S3) you like to use with Django. (File storage is used for message payloads.)
XQueue uses its relational DB to store metadata about queued submissions - the actual submissions are stored in S3.
Summary diagram
Authentication
XQueue has a separate user model that has no connection to edx-platform users. Service accounts must be created directly with XQueue, and clients must use basic (session) auth to communicate with it.
API Endpoints
There are three groups of HTTP web APIs exposed by XQueue, all of which communicate using JSON.
LMS-facing
/xqueue/submit
Receive a new task
Allowed methods: POST
Format: must have three keys defined in a JSON object in the POST body.
lms_callback_url: the exact location where XQueue should POST the final result of a task
lms_key: an opaque identifier that the caller wants returned in the final callback
queue_name
Any files included in the POST will be dumped into XQueue's configured file storage system.
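As a rough illustration of the three required keys described above, the following sketch builds and validates a submit body. The helper name, the example URL, and the example values are hypothetical. The sketch does not send anything to XQueue, and it does not cover file uploads.

```python
import json

# The three keys the text above says must be present.
REQUIRED_KEYS = ("lms_callback_url", "lms_key", "queue_name")


def build_submit_payload(lms_callback_url, lms_key, queue_name):
    """Return JSON text carrying the three required keys (hypothetical helper)."""
    payload = {
        "lms_callback_url": lms_callback_url,
        "lms_key": lms_key,
        "queue_name": queue_name,
    }
    # Reject empty values early rather than letting the request fail remotely.
    missing = [key for key in REQUIRED_KEYS if not payload[key]]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    return json.dumps(payload)


body = build_submit_payload(
    "https://lms.example.com/xqueue/callback",  # hypothetical callback URL
    "opaque-key-123",                           # hypothetical opaque identifier
    "example-queue",                            # hypothetical queue name
)
print(body)
```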
External task processor -facing
/xqueue/get_queuelen?queue_name={queue}
Get the number of unprocessed tasks in the given queue.
Allowed methods: GET
/xqueue/get_submission?queue_name={queue}
Retrieves a single submission from the given queue. Contains a unique submission identifier.
Allowed methods: GET
/xqueue/put_result
Pushes the result of a completed task back to XQueue.
Allowed methods: POST
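A small sketch of how a task processor might assemble the endpoint URLs listed above. The base URL and the helper name are assumptions for illustration. Only the endpoint paths and the queue_name query parameter come from this page.

```python
from urllib.parse import urlencode, urljoin


def queue_url(base_url, endpoint, queue_name=None):
    """Build an XQueue endpoint URL (hypothetical helper, no network access)."""
    url = urljoin(base_url.rstrip("/") + "/", "xqueue/" + endpoint)
    if queue_name is not None:
        # urlencode escapes characters that are not safe in a query string.
        url += "?" + urlencode({"queue_name": queue_name})
    return url


BASE = "https://xqueue.example.com"  # hypothetical host
print(queue_url(BASE, "get_queuelen", "example-queue"))
print(queue_url(BASE, "get_submission", "example-queue"))
print(queue_url(BASE, "put_result"))
```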
Shared
/xqueue/login
/xqueue/logout
/xqueue/status
Producers
The edx-platform codebase contains an XQueue driver defined within CAPA (/common/lib/capa/capa/xqueue_interface.py) that interacts directly with the XQueue LMS endpoints. The CodeGrader and MatlabInput tasks are part of the CAPA codebase and directly use this driver.
The unique key generated by the LMS, in order to correlate each submission with a given activity, is a hash of course, user, usage key, and the current time.
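A minimal sketch of such a key, assuming SHA-256 and a simple delimiter-joined input. This page does not say which hash function or field encoding the LMS actually uses, so the function name, the algorithm, and the example identifiers are all illustrative.

```python
import hashlib
import time


def make_lms_key(course_id, user_id, usage_key, now=None):
    """Hash course, user, usage key, and time into an opaque key (assumed scheme)."""
    now = time.time() if now is None else now
    material = "|".join([course_id, str(user_id), usage_key, repr(now)])
    # SHA-256 is an assumption; the source only says "a hash".
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


key = make_lms_key(
    "course-v1:Example+101+2024",        # hypothetical course id
    42,                                  # hypothetical user id
    "block-v1:Example+type@problem+p1",  # hypothetical usage key
)
print(key)
```

Including the current time means two submissions by the same learner to the same problem get different keys.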
Certificate generation and cert updates also use the CAPA driver directly (/lms/djangoapps/certificates/queue.py).
Consumers
XQueue was designed to handle two patterns of task consumers, pull and push. Push consumers have since been deprecated and removed due to the negative performance & scalability implications they had on XQueue.
Pull
Also called "active," these graders are constantly polling XQueue to see if there are new submissions to be processed. XQWatcher is a pull-grader implementation. We recommend using pull graders where possible.
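The pull pattern can be sketched with an in-memory stand-in for XQueue. FakeXQueue and drain are hypothetical names. A real grader would call the get_submission and put_result HTTP endpoints and would keep polling rather than stop when the queue is empty.

```python
class FakeXQueue:
    """In-memory stand-in for XQueue; not the real HTTP client."""

    def __init__(self, submissions):
        self._pending = list(submissions)
        self.results = {}

    def get_submission(self):
        # Return the next submission, or None when the queue is empty.
        return self._pending.pop(0) if self._pending else None

    def put_result(self, submission_id, result):
        self.results[submission_id] = result


def drain(client, grade):
    """Pull submissions until none remain; return how many were handled."""
    handled = 0
    while True:
        submission = client.get_submission()
        if submission is None:
            break
        client.put_result(submission["id"], grade(submission["body"]))
        handled += 1
    return handled


client = FakeXQueue([{"id": 1, "body": "print(1)"}, {"id": 2, "body": ""}])
count = drain(client, lambda body: {"correct": bool(body)})
print(count, client.results)
```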
The below diagram illustrates a pull request flow.
Push (Removed)
Also called "passive," these graders exposed a web interface that XQueue called when a new submission was available for processing. XServer was a push-grader implementation. We moved away from the use of push graders in favor of more general-purpose XBlock integrations (e.g. an LTI consumer).
The below diagram illustrates a push request flow.
XQueue-Watcher
XQueue-Watcher (a.k.a. xqueue-watcher or just xqwatcher) is edX’s implementation of the XQueue pull-grader interface. It is not the only implementation of said pull-grader interface: there are different institutions that maintain their own forks of XQWatcher or have even written their own pull-grader implementations. However, XQWatcher is the only pull-grader that we maintain.
XQWatcher is a Flask-based Python application that polls XQueue for code submissions and executes them against course-team-authored graders. A grader runs the student submission against a series of tests and compares the results to those obtained by running the same tests against a course-team-authored exemplar submission. A single XQWatcher instance runs multiple threads that all listen to a single XQueue queue. Each thread waits for a single submission on the queue, responds to it, and then waits for another, so the number of threads limits how many submissions can be graded at once. Another consequence of this design is that a separate XQWatcher instance needs to be deployed for each course, with a configuration specific to that course.
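The threading model can be sketched with the standard library, using a local queue.Queue in place of the remote XQueue queue. This is not XQWatcher's actual code: run_workers is a hypothetical name, and the workers here exit when the queue is empty instead of waiting for more work.

```python
import queue
import threading


def run_workers(submissions, grade, num_threads=2):
    """Grade (id, body) pairs with a fixed pool of threads sharing one queue."""
    pending = queue.Queue()
    for item in submissions:
        pending.put(item)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                submission_id, body = pending.get_nowait()
            except queue.Empty:
                return  # a real watcher would wait and poll again
            result = grade(body)
            with lock:
                results[submission_id] = result

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


graded = run_workers([(1, "a"), (2, "bb"), (3, "ccc")], grade=len, num_threads=2)
print(graded)
```

With num_threads=2, at most two submissions are being graded at any moment, which is the concurrency limit described above.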
Graders are defined in a course-team-owned git repository, named in the deploy-time configuration for the XQueueWatcher instance. Each problem has its own grader which lives in a separate directory. This directory also contains an answer.py file which contains the course-team authored exemplar submission.
Incoming requests contain the learner's code submission as well as a grader payload, specified by the course team as part of the CAPA code_response XBlock. The grader payload is a JSON object with the key "grader", whose value is the path to a Python file within the course team's grader repo. That file defines a series of tests to run against the submission: some by static analysis of the student's code, and some by executing the code in a sandboxed environment managed by CodeJail.
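A sketch of reading the grader payload and resolving it inside the grader repo. The function name, the example paths, and the rejection of absolute or parent-directory paths are assumptions added for illustration. Only the "grader" key and its meaning come from the description above.

```python
import json
from pathlib import PurePosixPath


def grader_path(grader_payload, repo_root):
    """Resolve the payload's "grader" value to a path under repo_root."""
    payload = json.loads(grader_payload)
    relative = PurePosixPath(payload["grader"])
    # Assumed safety check: keep the resolved path inside the grader repo.
    if relative.is_absolute() or ".." in relative.parts:
        raise ValueError("grader path must stay inside the grader repo")
    return PurePosixPath(repo_root) / relative


path = grader_path('{"grader": "ps01/grade_ps01.py"}', "/graders")  # hypothetical paths
print(path)
```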
Code execution is managed by the JailedGrader class, which marshals together the student submission (saved as submission.py), the problem's grader, and harness code that facilitates the authoring of grader tests. The student submission is graded, and its results are compared to the results of running the same suite of tests against the course-team-authored answer.py.
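The compare-against-exemplar idea can be sketched as follows. This is not the JailedGrader implementation: it runs both functions in-process with no sandboxing, where the real system uses CodeJail, and the function name and result format are hypothetical.

```python
def grade_against_exemplar(student_fn, answer_fn, test_inputs):
    """Run each test input through both functions and compare the outputs."""
    outcomes = []
    for args in test_inputs:
        expected = answer_fn(*args)
        try:
            passed = student_fn(*args) == expected
        except Exception:
            # A crashing submission fails that test instead of stopping the run.
            passed = False
        outcomes.append(passed)
    score = sum(outcomes) / len(outcomes) if outcomes else 0.0
    return {"correct": bool(outcomes) and all(outcomes), "score": score, "tests": outcomes}


def answer(x):
    # Stand-in for the course team's exemplar (answer.py).
    return x * x


def submission(x):
    # Stand-in for a student submission with a bug for negative inputs.
    return x * abs(x)


report = grade_against_exemplar(submission, answer, [(2,), (0,), (-3,)])
print(report)
```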
Future state / wish list
LMS
Stop generating PDF certificates in favor of Web certificates. Easier said than done but this is absolutely the direction we need to go.
Decrease coupling: pull the LMS XQueue interface out of CAPA, or at least remove the code dependency between Certificates and CAPA.
Convert all hard-coded queue names in edx-platform to be configurable. You never know when this could become useful...
XQueue Core
Move away from the push grader model. XQueue should really only exist to handle asynchronous communication with pull graders. For push graders, synchronous communication makes more sense and could be done using a separate XBlock or an LTI tool. (This includes the MatlabInput use case as well, as that's effectively a push grader.)
UPDATE: Push grading has been deprecated.
Remove get_queuelen. I know it's nice to have a "peek" operation but it's not really that interesting. Even XQWatcher doesn't bother with it and instead just tries repeatedly to get_submission until successful.
NOTE: This solution is better than the one proposed here (in response to a security issue): /wiki/spaces/PLAT/pages/49873364
Bulk get submissions for parallel grading (and bulk push back in). Jobs typically come in waves.
Switch from using a separate user store and session auth, to using the edx-platform authentication backend and signed JSON Web Tokens for authorization.
This would require that each queue is mapped to a whitelist of allowed producers and consumers (user authorization), and those clients would need to be appropriately scoped (app authorization).
Possible scopes (just brainstorming): "xqueue:submit" (basically just the LMS), "xqueue:score" (all grader clients). "read/write" and "pull/push" don't really make sense as all XQueue clients need to both add and remove from queues. If we want to keep it simple, we could just have an "xqueue" scope.
Task processors
Consider modifying XQWatcher to have a "slow" poll when the queue is empty, and a "fast" poll when the queue is probably not empty (as in, if the last poll was successful).
XQWatcher is more of an instance of a code grader than a code grader framework. A framework/client library may speed adoption of external graders.
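The slow/fast polling idea from the first item above could look roughly like this. The function name and the delay values are illustrative assumptions, not existing XQWatcher settings.

```python
def poll_schedule(outcomes, fast=0.5, slow=5.0):
    """Given whether each poll found work, return the delay chosen before the next poll."""
    delays = []
    for found_work in outcomes:
        # A successful poll suggests more work is waiting, so poll again soon.
        delays.append(fast if found_work else slow)
    return delays


print(poll_schedule([False, True, True, False]))  # -> [5.0, 0.5, 0.5, 5.0]
```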
Resources
Source code
XQueue: https://github.com/edx/xqueue
Contains the Python code for the Django-based IDA.
Pull grader client: https://github.com/edx/xqueue-watcher
Contains the base Python code for Flask-based XQueue watchers.
The actual graders used on edx.org are maintained by the course authors in separate GitHub repos.
One example of a (private) GitHub repo where MIT graders are kept for one course: https://github.com/mitodl/graders-mit-600x
Push grader client: https://github.com/edx/xserver
This repo is now archived and is read-only. Push graders are now deprecated and removed.
Related documentation
Diagram source
DSL for https://www.websequencediagrams.com :
Pull Flow
title Pull Flow
participant LMS
participant XQueue
participant XQWatcher
XQWatcher-->XQueue: Poll for submissions
note left of LMS: Initiate a request\n(asynchronously)
LMS->XQueue: Submit request to queue
XQWatcher-->XQueue: Poll for submissions
XQWatcher<->XQueue: Get next submission
note right of XQWatcher: Do some work\n(e.g. grade code)
XQWatcher->XQueue: Push result back
XQueue->LMS: Send result via callback
note left of LMS: Display result
XQWatcher-->XQueue: Poll for submissions
Push Flow
title Push Flow
participant LMS
participant XQueue
participant XServer
note left of LMS: Initiate a request\n(asynchronously)
LMS->XQueue: Submit request to queue
XQueue->XServer: Submit to grader\n(blocking)
note right of XServer: Do some work\n(e.g. grade code)
XServer->XQueue: Respond with result
XQueue->LMS: Send result via callback
note left of LMS: Display result