Role
XQueue is an asynchronous task processing service that currently handles three types of LMS tasks, in order of most to least utilized:
1. Grading requests for CodeResponse problems
2. Input validation requests for MatlabInput problems (similar to grading requests, except these aren't actually graded)
3. PDF certificate generation requests for the Certificates app (deprecated - in the process of removing)
Its primary use case by far, and the one for which it was designed, is code grading or "autograding" (#1). In general, each CodeResponse problem will have a different grading backend and consequently a separate queue, so it's possible for XQueue to end up with many "queues under management."
Design
Dependencies
XQueue, of course, receives all of its tasks from the LMS, and sends all results back to the LMS.
XQueue also requires that, for each active task queue, some service exists that is generally available to process and remove tasks from the queue. In the event of a task consumer outage, XQueue will happily hold onto tasks indefinitely, allowing task consumers to resume processing once they're back online. That said, there is the obvious risk of XQueue filling up its queue capacity either during consumer downtime or if a consumer is simply too slow.
Technology stack
Under the hood, XQueue is a Python Django web application with whatever relational database (e.g. MySQL) and file storage backends (e.g. Amazon S3) you like to use with Django. (File storage is used for message payloads.)
XQueue uses its relational DB to store metadata about queued submissions - the actual submissions are stored in S3.
Summary diagram
[Lucidchart diagram]
Interfaces
There are three groups of HTTP web APIs exposed by XQueue, all of which communicate using JSON.
Authentication
XQueue has a separate user model that has no connection to edx-platform users. Service accounts must be created directly with XQueue, and clients must use basic (session) auth to communicate.
API Endpoints
LMS-facing
/xqueue/submit
Receive a new task
Allowed methods: POST
Format: must have three keys defined in a JSON object in the POST body.
lms_callback_url: the exact location where XQueue should POST the final result of a task
lms_key: an opaque identifier that the caller wants returned in the final callback
queue_name
Any files included in the POST will be dumped into XQueue's configured file storage system.
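As a minimal sketch of the request format described above, the helper below checks that a JSON-encoded POST body carries the three required keys. The function name `missing_submit_keys` and the sample values are illustrative assumptions, not part of the XQueue codebase.

```python
import json

# The three keys the submit endpoint requires, per the description above.
REQUIRED_KEYS = ("lms_callback_url", "lms_key", "queue_name")


def missing_submit_keys(body: str) -> list:
    """Return the required keys that are absent from a JSON-encoded body."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        # The body must be a JSON object, so every key counts as missing.
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in payload]


# Hypothetical example values, for illustration only.
example_body = json.dumps({
    "lms_callback_url": "https://lms.example.com/callback",
    "lms_key": "opaque-token",
    "queue_name": "example-queue",
})
```

An empty list from `missing_submit_keys(example_body)` means the body is well-formed in the sense described here. Authentication and file uploads are out of scope for this sketch.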
...
External task processor-facing
/xqueue/get_queuelen?queue_name={queue}
Get the number of unprocessed tasks in the given queue.
Allowed methods: GET
/xqueue/get_submission?queue_name={queue}
Retrieves a single submission from the given queue. Contains a unique submission identifier.
Allowed methods: GET
/xqueue/put_result
Pushes the result of a completed task back to XQueue.
Allowed methods: POST
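To make the endpoint shapes above concrete, here is a small sketch that builds the URLs a task processor would call. The helper name `consumer_url` and the host `xqueue.example.com` are assumptions made for illustration. Only the paths and the `queue_name` query parameter come from the list above.

```python
from urllib.parse import urlencode, urljoin


def consumer_url(base_url: str, endpoint: str, queue_name: str = "") -> str:
    """Build the URL for one of the task-processor-facing endpoints."""
    url = urljoin(base_url, "/xqueue/" + endpoint)
    if queue_name:
        # get_queuelen and get_submission take the queue as a query parameter.
        url += "?" + urlencode({"queue_name": queue_name})
    return url
```

For example, `consumer_url("https://xqueue.example.com", "get_submission", "test-pull")` yields the GET URL for a queue named `test-pull`, while `put_result` is called without a queue name.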
Shared
...
/xqueue/login
/xqueue/logout
/xqueue/status
Producers
The edx-platform codebase contains an XQueue driver defined within CAPA (/common/lib/capa/capa/xqueue_interface.py) that interacts directly with the XQueue LMS endpoints. The CodeGrader and MatlabInput tasks are part of the CAPA codebase and directly use this driver.
...
Certificate generation and cert updates also use the CAPA driver directly (/lms/djangoapps/certificates/queue.py).
Consumers
XQueue was designed to handle two patterns of task consumers: pull and push. Push consumers have since been deprecated and removed because of their negative performance and scalability impact on XQueue.
Pull
Also called "active," these graders are constantly polling XQueue to see if there are new submissions to be processed. XQWatcher is a pull-grader implementation. We recommend using pull graders where possible.
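The polling pattern described above can be sketched as a loop over three injected callables. This is not XQWatcher's implementation: `drain_queue` and its parameters are hypothetical names, and the callables stand in for the HTTP calls to `get_submission` and `put_result` and for the grading logic.

```python
from typing import Callable, Optional


def drain_queue(
    get_submission: Callable[[], Optional[dict]],
    grade: Callable[[dict], dict],
    put_result: Callable[[dict], None],
    max_polls: int,
) -> int:
    """Poll up to max_polls times, grading each submission that is found.

    Returns the number of submissions processed.
    """
    processed = 0
    for _ in range(max_polls):
        submission = get_submission()  # stand-in for GET /xqueue/get_submission
        if submission is None:
            continue  # queue was empty; a real grader would sleep before retrying
        put_result(grade(submission))  # stand-in for POST /xqueue/put_result
        processed += 1
    return processed
```

Injecting the callables keeps the loop testable without a running XQueue. A real pull grader would poll indefinitely rather than stop after `max_polls` attempts.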
The below diagram illustrates a pull request flow.
...
Push (Removed)
Also called "passive," these graders exposed a web interface that XQueue called when a new submission was available for processing. XServer was a push-grader implementation. We have moved away from the use of push graders in favor of more general-purpose XBlock integrations (e.g. an LTI consumer).
The below diagram illustrates a push request flow.
...
XQueueWatcher
XQueueWatcher (a.k.a. xqueue-watcher or just xqwatcher) is a Flask-based IDA that polls XQueue for code submissions and executes them against course-team-authored graders. A grader runs the student submission against a series of tests and compares the results to those obtained by running the same tests against a course-team-authored exemplar submission. A single XQueueWatcher instance runs multiple threads that all listen to a single XQueue queue. Each thread waits for a single submission on the queue, responds to it, and then waits for another, so the number of threads limits how many submissions can be graded at once. Another consequence of this design is that a separate instance of XQueueWatcher must be deployed for each course, with a configuration specific to that course.
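The threading model described above, in which several threads serve one queue and each handles one submission at a time, can be illustrated with the standard library. This is only a sketch: an in-memory `queue.Queue` stands in for the XQueue queue, and `run_workers` and `grade` are hypothetical names rather than xqueue-watcher APIs.

```python
import queue
import threading


def run_workers(submissions: "queue.Queue", grade, num_threads: int) -> list:
    """Grade every item in `submissions` using `num_threads` worker threads."""
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                item = submissions.get_nowait()
            except queue.Empty:
                return  # nothing left; a real watcher would keep polling
            graded = grade(item)  # one submission at a time per thread
            with lock:
                results.append(graded)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

At most `num_threads` submissions are being graded at any moment, which is the concurrency limit noted above. The order of `results` depends on thread scheduling.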
...
Code execution is managed by the JailedGrader class, which marshals together the student submission (saved as submission.py), the problem's grader, and harness code that facilitates the authoring of grader tests. The student submission is graded, and its results are compared to the results of running the same suite of tests against the course-team-authored answer.py.
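The comparison step can be sketched as follows. This is not the JailedGrader code and performs no sandboxing: plain Python callables stand in for submission.py and answer.py, and `grade_against_answer` is a hypothetical name.

```python
def grade_against_answer(tests, submission_fn, answer_fn):
    """Run each test input through both functions and compare the outputs.

    Returns (num_correct, per_test), where per_test holds one boolean per test.
    """
    per_test = []
    for test_input in tests:
        expected = answer_fn(test_input)  # result from the exemplar answer
        try:
            actual = submission_fn(test_input)
        except Exception:
            per_test.append(False)  # a crashing submission fails that test
            continue
        per_test.append(actual == expected)
    return sum(per_test), per_test
```

Because the expected values come from running the exemplar answer, grader authors only need to supply test inputs rather than hard-coded outputs.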
...
Future state / wish list
LMS
Stop generating PDF certificates in favor of Web certificates. Easier said than done but this is absolutely the direction we need to go.
Decrease coupling: pull the LMS XQueue interface out of CAPA, or at least remove the code dependency between Certificates and CAPA.
Convert all hard-coded queue names in edx-platform to be configurable. You never know when this could become useful...
XQueue Core
Move away from the push grader model. XQueue should really only exist to handle asynchronous communication with pull graders. For push graders, synchronous communication makes more sense and could be done using a separate XBlock or an LTI tool. (This includes the MatlabInput use case as well, as that's effectively a push grader.)
UPDATE: Push grading has been deprecated. 🎉
Remove get_queuelen. I know it's nice to have a "peek" operation but it's not really that interesting. Even XQWatcher doesn't bother with it and instead just tries repeatedly to get_submission until successful.
NOTE: This solution is better than the one proposed here (in response to a security issue): /wiki/spaces/PLAT/pages/49873364
Bulk get submissions for parallel grading (and bulk push back in). Jobs typically come in waves.
Switch from using a separate user store and session auth, to using the edx-platform authentication backend and signed JSON Web Tokens for authorization.
This would require that each queue is mapped to a whitelist of allowed producers and consumers (user authorization), and those clients would need to be appropriately scoped (app authorization).
Possible scopes (just brainstorming): "xqueue:submit" (basically just the LMS), "xqueue:score" (all grader clients). "read/write" and "pull/push" don't really make sense as all XQueue clients need to both add and remove from queues. If we want to keep it simple, we could just have an "xqueue" scope.
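To make the whitelist-plus-scope idea above more concrete, here is a rough sketch of the check it implies. Everything in it is hypothetical: the `QUEUE_ACCESS` mapping, the client names, and `is_allowed` are illustrations, and verification of the JSON Web Token itself is assumed to happen elsewhere.

```python
# Hypothetical mapping of queue name -> scope -> whitelisted client names.
QUEUE_ACCESS = {
    "example-queue": {
        "xqueue:submit": {"lms"},
        "xqueue:score": {"example-grader"},
    },
}


def is_allowed(queue_name: str, client: str, token_scopes: set, needed: str) -> bool:
    """True if the token carries the needed scope and the client is whitelisted."""
    if needed not in token_scopes:
        return False  # app authorization: the token lacks the scope
    allowed = QUEUE_ACCESS.get(queue_name, {}).get(needed, set())
    return client in allowed  # user authorization: per-queue whitelist
```

With a single "xqueue" scope, the inner mapping would collapse to one set of clients per queue.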
Task processors
Consider modifying XQWatcher to have a "slow" poll when the queue is empty, and a "fast" poll when the queue is probably not empty (as in, if the last poll was successful).
XQWatcher is more of an instance of a code grader than a code grader framework. A framework/client library may speed adoption of external graders.
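The "slow" versus "fast" polling idea above could look something like the sketch below. The function names and the default delays of 1 and 10 seconds are assumptions chosen for illustration, not values taken from XQWatcher.

```python
def next_poll_delay(last_poll_found_work: bool, fast: float = 1.0, slow: float = 10.0) -> float:
    """Pick the delay, in seconds, before the next poll."""
    # A successful poll suggests the queue is probably not empty, so poll again soon.
    return fast if last_poll_found_work else slow


def poll_schedule(outcomes, fast: float = 1.0, slow: float = 10.0) -> list:
    """Map a sequence of poll outcomes (True = got a submission) to delays."""
    return [next_poll_delay(found, fast, slow) for found in outcomes]
```

Since jobs typically come in waves, a watcher using this rule would poll quickly while a wave is being drained and fall back to the slow interval once a poll comes back empty.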
Resources
Open edX source code
XQueue: https://github.com/edx/xqueue
Contains the Python code for the Django-based IDA.
Pull grader client: https://github.com/edx/xqueue-watcher
Contains the base Python code for Flask-based XQueue watchers.
The actual graders used on edx.org are maintained by the course authors in separate GitHub repos.
One example of a (private) GitHub repo where MIT graders are kept for one course: https://github.com/mitodl/graders-mit-600x
Push grader client: https://github.com/edx/xserver
This repo is now archived and is read-only. Push graders are now deprecated.
Related documentation
Diagram source (DSL for https://www.websequencediagrams.com):
Pull Flow
title Pull Flow
participant LMS
participant XQueue
participant XQWatcher
XQWatcher-->XQueue: Poll for submissions
note left of LMS: Initiate a request\n(asynchronously)
LMS->XQueue: Submit request to queue
XQWatcher-->XQueue: Poll for submissions
XQWatcher<->XQueue: Get next submission
note right of XQWatcher: Do some work\n(e.g. grade code)
XQWatcher->XQueue: Push result back
XQueue->LMS: Send result via callback
note left of LMS: Display result
XQWatcher-->XQueue: Poll for submissions
Push Flow
title Push Flow
participant LMS
participant XQueue
participant XServer
note left of LMS: Initiate a request\n(asynchronously)
LMS->XQueue: Submit request to queue
XQueue->XServer: Submit to grader\n(blocking)
note right of XServer: Do some work\n(e.g. grade code)
XServer->XQueue: Respond with result
XQueue->LMS: Send result via callback
note left of LMS: Display result