Comment: ops details moved to internal space
Info

For more general architectural details: XQueue Architecture

Development

So you need to develop/troubleshoot XQueue and its external grading? This page contains helpful details.

xqueue

A devstack xqueue Docker container is available, which supports the same development workflow as other Python/Django IDAs.

To run xqueue in devstack, use: make dev.up.xqueue

xqueue-watcher

Local Development

Here are the steps required to get a functional xqueue-watcher running locally (without codejail):

...

  • Provide a Docker devstack xqueue-watcher container

    • This container would be built with:

      • the same Python virtualenvs used in production

      • the same codejail environments (using AppArmor)

      • the ability to use arbitrary grading repos/code in the container

  • Make xqueue-watcher a Python module

    • Currently, several xqueue-watcher files are duplicated across grader repos, which is the very problem that Python modules are made to solve.

    • Have each grader install the xqueue-watcher Python module and derive its graders from the common code.

    • Would include a script/method for testing all grader code written by a course team.

      • Accepts a configuration file which specifies all tests to run.

        • Code to grade along with the expected results.

  • Define a clear best practice for writing graders.

    • Create an edX-written grader repo which demonstrates all the best practices.
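
The configuration-driven test script proposed above could look something like the minimal sketch below. Everything in it is hypothetical: the JSON layout, the field names (`tests`, `name`, `submission`, `expected`), and the `toy_grader` function are assumptions made for illustration and are not part of xqueue-watcher.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical config format: each test pairs code to grade with the expected result.
EXAMPLE_CONFIG = """
{
  "tests": [
    {"name": "prints greeting",
     "submission": "print('hello world')",
     "expected": {"correct": true, "score": 1}},
    {"name": "wrong output",
     "submission": "print('goodbye')",
     "expected": {"correct": false, "score": 0}}
  ]
}
"""


def toy_grader(submission: str) -> Dict[str, object]:
    """Stand-in grader (an assumption): inspects the submission text, does not execute it."""
    correct = "hello world" in submission
    return {"correct": correct, "score": 1 if correct else 0}


def run_grader_tests(
    config_text: str, grade: Callable[[str], Dict[str, object]]
) -> List[Tuple[str, bool]]:
    """Grade every configured submission and report whether the result matched."""
    config = json.loads(config_text)
    results = []
    for case in config["tests"]:
        actual = grade(case["submission"])
        results.append((case["name"], actual == case["expected"]))
    return results


if __name__ == "__main__":
    for name, passed in run_grader_tests(EXAMPLE_CONFIG, toy_grader):
        print(f"{'PASS' if passed else 'FAIL'}: {name}")
```

A course team would swap `toy_grader` for its real grader entry point and keep the configuration file next to the grader code.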

Operations

So how is xqueue/xqueue-watcher run in production?

XQueue

Production Environment

prod-edx

prod-edx uses an XQueue autoscaling group (ASG) backed by a single MySQL instance.

The ASG is shown here:

https://console.aws.amazon.com/ec2autoscaling/home?region=us-east-1#/details/prod-edx-xqueue-v101?view=details

The EC2 instances can be viewed here:

https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instanceState=running;search=prod-edx-xqueue

All possible XQueue queue names are listed here.

New queue names can be added using the process described on this page:

[How To] Adding a new xqueue queue

The queue names end up in a Django setting in XQueue, and submissions are accepted only for those named queues, as shown here:

https://github.com/edx/xqueue/blob/91b1ccb0965e1ad2ded5d090c7569bc5a34a1f5d/submission_queue/ext_interface.py#L53-L54
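
The allow-list behavior can be sketched as follows. This is an illustration only, not the XQueue source: the name `ALLOWED_QUEUES`, the second queue name, and the `(accepted, message)` return shape are assumptions. The linked lines show the actual implementation.

```python
from typing import Iterable, Tuple

# Hypothetical stand-in for the Django setting that holds the known queue names.
ALLOWED_QUEUES = frozenset({"Watcher-MITx-6.00x", "example-queue"})


def check_queue(
    queue_name: str, allowed: Iterable[str] = ALLOWED_QUEUES
) -> Tuple[bool, str]:
    """Accept a submission only if it targets a known queue name."""
    if queue_name in allowed:
        return True, "queued"
    return False, f"Queue '{queue_name}' not found"
```

A submission addressed to a queue name that has not been added through the process above would be rejected before it is stored.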

prod-edge

No separate XQueue resources are allocated for prod-edge. The prod-edx XQueue instance is used for all prod-edge courses which use external graders. Recent prod-edge submissions can be listed from the XQueue DB on the read replica with this SQL:

select queue_name, lms_callback_url, count(*) from queue_submission where lms_callback_url not like 'https://courses.edx.org%' group by 1, 2 order by 1, 2;

stage-edx

There’s a stage ASG for XQueue used to test changes in the stage-edx environment. These tests exercise only a basic round-trip with minimal grading. There is currently no way to test course authors' grading code on stage-edx.

The ASG is shown here:

https://console.aws.amazon.com/ec2autoscaling/home?region=us-east-1#/details/stage-edx-xqueue-v230?view=details

The EC2 instance can be viewed here:

https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instanceState=running;search=stage-edx-xqueue

Deployment

XQueue is deployed using GoCD. Its pipelines are here:

https://gocd.tools.edx.org/go/pipelines#!/xqueue

Monitoring

Course Usage Statistics

A common question is “What courses currently have XQueue-graded problems?” There are a couple of ways to determine the answer.

DB Submissions Queues

Using the queries listed below under XQueue Database (Useful SQL), one can determine which submission queues exist and which ones have had recent submissions. However, XQueue submission queue names do not easily map back to course names, as the full course run key isn’t contained in each queue name. So this method doesn’t provide the full answer.

The list of possible queues, which is used only to restrict submissions to known queue names, is here:

https://github.com/edx/edx-internal/blob/2ddae57e6e1def0def50217e89aaf900436e9385/ansible/vars/prod-edx.yml#L1509-L1611

Coursegraph Queries

All of the prod-edx edx.org course data is scraped from the modulestore/MongoDB regularly and placed into a neo4j DB instance here: https://coursegraph.edx.org/browser/

That data can be queried using neo4j queries like the ones detailed on this page: https://openedx.atlassian.net/wiki/spaces/SUST/pages/135102646/CourseGraph+Queries#What-courses-use-xqueue-(code-graders)%3F

A copy of the results of the query for all courses which use XQueue code graders (as of 2021-04-21) is available in a spreadsheet here:

https://docs.google.com/spreadsheets/d/1EEdk6FDGOi6MWZP5yeIEZHptEYMIeNVhsyuE-Lldugo/edit?usp=sharing

XQueue Database

The XQueue Django IDA has a single model: submission. The XQueue database can be viewed via the read replica using /edx/bin/prod-edx-xqueue-mysql-iam-auth.sh:

How to Access a Database from Tools GP (aka the "Read Replicas")

The database contains a few weeks' worth of submissions at any given time. After submissions are graded/completed, they are marked as “retired”.

Retired submissions are deleted from the backing MySQL DB regularly by a Jenkins job which calls a Django management command in the XQueue repository. The current invocation keeps only the last 14 days of retired submissions. Here’s the Jenkins config for that command:

https://github.com/edx/edx-internal/blob/27d5b806fffb867e7a14611fdbe83b99c1258471/ansible/vars/prod-edx.yml#L874-L878

And the Jenkins job can be found here:

https://tools-edx-jenkins.edx.org/job/xqueue/job/prod-edx-delete_old_submissions/
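
The retention rule can be illustrated with a small, self-contained sketch. It is not the management command itself: the `Submission` dataclass, the `purge_retired` helper, and the choice of `arrival_time` as the timestamp compared against the cutoff are assumptions made for illustration. Only the 14-day window and the `retired` flag come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Submission:
    """Minimal stand-in for a row of queue_submission (illustrative only)."""
    id: int
    arrival_time: datetime
    retired: bool


def purge_retired(
    submissions: List[Submission], now: datetime, keep_days: int = 14
) -> List[Submission]:
    """Return the submissions that survive: drop only retired rows older than the cutoff."""
    cutoff = now - timedelta(days=keep_days)
    return [
        s for s in submissions
        if not (s.retired and s.arrival_time < cutoff)
    ]
```

Note that in this sketch an old submission that was never retired is kept, which matches the statement that only retired submissions are deleted.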

Useful SQL

The single interesting non-Django table is queue_submission. Here are various queries to view the queued submissions:

Code Block
language: sql
mysql> describe queue_submission;
+------------------+---------------+------+-----+---------+----------------+
| Field            | Type          | Null | Key | Default | Extra          |
+------------------+---------------+------+-----+---------+----------------+
| id               | int(11)       | NO   | PRI | NULL    | auto_increment |
| requester_id     | varchar(128)  | NO   |     | NULL    |                |
| queue_name       | varchar(128)  | NO   | MUL | NULL    |                |
| xqueue_header    | varchar(1024) | NO   |     | NULL    |                |
| xqueue_body      | longtext      | NO   |     | NULL    |                |
| s3_keys          | varchar(1024) | NO   |     | NULL    |                |
| s3_urls          | varchar(1024) | NO   |     | NULL    |                |
| arrival_time     | datetime      | NO   |     | NULL    |                |
| pull_time        | datetime      | YES  |     | NULL    |                |
| push_time        | datetime      | YES  |     | NULL    |                |
| return_time      | datetime      | YES  |     | NULL    |                |
| grader_id        | varchar(128)  | NO   |     | NULL    |                |
| pullkey          | varchar(128)  | NO   |     | NULL    |                |
| grader_reply     | longtext      | NO   |     | NULL    |                |
| num_failures     | int(11)       | NO   |     | NULL    |                |
| lms_ack          | tinyint(1)    | NO   |     | NULL    |                |
| lms_callback_url | varchar(128)  | NO   | MUL | NULL    |                |
| retired          | tinyint(1)    | NO   | MUL | NULL    |                |
+------------------+---------------+------+-----+---------+----------------+

-- The DB submission queue length goes up and down over time.
-- Completed submissions are marked as "retired" and deleted after a few weeks.

mysql> select count(*) from queue_submission;
+----------+
| count(*) |
+----------+
|   102885 |
+----------+

-- Shows all the submission queues and how many submissions are in each queue.
mysql> select queue_name, count(*) from queue_submission group by 1 order by 2 desc;

-- Shows examples of S3 bucket locations where submissions have been stored.
mysql> select queue_name, s3_keys, s3_urls from queue_submission where char_length(s3_keys) > 3 limit 10;
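
To see which queues have had recent submissions, the per-queue count can be combined with the latest `arrival_time`. The sketch below runs such a query against an in-memory SQLite table so that it is self-contained. The sample rows, the second queue name, and the `recent_queues` and `demo_connection` helpers are made up for illustration. On the read replica, the same `select` would be run from the `mysql` client with the cutoff written as a literal.

```python
import sqlite3

# Queues with at least one submission since the cutoff, most recently active first.
RECENT_QUEUES_SQL = """
select queue_name, count(*) as submissions, max(arrival_time) as latest
from queue_submission
where arrival_time >= ?
group by queue_name
order by latest desc
"""


def recent_queues(conn: sqlite3.Connection, cutoff: str) -> list:
    """Return (queue_name, submissions, latest) rows for queues active since cutoff."""
    return conn.execute(RECENT_QUEUES_SQL, (cutoff,)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory table with only the two columns the query needs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table queue_submission (queue_name text, arrival_time text)")
    conn.executemany(
        "insert into queue_submission values (?, ?)",
        [
            ("Watcher-MITx-6.00x", "2021-04-20 10:00:00"),
            ("Watcher-MITx-6.00x", "2021-04-21 09:30:00"),
            ("example-queue", "2021-01-05 12:00:00"),  # made-up queue name
        ],
    )
    return conn


if __name__ == "__main__":
    for row in recent_queues(demo_connection(), "2021-04-01 00:00:00"):
        print(row)
```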

xqueue-watcher

Production Environment

prod-edx

The EC2 instances running xqueue-watcher graders in production can be found here:

https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instanceState=running;search=xqwatcher

Those EC2 instances are started by/attached to an auto-scaling group here:

https://console.aws.amazon.com/ec2autoscaling/home?region=us-east-1#/details/prod-edx-xqwatcher-ServerAsGroup-v068?view=details

stage-edx

The EC2 instances running xqueue-watcher in stage-edx for simple smoke tests can be found here:

https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:instanceState=running;tag:Name=stage-edx-xqueue

Those EC2 instances are started by/attached to an auto-scaling group here:

https://console.aws.amazon.com/ec2autoscaling/home?region=us-east-1#/details/stage-edx-xqwatcher-ServerAsGroup-v082?view=details

Deployment

xqueue-watcher is built and deployed using GoCD. Its pipelines are here:

https://gocd.tools.edx.org/go/pipelines#!/xqwatcher

Note

WARNING: Currently, whenever the stage xqueue-watcher is deployed via GoCD, the production xqueue-watcher is also built and deployed, without waiting for a verification step on stage-edx.

MIT Grader Integration

Two MIT repositories are built into the xqueue-watcher AMIs deployed by edX:

  • git@github.com:mitodl/graders-mit-600x.git

  • git@github.com:mitocw/graders-mit-7qbwx.git

The Ansible play which is run by GoCD creates a codejail environment for each repository and clones the repository code into it. The configuration which specifies these two courses' queues is here:

https://github.com/edx/edx-internal/blob/9174c1f531e6339756577570c526019234aa8035/ansible/vars/edx.yml#L1279-L1323

NOTE: The release branch of each grader repository above is what gets deployed. Any change merged to the master branch must therefore also be merged into the release branch before it reaches production.

Stage XQWatcher Testing

A simple smoke-test of xqueue-watcher can be performed on stage-edx. To test xqueue-watcher, visit this course problem:

https://learning.stage.edx.org/course/course-v1:edx+xq101+2021_T1/block-v1:edx+xq101+2021_T1+type@sequential+block@Staff_Debug/block-v1:edx+xq101+2021_T1+type@vertical+block@Debug_hello_world

The second problem on the page (named pull grader debug: hello world) is set up to use the Watcher-MITx-6.00x queue. A successful submission to it (whether graded correct or incorrect) demonstrates that xqueue-watcher is operational, though it doesn’t check any problem-specific logic.

Prod XQWatcher Testing

A simple smoke-test of xqueue-watcher can also be performed on prod-edx. To test xqueue-watcher, visit this course problem:

https://courses.edx.org/courses/course-v1:MITx+6.00.1x_8+1T2016/courseware/Staff_Use/Staff_Debug/1?activate_block_id=block-v1%3AMITx%2B6.00.1x_8%2B1T2016%2Btype%40problem%2Bblock%40hello_world

As before, the second problem on the page (named pull grader debug: hello world) is set up to use the Watcher-MITx-6.00x queue. A successful submission to it (whether graded correct or incorrect) demonstrates that xqueue-watcher is operational, though it doesn’t check any problem-specific logic.

Monitoring

  • xqueue-watcher does not send data to New Relic.

  • xqueue-watcher logs are sent to Splunk. The easiest way to find them is to search for a specific queue name, such as: MITx-6.00x

Database

...
