edx-video-pipeline for your Sandbox: Terraform Edition

STOP. This page is deprecated. Go Here

Build and Configure a Sandbox [1]

  1. Start a sandbox building
    1. Make sure to build with basic auth off.
  2. While that's baking, you'll need an AWS account:
    1. If you're not an admin, you will need to request EC2 access (full), S3 access (full), and a functioning ssh key pair.

Once the sandbox has finished building, access your sandbox via ssh.
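
For example (${YOUR_USERNAME} being whichever account your SSH key is authorized for on the box):

ssh ${YOUR_USERNAME}@${YOUR_SANDBOX_DNS}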

  1. Change these settings in /edx/app/edxapp/cms.env.json

    "FEATURES": {
        ...
        "ENABLE_VIDEO_UPLOAD_PIPELINE": true,
        ...
    },

    and

    ...
    "VIDEO_UPLOAD_PIPELINE": {
        "BUCKET": "${YOUR_SANDBOX_DNS}-vedaupload",
        "ROOT_PATH": ""  // leave this blank
    },
    ...

    and

    ...
    "VIDEO_IMAGE_SETTINGS": {
            "DIRECTORY_PREFIX": "video-images/", 
            ...
            "STORAGE_KWARGS": {
                "bucket": "${S3 bucket name}", 
                "custom_domain": "s3.amazonaws.com/${S3 bucket name}"", 
                ...
            }, 
            ...
        }, 
        ...
    }
    ...
  2. While you're at it, create a superuser (the provided example sets 'staff' as a superuser, which is just fine).
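
    One minimal way to do that from the sandbox shell, assuming the standard Ansible-built layout (the /edx/bin wrapper scripts and the 'aws' settings module can differ between releases):

    # Promote the existing 'staff' user to superuser via a one-off Django shell command
    echo "from django.contrib.auth.models import User; u = User.objects.get(username='staff'); u.is_staff = True; u.is_superuser = True; u.save()" \
        | sudo -u www-data /edx/bin/python.edxapp /edx/bin/manage.edxapp lms shell --settings=aws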

  3. Restart your servers and workers. (Don't forget to exit from the sudo user first)
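
    On a standard sandbox the LMS/CMS and their celery workers run under supervisor, so something along these lines should work (exact service group names can vary by release):

    sudo /edx/bin/supervisorctl restart edxapp: edxapp_worker: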

  4. Now log in, via web browser (utilizing a staff-access account) to your sandbox studio interface.

    1. Either create a unique course or navigate to one of the stock courses.
    2. In the course you wish to use, navigate to Settings > Advanced Settings 
    3. At the bottom of the page, in the "Video Upload Credentials" field, add the following configuration (don't neglect the brackets). [2]

      {
       "course_video_upload_token": "xxxx"
      }
  5. You should now be able to see the Content > Video Upload option in your sandbox CMS.

Configuring VAL Access [3]

  1. If you haven't already, access your sandbox via ssh and create a superuser.

  2. Log in to the Django admin (usually ${YOUR_SANDBOX_URL}/admin)
    1. Go to Oauth2 > Clients


    2. Click 'Add Clients' (rounded button, upper right hand corner)

    3. In the window, add the following information:
      1. User: Staff (usually pk=5, but the magnifying glass icon can help)
      2. Name: ${ANY RANDOM STRING} (e.g. 'veda_sandbox')
      3. URL: https://${YOUR_SANDBOX_URL}/api/val/v0
      4. Redirect uris: https://${YOUR_SANDBOX_URL}
      5. Client ID: autofilled, make note of this
      6. Client Secret: autofilled, make note of this
      7. Client Type: Confidential (Web Application)
      8. Logout URI: (can leave blank)

    4. Make Note of the Client ID and Client Secret, as these will be needed later

    5. Remember to Save!
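
If you'd like to sanity-check those credentials before terraforming, a password-grant token request against the sandbox should hand back an access token. The endpoint path below assumes the older django-oauth2-provider setup this page describes, so adjust if your release differs:

curl -X POST "https://${YOUR_SANDBOX_URL}/oauth2/access_token" \
    -d "grant_type=password" \
    -d "client_id=${VAL_CLIENT_ID}" \
    -d "client_secret=${VAL_CLIENT_SECRET}" \
    -d "username=${SANDBOX_STUDIO_USERNAME}" \
    -d "password=${SANDBOX_STUDIO_PASS}"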

Terraforming openveda

Clone the openveda repo on your local machine.

git clone https://github.com/yro/openveda
  1. Assume an appropriate IAM role (for example 'veda-admin', though any role with EC2 and S3 privileges will work) via your edx-hub account on the command line.
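
    For example, with the AWS CLI (the role ARN and session name below are placeholders; export the temporary credentials returned in the Credentials block of the response):

    aws sts assume-role \
        --role-arn "arn:aws:iam::${YOUR_ACCOUNT_ID}:role/veda-admin" \
        --role-session-name veda-sandbox
    # then export AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN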

  2. (If you don't have one) Generate an SSH key pair in the AWS Console under EC2 > Network & Security > Key Pairs (while assuming the appropriate IAM role). Retain both the key file and its name.
     
    1. A NOTE ABOUT REGIONS: Your SSH key is region-specific. You can use a region other than us-east-1 (the default at edX Cambridge), but you'll also need to change the aws_region variable in terraform.tfvars.
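
      If you'd rather create the key pair from the command line, the AWS CLI equivalent looks roughly like this (the key name is just an example; --region must match aws_region):

      aws ec2 create-key-pair \
          --region us-east-1 \
          --key-name veda-sandbox-key \
          --query 'KeyMaterial' \
          --output text > ~/.ssh/veda-sandbox-key.pem
      chmod 600 ~/.ssh/veda-sandbox-key.pem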

  3. In openveda/terraform/terraform.tfvars on your local machine, fill in the various keys and URLs you've collected so far.

    // The URL to your sandbox instance, stripped of the 'https://'
    sandbox_dns = "{YOUR_SANDBOX_DNS_PREFIX}" # (Usually a github handle)
    
    // AWS Authentication Information
    aws_region = "us-east-1"
    
    // Full path to your private AWS SSH key
    private_key = "${FULL_FILEPATH_TO_YOUR_KEY}"
    
    // The key pair name, as AWS knows it
    key_pair_name = "${YOUR KEY NAME ACCORDING TO AWS}"
    
    // Your local machine's public IP
    local_ip = "${YOUR_LOCAL_IP}/32"
    
    //VAL Auth Information
    val_client_id = "${VAL_CLIENT_ID}"
    val_secret_key = "${VAL_CLIENT_SECRET}"
    
    // Sandbox Studio Auth Information
    val_username = "${SANDBOX_STUDIO_USERNAME}"
    val_password = "${SANDBOX_STUDIO_PASS}"
    
    

    NOTE: NEVER commit this back into a public repo. This is now full of secrets. 
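
    If you don't know your machine's public IP offhand (for local_ip), one quick way to get it is below; append /32 to whatever comes back. checkip.amazonaws.com simply echoes the caller's address:

    curl -s https://checkip.amazonaws.com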

  4. Now cd into openveda/terraform
    1. terraform plan

      This should run without errors and show you a comprehensive plan of what will be built.

    2. Now you're ready to go. Running terraform apply will create your buckets and worker, provision the worker, and start the ingest daemon.

      terraform apply

You should now be able to upload a video file and have it process (this can be slow, so you might need to get a snack). Eventually its status on the Content > Video Upload page should resolve to a completed state.


Not working?

Let's check the logs. You can find your video pipeline sandbox's EC2 public IP in the Terraform-generated 'terraform.tfstate' file.
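
The state file is plain JSON, so a quick grep will turn it up (public_ip is an attribute of the aws_instance resource):

cd openveda/terraform
grep '"public_ip"' terraform.tfstate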

Logs are accessible here:

ssh -i {{your_aws_ssh_key}} ec2-user@{{video pipeline sandbox's ec2 public IP}}
cat ~/openveda.out


While this EC2 instance is up, do NOT remove the openveda repo on your local machine. Terraform leaves a bunch of state information (in the form of 'terraform.tfstate' files) that it will need when you eventually move on to the next step:

Shut it down [4]

  1. cd into openveda/terraform

    terraform destroy





Notes:

    1. Adapted from the edx-platform wiki
    2. The course_video_upload_token can be any non-null string. It matters in production, where we need to differentiate between thousands of video workflows; on the sandbox we're simply using a single workflow, so any non-null string will do.
    3. This is extremely bad and shameful. Do NOT use this in a production setting. You should follow the instructions of your friendly devops engineer.
    4. Don't forget to terminate your EC2 instance!