edx-video-pipeline for your Sandbox
STOP. This page is deprecated. Go Here
Configuring Video Uploads [1]
- Start building a sandbox.
- Make sure to build with basic auth off.
- You'll need an AWS account:
- If you're not an admin, you will need to request EC2 access (full), S3 access (full), and a functioning ssh key pair.
- Log in to AWS and go to the S3 storage service.
- Create four buckets.
  - All S3 buckets must have globally unique names, but theme the names roughly for storage, ingest, images, and delivery.
  - Note the bucket names.
- Under the ingest and delivery buckets' Properties > Permissions > Edit CORS Configuration, add this CORS configuration (making sure to change the AllowedOrigin field to the relevant sandbox URL):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <CORSRule>
    <AllowedOrigin>https://${SANDBOX STUDIO URL}</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedMethod>HEAD</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
```
- In your delivery bucket, under 'Properties' > 'Permissions', add Grantee: 'Everyone' and check 'List'.
- If you have created a new AWS account, or wish to limit access, the following steps are recommended:
  - In the AWS IAM security service, add a new user.
  - Add the new user to a new group with (at least) the AmazonS3FullAccess and AmazonEC2FullAccess policies attached.
  - Retrieve the IAM user key and IAM user secret, and note them for later.
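The CORS document from the step above can also be generated programmatically. A minimal sketch using only the standard library; the Studio URL is a placeholder you supply:

```python
import xml.etree.ElementTree as ET

def build_cors_config(studio_url):
    """Build the S3 CORS XML shown above for a given sandbox Studio URL.

    A sketch only: the URL is a placeholder, and you still paste the
    result into the bucket's CORS configuration editor by hand.
    """
    ns = "http://s3.amazonaws.com/doc/2006-03-01/"
    root = ET.Element("CORSConfiguration", xmlns=ns)
    rule = ET.SubElement(root, "CORSRule")
    ET.SubElement(rule, "AllowedOrigin").text = studio_url
    for method in ("GET", "POST", "PUT", "HEAD"):
        ET.SubElement(rule, "AllowedMethod").text = method
    ET.SubElement(rule, "MaxAgeSeconds").text = "3000"
    ET.SubElement(rule, "AllowedHeader").text = "*"
    return ET.tostring(root, encoding="unicode")

xml_doc = build_cors_config("https://studio.example.sandbox.edx.org")
```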
- Once the sandbox has finished building, access your sandbox via ssh.
- Change these settings in /edx/app/edxapp/cms.env.json:

```json
"FEATURES": {
    ...
    "ENABLE_VIDEO_UPLOAD_PIPELINE": true,
    ...
},
```

and

```json
...
"VIDEO_UPLOAD_PIPELINE": {
    "BUCKET": "${S3 bucket name}",
    "ROOT_PATH": ""
},
...
```

(leave ROOT_PATH blank) and

```json
...
"VIDEO_IMAGE_SETTINGS": {
    "DIRECTORY_PREFIX": "video-images/",
    ...
    "STORAGE_KWARGS": {
        "bucket": "${S3 bucket name}",
        "custom_domain": "s3.amazonaws.com/${S3 bucket name}",
        ...
    },
    ...
},
...
```
REMOVED: Change these settings in /edx/app/edxapp/cms.auth.json
- While you're at it, create a superuser (The provided example sets 'staff' as a superuser, which is just fine)
- Restart your servers and workers.
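The cms.env.json edits above can also be scripted. A minimal sketch using only the standard library; the file path and bucket name are placeholders you supply, and it only covers the first two settings:

```python
import json

def enable_video_uploads(path, bucket):
    """Toggle the video-upload settings in a cms.env.json-style file.

    Hypothetical helper: run it against /edx/app/edxapp/cms.env.json
    on the sandbox (with appropriate permissions), then restart.
    """
    with open(path) as f:
        cfg = json.load(f)
    cfg.setdefault("FEATURES", {})["ENABLE_VIDEO_UPLOAD_PIPELINE"] = True
    # ROOT_PATH is deliberately left blank, per the instructions above.
    cfg["VIDEO_UPLOAD_PIPELINE"] = {"BUCKET": bucket, "ROOT_PATH": ""}
    with open(path, "w") as f:
        json.dump(cfg, f, indent=4)
```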
Now log in via web browser (using a staff-access account) to your sandbox Studio interface.
- Either create a unique course or navigate to one of the stock courses.
- In the course you wish to use, navigate to Settings > Advanced Settings.
- At the bottom of the page, in the "Video Upload Credentials" field, add the following configuration (don't neglect the brackets) [2]:

```json
{
    "course_video_upload_token": "xxxx"
}
```
You should now be able to see the Content > Video Upload
option in your sandbox CMS, and see the uploaded videos in your S3 bucket.
- If you haven't, upload a video now. The upload tool accepts only *.mov and *.mp4 files, so you're going to want to use one of those.
- If you don't have a video file, you can use this one. You're going to need it in a moment.
- Check that a bucket object showed up in your upload bucket. Good? Good.
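As a tiny sanity check mirroring the upload tool's accepted formats (the helper name is made up for illustration):

```python
from pathlib import Path

# Per the step above, the upload tool accepts only these containers.
ACCEPTED_SUFFIXES = {".mov", ".mp4"}

def is_uploadable(filename):
    """Return True if the file has an extension the upload tool accepts."""
    return Path(filename).suffix.lower() in ACCEPTED_SUFFIXES
```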
Configuring VAL Access [3]
- If you haven't, access your sandbox via ssh and create a superuser.
- Log in to the django admin (usually ${YOUR_SANDBOX_URL}/admin)
- Go to Oauth2 > Clients.
- Click 'Add Clients' (rounded button, upper right hand corner).
- In the window, add the following information:
- User: Staff (usually pk=5, but the magnifying glass icon can help)
- Name: ${ANY RANDOM STRING} (e.g. 'veda_sandbox')
- URL: ${YOUR_SANDBOX_URL}/api/val/v0
- Redirect uris: ${YOUR_SANDBOX_URL}
- Client ID: autofilled, make note of this
- Client Secret: autofilled, make note of this
- Client Type: Confidential (Web Application)
- Logout URI: (can leave blank)
- Make Note of the Client ID and Client Secret, as these will be needed later
- Remember to Save!
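openveda will later exchange these credentials for an access token at the sandbox's /oauth2/access_token endpoint. As a rough sketch of the form body such a request carries — the field names here follow the standard OAuth2 password grant and are an assumption, so verify against your sandbox before relying on them:

```python
from urllib.parse import urlencode

def build_token_request(username, password, client_id, client_secret):
    """Build the URL-encoded form body for an OAuth2 password-grant
    token request (field names assumed, not confirmed against VEDA)."""
    return urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
        "client_id": client_id,
        "client_secret": client_secret,
    })
```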
Configuring openveda
We're going to run a lightweight version of edx-video-pipeline, with HLS support on a dedicated EC2 instance, using the subdirectory of the s3 bucket as a kind of "watch folder". This assumes some basic level of competency in using AWS EC2, and is in no way exhaustive.
- Launch an EC2 instance.
- Any Linux AMI will do, we recommend the free-tier eligible AWS linux instance (ami-0b33d91d).
- Be sure to have more than 6 GB of storage or so. The max upload size for a file is 5 GB, so you just need a li'l extra to deal with the various unmentionables in the codebase.
- Any size instance is acceptable, just be prepared to sit around and wait a little longer for your completed encodes if you go small.
- For production VEDA we use the g2.2XL GPU nodes, (many $$$) but for the purposes of a sandbox a t2.micro is just fine.
- Click through until you get to "Step 6: Configure Security Group".
- Allow ssh access from your IP
- Launch!
- ssh into your newly launched instance.
Get the machine ready and clone the necessary repos:
```shell
sudo yum -y update
sudo yum -y install gcc
sudo yum -y install git
git clone https://github.com/yro/openveda
git clone https://github.com/yro/v_videocompile
git clone https://github.com/yro/vhls
```
Download and compile a static build of ffmpeg:
```shell
cd v_videocompile
sudo python setup.py install
v_videocompile
```
Install the HLS dependencies:
```shell
cd ~/vhls
sudo python setup.py install
```
Finally, install openveda:
```shell
cd ~/openveda
# we're going to be altering the configurations, so this modified command:
sudo python setup.py develop
```
Now we're ready to configure. Remember all of those keys, secret keys, passwords, and access tokens I asked you to take a note of? Now is where they shine. Openveda has a config wizard that should walk you through it:
```shell
openveda -config
```
Follow the prompts. You will need the following information:
- VEDA Working Directory (leave blank)
- AWS S3 Storage Bucket Name
- Studio Ingest Bucket Name (this says 'optional', but for you it is not)
- AWS S3 Delivery Bucket Name (streaming bucket)
- AWS S3 Images Bucket Name
- VAL Token URL (this is also not optional) < "https://{YOUR_SANDBOX_URL}/oauth2/access_token"
- VAL API URL < "https://{YOUR_SANDBOX_URL}/api/val/v0/videos"
- VAL Username (This will be 'staff' if you have set staff as the authorized VAL user)
- VAL Password (Your authorized VAL user's studio password)
- VAL Client ID
- VAL Secret Key
You should be able to run (and receive):
```shell
python ~/openveda/openveda/pipeline/pipeline_val.py
# And then you'll get:
# ${AN_ACCESS_TOKEN}
```
Now run (while in session):
```shell
openveda -i
```
You should see a couple of terminal progress bars, as the video encodes to its various endpoints.
Once it's done (and if you're using a t2.micro instance, maybe go get a snack or something while you're waiting), you should be able to check back and see the video in Studio. You might need to refresh the page. That's a functional video pipeline!
Running edx-video-pipeline in the Background [4]
While ssh'd in to your EC2 instance, run:
```shell
nohup openveda -i &> openveda.out &
```
Now you can log out and walk away. Don't forget to terminate your instance when you're done.
The ingest loop is fragile, and isn't defensive against connectivity issues, but you don't expect to handle a lot of traffic, right? Right.
This has no provisions for monitoring, but should be fairly straightforward (and verbose) if something goes wrong.
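Since the ingest loop isn't defensive against connectivity issues, one workaround is a small supervisor that restarts it when it dies. A hypothetical sketch (the retry count and back-off are arbitrary, and openveda itself has no supervision hooks):

```python
import subprocess
import time

def supervise(cmd, max_retries=3, backoff=1.0):
    """Re-run cmd until it exits cleanly or fails max_retries times.

    Returns True on a clean exit, False after giving up.
    """
    failures = 0
    while True:
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return True
        failures += 1
        if failures >= max_retries:
            return False
        time.sleep(backoff)  # brief pause before restarting

# Usage (don't run this here; it blocks until the ingest loop dies):
# supervise(["openveda", "-i"], max_retries=5, backoff=30)
```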
Not working?
Let's check the logs.
Logs are accessible here:
Cleanup
Terminate your EC2 instance, delete your buckets, and you're done!
Errors
Openveda should have some limited provisions for basic user errors. If you see something that doesn't seem right, don't be shy to point it out.
Notes:

1. Adapted from the edx-platform wiki.
2. The course_video_upload_token can be any non-null string. It matters in production, where we need to differentiate between thousands of video workflows; on the sandbox we're simply using a single workflow, so any non-null string will do.
3. This is extremely bad and shameful. Do NOT use this in a production setting. You should follow the instructions of your friendly devops engineer.
4. Don't forget to terminate your EC2 instance!