Feature Overview

Page Properties


Category: Video Service
Documentation: LINK TBD
User Impact: As a course instructor I am able to encode my course videos for delivery across all edX viewing platforms.






The edX video pipeline is a video encoding and delivery IDA intended to enable rapid, scalable course video encoding across all edX viewing platforms. edx-video-pipeline handles all video uploaded through the edX Studio course video upload page for edx.org courses, all content generated by the in-house post-production department, and all course marketing videos.

edx-video-pipeline delivers to YouTube, AWS, and several transcription services. There is also a limited 'review/revision' workflow for in-house productions that can be triggered to complete via the video production planning software. Each individual workflow is completely modular, and each course team can deliver to specific endpoints (or accounts within endpoints) independently. In addition, edx-video-pipeline handles marketing video automation (as much as is possible given the current 'about page' workflow) via a native upload tool.

edx-video-pipeline and edx-video-worker encode video via a forked build of ffmpeg, using a cluster of Celery workers to run the actual encoding jobs, with a central node acting as the interface to the various intake and delivery providers. The system is, effectively, a Django app sending tasks to be consumed by a cluster of Celery workers.
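The arrangement described above, a central node producing encode tasks for a pool of workers, can be sketched with the standard library alone. This is a minimal illustration, not the pipeline's actual code: the profile names, the `encode` and `dispatch` functions, and the in-process queue standing in for the Celery broker are all assumptions, and no ffmpeg call is made.

```python
import queue
import threading

# Hypothetical encode profiles; the pipeline's real profile names are not given here.
PROFILES = ["desktop_mp4", "mobile_low", "hls"]


def encode(video_id, profile):
    """Stand-in for an ffmpeg encode job; returns a label for the output."""
    return f"{video_id}.{profile}"


def worker(tasks, results):
    """Consume tasks from the queue until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        video_id, profile = task
        results.append(encode(video_id, profile))
        tasks.task_done()


def dispatch(video_ids, num_workers=2):
    """Central-node role: enqueue one task per (video, profile) pair."""
    tasks = queue.Queue()
    results = []  # list.append is thread-safe in CPython
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for video_id in video_ids:
        for profile in PROFILES:
            tasks.put((video_id, profile))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    return sorted(results)
```

Because each task is independent, adding workers in this sketch only changes `num_workers`; the dispatching side does not need to know how many consumers exist.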

Pipeline:


...

Milestone: HLS

HLS (HTTP Live Streaming) is a method of delivering video to users wherein the video player uses connection speed to determine the best-quality stream that is deliverable with no, or minimal, buffering. The video quality is re-evaluated at set time intervals (usually 10 seconds) during playback, improving or degrading as the connection speed improves or degrades.
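The player-side selection described above can be illustrated with a short sketch. It is a simplification under stated assumptions: the bitrate ladder, the `safety` margin, and the function names are hypothetical, and real players use more elaborate throughput estimation than a single sample per segment.

```python
# Hypothetical bitrate ladder in bits per second; real values depend on the encode profiles.
VARIANTS_BPS = [400_000, 1_200_000, 3_000_000, 6_000_000]


def pick_variant(measured_bps, variants=VARIANTS_BPS, safety=0.8):
    """Return the highest variant bitrate that fits within a safety margin
    of the measured throughput, falling back to the lowest variant."""
    budget = measured_bps * safety
    fitting = [v for v in sorted(variants) if v <= budget]
    return fitting[-1] if fitting else min(variants)


def playback_plan(throughput_samples):
    """Choose a variant for each segment from per-segment throughput samples."""
    return [pick_variant(sample) for sample in throughput_samples]
```

For example, `playback_plan([8_000_000, 2_000_000, 100_000])` steps down from the 6 Mbps variant to 1.2 Mbps and then to the 400 kbps floor as the measured connection speed drops.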

...

HTTP Live Streaming milestones are broken down into 'phases' to match concurrent work being done in edx-platform on both the video player and within Studio. 

Current State:

Upon creating a new course instance in Studio, individual course teams must request (via email or support ticket) that a support team member enable the video upload page. Each course has a custom workflow, an individual associated YouTube channel, and an optional transcription service provider. All of these options (and associated authentication credentials) are handled by edx-video-pipeline. In addition, each partner institution has a YouTube partner account that requires manual intervention from Google to implement. Prior to October 2016, edX enjoyed a concierge level of support from Google, and was able (with the assistance of a dedicated Media Support Specialist) to enable these institutions and courses in relatively short order.

...

  1. Enable Studio course instance (edX staff)
  2. Activate dedicated YouTube channel (Course team)
  3. If new partner, request YouTube partner CMS from Google (edX staff, Google staff)
  4. Associate course YouTube channel with YouTube partner CMS (edX staff)
  5. Generate edx-video-pipeline token (edX staff)
    1. Add associated workflow information (Review process, Transcription)
  6. Input edx-video-pipeline token into Studio Advanced Settings, enabling video upload tool (edX staff)

The current goal is to eliminate steps 2-6 through a series of compartmentalized upgrades to edx-video-pipeline and edx-platform, in parallel with the work being done to enable HLS. The plan is currently comprehensive for Phase I, with more ambiguity around Phases II and III, allowing for improvisation and an updated course of action as more data is gathered.


...

Phase I:

  • Enable HLS

  • Automate Video Upload Enablement in Studio

  • Begin phase-out for edx-video-pipeline Transcription workflow

Category: Feature
Documentation: TBD
User Impact:

As a learner I am able to watch content optimized for my level of bandwidth access, whether or not I am in a YouTube-embargoed country.

As a course team member I am able to deliver content across viewing platforms in an automated way with minimal intervention and without having to contact a support team for assistance. If I have a course currently in production, I will not experience a change in my workflow, and if I am an existing partner, I can request a customized workflow.

Proposal: 'Default Mode'

Enable a 'default' workflow in Studio, which is an HLS/Mobile only workflow with no customizations. Currently running courses would remain unchanged, and this workflow could be overridden by a custom edx-video-pipeline token. Every new course instance would have this default token, and new partners would not have the option of overriding this default workflow. edx-video-pipeline would then not differentiate between courses/institutions for this default workflow, instead tracking videos as individual assets rather than attempting to assign custom workflows based on course instances.
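The override rule in this proposal can be stated compactly in code. The sketch below is illustrative only: `DEFAULT_TOKEN`, its value, and `resolve_token` are hypothetical names, and how tokens are actually stored or validated is not covered here.

```python
from typing import Optional

# Hypothetical placeholder; the real shared token value is not specified in this document.
DEFAULT_TOKEN = "default-hls-mobile"


def resolve_token(custom_token: Optional[str], is_new_partner: bool) -> str:
    """Return the workflow token a course should use.

    New partners always get the default workflow; other courses keep
    a custom token if they have one, else fall back to the default.
    """
    if is_new_partner or not custom_token:
        return DEFAULT_TOKEN
    return custom_token
```

Under this rule a course already in production with a custom token sees no change, while every new course instance resolves to the shared default without staff intervention.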

...

In addition, HarvardX is our only partner currently availing itself of our transcription workflow, and we should begin discovery around whether or not we wish to automate this workflow further. My recommendation, if we wish to automate, is a studio-based solution from a subset of known vendors and a simplified file-handling workflow. Most vendors will push eagerly to an endpoint (such as AWS S3) when transcripts are ready, and there are opportunities for real improvement in this area.

Work needed: Studio

  • Changes to the Studio advanced settings template
    • Add a 'default' shared token, automatically enabling the video upload page for all new courses.
  • Thumbnail handling for individual videos
    • Optional video thumbnail upload
    • Selection of one of three automatically generated (by edx-video-pipeline) thumbnails served from the video streaming S3 bucket.
  • edx-platform video player HLS playback
  • Transcription workflow discovery.
    • Infrastructure plan
    • Vendor acquisition, planning
    • UX changes

Work Needed: edx-video-pipeline

  • Activate HLS encoding
    • Test and upgrade validation for failed or suboptimal encodes
    • Determine optimal object invalidation for CloudFront-cached transport stream manifests, in the case of failed encode streams.
      • Option A: Cache rename (check header)
      • Option B: Max-Life
  • Encode and revalidate legacy video objects
    • Encode for HLS
    • (optional) Re-encode legacy streams, optimize object storage
    • Delete unused objects from AWS
  • Upgrade operational support
    • Terraform plan upgrade
    • NewRelic alerting upgrade
  • Enable shared-token default course workflow
    • Track new course objects via VAL ID
  • Optimize database tables
  • Video thumbnail workflow/discovery

Work Needed (as prep for Phase II):

  • Improved tracking for encode product/resultant parameters 
  • Improved stats for video overhead/bandwidth switching.
  • Socialization of YouTube deprecation, socialization of edx-video-pipeline transcription workflow deprecation.

Unanswered Questions:

  • What will be the reduction in support ticket volume from this one step?
  • Do our partners get value from having their videos on YouTube?
  • What is the optimal number of transport streams (bandwidth switching options)?


...

Phase II:

  • Deprecate YouTube

  • Deprecate transcription services

  • MAYBE: Staged rollout of Automated Transcription Services.

Category: Feature
Documentation: TBD
User Impact:

As a learner I am able to watch content optimized for my level of bandwidth access, whether or not I am in a YouTube-embargoed country.

As a course team member I am able to deliver content across viewing platforms in an automated way with NO manual intervention and without having to contact a support team for assistance.

With the work in place from Phase I, very little new building needs to occur in edx-video-pipeline. As we stop uploading and validating videos to/from YouTube, that encode stream and module is simply deprecated and, eventually, deleted. There are opportunities to do some further database cleaning, and to delete some code from the edx-video-pipeline repo. Some discovery around the marketing video roadmap should be done as well.

...

  • Deactivate course token metadata
  • Deactivate Studio Advanced Setting for video upload token
  • IF WE DECIDE TO AUTOMATE TRANSCRIPTION:
    • Simplified transcription upload workflow and file handling 
      • AWS S3 storage/serving of static transcript assets
      • Eager push of completed assets from vendors
    • Credential storage and handling.
    • UX changes

Work Needed: edx-video-pipeline

  • Deactivate token validation from edx-studio
  • Deactivate YouTube encoding and YouTube upload
  • Drop course table
  • (Optional) Enable ML Encode optimization
  • Deactivate Transcription workflow

Work Needed (as prep for Phase III):

  • Generate and test ML algorithm for determining optimal encoding schema.

Unanswered Questions:

  • What is the operational need for transcription automation?
  • Should we handle non-English transcriptions? How many languages and service providers should we interface with?


...

Phase III:


Proposal 1: Decentralizing edx-video-pipeline

If we were to completely eliminate custom workflows, with no service to YouTube or transcription providers, edx-video-pipeline could become a decentralized cluster of workers, consuming a queue with tasks generated by Studio uploads. A separate Django instance dedicated to edx-video-pipeline would be redundant, and we could eliminate the central edx-video-pipeline control node entirely, which would greatly reduce both operational and code complexity. This would, in addition, create opportunities to easily bundle edx-video-pipeline with devstack and open instances, and allow for an easily scaled edx-video-pipeline 'swarm' to consume the video tasks generated by Studio.

...