Video Pipeline 1.5

Video Pipeline 1.0 Successes

After more than a year of enabling version 1.0 of the video pipeline, we have successfully:

  • mobile-enabled over 1100 (out of 1700) courses
  • migrated over 61,000 videos
  • educated and trained old and new course staff to use the Video Upload page in Studio

Video Pipeline 1.0 Issues

After more than a year of field testing this for (1) new courses and (2) reruns of pipelined courses, we have encountered various issues with the pipeline and its required workflow. 

Overview

  • Current video pipeline is an inefficient, manual process. Process promotes waste, non-valued work, and ineffective employee/partner communications.
  • Process relies on risky, ever-changing third-party hosting from YouTube that is not supported by an SLA.
  • Process was created in early days of edX when org had < 5 partners. Given significant increase in partners, the current YouTube workflow is not sustainable or scalable.
  • Current video process is fragile and produces ongoing failures for partners and learners.

Burden on edX Video Team

Some of these issues have been a time-consuming burden on the video team, who have manually worked around them. For example:

  • Creating YouTube CMS Content Owner (CO) accounts per partner
  • Numerous handoffs between media team and partners to create YouTube channels in CMS for each course
  • Manually enabling every course for video upload by adding HEX ID to Studio
  • Resolving frequent YouTube errors due to YT backend changes

Errors experienced by Mobile Learners

Some of these issues have surfaced as learner-facing video problems in the released mobile apps. For example, per MA-2147:

  • Number of videos with only YouTube URLs, and no mobile encodings: 3,411
    • Learners are not able to view these videos within the mobile app.
  • Number of videos with only web-hosted video URLs, and no mobile encodings: 5,287
    • The mobile app will attempt to access these videos, but may continue to spin if the video is large.
  • Number of videos explicitly designated as only-on-web (most likely for licensing reasons): 375
    • Learners are not able to view these videos within the mobile app.

Scenarios causing Course Teams to Misuse the Pipeline

After some analysis, the following set of scenarios summarizes the root causes of the issues:

  1. Course author updates/adds videos after video migration.

    For courses that were migrated by the engineering team using a manual video migration script, the course staff were never trained to use the Video Upload tool.  When they later updated or added videos in the course, they didn't know about the new process and didn't use the Video Upload tool.

  2. Course author updates/adds videos in a Course or in a Course rerun, bypassing the Video Upload tool.

    Similar to #1, except for courses that had been using the Video Upload tool.  In this case, the video was updated/added by a course author who was unaware of, or forgot about, the process.

  3. Course updated/created in another environment (Edge, OLX, Open edX) and then imported into Prod.

    In this case, the course team was using an Open edX environment to experiment and create their course content.  Once they were ready to start their course, they imported their course into studio.edx.org.  However, since there was no video pipeline service available at the time of course creation, the videos in the course were not processed through the pipeline.

  4. Video pipeline integration settings overridden on import on Prod.  (MA-1302)

    When a course copy with empty video pipeline settings (e.g., one exported from an Open edX environment) is imported onto its Production copy, it overrides any previously stored settings in Production.  In this case, video pipeline integration settings are reset and the Video Upload page disappears for that Production course.

  5. Manual entering of Video ID results in incorrect or missed value.

    Since the Video Upload page is separated from the Video Authoring content block, course authors are required to (somehow) find the correct corresponding video and then manually copy and paste its Video ID into the corresponding Video block.  This is understandably an error-prone process.

  6. Manual re-use of Video ID across courses.

    When course authors want to reuse the same video across courses, they may manually enter the Video ID of a video that appears in the Video Upload page of another course.  Although theoretically allowed, the video abstraction layer (VAL) isn't updated to link the video with the other course. This results in a performance issue since a bulk search for all videos in a course will not include those outliers.

    Note/Question: Should we just update the Video Authoring block to update VAL to link to the course whenever a legitimate Video ID is added?

  7. Licensed video not marked as only_on_web.

    In some legitimate cases where a video is not meant to have a downloadable reference (e.g., for licensed content), the intention is to mark the video as `only_on_web`.  However, this is not made clear in the Video Authoring block and so course teams forget to do so.

    Note/Question: Should these videos be marked as 'only_streaming' instead?  Now that we have YouTube support on the mobile apps, they can still be streamed using YouTube; just not downloaded.

Video Pipeline 1.5 Plan

While a long-term plan to reassess the edX Asset Pipeline Strategy is under discussion, we plan to address the above issues through tactical short-term measures.

1. Consolidate and Validate Video Identifiers in Video Authoring UI 

Addresses: Scenarios #6 and #7 above.

Related ticket: MA-385

We want to simplify the Video Authoring block, so that it's clear to course authors that we support only 3 mutually exclusive cases:

  1. YouTube - with YouTube link supplied by the video author.

  2. Video ID - if the video was processed through the edX Video Pipeline workflow for this course (or processed through another course).

    Note: When processed through the Video Pipeline, a video will have multiple encodings, including YouTube, Web, Mobile, Audio-only, etc.  So it would be "nice" for the course author to see the links to the various encodings in the Video Authoring UI once the video is processed (related ticket: MA-150).

    Note 2: Upon saving the Video block, the corresponding video in the Video Pipeline database (VAL) should be updated to have a link to the course, if it doesn't already have one.

  3. Externally Hosted - with an external link supplied by the video author (e.g., video hosted on their own S3 site, if they are not using the edX Video Pipeline).
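The three mutually exclusive cases, together with the link-on-save behavior from Note 2, could be sketched roughly as follows. This is a minimal illustration only: the `VideoBlock` class, its field names, and the `val` mapping (Video ID to a set of course IDs) are hypothetical stand-ins, not the actual Video block or VAL schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class VideoBlock:
    """Hypothetical stand-in for the source fields of a Video block."""
    youtube_url: Optional[str] = None
    video_id: Optional[str] = None
    external_url: Optional[str] = None


def chosen_source(block: VideoBlock) -> str:
    """Return the name of the single source that is set.

    Raises ValueError if zero or more than one source is filled in,
    which is how this sketch enforces mutual exclusivity.
    """
    set_fields = [
        name
        for name in ("youtube_url", "video_id", "external_url")
        if getattr(block, name)
    ]
    if len(set_fields) != 1:
        raise ValueError(f"expected exactly one video source, got {set_fields}")
    return set_fields[0]


def save_block(block: VideoBlock, course_id: str, val: Dict[str, Set[str]]) -> None:
    """On save, link a pipeline video to the course in the VAL stand-in.

    `val` maps Video ID -> set of course IDs (an assumption for illustration).
    """
    if chosen_source(block) == "video_id":
        if block.video_id not in val:
            raise ValueError(f"unknown Video ID: {block.video_id}")
        # Adding to a set is a no-op when the course is already linked.
        val[block.video_id].add(course_id)
```

Linking on save means a video reused from another course would show up in that course's bulk VAL query, which is the gap described in scenario #6.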

Additional Notes

  • If the video pipeline is enabled for the course, display a warning if an option other than "Video ID" is used.
  • When the "Video ID" option is chosen,
    • verify that the provided ID exists in the VAL database.
    • if the ID is not already associated with the course in the VAL database, create the association.
    • ONLY do this validation if Video Pipeline is enabled on the server.
  • To handle existing video blocks that don't follow the mutual exclusivity, use the following preference order, which is used by the mobile apps:
    1. If Video ID is specified, use this field iff the ID is in VAL.
    2. Else if a Hosted URL is specified, use this field.
    3. Else, use the YouTube field.
  • If there are any resulting content issues from the above preference order, course teams will need to fix them, since it affects the mobile apps.
  • Consider removing old YouTube ID fields from UI as well - since they are no longer needed.
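The preference order above for legacy video blocks can be written as a small function. This is a sketch under assumptions: the function name, parameter names, and the `val_ids` set (standing in for a VAL lookup) are hypothetical, and returning `None` when no field is set is our own choice rather than documented behavior.

```python
from typing import Optional, Set, Tuple


def resolve_source(
    video_id: Optional[str],
    hosted_url: Optional[str],
    youtube_id: Optional[str],
    val_ids: Set[str],
) -> Optional[Tuple[str, str]]:
    """Pick the video source for a block that sets more than one field.

    Returns (kind, value), or None if nothing usable is set (assumption).
    """
    # 1. Video ID wins, but only if the ID actually exists in VAL.
    if video_id and video_id in val_ids:
        return ("video_id", video_id)
    # 2. Otherwise fall back to the hosted URL.
    if hosted_url:
        return ("hosted_url", hosted_url)
    # 3. Otherwise fall back to the YouTube field.
    if youtube_id:
        return ("youtube", youtube_id)
    return None
```

Note that a Video ID missing from VAL falls through to the next field rather than raising, matching the "iff the ID is in VAL" wording in step 1.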

2. Simplify Video Linking

Addresses: Scenarios #1, #2, #5 above.

Related ticket: MA-2066

Allow course authors to search for their uploaded videos within the Video Authoring block by providing a type-ahead/searchable dropdown with the file display names and Video IDs of all the uploaded videos associated with the course.  This prevents unnecessary user errors (forgetting, not knowing, incorrect copy and pasting, etc).

Additional Notes

  • Populate the drop-down by querying VAL for all videos associated with the course.
  • Allow Course Authors to still enter or cut-and-paste the Video IDs directly into the video block, in case they didn't upload the video through this course.
  • Follow through with the additional notes captured by #1 above.
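The server-side filtering behind such a type-ahead dropdown might look like the sketch below. The record shape (`display_name`, `video_id` keys), the function name, and the `limit` parameter are assumptions for illustration; the real list would come from a VAL query for the course.

```python
from typing import Dict, List


def search_videos(
    videos: List[Dict[str, str]], query: str, limit: int = 10
) -> List[Dict[str, str]]:
    """Case-insensitive substring match on display name or Video ID."""
    needle = query.strip().lower()
    if not needle:
        # Empty query: show the first few videos so the dropdown is not blank.
        return videos[:limit]
    matches = [
        video
        for video in videos
        if needle in video["display_name"].lower()
        or needle in video["video_id"].lower()
    ]
    return matches[:limit]
```

Matching on both fields lets authors find a video by the file name they remember or by pasting a partial Video ID.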

3. Course Validation

Addresses: Scenarios #3 and #4 above.
Related ticket: MA-736

Although Course Validation can be generalized to other requirements and other features, for the purpose of Video assets, adding a validation step in the Course Publishing/Importing workflow will allow course authors to understand all the video-related issues in their course.  In particular, for courses that are designated as "Mobile Available", reporting all the non-mobile available videos in the course will allow course authors to address and fix those issues.

Validation entry points:

  1. Course Import - Note that a chicken-and-egg problem can arise if the course author is prevented from importing the course without a way to actually fix the course - especially since (1) courses are automatically published upon import and (2) the video pipeline is not available outside of Prod.
  2. Course Publish - Note that, today, the underlying database publishes courses at various times, without an explicit Publish action by the course author.
  3. Manual Validation UI - This may be the best option at this time, given the issues with the above choices.
  4. Manual Script or Command - For a bulk validation report across courses, allow Open edX operators to validate using a script or management command.

Validation results:

  1. Failed
  2. Warnings
  3. Success

It is a policy decision on what types of video issues are actual failures rather than simply warnings.  And it's also a matter of policy on whether we want to allow courses with video issues to be published to edx.org.

Additional Notes

  • To minimize the scope for v1.5
    • go with options 3 and 4 for the validation entry points.
    • output all validation failures and warnings in a spreadsheet, so we don't have to implement a lot of UI features such as sorting, searching, paging, etc.
    • For Scott Dunn: Option 4 can be done first (to provide to Course teams to validate courses), then Option 3 (surface through UI), then Option 1 (to automatically validate on import).
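As a rough sketch of option 4, a script could check each video block and write the failures and warnings to a CSV file that opens in any spreadsheet tool. Everything here is illustrative: the record keys, the course ID string, and in particular which issues count as "Failed" versus "Warnings" are placeholders, since that classification is a policy decision.

```python
import csv
import io
from typing import Dict, List, Set, TextIO, Tuple

FAILED, WARNINGS, SUCCESS = "Failed", "Warnings", "Success"


def check_video(video: Dict[str, object], val_ids: Set[str]) -> Tuple[str, str]:
    """Return (result, message) for one video block (placeholder policy)."""
    video_id = video.get("video_id")
    if video.get("only_on_web"):
        return WARNINGS, "only_on_web: not viewable in the mobile apps"
    if not video_id:
        return WARNINGS, "no Video ID: no mobile encodings"
    if video_id not in val_ids:
        return FAILED, "Video ID not found in VAL"
    return SUCCESS, ""


def write_report(
    course_id: str, videos: List[Dict[str, object]], val_ids: Set[str], out: TextIO
) -> str:
    """Write one CSV row per failure or warning; return the overall result."""
    writer = csv.writer(out)
    writer.writerow(["course_id", "block", "result", "message"])
    overall = SUCCESS
    for video in videos:
        result, message = check_video(video, val_ids)
        if result == SUCCESS:
            continue
        writer.writerow([course_id, video["block"], result, message])
        # A failure always wins; a warning only upgrades a clean result.
        if result == FAILED or overall == SUCCESS:
            overall = result
    return overall


# Example run against an in-memory buffer with made-up data.
buffer = io.StringIO()
overall = write_report(
    "course-v1:Org+Course+Run",
    [
        {"block": "intro", "video_id": "vid-1"},
        {"block": "lecture", "video_id": "vid-404"},
        {"block": "licensed", "only_on_web": True},
    ],
    {"vid-1"},
    buffer,
)
```

Writing plain CSV keeps sorting, searching, and paging in the spreadsheet tool, in line with the goal of minimizing UI work for v1.5.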

4. Move Video Pipeline Settings into server-specific storage

Addresses: Scenario #4 above.

Related ticket: MA-1302

Store the video pipeline credentials for each course in a django-admin accessible VAL database.  This will allow the settings to remain on the server instance and not be overridden when the course is imported from other servers.
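The effect of this change can be illustrated with a small sketch in which pipeline settings live in a server-side store keyed by course ID, and import only replaces course content. The dictionaries, the `video_pipeline` key, and the function names are hypothetical; in practice the store would be a database table exposed through Django admin rather than an in-memory dict.

```python
from typing import Dict

# Hypothetical server-side store: course ID -> pipeline settings.
# Stands in for the django-admin accessible table described above.
SERVER_PIPELINE_SETTINGS: Dict[str, Dict[str, str]] = {}

# Hypothetical course content store: course ID -> course data.
COURSES: Dict[str, Dict[str, object]] = {}


def import_course(course_id: str, imported: Dict[str, object]) -> None:
    """Replace course content, ignoring pipeline settings in the import."""
    content = {
        key: value for key, value in imported.items() if key != "video_pipeline"
    }
    COURSES[course_id] = content


def pipeline_settings(course_id: str) -> Dict[str, str]:
    """Read settings from the server store only, never from course content."""
    return SERVER_PIPELINE_SETTINGS.get(course_id, {})


# Example: settings configured on Prod survive an import from Edge.
SERVER_PIPELINE_SETTINGS["course-1"] = {"course_video_upload_token": "abc123"}
import_course("course-1", {"title": "Demo", "video_pipeline": {}})
```

Because `pipeline_settings` never reads from the imported course data, an import carrying empty settings cannot reset them, which is the failure described in scenario #4.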

5. Video Pipeline UI Improvements

To provide a better user experience for course authors, the following UI improvements are highly recommended, in the short term, by the edX video team.

  1. Deleting unneeded videos from the library (MA-326)
  2. Better error states (MA-769, MA-242)
  3. Warning message when upload will be interrupted (MA-333)


Transferring a Course from Edge to Prod Workflow:

  • On Prod, create your Course with Course Team (with negotiated org-course-run values)
  • On Prod, specify Video Pipeline Settings, which remain on Prod (not in the OLX of the course) - assuming #4 above is implemented, fixing MA-1302.
  • Import from Edge to Prod.
  • Run the Validation Script - finding issues with missing VideoIDs, etc.
  • Upload Videos on Prod.
  • To Fix Video IDs, either
    1. Fix on Prod:
      1. Update all videos with issues on Prod to have correct information, or set OnlyOnWeb to disable VideoID.
      2. Export back to Edge if course teams want to continue editing the course on Edge.
      3. Importing into Edge should bypass any validation issues since Video Pipeline is not enabled on Edge.
    2. Fix on Edge:
      1. Manually copy VideoIDs from Prod onto the course on Edge.
      2. Update all videos with issues to have correct information, or set OnlyOnWeb to disable VideoID.
      3. Re-import back to Prod and verify that all issues have been fixed.