Video xBlock Tech Debt

Here's a running list of tech debt items that we'd like to address as part of the Video Phase 2 work.

xModule → xBlock

  • Replace Video xModule → Video xBlock (either start from scratch or duplicate and prune)
  • There are many benefits (performance, cognitive load, smaller code-size) to making it an xBlock.

Frontend

  • Decouple the YouTube JavaScript code from the edX Video JS to reduce the base JS bundle size (see comment).

Simplify

  • Eliminate unnecessary backward compatibility code
    • The video xModule's __init__ method is one of the slowest xBlock initialization routines, largely because of a whole bunch of backward compatibility code.
      • Parses XML data!
      • The Field is_set_on calls are pretty slow!
    • We no longer need to store multiple YouTube URLs for each speed to support old browsers. All that code can be eliminated in both the backend and the frontend.
    • Get rid of the `sources` field, which was deprecated in favor of `html5_sources`. Quickly check in GraphCourse whether the `sources` field is still populated or in use.
    • Eliminate the CDN_URLs-related code in the get_html method of the current xModule. Talk with Ed or Clayton.
  • Have a single Video instance support only one of the following, so that extensive fallback logic is not needed:
    • HLS-encoded, VAL-backed, VEDA-generated video
    • YouTube video
    • Support for HTML5 links may not be needed. Check with Product.
      • If they are still needed, keep supporting them, but we still don't need to support an HTML5 link combined with either of the above.
  • YouTube
    • Currently, all YouTube code is coupled with other video code.  Keep YouTube integration separate with clearer interfaces.
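
As a rough illustration of the "one source per Video instance" idea above, a minimal sketch might look like the following. The names `SourceKind`, `VideoSource`, and `resolve_source` are hypothetical and do not exist in the current xModule. The only point is that a video with zero or several configured sources is rejected up front instead of going through fallback logic.

```python
from dataclasses import dataclass
from enum import Enum


class SourceKind(Enum):
    HLS = "hls"          # HLS-encoded, VAL-backed, VEDA-generated video
    YOUTUBE = "youtube"  # YouTube video
    HTML5 = "html5"      # only if Product confirms HTML5 links are still needed


@dataclass(frozen=True)
class VideoSource:
    kind: SourceKind
    value: str  # edx_video_id, YouTube ID, or HTML5 URL (hypothetical shape)


def resolve_source(edx_video_id: str = "", youtube_id: str = "", html5_url: str = "") -> VideoSource:
    """Return the single configured source; raise if none or several are set."""
    candidates = [
        VideoSource(SourceKind.HLS, edx_video_id),
        VideoSource(SourceKind.YOUTUBE, youtube_id),
        VideoSource(SourceKind.HTML5, html5_url),
    ]
    configured = [c for c in candidates if c.value]
    if len(configured) != 1:
        raise ValueError(f"expected exactly one video source, got {len(configured)}")
    return configured[0]
```

With this shape, YouTube-specific handling could live behind the `SourceKind.YOUTUBE` branch only, which is one possible way to get the clearer interface mentioned above.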

YouTube Dependencies

  • For YT videos, we get the duration and video image from the YT API.
    • The Analytics team also uses the YT API to get the video duration.
  • Transcripts
    • Do we need to rely on YT for transcripts? If not, what options do we have to provide as an alternative?
      • How will authors generate/author transcripts if we do not rely on YT subs?
        • Is this a manual process that we shouldn't care about?
    • Simplify the transcripts to incorporate the new flow, because we currently rely on YT transcripts. This is something of a blocker to fully supporting HLS/HTML5 videos (EDUCATOR-323). This is for V1.
    • We also need to change the transcript backend module to incorporate the new flow, i.e. HLS – HTML5 – YT, for V2.
    • Currently, a transcript file is shared among the video components that have the same external source URL (TNL-6539). So, when one video component modifies its transcript file, the changes are also reflected in the other video components that use it.
    • Should we use edx-val subtitles for valid edx_video_ids? See https://github.com/edx/edx-val/blob/master/edxval/models.py#L164-L209. If so, the following can be considered:
      • Provide an option to upload transcripts on the video uploads page, probably via an implicit call to an edx-val endpoint that creates the subtitle for the given language.
      • Use these subtitles for the video components that use the edx-val API to fetch the video encodings.
      • What if the transcripts are not available in edx-val for an edx_video_id?
        • Upload the transcript from the video uploads page.
        • Use existing transcripts (if any) from the contentstore, i.e. those primarily created for external sources, to generate subtitles for the edx_video_id in edx-val.
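
The lookup order discussed above (edx-val first, then any existing contentstore transcript) could be sketched roughly as follows. The two lookups are passed in as plain callables because the real edx-val and contentstore calls are not shown here. `find_transcript`, its parameters, and the `Lookup` signature are all assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical lookup signature: (identifier, language) -> transcript text or None.
# These stand in for edx-val and contentstore calls; they are not real APIs.
Lookup = Callable[[str, str], Optional[str]]


def find_transcript(
    edx_video_id: str,
    external_source_id: str,
    language: str,
    val_lookup: Lookup,
    contentstore_lookup: Lookup,
) -> Optional[Tuple[str, str]]:
    """Return (transcript, origin), preferring edx-val over the contentstore."""
    if edx_video_id:
        transcript = val_lookup(edx_video_id, language)
        if transcript is not None:
            return transcript, "edx-val"
    if external_source_id:
        transcript = contentstore_lookup(external_source_id, language)
        if transcript is not None:
            # A caller could use this hit to create the missing edx-val subtitle.
            return transcript, "contentstore"
    return None
```

Returning the origin alongside the text lets a caller decide whether to backfill edx-val from the contentstore copy, which matches the last option in the list above.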

Export video to Open edX Instances

We started this spike to understand how video playback works on edge and other Open edX instances (EDUCATOR-324).

It turned out that the videos can be played back on Open edX instances: we store and serve public CloudFront URLs, which are accessible to anybody, not just Open edX instances. The next question is: do we need to restrict playback of edX-hosted videos?

The following possibilities are worth considering:

  • edx_video_id won't render the video, and only external source(s) will work. This might be done via some kind of domain check.
    • edX will not pay for their playbacks.
  • edx_video_id renders the video at the lowest bandwidth. Not a good solution, since edX still pays for the videos.
    • If it is an Open edX instance, render the lowest-bandwidth video from edx-val.
      • edX still pays for the playbacks.
  • S3 signed URLs, if applicable?
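
To make the first option more concrete, a domain check could be sketched along these lines. The allowlist `EDX_HOSTED_DOMAINS` and both function names are hypothetical, and the check only controls which sources get rendered for a given site. It does nothing to protect the public CloudFront URLs themselves.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of edX-hosted domains; the real list is an open question.
EDX_HOSTED_DOMAINS = frozenset({"edx.org"})


def is_edx_hosted(site_url: str, allowed_domains=EDX_HOSTED_DOMAINS) -> bool:
    """Return True if the site's host is an allowed domain or one of its subdomains."""
    host = (urlparse(site_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def playable_sources(site_url, edx_video_urls, external_urls):
    """Offer edx_video_id encodings only on edX-hosted sites; always offer external sources."""
    if is_edx_hosted(site_url):
        return list(edx_video_urls) + list(external_urls)
    return list(external_urls)
```

Matching on the exact host or a dot-prefixed suffix avoids treating a lookalike such as notedx.org as edX-hosted.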

Tests

  • The Video JS tests are still flaky, causing many Jenkins re-runs. Consider stepping back and rethinking the tests to completely eliminate these issues.
  • Eliminate dependency on the YouTube server
    • Replace Bokchoy tests that currently access YouTube as part of test execution
    • Instead, create minimal end-to-end integration tests that access YouTube and verify that API expectations haven't changed.
  • Currently we rely on the YouTube API response to decide whether to run the tests or skip them when YT is not available. Try to eliminate this dependency. These are very dangerous assumptions.
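
One way to remove the YouTube server from regular test runs is to inject the metadata fetcher and stub it in tests. The sketch below assumes a hypothetical `get_video_duration` helper and a made-up `duration_seconds` key. Neither reflects the actual YT API response shape or the existing test code.

```python
import unittest
from unittest import mock


def get_video_duration(video_id, fetch_metadata):
    """Return the duration in seconds using an injected metadata fetcher.

    `fetch_metadata` is a hypothetical callable; in production it would wrap
    the YT API call, and in tests it is replaced by a stub.
    """
    metadata = fetch_metadata(video_id)
    return metadata["duration_seconds"]


class VideoDurationTest(unittest.TestCase):
    def test_duration_uses_stubbed_response(self):
        # No network access: the stub returns a canned response.
        fake_fetch = mock.Mock(return_value={"duration_seconds": 95})
        self.assertEqual(get_video_duration("abc123", fake_fetch), 95)
        fake_fetch.assert_called_once_with("abc123")
```

A test like this always runs, whether or not YouTube is reachable. The small set of end-to-end tests mentioned above would then be the only place that checks the real API.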