Re-encode videos for HLS

The goal is to re-encode for HLS all course videos that are missing an HLS profile.

How many such videos are there?

To date, 150370 videos have been successfully encoded and delivered. Of these, 87975 (~58.5%) are missing HLS encodes.

Total videos
$ SELECT COUNT(DISTINCT edxval_video.edx_video_id) FROM edxval_video WHERE edxval_video.status = 'file_complete';
$ 150370

Videos missing the HLS profile
$ SELECT COUNT(DISTINCT edxval_video.edx_video_id) FROM edxval_video WHERE edxval_video.status = 'file_complete' AND NOT (edxval_video.id IN (SELECT U1.video_id AS Col1 FROM edxval_encodedvideo U1 INNER JOIN edxval_profile U2 ON (U1.profile_id = U2.id) WHERE U2.profile_name = 'hls'));
$ 87975
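
To sanity-check the logic of the two counts without touching production, the same queries can be run against a toy in-memory SQLite database. This is only a sketch: the sample rows, the 'desktop_mp4' profile name, the 'upload' status value, and the reduced column sets are made up for illustration. Only the table names, the join, and the 'file_complete' / 'hls' filters come from the queries above.

```python
import sqlite3

# Toy in-memory stand-in for the three edxval tables used by the counts.
# Columns are reduced to what the queries touch; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edxval_video (id INTEGER PRIMARY KEY, edx_video_id TEXT, status TEXT);
CREATE TABLE edxval_profile (id INTEGER PRIMARY KEY, profile_name TEXT);
CREATE TABLE edxval_encodedvideo (id INTEGER PRIMARY KEY, video_id INTEGER, profile_id INTEGER);
INSERT INTO edxval_profile VALUES (1, 'hls'), (2, 'desktop_mp4');
INSERT INTO edxval_video VALUES
    (1, 'vid-a', 'file_complete'),
    (2, 'vid-b', 'file_complete'),
    (3, 'vid-c', 'file_complete'),
    (4, 'vid-d', 'upload');
INSERT INTO edxval_encodedvideo (video_id, profile_id) VALUES
    (1, 1), (1, 2), (2, 2), (4, 2);
""")

# Count of completed videos (COUNT(DISTINCT ...) form).
TOTAL_SQL = """
SELECT COUNT(DISTINCT edxval_video.edx_video_id)
FROM edxval_video
WHERE edxval_video.status = 'file_complete'
"""

# Count of completed videos that have no encode with the 'hls' profile.
MISSING_HLS_SQL = """
SELECT COUNT(DISTINCT edxval_video.edx_video_id)
FROM edxval_video
WHERE edxval_video.status = 'file_complete'
  AND NOT (edxval_video.id IN (
      SELECT U1.video_id AS Col1
      FROM edxval_encodedvideo U1
      INNER JOIN edxval_profile U2 ON (U1.profile_id = U2.id)
      WHERE U2.profile_name = 'hls'))
"""

total = conn.execute(TOTAL_SQL).fetchone()[0]
missing = conn.execute(MISSING_HLS_SQL).fetchone()[0]
print(total, missing)

# Share of videos missing HLS, using the production numbers reported above.
missing_share = round(100 * 87975 / 150370, 1)
print(missing_share)
```

In the toy data, vid-a already has an HLS encode and vid-d is not complete, so only vid-b and vid-c count as missing.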


Course-wise video stats can be found on the thread: EDUCATOR-3555

Plan for HLS Re-encoding

  • Set up an edxval endpoint that provides a paginated set of edx_video_ids that are missing the HLS profile.
  • Set up a management command in edx-video-pipeline that does the following:
    • Request from edxval the set of videos to re-encode.
    • Retrieve the corresponding VEDA Video object and switch off its process_transcription flag. This is required to avoid the unintentional transcription that happens in the Delivery phase.
    • Check VEDA to see whether an HLS profile already exists. If it does, just update edxval with it; otherwise, kick off an encoding task (a.k.a. worker_tasks_fire) with veda_id and encode_profile=HLS.
  • The Encode worker generates the HLS encode, pushes it to S3, and initiates a delivery task.
  • The Deliver worker processes the delivery task and delivers the successful HLS encode profile to edxval.
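
The control flow of the management command could look roughly like the sketch below. It is not the actual edx-video-pipeline code: VedaVideo, iter_missing_hls_ids, reencode_for_hls, the page payload shape ("results" / "next"), and the injected update_edxval and fire_encode callables are all hypothetical stand-ins, and skipping videos that VEDA does not know about is an assumption the plan does not cover. Only the process_transcription flag, the veda_id argument, and encode_profile=HLS come from the plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, Optional


@dataclass
class VedaVideo:
    """Hypothetical stand-in for the VEDA Video record."""
    edx_id: str
    process_transcription: bool = True
    hls_url: Optional[str] = None  # set when VEDA already holds an HLS encode


def iter_missing_hls_ids(fetch_page: Callable[[int], dict]) -> Iterator[str]:
    """Walk the paginated endpoint until it reports no next page."""
    page = 1
    while page is not None:
        payload = fetch_page(page)
        yield from payload["results"]
        page = payload.get("next")


def reencode_for_hls(fetch_page, veda_videos: Dict[str, VedaVideo],
                     update_edxval, fire_encode) -> Dict[str, int]:
    """Apply the three per-video steps of the plan and report what happened."""
    stats = {"updated": 0, "queued": 0, "skipped": 0}
    for edx_video_id in iter_missing_hls_ids(fetch_page):
        video = veda_videos.get(edx_video_id)
        if video is None:
            # Assumption: videos unknown to VEDA are skipped, not failed.
            stats["skipped"] += 1
            continue
        # Avoid unintentional transcription in the Delivery phase.
        video.process_transcription = False
        if video.hls_url:
            # VEDA already has an HLS encode: only edxval needs updating.
            update_edxval(edx_video_id, video.hls_url)
            stats["updated"] += 1
        else:
            # No HLS encode yet: queue an encoding task for it.
            fire_encode(veda_id=video.edx_id, encode_profile="HLS")
            stats["queued"] += 1
    return stats


# Usage with in-memory fakes in place of edxval, VEDA, and the task queue.
pages = {
    1: {"results": ["vid-a", "vid-b"], "next": 2},
    2: {"results": ["vid-c"], "next": None},
}
veda = {
    "vid-a": VedaVideo("vid-a", hls_url="https://example.com/vid-a.m3u8"),
    "vid-b": VedaVideo("vid-b"),
}
updated, queued = [], []
stats = reencode_for_hls(
    pages.__getitem__,
    veda,
    lambda video_id, url: updated.append((video_id, url)),
    lambda veda_id, encode_profile: queued.append((veda_id, encode_profile)),
)
print(stats)
```

Passing the edxval client and the task trigger in as callables keeps the loop testable without a running edxval or worker; in the real command they would wrap the edxval API and worker_tasks_fire.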


NOTE:

We will also need to check whether the re-encoded videos belong to custom white-labeled sites, because the HLS endpoints have CORS enabled and allow only a set of whitelisted sites. We will eventually need to whitelist all the sites these videos belong to.
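
One way to find the sites that still need whitelisting is to reduce each site URL to its origin and subtract the origins already allowed. The sketch below assumes we can obtain a list of site URLs for the re-encoded videos and the current CORS whitelist. Both inputs, the function names, and the example domains are hypothetical.

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Reduce a URL to its scheme://host[:port] origin, lowercased."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}".lower()


def sites_to_whitelist(site_urls, allowed_origins):
    """Return the origins in site_urls that are not yet in the whitelist."""
    allowed = {origin.lower().rstrip("/") for origin in allowed_origins}
    return sorted({origin_of(url) for url in site_urls} - allowed)


# Made-up example data: one site is already whitelisted, one is not.
site_urls = [
    "https://courses.example.org/courses/abc",
    "https://learn.partner.example/video/1",
    "https://courses.example.org/other",
]
allowed_origins = ["https://courses.example.org"]
pending = sites_to_whitelist(site_urls, allowed_origins)
print(pending)
```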