There were 7767 courses for which transcripts were migrated from content-store to S3.
Job Run # | Migrated Courses |
---|---|
1 | 0 |
2 | 0 |
3 | 2 |
4 | 2 |
5 | 100 |
6 | 500 |
7 | 500 |
8 | 1000 |
9 | 1000 |
10 | 1000 |
11 | 1000 |
12 | 1000 |
13 | 1000 |
14 | 1000 |
15 | 1000 |
16 | 663 |
The above artifacts can be found on Splunk. An example query looks like the following, where "run" specifies a job run:

    index=prod-edx "Transcript Migration" "run=12" "video-transcripts-migration-process-started-for-course"

Transcripts have been successfully migrated for ~756,201 videos, and ~161,944 external videos have been created in edxval
from the corresponding video components. Below are the results gathered on Splunk from the logs emitted by the transcripts migration job.
Run # | Videos submitted for Transcripts Migration | Videos with no transcripts | Videos completed Transcripts Migration | Number of External Videos |
---|---|---|---|---|
1 | 0 | 0 | 0 | 0 |
2 | 0 | 0 | 0 | 0 |
3 | 278 | 2 | 276 | 216 |
4 | 250 | 6 | 244 | 1 |
5 | 11,155 | 723 | 10,432 | 3,072 |
6 | 50,555 | 5,611 | 45,029 | 10,346 |
7 | 50,651 | 5,394 | 45,334 | 9,865 |
8 | 97,303 | 8,684 | 88,807 | 18,547 |
9 | 107,406 | 11,073 | 96,333 | 19,764 |
10 | 101,718 | 13,636 | 88,071 | 18,946 |
11 | 103,437 | 10,053 | 93,630 | 20,023 |
12 | 98,299 | 11,585 | 86,895 | 17,756 |
13 | 101,329 | 10,037 | 91,512 | 19,799 |
14 | 99,473 | 9,902 | 89,783 | 9,100 |
15 | 102,728 | 9,382 | 91,150 | 3,674 |
16 | 64,315 | 7,964 | 56,278 | 10,835 |
Below are the queries for the above-mentioned artifacts; "run" can be adjusted to the desired job run:

    # Videos submitted for the migration, excluding videos that have no transcripts
    index=prod-edx "Transcript Migration" "run=16" "transcripts-migration-tasks-submitted"
    # Videos without transcripts
    index=prod-edx "Transcript Migration" "run=16" "transcripts-migration-tasks-submitted"
    # Videos that completed transcripts migration
    index=prod-edx "Transcript Migration" "run=16" "video-transcripts-migration-complete-for-a-video"
    # Number of created external videos (i.e. videos that do not have edx_video_id set)
    index=prod-edx "Transcript Migration" "run=16" "generated-edx-video-id"

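Since only the run number changes between queries, the search strings can be generated rather than edited by hand. The following is a minimal sketch: the event markers are copied from the queries above, while the helper name `build_query` and the dictionary keys are illustrative and not part of the migration job.

```python
# Event markers copied from the queries above.
# The dictionary keys and the helper name are hypothetical.
VIDEO_EVENTS = {
    "submitted": "transcripts-migration-tasks-submitted",
    "completed": "video-transcripts-migration-complete-for-a-video",
    "external_videos": "generated-edx-video-id",
}


def build_query(run, event, index="prod-edx"):
    """Return the Splunk search string for one job run and one log event."""
    return 'index={} "Transcript Migration" "run={}" "{}"'.format(index, run, event)


# One query per job run (1 to 16) for the "completed" event.
queries = [build_query(run, VIDEO_EVENTS["completed"]) for run in range(1, 17)]
print(queries[-1])
```

Each generated string can be pasted into the Splunk search bar as is.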
A total of ~990,378 transcripts were successfully migrated from content-store to S3. A small number of migrations also failed: 2,806 transcripts, which is 0.283% of the successfully migrated transcripts. The per-run breakdown is below.
Run # | Number of successfully migrated transcripts | Number of transcripts whose migration failed | Transcripts with no content to migrate |
---|---|---|---|
1 | 0 | 0 | 0 |
2 | 0 | 0 | 0 |
3 | 276 | 0 | 0 |
4 | 242 | 0 | 2 |
5 | 11,791 | 81 | 0 |
6 | 54,688 | 4 | 54 |
7 | 50,213 | 3 | 65 |
8 | 98,717 | 8 | 55 |
9 | 109,211 | 2 | 158 |
10 | 99,990 | 3 | 75 |
11 | 103,298 | 5 | 73 |
12 | 96,878 | 9 | 119 |
13 | 101,484 | 5 | 115 |
14 | 99,206 | 0 | 100 |
15 | 101,217 | 2,495 | 97 |
16 | 63,167 | 191 | 52 |
Below are the queries for the above-mentioned artifacts; "run" can be adjusted to the desired job run:

    # Number of successfully migrated transcripts
    index=prod-edx "Transcript Migration" "run=16" "video-transcript-migration-succeeded-for-a-video"
    # Number of transcripts whose migration failed
    index=prod-edx "Transcript Migration" "run=16" "video-transcript-migration-failed-with-unknown-exc"
    # Transcripts with no content to migrate
    index=prod-edx "Transcript Migration" "run=16" "video-transcript-migration-failed-with-known-exc"

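The totals and the failure percentage quoted for the transcripts can be reproduced from the per-run table. The sketch below copies the per-run counts from that table; the variable names are illustrative.

```python
# Per-run counts copied from the table above (runs 1 to 16).
succeeded = [0, 0, 276, 242, 11791, 54688, 50213, 98717, 109211, 99990,
             103298, 96878, 101484, 99206, 101217, 63167]
failed = [0, 0, 0, 0, 81, 4, 3, 8, 2, 3, 5, 9, 5, 0, 2495, 191]

total_succeeded = sum(succeeded)
total_failed = sum(failed)

# Failures expressed as a percentage of the successfully migrated transcripts.
failure_pct = 100.0 * total_failed / total_succeeded

# Share of all failures that occurred in run 15 (index 14).
run15_share_pct = 100.0 * failed[14] / total_failed

print(total_succeeded, total_failed)
print(round(failure_pct, 3), round(run15_share_pct, 1))
```

Run 15 alone accounts for roughly 89% of all failed transcripts.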
Let's talk about the 0.283% of failures (i.e. 2,806 transcripts). This is taken as the maximum number of transcripts that might not have been migrated because of exceptions. The following is what I observed while manually verifying the errors:
Migration failures falling into the categories below can be ignored:
Also, there are considerably more failures in the 15th run than in any other run (2,495, against at most 191 elsewhere), and most of these are for non-English transcript languages that failed on decoding with utf-8-sig.
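To illustrate how a utf-8-sig decode can fail on non-English content, the sketch below decodes two byte strings with Python's `utf-8-sig` codec. The sample text and the cp1251 encoding are assumptions chosen for illustration; the actual encodings of the failed transcripts are not recorded here, and `try_decode` is a hypothetical helper, not the migration job's code.

```python
def try_decode(raw, encoding="utf-8-sig"):
    """Return the decoded text, or None if the bytes are invalid for the codec."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


text = "\u041f\u0440\u0438\u0432\u0435\u0442"  # Russian "Privet" (hello), sample text

# UTF-8 bytes with a leading byte order mark: utf-8-sig strips the BOM and decodes.
utf8_with_bom = b"\xef\xbb\xbf" + text.encode("utf-8")

# The same text in a legacy single-byte encoding (assumed here to be cp1251):
# these bytes are not valid UTF-8, so the decode raises UnicodeDecodeError.
cp1251_bytes = text.encode("cp1251")

print(try_decode(utf8_with_bom) == text)
print(try_decode(cp1251_bytes))
```

ASCII-only English transcripts decode identically under most single-byte encodings and UTF-8, which would be consistent with such failures clustering in non-English languages.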