Model the various file-storage caps: 10MB (current), 50MB, 100MB, 500MB, or 1GB.
Use the read-replica to determine how many submission records exist with uploaded files.
Group that count by month to get a rough idea of the growth rate of file uploads.
Multiply those counts by 10MB, 100MB, etc. to estimate total storage at each cap level (a rough query sketch follows these modeling tasks).
Understand what our file-retention policy is for ORA uploads, if any.
Incorporate this into the rough model developed above.
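A minimal sketch of the read-replica query described above, assuming a Django ORM model along the lines of edx-submissions' Submission; the model import, the "read_replica" database alias, and the "has an uploaded file" filter are all assumptions to be checked against the real schema before running anything:

```python
# Hedged sketch of the monthly-growth query; field names and the upload filter
# are placeholders, not the confirmed schema.
from django.db.models import Count
from django.db.models.functions import TruncMonth

from submissions.models import Submission  # assumed model location

CAPS_BYTES = {
    "10MB": 10 * 1024 ** 2,
    "50MB": 50 * 1024 ** 2,
    "100MB": 100 * 1024 ** 2,
    "500MB": 500 * 1024 ** 2,
    "1GB": 1024 ** 3,
}

# Count submissions that appear to reference an uploaded file, grouped by month.
monthly_counts = (
    Submission.objects.using("read_replica")      # assumed DB alias for the replica
    .filter(answer__contains="file_key")          # assumed marker for "has an upload"
    .annotate(month=TruncMonth("created_at"))
    .values("month")
    .annotate(num_uploads=Count("pk"))
    .order_by("month")
)

for row in monthly_counts:
    # Worst-case storage for the month if every upload hit the cap.
    estimates = {label: row["num_uploads"] * size for label, size in CAPS_BYTES.items()}
    print(row["month"], row["num_uploads"], estimates)
```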
Verify whether the statements below are accurate:
For instance, we have a restriction at the nginx layer that files uploaded to edX systems cannot be larger than 20MB.
May need to discuss how to support multi-part file upload in the LMS if we’re supporting much larger file uploads (i.e., > 50MB).
The current ORA2 upload workflow seems to be:
The client-side code asks the XBlock/Python backend for a URL to upload the file contents to.
The XBlock method produces such a URL, which is a generated S3 URL in production.
The client takes this URL and submits a PUT request to it, along with the file contents.
Notice that nowhere in the above steps do we upload a file through an edX nginx instance, so that hard, system-wide cap may not actually apply in this use case (a minimal sketch of this flow appears below).
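A minimal sketch of that three-step flow, using boto3 and requests directly. The bucket name, key layout, and file name are placeholders, and ORA2's real code path goes through its own upload API rather than these exact calls; this only illustrates the general shape of "backend issues URL, client PUTs to S3":

```python
import boto3
import requests

s3 = boto3.client("s3")

# Steps 1-2: the backend generates a short-lived upload URL for the client.
# Bucket and key are hypothetical placeholders.
upload_url = s3.generate_presigned_url(
    "put_object",
    Params={
        "Bucket": "example-ora2-uploads",
        "Key": "submissions/student_123/ora_block_456/file_0",
    },
    ExpiresIn=3600,
)

# Step 3: the client PUTs the file contents directly to S3 -- no edX nginx involved.
with open("essay_attachment.pdf", "rb") as f:
    response = requests.put(upload_url, data=f)
response.raise_for_status()
```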
If we want to support “additive uploads” (as opposed to the “one-shot” uploads we currently support), we’ll need a way to query the file storage backend (S3 in production) for a set of key metadata to determine cumulative storage allocation for a given (student, ORA block). Provide definitive answers to the following questions:
Can we fetch S3 key metadata efficiently? That is, if a student has uploaded 17 items, do we need to make 17 separate requests for S3 keys, or can we do one query on an appropriate key prefix? (A sketch of the prefix-listing approach follows these questions.)
Is there a compelling reason that we would store file metadata, including size, in a model outside of S3, so that we don’t have to ask S3 for this metadata? (An answer to this question is a nice-to-have, provided that the answer to the first question is “yes, we can efficiently query S3 key metadata”).
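On the first question: S3 returns key metadata (including object size) for everything under a prefix via a single paginated LIST call, so per-key requests shouldn't be necessary as long as all uploads for a given (student, ORA block) pair share a key prefix. A hedged sketch, assuming a hypothetical key scheme of submissions/<student>/<block>/ that would need to be confirmed against ORA2's actual key naming:

```python
import boto3


def cumulative_upload_bytes(bucket, student_id, ora_block_id):
    """Sum object sizes under one (student, ORA block) prefix using paged LIST calls."""
    s3 = boto3.client("s3")
    prefix = f"submissions/{student_id}/{ora_block_id}/"  # assumed key scheme
    total = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            total += obj["Size"]  # size comes back in the listing; no per-key HEAD needed
    return total
```

If the keys for a student/block do not share a usable prefix, that would strengthen the case for the separate metadata model raised in the second question.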