Temp folder runs out of space after failed exports

Description

This course in Studio has many images, which is presumably what keeps export from working:
https://studio.edx.org/course/slashes:VJx+VJx+3T2014

The course team needs to use import/export to add polls.

To reproduce, go to https://studio.edx.org/export/slashes:VJx+VJx+3T2014 and click export. It will time out or give an error:

There has been an error with your export.
There has been a failure to export your course to XML. Unfortunately, we do not have specific enough information to assist you in identifying the failed component. It is recommended that you inspect your courseware to identify any components in error and try again.
The raw error message is:
[Errno 28] No space left on device
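
For anyone triaging a similar report, the raw error above is the operating system's ENOSPC code (28 on Linux). The following is a minimal sketch of checking free space on the temp filesystem and recognising that error in Python. The helper names `free_space_report` and `is_out_of_space` are hypothetical, not part of edx-platform, and the temp path checked is whatever `tempfile.gettempdir()` returns on the machine running it, which may differ from the directory Studio actually uses.

```python
import errno
import os
import shutil
import tempfile


def free_space_report(path=None):
    """Return (path, free_bytes, fraction_free) for the filesystem holding `path`."""
    # Assumption: default to the process temp directory when no path is given.
    path = path or tempfile.gettempdir()
    usage = shutil.disk_usage(path)
    return path, usage.free, usage.free / usage.total


def is_out_of_space(exc):
    """True when an exception is an OSError carrying ENOSPC ("No space left on device")."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOSPC


if __name__ == "__main__":
    path, free, fraction = free_space_report()
    print(f"{path}: {free} bytes free ({fraction:.1%})")
    # Build an ENOSPC error by hand to show how it would be recognised.
    print(is_out_of_space(OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))))
```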

Activity

Adam Palay
September 11, 2014, 12:46 PM

gunicorn was rolled back after yesterday's release, but ostensibly it will be upgraded again.

Zubair Afzal
September 11, 2014, 10:50 AM

Closing this issue: with this PR, https://github.com/edx/edx-platform/pull/5000, gunicorn is updated to 19.1.1, which gives the worker time to run cleanup tasks.

Adam Palay
August 22, 2014, 8:09 PM

It's not the same issue. 7.00x is just timing out because it's so big. File a ticket with devops to export the course via management command. See https://openedx.atlassian.net/browse/PLAT-81

AlisonK
August 22, 2014, 7:20 PM

The MIT 7.00x team cannot export the course. It seems likely that it's the same issue VJx was seeing: https://openedx.atlassian.net/browse/PLAT-82 and "Cannot export from studio".

Unlike VJx, this course is running, so the file hosting arrangement cannot be easily changed. Can you advise on next steps? The course team simply wants to create a new branch of the same material; they aren't looking to export regularly.

To reproduce:
1. visit the course export page https://studio.edx.org/export/MITx/7.00x_2/2T2014
2. export
3. end up with a blank page at this URL: https://studio.edx.org/export/MITx/7.00x_2/2T2014?_accept=application/x-tgz

Zubair Afzal
August 19, 2014, 11:10 AM

While working on the fix in https://github.com/edx/edx-platform/pull/4560, I checked that for each uploaded course there is only one tmp file, which is the last uploaded file chunk (size < 20 MB, set by the jQuery fileupload plugin in import.html) and is created by Django (https://docs.djangoproject.com/en/1.4/topics/http/file-uploads/#where-uploaded-data-is-stored). This does not depend on whether the worker completed successfully or timed out.

After the upload completes, the uploaded course tarball is saved in the data folder "/edx/var/edxapp/data/org-course-run" and extracted there. If the worker completes its task, the temp data folder is cleaned up. If the worker times out during import, however, gunicorn (0.17.4) sends a SIGKILL signal (https://github.com/benoitc/gunicorn/blob/0.17.4/gunicorn/arbiter.py#L401-L411), which immediately terminates the worker process, so no cleanup is done for the temp data folder.

The new version of gunicorn (19.1.0), however, sends a SIGABRT signal (https://github.com/benoitc/gunicorn/blob/19.0/gunicorn/arbiter.py#L425-L440) on worker timeout, which gives the worker time to run cleanup tasks (execute code in "finally" clauses) before it is killed, so we need to update gunicorn.
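
The difference between the two signals can be reproduced outside gunicorn. The sketch below is only an illustration, not gunicorn's or edx-platform's code: it starts a hypothetical stand-in worker that creates a scratch directory and removes it in a "finally" clause, then sends it either SIGKILL or SIGABRT. It assumes the worker installs a SIGABRT handler that exits via `sys.exit`, which is what lets the "finally" clause run; SIGKILL cannot be caught, so the directory is left behind. The names `CHILD` and `scratch_survives` are made up for this example, and it requires a POSIX system.

```python
import os
import shutil
import signal
import subprocess
import sys
import tempfile
import textwrap

# Stand-in worker (hypothetical): makes a scratch directory, prints its path,
# then "works" until signalled. Its SIGABRT handler calls sys.exit, which
# raises SystemExit and therefore lets the finally clause run.
CHILD = textwrap.dedent("""
    import shutil, signal, sys, tempfile, time

    def on_abort(signum, frame):
        sys.exit(1)

    signal.signal(signal.SIGABRT, on_abort)
    scratch = tempfile.mkdtemp(prefix="import-demo-")
    try:
        print(scratch, flush=True)
        time.sleep(60)
    finally:
        shutil.rmtree(scratch, ignore_errors=True)
""")


def scratch_survives(sig):
    """Start the stand-in worker, send it `sig`, and report whether its scratch dir is left behind."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    # The path is printed only after the handler is installed, so signalling is safe here.
    scratch = proc.stdout.readline().strip()
    proc.send_signal(sig)
    proc.wait()
    proc.stdout.close()
    leftover = os.path.isdir(scratch)
    shutil.rmtree(scratch, ignore_errors=True)  # tidy up after the demo itself
    return leftover


if __name__ == "__main__":
    print("SIGKILL leaves scratch dir:", scratch_survives(signal.SIGKILL))
    print("SIGABRT leaves scratch dir:", scratch_survives(signal.SIGABRT))
```

Cleanup that must survive a SIGKILL cannot live in the worker itself; it would need to happen elsewhere, for example in a periodic sweep of the data folder.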

No Action Needed

Assignee

Zubair Afzal

Reporter

teppoR

Priority

CAT-3