Slow transaction: /ccx.views:save_ccx

Description

New Relic collected an interesting transaction trace
Application: prod-edge-edxapp-lms
https://rpm.newrelic.com/accounts/88178/applications/3415687/transaction_traces/5f499f-00dfed00-3c69-11e5-be55-f8bc12425d50

Timestamp: August 06, 2015 04:28
Url: /courses/ccx-v1avidsonNext+Cal_APccx_Edge+3T2015+ccx@44/save_ccx
Transaction Duration: 25.226 seconds

Steps to Reproduce

You can reproduce by creating a CCX course:

http://open-edx-building-and-running-a-course.readthedocs.org/en/latest/building_course/custom_courses.html

Then go to the Coach tab and change some due dates.

Note: you will need to enable CCX locally by setting FEATURES['CUSTOM_COURSES_EDX'] to True in your settings file.

Current Behavior

None

Expected Behavior

None

Reason for Variance

None

Release Notes

None

User Impact Summary

None

Activity

hrazaR
September 7, 2015, 12:03 PM
hrazaR
September 4, 2015, 7:47 PM

We are currently working on two approaches to reduce the time, one on the client side and one on the server side. The client-side approach sends only the changed nodes to the server, so that only those nodes are traversed instead of the whole structure, which reduces latency. On the server side, we are trying to reduce latency by using bulk_operations on delete(), update() and create(). Currently, to delete a ccx_override_field we first look up the field in the DB, fetch it, and then delete it, all inside a loop. We tried to reduce this with a bulk_delete, where we collect the ids of the ccx_override_fields to be deleted using a hash and then delete them all with a single query.
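To make the server-side idea concrete, here is a minimal sketch of the difference between deleting overrides one at a time and deleting them with a single query. It is not the edx-platform code: it uses the standard-library sqlite3 module instead of the Django ORM, and the table layout, column names and helper names (make_db, delete_one_by_one, delete_in_bulk) are all assumptions made up for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory table that stands in for ccx_override_field (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ccx_override_field ("
        "id INTEGER PRIMARY KEY, location TEXT, field TEXT, value TEXT)"
    )
    rows = [("unit-%d" % i, "due", "2015-09-01") for i in range(100)]
    conn.executemany(
        "INSERT INTO ccx_override_field (location, field, value) VALUES (?, ?, ?)",
        rows,
    )
    return conn


def delete_one_by_one(conn, targets):
    """Look up each override, then delete it: up to two statements per override."""
    statements = 0
    for location, field in targets:
        row = conn.execute(
            "SELECT id FROM ccx_override_field WHERE location = ? AND field = ?",
            (location, field),
        ).fetchone()
        statements += 1
        if row is not None:
            conn.execute("DELETE FROM ccx_override_field WHERE id = ?", (row[0],))
            statements += 1
    return statements


def delete_in_bulk(conn, targets):
    """Build a (location, field) -> id hash with one SELECT, then delete all ids at once."""
    id_by_key = {
        (location, field): row_id
        for row_id, location, field in conn.execute(
            "SELECT id, location, field FROM ccx_override_field"
        )
    }
    ids = [id_by_key[key] for key in targets if key in id_by_key]
    if not ids:
        return 1  # only the SELECT was issued
    # One placeholder per id; very large id lists would need to be chunked.
    placeholders = ",".join("?" * len(ids))
    conn.execute(
        "DELETE FROM ccx_override_field WHERE id IN (%s)" % placeholders, ids
    )
    return 2  # one SELECT plus one DELETE


targets = [("unit-%d" % i, "due") for i in range(50)]
print(delete_one_by_one(make_db(), targets))
print(delete_in_bulk(make_db(), targets))
```

In this sketch the looped version issues two statements per override, while the bulk version issues two statements in total regardless of how many overrides are removed. With the Django ORM the equivalent would presumably be a single filtered queryset delete, but that has not been checked against the CCX models here.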

hrazaR
September 2, 2015, 9:14 PM

After debugging the request body of save_ccx, it seems that whenever we make a change (for example, to a due date) in a unit or one of its sub-units, the code at https://github.com/edx/edx-platform/blob/master/lms/djangoapps/ccx/views.py#L208 traverses the whole course structure sent from https://github.com/edx/edx-platform/blob/19604a4a6eee6f65bb048700124923972ebc1f57/lms/static/js/ccx/schedule.js#L222. One solution would be to send only the nodes that have changed values, along with their children, instead of sending the whole course outline for traversal.
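A rough sketch of the "send only changed nodes" idea follows. The real change would live in the client-side JavaScript; this is written in Python only to show the logic. The node shape (dicts with location, start, due and children keys) and the function name changed_nodes are assumptions for illustration, not the actual schedule.js data structures.

```python
import copy


def changed_nodes(old, new, fields=("start", "due")):
    """Return the nodes of `new` that differ from `old`, each with its full subtree.

    A node counts as changed when any of `fields` differs. Unchanged nodes are
    descended into, so a change deep in the outline yields only that subtree.
    """
    if any(old.get(f) != new.get(f) for f in fields):
        return [new]
    old_children = {c["location"]: c for c in old.get("children", [])}
    changed = []
    for child in new.get("children", []):
        previous = old_children.get(child["location"])
        if previous is None:
            changed.append(child)  # node not seen before: send it whole
        else:
            changed.extend(changed_nodes(previous, child, fields))
    return changed


# Hypothetical outline: one chapter with two sequentials.
old_outline = {
    "location": "chapter-1", "start": "2015-08-01", "due": None,
    "children": [
        {"location": "seq-1", "start": "2015-08-01", "due": "2015-08-10",
         "children": [
             {"location": "vert-1", "start": "2015-08-01", "due": "2015-08-10"},
         ]},
        {"location": "seq-2", "start": "2015-08-01", "due": "2015-08-20",
         "children": []},
    ],
}
new_outline = copy.deepcopy(old_outline)
new_outline["children"][1]["due"] = "2015-08-27"  # the coach edits one due date

payload = changed_nodes(old_outline, new_outline)
print([node["location"] for node in payload])
```

Here only the seq-2 node would be sent, so the server would traverse one subtree instead of the whole outline. The server-side view would also need to accept a partial outline, which this sketch does not cover.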

Adam Palay
September 2, 2015, 5:03 PM

That profile is really interesting. It doesn't look like it's spending too much time in "commit", as it appeared to in the New Relic trace.

It also looks like it could be an easy win to do the deletes all at once.

David Ormsbee
September 2, 2015, 4:20 PM

Attached a profiler dump of this operation taken from my devstack. It also points to too many MySQL operations as the main cause.

Fixed

Assignee

hrazaR

Reporter

New Relic

Labels

Reach

None

Impact

None

Platform Area

None

Customer

None

Partner Manager

None

URL

None

Contributor Name

None

Groups with Read-Only Access

None

Actual Points

None

Category of Work

None

Platform Map Area (Levels 1 & 2)

None

Platform Map Area (Levels 3 & 4)

None

Priority

CAT-2