The courseware_studentmodule (CSM) and courseware_studentmodulehistory (CSMH) tables are monolithic MySQL tables in edx-platform. Their data currently grows without bound, and queries against them can be slow. We're embarking on a project to give the platform the option of storing this data in an alternate database. The new backend should be able to handle the large data size along with all the read/write requirements.
Initial Discovery Work
Dave Ormsbee performed the exploratory work on this topic; the details are on this page: Discovery: Transitioning XBlock User State Away from CSM
Call Catalog
Julia Eskew documented all CSM calls in edx-platform on this page: Courseware StudentModule (CSM) Call Catalog
Courseware StudentModule History
In the courseware_studentmodulehistory table, a new row is added for each insert or update to a courseware_studentmodule row that belongs to a CAPA problem. At the moment, the history is kept forever. Course teams use the table's data for support purposes; it is viewable through a "submission history" button attached to the problem in the courseware. The table data has also been used for development purposes to diagnose bugs and, on rare occasions, to correct state-corrupting issues in production.
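The write pattern described above can be illustrated with a small in-memory sketch. This is not the edx-platform implementation: the class names, the field names, and the assumption that CAPA problems are identified by the module type string "problem" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumption: CAPA problems carry this module type; other block types
# (for example video) do not produce history rows.
HISTORY_SAVING_TYPES = {"problem"}


@dataclass
class StudentModuleRow:
    """Hypothetical stand-in for one courseware_studentmodule row."""
    student_id: int
    module_id: str
    module_type: str
    state: str


@dataclass
class InMemoryStore:
    """Hypothetical store that mimics the CSM/CSMH write pattern."""
    modules: dict = field(default_factory=dict)   # current state, one entry per key
    history: list = field(default_factory=list)   # append-only, never pruned

    def save(self, row: StudentModuleRow) -> None:
        key = (row.student_id, row.module_id)
        # Insert or update the current-state row.
        self.modules[key] = row
        # Every save of a CAPA problem also appends a history row.
        if row.module_type in HISTORY_SAVING_TYPES:
            self.history.append((key, row.state, datetime.now(timezone.utc)))


store = InMemoryStore()
store.save(StudentModuleRow(1, "block-a", "problem", '{"attempts": 1}'))
store.save(StudentModuleRow(1, "block-a", "problem", '{"attempts": 2}'))
store.save(StudentModuleRow(1, "block-b", "video", '{"position": 30}'))
```

In this sketch the current-state mapping holds one entry per student and block, while the history list gains an entry on every save of a problem, which is why the history data grows faster than the current-state data.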
edx.org CSM Requirements
The CSM/CSMH solution that's to be implemented needs to meet several performance requirements. The read/write requirements are determined by viewing New Relic data on query volume. Here's an example of that data:
https://rpm.newrelic.com/accounts/88178/applications/3343327/datastores#/table/MySQL
The relevant "courseware_" table queries can be seen on that page. Click through each query to see its peak throughput. The peaks on June 2nd, 2015 (a day of heavy requests) were:
Query type | Peak rate (calls/min) |
---|---|
select | 30K |
update | 4K |
insert | 2.5K |
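To make these peaks easier to compare against datastore benchmarks, which are usually quoted per second, the following sketch converts the table's figures. It assumes "K" means thousands; the variable and function names are hypothetical, and the numbers are only the June 2nd, 2015 peaks listed above, not a sizing recommendation.

```python
# Peak rates from the table above, in calls per minute (assuming K = 1,000).
PEAK_CALLS_PER_MIN = {"select": 30_000, "update": 4_000, "insert": 2_500}


def per_second(calls_per_min: float) -> float:
    """Convert a per-minute rate to a per-second rate."""
    return calls_per_min / 60


peak_per_sec = {query: per_second(rate) for query, rate in PEAK_CALLS_PER_MIN.items()}

# Writes are updates plus inserts; reads are selects.
total_writes_per_sec = peak_per_sec["update"] + peak_per_sec["insert"]
read_write_ratio = PEAK_CALLS_PER_MIN["select"] / (
    PEAK_CALLS_PER_MIN["update"] + PEAK_CALLS_PER_MIN["insert"]
)

for query, rate in peak_per_sec.items():
    print(f"{query}: {rate:.1f} calls/sec")
print(f"writes: {total_writes_per_sec:.1f} calls/sec")
print(f"read/write ratio: {read_write_ratio:.2f}")
```

Under these assumptions the peak workload is read-heavy: roughly 500 selects per second against a little over 100 writes per second.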
<Insert information on data size requirements>