The "contentstore" is where all course assets are stored. It is essentially a wrapper around a GridFS (MongoDB) backend, and it stores binary files, which can be PDFs, WAVs, JPGs, or other formats. The contentstore code is mainly here:
https://github.com/edx/edx-platform/tree/master/cms/djangoapps/contentstore
The contentstore has some known technical problems, explained somewhat in this presentation:
http://doctoryes.github.io/mug_talk_modulestore/#1
An effort to move all course assets out of GridFS and into external storage was begun in 2014 and abandoned. The docs from that effort: GridFS Replacement
Since that effort, the performance team has implemented an optimization for non-locked course assets. See Dave Ormsbee or Toby Lawrence for further details.
To query the contentstore directly, go to /edx/bin and run the appropriate script to enter a mongo shell for the database.
If you need to find out how many assets each course contains, the JavaScript below will help.
/* The original "group" */
db.fs.files.group(
    {
        key: {"_id.course": 1, "_id.org": 1, "_id.run": 1},
        reduce: function(cur, result) { result.count += 1 },
        initial: {count: 0}
    }
)

/* ..and with category included. */
db.fs.files.group(
    {
        key: {"_id.course": 1, "_id.org": 1, "_id.run": 1, "_id.category": 1},
        reduce: function(cur, result) { result.count += 1 },
        initial: {count: 0}
    }
)

var mapFunction = function() {
    var slicer = function(x) { return x.slice(0, x.lastIndexOf("+")) };
    var split_id = null;
    if (typeof this._id === "string")
        split_id = slicer(this._id);
    var key = [ this._id.course, this._id.org, this._id.run, this._id.category, split_id ];
    emit( key, 1 );
};

var reduceFunction = function(key, values) {
    return Array.sum(values);
}

db.fs.files.mapReduce(
    mapFunction,
    reduceFunction,
    { out: {inline: 1} }
)

/* For debugging the mapper... */
var emit = function(key, value) {
    print("emit");
    print("key: " + key + " value: " + tojson(value));
}

/* To find all courses which aren't the three specified. */
db.fs.files.find({"_id.course": {$nin : ["DemoX", "import_test", "LargeCourse101"]}})
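The counting logic of the mapper and reducer can be tried without a database. The following is a minimal sketch in plain JavaScript (run with Node, not the mongo shell) that applies the same keying rule to an in-memory array. The function names `assetKey` and `countAssets`, the sample documents, and the string `_id` format are all hypothetical and only illustrate the two `_id` shapes the mapper distinguishes.

```javascript
// Build a grouping key the same way the mapper does:
// - dict-style ids contribute course, org, run and category;
// - string-style ids contribute everything before the last "+".
function assetKey(doc) {
  var id = doc._id;
  if (typeof id === "string") {
    return id.slice(0, id.lastIndexOf("+"));
  }
  return [id.course, id.org, id.run, id.category].join("/");
}

// Count documents per key; this plays the role of the reducer's sum.
function countAssets(docs) {
  var counts = {};
  docs.forEach(function (doc) {
    var key = assetKey(doc);
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

// Hypothetical sample data, not taken from a real contentstore.
var sampleDocs = [
  { _id: { org: "edX", course: "DemoX", run: "2014", category: "asset", name: "a.jpg" } },
  { _id: { org: "edX", course: "DemoX", run: "2014", category: "asset", name: "b.pdf" } },
  { _id: { org: "edX", course: "DemoX", run: "2014", category: "thumbnail", name: "a.jpg" } },
  { _id: "org+course+run+c.wav" }
];

var counts = countAssets(sampleDocs);
Object.keys(counts).forEach(function (key) {
  console.log(key + ": " + counts[key]);
});
```

Joining the key fields into a string keeps the sketch short; the real mapper emits an array key, which MongoDB compares element by element.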
db.fs.files.aggregate( {$group : { _id : '$<field_name>', count : {$sum : 1}}} ).result
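As one possible way to fill in the `<field_name>` placeholder, the sketch below groups by the same course, org and run fields used in the group() calls. It assumes dict-style `_id` values; the helper name `courseCountPipeline` and the added `$sort` stage are illustrative choices, not part of the original query. The database call runs only when a `db` object exists, so the pipeline itself can be inspected outside a mongo shell.

```javascript
// Build the pipeline as a plain value so it can be inspected without a database.
function courseCountPipeline() {
  return [
    {
      $group: {
        // Compound key built from the sub-fields of a dict-style _id.
        _id: { course: "$_id.course", org: "$_id.org", run: "$_id.run" },
        count: { $sum: 1 }
      }
    },
    // Largest courses first (an addition for readability).
    { $sort: { count: -1 } }
  ];
}

// This part only runs inside a mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var out = db.fs.files.aggregate(courseCountPipeline());
  // Older shells return a document with a `result` array;
  // newer shells return a cursor instead.
  var rows = out.result ? out.result : out.toArray();
  rows.forEach(function (row) {
    print(tojson(row));
  });
}
```

Checking for `out.result` before calling `toArray()` is a hedge against the shell version, which this page does not state.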