Action items:
- Discuss with DEVOPS to increase RAM.
- Wait until the MongoDB instance for forums is hosted in house.
Improvements:
After applying the index on Nov 11, 11:05, we can see a significant improvement as expected.
Regressions:
After applying the index on Nov 11, 11:05, we can see that the 99th percentile began to time out for the users endpoint. After inspecting the index on the read replica, it is not clear why this happened. The slow query that occurs in these timeouts is:
COMMAND database=comments-prod command={:count=>"contents", :query=>{"context"=>"course", "author_id"=>"8567599", "course_id"=>"course-v1:KULeuvenX+EUHURIx+3T2015", "anonymous"=>false, "anonymous_to_peers"=>false, "_type"=>{"$in"=>["CommentThread"]}}}
When running the query against the read replica, the expected index is being used. Removing `author_id_1` should not have affected this query because the compound index `author_id_1_course_id_1` would be used instead. The median remains the same, and the 95th percentile did not spike like the 99th. It seems that the proper indexes are in place; otherwise, there would be a regression in the median as well. One possible explanation is the increase in index size, described under "Index size" below.
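The index check described above can be repeated with MongoDB's `explain` command wrapped around the same `count`. The following is a minimal sketch, not the procedure that was actually used: the helper names `build_explain_count` and `indexes_used` are hypothetical, and the plan walker assumes the usual explain layout in which each stage may carry an `indexName` and nests its children under `inputStage` or `inputStages`.

```python
from typing import Any, Dict, List

# The slow count query from the log line above, written as a Python dict.
SLOW_QUERY: Dict[str, Any] = {
    "context": "course",
    "author_id": "8567599",
    "course_id": "course-v1:KULeuvenX+EUHURIx+3T2015",
    "anonymous": False,
    "anonymous_to_peers": False,
    "_type": {"$in": ["CommentThread"]},
}


def build_explain_count(collection: str, query: Dict[str, Any]) -> Dict[str, Any]:
    """Build an `explain` command document that wraps a `count` command.

    The "explain" key is inserted first because MongoDB reads the command
    name from the first key of the document.
    """
    return {
        "explain": {"count": collection, "query": query},
        "verbosity": "executionStats",
    }


def indexes_used(plan: Dict[str, Any]) -> List[str]:
    """Collect `indexName` from every stage of a plan tree, depth first."""
    names: List[str] = []
    if "indexName" in plan:
        names.append(plan["indexName"])
    children = list(plan.get("inputStages", []))
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    for child in children:
        names.extend(indexes_used(child))
    return names
```

With a driver such as PyMongo, the command document could be passed to `db.command(...)` on the `comments-prod` database of the read replica, and `indexes_used(result["queryPlanner"]["winningPlan"])` would then be expected to list `author_id_1_course_id_1` if the compound index is chosen.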
Index size:
The indexes that we added take up more RAM than the ones we removed. In addition, the `delete_spam` index that was removed had a size of 0, so removing it freed no memory and does not really improve performance at all. Overall, it seems that another 1GB of indexes was added.
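One way to quantify this is to compare the per-index sizes reported before and after the change, for example the `indexSizes` map (index name to bytes) from MongoDB's `collStats` output. The sketch below assumes two such maps were captured; the function names are hypothetical, and no real measurements are included.

```python
from typing import Dict


def index_size_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return the size change in bytes for every index seen in either snapshot.

    An index that was removed shows a negative value, a newly added index
    shows a positive value, and a removed index of size 0 shows 0.
    """
    names = sorted(set(before) | set(after))
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


def total_delta_bytes(before: Dict[str, int], after: Dict[str, int]) -> int:
    """Net change in total index size, in bytes (positive means more RAM needed)."""
    return sum(index_size_delta(before, after).values())
```

Dividing the result of `total_delta_bytes` by `1024 ** 3` gives the net change in GB, which can then be compared against the RAM available on the replica.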