Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

The good:

Deleted 4 indexes.

 

...

Improvements:

After applying the index on Nov 11, 11:05, we can see a significant improvement as expected. 

Image Added

Regressions:

After applying the index on Nov 11, 11:05, we can see that the 99th percentile began to timeout for the users endpoint. After inspection of the index on the read replica, it's not clear to why this could of happened. The slow query that occurs in these timeouts is 

Code Block
COMMAND      database=comments-prod command={:count=>"contents", :query=>{"context"=>"course", "author_id"=>"8567599", "course_id"=>"course-v1:KULeuvenX+EUHURIx+3T2015", "anonymous"=>false, "anonymous_to_peers"=>false, "_type"=>{"$in"=>["CommentThread"]}}}

When running the query against the read replica, the expected index is being used. Removing `author_id_1` should not have affected this because the compound index `author_id_1_course_id_1` would be used instead. The median remains the same and the 95th percentile did not spike like the 99th. It seems like the proper indexes are in place, otherwise, there would be a regression for the median as well. One possible explanation could be the increase of index size explained in this wiki below. 

Image Added

The threads endpoint is experiencing a similar situation as the users endpoint. 

Image Added

Index size:

The indexes that we added takes up more RAM than the ones removed. To add to that, the `delete_spam` index that was removed had a size of 0 which does not really improve performance at all.  Overall it seems like another 1GB of indexes was added.

Expand
titledb.contents.stats(1024*1024) prod

{
"ns" : "comments-prod.contents",
"count" : 7567251,
"size" : 5808,
"avgObjSize" : 804,
"storageSize" : 6624,
"numExtents" : 29,
"nindexes" : 23,
"lastExtentSize" : 511,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 11907,
"indexSizes" : {
"_id_" : 328,
"parent_id_1" : 269,
"parent_ids_1" : 278,
"tags_array_1" : 253,
"author_id_1_course_id_1" : 494,
"votes.up_1__type_1" : 475,
"votes.down_1__type_1" : 467,
"commentable_id_1_created_at_-1" : 461,
"course_id_1__type_1_created_at_-1" : 736,
"_type_1_comment_thread_id_1_author_id_1_updated_at_1" : 489,
"comment_thread_id_1_sk_1" : 301,
"comment_thread_id_1_endorsed_1" : 142,
"_type_1_course_id_1_pinned_-1_created_at_-1" : 751,
"_type_1_course_id_1_pinned_-1_comment_count_-1_created_at_-1" : 793,
"_type_1_course_id_1_pinned_-1_votes.point_-1_created_at_-1" : 851,
"commentable_id_1" : 190,
"_type_1_course_id_1_context_1_pinned_-1_created_at_-1" : 709,
"_type_1_context_1" : 348,
"_type_-1_course_id_1_context_1_pinned_-1_last_activity_at_-1_created_at_-1" : 701,
"_type_-1_course_id_1_commentable_id_1_context_1_pinned_-1_created_at_-1" : 780,
"_type_-1_course_id_1_endorsed_-1_pinned_-1_last_activity_at_-1_created_at_-1" : 677,
"_type_-1_course_id_1_endorsed_-1_pinned_-1_votes.point_-1_created_at_-1" : 725,
"_type_-1_course_id_1_endorsed_-1_pinned_-1_comment_count_-1_created_at_-1" : 679
},
"ok" : 1
}

...