...

/courses/(course_id)/
- GET
/course_topics/(course_id)
- GET
/threads/
- GET
- POST
/threads/(thread_id)
/comments
- GET
- POST
/comments/(comment_id)
- GET not implemented
- PATCH
- DELETE

...

Testing Strategy:

...

Originally the plan was to isolate each endpoint and determine what kind of load it can handle, but after analysis of the data, some of these endpoints seem unnecessary to isolate for a load test. These include DELETE and PATCH, which are a very small part of the overall load in production. In the isolated test for each of these endpoints, the request is paired with its appropriate GET Thread/Comment request. For example, every DELETE Thread request requires a thread_id. We obtain this thread_id by calling GET Thread List with randomized parameters, which returns a list of threads from which one is randomly selected. The selected thread is then DELETEd. Below is the chart of the additional requests we make. As long as the ratio of how many of these requests happen in each task is understood, we can get the desired endpoint distribution.

Request	Requires	Returns	Order of requests
GET Thread	thread_id	Thread	Taken from thread_id pool
GET Thread List		Thread List	GET Thread List
GET Comment List	thread_id	Comment List	GET Thread List	GET Comment List
POST Thread	course_id	Thread	POST Thread
POST Response	thread_id	Comment	GET Thread List	POST Response
POST Comment	Comment	Comment	GET Thread List	GET Comment List	POST Comment
PATCH Thread	thread_id	Thread	GET Thread List	PATCH Thread
PATCH Comment	comment_id	Comment	GET Thread List	GET Comment List	PATCH Comment
DELETE Thread	thread_id	No Content	POST Thread	GET Thread List	DELETE Thread
DELETE Response	comment_id	No Content	GET Thread List	POST Response	GET Comment List	DELETE Response*
DELETE Comment	comment_id	No Content	GET Thread List	GET Comment List	POST Comment	DELETE Comment*

*GET Thread List can always return a response (so we delete a random response), but it will not always return a comment, so the comment that was just created is the one that gets deleted.
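The DELETE Thread chain from the chart above can be sketched roughly as follows. This is a minimal illustration, not the actual test code: it assumes a requests-style client (such as the one a locust user exposes), that GET /threads/ accepts a course_id query parameter, and that the thread list comes back as JSON under a "results" key whose items carry an "id". Those names are assumptions for illustration only.

```python
import random


def delete_random_thread(client, course_id, rng=random):
    """GET Thread List, pick one thread at random, then DELETE it.

    Returns the deleted thread_id, or None if the list was empty.
    """
    # Assumption: course_id is passed as a query parameter and the
    # response body is JSON shaped like {"results": [{"id": ...}, ...]}.
    response = client.get("/threads/", params={"course_id": course_id})
    threads = response.json().get("results", [])
    if not threads:
        # Nothing to delete; the caller decides whether to POST a thread first.
        return None

    # Random selection, so no thread_id has to be stored between requests.
    thread = rng.choice(threads)
    client.delete("/threads/{}".format(thread["id"]))
    return thread["id"]
```

Because the thread_id is obtained inside the task itself, each DELETE Thread also costs one GET Thread List, which is the extra request that has to be accounted for when working out the endpoint distribution.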

Thread and Comment pool:

Various methods of selecting post data were considered.

Selecting threads from a smaller pool or selecting the same thread. Rather than getting the entire list of thread_ids to send requests against, we would store only a random portion of the threads. A test was run to see whether it matters if the retrieved thread is random or not, but the sandbox it was run against did not have the correct mongo indexes set up. Regardless, this strategy would not work when trying to DELETE threads, as the pool of potential threads would be smaller. Additionally, this relies on storing data that must be shared amongst the locust users, which could lead to race conditions: a locust user could try to GET a thread that another locust user is in the middle of DELETEing. When dealing with much larger file IO operations, it could also cause some limitations on the machine that spawns the locusts.
Retrieving the list of thread ids when starting locust. This method was effective up until the number of threads in the data set started to increase. As the median number of posts in a course is ~2000, retrieving them all at the maximum page size of 100 would take 20 queries. Additionally, as mentioned in the above strategy, sharing data amongst the locust users is not a trivial task. Each locust user would try to generate its own list of threads, which is unacceptable. If a thread was POSTed or DELETEd, only that locust user would have the updated information. Attempts at using the lazy module did not work either, as each list of threads was instantiated separately by each locust user. Again, even if the locust users were able to use the same global variables, there would be race conditions.

Calling GET thread_list per DELETE/PATCH/GET_comment. Since the ratio of GET thread_list is significantly higher than any of the other calls except for GET Thread, we can achieve the desired distribution of requests for the discussion API without having to store any of the thread_ids. The table below is a 7 day snapshot on NewRelic for the discussion forums. The only drawback is that in order to GET a single thread, we need to have a thread_id. This issue is discussed in the next bullet.

Action	Count	Relative count (flag_abuse_for_thread = 1)	Discussion API Call
.forum.views:single_thread	675980	4760	GET Thread
.forum.views:forum_form_discussion	234783	1653	GET Thread List
.forum.views:inline_discussion	155176	1093	GET Thread List
create_thread	31176	220	POST Thread
create_comment	27438	193	POST comment
create_sub_comment	14345	101	POST comment
users	13820	97	-
.forum.views:user_profile	12336	87	-
.forum.views:followed_threads	7698	54	GET Thread List
vote_for_comment	6731	47	PATCH Comment
vote_for_thread	6242	44	PATCH Thread
upload	4208	30	-
update_comment	3403	24	PATCH Comment
follow_thread	3870	27	PATCH Thread
update_thread	2827	20	PATCH Thread
delete_thread	2091	15	DELETE Thread
endorse_comment	1232	9	PATCH Comment
delete_comment	770	5	DELETE Comment
flag_abuse_for_comment	373	3	PATCH Comment
flag_abuse_for_thread	142	1	PATCH Thread
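As a rough sketch of how the snapshot could be turned into a request distribution, the counts can be summed per Discussion API call and scaled against the least frequent call. The counts below are copied from the table; rows mapped to "-" are left out. The function names are hypothetical, and scaling against the smallest total is only one possible normalization.

```python
from collections import Counter

# (Discussion API call, 7 day count) pairs copied from the snapshot table.
SNAPSHOT = [
    ("GET Thread", 675980),
    ("GET Thread List", 234783),
    ("GET Thread List", 155176),
    ("POST Thread", 31176),
    ("POST comment", 27438),
    ("POST comment", 14345),
    ("GET Thread List", 7698),
    ("PATCH Comment", 6731),
    ("PATCH Thread", 6242),
    ("PATCH Comment", 3403),
    ("PATCH Thread", 3870),
    ("PATCH Thread", 2827),
    ("DELETE Thread", 2091),
    ("PATCH Comment", 1232),
    ("DELETE Comment", 770),
    ("PATCH Comment", 373),
    ("PATCH Thread", 142),
]


def totals_by_call(rows):
    """Sum the counts of all actions that map to the same API call."""
    totals = Counter()
    for call, count in rows:
        totals[call] += count
    return totals


def endpoint_weights(rows):
    """Scale each call's total so that the least frequent call has weight 1."""
    totals = totals_by_call(rows)
    smallest = min(totals.values())
    return {call: round(count / smallest) for call, count in totals.items()}
```

The resulting weights describe the target distribution only. The additional GET Thread List and GET Comment List requests made inside each DELETE or PATCH task still have to be subtracted from the standalone GET weights to end up with these ratios.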

Using pre-stored thread_id data. Since GET Thread is called more often than GET Thread List, we cannot rely on GET Thread List to supply a thread_id for every GET Thread. Instead, we can use a pre-defined set of thread_ids, as mentioned in the first two bullets. This allows us to test GET Thread in isolation. Unfortunately, the issue of trying to GET a DELETEd thread may still arise. Another option could be to have the locust user call GET Thread List only once and then run multiple GET Thread requests. Again, the same issue arises if one of those threads happens to get DELETEd.
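One way the pre-defined pool could tolerate threads that were DELETEd in the meantime is sketched below. It assumes a requests-style client and that a GET on a deleted thread answers with HTTP 404; both are assumptions, and the function name is hypothetical.

```python
import random


def get_thread_from_pool(client, thread_ids, rng=random):
    """GET a randomly chosen thread from a pre-defined pool of thread_ids.

    Returns (thread_id, found). found is False when the thread appears
    to have been DELETEd by another locust user.
    """
    thread_id = rng.choice(thread_ids)
    response = client.get("/threads/{}".format(thread_id))
    # Assumption: a DELETEd thread is reported as 404 Not Found.
    if response.status_code == 404:
        return thread_id, False
    return thread_id, True
```

Reporting a missing thread as an expected outcome keeps it from being counted as a load-related failure, although it does not remove the underlying race between users.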

Things that were left out:

...

Versions Compared

Old Version 43

New Version 44
