Page Comparison

...

We have gathered and compared two sets of load test results. The former results (i.e. with profile_image) are run on the latest changes and the later are run on older implementation (i.e. without profile_image).

There appears to be anomaly in the results below (highlighted 99%) where the new implementation is faster than the old. Where as, for other results the percentile of new implementation increases significantly with the increase in load. For example, the first pair of results where no. of clients = 48, the difference is in tens or hundrend and for the last pair of results where no. of clients = 610, the difference is in thousands.

Logically; there should only be increase in response time for GET endpoints, but we see in results that all other endpoints show differences too. This is because all PATCH, POST and DELETE endpoints first call GET endpoint to retrieve an 'id' of thread and/or comment and then make further processing on it.

New Relic:

Here is the new relic permaLink_without_profile and permaLink_with_profile where the average rpm for former is 1.55k and for later is 1.48k and 84.5% of requests are made for AccountViewSet.list (i.e. user accounts API for multiple usernames) for later.

...

With Profile Image

...

Without Profile Image

...

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment
DELETE_thread
GET_comment_list
GET_thread
GET_thread_list
PATCH_comment
PATCH_thread
POST_comment_comment
POST_comment_response
POST_thread
auto_auth

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment
DELETE_thread
GET_comment_list
GET_thread
GET_thread_list
PATCH_comment
PATCH_thread
POST_comment_comment
POST_comment_response
POST_thread
auto_auth

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment
DELETE_thread
GET_comment_list
GET_thread
GET_thread_list
PATCH_comment
PATCH_thread
POST_comment_comment
POST_comment_response
POST_thread
auto_auth

No. of clients =

req/s =

...

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment
DELETE_thread
GET_comment_list
GET_thread
GET_thread_list
PATCH_comment
PATCH_thread
POST_comment_comment
POST_comment_response
POST_thread
auto_auth

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment
DELETE_thread
GET_comment_list
GET_thread
GET_thread_list
PATCH_comment
PATCH_thread
POST_comment_comment
POST_comment_response
POST_thread
auto_auth

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment	270	350	1300
DELETE_thread	180	210	280
GET_comment_list	170	250	300
GET_thread	170	230	260
GET_thread_list	240	590	740
PATCH_comment	270	350	450
PATCH_thread	190	260	370
POST_comment_comment	330	440	650
POST_comment_response	280	350	400
POST_thread	170	230	250
auto_auth	220	230	230

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment	270	340	380
DELETE_thread	180	240	360
GET_comment_list	150	210	330
GET_thread	160	220	300
GET_thread_list	170	340	480
PATCH_comment	260	340	370
PATCH_thread	150	260	270
POST_comment_comment	330	410	550
POST_comment_response	280	360	440
POST_thread	170	230	270
auto_auth	210	210	210

No. of clients = 192

req/s = 16.10

Methods	median response time	95%	99%
DELETE_comment	280	370	550
DELETE_thread	180	230	290
GET_comment_list	180	260	370
GET_thread	170	240	340
GET_thread_list	240	610	780
PATCH_comment	260	380	1100
PATCH_thread	170	260	300
POST_comment_comment	340	430	540
POST_comment_response	290	370	400
POST_thread	180	230	350
auto_auth	210	220	220

No. of clients = 192

req/s = 16

Methods	median response time	95%	99%
DELETE_comment	280	360	430
DELETE_thread	180	290	290
GET_comment_list	150	210	290
GET_thread	160	220	310
GET_thread_list	180	350	490
PATCH_comment	250	340	450
PATCH_thread	190	260	430
POST_comment_comment	330	430	550
POST_comment_response	280	370	470
POST_thread	170	230	290
auto_auth	220	230	230

No. of clients = 240

req/s = 18.70

Methods	median response time	95%	99%
DELETE_comment	290	390	470
DELETE_thread	180	250	350
GET_comment_list	180	270	400
GET_thread	180	240	300
GET_thread_list	240	620	800
PATCH_comment	270	400	860
PATCH_thread	200	270	2300
POST_comment_comment	340	470	760
POST_comment_response	290	390	760
POST_thread	180	240	390
auto_auth	220	230	230

No. of clients = 240

req/s = 19.7

Methods	median response time	95%	99%
DELETE_comment	280	370	460
DELETE_thread	180	250	300
GET_comment_list	150	220	280
GET_thread	160	230	280
GET_thread_list	180	350	500
PATCH_comment	260	380	500
PATCH_thread	190	250	380
POST_comment_comment	330	420	500
POST_comment_response	280	370	460
POST_thread	180	240	340
auto_auth	210	220	220

No. of clients = 288

req/s = 23.20

Methods	median response time	95%	99%
DELETE_comment	290	380	460
DELETE_thread	190	290	380
GET_comment_list	180	270	340
GET_thread	180	240	300
GET_thread_list	250	630	790
PATCH_comment	260	350	440
PATCH_thread	170	240	280
POST_comment_comment	340	430	560
POST_comment_response	290	390	470
POST_thread	180	240	320
auto_auth	210	210	210

No. of clients = 288

req/s = 22.2

Methods	median response time	95%	99%
DELETE_comment	280	420	1400
DELETE_thread	180	240	270
GET_comment_list	160	240	660
GET_thread	160	240	490
GET_thread_list	180	360	570
PATCH_comment	270	360	740
PATCH_thread	190	260	630
POST_comment_comment	340	450	620
POST_comment_response	290	440	1300
POST_thread	180	240	1500
auto_auth	220	220	220

No. of clients = 336

req/s = 26.70

Methods	median response time	95%	99%
DELETE_comment	290	410	570
DELETE_thread	190	320	1200
GET_comment_list	190	270	370
GET_thread	180	250	350
GET_thread_list	250	630	800
PATCH_comment	240	370	410
PATCH_thread	180	270	330
POST_comment_comment	350	460	560
OST_comment_response	300	390	470
POST_thread	180	250	320
auto_auth	240	250	250

No. of clients = 336

req/s = 28.2

Methods	median response time	95%	99%
DELETE_comment	290	440	2400
DELETE_thread	190	260	440
GET_comment_list	160	240	550
GET_thread	170	240	690
GET_thread_list	180	370	540
PATCH_comment	270	380	1700
PATCH_thread	190	280	1500
POST_comment_comment	360	480	2400
OST_comment_response	300	430	1300
POST_thread	180	240	290
auto_auth	240	260	260

No. of clients = 384

req/s = 31

Methods	median response time	95%	99%
DELETE_comment	300	400	890
DELETE_thread	190	300	320
GET_comment_list	190	280	410
GET_thread	180	260	360
GET_thread_list	260	660	830
PATCH_comment	230	380	430
PATCH_thread	200	260	400
POST_comment_comment	360	500	590
POST_comment_response	300	410	470
POST_thread	180	250	290
auto_auth	290	360	360

No. of clients = 384

req/s = 31.4

...

Background:

In reference to

Jira Legacy

server	JIRA (openedx.atlassian.net)
serverId	13fd1930-5608-3aac-a5dd-21b934d3a4b4
key	MA-2678

; we have changed forums implementation with following details (for reference; see PR#192):

Changed post/response/comment behaviour to update post's 'last_activity_at' only at time of creation of post and creation of response/comment on a post. Previously post's 'last_activity_at' was being updated for both creation and update.
To calculate 'read' status of a post, used 'last_acitvity_at' instead of 'updated_at'.
To calculate 'unread comment count' for a post, used 'created_at' instead of 'updated_at'.

'unread comment count' was being calculate as:

unread_comment_count = Comment.collection.find(:comment_thread_id => t._id, :author_id => {"$ne" => user.id}, :updated_at => {"$gte" => read_dates[thread_key]}).count
and had a compound index against it
index({_type: 1, comment_thread_id: 1, author_id: 1, updated_at: 1})

With new implementation:

unread_comment_count = Comment.collection.find(:comment_thread_id => t._id, :author_id => {"$ne" => user.id}, :created_at => {"$gte" => read_dates[thread_key]}).count
So, we removed index
index({_type: 1, comment_thread_id: 1, author_id: 1, updated_at: 1})

and added a new one
index({comment_thread_id: 1, author_id: 1, created_at: 1})

Results:

The load tests were run on 4x c4.2xlarge instances for lms and 3x m4.large instances for forums.

The results of load tests below show differences between the old and new implementation. The two set of results looks quite similar except when it reaches "No. of clients = 336"; where there is huge difference between old and new percentiles as well as sudden rise in percentile for both old and new index with respect to "No. of clients = 224". For all the next tests (i.e. No. of clients = 460, No. of clients = 510, No. of clients = 578), the difference between old and new percentile is minimised and the new index results have lower percentile for most of the endpoints.

I have captured new relic charts too; permaLink_old_index with average rpm = 1.86k and permaLink_new_index with average rpm = 1.81k

54428.10340150047002101900290022014003200200210054002901600330029018003200220120029004101900380035018003800200140029002602702705444350490710500550870list70067044084010005902505784590370210055002109501600250170041002102800570031018004200110026002301700420044019005700380200054002101700530053053005300578477230047011001900_list12003600940130014001901200280

Old Index

New Index

No. of clients = 48048

req/s = 369.804

600530380680600370

Methods	median response time	95%	99%
DELETE_comment	320 250	470310	1200
DELETE_thread	200 160	2901901300	200
GET_comment_list	210150	320190450	310
GET_thread	190140	280180410	220
GET_thread_list	270160	710370900	520
PATCH_comment	290230	450360	1200
PATCH_thread	200190	300240	370
POST_comment_comment	390290	550360	430
POST_comment_response	330250	470310	450
POST_thread	160	190	280	200
auto_auth	340200	350250350	250

No. of clients = 48048

req/s = 388.89

8602001300170390370list5506404101400320610410270

Methods	median response time	median response time	95%	95%	99%	99%
DELETE_comment	320	440		250	270	300	470	480	520
DELETE_thread	170	160	210	300	230	310
GET_comment_list	150	160	200	260	320	350
GET_thread	170	260		150	150	190	250	280	310
GET_thread_	190	390	list	170	190	360	370	520	530
PATCH_comment	280	420		220	240	320	400	410	460
PATCH_thread	200	290	180	190	360	300	330	330
POST_comment_comment	380	530		290	310	340	510	490	680
POST_comment_response	250	260	300	450	390	520
POST_thread	190	260		160	160	200	250	370	340
auto_auth	260	270	220	210	230	230	230	230

No. of clients = 51096

req/s = 4118.405

570440390460420690590400

Methods	median response time	95%	99%
DELETE_comment	320 270	480340	390
DELETE_thread	200170	340230470	330
GET_comment_list	170	210	330
GET_thread	190160	290200	250
GET_thread_list	280190	710440880	610
PATCH_comment	280240	400330	360
PATCH_thread	190200	320260	350
POST_comment_comment	390310	560380	470
POST_comment_response	260	330	490	440
POST_thread	190170	270210	340
auto_auth	230180	250220250	220

No. of clients = 51096

req/s = 4018.46

780360450340850820510780460

Methods	median response time	median response time	95%	95%	99%	99%
DELETE_comment	320	510		270	280	340	360	360	980
DELETE_thread	190	250	170	170	220	230	230	290
GET_comment_list	170	280		160	160	210	220	340	400
GET_thread	180	280	630	GET_thread_list	200	400	160	160	200	210	270	340
GET_thread_list	190	190	460	460	600	640
PATCH_comment290	220	230	310	310	340	460	740
PATCH_thread	190	200	260	280	350	1000
POST_comment_comment	400	580		300	310	370	410	480	680
POST_comment_response	260	260	330	350	420	620
POST_thread	190	270		170	170	210	210	340	330
auto_auth	240	250	250auth	180	210	230	220	230	220

No. of clients =

req/s =

Methods	median response time	95%	99%
DELETE_comment

320

400

490

DELETE_thread

190

240

280

GET_comment_list

190

250

370

GET_thread

190

250

340

GET_thread_list

220

520

660

PATCH_comment

270

360

390

PATCH_thread

230

310

370

POST_comment_comment

350

450

540

POST_comment_response

300

400

490

POST_thread

190

240

340

auto_auth

200

No. of clients =

req/s =

31.

Methods	median response time	median response time	95%	95%	99%	99%
DELETE_comment	310	320	410	420

590

530

DELETE_thread

200

340

	190	190	240	230	250	250
GET_comment_list

170

280

	190	190	250	250	380	380
GET_thread	180

300

180	250	250	360	340
GET_thread_

210

420

list	220	210	530	520	670	660
PATCH_comment

290

500

	260	250	380	360	1400	430
PATCH_thread

200

340

220	230	280	310	400	370
POST_comment_comment

400

600

	340	340	450	470	540	620
POST_comment_response

340

530

	290	290	400	400	510	550
POST_thread	190

290

190	240	230	370	360
auto_auth

210

250

210

200

220

200

220

200

No. of clients =

req/s =

44.

Methods	median response time	95%	99%
DELETE_comment

400

600

700

DELETE_thread

240

320

530

GET_comment_list

240

350

480

GET_thread

240

360

470

GET_thread_list

270

630

800
PATCH_comment	330

510

600

PATCH_thread

290

420

530

POST_comment_comment

420

640

820

POST_comment_response

360

560

730

POST_thread

240

310

440

auto_auth

210

No. of clients =

req/s =

45.

Methods	median response time	median response time	95%	95%	99%	99%
DELETE_comment

330

540

	390	410	650	690	1300	1100
DELETE_thread

200

320

240	230	370	280	430	340
GET_comment_list

180

310

	230	240	350	350	500	480
GET_thread

180

330

	230	240	370	370	600	520
GET_thread

210

450

_list	270	270	640	630	850	790
PATCH_comment	300	310

610

490	480	1200	610
PATCH_thread

210

360

280	290	460	470	640	630
POST_comment_comment

400

600

	410	430	670	660	1000	970
POST_comment_response

340

550

	360	360	560	600	860	850
POST_thread

230

300

310

500

430

auto_auth

240

280

190

210

220

No. of clients = 610270

req/s = 3651.1

130058005800

Methods	median response time	95%	99%
DELETE_comment	1500560	56007900	760010000
DELETE_thread	1300280	5400620	65005200
GET_comment_list	1200300	53003100	66004000
GET_thread	1900310	67003400	91005000
GET_thread_list	1300330	54003000	69004200
PATCH_comment	1400400	60001700	68006500
PATCH_thread	1100380	45005100	59006300
POST_comment_comment	1600560	60007900	760010000
POST_comment_response	1300470	56006100	76008200
POST_thread	1200290	52002900	64003900
auto_auth	320	320	320

No. of clients = 610300

req/s = 48.9

Methods	median response time	95%	99%
DELETE_comment	3601600	6304600	19005900
DELETE_thread	210	650460	1600
GET_comment_list	180	370	1300
GET_thread	190	400	2100
GET_thread_list	220	500	1200
PATCH_comment	310	610	1100
PATCH_thread	220	480	1400
POST2500
GET_comment_commentlist	430480	8901300	24001900
POSTGET_comment_responsethread	360560	6802100	18002900
POSTGET_thread_list	200530	4201700	18002500
autoPATCH_authcomment	270800	4302800430	3500

UPDATE:

To narrow down the behaviour of feature endpoints on high traffic and huge data, I have conducted some more tests and here are the results.

Case 1:

I ran a few tests with a fresh new course for each run and initial data of 100 threads, 10 responses each thread and 7 comments to each response in each course. The percentile shows acceptable numbers as opposed to above results where data was increasing in a single course with each run (see above "with profile image" column). Hence we know the increasing data in any course has directly proportional effect to response percentile.

...

PATCH_thread	710	2700	3600
POST_comment_comment	1400	4300	5500
POST_comment_response	1000	3600	4800
POST_thread	430	1100	1700
auto_auth	240	250	250

No. of clients = 336

req/s = 53

Methods	median response time	95%	99%
DELETE_comment

320

2500

460

5500

530

6500

DELETE_thread

200

980

270

2300

300

2900

GET_comment_list

200

800

300

2100

380

2700

GET_thread

190

920

270

2700

360

3300

GET_thread_list

270

850

680

2400

850

3300

PATCH_comment

220

1500

400

3900

500

5000

PATCH_thread

210

1200

290

3400

330

4100

POST_comment_comment

380

2300

520

5400

590

7000

POST_comment_response

330

1700

460

4400

540

5900

POST_thread

190

740

260

2000

300

2700

auto_auth

240

330

320

1300

320

1300

No. of clients

...

Methods

 = 336

req/s = 54.3

Methods	median response time	median response time	95%	95%	99%	99%
DELETE_comment

350

520

990

	1500	5300	9200	9600	12000	11000
DELETE_thread

220

330

1100

	470	2300	4100	4600	6700	5100
GET_comment_

list

220

360

490

list	510	1800	5000	3700	6800	4500
GET_thread

200

310

450

	590	2100	6500	4800	9900	5700
GET_thread_list

290

580

760

1800

980

4900

PATCH_comment

3900

290 540

6800

490

4800

PATCH_

thread

comment

210

810

320

3500

390

POST_comment_comment

420

600

780

POST_comment_response

350

530

720

POST_thread

200

310

500

auto_auth

230

240

No. of clients = 544: (error rate = 364 (1.57%) )

...

Case 2:

I had created a new fresh course populated in it a huge number of threads, responses and comments (in thousands) and then run GET thread and comment endpoints for both with profile image and without (as these are the only endpoints this changes is reflected in).

Comparing the two set of results; we can see the difference in 99% for with profile image but I believe the numbers are acceptable, only the last two cases that I have highlighted shows anomaly.

For profile image: when no. of clients = 510 shows greater response time than no. of clients = 544. Its reason that I could assume is the users involved in thread and comments for later test are less in number possibly.
For without profile image:
- 99% for no. of clients = 544 is greater than that of with profile image which is weird.
- there is an instant rise for clients=544 than for clients=510; to see if its a valid increase I used no. of clients somewhere between the two numbers i.e. 522, but the 99% was even higher than for 544, again weird.

Without Profile Image

With Profile Image

7600	6400	1100	9700
PATCH_thread	790	2900	5400	6300	7800	7300
POST_comment_comment	1300	4800	9300	9300	12000	11000
POST_comment_response	1000	3700	7700	7800	10000	9600
POST_thread	470	1700	4600	3500	6400	4000
auto_auth	320	710	320	830	320	830

No. of clients = 460

req/s = 44.5

Methods	median response time	95%	99%
DELETE_comment	5600	9800	12000
DELETE_thread	3700	6200	7900
GET_comment_list	3700	6100	8100
GET_thread	5200	9800	13000
GET_thread_list	3700	6800	8600
PATCH_comment	4800	7900	9500
PATCH_thread	4600	7300	9900
POST_comment_comment	5500	9300	11000
POST_comment_response	5000	8300	10000
POST_thread	3500	6000	9200
auto_auth	2000	2600	2600

No. of clients = 48460

req/s = 449.10

error rate =

(0.03%)

210290370170230260250650780autoauth210220

Methods	median response time	95%	99%
DELETE_comment	6100	10000	11000
DELETE_thread	3500	5700	6400
GET_comment_list	3000	5000	5700
GET_thread	4200	7100	8700
GET_thread_list	3100	5300	6200
PATCH_	comment	4400	7800	220

No. of clients = 48

req/s = 4

error rate = 0

480 210 220 220

Methods	median response time	95%	99%
GET_comment_list	170	220	250
GET_thread	170	230	250
GET_thread_list	180	360	8600
PATCH_thread	4100	6900	7600
POST_comment_comment	5700	9900	11000
POST_comment_response	4800	8500	10000
POST_thread	3000	4800	5500
auto_auth	1700	2700	2700

No. of clients = 144510

req/s = 12

error rate = 0

32.4

list

Methods	median response time	95%	99%
DELETE_comment	7700	15000	27000
DELETE_thread	5400	12000	15000
GET_comment_list	2106300	29014000360	31000
GET_thread	170 8600	240 19000270	36000
GET_thread_	250	690	830
auto_auth	230	230	230

No. of clients = 144

req/s = 11.4

error rate = 0

500

Methods	median response time	95%	99%
GET_comment_list	160	230	280
GET_thread	170	240	260
GET_thread_list	180	360	list	6400	14000	32000
PATCH_comment	7000	14000	25000
PATCH_thread	6900	16000	35000
POST_comment_comment	7400	15000	36000
POST_comment_response	7000	14000	26000
POST_thread	6000	11000	32000
auto_auth	220 2900	290 4000290	4400

No. of clients = 288510

req/s = 2333.9

error rate = 0

220310420180 240 280 260

Methods	median response time	95%	99%
DELETE_comment	7600	14000	19000
DELETE_thread	6700	11000	14000
GET_comment_list	6700	11000	16000
GET_thread	9400	16000	22000
GET_thread_list	7000	710	870
auto_auth	220	230	230

No. of clients = 288

req/s = 23.3

error rate = 0

510 230 230 230

Methods	median response time	95%	99%
GET_comment_list	170	230	310
GET_thread	180	240	280
GET_thread_list	190	370	11000	17000
PATCH_comment	6900	12000	16000
PATCH_thread	6900	12000	21000
POST_comment_comment	7600	14000	19000
POST_comment_response	7200	13000	18000
POST_thread	6700	11000	15000
auto_auth	2400	3200	3300

No. of clients = 510578

req/s = 42.3

error rate = 1(0.00%)

26.70

19002600 2000

Methods	median response time	95%	99%
DELETE_comment	9500	26000	32000
DELETE_thread	8700	25000	32000
GET_comment_list	2408600	410 22000	31000
GET_thread	190 13000	360 27000	40000
GET_thread_list	290 8600	880 22000	32000
auto_auth	300	390	390

No. of clients = 510

req/s = 40.8

error rate = 0

640

Methods	median response time	95%	99%
GET_comment_list	180	270	580
GET_thread	190	290	860
GET_thread_list	200	420	PATCH_comment	8800	22000	32000
PATCH_thread	8800	20000	29000
POST_comment_comment	9200	24000	31000
OST_comment_response	9300	24000	32000
POST_thread	8300	20000	32000
auto_auth	320 3600	1300 52001300	5500

No. of clients = 544578

req/s = 4440.402

error rate = 0

360 480 290 390 810 970

Methods	median response time	95%	99%
DELETE_comment	9100	24000	32000
DELETE_thread	8500	22000	26000
GET_comment_list 250	8100	22000	32000
GET_thread 200	12000	26000	40000
GET_thread_list 290	8300	22000	32000
auto_auth	280	310	310

No. of clients = 544

req/s = 45.4

error rate = 0

2500 290

Methods	median response time	95%	99%
GET_comment_list	190	420	2500
GET_thread	200	480	4500
GET_thread_list	210	550	PATCH_comment	8300	23000	31000
PATCH_thread	8000	23000	28000
POST_comment_comment	9000	24000	32000
OST_comment_response	8500	23000	32000
POST_thread	8300	22000	32000
auto_auth 280	4100	5100	290

...

5400

Versions Compared

Old Version 1

New Version Current

Key

With Profile Image

Without Profile Image

Background:

Results:

Old Index

New Index

UPDATE:

Case 1:

Case 2:

Page Comparison

Versions Compared

Old Version 1

New Version Current

Key

<span class="diff-html-changed" data-a11y-before="Start of changed content" data-a11y-after="End of changed content" id="changed-diff-0">[data-colorid=</span>

With Profile Image

Without Profile Image

Background:

Results:

Old Index

New Index

UPDATE:

Case 1:

Case 2: