
We gathered and compared two sets of load test results. The former (i.e. with profile_image) were run against the latest changes, and the latter (i.e. without profile_image) against the older implementation.

There appears to be an anomaly in the results below (highlighted 99%), where the new implementation is faster than the old. For the other results, the percentiles of the new implementation increase significantly as the load increases. For example, in the first pair of results, where no. of clients = 48, the difference is in the tens or hundreds, and in the last pair, where no. of clients = 610, the difference is in the thousands.
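The anomaly check described above can be done mechanically. The following is a minimal sketch, not part of the original test tooling: it takes the 99% values reported for No. of clients = 288 and lists the endpoints where the run with profile images came out faster. It assumes that the first table of each pair is the "With Profile Image" run and the second is the "Without Profile Image" run; the method name `faster_with_profile` is made up for this illustration.

```ruby
# Returns the endpoints whose 99% is lower with profile images than without,
# i.e. the rows that look anomalous.
def faster_with_profile(with_p99, without_p99)
  with_p99.select { |endpoint, p99| p99 < without_p99.fetch(endpoint) }.keys
end

# 99% values for No. of clients = 288, copied from the tables on this page.
# Assumption: the first table of the pair is the "With Profile Image" run.
with_p99 = {
  "DELETE_comment"   => 460,
  "GET_comment_list" => 340,
  "GET_thread"       => 300,
  "GET_thread_list"  => 790,
  "POST_thread"      => 320
}
without_p99 = {
  "DELETE_comment"   => 1400,
  "GET_comment_list" => 660,
  "GET_thread"       => 490,
  "GET_thread_list"  => 570,
  "POST_thread"      => 1500
}

puts faster_with_profile(with_p99, without_p99).inspect
```

Under that assumption, only GET_thread_list in this subset behaves as expected (slower with profile images); the other four rows are the kind of result flagged as anomalous.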

Logically, only the GET endpoints should show an increase in response time, but the results show differences for all other endpoints too. This is because all PATCH, POST and DELETE endpoints first call a GET endpoint to retrieve the 'id' of a thread and/or comment and then do further processing on it.

New Relic:

Here are the New Relic permaLink_without_profile and permaLink_with_profile. The average rpm is 1.55k for the former and 1.48k for the latter, and for the latter 84.5% of requests are made to AccountViewSet.list (i.e. the user accounts API for multiple usernames).

 

...

With Profile Image

...

Without Profile Image

...

No. of clients = 
req/s = 
Methods                 median response time  95%     99%
DELETE_comment          270                   350     1300
DELETE_thread           180                   210     280
GET_comment_list        170                   250     300
GET_thread              170                   230     260
GET_thread_list         240                   590     740
PATCH_comment           270                   350     450
PATCH_thread            190                   260     370
POST_comment_comment    330                   440     650
POST_comment_response   280                   350     400
POST_thread             170                   230     250
auto_auth               220                   230     230
No. of clients = 
req/s = 
Methods                 median response time  95%     99%
DELETE_comment          270                   340     380
DELETE_thread           180                   240     360
GET_comment_list        150                   210     330
GET_thread              160                   220     300
GET_thread_list         170                   340     480
PATCH_comment           260                   340     370
PATCH_thread            150                   260     270
POST_comment_comment    330                   410     550
POST_comment_response   280                   360     440
POST_thread             170                   230     270
auto_auth               210                   210     210
No. of clients = 192
req/s = 16.10
Methods                 median response time  95%     99%
DELETE_comment          280                   370     550
DELETE_thread           180                   230     290
GET_comment_list        180                   260     370
GET_thread              170                   240     340
GET_thread_list         240                   610     780
PATCH_comment           260                   380     1100
PATCH_thread            170                   260     300
POST_comment_comment    340                   430     540
POST_comment_response   290                   370     400
POST_thread             180                   230     350
auto_auth               210                   220     220
No. of clients = 192
req/s = 16
Methods                 median response time  95%     99%
DELETE_comment          280                   360     430
DELETE_thread           180                   290     290
GET_comment_list        150                   210     290
GET_thread              160                   220     310
GET_thread_list         180                   350     490
PATCH_comment           250                   340     450
PATCH_thread            190                   260     430
POST_comment_comment    330                   430     550
POST_comment_response   280                   370     470
POST_thread             170                   230     290
auto_auth               220                   230     230
No. of clients = 240
req/s = 18.70
Methods                 median response time  95%     99%
DELETE_comment          290                   390     470
DELETE_thread           180                   250     350
GET_comment_list        180                   270     400
GET_thread              180                   240     300
GET_thread_list         240                   620     800
PATCH_comment           270                   400     860
PATCH_thread            200                   270     2300
POST_comment_comment    340                   470     760
POST_comment_response   290                   390     760
POST_thread             180                   240     390
auto_auth               220                   230     230
No. of clients = 240
req/s = 19.7
Methods                 median response time  95%     99%
DELETE_comment          280                   370     460
DELETE_thread           180                   250     300
GET_comment_list        150                   220     280
GET_thread              160                   230     280
GET_thread_list         180                   350     500
PATCH_comment           260                   380     500
PATCH_thread            190                   250     380
POST_comment_comment    330                   420     500
POST_comment_response   280                   370     460
POST_thread             180                   240     340
auto_auth               210                   220     220
No. of clients = 288
req/s = 23.20
Methods                 median response time  95%     99%
DELETE_comment          290                   380     460
DELETE_thread           190                   290     380
GET_comment_list        180                   270     340
GET_thread              180                   240     300
GET_thread_list         250                   630     790
PATCH_comment           260                   350     440
PATCH_thread            170                   240     280
POST_comment_comment    340                   430     560
POST_comment_response   290                   390     470
POST_thread             180                   240     320
auto_auth               210                   210     210
No. of clients = 288
req/s = 22.2
Methods                 median response time  95%     99%
DELETE_comment          280                   420     1400
DELETE_thread           180                   240     270
GET_comment_list        160                   240     660
GET_thread              160                   240     490
GET_thread_list         180                   360     570
PATCH_comment           270                   360     740
PATCH_thread            190                   260     630
POST_comment_comment    340                   450     620
POST_comment_response   290                   440     1300
POST_thread             180                   240     1500
auto_auth               220                   220     220
No. of clients = 336
req/s = 26.70
Methods                 median response time  95%     99%
DELETE_comment          290                   410     570
DELETE_thread           190                   320     1200
GET_comment_list        190                   270     370
GET_thread              180                   250     350
GET_thread_list         250                   630     800
PATCH_comment           240                   370     410
PATCH_thread            180                   270     330
POST_comment_comment    350                   460     560
POST_comment_response   300                   390     470
POST_thread             180                   250     320
auto_auth               240                   250     250
No. of clients = 336
req/s = 28.2
Methods                 median response time  95%     99%
DELETE_comment          290                   440     2400
DELETE_thread           190                   260     440
GET_comment_list        160                   240     550
GET_thread              170                   240     690
GET_thread_list         180                   370     540
PATCH_comment           270                   380     1700
PATCH_thread            190                   280     1500
POST_comment_comment    360                   480     2400
POST_comment_response   300                   430     1300
POST_thread             180                   240     290
auto_auth               240                   260     260
No. of clients = 384
req/s = 31
Methods                 median response time  95%     99%
DELETE_comment          300                   400     890
DELETE_thread           190                   300     320
GET_comment_list        190                   280     410
GET_thread              180                   260     360
GET_thread_list         260                   660     830
PATCH_comment           230                   380     430
PATCH_thread            200                   260     400
POST_comment_comment    360                   500     590
POST_comment_response   300                   410     470
POST_thread             180                   250     290
auto_auth               290                   360     360
No. of clients = 384
req/s = 31.4

...

Background:

In reference to MA-2678 (JIRA, openedx.atlassian.net), we have changed the forums implementation as follows (for reference, see PR#192):

  1. Changed post/response/comment behaviour so that a post's 'last_activity_at' is updated only when the post is created and when a response/comment is created on it. Previously, the post's 'last_activity_at' was updated on both creation and update.
  2. To calculate the 'read' status of a post, 'last_activity_at' is used instead of 'updated_at'.
  3. To calculate the 'unread comment count' for a post, 'created_at' is used instead of 'updated_at'.
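The effect of item 2 on the 'read' status can be illustrated with a small self-contained sketch. This is not the forum service's code: the two predicate names and the hash-based thread are assumptions made for the example, and it assumes a thread counts as read when the user's last read date is at or after the timestamp being compared.

```ruby
# Previous behaviour: read status compared against updated_at.
def read_by_update?(thread, read_date)
  !read_date.nil? && read_date >= thread[:updated_at]
end

# New behaviour: read status compared against last_activity_at.
def read_by_activity?(thread, read_date)
  !read_date.nil? && read_date >= thread[:last_activity_at]
end

created   = Time.at(0)
read_date = Time.at(60)    # the user opens the thread one minute later
edited    = Time.at(120)   # the author then edits the post body

# With the new implementation an edit changes updated_at but not last_activity_at.
thread = { updated_at: edited, last_activity_at: created }

puts read_by_update?(thread, read_date)    # false: the edit alone made the thread look unread
puts read_by_activity?(thread, read_date)  # true: no new activity since the user read it
```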

'unread comment count' was being calculated as:

unread_comment_count = Comment.collection.find(:comment_thread_id => t._id, :author_id => {"$ne" => user.id}, :updated_at => {"$gte" => read_dates[thread_key]}).count

and had a compound index against it
index({_type: 1, comment_thread_id: 1, author_id: 1, updated_at: 1})
 

With the new implementation:

unread_comment_count = Comment.collection.find(:comment_thread_id => t._id, :author_id => {"$ne" => user.id}, :created_at => {"$gte" => read_dates[thread_key]}).count

So we removed the index
index({_type: 1, comment_thread_id: 1, author_id: 1, updated_at: 1})

and added a new one

index({comment_thread_id: 1, author_id: 1, created_at: 1})
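To make the difference between the two queries concrete, here is a minimal in-memory sketch of the same filter (same thread, author other than the reader, date at or after the read date). It is plain Ruby rather than Mongoid, and the method signature and sample data are assumptions made for illustration only.

```ruby
# In-memory stand-in for the Mongo query: counts comments in the thread that
# were not written by the reader and whose date_field is at or after read_date.
def unread_comment_count(comments, thread_id, user_id, read_date, date_field)
  comments.count do |c|
    c[:comment_thread_id] == thread_id &&
      c[:author_id] != user_id &&
      c[date_field] >= read_date
  end
end

# Hypothetical sample data; the reader is user 1 and last read the thread at t=200.
comments = [
  { comment_thread_id: "t1", author_id: 2, created_at: Time.at(100), updated_at: Time.at(100) },
  { comment_thread_id: "t1", author_id: 2, created_at: Time.at(100), updated_at: Time.at(300) }, # edited after the read
  { comment_thread_id: "t1", author_id: 3, created_at: Time.at(250), updated_at: Time.at(250) }, # new after the read
  { comment_thread_id: "t1", author_id: 1, created_at: Time.at(260), updated_at: Time.at(260) }  # the reader's own
]
read_date = Time.at(200)

puts unread_comment_count(comments, "t1", 1, read_date, :updated_at)  # 2 (old: edited + new)
puts unread_comment_count(comments, "t1", 1, read_date, :created_at)  # 1 (new: only the new comment)
```

With 'created_at', an edit to an old comment no longer counts it as unread, and the new index covers exactly the three fields the new query filters on.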

Results:

The load tests were run on 4x c4.2xlarge instances for lms and 3x m4.large instances for forums.

The results of the load tests below show the differences between the old and new implementations. The two sets of results look quite similar until "No. of clients = 336", where there is a huge difference between the old and new percentiles, as well as a sudden rise in percentiles for both the old and new index compared with "No. of clients = 224". For all subsequent tests (i.e. No. of clients = 460, No. of clients = 510, No. of clients = 578), the difference between the old and new percentiles is smaller, and the new index has lower percentiles for most of the endpoints.

I have captured New Relic charts too: permaLink_old_index with average rpm = 1.86k and permaLink_new_index with average rpm = 1.81k.
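For a sense of scale, the gap between the two average rpm figures can be expressed as a relative change. The helper below is only an illustrative sketch (the method name is made up), treating 1.86k and 1.81k as 1860 and 1810 requests per minute.

```ruby
# Relative change from old_value to new_value, in percent, rounded to 2 places.
def relative_change_percent(old_value, new_value)
  ((new_value - old_value) / old_value.to_f * 100).round(2)
end

# Old index: 1.86k rpm, new index: 1.81k rpm (figures from the charts above).
puts relative_change_percent(1860, 1810)  # -2.69
```

A change of under 3% suggests the two runs handled a broadly comparable request volume.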



Old Index

New Index

No. of clients = 48048
req/s = 369.804
600530380680600370
Methodsmedian response time95%99%

DELETE_comment

320

250

4703101200

DELETE_thread

200

160

2901901300200

GET_comment_list

210150320190450310

GET_thread

190140280180410220

GET_thread_list

270160710370900520

PATCH_comment

2902304503601200

PATCH_thread

200190300240370

POST_comment_comment

390290550360430

POST_comment_response

330250470310450

POST_thread

160190280200

auto_auth

340200350250350250
No. of clients = 48048
req/s = 388.89
8602001300170390370list5506404101400320610410270
Methodsmedian response timemedian response time95%95%99%99%

DELETE_comment

320440

250270300470480520

DELETE_thread

170160210300230310

GET_comment_list

150160200260320350

GET_thread

170260

150150190250280310

GET_thread_

190390

list

170190360370520530

PATCH_comment

280420

220240320400410460

PATCH_thread

200290180190360300330330

POST_comment_comment

380530

290310340510490680

POST_comment_response

250260300450390520

POST_thread

190260

160160200250370340

auto_auth

260270220210230230230230


No. of clients = 51096
req/s = 4118.405
570440390460420690590400
Methodsmedian response time95%99%
DELETE_comment320

270

480340390
DELETE_thread200170340230470330
GET_comment_list170210330
GET_thread190160290200250
GET_thread_list280190710440880610
PATCH_comment280240400330360
PATCH_thread190200320260350
POST_comment_comment390310560380470
POST_comment_response260330490440
POST_thread190170270210340
auto_auth230180250220250220


No. of clients = 51096
req/s = 4018.46
780360450340850820510780460
Methodsmedian response timemedian response time95%95%99%99%
DELETE_comment320510270280340360360980
DELETE_thread190250170170220230230290
GET_comment_list170280160160210220340400
GET_thread180280630

GET_thread_list

200400160160200210270340
GET_thread_list190190460460600640
PATCH_comment290220230310310340460740
PATCH_thread1902002602803501000
POST_comment_comment400580300310370410480680
POST_comment_response260260330350420620
POST_thread190270170170210210340330
auto_auth240250250auth180210230220230220


No. of clients = 162
req/s = 31
Methods                 median response time  95%     99%
DELETE_comment          320                   400     490
DELETE_thread           190                   240     280
GET_comment_list        190                   250     370
GET_thread              190                   250     340
GET_thread_list         220                   520     660
PATCH_comment           270                   360     390
PATCH_thread            230                   310     370
POST_comment_comment    350                   450     540
POST_comment_response   300                   400     490
POST_thread             190                   240     340
auto_auth               200                   200     200


No. of clients = 
162
req/s = 
31.
6
Methodsmedian response timemedian response time95%95%99%99%

DELETE_comment

310320410420
590
530

DELETE_thread

200340

190190240230250250

GET_comment_list

170280

190190250250380380

GET_thread

180
300
180250250360340

GET_thread_

210420

list

220210530520670660

PATCH_comment

290500

2602503803601400430

PATCH_thread

200340
220230280310400370

POST_comment_comment

400600

340340450470540620

POST_comment_response

340530

290290400400510550

POST_thread

190
290
190240230370360

auto_auth

210250
210200220200220200


No. of clients = 240
req/s = 44.6
Methods                 median response time  95%     99%
DELETE_comment          400                   600     700
DELETE_thread           240                   320     530
GET_comment_list        240                   350     480
GET_thread              240                   360     470
GET_thread_list         270                   630     800
PATCH_comment           330                   510     600
PATCH_thread            290                   420     530
POST_comment_comment    420                   640     820
POST_comment_response   360                   560     730
POST_thread             240                   310     440
auto_auth               210                   210     210


No. of clients = 
240
req/s = 
45.
40
Methodsmedian response timemedian response time95%95%99%99%

DELETE_comment

330540

39041065069013001100

DELETE_thread

200320
240230370280430340

GET_comment_list

180310

230240350350500480

GET_thread

180330

230240370370600520

GET_thread

210450

_list

270270640630850790

PATCH_comment

300310
610
4904801200610

PATCH_thread

210360
280290460470640630

POST_comment_comment

400600

4104306706601000970

POST_comment_response

340550

360360560600860850

POST_thread

230230300310500
430

auto_auth

240280
190210220220220220



No. of clients = 610270
req/s = 3651.1
130058005800
Methodsmedian response time95%99%

DELETE_comment

150056056007900760010000

DELETE_thread

1300280540062065005200

GET_comment_list

12003005300310066004000

GET_thread

19003106700340091005000

GET_thread_list

13003305400300069004200

PATCH_comment

14004006000170068006500

PATCH_thread

11003804500510059006300

POST_comment_comment

160056060007900760010000

POST_comment_response

13004705600610076008200

POST_thread

12002905200290064003900
auto_auth320320320

No. of clients = 610300
req/s = 48.9
Methodsmedian response time95%99%

DELETE_comment

3601600630460019005900

DELETE_thread

2106504601600

GET_comment_list

1803701300

GET_thread

1904002100

GET_thread_list

2205001200

PATCH_comment

3106101100

PATCH_thread

2204801400
POST2500

GET_comment_commentlist

430480890130024001900

POSTGET_comment_responsethread

360560680210018002900

POSTGET_thread_list

200530420170018002500

autoPATCH_authcomment

27080043028004303500

 

UPDATE:

To narrow down the behaviour of the feature endpoints under high traffic and large data volumes, I conducted some more tests; here are the results.

Case 1:

I ran a few tests with a fresh course for each run, each course seeded with 100 threads, 10 responses per thread and 7 comments per response. The percentiles show acceptable numbers, as opposed to the results above, where the data kept growing in a single course with each run (see the "with profile image" column above). Hence we know that growing data in a course has a directly proportional effect on the response percentiles.
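As a quick worked example of the seed size used per course in Case 1, the sketch below multiplies out the counts stated above. The method and keyword names are assumptions for illustration; only the three input numbers come from the test setup.

```ruby
# Computes how many threads, responses and comments a freshly seeded course holds.
def seeded_documents(threads:, responses_per_thread:, comments_per_response:)
  responses = threads * responses_per_thread
  comments  = responses * comments_per_response
  {
    threads: threads,
    responses: responses,
    comments: comments,
    total: threads + responses + comments
  }
end

seed = seeded_documents(threads: 100, responses_per_thread: 10, comments_per_response: 7)
puts seed.inspect
# 100 threads, 1000 responses and 7000 comments, i.e. 8100 documents per course
```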

...

PATCH_thread            710                   2700    3600
POST_comment_comment    1400                  4300    5500
POST_comment_response   1000                  3600    4800
POST_thread             430                   1100    1700
auto_auth               240                   250     250
No. of clients = 336
req/s = 53
Methodsmedian response time95%99%

DELETE_comment

320
2500
460
5500
530
6500

DELETE_thread

200
980
270
2300
300
2900

GET_comment_list

200
800
300
2100
380
2700

GET_thread

190
920
270
2700
360
3300

GET_thread_list

270
850
680
2400
850
3300

PATCH_comment

220
1500
400
3900
500
5000

PATCH_thread

210
1200
290
3400
330
4100

POST_comment_comment

380
2300
520
5400
590
7000

POST_comment_response

330
1700
460
4400
540
5900

POST_thread

190
740
260
2000
300
2700

auto_auth

240
330
320
1300
320
1300


No. of clients

...

Methods
 = 336
req/s = 54.3
Methodsmedian response timemedian response time95%95%99%99%

DELETE_comment

 350 520 990

15005300920096001200011000

DELETE_thread

220 330 1100 

47023004100460067005100

GET_comment_

list
220 360 490 

list

51018005000370068004500

GET_thread

200310 450 

59021006500480099005700

GET_thread_list

290 
580
760 
1800
980 
4900
PATCH_comment
3900
290 540 
6800
490 
4800

PATCH_

thread

comment

210 
810
320 
3500
390 
POST_comment_comment420 600 780 
POST_comment_response350 530 720 
POST_thread200 310 500 
auto_auth230 240 240

No. of clients = 544 (error rate = 364 (1.57%))

...

Case 2:

I created a fresh course, populated it with a large number of threads, responses and comments (in the thousands), and then ran the GET thread and comment endpoints both with and without profile image (as these are the only endpoints this change is reflected in).

Comparing the two sets of results, we can see a difference in the 99% with profile image, but I believe the numbers are acceptable; only the last two cases, which I have highlighted, show an anomaly.

  • With profile image: no. of clients = 510 shows a greater response time than no. of clients = 544. One possible reason is that fewer users were involved in the threads and comments of the latter test.
  • Without profile image:
    • The 99% for no. of clients = 544 is greater than that with profile image, which is weird.
    • There is a sharp rise from clients=510 to clients=544. To check whether this is a valid increase, I ran a test with a number of clients between the two, i.e. 522, but its 99% was even higher than for 544, which is again weird.
Without Profile Image
With Profile Image
7600640011009700

PATCH_thread

79029005400630078007300

POST_comment_comment

13004800930093001200011000

POST_comment_response

1000370077007800100009600

POST_thread

47017004600350064004000

auto_auth

320710320830320830


No. of clients = 460
req/s = 44.5
Methods                 median response time  95%     99%
DELETE_comment          5600                  9800    12000
DELETE_thread           3700                  6200    7900
GET_comment_list        3700                  6100    8100
GET_thread              5200                  9800    13000
GET_thread_list         3700                  6800    8600
PATCH_comment           4800                  7900    9500
PATCH_thread            4600                  7300    9900
POST_comment_comment    5500                  9300    11000
POST_comment_response   5000                  8300    10000
POST_thread             3500                  6000    9200
auto_auth               2000                  2600    2600


No. of clients = 48460
req/s = 449.10
error rate =
1
(0.03%)
210290370170230260250650780autoauth210220
Methodsmedian response time95%99%

DELETE_comment

61001000011000

DELETE_thread

350057006400

GET_comment_list

300050005700

GET_thread

420071008700

GET_thread_list

310053006200

PATCH_

comment

44007800220
No. of clients = 48
req/s = 4

error rate = 0

480 210 220 220
Methodsmedian response time95%99%
GET_comment_list 170220250
GET_thread170 230 250 
GET_thread_list180 360 8600

PATCH_thread

410069007600

POST_comment_comment

5700990011000

POST_comment_response

4800850010000

POST_thread

300048005500

auto_auth

170027002700


No. of clients = 144510
req/s = 12

error rate = 0

32.4
list
Methodsmedian response time95%99%

DELETE_comment

77001500027000

DELETE_thread

54001200015000

GET_comment_list

 21063002901400036031000

GET_thread

170 8600240 19000270 36000

GET_thread_

250 690 830 
auto_auth230 230 230
No. of clients = 144
req/s = 11.4

error rate = 0

500 
Methodsmedian response time95%99%
GET_comment_list160230280
GET_thread170 240 260 
GET_thread_list180 360 

list

64001400032000

PATCH_comment

70001400025000

PATCH_thread

69001600035000

POST_comment_comment

74001500036000

POST_comment_response

70001400026000

POST_thread

60001100032000

auto_auth

220 2900290 40002904400


No. of clients = 288510
req/s = 2333.9
error rate = 0
 220310420180 240 280 260 
Methodsmedian response time95%99%

DELETE_comment

76001400019000

DELETE_thread

67001100014000

GET_comment_list

67001100016000

GET_thread

94001600022000

GET_thread_list

7000710 870 
auto_auth220 230 230
No. of clients = 288
req/s = 23.3

error rate = 0

510 230 230 230
Methodsmedian response time95%99%
GET_comment_list 170230310 
GET_thread180 240 280 
GET_thread_list190 370 1100017000

PATCH_comment

69001200016000

PATCH_thread

69001200021000

POST_comment_comment

76001400019000

POST_comment_response

72001300018000

POST_thread

67001100015000

auto_auth

240032003300


No. of clients = 510578
req/s = 42.3
error rate = 1(0.00%)
26.70
19002600 2000 
Methodsmedian response time95%99%

DELETE_comment

95002600032000

DELETE_thread

87002500032000

GET_comment_list

 2408600410 2200031000

GET_thread

190 13000360 2700040000

GET_thread_list

290 8600880 2200032000
auto_auth300 390 390
No. of clients = 510
req/s = 40.8

error rate = 0

640 
Methodsmedian response time95%99%
GET_comment_list 180 270580
GET_thread190 290 860 
GET_thread_list200 420 

PATCH_comment

88002200032000

PATCH_thread

88002000029000

POST_comment_comment

92002400031000

POST_comment_response

93002400032000

POST_thread

83002000032000

auto_auth

320 36001300 520013005500


No. of clients = 544578
req/s = 4440.402

error rate = 0

360 480 290 390 810 970 
Methodsmedian response time95%99%

DELETE_comment

91002400032000

DELETE_thread

85002200026000

GET_comment_list

250 
81002200032000

GET_thread

200 
120002600040000

GET_thread_list

290 
83002200032000
auto_auth280 310 310
No. of clients = 544
req/s = 45.4

error rate = 0

2500 290 
Methodsmedian response time95%99%
GET_comment_list 190 4202500
GET_thread200 480 4500 
GET_thread_list210 550 

PATCH_comment

83002300031000

PATCH_thread

80002300028000

POST_comment_comment

90002400032000

POST_comment_response

85002300032000

POST_thread

83002200032000

auto_auth

280 
41005100290

...

5400