This would have two parts:
1. A ConfigModel to specify a set of known crawlers.
2. Adding a flag to the constructor for DjangoKeyValueStore to disable writes, and passing that flag through courseware when we get crawler traffic.
This is to help address observed latency spikes where courseware index transactions block on lock contention because they're trying to update the same sequential. It also keeps crawler traffic from causing CSM side effects.
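A rough sketch of how the two parts could fit together, for discussion only. It is plain Python with no Django dependency, and every name in it (`CrawlerConfig`, `known_user_agents`, `ReadOnlyAwareKeyValueStore`, `read_only`, `store_for_request`) is a hypothetical stand-in rather than the actual ConfigModel or DjangoKeyValueStore API:

```python
from dataclasses import dataclass, field


@dataclass
class CrawlerConfig:
    """Hypothetical stand-in for the proposed ConfigModel of known crawlers."""
    known_user_agents: list = field(default_factory=list)

    def is_crawler(self, user_agent: str) -> bool:
        # Case-insensitive substring match against each configured agent.
        # Empty entries are skipped so they cannot match every request.
        ua = user_agent.lower()
        return any(known and known.lower() in ua for known in self.known_user_agents)


class ReadOnlyAwareKeyValueStore:
    """Hypothetical stand-in for DjangoKeyValueStore with a write-disable flag."""

    def __init__(self, backing: dict, read_only: bool = False):
        self._backing = backing
        self._read_only = read_only

    def get(self, key):
        return self._backing[key]

    def set(self, key, value):
        if self._read_only:
            # Drop the write: no row update, so no lock contention.
            return
        self._backing[key] = value


def store_for_request(user_agent: str, config: CrawlerConfig, backing: dict):
    """Build the store for one request, disabling writes for known crawlers."""
    return ReadOnlyAwareKeyValueStore(backing, read_only=config.is_crawler(user_agent))


config = CrawlerConfig(known_user_agents=["Googlebot"])
backing = {"position": 1}

crawler_store = store_for_request("Mozilla/5.0 (compatible; Googlebot/2.1)", config, backing)
crawler_store.set("position", 5)   # dropped, crawler traffic is read-only

user_store = store_for_request("Mozilla/5.0 Firefox/45.0", config, backing)
user_store.set("position", 5)      # applied, normal user traffic still writes
```

In this sketch a dropped write is silent, so a read after a write in the same crawler request returns the old value.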
Analytics discrepancies of events vs. CSM state.
XBlocks that rely on write-then-read within the same request for correct functionality.
Update and follow-up on this: We've turned on the flag to disable CSM writes for known crawlers in production and seen a significant decrease in CSM write latency spikes. This is the 99th percentile chart for CSM writes, comparing a day with this feature turned on to the same period a week ago:
Blue is the last day, dashed yellow is last week, and the units are milliseconds. The time interval was 5 minutes. Our peaks have been cut down by about an order of magnitude, from 2-6 second peaks to much rarer ~200-500 ms peaks. I've also confirmed in Splunk logs that crawlers were running against our site today, so it's not just that we had a lucky day.
I'll keep an eye on this over the next couple of weeks, but our team isn't planning to do any more work on CSM health in the near term if this pattern holds. There is still the issue of pursuing limits on field data size (PLAT-529, ), but that's been a much rarer problem and I'm explicitly prioritizing other stuff in the short term. If you have any concerns, please let me know.
BTW, now that this noise has been cleared out of the courseware index request, we're seeing traces that may indicate other performance issues in courseware rendering. Some of these are absurdly large sequences (i.e. hundreds of problems), but some of them look like they may point to separate issues with old mongo and CCX.
That's great news, Dave! Seems like we're now getting to the more interesting challenges.
I posted this in the perf channel, but one last graph for completeness on this ticket: This is the effect on 99th%ile courseware index views.