Large Instances Meeting Notes 2024-03-19

Introductions

Assign meeting lead and note taker.

@Felipe Montoya will lead the meeting; @Braden MacDonald will take notes.

Greetings and introductions - welcome @Chintan Joshi.
Mentioned Tutor Version Manager from eduNEXT as a possible tool for @Chintan Joshi's need to run multiple Tutor instances on one machine.

Updates from each org on the call

What's new with your deployment(s)?

eduNEXT:

  • has been working on drydock to manage k8s deployments in a more GitOps-oriented way. Added the ability to put annotations on the jobs so that ArgoCD will run them in a specific order. They are now able to initialize instances using the Tutor-defined jobs and their order.

  • has been experiencing performance issues in a couple of installations that are running on bare metal servers. Wondering if anyone has guidelines or best practices for bare metal. (By “bare metal” they mean a very large instance that they install Kubernetes onto.)

  • had a customer raise concerns about their shared ArgoCD instance: if they ever need to change providers, they worry it would be hard to replicate. So eduNEXT is interested in moving to 1:1 ArgoCD per Open edX instance, and wanted to know if there would be interest in incorporating ArgoCD into Harmony.

    • @Braden MacDonald said that sounds interesting and we should explore it as an option.

  • @Moisés González asked what load testing tools others use. They have used functionality built into Grafana and New Relic, but are interested in other tools, especially ones that can incorporate profiling and compare instances.

    • @Felipe Montoya: it would be good to collaborate on building a standard performance measurement tool for Open edX instances. Is anyone interested?

OpenCraft:

  • Continued development of Meilisearch for Studio content search - see thread on Discourse.

  • @Gábor Boros is working on removing functionality from OpenCraft’s internal “Grove” tool and migrating to Harmony. Making good progress. Encountered some issues with installation etc., so he opened several PRs to Harmony to fix those. One other minor issue is that the migration to Harmony causes the load balancer to be replaced and a new cluster IP to be assigned, but that’s expected and manageable.

Harmony project updates

 

Discussion