Large Instances Meeting Notes 2024-01-09

Video recording: https://us02web.zoom.us/rec/share/-ArZjY1QxUMGGC55cIXD3rNcPhJEk1VzJ0axht4t5EGhLaH59nWP1gB93yAd8Ewe.LJHUz1Bh4H4brZCV

 

  1. Assign meeting lead and note taker.

@Felipe Montoya will lead and @Braden MacDonald and Otterbox AI will take notes.

  2. Greetings & introductions as needed.

  3. Updates from each org on the call - 2U, eduNEXT, OpenCraft, Raccoon Gang. What's new with your deployment(s)?

OpenCraft - @Gábor Boros opened a few Harmony PRs over the previous weeks, and most of them still need reviewers. Please review one if you’re interested.

Raccoon Gang - @Maksim Sokolskiy has mostly been on holiday, but the team has worked on an interesting customization for bulk processing of student answers. It has been tested on a big exam, and further testing is in progress. They hope to present it to the community later this year. The problem they were trying to solve is that specialized exams in which many students answer at the same time would overload the database.

Q from @Maksim Sokolskiy : is anyone familiar with AWS Aurora “parallel queries”? Wondering if it will help us with handling more simultaneous users.

→ seems like nobody on the call has experience with it.

eduNEXT - @Felipe Montoya : working on an Elasticsearch issue but otherwise mostly been on holiday. @Jhony Avella : we also had to apply one fix to the Harmony release workflow, as there was an issue where Helm comments were not compatible with the OCI repository. It has been fixed and the workflow is working again.

2U - @Adam Blackwell (Deactivated) has two announcements: (1) he is leaving 2U, and (2) 2U has open sourced its Helm charts repo https://github.com/edx/helm-charts . It’s open source but not accepting contributions. Some pieces of this Helm chart don’t work well for Studio/LMS because of issues with codejail, but we hope to separate it out in the future so that codejail runs separately. It works well for other IDAs.

Q from @Felipe Montoya : if you deploy images using these Helm charts, how do you build the images? A: we used to use a simple GitHub Actions pattern, but now we primarily use GoCD. We define a bunch of YAML that says that whenever a PR merges to any of the specified repos, it picks up the changes, builds a Dockerfile using Argo CD, then pushes that to ECR. We use a private Dockerfile which wraps the public one and injects 2U-specific things like New Relic.

Matt Hughes will be joining instead of Adam in future meetings of this group.

@Felipe Montoya Reminder for 2U SRE: there is no expectation of contributing to the group in any specific way. We are just interested in sharing lessons learned from each other’s experience running large instances.

  4. Harmony project updates: Review list of PRs and issues, and assign anything un-assigned.

There are four open PRs: https://github.com/openedx/openedx-k8s-harmony/pulls

@Maksim Sokolskiy will ask his team to review the Velero chart.

@Adam Blackwell (Deactivated) Related to the OpenFaaS PR: it has functionality for cron jobs, and it can spin up jobs that have the same environment as a given IDA. He asked how OpenCraft uses it; @Gábor Boros said this includes monitoring PRs to spin up sandboxes, collecting metrics, and other things.

  5. Open discussion/questions, if any.