Meeting recording

https://drive.google.com/file/d/14oCPR8AJExDmNJxg2gZ98SFQgLpaK4z9/view?usp=drive_link

Agenda

  1. Assign meeting lead (Felipe Montoya) and note taker (Braden MacDonald). (tick)

  2. Greetings & introductions as needed. (tick)

  3. Updates from each org on the call - 2U, eduNEXT, OpenCraft, Raccoon Gang, Lawrence. What's new with your deployment(s)?

  4. Harmony project updates: Review list of PRs and issues, and assign anything un-assigned.

  5. Open discussion/questions, if any.

...

  • Releasing the chart - Jhony Avella created a new PR that’s from the same repo rather than a fork, and now it seems to be working. You can now use Helm to install it directly for testing.

  • OpenSearch cluster - Maksim Sokolskiy has resolved all the comments. Testing so far has mostly been on minikube, so he would appreciate it if someone could test on EKS. He found a new issue with the “reindex_course” command that affects both OpenSearch and Elasticsearch. The PR is ready for a second round of review.

Meeting chat log

00:06:33 Adam Blackwell (he/him/his): GitHub - argoproj-labs/argocd-vault-plugin: An Argo CD plugin to retrieve secrets from Secret Management tools and inject them into Kubernetes secrets
00:06:40 Adam Blackwell (he/him/his): Argo CD Vault Plugin
00:08:02 Felipe Montoya: @moises you are referring to Introduction - External Secrets Operator?
00:09:10 Moisés González: Yes that one
00:11:55 Adam Blackwell (he/him/his): Kubernetes logs | Vector documentation ?
00:15:10 Moisés González: Sorry I got a spotty connection. Can you write the last thing you said Adam?
00:15:20 Adam Blackwell (he/him/his): I appreciate your appreciation
00:15:37 Adam Blackwell (he/him/his): Axim is hoping to put their eggs in the harmony chart(s)
00:17:49 Jeremy Bowman: Is this relevant to the topic at hand? GitHub - openobserve/openobserve: 🚀 10x easier, 🚀 140x lower storage cost, 🚀 high performance, 🚀 petabyte scale - Elasticsearch/Splunk/Datadog alternative for 🚀 (logs, metrics, traces).
00:27:12 Adam Blackwell (he/him/his): Minimum 3, but the 1.25 EKS upgrades did break a bunch of v1beta things.
00:27:27 Adam Blackwell (he/him/his): I believe only for scheduled cronjobs though, so it didn’t impact learners.
00:28:29 Maksim Sokolskiy: GitHub - aulasneo/tutor-contrib-hpa
Did you use this?
00:30:04 jhony: or GitHub - eduNEXT/tutor-contrib-pod-autoscaling: This repository aims to provide support for HPA and VPA in tutor
00:30:49 Adam Blackwell (he/him/his): We use ls :facepalm:
00:31:24 Adam Blackwell (he/him/his): We want to use helm hooks for mysql migrations
00:32:42 Jeremy Bowman: Need to drop for another meeting
00:34:47 Adam Blackwell (he/him/his): +1
00:36:12 Adam Blackwell (he/him/his): We default our Django services to:
resources:
  limits:
    cpu: 100m
    memory: 512Mi
  requests:
    cpu: 25m
    memory: 512Mi
00:38:32 Adam Blackwell (he/him/his): Bad liveness probes can hurt people*
00:38:59 Moisés González: We can at least have the discussion
00:41:31 Adam Blackwell (he/him/his): For us, notes uses /heartbeat, most other services use /health, we have a `liveness_probe_initial_delay_seconds` and `liveness_probe_initial_delay_seconds` var exposed in the current django helm chart.
00:42:19 Adam Blackwell (he/him/his): (For the curious, GitHub - edx/portal-designer: A place to create and design new learner-portal instances. is public but not part of Open edX)
00:46:46 Adam Blackwell (he/him/his): I would love a better way to do performance testing on a Django service before we move things from EC2 to EKS, right now we have teams write smoke test runbooks, but they are incomplete.
00:48:42 Adam Blackwell (he/him/his): latency under load
00:49:09 Adam Blackwell (he/him/his): We at one point would scale up manually before major news announcements.
00:49:27 Maksim Sokolskiy: ++
00:49:40 Adam Blackwell (he/him/his): That also relates to bad liveness probes which slow down scaling
00:49:57 Felipe Montoya: We ask customers to let us know if they will do big announcements to be able to judge if a manual scale up is necessary
00:51:52 Felipe Montoya: k6
00:54:35 Moisés González: https://k6.io/
00:54:42 Lawrence McDaniel: I have a hard stop in 5 minutes
00:54:56 Adam Blackwell (he/him/his): Sorry for derailing the conversation.
00:55:18 Gábor Boros: Same here about the 5mins
01:03:31 Adam Blackwell (he/him/his): Have to drop off for another meeting, but wanted to thank all of you for all of your inspiring collaboration!
01:03:53 Braden MacDonald: Reacted to “Have to drop off for…” with
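
Note on the resource defaults quoted at 00:36:12: because the CPU request (25m) is lower than the CPU limit (100m), a pod using those values would fall into the Burstable QoS class under the standard Kubernetes rules, even though the memory request and limit match. The following is a minimal Python sketch of that check, assuming a single-container pod. The helper names (parse_cpu, parse_memory, qos_class) are hypothetical and are not part of any chart or library mentioned in the meeting.

```python
# Sketch only: approximates the Kubernetes QoS rules for ONE container.
# Helper names are hypothetical; values mirror the defaults quoted in the chat.

def parse_cpu(value: str) -> float:
    """Return CPU in cores; a trailing 'm' means millicores (100m = 0.1)."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000
    return float(value)


_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory(value: str) -> int:
    """Return memory in bytes for Ki/Mi/Gi suffixes or a plain byte count."""
    for suffix, factor in _MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def qos_class(resources: dict) -> str:
    """Classify a single container's resources block."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    if not requests and not limits:
        return "BestEffort"
    # Guaranteed needs CPU and memory limits, with requests equal to them.
    # A missing request defaults to the corresponding limit.
    guaranteed = (
        "cpu" in limits
        and "memory" in limits
        and parse_cpu(requests.get("cpu", limits["cpu"])) == parse_cpu(limits["cpu"])
        and parse_memory(requests.get("memory", limits["memory"]))
        == parse_memory(limits["memory"])
    )
    return "Guaranteed" if guaranteed else "Burstable"


django_defaults = {
    "limits": {"cpu": "100m", "memory": "512Mi"},
    "requests": {"cpu": "25m", "memory": "512Mi"},
}

print(qos_class(django_defaults))  # Burstable
```

Raising the CPU request to match the limit would move the same block to Guaranteed. Whether that trade-off suits a given deployment is a judgment call this sketch does not address.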