Large Instance Meeting Notes 21.03.2023

Notes

  • @Braden MacDonald chairs the meeting, @Felipe Montoya in the scribe role.

  • Status of the harmony project:

    • Open PR 17: metric server. Gabor and Braden to review.

    • Gabor: do you need to be a core contributor to participate? Not really. CC status is only needed to be a formal maintainer of the repo and to be tagged as the owner of an issue.

    • Felipe also acts as the placeholder assignee in place of Moises.

    • Open PR 24: karpenter. Waiting on Lawrence, who wants a refactor for better modularity, but currently the best available code is in the PR. Nobody is blocked by this PR, so waiting for the refactor is possible. The available design options can also be discussed in Slack or on the PR.

    • Issue #26: Gabor requests a fuller description of this PR. Moises: what is the definition of production ready? We can already run in a very small cluster, but the question remains open whether that is sufficient. The current feature set in master is not production ready: it still lacks a PR that needs to be merged in edx-search. Both eduNEXT and OpenCraft continue analyzing what needs to be done for master to work.

    • Issue 25: Braden and Felipe are the contributors right now. [ACTION] Felipe will nominate @Jhony Avella in the regular CC project.

    • Issue 23: elasticsearch SSL. Not a blocker right now.

    • Issue 22: tutor contrib index. No action needed for now.

    • Issue 21: OpenSearch support. Not a blocker right now.

    • Issue 18: fix references to multi. A nice first contribution for someone. Currently assigned to Moises.

    • Issue 15: Conference talk on k8s. Jhony talked to Max, Gabor and Lawrence and got interesting data to prepare the slides. Revisit the slides on Tuesday with Braden and the other people involved.

    • Issue 14: Jeremy: more discussions with vendors and the SRE team. Diana will take ownership of the issue.

    • Issue 4: will be closed now since the PR is merged.

    • Issue 2: related to PR 17. Lawrence will take the lead on adding Prometheus, as was done for the metric server. Adam: the insights agent plugin is a parallel connection that can be used by the vendors being discussed for Issue 14. Adam to comment on this issue.

  • Devops Working Group:

  • 2U updates:

    • Adam: introducing Tyler Thompson from the SRE team. 2 or 3 services were moved to k8s lately, not using harmony. 2U is interested in what harmony would need to be able to run edx.org there.

    • Jeremy: we need to smooth the learning curve of k8s. To what extent does one need to be an expert on this? What are the compelling reasons to use k8s?

    • Felipe: used ChatGPT to get k8s charts explained in English and translated to equivalent Terraform modules.

    • Jeremy: found completely bogus info about Django plugins that don’t even exist.

    • Braden: it does save time, but proceed with caution.

    • Jeremy: the curve smoothing needs to happen at all levels of the project, not only k8s. Part of the documentation project.

    • Moises: we started using ArgoCD as a way to smooth the curve.

    • Jhony: some of this will be shown in the conference talk.

    • Adam: “app of apps” pattern vs “ApplicationSet”. The SRE team at 2U is happy using ArgoCD. How could one be migrated to the other?

    • Felipe: our IRT also uses the UI happily to check the status of production.

    • Adam: are there any blog posts or memories of how to move Ansible-based workloads to k8s-based ones? Credentials was moved by creating the application layer in k8s, connecting it to the same databases used by the previous infra, and then doing a DNS cutover.