Large Instances Meeting Notes 2025-07-08

AI-generated summary below. May contain mistakes and inaccuracies.

Attendees:

  • @Braden MacDonald

  • @Felipe Montoya

  • @Maksim Sokolskiy

  • @Moisés González


Meeting Overview:

Attendance & General Updates:

  • Many regular participants were on vacation; the meeting was informal.

  • There was discussion about potential participation from other large organizations like WGU and 2U.


Operational Updates:

eduNEXT Cluster Networking Issue (Felipe & Moisés):

  • Issue traced back to outdated Ubuntu AMIs used in AWS EKS clusters.

  • Networking misconfigurations in the old AMIs caused pod disconnections and scheduling failures.

  • Solution: Upgraded to newer AMIs across all clusters and updated maintenance processes.

OpenCraft & Maksim's Updates:

  • OpenCraft experienced no major issues post-conference.

  • OpenCraft is transitioning toward full use of Harmony while aiming to reduce custom tooling such as Grove.

  • Maksim discussed plans to trial Harmony for future projects and potentially contribute feedback. He also described his team's GitLab-based CI/CD setup, which uses spot instances for efficient resource usage, and offered to share more details next time.


Tooling & Automation:

  • Picasso: eduNEXT's GitHub Action for image building and deployment is open source.

    • OpenCraft is interested in evaluating it for reuse.

  • Maksim confirmed Raccoon Gang is using OpenCraft’s fork of the tutor-contrib-s3 plugin and said it’s working well.


Collaboration Opportunities & Shared Goals:

Harmony Adoption:

  • Both eduNEXT and OpenCraft have migrated most or all clusters to Harmony.

  • Consensus that future work should align with Tutor-compatible tooling to ease DevOps-Dev collaboration.

Discovery & Programs Discussion:

  • Attendees expressed general dissatisfaction with maintaining the Discovery service and interest in replacing it with a more maintainable solution. Discovery is actively being deprecated, with no clear one-size-fits-all replacement.

  • Braden mentioned that OpenCraft is developing Learner Pathways for a client and is open to collaboration.

  • Felipe shared related proposals and discussions on better catalog/program handling in Open edX.


Key Action Points & Takeaways:

  1. More AMI vigilance: Update AWS base images regularly to avoid latent networking bugs.

  2. Evaluate Picasso: OpenCraft and others to explore its fit in their build pipeline.

  3. Harmony rollout: Most attendees confirmed using or transitioning to Harmony for large-scale Open edX operations.

  4. Programs support: Shared need for a better alternative to Discovery/Credentials; possibility to align around new community efforts like Learner Pathways.

  5. Follow-up: Maksim will share CI/CD setup details and possibly a pricing analysis in the next meeting.