👥 Participants

🥅 Goals

🗣 Discussion topics

Time | Item | Presenter | Notes

Abstraction Layer for Open edX Core Metrics

  • Review the current list of core metrics

  • We floated the idea of creating an abstraction layer for those core metrics. How should we go about implementing it?

    • Figures compat module

      • AppSembler are planning to follow the Django release process

      • Events are a possible vanguard effort: that work faces the same problems we have with well-specified data models.

        • Schema enforcement is possible with externalized definitions

        • Allows testing of contracts

        • The events team is using Avro schemas

  • Is there a way to demo Figures?

  • Is there any sharable piece of the 2U DBT work?

    • Andy has been thinking about this, but 2U rely on a lot of upstream data cleansing in order to produce the final set of metrics.

    • Probably makes the most sense to have a separate community implementation

    • With an event stream you could fill the tables that the edX analytics API uses

  • Tracking logs

    • They are a mess

    • Our approach has been to fix them in the warehouse

    • Dave advocates for an approach that fixes the data upstream
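
As a rough illustration of the schema enforcement point above (externalized definitions that allow contract testing), here is a minimal sketch in plain Python. It does not use the Avro library or the events team's actual schemas. The `CourseEnrollmentChanged` record, its fields, and the `validate` helper are all hypothetical. The schema is only Avro-shaped JSON, and the check is deliberately strict (unknown fields are rejected), which is a design choice for this sketch rather than Avro behaviour.

```python
import json

# Hypothetical externalized schema. In practice this would live in its own
# file or repository, separate from the code that emits the event.
ENROLLMENT_EVENT_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "CourseEnrollmentChanged",
  "fields": [
    {"name": "user_id", "type": "long"},
    {"name": "course_id", "type": "string"},
    {"name": "mode", "type": "string"},
    {"name": "is_active", "type": "boolean"}
  ]
}
""")

# Map the Avro primitive type names used above to Python types.
PRIMITIVES = {"long": int, "string": str, "boolean": bool}


def validate(record, schema):
    """Return a list of contract violations (empty if the record conforms)."""
    errors = []
    expected = {field["name"]: field["type"] for field in schema["fields"]}
    for name, type_name in expected.items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        py_type = PRIMITIVES[type_name]
        # bool is a subclass of int in Python, so reject it for "long".
        wrong_type = not isinstance(value, py_type) or (
            py_type is int and isinstance(value, bool)
        )
        if wrong_type:
            errors.append(
                f"{name}: expected {type_name}, got {type(value).__name__}"
            )
    for name in record:
        if name not in expected:
            errors.append(f"unexpected field: {name}")
    return errors


good_event = {
    "user_id": 42,
    "course_id": "course-v1:edX+DemoX+Demo_Course",
    "mode": "audit",
    "is_active": True,
}
bad_event = {
    "user_id": "42",
    "course_id": "course-v1:edX+DemoX+Demo_Course",
    "mode": "audit",
}

print(validate(good_event, ENROLLMENT_EVENT_SCHEMA))
print(validate(bad_event, ENROLLMENT_EVENT_SCHEMA))
```

A producer-side unit test could call a check like this on every emitted event, so that a change that breaks the contract fails upstream rather than being patched in the warehouse.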


Plugin for data API is being explored as an option by Tobias at MITx

Tobias Macey

  • Initial use case is related to exporting the course content.

  • Another need is enriching block ids with human-facing names and following breadcrumbs through course navigation

  • Andy asks about the list of users for the course – this data is “toxic waste”, full of PII that you would not want to get by mistake.

  • Expected to use Celery facilities for asynchronous processing

  • Does this need to cover CCX courses?

    • Tobias does not think that it does.

    • Dave worries this may not work by default

  • Would this plugin need to reach across boundaries and use ORM models from other apps?

    • V1 probably yes

    • Hope that this can help define stable APIs

  • Is the long-term vision batch forever, or are streaming changes part of their future plans?

    • See incremental updates over Kafka or Pulsar as a valuable future state

    • Focused on batch for now?

  • Are there any design docs for the data API?

    • MIT are currently in the early discovery phase and the data that will be pulled hasn’t been fully defined.

    • API is focused on raw data now, not core metrics.

  • MIT's code will be in a public repo, and MIT will publish details about design and progress in the Data Working Group channel.

  • Is the plugin model baked enough to consider it a best practice for decomposing the monolith?

    • CI protections are still weak.

    • This was one of the goals of the events and signals work

    • Overall, stable APIs are emergent and not robust yet.
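
To make the block-id enrichment need from this discussion concrete, here is a small sketch of resolving a block id to a human-facing name and a breadcrumb trail. It is not MIT's plugin design, which has not been defined yet. The `BLOCKS` mapping, the display names, and the `breadcrumbs` and `enrich` helpers are hypothetical. The sketch assumes a course export that provides each block's display name and parent id.

```python
# Hypothetical course structure export: block id -> (display name, parent id).
# The ids follow the usage-key shape used by Open edX; the names are made up.
PREFIX = "block-v1:MITx+Demo+2024+type@"
COURSE = PREFIX + "course+block@course"
CHAPTER = PREFIX + "chapter+block@ch1"
SEQUENTIAL = PREFIX + "sequential+block@seq1"
VERTICAL = PREFIX + "vertical+block@unit1"
PROBLEM = PREFIX + "problem+block@q1"

BLOCKS = {
    COURSE: ("Demo Course", None),
    CHAPTER: ("Week 1", COURSE),
    SEQUENTIAL: ("Lesson 1", CHAPTER),
    VERTICAL: ("Unit 1", SEQUENTIAL),
    PROBLEM: ("Quiz question 1", VERTICAL),
}


def breadcrumbs(block_id, blocks):
    """Return display names from the course root down to block_id."""
    trail = []
    seen = set()
    current = block_id
    while current is not None:
        # Guard against malformed exports whose parent links form a loop.
        if current in seen:
            raise ValueError(f"cycle in parent links at {current}")
        seen.add(current)
        name, parent = blocks[current]
        trail.append(name)
        current = parent
    return list(reversed(trail))


def enrich(event, blocks):
    """Return a copy of an event with human-facing names added."""
    trail = breadcrumbs(event["block_id"], blocks)
    return {**event, "block_name": trail[-1], "breadcrumbs": " > ".join(trail)}


enriched = enrich({"event_type": "problem_check", "block_id": PROBLEM}, BLOCKS)
print(enriched["block_name"])
print(enriched["breadcrumbs"])
```

Because the lookup only needs block names and parent links, an export like this can leave out learner data entirely, which sidesteps the PII concern raised about user lists.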

✅ Action items

  • Ed will prod the group for content early in the week on weeks when we have scheduled meetings

⤴ Decisions

  • General consensus that schema enforcement across the platform would be valuable