Data WG 2022-03-24 Meeting Notes

 Date

Mar 24, 2022

 Participants

  • @Edward Zarecor

  • @Diego Millan

  • @Andy Shultz (Deactivated)

  • @Simon Chen

  • @Dave Ormsbee (Axim)

  • @Maria Fernanda Magallanes Z

  • @Tobias Macey

 Goals

  •  

 Discussion topics

Time

Item

Presenter

Notes

5M

Followup on Elasticsearch → OpenSearch plans

@Edward Zarecor

  • There’s a new Slack channel, #search-migration, for tracking plans and progress

  • The current proposal is to migrate a number of our services to use OpenSearch

  • Abandoning *search in favor of MySQL full-text search was investigated, but was deemed not currently feasible because of performance problems

10M

edX Insights acceptance tests

@Simon Chen

  • Do any of the Open edX members run acceptance tests on edx-analytics-dashboard? If so, how do you do it?

  • How valuable do you find these acceptance tests?

  • Notes

    • The tests are not trivial to run locally

    • They use bok-choy, which has been or is being deprecated

    • Do they run in CI or via the deployment pipeline?

    • There do seem to be some acceptance tests running in GitHub

    • The cost of doing nothing is ongoing maintenance, but it isn’t high

    • @Jill Vogel any opinion on this matter?

10M

Learner View deprecation

@Andy Shultz (Deactivated)

  • We’re deprecating the learner view because it is expensive and not analytics

  • Where is the right place for this student info to live? Is this “data” even a concern for this WG? The only actual users we have found for it use it as a student directory!

  • Notes

    • View was of detailed user data and modules they interacted with.

    • Was sparsely used

    • PII risk

    • Other alternatives exist for getting similar data, but maybe not to the module level

    • We can see how they are doing on questions, but not at as low a level

    • Will Insights by design avoid pulling in PII in the future? Yes. It will focus on aggregates rather than learner-level data, which can leak out in numerous ways.

    • Can we add an ADR specifying this decision?

    • Already turned off at edX

    • Deprecation has been announced, end of March is the comment deadline

    • It would be removed in Nutmeg

    • If this data (individual learner engagement) comes back, is that a matter for this group?

      • Yes and no; however, being informed of any plans to build a student directory would be helpful.

 

Exporting courses via an API

@Tobias Macey

 

Standard for ETL for Insights

@Maria Fernanda Magallanes Z

  • Starting a conversation among Insights users on standards and how to keep it going as 2U deprecates the pipeline

  • Insights is the only place where access control is done today in our instructor analytics system

  • Insights currently does data transformation beyond what is done in the API. Can we eliminate this?

  • How can we move into a phase of converging on a design recommendation?

  • Starting with ETL and defining our migration path in phases may be a way of doing this.

 Action items

Review the notes for accuracy. Everyone
Create an ADR for the PII stance of Insights as part of the Learner View deprecation work. @Andy Shultz (Deactivated)
MIT had issues authenticating against Studio because of delegated authentication to the LMS. They were being redirected to the login path and were unable to exercise the import/export APIs that exist in Studio.
@Maria Fernanda Magallanes Z will start a conversation in Discourse to try to get more engagement from Insights users.

 Decisions