Announcing the first beta releases of panorama-elt and tutor-contrib-panorama, the basic tools for integrating Open edX and other systems into a data lake. Contributions are welcome!
Notes
Python-based ELT toolkit that aims to be modular and to support diverse data sources and data lakes
Currently focused on AWS and only supports Athena today
Tutor plugin allows running the ELT tools alongside Tutor, but expects an AWS destination for the data
Full support is currently only available in the Kubernetes version; locally, only the ELT part for RDBMS tables is supported
Athena TL;DR
Put files in an S3 bucket
Athena allows SQL over CSV, JSON and other formats
Athena is based on Hive, so an open source equivalent is available, but there are no plans to work on this
The plugin is usable for local installations and dev installations
Docs are currently empty
Tracking logs are not supported locally
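The Athena workflow summarized above (files in S3, then SQL over them) could be sketched roughly as follows. This is a minimal illustration, not how panorama-elt is implemented: the table name, columns, and bucket are hypothetical, and the helper only builds the DDL string.

```python
def build_athena_csv_table_ddl(table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for CSV files stored in S3.

    `columns` is a list of (name, type) pairs using Athena/Hive type names.
    """
    column_defs = ",\n  ".join(f"`{name}` {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {column_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'\n"
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


# Hypothetical table, columns, and bucket, for illustration only.
ddl = build_athena_csv_table_ddl(
    "enrollments",
    [("course_id", "string"), ("user_id", "int"), ("created", "string")],
    "s3://example-datalake/enrollments/",
)
print(ddl)
```

Once a table like this exists, the CSV files under the S3 prefix can be queried with ordinary SQL from the Athena console or an Athena client.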
5 min
Insights 2.0?
@Edward Zarecor
At the conference the 2U product team spoke about “Insights 2.0.”
Can someone describe the scope of that effort?
Will it be valuable outside of 2U?
Notes
Insights 2.0 is super aspirational
2U have designs
Not planned for work for two quarters
Insights data is useful, but has significant gaps
Intends to allow in-context analytics
What they have been working on was focused on replacing the pipeline
If they have to replace the Insights frontend, they would open a number of architectural questions
Should it be combined with the data API?
Should Insights be an MFE?
Should Django be sidelined to avoid the ORM performance tax?
As this isn’t actively in development, this is all speculation
@Dave Ormsbee (Axim) asks whether there has been clarification about the intended audience of Insights 2.0.
Yes, will be staying focused on the aggregates, not individual learners.
Learner view is being deprecated
@Dave Ormsbee (Axim) will the data be primarily instructional or will it include things like program enrollments?
The most popular piece currently is the enrollment dashboard
Imagine that any investment would support the admin persona and probably add additional features for admins
@Edward Zarecor were the designs based on user research or blue sky?
2U did user research, mostly with administrators
They would like to do a big round of customer interviews with instructors
@Edward Zarecor are the designs sharable?
They are very preliminary, Ed would need to work with 2U product to discuss further
@Andrés González what are the specific challenges that 2U are facing with PII?
They are really planning to focus on aggregated data as a general rule to avoid any risks associated with data that could be associated with any particular learner.
Is there a business specific data need that requires individual data access regimes?
@Andy Shultz (Deactivated) proposes an early fork in an analytics design that separates aggregated data and individual data.
10 min
Bite-sized work
@Edward Zarecor
Let’s think about what bite-sized work would be valuable to start on now and commit to doing some of it. I have a few ideas.
@Sofiane Bebert thought this plugin is very useful and asks whether it should be part of the default Open edX release?
@Dave Ormsbee (Axim) not sure whether http://edx.org runs this plugin.
This plugin wasn’t upgraded to Django 3.2, so it is unlikely that edx.org runs it
Walking the course tree is expensive, we do that elsewhere, so that in itself is not a disqualification
Need more data from OpenCraft
Have they considered making it part of the default release?
@Julien Maupetit is there an event for completion at the block level currently, with the block name and the block id?
@Dave Ormsbee (Axim) if it is not the case that this event is already created, it should be easy to do so because the completion API is already persisting this data to a table.
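To make the question concrete, a block-level completion event might carry a payload like the one sketched below. This is an assumption for discussion only: the event name is hypothetical, and the field layout and the 0.0 to 1.0 completion range are illustrative rather than a confirmed platform schema.

```python
import json
from datetime import datetime, timezone

# Hypothetical event name; the notes do not confirm that such an event exists.
EVENT_NAME = "example.block.completion.changed"


def build_block_completion_event(user_id, course_id, block_id, block_name, completion):
    """Build a tracking-log-style dict for a block-level completion change."""
    # Assumption: completion is expressed as a fraction between 0.0 and 1.0.
    if not 0.0 <= completion <= 1.0:
        raise ValueError("completion must be between 0.0 and 1.0")
    return {
        "name": EVENT_NAME,
        "time": datetime.now(timezone.utc).isoformat(),
        "context": {"user_id": user_id, "course_id": course_id},
        "event": {
            "block_id": block_id,
            "block_name": block_name,
            "completion": completion,
        },
    }


# Illustrative identifiers only.
event = build_block_completion_event(
    user_id=42,
    course_id="course-v1:ExampleX+Demo+2022",
    block_id="block-v1:ExampleX+Demo+2022+type@problem+block@abc123",
    block_name="Sample problem",
    completion=1.0,
)
print(json.dumps(event, indent=2))
```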
Figures
@John Baldwin
Was an appsembler contribution to the community
Replaced an internal product they had that generated CSV files
Figures is a dashboard that provides different context views
Course-centric with drill down to view learners
Not great at analytics, really more of an exploratory dashboard
During development they struggled getting robust requirements from users or future users
John dug into what was available in the platform and accepted the definitions that he found there
How can one find block ids – seems like a perennial problem for multiple users.
Figures
A Django app
It’s an Open edX plugin that plugs into the LMS and uses its resources
It is not opinionated about the architecture.
It does not currently process tracking logs
It does create its own tables
The platform models don’t track rich history
Performance of analytics queries, for example, courseware_studentmodule doesn’t have the indexes for efficient querying
Figures does use celery jobs to marshal the data into its data mart
@Sofiane Bebert is working on an update of Figures for Maple. He doesn’t think that an update for Nutmeg will be difficult.
@John Baldwin is also working on a series of fixes that improve performance and fix known bugs. Not really adding new features, but adding some instrumentation. If one does not have persistent grades installed, queries can be very expensive
Figures is an active project, but current level of investment is not high because of other business priorities.
Currently runs in the Appsembler Tahoe platform
Tahoe has a feature flag that lets Figures use a different celery queue than the LMS to prevent resource contention. This is really a server variable that is set at startup.
Interested in a conversation with folks about the Instructor dashboard
Question of user friendliness versus robustness of the data available.
After Sofiane completes the upgrades for Maple and Nutmeg, it would be possible for someone to take on maintenance of the Tutor plugin.
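The separate celery queue mentioned above could be wired up along these lines. This is a sketch under stated assumptions: the environment variable, the queue names, and the task-name pattern are hypothetical and are not taken from Figures or Tahoe. Only Celery's `task_routes` setting, which maps task-name patterns to route options, is a real configuration key.

```python
import os

# Hypothetical names: neither the environment variable nor the queue name
# comes from Figures or Tahoe.
DEFAULT_QUEUE = "lms.default"
ANALYTICS_QUEUE_ENV = "ANALYTICS_CELERY_QUEUE"


def analytics_task_routes(environ=os.environ):
    """Return a Celery task_routes mapping, chosen once at startup.

    If the environment variable is unset, analytics tasks share the default
    queue; otherwise they go to the dedicated queue it names.
    """
    queue = environ.get(ANALYTICS_QUEUE_ENV, DEFAULT_QUEUE)
    # Assumed task-name pattern for the analytics tasks.
    return {"figures.tasks.*": {"queue": queue}}


# Read at process start, e.g. when building the Celery settings.
task_routes = analytics_task_routes()
print(task_routes)
```

Because the value is read at startup, changing the queue means restarting the workers, which matches the "server variable" behavior described in the notes.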
Action items
@Dave Ormsbee (Axim) to report back on block level completion events in the tracking logs.
Update: It doesn’t look like it currently makes it to the tracking logs. It is possible to add this support, but it will likely require refactoring work. See Slack thread for details.
Update: Yes, OpenCraft have a few clients which use this plugin, and so we do support it, though we’re lagging behind somewhat. @Sofiane Bebert has submitted an update for Maple (thank you!) and @Gábor Boros is reviewing.
Should we consider making it part of Open edX by default? Sure, but note:
* Synchronous vs asynchronous: sync aggregation can hurt performance, and async requires cron jobs (would be better served by celery-beat). Synchronous is the default, so it should be configured to async if installed on large deployments.
* Provides a Course/Chapter Progress Bar view which could be integrated into the Learning MFE (current clients use a custom theme on the old courseware view).
* APIs might be useful for data analytics.
* There are also a few old open issues.
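As a rough illustration of the celery-beat option mentioned above, a periodic schedule could replace the cron jobs. The task path and entry name below are hypothetical and do not come from the plugin. Only the `beat_schedule` setting, with its `task` and `schedule` keys, is standard Celery configuration.

```python
# Hypothetical task path; the plugin's real task name is not given in the notes.
AGGREGATION_TASK = "example_aggregator.tasks.update_aggregates"


def build_beat_schedule(interval_seconds=300.0):
    """Return a Celery beat_schedule entry that runs aggregation periodically."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        "update-completion-aggregates": {
            "task": AGGREGATION_TASK,
            # A plain number is interpreted by Celery beat as seconds between runs.
            "schedule": float(interval_seconds),
        }
    }


beat_schedule = build_beat_schedule(600)
print(beat_schedule)
```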