Insights & Analytics Pipeline sub-session

How does the Analytics Pipeline work?

How can we make Analytics Pipeline near-realtime?

  • Smaller hourly partitions for more frequent runs
  • Streaming data using Apache Spark (instead of Hadoop, Hive, Sqoop, etc.). The edX Analytics team is in the process of doing this right now!
  • Use a lighter-weight solution if direct MySQL queries are sufficient
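The hourly-partition idea above can be sketched roughly as follows. This is only an illustration, not the Analytics Pipeline's actual code: the event shape (dicts with a timezone-aware ISO 8601 `time` field) and the function names `hourly_partition_key` and `partition_events` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_partition_key(timestamp: datetime) -> str:
    """Return a partition label such as '2018-05-29T14' (hour in UTC).

    The timestamp is expected to be timezone-aware; a naive one would be
    interpreted as local time by astimezone().
    """
    return timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")


def partition_events(events):
    """Group event dicts by the UTC hour in which they occurred.

    Each event is assumed (hypothetically) to carry an ISO 8601 'time'
    string with an explicit offset, e.g. '2018-05-29T14:05:00+00:00'.
    """
    partitions = defaultdict(list)
    for event in events:
        when = datetime.fromisoformat(event["time"])
        partitions[hourly_partition_key(when)].append(event)
    return dict(partitions)
```

With partitions this small, a job only needs to reprocess the most recent hour instead of a whole day, which is what makes more frequent runs affordable.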

What reporting do we want?

  • Dashboard for blended learning use case (small classes, many copies of the "same" course)
    • Data per learner
    • Divide by class, subject, geography, organisation
  • Timeline to show learning rates/engagement/enrollment as course progresses
    • Tag significant events (course start/end, advertising events, assignment deadlines...)
  • Survey data integration, e.g. to measure learner satisfaction

How has reporting improved?

Lightweight Analytics sub-session

Figures, lightweight analytics, what is it?

  • Analytics for small sites hosted on a single server, where Insights is out of the question. You can start with Figures and grow. Insights is great for course-specific analytics and can handle MOOC-size data; Figures is here to fill the gap.
  • Currently only working on devstack; will soon be production-ready.
  • It includes a JavaScript single-page application and a reusable Django app (to minimize modifications in the platform). It is a plug-in, in line with the open/closed principle (open for extension but closed for modification, cf. Nimisha).
  • An analytics tool useful for data scientists, or for anyone who needs to make management decisions based on how their courseware is doing.
  • Figures gets data from the Django models. More endpoints will be built in the future.
  • Daily snapshots of the aggregate data are stored and can then be rendered as charts.
  • Plan for doing a code walkthrough via hangout.
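The daily snapshot idea can be illustrated with a small sketch. This is not how Figures is actually implemented (Figures reads from Django models); the example uses plain Python so it runs standalone, and `DailyCourseSnapshot`, `build_snapshot`, and the tuple layouts are all hypothetical names and shapes chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyCourseSnapshot:
    """One row of aggregate data for one course on one day (hypothetical shape)."""
    day: date
    course_id: str
    enrollment_count: int
    active_learner_count: int


def build_snapshot(day, course_id, enrollments, activity):
    """Aggregate raw records into a single snapshot row.

    enrollments: iterable of (learner_id, course_id, enrolled_on) tuples
    activity:    iterable of (learner_id, course_id, active_on) tuples
    """
    # Learners enrolled in this course on or before the snapshot day.
    enrolled = {
        learner
        for learner, course, enrolled_on in enrollments
        if course == course_id and enrolled_on <= day
    }
    # Learners with any recorded activity in this course on the snapshot day.
    active = {
        learner
        for learner, course, active_on in activity
        if course == course_id and active_on == day
    }
    # Only count activity from learners who are actually enrolled.
    return DailyCourseSnapshot(day, course_id, len(enrolled), len(active & enrolled))
```

Storing one small row per course per day means the charts can be drawn from the snapshot rows alone, without re-querying the raw records each time.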

What other metrics would we want or not want?

  • A list of courses and a per-course page that provides metrics and charts.
  • We have existing metrics, what other metrics would we want or not want?
  • What demographics would we pull from registration or another external source?

Deployment?

  • Appsembler uses Ansible. EduNext too.
  • Kyoto U: developing a management system with a dashboard for around 16 courses on edX. It covers notifications, course invites for new courses, and a newsletter (msnses) for marketing. The concern is around data processing.

Real-time analytics while keeping a light-weight infrastructure?

  • Near-real-time jobs using Celery
  • Does it need to be real time?
    • For the course authors, it'd be great to see progress on live assignments.
    • Marketing would also love it, though in reality near-real time is probably sufficient.
    • It might be useful for customer retention.

Source of persistent data

  • Course enrollment
  • User profile
  • Number of learners per course
  • Courses per learner
  • Student module data showing how many learners are active
  • Grade percentage
  • When learners completed a section
  • Average learner progress, aggregated
  • Generated certificates, i.e. how many course completions
  • etc.
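A few of the metrics listed above can be derived from simple enrollment and grade records, as in the sketch below. The record shapes (`(learner_id, course_id)` pairs and a dict of grade percentages) and the function names are assumptions for illustration; on a real site these values would come from the platform's Django models.

```python
from collections import Counter
from statistics import mean


def learners_per_course(enrollments):
    """Count distinct learners in each course.

    enrollments: iterable of (learner_id, course_id) pairs; duplicates are ignored.
    """
    return Counter(course for _, course in set(enrollments))


def courses_per_learner(enrollments):
    """Count distinct courses for each learner."""
    return Counter(learner for learner, _ in set(enrollments))


def average_grade_percent(grades):
    """Average grade percentage across learners.

    grades: dict mapping learner_id -> grade percentage (0 to 100).
    Returns 0.0 when there are no grades yet.
    """
    return mean(grades.values()) if grades else 0.0
```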

Multi-tenancy

  • John, Jhony, Qasim are interested in working on this.

Ideas for new features?

  • Which learners are responding and how, plus aggregates
  • Timeline showing engagement rates and assignment deadlines
  • What the learner goals are, with survey data included
  • Improve the problem response report in the instructor dashboard, in both data and usability.