Open edX Data Project
[Owner: Faqir Bilal]
Some background:
The purpose of this project is to provide a comprehensive view of the organizations using Open edX, including their location, engagement level, technical information, and sector/subsector. One of the goals is to understand how the platform is being used so that we can identify potential areas of focus for the product and better serve the needs of users/learners.
The scope of this project is quite broad: we've gathered data from over 4,500 organizations using Open edX and organized it into categories to help us make sense of it all. This project is still a work in progress, but we're excited about the insights we've already gained and the potential for further analysis. We'll keep updating the data so that it stays current and valuable.
Current Status (April 2023)
Understand who is using the Open edX platform and what their use cases are
We have a lot of data points and are now looking to build data visualizations to make sense of them
4k+ organizations! Data includes country of HQ, which version of the software they’re using, how many courses they have (if the website is publicly available), and more
Edly is tracking this info monthly with the help of custom scripts that scrape info from sites
Some stats:
0.97% adoption for Olive, although for many sites we can’t tell the version
148 countries represented
54 languages
Upcoming: Work on a public dashboard and a new scraper for more data
Call to action: Please make recommendations about useful visualizations, anything you’d like to see, websites or info that would be useful to scrape
Here's the scope of the research conducted so far:
1. Market Overview:
The total number of organizations using Open edX: This metric provides a sense of the scale of Open edX adoption across different organizations. It can help us understand the market size and penetration of the platform.
The total number of courses available: This metric provides an idea of the content available on Open edX, and can also be used as a proxy for user engagement. It can help us understand the breadth and depth of the content available on the platform.
Primary sector served: This metric shows the industries or sectors that are using Open edX, and can help us identify which sectors are more likely to adopt the platform. It can help us understand the different use cases for Open edX and how the platform is being used by different organizations.
Open edX version distribution: This metric shows the distribution of Open edX versions being used by the organizations, which is important to track as new versions are released and older versions become outdated. It can help us understand the pace of adoption of new versions and which versions are most popular among users.
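As a rough illustration of the version distribution metric, here is a minimal Python sketch. The record layout (a list of dicts with a `version` key) and the sample values are assumptions made for illustration, not the actual schema of the dataset. Sites whose version could not be detected are counted as "unknown" so that they stay visible in the shares.

```python
from collections import Counter

def version_distribution(records):
    """Return {version: share of all records}.

    Records with a missing or empty version are grouped under "unknown".
    The "version" key is a hypothetical field name.
    """
    counts = Counter((r.get("version") or "unknown") for r in records)
    total = sum(counts.values())
    return {v: n / total for v, n in counts.items()} if total else {}

# Made-up sample records, for illustration only.
sample = [
    {"org": "A", "version": "olive"},
    {"org": "B", "version": "nutmeg"},
    {"org": "C", "version": None},
    {"org": "D", "version": "olive"},
]
shares = version_distribution(sample)
```

Whether unknown-version sites belong in the denominator is a reporting choice; dropping them would raise every known version's share, so the dashboard should probably state which convention it uses.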
2. Course and User Engagement:
The number of employees: This metric shows the size of the organizations using Open edX, and can help us understand the different user bases for the platform. It can also be used as a proxy for the potential reach of the platform.
Average number of courses per employee: This metric shows the engagement level of users on the Open edX platform, and how the platform is being used within organizations. It can help us understand the level of adoption and integration of the platform within organizations.
Average number of courses per user: This metric shows the engagement level of individual users on the Open edX platform, and can help us understand the level of interest and participation in the content available on the platform.
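The courses-per-employee metric can be sketched as follows. This is only one possible definition: it averages the per-organization ratio and skips organizations where the employee count is missing or zero. The field names `courses` and `employees` and the sample numbers are hypothetical.

```python
def avg_courses_per_employee(orgs):
    """Mean of courses/employees across organizations.

    Organizations with a missing or zero employee count, or a missing
    course count, are skipped. Returns None if nothing is usable.
    """
    ratios = [
        o["courses"] / o["employees"]
        for o in orgs
        if o.get("employees") and o.get("courses") is not None
    ]
    return sum(ratios) / len(ratios) if ratios else None

# Made-up sample records, for illustration only.
sample = [
    {"org": "A", "courses": 10, "employees": 5},
    {"org": "B", "courses": 3, "employees": 12},
    {"org": "C", "courses": 7, "employees": None},  # skipped
]
result = avg_courses_per_employee(sample)
```

An alternative is to divide total courses by total employees, which weights large organizations more heavily; the two definitions can give quite different numbers, so it is worth picking one explicitly.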
3. Location:
The location of organizations using Open edX: This metric can help us understand the geographic distribution of Open edX usage and identify potential areas for growth or expansion.
Number of organizations by city, state, and country: These metrics can help us understand the concentration of Open edX usage in different regions and countries, and how usage varies across different locations.
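The location counts could be computed along these lines. This is a sketch under assumed field names (`country`, `state`, `city`) with made-up sample data; the same helper groups by a single level or by the full country/state/city hierarchy.

```python
from collections import Counter

def count_by(orgs, *fields):
    """Count organizations by the given fields.

    Keys are tuples of field values; missing values become "unknown".
    Field names are hypothetical.
    """
    return Counter(tuple(o.get(f) or "unknown" for f in fields) for o in orgs)

# Made-up sample records, for illustration only.
sample = [
    {"org": "A", "country": "Pakistan", "state": "Punjab", "city": "Lahore"},
    {"org": "B", "country": "Pakistan", "state": "Punjab", "city": "Lahore"},
    {"org": "C", "country": "USA", "state": None, "city": "Boston"},
]
by_country = count_by(sample, "country")
by_city = count_by(sample, "country", "state", "city")
```

Grouping cities together with their country and state avoids merging different cities that happen to share a name.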
4. Technical:
Open edX version distribution: This metric shows the distribution of Open edX versions being used by the organizations, which is important to track as new versions are released and older versions become outdated. It can help us understand the pace of adoption of new versions and which versions are most popular among users.
Status distribution (live/not live): This metric shows the distribution of the status of Open edX installations, which is important for identifying potential technical issues or maintenance needs.
Language distribution: This metric shows the distribution of languages used on Open edX installations, which can help us understand the global reach of the platform and identify potential opportunities for localization.
5. Sector and Subsector:
Primary sector served: This metric shows the industries or sectors that are using Open edX, and can help us identify which sectors are more likely to adopt the platform. It can help us understand the different use cases for Open edX and how the platform is being used by different organizations.
Subsector distribution: This metric shows the distribution of subsectors within the primary sectors using Open edX, which can help us understand the specific use cases and applications of the platform within different industries.
6. Additional Information:
Year site launched distribution: This metric can help us understand the pace of adoption of Open edX and how the platform has evolved over time.
Credit-bearing courses distribution: This metric can help us understand the different types of courses being offered on Open edX and how the platform is being used within the education and training
Here's how you can help/contribute:
Please take a look at the dashboard skeleton for v1 of this project, and propose any visualizations or useful analysis that we should include as part of our standard reports.