Architectural Design Questions
This document lists the main technical and design questions we need to answer to build a unified user grouping system. This system aims to combine and eventually replace existing group types like cohorts, course groups, and teams. It should allow flexible ways to group users, support plugin-based extensions, and work well at scale.
The questions in this document cover topics such as the data model, how group membership is calculated, how it connects with Aspects, how to manage performance, and how groups behave in different situations.
Not all of these questions need to be answered right away. They are meant to guide the project. As we move through research, design, and implementation, the answers will become clearer and will be documented along the way.
⚠️ This is a living document — more questions and answers may be added as the project evolves. Anyone involved in the project can leave comments or suggest new questions to help us cover all important areas.
Lessons from Existing User Groups
What can we learn from the current implementation of user groups, cohorts, and teams in Open edX?
The current mechanisms were designed independently, which has led to duplicated logic and poor extensibility.
There are multiple configuration points across the platform for managing current mechanisms, which can create confusion for different users (course authors, admins, students).
What technical or design issues from the current system should we avoid when building the unified grouping model?
There are multiple terms used to describe the same concept. For example, “Team Sets” are also referred to as “Topics” or “Groups,” which causes inconsistency.
Data is stored in different places: for instance, Team Sets are saved within the course block (in JSON), while Teams are Django models.
There is no clear or easily accessible way to retrieve which users belong to which groups.
Do group types (cohorts, teams…) differ in definition or only in behavior (e.g., exclusivity, ...)?
They differ in both. The data models are distinct between types, and their behaviors diverge in some cases (e.g., mutual exclusivity, group visibility).
What is the minimal shared structure that all grouping mechanisms (such as teams, cohorts, …) have?
All grouping mechanisms rely on user partitions to control content access.
There is a membership relationship between users and groups in Cohorts and Teams. However, in Enrollment Tracks, no explicit membership model exists—assignment is inferred from the user’s enrollment mode.
More details in: Technical Exploration of Existing User Grouping Mechanisms & Technical Exploration Questions & Answers
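The shared structure described above (explicit membership in Cohorts/Teams vs. inferred membership in Enrollment Tracks) can be sketched in plain Python. All names here are illustrative assumptions, not real Open edX APIs:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the minimal shared structure. Cohorts and Teams
# store an explicit user-to-group membership relation; Enrollment Tracks
# have no stored relation and infer membership from the enrollment mode.

@dataclass
class Group:
    name: str
    members: set = field(default_factory=set)  # explicit membership

def enrollment_track_group(enrollments: dict, mode: str) -> set:
    """Membership inferred on the fly from each user's enrollment mode."""
    return {user for user, m in enrollments.items() if m == mode}
```

A unified model would need to accommodate both patterns: stored memberships and memberships computed from other data.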
Model Design and Structure
Can group roles and behaviors be compared to roles and permissions in the system? Can behaviors be stacked or combined to create new group types?
Functionalities are available for the unified groups, and there is no sense of types (for now): Long-Term Requirements for the Unified Model | 1. Group Model Design
Will the system support different types of user groups with different behaviors (like content access, discussion grouping, messaging), or is there only one general type of user group?
Functionalities are available for the unified groups, and there is no sense of types (for now): Functional & Non-Functional Requirements for the Unified Model | 1. Group Model Design
Should constraints like mutual exclusivity of group membership (e.g., a user can only belong to one group of a certain type) be enforced? If so, at what level should they be implemented — in the data model, at the business logic layer, or elsewhere?
This is still under discussion; TBD.
What kind of historical metadata is relevant and should be kept for each user group?
This was noted in the requirements document for the unified model: Long-Term Requirements for the Unified Model | 1. Group Model Design
How relevant is historical tracking for groups and users' memberships?
Can the group's criteria change after it was created?
Yes!
If no criteria are defined for a group, does that make it a manual group by default?
Manual groups should be explicitly specified: Long-Term Requirements for the Unified Model | 1. Group Model Design
How do we store additional parameters or configurations needed by the criteria without overloading the base model?
TBD, see the proposed alternatives: Unified User Group Model Technical Approach
How will user groups be uniquely identified across scopes (e.g., course/org/instance)? Should group IDs be globally unique or scoped?
TBD, too detailed to answer for now
Criteria Definition and Evaluation
Should individual criteria define their own set of supported operations and value types? For example, should “last login” support numeric/date comparisons (`<`, `>=`, `=`) with `int` or `date` values, while “country” uses string operations (`=`, `in list`)?
Yes! This was noted as part of the functional requirements for the unified model: Long-Term Requirements for the Unified Model | 2. Criteria Management
What should happen if a criterion is removed, uninstalled, or fails to load? Should the rule be disabled automatically, or should the group show an error?
This was noted as part of the functional requirements: Long-Term Requirements for the Unified Model
Would it be possible to support more than two criteria per group in the future?
For now, we'd support up to 3.
What types of criteria would be considered for site/organization scopes?
Examples: learners currently enrolled in, or who have completed, a course in X subject (a course with X tag, a course in X program).
We're going to build the system so it can scale to these scopes, but that's out of scope for the MVP, which will be centered only on course-level groups.
For a criterion that uses Aspects, what data, metrics, or visualizations must be available for that criterion to function correctly?
The aspects plugin should present the data that’s available for the criteria to work: Long-Term Requirements for the Unified Model | 5. Backend and Data Source Integration
Should a criterion be reusable? E.g., a saved `last login < 10 days` criterion from course `course-v1:OpenedX+CC01+2024` used in the course rerun `course-v1:OpenedX+CC01+2025` or in the course `course-v1:OpenedX+CC02+2025`?
It looks like currently, if a course has cohorts and is re-run, the cohorts are maintained from run to run, EXCEPT for cohorts the course admin created manually by uploading a CSV list of learners. I think this functionality should behave the same way: manually created, CSV-uploaded user groups should not carry over, but user groups created from grouping criteria should remain.
This was noted in the requirements list: Long-Term Requirements for the Unified Model | 3. User Assignment to Groups
For performance reasons, should the system rely on precomputed views (e.g., in Aspects or MySQL) when evaluating complex criteria combinations?
For criteria based on frequently changing data (e.g., course progress or daily engagement), how often should group membership be refreshed?
We will define beforehand which refresh cadence to use for each requirement. See for more details: MVP Requirements: Static User Group Creation, Management, and Usage | User Group Management
Should each criterion (or group?) include a recommended refresh frequency as part of its definition?
We will define beforehand which refresh cadence to use for each requirement. See for more details: MVP Requirements: Static User Group Creation, Management, and Usage | User Group Management
Can team characteristics (e.g., language, topic) be used as part of the group definition or as a criteria?
We will consider learner language to be an MVP grouping criterion (if possible). See for more info: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4934664196
How should the system perform intersections between multiple criteria - especially across different data sources (e.g., behavioral criteria from Aspects and demographic data from MySQL)? How can we ensure consistency?
TBD, see the proposals for the unified model: Unified User Group Model Technical Approach
Should each criterion define its own evaluation backend (e.g., Aspects, MySQL)?
Yes!
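If each criterion resolves, via its own backend, to a set of matching user IDs, one simple way to combine criteria across data sources (e.g. Aspects and MySQL) is to intersect the ID sets in the application layer. This is a sketch under that assumption, not the chosen design:

```python
# Hypothetical combiner: each element of `criteria_results` is the set of
# user IDs matched by one criterion, regardless of which backend produced
# it. Group membership is the intersection of all per-criterion sets.

def evaluate_group(criteria_results: list) -> set:
    if not criteria_results:
        return set()
    members = set(criteria_results[0])
    for result in criteria_results[1:]:
        members &= result
    return members
```

The catch, raised in the consistency question above, is that the per-backend snapshots may be taken at different times, so the intersection can mix stale and fresh data.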
Extensibility
Evaluation Frequency and Sync
What is the recommended strategy for group refresh: real-time, scheduled (e.g., via cronjob), or manual only?
More of the MVP requirements around this in our WIP User Group Criteria and Management doc: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4934664196
Should the system support multiple modes depending on the group type or data source? Should it be configurable by the instructor or site admin? (e.g., weekly for expensive queries, daily for login-based groups)
More of the MVP requirements around this in our WIP User Group Criteria and Management doc: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4934664196
What types of user data changes should trigger group re-evaluation for dynamic groups? Are real-time updates even a possibility? (Answer: yes, event-based.) What is the acceptable delay between a user data update and the group reflecting that change?
Should groups clearly show when their membership data is out of date (e.g., via a “stale” warning or last refresh timestamp)?
Noted in the functional requirements for the unified model: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4905762858/Long-Term+Requirements+for+the+Unified+Model#4.-Refresh-Strategies-%26-Staleness
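A staleness indicator like the one asked about above can be derived from two pieces of state, assuming each group stores its last refresh timestamp and a configured refresh cadence (both assumed fields, not part of any existing model):

```python
from datetime import datetime, timedelta, timezone

# Minimal sketch: a group is "stale" when more time than its refresh
# cadence has passed since the last successful refresh.

def is_stale(last_refreshed: datetime, cadence: timedelta) -> bool:
    return datetime.now(timezone.utc) - last_refreshed > cadence
```

The UI could then show the last-refresh timestamp always, and a warning badge only when `is_stale` returns `True`.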
When using automated refresh (e.g., cronjob), how do we preserve visibility into when and why the group changed? Could automatic updates lead to losing historical state?
Audit trail of users being added, removed, and the group updated. Noted in the functional requirements of the unified model: Long-Term Requirements for the Unified Model
If some user attributes (like country of residence) don't change often, should they still trigger a refresh or be treated as static?
This could be updated using an event-based mechanism: MVP Requirements: Static User Group Creation, Management, and Usage | User Group Management
Should the system differentiate between user-triggered refreshes and automated ones (e.g., background sync), especially for auditing or debugging?
This was noted as part of the functional requirements file: Long-Term Requirements for the Unified Model | 1. Group Model Design
Cross-System Consistency
How should we ensure consistency between the LMS and the Aspects dashboard when showing user group membership, especially in cases where data is out of date or out of sync between systems?
This was noted in the requirements file: Long-Term Requirements for the Unified Model | 5. Backend and Data Source Integration and in the consistency strategies document: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4976115715
Data Sources and Backend Integration
Which other backend data sources should be supported for evaluating group criteria (e.g., MySQL, Aspects)? Should the system support two backends initially (MySQL and Aspects), and be extensible for more?
Yes! See the extension proposal for the model: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4917657604/Unified+User+Group+Model+Technical+Approach#5.-What-Should-the-Extensions-Mechanism-Look-Like%3F
Can each criterion define its own evaluation backend?
Can a single criterion rely on different backends?
What should happen if Aspects is temporarily unavailable? Should criteria that depend on it be hidden or disabled?
How should the integration and communication with Aspects be standardized?
TBD, no need to specify at the moment
How can we ensure that Aspects can efficiently return large query results (e.g., user-level data across many courses) without performance degradation?
Superset returns reasonable results with pagination. We’d need to be careful with MySQL queries, though.
Performance and Scalability
What are the known performance risks when evaluating user group criteria at scale, mainly for criteria that depend on Aspects?
TBD, we’d need to do a performance report when defining criteria
How resource-intensive is it to evaluate and combine multiple criteria? From different backends?
TBD
Some criteria may involve expensive operations (e.g., querying Aspects). What is an acceptable response time or timeout threshold for these operations? Should we define fallbacks if these thresholds are exceeded?
Should we implement caching to reduce evaluation load or an invalidation mechanism to avoid updating already up-to-date groups?
Overrides
Can instructors manually remove users from dynamic groups, overriding the criteria?
No!
If so, how should we record and respect manual overrides so that users aren't re-added during refresh?
No overrides allowed
Use Cases and Role Differences
Linking key user group use cases here.
Should we clearly distinguish between instructor-managed groups (like cohorts) and student-managed groups (like some teams)?
TBD, but at the moment the answer is only instructor managed
In student-managed groups, how do we define eligibility criteria (e.g., filter who can see or join a group)?
TBD, but at the moment the answer is only instructor managed
Should group behaviors (e.g., like cohorts) enforce mutually exclusive membership through criteria? For example, what happens if Group A is defined by “last login < 10 days” and Group B by “country = Spain” and a user matches both? How should we handle and validate criteria combinations that are logically incompatible or mutually exclusive?
TBD! But I drafted a proposal for how to handle mutually exclusive groups: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4976115715/User+Group+Consistency+and+Refresh+Framework#3.-Handling-Mutual-Exclusivity
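One possible resolution strategy for the overlap scenario above (a hypothetical illustration; the linked proposal may choose a different approach) is to give mutually exclusive groups a priority order and assign each user only to the highest-priority group they match:

```python
# Hypothetical resolver for mutually exclusive groups: when a user
# matches several exclusive groups, keep only the highest-priority one.

def resolve_exclusive(user_matches, priority):
    """Return the single group the user should join, or None."""
    for group in priority:
        if group in user_matches:
            return group
    return None
```

So a user matching both “last login < 10 days” (Group A) and “country = Spain” (Group B) would land only in whichever group the instructor ranked first.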
In student-managed groups (e.g., like teams), how should criteria be used? Should the system evaluate eligibility when the user attempts to join the group (join-time validation)? Or should group visibility be filtered in advance so that only eligible users can see and join those groups?
Umbrella Model
Should legacy group types (cohorts, teams) become specializations of the unified model, or should they be deprecated over time?
They should be deprecated over time and replaced with the unified model version.
Should we keep the concepts of teams and cohorts to better organize and explain group behaviors, instead of listing features without a clear structure? For example, saying that teams allow collaboration and cohorts control content visibility.
Could course teams choose how they want their group to behave? For example, selecting "behaves like a team" to enable collaboration, or "behaves like a cohort" to control access to content. Since teams can also be used to control access to content, "group types" might not be a good approach: the functionality overlaps, so would types just add confusion?
Can we create new group types by combining or reusing existing behaviors, such as a group that supports collaboration (like teams) and content access (like cohorts)? This already exists with teams.
How could we support adding new behaviors in the future, such as making group-based grading available for any group type to adopt?
Would it make sense to treat these capabilities as built-in to user groups, without needing to define a specific “type”?
Migration Path
User grouping (for now) would be an additional model that includes criteria to select users. How can we connect this model to the existing ones, while respecting the constraints of each?