Configurable priorities and weights for assessment steps

The goal is to unlock additional ways to use ORA problems by allowing more than one of the assessment steps to contribute to the final score.

Problem

  • ORA2 calculates the final score using a fixed priority for the assessment steps: when there is a staff grade, nothing else matters; otherwise, when there are peer grades, their median is the final score and nothing else matters; otherwise, when there is a self assessment, that is the final score. This behavior is described in the documentation here.

  • Course creators and instructors want to accommodate scenarios in which more than one of the assessment steps can be weighted in the final score.

  • The current system allows for staff members to fully override the grade, which is an option that would need to be kept.

Use cases:

  • As an instructor, I want to add an ORA2 problem in which learners will be assessed by their peers and also assess their own work, knowing that both the peers and the self assessment will be considered for the final score.

  • As a learner, I want to engage in an ORA2 problem in which I will be assessed by my peers and also assess my own work, knowing that both the peers and my self assessment will be considered for the final score.

  • As a course creator, I want to add an ORA2 problem in which the course staff assess the learners' submissions first, providing thorough feedback on each criterion, and then the learners can reflect and perform a self assessment, knowing that it will also be considered for the final score.

 

  • Supporting market data: We presented this proposal in the educators working group and built a survey to collect feedback. Here are the results:

 

Proposed solution:

To reimplement the final score calculation function in a way that gives more flexibility via configuration, and to make the current behavior the default configuration, so that no outcome is affected unless the course creator actively changes the configuration of the problem.

Here is a comparison of the different aspects of the current calculation method and the proposed alternative:

Current method

ASSESSMENT_SCORE_PRIORITY = ['staff', 'peer', 'self'] is a platform-wide setting.

Proposed alternative

ASSESSMENT_SCORE_PRIORITY = ['staff', 'peer', 'self'] would become a setting that can be overridden by each ORA problem.

We'd also need to define ASSESSMENT_SCORE_WEIGHTS = [W1, W2, W3], which would be set for each ORA problem:

  • The order of the weights corresponds to the order of the steps set in ASSESSMENT_SCORE_PRIORITY.

  • If the problem does not have the peer or self steps enabled, those weights should be 0.

  • The weight for staff can be set to 100 even if the problem does not have a staff step; this allows staff members to fully override the final score when needed.
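To make the two settings concrete, here is a minimal sketch of how the platform-wide default and the per-problem override could fit together. The dictionary keys and the `effective_config` helper are assumptions for illustration, not the actual ORA2 configuration surface.

```python
# Platform-wide default (preserves current behavior).
ASSESSMENT_SCORE_PRIORITY = ['staff', 'peer', 'self']

# Hypothetical per-problem override, e.g. stored with the problem's settings.
ora_problem_config = {
    'assessment_score_priority': ['peer', 'self'],
    # Weights are percentages aligned with the priority list above;
    # a step that is not enabled gets weight 0.
    'assessment_score_weights': [50, 50],
}

def effective_config(problem_config):
    """Fall back to the platform-wide priority (and all-100 weights,
    i.e. current behavior) when the problem sets nothing."""
    priority = problem_config.get('assessment_score_priority',
                                  ASSESSMENT_SCORE_PRIORITY)
    weights = problem_config.get('assessment_score_weights',
                                 [100] * len(priority))
    return priority, weights
```

With this fallback shape, a problem that configures nothing keeps exactly the current priority-based behavior.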

Current method

  # Once a student has completed all assessments, we search assessment APIs in descending priority order as defined by ASSESSMENT_SCORE_PRIORITY until one of the APIs provides a score.

  # We then use that score as the student's overall score.

Proposed alternative

  # Once a student has completed all assessments, we search assessment APIs in descending priority order as defined by ASSESSMENT_SCORE_PRIORITY for the problem.

#1: Start with:

  Aggregated_weight = 0
  Final_score = 0

#2: If the priority 1 step provides a score (S1), then:

  Aggregated_weight = Aggregated_weight + W1
  Final_score = Final_score + S1*W1

  If Aggregated_weight is 100, go to #5; otherwise, continue.

#3: If the priority 2 step provides a score (S2), then:

  Aggregated_weight = Aggregated_weight + W2
  Final_score = Final_score + S2*W2

  If Aggregated_weight is 100, go to #5; otherwise, continue.

#4: If the priority 3 step provides a score (S3), then:

  Aggregated_weight = Aggregated_weight + W3
  Final_score = Final_score + S3*W3

#5: If there is a staff override score, then:

  Final_score = Staff_override_score

The way the calculation is implemented always allows a staff member to fully override the final grade; even after one staff member has provided a grade, another staff member can fully override it.

In order to keep the ability to fully override a final score, we'd need to implement a distinction between the normal staff assessment performed as part of the activated staff step and an eventual staff override score.
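Steps #1 through #5 above can be sketched as a small Python function. This is an illustrative sketch, not ORA2's actual implementation: the function and parameter names are assumptions, and the division by the aggregated weight (so that partial weights are rescaled to a 0-100 result) is an assumption the proposal would need to confirm.

```python
def calculate_final_score(step_scores, priority, weights, staff_override=None):
    """Combine per-step scores (0-100) using per-problem percentage weights.

    step_scores: dict mapping step name ('staff'/'peer'/'self') to its score,
                 or missing/None if the step produced no score.
    priority:    ordered list of step names (ASSESSMENT_SCORE_PRIORITY).
    weights:     percentages aligned with `priority` (ASSESSMENT_SCORE_WEIGHTS).
    """
    aggregated_weight = 0
    final_score = 0.0
    # Steps #2-#4: walk the steps in priority order, accumulating weighted
    # scores, and stop as soon as 100% of the weight has been consumed.
    for step, weight in zip(priority, weights):
        score = step_scores.get(step)
        if score is None:
            continue  # this step provided no score; keep going
        aggregated_weight += weight
        final_score += score * weight
        if aggregated_weight >= 100:
            break
    # Assumption: normalize by the weight actually applied, so a problem
    # where only some weighted steps responded still yields a 0-100 score.
    if aggregated_weight:
        final_score /= aggregated_weight
    # Step #5: a staff override score always wins.
    if staff_override is not None:
        final_score = staff_override
    return final_score
```

For example, with priority ['peer', 'self'] and weights [50, 50], peer = 80 and self = 60 combine to 70; with the all-100 weights of the current behavior, the first responding step supplies the whole score.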

Here is a list of possible cases with the more flexible configuration. Each case lists the ASSESSMENT_SCORE_PRIORITY for the problem, the ASSESSMENT_SCORE_WEIGHTS for the problem (W1, W2, W3), and an explanation:

  • staff, peer, self — weights 100, 100, 100: This is the current behavior when all 3 steps are activated. Each step is queried in order of priority, and the first one that responds produces 100% of the final score.

  • peer, self — weights 100, 100: This is the current behavior when the staff step is not activated. Each of the activated steps is queried in order of priority, and the first one that responds produces 100% of the final score. Staff members can still fully override the grade if they want to.

  • self — weight 100: This is the current behavior when only the self step is activated. This step provides the final score. Staff members can still fully override the grade if they want to.

  • peer, self — weights 50, 50: This covers the first and second use cases listed above. Staff members can still fully override the grade if they want to.

  • staff, peer — weights 50, 50: This covers the third use case listed above. Staff members can still fully override the grade if they want to.

  • peer, self, staff — weights 30, 30, 40: This covers the general case, where all 3 steps are activated and weighted. Staff members can still fully override the grade if they want to.

This is of course an initial proposed solution. If members of the community have other proposals or ideas, we'd appreciate their feedback in this document.

Other approaches considered:

  • Since the current ORA component only allows one of the assessment steps to affect the final score, course creators could use more than one ORA component with the same prompts and settings, but with different assessment steps in each. This would be very hard to manage, because learners would have to submit the exact same response to each problem.

Competitive research:

  • How do Canvas/Moodle/Coursera solve this problem?

Moodle offers a lot of flexibility for the calculation of final scores in a “Workshop” activity.

The final grade can be split into 2 main weighted components:

  1. Grade for submission. The score a student gets for their submitted work

  2. Grade for assessment. The score a student gets for having reviewed others’ submitted work

Grade for submission

The final grade for every submission is calculated as a weighted mean of the assessment grades given by all reviewers of that submission. This includes the assessments given by peers, and also the assessment given by the submitter, if allowed. The value is rounded to the number of decimal places set in the Workshop settings form.

The teacher can influence the grade in two ways:

  • by providing their own assessment, possibly with a higher weight than usual peer reviewers have

  • by overriding the grade to a fixed value
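Moodle's "grade for submission" behavior described above can be sketched as a weighted mean with an optional teacher override. The function name and the `(grade, weight)` pair representation are assumptions for illustration, not Moodle's actual API.

```python
def grade_for_submission(assessments, decimal_places=2, override=None):
    """Weighted mean of reviewers' grades for one submission.

    assessments: list of (grade, weight) pairs, one per reviewer.
    A teacher can appear as a reviewer with a higher weight than usual,
    or force a fixed value via `override`.
    """
    if override is not None:
        return override
    total_weight = sum(weight for _, weight in assessments)
    if total_weight == 0:
        return None  # no weighted assessments yet
    mean = sum(grade * weight for grade, weight in assessments) / total_weight
    # Moodle rounds to the number of decimal places set in the Workshop form.
    return round(mean, decimal_places)
```

For instance, two peers grading 80 and 70 at weight 1 plus a teacher grading 90 at weight 2 yield (80 + 70 + 180) / 4 = 82.5, showing how a higher teacher weight pulls the mean.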

Grade for assessment

The grade for assessment tries to estimate the quality of assessments that the participant gave to the peers. This grade (also known as grading grade) is calculated by the artificial intelligence hidden within the Workshop module as it tries to do a typical teacher's job.

 

Proposed plan for any relevant usability/UX testing

The main usability challenge in this proposal is giving course authors a clear UI/UX to configure the ORA problem. We plan to produce a low-fidelity interactive prototype and run a few tests with users to validate its clarity and effectiveness.

View proposed UI: Studio, when setting the weights for each of the steps.

 

Plan for long-term ownership/maintainership

edunext is committed to building and contributing this work as part of the Unidigital (Spanish government) project. As part of that commitment, edunext would maintain the feature for a minimum of 2 years and, after that, either find a suitable maintainer to hand it over to, or follow the deprecation procedure in case the feature causes problems or its maintenance becomes a burden that no one can carry.

Open questions for rollout/releases

TBD