Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Info

How to get to CourseGraph for edX internal Users (VPN required): https://coursegraph.edx.org/browser/

Ground Rules: Help The Community! Whenever you come up with a new query, add it here. Even if you think the use case is obscure.

Table of Contents

Table of Contents
printablefalse

Orphans

All non-orphan problems in a course

All non-orphan problems in a course
Code Block
MATCH 
	(course)-[:PARENT_OF*]->(p:problem) 
WHERE 
	p.course_key = <course_key> 
RETURN 
	p

All root orphans in a course

All root orphans in a course
Code Block
match 
	(n:item) 
where 
	NOT () -[:PARENT_OF]-> (n) 
	AND n.course_key ="<course_key>" 
	AND not n.detached  
	AND NOT n.block_type='course' 
return 
	n.location

Number of Non-orphan Items in the modulestore

Number of non-orphan blocks
Code Block
MATCH 
	(c:course) -[:PARENT_OF*]-> (n) 
RETURN 
	count(distinct(c)) + count(distinct(n))

Note that this query isn't perfect. There are some "expected" orphans, like static tabs or about blocks. In a future version of neo4j, we'll include a "detached" parameter for these detached xblock classes.

Number of non-orphan blocks in each course

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(b) 
RETURN 
	c.course_key as id, 
	count(c) as blocks 
ORDER BY 
	blocks DESC

XBlocks

What XBlocks do we have? (total count)

XBlock counts
Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (n:item)
RETURN
    distinct(n.block_type) as block_type,
    count(n) as number
order by
    number DESC

What xBlocks do we have? (across all courses, with counts by type)

This was used in 2016 and June 2020 for tracking / visualization. (data linked)

XBlock counts
Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (n:item)
RETURN
    distinct(n.block_type) as block_type,
    count(distinct(c)) as coursecount,
    count(n) as number
order by
    number DESC

Which courses have a specific XBlock field enabled?

XBlock field matches
Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(n:`block_name`)
WHERE 	
	(n.field_name)
RETURN 
	c.org, 
	c.course_key
ORDER BY 
	c.org, 
	c.course_key

Where are all the X block types used in my course? (with direct link to where they are used)

Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (v:vertical) -[:PARENT_OF*]-> (n:item)
WHERE
    c.course_key = 'courseID' // replace with the course ID
    AND
    n.block_type = 'problem' // insert desired block type here
RETURN  
    n.block_type,
    n.display_name,
    n.location as block_id,
    v.display_name as unit_location,
    replace('https://courses.edx.org/courses/course-id/jump_to_id/','course-id',c.course_key) + right(n.location, 32) as Unit_URL,
    n.edited_on, 
    n.visible_to_staff_only
ORDER BY
  n.edited_on;

What courses contain deprecated advanced modules? 

Code Block
MATCH 
	(c:course) 
WHERE 
	ANY(module 
		IN 
			c.advanced_modules 
		WHERE 
			module = 'peergrading' 
			or module = 'rate' 
			or module = 'randomize' 
			or module = 'peergrading' 
			or module = 'mentoring' 
			or module = 'graphical_slider_tool' 
			or module = 'combinedopenended') 
	AND c.end <>"None"
RETURN 
	distinct c.course_key as course_key, 
	c.advanced_modules as modules, 
	c.end as course_end_date 
order by 
	c.end desc

What courses contain specific advanced modules? 

Code Block
MATCH
    (c:course)
WHERE
'officemix' IN  c.advanced_modules
RETURN
    distinct c.course_key as course_key,
    c.end as course_end_date
order by
    c.end desc

What courses are using known third-party discussion forums?

Code Block
// NOTE: this query takes advantage of the fact that many (but not all) LTI components contain the resource name in the launch_url (e.g. Piazza LTIs contain 'piazza' in the URL
MATCH 
	(c:course)-[:PARENT_OF*]->(n:lti_consumer) 
WHERE 
	n.launch_url CONTAINS 'piazza' OR  n.launch_url CONTAINS 'inscribe' OR n.launch_url CONTAINS 'edstem' OR n.launch_url CONTAINS 'yellowdig' OR n.launch_url CONTAINS 'discourse'
RETURN 
	distinct(c.course_key) as courseruns,
    c.org,
    n.launch_url,
    c.start,
    c.end

Courses with creative-common licenses

Orgs with creative-commons courses, sorted

Code Block
MATCH 
	(c:course)
WHERE 
	c.license CONTAINS "creative-commons"
RETURN 
	distinct(c.org) as org, 
	count(distinct(c.course)) as number_creative_common_courses
ORDER BY 
	number_creative_common_courses DESC

All creative-commons courses

Code Block
MATCH 
	(c:course)
WHERE 
	c.license CONTAINS "creative-commons"
RETURN 
	distinct(c.course) as course, 
	c.display_name as title, 
	c.org as org
ORDER BY 
	course

Cohorts and differentiated content

Courses with cohorts

Courses with cohorts
Code Block
match 
	(c:course)
where 
	size([x in c.user_partitions WHERE x contains "scheme=<class 'openedx.core.djangoapps.course_groups.partition_scheme.CohortPartitionScheme'>"]) >= 1
return 
	c.course_key

...

gives a filtered list of the cohorts in a course. We use the size function to see if there's more than one of them in a course.

Which courses use group visibility settings (i.e. has differentiated content)?

Courses with cohorts
Code Block
match 
	(c:course)-[:PARENT_OF*]->(v:vertical)-[:PARENT_OF*]->(i:item) 
where 
	i.group_access <> '{}' // this attribute is {} when group visibility is not set
return 
	distinct(id(v)) as unit_id,
	c.course_key as course_id,
	c.display_name as course_title,
	c.org as org,
	v.display_name as unit_title,
	("https://studio.edx.org/container/" + v.location) as url

Courses with graded differentiated content

Code Block
match 
	(n:course) -[:PARENT_OF*]-> (c:item)
with 
	c.course_key as course_key, 
	count(c) - sum(case when c.group_access <> '{}' 
	and c.graded then 0 else 1 end) as diff_content
where 
	diff_content <> 0
return 
	course_key

Problems

What types of "problem" XBlocks do we have?

XBlock field matches
Code Block
MATCH 
	(c:course) -[:PARENT_OF*]-> (n:problem)
RETURN 
	// n.data contains tags that indicate what type of problem is being used. This is likely a non-exhaustive list!
	CASE
      WHEN n.data CONTAINS "<choiceresponse"
      	THEN "checkbox"
      WHEN n.data CONTAINS "<multiplechoiceresponse"
      	THEN "multiple choice"
      WHEN n.data CONTAINS "<numericalresponse" 
      	THEN "numerical input" 
      WHEN n.data CONTAINS "<stringresponse"
      	THEN "text input"
      WHEN n.data CONTAINS "<optionresponse"
      	THEN "drop-down"
      WHEN n.data CONTAINS "<imageresponse"
      	THEN "image response"
      WHEN n.data CONTAINS "<annotationresponse"
      	THEN "annotation response"
      WHEN n.data CONTAINS "<formularesponse"
      	THEN "formula response"
      WHEN n.data CONTAINS "<coderesponse"
      	THEN "code response"
      WHEN n.data CONTAINS "<customresponse"
      	THEN "custom"
	ELSE "other" 
    END AS problem_type, 
    count(*) as number

All jsinput problems, counted by org

All jsinput problems, counted by org
Code Block
MATCH 
	(course)-[:PARENT_OF*]->(p:problem) 
WHERE 
	p.data 
CONTAINS 
	'jsinput' 
RETURN 
	distinct(p.org) as org, 
	count(distinct(p.course_key)) as numberOfCourses, 
	count(distinct(p)) as numberOfProblems

Courses with new drag and drop xblocks

new drag and drop problems by course
Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(n:`drag-and-drop-v2`)
RETURN 
	distinct(n.course_key), 
	count(distinct(n)), 
	c.start, 
	c.end 
order by 
	c.start DESC

Courses with old drag and drop problems

old drag and drop problems by course
Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(p:problem)
WHERE 
	p.data CONTAINS "drag_and_drop_input"
RETURN 
	distinct(p.course_key), 
	count(distinct(p)), 
	c.start, 
	c.end 
order by 
	c.start desc

What courses use xqueue (code graders)?

We wanted to identify the open courses with code graders, and to find an example problem from each course so we could look up the queuename. Since code graders aren't a unique problem type, we looked for a phrase specific to code graders in the content of the problem.

...

Code Block
match (c:course) -[:PARENT_OF*]-> (v:vertical) -[:PARENT_OF*]-> (n:problem) 
where n.data contains "coderesponse queuename"
and c.course_key = 'courseID' // replace with the course ID
return v.location, n.location

Where are a handful of modules that use MathJax?

Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (v:vertical) -[:PARENT_OF*]-> (n:item)
WHERE
n.block_type = 'problem'// MathJax may also exist in 'html' block types
AND
n.data CONTAINS '[mathjax]' // Some MathJax equations may also be found using '[mathjax inline]'
    AND
    n.visible_to_staff_only = false 
    AND
    c.start <= '2021-06-29' and c.end > '2021-06-29' // insert today's date for live courses
RETURN  
    c.course_key,
    n.block_type,
    n.display_name,
    n.location as block_id,
    v.display_name as unit_location,
    replace('https://courses.edx.org/courses/course-id/jump_to_id/','course-id',c.course_key) + right(n.location, 32) as Unit_URL
    LIMIT 20;

Course Structure

Components that are direct children of subsections instead of units/verticals

Useful to find Components on old XML-authored courses that are children of the Subsection instead of the Unit (based on the Studio-enforced Course->Section->Subsection->Unit->Component hierarchy).

...

Code Block
match (c:course) -[:PARENT_OF]-> (ch:chapter) -[:PARENT_OF]-> (s:sequential) -[:PARENT_OF]-> (cmp)
where
  c.course_key = "<course key>" and
  cmp.block_type <> "vertical" and
  cmp.block_type <> "unit"
return c.course_key as course_id, s.location as subsection_id, cmp.location as component_id

Subsections that contain word clouds and peer instruction

This query might be useful in when setting a bug priority - in this particular scenario, the two components would conflict when in the same subsection, but a query showed they were rarely used together.

Code Block
match (w:word_cloud)<-[:PARENT_OF*]-(s:sequential) -[:PARENT_OF*]-> (p:`ubcpi`)
Return s.course_key,s.display_name, p.display_name, w.display_name

Length of the longest Subsection title

Code Block
MATCH
    (n:sequential)
RETURN
    max(length(n.display_name))

Length of the longest Section title

Length of the longest Section title
Code Block
MATCH
    (n:chapter)
RETURN
    max(length(n.display_name))

How many pages contain nested xblocks? 

Find and count the xblocks (p) that contain other xblocks ( n), that appear in units (v). For all these searches, you'll get more real results by adding  constraints like "vertical in a recent course" or "is published to learners" 

Nested Xblocks in Units
Code Block
MATCH
    (v:vertical) -[:PARENT_OF]-> (p) -[:PARENT_OF*]-> (n)
RETURN
    distinct(p.block_type), count(p)

Note for the below that we're not eliminating cyclical relationships; no guarantee that n is a leaf node.

Units with Nested Xblocks
Code Block
//Count all units that contain nested xblocks i.e. blocks that contain other blocks
MATCH (v:vertical) -[:PARENT_OF]-> (p) -[:PARENT_OF*]-> (n) return count(distinct(v))

Useful toolkit for searching path depth:  

Searching path depth
Code Block
//Find examples of problems that are nested 3 or more steps in a unit. 
MATCH
    (v:vertical) -[:PARENT_OF*3..]-> (p:problem)
WHERE NOT (p)-[:PARENT_OF]->()
return v.location, v.type, p.location
limit 10

//Other phrases
// (v:vertical) -[:PARENT_OF*3]-> (p:problem) = problem is exactly three steps from vertical
// (v:vertical) -[:PARENT_OF*..3]-> (p:problem) = problem is 3 or fewer steps from vertical

Subsection settings from the Course Outline

These queries relate to settings that can be configured from the Course Outline via the gear icon next to a subsection.

What courses have used the "hide after due" feature?
Code Block
MATCH 
	(c:course) -[:PARENT_OF*]-> (n:sequential) 
WHERE 
	n.hide_after_due 
RETURN 
	distinct(n.course_key), 
	count(n)
How many active or upcoming courses use the "Limit Access" feature?

This refers to subsection prerequisites under the advanced tab.

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(s:sequential)
WHERE 
	c.enable_subsection_gating
AND
    s.location CONTAINS "sequential+block"
AND
    c.end > "2021-01-01"    
RETURN 
	count(distinct(c.course_key))
How many gated subsections are there in active or upcoming courses using the “Limit Access” feature?
Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(s:sequential) 
WHERE 
	c.enable_subsection_gating
AND
    s.location CONTAINS "sequential+block"
AND
    c.end > "2021-01-01"    
RETURN 
	distinct(c.course_key), 
	count(c)
ORDER BY
    c.course_key asc

Order of content on a page ([ :PRECEDES] relationship)

Particular layouts of a page might be of interest - for instance: What courses interleave exercises and text on a page? Does any page start with a discussion prompt before offering more guidance on how to participate? -[:PRECEDES]-> will help search for a specific order of children.

...

Code Block
match (c:course)-[:PARENT_OF*]->(v:vertical)-[:PARENT_OF*]->(a:problem)-[:PRECEDES]->(h:html)-[:PRECEDES]->(b:problem)
where a.graded = True
return c.course_key, v.location

Length of the longest Section title

Code Block
MATCH
    (n:chapter)
RETURN
    max(length(n.display_name))

Videos

Where are all the video components in a course?

Code Block
match 
	(c:course)-[:PARENT_OF]->(h:chapter)-[:PARENT_OF]->(s:sequential)-[:PARENT_OF]->(v:vertical)-[:PARENT_OF]->(i:video) 
// where c.course_key = 'courseID' // uncomment this line to specify course id!
return 
	i.location as location, 
	c.course_key as course_id, 
	h.index as chapter_index, 
	s.index as sequential_index, 
	v.index as vertical_index, 
	i.index as video_index
order by 
	c.course_key, 
	h.index, 
	s.index, 
	v.index, 
	i.index

What are the video titles and Video IDs in a course?

Code Block
match
    (c:course)-[:PARENT_OF]->(h:chapter)-[:PARENT_OF]->(s:sequential)-[:PARENT_OF]->(v:vertical)-[:PARENT_OF]->(i:video)
where c.course_key = 'courseID' // replace with the course ID
return
    i.location as location,
    c.course_key as course_id,
    i.index as video_index,
    i.display_name as video_title,
    i.edx_video_id
order by
    c.course_key,
    h.index,
    s.index,
    v.index,
    i.index


Where are all in-use videos located in my course? (with direct link)

Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (v:vertical) -[:PARENT_OF]-> (n:item)
WHERE
    c.course_key = 'courseID' // replace with the course ID
    AND
    n.block_type = 'video' 
RETURN
    n.edx_video_id as video_id,
    n.location as block_id,
    v.display_name as unit_name,
   replace('https://courses.edx.org/courses/course-id/jump_to_id/','course-id',c.course_key) + right(n.location, 32) as Unit_URL,
    n.edited_on, 
    n.visible_to_staff_only

What courses have not enabled video upload? 

Code Block
MATCH 
	(c:course) 
WHERE 
	c.video_upload_pipeline = "{}" 
RETURN 
	distinct c.course_key as course_id 

Which courses allow video download?

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(v:video)
WHERE 
	v.download_video = true
RETURN 
	distinct c.course_key as course_id,
	count(v) as num_downloadable_videos

Where are all my org's videos and what video source / URL are they being served from?

Code Block
MATCH
	(c:course)-[:PARENT_OF]->(h:chapter)-[:PARENT_OF]->(s:sequential)-[:PARENT_OF]->(v:vertical)-[:PARENT_OF]->(i:video)
WHERE 
	c.course_key contains 'edX'
RETURN
	i.location as location,
	c.course_key as course_id,
	i.display_name as video_title,
	i.html5_sources as video_source
ORDER BY
	c.course_key,
	i.display_name

How many videos use start/end times (i.e. trim the video)?

Simple query:

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(n:`video`)
WHERE 	
	(n.start_time <> "0:00:00" or n.end_time <> "0:00:00")
RETURN 
	count(n.location)

Note that this is an over-estimate–many videos that have an explicit end_time actually specify that end time to be the same as what the normal end of the video would be anyway (so there is effectively no trimming happening). Unfortunately, data around total video duration lives in edx-val, and is not available from CourseGraph.

How many creative-commons licensed videos are there?

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(v:video)
WHERE 
	v.license contains "creative-commons"
RETURN 
	count(v.location) as num_videos

ORA

What courses have ORA with misconfigured dates?

For course teams, ORA due dates are set in a slightly different workflow than other assignments (in Unit as opposed to Subsection). On occasion, teams will forget to update ORA due dates in a course rerun, which blocks learners from accessing the assessment. 

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(o:openassessment) 
WHERE  
	o.submission_due < c.start
	AND c.start <= '2017-02-13' and c.end > '2017-02-13' // or whatever the current dates are, to match only ongoing courses
	AND o.visible_to_staff_only = False
RETURN 
	c.org as org, 
	c.course_key as course_id, 
	c.display_name as course_title, 
	o.display_name as ora_title, 
	o.location as location
ORDER BY 
	c.course_key

Which courses use peer assessment ORAs?

ORAs can use a combination of peer-assessment, self-assessment, and staff-assessment.

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(n:openassessment) 
WHERE 
	c.org contains 'edX'
	AND ANY(rubric
      IN n.rubric_assessments
      WHERE rubric contains 'peer' // rubric contains a peer assessment
    )
RETURN 
	c.course_key, 
	count(n.location) as num_peer_oras

What settings are used for file uploads in ORA?

ORAs can be configured to allow file uploads, allowing for a preset or set of custom file types.

Code Block
MATCH
    (c:course) -[:PARENT_OF*] -> (o:openassessment)
WHERE
    c.start < '2020-09-01' and c.end > '2020-09-01'  // or whatever the current dates are, to match only ongoing courses
RETURN
    c.course_key,
    o.file_upload_type_raw,		// an ORA setting we're interested in 
    o.white_listed_file_types;	// an ORA setting we're interested in 

Advanced settings

Which courses use teams?

Code Block
MATCH 
	(c:course) 
WHERE 
	c.teams_configuration <> "{}" // where teams_configuration is not null
RETURN 
	c.course_key,
    c.org,
    c.teams_configuration

Exams

Which courses have timed exams?

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(s:sequential) 
WHERE 
	NOT s.is_proctored_enabled // not proctored exam
	AND NOT s.is_practice_exam // not practice exam
	AND s.is_time_limited // is timed
	AND s.default_time_limit_minutes <> 'None' AND s.default_time_limit_minutes > 0 // has time set
RETURN 
	distinct c.course_key, 
	c.start, 
	c.end, 
	c.self_paced as is_self_paced, 
	c.display_name, 
	c.org, 
	count(distinct s) as num_exams

When are all the upcoming finals?

Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (n:sequential)
WHERE
    not n.is_practice_exam
    and n.due >= "2018-03-26"  // current date you are considering
    and n.due <= "2018-04-10"  // after above date
    and n.display_name =~ '(?i).*final.*'  // Use regex instead of CONTAINS to get case-insensitivity.
    // FYI, for now spanish and english final exam sequentials contain the
    // word "final".  This will change once we add more languages.
RETURN
    c.course_key as course_key,
    c.start as course_start_date,
    c.end as course_end_date,
    ("https://courses.edx.org/courses/" + c.course_key + "/jump_to/" + n.location) as exam_link,
    n.start as exam_start_date,
    n.due as exam_due_date,
    n.is_time_limited as is_time_limited,
    n.is_proctored_enabled as is_proctored_enabled,
    n.display_name as display_name

Which timed exams are upwards of 24 hours?

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(s:sequential) 
WHERE 
	NOT s.is_proctored_enabled // not proctored exam
	AND NOT s.is_practice_exam // not practice exam
	AND s.is_time_limited // is timed
	AND s.default_time_limit_minutes <> 'None' AND s.default_time_limit_minutes > 1439 // has time set upwards of 24 hrs
RETURN 
	c.course_key as course_id,
    c.start as course_start_date,
    c.end as course_end_date,
    s.display_name as exam_title,
    s.default_time_limit_minutes as allotted_time_in_minutes,    
	s.start as exam_start_date, 
	s.due as exam_due_date
order by s.default_time_limit_minutes desc

Proctored Exams

Which courses have practice proctored exams?

Code Block
MATCH 
	(c:course) -[:PARENT_OF*]-> (n:sequential) 
WHERE 
	n.is_proctored_enabled 
	and n.is_practice_exam 
RETURN 
	c.course_key as course_key, 
	c.start as course_start_date, 
	c.end as course_end_date, 
	count(distinct(n)) as number_practice_exams

Which courses have official proctored exams?

Code Block
MATCH 
	(c:course) -[:PARENT_OF*]-> (n:sequential)
WHERE 
	n.is_proctored_enabled
	AND not n.is_practice_exam
RETURN 
	c.course_key as course_key, 
	c.start as course_start_date, 
	c.end as course_end_date, 
	count(distinct(n)) as number_actual_exams

Link and start date for the first proctored exam in a course

Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (n:sequential)
WHERE
    n.is_proctored_enabled
    AND not n.is_practice_exam
    AND n.start >= "2018-01-01" // or current date
    AND n.start <= "2018-03-01" // or future date
WITH
	min(n.start) as first_exam_start_date,
    c as c,
    n as n
RETURN
	c.org as organization,
    c.course_key as course_key,
    c.start as course_start_date,
    c.end as course_end_date,
    first_exam_start_date,
    head([x in collect(n) where x.start = first_exam_start_date | "https://courses.edx.org/courses/" + x.course_key + "/jump_to/" + x.location]) as link

Which courses have currently available proctored exams?

Code Block
MATCH 
	(c:course) -[:PARENT_OF*]-> (n:sequential)
WHERE 
	n.is_proctored_enabled
	AND not n.is_practice_exam
	AND n.start <= "2017-01-09" // or whatever the current date is
	AND n.due >= "2017-01-09" // or whatever the current date is
RETURN 
	c.course_key as course_key, 
	c.start as course_start_date, 
	c.end as course_end_date, 
	count(distinct(n)) as number_currently_open_actual_exams

What are the release and due dates of the proctored exams?

Code Block
MATCH 
	(c:course) -[:PARENT_OF*]-> (n:sequential)
WHERE 
	n.is_proctored_enabled
RETURN 
	c.course_key as course_key, 
	c.start as course_start_date, 
	c.end as course_end_date, 
	n.location as exam_location, 
	n.start as exam_start_date, 
	n.due as exam_due_date, 
	n.is_practice_exam as is_practice_exam, 
	(n.is_time_limited and not n.is_practice_exam) as is_timed_exam	

What is the proctoring backend in use by a course?

Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (n:sequential)
WHERE
    n.is_proctored_enabled
RETURN
    c.course_key as course_key,
    c.start as course_start_date,
    c.end as course_end_date,
    n.location as exam_location,
    c.proctoring_provider

Course quality & integrity

What courses have randomized problem banks?

Code Block
MATCH (c:course)-[:PARENT_OF*]->(n:library_content) // this takes advantage of the fact that all library_content items have mode: random
WHERE NOT n.visible_to_staff_only 
RETURN distinct c.course_key, c.start, c.end, c.self_paced as is_self_paced, c.display_name, c.org, count(distinct c) as num_exams

What are the locations for randomized content blocks in a course?

Code Block
MATCH
    (c:course)-[:PARENT_OF*]->(v:library_content)
WHERE c.course_key = <course_key> 
RETURN
    v.location, v.display_name

Which courses use discussion blackout dates?

Code Block
match (c:course) 
where c.discussion_blackouts <> [] // non null blackout dates
return c.org, c.course_key, c.display_name, c.start, c.end, c.discussion_blackouts

Discussion forums

What courses don't use edX's built-in discussions?

Code Block
MATCH (c:course) WHERE LENGTH([x IN c.tabs WHERE x CONTAINS "DiscussionTab"]) = 0 RETURN c.course_key

Which courses have in-context discussion blocks? 

Code Block
MATCH
    (c:course) -[:PARENT_OF*]-> (n:item)
WHERE n.block_type = "discussion"
RETURN
    distinct(n.block_type) as block_type,
    c.course_key as course_id,
    c.org as organization,
    count(distinct(n)) as discussion_blocks,
    count(n) as number
order by
    number DESC

Course updates

Number of courses with visible updates, by pacing type

Code Block
match (ci:course_info), (c:course)
where size([item in ci.items where item CONTAINS "u'status': u'visible'"]) > 1
and (ci.location ENDS WITH "/updates" or ci.location ENDS WITH "@updates") // this ensures we're just looking at updates, not handouts
and c.course_key = ci.course_key
return c.self_paced, count(c.course_key)

Note: we have to "join" on the course block, because course_info blocks aren't children of course blocks and so don't inherit the self_paced attribute from them.

LTI

Which courses have LTI components?

Code Block
match (c:course)-[:PARENT_OF*]->(lti_consumer)
return distinct c.course_key, c.org, c.display_name

Where are graded LTI components - that are embedded in the course / that open an external link?

Code Block
match
(c:course)-[:PARENT_OF*]->(b:lti_consumer)
WHERE b.graded = true
and b.launch_target = "iframe" // or "new_window"
return
    b.location as location, c.course_key as course_id

Which courses use X LTI? 

Code Block
// NOTE: this query takes advantage of the fact that many (but not all) LTI components contain the resource name in the launch_url (e.g. Piazza LTIs contain 'piazza' in the URL

MATCH 
	(c:course)-[:PARENT_OF*]->(n:lti_consumer) 
WHERE 
	n.launch_url CONTAINS 'piazza' // launch_url is where the LTI 'points' to. Most are lowercase
RETURN 
	n.org, 
    count(distinct n.course_key)

What are all the LTIs used on edX and in what volume are they being used?

Code Block
MATCH	
(c:course) -[:PARENT_OF*]-> (v:vertical) -[:PARENT_OF*]-> (n:item)	
RETURN	
distinct(n.lti_id) as lti_id,	
count(n) as number	
order by	
number DESC	

Due dates

Which courses had assignments due between X and X dates?

This query might be useful in situations where learners are unable to access the platform for a period of time (e.g. platform outage, natural disaster) which affects their ability to submit key assignments.

Code Block
// NOTE: this does *not* include ORA due dates. ORA due dates are set on the component level, rather than the sequential level.
MATCH
	(c:course)-[:PARENT_OF*]->(s:sequential)
WHERE
	s.due >= '2017-08-25' // replace with outage start date in 'YYYY-MM-DD' format
    AND s.due <= '2017-09-02' // replace with outage end date in 'YYYY-MM-DD' format
RETURN
    c.course_key,
    c.org,
    c.display_name as `course title`,
    s.due,
    count(distinct s) as `number of assignments due`
ORDER BY
	c.course_key

Content: HTML and Problems

Which courses contain a certain keyword anywhere in their course content?

Example: we might be curious which courses mention the g2crowd tool anywhere in their HTML content.

...

Code Block
MATCH 
	(c:course)-[:PARENT_OF*]->(h:html) 
WHERE 
	h.data CONTAINS 'http'
	AND c.course_key = 'courseID' // replace with the course ID
RETURN 
	h.course_key, 
	h.data, 
	"https://studio.edx.org/container/" + h.location as link

What courses contain scripts with python code?

Example: An org would like to know all their courses that require review for the python3 upgrade

Code Block
MATCH
	(c:course)-[:PARENT_OF*]->(p:problem)

Where
	p.data CONTAINS 'loncapa/python' // this phrase identifies a python-graded problem
AND
	c.org =~ '(?i)edx.*' // case-insensitive search to include all edX-variants
RETURN
	p.org as org,
	c.display_name, c.course_key, c.start, c.end,
	count(distinct(p)) as numberOfProblems

Where all the python-graded problems?

Example: A partner needs guidance on what content to review for the python3 upgrade - give them links to all the content.

Code Block
MATCH
(c:course)-[:PARENT_OF*]->(p:problem)

Where
p.data CONTAINS 'loncapa/python'
AND
c.course_key = 'courseID' // replace with the course ID
RETURN
c.org as org,
c.display_name, c.course_key, c.start, c.end, ("https://courses.edx.org/courses/" + c.course_key + "/jump_to/" + p.location) as problem_location

Which HTML modules have (illegal) self-closing <script> and <iframe> tags?

Example: We need to find which HTML modules and courses use <script /> or <iframe /> when they should use <script></script> or <iframe></iframe> instead. Using the self-closing version is invalid HTML, but worked because of some data munging we did in the old courseware view. It does not work in the new Courseware MFE.

Code Block
MATCH
    (v:vertical)-[:PARENT_OF*]->(h:html)
Where
    h.data =~ '(?i)<iframe [^>]*\\/>' or
    h.data =~ '(?i)<script [^>]*\\/>'
RETURN
	h.course_key, v.location, 'https://courses.edx.org/xblock/'+ v.location,  h.location, h.data
order by h.course_key


Which HTML modules use X iframe in them and where?

Code Block
MATCH
(c:course) -[:PARENT_OF*]-> (v:vertical) -[:PARENT_OF*]-> (n:item)
WHERE
c.start < '2021-05-12' // insert today's date for live courses
AND
c.end > '2021-05-12' // insert today's date for live courses
AND
n.block_type = 'html'
AND
n.data CONTAINS 'geogebra'// insert a specfic iframe identifier
RETURN
n.course_key,
replace('https://courses.edx.org/courses/course-id/jump_to_id/','course-id',c.course_key) + right(n.location, 32) as Unit_URL;

Age of Course Data

When was the course last updated in Coursegraph?

This won't capture courses added since the last update, but this query can be used to see when Coursegraph itself was last updated, based on the last update date.

...