Right now, the Block API caching is optimized for views that use all transformers across a full course, and cannot efficiently sub-query for parts of the course or a subset of the transformer-collected data. This was done for efficiency reasons: storing the data the naive way (one cache entry per block) drastically reduced the effectiveness of compression and gave us 7X larger cache sizes for our first major use case (the full course data).
The basic design for per-transformer block data looks like this:
1. Think of a Block Cache Unit (BCU) as being a structure + a list of collected transformer data + a list of XBlock fields.
2. The structure part of the BCU is the only place that defines the list of usage keys.
3. Each transformer entry in a BCU stores that transformer's collected data across all usage keys in the BCU as one cache entry.
4. Each XBlock field in a BCU is stored similarly – one cache entry for that field across all usage keys.
5. Transformers and XBlock fields use the alpha-sorted key values as an implicit index, and can store their data as a flat list (we can make this more optimized later on).
structure: key1 -> [key3, key2] (so sorted order = key1, key2, key3)
transformer A: [value for key1, value for key2, value for key3]
transformer B: [value for key1, value for key2, value for key3]
xblock field 'graded': ['graded' for key1, 'graded' for key2, 'graded' for key3]
The BCU knows how to read/write these from the cache and stitch together the usage keys and stored values.
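The read/write and stitching behavior described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the real Block Transformers API: the `BlockCacheUnit` class, its method names, and the cache key scheme are all assumptions, and a plain dict stands in for the actual cache backend.

```python
class BlockCacheUnit:
    """Hypothetical sketch of a BCU: one structure entry defines the usage
    keys; each transformer's data is one cache entry holding a flat list
    indexed by the alpha-sorted usage keys."""

    def __init__(self, cache, root_key):
        self.cache = cache          # stand-in for the real cache backend
        self.root_key = root_key

    def write(self, structure, transformer_data):
        # structure: {usage_key: [child_usage_keys]}
        # transformer_data: {transformer_name: {usage_key: value}}
        # The structure entry is the only place the usage keys are listed.
        self.cache[f"structure.{self.root_key}"] = structure
        # Alpha-sorted keys are the implicit index for every flat list.
        sorted_keys = sorted(structure)
        for name, per_key_values in transformer_data.items():
            self.cache[f"data.{self.root_key}.{name}"] = [
                per_key_values.get(key) for key in sorted_keys
            ]

    def read(self, transformer_name):
        # Stitch the usage keys back together with one transformer's values.
        structure = self.cache[f"structure.{self.root_key}"]
        sorted_keys = sorted(structure)
        values = self.cache[f"data.{self.root_key}.{transformer_name}"]
        return dict(zip(sorted_keys, values))


# Usage, mirroring the example layout above:
cache = {}
bcu = BlockCacheUnit(cache, "course-v1:Demo")
bcu.write(
    {"key1": ["key3", "key2"], "key2": [], "key3": []},
    {"graded": {"key1": True, "key2": False, "key3": True}},
)
result = bcu.read("graded")  # {"key1": True, "key2": False, "key3": True}
```

Note that the per-key dict is only materialized on read; what sits in the cache is the compact flat list, which is what keeps compression effective.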
This story does not cover making subsets of the course efficient, but a follow-on story can extend BCUs to support node subsets.