This is the most current Blockstore design document, but many details continue to be refined in conversations on the Issues page of the Blockstore repo. Some of these topics still under debate are:

How granular is a Bundle in different use cases (i.e. single problem, entire unit, outline of entire course, etc.)?
Exactly what files get placed where inside a Bundle?
What does the import/export look like for courses and content libraries?

This is the design document for Blockstore, a system for authoring, discovering, and reusing educational content. Development is being funded by Harvard LabXchange and the Amgen Foundation, with significant in-kind contributions from edX.

Abstract

All lesson content in the Open edX platform is currently stored in the modulestore, which requires that all content be organized into “courses” that are each a directed acyclic graph (DAG) of XBlocks/XModules (or into “libraries”, which are implemented in the same way as courses but have a shallower graph and support a limited set of content types).

This proposal outlines a design for a new service that stores content for the Open edX platform, called “Blockstore.” Blockstore is meant to be a lower-level service than the modulestore, and it is designed around the concept of storing small, reusable pieces of content, rather than large, fixed content structures such as courses. In other systems and academic contexts, these are often called “learning objects,” and Blockstore is thus a type of Learning Object Repository (LOR). For Open edX, Blockstore is designed to facilitate a much greater level of content re-use than is currently possible, enable new adaptive learning features, and enable delivery of learning content in new ways (not just large traditional courses).

Motivation

At its heart, edx-platform's current modulestore works with large, static course structures. Various dynamic courseware features such as A/B tests, cohorts, and randomized problem banks work around this by copying every piece of content that might be displayed to any user and then selectively showing a subset of that using permission access checks. When you use a randomized problem bank in a sequence, the system is in fact copying the entire content library into that sequence.

This poses a number of problems:

It creates very large data structures, degrading courseware performance. Many common courseware interactions noticeably slow down as the amount of content in a course increases.
The underlying structure is static, so the ordering of elements is fixed, making adaptive learning sequences extremely cumbersome to implement. Course teams have heroically worked around this using LTI hacks, using Open edX as both an LTI provider and consumer in chained LTI launches (sequences with one unit that acts as an LTI consumer to an adaptive engine interface that then becomes an LTI consumer for individual problems in the original course).
Course content is largely duplicated for every run, making it cumbersome to manage across multiple runs, especially if those runs are on different instances of Open edX as is the case with some partners.
Trying to work around these limitations and maintain performance has significantly complicated the codebase and slowed feature development. Content Libraries are far less powerful than they were intended to be because of the large infrastructure changes that would have been required to execute the original vision.

General Themes / Concepts

The high-level ideas that ground this proposal are:

Blockstore stores data in Content Bundles, which are a local grouping of files that Blockstore knows little about.
Blockstore doesn't understand much about the things inside of it. There is no special data structure within the core of Blockstore for Sequences vs. Units vs. anything else. OLX content, smaller assets like images, and larger assets like videos are all stored as files in Content Bundles, using conventions and groupings that make sense to the client application. A separate plugin layer will be able to listen to and take action for particular types of Content Bundles.
Blockstore is a lower level storage abstraction that XBlocks (and other clients) build upon.
We will compose Blockstore primitives in various ways to store content, but there isn't a 1:1 mapping of concepts. For instance, a Collection is not equivalent to a Course or a Library. A Collection might in fact store multiple Course Runs and multiple Library equivalents. A Content Bundle might be used to store a Sequence, an individual Problem, or the outline of a Course Run. The concrete primitives that Blockstore offers are versioned storage and the ability to access files in other Bundles using Links. This gives us a lot of flexibility, but requires us to be disciplined about how we use it.
Blockstore represents author intent and grouping. It favors author-friendliness even if it makes certain bookkeeping harder.
A Content Bundle in Blockstore is something an author wants to edit, version, import, and export as a single thing. That means a Bundle can be a single problem or an entire sequence. Things stored in Blockstore are not read-optimized, and are not the data structure that students interact with in the end. The definitions of a mostly static learning sequence and a learning sequence with an adaptive component might look completely different when stored in Blockstore, even if the Learner eventually experiences them in a similar way. The imported and exported form of a Content Bundle should be as author-friendly as possible: assets are grouped together with where they're used, and as few Blockstore concepts as possible should leak into how the content is written.
Versioned content is the core of Blockstore, and plugin extensibility is focused around annotating that content.
Things that create, transform, update, or execute content live outside Blockstore. Plugins know when content has been changed, but they don't modify content. Plugins maintain their own data and APIs. Plugin data changes can happen outside of the lifecycle of the content itself. This means that an export of the same version of a Content Bundle will always yield the same authored content, but may yield different plugin metadata (example: new tags that were added). Also, versions are meaningful, and not every edit of every file spawns a new version. A version is like a "commit" in that sense.

Layers

Name	Responsibilities
Core	This is the storage of the content itself, and the essential mechanisms for updating it. This layer doesn't understand anything about the actual files in the Content Bundle. It includes: content data models (Content Bundles, Versions, Links, and Drafts, though it's possible that Drafts can move out of Core); Roles; Signals.
Persistence / BundleDataStore	Low level swappable piece that determines how we store the files for bundles. Must support at least S3 and the local file system. Extension point. Try to implement this with django-storages to start. Might have to switch one day for scale, but as long as we can maintain the files in the same location, this should be fine. Graceful handling of large assets matters.
Plugin	This is the more extensible layer that manages the constellation of metadata about a Bundle. It should be easy to add these over time, possibly in a separate repo. This layer actually does understand the contents, and might subscribe to events for particular content types. Examples: Tagging; Search (the discovery aspect is interesting because XBlocks will exist in various Content Bundles at different granularities depending on the lifecycle/requirements, but I should still be able to say "What are all the capa problems in my Collection" regardless of whether they're standalone or in a Sequence; the Bundle Parts proposal may simplify this); Webhook notifications; Licensing; Dispatch; Import hooks (per-type) for things like validation.
Execution (external process)	The XBlock runtime that actually executes content, probably at a Unit level. This lives in a separate process and needs to know how to grab content from Blockstore for the purpose of preview and authoring.

Blockstore Core Layer Concepts

Term	Definition
Content Bundle	A group of files that are versioned together and can be accessed from other Content Bundles. Blockstore stores a UUID and basic metadata about a Content Bundle, like a title, slug (for slightly prettier URLs), and "type". UUID never changes. Blockstore's core layer does not parse or understand the actual contents of the files in Content Bundles, though it may know how to delegate certain actions like preview and editing to plugins based on type.
Bundle Version	An immutable snapshot of a Content Bundle. Once created, the contents of the files in a given Bundle Version do not change. Metadata about a Version, like tagging, can change though, and will often be updated asynchronously after a new Bundle Version has been created. Creating a new Bundle Version emits a signal that other parts of the system will listen for, similar to how course_publish works today. Importing creates a new Version directly (if any changes were made). It does not interact with Drafts. Versions and even entire Content Bundles can be force deleted if necessary to deal with copyright violations or other inappropriate content. This will likely break any Content Bundles that have Links to the deleted Content Bundle. Force deletion may not be possible for content that is copied/referenced across Blockstore instances, but that entire use case is still very fuzzy right now.
Link	The method by which one Content Bundle references and uses content from another. Since a Content Bundle is a collection of files, this is conceptually a symlink between one Content Bundle Version and another, via a special named folder (e.g. `.blockstore/links/wand_tutorial/videos/wand_demo.mp4`). The version number is not encoded into the symlink path. This is so that updating to the next Version doesn't require changing the path. A Link has an alias (essentially the name of the symlink) and a target BundleVersion. For now, all Links are assumed to be local to the server instance, but later we could add an optional server namespace field. Since a Bundle Version is immutable, updating a Link results in a new Version. Cycles are not allowed (bad things happen, like infinite version bumps). A Content Bundle Version cannot have Links to multiple Versions of another Content Bundle. So A.v1 can make a Link to B.v1 or B.v2, but A.v1 cannot simultaneously access files from both B.v1 and B.v2. This is to make dependency upgrades saner. Exporting multiple Content Bundles and preserving versioned Link information will likely require a smart client that knows how to pull down Content Bundle Versions to a shared namespace and then generate the necessary symlinks on disk. Because of the size and potentially interconnected nature of dependencies, it would be nice if clients could download individual files rather than trying to generate a tarball, especially if video is involved. A "how many bytes will this be" type of summary would also be very useful.
Version Range	A lot of annotation around a Content Bundle (e.g. tagging around teaching standards) is going to be about content that may change over time as new Versions are created. Associating them with the Content Bundle as a whole might be inaccurate. Associating them with specific Versions might be wasteful, particularly when the changes are relatively minor. Treating a Version Range as a first class concept would help to simplify data modeling in other parts of the system. Version Ranges need a start version, but the end version can be null (i.e. open-ended). This could be a model mixin.
Draft	A mutable space for changes to be made before they are committed to a Bundle Version. There are no "draft" vs. "published" branches. Drafts get committed to Bundle Versions, and Bundle Versions keep increasing. Because versions are shared and immutable, there may be multiple Versions of a given Bundle that are live in different courses at any given point. Import creates new Bundle Versions directly and does not interact with Drafts. Under the covers, Drafts are copy-on-write. Edits of Drafts are done at the individual file level, so it's possible for two people to be concurrently editing different units in the same sequence, as long as the units are separate files.
Collection	1:M grouping of Content Bundles. There is no Collection-level versioning. It's a pointer to a bunch of Content Bundles which have their own versions. Ownership is captured at the Collection level. Permissions are determined at the Collection level. Licensing is determined at the Collection level. Mapping today's concepts: A Course would have most of its content in one Collection. Multiple runs of the same course would be in the same Collection. So perhaps it's more accurate to say that the contents of a CatalogCourse are in a Collection? A Content Library would be in a Collection. It's possible that we'd use one Collection for multiple Libraries and Course Runs. Conceptually, it'd be like "Collections are always Libraries, it's just that some of those Libraries include Sequences and Course Run definitions". A lot of how this will evolve will depend on UI. Import/export could happen at the Collection level, but in practice we'd want to very strongly lean towards allowing subsets of Collections to be imported or exported. Questions: Would this be problematic for changing licensing information over time? Does it need to be fixed to particular Content Bundle Versions? Is there some notion of related Collections? A course might use a bunch of problems from a problem bank; they might be in different Collections, but it seems useful to be able to associate them? Maybe that's overdoing it? How do we delete Collections? Safe if empty? Safe if none of the Content Bundles are referenced externally?
Signal	We'll emit named Django signals for life cycle events around the Core layer, including: Content Bundle creation and deletion; Version creation and deletion; Link creation and deletion (happens on Version creation); Collection creation and update (e.g. title, ownership, roles).
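
The Version Range concept in the table above could be sketched roughly as follows. This is only an illustration: the class name, the integer version numbers, and the inclusive end bound are assumptions, not part of the design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionRange:
    """Hypothetical range of Bundle Version numbers; end=None means open-ended."""
    start: int
    end: Optional[int] = None  # assumed inclusive; None covers all later versions

    def __post_init__(self):
        if self.end is not None and self.end < self.start:
            raise ValueError("end must be >= start")

    def contains(self, version: int) -> bool:
        if version < self.start:
            return False
        return self.end is None or version <= self.end


# A tag that applies from version 3 onward, and one that only held for versions 1-2.
current_standard = VersionRange(start=3)
old_standard = VersionRange(start=1, end=2)
```

An annotation such as a teaching-standards tag would then point at a (Bundle, VersionRange) pair rather than at every individual Version.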

Higher Level Concepts

These concepts exist for the systems that author content in Blockstore, but Blockstore itself is unaware of them.

Term	Definition
Learning Context	A grouping against which student state is stored. General state is stored via a (context, student, block) tuple. If a student sees the same problem block in multiple places within a context, their state (answer, saved work) carries over. If a student sees the same problem block in a different context, their state (answers, saved work) does not carry over. Courses and Pathways reference Learning Contexts but they are not Learning Contexts themselves. The common case will be that each Pathway will use a separate Learning Context, but there may be situations in which you'll want multiple Pathways to use the same LearningContext. Open Issues: The case of XBlock state seems straightforward, but does all state really follow this (e.g. completion, scoring, gating, etc.)?
Block	The smallest piece of content that can be authored, this maps to a leaf-node XBlock ("Component" in Studio) in edx-platform today.
Unit	An ordered list of Blocks that is the smallest chunk in which content can be consumed. A single page worth of content. Maps to a VerticalBlock today, but that name is horrible – we should either create a new Unit Block or make it an alias of Vertical (that's really what it should have been named in the first place). Making a new Block might allow us to drop support for some legacy cruft. While Blocks can be almost anything, Units are expected to have some common fields like title, possibly an icon, a URL that can be independently rendered, etc.
Sequence	Sequences are a linear group of Units. For existing courses, Sequences are largely static, though there are exceptions in randomized problems, A/B testing, and the adaptive use cases.
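
The (context, student, block) keying described for Learning Contexts can be illustrated with a small in-memory sketch. The store, the function names, and all identifiers below are hypothetical; where this state actually lives is not decided here.

```python
from typing import Dict, Tuple

# Key: (learning_context_id, student_id, block_id). All identifiers are made up.
StateKey = Tuple[str, str, str]
state_store: Dict[StateKey, dict] = {}


def save_state(context: str, student: str, block: str, state: dict) -> None:
    state_store[(context, student, block)] = state


def load_state(context: str, student: str, block: str) -> dict:
    # A missing key means the student has no saved state in this context.
    return state_store.get((context, student, block), {})


save_state("ctx-algorithms-101", "student-42", "block-quicksort", {"answer": "O(n log n)"})

# Same block seen again within the same context: state carries over.
same_context = load_state("ctx-algorithms-101", "student-42", "block-quicksort")
# Same block seen in a different context: state does not carry over.
other_context = load_state("ctx-interview-prep", "student-42", "block-quicksort")
```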

Why Files?

This proposal leans on the file system metaphor more strongly than the initial proposal. Some advantages to this:

We can unify content storage and lifecycle updates.
Studio currently stores authored course content in three different ways. Data that manifests as content and settings scoped XBlock field data are stored in the ModuleStore, as sort-of versioned documents in MongoDB. Smaller binary assets like images and PDF files are stored without versioning via the ContentStore interface, which writes to GridFS. For performance reasons, self-hosted videos are typically put in S3. Treating these in a more uniform way will help when building add-on functionality like search, tagging, and licensing. It should also lower cost and operational complexity.
It simplifies the overall design.
Blockstore offers the capability to store, version, and reuse content. The more it knows about the internals of that content, the more coupled it will be, making it hard to adapt and make changes. For example, adding a major feature like internationalization of OLX should require no changes to the core of Blockstore. Files also help with the partial sharing use cases (more on that in the Content Reuse section).
Assets can get surprisingly sophisticated.
Splitting the world into sophisticated XBlock-like data and simple binary blob assets is intuitive, but many assets are more sophisticated than they appear. A Video can be represented by a single mp4 file, or it can be a set of HLS files with different quality video encodings, multiple audio tracks, and subtitles for many languages. An ebook may come in three different formats, that are grouped, licensed, and updated in sync with each other. Even simple assets can become more complex when internationalization comes into play.
It leads to more author-friendly grouping for reuse.
OLX and static assets can live side by side in the same versioned Bundle, without requiring any external links. If there are precursor files like source LaTeX files that compile out to OLX, those can be stored in the same Bundle. They would be ignored by the XBlock runtime, but still be very valuable to store, version, and share.
We need a serialization format for import and export anyway.
Import and export are going to be a critical part of the supported workflow. This lets Blockstore be agnostic to OLX conventions that are handled at the XBlock runtime layer. The only file conventions it imposes are its own simple notions of Bundle metadata and Links.

Using Blockstore to Model Courseware

So what would the Content Bundles look like?

Content Bundle File Conventions

Blockstore exposes Bundle-level metadata as a .blockstore directory.
- This folder is a fully virtual folder (nothing actually exists there on S3, and it is optionally materialized on export)
Metadata is exported as JSON files.
All Links to other Content Bundles will be of the form links/{alias}
- Link mapping is stored in the .blockstore/info.json file on export.

Standalone Problem (leaf-level XBlock, appropriate for problem banks)

quicksort_complexity/
                     # Actual definition of the OLX for the problem.
                     problem.xml
 
                     # Convention: Files in static/ will be available to the browser during execution.
                     static/         
                            diagram.png
 
                     # Blockstore metadata
                     .blockstore/
                                 info.json   # UUID, version, title, links, dependencies

Video (centrally managed, probably in a separate Collection)

the_perfect_egg/
               # VideoModule OLX
               video.xml
               static/
                      hls/
                          playlist.m3u8
                          
                          # Dozens of alternate languages and encodings would
                          # be in the following directories:
                          audio/
                          subs/
                          video/

               # Blockstore metadata
               .blockstore/
                           info.json   # UUID, version, title, links, dependencies

Static Sequence (either standalone pathway or part of a Course)

wand_tutorial/
              sequence.xml
              links/
                    beginner_wand/video.xml  # Link to another Bundle (mapping is in info.json)
                    wand_safety/video.xml
              static/
                     ollivander.jpeg
              units/
                    construction.xml
                    maintenance.xml
                    materials.xml
                    ownership.xml

              # Blockstore metadata
              .blockstore/
                          info.json   # UUID, version, title, links, dependencies

Course Run Example

The core part of this would be some sort of document that has navigation information and provides pointers to all the different sequences that make up a course. For example, you could have a Content Bundle that defined an XML file that looks something like:

<course-navigation>
    <chapter name="Magic Basics">
        <!-- Policy related information like deadlines should live in a separate file, to make reuse easier. -->
        <!-- It's possible that we could treat the Course as its own Context, but it might be nice to be able
             to manually specify what context each sequence should be considered a part of. -->
        <sequence src="links/magic_source/sequence.xml"/>
        <exam src="links/magic_ethics_exam/exam.xml"/>
    </chapter>
    <chapter name="Magic History">
        <!-- etc. -->
    </chapter>
</course-navigation>

Some opinions that this approach has:

Sequences are separate entities, Chapters (and other hierarchy between Course and Sequences) are details of Course.
In this particular mindset, "chapters" don't really exist as a separate entity, but only as a navigational convenience of Course. Everything is sequences and things that point to sequences. If a Course wants a flat list of sequences or a hierarchy an extra level deep – that's purely a Course Navigation concern. Unlike today, you wouldn't expect to be able to call a student_view on a chapter (that doesn't really work well in practice, but the framework currently allows for it).
Course navigation isn't an XBlock, and might not even be OLX.
For content compatibility reasons, it's really important that the individual Units be OLX. Maybe the Sequences as well, though I think there's a decent case to be made either way there (and even if it is OLX, it doesn't necessarily have to be backed by an XBlock runtime). But the things-that-point-to-sequences don't necessarily need to be OLX. The data model for courses/chapters is a combination of a simple container hierarchy that can be translated in a straightforward way, and a mess of global policy attributes that need to be moved out. It should have some reasonably human-readable serialized format, but whether that's XML, JSON, YAML, or something else that better fits our "X + overrides" use cases is still an open question.
This is not the LMS data representation.
The way sequences are referenced here would be terribly inefficient if we had to make read calls to each sequence to get the titles to display. When we actually get to the representation in the LMS, we might want a uniform way to specify sequences that enables much more dynamic behavior by relying on queries of sequence relationships. But Blockstore is for authoring, and surfacing whatever is simplest and most intuitive for that.

Content Reuse

Content Bundles are created with the intended granularity of reuse. If you intend to have a bunch of problems for a problem bank, a Bundle is an individual problem. If you have a sequence that can be reused in multiple contexts, then the Bundle is your entire sequence. To use either of these, you would make a Link to it and reference the appropriate linked OLX file from a file in your own Bundle (e.g. a CourseRun definition referencing the above Sequence).

Reusing Content Bundles in their entirety is intended to be the common case, but it's possible that someone will want to reuse a small part of an existing Bundle. In that case, you can still use Links and reference a file directly, such as an individual Unit or image that's part of a larger sequence.

Reuse in the way the original author designed for is done by referencing whole Bundles; reuse in ways that are permitted, but that the original author did not design for, is done by referencing specific files within Bundles.
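
As a rough illustration of how a client might follow a reference like `links/{alias}/...` into another Bundle, here is a minimal sketch. It assumes the alias mapping uses the `"{bundle_id}@{version}"` string form shown in the example summary file later in this document; the function name and return shape are hypothetical.

```python
from typing import Dict, Optional, Tuple


def resolve_link_path(path: str, links: Dict[str, str]) -> Optional[Tuple[str, int, str]]:
    """Map 'links/{alias}/rest/of/path' to (bundle_id, version, 'rest/of/path').

    Returns None for paths that are not of that form. Raises KeyError for an
    alias that is not present in the mapping.
    """
    parts = path.split("/", 2)
    if len(parts) < 3 or parts[0] != "links":
        return None
    alias, inner_path = parts[1], parts[2]
    bundle_id, _, version = links[alias].rpartition("@")
    return bundle_id, int(version), inner_path


# Alias mapping as it might appear in a Bundle's link metadata (illustrative values).
links = {"wand_safety": "rSTYnCDSSAi_uaxuDKfWYw@1"}

whole_bundle_reuse = resolve_link_path("links/wand_safety/video.xml", links)
partial_reuse = resolve_link_path("links/wand_safety/static/poster.jpeg", links)
local_file = resolve_link_path("units/materials.xml", links)
```

Both whole-Bundle reuse and partial reuse go through the same lookup; the only difference is which file inside the linked Bundle is referenced.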

Implementation Details

A non-exhaustive stab at some of the key implementation issues.

Storage

We have to be able to handle files that run from very small to extremely large. The metadata for Bundles will use the Django ORM, but the storage interface for Bundle files will be a pluggable backend that supports at least S3 and the local filesystem. Using django-storages would be nice, but I'm not entirely sure it supports all the features we'd want (e.g. "Content-Disposition" header support so we could reuse the same blob of data with different names).

Git or Mercurial have come up as possible backing stores as well. The main reason I avoid them is unknown operational complexity. At the granularity of the separate repositories, even a straightforward port of edx.org content could yield something on the order of a million repositories, with over 50 TB of video data. We already manage that in S3, but we have no experience running git repositories at this scale. AWS's EFS (their hosted NFS solution) performs horribly for git workloads, so we'd have to manage it ourselves. Our usage of video means that git-lfs is likely a requirement. None of this is insurmountable, but it raises the uncertainty and costs, and we likely won't take advantage of enough of their features to justify it.

Storage in an S3 store could look like:

/{bundle_uuid}/data/{file named after hash}
All source data files are stored by hash, to allow for cheap renames across versions. URLs can be sent with custom content-disposition headers to enable browsers to download them with sensible filenames. Using the UUID as the start of the name makes us less likely to run into per-partition performance throttling from S3.
/{bundle_uuid}/versions/{version}/mapping.json
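
A minimal sketch of the layout above, showing why storing by hash makes renames cheap. The hash algorithm is not specified in this design, so SHA-256 is an assumption, as are the function name and the exact shape of mapping.json (a flat path-to-hash object here).

```python
import hashlib
import json
from typing import Dict


def plan_version_upload(bundle_uuid: str, version: int, files: Dict[str, bytes]) -> Dict[str, bytes]:
    """Return the objects (store key -> bytes) to write for one Bundle Version."""
    objects: Dict[str, bytes] = {}
    mapping: Dict[str, str] = {}
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()  # assumption: algorithm not specified
        objects[f"/{bundle_uuid}/data/{digest}"] = data  # identical content shares one key
        mapping[path] = digest
    mapping_key = f"/{bundle_uuid}/versions/{version}/mapping.json"
    objects[mapping_key] = json.dumps(mapping, sort_keys=True).encode("utf-8")
    return objects


# Version 2 only renames the file, so it points at the same data object as version 1.
v1 = plan_version_upload("bundle-a", 1, {"static/diagram.png": b"png-bytes"})
v2 = plan_version_upload("bundle-a", 2, {"static/figure.png": b"png-bytes"})
v1_data_keys = {key for key in v1 if "/data/" in key}
v2_data_keys = {key for key in v2 if "/data/" in key}
```

In this sketch a rename only produces a new mapping.json; no file data is re-uploaded.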

Q: What should be done in a situation where an asset was marked as public in an earlier version but private in a later version (or vice versa)?

Import/Export

Simple import and export of a single Bundle can be done with a tar.gz or zip file. But import and export that involves multiple Bundles at a time would benefit from a command line tool that could talk to our API – particularly if it wants to export a whole Course's worth of Bundles and preserve Links to other Bundles (sometimes even multiple versions of the same Bundles).
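
For the simple single-Bundle case, a tar.gz round trip could look roughly like the sketch below, which uses only the Python standard library. The function names are hypothetical, and a real import would also need validation and size limits.

```python
import io
import tarfile
from typing import Dict


def export_bundle(files: Dict[str, bytes]) -> bytes:
    """Pack a Bundle's files (path -> bytes) into an in-memory tar.gz archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path, data in sorted(files.items()):
            member = tarfile.TarInfo(name=path)
            member.size = len(data)
            archive.addfile(member, io.BytesIO(data))
    return buffer.getvalue()


def import_bundle(archive_bytes: bytes) -> Dict[str, bytes]:
    """Read a tar.gz archive back into a path -> bytes mapping (regular files only)."""
    files: Dict[str, bytes] = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                files[member.name] = archive.extractfile(member).read()
    return files


bundle_files = {"problem.xml": b"<problem/>", "static/diagram.png": b"\x89PNG"}
round_tripped = import_bundle(export_bundle(bundle_files))
```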

Scale and Schema

Current edx.org has approximately:

~7K courses
~5K content libraries
~40 TB of content data
- The vast majority of that is video at various encodings (including the raws)
- ~2 TB of non-video static assets
- ~400 GB of versioned XBlock content data

Our design goal would be to support a 100X increase in this, ideally without requiring partitioning, since that's not compatible with foreign key constraints. We've had operational experience with tables over 1B rows, but we probably don't want to push our design beyond that if possible.

Model	Rows for edx.org	Rows at 100X
Collection	~10K	~1M
Bundle	~1M (~20 per course, more for content libraries, plus each video)	~100M
BundleVersion	~10M	~1B

Limits we're going to set for Links and Files:

Max 100 Files per BundleVersion
Max 2,000 total dependencies for a BundleVersion (including dependencies of dependencies)
- Chosen to accommodate the largest known courses.

If we assume these limits, then naively creating rows on a per-BundleVersion basis will quickly explode the tables for Links and Files beyond what we want. One approach around this is to more smartly collapse redundant information across Bundle Versions, and another is to take the data out of the database entirely and into the file store.
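
To make the scale concern concrete, here is the worst-case arithmetic using the limits and row estimates above. It assumes every BundleVersion sits at the maximum, so these are upper bounds rather than forecasts.

```python
MAX_FILES_PER_VERSION = 100               # limit stated above
MAX_DEPENDENCIES_PER_VERSION = 2_000      # limit stated above
BUNDLE_VERSIONS_TODAY = 10_000_000        # ~10M rows for edx.org
BUNDLE_VERSIONS_AT_100X = 1_000_000_000   # ~1B rows at 100X

# Upper bound on rows if each BundleVersion stored one row per file or per dependency.
file_rows_today = BUNDLE_VERSIONS_TODAY * MAX_FILES_PER_VERSION
file_rows_at_100x = BUNDLE_VERSIONS_AT_100X * MAX_FILES_PER_VERSION
link_rows_at_100x = BUNDLE_VERSIONS_AT_100X * MAX_DEPENDENCIES_PER_VERSION

print(f"{file_rows_today:,}")    # 1,000,000,000
print(f"{file_rows_at_100x:,}")  # 100,000,000,000
print(f"{link_rows_at_100x:,}")  # 2,000,000,000,000
```

Even the per-file table would hit the 1B-row comfort limit at today's scale in this worst case.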

BundleVersionFiles

Access patterns:

Common:
- Get URL for single file in a BundleVersion.
- Get names/URLs for all files for a given BundleVersion.
Less frequent:
- What files changed in this BundleVersion?
- What is the history of this file across all BundleVersions?

The files for a given BundleVersion will be tracked using a summary JSON file per BundleVersion, stored by the file store interface (i.e. in S3). A drawback here is that it's not easy to track things at a per-file level. On the bright side, it's really simple to implement and understand.
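
With one summary file per BundleVersion, the "what files changed?" question can be answered by comparing two path-to-hash mappings. The following is a sketch under that assumption; the function and type names are hypothetical.

```python
from typing import Dict, List, NamedTuple


class VersionDiff(NamedTuple):
    added: List[str]
    removed: List[str]
    modified: List[str]


def diff_versions(old: Dict[str, str], new: Dict[str, str]) -> VersionDiff:
    """Compare two path -> content-hash mappings from consecutive BundleVersions."""
    added = sorted(path for path in new if path not in old)
    removed = sorted(path for path in old if path not in new)
    modified = sorted(path for path in new if path in old and new[path] != old[path])
    return VersionDiff(added, removed, modified)


v1_files = {"course.xml": "aaa", "syllabus.pdf": "bbb", "old_notes.txt": "ccc"}
v2_files = {"course.xml": "aaa", "syllabus.pdf": "ddd", "new_notes.txt": "eee"}
changes = diff_versions(v1_files, v2_files)
```

Answering "what is the history of this file across all BundleVersions?" this way means reading every version's summary file, which is the per-file tracking drawback mentioned above.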

Links

Link relationships are more complicated, because we expect to be able to query them in various ways:

Common:
- What are all the Links that a given BundleVersion is using?
- What Linked Bundles have been updated (have newer Bundle Versions)?
- Can I add this Link without forming a cycle (needed anytime we add a Link)?
Reporting/Notifications:
- What Bundles are using my Bundle?
- What Versions of my Bundle are being used?
- How many Bundles use my Bundle?

Avoiding Link Cycles

There are a few problems with cycles:

Infinite recursion when following links. This can be worked around by keeping a followed list and being mindful of the possibility, but as an unusual edge case, people would probably not account for it.
Infinite version bumping. Say there is a cycle between A and B. Then when B is updated, A will have the option of updating its Link to the new version of B. But doing that will bump the version of A, and B will be prompted to update its Link to the new version of A.

Data Model

So with those constraints, the proposed design:

per-BundleVersion file describing the entire set of Link dependencies (including dependency-of-dependencies)
- We'd probably want to cap it at some number of total dependencies, say 2,000 (some courses have over 500 videos).
- When one BundleVersion adds another, this file is the only thing that needs to be inspected, since we've captured all transitive dependencies.
A table for Bundle Link relationships that has just enough information encoded in it to track basic usage and send notifications.
- (Borrowing Bundle ID, Latest Is Using, Lending Bundle ID)
  - Does not encode the transitive dependencies.
  - "Latest Is Using" means that the latest version of borrowing bundle is still using the lending bundle.
    - So if I want to see what Bundles are using mine to see what needs notification, I query this table for Lending Bundle = My Bundle and LatestIsUsing == True.
Notifications and queryable usage at the Bundle level live in the relational database.
Cycle prevention and full dependency expansion happen in a file at the BundleVersion level (stored alongside other BundleVersion data).
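
The Bundle Link table and the notification query described above might look roughly like this. The sketch uses the standard library's sqlite3 as a stand-in for the Django ORM; the table name, column names, and sample Bundle names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bundle_link (
        borrowing_bundle_id TEXT NOT NULL,
        lending_bundle_id   TEXT NOT NULL,
        latest_is_using     INTEGER NOT NULL,  -- 1 if the borrower's latest version still links
        PRIMARY KEY (borrowing_bundle_id, lending_bundle_id)
    )
    """
)
conn.executemany(
    "INSERT INTO bundle_link VALUES (?, ?, ?)",
    [
        ("course_run_a", "wand_tutorial", 1),
        ("course_run_b", "wand_tutorial", 0),  # linked in the past, dropped in its latest version
        ("course_run_a", "the_perfect_egg", 1),
    ],
)


def bundles_to_notify(lending_bundle_id: str) -> list:
    """Bundles whose latest version still uses the given lending Bundle."""
    rows = conn.execute(
        "SELECT borrowing_bundle_id FROM bundle_link "
        "WHERE lending_bundle_id = ? AND latest_is_using = 1 "
        "ORDER BY borrowing_bundle_id",
        (lending_bundle_id,),
    )
    return [row[0] for row in rows]


to_notify = bundles_to_notify("wand_tutorial")
```

Because the table holds one row per (borrowing, lending) pair rather than per BundleVersion, it grows with the number of Bundle relationships, not with the number of Versions.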

Example of what a summary file for a BundleVersion might look like:

{
  // Metafile format version
  "_meta": {
    "version": 1
  },
  // Information about the BundleVersion
  "info": {
    "id": "HqzGTIKeTyWcFn0hWpXKBA",
    "version": "1",
    "title": ""
  },
  // Map paths to source files
  "files": {
    "private": {
      "course.xml": "d41d8cd98f00b204e9800998ecf8427e"
    },
    "public": {
      "syllabus.pdf": "94287380b7700b204e9800998ecf8421"
    }
  },
  // Links to the Sequences we're using (after @ is version). In
  // the Bundle, these end up as sub-dirs of links/
  "links": {
    "week_001": "47skF26fRayxQ73j48oFmA@1",
    "week_002": "rSTYnCDSSAi_uaxuDKfWYw@1",
    "week_003": "TWoW-EYESzK5bJlEiRx2yQ@1",
    "week_004": "kY6RjNJuSWuia9wUa3D1zg@1",
    "week_005": "MFHKfOP0Qeuuj-Y96Koi8g@1",
    "week_006": "Jd9_H8UVSbeDOTlWsFYgAA@1",
    "week_007": "mz_g-lJXQCCTLlm7ous-LA@1",
    "week_008": "1U0Wymd4TUSS4lGUgTPWCw@1",
    "week_009": "i7oIlRWRTGqY07f81Xbdpw@1",
    "week_010": "urcqDvH6QXqAhD_j-XwbwQ@1"
  },
  // Each Link has associated dependencies for all the things it
  // depends on (including dependencies of dependencies). Our own
  // dependencies are the union of all our Link dependencies.
  // * It's ok if different Links require different Versions of
  //   the same Bundle.
  // * No dependency can be added if any version of this Bundle
  //   is listed as one of its dependencies.
  //
  // The goal is to make dependency calculation and cycle
  // detection very fast when trying to add or update a Link.
  "dependencies": {
    "47skF26fRayxQ73j48oFmA@1": [
      "ruSUc2xjQESeyq0fcu0QRw@2",
      "rcHOkLaoSH-VmTspp1xahA@7",
      "xcdnuGWcT_WoN0ueyULCkA@10"
    ]
    // + a lot more
  }
}
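
Given summary files shaped like the example above, a cycle check when adding a Link could be sketched as follows. The function names are hypothetical, and the sketch assumes the summaries have already been parsed into dictionaries (the example above contains comments, so it is not strict JSON).

```python
from typing import Set

MAX_TOTAL_DEPENDENCIES = 2000  # cap proposed in this design


def bundle_id(ref: str) -> str:
    """'bundle_id@version' -> 'bundle_id'."""
    return ref.rpartition("@")[0]


def all_dependencies(summary: dict) -> Set[str]:
    """Every BundleVersion this summary depends on, directly or transitively."""
    deps: Set[str] = set(summary.get("links", {}).values())
    for transitive in summary.get("dependencies", {}).values():
        deps.update(transitive)
    return deps


def can_add_link(own_summary: dict, target_ref: str, target_summary: dict) -> bool:
    """True if linking to target_ref creates no cycle and stays under the cap."""
    own_id = own_summary["info"]["id"]
    target_deps = all_dependencies(target_summary)
    # Cycle check: neither the target nor anything it depends on may be any version of us.
    if bundle_id(target_ref) == own_id:
        return False
    if any(bundle_id(ref) == own_id for ref in target_deps):
        return False
    total = all_dependencies(own_summary) | {target_ref} | target_deps
    return len(total) <= MAX_TOTAL_DEPENDENCIES


# C links to B, and B links to A (made-up ids).
a = {"info": {"id": "A"}, "links": {}, "dependencies": {}}
b = {"info": {"id": "B"}, "links": {"a": "A@1"}, "dependencies": {"A@1": []}}
c = {"info": {"id": "C"}, "links": {"b": "B@1"}, "dependencies": {"B@1": ["A@1"]}}

a_links_to_c = can_add_link(a, "C@1", c)  # would close the loop A -> C -> B -> A
c_links_to_a = can_add_link(c, "A@1", a)  # fine: A depends on nothing
```

Only the target's summary file has to be read, since it already lists all of the target's transitive dependencies.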
