Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Transactional DDL. If a database schema migration fails in MySQL, it has a nontrivial risk of leaving the database in a broken state that triggers a CAT-1 incident to restore site functionality. In PostgreSQL, such a failure simply results in the transaction being rolled back so the database is left in the state it was in before the migration attempt.

  • 1st-class Django support. Django has many useful features that only work with PostgreSQL, and many more cases where “this works on MySQL, but not as well as on PostgreSQL”. And this is a logical prioritization choice for the project’s maintainers - according to the 2022 JetBrains survey of Django developers, 79% of them work with PostgreSQL while only 28% use MySQL in any capacity.

  • Better full text search. MySQL comes with a rudimentary full text search engine, but the one in PostgreSQL is good enough to replace dedicated tools like Elasticsearch for many use cases (including many Open edX installations).

  • Better JSON support. MySQL can store JSON data has text, but its availability to work with it as structured data is very limited. PostgreSQL does this well enough to outperform dedicated document databases like MongoDB in many use cases.

  • Proper Unicode behavior. MySQL made the odd choice to make its default “utf8” encoding only support 3 of the 4 available bytes in the UTF-8 standard. The full 4-byte encoding was only added later as “utf8mb4”, and the default character set was still the much more restrictive latin1 until MySQL 8.0. This is forcing a difficult migration process to correct.

  • Remove blocker to adoption. Organizations that are already strongly invested in PostgreSQL may be reluctant to use Open edX because it only works with MySQL.

  • Developer Experience/Hiring/Retention. For many reasons like the ones noted above, PostgreSQL has become strongly preferred over MySQL by a solid majority of developers. StackOverflow’s 2023 developer survey notes that about 43% of developers would like to be working with PostgreSQL vs. only 24% for MySQL (and 71% of PostgreSQL users are happy with it, vs. only 51% for MySQL). Consequently, despite MySQL’s early lead in popularity for historical reasons, PostgreSQL is now more widely used. (The MariaDB fork of MySQL fares even worse on most of these metrics.). This may not be a determining factor in software engineers deciding where to work, but it lands database selection solidly in the “con” column for many such decisions and could help tip the scales one way or the other.

...

As noted in the history section above, there are some pretty good reasons why PostgreSQL support hasn’t been added to Open edX yet despite its advantages.

  • Development effort opportunity cost. Although we primarily use database-agnostic Django functionality, we have hard-coded references to MySQL in devstack, Tutor, some migration steps with custom SQL, etc. We don’t yet have a solid grasp of the total scope of effort, and any development effort put into it will come at the expense of other valuable initiatives.

  • More maintenance overhead. Supporting two databases takes more work than supporting one, and it’s unlikely that all current Open edX sites would promptly switch from MySQL to PostgreSQL once the support is added.

  • Operational unknowns. We collectively have a lot of experience with running MySQL at scale and have tweaked various indexes and queries to accommodate MySQL (and occasionally Aurora quirks). While it’s clear that Postgres can also scale, we may see performance regressions during the process of migrating any large site.

After adding PostgreSQL support, should we drop MySQL support?

To allow for sane migration plans, we’d need at least some span of time when MySQL and PostgreSQL are both supported. But would we want to continue supporting both indefinitely, or drop MySQL support after allowing a reasonable amount of time for migration?

Reasons to drop MySQL support

  • Lower maintenance overhead. If we drop MySQL support, we get back to only needing to support a single database, reducing the long-term maintenance burden.

  • Freedom to use PostgreSQL-only features. Officially supporting MySQL would essentially prevent usage of many of the nicer PostgreSQL-only features in the core Open edX code. We’d only be able to use it in optional features/extensions, if at all.

  • No need to work around MySQL quirks. Several prominent open source projects have declined to support MySQL despite code contributions to add it, because their experience has been that resolving bugs involving its use takes valuable development effort away from other high priority tasks (and is simply unpleasant to deal with). Having experienced many of these problems ourselves (as described above), we might well make the same decision once PostgreSQL is a viable option.

Reasons to keep MySQL support

  • Community pressure. A significant percentage of the Open edX community may be reluctant to migrate from MySQL to PostgreSQL, especially if they already have significant in-house MySQL experience. We’d have to poll the community to gauge sentiment regarding this.

  • Fewer blockers to adoption. Organizations that have already invested heavily in MySQL might be more reluctant to adopt Open edX if it only works with PostgreSQL.