Review our Working group landing page and details like how we work, how we make decisions
@Edward Zarecor
Quick review of the Landing page.
Discussion of a shared working group calendar. tCRIL can provide Zoom service to all working groups. Should discuss with someone at OpenCraft to understand if there are any issues with changing the Zoom for BTR? @Xavier Antoviaque Should I discuss with anyone in particular?
20 minutes
Discuss platform database support stance.
@Dave Ormsbee (Axim)
We currently run MySQL 5.7 across our stack, though we’ve generally avoided assuming MySQL (tests run in SQLite).
Some recent discussions and developments w.r.t. our use of databases:
Full text search: The idea of using in-database full text search has been floated as a long term replacement to Elasticsearch to help simplify the stack. It’s unclear whether the result quality makes this practical though, particularly with MySQL. At the moment, there is no vendor-neutral abstraction for this in the Django ORM.
Per-field collation: As of Django 3.2, we can specify collation at the field-level. This would allow us to more easily fine tune issues around case sensitivity (e.g. making a UsageKey field case sensitive, but leaving a title to use an accent-aware, case-insensitive collation). These settings are vendor-specific though, and not compatible across databases.
OpenCraft recently ran into this in a Blockstore PR because that repository tries to explicitly use utf8mb4 (i.e. "actual Unicode") when the default is utf8 (3-byte Unicode).
We’ve also had periodic conversations about PostgreSQL–it’s generally better regarded and better supported by Django than MySQL. PostgreSQL would have been our default if we were starting from scratch today, but migrating over would be non-trivial and would likely put us in a years-long transitional period where both databases would need to be supported.
Short term question: Does it make sense to push up the move to MySQL 8.0?
MySQL 5.7 isn’t end of life until Oct. 2023, but 8.0 is already over three years old. It features better Unicode support, window functions, and improved JSON field queries. Not to mention proper ARM64 builds.
Would we even consider MariaDB? (I’m guessing no, since Aurora tracks only MySQL and PostgreSQL.)
Longer term question: What is our stance on PostgreSQL support in general? I can see at least four plausible paths:
Status quo: MySQL is official and well tested, but aim for PostgreSQL compatibility and accept patches to fix any issues.
Officially support both databases, by increasing our testing with PostgreSQL.
Lean into MySQL, by allowing more MySQL-specific features via libraries like django-mysql.
Lean into PostgreSQL, with an eye towards long term migration.
It is also possible that we can fund improvements to Django’s MySQL support in order to address specific gaps.
Notes:
Dave Ormsbee presented the problems we are facing with MySQL collation outlined above.
Is anyone currently running the platform under PostgreSQL
@Igor Degtiarov mentioned that RaccoonGang have done this recently and was going to do some investigation of how hard/easy this was.
Would we consider upgrading the MySQL 5.8 right now?
Should we consider upgrading to MariaDB, it is being currently used on Macs for development
We need to answer and document our approach to multi-database support. We are currently implicitly cross db and behind the ORM, but
Is anyone actually using that feature?
What are we losing in terms of specialized DB features?
Key questions
Should we upgrade to MySQL 8.0?
[SC] Looks like upgrading to MySQL 8.0 might encounter issues on platform. See MySQL 8.0
Do we have sufficient reasons to upgrade now?
MySQL 8 is over 3 years old
Mac development on M1s cannot use MySQL 7
Better Unicode support generally – current MySQL 5.7 uses a 3 byte Unicode format
There are interesting collation improvements
Would running against Maria/5.8 in development be markedly worse?
Maybe, we have seen actual breakage moving to 8.
Tests currently run on SQL lite today.
2U upgrade to 8 is currently not-planned
Need to have control over schedule
Propose upgrading development tools first
Which collation should we be using?
Should we be declaring a universal standard for all IDA
probably yes, but not documented.
Moving to PostgreSQL
has not come up in a thoroughgoing way
we don’t know if it would be better or just different without real evidence\
not confident it could actually replace Elasticsearch
Does MySQL 8 have features we really want or will we come to regret staying with MySQL in 2 years.
Blockstore team was also facing a MySQL issue.
We sometimes have columns we want insensitive, some we want sensitive
This already breaks cross database compatibility because we need to pass in the collation name to the field in the ORM
@Jill Vogel is investigating creating a “custom field” that auto-magically handles field level collation settings.
@Andy Shultz (Deactivated) proposes solving with a field for insensitive sort, a field for sensitive sort. A simple solution that makes the table a little wider, but would be cross platform.
Currently facing this in pipeline from Snowflake (sensitive) → MySQL (insensitive)
If we cannot/will not upgrade to MySQL 5.7 on the server side for Nutmeg – which we can’t – @Dave Ormsbee (Axim) recommends sticking with MariaDB in ARM64 dev environments as it “just works.”
Action items
Develop a solution for running MySQL 8 in Development environments
Could build ARM images ourselves ([Dave]: This is only relevant if we want to run MySQL 5.7 in dev envs–8.0 has pre-built images)
Could simply stick with MariaDB which seems an adequate drop-in replacement
Develop a way to run our migration tests against MySQL 5.7 and 8 simultaneously. This is very important for 2U
We should specify which named release we will upgrade to MySQL 8
Simon to connect with Jeremy on the timing of potential upgrade at 2U
Decisions
Nutmeg: MySQL 5.7 on the server side, and ARM64 dev environments can use MariaDB 10.4
Olive: Support both MySQL 5.7 and 8.0 on the server side, and ARM64 dev environments will move to MySQL 8.0