i18n for React
Requirements
- multiple plurals
- developer notes ("descriptions" in the FormattedMessage component)
- must be able to get a translated message as a plain string (not just a React element)
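To make the "multiple plurals" requirement concrete, here is a minimal sketch that uses the standard `Intl.PluralRules` API rather than react-intl itself. The `pickPlural` helper and the example forms are hypothetical. The point is only that some locales need more than a singular/plural pair, so the file format has to carry every form.

```javascript
// Hypothetical helper: choose a plural form using the built-in Intl.PluralRules.
// This is a stand-in for what an ICU-aware library does, not react-intl's API.
function pickPlural(locale, count, forms) {
  const category = new Intl.PluralRules(locale).select(count);
  // Fall back to "other", which every locale defines.
  const template = forms[category] !== undefined ? forms[category] : forms.other;
  return template.replace('#', String(count));
}

// English needs two forms; Polish needs four (one, few, many, other).
const enFiles = { one: '# file', other: '# files' };
const plFiles = { one: '# plik', few: '# pliki', many: '# plików', other: '# pliku' };

console.log(pickPlural('en', 1, enFiles));
console.log(pickPlural('en', 3, enFiles));
console.log(pickPlural('pl', 2, plFiles));
console.log(pickPlural('pl', 5, plFiles));
```

Assuming the runtime ships full ICU locale data, Polish selects the `few` form for 2 and the `many` form for 5, which is exactly the information a plain key/value export tends to lose.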
General problem: we need a file format that Transifex supports and that preserves the first two requirements (plurals and developer notes). We could convert .json to .po, but that breaks plurals, probably on the inverse transformation. The .po plural format is hairy enough that we have not found any packages online that convert it to .json, so presumably nobody else wanted to do it either.
See whiteboard photos here. See how studio-frontend did it here (caveat: we won't be taking their entire process).
Process
Legend: Working in user account app. Open question.
in .jsx files: react-intl
- given a translation file and a FormattedMessage component, squishes them into one translated message
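The "squishing" step can be sketched in plain JavaScript. This is not react-intl's implementation: `resolveMessage` is a hypothetical stand-in that only shows the lookup order (translated string first, default message as fallback) and simple `{name}` interpolation. In real code the same result would come from `FormattedMessage` or, for a plain string, `intl.formatMessage`.

```javascript
// Hypothetical stand-in for the lookup react-intl performs: prefer the
// translated string for the message id, else fall back to defaultMessage.
function resolveMessage(translations, descriptor, values = {}) {
  const template = Object.prototype.hasOwnProperty.call(translations, descriptor.id)
    ? translations[descriptor.id]
    : descriptor.defaultMessage;
  // Only simple {name} placeholders are handled here; real ICU syntax
  // (plural, select) needs a proper message formatter.
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match
  );
}

// Illustrative translation file contents and message descriptor.
const es = { 'account.greeting': 'Hola, {name}' };
const greeting = {
  id: 'account.greeting',
  defaultMessage: 'Hello, {name}',
  description: 'Greeting shown at the top of the account page',
};

console.log(resolveMessage(es, greeting, { name: 'Ana' })); // Hola, Ana
console.log(resolveMessage({}, greeting, { name: 'Ana' })); // Hello, Ana
```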
extraction: many .jsx → many .json
- makes .json files in src/data/i18n/messages/, in a directory structure that parallels the structure of the .jsx files
- each .json file is an array of JS objects (message id, default message, notes to translator)
- these files do not need to be checked in to version control
concatenation: many .json → one .json (or other format)
- can do this several ways
- reactifex creates one .json file of message id / English pairs (the KEYVALUEJSON format), and then uses curl and the Transifex API to add the translator comments back in
- the Android XML format that Transifex can read seems much closer to .json than the .po format
- this file does not need to be checked in to version control
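The extraction and concatenation steps above can be sketched as follows. The field names (`id`, `defaultMessage`, `description`) follow react-intl's message descriptor. The `toKeyValueJson` helper, the sample messages, and the duplicate-id check are assumptions for illustration, not what reactifex actually does.

```javascript
// Assumed shape of one extracted file: an array of message descriptors.
const accountMessages = [
  { id: 'account.title', defaultMessage: 'Account Settings', description: 'Page heading' },
];
const profileMessages = [
  { id: 'profile.title', defaultMessage: 'Profile', description: 'Page heading' },
];

// Hypothetical helper: flatten many extracted arrays into one
// { id: defaultMessage } object, failing loudly on duplicate ids.
function toKeyValueJson(extractedFiles) {
  const combined = {};
  for (const messages of extractedFiles) {
    for (const { id, defaultMessage } of messages) {
      if (Object.prototype.hasOwnProperty.call(combined, id)) {
        throw new Error(`Duplicate message id: ${id}`);
      }
      combined[id] = defaultMessage;
    }
  }
  return combined;
}

const keyValue = toKeyValueJson([accountMessages, profileMessages]);
console.log(JSON.stringify(keyValue, null, 2));
```

Note that the descriptions are dropped in the flattened output. That loss is why the translator comments have to be added back through the Transifex API afterwards.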
validation (not needed for first draft)
- XSS safety has been documented in Preventing XSS in React.
- i18n-tools validation also supports:
- Ensuring interpolation variables in default and translated strings match (i.e. same number of variables, same names).
- what it covers
- react-intl-translations-manager finds missing translations, etc.
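The interpolation-variable check can be sketched like this. It is not the i18n-tools implementation: both helpers are hypothetical, and the regular expression only recognizes simple `{name}` placeholders, not nested ICU plural or select blocks.

```javascript
// Hypothetical helper: list the names of simple {name} placeholders, sorted.
// Nested ICU constructs (plural, select) are out of scope for this sketch.
function placeholderNames(message) {
  const names = [];
  const pattern = /\{\s*([A-Za-z_]\w*)\s*\}/g;
  let match;
  while ((match = pattern.exec(message)) !== null) {
    names.push(match[1]);
  }
  return names.sort();
}

// True when both strings use the same variables the same number of times.
function variablesMatch(defaultMessage, translatedMessage) {
  const expected = placeholderNames(defaultMessage);
  const actual = placeholderNames(translatedMessage);
  return expected.length === actual.length &&
    expected.every((name, index) => name === actual[index]);
}

console.log(variablesMatch('Hello, {name}', 'Hola, {name}'));   // true
console.log(variablesMatch('Hello, {name}', 'Hola, {nombre}')); // false
```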
translation: jobs run weekly on Jenkins
PR has merged: https://github.com/edx/edx-internal/pull/734. Ticket to reseed the jobs: DEVOPS-8285
- jobs are at https://tools-edx-jenkins.edx.org/job/translations/ - must be on VPN
some of these jobs have been failing for months
- to prepare your repo for pulling message files from Transifex, you must add a .tx/config file to it (you can copy the one from Profile)
- Transifex project is at https://www.transifex.com/open-edx/edx-platform/
- notes
incorporation: translated .json → react-intl
- put your translated messages file at src/i18n/messages/es.json (or whatever locale)
- update src/i18n/i18n-loader.js to load your new locale and your new messages file
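As a rough sketch of what the loader needs to do (the real i18n-loader.js may look different), the logic is a locale-to-messages map plus a fallback. The inline objects stand in for the imported .json files, and `getMessages` is a hypothetical name.

```javascript
// Stand-ins for the imported message files, e.g. src/i18n/messages/es.json.
const esMessages = { 'account.title': 'Configuración de la cuenta' };
const frMessages = { 'account.title': 'Paramètres du compte' };

// English needs no file: react-intl falls back to each defaultMessage.
const messagesByLocale = {
  en: {},
  es: esMessages,
  fr: frMessages,
};

// Hypothetical helper: try the exact locale, then its base language, then English.
function getMessages(locale) {
  const normalized = locale.toLowerCase();
  const base = normalized.split('-')[0];
  const found = [normalized, base].find((key) =>
    Object.prototype.hasOwnProperty.call(messagesByLocale, key)
  );
  const resolved = found !== undefined ? found : 'en';
  return { locale: resolved, messages: messagesByLocale[resolved] };
}

console.log(getMessages('es-419').locale); // es
console.log(getMessages('de').locale);     // en
```

Adding a locale then means adding one import and one entry in the map.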
react-intl and gender agreement
PL translation:
Male:
Female:
Other:
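As a minimal sketch of gender agreement: ICU message syntax, which react-intl supports, offers a `select` argument that picks a variant based on a value such as gender. The Polish strings below are illustrative only, and `selectByGender` is a hypothetical plain-JS stand-in for that selection, not react-intl's API.

```javascript
// In ICU message syntax the whole example would be one message string:
//   {gender, select,
//     male {{name} zaktualizował swój profil}
//     female {{name} zaktualizowała swój profil}
//     other {Użytkownik {name} zaktualizował swój profil}}
// The map below is a plain-JS stand-in for that select.
const profileUpdatedPl = {
  male: '{name} zaktualizował swój profil',
  female: '{name} zaktualizowała swój profil',
  other: 'Użytkownik {name} zaktualizował swój profil',
};

// Hypothetical helper: pick the variant for the gender, defaulting to "other".
function selectByGender(variants, gender, name) {
  const template = Object.prototype.hasOwnProperty.call(variants, gender)
    ? variants[gender]
    : variants.other;
  return template.replace('{name}', name);
}

console.log(selectByGender(profileUpdatedPl, 'female', 'Anna'));
console.log(selectByGender(profileUpdatedPl, 'unknown', 'Alex'));
```

Because the verb form changes with gender in Polish, the translator needs the whole sentence inside each branch rather than a single substituted word.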
Incorporating i18n dependencies
Definitions
- Component: A logical unit of code that has its own package; the content of a repo. (NOTE that in this section, this is not the same as, say, a React component.)
- Top-level component: A micro frontend or other component that is not designed to be consumed. Example: profile app.
- Consumer: A component that uses or incorporates another component.
Optimal choice: Avoid
When possible, components should accept an already-localized string as props, deferring the localization to elements higher up in the element hierarchy.
If you must: Aggregate
Components that need to provide their own i18n must also manage their own translations. This means they will need to have their own translations job on Jenkins.
Because translations can take a long time (e.g., weeks or months) to come in, we want consumers to be able to update the translations for their dependencies, without necessarily pulling in code changes for those dependencies that have happened in the meantime. This requirement implies that translations should be packaged and versioned side by side with, but separately from, the code.
Components must aggregate the translations from their dependencies in their own translations package. Example: Say the Footer component is i18n'ized, and it uses another component FooterSubcomponent that is also i18n'ized. Then the translations package that Footer generates should include the translations for FooterSubcomponent as well.
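The Footer example can be sketched as a per-locale merge. The message ids, strings, and the `aggregateMessages` helper are hypothetical, and the packaging itself (how the merged object gets published) is left out.

```javascript
// Hypothetical per-locale message maps: Footer's own, and its dependency's.
const footerMessages = {
  es: { 'footer.contact': 'Contacto' },
  fr: { 'footer.contact': 'Contact' },
};
const footerSubcomponentMessages = {
  es: { 'footersubcomponent.legal': 'Aviso legal' },
};

// Merge per-locale maps. Later sources win on an id collision, so pass the
// component's own messages last.
function aggregateMessages(...sources) {
  const merged = {};
  for (const source of sources) {
    for (const [locale, messages] of Object.entries(source)) {
      merged[locale] = { ...(merged[locale] || {}), ...messages };
    }
  }
  return merged;
}

const footerPackage = aggregateMessages(footerSubcomponentMessages, footerMessages);
console.log(Object.keys(footerPackage.es));
console.log(Object.keys(footerPackage.fr));
```

In this sketch the `fr` bundle ends up with Footer's strings only, because the dependency has no French translations yet.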
Implications
- If you introduce new messages in a code commit, you should remember to bump your messages peer dependency in package.json so your consumers are reminded to bump their translations version.
- A top-level MFE might wish to support languages that its dependencies don't (yet). The devs will need to add the necessary languages to those components.
- A non-top-level component that doesn't do its own i18n, but that uses a dependency that needs i18n, will still need to package that dependency's translations.
- A change in the translations for a component won't show up until all the components above it are rebuilt.
- If you bump your translation version but not your code version for a dependency, you might get extra strings that your version of the code isn't using. We're okay with this for now.
- Transifex will never see a bundled list of strings to translate, only the strings directly from an individual repo.
Namespace your message ids. (Enforce?)
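If we do decide to enforce this, one possible check is a simple prefix test over the extracted message ids. The `findUnnamespacedIds` helper and the `<namespace>.` convention are assumptions; nothing here is an agreed rule yet.

```javascript
// Hypothetical check: every message id must start with "<namespace>.".
function findUnnamespacedIds(messages, namespace) {
  const prefix = `${namespace}.`;
  return Object.keys(messages).filter((id) => !id.startsWith(prefix));
}

const messages = {
  'footer.contact': 'Contact',
  'footer.legal': 'Legal',
  title: 'Title',
};

console.log(findUnnamespacedIds(messages, 'footer')); // [ 'title' ]
```

A check like this could run as part of the extraction step and fail the build when the returned list is not empty.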
Rejected Alternatives
- We could have each i18n'ed dependency wrap itself in its own IntlProvider translation context.
- But we would still have to pass the desired locale all the way down. (This would be easier if we were building per-locale bundles.)
- Updating a component version to get its newest translation would require us to also accept any code changes that had been made in the meantime.
- We could have each i18n'ed dependency incorporate translations, bumping the patch version, and then check in new translations to every "still supported" major version. But this seems like a maintenance headache.
- We could have each top-level component provide the translations for all of its dependencies.
- But it still has to get those messages out.
- Plus now translators are re-translating some components over and over, possibly not even consistently.
- We could load the translation bundles from the client at run time, but then we'd have to deal with potential headaches:
- One component's translations just not arriving before the request timed out
- Even if it did arrive eventually (or we retried), have we been waiting to render the page all this time? Or are we going to re-render, letting the user see the language suddenly change?