How to set up Python internationalisation with Click, Poetry, Forgejo, and Weblate

TL;DR—look at Protokolo and do exactly what it does.

This is a short article because I am lazy but do want to be helpful. The sections are the steps you should take. All code presented in this article is licensed CC0-1.0.

Use gettext

As a first step, you should use gettext. This effectively means wrapping all user-facing string literals in _() calls. This article won’t spend much time on how to do this or how gettext works. Just make sure to get plurals right (use ngettext instead of branching on the count yourself), and provide translator comments wherever a string is ambiguous out of context.
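
As a rough illustration of plurals, contexts, and translator comments, here is a minimal sketch. It uses gettext.NullTranslations so that it runs without a compiled catalog; the function names and messages are made up for this example.

```python
import gettext

# NullTranslations returns the English source strings unchanged, so this
# sketch runs without any compiled catalog.
translations = gettext.NullTranslations()
_ = translations.gettext
ngettext = translations.ngettext
pgettext = translations.pgettext


def describe_files(count):
    """Return a message with the correct plural form for count."""
    # TRANSLATORS: {count} is the number of files that were processed.
    return ngettext(
        "Processed {count} file.", "Processed {count} files.", count
    ).format(count=count)


def open_label():
    """Return a label whose meaning depends on context."""
    # TRANSLATORS: verb, as in "open a file".
    return pgettext("verb", "Open")


print(describe_files(1))
print(describe_files(3))
print(open_label())
```

Passing both forms to ngettext lets each language's catalog decide how many plural forms exist, which a hand-written if/else on the count cannot do.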

I recommend using the class-based API. In your package, create a file named i18n.py with the following contents.

import gettext as _gettext_module
import os

_PACKAGE_PATH = os.path.dirname(__file__)
_LOCALE_DIR = os.path.join(_PACKAGE_PATH, "locale")

TRANSLATIONS = _gettext_module.translation(
    "your-module", localedir=_LOCALE_DIR, fallback=True
)
_ = TRANSLATIONS.gettext
gettext = TRANSLATIONS.gettext
ngettext = TRANSLATIONS.ngettext
pgettext = TRANSLATIONS.pgettext
npgettext = TRANSLATIONS.npgettext

This assumes that your compiled .mo files will live in your-module/locale/<lang>/LC_MESSAGES/your-module.mo. We’ll take care of that later. Putting the compiled files there isn’t ideal (you want them in /usr/share/locale), but it’s the best you can do with Python packaging.
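
To make the lookup and the effect of fallback=True concrete, here is a small sketch. It points gettext at an empty temporary directory (a stand-in for your-module/locale before anything has been compiled) and uses "nl" as an arbitrary example language.

```python
import gettext
import os
import tempfile

# Hypothetical empty locale directory, standing in for your-module/locale
# before any .mo files have been compiled.
with tempfile.TemporaryDirectory() as locale_dir:
    # The path gettext looks for when the language is "nl".
    expected = os.path.join(locale_dir, "nl", "LC_MESSAGES", "your-module.mo")
    catalog_exists = os.path.exists(expected)
    found = gettext.find("your-module", localedir=locale_dir, languages=["nl"])
    # With fallback=True, a missing catalog does not raise an error.
    translations = gettext.translation(
        "your-module", localedir=locale_dir, languages=["nl"], fallback=True
    )

print(catalog_exists)
print(found)
print(type(translations).__name__)
print(translations.gettext("Hello, world!"))
```

Without fallback=True, the same call would raise an error when no catalog is found, which would break the program for every user whose language has not been translated yet.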

In subsequent files, just do the following to translate strings:

from .i18n import _

# TRANSLATORS: translator comment goes here.
print(_("Hello, world!"))

However, the Click module doesn’t use our TRANSLATIONS object. To fix this, we need to use the GNU gettext API. This is kind of dirty, because it messes with the global state, so let’s do it in cli.py (the file which contains all your Click groups and commands).

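import gettext  # The standard library module, not the function from i18n.py.
from .i18n import _LOCALE_DIR

if gettext.find("your-module", localedir=_LOCALE_DIR):
    gettext.bindtextdomain("your-module", _LOCALE_DIR)
    gettext.textdomain("your-module")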

Internationalise Click

When using Click, you have two challenges:

  1. You need to translate the help docstrings of your groups and commands.
  2. You need to translate the Click gettext strings.

Translating docstrings

Normally, you have some code like this:

@click.group(name="your-module")
def main():
    """Help text goes here."""
    ...

And when you run your-module --help, you get the following output:

$ your-module --help
Usage: your-module [OPTIONS] COMMAND [ARGS]...

  Help text goes here.

Options:
  --help     Show this message and exit.

You cannot wrap the docstring in a _() call. So by necessity, we will need to remove the docstring and do something like this:

_MAIN_HELP = _("Help text goes here.")

@click.group(name="your-module", help=_MAIN_HELP)
def main():
    ...
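
The reason the docstring cannot be wrapped is that Python only treats a bare string literal as a docstring; a function call in the same position is just an expression. A small sketch, using a stand-in for the real _ function:

```python
def _(message):
    """Stand-in for the real gettext function from i18n.py."""
    return message


def documented():
    """Help text goes here."""


def wrapped():
    _("Help text goes here.")


print(documented.__doc__)
print(wrapped.__doc__)
```

The second function has no docstring at all, so Click would have nothing to show as help text.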

For multiple paragraphs, I translate each paragraph separately, which is easier for the translators:

_HELP_TEXT = (
    _("Help text goes here.")
    + "\n\n"
    + _(
        "Longer help paragraph goes here. We use implicit string concatenation"
        " to avoid putting newlines in the translated text."
    )
)

Translate the Click gettext strings

We will create a script generate_pot.sh that generates our .pot file, including the Click translations. My script-fu isn’t very good, but it appears to work.

#!/usr/bin/env sh

# Set VIRTUAL_ENV if one does not exist.
if [ -z "${VIRTUAL_ENV}" ]; then
    VIRTUAL_ENV=$(poetry env info --path)
fi

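# Get all the translation strings from the source. Note that plain sh has no
# recursive globbing: "**" behaves like "*", so src/**/*.py only matches files
# exactly one directory below src/.
xgettext --add-comments --from-code=utf-8 --output=po/your-module.pot src/**/*.py
xgettext --add-comments --output=po/click.pot "${VIRTUAL_ENV}"/lib/python*/*-packages/click/**.py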

# Put everything in your-module.pot.
msgcat --output=po/your-module.pot po/your-module.pot po/click.pot
# Update the .po files. Ideally this should be done by Weblate, but it appears
# that it isn't.
for name in po/*.po
do
    msgmerge --output="${name}" "${name}" po/your-module.pot;
done

After running this script, all strings that must be translated are in your .pot and existing .po files.

You can use the above script for argparse as well, with minor modifications.
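
For argparse, the main modification is pointing xgettext at the argparse source instead of Click's. A sketch of how one might locate that file and build the command; the output path po/argparse.pot is an assumption that mirrors po/click.pot in the script above.

```python
import argparse
from pathlib import Path

# In CPython, argparse ships as part of the standard library, so its source
# can be located through the imported module.
argparse_source = Path(argparse.__file__)

# Hypothetical output path, mirroring po/click.pot in the script above.
command = [
    "xgettext",
    "--add-comments",
    "--output=po/argparse.pot",
    str(argparse_source),
]
print(" ".join(command))
```

The resulting .pot file would then be merged with msgcat in the same way as po/click.pot.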

Generate .pot file automagically

You don’t want to manually run the generate_pot.sh script. Instead, you want the CI (Forgejo Actions) to run it on your behalf whenever a gettext string is changed or introduced.

Use the following .forgejo/workflows/gettext.yaml file.

name: Update .pot file

on:
  push:
    branches:
      - main
    # Only run this job when a Python source file is edited. Not strictly
    # needed.
    paths:
      - "src/your-module/**.py"

jobs:
  create-pot:
    runs-on: docker
    container: nikolaik/python-nodejs:python3.11-nodejs21
    steps:
      - uses: actions/checkout@v3
      - name: Install gettext and wlc
        run: |
          apt-get update
          apt-get install -y gettext wlc
      # We mostly install your-module to install the click dependency.
      - name: Install your-module
        run: poetry install --no-interaction --only main
      - name: Lock Weblate
        run: |
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} lock your-project/your-module
      - name: Push changes from Weblate to upstream repository
        run: |
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} push your-project/your-module
      - name: Pull Weblate translations
        run: git pull origin main
      - name: Create .pot file
        run: ./generate_pot.sh
      # Normally, POT-Creation-Date changes in two locations. Check if the diff
      # includes more than just those two lines.
      - name: Check if sufficient lines were changed
        id: diff
        run:
          echo "changed=$(git diff -U0 | grep '^[+|-][^+|-]' | grep -Ev
          '^[+-]("POT-Creation-Date|#:)' | wc -l)" >> $GITHUB_OUTPUT
      - name: Commit and push updated your-module.pot
        if: ${{ steps.diff.outputs.changed != '0' }}
        run: |
          git config --global user.name "your-module-bot"
          git config --global user.email "<>"
          git add po/your-module.pot po/*.po
          git commit -m "Update your-module.pot"
          git push origin main
      - name: Unlock Weblate
        run: |
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} pull your-project/your-module
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} unlock your-project/your-module

The job is fairly self-explanatory. The wlc command talks with Weblate, which we will set up soon. The job installs dependencies, gets the latest translations from Weblate, generates the .pot, and then pushes the generated .pot (and .po files) if there were changed strings.
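
The least obvious step is the diff check, which is a dense grep pipeline. The same logic can be sketched in Python for readability; this is an illustration of what the pipeline counts, not something the workflow runs, and the sample diff is invented.

```python
import re

# Added or removed lines, excluding the "+++" and "---" file headers.
_CHANGED = re.compile(r"^[+-][^+-]")
# Lines that change on every run: the timestamp and source references.
_IGNORED = re.compile(r'^[+-]("POT-Creation-Date|#:)')


def count_meaningful_changes(diff_text):
    """Count added/removed lines that are not timestamps or source refs."""
    return sum(
        1
        for line in diff_text.splitlines()
        if _CHANGED.match(line) and not _IGNORED.match(line)
    )


sample_diff = """\
--- a/po/your-module.pot
+++ b/po/your-module.pot
@@ -10 +10 @@
-"POT-Creation-Date: 2023-01-01 00:00+0000\\n"
+"POT-Creation-Date: 2023-01-02 00:00+0000\\n"
@@ -20 +20 @@
-#: src/your-module/cli.py:10
+#: src/your-module/cli.py:12
@@ -30,0 +31,2 @@
+msgid "New string"
+msgstr ""
"""
print(count_meaningful_changes(sample_diff))
```

Only the two msgid/msgstr lines count here, so the job would commit. A diff that touches nothing but the timestamp and the #: references yields zero and the commit step is skipped.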

See reuse-tool for a GitHub Actions job. It is currently missing the wlc locking.

Set up Weblate

Create your project in Weblate. In the VCS settings, set version control system to ‘Git’. Set your source repository and branch correctly. Set the push URL to https://<your-token>@codeberg.org/your-name/your-module.git. You get the token from https://codeberg.org/user/settings/applications. You will need to give the token access to ‘repository’. There should be a more granular way of doing this, but I am not aware of it.

Set the repository browser to https://codeberg.org/your-name/your-module/src/branch/{{branch}}/{{filename}}#{{line}}. Turn ‘Push on commit’ on, and set merge style to ‘rebase’. Also, always lock on error.

In your project settings on Weblate, generate a project API token. Then in your Forgejo Actions settings, create a secret named WEBLATE_KEY with the project API token as value.

Publishing your translations with Poetry

Now that all the translation plumbing is working, you just need to make sure that you generate your .mo files when building/publishing with Poetry.

We add a build step to Poetry using the undocumented build script. Add the following to your pyproject.toml:

[tool.poetry.build]
generate-setup-file = false
script = "_build.py"

Do NOT name your file build.py. It will break Arch Linux packaging.

Create the file _build.py. Here are the contents:

import glob
import logging
import os
import shutil
import subprocess
from pathlib import Path

_LOGGER = logging.getLogger(__name__)
ROOT_DIR = Path(os.path.dirname(__file__))
BUILD_DIR = ROOT_DIR / "build"
PO_DIR = ROOT_DIR / "po"


def mkdir_p(path):
    """Make directory and its parents."""
    Path(path).mkdir(parents=True, exist_ok=True)


def rm_fr(path):
    """Force-remove directory."""
    path = Path(path)
    if path.exists():
        shutil.rmtree(path)


def main():
    """Compile .mo files and move them into src directory."""
    rm_fr(BUILD_DIR)
    mkdir_p(BUILD_DIR)

    msgfmt = None
    for executable in ["msgfmt", "msgfmt.py", "msgfmt3.py"]:
        msgfmt = shutil.which(executable)
        if msgfmt:
            break

    if msgfmt:
        po_files = glob.glob(f"{PO_DIR}/*.po")
        mo_files = []

        # Compile
        for po_file in po_files:
            _LOGGER.info(f"compiling {po_file}")
            lang_dir = (
                BUILD_DIR
                / "your-module/locale"
                / Path(po_file).stem
                / "LC_MESSAGES"
            )
            mkdir_p(lang_dir)
            destination = Path(lang_dir) / "your-module.mo"
            subprocess.run(
                [
                    msgfmt,
                    "-o",
                    str(destination),
                    str(po_file),
                ],
                check=True,
            )
            mo_files.append(destination)

        # Move compiled files into src
        rm_fr(ROOT_DIR / "src/your-module/locale")
        for mo_file in mo_files:
            relative = (
                ROOT_DIR / Path("src") / os.path.relpath(mo_file, BUILD_DIR)
            )
            _LOGGER.info(f"copying {mo_file} to {relative}")
            mkdir_p(relative.parent)
            shutil.copyfile(mo_file, relative)


if __name__ == "__main__":
    main()

It is probably a little over-engineered (building into build/ and then subsequently copying to src/your-module/locale is unnecessary), but it works.
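
If you would rather skip the intermediate build/ directory, the destination inside src/ can be computed directly from the .po file name. A minimal sketch; mo_destination is a hypothetical helper and not part of the script above.

```python
from pathlib import Path


def mo_destination(po_file, package_dir, domain="your-module"):
    """Return where the compiled catalog for po_file should be written."""
    language = Path(po_file).stem  # "po/nl.po" -> "nl"
    return (
        Path(package_dir) / "locale" / language / "LC_MESSAGES" / f"{domain}.mo"
    )


print(mo_destination("po/nl.po", "src/your-module").as_posix())
```

The returned path could be passed straight to msgfmt via its -o option, after creating the parent directory.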

Finally, make sure to actually include *.mo files in the [tool.poetry] section of pyproject.toml:

include = [
    { path = "src/your-module/locale/**/*.mo", format="wheel" }
]
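
To check that the catalogs really end up in the built wheel, you can list its contents; a wheel is an ordinary zip archive. A small sketch, where list_mo_files is a hypothetical helper name:

```python
import zipfile


def list_mo_files(wheel_path):
    """Return the names of all .mo files inside a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name for name in wheel.namelist() if name.endswith(".mo")
        )
```

After a build, call it with the path of the wheel in dist/ and expect one entry per language under your-module/locale/. An empty list suggests the include setting or the build script is not being picked up.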

And that’s it! A rather dense and curt blog post, but it should contain helpful bits and pieces.

See also