Implement internationalization support #9538

Merged (1 commit) on Aug 24, 2021

Conversation

msujew
Member

@msujew msujew commented Jun 1, 2021

What it does

Closes #1327

Adds internationalization support for Theia. The following features are supported:

  • Support for vscode language packs to localize Theia itself.
  • Support for localized vscode extensions (both package.json and runtime translations via nls.localize())
  • Support for custom Theia localizations through a LocalizationContribution interface.
  • Support for localization of the monaco editor (search widget, context menu and so on).
  • Showing the original text under the localized value (for commands).

Things that are out of scope for this PR (and will probably be addressed in another PR):

  • Support for vscode language packs to localize other vscode extensions (like the builtins)

How it works

  1. The core/browser package contains a nls namespace with a localize function similar to the one in vscode. It accepts a translation key, a default value (usually the original English value) and additional args. The translations are loaded before any other imports so that they are available at load time of any other code.
  2. The backend application loads vscode language packs and Theia's LocalizationContributions and places them in a LocalizationProvider. Additionally, the backend exposes a /i18n/<locale> HTTP endpoint to retrieve localizations for any locale.
  3. The vscode plugin host starts the vscode extension process with the locale that is currently used by the user. This allows vscode extensions to translate themselves.
  4. The monaco-loader.ts uses the frontend locale as well to configure monaco to use the currently active language.
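
The lookup from step 1 and the endpoint path from step 2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `Translations`, `setTranslations` and `localizationUrl` are hypothetical names, and only the `localize(key, defaultValue, ...args)` shape and the `/i18n/<locale>` path come from the description above.

```typescript
// Hypothetical payload of GET /i18n/<locale>: translation key -> translated text.
type Translations = Record<string, string>;

// Translations for the active locale. Empty until loaded, so defaults are used.
let active: Translations = {};

// Build the endpoint path for a locale (path taken from the PR description).
function localizationUrl(locale: string): string {
    return `/i18n/${encodeURIComponent(locale)}`;
}

// Install the fetched translations. In the PR this happens before other imports run.
function setTranslations(translations: Translations): void {
    active = translations;
}

// Look up a key, fall back to the default value, then fill {0}, {1}, ... placeholders.
function localize(key: string, defaultValue: string, ...args: unknown[]): string {
    const translated = active[key];
    const template = translated !== undefined ? translated : defaultValue;
    return template.replace(/\{(\d+)\}/g, (match: string, index: string) => {
        const i = Number(index);
        // Leave the placeholder untouched if no argument was supplied for it.
        return i < args.length ? String(args[i]) : match;
    });
}
```

Because `localize` falls back to the default value, code that calls it before any language pack is installed still shows the English text.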

How to test

  1. Check that the Configure Display Language command can be invoked but only offers en (German translations exist in the backend, but no language pack is installed yet).
  2. Download & install the German language pack (v1.50, search for german in the Theia extension menu).
  3. Download & install the i18n sample extension in German.
  4. Execute the Hello and Bye commands contributed by the i18n sample extension. They simply show Hello and Bye, respectively, in a notification.
  5. Execute the Configure Display Language command and choose the German locale de.
  6. The application will reload (you may have to reload it yourself) and will be partially translated into German. You will see this in the main menu as well as in the workspace and some common commands. (Test for vscode language packs)
  7. Execute the Hallo (Hello) and Auf Wiedersehen (Bye) commands, which will now display German text as a notification. (Test that vscode extensions localize themselves correctly and that Theia reads the translated package.json correctly)
  8. In the context menu of the file explorer, you will see that the Download menu item has been replaced with Herunterladen. This was done with a German LocalizationContribution.
  9. (Optional) Replace the editor.main.nls.de.js in the examples/browser/lib/vs/editor source directory with this file. Run gzip -k editor.main.nls.de.js in that directory. After reloading the Theia application with the de locale, the monaco search widget (Ctrl+F) and the context menu will show German translations. This is a workaround, see below for more information.

Open Issues

  • Do we need internationalization support for numbers and dates? - Will be addressed in a separate PR, if at all
  • How about right to left languages?
  • The @theia/monaco-editor-core package contains only English translations. Is this done on purpose? Replacing the nls.*.json files with the correct translations will translate monaco correctly.
  • The translations for the monaco editor rely on the order of certain language pack versions. This isn't an issue for the default languages (like de, it, fr...), but makes it virtually impossible for vscode language packs to actually contribute to monaco's translation process.
  • Language Packs can't be uninstalled without restarting the server (all localizations are held in memory after startup), as we don't know which translations are contributed by vscode language packs and which by Theia extensions yet.
  • Currently the LocalizationProvider holds a single active language for the whole application. Is there a way to hold a value for every user?
  • How do we handle translations contributed by Theia extensions without having a language pack installed? Selecting the German locale, for example, will only translate Theia's own commands, not the rest.
  • Theia's quick-input-bar doesn't filter by the details of the item, unlike vscode. This makes finding commands quite difficult after switching the locale.

Review checklist

Reminder for reviewers

    // Prefix the command label with the category if it exists, else return the simple label.
    - return command.category ? `${command.category}: ${command.label}` : command.label;
    + return command.category ? `${command.category}: ${label}` : label;
Contributor

It looks like command categories are also available for localization in VSCode:
image

Would it be possible to add those to the PR conveniently?

Member Author

@msujew msujew Jun 2, 2021

Totally forgot about those. Done. You can test this with the commands in the File category.

Also, you reminded me that vscode shows the original text under the localized text. I implemented this as well.

@colin-grant-work
Contributor

A general observation: as this is currently implemented, a lot of services need to know about it or opt into it. For example:

    export function compareCommands(a: Command, b: Command): number {
        if (a.label && b.label) {
            const aLabel = LocalizationInfo.localize(a.label);
            const bLabel = LocalizationInfo.localize(b.label);
            const aCommand = (a.category ? `${LocalizationInfo.localize(a.category)}: ${aLabel}` : aLabel).toLowerCase();
            const bCommand = (b.category ? `${LocalizationInfo.localize(b.category)}: ${bLabel}` : bLabel).toLowerCase();
            return (aCommand).localeCompare(bCommand);
        } else {
            return 0;
        }
    }

The commands module, in order to implement a simple utility, needs to know that it may have to translate the input, and the same will need to be done everywhere we touch a string that might ever want to be localized - and at the limit, that's basically anywhere a string is handled.

By contrast, VSCode's solution looks like it happens at load time, and thereafter the application can behave as though it's dealing with fixed strings, but you have to reload the application to apply a new localization. In general, we can't make very many assumptions about what we know at load time, so we may not be able to achieve the same level of parsimony that VSCode has, but I wonder if there's an architectural alternative that reduces the impact relative to the current approach.

@msujew
Member Author

msujew commented Jun 9, 2021

Hi @colin-grant-work, thanks for the input.

I wasn't too happy about the label: string | LocalizationInfo solution either. I present another take on that in a fork of my i18n branch which uses a more vscode like API by loading everything on startup. However, this solution is quite diametrical towards the usual architecture of Theia, since it uses an express endpoint to serve localizations instead of using the websocket connection. What do you think about that?

@paul-marechal
Member

paul-marechal commented Jun 9, 2021

However, this solution is quite diametrical towards the usual architecture of Theia, since it uses an express endpoint to serve localizations instead of using the websocket connection.

@msujew Serving things over regular HTTP is not completely crazy if we are talking about fetching data (think stateless things like REST API). Remote logic with side effects and state are usually better encapsulated with the RPC system. So it sounds fine here.

On the other hand, the way you defined the express route is pretty weird:

  • Why not use a proper BackendApplicationContribution to initialize the localization registry?
  • Then replace the this.use(...) by something like configure(app) and do app.get('/i18n/:locale', ...)?
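
A rough sketch of what such a contribution might look like. To stay self-contained it does not import Express or Theia: `MiniRequest`, `MiniResponse` and `MiniApp` are hypothetical stand-ins for the Express types, and `LocalizationRegistry` and `LocalizationBackendContribution` are made-up names, so treat this as an outline of the idea rather than the code that ended up in the PR.

```typescript
// Minimal stand-ins for the Express request/response/app types (assumptions).
interface MiniRequest { params: { locale: string }; }
interface MiniResponse { json(body: unknown): void; }
type Handler = (req: MiniRequest, res: MiniResponse) => void;
interface MiniApp { get(path: string, handler: Handler): void; }

// Hypothetical in-memory registry: locale -> (translation key -> translated text).
class LocalizationRegistry {
    private readonly byLocale: Record<string, Record<string, string>> = {};

    add(locale: string, translations: Record<string, string>): void {
        this.byLocale[locale] = { ...(this.byLocale[locale] ?? {}), ...translations };
    }

    get(locale: string): Record<string, string> {
        return this.byLocale[locale] ?? {};
    }
}

// Contribution-style class: the route is registered in configure(app)
// instead of being wired up ad hoc with this.use(...).
class LocalizationBackendContribution {
    private readonly registry: LocalizationRegistry;

    constructor(registry: LocalizationRegistry) {
        this.registry = registry;
    }

    configure(app: MiniApp): void {
        // Stateless fetch: return every known translation for the requested locale.
        app.get('/i18n/:locale', (req, res) => {
            res.json(this.registry.get(req.params.locale));
        });
    }
}
```

Unknown locales yield an empty object here, which lets the frontend fall back to its default strings. Whether that or a 404 is preferable is a design choice this sketch does not settle.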

@msujew
Member Author

msujew commented Jun 9, 2021

@paul-marechal True, if we look at it from a pure fetching perspective, that makes more sense. I always had the additional overhead of storing the locale in the backend (for the purpose of localizing vscode extensions) in the back of my mind. However, just fetching Theia's own localizations is completely independent of that, so doing this through an HTTP call would be appropriate. For everything else we could still use the RPC based API.

And yes, I just wanted to have a quick and dirty solution for presenting the express based approach. Implementing all of that as a proper BackendApplicationContribution looks way better.

I'll refactor my original code and come back to you.

@msujew msujew force-pushed the i18n branch 2 times, most recently from 8acb1dc to 3543dc6 Compare June 14, 2021 13:49
@msujew
Member Author

msujew commented Jun 14, 2021

@paul-marechal I implemented everything with a BackendApplicationContribution and removed the LocalizationService completely. Do you have an idea where to put the loadTranslations call (currently made here) to ensure that the translations are loaded before localize is called?

packages/core/src/browser/frontend-application.ts (outdated, resolved)
@msujew msujew force-pushed the i18n branch 2 times, most recently from 6a72b3c to 8db5529 Compare June 15, 2021 20:52
@msujew
Member Author

msujew commented Jun 15, 2021

I just implemented a vscode style LocalizedString type. We need that type to store the original untranslated name of commands to display them under the translated name in the quick input.

However, it increases maintenance efforts and is a breaking change. WDYT, is that a feature we want to support in Theia? Or do we rather use something like originalLabel?: string, originalCategory?: string on commands?

@paul-marechal paul-marechal dismissed their stale review June 16, 2021 03:31

loadTranslations is awaited before importing dependent source files.

@paul-marechal
Member

We need that type to store the original untranslated name of commands to display them under the translated name in the quick input.

I wasn't sure what you meant, so here's what it looks like in VS Code:

image

@msujew
Member Author

msujew commented Jun 16, 2021

I actually didn't like my first approach because of the breaking changes it introduces. Therefore I already implemented the originalLabel?: string, originalCategory?: string approach.

@colin-grant-work colin-grant-work self-requested a review June 21, 2021 22:12
packages/core/src/common/menu.ts (outdated, resolved)
packages/core/src/node/backend-application-module.ts (outdated, resolved)
packages/keymaps/src/browser/keybindings-widget.tsx (outdated, resolved)
packages/monaco/src/browser/monaco-browser-module.ts (outdated, resolved)
packages/monaco/src/browser/monaco-frontend-module.ts (outdated, resolved)
packages/navigator/src/browser/navigator-contribution.ts (outdated, resolved)
packages/core/src/common/command.ts (resolved)
@msujew msujew force-pushed the i18n branch 4 times, most recently from 6c80000 to 6913978 Compare June 22, 2021 12:45
@msujew
Member Author

msujew commented Jun 25, 2021

@iamtangram I'm currently waiting for further reviews. If you're so eager for this feature to release, you can join the discussion as well, discussing open questions that I posted in the initial PR comment or doing a review of my code ;)

Any input - even from non-committers of Theia - is greatly appreciated.

@msujew
Member Author

msujew commented Jul 7, 2021

As I don't know the service, it's hard to tell. Do you want to open a feature request detailing what you want to do?

Sure, I just created #9708 where we can discuss anything related to that.

@marcdumais-work
Contributor

At a quick glance, I can't easily tell what's pending before this PR can be approved. @tsmaeder do we need #9708 to be resolved before we can proceed? What else needs addressing?

@marcdumais-work
Contributor

@msujew I see there are currently merge conflicts - it could be a good idea to rebase the PR on latest master?

@marcdumais-work
Contributor

marcdumais-work commented Jul 28, 2021

(but you can send me a PR yourself by using a link in the following fashion: msujew/theia@i18n...misakajimmy:)

We need to be careful about integrating another person's changes within a pull-request - any contribution should be done "above board", without obfuscating the provenance/authorship, and is subject to the usual Eclipse Foundation rules. For example:

When the contributor has signed the Eclipse Contributor Agreement (ECA) and the following conditions are met, a CQ is not required:

  • Was developed from scratch; written 100% by submitting contributor;

See here for more details.

@msujew
Member Author

msujew commented Aug 6, 2021

@tsmaeder I wrote a small proposal for the coding guidelines:

I18n

  1. Always translate any user facing text with the nls.localize(key, defaultValue, ...args) function.

What is user facing text? Any strings that are hardcoded (not calculated) that could be in any way visible to the user, be it labels for commands and menus or messages.

1.1. Parameters for messages should be passed as the args of the localize function. They are inserted at the location of the placeholders - in the form of {\d+} - in the localized text. E.g. {0} will be replaced with the first arg, {1} with the second, etc.

  2. Use utility functions where possible:

    // bad
    command: Command = { label: nls.localize(key, defaultValue), originalLabel: defaultValue };
    // good
    command = Command.toLocalizedCommand({ id: key, label: defaultValue });
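
To illustrate why the utility is preferable, here is a minimal sketch of what a helper in the spirit of `Command.toLocalizedCommand` might do. It is an assumption-laden outline, not the implementation from this PR: the `translations` table, the simplified `localize` (no placeholder handling) and the `labelKey`/`categoryKey` parameters are hypothetical, and the helper is a plain function here rather than a member of `Command`.

```typescript
// Only the fields relevant here. originalLabel/originalCategory are the
// optional fields discussed earlier in this conversation.
interface Command {
    id: string;
    label?: string;
    originalLabel?: string;
    category?: string;
    originalCategory?: string;
}

// Hypothetical translation table standing in for a loaded language pack.
const translations: Record<string, string> = { 'file.download': 'Herunterladen' };

// Simplified stand-in for nls.localize (placeholder handling omitted).
function localize(key: string, defaultValue: string): string {
    return translations[key] ?? defaultValue;
}

// Translate label (and optionally category) while remembering the originals,
// so callers cannot forget to set originalLabel by hand.
function toLocalizedCommand(command: Command, labelKey: string = command.id, categoryKey?: string): Command {
    const result: Command = { ...command };
    if (command.label !== undefined) {
        result.label = localize(labelKey, command.label);
        result.originalLabel = command.label;
    }
    if (command.category !== undefined && categoryKey !== undefined) {
        result.category = localize(categoryKey, command.category);
        result.originalCategory = command.category;
    }
    return result;
}
```

With the "bad" variant above, the default value has to be repeated for `label` and `originalLabel`; a helper like this keeps the two in sync by construction.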

Contributor

@jbicker jbicker left a comment

LGTM :-)
Since the last requests have been resolved by @msujew and no more have been received for about two weeks, I think it's okay to approve it. If there are no further objections, we will merge it next week.

@msujew
Member Author

msujew commented Aug 20, 2021

FYI, I will drop the German custom translation before merging this PR and follow up with another PR that includes all translations contributed by the vscode language packs. Or would you rather see this as a part of this PR? We would like to include everything in the August release if possible.

@fstasi

fstasi commented Aug 23, 2021

FYI, I will drop the German custom translation before merging this PR and follow up with another PR that includes all translations contributed by the vscode language packs. Or would you rather see this as a part of this PR? We would like to include everything in the August release if possible.

@msujew I agree on having two separate PRs, one for the actual code implementation and another for the translations. This will reduce the noise in this one.

@paul-marechal
Member

  • Language Packs can't be uninstalled without restarting the server (all localizations are held in memory after startup), as we don't know which translations are contributed by vscode language packs and which by Theia extensions yet.

Can you point to the code that is the cause of loading everything in memory?

  • Currently the LocalizationProvider holds a single active language for the whole application. Is there a way to hold a value for every user?

Sounds like state that should live in the frontend only?

  • Theia's quick-input-bar doesn't filter by the details of the item, unlike vscode. This makes finding commands quite difficult after switching the locale.

I was about to comment on this. If the plan is to not support this yet then OK to merge.

Member

@paul-marechal paul-marechal left a comment

I followed the "How to test" steps and everything worked.

Code LGTM, although regarding the various pending issues I am fine with merging this PR before the upcoming release if we are ready to potentially make breaking changes, as required to add/fix the rest later.

@msujew
Member Author

msujew commented Aug 24, 2021

Can you point to the code that is the cause of loading everything in memory?

We load everything here:

    if (deployed.contributes?.localizations) {
        this.localizationProvider.addLocalizations(...buildTheiaLocalizations(deployed.contributes.localizations));
    }

Sounds like state that should live in the frontend only?

I agree. There are two issues with that though:

  1. The backend needs to know the language selected by the user to start the plugin host with the correct language (for vscode extensions to localize themselves correctly)
  2. The backend will regularly send things like messages to the frontend. These will have to be translated as well.

We can work around both points I guess. For 1. we only need to know the selected language on startup, and forget about it afterwards. This is basically how it works already, so we don't have to change anything. For 2. we would need to only send the translation keys to the frontend, which will then translate the message on its own. This is quite unintuitive though.
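
To make the workaround for point 2 more concrete, here is a rough sketch of sending translation keys instead of finished text. Everything in it is hypothetical (`LocalizableMessage`, `toLocalizableMessage`, `renderMessage` and the example key); it only illustrates the idea and is not part of this PR.

```typescript
// Hypothetical wire format: the backend sends a key plus fallback text
// instead of an already translated string.
interface LocalizableMessage {
    key: string;
    defaultValue: string;
    args: string[];
}

// Backend side: describe the message without translating it.
function toLocalizableMessage(key: string, defaultValue: string, ...args: string[]): LocalizableMessage {
    return { key, defaultValue, args };
}

// Frontend side: translate with whatever locale this frontend has loaded.
function renderMessage(message: LocalizableMessage, frontendTranslations: Record<string, string>): string {
    const template = frontendTranslations[message.key] ?? message.defaultValue;
    // Fill {0}, {1}, ... and keep placeholders that have no matching argument.
    return template.replace(/\{(\d+)\}/g, (match: string, index: string) => message.args[Number(index)] ?? match);
}
```

The backend then no longer needs to know the active language for messages, at the cost of every message call site having to pass a key and arguments rather than a plain string.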

I was about to comment on this. If the plan is to not support this yet then OK to merge.

This was fixed by #9928. I'll rebase everything, and then that'll work out of the box.

@msujew
Member Author

msujew commented Aug 24, 2021

@paul-marechal After rebasing, filtering by the original command name is now possible. The only command still getting translated is the Configure Display Language command. You can test it that way.

Successfully merging this pull request may close these issues.

Theia and National Language Support and Accessibility
8 participants