Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev/translation#67 - Define "Translation" table. Add during installation/upgrade. #20478

Merged

Conversation

totten
Copy link
Member

@totten totten commented Jun 2, 2021

Overview

This creates a table/entity, Translation (civicrm_translation), to represent a single translated string (e.g. a translated message-title or a translated event-description). Loosely speaking, any field in the database can be designated as translatable -- and then it will be permitted to store values like:

INSERT INTO civicrm_translation (entity_table, entity_id, entity_field, language, string)
VALUES ('civicrm_event', 100, 'title', 'fr_FR', 'La nouvelle chaine')

(See also: https://lab.civicrm.org/dev/translation/-/issues/67)

ping @eileenmcnaughton

After

The SQL table, DAO, and BAO are defined.

Comments


This is based on a civi-data-translate strings table, but with some changes:

  • Entity names are usually singular, but String is conflicted. This use "Translation" which fits in a bit better grammatically. (I previously used hybrid String/Strings depending on context, but we negotiated Translation on tcon.)
  • The language only needs 5 characters (NN_nn).
  • Consolidated bool is_active and bool is_default into one int status_id.
  • Added indexing
  • Marked the dynamic foreign key
  • Pared back the list of fields to which it applies -- to be basically empty by default. To activate support for translating a specific field, it needs to be declared via hook_civicrm_translatedFields.

This commit includes the BAO with some of the backing-methods required for API exposure. However, the API won't really work until we have the validation-values event, so the API has been kicked to a subsequent PR.

The list of translatable entities/fields will be signficant because it will determine when/how to redirect data in API calls. This patch does not commit to specific translatable fields - but it does provide a hook to determine them.

When the API PR becomes unblocked, it will include test-coverage that hits the API, BAO, and hook.

…tion/upgrade.

This creates an entity, `Translation` (`civicrm_translation`), to represent a single translated database value. Loosely speaking,
any field in the database can be designated as translatable -- and then it will be permitted to store values like:

```sql
INSERT INTO civicrm_translation (entity_table, entity_id, entity_field, language, string)
VALUES ('civicrm_event', 100, 'title', 'fr_FR', 'La nouvelle chaine')
```

This is based on a `civi-data-translate` strings table, but with some changes:

* Entity names are usually singular, but `String` is conflicted. I previously used hybrid
  String/Strings (depending on context), but we negotiated `Translation` on tcon.
* The language only needs 5 characters (NN_nn).
* Consolidated `bool is_active` and `bool is_default` into one `int status_id`.
* Added indexing
* Mark dynamic foreign key

This commit includes the BAO with some of the backing-methods required for
API exposure.  However, the API won't really work until we have the
validation-values event, so the API has been kicked to a subsequent PR.

The list of translatable entities/fields will be signficant because it will
determine when/how to redirect data in API calls.  This patch does not
commit to specific translatable fields - but it does provide a hook to
determine them.

When the API PR becomes unblocked, it will include test-coverage that hits the API, BAO, and hook.
@civibot
Copy link

civibot bot commented Jun 2, 2021

(Standard links)

@civibot civibot bot added the master label Jun 2, 2021
@totten totten changed the title dev/translation#67 - Define "Translation" entity. Add during installation/upgrade. dev/translation#67 - Define "Translation" table. Add during installation/upgrade. Jun 2, 2021
@eileenmcnaughton
Copy link
Contributor

CRM_Core_DAOConformanceTest::testFieldsHaveTitles with data set #9 ('CRM_Core_DAO_Translation')
A title must be defined for id in CRM_Core_DAO_Translation
Failed asserting that an array has the key 'title'.

@totten
Copy link
Member Author

totten commented Jun 2, 2021

@eileenmcnaughton Yup, updated.

<!-- Prediction: In a large DB with many events/contribution-pages/groups/mailings/etc, entity ID will have best selectivity. -->
<!-- Prediction: Over diverse set of deployments, the selectivity of 'table' and 'language' will be similar. -->
<fieldName>entity_id</fieldName>
<fieldName>entity_table</fieldName>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten note that in a combined index the second 2 parts of the index will never be hit if the first is not - ie

SELECT * FROM civicrm_translation WHERE id = 5 AND entity_table = 'civicrm_discount' AND language = 'en_US'

will utilise the full index.

If the query is

SELECT * FROM civicrm_translation WHERE entity_table = 'civicrm_discount' AND language = 'en_US'

then the index will not be used at all.

For this query

SELECT * FROM civicrm_translation WHERE id = 5 AND AND language = 'en_US'

the index on id is used (yay) but not on language

I'm actually fine merging as is - I suspect the entity_id is what we really need but it might be over time that we add other other indexes on the other 2 fields so we can retrieve all by language or all by entity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually fine merging as is - I suspect the entity_id is what we really need but it might be over time that we add other other indexes...

Yeah, this is my theory too - at least for the current cases where you start by filtering the main entity (MessageTemplate, Event, etc) and then (as a final step) merge-in the translations for the output.

When/if we have a case for textual filtering on translated columns, then we'll need to consider other indexes...

* Ex: $fields['civicrm_event']['summary'] = TRUE
* Ex: $fields['civicrm_event']['summary'] = 'yesplease';
*
* At time of writing, the `$fields` list x is prepopulated based on `<localizable>` fields in core's `xml/schema`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@totten I thought we decided we would not use existing localizable as that would lead to both systems trying to do the same field (we could add 'translatable' with 1 being ignored if both is set)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I changed the logic, but I'll update the stale comment.

* @param array $fields
* List of data fields to translate, organized by table and column.
* Omitted/unlisted fields are not translated. Any listed field may be translated.
* Values should be TRUE-ish.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this confusing - the value should be TRUE - but you are WARNING that setting a different value might resolve to true - it comes across as you saying 'there is no recommendation - do what you want' and also is a lot more cognitive load than just daying 'the value should be boolean' and expecting people to have enough sense/coding skills to realise that they should set a boolean or something weird might happen

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was because the original list of translatable fields (SchemaStructure) had TRUE-ish things rather than strict booleans. But since we're not feeding off that now, we can document it as straight-up boolean.

The previous comments reflected a developmental iteration.
@totten
Copy link
Member Author

totten commented Jun 3, 2021

@eileenmcnaughton Pushed clarifications on the hook docblock.

* At time of writing, the `$fields` list x is prepopulated based on `<localizable>` fields in core's `xml/schema`.
* In the future, it may also be prepopulated with `<localizable>` fields in ext's `xml/schema`.
* For the interim, you may wish to fill-in `<localizable>` fields from ext's.
* @return mixed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other places we have * @return null

  • the return value is ignored
    Do we ignore this one?

Copy link
Member Author

@totten totten Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eileenmcnaughton Yes, the return value is ignored here.

Of course, both @return mixed and @return null feel misleading to me (for the ~95% of hooks which don't really return anything). To my skimming, the @return mixed looks more consistent with the style of surrounding code. (~77x mixed vs ~35x null).

If we wanted it to make sense, then I suppose those ~110 regular (non-returning) hook-stubs should actually be made truly void...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm ok -

@eileenmcnaughton
Copy link
Contributor

OK - just noting that this is as discussed in the gitlab & the usage will follow to get the schema changes merged in a non-staley way. The main difference is the change of table name (the original, strings, seemed risky in terms of having a special meaning in php)

@eileenmcnaughton eileenmcnaughton merged commit 5d1ca59 into civicrm:master Jun 3, 2021
@totten totten deleted the master-translation-table-only branch June 4, 2021 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants