string encodings do not seem to be detected correctly #102

Closed
bseeger opened this issue May 25, 2021 · 2 comments

Comments

@bseeger

bseeger commented May 25, 2021

Please follow the general troubleshooting steps first:

  • [x] I read the README and followed the instructions.
  • [x] I am sure that the used CSL metadata follows the CSL schema.
  • [x] I use a valid CSL stylesheet.

Bug reports:

Hello --
I'm not sure whether this is an issue in the library, but I'm using it in a module I'm creating for Drupal 8 and I'm seeing string conversion errors.

Namely, if I ask for an MLA-formatted bibliography of the CSL metadata below, I get

 "citation-MLA": "<div class=\"csl-bib-body\">\n  <div class=\"csl-entry\">DoeJ. <i>y nonymous eritage</i>. 2001.</div>\n</div>",

Notice that the title is clipped (the first letter of each word is missing, and the author renders as "DoeJ."), and I see iconv errors in the logs.

The best I can figure out is that mb_ucfirst in StringHelper.php isn't considering UTF-8 at all and comes back saying my string is ISO-8859-1 encoded.

Drupal log:

Notice: iconv(): Wrong charset, conversion from `ISO-8859-1' to `UTF-8//IGNORE' is not allowed in Symfony\Polyfill\Mbstring\Mbstring::mb_convert_case() (line 285 of /var/www/drupal/vendor/symfony/polyfill-mbstring/Mbstring.php)

#0 /var/www/drupal/web/core/includes/bootstrap.inc(600): _drupal_error_handler_real(8, 'iconv(): Wrong ...', '/var/www/drupal...', 285, Array)
#1 [internal function]: _drupal_error_handler(8, 'iconv(): Wrong ...', '/var/www/drupal...', 285, Array)
#2 /var/www/drupal/vendor/symfony/polyfill-mbstring/Mbstring.php(285): iconv('ISO-8859-1', 'UTF-8//IGNORE', 'H')
#3 /var/www/drupal/vendor/symfony/polyfill-mbstring/Mbstring.php(590): Symfony\Polyfill\Mbstring\Mbstring::mb_convert_case('H', 0, 'ISO-8859-1')
#4 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Util/StringHelper.php(144): Symfony\Polyfill\Mbstring\Mbstring::mb_strtoupper('H', 'ISO-8859-1')
#5 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Util/StringHelper.php(109): Seboettg\CiteProc\Util\StringHelper::mb_ucfirst('Heritage')
#6 [internal function]: Seboettg\CiteProc\Util\StringHelper::Seboettg\CiteProc\Util\{closure}('Heritage', 2)
#7 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Util/StringHelper.php(110): array_walk(Array, Object(Closure))
#8 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Styles/TextCaseTrait.php(68): Seboettg\CiteProc\Util\StringHelper::capitalizeForTitle('My Anonymous He...')
#9 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Text.php(198): Seboettg\CiteProc\Rendering\Text->applyTextCase('My Anonymous He...', 'en')
#10 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Text.php(103): Seboettg\CiteProc\Rendering\Text->renderVariable(Object(stdClass), 'en')
#11 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Choose/ChooseIf.php(88): Seboettg\CiteProc\Rendering\Text->render(Object(stdClass), NULL)
#12 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Choose/Choose.php(90): Seboettg\CiteProc\Rendering\Choose\ChooseIf->render(Object(stdClass))
#13 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Style/Macro.php(86): Seboettg\CiteProc\Rendering\Choose\Choose->render(Object(stdClass), NULL)
#14 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Text.php(254): Seboettg\CiteProc\Style\Macro->render(Object(stdClass))
#15 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Text.php(111): Seboettg\CiteProc\Rendering\Text->renderMacro(Object(stdClass))
#16 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Group.php(104): Seboettg\CiteProc\Rendering\Text->render(Object(stdClass), 0)
#17 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Layout.php(126): Seboettg\CiteProc\Rendering\Group->render(Object(stdClass), 0)
#18 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Rendering/Layout.php(91): Seboettg\CiteProc\Rendering\Layout->renderSingle(Object(stdClass), 0)
#19 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/Style/Bibliography.php(70): Seboettg\CiteProc\Rendering\Layout->render(Object(Seboettg\CiteProc\Data\DataList), NULL)
#20 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/CiteProc.php(137): Seboettg\CiteProc\Style\Bibliography->render(Object(Seboettg\CiteProc\Data\DataList))
#21 /var/www/drupal/vendor/seboettg/citeproc-php/src/Seboettg/CiteProc/CiteProc.php(183): Seboettg\CiteProc\CiteProc->bibliography(Object(Seboettg\CiteProc\Data\DataList))
#22 /var/www/drupal/web/modules/contrib/idc_export/src/Service/CitationsService.php(30): Seboettg\CiteProc\CiteProc->render(Object(Seboettg\CiteProc\Data\DataList), 'bibliography')
#23 /var/www/drupal/web/modules/contrib/idc_export/src/Plugin/views/field/CitationMLA.php(60): Drupal\idc_export\Service\CitationsService->renderFromMetadata(Array, 'modern-language...', 'bibliography')

The code that Drupal is actually running is here:
mb_string::mb_detect_encoding

That function won't consider UTF-8 unless it is passed in explicitly in the encoding list. If I add it to the list, things work well.
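To illustrate the workaround described above, here is a minimal sketch of a first-letter upper-casing helper that puts UTF-8 in the candidate list it hands to mb_detect_encoding. The function name `ucfirstDetected` and its default candidate list are hypothetical and chosen for illustration only; this is not the actual code of StringHelper::mb_ucfirst.

```php
<?php
// Hypothetical helper for illustration; not the library's implementation.
// It upper-cases the first character after detecting the encoding from an
// explicit candidate list that includes UTF-8.
function ucfirstDetected(string $text, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    if ($text === '') {
        return $text;
    }

    // Strict detection. UTF-8 comes first, so valid UTF-8 (including plain
    // ASCII) is reported as UTF-8 instead of falling through to ISO-8859-1.
    $encoding = mb_detect_encoding($text, $candidates, true);
    if ($encoding === false) {
        // No candidate matched: return the input unchanged rather than guess.
        return $text;
    }

    $first = mb_substr($text, 0, 1, $encoding);
    $rest  = mb_substr($text, 1, null, $encoding);

    return mb_strtoupper($first, $encoding) . $rest;
}

echo ucfirstDetected('heritage'), PHP_EOL;
```

With UTF-8 detected, the case conversion should no longer need an ISO-8859-1 to UTF-8 iconv call, which is the call that fails in the log above.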

But maybe it's something else in my setup?
Drupal 8.9.14
PHP 7.2.27

Used CSL stylesheet:

modern-language-association

Used CSL metadata

  [
          {
              "author": [
                  {
                      "family": "Doe",
                      "given": "James",
                      "suffix": "III"
                  }
              ],
              "id": "item-1",
              "issued": {
                  "date-parts": [
                      [
                          "2001"
                      ]
                  ]
              },
              "title": "My Anonymous Heritage",
              "type": "book"
          }
      ]
@bseeger

bseeger commented May 25, 2021

Another data point -- it has to do with the style sheet chosen, because if I switch to ieee, the problem goes away. But I need MLA style, so I'm a little stuck.

@jonasraoni

Hi @bseeger!

I'm having the same issue (I'm using a ready-made Docker container). It looks like it's related to the environment/installation. Try running the code below; if you get an error, then this issue can probably be closed, as it's something on your end:

echo iconv('ISO-8859-1', 'UTF-8//IGNORE', 'test');
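A slightly fuller diagnostic along the same lines might look like the sketch below. It reports whether the native mbstring extension is loaded (the stack trace above goes through Symfony\Polyfill\Mbstring, which only takes over when the extension is missing) and whether iconv accepts the conversion with and without the //IGNORE suffix. The helper name `probeIconv` is hypothetical. One commonly reported cause of this notice, not confirmed in this thread, is an iconv implementation that rejects the //IGNORE suffix.

```php
<?php
// Diagnostic sketch; probeIconv is a hypothetical helper name.
// Returns true when iconv() can perform the conversion, false otherwise.
function probeIconv(string $from, string $to): bool
{
    // iconv() returns false and raises a notice or warning on failure;
    // the @ operator silences that so only the result is reported.
    return @iconv($from, $to, 'test') !== false;
}

// If this prints "no", mb_* calls are served by a polyfill, as in the trace.
printf("mbstring extension loaded: %s\n", extension_loaded('mbstring') ? 'yes' : 'no');

// ICONV_IMPL and ICONV_VERSION are defined by PHP's iconv extension.
printf("iconv implementation: %s %s\n", ICONV_IMPL, ICONV_VERSION);

printf("ISO-8859-1 -> UTF-8: %s\n", probeIconv('ISO-8859-1', 'UTF-8') ? 'ok' : 'failed');
printf("ISO-8859-1 -> UTF-8//IGNORE: %s\n", probeIconv('ISO-8859-1', 'UTF-8//IGNORE') ? 'ok' : 'failed');
```

If only the //IGNORE line fails, the problem most likely sits in the environment's iconv rather than in the CSL data or the stylesheet.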
