Skip to content
This repository has been archived by the owner on Sep 27, 2019. It is now read-only.

Incorrect optical size in generated database #398

Closed
zhouyan opened this issue Feb 1, 2017 · 26 comments
Closed

Incorrect optical size in generated database #398

zhouyan opened this issue Feb 1, 2017 · 26 comments
Labels

Comments

@zhouyan
Copy link

zhouyan commented Feb 1, 2017

I recently noticed that the generated database, by default, record the optical size in sp. However, it seems that the sizes are calculated incorrectly. (Or maybe I misunderstood the behavior of luaotfload)

(I don't have any configuration file, so everything is in its default setting)

Consider the entry for Latin Modern Roman, in particular the line,

{ 652911.58156912, 718202.73972603, 620266.00249066, 843 },

within the r table (regular). This line corresponding to LMRoman10-Regular,
The reported optical size through otfinfo is

design size 10 pt, size range (9.5 pt, 11 pt], subfamily ID 1, subfamily name Regular

If I understand correctly, these values are PostScript points, that is, the sp value shall be,

11pt => 11 / 72 * 72.27 * 65536 => 723599.36

While in the table, the corresponding value 718202.73972603 is actually,

11 / 72.27 * 72 * 65536

In particular, this cause problems when selecting font when the size is specified according to bp instead of pt. For example,

\documentclass{article}

\def\FontName{
  \directlua{
    local i = font.current()
    local f = font.getfont(i)
    local n = f.fullname
    tex.sprint(n)
  }
}

\usepackage{fontspec}
\setmainfont{Latin Modern Roman}

\begin{document}

\fontsize{10.99995}{11}\selectfont % slightly smaller than 11pt
\FontName
\fontsize{11.04124}{11}\selectfont % slightly smaller than 11bp
\FontName

\end{document}

The output is
screen shot 2017-02-01 at 23 52 49

(I recall that luaotfload used to select font by regarding the optical size as an Closed-Open interval, while OpenType use an Open-Closed interval, and thus I used font size very slightly smaller than the upper bound of the intended use)

In the above example, for 11bp, the 10 point fonts shall be used, but the 12 point one is used instead.

After patching the database by using the values 723599.36 etc, the expected output is obtained

screen shot 2017-02-01 at 23 53 36

@eroux
Copy link
Member

eroux commented Feb 1, 2017

I think this is a duplicate of #389

@zhouyan
Copy link
Author

zhouyan commented Feb 1, 2017

I think it is an issue introduced by that and related patch.

The designsize as returned by the LuaTeX font loader is merely Size in point x 65536, regardless wether this Size in point is PS or US point.

For example, LMRoman10-Regular has a design size 655360 and Arno Pro Regular (12 point) has 786432. So if this Size in point means PS point, as in most OpenType fonts, instead of US point, then the correct sp value shall be v / 72 * 72.27 instead of v / 72.27 * 72, which does not make any sense to me. The same goes for the dd option, (Line 1050--1054 in luaotfload-database.lua, in the GitHub version)

In the particular case of LMRoman, I believe the pt option is probably more appropriate. Knuth most likely designed the original CM with pt instead of bp in mind. But the default bp option, which is the default, is more appropriate for most other fonts.

@zhouyan
Copy link
Author

zhouyan commented Feb 1, 2017

If I unsderstand correctly, The current formula in the source is for converting value in pt to value in bp and dd, while what was really needed was the other way around.

@eroux
Copy link
Member

eroux commented Feb 1, 2017

Oh ok, sorry... maybe you could build a pull request?

@phi-gamma
Copy link
Member

phi-gamma commented Feb 1, 2017 via email

@phi-gamma
Copy link
Member

phi-gamma commented Feb 1, 2017 via email

@zhouyan
Copy link
Author

zhouyan commented Feb 1, 2017

I think the Open-Close interval issue was fixed a long time ago. I was just been cautious.

I think treating the design size either as US point or PS point is OK. It will only make a difference in marginal case. The difference at 20pt is less than 0.1 point. however, if we need to make a difference, and converted it the wrong way, basically we squared the ratio and the difference become slighted bigger. However, that difference is still very small (would be a lot worse for didot point)

It is only an issue when the size in question happened to be on the boundary, in which case any margin or error will cause a change of font. However, even in that case I still don't think it is a big problem. For example, say a design size is 18 point and upper bound is 24 point, and the next range is design size 30 point (lower bound 24 point). It is purely academia to argue around 24 point which shall be better.

The only real effect is for fonts such as LM, Arno, which has a few very narrow intervals (top -bottom < 2pt). In this case, it does make a difference when switch between a 10 point font and a 12 point font.

Personally, at text size, I usually use font at its design size, which avoids all these problems. For example use Arno at either 10point or 12 point. And if I want 11point, I switch to Minion.

@phi-gamma phi-gamma added the bug label Feb 2, 2017
@zhouyan
Copy link
Author

zhouyan commented Feb 2, 2017

I am afraid my last pull request did not really fix the problem. I think I misunderstood the source.

Before I make an even bigger mess (very sorry about that), please correct me if I understood the following correctly,

  • The design_size_dimension function (on Line 1045) is only used when loading a font.
  • The get_size_info function (on Line 1356) is what used for generating the database.
  • The configuration for choose among pt, dd, bp only affects fonts loading instead of database generating.

If this understanding is correct, then original design_size_dimension was correct. That is, we do want to convert from TeX point to DTP point and then multiple the value by 2^16. That is the value shall be shrieked by a ratio 7200 / 7227. My fix in the pull request was incorrect.

Instead, the real problem is that in get_size_info, there is also a conversion from pt to bp. Which is redundant. That is, we end up with a double conversion. After removing the following

            design_size         = (design_size         * 7227) / 7200
            design_range_top    = (design_range_top    * 7227) / 7200
            design_range_bottom = (design_range_bottom * 7227) / 7200

inside get_size_info, and restore design_size_dimension to its state before the pull request. Everything seems to work correct.

The following is an example with more rigorous test of the marginal case,

\documentclass{article}
\usepackage{luaotfload}

\font\lme = {name:Latin Modern Roman} at 723599.36sp % exact 11bp
\font\lmh = {name:Latin Modern Roman} at 723599.86sp % exact 0.5sp bigger
\font\lmm = {name:Latin Modern Roman} at 723599sp    % less than 1sp smaller
\font\lmp = {name:Latin Modern Roman} at 723600sp    % less than 1sp larger

\def\FontName{\directlua{tex.sprint(font.getfont(font.current()).fullname)}}

\begin{document}

\noindent\lme\FontName\par
\noindent\lmh\FontName\par
\noindent\lmm\FontName\par
\noindent\lmp\FontName\par

\end{document}

screen shot 2017-02-02 at 15 22 54

And I get the expected output, that is the first three choose LM10 and the last choose LM12. (Note that, the second one also choose LM10, while the asked size is bigger than 11bp is that, sp is the smallest TeX unit, and it is truncated [instead of rounded] to integer.)

Another test with Arno pro, which has a boundary at 10.9, which in bp is an exact integer in sp.

\documentclass{article}
\usepackage{luaotfload}

% bounds (10, 10.9]
\font\lme = {name:Arno Pro} at 717021sp % exact 10.9bp
\font\lmm = {name:Arno Pro} at 717020sp % exact 1sp smaller
\font\lmp = {name:Arno Pro} at 717022sp % exact 1sp bigger

\def\FontName{\directlua{tex.sprint(font.getfont(font.current()).fullname)}}

\begin{document}

\noindent\lme\FontName\par
\noindent\lmm\FontName\par
\noindent\lmp\FontName\par

\end{document}

Also give correct output.
screen shot 2017-02-02 at 15 32 46

If this time my understanding is correct. I will create a new pull request to fix my mess (sorry again)

@eroux
Copy link
Member

eroux commented Feb 2, 2017

Not directly related: did you test the Open-Closed vs Closed-Open interval?

@zhouyan
Copy link
Author

zhouyan commented Feb 2, 2017 via email

@eroux
Copy link
Member

eroux commented Feb 2, 2017

Well, as long as the conversion is the same in both cases:

  • when luaotfload reading the font, x bp is y sp
  • when the user requires font at x bp, luaotfload looks for th variant as y sp (and not y-1 or y+1)

then everything's fine! Thank you for inspecting

@zhouyan
Copy link
Author

zhouyan commented Feb 2, 2017

Regarding the Open-Closed interval, I artificially patched the generated database such that Arno-Subhead shall be (14, 100] and Arno-Display be (100, 1000]. 100bp is exactly 6578176sp, and
loading at that size the Subhead version is correctly chosen. So there's no issue at all there.

I will do some more testing and create new pull request after some more testing.

@zhouyan
Copy link
Author

zhouyan commented Feb 2, 2017

I created three separated PR. Each are related but I feel yet independent enough to allow merging only some of them

Below are test results when #400 and #401 are applied, For all three configurations

test-bp.pdf
test-dd.pdf
test-pt.pdf

The source that created these results are,

#!/usr/bin/env perl

for (qw(pt bp dd)) {
    open my $conf, ">", "luaotfload.conf";
    print $conf "[db]\n";
    print $conf "    designsize-dimen = $_\n";

    open my $tex, ">", "test-$_.tex";
    print $tex "\\input{test}\n";
    print $tex "\\begin{document}\n";
    print $tex "\\testsize{$_}\n";
    print $tex "\\end{document}\n";
    system "lualatex test-$_.tex";
}
\documentclass{article}

\usepackage{luaotfload}
\usepackage{graphics}
\usepackage[scale=0.85,a5paper,landscape]{geometry}

\pagestyle{empty}
\parindent=0pt

\def\testsize#1{%
  \leavevmode%
  \hbox to 1.2in{Asked size\hfil\strut}%
  \hbox to 1.2in{Actual size\hfil\strut}%
  \hbox to 1.2in{Check\hfil\strut}%
  \hbox to 1.2in{Optical range\hfil\strut}%
  Font name
  \vskip\baselineskip
  \testpoint{#1}{5}
  \testpoint{#1}{5.5}
  \testpoint{#1}{6}
  \testpoint{#1}{6.5}
  \testpoint{#1}{7}
  \testpoint{#1}{7.5}
  \testpoint{#1}{8}
  \testpoint{#1}{8.5}
  \testpoint{#1}{9}
  \testpoint{#1}{9.5}
  \testpoint{#1}{10}
  \testpoint{#1}{11}
  \testpoint{#1}{12}
  \testpoint{#1}{14}
  \testpoint{#1}{17.2}
  \testpoint{#1}{24}
}

\def\testpoint#1#2{%
  \begingroup%

  % Load the font at size #2 in unit #1
  \font\testfont = {name:Latin Modern Roman} at #2#1%
  \testfont%

  % Get the actual size in pt
  \edef\sizept{%
    \directlua{
      local s = font.getfont(font.current()).size / 65536
      s = math.round(s * 1000) / 1000
      tex.sprint(s)
    }%
  }

  % Get the actual size in bp
  \edef\sizebp{%
    \directlua{
      local s = font.getfont(font.current()).size / 65536 * 7200 / 7227
      s = math.round(s * 1000) / 1000
      tex.sprint(s)
    }%
  }

  % Get the actual size in bp
  \edef\sizedd{%
    \directlua{
      local s = font.getfont(font.current()).size / 65536 * 1157 / 1238
      s = math.round(s * 1000) / 1000
      tex.sprint(s)
    }%
  }

  % Get the lower bound of the optical range
  \edef\minsize{%
    \directlua{
      local s = font.getfont(font.current()).shared.rawdata.metadata.minsize
      tex.sprint(s / 10)
    }%
  }

  % Get the upper bound of the optical range
  \edef\maxsize{%
    \directlua{
      local s = font.getfont(font.current()).shared.rawdata.metadata.maxsize
      tex.sprint(s / 10)
    }%
  }

  \edef\inrangept{%
    \directlua{
      local s = font.getfont(font.current()).size
      local l = font.getfont(font.current()).shared.rawdata.metadata.minsize
      local u = font.getfont(font.current()).shared.rawdata.metadata.maxsize
      l = l * 65536 / 10
      u = u * 65536 / 10
      if s > l and s <= u then
        tex.sprint("Pass")
      else
        tex.sprint("Fail")
      end
    }%
  }

  \edef\inrangebp{%
    \directlua{
      local s = font.getfont(font.current()).size * 7200 / 7227
      local l = font.getfont(font.current()).shared.rawdata.metadata.minsize
      local u = font.getfont(font.current()).shared.rawdata.metadata.maxsize
      l = l * 65536 / 10
      u = u * 65536 / 10
      if s > l and s <= u then
        tex.sprint("Pass")
      else
        tex.sprint("Fail")
      end
    }%
  }

  \edef\inrangedd{%
    \directlua{
      local s = font.getfont(font.current()).size * 1157 / 1238
      local l = font.getfont(font.current()).shared.rawdata.metadata.minsize
      local u = font.getfont(font.current()).shared.rawdata.metadata.maxsize
      l = l * 65536 / 10
      u = u * 65536 / 10
      if s > l and s <= u then
        tex.sprint("Pass")
      else
        tex.sprint("Fail")
      end
    }%
  }

  % Show the asked size, actual size, the optical range, and the font name
  \leavevmode%
  \begingroup%
    \normalfont\ttfamily
    \hbox to 1.2in{#2{}#1\hfil\strut}%
    \hbox to 1.2in{\csname size#1\endcsname{}#1\hfil\strut}%
    \hbox to 1.2in{\csname inrange#1\endcsname\hfil\strut}%
    \hbox to 1.2in{(\minsize, \maxsize]\hfil\strut}%
  \endgroup%
  \resizebox{!}{7pt}{%
    \directlua{tex.sprint(font.getfont(font.current()).fullname)}%
  }\par%

  \endgroup%
}
```

phi-gamma added a commit to phi-gamma/luaotfload that referenced this issue Feb 2, 2017
As discussed in issue lualatex#398.

Ad futuram rei memoriam the gist of it:

- For the index, all values are scaled (decipoints * sp) / 10 *
  (7227 / 7200).

- The ``bp`` case (the default, OT-standard), needs no conversion
  because it matches how values are stored in the index.

- The ``pt`` case essentially reverts the bp→pt part of scaling
  done for the database by scaling the asked size by the same
  factor, i. e. by 7227 / 7200.

- The ``dd`` needs an extra 1238 / 1157.

Requesting a font at 10pt will then:

- ask for a size of 655360 for ``bp`` / default;
- ask for 657817 for ``pt``;
- ask for 703870 for ``dd``.
@phi-gamma
Copy link
Member

phi-gamma commented Feb 2, 2017 via email

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017

May I ask you to
review this fix?
phi-gamma/luaotfload@529578f

I think it is correct for bp and pt, but incorrect for dd. I found that one's head can easily get confused when thinking of these ratios only in head. So I wrote down the logic below, please review, (also see #400)

test.pdf

@phi-gamma
Copy link
Member

phi-gamma commented Feb 3, 2017 via email

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017

No problem at all.

I am working on a slightly improved test that take into account of the rounding error. The default behavior, bp, though the most sensible one, will introduce a rounding errors either in the database, or the asked size when loading font

I think yours is better then #400 in the sense that, #400 will introduce rounding for both, even though both will be within 1ULP. In your fix, there's no rounding error whatsoever for asked size, while the values in database will have a potentially larger error due to that they are written in textual form. However, the lack of error for the asked size, I believe shall make things more predictable and easier to reason with.

I will update a new test once your fix and #401 are merged.

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017

I worked out a new test in plain TeX (depends on nothing but luaotfload itself) and the perl script (attached).

testsize.zip

It includes test settings for most optical fonts out there, excluding those with optical sizing in design but not in OpenType features, such as some hoefler & co fonts.

Free fonts: Latin Modern family, EB Garamond (need a separate download, the one included in TeXLive only have 12 points design)

Commercial fonts: Adobe optical fonts and Minion Math

Full list

Adobe Jenson Pro
Arno Pro
Brioso Pro
Chaparral Pro
Cronos Pro
EB Garamond
Garamond Premier Pro
Kepler Std
Latin Modern Mono
Latin Modern Roman
Latin Modern Sans
Minion Math
Minion Pro
Sanvito Pro
Utopia Std
Warnock Pro

Note that without manual patching of the database (see #363, an issue that I don't believe can ever be fixed entirely given the mess of font foundries' naming schemes and ID fields, etc. And frankly, I don't think worth anyone's energy to try. I merely patch the database manually after luaotfload-tool -u), many tests will fail.

Without any intervention, the following faces will fail the tests,

Cronos Pro
EB Garamond
Latin Modern Mono
Minion Math
Sanvito Pro

Sanvito and EB Garamond fail mainly because bugs in font. There are gaps between ranges in their reported values.

Others fail the tests because of #363, which is unrelated (those pass the tests may select incorrect width and weights, also #363, again unrelated to issues here, and the founders are to blame).

In the new test, no conversion of dimension is done in Lua side. Instead, the returned design size, actual loaded size, etc., are taken literally and comparisons are done in TeX instead in Lua. The basic logic is that, if one ask for a size in TeX, and get a range selected correctly in the eye of TeX, then it should be considered to be OK, even if in real there might be some rounding errors within Lua that can make a difference in marginal case.

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017

I just tried phi-gamma/luaotfload@7308f84b with the new test file, and everything works out well.

@phi-gamma
Copy link
Member

phi-gamma commented Feb 3, 2017 via email

@phi-gamma
Copy link
Member

phi-gamma commented Feb 3, 2017 via email

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017 via email

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017 via email

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017

I wasn't quite able to express what I aimed in the test before. Basically, I think if user ask for at 10bp, then the font with, say (8, 10] shall be chosen instead of (10, 20]. It shouldn't matter if in TeX 10bp is really 10.00000000001bp or 9.9999999999bp.

@zhouyan
Copy link
Author

zhouyan commented Feb 3, 2017

Just a simple example of how I patch the database and an idea I am thinking about for properly test all edge cases of interest.

kpse.set_program_name 'luatex'

require 'lualibs'

names = load(gzip.load('luaotfload-names.lua.gz'), 't')()

families_sys = names.families.system.otf
files_sys = names.files.bare.system.otf

factor = 2^16 * 7227 / 7200

families_sys['ebgaramond'] = {
  r = {
    {
      8 * factor, 9.5 * factor, 0 * factor,
      files_sys['EBGaramond08-Regular']
    },
    {
      12 * factor, 9.5 * factor, 120 * factor,
      files_sys['EBGaramond12-Regular']
    },
  },
}


data = table.serialize(names, true)
gzip.save('luaotfload-names.lua.gz', data)
local f = io.open('luaotfload-names.luc', 'wb')
local s = load(data)
f:write(string.dump(s, true))
f:close()

By altering the database manually, we can artificially create situations of ranges that we want to test, without looking for a specific font that happened to have the edge case of interest.

It would be better if there's a way to dynamically place a hook such that it can be altered dynamically after the database is loaded, instead of having to go through the generate-patch-test cycle.

@u-fischer
Copy link

I'm closing the issue. @zhouyan if you think that there are open questions here, please open a new issue at https://github.com/u-fischer/luaotfload/issues.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants