Default print() output #32

mjskay · 2019-10-31T22:02:30Z

As I've been writing rename_variables() I've found it's a little awkward to work with draws objects when the default print output at the console is typically gigantic. This also makes examples a little verbose, as it feels necessary to call summarise_draws() constantly.

Two thoughts:

1. Any objections to making the default print() for draws objects call summarise_draws()?
2. If we agree to do (1), we will now have three ways of getting the same info (print, summary, and summarise_draws). That possibly feels a bit overkill? I can see how they are typically used in different ways, so having them all as aliases is probably fine, but it is worth considering.

paul-buerkner · 2019-11-01T12:13:01Z

I agree we should have a better print() method than the current one. For now, just using print() as another alias of summarise_draws() is fine, but we may want to consider doing something similar to what tibbles do, that is, truncating a lot of output to make the structure visible without printing an overwhelming amount of information. The problem with summarise_draws() is that it is computationally non-trivial for large posteriors, so we may not want to do all those computations for a simple print call (or else we could store the summary in some internal environment, as rstan does, to avoid recomputation).
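The caching idea mentioned above could be sketched roughly as follows. This is only an illustration in base R: `.summary_cache`, `slow_summary()`, and `cached_summary()` are hypothetical names, the summary is a stand-in (column means and standard deviations of a plain matrix), and none of it is taken from rstan or posterior.

```r
# Hypothetical sketch, not rstan's or posterior's actual code.
# An environment that lives for the whole session and holds stored summaries.
.summary_cache <- new.env(parent = emptyenv())

# Stand-in for an expensive summary: per-column mean and sd of a draws matrix.
slow_summary <- function(draws) {
  data.frame(
    variable = colnames(draws),
    mean = unname(colMeans(draws)),
    sd = unname(apply(draws, 2, sd))
  )
}

# Return the summary stored under `key` if there is one; otherwise compute it,
# store it, and return it. The caller is responsible for choosing a key that
# changes whenever the draws change.
cached_summary <- function(draws, key) {
  if (!exists(key, envir = .summary_cache, inherits = FALSE)) {
    assign(key, slow_summary(draws), envir = .summary_cache)
  }
  get(key, envir = .summary_cache, inherits = FALSE)
}

set.seed(1)
draws <- matrix(rnorm(400 * 2), ncol = 2, dimnames = list(NULL, c("mu", "tau")))
s1 <- cached_summary(draws, "fit1")  # computed and stored
s2 <- cached_summary(draws, "fit1")  # served from the cache
```

The trade-off is the usual one for caches: a repeated print becomes cheap, but something has to invalidate the stored summary when the draws are subset or modified.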

jgabry · 2019-11-01T19:04:53Z

Hmm, what if print() just provides a useful summary of the structure? (e.g. like str() but doesn’t have to look like str() output)

paul-buerkner · 2019-11-01T19:37:00Z

Yes this is something I would like as well.

mjskay · 2019-11-01T19:37:14Z

Good idea. What about something like this:

 'draws_array' num [1:100, 1:4, 1:10]
 - 4 chains × 100 iterations
 - 10 variables (median±mad):
      mu      tau theta[1] theta[2] theta[3] theta[4] theta[5] theta[6] theta[7] theta[8] 
 4.5±3.5  2.9±2.6  5.5±4.9  4.5±4.1  4.5±4.6  5.0±4.7  3.8±4.3  4.4±4.6  6.2±4.5  4.5±4.6

The first line would give the basic structure of the specific format, and the remaining lines would have the same structure for all formats. We could either keep listing all variables at the end or truncate the list to the first k (10?) with an option not to truncate. That might also save us from needing to pre-compute summaries and keep them around.

I have some formatting code I have been playing with while experimenting with rv-like interfaces that can output the last line, so I'd be happy to write this if we want it.
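For illustration, the last line of the mock-up above (one "median±mad" entry per variable) could be produced along these lines. This is a minimal base R sketch, not the formatting code mentioned in the comment: `format_median_mad()` is a hypothetical helper, and the draws are assumed to be a plain numeric matrix with one column per variable.

```r
# Hypothetical helper, not part of posterior.
# `draws` is assumed to be a numeric matrix: rows are draws, columns are variables.
format_median_mad <- function(draws, digits = 2) {
  med <- apply(draws, 2, median)
  spread <- apply(draws, 2, mad)
  # Round both statistics to `digits` significant digits and glue them together
  # with a plus-minus sign ("\u00b1").
  out <- paste0(format(signif(med, digits)), "\u00b1", format(signif(spread, digits)))
  names(out) <- colnames(draws)
  out
}

set.seed(1)
draws <- matrix(rnorm(400 * 3, mean = 4, sd = 3), ncol = 3,
                dimnames = list(NULL, c("mu", "tau", "theta[1]")))
# Print as a named vector without quotes, similar to the mock-up.
print(format_median_mad(draws), quote = FALSE)
```

Because only a median and a MAD per variable are computed, this is far cheaper than a full summarise_draws() call, which is the point made above about not needing to pre-compute and store summaries.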

paul-buerkner · 2019-11-02T09:29:38Z

I would very much like such a light-weight print method!

mjskay · 2019-11-02T22:17:07Z

Hmm, now that I've tried implementing something like this, I've realized it becomes a bit annoying in other ways: because it masks the default print output of the underlying format, it is harder to get a sense of what the draws look like in the specific format you are using.

So perhaps we should leave print alone?

paul-buerkner · 2019-11-02T23:02:35Z

This is fine with me as well. I like how tibble prints out stuff that is truncated a lot but still gives us a lot of helpful detail. Not sure how reasonable and doable such a print method would be for the other formats though.

jgabry · 2019-11-07T19:46:18Z

> Not sure how reasonable and doable such a print method would be for the other formats though.

It’s definitely trickier but I think worth thinking about at some point. Default print for arrays isn’t super helpful in my experience, so I’d love to have something nicer if possible!

mjskay · 2019-11-07T20:53:33Z

Seems reasonable. The approach I was taking (with mean+/-sd) is fine for the rvar stuff but does weird things when people start subsetting the other formats (which was what made me realize it won't really work here). But something that shows structure and abbreviates only as necessary, more like tibbles, makes sense.

jgabry · 2019-11-07T20:56:30Z

I meant to also say that it's fine by me if you want to leave print() alone for now. In that case we can leave this issue open to make sure we come back to it at some point.

paul-buerkner · 2020-03-24T09:25:56Z

I have now added some print methods that mostly truncate the output and give additional meta-information but make sure the underlying format is still visible. I would be happy to hear your thoughts.
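The general pattern described here (a header with meta-information, the underlying format printed as usual but truncated, and a footer saying what was left out) might look roughly like this for a plain matrix. This is a sketch only: `print_truncated()` and its arguments are hypothetical and are not the print methods added to posterior.

```r
# Hypothetical sketch of a truncating print function for a plain matrix.
print_truncated <- function(x, max_rows = 10, max_cols = 8) {
  rows <- seq_len(min(nrow(x), max_rows))
  cols <- seq_len(min(ncol(x), max_cols))
  # Header with meta-information about the full object.
  cat("# A matrix: ", nrow(x), " rows, and ", ncol(x), " columns\n", sep = "")
  # Body: the usual matrix print, so the underlying format stays visible.
  print(x[rows, cols, drop = FALSE])
  # Footer: only shown if something was cut off.
  more_rows <- nrow(x) - length(rows)
  more_cols <- ncol(x) - length(cols)
  if (more_rows > 0 || more_cols > 0) {
    cat("# ... with ", more_rows, " more rows, and ", more_cols,
        " more columns\n", sep = "")
  }
  invisible(x)
}

set.seed(1)
m <- matrix(round(rnorm(400 * 10), 2), nrow = 400,
            dimnames = list(NULL, paste0("v", 1:10)))
print_truncated(m)
```

Returning the object invisibly follows the usual convention for print methods in R, so the call can still be used inside a pipeline or assignment.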

MansMeg · 2020-04-02T11:57:09Z

@paul-buerkner asked me to check the print output, so I'll print it here for ease of discussion:

library(posterior)
x <- example_draws()
print(as_draws_matrix(x))
print(as_draws_array(x))
print(as_draws_df(x))
print(as_draws_list(x))

This results in:

> print(as_draws_matrix(x))
# A draws_matrix: 400 draws, and 10 variables
    variable
draw   mu tau theta[1] theta[2] theta[3] theta[4] theta[5] theta[6]
  1  2.01 2.8     3.96    0.271    -0.74      2.1    0.923      1.7
  2  1.46 7.0     0.12   -0.069     0.95      7.3   -0.062     11.3
  3  5.81 9.7    21.25   14.931     1.83      1.4    0.531      7.2
  4  6.85 4.8    14.70    8.586     2.67      4.4    4.758      8.1
  5  1.81 2.8     5.96    1.156     3.11      2.0    0.769      4.7
  6  3.84 4.1     5.76    9.909    -1.00      5.3    5.889     -1.7
  7  5.47 4.0     4.03    4.151    10.15      6.6    3.741     -2.2
  8  1.20 1.5    -0.28    1.846     0.47      4.3    1.467      3.3
  9  0.15 3.9     1.81    0.661     0.86      4.5   -1.025      1.1
  10 7.17 1.8     6.08    8.102     7.68      5.6    7.106      8.5
# ... with 390 more draws, and 2 more variables
> print(as_draws_array(x))
# A draws_array: 100 iterations, 4 chains, and 10 variables
, , variable = mu

         chain
iteration   1    2     3   4
        1 2.0  3.0  1.79 6.5
        2 1.5  8.2  5.99 9.1
        3 5.8 -1.2  2.56 0.2
        4 6.8 10.9  2.79 3.7
        5 1.8  9.8 -0.03 5.5

, , variable = tau

         chain
iteration   1    2    3   4
        1 2.8 2.80  8.7 3.8
        2 7.0 2.76  2.9 6.8
        3 9.7 0.57  8.4 5.3
        4 4.8 2.45  4.4 1.6
        5 2.8 2.80 11.0 3.0

, , variable = theta[1]

         chain
iteration     1     2    3     4
        1  3.96  6.26 13.3  5.78
        2  0.12  9.32  6.3  2.09
        3 21.25 -0.97 10.6 15.72
        4 14.70 12.45  5.4  2.69
        5  5.96  9.75  8.2 -0.91

, , variable = theta[2]

         chain
iteration      1    2   3   4
        1  0.271  1.0 2.1 5.0
        2 -0.069  9.4 7.3 8.2
        3 14.931 -1.2 5.7 6.0
        4  8.586 12.5 2.8 2.7
        5  1.156 11.9 3.2 3.2

, , variable = theta[3]

         chain
iteration     1     2     3   4
        1 -0.74  0.22   1.4 5.7
        2  0.95  9.68   4.1 3.5
        3  1.83 -1.37  -8.3 3.1
        4  2.67 11.15 -10.8 3.2
        5  3.11 12.72 -27.8 2.6

# ... with 95 more iterations, and 5 more variables
> print(as_draws_df(x))
# A draws_df: 100 iterations, 4 chains, and 10 variables
     mu tau theta[1] theta[2] theta[3] theta[4] theta[5] theta[6]
1  2.01 2.8     3.96    0.271    -0.74      2.1    0.923      1.7
2  1.46 7.0     0.12   -0.069     0.95      7.3   -0.062     11.3
3  5.81 9.7    21.25   14.931     1.83      1.4    0.531      7.2
4  6.85 4.8    14.70    8.586     2.67      4.4    4.758      8.1
5  1.81 2.8     5.96    1.156     3.11      2.0    0.769      4.7
6  3.84 4.1     5.76    9.909    -1.00      5.3    5.889     -1.7
7  5.47 4.0     4.03    4.151    10.15      6.6    3.741     -2.2
8  1.20 1.5    -0.28    1.846     0.47      4.3    1.467      3.3
9  0.15 3.9     1.81    0.661     0.86      4.5   -1.025      1.1
10 7.17 1.8     6.08    8.102     7.68      5.6    7.106      8.5
# ... with 390 more draws, and 2 more variables
> print(as_draws_list(x))
# A draws_list: 100 iterations, 4 chains, and 10 variables

[chain = 1]
$mu
 [1] 2.01 1.46 5.81 6.85 1.81 3.84 5.47 1.20 0.15 7.17

$tau
 [1] 2.8 7.0 9.7 4.8 2.8 4.1 4.0 1.5 3.9 1.8

$`theta[1]`
 [1]  3.96  0.12 21.25 14.70  5.96  5.76  4.03 -0.28  1.81  6.08

$`theta[2]`
 [1]  0.271 -0.069 14.931  8.586  1.156  9.909  4.151  1.846  0.661
[10]  8.102

$`theta[3]`
 [1] -0.74  0.95  1.83  2.67  3.11 -1.00 10.15  0.47  0.86  7.68


[chain = 2]
$mu
 [1]   2.99   8.17  -1.15  10.93   9.82 -10.90  -9.26   1.79   5.35
[10]   0.87

$tau
 [1] 2.80 2.76 0.57 2.45 2.80 6.08 9.33 6.81 2.82 6.69

$`theta[1]`
 [1]  6.26  9.32 -0.97 12.45  9.75  2.56 11.92  9.89  4.31  9.26

$`theta[2]`
 [1]  1.0  9.4 -1.2 12.5 11.9 -8.8 -6.1 11.6  2.8  8.4

$`theta[3]`
 [1]   0.22   9.68  -1.37  11.15  12.72 -20.73 -12.17   1.77   5.98
[10]  -3.31


[chain = 3]
$mu
 [1]  1.79  5.99  2.56  2.79 -0.03  1.06  3.67  3.51  8.85  8.85

$tau
 [1]  8.72  2.91  8.41  4.39 11.03  2.70  1.68  0.52  5.96  5.96

$`theta[1]`
 [1] 13.3  6.3 10.6  5.4  8.2  5.0  5.2  3.7 13.1 13.1

$`theta[2]`
 [1] 2.1 7.3 5.7 2.8 3.2 4.3 4.1 4.1 4.7 4.7

$`theta[3]`
 [1]   1.38   4.11  -8.27 -10.77 -27.78  -3.94   0.36   3.84   2.75
[10]   2.75


[chain = 4]
$mu
 [1]  6.46  9.15  0.20  3.69  5.48  2.38 11.82  4.90  0.88  3.81

$tau
 [1]  3.8  6.8  5.3  1.6  3.0  2.3  4.3  3.1 15.8  2.7

$`theta[1]`
 [1]  5.78  2.09 15.72  2.69 -0.91  0.59 18.87  1.50  9.07  7.52

$`theta[2]`
 [1]  5.0  8.2  6.0  2.7  3.2  1.1 13.0  6.1 11.6  4.3

$`theta[3]`
 [1]  5.69  3.47  3.13  3.16  2.55 -0.12 14.96  3.31  4.29  4.69

# ... with 90 more iterations, and 5 more variables
>

I think these print methods are really nice. I only have two - very minor - suggestions.

Suggestions

1. I think that we would like to limit the number of chains by default as well. It doesn't look like that is currently done, but in posteriordb I work with 10 chains, and then print() would give too much. I think 1 or 3 chains would be enough to show using print().
2. If I change my console width, the output wraps onto a second block of rows. I think it would be nice to have a solution where the columns adapt to the console size. But that is a nice-to-have. See below:

# A draws_matrix: 400 draws, and 10 variables
    variable
draw   mu tau theta[1] theta[2] theta[3]
  1  2.01 2.8     3.96    0.271    -0.74
  2  1.46 7.0     0.12   -0.069     0.95
  3  5.81 9.7    21.25   14.931     1.83
  4  6.85 4.8    14.70    8.586     2.67
  5  1.81 2.8     5.96    1.156     3.11
  6  3.84 4.1     5.76    9.909    -1.00
  7  5.47 4.0     4.03    4.151    10.15
  8  1.20 1.5    -0.28    1.846     0.47
  9  0.15 3.9     1.81    0.661     0.86
  10 7.17 1.8     6.08    8.102     7.68
    variable
draw theta[4] theta[5] theta[6]
  1       2.1    0.923      1.7
  2       7.3   -0.062     11.3
  3       1.4    0.531      7.2
  4       4.4    4.758      8.1
  5       2.0    0.769      4.7
  6       5.3    5.889     -1.7
  7       6.6    3.741     -2.2
  8       4.3    1.467      3.3
  9       4.5   -1.025      1.1
  10      5.6    7.106      8.5
# ... with 390 more draws, and 2 more variables

paul-buerkner · 2020-04-02T12:58:15Z

Thank you for your comments!

> I think that we would like to limit the number of chains by default as well. It doesn't look like that is currently done, but in posteriordb I work with 10 chains, and then print() would give too much. I think 1 or 3 chains would be enough to show using print().

They are limited by default in a format-dependent manner, and the limit can be set globally via options(max_chains = <x>). The format-dependent defaults are shown in ?print.draws_<format>.

For example, draws_list shows 4 chains by default, but I honestly think this may be too much. Perhaps just show 2 by default for this format?

> If I change my console width, the output wraps onto a second block of rows. I think it would be nice to have a solution where the columns adapt to the console size. But that is a nice-to-have.

I agree but have two concerns.

First, this may not be trivial to implement. I know tibble has some features in that regard, but as far as I can see, the related code is not quite trivial. Printing is one of the primary concerns for tibble, so I understand why they put so much effort into it. I am not sure I can put that effort into the print methods of posterior though. But perhaps this adaptive printing is easier than I think, so if anybody has experience with that, I would love to hear their thoughts.

Second, I currently control the number of variables, iterations, chains, etc. shown via format-specific defaults and the option to set defaults globally via options(). However, to my understanding, this interferes with console-width-dependent printing. I think we can have printing that does one or the other, not both, at least not for the dimension that spans the console width.
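To make the width-dependent alternative concrete, here is one rough way to estimate how many leading columns of a matrix fit on a single console line. It is a base R sketch under simplifying assumptions (one space between columns, cells formatted with format()), `n_cols_that_fit()` is a hypothetical helper, and it is not how tibble or posterior decide what to show.

```r
# Hypothetical helper: estimate how many leading columns of a matrix fit into
# `width` characters when printed in the usual matrix layout.
n_cols_that_fit <- function(x, width = getOption("width"), digits = 3) {
  # Width of each printed column: the widest of its header and its formatted
  # cells, plus one space as a separator.
  col_width <- vapply(seq_len(ncol(x)), function(j) {
    cells <- format(x[, j], digits = digits)
    max(nchar(c(colnames(x)[j], cells))) + 1L
  }, integer(1))
  # Width of the row-label column ("[1,]" style labels if there are no row names).
  labels <- if (is.null(rownames(x))) {
    paste0("[", seq_len(nrow(x)), ",]")
  } else {
    rownames(x)
  }
  # Cumulative width after each column; count the columns that still fit.
  used <- max(nchar(labels)) + cumsum(col_width)
  sum(used <= width)
}

x <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
n_cols_that_fit(x, width = 8)
```

A count derived this way changes with the console, whereas a fixed limit on the number of variables does not, which is where the two approaches pull in different directions along the width dimension.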

MansMeg · 2020-04-02T19:06:46Z

Yes. I agree that it is not trivial, and - at least to me - it is more of a nice to have than anything important.

Regarding how many chains to print - I actually think that print should show the minimal possible so I agree that 2 chains is probably a good idea as default.

paul-buerkner · 2020-04-05T19:42:41Z

I am going to close this issue for now since we have reasonable print outputs to start with. If, at a later stage, we want to make things prettier, for instance, more adaptive to the console width, we can open a specific new issue dedicated to that purpose.

paul-buerkner added the feature New feature or request label Nov 1, 2019

jgabry mentioned this issue Nov 8, 2019

summary with one line per _named_ parameter #43

Open

paul-buerkner added this to the CRAN release milestone Nov 10, 2019

paul-buerkner self-assigned this Mar 24, 2020

paul-buerkner closed this as completed Apr 5, 2020
