should variables() behave differently for `draws_rvars` format? #208

jgabry · 2021-11-01T22:04:54Z

When working on stan-dev/cmdstanr#582 I was reminded that variables() and nvariables() behave differently for the draws_rvars format compared to all other formats. I don't think this difference (treating vector/matrix variables as a single variable vs. treating elements as separate variables) is documented anywhere (or maybe I missed it). Should I add this information to the help("draws-index") page where these functions are documented?

The text was updated successfully, but these errors were encountered:

mjskay · 2021-11-01T23:55:14Z

Yeah, thanks that would be helpful! I think it's mentioned in the vignette but probably not elsewhere.

Alternatively, we could discuss if we want to keep that difference? It may be that we want to be able to get variable names and counts counted either way (by the base variable name or by the combination of variable and indices) when using any format, in which case we might want an additional function (or option) to do that. In hindsight while the idiosyncratic version of variables() makes sense for draws_rvars, it could be problematic in some cases, e.g. if someone was using the variables listed by variables() with extract_variable(). Kinda breaks the abstraction a bit...

jgabry · 2021-11-02T00:39:15Z

I think it's mentioned in the vignette but probably not elsewhere.

I forgot about the vignette. You're right, it does show this behavior there. I can also add something to the roxygen doc ( I guess the doc will depend on what we decide about adding a new function or argument, like you mentioned).

Alternatively, we could discuss if we want to keep that difference? It may be that we want to be able to get variable names and counts counted either way (by the base variable name or by the combination of variable and indices) when using any format, in which case we might want an additional function (or option) to do that

I like the idea of being able to get this info for any format. I just changed the name of this issue from just about the doc to discussing this question. I'd be ok with either a separate function or just an argument to variables() that toggles returning the names of all elements (although I guess this would need a different default depending on the format in order to not break backwards compatibility).

it could be problematic in some cases

Yeah I just ran into something in CmdStanR where we have some code that mistakenly assumes that variables() will return the same thing regardless of the draws format. variables() and nvariables() being different does make it trickier to work with draws objects because it requires special code just for draws_rvars.

mjskay · 2021-11-02T03:48:11Z

Yeah I just ran into something in CmdStanR where we have some code that mistakenly assumes that variables() will return the same thing regardless of the draws format. variables() and nvariables() being different does make it trickier to work with draws objects because it requires special code just for draws_rvars.

Makes sense. I think a default of giving the variable names with indices is probably sensible for compatibility with other parts of the interface.

If we had an option for giving only the "base" name, what is a sensible name? variables(x, base = TRUE)? stem = TRUE? array = TRUE? dimensions = FALSE? dims = FALSE? indices = FALSE? with_dims = FALSE?

One potential complication is that variables<-() for draws_rvars in the default case will be a bit hairy. It would maybe have to only allow renaming scalars, or possibly try to detect if someone is doing a valid rename of an array (e.g. if they rename all of c("x[1]", "x[2]", "x[3]") to c("y[1]", "y[2]", "y[3]") simultaneously). So there may still be some inconsistencies in the interface, but we might be able to keep them to a minimum.

I'd be curious what @paul-buerkner thinks.

paul-buerkner · 2022-01-05T14:05:08Z

I agree that, ideally, variables() should work the same way for all formats so changing the default behavior for rvars and adding a "base" argument sounds sensible to me. I don't have a good intuition about the name of this additional argument but "base" feels good in my ears.

I don't have a good intiution about variables<-() but we could also add the base argument there to do variables(x, base = TRUE) <- (). We could make base=TRUE the default here (slight inconsistency that I think is ok) and for base = FALSE do the input checks as you suggested.

jgabry added the documentation Improvements or additions to documentation label Nov 1, 2021

jgabry changed the title ~~document that variables() behaves differently for draws_rvars format~~ should variables() behave differently for draws_rvars format? Nov 2, 2021

mjskay mentioned this issue Jul 22, 2022

rvar indexing and casting improvements #247

Merged

This was referenced Jan 25, 2024

Add a function to extract single draw of multidimensional variable #340

Closed

extract_variable_array and the with_indices parameter for variables() #342

Merged

mjskay closed this as completed in #342 Feb 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

should variables() behave differently for `draws_rvars` format? #208

should variables() behave differently for `draws_rvars` format? #208

jgabry commented Nov 1, 2021

mjskay commented Nov 1, 2021

jgabry commented Nov 2, 2021 •

edited

Loading

mjskay commented Nov 2, 2021 •

edited

Loading

paul-buerkner commented Jan 5, 2022

should variables() behave differently for draws_rvars format? #208

should variables() behave differently for draws_rvars format? #208

Comments

jgabry commented Nov 1, 2021

mjskay commented Nov 1, 2021

jgabry commented Nov 2, 2021 • edited Loading

mjskay commented Nov 2, 2021 • edited Loading

paul-buerkner commented Jan 5, 2022

should variables() behave differently for `draws_rvars` format? #208

should variables() behave differently for `draws_rvars` format? #208

jgabry commented Nov 2, 2021 •

edited

Loading

mjskay commented Nov 2, 2021 •

edited

Loading