Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

should rvar auto-drop dimensions on subsetting? #245

Open
wds15 opened this issue Jul 8, 2022 · 2 comments
Open

should rvar auto-drop dimensions on subsetting? #245

wds15 opened this issue Jul 8, 2022 · 2 comments
Labels
interface Interface improvements or changes

Comments

@wds15
Copy link

wds15 commented Jul 8, 2022

Hi!

rvars are a great thing and I have started using them a bit. I have run into issues when using it and there could be a nice solution but let me start with describing my setting. I wanted to transfer code from the transformed parameters block into R in order to eventually handle post-processing of my draws in R instead of Stan. Here are the respective Stan bits:

parameters {
  // the first 1:num_groups parameters are EX modelled while the
  // num_groups+1:2*num_groups are NEX
  vector[2] log_beta_raw[2*num_groups,num_comp];
  vector[num_inter] eta_raw[2*num_groups];

  // hierarchical priors
  vector[2] mu_log_beta[num_comp];
  // for differential discounting we allow the tau's to vary by
  // stratum (but not the means)
  vector<lower=0>[2] tau_log_beta_raw[num_strata,num_comp];
  cholesky_factor_corr[2] L_corr_log_beta[num_comp];

  vector[num_inter] mu_eta;
  vector<lower=0>[num_inter] tau_eta_raw[num_strata];
  cholesky_factor_corr[num_inter] L_corr_eta;
}
transformed parameters {
  vector[2] beta[2*num_groups,num_comp];
  vector[num_inter] eta[2*num_groups];
  vector<lower=0>[2] tau_log_beta[num_strata,num_comp];
  vector<lower=0>[num_inter] tau_eta[num_strata];

  // in the case of fixed tau's we fill them in here
  if (prior_tau_dist == 0) {
    tau_log_beta = prior_EX_tau_mean_comp;
    tau_eta = prior_EX_tau_mean_inter;
  } else {
    tau_log_beta = tau_log_beta_raw;
    tau_eta = tau_eta_raw;
  }

  // EX parameters which vary by stratum which is defined by the group
  for(g in 1:num_groups) {
    int s = group_stratum_cid[g];
    for(j in 1:num_comp) {
      beta[g,j] = mu_log_beta[j] +
        diag_pre_multiply(tau_log_beta[s,j], L_corr_log_beta[j]) * log_beta_raw[g,j];
    }
    if (num_inter > 0)
      eta[g] = mu_eta + diag_pre_multiply(tau_eta[s], L_corr_eta) * eta_raw[g];
  }

  // NEX parameters
  beta[num_groups+1:2*num_groups] = log_beta_raw[num_groups+1:2*num_groups];
  eta[num_groups+1:2*num_groups] = eta_raw[num_groups+1:2*num_groups];

  // exponentiate the slope parameter to force positivity
  for(g in 1:2*num_groups)
    for(j in 1:num_comp)
      beta[g,j,2] = exp(beta[g,j,2]);
}

I ended up with this respective rvar style code:

library(OncoBayes2)
library(posterior)

example_model("combo2")

post <- blrmfit$posterior

spost <- subset_draws(as_draws_rvars(post),
                      variable=c("log_beta_raw", "eta_raw", "mu_log_beta", "tau_log_beta_raw", "L_corr_log_beta", "mu_eta", "tau_eta_raw", "L_corr_eta"))

sdata <- blrmfit$standata

S_chain <- 1E3
num_chains <- 4
S <- S_chain * num_chains

beta <- rvar(array(0, dim=c(S, 2*sdata$num_groups,sdata$num_comp,2)), nchain=num_chains)
eta <- rvar(array(0, dim=c(S, 2*sdata$num_groups)), nchain=num_chains)
tau_log_beta <- rvar(array(0, dim=c(S, sdata$num_strata, sdata$num_comp)), nchain=num_chains)
tau_eta <- rvar(array(0, dim=c(S, sdata$num_strata)), nchain=num_chains)

## in the case of fixed tau's we fill them in here
if (sdata$prior_tau_dist == 0) {
    tau_log_beta = as_rvar(sdata$prior_EX_tau_mean_comp)
    tau_eta = as_rvar(sdata$prior_EX_tau_mean_inter)
} else {
    tau_log_beta = spost$tau_log_beta_raw
    tau_eta = spost$tau_eta_raw
}

## EX parameters which vary by stratum which is defined by the group
for(g in 1:sdata$num_groups) {
    s = sdata$group_stratum_cid[g];
    for(j in 1:sdata$num_comp) {
        beta[g,j,drop=TRUE] = spost$mu_log_beta[j,,drop=TRUE] + (rdo(diag(tau_log_beta[s,j,,drop=TRUE], sdata$num_comp, sdata$num_comp)) %**% spost$L_corr_log_beta[j,,drop=TRUE]) %**% spost$log_beta_raw[g,j,,drop=TRUE]
        ##diag_pre_multiply(tau_log_beta[s,j], L_corr_log_beta[j]) * log_beta_raw[g,j];
    }
    if (num_inter > 0)
        eta[g] = spost$mu_eta + (rdo(diag(tau_eta[s,drop=TRUE], sdata$num_inter, sdata$num_inter)) %**% spost$L_corr_eta) %**% spost$eta_raw[g]
    ##+ diag_pre_multiply(tau_eta[s], L_corr_eta) * eta_raw[g];
}


## NEX parameters
beta[sdata$num_groups+1:2*sdata$num_groups] = spost$log_beta_raw[sdata$num_groups+1:2*sdata$num_groups];
eta[sdata$num_groups+1:2*sdata$num_groups] = spost$eta_raw[sdata$num_groups+1:2*sdata$num_groups];

## exponentiate the slope parameter to force positivity
for(g in 1:2*sdata$num_groups)
    for(j in 1:sdata$num_comp)
        beta[g,j,2] = exp(beta[g,j,2]);

You see that the rvars code is not nice as I have to use many times drop. This is due to the fact that I nest matrices and vectors into arrays... but that structure gets cast into big arrays in rvar. However, when subsetting the leading indices (for the array stuff in Stan) and then using the rvar, the math won't work, since the dimensions do not line up - this is due to it's intended use is after subsetting things first to the array level.

Thus, I was wondering if we could have a as_draws_rvars flavour which creates nested lists from the samples? So all arrays get converted to nested lists and the vectors/matrices become rvars? With that I would not have to deal with drop in such an ugly way.

... or maybe there is a simpler way of doing this?

@mjskay
Copy link
Collaborator

mjskay commented Jul 8, 2022

You see that the rvars code is not nice as I have to use many times drop. This is due to the fact that I nest matrices and vectors into arrays... but that structure gets cast into big arrays in rvar. However, when subsetting the leading indices (for the array stuff in Stan) and then using the rvar, the math won't work, since the dimensions do not line up - this is due to it's intended use is after subsetting things first to the array level.

Yeah I see, it is kind of awkward. When we originally put the rvar interface together I was following the pattern from the nascent {rray} package of not auto-dropping array dimensions. One question might be if that was the right design decision after all? Maybe that actually makes real use more annoying. Not sure if other folks (@paul-buerkner @jgabry @avehtari) have thoughts about this? It is one of the big differences between rvars and base-R arrays; perhaps we should drop dimensions by default like base R does?

Thus, I was wondering if we could have a as_draws_rvars flavour which creates nested lists from the samples? So all arrays get converted to nested lists and the vectors/matrices become rvars? With that I would not have to deal with drop in such an ugly way.

I could see something like this being useful, even beyond this use case. Ragged arrays come to mind. Perhaps having a way to specifying dimensions to be created as lists when converted. This relates to some other issues: eventually I would like some generic machinery for parsing indices that can be used in other packages (there was some work on it here but it's been awhile and needs updating: #152).

beta[g,j,drop=TRUE] = spost$mu_log_beta[j,,drop=TRUE] + (rdo(diag(tau_log_beta[s,j,,drop=TRUE], sdata$num_comp, sdata$num_comp)) %**% spost$L_corr_log_beta[j,,drop=TRUE]) %**% spost$log_beta_raw[g,j,,drop=TRUE]

this usage reminds me we need a complete diag() implementation so you don't have to use rdo() here.

@mjskay mjskay mentioned this issue Jul 8, 2022
@paul-buerkner
Copy link
Collaborator

I think that both drop = FALSE and drop = TRUE have their merits, but I agree this use-case here is ugly. It would be interesting to see for what purpose other people use rvars (or why they don't use them). Does anyone wants to chime in?

In general, I am open to changing the drop default if we find enough evidence that this is the better choice in expectation.

@mjskay mjskay changed the title structured rvars should rvar auto-drop dimensions on subsetting? Jul 23, 2022
@paul-buerkner paul-buerkner added the interface Interface improvements or changes label Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interface Interface improvements or changes
Projects
None yet
Development

No branches or pull requests

3 participants