
Unexpected behavior in random #2909

Closed
ColCarroll opened this issue Mar 28, 2018 · 4 comments · Fixed by #2979
@ColCarroll (Member)

Consider the following:

import pymc3 as pm

with pm.Model() as model:
    a = pm.Uniform('a', lower=0, upper=1, shape=10)
    b = pm.Binomial('b', n=1, p=a, shape=10)

b.random(size=10000).mean(axis=0)

# array([0.7022, 0.0073, 0.9857, 0.5378, 0.9821, 0.7176, 0.0905, 0.2513, 0.5835, 0.0521])

I would have expected this mean to be close to 0.5 for each element, since marginally each b[i] is Bernoulli with p = a[i] ~ Uniform(0, 1), so E[b[i]] = E[a[i]] = 0.5.

This is happening because the implementations of the .random methods all follow this pattern:

# Binomial.random; draw_values and generate_samples come from
# pymc3.distributions.distribution, and stats is scipy.stats.
def random(self, point=None, size=None):
    n, p = draw_values([self.n, self.p], point=point)
    return generate_samples(stats.binom.rvs, n=n, p=p,
                            dist_shape=self.shape,
                            size=size)

Note that a single value is drawn for p and then used for every draw of the random variable. Indeed, in the example above, the value drawn for p was quite close to the mean of the draws!
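For illustration, here is a minimal scipy-only sketch (variable names are mine) contrasting the two schemes:

from scipy import stats

# Current behavior: one p is drawn up front and reused for every sample,
# so each column of draws is Bernoulli(p_fixed[i]) and its mean tracks
# p_fixed[i], not 0.5.
p_fixed = stats.uniform.rvs(size=10)
fixed_draws = stats.binom.rvs(n=1, p=p_fixed, size=(10000, 10))
print(fixed_draws.mean(axis=0))  # approximately p_fixed, element-wise

# Expected behavior: a fresh p for every sample marginalizes over the
# prior, so each column's mean is close to E[p] = 0.5.
fresh_p = stats.uniform.rvs(size=(10000, 10))
fresh_draws = stats.binom.rvs(n=1, p=fresh_p)
print(fresh_draws.mean(axis=0))  # approximately 0.5, element-wise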

This is causing strange behavior in sample_prior (#2876). If this is considered a bug, I can fix it by allowing draw_values to have a size argument. Otherwise I can change #2876 to work around the behavior.
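For concreteness, the fixed pattern might look roughly like this, assuming draw_values grows a size keyword (a sketch of the proposal, not the current API):

def random(self, point=None, size=None):
    # Hypothetical: with a size argument, draw_values would return
    # `size` independent draws of n and p rather than one shared value.
    n, p = draw_values([self.n, self.p], point=point, size=size)
    return generate_samples(stats.binom.rvs, n=n, p=p,
                            dist_shape=self.shape,
                            size=size)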

(NB: I think this does not affect any of the sampling procedures.)

@junpenglao (Member)

I hope it doesn't affect sample_ppc...

@ColCarroll (Member Author)

A quick experiment says it does not:

import numpy as np
import pymc3 as pm

with pm.Model() as model:
    a = pm.Uniform('a', lower=0, upper=1, shape=10)
    b = pm.Binomial('b', n=1, p=a, shape=10, observed=np.ones(10))
    trace = pm.sample(1000)
    post = pm.sample_ppc(trace)

# Posterior predictive means are tightly clustered around the observed data...
post['b'].mean(axis=0).std()  # 0.0139301830569451
# ...whereas direct .random draws show the much larger spread caused by the bug.
b.distribution.random(size=post['b'].shape[0]).mean(axis=0).std()  # 0.2558969323770803

@lucianopaz (Contributor)

Sorry to jump in on this discussion without being a member, but I think that what @ColCarroll pointed out should be considered a bug, and should be fixed, as he says, by adding a size parameter to draw_values.

Consider his first example. The current behavior of random is to draw a single value of b's distribution's metaparameters. I think this has at least two problems:

  1. The current implementation is not self-consistent: I would expect a call to b.random(size=10000) to have the same distribution as [b.random() for i in range(10000)], with the first call merely being faster because it can be vectorized or otherwise distributed more efficiently (see the sketch at the end of this comment).

  2. It is not fully consistent with the Bayesian interpretation of the metaparameters. In his example, each b[i] is sampled from P(b[i] | a), where a is drawn once at the beginning and shares a common value across all b[i]. That would match a frequentist interpretation, in which the model parameters are fixed real numbers in nature and should be the same for each b[i] that is drawn. However, this was clearly not intended: the value of a is not stored between calls to random, so different calls will very likely use different a's. A method more consistent with the Bayesian interpretation, where the uncertainty in the metaparameters is encoded by a probability distribution, would be to sample a and b jointly, in other words, to draw a value of a for each draw of b.

In the end, if draw_values also took a size parameter, as @ColCarroll suggests, it would be possible to get an independent value of a for each entry of the b array, solving both of the problems mentioned above.
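To make the consistency property in point 1 concrete, a check along these lines (assuming the model from the first example is in scope) should pass once draw_values takes size:

import numpy as np

vectorized = b.random(size=10000)                      # one call, shape (10000, 10)
looped = np.array([b.random() for _ in range(10000)])  # a is redrawn on every call

# `looped` already has element-wise means near E[a] = 0.5; with the fix,
# `vectorized` should too, instead of tracking a single draw of a.
print(vectorized.mean(axis=0))
print(looped.mean(axis=0))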

@springcoil (Contributor)

@ColCarroll I'd agree with @lucianopaz on this. There's a bug in draw_values.
