add binomial entropy and kl #149
base: master
Conversation
torch/distributions/binomial.py (Outdated)
@@ -77,6 +77,11 @@ def probs(self):
    def param_shape(self):
        return self._param.size()

    def _log1pmprobs(self):
Since it is a function for internal use, I think this can be moved to the top of the module, like in MVN. Something like:

    def _log1pmtensor(tensor):
        # Do the same thing

Uses of the function in kl.py can be done by importing this function along with Binomial.
torch/distributions/binomial.py (Outdated)
@@ -109,3 +111,27 @@ def enumerate_support(self):
        values = values.view((-1,) + (1,) * len(self._batch_shape))
        values = values.expand((-1,) + self._batch_shape)
        return values

    def _Elnchoosek(self):
Same idea here.
torch/distributions/binomial.py (Outdated)
        s = self.enumerate_support()
        s[0] = 1  # 0! = 1
        # x is factorial matrix i.e. x[k,...] = k!
        x = torch.cumsum(s.log(), dim=0)
x is the log of the factorial matrix, right?
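The reviewer is right: a running sum of log k produces log k!, not k!. A scalar sketch of what `torch.cumsum(s.log(), dim=0)` computes, checked against `lgamma`:

```python
import math

def log_factorials(n):
    # Running sum of logs: out[k] = log(k!); the scalar analogue of
    # torch.cumsum(s.log(), dim=0) with s[0] set to 1 (since log 1 = 0).
    out = [0.0]  # log 0! = 0
    for k in range(1, n + 1):
        out.append(out[-1] + math.log(k))
    return out
```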
torch/distributions/binomial.py (Outdated)
        indices[0] = torch.arange(x.size(0) - 1, -1, -1,
                                  dtype=torch.long, device=x.device)
        # x[tuple(indices)] is x reversed on first axis
        lnchoosek = x[-1] - x - x[tuple(indices)]
I think x.flip(dim=0) will exhibit the same behaviour.
Weird, I tried using flip and it didn't work before - maybe I messed up the arguments...
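The equivalence the reviewer suggests can be demonstrated (numpy is used here only for illustration; torch's `Tensor.flip` takes positional dims, so `x.flip(0)` or `x.flip(dims=[0])` is the correct call and a `dim=` keyword raises a TypeError, which may explain the earlier failure):

```python
import numpy as np

x = np.arange(12.0).reshape(4, 3)
# Explicit reversed index, as constructed in the diff:
idx = np.arange(x.shape[0] - 1, -1, -1)
reversed_by_index = x[idx]
# Equivalent one-liner that reverses the first axis; the torch
# analogue is x.flip(0), not x.flip(dim=0).
reversed_by_flip = np.flip(x, axis=0)
assert np.array_equal(reversed_by_index, reversed_by_flip)
```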
torch/distributions/binomial.py (Outdated)
        elognfac = x[-1]
        elogkfac = ((lnchoosek + s * self.logits + self.total_count * self._log1pmprobs()).exp() *
                    x).sum(dim=0)
        elognmkfac = ((lnchoosek + s * self.logits + self.total_count * self._log1pmprobs()).exp() *
E[log(n-k)!] = E[log k!] but for Bin(n, (1 - p)). Can we use this fact here?
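The identity the reviewer invokes follows from the fact that if K ~ Bin(n, p) then n - K ~ Bin(n, 1 - p). A pure-Python check by direct enumeration (helper names here are for illustration, not from the PR):

```python
import math

def binom_pmf(n, p, k):
    # P(K = k) for K ~ Bin(n, p)
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def e_log_k_factorial(n, p):
    # E[log K!] for K ~ Bin(n, p), by enumerating the support
    return sum(binom_pmf(n, p, k) * math.lgamma(k + 1) for k in range(n + 1))

def e_log_nmk_factorial(n, p):
    # E[log (n - K)!] for K ~ Bin(n, p)
    return sum(binom_pmf(n, p, k) * math.lgamma(n - k + 1) for k in range(n + 1))

# If K ~ Bin(n, p) then n - K ~ Bin(n, 1 - p), so the two expectations agree:
n, p = 20, 0.3
assert abs(e_log_nmk_factorial(n, p) - e_log_k_factorial(n, 1 - p)) < 1e-9
```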
torch/distributions/kl.py (Outdated)
        inf_idxs = p.total_count > q.total_count
        kl[inf_idxs] = _infinite_like(kl[inf_idxs])
        return kl


@register_kl(Binomial, Poisson)
Heterogeneous combinations were placed below. This section was for homogeneous combinations.
torch/distributions/kl.py (Outdated)
                            q.rate)


@register_kl(Binomial, Geometric)
Same as above comment.
torch/distributions/kl.py (Outdated)
@@ -273,6 +290,11 @@ def _kl_geometric_geometric(p, q):
    return -p.entropy() - torch.log1p(-q.probs) / p.probs - q.logits


@register_kl(Geometric, Binomial)
Same as above comment.
Some comments have been given; please check them.
Could you also check whether the KL test passes at a lower tolerance, and how long it takes at the default tolerance setting?
Thanks for adding these!
@vishwakftw thanks for the comments! One thing I want us to work out before wrapping this up is an approximation to E[log k!] for large n. I tried Stirling's approximation but couldn't come up with a closed form. Any ideas?
I think we have to make use of Stirling's inequality and a Taylor series to compute this. I guess the reason you are unable to come up with a closed form is the log(k) term. I tried using them and got about 0.5% relative error. This might help after expanding log k! <= 1 + k log k + 0.5 log k - k: https://en.wikipedia.org/wiki/Taylor_expansions_for_the_moments_of_functions_of_random_variables
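For reference, the approximation being discussed is Stirling's: log k! ≈ k log k - k + 0.5 log(2πk), whose absolute error shrinks roughly like 1/(12k). A quick sketch comparing it against the exact value via `lgamma`:

```python
import math

def stirling_log_factorial(k):
    # Stirling's approximation for log k! (k >= 1):
    #   log k! ~ k log k - k + 0.5 * log(2 * pi * k)
    # with absolute error about 1/(12k), which is what makes an
    # approximation of E[log K!] plausible for large n.
    return k * math.log(k) - k + 0.5 * math.log(2 * math.pi * k)

# The relative error drops quickly as k grows:
for k in (5, 50, 500):
    exact = math.lgamma(k + 1)  # log k! exactly
    rel_err = abs(stirling_log_factorial(k) - exact) / exact
```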
Looks good to me! @fritzo what do you think?
Also, are you going to try the large n case?
@vishwakftw btw I tried it with 0.01 precision as well. Two things on my wishlist:
@alicanb I have a closed form solution for E[log x!], E[log (n - x)!] and E[log n!] (this is simply log n!) for large n.
Great, have you experimented with any large n?
This is the gist for the approximations. I ran some tests: n = {10, 20, 50, 75, 100} and p = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}.
This is a larger PR than I intended, but basically it adds binomial entropy and binomial-poisson and binomial-geometric KL, with some helper functions:

- binomial._log1pmprobs: I used this a lot, so I made it a separate function. It calculates (-probs).log1p() safely.
- binomial._Elnchoosek(): for x ~ Bin(n, p), this calculates E[log(nchoosek)], E[log(n!)], E[log(x!)], E[log((n-x)!)]
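A scalar sketch of the E[log(nchoosek)] term that a helper like binomial._Elnchoosek computes (the PR version is tensorized over the enumerated support; the function name and scalar form here are for illustration only):

```python
import math

def e_log_nchoosek(n, p):
    # E[log C(n, K)] for K ~ Bin(n, p), by enumerating k = 0..n;
    # a pure-Python analogue of the tensorized _Elnchoosek term.
    total = 0.0
    for k in range(n + 1):
        pmf = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        total += pmf * math.log(math.comb(n, k))
    return total

# Sanity check: with n = 1, C(1, 0) = C(1, 1) = 1, so the expectation is 0.
assert abs(e_log_nchoosek(1, 0.3)) < 1e-12
```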