
Keras wrapper for blocksparse layers #18

Open · wants to merge 4 commits into master

Conversation

@ThomasHagebols commented Nov 30, 2018

As requested in #14 and following up on an email with @scott-gray

Added a Keras blocksparse layer with an example Jupyter notebook. The notebook trains a simple network on CIFAR-10 to a test accuracy of 54% (60% as of the latest commit).

As of now it does not support saving the model. Maybe someone else knows how to fix that? (Fixed in the last commit.)

It also doesn't support eager execution. I'll follow up on that and create an issue later, since it seems to be a problem in the code outside of this commit.

I put the code in the examples directory. If you think it's a good addition, I could also move it into the blocksparse module directory.
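For reference, here is a simplified sketch of the general shape such a wrapper can take, written against the BlocksparseMatMul usage from the README. The layer name `SparseDense`, its constructor arguments, and the random layout initialization are illustrative, not a verbatim copy of the code in this PR:

```python
import numpy as np
from tensorflow import keras
from blocksparse.matmul import BlocksparseMatMul

class SparseDense(keras.layers.Layer):
    """Dense layer with a block-sparse weight matrix (hypothetical example)."""

    def __init__(self, units, block_size=32, density=0.5, **kwargs):
        super(SparseDense, self).__init__(**kwargs)
        self.units = units
        self.block_size = block_size
        self.density = density

    def build(self, input_shape):
        in_dim = int(input_shape[-1])
        # Random block layout: 1 = block present, 0 = block absent.
        self.layout = (np.random.rand(in_dim // self.block_size,
                                      self.units // self.block_size)
                       < self.density).astype(np.int32)
        self.bsmm = BlocksparseMatMul(self.layout, block_size=self.block_size)
        # BlocksparseMatMul exposes the packed weight shape as w_shape.
        self.w = self.add_weight(name="w",
                                 shape=self.bsmm.w_shape,
                                 initializer="glorot_uniform",
                                 trainable=True)
        super(SparseDense, self).build(input_shape)

    def call(self, inputs):
        # y = x @ W, where W is stored as packed non-zero blocks only.
        return self.bsmm(inputs, self.w)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.units)
```

Stacking a few of these in a `keras.Sequential` model is enough to reproduce the kind of simple network the notebook trains.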

@ThomasHagebols (Author)

I made some new commits that add the functionality to save a model.

I also updated the Jupyter notebook and trained a (really) simple MLP to 60% accuracy on CIFAR-10.
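The standard Keras mechanism for this is to serialize the layer's constructor arguments in `get_config` and supply the class via `custom_objects` at load time. A sketch, assuming the hypothetical `SparseDense` layer above:

```python
# Added to the hypothetical SparseDense layer from the earlier sketch:

    def get_config(self):
        # Serialize constructor arguments so keras.models.load_model
        # can rebuild the layer from the saved file.
        config = super(SparseDense, self).get_config()
        config.update({"units": self.units,
                       "block_size": self.block_size,
                       "density": self.density})
        return config

# The custom class then has to be supplied explicitly at load time:
# model.save("sparse_mlp.h5")
# model = keras.models.load_model("sparse_mlp.h5",
#                                 custom_objects={"SparseDense": SparseDense})
```

Note that with a random layout generated in `build()`, a faithful reload also needs the layout itself serialized (or a fixed seed); otherwise the restored packed weights won't line up with a freshly regenerated sparsity pattern.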

@scott-gray (Contributor)

Sorry for not replying earlier. I've been doing a lot of development on blocksparse-related things (mostly related to learned sparsity), though I don't think I've made any interface-breaking changes for you. For image data I was planning on providing some separable conv kernels to go with the bs_matmul ops. That way you can sparsify just the feature conjunctions (the 1d_conv ops), where the spatial conjunctions are already sparse. But you should still be able to simulate local spatial sparsity with a carefully constructed pure bs_matmul layout.

Anyway, I should be able to get to this request soon.
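To illustrate the last point: one way to get local spatial sparsity out of a pure bs_matmul layout is a band-diagonal block layout, where each block row only connects to nearby block columns. This is just a sketch of the idea, not necessarily the exact construction you'd want:

```python
import numpy as np

def band_layout(n_blocks, bandwidth=1):
    """Band-diagonal block layout: block row i connects only to block
    columns within `bandwidth` of i, mimicking local spatial connectivity."""
    layout = np.zeros((n_blocks, n_blocks), dtype=np.int32)
    for i in range(n_blocks):
        lo = max(0, i - bandwidth)
        hi = min(n_blocks, i + bandwidth + 1)
        layout[i, lo:hi] = 1
    return layout

# e.g. a 128x128-block weight where each block row sees 3 neighbouring columns;
# the result can be fed straight into BlocksparseMatMul(layout, block_size=32).
layout = band_layout(128, bandwidth=1)
```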

@ThomasHagebols (Author)

Cool, I noticed some recent commits. Looking forward to playing around with the new features. Depthwise separable convs are an interesting approach to reducing parameters. I saw that in MobileNet (v1) 75% of the parameters are in the 1x1 convolutions, so sparsifying those might bring significant improvements. :D

Haha, if it breaks I'll find out eventually. Maybe I should write tests to check for issues at a later stage.

I saw that you added a prune method to BlocksparseMatMul. That was meant for removing blocks, right? I couldn't really figure out how to use it. I think it would be interesting to add the possibility of learned sparsity to the wrapper. You said before that adding blocks is hard from a memory-management point of view. Would this also be the case if you remove some blocks and add others (with the constraint that the number of blocks removed >= the number of blocks added)? Depending on the complexity, I think it would be cool if I could add this functionality to the Keras wrapper.
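To make the idea concrete, here is a hypothetical sketch of the kind of layout update I mean, done outside the kernels with simple magnitude pruning. The helper and its inputs are made up for illustration, not the actual prune API:

```python
import numpy as np

def swap_blocks(layout, block_norms, n_swap):
    """Hypothetical layout update: drop the n_swap active blocks with the
    smallest norm and activate n_swap currently-empty blocks at random,
    keeping the total block count (and packed weight size) constant.
    block_norms: per-block weight norms, same shape as layout."""
    new_layout = layout.copy()
    flat_layout = layout.ravel()
    flat_norms = block_norms.ravel()
    active = np.flatnonzero(flat_layout == 1)
    empty = np.flatnonzero(flat_layout == 0)
    # Drop the weakest active blocks...
    weakest = active[np.argsort(flat_norms[active])[:n_swap]]
    new_layout.flat[weakest] = 0
    # ...and activate an equal number of previously-empty blocks.
    grown = np.random.choice(empty, size=n_swap, replace=False)
    new_layout.flat[grown] = 1
    return new_layout
```

A new BlocksparseMatMul would then be built from `new_layout` and the surviving packed blocks copied across; since blocks removed >= blocks added, the packed weight buffer never needs to grow.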
