Adding flags to datamodules #388

Merged
21 commits merged on Dec 16, 2020

Conversation

briankosw
Contributor

What does this PR do?

Adds the flags shuffle, drop_last, and pin_memory to datamodules.

Fixes #245
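
A minimal sketch of what exposing these flags could look like, not the PR's actual code. `LoaderFlags`, `train_kwargs`, and the `batch_size`/`num_workers` defaults are hypothetical names and values chosen for illustration; the dictionary keys are real keyword arguments of `torch.utils.data.DataLoader`, so the result could be unpacked as `DataLoader(dataset, **kwargs)`.

```python
from dataclasses import dataclass


@dataclass
class LoaderFlags:
    """Hypothetical container for the three flags this PR exposes."""

    batch_size: int = 32      # assumed default, for illustration only
    num_workers: int = 0      # assumed default, for illustration only
    shuffle: bool = False
    drop_last: bool = False
    pin_memory: bool = False

    def train_kwargs(self) -> dict:
        # Each key is an existing DataLoader keyword argument.
        return {
            "batch_size": self.batch_size,
            "num_workers": self.num_workers,
            "shuffle": self.shuffle,
            "drop_last": self.drop_last,
            "pin_memory": self.pin_memory,
        }


# The user opts in to shuffling and pinned memory explicitly.
flags = LoaderFlags(shuffle=True, pin_memory=True)
print(flags.train_kwargs())
```

Keeping the flags on the datamodule means the user sets them once at construction time instead of overriding each dataloader method.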

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • [] Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@codecov

codecov bot commented Nov 20, 2020

Codecov Report

Merging #388 (ebaaf18) into master (13863cc) will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #388      +/-   ##
==========================================
- Coverage   80.79%   80.77%   -0.03%     
==========================================
  Files         100      101       +1     
  Lines        5728     5706      -22     
==========================================
- Hits         4628     4609      -19     
+ Misses       1100     1097       -3     
Flag        Coverage Δ
cpu         25.20% <ø> (-0.03%) ⬇️
pytest      25.20% <ø> (-0.03%) ⬇️
unittests   80.05% <ø> (-0.12%) ⬇️

Impacted Files                              Coverage Δ
pl_bolts/optimizers/lars_scheduling.py      78.72% <0.00%> (-17.03%) ⬇️
pl_bolts/datasets/kitti_dataset.py          34.61% <0.00%> (-0.68%) ⬇️
pl_bolts/datasets/ssl_amdim_datasets.py     74.32% <0.00%> (-0.68%) ⬇️
pl_bolts/utils/semi_supervised.py           96.77% <0.00%> (-0.06%) ⬇️
pl_bolts/datasets/cifar10_dataset.py        96.77% <0.00%> (-0.04%) ⬇️
pl_bolts/datasets/dummy_dataset.py          100.00% <0.00%> (ø)
pl_bolts/utils/__init__.py                  100.00% <0.00%> (ø)
pl_bolts/datamodules/experience_source.py   95.93% <0.00%> (+0.06%) ⬆️
pl_bolts/datasets/imagenet_dataset.py       20.11% <0.00%> (+0.70%) ⬆️
pl_bolts/datasets/mnist_dataset.py          57.14% <0.00%> (+14.28%) ⬆️
... and 1 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 13863cc...ebaaf18.

@akihironitta akihironitta added the datamodule Anything related to datamodules label Nov 24, 2020
@briankosw briankosw marked this pull request as ready for review December 1, 2020 05:31
@briankosw
Contributor Author

briankosw commented Dec 1, 2020

@nateraw I would love to get your feedback on this PR. Specifically, should the flags for the validation and test dataloaders be parametrized the same way as for the train dataloader? For example, shuffle was previously set to True for train and False for validation and test. Should all three loaders now follow the newly added flag? Also, I set the defaults to shuffle=False, pin_memory=False, and drop_last=False, since I think the user should explicitly opt in to these flags.
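
To make the effect of `drop_last` concrete, here is a small arithmetic sketch. It assumes a map-style dataset with the default batching behaviour of `torch.utils.data.DataLoader`; `num_batches` is a hypothetical helper, not part of the PR.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Batches per epoch for a map-style dataset of num_samples items."""
    if drop_last:
        # The final, smaller batch is discarded.
        return num_samples // batch_size
    # The final batch is kept even if it is smaller than batch_size.
    return math.ceil(num_samples / batch_size)


print(num_batches(1000, 32, drop_last=False))  # 32
print(num_batches(1000, 32, drop_last=True))   # 31
```

With `drop_last=False`, the last batch here holds only 8 samples, which matters for anything that assumes a fixed batch size.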

@pep8speaks

pep8speaks commented Dec 1, 2020

Hello @briankosw! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-12-13 10:18:17 UTC

Contributor

@akihironitta akihironitta left a comment

@briankosw Thank you for your contribution as always! It seems the doc tests failed. Would you mind having a look?

pl_bolts/datamodules/kitti_datamodule.py (review thread resolved)
@akihironitta
Contributor

@briankosw mind resolving the conflicts, too?

@briankosw
Contributor Author

Hey @akihironitta,

  1. Do you think shuffle=False should be hardcoded in there for val and test loaders?
  2. Do you find the default values that I've given for shuffle, pin_memory, and drop_last to be reasonable?

@akihironitta akihironitta self-assigned this Dec 14, 2020
@akihironitta
Contributor

  1. Do you think shuffle=False should be hardcoded in there for val and test loaders?

Since they are not used for training, I guess hardcoding shuffle=False sounds fine...
@Borda What do you think about the above?

  2. Do you find the default values that I've given for shuffle, pin_memory, and drop_last to be reasonable?

Yes, it looks good to me as is :] As DataLoader uses False by default, let's keep them all False. https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

@Borda Borda requested a review from akihironitta December 14, 2020 21:10
@Borda
Member

Borda commented Dec 14, 2020

@akihironitta mind checking whether your comments were resolved? Otherwise it LGTM.

Contributor

@akihironitta akihironitta left a comment

@briankosw LGTM! Thank you :]

@briankosw
Contributor Author

  1. Do you think shuffle=False should be hardcoded in there for val and test loaders?

Since they are not used for training, I guess hardcoding shuffle=False sounds fine...

Sounds good! I went ahead and hardcoded shuffle=False for validation and testing. Do I need to add anything to the documentation or the changelog for this PR?
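
The agreed behaviour can be sketched roughly as follows. `loader_kwargs` is a hypothetical helper, and forwarding `drop_last` and `pin_memory` unchanged to the validation and test loaders is an assumption of this sketch rather than something stated in the thread.

```python
def loader_kwargs(stage: str, shuffle: bool, drop_last: bool, pin_memory: bool) -> dict:
    """Hypothetical helper: build DataLoader keyword arguments per stage."""
    if stage not in ("train", "val", "test"):
        raise ValueError(f"unknown stage: {stage!r}")
    return {
        # Only the training loader honours the user's shuffle flag;
        # validation and test loaders always iterate in a fixed order.
        "shuffle": shuffle if stage == "train" else False,
        # Assumption: the remaining flags apply to every stage.
        "drop_last": drop_last,
        "pin_memory": pin_memory,
    }


for stage in ("train", "val", "test"):
    print(stage, loader_kwargs(stage, shuffle=True, drop_last=False, pin_memory=True))
```

Fixing the evaluation order keeps validation and test metrics reproducible from run to run, regardless of what the user passes for `shuffle`.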

@Borda Borda merged commit 7beb933 into Lightning-Universe:master Dec 16, 2020
chris-clem pushed a commit to chris-clem/pytorch-lightning-bolts that referenced this pull request Dec 16, 2020
* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Cleaning up parameters and docstring

* Fixing syntax error

* Fixing documentation

* Hardcoding shuffle=False for val and test
@akihironitta akihironitta mentioned this pull request Dec 17, 2020
8 tasks
chris-clem added a commit to chris-clem/pytorch-lightning-bolts that referenced this pull request Dec 17, 2020
Borda added a commit that referenced this pull request Dec 17, 2020
* Add BaseDataModule

* Add pre-commit hooks

* Refactor cifar10_datamodule

* Move torchvision warning

* Refactor binary_mnist_datamodule

* Refactor fashion_mnist_datamodule

* Fix errors

* Remove VisionDataset type hint so CI base testing does not fail (torchvision is not installed there)

* Implement Nate's suggestions

* Remove train and eval batch size because it breaks a lot of tests

* Properly add transforms to train and val dataset

* Add num_samples property to cifar10 dm

* Add tests and docs

* Fix flake8 and codefactor issue

* Update changelog

* Fix isort

* Add typing

* Rename to VisionDataModule

* Remove transform_lib type annotation

* suggestions

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Akihiro Nitta <[email protected]>

* Add flags from #388 to API

* Make tests work

* Move _TORCHVISION_AVAILABLE check

* Update changelog

* Fix CI base testing

* Fix CI base testing

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Akihiro Nitta <[email protected]>
@briankosw briankosw deleted the feature/data_module_flag branch December 21, 2020 08:02
Borda added a commit that referenced this pull request Jan 18, 2021
* Add DCGAN module

* Undo black on conf.py

* Add tests for DCGAN

* Fix flake8 and codefactor

* Add types and small refactoring

* Make image sampler callback work

* Upgrade DQN to use .log (#404)

* Upgrade DQN to use .log

* remove unused

* pep8

* fixed other dqn

* fix loss test case for batch size variation (#402)

* Decouple DataModules from Models - CPCV2 (#386)

* Decouple dms from CPCV2

* Update tests

* Add docstrings, fix import, and update changelog

* Update transforms

* bugfix: batch_size parameter for DataModules remaining (#344)

* bugfix: batch_size for DataModules remaining

* Update sklearn datamodule tests

* Fix default_transforms. Keep internal for every data module

* fix typo on binary_mnist_datamodule

thanks @akihironitta

Co-authored-by: Akihiro Nitta <[email protected]>

Co-authored-by: Akihiro Nitta <[email protected]>

* Fix a typo/copy paste error (#415)

* Just a Typo (#413)

missing a ' at the end of dataset='stl10

* Remove unused arguments (#418)

* tests: Use cached datasets in LitMNIST and the doctests (#414)

* Use cached datasets

* Use cached datasets in doctests

* clear replay buffer after trajectory (#425)

* stale: update label

* bugfix: Add missing imports to pl_bolts/__init__.py (#430)

* Add missing imports

* Add missing imports

* Apply isort

* Fix CIFAR num_samples (#432)

* Add static type checker mypy to the tests and pre-commit hooks (#433)

* Add mypy check to GitHub Actions

* Run mypy on pl_bolts only

* Add mypy check to pre-commit

* Add an empty line at the end of files

* Update mypy config

* Update mypy config

* Update mypy config

* show

Co-authored-by: Jirka Borovec <[email protected]>

* missing logo

* Add type annotations to pl_bolts/__init__.py (#435)

* Run mypy on pl_bolts only

* Update mypy config

* Add type hints to pl_bolts/__init__.py

* mypy

Co-authored-by: Jirka Borovec <[email protected]>

* skip hanging (#437)

* Option to normalize latent interpolation images (#438)

* add option to normalize latent interpolation images

* linspace

* update

Co-authored-by: ananyahjha93 <[email protected]>

* 0.2.6rc1

* Warnings fix (#449)

* Revert "Merge pull request #1 from ganprad/warnings_fix"

This reverts commit 7c5aaf0.

* Fixes warning related np.integer in SklearnDataModule

Fixes this warning:
```
DeprecationWarning: Converting `np.integer` or `np.signedinteger` to a dtype is deprecated. The current result is `np.dtype(np.int_)` which is not strictly correct. Note that the result depends on the system. To ensure stable results use may want to use `np.int64` or `np.int32`
```

* Refactor datamodules/datasets (#338)

* Remove try: ... except: ...

* Fix experience_source

* Fix imagenet

* Fix kitti

* Fix sklearn

* Fix vocdetection

* Fix typo

* Remove duplicate

* Fix by flake8

* Add optional packages availability vars

* binary_mnist

* Use pl_bolts._SKLEARN_AVAILABLE

* Apply isort

* cifar10

* mnist

* cityscapes

* fashion mnist

* ssl_imagenet

* stl10

* cifar10

* dummy

* fix city

* fix stl10

* fix mnist

* ssl_amdim

* remove unused DataLoader and fix docs

* use from ... import ...

* fix pragma: no cover

* Fix forward reference in annotations

* binmnist

* Same order as imports

* Move vars from __init__ to utils/__init__

* Remove vars from __init__

* Update vars

* Apply isort

* update min requirements - PL 1.1.1 (#448)

* update min requirements

* rc0

* imports

* isort

* flake8

* 1.1.1

* flake8

* docs

* Add missing optional packages to `requirements/*.txt` (#450)

* Import matplotlib at the top

* Add missing optional packages

* Update wandb

* Add mypy to requirements

* update Isort (#457)

* Adding flags to datamodules (#388)

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Adding flags to datamodules

* Finishing up changes

* Fixing syntax error

* More syntax errors

* More

* Adding drop_last flag to sklearn test

* Adding drop_last flag to sklearn test

* Updating doc to reflect drop_last=False

* Cleaning up parameters and docstring

* Fixing syntax error

* Fixing documentation

* Hardcoding shuffle=False for val and test

* Add DCGAN module

* Small fixes

* Remove DataModules

* Update docs

* Update docs

* Update torchvision import

* Import gym as optional package to build docs successfully (#458)

* Import gym as optional package

* Fix import

* Apply isort

* bugfix: batch_size parameter for DataModules remaining (#344)

* bugfix: batch_size for DataModules remaining

* Update sklearn datamodule tests

* Fix default_transforms. Keep internal for every data module

* fix typo on binary_mnist_datamodule

thanks @akihironitta

Co-authored-by: Akihiro Nitta <[email protected]>

Co-authored-by: Akihiro Nitta <[email protected]>

* Option to normalize latent interpolation images (#438)

* add option to normalize latent interpolation images

* linspace

* update

Co-authored-by: ananyahjha93 <[email protected]>

* update min requirements - PL 1.1.1 (#448)

* update min requirements

* rc0

* imports

* isort

* flake8

* 1.1.1

* flake8

* docs

* Apply suggestions from code review

* Apply suggestions from code review

* Add docs

* Use LSUN instead of CIFAR10

* Update TensorboardGenerativeModelImageSampler

* Update docs with lsun

* Update test

* Revert TensorboardGenerativeModelImageSampler changes

* Remove ModelCheckpoint callback and nrow=5 arg

* Apply suggestions from code review

* Fix test_dcgan

* Apply yapf

* Apply suggestions from code review

Co-authored-by: Teddy Koker <[email protected]>
Co-authored-by: Sidhant Sundrani <[email protected]>
Co-authored-by: Akihiro Nitta <[email protected]>
Co-authored-by: Héctor Laria <[email protected]>
Co-authored-by: Bartol Karuza <[email protected]>
Co-authored-by: Happy Sugar Life <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: ananyahjha93 <[email protected]>
Co-authored-by: Pradeep Ganesan <[email protected]>
Co-authored-by: Brian Ko <[email protected]>
Co-authored-by: Christoph Clement <[email protected]>
@Borda Borda added this to the v0.3 milestone Jan 18, 2021
Labels
datamodule Anything related to datamodules
Development

Successfully merging this pull request may close these issues.

pin_memory should be true only if gpus are specified
4 participants
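
The linked issue title suggests tying `pin_memory` to GPU use, whereas this PR exposes it as an explicit flag that defaults to False. For comparison, a caller could derive the value themselves along these lines. This is only a sketch: `default_pin_memory` is a hypothetical helper, and it assumes a `gpus` argument that is None, an integer count (with -1 meaning "all available"), or a sequence of device indices.

```python
from typing import Optional, Sequence, Union


def default_pin_memory(gpus: Optional[Union[int, Sequence[int]]]) -> bool:
    """Hypothetical helper: pin host memory only when a GPU is requested."""
    if gpus is None:
        return False
    if isinstance(gpus, int):
        # 0 means CPU only; any other count (including -1) requests GPUs.
        return gpus != 0
    # A sequence of device indices: non-empty means GPUs are in use.
    return len(gpus) > 0


print(default_pin_memory(None))    # False
print(default_pin_memory(2))       # True
print(default_pin_memory([0, 1]))  # True
```

The result would then be passed as the datamodule's `pin_memory` argument, keeping the datamodule itself unaware of the trainer configuration.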