Adding types to some of datamodules #462

Merged
merged 81 commits into from
Jan 20, 2021

Conversation

briankosw
Contributor

@briankosw briankosw commented Dec 18, 2020

What does this PR do?

Adding types to pl_bolts.datamodules.

related to #434

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks

pep8speaks commented Dec 18, 2020

Hello @briankosw! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-01-20 12:45:02 UTC

@briankosw briankosw marked this pull request as draft December 18, 2020 11:03
@codecov

codecov bot commented Dec 18, 2020

Codecov Report

Merging #462 (af563a5) into master (6b2136b) will increase coverage by 0.02%.
The diff coverage is 84.61%.

@@            Coverage Diff             @@
##           master     #462      +/-   ##
==========================================
+ Coverage   77.53%   77.55%   +0.02%     
==========================================
  Files         114      114              
  Lines        6664     6671       +7     
==========================================
+ Hits         5167     5174       +7     
  Misses       1497     1497              
Flag        Coverage Δ
cpu         25.88% <76.92%> (+0.07%) ⬆️
pytest      25.88% <76.92%> (+0.07%) ⬆️
unittests   77.03% <84.61%> (+0.02%) ⬆️
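
The headline percentages in the coverage diff can be re-derived from the hit and line counts. The sketch below is only a sanity check of that arithmetic: the helper name `coverage_basis_points` is hypothetical, and the truncation (rather than rounding) to two decimals is inferred from the figures in the report, not taken from Codecov's documentation.

```python
def coverage_basis_points(hits: int, lines: int) -> int:
    """Return coverage in hundredths of a percent, truncated (not rounded)."""
    # Integer arithmetic avoids floating-point surprises at the second decimal.
    return hits * 10_000 // lines


master = coverage_basis_points(5167, 6664)  # counts from the "master" column
pr = coverage_basis_points(5174, 6671)      # counts from the "#462" column

print(f"master: {master / 100:.2f}%")
print(f"#462:   {pr / 100:.2f}%")
print(f"delta:  {(pr - master) / 100:+.2f}%")
```

Under that truncation assumption, the three printed values line up with the 77.53%, 77.55%, and +0.02% shown in the report.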

Impacted Files                                       Coverage Δ
pl_bolts/callbacks/byol_updates.py                   100.00% <ø> (ø)
pl_bolts/callbacks/variational.py                    95.91% <ø> (ø)
pl_bolts/datamodules/async_dataloader.py             21.21% <80.00%> (+1.21%) ⬆️
pl_bolts/datamodules/experience_source.py            95.90% <100.00%> (ø)
pl_bolts/models/self_supervised/amdim/datasets.py    56.89% <0.00%> (-0.74%) ⬇️
pl_bolts/datasets/__init__.py                        100.00% <0.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6b2136b...af563a5.

@briankosw
Contributor Author

A couple things that I need some guidance on:

  1. Modules async_dataloader, experience_source, vocdetection_datamodule, and sklearn_datamodule have a few places where I wasn't able to infer the types.
  2. Since torchvision is not a strict dependency, parameters and return values that rely on torchvision.transforms cannot have types. I considered adding Callable. What's the best practice here?

@Borda
Member

Borda commented Dec 20, 2020

@briankosw how is it going, still wip or ready to review?

@Borda Borda requested a review from akihironitta December 20, 2020 23:10
@briankosw briankosw marked this pull request as ready for review December 21, 2020 01:08
@briankosw
Contributor Author

@briankosw how is it going, still wip or ready to review?

Just need some guidance on the comment I've left above, but otherwise I'm done!

@briankosw
Contributor Author

briankosw commented Dec 22, 2020

A couple things that I need some guidance on:
1. Modules async_dataloader, experience_source, vocdetection_datamodule, and sklearn_datamodule have a few places where I wasn't able to infer the types.
2. Since torchvision is not a strict dependency, parameters and return values that rely on torchvision.transforms cannot have types. I considered adding Callable. What's the best practice here?

@akihironitta would love to get your feedback on these points!

@Borda
Member

Borda commented Jan 2, 2021

@akihironitta mind re-review so we can land this one :]

@Borda Borda requested a review from akihironitta January 2, 2021 21:25
@akihironitta
Contributor

@briankosw I'm sorry for replying very late.

  1. Since torchvision is not a strict dependency, parameters and return values that rely on torchvision.transforms cannot have types. I considered adding Callable. What's the best practice here?

For types using optional packages, can you have a look at: #444 (review)?

@akihironitta
Contributor

  1. Modules async_dataloader, experience_source, vocdetection_datamodule, and sklearn_datamodule have a few places where I wasn't able to infer the types.

For complicated ones, I guess, for now, we can just annotate them with Any as adding complicated types can lower readability of the code... @Borda Does it sound reasonable?

For functions returning self, let's use forward references as explained in PEP 484 (https://www.python.org/dev/peps/pep-0484/#forward-references). For example, in our case, we can add types to AsynchronousLoader.__iter__ as follows:

class AsynchronousLoader(object):
    ...
    def __iter__(self) -> "AsynchronousLoader":
        # We don't want to run the thread more than once
        # Start a new thread if we are at the beginning of a new epoch, and our current worker is dead
        if (not hasattr(self, 'worker') or not self.worker.is_alive()) and self.queue.empty() and self.idx == 0:
            self.worker = Thread(target=self.load_loop)
            self.worker.daemon = True
            self.worker.start()
        return self
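
A self-contained variant of the same two ideas (a forward reference for a method returning `self`, and `Any` where a precise type would hurt readability) might look like the sketch below. `ReplayingLoader` is a hypothetical stand-in, not the real `AsynchronousLoader`, and it leaves out the worker thread entirely.

```python
from typing import Any, Iterable, List


class ReplayingLoader:  # hypothetical stand-in, for illustration only
    def __init__(self, batches: Iterable[Any]) -> None:
        # Batches can be tensors, tuples, dicts, ...; `Any` keeps the signature readable.
        self.batches: List[Any] = list(batches)
        self.idx = 0

    def __iter__(self) -> "ReplayingLoader":
        # The class name is quoted because the class is not fully defined yet
        # at the point where this annotation is evaluated.
        self.idx = 0
        return self

    def __next__(self) -> Any:
        if self.idx >= len(self.batches):
            raise StopIteration
        batch = self.batches[self.idx]
        self.idx += 1
        return batch


loader = ReplayingLoader([1, "a", (2, 3)])
collected = list(loader)
```

The quoted annotation is stored as a plain string at runtime and resolved only by the type checker, so it adds no import or definition-order constraints.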

@akihironitta
Contributor

@Borda Sure.

@akihironitta akihironitta changed the title from "Adding types to datamodules" to "Adding types to datamodules [wip]" Jan 19, 2021
@akihironitta
Contributor

This is a lot more work than I expected. Hopefully, I'll finish this PR today...

@Borda
Member

Borda commented Jan 19, 2021

You can just pin the failing files: split the datamodules into particular files and let's ignore only the failing ones.

@@ -76,7 +78,7 @@ def __init__(
                 "You want to use transforms loaded from `torchvision` which is not installed yet."
             )

-        super().__init__(
+        super().__init__(  # type: ignore[misc]
Contributor

mypy will raise errors without these # type: ignore[misc] due to the bug reported in python/mypy#6799.
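
For readers unfamiliar with the comment syntax, the sketch below shows how an error-code-scoped ignore behaves in general. It does not reproduce the mypy bug referenced above; the function `describe` and the `[arg-type]` code are chosen purely for illustration.

```python
def describe(value: int) -> str:
    return f"value={value}"


# mypy would report an [arg-type] error here because a str is passed for an int.
# The trailing comment silences only that error code, and only on this line;
# any other kind of error on the same line would still be reported.
label = describe("seven")  # type: ignore[arg-type]
```

The comment has no effect at runtime, so `label` is still computed normally. Scoping the ignore to one code, as the diff does with `[misc]`, keeps unrelated errors on that line visible.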

@akihironitta akihironitta changed the title from "Adding types to datamodules [wip]" to "Adding types to datamodules" Jan 20, 2021
@akihironitta akihironitta changed the title from "Adding types to datamodules" to "Adding types to some of datamodules" Jan 20, 2021
@akihironitta
Contributor

akihironitta commented Jan 20, 2021

You can just pin the failing files: split the datamodules into particular files and let's ignore only the failing ones.

@Borda Some datamodule files are still included in the ignore list, but I think this PR is ready for review for now.

@akihironitta akihironitta self-requested a review January 20, 2021 13:14
@Borda Borda merged commit f0cc60b into Lightning-Universe:master Jan 20, 2021
Comment on lines +95 to +98
        # yapf: disable
        if (not hasattr(self, 'worker') or not self.worker.is_alive()) and self.queue.empty() and self.idx == 0:  # type: ignore[has-type]  # noqa: E501
            self.worker = Thread(target=self.load_loop)
        # yapf: enable
Contributor

Just for the record, I had a collision here between yapf and flake8. Lightning-AI/pytorch-lightning#5591

Contributor

@akihironitta akihironitta Jan 21, 2021

Without these ignores, applying yapf reformats the condition in a way that triggers flake8's W503 error (line break before binary operator).
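
To make the conflict concrete, the sketch below writes the same condition in the two line-break layouts. The function names and plain boolean parameters are hypothetical simplifications of the real attribute checks. The first layout is the one W503 reports, the second is the one W504 (line break after binary operator) reports; which of the two a project accepts depends on its flake8 configuration.

```python
from itertools import product


def start_worker_break_before(has_worker, worker_alive, queue_empty, idx):
    # Operators lead each continuation line: the layout W503 reports.
    return (
        (not has_worker or not worker_alive)
        and queue_empty
        and idx == 0
    )


def start_worker_break_after(has_worker, worker_alive, queue_empty, idx):
    # Operators trail each line: the layout W504 reports instead.
    return (
        (not has_worker or not worker_alive) and
        queue_empty and
        idx == 0
    )


# Both layouts are purely cosmetic: they agree on every input combination.
cases = list(product([False, True], [False, True], [False, True], [0, 1]))
agree = all(
    start_worker_break_before(*case) == start_worker_break_after(*case)
    for case in cases
)
```

Since the two forms are equivalent, the choice is purely stylistic, which is why keeping the condition on one line and disabling both tools for that line was a workable compromise.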

@briankosw
Contributor Author

Thank you for finishing this up and my apologies @akihironitta. It was a hectic week, so I couldn't get to it.

@akihironitta
Contributor

@briankosw No worries :)

Labels
datamodule (Anything related to datamodules), Priority (High priority task), refactoring
5 participants