[py2py3] Migrate WMCore/Cache/* #9818

Merged
merged 1 commit into from
Aug 20, 2020

Conversation

mapellidario
Member

@mapellidario mapellidario commented Jul 9, 2020

Fixes #9806

Status

Done

ACHTUNG! Some of the changes in this PR will conflict with #9762. They are likely to be dealt with in #9762, since it has a lower priority and will likely be merged after this PR is done.

Description

Futurize is run on WMCore.Cache.WMConfigCache and the modules that it depends on.

Manual changes are applied to fix what futurize is not able to migrate:

  1. Dictionaries
  2. bytes and unicode strings

File: JSONThunker.py

  • In python3 the meaning of str() in a return statement changed.
    The code now behaves in the same way as before, but the names
    used to check the string type have been updated to python3 nomenclature
  3. urllib

File: WMConfigCache.py, urllib.urlopen changed to urllib.request.urlopen by futurize.

  • urllib.request.urlopen behaves like urllib2.urlopen: it throws an
    exception in case of an HTTP error, instead of simply returning valid
    data containing the details of the error.
    ACHTUNG! This may require changes to code that uses
    WMCore.Cache.WMConfigCache.ConfigCache.addConfig
  • urllib.request.urlopen behaves like urllib2.urlopen: it requires the
    file: scheme to be specified in order to open a file on the local
    filesystem. This required changing the WMConfigCache_t unit tests.
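
A minimal sketch of the two behaviours described above, assuming Python 3 only and a POSIX filesystem (the PR itself goes through the future package to support both interpreters). The helper name read_url and the temporary file are illustrative and are not part of WMCore.

```python
import os
import tempfile
from pathlib import Path
from urllib.error import HTTPError
from urllib.request import urlopen


def read_url(url):
    """Return the body at `url`, or None if the server answers with an HTTP error."""
    try:
        with urlopen(url) as response:
            return response.read()
    except HTTPError as err:
        # urllib.request.urlopen raises on HTTP errors instead of returning the error page
        print("HTTP error %s for %s" % (err.code, url))
        return None


# Write a small local file to open through urlopen (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write("process = None\n")
    local_path = handle.name

# A bare filesystem path has no scheme, so urlopen rejects it with ValueError.
try:
    urlopen(local_path)
    bare_path_accepted = True
except ValueError:
    bare_path_accepted = False

# The same file opens once the file: scheme is given explicitly.
file_url = Path(local_path).as_uri()
content = read_url(file_url)
os.remove(local_path)
```

Callers that used to inspect the returned error page would instead need to handle HTTPError, as read_url does here.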

Migration plan

step 1

  • src/python/WMCore/WMException.py

step 2

  • src/python/WMCore/Lexicon.py
  • src/python/WMCore/Algorithms/Permissions.py
  • src/python/WMCore/Wrappers/JsonWrapper/JSONThunker.py
  • src/python/WMCore/Services/pycurl_manager.py
  • src/python/Utils/CertTools.py

step 3

  • src/python/WMCore/Services/Requests.py

step 4

  • src/python/WMCore/Database/CMSCouch.py
  • src/python/WMCore/GroupUser/Decorators.py

step 5

  • src/python/WMCore/GroupUser/CouchObject.py

step 6

  • src/python/WMCore/GroupUser/Group.py

step 7

  • src/python/WMCore/GroupUser/User.py
  • src/python/Utils/Patterns.py
  • src/python/WMCore/DataStructs/WMObject.py

step 8

  • src/python/WMCore/Cache/WMConfigCache.py

Is it backward compatible (if not, which system it affects?)

yes

External dependencies / deployment changes

requires the future package (python-future.org)

@mapellidario mapellidario changed the title Py2py3 wmconfigcache [py2py3] Migrate WMCore/Cache/* Jul 9, 2020
@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 9 warnings
    • 159 comments to review
  • Pycodestyle check: succeeded
    • 60 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10255/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
    • 1 changes in unstable tests
  • Pylint check: failed
    • 22 warnings and errors that must be fixed
    • 13 warnings
    • 279 comments to review
  • Pycodestyle check: succeeded
    • 87 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10256/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: failed
    • 77 new failures
    • 9 changes in unstable tests
  • Pylint check: failed
    • 27 warnings and errors that must be fixed
    • 14 warnings
    • 377 comments to review
  • Pycodestyle check: succeeded
    • 202 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10261/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: failed
    • 77 new failures
    • 9 changes in unstable tests
  • Pylint check: failed
    • 27 warnings and errors that must be fixed
    • 14 warnings
    • 377 comments to review
  • Pycodestyle check: succeeded
    • 202 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10262/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 26 warnings and errors that must be fixed
    • 14 warnings
    • 377 comments to review
  • Pycodestyle check: succeeded
    • 202 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10263/artifact/artifacts/PullRequestReport.html

@@ -145,7 +133,7 @@ def handleDictThunk(self, toThunk):
         toThunk = self.checkRecursion(toThunk)
         special = False
         tmpdict = {}
-        for k, v in toThunk.iteritems():
+        for k, v in iteritems(toThunk):
             if type(k) == type(int):
Member Author

Dear @amaltaro and @todor-ivanov,
I am a bit baffled by this line. It is comparing the type of a key with type(int), which is just type.

It seems that the type of the key is used to prepend a string to the key itself in a temporary dictionary. If the key is an int, we prepend _i:, and shortly after, if it is a float, we prepend _f:.
I would replace if type(k) == type(int): with if isinstance(k, int):.

Are my intuitions correct or do we really want to check if a key is of type type?
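
A quick way to check the claim about type(int), written as a small standalone sketch (the variable names are illustrative only and do not come from JSONThunker.py):

```python
# `int` is a class, and the type of a class is `type`.
int_class_type = type(int)

key = 1
original_check = type(key) == type(int)  # compares `int` with `type`
proposed_check = isinstance(key, int)    # asks whether `key` is an int

# Caveat: bool is a subclass of int, so isinstance also accepts True/False.
bool_matches = isinstance(True, int)
```

With an integer key the original comparison is False, while the isinstance form is True.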

Contributor

That's a trick question. According to Google, and to my understanding, here we are checking whether k is itself one of the basic data types or a class definition. The type of those objects is type.
isinstance accepts subclasses as well, while type only compares against the direct object type.

With that said, I have the feeling that this code:

elif type(k) == type(float):
...

will never evaluate to True: type(float) is also type, so whenever it matched, the if above (type(int)) would already have matched.

I'm not sure here either, but I think we could replace:

if type(k) == type(int):

by

isinstance(k, type) 

Some references that I found:
https://realpython.com/python-metaclasses/#type-and-class
and
https://note.nkmk.me/en/python-type-isinstance/#:~:text=source%3A%20type.py-,With%20isinstance(),an%20instance%20of%20any%20type.

Member Author

I would agree with your solution if there was a single check. However here we have something like

for k, v in iteritems(toThunk):
    if type(k) == type(int):
        tmpdict['_i:%s' % k] = self._thunk(v)
    elif type(k) == type(float):
        tmpdict['_f:%s' % k] = self._thunk(v)
    else:
        tmpdict[k] = self._thunk(v)

and it seems like the author was trying to record in the dictionary key some information about the key's original type, but erroneously wrapped the type being compared against in type().

If we use isinstance(k, type) we would match both the int and the float case, wouldn't we?

This being said, I also have the same feeling that those lines now never evaluate to True. git blame brought me back 10 years, to when this file was created. If in all these years this change to the keys has never actually been applied, should we even change it now?
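
To make the three candidate checks concrete, here is a simplified sketch of the loop with the type checks passed in as parameters. The function prefix_keys and the sample data are hypothetical, and the call to self._thunk(v) is left out. It illustrates the discussion only and is not the fix adopted in this PR.

```python
def prefix_keys(data, is_int, is_float):
    """Mimic the key-prefixing loop with pluggable type checks."""
    out = {}
    for k, v in data.items():
        if is_int(k):
            out['_i:%s' % k] = v
        elif is_float(k):
            out['_f:%s' % k] = v
        else:
            out[k] = v
    return out


sample = {1: 'a', 2.5: 'b', 'name': 'c'}

# As written today: both checks compare against `type`, so no key is prefixed.
as_written = prefix_keys(sample,
                         lambda k: type(k) == type(int),
                         lambda k: type(k) == type(float))

# With isinstance(k, int) / isinstance(k, float): int and float keys get prefixes.
with_isinstance = prefix_keys(sample,
                              lambda k: isinstance(k, int),
                              lambda k: isinstance(k, float))

# With isinstance(k, type): only keys that are classes match, and the
# first branch catches all of them, so the float branch is unreachable.
class_keys = prefix_keys({int: 'x', float: 'y'},
                         lambda k: isinstance(k, type),
                         lambda k: isinstance(k, type))
```

For ordinary int, float and string keys, the current code leaves every key unchanged, which matches the feeling that the first two branches never fire.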

Contributor

I think there is something very wrong with this code. However, as we discussed over Slack, let's keep it as it is, file a new GH issue reporting this problem and move on. In a couple of months from now, we can get back to it.

Member Author

I created the issue #9852 related to this.

Contributor

Thanks!

@cmsdmwmbot

Jenkins results:

  • Unit tests: failed
    • 9 new failures
  • Pylint check: failed
    • 26 warnings and errors that must be fixed
    • 14 warnings
    • 373 comments to review
  • Pycodestyle check: succeeded
    • 201 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10264/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 26 warnings and errors that must be fixed
    • 14 warnings
    • 374 comments to review
  • Pycodestyle check: succeeded
    • 201 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10266/artifact/artifacts/PullRequestReport.html

@mapellidario
Member Author

mapellidario commented Jul 10, 2020

Dear @amaltaro and @todor-ivanov , have a look at last commit's changes.

-        elif isinstance(toThunk, dict):
+        elif type(toThunk) is dict:

This fixes a lot of regressions found by jenkins in a previous commit, for example WMCore_t.WorkQueue_t.Policy_t.Start_t.ResubmitBlock_t/ResubmitBlockTest/testSingleChunksSplit.

The offending previous commit was something that I considered rather safe,
but it is clear that I have to reconsider. It applied some suggestions
from the Jenkins module that checks for regressions to python versions older
than 2.6.
As usual, it suggests using isinstance instead of comparing the type directly.

This can, and actually did, break things.

Further investigation revealed that when I run the test

python setup.py test --buildBotMode=true --reallyDeleteMyDatabaseAfterEveryTest=true --testCertainPath=test/python/WMCore_t/WorkQueue_t/Policy_t/Start_t/ResubmitBlock_t.py

with this version of JSONThunker (which raises an exception so that the captured stdout is shown)

        # src/python/WMCore/Wrappers/JsonWrapper/JSONThunker.py
        # L 250
        elif type(toThunk) is dict:
            print("CHECKPOINT", type(toThunk))
            temp = self.handleDictThunk(toThunk)
            raise Exception("dummy")
            return self.handleDictThunk(toThunk)

we get

-------------------- >> begin captured stdout << ---------------------
CHECKPOINT <type 'dict'>

--------------------- >> end captured stdout << ----------------------

While if we run it with

        # src/python/WMCore/Wrappers/JsonWrapper/JSONThunker.py
        # L 250
        elif isinstance(toThunk, dict):
            print("CHECKPOINT", type(toThunk))
            temp = self.handleDictThunk(toThunk)
            raise Exception("dummy")
            return self.handleDictThunk(toThunk)

we get instead

-------------------- >> begin captured stdout << ---------------------
CHECKPOINT <type 'dict'>
CHECKPOINT <class 'WMCore.Database.CMSCouch.Document'>
CHECKPOINT <type 'dict'>
CHECKPOINT <type 'dict'>

--------------------- >> end captured stdout << ----------------------

If I have to guess, it seems that since isinstance checks for inheritance too,
some objects that are derived from dict would pass this check. Since this
causes problems in some unit tests, I would not apply this change.
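
The difference between the two checks can be reproduced in isolation. This is a minimal sketch: the Document class below is a hypothetical stand-in for a dict subclass such as WMCore.Database.CMSCouch.Document, and the two helper functions are illustrative names only.

```python
class Document(dict):
    """Hypothetical stand-in for a class derived from dict."""


def is_plain_dict(obj):
    # Exact type match: subclasses of dict are rejected.
    return type(obj) is dict


def is_any_dict(obj):
    # Inheritance-aware match: subclasses of dict are accepted.
    return isinstance(obj, dict)


plain = {"a": 1}
doc = Document(a=1)

plain_results = (is_plain_dict(plain), is_any_dict(plain))
doc_results = (is_plain_dict(doc), is_any_dict(doc))
```

A plain dict passes both checks, while the subclass instance only passes the isinstance one, which is consistent with the extra CHECKPOINT line in the captured output above.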

This change was not suggested by futurize, and after seeing these errors I understand
why they did not include this in their fixers.

We applied some similar changes (type() is -> isinstance) in #9731. The unit tests did not catch any problem with those changes. There is still some chance that they could cause problems, even though my instinct suggests it is remote: we did not change code that manipulates such nasty objects, it looked like more traditional stuff. In any case, we should keep an eye on that.

@mapellidario mapellidario requested a review from amaltaro July 10, 2020 15:02
@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 26 warnings and errors that must be fixed
    • 15 warnings
    • 381 comments to review
  • Pycodestyle check: succeeded
    • 201 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10269/artifact/artifacts/PullRequestReport.html

@todor

todor commented Jul 10, 2020

@mapellidario Please note that the contributor to this repo is @todor-ivanov, you are tagging the wrong person. @todor is my account and I am not contributing to this repo, so you may want to remove me and not grant me any permissions.

@amaltaro
Contributor

I fixed the mention to Todor and myself in your comment above, Dario.
Apologies for the mistake, Todor.

@mapellidario
Member Author

Something strange is happening. I added a new commit to my feature branch, rebased my feature branch on dmwm/master, and pushed with --force to my remote, but GitHub does not seem to have noticed it.

If I open https://github.com/mapellidario/WMCore/tree/py2py3-wmconfigcache it says This branch is 14 commits ahead of dmwm:master., while here it says that there are 13 commits. I will wait until tomorrow, then I will try to squash commits and push again

@mapellidario mapellidario force-pushed the py2py3-wmconfigcache branch from d16c773 to 0bcfc70 Compare July 15, 2020 16:50
@amaltaro
Contributor

Yes Dario, try rebasing and squashing everything (that does not involve tests), and it might clear up. Honestly, I'm not sure I understand the problem you reported here.

@mapellidario
Member Author

I think that it simply took about 1h for GitHub to register my last force-push event in this PR. All's well that ends well.

mapellidario added a commit to mapellidario/WMCore that referenced this pull request Jul 16, 2020
WMCore.Cache and its deps have been made compatible with both
python2 and python3.

This is relative to the PR on github dmwm/WMCore dmwm#9818.

A combination of automatic changes (by futurize, full stage 1 and stage2)
and manual changes have been applied.

1. Dictionaries

- replaced mydict.$operator() with view$operator(mydict)

  This is a temporary change and will not be needed anymore
  after dropping python2 support. It is used for performance
  reasons when running python2 with idioms that are compatible
  with python3.

  Resources:
  * http://python-future.org/compatible_idioms.html#dictionaries
  * http://python-future.org/what_else.html#dict

- used the guidelines at
  http://python-future.org/compatible_idioms.html#dict-keys-values-items-as-a-list
  for the cases in which we want to check if a string is
  among the **keys** of a dictionary. no need to use viewkeys for performance reasons.
  we can just use `list(mydict)` or `for key in mydict:` in both py2 and py3.

2. bytes and unicode strings

File: JSONThunker.py

- In python3 the meaning of `str()` in a return statement changed.
  The code now behaves in the same way as before, but the names
  used to check the string type have been updated to python3 nomenclature

3. urllib

File: WMConfigCache.py, `urllib.urlopen` changed to `urllib.request.urlopen` by futurize.

- `urllib.request.urlopen` behaves as `urllib2.urlopen`, throwing an
  exception in case of HTTP error, instead of simply returning valid
  data containing the details of the error.
  ACHTUNG! This may require changes to code that use
  `WMCore.Cache.WMConfigCache.ConfigCache.addConfig`
- `urllib.request.urlopen` behaves as `urllib2.urlopen`, requiring to
  specify the `file:` scheme if you want to open a file in a local
  filesystem. This required changing `WMConfigCache_t` unit tests.
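
The dictionary idioms named in the commit message above can be sketched as follows. This assumes the viewitems helper from future.utils; the fallback definition is only there so that the snippet also runs on a Python 3 interpreter without the future package installed. The sample dictionary is hypothetical.

```python
try:
    # Helper provided by the `future` package (python-future.org)
    from future.utils import viewitems
except ImportError:
    # Assumption: plain Python 3, where dict.items() already returns a view
    def viewitems(mapping):
        return mapping.items()

config = {"owner": "alice", "group": "dmwm"}

# Instead of config.iteritems() (python2 only): iterate over a view.
pairs = sorted(viewitems(config))

# Membership tests and key listing need no helper on either interpreter.
has_owner = "owner" in config
key_list = list(config)
```

Once python2 support is dropped, viewitems(config) can simply become config.items().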
@mapellidario mapellidario force-pushed the py2py3-wmconfigcache branch from 0bcfc70 to 8c9be3d Compare July 16, 2020 12:26
@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 26 warnings and errors that must be fixed
    • 15 warnings
    • 386 comments to review
  • Pycodestyle check: succeeded
    • 202 comments to review
  • Python3 compatibility checks: succeeded
    • there are suggested fixes for newer python3 idioms

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10287/artifact/artifacts/PullRequestReport.html

@amaltaro amaltaro merged commit 2d7c084 into dmwm:master Aug 20, 2020
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 27, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 27, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 28, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 31, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 31, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 31, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 31, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request May 31, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request Jun 1, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request Jun 1, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request Jun 1, 2021
mapellidario added a commit to mapellidario/WMCore that referenced this pull request Jun 1, 2021
@mapellidario mapellidario deleted the py2py3-wmconfigcache branch September 18, 2024 14:47
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate WMCore/Cache/* to be compatible with both python2 and python3
4 participants