Report message encoding error in JobCreator #9862

Closed
amaltaro opened this issue Aug 9, 2020 · 4 comments · Fixed by #9868

Comments

@amaltaro
Contributor

amaltaro commented Aug 9, 2020

Impact of the bug
WMAgents

Describe the bug
I've seen an issue with JobCreator while it was creating a job area. It looks like the WMException class is unable to encode the full error message raised from JobCreator, which masks the original exception/error in the component.

How to reproduce it
Not sure.

Expected behavior
Error messages should be properly encoded/decoded using the utf-8 codec. If needed, we can ignore and skip any byte/character that can't be encoded/decoded; it's only for error reporting anyway.
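
As a rough illustration of the "ignore and bypass" idea, the sketch below (Python 3 syntax; the sample byte string is made up for this example and is not taken from the agent logs) shows how the errors argument of bytes.decode changes the outcome when some bytes are not valid UTF-8.

```python
# Hypothetical sample: valid UTF-8 for U+2018, followed by a stray 0xff byte.
raw = b"cannot create directory \xe2\x80\x98/tmp/job\xff"

# Default ("strict") handling raises on the invalid byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" silently drops the bytes that cannot be decoded.
ignored = raw.decode("utf-8", "ignore")

# "replace" substitutes U+FFFD, so the loss stays visible in the report.
replaced = raw.decode("utf-8", "replace")

print(strict_failed)
print(repr(ignored))
print(repr(replaced))
```

Whether dropping or replacing undecodable bytes is preferable for error reporting is a judgment call; "ignore" is the handler already used in WMException.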

Additional context and error message
Exception from submit8, running the latest WMAgent tag 1.3.6:

2020-08-08 01:43:29,680:140258839922432:INFO:JobCreatorPoller:Found end in iteration over subscription 12124
2020-08-08 01:43:29,755:140258839922432:INFO:JobCreatorPoller:Retrieved 1 jobGroups from jobSplitter
2020-08-08 01:43:29,783:140258839922432:ERROR:CreateWorkArea:Error in creating directories: mkdir: cannot create directory ‘/storage/local/data1/cmsdataops/srv/wmagent/v1.3.6.patch4/install/wmagent/JobCreator/JobCache/cmsunified_ACDC0_task_JME-RunIISummer19UL16GENAPV-00004__v1_T_200808_062137_1106/JME-RunIISummer19UL16GENAPV-00004_0MergeRAWSIMoutput/JobCollection_78499_0/job_1124238’: File exists


2020-08-08 01:43:29,784:140258839922432:ERROR:JobCreatorPoller:Exception in processing wmbsJobGroup 78499
. Error: 'ascii' codec can't encode character u'\u2018' in position 62: ordinal not in range(128)
Traceback (most recent call last):
  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/JobCreatorPoller.py", line 208, in creatorProcess
    cache=False)
  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/CreateWorkArea.py", line 170, in processJobs
    self.createWorkArea(cache=cache)
  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/CreateWorkArea.py", line 273, in createWorkArea
    createDirectories(nameList)
  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/CreateWorkArea.py", line 42, in createDirectories
    raise CreateWorkAreaException(msg)
  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMCore/WMException.py", line 33, in __init__
    message = message.decode('utf-8', 'ignore')
  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/external/python/2.7.13-comp/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 62: ordinal not in range(128)
2020-08-08 01:43:29,884:140258839922432:ERROR:BaseWorkerThread:Error in worker algorithm (1):
Backtrace:
  <WMComponent.JobCreator.JobCreatorPoller.JobCreatorPoller object at 0x7f908e8688d0> <@========== WMException Start ==========@>
Exception Class: JobCreatorException
Message: Exception in processing wmbsJobGroup 78499
. Error: 'ascii' codec can't encode character u'\u2018' in position 62: ordinal not in range(128)
        ModuleName : WMComponent.JobCreator.JobCreatorPoller
        MethodName : creatorProcess
        ClassInstance : None
        FileName : /data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/JobCreatorPoller.py
        ClassName : None
        LineNumber : 233
        ErrorNr : 0
Traceback: 
  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/JobCreatorPoller.py", line 208, in creatorProcess
    cache=False)

  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/CreateWorkArea.py", line 170, in processJobs
    self.createWorkArea(cache=cache)

  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/CreateWorkArea.py", line 273, in createWorkArea
    createDirectories(nameList)

  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMComponent/JobCreator/CreateWorkArea.py", line 42, in createDirectories
    raise CreateWorkAreaException(msg)

  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/cms/wmagent/1.3.6.patch4/lib/python2.7/site-packages/WMCore/WMException.py", line 33, in __init__
    message = message.decode('utf-8', 'ignore')

  File "/data/srv/wmagent/v1.3.6.patch4/sw/slc7_amd64_gcc630/external/python/2.7.13-comp/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
@amaltaro
Contributor Author

amaltaro commented Aug 9, 2020

@mapellidario hi Dario! I haven't looked any further into this error message, but in case you know by heart the py2/py3 way to resolve it, please let us know.

@mapellidario
Member

mapellidario commented Aug 12, 2020

What I gathered from a first look at this (please correct me if I am wrong) is the following.

Prerequisites

  1. WMException initializes with
class WMException(exceptions.Exception):
    def __init__(self, message, errorNo=None, **data):
        self.name = str(self.__class__.__name__)
        if hasattr(message, "decode"):
            # Fix for the unicode encoding issue, see #8056 and #8403
            # interprets this string using utf-8 codec and ignoring any errors
            message = message.decode('utf-8', 'ignore')
  2. In Python 3, the str type (Unicode code points) has an encode attribute and the bytes type has a decode attribute.
  3. In Python 2, both the unicode type (Unicode code points) and the str type (bytes) have both encode and decode attributes.
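
The attribute check from the prerequisites can be tried directly. This is a small sketch in Python 3 syntax with a made-up message; the results noted in the comments are for Python 3, and according to the points above the first check would return True on Python 2.

```python
# Hypothetical message containing U+2018 / U+2019, as text and as UTF-8 bytes.
text = u"mkdir: cannot create directory \u2018/tmp/job\u2019"
raw = text.encode("utf-8")

# Attribute-based check: on Python 3 only bytes objects have "decode".
text_has_decode = hasattr(text, "decode")
raw_has_decode = hasattr(raw, "decode")

# Type-based check: distinguishes bytes from text without relying on attributes.
text_is_bytes = isinstance(text, bytes)
raw_is_bytes = isinstance(raw, bytes)

print(text_has_decode, raw_has_decode)
print(text_is_bytes, raw_is_bytes)
```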

What happens here:

  1. There is the error 2020-08-08 01:43:29,783:140258839922432:ERROR:CreateWorkArea:Error in creating directories: mkdir: cannot create directory ‘/storage/local/data1/cmsdataops/srv/wmagent/v1.3.6.patch4/install/wmagent/JobCreator/JobCache/cmsunified_ACDC0_task_JME-RunIISummer19UL16GENAPV-00004__v1_T_200808_062137_1106/JME-RunIISummer19UL16GENAPV-00004_0MergeRAWSIMoutput/JobCollection_78499_0/job_1124238’: File exists, which we would like to pass to WMException.
  2. This error string contains the non-ASCII character ‘, which is U+2018, LEFT SINGLE QUOTATION MARK.
  3. This error message is passed to WMException's __init__ as the message argument, as a unicode string.
  4. hasattr(message, "decode") returns True in python2
  5. message = message.decode('utf-8', 'ignore') tries to decode a unicode string, which is already "decoded". In this case Python 2 implicitly encodes message before decoding it. Implicit encoding/decoding uses the default system encoding (the one returned by sys.getdefaultencoding()), which is usually ascii in Python 2. The implicit encoding fails because U+2018 is not an ASCII character.
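
The sequence above can be mimicked in Python 3 by writing the implicit step out by hand. This is only a sketch: py2_style_decode is a hypothetical helper, the message is shortened, and the default encoding is assumed to be ascii as described in step 5.

```python
# Shortened, made-up version of the mkdir error message.
message = u"mkdir: cannot create directory \u2018/tmp/job_1\u2019: File exists"


def py2_style_decode(text, default_encoding="ascii"):
    """Mimic what Python 2 does for unicode.decode('utf-8', 'ignore')."""
    # Step 1 (implicit in Python 2): encode the text with the default encoding.
    # The 'ignore' handler is NOT applied here, so non-ASCII text raises.
    raw = text.encode(default_encoding)
    # Step 2 (the explicit call): decode the resulting bytes as UTF-8.
    return raw.decode("utf-8", "ignore")


try:
    py2_style_decode(message)
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(error)
```

This is consistent with the traceback reporting a UnicodeEncodeError from inside a decode call: the 'ignore' handler only covers the decoding step, not the encoding that precedes it.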

Solution: Instead of checking if a variable has the attribute decode, we could simply check its type:

+ from builtins import bytes # lets Python 2 code refer to byte strings by the type name "bytes", as in Python 3

- if hasattr(message, "decode"):
+ if type(message) == bytes:

This change would make WMException depend on the Python future package.
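
A minimal sketch of the type-based check, written so that it runs on Python 3. SketchException is a hypothetical stand-in, not the real WMException, and it uses isinstance rather than an exact type comparison so that subclasses of bytes are handled too; that is a choice made for this sketch, not something stated in this thread.

```python
from builtins import bytes  # stdlib on Python 3; supplied by "future" on Python 2


class SketchException(Exception):
    """Hypothetical stand-in for WMException, reduced to message handling."""

    def __init__(self, message):
        if isinstance(message, bytes):
            # Only real byte strings are decoded; undecodable bytes are dropped.
            message = message.decode("utf-8", "ignore")
        # Text (unicode) messages are stored untouched.
        self.message = message
        super(SketchException, self).__init__(message)


text = u"cannot create directory \u2018/tmp/job\u2019"
from_text = SketchException(text)
from_bytes = SketchException(text.encode("utf-8"))

# Both inputs end up as the same text message.
print(from_text.message == from_bytes.message)
```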

I can open a PR with this change, but I do not know if this error can be replicated in a unit test, so I would need to ask you to run some tests.

P.S.: if hasattr(message, "decode"): would have been perfectly fine if we needed to support Python 3 only.

@amaltaro
Contributor Author

Dario, I think that's a good description of the problem and possible solution.

Can I also suggest that you create a unit test here:
https://github.com/dmwm/WMCore/blob/master/test/python/WMCore_t/WMException_t.py

and try to reproduce this problem with the current code? That will give us more confidence in the proposed solution.
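
For illustration only, a test along these lines might look like the sketch below. It exercises a hypothetical stand-in class (SketchException) rather than the real WMException, uses a made-up message, and is written for Python 3; the class and test names are assumptions and are not taken from WMException_t.py.

```python
import unittest


class SketchException(Exception):
    """Hypothetical stand-in for WMException (not the real class)."""

    def __init__(self, message):
        if isinstance(message, bytes):
            message = message.decode("utf-8", "ignore")
        self.message = message
        super(SketchException, self).__init__(message)


class SketchExceptionTest(unittest.TestCase):
    def testAsciiMessage(self):
        exception = SketchException("plain ascii message")
        self.assertEqual(exception.message, u"plain ascii message")

    def testNonAsciiMessage(self):
        text = u"cannot create directory \u2018/tmp/job\u2019: File exists"
        # Both the text form and its UTF-8 encoded form should be accepted.
        for message in (text, text.encode("utf-8")):
            exception = SketchException(message)
            self.assertEqual(exception.message, text)


# Run the two tests programmatically and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SketchExceptionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```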

@mapellidario
Member

This is a very nice idea, indeed! I added a unit test in this commit, which has been added to the PR as well.

I manually checked that the test fails if I revert the change: with if hasattr(message, "decode"): (instead of the new if type(message) == bytes:), the new unit test fails with

[dmwm WMCore]$ python setup.py test --buildBotMode=true --reallyDeleteMyDatabaseAfterEveryTest=true --testCertainPath=test/python/WMCore_t/WMException_t.py
running test
Using the tests below: test/python/WMCore_t/WMException_t.py
#### WE ARE DELETING YOUR DATABASE. 3 SECONDS TO CANCEL ####
#### buildbotmode is true
We are going to trash databases after every test
### We are in buildbot mode ###
----------------------------------------------------------------------
Ran 2 tests in 0.000s

OK
path lists is ['test/python/WMCore_t/WMException_t.py']
Out of 2 cases, we will run 2
#1 create an exception and do some tests. ... ok
#2 create an exception with non-ascii characters and do some tests. ... ERROR

======================================================================
ERROR: create an exception with non-ascii characters and do some tests.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/dmwm/wmcore_unittest/WMCore/test/python/WMCore_t/WMException_t.py", line 60, in testExceptionUnicode
    self.logger.debug("String version of exception: " + str(exception))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 147-152: ordinal not in range(128)

----------------------------------------------------------------------
XML: /home/dmwm/wmcore_unittest/WMCore/nosetests.xml
----------------------------------------------------------------------
Ran 2 tests in 0.053s

FAILED (errors=1)
Testing complete, there are now 1 threads

mapellidario added a commit to mapellidario/WMCore that referenced this issue Aug 20, 2020
This solves the issue dmwm#9862,
The rationale for this change is described in the
issue comments