CouchDB 3.x: JobAccountant/CleanCouchPoller fail to authenticate to localhost CouchDB #11044

Closed
amaltaro opened this issue Mar 17, 2022 · 2 comments · Fixed by #11045

@amaltaro
Contributor

amaltaro commented Mar 17, 2022

Impact of the bug
WMAgent using CouchDB 3.x

Describe the bug
While testing a private build of WMAgent 1.5.7.patch2 + CouchDB 3.1, I noticed that JobAccountant fails to update the fwjr documents in the local CouchDB. The error message recorded in the logs is shown in [1].

Note that the agent configuration config.JobStateMachine.couchurl already has the url in the correct format.

How to reproduce it
None

Expected behavior
The component should set the Authorization header and pass it along with the HTTP request, so that updates to the CouchDB database succeed.
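For illustration only, here is a minimal sketch of how an Authorization header could be derived from a URL that carries credentials. It uses only the Python standard library. The helper name `split_couch_credentials` and the example URL are hypothetical; this is not the WMCore implementation and not necessarily what the fix in the PR does.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit


def split_couch_credentials(couch_url):
    """Hypothetical helper (not a WMCore API).

    Given a URL such as http://user:password@localhost:5984, return the
    URL without the credentials plus a headers dict carrying them as
    HTTP Basic authentication.
    """
    parts = urlsplit(couch_url)
    headers = {}
    if parts.username is not None:
        user = unquote(parts.username)
        password = unquote(parts.password or "")
        token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
        headers["Authorization"] = "Basic " + token.decode("ascii")
    # Rebuild the network location without the "user:password@" prefix.
    # Note: hostname is lowercased and IPv6 brackets are not restored here.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean_url, headers
```

The returned headers dict would then be merged into whatever headers the HTTP client already sends with each request.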

UPDATE: same for CleanCouchPoller thread and WorkQueueManager component.

Additional context and error message
[1]

2022-03-17 19:58:52,098:139926084232960:INFO:AccountantWorker:Job 1065 , handle successful job
2022-03-17 19:58:52,104:139926084232960:ERROR:AccountantWorker:Error occurred: associating log collect location, will try again
 CouchUnauthorisedError - reason: Unauthorized, data: {} result: b'{"error":"unauthorized","reason":"You are not a server admin."}\n'
2022-03-17 19:58:52,104:139926084232960:INFO:AccountantWorker:Handling /data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/install/wmagentpy3/JobCreator/JobCache/amaltaro_TC_PY3_ProdPsi_Mar2022_Val_220312_025605_5711/GenSimFull/GenSimFullMergeFEVTDEBUGoutput/Digi_2021noPU/Reco_2021noPU/Reco_2021noPUMergeDQMoutput/Reco_2021noPUMergeDQMoutputEndOfRunDQMHarvestMerged/Reco_2021noPUMergeDQMoutputMergedEndOfRunDQMHarvestLogCollect/JobCollection_146_0/job_1066/Report.0.pkl
2022-03-17 19:58:52,106:139926084232960:INFO:AccountantWorker:Job 1066 , handle successful job
@amaltaro
Contributor Author

UPDATE: the same problem affects CleanCouchPoller, a thread of the TaskArchiver component. The logs are:

2022-03-18 03:11:24,399:140694967277312:INFO:LogDB:<LogDB(url=https://alancc7-cloud1.cern.ch/couchdb/wmstats_logdb, identifier=vocms0261.cern.ch, agent=1)>
2022-03-18 03:11:24,411:140694967277312:ERROR:BaseWorkerThread:Error in event loop (2): <WMComponent.TaskArchiver.CleanCouchPoller.CleanCouchPoller object at 0x7ff61a13bc10> CouchUnauthorisedError - reason: Unauthorized, data: {} result: b'{"error":"unauthorized","reason":"You are not a server admin."}\n'
Backtrace:
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/WorkerThreads/BaseWorkerThread.py", line 161, in __call__
    self.initInThread(parameters)
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/WorkerThreads/BaseWorkerThread.py", line 147, in initInThread
    self.setup(parameters)
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMComponent/TaskArchiver/CleanCouchPoller.py", line 131, in setup
    self.jobsdatabase = self.jobCouchdb.connectDatabase("%s/jobs" % jobDBName)
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Database/CMSCouch.py", line 1008, in connectDatabase
    if create and dbname not in self.listDatabases():
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Database/CMSCouch.py", line 981, in listDatabases
    return self.get('/_all_dbs')
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Services/Requests.py", line 122, in get
    return self.makeRequest(uri, data, 'GET', incoming_headers,
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Database/CMSCouch.py", line 135, in makeRequest
    self.checkForCouchError(getattr(e, "status", None),
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Database/CMSCouch.py", line 151, in checkForCouchError
    raise CouchUnauthorisedError(reason, data, result)

I'm going to add a fix for this as well in the same PR fixing the originally reported issue.

@amaltaro changed the title from "CouchDB 3.x: JobAccountant fails to authenticate to localhost CouchDB" to "CouchDB 3.x: JobAccountant/CleanCouchPoller fails to authenticate to localhost CouchDB" on Mar 18, 2022
@amaltaro changed the title from "CouchDB 3.x: JobAccountant/CleanCouchPoller fails to authenticate to localhost CouchDB" to "CouchDB 3.x: JobAccountant/CleanCouchPoller fail to authenticate to localhost CouchDB" on Mar 18, 2022
@amaltaro
Contributor Author

And now WorkQueueManager also complains when writing out the spec file, likely because I reverted a one-line change from #11001, which didn't look like the correct fix anyway. Here is the component traceback (after changing an error log call to exception log level):

2022-03-18 04:41:45,467:139675677374208:INFO:WorkQueue:Got 6 elements matching the constraints
2022-03-18 04:41:45,474:139675677374208:ERROR:WorkQueueManagerWMBSFileFeeder:Error in wmbs inject loop: url=http://localhost:5984/workqueue/amaltaro_TaskChain_ProdMinBias_Mar2022_Val_220318_034116_1199/spec, code=401, reason=Unauthorized, headers={'Cache-Control': 'must-revalidate', 'Content-Length': '78', 'Content-Type': 'application/json', 'Date': 'Fri, 18 Mar 2022 03:41:45 GMT', 'Server': 'CouchDB/3.1.2 (Erlang OTP/22)', 'X-Couch-Request-ID': 'a205801ca0', 'X-CouchDB-Body-Time': '0'}, result=b'{"error":"unauthorized","reason":"You are not authorized to access this db."}\n'
Traceback (most recent call last):
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMComponent/WorkQueueManager/WorkQueueManagerWMBSFileFeeder.py", line 57, in algorithm
    previousWorkList = self.queue.getWork(resources, jobCounts,
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/WorkQueue/WorkQueue.py", line 333, in getWork
    wmspec = self.backend.getWMSpec(match['RequestName'])
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/WorkQueue/WorkQueueBackend.py", line 148, in getWMSpec
    wmspec.load(self.db['host'] + "/%s/%s/spec" % (self.db.name, name))
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/WMSpec/Persistency.py", line 78, in load
    data = request.makeRequest('', incoming_headers={"Accept": "*/*"}, decoder=False)
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Services/Requests.py", line 161, in makeRequest
    result, response = self.makeRequest_pycurl(uri, data, verb, headers)
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Services/Requests.py", line 178, in makeRequest_pycurl
    response, result = self.reqmgr.request(uri, data, headers, verb=verb,
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/Utils/PortForward.py", line 69, in portMangle
    return callFunc(callObj, url, *args, **kwargs)
  File "/data/srv/wmagent/v1.5.7.patch2-127c31293503ab5f4da2da887a06775a/sw.amaltaro/slc7_amd64_gcc630/cms/wmagentpy3/1.5.7.patch2-127c31293503ab5f4da2da887a06775a/lib/python3.8/site-packages/WMCore/Services/pycurl_manager.py", line 306, in request
    raise exc

When it's localhost, we either pass the credentials in the URL or pass them with the Authorization header.
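As a rough sketch of the first option (credentials in the URL), the snippet below embeds a username and password into a URL only when it points at a local host and does not already carry credentials. The function name `add_url_credentials`, the `LOCAL_HOSTS` set, and the example values are assumptions made for illustration; none of them come from WMCore.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Assumption: which host names count as "localhost" for this purpose.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def add_url_credentials(url, user, password):
    """Hypothetical helper: return url with user:password@ embedded.

    The URL is returned unchanged if it is not a local host or if it
    already contains credentials.
    """
    parts = urlsplit(url)
    if parts.hostname not in LOCAL_HOSTS or parts.username is not None:
        return url
    # Percent-encode so characters such as "@" or ":" in the credentials
    # cannot be confused with URL delimiters.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Whether the HTTP client honours credentials embedded in the URL depends on the client library, so the Authorization header may be the more portable of the two options.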
