Deployment: Create a single script to create universal virtual environment for all WMCore services. #10979

Open
todor-ivanov opened this issue Feb 4, 2022 · 12 comments · May be fixed by #10980
Assignees
Labels
deployment Issue related to deployment of the services Low Priority Medium Priority New Feature Technical Debt Used to track issues that address technical needs internal to WM team

Comments

@todor-ivanov
Contributor

Impact of the new feature
Deployment process of all WMCore Central Services

Is your feature request related to a problem? Please describe.
In light of the work on giving up the RPM-based service deployment, a strong need emerges for a single script that creates a universal virtual environment for all WMCore services. It should be an addition and complementary to the K8 deployment procedures rather than a substitute. It should also facilitate the automation of the CI/CD process discussed in the following (non-exhaustive) list of issues: #10921 #10935 #10911. Another strong plus of such an environment would be the unification of the development and runtime environments for the project.

Describe the solution you'd like
A single script that creates the Python virtual environment for the project and sets up all the needed packages and service dependencies inside it, so that it can be used for both development and runtime.

Describe alternatives you've considered
Do nothing and think about it when we have completely gotten rid of the RPM-based deployment process. Also, leave people to find their own way to set up a proper development environment that matches (at least partially) the runtime environment.

Additional context
This issue is created mostly to follow the automation of the virtual environment setup explained here: [1], which I have been using for a long time now and which eases my work considerably.

[1]
https://github.com/dmwm/WMCore/wiki/setup-wmcore-virtual-environment

@todor-ivanov todor-ivanov added New Feature Medium Priority deployment Issue related to deployment of the services labels Feb 4, 2022
@todor-ivanov todor-ivanov self-assigned this Feb 4, 2022
@todor-ivanov
Contributor Author

FYI @vkuznet @amaltaro

@todor-ivanov todor-ivanov linked a pull request Feb 4, 2022 that will close this issue
@vkuznet
Contributor

vkuznet commented Feb 9, 2022

At today's meeting James pointed out that CMS is migrating to support and run on various OSes, see slide 4. If CMS/CERN shifts away from CC as the official Linux OS and starts using different OSes/Linux flavors, it seems to me that we must move to venv, since we may not be in control of which package manager is used in the different Linux flavors. I consider this yet another requirement from CMS computing.

@todor-ivanov
Contributor Author

todor-ivanov commented Feb 25, 2022

Just for logging purposes, I am pasting some of my progress here:

I did manage to run the reqmgr2 from the wmcore pypi package and got the following error in the logs: [1]. Even progressing so far was not an easy task.

  • First step was to create the virtual env, activate it and deploy the wmcore package inside it from the pypi test index:
$ python3 -m venv WMCoreTest.venv3
$ source WMCoreTest.venv3/bin/activate
(WMCoreTest.venv3) $  pip install --index-url https://test.pypi.org/simple/ wmcore
  • After that I had to manually recreate the whole structure of this part of the deployment tree: /data/.... from a previously deployed service:
 tree -L 2 ~/WMCoreTest.venv3/data/srv/
├── current
│   ├── apps
│   ├── auth
│   ├── bin
│   └── config
├── enabled
│   └── reqmgr2
├── logs
│   └── reqmgr2
└── state
    └── reqmgr2
  • I also had to run the frontend from an old VM based deployment.
  • The command I used for starting reqmgr2 was:
$ wmc-httpd -r -d ~/WMCoreTest.venv3/data/srv/state/reqmgr2 -l "|rotatelogs ~/WMCoreTest.venv3/data/srv/logs/reqmgr2/reqmgr2-%Y%m%d-tivanov-unit02.log 86400" ~/WMCoreTest.venv3/data/srv/current/config/reqmgr2/config.py
  • Even though the requirements.txt file was indeed parsed during the package building and uploading process, I could not see it in the finally deployed package, nor were the requirements themselves satisfied. So I had to satisfy them manually with:
pip install -r requirements.txt
  • I had to also manually satisfy the Secrets dependency:
cp -rv /data/srv/HG2201a/auth/reqmgr2/ReqMgr2Secrets.py data/srv/current/auth/reqmgr2/
cp -rv /data/srv/HG2201a/auth/reqmgr2/dmwm-service-{cert,key}.pem data/srv/current/auth/reqmgr2/
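The directory skeleton from the tree listing above could be recreated programmatically rather than by hand. The following is only a minimal sketch: the function name make_deploy_tree is hypothetical, the layout is copied from the listing and not from any official deployment script, and every entry is assumed to be a directory (the listing alone does not show whether enabled/reqmgr2 is a directory or a plain file).

```python
import os


def make_deploy_tree(root, service):
    """Create the data/srv skeleton from the tree listing for one service.

    Assumption: all entries are directories; adjust if some are marker files.
    """
    srv = os.path.join(root, "data", "srv")
    subdirs = [
        os.path.join("current", "apps"),
        os.path.join("current", "auth"),
        os.path.join("current", "bin"),
        os.path.join("current", "config"),
        os.path.join("enabled", service),
        os.path.join("logs", service),
        os.path.join("state", service),
    ]
    created = []
    for sub in subdirs:
        path = os.path.join(srv, sub)
        os.makedirs(path, exist_ok=True)  # idempotent: safe to run twice
        created.append(path)
    return created
```

Called as make_deploy_tree(os.path.expanduser("~/WMCoreTest.venv3"), "reqmgr2"), it would produce the same layout as the listing. Secrets and certificates still have to be copied separately.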

[1]

[25/Feb/2022:14:51:16]  WATCHDOG: starting server daemon (pid 2250)
[25/Feb/2022:14:51:16]  INFO: final CherryPy configuration: {'engine.autoreload.on': False,
...
data.wmstats_url = 'https://tivanov-unit02.cern.ch/wmstatsserver'
data.couch_config_cache_db = 'reqmgr_config_cache'
data.couch_wmstats_db = 'wmstats'
data.couch_workqueue_db = 'workqueue'
data.couch_reqmgr_aux_db = 'reqmgr_auxiliary'
data.dbs_url = 'https://cmsweb-testbed.cern.ch/dbs/int/global/DBSWriter'
data.central_logdb_url = 'https://tivanov-unit02.cern.ch/couchdb/wmstats_logdb'

[25/Feb/2022:14:51:17]  Creating CouchDB connection instances to CouchDB host: 'https://tivanov-unit02.cern.ch/couchdb' ...
[25/Feb/2022:14:51:17]  Creating CouchDB connection to 'reqmgr_workload_cache' ... 
...
[25/Feb/2022:14:51:17]  Creating CouchDB connection to 'reqmgr_auxiliary' ... 
[25/Feb/2022:14:51:17]  INFO: loading ui into /reqmgr2
[25/Feb/2022:14:51:17] TemplatedPage ### ReqMgr uses JINJA templates
[25/Feb/2022:14:51:17] TemplatedPage Templates are located in: ~/WMCoreTest.venv3/data/srv/current/apps/reqmgr2/data/html/ReqMgr/templates
[25/Feb/2022:14:51:17]  INFO: starting server in ~/WMCoreTest.venv3/data/srv/state/reqmgr2
[25/Feb/2022:14:51:17] ENGINE Bus STARTING
[25/Feb/2022:14:51:17] ENGINE Serving on http://0.0.0.0:8246
[25/Feb/2022:14:51:17] ENGINE Bus STARTED
Exception instantiating couch services for :
 url = https://tivanov-unit02.cern.ch/couchdb
 database = acdcserver
 Exception: (35, 'Peer does not recognize and trust the CA that issued your certificate.')

[25/Feb/2022:14:51:45]  SERVER OTHER ERROR pycurl.error 1fef1faed33ddeca20f103b9a4a2339a ((35, 'Peer does not recognize and trust the CA that issued your certificate.'))
[25/Feb/2022:14:51:45]    Traceback (most recent call last):
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/REST/Server.py", line 728, in default
[25/Feb/2022:14:51:45]        return self._call(RESTArgs(list(args), kwargs))
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/REST/Server.py", line 809, in _call
[25/Feb/2022:14:51:45]        obj = apiobj['call'](*safe.args, **safe.kwargs)
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/ReqMgr/Service/Auxiliary.py", line 67, in get
[25/Feb/2022:14:51:45]        reqmgr_db_info = self.reqmgr_db.info()
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/Database/CMSCouch.py", line 634, in info
[25/Feb/2022:14:51:45]        return self.get('/%s/' % self.name)
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/Services/Requests.py", line 121, in get
[25/Feb/2022:14:51:45]        encode, decode, contentType)
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/Database/CMSCouch.py", line 132, in makeRequest
[25/Feb/2022:14:51:45]        encode, decode, contentType)
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/Services/Requests.py", line 159, in makeRequest
[25/Feb/2022:14:51:45]        result, response = self.makeRequest_pycurl(uri, data, verb, headers)
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/Services/Requests.py", line 177, in makeRequest_pycurl
[25/Feb/2022:14:51:45]        ckey=ckey, cert=cert, capath=capath)
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/Utils/PortForward.py", line 69, in portMangle
[25/Feb/2022:14:51:45]        return callFunc(callObj, url, *args, **kwargs)
[25/Feb/2022:14:51:45]      File "/afs/cern.ch/user/t/tivanov/WMCoreDev.d/WMCoreTest.venv3/lib64/python3.6/site-packages/WMCore/Services/pycurl_manager.py", line 283, in request
[25/Feb/2022:14:51:45]        curl.perform()
[25/Feb/2022:14:51:45]    pycurl.error: (35, 'Peer does not recognize and trust the CA that issued your certificate.')
[25/Feb/2022:14:51:45] tivanov-unit02.cern.ch 127.0.0.1 "GET /reqmgr2/data/about HTTP/1.1" 500 Internal Server Error [data: 338 in 742 out 25096 us ] [auth: OK "" "" ] [ref: "" "ServerMonitor/2.0" ]

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

@todor-ivanov thanks for sharing the results. Here are my observations:

  • the directory structure you manually created is an artefact of the CMSWEB deployment. It comes from hard-coded paths in the configuration, which I mentioned in the CI/CD document. Therefore, we can either keep it or start (via a new issue) converting the hard-coded config areas to environment-based ones
  • the auth parts (both in reqmgr2 and in your CouchDB error) are not required for a dev environment deployment. As such they should be made optional and configurable, e.g. all services can be started over plain HTTP without requiring certificates. This is valid both for WMCore services and for CouchDB. The communication between them can be done via HTTP rather than HTTPS. Therefore, we need to decouple authentication from deployment, i.e. in a dev environment we can deploy things as HTTP based, while in production we can use authentication and HTTPS.
  • if you do need certificates, you should add a procedure for generating self-signed trusted certificates. You may use this manual to see how it should be done properly

I hope this will help you to move forward.

@todor-ivanov
Contributor Author

Thanks @vkuznet. I'll try to minimize the hard-coded paths and move them either to the environment setup or to script parameters. Either way, I agree with you that we need to have them relative to the root of the deployment tree, rather than hardwired to a global path as they are right now.
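One way to make the paths relative to the deployment root is to resolve them against a single environment variable. This is a sketch under stated assumptions: the variable name WMCORE_ROOT and the helper resolve_deploy_path are hypothetical and do not exist in WMCore. The fallback /data mirrors the global path used by the current deployment.

```python
import os


def resolve_deploy_path(relative, environ=None, default_root="/data"):
    """Resolve a path relative to the deployment root.

    WMCORE_ROOT is a hypothetical variable name used only in this sketch.
    Note: an absolute `relative` argument would override the root entirely.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("WMCORE_ROOT", default_root)
    return os.path.normpath(os.path.join(root, relative))
```

A configuration file could then use resolve_deploy_path("srv/state/reqmgr2") instead of a hardwired /data/srv/state/reqmgr2, so the same file works both inside a virtual environment and in a global install.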

@todor-ivanov
Contributor Author

todor-ivanov commented Mar 11, 2022

So here is some more info:
After an enormous amount of manual changes, which I had to make in the following 3 cmsweb deployment scripts:

I did manage to achieve the following:

  • I deployed the wmcore package through pip inside a previously created virtual environment.
  • I deployed the whole directory structure for one single package (reqmgr2ms) as provided by those scripts above, but inside that same virtual environment. I skipped all the cmspkg parts to avoid downloading and installing any RPMs from the comp repo.
  • I deployed a fresh "base" virtual environment with the wmcore pip package only, so I can compare them both.
  • Finally, I compared the two trees. And here is the excerpt of what all the cmsweb deployment scripts add as a configuration structure to the base installation path (we are talking about a single package here - reqmgr2ms and dependencies):
--- ../WMCoreTest.venv3/.tree	2022-03-11 14:59:28.000000001 +0100
+++ ../WMCoreTest.venv3.base/.tree	2022-03-11 14:31:48.000000001 +0100
@@ -5548,57 +5548,6 @@
 │                   └── RESTServerSetup.py
 ├── lib64 -> lib
 ├── pip-selfcheck.json
-├── pyvenv.cfg
-└── srv
-    ├── auth
-    ├── current -> HG2202e
-    ├── enabled
-    │   ├── mongodb
-    │   └── reqmgr2ms
-    ├── HG2202e
-    │   ├── apps -> apps.sw
-    │   ├── apps.sw
-    │   │   ├── mongo -> ../sw/slc7_amd64_gcc630/external/mongo/
-    │   │   └── reqmgr2ms -> ../sw/slc7_amd64_gcc630/cms/reqmgr2ms/
-    │   ├── auth
-    │   │   ├── reqmgr2ms
-    │   │   │   ├── dmwm-service-cert.pem
-    │   │   │   ├── dmwm-service-key.pem
-    │   │   │   └── ReqMgr2MSSecrets.py
-    │   │   └── wmcore-auth
-    │   │       └── header-auth-key
-    │   ├── bin
-    │   ├── config
-    │   │   ├── backend
-    │   │   ├── mongodb
-    │   │   │   ├── deploy
-    │   │   │   ├── manage
-    │   │   │   └── monitoring.ini
-    │   │   ├── reqmgr2ms
-    │   │   │   ├── config-monitor.py
-    │   │   │   ├── config-output.py
-    │   │   │   ├── config-ruleCleaner.py
-    │   │   │   ├── config-transferor.py
-    │   │   │   ├── config-unmerged.py
-    │   │   │   ├── deploy
-    │   │   │   ├── etc
-    │   │   │   │   └── rucio.cfg
-    │   │   │   ├── manage
-    │   │   │   ├── monitoring-monitor.ini
-    │   │   │   ├── monitoring-output.ini
-    │   │   │   ├── monitoring-ruleCleaner.ini
-    │   │   │   ├── monitoring-transferor.ini
-    │   │   │   └── monitoring-unmerged.ini
-    │   │   └── wmcore-auth
-    │   │       └── deploy
-    │   └── sw
-    ├── logs
-    │   ├── mongodb
-    │   └── reqmgr2ms
-    └── state
-        ├── mongodb
-        │   └── db
-        └── reqmgr2ms
-            └── tmp
+└── pyvenv.cfg
 
-894 directories, 4707 files
+868 directories, 4682 files
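The same comparison can be done without saving and diffing tree output. The sketch below is not what was used to produce the diff above; the helper names list_tree and only_in_first are hypothetical. It walks both trees and reports the paths that exist only in the first one.

```python
import os


def list_tree(root):
    """Return the set of all paths under root, relative to root."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries.add(os.path.relpath(full, root))
    return entries


def only_in_first(first, second):
    """Sorted paths present under `first` but missing under `second`."""
    return sorted(list_tree(first) - list_tree(second))
```

os.walk does not descend into symlinked directories by default, so links such as current -> HG2202e are reported as single entries rather than followed.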

@todor-ivanov
Contributor Author

Looking at the tree diff now, I just found out that one very important file is missing.

This one: srv/current/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/etc/profile.d/init.sh ,

which in the current setup is crucial for running the service. Its contents (from a VM based install) are:

cat /data/srv/current/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/etc/profile.d/init.sh 
if [ -f /data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/etc/profile.d/dependencies-setup.sh ]; then . /data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/etc/profile.d/dependencies-setup.sh; fi
REQMGR2MS_ROOT="/data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5"
REQMGR2MS_VERSION="1.0.1.pre5"
REQMGR2MS_REVISION="1"
REQMGR2MS_CATEGORY="cms"
[ ! -d /data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/bin ] || export PATH="/data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/bin${PATH:+:$PATH}";
[ ! -d /data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/lib ] || export LD_LIBRARY_PATH="/data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}";
[ ! -d /data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/xbin ] || export PATH="/data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/xbin${PATH:+:$PATH}";
[ ! -d /data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/${PYTHON_LIB_SITE_PACKAGES} ] || export PYTHONPATH="/data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/${PYTHON_LIB_SITE_PACKAGES}${PYTHONPATH:+:$PYTHONPATH}";
[ ! -d /data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/x${PYTHON_LIB_SITE_PACKAGES} ] || export PYTHONPATH="/data/srv/HG2203c/sw/slc7_amd64_gcc630/cms/reqmgr2ms/1.0.1.pre5/x${PYTHON_LIB_SITE_PACKAGES}${PYTHONPATH:+:$PYTHONPATH}";

but we should definitely look into setting this up through the standard mechanisms of the virtual environment setup. One thing I am completely sure won't be needed any more is the mangling of PYTHONPATH, because from inside the virtual environment any WMCore imports work perfectly fine:

(WMCoreTest.venv3) [...@... WMCoreTest.venv3]$ python
Python 3.6.8 (default, Nov 16 2020, 16:55:22) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from WMCore.MicroService.MSUnmerged.MSUnmergedRSE import MSUnmergedRSE
>>> 
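The interactive check above can be turned into a small startup guard. This is a minimal sketch with hypothetical helper names: it tests whether the interpreter runs inside a venv-created environment and whether a top-level package is importable without any PYTHONPATH changes.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def is_importable(module_name):
    """True when a top-level module can be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None
```

A manage script could call is_importable("WMCore") after activation and fail early with a clear message instead of relying on init.sh to extend PYTHONPATH.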

@klannon klannon added the Technical Debt Used to track issues that address technical needs internal to WM team label Apr 12, 2022
@todor-ivanov
Contributor Author

todor-ivanov commented Apr 20, 2022

Logging a major problem I've hit while dealing with package dependencies.

After we have the fix for including all the dependencies in the wmcore meta package here (thanks Erik) we are now facing another error [1]. This one was somewhat expected, because we know the gfal2-python package still depends on system packages like Boost and gfal2, gfal2-devel, gfal2-plugins. I did try to resolve it as it was done before and well explained here: [2] and here: [3]:

$ sudo yum-config-manager --add-repo https://dmc-repo.web.cern.ch/dmc-repo/dmc-el7.repo
$ sudo yum-config-manager --enable dmc-el7/x86_64

$ yum install glib2 glib2-devel gfal2 gfal2-devel
$ sudo yum install gfal2-plugin-gridftp gfal2-plugin-file gfal2-plugin-http gfal2-plugin-srm gfal2-plugin-xrootd

After checking, all the system package versions I have are indeed the latest:

$ rpm -qa | grep ^gfal2
gfal2-plugin-file-2.20.5-1.el7.cern.x86_64
gfal2-util-scripts-1.7.1-1.el7.cern.noarch
gfal2-plugin-xrootd-2.20.5-1.el7.cern.x86_64
gfal2-devel-2.20.5-1.el7.cern.x86_64
gfal2-plugin-srm-2.20.5-1.el7.cern.x86_64
gfal2-plugin-http-2.20.5-1.el7.cern.x86_64
gfal2-plugin-gridftp-2.20.5-1.el7.cern.x86_64
gfal2-2.20.5-1.el7.cern.x86_64

But I still fail to tackle the problem. Maybe I am missing something here. Alan, if you have any input, it would be very welcome.

The final result is that, even though I try to deploy the latest wmcore version from the test pypi index with the following command:

 ./deploy/deploy-centralvenv.sh -c unit02.cern.ch -i test -d /data/tmp/WMCore.venv3/ -j preprod -l wmcore==2.0.3.rc1 

I still get the very old package wmcore==1.3.5 from the prod pypi index. This is an expected downgrade caused by the dependency issues, which need to be resolved.
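A silent downgrade like this is easy to miss, so a deployment script could query which version actually ended up installed. The sketch below is an assumption-laden illustration: installed_version is a hypothetical helper, and importlib.metadata needs Python 3.8 or later, whereas the environment in this thread runs Python 3.6 (where the importlib_metadata backport would be required).

```python
from importlib import metadata  # Python 3.8+ (see the note above)


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

The caller would compare installed_version("wmcore") with the requested version. A plain string comparison may be too strict, because version strings can be stored in normalized form (for example 2.0.3rc1 for a requested 2.0.3.rc1).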

[1]

Building wheels for collected packages: gfal2-python
  Building wheel for gfal2-python (setup.py): started
  Building wheel for gfal2-python (setup.py): finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /data/tmp/WMCore.venv3/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e/setup.py'"'"'; __file__='"'"'/tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/user/pip-wheel-ek6jne5r
       cwd: /tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e/
  Complete output (87 lines):
  running bdist_wheel
  running build
  running build_ext
  -- The C compiler identification is GNU 4.8.5
  -- The CXX compiler identification is GNU 4.8.5
  -- Check for working C compiler: /usr/bin/cc
  -- Check for working C compiler: /usr/bin/cc -- works
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++
  -- Check for working CXX compiler: /usr/bin/c++ -- works
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  cmake source dir : /tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e
  -- gfal2-bindings is used as APPLICATION_NAME
  -- Found Python2.7: /bin/python2.7
  -- Found Python3: /data/tmp/WMCore.venv3/bin/python3
  -- Found Python3.6: /bin/python3.6
  -- Found PythonCurrentVersion: 2.7
  -- Found PythonCurrentInclude: /usr/include/python2.7
  -- Found PythonCurrentLibs: -lpthread -ldl -lutil -lm -lpython2.7 -Xlinker -export-dynamic
  -- Found PythonCurrentModsDir: /usr/lib64/python2.7/site-packages
  CMake Error at /usr/share/cmake/Modules/FindBoost.cmake:1138 (message):
    Unable to find the requested Boost libraries.
  
    Unable to find the Boost header files.  Please set BOOST_ROOT to the root
    directory containing Boost or BOOST_INCLUDEDIR to the directory containing
    Boost's headers.
  Call Stack (most recent call first):
    CMakeLists.txt:27 (find_package)
  
  
  CMake Warning at CMakeLists.txt:46 (message):
    Boost Python3 library not found
  
  
  -- Found PkgConfig: /usr/bin/pkg-config (found version "0.27.1")
  -- checking for module 'glib-2.0'
  --   found glib-2.0, version 2.56.1
  -- GLIB2 libraries: glib-2.0
  -- GLIB2 include dir: /usr/include/glib-2.0;/usr/lib64/glib-2.0/include
  -- Found GLIB2: glib-2.0
  -- checking for module 'gthread-2.0'
  --   found gthread-2.0, version 2.56.1
  -- GTHREAD2 libraries: gthread-2.0;glib-2.0
  -- GTHREAD2 include dir: /usr/include/glib-2.0;/usr/lib64/glib-2.0/include
  -- Found GTHREAD2: gthread-2.0;glib-2.0
  -- checking for module 'gfal2'
  --   found gfal2, version 2.20.5
  -- checking for module 'gfal_transfer'
  --   found gfal_transfer, version 2.20.5
  -- GFAL2 libraries: gfal2;glib-2.0;gfal_transfer;gfal2;glib-2.0
  -- GFAL2 include dir: /usr/include/gfal2;/usr/include/glib-2.0;/usr/lib64/glib-2.0/include;/usr/include/gfal2;/usr/include/glib-2.0;/usr/lib64/glib-2.0/include
  -- Found GFAL2: gfal2;glib-2.0;gfal_transfer;gfal2;glib-2.0
  -- Configuring incomplete, errors occurred!
  See also "/tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e/build/temp.linux-x86_64-3.6/CMakeFiles/CMakeOutput.log".
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e/setup.py", line 114, in <module>
      Extension('gfal2', sources=glob("src/*.cpp"))
    File "/data/tmp/WMCore.venv3/lib/python3.6/site-packages/setuptools/__init__.py", line 129, in setup
      return distutils.core.setup(**attrs)
    File "/usr/lib64/python3.6/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/usr/lib64/python3.6/distutils/dist.py", line 955, in run_commands
      self.run_command(cmd)
    File "/usr/lib64/python3.6/distutils/dist.py", line 974, in run_command
      cmd_obj.run()
    File "/data/tmp/WMCore.venv3/lib/python3.6/site-packages/wheel/bdist_wheel.py", line 299, in run
      self.run_command('build')
    File "/usr/lib64/python3.6/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/usr/lib64/python3.6/distutils/dist.py", line 974, in run_command
      cmd_obj.run()
    File "/usr/lib64/python3.6/distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/usr/lib64/python3.6/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/usr/lib64/python3.6/distutils/dist.py", line 974, in run_command
      cmd_obj.run()
    File "/tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e/setup.py", line 84, in run
      _run_make(self.build_temp, self.get_ext_fullpath('gfal2'))
    File "/tmp/user/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e/setup.py", line 75, in _run_make
      check_call(cmake_cmd, cwd=build_dir)
    File "/usr/lib64/python3.6/subprocess.py", line 311, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['cmake', '-DSKIP_DOC=TRUE', '-DSKIP_TESTS=TRUE', '-DPYTHON_EXECUTABLE_3=/data/tmp/WMCore.venv3/bin/python3', '/tmp/tivanov/pip-install-9i857i7h/gfal2-python_613f3045780a41d38222d48655ea075e']' returned non-zero exit status 1.
  ----------------------------------------
  ERROR: Failed building wheel for gfal2-python

[2]
cms-sw/cmsdist#7326

[3]
dmwm/CMSKubernetes#848

@todor-ivanov
Contributor Author

Again logging some status of the work here:
The deployment script for the virtual environment is in an almost-final state right now. I am already testing the setup now.
The script provides the ability to run from both pypi package and from source, and in order to work around the gfal2 compilation issues from above I continued by running from source. A separate issue needs to be created for fixing this package installation.

I did manage to deploy and properly produce all the manage and startup scripts, and to set up the environment for them. Here is how I am running it right now.

[user@unit02 srv]$ cd /data/tmp/WMCore.venv3/
[user@unit02 WMCore.venv3]$ . bin/activate
Setting up WMCore related environment variables:
(WMCore.venv3) [user@unit02 WMCore.venv3]$ wmcmanage status:reqmgr2ms-rulecleaner
ms-rulecleaner is NOT RUNNING
(WMCore.venv3) [user@unit02 WMCore.venv3]$ wmcmanage start:reqmgr2ms-rulecleaner
stopping reqmgr2ms-rulecleaner
ms-rulecleaner not running, not killing
starting reqmgr2ms-rulecleaner
ms-rulecleaner not running, not killing

But I am fighting with the following error from the service logs:

(WMCore.venv3) [user@unit02 WMCore.venv3]$ less /data/tmp/WMCore.venv3/srv/logs/reqmgr2ms-rulecleaner/reqmgr2ms-rulecleaner-20220425-tivanov-unit02.log 

[25/Apr/2022:09:32:33]  INFO: instantiating extension heartbeatMonitor
[25/Apr/2022:09:32:33]  ERROR: terminating due to error: Traceback (most recent call last):
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/REST/Main.py", line 435, in start_daemon
    self.run()
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/REST/Main.py", line 496, in run
    self.install_application()
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/REST/Main.py", line 262, in install_application
    obj = getattr(module, class_name)(self, ext)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/REST/HeartbeatMonitorBase.py", line 14, in __init__
    super(HeartbeatMonitorBase, self).__init__(config)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/REST/CherryPyPeriodicTask.py", line 33, in __init__
    self.setUpLogDB(config)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/REST/CherryPyPeriodicTask.py", line 43, in setUpLogDB
    thread_name=config.object.rsplit(".", 1)[-1])
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/Services/LogDB/LogDB.py", line 58, in __init__
    self.thread_name, agent=self.agent, **kwds)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/Services/LogDB/LogDBBackend.py", line 54, in __init__
    self.server = CouchServer(db_url)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/Database/CMSCouch.py", line 960, in __init__
    CouchDBRequests.__init__(self, url=dburl, usePYCurl=usePYCurl, ckey=ckey, cert=cert, capath=capath)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/Database/CMSCouch.py", line 102, in __init__
    {"cachepath": None, "pycurl": usePYCurl, "key": ckey, "cert": cert, "capath": capath})
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/Services/Requests.py", line 564, in __init__
    Requests.__init__(self, url, idict)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/Utils/PortForward.py", line 69, in portMangle
    return callFunc(callObj, url, *args, **kwargs)
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/Services/Requests.py", line 92, in __init__
    self.reqmgr = RequestHandler()
NameError: name 'RequestHandler' is not defined

[25/Apr/2022:09:32:33]  WATCHDOG: server exited with exit code 1

FYI @amaltaro @vkuznet if you want, you may start looking into the code as it is right now and leave comments if you wish. I will request yet another review from you in a moment. The wiki with all the details will come later today... or I'd rather say tomorrow. I postponed the work on it because last week many pieces of this development took sudden turns.

@todor-ivanov
Contributor Author

I, kind of, managed to locate the culprit here:

In [1]: from WMCore.Services.pycurl_manager import RequestHandler, ResponseHeader
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-23aedec58259> in <module>
----> 1 from WMCore.Services.pycurl_manager import RequestHandler, ResponseHeader

/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/Services/pycurl_manager.py in <module>
     54 import re
     55 import subprocess
---> 56 import pycurl
     57 from io import BytesIO
     58 import http.client

ModuleNotFoundError: No module named 'pycurl'

But this silent and misleading behavior was due to the import error being ignored here: [1]

So I had to go and manually install the pycurl package but it then generated another package dependency problem:

Installing collected packages: pycurl
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
dbs3-client 4.0.18 requires pycurl==7.19.3, but you have pycurl 7.45.1 which is incompatible.
dbs3-pycurl 3.17.7 requires pycurl==7.43.0.6, but you have pycurl 7.45.1 which is incompatible.
Successfully installed pycurl-7.45.1
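The pip output above shows two packages pinning different exact versions of pycurl, so no single pycurl version can satisfy both. A small check like the following makes such mutually exclusive pins visible before installation. It is a sketch only: the helper name and the input format (requirer, package, pinned version) are hypothetical, and only exact == pins are handled.

```python
def mutually_exclusive_pins(pins):
    """Find packages that are pinned to more than one exact version.

    pins: iterable of (requirer, package, pinned_version) tuples.
    Returns {package: sorted list of the conflicting versions}.
    """
    wanted = {}
    for _requirer, package, version in pins:
        wanted.setdefault(package, set()).add(version)
    return {pkg: sorted(vers) for pkg, vers in wanted.items() if len(vers) > 1}


# The pins reported by pip in the output above.
pins = [
    ("dbs3-client", "pycurl", "7.19.3"),
    ("dbs3-pycurl", "pycurl", "7.43.0.6"),
]
conflicts = mutually_exclusive_pins(pins)
```

Here conflicts contains a single entry for pycurl listing both pinned versions, which suggests the fix belongs in the requirements of the dependent packages rather than in a manual pip install.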

And then upon restart of the service another error popped up:

ERROR initializing MicroService REST module.
Traceback (most recent call last):
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/MicroService/Service/Data.py", line 57, in __init__
    module = importlib.import_module('.'.join(arr[:-1]))
  File "/usr/lib64/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/MicroService/MSManager.py", line 33, in <module>
    from WMCore.MicroService.Tools.Common import getMSLogger
  File "/data/tmp/WMCore.venv3/srv/WMCore/src/python/WMCore/MicroService/Tools/Common.py", line 29, in <module>
    from dbs.apis.dbsClient import aggRuns, aggFileLumis
ModuleNotFoundError: No module named 'dbs'

FYI: @amaltaro @vkuznet

[1]

try:
    from WMCore.Services.pycurl_manager import RequestHandler, ResponseHeader
except ImportError:
    pass
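The snippet in [1] swallows the ImportError, which is why the failure only surfaced later as a NameError far from its cause. A possible alternative, shown here as a sketch and not as the actual WMCore fix, keeps the import optional but logs the failure and raises a descriptive error on first use. The helper optional_import is hypothetical.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def optional_import(module_name, attr_name):
    """Import module_name.attr_name without hiding a failed import.

    On ImportError the failure is logged and a placeholder is returned
    that raises a descriptive ImportError when it is called.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr_name)
    except ImportError as exc:
        # Build the message now: `exc` is unbound once this block exits.
        message = "optional dependency %s.%s is unavailable: %s" % (
            module_name, attr_name, exc)
        logger.warning(message)

        def _unavailable(*args, **kwargs):
            raise ImportError(message)

        return _unavailable
```

With RequestHandler = optional_import("WMCore.Services.pycurl_manager", "RequestHandler"), a missing pycurl would show up in the service log at import time and again, with the same message, at the point where RequestHandler() is called.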

@vkuznet
Contributor

vkuznet commented Apr 27, 2022

Todor, I think you should adjust your requirements.txt file to include the proper version of dbs3-client, which will bring in pycurl for you. Your final error comes from a missing PYTHONPATH entry, which should include the DBS client code. When I run the code locally I have the following env:

PYTHONPATH=/Users/vk/CMS/DMWM/GIT/WMCore/src/python:/Users/vk/CMS/DMWM/GIT/WMCore/test/python:/Users/vk/CMS/DMWM/GIT/DBSClient/src/python:/Users/vk/CMS/DMWM/GIT/PycurlClient/src/python:/Users/vk/CMS/DMWM/GIT/WMCore/venv/lib/python3.9/site-packages

which has the following order: WMCore:DBSClient:PycurlClient:venv.

@todor-ivanov
Contributor Author

Thanks @vkuznet I will fix that.
