update files does nothing when passing a list of LFNs to it #102

Open
belforte opened this issue Oct 24, 2023 · 8 comments
@belforte
Member

I am calling this

 curl -v -X PUT -H "User-Agent: CRABClient/development" -H "Accept: */*" --data @/tmp/crab_curlDatalmdmbfvw --cert /tmp/x509up_u8516 --key /tmp/x509up_u8516 --capath /etc/grid-security/certificates/ https://cmsweb.cern.ch:8443/dbs/prod/phys03/DBSWriter/files

where the body data is

cat /tmp/crab_curlDatalmdmbfvw
{"logical_file_name": ["/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root", "/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_2.root"], "is_file_valid": 0}

and it sort of hangs forever.
But the same call with only one element in the "logical_file_name" list works fine. Same if I pass a file name instead of a list.

What am I doing wrong?

BTW I also tried to use the dataset argument, in which case the call returns HTTP 200 OK, but the file status is not changed.
Everything works if I stick with "logical_file_name":"a-LFN" or "logical_file_name":["a-LFN"], so I thought that the problem is not in the curl details but somehow in how the server reacts to specific parameters.

Should a timestamp help, this query was launched at 00:41 Oct 25 CEST

Here's a detailed example with a list of one element:


belforte@lxplus805/CRABClient> cat data
{"logical_file_name": ["/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_2.root"], "is_file_valid": 0}
belforte@lxplus805/CRABClient> curl -v -X PUT -H "User-Agent: CRABClient/development" -H "Accept: */*" --data @data --cert "/tmp/x509up_u8516" --key "/tmp/x509up_u8516" --capath "/etc/grid-security/certificates/" "https://cmsweb.cern.ch:8443/dbs/prod/phys03/DBSWriter/files"
*   Trying 188.185.89.194...
* TCP_NODELAY set
* Connected to cmsweb.cern.ch (188.185.89.194) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: /etc/grid-security/certificates/
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS handshake, CERT verify (15):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: DC=ch; DC=cern; OU=computers; CN=cmsweb.cern.ch
*  start date: Feb 28 02:56:51 2023 GMT
*  expire date: Apr  3 02:56:51 2024 GMT
*  subjectAltName: host "cmsweb.cern.ch" matched cert's "cmsweb.cern.ch"
*  issuer: DC=ch; DC=cern; CN=CERN Grid Certification Authority
*  SSL certificate verify ok.
> PUT /dbs/prod/phys03/DBSWriter/files HTTP/1.1
> Host: cmsweb.cern.ch:8443
> User-Agent: CRABClient/development
> Accept: */*
> Content-Length: 154
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 154 out of 154 bytes
< HTTP/1.1 200 OK
< Date: Tue, 24 Oct 2023 22:55:02 GMT
< Server: Apache
< Content-Type: application/json
< Content-Length: 2
< X-Ratelimit-Limit: 100
< X-Ratelimit-Remaining: 99
< X-Ratelimit-Reset: 1698188103
< CMS-Server-Time: D=53141 t=1698188102938519
< 
* Connection #0 to host cmsweb.cern.ch left intact
[]belforte@lxplus805/CRABClient> 

and here with a list of two elements

belforte@lxplus805/CRABClient> cat data2
{"logical_file_name": ["/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root","/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_2.root"], "is_file_valid": 0}
belforte@lxplus805/CRABClient> curl -v -X PUT -H "User-Agent: CRABClient/development" -H "Accept: */*" --data @data2 --cert "/tmp/x509up_u8516" --key "/tmp/x509up_u8516" --capath "/etc/grid-security/certificates/" "https://cmsweb.cern.ch:8443/dbs/prod/phys03/DBSWriter/files"
*   Trying 188.185.89.194...
* TCP_NODELAY set
* Connected to cmsweb.cern.ch (188.185.89.194) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: /etc/grid-security/certificates/
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS handshake, CERT verify (15):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: DC=ch; DC=cern; OU=computers; CN=cmsweb.cern.ch
*  start date: Feb 28 02:56:51 2023 GMT
*  expire date: Apr  3 02:56:51 2024 GMT
*  subjectAltName: host "cmsweb.cern.ch" matched cert's "cmsweb.cern.ch"
*  issuer: DC=ch; DC=cern; CN=CERN Grid Certification Authority
*  SSL certificate verify ok.
> PUT /dbs/prod/phys03/DBSWriter/files HTTP/1.1
> Host: cmsweb.cern.ch:8443
> User-Agent: CRABClient/development
> Accept: */*
> Content-Length: 264
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 264 out of 264 bytes

and it hangs there for a couple of minutes until I get

< HTTP/1.1 502 Proxy Error
< Date: Tue, 24 Oct 2023 22:58:32 GMT
< Server: Apache
< Content-Length: 469
< Content-Type: text/html; charset=iso-8859-1
< 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/auth/complete/dbs/prod/phys03/DBSWriter/files">PUT&nbsp;/auth/complete/dbs/prod/phys03/DBSWriter/files</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
* Connection #0 to host cmsweb.cern.ch left intact
@belforte
Member Author

belforte commented Oct 25, 2023

I tried to add -H "Content-type: application/json" to curl command. Same result [1] :-(

@vkuznet or @d-ylee do you have an example, maybe from the test suite? I looked but could not find my way around, at least not w/o learning golang first.

[1]

belforte@lxplus805/CRABClient> curl -v -X PUT -H "User-Agent: CRABClient/development" -H "Content-type: application/json"  -H "Accept: */*" --data @data2 --cert "/tmp/x509up_u8516" --key "/tmp/x509up_u8516" --capath "/etc/grid-security/certificates/" "https://cmsweb.cern.ch:8443/dbs/prod/phys03/DBSWriter/files"
*   Trying 188.185.101.116...
* TCP_NODELAY set
* Connected to cmsweb.cern.ch (188.185.101.116) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: /etc/grid-security/certificates/
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS handshake, CERT verify (15):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: DC=ch; DC=cern; OU=computers; CN=cmsweb.cern.ch
*  start date: Feb 28 02:56:51 2023 GMT
*  expire date: Apr  3 02:56:51 2024 GMT
*  subjectAltName: host "cmsweb.cern.ch" matched cert's "cmsweb.cern.ch"
*  issuer: DC=ch; DC=cern; CN=CERN Grid Certification Authority
*  SSL certificate verify ok.
> PUT /dbs/prod/phys03/DBSWriter/files HTTP/1.1
> Host: cmsweb.cern.ch:8443
> User-Agent: CRABClient/development
> Content-type: application/json
> Accept: */*
> Content-Length: 264
> 
* upload completely sent off: 264 out of 264 bytes
< HTTP/1.1 502 Proxy Error
< Date: Wed, 25 Oct 2023 12:20:24 GMT
< Server: Apache
< Content-Length: 469
< Content-Type: text/html; charset=iso-8859-1
< 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/auth/complete/dbs/prod/phys03/DBSWriter/files">PUT&nbsp;/auth/complete/dbs/prod/phys03/DBSWriter/files</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
* Connection #0 to host cmsweb.cern.ch left intact
belforte@lxplus805/CRABClient> date
Wed 25 Oct 14:25:28 CEST 2023
belforte@lxplus805/CRABClient> cat data2
{"logical_file_name": ["/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root","/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_2.root"], "is_file_valid": 0}
belforte@lxplus805/CRABClient>

@belforte
Member Author

I think one problem is that I thought the API wanted a list of file names in the format ['f1', 'f2', 'f3']; instead it wants a comma-separated list of names as a single string "f1,f2,f3". With this format the call is always OK, but as soon as I put a list instead of a single file name, nothing is changed in DBS!
The more I test, the more it looks like a problem with a list of file names.
Also, the python3 client works OK with a single LFN but fails with a list.
I will document that and open a new issue.

On the other hand, changing the status of all files in a dataset works, by putting this in the data file passed to curl:
{"dataset": "/GenericTTbar/belforte-Stefano-TestRucioP-230817-94ba0e06145abd65ccb1d21786dc7e1d/USER", "is_file_valid": 0}

@vkuznet
Contributor

vkuznet commented Oct 25, 2023

@belforte, thanks for reporting. As I no longer maintain the dbs2go code I pay little attention to tickets, sorry for the late response. That said, I quickly looked at the dbs2go log file and found the following:

[2023-10-25 13:14:01.412326002 +0000 UTC m=+7268372.748396754] HTTP/1.1 500 PUT /dbs/prod/phys03/DBSWriter/files [data: 147 in 1695 out] [remoteAddr: 10.100.156.0:40374] [X-Forwarded-For: 188.185.101.116] [X-Forwarded-Host: cmsweb-k8s-prodsrv.cern.ch] [auth: no-TLS cipher-none "/DC=org/DC=terena/DC=tcs/C=IT/O=Istituto Nazionale di Fisica Nucleare/CN=Stefano Belforte [email protected]" belforte X509Proxy] [ref: "-" "CRABClient/development"] [req: 3.173465ms proxy-resp: 0]
[2023-10-25 13:14:01.48815402 +0000 UTC m=+7268372.824224772] DBSError Code:110 Description:DBS DB insert record error Function:dbs.files.UpdateFiles Message: Error: ORA-00933: SQL command not properly ended
 Stacktrace:
goroutine 9935493 [running]:
github.com/dmwm/dbs2go/dbs.Error({0xae4bc0?, 0xc00043d2d0?}, 0x6e, {0x0, 0x0}, {0xa07bb5, 0x15})
        /go/src/github.com/vkuznet/dbs2go/dbs/errors.go:185 +0x99
github.com/dmwm/dbs2go/dbs.(*API).UpdateFiles(0xc000655300)
        /go/src/github.com/vkuznet/dbs2go/dbs/files.go:685 +0xc10
github.com/dmwm/dbs2go/web.DBSPutHandler({0xae81b0, 0xc0000a4990}, 0xc000842800, {0x9f792c, 0x5})
        /go/src/github.com/vkuznet/dbs2go/web/handlers.go:442 +0xb77
github.com/dmwm/dbs2go/web.FilesHandler({0xae81b0?, 0xc0000a4990?}, 0x454134?)
        /go/src/github.com/vkuznet/dbs2go/web/handlers.go:776 +0x6e
net/http.HandlerFunc.ServeHTTP(0x50000c000408e01?, {0xae81b0?, 0xc0000a4990?}, 0x0?)
        /usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/dmwm/dbs2go/web.limitMiddleware.func1({0xae81b0?, 0xc0000a4990?}, 0x0?)
        /go/src/github.com/vkuznet/dbs2go/web/middlewares.go:111 +0x38
net/http.HandlerFunc.ServeHTTP(0x94c100?, {0xae81b0?, 0xc0000a4990?}, 0x11?)
        /usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/ul

which confirms your observation: it is a server error.

That said, I'll leave it up to @d-ylee and @todor-ivanov to investigate further. The fix should be trivial to apply, as the log says it is an improper SQL statement and all of them come from templates. Therefore, my first bet would be to inspect how the SQL query is constructed when the API gets a list of files, and correct it accordingly. The files that should be looked at are the following:

My first impression is that we lack loop logic for the case when the user provides a list of files, and I suggest generalizing the code so that it serves both use cases, whether a single file or multiple files are provided. In the former case the single LFN should be converted to a list of LFNs, and then the code should be modified to loop over that list and update each individual LFN.
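
The generalization suggested above can be sketched as follows. This is an illustration in Python, not the dbs2go (Go) code: `normalize_lfns`, `update_files` and the `update_one` callback are hypothetical names, and `update_one` stands in for whatever performs the single-file update that already works.

```python
from typing import Callable, Iterable, List, Union


def normalize_lfns(lfns: Union[str, Iterable[str]]) -> List[str]:
    """Turn a single LFN or any iterable of LFNs into a list of LFNs."""
    if isinstance(lfns, str):
        return [lfns]
    return list(lfns)


def update_files(
    lfns: Union[str, Iterable[str]],
    is_file_valid: int,
    update_one: Callable[[str, int], None],
) -> int:
    """Apply the single-file update to every LFN; return how many were processed."""
    names = normalize_lfns(lfns)
    for name in names:
        # One update per LFN, so each statement has the single-file shape.
        update_one(name, is_file_valid)
    return len(names)


# Demo with a stand-in for the real database update.
updated = []
count = update_files(
    ["/store/user/example/file_1.root", "/store/user/example/file_2.root"],
    0,
    lambda lfn, valid: updated.append((lfn, valid)),
)
```

With this shape the single-LFN and multi-LFN requests go through the same code path, which is the point of the suggestion.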

@belforte
Member Author

thanks @vkuznet
For the time being I will handle this as "NotImplementedYet" in my CRABClient re-implementation of https://github.com/dmwm/DBS/blob/master/Client/utils/DataOpsScripts/DBS3SetFileStatus.py. That old file does not work with the new server and does not run in py3. It is due time for some updating.
@d-ylee @todor-ivanov as already mentioned, (working) examples are often the best documentation; please consider adding them.
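
A minimal sketch of what such a "NotImplementedYet" guard could look like on the client side. The function name `single_lfn_or_fail` is hypothetical and this is not the actual CRABClient code; it only encodes what the thread reports, namely that a single LFN (bare or in a one-element list) works and a longer list does not.

```python
def single_lfn_or_fail(lfn):
    """Return the one LFN to send to the server, or raise if several are given.

    Hypothetical guard: accepts a string or an iterable of strings.
    """
    if isinstance(lfn, str):
        return lfn
    names = list(lfn)
    if not names:
        raise ValueError("no LFN given")
    if len(names) == 1:
        return names[0]
    # Lists of several LFNs are accepted by the server but change nothing,
    # so refuse them until the server side is fixed.
    raise NotImplementedError(
        "changing the status of %d files in one call is not implemented yet" % len(names)
    )
```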

@belforte
Member Author

recap for @d-ylee and @todor-ivanov
I finally got json and list format under control and nothing hangs. Please ignore initial postings.
But when I pass a list of file names, instead of a single one, nothing is changed in the DB.

@belforte changed the title from "hanging PUT" to "update files does nothing when passing a list of LFNs to it" on Oct 25, 2023
@todor-ivanov
Contributor

todor-ivanov commented Oct 26, 2023

Hi @belforte

Thanks for reporting this. I am ramping up now, so I'll have to find my way around the code as well, but the guidelines @vkuznet gave look promising. Even though you said it is somehow working for you and you are marking this as NotImplemented in CRABClient, I'd like to keep this issue open for now and fix the broken SQL statement Valya was pointing to. But first I'll have to set up a proper test environment, so that I can make sure I am not breaking this or another API by applying the eventual fix. I'll come back with more details ASAP.

@belforte
Member Author

Ciao Todor, I am happy with you using this as a real-life exercise to ramp up your expertise in dbs2go.
Feel free to take your time, you surely have lots of things on your plate now. And I surely agree with "first: don't break it"!
At least as far as CRAB users are concerned.
I do not know how DataManagement operators deal with file invalidation; hopefully they never needed to do more than a few at a time and are living happily with calling some script a few times.

@todor-ivanov
Contributor

Thanks Stefano! I'll let you know once I have a solution.
