Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VC: Fix URL processing defaults #4921

Merged
merged 1 commit into from
May 19, 2023
Merged

VC: Fix URL processing defaults #4921

merged 1 commit into from
May 19, 2023

Conversation

cheatfate
Copy link
Contributor

No description provided.

of "http": "80"
of "https": "443"
else:
$defaultEth2RestPort
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'm on record for being skeptical of this idea of the default port, and I'm not sure adding even more nuance to it to address really specific cases is ideal.

Is there a reason, conceptually, that http://foo and https://foo should have the usual default ports, but that not including a protocol should point to this (still) semi-arbitrary port? Entering a port as a one-off setup thing simply isn't that difficult compared to unexpected changes, say, from changing from foo to http://foo or https://foo or vice versa and discovering that, oops, it's counterproductive as a minimal-diagnostic troubleshooting measure because Nimbus tried to do something clever in the background.

This all becomes quite a maze of arbitrary defaults with negligible consistent unifying or explanatory logic behind it.

But, sure, I get that this PR fixes a very specific ongoing issue, so I won't try to block it, but I want to note my reservations about this continuing drive to add implicit defaults with real costs but questionable benefits which were never necessary to begin with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also -- if this is going to gather this incidental complexity, is it possible to ensure it's (a) isolated in a single unified place so the logic doesn't have to be duplicated, and (b) that it's tested in a useful way to ensure that seemingly innocuous changes later won't break it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tersec how often you entering "https://" in your browser? Or most of the times you just entering "something.com"? Do you pining "http://google.com" or just "google.com" are you establish SSH connections with "ssh://github.com" or with "github.com"? So every program in here are using specific defaults, if you execute web browser it will try 443 and 80 and if you execute SSH it will try to use 23, and same situation is with FTP, SMTP, POP3, IMAP and all the others.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before HTTPS was everywhere, regularly. For a while I used HTTPS Everywhere and I now use dom.security.https_only_mode. But since web browsing is effectively a single-protocol activity (https://) for me now, and has been for a couple of years, I don't need the scheme://.

This isn't the case for Nimbus, though, since the Nimbus VC does support multiple connection schemes or protocols.

To your specific examples, e.g., Firefox and Chromium do not know how to connect to ssh://github.com, so it's not necessary to specify that.

Furthermore, Nimbus VC users hopefully aren't entering these URLs every time, but only when configuring, unlike web browsing where one is entering (partial) URLs frequently, so it's not comparable. It's better/fine for the URLs used in these configurations to be still human-readable, but not necessarily quite as few keystrokes as web browsing URLs.

Copy link
Contributor

@jshufro jshufro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My $0.02 as the guy who reported this...

Passing just a hostname probably should not be supported, but if you want to special-case that, special-case just that.

IE, if i pass 192.168.1.9 map that to http://192.168.1.9:5052

If I pass 192.168.1.9:5052, error out- i forgot to pass a scheme
If I pass http://192.168.1.9, i want port 80, don't do anything special
If I pass https://192.168.1.9, i want port 443, don't do anything special

If I pass 192.168.1.9, i would prefer an error- this is not an address that the VC knows how to dial. However, if you really really think it's worth special-casing, sure, convert this case and this case only to http://192.168.1.9:5052

@github-actions
Copy link

github-actions bot commented May 10, 2023

Unit Test Results

         6 files   -        3       720 suites   - 351   26m 5s ⏱️ - 7m 20s
  3 368 tests  -    304    3 347 ✔️  -      46  21 💤  - 258  0 ±0 
10 454 runs   - 5 205  10 412 ✔️  - 4 942  42 💤  - 263  0 ±0 

Results for commit bd64b59. ± Comparison against base commit 34e7181.

♻️ This comment has been updated with latest results.

@cheatfate
Copy link
Contributor Author

cheatfate commented May 10, 2023

My $0.02 as the guy who reported this...

Passing just a hostname probably should not be supported, but if you want to special-case that, special-case just that.

IE, if i pass 192.168.1.9 map that to http://192.168.1.9:5052

If I pass 192.168.1.9:5052, error out- i forgot to pass a scheme If I pass http://192.168.1.9, i want port 80, don't do anything special If I pass https://192.168.1.9, i want port 443, don't do anything special

If I pass 192.168.1.9, i would prefer an error- this is not an address that the VC knows how to dial. However, if you really really think it's worth special-casing, sure, convert this case and this case only to http://192.168.1.9:5052

VC knows how to dial because VC has only 2 ways of dialing, secure and non-secure HTTP connection. So i would prefer more complex solution which tries to connect via order [HTTPS, HTTP] to the specific address,

Otherwise i would prefer more strict solution which ensures that both scheme and port are specified and also removes default values localhost:5052 which after all, become totally non-sense. For some reason localhost and for some reason 5052.

@cheatfate
Copy link
Contributor Author

One more thing to consider is adding new scheme bnode:// like we already have enode:// which will point to default port 5052 for example.

@tersec
Copy link
Contributor

tersec commented May 10, 2023

One more thing to consider is adding new scheme bnode:// like we already have enode:// which will point to default port 5052 for example.

Well, to start with, just, why have a default?

@cheatfate
Copy link
Contributor Author

One more thing to consider is adding new scheme bnode:// like we already have enode:// which will point to default port 5052 for example.

Well, to start with, just, why have a default?

Ok i could ask this question too, if we are not going to use defaults, why we need scheme?

@jshufro
Copy link
Contributor

jshufro commented May 10, 2023

I think you're missing the forest for the trees here.

Nimbus uses rfc3986 URIs. The argument passed to --beacon-node is an rfc3986 URI. The std/uri package is capable of parsing rfc3986 URIs accurately.

There is no need to customize this logic, it is an extremely common practice in software around the world to use this format as-is. I don't think you need to cater to users who do not understand the URI format, as they will most likely never have cause to set --beacon-node in the first place.

#4924 adds some minimal validation, but does not attempt to customize any logic. I believe this is in accordance with both standard industry practice and the Principle of Least Astonishment

@cheatfate
Copy link
Contributor Author

cheatfate commented May 12, 2023

I think you're missing the forest for the trees here.

I'm not missing anything, i have added the feature which enhances capabilities of default URI parser and allows to accept not only fully specified URIs, but also schemeless URIs.

Nimbus uses rfc3986 URIs. The argument passed to --beacon-node is an rfc3986 URI. The std/uri package is capable of parsing rfc3986 URIs accurately.

RFC3986 does not specify specific scheme types at all, and my PR is just fixing scheme defaults.

  • Is beacon-node is fully-featured HTTP or HTTPS server? No (it has limited REST capable HTTP server with limited amount of enpoints)
  • Should the beacon-node be running on ports less than 1024? No (this port range requires administrative privileges).
  • Should the beacon-node has its own scheme? YES (imho).
  • Should the beacon-node has default REST port number? YES (imho).

With this PR i'm fixing regression which was added with feature PR, so if beacon-node was specified using "http" or "https" scheme it will use ":80" and ":443" port respectively if URI do not have port specified.

#4924 validates only for scheme and host to be present, nothing more. And because it so simple it will not work, so it become possible to send faulty beacon-node addresses to VC.

@jshufro
Copy link
Contributor

jshufro commented May 12, 2023

RFC3986 does not specify specific scheme types at all, and my PR is just fixing scheme defaults.

This PR fixes scheme defaults, but it does so by updating a function that should not exist, as long as the argument to --beacon-node is expected to be an rfc3986 URI.

Is beacon-node is fully-featured HTTP or HTTPS server? No

I run my beacon node behind cloudflare, which handles SSL termination. It works quite well. The VC works over https. You're assuming your users are entering a URI that is a raw beacon node. This is a bad assumption.

Should the beacon-node be running on ports less than 1024? No

My VC doesn't talk directly to my BN. I should be able to pick whichever port I want. For example, this is incredibly sensible architecture:
https://pastebin.com/1cPjY8Kw

Should the beacon-node has its own scheme? YES (imho).

But it uses the http and https protocols. Why would it have its own scheme? Are you suggesting removing TLS encryption, too?

And because it so simple it will not work, so it become possible to send faulty beacon-node addresses to VC

Yes, and then the VC will produce an error, which is a good thing. If I send a faulty address to the VC, the VC should tell me so I can correct it.

The way to look at this is very simple.

--beacon-node is an argument which expects a URI. The default value is http://localhost:5052. If the user specifies a value, then the value they specified will be used without modification.

@cheatfate
Copy link
Contributor Author

RFC3986 does not specify specific scheme types at all, and my PR is just fixing scheme defaults.

This PR fixes scheme defaults, but it does so by updating a function that should not exist, as long as the argument to --beacon-node is expected to be an rfc3986 URI.

Argument to --beacon-node is NOT EXPECTED to have scheme but it still expected to be compatible RFC3696 URI.

Is beacon-node is fully-featured HTTP or HTTPS server? No

I run my beacon node behind cloudflare, which handles SSL termination. It works quite well. The VC works over https. You're assuming your users are entering a URI that is a raw beacon node. This is a bad assumption.

This assumption does not interfere with both your point of view and mine, i just want to say that "http://" and "https://" here is equal to "beaconnode://".

Should the beacon-node be running on ports less than 1024? No

My VC doesn't talk directly to my BN. I should be able to pick whichever port I want. For example, this is incredibly sensible architecture: https://pastebin.com/1cPjY8Kw

And my PR does solve it, you can easily specify any port in any range you want.

Should the beacon-node has its own scheme? YES (imho).

But it uses the http and https protocols. Why would it have its own scheme? Are you suggesting removing TLS encryption, too?

Because it provides its own protocol and because it could have default port which could be used by all the validator clients around the world, this could help everybody.

And because it so simple it will not work, so it become possible to send faulty beacon-node addresses to VC

Yes, and then the VC will produce an error, which is a good thing. If I send a faulty address to the VC, the VC should tell me so I can correct it.

You just do not understand that your type of vision does not work well with our current code. We do not have mapping of all the known "scheme => port", so its why we limiting here to "http://" and "https://" so our HTTP client library will be able to establish connection with remote endpoint. AS soon as you start using "mail://" or "ftp://" it will just reject your URI.

The way to look at this is very simple.

--beacon-node is an argument which expects a URI. The default value is http://localhost:5052. If the user specifies a value, then the value they specified will be used without modification.

Default value is not part of this PR and not even question in this PR, this PR fixes regression which was introduced by my PR. So lets stay on topic please.

@jshufro
Copy link
Contributor

jshufro commented May 12, 2023

Argument to --beacon-node is NOT EXPECTED to have scheme but it still expected to be compatible RFC3696 URI.

This is a bad design decision. Teku, lighthouse, and lodestar all use rfc3696, and scheme is required. Prysm uses rfc3696 without a scheme because it uses grpc instead of http.

This assumption does not interfere with both your point of view and mine, i just want to say that "http://" and "https://" here is equal to "beaconnode://".

You plan to achieve this by replacing beaconnode:// with http:// before passing the result of parseUri to the client, or do you plan to reimplement http?

And my PR does solve it, you can easily specify any port in any range you want.

Your PR solves a problem that never existed until you added normalizeUri. The better solution is removing normalizeUri. You added it because you wanted to create a good UX for your users, but you created a bad one: your users aren't opening a browser and typing in google.com, they're telling the validator where to dial the beacon chain. You users want rfc3696. Your users are familiar with rfc3696, whether or not they know it. Your users can be trusted to specify a scheme and port.

Because it provides its own protocol

Incorrect. It uses the http protocol or the https protocol, through nim-chronos, which I'm sure you know provides an http/https client. It depends on a server that implements the beacon api spec which is NOT a protocol. The protocol is http or https.

The fact that it supports BasicAuth headers is another element of the http and https protocols that I use heavily.

Communications pass http/1.1 or http/2 headers. They pass Host headers. They pass Accept headers. These are all attributes of the http protocol. That is because the validator uses the http protocol. It does not use a custom protocol.

Until it does use a custom protocol, it is a bad idea to change this behavior. I doubt it ever will use its own protocol. Even prysm is migrating to http instead of grpc because http/https is the accepted standard.

this could help everybody.

This already hurt me and several other people. I upgraded nimbus, you started sending traffic to 5052 instead of 443, and I had downtime.

Further, I had to update rescue-ui to special-case Nimbus because it doesn't work like every other client does. I also had to warn all of our users about your breaking change on twitter.

You just do not understand that your type of vision does not work well with our current code.

This is false. It worked perfectly fine a month ago. It only stopped working when you added normalizeUri.

I have a lot of experience in this area. My vision is fine.

We do not have mapping of all the known "scheme => port"

You don't need one. You don't need to add :443 when the scheme is https. You client already automatically dials port 443 when the scheme is https. It worked fine until you added normalizeUri.

AS soon as you start using "mail://" or "ftp://" it will just reject your URI.

Good. It should.

PR fixes regression which was introduced by my PR

I agree that it fixes the immediate issue, but it is technical debt. If we're staying on topic, I encourage the nimbus team to accept #4924 and close this PR. #4924 keeps nimbus's validator client functionality flexible and in line with lodestar, teku, prysm, lighthouse, and pretty much every other software application which does RPC over HTTP.

@tersec
Copy link
Contributor

tersec commented May 12, 2023

By comparison, https://ethstaker.cc/mev-relay-list shows the URLs (multiple, often, because one use case for mev-boost is to have backup when one relay has downtime):

They're not trying to save a few bytes here and there, because these are relatively rare configuration events where it's fine to spend a few more seconds typing, or, practically speaking in these cases, using copy/paste.

@cheatfate
Copy link
Contributor Author

Ok, lets go back to the original issue.

  • With new commit i'm removing all the defaults except two:
    1. http scheme defaults to 80 port if port is not specified.
    2. https scheme defaults to 443 port if port is not specified.
  • All other custom and other types of URI scheme are prohibited.
  • Empty scheme and schemeless URIs are acceptable, but only with port specified.

Also i have added a test suite.

@tersec
Copy link
Contributor

tersec commented May 14, 2023

Test summaries need regeneration

@jshufro
Copy link
Contributor

jshufro commented May 15, 2023

Operating under the assumption that prepending a default scheme is desirable (which I still don't feel good about) now.

I would like to suggest this approach #4957

Validating the scheme before calling parseUri has some nice attributes- mainly, you can avoid the dizzying logic of what goes where when parseUri is called on an incomplete URI.

This PR passes the new tests.

@tersec
Copy link
Contributor

tersec commented May 17, 2023

CI issues aside, this PR looks generally good

@zah zah force-pushed the vc-fix-url-handling-defaults branch from 9debbcc to a36cacd Compare May 18, 2023 23:03
@zah zah merged commit a36cacd into unstable May 19, 2023
@zah zah deleted the vc-fix-url-handling-defaults branch May 19, 2023 01:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants