-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PIP-25: Token based authentication #2888
Conversation
...ommon/src/main/java/org/apache/pulsar/broker/authentication/AuthenticationProviderToken.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments about naming so it's clear that private/public can be used.
Also, it would be really good to have an integration test in this PR a) to validate that both the PSK and private/public paths work and b) to illustrate the workflow as a basis for documentation.
...ommon/src/main/java/org/apache/pulsar/broker/authentication/AuthenticationProviderToken.java
Show resolved
Hide resolved
pulsar-broker/src/main/java/org/apache/pulsar/utils/auth/tokens/TokensCliUtils.java
Outdated
Show resolved
Hide resolved
@ivankelly @maskit Please take another look |
The 2 paths are tested in the auth provider unit test. It's not the same as integration test but it covers all the plugin code, from getting variables from |
pulsar-broker/src/main/java/org/apache/pulsar/utils/auth/tokens/TokensCliUtils.java
Outdated
Show resolved
Hide resolved
@maskit @ivankelly Updates to support |
|
||
if (validationKeyConfig.startsWith("data:")) { | ||
log.info("Reading token validation key from config file"); | ||
validationKey = validationKeyConfig.substring("data:".length()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be valid for a super simple case, but it wouldn’t work for cases that has mime-type or base64 encoding.
Why didn’t you use the URL class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... because I've never used URL.openStream()
before :)
Though, one thing is that the key is expected to be already base64 encoded.
$ bin/pulsar tokens create-secret-key
secret-key CBqtuReZe+8nKqtJBbmuPTIg6b4W1KjOS21pcJkTJ/8=
Then one would just have to store the key string somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m ok with parsing the schemes here and having a limitation in its format for now, but I still think this data scheme parsing is incorrect. It always need a comma after the colon.
https://tools.ietf.org/html/rfc2397#section-3
Also, base64 encoding is not mandatory. You can pass URL encoded string instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maskit I have updated to conform with data:
standard url, though I had to add manual parsing since java URL
class doesn't support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you look into #1212 I commented on another line? The URL class I suggested is org.apache.pulsar.client.api.url.URL
. The parser already exists in it.
@maskit @ivankelly please take another look. I have moved to use standard |
The "data:" parser you wrote would be a duplicate code because it's already implemented in the in-house URL class I wrote. I'm not going to make a change request because I don't want to delay this feature, but either of the parsers should be removed eventually to eliminate duplicate code. Also, because environment variables are general enough as data sources, I think "env:" parser should be added as a As for "token:", I'm ok with having it in this plugin as a special case. But if you renamed it to "raw:", it could be used on other places. |
run integration tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no change here that shows the whole thing working. It doesn't even have to be an integration test, but there should be a test where a client connects to a server using token based auth.
...roker-common/src/main/java/org/apache/pulsar/broker/authentication/utils/AuthTokenUtils.java
Outdated
Show resolved
Hide resolved
...n/src/test/java/org/apache/pulsar/broker/authentication/AuthenticationProviderTokenTest.java
Show resolved
Hide resolved
String outputFile; | ||
|
||
@Parameter(names = { | ||
"-b", "--base-64" }, description = "Encode the key in base64") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--base64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be enabled by default in the case that -o is not specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but don't know which option is more confusing
pulsar-broker/src/main/java/org/apache/pulsar/utils/auth/tokens/TokensCliUtils.java
Outdated
Show resolved
Hide resolved
token = r.readLine(); | ||
} else if (tokenFile != null) { | ||
token = new String(Files.readAllBytes(Paths.get(tokenFile)), Charsets.UTF_8); | ||
} else if (System.getenv("TOKEN") != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this here? Seems unnecessary. Anyone who wanted to use this would just pass $TOKEN as the parameter on the shell
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason is that a "token" if effectively like a password (or a private key): whoever has it, can access the system. Passing that as a CLI argument will expose to other users on the same machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and how would they get it into the environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I'm fine with it staying, just not sure how much real security it offers and trying to reduce options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and how would they get it into the environment?
I guess you can do:
TOKEN=xyz pulsar tokens show
The env variable will not be visible on the process list
pulsar-broker/src/main/java/org/apache/pulsar/utils/auth/tokens/TokensCliUtils.java
Show resolved
Hide resolved
pulsar-broker/src/main/java/org/apache/pulsar/utils/auth/tokens/TokensCliUtils.java
Outdated
Show resolved
Hide resolved
pulsar-client/src/main/java/org/apache/pulsar/client/impl/auth/AuthenticationTls.java
Show resolved
Hide resolved
public void configure(String encodedAuthParamString) { | ||
// Interpret the whole param string as the token. If the string contains the notation `token:xxxxx` then strip | ||
// the prefix | ||
if (encodedAuthParamString.startsWith("token:")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this different to data: on the server side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Data is meant to carry arbitrary binary data. A token is guaranteed to be a valid ASCII string (being composed of base64 encoded parts).
The encodedAuthParamString
are common to all the auth providers and are a way to pass implementation specific configs. In case of token auth, we're taking the "token" string itself.
I didn't want to use data:
here to simplify the user input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be all or nothing, no? I don't see reasons to support token, file and env but not data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, token instead of data makes sense. token: is used for passing the token to the client. data: is used to allow the server to load the signing key. They're different usecases. The signing key could be quite large in case of public-private key, whereas the token is generally quite small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we shouldn't support data: for small data? Even if it's not a suitable scheme, I don't see reasons to drop it. If we allowed using any schemes for any cases and used the URL class to support them, we wouldn't need to maintain this if-else block.
getContent() returns a InputStream so you should be able to read raw data.
Sure, I think I can help you out on this since Pulsar seems like compilable with Java 11 :) |
@ivankelly @maskit Addressed most of comments and added simple integration tests, Please one more look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, but nothing blocking
pulsar-broker/src/main/java/org/apache/pulsar/utils/auth/tokens/TokensCliUtils.java
Outdated
Show resolved
Hide resolved
pulsar-broker/src/main/java/org/apache/pulsar/utils/auth/tokens/TokensCliUtils.java
Outdated
Show resolved
Hide resolved
|
||
@Cleanup | ||
PulsarClient client = PulsarClient.builder() | ||
.serviceUrl(pulsarCluster.getPlainTextServiceUrl()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not for this PR, but do we have a requireTLS option? We should have an option to block authorization unless TLS is in use to avoid people sending tokens over cleartext.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there was another PR that was meant to allow to disable the clear-text service endpoints.
...test/java/org/apache/pulsar/tests/integration/auth/token/TokenAuthWithPublicPrivateKeys.java
Outdated
Show resolved
Hide resolved
return ByteStreams.toByteArray((InputStream) new URL(keyConfUrl).getContent()); | ||
} catch (Exception e) { | ||
throw new IOException(e); | ||
} | ||
} else if (keyConfUrl.startsWith("env:")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't you have a URL handler for this also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, adding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there's one tricky part.
The key, passed as an env variable, needs to be base64 encoded. That might not be needed in general case for env:
(eg: a token would be fine as it is).
It's kind of confusing and we'd have to define something similar to env:base64,MY_SECRET_KEY_VAR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we shoudn’t define encoding field on env scheme to keep it simple. The requirement comes from application use case (which is this plugin here), so the application should take care of it.
If you get URLConnection from the URL object, you can check the scheme name as protocol name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn’t read key generation part, but if the key is in PEM format we can use the same format for all scheme and decode the raw data read from the InputStream in a consistent way, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the env:
schema as it was not working properly. Java seems to be mangling long var values. We can add later when we can make it work.
When you do the integration tests, we need a test for proxy (it seems to be missing in this and c++ change) |
if (charset == null) { | ||
charset = "US-ASCII"; | ||
if (contentType == null) { | ||
this.contentType = "application/data"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC2397 says:
If <mediatype> is omitted, it defaults to text/plain;charset=US-ASCII.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this.contentType = "application/data"; | |
this.contentType = "text/plain;charset=US-ASCII"; |
if (matcher.group("base64") == null) { | ||
// Support Urlencode but not decode here because already decoded by URI class. | ||
this.data = new String(matcher.group("data").getBytes(), charset); | ||
this.data = matcher.group("data").getBytes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this is probably equivalent to the original code, so this change is fine.
However, I realized that the both code have the same issue. Because we drop charset information here, users cannot read the data as a string with the specified charset. I think we should stop parsing parameters in mimetype and return the whole mimetype part as content-type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with original code is that it was getting a list of bytes and converting them into a String
(which in case of keys is not possible) and then exposing back to a ByteArrayInputStream
. I just cut through the String conversion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can see that, and we shoud fix that part. So this would be out of this PR’s scope but we need to return charset as a part of contentType so that URL class users can convert the data into String after reading the raw data from InputStream. I think we can do this by removing charset part from the regex.
...roker-common/src/main/java/org/apache/pulsar/broker/authentication/utils/AuthTokenUtils.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 because there's no blockers. I can improve the small stuffs around scheme and URL class later. The behavior in normal use cases wouldn't be changed so much.
pulsar-common/src/main/java/org/apache/pulsar/client/api/url/DataURLStreamHandler.java
Outdated
Show resolved
Hide resolved
@ivankelly the test is actually going trough the proxy already since |
@maskit @ivankelly Few more changes:
|
String cmdString = Arrays.stream(cmd).collect(Collectors.joining(" ")); | ||
ByteBuf stdout = Unpooled.buffer(); | ||
ByteBuf stderr = Unpooled.buffer(); | ||
docker.execStartCmd(execid).withDetach(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of code in common with runCommand here. Doesn't need to be in this patch, but it should be factors out to something that takes a Consumer<byte[]>
run cpp tests |
Motivation
This is the first PR for the implementation of token based authentication
Modifications
TODOs (in separate PRs):