Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store API key hashes, instead of plain-text values #2193

Closed
r33drichards opened this issue Sep 11, 2024 · 6 comments · Fixed by #2249
Closed

Store API key hashes, instead of plain-text values #2193

r33drichards opened this issue Sep 11, 2024 · 6 comments · Fixed by #2249
Labels
Milestone

Comments

@r33drichards
Copy link

r33drichards commented Sep 11, 2024

Shlink version

4.2.0

PHP version

8.3

How do you serve Shlink

Self-hosted Apache

Database engine

PostgreSQL

Database version

16.3

Current behavior

When I look at my api_keys table for shlink, i can see the api key present, and it is not hashed

Screenshot 2024-09-11 at 12 34 57 PM

Expected behavior

Since the exact value of this api key is not needed, only needing to be checked against a supplied one, it should be hashed like a password, to prevent accidental leaks https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#background

Minimum steps to reproduce

  • deploy shlinkio
  • create api_key
  • check db
  • observe the plain text key
@r33drichards r33drichards changed the title api keys stored as plane text in database api keys stored as plain text in database Sep 11, 2024
@acelaya acelaya removed the bug label Sep 12, 2024
@acelaya
Copy link
Member

acelaya commented Sep 12, 2024

Shlink does need to be able to display API keys, for example in the api-key:list console command, and I think it was needed in some other place as well.

Also, I think the raw key itself is used to identify API keys in a few places, so hashing them would affect this kind of mechanism as well. I would need to verify this though.

So I agree with you they should be hashed, and I would probably implement it differently today, but it's not easy to change now for backwards compatibility reasons.

I will plan on changing this, but it's going to take a while and needs to be tested carefully.

Among other things it will need the name to be mandatory, so that you have later a way to identify API keys in case you need to revoke, restrict, etc. In existing installations most API keys don't have a name ATM.

These are a few things I can think of, that would be needed in order to allow this change:

  1. Update API key creation command, making the name mandatory. It should warn about duplicated names, even if that's not currently enforced in the database.
    Another option would be to generate a name on the fly if one is not provided, using current date or something in those lines.
    This command would print the plain-text API key once, and then it's up to you to save the value securely.
  2. Add a command to allow editing existing API key names.
  3. Migrate existing keys, setting their raw value in the title field for those where the title is null, then update the key to hash it. Update the logic that checks API keys so that it verifies the hash instead of comparing plain-text values.
    This does not immediately remove the issue, as you still have the plain-text value in one column, but the system will no longer rely on it, allowing you to edit that using the command from step 2.
  4. Next major version would set the name column in the database as non-nullable and unique.

In addition to this, I need to check what's the performance impact on verifying the API key hash on every request. I guess it would be negligible, but something worth checking.

But overall, this needs to be done carefully, as once keys have been hashed there's no way back, so breaking users installations would be bad. They would need to generate new API keys.

@acelaya acelaya changed the title api keys stored as plain text in database Store API key hashes, instead of plain-text values Sep 12, 2024
@acelaya acelaya added this to the 4.3.0 milestone Sep 12, 2024
@acelaya
Copy link
Member

acelaya commented Sep 12, 2024

On the other hand, one reason to hash user passwords is because they are provided by them, and that ensures nobody with access to the database info (legitimately or illegitimately), can know what the user originally provided, which they frequently reuse in other services.

API keys are random and generated by Shlink. If someone can access your database, you are basically screwed already, as they can do anything they would through the API. They don't really need the API key anymore, and the value itself does not present extra risks as it does with user-provided passwords.

@r33drichards
Copy link
Author

Thanks for taking the time to reply!

Shlink does need to be able to display API keys, for example in the api-key:list console command, and I think it was needed in some other place as well.

Also, I think the raw key itself is used to identify API keys in a few places, so hashing them would affect this kind of mechanism as well. I would need to verify this though.

So I agree with you they should be hashed, and I would probably implement it differently today, but it's not easy to change now for backwards compatibility reasons.

ahh yea that makes sense, seems like encrypting the token and storing the encrypted value and decrypting it could be a viable option then.

Not sure if you need a name to revoke and delete a token, surely just selecting the token by its database id is enough? All of those things are not strictly necessary to implement column level encryption for the value of the tokens, but I agree especially on the ability to revoke tokens being an important and useful thing to add.

API keys are random and generated by Shlink. If someone can access your database, you are basically screwed already, as they can do anything they would through the API. They don't really need the API key anymore, and the value itself does not present extra risks as it does with user-provided passwords.

You are absolutely correct in this situation! If I was in a scenario where I had a database leak and I know about it, I'd have to rotate every secret key b/c they would all be compromised if I wanted to continue using the service as is as well. But if you consider the situation where my database is compromised and I DON'T know that its compromised, I have no way to prevent someone from using an api key that they would have obtained from the stolen database to continue to leak data. If someone obtained the db dump through a one time exploit or a purchase on the black market, they could continue to export data. If the values are encrypted or hashed, I could at least have forward secrecy with my data in the event of a leak.

Encrypting tokens can also help prevent against internal threats such as curious administrators or contractors who might have database access but shouldn't see raw API keys.

So yea in summary I agree that hashing is probably not an option for shlink, but having something that sits in front of the data layer that encrypts and decrypts that value so that its encrypted at rest would be a security enhancement.

@acelaya
Copy link
Member

acelaya commented Sep 14, 2024

Oh, I didn't mean to disregard hashing as an option, in fact I think it would be preferable if I had to choose.

I just wanted to a) provide more context, and b) explain the difference between user-provided and system-generated credentials.

@acelaya
Copy link
Member

acelaya commented Nov 5, 2024

One tricky thing I found is that the API key is currently checked via SELECT ... WHERE key={provided_key} LIMIT 1. If the keys are hashed in the database, there's no longer a way to do a single SELECT via a non-hashed key.

I've been investigating, and since API keys are system-generated and random, in most places it is considered to be secure enough to use a non-salted hash. Since that produces always the same result for a specific key, it would allow to continue doing a single SELECT, but with the hash instead of the plain key.

A SHA256 is secure enough and quite fast so that it should not add any overhead to API requests.

@acelaya acelaya mentioned this issue Nov 5, 2024
8 tasks
@acelaya acelaya moved this from In Progress to In review in Shlink Nov 8, 2024
@github-project-automation github-project-automation bot moved this from In review to Done in Shlink Nov 9, 2024
@acelaya
Copy link
Member

acelaya commented Nov 9, 2024

This is now implemented. Starting with Shlink 4.3, API keys will be hashed before persisted in the database.

You can see a detailed explanation of how this has been implemented in the PR description #2249

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants