This repository has been archived by the owner on Feb 8, 2018. It is now read-only.

use id as primary key instead of username #835

Closed
8 of 33 tasks
chadwhitacre opened this issue Apr 11, 2013 · 25 comments


@chadwhitacre
Contributor

chadwhitacre commented Apr 11, 2013

We're adopting a zero-downtime approach (#3864 (comment)) here. Here are the basic steps for migrating from a column foo to foo_id without having to turn on maintenance mode:

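| Step | branch.sql | Code changes to ... | Deploy which first? |
|------|------------|---------------------|---------------------|
| 1 | add a new foo_id column | write to both foo and foo_id | branch.sql |
| 2 | backfill foo_id based on foo | read from foo_id | branch.sql |
| 3 | drop foo | no longer write to foo | code changes |

| After step ... | We are writing to ... | We are reading from ... |
|----------------|-----------------------|-------------------------|
| 0 | foo | foo |
| 1 | foo and foo_id | foo |
| 2 | foo and foo_id | foo_id |
| 3 | foo_id | foo_id |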
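
As a rough illustration, the three steps might look like the following for a username-valued column. This is only a sketch under assumptions: it uses Python's built-in sqlite3 in place of the project's PostgreSQL database, takes `emails.participant` as the `foo` column, and the `address` column, the `participant_id` name, and both helper functions are hypothetical.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: simplified schema).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE participants (id INTEGER PRIMARY KEY, username TEXT UNIQUE NOT NULL);
    CREATE TABLE emails (address TEXT NOT NULL, participant TEXT NOT NULL);
    INSERT INTO participants (id, username) VALUES (1, 'alice');
    INSERT INTO participants (id, username) VALUES (2, 'bob');
    INSERT INTO emails (address, participant) VALUES ('alice@example.com', 'alice');
""")

# Step 1, branch.sql: add the new column (nullable until it is backfilled).
db.execute("ALTER TABLE emails ADD COLUMN participant_id INTEGER")

# Step 1, code change: every write now fills both columns.
def add_email(address, username):
    db.execute(
        "INSERT INTO emails (address, participant, participant_id) "
        "SELECT ?, username, id FROM participants WHERE username = ?",
        (address, username),
    )

add_email("bob@example.com", "bob")

# Step 2, branch.sql: backfill the rows written before step 1.
db.execute("""
    UPDATE emails
       SET participant_id = (SELECT id FROM participants
                              WHERE participants.username = emails.participant)
     WHERE participant_id IS NULL
""")

# Step 2, code change: reads switch to the new column.
def emails_for(participant_id):
    rows = db.execute(
        "SELECT address FROM emails WHERE participant_id = ? ORDER BY address",
        (participant_id,),
    )
    return [row[0] for row in rows]

# Step 3 (not run here): stop writing to emails.participant, then drop it.
```

The ordering matters because old and new code run side by side during a deploy: the column has to exist before any code writes to it, and the backfill has to finish before any code reads from it.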

Todo

See #503 and #287.


@chadwhitacre
Contributor Author

I'd be surprised if there weren't an exploitable race condition hidden in there somewhere.

@chadwhitacre
Contributor Author

What if we keep the username as the foreign key in the db but add joins and constraints on id when modifying data? We can either do joins when inspecting the db interactively (pita) or in the application when modifying the data (complicates the application somewhat).
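
One way to picture the second option (joining on id in the application when modifying data) is sketched below. It is an assumption-laden toy, not the project's code: it uses sqlite3 rather than PostgreSQL, the `tips` columns `tipper` and `amount` and the `set_tip` helper are hypothetical, and it assumes the username foreign key is declared `ON UPDATE CASCADE`.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
db.executescript("""
    CREATE TABLE participants (id INTEGER PRIMARY KEY, username TEXT UNIQUE NOT NULL);
    CREATE TABLE tips (
        tipper TEXT NOT NULL REFERENCES participants(username) ON UPDATE CASCADE,
        amount INTEGER NOT NULL
    );
    INSERT INTO participants (id, username) VALUES (1, 'alice');
    INSERT INTO tips (tipper, amount) VALUES ('alice', 100);
""")

def set_tip(tipper_id, amount):
    """Update a tip, resolving the username from participants.id at write time."""
    cur = db.execute(
        "UPDATE tips SET amount = ? "
        "WHERE tipper = (SELECT username FROM participants WHERE id = ?)",
        (amount, tipper_id),
    )
    return cur.rowcount  # number of rows actually changed

# A rename happens between loading the participant and writing the tip.
db.execute("UPDATE participants SET username = 'alice2' WHERE id = 1")

# The caller only ever held the id, so the write still finds the right row.
updated = set_tip(1, 250)
```

The tables stay keyed on username, so interactive inspection stays easy, but every write path has to remember to go through the id, which is the extra application complexity mentioned above.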

@zbynekwinkler
Contributor

Why do we want it?

@chadwhitacre
Contributor Author

I don't have a big problem right now with using username as the foreign key through the database. I could imagine reasons related to sharding at scale but we're not there yet. I'm closing this ticket. If someone cares about this we can reopen.

@chadwhitacre
Contributor Author

IRC

@Changaco
Contributor

Changaco commented Apr 3, 2015

We currently have 7 tables referencing username (elsewhere, tips, transfers, exchanges, absorptions, takes, emails) and 5 tables referencing id (exchange_routes, community_members, statements, email_queue, balances_at).

@chadwhitacre changed the title from "use id as pk instead of username" to "use id as primary key instead of username" on Aug 29, 2015
@chadwhitacre
Contributor Author

> I'd be surprised if there weren't an exploitable race condition hidden in there somewhere.

There are certainly race conditions, because our web app processes are multi-threaded. A Participant object on one thread doesn't know about username changes in the database. At the database level we shouldn't have any race conditions, but at the web app level we do.

@chadwhitacre
Contributor Author

Example: if one thread is changing a slug while another thread is updating a payment instruction based on that slug, then the payment instruction will either fail with an error because it won't find a Team with that slug or, what's worse, fail silently by being assigned to a Team that has managed to "steal" the old slug of the Team that is changing slugs.
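
The silent failure can be reproduced deterministically by running the two threads' steps in sequence. This is a minimal sketch, assuming a simplified `teams` table in sqlite3; the slug values and the `team_for_slug` helper are made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY, slug TEXT UNIQUE NOT NULL);
    INSERT INTO teams (id, slug) VALUES (1, 'gratipay');
""")

# "Thread A" loads the team and keeps its id and slug in memory.
cached_id, cached_slug = db.execute(
    "SELECT id, slug FROM teams WHERE id = 1"
).fetchone()

# "Thread B" renames team 1, then a new team 2 claims the freed slug.
db.execute("UPDATE teams SET slug = 'gratipay-inc' WHERE id = 1")
db.execute("INSERT INTO teams (id, slug) VALUES (2, 'gratipay')")

def team_for_slug(slug):
    """Resolve a slug to a team id, or None if no team has that slug."""
    row = db.execute("SELECT id FROM teams WHERE slug = ?", (slug,)).fetchone()
    return row[0] if row else None

# Thread A now resolves its target from the stale slug versus from the id.
by_slug = team_for_slug(cached_slug)  # the team that "stole" the slug
by_id = cached_id                     # the team thread A actually loaded
```

Here `by_slug` resolves to team 2 while `by_id` is still team 1, so a payment instruction keyed on the slug would land on the wrong Team without any error being raised.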

@chadwhitacre
Contributor Author

Upgrading to TeamX ★.

@chadwhitacre
Contributor Author

@sammyshj Looking at #835 (comment), maybe emails would be a good table to start with? Want to put together a PR changing emails.participant to reference participants.id instead of .username?

@kaguillera kaguillera mentioned this issue May 25, 2017
15 tasks
@chadwhitacre
Contributor Author

Closing in light of our decision to shut down Gratipay.

Thank you all for a great run, and I'm sorry it didn't work out! 😞 💃
