-
Notifications
You must be signed in to change notification settings - Fork 620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move README rendering off the web server #1745
Conversation
☔ The latest upstream changes (presumably #1686) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested this in staging and confirmed that the readme is uploaded correctly. (Although I'm currently missing CORS headers in my S3 bucket to allow the readme to display in the page.)
This needs rebased and I have a few minor comments on the doc comments. Otherwise, this looks good to me.
Rendering the README is a process that can take an arbitrarily long amount of time, and ends with uploading a file to S3 -- which also can take an arbitrarily long period of time. Since there are no README contents which we consider invalid (which I've indicated by changing the return type of the rendering functions to `String` instead of `Result`), there's no reason for us to do any of this work on the web server. Just like the index updates, we can offload this to our worker to be completed at some point in the future. This does mean that a crate will render in our UI before the README has finished rendering. During this brief period, we will behave the same as if the crate had no README file, showing the description instead. Eventually I think it'd make sense to have this job behave like the `render-readmes` worker, where it pulls the tar file down and reads the actual readme file. This would mean that Cargo can stop sending us the README as its own field when its already sending us a tar file which contains the README. I've left it alone for now though, since there's no benefit while Cargo is already sending us the README, and we can't have it stop doing that until we move to publish v2. A few implementation notes: - The `readme` column on `crates` was never used for anything in Rust, but is used on the DB side to generate the search index. I've left the column in place, but removed most uses of it on the Rust side. - The `readme_file` column on `crates` never used for anything. This commit does not include the migration, which must be deployed separately. This commit only includes code which works with or without the migration for deployment. - `background_jobs::Environment` gained two fields that are just straight up copied from `App`. I've opted to do this rather than just giving it `App`, since I do want to separate "the stuff you need to do background jobs" from "the stuff you need to be the web server". Since we do eventually plan on not having s3 uploads happen on the server, I want to bundle all of this into a single field and just have it live on `background_jobs::Environment` eventually. - I did some general cleanup of `krate::publish` while I was in there, removed a lot of pre-1.0 style code. - I've left the `render-readmes` binary alone for now since it's untested. I'll change it to just queue up jobs once we get to the point where jobs are doing all the tar file work as well.
⚠️ This commit contains a destructive migration.⚠️ ⚠️ It must be deployed separately from the previous commit.⚠️ This column was never used. It's no longer referenced in Rust code, so we can just remove it.
08aa788
to
b631302
Compare
@bors r=jtgeibel |
📌 Commit b631302 has been approved by |
Move README rendering off the web server :warning: This PR contains destructive migrations. The commits included must be deployed separately. :warning: Rendering the README is a process that can take an arbitrarily long amount of time, and ends with uploading a file to S3 -- which also can take an arbitrarily long period of time. Since there are no README contents which we consider invalid (which I've indicated by changing the return type of the rendering functions to `String` instead of `Result`), there's no reason for us to do any of this work on the web server. Just like the index updates, we can offload this to our worker to be completed at some point in the future. This does mean that a crate will render in our UI before the README has finished rendering. During this brief period, we will behave the same as if the crate had no README file, showing the description instead. Eventually I think it'd make sense to have this job behave like the `render-readmes` worker, where it pulls the tar file down and reads the actual readme file. This would mean that Cargo can stop sending us the README as its own field when its already sending us a tar file which contains the README. I've left it alone for now though, since there's no benefit while Cargo is already sending us the README, and we can't have it stop doing that until we move to publish v2. A few implementation notes: - The `readme` column on `crates` was never used for anything in Rust, but is used on the DB side to generate the search index. I've left the column in place, but removed most uses of it on the Rust side. - The `readme_file` column on `crates` never used for anything. - `background_jobs::Environment` gained two fields that are just straight up copied from `App`. I've opted to do this rather than just giving it `App`, since I do want to separate "the stuff you need to do background jobs" from "the stuff you need to be the web server". Since we do eventually plan on not having s3 uploads happen on the server, I want to bundle all of this into a single field and just have it live on `background_jobs::Environment` eventually. - I did some general cleanup of `krate::publish` while I was in there, removed a lot of pre-1.0 style code. - I've left the `render-readmes` binary alone for now since it's untested. I'll change it to just queue up jobs once we get to the point where jobs are doing all the tar file work as well.
☀️ Test successful - checks-travis |
Rendering the README is a process that can take an arbitrarily long amount of time, and ends with uploading a file to S3 -- which also can take an arbitrarily long period of time. Since there are no README contents which we consider invalid (which I've indicated by changing the return type of the rendering functions to
String
instead ofResult
), there's no reason for us to do any of this work on the web server.Just like the index updates, we can offload this to our worker to be completed at some point in the future. This does mean that a crate will render in our UI before the README has finished rendering. During this brief period, we will behave the same as if the crate had no README file, showing the description instead.
Eventually I think it'd make sense to have this job behave like the
render-readmes
worker, where it pulls the tar file down and reads the actual readme file. This would mean that Cargo can stop sending us the README as its own field when its already sending us a tar file which contains the README. I've left it alone for now though, since there's no benefit while Cargo is already sending us the README, and we can't have it stop doing that until we move to publish v2.A few implementation notes:
readme
column oncrates
was never used for anything in Rust, but is used on the DB side to generate the search index. I've left the column in place, but removed most uses of it on the Rust side.readme_file
column oncrates
never used for anything.background_jobs::Environment
gained two fields that are just straight up copied fromApp
. I've opted to do this rather than just giving itApp
, since I do want to separate "the stuff you need to do background jobs" from "the stuff you need to be the web server". Since we do eventually plan on not having s3 uploads happen on the server, I want to bundle all of this into a single field and just have it live onbackground_jobs::Environment
eventually.krate::publish
while I was in there, removed a lot of pre-1.0 style code.render-readmes
binary alone for now since it's untested. I'll change it to just queue up jobs once we get to the point where jobs are doing all the tar file work as well.